
A Course on

ALGEBRA

by

Ahmet K. Feyzioğlu
CHAPTER 1
Preliminaries

§1
Set Theory

We assume that the reader is familiar with basic set theory. In this
paragraph, we want to recall the relevant definitions and fix the notation.

Our approach to set theory will be informal. For our purposes, a set is a
collection of objects, taken as a whole. "Set" is therefore a collective term
like "family", "flock", "species", "army", "club", "team" etc. The objects
which make up a set are called the elements of that set. We write

x S

to denote that the object x is an element of the set S. This can be read "x
is an element of S", or "x is a member of S", or "x belongs to S", or "x is in
S", or "x is contained in S", or "S contains x". If x is not an element of S,
we write

x S.

For technical reasons, we agree to have a unique set that has no
elements at all. This set is called the empty set and is denoted by ∅.

A set S is called a subset of a set T if every element of S is also an
element of T. The notation
S ⊆ T
means that S is a subset of T. This is read "S is a subset of T", or "S is
included in T", or "S is contained in T". By convention, the empty set is
a subset of any set. If S is not a subset of T, we write

S ⊈ T.
This means there is at least one element of S which does not belong to T.

If S ⊆ T and T ⊆ S, then S and T have exactly the same elements. In this
case, S and T are said to be identical or equal. We write

S=T

if S and T are equal sets. Whenever we want to prove that two sets S
and T are equal, we must show that S is included in T and that T is
included in S. If S and T are not equal, we put

S ≠ T.

If S ⊆ T but T ⊈ S, then S is said to be a proper subset of T. So S is a
proper subset of T if and only if every element of S is an element of T
but T contains at least one element which does not belong to S. The notation

S ⊂ T

means that S is a proper subset of T. This is read "S is a proper subset of
T", or "S is properly included in T", or "S is properly contained in T". By
convention, the empty set is a proper subset of every set except itself.

Some authors write S ⊂ T to mean that S is a subset of T, the possibility
S = T being included, and S ⊊ T to mean that S is a proper subset of T.
The reader should be careful about the meaning of the symbol "⊂" he or
she uses. In this book, "⊂" denotes proper inclusion.

Sets are sometimes written by displaying their elements within braces


(roster notation). Hence
{1,2,3,4,5}
is the set whose elements are the numbers 1,2,3,4 and 5. Obviously, only
those sets which have a small number of elements can be written in this
way. In many cases, the elements of a set S are characterized by a
property P and the set is then written
{x : x has property P}.

In this book, ℕ = {1, 2, 3, . . . } is the set of natural numbers, ℤ = {0, ±1,
±2, . . . } is the set of integers, ℚ = {a/b : a,b ∈ ℤ, b ≠ 0} is the set of
rational numbers, ℝ is the set of real numbers, and ℂ is the set of
complex numbers. These notations are standard. Some authors regard 0
as a natural number, but we agree that 0 ∉ ℕ in this book.

Given two sets S and T, we consider those objects which belong to S or to
T. Such objects will make up a new set. This set is called the union of S
and T and is denoted by S ∪ T. We remark here that 'or' in the definition
of a union is the logical 'or'. Let us recall that
'p or q' is true in case 'p' is true, 'q' is true;
'p' is true, 'q' is false;
'p' is false, 'q' is true;
and
'p or q' is false in case 'p' is false, 'q' is false.
Thus we have
S ∪ T = {x : x ∈ S or x ∈ T}.

In particular, S ∪ T = T ∪ S.

If we have sets S₁, S₂, . . . , Sₙ, their union S₁ ∪ S₂ ∪ . . . ∪ Sₙ is given by

S₁ ∪ S₂ ∪ . . . ∪ Sₙ = {x : x ∈ S₁ or x ∈ S₂ or . . . or x ∈ Sₙ}.

We usually contract this notation into ⋃_{i=1}^{n} Sᵢ, just like we write
Σ_{i=1}^{n} aᵢ instead of a₁ + a₂ + . . . + aₙ. More generally, if we have sets Sᵢ,
indexed by a set I, then their union ⋃_{i∈I} Sᵢ is the set

⋃_{i∈I} Sᵢ = {x : x ∈ Sᵢ for at least one i ∈ I}.

Given two sets S and T, we consider those objects which belong to S and
to T. Such objects will make up a new set. This set is called the
intersection of S and T and is denoted by S ∩ T. We remark here that
'and' in the definition of an intersection is the logical 'and'. Let us recall
that
'p and q' is true in case 'p' is true, 'q' is true;
and
'p and q' is false in case 'p' is true, 'q' is false;
'p' is false, 'q' is true;
'p' is false, 'q' is false.
Thus we have
S ∩ T = {x : x ∈ S and x ∈ T}.

In particular, S ∩ T = T ∩ S.
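These definitions are easy to experiment with on finite sets. The following Python sketch (an editorial illustration, not part of the original text) builds S ∪ T and S ∩ T directly from the logical 'or' and 'and' above, and checks the commutativity just noted.

```python
# Illustration (not from the text): union and intersection of finite sets.
S = {1, 2, 3}
T = {3, 4}

union = {x for x in list(S) + list(T)}      # S ∪ T: x ∈ S or x ∈ T
intersection = {x for x in S if x in T}     # S ∩ T: x ∈ S and x ∈ T

print(union)         # {1, 2, 3, 4}
print(intersection)  # {3}

# Commutativity, as noted in the text: S ∪ T = T ∪ S and S ∩ T = T ∩ S.
assert S | T == T | S and S & T == T & S
```

Python's built-in operators `|` and `&` compute exactly these two sets.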

If we have sets S₁, S₂, . . . , Sₙ, their intersection S₁ ∩ S₂ ∩ . . . ∩ Sₙ is given by

S₁ ∩ S₂ ∩ . . . ∩ Sₙ = {x : x ∈ S₁ and x ∈ S₂ and . . . and x ∈ Sₙ}.

We usually contract this notation into ⋂_{i=1}^{n} Sᵢ. More generally, if we have
sets Sᵢ, indexed by a set I, then their intersection ⋂_{i∈I} Sᵢ is the set

⋂_{i∈I} Sᵢ = {x : x ∈ Sᵢ for all i ∈ I}.

Two sets S and T are said to be disjoint if their intersection is empty:
S ∩ T = ∅. Given a family of sets Sᵢ, indexed by a set I, the sets Sᵢ are
called mutually disjoint if any two distinct sets among them are disjoint:
S_{i₁} ∩ S_{i₂} = ∅ for all i₁, i₂ ∈ I with S_{i₁} ≠ S_{i₂}.

The sets we consider in a particular discussion are usually subsets of a


set U. This set U is called the universal set. Given a set S, which is a
subset of a universal set U, those elements of U that do not belong to S
make up a new set, called the complement of S and denoted by S′ or Sᶜ
or C_U(S). Hence

S′ = {x : x ∈ U and x ∉ S}.

More generally, we write

T \ S = {x : x ∈ T and x ∉ S}

and call this set the relative complement of S in T, or the difference set T
minus S. The set S may or may not be a subset of T. Note that

T \ S = T ∩ S′.
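The identity T \ S = T ∩ S′ can likewise be verified mechanically inside a finite universal set; the Python sketch below is an editorial illustration, not part of the original text.

```python
# Illustration (not from the text): T \ S = T ∩ S′ inside a universal set U.
U = set(range(10))
S = {2, 3, 5, 7}
T = {1, 2, 3, 4}

S_complement = U - S              # S′ = {x ∈ U : x ∉ S}
assert T - S == T & S_complement  # T \ S = T ∩ S′
print(T - S)  # {1, 4}
```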

According to our definition of equality, the sets {a,b} and {b,a} are equal.
Frequently, we want to distinguish between a,b and b,a. To this end, we
define ordered pairs. An ordered pair is a pair of objects a,b, enclosed
within parentheses and separated by a comma. Thus (a,b) is an ordered
pair. The adjective "ordered" is used to emphasize that the objects have
a status of being first and being second. Here a is called the first
component of the ordered pair (a,b), and b is called its second component.
Two ordered pairs are declared equal if their first components are equal
and their second components are equal. Thus (a,b) and (c,d) are equal if
and only if a = c and b = d, in which case we write (a,b) = (c,d). Notice
that we have (a,b) ≠ (b,a) unless a = b (here ≠ means the negation of
equality).

The set of all ordered pairs, whose first components are the elements of
a set S and whose second components are the elements of a set T, is
called the cartesian product of S and T, and is denoted by S × T. Hence

S × T = {(a,b) : a ∈ S and b ∈ T}.

We can also define ordered triples (a,b,c), ordered quadruples (a,b,c,d),
and more generally ordered n-tuples (a₁,a₂,. . . , aₙ). Equality of ordered
n-tuples will mean the equality of their corresponding components. The
set of all ordered n-tuples, whose i-th components are the elements of a
set Sᵢ, is called the cartesian product of S₁, S₂, . . . , Sₙ and is denoted by
S₁ × S₂ × . . . × Sₙ. Hence

S₁ × S₂ × . . . × Sₙ = {(a₁,a₂,. . . , aₙ) : a₁ ∈ S₁, a₂ ∈ S₂, . . . , aₙ ∈ Sₙ}.
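As an editorial aside (not part of the original text), the cartesian product of finite sets can be enumerated with Python's itertools.product; the sketch below also illustrates that S × T and T × S are different sets in general, since ordered pairs record position.

```python
# Illustration (not from the text): cartesian products as sets of tuples.
from itertools import product

S = {1, 2}
T = {'a', 'b', 'c'}

S_x_T = set(product(S, T))           # S × T = {(a,b) : a ∈ S, b ∈ T}
assert len(S_x_T) == len(S) * len(T)

# Ordered pairs keep track of position: (1,'a') is not ('a',1),
# so S × T and T × S are different sets here.
assert S_x_T != set(product(T, S))
```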

It is possible to define the cartesian product of infinitely many sets, too.


We do not give this definition, for we will not need it.

A set can have finitely many or infinitely many elements. The number
of elements in a set S is called the cardinality or the cardinal number of
S. The cardinality of S is denoted by |S|. The set S is said to be finite if |S|
is a finite number. S is said to be infinite if S is not finite. A rigorous
definition of finite and infinite sets must be based on the notion of one-
to-one correspondence between sets, which will be introduced in §3.
However, we will not make any attempt to give a rigorous definition of
finite and infinite sets. We shall be content with the suggestive
description above.

Exercises

1. Show that, if R is a subset of S and S is a subset of T, then R is a
subset of T.
2. Show that (R ∪ S) ∪ T = R ∪ (S ∪ T)
and (R ∩ S) ∩ T = R ∩ (S ∩ T).
3. Prove: S ∩ T = S if and only if S ⊆ T, and S ⊆ T if and only if S ∪ T = T.

4. Prove the distributivity of union over intersection and of intersection


over union:
R ∪ (S ∩ T) = (R ∪ S) ∩ (R ∪ T),
R ∩ (S ∪ T) = (R ∩ S) ∪ (R ∩ T).

5. Prove the de Morgan laws:


(S ∪ T)′ = S′ ∩ T′ and (S ∩ T)′ = S′ ∪ T′
for any subsets S,T of a universal set U.

6. Show that T \ ∅ = T and T \ T = ∅ for any set T.

7. Prove: (S \ T) ∪ (T \ S) = (S ∪ T) \ (S ∩ T). This set is called the
symmetric difference of S and T. It is denoted by S Δ T.

8. With the notation of Ex. 7, prove that


(R Δ S) Δ T = R Δ (S Δ T)
S Δ ∅ = S
S Δ S = ∅
S Δ T = T Δ S.

9. Let S and T be finite sets. Prove the following assertions.


a) If S ∩ T = ∅, then |S ∪ T| = |S| + |T|.
b) |S ∪ T| = |S| + |T| − |S ∩ T|. (Hint: S ∪ T = S ∪ (T \ S).)

10. Find all subsets of ∅, {1}, {1,2}, {1,2,3}, {1,2,3,4}.

11. Prove: if S is a finite set, then S has exactly 2^|S| subsets.
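As an editorial aside on Exercises 10 and 11 (not part of the original text, and no substitute for a proof), the subset counts can be checked by brute force in Python; the helper name `subsets` is our own.

```python
# Illustration (not from the text): counting subsets of small finite sets,
# in the spirit of Exercises 10 and 11 (|S| elements give 2**|S| subsets).
from itertools import combinations

def subsets(s):
    """Return the list of all subsets of the set s."""
    elems = list(s)
    return [set(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

for S in [set(), {1}, {1, 2}, {1, 2, 3}, {1, 2, 3, 4}]:
    assert len(subsets(S)) == 2 ** len(S)
print(len(subsets({1, 2, 3})))  # 8
```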

§2
Equivalence Relations

In mathematics, we often investigate relationships between certain


objects (numbers, functions, sets, figures, etc.). If an element a of a set A
is related to an element b of a set B, we might write
a is related to b
or shortly
a related b
or even more shortly
a R b.
The essential point is that we have two objects, a and b, that are related
in some way. Also, we say "a is related to b", not "b is related to a", so
the order of a and b is important. In other words, the ordered pair (a,b)
is distinguished by the relation. This observation suggests the following
formal definition of a relation.

2.1 Definition: Let A and B be two sets. A relation R from A into B is a


subset of the cartesian product A × B.

If A and B happen to be equal, we speak of a relation on A instead of


using the longer phrase "a relation from A into A".

Equivalence relations constitute a very important type of relations on a


set.

2.2 Definition: Let A be a nonempty set. A relation R on A (that is, a
subset R of A × A) is called an equivalence relation on A if the following
hold.
(i) (a,a) ∈ R for all a ∈ A,
(ii) if (a,b) ∈ R, then (b,a) ∈ R (for all a,b ∈ A),
(iii) if (a,b) ∈ R and (b,c) ∈ R, then (a,c) ∈ R
(for all a,b,c ∈ A).

This definition presents the logical structure of an equivalence relation
very clearly, but we will almost never use this notation. We prefer to
write a ~ b, or a ≈ b, or a ≡ b, or some similar symbolism instead of
(a,b) ∈ R in order to express that a,b are related by an equivalence
relation R. Here a ~ b can be read "a is equivalent to b". Our definition
then assumes the form below.

2.2 Definition: Let A be a nonempty set. A relation R on A (that is, a
subset R of A × A) is called an equivalence relation on A if the following
hold.
(i) a ~ a for all a ∈ A,
(ii) if a ~ b, then b ~ a (for all a,b ∈ A),
(iii) if a ~ b and b ~ c, then a ~ c (for all a,b,c ∈ A).

A relation that satisfies the first condition (i) is called a reflexive
relation, one that satisfies the second condition (ii) is called a symmetric
relation, and one that satisfies the third condition (iii) is called a transitive
relation. An equivalence relation is therefore a relation which is
reflexive, symmetric and transitive. Notice that the symmetry and
transitivity requirements involve conditional statements (if . . . , then . . . ).
In order to show that ~ is symmetric, for example, we must make the
hypothesis a ~ b and use this hypothesis to establish b ~ a. On the other
hand, in order to show that ~ is reflexive, we have to establish a ~ a for
all a ∈ A, without any further assumption.
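As an editorial illustration (not part of the original text), the three conditions can be tested mechanically when A is finite and R is given as a set of ordered pairs; the helper name `is_equivalence` is ours.

```python
# Illustration (not from the text): checking the three defining properties
# of an equivalence relation for a relation R ⊆ A × A given as a set of pairs.
def is_equivalence(A, R):
    reflexive = all((a, a) in R for a in A)
    symmetric = all((b, a) in R for (a, b) in R)
    transitive = all((a, d) in R
                     for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive

A = {1, 2, 3}
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}   # 1 ~ 2, while 3 stands alone
print(is_equivalence(A, R))         # True
print(is_equivalence(A, {(1, 2)}))  # False: not reflexive
```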

2.3 Examples: (a) Let A be a nonempty set of numbers and let
equality = be our relation. Then = is certainly an equivalence relation on
A since
(i) a = a for all a ∈ A,
(ii) if a = b, then b = a (for all a,b ∈ A),
(iii) if a = b and b = c, then a = c (for all a,b,c ∈ A).

(b) Let A be the set of all points in the plane except the origin. For any
two points P and R in A, let us put P ~ R if R lies on the line through the
origin and P.
(i) P ~ P for all points P in A since any point lies on the line
through the origin and itself. Thus ~ is reflexive.

(ii) If P ~ R, then R lies on the line through the origin and P;
therefore the origin, P, R lie on one and the same line; therefore P lies
on the line through the origin and R; and R ~ P. Thus ~ is symmetric.
(iii) If P ~ R and R ~ T, then the line through the origin and
R contains the points P and T, so T lies on the line through the origin and
P, so we get P ~ T. Thus ~ is transitive.
This proves that ~ is an equivalence relation on A.

(c) Let S be the set of all straight lines in the plane. Let us put m ~ n if
the line m is parallel to the line n. It is easily seen that ~ (parallelism) is
an equivalence relation on S.

(d) Let ℤ be the set of integers. For any two numbers a,b in ℤ, let us put
a ~ b if a − b is even (divisible by 2).
(i) a ~ a for all a ∈ ℤ since a − a = 0 is an even number.
(ii) If a ~ b, then a − b is even, then b − a = −(a − b) is also
even, so b ~ a.
(iii) If a ~ b and b ~ c, then a − b and b − c are even. Their
sum is also an even number. So a − c = (a − b) + (b − c) is even and a ~ c.
We see that ~ is an equivalence relation on ℤ.

(e) The last example may be generalized. We fix a whole number n ≠ 0
(n is called the modulus in this context). For any two numbers a,b in ℤ,
let us put a ~ b if a − b is divisible by n.
(i) a ~ a for all a ∈ ℤ since a − a = 0 = n·0 is divisible by n.
(ii) If a ~ b, then a − b = nm for some m ∈ ℤ, so
b − a = −(a − b) = n(−m) is divisible by n, and b ~ a.
(iii) If a ~ b and b ~ c, then a − b = nm and b − c = nk for
some m,k ∈ ℤ, so a − c = (a − b) + (b − c) = nm + nk = n(m + k) is divisible
by n, and so a ~ c.
Therefore, ~ is an equivalence relation on ℤ. This relation is called
congruence. For each nonzero integer n, there is a congruence relation.
In order to distinguish between them, we write, when n is the modulus,
a ≡ b (mod n) rather than a ~ b.
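As an editorial illustration of Example 2.3(e) (not part of the original text), congruence modulo n can be sampled in Python; the helper name `congruent` is ours.

```python
# Illustration (not from the text): congruence modulo n on a range of integers.
def congruent(a, b, n):
    return (a - b) % n == 0      # n divides a − b

n = 5
sample = range(-20, 20)
# Reflexivity and symmetry on the sample:
assert all(congruent(a, a, n) for a in sample)
assert all(congruent(b, a, n) for a in sample for b in sample
           if congruent(a, b, n))
print(congruent(17, 2, n))   # True: 17 ≡ 2 (mod 5)
print(congruent(17, 3, n))   # False
```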

(f) Let S = ℤ × (ℤ \ {0}). Thus S is the set of all ordered pairs of integers
whose second components are distinct from zero. Let us write
(a,b) ~ (c,d) for (a,b), (c,d) ∈ S if ad = bc.
(i) (a,b) ~ (a,b) for all (a,b) ∈ S, since ab = ba for all a ∈ ℤ,
b ∈ ℤ \ {0}.
(ii) If (a,b) ~ (c,d), then ad = bc, then da = cb, then cb = da, so
(c,d) ~ (a,b).
(iii) If (a,b) ~ (c,d) and (c,d) ~ (e,f), then
ad = bc and cf = de
adf = bcf and bcf = bde
adf = bde
d(af − be) = 0
af − be = 0 (since d ≠ 0)
af = be
(a,b) ~ (e,f).
Thus ~ is an equivalence relation on S.
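As an editorial illustration of Example 2.3(f) (not part of the original text): the condition ad = bc is exactly cross-multiplication for equality of fractions a/b = c/d, and it can be sampled in Python. The helper name `related` is ours.

```python
# Illustration (not from the text): (a,b) ~ (c,d) iff ad = bc, the relation
# behind equality of fractions a/b = c/d.
def related(p, q):
    (a, b), (c, d) = p, q
    assert b != 0 and d != 0
    return a * d == b * c

print(related((1, 2), (3, 6)))    # True: 1·6 = 2·3
print(related((1, 2), (2, 3)))    # False
# Transitivity hinges on d ≠ 0, exactly as in step (iii) of the text:
assert related((1, 2), (2, 4)) and related((2, 4), (3, 6)) \
       and related((1, 2), (3, 6))
```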

(g) Let T be the set of all triangles in the Euclidean plane. Congruence of
triangles is an equivalence relation on T.

(h) Let S be the set of all continuous functions defined on the closed
interval [0,1]. For any two functions f,g in S, let us write f =ᴬ g if

∫₀¹ f(x)dx = ∫₀¹ g(x)dx.

Then =ᴬ is an equivalence relation on S.

An equivalence relation is a weak form of equality. Suppose we have


various objects, which are similar in one respect and dissimilar in
certain other respects. We may wish to ignore their dissimilarity and
focus our attention on their similar behaviour. Then there is no need to
distinguish between our various objects that behave in the same way.
We may regard them as equal or identical. Of course, "equal" or
"identical" are poor words to employ here, for the objects are not
absolutely identical, they are equal only in one respect that we wish to
investigate more closely. So we employ the word "equivalent". That a
and b are equivalent means, then, a and b are equal, not in every
respect, but rather as far as a particular property is concerned. An
equivalence relation is a formal tool for disregarding differences
between various objects and treating them as equals.

Let us examine our examples under this light. In Example 2.3(b), the
points P and R may be different, but the lines they determine with the
origin are equal. In Example 2.3(c), the lines may be different, but their
directions are equal. In Example 2.3(d), the integers may be different,
but their parities are equal. In Example 2.3(e), the integers may be
different, but their remainders, when they are divided by n, are equal.
In Example 2.3(f), the pairs may be different, but the ratio of their
components are equal. In Example 2.3(g), the triangles may have
different locations in the plane, but their geometrical properties are the
same. In Example 2.3(h), the functions may be different, but the "areas
under their curves" are equal.

An equivalence relation on a set A gives rise to a partition of A into


disjoint subsets. This means that A is a union of certain subsets of A and
that the distinct subsets here are mutually disjoint. The converse is also
true: whenever we have a partition of a nonempty set A into pairwise
disjoint subsets, there is an equivalence relation on A. Before proving
this important result, we introduce a definition.

2.4 Definition: Let ~ be an equivalence relation on a nonempty set A,
and let a be an element of A. The equivalence class of a is defined to be
the set of all elements of A that are equivalent to a.

The equivalence class of a will be denoted by [a] (or by class(a), cl(a), ā,
or by a similar symbol): [a] = {x ∈ A : x ~ a}.

An element of an equivalence class X ⊆ A is called a representative of X.
Notice that x ∈ [a] and x ~ a have exactly the same meaning. In partic-
ular, we have a ∈ [a] by reflexivity. So any a ∈ A is a representative of
its own equivalence class.

The equivalence classes [a] are subsets of A. The set of all equivalence
classes is sometimes denoted by A/~. It will be a good exercise for the
reader to find the equivalence classes in Example 2.3.

We now state and prove the result we promised.

2.5 Theorem: Let A be a nonempty set and let ~ be an equivalence
relation on A. Then the equivalence classes form a partition of A. In
other words, A is the union of the equivalence classes and the distinct
equivalence classes are disjoint:

A = ⋃_{a∈A} [a], and if [a] ≠ [b], then [a] ∩ [b] = ∅.

Conversely, let

A = ⋃_{i∈I} Pᵢ , with Pᵢ ∩ Pⱼ = ∅ if i ≠ j,

be a union of nonempty, mutually disjoint sets Pᵢ, indexed by I. Then
there is an equivalence relation ~ on A such that the Pᵢ's are the
equivalence classes under this relation.

Proof: First we prove A = ⋃_{a∈A} [a]. For any a ∈ A, we have [a] ⊆ A, hence
⋃_{a∈A} [a] ⊆ A. Also, if a ∈ A, then a ∈ [a] by reflexivity, so a ∈ ⋃_{a∈A} [a]
and A ⊆ ⋃_{a∈A} [a]. So A = ⋃_{a∈A} [a].

Now we must prove that distinct equivalence classes are disjoint. We


prove its contrapositive, which is logically the same: if two equivalence
classes are not disjoint, then they are identical. Suppose that the
equivalence classes [a] and [b] are not disjoint. This means there is a c in
A such that c ∈ [a] and c ∈ [b]. Hence

c ~ a and c ~ b
a ~ c and c ~ b (by symmetry)
(1) a ~ b (by transitivity)
(2) b ~ a (by symmetry).

We want to prove [a] = [b]. To this end, we have to prove [a] ⊆ [b] and
also [b] ⊆ [a]. Let us prove [a] ⊆ [b]. If x ∈ [a], then x ~ a and a ~ b by (1),
then x ~ b by transitivity, then x ∈ [b], so [a] ⊆ [b]. Similarly, if y ∈ [b],
then y ~ b and b ~ a by (2), then y ~ a by transitivity, then
y ∈ [a], so [b] ⊆ [a]. Hence [a] = [b] if [a] and [b] are not disjoint. This
completes the proof of the first assertion.

Now the converse. Let A = ⋃_{i∈I} Pᵢ, where any two distinct Pᵢ's are
disjoint. We want to define an equivalence relation ~ on A and want the
Pᵢ's to be the equivalence classes. How do we accomplish this? Well, if
the Pᵢ are to be the equivalence classes, we had better call two elements
equivalent if they belong to one and the same Pᵢ.

Let a ∈ A. Since A = ⋃_{i∈I} Pᵢ, we see that a ∈ P_{i₀} for some i₀ ∈ I. This
index i₀ is uniquely determined by a. That is to say, a cannot belong to
two or more of the subsets Pᵢ, for then the Pᵢ would not be mutually disjoint.
So each element of A belongs to one and only one of the subsets Pᵢ.

Let a,b be elements of A and suppose a ∈ P_{i₀} and b ∈ P_{i₁}. We put a ~ b if
P_{i₀} = P_{i₁}, i.e., we put a ~ b if the sets Pᵢ to which a and b belong are
identical. We show that ~ is an equivalence relation.
(i) For any a ∈ A, of course a belongs to the set P_{i₀} it belongs
to, and so a ~ a and ~ is reflexive.
(ii) Let a ~ b. This means a and b belong to the same set P_{i₀},
say, so b and a belong to the same set P_{i₀}, hence b ~ a. So ~ is symmetric.
(iii) Let a ~ b and b ~ c. Then the set Pᵢ to which b belongs
contains a and c. Thus a and c belong to the same set Pᵢ and a ~ c. This
proves that ~ is transitive.

We showed that ~ is indeed an equivalence relation on A. It remains to
prove that the Pᵢ are the equivalence classes under ~. For any a ∈ A, we
have, if a ∈ P_{i₁},

[a] = {x ∈ A : x ~ a}
    = {x ∈ A : x belongs to P_{i₁}}
    = {x ∈ ⋃_{i∈I} Pᵢ : x belongs to P_{i₁}}
    = P_{i₁}.

This proves that the Pᵢ are the equivalence classes under ~.
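As an editorial illustration of Theorem 2.5 (not part of the original text), here the equivalence classes of congruence modulo 3 on a small set are computed and checked to form a partition.

```python
# Illustration (not from the text): the equivalence classes of congruence
# modulo 3 on A = {0, …, 8} form a partition of A, as Theorem 2.5 asserts.
A = set(range(9))
classes = {}
for a in A:
    classes.setdefault(a % 3, set()).add(a)   # [a] gathers all x with x ~ a

partition = list(classes.values())
print([sorted(P) for P in partition])  # [[0, 3, 6], [1, 4, 7], [2, 5, 8]]

# A is the union of the classes, and distinct classes are disjoint:
assert set().union(*partition) == A
assert all(P == Q or not (P & Q) for P in partition for Q in partition)
```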

Exercises

1. On ℕ × ℕ, define a relation ~ by declaring (a,b) ~ (c,d) if and only if
a + d = b + c. Show that ~ is an equivalence relation on ℕ × ℕ.

2. Determine whether the relation ~ on ℝ is an equivalence relation on
ℝ, when ~ is defined by declaring x ~ y for all x,y ∈ ℝ if and only if
(a) there are integers a,b,c,d such that ad − bc = ±1 and x = (ay+b)/(cy+d);
(b) |x − y| ≤ 0.000001;
(c) |x| = |y|;
(d) x − y is an integer;
(e) x − y is an even integer;
(f) there are natural numbers n,m such that xⁿ = yᵐ;
(g) there are natural numbers n,m such that nx = my;
(h) x ≤ y.

3. Let ~ and ≈ be two equivalence relations on a set A. We define ∧ by
declaring a ∧ b if and only if a ~ b and a ≈ b; and we define ∨ by
declaring a ∨ b if and only if a ~ b or a ≈ b. Determine whether ∧ and ∨
are equivalence relations on A.

4. If a relation on A is symmetric and transitive, then it is also reflexive.
Indeed, let ~ be the relation and let a ∈ A. Choose an element b ∈ A such
that a ~ b. Then b ~ a by symmetry, and from a ~ b, b ~ a, it follows that
a ~ a, by transitivity. So a ~ a for any a ∈ A and ~ is reflexive.

This argument is wrong. Why?

§3
Mappings and Operations

Functions, also called mappings, form a very important type of relation.
Let us recall that a relation from A into B is a subset of A × B. Under
special circumstances, a relation will be called a function or a mapping.
These two terms will be used interchangeably.

3.1 Definition: Let A and B be nonempty sets. A relation f from A into
B is called a function from A into B, or a mapping from A into B, if every
element of A is the first component of a single ordered pair in f ⊆ A × B.

This definition embraces two conditions. First, every element a of A will


appear as the first component of at least one ordered pair (a,*) in f, that
is, the first components of the ordered pairs in f should make up the
whole A. No element of A can be left out. There should be no element of
A which is not the first component of any pair in f. Second, for any
a A, there can be only one ordered pair in f whose first component is
a. In other words, if (a,b) and (a,b ) are both in f, these pairs should be
identical, which means b = b . A relation f from A into B is a mapping if
and only if every element of A is the first component of one and only
one ordered pair in f.

If f is a mapping from A into B, then A is called the domain of f, and B is


called the range of f. A function f from A into B must be thought of as a
rule or mechanism by which elements of A are assigned to certain
elements of B. The first condition, that every element of A is the first
component of at least one ordered pair in f, is a formal way of ex-
pressing that elements of A, not of any other set, in particular not of any
proper subset of A, are the objects that are assigned (to some elements
of B). The second condition, that every element of A is the first com-
ponent of at most one ordered pair in f, is a formal way of expressing
that no element of A is assigned to two, three or more elements of B.

We introduce some notation. We write f: A → B to mean that f is a map-
ping from A into B. Occasionally, we write A →f B. The reader probably
expects that we write f(a) = b in place of (a,b) ∈ f. This is the symbolism
that the reader is accustomed to, and reminds us of a mapping rule that
assigns b to a. However, we will rarely write f(a) = b. We prefer to write
(a)f = b or af = b, with the function symbol f on the right side of the
element a. This might seem odd, and the reader might wonder about this
strange order of elements and functions. It takes some time to get
accustomed to this way of writing functions on the right, but the ad-
vantages of this notation will far outweigh the little trouble it causes at
first. This will be amply clear in the sequel. We remark that not every
algebraist conforms to this usage, and an isolated notation will have
different meanings according to whether the functions are written on
the right or on the left. We will point out these differences as occasions
arise.

Suppose f is a mapping from A into B and a ∈ A and b ∈ B are such that
af = b (in this case, we sometimes write a →f b or a ↦ b and say that f
maps a to b). Then b is called the image of a under f. We also say a is a
preimage or an inverse image of b under f. Please mark the articles: b is
the image of a, since a has one and only one image, but a is a preimage
of b, for b may have many preimages.

3.2 Examples: (a) Let A be a nonempty set and let
ι = {(a,a) : a ∈ A} ⊆ A × A. Then ι is a function from A into A. In our
second notation, this reads aι = a. This function is called the identity
mapping on A. When we want to point out the set A, we write ι_A instead
of ι.
Now let A ⊆ B and put ι = {(a,a) ∈ A × B : a ∈ A} ⊆ A × B. Then ι is a
function from A into B. In our second notation, this reads aι = a. This
function is called the inclusion mapping from A into B. Writing aι for a is
a formal way of recalling A ⊆ B and a ∈ B.

(b) Let A = {1,2,3,4,5} and B = {a,b,c,d}. Consider

f = {(1,b), (2,a), (4,d), (5,d)}.

Then f is not a function from A into B since 3 ∈ A is not the first
component of any ordered pair in f ⊆ A × B. Consider

g = {(1,b), (2,a), (3,a), (3,b), (4,c), (5,d)}.

Then g is not a function from A into B since 3 ∈ A is the first component
of two distinct ordered pairs in g ⊆ A × B.
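As an editorial illustration of Definition 3.1 and Example 3.2(b) (not part of the original text), the two defining conditions can be tested mechanically for finite relations; the helper name `is_function` is ours.

```python
# Illustration (not from the text): testing whether a relation f ⊆ A × B,
# given as a set of pairs, is a function from A into B (Definition 3.1):
# every a ∈ A must be the first component of exactly one pair.
def is_function(f, A, B):
    first_components = [a for (a, b) in f]
    return (all(a in A and b in B for (a, b) in f)
            and all(first_components.count(a) == 1 for a in A))

A, B = {1, 2, 3, 4, 5}, {'a', 'b', 'c', 'd'}
f = {(1, 'b'), (2, 'a'), (4, 'd'), (5, 'd')}
g = {(1, 'b'), (2, 'a'), (3, 'a'), (3, 'b'), (4, 'c'), (5, 'd')}
print(is_function(f, A, B))  # False: 3 is in no pair
print(is_function(g, A, B))  # False: 3 is in two pairs
```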

(c) Let A and B be two nonempty sets and let b ∈ B be a fixed element
of B. Then f, defined by

af = b for all a ∈ A, i.e., f = {(a,b) ∈ A × B : a ∈ A},

is a mapping from A into B. This is sometimes called the constant
function b.

(d) For any (a,b) ∈ ℤ × ℤ, put (a,b)s = a + b. Then s is a function from
ℤ × ℤ into ℤ. This s may be called the sum function. It is an example of
a binary operation. We will examine binary operations later in this
paragraph.

(e) Let A = {u,x,y,z} and B = {1,2,3}, and put

uf = 1, xf = 2, yf = 2, zf = 1.

Then f is a function from A into B.

(f) Let A be a nonempty set and let S be the set of all subsets of A. For
any a ∈ A, put af = {a} ∈ S. Then f is a function from A into S.

(g) Put xf = x² for all x ∈ ℝ. Then f is a function from ℝ into ℝ.

(h) Consider f = {(x,y) ∈ ℝ × ℝ : x² = y²}. Then f is not a function from
ℝ into ℝ, since 1, for example, is the first component of two distinct
ordered pairs (1,1) and (1,−1) in f. On the other hand, if ℝ⁺ denotes the
set of positive real numbers, then g = {(x,y) ∈ ℝ⁺ × ℝ⁺ : x² = y²} is a function
from ℝ⁺ into ℝ⁺. In fact, g is the identity function on ℝ⁺.

(i) Let f: A → B be a mapping from A into B and let A₁ be a nonempty
subset of A. For any a ∈ A₁, we put ag = af. Then g is a mapping from A₁
into B. In terms of ordered pairs, we have

g = f ∩ (A₁ × B).

g is called the restriction of f to A₁. We usually write f_{A₁} or f|_{A₁} to denote
the restriction of f to A₁. If g is a restriction of f to a subset of the
domain of f, then f is called an extension of g.

(j) Let A be a nonempty set and let B be a fixed subset of A. For any a in
A, we put

χ_B(a) = 0 if a ∉ B,
χ_B(a) = 1 if a ∈ B.

Then χ_B is a function from A into {0,1}. It is called the characteristic
function of B. Here we wrote the function on the left.

(k) For any x ∈ ℝ, we put

f(x) = 0 if x is irrational,
f(x) = 1 if x is rational.

Then f is a function from ℝ into ℝ. In fact, f is the characteristic
function of the set ℚ of rational numbers. The image of some x ∈ ℝ is not
known. For instance, it is not known whether Euler's constant γ is
rational or not. Nevertheless, f is a genuine function. This example is due
to L. Dirichlet (1805-1859).

(l) Let A be a nonempty set and let ~ be an equivalence relation on A.
Let A/~ be the set of equivalence classes under ~. Then

ν: A → A/~
a ↦ [a]

is a mapping from A into A/~. It is called the natural mapping or the
canonical mapping from A into A/~.

3.3 Definition: Let f: A → B and f₁: A₁ → B be two functions. f and f₁
are called equal if A = A₁ and af = af₁ for all a ∈ A = A₁.

So, in order that two functions f and f₁ be equal, their domains must be
equal and the images of any element in this common domain under the
mappings f and f₁ must be equal, too. In particular, if f: A → B is a
function and B ⊆ C, then the function g, defined by ag = af for all a ∈ A,
is equal to f. The ranges do not play any role in the definition of
equality. (In some branches of mathematics, for example in topology,
two functions with different ranges are sometimes considered distinct,
even if their domains and functional values coincide.)
In the definition of a mapping f: A → B, we required that every element
of A be the first component of at least one ordered pair in f and also that
every element of A be the first component of at most one ordered pair
in f. There was no analogous requirement for the elements of B. If we
impose similar conditions on the elements of B, we get special types of
functions, which we now introduce.

3.4 Definition: Let f: A → B be a mapping. If every element of B is the
second component of at least one ordered pair in f, then f is called a
mapping from A onto B.

The reader must be careful about the usage of the prepositions "into"
and "onto", for they are used with different meanings. That f is a
function from A onto B means that every element of B is the image of
some element of A. For an arbitrary mapping f: A → B, an element of B
has perhaps no preimage at all, but if f is a mapping from A onto B, then
each element of B has at least one preimage in A.

The range should be specified whenever the term "onto" is used. A


function is not "onto" by itself, it is only onto a specific set. We shall
frequently treat the word "onto" as an adjective, but it will always be
clear from the context which range set is meant.

3.5 Examples: (a) The mapping f: ℝ → ℝ, given by xf = x² for all
x ∈ ℝ, is not onto, since −1 ∈ ℝ, for instance, has no preimage under f.

(b) Let ℝ⁺ denote the set of all positive real numbers. Then the map-
ping f: ℝ⁺ → ℝ⁺, given by xf = x² for all x ∈ ℝ⁺, is onto.

(c) The mapping g: {1,2,3,4,5} → {a,b,c}, given by

1g = a, 2g = a, 3g = a, 4g = b, 5g = c

is onto.

(d) Let A be any nonempty set. Then ι_A: A → A is onto, for any a ∈ A
has a preimage a in A under ι_A since aι_A = a.
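As an editorial illustration of Definition 3.4 (not part of the original text): with a finite mapping stored as a Python dict a ↦ af, the "onto" condition is a direct search for preimages. The helper name `is_onto` is ours.

```python
# Illustration (not from the text): f is onto B when every b ∈ B has at
# least one preimage (Definition 3.4). Mappings are stored as dicts a -> af.
def is_onto(f, B):
    return all(any(f[a] == b for a in f) for b in B)

g = {1: 'a', 2: 'a', 3: 'a', 4: 'b', 5: 'c'}   # Example 3.5(c)
print(is_onto(g, {'a', 'b', 'c'}))  # True
square = {x: x * x for x in range(-3, 4)}
print(is_onto(square, set(range(-3, 4))))  # False: -1 has no preimage
```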

3.6 Definition: Let f: A → B be a mapping. If every element of B is the
second component of at most one ordered pair in f, then f is called a
one-to-one mapping from A into B.

A function f: A → B is therefore one-to-one if an arbitrary element of B
has either no preimage in A or exactly one preimage: any two preimages
of b ∈ B (if b has a preimage at all) must be equal. So the necessary and
sufficient condition for a mapping f: A → B to be one-to-one is

af = b and a₁f = b ⟹ a = a₁ (a,a₁ ∈ A, b ∈ B)

or, more shortly,

af = a₁f ⟹ a = a₁ (a,a₁ ∈ A),

whose contrapositive reads

a ≠ a₁ ⟹ af ≠ a₁f (a,a₁ ∈ A).

A one-to-one mapping is a mapping by which different elements in the


domain are matched with different elements in the range. Being a one-
to-one function is the negation of being a "many-to-one" function, by
which many elements in the domain are matched with one element in
the range.

3.7 Examples: (a) {(x,y) ∈ ℝ × ℝ : x² = y} is not a one-to-one function
from ℝ into ℝ, for the two distinct elements x and −x (if x ≠ 0) have the
same image.

(b) Let ℝ⁺ denote the set of all positive real numbers. Then the mapping
{(x,y) ∈ ℝ⁺ × ℝ⁺ : x² = y} is a one-to-one function from ℝ⁺ into ℝ⁺.

(c) The mapping g: {1,2,3} → {a,b,c,d}, given by

1g = b, 2g = d, 3g = a

is one-to-one.

(d) Let A be a nonempty set. Then ι_A: A → A is one-to-one, for if
aι_A = bι_A, then a = b from the definition of ι_A.
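As an editorial illustration of Definition 3.6 (not part of the original text): for a finite mapping stored as a dict, being one-to-one means the list of images contains no repetition. The helper name `is_one_to_one` is ours.

```python
# Illustration (not from the text): f is one-to-one when af = a1f forces
# a = a1 (Definition 3.6). Mappings are stored as dicts a -> af.
def is_one_to_one(f):
    images = list(f.values())
    return len(images) == len(set(images))   # no image is repeated

g = {1: 'b', 2: 'd', 3: 'a'}                 # Example 3.7(c)
print(is_one_to_one(g))  # True
square = {x: x * x for x in range(-3, 4)}
print(is_one_to_one(square))  # False: -2 and 2 have the same image
```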
Suppose we have two functions f: A → B and g: B → C. For any a ∈ A, we
find af = b ∈ B and then apply g to this element af = b of B. We get an
element c = bg of C. In this way, the element a of A is assigned to an
element c of C. Here af = b is uniquely determined by f (since f is a
mapping) and bg = c is uniquely determined by g (since g is a mapping).
So c is uniquely determined: we have a mapping from A into C.

3.8 Definition: Let f: A → B and g: B → C be two functions. Then

h = {(a,(af)g) : a ∈ A} ⊆ A × C,

which is a function from A into C, is called the composition of f with g, or
the product of f by g.

We write h = f ∘ g or more simply h = fg. Thus a(fg) is defined as (af)g.

In order to compose two functions f and g, we must make sure that the
range of the first function f is a subset of the domain of the second
function g. Otherwise, their composition is not defined. Note the order of
the functions f and g. We apply f first, then g; and we write first f, then g
in the composition notation fg. One of the advantages of writing the
functions on the right becomes evident here. If we had written the
functions on the left, then fg would have meant: first apply g, then f [as
in the calculus, where (f o g)(x) = f(g(x))] and we would have been reading
backwards. Notice also that the domain of fg is the domain of f.
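The convention can be imitated in code. In this sketch (the helper name `compose` is ours), `compose(f, g)` builds fg, "apply f first, then g", so the domain of fg is the domain of f:

```python
def compose(f, g):
    # fg sends a to (af)g: f is applied first, then g.
    return lambda a: g(f(a))

f = lambda x: x + 1      # af = a + 1
g = lambda x: 3 * x      # bg = 3b

fg = compose(f, g)
print(fg(4))   # (4 + 1) * 3 = 15, not 3*4 + 1 = 13
```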

3.9 Examples: (a) Let f: A → B be a mapping. Then it is easily seen that
f ι_B = f and ι_A f = f. Indeed, the domains of f ι_B, f, ι_A f are all equal to A and

a(f ι_B) = (af) ι_B = af   and   a(ι_A f) = (a ι_A)f = af

for all a ∈ A. In particular, if g: A → A is a mapping, then g ι_A = g = ι_A g.

(b) Let f: {1,2,3,4} → {a,b,c,d} and g: {a,b,c,d} → {5,x,U, , } be given by

1f = a        ag = U
2f = c        bg = x
3f = d        cg =
4f = b        dg = 5

Then we have

1(fg) = (1f)g = ag = U
2(fg) = (2f)g = cg =
3(fg) = (3f)g = dg = 5
4(fg) = (4f)g = bg = x.

(c) Given f: {1,2,3} → {a,b,c} and g: {a,b,c} → {x,y,z}, where

f: 1 ↦ a      and      g: a ↦ y
   2 ↦ a                  b ↦ z
   3 ↦ b                  c ↦ z,

we have

fg: 1 ↦ y
    2 ↦ y
    3 ↦ z.

Notice that gf is not defined.

(d) Given f: ℝ → ℝ and g: ℝ → ℝ, where

f: x ↦ sin x      and      g: x ↦ x²,

we have

x(fg) = (xf)g = (sin x)g = (sin x)² = sin²x,

x(gf) = (xg)f = (x²)f = sin(x²).

(e) Given f: ℝ → ℝ and g: ℝ → ℝ, where

f: x ↦ x² − 1      and      g: x ↦ x² + 1,

we have

x(fg) = (xf)g = (x² − 1)g = (x² − 1)² + 1 = x⁴ − 2x² + 2,

x(gf) = (xg)f = (x² + 1)f = (x² + 1)² − 1 = x⁴ + 2x².

Given two functions f: A → B and g: B → C, we might be tempted to ask
whether fg = gf. Example 3.9(b) and Example 3.9(c) tell us that this
question is meaningless, for, although fg is defined in these examples, gf
is not even defined, let alone equal to fg. Example 3.9(d) and Example
3.9(e) show that the two functions fg and gf, even if they both exist, are
not necessarily equal. We have fg ≠ gf in general: the composition of
mappings is not commutative.

However, it is associative.

3.10 Theorem: Let f: A → B, g: B → C, h: C → D be three functions. Then

(fg)h = f(gh).

Proof: We must prove that the domains of (fg)h and f(gh) are equal and
that an arbitrary element in the common domain is assigned to the same
element by (fg)h and by f(gh).

The domain of (fg)h is the domain of fg, which is the domain of f, which
is A. The domain of f(gh) is the domain of f, which is A. So the domains
of (fg)h and f(gh) coincide.

Now let a be an arbitrary element of A. Then

a((fg)h) = (a(fg))h   (by the definition of (fg)h; forget that fg is a
                       composition itself)
         = ((af)g)h   (recall now that fg is a composition,
                       applied to an element a)
         = (af)(gh)   (definition of gh, applied to an element af)
         = a(f(gh))   (definition of f(gh)),

which yields (fg)h = f(gh).
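The theorem can be spot-checked numerically. In this sketch (the `compose` helper is ours, with fg meaning "f first, then g"), (fg)h and f(gh) agree on every sample input:

```python
def compose(f, g):
    # fg: apply f first, then g.
    return lambda a: g(f(a))

f = lambda x: x + 1
g = lambda x: x * x
h = lambda x: x - 3

fg_h = compose(compose(f, g), h)   # (fg)h
f_gh = compose(f, compose(g, h))   # f(gh)

print(all(fg_h(x) == f_gh(x) for x in range(-20, 21)))   # True
```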

Onto mappings and one-to-one mappings behave very nicely when they
are composed.

3.11 Theorem: Let f: A → B, g: B → C be two functions and let fg: A → C
be their composition.
(1) If f is onto and g is onto, then fg is onto.
(2) If f is one-to-one and g is one-to-one, then fg is one-to-one.

Proof: (1) Suppose f and g are onto. For any c ∈ C, we must find a
preimage of c under fg. The only thing we know about C is that C is the
range of g. Now g is onto, so c has a preimage in B under g. Let b ∈ B be
such that bg = c. Since b ∈ B and B is the range of f, and f is onto, b has a
preimage a ∈ A under f, so that af = b. Then we get a(fg) = (af)g = bg = c.
So a is a preimage of c under fg. This proves that fg is onto. (Summary: a
preimage of a preimage is a preimage that works.)

(2) Now suppose f and g are one-to-one. We must prove a = a₁ whenever
a(fg) = a₁(fg), for all a, a₁ ∈ A. Indeed, if
a(fg) = a₁(fg),
then (af)g = (a₁f)g,
so af = a₁f (since g is one-to-one),
and a = a₁ (since f is one-to-one).
This proves that fg is one-to-one.

The converse of Theorem 3.11 is wrong. If f: A → B and g: B → C are two
functions and if fg: A → C is onto, it does not always follow that both f
and g are onto. Also, if f: A → B and g: B → C are two functions and if
fg: A → C is one-to-one, it does not always follow that both f and g are
one-to-one. This can be read off from the functions displayed below.

f: {a,b,c} → {x,y,z}       g: {x,y,z} → {1,2}
   a ↦ x                      x ↦ 1
   b ↦ y                      y ↦ 2
   c ↦ y                      z ↦ 2

f₁: {a,b} → {x,y,z}        g₁: {x,y,z} → {1,2,3}
   a ↦ x                      x ↦ 1
   b ↦ y                      y ↦ 2
                              z ↦ 2

Here fg is onto, but f is not onto; and f₁g₁ is one-to-one, but g₁ is not one-
to-one.

However, we have a partial result in this direction. Observe that g is
onto and f₁ is one-to-one in these examples. This is not a coincidence.

3.12 Lemma: Let f: A → B, g: B → C be two functions and let fg: A → C
be their composition.
(1) If fg is onto, then g is onto.
(2) If fg is one-to-one, then f is one-to-one.

Proof: (1) Assume fg is onto. For any c ∈ C, we must find a preimage of
c in B under g. Now any c ∈ C has a preimage in A under fg. Let c = a(fg),
where a ∈ A. Then c = (af)g. So af ∈ B is a preimage of c in B under g.
This proves that g is onto. (Summary: the image of a preimage is a
preimage that works.)

(2) Assume fg is one-to-one. We wish to prove that f is one-to-one.
Suppose that af = a₁f, where a, a₁ ∈ A. Applying g to both sides of this
equation, we get (af)g = (a₁f)g, therefore a(fg) = a₁(fg). Since fg is one-to-
one by hypothesis, we get a = a₁. This proves that af = a₁f implies a = a₁.
Thus f is one-to-one.

In view of its importance, we record the most important corollary of
Theorem 3.11 as a separate theorem.

3.13 Theorem: Let f: A → B, g: B → C be one-to-one and onto. Then the
composition fg: A → C is one-to-one and onto.

Assume we have a mapping f: A → B. We want to define a new mapping
g: B → A by inverting the order of the components of the ordered pairs
in f. In other words, we want to define g by putting (b,a) ∈ g if and only
if (a,b) ∈ f. This g is a relation from B into A. The question arises: when
is g in fact a mapping from B into A?

The necessary and sufficient condition for g to be a mapping is that each
element of B be the first component of at least one and at most one
ordered pair in g. By the definition of g, this is equivalent to the con-
dition that each element of B be the second component of at least one
ordered pair in f (i.e., f be onto) and also of at most one ordered pair in
f (i.e., f be one-to-one). Let us observe that the mapping g is then
uniquely determined by
bg = a if and only if af = b.
We proved the

3.14 Theorem: Let f: A → B be a mapping. The following assertions are
equivalent.
(i) f is one-to-one and onto.
(ii) There is a unique mapping g: B → A such that
bg = a if and only if af = b   (a ∈ A, b ∈ B).

3.15 Definition: The mapping g of Theorem 3.14 is called the inverse
mapping of f, or simply the inverse of f. It is denoted by f⁻¹.

3.16 Theorem: Let f: A → B be one-to-one and onto, and let f⁻¹: B → A
be its inverse. Then ff⁻¹ = ι_A and f⁻¹f = ι_B.

Proof: We must show that the domains and functional values coincide.
The domain of ff⁻¹ is the domain of f, which is A, and A is the domain of
ι_A. Further, for any a ∈ A, we have a(ff⁻¹) = (af)f⁻¹ = a = a ι_A by the
definition of f⁻¹. This proves ff⁻¹ = ι_A.

The domain of f⁻¹f is the domain of f⁻¹, which is B, and B is the domain
of ι_B. Further, for any b ∈ B, we have

b(f⁻¹f) = (bf⁻¹)f = af   (where a is the unique element of A with af = b)
        = b = b ι_B.

This proves f⁻¹f = ι_B.

3.17 Theorem: (1) Let f: A → B be one-to-one and onto. Then f⁻¹: B → A
is one-to-one and onto.

(2) Let f: A → B be a mapping. If there is a mapping g: B → A such that
fg = ι_A and gf = ι_B, then f is one-to-one and onto (and therefore g is
the inverse of f).

Proof: (1) We have f⁻¹f = ι_B by Theorem 3.16. Since ι_B is one-to-one
(Example 3.7(d)), f⁻¹ is one-to-one by Lemma 3.12(2). Also, we have
ff⁻¹ = ι_A by Theorem 3.16. Since ι_A is onto (Example 3.5(d)), f⁻¹ is onto by
Lemma 3.12(1).

(2) We use the same reasoning. fg = ι_A is one-to-one, so f is one-to-one,
and gf = ι_B is onto, so f is onto.

A mapping f: A → B is said to be a one-to-one correspondence between
A and B in case f is one-to-one and onto. If f is a one-to-one correspond-
ence between A and B, then f⁻¹ is a one-to-one correspondence between
B and A by Theorem 3.17(1).
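For a finite one-to-one correspondence, the inverse of Definition 3.15 is literally the set of reversed ordered pairs. A small sketch (the dict representation and the name `inverse` are ours):

```python
def inverse(f):
    # Swap the components of every ordered pair (a, af).
    # This relation is a mapping exactly when f is one-to-one and onto.
    return {b: a for a, b in f.items()}

f = {1: "b", 2: "c", 3: "a"}        # a one-to-one correspondence
g = inverse(f)
print(g)                             # {'b': 1, 'c': 2, 'a': 3}
# ff^-1 fixes every element of the domain, as in Theorem 3.16:
print(all(g[f[a]] == a for a in f))  # True
```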

*

* *

We now introduce binary operations. They constitute a generalization of


the four elementary operations addition, subtraction, multiplication and
division that everybody learns in the primary school. Consider addition,
for example. Given any two numbers a and b, their sum is a uniquely
determined number. This is the core of the operation concept: given two
objects a and b, associate with them a unique object of the same kind.
More precisely, we have the

3.18 Definition: Let S be a nonempty set. A binary operation on S is a
mapping from S × S into S.

The important thing about a binary operation β is that it is defined for
all ordered pairs (a,b) ∈ S × S and that the result of the operation, (a,b)β,
is an element of S.

Although a binary operation is a mapping, we will not employ the
functional notation (a,b)β. As in the case of the elementary operations,
we write a sign like "+", "·", "∘", "∗" between the elements a and b to
denote the image of (a,b) under β. So the image of (a,b) will be denoted
by a + b, a·b, a ∘ b, a ∗ b, or by a similar symbol.

3.19 Examples: (a) The elementary operations addition, subtraction,
multiplication are binary operations on ℤ. Subtraction is not a binary
operation on ℕ, since 1 − 2, for instance, is not an element of ℕ (although
1 and 2 are).

(b) Let M be a set and let S be the set of all subsets of M. Taking union
and taking intersection are binary operations on S. The usual notation
"A B", "A B" conforms to the remarks above.

(c) Let F be the set of all functions from a set A into A. The usual com-
position of functions is a binary operation on F.

(d) Let us write x ∘ y = x + y² and x ∗ y = x² + x + 1 for real numbers x,y.
Then ∘ and ∗ are binary operations on ℝ. Here y does not enter into
x ∗ y in any way, but this does not preclude ∗ from being a binary
operation.

(e) Let V be the set of all vectors in the three-space ℝ³. Taking the dot
product of two vectors is not a binary operation on V, since the result is
a scalar (real number), not a vector. On the other hand, taking the cross
product is a binary operation on V, since the result is a uniquely
determined vector in V.

(f) For any natural numbers m,n, let m • n denote their (positive)
greatest common divisor. Then • is a binary operation on ℕ.

(g) Let S be the set of all students in a classroom. For any students a,b
in S, let a · b be that student who sits in front of a. Then · is not a binary
operation on S, for a · b is not defined if a happens to sit in the foremost
row. Remember that a binary operation on S has to be defined for all
pairs in S × S.

(h) For any ordered pairs (a,b), (c,d) of real numbers, we put
(a,b) + (c,d) = (a + c, b + d),
(a,b)·(c,d) = (ac − bd, ad + bc).
Then + and · are binary operations on ℝ × ℝ. Notice that one and the
same symbol "+" stands for two different binary operations, one on ℝ,
and one on ℝ × ℝ.
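Example 3.19(h) is precisely how complex-number arithmetic is built on ℝ × ℝ. A direct transcription (the function names are ours):

```python
def pair_add(p, q):
    # (a,b) + (c,d) = (a + c, b + d)
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def pair_mul(p, q):
    # (a,b)·(c,d) = (ac − bd, ad + bc)
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

print(pair_add((1, 2), (3, 4)))   # (4, 6)
print(pair_mul((0, 1), (0, 1)))   # (-1, 0): the pair (0,1) squares to (-1,0)
```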

Exercises

1. Let f : A → B be a mapping. Prove that f is one-to-one if and only if
there is a mapping g: B → A such that fg = ι_A; prove that f is onto if and
only if there is a mapping h: B → A such that hf = ι_B.

2. Let f : A → B be a mapping. For any subset A₁ of A, we put
f(A₁) = {f(a) ∈ B: a ∈ A₁}
and for any subset B₁ of B, we put
f⁻¹(B₁) = {a ∈ A : f(a) ∈ B₁}.
(f(A₁) is called the image of A₁, and f⁻¹(B₁) is called the preimage of B₁.
Most people refer to f(A) as the range of f. Here we wrote the functions
on the left.) Prove that

f(A₁ ∩ A₂) ⊆ f(A₁) ∩ f(A₂),   f(A₁ ∪ A₂) = f(A₁) ∪ f(A₂),

f⁻¹(B₁ ∩ B₂) = f⁻¹(B₁) ∩ f⁻¹(B₂),   f⁻¹(B₁ ∪ B₂) = f⁻¹(B₁) ∪ f⁻¹(B₂),

A₁ ⊆ f⁻¹(f(A₁)),   f(f⁻¹(B₁)) ⊆ B₁

for any subsets A₁,A₂ of A and for any subsets B₁,B₂ of B.

3. Keep the notation of Ex. 2. Prove that f is one-to-one if and only if
f(A₁ ∩ A₂) = f(A₁) ∩ f(A₂) for any subsets A₁, A₂ of A.

4. Keep the notation of Ex. 2. Assume that f is one-to-one and onto, and
let f⁻¹: B → A be its inverse. Show that the preimage of B₁ under f is the
image of B₁ under the mapping f⁻¹, and that the preimage of A₁ under
f⁻¹ is f(A₁), for any subsets B₁ and A₁ of B and A, respectively.

§4
Mathematical Induction

Examine the propositions

2ⁿ > n                           for all n ∈ ℕ,
1 + 2 + . . . + n = n(n + 1)/2   for all n ∈ ℕ,
n^(n+1) > (n + 1)ⁿ               for all n ∈ ℕ, n ≥ 3.

How do we prove them? They are statements involving a variable n
running through the infinite set ℕ. Strictly speaking, each one of these
propositions above is a collection of infinitely many propositions. We can
verify them for a finite number of cases where n assumes some specific
values. Thus we might verify 2ⁿ > n for n = 1,2,3, . . . ,1 000 000 and
convince ourselves of the truth of this statement, but this is far from a
proof. On the other hand, we cannot check the truth of infinitely many
statements within finite time. So we must resort to some other means.

In order to prove propositions about all natural numbers, an axiom is
introduced. It is the fifth Peano axiom about ℕ (Giuseppe Peano (1858-
1932), an Italian mathematician and logician). It is called the axiom of
mathematical induction.

4.1 Axiom (of mathematical induction): If S is a subset of ℕ such
that
I. 1 ∈ S,
II. for all k ∈ ℕ, if k ∈ S, then k + 1 ∈ S,
then S is the whole of ℕ, i.e., S = ℕ.

We can use this axiom to prove statements of the form 'pₙ for all n ∈ ℕ'
as follows. We let S be the set of all natural numbers n for which pₙ
is true. First we verify 1 ∈ S, that is, we verify that p₁ is true. Second, we
assume that k ∈ S and under this hypothesis, which is called the
induction hypothesis, we prove that pₖ₊₁ is true. So we show that k ∈ S
implies k + 1 ∈ S. By the axiom of mathematical induction, S = ℕ, so the
statement pₙ is true for all n ∈ ℕ. We formulate the axiom as an
operational procedure.

4.2 Principle of mathematical induction: Let pₙ be a statement
involving a natural number n. We can prove the proposition

for all n ∈ ℕ, pₙ

by establishing that
I. p₁ is true,
II. for all k ∈ ℕ, if pₖ is true, then pₖ₊₁ is true.

Proofs by the principle of mathematical induction consist of two steps.
In the first step, we show that p₁ is true. In practice, this is often quite
easy, but we should not neglect it. In the second step, we assume that pₖ
is true. This assumption is the inductive hypothesis. Using this hypo-
thesis, we prove that pₖ₊₁ is true. A proof by induction will not be
complete (and valid) if we carry out the first step but not the second, or
if we carry out the second step but not the first.

4.3 Examples: (a) Prove that 1 + 2 + . . . + n = n(n + 1)/2 for all n ∈ ℕ.
We use the principle of mathematical induction.
I. 1 = 1(1 + 1)/2, so the formula is true for n = 1.
II. Make the inductive hypothesis that 1 + 2 + . . . + k = k(k + 1)/2.
We want to establish 1 + 2 + . . . + k + (k + 1) = (k + 1)((k + 1) + 1)/2. We have

1 + 2 + . . . + k + (k + 1) = k(k + 1)/2 + (k + 1)   (by inductive hyp.)
                            = (k/2 + 1)(k + 1)
                            = (k + 1)(k + 2)/2,

so the formula is true for n = k + 1 if it is true for n = k. Hence

1 + 2 + . . . + n = n(n + 1)/2 for all n ∈ ℕ.

(b) Prove that 2 + 2² + 2³ + . . . + 2ⁿ = 2^(n+1) − 2 for all n ∈ ℕ.

I. We have 2 = 2^(1+1) − 2, which proves the assertion for n = 1.

II. Assume 2 + 2² + 2³ + . . . + 2ᵏ = 2^(k+1) − 2. Now we must prove
2 + 2² + 2³ + . . . + 2ᵏ + 2^(k+1) = 2^((k+1)+1) − 2. We have

2 + 2² + 2³ + . . . + 2ᵏ + 2^(k+1) = (2^(k+1) − 2) + 2^(k+1)   (by inductive hyp.)
                                   = 2(2^(k+1)) − 2
                                   = 2^(k+2) − 2,

so the assertion is true for n = k + 1 if it is true for n = k. Thus
2 + 2² + 2³ + . . . + 2ⁿ = 2^(n+1) − 2 for all n ∈ ℕ.

(c) Let h ≥ −1 be a fixed real number. Prove that (1 + h)ⁿ ≥ 1 + nh for
all n ∈ ℕ.
I. We have (1 + h)¹ ≥ 1 + 1h, so the inequality is true for n = 1.
II. Let us assume (1 + h)ᵏ ≥ 1 + kh. We want to prove that
(1 + h)^(k+1) ≥ 1 + (k + 1)h. We have
(1 + h)^(k+1) = (1 + h)ᵏ(1 + h)
              ≥ (1 + kh)(1 + h)   (by inductive hyp. and 1 + h ≥ 0)
              = 1 + h + kh + kh²
              ≥ 1 + h + kh + 0
              = 1 + (k + 1)h,
so the inequality is true for n = k + 1 if it is true for n = k. By the
principle of mathematical induction,
(1 + h)ⁿ ≥ 1 + nh for all n ∈ ℕ.
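Induction proves these statements for every n at once; a numeric spot-check of Examples 4.3(a)-(c) over small n can still catch slips (a sanity check, not a substitute for the proofs; the function name is ours):

```python
def check(n, h=0.5):
    # Example 4.3(a): 1 + 2 + ... + n = n(n+1)/2
    a = sum(range(1, n + 1)) == n * (n + 1) // 2
    # Example 4.3(b): 2 + 2^2 + ... + 2^n = 2^(n+1) - 2
    b = sum(2**i for i in range(1, n + 1)) == 2**(n + 1) - 2
    # Example 4.3(c): (1 + h)^n >= 1 + nh, valid because h >= -1
    c = (1 + h)**n >= 1 + n * h
    return a and b and c

print(all(check(n) for n in range(1, 60)))   # True
```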

Sometimes it is convenient to use the principle of mathematical induc-
tion in a slightly different form. We assume (not only qₖ, but rather) that
each one of q₁, q₂, q₃, . . . , qₖ is true and then conclude that qₖ₊₁ is true.
This establishes the truth of qₙ for all n ∈ ℕ, as the following lemma
shows.

4.4 Lemma: Let qₙ be a statement involving a natural number n.
Assume that
i. q₁ is true,
ii. for all k ∈ ℕ, if q₁, q₂, q₃, . . . , qₖ are true, then qₖ₊₁ is true.
Then qₙ is true for all n ∈ ℕ.

Proof: We prove the lemma by the principle of mathematical induction.
We put
p₁ = q₁
pₖ = q₁ and q₂ and . . . and qₖ   (for all k ∈ ℕ, k ≥ 2).
Now induction.
I. p₁ is true (by the hypothesis i.)
II. Make the inductive hypothesis that pₖ is true. Then
q₁ and q₂ and . . . and qₖ is true (definition of pₖ),
so q₁, q₂, . . . , qₖ are all true (truth value of conjunction),
so qₖ₊₁ is true (by the hypothesis ii.),
so q₁, q₂, . . . , qₖ, qₖ₊₁ are all true,
so q₁ and q₂ and . . . and qₖ and qₖ₊₁ is true,
so pₖ₊₁ is true.
Hence, for all k ∈ ℕ, if pₖ is true, then pₖ₊₁ is true. By the principle of
mathematical induction, pₙ is true for all n ∈ ℕ. So
q₁ and q₂ and . . . and qₙ is true for all n ∈ ℕ.
In particular, qₙ is true for all n ∈ ℕ. This completes the proof.

We can now formulate a new form of the principle of mathematical
induction. This form will be used many times in the sequel.

4.5 Principle of mathematical induction: Let qₙ be a statement
involving a natural number n. We can prove the proposition
for all n ∈ ℕ, qₙ
by establishing that
I. q₁ is true,
II. for all k ∈ ℕ, if q₁, q₂, . . . , qₖ are true, then
qₖ₊₁ is true.

The statement '2ⁿ > n²' is not true for all natural numbers n, but true
for all natural numbers n ≥ 5. The principle of mathematical induction
can be used to prove this and similar propositions. Let a be a fixed inte-
ger (positive, negative or zero) and let pₙ be a statement involving an
integer n ≥ a. We prove the truth of pₙ for all n ≥ a by showing that
1. pₐ is true,
2. for all k ≥ a, if pₖ is true, then pₖ₊₁ is true.
This is easily seen when we put qₙ = p₍ₙ₊ₐ₋₁₎ for n ∈ ℕ and use Principle
4.2 with qₙ in place of pₙ. There is a similar modification of Principle 4.5.

Exercises

Prove the assertions in Ex. 1-5 for all n ∈ ℕ by the principle of mathe-
matical induction.
1. 1 + 3 + . . . + (2n − 1) = n².
2. 1 + 4 + 7 + . . . + (3n − 2) = n(3n − 1)/2.
3. 1² + 2² + . . . + n² = n(n + 1)(2n + 1)/6.
4. 1³ + 2³ + . . . + n³ = n²(n + 1)²/4.
5. 1⁴ + 2⁴ + . . . + n⁴ = n(n + 1)(2n + 1)(3n² + 3n − 1)/30.

6. Prove that 2ⁿ > n² for all n ≥ 5, n ∈ ℕ.

7. Prove that n³ + 3n² + 2n + 1 > 0 for all n ≥ −2, n ∈ ℤ.

8. Prove that, for any n ∈ ℕ and for any positive real numbers
a₁, a₂, . . . , a_{2^n},
(a₁a₂ . . . a_{2^n})^(1/2^n) ≤ (a₁ + a₂ + . . . + a_{2^n})/2ⁿ.

9. Prove that, for any n ∈ ℕ and for any positive real numbers
a₁, a₂, . . . , aₙ,
(a₁a₂ . . . aₙ)^(1/n) ≤ (a₁ + a₂ + . . . + aₙ)/n.

(Hint: if m is not a power of 2, choose n so that 2^(n−1) < m < 2ⁿ. Put
b = (a₁ + a₂ + . . . + aₘ)/m. Then use Ex. 8 with a₁, a₂, . . . , aₘ, aₘ₊₁, . . . , a_{2^n},
where aₘ₊₁ = . . . = a_{2^n} = b.)

§5
Divisibility

In this paragraph, we remind the reader of certain properties of integers


concerning divisibility. First we recall the definition.

5.1 Definition: Let a, b ∈ ℤ. If a ≠ 0 and if there is a c ∈ ℤ such that ac = b,
then a is called a divisor or a factor of b, and b is said to be divisible by
a. We also say a divides b.

We write a|b to express that a divides b. Whenever we employ the
notation a|b, it will be assumed of course that a ≠ 0. We shall write a∤b
when a ≠ 0 and b is not divisible by a. Thus we have 3|6, 3|9, 2|8, 5∤9,
5|10, 3∤7, 4|8, −3|6, 2|−4, −2|−4. The notations 0|b and 0∤b are
meaningless: not true or false, simply undefined.

Some basic properties of divisibility are collected below.

5.2 Lemma: Let a,b,c,m,n,m₁,m₂, . . . ,mₛ,b₁,b₂, . . . , bₛ be integers.
(1) If a|b, then a|−b, −a|b, −a|−b.
(2) If a|b and b|c, then a|c.
(3) If a|b and c ≠ 0, then ac|bc.
(4) If ac|bc, then a|b.
(5) If a|b and a|c, then a|b + c.
(6) If a|b and a|c, then a|b − c.
(7) If a|b and a|c, then a|mb + nc.
(8) If a|b₁, a|b₂, . . . , a|bₛ, then a|m₁b₁ + m₂b₂ + . . . + mₛbₛ.
(9) If a ≠ 0, then a|0.
(10) 1|a and −1|a.
(11) If a|b and b ≠ 0, then |a| ≤ |b|.
(12) If a|b and b|a, then |a| = |b|.

Proof: (1) If a|b, then a ≠ 0 and ak = b for some k ∈ ℤ. So a ≠ 0 and
a(−k) = −b; −a ≠ 0 and (−a)(−k) = b; −a ≠ 0 and (−a)k = −b; with −k, k ∈ ℤ.
Hence a|−b, −a|b, −a|−b.

(2) If a|b and b|c, then a ≠ 0 ≠ b and ak = b and bh = c for some k,h ∈ ℤ.
So a(kh) = bh = c and, since kh ∈ ℤ, we obtain a|c.

(3) If a|b, then a ≠ 0 and ak = b for some k ∈ ℤ. So (ac)k = bc. From a ≠ 0,
c ≠ 0, we conclude ac ≠ 0. Hence ac|bc.

(4) If ac|bc, then ac ≠ 0 and (ac)k = bc for some k ∈ ℤ. From ac ≠ 0, we
obtain a ≠ 0 and c ≠ 0. Since c ≠ 0, we have ak = b. Since a ≠ 0, we can
write a|b.

(5) If a|b and a|c, then a ≠ 0, and ak = b, ah = c for some k,h ∈ ℤ. So
a(k + h) = b + c. Since k + h ∈ ℤ and a ≠ 0, we have a|b + c.

(6) This can be proved in the same way as (5). We might also observe
that a|−c if a|c by (1), hence a|b + (−c) by (5), so a|b − c.

(7) If a|b and a|c, then a ≠ 0, and ak = b, ah = c for some k,h ∈ ℤ. So
a(km + hn) = ak·m + ah·n = bm + cn = mb + nc. Since km + hn ∈ ℤ and a ≠ 0,
we have a|mb + nc.

(8) This can be proved by a simple application of the principle of
mathematical induction.

(9) a0 = 0 for any a ∈ ℤ. If a ≠ 0, we can write a|0.

(10) a = 1·a = (−1)(−a). Hence 1|a and −1|a.

(11) If a|b, then a ≠ 0 and ak = b for some k ∈ ℤ. So |a||k| = |b|. Since b ≠ 0,
we have |k| ≥ 1. Thus |b| = |a||k| ≥ |a|.

(12) If a|b and b|a, then a ≠ 0 and b ≠ 0, so we may apply (11) to get
|a| ≤ |b| and |b| ≤ |a|. Thus |a| = |b|.

5.3 Theorem (Division algorithm): Let a,b ∈ ℤ, b > 0. Then there
are unique integers q,r such that
a = qb + r,   0 ≤ r < b.
(The integer q is called the quotient, and r is called the remainder ob-
tained when a is divided by b.)

Proof: There are two claims in this theorem: (1) that there are integers
q,r with the stated properties and (2) that these are unique, that is, the
pair of integers q,r is the only one which has the stated properties. The
proof of this theorem will accordingly consist of two parts. In the first
part, we prove the existence of q,r; in the second part, their uniqueness.

Existence. Consider the set T = {a − ub: u ∈ ℤ}. This set T contains
nonnegative integers (for example, a − (−|a|)b is nonnegative). We choose
the smallest nonnegative integer in T. Let it be called r. Thus r ≥ 0 and,
by the very definition of T, we infer r = a − qb for some q ∈ ℤ. We claim
r < b. If we had r ≥ b, then, since b > 0, we would get
r > r − b = a − (q + 1)b ≥ 0
and r − b would be a nonnegative integer in T, smaller than the smallest
nonnegative integer in T, which is absurd. So r ≥ b is impossible and
r < b. Hence there are integers q,r such that
a = qb + r,   0 ≤ r < b.

Uniqueness. Let a = qb + r, 0 ≤ r < b, and a = q₁b + r₁, 0 ≤ r₁ < b,
where q,r,q₁,r₁ are integers. We wish to prove q₁ = q and r₁ = r. It
suffices to prove q₁ = q, for then we would get r₁ = a − q₁b = a − qb = r
also. Suppose, by way of contradiction, that q ≠ q₁. Then there are two
possibilities: q < q₁ or q > q₁. Interchanging q,r with q₁,r₁ if necessary,
we may assume q > q₁ without loss of generality (make sure that you
understand this reasoning). From q > q₁, we get q − q₁ ≥ 1, hence
b > r₁ = r₁ − 0 ≥ r₁ − r = (a − q₁b) − (a − qb) = (q − q₁)b ≥ 1·b = b,
a contradiction. So q₁ = q and r₁ = r.

This theorem formalizes what everybody learns at primary school: when
we divide a by b, we get a quotient, and a remainder smaller than b. At
primary school, one learns it in the case a is positive, but here a can be
negative. Also, division is carried out by successive subtractions, as in
the familiar long-division tableau with dividend a, divisor b, quotient q
and remainder r: we subtract b from a until we get a number r smaller
than b. This is exactly what happens when we perform division, and this
is essentially the proof of Theorem 5.3.
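Python's built-in `divmod` realizes Theorem 5.3 for b > 0: the quotient is rounded toward minus infinity, so the remainder satisfies 0 ≤ r < b even when a is negative. A short illustration:

```python
# q, r with a = qb + r and 0 <= r < b, for positive and negative a.
b = 5
for a in (17, -17):
    q, r = divmod(a, b)
    print(f"{a} = {q}*{b} + {r}")   # 17 = 3*5 + 2, then -17 = -4*5 + 3
    assert a == q * b + r and 0 <= r < b
```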

Given any two integers a,b, an integer d is said to be a common divisor


of a and b if d|a and d|b. Using the division algorithm, we can show that
any two integers have a greatest common divisor, provided only that not
both of them are equal to zero.

5.4 Theorem: Let a,b ∈ ℤ, not both zero. Then there is a unique integer
d such that
(i) d|a and d|b,
(ii) for all d₁ ∈ ℤ, if d₁|a and d₁|b, then d₁|d,
(iii) d ≥ 0.

Proof: The proof will be similar to the proof of Theorem 5.3. We
consider the set U = {ax − by : x,y ∈ ℤ}. Now U contains positive
integers. (For example, a(±1) − b0 is positive when a ≠ 0 and the sign is
chosen suitably. When a = 0, a1 − b(∓1) = ±b is positive, provided we
choose the sign appropriately, since b ≠ 0 when a = 0 by hypothesis.)
We choose the smallest positive integer in U. Let it be called d. So d > 0
and d satisfies (iii). Moreover, if d₁|a and d₁|b, then d₁|ax − by for any
x,y ∈ ℤ by Lemma 5.2(7), so d₁ divides every element of U. In particular,
d₁|d. Thus (ii) is satisfied. It remains to prove (i).

By the very definition of U, we have d = ax₀ − by₀ for some x₀,y₀ ∈ ℤ. We
want to prove d|a and d|b. Using the division algorithm, we write
a = qd + r, where 0 ≤ r < d. Then
a = q(ax₀ − by₀) + r,
r = a − q(ax₀ − by₀)
  = a(1 − qx₀) − b(−qy₀),   with 1 − qx₀, −qy₀ ∈ ℤ,
so r is an element of U and 0 ≤ r < d. Since d is the smallest positive
integer in U and r < d, we have necessarily r = 0. This gives a = qd, so
d|a. The proof of d|b is similar and will be omitted.

Now the uniqueness of d. Suppose d′ satisfies the conditions (i), (ii), (iii),
too. Then d′|a, d′|b by (i), and so d′|d by (ii). Also, d|a, d|b by (i), and so
d|d′ by (ii). By Lemma 5.2(12), we obtain |d| = |d′|. From (iii), we get d ≥ 0,
d′ ≥ 0, which yields d = d′. Thus d is unique.

5.5 Definition: Let a,b , not both zero. The unique integer d in
Theorem 5.4 is called the greatest common divisor of a and b.

The greatest common divisor of a and b will be denoted by (a,b). This


notation is standard. The reader should not confuse it with an ordered
pair. The greatest common divisor of a and b is a natural number, not an
ordered pair.

Definition 5.5 and the proof of Theorem 5.4 enable us to write the

5.6 Theorem: Let a,b ∈ ℤ, not both zero. Then (a,b) is the smallest
positive integer in the set {ax − by : x,y ∈ ℤ}.

Theorem 5.4 is a typical existence theorem. It tells us that the greatest
common divisor (a,b) of any pair of integers a,b exists (provided a and b
are not both zero), but gives no method for finding it. If a and b are
small in absolute value, we might try to find the smallest positive
integer in the set {ax − by : x,y ∈ ℤ}. This is not very satisfactory, of
course. Also, it is almost impossible if a and b are rather large. We pro-
pose to give a systematic method for finding (a,b) for any pair of
integers a,b, not both zero. This method will prove anew the existence of
(a,b) and in addition will give us a systematic method of finding integers
x,y such that (a,b) = ax − by. It is Proposition 2 in Euclid's Elements, Book
VII (in algebraic notation) and is known as the Euclidean algorithm.

We first observe that the set U in Theorem 5.6 does not change if we
write −a in place of a or −b in place of b. This yields
(a,b) = (−a,b) = (−a,−b) = (a,−b)
for all a,b, not both zero. Hence (a,b) = (|a|,|b|) and, when we want to find
(a,b), we may assume a ≥ 0, b ≥ 0 (the case a = 0, b = 0 is excluded)
without loss of generality. Moreover, the set U in Theorem 5.6 remains
unaltered if we interchange a and b. Thus
(a,b) = (b,a).
Therefore, when we want to find (a,b), we may assume a ≥ b without
loss of generality. (Instead of appealing to Theorem 5.6, we could use
the definition to obtain (a,b) = (−a,b) = (−a,−b) = (a,−b) = (b,a).)
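Both invariances are easy to confirm with the standard library's `math.gcd`, which always returns the nonnegative greatest common divisor:

```python
import math

a, b = 84, -30
# (a,b) = (-a,b) = (-a,-b) = (a,-b), and also (a,b) = (b,a):
values = {math.gcd(x, y) for x in (a, -a) for y in (b, -b)}
values.add(math.gcd(b, a))
print(values)   # {6}
```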

The greatest common divisor of a (a ≠ 0) and 0 is easily found. We
have (a,0) = |a|, as follows from Theorem 5.6 or immediately from
Theorem 5.4.

Suppose now a ≥ b > 0 and we want to find (a,b). We divide a by b and
get
a = q₁b + r₁,   0 ≤ r₁ < b.

Here r₁ may be zero. If r₁ ≠ 0, we divide b by r₁ and get

b = q₂r₁ + r₂,   0 ≤ r₂ < r₁.

Here r₂ may be zero. If r₂ ≠ 0, we divide r₁ by r₂ and get

r₁ = q₃r₂ + r₃,   0 ≤ r₃ < r₂.

We proceed in this way. We have b > r₁ > r₂ > r₃ > . . . . Since the rⱼ's are
nonnegative integers and b is a finite positive integer, this process
cannot go on indefinitely. Sooner or later, we will meet a division in
which the remainder is zero, say at the (k+1)-st step (k ≥ 0):

rₖ₋₂ = qₖrₖ₋₁ + rₖ,   0 < rₖ < rₖ₋₁,
rₖ₋₁ = qₖ₊₁rₖ + rₖ₊₁,   0 = rₖ₊₁.

We claim that rₖ, the last nonzero remainder, is the greatest common
divisor of a and b, and that it can be written in the form ax − by, where
x,y are integers.

5.7 Theorem: Let a ≥ b > 0 be integers and let

a = q₁b + r₁,   0 < r₁ < b,
b = q₂r₁ + r₂,   0 < r₂ < r₁,
r₁ = q₃r₂ + r₃,   0 < r₃ < r₂,
........................
rᵢ₋₁ = qᵢ₊₁rᵢ + rᵢ₊₁,   0 < rᵢ₊₁ < rᵢ,
........................
rₖ₋₂ = qₖrₖ₋₁ + rₖ,   0 < rₖ < rₖ₋₁,
rₖ₋₁ = qₖ₊₁rₖ

be the equations we obtain when we use the division algorithm
(Theorem 5.3) successively until we reach a remainder of zero. (This chain
of equations is known as the Euclidean algorithm.) Then the last nonzero
remainder rₖ is the greatest common divisor of a and b. Moreover, rₖ can
be written in the form rᵢ₋₁x − rᵢy; x,y ∈ ℤ, for i = k−1, k−2, . . . , 2, 1, 0 (we
put r₀ = b, r₋₁ = a). In particular, there are integers x₀, y₀ such that
(a,b) = ax₀ − by₀, and eliminating r₁, r₂, . . . , rₖ₋₁ from the equations above
gives a systematic way of finding the integers x₀, y₀.

Proof: We must show that rₖ satisfies the conditions (i), (ii), (iii) of
Theorem 5.4. We know rₖ > 0 from the k-th equation in the Euclidean
algorithm, so (iii) of Theorem 5.4 is satisfied.

We prove (i) of Theorem 5.4, namely that rₖ|a and rₖ|b. We start from the
last equation in the algorithm and go up through the algorithm. From
the (k+1)-st equation, we get rₖ|rₖ₋₁. Using Lemma 5.2, we get rₖ|rₖ₋₂ from
the k-th equation. So rₖ|rₖ₋₁ and rₖ|rₖ₋₂. From the (k−1)-st equation, we
get rₖ|rₖ₋₃, so rₖ|rₖ₋₂ and rₖ|rₖ₋₃. In general, if we have rₖ|rᵢ₊₁ and rₖ|rᵢ, the
(i+1)-st equation gives rₖ|rᵢ₋₁, so we have rₖ|rᵢ and rₖ|rᵢ₋₁. Going through
the equations in this way, we finally get rₖ|r₀ and rₖ|r₋₁, that is, we get
rₖ|b and rₖ|a. This proves (i) of Theorem 5.4.

Now (ii) of Theorem 5.4. Assume e|a and e|b. We must prove e|rₖ. We start
from the first equation in the algorithm and go down through the
algorithm. From the first equation, we get e|a − q₁b, that is, e|r₁ by Lemma
5.2. So e|b and e|r₁. From the second equation, we get e|b − q₂r₁, that is,
e|r₂. So e|r₁ and e|r₂. In general, if we have e|rᵢ₋₁ and e|rᵢ, the (i+1)-st
equation gives e|rᵢ₋₁ − qᵢ₊₁rᵢ, that is, e|rᵢ₊₁. So e|rᵢ and e|rᵢ₊₁. Going
through the equations in this way, we finally get e|rₖ. This proves (ii) of
Theorem 5.4.

Hence rₖ is the greatest common divisor of a and b.

Finally, we show the representability of rₖ in terms of rᵢ₋₁, rᵢ as
described. We start from the penultimate equation in the algorithm and
go up through the algorithm. From the k-th equation, we obtain
rₖ = rₖ₋₂ − rₖ₋₁qₖ, so rₖ can be represented as rₖ₋₂x − rₖ₋₁y, namely with
x = 1, y = qₖ. Substituting rₖ₋₃ − qₖ₋₁rₖ₋₂ for rₖ₋₁ in this equation, we get

rₖ = rₖ₋₂ − rₖ₋₁qₖ = rₖ₋₂ − (rₖ₋₃ − qₖ₋₁rₖ₋₂)qₖ
   = rₖ₋₃(−qₖ) + rₖ₋₂(1 + qₖ₋₁qₖ),

so rₖ can be represented as rₖ₋₃x − rₖ₋₂y, namely with x = −qₖ,
y = −(1 + qₖ₋₁qₖ). In general, if rₖ can be written in the form
rᵢx − rᵢ₊₁y,   x,y ∈ ℤ,
we get, using the (i+1)-st equation in the Euclidean algorithm,

rₖ = rᵢx − rᵢ₊₁y
   = rᵢx − (rᵢ₋₁ − qᵢ₊₁rᵢ)y
   = rᵢ₋₁(−y) + rᵢ(x + qᵢ₊₁y),

which shows that rₖ can be written also in the form rᵢ₋₁x₁ − rᵢy₁, namely
with x₁ = −y, y₁ = −(x + qᵢ₊₁y). Going through the equations in this way,
we finally obtain
rₖ = ax₀ − by₀
for some x₀,y₀ ∈ ℤ. This completes the proof.

5.8 Example: To find the greatest common divisor of 14732 and
37149, and to express it in the form 14732x + 37149y, with x, y ∈ ℤ.

We have 37149 = 2.14732 + 7685
14732 = 1.7685 + 7047
7685 = 1.7047 + 638
7047 = 11.638 + 29
638 = 22.29
and the last nonzero remainder is 29. So (14732,37149) = 29. Also
29 = 7047 - 11.638
= 7047 - 11(7685 - 1.7047)
= 12.7047 - 11.7685
= 12(14732 - 1.7685) - 11.7685
= 12.14732 - 23.7685
= 12.14732 - 23(37149 - 2.14732)
= 58.14732 - 23.37149,
so 29 = 14732x + 37149y with x = 58, y = -23.
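The back-substitution in Example 5.8 can be carried out mechanically. The following Python sketch of the extended Euclidean algorithm (the function name `extended_gcd` is our own label, not the book's notation) maintains the invariant a·x + b·y = r for every remainder r, so the coefficients x_0, y_0 of Theorem 5.4 come out together with the greatest common divisor.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y = d.

    Instead of substituting backwards as in Example 5.8, carry the
    coefficient pairs forward: the invariant a*x + b*y = r holds for
    every remainder r produced by the Euclidean algorithm.
    """
    x0, y0, r0 = 1, 0, a   # a*1 + b*0 = a
    x1, y1, r1 = 0, 1, b   # a*0 + b*1 = b
    while r1 != 0:
        q = r0 // r1
        # r0 - q*r1 is the next remainder; update the coefficients alike.
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
        r0, r1 = r1, r0 - q * r1
    return r0, x0, y0

d, x, y = extended_gcd(14732, 37149)
print(d, x, y)   # → 29 58 -23, matching Example 5.8
```

Running it reproduces the example: 14732·58 + 37149·(-23) = 29.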

5.9 Definition: Let a,b be integers, not both zero. a is said to be


relatively prime to b if (a,b) = 1.

Since (a,b) = (b,a), b is relatively prime to a in case a is relatively prime


to b. This observation enables us to use a symmetric phrase in this case.
We say a and b are relatively prime if (a,b) =1.

5.10 Lemma: Let a, b be integers, not both zero. Then a and b are rela-
tively prime if and only if there are integers x_0, y_0 such that ax_0 + by_0 = 1.

Proof: If (a,b) = 1, then there are integers x_0, y_0 such that ax_0 + by_0 = 1
by Theorem 5.6 or also by Theorem 5.7. Conversely, if there are
integers x_0, y_0 with ax_0 + by_0 = 1, then 1 is certainly the smallest positive
integer in the set {ax + by : x, y ∈ ℤ}, hence (a,b) = 1 by Theorem 5.6.

5.11 Lemma: Let a, b be integers, not both zero, and let d = (a,b). Then
a/d and b/d are relatively prime.

Proof: a/d, b/d are integers, not both of them zero. We have ax + by = d
for suitable integers x, y by Theorem 5.7. Dividing both sides of this
equation by d ≠ 0, we get
(a/d)x + (b/d)y = 1,
and so (a/d, b/d) = 1 by Lemma 5.10.

Using Lemma 5.10, we prove an important result that will be crucial in


the proof of the fundamental theorem of arithmetic.

5.12 Theorem: Let a, b, c be integers. If a|bc and (a,b) = 1, then a|c.

Proof: Since (a,b) = 1, we have ax + by = 1 with some x, y ∈ ℤ.
Multiplying both sides of this equation by c, we obtain acx + bcy = c. Now
a|acx and, since a|bc by hypothesis, a|bcy; hence a|(acx + bcy) by Lemma
5.2. So a|c.

We separate ℤ\{0} into three subsets: (1) units, (2) prime numbers, (3)
composite numbers. The numbers 1 and -1 will be called units. The units
divide every integer by Lemma 5.2(10). Any other integer a has at least
four divisors: ±1, ±a. These are called the trivial divisors of a. A divisor of
a, which is not one of the four trivial divisors of a, is called a proper
divisor of a. If a nonzero integer a is not a unit and has no proper
divisors, then a is called a prime number. Thus 2, 3, 5, 7, 11 are prime
numbers. A nonzero integer, which is neither a unit nor a prime number,
will be called a composite number. So a ∈ ℤ\{0} is a composite number if
and only if there is a d ∈ ℤ with 1 < |d| < |a| and d|a.

Prime numbers are the building blocks of integers in the following


sense.

5.13 Theorem: Any nonzero integer, which is not a unit, is either a
prime number or a product of prime numbers.

Proof: Take an integer n ≠ 0, and assume that n is not a unit. If n is
prime, there is nothing to prove. If n is composite, then n = n_1 n_2 for
some n_1, n_2 ∈ ℤ, 1 < |n_1| < |n|, 1 < |n_2| < |n|. If n_1 and n_2 are prime, we
are through. Otherwise, factor n_1 and n_2 into two numbers. Keep factor-
ing until you get down to prime numbers. Since the factors get smaller
and smaller in absolute value, we will reach prime numbers at the end.
This is the basic idea, and we make this reasoning into a rigorous proof
by induction.

We use Principle 4.5. Let q_n be the statement that n is a prime
number or a product of prime numbers. We begin induction at n = 2.
Since 2 is a prime number, q_2 is true. q_3 is also true, for 3 is prime. q_4 is
true, for 4 = 2.2 is a product of the prime numbers 2 and 2.

Suppose now q_2, q_3, q_4, ..., q_{k-1} are true, so that 2, 3, 4, ..., k-1 are either
prime numbers or products of prime numbers. We want to prove that k
is a prime number or a product of prime numbers. If k is prime, we are
done. If k is not prime, we have k = k_1 k_2, 1 < k_1 < k, 1 < k_2 < k, for
some integers k_1, k_2. Since q_{k_1} and q_{k_2} are true by the induction hypo-
thesis, each of k_1, k_2 is either a prime number or a product of prime
numbers:
k_1 = p_1 p_2 ... p_r,    k_2 = p'_1 p'_2 ... p'_s,
where p_1, p_2, ..., p_r, p'_1, p'_2, ..., p'_s are prime numbers (r = 1 or s = 1 is
possible, in which case k_1 = p_1 or k_2 = p'_1 are prime numbers), and so
k = k_1 k_2 = p_1 p_2 ... p_r p'_1 p'_2 ... p'_s
is a product of prime numbers. Hence q_k is true.

This proves the theorem for positive integers. For a negative integer n,
where n is not a unit, we have
-n = p_1 p_2 ... p_t
for some prime numbers p_1, p_2, ..., p_t by what we proved above (possibly
t = 1). Hence
n = (-p_1) p_2 ... p_t
is prime or is a product of prime numbers.
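The repeated-splitting idea of the proof can be imitated by trial division: keep extracting the smallest prime factor until nothing but a unit is left. A short Python sketch (the function name `prime_factors` is ours, not the book's):

```python
def prime_factors(n):
    """Return a list of primes whose product is n (n a nonzero non-unit).

    For negative n, the first factor returned is negative, following the
    theorem's treatment of negative integers: n = (-p1) p2 ... pt.
    """
    if n in (0, 1, -1):
        raise ValueError("n must be a nonzero non-unit")
    sign = -1 if n < 0 else 1
    n = abs(n)
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:      # split off the factor p as often as possible
            factors.append(p)
            n //= p
        p += 1
    if n > 1:                  # what remains has no smaller divisor: it is prime
        factors.append(n)
    factors[0] *= sign         # absorb the sign into the first prime
    return factors

print(prime_factors(60))       # → [2, 2, 3, 5]
print(prime_factors(-60))      # → [-2, 2, 3, 5]
```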

After reading the proof of this theorem, it will be clear to the reader
that an abbreviation of the phrase "prime number or a product of prime
numbers" will be very useful. When we speak of a product, we mean a
product of two, three, or more terms. We now extend this to one factor.
A single term will be called a product of one factor. With this convention,
a prime number is also a product of prime numbers.
Our theorem now reads more briefly as follows.

5.13 Theorem: Any nonzero integer, which is not a unit, is a product of


prime numbers.

Now that we know any integer, which is not zero or a unit, can be ex-
pressed as a product of prime numbers, we ask if it can be written as a
product of prime numbers in different ways. By way of example, let us
begin decomposing 60 into prime numbers as in the proof of Theorem
5.13. We can begin from any decomposition of 60 into factors. For
instance,
60 = 10.6 and 60 = 15.4.
Now we are to decompose each one of the factors 10,6,15,4 into smaller
factors until we get prime numbers. Will we reach the same prime
numbers if we use the two different decompositions as our starting
point? We know of course that further decomposition
60 = (2.5)(2.3) and 60 = (3.5)(2.2)
yields the same prime numbers 2,2,3,5 (aside from order). Nevertheless,
our question should not be taken lightly. It is a very pertinent question.
We remark that Theorem 5.13 says nothing in this regard. Theorem 5.13
says that, after enough factorizations, the factors will be prime. As it is,
the prime numbers we obtain may very well be distinct if we start with

different factorizations. Indeed, if you start with different things, why
on earth should you end with the same things? If 60 can be written as a
product of factors in two different ways, as above, why should it not be
written as a product of prime factors in two different ways? The
reader's experience with the uniqueness of prime factors of integers
should not mislead him (or her) into believing the uniqueness is obvious. It
is anything but obvious.

Let us clarify what we mean by uniqueness. The two decompositions

2.5.2.3 and 3.5.2.2

of 60 involve the same prime numbers. Their order in the two decompo-
sitions is different, but nobody would consider these decompositions as
very distinct. After all, multiplication of integers is commutative, and we
can permute the factors without changing the value of the product. It
would be foolish to regard two factorizations as different when they
consist of the same prime numbers in different orders.

Moreover, we have (-2)(5)(2)(-3) = 2.5.2.3, where the numbers
appearing are prime. These decompositions of 60 are not essentially
distinct, of course. Given two nonzero integers a, b, we say a is associate
to b if a = b or a = -b. Then b is associate to a as well. Hence we may also
say that a and b are associate. This means a|b and b|a. It is clear from
the definition that, whenever p and q are associate and p is prime, then
q is a prime number, too. When we say uniqueness, we shall mean that
the prime numbers in the decompositions of an integer are associate; we
shall not mean that they are identical.

With this understanding, we will prove that any integer (≠ 0, 1, -1) has a
unique decomposition into prime numbers. We need some lemmas.

5.14 Lemma: Let a be an integer and p be a prime number. If p ∤ a,
then (a,p) = 1.

Proof: Let d = (a,p). Then d|p. Since p is a prime number and d > 0,
either d = p or d = 1. From d|a and p ∤ a, we conclude d ≠ p. So d = 1.

Our proof will depend heavily on the following corollary to Theorem
5.12. It is Proposition 30 in Euclid's Elements, Book VII. We shall refer to
it as Euclid's lemma.

5.15 Lemma (Euclid's lemma): Let a,b,p be integers. If p is prime


and p|ab, then p|a or p|b.

Proof: If p|a, the lemma is proved. If p ∤ a, then (a,p) = 1 by Lemma 5.14
and, since p|ab, we get p|b by Theorem 5.12 (with p, a, b in place of a, b, c,
respectively).

5.16 Lemma: Let a_1, a_2, ..., a_n, p be integers. If p is prime and p|a_1 a_2 ... a_n,
then p|a_1 or p|a_2 or ... or p|a_n.

Proof: This follows from Euclid's lemma by a routine induction


argument. The details are left to the reader.

We can now prove uniqueness aside from trivial variations.

5.17 Theorem (Fundamental theorem of arithmetic): Every


integer, which is not zero or a unit, can be expressed as a product of
prime numbers in a unique way, apart from the order of the factors and
ambiguity of associate numbers.

Proof: Let n ∈ ℤ, n ≠ 0, n not a unit. By Theorem 5.13, n can be expressed as
a product of prime numbers. We must show uniqueness. This will be
done by induction on |n|. Given two decompositions
p_1 p_2 ... p_r = n = q_1 q_2 ... q_s
of n into prime factors, we have to show that
r = s
and that
p_1, p_2, ..., p_r are, in some order, associate to q_1, q_2, ..., q_s.

Assume first |n| = 2. Then n = 2 or n = -2; n is prime and n itself is the
unique representation of n as a product of prime numbers (having only
one factor). So the theorem is true for n if |n| = 2.

Now we make the inductive hypothesis that |n| > 2 and that the
theorem is true for all k with 2 ≤ |k| ≤ |n| - 1, and prove it for n.

If n is a prime number and
p_1 p_2 ... p_r = n = q_1 q_2 ... q_s    (r, s ∈ ℕ; p's and q's are prime),
then necessarily r = 1, s = 1, |p_1| = |n| = |q_1|, so p_1 = ±q_1. So p_1 and q_1 are
associate and the decomposition is unique.

Assume now that
p_1 p_2 ... p_r = n = q_1 q_2 ... q_s    (r, s ∈ ℕ; p's and q's are prime),
and that n is not a prime number. From p_r | p_1 p_2 ... p_r, we get p_r | q_1 q_2 ... q_s.
By Lemma 5.16, p_r | q_i for some i = 1, 2, ..., s. Changing the order of the q's
if necessary, we may assume p_r | q_s. The divisors of q_s are ±1, ±q_s. Since p_r
is prime, so not a unit, and since p_r | q_s, we obtain p_r = q_s or p_r = -q_s. Let p_r
= εq_s, with the appropriate unit ε = ±1. Then we get
p_1 p_2 ... p_r = n = q_1 q_2 ... q_{s-1} (εp_r),
so p_1 p_2 ... p_{r-1} = n/p_r = (εq_1) q_2 ... q_{s-1}
as two decompositions of n/p_r into prime numbers. Since n is not a
prime number and |p_r| > 1, we have 1 < |n/p_r| < |n|. The induction
hypothesis tells us that the two decompositions

p_1 p_2 ... p_{r-1} = (εq_1) q_2 ... q_{s-1}

of n/p_r are essentially the same:

r - 1 = s - 1

and p_1, p_2, ..., p_{r-1} are, in some order, associate to εq_1, q_2, ..., q_{s-1}, hence
to q_1, q_2, ..., q_{s-1}. Then

r = s

and p_1, p_2, ..., p_{r-1} are, in some order, associate to q_1, q_2, ..., q_{s-1}; and p_r is
associate to q_s. This completes the proof.

5.18 Remarks: Collecting the same prime divisors of n (n > 1) in a
single prime power, we can write

n = p_1^{a_1} p_2^{a_2} ... p_r^{a_r},    (*)

where 0 < p_1 < p_2 < ... < p_r are the distinct prime divisors of n, and
a_1, a_2, ..., a_r are positive integers. Then (*) is called the canonical decomposi-
tion of n into prime numbers.

Sometimes it is convenient to relax the condition that the exponents a_i
be all positive to the condition that they be nonnegative. For example,
the divisors of n ∈ ℕ, whose canonical decomposition is (*), are exactly
the numbers

p_1^{b_1} p_2^{b_2} ... p_r^{b_r},

where 0 ≤ b_i ≤ a_i for all i = 1, 2, ..., r. If m and n are two natural
numbers and

m = p_1^{c_1} p_2^{c_2} ... p_r^{c_r},    n = p_1^{e_1} p_2^{e_2} ... p_r^{e_r},

where p_1, p_2, ..., p_r are distinct prime numbers and c_i ≥ 0, e_i ≥ 0 for all
i = 1, 2, ..., r, then (m,n) is given by

(m,n) = p_1^{t_1} p_2^{t_2} ... p_r^{t_r}

with t_i = min{c_i, e_i} for all i = 1, 2, ..., r. Here min{x,y} denotes the smaller
(minimum) of x and y when x ≠ y and denotes x when x = y.
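The exponent formula can be checked directly in Python. Here a canonical decomposition is represented as a dictionary mapping each prime to its exponent (a convention of ours for the sketch, not the book's notation):

```python
import math

def gcd_from_decompositions(m_exp, n_exp):
    """gcd of two numbers given as {prime: exponent} dictionaries,
    using t_i = min(c_i, e_i) for each prime."""
    result = 1
    for p in set(m_exp) | set(n_exp):
        t = min(m_exp.get(p, 0), n_exp.get(p, 0))
        result *= p ** t
    return result

# m = 2^3 . 3 . 5^2 = 600,  n = 2 . 3^2 . 7 = 126
m_exp = {2: 3, 3: 1, 5: 2}
n_exp = {2: 1, 3: 2, 7: 1}
print(gcd_from_decompositions(m_exp, n_exp))   # → 6
print(math.gcd(600, 126))                      # → 6, agreeing with the formula
```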

It can be shown that ((a,b),c) = (a,(b,c)) for any a, b, c ∈ ℤ, provided a, b
are not both equal to zero and b, c are not both equal to zero. The
positive number ((a,b),c) is called the greatest common divisor of a, b, c,
and is denoted shortly by (a,b,c). One proves easily that (a,b,c) is the
unique integer d such that

(i) d|a, d|b, d|c,
(ii) for all d_1 ∈ ℤ, if d_1|a, d_1|b, d_1|c, then d_1|d,
(iii) d > 0,

and that there are integers x, y, z satisfying

ax + by + cz = (a,b,c).

Inductively, if the greatest common divisor of n - 1 integers a_1, a_2, ..., a_{n-1}
has already been defined and denoted as (a_1, a_2, ..., a_{n-1}), then the
greatest common divisor (a_1, a_2, ..., a_{n-1}, a_n) of n integers a_1, a_2, ..., a_{n-1}, a_n
is defined to be ((a_1, a_2, ..., a_{n-1}), a_n). One can show that their greatest
common divisor (a_1, a_2, ..., a_{n-1}, a_n) is the unique integer d such that

(i) d|a_1, d|a_2, ..., d|a_{n-1}, d|a_n,
(ii) for all d_1 ∈ ℤ, if d_1|a_1, d_1|a_2, ..., d_1|a_{n-1}, d_1|a_n, then d_1|d,
(iii) d > 0.

In addition, one proves that there are integers x_1, x_2, ..., x_{n-1}, x_n such that

a_1 x_1 + a_2 x_2 + ... + a_{n-1} x_{n-1} + a_n x_n = (a_1, a_2, ..., a_{n-1}, a_n).

If (a_1, a_2, ..., a_{n-1}, a_n) = 1, we say that a_1, a_2, ..., a_{n-1}, a_n are relatively prime.
In this case, there are integers x_1, x_2, ..., x_{n-1}, x_n satisfying

a_1 x_1 + a_2 x_2 + ... + a_{n-1} x_{n-1} + a_n x_n = 1.

The proofs of these assertions are left to the reader.

Our final topic in this paragraph will be the least common multiple of
two nonzero integers. If a, b ∈ ℤ and a|b, we say that b is a multiple of a.

5.19 Theorem: Let a, b ∈ ℤ, neither of them zero (i.e., a ≠ 0 ≠ b). Then
there is a unique integer m such that

(i) a|m and b|m,
(ii) for all m_1 ∈ ℤ, if a|m_1 and b|m_1, then m|m_1,
(iii) m > 0.

Proof: The proof will be similar to that of Theorem 5.4. We consider the
set V = {n ∈ ℕ : a|n and b|n}. This set is not empty, since, for example,
|ab| is in V (here we use the hypothesis a ≠ 0 ≠ b). We choose the
smallest positive integer in V. Let it be called m. Thus m > 0 and m
satisfies (iii). Also, a|m and b|m since m ∈ V, and m satisfies (i). It
remains to show that m satisfies (ii).

Suppose m_1 ∈ ℤ, and a|m_1 and b|m_1. We divide m_1 by m and get, say,
m_1 = qm + r, where q, r ∈ ℤ and 0 ≤ r < m. Since a|m, a|m_1 and b|m,
b|m_1, the equation m_1 = qm + r yields that a|r and b|r. We
know 0 ≤ r < m. If r were not zero, then r would be a natural number
in V smaller than the smallest natural number m in V, which is absurd.
Thus r = 0, so m_1 = qm, and m|m_1. This shows that m satisfies (ii).

Now the uniqueness of m. Suppose m' satisfies the conditions (i), (ii), (iii),
too. Then a|m', b|m' by (i), and so m|m' by (ii). Also, a|m, b|m by (i), and
so m'|m by (ii). Hence m|m' and m'|m. By Lemma 5.2(12), we obtain
|m| = |m'|. From (iii), we have m > 0, m' > 0, which yields m = m'. Thus m
is unique.

5.20 Definition: Let a, b ∈ ℤ, neither of them zero. The unique integer
m in Theorem 5.19 is called the least common multiple of a and b.

The least common multiple of a and b will be denoted by [a,b]. From the
proof of Theorem 5.19, we see that [a,b] is indeed the smallest of the
positive multiples of a and b ([a,b] is the smallest number in V). From
the fact that a|m and -a|m are equivalent, and likewise that b|m and
-b|m are equivalent, it follows that the defining conditions (i), (ii), (iii) do
not change when we replace a by -a or b by -b. Therefore, [a,b] = [-a,b] =
[-a,-b] = [a,-b]. In the same way, the conditions (i), (ii), (iii) in Theorem
5.19 are symmetric in a and b, and this gives [a,b] = [b,a].

The greatest common divisor and the least common multiple of two
integers will be connected in Lemma 5.22. We need a preliminary result.

5.21 Lemma: Let a, b, m be integers and a ≠ 0, b ≠ 0. If a|m, b|m and
(a,b) = 1, then ab|m.

Proof: This follows immediately from the fundamental theorem of
arithmetic (Theorem 5.17), but we give another proof. Since a|m and
b|m, there are integers a_1, b_1 such that aa_1 = m = bb_1. Hence a|bb_1. Since
(a,b) = 1, Theorem 5.12 yields a|b_1. So b_1 = ac for some integer c and m =
bb_1 = bac = abc, so ab|m, as claimed.

5.22 Lemma: Let a and b be integers, neither of them zero. Then we
have [a,b] = |ab|/(a,b).

Proof: As neither [a,b], nor (a,b), nor |ab| changes when we replace a
and b by their absolute values, we assume, without loss of generality,
that a > 0, b > 0. We put d = (a,b). We show that ab/d satisfies the
three conditions (i), (ii), (iii) in Theorem 5.19. Let a = a_1 d, b = b_1 d, so that
(a_1, b_1) = 1 by Lemma 5.11.

We have a|ab_1, so a|a(b/d); and b|a_1 b, so b|(a/d)b. Thus a divides ab/d
and b divides ab/d. Hence ab/d satisfies (i). Clearly ab/d > 0, so ab/d
satisfies (iii). We now show that ab/d satisfies (ii) as well; i.e., we show
that ab/d divides m_1 whenever a|m_1 and b|m_1. Let m_1 ∈ ℤ and a|m_1,
b|m_1. Then d|m_1 and in fact a_1 divides m_1/d and b_1 divides m_1/d. By
Lemma 5.21 (with a_1, b_1, m_1/d in place of a, b, m, respectively), we get
a_1 b_1 divides m_1/d, so ab|m_1 d, so ab/d divides m_1. Thus ab/d satisfies (ii)
and ab/d = [a,b].
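Lemma 5.22 gives a practical way to compute least common multiples. A quick check in Python (the function name `lcm` is ours; recent Python also ships `math.lcm`):

```python
import math

def lcm(a, b):
    """Least common multiple via Lemma 5.22: [a,b] = |ab| / (a,b)."""
    if a == 0 or b == 0:
        raise ValueError("a and b must both be nonzero")
    return abs(a * b) // math.gcd(a, b)

print(lcm(12, 18))   # → 36
print(lcm(-4, 6))    # → 12, illustrating [a,b] = [-a,b]
# The lemma rearranged: [a,b].(a,b) = |ab|.
print(lcm(14732, 37149) * math.gcd(14732, 37149) == abs(14732 * 37149))  # → True
```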

It can be shown that [[a,b],c] = [a,[b,c]] for any a, b, c ∈ ℤ, provided a, b, c
are all distinct from zero. The positive number [[a,b],c] is called the least
common multiple of a, b, c, and is denoted shortly by [a,b,c]. One proves
easily that [a,b,c] is the unique integer m such that

(i) a|m, b|m, c|m,
(ii) for all m_1 ∈ ℤ, if a|m_1, b|m_1, c|m_1, then m|m_1,
(iii) m > 0.

Inductively, if the least common multiple of n - 1 integers a_1, a_2, ..., a_{n-1}
has already been defined and denoted as [a_1, a_2, ..., a_{n-1}], then the least
common multiple [a_1, a_2, ..., a_{n-1}, a_n] of n integers a_1, a_2, ..., a_{n-1}, a_n is defined
to be [[a_1, a_2, ..., a_{n-1}], a_n]. One can show that their least common multiple
[a_1, a_2, ..., a_{n-1}, a_n] is the unique integer m such that

(i) a_1|m, a_2|m, ..., a_{n-1}|m, a_n|m,
(ii) for all m_1 ∈ ℤ, if a_1|m_1, a_2|m_1, ..., a_{n-1}|m_1, a_n|m_1, then m|m_1,
(iii) m > 0.

Exercises

1. Find (10897,16949) and express it in the form 10897x + 16949y,


where x and y are integers.

2. Assume m, n ∈ ℕ and m ≠ n. What is (2^{2^m} + 1, 2^{2^n} + 1)?

3. Let a, b ∈ ℤ, neither of them equal to zero, and assume (a,b) = 1. Let
x_0, y_0 be integers such that ax_0 + by_0 = 1. Prove that all integer pairs x, y
satisfying ax + by = 1 are given by
x = x_0 + bt,    y = y_0 - at
as t runs through all integers.

4. Let a, b, c be integers, none of them equal to zero, and let (a,b) = d.
Prove that there are integers x, y satisfying ax + by = c if and only if d|c.
Moreover, if d|c and x_0, y_0 are integers such that ax_0 + by_0 = c, prove that
all integer pairs x, y satisfying ax + by = c are given by
x = x_0 + (b/d)t,    y = y_0 - (a/d)t
as t runs through all integers.

5. Prove the assertions in Remark 5.18.

6. Let m and n be two natural numbers and

m = p_1^{c_1} p_2^{c_2} ... p_r^{c_r},    n = p_1^{e_1} p_2^{e_2} ... p_r^{e_r},

where p_1, p_2, ..., p_r are distinct prime numbers and c_i ≥ 0, e_i ≥ 0 for all
i = 1, 2, ..., r. Show that

[m,n] = p_1^{u_1} p_2^{u_2} ... p_r^{u_r}

with u_i = max{c_i, e_i} for all i = 1, 2, ..., r. Here max{x,y} denotes the greater
(maximum) of x and y when x ≠ y and denotes x when x = y.

7. Prove or disprove: (a,[b,c]) = [(a,b),(a,c)] for all a, b, c ∈ ℤ\{0}.

8. Let a, b ∈ ℤ, b ≠ 0, and a = qb + r, with q, r ∈ ℤ, 0 ≤ r < |b|. Prove
directly that (a,b) = (b,r).

9. Let a, b ∈ ℤ and (a,b) = 1. Show that (a - b, a + b) = 1 or 2.

10. Let a, m, n be natural numbers. Show that (a^m - 1, a^n - 1) = a^{(m,n)} - 1.

§6
Integers Modulo n

In Example 2.3(e), we have defined the congruence of two integers a, b
with respect to a modulus n ∈ ℕ. Let us recall that a ≡ b (mod n) means
n|(a - b). We have proved that congruence is an equivalence relation on ℤ.
The equivalence classes are called the congruence classes or residue
classes (modulo n). The congruence class of a will be denoted by ā.
Notice that there is ambiguity in this notation, for there is no reference
to the modulus. Thus 1̄ represents the residue class of 1 with respect to
the modulus 1, also with respect to the modulus 2, also with respect to
the modulus 3, in fact with respect to any modulus. However, the
modulus will usually be fixed throughout a particular discussion and ā
will represent the residue class of a with respect to that fixed modulus.
The ambiguity is therefore harmless.

By the division algorithm (Theorem 5.3), any integer k can be written as
k = qn + r, with q, r ∈ ℤ, 0 ≤ r < n. So any integer k is congruent (mod n)
to one of the numbers 0, 1, 2, ..., n-1. Furthermore, no two distinct ones of the
numbers 0, 1, 2, ..., n-1 are congruent (mod n), for if r_1, r_2 ∈ {0, 1, 2, ..., n-1},
r_1 ≠ r_2, and r_1 ≡ r_2 (mod n), then n|(r_1 - r_2), so n ≤ |r_1 - r_2| by Lemma 5.2(11),
and so n ≤ (n - 1) - 0 = n - 1, which is impossible. Thus any integer is congruent
to one of the numbers 0, 1, 2, ..., n-1, and these numbers are pairwise
incongruent. This means that 0, 1, 2, ..., n-1 are the representatives of all
the residue classes. Hence there are exactly n residue classes (mod n),
namely

0̄ = {x ∈ ℤ : x ≡ 0 (mod n)} = {nz : z ∈ ℤ} =: nℤ
1̄ = {x ∈ ℤ : x ≡ 1 (mod n)} = {nz + 1 : z ∈ ℤ} =: nℤ + 1
2̄ = {x ∈ ℤ : x ≡ 2 (mod n)} = {nz + 2 : z ∈ ℤ} =: nℤ + 2
...............................................................
(n-1)‾ = {x ∈ ℤ : x ≡ n-1 (mod n)} = {nz + (n-1) : z ∈ ℤ} =: nℤ + (n-1).

The set {0̄, 1̄, 2̄, ..., (n-1)‾} of residue classes (mod n) will be denoted by
ℤ_n. An element of ℤ_n, that is, a residue class (mod n), is called an integer
modulo n, or an integer mod n. An integer mod n is not an integer, not
an element of ℤ; it is a subset of ℤ. An integer mod n is not an integer
with a property "mod n". It is an object whose name consists of the three
words "integer", "mod(ulo)", "n".

6.1 Lemma: Let n ∈ ℕ, a, a_1, b, b_1 ∈ ℤ. If a ≡ a_1 (mod n) and b ≡ b_1 (mod n),
then a + b ≡ a_1 + b_1 (mod n) and ab ≡ a_1 b_1 (mod n).

Proof: If a ≡ a_1 (mod n) and b ≡ b_1 (mod n), then n|(a - a_1) and n|(b - b_1).
Hence n|((a - a_1) + (b - b_1)) by Lemma 5.2(5), which gives
n|((a + b) - (a_1 + b_1)), so a + b ≡ a_1 + b_1 (mod n). Also, n|(b(a - a_1) + a_1(b - b_1))
by Lemma 5.2(7), which gives n|(ba - a_1 b_1), so ab ≡ a_1 b_1 (mod n).

We want to define a kind of addition and a kind of multiplication on
ℤ_n. We put
ā ⊕ b̄ = (a + b)‾    (*)
ā ⊙ b̄ = (ab)‾    (**)

for all ā, b̄ ∈ ℤ_n (for all a, b ∈ ℤ). This is a very natural way of introducing
addition and multiplication on ℤ_n.

(*) and (**) seem quite innocent, but we must check that ⊕ and ⊙ are
really binary operations on ℤ_n. The reader might say at this point that
⊕ and ⊙ are clearly defined on ℤ_n and that there is nothing to check. But
yes, there is. Let us remember that a binary operation on ℤ_n is a
function from ℤ_n × ℤ_n into ℤ_n (Definition 3.18). As such, to each pair (ā, b̄)
in ℤ_n × ℤ_n, there must correspond a single element ā ⊕ b̄ and ā ⊙ b̄ if ⊕
and ⊙ are to be binary operations on ℤ_n (Definition 3.1). We must check
that the rules (*) and (**) produce elements of ℤ_n that are uniquely
determined by ā and b̄.

The rules (*) and (**) above convey the wrong impression that ā ⊕ b̄
and ā ⊙ b̄ are uniquely determined by ā and b̄. In order to penetrate
into the matter, let us try to evaluate X ⊕ Y, where X, Y ∈ ℤ_n are not
given directly as the residue classes of integers a, b. (We discuss ⊕;
the discussion applies equally well to ⊙.) How do we find X ⊕ Y? Since
X, Y ∈ ℤ_n, there are integers a, b with ā = X, b̄ = Y. Now add a and b in ℤ
to get a + b, then take the residue class of a + b. The result is X ⊕ Y.
The result? The question is whether we have only one result to justify
the article "the". We summarize telegrammatically. To find X ⊕ Y,
1) choose a from X,
2) choose b from Y,
3) find a + b in ℤ,
4) take the residue class of a + b.

This sounds a perfectly good recipe for finding X ⊕ Y, but notice that we
use some auxiliary objects, namely a and b, to find X ⊕ Y, which must be
determined by X and Y alone. Indeed, the result a + b depends explicitly
on the auxiliary objects a and b. We can use our recipe with different
auxiliary objects. Let us do it. 1) I choose a from X and you choose a_1
from X. 2) I choose b from Y and you choose b_1 from Y. 3) I compute
a + b and you compute a_1 + b_1. In general, a + b ≠ a_1 + b_1. Hence our recipe
gives, generally speaking, distinct elements a + b and a_1 + b_1. So far, both
of us followed the same recipe. I cannot claim that my computation is
correct and yours is false. Nor can you claim the contrary. Now we carry
out the fourth step. I find the residue class of a + b as X ⊕ Y, and you
find the residue class of a_1 + b_1 as X ⊕ Y. Since a + b ≠ a_1 + b_1 in ℤ, it can
very well happen that (a + b)‾ ≠ (a_1 + b_1)‾ in ℤ_n. On the other hand, if ⊕
is to be a binary operation on ℤ_n, we must have (a + b)‾ = (a_1 + b_1)‾. This is
the central issue. In order that ⊕ be a binary operation on ℤ_n, there
must work a mechanism which ensures (a + b)‾ = (a_1 + b_1)‾ whenever ā =
ā_1 and b̄ = b̄_1, even if a + b ≠ a_1 + b_1. If there is such a mechanism, we say ⊕
is a well defined operation on ℤ_n. This means ⊕ is really a genuine
operation on ℤ_n: X ⊕ Y is uniquely determined by X and Y alone. Any
dependence of X ⊕ Y on auxiliary integers a ∈ X and b ∈ Y is only
apparent. We will prove that ⊕ and ⊙ are well defined operations on ℤ_n,
but before that, we discuss the well definition of functions more generally.

A function f: A → B is essentially a rule by which each element a of A is
associated with a unique element f(a) = b of B. The important point is
that the rule produces an element f(a) that depends only on a.
Sometimes we consider rules having the following form. To find f(a),

1) do this and that


2) take an x related to a in such and such manner
3) do this and that to x
4) the result is f(a).

A rule of this type uses an auxiliary object x. The result then depends on
a and x. At least, it seems so. This is due to the ambiguity in the second
step. This step states that we choose an x with such and such property,
but there may be many objects x, y, z, ... related to a in the prescribed
manner. The auxiliary objects x, y, z, ... will, in general, produce different
results, so we should perhaps say that the result is f(a,x) (or f(a,y), f(a,z), ...).
In order for the above rule to be a function, it must produce the same
result. Hence we must have f(a,x) = f(a,y) = f(a,z) = .... The rule must be
so constructed that the same result will obtain even if we use different
auxiliary objects. If this be the case, the function is said to be well
defined.

This terminology is somewhat unfortunate. It sounds as though there


are two types of functions, well defined functions and not well defined
functions (or badly defined functions). This is definitely not the case. A
well defined function is simply a function. Badly defined functions do
not exist. Being well defined is not a property, such as continuity,
boundedness, differentiability, integrability etc. that a function might or
might not possess. That a function f: A → B is well defined means: 1) the
rule of evaluating f(a) for a A makes use of auxiliary, foreign objects,
2) there are many choices of these foreign objects, hence 3) we have
reason to suspect that applying the rule with different choices may
produce different results, which would imply that our rule does not
determine f(a) uniquely and f is not a function in the sense of Definition
3.1, but 4) our suspicion is not justified, for there is a mechanism,
hidden under the rule, which ensures that the same result will obtain even
if we apply the rule with different auxiliary objects. The question as to
whether a "function" is well defined arises only if that "function" uses
objects not uniquely determined by the element a in its "domain" in
order to evaluate f(a). We wrote "function" in quotation marks, for such
a thing may not be a function in the sense of Definition 3.1. Given such a
"function", which we want to be a function in the sense of Definition 3.1,
we check whether f(a) is uniquely determined by a, that is, we check
whether f(a) is independent of the auxiliary objects that we use for
evaluating f(a). If this be the case, our supposed "function" f is indeed a
function in the sense of Definition 3.1. We say then that f is well defined,
or f is a well defined function. This means f is a function. In fact, it is
more accurate to say that a function is defined instead of saying that a
function is well defined.

6.2 Examples: (a) Let L be the set of all straight lines in the Euclidean
plane, on which we have a cartesian coordinate system. We consider the
"function" s: L → ℝ ∪ {∞}, which assigns the slope of the line l to l. How
do we find s(l)? As follows: 1) choose a point, say (x_1,y_1), on l; 2) choose
another point, say (x_2,y_2), on l; 3) evaluate x_2 - x_1 and y_2 - y_1; 4) put s(l)
= (y_2 - y_1)/(x_2 - x_1) if x_1 ≠ x_2 and s(l) = ∞ if x_1 = x_2. Clearly we can choose
the points in many ways. For example, we might choose (x_1',y_1') ≠ (x_1,y_1)
as the first point, (x_2',y_2') ≠ (x_2,y_2) as the second point. Then we have, in
general, x_2' - x_1' ≠ x_2 - x_1 and y_2' - y_1' ≠ y_2 - y_1, so we might suspect that
(y_2' - y_1')/(x_2' - x_1') ≠ (y_2 - y_1)/(x_2 - x_1). It is known from analytic geometry
that these two quotients are equal, hence s(l) depends only on l, and not
on the points we choose. Thus s is a well defined function. Ultimately,
this is due to the fact that there passes one and only one straight line
through two distinct points. The next example shows that well definition
breaks down if we modify the domain a little.

(b) Let C be the set of all curves in the Euclidean plane. We consider the
"function" s: C → ℝ ∪ {∞}, which assigns the "slope" of the curve c to c.
How do we find s(c)? As follows: 1) choose a point, say (x_1,y_1), on c; 2)
choose another point, say (x_2,y_2), on c; 3) evaluate x_2 - x_1 and y_2 - y_1; 4)
put s(c) = (y_2 - y_1)/(x_2 - x_1) if x_1 ≠ x_2 and s(c) = ∞ if x_1 = x_2. This is the
same rule as the rule in Example 6.2(a). Let us find the "slope" of the
curve y = x^2. 1) Choose a point on this curve, for example (0,0). If you
prefer, you might choose (-1,1). 2) Choose another point on this curve,
for example (2,4). If you prefer, you might choose (2,4) as well, of course. 3)
Evaluate the differences of coordinates. We find 2 - 0 and 4 - 0. You find
2 - (-1) and 4 - 1. Hence 4) the slope is 4/2. You find it to be 3/3. So s(c)
= 2 and s(c) = 1. This is nonsense. We see that different choices of the
points on the curve (different choices of the auxiliary objects) give rise
to different results. So the above rule is not a function. We do not say "s
is not a well defined function". s is simply not a function at all. s is not
defined.

(c) Let F be the set of all continuous functions on a closed interval [a,b].
We want to "define" an integral "function" I: F → ℝ, which assigns the
real number ∫_a^b f(x)dx to f ∈ F. So I(f) = ∫_a^b f(x)dx. I is a "function"
whose "domain" is a set of functions. How do we find I(f)? As follows. 1)
Choose an indefinite integral of f, that is, choose a function F on [a,b]
such that F′(x) = f(x) for all x ∈ [a,b] (we take one-sided derivatives at a
and b). 2) Evaluate F(a) and F(b). 3) Put I(f) = F(b) - F(a). There are
many functions F with F′(x) = f(x) for all x ∈ [a,b]. For two different
choices F_1 and F_2, we have F_1(b) ≠ F_2(b) and F_1(a) ≠ F_2(a) in general. So
we may suspect that F_1(b) - F_1(a) ≠ F_2(b) - F_2(a). In order to show that I
is a well defined function, we must prove F_1(b) - F_1(a) = F_2(b) - F_2(a)
whenever F_1 and F_2 are functions on [a,b] such that F_1′(x) = f(x) = F_2′(x)
for all x ∈ [a,b]. We know from the calculus that, when F_1 and F_2 have
this property, there is a constant c such that F_1(x) = F_2(x) + c for all
x ∈ [a,b]. So F_1(b) - F_1(a) = (F_2(b) + c) - (F_2(a) + c) = F_2(b) - F_2(a). There-
fore, I is well defined.

After this lengthy digression, we return to the integers mod n and to the
"operations" ⊕ and ⊙.

6.3 Lemma: ⊕ and ⊙ are well defined operations on ℤ_n.

Proof: We are to prove ā ⊕ b̄ = ā′ ⊕ b̄′ and ā ⊙ b̄ = ā′ ⊙ b̄′ whenever ā = ā′
and b̄ = b̄′ in ℤ_n (different names for identical residue classes should not
yield different results). This follows from Lemma 6.1. Indeed, if ā = ā′
and b̄ = b̄′, then a ≡ a′ (mod n) and b ≡ b′ (mod n) by definition, so we
obtain a + b ≡ a′ + b′ (mod n) and ab ≡ a′b′ (mod n) by Lemma 6.1, hence
(a + b)‾ = (a′ + b′)‾ and (ab)‾ = (a′b′)‾, which gives ā ⊕ b̄ = (a + b)‾ = (a′ + b′)‾ =
ā′ ⊕ b̄′ and ā ⊙ b̄ = (ab)‾ = (a′b′)‾ = ā′ ⊙ b̄′.
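One can watch this mechanism at work numerically: pick different representatives of the same residue classes and confirm that the resulting sum and product classes agree. A small Python check (the modulus 12 and the representatives are chosen arbitrarily for illustration):

```python
n = 12

def residue_class(a, n):
    """The residue class of a (mod n), represented here by its least
    nonnegative member."""
    return a % n

# X = class of 5 and Y = class of 8 (mod 12).  Other representatives of
# the same classes: 5 ≡ 29 ≡ 17 and 8 ≡ -16 ≡ 20 (mod 12).
for a, b in [(5, 8), (29, -16), (17, 20)]:
    assert residue_class(a, n) == residue_class(5, n)
    assert residue_class(b, n) == residue_class(8, n)
    # The classes of a + b and ab do not depend on the representatives:
    print(residue_class(a + b, n), residue_class(a * b, n))  # → 1 4 each time
```

Each pass through the loop prints the same pair of classes, exactly as Lemma 6.3 guarantees.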

Having proved that ⊕ and ⊙ are well defined operations on ℤ_n, we
proceed to show that ⊕ and ⊙ possess many (but not all) properties of
the usual addition and multiplication of integers. First we simplify our
notation. From now on, we write + and . instead of ⊕ and ⊙. In fact, we
shall even drop . and use simply juxtaposition to denote a product of two
integers mod n. Thus we will have ā + b̄ = (a + b)‾ and ā.b̄ = (ab)‾ or
simply ā b̄ = (ab)‾. The reader should note that the same sign "+" is used
to denote two very distinct operations: ⊕ in the old notation and the
usual addition of integers. If anything, they are defined on distinct sets
ℤ_n and ℤ. The same remarks apply to multiplication.

6.4 Lemma: For all ā, b̄, c̄ ∈ ℤn, the following hold.
(1) ā + b̄ ∈ ℤn;
(2) (ā + b̄) + c̄ = ā + (b̄ + c̄);
(3) ā + 0̄ = ā;
(4) ā + \overline{−a} = 0̄;
(5) ā + b̄ = b̄ + ā;
(6) ā·b̄ ∈ ℤn;
(7) (ā·b̄)·c̄ = ā·(b̄·c̄);
(8) ā·1̄ = ā;
(9) if (a,n) = 1, then there is an x̄ ∈ ℤn such that ā·x̄ = 1̄;
(10) ā·b̄ = b̄·ā;
(11) ā·(b̄ + c̄) = ā·b̄ + ā·c̄ and (b̄ + c̄)·ā = b̄·ā + c̄·ā;
(12) ā·0̄ = 0̄.

Proof: (1) is obvious. (2) follows from the corresponding property of
addition in ℤ. We indeed have
(ā + b̄) + c̄ = \overline{a + b} + c̄
= \overline{(a + b) + c}
= \overline{a + (b + c)}
= ā + \overline{b + c}
= ā + (b̄ + c̄).
The remaining assertions are proved in the same way by drawing bars
over integers in the corresponding equations in ℤ. We prove only (9),
which is not as straightforward as the other claims. If (a,n) = 1, then
there are integers x,y with ax − ny = 1 (Lemma 5.10). Using (3) and (12),
we get 1̄ = \overline{ax − ny} = \overline{ax} − \overline{ny} = ā·x̄ − n̄·ȳ = ā·x̄ − 0̄·ȳ = ā·x̄ − 0̄ = ā·x̄.
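Part (9) is effective: the integers x, y with ax − ny = 1 come from the extended Euclidean algorithm. A sketch (the helper name inverse_mod and the modulus n = 10 are our own choices):

```python
from math import gcd

# Part (9) made effective: the extended Euclidean algorithm yields x, y
# with ax - ny = 1, and then the class of x is the inverse of the class
# of a in Z_n.
def inverse_mod(a, n):
    """Return x with a*x = 1 (mod n), assuming gcd(a, n) == 1."""
    old_r, r = a, n          # remainders
    old_x, x = 1, 0          # coefficients of a
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
    return old_x % n         # here old_r == gcd(a, n) == 1

n = 10
for a in range(1, n):
    if gcd(a, n) == 1:
        assert (a * inverse_mod(a, n)) % n == 1
print("every class a with (a, 10) = 1 is invertible in Z_10")
```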

Exercises

1. Determine whether the "function" g: ℤ13 → ℤ is well defined, if g is
defined as follows.
(a) g(ā) = (a,13);
(b) g(ā) = (a,26);
(c) g(ā) = (a,169);
(d) g(ā) = (a²,13);
(e) g(ā) = (a³,169);
(f) g(ā) = (a,6);
(g) g(ā) = (a²,65);
where ā ∈ ℤ13 and a ∈ ℤ.

2. Let f: ℤ12 × ℤ12 → ℤ12 be such that (ā, b̄) ↦ \overline{a² + ab + b²}. Is f well
defined?

3. For an integer a, we denote by ā the residue class of a (mod 12), by ã
the residue class of a (mod 6), and by â the residue class of a (mod 5), so
that ā ∈ ℤ12, ã ∈ ℤ6 and â ∈ ℤ5. Determine whether the following
"functions" are well defined.
(a) ℤ12 → ℤ6, ā ↦ ã;
(b) ℤ6 → ℤ12, ã ↦ ā;
(c) ℤ12 → ℤ5, ā ↦ â;
(d) ℤ5 → ℤ6, â ↦ ã;
(e) ℤ5 → ℤ6, â ↦ \widetilde{a + 1}.

CHAPTER 2
Groups

§7
Basic Definitions

Before giving the formal definition of a group, we would rather present


some concrete examples.

7.1 Examples: (a) Consider the addition of integers. From the numerous
properties of this binary operation, we single out the following ones.
(i) + is a binary operation on ℤ, so, for any a,b ∈ ℤ, we have
a + b ∈ ℤ.
(ii) For all a, b, c ∈ ℤ, we have (a + b) + c = a + (b + c).
(iii) There is an integer, namely 0 ∈ ℤ, which has the property
a + 0 = a for all a ∈ ℤ.
(iv) For all a ∈ ℤ, there is an integer, namely −a, such that
a + (−a) = 0.

(b) Consider the multiplication of positive real numbers. Let ℝ⁺ be the
set of positive real numbers. Here the multiplication enjoys properties
analogous to the ones above.
(i) · is a binary operation on ℝ⁺, so, for any a,b ∈ ℝ⁺, we have
a·b ∈ ℝ⁺.
(ii) For all a, b, c ∈ ℝ⁺, we have (a·b)·c = a·(b·c).
(iii) There is a positive real number, namely 1 ∈ ℝ⁺, which has
the property
a·1 = a for all a ∈ ℝ⁺.
(iv) For all a ∈ ℝ⁺, there is a positive real number, namely
1/a, such that
a·(1/a) = 1.

(c) Let n be a natural number and consider the addition in ℤn, which we
introduced in §6.
(i) + is a binary operation on ℤn, so, for any ā, b̄ ∈ ℤn, we
have ā + b̄ ∈ ℤn.
(ii) For all ā, b̄, c̄ ∈ ℤn, we have (ā + b̄) + c̄ = ā + (b̄ + c̄).
(iii) There is an integer mod n, namely 0̄ ∈ ℤn, which has the
property
ā + 0̄ = ā for all ā ∈ ℤn.
(iv) For all ā ∈ ℤn, there is an integer mod n, namely \overline{−a},
such that
ā + \overline{−a} = 0̄.

(d) Let X be a nonempty set and let S_X be the set of all one-to-one
mappings from X onto X. Consider the composition o of mappings in S_X.
(i) o is a binary operation on S_X, for if σ and τ are one-to-one
mappings from X onto X, so is σ o τ by Theorem 3.13.
(ii) For all σ, τ, ρ ∈ S_X, we have (σ o τ) o ρ = σ o (τ o ρ) (Theorem
3.10).
(iii) There is a mapping in S_X, namely the identity mapping
ι_X ∈ S_X, such that
σ o ι_X = σ for all σ ∈ S_X (Example 3.9(a)).
(iv) For all σ ∈ S_X, there is a mapping in S_X, namely σ⁻¹, such
that
σ o σ⁻¹ = ι_X.
(See Theorem 3.14 and Theorem 3.16. That σ⁻¹ ∈ S_X follows from
Theorem 3.17(1).)
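This example can be checked mechanically for a small X. The sketch below (our own; the helper names compose, S_X, identity are ours) takes X = {0, 1, 2} and verifies the four properties listed above.

```python
# Example 7.1(d) checked mechanically for the small case X = {0, 1, 2}.
from itertools import permutations

X = (0, 1, 2)
S_X = list(permutations(X))           # tuple p represents the map i -> p[i]

def compose(p, q):                    # (p o q)(i) = p(q(i))
    return tuple(p[q[i]] for i in X)

identity = X
for p in S_X:
    assert compose(p, identity) == p                     # (iii)
    p_inv = tuple(sorted(X, key=lambda i: p[i]))         # p^-1
    assert compose(p, p_inv) == identity                 # (iv)
    for q in S_X:
        assert compose(p, q) in S_X                      # (i) closure
        for r in S_X:
            assert compose(compose(p, q), r) == compose(p, compose(q, r))  # (ii)
print("S_X is a group under composition for |X| = 3")
```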

These are examples of groups. In each case, we have a nonempty set


and a binary operation on that set which enjoys some special properties.
A group will be defined as a nonempty set and a binary operation on

that set having the same properties as in the examples above. A group
will thus consist of two parts: a set and a binary operation. Formally, a
group is an ordered pair whose components are the set and the opera-
tion in question.

7.2 Definition: An ordered pair (G, o ), where G is a nonempty set and o
is a binary operation on G, is called a group provided the following hold.
(i) o is a (well defined) binary operation on G. Thus, for any
a, b ∈ G, a o b is a uniquely determined element of G.
(ii) For all a, b, c ∈ G, we have (a o b) o c = a o (b o c).
(iii) There is an element e in G such that
a o e = a for all a ∈ G
and which is furthermore such that
(iv) for all a ∈ G, there is an x ∈ G with
a o x = e.

When (G, o ) is a group, we also say that G is (or builds, or forms) a group
with respect to o (or under o ). Since a group is an ordered pair, two
groups (G, o ) and (H,*) are equal if and only if G = H and the binary
operation o on G is equal to the binary operation * on G (i.e., o and * are
identical mappings from G × G into G). On one and the same set G, there
may be distinct binary operations o and * under which G is a group. In
this case, the groups (G, o ) and (G, *) are distinct.

The four conditions (i)-(iv) of Definition 7.2 are known as the group ax-
ioms. The first axiom (i) is called the closure axiom. When (i) is true, we
say G is closed under o .

A binary operation o on a nonempty set G is said to be associative when
(ii) holds. The associativity of o enables us to write a o b o c without
ambiguity. Indeed, a o b o c has at first no meaning at all. We must write
either (a o b) o c or a o (b o c) to denote a meaningful element in G. By
associativity, we may and do make the convention that a o b o c will mean
(a o b) o c = a o (b o c), for whether we read it as (a o b) o c or a o (b o c)
does not make any difference. This would be wrong if o were not
associative. For instance, : (division) is not an associative operation on
ℝ\{0} and (a:b):c ≠ a:(b:c) in general (here a,b,c ∈ ℝ\{0}). Thus a:b:c is
ambiguous.

An element e of a set G, on which there is a binary operation o , is called
a right identity element or simply a right identity if a o e = a for all a in
G. The third group axiom (iii) ensures that a group G has a right identity
element. We will show presently that a group has precisely one right
identity element, but we have not proved it yet and we must be careful
not to use the uniqueness of the right identity before we prove it. All we
know at this stage is that a group has at least one right identity for
which (iv) holds. As it is, there may be many right identities. In addition,
there may be some right identities for which (iv) is true and also some
for which (iv) is false. For the time being, these possibilities are not
excluded.

They will be excluded in Lemma 7.3, where we will prove further that
our unique right identity is also a left identity. A left identity element
or a left identity of G, where G is a nonempty set with a binary
operation o on it, is by definition an element f of G such that f o a = a for
all a ∈ G. The group axioms say nothing about left identities. If (G, o ) is a
group, we do not yet know if there is a left identity in G at all, nor do we
know any relation between right and left identities. For the time being,
there may be no or one or many left identities in G. If there is only one
left identity, it may or may not be a right identity. If there are many left
identities, some or one or none of them may be right identities.

We mention all these possibilities so that the reader does not read in the
axioms more than what they really say. The group axioms say nothing
about left identities or about the uniqueness of the right identity.

The group axioms do say something about right inverses. If G is a
nonempty set with a binary operation o on it, and if e is a right identity
in G, and a ∈ G, an element x ∈ G is called a right inverse of a (with
respect to e) when a o x = e. The group axioms state that, in case (G, o ) is
a group, there is a right identity e in G with respect to which each
element of G has at least one right inverse. Until we prove Lemma 7.3,
there may be many right identities with this property. Also, some of the
right identity elements may have this property and some may not.
Furthermore, some (or all) of the elements may have more than one
right inverse with respect to some (or all) of the right identities. The
group axioms make no uniqueness assertion about the right inverses.

Before we lose ourselves in chaos, we had better prove our lemma.

7.3 Lemma: Let (G, o ) be a group and let e be a right identity element
of G such that, for all a ∈ G, there exists a suitable x in G with a o x = e.
The existence of e is assured by the group axioms (iii) and (iv).
(1) If g ∈ G is such that g o g = g, then g = e.
(2) e is the unique right identity in G.
(3) A right inverse of an element in G is also a left inverse of the same
element. In other words, if a o x = e, then x o a = e.
(4) e is a left identity in G. That is, e o a = a for all a ∈ G.
(5) e is the unique left identity in G.
(6) Each element has a unique right inverse in G.
(7) Each element has a unique left inverse in G.
(8) The unique right inverse of any a ∈ G is equal to the unique left
inverse of a.

Proof: (1) Let g ∈ G be such that g o g = g. We choose a right inverse of
g with respect to e. This is possible by the axiom (iv). Let us call it h.
Thus g o h = e. Then
(g o g) o h = g o h
g o (g o h) = g o h (by associativity),
g o e = e (since g o h = e),
g = e (since e is a right identity).
This proves part (1).

(2) The claim is that e is the unique right identity in G. This means: if
f ∈ G is a right identity, that is, if a o f = a for all a ∈ G, then f = e.
Suppose f is a right identity. Then a o f = a for all a ∈ G. Writing f for a in
particular, we see f o f = f. Hence f = e by part (1).

(3) A right inverse x of an arbitrary element a ∈ G is also a left inverse
of a. This is what we are to prove. So we assume a o x = e and try to
derive x o a = e. We use part (1). If a o x = e, then
(x o a) o (x o a) = [(x o a) o x] o a (by associativity)
= [x o (a o x)] o a (by associativity)
= [x o e] o a
= x o a.
So g := x o a is such that g o g = g. By part (1), g = e. So x o a = e.

(4) We are to prove that e is a left identity. So we must show e o a = a
for all a ∈ G. Let a ∈ G and let x be a right inverse of a. Then
a o x = e
a o x = x o a (by part (3))
(a o x) o a = (x o a) o a
a o (x o a) = (x o a) o a
a o e = e o a
a = e o a.
Therefore, e is a left identity as well. This proves part (4).

(5) The claim is that e is the unique left identity in G. This means: if f is
a left identity in G so that f o a = a for all a ∈ G, then f = e. We know that
the right identity e is a left identity (part (4)), and that e is the unique
right identity (part (2)). So we conclude that e is the unique left
identity. Is this correct? No, this is wrong. This would be correct if we
knew that any left identity is also a right identity (and so the unique
right identity by part (2)), which is not what part (4) states. For all we
have proved up to now, there may very well be a unique right identity
and many left identities (among them the right identity). We are to show
in part (5) that this is impossible.
After so much fuss, now the correct proof, which is very short. Suppose
f o a = a for all a ∈ G. Write in particular f for a. Then f o f = f and part (1)
yields f = e.

(6) The claim is that each element a ∈ G has a unique right inverse in G.
We know that a has at least one right inverse, say x. We have a o x = e.
We are to show: if a o y = e, then y = x (here y ∈ G). Suppose then
a o x = e and a o y = e. We obtain
x o a = e (by part (3))
(x o a) o y = e o y
x o (a o y) = e o y
x o e = e o y
x = e o y
x = y (by part (4)).
This proves part (6).

(7) and (8) Let a ∈ G and let x be the unique right inverse of a. From
part (3), we know that x is a left inverse of a, so that x o a = e. We must
prove: if x o a = e and y o a = e, then y = x. Suppose then x o a = e and
y o a = e. Then
a o x = e
y o (a o x) = y o e
(y o a) o x = y
e o x = y
x = y.
This completes the proof.

According to Lemma 7.3, a group (G, o ) has one and only one right
identity, which is also the unique left identity. Therefore, we can refer
to it as the identity of the group, without mentioning right or left. Simi-
larly, since any a ∈ G has a unique right inverse, which is also the
unique left inverse of a, we may call it the inverse of a. The inverse of a
is uniquely determined by a; for this reason, we introduce a notation
displaying the fact that it depends on a alone. We write a⁻¹ for the
inverse of a (read: a inverse). Thus a⁻¹ is the unique element of G such
that a o a⁻¹ = a⁻¹ o a = e, where e is the identity of the group.

The group axioms, as presented in Definition 7.2, assert the existence of


a right identity, and a right inverse of each element. We proved in
Lemma 7.3 that a right identity is also a left identity and a right inverse
of an element is also a left inverse of the same element. One could give
an alternative definition of a group by so modifying the axioms that
they assert the existence of a left identity, and a left inverse of each
element. A lemma analogous to Lemma 7.3 would prove then that there
is a unique left identity, which is also a unique right identity and that
each element has a unique left inverse, which is also a unique right
inverse of that element. Thus the existence of a right identity plus right
inverses lead to the same algebraic structure (group) as the existence of
a left identity plus left inverses.

However, the existence of a right identity and the existence of left
inverses do not always produce a group. For example, consider the set
ℤ × ℤ. For any (a,b), (c,d) ∈ ℤ × ℤ, we put (a,b) ⋄ (c,d) = (a, b + d). Let
us check if (ℤ × ℤ, ⋄) is a group.
(i) ⋄ is a binary operation on ℤ × ℤ since a ∈ ℤ, b + d ∈ ℤ
whenever a,b,c,d ∈ ℤ. So ℤ × ℤ is closed under ⋄.
(ii) Is ⋄ associative? For any (a,b), (c,d), (e,f) ∈ ℤ × ℤ, we ask
[(a,b) ⋄ (c,d)] ⋄ (e,f) ≟ (a,b) ⋄ [(c,d) ⋄ (e,f)]
(a, b + d) ⋄ (e,f) ≟ (a,b) ⋄ (c, d + f)
(a, (b + d) + f) ≟ (a, b + (d + f))
Yes, this is true since + is an associative operation on ℤ. Hence ⋄ is
associative.
(iii) Is there an element in ℤ × ℤ, (a₀,b₀) say, such that
(a,b) ⋄ (a₀,b₀) = (a,b) for all (a,b) ∈ ℤ × ℤ?
Well, this is true if and only if (a, b + b₀) = (a,b), which is equivalent to
b₀ = 0. There is no condition on a₀. For example,
(a,b) ⋄ (0,0) = (a, b + 0) = (a,b)
(a,b) ⋄ (1,0) = (a, b + 0) = (a,b)
for all (a,b) ∈ ℤ × ℤ, so (0,0) and (1,0) are right identities. In fact, any
(n,0) ∈ ℤ × ℤ is a right identity.

From Lemma 7.3, we know that a group has one and only one right
identity. So ℤ × ℤ is not a group under ⋄. On the other hand, with
respect to (0,0) for example (in fact, with respect to any right identity),
each element (a,b) of ℤ × ℤ has a left inverse (0,−b):
(0,−b) ⋄ (a,b) = (0, −b + b) = (0,0)
(with respect to (n,0), a left inverse of (a,b) is (n,−b)).

So (ℤ × ℤ, ⋄) is a system in which a right identity exists, plus a left
inverse of each element; nevertheless, it fails to be a group. Likewise,
fulfilling the existence of a left identity and right inverses is not enough
for building a group.
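A finite variant of this counterexample can be checked by machine. In the sketch below (our own finite restriction: first coordinates in {0, 1, 2}, second coordinates taken mod 5), the operation has several right identities, and every element has a left inverse with respect to (0,0), yet Lemma 7.3 shows it cannot be a group.

```python
# A finite variant (our own) of the counterexample above: first coordinate
# in {0, 1, 2}, second coordinate an integer mod 5, with the operation
# (a, b) <> (c, d) = (a, b + d).  Several right identities exist, and each
# element has a left inverse, yet by Lemma 7.3 this cannot be a group.
A, B = range(3), range(5)
pairs = [(x, y) for x in A for y in B]

def op(p, q):
    (a, b), (c, d) = p, q
    return (a, (b + d) % 5)

right_identities = [e for e in pairs if all(op(p, e) == p for p in pairs)]
print(right_identities)              # every (n, 0) works -- not unique!

for (a, b) in pairs:
    assert op((0, (5 - b) % 5), (a, b)) == (0, 0)   # (0, -b) is a left inverse
```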

We could define a group by including the claims of Lemma 7.3 directly
into the definition. Then we would have

(iii′) there is a unique e ∈ G such that
a o e = e o a = a for all a ∈ G
and
(iv′) for all a ∈ G, there is a unique a⁻¹ ∈ G such that
a o a⁻¹ = e = a⁻¹ o a

in place of (iii) and (iv) of Definition 7.2. Some textbooks define groups in
this way. This would save us from the trouble of proving Lemma 7.3.
Why, then, did we not use this definition? Because we do not want to do
unnecessary work. If we defined groups by (iii′) and (iv′) instead of (iii)
and (iv), then, each time we wanted to show that a set G builds a
group under a binary operation o on G, we would have to check
1) that there is an e ∈ G such that a o e = a for all a ∈ G,
2) that this e is also such that e o a = a for all a ∈ G,
3) that e is the unique element of G with these two properties,
4) that for each a ∈ G, there is an a⁻¹ ∈ G such that a o a⁻¹ = e,
5) that a⁻¹ o a = e as well,
6) that this a⁻¹ is the unique element of G with a o a⁻¹ = e = a⁻¹ o a,
which more than doubles our work. With our Definition 7.2, we need
check only 1) and 4). The other items 2), 3), 5), 6) follow from 1) and 4)
automatically. We pay for our comfort by having to prove Lemma 7.3,
but, once this is over, we have less work to do in order to see whether a
given set G forms a group under a given operation o on it, as in the
following examples.

7.4 Examples: (a) For any two elements a,b of ℝ\{1}, we put
a o b = ab − a − b + 2. We ask if ℝ\{1} is a group under o . Let us check the
group axioms.
(i) For all a,b ∈ ℝ\{1}, we observe a o b = ab − a − b + 2 ∈ ℝ,
but this is not enough. We must prove a o b ≠ 1 also. Let a,b ∈ ℝ, a ≠ 1 ≠
b. We suppose a o b = 1 and try to reach a contradiction. If a o b = 1, then
ab − a − b + 2 = 1
ab − a − b + 1 = 0
(a − 1)(b − 1) = 0
a − 1 = 0 or b − 1 = 0
a = 1 or b = 1,
a contradiction. So a o b ∈ ℝ\{1} and o is a binary operation on ℝ\{1}.
(ii) For all a,b,c ∈ ℝ\{1}, we ask if (a o b) o c = a o (b o c).
(ab − a − b + 2) o c ≟ a o (bc − b − c + 2)
(ab − a − b + 2)c − (ab − a − b + 2) − c + 2 ≟ a(bc − b − c + 2) − a − (bc − b − c + 2) + 2
abc − ac − bc + 2c − ab + a + b − 2 − c + 2 ≟ abc − ab − ac + 2a − a − bc + b + c − 2 + 2
The answer is "yes." So o is associative.
(iii) We are looking for an e ∈ ℝ\{1} such that a o e = a for all
a ∈ ℝ\{1}. Assuming such an e exists, we get
ae − a − e + 2 = a
ae − e = 2a − 2
(a − 1)e = 2(a − 1)
e = 2 (since a − 1 ≠ 0).
We have not proved that 2 ∈ ℝ\{1} is a right identity element. We
showed only that a right identity element, if it exists at all, has to be 2.
Let us see if 2 is really a right identity. We observe
a o 2 = a·2 − a − 2 + 2 = 2a − a = a
for all a ∈ ℝ\{1}. Since 2 ∈ ℝ\{1}, 2 is indeed a right identity in ℝ\{1}.
(iv) For all a ∈ ℝ\{1}, we must find an x ∈ ℝ\{1} such that
a o x = 2. Well, this gives
ax − a − x + 2 = 2
ax − a − x + 1 = 1
(a − 1)(x − 1) = 1
x − 1 = 1/(a − 1)
x = a/(a − 1),
which is meaningful since a ≠ 1. We have not proved yet that a/(a − 1) is
a right inverse of a. We showed only that a right inverse of a ∈ ℝ\{1}, if
it exists at all, has to be a/(a − 1). We must now show that a o a/(a − 1) =
2 for all a ∈ ℝ\{1} and also that a/(a − 1) ∈ ℝ\{1}. Good. We have
a o a/(a − 1) = a(a/(a − 1)) − a − (a/(a − 1)) + 2
= (a − 1)(a/(a − 1)) − a + 2
= a − a + 2
= 2,
and also a/(a − 1) ≠ 1, for a/(a − 1) ∈ ℝ and a/(a − 1) = 1 would imply
that a = a − 1, hence 0 = −1, which is absurd.
Since all the group axioms hold, ℝ\{1} is a group under o .
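The computations in this example are easy to spot-check numerically. The sketch below (the sample points are our own choices; floating-point tolerances are used) verifies the right identity 2, the right inverse a/(a − 1), and associativity on the samples.

```python
# Floating-point spot checks (sample points our own) for Example 7.4(a):
# on R\{1}, a o b = ab - a - b + 2 has right identity 2 and the right
# inverse of a is a/(a - 1).
def op(a, b):
    return a * b - a - b + 2

samples = [-3.0, -0.5, 0.0, 2.0, 4.5]
for a in samples:
    assert op(a, 2) == a                      # 2 is a right identity
    x = a / (a - 1)                           # candidate right inverse
    assert abs(op(a, x) - 2) < 1e-12          # a o a/(a-1) = 2
    for b in samples:
        for c in samples:                     # associativity on the samples
            assert abs(op(op(a, b), c) - op(a, op(b, c))) < 1e-12
print("group axioms hold at the sampled points")
```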

(b) Let us define an operation * on ℤ by putting a * b = a + b + 2 for all
a,b ∈ ℤ. Does ℤ form a group under *?
(i) For any a,b ∈ ℤ, a * b = a + b + 2 is an integer. So ℤ is
closed under *.
(ii) For all a,b,c ∈ ℤ, we ask if (a * b) * c = a * (b * c). We
have (a * b) * c = (a + b + 2) * c
= (a + b + 2) + c + 2
= a + (b + 2 + c) + 2
= a + (b + c + 2) + 2
= a + (b * c) + 2
= a * (b * c).
So * is associative.
(iii) Is there an integer e such that a * e = a for all a ∈ ℤ?
Well, this gives a + e + 2 = a and e = −2. Let us check whether −2 is
really a right identity element. We observe that a * (−2) = a + (−2) + 2 = a
for all a ∈ ℤ. So −2 is a right identity element.
(iv) Does each integer a have a right inverse in ℤ? The
condition a * x = −2 yields
a + x + 2 = −2
x = −4 − a.
−4 − a is indeed a right inverse of a since a * (−4 − a) = a + (−4 − a) + 2 =
−2.
Therefore ℤ is a group with respect to *.

(c) Let A be a nonempty set and let P(A) be the set of all subsets of A.
The elements of P(A) are thus subsets of A. Consider the forming of
symmetric differences Δ (§1, Ex.7). P(A) is a group under Δ:
(i) For all S,T ∈ P(A), S Δ T is a subset of A, so S Δ T ∈ P(A)
and P(A) is closed under Δ.
(ii) Δ is associative (§1, Ex.8).
(iii) ∅ is a right identity (§1, Ex.8).
(iv) Each element S of P(A) has a right inverse, namely S itself,
as S Δ S = ∅ for all S ∈ P(A) (§1, Ex.8).
So P(A) is a group under Δ.
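For a concrete set A this group can be verified exhaustively. The sketch below (our own choice A = {1, 2, 3}) realizes symmetric difference with Python's frozenset operator ^ and checks all four axioms.

```python
# The symmetric-difference group for the concrete choice A = {1, 2, 3},
# realized with frozensets and Python's ^ (symmetric difference) operator.
from itertools import chain, combinations

A = {1, 2, 3}
P = [frozenset(s) for s in chain.from_iterable(
        combinations(sorted(A), r) for r in range(len(A) + 1))]

empty = frozenset()
for S in P:
    assert S ^ empty == S                       # (iii) identity
    assert S ^ S == empty                       # (iv) S is its own inverse
    for T in P:
        assert (S ^ T) in P                     # (i) closure
        for U in P:
            assert (S ^ T) ^ U == S ^ (T ^ U)   # (ii) associativity
print("the power set of A with", len(P), "elements is a group")
```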

We have seen many examples of groups. In some of the groups (G, o ),
the underlying set G is infinite, in some finite. The number of elements
of G, more precisely the cardinality of G, is called the order of the group
(G, o ). We denote the order of (G, o ) by |G|. A group (G, o ) is called a finite
group if G is finite, and an infinite group if G is infinite. One might
distinguish between various infinite cardinalities, but we will not do so
in this book. When the order of a group (G, o ) is infinite, we write |G| = ∞.
The symbol ∞ will stand for all types of infinities.

A. Cayley (1821-1895) introduced a convenient device for investigating
groups. Let (G, o ) be a finite group. We make a table that displays a o b
for each a,b ∈ G. We divide a square into |G|² cells by dividing the sides
into |G| parts. Each one of the rows will be indexed by an element of the
group, usually written on the left of the row. Likewise, each one of the
columns will be indexed by an element of the group, usually written
above the column. Each element will index only one row and only one
column. It is customary to use the same ordering of the elements to
index the rows and columns. Also, the first row and the first column are
customarily indexed by the identity element of the group. In the cell
where the row of a ∈ G and the column of b ∈ G meet, we write down
a o b. This square is known as the Cayley table or the operation table
(multiplication or addition table, as the case may be) of the group (G, o ).

As an illustration, we give the addition table of (ℤ4,+) below. (ℤ4,+) is a
group by Example 7.1(c). We drop the bars for convenience.

+ | 0 1 2 3
--+--------
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2
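Such a table can be produced mechanically. The small helper below (cayley_table is our own name, not from the text) prints the Cayley table of (ℤn, +) for n = 4.

```python
# A small helper (our own) that prints the Cayley table of (Z_n, +); with
# n = 4 it reproduces the addition table of the text.
def cayley_table(elements, op):
    head = "+ | " + " ".join(str(e) for e in elements)
    rows = [f"{a} | " + " ".join(str(op(a, b)) for b in elements)
            for a in elements]
    return "\n".join([head] + rows)

print(cayley_table(range(4), lambda a, b: (a + b) % 4))
```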

We observe in this table that every element of ℤ4 appears once in each
row and also in each column. This is a general property of groups: if (G, o )
is a group, then every element b of G appears once and only once in the
row of any a ∈ G, say in the cell where the row of a and the column of
x ∈ G meet. A similar assertion holds for columns. This is the content of
the next lemma.

7.5 Lemma: Let (G, o ) be a group and a,b ∈ G.
(1) There is one and only one x ∈ G such that a o x = b.
(2) There is one and only one y ∈ G such that y o a = b.

Proof: (1) We prove first that there can be at most one x ∈ G such that
a o x = b. Let a o x = b = a o x₁. We prove x = x₁. We have
a o x = a o x₁
a⁻¹ o (a o x) = a⁻¹ o (a o x₁)
(a⁻¹ o a) o x = (a⁻¹ o a) o x₁
e o x = e o x₁
x = x₁
by Lemma 7.3. So there can be at most one x with a o x = b.

The existence of at least one such x is easily seen when we put x = a⁻¹ o b.
Indeed, a o (a⁻¹ o b) = (a o a⁻¹) o b = e o b = b.
So there is one and only one element x of G, namely x = a⁻¹ o b, such that
a o x = b. This proves (1).

The proof of (2) is similar and is left to the reader.

We give an application of Lemma 7.5. We determine the Cayley table of


groups of order 3. Let ({e,a,b}, o ) be a group of order 3, where e is the
identity. The Cayley table of this group contains the information given
in Figure 1. Now we fill the remaining four cells. What is a o a? The cell *
cannot contain a, for a would otherwise appear more than once in the

  | e a b          | e a b          | e a b
e | e a b        e | e a b        e | e a b
a | a *          a | a b          a | a b e
b | b            b | b            b | b e a

  Figure 1         Figure 2         Figure 3

second row (or column). So the cell * contains e or b. If it contained e,
then the third entry in the second row would have to be b and b would
appear at least twice in the third column, contrary to Lemma 7.5. This
leaves only the possibility a o a = b. Then we have the table in Figure 2.
The remaining cells are necessarily filled in as in Figure 3.

We did not prove that Figure 3 is a Cayley table of a group of order 3.
At this stage, we do not even know whether a group of order 3 exists.
We proved: if there is a group of order 3 at all, then its Cayley table is
the table of Figure 3. We now prove the existence of a group of order 3.
We use Figure 3. Let {e,a,b} be a set of 3 elements, and let the binary
operation o on this set be defined as in Figure 3. It is easy to check the
group axioms (i),(iii),(iv). It remains to check associativity. We must
verify 3·3·3 = 27 equations (x o y) o z = x o (y o z), where x,y,z ∈ {e,a,b}.
An equation of this type is true when one of x,y,z is equal to e. So we
are left with 2·2·2 = 8 equations

(a o a) o a = a o (a o a) (b o a) o a = b o (a o a)
(a o a) o b = a o (a o b) (b o a) o b = b o (a o b)
(a o b) o a = a o (b o a) (b o b) o a = b o (b o a)

(a o b) o b = a o (b o b) (b o b) o b = b o (b o b)

and these are verified easily. Hence ({e,a,b}, o ) is a group. There is a
group of order 3. Any two groups of order 3 have essentially the same
Cayley table, namely the table in Figure 3. This statement will be made
precise in §20.

The Cayley tables of (ℤ4,+) and ({e,a,b}, o ) are symmetric about the
principal diagonal (that joins the upper-left and lower-right cells). What
does this signify? The symmetry of the Cayley table of a group (G, o )
means that the cell where the i-th row and j-th column meet has the
same entry as the cell where the j-th row and i-th column meet, and
this for all i,j = 1,2, . . . , |G|. Assuming the i-th row is the row of a ∈ G
and the j-th column is the column of b ∈ G (and assuming we index the
rows and columns by the elements of G in the same order), this means:
a o b = b o a for all a,b ∈ G. So the group is commutative in the following
sense.

7.6 Definition: A group (G, o ) is called a commutative group or an
abelian group, if, in addition to the group axioms (i)-(iv), a fifth axiom

(v) a o b = b o a for all a,b ∈ G

holds.

A binary operation o on a set G is called commutative when a o b = b o a
for all a,b ∈ G. So a commutative group is one where the operation is
commutative. The term "abelian" is used in honor of N. H. Abel, a
Norwegian mathematician (1802-1829).

We close this paragraph with some comments on the group axioms. The
reader might ask why we should study the structures (G, o ) where o
satisfies the axioms (i),(ii),(iii),(iv). Why do we not study structures (G, o )
where o satisfies the axioms (i),(iii),(iv),(v) or (i),(ii),(iii),(v)? What is the
reason for preferring the axioms (i),(ii),(iii),(iv) to some other combination
of (i),(ii),(iii),(iv),(v)? There is of course no reason why other combinations
ought to be excluded from study. As a matter of fact, all combinations
have a proper name and there are theories about them. However, they

are very far from having the same importance as the combination (i),(ii),
(iii),(iv).

A mathematical theory, if it deserves to be considered important, has to


possess both generality and informative significance. Clearly, a theory
whose axioms are too restrictive to hold in a variety of cases is bound to
be insignificant for those who cannot fulfill them in their area of study,
and the theory will have limited interest. An interesting theory is a gen-
eral one. But generality costs content. When we wish that the axioms of
a theory be fulfilled in diverse areas and in many contexts, we must
also realize that the theory can only deal with what is common in these
diverse areas, and this might be nil. There we have the danger that the
theory will degenerate into a list of uninformative paraphrases of the
axioms without substance. Imposing restrictions on the axioms dimin-
ishes the use and interest of a theory, and lifting restrictions tends to
make the theory void. The balance of generality against content is very
delicate. Group theory is one of the cases where this balance is attained
successfully. Group theory has applications in literally every branch of
mathematics, both pure and applied, as well as in theoretical physics
and other sciences, and it is a theory full of deep, interesting, beautiful
results. This is why the choice (i),(ii),(iii),(iv) is judicious. Other combina-
tions of the axioms are not as fruitful as (i),(ii),(iii),(iv).

Exercises

1. Determine whether the following sets build groups with respect to the
operations given. In each case, state which group axioms are satisfied.
(a) ℤ under subtraction, multiplication and division.
(b) ℚ\{0}, ℝ\{0}, ℂ\{0} under multiplication.
(c) {0,1}, {−1,1} under multiplication.
(d) {z ∈ ℂ : |z| ≤ 1} under multiplication.
(e) {z ∈ ℂ : |z| = 1} under multiplication.
(f) 5ℤ = {5z : z ∈ ℤ} under multiplication and addition.
(g) {x} under o , where x o x = x.
(h) {(t,u) ∈ ℤ × ℤ : t² − 5u² = 4} under *, where * is defined by
(t₁,u₁) * (t₂,u₂) = ((t₁t₂ + 5u₁u₂)/2, (t₁u₂ + t₂u₁)/2)
for all (t₁,u₁), (t₂,u₂) in this set.
(i) ℤ6 and ℤ8 under multiplication and addition.
(j) ℤ7 and ℤ7\{0̄} under multiplication.
(k) {f,g} under the composition of mappings, where f: x ↦ x and
g: x ↦ 1/(1 − x) are functions from ℝ\{1} into ℝ\{1}.
(l) {f,g,h} under the composition of mappings, where f: x ↦ x and
g: x ↦ 1/(1 − x) and h: x ↦ (x − 1)/x are functions from ℝ\{1,0} into
ℝ\{1,0}.
(m) {f_{a,b} : a,b ∈ ℝ, a ≠ 0} under the composition of mappings,
where f_{a,b} is defined by f_{a,b}(x) = ax + b as a function from ℝ into ℝ.

2. For which m ∈ ℕ is the set ℤm\{0̄} a group under multiplication?

§8
Conventions and Some Computational Lemmas

In our study of groups, we are interested in how a o b depends on a and


b, not in the name or sign of the operation. For this reason, we suppress
the operation sign altogether and use juxtaposition. Henceforward, we
will write ab (and also occasionally a .b) for a o b. We will refer to the
operation as multiplication. Thus "multiplication" will be used in a broad
sense. It can mean the usual multiplication of numbers, but also the
composition of mappings, the taking of symmetric differences of two
sets, or some rather artificial operation like those in Example 7.4. With
this convention, there is no need to refer to the operation all the time
when we discuss groups. So we call the set G a group, instead of the
ordered pair (G, o ) (we keep in mind of course that there can be many
groups on the same set). We say then that the group is written multipli-
catively or that G is a multiplicative group. Conforming to this, we call
ab the product of a and b. Also, we write 1 for the identity element of
the group. Thus 1 is not necessarily the number one. It is perhaps the
identity mapping, perhaps the empty set, perhaps some other object.
What it is depends on the group we are investigating. But a warning: we
will not write 1/a for the inverse a⁻¹ of an element a in a group.

This is the multiplicative notation for groups. Sometimes, we shall use


the additive notation, too, especially when the group is commutative.
Then the operation is denoted by "+" and is called addition. Like "multi-
plication", "addition" is used in a general sense. We call a + b the sum of
a and b. When we have an additive group, the identity element of the
group will be written as 0. So 0 is not necessarily the number zero. Also,
we write −a for the inverse of an element a in an additively written
group. We call −a the opposite of a.

8.1 Lemma: Let G be a group and let a,b,c ∈ G.
(1) If ab = ac, then b = c (left cancellation).
(2) If ba = ca, then b = c (right cancellation).

Proof: (1) If ab = ac, we multiply by a⁻¹ on the left and get a⁻¹(ab) =
a⁻¹(ac). Using associativity, we obtain (a⁻¹a)b = (a⁻¹a)c. So 1b = 1c. Since 1
is the identity element of G, we finally get b = c.

(2) The proof of (2) is similar and is left to the reader.

We must be careful when we want to use Lemma 8.1 to make cancel-
lation. If the group is not commutative, left multiplication by an element
and right multiplication by the same element give in general different
results. In the proof of Lemma 8.1, we multiplied by a⁻¹ on the same
side. We cannot conclude b = c from ab = ca, for instance. Indeed, from
ab = ca we get a⁻¹(ab) = a⁻¹(ca), then (a⁻¹a)b = (a⁻¹c)a, then b = a⁻¹ca,
and this is all we can say. In general, a⁻¹ca ≠ c, so b ≠ c. You must always
make sure that you cancel on the same side.

Cancellations are multiplications by inverse elements. We now evaluate


the inverse of an inverse, and the inverse of a product.

8.2 Lemma: Let G be a group and let a,b ∈ G. Then
(1) (a⁻¹)⁻¹ = a,
(2) (ab)⁻¹ = b⁻¹a⁻¹.

Proof: (1) aa⁻¹ = 1 by the definition of a⁻¹. So a is a left inverse of a⁻¹.
So a is the inverse of a⁻¹ (Lemma 7.3).

(2) (ab)(b⁻¹a⁻¹) = a(b(b⁻¹a⁻¹)) = a((bb⁻¹)a⁻¹) = a(1a⁻¹) = aa⁻¹ = 1, and so
b⁻¹a⁻¹ is the inverse of ab.

Therefore, the inverse of the inverse of an element is the element itself.
Also, the inverse of a product is the product of the inverses, but in the
reverse order. Do not write (ab)^-1 = a^-1 b^-1. This is wrong unless
a^-1 b^-1 = b^-1 a^-1, which is equivalent to ab = ba (why?) and which is not
true in general.
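The rule can be watched in action in any noncommutative group. The following sketch (our own illustration, using 2x2 integer matrices of determinant 1, for which the inverse has integer entries) checks both the correct and the wrong formula:

```python
# 2x2 integer matrices of determinant 1, stored as nested tuples; the two
# sample matrices are our own choices.
def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(X):
    """Inverse of a 2x2 matrix with determinant 1."""
    (a, b), (c, d) = X
    return ((d, -b), (-c, a))

A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

print(inv(mul(A, B)) == mul(inv(B), inv(A)))   # True:  (AB)^-1 = B^-1 A^-1
print(inv(mul(A, B)) == mul(inv(A), inv(B)))   # False: the order matters
```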

*  *  *

We defined the product of two elements. The product of a and b is ab.


We now want to define the product of n elements and prove that the
usual exponentiation rules are valid. The rest of this paragraph is
extremely dull. The reader may just glance at the assertions and skip
the proofs if she (or he) wishes.

By the product of three elements a,b,c in a group G, we understand an


element abc of G. Let us recall we agreed to denote by abc the element
(ab)c = a(bc). So the product of a,b,c in this order is evaluated by two
successive multiplications. Either we evaluate ab first, then multiply it
by c, or we evaluate bc first, then multiply a by it. In either way, we get
the same result by associativity and this result is denoted by abc, with-
out parentheses.

Now let us consider the product of four elements a,b,c,d. Their product
in this order will be defined by three successive multiplications of two
elements. This can be done in five distinct ways:
a (b(cd)), a ((bc)d), (ab)(cd), ((ab)c )d, (a(bc))d,
but these five products are all equal by associativity. The first two
products are equal since b(cd) = (bc)d. The last two products are equal
since (ab)c = a(bc). Further, we have a (b(cd)) = (ab)(cd) [put cd = e, then
a(be) = (ab)e] and (ab)(cd) = ((ab)c )d [put ab = f, then f(cd) = (fc)d]. So
the five products are equal. This renders it possible to drop the
parentheses and write simply abcd. This is the product of a,b,c,d in the
given order.

More generally, we want to define the product of n elements a1,a2,...,an
in a group G (n ≥ 2). The product of a1,a2,...,an will be defined by n-1
successive multiplications of two elements. By inserting parentheses in
all possible ways, we obtain many products (their exact number is
2·6·10···(4n-6)/n!), but associativity assures that these products are equal.
Now we prove this. In view of some later applications, the following
lemma is stated more generally than for groups.

8.3 Lemma: Let G be a nonempty set and let there be defined an
associative binary operation on G, denoted by juxtaposition. Let
a1,a2,...,an ∈ G. Then the products of a1,a2,...,an are independent of the
mode of putting parentheses. This means the following. We define

P1(a1) = {a1}

P2(a1,a2) = {a1a2}

P3(a1,a2,a3) = {(a1a2)a3, a1(a2a3)}
            = {xy : x ∈ P1(a1), y ∈ P2(a2,a3) or x ∈ P2(a1,a2), y ∈ P1(a3)}

P4(a1,a2,a3,a4) = {a1(a2(a3a4)), a1((a2a3)a4), (a1a2)(a3a4), ((a1a2)a3)a4, (a1(a2a3))a4}
               = {xy : x ∈ P1(a1), y ∈ P3(a2,a3,a4) or x ∈ P2(a1,a2), y ∈ P2(a3,a4) or
                       x ∈ P3(a1,a2,a3), y ∈ P1(a4)}
..........................................

Pk(a1,a2,...,ak) = {xy : x ∈ Pi(a1,a2,...,ai), y ∈ P_{k-i}(a_{i+1},...,ak) for some
                         i = 1,2,...,k-1}

for k = 1,2,...,n. Thus Pk are subsets of G whose elements are the
products of a1,a2,...,ak, reduced to k-1 successive multiplications of two
elements in G.

Claim: For all n ∈ ℕ and for all a1,a2,...,an ∈ G, the set Pn(a1,a2,...,an)
contains one and only one element.

Proof: The proof will be by induction on n (in the form 4.5). For n =
1,2, it is evident that P1(a1), P2(a1,a2) each have exactly one element. For
n = 3, the claim is just the associativity of multiplication. For n = 4, the
argument preceding the lemma proves the claim. Notice that we used
only the associativity of multiplication there.

Suppose n ≥ 5 and the lemma is proved for 1,2,...,n-1. Let
u,v ∈ Pn(a1,a2,...,an). We are to prove u = v. By the definition of
Pn(a1,a2,...,an), we have u = xy, v = st, where

x ∈ Pi(a1,a2,...,ai), y ∈ P_{n-i}(a_{i+1},...,an), i ∈ ℕ, 1 ≤ i ≤ n-1,

s ∈ Pj(a1,a2,...,aj), t ∈ P_{n-j}(a_{j+1},...,an), j ∈ ℕ, 1 ≤ j ≤ n-1.

We prove u = v first under the assumption i = j. By induction, the set
Pi(a1,a2,...,ai) contains one and only one element. Hence x = s. Also,
applying the induction hypothesis to n-i, with the elements a_{i+1},...,an,
we conclude that P_{n-i}(a_{i+1},...,an) has one and only one element. This
gives y = t. Then we get u = xy = sy = st = v. So the claim is proved in
case i = j.

Now suppose i ≠ j. Without losing generality, we assume i < j. We put
j = i + h, with h ∈ ℕ. Now apply the induction hypothesis to j, with the
elements a1,...,aj. There is a unique element in Pj(a1,...,aj), which we
called s. Also by induction, applied to i with the elements a1,...,ai, there
is a unique element in Pi(a1,a2,...,ai), namely x. Again by induction,
applied to h with the elements a_{i+1},...,aj, there is a unique element in
Ph(a_{i+1},...,aj), say b. By the definition of Pj(a1,...,aj), we have
xb ∈ Pj(a1,...,aj), so xb = s.

We have n-i = h + (n-j). By induction, applied to n-i with the ele-
ments a_{i+1},...,an, the set P_{n-i}(a_{i+1},...,an) has one and only one element,
which we called y. Also by induction, applied to h with the elements
a_{i+1},...,aj, there is a unique element in Ph(a_{i+1},...,aj), namely b. Again by
induction, applied to n-j with the elements a_{j+1},...,an, the set
P_{n-j}(a_{j+1},...,an) has a unique element, namely t. By the definition of
P_{n-i}(a_{i+1},...,a_{i+h},a_{j+1},...,an), we have bt ∈ P_{n-i}(a_{i+1},...,an), so bt = y.

Thus xb = s and bt = y. This gives u = xy = x(bt) = (xb)t = st = v. This
completes the proof.

8.4 Definition: The unique element in Pn(a1,a2,...,an) of Lemma 8.3 is
called the product of a1,a2,...,an (in this order) and is denoted by a1a2...an

or by ∏_{i=1}^{n} ai.

So the product of n elements in a given order can be written without


parentheses. This simplifies the notation enormously.
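For a small computational illustration (a sketch of our own, not part of the text), one can generate the sets Pk by brute force for four matrices and observe that all five parenthesizations agree:

```python
# The sets P_k of Lemma 8.3, computed recursively: every value obtainable
# by parenthesizing a product of the given elements in the given order.
def products(elems, op):
    if len(elems) == 1:
        return {elems[0]}
    out = set()
    for i in range(1, len(elems)):                 # split point
        for x in products(elems[:i], op):
            for y in products(elems[i:], op):
                out.add(op(x, y))
    return out

# Matrix multiplication is associative but not commutative.
def mul(X, Y):
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

ms = [((1, 1), (0, 1)), ((1, 0), (1, 1)), ((0, 1), (1, 0)), ((2, 0), (0, 1))]
print(len(products(ms, mul)))   # 1: all five parenthesizations agree
```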

Using the notation of Definition 8.4, we can reformulate Lemma 8.3 as
follows. If G is a nonempty set with an associative multiplication on it,
and if a1,a2,...,an ∈ G, then

a1(a2...an) = (a1a2)(a3...an) = (a1a2a3)(a4...an) = ... = (a1a2...a_{n-1})an = a1a2...an.

We write a^n for a1a2...an in case a1,a2,...,an are all equal to a ∈ G, n ∈ ℕ.
In particular, a^1 = a. We have a^n = a^(n-1) a = a a^(n-1). More generally, the
above reformulation of Lemma 8.3 gives

a^m a^n = a^(m+n), for all a ∈ G and m,n ∈ ℕ. (*)

In particular, (a^m)^2 = a^m a^m = a^(m+m) = a^(2m) = a^(m·2). We prove more generally
(a^m)^n = a^(mn) by induction on n. The case n = 1 is trivial and the case n = 2
has just been shown. Assume now n ≥ 3 and (a^m)^(n-1) = a^(m(n-1)) for all
a ∈ G. We want to show (a^m)^n = a^(mn) for all a ∈ G. We have (a^m)^n =
(a^m)^(1+(n-1)) = (a^m)^1 (a^m)^(n-1) = a^m (a^m)^(n-1) by (*), with a^m, 1, n-1 in place of
a,m,n, respectively. Then we get (a^m)^n = a^m a^(m(n-1)) = a^m a^(mn-m) = a^(m+(mn-m)) by
(*), with a, m, mn-m in place of a,m,n, respectively. This gives (a^m)^n = a^(mn).
Thus we proved the

8.5 Lemma: If there is an associative multiplication on a nonempty set
G, denoted by juxtaposition, then
a^m a^n = a^(m+n) and (a^m)^n = a^(mn) for all a ∈ G and m,n ∈ ℕ.

Lemma 8.5 can be extended to arbitrary integral powers in the case of


groups. We give the relevant definitions.

8.6 Definition: Let G be a group, a ∈ G, m ∈ ℕ. We put

a^0 = 1 = identity of G, and a^(-m) = (a^m)^-1 = inverse of a^m.

8.7 Lemma: Let G be a group. Then

(1) a^m a^n = a^(m+n);
(2) (a^-1)^m = a^(-m);
(3) (a^m)^n = a^(mn);
for all a ∈ G and m,n ∈ ℤ.

Proof: (1) We prove a^m a^n = a^(m+n). If m ≥ 1, n ≥ 1, Lemma 8.5 yields the
result. If m = 0, then a^0 a^n = 1a^n = a^n = a^(0+n) for all n ∈ ℕ; and if n = 0,
then a^m a^0 = a^m 1 = a^m = a^(m+0) for all m ∈ ℕ. So we have

a^m a^n = a^(m+n) whenever m,n ≥ 0. (e)

We must prove this relation also when m ≥ 0, n ≤ 0; m ≤ 0, n ≥ 0;
m ≤ 0, n ≤ 0. Changing our notation (replacing m,n by -m,-n) we must
prove (i) a^m a^(-n) = a^(m-n); (ii) a^(-m) a^n = a^(-m+n); (iii) a^(-m) a^(-n) = a^(-m+(-n)) for all m,n ≥ 0.

(i) Let m,n ≥ 0. If m ≥ n, then a^(m-n) a^n = a^m by (e). Multiplying by
(a^n)^-1 = a^(-n) on the right, we get a^(m-n) = a^m a^(-n) if m ≥ n. Taking the inverses
of both sides of this equation, we get, in case m ≥ n, a^n a^(-m) =
[(a^n)^-1]^-1 (a^m)^-1 = [a^m (a^n)^-1]^-1 = (a^m a^(-n))^-1 = (a^(m-n))^-1 = a^(-(m-n)) = a^(-m+n). Interchang-
ing m and n, we get a^m a^(-n) = a^(-n+m) = a^(m-n) in case n ≥ m. So a^m a^(-n) = a^(m-n),
irrespective of whether m ≥ n or n ≥ m.

(ii) Let m,n ≥ 0. If n ≥ m, then a^m a^(-m+n) = a^n by (e). Multiplying by
(a^m)^-1 = a^(-m) on the left, we get a^(-m+n) = a^(-m) a^n if n ≥ m. Taking the inverses
of both sides of this equation, we get, in case n ≥ m, a^(-n) a^m =
(a^n)^-1 (a^(-m))^-1 = (a^(-m) a^n)^-1 = (a^(-m+n))^-1 = (a^(n-m))^-1 = a^(-(n-m)) = a^(-n+m). Interchanging
n and m, we get a^(-m) a^n = a^(-m+n) in case m ≥ n. So a^(-m) a^n = a^(-m+n), irrespective
of whether m ≥ n or n ≥ m.

(iii) Let m,n ≥ 0. We have a^n a^m = a^(n+m) by (e). Taking the inverses
of both sides of this equation, we get a^(-m) a^(-n) = (a^m)^-1 (a^n)^-1 = (a^n a^m)^-1 =
(a^(m+n))^-1 = a^(-(m+n)) = a^(-m+(-n)) for all m,n ≥ 0.

Thus a^m a^n = a^(m+n) for all a ∈ G and m,n ∈ ℤ.

(2) We prove (a^-1)^m = a^(-m). This is true if m = 1, since (a^-1)^1 = a^-1 = (a^1)^-1.
Suppose now m ∈ ℕ, m ≥ 2 and (a^-1)^(m-1) = a^(-(m-1)). Then (a^-1)^m = (a^-1)^(m-1) (a^-1)
= a^(-(m-1)) a^-1 = a^(-m+1) a^-1 = a^(-m+1-1) = a^(-m). So (a^-1)^m = a^(-m) for all m ∈ ℕ by induc-
tion. It is also true when m = 0, as (a^-1)^0 = 1 = a^0 = a^(-0). Now we must
prove it for m ≤ 0. With a slight change in notation, we are to prove (a^-1)^(-m)
= a^m for all m ∈ ℕ. We have indeed (a^-1)^(-m) = [(a^-1)^m]^-1 = (a^(-m))^-1 = [(a^m)^-1]^-1 =
a^m. The first equality in this chain follows from Definition 8.6, with a^-1 in
place of a, the second from the fact that (a^-1)^m = a^(-m) for all m ∈ ℕ, which
we just proved, and the third from Definition 8.6.
So (a^-1)^m = a^(-m) for all m ∈ ℤ.

(3) We prove (a^m)^n = a^(mn). If m ≥ 1, n ≥ 1, Lemma 8.5 yields the result.
If m = 0, then (a^0)^n = 1^n = 1 = a^0 = a^(0n) for all n ∈ ℕ; and if n = 0, then
(a^m)^0 = 1 = a^0 = a^(m0) for all m ∈ ℕ. So we have

(a^m)^n = a^(mn) whenever m,n ≥ 0. (e′)

We must prove this relation also when m ≥ 0, n ≤ 0; m ≤ 0, n ≥ 0;
m ≤ 0, n ≤ 0. Replacing m,n by -m,-n we must prove (i) (a^m)^(-n) = a^(m(-n));
(ii) (a^(-m))^n = a^((-m)n); (iii) (a^(-m))^(-n) = a^((-m)(-n)) for all m,n ≥ 0.

Writing (e′) with a^-1 in place of a and using (2), we get (a^m)^(-n) = [(a^m)^-1]^n =
(a^(-m))^n = [(a^-1)^m]^n = (a^-1)^(mn) = a^(-(mn)) = a^(m(-n)). This proves (i). We also get (a^(-m))^n
= a^(-(mn)) = a^((-m)n). This proves (ii). Finally, we have (a^(-m))^(-n) = [(a^(-m))^-1]^n =
([(a^m)^-1]^-1)^n = (a^m)^n = a^(mn) = a^((-m)(-n)). This proves (iii).
Thus (a^m)^n = a^(mn) for all m,n ∈ ℤ.

The proof is complete.
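Lemma 8.7 can be spot-checked in a concrete group. The sketch below (our own, using the group of units mod 11; the modulus and base are arbitrary choices) runs through a range of positive and negative exponents. Python's built-in `pow(x, -1, p)` computes a modular inverse (Python 3.8+).

```python
# Integer powers (Definition 8.6) in the group of units mod 11, checked
# against Lemma 8.7.
p = 11
a = 7

def power(a, n):
    """a^n mod p for any integer n, following Definition 8.6."""
    if n >= 0:
        return pow(a, n, p)
    return pow(pow(a, -n, p), -1, p)   # a^(-n) := (a^n)^-1

for m in range(-5, 6):
    for n in range(-5, 6):
        assert power(a, m) * power(a, n) % p == power(a, m + n)   # 8.7(1)
        assert power(power(a, m), n) == power(a, m * n)           # 8.7(3)
print("Lemma 8.7 verified mod", p)
```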

8.8 Lemma: Let G be a group and a1,a2,...,an ∈ G. Then

(a1a2...an)^-1 = an^-1 ... a2^-1 a1^-1.

Proof: By induction on n. If n = 2, the assertion is true by Lemma
8.2(2). Suppose now n ∈ ℕ, n ≥ 3 and (a1a2...a_{n-1})^-1 = a_{n-1}^-1 ... a2^-1 a1^-1. Then

(a1a2...a_{n-1}an)^-1 = ((a1a2...a_{n-1})an)^-1
                    = an^-1 (a1a2...a_{n-1})^-1
                    = an^-1 (a_{n-1}^-1 ... a2^-1 a1^-1)
                    = an^-1 a_{n-1}^-1 ... a2^-1 a1^-1,

as was to be proved.
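A quick check of Lemma 8.8 (a sketch of our own, again with permutations of {0,1,2} under composition):

```python
# Lemma 8.8 on permutations of {0,1,2}, stored as tuples with p[i] the
# image of i: the inverse of a product is the product of the inverses in
# reverse order.
from functools import reduce

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

perms = [(1, 0, 2), (1, 2, 0), (2, 1, 0)]
lhs = inverse(reduce(compose, perms))
rhs = reduce(compose, [inverse(p) for p in reversed(perms)])
print(lhs == rhs)   # True
```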

Lemma 8.8 gives an alternative proof of (a^-1)^m = (a^m)^-1. When our group
is commutative, we have additional results, for example a^m b^n = b^n a^m.

8.9 Lemma: Let G be a group and a,b ∈ G. If ab = ba, then

(1) ab^n = b^n a;
(2) a^m b^n = b^n a^m;
for all m,n ∈ ℤ.

Proof: (1) We prove ab^n = b^n a. The case n = 0 is trivial. Also, ab^1 = ab =
ba = b^1 a by hypothesis and the claim is true for n = 1. Suppose now
n ∈ ℕ, n ≥ 2 and the claim is proved for n-1, so that ab^(n-1) = b^(n-1) a.
Then ab^n = a(b^(n-1) b) = (ab^(n-1))b = (b^(n-1) a)b = b^(n-1)(ab) = b^(n-1)(ba) = (b^(n-1) b)a =
b^n a. By induction, ab^n = b^n a for all n ∈ ℕ.
We multiply this relation by b^(-n) on the left and on the right. This gives
b^(-n) a = ab^(-n) for n ∈ ℕ. So ab^n = b^n a is true also when n ≤ -1. So ab^n = b^n a
for all n ∈ ℤ.

(2) We have b^n a = ab^n by (1). We use this as a hypothesis and apply (1)
with a,b,n replaced by b^n,a,m, respectively. Then we obtain a^m b^n = b^n a^m
for all m,n ∈ ℤ.

If G is not a group but merely a nonempty set with an associative
multiplication on it, the proof remains valid for the case m,n ∈ ℕ; and
also for the case m = 0 or n = 0, provided there is a unique identity e in
G and we agree to write a^0 = e for all a ∈ G:

8.10 Lemma: Let G be a nonempty set with an associative multiplica-
tion on it. Let a,b ∈ G.
(1) If ab = ba, then a^m b^n = b^n a^m for all m,n ∈ ℕ.
(2) If, in addition, there is a unique e ∈ G such that ce = c = ec for all c ∈ G,
and if we put c^0 = e for all c ∈ G, then a^m b^n = b^n a^m also when m = 0 or
n = 0.

8.11 Lemma: Let G be a nonempty set with an associative multiplica-
tion on it. For any m ∈ ℕ and for any a1,a2,...,am,b ∈ G such that

ai b = b ai for all i = 1,2,...,m,

there holds (a1a2...am)b = b(a1a2...am).

Proof: By induction on m. The case m = 1 is included in the hypothesis.
Suppose now m ≥ 2 and the claim is true for m-1. Then

(a1a2...a_{m-1}am)b = ((a1a2...a_{m-1})am)b
                   = (a1a2...a_{m-1})(am b)
                   = (a1a2...a_{m-1})(b am)
                   = ((a1a2...a_{m-1})b)am
                   = (b(a1a2...a_{m-1}))am
                   = b((a1a2...a_{m-1})am)
                   = b(a1a2...a_{m-1}am),

as was to be proved.

Lemma 8.11 gives a new proof of Lemma 8.10 when we choose
a1 = a2 = ... = am = a and replace b by b^n.

We proved in Lemma 8.3 that the product of n elements in a group (or


in a set with an associative multiplication on it) is independent of the
mode of putting parentheses. When the elements commute, the product
is also independent of the order of elements.

8.12 Lemma: Let G be a nonempty set with an associative multiplica-
tion on it. For all n ∈ ℕ, for all a1,a2,...,an ∈ G such that
ai aj = aj ai whenever i,j = 1,2,...,n,
there holds
a_k1 a_k2 ... a_kn = a1a2...an
for each arrangement k1,k2,...,kn of 1,2,...,n (i.e., for each k1,k2,...,kn
such that {k1,k2,...,kn} = {1,2,...,n}).

Proof: By induction on n. The case n = 1 is trivial. Now assume n ≥ 2
and the claim is proved for n-1, for all pairwise commuting elements
b1,b2,...,b_{n-1} of G, for all arrangements of 1,2,...,n-1. Let a1,a2,...,an
be n arbitrary pairwise commuting elements of G and let k1,k2,...,kn
be an arbitrary arrangement of 1,2,...,n. Then n = kj for some j ∈ {1,2,...,n}.
We have

a_k1 a_k2 ... a_kn = (a_k1 ... a_k(j-1)) a_kj (a_k(j+1) ... a_kn)
                  = (a_k1 ... a_k(j-1)) (a_kj (a_k(j+1) ... a_kn))
                  = (a_k1 ... a_k(j-1)) ((a_k(j+1) ... a_kn) a_kj)
                  = ((a_k1 ... a_k(j-1)) (a_k(j+1) ... a_kn)) a_kj
                  = (a_k1 ... a_k(j-1) a_k(j+1) ... a_kn) an

and here k1,...,k(j-1),k(j+1),...,kn are simply the numbers 1,2,...,n-1 in
some order. By the inductive hypothesis, applied to the elements a1,a2,...,a_{n-1}
and the arrangement k1,...,k(j-1),k(j+1),...,kn of the numbers 1,2,...,n-1,
we have a_k1 ... a_k(j-1) a_k(j+1) ... a_kn = a1a2...a_{n-1}; therefore

a_k1 a_k2 ... a_kn = (a_k1 ... a_k(j-1) a_k(j+1) ... a_kn) an
                  = (a1a2...a_{n-1}) an
                  = a1a2...a_{n-1}an

and the induction argument goes through. In the chain of equations
above, the term (a_k1 ... a_k(j-1)) is absent if j = 1 and the term (a_k(j+1) ... a_kn) is
absent if j = n. The argument remains valid in these cases.

8.13 Lemma: Let G be a commutative group and let a1,a2,...,an be ar-
bitrary elements of G. Then
a_k1 a_k2 ... a_kn = a1a2...an
for all arrangements k1,k2,...,kn of the indices 1,2,...,n.

Proof: This follows immediately from Lemma 8.12.

8.14 Lemma: Let G be a nonempty set with an associative multiplica-
tion on it and let a,b ∈ G.
(1) If ab = ba, then (ab)^n = a^n b^n for all n ∈ ℕ.
(2) If, in addition, there is a unique e ∈ G such that ce = c = ec for all c ∈ G,
and if we put c^0 = e for all c ∈ G, then (ab)^0 = a^0 b^0.
(3) If, in addition, G is a group, then (ab)^n = a^n b^n for all n ∈ ℤ.

Proof: (1) The claim is trivially true when n = 1. Suppose now n ≥ 2
and assume (ab)^(n-1) = a^(n-1) b^(n-1). Then
(ab)^n = (ab)^(n-1)(ab) = (a^(n-1) b^(n-1))(ab)
      = a^(n-1)(b^(n-1) a)b
      = a^(n-1)(ab^(n-1))b (by Lemma 8.10)
      = (a^(n-1) a)(b^(n-1) b)
      = a^n b^n
and the claim is true for n. So (ab)^n = a^n b^n for all n ∈ ℕ.

(2) Writing e for c, we get ee = e. Thus (ab)^0 = e = ee = a^0 e = a^0 b^0.

(3) That (ab)^n = a^n b^n is proved for n ≥ 0. We are to prove it also when
n ≤ -1. Replacing n by -n, we are to prove that (ab)^(-n) = a^(-n) b^(-n) for n ∈ ℕ.

We note that ab = ba implies b^-1 a^-1 = (ab)^-1 = (ba)^-1 = a^-1 b^-1, so the
hypothesis of (1) is satisfied when we replace a by a^-1 and b by b^-1.
Using (1) with a^-1, b^-1 in place of a,b, respectively, we obtain

(ab)^(-n) = [(ab)^-1]^n = [(ba)^-1]^n = (a^-1 b^-1)^n = (a^-1)^n (b^-1)^n = a^(-n) b^(-n)

for n ∈ ℕ. Thus (ab)^n = a^n b^n is valid also when n ≤ -1. So (ab)^n = a^n b^n
for all n ∈ ℤ.

8.15 Lemma: Let G be a commutative group. Then (ab)^n = a^n b^n for all
a,b ∈ G and for all n ∈ ℤ.

Proof: This follows immediately from Lemma 8.14.

So far, we dealt with multiplicative groups. For additive groups, there
are some modifications. In the case of an additive group, the unique
element in Pn(a1,a2,...,an) of Lemma 8.3 is called the sum of a1,a2,...,an

and is denoted by a1 + a2 + ... + an or by ∑_{i=1}^{n} ai. We write na for a1 + a2 + ...
+ an in case n ∈ ℕ and a1,a2,...,an are all equal to a ∈ G. Also, we define 0a
= 0 (the first 0 is the integer 0, the second 0 is the identity element of G)
and (-m)a = -(ma) for m ∈ ℕ. Thus we defined na for all n ∈ ℤ, a ∈ G.

8.16 Lemma: Let G be an additively written commutative group. Then

(1) ma + na = (m + n)a;
(2) (-m)a = m(-a);
(3) n(ma) = (nm)a;
(4) n(a + b) = na + nb
for all m,n ∈ ℤ, a,b ∈ G.

Proof: (1),(2),(3) follow from Lemma 8.7 and (4) from Lemma 8.15.
Notice that commutativity is essential for (4).
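In the additive group ℤ these rules are ordinary integer arithmetic, since na is literally the integer product n·a. The following sketch (our own, with arbitrary sample elements) confirms them:

```python
# Lemma 8.16 read in the additive group (Z, +), where na is literally n*a.
a, b = 5, -3
for m in range(-4, 5):
    for n in range(-4, 5):
        assert m*a + n*a == (m + n)*a    # (1)
        assert (-m)*a == m*(-a)          # (2)
        assert n*(m*a) == (n*m)*a        # (3)
        assert n*(a + b) == n*a + n*b    # (4)
print("Lemma 8.16 verified in (Z, +)")
```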

Exercises

1. Let G be a group such that a^2 = 1 for all a ∈ G. Prove that G is
commutative.

2. Justify each step in the proof of Lemma 8.11.

3. Let G be a group and a,b,c ∈ G. Suppose ab = ba. Prove that (a^m b^n c^r)^-1 =
c^(-r) a^(-m) b^(-n) for all m,n,r ∈ ℤ, justifying each detail.

4. Let G be a nonempty set with an associative multiplication on it and
let a1,a2,...,an be pairwise commuting elements of G. Show that
(a1a2...an)^m = a1^m a2^m ... an^m
for all m ∈ ℕ.

5. Show that, if G is an additive commutative group, then
-(a1 + a2 + ... + an) = (-a1) + (-a2) + ... + (-an)
for all a1,a2,...,an in G.

§9
Subgroups

A group is a set with a binary operation on it which has some nice


properties. Being a set, a group has subsets. Naturally, we are more
interested in those subsets which reflect the algebraic structure of the
group than in the other subsets. They help us understand the structure
of the group. Foremost among them are the sets which are groups them-
selves. We give them a name.

9.1 Definition: Let G be a group. A nonempty subset H of G is called a


subgroup of G if H itself is a group under the operation on G.

We write H ≤ G to express that H is a subgroup of G. Clearly, G is a
subgroup of G, so G ≤ G. If H is a subgroup of G and a proper subset of G,
i.e., if H ≤ G and H ≠ G, we call H a proper subgroup of G. In this case,
we write H < G. The notations H ≰ G and H ≮ G mean that H is not a
subgroup, respectively not a proper subgroup, of G.

Given a group G and a nonempty subset H of G, we must check the group


axioms for H in order to determine whether H is a subgroup of G. We
now discuss each one of these axioms. It turns out that we can do with-
out some of them.

First of all, there must be a binary operation on H. The operation on H is


the operation on G. More precisely, the operation on H is the restriction
of the operation on G to H. Hence, for a,b ∈ H, the element ab is com-
puted as the product of a and b in G. In order to have a binary operation
on H, given by (a,b) ↦ ab as in G, it is necessary and sufficient that ab ∈
H for all a,b ∈ H. Hence H must be closed under the multiplication on G.
Then and only then is there a binary operation on H that is the restric-
tion of the multiplication on G.

In the second place, we must check associativity. For all a,b,c ∈ H, we
must show (ab)c = a(bc). But we know that (ab)c = a(bc) for all a,b,c ∈ G.
Since H ⊆ G, we have all the more so (ab)c = a(bc) for all a,b,c ∈ H.
Indeed, if all the elements of G have a certain property, then all the
elements of H will have the same property. Thus associativity holds in H
automatically, so to speak. We do not have to check it.

In H, there must exist an identity, say 1_H ∈ H such that a1_H = a for all
a ∈ H. In particular, the identity 1_H of H has to be such that 1_H 1_H = 1_H.
Since 1_H ∈ H ⊆ G, Lemma 7.3(1) yields 1_H = 1_G = identity element of G.
So the identity element of G is also the identity element of H, provided it
belongs to H. Then we do not have to look for an identity element of H,
we must only check that the identity element of G does belong to H. We
write 1 for the identity element of H, since it is the identity element of
G.

Finally, for each a ∈ H, there must exist an x ∈ H such that ax = 1. Read-
ing this equation in G, we see x = a^-1 = the inverse of a in G. We know
that the inverse of a exists. Where? The inverse of a exists in G. We
must also check a^-1 ∈ H. Thus we do not have to look for an inverse of a.
We must only check that the inverse a^-1 of a, which we know to be in G,
is in fact an element of H.

Summarizing this discussion, we see that a nonempty subset H of a
group G is a subgroup of G if and only if
(1) ab ∈ H for all a,b ∈ H,
(2) 1 ∈ H,
(3) a^-1 ∈ H for all a ∈ H.
Moreover, (2) follows from (1) and (3). Indeed, if a ∈ H (remember that
H ≠ ∅), then a^-1 ∈ H by (3) and hence aa^-1 ∈ H by (1), which gives 1 ∈ H.
So (1),(2),(3) together is equivalent to (1),(3) together. We proved the
following lemma.

9.2 Lemma (Subgroup criterion): Let G be a group and let H be a
nonempty subset of G. Then H is a subgroup of G if and only if
(i) for all a,b ∈ H, we have ab ∈ H (H is closed under
multiplication) and
(ii) for all a ∈ H, we have a^-1 ∈ H (H is closed under the
forming of inverses).
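The criterion translates directly into a test when the group is finite and given explicitly. The sketch below is our own (the function name and the choice of ℤ12 under addition mod 12 are illustrative assumptions, not the book's):

```python
# A sketch of the subgroup criterion (Lemma 9.2) for a finite group given
# explicitly as a set with an operation and an inverse map.
def is_subgroup(H, G, op, inv):
    """H, G: sets; op: the group operation; inv: x -> x^-1 in G."""
    if not H or not H <= G:                                  # nonempty subset
        return False
    closed_mult = all(op(a, b) in H for a in H for b in H)   # 9.2(i)
    closed_inv = all(inv(a) in H for a in H)                 # 9.2(ii)
    return closed_mult and closed_inv

# Test case: Z_12 under addition mod 12.
G = set(range(12))
op = lambda a, b: (a + b) % 12
inv = lambda a: (-a) % 12

print(is_subgroup({0, 3, 6, 9}, G, op, inv))   # True
print(is_subgroup({0, 3, 6}, G, op, inv))      # False: 3 + 6 = 9 is missing
```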

So we can dispense with checking 1 ∈ H when we know H ≠ ∅. On the
other hand, when we do not know a priori that H ≠ ∅, the easiest way to
ascertain H ≠ ∅ may be to check that 1 ∈ H.

When our subset is finite, we can do even better.

9.3 Lemma: (1) Let G be a group and let H be a nonempty finite subset
of G. Then H is a subgroup of G if and only if H is closed under
multiplication.

(2) Let G be a finite group and let H be a nonempty subset of G. Then H


is a subgroup of G if and only if H is closed under multiplication.

Proof: (1) We prove that 9.2(ii) follows from 9.2(i) when H is finite, so
that 9.2(i) and 9.2(ii) are together equivalent to 9.2(i), which is the
claim. So, for all a ∈ H, we must show that a^-1 ∈ H under the assumption
that H is finite and closed under multiplication.

If a ∈ H and H is closed under multiplication, we have aa = a^2 ∈ H, a^2 a =
a^3 ∈ H, ..., in general a^n ∈ H for all n ∈ ℕ. The infinitely many elements
a, a^2, a^3, ..., a^n, ... of H cannot be all distinct, because H is a finite set. Thus
a^m = a^k for some m,k ∈ ℕ, m ≠ k. Without loss of generality, let us
assume m > k. Then
a^(m-k-1) a = a^(m-k) = a^m a^(-k) = a^m (a^k)^-1 = a^m (a^m)^-1 = 1,
so that a^-1 = a^(m-k-1) ∈ H. So H is closed under the forming of inverses.

(2) This follows from (1), since any subset of a finite set is finite.
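The pigeonhole step in this proof is easy to watch numerically. In the sketch below (our own choice of group: the units mod 13) the inverse of a turns up among its positive powers, exactly as the proof predicts:

```python
# In a finite group the powers of an element must repeat, which is how the
# proof of Lemma 9.3 produces a^-1 = a^(m-k-1).  Here: powers of 2 mod 13.
p, a = 13, 2
powers = []
x = a
while x not in powers:
    powers.append(x)
    x = x * a % p
print(sorted(powers))              # the cyclic subgroup generated by 2
print(pow(a, -1, p) in powers)     # True: the inverse appeared by itself
```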

9.4 Examples: (a) For any group G, the subsets {1} and G are
subgroups of G. Here {1} is called the trivial subgroup of G.

(b) If K ≤ H and H ≤ G, then K is clearly a subgroup of G.

(c) Let 4ℤ = {4z : z ∈ ℤ} = {u ∈ ℤ : 4|u} ⊆ ℤ. Now ℤ is a group under
addition (Example 7.1(a)), and 4ℤ is closed under addition and under
the forming of inverses by Lemma 5.2(5) and Lemma 5.2(1):
(i) if x,y ∈ 4ℤ, then 4|x and 4|y, then 4|x + y, so x + y ∈ 4ℤ,
(ii) if x ∈ 4ℤ, then 4|x, then 4|-x, so -x ∈ 4ℤ.
Hence 4ℤ ≤ ℤ.

(d) The additive group ℤ is a subgroup of the additive group ℚ. Also,
we have ℚ ≤ ℝ, where the group operation is ordinary addition.

(e) Under multiplication, ℚ⁺ := {x ∈ ℚ : x > 0} is a subgroup of ℚ\{0},
since
(i) the product of two positive rational numbers is a positive
rational number, and
(ii) the reciprocal 1/a, that is, the multiplicative inverse, of any
positive rational number a is a positive rational number.
(ℚ\{0} is a group under multiplication by §7, Ex.1(b).) Also, ℚ⁺ ≤ ℝ⁺ (see
Example 7.1(b)) and ℚ\{0} ≤ ℝ\{0}. We have in fact ℚ⁺ = (ℚ\{0}) ∩ ℝ⁺.

(f) If H1 and H2 are subgroups of G, then H1 ∩ H2 is a subgroup of G.
Indeed, H1 ∩ H2 ≠ ∅ since 1 ∈ H1 and 1 ∈ H2. Also
(i) a,b ∈ H1 ∩ H2 ⇒ a,b ∈ H1 and a,b ∈ H2 ⇒ ab ∈ H1 and
ab ∈ H2 ⇒ ab ∈ H1 ∩ H2,
(ii) a ∈ H1 ∩ H2 ⇒ a ∈ H1 and a ∈ H2 ⇒ a^-1 ∈ H1 and a^-1 ∈ H2
⇒ a^-1 ∈ H1 ∩ H2.
Thus H1 ∩ H2 ≤ G. More generally, if Hi are subgroups of G, where i runs
through an index set I, then ∩_{i∈I} Hi ≤ G. Indeed, ∩_{i∈I} Hi ≠ ∅ since 1 ∈ Hi for
all i ∈ I, and
(i) a,b ∈ ∩_{i∈I} Hi ⇒ a,b ∈ Hi for all i ∈ I ⇒ ab ∈ Hi for all i ∈ I
⇒ ab ∈ ∩_{i∈I} Hi,
(ii) a ∈ ∩_{i∈I} Hi ⇒ a ∈ Hi for all i ∈ I ⇒ a^-1 ∈ Hi for all i ∈ I
⇒ a^-1 ∈ ∩_{i∈I} Hi.

(g) Let S_[0,1] be the set of all one-to-one mappings from [0,1] onto [0,1],
which is a group under the composition of mappings (Example 7.1(d)).
Consider
T = {σ ∈ S_[0,1] : 0σ = 0}.

Then T is a subgroup of S_[0,1], for T is not empty (why?) and
(i) σ,τ ∈ T ⇒ 0σ = 0 and 0τ = 0 ⇒ 0(στ) = (0σ)τ = 0τ = 0
⇒ στ ∈ T,
(ii) σ ∈ T ⇒ 0σ = 0 ⇒ (0σ)σ^-1 = 0σ^-1 ⇒ 0 = 0σ^-1
⇒ σ^-1 ∈ T.

(h) Let U = {1̄,3̄,5̄,7̄} ⊆ ℤ8 and consider the multiplication in ℤ8. We see

1̄·1̄ = 1̄   1̄·3̄ = 3̄   1̄·5̄ = 5̄   1̄·7̄ = 7̄
3̄·1̄ = 3̄   3̄·3̄ = 1̄   3̄·5̄ = 7̄   3̄·7̄ = 5̄
5̄·1̄ = 5̄   5̄·3̄ = 7̄   5̄·5̄ = 1̄   5̄·7̄ = 3̄
7̄·1̄ = 7̄   7̄·3̄ = 5̄   7̄·5̄ = 3̄   7̄·7̄ = 1̄

so U is closed under multiplication. Since ℤ8 is a finite set, U is a
subgroup of ℤ8 by Lemma 9.3. Right? No, this is wrong. This would be
correct if ℤ8 were a group under multiplication, which it is not (for
instance, 0̄ has no inverse by Lemma 6.4(12)). ℤ8 is a group under addi-
tion, but this is something else. When we want to use Lemma 9.2 or
Lemma 9.3, we must make sure that the larger set is a group.

Nevertheless, U is a group under multiplication:

(i) U is closed under multiplication by the calculations above.
(ii) Multiplication on U is associative since it is in fact associa-
tive on ℤ8 (Lemma 6.4(7)).
(iii) 1̄ ∈ U and a·1̄ = a for all a ∈ U. This follows from our
calculations or from Lemma 6.4(8). So 1̄ is an identity element of U.
(iv) Each element of U has an inverse in U. This follows from
the equations 1̄·1̄ = 1̄, 3̄·3̄ = 1̄, 5̄·5̄ = 1̄, 7̄·7̄ = 1̄ and from 1̄,3̄,5̄,7̄ ∈ U.

So U is a group. Let us find its subgroups. Now we can use Lemma 9.3.
This lemma shows that {1̄,3̄}, {1̄,5̄}, {1̄,7̄} are subgroups of U since they
are closed under multiplication. The reader will easily see that these are
the only nontrivial proper subgroups of U. Hence the subgroups of U
have orders 1,2,4, which are all divisors of the order |U| = 4 of U.
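The table can be recomputed mechanically; here is a short sketch of our own, writing the residue classes simply as integers:

```python
# Recomputing the multiplication table of U = {1,3,5,7} inside Z_8.
U = [1, 3, 5, 7]
rows = [[a * b % 8 for b in U] for a in U]
for row in rows:
    print(row)
# Each row is a rearrangement of U (so U is closed under multiplication),
# and the diagonal is all 1s (so every element is its own inverse).
```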

(i) E := {1,-1,i,-i} ⊆ ℂ\{0} is a subgroup of the group ℂ\{0} of nonzero
complex numbers under multiplication by Lemma 9.3, as it is closed
under multiplication. The same lemma shows that {1,-1} is a subgroup
of E. Also, E has no other nontrivial proper subgroup, for any subgroup
of E that contains i or -i must contain i^2, i^3, i^4 or (-i)^2, (-i)^3, (-i)^4 and thus
must be E itself. So E has exactly three subgroups, one of order 1, one of
order 2, one of order 4. Here, too, the orders of the subgroups are
divisors of the order |E| = 4 of the group E.

(j) Lemma 9.3 may be false if the subset is not finite. For example, ℤ is
a group under addition, ℕ is a subset of ℤ and ℕ is closed with respect
to addition. Still, ℕ is not a subgroup of ℤ since there is no additive
identity in ℕ (0 ∉ ℕ).

Exercises

1. Let G be a group and let H be a nonempty subset of G. Show that H is
a subgroup of G if and only if ab^-1 ∈ H for all a,b ∈ H.

2. Show that nℤ := {nz : z ∈ ℤ} = {u ∈ ℤ : n|u} is a subgroup of ℤ
(under addition), where n is any natural number.

3. Let M = {σ ∈ S_[0,1] : 0σ = 0 or 1σ = 1}. Is M a subgroup of S_[0,1] under
multiplication?

4. Let L = {1̄,2̄,4̄,5̄,7̄,8̄} ⊆ ℤ9. Show that L is a group under multiplication.
Find all subgroups of L. Do the orders of the subgroups divide the order
|L| = 6 of the group L?

5. Let G be a group and let H ≤ G, K ≤ G. Show that H ∪ K is not a
subgroup of G unless H ∪ K = H or H ∪ K = K. (The union of two sub-
groups is (generally) not a subgroup.)

6. Give an example of a group G and subgroups H,K,L of G such that
H ∪ K ∪ L = G. (The union of three subgroups can be a subgroup.)

7. Let G be a group and let a be a fixed element of G. Determine whether
the subsets
C = {x ∈ G : ax = xa} and D = {x ∈ G : ax = xa or ax = xa^-1}
of G are subgroups of G.

8. In Example 9.4(h), why cannot we use Lemma 6.4(9) to prove that


the axiom (iv) holds?

§10
Lagrange's Theorem

The order of any subgroup of U in Example 9.4(h) divides the order of U.


The same thing is true for the group E in Example 9.4(i). Likewise, the
reader verified that the order of L in §9, Ex.4 is divisible by the order of
any subgroup of L. These are special instances of a general theorem
named after J. L. Lagrange (1736-1813), which asserts that the order of
a subgroup divides the order of a group, provided, of course, the group
has finite order so that we can meaningfully speak about divisibility. It
is the first important theorem of group theory that we come across.

The proof of Lagrange's theorem requires the notion of cosets, which


plays an important role in group theory.

10.1 Definition: Let G be a group, H ≤ G and a ∈ G. We put

Ha := {ha ∈ G : h ∈ H} ⊆ G

and call Ha a right coset of H in G. We put

aH := {ah ∈ G : h ∈ H} ⊆ G

and call aH a left coset of H in G.

Right and left cosets of H are subsets of G. When the group is written
additively, we write H + a = {h + a ∈ G : h ∈ H} and a + H = {a + h ∈ G : h ∈
H} for the right and left cosets of H. A right coset is not necessarily a left
coset and a left coset is not necessarily a right coset. However, when the
group is commutative, the right and left cosets coincide, as is evident
from the definition. During a particular discussion, we usually fix a sub-
group H of a group G and consider its various (right or left) cosets. Then
we refer to Ha as the right coset of a ∈ G, or as the right coset of H
determined by a. We use similar expressions for aH.

Cosets are subsets of a group, so the equality of two cosets is defined by
mutual inclusion. We ask when two cosets are equal. The next lemma
gives an answer.

10.2 Lemma: Let G be a group, H ≤ G and a,b ∈ G.

(1) The right coset H1 = the subgroup H = the left coset 1H.
(2) Ha = H if and only if a ∈ H; aH = H if and only if a ∈ H.
(3) Ha = Hb if and only if a = hb for some h ∈ H; aH = bH if and only if
a = bh for some h ∈ H.
(4) Ha = Hb if and only if a ∈ Hb; aH = bH if and only if a ∈ bH.
(5) Ha = Hb if and only if ab^-1 ∈ H; aH = bH if and only if a^-1 b ∈ H.
(6) Ha = Hb if and only if Hab^-1 = H; aH = bH if and only if a^-1 bH = H.

Proof: We prove only the assertions for right cosets and leave the
discussion of left cosets to the reader.

(1) From the definition of H1 and 1, we get

H1 = {h1 ∈ G : h ∈ H} = {h ∈ G : h ∈ H} = H.

(2) If Ha = H, then a = 1a ∈ {ha ∈ G : h ∈ H} = Ha = H, so a ∈ H. Conversely,
if a ∈ H, then

a ∈ H and a^-1 ∈ H,
ha ∈ H and ha^-1 ∈ H for any h ∈ H (since H
is closed under multiplication),
ha ∈ H and h = (ha^-1)a ∈ Ha for all h ∈ H,
Ha ⊆ H and H ⊆ Ha,

so Ha = H.

(3) If Ha = Hb, then a ∈ Ha = Hb, so a = hb for some h ∈ H. Conversely,
assume a = hb, where h ∈ H. Then

a = hb and b = h^-1 a,
h′a = h′hb ∈ Hb and h′b = h′h^-1 a ∈ Ha for all h′ ∈ H,
Ha ⊆ Hb and Hb ⊆ Ha,

so Ha = Hb.

(4) This is just a reformulation of (3).

(5) Ha = Hb if and only if a = hb for some h ∈ H, and there is a unique h
with a = hb, namely h = ab^-1 (Lemma 7.5(2)); thus a = hb for some h ∈ H
if and only if ab^-1 ∈ H.

(6) Ha = Hb if and only if ab^-1 ∈ H by (5), and ab^-1 ∈ H if and only if
Hab^-1 = H by (2).

10.3 Lemma: Let H ≤ G. Then G is the union of the right cosets of H.
The right cosets of H are mutually disjoint. Analogous statements hold
for left cosets.

Proof: As Ha ⊆ G for any a ∈ G, we get ∪_{a∈G} Ha ⊆ G. Also, for any g ∈ G,
we have g ∈ Hg, so g ∈ ∪_{a∈G} Ha, thus G ⊆ ∪_{a∈G} Ha. This proves G = ∪_{a∈G} Ha.

Now we prove that the right cosets of H are mutually disjoint. Assume
Ha ∩ Hb ≠ ∅. We are to show Ha = Hb. Well, we take c ∈ Ha ∩ Hb if
Ha ∩ Hb ≠ ∅. Then c ∈ Ha and c ∈ Hb. So Ha = Hc and Hc = Hb by Lemma
10.2(4). We obtain Ha = Hb.

The left cosets are treated similarly.
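A concrete instance of the partition (a sketch of our own, taking H = {0,4,8} in the additive group ℤ12, where cosets are written H + a):

```python
# Right cosets of H = {0, 4, 8} in the additive group Z_12: they partition
# the group, as Lemma 10.3 asserts.
G = set(range(12))
H = {0, 4, 8}

cosets = {frozenset((h + a) % 12 for h in H) for a in G}
for coset in sorted(map(sorted, cosets)):
    print(coset)
# Four pairwise disjoint cosets of size 3, covering all of Z_12.
```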

In the terminology of Theorem 2.5, right cosets of H form a partition of
G. Theorem 2.5 tells us that the right cosets are the equivalence classes
of a certain equivalence relation on G. By the proof of Theorem 2.5, we
see that this equivalence relation is given by

for all a,b ∈ G: a ~ b if and only if Ha = Hb,

which we can read as

for all a,b ∈ G: a ~ b if and only if ab^-1 ∈ H.

It may be worthwhile to obtain Lemma 10.3 from this relation ~,
instead of obtaining the relation ~ from Lemma 10.3.

10.4 Definition: Let H ≤ G and a,b ∈ G. We write a ≡ᵣ b (mod H) and
say a is right congruent to b modulo H if ab^(-1) ∈ H. Similarly, we write
a ≡ₗ b (mod H) and say a is left congruent to b modulo H if a^(-1)b ∈ H.

10.5 Lemma: Let H ≤ G. Right congruence modulo H and left congruence
modulo H are equivalence relations on G.

Proof: We give the proof for right congruence only. We check that it is
reflexive, symmetric and transitive.

(i) For all a ∈ G, a ≡ᵣ a (mod H), as this means aa^(-1) = 1 ∈ H. So
right congruence is reflexive. Reflexivity of right congruence follows
from the fact that 1 ∈ H.

(ii) If a ≡ᵣ b (mod H), then ab^(-1) ∈ H, then (ab^(-1))^(-1) ∈ H, hence
ba^(-1) ∈ H and b ≡ᵣ a (mod H). So right congruence is symmetric.
Symmetry of right congruence follows from the fact that H is closed under
the forming of inverses.

(iii) If a ≡ᵣ b (mod H) and b ≡ᵣ c (mod H), then ab^(-1) ∈ H and
bc^(-1) ∈ H, then (ab^(-1))(bc^(-1)) ∈ H, hence ac^(-1) ∈ H and a ≡ᵣ c (mod H).
So right congruence is transitive. Transitivity of right congruence follows
from the fact that H is closed under multiplication.

Hence right congruence is an equivalence relation on G.

According to Theorem 2.5, G is the disjoint union of right congruence
classes. The right congruence class of a ∈ G is the right coset Ha:
[a] = {x ∈ G: x ≡ᵣ a (mod H)}
= {x ∈ G: xa^(-1) ∈ H}
= {x ∈ G: xa^(-1) = h for some h ∈ H}
= {x ∈ G: x = ha for some h ∈ H}
= {ha ∈ G: h ∈ H}
= Ha.
This gives a new proof of Lemma 10.3.

10.6 Lemma: Let H ≤ G. There are as many distinct right cosets of H in
G as there are distinct left cosets of H in G. More precisely, let R be the
set of right cosets of H in G and let L be the set of left cosets of H in G.
Then R and L have the same cardinality: |R| = |L|.

Proof: We must find a one-to-one correspondence between R and L.
We put ψ: R → L,
Ha ↦ a^(-1)H.
We show that ψ is a one-to-one, onto mapping. First we prove it is a
mapping. We have to do it. Indeed, how do we find Xψ if X ∈ R? Well,
we write X = Ha, that is, we choose an a ∈ X, then we find the inverse of
this a, and "map" X = Ha to the left coset a^(-1)H of H determined by a^(-1). So
we must show that Xψ is independent of the element a we choose from
X, i.e., that ψ is a well defined function. We are to prove
Ha = Hb ⟹ (Ha)ψ = (Hb)ψ.
If Ha = Hb, then ab^(-1) ∈ H by Lemma 10.2(5), then (ab^(-1))^(-1) ∈ H, so
ba^(-1) ∈ H, so a^(-1)H = b^(-1)H by Lemma 10.2(5), and (Ha)ψ = (Hb)ψ. Hence
ψ is indeed a well defined function.

ψ is one-to-one since (Ha)ψ = (Hb)ψ ⟹ a^(-1)H = b^(-1)H ⟹ (b^(-1))^(-1)a^(-1) ∈ H
⟹ ba^(-1) ∈ H ⟹ (ba^(-1))^(-1) ∈ H ⟹ ab^(-1) ∈ H ⟹ Ha = Hb, and ψ is onto as
well, since any bH ∈ L is the image of Hb^(-1) under ψ:
(Hb^(-1))ψ = (b^(-1))^(-1)H = bH.
Hence |R| = |L|.

10.7 Definition: Let G be a group and H ≤ G. The (cardinal) number of
distinct right cosets of H in G, which is also the (cardinal) number of
distinct left cosets of H in G, is called the index of H in G, and is denoted
by |G:H|.

So |G:H| is a natural number or |G:H| = ∞. Notice that G is written before H
in |G:H|, but when we read, "H" is pronounced before "G": index of H in G.
Lemma 10.6 states essentially that we do not have to distinguish
between "right" and "left" index.

Note that |G:H| = 1 means H = H1 is the only right coset of H in G, whence
a ∈ Ha = H for all a ∈ G and so G ⊆ H. Thus |G:H| = 1 if and only if H = G.

We will be mostly interested in cases where |G:H| is finite. This can
happen even if G is infinite. For instance, 4ℤ is a subgroup of ℤ (under
addition) by Example 9.4(c) and the left (right) cosets

0 + 4ℤ, 1 + 4ℤ, 2 + 4ℤ, 3 + 4ℤ

of 4ℤ in ℤ are all the left cosets of 4ℤ in ℤ. Hence |ℤ:4ℤ| = 4. Incidentally,
we see that Definition 10.4 is a natural generalization of the
congruence relation on ℤ.
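The example can be checked with a few lines of Python (the function name is ours): two integers lie in the same coset of 4ℤ exactly when their difference lies in 4ℤ, so every integer falls into exactly one of the four cosets listed above.

```python
# a and b lie in the same coset of 4Z exactly when a - b is in 4Z.
def same_coset(a, b, n=4):
    return (a - b) % n == 0

# Every integer lands in exactly one of 0+4Z, 1+4Z, 2+4Z, 3+4Z,
# namely the coset labelled by its remainder mod 4.
for a in range(-20, 20):
    reps = [r for r in range(4) if same_coset(a, r)]
    assert len(reps) == 1 and reps[0] == a % 4
```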

We need one more lemma for the proof of Lagrange's theorem.

10.8 Lemma: Let G be a group and H ≤ G. Any right coset of H and
any left coset of H in G have the same (cardinal) number of elements as
H. In fact, |Ha| = |aH| = |H| for all a ∈ G.

Proof: We prove the lemma for right cosets only. For any a ∈ G, we
must find a one-to-one correspondence between H and Ha. What is more
natural than the mapping
ρ: H → Ha
h ↦ ha
from H into Ha? Now ρ is indeed a mapping from H into Ha. It is one-to-one,
for hρ = h′ρ (h,h′ ∈ H) implies ha = h′a, which gives h = h′ after cancelling
a (Lemma 8.1(2)). Also, it is onto by the very definition of Ha. So we get
|Ha| = |H|.

10.9 Theorem (Lagrange's theorem): If H ≤ G, then |G| = |G:H| |H|. In
particular, if G is a finite group, then |H| divides |G|.

Proof: From Lemma 10.3, we know G = ⋃_{a∈G} Ha and that the distinct
Ha are mutually disjoint. Avoiding redundancies, we write

G = ⋃_{Ha∈R} Ha,

where R is the set of distinct right cosets of H in G. Since the cosets in R
are disjoint, we obtain

|G| = Σ_{Ha∈R} |Ha|

when we count the elements. Since |Ha| = |H| for all Ha ∈ R by Lemma
10.8, we get

|G| = Σ_{Ha∈R} |Ha| = Σ_{Ha∈R} |H| = |R| |H| = |G:H| |H|

as |G:H| = |R| by Definition 10.7.
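Lagrange's theorem can also be verified experimentally for a small group. The sketch below (our own helper functions, not part of the text) enumerates all subgroups of S₃ by brute force and checks |G| = |G:H| |H| for each one; for a finite nonempty subset, closure under the operation already guarantees a subgroup.

```python
from itertools import permutations, combinations

def compose(p, q):                        # apply p first, then q
    return tuple(q[p[i]] for i in range(3))

G = list(permutations(range(3)))          # S_3, of order 6
e = (0, 1, 2)

def is_subgroup(S):
    return e in S and all(compose(a, b) in S for a in S for b in S)

# Brute-force enumeration of all subgroups of S_3.
subgroups = [set(c) for r in range(1, 7)
             for c in combinations(G, r) if is_subgroup(set(c))]

for H in subgroups:
    cosets = {frozenset(compose(h, a) for h in H) for a in G}
    assert len(G) == len(cosets) * len(H)     # |G| = |G:H| * |H|
```

The enumeration finds six subgroups, of orders 1, 2, 2, 2, 3 and 6, all divisors of 6 as the theorem demands.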

The basic idea of the preceding proof is simple. We have a disjoint union
G = ⋃_{Ha∈R} Ha and we count the elements. Then we get |G| = Σ_{Ha∈R} |Ha|.
In the sequel, we will prove some important results by a similar reasoning.
We will have a disjoint union S = ⋃_{i∈I} T_i and, counting the elements, we
will get |S| = Σ_{i∈I} |T_i|. See §§25,26.

Here is an application of Lagrange's theorem.

10.10 Theorem: Let p be a positive prime number and G be a group of
order p. Then G has no nontrivial proper subgroup.

Proof: We are to show that {1} and G are the only subgroups of G. Now
if H ≤ G, then |H| divides |G| by Lagrange's theorem, so |H| divides p and
|H| = 1 or p. If |H| = 1, then necessarily H = {1}. If |H| = p, then |H| = |G|
and H ⊆ G together yield H = G.

If G is a finite group and K ≤ H ≤ G, Lagrange's theorem gives

|G:H| = |G|/|H|, |H:K| = |H|/|K|, so |G:H| |H:K| = (|G|/|H|)(|H|/|K|) = |G|/|K| = |G:K|.

We give another proof of this result which works also in the case of
infinite groups and infinite indices.

10.11 Theorem: If K ≤ H ≤ G, then |G:H| |H:K| = |G:K|. In particular, if
any two of |G:H|, |G:K|, |H:K| are finite, then the third is finite, too.

Proof: Let {Ha_i : i ∈ I} be the set of all distinct right cosets of H in G.
We have

G = ⋃_{i∈I} Ha_i, with a_i ∈ G, Ha_i ≠ Ha_i′ for i ≠ i′, |I| = |G:H|.  (1)

Let {Kb_j : j ∈ J} be the set of all distinct right cosets of K in H. Then

H = ⋃_{j∈J} Kb_j, with b_j ∈ H, Kb_j ≠ Kb_j′ for j ≠ j′, |J| = |H:K|.  (2)

We must prove |G:K| = |I × J|. Since |I × J| = |I| |J|, this will be accomplished
if we can find a one-to-one correspondence between I × J and the set of
right cosets of K in G. How we find this correspondence will be clear
when we observe

Ha_i = {ha_i ∈ G: h ∈ H} = {ha_i ∈ G: h ∈ ⋃_{j∈J} Kb_j}
= {ha_i ∈ G: there are j ∈ J and k ∈ K with h = kb_j}
= {kb_j a_i ∈ G: j ∈ J, k ∈ K}
= ⋃_{j∈J} {kb_j a_i ∈ G: k ∈ K} = ⋃_{j∈J} Kb_j a_i,

so that G = ⋃_{i∈I} Ha_i = ⋃_{i∈I} ⋃_{j∈J} Kb_j a_i = ⋃_{(i,j)∈I×J} Kb_j a_i. This suggests

(i,j) ↦ Kb_j a_i

as a mapping from I × J into the set of right cosets of K in G. Let us check
if it works.

For each (i,j) ∈ I × J, b_j a_i is an element of G, hence Kb_j a_i is a right coset of
K in G. Thus the above correspondence is indeed a mapping from I × J
into the set of right cosets of K in G.

It is onto, for if Kg (g ∈ G) is any right coset of K in G, then
g ∈ ⋃_{(i,j)∈I×J} Kb_j a_i by our observation, so, by the definition of union,
there is (i₀,j₀) ∈ I × J with g ∈ Kb_j₀ a_i₀. Then Kg = Kb_j₀ a_i₀ and Kg is the
image of (i₀,j₀) ∈ I × J.

It is one-to-one: if Kb_j a_i = Kb_j′ a_i′, then b_j a_i = kb_j′ a_i′ for some k ∈ K ⊆ H by
Lemma 10.2(3), then a_i = b_j^(-1)kb_j′ a_i′ with b_j^(-1)kb_j′ ∈ H, so Ha_i = Ha_i′ by
Lemma 10.2(3), so i = i′ by (1). Thus a_i = a_i′ and we get Kb_j = Kb_j a_i a_i^(-1) =
Kb_j′ a_i′ a_i^(-1) = Kb_j′ a_i a_i^(-1) = Kb_j′ by Lemma 10.2(6), which yields j = j′ by (2).
Hence Kb_j a_i = Kb_j′ a_i′ implies (i,j) = (i′,j′). The mapping is one-to-one.

Thus |G:K| = |I × J| = |I| |J| = |G:H| |H:K|.
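The index formula is easy to test in a small concrete case. The sketch below (helper names ours) takes G = ℤ₁₂ under addition mod 12, H = ⟨2⟩ and K = ⟨6⟩, and compares the three indices:

```python
# K <= H <= G with G = Z_12 (addition mod 12), H = <2>, K = <6>.
n = 12
G = set(range(n))
H = {x for x in G if x % 2 == 0}           # {0,2,4,6,8,10}, order 6
K = {0, 6}                                  # order 2, contained in H

def cosets(group, sub):
    return {frozenset((s + g) % n for s in sub) for g in group}

iGH, iHK, iGK = len(cosets(G, H)), len(cosets(H, K)), len(cosets(G, K))
assert (iGH, iHK, iGK) == (2, 3, 6)
assert iGK == iGH * iHK                     # |G:K| = |G:H| * |H:K|
```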

Exercises

1. Find the right cosets of all subgroups of U (Example 9.4(h)), of E
(Example 9.4(i)) and of L (§9, Ex.4).

2. Let T be the subgroup of S_[0,1] that we discussed in Example 9.4(g).
Show that {σ ∈ S_[0,1]: 0σ = 1} is a right coset of T in S_[0,1]. Is it a left coset
of T? Would your answer be different if we wrote the functions on the
left? What is |S_[0,1]:T|?

3. Find all cosets of nℤ in ℤ. What is |ℤ:nℤ|?

4. Why do we not use the "mapping" Ha ↦ aH in the proof of Lemma
10.6?

§11
Cyclic Groups

Let G be a group and a ∈ G. Consider the set {a^n ∈ G: n ∈ ℤ} of all integral
powers of a. We designate this subset of G shortly by ⟨a⟩. It is not
empty and is in fact a subgroup of G:
(i) if a^m, a^n ∈ ⟨a⟩, then a^m a^n = a^(n+m) ∈ ⟨a⟩, as m + n ∈ ℤ when
m,n ∈ ℤ,
(ii) if a^m ∈ ⟨a⟩, then (a^m)^(-1) = a^(-m) ∈ ⟨a⟩, as -m ∈ ℤ when m ∈ ℤ.

11.1 Definition: Let G be a group and a ∈ G. Then ⟨a⟩ = {a^n ∈ G: n ∈ ℤ}
is called the cyclic subgroup of G generated by a. If it happens that ⟨a⟩ =
G, then G is called a cyclic group and a is called a generator of G.

Any cyclic group is abelian. Indeed, if G is a cyclic group generated by
a, then any two elements a^m, a^n (m,n ∈ ℤ) of G commute:
a^m a^n = a^(m+n) = a^(n+m) = a^n a^m.
The converse is false. There are abelian groups which are not cyclic. For
example, the group U of Example 9.4(h) is abelian but not cyclic, since
the cyclic subgroups generated by 1, 3, 5, 7 are all proper subgroups of U.

11.2 Examples: (a) Consider the subgroup ⟨i⟩ of ℂ\{0} under
multiplication. We have
i^0 = 1, i^1 = i, i^2 = -1, i^3 = -i
and other powers of i do not give rise to other complex numbers. To see
this, let n ∈ ℤ and divide n by 4 to get n = 4q + r, 0 ≤ r ≤ 3, q,r ∈ ℤ.
Then
i^n = i^(4q+r) = i^(4q) i^r = (i^4)^q i^r = 1^q i^r = i^r ∈ {1, i, -1, -i}.
Hence ⟨i⟩ = {1, i, -1, -i} is a cyclic group of order 4.

(b) In §9, Ex.4, the reader proved that L = {1,2,4,5,7,8} is a group under
multiplication (mod 9). Let us find the cyclic subgroup of L generated by
2. We have
2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 8, 2^4 = 7, 2^5 = 5,
L = {1,2,4,5,7,8} = {2^n ∈ L: n = 0,1,2,3,4,5} ⊆ {2^n ∈ L: n ∈ ℤ} = ⟨2⟩,

thus L = ⟨2⟩. So L is a cyclic group and 2 is a generator of L. We see

⟨4⟩ = {1,4,7}, ⟨8⟩ = {1,8}, ⟨7⟩ = {1,7,4}

are proper subgroups of L. In particular, 4, 7, 8 are not generators of L.
On the other hand,
⟨5⟩ = {1,5,7,8,4,2} = L
and 5 is another generator of L.

A cyclic group has many generators. The number of generators of a


cyclic group will be determined later in this paragraph.

11.3 Definition: Let G be a group and a ∈ G. The order |⟨a⟩| of the
cyclic subgroup of G generated by a is called the order of a and is
denoted by o(a).

Thus o(a) is either a natural number or ∞. Of course, if G is a finite
group, then every element a of G will have finite order; in fact o(a)
divides |G| by Lagrange's theorem. An infinite group, on the other hand,
has in general elements of finite order as well as elements of infinite
order.

11.4 Lemma: Let G be a group and a ∈ G. Then o(a) is finite if and only
if there is a natural number n with a^n = 1. If this is the case, then o(a) is
the smallest natural number s such that a^s = 1.

Proof: We put A = {n ∈ ℕ: a^n = 1}. The claim is that o(a) is finite if and
only if A is not empty. First we suppose o(a) is finite and prove that A is
not empty. If o(a) is finite, then ⟨a⟩ is a finite subgroup of G and the
infinitely many elements
a^1, a^2, a^3, a^4, ...
of ⟨a⟩ cannot be all distinct. So a^k = a^m for some k,m ∈ ℕ with k ≠ m.
Assuming k < m without loss of generality, we obtain a^(m-k) = a^m a^(-k) =
a^m (a^k)^(-1) = a^m (a^m)^(-1) = 1, so m - k ∈ A and A ≠ ∅.

Suppose now there are natural numbers n with a^n = 1, that is, suppose
that A ≠ ∅. We prove that o(a) is finite, and is in fact the smallest
natural number in A. To this end, let s be the smallest natural number in A.
We show first s ≤ o(a) and then o(a) ≤ s.

Consider the s elements a^0, a^1, a^2, ..., a^(s-1) of ⟨a⟩. These are all distinct,
for if a^i = a^j, i ≠ j, 0 ≤ i,j ≤ s - 1,
say with i < j, then
a^(j-i) = 1, 0 < j - i ≤ s - 1, j - i ∈ ℕ,
j - i ∈ A, j - i ≤ s - 1,
contradicting that s is the smallest natural number in A. So there are at
least s distinct elements in ⟨a⟩. This gives s ≤ |⟨a⟩| = o(a).

Next we show that there are at most s distinct elements in ⟨a⟩. If
a^h ∈ ⟨a⟩, where h ∈ ℤ, we divide h by s to get

h = qs + r, q,r ∈ ℤ, 0 ≤ r ≤ s - 1,
a^h = a^(qs+r) = a^(sq) a^r = (a^s)^q a^r = 1^q a^r = a^r,

so a^h ∈ {a^0, a^1, a^2, ..., a^(s-1)},
⟨a⟩ ⊆ {a^0, a^1, a^2, ..., a^(s-1)},
o(a) ≤ s,

the set {a^0, a^1, a^2, ..., a^(s-1)} having at most s elements.

From s ≤ o(a) and o(a) ≤ s, we get o(a) = s.
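Lemma 11.4 gives a direct recipe for computing o(a) in a finite group: multiply by a until the identity appears. A Python sketch (the function name is ours), applied to the group L = {1,2,4,5,7,8} under multiplication mod 9 of Example 11.2(b):

```python
# o(a) as in Lemma 11.4: the least s in N with a^s = 1, computed here in
# the group of residues under multiplication mod n.
def order(a, n):
    s, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        s += 1
    return s

assert order(2, 9) == 6       # 2 generates L, so o(2) = |L| = 6
assert order(4, 9) == 3       # matches <4> = {1,4,7} of Example 11.2(b)
assert order(8, 9) == 2       # matches <8> = {1,8}
```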

11.5 Lemma: Let G be a group and a ∈ G. Then o(a) = ∞ if and only if
powers of a with distinct exponents are distinct, i.e., if and only if
a^m ≠ a^k whenever m ≠ k (m,k ∈ ℤ).

Proof: If a^m ≠ a^k whenever m ≠ k, then the infinitely many elements

..., a^(-3), a^(-2), a^(-1), a^0, a^1, a^2, a^3, ...

of ⟨a⟩ are all distinct. So ⟨a⟩ is an infinite group and o(a) = ∞.

Suppose now the condition in the lemma does not hold. Then there are
m,k ∈ ℤ with a^m = a^k, m ≠ k. Assume m > k without loss of generality.
Then m - k ∈ ℕ and a^(m-k) = 1. There is a natural number n, namely n
= m - k, with a^n = 1. Then o(a) is finite by Lemma 11.4. Hence o(a) = ∞
implies that a^m ≠ a^k whenever m ≠ k (m,k ∈ ℤ).

11.6 Lemma: Let G be a group and let a ∈ G be of finite order. Let
n ∈ ℤ. Then a^n = 1 if and only if o(a) | n.

Proof: We put s = o(a). If s | n, then n = sq for some q ∈ ℤ, hence a^n = a^(sq)
= (a^s)^q = 1^q = 1, since a^s = 1 by Lemma 11.4. Conversely, suppose a^n = 1.
We divide n by s and get

n = qs + r, q,r ∈ ℤ, 0 ≤ r ≤ s - 1,

1 = a^n = a^(qs+r) = a^(sq) a^r = (a^s)^q a^r = 1^q a^r = a^r.

If r ≠ 0, then r would be a natural number smaller than s with a^r = 1,
contradicting Lemma 11.4. So r = 0, n = qs and s | n.

11.7 Lemma: If G is a finite group, then a^|G| = 1 for all a ∈ G.

Proof: For any a ∈ G, o(a) = |⟨a⟩| divides |G| by Lagrange's theorem. So
a^|G| = 1 by Lemma 11.6.

Next we show that subgroups of cyclic groups are also cyclic.

11.8 Theorem: Let G be a cyclic group and let H ≤ G. Then H is cyclic.
More informatively, let G = ⟨a⟩. Then {1} = ⟨1⟩ and, if H ≠ {1}, then H =
⟨a^t⟩, where t is the smallest natural number in the set {n ∈ ℕ: a^n ∈ H}.

Proof: The subgroup {1} of G = ⟨a⟩ is clearly the cyclic subgroup of G
generated by 1, hence {1} = ⟨1⟩ is cyclic. Suppose now {1} ≠ H ≤ G. We
prove that H is cyclic, and in fact H = ⟨a^t⟩ as stated in the theorem. Since
H ≠ {1} by assumption, there is a nonidentity element in H, say a^m ∈ H,
with m ∈ ℤ\{0}. Then a^(-m) ∈ H since H is closed under the forming of
inverses. So a^m, a^(-m) ∈ H, m ≠ 0. So there is a natural number n such that
a^n ∈ H, for instance n = |m|. Thus the set {n ∈ ℕ: a^n ∈ H} is not empty.
From the natural numbers in this set, we choose the smallest one and
call it t.

Now a^t ∈ H. Also a^(-t) = (a^t)^(-1) ∈ H. Since H is closed under multiplication,
we obtain a^(kt) = (a^t)^k = a^t a^t ... a^t ∈ H and a^(-kt) = (a^(-t))^k = a^(-t) a^(-t) ... a^(-t) ∈ H
for all k ∈ ℕ. Since a^(0t) = 1 ∈ H, we see a^(tm) ∈ H for all m ∈ ℤ. Thus we
have ⟨a^t⟩ = {a^(tm) ∈ G: m ∈ ℤ} ⊆ H.

Assume next b ∈ H, where b ∈ G = ⟨a⟩. We write b = a^n with a suitable n
in ℤ and divide n by t. This gives

n = tq + r, q,r ∈ ℤ, 0 ≤ r ≤ t - 1,
a^r = a^(n-tq) = a^n (a^t)^(-q) ∈ H,

since a^n, a^t ∈ H. If r ≠ 0, then r would be a natural number smaller than
t such that a^r ∈ H, contradicting the definition of t. So r = 0, n = tq, t | n
and b = a^n = a^(tq) ∈ ⟨a^t⟩. This holds for all b ∈ H. Hence H ⊆ ⟨a^t⟩.

From ⟨a^t⟩ ⊆ H and H ⊆ ⟨a^t⟩, we get H = ⟨a^t⟩, as claimed.

11.9 Lemma: Let G be a group and a ∈ G. Let k ∈ ℤ, k ≠ 0.
(1) If o(a) = ∞, then o(a^k) = ∞.
(2) If o(a) = n ∈ ℕ, then o(a^k) = n/(n,k).

Proof: (1) Suppose o(a) = ∞. If o(a^k) were finite, say o(a^k) = m ∈ ℕ, then
(a^k)^m = 1, so a^(km) = 1 = a^0, although km and 0 are distinct integers,
contrary to Lemma 11.5. So o(a) = ∞ implies o(a^k) = ∞.

(2) Now let us suppose o(a) = n ∈ ℕ. Then a^k ∈ ⟨a⟩ and so o(a^k) is
finite. By Lemma 11.4,

o(a^k) = smallest natural number s such that (a^k)^s = 1
= smallest natural number s such that a^(ks) = 1
= smallest natural number s such that n | ks (Lemma 11.6)
= smallest natural number s such that n/(n,k) | (k/(n,k))s
= smallest natural number s such that n/(n,k) | s (Lemma 5.11 and Theorem 5.12)
= n/(n,k).
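Part (2) can be spot-checked by machine. In the additive cyclic group ℤ_n with generator 1, the element "a^k" is the residue k, and its (additive) order should be n/(n,k); the sketch below (our helper) confirms this for n = 36.

```python
from math import gcd

# In Z_n under addition with generator 1, "a^k" is the residue k; by
# Lemma 11.9(2) its order should be n/(n,k).
def additive_order(x, n):
    s, y = 1, x % n
    while y != 0:
        y = (y + x) % n
        s += 1
    return s

n = 36
for k in range(1, n):
    assert additive_order(k, n) == n // gcd(n, k)
```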

From Lemma 11.9(1), we infer that any nontrivial subgroup of an
infinite cyclic group is infinite. Using Lemma 11.9(2), we can find the
number of generators of a finite cyclic group. Let G = ⟨a⟩ be a cyclic
group of order n ∈ ℕ. Which elements are the generators of G? Any
element a^k generates a subgroup ⟨a^k⟩ of ⟨a⟩, and a^k is a generator of ⟨a⟩
if and only if ⟨a^k⟩ = ⟨a⟩. We know ⟨a^k⟩ ⊆ ⟨a⟩, so, since |⟨a⟩| = n is finite,
a^k is a generator of ⟨a⟩ if and only if |⟨a^k⟩| = |⟨a⟩|. Thus a^k is a generator
of ⟨a⟩ if and only if o(a^k) = o(a), that is, if and only if n = n/(n,k), and so
if and only if (n,k) = 1. There are n distinct elements a^0, a^1, a^2, ..., a^(n-1) in
⟨a⟩, and among these,
{a^k: (n,k) = 1, 0 ≤ k ≤ n - 1} = {a^k: (n,k) = 1, 1 ≤ k ≤ n}
is the set of generators of ⟨a⟩. Hence the number of generators of ⟨a⟩ is
the number of positive integers smaller than (or equal to) n and
relatively prime to n. This number is traditionally denoted by φ(n). For
example
φ(1) = 1, φ(2) = 1, φ(3) = 2, φ(4) = 2, φ(5) = 4,
φ(6) = 2, φ(7) = 6, φ(8) = 4, φ(9) = 6, φ(10) = 4.
The function φ: ℕ → ℕ is known as Euler's phi function or Euler's totient
function (L. Euler, a Swiss mathematician (1707-1783)).
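φ(n) and the generator count are simple to compute directly from the definitions; the short Python sketch below (function names ours) reproduces the table above and checks the generator statement for ℤ₁₂ under addition.

```python
from math import gcd

# phi(n) straight from the definition: how many k in 1..n are coprime to n.
def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

assert [phi(n) for n in range(1, 11)] == [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]

# phi(12) also counts the generators of a cyclic group of order 12,
# e.g. of Z_12 under addition: k generates Z_12 iff (12,k) = 1.
generators = [k for k in range(12)
              if {(k * m) % 12 for m in range(12)} == set(range(12))]
assert generators == [1, 5, 7, 11] and len(generators) == phi(12)
```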

Lagrange's theorem asserts that m divides |G| when there is a subgroup H
of order |H| = m (provided G is a finite group). The converse of Lagrange's
theorem is false: if G is a finite group and m divides |G|, then it is not
necessarily true that G has a subgroup of order m (see §16, Ex.7).
However, for cyclic groups, the converse of Lagrange's theorem is true.

11.10 Lemma: Let G = ⟨a⟩ be a cyclic group of order |G| = n. For any
positive divisor m of n, there is a unique subgroup H of order |H| = m,
namely ⟨a^(n/m)⟩.

Proof: o(a) = n by hypothesis. We write n = mk. Consider the subgroup
⟨a^k⟩ of ⟨a⟩. We observe |⟨a^k⟩| = o(a^k) = n/(n,k) = mk/(mk,k) = mk/k = m,
so ⟨a^k⟩ is a subgroup of order m.

We now show that ⟨a^k⟩ is the unique subgroup of G of order m. Let L be
a subgroup of order m. We want to prove L = ⟨a^k⟩. Since |L| = |⟨a^k⟩| = m
is finite, it will suffice to prove that L ⊆ ⟨a^k⟩. This is certainly true if
L = {1}, that is, if m = 1. When m ≠ 1, we have, by Theorem 11.8, L =
⟨a^t⟩, where t is the smallest natural number such that a^t ∈ L. In order to
show ⟨a^t⟩ = L ⊆ ⟨a^k⟩, we need only prove a^t ∈ ⟨a^k⟩, i.e., we need only
prove k | t. This is easy: since o(a^t) = |⟨a^t⟩| = |L| = m, we get (a^t)^m = 1 by
Lemma 11.6, so a^(tm) = 1, so n | tm by Lemma 11.6 again, which gives
km | tm, hence k | t.

Lemma 11.10 implies that a finite cyclic group G has, for any positive
divisor k of |G|, a unique subgroup of index k. This reformulation of
Lemma 11.10 extends immediately to infinite cyclic groups.

11.11 Lemma: Let G = ⟨a⟩ be a cyclic group of infinite order. For any
m ∈ ℕ, there is a unique subgroup H of G of index |G:H| = m, namely
H = ⟨a^m⟩. Any nontrivial subgroup of G has finite index in G.

Proof: We have G = ⟨a⟩, o(a) = ∞. The elements of G are the symbols a^k,
where k runs through the set ℤ of integers. By Lemma 11.5, a^k ≠ a^j for
k ≠ j. Two symbols are multiplied by adding the exponents: a^k a^j = a^(k+j).
Also, a^0 is the identity and (a^k)^(-1) is the symbol a^(-k). Essentially, we have
the group ℤ of integers under addition, but the integers are written as
exponents.

First we prove that a nontrivial subgroup of G has finite index in G. Let
L ≤ G = ⟨a⟩, L ≠ {1}. From Theorem 11.8, we know L = ⟨a^t⟩, where t is
the smallest natural number such that a^t ∈ L. Any element a^n of G = ⟨a⟩
can be written as a^(tq+r), with some uniquely determined integers q,r,
where 0 ≤ r ≤ t - 1. Thus any element a^n of G belongs to one and only
one of the subsets

{a^(tq): q ∈ ℤ}, {a^(tq+1): q ∈ ℤ}, {a^(tq+2): q ∈ ℤ}, ..., {a^(tq+(t-1)): q ∈ ℤ},

which are just the right cosets

⟨a^t⟩a^0, ⟨a^t⟩a^1, ⟨a^t⟩a^2, ..., ⟨a^t⟩a^(t-1)
La^0, La^1, La^2, ..., La^(t-1)

of L. The uniqueness of q and r implies that these cosets are distinct.
Alternatively, one can show that these cosets are distinct by noting that
La^i = La^j (0 ≤ i,j ≤ t - 1) implies, when i ≠ j, say when i < j, that L =
La^(j-i) and thus (Lemma 10.2(2)) a^(j-i) ∈ L, where 0 < j - i ≤ t - 1,
contrary to the definition of t as the smallest natural number such that
a^t ∈ L. So there are exactly t distinct right cosets of L in G and |G:L| = t is
finite.

We proved in fact that |G:⟨a^t⟩| = t when t ∈ ℕ. Thus, for any m ∈ ℕ, there
is a subgroup of G of index m, namely ⟨a^m⟩. We proceed to show that
⟨a^m⟩ is the unique subgroup of G of index m. Assume K ≤ G with |G:K| =
m ∈ ℕ. We are to show K = ⟨a^m⟩. Now K = ⟨a^k⟩, where k is the smallest
natural number such that a^k ∈ K (as |G:K| is finite, K ≠ {1}). So m = |G:K| =
|G:⟨a^k⟩| = k and a^m = a^k, which yields K = ⟨a^k⟩ = ⟨a^m⟩. Therefore ⟨a^m⟩ is
the unique subgroup of G of index m.

We have learned the structure of cyclic groups quite well, but we have
had only a few examples. We have not seen any cyclic group of order 5
or 7. For all we know about cyclic groups up to now, it is conceivable
that there is no cyclic group of order 5 or 7. We show next that there is
a cyclic group of any order. Incidentally, this shows that there are
groups of all orders.

11.12 Theorem: There is a cyclic group of infinite order. Also, for
any n ∈ ℕ, there is a cyclic group of order n.

Proof: We give examples of cyclic groups in additive notation. In this
notation, ⟨a⟩ is the group {na: n ∈ ℤ}, the group operation being na + ma
= (n + m)a, the additive counterpart of the rule a^n a^m = a^(n+m).

ℤ (under addition) is a cyclic group of infinite order, as ℤ = {m1: m ∈ ℤ}
= ⟨1⟩ is generated by 1.

ℤ_n (under addition) is a cyclic group of order n, as ℤ_n = {m1̄: m ∈ ℤ}
= ⟨1̄⟩ is generated by 1̄ ∈ ℤ_n.

11.13 Theorem: Let p be a prime number. If G is a group of order p,
then G is cyclic.

Proof: Since p is prime, |G| = p ≠ 1 and so G does not consist of the
identity element only. Let a be any element of G distinct from the
identity. Then 1 ≠ a ∈ G and |⟨a⟩| is a positive divisor of |G| = p by
Lagrange's theorem. Since a ≠ 1, we have |⟨a⟩| ≠ 1, and so |⟨a⟩| = p = |G|.
This forces G = ⟨a⟩. Thus G is a cyclic group. (In fact, any nonidentity
element of G is a generator of G.)

Exercises

1. Let G be a group and let a be an element of finite order n in G. Show
that, for all m,k ∈ ℤ, the equality a^m = a^k holds if and only if
m ≡ k (mod n).

2. Find all subgroups of a cyclic group of order 8, of a cyclic group of
order 10, and of a cyclic group of order 12.

3. Let G be a group, a ∈ G and o(a) = 36. What are the orders of a^2, a^3, a^4,
a^7, a^12, a^15, a^17?

4. Let G be a group and a ∈ G. Let n,k ∈ ℕ and let m = [n,k] be the least
common multiple of n and k. Prove that ⟨a^n⟩ ∩ ⟨a^k⟩ = ⟨a^m⟩.

5. Let G be a group and a ∈ G with o(a) = n₁n₂, where n₁, n₂ are
relatively prime natural numbers. Show that there are uniquely
determined elements a₁, a₂ of G such that
a₁a₂ = a = a₂a₁
and o(a₁) = n₁, o(a₂) = n₂.

6. Let G be a group and a,b ∈ G. Assume that o(a) ∈ ℕ, o(b) ∈ ℕ and that
o(a), o(b) are relatively prime. Prove: if ab = ba, then o(ab) = o(a)o(b).
Prove also that o(ab) = o(a)o(b) is not necessarily true when the
hypothesis ab = ba is omitted.

7. Show that, if p,n ∈ ℕ and p is prime, then φ(p^n) = p^n - p^(n-1).

§12
Group of Units Modulo n

Let n be a natural number and consider ℤ_n. We defined two operations
on this set, namely addition and multiplication (Lemma 6.3). With
respect to addition, ℤ_n forms a group. What about multiplication? With
respect to multiplication, ℤ_n is not a group unless n = 1. This can be
easily seen from the fact that 0̄ has no multiplicative inverse in ℤ_n
(Lemma 6.4(12); note that 0̄ ≠ 1̄ when n ≠ 1). However, as in Example
9.4(h), a suitable subset of ℤ_n is a group under multiplication.

12.1 Lemma: Let n ∈ ℕ and a,b ∈ ℤ. If ā = b̄ in ℤ_n, then (a,n) = (b,n).

Proof: If ā = b̄ in ℤ_n, then a ≡ b (mod n), so n | b - a, so nk = b - a for
some k ∈ ℤ. We put d₁ = (a,n) and d₂ = (b,n). We have d₁ | n and d₁ | a, thus
d₁ | nk + a, thus d₁ | b. From d₁ | n and d₁ | b, we get d₁ | (b,n), so d₁ | d₂.
Likewise we obtain d₂ | d₁. So d₁ = ±d₂ by Lemma 5.2(12) and, since d₁, d₂
are positive, we have d₁ = d₂.

The preceding lemma tells us that the mapping ℤ_n → ℕ, ā ↦ (a,n) is well
defined. The claim of the lemma is not self-evident and requires proof.
Compare it to the apparently similar but wrong assertion that ā = b̄ in ℤ_n
implies (a,n²) = (b,n²). By Lemma 12.1, the following definition is
meaningful.

12.2 Definition: Let n ∈ ℕ and ā ∈ ℤ_n, where a ∈ ℤ. If (a,n) = 1, then ā
is called a unit in ℤ_n. The set of all units in ℤ_n will be denoted by ℤ_n*.

The reader will observe that U in Example 9.4(h) is exactly ℤ_8*. We see
ℤ_7* = {1,2,3,4,5,6}. More generally, ℤ_p* = {1,2,...,p-1} for any prime
number p. So |ℤ_p*| = p - 1. When n ≠ 1, ℤ_n* consists of the residue classes
of the numbers among 1,2,3,...,n-1,n that are relatively prime to n.

By the definition of Euler's phi function, we conclude |ℤ_n*| = φ(n). So
φ(12) = 4 and in fact ℤ_12* = {1,5,7,11}. Also, φ(15) = 8 and ℤ_15* =
{1,2,4,7,8,11,13,14}.

12.3 Lemma: Let n ∈ ℕ and a,b ∈ ℤ. If (a,n) = (b,n) = 1, then (ab,n) = 1.

Proof: This follows from the fundamental theorem of arithmetic
(Theorem 5.17), but we give another proof. We put d = (ab,n) and assume, by
way of contradiction, that d ≠ 1. Then p | d for some prime number p
(Theorem 5.13). So

p | ab and p | n
⟹ (p | a or p | b) and p | n (Euclid's lemma)
⟹ (p | a and p | n) or (p | b and p | n)
⟹ p | (a,n) or p | (b,n),

contrary to the hypothesis (a,n) = 1 = (b,n). So (ab,n) = d = 1.

12.4 Theorem: For any n ∈ ℕ, ℤ_n* is a group under multiplication.

Proof: (cf. Example 9.4(h).) We check the group axioms.

(i) Is ℤ_n* closed under multiplication? Let ā, b̄ ∈ ℤ_n*, so that a,b
are integers with (a,n) = 1 = (b,n). We ask whether ā b̄ ∈ ℤ_n*; since ā b̄ is
the residue class of ab, this is equivalent to asking whether (ab,n) = 1.
By Lemma 12.3, ab is indeed relatively prime to n and so ℤ_n* is closed
under multiplication.

(ii) Multiplication in ℤ_n* is associative since it is in fact
associative in ℤ_n (Lemma 6.4(7)).

(iii) 1̄ ∈ ℤ_n* as (1,n) = 1, and ā 1̄ = ā for all ā ∈ ℤ_n*. Hence
1̄ is an identity element of ℤ_n*.

(iv) Each element in ℤ_n* has an inverse in ℤ_n*. This follows
from Lemma 6.4(9). Let us recall its proof. If ā ∈ ℤ_n*, with a ∈ ℤ and
(a,n) = 1, then there are integers x,y such that ax + ny = 1. From this we
get ā x̄ = 1̄, so x̄ is an inverse of ā. Yes, but this is not enough. We must
further show that x̄ ∈ ℤ_n*, or equivalently that (x,n) = 1. This follows
from the equation ax + ny = 1, since d = (x,n) implies d | x, d | n, so
d | ax + ny, so d | 1, so d = 1.

Hence ℤ_n* is a group under multiplication.
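The group properties checked in the proof can also be confirmed numerically. The sketch below (helper name ours) builds ℤ_n* for a few n and tests closure and the existence of inverses; note that the inverse is found by search here, while the proof produces it from ax + ny = 1.

```python
from math import gcd

# The units mod n, as in Definition 12.2, with the key steps of the proof
# of Theorem 12.4 checked by exhaustion.
def units(n):
    return {a for a in range(1, n + 1) if gcd(a, n) == 1}

for n in (8, 12, 15):
    U = units(n)
    assert all((a * b) % n in U for a in U for b in U)          # (i) closure
    assert all(any((a * x) % n == 1 for x in U) for a in U)     # (iv) inverses

assert units(8) == {1, 3, 5, 7}    # the group U of Example 9.4(h)
```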

ℤ_n* is a finite group of order φ(n). Using Lemma 11.7, we obtain ā^φ(n) = 1̄
for all ā ∈ ℤ_n*. Writing this in congruence notation, we get an important
theorem of number theory due to L. Euler.

12.5 Theorem (Euler's theorem): Let n ∈ ℕ. For all integers a that are
relatively prime to n, we have
a^φ(n) ≡ 1 (mod n).

The case when n is a prime number had already been observed by
Pierre de Fermat (1601-1665). The result is known as Fermat's theorem
or as Fermat's little theorem.

12.6 Theorem (Fermat's theorem): If p is a positive prime number,
then
a^(p-1) ≡ 1 (mod p)

for all integers a that are relatively prime to p (i.e., for all integers a
such that p ∤ a).

Multiplying both sides of the congruence a^(p-1) ≡ 1 (mod p) by a, we get
a^p ≡ a (mod p). The latter congruence is true also without the hypothesis
(a,p) = 1, since both a^p and a are congruent to 0 (mod p) when (a,p) ≠ 1.
This is also known as Fermat's (little) theorem.

12.7 Theorem (Fermat's theorem): If p is a prime number, then

a^p ≡ a (mod p)

for all integers a.
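Both congruences are convenient to verify with Python's built-in three-argument pow (modular exponentiation); the φ used below is computed directly from its definition, as in §11.

```python
from math import gcd

def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(n, k) == 1)

# Euler's theorem: a^phi(n) = 1 (mod n) whenever (a,n) = 1.
for n in range(2, 30):
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1

# Fermat's theorem in the form of 12.7: a^p = a (mod p) for every integer a.
for p in (2, 3, 5, 7, 11, 13):
    for a in range(-20, 20):
        assert pow(a, p, p) == a % p
```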

Exercises

1. Prove that ℤ_n* is an abelian group under multiplication.

2. Construct the multiplication tables of ℤ_n* for n = 2,4,6,10,12.

3. What are the orders of 2̄ in ℤ_3*, 2̄ in ℤ_5*, 3̄ in ℤ_7*, 2̄ in ℤ_11*, 2̄ in
ℤ_13*, 3̄ in ℤ_17*, 2̄ in ℤ_19*, 5̄ in ℤ_23*? What do you guess?

4. Show that ℤ_3*, ℤ_{3^2}*, ℤ_{3^3}*, ℤ_{3^4}* are cyclic.

5. Assume p is prime, ℤ_p* is cyclic, and m ∈ ℕ, m ≥ 2. Prove that ℤ_{p^m}*
is cyclic by establishing that, if the class of a in ℤ_p* is a generator of ℤ_p*,
then the class of either a or a + p in ℤ_{p^m} is a generator of ℤ_{p^m}*.

6. Find the order of 5̄ in ℤ_8*, in ℤ_16*, in ℤ_32*, in ℤ_64*.

7. Prove or disprove: if a ∈ ℤ and a ≡ 5 (mod 8), then the order of ā in
ℤ_{2^m}* is 2^(m-2) for all m ≥ 3.

8. Show that ℤ_{pq}* is not cyclic if p and q are distinct positive odd prime
numbers. (Hint: What is φ(pq) and what is a^((p-1)(q-1)/2) congruent to
(mod pq) if a is an integer relatively prime to pq?)

§13
Groups of Isometries

For any nonempty set X, the set S_X of all one-to-one mappings from X
onto X is a group under the composition of mappings (Example 7.1(d)).
In particular, if X happens to be the Euclidean plane E, then E is the set
of all points in the plane and S_E is a group. We note that E is not merely
an ordinary set of points. An important feature of E is that there is a
measure of distance between the points of E. Among the mappings in S_E,
we examine those functions which preserve the distance between any
two points. Clearly, such functions will be more important than other
ones in S_E, since such mappings respect an important structure of the
Euclidean plane E.

We choose an arbitrary but fixed cartesian coordinate system on E. Each
point P in E will then be represented by the ordered pair (x,y) of its
coordinates. We will not distinguish between the point P and the
ordered pair (x,y). So we write (x,y)σ in place of Pσ, where σ ∈ S_E. The
distance between two points P,Q in E is given by √((x₁-x₂)² + (y₁-y₂)²), if P
and Q have coordinates (x₁,y₁), (x₂,y₂), respectively. This distance will be
denoted by d(P,Q) or by d((x₁,y₁),(x₂,y₂)).

13.1 Definition: A mapping σ ∈ S_E is called an isometry (of E) if

d(Pσ,Qσ) = d(P,Q)

for any two points P,Q in E.

This word is derived from "isos" and "metron", meaning "equal" and
"measure" in Greek. The set of all isometries of E will be denoted by
Isom E. Since the identity mapping ι_E: E → E is evidently an isometry,
Isom E is a nonempty subset of S_E. In fact, Isom E ≤ S_E.

13.2 Theorem: Isom E is a subgroup of S_E.

Proof: We must show that the product of two isometries and the
inverse of an isometry are isometries (Lemma 9.2).

(i) Let σ,τ ∈ Isom E. Then, for any two points P,Q in E,

d(Pστ,Qστ) = d((Pσ)τ,(Qσ)τ)
= d(Pσ,Qσ) (since τ is an isometry)
= d(P,Q) (since σ is an isometry),

so στ ∈ Isom E. Hence Isom E is closed under multiplication.

(ii) Let σ ∈ Isom E and let P,Q be any two points in E. Since
σ ∈ S_E, there are uniquely determined points P′,Q′ in E such that P′σ = P,
Q′σ = Q. Thus P′ = Pσ^(-1), Q′ = Qσ^(-1). Then

d(P′σ,Q′σ) = d(P′,Q′) (since σ is an isometry),
d(P,Q) = d(Pσ^(-1),Qσ^(-1)),
σ^(-1) ∈ Isom E.

Hence Isom E ≤ S_E.

We examine some special types of isometries, namely translations,
rotations and reflections.

Loosely speaking, a translation shifts every point of E by the same
amount in the same direction. In more detail, a translation is a mapping
which "moves" any point (x,y) in E by a units in the direction of the
x-axis and by b units in the direction of the y-axis (the directions being
reversed when a or b is negative). See Figure 1. The formal definition is
as follows.

13.3 Definition: A mapping E → E is called a translation if there are
two real numbers a,b such that
(x,y) ↦ (x + a, y + b)
for all points (x,y) in E under this mapping.

The translation (x,y) ↦ (x + a, y + b) will be denoted by τ_{a,b}.

[Figure 1: the translation τ_{a,b} carries the point (x,y) to (x+a, y+b).]

13.4 Lemma: Let τ_{a,b} and τ_{c,d} be arbitrary translations.
(1) τ_{a,b} τ_{c,d} = τ_{a+c,b+d}.
(2) τ_{0,0} = ι_E = ι.
(3) τ_{-a,-b} τ_{a,b} = ι = τ_{a,b} τ_{-a,-b}.

Proof: (1) We have (x,y)(τ_{a,b} τ_{c,d}) = ((x,y)τ_{a,b})τ_{c,d}
= (x + a, y + b)τ_{c,d}
= ((x + a) + c, (y + b) + d)
= (x + (a + c), y + (b + d))
= (x,y)τ_{a+c,b+d}
for all (x,y) ∈ E. Thus τ_{a,b} τ_{c,d} = τ_{a+c,b+d}.

(2) We have (x,y)τ_{0,0} = (x + 0, y + 0) = (x,y) = (x,y)ι for all (x,y) ∈ E. Thus
τ_{0,0} = ι.

(3) From (1) and (2) we get τ_{-a,-b} τ_{a,b} = τ_{(-a)+a,(-b)+b} = τ_{0,0} = ι and
likewise τ_{a,b} τ_{-a,-b} = τ_{a+(-a),b+(-b)} = τ_{0,0} = ι.
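The three identities of Lemma 13.4 amount to coordinate arithmetic and can be checked at a sample point; in the sketch below (names ours) a translation is a Python function and composition is written, as in the book, with the first map applied first.

```python
# A translation tau_{a,b} as a Python function on points (x, y).
def tau(a, b):
    return lambda p: (p[0] + a, p[1] + b)

def compose(f, g):                 # apply f first, then g, as in the book
    return lambda p: g(f(p))

P = (2.0, -1.0)
assert compose(tau(1, 2), tau(3, 4))(P) == tau(4, 6)(P)   # (1) composition adds
assert tau(0, 0)(P) == P                                  # (2) identity
assert compose(tau(-1, -2), tau(1, 2))(P) == P            # (3) inverses
```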

13.5 Lemma: Any translation is an isometry.

Proof: First of all, we must show that any translation belongs to S_E. Let
τ_{a,b} be an arbitrary translation (a,b ∈ ℝ). There is a mapping σ: E → E
such that τ_{a,b}σ = ι = στ_{a,b}, namely σ = τ_{-a,-b} by Lemma 13.4(3). Thus
τ_{a,b} is one-to-one and onto by Theorem 3.17(2). Hence τ_{a,b} ∈ S_E.

Next we show d((x₁,y₁),(x₂,y₂)) = d((x₁,y₁)τ_{a,b}, (x₂,y₂)τ_{a,b}) for any two
points (x₁,y₁),(x₂,y₂) in E. We have

d((x₁,y₁)τ_{a,b}, (x₂,y₂)τ_{a,b}) = d((x₁+a, y₁+b), (x₂+a, y₂+b))
= √([(x₁+a) - (x₂+a)]² + [(y₁+b) - (y₂+b)]²)
= √((x₁-x₂)² + (y₁-y₂)²)
= d((x₁,y₁),(x₂,y₂))

and so τ_{a,b} ∈ Isom E.

13.6 Theorem: The set T of all translations is a subgroup of Isom E.

Proof: Let T = {τ_{a,b}: a,b ∈ ℝ} be the set of all translations. T is a subset
of Isom E by Lemma 13.5. From Lemma 13.4(2), ι = τ_{0,0} ∈ T, so T ≠ ∅.
Now we use our subgroup criterion (Lemma 9.2).

(i) The product of two translations τ_{a,b} and τ_{c,d} is a translation
τ_{a+c,b+d} ∈ T by Lemma 13.4(1). So T is closed under multiplication.

(ii) The inverse of any translation τ_{a,b} ∈ T is also a translation
τ_{-a,-b} ∈ T by Lemma 13.4(3). So T is closed under taking inverses.

Thus T is a subgroup of Isom E.

Next we investigate rotations. By a rotation about a point C through an angle θ, we want to understand a mapping from E into E which sends the point C to C and whose effect on any point P ≠ C is as follows: we turn the line segment CP about the point C through the angle θ into a new line segment, say to CQ; the point P will be sent to the point Q (see Figure 2). We recall that positive values of θ measure counterclockwise angles and negative values of θ measure clockwise angles.

Rotations are most easily described in a polar coordinate system. We choose the center of rotation, the point C, as the pole. The initial ray is chosen arbitrarily. The point P with polar coordinates (r,α) is then sent to the point Q whose polar coordinates are (r, α + θ). If C is the origin and the initial ray is the positive x-axis of a cartesian coordinate system, then the polar and cartesian coordinates of a point P are connected by

x = r cos α,  y = r sin α.

Figure 2

In our fixed cartesian coordinate system, the image of any point P = (x,y) can be found as follows. If P has polar coordinates (r,α), then its image will have polar coordinates (r, α + θ), so the cartesian coordinates x′,y′ of Q := (r, α + θ) are

x′ = r cos(α + θ) = r(cos α cos θ − sin α sin θ)
= (r cos α) cos θ − (r sin α) sin θ
= x cos θ − y sin θ,

y′ = r sin(α + θ) = r(sin α cos θ + cos α sin θ)
= (r sin α) cos θ + (r cos α) sin θ
= y cos θ + x sin θ
= x sin θ + y cos θ.

This suggests the following formal definition.

13.7 Definition: A mapping E → E is called a rotation about the origin through an angle θ if there is a real number θ such that

(x,y) ↦ (x cos θ − y sin θ, x sin θ + y cos θ)

for all points (x,y) in E under this mapping.

The rotation (x,y) ↦ (x cos θ − y sin θ, x sin θ + y cos θ) will be denoted by ρ_θ. We have an analogue of Lemma 13.4 for rotations.

13.8 Lemma: Let ρ_θ and ρ_φ be arbitrary rotations about the origin.
(1) ρ_θρ_φ = ρ_{θ+φ}.
(2) ρ_0 = 1_E = ι.
(3) ρ_{−θ}ρ_θ = ι = ρ_θρ_{−θ}.

Proof: (1) We have (x,y)(ρ_θρ_φ) = ((x,y)ρ_θ)ρ_φ
= (x cos θ − y sin θ, x sin θ + y cos θ)ρ_φ
= ((x cos θ − y sin θ)cos φ − (x sin θ + y cos θ)sin φ, (x cos θ − y sin θ)sin φ + (x sin θ + y cos θ)cos φ)
= (x(cos θ cos φ − sin θ sin φ) − y(sin θ cos φ + cos θ sin φ), x(cos θ sin φ + sin θ cos φ) + y(−sin θ sin φ + cos θ cos φ))
= (x cos(θ + φ) − y sin(θ + φ), x sin(θ + φ) + y cos(θ + φ))
= (x,y)ρ_{θ+φ}
for all (x,y) ∈ E. Thus ρ_θρ_φ = ρ_{θ+φ}.

(2) We have (x,y)ρ_0 = (x cos 0 − y sin 0, x sin 0 + y cos 0) = (x − 0, 0 + y) = (x,y) = (x,y)ι for all (x,y) ∈ E. Thus ρ_0 = ι.

(3) From (1) and (2) we get ρ_{−θ}ρ_θ = ρ_{(−θ)+θ} = ρ_0 = ι and likewise ρ_θρ_{−θ} = ρ_{θ+(−θ)} = ρ_0 = ι.
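As with translations, Lemma 13.8 can be verified numerically. The sketch below (my own Python, with floating-point tolerances) checks the three parts on a sample point.

```python
# Numeric check of Lemma 13.8: rho_theta rho_phi = rho_{theta+phi},
# rho_0 = identity, rho_{-theta} inverts rho_theta.
import math

def rotation(theta):
    """The mapping (x,y) -> (x cos t - y sin t, x sin t + y cos t)."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda p: (p[0] * c - p[1] * s, p[0] * s + p[1] * c)

def close(p, q):
    return math.isclose(p[0], q[0], abs_tol=1e-12) and \
           math.isclose(p[1], q[1], abs_tol=1e-12)

theta, phi = 0.7, 2.1
p = (3.0, -1.5)
# Apply rho_theta, then rho_phi; compare with rho_{theta+phi}:
assert close(rotation(phi)(rotation(theta)(p)), rotation(theta + phi)(p))
# rho_0 is the identity and rho_{-theta} undoes rho_theta:
assert close(rotation(0.0)(p), p)
assert close(rotation(-theta)(rotation(theta)(p)), p)
print("Lemma 13.8 checks pass")
```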

Lemma 13.8 was to be expected. When we carry out a rotation through an angle θ and then a rotation through an angle φ, we have in effect a rotation through an angle θ + φ. This is what Lemma 13.8(1) states. Also, when we carry out a rotation through an angle θ and then a rotation through the same angle in the reverse direction, the final result will be: no net motion at all. This is what Lemma 13.8(3) states.

13.9 Lemma: Any rotation about the origin is an isometry.

Proof: First of all, we must show that any rotation about the origin belongs to S_E. Let ρ_θ be an arbitrary rotation about the origin (θ ∈ ℝ). There is a mapping ρ: E → E such that ρ_θρ = ι = ρρ_θ, namely ρ = ρ_{−θ} by Lemma 13.8(3). Thus ρ_θ is one-to-one and onto by Theorem 3.17(2). Hence ρ_θ ∈ S_E.

Now we prove that ρ_θ preserves distance. For any two points (x,y),(u,v) in E, we have

d²((x,y)ρ_θ,(u,v)ρ_θ)
= d²((x cos θ − y sin θ, x sin θ + y cos θ),(u cos θ − v sin θ, u sin θ + v cos θ))
= [(x − u)cos θ − (y − v)sin θ]² + [(x − u)sin θ + (y − v)cos θ]²
= (x − u)²cos²θ − 2(x − u)(y − v)cos θ sin θ + (y − v)²sin²θ + (x − u)²sin²θ + 2(x − u)(y − v)cos θ sin θ + (y − v)²cos²θ
= (x − u)²(cos²θ + sin²θ) + (y − v)²(sin²θ + cos²θ)
= (x − u)² + (y − v)²
= d²((x,y),(u,v)),

hence d((x,y)ρ_θ,(u,v)ρ_θ) = d((x,y),(u,v)). So ρ_θ is an isometry.

13.10 Theorem: The set R of all rotations about the origin is a subgroup of Isom E.

Proof: Let R = {ρ_θ: θ ∈ ℝ} be the set of all rotations about the origin. R is a subset of Isom E by Lemma 13.9. By Lemma 13.8(2), ι = ρ_0 ∈ R, so R ≠ ∅. Now we use our subgroup criterion (Lemma 9.2).

(i) The product of two rotations ρ_θ and ρ_φ about the origin is a rotation ρ_{θ+φ} ∈ R about the origin by Lemma 13.8(1). So R is closed under multiplication.

(ii) The inverse of any rotation ρ_θ ∈ R about the origin is also a rotation ρ_{−θ} ∈ R about the origin by Lemma 13.8(3). So R is closed under taking inverses.

Thus R is a subgroup of Isom E.

So far, we have been dealing with rotations about the origin. What about rotations about an arbitrary point C, whose coordinates are (a,b), say? A rotation about C through an angle θ will map a point P with coordinates (x + a, y + b) to a point Q with coordinates (x′ + a, y′ + b), where (x′,y′) is the point to which (x,y) is mapped under the rotation about the origin through an angle θ. So the image of (x,y)τ_{a,b} will be (x,y)ρ_θτ_{a,b}. See Figure 3. This suggests the following formal definition.

13.11 Definition: Let C = (a,b) be a point in E. The mapping (τ_{a,b})⁻¹ρ_θτ_{a,b}: E → E is called a rotation about C through an angle θ.

Figure 3

We put (τ_{a,b})⁻¹Rτ_{a,b} := {(τ_{a,b})⁻¹ρ_θτ_{a,b}: ρ_θ ∈ R}. This is the set of all rotations about the point (a,b). It is a subgroup of Isom E. The proof of this statement is left to the reader.
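In coordinates, Definition 13.11 says: translate C to the origin, rotate, translate back. The sketch below (my own code; the function name is illustrative) checks that the composite fixes the center C and turns a nearby point as expected.

```python
# Definition 13.11 numerically: a rotation about C = (a,b) is the
# composite (tau_{a,b})^{-1} rho_theta tau_{a,b}.
import math

def rotation_about(a, b, theta):
    c, s = math.cos(theta), math.sin(theta)
    def rho(p):
        x, y = p[0] - a, p[1] - b            # apply tau_{a,b}^{-1} = tau_{-a,-b}
        x, y = x * c - y * s, x * s + y * c  # apply rho_theta
        return (x + a, y + b)                # apply tau_{a,b}
    return rho

r = rotation_about(2.0, 5.0, math.pi / 2)
# The center C = (2,5) is fixed:
assert r((2.0, 5.0)) == (2.0, 5.0)
# A quarter turn about (2,5) sends (3,5) to (2,6):
x, y = r((3.0, 5.0))
assert math.isclose(x, 2.0, abs_tol=1e-12) and math.isclose(y, 6.0, abs_tol=1e-12)
print("rotation about (2,5) behaves as expected")
```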

Now we examine reflections. The cartesian equations of a reflection are very cumbersome. For this reason, we give a coordinate-free definition of reflections. We need some notation. Let P,Q be distinct points in the plane E. In what follows, PQ will denote the line through P and Q, and also the line segment between P and Q; which is meant will be clear from context. The line segment PQ is the set of points R in E such that d(P,R) + d(R,Q) = d(P,Q).

The geometric idea of a reflection is that there is a line m and that each point P is mapped to its "mirror image" Q on the other side of m. So PQ is perpendicular to m and d(P,R) = d(R,Q), where R is the point of intersection of m and PQ. See Figure 4.

Figure 4

13.12 Definition: Let m be a straight line in E and let σ_m: E → E be the mapping defined by

Pσ_m = P if P is on m

and

Pσ_m = Q if P is not on m and if m is the perpendicular bisector of PQ.

σ_m is called the reflection in the line m.

The perpendicular bisector of PQ is the line that is perpendicular to PQ and that intersects PQ at a point R such that d(P,R) = d(R,Q). It is also the locus of all points in E which are equidistant from P and Q. So it is the set {R ∈ E: d(P,R) = d(Q,R)}. We will make use of this description of the perpendicular bisector in the sequel without explicit mention.

13.13 Lemma: Let σ_m be the reflection in a line m. Then σ_m ≠ ι and σ_m² = ι.

Proof: Pσ_m ≠ P if P is not on the line m and so σ_m ≠ ι. Now we prove that σ_m² = ι. We have Pσ_m² = P(σ_mσ_m) = (Pσ_m)σ_m = Pσ_m = P when P is a point on m, by definition. It remains to show Pσ_m² = P also when P is not on m. Let P be a point not on m and let Q = Pσ_m, P₁ = Qσ_m. Then Q is not on m and m is the perpendicular bisector of PQ as well as of QP₁. So PQ and QP₁ are parallel lines. Since they have a point Q in common, they are identical lines. Let R be the point at which m and PQ intersect. So P₁ ≠ Q and P₁ is that point on PQ for which d(Q,R) = d(P₁,R). Since P is on PQ and d(Q,R) = d(P,R), we obtain P = P₁, as was to be proved.

13.14 Lemma: Any reflection in a line is an isometry.

Proof: Let m be a line, σ_m the reflection in m and let P,Q be arbitrary points in the plane. We put P₁ = Pσ_m, Q₁ = Qσ_m. We are to show d(P,Q) = d(P₁,Q₁). We distinguish several cases.

Case 1. Assume both P and Q are on m. Then P₁ = P and Q₁ = Q. So d(P,Q) = d(P₁,Q₁).

Case 2. Assume one of the points is on m, the other is not. We suppose, without loss of generality, that P is on m and Q is not on m. Let QQ₁ intersect m at S. Then d(Q,S) = d(S,Q₁), d(P,S) = d(S,P₁) since P = P₁, and the angles PSQ and P₁SQ₁ are both right angles. By the side-angle-side condition, the triangles QPS and Q₁P₁S are congruent. So the corresponding sides PQ and P₁Q₁ have equal length. This means d(P,Q) = d(P₁,Q₁).

From now on, assume that neither P nor Q is on m. Let m intersect PP1 at
N and QQ1 at S.

Figure 5

Case 3. Assume that PQ is parallel to m. Then the quadrilaterals NPQS and NP₁Q₁S are rectangles. So PP₁Q₁Q is a rectangle and the sides PQ and P₁Q₁ have equal length. This means d(P,Q) = d(P₁,Q₁).

Case 4. Assume that PQ is not parallel to m. Then PQ intersects m at a point T. As in Case 2, the triangles PTN and P₁TN are congruent, and the triangles QST and Q₁ST are congruent, so d(P,T) = d(P₁,T) and d(Q,T) = d(Q₁,T). Also, ∠STQ₁ = ∠STQ = ∠NTP = ∠NTP₁, which shows that P₁, T, Q₁ lie on a straight line. Then we obtain

d(P₁,Q₁) = d(P₁,T) ± d(T,Q₁)
= d(P₁,T) ± d(Q₁,T)
= d(P,T) ± d(Q,T)
= d(P,T) ± d(T,Q)
= d(P,Q),

where the upper or lower sign is to be taken according as P,Q are on the same or on the opposite sides of m.

13.15 Theorem: Let m be a line in E. Then {ι, σ_m} is a subgroup of Isom E.

Proof: {ι, σ_m} is a finite nonempty subset of Isom E by Lemma 13.14. It is closed under multiplication by Lemma 13.13. So it is a subgroup of Isom E by Lemma 9.3(1).
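Lemmas 13.13 and 13.14 can be tested in coordinates. The sketch below uses the explicit formula for σ_m stated in Exercise 1 at the end of this paragraph (the code itself is mine) and checks the involution property, distance preservation, and that points of m are fixed.

```python
# Reflection in the line ax + by + c = 0, via the formula of Exercise 1:
# (u,v) -> (u - 2a(au+bv+c)/(a^2+b^2), v - 2b(au+bv+c)/(a^2+b^2)).
import math

def reflection(a, b, c):
    n2 = a * a + b * b
    def sigma(p):
        u, v = p
        t = (a * u + b * v + c) / n2
        return (u - 2 * a * t, v - 2 * b * t)
    return sigma

def d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

sigma = reflection(1.0, -2.0, 3.0)   # the line x - 2y + 3 = 0
P, Q = (4.0, -1.0), (-2.5, 0.75)

# sigma_m^2 = identity (Lemma 13.13):
PP = sigma(sigma(P))
assert math.isclose(PP[0], P[0], abs_tol=1e-12) and math.isclose(PP[1], P[1], abs_tol=1e-12)
# sigma_m preserves distance (Lemma 13.14):
assert math.isclose(d(sigma(P), sigma(Q)), d(P, Q), abs_tol=1e-12)
# Points on m are fixed: (1, 2) satisfies x - 2y + 3 = 0.
assert sigma((1.0, 2.0)) == (1.0, 2.0)
print("reflection checks pass")
```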

Translations, rotations and reflections are isometries. Thus the products of any number of these mappings, carried out in any order, will be isometries, too. We show in the rest of this paragraph that all isometries are obtained in this way. We need some lemmas.

13.16 Lemma: Let P,Q,R be arbitrary points in E.


(1) There is a unique translation that maps P to Q.
(2) If d(P,Q) = d(P,R), there is a rotation about P that maps Q to R.

Proof: (1) When P = (a,b) and Q = (c,d), say, then τ_{m,n} maps P to Q if and only if (a + m, b + n) = (c,d), i.e., if and only if m = c − a, n = d − b. So τ_{c−a,d−b} is the unique translation that maps P to Q.

(2) We draw the circle whose center is at P and whose radius is equal to d(P,Q). This circle passes through R by hypothesis. Let θ be the angle which the circular arc QR subtends at the center P. Then a rotation about P through the angle θ maps Q to R.

The next lemma states that an isometry is completely determined by its


effect on three points not lying on a line.

13.17 Lemma: Let P,Q,R be three distinct points in E that do not lie on a straight line. Let φ, ψ be isometries such that Pφ = Pψ, Qφ = Qψ, Rφ = Rψ. Then φ = ψ.

Figure 6

Proof: We put χ = φψ⁻¹. We suppose χ ≠ ι and try to reach a contradiction. If χ ≠ ι, then there is a point N in E such that N ≠ Nχ. Since Pφ = Pψ by hypothesis, Pχ = P and so P ≠ N. Similarly Q ≠ N and R ≠ N. Now χ is an isometry, so d(P,Nχ) = d(Pχ,Nχ) = d(P,N) and likewise d(Q,Nχ) = d(Q,N) and d(R,Nχ) = d(R,N). So the circle with center at P and radius d(P,N) and the circle with center at Q and radius d(Q,N) intersect at the points N and Nχ. Then PQ is the perpendicular bisector of NNχ. Here we used N ≠ Nχ. But d(R,Nχ) = d(R,N) and R lies therefore on the perpendicular bisector of NNχ, i.e., R lies on PQ, contrary to the hypothesis that P,Q,R do not lie on a straight line. Hence necessarily χ = ι and φ = ψ.

13.18 Theorem: Let P,Q,R be three distinct points in E that do not lie on a straight line and let P′,Q′,R′ be three distinct points in E. Assume that d(P,Q) = d(P′,Q′), d(P,R) = d(P′,R′), d(Q,R) = d(Q′,R′). Then there is a translation τ, a rotation ρ (about an appropriate point and through a suitable angle) and a reflection σ such that

Pτρμ = P′, Qτρμ = Q′, Rτρμ = R′,

where μ denotes ι or σ.

Proof: By Lemma 13.16(1), there is a translation τ that maps P to P′. We put Q₁ = Qτ and R₁ = Rτ. Since τ is an isometry, d(P,Q) = d(Pτ,Qτ) = d(P′,Q₁), so d(P′,Q₁) = d(P,Q) = d(P′,Q′). By Lemma 13.16(2), there is a rotation about P′ that maps Q₁ to Q′. Let us denote this rotation by ρ. Then P′ρ = P′. We put

R₁ρ = R₂.

Here it may happen that R₂ = R′. Putting μ = ι in this case, we have Pτρμ = P′, Qτρμ = Q′, Rτρμ = R′, as claimed.

Suppose now R₂ ≠ R′. From d(P′,R′) = d(P,R) = d(Pτρ,Rτρ) = d(P′,R₂) and d(Q′,R′) = d(Q,R) = d(Qτρ,Rτρ) = d(Q′,R₂), we deduce that both P′ and Q′ lie on the perpendicular bisector of R′R₂. Denoting the reflection in the line P′Q′ by σ, we get P′σ = P′, Q′σ = Q′ and R₂σ = R′. Putting μ = σ in this case, we have Pτρμ = P′, Qτρμ = Q′, Rτρμ = R′, as claimed.

The proof is summarized schematically below.

P --τ--> P′ --ρ--> P′ --μ--> P′
Q --τ--> Q₁ --ρ--> Q′ --μ--> Q′
R --τ--> R₁ --ρ--> R₂ --μ--> R′

13.19 Theorem: Every isometry can be written as a product of translations, rotations and reflections. In fact, if φ is an arbitrary isometry, then there is a translation τ, a rotation ρ and a reflection σ such that φ = τρ or φ = τρσ.

Proof: Let φ be an arbitrary isometry. Choose any three distinct points P,Q,R in E not lying on a straight line. Then d(P,Q) = d(Pφ,Qφ), d(P,R) = d(Pφ,Rφ), d(Q,R) = d(Qφ,Rφ). So the hypotheses of Theorem 13.18 are satisfied with P′ = Pφ, Q′ = Qφ, R′ = Rφ and there is a translation τ, a rotation ρ and a reflection σ such that

Pτρμ = Pφ, Qτρμ = Qφ, Rτρμ = Rφ,

where μ = ι or μ = σ. By Lemma 13.17, τρμ = φ. Thus φ = τρ or φ = τρσ.
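The construction in the proof of Theorem 13.18 is effective, and can be carried out numerically. The sketch below (my own code; the sample isometry `phi` and all helper names are illustrative) builds the translation τ and the rotation ρ of the proof for a concrete pair of congruent triangles, and then checks that either R lands on R′ directly, or its image R₂ is equidistant from P′ and Q′ with R′, so the reflection in the line P′Q′ finishes the job.

```python
# Carrying out the construction of Theorem 13.18 in coordinates.
import math

def add(p, q):   return (p[0] + q[0], p[1] + q[1])
def sub(p, q):   return (p[0] - q[0], p[1] - q[1])
def rot(p, t):
    c, s = math.cos(t), math.sin(t)
    return (p[0] * c - p[1] * s, p[0] * s + p[1] * c)
def ang(v):      return math.atan2(v[1], v[0])
def close(p, q): return math.dist(p, q) < 1e-9

P, Q, R = (0.0, 0.0), (4.0, 0.0), (1.0, 2.0)

def phi(p):
    # some fixed isometry: rotate by 1 radian, translate, then mirror
    x, y = add(rot(p, 1.0), (3.0, -2.0))
    return (x, -y)

P1, Q1, R1 = phi(P), phi(Q), phi(R)

t = sub(P1, P)                    # tau: the translation taking P to P'
def tau(p): return add(p, t)

# rho: the rotation about P' taking Q tau to Q' (Lemma 13.16(2))
theta = ang(sub(Q1, P1)) - ang(sub(tau(Q), P1))
def rho(p): return add(rot(sub(p, P1), theta), P1)

assert close(rho(tau(P)), P1) and close(rho(tau(Q)), Q1)

R2 = rho(tau(R))
# Either R2 = R' (mu = iota), or R2 and R' are equidistant from both
# P' and Q', so the reflection in the line P'Q' carries R2 to R'.
assert close(R2, R1) or (
    math.isclose(math.dist(R2, P1), math.dist(R1, P1), abs_tol=1e-9)
    and math.isclose(math.dist(R2, Q1), math.dist(R1, Q1), abs_tol=1e-9))
print("triangle matched by tau, rho (and possibly a reflection)")
```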

Exercises

1. Let m be the line in E whose cartesian equation is ax + by + c = 0. Show that the reflection σ_m in the line m is given by

(u,v)σ_m = (u − 2a(au + bv + c)/(a² + b²), v − 2b(au + bv + c)/(a² + b²)).
2. Let m and n be two distinct lines intersecting at a point P. Show that σ_mσ_n is a rotation about P. Through which angle?

3. Let m and n be parallel lines. Show that σ_mσ_n is a translation.

4. Prove that every rotation and every translation can be written as a


product of two reflections.

5. Prove that every isometry can be written as a product of reflections.

6. A halfturn σ_P = σ_{(a,b)} about a point P = (a,b) is defined as the mapping given by

(x,y) ↦ (2a − x, 2b − y)

for all points (x,y) in E. Show that any halfturn is an isometry of order two. Prove that the product of three halfturns is a halfturn.
7. Prove that a halfturn σ_P is the product of any two reflections in lines intersecting perpendicularly at P.

8. Prove that a product of two halfturns is a translation.

9. Show that the set of all translations and halfturns is a subgroup of


Isom E.

10. Prove that the product of four reflections can be written as a product
of two reflections.

11. Show that ρ_{2π/n} generates a cyclic subgroup of order n of Isom E.

12. Prove that every nonidentity translation is of infinite order and that ρ_θ is of finite order if and only if θ is a rational multiple of π.

§14
Dihedral Groups

In this paragraph, we examine the symmetry groups of regular polygons.

Let F be any nonempty subset of the Euclidean plane E. Here F might be a set with a single point, a line, a geometric figure, or an arbitrary subset of E. Let σ ∈ S_E. We put

Fσ = {xσ: x ∈ F} = {y ∈ E: y = xσ for some x ∈ F}.

Fσ is called the image of F under σ. Clearly,

Fι = {xι: x ∈ F} = {x: x ∈ F} = F

and we have F(στ) = {x(στ): x ∈ F} = {(xσ)τ: x ∈ F}
= {(xσ)τ: xσ ∈ Fσ} = {yτ: y ∈ Fσ} = (Fσ)τ

for all σ,τ ∈ S_E. We record this as a lemma.

14.1 Lemma: Let F be a nonempty subset of E and let σ,τ ∈ S_E. Then

Fι = F and F(στ) = (Fσ)τ.

Let P be a point in E and σ ∈ S_E. We say σ fixes P if Pσ = P. We also say P is a fixed point of σ in this case. Let F ⊆ E. We say σ fixes F (as a set) if Fσ = F. This means of course Fσ ⊆ F and F ⊆ Fσ, so Pσ ∈ F for all P ∈ F and also, for every Q ∈ F, there is a P ∈ F such that Q = Pσ. The reader should not confuse this with fixing F pointwise. We say that σ fixes F pointwise if Pσ = P for all P ∈ F, i.e., if σ fixes every point of F. As an example, let A be the x-axis {(x,0): x ∈ ℝ}. The translation τ_{1,0} fixes A as a set, but not pointwise. On the other hand, the reflection σ: (x,y) ↦ (x,−y) in the x-axis fixes A pointwise.

This terminology is meaningful for all elements of S_E, but we consider only isometries in this paragraph.

14.2 Definition: Let F be a nonempty subset of E and let σ ∈ Isom E. If Fσ = F, then σ is called a symmetry of F.

So a symmetry of F is an isometry that fixes F as a set. A symmetry of F


is not a property of F. It is a mapping.

14.3 Examples: (a) Let F = {(0,0)} be the subset of E consisting of the origin only. Any rotation ρ_θ about the origin is a symmetry of F, since any rotation about the origin is an isometry and fixes the origin (or, equivalently, fixes F).

(b) Let F = {(x,y) ∈ E: y = mx} be the line whose cartesian equation is y = mx (where m ∈ ℝ). Then the translation τ_{1,m} is a symmetry of F since

Fτ_{1,m} = {fτ_{1,m} ∈ E: f ∈ F}
= {(x,y)τ_{1,m} ∈ E: y = mx}
= {(x+1, y+m) ∈ E: y = mx}
= {(x+1, (x+1)m) ∈ E: x ∈ ℝ}
= {(u,v) ∈ E: v = mu}
= F.

Similarly, every translation of the form τ_{a,am} is a symmetry of F. We note that such translations form a group. In fact, the symmetries of any nonempty subset of E form a group.
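Example 14.3(b) can be checked pointwise. The sketch below (my own illustration) verifies that τ_{a,am} maps points of the line F = {y = mx} to points of F, and that every point of F is such an image (its preimage under τ_{a,am} is again on F).

```python
# Example 14.3(b) numerically: tau_{a,am} fixes the line y = m x as a set.
m = 1.5

def tau(a):
    """The translation tau_{a, a m}."""
    return lambda p: (p[0] + a, p[1] + a * m)

def on_F(p):
    return abs(p[1] - m * p[0]) < 1e-12

a = 2.0
for x in [-3.0, 0.0, 4.5]:
    p = (x, m * x)              # a point of F
    assert on_F(tau(a)(p))      # its image lies on F again
    assert on_F(tau(-a)(p))     # its preimage under tau_a lies on F too
print("tau_{a,am} is a symmetry of the line y = mx")
```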

14.4 Theorem: Let F be a nonempty subset of the Euclidean plane E and let Sym F be the set of all symmetries of F, so that

Sym F := {σ ∈ Isom E: Fσ = F}.

Then Sym F is a subgroup of Isom E.

Proof: We have Fι = F by Lemma 14.1, so ι ∈ Sym F and Sym F is not empty. Now we use Lemma 9.2.

(i) If σ,τ ∈ Sym F, then Fσ = F and Fτ = F, so F(στ) = (Fσ)τ = Fτ = F by Lemma 14.1. Thus στ ∈ Sym F.

(ii) If σ ∈ Sym F, then Fσ = F, so Fσ⁻¹ = (Fσ)σ⁻¹ = F(σσ⁻¹) = Fι = F by Lemma 14.1. Thus σ⁻¹ ∈ Sym F.

It follows that Sym F ≤ Isom E.

14.5 Definition: Let F be a nonempty subset of E. Then

Sym F = {σ ∈ Isom E: Fσ = F}

is called the symmetry group of F.

We now study the symmetry groups of regular polygons. For our purposes, it will be convenient to define regular polygons as follows. Let K be a circle and let P₁,P₂,...,Pₙ be n points on this circle K such that each one of the arcs P₁P₂, P₂P₃, ..., Pₙ₋₁Pₙ subtends an angle of 2π/n radians at the center of K (where n ≥ 3). So the points P₁,P₂,...,Pₙ divide the circle K into n circular arcs of equal length. The union of the line segments P₁P₂, P₂P₃, ..., Pₙ₋₁Pₙ, PₙP₁ is called a regular n-gon. The circle K is called the circumscribing circle of this regular n-gon. This is justified since a regular n-gon has a unique circumscribing circle. The center of the circumscribing circle is called the center of the regular n-gon and the points P₁,P₂,...,Pₙ are called the vertices of the regular n-gon.

Let F be a regular n-gon. We want to determine Sym F. It is geometrically evident that any σ in Sym F maps a vertex to a vertex and fixes the center of F. We use this fact without proof. A proof is outlined in the exercises at the end of this paragraph. Let P₁,P₂,...,Pₙ be the vertices and let C be the center of F. We assume the notation so chosen that P₁,P₂,...,Pₙ are consecutive vertices as we trace the regular n-gon counterclockwise. In the following discussion, Pₙ₊₁ will stand for P₁, Pₙ₊₂ for P₂, and in general Pₙ₊ₖ for Pₖ. In other words, the indices will be read modulo n.

Figure 1
Now let σ ∈ Sym F. Then σ is completely determined by its effect on three distinct points not on a straight line (Lemma 13.17). For example, σ is determined by Cσ, P₁σ, P₂σ. We have already remarked that Cσ = C. Also P₁σ = Pₖ for some k ∈ {1,2,...,n}. What about P₂σ? Since σ is an isometry, P₂σ will be a vertex whose distance from Pₖ is equal to the distance between P₁ and P₂. Thus P₂σ will be adjacent to Pₖ: it is either Pₖ₋₁ or Pₖ₊₁. We see that there are n choices for P₁σ and, once the choice for P₁σ has been made, there are two choices for P₂σ. Hence there are at most n·2 = 2n isometries in Sym F. We will exhibit 2n symmetries of F and this will prove |Sym F| = 2n.
Figure 2

First we examine the special case n = 3, when F is an equilateral triangle. Consider a rotation about the center of F through an angle of 2π/3 radians, which we denote by ρ. Under ρ, the vertices P₁,P₂,P₃ take the places of P₂,P₃,P₄ = P₁ respectively. It is seen from Figure 3 that ρ² maps P₁,P₂,P₃ respectively to P₃,P₁,P₂ and that ρ³ fixes P₁,P₂,P₃, which implies ρ³ = ι. We found three symmetries of F, namely ι, ρ, ρ². Since ρ³ = ι, we see that ⟨ρ⟩ is a cyclic subgroup of order 3 of Sym F.
Next consider the reflection σ in the perpendicular bisector of P₂P₃. Under σ, P₁ remains fixed and the vertices P₂ and P₃ exchange their places. We know σ² = ι (Lemma 13.13). From Figure 4, we read off σρ = ρ⁻¹σ = ρ²σ. Using ρ and σ, we obtain two new symmetries of F, namely ρσ and ρ²σ. The reader may check that ρσ is the reflection in the perpendicular bisector of P₁P₃ and that ρ²σ is the reflection in the perpendicular bisector of P₁P₂. From the geometric meaning of these mappings, or from their effect on P₁,P₂,P₃, we infer that ι, ρ, ρ², σ, ρσ, ρ²σ are distinct. Thus they form the symmetry group of F: Sym F = {ι, ρ, ρ², σ, ρσ, ρ²σ}. In particular, |Sym F| is equal to 6.

Figure 3

Figure 4

The discussion of a general regular polygon follows much the same lines. Consider a rotation about the center of F through an angle of 2π/n radians, which we denote by ρ. Under ρ, the vertices P₁,P₂,...,Pₙ are mapped respectively to P₂,P₃,...,Pₙ,P₁. It is seen that ρᵏ maps P₁,P₂,...,Pₙ respectively to P_{k+1},P_{k+2},...,P_{k+n}, where k is any integer. Thus ρᵏ = ι if and only if P_{k+i} = Pᵢ, that is, if and only if k + i ≡ i (mod n) for all i, so if and only if n | k, from which we obtain o(ρ) = n. In this way, we found n symmetries of F, namely ρ, ρ², ..., ρⁿ = ι. Here ⟨ρ⟩ is a cyclic subgroup of order n of Sym F.

Figure 5

Now consider the reflection σ in the angular bisector of the angle PₙP₁P₂. The bisector of this angle passes through P_{(n/2)+1} if n is even and through the midpoint of P_{(n+1)/2}P_{(n+3)/2} if n is odd. One reads off from Figure 6 that Pₖσ = P_{n+2−k} for k = 1,2,...,n.

Figure 6

Thus we have, for any j = 1,2,...,n,

Pⱼ(σρ) = P_{n+2−j}ρ = P_{(n+2−j)+1} = P_{n−j+3},

Pⱼ(ρ⁻¹σ) = P_{j−1}σ = P_{(n+2)−(j−1)} = P_{n−j+3},

so σρ = ρ⁻¹σ, as can be seen from Figure 7 too.

Figure 7

Using ρ and σ, we obtain n − 1 new symmetries ρσ, ρ²σ, ..., ρⁿ⁻¹σ of F. These are reflections in certain lines. The reader may verify this assertion in the case of squares, regular pentagons, regular hexagons and regular heptagons. In particular, we have (ρᵐσ)² = ι for any m = 0,1,...,n−1. This follows also from the lemma below.

14.6 Lemma: Let G be a group and let α,β ∈ G be such that β² = 1 and βα = α⁻¹β. Then βαⁿ = α⁻ⁿβ for all n ∈ ℤ.

Proof: The claim is certainly true when n = 0, and also when n = 1 by hypothesis. We prove βαⁿ = α⁻ⁿβ for all n ∈ ℕ by induction on n. Suppose we have proved it for n = k, so that βαᵏ = α⁻ᵏβ; then it is true for n = k + 1, since

βα^{k+1} = (βαᵏ)α = (α⁻ᵏβ)α = α⁻ᵏ(βα) = α⁻ᵏ(α⁻¹β) = α^{−(k+1)}β.

This shows βαⁿ = α⁻ⁿβ for all n ≥ 0. We must further show this when n ≤ 0, or, equivalently, that βα⁻ⁿ = αⁿβ for all n ∈ ℕ. This will follow from what we proved above, with α⁻¹ in place of α. Observe that β² = 1 and βα = α⁻¹β imply

βα⁻¹ = α(α⁻¹β)α⁻¹ = α(βα)α⁻¹ = αβ = (α⁻¹)⁻¹β,

so the hypothesis is valid with α⁻¹ in place of α. Then we get

β(α⁻¹)ⁿ = (α⁻¹)⁻ⁿβ for all n ∈ ℕ,

and thus βαⁿ = α⁻ⁿβ for all n ∈ ℤ. This completes the proof.
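Lemma 14.6 can be observed in a concrete group. The sketch below (my own choice of model) takes α to be the rotation of the plane through 2π/5 and β the reflection in the x-axis, written as 2×2 matrices acting on row vectors, and checks βαⁿ = α⁻ⁿβ for several positive and negative n.

```python
# Lemma 14.6 in a matrix group: beta^2 = 1, beta*alpha = alpha^{-1}*beta,
# hence beta*alpha^n = alpha^{-n}*beta for every integer n.
import math

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rot(t):
    # rotation matrix for a row-vector action; rot(n*t0) plays alpha^n
    c, s = math.cos(t), math.sin(t)
    return [[c, s], [-s, c]]

def close(A, B):
    return all(math.isclose(A[i][j], B[i][j], abs_tol=1e-9)
               for i in range(2) for j in range(2))

t0 = 2 * math.pi / 5
beta = [[1.0, 0.0], [0.0, -1.0]]   # reflection in the x-axis

# beta^2 = identity:
assert close(mul(beta, beta), [[1.0, 0.0], [0.0, 1.0]])
# beta * alpha^n = alpha^{-n} * beta for positive and negative n:
for n in [-7, -1, 0, 1, 2, 13]:
    assert close(mul(beta, rot(n * t0)), mul(rot(-n * t0), beta))
print("Lemma 14.6 holds in this matrix group")
```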

We found 2n symmetries of F: ι, ρ, ρ², ..., ρⁿ⁻¹, σ, ρσ, ρ²σ, ..., ρⁿ⁻¹σ. These are distinct (why?). From |Sym F| ≤ 2n, we get |Sym F| = 2n and

Sym F = {ι, ρ, ρ², ..., ρⁿ⁻¹, σ, ρσ, ρ²σ, ..., ρⁿ⁻¹σ}.

Every element in Sym F can be written as a product of a suitable power of ρ by a suitable power of σ, which remark we summarize by saying that ρ and σ generate Sym F. We also say Sym F is generated by ρ and σ, and that ρ and σ are generators of Sym F. The notation ⟨ρ,σ⟩ denotes a group with two generators ρ and σ. Including the relations ρⁿ = ι, σ² = ι and σρ = ρ⁻¹σ, we write

Sym F = ⟨ρ,σ: ρⁿ = ι, σ² = ι, σρ = ρ⁻¹σ⟩.

14.7 Definition: Let G be a group having elements a,b such that

o(a) = n, o(b) = 2, ba = a⁻¹b,

G = {aᵏbʳ: k = 0,1,...,n−1, r = 0,1},

where n ≥ 2. Then G is called a dihedral group of order 2n.

It is easily seen from o(a) = n and o(b) = 2 that the elements of G displayed in Definition 14.7 are indeed distinct. Using Lemma 14.6, products in G can be brought to the form aᵏbʳ. Hence G is really of order 2n. In a dihedral group of order 16 (so that o(a) = 8), for example, we have

a²bab³a⁵a⁷b⁻¹ = a²baba¹²b = a²·bab·a⁴b = a²·a⁻¹·a⁴b = a⁵b

with the foregoing notation. (The exponent of a changes sign when b "passes through" a to the other side.)
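The normal-form computation above can be mechanized. The sketch below (my own code; the pair encoding (k, r) for aᵏbʳ is an assumed representation, not the text's) multiplies words in a dihedral group with o(a) = 8 and reproduces the result a⁵b.

```python
# Dihedral arithmetic in normal form a^k b^r (k mod n, r mod 2),
# using the rule b a = a^{-1} b.  Here n = 8, so |G| = 16.
n = 8

def mult(x, y):
    """(a^k b^r)(a^m b^s) = a^{k + (-1)^r m} b^{r+s}."""
    k, r = x
    m, s = y
    if r == 1:
        m = -m                    # moving a^m past b flips its exponent
    return ((k + m) % n, (r + s) % 2)

def word(*factors):
    """Multiply out a word given as (exponent-of-a, exponent-of-b) pairs."""
    acc = (0, 0)
    for f in factors:
        acc = mult(acc, f)
    return acc

# a^2 b a b^3 a^5 a^7 b^{-1}; note b^3 = b and b^{-1} = b since o(b) = 2:
result = word((2, 0), (0, 1), (1, 0), (0, 1), (5, 0), (7, 0), (0, 1))
assert result == (5, 1)           # i.e. a^5 b, as computed in the text
print("a^2 b a b^3 a^5 a^7 b^-1 = a^5 b")
```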

We see that symmetry groups of regular polygons are dihedral groups. We write D₂ₙ for a dihedral group of order 2n. (Warning: some authors write Dₙ for a dihedral group of order 2n.) Henceforward, we write D₂ₙ instead of Sym F (F being a regular polygon with n sides). The ambiguity in "D₂ₙ" (whether it designates an arbitrary dihedral group or the particular dihedral group Sym F) is harmless.

Some people use Definition 14.7 only when n ≥ 3. They do not consider D₄ as a dihedral group. This is consistent with the fact that D₄ is not the symmetry group of any regular polygon (see however Ex. 10). But then they have to formulate the following theorem of Leonardo da Vinci (yes, of Leonardo da Vinci (1452-1519)) less beautifully.

14.8 Theorem: A finite subgroup of Isom E is either a cyclic group or a dihedral group.

This theorem will not be used in the sequel and its proof is left to the
reader.

Exercises

1. Let σ be an isometry and F₁, F₂ nonempty subsets of E. Show that (F₁ ∪ F₂)σ = F₁σ ∪ F₂σ and (F₁ ∩ F₂)σ = F₁σ ∩ F₂σ. Generalize to arbitrary unions and intersections.

2. Let σ be an isometry and P,Q two distinct points in E. Show that the image under σ of the line PQ is the line PσQσ and the image of the segment PQ is the segment PσQσ. (Hint: the segment PQ = {R ∈ E: d(P,R) + d(R,Q) = d(P,Q)}.)

3. Let σ be an isometry and R the midpoint of PQ. Show that Rσ is the midpoint of PσQσ.

4. Let σ be an isometry and PQR a triangle (i.e., the union of the segments PQ, QR, PR). Show that (PQR)σ = PσQσRσ. (By the side-side-side condition, the triangles PQR and PσQσRσ are congruent, hence the angles of PQR and PσQσRσ are equal: isometries preserve angles. In particular, isometries preserve perpendicularity and so also parallelism.)

5. Let P₁,P₂,...,Pₙ be the vertices of a regular n-gon. If n happens to be odd, let Qᵢ be the midpoint of the side P_{[(n−1)/2]+i}P_{[(n+1)/2]+i} (i = 1,2,...,n). Prove that the center C of the regular n-gon is uniquely determined as the midpoint of PᵢP_{i+(n/2)} if n is even and as the midpoint of PᵢQᵢ if n is odd. (As the radius of a circumscribing circle is equal to d(Pᵢ,C), this proves that a circumscribing circle is completely determined by the vertices. Hence there is a unique circumscribing circle of a regular n-gon.)

6. Let σ be an isometry and M a regular n-gon. Let C be the center of the regular n-gon. Prove the following assertions.
(a) Mσ is a regular n-gon with center Cσ.
(b) If σ is a symmetry of M, then Cσ = C.
(c) If σ is a symmetry of M, then {P₁σ,P₂σ,...,Pₙσ} = {P₁,P₂,...,Pₙ}.
(Under a symmetry of M, a vertex is mapped to a vertex. Hint: A point P on M is a vertex if and only if d(P,C) = radius of the circumscribing circle.)

7. Let m be a real number. Prove that {τ_{a,am}: a ∈ ℝ} is a subgroup of Isom E without appealing to Theorem 14.4.

8. Let P be a point and m a line. Find all isometries that fix P, all isometries that fix m and all isometries that fix m pointwise. Show directly that these three sets are subgroups of Isom E.

9. Let F be a nonempty subset of E. Is {σ ∈ Isom E: Fσ ⊆ F} necessarily a subgroup of Isom E?

10. Find the symmetry group of a rectangle that is not a square.

11. With the notation of Definition 14.7, what is |D₂ₙ : ⟨a⟩|?

12. Prove Theorem 14.8.

13. Let α: ℤ → ℤ, x ↦ x + 1 and β: ℤ → ℤ, x ↦ −x. Prove the following assertions.

(a) α, β ∈ S_ℤ.
(b) |xα − yα| = |x − y| = |xβ − yβ|. (So α and β preserve distance in ℤ. For this reason, they are said to be isometries of ℤ.)
(c) o(α) = ∞ and o(β) = 2.

142
(d) βα = α⁻¹β. (Thus α, β satisfy the conditions on a,b in Definition 14.7, except n is replaced by ∞ here. A dihedral group of infinite order is a group D having elements a,b such that o(a) = ∞, o(b) = 2, ba = a⁻¹b and

D = {aᵏbʳ: k ∈ ℤ, r = 0,1}.

The group ⟨α,β⟩ generated by α, β is an example of a dihedral group of infinite order.)

14. Prove that a group generated by two distinct elements a,b such that
o(a) = 2 = o(b) is a dihedral group (of finite or infinite order).

15. Let n be any natural number or ∞. Find a group G and a,c ∈ G such that o(a) = 2 = o(c) and o(ac) = n. (So o(ac) cannot be determined from o(a) and o(c) alone.)

§15
Symmetric Groups

For any nonempty set X, the set S_X of all one-to-one mappings from X onto X is a group under the composition of functions (Example 7.1(d)). In particular, choosing X to be the set {1,2,...,n} of the first n natural numbers, we get a group S_{1,2,...,n}. We abbreviate this group as Sₙ.

15.1 Definition: Let n ∈ ℕ. The group of all one-to-one mappings from {1,2,...,n} onto {1,2,...,n} is called the symmetric group (on n letters) and is written Sₙ. The elements of Sₙ are called permutations (of 1,2,...,n).

The reader should not confuse the symmetric group with the symmetry
group of a figure in the Euclidean plane.

15.2 Theorem: Sn is a group of order n!.

Proof: Let σ ∈ Sₙ be a permutation of 1,2,...,n. Then 1σ is one of the numbers 1,2,...,n. Since σ is one-to-one, 1σ ≠ 2σ, so 2σ is one of the remaining n − 1 numbers among 1,2,...,n after 1σ has been determined. Since σ is one-to-one, 1σ ≠ 3σ and 2σ ≠ 3σ, so 3σ is one of the remaining n − 2 numbers among 1,2,...,n after 1σ and 2σ have been determined. Proceeding in this way, we see that, for any k = 1,2,...,n, the number kσ must be one of the numbers among 1,2,...,n which are distinct from 1σ, 2σ, ..., (k−1)σ. Hence there are n choices for 1σ; and n − 1 choices for 2σ; ...; and n − (k−1) choices for kσ; ...; and n − (n−1) choices for nσ; and all these choices give a permutation of 1,2,...,n. Therefore there are

n·(n−1)·(n−2)· ... ·2·1 = n!

permutations in Sₙ.
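For small n, Theorem 15.2 can be confirmed by brute force. The short check below (my own code) lists every arrangement of {1,...,n} and compares the count with n!.

```python
# |S_n| = n!: enumerate all permutations of {1,...,n} and count them.
import itertools
import math

for n in range(1, 7):
    count = sum(1 for _ in itertools.permutations(range(1, n + 1)))
    assert count == math.factorial(n)
print("|S_n| = n! confirmed for n = 1,...,6")
```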

We introduce a notation for permutations. Let n ∈ ℕ and σ ∈ Sₙ. Then σ is a mapping σ: {1,2,...,n} → {1,2,...,n} and can be exhibited by associating any number in {1,2,...,n} with its image by an arrow. Thus the σ ∈ S₅ for which 1σ = 3, 2σ = 1, 3σ = 2, 4σ = 5, 5σ = 4 can be displayed as

1 → 3
2 → 1
3 → 2
4 → 5
5 → 4,

or, in order to save space, as

1 2 3 4 5
↓ ↓ ↓ ↓ ↓
3 1 2 5 4.

We simplify this notation further by deleting the arrows and enclosing the two rows of numbers in parentheses. Thus we arrive at

⎛1 2 3 4 5⎞
⎝3 1 2 5 4⎠

for our σ. In general, we write any σ ∈ Sₙ as

⎛. . . a . . .⎞
⎝. . . aσ . . .⎠.

In this notation, there are two rows of n elements and n columns of two elements. The rows consist of the numbers 1,2,...,n. The image under σ of any a ∈ {1,2,...,n} is written just below a in the second row. This notation is due to A. Cauchy (1789-1857).

The order of the columns is immaterial in this notation. For example

⎛1 2 3 4 5 6⎞  ⎛2 3 5 4 6 1⎞  ⎛5 3 2 1 6 4⎞
⎝6 1 4 2 3 5⎠, ⎝1 4 3 2 5 6⎠, ⎝3 4 1 6 5 2⎠

are all equal permutations in S₆.

The identity permutation ι ∈ Sₙ maps any a to a, so the rows will be identical. Thus

ι = ⎛1 2 3 . . . n⎞
    ⎝1 2 3 . . . n⎠.

The inverse of any σ ∈ Sₙ is found easily. By definition, σ⁻¹ is the function (permutation) that maps aσ to a, for all a ∈ {1,2,...,n}. Let σ be

⎛. . . a . . .⎞
⎝. . . aσ . . .⎠.

Then, under σ⁻¹, any element in the second row is mapped to the number just above it. σ⁻¹ is therefore obtained by interchanging the rows of σ. For instance, in S₇, we have

⎛1 2 3 4 5 6 7⎞⁻¹   ⎛7 6 3 5 4 1 2⎞
⎝7 6 3 5 4 1 2⎠   = ⎝1 2 3 4 5 6 7⎠,

which may also be written as

⎛1 2 3 4 5 6 7⎞
⎝6 7 3 5 4 2 1⎠.

Two permutations in Sₙ, say σ and τ, are multiplied as follows. We have

σ = ⎛. . . a . . .⎞   and   τ = ⎛. . . b . . .⎞
    ⎝. . . aσ . . .⎠           ⎝. . . bτ . . .⎠.

What is στ? By definition, στ is the permutation that maps a to (aσ)τ, for all a in {1,2,...,n}. To evaluate (aσ)τ, we locate a in the first row of σ, then read the number below it, which is aσ, and locate this aσ in the first row of τ. The number below it is (aσ)τ. We do this for a = 1,2,...,n and, in each case, write the number we obtain below a. Enclosing this configuration in parentheses, we get στ in double-row notation. Here is an example.

(1 2 3 4 5)(1 2 3 4 5)   (1 2 3 4 5)
(5 3 2 1 4)(2 1 3 5 4) = (? ? ? ? ?)

In the first permutation, below 1, we see 5 and in the second permuta-
tion, below 5, we see 4. So, in the product, below 1, we write 4. Then, in
the first permutation, below 2, we see 3 and in the second permutation,
below 3, we see 5. So, in the product, below 2, we write 5:

(1 2 3 4 5)(1 2 3 4 5)   (1 2 3 4 5)
(5 3 2 1 4)(2 1 3 5 4) = (4 5 ? ? ?).

The remaining entries are found by the same method and we get

(1 2 3 4 5)(1 2 3 4 5)   (1 2 3 4 5)
(5 3 2 1 4)(2 1 3 5 4) = (4 5 1 2 3).
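The left-to-right rule just illustrated is easy to mechanize. Below is a short Python sketch (the list representation and the helper name are ours, not the book's): a permutation of {1,2, . . . ,n} is stored as the list of its second row, and the product is computed by sending a to (aσ)τ.

```python
def compose(sigma, tau):
    """Product sigma*tau under the book's convention: first sigma, then tau.
    A permutation of {1,...,n} is stored as a list p with p[a-1] = image of a."""
    return [tau[sigma[a - 1] - 1] for a in range(1, len(sigma) + 1)]

sigma = [5, 3, 2, 1, 4]  # 1->5, 2->3, 3->2, 4->1, 5->4
tau = [2, 1, 5, 3, 4]    # 1->2, 2->1, 3->5, 4->3, 5->4
print(compose(sigma, tau))  # [4, 5, 1, 2, 3], the second row of the product above
```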


The product of three or more permutations is evaluated in the same
way:

(1 2 3 4 5 6)(1 2 3 4 5 6)(1 2 3 4 5 6)   (1 2 3 4 5 6)
(6 1 2 3 4 5)(2 4 5 3 1 6)(6 4 2 5 1 3) = (3 4 5 1 2 6).

We now introduce a more efficient notation for permutations. The
permutation

    (1 2 3 4 5 6 7)
σ = (4 5 1 3 6 2 7)

in S7 is a mapping given explicitly as

1 → 4
2 → 5
3 → 1
4 → 3
5 → 6
6 → 2
7 → 7.

A more compact description of σ can be given as

1 → 4 → 3 → 1;  2 → 5 → 6 → 2;  7 → 7,

or by closing each of these chains into a "cycle" in which the last
number points back to the first.

We drop the arrows and enclose the numbers in a "cycle" within
parentheses, in the order indicated by the arrows in a "cycle". Thus we
get

(143)(256)(7)

after juxtaposing the parentheses. The meaning of this symbolism is as
follows. Each number a is mapped, under σ, to the number that follows it
in the parenthetical expression ("cycle") which contains a. If a happens
to be the last entry in a "cycle", then the first number in that "cycle" is
considered to follow a. For example,

(15234)(6897) ∈ S9

is the permutation by which 1 is mapped to 5, 5 to 2, 2 to 3, 3 to 4, 4 to
1, 6 to 8, 8 to 9, 9 to 7, 7 to 6. Thus

                (1 5 2 3 4 6 8 9 7)   (1 2 3 4 5 6 7 8 9)
(15234)(6897) = (5 2 3 4 1 8 9 7 6) = (5 3 4 1 2 8 6 9 7).

Here (15234) can also be written as (23415) or as (34152), (41523),


(52341). Similar remarks are valid for (6897).

An arbitrary permutation σ ∈ Sn is written in cycle notation as follows.
We open a parenthesis and write down an arbitrary number a ∈ {1,2, . . . ,n}.
If aσ = a, we close the parenthesis and obtain the expression (a). If
aσ = b ≠ a, we write b after a. Now we have (ab . Here bσ ≠ b, because
b ≠ a (σ is one-to-one). If bσ = a, we close the parenthesis and obtain the
expression (ab). If bσ = c ≠ a, we write c after b. Now we have (abc .
Here cσ ∉ {b,c}, because σ is one-to-one. We evaluate cσ. If cσ = a, we
close the parenthesis and obtain the expression (abc). If cσ = d ≠ a, we
repeat our procedure, each time writing down the image of a number
after that number. Since we have n numbers at our disposal, we meet,
after at most n steps, one of the numbers for a second time. If this
happens when we have the expression

(abc. . . g

where a,b,c,. . . ,g are all distinct, but gσ is one of them, we conclude that
gσ ∉ {b,c,. . . ,g}, since b = aσ, c = bσ, . . . and σ is one-to-one. Hence gσ = a.
We close the parenthesis and obtain the expression (abc. . . g).

If a,b,c,. . . ,g exhaust all the numbers 1,2, . . . ,n, we are done. Otherwise,
we select an arbitrary number from 1,2, . . . ,n that is distinct from a,b,c,. . .
,g. Let us call it h. We open a new parenthesis starting with h and repeat
our procedure. After finitely many steps, we get an expression of the
form
(abc. . . g)(h. . . k). . . (t. . . x),

where {a,b,c, . . . ,g,h, . . . ,k, . . . ,t, . . . x} = {1,2, . . . ,n}. We call each one of the
expressions (abc. . . g),(h. . . k), . . . ,(t. . . x) a "cycle".

We will presently give a rigorous definition of a cycle and prove that


every permutation can be written as a product of disjoint cycles. But let
us consider some examples first.

Let us write

(1 2 3 4 5 6 7 8 9)
(2 6 7 1 5 4 9 8 3)

in cycle notation. This is done in the
following steps, which are carried out mentally at once in practice.

(1 (12 (126 (1264


(1264) (1264)(3 (1264)(37
(1264)(379 (1264)(379) (1264)(379)(5
(1264)(379)(5) (1264)(379)(5)(8 (1264)(379)(5)(8)
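The steps above can be sketched as a short Python routine (the list representation and the helper name are ours, not the book's): follow images until the starting number recurs, then open a new cycle with an unused number.

```python
def cycles(sigma):
    """Disjoint cycles of a permutation given as a list with
    sigma[a-1] = image of a, found by the procedure described above."""
    n = len(sigma)
    seen, result = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cycle, a = [], start
        while a not in seen:  # follow images until we return to start
            seen.add(a)
            cycle.append(a)
            a = sigma[a - 1]
        result.append(tuple(cycle))
    return result

# The permutation 1->2, 2->6, 3->7, 4->1, 5->5, 6->4, 7->9, 8->8, 9->3:
print(cycles([2, 6, 7, 1, 5, 4, 9, 8, 3]))  # [(1, 2, 6, 4), (3, 7, 9), (5,), (8,)]
```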

In this notation, the order of the cycles is not important. We can write
the permutation above also as

(5)(379)(8)(1264) or as (379)(5)(1264)(8).

Besides, one can start a cycle with any number in that cycle. Our permu-
tation can thus be written as

(5)(793)(8)(6412) or as (937)(5)(2641)(8).

The identity permutation is given by (1)(2)(3). . . (n). For obvious reasons,
we prefer to write ι instead of (1)(2)(3). . . (n) for the identity permuta-
tion.

The inverse of a permutation is found easily. Let σ ∈ Sn, a,b ∈ {1,2, . . . ,n}.
In the cycles of σ, the number aσ follows a. By definition, bσ⁻¹ is that
number a for which aσ = b. Hence bσ⁻¹ is the number which is followed
by b. Stated otherwise, bσ⁻¹ is the number that comes just before b in
the cycles of σ. So σ⁻¹ consists of the same cycles, but with the entries
written in the reverse order. For example,

[(12)(357)(64)]⁻¹ = (21)(753)(46),  [(326)(15)(4)]⁻¹ = (623)(51)(4).
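This rule can be cross-checked numerically. The sketch below (our own helper, list representation with p[a-1] = image of a) inverts a permutation directly from the defining property that bσ⁻¹ = a exactly when aσ = b.

```python
def inverse(sigma):
    """Inverse of a permutation in list form (sigma[a-1] = image of a)."""
    inv = [0] * len(sigma)
    for a, b in enumerate(sigma, start=1):
        inv[b - 1] = a  # b is sent back to a, since a is sent to b
    return inv

# (12)(357)(64) in S7 in list form: 1->2, 2->1, 3->5, 4->6, 5->7, 6->4, 7->3
print(inverse([2, 1, 5, 6, 7, 4, 3]))  # [2, 1, 7, 6, 3, 4, 5]
# This is (21)(753)(46): the same cycles with the entries reversed.
```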

Two permutations in Sn, say σ and τ, are multiplied as follows. We have
σ = (. . . a aσ . . . ) and τ = (. . . a aτ . . . ). What is στ? By definition, στ is the
permutation that maps a to (aσ)τ, for all a ∈ {1,2, . . . ,n}. To evaluate
(aσ)τ, we locate a in the cycle of σ containing a, then read the number
that follows it, which is aσ, and locate this aσ in the cycle of τ. The
number that follows it is (aσ)τ. Opening a parenthesis with an arbitrary
number a, we find (aσ)τ = a(στ) in this way, and write it after a. So we
get an expression (ab , say. We find b(στ) = c. We write (abc . We repeat
this process until we get a. Then we close our cycle. At this step, we
have (abc. . . g), say. If there are numbers among 1,2, . . . ,n not used up in
this cycle, we select an arbitrary one of them and obtain a second cycle
starting with that number. We continue in this fashion until all the
numbers 1,2, . . . ,n are used up.

Let us compute the product (1256)(347).(157)(24)(3)(6) in S7. We start


with the smallest number 1, for example. We write (1 . Now 1 is
followed by 2 in the first permutation and 2 is followed by 4 in the
second permutation. Thus we get (14 . Now 4 is followed by 7 in the
first permutation and 7 is followed by 1 in the second permutation. We
close our first cycle. We have (14). We open a new cycle with 2, for
example. Now 2 is followed by 5 in the first permutation and 5 is
followed by 7 in the second permutation. We have (14)(27 . Continuing

in this way, we find (1256)(347).(157)(24)(3)(6) = (14)(273)(56).
Another example:
(152)(3476).(1724)(563) = (1654273).

We make a convention. Whenever there appears a cycle consisting of a


single number, we suppress it. Hence, whenever a number j does not
appear in the cycles of a permutation σ, we understand jσ = j. With this
convention, we write shortly
(123)(47) for (123)(47)(5)(6) in S7,
(245)(3876) for (245)(3876)(1) in S8.
This convention simplifies multiplication: if a number does not appear in
the cycles of one or more of the factors, it is mapped to itself by the
permutations in question. For example,
(123).(12) = (23)
(254).(12)(34) = (25341).

The way we multiply permutations, either in double row or in cycle


notation, reflects the fact that we write functions to the right of the
elements. If we had written functions on the left, then στ would mean:
first τ, then σ. A product would be evaluated in double row notation by
reading the permutations from right to left. In the cycle notation, we
would be reading the cycles from right to left, but the numbers in the
cycles from left to right. Writing our functions on the right, we avoid
backward or inconsistent reading. We read everything in the correct
order.

The alert reader will have noticed that the same symbol in cycle
notation stands for many different permutations. Thus (123)(45) stands
for (123)(45) in S5, for (123)(45)(6) in S6, for (123)(45)(6)(7) in S7, etc.
So an isolated symbol (123)(45) is ambiguous. Also, our rule of thumb
for finding inverses in the cycle notation works only when the cycles
are disjoint. It is time that we discuss these points rigorously.

15.3 Definition: Let σ ∈ Sn and m ∈ {1,2, . . . ,n}. When mσ = m, we say
that m is fixed by σ or that σ fixes m. When mσ ≠ m, then m is said to be
moved by σ or σ is said to move m.

15.4 Definition: Let σ, τ ∈ Sn. If the set of numbers moved by σ and the
set of numbers moved by τ are disjoint, then σ and τ are called disjoint
permutations. We also say σ is disjoint from τ in this case.

15.5 Lemma: Let σ, τ ∈ Sn and k ∈ {1,2, . . . ,n}. Assume σ, τ are disjoint.

(1) If k is moved by σ, then kσ is also moved by σ.
(2) If k is moved by σ and fixed by τ, then kσ is fixed by τ.

Proof: (1) If kσ were fixed by σ, so that (kσ)σ = kσ, we would apply σ⁻¹ to
both sides of this equation and get kσ = (kσ)σσ⁻¹ = kσσ⁻¹ = k, contrary to
the hypothesis that k is moved by σ. So kσ is moved by σ.

(2) kσ is moved by σ according to part (1). If kσ were moved by τ, then
kσ would be moved both by σ and by τ, contrary to the hypothesis that σ
and τ are disjoint permutations. Thus kσ is fixed by τ.

We can now prove that disjoint permutations always commute.

15.6 Theorem: If σ, τ ∈ Sn are disjoint permutations, then στ = τσ.

Proof: We must show m(στ) = m(τσ) for all m ∈ {1,2, . . . ,n}. Since σ and τ
are disjoint, for each m ∈ {1,2, . . . ,n}, there are three possibilities:
I. m is moved by σ, fixed by τ.
II. m is fixed by σ, moved by τ.
III. m is fixed by σ, fixed by τ.
In case I, mσ is fixed by τ by Lemma 15.5(2) (with m, σ, τ in place of k, σ, τ),
hence (mσ)τ = mσ and
m(στ) = (mσ)τ = mσ,
m(τσ) = (mτ)σ = mσ (as m is fixed by τ),
so m(στ) = m(τσ).

In case II, mτ is fixed by σ by Lemma 15.5(2) (with m, τ, σ in place of k, σ, τ),
hence (mτ)σ = mτ and
m(στ) = (mσ)τ = mτ (as m is fixed by σ),
m(τσ) = (mτ)σ = mτ,
so m(στ) = m(τσ).

In case III, we have
m(στ) = (mσ)τ = mτ = m,
m(τσ) = (mτ)σ = mσ = m,
so m(στ) = m(τσ).

In all three cases, we have m(στ) = m(τσ). Since this holds for all m in the
set {1,2, . . . ,n}, we conclude στ = τσ.
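Theorem 15.6 is easy to test on an example. The sketch below (list representation and helper name are ours, not the book's) checks that the disjoint permutations (1432) and (567) in S7 commute, while two non-disjoint transpositions need not.

```python
def compose(sigma, tau):
    """First sigma, then tau (the text's convention)."""
    return [tau[b - 1] for b in sigma]

rho = [4, 1, 2, 3, 5, 6, 7]  # (1432): 1->4, 4->3, 3->2, 2->1
tau = [1, 2, 3, 4, 6, 7, 5]  # (567):  5->6, 6->7, 7->5
print(compose(rho, tau) == compose(tau, rho))  # True: disjoint, so they commute

a = [2, 1, 3]  # (12)
b = [1, 3, 2]  # (23), not disjoint from (12)
print(compose(a, b) == compose(b, a))  # False: non-disjoint, no commuting
```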

In order to prepare our way for a formal definition of cycle, let us
examine the permutation

(1 2 3 4 5 6 7 8  9 10)
(4 1 2 3 6 7 5 8 10  9)

in S10. Informally, we write this as (1432)(567)(8)(9,10) and call (1432),
(567), (8), (9,10) "cycles" (we use a comma to avoid confusion when we
have a number with more than one digit). The idea is to consider
(1432) etc. as a permutation by itself. Then

(1432)(567)(8)(9,10)

is a product of four permutations. We observe that {1,4,3,2}, {5,6,7}, {8},
{9,10} are pairwise disjoint subsets of {1,2,3,4,5,6,7,8,9,10} and yield a
partition of {1,2,3,4,5,6,7,8,9,10}. So there is an equivalence relation on
{1,2,3,4,5,6,7,8,9,10} with these subsets as equivalence classes (Theorem
2.5). We want to find this equivalence relation.

15.7 Lemma: Let σ be a permutation in Sn. We put, for a,b ∈ {1,2, . . . ,n},

a ~ b if and only if there is an integer k such that aσᵏ = b.

Then ~ is an equivalence relation on {1,2, . . . ,n}.

Proof: (i) For all a ∈ {1,2, . . . ,n}, we have aσ⁰ = a, with 0 ∈ ℤ. So a ~ a for
all a and ~ is reflexive.

(ii) If a ~ b, then aσᵏ = b for some k ∈ ℤ, so bσ⁻ᵏ = a with −k ∈ ℤ and
therefore b ~ a. So ~ is symmetric.

(iii) If a ~ b and b ~ c, then aσᵏ = b and bσᵐ = c for some k,m ∈ ℤ, then
aσᵏ⁺ᵐ = aσᵏσᵐ = bσᵐ = c, with k + m ∈ ℤ and therefore a ~ c. So ~ is
transitive.

The reader will check easily that {1,4,3,2}, {5,6,7}, {8}, {9,10} are the
equivalence classes of ~ in {1,2,3,4,5,6,7,8,9,10} if σ denotes the
permutation

(1 2 3 4 5 6 7 8  9 10)
(4 1 2 3 6 7 5 8 10  9)

we treated above. So the equivalence relation of Lemma 15.7 seems
promising.

15.8 Lemma: Let σ ∈ Sn and let A ⊆ {1,2, . . . ,n} be an equivalence class
under the equivalence relation ~ of Lemma 15.7. We define σ_A by

b σ_A = { bσ if b ∈ A
        { b  if b ∉ A

for b ∈ {1,2, . . . ,n}. Then σ_A is a permutation in Sn and, whenever x and y
are moved by σ_A, there is an integer k such that x(σ_A)ᵏ = y.

Proof: By definition of ~, there holds x ~ xσ for all x ∈ {1,2, . . . ,n} and xσ
belongs to the equivalence class of x. So the equivalence class of x and
the equivalence class of xσ are identical. Hence x ∈ A if and only if
xσ ∈ A.

Using this remark, we prove, for any n ∈ ℕ, that xσⁿ = x(σ_A)ⁿ for all x ∈ A.
This is true when n = 1. If it is true for n = k − 1, we have, for any x ∈ A,

x(σ_A)ᵏ = (x(σ_A)^(k−1))σ_A = (xσ^(k−1))σ_A = (xσ^(k−1))σ = xσᵏ,

and it is true for n = k. Thus it is true for all n ∈ ℕ.

In particular, it is true for m = o(σ) and

x(σ_A)ᵐ = { xσᵐ if x ∈ A
          { x   if x ∉ A
        = x,

hence (σ_A)^(m−1) σ_A = ι = σ_A (σ_A)^(m−1). By Theorem 3.17(2), σ_A is one-to-one
and onto. Thus σ_A ∈ Sn.

Finally, if x and y are moved by σ_A, then necessarily x,y ∈ A in view of
the definition of σ_A, so there is an integer k with xσᵏ = y and thus
x(σ_A)ᵏ = y by what we proved above (since x ∈ A). This completes the
proof.

15.9 Theorem: Let σ ∈ Sn and let A1,A2, . . . ,Ah be the equivalence
classes of {1,2, . . . ,n} under the equivalence relation ~ in Lemma 15.7.
Let σ_A1, σ_A2, . . . , σ_Ah be the associated permutations as in Lemma 15.8.

(1) σ_A1, σ_A2, . . . , σ_Ah are pairwise commuting permutations in Sn.

(2) σ = σ_A1 σ_A2 . . . σ_Ah.

Proof: (1) The equivalence classes A1,A2, . . . ,Ah are pairwise disjoint
sets. Now σ_Ai either moves no number at all (this happens if and only if
Ai has exactly one element), or moves only the numbers in Ai. There-
fore, the numbers moved by σ_Ai and σ_Aj make up disjoint sets whenever
i ≠ j. So the permutations σ_A1, σ_A2, . . . , σ_Ah are pairwise disjoint permuta-
tions (Definition 15.4) and they commute by Theorem 15.6.

(2) We have σ_A1 σ_A2 . . . σ_Ah = σ_A1´ σ_A2´ . . . σ_Ah´ for any arrangement 1´,2´, . . . ,h´
of the numbers 1,2, . . . ,h (Lemma 8.12). We want to show
bσ = b σ_A1 σ_A2 . . . σ_Ah for all b ∈ {1,2, . . . ,n} = A1 ∪ A2 ∪ . . . ∪ Ah. So let b be
in {1,2, . . . ,n}. Renumbering A1,A2, . . . ,Ah if need be, we may assume,
without loss of generality, that b ∈ Ah. Then b ∉ A1, b ∉ A2, . . . , b ∉ A_(h−1)
and thus b σ_A1 = b σ_A2 = . . . = b σ_A(h−1) = b by the definition of these
functions. Thus b σ_A1 σ_A2 . . . σ_Ah = b σ_Ah and the proof will be complete
when we show bσ = b σ_Ah. But this follows immediately from the
definition of σ_Ah since b ∈ Ah.

In our example, the associated permutations are

(1432)(5)(6)(7)(8)(9)(10) = (1432)
(567)(1)(2)(3)(4)(8)(9)(10) = (567)
(8)(1)(2)(3)(4)(5)(6)(7)(9)(10) = (8) (= ι)
(9,10)(1)(2)(3)(4)(5)(6)(7)(8) = (9,10).
In view of this, we define cycles as the associated permutations. Cycles
will be distinguished from other permutations by the property stated in
Lemma 15.8.

15.10 Definition: A permutation σ ∈ Sn is called a cycle if, for all x,y in
{1,2, . . . ,n} that are moved by σ, there is an integer k such that xσᵏ = y.

The identity permutation ι is vacuously a cycle. Lemma 15.8 states that
σ_A is a cycle when A is an equivalence class under the equivalence
relation ~ in Lemma 15.7. Since these cycles are disjoint, we may
reformulate Theorem 15.9 as follows.

15.9 Theorem: Every permutation σ in Sn can be written as a product
of disjoint cycles. These cycles are completely determined by σ, and they
commute in pairs.

Let σ be a cycle in Sn distinct from ι. Let a1,a2, . . . ,am be the numbers
moved by σ. Since σ is one-to-one, a1σ, a2σ, . . . , amσ are all distinct and
we may assume the numbering so chosen that

a1σ = a2,  a2σ = a3,  . . . ,  a_(m−1)σ = am,  amσ = a1.

In this case, we write (a1a2. . . am) for σ. Then m is called the length of the
cycle (a1a2. . . am) and (a1a2. . . am) = σ is called an m-cycle. The identity
permutation ι is called a 1-cycle.

With this notation, we have

a1σ = a2 ≠ a1,  a1σ² = a3 ≠ a1,  . . . ,  a1σ^(m−1) = am ≠ a1

and so σ ≠ ι, σ² ≠ ι, . . . , σ^(m−1) ≠ ι. On the other hand,

a1σᵐ = a1  and  akσᵐ = a1σ^(k−1)σᵐ = a1σᵐσ^(k−1) = a1σ^(k−1) = ak

for all k = 1,2, . . . ,m. So σᵐ fixes a1,a2, . . . ,am. But σ fixes the numbers
among 1,2, . . . ,n which are distinct from a1,a2, . . . ,am, and then σᵐ fixes
them, too. Hence bσᵐ = b for all b ∈ {1,2, . . . ,n}. Thus σᵐ = ι and m is the
smallest natural number such that σᵐ = ι. Using Lemma 11.4, we obtain
the following theorem, which is also true when m = 1.

15.11 Theorem: The order of a cycle is its length. In other words, if σ =
(a1a2. . . am), then o(σ) = m.

15.12 Remarks: (1) The inverse of a cycle σ = (a1a2. . . am) ∈ Sn, for
which a1σ = a2, a2σ = a3, . . . , a_(m−1)σ = am, amσ = a1 and which fixes any
other number in {1,2, . . . ,n} (if any), is by definition the mapping whose
effect on a1,a2, . . . ,am is given by amσ⁻¹ = a_(m−1), . . . , a3σ⁻¹ = a2, a2σ⁻¹ = a1,
a1σ⁻¹ = am and which fixes the other numbers (if any). Thus σ⁻¹ is the
cycle (am a_(m−1) . . . a2 a1).

(2) Let σ ∈ Sn be written as σ = σ_A1 σ_A2 . . . σ_Ah with the notation of Theorem
15.9. A cycle σ_Ai is the identity if there is only one number in Ai. Then
the cycle σ_Ai may be deleted from the product.

(3) If σ = σ_A1 σ_A2 . . . σ_Ah is the representation of σ as a product of disjoint
cycles, then σ⁻¹ = σ_Ah⁻¹ . . . σ_A2⁻¹ σ_A1⁻¹ and so σ⁻¹ = σ_A1⁻¹ σ_A2⁻¹ . . . σ_Ah⁻¹. But this is
true only when the σ_Ai are disjoint. In any case, it is safer to reverse the
order of the cycles as well as the ordering of the numbers in each cycle
when we want to find the inverse of a product of cycles, as this is valid
also in the case the cycles are not pairwise disjoint and is a more
consistent procedure: you reverse everything. For example,

[(15)(243)(687)]⁻¹ = (786)(342)(51).

(4) The ambiguity in cycle notation is harmless, as it will be either clear


from the context which symmetric group we are working in, or the
results will be independent of the symmetric group.

In the rest of this paragraph, we determine the order of a permutation


written as a product of disjoint cycles. We start with a general lemma.

15.13 Lemma: Let G be a group and a,b ∈ G. Suppose ab = ba and
assume that o(a) and o(b) are finite. Suppose further that ⟨a⟩ ∩ ⟨b⟩ =
{1}. Then o(ab) is finite. In fact, o(ab) is the least common multiple of
o(a) and o(b): we have o(ab) = [o(a),o(b)].

Proof: First we show that (ab)ᵏ = 1 if and only if o(a)|k and o(b)|k
(where k ∈ ℤ). Indeed, if o(a)|k and o(b)|k, then aᵏ = 1 and bᵏ = 1
(Lemma 11.6) and so (ab)ᵏ = aᵏbᵏ = 1·1 = 1 (Lemma 8.14(3); here we use
ab = ba). Conversely, if (ab)ᵏ = 1, then aᵏbᵏ = 1, so aᵏ = b⁻ᵏ ∈ ⟨a⟩ ∩ ⟨b⟩ =
{1}. So we have aᵏ = 1 = b⁻ᵏ, and aᵏ = 1 = bᵏ, and thus o(a)|k and o(b)|k.
Therefore (ab)ᵏ = 1 if and only if o(a)|k and o(b)|k.

Then, by Lemma 11.4,

o(ab) = smallest number in {k ∈ ℕ : (ab)ᵏ = 1},
            provided this set is not empty,
      = smallest number in {k ∈ ℕ : o(a)|k and o(b)|k},
            provided this set is not empty,
      = the least common multiple of o(a) and o(b), as
            the set is not empty,
      = [o(a),o(b)].

Generally speaking, we cannot determine the order of ab from o(a)
and o(b) alone. o(ab) depends also on the role the elements a,b play in
the group. (See §14, Ex. 15.) Lemma 15.13 is one of the rare situations
where o(ab) is determined in terms of o(a) and o(b).

Lemma 15.13 will be used to find the order of a product of disjoint per-
mutations. We need the following result.

15.14 Lemma: (1) If ρ1 and σ, as well as ρ2 and σ, are disjoint permuta-
tions in Sn, then ρ1ρ2 and σ are disjoint.

(2) If ρ1, ρ2, . . . , ρm are disjoint from σ, then ρ1ρ2 . . . ρm and σ are disjoint.

(3) If ρ and σ are disjoint, then ρ⁻¹ and σ are disjoint.

(4) If ρ and σ are disjoint, then ρᵐ and σ are disjoint for all m ∈ ℤ.

(5) If ρ and σ are disjoint, then ρᵐ and σʳ are disjoint for all m,r ∈ ℤ.

Proof: (1) By hypothesis, any k ∈ {1,2, . . . ,n} that is moved by σ is fixed
by ρ1 and ρ2. So kσ ≠ k implies kρ1 = k and kρ2 = k. So kσ ≠ k implies
k(ρ1ρ2) = (kρ1)ρ2 = kρ2 = k, and ρ1ρ2 fixes every number that σ moves.
Hence ρ1ρ2 and σ are disjoint. (The argument is valid also when ρ1ρ2 = ι.)

(2) This follows from (1) by induction on m. The details are left to the
reader.

(3) Let k ∈ {1,2, . . . ,n} be moved by σ. We wish to show that k is fixed by
ρ⁻¹. Since ρ and σ are disjoint, k is fixed by ρ. So kρ = k. Applying ρ⁻¹ to
both sides, we get (kρ)ρ⁻¹ = kρ⁻¹, hence k = kρ⁻¹ and k is fixed by ρ⁻¹.
Therefore ρ⁻¹ and σ are disjoint.

(4) Let m ∈ ℕ. Choosing ρ1, ρ2, . . . , ρm all equal to ρ in (2), we deduce that
ρᵐ and σ are disjoint. Now applying (3) with ρᵐ, σ in place of ρ, σ, we get
that ρ⁻ᵐ = (ρᵐ)⁻¹ is disjoint from σ, for any m ∈ ℕ. As ρ⁰ = ι is trivially dis-
joint from σ, we conclude that ρᵐ and σ are disjoint for all m ∈ ℤ.

(5) When ρ and σ are disjoint and m,r ∈ ℤ, then ρᵐ and σ are disjoint by
(4), and using (4) with r, σ, ρᵐ respectively in place of m, ρ, σ, we deduce
that σʳ and ρᵐ are disjoint. Hence ρᵐ and σʳ are disjoint for all m,r ∈ ℤ.

15.15 Theorem: Let σ and τ be disjoint permutations in Sn. Then
o(στ) = [o(σ),o(τ)].

Proof: We use Lemma 15.13. Since σ and τ are disjoint, στ = τσ by
Theorem 15.6. Also, o(σ) and o(τ) are finite since Sn is a finite group by
Theorem 15.2. We must also show that ⟨σ⟩ ∩ ⟨τ⟩ = {ι}. When we do this,
the hypotheses of Lemma 15.13 will be satisfied and it will yield o(στ) =
[o(σ),o(τ)]. So we show ⟨σ⟩ ∩ ⟨τ⟩ = {ι}.

Suppose ⟨σ⟩ ∩ ⟨τ⟩ ≠ {ι}. Then there is an α ∈ ⟨σ⟩ ∩ ⟨τ⟩ with α ≠ ι and α =
σᵐ = τʳ for some integers m,r. Since α ≠ ι, there is a j ∈ {1,2, . . . ,n} such
that jα ≠ j. So j is moved by σᵐ and also by τʳ. On the other hand, σᵐ and
τʳ are disjoint by Lemma 15.14(5) and there cannot be any number in
{1,2, . . . ,n} which is moved both by σᵐ and by τʳ. This is a contradiction.
Thus ⟨σ⟩ ∩ ⟨τ⟩ = {ι}. As remarked above, this completes the proof.

15.16 Theorem: Let σ1, σ2, . . . , σm be pairwise disjoint permutations in
Sn. Then o(σ1σ2. . . σm) = [o(σ1),o(σ2), . . . ,o(σm)].

Proof: By induction on m. The case m = 2 is treated in Theorem 15.15.
The inductive step is left to the reader.

15.17 Theorem: The order of σ ∈ Sn is the least common multiple of
the lengths of the disjoint cycles in the representation of σ as a product
of disjoint cycles.

Proof: The disjoint cycles are pairwise disjoint and the order of a cycle
is its length (Theorem 15.11). The claim follows now immediately from
Theorem 15.16.

For instance, (134)(275698) has order 6, (124)(3756) has order 12 and
(34)(79)(12586) has order 10.
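Theorem 15.17 translates directly into code. The sketch below (helper names and the list representation are ours; math.lcm needs Python 3.9+) finds the cycle lengths and takes their least common multiple.

```python
from math import lcm

def order(sigma):
    """Order of a permutation in list form: the lcm of the lengths
    of its disjoint cycles (Theorem 15.17)."""
    n, seen, lengths = len(sigma), set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        length, a = 0, start
        while a not in seen:
            seen.add(a)
            a = sigma[a - 1]
            length += 1
        lengths.append(length)
    return lcm(*lengths)

# (134)(275698) in S9: 1->3, 2->7, 3->4, 4->1, 5->6, 6->9, 7->5, 8->2, 9->8
print(order([3, 7, 4, 1, 6, 9, 5, 2, 8]))  # 6, the lcm of the lengths 3 and 6
```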

Exercises

1. Evaluate

(1 2 3 4 5)(1 2 3 4 5)      (1 2 3 4 5 6)(1 2 3 4 5 6)
(5 3 1 2 4)(2 1 3 5 4),     (6 1 2 3 4 5)(2 4 6 5 3 1)

and

(1 2 3 4)(1 2 3 4)⁻¹(1 2 3 4)
(4 1 2 3)(3 1 2 4)  (3 4 1 2).

2. Evaluate (1253)(24315), (1542)(376)(1754) and
(1243)(345)(265)(1452)(135)⁻¹(3246).

3. Write the permutations in Ex. 1 in cycle notation. Carry out the multi-
plication in cycle notation and compare the results.

4. Write the permutations in Ex. 2 in double row notation. Carry out the
multiplication in double row notation and compare the results.

5. Write all elements in S1,S2,S3,S4.

6. Construct multiplication tables of S1,S2,S3,S4.

7. Find the orders of all elements in S3 and S4.

8. Show that V4 := {ι,(12)(34),(13)(24),(14)(23)} is a subgroup of S4.

9. Show that D := {ι,(13),(24),(12)(34),(13)(24),(14)(23),(1234),(1432)} is
a subgroup of S4. Prove that it is a dihedral group in the sense of Defini-
tion 14.7.

10. Find all subgroups of S3 and S4.

11. Let H ≤ Sn. For a,b ∈ {1,2, . . . ,n}, put a ~H b if and only if there is a
σ ∈ H such that aσ = b. Show that ~H is an equivalence relation on
{1,2, . . . ,n}. (Lemma 15.7 is a special case when H = ⟨σ⟩.)

12. Let a1,a2, . . . ,am be pairwise commuting elements of finite order in a
group G such that ⟨ai⟩ ∩ ⟨aj⟩ = {1} whenever i ≠ j. Show that o(a1a2. . . am) =
[o(a1),o(a2), . . . ,o(am)]. This gives an alternative proof of Theorem 15.16.

13. For σ ∈ S4, we put σV4 := {σα : α ∈ V4} and V4σ := {ασ : α ∈ V4} (Ex. 8).
Find σV4 and V4σ when σ = ι, σ = (12), σ = (123), σ = (12)(34), σ = (1234).

14. For H ≤ S4, σ ∈ S4, we put σH := {σα : α ∈ H} and Hσ := {ασ : α ∈ H}.
Thus σH and Hσ are subsets of S4.
Let H1 = {ι,(13),(24),(13)(24)}. Check whether (12)H1 = H1(12),
(13)H1 = H1(13), (123)H1 = H1(123), (12)(34)H1 = H1(12)(34), (1234)H1 =
H1(1234).
Let H2 = {ι,(12),(34),(12)(34)}. Check whether (12)H2 = H2(12),
(13)H2 = H2(13), (123)H2 = H2(123), (12)(34)H2 = H2(12)(34), (1234)H2 =
H2(1234).
Compare to Ex. 13.

15. Show that, for any σ ∈ Sn, there holds σ⁻¹(123)σ = (abc) with suitable
a,b,c. How are a,b,c related to σ? (Work out some specific examples.)
Generalize your conclusion to σ⁻¹τσ.

§16
Alternating Groups

In this paragraph, we examine an important subgroup of Sn, called the


alternating group on n letters. We begin with a definition that will play
an important role throughout this paragraph.

16.1 Definition: A cycle of length 2 in Sn (where n ≥ 2) is called a


transposition.

A transposition is therefore a permutation of the form (ab) and has


order 2 (Theorem 15.11). We remark that (ab) = (ba).

16.2 Theorem: Any permutation in Sn (where n ≥ 2) can be written as
a product of transpositions.

Proof: Since any permutation in Sn can be written as a product of
(disjoint) cycles (Theorem 15.9), it suffices to prove that any cycle can
be written as a product of transpositions. This follows from (abc. . . e) =
(ab)(ac). . . (ae) for cycles of length greater than 1. Also ι = (12)(12) is a
product of transpositions. This completes the proof.
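The identity (abc. . . e) = (ab)(ac). . . (ae) used in the proof can be verified mechanically. A small sketch (helpers and the list representation are ours; composition is left-to-right as in the text):

```python
def compose(sigma, tau):
    """First sigma, then tau."""
    return [tau[b - 1] for b in sigma]

def transposition(n, a, b):
    """The transposition (ab) in Sn, in list form."""
    p = list(range(1, n + 1))
    p[a - 1], p[b - 1] = b, a
    return p

# Build (12)(13)(14)(15) in S5 and compare with the cycle (12345):
prod = list(range(1, 6))  # start from the identity
for b in (2, 3, 4, 5):
    prod = compose(prod, transposition(5, 1, b))
print(prod)  # [2, 3, 4, 5, 1], which is exactly the cycle (12345)
```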

There is no uniqueness claim in Theorem 16.2. A permutation can be


written as a product of different transpositions. For instance,

(12345) = (12)(13)(14)(15) = (45)(41)(42)(43)

is written as a product of different transpositions. Nor is the number of


transpositions unique. The permutation (132546) can be written as a
product of five or nine transpositions:

(132546) = (13)(12)(15)(14)(16) =
(24)(12)(14)(23)(46)(14)(16)(45)(16).

In fact, we can attach a product of two transpositions (ab)(ab) = ι at will
and increase the number of transpositions by 2. Hence a product of n
transpositions can be written also as a product of n + 2, n + 4, n + 6, . . .
transpositions. We note that this does not change the parity of the num-
ber of transpositions. The parity of the number of transpositions is
unique. If a permutation can be written as a product of an odd (even)
number of transpositions, then, in any representation of this permuta-
tion as a product of transpositions, the number of transpositions is odd
(even). A permutation cannot be written as a product of an odd number
of transpositions and also as a product of an even number of transpo-
sitions. We proceed to prove this assertion. We need the notion of inver-
sions of a permutation.

Let σ ∈ Sn. We write σ in double row notation, where, in the first row,
the numbers 1,2, . . . ,n are in their natural order:

    (1  2  . . . n )
σ = (1σ 2σ . . . nσ).

Corresponding to the correct inequalities

1 < 2,  1 < 3,  . . . ,  1 < n
        2 < 3,  . . . ,  2 < n
        . . . . . . . . . . .
                   n − 1 < n

among the numbers in the first row, we obtain the inequalities

1σ < 2σ,  1σ < 3σ,  . . . ,  1σ < nσ
          2σ < 3σ,  . . . ,  2σ < nσ
          . . . . . . . . . . . . .
                    (n − 1)σ < nσ

among the numbers in the second row when we replace each k by kσ
(k = 1,2, . . . ,n). These inequalities will be referred to as the inequalities
of σ. In general, some of the inequalities of σ will be correct, some will
be wrong (if σ ≠ ι, there will be a wrong inequality of σ). A wrong
inequality iσ < jσ of σ means: i < j but iσ > jσ, i.e., the natural order of
iσ and jσ is inverted in the second row (that is, the larger one precedes
the smaller one). We call each wrong inequality of σ an inversion of σ.
For example,

    (1 2 3 4 5 6)
σ = (2 5 6 3 1 4)

has the inequalities

2 < 5,  2 < 6,  2 < 3,  2 < 1,  2 < 4
        5 < 6,  5 < 3,  5 < 1,  5 < 4
                6 < 3,  6 < 1,  6 < 4
                        3 < 1,  3 < 4
                                1 < 4,

eight of which are wrong, namely 2 < 1, 5 < 3, 5 < 1, 5 < 4, 6 < 3,
6 < 1, 6 < 4, 3 < 1. Hence there are eight inversions of σ.
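Counting inversions is a purely mechanical matter. A sketch in Python (our own helper; list representation with p[a-1] = image of a):

```python
def inversions(sigma):
    """Number of inversions of a permutation in list form: the number of
    pairs i < j whose images appear in the wrong order, sigma[i] > sigma[j]."""
    n = len(sigma)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if sigma[i] > sigma[j])

print(inversions([2, 5, 6, 3, 1, 4]))  # 8, as counted above
print(inversions([1, 2, 3, 4]))        # 0: the identity has no inversions
```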

The main work of this paragraph is done in the next lemma.

16.3 Lemma: Let n ≥ 2, σ ∈ Sn and let (ik) be a transposition in Sn. If σ
has an odd number of inversions, then (ik)σ has an even number of
inversions. If σ has an even number of inversions, then (ik)σ has an odd
number of inversions.

Proof: Since (ik) = (ki), we assume, without loss of generality, that i < k.
We have

    (1  . . . i  . . . k  . . . n )            (1  . . . i  . . . k  . . . n )
σ = (1σ . . . iσ . . . kσ . . . nσ),   (ik)σ = (1σ . . . kσ . . . iσ . . . nσ).

The second rows of σ and (ik)σ are identical, aside from the locations of
iσ and kσ. Here σ gives rise to the inequalities

1. hσ < iσ, hσ < kσ, where h ∈ {1, . . . ,i − 1} =: H,
   iσ < jσ, where j ∈ {i + 1, . . . ,k − 1} =: J,
   iσ < kσ,
2. iσ < mσ, where m ∈ {k + 1, . . . ,n} =: M,
   jσ < kσ, where j ∈ J,
3. kσ < mσ, where m ∈ M,

and to certain other inequalities that do not involve iσ or kσ. And (ik)σ
gives rise to the inequalities

1. hσ < kσ, hσ < iσ, where h ∈ H,
   kσ < jσ, where j ∈ J,
   kσ < iσ,
3. kσ < mσ, where m ∈ M,
   jσ < iσ, where j ∈ J,
2. iσ < mσ, where m ∈ M,

and to certain other inequalities that do not involve iσ or kσ.

In the cases i = 1, k = i + 1, k = n, there holds respectively H = ∅, J = ∅,
M = ∅ and the corresponding inequalities should be deleted. This does
not impair the argument below.

We are to show that the number of inversions of σ and the number of
inversions of (ik)σ differ by an odd number.

The inequalities of σ and of (ik)σ that do not involve iσ or kσ are identi-
cal. Also, the inequalities 1., 2., 3. of σ and (ik)σ are the same (or absent).
So only the inequalities

I. iσ < jσ, iσ < kσ, jσ < kσ (where j ∈ J) of σ
and
II. kσ < jσ, kσ < iσ, jσ < iσ (where j ∈ J) of (ik)σ

are different. We must prove that the numbers of wrong inequalities in I
and II differ by an odd number.

Since one of iσ < kσ, kσ < iσ is correct and the other is wrong, we must
prove only that the numbers of wrong inequalities in

A. iσ < jσ, jσ < kσ (where j ∈ J)
and in
B. kσ < jσ, jσ < iσ (where j ∈ J)

differ by an even number.

Suppose there are s wrong inequalities iσ < jσ and t wrong inequalities
jσ < kσ in A, where |J| ≥ s ≥ 0 and |J| ≥ t ≥ 0 (including the case J = ∅,
|J| = 0). Then there are s + t wrong inequalities and there are
(|J| − s) + (|J| − t) = 2|J| − (s + t) correct inequalities in A. Since B consists
of the negations of the inequalities in A, there are 2|J| − (s + t) wrong in-
equalities in B. So

(number of wrong inequalities in A) − (number of wrong inequalities in B)
= (s + t) − (2|J| − (s + t)) = 2(s + t − |J|) = an even number.

This completes the proof.


16.4 Definition: Let n ∈ ℕ and let σ ∈ Sn. If σ has an odd number of
inversions, then σ is called an odd permutation. If σ has an even number
of inversions, then σ is called an even permutation.

As the number of inversions of a permutation is uniquely determined, it
is clear that a permutation cannot be both odd and even. With this
terminology, Lemma 16.3 reads as follows.

16.3 Lemma: Let n ≥ 2 and σ ∈ Sn. Let (ik) be a transposition in Sn. If σ
is odd, then (ik)σ is even. If σ is even, then (ik)σ is odd.

Applying Lemma 16.3 r times, we have

16.5 Lemma: Let n ≥ 2, σ ∈ Sn and let τ1, τ2, . . . , τr be transpositions in
Sn. If r is odd, then σ and τ1τ2. . . τrσ have the opposite "parity" (i.e., one
of them is odd, the other is even). If r is even, then σ and τ1τ2. . . τrσ have
the same "parity".

16.6 Theorem: Let n ≥ 2, σ ∈ Sn. Then σ is an odd (even) permutation
if and only if σ can be written as a product of an odd (even) number of
transpositions. In particular, σ cannot be written as a product of an odd
number of transpositions and also as a product of an even number of
transpositions.

Proof: We use Lemma 16.5 with ι in place of σ. Let σ be written as a
product of transpositions, say σ = τ1τ2. . . τr. Lemma 16.5 tells us that
σ = τ1τ2. . . τr = τ1τ2. . . τrι and ι have opposite or the same "parities"
according as r is odd or even. Since ι has 0 inversions, ι is an even
permutation. So σ = τ1τ2. . . τr is an odd permutation or an even
permutation according as r is an odd number or an even number. The
other assertion follows from the remark made after Definition 16.4.

We describe the "parity" of a product.

16.7 Theorem: Let n ≥ 2. The product of two permutations in Sn has
the "parity" given by the following law.

(odd)(odd) = (even) (odd)(even) = (odd)


(even)(odd) = (odd) (even)(even) = (even).

Proof: Let σ,τ ∈ Sn. We want to find the "parity" of στ. Let σ = τ1τ2. . . τs
and τ = τ1´τ2´. . . τp´, where τ1, τ2, . . . , τs, τ1´, τ2´, . . . , τp´ are transpositions
(Theorem 16.2). Then στ = τ1τ2. . . τsτ1´τ2´. . . τp´ is a product of s + p
transpositions.

If σ is an odd permutation and τ is an odd permutation, then s is an odd
number and p is an odd number (Theorem 16.6), so s + p is an even
number, so στ is an even permutation (Theorem 16.6). Thus (odd)(odd) =
(even). The other cases are proved similarly.

The assertion of Theorem 16.7 resembles the rule for finding the sign of
a product of two real numbers: the product of a negative number by a
negative number is positive, etc. In order to exploit this analogy, we
introduce a new term.

16.8 Definition: Let n ∈ ℕ and σ ∈ Sn. The sign of σ is the integer 1 or
−1. We write ε(σ) for the sign of σ, and define it as follows.

ε(σ) = {  1 if σ is an even permutation
       { −1 if σ is an odd permutation.

With this definition, the content of Theorem 16.7 can be expressed more
succinctly.

16.7 Theorem: For any σ, τ in Sn, there holds ε(στ) = ε(σ)ε(τ).
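The multiplicativity of the sign can be confirmed by brute force over a small symmetric group. A sketch (our own helpers; left-to-right composition as in the text; the sign is computed from the inversion count):

```python
from itertools import permutations

def sign(sigma):
    """Sign of a permutation in list form: +1 if even, -1 if odd,
    decided by the parity of the inversion count."""
    n = len(sigma)
    inv = sum(1 for i in range(n) for j in range(i + 1, n)
              if sigma[i] > sigma[j])
    return 1 if inv % 2 == 0 else -1

def compose(sigma, tau):
    return [tau[b - 1] for b in sigma]

# sign(sigma*tau) = sign(sigma)*sign(tau) for all 24*24 pairs in S4:
ok = all(sign(compose(s, t)) == sign(s) * sign(t)
         for s in map(list, permutations(range(1, 5)))
         for t in map(list, permutations(range(1, 5))))
print(ok)  # True
```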

16.9 Theorem: Let n 2. The number of odd permutations in Sn is


equal to the number of even permutations in Sn. This number is n!/2.

167
Proof: We must find a one-to-one correspondence between the set of
odd permutations and the set of even permutations in Sn. Now

T: {σ ∈ Sn: ε(σ) = −1} → {σ ∈ Sn: ε(σ) = 1}
             σ ↦ σ(12)

is a one-to-one mapping (by Lemma 8.1(1)) from the set of odd permu-
tations in Sn into the set of even permutations in Sn (by Lemma 16.3),
which is in fact onto, since any even permutation π is the image, under
T, of the odd permutation π(12) (Lemma 16.3). So T is a one-to-one
correspondence between these sets and they contain equal numbers of
elements, say k elements. Since these sets are disjoint, and their union
is Sn, there are 2k elements in Sn, whose order is n! by Theorem 15.2.
Hence k = n!/2. □
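Theorem 16.9 can be confirmed numerically for small n. The sketch below (an illustration of ours, assuming Python; permutations act on {0, ..., n−1}) counts even and odd permutations and also exhibits the pairing σ ↦ σ(12) used in the proof, realized here as swapping the first two entries.

```python
from itertools import permutations
from math import factorial

def sign(p):
    """+1 for an even permutation, -1 for an odd one (inversion count)."""
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return 1 if inv % 2 == 0 else -1

for n in range(2, 6):
    evens = [p for p in permutations(range(n)) if sign(p) == 1]
    odds  = [p for p in permutations(range(n)) if sign(p) == -1]
    # both classes have exactly n!/2 members
    assert len(evens) == len(odds) == factorial(n) // 2
    # multiplying by one fixed transposition pairs odd with even
    T = lambda p: (p[1], p[0]) + p[2:]
    assert sorted(T(p) for p in odds) == sorted(evens)
```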

Theorem 16.7 asserts that the set of even permutations in Sn is closed
under multiplication. So it is a subgroup of Sn by Lemma 9.3(2).

16.10 Definition: The subgroup of even permutations in Sn (n ≥ 2) is
called the alternating group (on n letters) and is written as An.

16.11 Theorem: For n ≥ 2, An is a group of order n!/2.

Proof: Theorem 16.9. □

Exercises

1. Find the sign of (13524) and of (153462).

2. Show that a cycle of length m is odd (even) if and only if m is even
(odd).
3. Prove that ε(σ1σ2...σt) = ε(σ1)ε(σ2)...ε(σt) for all permutations σ1, σ2, ..., σt
in Sn.
4. Find the sign of (143)(1245)(243) and of (1435)(25643) without
evaluating these products.

5. Write all elements in A2,A3,A4.

6. Construct multiplication tables of A2,A3,A4.

7. Find all subgroups of A4. Does A4 have a subgroup of order 6?

8. Verify Lemma 16.3 by going through the argument in its proof in the
specific cases below.

σ = (1 2 3 4 5 6 7
     3 1 5 7 2 4 6),   (ik) = (12), (14), (23), (26), (27), (67).
§17
Groups of Matrices

In this paragraph, we examine some groups whose elements are
matrices. The reader probably knows matrices (whose entries are real or
complex numbers), but this is not a prerequisite for understanding this
paragraph. We give an elementary account of the theory of matrices as
far as needed here. Matrix theory will be taken up systematically in
Chapter 4, §43.

We allow the entries to be elements of any field. Fields will be formally
introduced in Chapter 3, §29 (Definition 29.13). Until then, we shall be
content with the following definition.

17.1 Temporary Definition: A field is one of the sets ℚ, ℝ, ℂ and ℤp,
where p is a prime number.

After having learned about fields in Chapter 3, the reader may check
that the theory in this paragraph carries over to the more general situa-
tion where the term "field" is used in the sense of Definition 29.13.

We note that K is a commutative group under addition, whose identity
element we shall denote by 0 (so that 0 is the number 0 in case K is one
of ℚ, ℝ, ℂ, and it is the residue class 0̄ = 0 + pℤ in case K is ℤp for some
prime number p), and that K\{0} is a group under multiplication. This
will be used many times in this paragraph.

17.2 Definition: Let K be a field. A matrix over K is an array

(a b)
(c d)

of four elements a, b, c, d of K, arranged in two rows and two columns, and
enclosed within parentheses. (The plural of "matrix" is "matrices".) In
running text we will also write such an array inline as (a b; c d), the
semicolon separating the two rows.
Thus (−1 2; 4 0) is a matrix over ℚ (and also over ℝ and ℂ), and
(2̄ 3̄; 5̄ 4̄) is a matrix over ℤ7, where the bars mean residue classes
modulo 7.

The set of all matrices over a field K will be denoted by Mat2(K). The
subscript 2 signifies that there are 2 rows and 2 columns in a matrix (in
the sense of Definition 17.2).

If K is a field and A, B are matrices from Mat2(K), we say A is equal to B
provided the corresponding entries in A and B are equal. More exactly,

A = (a b; c d) is equal to B = (a´ b´; c´ d´)

if and only if a = a´, b = b´, c = c´, d = d´. In this case, we write A = B.
A single matrix equation is thus equivalent to four equations between
elements of the underlying field. It is clear that matrix equality is an
equivalence relation on Mat2(K). In particular, it is legitimate to say that
A and B are equal when A is equal to B.

In this definition of matrix equality, the locations of the entries are taken
into account. Thus (5 1; 0 2) and (2 5; 1 0) are different matrices, although
they are made up of the same numbers.

We introduce two binary operations on Mat2(K), addition and multiplica-
tion. Addition is defined in the most obvious way.

17.3 Definition: Let K be a field. For any A = (a b; c d), B = (e f; g h) in
Mat2(K), we define the sum of A and B as the matrix

(a+e b+f; c+g d+h).

The sum of A and B will be denoted by A + B. Taking sums in Mat2(K)
will be called addition (of matrices).
Addition of matrices is essentially the addition in the underlying field,
carried out four times. Not surprisingly, many properties of addition in
the field are reflected in matrix addition. For example, just as a field is
a group under addition, matrices over a field form a group under
addition, too.
17.4 Theorem: Let K be a field. Then Mat2(K) is a commutative group
under addition.

Proof: We check the group axioms.

(i) For any matrices A = (a b; c d), B = (e f; g h) in Mat2(K), we have
a + e, b + f, c + g, d + h ∈ K, since a, b, c, d, e, f, g, h ∈ K and K is closed
under addition. Hence

A + B = (a+e b+f; c+g d+h) ∈ Mat2(K)

and Mat2(K) is closed under (matrix) addition.
(ii) Associativity of addition in Mat2(K) follows from associati-
vity of addition in K. Indeed, for any A = (a b; c d), B = (e f; g h),
C = (k m; n p) in Mat2(K), we have

(A + B) + C = [(a b; c d) + (e f; g h)] + (k m; n p)
            = (a+e b+f; c+g d+h) + (k m; n p)
            = ((a+e)+k (b+f)+m; (c+g)+n (d+h)+p)
            = (a+(e+k) b+(f+m); c+(g+n) d+(h+p))
            = (a b; c d) + (e+k f+m; g+n h+p)
            = A + (B + C).

(iii) What can be the identity element? Well, probably the
matrix (0 0; 0 0), where 0 denotes the zero element of the field K (for
instance, when K is ℤp for some prime number p, 0 is the residue class
0̄ = 0 + pℤ). Indeed, we have, for any A = (a b; c d) ∈ Mat2(K),

A + (0 0; 0 0) = (a b; c d) + (0 0; 0 0) = (a+0 b+0; c+0 d+0) = (a b; c d) = A

and (0 0; 0 0) is a right identity of Mat2(K). The matrix (0 0; 0 0) will be
called the zero matrix (over K) and will be designated by the symbol 0.
This should not be confused with the zero element of the underlying
field K.

(iv) Any matrix A = (a b; c d) ∈ Mat2(K) has a right inverse
(opposite) −A in Mat2(K), namely (−a −b; −c −d) (since −a, −b, −c, −d ∈ K):

(a b; c d) + (−a −b; −c −d) = (a+(−a) b+(−b); c+(−c) d+(−d)) = (0 0; 0 0) = 0.
Thus Mat2(K) is a group under addition. We finally check commutativity.

(v) Commutativity of addition in Mat2(K) follows from
commutativity of addition in K. Indeed, for any A = (a b; c d),
B = (e f; g h) in Mat2(K), we have

A + B = (a b; c d) + (e f; g h) = (a+e b+f; c+g d+h)
      = (e+a f+b; g+c h+d) = (e f; g h) + (a b; c d) = B + A.

So Mat2(K) is a commutative group under addition. □

The additive group Mat2(K) is somewhat dull: it is just four copies of the
additive group K. More interesting matrix groups arise when the opera-
tion is multiplication. We introduce this operation now.

17.5 Definition: Let K be a field. For any A = (a b; c d), B = (e f; g h) in
Mat2(K), we define the product of A and B as the matrix

(ae+bg af+bh; ce+dg cf+dh).

The product of A and B will be denoted by A·B or simply by AB. Taking
products in Mat2(K) will be called multiplication (of matrices).

This definition looks bizarre. One would expect the product of A and B,
with the notation of Definition 17.5, to be (ae bf; cg dh). Some motivation
for Definition 17.5 can be gained as follows. With each matrix (a b; c d)
(over ℝ, say), there is associated a coordinate transformation

x = ax´ + by´
y = cx´ + dy´

of the Euclidean plane. Carrying out the transformations associated with
(a b; c d), (e f; g h) successively, we obtain

x = ax´ + by´        x´ = ex´´ + fy´´
y = cx´ + dy´        y´ = gx´´ + hy´´,

which gives

x = a(ex´´ + fy´´) + b(gx´´ + hy´´) = (ae+bg)x´´ + (af+bh)y´´
y = c(ex´´ + fy´´) + d(gx´´ + hy´´) = (ce+dg)x´´ + (cf+dh)y´´,

so the product of the matrices is the one which is associated with the
successive application of the transformations.

If matrix multiplication is new to you, you are urged to write down
matrices over ℚ and multiply them in order to acquire dexterity in
performing this operation.
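As a quick aid for such practice, here is a small Python sketch (ours, not the book's; the name `mat_mul` is illustrative) implementing the product of Definition 17.5, with a matrix stored as a pair of rows. Note that the two products AB and BA need not coincide.

```python
def mat_mul(A, B):
    """Product of 2x2 matrices A = ((a, b), (c, d)), B = ((e, f), (g, h)),
    following Definition 17.5: rows of A against columns of B."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a*e + b*g, a*f + b*h),
            (c*e + d*g, c*f + d*h))

A = ((1, 2), (3, 4))
B = ((0, 1), (1, 0))
print(mat_mul(A, B))   # ((2, 1), (4, 3))
print(mat_mul(B, A))   # ((3, 4), (1, 2)) -- multiplication is not commutative
```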

We collect some basic properties of matrix multiplication in the next
theorem. Let us recall that K\{0} is a group under multiplication. The
identity element of this group will be denoted by 1. Thus 1 is the
number 1 when K is one of ℚ, ℝ, ℂ, and the residue class 1̄ = 1 + pℤ
when K = ℤp for some prime number p.

17.6 Theorem: Let K be a field, whose zero element is 0 and whose
identity element is 1.
(1) Mat2(K) is closed under matrix multiplication.
(2) (AB)C = A(BC) for all A, B, C ∈ Mat2(K).
(3) Let I = (1 0; 0 1). Then AI = IA = A for all A ∈ Mat2(K).
(4) A(B + C) = AB + AC and (B + C)A = BA + CA for all A, B, C ∈ Mat2(K).

Proof: Let A = (a b; c d), B = (e f; g h), C = (k m; n p) be arbitrary
elements of Mat2(K).

(1) Since a field is closed under addition and multiplication, ae + bg,
af + bh, ce + dg, cf + dh ∈ K whenever a, b, c, d, e, f, g, h ∈ K. So AB ∈ Mat2(K)
for all A, B ∈ Mat2(K) and Mat2(K) is closed under multiplication.

(2) This is a routine calculation. We evaluate (AB)C and A(BC):

(AB)C = [(a b; c d)(e f; g h)](k m; n p) = (ae+bg af+bh; ce+dg cf+dh)(k m; n p)
      = ((ae+bg)k+(af+bh)n (ae+bg)m+(af+bh)p; (ce+dg)k+(cf+dh)n (ce+dg)m+(cf+dh)p)
      = (aek+bgk+afn+bhn aem+bgm+afp+bhp; cek+dgk+cfn+dhn cem+dgm+cfp+dhp),   (i)

A(BC) = (a b; c d)[(e f; g h)(k m; n p)] = (a b; c d)(ek+fn em+fp; gk+hn gm+hp)
      = (a(ek+fn)+b(gk+hn) a(em+fp)+b(gm+hp); c(ek+fn)+d(gk+hn) c(em+fp)+d(gm+hp))
      = (aek+afn+bgk+bhn aem+afp+bgm+bhp; cek+cfn+dgk+dhn cem+cfp+dgm+dhp).   (ii)

Since addition is commutative in K, the matrices (i) and (ii) are equal.
Hence (AB)C = A(BC) for all A, B, C ∈ Mat2(K).

(3) We compute

AI = (a b; c d)(1 0; 0 1) = (a1+b0 a0+b1; c1+d0 c0+d1) = (a b; c d) = A,
IA = (1 0; 0 1)(a b; c d) = (1a+0c 1b+0d; 0a+1c 0b+1d) = (a b; c d) = A,

as claimed.

(4) We have

A(B + C) = (a b; c d)[(e f; g h) + (k m; n p)]
         = (a b; c d)(e+k f+m; g+n h+p)
         = (a(e+k)+b(g+n) a(f+m)+b(h+p); c(e+k)+d(g+n) c(f+m)+d(h+p))
         = (ae+ak+bg+bn af+am+bh+bp; ce+ck+dg+dn cf+cm+dh+dp)
         = (ae+bg+ak+bn af+bh+am+bp; ce+dg+ck+dn cf+dh+cm+dp)
         = (ae+bg af+bh; ce+dg cf+dh) + (ak+bn am+bp; ck+dn cm+dp)
         = (a b; c d)(e f; g h) + (a b; c d)(k m; n p)
         = AB + AC.

The proof of (B + C)A = BA + CA follows similar lines and is left to the
reader. □
Theorem 17.6 seems promising. Three of the group axioms are satisfied,
with I as the identity. It remains to investigate whether every matrix
over a field has a right inverse.

Suppose K is a field and A = (a b; c d) ∈ Mat2(K). Then A has a right
inverse X = (x y; z u) in Mat2(K) if and only if AX = I, which is
equivalent to

(1) ax + bz = 1,    (2) ay + bu = 0,
(3) cx + dz = 0,    (4) cy + du = 1.

We multiply the equation (1) by d, (3) by −b and add them side by side.
Using associativity of addition in K, distributivity of multiplication over
addition, and commutativity of multiplication in K, we get

(ad − bc)x = d.

We multiply (2) by d, (4) by −b and add them. We multiply (1) by −c,
(3) by a and add them. We multiply (2) by −c, (4) by a and add them.
We get

(ad − bc)y = −b,    (ad − bc)z = −c,    (ad − bc)u = a.

We emphasize again that commutativity of multiplication in K is used
crucially to derive these equations.

The element ad − bc appears in each one of these equations. In view of
its importance, we give it a name.

17.7 Definition: Let K be a field and A = (a b; c d) ∈ Mat2(K). Then the
element ad − bc of K is called the determinant of A, written as det(A) or
as det A.

We have shown: if K is a field and A = (a b; c d) ∈ Mat2(K), and if
X = (x y; z u) in Mat2(K) is a right inverse of A, then

(det A)x = d      (det A)y = −b
(det A)z = −c     (det A)u = a.      (D)
These equations impose certain conditions on a matrix having a right
inverse. We cannot expect that every matrix has a right inverse. Those
having a right inverse are characterized very simply as the matrices
with a nonzero determinant.

17.8 Theorem: Let K be a field and A = (a b; c d) ∈ Mat2(K). Then A has a
right inverse if and only if det A ≠ 0. If this is the case, then there is a
unique right inverse of A, namely the matrix

((det A)⁻¹d −(det A)⁻¹b; −(det A)⁻¹c (det A)⁻¹a),

where (det A)⁻¹ is the inverse of det A ∈ K\{0} in the multiplicative
group K\{0}.

Proof: First we assume det A = 0 and show that A has no right inverse.
Indeed, if det A = 0 and A had a right inverse, then the equations (D)
would become

 d = 0    −b = 0
−c = 0     a = 0,

and A = (a b; c d) would be the zero matrix (0 0; 0 0). The existence of a
right inverse X = (x y; z u) would yield

(1 0; 0 1) = I = AX = (0 0; 0 0)(x y; z u) = (0x+0z 0y+0u; 0x+0z 0y+0u) = (0 0; 0 0),

hence 1 = 0 in K, a contradiction. Thus A has no right inverse if det A =
0.

Now let us assume det A ≠ 0 and show that A has a unique right inverse.
Since det A ∈ K\{0} and K\{0} is a group under multiplication, det A has
an inverse in K\{0}, which we denote by (det A)⁻¹. This is the nonzero
element of the field K such that (det A)⁻¹(det A) = (det A)(det A)⁻¹ = 1 =
the identity element of K\{0}. So we can solve for x, y, z, u in (D) by
multiplying the equations in (D) by (det A)⁻¹. We get

x = (det A)⁻¹d,     y = −(det A)⁻¹b,
z = −(det A)⁻¹c,    u = (det A)⁻¹a.

Thus, if A has a right inverse at all, this right inverse must be the matrix
written in the enunciation of the theorem (in particular, A has a unique
right inverse). It is easy to check that this matrix is indeed a right
inverse of A:

(a b; c d)((det A)⁻¹d −(det A)⁻¹b; −(det A)⁻¹c (det A)⁻¹a)
   = ((det A)⁻¹(ad−bc) (det A)⁻¹(−ab+ba); (det A)⁻¹(cd−dc) (det A)⁻¹(−cb+da))
   = (1 0; 0 1) = I.

Hence A does have a unique right inverse and it is the matrix given in
this theorem. □

We will prove presently that the matrices with right inverses form a
group under multiplication. From Lemma 7.3, it will then follow that the
unique right inverse of a matrix with a nonzero determinant is also the
unique left inverse of the same matrix. We shall refer to it as its inverse.
The rule for finding the inverse of A = (a b; c d) is simple: interchange a
and d, then put a minus sign in front of b and c, and multiply each entry
by (det A)⁻¹ [i.e., divide each entry by det A]. For example, the inverse of

(5 2; 1 2) ∈ Mat2(ℚ) is (2/8 −2/8; −1/8 5/8) = (1/4 −1/4; −1/8 5/8),

and that of

(5 2; 1 4) ∈ Mat2(ℤ7) is (2·4 2·(−2); 2·(−1) 2·5) = (1 3; 5 3)

(entries read as residue classes modulo 7), since the determinant is
equal to 18 = 4 and 4⁻¹ = 2.
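The second example can be checked mechanically. The Python sketch below (our illustration, not the book's; it assumes p is prime and uses Fermat's little theorem, det^(p−2) ≡ det⁻¹ mod p, to invert the determinant) implements the inversion rule of Theorem 17.8 for matrices over ℤp.

```python
def mat_mul_mod(A, B, p):
    """Product of 2x2 matrices with entries reduced modulo p."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def mat_inv_mod(A, p):
    """Inverse of a 2x2 matrix over Z_p via Theorem 17.8:
    swap a and d, negate b and c, scale by det(A)^(-1) mod p."""
    (a, b), (c, d) = A
    det = (a*d - b*c) % p
    det_inv = pow(det, p - 2, p)   # Fermat: det^(p-2) = det^(-1) mod p
    return ((det_inv * d % p, det_inv * (-b) % p),
            (det_inv * (-c) % p, det_inv * a % p))

A = ((5, 2), (1, 4))                 # det = 18 = 4 in Z_7, and 4^(-1) = 2
print(mat_inv_mod(A, 7))             # ((1, 3), (5, 3))
print(mat_mul_mod(A, mat_inv_mod(A, 7), 7))   # ((1, 0), (0, 1))
```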

17.9 Theorem: Let K be a field.
(1) det(AB) = (det A)(det B) for all A, B ∈ Mat2(K).
(2) det I = 1 (∈ K).
(3) If AX = I, then det X = (det A)⁻¹.
Proof: (1) We use the notation of Definition 17.5. We get

det(AB) = (ae + bg)(cf + dh) − (af + bh)(ce + dg)
        = aecf + aedh + bgcf + bgdh − afce − afdg − bhce − bhdg
        = aedh − afdg + bgcf − bhce
        = ad(eh − fg) − bc(eh − fg)
        = (ad − bc)(eh − fg)
        = (det A)(det B).

(2) det I = 1·1 − 0·0 = 1 − 0 = 1.

(3) This follows from (1) and (2): if AX = I, then
1 = det I = det(AX) = (det A)(det X), so det X = (det A)⁻¹. □

The formula det AB = (det A)(det B) is known as the multiplication rule
of determinants. Loosely speaking, the determinant of a product is the
product of the determinants. By induction on n, it is extended to n
factors: det(A1A2...An) = (det A1)(det A2)...(det An).

We finally have a group of matrices under multiplication.

17.10 Theorem: Let K be a field. Then

{A ∈ Mat2(K): det A ≠ 0}

is a group under matrix multiplication.

Proof: We check the group axioms. Let us call our set G for brevity.

(i) For A, B ∈ G, we have det A ≠ 0 ≠ det B. In the field K, a
product of nonzero elements is nonzero (K\{0} is a group, and closed
under multiplication). So det AB = (det A)(det B) ≠ 0 by Theorem 17.9(1)
and consequently AB ∈ G. Thus G is closed under multiplication.

(ii) Associativity of multiplication in G follows from Theorem
17.6(2).

(iii) I is a right identity element of G, for det I = 1 ≠ 0 by
Theorem 17.9(2), so I ∈ G; and AI = A for all A ∈ G by Theorem 17.6(3).

(iv) Any A ∈ G has a right inverse in G. Indeed, if A ∈ G, then
det A ≠ 0, so A has a right inverse X in Mat2(K). As det X = (det A)⁻¹ ≠ 0
(Theorem 17.9(3)), we see X ∈ G. Thus A has a right inverse in G.

Therefore, G is a group. □
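For a finite field this group can be enumerated outright. The sketch below (our illustration; ℤ2 is represented as {0, 1} with arithmetic mod 2) lists the matrices over ℤ2 with nonzero determinant and confirms closure under multiplication. The order 6 agrees with the formula (p² − 1)(p² − p) of Exercise 6 at the end of this paragraph, for p = 2.

```python
from itertools import product

p = 2
# all 2x2 matrices over Z_2, kept when the determinant is nonzero mod 2
G = [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)
     if (a * d - b * c) % p != 0]

def mul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

print(len(G))                                      # 6
assert all(mul(A, B) in G for A in G for B in G)   # closure
```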

17.11 Definition: Let K be a field. The group of Theorem 17.10 is
called the general linear group (of degree 2) over K, and is written as
GL(2,K).

Since GL(2,K) is a group, the unique right inverse of any matrix A in
GL(2,K) is also the unique left inverse of that matrix (Lemma 7.3). It will
be called the inverse of A, and will be written as A⁻¹, in conformity with
the usual terminology and notation. The matrix I will be called the iden-
tity matrix. Elements of GL(2,K) are called invertible matrices or regular
matrices. Matrices whose determinants are zero are called singular.

The next theorem furnishes another matrix group.

17.12 Theorem: Let K be a field. Then

{A ∈ Mat2(K): det A = 1}

is a group under matrix multiplication.

Proof: Let us call this set S for brevity. As 1 ≠ 0 in K, we get S ⊆ GL(2,K).
We use the subgroup criterion (Lemma 9.2) to check that S is a subgroup
of GL(2,K).

(i) For A, B ∈ S, we have det A = 1 = det B, therefore det AB =
(det A)(det B) = 1·1 = 1 by Theorem 17.9(1) and consequently AB ∈ S.
Thus S is closed under multiplication.

(ii) For any A ∈ S, we have det A = 1, so det(A⁻¹) = (det A)⁻¹ =
1⁻¹ = 1 by Theorem 17.9(3) and A⁻¹ ∈ S. Thus S is closed under the
forming of inverses.

Therefore, S is a subgroup of GL(2,K). □


17.13 Definition: Let K be a field. The group of Theorem 17.12 is
called the special linear group (of degree 2) over K, and is written as
SL(2,K).

We close this paragraph with a group that plays an important role in
number theory and in complex analysis.

17.14 Theorem: The set

{(a b; c d) ∈ Mat2(ℚ): a, b, c, d ∈ ℤ, ad − bc = 1}

is a group under matrix multiplication.

Proof: Let us call this set H for brevity. Clearly H ⊆ SL(2,ℚ). We
check that H is a subgroup of SL(2,ℚ).

(i) Suppose A = (a b; c d) and B = (e f; g h) are elements of H. Then
AB = (ae+bg af+bh; ce+dg cf+dh). Here the entries of AB, namely ae+bg,
ce+dg, af+bh, cf+dh, are integers, because a, b, c, d, e, f, g, h are integers.
Also, det A = 1 = det B, therefore det AB = (det A)(det B) = 1·1 = 1 by
Theorem 17.9(1) and AB ∈ H. Thus H is closed under multiplication.

(ii) Let A = (a b; c d) ∈ H. Then det A = 1 and so A⁻¹ = (d −b; −c a)
by Theorem 17.8. The entries d, −b, −c, a of A⁻¹ are integers, because a, b,
c, d are integers. Also, we have det A = 1, so det(A⁻¹) = (det A)⁻¹ = 1⁻¹ = 1
by Theorem 17.9(3) (or det(A⁻¹) = da − (−b)(−c) = ad − bc = 1). So A⁻¹ ∈ H
and H is closed under the forming of inverses.

Therefore, H is a subgroup of SL(2,ℚ). □

17.15 Definition: The group of Theorem 17.14 is called the special
linear group (of degree 2) over ℤ, or the modular group, and is written
as SL(2,ℤ) or as Γ.

Exercises

1. Let K be a field. Show that GL(2,K) is not an abelian group.

2. Find all elements of GL(2, ℤ2). What is the order of GL(2, ℤ2)?

3. Write down the multiplication table of GL(2, ℤ2). Compare it (eventual-
ly after reordering the rows and columns) with the multiplication table
of S3.
4. Find all elements of SL(2, ℤ3). What is the order of SL(2, ℤ3)?

5. Write down the multiplication table of SL(2, ℤ3).

6. Let K be a field and let A = (a b; c d) ∈ Mat2(K). When a = 0 = b, we have
det A = 0. In case (a,b) ≠ (0,0), prove that det A = 0 if and only if there
is an element k in K such that c = ka, d = kb. Use this result and show
that |GL(2, ℤp)| = (p² − 1)(p² − p).

7. Determine how many elements in GL(2, ℤp) have the same determi-
nant. Find the order of SL(2, ℤp).

8. Show that {(1 0; a b) ∈ Mat2(K): b ≠ 0} is a subgroup of GL(2,K).

9. Prove that {(a b; 0 d) ∈ Mat2(K): ad ≠ 0} is a group under multiplication.
Its elements are called triangular matrices.

10. Let K be a field. For any A = (a b; c d) ∈ Mat2(K), we define the trace of
A to be the element a + d of K (the sum of the entries on the upper-left
to lower-right diagonal). Show that the trace of AB is equal to the trace of
BA for all A, B ∈ Mat2(K).

11. Let K be a field. For any A = (a b; c d) ∈ Mat2(K), we define the
transpose of A to be the matrix (a c; b d) ∈ Mat2(K), which is written Aᵗ.
Show that det Aᵗ = det A and (AB)ᵗ = BᵗAᵗ for all A, B ∈ Mat2(K).

12. Let m ≥ 2 and put Mat2(ℤm) = {(a b; c d): a, b, c, d ∈ ℤm}. Show that the
theory in the text, until Theorem 17.8, remains valid for the elements of
Mat2(ℤm), which are called matrices over ℤm.

In place of Theorem 17.8, prove that A ∈ Mat2(ℤm) has a unique right
inverse if and only if det A is an invertible element of ℤm.

Put GL(2, ℤm) = {A ∈ Mat2(ℤm): det A invertible in ℤm}. Show that
GL(2, ℤm) is a group under multiplication.

Prove that Theorem 17.12 remains true if "K" is replaced by "ℤm".
13. Develop a theory of matrices over ℤ by modifying the theory of
matrices over ℚ. How do you define GL(2,ℤ)?

14. Let H = {(a b; −b̄ ā): a, b ∈ ℂ} ⊆ Mat2(ℂ), where x̄ is the complex
conjugate of x ∈ ℂ. Prove that H is closed under addition and multiplica-
tion. Show that H\{0} is a group under multiplication.

15. If K is a field and A = (a b; c d) ∈ Mat2(K), we write −A = (−a −b; −c −d).
Let

1 = (1 0; 0 1),  i = (i 0; 0 −i),  j = (0 1; −1 0),  k = (0 i; i 0) ∈ Mat2(ℂ).

Thus 1 is the identity matrix over ℂ. Show that ij = k, jk = i, ki = j. Prove
that {1, −1, i, −i, j, −j, k, −k} is a group under multiplication, called a
quaternion group of order 8 and denoted as Q8. Show that Q8 has
exactly one element of order 2. Find all subgroups of Q8.

§18
Factor Groups

In this paragraph, we learn a way of constructing new groups from a
given one. This construction is a generalization of obtaining the additive
group ℤn from the additive group ℤ. We recall that the elements of ℤn
are certain subsets of ℤ, namely the cosets of the subgroup {nz: z ∈ ℤ} in
ℤ (cf. §10, Ex. 3). Addition in ℤn is induced from addition in ℤ (see §6).
We want to do the same thing with an arbitrary group G. We start with
a group G and a subgroup H of G. On the set of cosets of H in G, we wish
to define a binary operation which reflects the operation on G and which
makes the set of cosets into a group.

Two questions present themselves immediately. First, we have a set ℛ
of right cosets of H in G and a set ℒ of left cosets of H in G. If G is an
abelian group, the right cosets and the left cosets coincide. However, in
general, the right cosets of H in G are different from the left cosets of H
in G. Thus we have two different sets of cosets: ℛ and ℒ. Do we want to
make ℛ into a group or ℒ? Is it possible to make both ℛ and ℒ into
groups? If so, how are these groups related? If not, why not?

Another question is about the operation. The central issue in §6, where
we introduced the operations on ℤn, was whether these operations were
well defined. Once we knew that addition in ℤn is a well defined opera-
tion, it was straightforward to prove that ℤn is a group. Not surprisingly,
we have the same problem here. The main point of the following
discussion is to show when we have a well defined operation (Theorem
18.4). Once we know it, it is easy to show that our set of cosets is a
group (Theorem 18.7).

It turns out that these questions are intimately connected and they will
be resolved simultaneously.

18.1 Suggestion: Let G be a group, H a subgroup of G, and let ℛ be the
set {Ha: a ∈ G} of all right cosets of H in G. We suggest that we define a
binary operation on ℛ, to be denoted by · or by juxtaposition, according
to the "rule"

Ha·Hb = Hab

for all a, b ∈ G.

This is the most natural way of defining a binary operation on ℛ. Now
we have to ask whether this is a well defined operation on ℛ, for the
"rule" of evaluating a product Ha·Hb makes use of the elements a, b of G,
which can be chosen in many ways. The "rule" says, in order to evaluate
the product X·Y of X and Y in ℛ, that we (1) take an a ∈ X so that X = Ha;
(2) take a b ∈ Y so that Y = Hb; (3) evaluate ab in G; (4) find the right
coset Hab of this ab. The right coset Hab is supposed to be the product
X·Y. We must make sure that we get the same right coset at the end,
even if we choose different elements from the right cosets X and Y. We
investigate when this "rule" yields a well defined operation on ℛ.

The operation suggested in 18.1 is well defined if and only if the
implication

for all a, a1, b, b1 ∈ G:  Ha = Ha1 and Hb = Hb1  ⟹  Hab = Ha1b1

is valid. Using Lemma 10.2, we write this in the equivalent form

for all a, a1, b, b1 ∈ G, h, h1 ∈ H:  a1 = ha and b1 = h1b  ⟹  a1b1 ∈ Hab,

which simplifies to

for all a, b ∈ G, h, h1 ∈ H:  hah1b ∈ Hab.

Using Lemma 10.2 again, we can write this as

for all a ∈ G, h1 ∈ H:  ah1 ∈ Ha

or as

for all a ∈ G:  aH ⊆ Ha.      (o)

Thus the operation suggested in 18.1 is well defined if and only if H is a
subgroup of G such that aH ⊆ Ha for all a ∈ G. This is not true for every G
and for every subgroup H of G. After we give other descriptions of such
subgroups, we will see some examples.

18.2 Lemma: Let H ≤ G. For a ∈ G, let a⁻¹Ha be the set

{a⁻¹ha ∈ G: h ∈ H} = {b ∈ G: aba⁻¹ ∈ H}.

The following are equivalent.

(1) a⁻¹ha ∈ H for all a ∈ G, h ∈ H.
(2) a⁻¹Ha ⊆ H for all a ∈ G.
(3) a⁻¹Ha = H for all a ∈ G.
(4) Ha = aH for all a ∈ G.
(5) aH ⊆ Ha for all a ∈ G.

Proof: (1) ⟹ (2) This follows from the definition of the set a⁻¹Ha.

(2) ⟹ (3) Suppose a⁻¹Ha ⊆ H for all a ∈ G. Then, for any a ∈ G, it is true
that (a⁻¹)⁻¹Ha⁻¹ ⊆ H. Hence, for any h ∈ H, a ∈ G, we have aha⁻¹ ∈ H, so
h = a⁻¹(aha⁻¹)a ∈ a⁻¹Ha. Since this holds for all h ∈ H, we obtain H ⊆ a⁻¹Ha
for all a ∈ G. Together with the hypothesis a⁻¹Ha ⊆ H for all a ∈ G, this
yields a⁻¹Ha = H for all a ∈ G.

(3) ⟹ (4) If a⁻¹Ha = H, then Ha = {ha ∈ G: h ∈ H}
 = {a(a⁻¹ha) ∈ G: h ∈ H}
 = {ax ∈ G: x ∈ a⁻¹Ha}
 = {ax ∈ G: x ∈ H}
 = aH.

(4) ⟹ (5) This is trivial.

(5) ⟹ (1) Suppose aH ⊆ Ha for all a ∈ G. Then a⁻¹H ⊆ Ha⁻¹ for all a ∈ G.
Keeping a fixed, we see a⁻¹h ∈ Ha⁻¹ for all h ∈ H. Thus, for all h ∈ H, there
is an h1 ∈ H such that a⁻¹h = h1a⁻¹. So a⁻¹ha = h1 ∈ H. So a⁻¹ha ∈ H for all h
in H, and this holds for all a ∈ G. □

18.3 Definition: Let H ≤ G. If H satisfies one (and hence all) of the
conditions in Lemma 18.2, then H is called a normal subgroup of G, or
normal in G.

We employ the symbol H ⊴ G to denote that H is a normal subgroup of
G. Also, H ⋬ G means that H is not a normal subgroup of G. If H is a
proper and normal subgroup of G, we write H ◁ G. Finally, H ⋪ G means
that H is not a proper normal subgroup of G.

18.4 Theorem: The operation suggested in 18.1 is well defined if and
only if H ⊴ G.

Proof: This follows from (o), Lemma 18.2(5) and Definition 18.3. □

By Lemma 18.2(4), any right coset of H in G is a left coset of H in G if and
only if H ⊴ G. So the set ℛ of right cosets of H is equal to the set ℒ of
left cosets of H if and only if H ⊴ G. Theorem 18.4 shows that we have a
well defined operation on ℛ if and only if ℛ = ℒ. This answers our two
questions. We do not have to bother about the distinction between ℛ
and ℒ: if (and only if) the operation is well defined, there is no
distinction between ℛ and ℒ. See also Ex. 1 at the end of this paragraph.

18.5 Examples: (a) For any group G, it is clear that G ⊴ G. Also, {1} ⊴
G, since a⁻¹1a ∈ {1} for all a ∈ G. We make a convention here. The trivial
subgroup {1} will henceforward be written simply as 1. It will be clear
from the context whether 1 stands for the identity element or for the
trivial subgroup. Thus 1 ⊴ G and G ⊴ G.

(b) Any subgroup of an abelian group is normal in that group. Indeed, if
G is abelian and H ≤ G, then hg = gh for all h ∈ H, g ∈ G, hence Hg = gH
for all g ∈ G. Thus H ⊴ G by Lemma 18.2(4).

In the abelian group case, Hg = gH is satisfied trivially, for hg = gh for all
h ∈ H, g ∈ G. You should notice, however, that Hg = gH does not mean
that g commutes with every element of H. This is an equation between
certain sets, so it is equivalent to the inclusions Hg ⊆ gH and gH ⊆ Hg.
The first inclusion means

for all h ∈ H, there is an h1 ∈ H such that hg = gh1.

Here h1 ≠ h in general and therefore hg = gh1 ≠ gh. The second inclusion
has a similar meaning.

Hg = gH means that, when we multiply the elements of H by g on the
right and on the left, we get the same collection of elements. It does not
mean that, when we multiply any element of H by g on the right and on
the left, we get the same product.

Many beginners misunderstand this point. Be careful not to read more
than set equality into Hg = gH. Compare this with an isometry fixing a
subset F of the Euclidean plane E and one fixing F pointwise (§14).

(c) Consider the subgroup A3 = {ι,(123),(132)} of S3. There are |S3:A3| =
|S3|/|A3| = 6/3 = 2 right cosets and 2 left cosets of A3 in S3. These are

A3 and A3(12) = {(12),(23),(13)},
A3 and (12)A3 = {(12),(13),(23)},

and so any right coset of A3 in S3 is also a left coset of A3 in S3. Thus
A3 ⊴ S3.

(d) The result in Example 18.5(c) can be generalized. Let H ≤ G be of index
|G:H| = 2. Then there are two right cosets of H in G and two left cosets of
H in G. Let H and X be the right cosets, H and Y the left cosets. From the
disjoint unions

G = H ∪ X and G = H ∪ Y,

we read off

X = G \ H = Y,

so the right cosets H, X of H in G coincide with the left cosets H, Y of H in G.
Hence H ⊴ G: if H has index two in G, then H is normal in G.

(e) Consider the subgroup H := {ι,(12)} of S3. Now |S3:H| = 6/2 = 3. The
three right cosets of H and the three left cosets of H are

H = {ι,(12)}            H = {ι,(12)}
H(13) = {(13),(123)}    (13)H = {(13),(132)}
H(23) = {(23),(132)}    (23)H = {(23),(123)}

and the right coset {(13),(123)} is not a left coset. So H ⋬ S3. In the same
way, {ι,(13)} and {ι,(23)} are not normal subgroups of S3.

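Conditions like Lemma 18.2(1) are easy to test by machine. The Python sketch below (our illustration, not the book's; permutations are tuples acting on {0, 1, 2}, and the composition convention is fixed arbitrarily, since normality does not depend on it) confirms that A3 is normal in S3 while {ι, (12)} is not.

```python
from itertools import permutations

def compose(p, q):
    """Apply q first, then p."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def is_normal(H, G):
    """Test condition (1) of Lemma 18.2: a^(-1) h a lies in H for all a, h."""
    return all(compose(compose(inverse(a), h), a) in H
               for a in G for h in H)

S3 = list(permutations(range(3)))
A3 = [p for p in S3 if sum(1 for i in range(3) for j in range(i + 1, 3)
                           if p[i] > p[j]) % 2 == 0]
H  = [(0, 1, 2), (1, 0, 2)]        # {iota, (12)} written on {0, 1, 2}

print(is_normal(A3, S3))   # True
print(is_normal(H, S3))    # False
```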
(f) Let H = {ι,(12),(34),(12)(34)}. It is easy to see that H ≤ S4. Is H
normal in S4? We compare the right and left cosets of H in S4. Aside
from H, we see that the right coset

H(13)(24) = {(13)(24),(1423),(3241),(14)(23)}

is a left coset:

(13)(24)H = {(13)(24),(1324),(1423),(14)(23)},

since (3241) = (1324). This is of course not enough to conclude H ⊴ S4.
We must examine the other cosets also. We see

H(13) = {(13),(123),(341),(1234)}
(13)H = {(13),(132), ...}

and we stop here. This shows H(13) ≠ (13)H. Hence H ⋬ S4.

(g) Let V4 = {ι,(12)(34),(13)(24),(14)(23)}. It is easily seen that V4 ≤ S4.
The subgroup V4 is known as Klein's four group (after the German
mathematician Felix Klein (1849-1925); Vierergruppe, whence V4). The
cosets of V4 in S4 are

V4                                      V4
V4(12) = {(12),(34),(1324),(1423)}      (12)V4 = {(12),(34),(1423),(1324)}
V4(13) = {(13),(1234),(24),(1432)}      (13)V4 = {(13),(1432),(24),(1234)}
V4(23) = {(23),(1342),(1243),(14)}      (23)V4 = {(23),(2431),(2134),(14)}
V4(123) = {(123),(134),(243),(142)}     (123)V4 = {(123),(243),(142),(134)}
V4(132) = {(132),(234),(124),(143)}     (132)V4 = {(132),(143),(234),(124)}

and since each right coset is a left coset, V4 ⊴ S4. For a more conceptual
proof of this result, see Ex. 5 at the end of this paragraph.

(h) Consider K = {ι,(12),(13),(23),(123),(132)} ≤ S4. Is K normal in S4?
We observe (14)⁻¹K(14) = (14)K(14) = {ι,(42),(43),(23),(423),(432)} ⊈ K
and so K ⋬ S4.

(i) Normality is not an intrinsic property of a subgroup. It is meaningless
to speak about normality of a subgroup H by itself. It is only meaningful to
speak about normality of H in a group G. We have to specify the group G
as well as the subgroup H when we speak about normality. It is possible
that H ⋬ G1 and H ⊴ G2 for two groups G1, G2 containing H. Here is an
example. Take

G1 = D8 = ⟨σ, τ: σ⁴ = 1, τ² = 1, τ⁻¹στ = σ⁻¹⟩,
G2 = ⟨σ², τ⟩ = {1, σ², τ, σ²τ} ≤ G1,
H = ⟨τ⟩ = {1, τ}.

Then H ≤ G1 and H ≤ G2. Now |G2:H| = 2, so H ⊴ G2 by Example 18.5(d)
above. However

σ⁻¹Hσ = {1, σ⁻¹τσ} = {1, σ⁻²τ} = {1, σ²τ} ⊈ H

and thus H ⋬ G1.

Incidentally, G2 ⊴ G1 since |G1:G2| = 2. This shows that normality is not a
transitive relation. It is possible that H ⊴ G2, G2 ⊴ G1, yet H ⋬ G1.

(j) For any field K, we have SL(2,K) ⊴ GL(2,K). Indeed, if S ∈ SL(2,K),
then det S = 1 and, for any G ∈ GL(2,K),

det(G⁻¹SG) = det G⁻¹ · det(SG) = (det G)⁻¹(det S)(det G)
           = (det G)⁻¹ · 1 · (det G) = 1,

so G⁻¹SG ∈ SL(2,K) for all S ∈ SL(2,K), G ∈ GL(2,K),

and so SL(2,K) ⊴ GL(2,K) by Lemma 18.2(1).

(k) If H ⊴ G and K ⊴ G, then H ∩ K ⊴ G. More generally, if Hi ⊴ G
(where i ∈ I, an index set), then ⋂i∈I Hi ⊴ G. We show this. Put H = ⋂i∈I Hi
for brevity. From Hi ≤ G, it follows that H ≤ G (Example 9.4(f)). Also, for
any h ∈ H and g ∈ G,

h ∈ Hi for all i ∈ I,
g⁻¹hg ∈ Hi for all i ∈ I,
g⁻¹hg ∈ H,

and H ⊴ G by Lemma 18.2(1).

(l) If H G and K G, then H K K. Indeed, let h H K,k K.


1 1
Then k hk H since h H and H is normal in G. Also, k hk K because
h K and K is closed under multiplication. Thus k 1hk H K for all
h H K and for all k K. Thus H K K by Lemma 18.2(1).

18.6 Definition: When H ⊴ G, the set of all right cosets of H in G, which
is also the set of all left cosets of H in G by Lemma 18.2(4), will be
denoted by G/H, read G by H, or G modulo H, or G mod H.

Most authors do not insist on the condition H ⊴ G when they write G/H.
They write G/H for the set of right cosets of H in G (or for the set of
left cosets, especially when they write functions on the left) and employ
some other symbol for the set of left cosets (or for the set of
right cosets). Throughout this book, whenever we write G/H, it will be
tacitly supposed that H ⊴ G. The notation G/H is meaningless if H is not
normal in G and will not be used in this case.

18.7 Theorem: Let H ⊴ G. Then G/H is a group under the operation
suggested in 18.1, by which
Ha.Hb = Hab for all Ha, Hb ∈ G/H.

Proof: We check the group axioms.

(i) The operation on G/H is well defined by Theorem 18.4 and


the product of two right cosets is again a right coset. So G/H is closed
under this operation.

(ii) For all Ha,Hb,Hc ∈ G/H, we have (Ha.Hb)Hc = Hab.Hc =
H(ab.c) = H(a.bc) = Ha.Hbc = Ha(Hb.Hc) since ab.c = a.bc for all a,b,c ∈ G.
The operation is therefore associative.

(iii) H = H1 ∈ G/H is a right identity element of G/H since

Ha.H1 = Ha1 = Ha for all Ha ∈ G/H.

(iv) Any Ha ∈ G/H has a right inverse in G/H, namely Ha⁻¹:

Ha.Ha⁻¹ = Ha.a⁻¹ = H1 = H = identity element of G/H.

Therefore G/H is a group.

18.8 Definition: Let H G. The group G/H of Theorem 18.7 is called


the factor group of G with respect to H, or the factor group G by H, or the
factor group G mod(ulo) H. Instead of the term "factor group", the term

"quotient group" is also used. The group operation is called multiplica-
tion (of cosets).

Please notice that G/H is not a subgroup of G. The elements of G/H are
subsets of G, not elements of G.

Since the multiplication on G/H is based on the multiplication on G, we


expect that some properties of G are inherited by G/H. Here are some
properties that are taken over by the factor groups.

18.9 Lemma: Let H ⊴ G.

(1) |G/H| = |G:H|. In particular, if G is finite, so is G/H and |G/H| = |G|/|H|.
(2) If G is abelian, so is G/H.
(3) If G is cyclic, so is G/H.

Proof: (1) The elements of G/H are the cosets of H in G and there are
|G:H| cosets of H in G by Definition 10.7. So the order of G/H is the index
of H in G. The second assertion follows from Lagrange's theorem.

(2) If G is abelian, then ab = ba for all a,b ∈ G and Ha.Hb = Hab = Hba =
Hb.Ha for all Ha, Hb ∈ G/H. Thus G/H is abelian, too.

(3) Assume that G is cyclic, say G = ⟨g⟩. Then any element x of G is of the
form gⁿ, where n ∈ ℤ. Hence any coset of H in G is of the form Hx = Hgⁿ =
(Hg)ⁿ. This shows G/H = ⟨Hg⟩.

The converses of the claims in Lemma 18.9 are false. The factor group
G/H can be finite (abelian, cyclic) without G being finite (abelian, cyclic).

We close this paragraph with some examples of factor groups.

18.10 Examples: (a) Let G be a group and H = 1 = {1}. Then H ⊴ G
(Example 18.5(a)). The cosets of H = 1 are the subsets of G having only one
element:
Ha = {1}a = {a} for all a ∈ G

and multiplication in G/H = G/1 is given by

{a}{b} = {ab}.

The factor group G/1 is governed by the same operation as G. Thus G/1
is almost the same group as G. The only difference is that the elements
of G are enclosed within braces in G/1.

(b) Let ℤ be the additive group of integers and let nℤ = {nz : z ∈ ℤ}
be the subgroup of ℤ consisting of integers divisible by n. Since ℤ is
abelian, nℤ ⊴ ℤ and ℤ/nℤ consists of the n cosets
nℤ, nℤ + 1, nℤ + 2, . . . , nℤ + (n − 1),
which are usually abbreviated as
0̄, 1̄, 2̄, . . . , (n − 1)¯

(see §6; we write the cosets additively of course). Thus ℤ/nℤ = ℤn as
sets.

In the factor group ℤ/nℤ, the operation is given by
(nℤ + a) + (nℤ + b) = nℤ + (a + b) for all a,b ∈ ℤ,
which can be written shortly as
ā + b̄ = (a + b)¯ for all ā, b̄ ∈ ℤ/nℤ.
This is the definition of addition in ℤn. So the operation in ℤ/nℤ coin-
cides with the operation on ℤn that we learned in §6. Hence ℤ/nℤ = ℤn as
(additive) groups.

We understand now the real reason why addition on ℤn, as defined in §6,
is a well defined operation. It is well defined only because nℤ ⊴ ℤ.

(c) Let G = C12 = ⟨g : g¹² = 1⟩ be a cyclic group of order 12 and let H =
⟨g³⟩ = {1,g³,g⁶,g⁹} ≤ G. Since G is abelian, H ⊴ G and G/H consists of the
cosets

H = {1,g³,g⁶,g⁹},  Hg = {g,g⁴,g⁷,g¹⁰},  Hg² = {g²,g⁵,g⁸,g¹¹}.

The multiplication table of G/H is given below.

        H     Hg    Hg²

H       H     Hg    Hg²

Hg      Hg    Hg²   H

Hg²     Hg²   H     Hg
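The factor group of Example 18.10(c) can be built explicitly by machine. In the sketch below (not from the text), C12 is modeled additively as ℤ12 = {0,…,11}, so H = ⟨g³⟩ becomes {0, 3, 6, 9}; the additive modeling is our assumption for the sake of the computation.

```python
# Example 18.10(c), modeled additively: Z12 in place of C12.
H = [0, 3, 6, 9]                    # the subgroup <g^3>, written additively

def coset(a):                       # the coset H + a, as a frozenset
    return frozenset((h + a) % 12 for h in H)

G_mod_H = {coset(a) for a in range(12)}
print(len(G_mod_H))                 # 3 cosets: H, H+1, H+2

# The operation (H+a) + (H+b) = H+(a+b) is well defined because the
# group is abelian, hence H is normal: different representatives of the
# same cosets give the same answer.
assert coset(1) == coset(4) and coset(1 + 2) == coset(4 + 11)
```

The three cosets found by the program are exactly the rows of the multiplication table above, written additively.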

(d) We know V4 ⊴ S4 (Example 18.5(g)). The elements of S4/V4 are
V4, V4(12), V4(13), V4(23), V4(123), V4(132) and the multiplication table of
S4/V4 is

. V4 V4 (12) V4 (13) V4 (23) V4 (123) V4 (132)

V4 V4 V4 (12) V4 (13) V4 (23) V4 (123) V4 (132)

V4 (12) V4 (12) V4 V4 (123) V4 (132) V4 (13) V4 (23)

V4 (13) V4 (13) V4 (132) V4 V4 (123) V4 (23) V4 (12)

V4 (23) V4 (23) V4 (123) V4 (132) V4 V4 (12) V4 (13)

V4 (123) V4 (123) V4 (23) V4 (12) V4 (13) V4 (132) V4

V4 (132) V4 (132) V4 (13) V4 (23) V4 (12) V4 V4 (123)

This is almost identical with the multiplication table of S3:

        ι      (12)   (13)   (23)   (123)  (132)

ι       ι      (12)   (13)   (23)   (123)  (132)

(12)    (12)   ι      (123)  (132)  (13)   (23)

(13)    (13)   (132)  ι      (123)  (23)   (12)

(23)    (23)   (123)  (132)  ι      (12)   (13)

(123)   (123)  (23)   (12)   (13)   (132)  ι

(132)   (132)  (13)   (23)   (12)   ι      (123)

Thus S4/V4 is almost the same group as S3. They are not the same groups,
of course, for the underlying sets are different. Nevertheless, it is clear
from the tables above that the operations on S4/V4 and on S3 are closely
related. This will be made more precise in §20.

Exercises

1. Let H ≤ G and let L be the set of all left cosets of H in G. We suggest
that we define a binary operation on L, according to the "rule"
aH.bH = abH
for all a,b ∈ G. Show that this operation is well defined if and only if
H ⊴ G.

2. Let H ≤ G. Prove that H ⊴ G if and only if Ha ⊆ aH for all a ∈ G.

3. Prove that, if H ≤ G, a ∈ G and Ha is a left coset of H in G, then Ha = aH.

4. Find a group G, a subgroup H of G, and an element a of G such that
a⁻¹Ha ⊆ H but a⁻¹Ha ≠ H. Why does this not contradict Lemma 18.2?

5. Let {a,b,c,d} = {1,2,3,4}. Show that, for any σ ∈ S4,
σ⁻¹(ab)(cd)σ = (aσ,bσ)(cσ,dσ)
and thus σ⁻¹V4σ ⊆ V4 for all σ ∈ S4. This proves V4 ⊴ S4. Compare with
§15, Ex. 15.

6. Find all normal subgroups of S4 (cf. §15, Ex. 10).

7. Find all normal subgroups of SL(2,ℤ3).

8. Determine whether the following are normal subgroups in the groups


indicated.

{g GL(2, ): det g 5} in GL(2, )
{g ∈ GL(2,ℝ): det g > 0} in GL(2,ℝ)
{g ∈ GL(2,ℝ): det g < 0} in GL(2,ℝ)
{g ∈ GL(2,ℝ): det g = 1} in GL(2,ℝ)
{g ∈ GL(2,ℝ): (det g)¹⁸ = 1} in GL(2,ℝ)
{g ∈ GL(2,ℤ11): det g = 1 or 3 or 4 or 5 or 9} in GL(2,ℤ11).

9. Let K be a field. Then K\{0} is a group under multiplication. Suppose U
is a subgroup of K\{0}. Prove that {g ∈ GL(2,K): det g ∈ U} is a subgroup
of GL(2,K).

10. Let n ∈ ℕ and put

Γn = { (a b; c d) ∈ SL(2,ℤ) : a ≡ 1, b ≡ 0, c ≡ 0, d ≡ 1 (mod n) },

the entries a,b,c,d denoting those of a 2×2 matrix. Determine if
Γn ⊴ SL(2,ℤ).

11. Let H ⊴ G and let Ha ∈ G/H. Show that o(Ha) = n (n a natural
number) if and only if n is the smallest natural number such that aⁿ ∈ H.

12. Show by counterexamples that the converses of the claims in Lemma


18.9 are false.

§19
Product Sets in Groups

In the preceding paragraph, we introduced a multiplication on the set of


right cosets of a subgroup H of a given group G. This involved selecting
elements from the cosets to be multiplied. Selecting elements from the
cosets is an artificial step in this coset multiplication. We showed in
Theorem 18.4 that the resulting coset is independent of the elements
chosen when (and only when) H is a normal subgroup of G. However,
this does not get rid of the inherent artificiality of the coset multiplica-
tion we studied in §18. A more natural multiplication would treat the
elements of cosets on equal standing, rather than distinguishing (select-
ing) one of them (as in Suggestion 18.1) and then showing (as in
Theorem 18.4) that no injustice to the remaining elements has been
committed. We introduce in this paragraph a natural multiplication of
cosets, and in fact more generally of arbitrary nonempty subsets in a
group. The new multiplication will coincide with the one of Suggestion
18.1.

19.1 Definition: Let G be a group. For any nonempty subsets X,Y of G,
the product set XY is defined to be
XY = {xy ∈ G: x ∈ X, y ∈ Y}.

When X has only one element, say when X = {x}, we write xY instead of
{x}Y. Likewise, we write Xy instead of X{y}. This is consistent with the
definition of cosets (Definition 10.1).

This multiplication is associative.

19.2 Lemma: Let G be a group. For any nonempty subsets X,Y,Z of G,


there holds (XY)Z = X(YZ).

Proof: This follows from the associativity of multiplication in G:

(XY)Z = {uz ∈ G: u ∈ XY, z ∈ Z}
= {(xy)z ∈ G: x ∈ X, y ∈ Y, z ∈ Z}
= {x(yz) ∈ G: x ∈ X, y ∈ Y, z ∈ Z}
= {xv ∈ G: x ∈ X, v ∈ YZ}
= X(YZ).

Using Lemma 8.3, we may and do drop the parentheses in any product
set involving more than two subsets. For example, we write XYZUV for
(XY)(Z(UV)).

19.3 Examples: (a) Let H ≤ G. As we have remarked earlier, H{x} = Hx
= {hx: h ∈ H} is the right coset of H in G containing x ∈ G. Analogously,
{x}H = xH is the left coset of H that contains x.

(b) Let G be a group, H ≤ G and x ∈ G. Then x⁻¹Hx = {x⁻¹hx: h ∈ H} (see
Lemma 18.2) is the product of the sets {x⁻¹}, H, {x}.

(c) Let G be a group and let X be a nonempty subset of G. Then XX
consists of all products x1x2, where x1 and x2 run through X indepen-
dently. Notice that XX ≠ {x² ∈ G: x ∈ X} in general. X is a multiplicatively
closed subset of G if and only if XX ⊆ X. In particular, HH ⊆ H for any
subgroup H of G.

(d) Let G be a group and let X,Y be nonempty subsets of G. It follows
from Definition 19.1 that
XY = ∪y∈Y Xy = ∪x∈X xY.

(e) Let X = {ι,(12)}, Y = {ι,(13)}. Now X and Y are subsets of S3. Then XY =
{ιι, ι(13), (12)ι, (12)(13)} = {ι,(13),(12),(123)}. Notice that X,Y are
subgroups of S3, but XY is not. So the product of two subgroups is not
necessarily a subgroup.

(f) Let X = {ι,(13)} and V4 = {ι,(12)(34),(13)(24),(14)(23)}. Then X ≤ S4
and V4 ⊴ S4 (Example 18.10(d)). Here XV4 =
{ι, (12)(34), (13)(24), (14)(23), (13)ι, (13)(12)(34), (13)(13)(24), (13)(14)(23)}
= {ι,(12)(34),(13)(24),(14)(23),(13),(1432),(24),(1234)}.
XV4 is easily seen to be closed under multiplication, hence XV4 is a subgroup
of S4 (Lemma 9.3(2)), but not a normal subgroup of S4, for (13) ∈ XV4
but (12)⁻¹(13)(12) = (23) ∉ XV4 (Lemma 18.2(1)). We see that the
product of two subgroups is not necessarily a normal subgroup, even if
one of the factors is a normal subgroup.
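Examples 19.3(e) and (f) lend themselves to a brute-force check. The sketch below (not from the text) computes the two product sets with permutations as 0-based tuples, the left factor of a product applied first; the tuple encoding is our own convention.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

def product_set(X, Y):              # Definition 19.1: XY = {xy : x in X, y in Y}
    return {mult(x, y) for x in X for y in Y}

def is_subgroup(H):                 # finite nonempty H: closure suffices
    return all(mult(a, b) in H for a in H for b in H)

# Example (e) in S3: the product of two subgroups need not be a subgroup.
i3 = (0, 1, 2)
XY = product_set({i3, (1, 0, 2)}, {i3, (2, 1, 0)})
print(len(XY), is_subgroup(XY))     # 4 False

# Example (f) in S4: XV4 is a subgroup of order 8, since V4 is normal.
i4 = (0, 1, 2, 3)
V4 = {i4, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
XV4 = product_set({i4, (2, 1, 0, 3)}, V4)
print(len(XV4), is_subgroup(XV4))   # 8 True
```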

In Example 19.3(f) above, it is easy to see XV4 = V4X. This is the basic
reason why XV4 turns out to be a subgroup of S4. The next lemma
describes the situation.

19.4 Lemma: Let H ≤ G and K ≤ G.

(1) HK ≤ G if and only if HK = KH.
(2) If H ⊴ G or K ⊴ G, then HK ≤ G.
(3) If H ⊴ G and K ⊴ G, then HK ⊴ G.

Proof: Before we present the proof, it will be worthwhile to discuss the
equation HK = KH. What does it mean? Well, HK and KH are subsets of G
and equality of them is equivalent to the inclusions
HK ⊆ KH and KH ⊆ HK.

The first inclusion means, for any h ∈ H and k ∈ K, the element hk of G
belongs to KH, so that there are k1 ∈ K and h1 ∈ H such that hk = k1h1.
Similarly, the second inclusion means, for any k ∈ K and h ∈ H, there are
h2 ∈ H and k2 ∈ K such that kh = h2k2.

HK = KH does not mean that hk = kh for all h ∈ H, k ∈ K. Of course, if hk =
kh for all h ∈ H, k ∈ K, then trivially HK = KH. However, it does not follow
from HK = KH that hk = kh for all h ∈ H, k ∈ K. From HK = KH, it follows
only that, for any h ∈ H, k ∈ K, there are k1 ∈ K, h1 ∈ H and h2 ∈ H, k2 ∈ K
such that hk = k1h1 and kh = h2k2.

Now the proof.

(1) We are to show: (a) if HK = KH, then HK ≤ G; and (b) if HK ≤ G, then
HK = KH.

(a) Suppose first HK = KH. We prove that HK is closed under multiplica-
tion and the forming of inverses (Lemma 9.2).

(i) If HK = KH, then HK.HK = H.KH.K = H.HK.K = HH.KK ⊆ HK and
so HK is closed under multiplication (see Example 19.3(c)). If you are
not satisfied with this demonstration, here is another. Let x,y ∈ HK, say
x = hk, y = h1k1 with h,h1 ∈ H and k,k1 ∈ K. We wish to show xy ∈ HK.
Now xy = hk.h1k1 = h.kh1.k1 and kh1 ∈ KH = HK by hypothesis, so kh1 =
h2k2 for some h2 ∈ H, k2 ∈ K. So xy = h.kh1.k1 = h.h2k2.k1 = hh2.k2k1 ∈ HK
since hh2 ∈ H and k2k1 ∈ K as H ≤ G and K ≤ G. Thus HK is closed under
multiplication.

(ii) Let x ∈ HK, say x = hk with h ∈ H, k ∈ K. We are to show
that x⁻¹ ∈ HK. We have x⁻¹ = (hk)⁻¹ = k⁻¹h⁻¹ ∈ KH = HK, because k⁻¹ ∈ K and
h⁻¹ ∈ H as K and H are subgroups of G. So HK is closed under the forming
of inverses.

This proves that HK ≤ G whenever H ≤ G, K ≤ G and HK = KH.

(b) Now suppose H ≤ G, K ≤ G and HK ≤ G. We want to show HK = KH,
that is, HK ⊆ KH and KH ⊆ HK. These inclusions follow from the fact that
HK is closed under taking inverses. Indeed, if x ∈ HK, then x⁻¹ ∈ HK, say
x⁻¹ = hk with h ∈ H, k ∈ K. Then x = (hk)⁻¹ = k⁻¹h⁻¹ ∈ KH. So HK ⊆ KH. The
other inclusion is proved in the same way.

This proves that H ≤ G, K ≤ G and HK ≤ G implies HK = KH.

The proof of (1) is complete.

(2) We suppose H ⊴ G, K ≤ G and prove that HK ≤ G. According to part
(1), it suffices to show HK = KH. First we prove HK ⊆ KH. Let h ∈ H, k ∈ K.
Then k⁻¹hk ∈ H since H ⊴ G (Lemma 18.2(1)) and hk = k.k⁻¹hk ∈ KH. This
proves HK ⊆ KH. Now we prove KH ⊆ HK. For any h ∈ H, k ∈ K, we have
khk⁻¹ ∈ H since H ⊴ G and thus kh = khk⁻¹.k ∈ HK and KH ⊆ HK.
Therefore HK = KH and HK ≤ G.

The proof of HK ≤ G under the hypotheses H ≤ G, K ⊴ G follows similar
lines and is left to the reader.

(3) We now assume H ⊴ G, K ⊴ G. From part (2), we get HK ≤ G. We are
to show HK ⊴ G. To do that, we prove g⁻¹xg ∈ HK for all g ∈ G, x ∈ HK
(Lemma 18.2(1)). For any x ∈ HK, there are h ∈ H, k ∈ K with x = hk and
g⁻¹xg = g⁻¹hkg = g⁻¹hg.g⁻¹kg ∈ HK since g⁻¹hg ∈ H and g⁻¹kg ∈ K as H ⊴ G
and K ⊴ G. Hence HK ⊴ G.

This completes the proof.

In Lemma 19.4(3), it would not be enough to prove that g⁻¹xg ∈ HK for
all g ∈ G, x ∈ HK. It is necessary to show HK ≤ G also. Generally
speaking, "A ⊴ B" summarizes two conditions on A and B: that A is a
subgroup of B and that A is normal in B. We must check both of them
whenever we want to show A ⊴ B.

We turn our attention to the product of two right cosets. The product of
two right cosets, as in Definition 19.1, is a subset of the group under
discussion. When is it a right coset? The next lemma gives the answer.

19.5 Lemma: Let H ≤ G. The product of arbitrary right cosets of H in G,
according to Definition 19.1, is always a right coset of H in G if and only
if H ⊴ G.

Proof: The product of Ha and Hb (where a,b ∈ G) is

HaHb = {hah1b ∈ G: h,h1 ∈ H}

and ab = 1a1b ∈ HaHb. Thus HaHb is a right coset of H in G if and only if
it is the right coset of H in G to which ab belongs:

HaHb is a right coset of H in G ⟺ HaHb = Hab.

We show that HaHb = Hab for all a,b ∈ G if and only if H ⊴ G.

If H ⊴ G, then HaHb = Hab for all a,b ∈ G. Indeed, if H ⊴ G, then aH = Ha
for all a ∈ G (Lemma 18.2(4)), and, for any a,b ∈ G, we have
HaHb = H.aH.b = H.Ha.b = HH.ab = Hab.
Here we use HH = H, which follows from HH ⊆ H (Example 19.3(c)) and
H = 1H ⊆ HH.

Conversely, assume HaHb = Hab for all a,b ∈ G. Then

HaH = (HaH)(bb⁻¹) = (HaHb)b⁻¹ = (Hab)b⁻¹ = (Ha)bb⁻¹ = Ha
HaH = Ha
aH = 1aH ⊆ HaH = Ha
and so aH ⊆ Ha for all a ∈ G. From Lemma 18.2(5), we obtain H ⊴ G.

The product of any two right cosets of H ≤ G, as in Suggestion 18.1, is
always a right coset of H, provided this multiplication is well defined,
and it is well defined if and only if H ⊴ G. On the other hand, the
product of any two right cosets of H ≤ G, as in Definition 19.1, is always
a definite subset of G, but this subset is a right coset of H if and only if
H ⊴ G. The relation HaHb = Hab in the proof of Lemma 19.5 shows that
these two multiplications are identical when H ⊴ G.

We know that HK is not necessarily a subgroup of G even if H ≤ G, K ≤ G.
It is a subset of G. We now determine the number of elements in it.

19.6 Lemma: Let H,K ≤ G and assume that H and K are finite. Then HK
is a finite subset of G, whose cardinality is given by

|HK| = |H| |K| / |H ∩ K|.

Proof: We list all products hk, where h and k run through H and K,
respectively. In this way, we get |H| |K| elements of G. These are the
elements of HK. Naïvely, we expect |HK| to be equal to |H| |K|, but there
may be repetitions in our list: the same element of HK may be written
more than once. We have to keep account of repetitions. We show that
each of the |H| |K| products hk appears exactly n times in our list, where
n := |H ∩ K|. Thus there are |H| |K|/n distinct elements in the list and |HK| =
|H| |K|/n. In other words, the mapping

φ: H × K → HK
(h,k) ↦ hk

is an n-to-one mapping. By this, we understand that exactly n elements
in the domain H × K have the same image under φ.

To prove that φ is an n-to-one mapping, let us investigate when we have
(h1,k1)φ = (h2,k2)φ. Well, (h1,k1)φ = (h2,k2)φ if and only if h1k1 = h2k2 and
therefore if and only if h1⁻¹h2 = k1k2⁻¹ = s belongs to H ∩ K. Thus (h1,k1)
and (h2,k2) have the same image under φ if and only if h2 = h1s and
k2 = s⁻¹k1 for some s ∈ H ∩ K. Denoting by 1 = s1,s2, . . . ,sn the n = |H ∩ K|
elements of H ∩ K, we conclude that the n ordered pairs
(h1,k1),(h1s2,s2⁻¹k1),(h1s3,s3⁻¹k1), . . . ,(h1sn,sn⁻¹k1)
and only these ordered pairs have the image h1k1 under φ. This proves
that φ is indeed n-to-one, and consequently

|HK| = |H| |K| / |H ∩ K|.
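The counting formula of Lemma 19.6 is easy to test numerically. The sketch below (not from the text) checks it for a few subgroup pairs inside S4, using the same 0-based tuple representation of permutations as before.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

S4 = set(permutations(range(4)))
i4 = (0, 1, 2, 3)
V4 = frozenset({i4, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)})
S3 = frozenset(p for p in S4 if p[3] == 3)  # copy of S3 fixing the last point
X = frozenset({i4, (2, 1, 0, 3)})           # {iota, (13)}

for H, K in ((V4, S3), (X, V4), (S3, S3)):
    HK = {mult(h, k) for h in H for k in K}
    # |HK| = |H||K| / |H ∩ K|
    assert len(HK) == len(H) * len(K) // len(H & K)
print("Lemma 19.6 verified on these examples")
```

For H = V4 and K the copy of S3, the intersection is trivial and |HK| = 4·6 = 24, so HK is all of S4.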

Exercises

1. Let X,Y be arbitrary nonempty subsets of a group G and let g be an
arbitrary element of G. Prove the following equivalences.

X ⊆ Y ⟺ gX ⊆ gY ⟺ g⁻¹Xg ⊆ g⁻¹Yg;
X = Y ⟺ gX = gY ⟺ Xg = Yg ⟺ g⁻¹Xg = g⁻¹Yg;
X = gY ⟺ g⁻¹X = Y.

2. Let H,K be subgroups of a group G. Assume that G is finite, |H| > √|G|
and |K| > √|G|. Prove that H ∩ K ≠ 1.

3. Let A,B,C be subgroups of a group G, with A ⊆ C. Prove that
A(B ∩ C) = AB ∩ C.

4. Let H,K be subgroups of a group G and let g ∈ G. Prove that

|HgK| = |H| |K| / |g⁻¹Hg ∩ K|.

(A subset of the form HgK is called a double coset.)

§20
Group Homomorphisms

In Example 18.10(d), we have observed that the groups S4/V4 and S3


have almost the same multiplication table. They have the same struc-
ture. In this paragraph, we study groups with the same structure.

20.1 Definition: Let G and G1 be groups and let φ: G → G1 be a mapping
from G into G1. If
(ab)φ = aφ.bφ for all a,b ∈ G,

then φ is called a (group) homomorphism.

The equation (ab)φ = aφ.bφ is paraphrased by saying that φ preserves
multiplication or that φ preserves products. Loosely speaking, a homo-
morphism is a mapping under which the image of a product is the
product of the images.

Here "products" might refer to different operations. For a,b ∈ G, the
product ab ∈ G is clearly the result of the binary operation of the group
G, whereas aφ,bφ ∈ G1 and aφ.bφ ∈ G1 is the result of the binary operation
of the group G1. This is implicit in the equation (ab)φ = aφ.bφ, which does
not make any sense unless ab is the product of a,b in G and aφ.bφ is the
product of aφ,bφ in G1.

More precisely, if o is the binary operation on G and if * is the binary
operation on G1, then φ: G → G1 is a homomorphism provided
(a o b)φ = aφ * bφ for all a,b ∈ G.

20.2 Examples: (a) One homomorphism is very well known to the
reader. It is the logarithm function
log: ℝ⁺ → ℝ
from the group ℝ⁺ of positive real numbers (under multiplication) into
the group ℝ of all real numbers (under addition). The homomorphism
property of the logarithm function is the well known identity
log ab = log a + log b
that holds for all a,b ∈ ℝ⁺.

(b) The determinant mapping
det: GL(2,ℚ) → ℚ\{0}
is a homomorphism from GL(2,ℚ) into the group of nonzero rational
numbers under multiplication, for
det AB = (det A)(det B)
for all A,B ∈ GL(2,ℚ) by Theorem 17.9(1). The same thing is true for the
mapping det: GL(2,K) → K\{0}, where K is an arbitrary field.

(c) The sign mapping
ε: Sn → {1,−1}
is a homomorphism from Sn into the multiplicative group {1,−1} since
(στ)ε = (σε)(τε)
for all σ,τ ∈ Sn by Theorem 16.7.
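The homomorphism property of the sign mapping can be verified exhaustively for small n. The sketch below (not from the text) checks it on all of S4, computing the sign of a 0-based tuple permutation from its number of inversions; this encoding is our own convention.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

def sign(p):                        # parity via the number of inversions
    inversions = sum(p[i] > p[j]
                     for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

S4 = list(permutations(range(4)))
assert all(sign(mult(s, t)) == sign(s) * sign(t) for s in S4 for t in S4)
print("sign(st) = sign(s)sign(t) on all of S4")
```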

(d) The absolute value function
| |: ℝ\{0} → ℝ⁺
a ↦ |a|
is a homomorphism from the group of all nonzero real numbers (under
multiplication) into the group of positive real numbers (under multi-
plication) since
|ab| = |a| |b|
for all a,b ∈ ℝ\{0}.

(e) The signum function
sgn: ℝ\{0} → {1,−1}
x ↦ 1 if x is positive,
−1 if x is negative
is a homomorphism from the group of nonzero real numbers into the
group {1,−1}.

(f) Let G be a group. Then the identity mapping
ι: G → G
is a homomorphism from G into G since
(ab)ι = ab = aι.bι
for all a,b ∈ G. More generally, let H be a subgroup of G and let
μ: H → G
h ↦ h
be the inclusion mapping (Example 3.2(a)). Then
(ab)μ = ab = aμ.bμ
for all a,b ∈ H. Hence μ is a homomorphism. Both ι and μ are one-to-one
homomorphisms.

(g) Let φ: G → G1 be a group homomorphism and let H ≤ G. Then the
restriction
φ|H : H → G1
of φ to H (Example 3.2(i)) is a homomorphism from H into G1 since
(ab)(φ|H) = (ab)φ = (aφ)(bφ) = a(φ|H).b(φ|H)
for all a,b ∈ H.

20.3 Lemma: Let φ: G → G1 be a homomorphism of groups.

(1) 1φ = 1.
(2) (a⁻¹)φ = (aφ)⁻¹ for all a ∈ G.
(3) (a1a2. . .an)φ = (a1φ)(a2φ). . .(anφ) for all a1,a2, . . . ,an ∈ G, n ∈ ℕ, n ≥ 2.
(4) (aⁿ)φ = (aφ)ⁿ for all a ∈ G, n ∈ ℤ.
(5) If o(aφ) = ∞, then o(a) = ∞. If o(a) = n ∈ ℕ, then o(aφ) divides n; in
particular, o(aφ) ≤ o(a).

Proof: (1) Here we use the same symbol "1" with two different
meanings. In "1φ", 1 is the identity element of the group G. On the right
hand side, 1 is the identity element of the group G1. A more accurate
way of writing the claim is

(1G)φ = 1G₁,

where 1G is the identity element of G and 1G₁ is that of G1. For the homo-
morphisms in Examples 20.2(a)-(e), the assertion means

log 1 = 0; det (1 0; 0 1) = 1; ι is an even permutation; |1| = 1; 1 is positive

respectively.

The proof is easy. We have 1φ = (1.1)φ = 1φ.1φ, hence 1φ is the identity
of G1 by Lemma 7.3(1). One can also use aφ.1φ = (a1)φ = aφ with some a ∈
G to conclude 1φ = 1.

(2) For any a ∈ G, we have (aφ)(a⁻¹φ) = (aa⁻¹)φ = 1φ = 1 = (aφ)(aφ)⁻¹, hence
a⁻¹φ = (aφ)⁻¹.

(3) We make induction on n. The case n = 2 is covered by the very defi-
nition of a homomorphism. Supposing the claim to be true for n = k, i.e.,
supposing (a1a2. . .ak)φ = (a1φ)(a2φ). . .(akφ) for all a1,a2, . . . ,ak ∈ G, we get

(a1a2. . .akak+1)φ = ((a1a2. . .ak)ak+1)φ
= (a1a2. . .ak)φ (ak+1φ)
= (a1φ)(a2φ). . .(akφ)(ak+1φ)

and the claim is true for n = k + 1. Hence it is true for all n ∈ ℕ, n ≥ 2.

(4) We prove (aⁿ)φ = (aφ)ⁿ for all a ∈ G, n ∈ ℤ. If n > 0, this follows from
(3) when we take a1 = a2 = . . . = an = a. If n < 0, this follows from (3)
when we take a1 = a2 = . . . = a−n = a⁻¹. If n = 0, the claim is proved in (1).

(5) Suppose o(aφ) = ∞. If o(a) were a natural number m, then we would
obtain aᵐ = 1, so (aφ)ᵐ = (aᵐ)φ = 1φ = 1 and aφ would be of finite order
by Lemma 11.4, a contradiction. Thus o(aφ) = ∞ implies o(a) = ∞.

Suppose o(a) = n ∈ ℕ. Then aⁿ = 1 and (aφ)ⁿ = (aⁿ)φ = 1φ = 1, so o(aφ) | n
by Lemma 11.4 and Lemma 11.6.
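Part (5) of the lemma can be spot-checked on a concrete homomorphism. The sketch below (not from the text) uses the sign mapping on S4: the order of the image of each permutation divides the order of the permutation. Permutations are again 0-based tuples, an encoding of our own.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

def order(p):                       # smallest n >= 1 with p^n = identity
    e, q, n = tuple(range(len(p))), p, 1
    while q != e:
        q, n = mult(q, p), n + 1
    return n

def sign(p):
    inversions = sum(p[i] > p[j]
                     for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

for p in permutations(range(4)):
    image_order = 1 if sign(p) == 1 else 2   # order of sign(p) in {1,-1}
    assert order(p) % image_order == 0       # o(p under sign) divides o(p)
print("Lemma 20.3(5) holds for the sign mapping on S4")
```

In particular an odd permutation always has even order, which is the content of the divisibility here.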

Next we show that composition of homomorphisms is also a homo-


morphism.

20.4 Theorem: Let φ: G → G1 and ψ: G1 → G2 be group homomorphisms.
Then the composition mapping

φψ: G → G2

is a homomorphism from G into G2.

Proof: We are to show that (ab)(φψ) = a(φψ).b(φψ) for all a,b ∈ G. This
follows immediately:

(ab)(φψ) = ((ab)φ)ψ (definition of φψ)
= (aφ.bφ)ψ (φ is a homomorphism)
= (aφ)ψ.(bφ)ψ (ψ is a homomorphism)
= a(φψ).b(φψ) (definition of φψ)

for all a,b ∈ G. Hence φψ is indeed a homomorphism.

20.5 Definition: Let φ: G → G1 be a group homomorphism. The set

{aφ ∈ G1: a ∈ G} = {b ∈ G1: b = aφ for some a ∈ G}

of all images (under φ) of the elements of G is called the image of φ and
is denoted by Im φ or by Gφ. The set

{a ∈ G: aφ = 1}

of all elements of the domain G that are mapped to the identity of the
range group G1 is called the kernel of φ and is written as Ker φ.

Thus Im φ ⊆ G1 and Ker φ ⊆ G. It is immediate from the definition of Im φ
that Im φ ≠ ∅, for G ≠ ∅. Also, 1 = 1G ∈ Ker φ by Lemma 20.3(1), so Ker φ
≠ ∅. We prove now that Im φ is a subgroup of G1 and that Ker φ is a
subgroup of G. In fact, Ker φ is a normal subgroup of G. This is a very
important fact.

20.6 Theorem: Let φ: G → G1 be a group homomorphism. Then
Im φ ≤ G1 and Ker φ ⊴ G.

Proof: First we prove Im φ ≤ G1. We know Im φ ≠ ∅. We use our sub-
group criterion (Lemma 9.2).

(i) Let x,y ∈ Im φ. We are to show xy ∈ Im φ. Now x,y ∈ Im φ
means x = aφ, y = bφ for some a,b ∈ G. Then xy = (aφ)(bφ) = (ab)φ is the
image (under φ) of an element in G, namely of ab ∈ G. So xy ∈ Im φ and
Im φ is closed under multiplication.

(ii) Let x ∈ Im φ. We are to show x⁻¹ ∈ Im φ. Now x ∈ Im φ
means x = aφ for some a ∈ G. Then x⁻¹ = (aφ)⁻¹ = a⁻¹φ is the image (under
φ) of an element in G, namely of a⁻¹ ∈ G. So x⁻¹ ∈ Im φ and Im φ is closed
under taking inverses.

Thus Im φ ≤ G1.

Now we prove Ker φ ⊴ G. First Ker φ ≤ G. We know Ker φ ≠ ∅. Compare
the following with the proof of Theorem 17.12.

(i) For any a,b ∈ Ker φ, we have aφ = 1 = bφ, so (ab)φ =
(aφ)(bφ) = 1.1 = 1 and ab ∈ Ker φ. Thus Ker φ is closed under
multiplication.

(ii) For any a ∈ Ker φ, we have aφ = 1, so a⁻¹φ = (aφ)⁻¹ = 1⁻¹ = 1
and a⁻¹ ∈ Ker φ. Thus Ker φ is closed under taking inverses.

Therefore Ker φ ≤ G. Now we prove that Ker φ is a normal subgroup of G.
Compare the following with Example 18.5(j).

We must show that g⁻¹kg ∈ Ker φ for any g ∈ G, k ∈ Ker φ (Lemma
18.2(1)). This is easy: if k ∈ Ker φ, then kφ = 1 and, for any g ∈ G,
(g⁻¹kg)φ = (g⁻¹φ)(kφ)(gφ) = (gφ)⁻¹.1.gφ = 1,
so g⁻¹kg ∈ Ker φ. Thus Ker φ ⊴ G.
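Theorem 20.6 can be illustrated concretely with the sign homomorphism on S4. The sketch below (not from the text) computes its kernel and image and checks by brute force that the kernel (the even permutations, A4) is normal; the 0-based tuple encoding is our own convention.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def sign(p):
    inversions = sum(p[i] > p[j]
                     for i in range(len(p)) for j in range(i + 1, len(p)))
    return -1 if inversions % 2 else 1

S4 = set(permutations(range(4)))
kernel = {p for p in S4 if sign(p) == 1}    # Ker = A4, the even permutations
image = {sign(p) for p in S4}               # Im = {1, -1}

print(len(kernel), sorted(image))           # 12 [-1, 1]
# The kernel is closed under conjugation, i.e. normal in S4:
assert all(mult(mult(inverse(g), k), g) in kernel
           for g in S4 for k in kernel)
```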

The elements of a group which have the same image under a homo-
morphism make up a coset of the kernel of that homomorphism.

20.7 Lemma: Let φ: G → G1 be a group homomorphism. For any a,b ∈ G,
there holds aφ = bφ if and only if (Ker φ)a = (Ker φ)b.

Proof: Let a,b ∈ G. Then aφ = bφ if and only if
(aφ)(bφ)⁻¹ = 1, so if and only if
(aφ)(b⁻¹φ) = 1, so if and only if
(ab⁻¹)φ = 1, so if and only if
ab⁻¹ ∈ Ker φ, so if and only if
(Ker φ)a = (Ker φ)b

by Lemma 10.2(5).

Since Ker φ ⊴ G by Theorem 20.6, we also have a(Ker φ) = {b ∈ G: bφ =
aφ}. Alternatively, one may prove a lemma analogous to Lemma 20.7,
stating that a and b have the same image under φ if and only if the left
cosets a(Ker φ) and b(Ker φ) are equal, and combine it with Lemma 20.7
to get a(Ker φ) = (Ker φ)a, thereby proving Ker φ ⊴ G anew.

It follows from Lemma 20.7 that φ is a one-to-one homomorphism if and
only if Ker φ has only one element. We give a direct proof of this.

20.8 Theorem: Let φ: G → G1 be a group homomorphism. Then φ is one-
to-one if and only if Ker φ = 1.

Proof: Here 1 is the trivial subgroup of G (Example 18.5(a)). We
prove φ is not one-to-one if and only if Ker φ ≠ 1.

If φ is not one-to-one, then there are a,b ∈ G with aφ = bφ and a ≠ b. We
obtain then 1 = aφ.(aφ)⁻¹ = aφ.(bφ)⁻¹ = aφ.b⁻¹φ = (ab⁻¹)φ, with ab⁻¹ ≠ 1. Thus
1 ≠ ab⁻¹ ∈ Ker φ and Ker φ ≠ 1.

Conversely, if Ker φ ≠ 1, then there is an a ∈ Ker φ with a ≠ 1. Then we
have aφ = 1 = 1φ and a ≠ 1. So φ is not one-to-one.

We can determine whether a homomorphism is one-to-one by examin-
ing its kernel. A homomorphism φ is one-to-one if and only if Ker φ = 1.
Also, we can determine whether a homomorphism is onto by examining
its image. A homomorphism φ is onto if and only if Im φ is the whole
range. Homomorphisms which are both one-to-one and onto will have a
name.

20.9 Definition: A group homomorphism φ: G → G1 is called an iso-
morphism if it is one-to-one and onto. If there is an isomorphism from
G onto G1, we say G is isomorphic to G1, and write G ≅ G1. If G is not
isomorphic to G1, we write G ≇ G1.

20.10 Examples: (a) The logarithm function is well known to be a one-
to-one function onto the set of real numbers. Thus
log: ℝ⁺ → ℝ
is an isomorphism.

(b) For any group G, the identity mapping
ι: G → G
is an isomorphism.

(c) Let G be a group. Then
φ: G → G/1
g ↦ {g}
is an isomorphism from G onto G/1 (see Example 18.10(a)). Thus G ≅ G/1.

(d) The mapping
ψ: S3 → S4/V4
σ ↦ V4σ
(where, on the right hand side, σ is regarded as the permutation in S4
that fixes 4 and maps 1,2,3 as σ ∈ S3 does) is a homomorphism. This is
evident from the tables in Example 18.10(d). Also, ψ is clearly one-to-one
and onto. So ψ is an isomorphism and S3 ≅ S4/V4.
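Instead of comparing tables by eye, the isomorphism of Example 20.10(d) can be verified by machine. The sketch below (not from the text) maps each permutation of three points to the coset of V4 containing its extension to four points and checks that the map is a bijective homomorphism; the 0-based tuple encoding is our own convention.

```python
from itertools import permutations

def mult(s, t):                     # st: apply s first, then t
    return tuple(t[s[i]] for i in range(len(s)))

i4 = (0, 1, 2, 3)
V4 = {i4, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}

def coset(g):                       # the right coset V4.g, as a frozenset
    return frozenset(mult(v, g) for v in V4)

S3 = list(permutations(range(3)))
psi = {p: coset(p + (3,)) for p in S3}   # p + (3,) fixes the last point

assert len(set(psi.values())) == 6       # one-to-one and onto S4/V4
for s in S3:                             # (st)psi = (s psi)(t psi), the coset
    for t in S3:                         # product computed elementwise
        assert psi[mult(s, t)] == frozenset(
            mult(x, y) for x in psi[s] for y in psi[t])
print("S3 is isomorphic to S4/V4")
```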

An isomorphism, being one-to-one and onto, has an inverse mapping. It


is natural to ask if the inverse of an isomorphism is an isomorphism.
Also, is it true that composition of two isomorphisms is an isomorphism?

20.11 Lemma: Let φ: G → G1 and ψ: G1 → G2 be group isomorphisms.

(1) The composition φψ: G → G2 is an isomorphism from G onto G2.
(2) The inverse φ⁻¹: G1 → G of φ is an isomorphism from G1 onto G.

Proof: (1) The composition φψ is a homomorphism by Theorem 20.4. It
is one-to-one and onto by Theorem 3.13. So φψ is an isomorphism.

(2) For any x,y ∈ G1, we must show (xy)φ⁻¹ = xφ⁻¹.yφ⁻¹. Since φ is onto,
there are a,b ∈ G such that aφ = x and bφ = y. Now a and b are unique
with this property, for φ is one-to-one, and a = xφ⁻¹, b = yφ⁻¹. This is the
definition of the inverse mapping. Since φ is a homomorphism, we have

(ab)φ = aφ.bφ = xy.

Hence, by definition of φ⁻¹, we get ab = (xy)φ⁻¹. Thus

(xy)φ⁻¹ = ab = xφ⁻¹.yφ⁻¹

and this holds for all x,y ∈ G1. So φ⁻¹: G1 → G is a homomorphism. As it is
one-to-one and onto by Theorem 3.17(1), φ⁻¹ is an isomorphism.

From Example 20.10(b) and Lemma 20.11, we see that

G ≅ G
if G ≅ G1, then G1 ≅ G
if G ≅ G1 and G1 ≅ G2, then G ≅ G2

for any groups G,G1,G2. We are tempted to say that ≅ is an equivalence
relation on the set of all groups. It is true indeed that ≅ is an equival-
ence relation, but we must avoid the phrase "the set of all groups". This
phrase leads to logical difficulties. For more information about this point,
the reader is referred to the appendix.

Since G ≅ G1 implies G1 ≅ G, it is legitimate to say G and G1 are isomorphic
when G is isomorphic to G1.

We are not interested in the nature of the elements in a group. The
essential thing is the algebraic structure of the group. If G ≅ G1, then any
algebraic property of G is immediately carried over to G1. For this reason,
we do not distinguish between isomorphic groups. For example, any two
cyclic groups of the same order are easily seen to be isomorphic. By
abuse of language, we call any cyclic group of order n the cyclic
group of order n, and write Cn for it. Likewise, any two dihedral groups
of order 2n are isomorphic, and we speak of the dihedral group of order
2n, and write D2n for it.

We saw in Theorem 20.6 that the kernel of any homomorphism is a


normal subgroup of the domain. We show now conversely that any
normal subgroup of G is the kernel of some homomorphism from G. Into
which group? Since we are given only a normal subgroup of G, the only

range group that we can construct out of G and its normal subgroup is
the factor group with respect to that normal subgroup.

20.12 Theorem: Let N ⊴ G. Then the mapping

ν: G → G/N
a ↦ Na

is a homomorphism. It is onto G/N and Ker ν = N.

Proof: ν is a homomorphism, for (ab)ν = N(ab) = Na.Nb = aν.bν for all a,b
in G, by the very definition of multiplication in G/N. Obviously, any
element Na of G/N is the image of a ∈ G under ν, so ν is onto. Finally

Ker ν = {a ∈ G: aν = identity of G/N}
= {a ∈ G: aν = N1}
= {a ∈ G: Na = N}
= {a ∈ G: a ∈ N} (Lemma 10.2(2))
= N.

20.13 Definition: Let N ⊴ G. The mapping ν: G → G/N, a ↦ Na, is called
the natural (or canonical) homomorphism from G onto G/N.

20.14 Theorem: Let N ⊴ G. Then there is a homomorphism φ: G → G1
with Ker φ = N.

Proof: N is the kernel of the natural homomorphism ν: G → G/N by
Theorem 20.12.

The coincidence of kernels with normal subgroups shows that normal
subgroups, factor groups and homomorphisms are closely related. The-
orem 20.12 describes the relation between N ⊴ G, G/N and the natural
homomorphism ν: G → G/N. We prove next that any homomorphism φ is
connected in the same way to Ker φ and G/Ker φ as ν is connected to N
and G/N. This is done by showing that φ is essentially a natural homo-
morphism.

20.15 Theorem (Fundamental theorem on homomorphisms): Let
φ: G → G1 be a homomorphism of groups. Let N = Ker φ, which is normal
in G by Theorem 20.6, and let ν: G → G/N be the associated natural
homomorphism.
Then there is a one-to-one homomorphism ψ: G/N → G1 such that φ = νψ.

[This theorem may be summarized in a diagram. The hypothesis is the
diagram (a) below. The claim is that there is a one-to-one homomorph-
ism ψ such that both paths from G to G1 (φ and νψ) in diagram (b) have
the same effect.

        G/N                        G/N
    ν ↗                        ν ↗     ↘ ψ
   G ───φ──→ G1               G ───φ──→ G1

       (a)                         (b)

The equation φ = νψ can be regarded as a factorization of φ. Since the
path νψ passes through G/Ker φ, we say φ factors through G/Ker φ.]

Proof: We must find a suitable ψ: G/N → G1. We want φ = νψ, so that aφ =
a(νψ) = (aν)ψ = (Na)ψ. So we define

ψ: G/N → G1
Na ↦ aφ.

In order to find the image of any coset of N under ψ, we have to choose
an element a from that coset, which can be done, generally speaking, in
many ways. So we have to make sure that ψ is a well defined function.
Thus we have to show

for all a,b ∈ G, Na = Nb ⟹ (Na)ψ = (Nb)ψ.

From the definition of N = Ker φ and of ψ, we see that this implication is
equivalent to

for all a,b ∈ G, (Ker φ)a = (Ker φ)b ⟹ aφ = bφ,

and this is true by Lemma 20.7. Thus ψ is indeed a well defined
mapping.

is a homomorphism. This is verified easily:


?
(Na.Nb) = (Na) .(Nb) for all a,b G
?
(Nab) = (Na) .(Nb) for all a,b G
?
(ab) = a .b for all a,b G

Since is a homomorphism, the last line is true. Hence is a homo-


morphism.

is one-to-one. To prove this, we need only show Ker = {N} (see


Theorem 20.8; N is the identity of G/N). We observe

Ker = {Na G/N: (Na) = 1 = 1G }


1

= {Na G/N: a = 1}
= {Na G/N: a Ker }
= {Na G/N: a N}
= {N}

by Lemma 10.2(2) and is one-to-one.

From the definition of , we have a( ) = (a ) = (Na) =a for all a G,


so = .

This completes the proof.

20.16 Theorem: Let φ: G → G1 be a group homomorphism. Then

G/Ker φ ≅ Im φ.

In more detail: there is an isomorphism ψ´: G/Ker φ → Im φ such that
νψ´ι = φ, where ν: G → G/Ker φ is the natural homomorphism and ι: Im φ → G1
is the inclusion homomorphism (Example 20.2(f)). This means that the
diagram

φ: G → G1, with ν: G → G/Ker φ and ι: Im φ → G1,

can be so completed with a homomorphism ψ´: G/Ker φ → Im φ that both
paths φ and νψ´ι have the same effect.

Proof: We use the homomorphism ψ: G/N → G1 of Theorem 20.15, where
N = Ker φ. Obviously,
Im ψ = {(Na)ψ: Na ∈ G/N} = {aφ: a ∈ G} = Im φ
and since ψ is one-to-one, ψ is an isomorphism from G/Ker φ onto Im ψ =
Im φ. We observe
a(νψι) = (aν)(ψι) = (Na)(ψι) = ((Na)ψ)ι = (aφ)ι = aφ
for all a ∈ G, as ι maps any element of Im φ to itself. Hence νψι = φ. The
theorem follows when we write ψ´ in place of ψ.

According to Theorem 20.16, any homomorphism φ: G → G1 is factored
into three homomorphisms ν, ψ´, ι:

G → G/Ker φ → Im φ → G1

where (a) ν is onto G/Ker φ; (b) ψ´ is one-to-one and onto Im φ and (c) ι
is one-to-one. So νψ´ is onto and ψ´ι is one-to-one (Theorem 3.11). Hence,
if φ fails to be onto, it is only due to the fact that ι is not onto. Also, if φ
fails to be one-to-one, it is only due to the fact that ν is not one-to-one.
We see that any homomorphism is essentially an isomorphism ψ´,
"diluted" by a natural homomorphism which (eventually) accounts for its
failure to be one-to-one and by an inclusion mapping which (eventually)
accounts for its failure to be onto. In fact, φ is one-to-one if and only if
the associated natural homomorphism ν: G → G/Ker φ is one-to-one and
φ is onto if and only if the associated inclusion mapping ι: Im φ → G1 is
onto.
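The factorization G/Ker φ ≅ Im φ can also be checked by brute force on a small example. The following Python sketch is an illustration of our own (the homomorphism x ↦ 3x on ℤ/12ℤ is a hypothetical choice, not an example from the text): it lists the cosets of the kernel and verifies that the induced map ψ´ is well defined and bijective onto the image.

```python
# Brute-force illustration of G/Ker phi = Im phi (Theorem 20.16)
# for the hypothetical example phi: Z_12 -> Z_12, x -> 3x mod 12.
n = 12
phi = lambda x: (3 * x) % n

kernel = [x for x in range(n) if phi(x) == 0]       # Ker phi = {0, 4, 8}
image = sorted({phi(x) for x in range(n)})          # Im phi = {0, 3, 6, 9}

# Cosets of the kernel: (Ker phi) + x for each x.
cosets = {frozenset((x + k) % n for k in kernel) for x in range(n)}

# psi': (Ker phi)a -> a phi must not depend on the chosen representative.
psi = {c: {phi(x) for x in c} for c in cosets}
assert all(len(v) == 1 for v in psi.values())       # well defined
assert sorted(next(iter(v)) for v in psi.values()) == image  # onto Im phi
assert len(cosets) == len(image)                    # |G/Ker phi| = |Im phi|
print(len(cosets), image)                           # -> 4 [0, 3, 6, 9]
```

The four cosets {0,4,8}, {1,5,9}, {2,6,10}, {3,7,11} correspond bijectively to the four image elements 0, 3, 6, 9, exactly as the theorem predicts.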

20.17 Examples: (a) Let ⟨a⟩ = {a^n: n ∈ ℤ} be a cyclic group. The
mapping

φ: ℤ → ⟨a⟩
n ↦ a^n

is a homomorphism from the additive group ℤ into ⟨a⟩, because

(m + n)φ = a^(m+n) = a^m a^n = mφ·nφ

for all m,n ∈ ℤ. From Theorem 20.16, we obtain

ℤ/Ker φ ≅ Im φ.

The homomorphism φ is onto by definition of ⟨a⟩, hence Im φ = ⟨a⟩ and

ℤ/Ker φ ≅ ⟨a⟩.

We see that any cyclic group is isomorphic to a factor group of ℤ. In
order to get more information, we distinguish two cases, where a has
finite or infinite order.

First suppose that a has finite order k ∈ ℕ. Then o(a) = k and

Ker φ = {n ∈ ℤ: nφ = 1 = a^0}
= {n ∈ ℤ: a^n = 1}
= {n ∈ ℤ: o(a) | n}   (Lemma 11.6)
= kℤ.

Thus ℤ/kℤ ≅ ⟨a⟩. We see that any cyclic group of order k is isomorphic
to ℤ/kℤ. Consequently, any two cyclic groups of order k are isomorphic
to each other. For this reason, we speak of the cyclic group of order k.

In the second case, suppose a has infinite order. Then

Ker φ = {n ∈ ℤ: nφ = 1 = a^0}
= {n ∈ ℤ: a^n = 1}
= {0}   (Lemma 11.5)

and so ℤ/{0} ≅ ⟨a⟩. From ℤ/{0} ≅ ℤ (Example 20.10(c)), we infer ⟨a⟩ ≅ ℤ.
We see that any infinite cyclic group is isomorphic to ℤ. Consequently,
any two cyclic groups of infinite order are isomorphic. For this reason,
we speak of the infinite cyclic group.
(b) The determinant homomorphism (Example 20.2(b))

det: GL(2, ℝ) → ℝ\{0}

is onto ℝ\{0}, because any a ∈ ℝ\{0} is the determinant of the diagonal
matrix with entries a and 1 in GL(2, ℝ). Thus Im det = ℝ\{0}. Also

Ker det = {A ∈ GL(2, ℝ): det A = 1} = SL(2, ℝ).

From Theorem 20.16, we obtain

GL(2, ℝ)/SL(2, ℝ) ≅ ℝ\{0}.

In the same way, GL(2,K)/SL(2,K) ≅ K\{0}

for any field K.

(c) The sign homomorphism

ε: Sn → {1, −1} = C2

is onto C2 when n ≥ 2, because ιε = 1 and ((12))ε = −1. Hence Im ε =
C2. As Ker ε = {σ ∈ Sn: σε = 1} = An by definition, the relation

Sn/Ker ε ≅ Im ε

yields Sn/An ≅ C2.
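The relation Sn/An ≅ C2 can be made concrete by computation. The Python sketch below (our own illustration, using n = 4 and an inversion-counting `sign` helper that is not part of the text) verifies that the sign map is a homomorphism onto {1, −1} whose kernel A4 has index 2.

```python
from itertools import permutations

# Illustration of S_n / A_n = C_2 for n = 4.
def sign(p):
    """Sign of a permutation given as a tuple, via counting inversions."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sign(p) == 1]        # kernel of the sign map

assert {sign(p) for p in S4} == {1, -1}     # image is C_2
assert len(A4) == len(S4) // 2              # exactly two cosets

# Homomorphism property sign(pq) = sign(p)sign(q), with (pq)(i) = p(q(i)),
# spot-checked on a handful of pairs.
compose = lambda p, q: tuple(p[q[i]] for i in range(4))
assert all(sign(compose(p, q)) == sign(p) * sign(q)
           for p in S4[:6] for q in S4[:6])
print(len(S4) // len(A4))                   # -> 2, the order of S_4 / A_4
```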

(d) Consider the absolute value homomorphism

φ: ℝ\{0} → ℝ\{0}
a ↦ |a|

Here Im φ = {|a|: a ∈ ℝ\{0}} = ℝ⁺, the group of positive real numbers,
and Ker φ = {a ∈ ℝ\{0}: |a| = 1} = {1, −1} = C2. Thus

(ℝ\{0})/Ker φ ≅ Im φ

gives (ℝ\{0})/C2 ≅ ℝ⁺.

(e) The mapping

φ: ℝ → ℂ\{0}
x ↦ e^(2πxi)

is a homomorphism from the additive group ℝ into ℂ\{0}:

(x + y)φ = e^(2π(x+y)i) = e^(2πxi) e^(2πyi) = xφ·yφ

for all x,y ∈ ℝ. We have ℝ/Ker φ ≅ Im φ. The reader may verify that

Im φ = {z ∈ ℂ: |z| = 1}.

As for the kernel,

Ker φ = {x ∈ ℝ: e^(2πxi) = 1}
= {x ∈ ℝ: cos 2πx + i sin 2πx = 1}
= {x ∈ ℝ: cos 2πx = 1, sin 2πx = 0}
= ℤ.
Thus
ℝ/ℤ ≅ {z ∈ ℂ: |z| = 1},

where ℝ/ℤ is an additive group and the right hand side is a multiplicative group.
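A numeric spot-check (not a proof) of this last example is easy with complex arithmetic. The sketch below, an illustration of our own, confirms the homomorphism property of x ↦ e^(2πxi) up to rounding error, that all values lie on the unit circle, and that the integers land in the kernel.

```python
import cmath

# phi: R -> C\{0}, x -> e^{2 pi x i}
phi = lambda x: cmath.exp(2j * cmath.pi * x)

# Homomorphism property phi(x + y) = phi(x) phi(y), up to float rounding.
for x, y in [(0.3, 1.7), (-2.25, 0.5)]:
    assert abs(phi(x + y) - phi(x) * phi(y)) < 1e-12

# Every value lies on the unit circle |z| = 1 ...
assert abs(abs(phi(0.123)) - 1) < 1e-12
# ... and every integer is mapped to 1, i.e. lies in the kernel.
assert all(abs(phi(k) - 1) < 1e-12 for k in range(-3, 4))
print("checks passed")
```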

Exercises

1. Show that the mapping exp: ℝ → ℝ⁺, x ↦ e^x, is an isomorphism.

2. Determine whether the mapping x ↦ log(log x) is a homomorphism.

3. Find an isomorphism from ℝ\{0} under multiplication onto the group
of Example 7.4(a).

4. Find an isomorphism from ℝ under addition onto the group of Example
7.4(b).

5. Let φi: Gi → Gi+1 be homomorphisms of groups, where i = 1,2, . . . ,n.
Show that φ1φ2 . . . φn is a homomorphism from G1 into Gn+1. Prove a
corresponding result for isomorphisms.

6. Let φ: G → G1 be an isomorphism. Prove that o(a) = o(aφ) for all a ∈ G.

7. Let φ: G → G1 be a homomorphism. Show that aφ = bφ if and only if
a(Ker φ) = b(Ker φ), where a,b are arbitrary elements of G.

8. Prove directly that any two cyclic groups of the same order are iso-
morphic.

9. Prove that any two dihedral groups of the same order are isomorphic.

10. Imitating Example 20.17(a), show that any dihedral group is iso-
morphic to a factor group of D∞.

11. Show that a factor group of a dihedral group is either dihedral or
cyclic.

12. Let n ∈ ℕ and let X be a set with n elements. Prove that S_X ≅ Sn.

13. Prove that (ℝ\{0})/ℝ⁺ ≅ C2.

14. Let φ: G → G1 be a homomorphism, let K be a normal subgroup of G
such that K ⊆ Ker φ, and let ν: G → G/K be the associated natural
homomorphism. Show that there is a homomorphism ψ: G/K → G1 such
that νψ = φ and Ker ψ = (Ker φ)/K. What happens when we drop the
condition K ⊆ Ker φ?

§21
Isomorphism Theorems

This paragraph is devoted to some very important theorems of group


theory.

At this stage, it will be useful to introduce Hasse diagrams. A Hasse
diagram is a convenient means to visualise inclusions holding between
subgroups of a group. Subgroups are represented by points. Two points
(subgroups) are joined by a line segment if and only if the lower
subgroup is contained in the upper one. The line segments may be
vertical or slanted. The line segments may be thought of as factor groups
when the lower subgroup is normal in the upper one. The Hasse
diagrams of S3, C8 and C12 are depicted below.

[Hasse diagrams: S3, with A3 and the order-two subgroups H, J, K above 1;
C8 as the chain C8 — C4 — C2 — 1; C12 with its subgroups C6, C4, C3, C2
above 1.]

H = {ι,(12)}, J = {ι,(13)}, K = {ι,(23)}.
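The data behind such a diagram can be computed mechanically. The following Python sketch (our own illustration; the helpers `generated` and `compose` are hypothetical names, not the text's notation) enumerates all subgroups of S3 by brute force and extracts the covering relations, i.e. the edges of the Hasse diagram.

```python
from itertools import permutations

# Recover the Hasse-diagram data for S_3 by brute force.
# Elements are permutations of {0,1,2}; composition (p*q)(i) = p(q(i)).
S3 = list(permutations(range(3)))
compose = lambda p, q: tuple(p[q[i]] for i in range(3))

def generated(gens):
    """Closure of a set of permutations under composition (a subgroup,
    since the group is finite and the identity is included)."""
    sub = set(gens) | {(0, 1, 2)}
    while True:
        new = {compose(p, q) for p in sub for q in sub} - sub
        if not new:
            return frozenset(sub)
        sub |= new

# All subgroups; two generators suffice for every subgroup of S_3.
subgroups = {generated([p, q]) for p in S3 for q in S3}
orders = sorted(len(s) for s in subgroups)
print(orders)        # six subgroups: 1, H, J, K, A_3, S_3

# Edges of the Hasse diagram: proper containments with nothing between.
edges = [(a, b) for a in subgroups for b in subgroups
         if a < b and not any(a < c < b for c in subgroups)]
print(len(edges))
```

The output lists the subgroup orders [1, 2, 2, 2, 3, 6] and the 8 line segments of the diagram depicted above.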

21.1 Theorem: Let φ: G → G1 be a group homomorphism from G onto
G1. Then there is a one-to-one correspondence between the set of all
subgroups of G that contain Ker φ and the set of all subgroups of G1. This
correspondence preserves inclusion. The normal subgroups of G that
contain Ker φ correspond to normal subgroups of G1, and conversely. The
factor groups by corresponding normal subgroups are isomorphic.

In more detail and more precise language, the claim is the following.

(1) For every H ≤ G with Ker φ ⊆ H, there is associated a unique
subgroup of G1, which will be denoted by H1.
(2) If Ker φ ⊆ H ≤ J ≤ G, then H1 ≤ J1.
(3) If Ker φ ⊆ H ≤ G, Ker φ ⊆ J ≤ G and H1 ≤ J1, then H ≤ J.
(4) If Ker φ ⊆ H ≤ G, Ker φ ⊆ J ≤ G and H1 = J1, then H = J.
(5) If S is any subgroup of G1, then there is an H ≤ G such that Ker φ ⊆ H
and H1 = S.
(6) For Ker φ ⊆ H ≤ G, there holds H ⊴ G if and only if H1 ⊴ G1.
(7) If Ker φ ⊆ H ⊴ G and H1 ⊴ G1, then G/H ≅ G1/H1.

[The situation is described in the accompanying diagrams: the chain
G ⊇ J ⊇ H ⊇ Ker φ ⊇ 1 in G corresponds to the chain G1 ⊇ J1 ⊇ H1 ⊇ 1
in G1, with the factor group G1/H1 alongside.]

Proof: (1) For each H ≤ G with Ker φ ⊆ H, we are to find a subgroup of
G1. How can we find it? Well, the subgroup we are looking for will be
first of all a subset of G1. How can we associate with H a subset of G1? At
our disposal, we have only one means of transportation from G to G1,
namely the mapping φ. The only thing we can do, then, is form the set of
images of the elements of H under φ. Hence we put

H1 := {hφ ∈ G1: h ∈ H}.

We now prove H1 ≤ G1. We can do it by the subgroup criterion, but we
prefer to use Theorem 20.6, which states that the image of a homo-
morphism is a subgroup of its range. We note that the restriction φ_H of φ
to H is a homomorphism (Example 20.2(g)) and

H1 = {hφ ∈ G1: h ∈ H} = {hφ_H ∈ G1: h ∈ H} = Im φ_H

by definition. Theorem 20.6 gives now Im φ_H ≤ G1, hence H1 ≤ G1. The
description H1 = Im φ_H will be useful.

(2) Suppose Ker φ ⊆ H ≤ J ≤ G. Under this assumption, we prove H1 ⊆ J1.
This is easy: for any h ∈ H, we have h ∈ J, so hφ ∈ Im φ_J = J1. Since hφ ∈
J1 for all h ∈ H, we get H1 ⊆ J1.

(3) Suppose Ker φ ⊆ H ≤ G, Ker φ ⊆ J ≤ G and H1 ⊆ J1. We want to
prove H ⊆ J. Now H1 ⊆ J1 means Im φ_H ⊆ Im φ_J. Then, for every h ∈ H,
we have hφ ∈ Im φ_J:

for every h ∈ H, there is a j ∈ J such that hφ = jφ.

We obtain, when h,j are as above,

1 = hφ(jφ)⁻¹ = hφ·j⁻¹φ = (hj⁻¹)φ,
hj⁻¹ ∈ Ker φ ⊆ J,
h ∈ Jj = J.

Thus h ∈ J for all h ∈ H. Therefore H ⊆ J.

(4) This is immediate from (3). If H1 = J1, we have H1 ⊆ J1 and J1 ⊆ H1, so
H ⊆ J and J ⊆ H by (3), hence H = J. (This shows that the correspondence
H ↦ H1 is one-to-one.)

(5) For any S ≤ G1, we are to find an H ≤ G such that Ker φ ⊆ H and
H1 = S. What can H be? As in part (1), there is only one thing we can do:
take the preimages of the elements in S. Hence we put

H = {a ∈ G: aφ ∈ S}.

Thus a ∈ H means aφ ∈ S. We show that H ≤ G, that Ker φ ⊆ H and that
H1 = S.

First H ≤ G. From 1_G φ = 1_G1 ∈ S (Lemma 20.3(1)), we get 1 ∈ H. So H ≠ ∅.
We apply the subgroup criterion.

(i) If a,b ∈ H, then aφ, bφ ∈ S, so aφ·bφ ∈ S, so (ab)φ ∈ S, so
ab ∈ H. Thus H is closed under multiplication.

(ii) If a ∈ H, then aφ ∈ S, then (aφ)⁻¹ ∈ S, then (a⁻¹)φ ∈ S, then
a⁻¹ ∈ H. Thus H is closed under the forming of inverses.

Thus H is a subgroup of G.

We prove next that H contains Ker φ. This is trivial. If a ∈ Ker φ, then aφ
= 1 ∈ S, so a ∈ H. Hence Ker φ ⊆ H.

It remains to prove H1 = S. We have

H1 = Im φ_H = {hφ ∈ G1: h ∈ H} = {hφ ∈ G1: hφ ∈ S} = S

as claimed; note that the last equality uses the hypothesis that φ is onto
G1, so that every element of S is of the form aφ for some a ∈ G, and such
an a belongs to H.

This completes the proof of (5). (Part (5) shows that the correspondence
H ↦ H1 is onto.)

(6) First we assume H ⊴ G and show that H1 ⊴ G1. We are to show that
x⁻¹h1x ∈ H1 for all x ∈ G1 and for all h1 ∈ H1 (Lemma 18.2(1)). If x ∈ G1
and h1 ∈ H1, then there are a ∈ G with aφ = x and h ∈ H with hφ = h1. This
is so because φ is onto G1 and H1 is defined as Im φ_H. Then we are to
show (aφ)⁻¹(hφ)(aφ) ∈ H1. This is equivalent to (a⁻¹ha)φ ∈ H1. Since H ⊴ G,
we know a⁻¹ha ∈ H, so (a⁻¹ha)φ ∈ Im φ_H = H1. This proves H1 ⊴ G1.

We assume now H1 ⊴ G1 and prove H ⊴ G. We can give an argument
similar to the one above, but we prefer to use the fact that normal sub-
groups and kernels coincide. Our method will be used in the proof of
part (7) as well.

The assumption is H1 ⊴ G1. By Theorem 20.12, H1 = Ker ν´, where

ν´: G1 → G1/H1
is the natural homomorphism. We get then the homomorphism

φν´: G → G1/H1 (Theorem 20.4):

G → G1 → G1/H1.   (i)

We have Ker φν´ = {a ∈ G: a(φν´) = H1}
= {a ∈ G: aφ ∈ Ker ν´}
= {a ∈ G: aφ ∈ H1}.

So (Ker φν´)1 = Im φ_{Ker φν´}
= {aφ ∈ G1: a ∈ Ker φν´} = {aφ ∈ G1: aφ ∈ H1} = H1

and, since both H and Ker φν´ contain Ker φ, we obtain

Ker φν´ = H   (ii)

by part (4). Theorem 20.6 gives H ⊴ G, as was to be proved.

(7) We saw that any one of H ⊴ G and H1 ⊴ G1 implies the other. Assume
that one, and hence both of them are true. Then we have the homo-
morphism φν´. From Theorem 20.16, we obtain

G/Ker φν´ ≅ Im φν´.   (iii)

We know Ker φν´ = H by (ii). As for the image, since φ is onto G1 by
hypothesis and ν´ is onto G1/H1 by Theorem 20.12, the composition φν´ is
onto by Theorem 3.11(1). Hence Im φν´ = G1/H1 and (iii) becomes

G/H ≅ G1/H1.

The proof is complete.

An important special case of Theorem 21.1 is the case of a natural
homomorphism, recorded in the next theorem. It gives a complete
description of the subgroups of a factor group. The last part of the
theorem is known as the factor of a factor theorem.

21.2 Theorem: Let N ⊴ G. The subgroups of G/N are the factor groups
S/N, where S runs through the subgroups of G satisfying N ⊆ S. More
precisely, for each subgroup X of G/N, there is a unique subgroup S of G
satisfying N ⊆ S such that X = S/N. When X1 and X2 are subgroups of
G/N, say X1 = S1/N and X2 = S2/N, where N ⊆ S1 ≤ G and N ⊆ S2 ≤ G,
then X1 ⊆ X2 if and only if S1 ⊆ S2. Furthermore, S/N ⊴ G/N if and
only if S ⊴ G. In this case, there holds

(G/N)/(S/N) ≅ G/S.

Proof: Since N ⊴ G, we can build the factor group G/N. The natural
homomorphism ν: G → G/N is onto by Theorem 20.12. We can therefore
apply Theorem 21.1.

Theorem 21.1 states that any subgroup of G/N is of the form Im ν_S for
some S ≤ G with Ker ν ⊆ S (here ν_S is the restriction of ν to S). Now

Im ν_S = {sν ∈ G/N: s ∈ S} = {Ns ∈ G/N: s ∈ S} = S/N

and Ker ν = N by Theorem 20.12 (notice that S/N is meaningful, for N ⊴
G and N ⊆ S imply N ⊴ S; cf. Example 18.5(l)). Thus the subgroups of
G/N are given by S/N, where N ⊆ S ≤ G. By Theorem 21.1(2),(3),(4),
S1/N ⊆ S2/N if and only if S1 ⊆ S2, and S1/N ≠ S2/N whenever S1 ≠ S2.
Finally, S/N ⊴ G/N if and only if S ⊴ G by Theorem 21.1(6) and in this
case (G/N)/(S/N) ≅ G/S by Theorem 21.1(7). This completes the proof.


As an application of Theorem 21.2, we classify the factor groups of cyclic
groups. We treat infinite and finite cyclic groups separately.

Any infinite cyclic group is isomorphic to ℤ under addition (Example
20.17(a)), so we need only find the factor groups of ℤ. As ℤ is abelian, any
subgroup of ℤ is normal in ℤ (Example 18.5(b)) and we can build factor
groups of ℤ by any subgroup of ℤ. The subgroups of ℤ are 0 (see
Example 18.5(a)) and nℤ, where n ∈ ℕ (Theorem 11.8). For each n ∈ ℕ,
the subgroup nℤ is the unique subgroup of index n (Lemma 11.11). The
factor group ℤ/0 is isomorphic to ℤ (Example 20.10(c)). The factor
groups ℤ/nℤ are known to be cyclic of order n (Example 20.17(a)). So all
factor groups of ℤ are cyclic (cf. Lemma 18.9(3)). For each m ∈ ℕ ∪ {∞},
there is a unique factor group of order m of ℤ, namely ℤ/mℤ if m ∈ ℕ
and ℤ/0 if m = ∞.

Now let Cn = ⟨a⟩ be a finite cyclic group of order n ∈ ℕ. As Cn is abelian,
we can build factor groups of Cn by any subgroup of Cn. We know that Cn
≅ ℤ/nℤ from Example 20.17(a). The subgroups of ℤ/nℤ are described in
Theorem 21.2: any subgroup of ℤ/nℤ is of the form M/nℤ, where
nℤ ⊆ M ≤ ℤ. Now M ≤ ℤ means M = mℤ for some m ∈ ℕ or M = 0 (The-
orem 11.8) and the condition nℤ ⊆ M excludes M = 0. Hence M = mℤ for
some m ∈ ℕ, where furthermore m | n, because nℤ ⊆ mℤ. So the sub-
groups of ℤ/nℤ are given by mℤ/nℤ, where m runs through all positive
divisors of n. For the factor group, we know

(ℤ/nℤ)/(mℤ/nℤ) ≅ ℤ/mℤ

from Theorem 21.2. So all factor groups of ℤ/nℤ and of Cn are cyclic (cf.
Lemma 18.9(3)). For each positive divisor m of n, there is a unique
factor group of order m of Cn = ⟨a⟩, namely ⟨a⟩/⟨a^m⟩, where ⟨a^m⟩ is the
unique subgroup of order n/m of Cn (Lemma 11.10).
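This classification can be confirmed computationally for a specific n. The following Python sketch (an illustrative check of our own, for n = 12; the helper `subgroup` is a hypothetical name) enumerates all subgroups of ℤ/12ℤ and confirms that they are exactly the sets mℤ/12ℤ for the positive divisors m of 12.

```python
# Check the classification above for n = 12: the subgroups of Z/12Z are
# m*Z/12Z for the positive divisors m of 12.
n = 12
Zn = range(n)

def subgroup(gens):
    """Subgroup of (Z/nZ, +) generated by gens, by closing under addition."""
    sub = {0} | set(gens)
    while True:
        new = {(a + b) % n for a in sub for b in sub} - sub
        if not new:
            return frozenset(sub)
        sub |= new

# Every subgroup of a cyclic group is cyclic, so single generators suffice.
subgroups = {subgroup([x]) for x in Zn}
divisors = [m for m in range(1, n + 1) if n % m == 0]

# One subgroup per divisor m, namely {0, m, 2m, ...} of order n/m.
expected = {frozenset(range(0, n, m)) for m in divisors}
assert subgroups == expected
print(sorted(len(s) for s in subgroups))   # -> [1, 2, 3, 4, 6, 12]
```

There are six subgroups, one for each divisor of 12, in line with Lemma 11.10.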

We end this paragraph with another important theorem of group theory.

21.3 Theorem: Let H ⊴ G and K ≤ G. Then H ∩ K ⊴ K and

K/(H ∩ K) ≅ HK/H.   (*)

Proof: Since H ⊴ G, there is a group G/H and a natural homomorphism

ν: G → G/H.

Let ν_K be the restriction of ν to K. This ν_K is a homomorphism (Example
20.2(g)). Hence K/Ker ν_K ≅ Im ν_K by Theorem 20.16. Here

Ker ν_K = {k ∈ K: kν = 1}
= {k ∈ K: k ∈ Ker ν}
= K ∩ Ker ν
= K ∩ H.

[Hasse diagram: G above HK; HK above H and K; H ∩ K below both H and K.]

It remains to find Im ν_K. We claim Im ν_K = HK/H. First of all, HK = KH is a
subgroup of G because H ⊴ G (Lemma 19.4(2)), and H ⊆ HK, so H ⊴ HK.
So HK/H is meaningful. For any k ∈ K, we have kν_K = Hk ∈ HK/H, which
shows that Im ν_K ⊆ HK/H. Conversely, each element of HK/H is of the
form Hhk, where h ∈ H, k ∈ K. But Hhk = Hk = kν_K ∈ Im ν_K, so
HK/H ⊆ Im ν_K. Thus Im ν_K = HK/H and (*) yields

K/(H ∩ K) ≅ HK/H

as was to be proved.

Exercises

1. Let A ⊴ C ≤ G and B ≤ G. Prove that A ∩ B ⊴ C ∩ B and

(C ∩ B)/(A ∩ B) ≅ A(C ∩ B)/A.

2. Let A ⊴ C ≤ G, B ⊴ G and let φ: G → H be a group homomorphism.
Prove that Aφ ⊴ Cφ. Choosing in particular φ to be the natural homo-
morphism ν: G → G/B, prove that AB ⊴ CB.
§22
Direct Products

In this paragraph, we learn a method of constructing new groups from
given ones. This method consists essentially in placing the groups side
by side.

22.1 Theorem: Let H and K be groups. On the cartesian product H × K,
we define a binary operation by declaring

(h,k)(h1,k1) = (hh1,kk1)

for all (h,k),(h1,k1) ∈ H × K. With respect to this operation, H × K is a
group.

Proof: Before beginning with the proof, it will not be amiss to formulate
the theorem in a more precise way. Suppose (H, ∘) and (K, ∗) are groups.
The claim is that (H × K, ·) is a group, where · is defined by

(h,k) · (h1,k1) = (h ∘ h1, k ∗ k1)

for all (h,k),(h1,k1) ∈ H × K.

The multiplication in H × K is carried out componentwise. Since H and K
are groups themselves, it is natural to expect that H × K will be a group.
We check the group axioms.

(i) For all (h,k),(h1,k1) ∈ H × K, we have h,h1 ∈ H, k,k1 ∈ K, so
hh1 ∈ H and kk1 ∈ K as H and K are closed under multiplication, and so
(hh1,kk1) ∈ H × K. So we have a binary operation on H × K. In other
words, H × K is closed under multiplication.

(ii) Associativity in H × K follows from associativity in H and
K. For any (h,k),(h1,k1),(h2,k2) ∈ H × K, we have

[(h,k)(h1,k1)](h2,k2) = (hh1,kk1)(h2,k2)
= ((hh1)h2,(kk1)k2)
= (h(h1h2),k(k1k2))
= (h,k)(h1h2,k1k2)
= (h,k)[(h1,k1)(h2,k2)]

and the operation on H × K is associative.

(iii) What can be the identity element of H × K? The only
reasonable guess would be (1,1) = (1_H,1_K). We indeed have

(h,k)(1,1) = (h1,k1) = (h,k)

for all (h,k) ∈ H × K. Thus (1,1) is a right identity of H × K.

(iv) What can be the inverse of (h,k) ∈ H × K? Probably
(h⁻¹,k⁻¹). We indeed have

(h,k)(h⁻¹,k⁻¹) = (hh⁻¹,kk⁻¹) = (1,1)

for all (h,k) ∈ H × K. So any (h,k) ∈ H × K has a right inverse in H × K,
namely (h,k)⁻¹ = (h⁻¹,k⁻¹).

Therefore, H × K is a group.
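The componentwise construction of Theorem 22.1 is easy to realize on a computer. In the sketch below (our own illustration; modelling a group as an (elements, operation) pair and the name `direct_product` are hypothetical choices, not the text's notation), we build C2 × C3 and verify closure, associativity and the existence of a generator.

```python
from itertools import product

# External direct product as in Theorem 22.1: elements are ordered pairs,
# multiplication is componentwise.
def direct_product(G, H):
    (gs, gop), (hs, hop) = G, H
    elems = list(product(gs, hs))
    op = lambda a, b: (gop(a[0], b[0]), hop(a[1], b[1]))
    return elems, op

# Example: C2 x C3, with addition mod 2 and mod 3 as the operations.
C2 = ([0, 1], lambda a, b: (a + b) % 2)
C3 = ([0, 1, 2], lambda a, b: (a + b) % 3)
elems, op = direct_product(C2, C3)

assert len(elems) == 6
# Closure and associativity, checked exhaustively.
assert all(op(a, b) in elems for a in elems for b in elems)
assert all(op(op(a, b), c) == op(a, op(b, c))
           for a in elems for b in elems for c in elems)

# (1,1) generates all six elements, so C2 x C3 is cyclic of order 6 --
# consistent with Theorem 22.8 below, since gcd(2,3) = 1.
powers, x = [(0, 0)], (1, 1)
for _ in range(5):
    powers.append(op(powers[-1], x))
assert len(set(powers)) == 6
print("C2 x C3 is cyclic of order 6")
```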

22.2 Definition: Let H and K be groups. Then the group of Theorem
22.1 is called the direct product of H and K. It will be denoted by H × K.

Thus the notation "H × K" stands for the cartesian product of the sets H
and K as well as the direct product of the groups H and K. This ambiguity
will not lead to any confusion. The reader should be careful to distin-
guish between HK and H × K. The former is defined only when H and K
are subgroups of a common group G, whereas H × K is a meaningful
group regardless of whether H and K are subgroups of a group. The
elements of HK are elements of the group that contains H and K; the
elements of H × K are ordered pairs.

When the groups H and K are written additively, we write the group of
Theorem 22.1 in the additive form, too. The operation is then given by

(h,k) + (h1,k1) = (h + h1, k + k1)

for all (h,k),(h1,k1) ∈ H × K. The operation is called addition in this case,
and the group is called the direct sum of H and K. We write the group as
H ⊕ K, to avoid confusion with H + K (which is HK in additive notation,
where H and K are subgroups of a group G).

22.3 Examples: (a) Consider C2 × ℚ⁺, where C2 = {1, −1} and ℚ⁺ is
the multiplicative group of the positive rational numbers. The elements
of C2 × ℚ⁺ are ordered pairs (±1, q), where q ∈ ℚ⁺. Multiplication in C2 × ℚ⁺
is carried out according to the rule (ε,q)(ε´,q´) = (εε´,qq´). We observe
that the mapping

φ: ℚ\{0} → C2 × ℚ⁺
q ↦ (sgn q, |q|)

is a homomorphism, since (qq´)φ = (sgn qq´, |qq´|)
= (sgn q·sgn q´, |q||q´|)
= (sgn q, |q|)(sgn q´, |q´|)
= (qφ)(q´φ)

for all q,q´ ∈ ℚ\{0}. Its kernel is

Ker φ = {q ∈ ℚ\{0}: qφ = (1,1)} = {q ∈ ℚ\{0}: sgn q = 1, |q| = 1}
= {q ∈ ℚ\{0}: q > 0, |q| = 1}
= {1},

which means that φ is one-to-one (Theorem 20.8). As any (ε,q) ∈ C2 × ℚ⁺
is the image of εq ∈ ℚ\{0}, the homomorphism φ is onto. Hence φ is an
isomorphism and

ℚ\{0} ≅ C2 × ℚ⁺.

(b) Consider ℝ ⊕ ℝ, where ℝ is the additive group of real numbers. The
elements of ℝ ⊕ ℝ are ordered pairs of real numbers. The operation on
ℝ ⊕ ℝ is given by

(a,b) + (c,d) = (a + c, b + d)

for all (a,b),(c,d) ∈ ℝ ⊕ ℝ. We leave it to the reader to prove that

φ: ℂ → ℝ ⊕ ℝ
a + bi ↦ (a,b)

is an isomorphism (where ℂ is the group of complex numbers under
addition). Hence ℂ ≅ ℝ ⊕ ℝ.

22.4 Theorem: Let H and K be groups and let G := H × K be the direct
product of H and K. Then there are subgroups H1 and K1 of G such that

H1 ≅ H, K1 ≅ K,
H1 ⊴ G, K1 ⊴ G,
H1K1 = G, H1 ∩ K1 = 1.

Proof: We put H1 = {(h,1) ∈ G: h ∈ H} and K1 = {(1,k) ∈ G: k ∈ K}. First we
prove H1,K1 ≤ G. Since

(i) (h,1)(h´,1) = (hh´,1) ∈ H1 for all (h,1),(h´,1) ∈ H1 and
(ii) (h,1)⁻¹ = (h⁻¹,1⁻¹) = (h⁻¹,1) ∈ H1 for all (h,1) ∈ H1,

H1 is a subgroup of G. In the same way, K1 ≤ G.

H1 and K1 are in fact normal subgroups of G. To establish K1 ⊴ G, we
show that (h,k)⁻¹(1,k0)(h,k) ∈ K1 for all (h,k) ∈ G, (1,k0) ∈ K1 (Lemma
18.2(1)). We indeed have

(h,k)⁻¹(1,k0)(h,k) = (h⁻¹,k⁻¹)(1,k0)(h,k) = (h⁻¹1h, k⁻¹k0k) = (1, k⁻¹k0k) ∈ K1

as K is closed under multiplication. Hence K1 ⊴ G. One proves similarly
H1 ⊴ G.

Next we show H ≅ H1 and K ≅ K1. The mapping φ1: H → H1, h ↦ (h,1) is
one-to-one (by the definition of equality of ordered pairs) and onto (by
the definition of H1), and is furthermore a homomorphism, since

(hh´)φ1 = (hh´,1) = (h,1)(h´,1) = hφ1·h´φ1

for all h,h´ ∈ H. Thus φ1 is an isomorphism and H ≅ H1. An analogous
argument shows that φ2: K → K1, k ↦ (1,k) is an isomorphism, so K ≅ K1.

That H1K1 = G follows immediately from the fact that any (h,k) ∈ G can be
written as (h,1)(1,k) with (h,1) ∈ H1, (1,k) ∈ K1.

Finally, H1 ∩ K1 = 1. Indeed, if (h,k) ∈ H1 ∩ K1, then h = 1 as (h,k) ∈ K1 and
k = 1 as (h,k) ∈ H1, thus (h,k) = (1,1) and so H1 ∩ K1 ⊆ {(1,1)} = 1, yielding
H1 ∩ K1 = 1.

This completes the proof.

22.5 Theorem: Let G be a group and let H,K be subgroups of G. The
following statements are equivalent.
(1) H ⊴ G, K ⊴ G, G = HK and H ∩ K = 1.
(2) Every element of G can be expressed uniquely in the form hk, where
h ∈ H and k ∈ K; and every element of H commutes with every element
of K.

Proof: (1) ⟹ (2) Suppose H ⊴ G, K ⊴ G, G = HK and H ∩ K = 1. Since G =
HK, every element of G can be expressed as hk, with h ∈ H, k ∈ K. We
must show that this representation is unique, i.e., when hk = h´k´ with
h,h´ ∈ H and k,k´ ∈ K, then necessarily h = h´ and k = k´. This follows
from H ∩ K = 1. Indeed, from hk = h´k´, we get kk´⁻¹ = h⁻¹h´ ∈ H ∩ K = 1, so
kk´⁻¹ = 1 = h⁻¹h´, so k = k´ and h = h´.

It remains to prove that any element of H commutes with any element
of K. Let h ∈ H, k ∈ K. We have to show hk = kh, or, equivalently,
h⁻¹k⁻¹hk = 1. Now h⁻¹k⁻¹h·k ∈ K, since h⁻¹k⁻¹h ∈ K (because k⁻¹ ∈ K and
K ⊴ G) and h⁻¹·k⁻¹hk ∈ H, since k⁻¹hk ∈ H (because h ∈ H and H ⊴ G), so
h⁻¹k⁻¹hk ∈ H ∩ K = 1 and h⁻¹k⁻¹hk = 1, as claimed.

(2) ⟹ (1) By hypothesis, every element of G can be written in the form
hk, where h ∈ H, k ∈ K. So G = HK. We now prove H ∩ K = 1. Let a ∈ H ∩ K.
If a ≠ 1, then 1a = a1 are two distinct representations of a ∈ G with 1 ∈
H, a ∈ K and a ∈ H, 1 ∈ K, contrary to the hypothesis that every element
of G, in particular a, can be expressed uniquely in the form hk, with h ∈
H, k ∈ K. Thus a = 1. This proves H ∩ K = 1.

In order to prove H ⊴ G, we must show g⁻¹hg ∈ H for all h ∈ H, g ∈ G
(Lemma 18.2(1)). Let g ∈ G = HK. Then g = h´k´ for some h´ ∈ H, k´ ∈ K.
Thus
g⁻¹hg = (h´k´)⁻¹h(h´k´)
= k´⁻¹(h´⁻¹hh´)k´
= k´⁻¹·k´(h´⁻¹hh´)   (h´⁻¹hh´ ∈ H and k´ ∈ K commute)
= h´⁻¹hh´ ∈ H

and therefore H ⊴ G. The proof of K ⊴ G is similar and is left to the
reader.

22.6 Theorem: Let G be a group and H,K be subgroups of G. Assume
that H ⊴ G, K ⊴ G, G = HK and H ∩ K = 1. Then G ≅ H × K.

Proof: We want to find an isomorphism φ: H × K → G. For each (h,k) in
H × K, this φ should give us an element (h,k)φ of G. By hypothesis, G = HK.
This suggests that (h,k) ↦ hk might be an appropriate mapping from
H × K into G. So we put

φ: H × K → G
(h,k) ↦ hk.

We show that φ is a homomorphism, one-to-one and onto.

φ is a homomorphism if and only if

((h´,k)(h,k´))φ = (h´,k)φ·(h,k´)φ for all h,h´ ∈ H, k,k´ ∈ K,
that is, if and only if
h´hkk´ = h´khk´ for all h,h´ ∈ H, k,k´ ∈ K,
which is equivalent to
hk = kh for all h ∈ H, k ∈ K,
and this is true by Theorem 22.5. So φ is a homomorphism.

φ is one-to-one, for if (h,k)φ = (h´,k´)φ, then hk = h´k´; but every element
in G can be expressed in the form hk with h ∈ H, k ∈ K in a unique way
by Theorem 22.5. So h = h´ and k = k´. Thus (h,k) = (h´,k´). This proves
that φ is one-to-one.

φ is onto because HK = G by hypothesis.

Hence φ is an isomorphism and H × K ≅ G, and also G ≅ H × K.

22.7 Theorem: (1) A group G is isomorphic to the direct product of two
subgroups H and K if and only if (i) every element of G can be expressed
uniquely in the form hk, where h ∈ H and k ∈ K and (ii) every element
of H commutes with every element of K.

(2) Let G be a group and H,K ≤ G. If G ≅ H × K, then G/H ≅ K and G/K ≅ H.

Proof: (1) follows from Theorem 22.4, Theorem 22.5, Theorem 22.6. As
for (2), we observe that G ≅ H × K implies H ⊴ G, K ⊴ G, G = HK, H ∩ K =
1, so that G/H = HK/H ≅ K/(H ∩ K) = K/1 ≅ K by Theorem 21.3. The proof
of G/K ≅ H is similar.

When the conditions of Theorem 22.7(1) are satisfied, G is said to be the
internal direct product of H and K. The direct product of Definition 22.2
is called the external direct product of H and K. Theorem 22.5 and The-
orem 22.6 state that the internal direct product of H and K is isomorphic
to the external direct product H × K. For this reason, we will not
distinguish between external and internal direct products and refer to
both of them simply as direct products.

As an illustration of Theorem 22.7(1), consider ℚ\{0} ≅ C2 × ℚ⁺ (Example
22.3(a)). Theorem 22.7(1) asserts that every nonzero rational number
can be written as (±1)q, where q ∈ ℚ, q > 0, in a unique way. This is of
course well known to everybody.

Next we investigate the direct product of two finite cyclic groups of
relatively prime orders. It will be sufficient to examine the direct sum of
ℤ/mℤ and ℤ/nℤ.

Let m and n be relatively prime natural numbers. For any integer a, we
denote the residue class of a (mod m) by ā, and the residue class of a
(mod n) by a*. Hence ā ∈ ℤ/mℤ and a* ∈ ℤ/nℤ.

Consider the mapping φ: ℤ → ℤ/mℤ ⊕ ℤ/nℤ, a ↦ (ā, a*). It is easy to see
that φ is a homomorphism:

(a + b)φ = (ā + b̄, a* + b*) = (ā, a*) + (b̄, b*) = aφ + bφ

for all a,b ∈ ℤ. So φ is a homomorphism and

ℤ/Ker φ ≅ Im φ

by Theorem 20.16. Now a ∈ Ker φ if and only if ā = 0̄ and a* = 0*, that is,
if and only if m | a and n | a. Since m and n are relatively prime, the latter
condition is equivalent to mn | a. Hence Ker φ = mnℤ and ℤ/mnℤ ≅ Im φ,
where Im φ is a subgroup of ℤ/mℤ ⊕ ℤ/nℤ. From

mn = |ℤ/mnℤ| = |Im φ| ≤ |ℤ/mℤ ⊕ ℤ/nℤ| = |ℤ/mℤ|·|ℤ/nℤ| = mn

we conclude |Im φ| = mn, hence Im φ = ℤ/mℤ ⊕ ℤ/nℤ. Therefore φ is
onto and ℤ/mnℤ ≅ ℤ/mℤ ⊕ ℤ/nℤ. Writing this multiplicatively, we get

22.8 Theorem: If m and n are relatively prime natural numbers, then

Cmn ≅ Cm × Cn.

We record an important result that we obtained as a bonus.

22.9 Theorem: Let m and n be relatively prime natural numbers. Then
the mapping

φ: ℤ → ℤ/mℤ ⊕ ℤ/nℤ
a ↦ (ā, a*)

is a group homomorphism from ℤ onto ℤ/mℤ ⊕ ℤ/nℤ.
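Theorem 22.9 is the group-theoretic form of the Chinese remainder theorem, and it can be verified directly for any particular coprime pair. The Python sketch below (our own illustration, with the hypothetical choice m = 4, n = 9) checks that a ↦ (a mod m, a mod n) is bijective on representatives and that its kernel is exactly mnℤ.

```python
from math import gcd

# The map a -> (a mod m, a mod n) of Theorem 22.9 for (m, n) = (4, 9).
m, n = 4, 9
assert gcd(m, n) == 1

phi = lambda a: (a % m, a % n)

# On representatives 0, 1, ..., mn-1 the map is injective, hence bijective
# onto the mn pairs of (Z/mZ) x (Z/nZ) -- so phi is onto, as the theorem says.
images = {phi(a) for a in range(m * n)}
assert len(images) == m * n

# Kernel: phi(a) = (0, 0) exactly when mn divides a, i.e. Ker phi = mnZ.
assert all((phi(a) == (0, 0)) == (a % (m * n) == 0) for a in range(5 * m * n))
print("Z/36Z = Z/4Z (+) Z/9Z verified on representatives")
```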

So far, we have examined the direct product of two groups. The con-
struction extends immediately to n groups, where n ≥ 2. We shall be
content with enunciating the appropriate theorems. Their proofs consist
in writing n-tuples in place of ordered pairs in the proofs above. The
only novel point is the extension of the previous condition H ∩ K = 1. This
is discussed in Theorem 22.12, whose proof we briefly sketch.

22.10 Theorem: Let H1,H2, . . . ,Hn be arbitrary groups. On the cartesian
product H1 × H2 × . . . × Hn, we define a binary operation by declaring

(h1,h2, . . . ,hn)(h1´,h2´, . . . ,hn´) = (h1h1´,h2h2´, . . . ,hnhn´)

for all (h1,h2, . . . ,hn),(h1´,h2´, . . . ,hn´) ∈ H1 × H2 × . . . × Hn. With respect to this
operation, H1 × H2 × . . . × Hn is a group.

22.11 Definition: The group of Theorem 22.10 is called the direct pro-
duct of H1,H2, . . . ,Hn and is denoted by H1 × H2 × . . . × Hn. If the groups are
written additively, we call the group of Theorem 22.10 the direct sum
of H1,H2, . . . ,Hn and denote it by H1 ⊕ H2 ⊕ . . . ⊕ Hn.

22.12 Theorem: Let H1,H2, . . . ,Hn be groups and G = H1 × H2 × . . . × Hn.
Then there are subgroups G1, G2, . . . ,Gn of G such that
Gi ≅ Hi and Gi ⊴ G for all i = 1,2, . . . ,n,
G = G1G2. . .Gn and G1G2. . .Gj−1 ∩ Gj = 1 for all j = 2,. . . ,n.

Sketch of proof: Let Gi be the set {(1,. . . ,x,. . . ,1): x ∈ Hi} of all n-tuples
in G whose k-th components are equal to 1 ∈ Hk whenever k ≠ i. It is
easily verified that Gi is a subgroup of G, normal in G, isomorphic to Hi
and that G = G1G2. . .Gn. In fact, for all j = 2,. . . ,n,

G1G2. . .Gj−1 = {(h1,h2,. . . ,hj−1,1,. . . ,1): h1 ∈ H1, h2 ∈ H2,. . . , hj−1 ∈ Hj−1}.

Finally, to prove G1G2. . .Gj−1 ∩ Gj = 1 for all j = 2,. . . ,n, let
(u1,u2, . . . ,un) ∈ G1G2. . .Gj−1 ∩ Gj, where j ∈ {2,. . . ,n}. Here uk = 1 for k ≠ j,
because (u1,u2, . . . ,un) ∈ Gj. Thus (u1,u2, . . . ,un) = (1,. . . ,uj,. . . ,1). But

(1,. . . ,uj,. . . ,1) ∈ G1G2. . .Gj−1
= {(h1,h2,. . . ,hj−1,1,. . . ,1): h1 ∈ H1, h2 ∈ H2,. . . , hj−1 ∈ Hj−1},

hence uj = 1 and

(u1,u2, . . . ,un) = (1,. . . ,1,. . . ,1) = 1 ∈ G. Thus G1G2. . .Gj−1 ∩ Gj = 1.

22.13 Theorem: Let G be a group and let G1,G2, . . . ,Gn be subgroups of G.
The following statements are equivalent.
(1) Gi ⊴ G for all i = 1,2, . . . ,n, G = G1G2. . .Gn and G1G2. . .Gj−1 ∩ Gj = 1 for all j
= 2,. . . ,n.
(2) Every element of G can be expressed uniquely in the form g1g2. . .gn,
where g1 ∈ G1, g2 ∈ G2, . . . , gn ∈ Gn; and every element of Gk commutes
with every element of Gl (k ≠ l).

22.14 Theorem: Let G be a group and let G1,G2, . . . ,Gn be subgroups of G.
Assume that Gi ⊴ G for all i = 1,2, . . . ,n, G = G1G2. . .Gn and G1G2. . .Gj−1 ∩ Gj = 1
for all j = 2,. . . ,n. Then G ≅ G1 × G2 × . . . × Gn.

If n ≥ 3 and G1,G2, . . . ,Gn are normal subgroups of a group G such that G =
G1G2. . .Gn and Gi ∩ Gj = 1 whenever i ≠ j, then G need not be isomorphic
to the direct product of G1,G2, . . . ,Gn. By way of example, let G = V4 and
let A = {ι,(12)(34)}, B = {ι,(13)(24)}, C = {ι,(23)(14)}. Then A,B,C are
normal subgroups of G, and G = ABC, and A ∩ B = B ∩ C = A ∩ C = 1.
However, G is not isomorphic to A × B × C, because, for one thing, G has
order 4, whereas A × B × C has order 8. Thus the condition

G1G2. . .Gj−1 ∩ Gj = 1 for all j = 2,. . . ,n

cannot be relaxed to

Gi ∩ Gj = 1 for all i ≠ j.

22.15 Theorem: A group G is isomorphic to the direct product of n
subgroups G1,G2,. . . ,Gn if and only if (i) every element of G can be
expressed uniquely in the form g1g2. . .gn, where g1 ∈ G1, g2 ∈ G2, . . . , gn ∈
Gn and (ii) every element of Gk commutes with every element of Gl (k ≠ l).

The last two elementary results will be needed in §28.

22.16 Lemma: Let G1,G2, . . . ,Gn,H1,H2, . . . ,Hn be groups and assume that
G1 ≅ H1, G2 ≅ H2, . . . , Gn ≅ Hn. Then G1 × G2 × . . . × Gn ≅ H1 × H2 × . . . × Hn.

Proof: Let φi: Gi → Hi be an isomorphism (i = 1,2, . . . ,n). The mapping

φ: G1 × G2 × . . . × Gn → H1 × H2 × . . . × Hn
(g1,g2, . . . ,gn) ↦ (g1φ1, g2φ2, . . . , gnφn)

is a homomorphism, because ((g1,g2, . . . ,gn)(g1´,g2´, . . . ,gn´))φ

= (g1g1´, g2g2´, . . . , gngn´)φ
= ((g1g1´)φ1, (g2g2´)φ2, . . . , (gngn´)φn)
= (g1φ1·g1´φ1, g2φ2·g2´φ2, . . . , gnφn·gn´φn)
= (g1φ1, g2φ2, . . . , gnφn)(g1´φ1, g2´φ2, . . . , gn´φn)
= (g1,g2, . . . ,gn)φ·(g1´,g2´, . . . ,gn´)φ

for all (g1,g2, . . . ,gn),(g1´,g2´, . . . ,gn´) ∈ G1 × G2 × . . . × Gn. Since

Ker φ = {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: (g1φ1, g2φ2, . . . , gnφn) = (1,1, . . . ,1)}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1φ1 = 1, g2φ2 = 1, . . . , gnφn = 1}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1 = 1, g2 = 1, . . . , gn = 1}
= {(1,1, . . . ,1)} = 1,

φ is one-to-one. Also, φ is onto: given any (h1,h2, . . . ,hn) ∈ H1 × H2 × . . . × Hn,
there are g1 ∈ G1, g2 ∈ G2, . . . , gn ∈ Gn with g1φ1 = h1, g2φ2 = h2, . . . , gnφn = hn,
thus (h1,h2, . . . ,hn) is the image, under φ, of (g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn.

So φ is an isomorphism and G1 × G2 × . . . × Gn ≅ H1 × H2 × . . . × Hn.

22.17 Lemma: Let G1,G2, . . . ,Gn be groups and H1 ⊴ G1, H2 ⊴ G2, . . . , Hn ⊴ Gn.
Then H1 × H2 × . . . × Hn ⊴ G1 × G2 × . . . × Gn and

(G1 × G2 × . . . × Gn)/(H1 × H2 × . . . × Hn) ≅ G1/H1 × G2/H2 × . . . × Gn/Hn.

Proof: The mapping φ: G1 × G2 × . . . × Gn → G1/H1 × G2/H2 × . . . × Gn/Hn,
(g1,g2, . . . ,gn) ↦ (H1g1, H2g2, . . . , Hngn),
is a homomorphism because ((g1,g2, . . . ,gn)(g1´,g2´, . . . ,gn´))φ

= (g1g1´, g2g2´, . . . , gngn´)φ
= (H1g1g1´, H2g2g2´, . . . , Hngngn´)
= (H1g1H1g1´, H2g2H2g2´, . . . , HngnHngn´)
= (H1g1, H2g2, . . . , Hngn)(H1g1´, H2g2´, . . . , Hngn´)
= (g1,g2, . . . ,gn)φ·(g1´,g2´, . . . ,gn´)φ

for all (g1,g2, . . . ,gn),(g1´,g2´, . . . ,gn´) ∈ G1 × G2 × . . . × Gn. Moreover, φ is onto:
any (H1g1, H2g2, . . . , Hngn) in G1/H1 × G2/H2 × . . . × Gn/Hn is the image, under
φ, of (g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn. Thus

Im φ = G1/H1 × G2/H2 × . . . × Gn/Hn.

To complete the proof, we need only show Ker φ = H1 × H2 × . . . × Hn
(Theorem 20.16). We indeed have

Ker φ = {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: (g1,g2, . . . ,gn)φ = (H1,H2, . . . ,Hn)}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: H1g1 = H1, H2g2 = H2, . . . , Hngn = Hn}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1 ∈ H1, g2 ∈ H2, . . . , gn ∈ Hn}
= H1 × H2 × . . . × Hn.

Exercises

1. Prove that V4 ≅ C2 × C2.

2. Show that Cmn is not isomorphic to Cm × Cn if (m,n) ≠ 1.

3. Find three nonisomorphic abelian groups of order 8 and three nonisomorphic abelian groups of order 12.

4. Show that G1 × G2 × . . . × Gn ≅ Gi1 × Gi2 × . . . × Gin for any permutation
(1 2 . . . n; i1 i2 . . . in) in Sn.

5. Prove that, if G is isomorphic to the direct product of its subgroups G1, G2, . . . ,Gn, then G1. . . Gk−1Gk+1. . . Gn ∩ Gk = 1 for all k = 1,2, . . . ,n.

6. Let H,K be normal subgroups of G. Find a one-to-one homomorphism from G/H∩K into G/H × G/K. Prove that HK/H∩K ≅ H/H∩K × K/H∩K.

7. Let φi: Gi → Hi be group homomorphisms (i = 1,2, . . . ,n). Define φ by

φ: G1 × G2 × . . . × Gn → H1 × H2 × . . . × Hn
(g1,g2, . . . ,gn) ↦ (g1φ1, g2φ2, . . . , gnφn)

(φ is sometimes denoted by φ1 × φ2 × . . . × φn). Show that φ is a homomorphism and

Ker φ = Ker φ1 × Ker φ2 × . . . × Ker φn,  Im φ = Im φ1 × Im φ2 × . . . × Im φn.

8. For any abelian group A, let Â be the set of all homomorphisms from A into ℂ\{0}. Prove that Â is an abelian group under the multiplication

a(χψ) = aχ.aψ for all a ∈ A, χ,ψ ∈ Â

and show that (A1 × A2)^ ≅ Â1 × Â2.
§23
Center and Automorphisms
of Groups

We introduce an important subgroup of a group.

23.1 Definition: Let G be a group. We put

Z(G) = {z ∈ G: zg = gz for all g ∈ G}

and call Z(G) the center of G.

The center of G consists, therefore, of the elements of G that commute with every element of G. It is a subset of G. Since 1g = g1 for all g ∈ G, the identity element belongs to Z(G), so Z(G) ≠ ∅. Obviously, Z(G) = G if and only if G is abelian.

23.2 Theorem: Let G be a group. Then Z(G) ≤ G.

Proof: We use our subgroup criterion (Lemma 9.2).

(i) Let z1,z2 ∈ Z(G). We want to show z1z2 ∈ Z(G). Thus we must show that (z1z2)g = g(z1z2) for all g ∈ G. This follows easily from z1g = gz1, z2g = gz2 for all g ∈ G, which are true since z1,z2 ∈ Z(G):

(z1z2)g = z1(z2g) = z1(gz2) = (z1g)z2 = (gz1)z2 = g(z1z2).

Hence Z(G) is closed under multiplication.

(ii) Let z ∈ Z(G). We want to show z⁻¹ ∈ Z(G). We know

g⁻¹z = zg⁻¹ for any g ∈ G;

so, taking inverses, we get

z⁻¹g = gz⁻¹ for any g ∈ G,

which means z⁻¹ ∈ Z(G). Hence Z(G) is closed under the forming of inverses.

Thus Z(G) ≤ G.

As any two elements of Z(G) commute, Z(G) is an abelian subgroup of G.
It is also a normal subgroup of G. We prove a slightly stronger result.

23.3 Theorem: Let G be a group. If H ≤ Z(G), then H ⊴ G.

Proof: We are to show g⁻¹hg ∈ H for all g ∈ G, h ∈ H. Now, if g ∈ G, h ∈ H, then g⁻¹hg = g⁻¹(hg) = g⁻¹(gh) = (g⁻¹g)h = h ∈ H, for h ∈ Z(G) commutes with g. Thus H ⊴ G.

A subgroup of G which is contained in the center of G is called a central


subgroup of G. With this terminology, Theorem 23.3 states that any
central subgroup of G is normal in G. Central subgroups are abelian.
Elements of Z(G) are also called central elements of G.

23.4 Examples: (a) Let K be a field and let us put G = GL(2,K) for brevity. We want to find Z(G). (We write a 2×2 matrix with rows a b and c d as (a b; c d).) Let (a b; c d) ∈ G. Then (a b; c d) ∈ Z(G) if and only if (a b; c d)(x y; z u) = (x y; z u)(a b; c d) for all (x y; z u) ∈ G. In particular,

(a b; c d)(1 1; 0 1) = (1 1; 0 1)(a b; c d) and (a b; c d)(0 1; 1 0) = (0 1; 1 0)(a b; c d),

hence a = a + c, a + b = b + d, c = c, c + d = d and b = c, a = d, d = a, c = b

for all (a b; c d) ∈ Z(G), so (a b; c d) = (a 0; 0 a), where a ≠ 0 since det (a b; c d) ≠ 0.

Therefore Z(G) ⊆ {(a 0; 0 a) ∈ G: a ≠ 0}

and conversely the set on the right hand side is contained in Z(G), for

(a 0; 0 a)(x y; z u) = (ax ay; az au) = (xa ya; za ua) = (x y; z u)(a 0; 0 a) for all (x y; z u) ∈ G.

Thus Z(G) = {(a 0; 0 a) ∈ G: a ≠ 0}. The elements of Z(G) are called scalar matrices.
(b) Let D4n = ⟨a,b: a^(2n) = 1, b² = 1, bab = a⁻¹⟩ be a dihedral group of order 4n > 4. What is Z(D4n)? Well, let x ∈ Z(D4n). Then x = a^j or x = a^j b for some j, 0 ≤ j ≤ 2n − 1. Since xa = ax and xb = bx, we get

a^j a = a a^j and a^j b = b a^j in case x = a^j,
a^j ba = a a^j b and a^j bb = b a^j b in case x = a^j b.

These are equivalent to

(1) a^(j+1) = a^(j+1) and a^j b = a^(−j) b in case x = a^j,
(2) a^(j−1) b = a^(j+1) b and a^j = a^(−j) in case x = a^j b.

The equations in (1) are satisfied only when 2n | 2j, that is to say, only when j = 0 or j = n, so only when x = a^0 or x = a^n. The first equation in (2) is never satisfied, for n > 1 by hypothesis. Thus Z(D4n) ⊆ {1, a^n}. The reader will easily show the reverse inclusion. Hence Z(D4n) = {1, a^n} = ⟨a^n⟩.

(c) Let us find Z(S3). It is easy to see that ι and (12) are the only permutations in S3 that commute with (12). Also, ι and (13) are the only permutations in S3 that commute with (13). Hence ι is the only permutation in S3 that commutes with both (12) and (13). A fortiori, Z(S3) = 1.
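The computation in Example 23.4(c) can also be confirmed by brute force. The Python sketch below is our own illustration (not part of the text): it represents each permutation of {0,1,2} as a tuple, composes them in the book's right-action order (apply the left factor first), and tests every element of S3 against every other.

```python
# Brute-force check (our own sketch) that the center of S_3 is trivial.
from itertools import permutations

S3 = list(permutations(range(3)))

def compose(p, q):          # x(pq) = (xp)q: apply p first, then q
    return tuple(q[p[x]] for x in range(3))

center = [z for z in S3 if all(compose(z, g) == compose(g, z) for g in S3)]
print(center)               # -> [(0, 1, 2)], i.e. only the identity
```

The same loop, run over all 24 elements of S4 or over a dihedral group, reproduces the other computations of Example 23.4.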

23.5 Lemma: Let H be a central subgroup of G. If G/H is cyclic, then G is abelian.

Proof: H ⊴ G by Theorem 23.3 and so G/H is meaningful. By hypothesis, G/H = ⟨Hg⟩ for some g ∈ G. Then, for any x ∈ G, there holds Hx = (Hg)^m = Hg^m with a suitable m ∈ ℤ. This means that any x ∈ G can be written in the form hg^m, where h ∈ H, m ∈ ℤ.

Let x,y be arbitrary elements of G. We write them as x = hg^m, y = kg^n, where h,k ∈ H ⊆ Z(G) and m,n ∈ ℤ. Then xy = (hg^m)(kg^n) = h(g^m k)g^n = h(kg^m)g^n = (hk)(g^m g^n) = (hk)g^(m+n) = (hk)g^(n+m) = (kh)(g^n g^m) = k(hg^n)g^m = k(g^n h)g^m = (kg^n)(hg^m) = yx and G is commutative.

The center of any group G is normal in G (Theorem 23.3) and is therefore the kernel of some homomorphism (Theorem 20.14). Now we construct a homomorphism whose kernel is Z(G). We will need the concept of automorphisms.

23.6 Definition: Let G be a group. An isomorphism α: G → G from G onto G itself is called an automorphism of G. The set of all automorphisms of G will be denoted by Aut(G).

Since any isomorphism is one-to-one and onto, α ∈ Aut(G) implies α ∈ SG. Thus Aut(G) ⊆ SG. The identity mapping ιG on G is an isomorphism from G onto G, so ιG ∈ Aut(G) and Aut(G) ≠ ∅. We can form the composition αβ of any α,β ∈ Aut(G). It turns out that Aut(G) is a group.

23.7 Theorem: Let G be a group. Then Aut(G) is a group under the composition of mappings.

Proof: We can check the group axioms, but there is a shorter way. We make use of Aut(G) ⊆ SG. Now SG is a group under the composition of mappings (Example 7.1(d)), so all we have to do is show that Aut(G) is a subgroup of SG.

(i) Let α,β ∈ Aut(G). Then αβ is an isomorphism from G onto G by Lemma 20.11(1). Thus αβ ∈ Aut(G) and Aut(G) is closed under multiplication.

(ii) Let α ∈ Aut(G). Then α⁻¹ is an isomorphism from G onto G by Lemma 20.11(2). Thus α⁻¹ ∈ Aut(G) and Aut(G) is closed under the forming of inverses.

By Lemma 9.2, Aut(G) ≤ SG. Thus Aut(G) is a group.

Aut(G) is not a subgroup or a factor group of G, of course. The underlying


set is neither a subset nor a set of cosets of a subgroup of G.

23.8 Example: Let G be a group. We fix an arbitrary element g of G. With each x ∈ G, we associate g⁻¹xg. This is a uniquely determined element of G, so we have a mapping x ↦ g⁻¹xg, which we denote by γg. So

γg: G → G
x ↦ g⁻¹xg.

We claim γg is a homomorphism. For all x,y ∈ G, we have

(xy)γg = g⁻¹xyg = g⁻¹xg.g⁻¹yg = xγg yγg,

and γg is therefore a homomorphism.

We can build γg with any g ∈ G. Let us take the composition of two of them, γg and γh, say. For g,h ∈ G, we have

x(γgγh) = (xγg)γh = (g⁻¹xg)γh = h⁻¹(g⁻¹xg)h = (h⁻¹g⁻¹)x(gh) = (gh)⁻¹x(gh) = xγgh

for all x ∈ G. Thus

γgγh = γgh for all g,h ∈ G.   (1)

There holds xγ1 = 1⁻¹x1 = x for all x ∈ G. Thus

γ1 = ι.   (2)

For any g ∈ G, there holds γgγg⁻¹ = γgg⁻¹ = γ1 = ι and γg⁻¹γg = γg⁻¹g = γ1 = ι by (1) and (2). Thus γg is one-to-one and onto (Theorem 3.17(2)) and γg⁻¹ is the inverse of γg:

(γg)⁻¹ = γg⁻¹.   (3)

So γg is an automorphism of G.
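Relation (1) can be checked mechanically on a small group. The Python sketch below is our own illustration (its names are ours): conjugation maps on S3 are built as closures, mappings are applied in the book's right-action order, and the composite of the conjugations by g and by h is compared with the conjugation by gh for all choices.

```python
# Our own sketch of Example 23.8: conjugation x -> g^-1 x g is built for
# every g in S_3, and the relation "conjugate by g, then by h" equals
# "conjugate by gh" is verified exhaustively.
from itertools import permutations

S3 = list(permutations(range(3)))

def compose(p, q):                       # apply p first, then q
    return tuple(q[p[x]] for x in range(3))

def inverse(p):
    q = [0, 0, 0]
    for x in range(3):
        q[p[x]] = x
    return tuple(q)

def gamma(g):                            # the inner automorphism x -> g^-1 x g
    return lambda x: compose(compose(inverse(g), x), g)

ok = all(gamma(h)(gamma(g)(x)) == gamma(compose(g, h))(x)
         for g in S3 for h in S3 for x in S3)
print(ok)   # -> True
```

The loop runs over all 6·6·6 triples, so it is an exhaustive verification of (1) for this particular group.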

Such automorphisms deserve a name.

23.9 Definition: Let G be a group. An automorphism of G of the form γg, where g ∈ G, is called an inner automorphism of G. The set

{γg ∈ Aut(G): g ∈ G}

of all inner automorphisms of G will be denoted by Inn(G).

Inner automorphisms of a group form a group.

23.10 Theorem: Let G be a group. Then Inn(G) ≤ Aut(G).

Proof: ι = γ1 ∈ Inn(G) by (2), so Inn(G) ≠ ∅. Now (i) the product of two inner automorphisms is an inner automorphism by (1); and (ii) the inverse of an inner automorphism is an inner automorphism by (3). So Inn(G) ≤ Aut(G).

The relation (1) has a deep significance. It states that the mapping

γ: G → Aut(G)
g ↦ γg

is a homomorphism. Theorem 20.16 gives G/Ker γ ≅ Im γ.

Here Im γ = {γg ∈ Aut(G): g ∈ G} = Inn(G) by definition and

Ker γ = {z ∈ G: γz = ι}
= {z ∈ G: gγz = g for all g ∈ G}
= {z ∈ G: z⁻¹gz = g for all g ∈ G}
= {z ∈ G: gz = zg for all g ∈ G}
= Z(G).

Thus Z(G) is the kernel of γ: G → Aut(G). We proved

23.11 Theorem: Let G be a group. Then G/Z(G) ≅ Inn(G).

Next we prove that Inn(G) is a normal subgroup of Aut(G).

23.12 Lemma: Let G be a group. Then Inn(G) ⊴ Aut(G).

Proof: We know Inn(G) ≤ Aut(G) from Theorem 23.10. We are to show α⁻¹γgα ∈ Inn(G) for any γg ∈ Inn(G), α ∈ Aut(G). For any x ∈ G, we have

x(α⁻¹γgα) = ((xα⁻¹)γg)α
= (g⁻¹(xα⁻¹)g)α
= (g⁻¹α)((xα⁻¹)α)(gα)
= (gα)⁻¹x(gα)
= xγh, where h = gα,

thus α⁻¹γgα = γh ∈ Inn(G) with h = gα. This proves Inn(G) ⊴ Aut(G).

Let G be a group and let H ≤ G. According to Lemma 18.2(3), H ⊴ G if and only if Hγg = H for all γg ∈ Inn(G). This suggests a way of strengthening the normality concept: instead of requiring Hα = H for all α ∈ Inn(G), we prescribe this to hold for all α ∈ Aut(G).

23.13 Definition: Let G be a group. A subgroup H of G is said to be a characteristic subgroup of G, or to be characteristic in G, provided Hα = H for all α ∈ Aut(G).

Here Hα means the set {hα: h ∈ H} ⊆ G as usual. The equality Hα = H is a set equality, of course. It does not mean that hα = h for all h ∈ H. It means that hα ∈ H for any h ∈ H, and, for any h ∈ H, there is an h´ ∈ H such that h´α = h. Cf. Example 18.5(b). As Inn(G) ⊆ Aut(G), any characteristic subgroup of G is normal in G, but the converse is not true in general.

Being characteristic is a transitive relation, a good property not shared


by normality (Example 18.5(i)).

23.14 Lemma: Let K ≤ H ≤ G. If K is characteristic in H and H is characteristic in G, then K is characteristic in G.

Proof: We are to prove that Kα = K for all α ∈ Aut(G). Let α ∈ Aut(G). We restrict α to H. Then α|H: H → G is a one-to-one homomorphism onto Hα. Since H is characteristic in G, we have Hα = H and α|H is an automorphism of H. Then Kα|H = K, because K is characteristic in H. Thus Kα = K for all α in Aut(G) and K is a characteristic subgroup of G.

Another useful result of this type is given in the next lemma.

23.15 Lemma: Let K ≤ H ≤ G. If K is characteristic in H and H is normal in G, then K is normal in G.

Proof: We are to prove that Kγg = K for all γg ∈ Inn(G). Let γg ∈ Inn(G). We restrict γg to H. Then γg|H: H → G is a one-to-one homomorphism onto Hγg = g⁻¹Hg. Since H is normal in G, we have g⁻¹Hg = H and γg|H is an automorphism of H. Then Kγg|H = K, because K is characteristic in H. Thus g⁻¹Kg = Kγg|H = K for all g ∈ G and K is a normal subgroup of G.

23.16 Theorem: Let G be a group. Then Z(G) is characteristic in G.

Proof: We must show Z(G)α = Z(G) for all α ∈ Aut(G). If we can prove Z(G)α ⊆ Z(G) for all α ∈ Aut(G), then we will have Z(G)α⁻¹ ⊆ Z(G), that is, Z(G) ⊆ Z(G)α for any α ∈ Aut(G) also (cf. the proof of (2) ⇒ (3) in Lemma 18.2). So we need only prove Z(G)α ⊆ Z(G). For any z ∈ Z(G), we are to show that (zα)g = g(zα) for all g ∈ G. As g runs through G, so does gα, because α is onto G. Thus we need only show (zα)(gα) = (gα)(zα) for all g ∈ G. But this is obvious: (zα)(gα) = (zg)α = (gz)α = (gα)(zα) since z ∈ Z(G) and α is a homomorphism. Consequently, Z(G) is characteristic in G.

We end this paragraph by finding the automorphism group of a finite


cyclic group. In general, given a group G, it is quite difficult to find
Aut(G).

Let Cn = ⟨x: x^n = 1⟩ be a cyclic group of order n ∈ ℕ. An automorphism of Cn is first of all a homomorphism of Cn. We claim that a homomorphism from Cn into Cn is uniquely determined by its effect on the generator x. In other words, if φ and ψ are homomorphisms from Cn into Cn and xφ = xψ, then φ = ψ. To show this, we must prove aφ = aψ for all a ∈ Cn. But a = x^m for some m ∈ ℤ, and aφ = x^m φ = (xφ)^m = (xψ)^m = x^m ψ = aψ. This proves the claim.

Let φ be a homomorphism from Cn into Cn. Then xφ = x^m for some m ∈ ℕ. Then x^k φ = (xφ)^k = (x^m)^k = x^mk = (x^k)^m for any k ∈ ℤ. This shows aφ = a^m for any a ∈ Cn. Thus a homomorphism from Cn into Cn simply sends each element of Cn to its m-th power, m being a natural number depending only on the homomorphism. The homomorphism of taking m-th powers will be denoted by πm. Hence

πm: Cn → Cn
a ↦ a^m

is a homomorphism from Cn into Cn, and any homomorphism from Cn into Cn is one of the πm.

From the homomorphisms {πm: m ∈ ℕ}, we want to select the automorphisms. These are the one-to-one πm's onto Cn. Since Cn is a finite set, any one-to-one mapping from Cn into Cn is in fact onto Cn. So we need find only the one-to-one πm's. These and exactly these are the automorphisms of Cn.

Now πm is one-to-one if and only if Ker πm = 1 (Theorem 20.8) and

Ker πm = {g ∈ Cn: gπm = 1}
= {x^k: k ∈ ℤ and x^km = 1}
= {x^k: k ∈ ℤ and n | km}
= {x^k: k ∈ ℤ and n/(n,m) | km/(n,m)}
= {x^k: k ∈ ℤ and n/(n,m) | k}
= ⟨x^(n/(n,m))⟩,

so Ker πm = 1 = ⟨x^n⟩ if and only if (n,m) = 1. Thus πm is an automorphism of Cn if and only if (n,m) = 1.

Hence Aut(Cn) = {πm: (n,m) = 1}.

This description of Aut(Cn) looks like an infinite set. Aut(Cn) is finite of course. Therefore, there are repetitions among the πm. To see this more vividly, we remark that πm = πk if and only if m ≡ k (mod n). Indeed, πm is equal to πk if and only if xπm = xπk by the claim above, thus if and only if x^m = x^k, thus if and only if x^(m−k) = 1, thus if and only if n | m − k by Lemma 11.6, thus if and only if m ≡ k (mod n).

Hence, for any m̄ ∈ ℤn, we may unambiguously write πm̄: Cn → Cn, a ↦ a^m. With this notation, we have

Aut(Cn) = {πm̄: m̄ ∈ ℤn×}

and m̄ ≠ k̄ implies πm̄ ≠ πk̄. In other words, the mapping

π: ℤn× → Aut(Cn)
m̄ ↦ πm̄

is one-to-one and onto. It is a homomorphism, because

xπm̄k̄ = x^mk = (x^m)^k = (x^m)πk̄ = (xπm̄)πk̄ = x(πm̄πk̄)

and, by the claim at the beginning, πm̄k̄ = πm̄πk̄ for any m̄,k̄ ∈ ℤn×. Hence π is an isomorphism and ℤn× ≅ Aut(Cn). We proved

23.17 Theorem: If G is a cyclic group of order n ∈ ℕ, then Aut(G) ≅ ℤn×.
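For a concrete instance of Theorem 23.17, here is a small Python sketch of our own (not from the book). Writing the cyclic group additively as Zn, the m-th power map becomes a ↦ ma mod n; the code selects those m for which this map is one-to-one and observes that they are exactly the m prime to n.

```python
# Our own sketch of the discussion above: in Z_12 the power maps that are
# automorphisms are exactly those with gcd(12, m) = 1, so |Aut(C_12)| = phi(12).
from math import gcd

def is_automorphism(n, m):
    image = {(m * a) % n for a in range(n)}     # the "m-th power" map, additively
    return image == set(range(n))               # one-to-one (hence onto) on Z_n

n = 12
auto = [m for m in range(n) if is_automorphism(n, m)]
print(auto)                                     # -> [1, 5, 7, 11]
print(all(gcd(n, m) == 1 for m in auto))        # -> True
```

Changing `n` reproduces the count |Aut(Cn)| = φ(n) for any modulus, in agreement with the theorem.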

Exercises

1. Let H,K be groups. Prove that Z(H K) = Z(H) Z(K).

2. Let K G and K = 2. Prove that K is a central subgroup of G.

3. Prove that K G implies Z(K) G. Show by an example that Z(K) is


not necessarily characteristic in G.

4. Find groups K,G such that K G and Z(K) Z(G).

5. Let G be a group and x,y G. Prove that, if xy Z(G), then xy = yx.

6. Find the centers of D4, D2n (n odd), SL(2, ),SL(2, ).

7. Prove that Z(S n) = 1 for n 3 and Z(An) = 1 for n 4.

8. Define a subgroup M of G by M/Z(G) = Z(G/Z(G)). Show that M is char-


acteristic in G.

252
9. Let Aut(G) and H G. Prove that H is a subgroup of G and is iso-
morphic to H.

10. Let A Aut(G) and K H G. Suppose that K is characteristic


in H and H = H for all A. Prove that K = K for all A.

11. Show that, if G H, then Aut(G) Aut(H).

12. Find all characteristic subgroups of D8. Prove that Inn(D8) 1 and
that Aut(D8) D8.

13. Prove that Aut( ) C2, Aut(V4) S3, Aut(S3) S3, Aut(Q8) S4 (see
§17, Ex. 15).

14. Let H be a characteristic subgroup of G and put


N ={ Aut(G): (x )x 1 H for all x G}. Prove that N Aut(G).

15. Find a one-to-one homomorphism from Aut(H K) into


Aut(H) Aut(K).

16. Let H K G and Aut(G). Prove that H K and, if also H


K, then H K .

253
§24
Generators and Commutators

We introduce an important subgroup which distinguishes abelian factor groups from nonabelian ones. It is generated by the set of commutators. First we define 'generation'.

24.1 Definition: Let G be a group and let X ⊆ G. The intersection of all subgroups of G which contain X is called the subgroup of G generated by X and is denoted by ⟨X⟩.

Hence ⟨X⟩ = ∩ H, the intersection being over all subgroups H of G with X ⊆ H ≤ G. Here H runs through a nonempty set, since at least G is a subgroup of G that contains X. Note that ⟨∅⟩ = 1. When X is a finite set, for instance X = {x1,x2, . . . ,xn}, we write ⟨x1,x2, . . . ,xn⟩ rather than ⟨{x1,x2, . . . ,xn}⟩. In particular, if X = {x} consists of a single element, then ⟨x⟩ = ⟨{x}⟩ is the cyclic group generated by x, as we introduced in Definition 11.1. Definitions 11.1 and 24.1 are consistent, as will be proved in Lemma 24.2, below. Our notation ⟨σ,τ⟩ for dihedral groups is also consistent with Definition 24.1.

When K ≤ G and X ⊆ K, then ⟨X⟩ ≤ K by definition. So ⟨X⟩ is the smallest subgroup of G containing X. In particular, if H ≤ G, then ⟨H⟩ = H.

The elements of ⟨X⟩ are described in the next lemma. See also Ex. 1 at the end of this paragraph.

24.2 Lemma: Let X be a nonempty subset of a group G. Then

⟨X⟩ = {x1^m1 x2^m2 . . . xk^mk ∈ G: k ∈ ℕ, xi ∈ X and mi ∈ ℤ for each i = 1,2, . . . ,k}.

Proof: Let Y be the set on the right hand side. We must show Y ⊆ ⟨X⟩ and ⟨X⟩ ⊆ Y.

In order to prove Y ⊆ ⟨X⟩, we show that Y ⊆ H for every H ≤ G such that X ⊆ H. This follows from the closure properties of subgroups. If X ⊆ H and H ≤ G, then, for any x ∈ X, there holds x^n ∈ H for any n ∈ ℕ since H is closed under multiplication, and also x^(−n) ∈ H for any n ∈ ℕ since H is closed under taking inverses, and x^0 = 1 ∈ H. Hence, for any k ∈ ℕ, any x1,x2, . . . ,xk ∈ X, any m1,m2, . . . ,mk ∈ ℤ, we have x1^m1, x2^m2, . . . , xk^mk ∈ H and, from the closure of H under multiplication, we get x1^m1 x2^m2 . . . xk^mk ∈ H. Thus Y ⊆ H whenever X ⊆ H ≤ G. This proves Y ⊆ ⟨X⟩.

Now we show ⟨X⟩ ⊆ Y. By definition of Y, we have X ⊆ Y (take k = 1 and m1 = 1). So ⟨X⟩ ⊆ Y will be proved if we show that Y is a subgroup of G. But Y is closed under multiplication (because k runs through ℕ) and under the forming of inverses (because −mi ∈ ℤ when mi ∈ ℤ). So X ⊆ Y ≤ G and consequently ⟨X⟩ ⊆ Y.
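Lemma 24.2 suggests a direct algorithm for computing ⟨X⟩ in a finite group: close X together with its inverses under multiplication until nothing new appears. The Python sketch below is our own illustration (the helper names are ours); it runs the closure for X = {(12), (1234)} inside S4 and recovers the whole group.

```python
# Our own sketch of Lemma 24.2: <X> as the closure of X and X^-1 under products.
from itertools import permutations  # (only used implicitly for context; tuples suffice)

def compose(p, q):                     # apply p first, then q
    return tuple(q[p[x]] for x in range(len(p)))

def inverse(p):
    q = [0] * len(p)
    for x in range(len(p)):
        q[p[x]] = x
    return tuple(q)

def generate(gens):                    # <X>: close under products and inverses
    gens = set(gens) | {inverse(g) for g in gens}
    subgroup = set(gens)
    frontier = set(gens)
    while frontier:                    # breadth-first closure: every word in the
        new = {compose(a, b)           # generators is eventually produced
               for a in frontier for b in subgroup} - subgroup
        subgroup |= new
        frontier = new
    return subgroup

transposition = (1, 0, 2, 3)           # (12) on the points {0,1,2,3}
four_cycle = (1, 2, 3, 0)              # (1234)
print(len(generate([transposition, four_cycle])))   # -> 24, i.e. all of S_4
```

Every element the loop produces is a word x1^m1 . . . xk^mk in the generators, and conversely every such word is produced after finitely many rounds, so the result is exactly the set Y of the lemma.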

24.3 Remark: ⟨X⟩ consists of all finite products of elements in X and the inverses of the elements in X. Notice that the set Y of Lemma 24.2 does not change if the elements of X are replaced by their inverses. Thus ⟨X⟩ = ⟨Z⟩, where Z = {x⁻¹ ∈ G: x ∈ X}.

24.4 Definition: Let G be a group. If X ⊆ G and ⟨X⟩ = G, then X is called a set of generators of G, and G is said to be generated by X. If G has a finite set of generators, G is said to be a finitely generated group.

24.5 Examples: (a) If x ∈ G, then ⟨x⟩ = {x^n: n ∈ ℤ} by Lemma 24.2. So ⟨x⟩ is the cyclic group generated by x as in Definition 11.1.

(b) Any element of the dihedral group D2n can be written in the form σ^m τ^j, where m,j ∈ ℤ. Hence D2n = ⟨σ,τ⟩. So the notation of §14 is consistent with Definition 24.1.

(c) Any permutation in Sn (n ≥ 2) can be written as a product of transpositions (Theorem 16.2). Let T be the set of all transpositions in Sn. Then Sn = ⟨T⟩ by Lemma 24.2.

(d) SL(2,ℤ) is generated by {(1 1; 0 1), (0 −1; 1 0)}. A proof of this is outlined in Ex. 9.

24.6 Lemma: Let G be a group and let X be a nonempty subset of G. Suppose xα ∈ X for all x ∈ X and for all α ∈ Aut(G) [respectively for all α ∈ Inn(G)]. Then ⟨X⟩ is a characteristic [respectively normal] subgroup of G.

Proof: Let y ∈ ⟨X⟩. Then y = x1^m1 x2^m2 . . . xk^mk for some suitable k ∈ ℕ, x1,x2, . . . ,xk ∈ X, and m1,m2, . . . ,mk ∈ ℤ (Lemma 24.2). Then, for any α in Aut(G) [respectively for any α in Inn(G)],

yα = (x1^m1 x2^m2 . . . xk^mk)α = (x1α)^m1 (x2α)^m2 . . . (xkα)^mk ∈ ⟨X⟩

by Lemma 24.2, for x1α, x2α, . . . , xkα ∈ X by hypothesis. Thus ⟨X⟩α ⊆ ⟨X⟩ for any α ∈ Aut(G) [respectively for any α ∈ Inn(G)]. But then we have ⟨X⟩α⁻¹ ⊆ ⟨X⟩ for any α ∈ Aut(G) [respectively for any α ∈ Inn(G)], too. Then ⟨X⟩ = (⟨X⟩α⁻¹)α ⊆ ⟨X⟩α ⊆ ⟨X⟩. Hence ⟨X⟩α = ⟨X⟩ for all α ∈ Aut(G) [respectively for all α ∈ Inn(G)] and ⟨X⟩ is a characteristic [respectively normal] subgroup of G.

We are now in a position to introduce commutator subgroups.

24.7 Definition: Let G be a group and x,y ∈ G. Then

x⁻¹y⁻¹xy ∈ G

is called the commutator of x and y (in this order) and is denoted by [x,y].

Some authors define [x,y] to be xyx⁻¹y⁻¹. In this book, [x,y] will always stand for x⁻¹y⁻¹xy. Clearly, xy = yx[x,y] for any x,y ∈ G. In general, xy ≠ yx, and [x,y] is that element z in G for which xy = yx.z, whence the name commutator.
24.8 Lemma: Let G be a group and x,y ∈ G.
(1) [x,y]⁻¹ = [y,x].
(2) [x,y] = 1 if and only if x and y commute: xy = yx.

Proof: (1) [x,y]⁻¹ = (x⁻¹y⁻¹xy)⁻¹ = y⁻¹x⁻¹(y⁻¹)⁻¹(x⁻¹)⁻¹ = y⁻¹x⁻¹yx = [y,x].

(2) [x,y] = 1 means x⁻¹y⁻¹xy = 1, and this means xy = yx.

From Lemma 24.8(2), we understand that commutators measure, so to


speak, how nonabelian a group is. When the set of commutators consists
of 1 only, then the group is abelian. Rather sloppily, the more
nonidentity commutators a group has, the more elements of G fail to
commute with other elements of G, and the more nonabelian G is. This
vague statement will acquire a precise meaning below (Lemma 24.12
and Theorem 24.14).

24.9 Definition: Let H,K ≤ G. We define the commutator subgroup corresponding to H and K as

[H,K] = ⟨[h,k] ∈ G: h ∈ H, k ∈ K⟩.

We saw in Lemma 24.8(1) that the inverse of a commutator is a commutator. However, when H and K are subgroups of G, the inverse of a commutator of the form [h,k], where h ∈ H, k ∈ K, need not be a commutator of the form [h´,k´], with h´ ∈ H, k´ ∈ K. Also, the product of two commutators is not a commutator in general. The commutator subgroups are defined to be the subgroups generated by the set of appropriate commutators, not as the set of commutators.

24.10 Lemma: Let H,K ≤ G. Then [H,K] = [K,H].

Proof: We have [H,K] = ⟨[h,k] ∈ G: h ∈ H, k ∈ K⟩
= ⟨[h,k]⁻¹ ∈ G: h ∈ H, k ∈ K⟩ (by Remark 24.3)
= ⟨[k,h] ∈ G: k ∈ K, h ∈ H⟩
= [K,H].

24.11 Lemma: Let H,K ≤ G. If H and K are characteristic [respectively normal] subgroups of G, then [H,K] is characteristic [respectively normal] in G.

Proof: We use Lemma 24.6, with X = {[h,k]: h ∈ H, k ∈ K}. It suffices to show that xα ∈ X for all x ∈ X and for all α ∈ Aut(G) [respectively for all α ∈ Inn(G)]. This follows from

x = [h,k] for some h ∈ H, k ∈ K,

xα = [h,k]α = (h⁻¹k⁻¹hk)α = (hα)⁻¹(kα)⁻¹(hα)(kα) = [hα,kα] ∈ X,

as hα ∈ H, kα ∈ K for any α ∈ Aut(G) [respectively for any α ∈ Inn(G)] when H and K are characteristic [respectively normal] subgroups of G.

24.12 Lemma: Let H ⊴ G, K ⊴ G. Then [H,K] ≤ H ∩ K. In particular, if H ∩ K = 1, then every element of H commutes with every element of K.

Proof: It suffices to show that [h,k] ∈ H ∩ K for all h ∈ H, k ∈ K. For any h ∈ H, k ∈ K, we have indeed

[h,k] = h⁻¹.k⁻¹hk ∈ H since H ⊴ G,
[h,k] = h⁻¹k⁻¹h.k ∈ K since K ⊴ G,

yielding [h,k] ∈ H ∩ K.

If H ∩ K = 1, then [h,k] ∈ H ∩ K = 1 and [h,k] = 1, so hk = kh for all h ∈ H and k ∈ K.

The preceding lemma supports our vague remark that commutators


measure how nonabelian a group is. Suppose we treat, somehow, com-
mutators like the identity. Then the group will be like an abelian group.
The formal way of treating commutators like 1 is to define an equi-
valence relation on the group in such a way that all commutators will be
equivalent to 1. The most natural equivalence relation of this type is
right congruence modulo the subgroup generated by all commutators

(Definition 10.4). The equivalence classes, which are the right cosets of this normal subgroup, form a factor group. We expect this factor group to be abelian. First we give a name to the subgroup.

24.13 Definition: Let G be a group. Then the subgroup

[G,G] = ⟨[g,g´]: g,g´ ∈ G⟩

generated by all commutators in G is called the derived subgroup of G, denoted by G´.

G is abelian if and only if G´ = 1. Now G´ is a characteristic subgroup of G (Lemma 24.11), hence we can build the factor group G/G´. We expect that G/G´ is abelian. In fact, much more is true.

24.14 Theorem: Let K ⊴ G. Then G/K is abelian if and only if G´ ≤ K.

Proof: G/K is abelian ⟺ (xK)(yK) = (yK)(xK) for all x,y ∈ G
⟺ xyK = yxK for all x,y ∈ G
⟺ x⁻¹y⁻¹xyK = K for all x,y ∈ G
⟺ x⁻¹y⁻¹xy ∈ K for all x,y ∈ G
⟺ [x,y] ∈ K for all x,y ∈ G
⟺ ⟨[x,y]: x,y ∈ G⟩ ≤ K
⟺ G´ ≤ K.
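As a small computation in the spirit of Definition 24.13 and Theorem 24.14, here is a Python sketch of our own (not from the book): it lists all commutators [x,y] = x⁻¹y⁻¹xy in S3. For this particular group the set of commutators is already a subgroup, namely A3, so G´ = A3 and G/G´ has order 2, hence is abelian.

```python
# Our own sketch: the commutators of S_3 form A_3 = {identity, two 3-cycles}.
from itertools import permutations

def compose(p, q):                      # apply p first, then q
    return tuple(q[p[x]] for x in range(3))

def inverse(p):
    q = [0, 0, 0]
    for x in range(3):
        q[p[x]] = x
    return tuple(q)

def commutator(x, y):                   # [x,y] = x^-1 y^-1 x y
    return compose(compose(compose(inverse(x), inverse(y)), x), y)

S3 = list(permutations(range(3)))
commutators = {commutator(x, y) for x in S3 for y in S3}
print(sorted(commutators))              # -> [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
```

Note that in general one must still close the commutator set under products (that is, pass to the generated subgroup); it is a special feature of this example that no closure step is needed.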

Exercises

1. Let G be a group and X a nonempty subset of G. Prove that

⟨X⟩ = {x1^ε1 x2^ε2 . . . xk^εk ∈ G: k ∈ ℕ, xi ∈ X and εi = ±1 for all i = 1,2, . . . ,k}.

2. Show that Sn = ⟨(12), (123 . . . n−1 n)⟩ when n ≥ 3.

3. If H ≤ G, |G:H| is finite and G is finitely generated, show that H is also finitely generated.

4. If H ⊴ G and G is finitely generated, show that G/H is also finitely generated.

5. Show that every finitely generated subgroup of ℚ is cyclic.

6. Let H1 ≤ H2 ≤ H3 ≤ . . . be subgroups of G. Prove that H := H1 ∪ H2 ∪ H3 ∪ . . . is a subgroup of G. Prove further that, if each Hi is a proper subgroup of G, then H is also a proper subgroup of G.

7. Let σ: ℝ → ℝ, u ↦ u + 1, and τ: ℝ → ℝ, u ↦ 2u, and put G = ⟨σ,τ⟩ ≤ S_ℝ. Let αn = τ^n σ τ^(−n) for n ∈ ℕ. Show that (αn+1)² = αn for all n ∈ ℕ. Show that

⟨α1⟩ ≤ ⟨α2⟩ ≤ ⟨α3⟩ ≤ . . . .

Prove that ⟨αn⟩ is a proper subgroup of A := ⟨α1⟩ ∪ ⟨α2⟩ ∪ ⟨α3⟩ ∪ . . . for all n ∈ ℕ. Using Ex. 6, conclude that A is not finitely generated. Thus a subgroup of a finitely generated group need not be finitely generated.

8. Let M = ℝ\{0,1} and σ: M → M, x ↦ 1/x, and τ: M → M, x ↦ 1/(1 − x). Prove that σ,τ ∈ SM and that ⟨σ,τ⟩ is isomorphic to S3.

9. Show that SL(2,ℤ) = ⟨T,S⟩, where T = (1 1; 0 1) and S = (0 −1; 1 0), by going through the following steps. Let M = (a b; c d) ∈ SL(2,ℤ). If c = 0, then M is a power of T. Make induction: suppose a matrix in SL(2,ℤ) belongs to ⟨T,S⟩ whenever its lower-left entry is positive and < c. If M = (a b; c d) is a matrix whose lower-left entry is c > 0, divide d by c, so that d = qc + r, 0 ≤ r < c. Then MT^(−q)S is in ⟨T,S⟩, and so is M. Thus (a b; c d) ∈ ⟨T,S⟩ whenever c ≥ 0. If c is negative, MS² ∈ ⟨T,S⟩, and so M ∈ ⟨T,S⟩.

10. Let H ≤ G. Prove that [H,G] = 1 if and only if H ≤ Z(G) and also that [H,G] ≤ H if and only if H ⊴ G.

11. Show that, if G´ ≤ N ≤ G, then N is a normal subgroup of G.

12. Let K ⊴ G. Prove that [xK,yK] = [x,y]K ∈ G/K for any x,y ∈ G. Then prove that [HK/K, JK/K] = [H,J]K/K for all H,J ≤ G.

13. Let H1,H2 ≤ H and K1,K2 ≤ K. Show that [H1 × K1, H2 × K2] = [H1,H2] × [K1,K2] as subgroups of H × K.

14. Show that [xy,z] = y⁻¹[x,z]y.[y,z] and [x,yz] = [x,z].z⁻¹[x,y]z for any elements x,y,z of a group G. Deduce that [HJ,K] = [H,K][J,K] whenever H,J,K are normal subgroups of G.

15. For any elements x,y,z of a group G, show that

y⁻¹[[x,y⁻¹],z]y . z⁻¹[[y,z⁻¹],x]z . x⁻¹[[z,x⁻¹],y]x = 1.

16. Let H,K,L be subgroups of a group G and N ⊴ G. If two of the subgroups [[H,K],L], [[K,L],H], [[L,H],K] are contained in N, prove that the third is also contained in N.

17. Give an example of a group G and three subgroups H,K,L of G such that [[H,K],L] ≠ [H,[K,L]].

18. Prove: if K ⊴ G, then K´ ⊴ G.

19. Find the derived subgroups of S3, S4, A4, D8, Q8 (see §17, Ex. 15), SL(2,ℤ3), GL(2,ℤ3), Sn, An (for n ≥ 2).

20. Let G be a group such that G´ ≤ Z(G) and let a be a fixed element of G. Prove that the mapping φ: G → G, x ↦ [x,a], is a homomorphism.

§25
Group Actions

Many of the important groups we have examined so far are groups of functions. SX is the group of one-to-one mappings on the set X, Isom E is the group of distance preserving functions on the Euclidean plane, Aut(G) is the group of multiplication preserving functions on a group G. You will see more examples later. In general, when X is a set with some structure on it (algebraic, geometric, analytic, topological or of some other type), the mappings on X that preserve this structure form a group. Up to now, we neglected the functional character the elements of a group might have. In this paragraph, we consider groups whose elements can be thought of as functions on a set X. This leads to the idea of group actions.

25.1 Definition: Let G be a group and let X be a nonempty set. We say that G acts on X provided, to each x ∈ X and g ∈ G, there corresponds a uniquely determined element of X, denoted by xg, such that the following hold:

(xg1)g2 = x(g1g2) for all x ∈ X, g1,g2 ∈ G,
x1 = x for all x ∈ X.

More precisely, we say then that G acts on X on the right. We similarly define a left action of G on X by stipulating that (g1g2)x = g1(g2x) and 1x = x for all x ∈ X, g1,g2 ∈ G, where gx is a uniquely determined element of X corresponding to the pair g,x.

25.2 Examples: (a) Let X be a nonempty set and G = SX. Then G acts on X when we naturally interpret xg as the image of x ∈ X under the mapping g ∈ G. The condition (xg1)g2 = x(g1g2) is satisfied for all x ∈ X and for all g1,g2 ∈ G, for it is nothing else than the definition of composition of mappings. The condition x1 = x holds, too, since it is the definition of the identity mapping 1 ∈ G on X. More generally, if G ≤ SX, then G acts on X.

(b) Let X = ℝ × ℝ and let G = GL(2,ℝ). Then G acts on X if we put

(x,y)(a b; c d) = (xa + yc, xb + yd).

We have indeed

((x,y)(a b; c d))(e f; g h) = (xa + yc, xb + yd)(e f; g h)
= ((xa + yc)e + (xb + yd)g, (xa + yc)f + (xb + yd)h)
= (xae + yce + xbg + ydg, xaf + ycf + xbh + ydh)

and

(x,y)((a b; c d)(e f; g h)) = (x,y)(ae + bg  af + bh; ce + dg  cf + dh)
= (x(ae + bg) + y(ce + dg), x(af + bh) + y(cf + dh))
= (xae + xbg + yce + ydg, xaf + xbh + ycf + ydh)

and so ((x,y)(a b; c d))(e f; g h) = (x,y)((a b; c d)(e f; g h)) for all (x,y) ∈ X and (a b; c d),(e f; g h) ∈ G.

One proves analogously that G acts on the set Y of columns (x; y), x,y ∈ ℝ, on the left when we put (a b; c d)(x; y) = (ax + by; cx + dy) for all (a b; c d) ∈ G, (x; y) ∈ Y.

Clearly, the field ℝ can be replaced by any field in this example.
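The verification above is a disguised associativity of matrix multiplication, which the following numeric Python sketch (our own illustration, with an arbitrarily chosen vector and two arbitrarily chosen matrices) makes concrete.

```python
# Our own sketch of Example 25.2(b): row vectors acted on by 2x2 matrices,
# (x,y)A = (xa + yc, xb + yd); the action axiom (vA)B = v(AB) is exactly
# associativity of matrix multiplication.
def act(v, A):
    (x, y), ((a, b), (c, d)) = v, A
    return (x * a + y * c, x * b + y * d)

def matmul(A, B):
    ((a, b), (c, d)), ((e, f), (g, h)) = A, B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

A = ((1, 1), (0, 1))
B = ((0, -1), (1, 0))
v = (2, 3)

lhs = act(act(v, A), B)        # (vA)B
rhs = act(v, matmul(A, B))     # v(AB)
print(lhs, rhs)                # -> (5, -2) (5, -2)
```

The identity axiom x1 = x corresponds to acting with the identity matrix ((1,0),(0,1)), which fixes every vector.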

(c) Let X = ℤ × ℤ × ℤ and G = SL(2,ℤ) = {(α β; γ δ) ∈ Mat2(ℤ): αδ − βγ = 1}.

Then G acts on X when we define (a,b,c)(α β; γ δ) to be

(aα² + bαγ + cγ², 2aαβ + b(αδ + βγ) + 2cγδ, aβ² + bβδ + cδ²),

the coefficient triple of the form obtained from ax² + bxy + cy² by the substitution x → αx + βy, y → γx + δy. The verification is left to the reader.

(d) Suppose G acts on X on the left and we denote the element of X corresponding to the pair g,x (g ∈ G, x ∈ X) by g*x. Then G acts on X on the right when we put xg := g⁻¹*x, because

(xg1)g2 = (g1⁻¹*x)g2 = g2⁻¹*(g1⁻¹*x) = g2⁻¹g1⁻¹*x = (g1g2)⁻¹*x = x(g1g2)
and
x1 = 1⁻¹*x = 1*x = x

for all x ∈ X, g1,g2 ∈ G. We could not write xg := g*x, for then we would get (xg1)g2 = x(g2g1) instead of (xg1)g2 = x(g1g2). However, if G is commutative, G acts on X on the right when we put xg := g*x.

(e) Let F be a nonempty subset of the Euclidean plane E. Then Sym F acts on F, because fσ ∈ F for all f ∈ F, σ ∈ Sym F and

f(σ1σ2) = (fσ1)σ2 and fι = f

for all f ∈ F, σ1,σ2 ∈ Sym F.

(f) Let G be a group. Then Aut(G) acts on G, because g(α1α2) = (gα1)α2 and gι = g for all g ∈ G and for all α1,α2 ∈ Aut(G).

(g) Let 𝒫 be the set of all nonempty subsets of the Euclidean plane E. Then SE acts on 𝒫 since (Fσ)τ = F(στ) and Fι = F for all F ∈ 𝒫 and σ,τ ∈ SE (Lemma 14.1).

(h) Assume that a group G acts on a set X. Then any subgroup of G also acts on X.

In the next two theorems, we shall show that any group action on a set X is essentially a homomorphism into SX.

25.3 Theorem: Let G act on X. For each g ∈ G, consider x ↦ xg as a function and put ĝ: X → X, x ↦ xg. Then ĝ ∈ SX and the mapping

ρ: G → SX
g ↦ ĝ

is a homomorphism (called the permutation representation of G corresponding to the action).

Proof: Let g ∈ G. Since G acts on X, to each x ∈ X, there corresponds a uniquely determined element xg of X. Hence ĝ: x ↦ xg is indeed a function from X into X.

For any x ∈ X, g1,g2 ∈ G, we have

x(g1g2)^ = x(g1g2) = (xg1)g2 = (xg1)ĝ2 = (xĝ1)ĝ2 = x(ĝ1ĝ2),

so (g1g2)^ = ĝ1ĝ2.   (1)

Furthermore, x1̂ = x1 = x for all x ∈ X, hence

1̂ = ιX ∈ SX.   (2)

From (1) and (2), we obtain

ĝ(g⁻¹)^ = (gg⁻¹)^ = 1̂ = ιX ∈ SX;  (g⁻¹)^ĝ = (g⁻¹g)^ = 1̂ = ιX ∈ SX

and thus ĝ is one-to-one and onto (Theorem 3.17(2)). So ĝ ∈ SX for all g in G.

So we have a mapping ρ: G → SX, g ↦ ĝ, and it is a homomorphism by (1).

25.4 Theorem: Let X be a nonempty set and let φ: G → SX be a group homomorphism. Then G acts on X when we put

xg = x(gφ)

for all x ∈ X, g ∈ G. Furthermore, the permutation representation of G corresponding to this action is φ.

Proof: The proof consists in observing that φ is a homomorphism. We have

(xg1)g2 = (xg1)(g2φ) = (x(g1φ))(g2φ) = x((g1φ)(g2φ)) = x((g1g2)φ) = x(g1g2)
and
x1 = x(1φ) = x

for all x ∈ X, g1,g2 ∈ G. Here we use the fact that 1φ ∈ SX is the identity element of the group SX (Lemma 20.3(a)), which is the identity mapping on X. Thus setting xg = x(gφ) does define a group action.

Let us find the permutation representation ρ of G corresponding to this action. This is ρ: G → SX, g ↦ ĝ, where ĝ is the mapping x ↦ xg on X. Since

xĝ = xg = x(gφ)

for all x ∈ X, g ∈ G, we have ĝ = gφ for all g ∈ G. Hence ρ = φ by the definition of equality of mappings.

We now show that a group action defines an equivalence relation on the
underlying set X. The number of elements in an equivalence class can be
expressed in group-theoretical terms. This gives some arithmetical infor-
mation about groups.

25.5 Lemma: Let G act on X. For any x,y ∈ X, we put x ~ y if and only if
there is an element g ∈ G such that xg = y. Then ~ is an equivalence
relation on X.

Proof: (cf. Lemma 15.7.) (i) Since 1 ∈ G and x1 = x for all x ∈ X, we have
x ~ x for all x ∈ X. Thus ~ is reflexive.

(ii) If x,y ∈ X and x ~ y, then there is a g ∈ G such that xg = y, so yg⁻¹ =
(xg)g⁻¹ = x(gg⁻¹) = x1 = x. From g⁻¹ ∈ G and yg⁻¹ = x, we conclude y ~ x.
Thus ~ is symmetric.

(iii) Suppose x,y,z ∈ X and x ~ y, y ~ z. Then there are g,h ∈ G such that
xg = y and yh = z. Then x(gh) = (xg)h = yh = z. From gh ∈ G and x(gh) = z,
we conclude x ~ z. Thus ~ is transitive.

So ~ is an equivalence relation on X.

25.6 Definition: Let G act on X. The equivalence classes of the equi-
valence relation ~ in Lemma 25.5 are called orbits. The equivalence class
{xg ∈ X: g ∈ G} of x ∈ X is called the orbit of x.

25.7 Lemma: Let G act on X. For x ∈ X, we write

Stab_G(x) = {g ∈ G: xg = x}.

Then Stab_G(x) is a subgroup of G (called the stabilizer of x in G).

Proof: The proof is a routine application of our subgroup criterion.

(i) Let g,h ∈ Stab_G(x). Then xg = x and xh = x. So x(gh) = (xg)h = xh = x,
so gh ∈ Stab_G(x). Hence Stab_G(x) is closed under multiplication.

(ii) Let g ∈ Stab_G(x). Then xg = x. So xg⁻¹ = (xg)g⁻¹ = x(gg⁻¹) = x1 = x, so
g⁻¹ ∈ Stab_G(x). Hence Stab_G(x) is closed under the forming of inverses.

Thus Stab_G(x) ≤ G.

Stabilizers of elements in the same orbit are closely related.

25.8 Lemma: Let G act on X. Let x ∈ X and g ∈ G. Then

Stab_G(xg) = g⁻¹Stab_G(x)g.

Proof: As h ∈ Stab_G(xg) ⟺ (xg)h = xg
⟺ x(gh) = xg
⟺ (x(gh))g⁻¹ = x
⟺ x(ghg⁻¹) = x
⟺ ghg⁻¹ ∈ Stab_G(x)
⟺ h ∈ g⁻¹Stab_G(x)g,

we obtain Stab_G(xg) = g⁻¹Stab_G(x)g.

The kernel of the permutation representation can be expressed in terms
of the stabilizers.

25.9 Lemma: Assume G acts on X and let φ: G → S_X be the permutation
representation. Then Ker φ = ⋂_{x∈X} Stab_G(x).

Proof: For g ∈ G, we have φ: g ↦ σ_g ∈ S_X, where σ_g: x ↦ xg. Hence

Ker φ = {g ∈ G: σ_g = ι_X ∈ S_X}
= {g ∈ G: xσ_g = x for all x ∈ X}
= {g ∈ G: xg = x for all x ∈ X}
= ⋂_{x∈X} {g ∈ G: xg = x}
= ⋂_{x∈X} Stab_G(x).

The following elementary counting principle has many applications.

25.10 Lemma: Let G act on X. For any x ∈ X, we have

|orbit of x| = [G : Stab_G(x)].

Proof: The orbit of x is the set {xg ∈ X: g ∈ G}. The index [G : Stab_G(x)] is
the number of right cosets of Stab_G(x) in G, more precisely, the cardinal
number of C = {Stab_G(x)g: g ∈ G}. We must find a one-to-one corre-
spondence between the orbit {xg ∈ X: g ∈ G} of x and the set C =
{Stab_G(x)g: g ∈ G} of the right cosets of Stab_G(x) in G. The description of
these sets leads us to consider the mapping

μ: orbit of x → C,
xg ↦ Sg

where we put S = Stab_G(x) for brevity. Let us see if μ is one-to-one and
onto.

Before that, however, we must check that μ is well defined, for one and
the same element in the orbit of x can have representations xg, xh with
g ≠ h. We must prove that xg = xh implies Sg = Sh. If xg = xh, then x(gh⁻¹)
= (xg)h⁻¹ = (xh)h⁻¹ = x(hh⁻¹) = x1 = x, so gh⁻¹ ∈ S and therefore Sg = Sh by
Lemma 10.2(5). Thus μ is well defined.

That μ is one-to-one follows by reversing the argument above. If (xg)μ =
(xh)μ, then Sg = Sh, then gh⁻¹ ∈ S, then x(gh⁻¹) = x, then (x(gh⁻¹))h = xh, so
xg = xh. Therefore μ is one-to-one.

μ is certainly onto, since any Sg is the image of xg in the orbit of x.

Thus μ is a one-to-one mapping from the orbit of x onto C. This gives

|orbit of x| = [G : Stab_G(x)].
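The orbit-stabilizer principle of Lemma 25.10 can be checked by brute force on a small example of our own (not from the text): the dihedral group D_4, realized as permutations of the four corners {0,1,2,3} of a square, acts on those corners; the generator names r and s below are our labels.

```python
# Our own check of Lemma 25.10: |orbit of x| * |Stab_G(x)| = |G| for the
# dihedral group D_4 acting on the corners {0,1,2,3} of a square.

r = (1, 2, 3, 0)          # rotation by 90 degrees
s = (0, 3, 2, 1)          # a reflection fixing corners 0 and 2

def compose(p, q):        # apply p first, then q (right-action order)
    return tuple(q[p[x]] for x in range(4))

# Generate the whole group as the closure of {r, s} under composition.
G = {(0, 1, 2, 3)}
frontier = [r, s]
while frontier:
    g = frontier.pop()
    if g not in G:
        G.add(g)
        frontier.extend(compose(g, h) for h in (r, s))

x = 0
orbit = {g[x] for g in G}                 # {xg : g in G}
stab = {g for g in G if g[x] == x}        # Stab_G(x)
```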

25.11 Definition: Let G act on X. We say G acts transitively on X, or the
action of G on X is said to be a transitive action, if, for any x,y ∈ X, there
is a g ∈ G such that xg = y. If G does not act transitively on X, then G is
said to act intransitively on X.

Thus G acts transitively on X if and only if there is one and only one
orbit. The whole set X is then the single orbit of the action.

25.12 Examples: (a) A group G acts on itself by right multiplication: to
the pair x,g ∈ G, there corresponds the product xg ∈ G. The conditions
(xg1)g2 = x(g1g2) and x1 = x (for all x,g1,g2 ∈ G) are immediate from the
associativity of multiplication and from the definition of the identity
element. This action is transitive, because, given any x,y ∈ G, there is an
element g in G, namely g = x⁻¹y, such that xg = y. Hence, for any x ∈ G,
we have |G| = |orbit of x| = [G : Stab_G(x)], thus Stab_G(x) = 1, as can be seen
also from Stab_G(x) = {g ∈ G: xg = x} = {g ∈ G: g = 1} = {1} = 1. This action is
called the regular action of G on G. The kernel of the permutation repre-
sentation φ: G → S_G is Ker φ = ⋂_{x∈G} Stab_G(x) = 1 by Lemma 25.9. Thus φ is
one-to-one and Theorem 20.16 gives G ≅ G/1 = G/Ker φ ≅ Im φ ≤ S_G.
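The regular action is easy to visualize computationally. Below is a minimal sketch (our own illustration, not part of the text) for G = Z_6: the permutation σ_g attached to g is the g-th row of the addition table, the map g ↦ σ_g is one-to-one, and the action is transitive, as in Cayley's theorem (25.13(1)).

```python
# Our own sketch of the regular action of Example 25.12(a) for G = Z_6.

n = 6

def sigma(g):                      # the permutation of G induced by g
    return tuple((x + g) % n for x in range(n))

perms = [sigma(g) for g in range(n)]

# The permutation representation is one-to-one, so Z_6 embeds into S_G.
injective = len(set(perms)) == n

# The regular action is transitive: some g carries any x to any y.
transitive = all(any(sigma(g)[x] == y for g in range(n))
                 for x in range(n) for y in range(n))
```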

(b) The preceding example can be generalized. Let H ≤ G and let Ω =
{Ha: a ∈ G} be the set of all right cosets of H in G. Then G acts on Ω by
right multiplication, where, to the pair Ha, g, there corresponds the coset
Hag, because

((Ha)g1)g2 = (Hag1)g2 = H((ag1)g2) = H(a(g1g2)) = (Ha)(g1g2)

and

(Ha)1 = Ha1 = Ha

for all Ha ∈ Ω, g1,g2 ∈ G.

This action is transitive, because, given any Ha,Hb ∈ Ω, there is an ele-
ment g in G, namely g = a⁻¹b, such that (Ha)g = Hb.

We have Stab_G(H) = {g ∈ G: Hg = H} = {g ∈ G: g ∈ H} = H
and Stab_G(Ha) = a⁻¹Stab_G(H)a = a⁻¹Ha

by Lemma 25.8.

The kernel of the permutation representation φ: G → S_Ω is, by Lemma
25.9,

Ker φ = ⋂_{Ha∈Ω} Stab_G(Ha) = ⋂_{a∈G} Stab_G(Ha) = ⋂_{a∈G} a⁻¹Ha.

The intersection ⋂_{a∈G} a⁻¹Ha is called the core of H in G, and is designated
by H_G. Theorem 20.16 now gives G/H_G = G/Ker φ ≅ Im φ ≤ S_Ω.

25.13 Theorem: Let G be a group.

(1) (Cayley's theorem) G is isomorphic to a subgroup of S_G.
(2) Let H ≤ G be of index [G:H] = n. Then G/H_G is isomorphic to a sub-
group of S_n.

Proof: (1) This follows from Example 25.12(a).

(2) From Example 25.12(b), it follows that G/H_G is isomorphic to a sub-
group of S_Ω, where Ω is a set with n elements. Let τ: Ω → {1,2, . . . ,n} be a
one-to-one mapping from Ω onto {1,2, . . . ,n}. Then, for each f ∈ S_Ω, the
mapping τ⁻¹fτ is a one-to-one mapping from {1,2, . . . ,n} onto {1,2, . . . ,n},
so τ⁻¹fτ ∈ S_n. Now the function

M: S_Ω → S_n,
f ↦ τ⁻¹fτ

is easily verified to be a homomorphism: (fg)M = τ⁻¹(fg)τ = (τ⁻¹fτ)(τ⁻¹gτ)
= (fM)(gM) for all f,g ∈ S_Ω; and M is one-to-one and onto, because the
mapping

N: S_n → S_Ω,
h ↦ τhτ⁻¹

is such that MN = identity mapping on S_Ω and NM = identity mapping on
S_n (Theorem 3.17(2)). Hence M is an isomorphism and S_Ω ≅ S_n. Together
with the fact that G/H_G is isomorphic to a subgroup of S_Ω, this gives that
G/H_G is isomorphic to a subgroup of S_n.

25.14 Example: Another important group action is conjugation. For
any x,g ∈ G, we call g⁻¹xg the conjugate of x by g. In order to avoid any
confusion with right multiplication, we shall write x^g for g⁻¹xg. This
notation is standard. Since

(x^g1)^g2 = (g1⁻¹xg1)^g2 = g2⁻¹(g1⁻¹xg1)g2 = g2⁻¹g1⁻¹xg1g2 = (g1g2)⁻¹x(g1g2) = x^(g1g2)

and

x^1 = 1⁻¹x1 = x

for all x,g1,g2 ∈ G, conjugation is indeed an action of G on G.

The orbit {x^g: g ∈ G} = {g⁻¹xg: g ∈ G} of x ∈ G is called the conjugacy class
of x. We have

Stab_G(x) = {g ∈ G: x^g = x} = {g ∈ G: g⁻¹xg = x} = {g ∈ G: xg = gx};

so Stab_G(x) consists of all those elements in G which commute with x.
It is called the centralizer of x in G in this case and is denoted by C_G(x).

The permutation representation is φ: G → S_G, g ↦ σ_g, where σ_g: G → G,
x ↦ x^g. Hence σ_g is the inner automorphism of G induced by g. We get

Ker φ = ⋂_{x∈G} C_G(x) = ⋂_{x∈G} {g ∈ G: xg = gx} = {g ∈ G: xg = gx for all x ∈ G} = Z(G)

as we know also from the proof of Theorem 23.10. In this case, Lemma
25.10 assumes the following form.

25.15 Lemma: Let G be a group and x ∈ G. Then

|conjugacy class of x| = [G : C_G(x)].

25.16 Lemma (Class equation): Let G be a finite group. Assume G
has k distinct conjugacy classes and let x1,x2, . . . ,xk be representatives of
these classes. Then

|G| = Σ_{i=1}^{k} [G : C_G(xi)].

Proof: Conjugacy is an equivalence relation on G and gives rise to a par-
tition of G (Theorem 2.5):

G = ⋃_{i=1}^{k} (conjugacy class of xi),

the union being disjoint. Counting the number of elements on both sides,
and using Lemma 25.15, we obtain

|G| = Σ_{i=1}^{k} |conjugacy class of xi| = Σ_{i=1}^{k} [G : C_G(xi)].
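The class equation can be verified directly on a small group. The following sketch is our own check, not part of the text: it realizes S_3 as permutation tuples of {0,1,2}, picks one representative per conjugacy class, and confirms that the indices [G : C_G(x_i)] sum to |G|.

```python
# Our own check of the class equation (Lemma 25.16) for G = S_3.
from itertools import permutations

G = list(permutations(range(3)))          # all 6 elements of S_3

def compose(p, q):                        # apply p first, then q
    return tuple(q[p[x]] for x in range(3))

def inverse(p):
    inv = [0] * 3
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def conj_class(x):                        # {g^-1 x g : g in G}
    return {compose(compose(inverse(g), x), g) for g in G}

def centralizer(x):                       # C_G(x) = elements commuting with x
    return [g for g in G if compose(x, g) == compose(g, x)]

# Pick one representative per conjugacy class.
seen, reps = set(), []
for x in G:
    if x not in seen:
        reps.append(x)
        seen |= conj_class(x)

# Class equation: |G| = sum over representatives of [G : C_G(x_i)].
class_eq = sum(len(G) // len(centralizer(x)) for x in reps)
```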

We give an important application of the class equation.

25.17 Theorem: Let G be a group of order p^n, where p is prime and n
is a natural number. Then Z(G) ≠ 1.

Proof: Let k be the number of conjugacy classes in G, and let x1,x2, . . . ,xk
be representatives of these classes. Then, in the class equation

|G| = Σ_{i=1}^{k} [G : C_G(xi)],

each summand on the right-hand side is a divisor of p^n by Lagrange's
theorem. So [G : C_G(xi)] = p^(m_i) with suitable nonnegative integers m_i (for
each i = 1,2, . . . ,k). Thus the class equation is

p^n = p^(m_1) + p^(m_2) + . . . + p^(m_k).

Here p^(m_i) = 1 if and only if [G : C_G(xi)] = 1, so if and only if C_G(xi) = G, and
so if and only if xi ∈ Z(G). Thus exactly |Z(G)| summands on the right-hand
side are equal to 1, and the class equation gives

p^n = |Z(G)| + (a sum of powers of p greater than p^0 = 1)

(the second term is absent in case |G| = |Z(G)|; in this case Z(G) = G ≠ 1).
The last equation tells us that |Z(G)| is divisible by p, so |Z(G)| ≠ 1, hence
Z(G) ≠ 1.
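As a quick numerical illustration of Theorem 25.17 (our own example, not from the text): the dihedral group D_4 has order 2³, and its center, computed below from a permutation realization on the four corners of a square, is nontrivial of order 2.

```python
# Our own check that a group of prime-power order has nontrivial center:
# D_4, |D_4| = 2^3, realized as permutations of the corners {0,1,2,3}.

r = (1, 2, 3, 0)          # rotation by 90 degrees
s = (0, 3, 2, 1)          # a reflection

def compose(p, q):        # apply p first, then q
    return tuple(q[p[x]] for x in range(4))

# Generate D_4 as the closure of {r, s} under composition.
G = {(0, 1, 2, 3)}
frontier = [r, s]
while frontier:
    g = frontier.pop()
    if g not in G:
        G.add(g)
        frontier.extend(compose(g, h) for h in (r, s))

# The center: elements commuting with every element of G.
center = {z for z in G if all(compose(z, g) == compose(g, z) for g in G)}
```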

25.18 Lemma: Let p be a prime number. If G is a group of order p²,
then G is abelian.

Proof: We must show Z(G) = G or, equivalently, |Z(G)| = p². We know
|Z(G)| = 1 or p or p² by Lagrange's theorem, and Z(G) ≠ 1 by Theorem
25.17. We suppose, by way of contradiction, that |Z(G)| = p. Since Z(G) ⊴
G (Theorem 23.3), we can build the factor group G/Z(G), which has order
p²/p = p and which is therefore cyclic by Theorem 11.13. Then G must
be abelian by Lemma 23.5, and |Z(G)| = p², contrary to the assumption
|Z(G)| = p. Thus |Z(G)| = p is impossible and there remains only the
possibility |Z(G)| = p². Hence G is abelian.

We wish to present the basic idea in the proof of Theorem 25.17 in its
purest form. We need a definition.

25.19 Definition: Let G act on X. If x ∈ X, g ∈ G and xg = x, we say that
g fixes x. The set

{x ∈ X: xg = x for all g ∈ G} = {x ∈ X: Stab_G(x) = G}

of all elements in X which are fixed by each element of G is called the
fixed point subset of X and denoted by Fix_X(G).

Thus Fix_X(G) consists of all those elements in X which form an orbit with
only one element in it. When we count the number of elements in X as
the sum of the number of elements in each orbit, each element in Fix_X(G)
contributes 1 to this sum. Notice that, under the action of a group G on
itself by conjugation, Fix_G(G) is nothing else than Z(G).

25.20 Lemma: Let G act on X. If G has order p^n, where p is a prime
number and n ∈ ℕ, and X is a finite set, then

|X| ≡ |Fix_X(G)| (mod p).

Proof: We consider the equivalence relation ~ of Lemma 25.5 on X. Un-
der this equivalence relation, X is partitioned into finitely many disjoint
orbits, say

X = ⋃_{i=1}^{k} (orbit of xi).

Counting the number of elements on both sides, we get

|X| = Σ_{i=1}^{k} |orbit of xi|.

Hence, by Lemma 25.10, |X| = Σ_{i=1}^{k} [G : Stab_G(xi)].

Now each of the indices [G : Stab_G(xi)] is a divisor of |G| = p^n, hence is
equal to some power p^(m_i) of p with a nonnegative integer m_i. Here
p^(m_i) = p^0 = 1 if and only if G = Stab_G(xi), that is to say, if and only if
xi ∈ Fix_X(G). Thus there are exactly |Fix_X(G)| summands equal to 1, and
the sum above becomes

|X| = (1 + 1 + . . . + 1) + (sum of the p^(m_i) with m_i ≠ 0),

with |Fix_X(G)| ones in the first parenthesis (the second term is missing in
case there is no p^(m_i) with m_i ≠ 0). So

|X| = |Fix_X(G)| + (a number divisible by p)

and therefore |X| ≡ |Fix_X(G)| (mod p), as was to be proved.
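Lemma 25.20 has a classical concrete instance, sketched below as our own illustration (not from the text): the cyclic group Z_5, of prime order 5, acts on the 2⁵ = 32 binary strings of length 5 by cyclic rotation; only the two constant strings are fixed by every rotation, and indeed 32 ≡ 2 (mod 5).

```python
# Our own illustration of Lemma 25.20: Z_5 acts on binary strings of
# length 5 by cyclic rotation, and |X| ≡ |Fix_X(G)| (mod 5).
from itertools import product

X = list(product((0, 1), repeat=5))       # |X| = 2^5 = 32

def rotate(x, g):                         # the action of g in Z_5
    return x[g:] + x[:g]

# Strings fixed by every rotation: exactly the two constant strings.
fixed = [x for x in X if all(rotate(x, g) == x for g in range(5))]
```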

We end this paragraph with a generalization of conjugation.

25.21 Example: Let G be a group and let 𝒫 be the set of all nonempty
subsets of G. For any U ∈ 𝒫 and g ∈ G, we put

U^g = {u^g ∈ G: u ∈ U} = {g⁻¹ug ∈ G: u ∈ U} = g⁻¹Ug.

U^g consists therefore of conjugates by g of the elements of U and is
called the conjugate of U by g. With this definition, G acts on 𝒫, because

(U^g1)^g2 = {u^g1 ∈ G: u ∈ U}^g2 = {(u^g1)^g2 ∈ G: u ∈ U} = {u^(g1g2) ∈ G: u ∈ U} = U^(g1g2)

and

U^1 = 1⁻¹U1 = U

for all U ∈ 𝒫, g1,g2 ∈ G.

The orbit {U^g: g ∈ G} = {g⁻¹Ug: g ∈ G} of U is called the conjugacy class
of U. We have

Stab_G(U) = {g ∈ G: U^g = U} = {g ∈ G: g⁻¹Ug = U} = {g ∈ G: Ug = gU};

so Stab_G(U) consists of all those elements in G which fix U as a set. It
is called the normalizer of U in G in this case and is denoted by N_G(U).
The set

{g ∈ G: u^g = u for all u ∈ U}
= {g ∈ G: g⁻¹ug = u for all u ∈ U} = {g ∈ G: ug = gu for all u ∈ U}

of all those elements in G which fix each element of U under conjugation,
or, what is the same, which commute with every element of U, is called
the centralizer of U in G and is denoted by C_G(U). So C_G(U) is the inter-
section of the centralizers of the elements of U:

C_G(U) = ⋂_{u∈U} C_G(u).

In particular, C_G(U) is a subgroup of G. We have C_G(U) ≤ N_G(U) ≤ G.

By Lemma 25.10, applied to the orbit {U^g: g ∈ G} of U, we have

|conjugacy class of U| = [G : N_G(U)].

In general, U neither contains nor is contained in C_G(U) or N_G(U). How-
ever, if U happens to be a subgroup of G, we have g⁻¹ug ∈ U for all u,g in
U, so U^g = {u^g ∈ G: u ∈ U} = {g⁻¹ug ∈ G: u ∈ U} ⊆ U for any g in U, and we
get U = (U^(g⁻¹))^g ⊆ U^g ⊆ U, thus U^g = U and U ≤ N_G(U).

We collect the last two remarks in a theorem.

25.22 Theorem: Let G be a group and let H be a subgroup of G. Then
H ⊴ N_G(H) ≤ G and |conjugacy class of H| = [G : N_G(H)].

Exercises

1. Prove that SL(2,ℤ) acts on X := {(a b/2; b/2 c): a,b,c ∈ ℤ} (the 2×2
symmetric matrices written row by row) when we associate the matrix
gᵗxg with the pair (x,g) ∈ X × SL(2,ℤ).

2. Let G act on X and let H be a subgroup of G. Show that

Stab_H(x) = Stab_G(x) ∩ H

for any x ∈ X.

3. Let G act on X and H act on Y. Prove that the direct product G × H acts
on the cartesian product X × Y.

4. Give examples of groups G and subsets U of G such that U ⊈ N_G(U),
N_G(U) ⊈ U, U ⊈ C_G(U), C_G(U) ⊈ U.

5. Prove that C_G(U) ⊴ N_G(U) for any nonempty subset U of a group G.

6. Let H ≤ G. Show that N_G(H) acts on H by conjugation. Considering the
permutation representation of this action, prove that N_G(H)/C_G(H) is iso-
morphic to a subgroup of Aut(H).

7. Assume G acts on X, and let K be the kernel of the permutation repre-
sentation of this action. Suppose H ⊴ G and H ⊆ K. Show that G/H acts
on X when we put x(Hg) = xg for all x ∈ X, Hg ∈ G/H. What is the kernel
of the permutation representation?

§26
Sylow's Theorem

Let G be a finite group. Lagrange's theorem asserts that, if G has a sub-
group of order k, then k is a divisor of |G|. The converse of Lagrange's
theorem, like the converses of many theorems, is wrong. If k is a divisor
of |G|, then G need not have a subgroup of order k. For instance, A4 has
order 12, 12 is divisible by 6, yet A4 has no subgroup of order 6.

The converse of Lagrange's theorem becomes true if we impose the ad-
ditional condition that k be a prime power such that k and |G|/k are rel-
atively prime. In other words, if |G| = p^a m, where p is a prime number
and p ∤ m, then G does have a subgroup H of order p^a. Then any conjugate
H^g of H, too, is a subgroup of order p^a, and the question arises as to
whether G has subgroups of order p^a other than the conjugates of H. The
answer turns out to be negative. The conjugates of H are the only
subgroups of order p^a.

This theorem was proved by the Norwegian mathematician L. Sylow in
1872. It is a very important tool in the theory of finite groups. We pre-
sent here a very elegant proof due to H. Wielandt (1959).

26.1 Theorem (Sylow's Theorem): Let G be a finite group of order
|G| = p^a m, where p is a prime number and p ∤ m (that is, let p^a be the
highest power of p dividing |G|). Then the following assertions hold.

(1) G has a subgroup H of order p^a.

(2) If J is any subgroup of G whose order |J| is a power of p, then there is
an x ∈ G such that J ⊆ H^x.
(3) If n_p denotes the number of subgroups of order p^a, then n_p | m and
n_p ≡ 1 (mod p).

Some remarks will now be in order. If p^a | |G| and p^(a+1) ∤ |G|, then a sub-
group of G of order p^a is called a Sylow p-subgroup of G. Part (1) of Sy-
low's theorem states that every finite group has a Sylow p-subgroup, for
all prime numbers p.

If H is a Sylow p-subgroup of G, so is H^g for any g ∈ G. Part (2) of Sylow's
theorem states that any subgroup of p-power order of G is a subgroup
of a suitable conjugate of H. In particular, any Sylow p-subgroup of G is
contained in a suitable H^x for some x ∈ G, and, since the orders of that
Sylow p-subgroup and of H coincide, that Sylow p-subgroup must be H^x
itself. So any Sylow p-subgroup of G is a conjugate of H.

If a Sylow p-subgroup H of G is normal in G, then all conjugates of H are
equal to H, hence H is the unique Sylow p-subgroup of G. Then, for any
automorphism α of G, Hα is a subgroup of order p^a, and therefore is
equal to H. So H is in fact a characteristic subgroup of G in this case.

Part (3) of Sylow's theorem gives us arithmetical information about the
possible number of Sylow p-subgroups. Two applications of this are
given in Lemma 26.5 and in Lemma 26.6.

Proof of Sylow's theorem: The basic idea of the proof is as follows. If
there is a Sylow p-subgroup H of G, then H is first of all a subset of G
having exactly p^a elements and is furthermore such that Hh = H for all
h ∈ H. So H = {h ∈ G: Uh = U} for some subset U of G with |U| = p^a. In or-
der to find a subgroup of order p^a, we look at the sets {h ∈ G: Uh = U},
for each U ⊆ G with |U| = p^a. Such sets are the stabilizers of the U's under
the group action described below. A judicious choice of U will produce a
subgroup of order p^a.

Step 1. Let Ω = {U ⊆ G: |U| = p^a}. Then the number of elements of Ω
(= subsets of G in Ω) is not divisible by p:

There are clearly C(p^a m, p^a) subsets of G in Ω, where C(p^a m, p^a)
denotes the binomial coefficient. We are to prove p ∤ C(p^a m, p^a).

We have

C(p^a m, p^a) = (p^a m)!/((p^a)!(p^a m − p^a)!)
= (p^a m)/p^a · (p^a m − 1)/(p^a − 1) · (p^a m − 2)/(p^a − 2) · . . . · (p^a m − (p^a − 1))/1.

Now consider each one of the factors (p^a m − s)/(p^a − s) (s = 1,2, . . . ,p^a − 1).
We write s = p^b t, with t ∈ ℕ and p ∤ t, and observe that neither the
numerator nor the denominator of these numbers

(p^a m − s)/(p^a − s) = (p^a m − p^b t)/(p^a − p^b t) = (p^(a−b) m − t)/(p^(a−b) − t)

contains p after cancellations are made. Hence their product C(p^a m, p^a) is
not divisible by p.

As an example, note that all 3's are cancelled in

C(3²·2, 3²) = C(18, 9) = (18/9)(17/8)(16/7)(15/6)(14/5)(13/4)(12/3)(11/2)(10/1)
= (2/1)(17/8)(16/7)(5/2)(14/5)(13/4)(4/1)(11/2)(10/1).
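Step 1 is easy to confirm numerically. The sketch below is our own check, not part of the text: it computes C(p^a m, p^a) for a few triples (p, a, m) with p ∤ m and verifies that p never divides the result, including the worked example C(18, 9) = 48620, which is not divisible by 3.

```python
# Our own numerical check of Step 1: p does not divide C(p^a * m, p^a)
# whenever p does not divide m.
from math import comb

cases = [(3, 2, 2), (2, 3, 5), (5, 1, 7), (7, 1, 3)]   # (p, a, m), p ∤ m
remainders = {(p, a, m): comb(p**a * m, p**a) % p for p, a, m in cases}

not_divisible = all(rem != 0 for rem in remainders.values())

# The worked example of the text: C(18, 9) is not divisible by 3.
example = comb(18, 9)
```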

Step 2. G acts on Ω when we put Ug = {ug: u ∈ U} for U ∈ Ω, g ∈ G:

The mapping u ↦ ug from U onto Ug is one-to-one (Lemma 8.1(2)) and
onto (by definition of Ug). Hence |Ug| = |U| = p^a and Ug is an element of Ω.
Now (Ug1)g2 = U(g1g2) for all U ∈ Ω, g1,g2 ∈ G by Lemma 19.2, and also
U1 = {u1: u ∈ U} = {u: u ∈ U} = U for all U ∈ Ω. Thus G acts on Ω.

Step 3. There is an orbit of Ω under the action of Step 2 such that the
number of elements (of Ω; equivalently, the number of subsets of G) in it
is not divisible by p:

The orbit of any U ∈ Ω is {Ug: g ∈ G}. Now Ω is partitioned into dis-
joint orbits. If Ω1, Ω2, . . . , Ωk are the orbits, then

Ω = Ω1 ∪ Ω2 ∪ . . . ∪ Ωk.

Counting the number of elements and keeping in mind that the orbits
are pairwise disjoint, we get

|Ω| = |Ω1| + |Ω2| + . . . + |Ωk|.

If |Ω1|, |Ω2|, . . . , |Ωk| were all divisible by p, their sum |Ω| would be di-
visible by p, too, contrary to Step 1. Thus at least one of the numbers
|Ω1|, |Ω2|, . . . , |Ωk| is not divisible by p, as contended.

Let U0 ∈ Ω be such that the number of elements (of Ω) in its orbit is not
divisible by p. This is the judicious choice we have alluded to. We put
H = Stab_G(U0).

Step 4. H ≤ G and |H| = p^a:

H is a subgroup of G by Lemma 25.7. As to the second assertion, first we
note that the number of elements in the orbit of U0 is equal to [G:H]
(Lemma 25.10) and, by the choice of U0, this index [G:H] is not divisible
by p. So p ∤ |G|/|H|, so p ∤ p^a m/|H|. Writing |H| = p^b n, where n ∈ ℕ, p ∤ n
and, by Lagrange's theorem, b ≤ a and n | m, we get p ∤ p^(a−b) m/n. This is
possible only in case p^(a−b) = p^0. Hence a = b and |H| = p^a n ≥ p^a. On the
other hand, if U0 = {u1,u2, . . . ,u_(p^a)}, then, for any h ∈ H = Stab_G(U0), we
have

u1h ∈ U0h = U0 = {u1,u2, . . . ,u_(p^a)},
so h ∈ {u1⁻¹u1, u1⁻¹u2, . . . , u1⁻¹u_(p^a)},
so H ⊆ {u1⁻¹u1, u1⁻¹u2, . . . , u1⁻¹u_(p^a)},
so |H| ≤ p^a.

From |H| ≥ p^a and |H| ≤ p^a, we get |H| = p^a.

By Step 4, H is a Sylow p-subgroup of G. This completes the proof of part
(1). We proceed to the proof of part (2). Let J ≤ G be such that |J| = p^b,
where b ≥ 0.

Step 5. There is an x ∈ G such that J ⊆ H^x:

Let Δ = {Ha: a ∈ G} be the set of all right cosets of H in G. Then G acts on
Δ by right multiplication (Example 25.12(b)) and its subgroup J also
acts on Δ. Since the order of J is a power of p, we can apply Lemma
25.20 and conclude

|Δ| ≡ |Fix_Δ(J)| (mod p),

hence |Fix_Δ(J)| ≡ |Δ| = [G:H] = m ≢ 0 (mod p),
so |Fix_Δ(J)| ≠ 0,
so Fix_Δ(J) ≠ ∅.

So there is a right coset Hx in Fix_Δ(J). Thus Stab_J(Hx) = J. But Stab_J(Hx) =
J ∩ Stab_G(Hx) = J ∩ H^x by Example 25.12(b). So we obtain J ∩ H^x = J,
which means J ⊆ H^x.

This completes the proof of part (2). In view of the remarks preceding
the proof, all Sylow p-subgroups of G are conjugate; and a normal Sylow
p-subgroup of a finite group is the unique Sylow p-subgroup of that
group.

Let N := N_G(H) = {g ∈ G: H^g = H} be the normalizer of H in G. Then H ⊴ N,
N ≤ G and, since p ∤ [N:H], H is a Sylow p-subgroup of N. Thus H is the
unique Sylow p-subgroup of N.

We now prove part (3). Let n_p be the number of Sylow p-subgroups of G.

Step 6. n_p = [G:N]:

Let Σ = {H^x ⊆ G: x ∈ G}. Then Σ is the set of all Sylow p-subgroups of G.
We want to evaluate n_p = |Σ|. Here G acts on Σ by conjugation, because
(H^x)^g = H^(xg); (H^g1)^g2 = H^(g1g2); and H^1 = 1⁻¹H1 = H for all
H^x ∈ Σ, g1,g2 ∈ G. Lemma 25.10 now gives

|orbit of H| = [G : Stab_G(H)].

But the orbit of H is {H^x ⊆ G: x ∈ G} = Σ and Stab_G(H) = N_G(H) = N. Thus

n_p = |Σ| = [G:N],

as was to be proved.

Step 7. n_p | m and n_p ≡ 1 (mod p):

Of course n_p = [G:N] divides [G:N][N:H] = [G:H] = m.

Now we want to prove n_p ≡ 1 (mod p). This will be done by applying
Lemma 25.20. In order to apply Lemma 25.20, we need the action of a
group of p-power order on a finite set. Our group of p-power order will
be H, as this is the only group of p-power order available to us. H acts on
Δ1 = {Na: a ∈ G}, the set of all right cosets of N in G, by right multiplica-
tion (Example 25.12(b), Example 25.2(h)). Lemma 25.20 yields

|Δ1| ≡ |Fix_Δ1(H)| (mod p).

Since n_p = [G:N] = |Δ1|, the claim will be established when we show that
|Fix_Δ1(H)| = 1.

From the equivalences

Na ∈ Fix_Δ1(H) ⟺ Stab_H(Na) = H
⟺ (Na)h = Na for all h ∈ H
⟺ Naha⁻¹ = N for all h ∈ H
⟺ aha⁻¹ ∈ N for all h ∈ H
⟺ h^(a⁻¹) ∈ N for all h ∈ H
⟺ H^(a⁻¹) ⊆ N
⟺ H^(a⁻¹) is a Sylow p-subgroup of N
⟺ H^(a⁻¹) is the unique Sylow p-subgroup H of N
⟺ H^(a⁻¹) = H
⟺ a⁻¹ ∈ N_G(H) = N
⟺ a ∈ N
⟺ Na = N,

it follows that Fix_Δ1(H) = {N}. Thus |Fix_Δ1(H)| = 1 and n_p ≡ 1 (mod p).

This completes the proof.

This completes the proof.

26.2 Definition: Let p be a prime number. A finite group G is called a
finite p-group if |G| = p^a for some integer a ≥ 0.

26.3 Theorem: Let G be a finite p-group, with |G| = p^a ≠ 1.

(1) G has a normal subgroup of order p.
(2) There are normal subgroups Hi of G such that |Hi| = p^i (i = 0,1,2, . . . ,a)
and 1 = H0 ≤ H1 ≤ H2 ≤ . . . ≤ H_(a−1) ≤ H_a = G.

Proof: (1) From Theorem 25.17, we know Z(G) ≠ 1. Let z ∈ Z(G) with
z ≠ 1, and let o(z) = p^k (1 ≤ k ≤ a). Then o(z^(p^(k−1))) = p. Thus ⟨z^(p^(k−1))⟩
is a subgroup of order p and is normal in G (Theorem 23.3).

(2) We make induction on a. If a = 1, then |G| = p and G has normal
subgroups H0 and H1, namely H0 = 1 and H1 = G, with |H0| = 1 and |H1| = p
such that H0 ≤ H1.

Assume now that a ≥ 2 and that the claim is true for any finite p-group
of order p^(a−1). By part (1), there is H1 ⊴ G with |H1| = p. We consider the
factor group G/H1, which has order |G/H1| = |G|/|H1| = p^a/p = p^(a−1). By
induction, there are normal subgroups, say H_(i+1)/H1, of G/H1 with
|H_(i+1)/H1| = p^i (i = 0,1, . . . ,a−1) and

1 = H1/H1 ≤ H2/H1 ≤ H3/H1 ≤ . . . ≤ H_(a−1)/H1 ≤ H_a/H1 = G/H1.

By Theorem 21.2, each Hi ⊴ G (i = 1,2, . . . ,a) and

H1 ≤ H2 ≤ . . . ≤ H_(a−1) ≤ H_a = G.

Here |H_(i+1)| = |H_(i+1)/H1|·|H1| = p^i·p = p^(i+1) for i = 0,1, . . . ,a−1. Thus,
when we put H0 = 1, the claim is proved for finite p-groups of order p^a.

26.4 Theorem: Let G be a finite group and let p be a prime number.
Suppose p^b | |G|, where b ≥ 0. Then G has a subgroup of order p^b.

Proof: Let us write |G| = p^a m, with m ∈ ℕ and p ∤ m. Then G has a Sylow
p-subgroup H of order p^a, and, by Theorem 26.3(2), H has a subgroup J
of order p^b. Hence J is a subgroup of G with |J| = p^b.

Theorem 26.4 generalizes Sylow's theorem (1) to the case where p^b is
any prime-power divisor of |G| (not necessarily the highest power of p
dividing |G|). Part (2) of Sylow's theorem does not generalize: two
subgroups J1 and J2 of the same order p^b are not necessarily conjugate
in G, or even isomorphic. Part (3) of Sylow's theorem, however, is true in
the more general case: if p^b | |G|, then the number of subgroups of order
p^b in G is congruent to 1 modulo p.

We close this paragraph with two applications of Sylow's theorem.

26.5 Lemma: Let p and q be distinct prime numbers and let G be a
group of order pq. Then either a Sylow p-subgroup or a Sylow q-
subgroup of G is normal in G. In fact, if p > q, then a Sylow p-subgroup
of G is normal in G.

Proof: Suppose p > q and let n_p be the number of Sylow p-subgroups
of G. Then n_p divides |G|/p = q, so n_p = 1 or q, and n_p ≡ 1 (mod p). So
n_p = q would imply p | q − 1, which is not compatible with p > q. Thus
n_p = q is impossible and n_p = 1. Then there is a unique Sylow p-subgroup
of G, and it is normal in G.
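The arithmetic of Sylow's theorem can be verified by brute force for a small group. The following sketch is our own check, not part of the text: it counts all subgroups of orders 2 and 3 in S_3 (where |S_3| = 2·3), finding n_2 = 3 and n_3 = 1. This is consistent with n_p | m and n_p ≡ 1 (mod p), and with Lemma 26.5: since 3 > 2, the Sylow 3-subgroup is unique, hence normal.

```python
# Our own brute-force count of Sylow subgroups of S_3.
from itertools import combinations, permutations

G = list(permutations(range(3)))          # the 6 elements of S_3
e = (0, 1, 2)

def compose(p, q):                        # apply p first, then q
    return tuple(q[p[x]] for x in range(3))

def is_subgroup(H):                       # a finite set closed under products
    Hs = set(H)
    return all(compose(a, b) in Hs for a in Hs for b in Hs)

def count_subgroups(order):
    rest = [g for g in G if g != e]
    return sum(1 for H in combinations(rest, order - 1)
               if is_subgroup(H + (e,)))

n2 = count_subgroups(2)   # number of Sylow 2-subgroups (order 2)
n3 = count_subgroups(3)   # number of Sylow 3-subgroups (order 3)
```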

26.6 Lemma: Let p and q be distinct prime numbers and let G be a
group of order p²q. Then either a Sylow p-subgroup or a Sylow q-
subgroup of G is normal in G.

Proof: Let n_p, n_q be the numbers of Sylow p- and Sylow q-subgroups of G,
respectively. The claim is that either n_p = 1 or n_q = 1. Suppose, by way of
contradiction, that n_p ≠ 1 and n_q ≠ 1.

Since n_p divides |G|/p² = q, and since q is prime, we have n_p = q. From q
= n_p ≡ 1 (mod p), we get p | q − 1, so q > p. Besides, n_q divides |G|/q =
p², so n_q = p or p². Here n_q = p is impossible, because n_q ≡ 1 (mod q) and
q > p. Thus n_q = p².

Let Q1, Q2, . . . , Q_(p²) be the Sylow q-subgroups of G. An element of order q
is a nonidentity element in one of these subgroups, and any two distinct of
them have a trivial intersection: Qi ∩ Qj = 1. Hence

{g ∈ G: o(g) = q} = (Q1\{1}) ∪ (Q2\{1}) ∪ . . . ∪ (Q_(p²)\{1}),

where the union is taken over pairwise disjoint sets. Counting the
number of elements on the right-hand side, we see that there are exactly
p²(q − 1) elements of order q in G. So there are exactly |G| − p²(q − 1) = p²
elements in

G\{g ∈ G: o(g) = q} = {g ∈ G: o(g) ≠ q}.

Let P be a Sylow p-subgroup of G. Then P ⊆ {g ∈ G: o(g) ≠ q}, and, since
both of these sets have p² elements, we have P = {g ∈ G: o(g) ≠ q}.
Therefore {g ∈ G: o(g) ≠ q} is the unique Sylow p-subgroup of G and n_p is
equal to 1, a contradiction. So G has either a normal Sylow p-subgroup or
a normal Sylow q-subgroup.

Exercises

1. Find Sylow 2- and Sylow 3-subgroups of S4, A4, SL(2,ℤ3), GL(2,ℤ3).

2. Find a Sylow p-subgroup of D_2n (n ∈ ℕ).

3. Let G be a finite group with exactly one Sylow p-subgroup. Prove that
every subgroup and every factor group of G, too, has exactly one Sylow
p-subgroup.

4. Let G be a finite group and K ⊴ G. If P is a Sylow p-subgroup of G,
show that P ∩ K is a Sylow p-subgroup of K and PK/K is a Sylow p-
subgroup of G/K.

5. Let G be a finite group and H ≤ G. Show that, if P is a Sylow p-
subgroup of G, then P ∩ H is not necessarily a Sylow p-subgroup of H.

6. Let G be a finite group, H ≤ G, and let P1 be a Sylow p-subgroup of H.
Show that there is a Sylow p-subgroup P of G such that P ∩ H = P1.

7. Let P ≤ K ⊴ G, where G is a finite group and P is a Sylow p-subgroup
of K. Show that G = N_G(P)K.

8. Let G be a finite group, H ⊴ G and J ≤ G. Suppose J is a finite p-group
and |H| ≢ 1 (mod p), where p is a prime number. Prove that H ∩ C_G(J) ≠ 1.

9. Let G be a finite p-group. Show that, if 1 ≠ H ⊴ G, then H ∩ Z(G) ≠ 1.

10. Let G be a finite p-group and let H be a proper subgroup of G. Prove
that H ≠ N_G(H).

11. Let p,q,r be distinct prime numbers and let G be a group of order
pqr. Show that G has a nontrivial proper normal subgroup.

12. Let G be a finite p-group, with |G| = p^a ≠ 1, and let K ⊴ G. Prove that
there are subgroups Hi of G such that
(i) |Hi| = p^i for all i = 0,1,2, . . . ,a,
(ii) K = Hj for some j = 0,1,2, . . . ,a,
(iii) 1 = H0 ≤ H1 ≤ H2 ≤ . . . ≤ H_(a−1) ≤ H_a = G.

§27
Series

In this paragraph, we study series of groups. The celebrated Jordan-


Hölder theorem is proved and the class of solvable groups is introduced.

27.1 Definition: A nontrivial group G is called a simple group if G has
no nontrivial proper normal subgroup.

Thus a group G is simple if and only if G ≠ 1 and 1 and G are the only
normal subgroups of G. This resembles the definition of prime numbers.
Just as prime numbers are the building blocks of integers, simple groups
are the building blocks of certain groups, as will be seen below. More-
over, the fundamental theorem of arithmetic has a counterpart, namely
the Jordan-Hölder theorem. This theorem states that, for any group G
satisfying certain conditions that will be specified later, the building
blocks of G are uniquely determined. However, this analogy should not
be pushed too far. For one thing, the building blocks may be combined in
various ways to produce different groups. Stated otherwise, different
groups may have the same building blocks. In fact, the problem of de-
termining a group from its building blocks, known as the extension
problem, still awaits its solution.

It is an easy matter to find all abelian simple groups. Any subgroup of
an abelian group is normal in that group, so an abelian group is simple if
and only if it has no subgroups except 1 and itself. If G is an abelian
simple group, then G ≠ 1 by definition, and so there is an x ∈ G, x ≠ 1.
Then ⟨x⟩ is a nontrivial subgroup of G and, since G is simple, ⟨x⟩ has to
be G. Thus G = ⟨x⟩ is cyclic. Now an infinite cyclic group has subgroups of
every index (Lemma 11.11) and cannot be simple. Therefore G is a finite
cyclic group, say |G| = n ≠ 1. Then, for every positive divisor m of n, the
group G has a subgroup of order m (Lemma 11.10). But the order of any
subgroup of G is either 1 or n. Hence 1 and n are the only positive divi-
sors of n and n is prime. Thus an abelian simple group is a cyclic group
of prime order. Conversely, a cyclic group of prime order has no non-
trivial proper subgroup by Lagrange's theorem, and is therefore an
abelian simple group. We have proved the following theorem.

27.2 Theorem: An abelian group G is simple if and only if G is cyclic of
prime order.

We prove next that the alternating groups A_n, where n ≥ 5, are simple.
We need a lemma. Let us recall that a 3-cycle is a permutation of the
form (abc), with a,b,c distinct.

27.3 Lemma: If n ≥ 3, then A_n is generated by the set of all 3-cycles
in A_n.

Proof: We must prove that every element of A_n can be written as a
product of 3-cycles (Lemma 24.2). Every element of A_n can be written
as a product of an even number of transpositions and, taking the
transpositions in pairs, we see that every element of A_n can be written
as a product of permutations of the form (ab)(cd), where a ≠ b and c ≠ d.
Hence it suffices to prove that every permutation of the form (ab)(cd)
can be written as a product of 3-cycles.

There are three cases to consider, in which two or one or none of c,d is in
the set {a,b}. In the first case, {c,d} = {a,b}, hence (cd) = (ab) and (ab)(cd)
= (ab)(ab) = 1 = (abe)(abe)(abe) is a product of 3-cycles, where e is
distinct from a and b (here we use the assumption n ≥ 3). In the second
case, we may assume c = a without loss of generality. Then (ab)(cd) =
(ab)(ad) = (abd) is a product of one 3-cycle. In the third case, a,b,c,d are
all distinct and (ab)(cd) = (abc)(adc) is a product of two 3-cycles. The
proof is complete.
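Lemma 27.3 can be confirmed computationally for small n. The sketch below is our own check (with n = 4, not part of the text): it verifies that the eight 3-cycles of A_4 generate all twelve even permutations of {0,1,2,3}.

```python
# Our own check of Lemma 27.3 for n = 4: the 3-cycles generate A_4.
from itertools import permutations

def compose(p, q):                        # apply p first, then q
    return tuple(q[p[x]] for x in range(4))

def sign(p):                              # +1 for even, -1 for odd
    s = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if p[i] > p[j]:
                s = -s
    return s

A4 = {p for p in permutations(range(4)) if sign(p) == 1}
# 3-cycles: even permutations moving exactly three points.
three_cycles = {p for p in A4 if sum(p[i] != i for i in range(4)) == 3}

# Closure of the 3-cycles (plus identity) under composition.
gen = set(three_cycles) | {(0, 1, 2, 3)}
changed = True
while changed:
    changed = False
    for a in list(gen):
        for b in three_cycles:
            c = compose(a, b)
            if c not in gen:
                gen.add(c)
                changed = True
```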

27.4 Theorem: If n 5, then An is simple.

286
Proof: Let 1 N An. We will prove N = 1.

First we prove that there can be no 3-cycle in N.. Assume, by way of


contradiction, that there is a 3-cycle (abc) in N.. Let (a´b´c´) be any 3-
cycle and choose two distinct numbers e,f from {1,2, . . . ,n}\{a´,b´,c´}.. This
is possible because n 5. Let be a permutation in S n such that a = a´,
b = b´ and c = c´ and put = (ef).. Then a = a´, b = b´ and c = c´ as
well and 1(abc) = (a´b´c´) = 1(abc) .. Since the signs of and = (ef)
are different, either or is in An. Then, since (abc) N and N A n,
either 1(abc) or 1(abc) is in N.. So (a´b´c´) N and N contains all 3-
cycles. From Lemma 27.3, we conclude An N, contrary to N A n.
Therefore there can be no 3-cycle in N..

Secondly, there can be no permutation in N involving a cycle of length
greater than or equal to 4 when written out as a product of disjoint
cycles. Indeed, if α = (abcd. . . )β ∈ N, where (abcd. . . ) and β are disjoint
permutations, then
α⁻¹·(abc)⁻¹α(abc) = β⁻¹(. . . dcba)(cba)(abcd. . . )β(abc)
= β⁻¹β(. . . dcba)(cba)(abcd. . . )(abc)
= (. . . dcba)(cba)(abcd. . . )(abc)
= (abd)
would be in N, contrary to what we proved above. So the disjoint cycles
of a nonidentity permutation in N have lengths (1, in which case we do
not write them, or) 2 or 3.

Thirdly, in the disjoint cycle decomposition of any nonidentity element
of N, there can be no 3-cycle. To prove this, we first note that, if there
were only one 3-cycle in the disjoint cycle decomposition of a noniden-
tity element of N, so that its disjoint cycle decomposition is a 3-cycle
times a product of transpositions, then the square of that element would
be a 3-cycle in N, which is impossible. Thus, if there is an α in N whose
disjoint cycle decomposition involves a 3-cycle at all, then there are at
least two 3-cycles in the disjoint cycle decomposition of α. Then we have
α = (abc)(def)β, say, where (abc), (def), β are disjoint permutations and
α·(dec)⁻¹α(dec) = (abc)(def)β(ced)(abc)(def)β(dec)
= (abc)(def)(ced)(abc)(def)(dec)β²
= (adcbf)β²
is in N, which is impossible, since there is a cycle of length 5 in its
disjoint cycle decomposition ((adcbf) and β² are disjoint permutations).

Hence, in the disjoint cycle decomposition of any nonidentity element of
N, there is no cycle of length 3. Combining this with what we proved
above, we conclude that any nonidentity element in N must be a product
of (an even number of) disjoint transpositions.

Fourthly, a product of 2k disjoint transpositions cannot belong to N if k
is greater than or equal to 2, for if α = (ab)(cd)(ef)(gh)β belonged to N,
where β = ι or β is a product of disjoint transpositions and disjoint from
(ab)(cd)(ef)(gh), then
α·(de)⁻¹(bc)⁻¹α(bc)(de) = (ab)(cd)(ef)(gh)β(ed)(cb)(ab)(cd)(ef)(gh)β(bc)(de)
= (aed)(bcf)β²
would also belong to N. This possibility was excluded above.

Since we assume N ≠ 1, there is an α ∈ N, α ≠ ι. Here α is necessarily a
product of two disjoint transpositions, say α = (ab)(cd). We choose a
number e from {1,2, . . . ,n}\{a,b,c,d}. Then the 3-cycle α·(aeb)⁻¹α(aeb) =
(ab)(cd)(bea)(ab)(cd)(aeb) = (abe) belongs to N as well, the final contra-
diction. This shows that the assumption N ≠ 1 is untenable. Thus N = 1
and An is simple.
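For n = 5 the theorem can also be confirmed by brute force, using the fact that a normal subgroup of A5 is a union of A5-conjugacy classes containing the identity class, and its order must divide |A5| = 60. A Python sketch (all names are ours; this is an independent check, not the text's argument):

```python
from itertools import combinations, permutations

def mul(p, q):                      # apply p first, then q
    return tuple(q[p[i]] for i in range(5))

def inv(p):
    r = [0] * 5
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(5) for j in range(i + 1, 5))

A5 = [p for p in permutations(range(5)) if sign(p) == 1]

# Conjugacy classes of A5 (conjugating by elements of A5 only).
seen, sizes = set(), []
for g in A5:
    if g not in seen:
        cls = {mul(mul(inv(s), g), s) for s in A5}
        seen |= cls
        sizes.append(len(cls))

print(sorted(sizes))                # [1, 12, 12, 15, 20]

# A normal subgroup is a union of classes containing the identity class,
# with order dividing 60; only the orders 1 and 60 survive.
rest = sorted(sizes)[1:]
orders = {1 + sum(sub) for r in range(len(rest) + 1)
          for sub in combinations(rest, r) if 60 % (1 + sum(sub)) == 0}
print(orders)                       # {1, 60}
```

Since no candidate order other than 1 and 60 divides 60, A5 has no nontrivial proper normal subgroup.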

27.5 Definition: Let G be a nontrivial group. A proper normal
subgroup M of G is said to be a maximal normal subgroup of G if there is
no normal subgroup L of G such that M ⊂ L ⊂ G. Equivalently, M is a
maximal normal subgroup of G if M ⊴ G, M ≠ G, and M ⊆ K ⊴ G implies
that either M = K or K = G.

27.6 Lemma: Let M ⊴ G and M ≠ G. Then G/M is a simple group if and
only if M is a maximal normal subgroup of G.

Proof: Since M ≠ G, we have G/M ≠ 1. If G/M is not simple, there is a
normal subgroup of G/M, say N/M, which is distinct from M/M and G/M,
so M/M ⊂ N/M ⊂ G/M. By Theorem 21.2, M ⊂ N ⊂ G with N ⊴ G, and M
is not a maximal normal subgroup of G. Conversely, if M is not a maximal
normal subgroup of G, there is an N ⊴ G such that M ⊂ N ⊂ G and, by
Theorem 21.2, M/M ⊂ N/M ⊂ G/M. So G/M has a nontrivial proper
normal subgroup N/M and G/M is not simple.

27.7 Definitions: Let H ≤ G. A finite sequence of subgroups of G,
including H and G, is called a series from H to G, or a series between H
and G, if each group in the sequence is a normal subgroup of the next
one. Thus a series from H to G can be written

H = H0 ⊴ H1 ⊴ . . . ⊴ Hn-1 ⊴ Hn = G.    (1)

The subgroups H0,H1, . . . ,Hn-1,Hn are called the terms of the series (1). The
factor groups H1/H0, H2/H1, . . . ,Hn/Hn-1 are called the factors of the series
(1). A series from 1 to G will be called shortly a series of G.

If each term H0,H1, . . . ,Hn-1,Hn of the series (1) happens to be normal
(characteristic) in G, the series (1) will be called a normal (characteristic)
series.

There may be repetitions in (1). If, however, Hi-1 ≠ Hi for each i = 1,2, . . .
,n, the series (1) will be called a proper series.

A series

H = J0 ⊴ J1 ⊴ . . . ⊴ Jm-1 ⊴ Jm = G    (2)

from H to G is said to be a refinement of (1) if every term of (1) is also a
term of (2). Thus a refinement of (1) is obtained from (1) by inserting
additional groups between some consecutive terms of (1). These
additional terms need not be distinct from the terms of (1). For example,
A ⊴ B ⊴ B ⊴ C is a refinement of A ⊴ B ⊴ C. If (2) is a refinement of
(1) and if there is at least one term in (2) which is not a term of (1), then
(2) is called a proper refinement of (1).

27.8 Definition: Let G be a group. A series of G is called a composition
series of G if it is a proper series of G and has no proper refinement. A
factor of a composition series of G is called a composition factor of G.

27.9 Lemma: A series
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn-1 ⊴ Gn = G
of a group G is a composition series of G if and only if all factors Gi/Gi-1
(i = 1,2, . . . ,n) are simple.

Proof: Suppose first that the given series is a composition series of G. By
definition, it is a proper series. So Gi-1 ≠ Gi and all factors Gi/Gi-1 are dis-
tinct from the trivial group (i = 1,2, . . . ,n). If one of the factors, say
Gj/Gj-1, were not simple, Gj/Gj-1 would have a nontrivial proper normal
subgroup, which may be written as H/Gj-1, where Gj-1 ⊂ H ⊂ Gj by
Theorem 21.2 (in fact Gj-1 ⊴ H ⊴ Gj), and the given series
has a proper refinement which is obtained by inserting H between Gj-1
and Gj, contrary to our hypothesis that the given series is a composition
series. Hence the Gi/Gi-1 are all simple (i = 1,2, . . . ,n).

Conversely, let us assume that all factors Gi/Gi-1 are simple (i = 1,2, . . . ,n).
Then Gi/Gi-1 is not trivial and so Gi-1 ≠ Gi for all i = 1,2, . . . ,n. Thus the
given series is proper. If it were not a composition series, it would have
a proper refinement. To fix the ideas, let us assume that such a refine-
ment has a term H between Gj-1 and Gj, so that Gj-1 ⊴ H ⊴ Gj and H is
distinct from both. By Theorem 21.2, H/Gj-1 would be a nontrivial proper
normal subgroup of Gj/Gj-1, contrary to the hypothesis that all factors,
including Gj/Gj-1, are simple. Hence the given series is a composition series.

27.10 Examples: (a) 1 ⊴ S3 is a series of S3 and 1 ⊴ A3 ⊴ S3 is a
refinement thereof. The latter is a composition series of S3, because the
factors A3/1 ≅ C3 and S3/A3 ≅ C2 are simple (Theorem 27.2, Lemma
27.9). It is easily seen that 1 ⊴ A3 ⊴ S3 is the unique composition series
of S3 (cf. §15, Ex. 10).

(b) 1 ⊴ V4 ⊴ A4 ⊴ S4 is a normal series of S4 (it is a chief series of S4;
see Ex. 4). It is not a composition series of S4, for it can be refined by
inserting one of the subgroups U1 = {ι,(12)(34)}, U2 = {ι,(13)(24)}, and U3
= {ι,(14)(23)} between 1 and V4 = {ι,(12)(34),(13)(24),(14)(23)}. Each one
of the three series 1 ⊴ Ui ⊴ V4 ⊴ A4 ⊴ S4 is a composition series of S4
(i = 1,2,3). The reader will easily verify that these are the only
composition series of S4.
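The claims in (b) are small enough to verify by machine. In the Python sketch below (our own encoding of permutations as tuples), we check that 1 ⊴ U1 ⊴ V4 ⊴ A4 ⊴ S4 is a series with factors of prime orders 2, 2, 3, 2 — hence a composition series by Lemma 27.9 — although U1 is not normal in S4:

```python
from itertools import permutations

def mul(p, q):                      # apply p first, then q
    return tuple(q[p[i]] for i in range(4))

def inv(p):
    r = [0] * 4
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4))

S4 = set(permutations(range(4)))
A4 = {p for p in S4 if sign(p) == 1}
e = (0, 1, 2, 3)
V4 = {e, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}  # identity, (12)(34), (13)(24), (14)(23)
U1 = {e, (1, 0, 3, 2)}

def is_normal(N, G):
    return all(mul(mul(inv(g), x), g) in N for g in G for x in N)

chain = [{e}, U1, V4, A4, S4]
assert all(is_normal(N, M) for N, M in zip(chain, chain[1:]))      # each normal in the next
assert [len(M) // len(N) for N, M in zip(chain, chain[1:])] == [2, 2, 3, 2]
assert not is_normal(U1, S4)        # so the series is not a normal series of S4
print("composition series of S4 verified")
```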

(c) We want to find all composition series of Sn for n ≥ 5. For this
purpose, we determine all normal subgroups of Sn.

Let n ≥ 5 and 1 ≠ N ⊴ Sn. Then N ∩ An ⊴ An by Theorem 21.3 and,
since An is simple (Theorem 27.4), either N ∩ An = An or N ∩ An = 1.

In case N ∩ An = An, we have An ≤ N ≤ Sn. Thus |N:An| divides |Sn:An| = 2
and |N:An| = 1 or |N:An| = 2. Hence N = An or N = Sn.

In case N ∩ An = 1, we have N ⊄ An (because 1 ≠ N), so An ⊂ AnN ≤ Sn,
so AnN = Sn and |N| = |N:1| = |N:N ∩ An| = |AnN:An| = |Sn:An| = 2. Thus N =
{ι, α} for some α ∈ Sn\An. Since N ⊴ Sn, we obtain
{ι, α} = N = σ⁻¹Nσ = σ⁻¹{ι, α}σ = {ι, σ⁻¹ασ} for all σ ∈ Sn,

hence σ⁻¹ασ = α for all σ ∈ Sn.    (*)

From o(α) = |N| = 2, we see that the disjoint cycle decomposition of α
involves transpositions only (Theorem 15.17), say

α = (a1b1)(a2b2). . . (ambm)

for some odd number m. If m ≥ 3, then β := (a3b3). . . (ambm) is disjoint
from (a1b1)(a2b2) and

(a1a2)⁻¹α(a1a2) = (a1a2)(a1b1)(a2b2)β(a1a2)
= (a2b1)(a1b2)β ≠ (a1b1)(a2b2)β = α,

contrary to (*). Hence m = 1 and α = (a1b1). Let c ∈ {1,2, . . . ,n}\{a1,b1}.
Now

(a1c)⁻¹α(a1c) = (ca1)(a1b1)(a1c) = (b1c) ≠ (a1b1) = α,

again contradicting (*). Hence there is no nontrivial normal subgroup N
of Sn such that N ∩ An = 1.

Consequently, 1, An, Sn are the only normal subgroups of Sn when n ≥ 5.
Thus if 1 = G0 ⊴ G1 ⊴ . . . ⊴ Gk-1 ⊴ Gk = Sn is a composition series of Sn,
here Gk-1 has to be An and Gk-2 has to be 1. Therefore the series must be
1 ⊴ An ⊴ Sn, which is indeed a composition series of Sn, for An/1 ≅ An
and Sn/An ≅ C2 are simple groups (Lemma 27.9).

Thus 1 ⊴ An ⊴ Sn is the unique composition series of Sn when n ≥ 5.

(d) Not every group has a composition series. For example, ℤ has no
composition series. Indeed, any series of ℤ is of the form

0 ⊴ m1ℤ ⊴ m2ℤ ⊴ . . . ⊴ mnℤ = ℤ,    (3)
where m2 | m1, m3 | m2, . . . , mn | mn-1. If m0 is a multiple of m1 and m0 ≠ m1,
then
0 ⊴ m0ℤ ⊴ m1ℤ ⊴ m2ℤ ⊴ . . . ⊴ mnℤ = ℤ
is a proper refinement of (3). Thus any series of ℤ has a proper refine-
ment. Consequently, no series of ℤ can be a composition series of ℤ.

(e) Let ⟨a⟩ be a cyclic group of order 12. Then

1 ⊴ ⟨a^6⟩ ⊴ ⟨a^2⟩ ⊴ ⟨a⟩;  1 ⊴ ⟨a^6⟩ ⊴ ⟨a^3⟩ ⊴ ⟨a⟩;  1 ⊴ ⟨a^4⟩ ⊴ ⟨a^2⟩ ⊴ ⟨a⟩

are the composition series of ⟨a⟩. The composition factors are isomorphic
to C2,C3,C2; C2,C2,C3; C3,C2,C2, respectively.
Thus, aside from order, the composition factors arising from different
composition series are isomorphic groups.
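The factor orders in this example follow from o(a^k) = 12/gcd(k, 12). A quick illustrative check in Python (names ours):

```python
from math import gcd

n = 12                                  # o(a) = 12, so o(a^k) = 12/gcd(k, 12)

def order(k):
    return n // gcd(k, n)

def factor_orders(exps):
    """Orders of the factors of the series given by exponents of a."""
    sizes = [order(k) for k in exps]    # |<a^k>| for 1 = <a^0>, ..., <a^1> = <a>
    return [b // a for a, b in zip(sizes, sizes[1:])]

for exps in [(0, 6, 2, 1), (0, 6, 3, 1), (0, 4, 2, 1)]:
    print(exps, "->", factor_orders(exps))
# (0, 6, 2, 1) -> [2, 3, 2]
# (0, 6, 3, 1) -> [2, 2, 3]
# (0, 4, 2, 1) -> [3, 2, 2]
```

All three series yield the multiset {2, 2, 3} of factor orders, as the text observes.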

27.11 Definition: Let G be a group. Two series

1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn-1 ⊴ Gn = G
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G

of G are said to be equivalent if n = m and if the factors Gi/Gi-1 are, in
some order, isomorphic to the factors Hj/Hj-1 (i,j = 1,2, . . . ,n).

Here it is not stipulated that Gi/Gi-1 ≅ Hi/Hi-1 for all i = 1,2, . . . ,n. The
condition in Definition 27.11 is that Gi/Gi-1 ≅ Hiσ/Hiσ-1 for some σ ∈ Sn.
Clearly, Definition 27.11 introduces an equivalence relation on the set of
all series of G. The three series in Example 27.10(e) are equivalent. We
will prove that any two composition series of a group are equivalent,
provided G does have a composition series (Jordan-Hölder theorem). In
fact, a much stronger theorem is true (see Schreier's theorem below). We
need some elementary results.

27.12 Lemma (Dedekind's modular law): Let G be a group and let
A,B,C be subgroups of G such that A ⊆ C. Then
A(B ∩ C) = AB ∩ C.
(A(B ∩ C) and AB ∩ C are not necessarily subgroups of G.)

Proof: Let x ∈ A(B ∩ C). Then x = ab for some a ∈ A, b ∈ B ∩ C. Thus x =
ab ∈ AB and x = ab ∈ AC = C, so x ∈ AB ∩ C. This gives A(B ∩ C) ⊆ AB ∩ C.
To show the reverse inclusion, let c ∈ AB ∩ C. Then c = a1b1 for some a1
in A and b1 in B. From b1 = a1⁻¹c ∈ AC = C, we conclude b1 ∈ B ∩ C, hence
c = a1b1 ∈ A(B ∩ C). This gives AB ∩ C ⊆ A(B ∩ C). So A(B ∩ C) = AB ∩ C.

27.13 Lemma: Let A ⊴ C ≤ G and B ≤ G. Then

A ∩ B ⊴ C ∩ B and (C ∩ B)/(A ∩ B) ≅ A(B ∩ C)/A.

Proof: If G is a group, H ⊴ G and K ≤ G, then H ∩ K ⊴ K and K/(H ∩ K) is
isomorphic to HK/H by Theorem 21.3. Using this theorem with G,H,K
replaced by C, A, C ∩ B, respectively, we obtain A ∩ (C ∩ B) ⊴ C ∩ B and
(C ∩ B)/(A ∩ (C ∩ B)) ≅ A(C ∩ B)/A. Since A ∩ (C ∩ B) = A ∩ B and A(C ∩ B)
= A(B ∩ C), the claim follows.

[Subgroup diagram: BC = CB at the top; beneath it BA = AB and C, joined
in AB ∩ C = A(B ∩ C); then A and C ∩ B; A ∩ B at the bottom.]

27.14 Lemma: Let A ⊴ C ≤ G and B ⊴ G. Then

BA ⊴ BC and BC/BA ≅ C/A(B ∩ C).

Proof: Since B ⊴ G, we know from Lemma 19.4 that AB = BA ≤ G and
that CB = BC ≤ G. Thus BA ≤ BC. We prove next that BA is normal in BC.
We observe

B ≤ BA ≤ NG(BA),
hence (BA)^b = BA for all b ∈ B.
Then, for any b ∈ B, c ∈ C, we obtain
(BA)^bc = [(BA)^b]^c = (BA)^c = B^c A^c = BA^c = BA
since B ⊴ G and A ⊴ C. Thus (BA)^x = BA for all x ∈ BC and BA ⊴ BC.

Using Theorem 21.3 with BC, BA, C in place of G,H,K, respectively, we get
AB ∩ C ⊴ C and C/(AB ∩ C) ≅ C(AB)/AB.
Since AB ∩ C = A(B ∩ C) and C(AB) = (CA)B = CB = BC, this isomorphism
means C/A(B ∩ C) ≅ BC/BA.

27.15 Lemma (Zassenhaus' lemma): Let G be a group,

U1 ⊴ U2 ≤ G and V1 ⊴ V2 ≤ G.

Then U1(U2 ∩ V1) ⊴ U1(U2 ∩ V2), V1(U1 ∩ V2) ⊴ V1(U2 ∩ V2) and

U1(U2 ∩ V2) / U1(U2 ∩ V1) ≅ V1(U2 ∩ V2) / V1(U1 ∩ V2).

Proof: We put Dij := Ui ∩ Vj (i,j = 1,2). Since U1 ⊴ U2, we have
U1 ∩ V2 ⊴ U2 ∩ V2 by Lemma 27.13, so D12 ⊴ D22. Similarly, V1 ⊴ V2
and Lemma 27.13 gives U2 ∩ V1 ⊴ U2 ∩ V2, so D21 ⊴ D22. Now D12 ⊴ D22
and D21 ⊴ D22 and, writing E = D12D21 = D21D12 for brevity, we get E ⊴
D22 by Lemma 19.4(3).

[Subgroup diagram ("butterfly"): below U2 and V2 sit U1D22 and V1D22,
then U1D21 and V1D12, with U1 and V1 at the sides; in the middle column,
D22 over E = D12D21, over D12 and D21, over D11.]

Since E ⊴ D22 ≤ U2 and U1 ⊴ U2, Lemma 27.14 gives

U1E ⊴ U1D22 and U1D22/U1E ≅ D22/E(U1 ∩ D22).    (4)

Here U1E = U1(D12D21) = (U1D12)D21 = U1D21 and E(U1 ∩ D22) = E (because
U1 ∩ D22 = D12 ⊆ E), so (4) becomes

U1D21 ⊴ U1D22 and U1D22/U1D21 ≅ D22/E.    (5)

Repeating the same argument with the U's replaced by the V's, we get

V1D12 ⊴ V1D22 and V1D22/V1D12 ≅ D22/E.    (6)

The claim follows from (5) and (6).

27.16 Theorem (Schreier): Any two series of a group have
equivalent refinements. More precisely, if

1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn-1 ⊴ Gn = G    (g)
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G    (h)

are series of G, then there are series (g´) and (h´) of G such that (g´) is a
refinement of (g), (h´) is a refinement of (h), and (g´) and (h´) are equi-
valent.

Proof: We will try to build a series between each Gi-1 and Gi (i = 1,2, . . .
,n) by so modifying the terms of (h) that the modified series begin from
Gi-1 and terminate at Gi. There are two natural ways of doing this. Either
we multiply each term of (h) by Gi-1 (the resulting series will thus begin
from Gi-1) and intersect the products with Gi (the modified series will
thus terminate at Gi); or we intersect each term of (h) with Gi (the
resulting series will thus terminate at Gi) and multiply the intersections
by Gi-1 (the modified series will thus begin from Gi-1). By Dedekind's
modular law (Lemma 27.12), these two series between Gi-1 and Gi are
identical:

Hj ⟼ Gi-1Hj ⟼ Gi-1Hj ∩ Gi (j = 0,1, . . . ,m), where Gi-1H0 ∩ Gi = Gi-1 and Gi-1Hm ∩ Gi = Gi;
Hj ⟼ Hj ∩ Gi ⟼ Gi-1(Hj ∩ Gi) (j = 0,1, . . . ,m), where Gi-1(H0 ∩ Gi) = Gi-1 and Gi-1(Hm ∩ Gi) = Gi.

We put Gij = Gi-1Hj ∩ Gi = Gi-1(Hj ∩ Gi) (i = 1,2, . . . ,n; j = 0,1, . . . ,m).

Similarly, we put Hij = Hj-1Gi ∩ Hj = Hj-1(Gi ∩ Hj) (i = 0,1, . . . ,n; j = 1,2, . . . ,m).

Here Gi-1 ⊴ Gi, hence Gi-1(Hj ∩ Gi) ≤ Gi by Lemma 19.4(2). Thus Gij is a
subgroup of Gi. In the same way, Hij is a subgroup of Hj. So

Gi-1 = Gi0 ≤ Gi1 ≤ Gi2 ≤ . . . ≤ Gi,m-1 ≤ Gim = Gi    (gi)

and Hj-1 = H0j ≤ H1j ≤ H2j ≤ . . . ≤ Hn-1,j ≤ Hnj = Hj.    (hj)

Using Zassenhaus' lemma (Lemma 27.15) with

U1 = Gi-1, U2 = Gi, V1 = Hj-1, V2 = Hj,
we obtain, for each i = 1,2, . . . ,n, j = 1,2, . . . ,m:

Gi-1(Gi ∩ Hj-1) ⊴ Gi-1(Gi ∩ Hj), Hj-1(Gi-1 ∩ Hj) ⊴ Hj-1(Gi ∩ Hj)

and Gi-1(Gi ∩ Hj) / Gi-1(Gi ∩ Hj-1) ≅ Hj-1(Gi ∩ Hj) / Hj-1(Gi-1 ∩ Hj).

Thus Gi,j-1 ⊴ Gij, Hi-1,j ⊴ Hij and Gij/Gi,j-1 ≅ Hij/Hi-1,j.

Therefore (gi) is a series between Gi-1 and Gi, and (hj) is a series between
Hj-1 and Hj. Writing the terms of (g1),(g2),(g3), . . . ,(gn) consecutively, we
obtain a series (g´) of G with nm factors; and writing the terms of
(h1),(h2),(h3), . . . ,(hm) consecutively, we obtain a series (h´) of G with mn
factors. Here (g´) is a refinement of (g) and (h´) is a refinement of (h).
Finally, in view of the isomorphisms Gij/Gi,j-1 ≅ Hij/Hi-1,j, the series (g´)
and (h´) are equivalent.
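Schreier's construction can be carried out concretely for the two series of C36 in Exercise 10 below. In the Python sketch that follows (modeling C36 as ℤ/36; all names ours), both refinements turn out to have the nontrivial factor orders 2, 2, 3, 3, so they are equivalent:

```python
from math import gcd  # gcd is unused directly but handy when experimenting

N = 36                                         # model C36 as Z/36

def sub(d):
    """Cyclic subgroup of Z/36 generated by d."""
    return frozenset(d * k % N for k in range(N))

def join(A, B):
    """The product AB; a subgroup, since Z/36 is abelian."""
    return frozenset((a + b) % N for a in A for b in B)

g = [sub(0), sub(2), sub(1)]                   # 1 < C18 < C36
h = [sub(0), sub(9), sub(3), sub(1)]           # 1 < C4 < C12 < C36

def refine(g, h):
    """Schreier terms G_{i-1}(H_j ∩ G_i), written consecutively."""
    return [join(g[i - 1], h[j] & g[i])
            for i in range(1, len(g)) for j in range(len(h))]

def factors(chain):
    """Nontrivial factor orders of a chain of subgroups."""
    return sorted(len(b) // len(a) for a, b in zip(chain, chain[1:]) if len(b) > len(a))

print(factors(refine(g, h)))                   # [2, 2, 3, 3]
print(factors(refine(h, g)))                   # [2, 2, 3, 3]
```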

27.17 Theorem: Let G be a group and assume that G has a composition
series.
(1) Every proper series of G has a refinement which is a composition
series.
(2) (Jordan-Hölder Theorem) Any two composition series of G are equi-
valent.

Proof: Let
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn-1 ⊴ Gn = G    (g)
be a proper series of G and let
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G    (h)
be a composition series of G. By Schreier's theorem (Theorem 27.16),
there are equivalent series (g´) and (h´) of G such that (g´) is a
refinement of (g) and (h´) is a refinement of (h). From (g´) and (h´), we
delete repeated terms and thereby obtain two equivalent proper series,
say (g´´) and (h´´), respectively. Here (g´´) is a refinement of (g) and (h´´)
is a refinement of (h), because both (g) and (h) are proper series.

(1) (h´´) is a proper series and is a refinement of (h). But (h) has no
proper refinement, because (h) is a composition series. Hence (h´´) is
identical with (h). Thus (g) has a refinement (g´´) which is equivalent to
the composition series (h´´) = (h). Then the factors of (g´´), being iso-
morphic to the composition factors in (h), are all simple groups and (g´´)
itself is a composition series by Lemma 27.9. Therefore any proper
series (g) of G has a refinement (g´´) which is a composition series.

(2) Assume now that (g) is also a composition series of G. By the same
argument as above, (g´´) must be identical with (g). Then (g) = (g´´) and
(h´´) = (h) are equivalent. Thus any two composition series of G are equi-
valent.

We now discuss the class of solvable groups.

27.18 Definition: A series
H = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G
from H to G is said to be an abelian series if all the factors
H1/H0, H2/H1, . . . , Hm/Hm-1
are abelian groups.

27.19 Definition: A group G is called a solvable (or soluble) group if G
has an abelian series (from 1 to G).

Clearly, any abelian group A is solvable: 1 ⊴ A is an abelian series of A.
S3 is an example of a nonabelian solvable group. Not every group is
solvable. For example, nonabelian simple groups are certainly not solv-
able. In particular, An is not solvable for n ≥ 5.

27.20 Lemma: If G is a solvable group, then all subgroups and factor
groups of G are solvable.

Proof: Being solvable, G has an abelian series

1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G.

Let K be an arbitrary subgroup of G. Then, by Lemma 27.13,

1 = H0 ∩ K ⊴ H1 ∩ K ⊴ . . . ⊴ Hm-1 ∩ K ⊴ Hm ∩ K = G ∩ K = K    (6)

is a series of K and (Hi ∩ K)/(Hi-1 ∩ K) ≅ Hi-1(K ∩ Hi)/Hi-1 ≤ Hi/Hi-1 for all i =
1,2, . . . ,m. Since the Hi/Hi-1 are abelian, the (Hi ∩ K)/(Hi-1 ∩ K) are also
abelian and (6) is an abelian series of K. Hence K is solvable.

Now let N be an arbitrary normal subgroup of G. By Lemma 27.14 and
Theorem 21.2,

N/N = H0N/N ⊴ H1N/N ⊴ . . . ⊴ Hm-1N/N ⊴ HmN/N = G/N    (7)

is a series of G/N, and, for all i = 1,2, . . . ,m,
(HiN/N)/(Hi-1N/N) ≅ HiN/Hi-1N ≅ Hi/Hi-1(N ∩ Hi) ≅ (Hi/Hi-1)/(Hi-1(N ∩ Hi)/Hi-1)
is a factor group of the abelian group Hi/Hi-1 and therefore
(HiN/N)/(Hi-1N/N) is abelian (Lemma 18.9(2)). So (7) is an abelian series
of G/N and G/N is solvable.

27.21 Lemma: Let N ⊴ G. If N and G/N are both solvable, then G is
solvable.

Proof: By hypothesis, there are an abelian series

1 = N0 ⊴ N1 ⊴ . . . ⊴ Nm-1 ⊴ Nm = N
of N and an abelian series

N/N = H0/N ⊴ H1/N ⊴ . . . ⊴ Hk-1/N ⊴ Hk/N = G/N

of G/N. By Theorem 21.2,

1 = N0 ⊴ N1 ⊴ . . . ⊴ Nm-1 ⊴ Nm = N = H0 ⊴ H1 ⊴ . . . ⊴ Hk-1 ⊴ Hk = G

is a series of G. Since Nj/Nj-1 is abelian for j = 1,2, . . . ,m and Hi/Hi-1 ≅
(Hi/N)/(Hi-1/N) is abelian for i = 1,2, . . . ,k, this is an abelian series of G.
Thus G is solvable.

27.22 Lemma: Let H and K be normal solvable subgroups of a group G.
Then HK is a normal solvable subgroup of G.

Proof: HK is a normal subgroup of G by Lemma 19.4(3). Also, since K is
solvable, K/(H ∩ K) is solvable by Lemma 27.20, so HK/H is solvable by
Theorem 21.3. So H and HK/H are solvable and consequently HK is solv-
able by Lemma 27.21.

27.23 Theorem: If G is a finite p-group, then G is solvable.

Proof: If G is a finite p-group of order |G| = p^a, then there is a series

1 = H0 ⊴ H1 ⊴ . . . ⊴ Ha-1 ⊴ Ha = G

of G whose factors Hi/Hi-1 (i = 1,2, . . . ,a) are cyclic of order p (Theorem
26.3(2)). Thus G has an abelian series and G is solvable.

The series in Theorem 27.23 is a composition series of G. We now want
to prove more generally that a finite group G is solvable if and only if
every composition factor of G is cyclic of prime order. A finite group
does have a composition series, of course.

27.24 Lemma: A solvable group G is simple if and only if G is cyclic of
prime order.

Proof: Let G be a simple solvable group. Then G ≠ 1 and G has an abelian
series
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G.
After deleting repetitions, we may assume that this is a proper series.
Then Hm-1 ≠ G and G/Hm-1 is a nontrivial abelian group. Thus G´ ≤ Hm-1
(Theorem 24.14) and G´ is a proper normal subgroup of G. Since G is
simple, G´ = 1 and G is abelian. Thus G is cyclic of prime order by The-
orem 27.2. Conversely, a cyclic group of prime order is simple; and
abelian, hence solvable.

27.25 Theorem: Let G be a finite group. G is solvable if and only if
every composition factor of G has prime order.

Proof: If G is solvable, then any composition factor of G is solvable by
Lemma 27.20, simple by Lemma 27.9, and so has prime order by
Lemma 27.24. Conversely, if every composition factor of G has prime
order, then a composition series of G is an abelian series of G and
therefore G is solvable.

The following result will play a crucial role in proving that a polynomial
equation of degree greater than four cannot be solved by radicals.

27.26 Theorem: If n ≥ 5, then Sn is not solvable.

Proof: Otherwise the subgroup An of Sn would be solvable (Lemma
27.20), whereas An, being a nonabelian simple group (Theorem 27.4),
cannot have an abelian series. The conclusion follows also from Theorem
27.25, since An is a composition factor of Sn (1 ⊴ An ⊴ Sn is the unique
composition series of Sn by Example 27.10(c)).
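Exercise 13 below introduces the derived series G ⊇ G(1) ⊇ G(2) ⊇ . . . , and a finite group is solvable exactly when this series reaches 1. A brute-force Python sketch (our own naive closure routine, fine for these small groups) contrasts the solvable S4 with the unsolvable S5:

```python
from itertools import permutations

def mul(p, q):                      # apply p first, then q
    return tuple(q[p[i]] for i in range(len(p)))

def inv(p):
    r = [0] * len(p)
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def generated(gens, n):
    """Subgroup of S_n generated by gens (naive closure under products)."""
    G = {tuple(range(n))} | set(gens)
    while True:
        new = {mul(a, b) for a in G for b in G} - G
        if not new:
            return G
        G |= new

def derived(G, n):
    """Commutator subgroup, generated by all a^-1 b^-1 a b."""
    comms = {mul(mul(mul(inv(a), inv(b)), a), b) for a in G for b in G}
    return generated(comms, n)

def derived_orders(n):
    """Orders along the derived series of S_n until it stabilizes."""
    G = set(permutations(range(n)))
    orders = [len(G)]
    while True:
        D = derived(G, n)
        if len(D) == len(G):
            return orders
        G = D
        orders.append(len(G))

print(derived_orders(4))            # [24, 12, 4, 1] -> reaches 1: S4 is solvable
print(derived_orders(5))            # [120, 60]      -> stuck at A5: S5 is not
```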

Exercises

1. Let {Gi : i ∈ ℕ} be a collection of simple groups such that Gi ≤ Gi+1 for all
i ∈ ℕ and let G = ∪i∈ℕ Gi. Prove that G is a simple group.

2. Let S(ℕ) = {σ ∈ Sℕ : kσ ≠ k for at most finitely many k ∈ ℕ} and, for
each n ∈ ℕ, let S(n) = {σ ∈ Sℕ : kσ = k for all k ≥ n + 1}. Show that
Sn ≅ S(n) ≤ S(ℕ) ≤ Sℕ. Let A(n) denote the image of An under the
isomorphism Sn → S(n) for n ≥ 2 and show that A := ∪n≥5 A(n) is a simple
group. (A is called the infinite alternating group.)

3. Let M ⊴ G and let |G:M| be prime. Prove that M is a maximal normal
subgroup of G.

4. A normal series of a group G is called a chief series of G if it is a
proper series and if it has no proper refinement which is a normal series
of G. A factor of a chief series of G is called a chief factor of G.

Let G be a nontrivial group. A nontrivial normal subgroup M of G is
called a minimal normal subgroup of G if there is no L ⊴ G such that
1 ⊂ L ⊂ M.

Prove the following statements.

(a) H/K is a chief factor of G if and only if H/K is a minimal normal
subgroup of G/K.
(b) If M is a minimal normal subgroup of G, then M has no
characteristic subgroup except 1 and M.
(c) If G has a composition series, then G has a chief series.
(d) 1 ⊴ V4 ⊴ A4 ⊴ S4 is the unique chief series of S4.

5. Suppose G is a finite abelian group having no characteristic
subgroups except 1 and G. Show that there is a prime number p such
that g^p = 1 for all g ∈ G.

6. Prove that an abelian group has a composition series if and only if it is
finite.

7. Find an infinite abelian subgroup of the infinite alternating group (see
Ex. 2). Conclude that a subgroup of a group with a composition series
does not necessarily have a composition series.

8. Let H ⊴ G. Prove that, if G has a composition series, so does G/H.

9. Let H ⊴ G. Prove that, if H and G/H have composition series, so does G.

10. Repeat the proof of Schreier's theorem for the two series 1 ⊴ C18 ⊴
C36 and 1 ⊴ C4 ⊴ C12 ⊴ C36 of the cyclic group C36.

11. Prove that, if H and K are solvable, so is H × K.

12. Prove that, if H,K ⊴ G and G/H, G/K are solvable, so is G/(H ∩ K).

13. For each n ∈ ℕ, we define a subgroup G(n) of G recursively by G(1) =
G´ and G(n+1) = (G(n))´ = [G(n),G(n)]. The series

G ⊇ G(1) ⊇ G(2) ⊇ . . .

is called the derived series of G. Show that each G(n) is characteristic in G
and that, if
Gr ⊴ Gr-1 ⊴ Gr-2 ⊴ . . . ⊴ G1 ⊴ G0 = G
is an abelian series between Gr and G, then G(n) ≤ Gn for each n = 1,2, . . .
,r. Prove that G is solvable if and only if G(r) = 1 for some r ∈ ℕ.

14. Prove that a solvable group has a composition series if and only if it
is finite (cf. Ex. 6).

§28
Finitely Generated Abelian Groups

In this last paragraph of Chapter 2, we determine the structure of
finitely generated abelian groups. A complete classification of such
groups is given. Complete classification theorems are very rare in math-
ematics and, in general, they require sophisticated machinery. However,
the main theorems in this paragraph are proved by quite elementary
methods, chiefly by induction! This is due to the fact that commutativity
is a very strong condition.

This paragraph is not needed in the sequel.

28.1 Lemma: Let G be an abelian group. We write
T(G) := {g ∈ G : o(g) is finite}.
(1) T(G) is a subgroup of G (called the torsion subgroup of G).
(2) In G/T(G), every nonidentity element is of infinite order.

Proof: (1) Since o(1) = 1, we have 1 ∈ T(G) and T(G) ≠ ∅. Suppose now a, b are
in T(G), say o(a) = n, o(b) = m (n,m ∈ ℕ). Then (ab)^(nm) = a^(nm) b^(nm) = 1·1 = 1,
so o(ab) divides nm, thus ab ∈ T(G); and o(a⁻¹) = n, thus a⁻¹ ∈ T(G). By the
subgroup criterion, T(G) ≤ G.

(2) Since G is abelian, we can build the factor group G/T(G). If T(G)x in
G/T(G) has finite order, say n ∈ ℕ, then (T(G)x)^n = T(G), so T(G)x^n = T(G),
so x^n ∈ T(G), so o(x^n) is finite. Let o(x^n) = m ∈ ℕ. Then x^(nm) = (x^n)^m = 1, so
o(x) divides nm. Thus o(x) is finite and x ∈ T(G). It follows that T(G)x = T(G) is
the identity element of G/T(G). Hence every nonidentity element of
G/T(G) has infinite order.
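As a concrete illustration of the torsion subgroup (our own example, not the text's), consider the abelian group ℤ/6 × ℤ, written additively; its torsion subgroup is ℤ/6 × {0}:

```python
from math import gcd

def order(a, b, m=6):
    """Order of (a, b) in Z/m x Z (additive); None stands for infinite order."""
    if b != 0:
        return None                 # a nonzero Z-coordinate never returns to 0
    return m // gcd(a, m)

# The torsion subgroup consists exactly of the pairs (a, 0).
assert all(order(a, 0) is not None for a in range(6))
assert order(1, 1) is None          # (1, 1) has infinite order
print(sorted(order(a, 0) for a in range(6)))   # [1, 2, 3, 3, 6, 6]
```

Every element order in the torsion part divides 6, while any element with nonzero ℤ-coordinate has infinite order, exactly as Lemma 28.1 predicts.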

28.2 Definition: A group G is called a torsion group if every element of
G has finite order. A group is said to be without torsion, or torsion-free, if
every nonidentity element of G has infinite order.

Thus 1 is the only group which is both a torsion group and torsion-free.

Every finite group is a torsion group, but there are also infinite torsion
groups, for example ℚ/ℤ.

In view of Lemma 28.1, we are led to investigate two classes of abelian
groups: torsion abelian groups and torsion-free abelian groups. When
this is done, we will know the structure of T(G) and G/T(G), where G is
an abelian group. We must then investigate how T(G) and G/T(G) are
combined to build G.

We cannot expect to carry out this ambitious program without imposing
additional conditions on G. We will assume that G is finitely generated
(Definition 24.4). Under this assumption, T(G) turns out to be a finite
group (Theorem 28.15). The study of finite abelian groups reduces to the
study of finite abelian p-groups, p being a prime number, whose
structures are described in Theorem 28.10. After that, we turn our
attention to torsion-free abelian groups (Theorem 28.13). The next step
in our program is to put the pieces T(G) and G/T(G) together in the
appropriate way to form G. The appropriate way proves to be the
simplest way: G is isomorphic to the direct product of T(G) and G/T(G).
The structure of G will be completely determined by a set of integers.

28.3 Definition: Let G be an abelian group and let S = {g1,g2, . . . ,gr} be a
finite, nonempty subset of G. If, for any integers a1,a2, . . . ,ar, the relation
g1^a1 g2^a2 . . . gr^ar = 1
implies that g1^a1 = g2^a2 = . . . = gr^ar = 1, then S is said to be independent. If S
is independent and generates G, and if 1 ∉ S, then S is called a basis of
G.
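Independence can be tested exhaustively in a small group. In the additive group ℤ/12 (an illustration with our own names), {4, 6} is independent, while {2, 3} is not, since 3·2 + 2·3 ≡ 0 (mod 12) although 3·2 ≢ 0:

```python
from itertools import product

def independent(gens, n=12):
    """Check Definition 28.3 in the additive group Z/n by brute force."""
    for coeffs in product(range(n), repeat=len(gens)):
        if sum(c * g for c, g in zip(coeffs, gens)) % n == 0:
            if any(c * g % n != 0 for c, g in zip(coeffs, gens)):
                return False
    return True

assert independent([4, 6])          # orders 3 and 2; they generate a subgroup of order 6
assert not independent([2, 3])      # 3*2 + 2*3 = 12 = 0, yet 3*2 = 6 != 0
print("independence checks pass")
```

Coefficients are only taken modulo n, which suffices since every multiple repeats with period dividing n.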

In the following lemma, we will prove, among other things, that S =
{g1,g2, . . . ,gr} is a basis of G if and only if G is the direct product of the
cyclic groups ⟨g1⟩, ⟨g2⟩, . . . , ⟨gr⟩. Lemma 28.4(2) is of especial importance:
it states that a finitely generated abelian torsion group is in fact a finite
group.

28.4 Lemma: Let G be an abelian group and g1,g2, . . . ,gr be finitely many
elements of G, not necessarily distinct (r ≥ 1). Let B ≤ G.

(1) ⟨g1,g2, . . . ,gr⟩ = ⟨g1⟩⟨g2⟩. . . ⟨gr⟩.
(2) If each gi has finite order, then |⟨g1,g2, . . . ,gr⟩| ≤ o(g1)o(g2). . . o(gr).
(3) If G = ⟨g1,g2, . . . ,gr⟩ and φ: G → A is a homomorphism onto A, then A =
⟨g1φ, g2φ, . . . ,grφ⟩.
(4) If G = ⟨g1,g2, . . . ,gr⟩, then G/B = ⟨Bg1,Bg2, . . . ,Bgr⟩.
(5) If G/B = ⟨Bg1,Bg2, . . . ,Bgr⟩, then G = B⟨g1,g2, . . . ,gr⟩. If, in addition,
b1, . . . ,bs ∈ B and B = ⟨b1, . . . ,bs⟩, then G = ⟨b1, . . . ,bs,g1,g2, . . . ,gr⟩.
(6) If B = ⟨g1⟩ and G/B = ⟨Bg2, . . . ,Bgr⟩, then G = ⟨g1,g2, . . . ,gr⟩.
(7) {g1,g2, . . . ,gr} is an independent subset of G and G = ⟨g1,g2, . . . ,gr⟩ if and
only if G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩. In particular, in case g1,g2, . . . ,gr are all
distinct from 1, the subset {g1,g2, . . . ,gr} is a basis of G if and only if G =
⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩.

Proof: (1) Certainly {g1,g2, . . . ,gr} ⊆ ⟨g1⟩⟨g2⟩. . . ⟨gr⟩ ≤ G by repeated use of
Lemma 19.4(3), and so ⟨g1,g2, . . . ,gr⟩ ≤ ⟨g1⟩⟨g2⟩. . . ⟨gr⟩ by the definition of
⟨g1,g2, . . . ,gr⟩. Also, any element of ⟨g1⟩⟨g2⟩. . . ⟨gr⟩, necessarily of the form
g1^m1 g2^m2 . . . gr^mr with suitable integers m1,m2, . . . ,mr, is in ⟨g1,g2, . . . ,gr⟩ by
Lemma 24.2 and so ⟨g1⟩⟨g2⟩. . . ⟨gr⟩ ≤ ⟨g1,g2, . . . ,gr⟩. Hence ⟨g1,g2, . . . ,gr⟩ =
⟨g1⟩⟨g2⟩. . . ⟨gr⟩.

(2) Suppose o(gi) = ki for each i = 1,2, . . . ,r. If g ∈ ⟨g1,g2, . . . ,gr⟩, then,
by part (1), g = g1^m1 g2^m2 . . . gr^mr with suitable integers mi. Dividing mi by ki,
we may write mi = ki·qi + ti, where qi,ti ∈ ℤ and 0 ≤ ti < ki. Then gi^mi =
(gi^ki)^qi gi^ti = gi^ti and g = g1^t1 g2^t2 . . . gr^tr. Thus
⟨g1,g2, . . . ,gr⟩ ⊆ {g1^t1 g2^t2 . . . gr^tr : 0 ≤ ti < ki for all i = 1,2, . . . ,r}
and
|⟨g1,g2, . . . ,gr⟩| ≤ k1k2. . . kr.

(3) If a ∈ A, then a = gφ for some g ∈ G since φ is onto, and
g = g1^m1 g2^m2 . . . gr^mr with suitable integers mi since G = ⟨g1,g2, . . . ,gr⟩. Thus
a = gφ = (g1^m1 g2^m2 . . . gr^mr)φ = (g1φ)^m1 (g2φ)^m2 . . . (grφ)^mr ∈ ⟨g1φ, g2φ, . . . ,grφ⟩
and A ≤ ⟨g1φ, g2φ, . . . ,grφ⟩.

(4) This follows from part (3) when we take A to be G/B and φ to be the
natural homomorphism ν: G → G/B.

(5) Suppose G/B = ⟨Bg1,Bg2, . . . ,Bgr⟩. Let g ∈ G. Then Bg ∈ G/B and, by part
(1) with G/B in place of G and Bgi in place of gi, we have
Bg = (Bg1)^m1 (Bg2)^m2 . . . (Bgr)^mr = B g1^m1 g2^m2 . . . gr^mr for some integers mi.
Hence g = b g1^m1 g2^m2 . . . gr^mr for some b ∈ B and g ∈ B⟨g1,g2, . . . ,gr⟩. So G =
B⟨g1,g2, . . . ,gr⟩. If, in addition, B = ⟨b1, . . . ,bs⟩, then
G = ⟨b1, . . . ,bs⟩⟨g1,g2, . . . ,gr⟩ = ⟨b1⟩. . . ⟨bs⟩⟨g1⟩⟨g2⟩. . . ⟨gr⟩
= ⟨b1, . . . ,bs,g1,g2, . . . ,gr⟩.

(6) This follows from part (5) with a slight change in notation.

(7) Since G is abelian, G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩ if and only if every
element of G can be expressed in the form u1u2. . . ur, where ui ∈ ⟨gi⟩, in a
unique manner (Theorem 22.15).

Every element of G has at least one such representation if and only if G =
⟨g1⟩⟨g2⟩. . . ⟨gr⟩, that is, if and only if G = ⟨g1,g2, . . . ,gr⟩.

We want to show that every element of G has at most one such repre-
sentation if and only if {g1,g2, . . . ,gr} is independent. Equivalently, we will
prove that there is an element in G with two different representations if
and only if {g1,g2, . . . ,gr} is not independent. Indeed, there is an element
in G with two different representations if and only if g1^m1 g2^m2 . . . gr^mr =
g1^n1 g2^n2 . . . gr^nr for some integers such that gi^mi ≠ gi^ni for at least one
i ∈ {1,2, . . . ,r}. The latter condition holds if and only if

g1^(m1−n1) g2^(m2−n2) . . . gr^(mr−nr) = 1,

where not all of g1^(m1−n1), g2^(m2−n2), . . . , gr^(mr−nr) are equal to 1, that is, if and only
if {g1,g2, . . . ,gr} is not independent.

28.5 Lemma: Let G be a group and g1,g2, . . . ,gr elements of G. Let B = ⟨g1⟩
and suppose o(gi) = o(Bgi) for i = 2, . . . ,r.

(1) If {Bg2, . . . ,Bgr} is an independent subset of G/B, then {g1,g2, . . . ,gr} is an
independent subset of G.
(2) Assume g1,g2, . . . ,gr are all distinct from 1. If G/B = ⟨Bg2⟩ × . . . × ⟨Bgr⟩,
then G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩.

Proof: (1) If m1,m2, . . . ,mr are integers such that

g1^m1 g2^m2 . . . gr^mr = 1,    (*)

then B = B g1^m1 g2^m2 . . . gr^mr = (Bg1)^m1 (Bg2)^m2 . . . (Bgr)^mr = (Bg2)^m2 . . . (Bgr)^mr,
so (Bg2)^m2 = . . . = (Bgr)^mr = B since {Bg2, . . . ,Bgr} is independent. Thus o(gi) =
o(Bgi) divides mi in case o(gi) is finite, and mi = 0 in case o(gi) = o(Bgi) is
infinite (i = 2, . . . ,r). In both cases gi^mi = 1 (i = 2, . . . ,r) and, because of (*),
g1^m1 = 1 as well. Hence {g1,g2, . . . ,gr} is independent.

(2) If G/B = ⟨Bg2⟩ × . . . × ⟨Bgr⟩, then G/B = ⟨Bg2, . . . ,Bgr⟩ and
{Bg2, . . . ,Bgr} is independent (Lemma 28.4(7)),
G = ⟨g1,g2, . . . ,gr⟩ (Lemma 28.4(6)),
{g1,g2, . . . ,gr} is independent (Lemma 28.5(1)),
G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩ (Lemma 28.4(7)).

We now examine the structure of finite abelian groups. A finite abelian
group is a direct product of its Sylow p-subgroups. This follows immedi-
ately if the existence of Sylow p-subgroups is granted. In order to keep
this paragraph independent of §26, we give another proof, from which
the existence of Sylow p-subgroups (of finite abelian groups) follows as
a bonus. We need a lemma.

28.6 Lemma: Let A be a finite abelian group and let q be a prime
number. If q divides |A|, then A has an element of order q.

Proof: Let |A| = n and let a1,a2, . . . ,an be the n elements of A. We write
mi = o(ai) for i = 1,2, . . . ,n. We list all products

a1^k1 a2^k2 . . . an^kn,

where each ki runs through 0,1, . . . ,mi − 1. Our list has thus m1m2. . . mn
entries. Every element of A appears in our list. Two entries a1^k1 a2^k2 . . . an^kn
and a1^s1 a2^s2 . . . an^sn are equal if and only if the entry a1^r1 a2^r2 . . . an^rn, where ri
is such that 0 ≤ ri ≤ mi − 1 and ki − si ≡ ri (mod mi), is equal to the
identity element of A. Thus any element of A appears in our list as
many times as 1 does, say t times. The number of entries is therefore
m1m2. . . mn = nt. Since q divides n, we see q | m1m2. . . mn and q divides one
of the numbers m1,m2, . . . ,mn (Lemma 5.16), say q | m1. Let us put m1 =
qh, h ∈ ℕ. By Lemma 11.9(2), a1^h has order
o(a1^h) = o(a1)/(o(a1),h) = m1/(m1,h) = qh/(qh,h) = qh/h = q.

28.7 Theorem: Let G be a finite abelian group and let |G| = p1^a1 p2^a2 ... ps^as
be the canonical decomposition of |G| into prime numbers (ai > 0).

(1) For n ∈ ℕ, we put G[n] := {g ∈ G : g^n = 1}. Then G[n] ≤ G for any n ∈ ℕ.

(2) Let Gi = G[pi^ai] for i = 1, 2, ..., s. Then G = G1 × G2 × ... × Gs.

(3) |Gi| = pi^ai (and Gi is called a Sylow pi-subgroup of G).

(4) Let H be an abelian group with |H| = |G| and Hi = H[pi^ai] (i = 1, 2, ..., s).
Then G ≅ H if and only if Gi ≅ Hi for all i = 1, 2, ..., s.

Proof: (1) Let n ∈ ℕ. From 1^n = 1, we get 1 ∈ G[n], so G[n] ≠ ∅. We use
our subgroup criterion.
(i) If x, y ∈ G[n], then x^n = 1 = y^n and (xy)^n = x^n y^n = 1·1 = 1,
and so xy ∈ G[n].
(ii) If x ∈ G[n], then x^n = 1 and (x^(−1))^n = (x^n)^(−1) = 1^(−1) = 1,
and so x^(−1) ∈ G[n].
Thus G[n] ≤ G.

(2) We must show that G = G1G2...Gs and G1...G(j−1) ∩ Gj = 1 for all
j = 2, ..., s (Theorem 22.12). We put |G|/pi^ai = mi (i = 1, 2, ..., s). Here the
integers m1, m2, ..., ms are relatively prime and there are integers
u1, u2, ..., us such that u1m1 + u2m2 + ... + usms = 1.

We now show G = G1G2...Gs. If g ∈ G, then g = g^(u1m1) g^(u2m2) ... g^(usms),
with g^(uimi) ∈ Gi since (g^(uimi))^(pi^ai) = g^(ui|G|) = 1 (i = 1, 2, ..., s).
Thus G ⊆ G1G2...Gs and G = G1G2...Gs. Secondly, let j ∈ {2, ..., s} and
g ∈ G1...G(j−1) ∩ Gj. Then g = g1...g(j−1), where
g1^(p1^a1) = ... = g(j−1)^(p(j−1)^a(j−1)) = 1, therefore
g^(p1^a1 ... p(j−1)^a(j−1)) = 1 and o(g) | p1^a1 ... p(j−1)^a(j−1). On the
other hand, g ∈ Gj, so g^(pj^aj) = 1 and o(g) | pj^aj. Thus o(g) = 1 and
g = 1. Thus G1...G(j−1) ∩ Gj ⊆ 1 and G1...G(j−1) ∩ Gj = 1. This proves
G = G1 × G2 × ... × Gs.

(3) By the very definition of Gi = G[pi^ai], the order of any element in Gi
is a divisor of pi^ai. Then, by Lemma 28.6, |Gi| is not divisible by any
prime number q distinct from pi. Thus |Gi| = pi^bi for some bi,
0 ≤ bi ≤ ai. From p1^b1 p2^b2 ... ps^bs = |G1| |G2| ... |Gs| =
|G1 × G2 × ... × Gs| = |G| = p1^a1 p2^a2 ... ps^as, we get |Gi| = pi^bi = pi^ai
for all i = 1, 2, ..., s.

(4) Let φ: G → H be an isomorphism. For any g ∈ Gi, we have g^(pi^ai) = 1,
so (gφ)^(pi^ai) = (g^(pi^ai))φ = 1φ = 1. Thus gφ ∈ Hi and Giφ ⊆ Hi. Also, if
h ∈ Hi, then h = gφ for some g ∈ G and (g^(pi^ai))φ = (gφ)^(pi^ai) =
h^(pi^ai) = 1. Thus g^(pi^ai) ∈ Ker φ = 1, so g^(pi^ai) = 1, so g ∈ Gi and
h = gφ ∈ Giφ. Hence Hi ⊆ Giφ. We obtain Giφ = Hi. Consequently,
φ|Gi : Gi → Hi is an isomorphism and Gi ≅ Hi for all i = 1, 2, ..., s.

Conversely, assume |G| = |H| and Gi ≅ Hi for all i = 1, 2, ..., s. From part
(2), we get G = G1 × G2 × ... × Gs and H = H1 × H2 × ... × Hs, and Lemma
22.16 gives G ≅ H.

According to Theorem 28.7, the structure of a finite abelian group is


completely determined by the structure of its Sylow subgroups. Consequently,
we focus our attention on finite abelian p-groups. After two preparatory
lemmas, the structure of finite abelian p-groups will be described in
Theorem 28.10.

28.8 Lemma: Let G be an abelian group and g1, g2, ..., gr elements of G.
Let n ∈ ℕ. We write G^n = {g^n : g ∈ G}.

(1) G^n ≤ G.
(2) If G = ⟨g1, g2, ..., gr⟩, then G^n = ⟨g1^n, g2^n, ..., gr^n⟩.
(3) If G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩, then G^n = ⟨g1^n⟩ × ⟨g2^n⟩ × ... × ⟨gr^n⟩ and
G/G^n ≅ ⟨g1⟩/⟨g1^n⟩ × ⟨g2⟩/⟨g2^n⟩ × ... × ⟨gr⟩/⟨gr^n⟩.
(4) Let H be an abelian group. If G ≅ H, then G^n ≅ H^n and G/G^n ≅ H/H^n.

Proof: (1) and (2) Since (ab)^n = a^n b^n for all a, b ∈ G, the mapping

    ν: G → G^n,  a ↦ a^n

is a homomorphism onto G^n. So G^n = Im ν ≤ G by Theorem 20.6. Also, if
G = ⟨g1, g2, ..., gr⟩, then G^n = ⟨g1ν, g2ν, ..., grν⟩ = ⟨g1^n, g2^n, ..., gr^n⟩ by
Lemma 28.4(3).

(3) If G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩, then G = ⟨g1, g2, ..., gr⟩ and {g1, g2, ..., gr}
is independent (Lemma 28.4(7)). Then G^n = ⟨g1^n, g2^n, ..., gr^n⟩ by part (2).
Moreover, {g1^n, g2^n, ..., gr^n} is independent, for if m1, m2, ..., mr are
integers and (g1^n)^m1 (g2^n)^m2 ... (gr^n)^mr = 1, then
g1^(nm1) g2^(nm2) ... gr^(nmr) = 1, so (gi^n)^mi = gi^(nmi) = 1 because
{g1, g2, ..., gr} is independent. From Lemma 28.4(7), we obtain that
G^n = ⟨g1^n⟩ × ⟨g2^n⟩ × ... × ⟨gr^n⟩. The second assertion follows from
Lemma 22.17.

(4) Assume φ: G → H is an isomorphism. For any g ∈ G, g^nφ = (gφ)^n ∈ H^n,
and therefore G^nφ ⊆ H^n. Also, if h1 ∈ H^n, then h1 = h^n for some h ∈ H
and h = gφ for some g ∈ G, so h1 = h^n = (gφ)^n = g^nφ ∈ G^nφ and thus
H^n ⊆ G^nφ. Hence H^n = G^nφ and φ|G^n : G^n → H^n is an isomorphism:
G^n ≅ H^n. By Theorem 21.1(7), we have also G/G^n ≅ Gφ/G^nφ = H/H^n.

28.9 Lemma: Let p be a prime number and G a finite abelian p-group.
Let g1 ∈ G be such that o(g1) ≥ o(a) for all a ∈ G and put B = ⟨g1⟩. If
Bx ∈ G/B and o(Bx) = p^m, then Bx = Bg for some g ∈ G satisfying
o(g) = p^m.

Proof: Let o(g1) = p^s, o(Bx) = p^m and o(x) = p^u. Since (Bx)^(p^u) =
Bx^(p^u) = B1 = B, we have p^m | p^u by Lemma 11.6. Also, Bx^(p^m) =
(Bx)^(p^m) = B, thus x^(p^m) ∈ B = ⟨g1⟩ and x^(p^m) = g1^n for some n
with 1 ≤ n ≤ p^s. We write n = p^k t, where k and t are integers, k ≥ 0 and
(p,t) = 1. Then p^k ≤ p^k t = n ≤ p^s and, by Lemma 11.9,

    p^(u−m) = p^u/p^m = p^u/(p^u,p^m) = o(x)/(o(x),p^m) = o(x^(p^m))
            = o(g1^n) = o(g1^(p^k t)) = o(g1)/(o(g1),p^k t) = p^s/(p^s,tp^k)
            = p^s/p^k = p^(s−k).

So p^(s+m−k) = p^u = o(x) ≤ o(g1) = p^s by hypothesis, and m ≤ k.

We put z = g1^(t p^(k−m)) and g = z^(−1)x. Then z ∈ ⟨g1⟩ = B and Bg = Bx
(Lemma 10.2(5)). From x^(p^m) = g1^n = g1^(t p^k) = (g1^(t p^(k−m)))^(p^m)
= z^(p^m), we get

    g^(p^m) = (z^(−1)x)^(p^m) = (z^(p^m))^(−1) x^(p^m) = 1

and o(g) ≤ p^m. Also p^m = o(Bx) = o(Bg) ≤ o(g). Thus o(g) = p^m. This
completes the proof.

We can now describe finite abelian groups.

28.10 Theorem: (1) Let p be a prime number and let G be a nontrivial
finite abelian p-group. Then G has a basis, that is, there are elements
g1, g2, ..., gr in G\1 such that

    G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩.

(2) The number of elements in a basis of G, as well as the orders of the
elements in a basis of G, are uniquely determined by G. More precisely,
let {g1, g2, ..., gr} and {h1, h2, ..., hs} be bases of G, let o(gi) = p^mi
(i = 1, 2, ..., r) and o(hj) = p^nj (j = 1, 2, ..., s), and suppose the notation is
so chosen that m1 ≥ m2 ≥ ... ≥ mr > 0 and n1 ≥ n2 ≥ ... ≥ ns > 0. Then
r = s and the r-tuple (p^m1, p^m2, ..., p^mr) is equal to the s-tuple
(p^n1, p^n2, ..., p^ns). The r-tuple (p^m1, p^m2, ..., p^mr) is called the type
of G.

(3) Let H be a nontrivial finite abelian p-group. Then G ≅ H if and only if
G and H have the same type.

Proof: (1) We make induction on u, where |G| = p^u. If u = 1, then |G| = p,
so G is cyclic (Theorem 11.13) and the claim is true. Assume now G is a
finite abelian p-group, |G| ≥ p^2, and assume that, whenever G1 is a finite
abelian p-group with 1 < |G1| < |G|, then G1 is a direct product of certain
nontrivial cyclic subgroups.

We choose an element g1 of G such that o(g1) ≥ o(a) for all a ∈ G and put
⟨g1⟩ = B. Since G ≠ 1, we have B ≠ 1. If G = B = ⟨g1⟩, the claim is
established, so we suppose B ≠ G. Then G/B is a finite abelian p-group
with 1 < |G/B| < |G|. By induction, there are elements Bx2, ..., Bxr of G/B,
distinct from B1, such that

    G/B = ⟨Bx2⟩ × ... × ⟨Bxr⟩.

Let us put o(Bxi) = p^mi for i = 2, ..., r. Using Lemma 28.9, we find gi ∈ G
such that Bxi = Bgi and o(gi) = p^mi (i = 2, ..., r). Let us write o(g1) = p^m1.
Then G/B = ⟨Bg2⟩ × ... × ⟨Bgr⟩ and, by Lemma 28.5(2),

    G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩,

where g2, ..., gr are distinct from 1 since Bg2, ..., Bgr are distinct from B,
and g1 is distinct from 1 since o(g1) ≥ o(a) for all a ∈ G and G ≠ 1. This
completes the proof of part (1).

(2) and (3) For convenience, a t-tuple (p^a1, p^a2, ..., p^at) will be called a
type of a nontrivial finite abelian p-group A if a1 ≥ a2 ≥ ... ≥ at > 0 and
if A has a basis {f1, f2, ..., ft} with o(fk) = p^ak (k = 1, 2, ..., t). We cannot
say the type of A, for part (2) is not proved yet. The claim in part (2) is
that all types of a nontrivial finite abelian p-group (arising from different
bases) are equal.

Let G and H be nontrivial finite abelian p-groups, let (p^m1, p^m2, ..., p^mr)
be a type of G, arising from a basis {g1, g2, ..., gr} of G, and let
(p^n1, p^n2, ..., p^ns) be a type of H, arising from a basis {h1, h2, ..., hs} of H.

If r = s and (p^m1, p^m2, ..., p^mr) = (p^n1, p^n2, ..., p^ns), then
⟨gi⟩ ≅ C_{p^mi} ≅ ⟨hi⟩ for i = 1, 2, ..., r and
G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩ ≅ ⟨h1⟩ × ⟨h2⟩ × ... × ⟨hr⟩ = H
(Lemma 22.16). This proves the "if" part of (3).

Now the "only if" part of (3), which includes (2) as a particular case
(when G = H): we will prove that G ≅ H implies r = s and
(p^m1, p^m2, ..., p^mr) = (p^n1, p^n2, ..., p^ns).

Suppose G ≅ H. We make induction on u, where |G| = p^u. If u = 1, then
|G| = p = |H|, so G and H are both cyclic, hence G = ⟨g1⟩ and H = ⟨h1⟩.
Thus r = 1 = s and p^m1 = o(g1) = p = o(h1) = p^n1. The claim is therefore
established when u = 1. Now suppose |G| ≥ p^2 and suppose inductively
that, if G1 and H1 are isomorphic finite abelian p-groups with
1 < |G1| < |G|, and if (p^a1, p^a2, ..., p^ar´) is a type of G1 and
(p^b1, p^b2, ..., p^bs´) is a type of H1, then r´ = s´ and
(p^a1, p^a2, ..., p^ar´) = (p^b1, p^b2, ..., p^bs´). We distinguish two cases:
the case G^p = 1 and the case G^p ≠ 1.

In case G^p = 1, we have g^p = 1 for all g ∈ G; in particular
p^mi = o(gi) = p for all i = 1, 2, ..., r. Also H^p = 1 (Lemma 28.8(4)) and
p^nj = o(hj) = p for all j = 1, 2, ..., s. Hence
p^r = |⟨g1⟩| |⟨g2⟩| ... |⟨gr⟩| = |G| = |H| = |⟨h1⟩| |⟨h2⟩| ... |⟨hs⟩| = p^s,
so r = s and (p^m1, p^m2, ..., p^mr) = (p, p, ..., p) = (p^n1, p^n2, ..., p^ns),
as claimed.

Suppose now G^p ≠ 1. Then H^p ≠ 1. Thus there are elements in G and H
of order greater than p, so p^m1 > p and p^n1 > p. Assume k is the
greatest index in {1, 2, ..., r} with p^mk > p, so that (when k < r)
p^m(k+1) = ... = p^mr = p. Let the index l ∈ {1, 2, ..., s} have a similar
meaning for the group H. Then

    (p^m1, p^m2, ..., p^mr) = (p^m1, ..., p^mk, p, ..., p)     (†)
                                         [r − k times]

    (p^n1, p^n2, ..., p^ns) = (p^n1, ..., p^nl, p, ..., p),    (††)
                                         [s − l times]

it being understood that the entries p should be deleted when k = r or
l = s. By Lemma 28.8(3),

    G^p = ⟨g1^p⟩ × ⟨g2^p⟩ × ... × ⟨gr^p⟩
        = ⟨g1^p⟩ × ... × ⟨gk^p⟩ × 1 × ... × 1     [r − k times]
        = ⟨g1^p⟩ × ... × ⟨gk^p⟩

with o(gi^p) = p^(mi−1) ≠ 1 for i = 1, ..., k. Hence {g1^p, ..., gk^p} is a basis
and (p^(m1−1), ..., p^(mk−1)) is a type of G^p. In the same way,
(p^(n1−1), ..., p^(nl−1)) is a type of H^p. Here G^p is an abelian p-group
with 1 < |G^p| = p^((m1−1)+ ... +(mk−1)) < p^(m1+m2+ ... +mr) = |G|. Since
G^p ≅ H^p by Lemma 28.8(4), our inductive hypothesis gives

    k = l and (p^(m1−1), ..., p^(mk−1)) = (p^(n1−1), ..., p^(nl−1)).

Then p^mi = p^ni for i = 1, ..., k. From

    p^(m1+ ... +mk) p^(r−k) = |G| = |H| = p^(n1+ ... +nl) p^(s−l)
                            = p^(m1+ ... +mk) p^(s−l)

we get r − k = s − l = s − k. Thus r = s and a glance at (†), (††) shows
(p^m1, p^m2, ..., p^mr) = (p^n1, p^n2, ..., p^nr). This completes the proof.

28.11 Examples: (a) We find all abelian groups of order p^5, where p is
a prime number. An abelian group A of order p^5 is determined by its
type (p^m1, ..., p^mr), where of course p^(m1+ ... +mr) = |A| = p^5. Since
mi > 0 and m1 + ... + mr = 5, the only possible types are

    (p^5), (p^4,p), (p^3,p^2), (p^3,p,p), (p^2,p^2,p), (p^2,p,p,p), (p,p,p,p,p)

and any abelian group of order p^5 is isomorphic to one of

    C_{p^5},  C_{p^4} × C_p,  C_{p^3} × C_{p^2},  C_{p^3} × C_p × C_p,
    C_{p^2} × C_{p^2} × C_p,  C_{p^2} × C_p × C_p × C_p,
    C_p × C_p × C_p × C_p × C_p.

In particular, there are exactly seven nonisomorphic abelian groups of
order p^5.

(b) The number of nonisomorphic abelian groups of order p^n (p prime)
can be found by the same argument. This number is clearly the number
of ways of writing n as a sum of positive integers m1, ..., mr. If n ∈ ℕ, an
equation of the form n = m1 + ... + mr, where m1, m2, ..., mr are natural
numbers and m1 ≥ m2 ≥ ... ≥ mr > 0, is called a partition of n. Thus
the number of nonisomorphic abelian groups of order p^n is the number
of partitions of n. Notice that this number depends only on n, not on p.

The partitions of 6 are

    6, 5+1, 4+2, 4+1+1, 3+3, 3+2+1, 3+1+1+1, 2+2+2, 2+2+1+1,
    2+1+1+1+1, 1+1+1+1+1+1

and an abelian group of order p^6 is isomorphic to one of

    C_{p^6},  C_{p^5} × C_p,  C_{p^4} × C_{p^2},  C_{p^4} × C_p × C_p,
    C_{p^3} × C_{p^3},  C_{p^3} × C_{p^2} × C_p,  C_{p^3} × C_p × C_p × C_p,
    C_{p^2} × C_{p^2} × C_{p^2},  C_{p^2} × C_{p^2} × C_p × C_p,
    C_{p^2} × C_p × C_p × C_p × C_p,  C_p × C_p × C_p × C_p × C_p × C_p.

There are thus eleven nonisomorphic abelian groups of order p^6.
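These counts are easy to check by machine. The following sketch (the
function `partitions` and its interface are our own, not from the text)
enumerates the partitions of n, i.e. the possible types of an abelian group
of order p^n:

```python
def partitions(n, largest=None):
    """All partitions of n into weakly decreasing positive parts <= largest."""
    if largest is None:
        largest = n
    if n == 0:
        return [[]]
    result = []
    # Choose the first (largest) part, then partition the remainder
    # into parts no bigger than it.
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            result.append([part] + rest)
    return result

print(len(partitions(5)))   # 7  abelian groups of order p^5
print(len(partitions(6)))   # 11 abelian groups of order p^6
print(partitions(4))        # [[4], [3, 1], [2, 2], [2, 1, 1], [1, 1, 1, 1]]
```

Each partition [m1, ..., mr] of n corresponds to the type (p^m1, ..., p^mr).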

(c) Let us find all abelian groups of order 324 000 = 2^5 3^4 5^3 (to within
isomorphism). An abelian group A of this order is the direct product
A2 × A3 × A5, where Ap denotes the Sylow p-subgroup of A (p = 2, 3, 5).
Here A2 has order 2^5 and is isomorphic to one of the seven groups of
type

    (2^5), (2^4,2), (2^3,2^2), (2^3,2,2), (2^2,2^2,2), (2^2,2,2,2), (2,2,2,2,2).

Likewise there are five possibilities for A3:

    (3^4), (3^3,3), (3^2,3^2), (3^2,3,3), (3,3,3,3)

and three possibilities for A5:

    (5^3), (5^2,5), (5,5,5).

The 7 · 5 · 3 = 105 various direct products A2 × A3 × A5 give us a complete
list of nonisomorphic abelian groups of order 324 000.
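The same pattern works for any order: factor the order, count the
partitions of each exponent, and multiply. A quick self-contained check
(the function name `npartitions` is ours):

```python
def npartitions(n, largest=None):
    """Number of partitions of n into parts <= largest."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(npartitions(n - part, part)
               for part in range(min(n, largest), 0, -1))

# 324 000 = 2^5 * 3^4 * 5^3, so the number of nonisomorphic abelian
# groups of that order is p(5) * p(4) * p(3).
print(npartitions(5), npartitions(4), npartitions(3))    # 7 5 3
print(npartitions(5) * npartitions(4) * npartitions(3))  # 105
```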

Now that we have obtained a complete classification of finite abelian
groups, we turn our attention to torsion-free ones.

28.12 Lemma: Let G be an abelian group, B a subgroup of G, and
assume that G/B is a direct product of k infinite cyclic groups (k ≥ 1),
say

    G/B = ⟨By1⟩ × ⟨By2⟩ × ... × ⟨Byk⟩

(y1, y2, ..., yk ∈ G). Then ⟨y1⟩, ⟨y2⟩, ..., ⟨yk⟩ are infinite cyclic groups and

    G = B × ⟨y1⟩ × ⟨y2⟩ × ... × ⟨yk⟩.

Proof: Let Y := ⟨y1, y2, ..., yk⟩ ≤ G. Then G/B = ⟨By1, By2, ..., Byk⟩ and,
from Lemma 28.4(5), we obtain G = BY. We will show that G = B × Y and
Y = ⟨y1⟩ × ⟨y2⟩ × ... × ⟨yk⟩.

To establish G = B × Y, we need only prove B ∩ Y = 1. Let g ∈ B ∩ Y. Then
g = y1^a1 y2^a2 ... yk^ak for some integers a1, a2, ..., ak (Lemma 28.4(1))
and B = Bg = (By1)^a1 (By2)^a2 ... (Byk)^ak. Since {By1, By2, ..., Byk} is an
independent subset of G/B (Lemma 28.4(7)), we get
(By1)^a1 = (By2)^a2 = ... = (Byk)^ak = B. But
o(By1) = o(By2) = ... = o(Byk) = ∞ by hypothesis, so a1 = a2 = ... = ak = 0
and thus g = y1^0 y2^0 ... yk^0 = 1. This proves B ∩ Y = 1. Hence G = B × Y.

We now prove Y = ⟨y1⟩ × ⟨y2⟩ × ... × ⟨yk⟩. In view of Lemma 28.4(7), we
must only show that {y1, y2, ..., yk} is an independent subset of Y. Suppose
m1, m2, ..., mk are integers with

    y1^m1 y2^m2 ... yk^mk = 1.

Then (By1)^m1 (By2)^m2 ... (Byk)^mk = B,
so m1 = m2 = ... = mk = 0
and y1^m1 = y2^m2 = ... = yk^mk = 1.
Hence {y1, y2, ..., yk} is independent and Y = ⟨y1⟩ × ⟨y2⟩ × ... × ⟨yk⟩.

Finally, since Byi has infinite order, we see that yi also has infinite order
and ⟨yi⟩ is an infinite cyclic group (i = 1, 2, ..., k).

28.13 Theorem: Let G be a finitely generated nontrivial torsion-free
abelian group.

(1) G has a basis, that is, there are elements g1, g2, ..., gr in G\1 such that

    G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩.

(2) The number of elements in a basis of G is uniquely determined by G.
More precisely, if {g1, g2, ..., gr} and {h1, h2, ..., hs} are bases of G, then
r = s. The number of elements in a basis of G is called the rank of G.

(3) Let H be a finitely generated nontrivial torsion-free abelian group.
Then G ≅ H if and only if G and H have the same rank.

Proof: (1) Let G be a nontrivial torsion-free abelian group and assume
that G = ⟨u1, u2, ..., un⟩. We prove the claim by induction on n. If n = 1,
then G = ⟨u1⟩ is a nontrivial cyclic group and the claim is true (with
r = 1, g1 = u1).

Suppose now n ≥ 2 and suppose inductively: if G1 is a nontrivial torsion-
free abelian group generated by a set of m elements, where m ≤ n − 1,
then G1 is a direct product of finitely many cyclic subgroups of G1.

If u1 = 1, then G = ⟨u1, u2, ..., un⟩ = ⟨u2, ..., un⟩ is generated by a set of
n − 1 elements and, by induction, G has a basis. Let us assume therefore
u1 ≠ 1. Then o(u1) = ∞. We put B/⟨u1⟩ := T(G/⟨u1⟩).

For any b ∈ B, the element ⟨u1⟩b of B/⟨u1⟩ has finite order, thus there is
a natural number n with b^n ∈ ⟨u1⟩. Consequently, for any b ∈ B, there
are n ∈ ℕ and m ∈ ℤ such that b^n = u1^m.

We define a mapping σ: B → ℚ by declaring bσ = m/n for any b ∈ B,
where n ∈ ℕ, m ∈ ℤ are such that b^n = u1^m. This mapping is well
defined, for if n´ ∈ ℕ and m´ ∈ ℤ are also such that b^n´ = u1^m´, then
u1^(m´n − n´m) = (u1^m´)^n [(u1^m)^n´]^(−1) = b^n´n (b^nn´)^(−1) = 1, so
m´n − n´m = 0 (because o(u1) = ∞) and m/n = m´/n´.

σ is in fact a homomorphism. To see this, let b, c ∈ B and bσ = m/n,
cσ = m´/n´ (where n, n´ ∈ ℕ, m, m´ ∈ ℤ). Then b^n = u1^m and
c^n´ = u1^m´, so

    (bc)^nn´ = (b^n)^n´ (c^n´)^n = u1^mn´ u1^m´n = u1^(mn´+m´n)

and (bc)σ = (mn´ + m´n)/nn´ = m/n + m´/n´ = bσ + cσ.
Thus σ is a homomorphism.

Since Ker σ = {b ∈ B : bσ = 0/1} = {b ∈ B : b^1 = u1^0} = 1, the
homomorphism σ is one-to-one and σ: B → Im σ is an isomorphism:
B ≅ Im σ.

Claim: if B is finitely generated, then B is cyclic. To prove this, assume
B = ⟨b1, b2, ..., bt⟩ and let biσ = mi/ni (i = 1, 2, ..., t). Using Lemma
28.4(3), we see that Im σ = ⟨b1σ, b2σ, ..., btσ⟩ = ⟨m1/n1, m2/n2, ..., mt/nt⟩
is a subgroup of the additive cyclic group ⟨1/n1n2...nt⟩. Hence Im σ is
cyclic and B is cyclic.

If B = G, then B is finitely generated by hypothesis, so B = G is cyclic and
(1) is proved. We assume therefore B ≠ G. Then

    G/B = ⟨Bu1, Bu2, ..., Bun⟩ = ⟨Bu2, ..., Bun⟩

(see Lemma 28.4(4)) is a nontrivial abelian group, generated by n − 1
elements. Moreover, G/B ≅ (G/⟨u1⟩)/(B/⟨u1⟩) = (G/⟨u1⟩)/T(G/⟨u1⟩) is
torsion-free by Lemma 28.1(2). So, by induction,

    G/B = ⟨Bg2⟩ × ... × ⟨Bgr⟩

with suitable gi ∈ G, where Bgi is distinct from B (i = 2, ..., r). Thus
o(Bgi) = ∞, and this forces o(gi) = ∞ (i = 2, ..., r). Lemma 28.12 yields

    G = B × ⟨g2⟩ × ... × ⟨gr⟩.

We put ⟨g2, ..., gr⟩ = A. Then G = B × A and B ≅ G/A is finitely generated
by Theorem 22.7(2) and Lemma 28.4(4). Hence, by the claim above, B is
cyclic, say B = ⟨g1⟩. Since 1 ≠ u1 ∈ ⟨g1⟩, we have o(g1) ≠ 1, so o(g1) = ∞
and

    G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩.

This completes the proof of (1).

(2) and (3) For convenience, a natural number r will be called a rank of
a finitely generated nontrivial torsion-free abelian group A if A has a
basis of r elements. We cannot say the rank of A, for part (2) is not
proved yet. The claim in part (2) is that all ranks of a finitely generated
nontrivial torsion-free abelian group (arising from different bases) are
equal.

Let G and H be finitely generated nontrivial torsion-free abelian groups,
let r be a rank of G and s be a rank of H, say G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩
and H = ⟨h1⟩ × ⟨h2⟩ × ... × ⟨hs⟩.

If r = s, then ⟨gi⟩ ≅ ℤ ≅ ⟨hi⟩ for i = 1, 2, ..., r and

    G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩ ≅ ⟨h1⟩ × ⟨h2⟩ × ... × ⟨hr⟩ = H

by Lemma 22.16. This proves the "if" part of (3).

Now the "only if" part of (3), which includes (2) as a particular case
(when G = H): we will prove that G H implies r = s. This is easy. Now

G/G2 g1 / g12 g2 / g22 ... gr / gr2 C2 C2 ... C2

is a finite group of order 2r by Lemma 28.8(3). Also

H/H2 h1 / h12 h2 / h22 ... hs / hs2 C2 C2 ... C2

316
is a finite group of order 2s. If G H, then G/G2 H/H2 (Lemma 28.8(4)),
so 2r = G/G2 = H/H2 = 2s. Hence r = s.

28.14 Remark: Theorem 28.13 states essentially that a direct sum
ℤ^r := ℤ ⊕ ℤ ⊕ ... ⊕ ℤ of r copies of ℤ cannot be isomorphic to a direct
sum ℤ^s := ℤ ⊕ ℤ ⊕ ... ⊕ ℤ of s copies of ℤ unless r = s. This is not
obvious: there are many one-to-one mappings from ℤ^r onto ℤ^s, and
there is no a priori reason why one of these mappings should not be an
isomorphism. The proof of ℤ^r ≅ ℤ^s ⟹ r = s does not and cannot
consist in cancelling one ℤ at a time from both sides of ℤ^r ≅ ℤ^s. In
general, it does not follow from A × B ≅ A × C that B ≅ C. As a matter of
fact, there are abelian groups G such that G ≅ G × G × G but G ≇ G × G!

28.15 Theorem: Let G be a finitely generated abelian group. Then T(G)
is a finite group and there is a subgroup I of G such that G = T(G) × I.

Proof: G/T(G) is a finitely generated abelian group (Lemma 28.4(4))
and is torsion-free (Lemma 28.1(2)). Thus either G/T(G) = 1; or
G/T(G) = ⟨T(G)g1⟩ × ⟨T(G)g2⟩ × ... × ⟨T(G)gr⟩ with suitable
g1, g2, ..., gr ∈ G (Theorem 28.13(1)) and therefore
G = T(G) × ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩ (Lemma 28.12). Putting I = 1 in the
first case and I = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩ in the second case, we obtain
G = T(G) × I.

Then T(G) ≅ G/I by Theorem 22.7(2). Since G is finitely generated, so is
G/I (Lemma 28.4(4)), and T(G) is also finitely generated. From Lemma
28.4(2), it follows that T(G) is a finite group.

The subgroup I in Theorem 28.15 is not uniquely determined by G.
However, its rank r(I), which is the rank of G/T(G), is completely deter-
mined by G when G/T(G) ≠ 1. Let us define the rank of the trivial group
1 to be 0 and let us call ∅ a basis of 1. Then the rank of any finitely
generated torsion-free abelian group is the number of elements in a
basis of that group, and r(I) is completely determined by G, also in case
G/T(G) = 1.

As G = T(G) × I, the finitely generated abelian group G is determined
uniquely to within isomorphism by T(G) and I. Now I is determined
uniquely to within isomorphism by the integer r(I) (Theorem 28.13(3)
and the definition r(1) = 0); and T(G), being a finite abelian group
(Theorem 28.15), is determined uniquely to within isomorphism by its
Sylow subgroups (Theorem 28.7(4)). Let s be the number of distinct
prime divisors of |T(G)| (so s = 0 when T(G) = 1). Each one of the s Sylow
subgroups (corresponding to the s distinct prime divisors) is determined
uniquely to within isomorphism by its type (Theorem 28.10(3)). Thus
the finitely generated abelian group G gives rise to the following system
of nonnegative integers.

(i) A nonnegative integer r, namely the rank of G/T(G). Here r = 0
means that G is a finite group. If r ≠ 0, then G = T(G) × I, where I is a
direct product of r cyclic groups of infinite order. The subgroup I itself
is not uniquely determined by G, but its isomorphism type is.

(ii) A nonnegative integer s, namely the number of distinct prime
divisors of |T(G)|. Here s = 0 means that T(G) = 1 and G is a torsion-free
group.

(iii) In case s ≠ 0, a system p1, p2, ..., ps of prime numbers, namely
the distinct prime divisors of |T(G)|; and for each i = 1, 2, ..., s, a positive
integer ti and ti positive integers mi1, mi2, ..., mi,ti, so that
(pi^mi1, pi^mi2, ..., pi^mi,ti) is the type of the Sylow pi-subgroup of T(G).

With this information, G is a direct product of r + t1 + t2 + ... + ts cyclic
subgroups: r of them are infinite cyclic, and (in case s ≠ 0) ti of them
have orders equal to powers of the prime number pi; more specifically,
these ti factors have orders pi^mi1, pi^mi2, ..., pi^mi,ti. Furthermore, two
finitely generated abelian groups are isomorphic if and only if they give
rise to the same system of integers.

Exercises

1. Let G be an abelian group and H ≤ G. Prove that
(a) T(H) = T(G) ∩ H,
(b) T(G)/T(H) ≅ HT(G)/H ≤ T(G/H),
and that HT(G)/H need not be equal to T(G/H).

2. Let G be an abelian group. Show that
(a) if G is finite, then G/G^n ≅ G[n] for all n ∈ ℕ;
(b) if G is infinite, then G/G^n ≅ G[n] need not hold for any
n ∈ ℕ\{1}.

3. Let G be a finite abelian group. The exponent of G is defined to be the
largest number in {o(a) : a ∈ G}, i.e., the largest possible order of the
elements in G. Show that
(a) the exponent of G divides |G|;
(b) for any g ∈ G, o(g) divides the exponent of G;
(c) the exponent of G is the least common multiple of the orders of
the elements in G;
(d) G is cyclic if and only if the exponent of G is |G|.

4. Let G be a finite abelian group and H ≤ G. Let K ≤ G be such that
H ∩ K = 1 and H ∩ L ≠ 1 for any L ≤ G satisfying K < L. Let g ∈ G.
(a) Assume g^p ∈ K for some prime number p. Prove that, if g ∉ K,
then there are h ∈ H, k ∈ K and an integer r relatively prime to p such
that h = kg^r. Conclude that g ∈ HK.
(b) Prove that G = H × K if and only if, for any prime number p and
elements g ∈ G, h ∈ H, k ∈ K such that g^p = hk, there is an element
h´ ∈ H satisfying h = (h´)^p.

5. Let G be a finite abelian group of exponent e and let g ∈ G be of order
e, so that o(g) = e. Put H = ⟨g⟩. Show that G = H × K for some K ≤ G.
(Hint: Use Ex. 4. Consider the cases p | e and p ∤ e separately.)

6. Let G be a nontrivial finite abelian group. Using Ex. 5, prove by
induction on |G| that there are nontrivial elements g1, g2, ..., gr in G such
that G = ⟨g1⟩ × ⟨g2⟩ × ... × ⟨gr⟩ and (in case r ≠ 1) o(gi) divides o(gi+1)
for i = 1, 2, ..., r − 1.

7. Keep the notation of Ex. 6. Prove that the integers o(g1), o(g2), ..., o(gr)
determine the types of the Sylow p-subgroups of G uniquely, and con-
versely the types of the Sylow p-subgroups of G completely determine
the integers o(g1), o(g2), ..., o(gr). (The integers o(g1), o(g2), ..., o(gr) are
called the invariant factors of G. Two finite abelian groups are thus iso-
morphic if and only if they have the same invariant factors.)

8. Find the invariant factors of the finite abelian groups C6 × C9,
C6 × C8 × C15 × C30, C4 × C6 × C15 × C20.
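For experimenting with exercises like Ex. 8, the passage from a direct
product of cyclic groups to invariant factors can be mechanized: collect
the prime-power types per prime (Theorem 28.7), then multiply the
largest prime powers together, then the next largest, and so on. The code
below is our own sketch of this procedure; the names
`prime_power_factors` and `invariant_factors` are not from the text.

```python
from collections import defaultdict

def prime_power_factors(n):
    """Return {p: p^e} for the prime factorization of n (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:
                n //= p
                q *= p
            factors[p] = q
        p += 1
    if n > 1:
        factors[n] = n
    return factors

def invariant_factors(orders):
    """Invariant factors d1 | d2 | ... | dr of C_n1 x ... x C_nk, ascending."""
    by_prime = defaultdict(list)
    for n in orders:
        for p, q in prime_power_factors(n).items():
            by_prime[p].append(q)
    r = max(len(v) for v in by_prime.values())
    result = [1] * r
    for p, powers in by_prime.items():
        powers.sort(reverse=True)          # the largest prime power goes
        for i, q in enumerate(powers):     # into the largest invariant factor
            result[r - 1 - i] *= q
    return result

print(invariant_factors([6, 9]))           # [3, 18]
print(invariant_factors([6, 8, 15, 30]))   # [6, 30, 120]
```

Note that in each output d1 divides d2 divides ..., and the product of the
invariant factors equals the order of the group.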

CHAPTER 3
Rings

§29
Basic Definitions

In the preceding chapter, we have examined groups. Groups are sets


with one binary operation on them. In this chapter, we want to study
sets with two binary operations defined on them. The most fundamental
algebraic structure with two binary operations is called a ring.

29.1 Definition: Let R be a nonempty set and let + and · be two binary
operations defined on R. The ordered triple (R,+,·) is called a ring if the
following conditions (ring axioms) are satisfied.

(i) For all a,b ∈ R, a + b ∈ R.
(ii) For all a,b,c ∈ R, (a + b) + c = a + (b + c).
(iii) There is an element in R, denoted by 0, such that
a + 0 = a for all a ∈ R.
(iv) For each a ∈ R, there is an element in R, denoted by −a,
such that
a + (−a) = 0.
(v) For all a,b ∈ R, a + b = b + a.
(1) For all a,b ∈ R, a·b ∈ R.
(2) For all a,b,c ∈ R, (a·b)·c = a·(b·c).
(D) For all a,b,c ∈ R, there hold
a·(b + c) = a·b + a·c and (b + c)·a = b·a + c·a.

The conditions (i) and (1) assert that two binary operations + and · are
defined on R. We shall refer to + as addition and to · as multiplication.
Further, we shall call the element a + b the sum of a and b, and the
element ab the product of a and b. The conditions (i)-(v) say that R forms
a group with respect to addition. The identity element 0 of this group
will be called the zero element, or simply the zero of R. So 0 is an
element of the set R and not necessarily the number zero. The inverse
element −a of a ∈ R is called the opposite of a.

The condition (2) states that the multiplication on R is associative. The
condition (D) relates the two binary operations + and ·. It is called the
distributivity of multiplication over addition. Here it should be noted
that a·b + a·c stands for (a·b) + (a·c) and similarly b·a + c·a for (b·a) + (c·a).
Notice that there are two equations in (D), and we must check both of
them when we want to show that a given ordered triple (R,+,·) is a ring.
In general, neither of them implies the other, and it is not enough to
check one of them. There are ordered triples (R,+,·) for which all the
conditions above are satisfied, except for one of the equations in (D), and
they fail to be a ring just for that reason.
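A small computation illustrates this. Take the integers with ordinary
addition but with the artificial product a·b := a (a toy operation of our
own, not an example from the text): closure, associativity and the second
equation in (D) hold, yet the first equation fails, so the triple is not a
ring.

```python
# Toy product on the integers: a * b := a (ignore the right factor).
# With ordinary addition this satisfies every ring axiom except the
# first equation in (D), the left distributive law.
def mul(a, b):
    return a

a, b, c = 2, 3, 5
print(mul(a, b + c) == mul(a, b) + mul(a, c))   # False: a != a + a
print(mul(b + c, a) == mul(b, a) + mul(c, a))   # True:  b + c == b + c
print(mul(mul(a, b), c) == mul(a, mul(b, c)))   # True:  associativity holds
```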

For ease of notation, we shall frequently denote multiplication by juxta-
position and thus write ab in place of a·b. Also, we shall write a − b for
a + (−b). Since multiplication in a ring is associative, the products of ele-
ments in a ring are independent of the mode of inserting parentheses
and the usual exponentiation rules are valid (see §8). We shall use the
results of §8 without explicit mention.

29.2 Examples: (a) Let (R,+) be any commutative group, whose
identity element we shall denote as 0. We define a multiplication on R
by declaring

    ab = 0 for all a,b ∈ R.

It is easily seen that (R,+,·) is a ring.

(b) A more interesting ring is (ℤ,+,·), where + and · are the usual addi-
tion and multiplication of integers.

(c) Let 2ℤ denote the set of even integers. Then (2ℤ,+,·), where + and ·
are the usual addition and multiplication of integers, is a ring. In the
same way, if n ∈ ℕ and nℤ is the set of integers divisible by n, then
(nℤ,+,·) is a ring.

(d) (ℚ,+,·), (ℝ,+,·), (ℂ,+,·), (ℤn,+,·) are rings under the usual addition
and multiplication.

(e) Let R := {a/b ∈ ℚ : (a,b) = 1 and 5 ∤ b}. With respect to the usual
addition and multiplication of rational numbers, (R,+,·) is a ring.

(f) Let S := {a/b ∈ ℚ : (a,b) = 1 and 6 ∤ b}. With respect to the usual
addition and multiplication of rational numbers, (S,+,·) is not a ring. The
very first property (i) is not satisfied. For example

    1/2 ∈ S, 1/3 ∈ S, but 1/2 + 1/3 = 5/6 ∉ S.

(g) Let p be a prime number and put T := {a/b ∈ ℚ : (a,b) = 1 and p ∤ b}.
With respect to the usual addition and multiplication of rational num-
bers, (T,+,·) is a ring.

(h) Let R be a ring. A matrix over R is an array

    ( a  b )
    ( c  d )

of four elements a,b,c,d of R, arranged in two rows and two columns and
enclosed within parentheses. The set of all matrices over R will be
denoted by Mat2(R). If A,B ∈ Mat2(R), we say A is equal to B provided the
corresponding entries in A and B are equal, and write A = B in this case.
This is clearly an equivalence relation on Mat2(R).

Let A = ( a  b ; c  d ), B = ( e  f ; g  h ) ∈ Mat2(R), a semicolon separating
the rows. The sum A + B of A and B is defined to be the matrix
( a+e  b+f ; c+g  d+h ) and the product AB of A and B is defined to be the
matrix ( ae+bg  af+bh ; ce+dg  cf+dh ).

The proof of Theorem 17.4 remains valid and shows that Mat2(R) is a
commutative group under addition. The proof of Theorem 17.6(1),(2),(4)
is also valid and establishes the ring axioms (1),(2),(D). So (Mat2(R),+,·)
is a ring.
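The sum and product formulas of this example translate directly into
code. Here is a small model over R = ℤ; the Python rendering, including
the class name `Mat2`, is our own sketch, not part of the text.

```python
# A minimal model of 2x2 matrices over the integers, following the sum
# and product formulas of Example 29.2(h). Entries are stored in row
# order (a, b, c, d) for the matrix ( a b ; c d ).
class Mat2:
    def __init__(self, a, b, c, d):
        self.entries = (a, b, c, d)

    def __add__(self, other):
        a, b, c, d = self.entries
        e, f, g, h = other.entries
        return Mat2(a + e, b + f, c + g, d + h)

    def __mul__(self, other):
        a, b, c, d = self.entries
        e, f, g, h = other.entries
        return Mat2(a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

A = Mat2(1, 1, 0, 1)
B = Mat2(1, 0, 1, 1)
print((A * B).entries)   # (2, 1, 1, 1)
print((B * A).entries)   # (1, 1, 1, 2)
```

Multiplying A and B in both orders gives different results, which already
shows that multiplication in Mat2(ℤ) is not commutative.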

(i) Let K be the set of all real-valued functions defined on the closed
interval [0,1]. We define operations + and · on K by

    (f + g)(x) = f(x) + g(x),  (f·g)(x) = f(x)g(x)  for all x ∈ [0,1]

(f,g ∈ K). So f + g is the function that maps any x ∈ [0,1] to the sum of
the values f(x) and g(x) of the functions f and g at x; and f·g is the
function that maps any x ∈ [0,1] to the product of the values f(x) and
g(x). In "f + g", the sign "+" stands for the binary operation + we just
defined, and in "f(x) + g(x)", the sign "+" stands for the usual addition of
real numbers. It is easily verified that (K,+,·) is a ring. The sum f + g and
the product f·g are said to be defined pointwise. The operations + and ·
are called pointwise addition and pointwise multiplication.

(j) Let S be any set and let (R,+,·) be any ring. Let L denote the set of all
functions from S into R. For f,g ∈ L, we put

    (f + g)(s) = f(s) + g(s),  (f·g)(s) = f(s)g(s)  for all s ∈ S.

On the right, we have the sum (product) of the elements f(s), g(s) in R;
on the left, we have the operations on L. The operations + and · on L are
called pointwise addition and pointwise multiplication. With these
operations, (L,+,·) is a ring.

Let us find the zero elements of the rings in Example 29.2. This is the
identity element of the commutative group R in Example 29.2(a); the
number zero in Examples 29.2(b),(c),(e),(g), and also in Example 29.2(d)
except for ℤn, where the zero element is the residue class 0̄ ∈ ℤn of
0 ∈ ℤ; the so-called zero matrix ( 0  0 ; 0  0 ) ∈ Mat2(R), where 0 is the
zero element of R, in Example 29.2(h). In the ring of Example 29.2(i), the
zero element is the function z: [0,1] → ℝ for which z(x) = 0 for all
x ∈ [0,1]; and in the ring of Example 29.2(j), the zero element is the
function u: S → R for which u(s) = 0 for all s ∈ S.

We make a convention. As in the case of groups, if (R,+,·) is a ring, and if
it is clear from the context what the binary operations + and · are, we
shall call the set R a ring. Hence we shall speak of the ring ℤ instead of
using the more correct but more cumbersome expression "the ring
(ℤ,+,·)", etc.

The addition in a ring has all the desirable properties one could wish for:
it is associative, there is an identity element, all elements possess
inverses, and it is also commutative. As for multiplication, only one of
these properties, namely the associativity, is assumed to be satisfied. It
may happen, of course, that multiplication in a ring has some of these
properties. Then we make the following definitions.

29.3 Definition: A ring R is called a commutative ring if ab = ba for all
a,b ∈ R.

29.4 Definition: A ring R is called a ring with identity if there is an
element e in R such that ae = ea = a for all a ∈ R.

Thus a ring is a commutative ring if the multiplication on it is
commutative. This is a natural definition: since addition is commutative
in any ring, commutativity can refer only to multiplication. ℤ, ℚ, ℝ, ℂ,
2ℤ and ℤₙ are examples of commutative rings. Mat₂(ℤ) is not a
commutative ring because, for instance,

(1 0; 1 1)·(0 0; 1 1) = (0 0; 1 1) ≠ (0 0; 2 1) = (0 0; 1 1)·(1 0; 1 1).
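The noncommutativity of Mat₂(ℤ) is easy to verify mechanically. The
following sketch multiplies the two matrices above both ways; `matmul2`
is our own helper (not a function from the text), with matrices given as
row tuples.

```python
# Check that Mat2(Z) is not commutative, using two concrete matrices.
# matmul2 is a small illustrative helper multiplying 2x2 integer
# matrices given as ((a11, a12), (a21, a22)).

def matmul2(a, b):
    """Multiply 2x2 matrices a and b given as row tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = ((1, 0), (1, 1))
B = ((0, 0), (1, 1))

print(matmul2(A, B))  # ((0, 0), (1, 1))
print(matmul2(B, A))  # ((0, 0), (2, 1)) -- a different matrix
```

The two products differ in the bottom-left entry, so A·B ≠ B·A.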
Likewise, a ring with identity is a ring with a multiplicative identity. The
additive identity exists in any ring anyway. Notice that e in Definition
29.4 must be both a right identity and a left identity. Since multiplica-
tion in a ring is not necessarily commutative, we cannot conclude, say,
from
ae = a for all a ∈ R
that the other condition
ea = a for all a ∈ R
also holds. In the case of groups, we proved that a right identity is also a
left identity, but in the proof we made use of the existence of inverse
elements. We cannot use the same argument in the case of rings, for we
do not know anything about the existence of inverse elements. They
may or may not exist for all a ∈ R. It is possible that a ring R has an
element f such that
af = a for all a ∈ R
but fb ≠ b for some b ∈ R.

In short, R may have a multiplicative right identity which is not a left
identity. If each right identity in a ring fails to be a left identity, then
the ring is not a ring with identity.

These remarks make sense only for noncommutative rings. Of course, in


a commutative ring, any right (left) identity is also a left (right) identity.

A ring may be commutative without having an identity: 2ℤ is an
example. A ring may have an identity without being commutative:
Mat₂(ℤ) is an example. An identity of this ring is the matrix (1 0; 0 1).
More generally, if R is a ring with an identity e, then Mat₂(R) is a ring
with an identity (e 0; 0 e). The proof of Theorem 17.6(3) works here
without change.

29.5 Lemma: Let R be a ring with identity. Then its multiplicative
identity is unique (i.e., there is one and only one element e such that
ea = ae = a for all a ∈ R).

Proof: If e and f are identity elements of R, then e = ef since f is a right


identity and ef = f since e is a left identity, so e = ef = f.

In view of this lemma, we can speak of the identity. We shall follow the
convention of writing 1 for the multiplicative identity of a ring with
identity. 1 is therefore an element of the ring under study, and not
necessarily the number one. For instance, in the ring Mat₂(ℤ), the
element 1 is the matrix (1 0; 0 1), the identity matrix. The ring K of
Example 29.2(i) is a ring with identity, and one checks easily that 1 here
is the function h: [0,1] → ℝ such that h(x) = 1 (real number one) for all
x ∈ [0,1].

What about the existence of multiplicative inverses? Of course the ring


must be a ring with identity if we are to speak about multiplicative
inverses. We will see presently that the additive identity 0 of a ring
cannot have a multiplicative inverse unless the ring is idiosyncratic.

29.6 Lemma: Let R be a ring and 0 its zero element.
(1) a0 = 0 for all a ∈ R.
(2) 0a = 0 for all a ∈ R.
(3) a(−b) = −(ab) for all a,b ∈ R.
(4) (−a)b = −(ab) for all a,b ∈ R.
(5) (−a)b = a(−b) for all a,b ∈ R.
(6) (−a)(−b) = ab for all a,b ∈ R.

Proof: (1) Since 0 is the additive identity of R, we have 0 + 0 = 0. Thus
a(0 + 0) = a0 for all a ∈ R,
a0 + a0 = a0 for all a ∈ R.
By Lemma 7.3(1), a0 must be the identity of the group (R,+). Thus
a0 = 0.

(2) This is proved by the same argument, using 0a + 0a = (0 + 0)a = 0a.

(3) For any a,b ∈ R, we have
0 = a0 = a(b + (−b)) = ab + a(−b).
So a(−b) is the additive inverse of ab. The additive inverse of ab is −(ab)
by definition. Hence a(−b) = −(ab).

(4) For any a,b ∈ R, we have
0 = 0b = (a + (−a))b = ab + (−a)b.
So (−a)b is the additive inverse of ab. The additive inverse of ab is −(ab).
Hence (−a)b = −(ab).

(5) This follows from (3) and (4).

(6) This follows from (5) on writing −b for b and observing −(−b) = b.

29.7 Lemma: Let R be a ring with identity 1. If the zero element 0 of R
has an inverse (i.e., if there is an element t ∈ R such that 0t = t0 = 1),
then R has only one element.

Proof: If r ∈ R, then r = r1 = r(0t) = (r0)t = 0t = 0, so R ⊆ {0}, so R = {0}.

The set {0} can be made into a ring if we define + and . in the only
possible way: 0 + 0 = 0 and 0.0 = 0. This is a commutative ring with
identity, the multiplicative identity being the additive identity 0. This
ring is called the null ring.

29.8 Lemma: Let R be a ring with identity 1. If R is not the null ring,
then 1 ≠ 0.

Proof: If R is not the null ring, then there is an r ∈ R, r ≠ 0. Then the
assumption 1 = 0 leads to the contradiction r = r1 = r0 = 0. So 1 ≠ 0.

Lemma 29.7 states that 0 in a ring cannot possess a multiplicative
inverse unless the ring is the null ring. We now want to show that
divisors of 0 cannot possess a multiplicative inverse, either.

29.9 Definition: Let R be a ring. If a ≠ 0, b ≠ 0 are elements of R such
that ab = 0, then a is called a left zero divisor and b is called a right zero
divisor.

It may very well happen that a ≠ 0, b ≠ 0, but ab = 0 in a ring. For
example, in the ring Mat₂(ℤ) of matrices over ℤ,

(0 0; 0 1) ≠ (0 0; 0 0) = 0 and (1 0; 0 0) ≠ 0,
but (0 0; 0 1)·(1 0; 0 0) = (0 0; 0 0) = 0.

As a second example, consider the ring K of real-valued functions on
[0,1] with respect to pointwise addition and multiplication (Example
29.2(i)). The zero element in this ring is the function ω, where ω(x) = 0
for all x ∈ [0,1]. The functions a and b, where

a(x) = 0 if 0 ≤ x ≤ 1/2,  a(x) = 1 if 1/2 < x ≤ 1,
b(x) = 1 if 0 ≤ x ≤ 1/2,  b(x) = 0 if 1/2 < x ≤ 1,

are thus distinct from ω, but their pointwise product is ω, as
a(x)b(x) = 0 for all x ∈ [0,1].
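The second example can be checked numerically. Below is a sketch: the
function names `a` and `b` mirror the two step functions in the text
(with the jump placed at 1/2, an assumption of this sketch), and the
sample grid is ours.

```python
# Two nonzero functions on [0, 1] whose pointwise product is the zero
# function: a vanishes on [0, 1/2], b vanishes on (1/2, 1].

def a(x):
    return 0 if x <= 0.5 else 1

def b(x):
    return 1 if x <= 0.5 else 0

samples = [i / 100 for i in range(101)]

# a and b are both nonzero as elements of the function ring K ...
assert any(a(x) != 0 for x in samples)
assert any(b(x) != 0 for x in samples)

# ... yet their pointwise product vanishes at every sample point.
assert all(a(x) * b(x) == 0 for x in samples)
```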

In a commutative ring, there is no distinction between right and left
zero divisors. But in a noncommutative ring, an element a ≠ 0 may be a
right zero divisor without being a left zero divisor, and vice versa.

29.10 Lemma: Let R be a ring with identity. If a is a left zero divisor,
then a does not have a multiplicative left inverse. If a is a right zero
divisor, then a does not have a multiplicative right inverse.

Proof: Let 1 be the identity of R. If a is a left zero divisor, then a ≠ 0
and there is a b ≠ 0 in R such that ab = 0. Now if a had a left inverse x,
so that xa = 1, we would obtain b = 1b = (xa)b = x(ab) = x0 = 0, a
contradiction. So a has no left inverse. The second statement is proved
analogously.

We know that the zero element in a ring distinct from the null ring
cannot have an inverse and we understand from Lemma 29.10 that
being a zero divisor is the very opposite of having an inverse. So if we
want a ring to have the property that every nonzero element in it has a
multiplicative inverse, the ring has to be free from zero divisors.

29.11 Definition: A commutative ring with identity, which is distinct


from the null ring, and which has no zero divisors, is called an integral
domain.

29.12 Definition: A ring with identity, which is distinct from the null
ring, and in which every nonzero element has a right inverse, is called a
division ring.

An integral domain is therefore a ring in which we may expect that


nonzero elements have inverses, but nothing is said about the actual ex-
istence of inverses. The necessary condition that zero divisors be absent
is satisfied in an integral domain, and so is commutativity. Whether the
nonzero elements do in fact have inverses is not relevant in the defini-
tion of integral domains.

In a division ring, every nonzero element does have an inverse; more
precisely, a right inverse. But this means that the nonzero elements in a
division ring form a group under multiplication. We know
that, in any group, right inverses are also left inverses and that they are
unique (Lemma 7.3). Hence, in a division ring, every nonzero element
has a left inverse as well, and the right and left inverse of an arbitrary
element coincide. This will be called the inverse of that element.

ℤ is an integral domain. In fact, ℤ is the prototype of all integral
domains. 2ℤ is not an integral domain, because 2ℤ is not a ring with
identity, although 2ℤ is commutative and has no zero divisors. An
example of division rings is given in Ex. 9.

A ring which is both an integral domain and a division ring deserves a


name.

29.13 Definition: A commutative ring with identity, which is distinct
from the null ring, and in which every nonzero element has a
multiplicative inverse, is called a field.

Thus a field is a commutative division ring. Also, a field is an integral


domain in which every nonzero element does have an inverse. A field is
a ring in which the nonzero elements form a commutative group under
multiplication.

ℤ is not a field, since 2 ∈ ℤ, for instance, does not have an inverse in ℤ
(there is no z ∈ ℤ such that 2z = 1). Thus ℤ is an integral domain which
is not a field. The rings ℚ, ℝ, ℂ and ℤₚ (where p is a prime number) are
examples of fields, so Definition 17.1 is consistent with Definition 29.13.
There are fields with finitely many elements as well as with infinitely
many elements.

29.14 Definition: Let R be a ring with identity. An element a of R is
said to be a unit of R if a has both a right inverse and a left inverse in
R. The set of all units in R will be denoted by R×.

For example, the units of ℤ are 1 and −1, so ℤ× = {1, −1}. The units in
ℤₙ are the residue classes ā for which there is a b̄ ∈ ℤₙ such that
ā·b̄ = 1̄, and this holds if and only if (a,n) = 1. Hence
ℤₙ× = {ā ∈ ℤₙ : (a,n) = 1}, as in §11. We know that ℤ× = {1, −1} and
ℤₙ× are groups under multiplication (Theorem 12.4). These are special
cases of the following theorem.
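The gcd description of ℤₙ× lends itself to a short computation. A sketch,
with `units` as our own helper name:

```python
from math import gcd

# Units of Z_n: the residue classes a with gcd(a, n) = 1, as stated above.
def units(n):
    return {a for a in range(1, n) if gcd(a, n) == 1}

print(sorted(units(12)))  # [1, 5, 7, 11]

# The set of units is closed under multiplication mod n, as Theorem
# 29.15 below asserts in general.
U = units(12)
assert all((x * y) % 12 in U for x in U for y in U)
```

For a prime p every nonzero residue is a unit, which is one way to see
that ℤₚ is a field.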

29.15 Theorem: Let R be a ring with identity. Then R× is a group under
multiplication.

Proof: We denote the identity of R by 1. Since 1.1 = 1, we have 1 ∈ R×
and so R× ≠ ∅. We now show that any unit of R has a unique right
inverse, which is also the unique left inverse of that unit. Let a ∈ R×, let
x be any right inverse of a and let y be any left inverse of a. Then
ax = 1 = ya and
y = y1 = y(ax) = (ya)x = 1x = x.
Thus any right inverse of a is equal to y. Hence there is only one right
inverse of a, namely x. Then any left inverse of a is also equal to x.
Hence there is a unique left inverse of a, namely the unique right
inverse x of a.

We check the group axioms.

(i) If a,b ∈ R×, then there are uniquely determined elements
x, z in R with ax = 1 = xa and bz = 1 = zb. From
(ab)(zx) = a(bz)x = a1x = ax = 1, (zx)(ab) = z(xa)b = z1b = zb = 1,
we see that zx is both a right inverse and a left inverse of ab. Hence
ab ∈ R× and R× is closed under multiplication.

(ii) The multiplication on R× is associative since R is a ring.

(iii) Since a1 = a = 1a for all a ∈ R, and since 1 ∈ R×, we see
that 1 is the identity element of R×.

(iv) If a ∈ R×, then there is an x ∈ R with ax = 1 = xa. This x is
in fact an element of R×: it follows from ax = 1 = xa that a is a left and
right inverse of x, so x ∈ R×. So any a ∈ R× has an inverse in R×.

Thus R× is a group under multiplication.

The reader will check easily that, if R is a ring with identity, distinct
from the null ring, then R is a division ring if and only if R× = R\{0}.
Likewise, if K is a commutative ring with identity, distinct from the null
ring, then K is a field if and only if K× = K\{0}.

From now on we will write ℚ×, ℝ×, ℂ× for the multiplicative groups
ℚ\{0}, ℝ\{0}, ℂ\{0} of nonzero rational, real, complex numbers,
respectively.

We conclude this paragraph with the binomial theorem.

29.16 Theorem (Binomial Theorem): Let R be a ring and a,b ∈ R. If
ab = ba, then (a + b)^n = ∑_{k=0}^{n} C(n,k) a^(n−k)b^k, where C(n,k)
denotes the binomial coefficient.

Proof: First we remark that a^n b^0 and a^0 b^n are to be interpreted
as a^n and b^n respectively, even if R has no identity. As usual,
C(n,k) = n!/(k!(n−k)!) and 0! = 1. We use the formula
C(n,k) + C(n,k−1) = C(n+1,k) for 1 ≤ k ≤ n.

We make induction on n. The formula
(a + b)^1 = C(1,0) a^1 b^0 + C(1,1) a^0 b^1 is clearly true. We suppose
that the formula is proved when the exponent of a + b is n. Then

(a + b)^(n+1) = (a + b)(a + b)^n = (a + b) ∑_{k=0}^{n} C(n,k) a^(n−k)b^k

= ∑_{k=0}^{n} C(n,k) a^(n+1−k)b^k + ∑_{k=0}^{n} C(n,k) a^(n−k)b^(k+1)

(here the hypothesis ab = ba is used to move the powers of b past the
powers of a)

= C(n,0) a^(n+1)b^0 + ∑_{k=1}^{n} C(n,k) a^(n+1−k)b^k
  + ∑_{k=0}^{n−1} C(n,k) a^(n−k)b^(k+1) + C(n,n) a^0 b^(n+1)

= C(n+1,0) a^(n+1)b^0 + ∑_{k=1}^{n} C(n,k) a^(n+1−k)b^k
  + ∑_{k=1}^{n} C(n,k−1) a^(n+1−k)b^k + C(n+1,n+1) a^0 b^(n+1)

= C(n+1,0) a^(n+1)b^0 + ∑_{k=1}^{n} [C(n,k) + C(n,k−1)] a^(n+1−k)b^k
  + C(n+1,n+1) a^0 b^(n+1)

= C(n+1,0) a^(n+1)b^0 + ∑_{k=1}^{n} C(n+1,k) a^(n+1−k)b^k
  + C(n+1,n+1) a^0 b^(n+1)

= ∑_{k=0}^{n+1} C(n+1,k) a^(n+1−k)b^k

and the formula is true when the exponent of a + b is n + 1. This
completes the proof.
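The theorem can be spot-checked in a concrete noncommutative ring.
Below, a sketch in Mat₂(ℤ): the matrices A and B = A + 2I commute, so
the expansion must hold. The helper names are ours; `math.comb`
supplies the binomial coefficients C(n,k).

```python
from math import comb

# Verify the binomial theorem for two commuting 2x2 integer matrices.

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def scale(c, a):
    return tuple(tuple(c * a[i][j] for j in range(2)) for i in range(2))

def power(a, n):
    result = ((1, 0), (0, 1))  # identity matrix, playing the role of a^0
    for _ in range(n):
        result = mul(result, a)
    return result

A = ((1, 2), (3, 4))
B = add(A, ((2, 0), (0, 2)))   # B = A + 2I commutes with A
assert mul(A, B) == mul(B, A)

n = 4
lhs = power(add(A, B), n)
rhs = ((0, 0), (0, 0))
for k in range(n + 1):
    rhs = add(rhs, scale(comb(n, k), mul(power(A, n - k), power(B, k))))
assert lhs == rhs  # (A + B)^4 equals the binomial expansion
```

Without the commutativity hypothesis the expansion fails in general,
which is why the theorem assumes ab = ba.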

Exercises

1. Let X = {a + b√2 : a,b ∈ ℤ} and Y = {a + b·∛2 : a,b ∈ ℤ}. Determine
whether X and Y are rings under the usual addition and multiplication
of real numbers.

2. Let (R,+,.) be a ring. On the group (R,+), we define an operation ∘ by
declaring a ∘ b = ba for all a,b ∈ R. Show that (R,+,∘) is a ring (called
the opposite ring of (R,+,.)).

3. On the group ℤ ⊕ ℤ, we define a multiplication by
(a,b).(c,d) = (ac,b)
for all (a,b), (c,d) ∈ ℤ ⊕ ℤ. Does ℤ ⊕ ℤ become a ring with this
multiplication?

4. Show that the set A = {a/b ∈ ℚ : (a,b) = 1, n ∤ b} is not a ring (under
the usual addition and multiplication of rational numbers) if n is a
composite number.

5. Prove that ℤₙ has zero divisors if n is composite, and that ℤₙ is a
field if n is prime.

6. On the group R = ℤ ⊕ ℤ, we define a multiplication by
(a,b).(c,d) = (ac,ad)
for all (a,b), (c,d) ∈ ℤ ⊕ ℤ. Prove that, with this multiplication, R
becomes a ring. Show that (1,0) is a left identity in R, but not a right
identity; and that (1,0) is a right zero divisor, but not a left zero divisor.
Is R a ring with identity?

7. Let R be a ring without identity, and let S = R ⊕ ℤ. On the
commutative group S, we define a multiplication by
(r,a).(r´,b) = (rr´ + ar´ + br, ab)
for all (r,a), (r´,b) ∈ S. Prove that S is a ring with identity.

8. On the group R = ℤₙ ⊕ ℤₙ, we define a multiplication by
(ā,b̄).(c̄,d̄) = (āc̄ − b̄d̄, ād̄ + b̄c̄)
for all (ā,b̄),(c̄,d̄) ∈ R. Show that R is a commutative ring with identity.
Prove that R is a field when n = 3,7,11 and that R is not an integral
domain if n = 5,13,17.

9. Let H = {(a b; −b̄ ā) : a,b ∈ ℂ} ⊆ Mat₂(ℂ), the bar denoting complex
conjugation. Prove that, under the usual matrix addition and
multiplication, H is a division ring (cf. §17, Ex. 14).

10. Let R1,R2, . . . ,Rn be rings. Prove that the group R1 ⊕ R2 ⊕ ... ⊕ Rn
becomes a ring if multiplication is defined by

(r1,r2, . . . ,rn)(s1,s2, . . . ,sn) = (r1s1,r2s2, . . . ,rnsn)

for all (r1,r2, . . . ,rn),(s1,s2, . . . ,sn) ∈ R1 ⊕ R2 ⊕ ... ⊕ Rn. Moreover,
prove that R1 ⊕ R2 ⊕ ... ⊕ Rn is a commutative ring if and only if each
Rk is; and that R1 ⊕ R2 ⊕ ... ⊕ Rn is a ring with identity if and only if
each Rk is. The ring R1 ⊕ R2 ⊕ ... ⊕ Rn is called the direct sum of
R1,R2, . . . ,Rn.

§30
Subrings, Ideals and Homomorphisms

As in the case of groups, we give a name to subsets of a ring which are


themselves rings.

30.1 Definition: Let R be a ring. A nonempty subset S of R is called a


subring of R if S itself is a ring with respect to the operations on R.

Thus a nonempty subset S of a ring R is a subring of R if and only if S
satisfies all the ring axioms in Definition 29.1. As in the case of groups,
we can dispense with some of them.

Let (R,+,.) be a ring and S ⊆ R. If S is a subring of R, then (S,+) is a
commutative group, thus (S,+) is a subgroup of (R,+); and (S,+) is a
subgroup of (R,+) if and only if
(i) a + b ∈ S for all a,b ∈ S,
(ii) −a ∈ S for all a ∈ S,
as we know from Lemma 9.2. Let us now consider multiplication. If
(S,+,.) is to be a ring, then the restriction of the operation . to S must be
a binary operation on S; and this holds if and only if
(1) a.b ∈ S for all a,b ∈ S.
So, if a nonempty subset S of a ring R is a subring of R, then (i),(ii),(1)
hold. Conversely, if S is a nonempty subset of a ring R and (i),(ii),(1)
hold, then (S,+) is a subgroup of (R,+), so (S,+) is a commutative group,
and . is a binary operation on S, and the associativity of multiplication
and the distributivity of multiplication over addition hold in S since
they hold in R. Thus (S,+,.) is a subring of (R,+,.). We have proved the
following lemma.

30.2 Lemma (Subring criterion): Let (R,+,.) be a ring and let S be a
nonempty subset of R. Then (S,+,.) is a subring of R if and only if
(i) a + b ∈ S for all a,b ∈ S,
(ii) −a ∈ S for all a ∈ S,
(iii) a.b ∈ S for all a,b ∈ S.
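The criterion lends itself to a brute-force check in a finite ring such as
ℤ₁₂. A sketch (the helper `is_subring` is our own illustration, with
negatives and products taken mod n):

```python
# Subring criterion of Lemma 30.2, tested exhaustively inside Z_n:
# a nonempty subset S is a subring iff it is closed under addition,
# additive inverses, and multiplication, all taken mod n.

def is_subring(S, n):
    return (len(S) > 0
            and all((a + b) % n in S for a in S for b in S)   # (i)
            and all((-a) % n in S for a in S)                 # (ii)
            and all((a * b) % n in S for a in S for b in S))  # (iii)

evens = {0, 2, 4, 6, 8, 10}
assert is_subring(evens, 12)          # the even residues form a subring
assert not is_subring({0, 1, 2}, 12)  # not closed: 1 + 2 = 3 is missing
```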

30.2´ Examples: (a) {0} and R are subrings of any ring R.

(b) If R is a ring and {Si : i ∈ I} is an arbitrary collection of subrings of
R, then it follows immediately from Lemma 30.2 that ∩i∈I Si is a
subring of R.

(c) If R is a ring and X is a subset of R, the intersection of all subrings of
R that contain X is a subring of R by Example 30.2´(b). It is called the
subring generated by X.

Some properties of multiplication are inherited by subrings.

30.3 Lemma: (1) A subring of a commutative ring is a commutative


ring.
(2) A subring of a noncommutative ring can be commutative.
(3) A subring of a ring with identity can be a ring without identity.
(4) A subring of a ring without identity can be a ring with identity.
(5) A subring of a ring without zero divisors is a ring without zero
divisors.
(6) A subring of a ring with zero divisors can be a ring without zero
divisors.
(7) A subring of a division ring is not necessarily a division ring.
(8) A subring of a field is not necessarily a field.
(9) A subring, distinct from {0}, of an integral domain is an integral
domain if and only if it contains the identity.

Proof: Let R be a ring and S a subring of R.

(1) If R is commutative, then ab = ba for all a,b ∈ R and, a fortiori,
ab = ba for all a,b ∈ S. Hence S is commutative.

(2) Assume R is not commutative. Then there are a,b ∈ R with ab ≠ ba.
The point is that all such pairs a,b may be outside S, and that st = ts
may hold for all s,t ∈ S. For example, R = Mat₂(ℤ) is not commutative,
but S = {(a 0; 0 b) : a,b ∈ ℤ} is a subring of R and S is commutative.


(3) The point is that the identity 1 of R need not belong to S. For
example, ℤ is a ring with identity, 2ℤ is a subring of ℤ and 2ℤ has no
identity.

(4) The point is that there may be an e in S such that es = se = s for all
s in S, but er = re = r need not be true for all r ∈ R, i.e., er0 ≠ r0 or
r0e ≠ r0 for a particular r0 in R. As an example, consider R = ℤ ⊕ ℤ, on
which addition and multiplication are defined by declaring
(a,b) + (c,d) = (a + c, b + d)
(a,b)(c,d) = (ac,ad)
for all (a,b),(c,d) ∈ R, and which is easily verified to be a ring with
respect to these operations. If (a,b) ∈ R is a left identity element of R,
so that (a,b)(x,y) = (x,y) for all (x,y) ∈ R, then (ax,ay) = (x,y) for all
(x,y) ∈ R, thus a = 1. But (1,b) is not a right identity element of R,
because (x,y)(1,b) = (x,xb) ≠ (x,y) for any (x,y) ∈ R with y ≠ xb. Thus R
is a ring without an identity. However, S = {(a,0): a ∈ ℤ} is a subring of
R with an identity (1,0) ∈ S, as (1,0)(a,0) = (a,0) = (a,0)(1,0) for any
(a,0) ∈ S.

(5) If R has no zero divisors, then
for all a,b ∈ R, a ≠ 0 ≠ b ⟹ ab ≠ 0.
But this holds for all a,b ∈ S, too. Hence S has no zero divisors.

(6) If R has zero divisors, it may happen that all zero divisors fall
outside S, and in this case S has no zero divisors. For instance, the ring
R = Mat₂(ℤ) has zero divisors, but its subset S = {(a 0; 0 0) : a ∈ ℤ} is a
subring of R with no zero divisors. For if s,t ∈ S and st = 0, then
s = (a 0; 0 0) and t = (b 0; 0 0) for some a,b ∈ ℤ, and st = 0 means
(a 0; 0 0)(b 0; 0 0) = (0 0; 0 0), which is possible only if a = 0 or b = 0
(in ℤ), that is, only if s = 0 or t = 0 (in S).

(7) and (8) Consider the division ring ℚ, which is a field as well. Its
subring ℤ is neither a division ring nor a field.
(9) A subring S ≠ {0} of an integral domain R is commutative by (1)
and has no zero divisors by (5). Hence S is an integral domain if and
only if S has an identity. We claim S has an identity if and only if the
identity of R belongs to S. Indeed, if S contains the identity element 1R
of R, then of course 1R is an identity element of S. Conversely, if S has
an identity element e, then ee = e = 1R e, so ee − 1R e = 0, so
(e − 1R )e = 0 and, since e ≠ 0 (for S ≠ {0} by assumption) and R has no
zero divisors, e − 1R = 0 and hence e must be equal to 1R .

The claim in the proof of Lemma 30.3(9) is not self-evident. If R is a ring


with identity and S is a subring of R, then it is possible that S is a ring
with identity and the identity of S is distinct from the identity of R. Can
you give some examples?

Just as in the case of groups, we want to define factor rings by subrings.


We take our factor group construction as a model. For a group G and a
subgroup H of G, the factor group G/H is the set of all right cosets of H in
G, on which the multiplication is defined by the rule Ha.Hb = Hab. In
order that this multiplication be well defined, it is necessary and
sufficient that H be normal in G (Theorem 18.4).

Now let R be a ring and S a subring of R. Then R is an abelian group with


respect to addition and S is a subgroup of R. Using our results in group
theory, we build the factor group R/S. This is possible because S is a
normal subgroup of R (any subgroup of an abelian group is normal in
that group). The elements of R/S are the (right or left) cosets r + S,
where r ranges over R. Of course we must write the cosets as r + S or as
S + r, not as rS or as Sr, for the group R is an additive group. We now
wish to define a multiplication on R/S and make R/S into a ring.

The most natural way to define a multiplication on R/S is to put
(r + S)(u + S) = ru + S for all r,u ∈ R.
Let us see if this multiplication is well defined. Once we show that this
multiplication is well defined, it is routine to prove that R/S becomes a
ring with this multiplication. This multiplication is well defined if and
only if the implication
r1 + S = r2 + S, t1 + S = t2 + S ⟹ r1t1 + S = r2t2 + S (for all r1,r2,t1,t2 ∈ R)
holds, and it holds if and only if
r1 = r2 + s1, t1 = t2 + s2, s1,s2 ∈ S ⟹ r1t1 − r2t2 ∈ S (for all r1,r2,t1,t2 ∈ R),
i.e., if and only if
s1,s2 ∈ S ⟹ (r2 + s1)(t2 + s2) − r2t2 ∈ S (for all r2,t2 ∈ R),
i.e., if and only if
s1,s2 ∈ S ⟹ r2s2 + s1t2 + s1s2 ∈ S (for all r2,t2 ∈ R),

that is, since s1s2 ∈ S when s1,s2 ∈ S, if and only if

s1,s2 ∈ S ⟹ rs2 + s1t ∈ S (for all r,t ∈ R) (*)

is true. We dropped the subscripts of r2 and t2.

Assume (*) holds. Then, choosing t = 0, we see rs2 ∈ S whenever r ∈ R,
s2 ∈ S; and choosing r = 0, we see s1t ∈ S whenever s1 ∈ S, t ∈ R.
Conversely, if rs2 ∈ S and s1t ∈ S whenever r ∈ R, s2 ∈ S and s1 ∈ S,
t ∈ R, then rs2 + s1t ∈ S for all r,t ∈ R, s2,s1 ∈ S, since S is a subgroup
of R with respect to addition. Thus (*) is equivalent to, and the
multiplication on R/S is well defined if and only if:
for all s ∈ S, r ∈ R, there hold rs ∈ S and sr ∈ S. (**)

Subrings with this property have a name.

30.4 Definition: A nonempty subset S of a ring R is called an ideal of
R if the following two conditions are satisfied.
(i) S is a subgroup of R under addition.
(ii) For all s ∈ S, r ∈ R, we have rs ∈ S and sr ∈ S.

According to this definition, an ideal of a ring R is a subring of R, since
it is closed under multiplication by (ii). The condition (ii) tells more than
simply that the product of an element in S by an element in S is in S. It
tells that the product of any element in R by any element in S, as well as
the product of any element in S by an element in R, are both in S. Thus
S "swallows" or "absorbs" products by elements in R.

The condition (ii) consists of two subconditions: rs ∈ S and sr ∈ S. In a
commutative ring, these subconditions are identical. But when R is not
commutative, neither of them implies the other in general, and one of
them is not enough to make S an ideal: both of them ought to hold.
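The absorbing condition can be tested exhaustively in a small ring such
as ℤ₁₂; since ℤ₁₂ is commutative, checking rs ∈ S suffices. A sketch,
with `is_ideal` as our own helper name:

```python
# Test the two conditions of Definition 30.4 by brute force in Z_n:
# (i) S is an additive subgroup, (ii) S absorbs products by ring elements.

def is_ideal(S, n):
    subgroup = 0 in S and all((a - b) % n in S for a in S for b in S)
    absorbs = all((r * s) % n in S for r in range(n) for s in S)
    return subgroup and absorbs  # Z_n is commutative, so rs = sr

threes = {0, 3, 6, 9}
assert is_ideal(threes, 12)           # the multiples of 3 form an ideal
assert not is_ideal({0, 3, 4, 8}, 12) # not even an additive subgroup
```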

Definition 30.4 and the discussion preceding it give us the following
theorem (cf. Theorem 18.4).

30.5 Theorem: Let R be a ring and S a subgroup of R under addition.
The multiplication on the set R/S of right (and left) cosets of S, given by
(r + S)(u + S) = ru + S for all r,u ∈ R,
is well defined if and only if S is an ideal of R.

After giving some examples of ideals, we will prove that the multiplica-
tion on R/S makes R/S into a ring.

30.6 Examples: (a) In any ring R, the set {0} is an ideal (Lemma
29.6(1) and (2)). The set R itself is also an ideal of R since R is closed
under multiplication.

(b) In the ring ℤ of integers, 2ℤ is a subring and in fact an ideal of ℤ,
since the product of an even integer by an arbitrary integer is always an
even integer. In the same way, the set nℤ is an ideal of ℤ (n ∈ ℤ).

(c) Let K be the ring of real-valued functions on [0,1] (Example 29.2(i)).
Its subset {f ∈ K: f(1/2) = 0} is an ideal of K. Similarly, when Y is a
subset of [0,1], the subset {f ∈ K: f(y) = 0 for all y ∈ Y} is an ideal of K.

(d) Let T = {a/b ∈ ℚ : (a,b) = 1, p ∤ b} be the ring in Example 29.2(g).
Then its subsets
A = {a/b ∈ ℚ : (a,b) = 1, p ∤ b, p | a} and {a/b ∈ ℚ : (a,b) = 1, p ∤ b, p² | a}
are ideals of T.

(e) ℤ is not an ideal of ℚ, since, for example, 1 ∈ ℤ, 1/2 ∈ ℚ, but
(1/2).1 ∉ ℤ.

(f) Consider the subset S = {(a 0; b 0): a,b ∈ ℤ} of Mat₂(ℤ). Then S is a
subring of Mat₂(ℤ). Also, one sees easily that rs ∈ S for all r ∈ Mat₂(ℤ),
s ∈ S. Nevertheless, S is not an ideal of Mat₂(ℤ), since it is not true that
sr ∈ S for all r ∈ Mat₂(ℤ), s ∈ S: for example (1 0; 1 0) ∈ S,
(1 1; 1 1) ∈ Mat₂(ℤ), but (1 0; 1 0)(1 1; 1 1) = (1 1; 1 1) ∉ S.

Now let S1 = {(c 0; 0 0): c ∈ ℤ} and S2 = {(a 0; 0 b): a,b ∈ ℤ}. It is easy
to see that S1 and S2 are subrings of Mat₂(ℤ) and of course S1 ⊆ S2.
Here S1 is an ideal of S2, because
(a 0; 0 b)(c 0; 0 0) = (c 0; 0 0)(a 0; 0 b) = (ac 0; 0 0) ∈ S1
for any (a 0; 0 b) ∈ S2 and (c 0; 0 0) ∈ S1. On the other hand, S1 is not
an ideal of Mat₂(ℤ), because, for example, (1 0; 0 0) ∈ S1,
(1 1; 0 0) ∈ Mat₂(ℤ) and yet (1 0; 0 0)(1 1; 0 0) = (1 1; 0 0) ∉ S1. Thus
S1 is an ideal of S2 but not an ideal of Mat₂(ℤ). This shows that
"idealness" is not an intrinsic property of a subring. A subring is not
merely an ideal, but an ideal of a ring that has to be clearly specified.
Compare this with Example 18.5(i).

(g) The intersection of ideals in a ring is an ideal. More precisely, if R is
a ring and Si are ideals of R (i ∈ I), then S := ∩i∈I Si is an ideal of R: we
know that S is an additive subgroup of R (Example 9.4(f)), and
whenever r ∈ R, s ∈ S, we have s ∈ Si for all i ∈ I, hence rs ∈ Si and
sr ∈ Si for all i ∈ I, hence rs ∈ S and sr ∈ S, and S is therefore an ideal
of R.

(h) Let R be a ring and X a subset of R. There are ideals of R which
contain X, for example R itself. The intersection of all ideals that contain
X is an ideal of R by Example 30.6(g). This ideal is called the ideal
generated by X. Compare this with Definition 24.1. When X consists of a
single element only, say when X = {a}, the ideal generated by X is said
to be a principal ideal, more exactly the principal ideal generated by a.
It is easy to verify that the principal ideal generated by a is

{za + ua + at + ∑_{i=1}^{n} riasi : z ∈ ℤ, u,t,ri,si ∈ R, n ∈ ℕ}

(cf. Lemma 24.2). If R has an identity, this ideal can be written more
simply as

{∑_{i=1}^{n} riasi : ri,si ∈ R, n ∈ ℕ}.

If R is commutative, the principal ideal generated by a is

{za + ra : z ∈ ℤ, r ∈ R}.

If R is a commutative ring with identity, in particular, if R is an integral
domain,

{ra : r ∈ R} = {ar : r ∈ R}

is the principal ideal generated by a. This is usually written as Ra, or as
aR, or as (a).
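In ℤₙ, a commutative ring with identity, the set {ra : r ∈ ℤₙ} can be
enumerated directly. A sketch (`principal_ideal` is our own label for
the computation):

```python
# The principal ideal (a) = {ra : r in Z_n} in the commutative ring Z_n.

def principal_ideal(a, n):
    return {(r * a) % n for r in range(n)}

# In Z_12, the ideal generated by 8 consists of the multiples of 4.
print(sorted(principal_ideal(8, 12)))  # [0, 4, 8]

# A unit generates the whole ring: 5 is a unit of Z_12.
print(sorted(principal_ideal(5, 12)))  # [0, 1, 2, ..., 11]
```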

30.7 Theorem: Let R be a ring and A an ideal of R. On the set R/A of
right cosets of A in R, we define two operations + and . by
(r + A) + (s + A) = (r + s) + A, (r + A).(s + A) = rs + A
for all r,s ∈ R. With respect to these operations, R/A is a ring.

Proof: The addition on R/A is well defined since A is a normal additive


subgroup of R and the multiplication on R/A is well defined since A is an
ideal of R (Theorem 30.5).

R/A is a commutative group under addition (Theorem 18.7, Lemma


18.9(2)). We must now check the associativity of multiplication and the
distributivity laws.

For all r + A, s + A, t + A ∈ R/A, we have
[(r + A).(s + A)].(t + A) = (rs + A).(t + A)
= (rs)t + A
= r(st) + A
= (r + A).(st + A)
= (r + A).[(s + A).(t + A)],
so multiplication is associative; and we also have
(r + A).[(s + A) + (t + A)] = (r + A).[(s + t) + A]
= r(s + t) + A
= (rs + rt) + A
= (rs + A) + (rt + A)
= (r + A).(s + A) + (r + A).(t + A)
and
[(s + A) + (t + A)].(r + A) = [(s + t) + A].(r + A)
= (s + t)r + A
= (sr + tr) + A
= (sr + A) + (tr + A)
= (s + A).(r + A) + (t + A).(r + A).
Hence R/A is a ring.

30.8 Definition: Let A be an ideal of a ring R. The ring R/A of Theorem
30.7 is called the factor ring of R with respect to A, or the factor ring R
by A, or the factor ring R mod(ulo) A. Other names for R/A are: "quotient
ring", "difference ring", "residue class ring".

30.9 Examples: (a) In the ring ℤ of integers, the multiples nℤ of an
integer n form an ideal, the principal ideal generated by n (Example
30.6(b) and (h)). The factor ring ℤ/nℤ is exactly the ring ℤₙ of integers
mod n.
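For A = nℤ, the coset arithmetic of Theorem 30.7 is exactly arithmetic
mod n. A minimal sketch, representing each coset r + nℤ by its
canonical representative r mod n (`coset` is our own name):

```python
# The factor ring Z/nZ made concrete: (r+A)+(s+A) = (r+s)+A and
# (r+A)(s+A) = rs+A become addition and multiplication of coset
# representatives mod n.

n = 6

def coset(r):
    return r % n  # canonical representative of the coset r + nZ

# Well-definedness: replacing r, s by other members of the same cosets
# (here r + 6 and s + 60) gives the same sum and product coset.
assert coset(5 + 14) == coset((5 + 6) + (14 + 60))
assert coset(5 * 14) == coset((5 + 6) * (14 + 60))
```

That these results do not depend on the chosen representatives is
precisely what Theorem 30.5 guarantees, since nℤ is an ideal.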

(b) Let T and A be as in Example 30.6(d). Then A is an ideal of T and we


can build the factor ring T/A. This factor ring has precisely p elements.
What are they?

(c) Let R be a ring and A an ideal of R. If R is commutative, so is R/A,
for then (r + A).(s + A) = rs + A = sr + A = (s + A).(r + A) for all
(r + A),(s + A) in R/A; and if R is a ring with identity, so is R/A, for if 1
is an identity of R, then 1 + A ∈ R/A is an identity of R/A, because
(r + A).(1 + A) = r1 + A = r + A = 1r + A = (1 + A).(r + A)
for all r + A ∈ R/A.

Ideals are the subrings with respect to which we can build factor rings,
just as normal subgroups are the subgroups with respect to which we
can build factor groups. We know that normal subgroups are exactly the
kernels of homomorphisms. We now show that ideals, too, are the ker-
nels of homomorphisms.

30.10 Definition: Let R and R1 be rings and let φ: R → R1 be a
mapping from R into R1. If
(a + b)φ = aφ + bφ and (ab)φ = aφ.bφ
for all a,b ∈ R, then φ is called a (ring) homomorphism.

The operations on the left hand sides are the operations on R, and
those on the right hand sides are the operations on R1. If the operations
on R1 were denoted by ⊕ and ⊙, the equations would read
(a + b)φ = aφ ⊕ bφ and (ab)φ = aφ ⊙ bφ.

If φ: R → R1 is a ring homomorphism and S is a subring of R, then the
restriction φ|S of φ to S is also a ring homomorphism.

A ring homomorphism is a homomorphism of additive groups which


preserves products as well. This remark enables us to use the properties
of group homomorphisms whenever we investigate ring homomorph-
isms.
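A concrete instance of Definition 30.10: reduction mod n from ℤ to ℤₙ
preserves both sums and products. A brute-force check over a small
range of integers (the names are ours):

```python
# The reduction map Z -> Z_n, r |-> r mod n, is a ring homomorphism:
# it sends sums to sums and products to products. Python's % operator
# returns a representative in {0, ..., n-1} even for negative r.

n = 10

def phi(r):
    return r % n

pairs = [(a, b) for a in range(-20, 21) for b in range(-20, 21)]
assert all(phi(a + b) == (phi(a) + phi(b)) % n for a, b in pairs)
assert all(phi(a * b) == (phi(a) * phi(b)) % n for a, b in pairs)
```

As Theorem 30.14 below shows, this map is a special case of the natural
homomorphism onto a factor ring, here ℤ/10ℤ.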

30.11 Lemma: Let φ: R → R1 be a ring homomorphism.
(1) 0φ = 0.
(2) (−a)φ = −(aφ) for all a ∈ R.
(3) (a1 + a2 + . . . + an)φ = a1φ + a2φ + . . . + anφ for all
a1,a2, . . . ,an ∈ R, n ∈ ℕ, n ≥ 2. (In particular, (na)φ = n(aφ) for all
a ∈ R.)
(4) (a1a2. . . an)φ = a1φ.a2φ. . . . .anφ for all a1,a2, . . . ,an ∈ R, n ∈ ℕ,
n ≥ 2. (In particular, (a^n)φ = (aφ)^n for all a ∈ R.)

Proof: (1),(2),(3) follow immediately from Lemma 20.3, since φ is a
group homomorphism. (4) is proved by the same argument as in the
proof of Lemma 20.3(3).

We now establish the ring theoretical analogues of theorems about


group homomorphisms.

30.12 Theorem: Let φ: R → R1 and ψ: R1 → R2 be ring
homomorphisms. Then the composition mapping
φψ: R → R2
is a ring homomorphism from R into R2.

Proof: We regard φ and ψ as group homomorphisms. We know from
Theorem 20.4 that φψ is an additive group homomorphism. It remains
to show that φψ preserves multiplication. Since
(rs)(φψ) = ((rs)φ)ψ
= (rφ.sφ)ψ
= (rφ)ψ.(sφ)ψ
= r(φψ).s(φψ)
for all r,s ∈ R, φψ does preserve multiplication and hence is a ring
homomorphism.

Since any ring homomorphism φ: R → R1 is a group homomorphism,
we can talk about the image and kernel of φ. Of course

Im φ = {rφ : r ∈ R} ⊆ R1 and Ker φ = {r ∈ R : rφ = 0} ⊆ R.

30.13 Theorem: Let φ: R → R1 be a ring homomorphism. Then Im φ is
a subring of R1 and Ker φ is an ideal of R (cf. Theorem 20.6).

Proof: Im φ is a subgroup of R1 by Theorem 20.6. We must show that
Im φ is closed under multiplication (Lemma 30.2). Let x,y ∈ Im φ. Then
x = rφ, y = sφ for some r,s ∈ R. Then xy = rφ.sφ = (rs)φ is the image,
under φ, of an element of R, namely of rs ∈ R. So xy ∈ Im φ and Im φ is
closed under multiplication. This proves that Im φ is a subring of R1.

Ker φ is a subgroup of R by Theorem 20.6. We must only show that
Ker φ has the "absorbing" property (Definition 30.4). For any r ∈ R and
a ∈ Ker φ, we have aφ = 0 and so
(ra)φ = rφ.aφ = rφ.0 = 0 and (ar)φ = aφ.rφ = 0.rφ = 0
by Lemma 29.6(1),(2). Thus ra ∈ Ker φ and ar ∈ Ker φ. Therefore
Ker φ is an ideal of R.

We prove conversely that every ideal is the kernel of some
homomorphism.

30.14 Theorem: Let R be a ring and let A be an ideal of R. Then
ν: R → R/A
r ↦ r + A
is a ring homomorphism from R onto R/A and Ker ν = A (ν is called the natural or canonical homomorphism).

Proof: The natural mapping ν: R → R/A is a group homomorphism from R onto R/A and Ker ν = A by Theorem 20.12. So we need only show that ν is a ring homomorphism, i.e., that ν preserves multiplication. This follows from the very definition of multiplication in R/A: we have
(rs)ν = rs + A = (r + A)(s + A) = rν·sν
for all r,s ∈ R. So ν is a ring homomorphism.

30.15 Definition: A ring homomorphism φ: R → R1 is called a (ring) isomorphism if it is one-to-one and onto. In this case, we say R is isomorphic to R1 and write R ≅ R1. If R is not isomorphic to R1, we put R ≇ R1.

So a ring isomorphism is a group isomorphism that preserves multiplication. We use the same sign "≅" for isomorphic rings as for isomorphic groups. This should not lead to any confusion. When confusion is likely, we state explicitly whether we mean ring isomorphism or group isomorphism.

30.16 Lemma: Let φ: R → R1 and ψ: R1 → R2 be ring isomorphisms.
(1) φψ: R → R2 is a ring isomorphism.
(2) φ⁻¹: R1 → R is a ring isomorphism.

Proof: (1) We know that φψ is a group isomorphism (Lemma 20.11(1)) and a ring homomorphism (Theorem 30.12), so φψ is a ring isomorphism. This proves (1).

(2) We know that φ⁻¹ is a group isomorphism (Lemma 20.11(2)). We must also show that φ⁻¹ preserves products. For any x,y ∈ R1, we must show (xy)φ⁻¹ = xφ⁻¹·yφ⁻¹. Since φ is onto, there are a,b ∈ R such that aφ = x and bφ = y. Now a and b are unique with this property, for φ is one-to-one, and a = xφ⁻¹, b = yφ⁻¹. This is the definition of the inverse mapping. Since φ is a homomorphism, we have
(ab)φ = aφ·bφ
(ab)φ = xy
ab = (xy)φ⁻¹
xφ⁻¹·yφ⁻¹ = (xy)φ⁻¹
for all x,y ∈ R1. So φ⁻¹: R1 → R is a ring homomorphism and consequently φ⁻¹ is a ring isomorphism.

30.17 Theorem (Fundamental theorem on homomorphisms): Let φ: R → R1 be a ring homomorphism and let ν: R → R/Ker φ be the natural homomorphism.

[Diagrams (a) and (b): the triangles formed by φ: R → R1, ν: R → R/Ker φ and σ: R/Ker φ → R1.]

Then there is a one-to-one ring homomorphism σ: R/Ker φ → R1 such that φ = νσ.

Proof: From Theorem 20.15 and its proof, we know that the mapping
σ: R/Ker φ → R1
r + Ker φ ↦ rφ
is a well defined one-to-one group homomorphism such that φ = νσ. It only remains to check that σ preserves multiplication. For all r,s ∈ R, we have
((r + Ker φ)·(s + Ker φ))σ = (rs + Ker φ)σ = (rs)φ = rφ·sφ = (r + Ker φ)σ·(s + Ker φ)σ,
so σ preserves products and is a ring homomorphism.

30.18 Theorem: Let φ: R → R1 be a ring homomorphism. Then
R/Ker φ ≅ Im φ (ring isomorphism).

Proof: The mapping σ: R/Ker φ → R1, r + Ker φ ↦ rφ, is a one-to-one ring homomorphism (Theorem 30.17) and Im σ = Im φ (see the proof of Theorem 20.16). Thus σ is a one-to-one ring homomorphism onto Im φ and therefore R/Ker φ ≅ Im φ.
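A finite sanity check of this theorem (my own example, not from the text): for the reduction homomorphism from ℤ/12ℤ onto ℤ/4ℤ, the cosets of the kernel correspond one-to-one with the elements of the image, exactly as r + Ker φ ↦ rφ requires.

```python
# First isomorphism theorem, checked for phi: Z/12Z -> Z/4Z, r -> r mod 4
# (well defined because 4 divides 12).
R = range(12)
def phi(r):
    return r % 4

ker = [r for r in R if phi(r) == 0]                          # {0, 4, 8}
cosets = {tuple(sorted((r + k) % 12 for k in ker)) for r in R}
image = {phi(r) for r in R}

assert len(cosets) == len(image) == 4    # R/Ker phi and Im phi have equal size
# phi is constant on each coset, so r + Ker phi -> phi(r) is well defined
assert all(len({phi(x) for x in c}) == 1 for c in cosets)
```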

30.19 Theorem: Let φ: R → R1 be a ring homomorphism from R onto R1.
(1) Each subring S of R with Ker φ ⊆ S is mapped to a subring of R1, which will be denoted by S1.
(2) If S and T are subrings of R and Ker φ ⊆ S ⊆ T, then S1 ⊆ T1.
(3) If S and T are subrings of R containing Ker φ and if S1 ⊆ T1, then S ⊆ T.
(4) If S and T are subrings of R containing Ker φ and if S1 = T1, then S = T.
(5) For any subring U of R1, there is a subring S of R such that Ker φ ⊆ S and S1 = U.
(6) Let S be a subring of R containing Ker φ. Then S is an ideal of R if and only if S1 is an ideal of R1.
(7) If S is an ideal of R containing Ker φ, then R/S ≅ R1/S1.

[Diagram: the lattice of subrings of R between Ker φ and R corresponds to the lattice of subrings of R1 between {0} and R1, with S ↦ S1 and T ↦ T1.]

Proof: (1) As in Theorem 21.1, we put S1 = Im φ_S. By Theorem 30.13, S1 is a subring of R1. (The restriction φ_S of a ring homomorphism φ to a subring S is also a ring homomorphism.)

(2),(3),(4) We regard φ merely as a group homomorphism and apply Theorem 21.1(2),(3),(4).

(5) Let U be a subring of R1. Consider U as an additive subgroup of R1. From Theorem 21.1(5), we know that there is a subgroup S of R, namely
S = {r ∈ R: rφ ∈ U},
with Ker φ ⊆ S and S1 = U. It remains to show that S is a subring of R. We need only check that S is closed under multiplication, and this is easy: if r,s ∈ S, then rφ, sφ ∈ U, then rφ·sφ ∈ U, then (rs)φ ∈ U, then rs ∈ S and S is multiplicatively closed.

(6) Let S be a subring of R, with Ker φ ⊆ S. First we assume that S is an ideal of R and prove that S1 is an ideal of R1. We must show that r1s1 ∈ S1 and s1r1 ∈ S1 for all r1 ∈ R1, s1 ∈ S1. Well, if r1 ∈ R1, s1 ∈ S1 = Im φ_S, then there are r ∈ R with rφ = r1 and s ∈ S with sφ = s1, and so
r1s1 = rφ·sφ = (rs)φ ∈ Im φ_S = S1 since rs ∈ S as S is an ideal of R,
s1r1 = sφ·rφ = (sr)φ ∈ Im φ_S = S1 since sr ∈ S as S is an ideal of R.
This proves that S1 is an ideal of R1 if S is an ideal of R.

Next we suppose S1 is an ideal of R1. By Theorem 30.14, S1 = Ker ν´, where ν´: R1 → R1/S1 is the natural homomorphism. Then φν´: R → R1/S1 is a ring homomorphism (Theorem 30.12) with
Ker φν´ = S, (*)
as follows from (ii) on page 225. By Theorem 30.13, S is an ideal of R.

(7) Assume that S is an ideal of R and S1 is an ideal of R1. From the ring homomorphism φν´: R → R1/S1, we get
R/Ker φν´ ≅ Im φν´ (ring isomorphism)
by Theorem 30.18. Here Ker φν´ = S by (*) and Im φν´ = R1/S1, for φ and ν´ are both onto. Thus
R/S ≅ R1/S1 (ring isomorphism).

30.20 Theorem: Let A be an ideal of R. The subrings of R/A are given by S/A, where S runs through the subrings of R containing A. In other words, for each subring U of R/A, there is a unique subring S of R such that A ⊆ S and U = S/A. When U1 and U2 are subrings of R/A, say with U1 = S1/A and U2 = S2/A, where S1, S2 are subrings of R containing A, then U1 ⊆ U2 if and only if S1 ⊆ S2. Furthermore, S/A is an ideal of R/A if and only if S is an ideal of R. In this case
(R/A)/(S/A) ≅ R/S (ring isomorphism).

Proof: The natural homomorphism ν: R → R/A is onto by Theorem 30.14. Now we may apply Theorem 30.19, which states that any subring of R/A is of the form S1 = Im ν_S = {sν ∈ R/A: s ∈ S} = {s + A ∈ R/A: s ∈ S} = S/A for some subring S of R containing Ker ν = A (notice that S/A is meaningful, for A is an ideal of S when A ⊆ S and S is a subring of R). We know that U1 = Im ν_S1 ⊆ Im ν_S2 = U2 if and only if S1 ⊆ S2 (Theorem 30.19(2),(3)). Finally, S/A = Im ν_S is an ideal of R/A if and only if S is an ideal of R, in which case (R/A)/(S/A) ≅ R/S (Theorem 30.19(6),(7)).

30.21 Theorem: Let R be a ring, S a subring of R and A an ideal of R.
(1) S + A is a subring of R (here S + A denotes {s + a ∈ R: s ∈ S, a ∈ A} ⊆ R in accordance with Definition 19.1).
(2) A is an ideal of S + A, and S ∩ A is an ideal of S.
(3) (S + A)/A ≅ S/(S ∩ A) (ring isomorphism).

Proof: (1) S + A is an additive subgroup of R (Lemma 19.4), and it is also closed under multiplication since
(s + a)(s´ + a´) = ss´ + sa´ + as´ + aa´ ∈ S + A
for all s,s´ ∈ S, a,a´ ∈ A, because then ss´ ∈ S; and sa´, as´, aa´ ∈ A, consequently sa´ + as´ + aa´ ∈ A. So S + A is a subring of R.

(2) A is an ideal of R and a subset of S + A, so, a fortiori, A is an ideal of S + A. Also, S ∩ A is a subgroup of S and, for all a ∈ S ∩ A, s ∈ S,
sa ∈ S and sa ∈ A, so sa ∈ S ∩ A,
as ∈ S and as ∈ A, so as ∈ S ∩ A
since S is closed under multiplication and A is an ideal of R. This shows that S ∩ A is an ideal of S.

(3) We have a ring homomorphism ν_S: S → R/A, the restriction of the natural homomorphism ν: R → R/A. Hence S/Ker ν_S ≅ Im ν_S. From the proof of Theorem 21.3, we know Ker ν_S = S ∩ A and Im ν_S = (S + A)/A. So
(S + A)/A ≅ S/(S ∩ A)
as contended.
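Part (3) can be tested numerically (my own example, not from the text) with R = ℤ, S = 4ℤ, A = 6ℤ: then S + A = 2ℤ and S ∩ A = 12ℤ, and both quotients (S + A)/A and S/(S ∩ A) have three elements.

```python
# Check of (S + A)/A having the same size as S/(S ∩ A), for S = 4Z, A = 6Z,
# working inside a finite window of representatives.
S = {4 * k for k in range(-60, 60)}
A = {6 * k for k in range(-60, 60)}
S_plus_A = {s + a for s in S for a in A}

assert all(x % 2 == 0 for x in S_plus_A)        # S + A consists of even numbers
assert {x % 6 for x in S_plus_A} == {0, 2, 4}   # three cosets of A in S + A
assert {x % 12 for x in S} == {0, 4, 8}         # three cosets of S ∩ A = 12Z in S
```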

Exercises

1. Let R be a ring. The center of R is defined to be the set
Z(R) = {z ∈ R: za = az for all a ∈ R}.
Is Z(R) a subring or an ideal of R?

2. Given a ring R, find Z(Mat2(R)).

3. Prove that, if D is a division ring, then Z(D) is a field.

4. Let R be a ring and b ∈ R. Is the centralizer
C_R(b) := {r ∈ R: rb = br}
of b a subring of R?

5. Let R be a ring with identity. Prove or disprove that Z(R*) = (Z(R))*.

6. Show that, if K is a field, then {0} and K are the only ideals of K.

7. Let D be a division ring. Find all ideals of Mat2(D).

8. Let R be a ring and let End(R) be the set of all ring homomorphisms from R into R. For any φ, ψ ∈ End(R), we define φ + ψ: R → R by
r(φ + ψ) = rφ + rψ.
Show that φ + ψ ∈ End(R) and that (End(R),+,∘) is a ring (∘ is the composition of functions).

9. Let R be a ring and A an ideal of R. Prove that
{r ∈ R: rx ∈ A for all x ∈ R}
is an ideal of R.

10. Let (R,+,·) be a ring. If A,B are nonempty subsets of R, we define AB to be the nonempty subset
{a1b1 + a2b2 + . . . + anbn ∈ R: n ∈ ℕ, ai ∈ A, bi ∈ B}
of R. A subgroup A of (R,+) is called a right (resp. left) ideal of R provided ar ∈ A (resp. ra ∈ A) for all a ∈ A, r ∈ R. Prove that, if A,B,C are arbitrary right (resp. left) ideals of R, then
(a) A + B, AB, A ∩ B are right (resp. left) ideals of R,
(b) (AB)C = A(BC),
(c) A(B + C) = AB + AC and (B + C)A = BA + CA.

11. Let R be a ring. An ideal P of R is said to be prime if P ≠ R and if, for any two ideals A,B of R, the implication
AB ⊆ P ⟹ A ⊆ P or B ⊆ P
is valid (see Ex. 10). Prove the following statements.
(a) Let P be an ideal of R and P ≠ R. If, for any a,b ∈ R,
ab ∈ P ⟹ a ∈ P or b ∈ P,
then P is a prime ideal of R.
(b) Let R be commutative. If P is a prime ideal of R, then
ab ∈ P ⟹ a ∈ P or b ∈ P
for any a,b ∈ R.
(c) {0} is a prime ideal of any integral domain.
(d) Let R be a commutative ring with identity and P an ideal of R. Then P is a prime ideal of R if and only if R/P is an integral domain.

12. Let R be a ring. An ideal (resp. right ideal, resp. left ideal) M of R is said to be a maximal ideal (resp. right ideal, resp. left ideal) of R if M ≠ R and if there is no ideal (resp. right ideal, resp. left ideal) N of R such that M ⊊ N ⊊ R. Prove the following statements.
(a) If R is a commutative ring with identity, then every maximal ideal of R is prime.
(b) If R is a ring with identity, distinct from the null ring, and if M is an ideal of R such that R/M is a division ring, then M is maximal.
(c) If R is a commutative ring with identity and M a maximal ideal of R, then R/M is a field.
(d) Find a noncommutative ring R with identity and a maximal ideal M of R such that R/M is not a division ring.

13. An element a in a ring R is said to be nilpotent if aⁿ = 0 for some n ∈ ℕ. Prove that, if a,b are nilpotent elements in a ring, and if ab = ba, then a + b is also nilpotent.

14. Let R be a commutative ring. Show that the set N of all nilpotent elements in R is an ideal of R and that the factor ring R/N has no nilpotent elements other than 0.

15. Find rings R,S with identities 1_R, 1_S respectively and a ring homomorphism φ: R → S such that 1_Rφ ≠ 1_S.

16. If R,S are rings with identities 1_R, 1_S respectively, and if φ: R → S is a ring homomorphism onto S, prove that 1_Rφ = 1_S.

17. The notation being as in §29, Ex. 7, prove that the mapping r ↦ (r,0) is a one-to-one ring homomorphism from R into S.

§31
Field of Fractions of an Integral Domain

Let D be an integral domain, i.e., a commutative ring with identity, distinct from the null ring, which has no zero divisors. Let a,b,c be elements of D such that
a ≠ 0, ab = ac. (i)
If a had a multiplicative inverse a⁻¹ in D, we could multiply both sides of this equation by a⁻¹ and obtain
b = c. (ii)
But we do not know whether a has an inverse in D and we cannot argue in this way. Nevertheless, it is true that (i) implies (ii) in an integral domain: from (i), we get
ab − ac = 0
a(b − c) = 0,
and, since a ≠ 0 and D has no zero divisors,
b − c = 0
b = c.

Hence the cancellation law holds in an integral domain D just as if the nonzero elements in D had inverses in D, i.e., as if D\{0} were a group under multiplication.

It is the objective of this paragraph to show that any integral domain is in fact a subring of a ring F such that F\{0} is a commutative multiplicative group, i.e., a subring of a field F. We can then say that the nonzero elements in D do have inverses, perhaps not in D, but certainly in F.

First we show that finite integral domains are always fields.

31.1 Theorem: If an integral domain has finitely many elements, then it is a field.

Proof: (cf. Lemma 9.3) Let D be an integral domain with finitely many elements. We are to show that every nonzero element of D has a multiplicative inverse in D.

Let a ∈ D, a ≠ 0. Since D is finite, the elements
a, a², a³, . . . , aⁿ, . . .
of D cannot be all distinct. So there are natural numbers k,l such that aᵏ = aˡ, with k < l, say. We obtain then
aᵏ − aˡ = 0
aᵏ − aᵏaˡ⁻ᵏ = 0
aᵏ(1 − aˡ⁻ᵏ) = 0,
where 1 is the identity of D. Since D has no zero divisors and a ≠ 0, we conclude aᵏ = a·a· . . . ·a ≠ 0, which yields
1 − aˡ⁻ᵏ = 0
aˡ⁻ᵏ = 1
a·aˡ⁻ᵏ⁻¹ = 1.
Thus aˡ⁻ᵏ⁻¹ ∈ D is an inverse of a. So D is a field.
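The proof is constructive: once a power repeats, a^(l−k) = 1 and a^(l−k−1) is the inverse of a. A short Python sketch (my own, for the field ℤ/7ℤ) traces exactly this procedure.

```python
# Finding inverses in the finite integral domain Z/7Z as in the proof:
# list a, a^2, a^3, ... until a power repeats, then read off the inverse.
p = 7
for a in range(1, p):
    powers = [a % p]                              # a^1, a^2, ...
    while powers[-1] * a % p not in powers:
        powers.append(powers[-1] * a % p)
    k = powers.index(powers[-1] * a % p) + 1      # a^k = a^l with k < l
    l = len(powers) + 1
    inv = pow(a, l - k - 1, p)                    # a^(l-k-1) inverts a
    assert a * inv % p == 1
```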

Starting from an integral domain D, we now construct, without any hypothesis on D, a field F which contains D as a subring. This construction is an immediate generalization of the construction of ℚ from ℤ, whose basic moments we recollect: every rational number is a fraction a/b of integers a,b, with b ≠ 0; different fractions can represent the same rational number, in fact a/b = c/d if and only if ad = bc (where a,b,c,d ∈ ℤ and b ≠ 0 ≠ d); the addition of two rational numbers is carried out by writing them with a common denominator and adding the numerators (a/b + c/d = ad/bd + bc/bd = (ad + bc)/bd); the multiplication is carried out by multiplying the numerators and denominators separately (a/b · c/d = ac/bd); an integer a is considered to be equal to the rational number a/1.

All these carry over to the more general case of an arbitrary integral domain D in place of ℤ, and give rise to a field F which is related to D in the same way as ℚ is related to ℤ. The elements of F will be like "fractions" of elements of D. We introduce them in the next two lemmas.
31.2 Lemma: Let D be an integral domain and put
S := {(a,b): a,b ∈ D, b ≠ 0} = D × (D\{0}).
We define a relation ~ on S by declaring
(a,b) ~ (c,d) if and only if ad = bc
for all (a,b), (c,d) ∈ S. Then ~ is an equivalence relation on S.

Proof: (i) For all (a,b) ∈ S, we have (a,b) ~ (a,b) since ab = ba. So ~ is reflexive.

(ii) If (a,b), (c,d) ∈ S and (a,b) ~ (c,d), then
ad = bc
da = cb
cb = da
(c,d) ~ (a,b)
and ~ is symmetric.

(iii) If (a,b), (c,d), (e,f) ∈ S and (a,b) ~ (c,d), (c,d) ~ (e,f), then
ad = bc and cf = de
adf = bcf and bcf = bde
daf = dbe
d(af − be) = 0.
From (c,d) ∈ S, we know d ≠ 0, and, since D has no zero divisors, we obtain af − be = 0. Thus af = be, so (a,b) ~ (e,f) and ~ is transitive.

So ~ is an equivalence relation on S.
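For D = ℤ, the relation of Lemma 31.2 can be verified by brute force on a small sample of pairs (my own check, not from the text):

```python
# (a,b) ~ (c,d) iff ad = bc, tested on all pairs with entries in -3..3, b != 0.
S = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if b != 0]
def rel(p, q):
    return p[0] * q[1] == p[1] * q[0]

assert all(rel(p, p) for p in S)                              # reflexive
assert all(rel(q, p) for p in S for q in S if rel(p, q))      # symmetric
assert all(rel(p, r) for p in S for q in S for r in S
           if rel(p, q) and rel(q, r))                        # transitive
```

Transitivity is the only part that uses the absence of zero divisors, and indeed it would fail in a ring like ℤ/6ℤ.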

31.3 Lemma: Let D be an integral domain, S = D × (D\{0}), and let ~ be the equivalence relation of Lemma 31.2. For (a,b) ∈ S, we designate the equivalence class of (a,b) by [a:b]. Thus [a:b] = {(c,d) ∈ S: (c,d) ~ (a,b)}.

Let F = {[a:b]: (a,b) ∈ S} be the set of all equivalence classes of the elements in S. For all [a:b], [c:d] ∈ F, we put
[a:b] + [c:d] = [ad + bc : bd],
[a:b]·[c:d] = [ac : bd].

Then + and · are well defined operations on F.

Proof: First we remark that, if (a,b), (c,d) ∈ S, then b,d ≠ 0, and so bd ≠ 0 since D has no zero divisors. Thus (ad + bc, bd), (ac, bd) ∈ S and therefore [ad + bc : bd], [ac : bd] ∈ F.

We must show that + and · are well defined operations on F. This means we must show that the implication
[a:b] = [x:y], [c:d] = [z:u] ⟹ [ad + bc : bd] = [xu + yz : yu], [ac : bd] = [xz : yu]
is valid. This implication is equivalent to
(a,b) ~ (x,y), (c,d) ~ (z,u) ⟹ (ad + bc, bd) ~ (xu + yz, yu), (ac, bd) ~ (xz, yu),
which, in turn, is equivalent to
ay = bx, cu = dz ⟹ (ad + bc)yu = bd(xu + yz), ac·yu = bd·xz,
where b,d,y,u ≠ 0. But certainly, when ay = bx, cu = dz, we have
(ad + bc)yu = adyu + bcyu = ay·du + by·cu = bx·du + by·cu = bx·du + by·dz = bd·xu + bd·yz = bd(xu + yz) and ac·yu = ay·cu = bx·dz = bd·xz.

31.4 Theorem: With the notation of Lemma 31.3, (F,+,·) is a field.

Proof: (i) According to Lemma 31.3, + is a binary operation on F.

(ii) + is associative since for any [a:b], [c:d], [e:f] ∈ F, we have
([a:b] + [c:d]) + [e:f] = [ad + bc : bd] + [e:f]
= [(ad + bc)f + (bd)e : (bd)f]
= [a(df) + b(cf + de) : b(df)]
= [a:b] + [cf + de : df]
= [a:b] + ([c:d] + [e:f]).

(iii) [0:1] is a right additive identity since [a:b] + [0:1] = [a1 + b0 : b1] = [a:b] for any [a:b] ∈ F. (Notice that [0:1] = [0:d] for all d ∈ D, d ≠ 0.)

(iv) Any [a:b] ∈ F has a right additive inverse: [−a:b] is the opposite of [a:b], for [a:b] + [−a:b] = [ab + b(−a) : b²] = [0:b²] = [0:1].

(v) + is commutative since for any [a:b], [c:d] ∈ F, we have
[a:b] + [c:d] = [ad + bc : bd] = [cb + da : db] = [c:d] + [a:b].

We proved that (F,+) is a commutative group. We now check the remaining ring axioms.

(1) According to Lemma 31.3, · is a binary operation on F.

(2) · is associative since for any [a:b], [c:d], [e:f] ∈ F, we have
([a:b]·[c:d])·[e:f] = [ac:bd]·[e:f]
= [(ac)e : (bd)f]
= [a(ce) : b(df)]
= [a:b]·[ce:df]
= [a:b]·([c:d]·[e:f]).

(D) For all [a:b], [c:d], [e:f] ∈ F, we have
[a:b]·([c:d] + [e:f]) = [a:b]·[cf + de : df]
= [a(cf + de) : b(df)]
= [acf + ade : bdf]
= [bacf + bade : bbdf] (why?)
= [ac·bf + bd·ae : bd·bf]
= [ac:bd] + [ae:bf]
= [a:b]·[c:d] + [a:b]·[e:f]
and one of the distributivity laws holds in F. We must prove the other distributivity law. We can give an argument similar to the above, but we show presently that · is commutative, and this will give the other distributivity law as a bonus.

We have not yet proved that (F,+,·) is a ring.

(3) · is commutative since for any [a:b], [c:d] ∈ F, we have
[a:b]·[c:d] = [ac:bd] = [ca:db] = [c:d]·[a:b].
As we have already remarked above, this yields the distributivity law we have not checked:
([c:d] + [e:f])·[a:b] = [a:b]·([c:d] + [e:f])
= [a:b]·[c:d] + [a:b]·[e:f]
= [c:d]·[a:b] + [e:f]·[a:b]
for all [a:b], [c:d], [e:f] ∈ F.

We have now proved that (F,+,·) is a commutative ring.

(4) [1:1] is the multiplicative identity because
[a:b]·[1:1] = [a1:b1] = [a:b]
for all [a:b] ∈ F. Since multiplication is commutative, there holds also [1:1]·[a:b] = [a:b] for any [a:b] ∈ F. (Notice that [1:1] = [d:d] for all d ∈ D with d ≠ 0.)

Thus (F,+,·) is a commutative ring with identity. It remains to show that every nonzero element in F has a multiplicative inverse in F.
(5) For all [a:b] ∈ F\{0}, we show that [b:a] is a multiplicative inverse of [a:b]. First of all, since [a:b] ≠ [0:1] in F,
(a,b) is not equivalent to (0,1) in S
a1 ≠ b0
a ≠ 0
(b,a) ∈ S
and [b:a] is an element of F. Secondly, [a:b]·[b:a] = [ab:ba] = [ab:ab] = [1:1] = multiplicative identity of F. Thus [b:a] is a multiplicative inverse of [a:b] ≠ [0:1] in F.

This proves that (F,+,·) is a field.
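For D = ℤ, the field F just constructed is ℚ, and Python's fractions.Fraction realizes the classes [a:b] exactly as in Lemma 31.3 (my own illustration): equality tests ad = bc, addition uses [ad + bc : bd], multiplication [ac : bd], and [b:a] inverts [a:b].

```python
from fractions import Fraction

# (2,4) ~ (1,2) because 2*2 == 4*1, so both pairs give the same class.
assert Fraction(2, 4) == Fraction(1, 2)

a, b, c, d = 1, 6, 3, 4
assert Fraction(a, b) + Fraction(c, d) == Fraction(a*d + b*c, b*d)  # [ad+bc : bd]
assert Fraction(a, b) * Fraction(c, d) == Fraction(a*c, b*d)        # [ac : bd]
assert Fraction(c, d) * Fraction(d, c) == 1                         # [b:a] inverts [a:b]
```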

31.5 Theorem: Let D be an integral domain and let (F,+,·) be the field of Theorem 31.4. Then D is isomorphic to a subring of F.

Proof: Let φ: D → F, a ↦ [a:1]. We demonstrate that φ is a one-to-one ring homomorphism. For all a,b ∈ D,
aφ + bφ = [a:1] + [b:1] = [a1 + 1b : 1·1] = [a + b : 1] = (a + b)φ,
aφ·bφ = [a:1]·[b:1] = [ab : 1·1] = [ab:1] = (ab)φ,
thus φ is a homomorphism. Also
Ker φ = {a ∈ D: aφ = zero element of F}
= {a ∈ D: [a:1] = [0:1]}
= {a ∈ D: (a,1) ~ (0,1)}
= {a ∈ D: a·1 = 1·0}
= {a ∈ D: a = 0}
= {0}
and hence φ is one-to-one. By Theorem 30.18, D is isomorphic to the subring Im φ = Dφ = {[a:1] ∈ F: a ∈ D} of F.

31.6 Definition: Let D be an integral domain. Then the field F of Theorem 31.4 is called the field of fractions or the field of quotients of D.

From now on, we shall write a/b for [a:b]. The elements of F will be called fractions (of elements from D). Furthermore, we identify the integral domain D with its image Dφ under the mapping φ in Theorem 31.5. Thus we write a instead of a/1 and regard D as a subring of F. Then the inverse of b ∈ D ⊆ F is 1/b ∈ F and that of a/b is b/a (here a,b ∈ D, a,b ≠ 0). With these notations, calculations are carried out in the usual way.

Starting from an integral domain D, we constructed the field F of fractions of D. Now this field F is an integral domain, too, and we may repeat our construction and obtain the field of fractions of F, say F1. However, nothing is gained by this repetition, for F1 is not essentially distinct from F.

31.7 Theorem: Let K be a field and let K1 be the field of fractions of K. Then K1 is isomorphic to K.

Proof: From Theorem 31.5, we know that φ: K → K1, a ↦ a/1, is an isomorphism from K onto Kφ = Im φ. We will show that Im φ = K1.

Any element of K1 can be written as a/b, where a,b ∈ K and b ≠ 0. Since K is a field and b ≠ 0, there is an inverse b⁻¹ ∈ K of b in K. Thus ab⁻¹ ∈ K and
a/b = [a:b] = [ab⁻¹:1] = ab⁻¹/1 = (ab⁻¹)φ ∈ Im φ.
This proves K1 ⊆ Im φ, so Im φ = K1 and K1 is isomorphic to K.

Next we show that the field of fractions of an integral domain is the smallest field containing that integral domain.

31.8 Theorem: Let D be an integral domain and F the field of fractions of D. If K is any field that contains D, then K contains a subring isomorphic to F.

Proof: We construct an isomorphism from F onto a subring of K. The elements of F are fractions a/b, where a,b ∈ D and b ≠ 0. Regarded as an element of K, b has an inverse b⁻¹ in K, so ab⁻¹ ∈ K. Let ψ: F → K, a/b ↦ ab⁻¹.

ψ is a well defined mapping, for if a/b = c/d (a,b,c,d ∈ D, b ≠ 0 ≠ d), then ad = bc, so ad·b⁻¹d⁻¹ = bc·b⁻¹d⁻¹, so ab⁻¹ = cd⁻¹, so (a/b)ψ = (c/d)ψ. Now
(a/b + c/d)ψ = ((ad + bc)/bd)ψ = (ad + bc)(bd)⁻¹ = (ad + bc)d⁻¹b⁻¹ = ab⁻¹ + cd⁻¹ = (a/b)ψ + (c/d)ψ
and (a/b · c/d)ψ = (ac/bd)ψ = (ac)(bd)⁻¹ = ac·d⁻¹b⁻¹ = ab⁻¹·cd⁻¹ = (a/b)ψ·(c/d)ψ
for any two fractions a/b, c/d in F, and ψ is a ring homomorphism. Here ψ is one-to-one because Ker ψ = {0}: a/b ∈ Ker ψ implies (a/b)ψ = 0, so ab⁻¹ = 0, so a = ab⁻¹b = 0b = 0, so a/b = 0/b = 0/1 = 0. Hence F is isomorphic to the subring Im ψ of K (Theorem 30.18).

Exercises

1. Let D1 = {a + bi ∈ ℂ: a,b ∈ ℤ}, D2 = {a + 2bi ∈ ℂ: a,b ∈ ℤ} and E = {a + b·∛2 + c·∛4 ∈ ℝ: a,b,c ∈ ℤ}. Show that D1, D2, E are integral domains and describe, as simply as you can, the elements in the fields of fractions of these integral domains.

2. Let R be a commutative ring and let M be a nonempty multiplicatively closed subset of R. For ordered pairs in R × M, we put
(r,m) ~ (r´,m´) if and only if there is an n ∈ M such that n(rm´ − r´m) = 0.
Show that ~ is an equivalence relation on R × M. Denote the equivalence class of (r,m) by r/m and define, on the set M⁻¹R of all equivalence classes, addition and multiplication by
r/m + r´/m´ = (rm´ + mr´)/mm´ and r/m · r´/m´ = rr´/mm´.
Prove that M⁻¹R is a commutative ring with identity under these operations.

3. Keep the notation of Ex. 2. Prove that, if A is an ideal of R, then M⁻¹A = {a/m ∈ M⁻¹R: a ∈ A, m ∈ M} is an ideal of M⁻¹R. If A,B are ideals of R, then M⁻¹(A + B) = M⁻¹A + M⁻¹B and M⁻¹(A ∩ B) = M⁻¹A ∩ M⁻¹B. Does every ideal of M⁻¹R have the form M⁻¹A for some ideal A of R?

4. Let R be a commutative ring with identity and let P be a prime ideal of R (see §30, Ex. 11). Show that M := R\P is a multiplicatively closed subset of R. Prove that M⁻¹R, in the notation of Ex. 2, has a unique maximal ideal (see §30, Ex. 12).

5. Discuss the rings in Example 29.2(e),(g) in the light of Ex. 3 and 4.

§32
Divisibility Theory in Integral Domains

As we have already mentioned, the ring ℤ of integers is the prototype of integral domains. There is a divisibility relation on ℤ*: an integer b is said to be divisible by a nonzero integer a when there is an integer c such that ac = b. Integers with no nontrivial divisors are called prime, and every nonzero integer that is not a unit can be written as a product of prime numbers in a unique way.

We want to investigate whether there are similar results in other


integral domains. More generally, one can ask whether there are similar
results in an arbitrary ring. However, in an arbitrary ring, one has to
distinguish between left divisors and right divisors: if a,b,c are elements
of a ring and ab = c, then a is called a left divisor, and b is called a right
divisor of c. Here a may be a left divisor of c without being a right
divisor of c, and vice versa. Furthermore, the existence of zero divisors
in a ring complicates the theory. For these reasons, in this introductory
book, we confine ourselves to integral domains.

32.1 Definition: Let D be an integral domain and let α, β ∈ D. If α ≠ 0 and if there is a γ ∈ D such that β = αγ, then α is called a divisor or a factor of β and β is said to be divisible by α. We also say α divides β.

We write α|β when α divides β, and α∤β when α ≠ 0 and α does not divide β.

32.2 Lemma: Let D be an integral domain and let α, β, γ, δ, ε, β1, β2, . . . , βs, γ1, γ2, . . . , γs be elements of D.
(1) If α|β, then α|−β, −α|β and −α|−β.
(2) If α|β and β|γ, then α|γ.
(3) If α|β and γ ≠ 0, then αγ|βγ.
(4) If αγ|βγ, then α|β.
(5) If α|β and α|γ, then α|β + γ.
(6) If α|β and α|γ, then α|β − γ.
(7) If α|β and α|γ, then α|βδ + γε.
(8) If α|β1, α|β2, . . . , α|βs, then α|β1γ1 + β2γ2 + . . . + βsγs.
(9) If α ≠ 0, then α|0.
(10) 1|α and −1|α.

* More precisely, on ℤ\{0}.

Proof: The claims are proved exactly as in the proof of Lemma 5.2.

We know that the units 1, −1 of ℤ divide every integer. The same is true of the units in any arbitrary integral domain (see Definition 29.14).

32.3 Lemma: Let D be an integral domain and ε ∈ D. Then ε is a unit of D (i.e., ε ∈ D*) if and only if ε|α for all α ∈ D.

Proof: If ε is a unit, then εδ = 1 for some δ ∈ D; in particular, since D is an integral domain, εδ = 1 ≠ 0 and ε ≠ 0. For any α ∈ D, we have ε(δα) = (εδ)α = 1α = α, with δα ∈ D. Thus ε|α for any α ∈ D. Conversely, if ε|α for all α in D, then ε|1, so εδ = 1 for some δ ∈ D. Thus ε has an inverse in D and ε is a unit of D.

32.4 Definition: Let D be an integral domain and α, β ∈ D. Then β is said to be associate to α if there is a unit ε ∈ D such that β = αε. In this case, we write α ~ β.

32.5 Lemma: Let D be an integral domain. Then ~ is an equivalence relation on D.

Proof: (i) For any α ∈ D, we have α = α1 and 1 is a unit. Hence α ~ α and ~ is reflexive. (ii) If α, β ∈ D and α ~ β, then β = αε for some ε ∈ D*, then α = βε⁻¹ with ε⁻¹ ∈ D* (for D* is a group by Theorem 29.15) and β ~ α, so ~ is symmetric. (iii) If α, β, γ ∈ D and α ~ β, β ~ γ, then β = αε, γ = βε´, where ε, ε´ are units in D. So γ = αεε´, with εε´ ∈ D* (Theorem 29.15), thus α ~ γ and ~ is transitive.

Since ~ is a symmetric relation, it is legitimate to say that α and β are associate when β is associate to α. The alert reader will have noticed that the group D* acts on the set D in the sense of Definition 25.1, and the orbit of any α ∈ D consists of the associates of α (that is, of the elements of D which are associate to α). Lemma 32.5 is thus merely a special case of Lemma 25.5.

For any α ∈ D, the units and associates of α are divisors of α. A divisor of α which is neither a unit nor an associate of α is called a proper divisor of α. An element need not have proper divisors; for instance, a unit has no proper divisors.

The relation α|β holds if and only if the relation α1|β1 holds for any associate α1 of α and for any associate β1 of β. In other words, as far as divisibility is concerned, associate elements play the same role.

32.6 Examples: (a) The theory of divisibility in ℤ was discussed in §5. The units in ℤ are 1 and −1, and the associates of a ∈ ℤ are a and −a. The terminology in this paragraph is consistent with that of §5.

(b) Let D be a field. Then α|β for any α, β ∈ D, α ≠ 0, since α ≠ 0 implies that there is an inverse α⁻¹ of α in D and α(α⁻¹β) = β with α⁻¹β ∈ D. In particular, α|1 for any α ∈ D, α ≠ 0. Hence any nonzero element in D is a unit and any two nonzero elements are associate. The divisibility theory is not very interesting in a field.

(c) Let R = {a/b ∈ ℚ: (a,b) = 1, 5∤b} be the ring of Example 29.2(e). It is easily seen that R is an integral domain. Let us find the units of R. The multiplicative inverse of a/b ∈ R ((a,b) = 1) in ℚ is b/a, and b/a ∈ R if and only if 5∤a. Thus R* = {a/b ∈ R: (a,b) = 1, 5∤b, 5∤a}. The associates of a/b ∈ R are the numbers x/y with (x,y) = 1, where a and x are exactly divisible by the same power of 5.

(d) We put ℤ[i] := {a + bi ∈ ℂ: a,b ∈ ℤ}. One easily checks that ℤ[i] is a subring of ℂ and that ℤ[i] is an integral domain. The elements of ℤ[i] are called gaussian integers (after C. F. Gauss (1777-1855), who introduced them in his investigations about the so-called biquadratic reciprocity law).

Since ℤ[i] is a subring of ℂ, each element α = a + bi in ℤ[i] has a conjugate and a norm. The conjugate of α = a + bi ∈ ℤ[i] is defined to be ᾱ = a − bi in ℤ[i] (a,b ∈ ℤ). Notice that the conjugate of αβ is ᾱβ̄ for any α, β ∈ ℤ[i]. The norm N(α) of α = a + bi ∈ ℤ[i] is defined by N(α) = αᾱ; hence N(a + bi) = a² + b² (a,b ∈ ℤ). Thus N(α) is a nonnegative integer for any α ∈ ℤ[i], and equals 0 if and only if α = 0 + 0i = 0. Moreover, N(αβ) = αβ·ᾱβ̄ = αᾱ·ββ̄ = N(α)N(β) for any α, β ∈ ℤ[i].

Using this, it is easy to determine the units in ℤ[i]. We claim α ∈ ℤ[i] is a unit in ℤ[i] if and only if N(α) = 1. Indeed, if α is a unit in ℤ[i], then αα⁻¹ = 1, then N(α)N(α⁻¹) = 1, where N(α), N(α⁻¹) are positive integers. This forces N(α) = 1, as claimed. Conversely, if N(α) = 1, then αᾱ = 1, where ᾱ ∈ ℤ[i], and this yields α|1, which means α is a unit.

Thus α = a + bi is a unit if and only if N(α) = a² + b² = 1 (here a,b ∈ ℤ), and a² + b² = 1 if and only if a² = 1, b² = 0 or a² = 0, b² = 1. Therefore α is a unit if and only if α = 1, −1, i, −i, so that ℤ[i]* = {1, −1, i, −i}. The associates of α in ℤ[i] are the numbers α, −α, iα, −iα.
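The computation can be confirmed by a small search (my own sketch, not from the text): among Gaussian integers in a box, exactly 1, −1, i, −i have norm 1, and the norm is multiplicative.

```python
# Units of Z[i]: norm N(a+bi) = a^2 + b^2 equals 1 only at the four units.
units = [(a, b) for a in range(-5, 6) for b in range(-5, 6)
         if a * a + b * b == 1]

assert sorted(units) == [(-1, 0), (0, -1), (0, 1), (1, 0)]
# Multiplicativity of the norm on one instance: (1+2i)(3-i) = 5+5i.
assert (1*1 + 2*2) * (3*3 + 1*1) == 5*5 + 5*5
```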

(e) We put ω = (−1 + √3·i)/2. Thus ω = cos(2π/3) + i·sin(2π/3). By de Moivre's theorem, ω² = cos(4π/3) + i·sin(4π/3) = (−1 − √3·i)/2 = ω̄ and ω³ = cos 2π + i·sin 2π = 1. So ω³ − 1 = 0, so (ω − 1)(ω² + ω + 1) = 0. Since ω − 1 ≠ 0, we conclude ω² + ω + 1 = 0, which can also be verified directly. From ω³ = 1, we obtain ω⁴ = ω, whence (ω²)² + ω² + 1 = 0.

We put ℤ[ω] := {a + bω ∈ ℂ: a,b ∈ ℤ}. One easily checks that ℤ[ω] is a subring of ℂ and that ℤ[ω] is an integral domain. The closure of ℤ[ω] under multiplication follows from ω² = −1 − ω:
(a + bω)(c + dω) = ac + adω + bcω + bdω²
= ac + adω + bcω + bd(−1 − ω)
= (ac − bd) + (ad + bc − bd)ω ∈ ℤ[ω]
for all a + bω, c + dω ∈ ℤ[ω]. The ring ℤ[ω] was introduced independently by C. G. J. Jacobi (1804-1851) and by G. Eisenstein (1823-1852) in their investigations about the so-called cubic reciprocity law.

Repeating the proof for ℤ[i], we see that α = a + bω ∈ ℤ[ω] is a unit in ℤ[ω] if and only if N(a + bω) = 1. This is equivalent to a² − ab + b² = 1, so equivalent to 4a² − 4ab + 4b² = 4, so to (2a − b)² + 3b² = 4. The last equation holds if and only if 2a − b = ±2, b = 0 or 2a − b = ±1, b = ±1 (a,b ∈ ℤ). In this way, we get a + bω = ±1, ±ω, ±(−1 − ω) = ∓ω². The units in ℤ[ω] are ±1, ±ω, ±ω²; the associates of α ∈ ℤ[ω] are the numbers ±α, ±ωα, ±ω²α.
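Similarly for ℤ[ω] (my own check, not from the text): the norm form a² − ab + b² takes the value 1 at exactly six pairs, matching the six units ±1, ±ω, ±ω².

```python
# Units of Z[omega]: N(a + b*omega) = a^2 - a*b + b^2 equals 1 at six pairs.
units = [(a, b) for a in range(-5, 6) for b in range(-5, 6)
         if a * a - a * b + b * b == 1]

assert len(units) == 6
# (1,1) is 1 + omega = -omega^2 and (-1,-1) is omega^2, matching the text.
assert set(units) == {(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1)}
```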

(f) We put ℤ[√5 i] := {a + b√5 i ∈ ℂ: a,b ∈ ℤ}. Again, it is easily verified that ℤ[√5 i] is an integral domain, and α ∈ ℤ[√5 i] is a unit if and only if N(α) = 1. Now N(a + b√5 i) = a² + 5b² = 1 if and only if a = ±1, b = 0 (here a,b ∈ ℤ). Thus ±1 are the only units in ℤ[√5 i] and the associates of a number α ∈ ℤ[√5 i] are the numbers ±α.

So far, the divisibility theory in an arbitrary integral domain has been completely analogous to the theory in ℤ, which culminates in the fundamental theorem of arithmetic asserting that every integer, not zero or a unit, can be written as a product of prime numbers in a unique way. We proceed to investigate whether a similar theorem is true in an arbitrary integral domain. First we introduce the counterparts of prime numbers.

32.7 Definition: Let D be an integral domain and α ∈ D. Then α is said to be irreducible if α is neither zero nor a unit, and if, in any factorization α = βγ of α, where β, γ ∈ D, either β or γ is a unit in D. When α is neither zero nor a unit, and when α is not irreducible in D, α is said to be reducible.

An irreducible element in D is therefore one which has no proper divisors. Clearly, when α and β are associates, α is irreducible if and only if β is irreducible. One might expect that such elements be called prime rather than irreducible, but the term "prime" is reserved for another property (Definition 32.20).

We now ask if every nonzero, nonunit element in an integral domain D can be expressed as a product of finitely many irreducible elements (cf. Theorem 5.13). Let us try to argue as in Theorem 5.13. Given α ∈ D (α ≠ 0, α not a unit), α is either irreducible or not. In the former case, α is a product of one irreducible element. In the latter case, α = α1β1 for some suitable proper divisors α1, β1 of α. Here α1 is either irreducible or not. In the former case, α1 is an irreducible divisor of α. In the latter case, α1 = α2β2 for some suitable proper divisors α2, β2 of α1. Here α2 is either irreducible or not. Repeating this procedure, we get a sequence
α = α0, α1, α2, . . . (s)
of elements in D, where αi+1 is a proper divisor of αi (i = 0,1,2, . . . ).

When the sequence (s) stops after a finite number of steps, we obtain an irreducible divisor of α. However, we do not know that the sequence (s) ever terminates. In the case of ℤ, the absolute values of the αi, which are nonnegative integers, get smaller and smaller and, since there are finitely many nonnegative integers less than |α|, the sequence (s) does come to an end. But this argument cannot be extended to the general case, for there is no absolute value concept in an arbitrary integral domain. Let us suppose, however, that there is associated a nonnegative integer d(αi) to each αi in such a way that d(αi+1) < d(αi). If this is possible, we can conclude that the sequence (s) does terminate.

For example, when D is one of ℤ[i], ℤ[ω], ℤ[√5i], we may consider the
norm N(αᵢ) of αᵢ. The norm N(αᵢ) is a nonnegative integer, and also
N(αᵢ₊₁) < N(αᵢ) whenever αᵢ₊₁ is a proper divisor of αᵢ. In fact, with the
norm function, there is a division algorithm in ℤ[i] and in ℤ[ω].

32.8 Theorem: Let α, β be elements of ℤ[i] (resp. of ℤ[ω]), with β ≠ 0.
Then there are two elements κ and ρ in ℤ[i] (resp. in ℤ[ω]) such that

α = κβ + ρ  and  N(ρ) < N(β).

Proof: (Cf. Theorem 5.3; note that κ and ρ are not claimed to be unique.)
The elements α, β of ℤ[i] (resp. of ℤ[ω]) are complex numbers, and β ≠ 0.
Thus α/β ∈ ℂ. Let us write

α/β = x + yi  (resp. α/β = x + yω)

with x, y ∈ ℝ. We want κ to be "approximately equal" to α/β, with an
"error" δ = α/β − κ so small that N(δ) < 1. So we approximate x + yi (resp. x + yω)
by an element κ in ℤ[i] (resp. in ℤ[ω]) as closely as we can. To this end,
we choose integers a, b such that

|x − a| ≤ 1/2,  |y − b| ≤ 1/2

and put κ = a + bi (resp. κ = a + bω). This is possible since the distance
between x and the integer closest to x is less than or equal to 1/2. When x
is half an odd integer, there are two choices for a, and therefore there
can be no hope for uniqueness. In this case, we have in fact |x − a| = 1/2
and the approximation above is the best possible one. The same remarks
apply to y and b.

We now put ρ = α − κβ. Then α = κβ + ρ. It remains to show that N(ρ) < N(β).
We have indeed

N(ρ) = N(α − κβ) = N(β)N(α/β − κ) = N(β)N((x + yi) − (a + bi))
                                   = N(β)N((x − a) + (y − b)i)
[resp. N(ρ) = N(β)N((x + yω) − (a + bω)) = N(β)N((x − a) + (y − b)ω)],

and

N((x − a) + (y − b)i) = (x − a)² + (y − b)² ≤ (1/2)² + (1/2)² = 2/4 < 1,

N((x − a) + (y − b)ω) = (x − a)² − (x − a)(y − b) + (y − b)²
                      ≤ |x − a|² + |x − a||y − b| + |y − b|²
                      ≤ (1/2)² + (1/2)(1/2) + (1/2)² = 3/4 < 1.

Hence N(ρ) < N(β). This completes the proof.
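The rounding step of this proof is easy to mechanize. Below is a minimal
sketch (not from the book) of the division algorithm in ℤ[i], with a
Gaussian integer represented as a pair of ordinary integers; the quotient κ
is found by rounding the real and imaginary parts of α/β to the nearest
integers, exactly as in the proof.

```python
def norm(a, b):
    """N(a + bi) = a^2 + b^2."""
    return a * a + b * b

def divmod_gaussian(a1, a2, b1, b2):
    """Given alpha = a1 + a2*i and beta = b1 + b2*i (beta != 0), return
    (kappa, rho) with alpha = kappa*beta + rho and N(rho) < N(beta),
    following the proof of Theorem 32.8."""
    nb = norm(b1, b2)
    # alpha/beta = (alpha * conj(beta)) / N(beta) = x + y*i, x and y rational
    x_num = a1 * b1 + a2 * b2          # numerator of x over nb
    y_num = a2 * b1 - a1 * b2          # numerator of y over nb
    # nearest integers to x and y (ties broken arbitrarily, as in the proof)
    a = (2 * x_num + nb) // (2 * nb)
    b = (2 * y_num + nb) // (2 * nb)
    # rho = alpha - kappa*beta, computed componentwise
    r1 = a1 - (a * b1 - b * b2)
    r2 = a2 - (a * b2 + b * b1)
    return (a, b), (r1, r2)
```

For instance, dividing 7 + 2i by 2 − 3i gives κ = 1 + 2i and ρ = −1 + i, and
indeed N(−1 + i) = 2 < 13 = N(2 − 3i).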

What happens in ℤ[√5i]? For α, β ∈ ℤ[√5i], β ≠ 0, we write α/β = x + y√5i, with
x, y ∈ ℝ. The best approximation to α/β is given by κ = a + b√5i, where a, b
are integers such that |x − a| ≤ 1/2, |y − b| ≤ 1/2. Putting ρ = α − κβ, we can
conclude only

N(ρ) = N(α − κβ) = N(β)N(α/β − κ) = N(β)N((x − a) + (y − b)√5i)
     = N(β)[(x − a)² + 5(y − b)²] ≤ N(β)[(1/2)² + 5(1/2)²] = (3/2)N(β)        (†)

instead of N(ρ) < N(β) as in ℤ[i], ℤ[ω].

32.9 Theorem: In ℤ[√5i], there are elements α₀, β₀ with β₀ ≠ 0 such that
N(α₀ − κβ₀) ≥ N(β₀) for all κ ∈ ℤ[√5i].

Proof: We choose α₀, β₀ in such a way that equality holds in (†) above.
This will be the case when x and y are half odd integers. So we set
α₀ = 1 + √5i, β₀ = 2. Then, for any κ = a + b√5i ∈ ℤ[√5i] (with a, b ∈ ℤ), we
have

N(α₀ − κβ₀) = N(β₀)N(α₀/β₀ − κ) = N(β₀)N((1 + √5i)/2 − (a + b√5i))
            = N(β₀)[(1/2 − a)² + 5(1/2 − b)²]
            ≥ N(β₀)[(1/2)² + 5(1/2)²] = (3/2)N(β₀) ≥ N(β₀),

as claimed.
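A quick numerical experiment (an illustration, not a proof) confirms the
computation: writing α₀ − κβ₀ = (1 − 2a) + (1 − 2b)√5i, its norm is
(1 − 2a)² + 5(1 − 2b)², and scanning a range of a, b never gets below 6,
while N(β₀) = 4. The scan range below is only illustrative; for |a| or |b|
large the norm only grows.

```python
def norm_z5(a, b):
    """N(a + b*sqrt(5)i) = a^2 + 5*b^2 in Z[sqrt(5)i]."""
    return a * a + 5 * b * b

# norm of alpha0 - kappa*beta0 for kappa = a + b*sqrt(5)i,
# with alpha0 = 1 + sqrt(5)i and beta0 = 2
values = [norm_z5(1 - 2 * a, 1 - 2 * b)
          for a in range(-10, 11) for b in range(-10, 11)]
assert min(values) == 6            # attained at a, b in {0, 1}
assert all(v > 4 for v in values)  # always exceeds N(beta0) = 4
```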

The integral domains on which there is a division algorithm are called
Euclidean domains. The formal definition is as follows.

32.10 Definition: Let D be an integral domain. D is called a Euclidean
domain if there is a function d: D\{0} → ℕ ∪ {0} such that
(i) d(α) ≤ d(αβ) for all α, β ∈ D\{0},
(ii) for any α, β ∈ D\{0}, there are κ, ρ ∈ D satisfying
α = κβ + ρ  and  ρ = 0 or d(ρ) < d(β).

The first condition (i) assures that the d-value of a divisor α of β ∈ D\{0} is
less than or equal to the d-value of β. It follows that d(α) = d(α′) whenever
α and α′ are associates. Using (ii) repeatedly, the analog of the
Euclidean algorithm is seen to be valid, and the last nonzero remainder
is a greatest common divisor. It will be a good exercise for the reader to
prove this result.
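As an illustration (a sketch, not the book's code), here is that Euclidean
algorithm carried out in ℤ[i] with the norm as the function d: repeatedly
replace (α, β) by (β, ρ) until the remainder vanishes. The result is a
greatest common divisor, determined only up to the units 1, −1, i, −i.

```python
def gauss_gcd(a, b):
    """Greatest common divisor in Z[i]; a, b are pairs (re, im) of ints,
    not both (0, 0).  Returns a gcd, unique up to the units 1, -1, i, -i."""
    def norm(z):
        return z[0] * z[0] + z[1] * z[1]

    def remainder(u, v):
        # rho = u - kappa*v, kappa the nearest Gaussian integer to u/v
        n = norm(v)
        x = u[0] * v[0] + u[1] * v[1]     # Re(u * conj(v))
        y = u[1] * v[0] - u[0] * v[1]     # Im(u * conj(v))
        p = (2 * x + n) // (2 * n)        # nearest integer to x/n
        q = (2 * y + n) // (2 * n)        # nearest integer to y/n
        return (u[0] - p * v[0] + q * v[1],
                u[1] - p * v[1] - q * v[0])

    while b != (0, 0):
        a, b = b, remainder(a, b)
    return a                              # last nonzero remainder
```

For example, gauss_gcd((4, 2), (2, 0)) returns (2, 0), i.e. 2 divides 4 + 2i;
and gauss_gcd((3, 5), (2, 3)) returns the unit (0, 1) = i, so 3 + 5i and
2 + 3i are relatively prime.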

ℤ is a Euclidean domain, with the absolute value function working as the
function d of Definition 32.10. This follows from Theorem 5.3, with b
replaced by |b|. Also, ℤ[i] and ℤ[ω] are Euclidean domains, with the norm
function working as the function d of Definition 32.10, as Theorem 32.8
shows. On the other hand, we do not yet know whether ℤ[√5i] is a
Euclidean domain. It does not follow from Theorem 32.9 that ℤ[√5i] is
not Euclidean. From Theorem 32.9, it follows only that either ℤ[√5i] is
not Euclidean, or ℤ[√5i] is a Euclidean domain with a function d that is
necessarily distinct from the norm function.

In a Euclidean domain D, the sequence (s) terminates after a finite number
of steps. We shall prove a more general statement (Theorem 32.14).
Recall that {βα ∈ D: β ∈ D} = Dα, where α ∈ D, is the principal ideal
generated by α (Example 30.6(h)).

32.11 Theorem: If D is a Euclidean domain, then every ideal of D is a
principal ideal.

Proof: Let D be a Euclidean domain, and let d be the function of Definition
32.10. For any ideal A of D, we must find an α such that A = Dα. We
argue as in Theorem 5.4.

When A = {0}, we clearly have A = D0, and the claim is true. Assume
now A ≠ {0}. Then U = {d(β) ∈ ℕ ∪ {0}: β ∈ A, β ≠ 0} is a nonempty
subset of the set of nonnegative integers. Let m be the smallest integer
in U. Then m = d(α) for some α ∈ A, α ≠ 0; and d(α) ≤ d(β) for all β ∈ A,
β ≠ 0.

We show that A = Dα. First we have Dα ⊆ A, because α ∈ A and A has the
"absorbing" property. To prove A ⊆ Dα, take an arbitrary β from A. There
are κ, ρ ∈ D such that

β = κα + ρ,  ρ = 0 or d(ρ) < d(α),

provided β ≠ 0 (Definition 32.10). Now α ∈ A, so κα ∈ A, and since β ∈ A
as well, we see that ρ = β − κα ∈ A. Here d(ρ) < d(α) is impossible, for then d(ρ)
in U would be less than m, which is the smallest number in U. Hence
necessarily ρ = 0 and β = κα ∈ Dα. This shows β ∈ Dα for all β ∈ A, provided
β ≠ 0. Since 0 = 0α ∈ Dα as well, we get A ⊆ Dα. Thus A = Dα.

32.12 Definition: An integral domain D is called a principal ideal
domain if every ideal of D is a principal ideal.

With this terminology, Theorem 32.11 can be reformulated as follows.

32.11 Theorem: Every Euclidean domain is a principal ideal domain.

In any integral domain, β | α if and only if Dα ⊆ Dβ, and α, β are
associates if and only if Dα = Dβ. Thus the sequence (s) gives rise to the chain

Dα = Dα₀ ⊂ Dα₁ ⊂ Dα₂ ⊂ . . .

of principal ideals in D. The sequence (s) breaks off if and only if this
chain of ideals breaks off. For principal ideal domains, this is always
the case.

32.13 Definition: Let D be an integral domain. D is said to satisfy the
ascending chain condition (ACC) if, for every chain

A₀ ⊆ A₁ ⊆ A₂ ⊆ A₃ ⊆ . . .

of ideals in D, there is an index k such that Aₘ = Aₖ for all m ≥ k; or,
what is the same, if every chain

B₀ ⊂ B₁ ⊂ B₂ ⊂ B₃ ⊂ . . .

of properly ascending ideals in D consists of finitely many terms. An
integral domain satisfying the ascending chain condition is also called a
noetherian domain (in honor of Emmy Noether (1882-1935)).

32.14 Theorem: Every principal ideal domain satisfies the ascending
chain condition (is noetherian).

Proof: Let D be a principal ideal domain and let

A₀ ⊆ A₁ ⊆ A₂ ⊆ A₃ ⊆ . . .

be a chain of ideals of D. We must show there is an integer k such that
Aₘ = Aₖ for all m ≥ k. To this end, we put B := ⋃ᵢ Aᵢ. We claim B is an
ideal of D. Indeed, if β, γ ∈ B, then β ∈ Aⱼ, γ ∈ Aₗ for some indices j, l.
Assuming j ≤ l without loss of generality, we have Aⱼ ⊆ Aₗ. Since Aₗ is an
ideal of D, we have β + γ ∈ Aₗ, so β + γ ∈ B. Also −β ∈ Aⱼ, so −β ∈ B. This
shows that B is a subgroup of D under addition. Finally, if δ is an
arbitrary element of D, then δβ ∈ Aⱼ, since Aⱼ is an ideal, so δβ ∈ B.
Hence B is an ideal of D.

Since D is a principal ideal domain, B = Dβ for some β ∈ D. As β = β1 ∈ Dβ
= B = ⋃ᵢ Aᵢ, we see β ∈ Aₖ for some k. We claim Aₘ = Aₖ for all m ≥ k.
We know that Aₖ ⊆ Aₘ for all m ≥ k because each ideal in the chain is
contained in the next one (the chain is ascending). On the other hand,
for any m ≥ k, we have Aₘ ⊆ ⋃ᵢ Aᵢ = B = Dβ ⊆ Aₖ, because β ∈ Aₖ and Aₖ
is an ideal of D. Thus Aₘ = Aₖ for all m ≥ k and D satisfies the ascending
chain condition.

Using Theorem 32.14, we shall prove the analog of Theorem 5.13 for an
arbitrary principal ideal domain.

32.15 Theorem: Let D be a principal ideal domain. Then every element
of D that is neither zero nor a unit can be expressed as a product of
finitely many irreducible elements of D.

Proof: First we prove that every element in D, which is neither zero
nor a unit, has an irreducible divisor in D. Let α ∈ D, α ≠ 0, α not a unit.
Arguing as on page 366, we get a sequence

α = α₀, α₁, α₂, . . .        (s)

of elements in D, where αᵢ₊₁ is a proper divisor of αᵢ (i = 0,1,2, . . . ). In
particular, none of the αᵢ is a unit. This sequence gives rise to the
ascending chain

Dα = Dα₀ ⊂ Dα₁ ⊂ Dα₂ ⊂ . . .

of ideals of D (here Dαᵢ ⊂ Dαᵢ₊₁ because αᵢ₊₁ is a proper divisor of αᵢ).
Since D is noetherian (Theorem 32.14), this chain breaks off: the chain
consists only of the ideals

Dα = Dα₀ ⊂ Dα₁ ⊂ Dα₂ ⊂ . . . ⊂ Dαₖ,

say. Hence

α = α₀, α₁, α₂, . . . , αₖ

are the only elements in the sequence (s). We claim that αₖ is irreducible
in D. Otherwise, there would be proper divisors αₖ₊₁ and βₖ₊₁ of αₖ with
αₖ = αₖ₊₁βₖ₊₁, and the sequence (s) would contain the term αₖ₊₁ after αₖ,
and would not terminate with the term αₖ, a contradiction. Hence αₖ is
an irreducible divisor of α. We have proved that every element in D, which
is neither zero nor a unit, has an irreducible divisor in D.

Let α be an arbitrary nonzero, nonunit element in D. We want to show
that α can be written as a product of finitely many irreducible elements
in D. By what we proved above, we know that α has an irreducible
divisor, π₁ say. We put α = π₁γ₁. Here γ₁ ≠ 0. If γ₁ is not a unit, then γ₁
has an irreducible divisor, π₂ say. We put γ₁ = π₂γ₂. Thus α = π₁π₂γ₂. Here
γ₂ ≠ 0. If γ₂ is not a unit, then γ₂ has an irreducible divisor, π₃ say. We
put γ₂ = π₃γ₃. Thus α = π₁π₂π₃γ₃. Continuing in this way, we get a
sequence

α = γ₀, γ₁, γ₂, γ₃, . . .

of elements in D inducing a chain

Dα = Dγ₀ ⊂ Dγ₁ ⊂ Dγ₂ ⊂ Dγ₃ ⊂ . . .

of ideals of D. By Theorem 32.14, this chain is finite, for example

Dα = Dγ₀ ⊂ Dγ₁ ⊂ Dγ₂ ⊂ Dγ₃ ⊂ . . . ⊂ Dγₖ.

We claim γₖ is a unit. Otherwise γₖ would have an irreducible divisor
πₖ₊₁ and, when we put γₖ = πₖ₊₁γₖ₊₁, there would be, in the chain, an
additional ideal Dγₖ₊₁ containing Dγₖ properly, a contradiction. Thus γₖ is
a unit. Then

α = π₁π₂π₃ . . . πₖ₋₁πₖγₖ = π₁π₂π₃ . . . πₖ₋₁(πₖγₖ)

is a product of the irreducible elements π₁, π₂, π₃, . . . , πₖ₋₁, πₖγₖ.
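The peeling argument of this proof can be followed line by line in the
familiar principal ideal domain ℤ. The sketch below (not from the book)
splits off a smallest irreducible (= prime) divisor at each step; the loop
terminating is exactly the ascending chain condition at work.

```python
def smallest_irreducible_divisor(n):
    """Smallest irreducible (prime) divisor of an integer n > 1."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n   # n itself is irreducible

def factor_into_irreducibles(n):
    """Express n > 1 as a product of irreducibles, as in Theorem 32.15."""
    factors = []
    gamma = n                  # gamma_0 = n
    while gamma != 1:          # stop when gamma_k is a unit
        p = smallest_irreducible_divisor(gamma)
        factors.append(p)      # n = p_1 p_2 ... p_i * gamma_i
        gamma //= p
    return factors
```

For instance, factor_into_irreducibles(360) yields [2, 2, 2, 3, 3, 5].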

Having established the analog of Theorem 5.13, we proceed to work out
the counterpart of Euclid's lemma (Lemma 5.15). For this we need the
notion of greatest common divisor.

32.16 Definition: Let D be an arbitrary integral domain and let α, β ∈ D,
not both zero. An element δ of D is called a greatest common divisor of
α and β if
(i) δ | α and δ | β,
(ii) for all δ₁ in D, if δ₁ | α and δ₁ | β, then δ₁ | δ.

Notice that any associate of δ above satisfies the same conditions and
hence any associate of a greatest common divisor of α and β is also a
greatest common divisor of α and β. It is seen easily that any two greatest
common divisors of α and β, if α and β have a greatest common divisor
at all, are associates. So a greatest common divisor of α and β is not
uniquely determined and we have to say a greatest common divisor, not
the greatest common divisor.

We let (α, β) stand for any greatest common divisor of α and β. Thus (α, β)
is determined uniquely only to within ambiguity among associate elements.

Although we defined a greatest common divisor of two elements in an
integral domain, this does not mean, of course, that any two elements
(not both zero) in that domain do have a greatest common divisor.
Introducing a definition does not create the definiendum. Given two elements
(not both zero) in an integral domain, we cannot assert that they
have a greatest common divisor. As a matter of fact, in an arbitrary
integral domain, not every pair of elements (not both zero) has a
greatest common divisor. For the special class of principal ideal domains,
however, the following theorem holds.

32.17 Theorem: Let D be a principal ideal domain and let α, β be arbitrary
elements in D, not both of them being zero. Then there is a greatest
common divisor δ of α and β in D. Furthermore, if δ is a greatest common
divisor of α and β, then there are σ, τ ∈ D such that δ = σα + τβ.

Proof: (cf. Theorem 5.4.) As in the proof of Theorem 5.4, we consider the
set A := {λα + μβ: λ, μ ∈ D}. A is a nonempty subset of D. We claim that A is
an ideal of D. To prove this, let γ₁, γ₂ be arbitrary elements of A. Then
γ₁ = λ₁α + μ₁β, γ₂ = λ₂α + μ₂β for some λ₁, λ₂, μ₁, μ₂ ∈ D. Hence

γ₁ + γ₂ = (λ₁α + μ₁β) + (λ₂α + μ₂β) = (λ₁ + λ₂)α + (μ₁ + μ₂)β ∈ A,
−γ₁ = −(λ₁α + μ₁β) = (−λ₁)α + (−μ₁)β ∈ A

and therefore A is a subgroup of D under addition. Also, for any ν ∈ D,

νγ₁ = ν(λ₁α + μ₁β) = (νλ₁)α + (νμ₁)β ∈ A

and thus A has the "absorbing" property as well. So A is an ideal of D.

Since α, β are not both equal to zero, A ≠ {0}. Now D is a principal ideal
domain, so A = Dδ for some δ ∈ D, and A ≠ {0} implies δ ≠ 0. Also, since
δ = δ1 ∈ Dδ = A, there are σ₀, τ₀ ∈ D with δ = σ₀α + τ₀β. We prove now that
δ is a greatest common divisor of α and β.

(i) α = 1α + 0β ∈ A = Dδ, so α = λδ for some λ ∈ D. Since δ ≠ 0, we can
write δ | α. Likewise δ | β.

(ii) If δ₁ ∈ D and δ₁ | α, δ₁ | β, then δ₁ | σ₀α + τ₀β, hence δ₁ | δ.

Thus δ is a greatest common divisor of α and β, and δ = σ₀α + τ₀β for some
σ₀, τ₀ ∈ D. Since any other greatest common divisor of α and β is an
associate εδ of δ (ε a unit), and εδ = (εσ₀)α + (ετ₀)β, the proof is complete.

In a principal ideal domain D, we see that Dα + Dβ = D(α, β) whenever
α, β ∈ D are not both zero. Either from this remark, or better from Definition
32.16, it follows that (α, β) ~ (β, α). When (α, β) ~ 1, we say α is
relatively prime to β, or α and β are relatively prime. In this case, there
are σ, τ in D with σα + τβ = 1.
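In ℤ, elements σ and τ with σα + τβ = (α, β) can be computed explicitly by
the extended Euclidean algorithm. The following sketch is not the book's
construction (the proof of Theorem 32.17 is non-constructive); it only
illustrates the statement.

```python
def extended_gcd(alpha, beta):
    """Return (delta, sigma, tau) with delta = sigma*alpha + tau*beta
    and delta a greatest common divisor of alpha and beta in Z."""
    if beta == 0:
        return alpha, 1, 0
    delta, s, t = extended_gcd(beta, alpha % beta)
    # delta = s*beta + t*(alpha - (alpha//beta)*beta)
    #       = t*alpha + (s - (alpha//beta)*t)*beta
    return delta, t, s - (alpha // beta) * t
```

E.g. extended_gcd(12, 18) returns (6, -1, 1): indeed (−1)·12 + 1·18 = 6.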

32.18 Lemma: Let D be a principal ideal domain and α, β, γ, δ ∈ D.
(1) If α | βγ and (α, β) ~ 1, then α | γ.
(2) If α is irreducible and α ∤ β, then (α, β) ~ 1.

Proof: (1) If (α, β) ~ 1, we have σα + τβ = 1 for some σ, τ ∈ D. Hence σαγ +
τβγ = γ. Now α | α and α | βγ, so α | σαγ and α | τβγ, so α | σαγ + τβγ, so α | γ.

(2) Let α be irreducible. Then α ≠ 0. So (α, β) exists by Theorem 32.17. Let
δ be a greatest common divisor of α and β. Then δ | α. Since α is irreducible,
either δ is an associate of α, or δ is a unit. In the first case, we
get α | β (since δ | β), against our hypothesis α ∤ β. Thus δ is a unit and δ ~ 1,
as claimed.

32.19 Lemma: Let D be a principal ideal domain and π, β, γ ∈ D. If π is
irreducible and π | βγ, then π | β or π | γ.

Proof: If π | β, the lemma is true. If π ∤ β, then (π, β) ~ 1 by Lemma
32.18(2) and so π | γ by Lemma 32.18(1), with π in place of α.

32.20 Definition: Let D be an arbitrary integral domain. If π ∈ D is not
zero or a unit, and if π has the property that

for all β, γ ∈ D,  π | βγ implies π | β or π | γ,

then π is called a prime element in D.

In an arbitrary integral domain, all prime elements are irreducible. Indeed,
let π be prime. Then π is not zero or a unit by definition. We show
that π has no proper divisors. Suppose π = βγ. Then π | βγ and therefore
π | β or π | γ. Without restricting generality, let us assume π | β. But β | π as
well, so β ~ π and γ is a unit. Thus π admits no proper factorization and
is irreducible.

The converse of this remark is not true. That is to say, in an arbitrary
integral domain, there may be irreducible elements which are not prime
(see Ex. 13). Lemma 32.19 asserts that irreducible and prime elements
coincide in a principal ideal domain. This is the basic reason why there
turns out to be a unique factorization theorem in principal ideal
domains.

32.21 Lemma: Let D be an arbitrary integral domain and π, β₁, β₂, . . . , βₙ ∈ D.
If π is prime and π | β₁β₂ . . . βₙ, then π | β₁ or π | β₂ or . . . or π | βₙ.

Proof: Omitted (induction on n, using Definition 32.20 repeatedly).
32.22 Theorem: Let D be a principal ideal domain. Every element of D,
which is not zero or a unit, can be expressed as a product of irreducible
elements of D in a unique way, apart from the order of the factors and
the ambiguity among associate elements.

Proof: (cf. Theorem 5.17.) Let α ∈ D, α ≠ 0, α not a unit. By Theorem 32.15,
α can be expressed as a product of irreducible elements of D. We must
show uniqueness. Given two decompositions

π₁π₂ . . . πᵣ = α = ρ₁ρ₂ . . . ρₛ

of α into irreducible elements, we must show r = s and that π₁, π₂, . . . , πᵣ are, in
some order, associate to ρ₁, ρ₂, . . . , ρₛ. This will be proved by induction on r.

First assume r = 1. Then π₁ = α = ρ₁ρ₂ . . . ρₛ, so α is irreducible. This forces
s = 1 and π₁ = α = ρ₁. This proves the theorem when r = 1.

Now assume r ≥ 2 and that the theorem is proved for r − 1. This means:
whenever we have an equation

π₁′π₂′ . . . πᵣ₋₁′ = ρ₁′ρ₂′ . . . ρₜ′

with irreducible πᵢ′, ρⱼ′, there holds r − 1 = t and π₁′, π₂′, . . . , πᵣ₋₁′ are, in
some order, associates of ρ₁′, ρ₂′, . . . , ρₜ′.

We have π₁π₂ . . . πᵣ = α = ρ₁ρ₂ . . . ρₛ. So πᵣ | α. So πᵣ | ρ₁ρ₂ . . . ρₛ. Now πᵣ is prime
by Lemma 32.19, and so πᵣ | ρⱼ for some j ∈ {1,2, . . . ,s}. (Here we use the
fact that irreducible elements are prime in a principal ideal domain. The
conclusion πᵣ | ρⱼ is not valid in an arbitrary integral domain.) Reordering
the ρ's if necessary, we may assume πᵣ | ρₛ. Since ρₛ is irreducible, ρₛ has
no proper divisors. Hence the divisor πᵣ of ρₛ is either a unit or an
associate of ρₛ. But πᵣ is irreducible, so not a unit. Therefore πᵣ and ρₛ are
associate. So πᵣ = ερₛ for some unit ε ∈ D. Then we obtain

π₁π₂ . . . πᵣ₋₁(ερₛ) = α = ρ₁ρ₂ . . . ρₛ,

π₁π₂ . . . (επᵣ₋₁) = ρ₁ρ₂ . . . ρₛ₋₁

(cancelling ρₛ, which is legitimate in an integral domain) and, by induction, we get

r − 1 = s − 1,
π₁, π₂, . . . , (επᵣ₋₁) are, in some order, associates of ρ₁, ρ₂, . . . , ρₛ₋₁.

Hence r = s and π₁, π₂, . . . , πᵣ₋₁ are, in some order, associates of ρ₁, ρ₂, . . . , ρₛ₋₁;
and πᵣ is associate to ρₛ. This completes the proof.

32.23 Definition: Let D be an integral domain. If every element of D,
which is not zero or a unit, can be expressed as a product of finitely
many irreducible elements of D in a unique way, apart from the order of
the factors and the ambiguity among associate elements, then D is called
a unique factorization domain.

With this definition, Theorem 32.22 reads as follows.

32.22 Theorem: Every principal ideal domain is a unique factorization
domain. In particular, every Euclidean domain is a unique factorization
domain.

We generalize Lemma 32.19 to unique factorization domains.

32.24 Lemma: Let D be a unique factorization domain. Then every
irreducible element of D is prime.

Proof: Let π be irreducible in D and π | βγ, where β, γ ∈ D. Thus there is a
δ ∈ D with βγ = πδ. Then

δ = εδ₁δ₂ . . . δᵣ,   β = ε′β₁′β₂′ . . . βₛ′,   γ = ε″γ₁″γ₂″ . . . γₜ″,

where ε, ε′, ε″ are units and the δᵢ, βⱼ′, γₖ″ are irreducible elements in D.
From the uniqueness of the decomposition

ε′β₁′β₂′ . . . βₛ′ε″γ₁″γ₂″ . . . γₜ″ = βγ = πδ = πεδ₁δ₂ . . . δᵣ

we see that π must be associate to one of the irreducible elements βⱼ′ or
γₖ″. Thus π divides β or γ. So π is prime.

There is the following generalization of Theorem 32.22. If D is an
integral domain in which every nonzero, nonunit element can be written
as a product of finitely many irreducible elements, and if every
irreducible element in D is prime, then D is a unique factorization
domain. The proof of Theorem 32.22 is valid in this more general case.

In a unique factorization domain D, any two elements α, β (not both zero)
have a greatest common divisor. Clearly (α, β) ~ β if α = 0 and (α, β) ~ α if
β = 0; and if α ≠ 0 ≠ β, then

α = επ₁^{m₁}π₂^{m₂} . . . πᵣ^{mᵣ}  and  β = ε′π₁^{n₁}π₂^{n₂} . . . πᵣ^{nᵣ}

with suitable units ε, ε′, irreducible elements π₁, π₂, . . . , πᵣ and nonnegative
integers mᵢ, nᵢ, and it is easily seen that δ = π₁^{k₁}π₂^{k₂} . . . πᵣ^{kᵣ}, where kᵢ =
min{mᵢ, nᵢ}, is a greatest common divisor of α and β. Thus (α, β) exists in
any unique factorization domain, provided only α, β are not both equal
to zero. However, in an arbitrary unique factorization domain, (α, β) cannot,
in general, be expressed in the form σα + τβ.
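The min-of-exponents recipe is easy to check in the unique factorization
domain ℤ. The sketch below (not from the book; it uses ordinary trial
division) factors both elements and takes each irreducible πᵢ to the power
min{mᵢ, nᵢ}.

```python
from collections import Counter

def factor(n):
    """Prime factorization of n > 0 as a Counter {p: exponent}."""
    exps = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            exps[d] += 1
            n //= d
        d += 1
    if n > 1:
        exps[n] += 1
    return exps

def gcd_by_factorization(a, b):
    """gcd via k_i = min(m_i, n_i), as in the text."""
    fa, fb = factor(a), factor(b)
    g = 1
    for p, m in fa.items():
        g *= p ** min(m, fb[p])    # fb[p] is 0 when p does not divide b
    return g
```

For example, 60 = 2²·3·5 and 126 = 2·3²·7 give the greatest common
divisor 2·3 = 6.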

There are unique factorization domains which are not principal ideal
domains and there are principal ideal domains which are not Euclidean
domains.

32.25 Theorem: Let D be a principal ideal domain and let π ∈ D be a
nonzero, nonunit element of D. Then π is irreducible if and only if the
factor ring D/Dπ is a field.

Proof: D/Dπ is a commutative ring with identity 1 + Dπ (Example
30.9(c)). Suppose π is irreducible. We are to show that every nonzero
element α + Dπ of D/Dπ has an inverse in D/Dπ. Let α + Dπ be distinct
from the zero element 0 + Dπ = Dπ of D/Dπ. This means α ∉ Dπ, so π ∤ α.
Since π is irreducible, we obtain (π, α) ~ 1 from Lemma 32.18(2), and
there are therefore σ, τ in D such that σπ + τα = 1. So 1 − τα ∈ Dπ and

(τ + Dπ)(α + Dπ) = 1 + Dπ.

Thus τ + Dπ is an inverse of α + Dπ. This proves that D/Dπ is a field.

We now prove that, if π is not irreducible, then D/Dπ is not a field.
Indeed, if π is not irreducible, then π = αβ for some α, β ∈ D, where
neither α nor β is a unit. Here α | π and, in view of this, π | α would imply
that α ~ π; then β would be a unit, a contradiction. Hence π ∤ α and
likewise π ∤ β. So α ∉ Dπ and β ∉ Dπ, so α + Dπ ≠ 0 + Dπ ≠ β + Dπ, but

(α + Dπ)(β + Dπ) = αβ + Dπ = π + Dπ = 0 + Dπ = zero element of D/Dπ.

Thus α + Dπ and β + Dπ are zero divisors in D/Dπ and D/Dπ cannot be a
field.
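In ℤ (a principal ideal domain) the theorem specializes to the familiar
fact that ℤ/pℤ is a field exactly when p is prime. A tiny check, not from
the book, using Python's built-in modular inverse (the three-argument
pow with exponent −1, available in Python 3.8 and later):

```python
p = 7                        # irreducible in Z
for a in range(1, p):
    t = pow(a, -1, p)        # the tau of the proof: t*a is 1 modulo p
    assert (t * a) % p == 1  # every nonzero class is invertible: a field

n = 6                        # reducible: 6 = 2*3, neither factor a unit
assert (2 * 3) % n == 0      # 2 + 6Z and 3 + 6Z are zero divisors, so no field
```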

Exercises

1. Let D be an integral domain and π ∈ D. Prove that π is a prime element
of D if and only if Dπ is a prime ideal of D (see §30, Ex. 11).

2. Let D be a principal ideal domain and π ∈ D. Prove that π is an irreducible
element of D if and only if Dπ is a maximal ideal of D (see §30,
Ex. 12).

3. Show that ℤ[√d] := {a + b√d : a,b ∈ ℤ} is a Euclidean domain when
d = 2, 3, 6.

4. Show that ℤ[√2i] := {a + b√2i : a,b ∈ ℤ} is a Euclidean domain.

5. Let τ = (1 + √7i)/2. Show that ℤ[τ] := {a + bτ : a,b ∈ ℤ} is a Euclidean
domain.

6. Let D be a Euclidean domain, with the function d as in Definition
32.10. Prove that ε ∈ D is a unit if and only if d(ε) = d(1).

7. Find the decomposition into irreducible elements of 2 in ℤ[i] and of 3
in ℤ[ω].

8. Let p be an odd prime number. Prove that (i) p = p + 0i ∈ ℤ[i] is
prime in ℤ[i] in case x² ≡ −1 (mod p) has no solution and (ii) p ∈ ℤ[i] is not
prime in ℤ[i], and in fact p = ππ̄ with a suitable prime element π of ℤ[i],
in case x² ≡ −1 (mod p) has a solution.

9. Let 0 ≠ α ∈ ℤ[i]. Show that ℤ[i]/αℤ[i] has exactly N(α) elements.

10. Using the Euclidean algorithm, find a greatest common divisor of
3 + 5i and 2 + 3i; and of 14 + 23i and 11 + 44i in ℤ[i].

11. Prove: an integral domain D is a principal ideal domain if and only if
there is a function d: D\{0} → ℕ ∪ {0} satisfying
(i) d(α) ≤ d(β) for any α, β ∈ D\{0} with α | β, and d(α) = d(β) if
and only if α ~ β;
(ii) for all α, β ∈ D\{0} with α ∤ β and β ∤ α, there are κ, λ, ν ∈ D
such that ν = κα + λβ and d(ν) < min{d(α), d(β)}.

12. Let θ = (1 + √19i)/2. Show that ℤ[θ] := {a + bθ : a,b ∈ ℤ} is a principal
ideal domain, but not a Euclidean domain.

13. Prove that 2, 3, 1 + √5i, 1 − √5i are irreducible in ℤ[√5i]. Show that
2, 3 are not associate to 1 + √5i or 1 − √5i. Hence there are two essentially
distinct decompositions

2·3 = 6 = (1 + √5i)(1 − √5i)

of 6 ∈ ℤ[√5i] and therefore ℤ[√5i] is not a unique factorization domain.

§33
Polynomial Rings

The reader is familiar with polynomials. In high school, it is taught that
expressions like

x² + 2x + 5,   x³ + 2x² − 7x + 1

are polynomials. One learns how to add, subtract, multiply and divide
two polynomials. Although one acquires a working knowledge about
polynomials, a satisfactory definition of polynomials is hardly given. In
this paragraph, we give a rigorous definition of polynomials.

Polynomials are treated in calculus as functions. For example,
x² + 2x + 5 is considered to be the function (defined on ℝ, say) that maps
any x ∈ ℝ to x² + 2x + 5 ∈ ℝ. With this interpretation, a polynomial is a
function and x is a generic element in its domain. The equality of two
polynomials then means the equality of their domains and the equality
of the function values at any element in their domain.

This is a perfectly sound approach, but it will prove convenient to treat
polynomials differently in algebra. We propose to define the equality of
two polynomials as the equality of their corresponding coefficients. This
definition is motivated by the so-called comparison of coefficients. Note
that this definition of equality does not involve x at all. Whatever x may
be, it is not relevant to the definition of equality. Nor is it relevant to the
addition and multiplication of two polynomials. So we may forget about
x completely. We then deprive a polynomial a₀ + a₁x + . . . + aₙxⁿ of the
symbols xʳ. What remains is a finite number of coefficients and "+" signs.
The "+" signs can be thought of as connectives. Then a polynomial is
essentially a finite number of coefficients. This leads to the following
definition.

33.1 Definitions: Let R be a ring. A sequence

f = (a₀, a₁, a₂, . . . )

of elements a₀, a₁, a₂, . . . in R, where only finitely many of them are
distinct from the zero element of R, is called a polynomial over R.

The terms a₀, a₁, a₂, . . . are called the coefficients of the polynomial f =
(a₀, a₁, a₂, . . . ). The term a₀ will be referred to as the constant term of f.

Two polynomials f = (a₀,a₁,a₂, . . . ) and g = (b₀,b₁,b₂, . . . ) over R are
declared equal when they are equal as sequences, of course, that is to
say, when aᵢ = bᵢ for all i = 0,1,2, . . . . In this case, we write f = g. Otherwise
we put f ≠ g.

If f = (a₀, a₁, a₂, . . . ) is a polynomial over R, there is an index d such that
aₙ = 0 ∈ R whenever n > d. If the coefficients a₀, a₁, a₂, . . . are not all
equal to zero, there is an index d, uniquely determined by f, such that
a_d ≠ 0 and aₙ = 0 for all n > d. This index d is called the degree of f. We
then write d = deg f. If d is the degree of f, then a_d is said to be the
leading coefficient of f. It is the last nonzero coefficient of f. If R happens
to be a ring with identity 1 and if f is a polynomial over R with leading
coefficient equal to 1, then f is called a monic polynomial.

A polynomial of degree one is called a linear polynomial, one of degree
two is called a quadratic polynomial, one of degree three is called a cubic
polynomial, one of degree four is called a biquadratic or quartic
polynomial and one of degree five is called a quintic polynomial.

The polynomial 0* = (0,0,0, . . . ) over R, whose terms are all equal to the
zero element 0 ∈ R of R, is called the zero polynomial over R. The leading
coefficient and the degree of the zero polynomial are not defined. The
leading coefficient of any other polynomial is defined. The constant term
of the zero polynomial is defined, and is 0 ∈ R.

Notice that indexing begins with 0, not with 1. For example,
(1,0,2,5,0,0,0, . . . ) is a polynomial over ℤ of degree 3, not of degree 4. Its
constant term is 1 ∈ ℤ, its leading coefficient is 5 ∈ ℤ.

33.2 Definition: Let R be a ring and let

f = (a₀,a₁,a₂, . . . ) and g = (b₀,b₁,b₂, . . . )

be two polynomials over R. Then the sum of f and g, denoted by f + g, is
the sequence

f + g = (a₀ + b₀, a₁ + b₁, a₂ + b₂, . . . )

obtained by termwise addition of the coefficients. The product of f by g,
denoted by f·g or by fg, is the sequence

fg = (c₀,c₁,c₂, . . . ),

where the terms cₖ ∈ R are given by

c₀ = a₀b₀
c₁ = a₀b₁ + a₁b₀
c₂ = a₀b₂ + a₁b₁ + a₂b₀
c₃ = a₀b₃ + a₁b₂ + a₂b₁ + a₃b₀
. . . . . . . . . . . . . . . . . . . . .
cₖ = a₀bₖ + a₁bₖ₋₁ + a₂bₖ₋₂ + . . . + aₖ₋₂b₂ + aₖ₋₁b₁ + aₖb₀
. . . . . . . . . . . . . . . . . . . . .

To find the k-th term cₖ in fg, we multiply all a's with all b's in such a
way that the sum of the indices is k, and add the results. We write

cₖ = Σᵢ₌₀ᵏ aᵢbₖ₋ᵢ.

The summation variable runs through different values for different k's
(through 0,1,2,3 for k = 3, through 0,1,2,3,4,5 for k = 5, etc.).

It will be convenient to write cₖ = Σᵢ₊ⱼ₌ₖ aᵢbⱼ, it being understood that i and
j run through nonnegative integers in such a way that their sum is k.
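Definition 33.2 translates directly into code. The sketch below (not from
the book) stores a polynomial as a finite list of coefficients, lowest index
first, so [5, 2, 1] stands for (5, 2, 1, 0, 0, . . . ); the product rule is exactly
cₖ = Σᵢ₊ⱼ₌ₖ aᵢbⱼ.

```python
def poly_add(f, g):
    """Termwise sum of two coefficient lists."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))      # pad the shorter list with zeros
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    """Product: c[k] = sum of f[i]*g[j] over all i + j = k."""
    if not f or not g:
        return []
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c
```

Multiplying (5, 2, 1, 0, . . . ) by (1, 1, 0, . . . ) gives (5, 7, 3, 1, 0, . . . ),
matching c₀ = 5·1, c₁ = 5·1 + 2·1, c₂ = 2·1 + 1·1, c₃ = 1·1.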

33.3 Lemma: Let R be a ring and let f = (a₀,a₁,a₂, . . . ) and g = (b₀,b₁,b₂, . . . )
be arbitrary polynomials over R. Let 0* = (0,0,0, . . . ) be the zero polynomial
over R.

(1) f + 0* = f and 0* + g = g. Also f0* = 0* and 0*g = 0*.

(2) The sum f + g is a polynomial over R. If deg f = m and deg g = n, then
deg(f + g) = max{m,n} in case m ≠ n,
deg(f + g) ≤ m in case m = n and f + g ≠ 0*.
(3) The product fg is a polynomial over R. If deg f = m and deg g = n,
then
deg fg ≤ m + n in case fg ≠ 0*,
deg fg = m + n in case R has no zero divisors.

Proof: (1) The assertions f + 0* = f and 0* + g = g are immediate from
the definitions: f + 0* = (a₀,a₁,a₂, . . . ) + (0,0,0, . . . ) = (a₀ + 0, a₁ + 0, a₂ + 0, . . . ) =
(a₀,a₁,a₂, . . . ) = f and similarly 0* + g = g. Also, the k-th coefficient of f0* is
a₀0 + a₁0 + a₂0 + . . . + aₖ0 = 0 + 0 + 0 + . . . + 0 = 0 by Lemma 29.6, for any
k. This proves f0* = 0*. Likewise 0*g = 0*.

(2) We must show that f + g has only finitely many terms distinct from
0. We proved it in part (1) when f = 0* or g = 0*. Now we assume f ≠ 0*
≠ g. Then f and g have degrees. Suppose deg f = m and deg g = n, so that
aₘ ≠ 0, aᵣ = 0 for all r > m and bₙ ≠ 0, bᵣ = 0 for all r > n.

If m < n, then f + g = (a₀ + b₀, a₁ + b₁, . . . , aₘ + bₘ, bₘ₊₁, . . . , bₙ, 0, 0, 0, . . . ).
So the n-th term in f + g is bₙ ≠ 0, and the later terms are aᵣ + bᵣ = 0 + 0 =
0 for r > n > m. This shows that f + g is a nonzero polynomial and
deg(f + g) = n = max{m,n}.

If n < m, then f + g = (a₀ + b₀, a₁ + b₁, . . . , aₙ + bₙ, aₙ₊₁, . . . , aₘ, 0, 0, 0, . . . ).
So the m-th term in f + g is aₘ ≠ 0, and the later terms are aᵣ + bᵣ = 0 + 0
= 0 for r > m > n. This shows that f + g is a nonzero polynomial and
deg(f + g) = m = max{m,n}. [Question: why can we not combine the two
cases m < n and n < m into a single one by assuming m < n without
loss of generality?]

If m = n, then f + g = (a₀ + b₀, a₁ + b₁, . . . , aₘ + bₘ, 0, 0, 0, . . . ). The r-th term
in f + g is aᵣ + bᵣ = 0 for all r > m. This shows that f + g is a polynomial.
Either it is the zero polynomial, or it is not the zero polynomial. In the
latter case, the nonzero terms in f + g have indices ≤ m. In particular,
the degree of f + g is ≤ m. (More exactly, deg(f + g) = m if aₘ + bₘ ≠ 0, and
deg(f + g) < m if aₘ + bₘ = 0.)

(3) To prove that the product fg is a polynomial over R, we must show
that fg has only finitely many terms distinct from zero. We proved it in
part (1) when f = 0* or g = 0*. Now we assume f ≠ 0* ≠ g. Then f and g
have degrees. Suppose deg f = m and deg g = n, so that aₘ ≠ 0, aᵣ = 0 for
all r > m and bₙ ≠ 0, bᵣ = 0 for all r > n.

The k-th term in fg = (c₀,c₁,c₂, . . . ) is given by cₖ = Σᵢ₊ⱼ₌ₖ aᵢbⱼ. Suppose now
k > m + n. If i + j = k, then either i > m or j > n (for i ≤ m and j ≤ n
implies the contradiction k = i + j ≤ m + n < k), so either aᵢ = 0 or bⱼ = 0
for each one of the summands aᵢbⱼ in cₖ = Σᵢ₊ⱼ₌ₖ aᵢbⱼ. So each summand is
either 0bⱼ = 0 or aᵢ0 = 0 by Lemma 29.6 and cₖ = 0 + 0 + . . . + 0 = 0. This
shows that cₖ = 0 for all k > m + n. Hence fg has at most m + n + 1 terms
distinct from 0 and fg is a polynomial over R, and deg fg ≤ m + n in case
fg ≠ 0*.

The (m + n)-th term cₘ₊ₙ in fg is

cₘ₊ₙ = a₀bₘ₊ₙ + a₁bₘ₊ₙ₋₁ + a₂bₘ₊ₙ₋₂ + . . . + aₘ₋₁bₙ₊₁
     + aₘbₙ
     + aₘ₊₁bₙ₋₁ + aₘ₊₂bₙ₋₂ + . . . + aₘ₊ₙ₋₁b₁ + aₘ₊ₙb₀.

Here the summands in the first line are 0 since bₘ₊ₙ, bₘ₊ₙ₋₁, bₘ₊ₙ₋₂, . . . , bₙ₊₁
are 0 and the summands in the third line are 0 since aₘ₊₁, aₘ₊₂, . . . , aₘ₊ₙ₋₁,
aₘ₊ₙ are 0. This gives cₘ₊ₙ = aₘbₙ. If R has no zero divisors, then cₘ₊ₙ =
aₘbₙ ≠ 0 since aₘ ≠ 0 and bₙ ≠ 0. So m + n is the greatest index k for which
the k-th term in fg is distinct from 0. This proves that deg fg = m + n in
case R has no zero divisors.

33.4 Remark: The last argument shows in fact that the leading coefficient
of fg is the leading coefficient of f times the leading coefficient of g,
provided R has no zero divisors.
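The proviso about zero divisors matters. A small check (not from the
book; coefficients taken in ℤ/6ℤ, polynomials stored as lists of
coefficients, lowest index first):

```python
def poly_mul_mod(f, g, m):
    """Product of coefficient lists over Z/mZ (lowest index first)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] = (c[i + j] + a * b) % m
    return c

# f = (0, 2, 0, ...) and g = (0, 3, 0, ...) over Z/6Z: both have degree 1,
# but 2*3 = 0 in Z/6Z, so fg is the zero polynomial and its degree is undefined.
assert poly_mul_mod([0, 2], [0, 3], 6) == [0, 0, 0]
# Over Z/5Z there are no zero divisors, and deg fg = 1 + 1 = 2 as in Lemma 33.3(3).
assert poly_mul_mod([0, 2], [0, 3], 5) == [0, 0, 1]
```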

33.5 Theorem: Let R be a ring. The set of all polynomials over R is a
ring with respect to the operations + and · given in Definition 33.2 (called
the addition and multiplication of polynomials, respectively).

Proof: First of all, we must prove that + makes the set of all polynomials
over R into an abelian group. The closure property was shown in Lemma
33.3(2). The associativity and commutativity of addition of polynomials
follow from the associativity and commutativity of addition in R. The
zero polynomial 0* is the zero element (Lemma 33.3(1)) and each polynomial
(a₀,a₁,a₂, . . . ) over R has an opposite (−a₀,−a₁,−a₂, . . . ). The details
are left to the reader.

Now the properties of multiplication in Definition 29.1. The closure of
the set of all polynomials over R under multiplication was shown in
Lemma 33.3(3). The associativity of multiplication is proved by
observing that the m-th term in (fg)h, where

f = (a₀,a₁,a₂, . . . ), g = (b₀,b₁,b₂, . . . ), h = (c₀,c₁,c₂, . . . )

are arbitrary polynomials over R, is given by

Σₖ₊ₗ₌ₘ (k-th term in fg)cₗ = Σₖ₊ₗ₌ₘ (Σᵢ₊ⱼ₌ₖ aᵢbⱼ)cₗ = Σᵢ₊ⱼ₊ₗ₌ₘ (aᵢbⱼ)cₗ

and that the m-th term in f(gh) is given by

Σᵢ₊ₛ₌ₘ aᵢ(s-th term in gh) = Σᵢ₊ₛ₌ₘ aᵢ(Σⱼ₊ₗ₌ₛ bⱼcₗ) = Σᵢ₊ⱼ₊ₗ₌ₘ aᵢ(bⱼcₗ).

Here we used the distributivity in R. Since (aᵢbⱼ)cₗ = aᵢ(bⱼcₗ), the m-th
terms in (fg)h and f(gh) are equal, and this for all m. So (fg)h = f(gh) for
all polynomials f,g,h over R and the multiplication is associative.

It remains to prove the distributivity laws. For any polynomials f =
(a_0,a_1,a_2, . . . ), g = (b_0,b_1,b_2, . . . ), h = (c_0,c_1,c_2, . . . ) over R, we have

f(g + h) = (a_0,a_1,a_2, . . . )(b_0 + c_0, b_1 + c_1, b_2 + c_2, . . . )
         = polynomial whose k-th coefficient is ∑_{i+j=k} a_i(b_j + c_j)
         = polynomial whose k-th coefficient is ∑_{i+j=k} (a_i b_j + a_i c_j)
         = polynomial whose k-th coefficient is ∑_{i+j=k} a_i b_j + ∑_{i+j=k} a_i c_j
         = (polynomial whose k-th coefficient is ∑_{i+j=k} a_i b_j)
           + (polynomial whose k-th coefficient is ∑_{i+j=k} a_i c_j)
         = fg + fh

and a similar argument proves (f + g)h = fh + gh for all polynomials f,g,h
over R. This completes the proof.
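The ring axioms just verified can also be checked mechanically on small samples. The following is a minimal sketch in Python, assuming integer coefficients stored as lists (a_0, a_1, a_2, . . . ); the helper names poly_add and poly_mul are ours, not the book's.

```python
# A small model of Definition 33.2 and Theorem 33.5, assuming polynomials
# over Z are stored as lists of coefficients (a0, a1, a2, ...).

def poly_add(f, g):
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_mul(f, g):
    # the k-th coefficient of fg is the sum of a_i * b_j over i + j = k
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [1, 2]       # 1 + 2x
g = [3, 0, 1]    # 3 + x^2
h = [5, 1]       # 5 + x

lhs = poly_mul(poly_mul(f, g), h)    # (fg)h
rhs = poly_mul(f, poly_mul(g, h))    # f(gh)
print(lhs == rhs)                    # associativity on this sample
print(poly_mul(f, poly_add(g, h)) == poly_add(poly_mul(f, g), poly_mul(f, h)))
```

Of course, a finite sample does not prove the theorem; the point is only that the coordinatewise definitions behave as the proof says they must.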

The ring of all polynomials over R will be denoted by R[x]. When f ∈ R[x],
we say f is a polynomial with coefficients in R.

We now want to simplify our notation. A polynomial f = (a_0,a_1,a_2, . . . )
over R, for which a_n = 0 whenever, say, n > d, can be written as

(a_0,0,0,0, . . . ) + (0,a_1,0,0, . . . ) + (0,0,a_2,0, . . . ) + . . . + (0,0,. . . ,a_d,0, . . . ).

Each one of the polynomials above has at most one nonzero coefficient. A
polynomial over R which has at most one nonzero coefficient will be
called a monomial over R. We can write monomials over R more com-
pactly as follows. If, for example, g is a monomial over R whose r-th
coefficient is a (the possibility a = 0 is not excluded) and whose other
coefficients are zero, then we can write g = (0,0,. . . ,a,0, . . . ) shortly as
(a,r). Here r denotes the index of the only possibly nonzero place, and
a ∈ R is the possibly nonzero element in that r-th place.

Then our f would be written as (a_0,0) + (a_1,1) + (a_2,2) + . . . + (a_d,d). The
essential point is that a polynomial can be written as a sum of
monomials, and a monomial is determined as soon as the index r and the
possibly nonzero element a are given. We can choose other notations for
monomials, of course, as long as they display the index r and the
possibly nonzero element a. We prefer to write ax^r instead of (a,r) for
the monomial (0,0,. . . ,a,0, . . . ). In this notation, both the index r and the
element a are displayed. It should be noted that x does not have a
meaning by itself. It is like the comma in (a,r). In particular, x^r is not the
r-th power of anything. The r in ax^r is an index, a superscript showing
where the element a sits. With this notation, our f is written as

f = a_0x^0 + a_1x^1 + a_2x^2 + . . . + a_dx^d.

The product of two monomials ax^r and bx^s is easily evaluated to be
abx^{r+s}. The multiplication of two polynomials can be carried out in the
familiar way by using this rule and the distributivity. The symbol x is a
convenient device that simplifies computations. x will be called an
indeterminate (over R). This does not mean that x fails to be determined
in some way. "Indeterminate" is just an odd name for a computational
device. Finally, we agree to write a_0 for a_0x^0 and a_1x for a_1x^1. In
particular, we write 0 for the zero polynomial 0*. This convention brings
f to the form

a_0 + a_1x + a_2x^2 + . . . + a_dx^d.

With the convention of writing a_0 for a_0x^0, we regard R as a subring of
R[x]. In particular, we can multiply polynomials by elements of R in the
natural way:

b(a_0 + a_1x + a_2x^2 + . . . + a_dx^d) = ba_0 + ba_1x + ba_2x^2 + . . . + ba_dx^d,
(a_0 + a_1x + a_2x^2 + . . . + a_dx^d)b = a_0b + a_1bx + a_2bx^2 + . . . + a_dbx^d.

If R is a ring with identity 1, then x can be interpreted in another way.
The rule ax^r·bx^s = abx^{r+s} yields 1x^r·1x^s = 1x^{r+s}. Let p denote the
polynomial 1x = 1x^1 = (0,1,0,0, . . . ). We calculate that p^2 = 1x^2, p^3 = 1x^3,
p^4 = 1x^4, etc. Our f can now be written as

f = a_0p^0 + a_1p^1 + a_2p^2 + . . . + a_dp^d,

where this time the superscripts indicate the appropriate powers of p =
(0,1,0,0, . . . ), taken according to the definition of multiplication given in
Definition 33.2. So any polynomial over R can be written as a sum of
powers of p, and calculations are performed by using the distributivity.
Since x obeys the same rules as a computational device as p does as a
polynomial, we write the polynomial p = 1x as x. Then x is the polyno-
mial (0,1,0,0, . . . ) in R[x]. We emphasize again that this interpretation of
x as a polynomial is possible only when R has an identity. If R has no
identity, then x is not a polynomial in R[x].

The ring R[x] is said to be constructed by adjoining x to R. When we want
to examine several copies of R[x] at the same time, we use different
letters to denote the indeterminates of the copies of R[x]. Thus we may
have R[x], R[y], R[z], etc.

Whenever convenient, we shall write ∑_{i=0}^d a_i x^i for the polynomial

a_0 + a_1x + a_2x^2 + . . . + a_dx^d.
33.6 Lemma: Let R be a ring.
(1) If R is commutative, then R[x] is commutative.
(2) If R has an identity, then R[x] has an identity.
(3) If R has no zero divisors, then R[x] has no zero divisors.
(4) If R is an integral domain, then R[x] is an integral domain.

Proof: Let f = ∑_{i=0}^m a_i x^i and g = ∑_{j=0}^n b_j x^j be arbitrary polynomials in R[x].

(1) If R is commutative, then a_i b_j = b_j a_i for all i = 0,1, . . . ,m and j = 0,1, . . . ,n.
We have then

fg = ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_i b_j )x^k = ∑_{k=0}^{n+m} ( ∑_{j+i=k} b_j a_i )x^k = gf

and R[x] is commutative.

(2) If R has an identity 1, then 1 = 1x^0 = (1,0,0,0, . . . ) is a polynomial in
R[x] and

f·1 = ( ∑_{i=0}^m a_i x^i )1 = ∑_{i=0}^m a_i 1 x^i = ∑_{i=0}^m a_i x^i = f,
1·f = 1( ∑_{i=0}^m a_i x^i ) = ∑_{i=0}^m 1 a_i x^i = ∑_{i=0}^m a_i x^i = f

for arbitrary f ∈ R[x]. Thus 1 is an identity element of R[x].

(3) Assume now R has no zero divisors. Let us suppose also that f ≠ 0
and g ≠ 0. Without loss of generality, we may assume that a_m is the
leading coefficient of f and that b_n is the leading coefficient of g. Then
a_m ≠ 0, b_n ≠ 0. By Remark 33.4, the leading coefficient of fg is a_m b_n and
a_m b_n ≠ 0 since R has no zero divisors. Thus fg has a nonzero coefficient,
namely the (m + n)-th coefficient, and fg ≠ 0. This shows that R[x] has no
zero divisors.

(4) An integral domain is a commutative ring with identity having no
zero divisors, distinct from the null ring. Now if R is an integral domain,
then R is not the null ring, and since R ⊆ R[x], the polynomial ring R[x] is
not the null ring, either. The claim follows then immediately from
(1), (2), and (3).
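Part (3) is easy to see numerically. The sketch below (helper names are ours) multiplies 2x by 4x^2 once over ℤ_5, which has no zero divisors, and once over ℤ_8, where 2·4 = 0 kills the product entirely.

```python
# Multiplication in Z_m[x], assuming coefficient lists in increasing degree.

def poly_mul_mod(f, g, m):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] = (h[i + j] + a * b) % m
    return h

def deg(f):
    # degree of a nonzero polynomial (the zero polynomial has no degree)
    return max(i for i, a in enumerate(f) if a != 0)

# Z_5 has no zero divisors: deg fg = deg f + deg g, as in Lemma 33.3
fg5 = poly_mul_mod([0, 2], [0, 0, 4], 5)    # 2x * 4x^2 = 3x^3 in Z_5[x]
print(deg(fg5))

# Z_8 has zero divisors: 2x * 4x^2 = 8x^3 = 0 in Z_8[x]
fg8 = poly_mul_mod([0, 2], [0, 0, 4], 8)
print(all(a == 0 for a in fg8))
```

So over ℤ_8 the product of two nonzero polynomials is the zero polynomial, exactly the failure of (3) that Remark 33.4 rules out when R has no zero divisors.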

33.7 Lemma: Let R and S be two rings and let φ: R → S be a ring
homomorphism. Then the mapping Φ: R[x] → S[x], defined by

( ∑_{i=0}^m a_i x^i )Φ = ∑_{i=0}^m (a_iφ)x^i,

is also a ring homomorphism. Furthermore, Ker Φ = (Ker φ)[x] and Im Φ =
(Im φ)[x]. (Note: Ker φ and Im φ are rings by Theorem 30.13, so (Ker φ)[x]
and (Im φ)[x] are meaningful.)

Proof: Let f = ∑_{i=0}^m a_i x^i, g = ∑_{j=0}^n b_j x^j be arbitrary polynomials in R[x]. We
show that Φ preserves addition. Here we may assume m = n, for we may
add 0x^{m+1} + 0x^{m+2} + . . . + 0x^n to f in case m < n and 0x^{n+1} + 0x^{n+2} + . . . + 0x^m
to g in case n < m. We have

(f + g)Φ = ( ∑_{i=0}^m a_i x^i + ∑_{j=0}^n b_j x^j )Φ
         = ( ∑_{i=0}^m a_i x^i + ∑_{i=0}^m b_i x^i )Φ
         = ( ∑_{i=0}^m (a_i + b_i)x^i )Φ
         = ∑_{i=0}^m [(a_i + b_i)φ]x^i
         = ∑_{i=0}^m (a_iφ + b_iφ)x^i
         = ∑_{i=0}^m (a_iφ)x^i + ∑_{i=0}^m (b_iφ)x^i
         = fΦ + gΦ

and so Φ preserves addition. As for multiplication (here we do not have
to assume m = n), we observe

(fg)Φ = [ ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_i b_j )x^k ]Φ
      = ∑_{k=0}^{m+n} [( ∑_{i+j=k} a_i b_j )φ]x^k
      = ∑_{k=0}^{m+n} ( ∑_{i+j=k} (a_i b_j)φ )x^k
      = ∑_{k=0}^{m+n} ( ∑_{i+j=k} (a_iφ)·(b_jφ) )x^k
      = ( ∑_{i=0}^m (a_iφ)x^i )( ∑_{j=0}^n (b_jφ)x^j )
      = fΦ·gΦ.

Thus Φ preserves multiplication as well. So Φ is a ring homomorphism.

A polynomial ∑_{i=0}^m a_i x^i belongs to the kernel of Φ if and only if
( ∑_{i=0}^m a_i x^i )Φ = ∑_{i=0}^m (a_iφ)x^i is the zero polynomial in S[x], so if and only
if the coefficients a_iφ are all equal to 0 ∈ S (i = 0,1, . . . ,m), so if and only
if a_i ∈ Ker φ for all i = 0,1, . . . ,m, so if and only if ∑_{i=0}^m a_i x^i ∈ (Ker φ)[x].

A polynomial ∑_{i=0}^m c_i x^i ∈ S[x] belongs to the image of Φ if and only if
∑_{i=0}^m c_i x^i = ( ∑_{i=0}^n a_i x^i )Φ for some ∑_{i=0}^n a_i x^i ∈ R[x], so (assuming m = n
without loss of generality) if and only if, for each i = 0,1, . . . ,m, there is
an a_i ∈ R such that c_i = a_iφ, so if and only if c_i ∈ Im φ for all i = 0,1, . . . ,m,
and so if and only if ∑_{i=0}^m c_i x^i ∈ (Im φ)[x].
i=0

As an illustration of Lemma 33.7, we consider the natural
homomorphism φ: ℤ → ℤ_3. Then the mapping Φ: ℤ[x] → ℤ_3[x] is given by
reducing the coefficients modulo 3. For example,
(5x^3 − 4x^2 + 2x + 1)Φ = 2x^3 + 2x^2 + 2x + 1,
(6x^4 − 3x^2 + x + 5)Φ = x + 2.
The reader will easily verify that
(5x^3 − 4x^2 + 2x + 1)(6x^4 − 3x^2 + x + 5)
= 30x^7 − 24x^6 − 3x^5 + 23x^4 + 15x^3 − 21x^2 + 11x + 5,
whose image under Φ is
(30x^7 − 24x^6 − 3x^5 + 23x^4 + 15x^3 − 21x^2 + 11x + 5)Φ = 2x^4 + 2x + 2.
We have also (2x^3 + 2x^2 + 2x + 1)(x + 2) = 2x^4 + 2x + 2 in ℤ_3[x].
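The verification asked of the reader can be done in a few lines. In this sketch (names ours), polynomials are coefficient lists in increasing degree and reduce_mod plays the role of the induced map Φ; the two sides compared are (fg)Φ and fΦ·gΦ.

```python
# Checking that reduction mod 3 commutes with multiplication of the
# two polynomials in the example above.

def poly_mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

def reduce_mod(f, m):
    return [a % m for a in f]

f = [1, 2, -4, 5]        # 5x^3 - 4x^2 + 2x + 1
g = [5, 1, -3, 0, 6]     # 6x^4 - 3x^2 + x + 5

left = reduce_mod(poly_mul(f, g), 3)                            # (fg) reduced
right = reduce_mod(poly_mul(reduce_mod(f, 3), reduce_mod(g, 3)), 3)
print(left == right)
print(left)   # coefficients of 2 + 2x + 2x^4, padded with zeros
```

Both computations land on 2x^4 + 2x + 2, in agreement with the hand calculation.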

If φ: R → S is an isomorphism, then Ker φ = 0 and Im φ = S. This gives the
following corollary to Lemma 33.7.

33.8 Theorem: If R and S are isomorphic rings, then R[x] and S[x] are
isomorphic.

Let R be a ring. Adjoining an indeterminate x to R, we get the ring R[x].
Now we can adjoin a new indeterminate y to R[x] and get the ring
(R[x])[y] =: R[x][y]. The elements of R[x][y] are of the form ∑_{i=0}^m f_i y^i, where
f_i ∈ R[x]. Similarly we can construct the ring R[y][x] := (R[y])[x]. We show
that they are isomorphic.

33.9 Lemma: Let R be a ring and let x,y be two indeterminates over R.
Then R[x][y] ≅ R[y][x].

Proof: We consider the mapping T: R[x][y] → R[y][x], given by

∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i ↦ ∑_{j=0}^n ( ∑_{i=0}^m a_ij y^i )x^j,

which seems to be the only reasonable mapping from R[x][y] to R[y][x]. It
certainly preserves addition, for we have

[ ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i + ∑_{i=0}^r ( ∑_{j=0}^s b_ij x^j )y^i ]T
= [ ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j + ∑_{j=0}^s b_ij x^j )y^i ]T   (assuming r = m without loss of generality)
= [ ∑_{i=0}^m ( ∑_{j=0}^n (a_ij + b_ij)x^j )y^i ]T   (assuming s = n without loss of generality)
= ∑_{j=0}^n ( ∑_{i=0}^m (a_ij + b_ij)y^i )x^j
= ∑_{j=0}^n ( ∑_{i=0}^m a_ij y^i + ∑_{i=0}^m b_ij y^i )x^j
= ∑_{j=0}^n ( ∑_{i=0}^m a_ij y^i )x^j + ∑_{j=0}^n ( ∑_{i=0}^m b_ij y^i )x^j
= [ ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i ]T + [ ∑_{i=0}^m ( ∑_{j=0}^n b_ij x^j )y^i ]T

for all ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i, ∑_{i=0}^r ( ∑_{j=0}^s b_ij x^j )y^i ∈ R[x][y].

Secondly, T preserves multiplication of polynomials of the form (ax^j)y^i
(i.e., monomials over R[x] whose possibly nonzero coefficients in R[x]
are themselves monomials over R; they will be referred to as monomials
in R[x][y] over R). We indeed have

[(a_ij x^j)y^i · (b_rs x^s)y^r]T = [(a_ij x^j)(b_rs x^s)y^{i+r}]T   (def. of multiplication in R[x][y])
= [(a_ij b_rs x^{j+s})y^{i+r}]T   (def. of multiplication in R[x])
= (a_ij b_rs y^{i+r})x^{j+s}
= [(a_ij y^i)(b_rs y^r)]x^{j+s}
= (a_ij y^i)x^j · (b_rs y^r)x^s
= [(a_ij x^j)y^i]T · [(b_rs x^s)y^r]T

for all monomials (a_ij x^j)y^i, (b_rs x^s)y^r in R[x][y].

Thirdly, T preserves multiplication of arbitrary polynomials. Any poly-
nomial can be written as p_1 + p_2 + . . . + p_t, where p_1, p_2, . . . ,p_t are suitable
monomials. Now for all polynomials p_1 + p_2 + . . . + p_t, q_1 + q_2 + . . . + q_u in
R[x][y], where the p's and q's are monomials, we have

[(p_1 + p_2 + . . . + p_t)(q_1 + q_2 + . . . + q_u)]T
= ( ∑_{i,j} p_i q_j )T   (by distributivity)
= ∑_{i,j} (p_i q_j)T   (since T preserves addition)
= ∑_{i,j} p_iT·q_jT   (since T preserves products of monomials)
= (p_1T + p_2T + . . . + p_tT)(q_1T + q_2T + . . . + q_uT)
= (p_1 + p_2 + . . . + p_t)T·(q_1 + q_2 + . . . + q_u)T,

so T preserves arbitrary products. Hence T is a ring homomorphism.

T is one-to-one, for if ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i ∈ R[x][y] is in the kernel of T, then
its image ∑_{j=0}^n ( ∑_{i=0}^m a_ij y^i )x^j is the zero polynomial in R[y][x], so all the
coefficients ∑_{i=0}^m a_ij y^i are equal to the zero polynomial in R[y], so all
elements a_ij of R are equal to the zero element in R, so all polynomials
∑_{j=0}^n a_ij x^j are the zero polynomial in R[x], so ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i is the zero
polynomial in R[x][y]. Thus Ker T consists of the zero polynomial and T is
one-to-one.

Moreover, T is onto, for any polynomial ∑_{j=0}^n ( ∑_{i=0}^m a_ij y^i )x^j in R[y][x] is the
image of the polynomial ∑_{i=0}^m ( ∑_{j=0}^n a_ij x^j )y^i in R[x][y].

Hence T is an isomorphism from R[x][y] onto R[y][x].

In view of this result, we identify R[x][y] and R[y][x]. To simplify the
notation, we write R[x,y] for R[x][y]. The elements of R[x,y] are of the
form ∑_{i,j} a_ij x^i y^j, where a_ij ∈ R and there are finitely many terms in the
sum. Multiplication is carried out in the customary way, using distribu-
tivity and collecting terms. We have R[x,y] = R[y,x].
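The exponent-swapping map T of Lemma 33.9 can be modelled concretely. In this sketch (names ours), an element of ℤ[x,y] is a dict sending an exponent pair (i, j) to the coefficient of x^i y^j, and T simply swaps the two exponents.

```python
# A finite model of the isomorphism T: R[x][y] -> R[y][x] for R = Z.

def mul(f, g):
    # multiply by distributivity: exponents add, coefficients multiply
    h = {}
    for (i, j), a in f.items():
        for (r, s), b in g.items():
            key = (i + r, j + s)
            h[key] = h.get(key, 0) + a * b
    return {k: v for k, v in h.items() if v != 0}

def T(f):
    # read f in R[x][y], rewrite it in R[y][x]: swap the exponents
    return {(j, i): a for (i, j), a in f.items()}

f = {(1, 0): 2, (0, 2): 3}    # 2x + 3y^2
g = {(1, 1): 1, (0, 0): 5}    # xy + 5

print(T(mul(f, g)) == mul(T(f), T(g)))   # T preserves products
print(T(T(f)) == f)                      # T is its own inverse
```

Additivity of T is immediate from the dict representation; the product check is the content of the second and third steps of the proof.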

We can of course adjoin a new indeterminate z to R[x,y] and obtain
(R[x,y])[z] = (R[x][y])[z] =: R[x][y][z]. We see

R[x][y][z] = (R[x][y])[z]   (definition)
          ≅ (R[x][z])[y]   (Lemma 33.9 with R[x], z in place of R, x)
          ≅ (R[z][x])[y]   (Lemma 33.9 and Theorem 33.8)
          ≅ (R[z][y])[x]   (Lemma 33.9 with R[z] in place of R)
          ≅ (R[y][z])[x]
          ≅ (R[y][x])[z].

We regard these six rings as identical and denote them all by R[x,y,z]. The
notations "R[x,y,z]", "R[x,z,y]", "R[z,x,y]", "R[z,y,x]", "R[y,z,x]", "R[y,x,z]" will
mean the same ring.

More generally, if x_1,x_2, . . . ,x_n are indeterminates over a ring R, then
R[x_1,x_2, . . . ,x_n] is defined to be the ring R[x_1,x_2, . . . ,x_{n−1}][x_n]. It is isomorphic
to each one of the n! rings R[x_{i_1},x_{i_2}, . . . ,x_{i_n}], where (i_1,i_2, . . . ,i_n) runs
through the permutations of (1,2, . . . ,n) in S_n. These n! isomorphic rings
will be considered identical. Elements of R[x_1,x_2, . . . ,x_n] are of the form

∑_{i=0}^{N_1} ∑_{j=0}^{N_2} . . . ∑_{l=0}^{N_n} a_{ij...l} x_1^i x_2^j . . . x_n^l,   a_{ij...l} ∈ R.

The polynomials in R[x_1,x_2, . . . ,x_n] of the form a x_1^i x_2^j . . . x_n^l will be called
monomials over R. It is customary to omit the indeterminates with
exponent zero in a monomial. For example, a x_1^0 x_2^2 x_3^0 x_4^3 in R[x_1,x_2,x_3,x_4]
is written a x_2^2 x_4^3. An exponent is dropped when it is equal to 1. If R
does not have an identity, the indeterminates x_1,x_2, . . . ,x_n are not
elements of R[x_1,x_2, . . . ,x_n] and the expressions x_1^i x_2^j . . . x_n^l are not
polynomials.

The degree of a nonzero monomial a x_1^i x_2^j . . . x_n^l is defined to be the
nonnegative integer i + j + . . . + l. The total degree of a polynomial f =
∑_{i=0}^{N_1} ∑_{j=0}^{N_2} . . . ∑_{l=0}^{N_n} a_{ij...l} x_1^i x_2^j . . . x_n^l is defined to be the maximum of the
degrees of the monomials a_{ij...l} x_1^i x_2^j . . . x_n^l with a_{ij...l} ≠ 0. The total degree
of f will be denoted by deg f. The degree of f, considered as an element
of R[x_1, . . . ,x_{h−1},x_{h+1}, . . . ,x_n][x_h], will be called the degree of f in x_h; this will
be written deg_h f (h = 1,2, . . . ,n). The analog of Lemma 33.3 holds for
polynomials in n indeterminates, both with the total degree and the
degree in x_h in place of deg f.

We record a lemma that can be proved by induction on the number of
indeterminates.

33.10 Lemma: Let R be a ring and x_1,x_2, . . . ,x_n indeterminates over R.
(1) If R is commutative, then R[x_1,x_2, . . . ,x_n] is commutative.
(2) If R has an identity, then R[x_1,x_2, . . . ,x_n] has an identity.
(3) If R has no zero divisors, then R[x_1,x_2, . . . ,x_n] has no zero divisors.
(4) If R is an integral domain, then R[x_1,x_2, . . . ,x_n] is an integral domain.

Exercises

1. Evaluate: (5x^2 − 3x + 1)(7x^3 + 6x − 1) in ℤ_8[x],
(3x^3 + 4x^2 + 1)(3x^2 + 7x + 2) in ℤ_9[x],
[(0 1; 1 0)x^4 + (1 0; 2 0)x^2 + (2 1; 0 1)][(1 0; 0 0)x^2 − (1 0; 1 0)x + (1 1; 1 1)]
in (Mat_2(ℤ))[x],
[(0 1; 1 0)x^3 + (1 0; 3 2)x^2 + (1 1; 0 0)][(2 0; 1 3)x^2 + (1 2; 5 0)x + (2 1; 4 0)]
in (Mat_2(ℤ_7))[x]
(we dropped the bars for ease of notation; (a b; c d) denotes the 2×2
matrix with rows (a b) and (c d)).

2. Let R, R_1, R_2 be rings. Prove that

(Mat_2(R))[x] ≅ Mat_2(R[x])  and  (R_1 ⊕ R_2)[x] ≅ R_1[x] ⊕ R_2[x]

(see §29, Ex. 10).

3. Generalize Lemma 33.7 to polynomial rings in n indeterminates.

4. Let R be a commutative ring with identity and let a_nx^n + a_{n−1}x^{n−1} + . . . + a_0
be a zero divisor in R[x]. Show that there exists a nonzero b in R such
that ba_n = ba_{n−1} = . . . = ba_0 = 0.

5. Let R be a ring and f = ∑_{i=0}^{N_1} ∑_{j=0}^{N_2} . . . ∑_{l=0}^{N_n} a_{ij...l} x_1^i x_2^j . . . x_n^l ∈ R[x_1,x_2, . . . ,x_n].
Prove that deg_1 f is the largest i such that a_{ij...l} ≠ 0 for some j, . . . ,l;
that deg_2 f is the largest j such that a_{ij...l} ≠ 0 for some i, . . . ,l; . . . ; and
that deg_n f is the largest l such that a_{ij...l} ≠ 0 for some i,j, . . . .

6. Extend Lemma 33.3 to polynomial rings in n indeterminates, both
with the total degree and the degree in x_h in place of the degree of f.

§34
Divisibility in Polynomial Domains

We learned in Lemma 33.6 that some properties of a ring R are trans-
ferred to the polynomial ring R[x]. In particular, if R is an integral
domain, so is R[x]. In any integral domain, we have a theory of divisi-
bility (§32). In this paragraph, we want to investigate the divisibility
properties of polynomials. Lemma 33.6 suggests the questions: Is R[x] a
Euclidean domain if R is a Euclidean domain? Is R[x] a principal ideal
domain if R is a principal ideal domain? Is R[x] a unique factorization
domain if R is a unique factorization domain? The answer to the first
two questions is 'no'. For example, ℤ[x] is not a principal ideal domain,
let alone a Euclidean domain, although ℤ is Euclidean. On the other hand,
the third question receives an affirmative answer: if R is a unique
factorization domain, so is R[x]. This will be proved as Theorem 34.13.

Let us recollect the basic definitions. Assume D is an integral domain.
Then D[x] is an integral domain (Lemma 33.6). A polynomial f ∈ D[x] is
said to be divisible by a nonzero polynomial g ∈ D[x] if there is a poly-
nomial h in D[x] such that f = gh. We write then g | f. Notice that the
coefficients of h are required to be in D. The notation g | f does not merely
mean that f = gh for some arbitrary polynomial h. It means f = gh for
some polynomial h in D[x].

When f ≠ 0 and f = gh, we have deg f = deg gh = deg g + deg h ≥ deg g:

34.1 Lemma: Let D be an integral domain. If g,f ∈ D[x], g ≠ 0 ≠ f and g | f,
then deg g ≤ deg f.

A nonzero polynomial e ∈ D[x] is a unit of D[x] if eh = 1 for some h ∈ D[x],
or, equivalently, if e | f for all f ∈ D[x]. In this case, Lemma 33.3 yields

0 = deg 1 = deg eh = deg e + deg h ≥ 0 + 0 = 0,
deg e = 0, deg h = 0,
e ∈ D, h ∈ D,
eh = 1 holds in D,
e is a unit in D.

(e ≠ 0 ≠ h, because eh = 1 ≠ 0 and D is an integral domain.) So a unit in
D[x] is a unit in D: if a polynomial e = ∑_{i=0}^m a_i x^i is a unit in D[x], then a_0 ∈ D
is a unit in D and a_1 = a_2 = . . . = a_m = 0. Conversely, if e is a unit in D so
that eh = 1 for some h ∈ D, then of course e,h ∈ D[x] and e is a unit in
D[x]. We proved the following lemma.

34.2 Lemma: Let D be an integral domain. Then e ∈ D[x] is a unit in D[x]
if and only if e ∈ D and e is a unit in D. In symbols, (D[x])* = D*.

Thus any unit in D[x] has degree 0 and the associates of a polynomial in
D[x] have the same degree as the polynomial itself. Any proper divisor
of f ∈ D[x] is therefore of degree distinct from 0 and deg f.

A polynomial f in D[x]\{0} is irreducible if f is not a unit in D[x] and if, in
any factorization of f as f = gh in D[x], either g or h is a unit. This is
Definition 32.7. We paraphrase this as follows: f ∈ D[x]\{0} is irreducible
if deg f ≠ 0 and if there are no polynomials g,h in D[x] such that f = gh
and 0 < deg g, deg h < deg f. The phrase "in D[x]" is important. Suppose
D ⊆ D_1, where D_1 is another integral domain. Then f ∈ D_1[x], too. Now it is
possible that

there exist no g,h ∈ D[x] such that f = gh, 0 < deg g < deg f

and yet possibly

there exist some g,h ∈ D_1[x] such that f = gh, 0 < deg g < deg f.

Then f is irreducible in D[x], but not in D_1[x]. This shows that irreducibil-
ity of f is not an intrinsic property of f. It is a property of f relative to
the polynomial domain D[x]. For this reason, we have to mention the
domain D whenever we speak about irreducible polynomials. We say f is
irreducible over D when f is irreducible in D[x]. For example, x^2 + 1 ∈
ℝ[x] is irreducible over ℝ since x^2 + 1 has no proper divisors in ℝ[x], but
x^2 + 1 is reducible in ℂ[x] since x^2 + 1 = (x − i)(x + i), with x − i, x + i ∈ ℂ[x]
and 0 < 1 = deg(x − i) < 2 = deg(x^2 + 1).

We now compare the irreducibility of an element of D in D with its irre-
ducibility in D[x].

34.3 Lemma: Let D be an integral domain and let a be any nonzero
element of D ⊆ D[x]. Then a is irreducible in D[x] if and only if a is irre-
ducible in D.

Proof: Suppose that a is irreducible in D. We prove that a is irreducible
in D[x]. First we must show that a is not a unit in D[x]. Since a is irre-
ducible in D, so not a unit in D, we have a ∉ D* = (D[x])* (Lemma 34.2), so a
is not a unit in D[x]. Secondly we must show that a = bc, where b,c ∈ D[x],
implies either b or c is a unit in D[x]. Indeed, if a = bc, then 0 = deg a =
deg bc = deg b + deg c ≥ 0, so deg b = 0 = deg c. Then a = bc is an
equation in D. Since a is irreducible in D, either b or c is a unit in D, so, in
view of Lemma 34.2, either b or c is a unit in D[x]. This proves that a is
irreducible in D[x].

Now the converse. We suppose that a is irreducible in D[x] and show that
a is irreducible in D. First we must show that a is not a unit in D. Since a
is irreducible in D[x], so not a unit in D[x], we have a ∉ (D[x])* = D* (Lemma
34.2), so a is not a unit in D. Secondly we must show that a = bc, where
b,c ∈ D, implies either b or c is a unit in D. We read a = bc as an equation
in D[x]. Since a is irreducible in D[x], either b or c is a unit in D[x], so, in
view of Lemma 34.2, either b or c is a unit in D. This proves that a is
irreducible in D.

We want to find the integral domains D such that D[x] is a unique
factorization domain. What conditions must be imposed on D? If D[x] is
to be a unique factorization domain, then each element of D[x]\{0} that is
not a unit in D[x] must be written as a product of irreducible elements
of D[x] in a unique way. In particular, each element of D\{0} that is not a
unit in D[x] must be written as a product of irreducible elements of D[x]
in a unique way. As any divisor in D[x] of an element in D belongs to D
by degree considerations, the last statement means (Lemma 34.2,
Lemma 34.3): each element of D\{0} that is not a unit in D must be
written as a product of irreducible elements of D in a unique way. Thus
D must be a unique factorization domain. We shall prove conversely that
D[x] is a unique factorization domain whenever D is. The proof will make
use of the polynomial ring F[x], where F is the field of fractions of D
(§31). F[x] will turn out to be a Euclidean domain.

We show generally that K[x] is a Euclidean domain if K is a field. In order
to do that, let us remember, we must find a function d: K[x]\{0} → ℕ ∪ {0}
such that d(f) ≤ d(fg) for all f,g ∈ K[x]\{0} and such that, for any nonzero
polynomials f,g in K[x], there are polynomials q,r ∈ K[x] with f = qg + r
and r = 0 or d(r) < d(g). The degree of polynomials will work as the
function d. First we prove a slightly more general theorem.

34.4 Theorem (Division algorithm): Let D be an integral domain and
let f,g be polynomials in D[x]. If the leading coefficient of g is a unit in D,
then there are unique polynomials q,r in D[x] such that
f = qg + r,   r = 0 or deg r < deg g.

Proof: First we prove the existence of q and r. This is nothing but the
long division of polynomials. Suppose we divide f = x^5 − 2x^4 + 3x^3 + x^2 − x +
2 by g = x^2 + x + 1. What do we do? We subtract x^3 times g from f:

  x^5 − 2x^4 + 3x^3 + x^2 − x + 2   |  x^2 + x + 1
−(x^5 +  x^4 +  x^3)                |  x^3
   −3x^4 + 2x^3 + x^2 − x + 2

and get the polynomial f_1 = −3x^4 + 2x^3 + x^2 − x + 2, whose degree is
smaller than the degree of f. Then we subtract −3x^2 times g from f_1 and
get a polynomial f_2 = 5x^3 + 4x^2 − x + 2, whose degree is smaller than the
degree of f_1. We continue this process until we get a polynomial r whose
degree is smaller than the degree of g = x^2 + x + 1:

  x^5 − 2x^4 + 3x^3 + x^2 − x + 2   |  x^2 + x + 1
−(x^5 +  x^4 +  x^3)                |  x^3 − 3x^2 + 5x − 1
   −3x^4 + 2x^3 + x^2 − x + 2
 −(−3x^4 − 3x^3 − 3x^2)
           5x^3 + 4x^2 − x + 2
         −(5x^3 + 5x^2 + 5x)
                 −x^2 − 6x + 2
               −(−x^2 −  x − 1)
                      −5x + 3.

Hence f = (x^3 − 3x^2 + 5x − 1)g + (−5x + 3). In general, we pass from f to

f_1 = f − ax^m g,

where a ∈ D and m ∈ ℕ ∪ {0} are chosen appropriately, and deg f_1 < deg f.
Then, by induction on the degree of f, we can divide f_1 (and hence f) by
g and get a remainder r. This is essentially the proof.

Now let f,g be nonzero polynomials in D[x] and suppose that the leading
coefficient of g is a unit in D. We prove the existence of q and r by
induction on deg f.

I. Induction begins at 0. Suppose deg f = 0. Then f ∈ D\{0}. Since
the leading coefficient of g is a unit in D by hypothesis, if g ∈ D, there is
a g^{−1} ∈ D such that g^{−1}g = 1, hence fg^{−1} ∈ D and we can write
f = (fg^{−1})g + 0.
If g ∈ D[x]\D, then deg g ≥ 1 and we can write
f = 0g + f.
This proves the existence of q and r with
q = fg^{−1}, r = 0   in case g ∈ D,
q = 0, r = f   in case g ∈ D[x]\D.

II. Now the inductive step. We use the principle of induction in the
form 4.5. We assume that deg f = n ≥ 1 and that, for any nonzero
polynomial h with deg h < n, there are polynomials q_1 and r_1 in D[x]
such that
h = q_1g + r_1,   r_1 = 0 or deg r_1 < deg g.

In case deg g > n, we have
f = 0g + f,   deg f = n < deg g,
and this proves the existence of q and r with
q = 0, r = f.

Having disposed of the case deg g > n, we assume now deg g ≤ n. We
subtract a suitable multiple of g from f to get a polynomial of degree
smaller than n. If, say,

f = a_nx^n + a_{n−1}x^{n−1} + . . . + a_0,   g = b_mx^m + b_{m−1}x^{m−1} + . . . + b_0,
b_m is a unit in D,
b_mb_m^{−1} = 1 for some b_m^{−1} ∈ D,
m ≤ n,

then we put f_1 := f − a_nb_m^{−1}x^{n−m}g. Here either f_1 = 0 and the existence
of q and r is proved with q = a_nb_m^{−1}x^{n−m}, r = 0; or
f_1 = f − a_nb_m^{−1}x^{n−m}g
   = (a_nx^n + a_{n−1}x^{n−1} + . . . + a_0) − a_nb_m^{−1}x^{n−m}(b_mx^m + b_{m−1}x^{m−1} + . . . + b_0)
is a polynomial in D[x] of degree < n. By the induction hypothesis, there
are polynomials q_1, r_1 in D[x] such that
f_1 = q_1g + r_1,   r_1 = 0 or deg r_1 < deg g.
Hence f = f_1 + a_nb_m^{−1}x^{n−m}g
         = (q_1g + r_1) + (a_nb_m^{−1}x^{n−m}g)
         = (q_1 + a_nb_m^{−1}x^{n−m})g + r_1,   r_1 = 0 or deg r_1 < deg g,
and this proves the existence of q and r with q = q_1 + a_nb_m^{−1}x^{n−m}, r = r_1,
and completes the proof of the inductive step. The hypothesis that the
leading coefficient of g be a unit has been used to construct the f_1 with
deg f_1 < deg f.

The uniqueness of q and r. Suppose

f = qg + r = q′g + r′;   r = 0 or deg r < deg g;   r′ = 0 or deg r′ < deg g.
Then (qg + r) − (q′g + r′) = f − f = 0,
(q − q′)g = r′ − r,
and the assumption q − q′ ≠ 0 leads to the contradiction
deg g ≤ deg(q − q′) + deg g = deg (q − q′)g = deg (r′ − r) ≤ max{deg r′, deg r} < deg g
by Lemma 33.3. This forces q − q′ = 0, so q = q′, so r = f − qg = f − q′g = r′.
Thus q and r are uniquely determined.
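The inductive step of the proof is exactly the loop of long division. The following sketch (names ours) carries it out in Python for a monic divisor, whose leading coefficient 1 is certainly a unit; polynomials over ℤ are coefficient lists in increasing degree, and the worked example above is used as the test case.

```python
# Division with remainder in Z[x] by a monic divisor (Theorem 34.4).

def poly_divmod_monic(f, g):
    # repeatedly subtract a_n * x^(n-m) * g, as in the inductive step
    r = f[:]
    m = len(g) - 1                     # deg g; g is monic, so g[-1] == 1
    q = [0] * max(len(f) - m, 1)
    while len(r) > m and any(r):
        # strip trailing zeros so r[-1] is the leading coefficient
        while r and r[-1] == 0:
            r.pop()
        if len(r) <= m:
            break
        n = len(r) - 1
        a = r[-1]
        q[n - m] = a
        for j, b in enumerate(g):
            r[n - m + j] -= a * b      # r := r - a x^(n-m) g
    return q, r

# f = x^5 - 2x^4 + 3x^3 + x^2 - x + 2, g = x^2 + x + 1
f = [2, -1, 1, 3, -2, 1]
g = [1, 1, 1]
q, r = poly_divmod_monic(f, g)
print(q)   # x^3 - 3x^2 + 5x - 1  ->  [-1, 5, -3, 1]
print(r)   # -5x + 3              ->  [3, -5]
```

The output reproduces the quotient and remainder found by hand in the proof.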

34.5 Theorem: Let K be a field.
(1) For any nonzero polynomials f,g in K[x], there are unique polynomials
q and r in K[x] such that
f = qg + r,   r = 0 or deg r < deg g.
(2) K[x] is a Euclidean domain.
(3) K[x] is a unique factorization domain.

Proof: (1) Since g ≠ 0, it has a leading coefficient, which is distinct from
0. Then the leading coefficient of g is a unit in K (Example 32.6(b)).
The assertion follows now from Theorem 34.4.

(2) We prove that deg: K[x]\{0} → ℕ ∪ {0} satisfies the conditions in
Definition 32.10. Certainly deg f is a nonnegative integer by definition
and deg f ≤ deg fg for all f,g ∈ K[x]\{0} by Lemma 33.3. This proves the
condition (i) in Definition 32.10. The condition (ii) is proved in part (1).

(3) This follows from Theorem 32.22.

We record some consequences of Theorem 34.5.

34.6 Theorem: Let K be a field. Any two polynomials f,g in K[x], not
both zero, have a greatest common divisor d in K[x]. If d is a greatest
common divisor of f and g, then there are polynomials h and l in K[x]
such that d = hf + lg. Any two greatest common divisors of f and g are
associate. In particular, there is one and only one monic greatest
common divisor of f and g. (This unique monic greatest common divisor
of f and g is sometimes called the greatest common divisor of f and g).
Any irreducible polynomial in K[x] is prime in K[x] (Definition 32.20).

Theorem 34.5 is very satisfactory. If the underlying ring is a field, then
the polynomial domain is a unique factorization domain. We turn our
attention to polynomials with coefficients in a unique factorization
domain. Let D be a unique factorization domain and let F be the field of
fractions of D. We recall that the elements of F are fractions a/b of
elements a,b ∈ D, b ≠ 0. We identify a ∈ D with a/1 ∈ F and thus regard D
as a subring of F. In this way, D[x] ⊆ F[x]. (If you find this and the
following discussion too abstract, you may just assume D = ℤ and F = ℚ.)

Let f ∈ D[x] ⊆ F[x]. Now, a priori, f may be irreducible over D and not ir-
reducible over F. See the comments preceding Lemma 34.3. In the case
where D is a unique factorization domain and F is the field of fractions of
D, it is in fact true that an irreducible polynomial in D[x] is also irre-
ducible in F[x]. After some preparation, this will be proved in Lemma
34.11. The hypothesis that D be a unique factorization domain is essen-
tial, for otherwise the following definition, which plays an important role
in the proof of Lemma 34.11, does not make sense.

34.7 Definition: Let D be a unique factorization domain and let f be
any nonzero polynomial in D[x]. A greatest common divisor of the
coefficients of f is called a content of f.

Since greatest common divisors are uniquely determined only to within am-
biguity among associate elements, any two contents of f are associate.
We write C(f) for any content of f. Ignoring the distinction among associ-
ate elements, we sometimes call C(f) the content of f by abuse of lan-
guage.

The contents of f = 2x^4 − 8x^2 + 2x + 6 ∈ ℤ[x] and g = 6x^2 − 9x + 18 ∈ ℤ[x]
are easily seen to be C(f) = 2 and C(g) = 3. The content of fg =
12x^6 − 18x^5 − 12x^4 + 84x^3 − 126x^2 − 18x + 108 is C(fg) = 6 = 2·3 = C(f)C(g).
This is an example of a general phenomenon.
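The numerical example can be checked in a few lines, taking the content of a polynomial over ℤ to be the (positive) gcd of its coefficients; the helper names are ours.

```python
# Contents in Z[x] and the example above for Gauss' lemma.
from math import gcd
from functools import reduce

def content(f):
    # gcd of the coefficients, taken positive
    return reduce(gcd, (abs(c) for c in f))

def poly_mul(f, g):
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

f = [6, 2, -8, 0, 2]    # 2x^4 - 8x^2 + 2x + 6
g = [18, -9, 6]         # 6x^2 - 9x + 18
print(content(f), content(g))     # 2 3
print(content(poly_mul(f, g)))    # 6, as Gauss' lemma predicts
```

Over ℤ the units are ±1, so "associate" in Lemma 34.8 reduces to equality up to sign, and taking contents positive makes C(fg) = C(f)C(g) an honest equality here.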

34.8 Lemma (Gauss' lemma): Let D be a unique factorization domain
and let f,g be arbitrary nonzero polynomials in D[x]. Then C(fg) ∼ C(f)C(g).

Proof: First we remark that we cannot write C(fg) = C(f)C(g), for
contents are unique only up to associate elements.

f and g can be written as f = C(f)f_1 and g = C(g)g_1, where f_1 and g_1 are
polynomials in D[x] with C(f_1) ∼ 1 and C(g_1) ∼ 1. Similarly fg = C(fg)h,
where h ∈ D[x] and C(h) ∼ 1. We have
C(f)f_1·C(g)g_1 = fg = C(fg)h,
C(f)C(g)f_1g_1 = C(fg)h.
Taking contents of both sides and observing C(al) ∼ aC(l) for a ∈ D\{0}
and l ∈ D[x]\{0}, we obtain
C(f)C(g)C(f_1g_1) ∼ C(fg)C(h),
C(f)C(g)C(f_1g_1) ∼ C(fg),
and the theorem will be proved if we can show C(f_1g_1) ∼ 1. Dropping the
subscripts, we must prove:
if C(f) ∼ 1 and C(g) ∼ 1, then C(fg) ∼ 1.

Suppose now C(f) ∼ 1, C(g) ∼ 1 and C(fg) is not a unit. Then there is an
irreducible element π in D with π | C(fg). Since C(f) ∼ 1 and C(g) ∼ 1 by
assumption, π cannot divide all the coefficients of
f = a_nx^n + a_{n−1}x^{n−1} + . . . + a_1x + a_0
nor of
g = b_mx^m + b_{m−1}x^{m−1} + . . . + b_1x + b_0,
say. Let a_h be the coefficient of f with the largest index that is not
divisible by π and let b_k have a similar meaning for g. Then
π | a_n, a_{n−1}, . . . , a_{h+1};  π ∤ a_h,   (1)
π | b_m, b_{m−1}, . . . , b_{k+1};  π ∤ b_k.   (2)
But π divides the coefficient
(. . . + a_{h+2}b_{k−2} + a_{h+1}b_{k−1}) + a_hb_k + [a_{h−1}b_{k+1} + a_{h−2}b_{k+2} + . . . ]
of x^{h+k} in fg. Because of (1) and (2), π divides the expressions in ( ) and
[ ]. So π divides a_hb_k as well. Thus π ∤ a_h, π ∤ b_k and π | a_hb_k, which tells us
that π is not a prime element in D. On the other hand, D is a unique
factorization domain and every irreducible element in D is prime
(Lemma 32.24), hence π is prime. This is a contradiction. We conclude
C(fg) ∼ 1.

34.9 Lemma: Let D be a unique factorization domain and let F be the
field of fractions of D. Let f,g be any nonzero polynomials in D[x] with
C(f) ∼ C(g). Then f and g are associate in F[x] if and only if f and g are
associate in D[x].

Proof: By Lemma 34.2,
e is a unit in D[x] ⟺ e is a unit in D,
u is a unit in F[x] ⟺ u is a unit in F ⟺ u ∈ F\{0}.

If f and g are associate in D[x], then f = eg for some unit e in D[x]. Then e
is a unit in D, so e is a nonzero element of D, so e is a nonzero element of
F, so e is a unit in F, so e is a unit in F[x], so f and g are associate in F[x].

If f and g are associate in F[x], then f = ug for some unit u in F[x]. Thus
u ∈ F\{0} and so u = a/b, where a,b ∈ D\{0}. So bf = ag. Thus
bC(f) ∼ C(bf) ∼ C(ag) ∼ aC(g) ∼ aC(f)
and b ∼ a in D. So a/b = u is a unit in D. Hence u is a unit in D[x] and f is
associate to g in D[x].

34.10 Lemma: Let D be a unique factorization domain and let F be the


field of fractions of D. Let f be a nonzero polynomial in D[x] with C(f) 1
and assume
f = g1g2. . . gr,
where g1, g2, . . . ,gr are polynomials in F[x]. Then there are polynomials
h1, h2, . . . ,hr in D[x] such that gi is associate to hi in F[x] and C(hi) 1 (for
all i = 1,2, . . . ,r) and
.
f = h1h2. . . hr.

Proof: The coefficients of g1,g2, . . . ,gr are fractions of elements from D. We
multiply each gi by an appropriate element ai in D, for example by the
product of the "denominators" in the coefficients of gi, to get a polynomial
ki ∈ D[x]. Thus aigi = ki ∈ D[x]. We write ki = cihi, where ci ∼ C(ki) ∈ D and
hi is a polynomial in D[x] with C(hi) ∼ 1. We have

a1a2. . . ar f = (a1g1)(a2g2). . . (argr) = k1k2. . . kr = c1c2. . . cr h1h2. . . hr

and, taking contents of both sides and using Lemma 34.8 (r − 1 times), we
get

a1a2. . . ar C(f) ∼ c1c2. . . cr C(h1)C(h2). . . C(hr),
a1a2. . . ar ∼ c1c2. . . cr.
Thus e := c1c2. . . cr/a1a2. . . ar is a unit in D and

f = (eh1)h2. . . hr.

Observe that hi = (ai/ci)gi is associate to gi in F[x], because ai/ci ∈ F\{0} is
a unit in F[x]. When we make a slight change of notation and write h1 for
eh1, the proof is complete (eh1 is also associate to g1 in F[x]). □

34.11 Lemma: Let D be a unique factorization domain and let F be the
field of fractions of D. Let f be a nonzero polynomial in D[x] with C(f) ∼ 1.
Then f is irreducible in F[x] if and only if f is irreducible in D[x].
Proof: Assume first that f is irreducible in F[x]. Then f is not a unit in
F[x], hence deg f ≥ 1, hence f is not a unit in D[x]. Also, if g,h ∈ D[x] and
f = gh, we read this equation in F[x] and conclude that either g or h is
associate to f in F[x]. We know 1 ∼ C(f) ∼ C(gh) ∼ C(g)C(h), so C(g) ∼ 1
∼ C(f) and C(h) ∼ 1 ∼ C(f). Using Lemma 34.9, we deduce that either g or h
is associate to f in D[x]. Thus f is not a unit in D[x] and has no proper
divisors in D[x]. This means f is irreducible in D[x].
Conversely, assume that f is irreducible in D[x]. Then f is not a unit in
D[x] and so not a unit in D. This gives deg f ≥ 1, for otherwise f ∼ C(f) ∼ 1
would be a unit in D. So deg f ≥ 1 and f is not a unit in F[x]. We now
want to show that f has no proper divisors in F[x]. Assume f = g1g2,
where g1,g2 ∈ F[x]. By Lemma 34.10, f = h1h2, where h1,h2 ∈ D[x], C(h1) ∼ 1
∼ C(f), C(h2) ∼ 1 ∼ C(f) and g1,g2 are respectively associate to h1,h2 in F[x].
Since f is irreducible in D[x], either h1 or h2 is associate to f in D[x] and
thus, by Lemma 34.9, either h1 or h2 is associate to f in F[x], hence either
g1 or g2 is associate to f in F[x]. Thus f has no proper divisors in F[x] and f
is irreducible in F[x]. □
We need one more lemma to prove that D[x] is a unique factorization
domain whenever D is. It comprises the main argument.
34.12 Lemma: Let D be a unique factorization domain and let f be a
nonzero polynomial in D[x] such that C(f) ∼ 1 and deg f ≥ 1. Then f can
be written as a product of irreducible polynomials in a unique way.

Proof: Let F be the field of fractions of D. We will use the fact that F[x] is
a unique factorization domain and the fact that irreducibility in D[x] and
in F[x] coincide (Theorem 34.5, Lemma 34.11).

Consider f as a polynomial in F[x]. By Theorem 34.5,

f = g1g2. . . gr,   g1, g2, . . . ,gr ∈ F[x],

where g1, g2, . . . ,gr are irreducible in F[x]. According to Lemma 34.10,

f = h1h2. . . hr,   h1, h2, . . . ,hr ∈ D[x]

for some polynomials hi in D[x] with C(hi) ∼ 1 and hi associate to gi in
F[x] (i = 1,2, . . . ,r). Hence hi is irreducible in F[x] and, by Lemma 34.11, hi
is also irreducible in D[x]. We proved that f can be written as a product
of irreducible polynomials in D[x].
Now uniqueness (up to the order of factors and ambiguity among
associate polynomials). Let f ∈ D[x] with C(f) ∼ 1 and deg f ≥ 1, and let

f = p1p2. . . pr = q1q2. . . qs,   pi, qj ∈ D[x]   (1)

be two representations of f as a product of irreducible polynomials
p1, p2, . . . ,pr, q1, q2, . . . ,qs in D[x]. Taking contents and using Lemma 34.8,
we get

C(p1)C(p2). . . C(pr) ∼ C(f) ∼ 1 ∼ C(q1)C(q2). . . C(qs),

so that C(pi) and C(qj) are units in D. By Lemma 34.11, the polynomials
pi, qj are irreducible in F[x]. Since F[x] is a unique factorization domain,
we deduce from (1) that r = s and, possibly after reindexing the
polynomials, pi is associate to qi in F[x]. Since C(pi) ∼ C(qi), Lemma 34.9
tells us that pi is associate to qi in D[x] (i = 1,2, . . . ,r). This completes the
proof. □
34.13 Theorem: If D is a unique factorization domain, then D[x] is a
unique factorization domain.

Proof: Given any nonzero polynomial f in D[x] which is not a unit in D[x],
we have to show that f can be written as a product of irreducible poly-
nomials in D[x], and that this representation is unique up to the order of
factors and ambiguity between associate polynomials.

Now let f ∈ D[x], f ≠ 0, f not a unit in D[x]. If deg f = 0, then f ∈ D and, since D
is a unique factorization domain, f can be written as a product of irre-
ducible elements p1, p2, . . . ,pr of D. These elements are uniquely deter-
mined, and they are irreducible also in D[x] (Lemma 34.3). So f can be
written as a product of irreducible elements in a unique way if deg f = 0.
Suppose next deg f ≥ 1. We write f = cf1, where c ∼ C(f) ∈ D and f1 ∈ D[x]
with C(f1) ∼ 1, deg f1 ≥ 1. Here c and f1 are uniquely determined up to a
unit in D. Now c ∈ D can be written as a product of irreducible elements
in D, which are also irreducible in D[x]:

c = a1a2. . . ar,   ai irreducible in D[x],

and the ai are uniquely determined. By Lemma 34.12, f1 can be written as a
product of irreducible polynomials in D[x]:

f1 = q1q2. . . qs,   qj irreducible in D[x],

and the qj are uniquely determined. Hence

f = a1a2. . . arq1q2. . . qs

is a product of the irreducible polynomials ai, qj in D[x], which are unique
up to the order of factors and ambiguity between associate elements. □
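The first step of the proof, writing f = cf1 with c ∼ C(f) and f1 primitive, is a one-line computation over D = ℤ. A small Python sketch (the helper name is ours; coefficient-list representation as before):

```python
from math import gcd
from functools import reduce

def primitive_split(f):
    # write f = c * f1 with c the content of f and f1 primitive, C(f1) ~ 1
    c = reduce(gcd, f)
    return c, [a // c for a in f]

c, f1 = primitive_split([12, 18, 30])   # 30x^2 + 18x + 12
print(c, f1)                            # 6 [2, 3, 5]
print(reduce(gcd, f1))                  # 1: f1 is primitive
```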
By repeated application of Theorem 34.13, we get

34.14 Theorem: If D is a unique factorization domain, then D[x1,x2, . . . ,xn]
is a unique factorization domain.

In particular,

34.15 Theorem: If K is a field, then K[x1,x2, . . . ,xn] is a unique factoriza-
tion domain.

Exercises

1. Prove that x⁴ + 1 ∈ ℤ[x] is irreducible over ℚ by comparing the
coefficients of both sides in a hypothetical factorization x⁴ + 1 = fg and
deriving a contradiction from it. Investigate the cases deg f = 1, deg g = 3
and deg f = 2 = deg g separately.

2. Do Ex. 1 for x⁴ + 2 and x⁴ + 3 ∈ ℤ[x].

3. Show that x⁴ + 4 is reducible over ℚ.

4. Show that x⁴ + 1 ∈ ℤ2[x] is reducible over ℤ2.

5. Show that x⁴ + 1 ∈ (ℤ[√2])[x] is reducible over ℤ[√2] (see §32, Ex. 3).

6. Find a content of

(a) 65x⁴ + 26x² − 9x + 143 ∈ ℤ[x]
(b) (5 + i)x³ + (−1 + 5i)x + (−4 + 7i) ∈ (ℤ[i])[x]
(c) (1 + √2)x⁴ + (−1 + 2√2)x³ + (1 − 2√2)x² + 3x + (2 + 3√2) ∈ (ℤ[√2])[x]
(d) 8x⁴ + 24x³ − 32x² − 48x + 56 ∈ ℤ[x]
(e) 3x² + 5x + 7 ∈ ℤ97[x].

7. Let D be a unique factorization domain and let F be the field of
fractions of D. Let f ∈ D[x] be a nonzero polynomial whose leading
coefficient is a unit in D. Suppose that g,h ∈ F[x] are monic and f = gh.
Prove that then g ∈ D[x] and h ∈ D[x].

8. Let D be a unique factorization domain and let f,g ∈ D[x]\D. Prove that
a greatest common divisor of f and g has degree ≥ 1 if and only if there
are polynomials h,k in D[x] satisfying deg h < deg g and deg k < deg f
such that fh = gk.
§35
Substitution and Differentiation

In this paragraph, we study the divisibility of polynomials by those of


the first degree. We prove the familiar remainder theorem. Roots of
polynomials are introduced and multiple roots are examined.

Everything in this paragraph is based on the substitution homomorphism
which we now define.

35.1 Definition: Let R be a ring and let f = ∑_{i=0}^{m} a_i x^i be an arbitrary
polynomial in R[x]. Let S be a ring containing R. For any s ∈ S, the
element ∑_{i=0}^{m} a_i s^i of S is called the value of f at s. The value of f at s is said
to be obtained by substituting s for x or by evaluating f at s. The value
∑_{i=0}^{m} a_i s^i of f at s will be denoted by f(s).

In many cases, S is taken to be R, and then f(s) ∈ R. In fact, we may al-
ways assume S = R by taking f as a polynomial in S[x]. However, if R ⊆ S
and s ∈ S\R, then f(s) need not belong to R.

35.2 Examples: (a) Let g = 4x² + 6x + 8 ∈ E[x], where E is the ring of
even integers; so E ⊆ ℤ. Now 1 ∈ ℤ and g(1) = 4·1² + 6·1 + 8 = 18 ∈ ℤ.

(b) Let h = 3x³ + 4x² + x − 1 ∈ ℤ[x]. Here ℚ is a ring that contains ℤ and
2/5 ∈ ℚ. We have h(2/5) = 3(2/5)³ + 4(2/5)² + (2/5) − 1 = 29/125 ∈ ℚ.
(c) Polynomials over matrix rings can be evaluated in the same way. If
f = Ax² + Bx + C ∈ (Mat2(ℤ))[x] and s ∈ Mat2(ℚ), then, since Mat2(ℚ) is a
ring containing Mat2(ℤ), the value f(s) = As² + Bs + C is an element of
Mat2(ℚ).

(d) Let R be a ring with identity and f = ∑_{i=0}^{m} a_i x^i ∈ R[x]. Then R ⊆ R[x] and
x ∈ R[x]. The value of f at x ∈ R[x] is f(x) = ∑_{i=0}^{m} a_i x^i = f ∈ R[x], so f(x) = f.

From now on, the notations f and f(x) for a polynomial in R[x] will be
used interchangeably.

(e) Again let R be a ring with identity and f = ∑_{i=0}^{m} a_i x^i ∈ R[x] be a
polynomial with coefficients in R. Let y be an indeterminate distinct
from x. Then R is contained in R[y] and y ∈ R[y]. The value of f at y is
f(y) = ∑_{i=0}^{m} a_i y^i ∈ R[y].

(f) Let p = x³ − x + 1 ∈ ℤ[x]. Now ℤ ⊆ ℤ[x], x + 1 ∈ ℤ[x] and
p(x + 1) = (x + 1)³ − (x + 1) + 1 = x³ + 3x² + 2x + 1 ∈ ℤ[x]. Similarly x² ∈
ℤ[x] and p(x²) = (x²)³ − (x²) + 1 = x⁶ − x² + 1 ∈ ℤ[x].

(g) Let R be a ring. For any f ∈ R[x], the value of f at g ∈ R[x] can be
found as in the last example, and it is a polynomial f(g(x)) in R[x].

(h) Let f = 3x² − 5x + 2 ∈ ℤ12[x]. The value f(1) of f at 1 ∈ ℤ is not
defined, for ℤ does not contain ℤ12.

(i) Let q = x² + x + 2 and r = x³ + x + 3 ∈ ℤ[x]. We put
t = qr = x⁵ + x⁴ + 3x³ + 4x² + 5x + 6 ∈ ℤ[x]. One checks easily that q(2) = 8,
r(2) = 13, t(2) = 104. Notice t(2) = 8·13 = q(2)·r(2). This is explained in
the next lemma.

35.3 Lemma: Let R be a ring, S a ring that contains R, and s an element
of S. If S is commutative, then the mapping

Ts: R[x] → S
      f ↦ f(s)

is a ring homomorphism (called the substitution or evaluation homo-
morphism).

Proof: For any f = ∑_{i=0}^{m} a_i x^i, g = ∑_{j=0}^{n} b_j x^j in R[x], we have

(f + g)Ts = (∑_{i=0}^{m} a_i x^i + ∑_{j=0}^{n} b_j x^j)Ts
= (∑_{i=0}^{m} (a_i + b_i) x^i)Ts   (assuming n = m without loss of generality)
= ∑_{i=0}^{m} (a_i + b_i) s^i
= ∑_{i=0}^{m} a_i s^i + ∑_{i=0}^{m} b_i s^i
= f(s) + g(s)
= fTs + gTs,

and further

(fg)Ts = [∑_{k=0}^{m+n} (∑_{i+j=k} a_i b_j) x^k]Ts = ∑_{k=0}^{m+n} (∑_{i+j=k} a_i b_j) s^k,

(fTs)(gTs) = (∑_{i=0}^{m} a_i x^i)Ts · (∑_{j=0}^{n} b_j x^j)Ts = (∑_{i=0}^{m} a_i s^i)(∑_{j=0}^{n} b_j s^j)
= ∑_{i,j} a_i s^i b_j s^j
= ∑_{i,j} a_i b_j s^{i+j}   (using the commutativity of S)
= ∑_{k=0}^{m+n} (∑_{i+j=k} a_i b_j) s^k
= (fg)Ts.

Hence Ts preserves sums and products, and is therefore a ring homo-
morphism. □
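Over S = ℤ the homomorphism property of Ts can be checked directly. A small Python sketch (our own helper names), reproducing the numbers of Example 35.2(i):

```python
def evalp(f, s):
    # the map T_s: value of f at s; f[k] is the coefficient of x^k
    return sum(a * s**k for k, a in enumerate(f))

def mul(f, g):
    # polynomial product in the coefficient-list representation
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return h

q = [2, 1, 1]      # x^2 + x + 2
r = [3, 1, 0, 1]   # x^3 + x + 3
print(evalp(q, 2), evalp(r, 2))   # 8 13
print(evalp(mul(q, r), 2))        # 104 = 8 * 13, i.e. (qr)(2) = q(2)r(2)
```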
In the proof of Lemma 35.3, the commutativity of S is used in a crucial
way. If S is not commutative, then Ts need not be a homomorphism. For
example,

Ix² − (1 0; 0 1) = [Ix + (0 1; 1 0)][Ix − (0 1; 1 0)] in (Mat2(ℤ))[x],

but substituting (1 0; 1 0) for x does not preserve products:

(1 0; 1 0)² − (1 0; 0 1) ≠ [(1 0; 1 0) + (0 1; 1 0)][(1 0; 1 0) − (0 1; 1 0)].

The substitution homomorphism is closely related to the division algo-
rithm in an integral domain.

35.4 Theorem (Remainder theorem): Let D be an integral domain,
f ∈ D[x] and a ∈ D. There is a unique polynomial q in D[x] such that

f(x) = q(x)(x − a) + f(a).

Proof: We divide f by (x − a). This is possible by Theorem 34.4, because
the leading coefficient of x − a is a unit in D (in fact = 1). Thus there are
unique polynomials q and r such that

f(x) = q(x)(x − a) + r(x),   r = 0 or deg r < deg (x − a) = 1.

So r is an element of D (zero or not). To find r, we substitute a for x;
since substitution is a homomorphism by Lemma 35.3, we get

f(a) = q(a)(a − a) + r(a)
f(a) = r.

This completes the proof. □
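For D = ℤ the division by x − a can be carried out by synthetic division, and the remainder that comes out is exactly f(a). A Python sketch (the helper name is ours; here the coefficient list starts with the highest power):

```python
def divide_by_linear(f, a):
    # synthetic division of f by (x - a); f lists coefficients with the
    # highest power first; returns (quotient coefficients, remainder)
    r = 0
    q = []
    for c in f:
        r = r * a + c
        q.append(r)
    return q[:-1], q[-1]

quot, rem = divide_by_linear([1, -3, 0, 5], 2)   # f = x^3 - 3x^2 + 5
print(quot, rem)   # [1, -1, -2] 1, and indeed f(2) = 8 - 12 + 5 = 1
```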

35.5 Definition: Let R be a ring, S a commutative ring that contains R
and let f be a polynomial in R[x]. An element a of S is called a root or
zero of f if f(a) = 0.

35.6 Theorem (Factor theorem): Let D be an integral domain, and let
f be an arbitrary polynomial in D[x]. Let E be an integral domain contain-
ing D and let a ∈ E. Then a is a root of f if and only if (x − a) | f in E[x].

Proof: By the remainder theorem (with E in place of D), there is a
polynomial q in E[x] such that f(x) = q(x)(x − a) + f(a). If a is a root of f,
then f(a) = 0, so f(x) = q(x)(x − a) and (x − a) | f(x) in E[x]. Conversely, if
(x − a) | f(x) in E[x], then (x − a) | [f(x) − q(x)(x − a)] in E[x], so (x − a) | f(a) in
E[x]. Thus f(a) = u(x)(x − a) for some u(x) ∈ E[x]. Substituting a for x, we
get f(a) = u(a)(a − a) = 0. So a is a root of f. □

The factor theorem puts an upper bound on the number of roots of poly-
nomials over integral domains, in particular of those over fields.

35.7 Theorem: Let D be an integral domain, f a nonzero polynomial in
D[x] and let E be an integral domain containing D. Then there are at most
deg f distinct roots of f in E.

Proof: We make induction on the degree of f. Polynomials of degree 0
are just the nonzero elements of D, and they have no roots in E (zero
roots). So the theorem is true when deg f = 0. Assume now deg f = 1, so
that f = cx + d, where c,d ∈ D and c ≠ 0. If f had more than one root in E,
say if a1, a2 were roots of f in E and a1 ≠ a2, we would get

ca1 + d = f(a1) = 0 = f(a2) = ca2 + d,
ca1 = ca2,
c(a1 − a2) = 0,   c ≠ 0,
a1 − a2 = 0,

contrary to a1 ≠ a2. Thus cx + d has either no roots in E or one and only
one root in E, and the theorem is proved when deg f = 1.

Suppose now n ≥ 2, deg f = n and that, for all integral domains D´, any
polynomial of degree n − 1 in D´[x] has at most n − 1 distinct roots in any
integral domain E´ that contains D´. If f has no roots in E, the theorem is
true. If f has a root a0 in E, we have

f(x) = q(x)(x − a0) for some q(x) ∈ E[x]

by the factor theorem. Here q(x) is of degree n − 1 by Lemma 33.3(3). By
our induction hypothesis, q(x) has at most n − 1 distinct roots in E. Now
let A be the set of all distinct roots of q(x) in E (possibly A = ∅) so that
|A| ≤ n − 1.

If b ∈ E is any root of f, then f(b) = 0, so q(b)(b − a0) = 0, so q(b) = 0 or
b = a0, so b ∈ A or b = a0. Hence B ⊆ A ∪ {a0}, where B is the set of all
distinct roots of f(x) in E. Thus |B| ≤ |A| + 1 ≤ (n − 1) + 1 = n and f has at
most n = deg f distinct roots in E. This completes the proof. □

Theorem 35.7 may be false if the underlying ring is not commutative or
if it has zero divisors. For example, x² + 1 ∈ H[x], of degree two over the
noncommutative ring H of Ex. 9 in §29, has infinitely many roots in H.
Also, the polynomial x² − 1 = 1x² − 1 over ℤ8, which has zero divisors,
possesses four distinct roots 1, 3, 5, 7 in ℤ8.
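The ℤ8 counterexample is easy to verify by brute force; a one-line Python check:

```python
# roots of x^2 - 1 over Z_8: a degree-2 polynomial with four roots,
# possible because Z_8 has zero divisors
roots = [a for a in range(8) if (a * a - 1) % 8 == 0]
print(roots)   # [1, 3, 5, 7]
```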

We give two applications of Theorem 35.7. In these applications, the
underlying integral domain is a field.

35.8 Theorem (Lagrange's interpolation formula): Let K be a field
and a0,a1, . . . ,an be distinct elements of K. Let b0,b1, . . . ,bn be arbitrary
elements of K (not necessarily distinct). Then there is a unique poly-
nomial f in K[x] such that f(a0) = b0, f(a1) = b1, . . . , f(an) = bn and such that
deg f ≤ n (one less than the number of a's or b's) or f = 0. This poly-
nomial is given explicitly by the formula

f = ∑_{i=0}^{n} [(x − a0). . . (x − a_{i−1})(x − a_{i+1}). . . (x − an)] / [(ai − a0). . . (ai − a_{i−1})(ai − a_{i+1}). . . (ai − an)] · bi.

Proof: The i-th summand

fi := [(x − a0). . . (x − a_{i−1})(x − a_{i+1}). . . (x − an)] / [(ai − a0). . . (ai − a_{i−1})(ai − a_{i+1}). . . (ai − an)] · bi

in the formula is 0 ∈ K[x] (when bi = 0) or a polynomial in K[x] of degree
n (when bi ≠ 0). Here fi(ai) = bi and fi(aj) = 0 for i ≠ j. So f := f0 + f1 + . . . + fn
is either the zero polynomial or a polynomial of degree at most n such
that

f(ai) = f0(ai) + f1(ai) + . . . + fn(ai) = 0 + . . . + fi(ai) + 0 + . . . + 0 = bi

for all i = 0,1, . . . ,n. This proves the existence of a polynomial with the
properties stated in the theorem, namely the one given explicitly above.

The uniqueness of f follows from Theorem 35.7. If g is a polynomial in
K[x] with deg g ≤ n, and if g(a0) = b0, g(a1) = b1, . . . , g(an) = bn, then the
polynomial h = f − g has at least n + 1 roots a0, a1, . . . ,an in K, and, if h ≠ 0,
then h has degree at most equal to n (Lemma 33.3(2)). This is not
compatible with Theorem 35.7, so h = 0 and g = f. Therefore f is the
unique polynomial satisfying the conditions above. □

The formula for f is easy to remember. We have f = f0 + f1 + . . . + fn,
where fi(ai) = bi and fi(aj) = 0 for i ≠ j. The second condition leads to
fi = (x − a0). . . (x − a_{i−1})(x − a_{i+1}). . . (x − an)ci for some ci ∈ K, and ci must be as in
the formula if fi(ai) is to be equal to bi.
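The formula translates directly into code. The following Python sketch (our own function name; exact arithmetic over ℚ via `Fraction`) builds the interpolating polynomial from the basis polynomials fi:

```python
from fractions import Fraction

def lagrange(points):
    # points: list of (a_i, b_i) with distinct a_i; returns the coefficients
    # (constant term first) of the interpolating polynomial of degree <= n
    n = len(points)
    coeffs = [Fraction(0)] * n
    for i, (ai, bi) in enumerate(points):
        basis = [Fraction(1)]   # running product of the factors (x - a_j)
        denom = Fraction(1)     # running product of the factors (a_i - a_j)
        for j, (aj, _) in enumerate(points):
            if j != i:
                new = [Fraction(0)] * (len(basis) + 1)
                for k, c in enumerate(basis):
                    new[k + 1] += c       # contribution of x * c x^k
                    new[k] += -aj * c     # contribution of -a_j * c x^k
                basis = new
                denom *= ai - aj
        for k, c in enumerate(basis):
            coeffs[k] += bi * c / denom
    return coeffs

# interpolating (1,1), (2,4), (3,9) recovers x^2
print(lagrange([(1, 1), (2, 4), (3, 9)]))   # [Fraction(0, 1), Fraction(0, 1), Fraction(1, 1)]
```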

35.9 Theorem (Wilson's theorem): If p is a prime number, then

(p − 1)! + 1 ≡ 0 (mod p).

Proof (Lagrange): Fermat's theorem (Theorem 12.6) states that
a^{p−1} ≡ 1 (mod p) for any integer a with (a,p) = 1. We can write this as

a^{p−1} − 1 = 0 in ℤp if a ≠ 0.

Thus the polynomial f = x^{p−1} − 1 = 1x^{p−1} − 1 ∈ ℤp[x] has p − 1 distinct roots
in ℤp, namely 1, 2, . . . , p − 1. The polynomial

g = (x − 1)(x − 2). . . (x − (p − 1))

has the same roots. Hence the polynomial

h = f − g = (1x^{p−1} − 1) − (x − 1)(x − 2). . . (x − (p − 1)) = (x^{p−1} − 1) − (x^{p−1} + . . . )

over ℤp has at least p − 1 roots 1, 2, . . . , p − 1 in ℤp. If h were not the
zero polynomial in ℤp[x], its degree would be less than p − 1. This
contradicts Theorem 35.7. So h is the zero polynomial in ℤp[x]: each
coefficient of h is equal to 0 ∈ ℤp. In particular,

0 = coefficient of x⁰ in h
 = (coefficient of x⁰ in f) − (coefficient of x⁰ in g)
 = (−1) − (−1)(−2). . . (−(p − 1))
 = −1 − (−1)^{p−1}(p − 1)!
 = −((p − 1)! + 1) in ℤp

provided p is odd. Hence (p − 1)! + 1 ≡ 0 (mod p) when p is an odd prime
number. But this congruence holds also when p = 2. This completes the
proof. □
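Wilson's congruence is easy to check numerically; a quick Python verification:

```python
from math import factorial

for p in (2, 3, 5, 7, 11, 13):
    print(p, (factorial(p - 1) + 1) % p)   # each prints "p 0"

# for a composite modulus such as 8 the congruence fails:
print((factorial(7) + 1) % 8)   # 1, not 0
```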

The next theorem will be familiar to the reader in the case of D = ℤ,
F = ℚ under the name of "rational root theorem".

35.10 Theorem: Let D be a unique factorization domain and let F be
the field of fractions of D. Let f = a_nx^n + a_{n−1}x^{n−1} + . . . + a_1x + a_0 ∈ D[x] be
an arbitrary polynomial in D[x]. If a = b/c ∈ F is a root of f, where b,c ∈ D
and (b,c) ∼ 1, then

c | a_n and b | a_0 in D.

In particular, if the leading coefficient of f is a unit in D, then any root of
f in F is actually in D.

Proof: By hypothesis, a = b/c is a root of f, so that

a_n(b^n/c^n) + a_{n−1}(b^{n−1}/c^{n−1}) + . . . + a_1(b/c) + a_0 = 0.

Multiplying both sides by c^n, we obtain

a_nb^n + (a_{n−1}b^{n−1}c + . . . + a_1bc^{n−1} + a_0c^n) = 0,
[a_nb^n + a_{n−1}b^{n−1}c + . . . + a_1bc^{n−1}] + a_0c^n = 0.

c divides the expression in ( ), so c | a_nb^n. As (b,c) ∼ 1, we have (b^n,c) ∼ 1.
From (b^n,c) ∼ 1 and c | a_nb^n, we conclude c | a_n. Likewise, b divides the
expression in [ ], so b | a_0c^n. As (b,c) ∼ 1, we have (b,c^n) ∼ 1. From (b,c^n) ∼ 1
and b | a_0c^n, we conclude b | a_0. In particular, if a_n is a unit in D, then c is
also a unit in D since c | a_n, so there is a c^{−1} ∈ D such that cc^{−1} = 1 and the
root a = b/c = bc^{−1}/cc^{−1} = bc^{−1}/1 = bc^{−1} ∈ D. □

35.11 Example: As an illustration of Theorem 35.10, we prove that the
real number √2 is irrational. Let f(x) = x² − 2 ∈ ℤ[x]. Since ℤ is a unique
factorization domain and the leading coefficient of f is a unit in ℤ (f is in
fact a monic polynomial), any root of f in ℚ must be actually in ℤ by
Theorem 35.10. But

f(0) = −2 ≠ 0;   f(±1) = −1 ≠ 0;
f(±m) = m² − 2 ≥ 2, so f(±m) ≠ 0 for m ≥ 2;

so f has no integer roots, and consequently no rational roots, as claimed.
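Theorem 35.10 turns the search for rational roots of an integer polynomial into a finite check. The sketch below (our own function names) enumerates the candidates b/c with b | a0 and c | an:

```python
from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(f):
    # f: integer coefficients, constant term first, with f[0] != 0;
    # every rational root b/c in lowest terms has b | a_0 and c | a_n
    a0, an = f[0], f[-1]
    return {Fraction(s * b, c)
            for b in divisors(a0) for c in divisors(an) for s in (1, -1)
            if sum(a * Fraction(s * b, c)**k for k, a in enumerate(f)) == 0}

print(rational_roots([-2, 0, 1]))   # set(): x^2 - 2 has no rational root
print(rational_roots([-3, 5, 2]))   # roots 1/2 and -3 of 2x^2 + 5x - 3
```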

Next we discuss the multiplicity of roots. Let D be an integral domain
and f a nonzero polynomial in D[x]. If a ∈ D is a root of f, then we have
f(x) = (x − a)q1(x) for some q1(x) ∈ D[x] by the factor theorem (Theorem
35.6). Either a is not a root of q1(x), or we have q1(x) = (x − a)q2(x) and
therefore f(x) = (x − a)²q2(x) for some q2(x) ∈ D[x]. In the latter case,
either a is not a root of q2(x), or we have q2(x) = (x − a)q3(x) and
therefore f(x) = (x − a)³q3(x) for some q3(x) ∈ D[x]. We repeat this
argument. Since the degrees of q1(x), q2(x), q3(x), . . . get smaller and
smaller, we will reach a polynomial qm(x) with

f(x) = (x − a)^m qm(x),   qm(a) ≠ 0.

35.12 Definition: Let D be an integral domain and f a nonzero poly-
nomial in D[x]. Suppose a ∈ D and f(a) = 0. The uniquely determined
integer m ≥ 1 such that

f(x) = (x − a)^m qm(x),   qm(x) ∈ D[x],   qm(a) ≠ 0,

that is, the uniquely determined integer m ≥ 1 such that

(x − a)^m | f(x),   (x − a)^{m+1} ∤ f(x) in D[x]

is called the multiplicity of the root a of f. The root a of f is called a
simple root when m = 1 and a multiple root when m > 1.

This definition makes sense also when a is a root of f in E, where E is an
integral domain containing D: we need only regard f as a polynomial
over E and use the definition with E in place of D. When E1 and E2 are
two integral domains containing D and a root a of f is both in E1 and E2,
we have, say,

f(x) = (x − a)^{m1} q1(x),   q1(x) ∈ E1[x],   q1(a) ≠ 0,
f(x) = (x − a)^{m2} q2(x),   q2(x) ∈ E2[x],   q2(a) ≠ 0,
f(x) = (x − a)^{m0} q0(x),   q0(x) ∈ (E1 ∩ E2)[x],   q0(a) ≠ 0,

as the equations defining the multiplicity of a as a root in E1, E2, E1 ∩ E2.
Then

(x − a)^{m1} q1(x) = (x − a)^{m0} q0(x) in E1[x]

and the assumption m1 > m0 or m1 < m0 leads to the contradiction

(x − a)^{m1−m0} q1(x) = q0(x) or q1(x) = (x − a)^{m0−m1} q0(x),
0 = q0(a) or q1(a) = 0.

Hence m1 = m0. Likewise m2 = m0 and therefore m1 = m2: the multiplicity
of a root of f ∈ D[x] is independent of the integral domain to which the
root belongs.

In order to find out whether a polynomial has multiple roots, we take
derivatives.

In analysis, the derivative of a real-valued function u of a real variable
x is defined by

u´(x) = lim_{h→0} [u(x + h) − u(x)]/h.

This definition cannot be extended to polynomials over a ring. For one
thing, polynomials are not functions. Second, what should
[u(x + h) − u(x)]/h mean in a ring? Third, we did not define limits in a
ring. In fact, in many rings, a reasonable limit process cannot be
introduced at all. But we know from analysis that the derivative of the
function x ↦ ∑_{k=0}^{m} a_k x^k is the function x ↦ ∑_{k=1}^{m} k a_k x^{k−1}. This suggests the
following definition.
35.13 Definition: Let R be an arbitrary ring and let f = ∑_{k=0}^{m} a_k x^k be an
arbitrary polynomial in R[x]. The derivative of f is defined as the poly-
nomial

f´ = f´(x) = ∑_{k=1}^{m} k a_k x^{k−1} = ∑_{k=0}^{m−1} (k + 1) a_{k+1} x^k ∈ R[x].

Here ka_k means of course a_k + a_k + . . . + a_k in R (k times). This definition has
nothing to do with limits. Taking the derivative of a polynomial is called
differentiation.

35.14 Examples: (a) Let f(x) = x⁴ − 3x² + x + 10 ∈ ℤ[x]. Then
f´(x) = 4x³ − 6x + 1 ∈ ℤ[x].

(b) Let g(x) = (1/3)x⁵ + (1/7)x⁴ + (2/5)x³ + (4/3)x − 3 ∈ ℚ[x]. Then
g´(x) = (5/3)x⁴ + (4/7)x³ + (6/5)x² + 4/3 ∈ ℚ[x].

(c) Let h(x) = (1 2; 3 4)x³ + (0 −1; −1 1)x² + (1 2; 0 3)x + (0 0; 1 0) ∈ (Mat2(ℤ))[x]. Then

h´(x) = 3(1 2; 3 4)x² + 2(0 −1; −1 1)x + 1(1 2; 0 3)
= (3 6; 9 12)x² + (0 −2; −2 2)x + (1 2; 0 3) ∈ (Mat2(ℤ))[x].

(d) Let k(x) = 2x⁴ + 4x² + 3x + 5 ∈ ℤ8[x]. Then
k´(x) = 4·2x³ + 2·4x + 1·3 = 8x³ + 8x + 3 = 3 ∈ ℤ8[x].

(e) Let l(x) = x¹²⁵ + x²⁵ + 2x⁵ + 3 ∈ ℤ5[x]. Then
l´(x) = 125·1x¹²⁴ + 25·1x²⁴ + 5·2x⁴ = 0 ∈ ℤ5[x].
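Examples (d) and (e) can be checked mechanically. A Python sketch of the formal derivative over ℤm (coefficient-list representation; the helper name is ours):

```python
def derivative_mod(f, m):
    # formal derivative over Z_m; f[k] is the coefficient of x^k
    return [(k * f[k]) % m for k in range(1, len(f))]

# k(x) = 2x^4 + 4x^2 + 3x + 5 over Z_8 (Example 35.14(d))
print(derivative_mod([5, 3, 4, 0, 2], 8))   # [3, 0, 0, 0], i.e. k'(x) = 3

# x^5 over Z_5: the derivative 5x^4 vanishes, as in Example 35.14(e)
print(derivative_mod([0, 0, 0, 0, 0, 1], 5))   # [0, 0, 0, 0, 0]
```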

The familiar rules of differentiation hold in any polynomial ring.

35.15 Lemma: Let R be a ring, c ∈ R, and let f,g ∈ R[x]. Then

(f + g)´ = f´ + g´,   (cf)´ = cf´,   (fg)´ = f´g + fg´.
Proof: Let f = ∑_{k=0}^{m} a_k x^k and g = ∑_{j=0}^{n} b_j x^j. We have

(f + g)´ = (∑_{k=0}^{m} a_k x^k + ∑_{j=0}^{n} b_j x^j)´
= (∑_{k=0}^{m} a_k x^k + ∑_{k=0}^{m} b_k x^k)´   (assuming n = m without loss of generality)
= (∑_{k=0}^{m} (a_k + b_k) x^k)´
= ∑_{k=1}^{m} k(a_k + b_k) x^{k−1}
= ∑_{k=1}^{m} (k a_k + k b_k) x^{k−1}
= ∑_{k=1}^{m} k a_k x^{k−1} + ∑_{k=1}^{m} k b_k x^{k−1}
= f´ + g´,

(cf)´ = (c ∑_{k=0}^{m} a_k x^k)´ = (∑_{k=0}^{m} c a_k x^k)´ = ∑_{k=1}^{m} k c a_k x^{k−1} = c ∑_{k=1}^{m} k a_k x^{k−1} = cf´.

Next we find (fg)´ and f´g + fg´. We have

(fg)´ = [(∑_{k=0}^{m} a_k x^k)(∑_{j=0}^{n} b_j x^j)]´
= [∑_{s=0}^{m+n} (∑_{k+j=s} a_k b_j) x^s]´
= ∑_{s=1}^{m+n} s(∑_{k+j=s} a_k b_j) x^{s−1},   (1)

f´g + fg´ = (∑_{k=0}^{m} a_k x^k)´(∑_{j=0}^{n} b_j x^j) + (∑_{k=0}^{m} a_k x^k)(∑_{j=0}^{n} b_j x^j)´
= (∑_{k=1}^{m} k a_k x^{k−1})(∑_{j=0}^{n} b_j x^j) + (∑_{k=0}^{m} a_k x^k)(∑_{j=1}^{n} j b_j x^{j−1})
= ∑_{s=1}^{m+n} (∑_{k+j=s} k a_k b_j) x^{s−1} + ∑_{s=1}^{m+n} (∑_{k+j=s} j a_k b_j) x^{s−1}
= ∑_{s=1}^{m+n} (∑_{k+j=s} (k a_k b_j + j a_k b_j)) x^{s−1}
= ∑_{s=1}^{m+n} s(∑_{k+j=s} a_k b_j) x^{s−1}.   (2)

From (1) and (2), we conclude (fg)´ = f´g + fg´. This completes the proof. □

35.16 Lemma: Let R be a commutative ring and let f1,f2, . . . ,fn,f,g ∈ R[x].
(1) (f1 + f2 + . . . + fn)´ = f1´ + f2´ + . . . + fn´.
(2) (f1f2. . . fn)´ = f1´f2. . . fn + f1f2´. . . fn + . . . + f1f2. . . fn´.
(3) (g^n)´ = ng^{n−1}g´.
(4) [f(g(x))]´ = f´(g(x))g´(x).

Proof: (1) and (2) follow from Lemma 35.15 by induction on n. (3) is a
special case of (2), with f1 = f2 = . . . = fn = g. We now prove (4). Let
f = ∑_{k=0}^{m} a_k x^k. Then f(g(x)) = ∑_{k=0}^{m} a_k g^k ∈ R[x] and, by (1) and (3), the
derivative of f(g(x)) is

(∑_{k=0}^{m} a_k g^k)´ = ∑_{k=0}^{m} a_k (g^k)´ = ∑_{k=1}^{m} a_k (g^k)´ = ∑_{k=1}^{m} k a_k g^{k−1} g´
= (∑_{k=1}^{m} k a_k g^{k−1})g´ = f´(g)g´. □

We are now in a position to determine which roots are multiple roots.

35.17 Theorem: Let D be an integral domain, and E an integral domain
that contains D. Let c ∈ E and let f be a nonzero polynomial in D[x]. Then
c is a multiple root of f if and only if c is a root of both f and f´.
Proof: Suppose c is a multiple root of f. Then it is a root of f. We wish to
show that c is a root of f´ as well. We have f(x) = (x − c)²g(x) for some
g(x) ∈ E[x]. Differentiating and substituting c for x, we obtain

f´(x) = 2(x − c)g(x) + (x − c)²g´(x)
f´(c) = 2(c − c)g(c) + (c − c)²g´(c) = 0

and c is indeed a root of f´.

Conversely, suppose c is a root of f and f´. We write f(x) = (x − c)h(x),
where h(x) ∈ E[x]. We want to show that c is a root of h. Since

f´(x) = h(x) + (x − c)h´(x)
f´(c) = h(c) + (c − c)h´(c)
0 = h(c) + 0,

h(c) = 0 and c is a multiple root of f. □

35.18 Theorem: Let K be a field and E an integral domain that contains
K. Let f(x), g(x) be arbitrary nonzero polynomials in K[x].
(1) If f and g are relatively prime, then f and g have no common root in
E.
(2) If f and f´ are relatively prime, then f has no multiple roots in E.
(3) If f is irreducible in K[x], then either f and g are relatively prime or
f | g in K[x].
(4) If f is irreducible in K[x] and deg g < deg f, then f and g have no
common root in E.
(5) If f is irreducible in K[x] and f´ ≠ 0, then there is no root of f in E
which is a multiple root.
(6) If f is irreducible in K[x] and if f has a root in E which is not a
multiple root of f, then f´ ≠ 0.

Proof: (1) Suppose f and g are relatively prime in K[x]. By Theorem
34.6, there are polynomials h,l in K[x] such that

1 = h(x)f(x) + l(x)g(x),

where 1 is the identity element of K. If f and g had a root c ∈ E in
common, we would have

1 = h(c)f(c) + l(c)g(c) = h(c)·0 + l(c)·0 = 0 + 0 = 0,

a contradiction. So f and g have no common root in E.

(2) Assume f and f´ are relatively prime. If f has no root in E, then
certainly f has no multiple root in E. Now we suppose f has a root c in E
and prove that c is not a multiple root of f. Indeed, since f and f´ are
relatively prime, f and f´ have no common root by part (1), so f´(c) ≠ 0
and c is not a multiple root of f by Theorem 35.17.

(3) Suppose f is irreducible in K[x] and let d ∈ K[x] be a greatest common
divisor of f and g. Since d | f and f is irreducible, d is either a unit in K[x]
or an associate of f. In the first case, f and g are relatively prime; in the
second case, f ∼ d and d | g yields f | g.

(4) Suppose f is irreducible in K[x] and deg g < deg f. Then f cannot
divide g, so f and g are relatively prime by part (3). By part (1), f and g
have no common root in E.

(5) Suppose f is irreducible in K[x] and f´ ≠ 0. Then deg f´ < deg f. Since f
is irreducible, f and f´ have no common root in E by part (4). Now if f has
no root in E, then f has certainly no multiple root in E. If f has a root c in
E, then c is not a root of f´, so c is not a multiple root of f by Theorem
35.17. In any case, f has no multiple root in E.

(6) Suppose f is irreducible in K[x] and suppose c ∈ E is a simple root of f
in E. If we had f´ = 0, we would have f(c) = 0 and f´(c) = 0 and c would be
a multiple root of f by Theorem 35.17, a contradiction. Thus, if there are
roots in E and if they are all simple, then f´ ≠ 0. □

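Theorem 35.17 gives a practical test for multiple roots: evaluate both f and f´. A Python sketch over ℤ (our own helper names and example polynomial):

```python
def evalp(f, c):
    # value of f at c; f[k] is the coefficient of x^k
    return sum(a * c**k for k, a in enumerate(f))

def deriv(f):
    # formal derivative
    return [k * f[k] for k in range(1, len(f))]

f = [4, -8, 5, -1]   # -(x - 1)(x - 2)^2 = -x^3 + 5x^2 - 8x + 4
for c in (1, 2):
    print(c, evalp(f, c), evalp(deriv(f), c))
# 1 is a simple root: f(1) = 0 but f'(1) = -1 != 0
# 2 is a multiple root: f(2) = 0 and f'(2) = 0
```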
We finish this paragraph with a brief discussion of successive substitutions.

35.19 Definition: Let R be a ring and let

f = ∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{k=0}^{N_{n−1}} ∑_{l=0}^{N_n} a_{ij...kl} x1^i x2^j . . . x_{n−1}^k x_n^l

be a polynomial in R[x1,x2, . . . ,x_{n−1},x_n]. Let S be a ring that contains R and
let c1,c2, . . . ,c_{n−1},c_n be elements of S. The element

∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{k=0}^{N_{n−1}} ∑_{l=0}^{N_n} a_{ij...kl} c1^i c2^j . . . c_{n−1}^k c_n^l

of S is called the value of f at (c1,c2, . . . ,c_{n−1},c_n). It will be denoted by
f(c1,c2, . . . ,c_{n−1},c_n).

With the foregoing notation, f = ∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{k=0}^{N_{n−1}} (∑_{l=0}^{N_n} a_{ij...kl} x_n^l) x1^i x2^j . . . x_{n−1}^k
is a polynomial in R[x1,x2, . . . ,x_{n−1}][x_n]. Substituting c_n for x_n in the sense
of Definition 35.1 (with S[x1,x2, . . . ,x_{n−1}], R[x1,x2, . . . ,x_{n−1}], x_n, c_n in place of
S, R, x, s, respectively), we get an element of S[x1,x2, . . . ,x_{n−1}], namely

∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{k=0}^{N_{n−1}} (∑_{l=0}^{N_n} c_n^l a_{ij...kl}) x1^i x2^j . . . x_{n−1}^k ∈ S[x1,x2, . . . ,x_{n−2}][x_{n−1}].

Substituting c_{n−1} for x_{n−1} in this polynomial over S[x1,x2, . . . ,x_{n−2}], we get a
polynomial in S[x1,x2, . . . ,x_{n−2}], namely

∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{j´=0}^{N_{n−2}} (∑_{k=0}^{N_{n−1}} ∑_{l=0}^{N_n} c_{n−1}^k c_n^l a_{ij...j´kl}) x1^i x2^j . . . x_{n−2}^{j´}.

We continue in this way. If S is commutative, we obtain f(c1,c2, . . . ,c_{n−1},c_n)
after n substitutions. Thus

f(c1,c2, . . . ,c_{n−1},c_n) = f T_{c_n} T_{c_{n−1}} . . . T_{c_2} T_{c_1},

where

T_{c_n}: R[x1,x2, . . . ,x_{n−1},x_n] → S[x1,x2, . . . ,x_{n−1}],
T_{c_h}: S[x1,x2, . . . ,x_{h−1},x_h] → S[x1,x2, . . . ,x_{h−1}]   (h = 2, . . . ,n−1),
T_{c_1}: S[x1] → S

are the substitution homomorphisms in the sense of Definition 35.1.
Since the composition of homomorphisms is a homomorphism (Theorem
30.12), we obtain the following lemma.

35.20 Lemma: Let R be a ring, S a ring that contains R, and
c1,c2, . . . ,c_{n−1},c_n elements of S. If S is commutative, then the mapping

T_{(c1,c2, . . . ,c_{n−1},c_n)}: R[x1,x2, . . . ,x_{n−1},x_n] → S
f ↦ f(c1,c2, . . . ,c_{n−1},c_n)

is a ring homomorphism (called the evaluation or substitution homo-
morphism).
Exercises

1. Let f = x³ + ax² + bx + c ∈ ℤ[x]. Prove that f is reducible over ℚ if and
only if f has an integer root.

2. Find a polynomial f ∈ ℚ[x] with deg f ≤ 4 satisfying

f(−2) = 9, f(−1) = 2, f(0) = 1, f(1) = 4, f(2) = 25.

3. Let p be a prime number of the form 4k + 1. Using Wilson's theorem,
show that ((p − 1)/2)! is a root of x² + 1 ∈ ℤp[x].

4. Let R be a ring and f = ∑_{i,j,k} a_{ijk} x^i y^j z^k ∈ R[x,y,z]. The derivative of f,
when f is regarded as a polynomial in R[y,z][x], is called the derivative of
f with respect to x and is written ∂f/∂x. Thus ∂f/∂x = ∑_{i≥1,j,k} i a_{ijk} x^{i−1} y^j z^k. The
derivatives with respect to y and z are defined similarly. f is said to be
homogeneous of degree m if i + j + k = m for all i,j,k with a_{ijk} ≠ 0. Prove
the following assertions.
(a) Let t be an indeterminate over R[x,y,z]. If f(x,y,z) ∈ R[x,y,z] is a
homogeneous polynomial of degree m, then
f(tx,ty,tz) = t^m f(x,y,z) ∈ R[x,y,z,t]. (*)
(b) Let t be an indeterminate over R[x,y,z] and f(x,y,z) ∈ R[x,y,z]. If
(*) holds in R[x,y,z,t], then f(x,y,z) is a homogeneous polynomial of
degree m.
(c) If f(x,y,z) ∈ R[x,y,z] is a homogeneous polynomial of degree m,
then
f(rx,ry,rz) = r^m f(x,y,z)
for all r ∈ R.
(d) If f(x,y,z) ∈ ℝ[x,y,z] and f(rx,ry,rz) = r^m f(x,y,z) for all r ∈ ℝ, then
f(x,y,z) is a homogeneous polynomial of degree m.
(e) Find a polynomial f(x,y,z) ∈ ℤ5[x,y,z] such that
f(rx,ry,rz) = r^m f(x,y,z) for all r ∈ ℤ5
and which is not homogeneous of degree m.
(f) If f(x,y,z) ∈ R[x,y,z] is homogeneous of degree m, then
x ∂f/∂x + y ∂f/∂y + z ∂f/∂z = mf.

5. Let R be a ring and f ∈ R[x]. The derivative of f´ is called the second
derivative of f, and is written as f´´ or as f^(2). More generally, the (n+1)-
st derivative of f is defined recursively as the derivative of the n-th
derivative f^(n) of f, and is written as f^(n+1). Thus f^(n+1) = (f^(n))´. We write
f^(1) for f´ and f^(0) = f. Prove that, for any f,g ∈ R[x], any c ∈ R, any n ∈ ℕ,

(f + g)^(n) = f^(n) + g^(n),   (cf)^(n) = cf^(n),
(fg)^(n) = ∑_{k=0}^{n} \binom{n}{k} f^(n−k) g^(k).

6. Let K be a field, f a nonzero polynomial of degree n in K[x] and assume
that (n!)1_K ≠ 0, where 1_K is the identity of K. Show that

f(x + y) = ∑_{k=0}^{n} [f^(k)(x)/k!] y^k

in K[x,y], where, of course, f^(k)(x)/k! means [(k!)1_K]^{−1} f^(k)(x).

7. Let p be a prime number and f ∈ ℤp[x]. Show that f´ = 0 if and only if
f(x) = g(x^p) for some g ∈ ℤp[x].

8. Let K be a field. We put M = Mat2(K) for brevity. Let us recall that the
determinant of (a b; c d) ∈ M is ad − bc and that A ∈ M is a unit in M if and
only if det A is a unit in K.

Let A(x), B(x) ∈ M[x] be nonzero polynomials and assume that the
leading coefficient of B(x) has a nonzero determinant. Show that there
are uniquely determined polynomials Q(x), R(x), Q†(x), R†(x) in M[x] such
that   A(x) = Q(x)B(x) + R(x),   R(x) = 0 or deg R(x) < deg B(x),
and    A(x) = B(x)Q†(x) + R†(x),   R†(x) = 0 or deg R†(x) < deg B(x).
Q(x) and R(x) are called the right quotient and right remainder, Q†(x) and
R†(x) are called the left quotient and left remainder when A(x) is divided
by B(x).

If F(x) = F_n x^n + F_{n−1} x^{n−1} + . . . + F_1 x + F_0 ∈ M[x] and A ∈ M, then
    F(A) := F_n A^n + F_{n−1} A^{n−1} + . . . + F_1 A + F_0 ∈ M
is called the right value of F(x) at A and
    F†(A) := A^n F_n + A^{n−1} F_{n−1} + . . . + A F_1 + F_0 ∈ M
is called the left value of F(x) at A. Prove that the right (resp. left)
remainder of F(x) ∈ M[x], when F(x) is divided by Ix − A, is equal to F(A)
(resp. F†(A)).
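The right-remainder statement can be verified numerically. The sketch below (our own names and sample data) works with 2×2 integer matrices: the right remainder of F(x) upon division by Ix − A is obtained by a Horner-style synthetic division, in which each partial quotient multiplies A on the right, and is compared with the right value F(A):

```python
# 2x2 matrices as tuples (a, b, c, d) standing for (a b; c d).

def madd(X, Y):
    return tuple(x + y for x, y in zip(X, Y))

def mmul(X, Y):
    a, b, c, d = X
    e, f, g, h = Y
    return (a*e + b*g, a*f + b*h, c*e + d*g, c*f + d*h)

def right_value(F, A):
    """F(A) = F_n A^n + ... + F_1 A + F_0, with powers of A on the right.
    F is a list of matrix coefficients [F_0, F_1, ..., F_n]."""
    P = (1, 0, 0, 1)       # A^0 = I
    acc = (0, 0, 0, 0)
    for Fk in F:
        acc = madd(acc, mmul(Fk, P))
        P = mmul(P, A)
    return acc

def right_rem_by_xI_minus_A(F, A):
    """Right remainder of F(x) on division by Ix - A, by synthetic division:
    q_{n-1} = F_n, q_{k-1} = F_k + q_k A, and R = F_0 + q_0 A."""
    acc = F[-1]
    for Fk in reversed(F[:-1]):
        acc = madd(Fk, mmul(acc, A))
    return acc

F = [(1, 2, 3, 4), (0, 1, 1, 0), (2, 0, 0, 5)]   # F(x) = F_2 x^2 + F_1 x + F_0
A = (1, 1, 0, 2)
assert right_value(F, A) == right_rem_by_xI_minus_A(F, A)
```

Unwinding the recursion shows R = F_0 + F_1 A + F_2 A² + . . ., which is exactly the right value; the left statement is symmetric, with A multiplied on the left.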
9. Let R be a ring and D_i : R[x] → R[x] be functions (i = 1,2) such that

    D_i(f + g) = D_i f + D_i g,    D_i(cf) = c·D_i f,    D_i(fg) = (D_i f)g + f·D_i(g)

for all f,g ∈ R[x], c ∈ R. Define D : R[x] → R[x] by

    Df = D_1(D_2 f) − D_2(D_1 f).

Prove that

    D(f + g) = Df + Dg,    D(cf) = c·Df,    D(fg) = (Df)g + f·D(g)

for all f,g ∈ R[x], c ∈ R.

§36
Fields of Rational Functions

The reader might have missed the familiar quotient rule (f/g)´ =
(f´g − fg´)/g² in Lemma 35.15. It was missing because f/g is not a
polynomial. We now introduce these quotients f/g.
g

36.1 Definition: Let D be an integral domain and x, x1,x2, . . . ,xn indeter-
minates over D. Then D[x] and D[x1,x2, . . . ,xn] are integral domains
(Lemma 33.6, Lemma 33.10). An element in the field of fractions of
D[x] is called a rational function (in x) over D. The field of fractions of
D[x] will be called the field of rational functions over D (in x) and will be
denoted by D(x). An element in the field of fractions of D[x1,x2, . . . ,xn]
is called a rational function (in x1,x2, . . . ,xn) over D. The field of fractions
of D[x1,x2, . . . ,xn] will be called the field of rational functions over D (in
x1,x2, . . . ,xn) and will be denoted by D(x1,x2, . . . ,xn).

Thus a rational function over D is a fraction f/g of two polynomials over D,
with g ≠ 0. Two rational functions f1/g1 and f2/g2 are equal if and only if the
polynomials f1g2 and g1f2 are equal. Two rational functions f1/g1 and f2/g2 are
added and multiplied according to the rules

    f1/g1 + f2/g2 = (f1g2 + g1f2)/(g1g2),    (f1/g1)·(f2/g2) = f1f2/(g1g2).

Here g1 and g2 are distinct from the zero polynomial over D.
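These equality, addition and multiplication rules translate directly into code. A minimal sketch over K = ℚ (the representation, a pair of coefficient lists, is our own choice, not the book's):

```python
from fractions import Fraction

def strip(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def pmul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def padd(f, g):
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(n)]

def rf_equal(r1, r2):
    """f1/g1 = f2/g2 in D(x) iff f1*g2 = g1*f2 in D[x]."""
    (f1, g1), (f2, g2) = r1, r2
    return strip(pmul(f1, g2)) == strip(pmul(g1, f2))

def rf_add(r1, r2):
    (f1, g1), (f2, g2) = r1, r2
    return (padd(pmul(f1, g2), pmul(g1, f2)), pmul(g1, g2))

def rf_mul(r1, r2):
    (f1, g1), (f2, g2) = r1, r2
    return (pmul(f1, f2), pmul(g1, g2))

F = lambda *cs: [Fraction(c) for c in cs]
# (x^2 - 1)/(x - 1) and (x + 1)/1 are the same rational function:
assert rf_equal((F(-1, 0, 1), F(-1, 1)), (F(1, 1), F(1)))
# f/g + (-f)/g is the zero rational function:
s = rf_add((F(1, 2), F(3, 0, 1)), (F(-1, -2), F(3, 0, 1)))
assert rf_equal(s, (F(0), F(1)))
```

Note that equality is decided by cross-multiplication, exactly as in the definition; no function values or "domains" enter at any point.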

This terminology is unfortunate and misleading, because a rational
function is not a function in the sense of Definition 3.1. A rational
function is not a function of the 'rational' kind, whatever that might
mean. The technical term we defined is rational function, a term
consisting of the two words "rational" and "function". The meanings of the
words "rational" and "function" do not play any role in Definition 36.1. A
rational function is a fraction of polynomials over D. The reader should
exercise caution about this point. One should not conclude that
    (x² − 1)/(x − 1) and (x + 1)/1 in ℚ(x)
are different rational functions on the grounds that their domains are
different, since the domain of the first one does not contain 1, whereas 1
is in the domain of the second one. Neither of them has a domain, for
neither of them is a function. And these rational functions are equal
because the polynomials (x² − 1)·1 and (x − 1)(x + 1) in ℚ[x] are equal.

36.2 Lemma: Let D be an integral domain and F the field of fractions of


D. Let x be an indeterminate over D. Then D(x) = F(x).

Proof: F consists of the fractions a/b, where a,b ∈ D and b ≠ 0; and D(x)
consists of the fractions

    (a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0)/(b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0),

where a_n,a_{n−1}, . . . ,a_1,a_0,b_m,b_{m−1}, . . . ,b_1,b_0 ∈ D and the denominator is
distinct from the zero polynomial in D[x]. Finally, F(x) consists of the
fractions

    (c_n x^n + c_{n−1} x^{n−1} + . . . + c_1 x + c_0)/(d_m x^m + d_{m−1} x^{m−1} + . . . + d_1 x + d_0),

where c_n,c_{n−1}, . . . ,c_1,c_0,d_m,d_{m−1}, . . . ,d_1,d_0 ∈ F and the denominator is distinct
from the zero polynomial in F[x].

An element a of D is identified with the fraction a/1 in F (Theorem 31.5),
whence D ⊆ F. Thus D[x] ⊆ F[x] as sets. Note that two elements f(x)/g(x) and
p(x)/q(x) of D(x) are equal in D(x) if and only if f(x)q(x) = g(x)p(x) in D[x], and
this holds if and only if f(x)q(x) = g(x)p(x) in F[x], so if and only if f(x)/g(x)
and p(x)/q(x) are equal in F(x). Thus every element of D(x) is in F(x) and
equality in D(x) coincides with equality in F(x). So D(x) ⊆ F(x).

Next we show F(x) ⊆ D(x). Let p(x)/q(x) ∈ F(x), with p(x), q(x) ∈ F[x], q(x) ≠ 0.
Then p(x) = Σ_{i=0}^{n} (a_i/b_i)x^i and q(x) = Σ_{j=0}^{m} (c_j/d_j)x^j, where a_i,b_i,c_j,d_j ∈ D, b_i ≠ 0, d_j ≠ 0 for
all i,j and not all of the c_j are equal to 0 ∈ D. We put b = b_0 b_1 . . . b_{n−1} b_n and d =
d_0 d_1 . . . d_{m−1} d_m. Then dbp(x) and dbq(x) are polynomials in D[x], and hence
p(x)/q(x) = dbp(x)/dbq(x) ∈ D(x). So F(x) ⊆ D(x). This proves D(x) = F(x).

As an illustration of Lemma 36.2, observe that

    ((2/3)x² − (1/7)x + (1/4)) / ((2/5)x² + (1/3)x − (1/2))  ∈ ℚ(x)

is equal to the rational function

    5(56x² − 12x + 21) / (14(12x² + 10x − 15))  in ℤ(x).

36.3 Remark: Let D be an integral domain and F the field of fractions
of D. Then
    D(x1,x2, . . . ,xn) = field of fractions of D[x1,x2, . . . ,xn]
        = field of fractions of D[x1,x2, . . . ,xn−1][xn]
        = D[x1,x2, . . . ,xn−1](xn)
        = D(x1,x2, . . . ,xn−1)(xn)
by Lemma 36.2, with D[x1,x2, . . . ,xn−1], D(x1,x2, . . . ,xn−1), xn in place of D, F, x,
respectively.

Also, we have D(x1,x2, . . . ,xn) = F(x1,x2, . . . ,xn), for this is true when n = 1
(Lemma 36.2) and, when it is true for n = k, so that D(x1,x2, . . . ,xk) =
F(x1,x2, . . . ,xk), it is also true for n = k + 1:
D(x1,x2, . . . ,xk,xk+1) = D(x1,x2, . . . ,xk)(xk+1)
= F(x1,x2, . . . ,xk)(xk+1)
= F(x1,x2, . . . ,xk,xk+1),
the last equation by the remark above, with F in place of D and k + 1 in
place of n.

In the remainder of this paragraph, we discuss partial fraction
expansions of rational functions.

36.4 Lemma: Let K be a field and let f(x) be a nonzero polynomial in
K[x]. Let q(x), r(x) be two nonzero, relatively prime polynomials of posi-
tive degree in K[x]. Suppose deg f(x) < deg q(x)r(x) and suppose that
f(x) is relatively prime to q(x)r(x). Then there are uniquely determined
nonzero polynomials a(x), b(x) in K[x] such that
    a(x)r(x) + b(x)q(x) = f(x),   deg a(x) < deg q(x),   deg b(x) < deg r(x).

Proof: We first prove the existence of a(x) and b(x). Since q(x), r(x) are
relatively prime, there are polynomials h(x), k(x) in K[x] with

    h(x)r(x) + k(x)q(x) = 1.

Multiplying both sides of this equation by f(x) and putting A(x) =
f(x)h(x), B(x) = f(x)k(x), we obtain

    A(x)r(x) + B(x)q(x) = f(x).

We now divide A(x) by q(x) and B(x) by r(x):

    A(x) = s(x)q(x) + a(x),   a(x) = 0 or deg a(x) < deg q(x),
    B(x) = u(x)r(x) + b(x),   b(x) = 0 or deg b(x) < deg r(x).

Thus   a(x)r(x) + b(x)q(x) = (A(x) − s(x)q(x))r(x) + (B(x) − u(x)r(x))q(x)
                           = (A(x)r(x) + B(x)q(x)) − (s(x) + u(x))q(x)r(x)
                           = f(x) − (s(x) + u(x))q(x)r(x).

We claim s(x) + u(x) is the zero polynomial in K[x]. Otherwise, we would
have deg (s(x) + u(x)) ≥ 0, so
    deg (s(x) + u(x))q(x)r(x) ≥ deg q(x)r(x),
and since by hypothesis deg f(x) < deg q(x)r(x),
    deg [f(x) − (s(x) + u(x))q(x)r(x)] ≥ deg q(x)r(x),

so that a(x)r(x) + b(x)q(x) ≠ 0; in particular, a(x) and b(x) cannot both be
zero. Assume, without loss of generality, that a(x) ≠ 0 in case one of a(x),
b(x) is zero, and that deg a(x)r(x) ≥ deg b(x)q(x) in case neither of them
is zero. Then we get the contradiction

    deg [f(x) − (s(x) + u(x))q(x)r(x)] = deg (a(x)r(x) + b(x)q(x))
                                       ≤ deg a(x)r(x)
                                       = deg a(x) + deg r(x)
                                       < deg q(x) + deg r(x)
                                       = deg q(x)r(x).

Thus s(x) + u(x), and consequently (s(x) + u(x))q(x)r(x), is the zero poly-
nomial in K[x]. This gives a(x)r(x) + b(x)q(x) = f(x). It remains to show
that a(x) and b(x) are distinct from the zero polynomial in K[x]. Both
of them cannot be 0, for then f(x) would be 0 as well, which it is not by hy-
pothesis. If one of them is 0, say if a(x) = 0, then b(x) ≠ 0 and f(x) =
b(x)q(x) would not be relatively prime to q(x)r(x) (because q(x) is of
positive degree, so not a unit in K[x]), against the hypothesis. This proves
the existence of a(x), b(x).

It remains to show the uniqueness of a(x) and b(x). If we also have

    a1(x)r(x) + b1(x)q(x) = f(x),   deg a1(x) < deg q(x),   deg b1(x) < deg r(x),

we obtain   0 = f(x) − f(x) = (a1(x)r(x) + b1(x)q(x)) − (a(x)r(x) + b(x)q(x))
              = (a1(x) − a(x))r(x) + (b1(x) − b(x))q(x),

so          (a(x) − a1(x))r(x) = (b1(x) − b(x))q(x).                    (*)

Hence
    r(x) | (b1(x) − b(x))q(x) in K[x],
    r(x) | b1(x) − b(x) in K[x], as r(x) and q(x) are relatively prime.
Now b(x) ≠ b1(x) implies b1(x) − b(x) ≠ 0, and this gives
    deg r(x) ≤ deg (b1(x) − b(x)) ≤ max{deg b1(x), deg b(x)} < deg r(x),
a contradiction. Thus b(x) = b1(x), and we then get a(x) = a1(x) from (*).
So a(x) and b(x) are uniquely determined.

36.5 Lemma: Let K be a field and let f(x)/g(x) be a nonzero rational function
in K(x), with deg f(x) < deg g(x). Suppose that f(x) and g(x) are both
monic and that f(x) is relatively prime to g(x). Assume g(x) = q(x)r(x),
where q(x), r(x) are two relatively prime polynomials of positive degree
in K[x]. Then there are uniquely determined nonzero polynomials a(x),
b(x) in K[x] such that

    f(x)/g(x) = f(x)/(q(x)r(x)) = a(x)/q(x) + b(x)/r(x)

and deg a(x) < deg q(x), deg b(x) < deg r(x).

Proof: If f(x)/g(x) is a nonzero rational function in K(x), then f(x) is a nonzero
polynomial in K[x], and f(x) is relatively prime to g(x) = q(x)r(x). As f(x)
and g(x) are monic, these conditions determine f(x) and g(x) uniquely.
The polynomials q(x), r(x) are relatively prime and deg f(x) is smaller
than deg q(x)r(x). So the hypotheses of Lemma 36.4 are satisfied and
therefore there are uniquely determined nonzero polynomials a(x), b(x)
in K[x] such that
    f(x) = a(x)r(x) + b(x)q(x),
with deg a(x) < deg q(x), deg b(x) < deg r(x).
Dividing both sides of this equation by g(x) = q(x)r(x), we see that
there are uniquely determined nonzero polynomials a(x), b(x) in K[x]
such that

    f(x)/g(x) = f(x)/(q(x)r(x)) = a(x)/q(x) + b(x)/r(x)

and deg a(x) < deg q(x), deg b(x) < deg r(x).

By induction on m, we obtain the following lemma.

36.6 Lemma: Let K be a field and let f(x)/g(x) be a nonzero rational
function in K(x), with deg f(x) < deg g(x). Suppose that f(x) and g(x) are
both monic and that f(x) is relatively prime to g(x). Assume g(x) =
q1(x)q2(x). . . qm(x), where q1(x), q2(x), . . . ,qm(x) are pairwise relatively
prime monic polynomials of positive degree in K[x]. Then there are
uniquely determined nonzero polynomials a1(x), a2(x), . . . ,am(x) in K[x]
such that

    f(x)/g(x) = f(x)/(q1(x)q2(x). . .qm(x)) = a1(x)/q1(x) + a2(x)/q2(x) + . . . + am(x)/qm(x)

and deg ai(x) < deg qi(x) for all i = 1,2, . . . ,m.
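The existence proof behind this splitting (Lemma 36.4) is constructive: the extended Euclidean algorithm produces h and k with hr + kq = 1, and reduction modulo q and r yields a and b. A sketch over ℚ (helper names and the worked example are ours):

```python
from fractions import Fraction

def strip(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def pmul(f, g):
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return strip(out)

def padd(f, g):
    n = max(len(f), len(g))
    return strip([(f[i] if i < len(f) else Fraction(0)) +
                  (g[i] if i < len(g) else Fraction(0)) for i in range(n)])

def psub(f, g):
    return padd(f, [-c for c in g])

def pdivmod(f, g):
    """Division with remainder in Q[x]; g must be nonzero."""
    g = [Fraction(c) for c in strip(list(g))]
    r = [Fraction(c) for c in strip(list(f))]
    q = [Fraction(0)] * max(1, len(r) - len(g) + 1)
    while len(r) >= len(g) and r != [Fraction(0)]:
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] += c
        r = strip([r[i] - (c * g[i - shift] if 0 <= i - shift < len(g) else 0)
                   for i in range(len(r))])
    return strip(q), r

def ext_gcd(a, b):
    """Extended Euclid in Q[x]: returns (g, s, t) with s*a + t*b = g."""
    r0, r1 = strip(list(a)), strip(list(b))
    s0, s1 = [Fraction(1)], [Fraction(0)]
    t0, t1 = [Fraction(0)], [Fraction(1)]
    while r1 != [Fraction(0)]:
        qq, rr = pdivmod(r0, r1)
        r0, r1 = r1, rr
        s0, s1 = s1, psub(s0, pmul(qq, s1))
        t0, t1 = t1, psub(t0, pmul(qq, t1))
    return r0, s0, t0

def split(f, q, r):
    """a, b with f/(q*r) = a/q + b/r, deg a < deg q, deg b < deg r
    (q, r relatively prime, deg f < deg q + deg r), as in the proof."""
    g, h, k = ext_gcd(r, q)             # h*r + k*q = g, a nonzero constant
    h = [x / g[0] for x in h]           # normalize so that h*r + k*q = 1
    k = [x / g[0] for x in k]
    _, a = pdivmod(pmul(f, h), q)       # reduce A = f*h modulo q
    _, b = pdivmod(pmul(f, k), r)       # reduce B = f*k modulo r
    return a, b

# 1/((x - 1)(x + 1)) = (1/2)/(x - 1) + (-1/2)/(x + 1):
a, b = split([Fraction(1)], [-1, 1], [1, 1])
assert a == [Fraction(1, 2)] and b == [Fraction(-1, 2)]
```

Iterating `split` over a pairwise relatively prime factorization of the denominator gives exactly the multi-term decomposition of Lemma 36.6.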

36.7 Lemma: Let K be a field and x an indeterminate over K. Let g(x)
be a polynomial in K[x] of degree ≥ 1. Then, for any f(x) ∈ K[x], there
are uniquely determined polynomials r0(x), r1(x), r2(x), . . . ,rn(x) such that

    f(x) = r0(x) + r1(x)g(x) + r2(x)g(x)² + . . . + rn(x)g(x)^n
and
    ri(x) = 0 or deg ri(x) < deg g(x) for all i = 0,1,2, . . . ,n.

Proof: From deg g ≥ 1, we know that g ≠ 0. So we may divide f by g
and obtain f = q0g + r0, where q0, r0 ∈ K[x], with r0 = 0 or deg r0 < deg g.
Here q0 and r0 are uniquely determined by f and g (Theorem 34.4) and
we have f = r0 + q0g. If q0 = 0, we are done (with n = 0). Otherwise, since
f = q0g + r0, deg g ≥ 1 and r0 = 0 or deg r0 < deg g, we have deg q0 <
deg f (Lemma 33.3). We now divide q0 by g and obtain q0 = q1g + r1,
where q1, r1 ∈ K[x], with r1 = 0 or deg r1 < deg g. Here q1 and r1 are
uniquely determined by q0 and g (hence by f and g) and f = r0 + r1g +
q1g². If q1 = 0, we are done. Otherwise, deg q1 < deg q0. We then divide
q1 by g and obtain q1 = q2g + r2, where q2, r2 ∈ K[x], with r2 = 0 or deg r2
< deg g. Here q2 and r2 are uniquely determined by q1 and g (hence by f
and g) and f = r0 + r1g + r2g² + q2g³. If q2 = 0, we are done. Otherwise, we
have deg q2 < deg q1. We continue this process. As the degrees of q0, q1,
q2, . . . get smaller and smaller, this process cannot go on indefinitely.
Sooner or later, we will meet a qn equal to 0 ∈ K[x]. Then, with uniquely
determined r0,r1,r2, . . . ,rn, we have f = r0 + r1g + r2g² + . . . + rng^n, where
ri(x) = 0 or deg ri < deg g for all i = 0,1,2, . . . ,n.

In the situation of Lemma 36.7, the unique expression

    f = r0 + r1g + r2g² + . . . + rng^n

of f(x), where ri(x) = 0 or deg ri < deg g for all i = 0,1,2, . . . ,n, is called the
g-adic expansion of f.
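The proof of Lemma 36.7 is itself an algorithm: divide by g repeatedly and collect the remainders. A sketch over ℚ (our own names; polynomials as coefficient lists, constant term first):

```python
from fractions import Fraction

def strip(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def pdivmod(f, g):
    """Division with remainder in Q[x]; polynomials are lists [a0, a1, ...]."""
    g = [Fraction(c) for c in g]
    r = strip([Fraction(c) for c in f])
    q = [Fraction(0)] * max(1, len(r) - len(g) + 1)
    while len(r) >= len(g) and r != [Fraction(0)]:
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] += c
        r = strip([r[i] - (c * g[i - shift] if 0 <= i - shift < len(g) else 0)
                   for i in range(len(r))])
    return strip(q), r

def g_adic(f, g):
    """[r0, r1, ..., rn] with f = r0 + r1*g + r2*g^2 + ..., deg ri < deg g."""
    digits = []
    q = [Fraction(c) for c in f]
    while strip(q) != [Fraction(0)]:
        q, r = pdivmod(q, g)
        digits.append(r)
    return digits or [[Fraction(0)]]

# The (x - 1)-adic expansion of x^2 + 1 is 2 + 2(x - 1) + (x - 1)^2:
assert g_adic([1, 0, 1], [-1, 1]) == [[Fraction(2)], [Fraction(2)], [Fraction(1)]]
```

This is the polynomial analogue of writing an integer in base g, which is also how the expansion is used in the proof of Theorem 36.8 below.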

36.8 Theorem: Let K be a field and p(x)/q(x) a nonzero rational function in
K(x), where p(x), q(x) ∈ K[x] are relatively prime in K[x]. Let u be the
leading coefficient of q(x) and let q(x) = u·g1(x)^m1 g2(x)^m2 . . . gt(x)^mt be the
decomposition of q(x) into polynomials irreducible over K, where the gi(x)
are monic. Then there are uniquely determined polynomials G(x),
a1^(1)(x), a2^(1)(x), . . . ,a_{m1}^(1)(x), a1^(2)(x), a2^(2)(x), . . . ,a_{m2}^(2)(x), . . . ,a1^(t)(x), a2^(t)(x),
. . . ,a_{mt}^(t)(x) in K[x] such that

    p(x)/q(x) = G(x) + a1^(1)(x)/g1(x) + a2^(1)(x)/g1(x)² + . . . + a_{m1}^(1)(x)/g1(x)^m1
                     + a1^(2)(x)/g2(x) + a2^(2)(x)/g2(x)² + . . . + a_{m2}^(2)(x)/g2(x)^m2
                     + . . . . . .
                     + a1^(t)(x)/gt(x) + a2^(t)(x)/gt(x)² + . . . + a_{mt}^(t)(x)/gt(x)^mt

and deg ai^(k)(x) < deg gk(x) or ai^(k)(x) = 0 for all i and k.

Proof: We divide p(x) by q(x) and find unique polynomials G(x), H(x) in
K[x] with p(x) = q(x)G(x) + H(x), deg H(x) < deg q(x) or H(x) = 0. In the
latter case, everything is proved (ai^(k)(x) = 0 for all i and k). If H(x) ≠ 0,
let v be the leading coefficient of H(x) and put c = v/u. Then H(x) and
q(x) are relatively prime (since p(x) and q(x) are). We have H(x) = vh(x),
where h(x) is monic, relatively prime to q(x), and

    p(x)/q(x) = G(x) + c·h(x)/(g1(x)^m1 g2(x)^m2 . . . gt(x)^mt)

with deg h(x) < deg q(x). We may use Lemma 36.6 and get uniquely
determined nonzero polynomials b1(x), b2(x), . . . ,bt(x) in K[x] such that

    h(x)/(g1(x)^m1 g2(x)^m2 . . . gt(x)^mt) = b1(x)/g1(x)^m1 + b2(x)/g2(x)^m2 + . . . + bt(x)/gt(x)^mt

and deg bk(x) < deg gk(x)^mk for all k = 1,2, . . . ,t. We put fk(x) = c·bk(x).
Then

    p(x)/q(x) = G(x) + f1(x)/g1(x)^m1 + f2(x)/g2(x)^m2 + . . . + ft(x)/gt(x)^mt

and, since c is uniquely determined by p(x) and q(x), the polynomials
fk(x) are also uniquely determined. Since
    deg fk(x) = deg bk(x) < deg gk(x)^mk,
in the gk(x)-adic expansion
    fk(x) = r0(x) + r1(x)gk(x) + r2(x)gk(x)² + . . . + rn(x)gk(x)^n
of fk(x), the polynomials rs(x) = 0 for s ≥ mk. So let

    fk(x) = a1^(k)(x)gk(x)^{mk−1} + a2^(k)(x)gk(x)^{mk−2} + . . . + a_{mk−1}^(k)(x)gk(x) + a_{mk}^(k)(x)

be the gk(x)-adic expansion of fk(x). The polynomials a1^(k), a2^(k), . . . ,a_{mk}^(k)
in K[x] are uniquely determined, and deg ai^(k) < deg gk(x) or ai^(k) = 0 for
all i = 1,2, . . . ,mk. Hence, for all k = 1,2, . . . ,t, there holds

    fk(x)/gk(x)^mk = a1^(k)(x)/gk(x) + a2^(k)(x)/gk(x)² + . . . + a_{mk}^(k)(x)/gk(x)^mk

and this completes the proof.

The equation

    p(x)/q(x) = G(x) + a1^(1)(x)/g1(x) + a2^(1)(x)/g1(x)² + . . . + a_{m1}^(1)(x)/g1(x)^m1
                     + a1^(2)(x)/g2(x) + a2^(2)(x)/g2(x)² + . . . + a_{m2}^(2)(x)/g2(x)^m2
                     + . . . . . .
                     + a1^(t)(x)/gt(x) + a2^(t)(x)/gt(x)² + . . . + a_{mt}^(t)(x)/gt(x)^mt

in Theorem 36.8 is known as the expansion of p(x)/q(x) in partial fractions.

Exercises

1. Let K be a field. For any nonzero rational function f/g in K(x), we
define the degree of f/g, denoted by deg f/g, by deg f/g = deg f − deg g.
Prove that the degree of a rational function is well defined. Can you
extend the degree assertions in Lemma 33.3 to rational functions?

2. Let K be a field. For any rational function f/g in K(x), we define the
derivative of f/g, denoted by (f/g)´, by declaring

    (f/g)´ = (f´g − fg´)/g².

Prove that differentiation is well defined, i.e., prove that f/g = a/b implies
(f/g)´ = (a/b)´.

3. Extend Lemma 35.15 and Lemma 35.16 to derivatives of rational


functions in one indeterminate over a field.

4. Expand

    (2x³ + 3x² + 8x + 6)/((x³ + 3x + 3)(x² + 2x + 3)) ∈ ℚ(x)   and

    (4x³ + 3x² + x + 2)/(x⁵ + 4x⁴ + 4x³ + 2x + 2) ∈ ℤ5(x)

in partial fractions.

5. Let K be a field and let a1,a2, . . . ,am be pairwise distinct elements in K.
Put g(x) = (x − a1)(x − a2). . . (x − am) and let f(x) be a nonzero polynomial
in K[x] with deg f(x) < m. Show that

    f(x)/g(x) = Σ_{i=1}^{m} (f(ai)/g´(ai)) / (x − ai).
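The identity in Exercise 5 can be spot-checked by evaluating both sides at a point x0 that is none of the ai; a sketch over ℚ with sample data of our own:

```python
from fractions import Fraction

# g(x) = (x - a1)...(x - am), deg f < m; check at a sample point x0 that
#   f(x0)/g(x0) = sum_i (f(ai)/g'(ai)) / (x0 - ai).

a = [Fraction(0), Fraction(1), Fraction(3)]      # pairwise distinct a_i
fc = [Fraction(2), Fraction(-1), Fraction(5)]    # f = 2 - x + 5x^2, deg f < 3

def ev(p, x):
    v = Fraction(0)
    for c in reversed(p):
        v = v * x + c
    return v

def g(x):
    v = Fraction(1)
    for ai in a:
        v *= (x - ai)
    return v

def gprime(x):
    # product rule: g'(x) = sum over i of prod over j != i of (x - a_j);
    # at x = a_i only the i-th term survives.
    total = Fraction(0)
    for i in range(len(a)):
        term = Fraction(1)
        for j in range(len(a)):
            if j != i:
                term *= (x - a[j])
        total += term
    return total

x0 = Fraction(5)
lhs = ev(fc, x0) / g(x0)
rhs = sum((ev(fc, ai) / gprime(ai)) / (x0 - ai) for ai in a)
assert lhs == rhs
```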

439
§37
Irreducibility Criteria

In this paragraph, we develop some sufficient conditions for a


polynomial to be irreducible. In general, given a specific polynomial, it is
extremely difficult to determine whether it is irreducible. This is not
surprising when we remember that it is also exceedingly difficult to
determine whether a given specific integer is prime.

We start with Eisenstein's criterion, which is very simple to use (G.


Eisenstein, a German mathematician (1823-1852)).

37.1 Lemma (Eisenstein's criterion): Let D be a unique factorization
domain and let
    f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0
be a nonzero polynomial in D[x] with C(f) ∼ 1. If there is a prime
(irreducible) element p in D such that

    p ∤ a_n,
    p | a_{n−1}, . . . . . . . . . , p | a_1, p | a_0,
    p² ∤ a_0,

then f is irreducible over D.

Proof: Suppose, by way of contradiction, that f(x) is reducible over D.
Then its proper factors must have degrees > 0, because C(f) ∼ 1. Assume
f(x) = g(x)h(x), where

    g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0    (b_m ≠ 0, m ≥ 1),
    h(x) = c_k x^k + c_{k−1} x^{k−1} + . . . + c_1 x + c_0    (c_k ≠ 0, k ≥ 1)

are polynomials in D[x].

Then a_0 = b_0 c_0. Since p | a_0 by hypothesis, so p | b_0 c_0, and p is prime, we
see p | b_0 or p | c_0. Here p | b_0 and p | c_0 cannot both be simultaneously true,
for then we would have p² | b_0 c_0, so p² | a_0, against our hypothesis. Thus
one and only one of p | b_0, p | c_0 is true. Let us assume, without loss of
generality, that p | b_0 and p ∤ c_0.

Also a_n = b_m c_k. Since p ∤ a_n by hypothesis, so p ∤ b_m c_k, we have p ∤ b_m.
Thus p | b_0 and p ∤ b_m. Let r be the smallest index for which the coefficient
b_r in g(x) is not divisible by p, so that

    p | b_0, p | b_1, . . . , p | b_{r−1}, p ∤ b_r                    (*)

(possibly r = 1 or r = m).

Now a_r = (b_0 c_r + b_1 c_{r−1} + . . . + b_{r−1} c_1) + b_r c_0, and r ≤ m < m + k = n. So p | a_r
by hypothesis, and p divides the expression in parentheses by (*), so p | b_r c_0. Then,
since p is prime, this forces p | b_r or p | c_0, whereas p ∤ b_r and p ∤ c_0. This
contradiction completes the proof.
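For D = ℤ, checking the hypotheses of Eisenstein's criterion is entirely mechanical, as the following sketch shows (the function and the second test polynomial are ours, not the text's):

```python
def eisenstein(coeffs, p):
    """Test Eisenstein's criterion at the prime p for
    f = a0 + a1*x + ... + an*x^n, given as coeffs = [a0, a1, ..., an]."""
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0                              # p does not divide a_n
            and all(c % p == 0 for c in coeffs[:-1]) # p divides a_0, ..., a_{n-1}
            and a0 % (p * p) != 0)                   # p^2 does not divide a_0

# x^5 + 5x + 5 satisfies the criterion at p = 5 (Example 37.2(a)):
assert eisenstein([5, 5, 0, 0, 0, 1], 5)
# x^2 + 4 fails at p = 2 because 2^2 divides 4:
assert not eisenstein([4, 0, 1], 2)
```

The function only checks the hypotheses; it does not (and cannot) decide irreducibility when the criterion fails.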

37.2 Examples: (a) x⁵ + 5x + 5 ∈ ℤ[x] is irreducible over ℤ, because its
content is 1 and
    5 ∤ 1,   5 | 0, 5 | 0, 5 | 0, 5 | 5, 5 | 5,   5² ∤ 5.

(b) Let D = ℤ[i] and f(x) = 3x³ + 2x² + (4 − 2i)x + (1 + i) ∈ D[x]. Then D is a
unique factorization domain and C(f) ∼ 1. Moreover 1 + i ∈ D is a prime
element in D and
    1 + i ∤ 3,   1 + i | 2, 1 + i | 4 − 2i, 1 + i | 1 + i,   (1 + i)² ∤ 1 + i.
Hence f(x) is irreducible over D.

(c) Let D be a unique factorization domain and g(x,y) = x^n − y ∈ (D[y])[x].
The content of g is 1 ∈ D[y], since g is in fact a monic polynomial. Also, y
is irreducible in D[y] and
    y ∤ 1,   y | 0, y | 0, . . . , y | 0, y | −y,   y² ∤ −y,
hence g(x,y) = x^n + 0x^{n−1} + 0x^{n−2} + . . . + 0x − y ∈ (D[y])[x] is irreducible
over D[y].

(d) Let p be a prime number and Φp(x) = x^{p−1} + x^{p−2} + . . . + x + 1 ∈ ℤ[x].
The polynomial Φp(x) is known as the p-th cyclotomic polynomial.
We show that Φp(x) is irreducible over ℤ. Eisenstein's criterion is not
directly applicable, but we observe that
    (x − 1)Φp(x) = x^p − 1,
and, when we substitute x + 1 for x on both sides of this equation, we get

    xΦp(x + 1) = (x + 1)^p − 1 = Σ_{k=0}^{p−1} C(p,k) x^{p−k}

by the binomial theorem (Theorem 29.16), so

    Φp(x + 1) = x^{p−1} + C(p,1)x^{p−2} + C(p,2)x^{p−3} + . . . + C(p,p−1),

and we will try to apply Eisenstein's criterion to this polynomial. We
note p | p!, so p | (p−k)!k!·C(p,k). Since p is relatively prime to (p−k)!k!
when 1 ≤ k ≤ p − 1, Theorem 5.12 gives p | C(p,k) for k = 1,2, . . . ,p − 1. So
    p ∤ 1,
    p | C(p,1), p | C(p,2), . . . , p | C(p,p−1),
    p² ∤ C(p,p−1),
and the content of Φp(x + 1) is 1. Hence Φp(x + 1) is irreducible over ℤ.

This implies that Φp(x) is also irreducible over ℤ, since Φp(x) is clearly
not a unit in ℤ[x] and any factorization Φp(x) = f(x)g(x) of Φp(x) into
nonunit polynomials f(x), g(x) ∈ ℤ[x] would give a factorization Φp(x + 1)
= f(x + 1)g(x + 1) = f1(x)g1(x) of Φp(x + 1) into nonunit polynomials f1(x),
g1(x) in ℤ[x], contrary to the irreducibility of Φp(x + 1) over ℤ.
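The substitution x → x + 1 used above amounts to a binomial re-expansion of the coefficient list. Here is a quick check, for p = 7, that the shifted polynomial satisfies the three divisibility conditions (helper names ours):

```python
from math import comb

def shift_by_one(coeffs):
    """Coefficients of f(x + 1) for f given as [a0, a1, ..., an]."""
    out = [0] * len(coeffs)
    for k, a in enumerate(coeffs):      # expand a * (x + 1)^k
        for j in range(k + 1):
            out[j] += a * comb(k, j)
    return out

p = 7
phi = [1] * p                           # Phi_p(x) = 1 + x + ... + x^(p-1)
sh = shift_by_one(phi)                  # Phi_p(x + 1)
assert sh[-1] == 1                      # leading coefficient: p does not divide it
assert all(c % p == 0 for c in sh[:-1]) # p divides every lower coefficient
assert sh[0] == p                       # constant term is p itself, so p^2 fails to divide it
```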

The argument in the last example can be generalized.

37.3 Lemma: Let D be an integral domain, α a unit in D and let β be an
arbitrary element of D.
(1) The mapping T: D[x] → D[x], f(x) → f(αx + β), is a ring isomorphism
such that γT = γ for all γ ∈ D.
(2) deg f(αx + β) = deg f(x) for any f(x) ∈ D[x]\{0} (that is, T preserves
degrees of polynomials).
(3) f(x) is irreducible over D if and only if f(αx + β) is irreducible over D.
(4) If, in addition, D is a unique factorization domain, then C(f(x)) ∼
C(f(αx + β)) for any f(x) ∈ D[x]\{0} (that is, T preserves contents of poly-
nomials).
Proof: (1) The mapping T: f(x) → f(αx + β) is just the substitution
homomorphism T_{αx+β} (Lemma 35.3 with D, D[x], αx + β in place of R, S, s,
respectively). We are to show that T is one-to-one and onto. To this end,
we need only find an inverse of T (Theorem 3.17(2)). This is quite easy.
We are tempted to substitute (x − β)/α for x. This idea is correct, but we
must formulate it properly. Since α is a unit in D, there is an inverse α^{−1}
of α in D, and we put S: D[x] → D[x], f(x) → f(α^{−1}(x − β)). Then we have

    f(x)TS = f(αx + β)S = f(α(α^{−1}(x − β)) + β) = f(x),
    f(x)ST = f(α^{−1}(x − β))T = f(α^{−1}((αx + β) − β)) = f(x)

for all f(x) ∈ D[x]. Hence TS = ι_{D[x]} = ST and T is therefore an isomorphism.
Finally, polynomials of degree 0 and the polynomial 0 ∈ D[x] are not
affected by the substitution x → αx + β, and so γT = γ for all γ ∈ D.

(2) For any f(x) ∈ D[x]\{0}, if deg f = n and
    f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0
with a_n ≠ 0, we have
    f(αx + β) = a_n(αx + β)^n + a_{n−1}(αx + β)^{n−1} + . . . + a_1(αx + β) + a_0
              = a_n α^n x^n + terms of lower degree,
with a_n α^n ≠ 0 as the leading coefficient. So deg f(αx + β) = n, as claimed.

(3) If f(x) ∈ D[x]\{0} is not irreducible over D, then either f(x) is a unit in
D[x], hence f(x) ∈ D is a unit in D, and f(αx + β) = f(x) (by part (1)) is also
a unit in D and in D[x]; or f(x) = g(x)h(x) for some polynomials g(x), h(x)
in D[x] with 1 ≤ deg g(x) < deg f(x), and then f(αx + β) = g(αx + β)h(αx + β)
with g(αx + β), h(αx + β) ∈ D[x] and 1 ≤ deg g(αx + β) = deg g(x) <
deg f(x) = deg f(αx + β) (by part (2)), and thus f(αx + β) has a
proper divisor. In either case, f(αx + β) is not irreducible over D.

Repeating the same argument for the substitution x → α^{−1}(x − β), we
conclude: if f(αx + β) is not irreducible over D, then f(x) is not irreducible
over D.

(4) Suppose now that D is a unique factorization domain, that f(x) =
a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0, and that C(f(x)) ∼ γ. Then

    f(αx + β) = C(n,0)a_n α^n x^n + (C(n,1)a_n α^{n−1}β + C(n−1,0)a_{n−1} α^{n−1})x^{n−1}
              + (C(n,2)a_n α^{n−2}β² + C(n−1,1)a_{n−1} α^{n−2}β + C(n−2,0)a_{n−2} α^{n−2})x^{n−2}
              + . . . .

A content δ of f(αx + β) divides C(n,0)a_n α^n, hence δ | a_n (α, and so α^n, is a unit);
and δ divides the coefficient of x^{n−1}, hence δ | C(n−1,0)a_{n−1} α^{n−1}, hence δ | a_{n−1}; and δ
divides the coefficient of x^{n−2}, hence δ | C(n−2,0)a_{n−2} α^{n−2}, hence δ | a_{n−2}; etc.
Proceeding in this way, we see that δ divides all the coefficients of f(x).
Since γ ∼ C(f(x)), we obtain δ | γ. The same argument with f(αx + β), f(x), T^{−1}
in place of f(x), f(αx + β), T shows that γ | δ. Thus γ ∼ δ, as was to be proved.

When C(f(x)) ∼ 1 but the divisibility conditions in Eisenstein's criterion
are not satisfied, we might attempt to find a unit α and an element β so
that f(αx + β) will satisfy the divisibility conditions. If we succeed in
finding such α, β, then f(αx + β) will be irreducible by Eisenstein's
criterion (as C(f(αx + β)) ∼ 1 by Lemma 37.3(4)) and f(x) will be
irreducible, too (by Lemma 37.3(3)). This is what we did in Example
37.2(d).

Eisenstein's criterion is a sufficient condition for irreducibility. It is not
necessary, even if we extend it using Lemma 37.3(3). That is to say, f(x)
may be irreducible and yet, for all units α in D and for all elements β in
D, the polynomial f(αx + β) may fail to satisfy the divisibility conditions
in Eisenstein's criterion. In fact, a closer study of its proof reveals that
we are essentially reading the polynomials mod Dp, i.e., we are taking
the images of polynomials in D[x] under the mapping ¯: D[x] → (D/Dp)[x]
(see Lemma 33.7).

37.4 Lemma: Let D be an integral domain and let K be a field. Let
σ: D → K be a ring homomorphism and let ¯: D[x] → K[x], f → f̄, be the
homomorphism of Lemma 33.7.
(1) If f ∈ D[x] and f = gh with g,h ∈ D[x], then f̄ = ḡh̄.
(2) If f ∈ D[x]\D, deg f̄ = deg f and f̄ is irreducible in K[x], then f has no
divisors g in D[x] such that 0 < deg g < deg f.

Proof: (1) This follows from the fact that ¯ is a homomorphism.

(2) Suppose, on the contrary, that f = gh in D[x], with 0 < deg g < deg f.
Then f̄ = ḡh̄ by (1). Since f̄ is irreducible in K[x], f̄ ≠ 0, so ḡ ≠ 0 ≠ h̄,
and either deg ḡ = 0 or deg h̄ = 0. We then get
    deg f = deg f̄ = deg ḡ + deg h̄
          ≤ deg ḡ + deg h ≤ deg g + deg h = deg gh = deg f,
which forces deg ḡ = deg g and deg h̄ = deg h. Thus either deg g = 0 or
deg h = 0, and so either deg g = 0 or deg g = deg f, against our hypothesis
0 < deg g < deg f.

In Lemma 37.4, we relaxed the hypothesis on C(f) that was imposed in
Eisenstein's criterion. We pay for it, of course. Notice that we did not claim
that f is irreducible over D. We claimed only that f has no proper factor
of positive degree less than deg f. Here f may have proper divisors, but
any factorization of f in D[x] has the form f = γf1, where γ ∈ D and deg f1
= deg f.

37.5 Examples: (a) Let q(x) = x³ + x + 1 = 1̄x³ + 1̄x + 1̄ ∈ ℤ2[x]. If q(x)
were reducible in ℤ2[x], it would have a factor of degree ≤ 3/2, so a
factor of degree 1. So q(x) would have a root in ℤ2 = {0,1} by the factor
theorem (Theorem 35.6). But q(0) = 1 ≠ 0 and q(1) = 1 ≠ 0, so q(x) is
irreducible in ℤ2[x].

Let f(x) = x³ + 2x² + x + 7 ∈ ℤ[x]. Under the mapping ¯: ℤ[x] → ℤ2[x],
where σ: ℤ → ℤ2 is the natural homomorphism, we have
    f̄ = 1̄x³ + 2̄x² + 1̄x + 7̄ = x³ + x + 1 = q(x) ∈ ℤ2[x],
and so f̄ is irreducible over ℤ2. By Lemma 37.4(2), f has no polynomial
divisors of degree 1, nor of degree 2. Since f does not have any divisors
of degree 0 either (C(f) ∼ 1), f is irreducible over ℤ.

(b) Lemma 37.4 can be useful even if f̄ is not irreducible. The factori-
zation of f̄ in K[x] gives us information about possible factors of f in D[x]
and restricts their number drastically.

As an illustration, consider f(x) = x⁵ + 5x⁴ + 4x³ + 16x² + 8x + 1 ∈ ℤ[x].
Under ¯: ℤ[x] → ℤ3[x], where σ: ℤ → ℤ3 is the natural homomorphism, we
have (we drop the bars for ease of notation)
    f̄ = x⁵ + 2x⁴ + x³ + x² + 2x + 1 ∈ ℤ3[x]
      = (x² + 2x + 1)(x³ + 1)
      = (x + 1)²(x + 1)(x² − x + 1)
      = (x + 1)²(x + 1)(x² + 2x + 1)
      = (x + 1)⁵,
so the image of any monic factor g of f in ℤ[x] with 1 ≤ deg g ≤ 2 is
    x + 1 ∈ ℤ3[x]   or   (x + 1)² ∈ ℤ3[x]
(ℤ3[x] is a unique factorization domain).

Does f ∈ ℤ[x] have a divisor of degree one? If it had, it would have a
rational root, and that root would be 1 or −1 by Theorem 35.10. Since
f(1) = 35 ≠ 0 and f(−1) = 9 ≠ 0, f has no rational root, and f has no divisor
of degree one.

Does f ∈ ℤ[x] have a divisor of degree two? If f has a divisor g =
g(x) = ax² + bx + c ∈ ℤ[x] of degree two, then the image of g is x² + 2x + 1
∈ ℤ3[x], and so a ≡ 1, b ≡ 2, c ≡ 1 (mod 3). Besides, a divides the leading
coefficient of f, and c divides the constant term of f: thus a | 1 and c | 1,
so a = ±1 and c = ±1. Without restricting generality, we may assume a = 1.
The possible monic factors of f of second degree are therefore to be
found among

    gm(x) = x² + (3m + 2)x + 1,   hm(x) = x² + (3m + 2)x − 1   (m ∈ ℤ).

We check if any gm or hm divides f. Supposing gm(x) | f(x) in ℤ[x], we get

    gm(1) | f(1) in ℤ
    3m + 4 | 35
    3m + 4 ∈ {1, 5, 7, 35, −1, −5, −7, −35}
    3m + 4 = 1, 7, −5, −35
    3m + 2 = −1, 5, −7, −37
    gm(x) = x² − x + 1 or x² + 5x + 1 or x² − 7x + 1 or x² − 37x + 1.

Testing these four polynomials in turn, we find x² − x + 1 does not divide
f(x), and x² + 5x + 1 divides f(x); in fact f(x) = (x² + 5x + 1)(x³ + 3x + 1). [If
none of the four polynomials divided f(x), we would repeat the argu-
ment with hm. In this way, we would find a divisor of f(x) or we would
show that f(x) is irreducible.]
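Both computational claims of this example, the factorization of f and its reduction mod 3, can be confirmed directly (a sketch; names ours):

```python
def pmul(f, g):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

f = [1, 8, 16, 4, 5, 1]                      # x^5 + 5x^4 + 4x^3 + 16x^2 + 8x + 1
assert pmul([1, 5, 1], [1, 3, 0, 1]) == f    # (x^2 + 5x + 1)(x^3 + 3x + 1) = f

pow5 = [1]                                   # build (x + 1)^5
for _ in range(5):
    pow5 = pmul(pow5, [1, 1])
assert [c % 3 for c in f] == [c % 3 for c in pow5]   # f = (x + 1)^5 in Z_3[x]
```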

(c) Lemma 37.4 gives a very elegant proof of Eisenstein's criterion in
case the underlying ring is a principal ideal domain. Suppose D is a
principal ideal domain and
    f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0
is a nonzero polynomial in D[x] with C(f) ∼ 1, and p is a prime element of D
such that
    p ∤ a_n,   p | a_{n−1}, . . . . . . . . . , p | a_1, p | a_0,   p² ∤ a_0.
Since p is irreducible, the factor ring D/Dp is a field (Theorem 32.25). We
can use Lemma 37.4 with the natural homomorphism σ: D → D/Dp. The
divisibility conditions on the coefficients of f imply
    f̄ = ā_n x^n,   ā_n ∈ D/Dp,   ā_n ≠ 0.
If f had a proper factorization f = gh in D[x], where 0 < deg g < n, we
would get
    ḡh̄ = f̄ = ā_n x^n,
hence ḡ = b̄x^r, h̄ = c̄x^s with 0 < r < n, 0 < s < n and b̄c̄ = ā_n.
Then the constant terms of g and h would be divisible by p, and p²
would divide their product a_0, contrary to the hypothesis. Hence f is
irreducible over D.

The idea (that gm(x) | f(x) implies gm(1) | f(1)) in Example 37.5(b) has been ex-
ploited by L. Kronecker (1823-1891). Let D be an integral domain and
let f(x) be an arbitrary nonzero polynomial in D[x]. To find out whether f
is irreducible over D, one must check whether g | f or g ∤ f holds for all
polynomials g with deg g < deg f. If D happens to be finite (and thus a
field; Theorem 31.1), there are finitely many g's with deg g < deg f; and
the question whether f is irreducible over D can be decided by checking
g | f for these finitely many g's. If D is not finite, this argument does
not work, and we must, so it seems, check if g | f for infinitely many
polynomials g ∈ D[x]. Kronecker showed that, if D is a unique
factorization domain which possesses a finite number of units and if we
have a method for finding the irreducible factors of any given nonzero
element of D, then, to find out whether a given nonzero polynomial is
irreducible or not, we need check g | f for only a finite number of
polynomials g in D[x].

His idea is that, if g(x) | f(x) in D[x], then g(a) | f(a) in D for any a ∈ D, and
that a polynomial g is determined uniquely if its values are known at
more than deg g elements of D (Lagrange's interpolation formula).

Let D be an infinite unique factorization domain. Assume there are
finitely many units in D, and assume that there is a method for finding
the irreducible factors of any given nonzero element of D. Let f be a
nonzero polynomial in D[x] of degree n. If n = 0, then f ∈ D and we can
find the irreducible factors of f in D by assumption. If n = 1, then f = cf1,
where c ∼ C(f) and f1 is an irreducible polynomial in D[x]. The irreducible
factors of c ∈ D can be found by assumption, and thus the irreducible
factors of f, too, can be found effectively. If n ≥ 2 and f is reducible,
there is a factor g ∈ D[x] of f with deg g ≤ n/2 (Lemma 33.3(3)). We put
m := [n/2]. We take m + 1 distinct elements a0,a1,a2, . . . ,am from D and
evaluate f(a0),f(a1),f(a2), . . . ,f(am) ∈ D. If any f(ai) happens to be 0 ∈ D,
then x − ai is a factor of f (Theorem 35.6). Therefore we may assume that
f(a0),f(a1),f(a2), . . . ,f(am) are all distinct from zero. Each one of them has
finitely many divisors in D, because D is a unique factorization domain
and D has finitely many units. There is assumed to be a method of finding
these divisors. Let Ni be the number of factors of f(ai). A factor g ∈ D[x] of f
with deg g ≤ m satisfies one of the N0N1N2. . . Nm systems of
equations
    g(a0) = c0, g(a1) = c1, g(a2) = c2, . . . , g(am) = cm,    (†)
where c0,c1,c2, . . . ,cm run independently over the divisors of the elements
f(a0),f(a1),f(a2), . . . ,f(am), respectively. For each one of these N0N1N2. . . Nm
choices of c0,c1,c2, . . . ,cm, we build the unique polynomial g satisfying (†).
This is done by Lagrange's interpolation formula; but this formula re-
quires that the underlying ring be in fact a field. Thus Lagrange's inter-
polation formula gives us a list of N0N1N2. . . Nm polynomials g in F[x],
where F is the field of fractions of D, one for each choice c0,c1,c2, . . . ,cm of
the divisors of f(a0),f(a1),f(a2), . . . ,f(am).

From this list of polynomials, we delete those which are not in D[x]. If
any polynomial g remains, we divide f by g in F[x]. Then f = qg + r, with
q,r ∈ F[x]. If r ≠ 0, or if r = 0 but q ∉ D[x], we delete g from our list. We
also delete from our list the polynomials which are units in D. If
any polynomial g survives, it is a factor of f. Otherwise, f is irreducible
over D.

When a proper divisor g of f is found in this way, the same procedure
can be applied to g and f/g. Repeating this process, we can find all irre-
ducible factors of f.
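For D = ℤ the search just described can be carried out literally, since ℤ has only the units ±1 and divisors of integers can be listed effectively. The sketch below is our own (all helper names are ours, not the book's); it takes f as an integer coefficient list, constant term first, and hunts for a factor of degree at most n/2 exactly as in the text: evaluate, enumerate divisors, interpolate, trial-divide.

```python
from fractions import Fraction
from itertools import product

def divisors(n):
    # all divisors of a nonzero integer, with both signs (the units of Z are 1, -1)
    n = abs(n)
    return [s * d for d in range(1, n + 1) if n % d == 0 for s in (1, -1)]

def poly_eval(f, x):
    # f is a coefficient list, constant term first
    return sum(c * x**i for i, c in enumerate(f))

def lagrange(points):
    # the unique polynomial of degree <= len(points)-1 through the given
    # points, with coefficients in Q (constant term first)
    coeffs = [Fraction(0)] * len(points)
    for i, (ai, ci) in enumerate(points):
        basis, denom = [Fraction(1)], Fraction(1)
        for j, (aj, _) in enumerate(points):
            if j == i:
                continue
            denom *= ai - aj
            new = [Fraction(0)] * (len(basis) + 1)
            for k, b in enumerate(basis):     # multiply basis by (x - aj)
                new[k + 1] += b
                new[k] -= b * aj
            basis = new
        for k, b in enumerate(basis):
            coeffs[k] += ci * b / denom
    return coeffs

def trial_divide(f, g):
    # long division of f by g over Q; return the quotient if g divides f exactly
    r = [Fraction(c) for c in f]
    q = [Fraction(0)] * (len(f) - len(g) + 1)
    for shift in range(len(f) - len(g), -1, -1):
        t = r[shift + len(g) - 1] / g[-1]
        q[shift] = t
        for k, gc in enumerate(g):
            r[shift + k] -= t * gc
    return q if not any(r) else None

def kronecker_factor(f):
    # f: integer coefficients of a polynomial of degree n >= 2, constant first.
    # Returns a nontrivial factor g in Z[x] with deg g <= n/2, or None.
    n = len(f) - 1
    m = n // 2
    pts, a = [], 0
    while len(pts) < m + 1:
        if poly_eval(f, a) == 0:
            return [-a, 1]                    # x - a is a factor (Theorem 35.6)
        pts.append(a)
        a += 1
    for cs in product(*[divisors(poly_eval(f, a)) for a in pts]):
        g = lagrange(list(zip(pts, cs)))
        if any(c.denominator != 1 for c in g):
            continue                          # interpolant not in Z[x]: delete it
        g = [int(c) for c in g]
        while g and g[-1] == 0:
            g.pop()
        if len(g) <= 1:
            continue                          # constants (units of Z) are deleted
        q = trial_divide(f, g)
        if q is not None and all(c.denominator == 1 for c in q):
            return g
    return None
```

As the text remarks, this search is very long in practice; its point is that every step is effective.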

The ring ℤ of integers satisfies the conditions imposed on D in Kronecker's
method. Thus the irreducibility of a polynomial in ℤ[x] can be determined
effectively. This in turn implies that the irreducibility of a polynomial in
ℤ[x][y] can be determined effectively. By repeated application of
Kronecker's method, we can always decide whether a given polynomial
in ℤ[x1,x2, . . . ,xn] is irreducible or reducible. The same holds for
polynomials in the rings ℤ[i][x1,x2, . . . ,xn] and ℤ[ω][x1,x2, . . . ,xn].

Kronecker's method is very long and very cumbersome in any specific


case. However, it is important philosophically, because it assures that the
irreducibility or reducibility of a polynomial can be determined
effectively in a finite number of steps.

Exercises

1. Using Eisenstein's criterion, show that the following polynomials are
irreducible over the rings indicated:
x^4 - 6x^3 + 24x^2 - 30x + 14 over ℚ,
x^4 + 6x^3 - 42x^2 + 57x + 78 over ℚ,
3x^5 + (21 - i)x^4 + (14 - 5i)x^3 + (-10 + 11i) over ℤ[i],
x^5 - 7x^4 + (3 + 2ω)x^3 + (2 - ω)x + (1 - 4ω) over ℤ[ω].

2. Let f = x^6 - 2x^5 + 3x^4 - 2x^3 + 3x^2 - 2x + 2 ∈ ℚ[x]. Either prove that f is
irreducible over ℚ or find all irreducible factors of f in ℚ[x].

3. Do Ex. 2 for the polynomials x^4 - 2x^3 - 2x^2 + 15x + 30 and
x^5 + 8x^4 + 25x^3 + 39x^2 + 30x + 7 in ℚ[x].

§38
Symmetric Polynomials

Let D be an integral domain and let f(x1,x2, . . . ,xm) ∈ D[x1,x2, . . . ,xm]. For
any permutation τ in Sm, with τ(1) = i1, τ(2) = i2, . . . , τ(m) = im, the value
of f at (xi1,xi2, . . . ,xim) is a polynomial f(xi1,xi2, . . . ,xim) in D[x1,x2, . . . ,xm],
which we can shortly denote by fτ (Definition 35.19). For example, if
f(x,y,z) = x^2 + y^2 - xz in ℤ[x,y,z], then f(z,x,y) = z^2 + x^2 - zy; and if
g(x,y) = x^2 - xy + y^3 in ℤ[x,y], then g(y,x) = y^2 - yx + x^3 ∈ ℤ[x,y]. In
general, f(xi1,xi2, . . . ,xim) will be a polynomial distinct from f(x1,x2, . . . ,xm).

38.1 Definition: Let D be an integral domain and let f(x1,x2, . . . ,xm) be a
polynomial in D[x1,x2, . . . ,xm]. If f(xi1,xi2, . . . ,xim) = f(x1,x2, . . . ,xm) for all
permutations τ: k → ik in Sm, then f = f(x1,x2, . . . ,xm) is called a
symmetric polynomial in D[x1,x2, . . . ,xm]. We also say that f(x1,x2, . . . ,xm) is
symmetric in the indeterminates x1,x2, . . . ,xm.

The polynomials x + y, xy, x^2 + y^2, x^3 + y^3 are symmetric polynomials in
D[x,y]. Also, the polynomials x^2 + y^2 + z^2 and xy + yz + zx are symmetric
polynomials in D[x,y,z].
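Definition 38.1 can be tested mechanically on small examples. In the sketch below (our own representation, not the book's), a polynomial is a dictionary from exponent tuples to coefficients, and symmetry means invariance under every permutation of the indeterminates:

```python
from itertools import permutations

def permute(f, p):
    # apply x_k -> x_{p(k)} to a polynomial {exponent tuple: coefficient}
    g = {}
    for e, c in f.items():
        pe = [0] * len(e)
        for k, exp in enumerate(e):
            pe[p[k]] = exp
        g[tuple(pe)] = g.get(tuple(pe), 0) + c
    return g

def is_symmetric(f, m):
    # f is symmetric iff f is unchanged by every permutation in S_m
    return all(permute(f, p) == f for p in permutations(range(m)))

# x^2 + y^2 - xz from the text is not symmetric; xy + yz + zx is:
f = {(2, 0, 0): 1, (0, 2, 0): 1, (1, 0, 1): -1}
g = {(1, 1, 0): 1, (0, 1, 1): 1, (1, 0, 1): 1}
```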

The sum, difference and product of symmetric polynomials are
symmetric polynomials. Indeed, if f(x1,x2, . . . ,xm) and g(x1,x2, . . . ,xm) are
symmetric polynomials in D[x1,x2, . . . ,xm], and if
h(x1,x2, . . . ,xm) = f(x1,x2, . . . ,xm) + g(x1,x2, . . . ,xm)
is their sum, then, for any permutation τ: k → ik in Sm, we have
h(xi1,xi2, . . . ,xim) = f(xi1,xi2, . . . ,xim) + g(xi1,xi2, . . . ,xim)
= f(x1,x2, . . . ,xm) + g(x1,x2, . . . ,xm)
= h(x1,x2, . . . ,xm),

and so h(x1,x2, . . . ,xm) is a symmetric polynomial. The same argument
works also when h = f g and h = fg. This proves

38.2 Lemma: Let D be an integral domain. The symmetric polynomials
in D[x1,x2, . . . ,xm] form a subring of D[x1,x2, . . . ,xm].

We introduce a new indeterminate t and consider the polynomial

f(t) = (t - x1)(t - x2). . . (t - xm) in D[x1,x2, . . . ,xm][t].

We see that x1,x2, . . . ,xm ∈ D[x1,x2, . . . ,xm] are the roots of f(t). We have

f(t) = t^m - σ1(x1,x2,. . . ,xm)t^{m-1} + σ2(x1,x2,. . . ,xm)t^{m-2} - . . . + (-1)^m σm(x1,x2,. . . ,xm)

for some σ1, σ2, . . . , σm in D[x1,x2, . . . ,xm]. Since

f(t) = (t - xi1)(t - xi2). . . (t - xim)
= t^m - σ1(xi1,xi2,. . . ,xim)t^{m-1} + σ2(xi1,xi2,. . . ,xim)t^{m-2} - . . . + (-1)^m σm(xi1,xi2,. . . ,xim)

for any permutation τ: k → ik in Sm, we have

σj(xi1,xi2,. . . ,xim) = σj(x1,x2,. . . ,xm) for all j = 1,2, . . . ,m.

Thus σ1, σ2, . . . , σm are symmetric polynomials in D[x1,x2, . . . ,xm].

38.3 Definition: Let D be an integral domain and let

(t - x1)(t - x2). . . (t - xm)
= t^m - σ1(x1,x2,. . . ,xm)t^{m-1} + σ2(x1,x2,. . . ,xm)t^{m-2} - . . . + (-1)^m σm(x1,x2,. . . ,xm).

The symmetric polynomials σ1, σ2, . . . , σm are called the elementary
symmetric polynomials in D[x1,x2, . . . ,xm].

By routine computation, we find the elementary symmetric polynomials
explicitly. For example,

σ1 = x + y, σ2 = xy in D[x,y];

σ1 = x + y + z, σ2 = xy + yz + zx, σ3 = xyz in D[x,y,z];

σ1 = x + y + z + u, σ2 = xy + xz + xu + yz + yu + zu,
σ3 = xyz + xyu + xzu + yzu, σ4 = xyzu in D[x,y,z,u]

are the elementary symmetric polynomials.
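These lists can be generated for any m and j, since σj is the sum of all products of j distinct indeterminates. A short sketch (our own representation, mapping exponent tuples to coefficients):

```python
from itertools import combinations

def elementary(m, j):
    # sigma_j(x1, ..., xm) as {exponent tuple: coefficient}: one monomial
    # for each j-element subset of the indeterminates
    return {tuple(1 if i in chosen else 0 for i in range(m)): 1
            for chosen in combinations(range(m), j)}

# sigma_2 in D[x,y,z] is xy + yz + zx:
# elementary(3, 2) == {(1,1,0): 1, (1,0,1): 1, (0,1,1): 1}
```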

Notice that (t - x1)(t - x2). . . (t - xm), when multiplied out, is a sum of
certain terms a1a2. . . am, where each ai is either t or one of -x1, -x2, . . . , -xm.
The term (-1)^j σj(x1,x2, . . . ,xm)t^{m-j} is the sum of those a1a2. . . am's for which
exactly m - j of the a's are equal to t. Hence (-1)^j σj(x1,x2, . . . ,xm) is the sum
of all products b1b2. . . bj, where {b1,b2, . . . ,bj} ranges over the j-element
subsets of {-x1, -x2, . . . , -xm}. In other words, σj(x1,x2, . . . ,xm) is the sum of
all (m choose j) products of x1,x2, . . . ,xm, taken j at a time. Thus

σ1 = ∑i xi
σ2 = ∑i<j xi xj
σ3 = ∑i<j<k xi xj xk
..........................
σm = x1x2. . . xm.

Note that "σj" stands for many polynomials: σj in D[x1,x2, . . . ,xm] is distinct
from σj in D[x1,x2, . . . ,xn] when m ≠ n. This ambiguity in notation will not
cause any confusion if we pay attention to the number of indeterminates.
When confusion is likely, we write σj(x1,x2, . . . ,xm) instead of σj.

Now σ1, σ2, . . . , σm are symmetric polynomials in D[x1,x2, . . . ,xm], and, by
repeated application of Lemma 38.2, we conclude that g(σ1,σ2, . . . ,σm) is
also a symmetric polynomial, where g is any polynomial in m indeter-
minates. Hence the set {g(σ1,σ2, . . . ,σm) : g ∈ D[u1,u2, . . . ,um]} consists only of
symmetric polynomials. We will prove conversely that every symmetric
polynomial is in this set (the subring of symmetric polynomials in
D[x1,x2, . . . ,xm] is the subring of D[x1,x2, . . . ,xm] generated by σ1,σ2, . . . ,σm).

38.4 Theorem (Fundamental theorem on symmetric polynomials):
Let D be an integral domain and f(x1,x2, . . . ,xm) a symmetric poly-
nomial in D[x1,x2, . . . ,xm]. Then there is a unique polynomial g(u1,u2,. . . ,um)
in D[u1,u2, . . . ,um] such that f is the value of g at (σ1,σ2, . . . ,σm):

f(x1,x2, . . . ,xm) = g(σ1,σ2, . . . ,σm) ∈ D[x1,x2, . . . ,xm].

Loosely speaking, every symmetric polynomial is a polynomial in the
elementary symmetric polynomials σ1,σ2, . . . ,σm. We introduced new in-
determinates u1,u2, . . . ,um in order to distinguish clearly between g and
g(σ1,σ2, . . . ,σm).

For example, f(x,y) = x^2 + y^2 ∈ ℤ[x,y] is a symmetric polynomial, and we
have x^2 + y^2 = (x + y)^2 - 2xy = σ1^2 - 2σ2. Hence f(x,y) = g(σ1,σ2), where
g(u,v) = u^2 - 2v ∈ ℤ[u,v]. Likewise, if f(x,y,z) is the symmetric polynomial
x^2y + xy^2 + x^2z + xz^2 + y^2z + yz^2 in ℤ[x,y,z], we have

f(x,y,z) = (x + y + z)(xy + yz + zx) - 3xyz = σ1σ2 - 3σ3.

Thus f(x,y,z) = g(σ1,σ2,σ3), where g(u,v,w) = uv - 3w ∈ ℤ[u,v,w].

The proof of the fundamental theorem requires some preparation. First
we need an ordering of m-tuples. Given any two m-tuples (r1,r2, . . . ,rm),
(s1,s2, . . . ,sm) of nonnegative integers, we will say (r1,r2, . . . ,rm) is higher
than (s1,s2, . . . ,sm), or (s1,s2, . . . ,sm) is lower than (r1,r2, . . . ,rm), when r1 > s1.
If r1 = s1, we will say (r1,r2, . . . ,rm) is higher than (s1,s2, . . . ,sm), or
(s1,s2, . . . ,sm) is lower than (r1,r2, . . . ,rm), when r2 > s2. If r1 = s1 and r2 = s2,
we will compare r3 and s3, etc. This is very much like the ordering of
words alphabetically, and will be referred to as the alphabetical or
lexicographical ordering of m-tuples. Stated differently, (r1,r2, . . . ,rm) is
higher than (s1,s2, . . . ,sm) if and only if the first nonzero difference among
r1 - s1, r2 - s2, . . . , rm - sm
is positive. Clearly, if (r1,r2, . . . ,rm) is higher than (s1,s2, . . . ,sm) and
(s1,s2, . . . ,sm) is higher than (t1,t2, . . . ,tm), then (r1,r2, . . . ,rm) is higher than
(t1,t2, . . . ,tm).
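This is exactly the rule by which Python compares tuples, so the ordering can be made concrete in a few lines (a hypothetical helper, not from the text):

```python
def is_higher(r, s):
    # (r1,...,rm) is higher than (s1,...,sm) iff the first nonzero
    # difference r_i - s_i is positive
    for ri, si in zip(r, s):
        if ri != si:
            return ri > si
    return False          # equal m-tuples: neither is higher

# (2,1,0) is higher than (2,0,5): the first nonzero difference is 1 - 0 > 0
```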

Now let f be a polynomial in D[x1,x2, . . . ,xm]. So f is a sum of monomials
ax1^{k1} x2^{k2} . . . xm^{km}, where a ∈ D and (k1,k2, . . . ,km) is an m-tuple of nonne-
gative integers. Here there may be several monomials ax1^{k1} x2^{k2} . . . xm^{km},
bx1^{k1} x2^{k2} . . . xm^{km}, cx1^{k1} x2^{k2} . . . xm^{km}, etc. with the same exponent system
(k1,k2, . . . ,km). In this case, we collect these monomials into a single one
(a + b + c + . . . )x1^{k1} x2^{k2} . . . xm^{km}. We assume this has been done for each of
the exponent systems, so that each m-tuple (k1,k2, . . . ,km) occurs as an
exponent system of a monomial at most once. If, after this collection
process, a monomial ax1^{k1} x2^{k2} . . . xm^{km} occurring in f has a nonzero coeffi-
cient a ∈ D, we will say that this monomial appears in f.

Let us now assume f ≠ 0. We order the monomials appearing in f by the
alphabetical ordering of their exponent systems. First we write the
monomial appearing in f whose exponent system is highest (i.e., higher
than the exponent systems of all other monomials appearing in f).
Among the remaining monomials appearing in f, we find the one with
the highest exponent system and write it in the second place. Among the
remaining monomials appearing in f, the one with the highest exponent
system will be written in the third place, and so on. In this ordering of
monomials, the one that is written in the first place, that is to say, the
one with the highest exponent system, will be called the leading mono-
mial of the nonzero polynomial f ∈ D[x1,x2, . . . ,xm]. Note that the coeffi-
cients of monomials play no role in this ordering. Only the exponent
systems are relevant.

For instance, f(x,y,z) = xz^5 + z^7 + 2x^3 + 5x^2y + 100x^2y^2 - x^2y^2z ∈ ℤ[x,y,z]
will be written as 2x^3 - x^2y^2z + 100x^2y^2 + 5x^2y + xz^5 + z^7 when we order
the monomials in the described manner. The leading monomial of f(x,y,z)
is 2x^3.
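With monomials stored as exponent tuples, the leading monomial is simply the lexicographically maximal key; the sketch below (our own representation) reproduces this example:

```python
def leading_monomial(f):
    # f: {exponent tuple: nonzero coefficient}; Python orders tuples
    # lexicographically, which is the "alphabetical" ordering of the text
    k = max(f)
    return k, f[k]

# f(x,y,z) = xz^5 + z^7 + 2x^3 + 5x^2y + 100x^2y^2 - x^2y^2z
f = {(1, 0, 5): 1, (0, 0, 7): 1, (3, 0, 0): 2,
     (2, 1, 0): 5, (2, 2, 0): 100, (2, 2, 1): -1}
# leading_monomial(f) picks out 2x^3, i.e. ((3, 0, 0), 2)
```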

38.5 Lemma: Let D be an integral domain and f,g ∈ D[x1,x2, . . . ,xm]\{0}.
If ax1^{k1} x2^{k2} . . . xm^{km} is the leading monomial of f and bx1^{n1} x2^{n2} . . . xm^{nm} is the
leading monomial of g, then abx1^{k1+n1} x2^{k2+n2} . . . xm^{km+nm} is the leading
monomial of fg.

Proof: By hypothesis, a ≠ 0, b ≠ 0, so ab ≠ 0. Now fg ≠ 0 and fg is the sum
of all products (cx1^{r1} x2^{r2} . . . xm^{rm})(dx1^{s1} x2^{s2} . . . xm^{sm}), where cx1^{r1} x2^{r2} . . . xm^{rm} and
dx1^{s1} x2^{s2} . . . xm^{sm} run through all monomials appearing in f and g,
respectively. We contend that, among all these products, the highest
exponent system is (k1 + n1,k2 + n2, . . . ,km + nm), and that this exponent
system arises only from the product (ax1^{k1} x2^{k2} . . . xm^{km})(bx1^{n1} x2^{n2} . . . xm^{nm}).
This will imply

fg = abx1^{k1+n1} x2^{k2+n2} . . . xm^{km+nm}
+ [a sum of monomials, each with an exponent
system lower than (k1 + n1,k2 + n2, . . . ,km + nm)],

and, since ab ≠ 0, the leading monomial of fg will be equal to
abx1^{k1+n1} x2^{k2+n2} . . . xm^{km+nm}.

To prove our contention, let cx1^{r1} x2^{r2} . . . xm^{rm} be a monomial appearing in f
and let dx1^{s1} x2^{s2} . . . xm^{sm} be one appearing in g, but assume that either
cx1^{r1} x2^{r2} . . . xm^{rm} is distinct from ax1^{k1} x2^{k2} . . . xm^{km} or dx1^{s1} x2^{s2} . . . xm^{sm} is
distinct from bx1^{n1} x2^{n2} . . . xm^{nm}. We are to show that the exponent system
(r1 + s1,r2 + s2, . . . ,rm + sm) is lower than (k1 + n1,k2 + n2, . . . ,km + nm). Now
(r1,r2, . . . ,rm) is lower than (k1,k2, . . . ,km) or equal to it, and (s1,s2, . . . ,sm) is
lower than (n1,n2, . . . ,nm) or equal to it, but the case of simultaneous
equality is excluded. Hence the first nonzero integer in
k1 - r1, k2 - r2, . . . , km - rm
is positive, or (k1,k2, . . . ,km) = (r1,r2, . . . ,rm); and the first nonzero integer
in n1 - s1, n2 - s2, . . . , nm - sm
is positive, or (n1,n2, . . . ,nm) = (s1,s2, . . . ,sm). Since simultaneous equality is
excluded, there are nonzero integers in
(k1 - r1) + (n1 - s1), (k2 - r2) + (n2 - s2), . . . , (km - rm) + (nm - sm)
and the first of them, being a sum of two positive integers or a sum of a
positive integer and zero, is certainly positive. This means that
(k1 + n1,k2 + n2, . . . ,km + nm) is higher than (r1 + s1,r2 + s2, . . . ,rm + sm).

Since the exponent system (k1 + n1,k2 + n2, . . . ,km + nm) does arise from
the product (ax1^{k1} x2^{k2} . . . xm^{km})(bx1^{n1} x2^{n2} . . . xm^{nm}), it is indeed the highest
exponent system of all the products (cx1^{r1} x2^{r2} . . . xm^{rm})(dx1^{s1} x2^{s2} . . . xm^{sm}),
where cx1^{r1} x2^{r2} . . . xm^{rm} and dx1^{s1} x2^{s2} . . . xm^{sm} run through all monomials
appearing in f and g, respectively. This proves our contention, and also
the lemma.

By induction, we obtain

38.6 Lemma: Let D be an integral domain and f1,f2, . . . ,ft be nonzero
polynomials in D[x1,x2, . . . ,xm]. Then the leading monomial of f1f2. . . ft is
the product of the leading monomials of f1,f2, . . . ,ft.
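Lemma 38.5 (and hence 38.6) can be observed numerically by representing a polynomial as a dictionary from exponent tuples to coefficients (a sketch with names of our own choosing):

```python
def poly_mul(f, g):
    # multiply polynomials given as {exponent tuple: coefficient};
    # exponent systems add componentwise when monomials are multiplied
    h = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            h[e] = h.get(e, 0) + c1 * c2
    return {e: c for e, c in h.items() if c != 0}

# f = x^2*y + y has leading monomial x^2*y; g = x*y^2 + 3 has x*y^2;
# the leading monomial of fg is their product x^3*y^3
f = {(2, 1): 1, (0, 1): 1}
g = {(1, 2): 1, (0, 0): 3}
```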
38.7 Lemma: Let D be an integral domain, a ∈ D\{0}, and let σ1,σ2, . . . ,σm
be the elementary symmetric polynomials in D[x1,x2, . . . ,xm].
If k1 ≥ k2 ≥ k3 ≥ . . . ≥ km ≥ 0 are integers, then the leading monomial
of aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km} is ax1^{k1} x2^{k2} . . . x_{m-1}^{k_{m-1}} xm^{km}.

Proof: The leading monomials of σ1, σ2, σ3, σ4, . . . , σm are respectively x1,
x1x2, x1x2x3, x1x2x3x4, . . . , x1x2. . . xm, because σj is a sum of (m choose j)
monomials, each of which is a product of j indeterminates from x1,x2, . . . ,xm.
In view of Lemma 38.6, the leading monomial of aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km} is

a(x1)^{k1-k2} (x1x2)^{k2-k3} (x1x2x3)^{k3-k4} . . . (x1x2x3. . . x_{m-1})^{k_{m-1}-km} (x1x2x3. . . x_{m-1}xm)^{km}
= ax1^{k1} x2^{k2} . . . x_{m-1}^{k_{m-1}} xm^{km}.

We need one more lemma for the proof of the fundamental theorem.

38.8 Lemma: Let D be an integral domain and let f(x1,x2, . . . ,xm) be a
nonzero symmetric polynomial in D[x1,x2, . . . ,xm]. Let ax1^{k1} x2^{k2} . . . xm^{km} be
the leading monomial of f (here a ∈ D, a ≠ 0 and k1,k2, . . . ,km are nonne-
gative integers).
(1) We have k1 ≥ k2 ≥ . . . ≥ k_{m-1} ≥ km.
(2) If bx1^{r1} x2^{r2} . . . xm^{rm} is a monomial appearing in f, then
k1 ≥ r1, k1 ≥ r2, . . . , k1 ≥ rm.

Proof: Let τ be any permutation in Sm and let ρ be the inverse of τ.

(1) As ax1^{k1} x2^{k2} . . . xm^{km} appears in f(x1,x2, . . . ,xm), the monomial
ax_{ρ(1)}^{k1} x_{ρ(2)}^{k2} . . . x_{ρ(m)}^{km} appears in f(x_{ρ(1)},x_{ρ(2)}, . . . ,x_{ρ(m)}) = fρ = f = f(x1,x2, . . . ,xm);
collecting the indeterminates in their natural order, this says that
ax1^{k_{τ(1)}} x2^{k_{τ(2)}} . . . xm^{k_{τ(m)}} appears in f(x1,x2, . . . ,xm),
and, since ax1^{k1} x2^{k2} . . . xm^{km} is the leading monomial of f, we obtain:

for all τ ∈ Sm, (k1,k2, . . . ,km) is higher than or equal to (k_{τ(1)},k_{τ(2)}, . . . ,k_{τ(m)}).

Using this with τ = (1 2) ∈ Sm, we see (k1,k2, . . . ,km) is higher than or
equal to (k2,k1,k3, . . . ,km), so k1 ≥ k2. And τ = (2 3) yields that (k1,k2,k3, . . .
,km) is higher than or equal to (k1,k3,k2, . . . ,km), so k2 ≥ k3. In like
manner, when we choose τ = (3 4), . . . , (m-1 m) ∈ Sm, we get k3 ≥ k4, . . . ,
k_{m-1} ≥ km. This proves (1).

(2) As bx1^{r1} x2^{r2} . . . xm^{rm} appears in f(x1,x2, . . . ,xm), the same reasoning
shows that bx1^{r_{τ(1)}} x2^{r_{τ(2)}} . . . xm^{r_{τ(m)}} appears in f(x1,x2, . . . ,xm), and so:

for all τ ∈ Sm, (k1,k2, . . . ,km) is higher than or equal to (r_{τ(1)},r_{τ(2)}, . . . ,r_{τ(m)}).

Thus k1 ≥ r_{τ(1)} for all τ ∈ Sm. Here τ(1) assumes all values 1,2, . . . ,m as τ
runs through Sm, and hence k1 ≥ r1, k1 ≥ r2, . . . , k1 ≥ rm.

Proof of the fundamental theorem: Throughout the proof, the num-
ber m of the indeterminates will be fixed. We make induction on the
exponent system of the leading monomial of the symmetric polynomial.
This will be explained shortly.

Let f be a nonzero symmetric polynomial in D[x1,x2, . . . ,xm] and let
ax1^{k1} x2^{k2} . . . xm^{km} be its leading monomial.

First we claim: if (k1,k2, . . . ,km) = (0,0, . . . ,0), then there is a polynomial g
in m indeterminates u1,u2, . . . ,um over D such that f(x1,x2, . . . ,xm) is equal
to g(σ1,σ2, . . . ,σm). This is very easy to prove. Indeed, if (k1,k2, . . . ,km) =
(0,0, . . . ,0), then, by Lemma 38.8(2), the exponent system of any mono-
mial appearing in f is (0,0, . . . ,0), so f is the constant polynomial a in
D[x1,x2, . . . ,xm]. Then of course f(x1,x2, . . . ,xm) = g(σ1,σ2, . . . ,σm), where g is
the constant polynomial a in D[u1,u2, . . . ,um].

Now suppose that (k1,k2, . . . ,km) is higher than (0,0, . . . ,0) and that, for
any nonzero symmetric polynomial f1 ∈ D[x1,x2, . . . ,xm] whose leading
monomial has a lower exponent system than (k1,k2, . . . ,km), there is a
polynomial g1 in D[u1,u2, . . . ,um] such that f1(x1,x2, . . . ,xm) = g1(σ1,σ2, . . . ,σm).
Under this assumption, we will prove the existence of a polynomial g in
D[u1,u2, . . . ,um] with f(x1,x2, . . . ,xm) = g(σ1,σ2, . . . ,σm). This will establish the
fundamental theorem, because (0,0, . . . ,0) is the lowest possible exponent
system and the theorem has been proved in this case above. Moreover,
as there are only a finite number of m-tuples lower than (k1,k2, . . . ,km),
the method of proof can be used effectively to find the polynomial g ex-
plicitly in concrete cases. [Basically, we write the m-tuples L1,L2,L3, . . . in
alphabetical order and prove that (1) the theorem is true for all nonzero
symmetric polynomials whose leading monomials have the exponent
system L1 = (0,0, . . . ,0), and that (2) for any s ≥ 1, if the theorem is true
for all nonzero symmetric polynomials whose leading monomials have
exponent systems equal to one of L1,L2, . . . ,L_{s-1}, then the theorem is also
true for all nonzero symmetric polynomials whose leading monomials
have the exponent system Ls. Once the leading monomial of a symmetric
polynomial is given, there can be only a finite number of exponent
systems of monomials appearing in that symmetric polynomial (Lemma
38.8(2)).]

By Lemma 38.8(1), the integers k1 - k2, k2 - k3, . . . , k_{m-1} - km, km are non-
negative and, by Lemma 38.7, the polynomial aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km}
has the same leading monomial as f. Let f1 = f - aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km}.
Thus f1 is a symmetric polynomial in D[x1,x2, . . . ,xm]. If f1 = 0, then
f = aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km} and f = g(σ1,σ2, . . . ,σm), where
g = au1^{k1-k2} u2^{k2-k3} . . . u_{m-1}^{k_{m-1}-km} um^{km} ∈ D[u1,u2, . . . ,um], and the proof
is completed in this case. If f1 ≠ 0, then f1 has a leading monomial. The
exponent system of this leading monomial of f1 is the exponent system of
a monomial appearing in f or in aσ1^{k1-k2} σ2^{k2-k3} . . . σm^{km} (or in both). This
exponent system is distinct from (k1,k2, . . . ,km). Since it arises from a
monomial appearing in f or in aσ1^{k1-k2} σ2^{k2-k3} . . . σm^{km}, it is lower than the
common exponent system (k1,k2, . . . ,km) of the leading monomials of f and
aσ1^{k1-k2} σ2^{k2-k3} . . . σm^{km}. By hypothesis, there is a polynomial g1 in
D[u1,u2, . . . ,um] such that f1(x1,x2, . . . ,xm) = g1(σ1,σ2, . . . ,σm). Hence

f = f1 + aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km}
= g1(σ1,σ2, . . . ,σm) + aσ1^{k1-k2} σ2^{k2-k3} . . . σ_{m-1}^{k_{m-1}-km} σm^{km},

and there is a polynomial g in D[u1,u2, . . . ,um], namely
g = g1 + au1^{k1-k2} u2^{k2-k3} . . . u_{m-1}^{k_{m-1}-km} um^{km},
such that f(x1,x2, . . . ,xm) = g(σ1,σ2, . . . ,σm).

This completes the proof of the existence of g. It remains to show the
uniqueness of g. Suppose now f is a nonzero symmetric polynomial in
D[x1,x2, . . . ,xm] and assume that g,h ∈ D[u1,u2, . . . ,um] with g(σ1,σ2, . . . ,σm) =
f(x1,x2, . . . ,xm) = h(σ1,σ2, . . . ,σm). If g were distinct from h, then g - h ≠ 0
would have a leading monomial, which we may write in the form
bu1^{s1-s2} u2^{s2-s3} . . . u_{m-1}^{s_{m-1}-sm} um^{sm}, where s1 ≥ s2 ≥ . . . ≥ s_{m-1} ≥ sm. Then
0 = f - f = g(σ1,σ2, . . . ,σm) - h(σ1,σ2, . . . ,σm) in D[x1,x2, . . . ,xm] would have a
leading monomial bx1^{s1} x2^{s2} . . . x_{m-1}^{s_{m-1}} xm^{sm}, a contradiction. Hence g = h, as
was to be proved.
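The existence half of the proof is effectively an algorithm: repeatedly subtract aσ1^{k1-k2} σ2^{k2-k3} . . . σm^{km} to cancel the leading monomial. A minimal self-contained sketch (our own dictionary representation, mapping exponent tuples to coefficients; termination relies on the input being symmetric, by Lemma 38.8):

```python
from itertools import combinations

def poly_add(f, g, sign=1):
    # f + sign*g for polynomials given as {exponent tuple: coefficient}
    h = dict(f)
    for e, c in g.items():
        h[e] = h.get(e, 0) + sign * c
        if h[e] == 0:
            del h[e]
    return h

def poly_mul(f, g):
    h = {}
    for e1, c1 in f.items():
        for e2, c2 in g.items():
            e = tuple(a + b for a, b in zip(e1, e2))
            h[e] = h.get(e, 0) + c1 * c2
    return {e: c for e, c in h.items() if c != 0}

def elementary(m, j):
    # sigma_j in m indeterminates
    return {tuple(1 if i in chosen else 0 for i in range(m)): 1
            for chosen in combinations(range(m), j)}

def to_elementary(f, m):
    # express a SYMMETRIC polynomial f as a polynomial in sigma_1..sigma_m;
    # the result maps (d1,...,dm) to the coefficient of sigma_1^d1...sigma_m^dm
    sigma = [elementary(m, j) for j in range(1, m + 1)]
    g = {}
    while f:
        k = max(f)                        # exponent system of leading monomial
        a = f[k]
        d = tuple(k[i] - k[i + 1] for i in range(m - 1)) + (k[m - 1],)
        g[d] = g.get(d, 0) + a
        term = {(0,) * m: a}              # a * sigma_1^d1 * ... * sigma_m^dm
        for j in range(m):
            for _ in range(d[j]):
                term = poly_mul(term, sigma[j])
        f = poly_add(f, term, sign=-1)    # strip: the leading monomials cancel
    return g

# x^2 + y^2 = sigma_1^2 - 2*sigma_2, as computed in the text
```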

The fundamental theorem can be summarized by saying that the
substitution mapping
T: D[u1,u2, . . . ,um] → S
g(u1,u2, . . . ,um) → g(σ1,σ2, . . . ,σm)
is a ring isomorphism, where S is the subring of D[x1,x2, . . . ,xm] consisting
of the symmetric polynomials in D[x1,x2, . . . ,xm].

38.9 Examples: (a) We express the polynomial
f(x,y,z) = 5xyz + x^2y + xy^2 + xz^2 + yz^2 + y^2z + x^2z ∈ ℤ[x,y,z]
in terms of σ1, σ2, σ3.
We first arrange the monomials appearing in f in the alphabetical order
of their exponent systems:
f(x,y,z) = x^2y + x^2z + xy^2 + 5xyz + xz^2 + y^2z + yz^2.
The leading monomial of f is 1x^2y^1z^0. We therefore subtract
1σ1^{2-1} σ2^{1-0} σ3^0 = σ1σ2 from f and get
f - σ1σ2 = (x^2y + x^2z + xy^2 + 5xyz + xz^2 + y^2z + yz^2) - (x + y + z)(xy + yz + zx)
= 2xyz.
The leading monomial of f - σ1σ2 is 2x^1y^1z^1. So we subtract
2σ1^{1-1} σ2^{1-1} σ3^1 = 2σ3 from f - σ1σ2 and get
(f - σ1σ2) - 2σ3 = 2xyz - 2xyz = 0.
Hence f(x,y,z) = σ1σ2 + 2σ3.
(b) We express
f(x,y,z,w) = x^3 + y^3 + z^3 + w^3 ∈ ℤ[x,y,z,w]
in terms of σ1, σ2, σ3, σ4. The monomials are in alphabetical order, and the
leading monomial of f is 1x^3y^0z^0w^0. So we subtract 1σ1^{3-0} σ2^{0-0} σ3^{0-0} σ4^0 = σ1^3
from f and get

f - σ1^3 = (x^3 + y^3 + z^3 + w^3) - (x + y + z + w)^3
= ......
= -3x^2y - 3xy^2 - 3x^2z - 3xz^2 - 3x^2w - 3xw^2 - 3y^2z - 3yz^2
- 3y^2w - 3yw^2 - 3z^2w - 3zw^2 - 6xyz - 6xyw - 6xzw - 6yzw.

The leading monomial of f - σ1^3 is -3x^2y = -3x^2y^1z^0w^0. We therefore
subtract -3σ1^{2-1} σ2^{1-0} σ3^0 σ4^0 = -3σ1σ2 from f - σ1^3 and get

(f - σ1^3) - (-3σ1σ2) = (f - σ1^3) + 3(x + y + z + w)(xy + xz + xw + yz + yw + zw)
= ......
= 3xyz + 3xyw + 3xzw + 3yzw
= 3σ3.

Hence f(x,y,z,w) = σ1^3 - 3σ1σ2 + 3σ3.

We now derive formulas connecting the sums of the k-th powers of
x1,x2, . . . ,xm with the elementary symmetric polynomials. These formulas
are due to I. Newton (1642-1727).

38.10 Theorem (Newton): Let D be an integral domain and x1,x2, . . . ,xm
indeterminates over D. For k = 1,2,3, . . . , we put sk = x1^k + x2^k + . . . + xm^k,
so that sk ∈ D[x1,x2, . . . ,xm]. Then

0 = s1 - σ1
0 = s2 - σ1 s1 + 2σ2
0 = s3 - σ1 s2 + σ2 s1 - 3σ3
............................................................
0 = s_{m-1} - σ1 s_{m-2} + σ2 s_{m-3} - . . . + (-1)^{m-2} σ_{m-2} s1 + (-1)^{m-1}(m-1)σ_{m-1}

and

0 = sm - σ1 s_{m-1} + σ2 s_{m-2} - . . . + (-1)^{m-2} σ_{m-2} s2 + (-1)^{m-1} σ_{m-1} s1 + (-1)^m mσm
0 = s_{m+1} - σ1 sm + σ2 s_{m-1} - . . . + (-1)^{m-2} σ_{m-2} s3 + (-1)^{m-1} σ_{m-1} s2 + (-1)^m σm s1
0 = s_{m+2} - σ1 s_{m+1} + σ2 sm - . . . + (-1)^{m-2} σ_{m-2} s4 + (-1)^{m-1} σ_{m-1} s3 + (-1)^m σm s2
0 = s_{m+3} - σ1 s_{m+2} + σ2 s_{m+1} - . . . + (-1)^{m-2} σ_{m-2} s5 + (-1)^{m-1} σ_{m-1} s4 + (-1)^m σm s3
............................................................ .

Proof: We make use of the polynomial f(t) = (t - x1)(t - x2). . . (t - xm). We
know that

f(t) = t^m - σ1 t^{m-1} + σ2 t^{m-2} - . . . + (-1)^{m-1} σ_{m-1} t + (-1)^m σm

and that x1,x2, . . . ,xm are the roots of f(t) ∈ D[x1,x2, . . . ,xm][t]. Hence

0 = xi^m - σ1 xi^{m-1} + σ2 xi^{m-2} - . . . + (-1)^{m-1} σ_{m-1} xi + (-1)^m σm

for all i = 1,2, . . . ,m. Multiplying both sides of this equation by xi^j, where
j = 0,1,2,3, . . . , we get

0 = xi^{m+j} - σ1 xi^{m+j-1} + σ2 xi^{m+j-2} - . . . + (-1)^{m-1} σ_{m-1} xi^{j+1} + (-1)^m σm xi^j

for all i = 1,2, . . . ,m. Adding these m equations side by side, we obtain

0 = ∑_{i=1}^{m} xi^{m+j} - σ1 ∑_{i=1}^{m} xi^{m+j-1} + σ2 ∑_{i=1}^{m} xi^{m+j-2} - . . . + (-1)^{m-1} σ_{m-1} ∑_{i=1}^{m} xi^{j+1} + (-1)^m σm ∑_{i=1}^{m} xi^j,

i.e.,

0 = s_{m+j} - σ1 s_{m+j-1} + σ2 s_{m+j-2} - . . . + (-1)^{m-2} σ_{m-2} s_{j+2} + (-1)^{m-1} σ_{m-1} s_{j+1} + (-1)^m σm sj

for j = 1,2,3, . . . ; for j = 0, the last sum ∑_{i=1}^{m} xi^0 equals m, so that
the last term is (-1)^m mσm.

This establishes all the equations except the first m - 1 of them. The first
m - 1 equations will be established by a similar reasoning. This time we
make use of the derivative of f(t). By Lemma 35.16(2), we have

f´(t) = (t - x2)(t - x3). . . (t - xm) + (t - x1)(t - x3). . . (t - xm) + . . . + (t - x1)(t - x2). . . (t - x_{m-1})
= f(t)/(t - x1) + f(t)/(t - x2) + . . . + f(t)/(t - xm).

For i = 1,2, . . . ,m, we put

f(t)/(t - xi) = q(i,m-1)t^{m-1} + q(i,m-2)t^{m-2} + . . . + q(i,1)t + q(i,0),

where the coefficients q(i,j) lie in D[x1,x2, . . . ,xm].

Hence

m t^{m-1} - (m-1)σ1 t^{m-2} + (m-2)σ2 t^{m-3} - . . . + (-1)^{m-1} σ_{m-1}
= f´(t)
= ∑_{i=1}^{m} f(t)/(t - xi)
= ∑_{i=1}^{m} (q(i,m-1)t^{m-1} + q(i,m-2)t^{m-2} + . . . + q(i,1)t + q(i,0))
= (∑_{i=1}^{m} q(i,m-1))t^{m-1} + (∑_{i=1}^{m} q(i,m-2))t^{m-2} + . . . + (∑_{i=1}^{m} q(i,1))t + ∑_{i=1}^{m} q(i,0),

so that

m = ∑_{i=1}^{m} q(i,m-1),  -(m-1)σ1 = ∑_{i=1}^{m} q(i,m-2),  . . . ,
(-1)^{m-2} 2σ_{m-2} = ∑_{i=1}^{m} q(i,1),  (-1)^{m-1} σ_{m-1} = ∑_{i=1}^{m} q(i,0).     (*)

On the other hand,

t^m - σ1 t^{m-1} + σ2 t^{m-2} - . . . + (-1)^{m-1} σ_{m-1} t + (-1)^m σm
= f(t) = (t - xi)(q(i,m-1)t^{m-1} + q(i,m-2)t^{m-2} + . . . + q(i,1)t + q(i,0))
= q(i,m-1)t^m + q(i,m-2)t^{m-1} + q(i,m-3)t^{m-2} + . . . + q(i,1)t^2 + q(i,0)t
- q(i,m-1)xi t^{m-1} - q(i,m-2)xi t^{m-2} - . . . - q(i,1)xi t - q(i,0)xi.

Comparing the coefficients of powers of t on both sides, we get

1 = q(i,m-1)
-σ1 = q(i,m-2) - q(i,m-1)xi
+σ2 = q(i,m-3) - q(i,m-2)xi
-σ3 = q(i,m-4) - q(i,m-3)xi
............
(-1)^{m-1} σ_{m-1} = q(i,0) - q(i,1)xi
(-1)^m σm = -q(i,0)xi,

which may be written

q(i,m-1) = 1
q(i,m-2) = -σ1 + q(i,m-1)xi
q(i,m-3) = +σ2 + q(i,m-2)xi
q(i,m-4) = -σ3 + q(i,m-3)xi
............
q(i,0) = (-1)^{m-1} σ_{m-1} + q(i,1)xi
0 = (-1)^m σm + q(i,0)xi.

So, for each i = 1,2, . . . ,m,

q(i,m-2) = -σ1 + xi                                                    (1)
q(i,m-3) = +σ2 + (-σ1 + xi)xi = σ2 - σ1 xi + xi^2                      (2)
q(i,m-4) = -σ3 + (σ2 - σ1 xi + xi^2)xi = -σ3 + σ2 xi - σ1 xi^2 + xi^3  (3)
..............................
q(i,0) = (-1)^{m-1} σ_{m-1} + ((-1)^{m-2} σ_{m-2} + (-1)^{m-3} σ_{m-3} xi + . . . + (-1)σ1 xi^{m-3} + xi^{m-2})xi
= (-1)^{m-1} σ_{m-1} + (-1)^{m-2} σ_{m-2} xi + (-1)^{m-3} σ_{m-3} xi^2 + . . . + (-1)σ1 xi^{m-2} + xi^{m-1}.   (m-1)

We now add the m equations (1), the m equations (2), the m equations (3), . . . ,
the m equations (m-1) (for i = 1,2, . . . ,m). Using (*), we get

-(m-1)σ1 = -mσ1 + s1
+(m-2)σ2 = +mσ2 - σ1 s1 + s2
-(m-3)σ3 = -mσ3 + σ2 s1 - σ1 s2 + s3
.............................
(-1)^{m-1} σ_{m-1} = (-1)^{m-1} mσ_{m-1} + (-1)^{m-2} σ_{m-2} s1 + (-1)^{m-3} σ_{m-3} s2 + . . . + (-1)σ1 s_{m-2} + s_{m-1},

which are equivalent to

s1 - σ1 = 0
s2 - σ1 s1 + 2σ2 = 0
s3 - σ1 s2 + σ2 s1 - 3σ3 = 0
........................
s_{m-1} - σ1 s_{m-2} + σ2 s_{m-3} - . . . + (-1)^{m-2} σ_{m-2} s1 + (-1)^{m-1}(m-1)σ_{m-1} = 0.

This completes the proof.

Newton's formulas express sk recursively in terms of s1,s2, . . . ,s_{k-1} and of
σ1,σ2, . . . ,σm. We can eliminate s1,s2, . . . ,s_{k-1} and write sk solely in terms of
σ1,σ2, . . . ,σm. For instance:

s1 = σ1
s2 = σ1 s1 - 2σ2 = σ1σ1 - 2σ2 = σ1^2 - 2σ2
s3 = σ1 s2 - σ2 s1 + 3σ3 = σ1(σ1^2 - 2σ2) - σ2σ1 + 3σ3 = σ1^3 - 3σ1σ2 + 3σ3
s4 = σ1 s3 - σ2 s2 + σ3 s1 - 4σ4 = σ1(σ1^3 - 3σ1σ2 + 3σ3) - σ2(σ1^2 - 2σ2) + σ3σ1 - 4σ4
= σ1^4 - 4σ1^2σ2 + 4σ1σ3 + 2σ2^2 - 4σ4

(here σj should be replaced by 0 when j > m).
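The same recurrence runs numerically: given the values of σ1, . . . ,σm at any point, it returns s1,s2, . . . (a sketch; the function name is ours):

```python
def power_sums(sigma, kmax):
    # sigma = [sigma_1, ..., sigma_m]; sigma_j is taken as 0 for j > m.
    # Newton: s_k = sigma_1*s_{k-1} - sigma_2*s_{k-2} + ...
    #              + (-1)^(k-1) * k * sigma_k
    m = len(sigma)
    sig = lambda j: sigma[j - 1] if j <= m else 0
    s = []
    for k in range(1, kmax + 1):
        v = (-1) ** (k - 1) * k * sig(k)
        for j in range(1, k):
            v += (-1) ** (j - 1) * sig(j) * s[k - 1 - j]
        s.append(v)
    return s

# for x,y,z = 1,2,3: sigma = [6, 11, 6], and s1..s4 = 6, 14, 36, 98
```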

Now let D, E be integral domains with D ⊆ E. Let
p(t) = c0t^m + c1t^{m-1} + . . . + c_{m-1}t + cm
be a nonzero polynomial of degree m in D[t], and assume that there are
exactly m roots a1,a2, . . . ,am of p in E (counted with multiplicities). Then
p(t) = c0(t - a1)(t - a2). . . (t - am) in E[t].
Hence p(t) ∈ E[t] is obtained from
c0f(t) = c0(t - x1)(t - x2). . . (t - xm) ∈ D[x1,x2, . . . ,xm][t]
by substituting ai for xi (i = 1,2, . . . ,m). Now
c0f(t) = c0(t^m - σ1(x1,x2, . . . ,xm)t^{m-1} + σ2(x1,x2, . . . ,xm)t^{m-2} - . . . + (-1)^m σm(x1,x2, . . . ,xm))
and, since substitution is a homomorphism (Lemma 35.20), we have
p(t) = c0(t^m - σ1(a1,a2, . . . ,am)t^{m-1} + σ2(a1,a2, . . . ,am)t^{m-2} - . . . + (-1)^m σm(a1,a2, . . . ,am)).

Therefore c1 = -c0σ1(a1,a2, . . . ,am)
c2 = +c0σ2(a1,a2, . . . ,am)
c3 = -c0σ3(a1,a2, . . . ,am)
...............
c_{m-1} = (-1)^{m-1} c0σ_{m-1}(a1,a2, . . . ,am)
cm = (-1)^m c0σm(a1,a2, . . . ,am);

in words: the values of the elementary symmetric polynomials at the
roots of a polynomial are equal, up to sign, to the coefficients of that
polynomial, except for a factor c0, where c0 is the leading coefficient of
the polynomial. The equations above tell us that (i) σi(a1,a2, . . . ,am)
belong to D if c0 is a unit in D; (ii) σi(a1,a2, . . . ,am) belong to the field of
fractions of D in any case; (iii) in particular, σi(a1,a2, . . . ,am) belong to D
if D is a field.
Moreover, if h(x1,x2, . . . ,xm) ∈ D[x1,x2, . . . ,xm] is a symmetric polynomial,
then h(x1,x2, . . . ,xm) = g(σ1,σ2, . . . ,σm) for some polynomial g in m
indeterminates over D, and substitution yields
h(a1,a2, . . . ,am) = g(σ1(a1,a2, . . . ,am), σ2(a1,a2, . . . ,am), . . . , σm(a1,a2, . . . ,am)),
so that (i) h(a1,a2, . . . ,am) belongs to D if c0 is a unit in D; (ii) h(a1,a2, . . . ,am)
belongs to the field of fractions of D in any case; (iii) h(a1,a2, . . . ,am) belongs
to D if D is a field. We summarize this discussion in the next theorem.

38.11 Theorem: Let D be an integral domain and let
p(t) = c0t^m + c1t^{m-1} + . . . + c_{m-1}t + cm be a polynomial over D. Assume that
p(t) has exactly m roots a1,a2, . . . ,am in an integral domain containing D.
(1) ci = (-1)^i c0σi(a1,a2, . . . ,am) for i = 1,2, . . . ,m.
(2) If h is any symmetric polynomial in m indeterminates over D, then
h(a1,a2, . . . ,am), which is an element of the integral domain containing the
roots of p(t), is in fact an element of the field of fractions of D.
(3) If, in addition, the leading coefficient of p(t) is a unit in D, then
h(a1,a2, . . . ,am) belongs to D.
(4) If, in particular, D is a field, then h(a1,a2, . . . ,am) belongs to D.

It is true that any nonzero polynomial of degree m over an integral do-
main D has exactly m roots in some integral domain containing D. This
will be proved later (Theorem 53.6). In the following examples, we will
assume that the polynomials have as many roots as their degrees in
some integral domain.

Examples: (a) Let us evaluate a^2b^2 + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + c^2d^2,
where a,b,c,d are the roots of t^4 - t^2 + 1 ∈ ℚ[t]. To this end, we express
the symmetric polynomial x^2y^2 + x^2z^2 + x^2u^2 + y^2z^2 + y^2u^2 + z^2u^2 in terms
of σ1, σ2, σ3, σ4. Subtracting σ1^{2-2} σ2^{2-0} σ3^0 σ4^0 = σ2^2 from this polynomial,
we get

(x^2y^2 + . . . + z^2u^2) - σ2^2 = -2x^2yz - . . . - 6xyzu,
(x^2y^2 + . . . + z^2u^2) - σ2^2 - (-2σ1^{2-1} σ2^{1-1} σ3^{1-0} σ4^0) = . . . = 2xyzu = 2σ4,

so x^2y^2 + x^2z^2 + x^2u^2 + y^2z^2 + y^2u^2 + z^2u^2 = σ2^2 - 2σ1σ3 + 2σ4.
Then a^2b^2 + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + c^2d^2
= (ab + ac + ad + bc + bd + cd)^2 - 2(a + b + c + d)(abc + abd + acd + bcd) + 2abcd.

Here a,b,c,d are the roots of t^4 - t^2 + 1, so
a + b + c + d = 0, ab + ac + ad + bc + bd + cd = -1,
abc + abd + acd + bcd = 0, abcd = 1,
and therefore a^2b^2 + a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2 + c^2d^2 = (-1)^2 - 2(0)(0) + 2(1) = 3.

(b) We find a polynomial of degree three in ℚ[t] whose roots are the
cubes of the roots of t^3 + 2t^2 + 3t + 4 ∈ ℚ[t]. Let us denote the roots of
this polynomial by a,b,c, so that a + b + c = -2, ab + ac + bc = 3, abc = -4.
We put t^3 - q1t^2 + q2t - q3 = (t - a^3)(t - b^3)(t - c^3).
From Theorem 38.11, we know that
q1 = a^3 + b^3 + c^3, q2 = a^3b^3 + a^3c^3 + b^3c^3, q3 = a^3b^3c^3.
Since s3 = σ1^3 - 3σ1σ2 + 3σ3, we conclude
q1 = a^3 + b^3 + c^3
= (a + b + c)^3 - 3(a + b + c)(ab + ac + bc) + 3(abc)
= (-2)^3 - 3(-2)(3) + 3(-4)
= -2.
We find easily that x^3y^3 + x^3z^3 + y^3z^3 = σ2^3 - 3σ1σ2σ3 + 3σ3^2; hence
q2 = a^3b^3 + a^3c^3 + b^3c^3
= (ab + ac + bc)^3 - 3(a + b + c)(ab + ac + bc)(abc) + 3(abc)^2
= (3)^3 - 3(-2)(3)(-4) + 3(-4)^2
= 3.
Finally, q3 = a^3b^3c^3 = (abc)^3 = (-4)^3 = -64.
Thus
t^3 - (-2)t^2 + (3)t - (-64) = t^3 + 2t^2 + 3t + 64 ∈ ℚ[t]
is a polynomial whose roots are the cubes of the roots of t^3 + 2t^2 + 3t + 4.
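The arithmetic of Example (b) can be replayed mechanically from the elementary symmetric values, using the formulas derived above (a plain re-computation, not a new method):

```python
e1, e2, e3 = -2, 3, -4              # a+b+c, ab+ac+bc, abc for t^3 + 2t^2 + 3t + 4

q1 = e1**3 - 3*e1*e2 + 3*e3         # a^3 + b^3 + c^3
q2 = e2**3 - 3*e1*e2*e3 + 3*e3**2   # a^3*b^3 + a^3*c^3 + b^3*c^3
q3 = e3**3                          # a^3*b^3*c^3 = (abc)^3

# the polynomial with roots a^3, b^3, c^3 is t^3 - q1*t^2 + q2*t - q3
coeffs = (1, -q1, q2, -q3)          # expect (1, 2, 3, 64)
```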

Exercises

1. Express the following symmetric polynomials over ℚ in terms of the
elementary symmetric polynomials:
(a) x^3y^2 + x^2y^3 + x^3z^2 + x^2z^3 + y^3z^2 + y^2z^3;
(b) x^2y^2 + x^2z^2 + x^2u^2 + y^2z^2 + y^2u^2 + z^2u^2;
(c) x^5 + y^5 + z^5 + x^4y + y^4x + x^4z + z^4x + y^4z + z^4y.

2. Find a polynomial over ℚ whose roots are the
(a) squares of the roots of t^3 + 5t^2 + 7t + 1 ∈ ℚ[t];
(b) squares of the roots of t^5 + 5t^4 - 6t^3 + t^2 - 7t - 4 ∈ ℚ[t];
(c) cubes of the roots of t^4 - 3t^3 + 2t^2 + 2 ∈ ℚ[t].

3. Let K be a field. A rational function f(x1,x2, . . . ,xm)/g(x1,x2, . . . ,xm) in
K(x1,x2, . . . ,xm) is said to be a symmetric rational function over K if
f(xi1,xi2, . . . ,xim)/g(xi1,xi2, . . . ,xim) = f(x1,x2, . . . ,xm)/g(x1,x2, . . . ,xm)
for all permutations τ: k → ik in Sm. Prove that a symmetric rational
function over K can be expressed as a fraction of two symmetric
polynomials over K. Conclude that any symmetric rational function over
K can be written as
p(σ1,σ2, . . . ,σm)/q(σ1,σ2, . . . ,σm)
with suitable polynomials p,q in K[u1,u2, . . . ,um]. (Loosely speaking, any
symmetric rational function is a rational function of the elementary
symmetric polynomials.)

4. Express the following rational functions over ℚ in terms of the
elementary symmetric polynomials:
(a) x/y + y/x + x/z + z/x + y/z + z/y;
(b) x^2/(yz) + y^2/(xz) + z^2/(xy);
(c) 1/(1 - x) + 1/(1 - y) + 1/(1 - z).

5. Prove: for any symmetric polynomial f(x1,x2, . . . ,xm) over ℚ, there is a
polynomial h(u1,u2, . . . ,um) in ℚ[u1,u2, . . . ,um] such that f(x1,x2, . . . ,xm) =
h(s1,s2, . . . ,sm), where the sj are the power sums of the xi.

6. Write the symmetric polynomials in Ex. 1 as polynomials in the sj over ℚ.

CHAPTER 4
Vector Spaces

§39
Definition and Examples

The term "vector" is familiar to the reader from Physics. Such physical
magnitudes as displacement, force, torque, momentum etc. are vectors.
Vectors can be added (by the parallelogram law) and multiplied by real
numbers, which are called scalars in this context. In this chapter, we
introduce systems of objects which can be added and multiplied by
scalars.

39.1 Definition: Let K be a field and let (V,+) be an (additively written)
abelian group. Suppose that, to each pair (λ,α) in K × V, there corre-
sponds a unique element of V, denoted by λ.α, such that the following
equations hold for all λ,μ ∈ K, α,β ∈ V:
(1) λ.(α + β) = λ.α + λ.β
(2) (λ + μ).α = λ.α + μ.α
(3) λ.(μ.α) = (λμ).α
(4) 1.α = α (1 is the identity of K).
In this case, the ordered quadruple (V,+,K,.) is called a vector space over
K, or a K-vector space. The elements of K are called scalars, and K is
called the field of scalars of the vector space (V,+,K,.). The elements of V
are called vectors. The mapping (λ,α) → λ.α from K × V into V will be
called multiplication by scalars.

From now on, the term "vector" will mean an element of a vector space.
We will see vectors which do not resemble the vectors of physics in any
way.

At the cost of some stylistic clumsiness in the formulation of many
statements, we shall refer to the mapping (λ,α) → λ.α as multiplication by
scalars, not as scalar multiplication. This is what the mapping really is. It
is multiplication of vectors by scalars, not a multiplication whose results
are scalars. We will usually omit "." and write λα instead of λ.α.

Strictly speaking, a vector space is an ordered quadruple (V, +, K, .).


However, as in the case of groups and rings, we shall usually refer to the
set V as a vector space over K. If the field of scalars is fixed throughout
a discussion, we shall speak of vector spaces, without reference to the
field of scalars.

It will be convenient to think of a vector space as an abelian group with


an additional structure on it supplied by the multiplication by scalars.
The wording of Definition 39.1 was chosen to emphasize this point of
view.

39.2 Examples: (a) Let V = ℝ ⊕ ℝ be the direct sum of two copies of ℝ
and let K = ℝ. We define multiplication by scalars in the most natural
way: λ(α,β) = (λα,λβ) (for all λ ∈ ℝ, (α,β) ∈ V).
It is easily seen that V is an ℝ-vector space.

(b) The same construction can be carried out with n-tuples of elements
from any field K. Let K be a field and let V = K ⊕ K ⊕ . . . ⊕ K be the
direct sum of n copies of K, which is an abelian group under component-
wise addition. We define multiplication by scalars also componentwise:
λ(α₁,α₂, . . . ,αₙ) = (λα₁,λα₂, . . . ,λαₙ) (for all λ ∈ K, (α₁,α₂, . . . ,αₙ) ∈ V).
It is easily verified that V is a K-vector space. It will be designated by
Kⁿ, and will be called the K-vector space of n-tuples.
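For a concrete instance of Kⁿ with a finite field of scalars, one can compute componentwise in ℤ₇; the Python sketch below uses reduction mod 7 in place of the field operations and checks axiom (2) of Definition 39.1 for one choice of scalars:

```python
# componentwise operations in K^n, here with K = Z_7 (integers mod 7)
P = 7
def vadd(u, v):
    return tuple((a + b) % P for a, b in zip(u, v))
def smul(lam, v):
    return tuple((lam * a) % P for a in v)

u, v = (3, 5, 6), (4, 4, 2)
print(vadd(u, v))   # (0, 2, 1)
print(smul(3, u))   # (2, 1, 4)

# axiom (2): (lam + mu)v = lam*v + mu*v
lam, mu = 2, 6
assert smul((lam + mu) % P, v) == vadd(smul(lam, v), smul(mu, v))
```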

(c) Let V be the set of all real-valued functions defined on the interval
[0,1]. For any two functions f,g in V, we define a new function f + g in V
by (f + g)(x) = f(x) + g(x) for all x ∈ [0,1].
V is an abelian group under this addition (called the pointwise addition
of functions). Now let K = ℝ and define a pointwise multiplication by
scalars: (λf)(x) = λf(x) for all x ∈ [0,1], λ ∈ ℝ, f ∈ V.
Then V is a vector space over ℝ (cf. Example 29.2(i)).

When we put (λf)(x) = λf(x) for all x ∈ [0,1], λ ∈ ℂ, f ∈ V,
then V would not be a vector space over ℂ, because λf would not belong
to V for all λ ∈ ℂ, f ∈ V, as the function λf is not real-valued when λ is a
complex number with a nonzero imaginary part.

(d) Let K be a field and let K[x] be the ring of all polynomials over K. Let
us forget that we can multiply two polynomials and concentrate on the
fact that we can add them and multiply them by the elements of K
(which are polynomials of degree zero, or the zero polynomial). It is
easily seen that K[x] is a K-vector space.

(e) Let n be a fixed natural number. Let K be a field and let V be the set
of all polynomials over K which have degree n. Is V a vector space over
K? No, because the sum of two polynomials of degree n is not always a
polynomial of degree n (when the leading coefficients are opposites of
each other). On the other hand, the set consisting of the zero polynomial
and of all polynomials over K whose degrees are less than or equal to n
is a vector space over K.

(f) Let V be a vector space over a field K, and let K1 be a field contained
in K (in this case, K1 is called a subfield of K). Then V is a vector space
over K1, too, since the requirements in Definition 39.1 are satisfied by
the elements of K1 if they are satisfied by the elements of K.

(g) Let K be a field. When we define the multiplication by scalars as the


multiplication in the field K, then K becomes a vector space over K. The
conditions in Definition 39.1 are simply the distributivity laws, the
associativity of multiplication and the very definition of the identity
element in K.

(h) It follows from Example 39.2(f) and Example 39.2(g) that, if K₁ and K
are fields such that K₁ ⊆ K, then K is a vector space over K₁: any field is a
vector space over its subfields. For instance, ℝ is a vector space over ℚ,
and ℂ is a vector space over ℝ.

39.3 Remarks: (1) A vector space is an abelian group and has an iden-
tity element, which we call zero and denote by 0. The underlying field K
has a zero element, too, which is also denoted by 0. The reader should
carefully distinguish between these zeroes. One of them is a vector, the
other is a scalar. The vector zero is sometimes denoted by 0.

(2) Multiplication by scalars is a mapping from K × V into V, hence it is
not a binary operation on V unless K = V. This feature distinguishes
vector spaces from groups and rings. Multiplication and addition are
binary operations on groups and rings.

Some basic facts are collected in the next lemma.

39.4 Lemma: Let V be a vector space over a field K. For all λ,μ ∈ K and
for all α,β,γ ∈ V, the following hold.
(1) 0 + α = α.
(2) −α + α = 0.
(3) −0 = 0 (vector zero).
(4) α + β = α + γ implies β = γ.
(5) λ0 = 0.
(6) 0α = 0.
(7) (−λ)α = λ(−α) = −(λα); in particular, (−1).α = −α.
(8) (λ − μ)α = λα − μα.
(9) λ(α − β) = λα − λβ.
(10) λα = 0 implies λ = 0 or α = 0.
(11) λα = μα implies λ = μ or α = 0.
(12) λα = λβ implies λ = 0 or α = β.

Proof: (1),(2),(3),(4) hold in any group (Lemma 7.3, Lemma 8.2), also in
the abelian group (V,+).

(5) We are to prove λ0 = 0 (vector zero). We observe
λ0 + 0 = λ0 = λ(0 + 0) = λ0 + λ0,
hence 0 = λ0 by (4).

(6) We are to prove 0α = 0 (on the left hand side, we have the scalar
zero, on the right hand side, the vector zero). We observe
0α + 0 = 0α = (0 + 0)α = 0α + 0α,
hence 0 = 0α by (4).

(7) We have 0 = λ0 = λ(α + (−α)) = λα + λ(−α)
and 0 = 0α = (λ + (−λ))α = λα + (−λ)α
by (5) and (6), so λ(−α) and (−λ)α are the opposite of λα. Thus
−(λα) = λ(−α) = (−λ)α.

(8) We are to show (λ − μ)α = λα − μα. Here λ − μ is an abbreviation for
λ + (−μ) in K, and λα − μα is an abbreviation for λα + (−(μα)) in V. We
have indeed: (λ − μ)α = (λ + (−μ))α = λα + (−μ)α = λα + (−(μα)) = λα − μα.

(9) λ(α − β) = λ(α + (−β)) = λα + λ(−β) = λα + (−(λβ)) = λα − λβ.

(10) Assume λα = 0. If λ ≠ 0, then λ⁻¹ exists in K and we get
α = 1α = (λ⁻¹λ)α = λ⁻¹(λα) = λ⁻¹0 = 0.

(11) This follows from (8) and (10).

(12) This follows from (9) and (10).

39.5 Lemma: Let V be a vector space over a field K. Then, for all
λ, λ₁,λ₂, . . . ,λₙ in K and α, α₁,α₂, . . . ,αₙ in V, there hold
λ(α₁ + α₂ + . . . + αₙ) = λα₁ + λα₂ + . . . + λαₙ
and (λ₁ + λ₂ + . . . + λₙ)α = λ₁α + λ₂α + . . . + λₙα.

Proof: This follows by induction on n. The details are left to the reader.

Just as there may be different group structures on a set, there may also
be different vector space structures on a set. Here is an example.

39.6 Example: Let V := ℂ ⊕ ℂ. We define a multiplication ∘ of the ele-
ments of V by complex numbers by declaring
c ∘ (a,b) = (c̄a, c̄b) for all c ∈ ℂ, (a,b) ∈ V,
where c̄ denotes the complex conjugate of c.
This multiplication makes the abelian group V into a ℂ-vector space:

(1) c ∘ [(a,b) + (d,e)] = c ∘ (a + d, b + e)
= (c̄(a + d), c̄(b + e))
= (c̄a + c̄d, c̄b + c̄e)
= (c̄a, c̄b) + (c̄d, c̄e)
= c ∘ (a,b) + c ∘ (d,e),
(2) (c + f) ∘ (a,b) = ((c + f)‾a, (c + f)‾b)
= (c̄a + f̄a, c̄b + f̄b)
= (c̄a, c̄b) + (f̄a, f̄b)
= c ∘ (a,b) + f ∘ (a,b),
(3) (cf) ∘ (a,b) = ((cf)‾a, (cf)‾b)
= (c̄f̄a, c̄f̄b)
= c ∘ (f̄a, f̄b)
= c ∘ (f ∘ (a,b)),
(4) 1 ∘ (a,b) = (1̄a, 1̄b) = (1a,1b) = (a,b)

for all (a,b),(d,e) ∈ V, c,f ∈ ℂ. Thus (V,+,ℂ,∘) is a vector space, with the
same set V, the same addition + on V, the same underlying field ℂ as the
vector space (V,+,ℂ,.) of Example 39.2(b) (in case K = ℂ, n = 2), but
(V,+,ℂ,∘) is distinct from (V,+,ℂ,.) since the multiplications by scalars in
these vector spaces are different.
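The verification of axiom (3) can also be observed numerically. The sketch below uses Python's built-in complex numbers and relies on the fact that conjugation is multiplicative:

```python
# scalar multiplication on V = C (+) C via the conjugate of the scalar
def smul(c, v):
    cb = c.conjugate()
    return (cb * v[0], cb * v[1])

c, f, v = 2 + 1j, 1 - 3j, (1 + 2j, -4 + 1j)
# axiom (3): (cf) o v = c o (f o v), since conj(cf) = conj(c)*conj(f)
assert smul(c * f, v) == smul(c, smul(f, v))
```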

Exercises

1. Determine whether ℝ ⊕ ℝ is an ℝ-vector space when
(a,b) + (c,d) = (a + c, b + d), a(c,d) = (c,d) for all a,b,c,d ∈ ℝ.

2. Determine whether ℂ ⊕ ℂ is a ℂ-vector space when
(a,b) + (c,d) = (a + c, 0), a(c,d) = (ac, ad) for all a,b,c,d ∈ ℂ.

3. Determine whether ℤ₇ ⊕ ℤ₇ is a ℤ₇-vector space when
(a,b) + (c,d) = (a + c, b + d), a(c,d) = (ac, 0) for all a,b,c,d ∈ ℤ₇.

4. Determine whether the set S of all sequences of real numbers is a
vector space over ℝ if addition and multiplication by scalars are defined
by (aₙ) + (bₙ) = (aₙ + bₙ), a(bₙ) = (abₙ)
for all (aₙ),(bₙ) ∈ S, a ∈ ℝ (here (aₙ) is the sequence a₁,a₂,a₃, . . . ).

5. If q ∈ ℕ and K is a field of q elements, how many elements does Kⁿ
have?

6. Construct a vector space with exactly four elements.

§40
Subspaces

Just as we defined subgroups and subrings, we will define sub(vector


space)s. We contract this awkward expression into "subspace".

40.1 Definition: Let V be a vector space over a field K. A nonempty


subset W of V is called a subspace of V if W itself is a K-vector space
(under the addition and multiplication by scalars inherited from V).

A subspace W of V is an abelian group, thus a subgroup of (V,+). Also,
products by scalars of the elements of W belong to W, so that λα ∈ W
whenever λ ∈ K and α ∈ W. Conversely, any subgroup W of V such that
λα ∈ W for all λ ∈ K and α ∈ W is easily seen to be a subspace of V, for
the conditions in Definition 39.1 are automatically satisfied for all
elements of W if they are satisfied for all elements of V. Thus W is a
subspace of V if and only if
(i) W is a subgroup of V under addition,
(ii) if λ ∈ K and α ∈ W, then λα ∈ W (i.e., W is closed under
multiplication by scalars).

Here (i) embraces two conditions: (i´) W is closed under addition, (ii´) for
any α ∈ W, the opposite −α of α also belongs to W. Thus W is a subspace
if and only if (i´),(ii´) and (ii) hold. One checks easily that (ii) implies (ii´): if
(ii) holds and α ∈ W, then (−1)α ∈ W, hence −α ∈ W by Lemma 39.4(7),
so (ii´) is superfluous. We proved the following lemma.

40.2 Lemma (Subspace criterion): Let V be a vector space over a
field K and let W be a nonempty subset of V. Then W is a subspace of V
if and only if
(i) α₁ + α₂ ∈ W for all α₁,α₂ ∈ W,
(ii) λα ∈ W for all λ ∈ K, α ∈ W.

So a nonempty subset of a vector space V is a subspace of V if and only
if it is closed under addition and multiplication by scalars. The two
closure properties of Lemma 40.2 can be combined into a single one.
When (i) and (ii) of Lemma 40.2 hold, then

λα₁ + μα₂ ∈ W for all λ,μ ∈ K, α₁,α₂ ∈ W (*)

since λα₁, μα₂ ∈ W by (ii) and λα₁ + μα₂ ∈ W by (i). Conversely, if (*)
holds, then, choosing λ = 1, μ = 1, we see that (i) holds and, choosing μ =
0, we see that (ii) holds. Thus (i) and (ii) are together equivalent to (*).
Then we obtain another version of Lemma 40.2.

40.3 Lemma (Subspace criterion): Let V be a vector space over a
field K and let W be a nonempty subset of V. Then W is a subspace of V
if and only if
λα₁ + μα₂ ∈ W for all λ,μ ∈ K, α₁,α₂ ∈ W.

The expression λα₁ + μα₂ is said to be a linear combination of the
vectors α₁, α₂. More generally, we have the

40.4 Definition: Let α₁,α₂, . . . ,αₙ be finitely many (not necessarily
distinct) vectors of a vector space V over a field K. A vector of the form
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ,
where λ₁,λ₂, . . . ,λₙ ∈ K, is called a K-linear combination of the vectors
α₁,α₂, . . . ,αₙ. (If the underlying field K is clear from the context, we use
the term "linear combination", without mentioning K.)

40.5 Lemma: Let V be a vector space over a field K and let W be a
subspace of V. If α₁,α₂, . . . ,αₘ are vectors in W, then every K-linear
combination of these vectors belongs to W.

Proof: This follows from Lemma 40.3 by induction on m.

40.6 Examples: (a) Let V be any vector space. Then {0} and V are
subspaces of V.

(b) Consider the vector space K³ over a field K and put
W = {(α₁,α₂,α₃) ∈ K³: α₃ = 0} ⊆ K³.
If (α₁,β₁,0) and (α₂,β₂,0) are arbitrary vectors in W and if λ,μ are arbi-
trary scalars, then λ(α₁,β₁,0) + μ(α₂,β₂,0) = (λα₁ + μα₂, λβ₁ + μβ₂, 0) belongs to
W. By Lemma 40.3, W is a subspace of K³.

(c) Consider the vector space K³ over a field K and put
U = {(α₁,α₂,α₃) ∈ K³: α₃ = 1} ⊆ K³.
Then (0,0,1) ∈ U, (1,0,1) ∈ U, but (0,0,1) + (1,0,1) = (1,0,1+1) ∉ U
(why?) and U is not closed under addition. So U is not a subspace of K³.

(d) Consider ℝ³ over ℝ and let
A = {(α₁,α₂,α₃) ∈ ℝ³: α₃ ≥ 0} ⊆ ℝ³.
If (α₁,β₁,γ₁) and (α₂,β₂,γ₂) are in A, then γ₁ ≥ 0, γ₂ ≥ 0, so γ₁ + γ₂ ≥ 0 and
(α₁,β₁,γ₁) + (α₂,β₂,γ₂) = (α₁ + α₂, β₁ + β₂, γ₁ + γ₂)
belongs to A. Thus A is closed under addition. However, A is not a sub-
space of ℝ³, since, for instance, (0,0,1) ∈ A but (−1)(0,0,1) ∉ A. This
example shows that a subset of a vector space can be closed under
addition without being closed under multiplication by scalars.
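The two closure properties can be tested mechanically. Here is a small Python sketch for the set A of (d):

```python
# A = {(a1, a2, a3) in R^3 : a3 >= 0}
def in_A(v):
    return v[2] >= 0

u, w = (0.0, 0.0, 1.0), (2.0, -1.0, 3.0)
s = tuple(x + y for x, y in zip(u, w))
assert in_A(u) and in_A(w) and in_A(s)   # closed under addition
assert not in_A(tuple(-x for x in u))    # but (-1)(0,0,1) leaves A
```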

(e) Consider the vector space K² over a field K and let
M = {(α₁,α₂) ∈ K²: α₁ = 0 or α₂ = 0} ⊆ K².
If λ ∈ K and (α₁,α₂) ∈ M, then α₁ = 0 or α₂ = 0, so λα₁ = 0 or λα₂ = 0, so λ(α₁,α₂) =
(λα₁,λα₂) belongs to M. Thus M is closed under multiplication by scalars.
However, M is not a subspace of K², since, for instance, (1,0), (0,1) ∈ M,
but (1,0) + (0,1) ∉ M. This example shows that a subset of a vector space
can be closed under multiplication by scalars without being closed under
addition.

(f) Consider the vector space K² over a field K and let α,β be two
arbitrary but fixed elements of K. Put
R = {(γ,δ) ∈ K²: αγ + βδ = 0} ⊆ K². Then R is a subspace of K²:
(i) If (γ₁,δ₁), (γ₂,δ₂) ∈ R, then αγ₁ + βδ₁ = 0 = αγ₂ + βδ₂, so
(αγ₁ + βδ₁) + (αγ₂ + βδ₂) = 0, so α(γ₁ + γ₂) + β(δ₁ + δ₂) = 0, so (γ₁,δ₁) + (γ₂,δ₂) =
(γ₁ + γ₂, δ₁ + δ₂) ∈ R.
(ii) If λ ∈ K and (γ,δ) ∈ R, then αγ + βδ = 0, so α(λγ) + β(λδ) = λ(αγ + βδ) = 0, so
λ(γ,δ) = (λγ,λδ) ∈ R.

(g) Let V be a vector space over a field K and let {Wᵢ: i ∈ I} be a collec-
tion of subspaces of V. Then their intersection W := ⋂i∈I Wᵢ is a subspace
of V. First of all, this intersection is not empty, since 0 ∈ Wᵢ for all i ∈ I.
Also, if λ,μ ∈ K and α₁,α₂ ∈ W, then λ,μ ∈ K and α₁,α₂ ∈ Wᵢ for all i ∈ I,
so λα₁ + μα₂ ∈ Wᵢ for all i ∈ I, so λα₁ + μα₂ ∈ W.

(h) Let V be the ℝ-vector space of all real-valued functions defined on
[0,1] (see Example 39.2(c)) and let κ be a fixed number in [0,1]. We put
Tκ = {f ∈ V: f is continuous at κ}.
It is known from analysis that, if f and g are functions, continuous at κ,
then f + g is also continuous at κ. If f is continuous at κ and λ ∈ ℝ, then
λf is continuous at κ. Hence Tκ is a subspace of V.

(i) Let V be the ℝ-vector space of all real-valued functions defined on
[0,1]. We put
C([0,1]) = {f ∈ V: f is continuous on [0,1]}.
We know from analysis that, if f and g are functions, continuous on [0,1],
then f + g is also continuous on [0,1]. If f is continuous on [0,1] and λ ∈ ℝ,
then λf is continuous on [0,1]. Hence C([0,1]) is a subspace of V. This
conclusion can be drawn also by observing that C([0,1]) = ⋂κ∈[0,1] Tκ and
appealing to Example 40.6(g) and Example 40.6(h).

(j) Let C¹([0,1]) = {f ∈ C([0,1]): f´ exists and is continuous on [0,1]}.
C¹([0,1]) is a nonempty subset of C([0,1]). If λ,μ ∈ ℝ and f,g ∈ C¹([0,1]),
then, as is well known from analysis, (λf + μg)´ exists, is equal to λf´ + μg´
and is continuous on [0,1]. Hence λf + μg ∈ C¹([0,1]) and therefore
C¹([0,1]) is a subspace of C([0,1]).

Similarly, for k ∈ ℕ, we put
Cᵏ([0,1]) = {f ∈ C([0,1]): f⁽ᵏ⁾ exists and is continuous on [0,1]}.
Since the existence of the k-th derivative of f implies the existence and
continuity of the first k − 1 derivatives f´,f´´, . . . , f⁽ᵏ⁻¹⁾, we see Cᵏ([0,1]) is
a subset of Cᵏ⁻¹([0,1]). From the formula
(λf + μg)⁽ᵏ⁾ = λf⁽ᵏ⁾ + μg⁽ᵏ⁾ (λ,μ ∈ ℝ, f,g ∈ Cᵏ([0,1]))
it is easily seen that Cᵏ([0,1]) is a subspace of C([0,1]) and of Cᵏ⁻¹([0,1]).

We write C∞([0,1]) = ⋂k∈ℕ Cᵏ([0,1]). From Example 40.6(g), we infer that
C∞([0,1]) is also a subspace of C([0,1]) and of each Cᵏ([0,1]).

(k) Let p(x) and q(x) be continuous functions, defined on [0,1]. We write
L = {f ∈ C²([0,1]): f´´(x) + p(x)f´(x) + q(x)f(x) = 0 for all x ∈ [0,1]}.
L is a nonempty subset of C²([0,1]). If λ,μ ∈ ℝ and f,g ∈ L, then
(λf + μg)´´(x) + p(x)(λf + μg)´(x) + q(x)(λf + μg)(x)
= λf´´(x) + μg´´(x) + p(x)λf´(x) + p(x)μg´(x) + q(x)λf(x) + q(x)μg(x)
= λ(f´´(x) + p(x)f´(x) + q(x)f(x)) + μ(g´´(x) + p(x)g´(x) + q(x)g(x))
= λ0 + μ0 = 0
for all x ∈ [0,1], so λf + μg ∈ L. Thus L is a subspace of C²([0,1]).
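For a particular equation, say y´´ − 5y´ + 6y = 0 (among whose solutions are e^{2x} and e^{3x}), the closure of the solution set under linear combinations can be observed numerically. The Python sketch below picks one point x and one combination:

```python
import math

# f = lam*e^{2x} + mu*e^{3x} should again solve y'' - 5y' + 6y = 0
lam, mu, x = 2.0, -3.0, 0.7
f  = lam * math.exp(2 * x) + mu * math.exp(3 * x)
f1 = 2 * lam * math.exp(2 * x) + 3 * mu * math.exp(3 * x)   # f'
f2 = 4 * lam * math.exp(2 * x) + 9 * mu * math.exp(3 * x)   # f''
assert abs(f2 - 5 * f1 + 6 * f) < 1e-9
```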

(l) So far, we spoke of subspaces without referring to the underlying
field. Sometimes it might be necessary to mention the underlying field.
Let V = ℂ² be the ℂ-vector space of ordered pairs of complex numbers.
Then V is an ℝ-vector space, too (Example 39.2(f)). We put
W = {(α,ᾱ): α ∈ ℂ} = {(α,β): α,β ∈ ℂ, β = ᾱ},
where ¯ denotes complex conjugation. If (α,ᾱ), (β,β̄) ∈ W and λ,μ ∈ ℝ,
then λ = λ̄ and μ = μ̄, so λ(α,ᾱ) + μ(β,β̄) = (λα + μβ, λᾱ + μβ̄) with
λᾱ + μβ̄ = λ̄ᾱ + μ̄β̄
= (λα)‾ + (μβ)‾ (c)
= (λα + μβ)‾,
and λ(α,ᾱ) + μ(β,β̄) ∈ W. Thus W is a subspace of the ℝ-vector space V.
However, W is not a subspace of the ℂ-vector space V, for the critical
equation (c) need not be true when λ,μ are complex numbers (with
nonzero imaginary parts). We may say W is an ℝ-subspace of V, but not
a ℂ-subspace of V.
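Closure of W under real, but not complex, scalars can be checked with a few lines of Python:

```python
# W = {(a, conj(a)) : a in C} inside C^2
def in_W(v):
    return v[1] == v[0].conjugate()

v = (2 + 1j, 2 - 1j)
assert in_W(v)
assert in_W((3.0 * v[0], 3.0 * v[1]))    # real scalar: stays in W
assert not in_W((1j * v[0], 1j * v[1]))  # nonreal scalar: leaves W
```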

40.7 Theorem: Let V be a vector space over a field K and let
A = {α₁,α₂, . . . ,αₙ} be a finite nonempty subset of V. Then the set
W = {λ₁α₁ + λ₂α₂ + . . . + λₙαₙ ∈ V: λ₁,λ₂, . . . ,λₙ ∈ K}
of all linear combinations of the vectors α₁,α₂, . . . ,αₙ is a subspace of V.

Proof: Since A is not empty, W ≠ ∅. If λ,μ ∈ K and α,β ∈ W, then
α = λ₁α₁ + λ₂α₂ + . . . + λₙαₙ, β = μ₁α₁ + μ₂α₂ + . . . + μₙαₙ
with suitable λ₁,λ₂, . . . ,λₙ, μ₁,μ₂, . . . ,μₙ ∈ K and
λα + μβ = λ(λ₁α₁ + λ₂α₂ + . . . + λₙαₙ) + μ(μ₁α₁ + μ₂α₂ + . . . + μₙαₙ)
= (λλ₁ + μμ₁)α₁ + (λλ₂ + μμ₂)α₂ + . . . + (λλₙ + μμₙ)αₙ
belongs to W. Hence W is a subspace of V (Lemma 40.3). [Notice that
α₁,α₂, . . . ,αₙ are not assumed to be distinct.]

We extend this theorem to infinite subsets of V.

40.8 Theorem: Let V be a vector space over a field K and let
A = {αᵢ: i ∈ I} be a (finite or infinite) nonempty subset of V. Then the set
W = {λ₁αᵢ₁ + λ₂αᵢ₂ + . . . + λₙαᵢₙ ∈ V: λ₁,λ₂, . . . ,λₙ ∈ K, αᵢ₁,αᵢ₂, . . . ,αᵢₙ ∈ A, n ∈ ℕ}
of all finite linear combinations of the vectors in A is a subspace of V.

Proof: Since A is not empty, W ≠ ∅. If λ,μ ∈ K and α,β ∈ W, then
α = λ₁αᵢ₁ + λ₂αᵢ₂ + . . . + λₙαᵢₙ, β = μ₁αⱼ₁ + μ₂αⱼ₂ + . . . + μₘαⱼₘ
with suitable λ₁,λ₂, . . . ,λₙ, μ₁,μ₂, . . . ,μₘ ∈ K, αᵢ₁,αᵢ₂, . . . ,αᵢₙ, αⱼ₁,αⱼ₂, . . . ,αⱼₘ ∈ A,
n,m ∈ ℕ and
λα + μβ = λ(λ₁αᵢ₁ + λ₂αᵢ₂ + . . . + λₙαᵢₙ) + μ(μ₁αⱼ₁ + μ₂αⱼ₂ + . . . + μₘαⱼₘ)
= λλ₁αᵢ₁ + λλ₂αᵢ₂ + . . . + λλₙαᵢₙ + μμ₁αⱼ₁ + μμ₂αⱼ₂ + . . . + μμₘαⱼₘ
is a K-linear combination of the vectors αᵢ₁,αᵢ₂, . . . ,αᵢₙ, αⱼ₁,αⱼ₂, . . . ,αⱼₘ in A. So
λα + μβ belongs to W and W is a subspace of V.

40.9 Definition: The subspace W of Theorem 40.7 or Theorem 40.8 is
called the K-span of A, or the K-span of the vectors in A, or the subspace
spanned by the (vectors in) A. It will be denoted by sK(A). In case A =
{α₁,α₂, . . . ,αₙ} is a finite set, we write sK(α₁,α₂, . . . ,αₙ) instead of
sK({α₁,α₂, . . . ,αₙ}). By convention, we put sK(∅) = {0}. When there is no
need to refer to the field K of scalars, we speak of the span of A, and
denote it by s(A).

The next lemma justifies the convention sK(∅) = {0}.

40.10 Lemma: Let V be a vector space over a field K and let A ⊆ V.
Then sK(A) is the smallest subspace of V which contains A. More exactly,
if U is a subspace of V and A ⊆ U, then sK(A) ⊆ U.

Proof: If A = ∅, then sK(A) = {0} ⊆ U for any subspace U of V and the
theorem is proved in this case. Suppose now A ≠ ∅. If A ⊆ U and U is a
subspace of V, then every linear combination of the vectors in A belongs
to U by Lemma 40.5. Hence s(A) ⊆ U.

40.11 Lemma: Let V be a vector space over a field K and let A,B be
subsets of V such that A ⊆ s(B) and B ⊆ s(A). Then s(A) = s(B).

Proof: Since A ⊆ s(B) and s(B) is a subspace of V, we have s(A) ⊆ s(B) by
Lemma 40.10. In like manner, since B ⊆ s(A) and s(A) is a subspace of V,
we get s(B) ⊆ s(A). Thus s(A) = s(B).

40.12 Examples: (a) Let V be a vector space over a field K and let A
be a subset of V having only one element, say A = {α}. Then the span s(α)
of A is the set
{λα ∈ V: λ ∈ K}
of all scalar multiples of α. In case K = ℝ and V = ℝ² or V = ℝ³, this span
is usually identified with the line through the origin determined by α.

(b) Let V be a vector space over a field K and let α,β be two vectors in V.
The span s(α,β) of these vectors is
{λα + μβ ∈ V: λ,μ ∈ K}.
In case β is a scalar multiple of α, say β = κα, we have
s(α,β) = {λα + μκα ∈ V: λ,μ ∈ K} = {(λ + μκ)α: λ,μ ∈ K} = {λα: λ ∈ K} = s(α).
We see it is possible that A ≠ B and s(A) = s(B). In case K = ℝ and V = ℝ³
and β is not a scalar multiple of α, this span is usually identified with
the plane through the origin determined by α and β.

(c) In the vector space ℝ² over ℝ, consider the set
A = {(1,0),(2,0), . . . ,(10 000,0)}.
The span s(A) is easily seen to be {(a,0) ∈ ℝ²: a ∈ ℝ}, which is also the
span of {(1,0)} ⊆ ℝ². Thus the number of vectors in A may be large, but
this does not imply that s(A) is a "big" subspace.

(d) Let V be a vector space over a field K and let A,B be subsets of V
with A ⊆ B. Then A ⊆ B ⊆ s(B) and, since s(B) is a subspace of V, Lemma
40.10 yields s(A) ⊆ s(B). So A ⊆ B implies s(A) ⊆ s(B). We have seen in
Example 40.12(b) and Example 40.12(c) that A ⊊ B does not necessarily
imply s(A) ⊊ s(B).
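Membership in a span is a matter of solving linear equations. The following Python sketch (an illustration, not the text's algorithm, and using its own example vectors) decides whether a vector β ∈ ℝ³ lies in the span of two vectors whose leading 2×2 block is invertible:

```python
# Is b in the R-span of a1, a2 (vectors in R^3)?
# Solve lam*a1 + mu*a2 = b from the first two coordinates by Cramer's
# rule, then check the third. Assumes the leading 2x2 block is invertible.
def in_span(a1, a2, b):
    det = a1[0]*a2[1] - a1[1]*a2[0]
    if det == 0:
        raise ValueError("degenerate leading 2x2 block")
    lam = (b[0]*a2[1] - b[1]*a2[0]) / det
    mu  = (a1[0]*b[1] - a1[1]*b[0]) / det
    return lam*a1[2] + mu*a2[2] == b[2]

assert in_span((1, 2, 0), (0, 1, 1), (2, 5, 1))       # 2*a1 + 1*a2
assert not in_span((1, 2, 0), (0, 1, 1), (2, 5, 7))
```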

Exercises

1. Let V be a vector space over a field K. If W is a subspace of V and U is


a subspace of W, prove that U is a subspace of V.

2. Prove that the set of all sequences of real numbers converging to 0 is
a subspace of the ℝ-vector space S (see § 39, Ex. 4). What do you say
about the set of all convergent sequences, all bounded sequences, all
monotonic sequences, and all sequences with at most finitely many
nonzero terms?

3. Consider the ℝ-vector space V of Example 39.2(c). Determine whether
the following are subspaces of V: the set of bounded functions, the set of
even functions, the set of integrable functions, the set of monotonic
functions, the set of functions with at most finitely many points of
discontinuity (all with domains [0,1]).

4. Determine whether
{(α₁,α₂,α₃) ∈ ℝ³: 5α₁ − 4α₂ + 2α₃ = 0}
{(α₁,α₂,α₃) ∈ ℝ³: 5α₁ − 4α₂ + 2α₃ ≥ 0}
{(α₁,α₂,α₃) ∈ ℤ₁₁³: 5α₁ − 4α₂ + 2α₃ = 0}
{(α₁,α₂,α₃) ∈ ℚ³: 5α₁ − 4α₂ + 2α₃ ≠ 0}
are subspaces of the vector spaces indicated.

5. Is (1,0,1) ∈ ℝ³ in the ℝ-span of {(5,4,1), (3,2,2)} ⊆ ℝ³?

§41
Factor Spaces

In the preceding paragraph, we discussed subspaces, which are the


analogues of subgroups and subrings. We now wish to discuss the ana-
logues of factor groups and factor rings.

Let V be a vector space over a field K and let W be a subspace of V.
Then W is a subgroup of the additive group V, and we can build the
factor group V/W. The elements of V/W are cosets α + W, where α ∈ V;
the sum of two cosets α₁ + W and α₂ + W is the coset (α₁ + α₂) + W. The
operation on V/W is denoted by "+", but "+" designates in V/W an opera-
tion distinct from the addition in V. The question arises: is it possible to
define on V/W a kind of multiplication by scalars so that V/W becomes
a vector space over K? The most natural multiplication ∘ is to put

λ ∘ (α + W) = λα + W for all λ ∈ K, α + W ∈ V/W.

We prove that ∘ is well defined. To this end, we must show that the
implication

α + W = β + W ⟹ λα + W = λβ + W (for all λ ∈ K, α,β ∈ V)

is valid. This implication is equivalent to

α − β ∈ W ⟹ λα + W = λβ + W

hence to

α − β ∈ W ⟹ λα − λβ ∈ W.

Since W is a subspace of V, it is closed under multiplication by scalars,
hence λ(α − β) = λα − λβ ∈ W whenever α − β ∈ W. This proves that the above
multiplication ∘ by scalars is well defined.

It is now quite straightforward to show that (V/W,+,K,∘) is a vector space.
For any λ,μ ∈ K, α,β ∈ V, we have

(1) λ ∘ ((α + W) + (β + W)) = λ ∘ ((α + β) + W)
= λ(α + β) + W
= (λα + λβ) + W
= (λα + W) + (λβ + W)
= λ ∘ (α + W) + λ ∘ (β + W),

(2) (λ + μ) ∘ (α + W) = (λ + μ)α + W
= (λα + μα) + W
= (λα + W) + (μα + W)
= λ ∘ (α + W) + μ ∘ (α + W),
(3) (λμ) ∘ (α + W) = (λμ)α + W
= λ(μα) + W
= λ ∘ (μα + W)
= λ ∘ (μ ∘ (α + W)),
(4) 1 ∘ (α + W) = 1α + W
= α + W.

Thus (V/W,+,K,∘) is a vector space.
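A concrete sketch in Python: take V = ℝ² and W = {(ξ,0): ξ ∈ ℝ}. A coset (a,b) + W is determined by b alone, and the scalar multiplication just defined does not depend on the chosen representative:

```python
# V = R^2, W = the x-axis; cosets of W are horizontal lines
def coset(v):                 # canonical representative of v + W
    return (0.0, v[1])

def smul_coset(lam, v):       # lam o (v + W) := (lam*v) + W
    return coset((lam * v[0], lam * v[1]))

u1, u2 = (3.0, 2.0), (-7.0, 2.0)      # u1 - u2 = (10, 0) lies in W
assert coset(u1) == coset(u2)         # same coset
assert smul_coset(5.0, u1) == smul_coset(5.0, u2)   # well defined
```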

We employed the symbol "∘" chiefly to emphasize that multiplication of
the elements in V/W by scalars is distinct from the multiplication of the
elements in V by scalars. For ease of notation, we shall drop "∘" and
write simply λ(α + W) instead of λ ∘ (α + W). Also, we will write V/W for
(V/W,+,K,∘). The following theorem summarizes this discussion.

41.1 Theorem: Let V be a vector space over a field K and let W be a
subspace of V. Then the abelian group V/W is a vector space over K if
multiplication by scalars is defined by
λ(α + W) = λα + W for all λ ∈ K, α ∈ V.

41.2 Definition: Let V be a vector space over a field K and let W be a


subspace of V. The K-vector space V/W of Theorem 41.1 is called the
factor space of V by W, or the factor space V mod(ulo) W.

We know that factor groups (rings) are closely related to homomorph-


isms of groups (rings). The same is true for factor spaces.

41.3 Definition: Let V and U be vector spaces over the same field K. A
mapping φ: V → U is called a vector space homomorphism, or a K-linear
transformation, or a K-linear mapping if

(α₁ + α₂)φ = α₁φ + α₂φ and (λα)φ = λ(αφ)

for all α₁,α₂,α ∈ V, λ ∈ K. When there is no need to emphasize the field of
scalars, we speak simply of linear transformations or linear mappings.

More exactly, when (V,+,K,.) and (U,⊕,K,∘) are vector spaces, the mapping
φ: V → U is a vector space homomorphism provided

(α₁ + α₂)φ = α₁φ ⊕ α₂φ and (λ.α)φ = λ ∘ (αφ)

for all α₁,α₂,α ∈ V, λ ∈ K. Notice that the fields of scalars of both vector
spaces are the same. A linear transformation from V into U cannot be
defined if V and U are vector spaces over different fields.

A mapping φ: V → U such that (α₁ + α₂)φ = α₁φ + α₂φ for all α₁,α₂ ∈ V is said
to be additive. So an additive mapping is just a group homomorphism
from the group (V,+) into (U,+). A mapping φ: V → U such that (λα)φ =
λ(αφ) for all α ∈ V, λ ∈ K is said to be homogeneous. A homogeneous
mapping is one that preserves the multiplication by scalars. A mapping
may be additive without being homogeneous, and it may be homogene-
ous without being additive. In order to be a linear transformation, a
mapping should be both additive and homogeneous.
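A standard illustration (not taken from the text): complex conjugation on the ℂ-vector space ℂ is additive but not homogeneous, hence not ℂ-linear.

```python
# z -> conj(z) on C, viewed as a C-vector space
def f(z):
    return z.conjugate()

z1, z2, lam = 1 + 2j, 3 - 1j, 1j
assert f(z1 + z2) == f(z1) + f(z2)   # additive
assert f(lam * z1) != lam * f(z1)    # not homogeneous over C
```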

A vector space homomorphism is therefore a homomorphism of additive


groups which preserves multiplication by scalars as well. This observa-
tion enables us to use the properties of group homomorphisms when-
ever we investigate vector space homomorphisms.

41.4 Lemma: Let V and U be vector spaces over a field K. A function
φ: V → U is a K-linear mapping if and only if
(λα₁ + μα₂)φ = λ(α₁φ) + μ(α₂φ)
for all λ,μ ∈ K, α₁,α₂ ∈ V.

Proof: If φ is a K-linear mapping and λ,μ ∈ K, α₁,α₂ ∈ V, then
(λα₁ + μα₂)φ = (λα₁)φ + (μα₂)φ = λ(α₁φ) + μ(α₂φ)
since φ is additive and homogeneous. Conversely, if we have (λα₁ + μα₂)φ
= λ(α₁φ) + μ(α₂φ) for all λ,μ ∈ K, α₁,α₂ ∈ V, then, choosing λ = μ = 1, we see
that φ is additive and, choosing μ = 0, we see that φ is homogeneous.

41.5 Lemma: Let V,U be vector spaces over a field K and let φ: V → U
be a vector space homomorphism.
(1) 0φ = 0.
(2) (−α)φ = −(αφ) for all α ∈ V.
(3) (λ₁α₁ + λ₂α₂ + . . . + λₙαₙ)φ = λ₁(α₁φ) + λ₂(α₂φ) + . . . + λₙ(αₙφ) for all
λ₁,λ₂, . . . ,λₙ ∈ K and for all α₁,α₂, . . . ,αₙ ∈ V.
(4) (nα)φ = n(αφ) for all n ∈ ℤ.

Proof: (1),(2),(4) follow respectively from (1),(2),(4) of Lemma 20.3 and
(3) follows from Lemma 20.3(3) by the homogeneity of φ, or from
Lemma 41.4 by induction on n.

41.6 Examples: (a) Let K be a field and let φ: K³ → K², (α,β,γ) → (α,β). Then
(λ(α,β,γ) + μ(α´,β´,γ´))φ = ((λα,λβ,λγ) + (μα´,μβ´,μγ´))φ
= ((λα + μα´, λβ + μβ´, λγ + μγ´))φ
= (λα + μα´, λβ + μβ´)
= (λα,λβ) + (μα´,μβ´)
= λ(α,β) + μ(α´,β´)
= λ((α,β,γ))φ + μ((α´,β´,γ´))φ

for all λ,μ ∈ K, (α,β,γ), (α´,β´,γ´) ∈ K³. Hence φ is a K-linear transformation.

(b) Let K be a field and let ψ: K² → K², (α,β) → (β,α). Then
(λ(α,β) + μ(α´,β´))ψ = ((λα,λβ) + (μα´,μβ´))ψ = ((λα + μα´, λβ + μβ´))ψ
= (λβ + μβ´, λα + μα´) = (λβ,λα) + (μβ´,μα´) = λ(β,α) + μ(β´,α´)
= λ((α,β))ψ + μ((α´,β´))ψ

for all λ,μ ∈ K, (α,β),(α´,β´) ∈ K². Hence ψ is a vector space homomorphism.

(c) The mapping φ: C¹([0,1]) → ℝ, f → f(2⁻¹), is ℝ-linear, because
(λf + μg)φ = (λf + μg)(2⁻¹) = (λf)(2⁻¹) + (μg)(2⁻¹) = λ(f(2⁻¹)) + μ(g(2⁻¹)) = λ(fφ) + μ(gφ)
for all λ,μ ∈ ℝ, f,g ∈ C¹([0,1]). Likewise, for any κ ∈ [0,1], the mapping

φκ: C¹([0,1]) → ℝ, f → f(κ)
is a vector space homomorphism.

(d) Let V,U be vector spaces over a field K and let W be a subspace of V.
If φ: V → U is a vector space homomorphism, then its restriction
φ|W: W → U
to W is also a vector space homomorphism, because
(λα₁ + μα₂)φ = λ(α₁φ) + μ(α₂φ)
for all λ,μ ∈ K, α₁,α₂ ∈ W, as this holds in fact for all λ,μ ∈ K, α₁,α₂ ∈ V
(Lemma 41.4).

(e) Let V,U be vector spaces over a field K and let K₁ be a field contained
in K. Then V,U are vector spaces over K₁, too (Example 39.2(f)). If
φ: V → U is a K-linear mapping, then φ is also a K₁-linear mapping,
because (λα₁ + μα₂)φ = λ(α₁φ) + μ(α₂φ)
for all λ,μ ∈ K₁, α₁,α₂ ∈ V, as this holds in fact for all λ,μ ∈ K, α₁,α₂ ∈ V
(Lemma 41.4).

(f) The mapping T: C²([0,1]) → C([0,1]), y → y´´ − 5y´ + 6y, is a vector space
homomorphism because
(λy₁ + μy₂)T = (λy₁ + μy₂)´´ − 5(λy₁ + μy₂)´ + 6(λy₁ + μy₂)
= λy₁´´ + μy₂´´ − 5(λy₁´ + μy₂´) + 6(λy₁ + μy₂)
= λ(y₁´´ − 5y₁´ + 6y₁) + μ(y₂´´ − 5y₂´ + 6y₂)
= λ(y₁T) + μ(y₂T)
for any λ,μ ∈ ℝ, y₁,y₂ ∈ C²([0,1]). In the theory of ordinary differential
equations, this mapping is called a linear differential operator and is
usually denoted by D² − 5D + 6.

In the rest of this paragraph, we establish the counterparts of certain


theorems discussed in §§ 20, 21.

41.7 Theorem: Let V,U,W be vector spaces over a field K. Let φ: V → U
and ψ: U → W be vector space homomorphisms. Then the composition
mapping φψ: V → W is a vector space homomorphism from V into W.

Proof: φψ is a group homomorphism (is additive) by Theorem 20.4.
Also
(λα)(φψ) = ((λα)φ)ψ = (λ(αφ))ψ = λ((αφ)ψ) = λ(α(φψ))
for all λ ∈ K, α ∈ V, hence φψ is homogeneous. Thus φψ is a vector space
homomorphism.

41.8 Theorem: Let V,U be vector spaces over a field K and let
φ: V → U be K-linear. Then Im φ = {αφ ∈ U: α ∈ V} is a subspace of U and
Ker φ = {α ∈ V: αφ = 0} is a subspace of V.

Proof: Im φ is a subgroup of (U,+) by Theorem 20.6. Also, if β ∈ Im φ
and λ ∈ K, then β = αφ for some α ∈ V, so λβ = λ(αφ) = (λα)φ, so λβ ∈ Im φ.
Thus Im φ is closed under multiplication by scalars. Therefore Im φ is a
subspace of U.

Ker φ is a subgroup of (V,+) by Theorem 20.6. Also, if α ∈ Ker φ and
λ ∈ K, then αφ = 0, so (λα)φ = λ(αφ) = λ0 = 0 by Lemma 39.4(5), so λα ∈ Ker φ.
Thus Ker φ is closed under multiplication by scalars. Therefore Ker φ is a
subspace of V.

41.9 Definition: Let V,U be vector spaces over a field K. A vector
space homomorphism φ: V → U is called a vector space isomorphism if φ
is one-to-one and onto. If there is a vector space isomorphism from V
onto U, we say V is isomorphic to U, and write V ≅ U.

So a vector space isomorphism is an additive group isomorphism which
preserves multiplication by scalars. We use the same symbol "≅" for
isomorphic vector spaces as for isomorphic groups. This will not lead to
confusion. When there is any danger of confusion, we will state explicitly
whether we mean vector space isomorphism or group isomorphism.

41.10 Lemma: Let V,U,W be vector spaces over a field K and let
φ: V → U and ψ: U → W be vector space isomorphisms.
(1) The composition φψ: V → W is a vector space isomorphism from V
onto W.
(2) The inverse φ⁻¹: U → V of φ is a vector space isomorphism from U
onto V.

Proof: (1) φψ is a vector space homomorphism by Theorem 41.7, and
φψ is one-to-one and onto by Theorem 3.13. So φψ is a vector space iso-
morphism.

(2) φ⁻¹: U → V is an isomorphism of additive groups by Lemma 20.11(2).
We have only to show that φ⁻¹ preserves multiplication by scalars. Let
β ∈ U and λ ∈ K. Then β = αφ for some uniquely determined α ∈ V,
namely α = βφ⁻¹. Since (λα)φ = λ(αφ) = λβ, we have (λβ)φ⁻¹ = λα. Thus
(λβ)φ⁻¹ = λα = λ(βφ⁻¹). So φ⁻¹ preserves multiplication by scalars.

As in the case of groups, we see that
V ≅ V,
if V ≅ U, then U ≅ V,
if V ≅ U and U ≅ W, then V ≅ W
for all K-vector spaces V,U,W, where K is any field. Thus ≅ is an
equivalence relation, but we must refrain from saying "on the set of K-
vector spaces".

In view of the symmetry property of ≅, it is legitimate to say that V and
U are isomorphic when V is isomorphic to U.

41.11 Theorem: Let V be a vector space over a field K and let W be a
subspace of V. Then the mapping
ν: V → V/W, α → α + W
is a vector space homomorphism. It is onto V/W. Also, Ker ν = W. (This
mapping is called the natural or canonical homomorphism from V onto
V/W.)

Proof: ν is an additive group homomorphism from V onto V/W such
that Ker ν = W (Theorem 20.12). Since (λα)ν = λα + W = λ(α + W) = λ(αν)
for all λ ∈ K, α ∈ V, we see that ν is a vector space homomorphism.

41.12 Theorem (Fundamental theorem on homomorphisms): Let K be a field. Let V, V1 be K-vector spaces and let φ: V → V1 be a vector space homomorphism. Let W = Ker φ and let κ: V → V/W be the associated natural homomorphism.
Then there is a vector space homomorphism ψ: V/W → V1 such that φ = κψ.

Proof: From Theorem 20.15, we know that
ψ: V/W → V1
ν + W ↦ νφ
is a well defined, one-to-one homomorphism of additive groups with φ = κψ. For all λ ∈ K, ν ∈ V, we have (λ(ν + W))ψ = (λν + W)ψ = (λν)φ = λ(νφ) = λ((ν + W)ψ), so ψ is homogeneous and is therefore a vector space homomorphism.

41.13 Theorem: Let V, U be vector spaces over a field K and let φ: V → U be a vector space homomorphism. Then
V/Ker φ ≅ Im φ (as vector spaces).

Proof: From Theorem 20.16 and its proof, we know that
ψ: V/Ker φ → Im φ
ν + Ker φ ↦ νφ
is an isomorphism of additive groups, thus V/Ker φ ≅ Im φ as groups; and ψ is a vector space homomorphism by Theorem 41.12. Hence ψ: V/Ker φ → Im φ is a vector space isomorphism and V/Ker φ ≅ Im φ as vector spaces.

41.14 Theorem: Let V, V1 be vector spaces over a field K and let φ: V → V1 be a vector space homomorphism from V onto V1.
(1) Each subspace W of V with Ker φ ⊆ W is mapped to a subspace of V1, which will be denoted by W1.
(2) If W, U are subspaces of V with Ker φ ⊆ W ⊆ U, then W1 ⊆ U1.
(3) If W, U are subspaces of V with Ker φ ⊆ W and Ker φ ⊆ U, and if W1 ⊆ U1, then W ⊆ U.
(4) If W, U are subspaces of V with Ker φ ⊆ W and Ker φ ⊆ U, and if W1 = U1, then W = U.
(5) If S is any subspace of V1, then there is a subspace W of V such that Ker φ ⊆ W and W1 = S.
(6) If U is a subspace of V with Ker φ ⊆ U, then V/U ≅ V1/U1.

Proof: (1) For each subspace W of V with Ker φ ⊆ W, we put W1 = Im φ|W, as in Theorem 21.1. Then W1 is a subspace of V1 by Theorem 41.8 and Example 41.6(d).

(2),(3),(4) These follow from parts (2),(3),(4) of Theorem 21.1 on regarding the subspaces merely as additive subgroups.

(5) From Theorem 21.1(5) and its proof, we know that W := {ν ∈ V: νφ ∈ S} is a subgroup of (V,+) with Ker φ ⊆ W and W1 = S. For any λ ∈ K and ν ∈ W, we have νφ ∈ S, so λ(νφ) ∈ S, so (λν)φ ∈ S, so λν ∈ W and W is in fact a subspace of V.

(6) Let κ´: V1 → V1/U1 be the natural homomorphism. Then κ´ and φκ´: V → V1 → V1/U1 are vector space homomorphisms (Theorem 41.11, Theorem 41.7) with Ker φκ´ = U and Im φκ´ = V1/U1 (Theorem 21.1(6),(7)). Hence, by Theorem 41.13, we have the vector space isomorphism
V/Ker φκ´ ≅ Im φκ´,
V/U ≅ V1/U1.

41.15 Theorem: Let V be a vector space over a field K and let W be a subspace of V. The subspaces of V/W are given by U/W, where U runs through the subspaces of V containing W. In other words, for each subspace X of V/W, there is a unique subspace U of V such that W ⊆ U and X = U/W. When X1 and X2 are subspaces of V/W, say with X1 = U1/W and X2 = U2/W, where U1, U2 are subspaces of V containing W, then X1 ⊆ X2 if and only if U1 ⊆ U2. Furthermore, there holds
(V/W)/(U/W) ≅ V/U (vector space isomorphism).

Proof: The natural homomorphism κ: V → V/W is onto by Theorem 41.11. We may therefore apply Theorem 41.14. This theorem states that any subspace of V/W is of the form Im κ|U for some subspace U of V with Ker κ ⊆ U. Now
Im κ|U
= {νκ ∈ V/W: ν ∈ U}
= {ν + W ∈ V/W: ν ∈ U} = U/W
and Ker κ = W by Theorem 41.11. Thus the subspaces of V/W are given by U/W, where U's are subspaces of V containing W. By Theorem 41.14(2),(3),(4), U1/W ⊆ U2/W if and only if U1 ⊆ U2, and U1/W ≠ U2/W whenever U1 ≠ U2. Finally, by Theorem 41.14(6),
V/U ≅ Im κ|V / Im κ|U = (V/W)/(U/W) as vector spaces.

41.16 Theorem: Let V be a vector space over a field K and let U, W be subspaces of V. Then U ∩ W and U + W are subspaces of V and
W/(U ∩ W) ≅ (U + W)/U (vector space isomorphism).

Proof: U ∩ W is a subspace of V by Example 40.6(g). Also, U + W is a subgroup of (V,+) by Lemma 19.4 and, for any λ ∈ K, α ∈ U + W, there are υ ∈ U and ω ∈ W with α = υ + ω, so that
λα = λ(υ + ω) = λυ + λω ∈ U + W
since λυ ∈ U and λω ∈ W; and so U + W is closed under multiplication by scalars and U + W is a subspace of V.

We consider the restriction
κ|W: W → V/U
ω ↦ ω + U
to W of the natural homomorphism κ: V → V/U. By Theorem 41.11, κ is a vector space homomorphism; by Example 41.6(d), κ|W is a vector space homomorphism, so
W/Ker κ|W ≅ Im κ|W (as vector spaces)
according to Theorem 41.13. From the proof of Theorem 21.3, we know that Ker κ|W = U ∩ W and Im κ|W = (U + W)/U, as may also be established directly. Hence
W/(U ∩ W) ≅ (U + W)/U.

Exercises

1. Let V be a vector space over a field K and let W be a subgroup of the additive group (V,+). For all λ in K and for all ν + W in the factor group V/W, we write λ ∘ (ν + W) = λν + W. Prove that (λ, ν + W) ↦ λ ∘ (ν + W) is a well defined mapping from K × (V/W) into V/W if and only if W is a subspace of V.

2. (cf. §20, Ex. 14) Let φ: V → V1 be a vector space homomorphism, let W be a subspace of V such that W ⊆ Ker φ, and let κ: V → V/W be the associated natural homomorphism. Show that there is a vector space homomorphism ψ: V/W → V1 such that φ = κψ and Ker ψ = (Ker φ)/W. What happens when we drop the condition W ⊆ Ker φ?

499
§42
Dependence and Bases

The span s(A) of a subset A in vector space V is a subspace of V. This


span may be the whole vector space V (we say then A spans V). In this
paragraph, we study subsets A of V which span V and which are most
economical in the sense that any proper subset of A spans a proper
subspace of V.

We begin with a definition that will be important for everything in the


sequel.

42.1 Definition: Let V be a vector space over a field K. A finite number of vectors α₁, α₂, . . . , αₙ in V are called linearly dependent over K if there are scalars λ₁, λ₂, . . . , λₙ in K, not all of them being zero, such that
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = 0
(here 0 is the zero vector). If α₁, α₂, . . . , αₙ are not linearly dependent over K, then α₁, α₂, . . . , αₙ are said to be linearly independent over K.

A finite subset A of V is called linearly dependent (resp. linearly independent) over K if the finitely many vectors in A are linearly dependent (resp. linearly independent) over K.

An infinite subset A of V is called linearly dependent over K if there is a finite subset of A which is linearly dependent over K. An infinite subset A of V is called linearly independent over K if A is not linearly dependent over K, i.e., A is called linearly independent over K if every finite subset of A is linearly independent over K.

In place of the phrase "linearly (in)dependent over K", we shall also use the expression "K-linearly (in)dependent". When the field of scalars is clear from the context, we drop the phrase "over K" or the prefix "K-".

According to our definition, the vectors α₁, α₂, . . . , αₙ of a vector space over K are linearly independent over K provided
λ₁, λ₂, . . . , λₙ ∈ K, λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = 0 implies λ₁ = λ₂ = . . . = λₙ = 0.
That is to say, α₁, α₂, . . . , αₙ are K-linearly independent if the zero vector can be written as a linear combination of α₁, α₂, . . . , αₙ only in the trivial way where all the scalars are zero.
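For vectors with rational coordinates this defining condition can be checked mechanically. The following sketch is not part of the text (the function name is our own): it row-reduces the matrix whose rows are the given vectors, using exact rational arithmetic, and reports whether λ₁α₁ + . . . + λₙαₙ = 0 forces all λᵢ = 0, i.e. whether the rank equals the number of vectors.

```python
# Illustrative sketch: linear independence over Q via Gaussian elimination.
from fractions import Fraction

def is_linearly_independent(vectors):
    """True iff the only solution of l1*a1 + ... + lk*ak = 0 is trivial,
    i.e. the matrix with the vectors as rows has full row rank."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    n = len(rows[0]) if rows else 0
    rank, col = 0, 0
    while rank < len(rows) and col < n:
        # look for a pivot in the current column
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            col += 1
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
        col += 1
    return rank == len(vectors)

# The standard vectors of Q^3 are independent (cf. Example 42.2(c)),
# while (1,0) and (-1,0) are dependent (cf. Example 42.2(d)).
print(is_linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))  # True
print(is_linearly_independent([(1, 0), (-1, 0)]))                  # False
```

A single zero vector also tests as dependent, in agreement with Example 42.2(a) below.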

42.2 Examples: (a) Let V be a vector space over a field K and let ν be a nonzero vector in V. Then λν = 0 implies λ = 0 (Lemma 39.4(10)). Hence ν (and {ν}) is linearly independent over K. On the other hand, {0} is linearly dependent over K because 1.0 = 0 and 1 ≠ 0.

(b) Consider the vector space ℝ³ over ℝ. The vectors ε₁ = (1,0,0), ε₂ = (0,1,0), ε₃ = (0,0,1) of ℝ³ are linearly independent over ℝ, for if λ₁, λ₂, λ₃ ∈ ℝ and λ₁ε₁ + λ₂ε₂ + λ₃ε₃ = 0, then
λ₁(1,0,0) + λ₂(0,1,0) + λ₃(0,0,1) = (0,0,0)
(λ₁,0,0) + (0,λ₂,0) + (0,0,λ₃) = (0,0,0)
(λ₁,λ₂,λ₃) = (0,0,0)
λ₁ = λ₂ = λ₃ = 0.

(c) More generally, the vectors ε₁ = (1,0, . . . ,0), ε₂ = (0,1, . . . ,0), . . . , εₙ = (0,0, . . . ,1) in the vector space Kⁿ over a field K are linearly independent over K: if λ₁, λ₂, . . . , λₙ ∈ K and λ₁ε₁ + λ₂ε₂ + . . . + λₙεₙ = 0, then
λ₁(1,0, . . . ,0) + λ₂(0,1, . . . ,0) + . . . + λₙ(0,0, . . . ,1) = (0,0, . . . ,0)
(λ₁,0, . . . ,0) + (0,λ₂, . . . ,0) + . . . + (0,0, . . . ,λₙ) = (0,0, . . . ,0)
(λ₁,λ₂, . . . ,λₙ) = (0,0, . . . ,0)
λ₁ = λ₂ = . . . = λₙ = 0.
The reader is probably acquainted with the vectors ε₁, ε₂, ε₃ in the vector space ℝ³ over ℝ under the names i, j, k.

(d) The vectors (1,0) and (−1,0) in the ℝ-vector space ℝ² are linearly dependent over ℝ because 1 ≠ 0 in ℝ and 1(1,0) + 1(−1,0) = (0,0) = zero vector in ℝ².

(e) Let V be a vector space over a field K and let α₁, α₂, . . . , αₙ be vectors in V which are linearly independent over K. Then any nonempty subset of {α₁, α₂, . . . , αₙ} is linearly independent over K. In fact, if, say, α₁, α₂, . . . , αₘ are linearly dependent over K (m ≤ n), then there are scalars λ₁, λ₂, . . . , λₘ in K, not all equal to zero, such that
λ₁α₁ + λ₂α₂ + . . . + λₘαₘ = 0;
hence, when we put (in case m < n) λₘ₊₁ = . . . = λₙ = 0, we obtain
λ₁α₁ + λ₂α₂ + . . . + λₘαₘ + λₘ₊₁αₘ₊₁ + . . . + λₙαₙ = 0,
where not all of λ₁, λ₂, . . . , λₘ, λₘ₊₁, . . . , λₙ are equal to zero, contradicting the assumption that α₁, α₂, . . . , αₙ are linearly independent over K. Thus any nonempty subset of a linearly independent finite set of vectors is linearly independent. But this statement is true also for infinite linearly independent sets. Indeed, let A be an infinite linearly independent subset of V and let B be a nonempty subset of A. If B is finite, then B is linearly independent by definition. If B is infinite, then any finite subset of B, being a finite subset of A, is linearly independent over K and hence B itself is linearly independent over K. Thus we have shown that every nonempty subset of a linearly independent set of vectors is linearly independent. Equivalently, any set of vectors containing a linearly dependent subset is linearly dependent.

(f) Let V be a vector space over a field K and let A be a subset of V containing 0 ∈ V. Then A is linearly dependent over K by Example 42.2(a) and Example 42.2(e). Alternatively, just choose a finite number of vectors α₁, α₂, . . . , αₙ from A including 0, say α₁ = 0, and observe that
1α₁ + 0α₂ + . . . + 0αₙ = 0,
so that α₁, α₂, . . . , αₙ are linearly dependent over K and consequently A, too, is linearly dependent over K.

(g) Let V be the vector space ℂ² over ℂ. The vectors (1,0), (−i,0) in V are linearly dependent over ℂ, because
i(1,0) + 1(−i,0) = (0,0) = zero vector in V.
However, when V is regarded as an ℝ-vector space, these two vectors are not linearly dependent: if λ, μ ∈ ℝ and λ(1,0) + μ(−i,0) = (0,0), then (λ − μi, 0) = (0,0), hence the complex number λ − μi is equal to 0, so λ = μ = 0. Thus (1,0), (−i,0) are linearly dependent over ℂ, but linearly independent over ℝ. This example shows that the field of scalars must be specified (unless it is clear from the context) whenever one discusses linear (in)dependence of vectors.

(h) Let V be a vector space over a field K and let α₁, α₂, . . . be infinitely many vectors in V. The linear dependence of α₁, α₂, . . . does not mean that there are scalars λ₁, λ₂, . . . , not all equal to zero, such that
λ₁α₁ + λ₂α₂ + λ₃α₃ + ⋯ = 0.
This equation is meaningless, for its left hand side is not defined. What is defined (Definition 8.4) is a sum λ₁α₁ + λ₂α₂ + . . . + λₙαₙ of a finite number n of vectors in V. The definition of an infinite sum λ₁α₁ + λ₂α₂ + λ₃α₃ + ⋯ would involve some limiting process, and this is not possible in an arbitrary vector space.

(i) Consider the vector space C¹([0,1]) over ℝ (Example 40.6(j)). The functions f: [0,1] → ℝ and g: [0,1] → ℝ, where f(x) = eˣ and g(x) = e²ˣ for all x in [0,1], are vectors in C¹([0,1]). We claim that f and g are linearly independent over ℝ. To prove this, let us assume λ, μ ∈ ℝ and λf + μg = zero vector in C¹([0,1]). The zero vector in C¹([0,1]) is the function z: [0,1] → ℝ such that z(x) = 0 for all x in [0,1]. Hence
(λf + μg)(x) = 0 for all x ∈ [0,1],
λf(x) + μg(x) = 0 for all x ∈ [0,1],
λeˣ + μe²ˣ = 0 for all x ∈ [0,1].
Differentiating, we obtain
λeˣ + 2μe²ˣ = 0 for all x ∈ [0,1].
We have thus μe²ˣ = −λeˣ = 2μe²ˣ for all x ∈ [0,1], hence μ = 0, so λ = 0. Therefore f and g are linearly independent over ℝ.

(j) Let V be a vector space over a field K and let α₁, α₂ be vectors in V which are linearly dependent over K. Then there are scalars λ, μ ∈ K, not both zero, such that λα₁ + μα₂ = 0. If, say, λ ≠ 0, then λ has an inverse λ⁻¹ in K and we obtain α₁ + (λ⁻¹μ)α₂ = λ⁻¹(λα₁ + μα₂) = λ⁻¹0 = 0, so
α₁ = κα₂
if we put κ = −λ⁻¹μ. So α₁ is a scalar multiple of α₂. Conversely, if α₁ and α₂ are vectors in V and if one of them is a scalar multiple of the other, for instance if α₁ = κα₂ with some κ ∈ K, then 1α₁ + (−κ)α₂ = 0 and α₁, α₂ are linearly dependent over K. Thus the linear dependence of two vectors means that one of them is a scalar multiple of the other.

We generalize the last example.

42.3 Lemma: Let V be a vector space over a field K and let α₁, α₂, . . . , αₙ be n vectors in V, where n ≥ 2. These vectors are linearly dependent over K if and only if one of them is a K-linear combination of the other vectors.

Proof: We first assume that α₁, α₂, . . . , αₙ are linearly dependent over K. Then there are scalars λ₁, λ₂, . . . , λₙ, not all of them zero, such that
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = 0.
To fix the ideas, let us suppose λ₁ ≠ 0. Then λ₁ has an inverse λ₁⁻¹ in K and we obtain
λ₁⁻¹(λ₁α₁ + λ₂α₂ + . . . + λₙαₙ) = λ₁⁻¹0 = 0,
α₁ + λ₁⁻¹λ₂α₂ + . . . + λ₁⁻¹λₙαₙ = 0,
α₁ = μ₂α₂ + . . . + μₙαₙ,
where we put μⱼ = −λ₁⁻¹λⱼ ∈ K (j = 2, . . . ,n). So α₁ is a K-linear combination of the vectors α₂, . . . , αₙ.

Conversely, let us suppose that one of the vectors, for example α₁, is a linear combination of the rest, so that there are scalars λ₂, . . . , λₙ ∈ K such that
α₁ = λ₂α₂ + . . . + λₙαₙ.
Then we get λ₁α₁ + (−λ₂)α₂ + . . . + (−λₙ)αₙ = 0 when we write λ₁ = 1. Since λ₁ = 1 ≠ 0, we see that α₁, α₂, . . . , αₙ are linearly dependent over K.
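The proof of Lemma 42.3 is effective: from any nontrivial relation one can solve for a vector whose coefficient is nonzero. The following computational sketch over ℚ is our own illustration, not the text's; it finds a nontrivial relation λ₁α₁ + . . . + λₖαₖ = 0 by reducing the homogeneous linear system whose unknowns are the λ's.

```python
# Illustrative sketch: produce a nontrivial dependence relation over Q,
# or None when the vectors are linearly independent.
from fractions import Fraction

def nontrivial_relation(vectors):
    """Return scalars (l1, ..., lk), not all zero, with sum li*ai = 0,
    or None if no such scalars exist (Definition 42.1)."""
    k, n = len(vectors), len(vectors[0])
    # equations are the coordinates; unknowns l1..lk are the columns
    rows = [[Fraction(vectors[j][i]) for j in range(k)] for i in range(n)]
    pivots = {}  # column -> pivot row (reduced row echelon form)
    r = 0
    for c in range(k):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                rows[i] = [x - rows[i][c] * y for x, y in zip(rows[i], rows[r])]
        pivots[c] = r
        r += 1
    free = [c for c in range(k) if c not in pivots]
    if not free:
        return None                      # only the trivial relation exists
    f = free[0]
    sol = [Fraction(0)] * k
    sol[f] = Fraction(1)                 # set one free unknown to 1
    for c, pr in pivots.items():
        sol[c] = -rows[pr][f]            # back-substitute pivot unknowns
    return sol

# (1,2) and (2,4) are dependent: the routine returns a relation
# such as -2*(1,2) + 1*(2,4) = 0, from which (2,4) = 2*(1,2).
print(nontrivial_relation([(1, 2), (2, 4)]))
```

Dividing the relation by a nonzero coefficient expresses the corresponding vector as a combination of the rest, exactly as in the lemma.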

A vector space V over a field K can be spanned by many subsets of V. Among the subsets of V which span V, we want to find the ones with the least number of elements. The next two theorems, which are converses of each other, tell us that linearly dependent subsets are not useful for this purpose.

42.4 Theorem: Let V be a vector space over a field K and let A be a nonempty subset of V. If A is linearly dependent over K, then there is a proper subset B of A such that sK(A) = sK(B).

Proof: Suppose A is K-linearly dependent. If A is infinite, then, by definition, there is a finite linearly dependent subset A₀ of A. If A is finite, let us put A₀ = A. Hence, in both cases, A₀ is a finite linearly dependent subset of A. Let A₀ = {α₀, α₁, α₂, . . . , αₙ}.

We first dispose of the trivial case |A₀| = 1, A = A₀. In this case we have n = 0 and A₀ = {α₀}, so α₀ = 0 by Example 42.2(a), so A₀ = {0}. Thus
{0} = A₀ = A and sK(A) = {0} = sK(∅),
and sK(A) is equal to the K-span of the proper subset B = ∅ of A.

Suppose now |A₀| > 1, or |A₀| = 1 but A ≠ A₀. Then we may and do join nonzero vectors of A to A₀, if need be, without disturbing the linear dependence and finiteness of A₀; so we may assume A₀ = {α₀, α₁, α₂, . . . , αₙ} with n ≥ 1. One of the vectors α₀, α₁, α₂, . . . , αₙ, which we may assume to be α₀ without loss of generality, is a K-linear combination of the others (Lemma 42.3). So there are scalars λ₁, λ₂, . . . , λₙ such that
α₀ = λ₁α₁ + λ₂α₂ + . . . + λₙαₙ.

We will show that α₀ is redundant. We put B = A\{α₀}. Then B is a proper subset of A and sK(B) ⊆ sK(A). We prove sK(A) ⊆ sK(B).

Let α ∈ sK(A). Then there are vectors β₁, β₂, . . . , βₘ in A and scalars μ₁, μ₂, . . . , μₘ in K such that
α = μ₁β₁ + μ₂β₂ + . . . + μₘβₘ.
Here we may suppose that β₁, β₂, . . . , βₘ are pairwise distinct (if β₁ = β₂, we write (μ₁ + μ₂)β₁ instead of μ₁β₁ + μ₂β₂, etc.).

If none of the vectors β₁, β₂, . . . , βₘ is equal to α₀, then α is a K-linear combination of the vectors β₁, β₂, . . . , βₘ in B, so α ∈ sK(B).

If one of the vectors β₁, β₂, . . . , βₘ is equal to α₀, for instance if β₁ = α₀, then we have
α = μ₁β₁ + μ₂β₂ + . . . + μₘβₘ = μ₁(λ₁α₁ + λ₂α₂ + . . . + λₙαₙ) + μ₂β₂ + . . . + μₘβₘ
= μ₁λ₁α₁ + μ₁λ₂α₂ + . . . + μ₁λₙαₙ + μ₂β₂ + . . . + μₘβₘ,
so α is a K-linear combination of the vectors α₁, α₂, . . . , αₙ, β₂, . . . , βₘ in B = A\{α₀} (some αᵢ might equal a βⱼ, but this does not matter), so α ∈ sK(B).

In both cases, α ∈ sK(B). Thus sK(A) ⊆ sK(B) and sK(A) = sK(B), as was to be proved.

42.5 Theorem: Let V be a vector space over a field K and let A be a nonempty subset of V. If there is a proper subset B of A such that sK(B) = sK(A), then A is linearly dependent over K.

Proof: We first dispose of the trivial case B = ∅. If B = ∅, then
A ⊆ sK(A) = sK(B) = sK(∅) = {0}
gives A = {0} and A is K-linearly dependent by Example 42.2(a).

Suppose now B ≠ ∅. Since B ⊂ A, there is a vector α in A\B. From α ∈ A ⊆ sK(A) = sK(B), we conclude that there are vectors β₁, β₂, . . . , βₘ in B and scalars λ₁, λ₂, . . . , λₘ in K with
α = λ₁β₁ + λ₂β₂ + . . . + λₘβₘ.
So the vector α in A is a K-linear combination of the vectors β₁, β₂, . . . , βₘ and the subset {α, β₁, β₂, . . . , βₘ} of A is K-linearly dependent by Lemma 42.3. From Example 42.2(e), it follows that A is K-linearly dependent.

The last two theorems lead us to consider linearly independent subsets of V spanning V. Whether an arbitrary vector space does have such a subset will be discussed later. We give a name to the subsets in question.

42.6 Definition: Let V be a vector space over a field K. A nonempty subset B of V is called a basis of V over K, or a K-basis of V, if B is linearly independent over K and spans V over K (i.e., sK(B) = V). By convention, the empty set ∅ will be called a K-basis of the vector space {0}.

42.7 Examples: (a) Consider the vector space Kⁿ over a field K. The vectors ε₁ = (1,0, . . . ,0), ε₂ = (0,1, . . . ,0), . . . , εₙ = (0,0, . . . ,1) are linearly independent over K (Example 42.2(c)). Moreover, {ε₁, ε₂, . . . , εₙ} spans Kⁿ over K because any vector (λ₁, λ₂, . . . , λₙ) in Kⁿ is a K-linear combination
λ₁ε₁ + λ₂ε₂ + . . . + λₙεₙ
of the vectors ε₁, ε₂, . . . , εₙ. Hence {ε₁, ε₂, . . . , εₙ} is a basis of Kⁿ over K.

(b) Let V = {h ∈ C²([0,1]): h´´(x) − 3h´(x) + 2h(x) = 0 for all x ∈ [0,1]}. Then V is an ℝ-subspace of C²([0,1]), as can be verified directly and also follows from Example 40.6(k). From the theory of ordinary differential equations, it is known that every function in V (that is, every solution of y´´ − 3y´ + 2y = 0) can be written in the form c₁f + c₂g, where c₁, c₂ ∈ ℝ and f(x) = eˣ, g(x) = e²ˣ for all x ∈ [0,1]. Thus {f,g} spans V over ℝ. Also, {f,g} is linearly independent over ℝ by Example 42.2(i). Hence {f,g} is an ℝ-basis of V.
42.8 Theorem: Let V be a vector space over a field K and let B = {α₁, α₂, . . . , αₙ} be a nonempty subset of V. Then B is a K-basis of V if and only if every element of V can be written in the form
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ (λ₁, λ₂, . . . , λₙ ∈ K)
in a unique way (i.e., with unique scalars λ₁, λ₂, . . . , λₙ).

Proof: Assume first that B is a K-basis of V. Then V = sK(B), and every element of V can be written as
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ
with suitable scalars λ₁, λ₂, . . . , λₙ. We are to show the uniqueness of this representation. In other words, we must prove that, if
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = μ₁α₁ + μ₂α₂ + . . . + μₙαₙ (1)
then λ₁ = μ₁, λ₂ = μ₂, . . . , λₙ = μₙ. This is easy: if (1) holds, then
(λ₁ − μ₁)α₁ + (λ₂ − μ₂)α₂ + . . . + (λₙ − μₙ)αₙ = 0
and we obtain, since B = {α₁, α₂, . . . , αₙ} is K-linearly independent, that λ₁ − μ₁ = λ₂ − μ₂ = . . . = λₙ − μₙ = 0. This proves uniqueness.

Conversely, let us suppose that every vector in V can be written in the form
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ
with unique scalars λ₁, λ₂, . . . , λₙ. Then V = sK(α₁, α₂, . . . , αₙ) = sK(B). Moreover, B is linearly independent over K, for if λ₁, λ₂, . . . , λₙ ∈ K are scalars such that
λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = 0,
then λ₁α₁ + λ₂α₂ + . . . + λₙαₙ = 0α₁ + 0α₂ + . . . + 0αₙ
and the uniqueness of the scalars in the representation of 0 ∈ V as a K-linear combination of α₁, α₂, . . . , αₙ implies that λ₁ = λ₂ = . . . = λₙ = 0. Thus B is K-linearly independent and consequently B is a K-basis of V.
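For a concrete basis of ℚ², the unique scalars promised by Theorem 42.8 can be computed by solving a linear system. The sketch below is our own illustration (the helper name is invented); Cramer's rule is used only because the system is 2×2.

```python
# Illustrative sketch: unique coordinates with respect to a basis of Q^2.
from fractions import Fraction

def coordinates(b1, b2, v):
    """Solve l1*b1 + l2*b2 = v by Cramer's rule; b1, b2 must be a basis."""
    det = Fraction(b1[0]) * b2[1] - Fraction(b2[0]) * b1[1]
    assert det != 0, "b1, b2 are linearly dependent, hence not a basis"
    l1 = (Fraction(v[0]) * b2[1] - Fraction(b2[0]) * v[1]) / det
    l2 = (Fraction(b1[0]) * v[1] - Fraction(v[0]) * b1[1]) / det
    return l1, l2

# With the basis {(1,1), (1,-1)} of Q^2:  (3,1) = 2*(1,1) + 1*(1,-1).
l1, l2 = coordinates((1, 1), (1, -1), (3, 1))
print(l1, l2)  # 2 1
```

The uniqueness of (l1, l2) is exactly the nonvanishing of det, i.e. the linear independence of the basis vectors.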

We prove next that any finitely spanned vector space has a basis.

42.9 Theorem: Let V be a vector space over a field K and assume T is a finite subset of V spanning V, so that sK(T) = V. Then V has a finite K-basis. In fact, a suitable subset of T is a K-basis of V.

Proof: If V happens to be the vector space {0}, then V has a K-basis, namely the empty set ∅ (Definition 42.6), and ∅ ⊆ T. Having disposed of this degenerate case, let us assume V ≠ {0}. Now sK(T) = V. Since V ≠ {0}, we have T ≠ ∅. If T is linearly independent over K, then T is a K-basis of V. Otherwise, there is a proper subset T₁ of T with sK(T₁) = sK(T) = V (Theorem 42.4). Here T₁ ≠ ∅, because sK(T₁) = V ≠ {0}. If T₁ is linearly independent over K, then T₁ is a K-basis of V. Otherwise, there is a proper subset T₂ of T₁ with sK(T₂) = sK(T₁) = V. Here T₂ ≠ ∅, because sK(T₂) = V ≠ {0}. If T₂ is linearly independent over K, then T₂ is a K-basis of V. Otherwise, there is a proper subset T₃ of T₂ with sK(T₃) = sK(T₂) = V. Here T₃ ≠ ∅, because sK(T₃) = V ≠ {0}. We continue in this way. Each time, we get a nonempty subset Tᵢ₊₁ of Tᵢ such that sK(Tᵢ₊₁) = V and Tᵢ₊₁ has fewer elements than Tᵢ. Since T is a finite set, this process cannot go on indefinitely. Sooner or later, we will meet a K-linearly independent subset Tₘ of T with sK(Tₘ) = V. This Tₘ is therefore a K-basis of V, and of course Tₘ is finite.
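The procedure in the proof of Theorem 42.9 discards redundant vectors one at a time. The following sketch (our own helpers, stated over ℚ) runs an equivalent procedure: a vector of the spanning set is kept exactly when it enlarges the span of the vectors already kept.

```python
# Illustrative sketch: cut a finite spanning set of Q^n down to a basis
# of its span, as in Theorem 42.9.
from fractions import Fraction

def rank(vectors):
    """Row rank over Q by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extract_basis(spanning_set):
    """Keep a vector exactly when it enlarges the span of those kept."""
    basis = []
    for v in spanning_set:
        if rank(basis + [v]) > rank(basis):
            basis.append(v)
    return basis

T = [(1, 0, 1), (2, 0, 2), (0, 1, 0), (1, 1, 1)]
print(extract_basis(T))  # [(1, 0, 1), (0, 1, 0)]
```

Since the vectors are tested in order, the result depends on the ordering of T; the theorem promises some subset of T, not a canonical one.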

Having convinced ourselves of the existence of bases in some vector spaces, we turn our attention to the number of vectors in a finite basis. We show that the number of linearly independent vectors in a subspace cannot exceed the number of vectors spanning the subspace. This theorem, due to E. Steinitz (1871-1928), is the source of many deep results concerning the dimension of a vector space. The idea is to replace some vectors in the spanning set by the vectors in the linearly independent set without changing the span.

42.10 Theorem (Steinitz' replacement theorem): Let V be a vector space over a field K and let β₁, β₂, . . . , βₘ be finitely many vectors in V. Let α₁, α₂, . . . , αₙ be n linearly independent vectors in the K-span sK(β₁, β₂, . . . , βₘ) of β₁, β₂, . . . , βₘ.
Then n ≤ m. Moreover, there are n vectors among β₁, β₂, . . . , βₘ, which we may assume to be β₁, β₂, . . . , βₙ, such that
sK(α₁, α₂, . . . , αₙ, βₙ₊₁, . . . , βₘ) = sK(β₁, β₂, . . . , βₘ).

Proof: For 1 ≤ h ≤ n, let Aₕ be the assertion
"there are h vectors among β₁, β₂, . . . , βₘ, say β₁, . . . , βₕ, such that
sK(α₁, . . . , αₕ, βₕ₊₁, . . . , βₘ) = sK(β₁, . . . , βₕ, βₕ₊₁, . . . , βₘ)".

We show that (1) A₁ is true,
(2) if 2 ≤ h ≤ n and Aₕ₋₁ is true, then Aₕ is true.

This will establish A₁, A₂, . . . , Aₙ₋₁, Aₙ. The second claim Aₙ in the theorem will be proved in this way.

(1) A₁ is true. We have α₁ ∈ sK(β₁, β₂, . . . , βₘ), so
α₁ = λ₁β₁ + λ₂β₂ + . . . + λₘβₘ
with some scalars λ₁, λ₂, . . . , λₘ ∈ K. Since α₁, α₂, . . . , αₙ are linearly independent over K, α₁ ≠ 0 (Example 42.2(f)), hence not all of λ₁, λ₂, . . . , λₘ are equal to 0 ∈ K. So one of them is distinct from 0. Renaming β₁, β₂, . . . , βₘ if necessary, we may suppose λ₁ ≠ 0. Then λ₁ has an inverse λ₁⁻¹ in K and we get
β₁ = λ₁⁻¹(α₁ − λ₂β₂ − . . . − λₘβₘ),
β₁ ∈ sK(α₁, β₂, . . . , βₘ),
{β₁, β₂, . . . , βₘ} ⊆ sK(α₁, β₂, . . . , βₘ). (i)
Since α₁ ∈ sK(β₁, β₂, . . . , βₘ),
we also have {α₁, β₂, . . . , βₘ} ⊆ sK(β₁, β₂, . . . , βₘ). (ii)
Using (i) and (ii) and applying Lemma 40.11 (with A = {β₁, β₂, . . . , βₘ} and B = {α₁, β₂, . . . , βₘ}), we obtain
sK(α₁, β₂, . . . , βₘ) = sK(β₁, β₂, . . . , βₘ).
This proves A₁.

(2) Suppose 2 ≤ h ≤ n and Aₕ₋₁ is true. Then Aₕ is true. The truth of Aₕ₋₁ means
sK(α₁, . . . , αₕ₋₁, βₕ, . . . , βₘ) = sK(β₁, β₂, . . . , βₘ)
provided the β's are indexed suitably. We have
αₕ ∈ sK(β₁, β₂, . . . , βₘ) = sK(α₁, . . . , αₕ₋₁, βₕ, . . . , βₘ),
so αₕ = λ₁α₁ + . . . + λₕ₋₁αₕ₋₁ + λₕβₕ + . . . + λₘβₘ
for some appropriate λ₁, . . . , λₕ₋₁, λₕ, . . . , λₘ ∈ K. Here not all of λₕ, . . . , λₘ are equal to 0 ∈ K, for then αₕ would be a K-linear combination λ₁α₁ + . . . + λₕ₋₁αₕ₋₁ of the vectors α₁, . . . , αₕ₋₁ and the vectors α₁, . . . , αₕ₋₁, αₕ would not be linearly independent over K (Lemma 42.3), so α₁, α₂, . . . , αₙ would not be linearly independent over K (Example 42.2(e)), contrary to the hypothesis. So one of λₕ, . . . , λₘ is distinct from 0. Renaming βₕ, . . . , βₘ if necessary, we may suppose λₕ ≠ 0. Then λₕ has an inverse λₕ⁻¹ in K and we get
λₕβₕ = αₕ − λ₁α₁ − . . . − λₕ₋₁αₕ₋₁ − λₕ₊₁βₕ₊₁ − . . . − λₘβₘ,
βₕ = λₕ⁻¹(αₕ − λ₁α₁ − . . . − λₕ₋₁αₕ₋₁ − λₕ₊₁βₕ₊₁ − . . . − λₘβₘ),
βₕ ∈ sK(α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ).
Now each one of the vectors β₁, . . . , βₕ₋₁, being an element of the span sK(β₁, β₂, . . . , βₘ) = sK(α₁, . . . , αₕ₋₁, βₕ, . . . , βₘ), can be written in the form
μ₁α₁ + . . . + μₕ₋₁αₕ₋₁ + μₕβₕ + μₕ₊₁βₕ₊₁ + . . . + μₘβₘ
with scalars μ₁, . . . , μₕ₋₁, μₕ, μₕ₊₁, . . . , μₘ ∈ K. Thus each one of β₁, . . . , βₕ₋₁ can be written as
μ₁α₁ + . . . + μₕ₋₁αₕ₋₁ + μₕ(λₕ⁻¹(αₕ − λ₁α₁ − . . . − λₕ₋₁αₕ₋₁ − λₕ₊₁βₕ₊₁ − . . . − λₘβₘ)) + μₕ₊₁βₕ₊₁ + . . . + μₘβₘ,

and so {β₁, . . . , βₕ₋₁} ⊆ sK(α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ).

Therefore {β₁, . . . , βₕ₋₁, βₕ, βₕ₊₁, . . . , βₘ} ⊆ sK(α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ). (i´)

Since α₁, . . . , αₕ₋₁, αₕ ∈ sK(β₁, . . . , βₕ₋₁, βₕ, βₕ₊₁, . . . , βₘ),
we also have
{α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ} ⊆ sK(β₁, . . . , βₕ₋₁, βₕ, βₕ₊₁, . . . , βₘ). (ii´)

Using (i´) and (ii´) and applying Lemma 40.11 with
A = {β₁, . . . , βₕ₋₁, βₕ, βₕ₊₁, . . . , βₘ}, B = {α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ},
we obtain
sK(α₁, . . . , αₕ₋₁, αₕ, βₕ₊₁, . . . , βₘ) = sK(β₁, β₂, . . . , βₘ).

Thus Aₕ is true.

As remarked earlier, this establishes the truth of A₁, A₂, . . . , Aₙ₋₁, Aₙ. In particular, Aₙ is true, and the second statement in the enunciation is proved. Now it remains to establish n ≤ m.

If we had m < n, then Aₘ would be true and we would get
sK(α₁, α₂, . . . , αₘ) = sK(β₁, β₂, . . . , βₘ).
Then αₙ ∈ sK(β₁, β₂, . . . , βₘ) would give αₙ ∈ sK(α₁, α₂, . . . , αₘ), contrary to the hypothesis that α₁, α₂, . . . , αₘ, . . . , αₙ are linearly independent over K. So m < n is impossible and necessarily n ≤ m. This completes the proof.

42.11 Theorem: Let V be a vector space over a field K and assume that V has a finite K-basis. Then any two K-bases of V have the same number of elements.

Proof: There is a finite K-basis of V by hypothesis, say B. Assume that B has exactly n vectors (n ≥ 0). We prove that any K-basis B₁ of V also has n vectors in it.

If n = 0, then B = ∅ and V = {0}. Thus ∅ and {0} are the only subsets of V and B = ∅ is the only K-basis of V. Then any K-basis of V has exactly 0 elements.

Suppose now n ≥ 1 and let B₁ be any K-basis of V. First we show that B₁ cannot be infinite. Otherwise, B₁ would be an infinite K-linearly independent subset of V. Every finite subset of B₁ would be K-linearly independent by definition. Let α₁, α₂, . . . , αₙ, αₙ₊₁ be n + 1 K-linearly independent vectors in B₁. These n + 1 vectors lie in the K-span sK(B) of B and B has n elements. Steinitz' replacement theorem gives n + 1 ≤ n, which is absurd. Thus B₁ cannot be infinite.

We put |B₁| = n₁. Here n₁ ≠ 0, because n₁ = 0 would imply B₁ = ∅ and
B ⊆ sK(B) = V = sK(B₁) = sK(∅) = {0}, so B = {0} and B would be linearly dependent over K, contrary to the hypothesis that B is a K-basis of V. So n₁ ≥ 1.

B₁ is a K-linearly independent subset of V = sK(B), therefore n₁ ≤ n by Steinitz' replacement theorem. Likewise, B is a K-linearly independent subset of V = sK(B₁), so n ≤ n₁. Therefore n = n₁, as was to be proved.

42.12 Definition: Let V be a vector space over a field K. If V has a finite K-basis, the number of elements in any K-basis of V, which is the same for all K-bases of V by Theorem 42.11, is called the dimension of V over K, or the K-dimension of V. It is denoted as dimK V or as dim V. If V has no finite K-basis, then the K-dimension of V is defined to be infinity, and we write in this case dimK V = ∞.

Thus dimK Kⁿ = n (Example 42.7(a)) and dimℝ V = 2, where V is the ℝ-vector space of Example 42.7(b).

We frequently say that V is n-dimensional when dim V = n. A vector space is said to be finite dimensional if dim V is a nonnegative integer and infinite dimensional if dim V = ∞. Notice that the dimension of the vector space {0} is zero.

42.13 Lemma: Let V be a vector space over a field K, let dimK V = n and let α₁, α₂, . . . , αₙ be n vectors in V.
(1) If α₁, α₂, . . . , αₙ are linearly independent over K, then sK(α₁, α₂, . . . , αₙ) = V.
(2) If sK(α₁, α₂, . . . , αₙ) = V, then α₁, α₂, . . . , αₙ are linearly independent over K.

Proof: (1) We are given dimK V = n. Let {β₁, β₂, . . . , βₙ} be a basis of V over K. If α₁, α₂, . . . , αₙ are K-linearly independent vectors in V = sK(β₁, β₂, . . . , βₙ), then we obtain sK(α₁, α₂, . . . , αₙ) = sK(β₁, β₂, . . . , βₙ) = V by Steinitz' replacement theorem.

(2) Suppose sK(α₁, α₂, . . . , αₙ) = V. If α₁, α₂, . . . , αₙ are not linearly independent over K, then there is a proper subset T ⊂ {α₁, α₂, . . . , αₙ} of {α₁, α₂, . . . , αₙ} with sK(T) = sK(α₁, α₂, . . . , αₙ) = V (Theorem 42.4), and there is a K-basis B of V such that B ⊆ T (Theorem 42.9). Using Theorem 42.11, we obtain the contradiction
n = dimK V = |B| ≤ |T| < n.
Hence α₁, α₂, . . . , αₙ have to be linearly independent over K.

Any finite set spanning a vector space can be cut down to a basis of that vector space (Theorem 42.9). Similarly, any linearly independent subset of a vector space can be extended to a basis, as we show now.

42.14 Theorem: Let V be an m-dimensional vector space over a field K, with m ≥ 1. Let n ≥ 1 and let α₁, α₂, . . . , αₙ be n linearly independent vectors in V. Then there is a K-basis B of V such that {α₁, α₂, . . . , αₙ} ⊆ B.

Proof: Let {β₁, β₂, . . . , βₘ} be a K-basis of V. Then α₁, α₂, . . . , αₙ are linearly independent vectors in V = sK(β₁, β₂, . . . , βₘ), and Steinitz' replacement theorem gives
sK(α₁, α₂, . . . , αₙ, βₙ₊₁, . . . , βₘ) = V and n ≤ m
on indexing the β's suitably. Then the m = dimK V vectors
α₁, α₂, . . . , αₙ, βₙ₊₁, . . . , βₘ
are linearly independent over K by Lemma 42.13(2). Hence
B = {α₁, α₂, . . . , αₙ, βₙ₊₁, . . . , βₘ}
is a K-basis of V containing the vectors α₁, α₂, . . . , αₙ.
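The basis extension of Theorem 42.14 can be carried out concretely in ℚⁿ: adjoin those standard basis vectors ε₁, . . . , εₙ that still enlarge the span. The sketch below uses our own helper names and is only an illustration of the replacement argument, not the text's construction.

```python
# Illustrative sketch: extend an independent list to a basis of Q^n by
# adjoining standard basis vectors, one at a time.
from fractions import Fraction

def rank(vectors):
    """Row rank over Q by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for c in range(len(rows[0]) if rows else 0):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, n):
    """Adjoin standard basis vectors of Q^n until the list has rank n."""
    basis = list(independent)
    for j in range(n):
        e = tuple(1 if i == j else 0 for i in range(n))
        if rank(basis + [e]) > rank(basis):
            basis.append(e)
    return basis

B = extend_to_basis([(1, 1, 0)], 3)
print(B)  # [(1, 1, 0), (1, 0, 0), (0, 0, 1)]
```

Note that (0,1,0) is skipped: once (1,1,0) and (1,0,0) are present it no longer enlarges the span, exactly as in the Steinitz exchange.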

42.15 Lemma: Let V be a finite dimensional vector space over a field K and let W be a subspace of V.
(1) W is finite dimensional; in fact dimK W ≤ dimK V.
(2) dimK W = dimK V if and only if W = V.

Proof: Let n = dimK V.
(1) The assertion is trivial when W = {0}, so let us assume W ≠ {0}. Then there is a nonzero vector ω in W, and {ω} is a K-linearly independent subset with one element. On the other hand, any n + 1 vectors in W (in fact in V) are linearly dependent over K by Steinitz' replacement theorem. Therefore there exists a natural number m such that
(a) 1 ≤ m ≤ n,
(b) there are m linearly independent vectors in W,
(c) any m + 1 vectors in W are linearly dependent.
This m is clearly unique in view of (b) and (c). The natural number m having been defined in this way, let ω₁, ω₂, . . . , ωₘ be m K-linearly independent vectors in W. We claim that {ω₁, ω₂, . . . , ωₘ} is a K-basis of W. To show this, we must prove only that these vectors span W over K. Let ω be an arbitrary vector in W. Then ω, ω₁, ω₂, . . . , ωₘ are linearly dependent over K by (c). Hence
λω + λ₁ω₁ + λ₂ω₂ + . . . + λₘωₘ = 0
with some scalars λ, λ₁, λ₂, . . . , λₘ in K, not all zero. Here λ ≠ 0, for otherwise the equation above would imply that ω₁, ω₂, . . . , ωₘ are K-linearly dependent. Hence λ has an inverse λ⁻¹ in K and we get
ω = (−λ⁻¹λ₁)ω₁ + (−λ⁻¹λ₂)ω₂ + . . . + (−λ⁻¹λₘ)ωₘ,
ω ∈ sK(ω₁, ω₂, . . . , ωₘ).

This gives W ⊆ sK(ω₁, ω₂, . . . , ωₘ). But ω₁, ω₂, . . . , ωₘ belong to W, so sK(ω₁, ω₂, . . . , ωₘ) ⊆ W (Lemma 40.5). The vectors ω₁, ω₂, . . . , ωₘ therefore span W over K, so {ω₁, ω₂, . . . , ωₘ} is a K-basis of W. Thus W is finite dimensional and in fact dimK W = m ≤ n = dimK V.

(2) If W = V, then of course dimK W = dimK V. Suppose conversely dimK W = dimK V = n and let A be a K-basis of W. Then there is a K-basis B of V with A ⊆ B: this follows from Theorem 42.14 when A ≠ ∅ and is obvious when A = ∅. Then
n = dimK W = |A| ≤ |B| = dimK V = n
implies that A = B. Thus W = sK(A) = sK(B) = V.

42.16 Lemma: Let V,U be vector spaces over a field K.. Suppose V is
finite dimensional and let. : V U be a vector space homomorphism.
Let 1, 2, . . . , n be vectors in V.
(1) If is one-to-one and { 1, 2, . . . , n} is linearly independent over K,
then { 1 , 2 , . . . , n } is linearly independent over K.
(2) If is onto U and { 1, 2, . . . , n} spans V over K, then { 1 , 2 , . . . , n }
spans U over K.
(3) If is a vector space isomorphism and { 1, 2, . . . , n} is a K-basis of V,
then { 1 , 2 , . . . , n } is a K-basis of U. In particular, dimK U = dimK V.

Proof: (1) Suppose λ₁, λ₂, ..., λₙ are scalars such that
λ₁(α₁T) + λ₂(α₂T) + ... + λₙ(αₙT) = 0.
Then (λ₁α₁ + λ₂α₂ + ... + λₙαₙ)T = 0,
so λ₁α₁ + λ₂α₂ + ... + λₙαₙ ∈ Ker T,
and λ₁α₁ + λ₂α₂ + ... + λₙαₙ = 0
since Ker T = 0, as T is one-to-one. Since α₁, α₂, ..., αₙ are K-linearly inde-
pendent, we get λ₁ = λ₂ = ... = λₙ = 0. Hence α₁T, α₂T, ..., αₙT are linearly
independent over K.

(2) We must show that any element of U can be written as a K-linear
combination of the vectors α₁T, α₂T, ..., αₙT. Let β ∈ U. Then there is an α in
V with αT = β, and α = λ₁α₁ + λ₂α₂ + ... + λₙαₙ, where λ₁, λ₂, ..., λₙ are suit-
able scalars in K. This yields
β = αT = (λ₁α₁ + λ₂α₂ + ... + λₙαₙ)T
= λ₁(α₁T) + λ₂(α₂T) + ... + λₙ(αₙT) ∈ sK(α₁T, α₂T, ..., αₙT),
as was to be proved.

(3) This follows immediately from (1) and (2). □

From Lemma 42.16, it follows that dimK U = n whenever U ≅ Kⁿ. The con-
verse of this statement is also true.

42.17 Theorem: Let V be a vector space over a field K. Then
dimK V = n if and only if V ≅ Kⁿ (as vector spaces).

Proof: If V ≅ Kⁿ, then dimK V = dimK Kⁿ = n by Lemma 42.16(3). Suppose
conversely that dimK V = n and let {α₁, α₂, ..., αₙ} be a K-basis of V. Every
element of V can be written in a unique way as λ₁α₁ + λ₂α₂ + ... + λₙαₙ,
where λ₁, λ₂, ..., λₙ ∈ K. We consider the mapping
T: V → Kⁿ
λ₁α₁ + λ₂α₂ + ... + λₙαₙ ↦ (λ₁, λ₂, ..., λₙ).

This is a K-linear transformation, since, for any λ, μ ∈ K and
α = λ₁α₁ + λ₂α₂ + ... + λₙαₙ, β = μ₁α₁ + μ₂α₂ + ... + μₙαₙ in V, we have
(λα + μβ)T = (λ(λ₁α₁ + λ₂α₂ + ... + λₙαₙ) + μ(μ₁α₁ + μ₂α₂ + ... + μₙαₙ))T
= ((λλ₁ + μμ₁)α₁ + (λλ₂ + μμ₂)α₂ + ... + (λλₙ + μμₙ)αₙ)T
= (λλ₁ + μμ₁, λλ₂ + μμ₂, ..., λλₙ + μμₙ)
= λ(λ₁, λ₂, ..., λₙ) + μ(μ₁, μ₂, ..., μₙ)
= λ(αT) + μ(βT).

Furthermore, α = λ₁α₁ + λ₂α₂ + ... + λₙαₙ ∈ V belongs to Ker T if and only if
(λ₁, λ₂, ..., λₙ) = (0,0, ...,0), thus if and only if α = 0α₁ + 0α₂ + ... + 0αₙ = 0.
So Ker T = {0} and T is one-to-one.

Since any n-tuple (λ₁, λ₂, ..., λₙ) in Kⁿ is the image, under T, of the vector
λ₁α₁ + λ₂α₂ + ... + λₙαₙ in V, we see that T is onto.

Hence T is a vector space isomorphism and V ≅ Kⁿ. □

42.18 Theorem: Let V and U be finite dimensional vector spaces over a
field K. Then V ≅ U if and only if dimK V = dimK U.

Proof: The case when dimK V = 0 or dimK U = 0 is trivial. Let us suppose
dimK V ≥ 1 and dimK U ≥ 1. If V ≅ U, then dimK V = dimK U by Lemma
42.16(3). If dimK V = dimK U, then V ≅ K^(dimK V) = K^(dimK U) ≅ U by Theorem
42.17, hence V ≅ U. □

42.19 Theorem: Let V be a vector space over a field K and let W be a
subspace of V. If V is finite dimensional, then V/W is finite dimensional.
In fact,
dimK V = dimK W + dimK V/W.

Proof: We eliminate the trivial cases. We know that W is finite dimen-
sional (Lemma 42.15(1)). If dimK W = 0, then W = {0}, so V ≅ V/{0} =
V/W, so dimK V/W = dimK V and dimK V = 0 + dimK V = dimK W + dimK V/W.
If dimK W = dimK V, then W = V (Lemma 42.15(2)), so V/W ≅ {0} and
dimK V = dimK V + 0 = dimK W + dimK V/W. Thus the theorem is proved in
case dimK W = 0 or dimK W = dimK V (in particular in case dimK V = 0).

Let us assume now 0 < dimK W < dimK V. Let dimK W = m and let
{ω₁, ω₂, ..., ωₘ} be a K-basis of W. There are vectors α₁, α₂, ..., αₖ in V such
that {ω₁, ω₂, ..., ωₘ, α₁, α₂, ..., αₖ} is a K-basis of V (Theorem 42.14). Here
k ≥ 1 and m + k = dimK V. We claim that {α₁ + W, α₂ + W, ..., αₖ + W} is a
K-basis of V/W. This will imply k = dimK V/W, hence dimK V = m + k =
dimK W + dimK V/W.

To establish our claim, we note first that α₁ + W, α₂ + W, ..., αₖ + W are
K-linearly independent vectors in V/W. Indeed, if λ₁, λ₂, ..., λₖ are scalars
such that
λ₁(α₁ + W) + λ₂(α₂ + W) + ... + λₖ(αₖ + W) = 0 + W,
then λ₁α₁ + λ₂α₂ + ... + λₖαₖ ∈ W = sK(ω₁, ω₂, ..., ωₘ),
λ₁α₁ + λ₂α₂ + ... + λₖαₖ = μ₁ω₁ + μ₂ω₂ + ... + μₘωₘ,
where μ₁, μ₂, ..., μₘ are appropriate scalars in K. Then
λ₁α₁ + λ₂α₂ + ... + λₖαₖ − μ₁ω₁ − μ₂ω₂ − ... − μₘωₘ = 0
and linear independence of ω₁, ω₂, ..., ωₘ, α₁, α₂, ..., αₖ implies that
λ₁ = λ₂ = ... = λₖ = 0. Thus α₁ + W, α₂ + W, ..., αₖ + W in V/W are linearly
independent over K.
Secondly, these vectors span V/W. To see this, let us take an arbitrary
vector α + W in V/W, where α ∈ V. Then
α = λ₁α₁ + λ₂α₂ + ... + λₖαₖ + μ₁ω₁ + μ₂ω₂ + ... + μₘωₘ
where λ₁, λ₂, ..., λₖ, μ₁, μ₂, ..., μₘ are scalars, and thus
α + W = (λ₁α₁ + λ₂α₂ + ... + λₖαₖ + μ₁ω₁ + μ₂ω₂ + ... + μₘωₘ) + W
= λ₁(α₁ + W) + λ₂(α₂ + W) + ... + λₖ(αₖ + W)
+ μ₁(ω₁ + W) + μ₂(ω₂ + W) + ... + μₘ(ωₘ + W)
= λ₁(α₁ + W) + λ₂(α₂ + W) + ... + λₖ(αₖ + W)  (since ωᵢ + W = 0 + W)
∈ sK(α₁ + W, α₂ + W, ..., αₖ + W) ⊆ V/W,
hence V/W = sK(α₁ + W, α₂ + W, ..., αₖ + W). This proves that
{α₁ + W, α₂ + W, ..., αₖ + W} is a basis of V/W over K. As we remarked
above, this gives dimK V = dimK W + dimK V/W. □

We deduce important corollaries from Theorem 42.19.

42.20 Theorem: Let V be a vector space over a field K. Let W,U be
finite dimensional subspaces of V. Then W + U is a finite dimensional
subspace of V and in fact
dimK(W + U) = dimK W + dimK U − dimK(W ∩ U).

Proof: If {ω₁, ω₂, ..., ωₘ} is a K-basis of W and {υ₁, υ₂, ..., υₖ} is a K-basis
of U, then W + U = {ω + υ ∈ V: ω ∈ W, υ ∈ U} is clearly spanned by the
finite set {ω₁, ω₂, ..., ωₘ, υ₁, υ₂, ..., υₖ}, hence W + U is finite dimensional
(Theorem 42.9). From (W + U)/U ≅ W/(W ∩ U) (Theorem 41.16), we
obtain then
dimK(W + U) − dimK U = dimK((W + U)/U)
= dimK(W/(W ∩ U))
= dimK W − dimK(W ∩ U). □
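The dimension formula of Theorem 42.20 can also be checked numerically: the dimension of a subspace of ℚⁿ given by a spanning set is the rank of the list of its generators. The following is a minimal sketch in Python (the subspaces W and U are illustrative choices, not taken from the text), with rank computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors over the rationals, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# W = span{(1,0,0),(0,1,0)}, U = span{(0,1,0),(0,0,1)} in Q^3;
# for this example W ∩ U = span{(0,1,0)}, so dim(W ∩ U) = 1 is known directly.
W = [(1, 0, 0), (0, 1, 0)]
U = [(0, 1, 0), (0, 0, 1)]
dim_W, dim_U = rank(W), rank(U)
dim_sum = rank(W + U)   # W + U is spanned by the union of the two spanning sets
dim_int = 1             # read off from the chosen example
assert dim_sum == dim_W + dim_U - dim_int   # 3 == 2 + 2 - 1
```

The assertion confirms the formula for this example: dimK(W + U) = 3 = 2 + 2 − 1.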

42.21 Theorem: Let V be a vector space over a field K and let T be a K-
linear transformation from V. If V is finite dimensional, then
dimK Ker T + dimK Im T = dimK V.

Proof: Theorem 41.13 tells us V/Ker T ≅ Im T and Theorem 42.19 gives
dimK V − dimK Ker T = dimK Im T. □
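Theorem 42.21 can be illustrated over a finite field, where kernels and images can be enumerated outright. A sketch, assuming an illustrative 3 × 3 matrix over ℤ₂ (not taken from the text) acting on row vectors on the right, as in this book's convention:

```python
from itertools import product

# A linear map T: V -> V on V = (Z_2)^3, given by a 3x3 matrix over Z_2.
M = [[1, 0, 1],
     [0, 1, 1],
     [0, 0, 0]]

def T(v):
    # row vector v times M, with arithmetic modulo 2
    return tuple(sum(v[i] * M[i][j] for i in range(3)) % 2 for j in range(3))

V = list(product((0, 1), repeat=3))          # all 8 vectors of (Z_2)^3
kernel = [v for v in V if T(v) == (0, 0, 0)]
image = {T(v) for v in V}

# Over Z_2, a subspace of dimension d has exactly 2^d elements.
dim_ker = len(kernel).bit_length() - 1
dim_im = len(image).bit_length() - 1
assert dim_ker + dim_im == 3                 # dimK Ker T + dimK Im T = dimK V
```

Here Ker T has 2 elements (dimension 1) and Im T has 4 elements (dimension 2), so 1 + 2 = 3 as the theorem asserts.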

42.22 Theorem: Let V,U be vector spaces over a field K and let
T: V → U be a K-linear mapping. Suppose that V and U have the same
finite dimension. Then the following statements are equivalent.
(1) T is one-to-one.
(2) T is onto.
(3) T is a vector space isomorphism.

Proof: (1) ⇒ (2) If T is one-to-one, then Ker T = {0}, so dimK Ker T = 0
and dimK Im T = dimK Ker T + dimK Im T = dimK V = dimK U. Thus Im T is a
subspace of U with dimK Im T = dimK U, and Lemma 42.15(2) gives then
Im T = U. Hence T is onto.

(2) ⇒ (1) If T is onto, then Im T = U, so dimK Im T = dimK U and dimK Ker T
= dimK V − dimK Im T = dimK U − dimK U = 0. Thus Ker T = {0} and T is
one-to-one.

Hence any one of (1),(2) implies the other, and these together imply (3).
Conversely, if T is an isomorphism, then of course T is one-to-one and
onto. Thus (3) implies both (1) and (2). □

We close this paragraph with a brief discussion of infinite dimensional


vector spaces. Do infinite dimensional vector spaces have bases? From
Theorem 42.9, we know that such a vector space cannot be spanned by a
finite set. But if B is a spanning set, necessarily infinite, the argument of
Theorem 42.9 does not work. To prove the existence of bases of infinite
dimensional vector spaces, we have to resort to more sophisticated
means.

It is in fact true that every vector space has a basis, and a proof is given
in the appendix. The proof of this statement for infinite dimensional
vector spaces requires a fundamental tool known as Zorn's lemma. This
lemma can be used in a variety of situations to establish the existence of
certain objects.

The existence of bases having been assured by Zorn's lemma, we might
ask whether any two bases have the same cardinality. The answer
turned out to be "yes" in the finite dimensional case (Theorem 42.11),
and this was proved by using Steinitz' replacement theorem. The proof
of Steinitz' replacement theorem does not extend to the infinite dimen-
sional case. Nevertheless, theorems of set theory can be employed to
show that two bases of a vector space have the same cardinal number.
This renders it possible to define the dimension of a vector space as the
cardinality of a basis. Hence it is possible to distinguish between various
types of infinities. This is much finer than Definition 42.12, by which in-
finite dimensionality is merely a crude negation of finite dimensionality.

Theorem 42.14, which states that any linearly independent subset can
be extended to a basis, is true in the infinite dimensional case, too. The
proof makes use of Zorn's lemma.

Lemma 42.15(1) remains valid also in the infinite dimensional case, in
the sense that a basis of a subspace has a cardinal number less than or
equal to the cardinality of a basis of the whole space. Lemma 42.15(2),
however, is not necessarily true for infinite dimensional vector spaces: a
proper subspace may have the same dimension as the whole space
(think of ℝ and ℂ as ℚ-vector spaces).

Lemma 42.16 and its proof work in the infinite dimensional case.

Theorem 42.19 and its proof work in the infinite dimensional case, pro-
vided we refer to the generalization of Theorem 42.14 at the
appropriate place.

Generally speaking, infinite dimensional vector spaces are wild objects.


To render them more manageable, one equips them with some addi-
tional structure, perhaps with a topological or analytic one.

Exercises

1. Let V be a vector space over a field K and let W be a subspace of V.
Show that there is a subspace U of V such that V = W + U and
W ∩ U = {0}. (U is called a direct complement of W in V. We write then
V = W ⊕ U and call V the direct sum of W and U.)

2. Is {(1,1,1), (1,1,0), (1,0,0)} an ℝ-basis of ℝ³?

3. Is {(1,2,6), (0,0,1), (2,1,0)} a ℤ₃-basis of ℤ₃³?

4. Find an ℝ-basis of
{f ∈ C²([0,1]): f″(x) − 7f′(x) + 12f(x) = 0 for all x ∈ [0,1]}.

5. Find all ℝ-linear mappings from ℝ⁴ onto ℝ⁵.

6. Find all ℤ₂-bases of ℤ₂³ and ℤ₃-bases of ℤ₃².

7. Show that the vectors (1,2,1), (0,2,0), (1,2,−1) and also the vectors
(1,1,0), (1,0,1), (1,1,1) in ℝ³ are linearly independent over ℝ.

8. Let fₖ(x) = sin kx for x ∈ [0,1] (k = 1,2,3, ...). Prove that the functions
{f₁, f₂, f₃, ...} in C^∞([0,1]) are linearly independent over ℝ.
§43
Linear Transformations and Matrices

In this paragraph, we learn to construct a new vector space from two


given vector spaces V,W, namely the vector space of linear transforma-
tions from V into W. We introduce matrices and study the relationship
between linear transformations and matrices.

Suppose V and W are vector spaces over a field K. We denote by
LK(V,W) the set of all K-linear mappings from V into W. This set
LK(V,W) is not empty, for at least the zero mapping V → W, α ↦ 0, is a
K-linear transformation in LK(V,W). We want to define an addition and a multi-
plication by scalars on LK(V,W) and make LK(V,W) into a K-vector space.

Let T,S ∈ LK(V,W). How shall we define T + S? Well, the only natural way
to define T + S is to put α(T + S) = αT + αS for all α ∈ V (pointwise addi-
tion). What about multiplication by scalars? Given λ ∈ K and T ∈ LK(V,W),
the mapping λT had better mean: first multiply by λ, then apply T, so
that α(λT) := (λα)T (or: first apply T, then multiply by λ, so that α(λT) :=
λ(αT); but this is the same definition as before).

43.1 Theorem: Let V,W be vector spaces over a field K and let LK(V,W)
be the set of all K-linear transformations from V into W. For any T,S in
LK(V,W) and for any λ in K, we write

α(T + S) = αT + αS,  α(λT) = (λα)T  (α ∈ V).

Under this addition and multiplication by scalars, LK(V,W) is a vector
space over K.

Proof: We show first that LK(V,W) is an abelian group under addition.

(i) Let T,S ∈ LK(V,W). Then
(λ₁α₁ + λ₂α₂)(T + S) = (λ₁α₁ + λ₂α₂)T + (λ₁α₁ + λ₂α₂)S
= λ₁(α₁T) + λ₂(α₂T) + λ₁(α₁S) + λ₂(α₂S)
= λ₁(α₁T) + λ₁(α₁S) + λ₂(α₂T) + λ₂(α₂S)
= λ₁(α₁T + α₁S) + λ₂(α₂T + α₂S)
= λ₁(α₁(T + S)) + λ₂(α₂(T + S))
for all λ₁,λ₂ ∈ K and α₁,α₂ ∈ V. Thus T + S is K-linear and T + S ∈ LK(V,W).
Therefore LK(V,W) is closed under addition.

(ii) Let T,S,R be arbitrary elements of LK(V,W). Then
α((T + S) + R) = α(T + S) + αR = (αT + αS) + αR
= αT + (αS + αR) = αT + α(S + R) = α(T + (S + R))
for all α ∈ V; hence (T + S) + R = T + (S + R). Thus addition in LK(V,W) is
associative.

(iii) Let 0*: V → W, α ↦ 0. Then
(λ₁α₁ + λ₂α₂)0* = 0 = λ₁0 + λ₂0 = λ₁(α₁0*) + λ₂(α₂0*)
for all λ₁,λ₂ ∈ K, α₁,α₂ ∈ V, and 0* is in LK(V,W). From
α(T + 0*) = αT + α0* = αT + 0 = αT  (α ∈ V),
we obtain T + 0* = T for any T ∈ LK(V,W). Thus 0* is a right identity.

(iv) Any T ∈ LK(V,W) has an opposite in LK(V,W), namely the
mapping S: V → W, α ↦ −(αT). Indeed
α(T + S) = αT + αS = αT + (−(αT)) = 0 = α0*
for all α ∈ V, so T + S = 0*. Are we done? No! We should check that S is in
fact in LK(V,W), but this is easy:
(λ₁α₁ + λ₂α₂)S = −((λ₁α₁ + λ₂α₂)T)
= (−(λ₁α₁ + λ₂α₂))T  (Lemma 41.5(2))
= ((−λ₁)α₁ + (−λ₂)α₂)T
= ((−λ₁)α₁)T + ((−λ₂)α₂)T
= (−λ₁)(α₁T) + (−λ₂)(α₂T)
= λ₁(−(α₁T)) + λ₂(−(α₂T))
= λ₁(α₁S) + λ₂(α₂S)
for all λ₁,λ₂ ∈ K and α₁,α₂ ∈ V. Thus S is in LK(V,W) and S is a right inverse
of T.

(v) Finally, T + S = S + T for any T,S ∈ LK(V,W), because
α(T + S) = αT + αS = αS + αT = α(S + T)
for all α ∈ V. Hence LK(V,W) is a commutative group under addition.

Now the properties of multiplication by scalars. First we note that λT is
in LK(V,W) whenever λ ∈ K and T ∈ LK(V,W), because
(α₁ + α₂)(λT) = (λ(α₁ + α₂))T = (λα₁ + λα₂)T = (λα₁)T + (λα₂)T = α₁(λT) + α₂(λT)
for all α₁,α₂ ∈ V, so that λT is additive, and
(κα)(λT) = (λ(κα))T = ((λκ)α)T = ((κλ)α)T = (κ(λα))T = κ((λα)T) = κ(α(λT))
for all κ ∈ K, α ∈ V, so that λT is homogeneous. So λT belongs to LK(V,W).

(1) λ(T + S) = λT + λS for all λ ∈ K and T,S ∈ LK(V,W) since
α(λ(T + S)) = (λα)(T + S) = (λα)T + (λα)S = α(λT) + α(λS) = α(λT + λS)
for any α ∈ V.

(2) (λ + μ)T = λT + μT for all λ,μ ∈ K and T ∈ LK(V,W) since
α((λ + μ)T) = ((λ + μ)α)T = (λα + μα)T
= (λα)T + (μα)T = α(λT) + α(μT) = α(λT + μT)
for any α ∈ V.

(3) (λμ)T = λ(μT) for all λ,μ ∈ K and T ∈ LK(V,W) since
α((λμ)T) = ((λμ)α)T = ((μλ)α)T = (μ(λα))T = (λα)(μT) = α(λ(μT))
for any α ∈ V.

(4) 1T = T for all T ∈ LK(V,W) since
α(1T) = (1α)T = αT
for any α ∈ V.

Thus LK(V,W) is a K-vector space. □
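The pointwise operations of Theorem 43.1 can be mimicked with ordinary functions. A small illustrative sketch (with maps written on the left as Python calls rather than on the right as in the text; the sample maps T, S are not from the text) showing that both readings of λT agree for a linear T:

```python
# Linear maps on R^2 represented as Python functions on tuples; the
# operations below mirror the definitions a(T+S) = aT + aS and a(λT) = (λa)T.
def add(T, S):
    return lambda v: tuple(x + y for x, y in zip(T(v), S(v)))

def smul(lam, T):
    return lambda v: T(tuple(lam * x for x in v))   # first multiply by λ, then apply T

T = lambda v: (v[0] + v[1], v[1])        # a sample linear map
S = lambda v: (v[0], 2 * v[0] - v[1])    # another sample linear map

v = (3, 5)
assert add(T, S)(v) == (11, 6)                       # T(v) = (8,5), S(v) = (3,1)
assert smul(2, T)(v) == T((6, 10))                   # (16, 10)
assert smul(2, T)(v) == tuple(2 * x for x in T(v))   # linearity of T: both readings agree
```

The last assertion is exactly the remark in the text: since T is linear, (λα)T = λ(αT), so the two candidate definitions of λT coincide.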

Let us assume now that V is an n-dimensional K-vector space and W is
an m-dimensional K-vector space (n,m ∈ ℕ). Let B = {α₁, α₂, ..., αₙ} be a K-
basis of V and B′ = {β₁, β₂, ..., βₘ} be a K-basis of W. Then, for any linear
transformation T in LK(V,W), we have

α₁T = a₁₁β₁ + a₁₂β₂ + ... + a₁ₘβₘ
α₂T = a₂₁β₁ + a₂₂β₂ + ... + a₂ₘβₘ        (*)
........................
αₙT = aₙ₁β₁ + aₙ₂β₂ + ... + aₙₘβₘ

where a₁₁, a₁₂, ..., aₙₘ are scalars in K. The arrangement of the scalars in
(*) deserves a name.
43.2 Definition: Let K be a field and n,m ∈ ℕ. An n by m matrix over K
is an array

 ⎛ a₁₁ a₁₂ ... a₁ₘ ⎞
 ⎜ a₂₁ a₂₂ ... a₂ₘ ⎟        (1)
 ⎜ ............... ⎟
 ⎝ aₙ₁ aₙ₂ ... aₙₘ ⎠

of nm elements a₁₁, a₁₂, ..., aₙₘ of K, arranged in n rows and m columns,
and enclosed within parentheses. The set of all n by m matrices over K
will be denoted by Matn×m(K).

Sometimes we write "n × m" instead of "n by m". The horizontal lines
aᵢ₁ aᵢ₂ ... aᵢₘ        (2)
of a matrix over K are called the rows of that matrix. More specifically,
(2) is the i-th row of the matrix (1). The vertical lines

 a₁ⱼ
 a₂ⱼ
 ...        (3)
 aₙⱼ

of a matrix over K are called the columns of that matrix. More
specifically, (3) is the j-th column of the matrix (1). The element aᵢⱼ is at
the place where the i-th row and the j-th column meet. The first index i
refers to the row, the second index j refers to the column. Also, in the
expression "n by m", the first number n specifies the number of rows,
the second number m specifies the number of columns of the matrix.
The elements aᵢⱼ are called the entries of the matrix (1). When n = m, the
matrix (1) is said to be a square matrix. The set of all square matrices
with n rows (or n columns) over K will be denoted by Matn(K) (instead
of Matn×n(K)).

We will usually abbreviate the matrix (1) as (aᵢⱼ).

Two matrices (aᵢⱼ) ∈ Matn×m(K) and (bᵢⱼ) ∈ Matn′×m′(K) are declared to be
equal if n = n′, m = m′ and aᵢⱼ = bᵢⱼ for all i = 1,2, ...,n; j = 1,2, ...,m. Thus
two matrices are equal if and only if they have the same number of
rows and columns, and have the same elements at corresponding places.
We write then (aᵢⱼ) = (bᵢⱼ). Otherwise, we put (aᵢⱼ) ≠ (bᵢⱼ).

We now make Matn m(K) into a vector space over K.

43.3 Definition: Let K be a field, λ ∈ K and let A,B ∈ Matn×m(K), say A =
(aᵢⱼ), B = (bᵢⱼ). We write
A + B = C, C being the matrix (cᵢⱼ) in Matn×m(K), where cᵢⱼ = aᵢⱼ + bᵢⱼ,
and λA = E, E being the matrix (eᵢⱼ) in Matn×m(K), where eᵢⱼ = λaᵢⱼ.
In other words, (aᵢⱼ) + (bᵢⱼ) = (aᵢⱼ + bᵢⱼ) and λ(aᵢⱼ) = (λaᵢⱼ).

43.4 Theorem: Let K be a field. Under the addition and multiplication
by scalars of Definition 43.3, the set Matn×m(K) is a vector space over K.

Proof: First we check that Matn×m(K) is an abelian group under addition.

(i) For any A = (aᵢⱼ), B = (bᵢⱼ) in Matn×m(K), we have A + B
= (aᵢⱼ + bᵢⱼ) and each aᵢⱼ + bᵢⱼ is an element of K, because K is closed under
addition (i = 1,2, ...,n; j = 1,2, ...,m). Hence A + B ∈ Matn×m(K) and
Matn×m(K) is closed under addition.

(ii) For any A = (aᵢⱼ), B = (bᵢⱼ), C = (cᵢⱼ) in Matn×m(K), there holds
(A + B) + C = ((aᵢⱼ) + (bᵢⱼ)) + (cᵢⱼ) = (aᵢⱼ + bᵢⱼ) + (cᵢⱼ) = ((aᵢⱼ + bᵢⱼ) + cᵢⱼ)
= (aᵢⱼ + (bᵢⱼ + cᵢⱼ)) = (aᵢⱼ) + (bᵢⱼ + cᵢⱼ) = (aᵢⱼ) + ((bᵢⱼ) + (cᵢⱼ)) = A + (B + C)
and addition in Matn×m(K) is associative.

(iii) Let 0 be the n × m matrix whose entries are all equal to
the zero element of K. Thus 0 = (zᵢⱼ), where zᵢⱼ = 0 ∈ K for all i,j. Then
A + 0 = (aᵢⱼ) + (zᵢⱼ) = (aᵢⱼ + zᵢⱼ) = (aᵢⱼ + 0) = (aᵢⱼ) = A
for any A = (aᵢⱼ) ∈ Matn×m(K). So 0 ∈ Matn×m(K) and 0 is a right identity of
Matn×m(K).

(iv) For any A = (aᵢⱼ) ∈ Matn×m(K), let B = (−aᵢⱼ) ∈ Matn×m(K).
Then A + B = (aᵢⱼ) + (−aᵢⱼ) = (aᵢⱼ + (−aᵢⱼ)) = 0. Hence every element A = (aᵢⱼ) in
Matn×m(K) has an inverse (−aᵢⱼ) in Matn×m(K).

(v) For all A = (aᵢⱼ), B = (bᵢⱼ) ∈ Matn×m(K), we have
A + B = (aᵢⱼ) + (bᵢⱼ) = (aᵢⱼ + bᵢⱼ) = (bᵢⱼ + aᵢⱼ) = (bᵢⱼ) + (aᵢⱼ) = B + A
and addition on Matn×m(K) is commutative.

This proves that Matn×m(K) is an abelian group under addition. Now the
properties of multiplication by scalars. For any λ,μ ∈ K and A = (aᵢⱼ), B =
(bᵢⱼ) ∈ Matn×m(K), we have

(1) λ(A + B) = λ((aᵢⱼ) + (bᵢⱼ)) = λ(aᵢⱼ + bᵢⱼ) = (λ(aᵢⱼ + bᵢⱼ))
= (λaᵢⱼ + λbᵢⱼ) = (λaᵢⱼ) + (λbᵢⱼ)
= λ(aᵢⱼ) + λ(bᵢⱼ) = λA + λB,

(2) (λ + μ)A = (λ + μ)(aᵢⱼ) = ((λ + μ)aᵢⱼ) = (λaᵢⱼ + μaᵢⱼ)
= (λaᵢⱼ) + (μaᵢⱼ) = λ(aᵢⱼ) + μ(aᵢⱼ) = λA + μA,

(3) (λμ)A = (λμ)(aᵢⱼ) = ((λμ)aᵢⱼ) = (λ(μaᵢⱼ)) = λ(μaᵢⱼ)
= λ(μ(aᵢⱼ)) = λ(μA),

(4) 1A = 1(aᵢⱼ) = (1aᵢⱼ) = (aᵢⱼ) = A.

Thus Matn×m(K) is a vector space over K. □

A convenient K-basis of Matn×m(K) is described in the next lemma.

43.5 Lemma: Let Eᵢⱼ be the matrix in Matn×m(K) all of whose entries are
0, except for the single entry in the i-th row, j-th column, which entry is
the identity element of K. Then the nm matrices Eᵢⱼ (where i = 1,2, ...,n;
j = 1,2, ...,m) form a K-basis of Matn×m(K). In particular, dimK(Matn×m(K))
is equal to nm.

Proof: The matrices Eᵢⱼ span Matn×m(K) over K because any A = (aᵢⱼ) in
Matn×m(K) can be written as a K-linear combination

A = (aᵢⱼ) = Σᵢ,ⱼ aᵢⱼEᵢⱼ

of them. Moreover, the matrices Eᵢⱼ are linearly independent over K, for if λᵢⱼ
are scalars such that

Σᵢ,ⱼ λᵢⱼEᵢⱼ = 0,

then (λᵢⱼ) = 0
and λᵢⱼ = 0 for all i,j. Therefore {Eᵢⱼ: i = 1,2, ...,n; j = 1,2, ...,m} is a K-basis
of Matn×m(K). In particular, dimK(Matn×m(K)) = nm. □
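The decomposition A = Σᵢ,ⱼ aᵢⱼEᵢⱼ used in the proof of Lemma 43.5 can be spelled out for a small matrix (the 2 × 3 matrix A below is an illustrative choice):

```python
# Decompose a 2x3 matrix as a linear combination of the matrices E_ij of
# Lemma 43.5 (E_ij has a single entry 1 in row i, column j, and 0 elsewhere).
n, m = 2, 3

def E(i, j):
    return [[1 if (r, c) == (i, j) else 0 for c in range(m)] for r in range(n)]

A = [[4, 0, 7],
     [1, 2, 5]]

# Accumulate the sum over i,j of A[i][j] * E_ij, entry by entry:
S = [[0] * m for _ in range(n)]
for i in range(n):
    for j in range(m):
        for r in range(n):
            for c in range(m):
                S[r][c] += A[i][j] * E(i, j)[r][c]
assert S == A   # A really is the combination sum_{i,j} a_ij E_ij
```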

We relate LK(V,W) to Matn×m(K). This relation is implicit in (*). We state
this relation as a definition and prove that LK(V,W) and Matn×m(K) are
isomorphic K-vector spaces.

43.6 Definition: Let V be an n-dimensional and W be an m-dimension-
al vector space over a field K, where n,m ∈ ℕ. Let B = {α₁, α₂, ..., αₙ} be a
K-basis of V and B′ = {β₁, β₂, ..., βₘ} be a K-basis of W.
Let T be a K-linear transformation in LK(V,W) and let

αᵢT = Σⱼ₌₁ᵐ aᵢⱼβⱼ  (i = 1,2, ...,n),   (*)

where aᵢⱼ ∈ K.
The n × m matrix (aᵢⱼ) over K will be called the matrix associated with T
(relative to the bases B and B′), and will be written M_B′^B(T).

In the following discussion, the bases will be fixed and we simply write
M(T) instead of M_B′^B(T). The role of the bases will be discussed at the end
of this paragraph.

43.7 Theorem: Let V be an n-dimensional and W be an m-dimensional
vector space over a field K, where n,m ∈ ℕ. Then, for any T,S ∈ LK(V,W)
and λ ∈ K, we have
M(T + S) = M(T) + M(S) and M(λT) = λM(T)
(all associated matrices are taken relative to the same pair of K-bases).
In other words, M: LK(V,W) → Matn×m(K) is a K-linear transformation.

Proof: Let B = {α₁, α₂, ..., αₙ} be the K-basis of V and B′ = {β₁, β₂, ..., βₘ}
be the K-basis of W relative to which the associated matrices are taken,
so that M(T) = (aᵢⱼ) and M(S) = (bᵢⱼ), where

αᵢT = Σⱼ₌₁ᵐ aᵢⱼβⱼ and αᵢS = Σⱼ₌₁ᵐ bᵢⱼβⱼ.

Then αᵢ(T + S) = αᵢT + αᵢS = Σⱼ₌₁ᵐ aᵢⱼβⱼ + Σⱼ₌₁ᵐ bᵢⱼβⱼ = Σⱼ₌₁ᵐ (aᵢⱼ + bᵢⱼ)βⱼ

and therefore M(T + S) = (aᵢⱼ + bᵢⱼ) = (aᵢⱼ) + (bᵢⱼ) = M(T) + M(S). Also

αᵢ(λT) = (λαᵢ)T = λ(αᵢT) = λΣⱼ₌₁ᵐ aᵢⱼβⱼ = Σⱼ₌₁ᵐ (λaᵢⱼ)βⱼ

and therefore M(λT) = (λaᵢⱼ) = λ(aᵢⱼ) = λM(T). □

43.8 Theorem: Let V be an n-dimensional and W be an m-dimensional
vector space over a field K, where n,m ∈ ℕ. Then LK(V,W) is isomorphic
to Matn×m(K) (as K-vector spaces). In particular, dimK LK(V,W) = nm.

Proof: We know that M: LK(V,W) → Matn×m(K) (in the notation of Theo-
rem 43.7) is a vector space homomorphism. We will prove that M is in
fact an isomorphism, that is, that M is one-to-one and onto.

Let (aᵢⱼ) be any matrix in Matn×m(K). We want to find a T in LK(V,W) such
that M(T) = (aᵢⱼ). Such a K-linear transformation T should satisfy

αᵢT = Σⱼ₌₁ᵐ aᵢⱼβⱼ

and (Σᵢ₌₁ⁿ λᵢαᵢ)T = Σᵢ₌₁ⁿ λᵢ(αᵢT) = Σᵢ₌₁ⁿ λᵢ Σⱼ₌₁ᵐ aᵢⱼβⱼ = Σⱼ₌₁ᵐ (Σᵢ₌₁ⁿ λᵢaᵢⱼ)βⱼ

for any vector α = λ₁α₁ + λ₂α₂ + ... + λₙαₙ in V. Thus there is at most one T
in LK(V,W) with M(T) = (aᵢⱼ). Hence M is one-to-one.

With the hindsight gained from the chain of equations above, given any
(aᵢⱼ) in Matn×m(K), we define a function T: V → W by

(Σᵢ₌₁ⁿ λᵢαᵢ)T = Σⱼ₌₁ᵐ (Σᵢ₌₁ⁿ λᵢaᵢⱼ)βⱼ.

Then, for any λ,μ ∈ K and α = Σᵢ₌₁ⁿ λᵢαᵢ, α′ = Σᵢ₌₁ⁿ μᵢαᵢ ∈ V, we have

(λα + μα′)T = (λΣᵢ₌₁ⁿ λᵢαᵢ + μΣᵢ₌₁ⁿ μᵢαᵢ)T = (Σᵢ₌₁ⁿ (λλᵢ + μμᵢ)αᵢ)T
= Σⱼ₌₁ᵐ (Σᵢ₌₁ⁿ (λλᵢ + μμᵢ)aᵢⱼ)βⱼ = Σⱼ₌₁ᵐ (λΣᵢ₌₁ⁿ λᵢaᵢⱼ + μΣᵢ₌₁ⁿ μᵢaᵢⱼ)βⱼ
= λΣⱼ₌₁ᵐ (Σᵢ₌₁ⁿ λᵢaᵢⱼ)βⱼ + μΣⱼ₌₁ᵐ (Σᵢ₌₁ⁿ μᵢaᵢⱼ)βⱼ = λ(αT) + μ(α′T)

and T is K-linear. Thus T ∈ LK(V,W). When we put λᵢ₀ = 1 and λᵢ = 0 for
i ≠ i₀, we obtain

αᵢ₀T = Σⱼ₌₁ᵐ aᵢ₀ⱼβⱼ  (i₀ = 1,2, ...,n),

so M(T) = (aᵢⱼ). Thus every (aᵢⱼ) ∈ Matn×m(K) is the image, under M, of at
least one T ∈ LK(V,W) and so M is onto. Consequently M is a vector space
isomorphism: LK(V,W) ≅ Matn×m(K). From Theorem 42.18 and Lemma
43.5, we get dimK LK(V,W) = dimK Matn×m(K) = nm. □

Now let U be a vector space over the field K, with dimK U = k ∈ ℕ, and let
B′′ = {γ₁, γ₂, ..., γₖ} be a basis of U over K. If T: V → W and S: W → U are K-
linear transformations, whose associated matrices [relative to the K-
bases B = {α₁, α₂, ..., αₙ}, B′ = {β₁, β₂, ..., βₘ} of V and W, and relative to the
K-bases B′, B′′ of W and U] are (aᵢⱼ) ∈ Matn×m(K) and (bⱼₗ) ∈ Matm×k(K), so
that

αᵢT = Σⱼ₌₁ᵐ aᵢⱼβⱼ,  βⱼS = Σₗ₌₁ᵏ bⱼₗγₗ,

then TS: V → U is a K-linear transformation (Theorem 41.7) and

αᵢ(TS) = (αᵢT)S = (Σⱼ₌₁ᵐ aᵢⱼβⱼ)S = Σⱼ₌₁ᵐ aᵢⱼ(βⱼS)
= Σⱼ₌₁ᵐ aᵢⱼ Σₗ₌₁ᵏ bⱼₗγₗ = Σₗ₌₁ᵏ (Σⱼ₌₁ᵐ aᵢⱼbⱼₗ)γₗ,

so that the matrix associated with TS [relative to the K-bases B,B′′] is the
n × k matrix whose i-th row, l-th column entry is Σⱼ₌₁ᵐ aᵢⱼbⱼₗ. This leads us to
the following definition.

43.9 Definition: Let A = (aᵢⱼ) be an n × m matrix and let B = (bⱼₗ) be an
m × k matrix, with entries from a field K. Then the product of A and B,
denoted by AB, is the n × k matrix (cᵢₗ) over K, where cᵢₗ = Σⱼ₌₁ᵐ aᵢⱼbⱼₗ. Stated
otherwise,
(aᵢⱼ)(bⱼₗ) := (Σⱼ₌₁ᵐ aᵢⱼbⱼₗ).
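Definition 43.9 translates directly into code. The function below is a sketch of the defining formula cᵢₗ = Σⱼ aᵢⱼbⱼₗ, with the size condition checked explicitly (the sample matrices are illustrative):

```python
# Matrix product exactly as in Definition 43.9: the (i,l) entry of AB is
# the sum over j of a_ij * b_jl; defined only when A has as many columns
# as B has rows.
def matmul(A, B):
    n, m, k = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][j] * B[j][l] for j in range(m)) for l in range(k)]
            for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6]]            # a 2x3 matrix
B = [[1, 0],
     [0, 1],
     [1, 1]]               # a 3x2 matrix
assert matmul(A, B) == [[4, 5], [10, 11]]   # the product is a 2x2 matrix
```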

Before studying the properties of this matrix multiplication, we summa-
rize the discussion preceding Definition 43.9. Although matrix multipli-
cation is defined in such a way as to make it true, the following theorem
is by no means obvious (cf. Remark 43.18).

43.10 Theorem: Let V,W,U be vector spaces over a field K, of nonzero
finite dimensions n,m,k, respectively. Let B, B′, B′′ be fixed K-bases of
V,W,U, respectively. If, relative to these bases, T ∈ LK(V,W) has the
associated matrix A, and S ∈ LK(W,U) has the associated matrix C, then
TS ∈ LK(V,U) has the associated matrix AC. Equivalently,
M(TS) = M(T)M(S).

The product of an n × m matrix by an m × k matrix is an n × k matrix.
Notice that the number of columns in A has to be equal to the number of
rows in B in order for AB to make sense. The product of an n × m matrix by
an m′ × k matrix is not defined unless m = m′.
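Theorem 43.10 can be checked numerically. With maps written on the right, as in this book, the coordinate row of αT is the coordinate row of α multiplied by M(T), so applying T and then S must agree with a single multiplication by the product M(T)M(S). The matrices below are illustrative choices, not from the text:

```python
# With maps on the right (alpha T), the coordinate row of alpha T is
# (row of alpha) x M(T); Theorem 43.10 then says M(TS) = M(T)M(S).
def matmul(A, B):
    return [[sum(A[i][j] * B[j][l] for j in range(len(B))) for l in range(len(B[0]))]
            for i in range(len(A))]

MT = [[1, 2],
      [0, 1],
      [3, 0]]         # M(T) for some T: V -> W with dim V = 3, dim W = 2
MS = [[1, 1, 0],
      [0, 2, 1]]      # M(S) for some S: W -> U with dim U = 3

row = [[5, 7, 1]]     # coordinate row of a vector alpha in V
via_two_steps = matmul(matmul(row, MT), MS)   # (alpha T) S
via_product = matmul(row, matmul(MT, MS))     # alpha (TS), using M(T)M(S)
assert via_two_steps == via_product
```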

Matrix multiplication is associative whenever it is possible. That is to
say, (AB)C = A(BC) for any matrices A,B,C with entries from a field K,
provided the sizes of A,B,C are such that the products AB and BC are
defined (then (AB)C and A(BC) are defined, too). More precisely, if A is
an n × m matrix, B is an m × k matrix and C is a k × s matrix, then the
two n × s matrices (AB)C, A(BC) are equal. To prove this, let us put A =
(aᵢⱼ), B = (bⱼₗ), C = (cₗᵣ), where i = 1,2, ...,n; j = 1,2, ...,m; l = 1,2, ...,k;
r = 1,2, ...,s. Then

AB = E = (eᵢₗ), where eᵢₗ = Σⱼ₌₁ᵐ aᵢⱼbⱼₗ,

BC = F = (fⱼᵣ), where fⱼᵣ = Σₗ₌₁ᵏ bⱼₗcₗᵣ,

and from

(AB)C = EC = (Σₗ₌₁ᵏ eᵢₗcₗᵣ) = (Σₗ₌₁ᵏ (Σⱼ₌₁ᵐ aᵢⱼbⱼₗ)cₗᵣ) = (Σₗ₌₁ᵏ Σⱼ₌₁ᵐ (aᵢⱼbⱼₗ)cₗᵣ),

A(BC) = AF = (Σⱼ₌₁ᵐ aᵢⱼfⱼᵣ) = (Σⱼ₌₁ᵐ aᵢⱼ(Σₗ₌₁ᵏ bⱼₗcₗᵣ))
= (Σⱼ₌₁ᵐ Σₗ₌₁ᵏ aᵢⱼ(bⱼₗcₗᵣ)) = (Σₗ₌₁ᵏ Σⱼ₌₁ᵐ (aᵢⱼbⱼₗ)cₗᵣ),

we conclude that (AB)C = A(BC).

However, there is no hope for commutativity. For one thing, the product
BA need not be defined even if the product AB happens to be defined.
For instance, if A is a 2 × 3 and B is a 3 × 4 matrix, then AB is a 2 × 4
matrix, but BA is not even defined, let alone equal to AB. But also in
cases where both products AB and BA are defined, they will, generally
speaking, have different sizes, so they will fail to be equal on dimension
grounds. For instance, if A is a 2 × 3 matrix and B is a 3 × 2 matrix, then
AB is a 2 × 2 matrix and BA is a 3 × 3 matrix and AB ≠ BA, since a 2 × 2
matrix cannot be equal to a 3 × 3 matrix. Even if both AB and BA are
defined and have the same size (this occurs only in case A and B are
square matrices with the same number of rows), it usually happens that
AB ≠ BA. For example,
⎛0 0⎞⎛1 0⎞ ≠ ⎛1 0⎞⎛0 0⎞
⎝1 0⎠⎝0 0⎠   ⎝0 0⎠⎝1 0⎠.
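Both associativity and the failure of commutativity are easy to confirm numerically. The 2 × 2 matrices X and Y below are the example from the text; A, B, C are illustrative:

```python
# Numeric check of associativity, and of the 2x2 non-commutativity example.
def matmul(A, B):
    return [[sum(A[i][j] * B[j][l] for j in range(len(B))) for l in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
C = [[2, 0], [1, 3]]
assert matmul(matmul(A, B), C) == matmul(A, matmul(B, C))   # (AB)C = A(BC)

X = [[0, 0], [1, 0]]
Y = [[1, 0], [0, 0]]
assert matmul(X, Y) != matmul(Y, X)   # AB != BA in general
```

Indeed XY has a single nonzero entry while YX is the zero matrix.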

Let Iₘ be the square matrix over K with m rows, whose entries are all
equal to 0 ∈ K, except for those on the main diagonal, which are all equal
to 1 ∈ K (the main diagonal in any n × m matrix consists of the places
where the i-th row and the i-th column intersect (i = 1,2, ...,min{n,m})).
It is easily verified that AIₘ = A for any A ∈ Matn×m(K). Likewise IₙA = A
for any A ∈ Matn×m(K).

Let 0m×k be the m × k matrix over K all of whose entries are 0 ∈ K. One
checks easily that A0m×k = 0n×k and 0k×nA = 0k×m for any A ∈ Matn×m(K).

Multiplication of matrices is distributive over addition. Indeed, for all A
= (aᵢⱼ) ∈ Matn×m(K) and B = (bⱼₗ), C = (cⱼₗ) ∈ Matm×k(K), we have

A(B + C) = (aᵢⱼ)(bⱼₗ + cⱼₗ) = (Σⱼ₌₁ᵐ aᵢⱼ(bⱼₗ + cⱼₗ)) = (Σⱼ₌₁ᵐ (aᵢⱼbⱼₗ + aᵢⱼcⱼₗ))
= (Σⱼ₌₁ᵐ aᵢⱼbⱼₗ + Σⱼ₌₁ᵐ aᵢⱼcⱼₗ) = (Σⱼ₌₁ᵐ aᵢⱼbⱼₗ) + (Σⱼ₌₁ᵐ aᵢⱼcⱼₗ) = AB + AC.

In like manner, one proves (B + C)A = BA + CA for all A ∈ Matn×k(K) and
B,C ∈ Matm×n(K).

One checks easily that, for all λ ∈ K, A ∈ Matn×m(K), B ∈ Matm×k(K),
(λA)B = λ(AB) = A(λB).   (e)

On writing A = (aᵢⱼ), B = (bⱼₗ), we get indeed

(λA)B = (λaᵢⱼ)(bⱼₗ) = (Σⱼ₌₁ᵐ (λaᵢⱼ)bⱼₗ) = (λΣⱼ₌₁ᵐ aᵢⱼbⱼₗ) = λ(Σⱼ₌₁ᵐ aᵢⱼbⱼₗ) = λ(AB)

and

A(λB) = (aᵢⱼ)(λbⱼₗ) = (Σⱼ₌₁ᵐ aᵢⱼ(λbⱼₗ)) = (λΣⱼ₌₁ᵐ aᵢⱼbⱼₗ) = λ(Σⱼ₌₁ᵐ aᵢⱼbⱼₗ) = λ(AB).

Let us now consider the set Matn(K) of square matrices over a field K.
From Theorem 43.4, we know that Matn(K) is an abelian group under
addition. The product of any two n × n matrices is an n × n matrix. Since
matrix multiplication is associative and distributive over addition,
Matn(K) is a ring. We also know that AIₙ = IₙA = A for any n × n matrix
A. Thus we have proved:

43.11 Theorem: Let K be a field and n ∈ ℕ. Then, under matrix
addition and matrix multiplication, Matn(K) is a ring with identity Iₙ.

The counterpart of Theorem 43.11 for linear transformations is also
valid.

43.12 Theorem: Let V be a vector space over a field K and let LK(V,V)
be the set of all K-linear mappings from V into V. Then, under the point-
wise addition and composition of K-linear transformations, LK(V,V) is a
ring with identity. The identity mapping ι: V → V is the identity element
of this ring LK(V,V). [Notice that there is no hypothesis about dimK V.]

Proof: We must check the ring axioms. From Theorem 43.1, we know
that LK(V,V) is an abelian group under addition. Also, (1) LK(V,V) is
closed under the composition of mappings (Theorem 41.7), and (2)
composition of mappings (whether K-linear or not) is associative
(Theorem 3.10), and (D) composition is distributive over addition: when
T,S,R are arbitrary elements of LK(V,V), then

α(T(S + R)) = (αT)(S + R) = ((αT)S) + ((αT)R) = (α(TS)) + (α(TR)) = α(TS + TR)

and α((S + R)T) = (α(S + R))T = (αS + αR)T = (αS)T + (αR)T = α(ST) + α(RT)
= α(ST + RT)

for all α ∈ V, hence T(S + R) = TS + TR and (S + R)T = ST + RT. So LK(V,V)
is a ring. Finally, the identity mapping ι is clearly a K-linear transforma-
tion, so ι ∈ LK(V,V), and as Tι = T = ιT for all T ∈ LK(V,V), we conclude
that LK(V,V) is a ring with identity ι. □

43.13 Theorem: Let V be a vector space over a field K with dimK V = n,
where n ∈ ℕ. Then LK(V,V) ≅ Matn(K) (ring isomorphism).

Proof: We fix a K-basis of V and use the mapping M: LK(V,V) → Matn(K)
of Theorem 43.7, so that M(T) is the associated matrix of the K-linear
transformation T ∈ LK(V,V). By Theorem 43.8, M is an isomorphism of
abelian groups (in fact of K-vector spaces, but we do not need this now)
and by Theorem 43.10, M preserves multiplication as well. Hence M is a
ring isomorphism. □

Let us recall that a unit in a ring with identity is an element of that ring
possessing a (unique) right inverse which is also a left inverse. What are
the units of LK(V,V)? The units in LK(V,V) are, by definition, those K-
linear transformations T with an inverse T⁻¹ in LK(V,V). The inverse of
T, whenever it exists, is in LK(V,V) by Lemma 41.10(2). Thus
the units of LK(V,V) are the K-linear transformations in LK(V,V) which
are one-to-one and onto: the units of LK(V,V) are the vector space
isomorphisms from V onto V. The set of all isomorphisms from V onto V
will be denoted by GL(V). This is a group under the composition of
mappings, called the general linear group of V. Thus the group of units
of LK(V,V) is GL(V).

The units in Matn(K) are the invertible matrices, that is to say, matrices
A in Matn(K) for which an A⁻¹ ∈ Matn(K) exists such that AA⁻¹ = Iₙ = A⁻¹A.
These are the matrices associated with isomorphisms from V onto V. In
the next paragraph, we will give a necessary and sufficient condition for
a matrix to be invertible (Theorem 44.20). The set of all invertible
matrices in Matn(K) will be denoted by GL(n,K). This is a group under the
multiplication of matrices, called the general linear group of degree n
over K. Thus the group of units of Matn(K) is GL(n,K). When V is an
n-dimensional vector space over K, the group GL(V) is isomorphic to the
group GL(n,K).

We return to the more general case of LK(V,W) and Matn×m(K). Suppose
again that V and W are K-vector spaces of K-dimensions n and m,
respectively, where n,m ∈ ℕ. Let B = {α₁, α₂, ..., αₙ} and B* = {α₁*, α₂*, ..., αₙ*}
be K-bases of V and let B′ = {β₁, β₂, ..., βₘ} and B′* = {β₁*, β₂*, ..., βₘ*} be K-
bases of W. With each K-linear transformation T: V → W, there is
associated a matrix M_B′^B(T) relative to the bases B and B′, and a matrix
M_B′*^B*(T) relative to the bases B* and B′*. We want to study the
relationship between M_B′^B(T) and M_B′*^B*(T).

We recall that M_B′^B(T) = (aᵢⱼ) and M_B′*^B*(T) = (aᵢⱼ*), where

αᵢT = Σⱼ₌₁ᵐ aᵢⱼβⱼ and αᵢ*T = Σⱼ₌₁ᵐ aᵢⱼ*βⱼ*  (i = 1,2, ...,n).

We introduce transition matrices which describe the change of bases.
Writing

αᵢ = Σₖ₌₁ⁿ pᵢₖαₖ*  (i = 1,2, ...,n),

we obtain a matrix (pᵢₖ) in Matn(K), called the transition matrix from the
K-basis B to the K-basis B* of V. Of course, (pᵢₖ) = M_B*^B(ι), where ι is the
identity mapping on V. We have the schema

        ι         ι
    V ———→ V ———→ V          vector spaces and mappings
    B*        B         B*           bases
     M_B^B*(ι)   M_B*^B(ι)       matrices

Now the composition ιι is the identity mapping ι. Relative to the bases B*
and B*, the matrix associated with ι is the identity matrix Iₙ. This matrix
is also equal to M_B^B*(ι)M_B*^B(ι) by Theorem 43.10: M_B^B*(ι)M_B*^B(ι) = Iₙ. Thus
the transition matrix from B to B* is the inverse of the transition matrix
from B* to B:

    M_B*^B(ι) = (M_B^B*(ι))⁻¹.

43.14 Theorem: With the foregoing notation, let P be the transition
matrix from B to B*, and let Q be the transition matrix from B′ to B′*. If
T: V → W is any K-linear mapping, then the matrices M_B′^B(T) and M_B′*^B*(T)
are connected by
M_B′*^B*(T) = P⁻¹M_B′^B(T)Q.

Proof: We have the following schema:

$$\begin{array}{ccc}
 & T & \\
V & \longrightarrow & W \\
B & & B' \\
 & M_{B'}^{B}(T) &
\end{array}$$

The $K$-linear transformation $T$ can be described also as follows:

$$\begin{array}{ccccccc}
 & \iota_V & & T & & \iota_W & \\
V & \longrightarrow & V & \longrightarrow & W & \longrightarrow & W \\
B^* & & B & & B' & & B'^* \\
 & M_{B}^{B^*}(\iota_V) & & M_{B'}^{B}(T) & & M_{B'^*}^{B'}(\iota_W) &
\end{array}$$

Now $M_{B}^{B^*}(\iota_V) = \bigl[M_{B^*}^{B}(\iota_V)\bigr]^{-1} = P^{-1}$ and $M_{B'^*}^{B'}(\iota_W) = Q$. By Theorem 43.10, the
matrix associated with $T$ relative to the bases $B^*$ and $B'^*$ is $P^{-1}\,M_{B'}^{B}(T)\,Q$.
Hence $M_{B'^*}^{B^*}(T) = P^{-1}\,M_{B'}^{B}(T)\,Q$:

$$\begin{array}{ccccccc}
 & \iota_V & & T & & \iota_W & \\
V & \longrightarrow & V & \longrightarrow & W & \longrightarrow & W \\
B^* & & B & & B' & & B'^* \\
 & P^{-1} & & M_{B'}^{B}(T) & & Q &
\end{array}$$

43.15 Theorem: Let $V$ be an $n$-dimensional vector space over a field $K$,
let $B$ and $B^*$ be $K$-bases of $V$, and let $P$ be the transition matrix from $B$ to $B^*$.
Suppose $T$ is any $K$-linear mapping from $V$ into $V$. If $M(T)$ is the matrix
associated with $T$ relative to the bases $B$ and $B$, and if $M^*(T)$ is the
matrix associated with $T$ relative to the bases $B^*$ and $B^*$, then
$$M^*(T) = P^{-1}\,M(T)\,P.$$

Proof: This is a special case of Theorem 43.14. Using the diagram

$$\begin{array}{ccccccc}
 & \iota & & T & & \iota & \\
V & \longrightarrow & V & \longrightarrow & V & \longrightarrow & V \\
B^* & & B & & B & & B^* \\
 & P^{-1} & & M(T) & & P &
\end{array}$$

the proof follows immediately from Theorem 43.10.
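Theorem 43.15 can be checked on a concrete example. The sketch below is our illustration, not the book's: plain Python with exact `Fraction` arithmetic, following the book's convention of writing maps on the right (so the coordinate row of $v_iT$ is the $i$-th row of the associated matrix). The helper names `mul` and `inv2` are assumptions of this sketch.

```python
from fractions import Fraction as F

def mul(A, B):
    """Matrix product: (i,j) entry is sum_k A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(A):
    """Inverse of an invertible 2x2 matrix over Q, via the adjugate."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Basis B = {e1, e2} (standard) and basis B* = {u1, u2}; U holds u1, u2 as rows.
U = [[F(1), F(1)], [F(0), F(1)]]     # u1 = (1,1), u2 = (0,1)
M = [[F(2), F(1)], [F(3), F(4)]]     # M(T) relative to B and B

# e_i = sum_k pi_ik u_k, so the transition matrix P from B to B* is U^{-1}.
P = inv2(U)
M_star = mul(inv2(P), mul(M, P))     # M*(T) = P^{-1} M(T) P

# The coordinates of u_i T agree, whether computed in B or via M* in B*.
assert mul(U, M) == mul(M_star, U)
```

The final assertion says exactly that both matrices describe the same transformation, only relative to different bases.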

43.16 Definition: Let $K$ be a field and let $A = (\alpha_{ij})$ be an $n\times m$ matrix
with entries from $K$. Then the $m\times n$ matrix $(\beta_{ij})$, where $\beta_{ij} = \alpha_{ji}$ for all $i,j$, is
called the transpose of $A$, and is written $A^t$.

Hence $A^t$ is obtained from $A$ by changing rows to columns and columns
to rows. For instance, the transpose of
$$\begin{pmatrix} 0 & 1 & 2 \\ 1 & 3 & 5 \end{pmatrix} \quad\text{is}\quad \begin{pmatrix} 0 & 1 \\ 1 & 3 \\ 2 & 5 \end{pmatrix}.$$
It follows from the definition that $(A^t)^t = A$ for any matrix $A$.

43.17 Lemma: Let $K$ be a field and let $A,B \in \mathrm{Mat}_{n\times m}(K)$, $C \in \mathrm{Mat}_{m\times k}(K)$,
$\lambda \in K$. Then
$$(A + B)^t = A^t + B^t, \qquad (\lambda A)^t = \lambda(A^t), \qquad (AC)^t = C^t A^t.$$

Proof: Let $A = (\alpha_{ij})$, $B = (\beta_{ij})$ and $C = (\gamma_{jl})$, where $i = 1,2,\ldots,n$; $j = 1,2,\ldots,m$
and $l = 1,2,\ldots,k$. Let us put $A^t = (\alpha'_{ij})$, $B^t = (\beta'_{ij})$, $C^t = (\gamma'_{ij})$ and $\lambda A = (\delta_{ij})$. Then
$\alpha'_{ij} = \alpha_{ji}$, $\beta'_{ij} = \beta_{ji}$, $\gamma'_{ij} = \gamma_{ji}$ and $\delta_{ij} = \lambda\alpha_{ij}$ for all $i,j$. So

$(A + B)^t = ((\alpha_{ij}) + (\beta_{ij}))^t = (\alpha_{ij} + \beta_{ij})^t$
 $=$ matrix whose $i$-th row, $j$-th column entry is $\alpha_{ji} + \beta_{ji}$
 $=$ (matrix whose $i$-th row, $j$-th column entry is $\alpha_{ji}$)
 $+$ (matrix whose $i$-th row, $j$-th column entry is $\beta_{ji}$)
 $= (\alpha'_{ij}) + (\beta'_{ij}) = A^t + B^t$,

$(\lambda A)^t = (\delta_{ij})^t =$ matrix whose $i$-th row, $j$-th column entry is $\delta_{ji}$
 $=$ matrix whose $i$-th row, $j$-th column entry is $\lambda\alpha_{ji}$
 $= \lambda$(matrix whose $i$-th row, $j$-th column entry is $\alpha_{ji}$)
 $= \lambda(\alpha'_{ij}) = \lambda A^t$,

$(AC)^t =$ matrix whose $l$-th row, $i$-th column entry is the $i$-th
row, $l$-th column entry in $AC$
 $=$ matrix whose $l$-th row, $i$-th column entry is $\sum_{j=1}^{m} \alpha_{ij}\gamma_{jl}$
 $=$ matrix whose $l$-th row, $i$-th column entry is $\sum_{j=1}^{m} \gamma_{jl}\,\alpha_{ij}$
 $=$ matrix whose $l$-th row, $i$-th column entry is $\sum_{j=1}^{m} \gamma'_{lj}\,\alpha'_{ji}$
 $=$ (matrix whose $l$-th row, $j$-th column entry is $\gamma'_{lj}$) times
 (matrix whose $j$-th row, $i$-th column entry is $\alpha'_{ji}$)
 $= (\gamma'_{ij})(\alpha'_{ij}) = C^t A^t$.
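The three identities of Lemma 43.17 are easy to test on small integer matrices. The sketch below is ours (matrices as lists of rows); it reuses the $2\times 3$ matrix from the example after Definition 43.16.

```python
def transpose(A):
    """A^t: rows become columns."""
    return [list(row) for row in zip(*A)]

def add(A, B):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]

def scale(lam, A):
    return [[lam * x for x in row] for row in A]

def mul(A, C):
    return [[sum(A[i][k] * C[k][j] for k in range(len(C)))
             for j in range(len(C[0]))] for i in range(len(A))]

A = [[0, 1, 2], [1, 3, 5]]        # the 2x3 example from the text
B = [[4, 0, 1], [2, 2, 7]]        # another 2x3 matrix
C = [[1, 0], [2, 1], [3, 4]]      # 3x2, so AC is defined

assert transpose(A) == [[0, 1], [1, 3], [2, 5]]
assert transpose(transpose(A)) == A                            # (A^t)^t = A
assert transpose(add(A, B)) == add(transpose(A), transpose(B)) # (A+B)^t
assert transpose(scale(5, A)) == scale(5, transpose(A))        # (lambda A)^t
assert transpose(mul(A, C)) == mul(transpose(C), transpose(A)) # (AC)^t = C^t A^t
```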

43.18 Remark: The results in this paragraph are very natural. All
operations discussed here are natural, and the vector spaces and rings of
this paragraph arise naturally. Another natural item is the isomorphism
in Theorem 43.10.

There is, however, a subtle point here. Theorem 43.10 is true only
because we write the functions on the right of the elements on which
they act! If we had written them on the left, Theorem 43.10 would read:
$M(TS) = M(S)M(T)$. Of course this is not as good as $M(TS) = M(T)M(S)$. For
this reason, people who write functions on the left define the associated
matrices differently. If $T \in L_K(V,W)$ and
$$v_i T = \sum_{j=1}^{m} \alpha_{ij} w_j$$
as in Definition 43.6, they define the matrix associated with $T$ (relative to the fixed $K$-bases
$\{v_1, v_2, \ldots, v_n\}$ of $V$ and $\{w_1, w_2, \ldots, w_m\}$ of $W$) to be $(\alpha_{ij})^t$. Thus their
$M(T)$ is our $M(T)^t$, and
their $M(TS)$ = their $M$(first $S$, then $T$) = our $M$(first $S$, then $T$)$^t$ = our
$M(ST)^t$ = our $(M(S)M(T))^t$ = our $M(T)^tM(S)^t$ = their $M(T)M(S)$,
so that Theorem 43.10 is true in their notation, too. In some books, the
forming of the transpose is included in the notation for the associated
matrix. More clearly, some people write
$$v_i T = \sum_{j=1}^{m} \alpha_{ji} w_j$$
and define the matrix associated with $T$ to be $(\alpha_{ji})$. Then $M(TS) = M(T)M(S)$ as before, but the equations above are not very sensible, for $\alpha_{ji}$
depends primarily on $i$ and secondarily on $j$, so the indices occupy wrong
places. In our notation, there is no need for the artificial transpositions,
nor do we write the indices in the wrong order.

Exercises

1. Compute $A + B$ and $AB$ when
$$A = \begin{pmatrix} 0 & 1 & 3 & 4 \\ 2 & 4 & 2 & 1 \\ 1 & 0 & 3 & 0 \\ 6 & 1 & 1 & 2 \end{pmatrix},\qquad B = \begin{pmatrix} 1 & 7 & 1 & 3 \\ 1 & 0 & 3 & 4 \\ 2 & 0 & 2 & 1 \\ 0 & 2 & 1 & 0 \end{pmatrix} \quad\text{in } \mathrm{Mat}_4(\mathbb{Z}),$$
and when
$$A = \begin{pmatrix} 5 & 3 & 2 & 4 \\ 3 & 4 & 2 & 1 \\ 1 & 2 & 3 & 1 \\ 0 & 4 & 6 & 1 \end{pmatrix},\qquad B = \begin{pmatrix} 1 & 2 & 5 & 1 \\ 0 & 3 & 3 & 6 \\ 2 & 6 & 1 & 4 \\ 0 & 4 & 5 & 0 \end{pmatrix} \quad\text{in } \mathrm{Mat}_4(\mathbb{Z}_7).$$

2. Let $K$ be a field and $A,B \in \mathrm{Mat}_n(K)$ with $AB = BA$. Prove that $(A + B)^2 = A^2 + 2AB + B^2$ and $(A + B)(A - B) = A^2 - B^2$. Show that these equations
need not hold if $AB \neq BA$.

3. Evaluate $A, A^2, A^3, \ldots$, where $A$ is given by
$$A = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \quad\text{and by}\quad A = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
Generalize to square matrices of $n$ rows.

4. Evaluate $A, A^2, A^3, \ldots$, where $A$ is given by
$$A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and by}\quad A = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Generalize to square matrices of $n$ rows.

5. The trace of a matrix $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$ is defined to be the sum of
the entries on the main diagonal of $A$, denoted by $\mathrm{tr}(A)$, so that $\mathrm{tr}(A) = \alpha_{11} + \alpha_{22} + \ldots + \alpha_{nn}$. Prove that $\mathrm{tr}(A) = \mathrm{tr}(A^t)$, that $\mathrm{tr}(AB) = \mathrm{tr}(BA)$ and that
$\mathrm{tr}(C^{-1}AC) = \mathrm{tr}(A)$ for any $A,B \in \mathrm{Mat}_n(K)$, $C \in GL(n,K)$.

6. Let $V,W$ be vector spaces over a field $K$, let $\{v_i : i \in I\}$ be a $K$-basis of $V$
and let $T,S \in L_K(V,W)$. Prove that $T = S$ if and only if $v_i T = v_i S$ for all $i \in I$.

7. Let $V$ be a vector space over a field $K$, with $\dim_K V = n \in \mathbb{N}$. Prove that
$GL(V)$ is isomorphic to $GL(n,K)$.

8. Let $\varphi\colon \mathbb{R}^3 \to \mathbb{R}^3$ be the $\mathbb{R}$-linear mapping for which $e_1\varphi = (1,0,2)$, $e_2\varphi = (0,1,1)$, $e_3\varphi = (1,0,1)$, where, as usual, $e_1 = (1,0,0)$, $e_2 = (0,1,0)$, $e_3 = (0,0,1)$. We put $u_1 = (-1,1,0)$, $u_2 = (1,2,3)$, $u_3 = (0,1,2)$. Let $B = \{e_1,e_2,e_3\}$
and $B^* = \{u_1,u_2,u_3\}$. Show that $B^*$ is an $\mathbb{R}$-basis of $\mathbb{R}^3$ and find the matrix
of the $\mathbb{R}$-linear transformation $\varphi$ relative to the bases (a) $B$ and $B$; (b) $B$
and $B^*$; (c) $B^*$ and $B$; (d) $B^*$ and $B^*$.

§44
Determinants

With each (square) matrix over a field $K$, we associate an element of $K$,
called the determinant of the matrix. In this paragraph, we study the
properties of determinants.

Determinants arise in many contexts. For example, if $a_1,a_2,b_1,b_2$ are elements
of a field $K$ and if the equations
$$a_1x + a_2y = 0$$
$$b_1x + b_2y = 0$$
hold, then $(a_1b_2 - a_2b_1)x = (a_1b_2 - a_2b_1)y = 0$. In §17, we called $a_1b_2 - a_2b_1$
the determinant of the matrix $\begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}$.

Now let $a_3,b_3,c_1,c_2,c_3$ be further elements of $K$. If
$$a_1x + a_2y + a_3z = 0$$
$$b_1x + b_2y + b_3z = 0$$
$$c_1x + c_2y + c_3z = 0,$$
then, multiplying the first equation by $b_2c_3 - b_3c_2$, the second by $a_3c_2 - a_2c_3$, the third by $a_2b_3 - a_3b_2$ and adding them, we get $Dx = 0$, where
$$D = a_1b_2c_3 - a_1b_3c_2 + a_3b_1c_2 - a_2b_1c_3 + a_2b_3c_1 - a_3b_2c_1.$$
One obtains also $Dy = 0$ and $Dz = 0$. Here $D$ is a sum of $6 = 3!$ terms $\pm a_ib_jc_k$,
where $\{i,j,k\} = \{1,2,3\}$ and the sign of $a_ib_jc_k$ is $+$ or $-$ according as $\binom{1\ 2\ 3}{i\ j\ k}$
is an even or odd permutation in $S_3$.

Similarly, when we try to eliminate $x,y,z,u$ from the equations
$$a_1x + a_2y + a_3z + a_4u = 0$$
$$b_1x + b_2y + b_3z + b_4u = 0$$
$$c_1x + c_2y + c_3z + c_4u = 0$$
$$d_1x + d_2y + d_3z + d_4u = 0,$$
we get $D'x = D'y = D'z = D'u = 0$, where $D'$ is a sum of $24 = 4!$ terms
$\pm a_ib_jc_kd_l$, where $\{i,j,k,l\} = \{1,2,3,4\}$ and the sign of $a_ib_jc_kd_l$ is $+$ or $-$
according as $\binom{1\ 2\ 3\ 4}{i\ j\ k\ l}$ is an even or odd permutation in $S_4$.

This pattern continues. The expressions we get in this way are called
determinants. On changing $a$ to 1, $b$ to 2, $c$ to 3, etc., the formal
definition reads as follows.

44.1 Definition: Let $K$ be a field and
$$A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \ldots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \ldots & \alpha_{nn} \end{pmatrix} = (\alpha_{ij})$$
be an $n\times n$ square matrix with entries from $K$. Then the element
$$\sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}$$
of $K$ is called the determinant of the matrix $A$. It will be denoted as
$\det A$, or $\det(A)$, or
$$\begin{vmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \ldots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \ldots & \alpha_{nn} \end{vmatrix}, \quad\text{or}\quad |\alpha_{ij}|.$$

Hence $\det A$ is a sum of $n!$ terms. These summands are obtained from the
product $\alpha_{11}\alpha_{22}\ldots\alpha_{nn}$ of the entries in the main diagonal by permuting the
second indices in all the $n!$ ways and attaching a "$+$" or "$-$" sign according
as the permutation is even or odd. Each summand, aside from its sign, is
the product of $n$ entries of the matrix, the entries being from distinct
rows and distinct columns. The determinant can also be written
$$\sum_{\sigma \in A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma} \;-\; \sum_{\sigma \in S_n\setminus A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}.$$

44.2 Remarks: (1) Determinants are defined for square matrices only.
Nonsquare matrices do not have a determinant. Note that the determinant
of the $1\times 1$ matrix $(\alpha)$ is equal to $\alpha \in K$.

(2) Definition 44.1 makes sense when $K$ is merely a commutative ring.
The theory in this paragraph extends immediately to the case where $K$ is
a commutative ring with identity. We will not need this general theory.
We observe only: when $R$ is a subring of $K$, and all entries of $A \in \mathrm{Mat}_n(K)$
are in $R$, then $\det A$ is in fact an element of $R$.
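Definition 44.1 can be turned directly into a program, at the cost of summing all $n!$ terms. The sketch below is our illustration, not the book's; the sign $\varepsilon(\sigma)$ is computed by counting inversions.

```python
from itertools import permutations
from math import prod

def sign(p):
    """epsilon(p): +1 for an even permutation of (0,...,n-1), -1 for odd."""
    inv = sum(1 for i in range(len(p))
              for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    """Sum over all n! permutations, exactly as in Definition 44.1."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

# 2x2 case recovers the familiar a11*a22 - a12*a21:
assert det([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
# a 3x3 example (6 = 3! signed products):
assert det([[2, 0, 1], [1, 3, 5], [0, 4, 2]]) == -24
```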

Some fundamental properties of determinants are collected in the next


lemmas.

44.3 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. Then $\det A = \det A^t$.
(The determinant does not change when rows are changed to columns.)

Proof: Let $A = (\alpha_{ij})$ and $A^t = (\beta_{ij})$, so that $\beta_{ij} = \alpha_{ji}$ for all $i,j$. Then
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}.$$
As $\sigma$ runs through $S_n$, so does $\sigma^{-1}$. Hence
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma^{-1})\,\alpha_{1,1\sigma^{-1}}\,\alpha_{2,2\sigma^{-1}} \ldots \alpha_{n,n\sigma^{-1}}.$$
Using commutativity of multiplication in $K$, we reorder the factors in
each summand with regard to their second indices and get
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma^{-1})\,\alpha_{1\sigma,1}\,\alpha_{2\sigma,2} \ldots \alpha_{n\sigma,n}.$$
Since $\varepsilon(\sigma^{-1}) = \varepsilon(\sigma)$ for all $\sigma \in S_n$ (in case $n \geq 2$; if $n = 1$, there is nothing
to prove, for then $A = A^t$),
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1\sigma,1}\,\alpha_{2\sigma,2} \ldots \alpha_{n\sigma,n} = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\beta_{1,1\sigma}\,\beta_{2,2\sigma} \ldots \beta_{n,n\sigma} = \det A^t.$$

44.4 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. If each element in a
particular row (column) of $A$ is multiplied by $\lambda \in K$, then the determinant
of the new matrix thus obtained is equal to $\lambda\,\det A$.

Proof: In view of Lemma 44.3, it suffices to prove the statement about
rows only. Let $A = (\alpha_{ij})$. Assume that the elements of the $k$-th row are
multiplied by $\lambda$. The new matrix is $(\beta_{ij})$, where $\beta_{ij} = \alpha_{ij}$ for $i \neq k$ and $\beta_{kj} = \lambda\alpha_{kj}$. Thus
$$|\beta_{ij}| = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\beta_{1,1\sigma} \ldots \beta_{k,k\sigma} \ldots \beta_{n,n\sigma} = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots (\lambda\alpha_{k,k\sigma}) \ldots \alpha_{n,n\sigma}$$
$$= \lambda \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{k,k\sigma} \ldots \alpha_{n,n\sigma} = \lambda\,|\alpha_{ij}|.$$

44.5 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. Assume that each
element $\alpha_{kj}$ in the $k$-th row of $A$ is a sum $\alpha_{kj} = \beta_{kj} + \gamma_{kj}$ (each element $\alpha_{ik}$ in the
$k$-th column of $A$ is a sum $\alpha_{ik} = \beta_{ik} + \gamma_{ik}$). Then $\det A$ is a sum of two
determinants: $\det A = \det B + \det C$, where $B$ resp. $C$ is identical with $A$,
except for the $k$-th row (column), in which the $\alpha_{kj}$ are replaced by $\beta_{kj}$ resp.
$\gamma_{kj}$ (the $\alpha_{ik}$ are replaced by $\beta_{ik}$ resp. $\gamma_{ik}$). Symbolically,
$$\begin{vmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1n} \\ \vdots & \vdots & & \vdots \\ \beta_{k1}+\gamma_{k1} & \beta_{k2}+\gamma_{k2} & \ldots & \beta_{kn}+\gamma_{kn} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \ldots & \alpha_{nn} \end{vmatrix}
= \begin{vmatrix} \alpha_{11} & \ldots & \alpha_{1n} \\ \vdots & & \vdots \\ \beta_{k1} & \ldots & \beta_{kn} \\ \vdots & & \vdots \\ \alpha_{n1} & \ldots & \alpha_{nn} \end{vmatrix}
+ \begin{vmatrix} \alpha_{11} & \ldots & \alpha_{1n} \\ \vdots & & \vdots \\ \gamma_{k1} & \ldots & \gamma_{kn} \\ \vdots & & \vdots \\ \alpha_{n1} & \ldots & \alpha_{nn} \end{vmatrix}$$
and
$$\begin{vmatrix} \alpha_{11} & \ldots & \beta_{1k}+\gamma_{1k} & \ldots & \alpha_{1n} \\ \alpha_{21} & \ldots & \beta_{2k}+\gamma_{2k} & \ldots & \alpha_{2n} \\ \vdots & & \vdots & & \vdots \\ \alpha_{n1} & \ldots & \beta_{nk}+\gamma_{nk} & \ldots & \alpha_{nn} \end{vmatrix}
= \begin{vmatrix} \alpha_{11} & \ldots & \beta_{1k} & \ldots & \alpha_{1n} \\ \vdots & & \vdots & & \vdots \\ \alpha_{n1} & \ldots & \beta_{nk} & \ldots & \alpha_{nn} \end{vmatrix}
+ \begin{vmatrix} \alpha_{11} & \ldots & \gamma_{1k} & \ldots & \alpha_{1n} \\ \vdots & & \vdots & & \vdots \\ \alpha_{n1} & \ldots & \gamma_{nk} & \ldots & \alpha_{nn} \end{vmatrix}.$$

Proof: The proof is shorter than the wording of the lemma. It will be
sufficient to prove the assertion involving rows only, and this follows
from summing
$$\varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{k,k\sigma} \ldots \alpha_{n,n\sigma} = \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots (\beta_{k,k\sigma} + \gamma_{k,k\sigma}) \ldots \alpha_{n,n\sigma}$$
$$= \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \beta_{k,k\sigma} \ldots \alpha_{n,n\sigma} + \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \gamma_{k,k\sigma} \ldots \alpha_{n,n\sigma}$$
over all $\sigma \in S_n$.

The last two lemmas mean that the determinant of a matrix is a linear
function of any one of its rows or columns.

44.6 Lemma: Let $K$ be a field, $A \in \mathrm{Mat}_n(K)$ and $\lambda \in K$. Then $\det(\lambda A) = \lambda^n\,\det A$.

Proof: This follows from $n$ successive applications of Lemma 44.4.
Alternatively, observe that, when we put $A = (\alpha_{ij})$, each summand
$\varepsilon(\sigma)(\lambda\alpha_{1,1\sigma})(\lambda\alpha_{2,2\sigma})\ldots(\lambda\alpha_{n,n\sigma})$ of $\det(\lambda A)$ is $\lambda^n$ times a summand
$\varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma}\ldots\alpha_{n,n\sigma}$ of $\det A$, and conversely.

44.7 Lemma: Let $K$ be a field and $A,B \in \mathrm{Mat}_n(K)$. If $B$ is obtained from
$A$ by interchanging two rows (columns) of $A$, then $\det B = -\det A$ (the
determinant changes sign when two rows (columns) are interchanged).

Proof: We prove the statement about rows only. Assume $A = (\alpha_{ij})$ and
$B = (\beta_{ij})$, and assume that $B$ is obtained from $A$ by interchanging the $k$-th
and $m$-th rows of $A$, so that $\beta_{ij} = \alpha_{ij}$ for all $i,j$ with $i \neq k$, $i \neq m$ and $\beta_{kj} = \alpha_{mj}$,
$\beta_{mj} = \alpha_{kj}$ for all $j$. Then
$$\det B = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\beta_{1,1\sigma} \ldots \beta_{k,k\sigma} \ldots \beta_{m,m\sigma} \ldots \beta_{n,n\sigma}.$$
As $\sigma$ ranges over $S_n$, so does $(km)\sigma$. Hence we have
$$\det B = \sum_{\sigma \in S_n} \varepsilon((km)\sigma)\,\beta_{1,1(km)\sigma} \ldots \beta_{k,k(km)\sigma} \ldots \beta_{m,m(km)\sigma} \ldots \beta_{n,n(km)\sigma}$$
$$= -\sum_{\sigma \in S_n} \varepsilon(\sigma)\,\beta_{1,1\sigma} \ldots \beta_{k,m\sigma} \ldots \beta_{m,k\sigma} \ldots \beta_{n,n\sigma}$$
$$= -\sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{m,m\sigma} \ldots \alpha_{k,k\sigma} \ldots \alpha_{n,n\sigma}$$
$$= -\det A.$$

44.8 Lemma: Let $K$ be a field and $A,B \in \mathrm{Mat}_n(K)$, $n \geq 2$. If $B$ is obtained
from $A$ by a permutation $\tau$ of the rows (columns) of $A$, then $\det B = \varepsilon(\tau)\det A$.

Proof: Let $A = (\alpha_{ij})$ and $B = (\beta_{ij})$. We give a proof of the assertion about
rows only. The hypothesis is that $\beta_{ij} = \alpha_{i\tau,j}$ for some $\tau$ in $S_n$. We write $\tau$ as
a product of transpositions:
$$\tau = \tau_1\tau_2\ldots\tau_s \qquad (\tau_1, \tau_2, \ldots, \tau_s \text{ are transpositions in } S_n),$$
so that $\varepsilon(\tau) = (-1)^s$ by definition. We introduce matrices
$$A = A_0,\ A_1,\ A_2,\ \ldots,\ A_{s-1},\ A_s = B,$$
where each $A_r$ is obtained from $A_{r-1}$ ($r = 1,2,\ldots,s$) by interchanging two
rows:
$$A_0 = (\alpha_{ij}),\quad A_1 = (\alpha_{i\tau_1,j}),\quad A_2 = (\alpha_{i\tau_1\tau_2,j}),\quad \ldots,\quad A_{s-1} = (\alpha_{i\tau_1\tau_2\ldots\tau_{s-1},j}),\quad A_s = (\alpha_{i\tau_1\tau_2\ldots\tau_{s-1}\tau_s,j}).$$
Then, using Lemma 44.7 repeatedly,
$$\det B = \det A_s = -\det A_{s-1} = (-1)^2\det A_{s-2} = (-1)^3\det A_{s-3} = \ldots = (-1)^s\det A_0 = \varepsilon(\tau)\det A.$$

44.9 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$, $n \geq 2$. If two rows
(columns) of $A$ are identical, then $\det A = 0$.

Proof: One usually argues as follows. Interchanging the two identical
rows (columns), $\det A$ does not change. But it becomes $-\det A$ by
Lemma 44.7. Hence $\det A = -\det A$. Thus $2\det A = 0$. One concludes from
this that $\det A = 0$.

This conclusion is justified when we can divide by 2 in $K$, that is to say,
if the multiplicative inverse of 2 exists in $K$. Let us recall that 2 is an
abbreviation of $1_K + 1_K$, where $1_K$ is the identity of $K$. Since any nonzero
element of $K$ has an inverse in $K$, the conclusion is valid when $K$ is a field
in which $1_K + 1_K \neq 0$. If, however, $1_K + 1_K = 0$ (as in $\mathbb{Z}_2$), this argument
does not work.

We give an argument which works irrespective of whether $1_K + 1_K = 0$ or
not. We prove the statement about rows only. Reordering the rows of $A$
by a suitable permutation $\tau$ in $S_n$, we obtain a matrix $B$ in which the first
two rows are identical and $\det B = \varepsilon(\tau)\det A$. Since $\det B = 0$ if and only
if $\det A = 0$, we may assume, without loss of generality, that the first
and second rows of $A$ are identical. We prove $\det A = 0$ under this
assumption.

Let $A = (\alpha_{ij})$, with $\alpha_{1j} = \alpha_{2j}$ for all $j$. If $n = 2$, then
$$\det A = \begin{vmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{11} & \alpha_{12} \end{vmatrix} = \alpha_{11}\alpha_{12} - \alpha_{12}\alpha_{11} = 0.$$
Let us suppose now $n \geq 3$. Then
$$\det A = \sum_{\sigma \in A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma} \;-\; \sum_{\sigma \in S_n\setminus A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma}. \tag{i}$$
As $\sigma$ runs through $A_n$, the permutation $(12)\sigma$ runs through $S_n\setminus A_n$. Hence
the subtrahend $\displaystyle\sum_{\sigma \in S_n\setminus A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma}$ in (i) is equal to
$$\sum_{\sigma \in A_n} \alpha_{1,1(12)\sigma}\,\alpha_{2,2(12)\sigma}\,\alpha_{3,3(12)\sigma} \ldots \alpha_{n,n(12)\sigma}$$
$$= \sum_{\sigma \in A_n} \alpha_{1,2\sigma}\,\alpha_{2,1\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma}$$
$$= \sum_{\sigma \in A_n} \alpha_{2,1\sigma}\,\alpha_{1,2\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma} \qquad\text{(commutativity of multiplication)}$$
$$= \sum_{\sigma \in A_n} \alpha_{1,1\sigma}\,\alpha_{2,2\sigma}\,\alpha_{3,3\sigma} \ldots \alpha_{n,n\sigma} \qquad\text{(first two rows are identical)},$$
which is the minuend in (i). Hence $\det A = 0$.

It will be convenient to identify the $i$-th row
$$(\alpha_{i1}\ \ \alpha_{i2}\ \ \ldots\ \ \alpha_{in})$$
of a matrix $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, where $K$ is a field, with the vector
$(\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$ in $K^n = \mathrm{Mat}_{1\times n}(K)$. Similarly, the $j$-th column of $A$ will be
identified with the vector (matrix)
$$\begin{pmatrix} \alpha_{1j} \\ \alpha_{2j} \\ \vdots \\ \alpha_{nj} \end{pmatrix}$$
in $\mathrm{Mat}_{n\times 1}(K)$. Thus it is meaningful to speak of $K$-linear (in)dependence of
rows and columns of a matrix. Likewise, we can add two rows (columns)
and multiply them by scalars.

44.10 Lemma: Let $K$ be a field and $A,B \in \mathrm{Mat}_n(K)$, $n \geq 2$. Suppose that
$B$ is obtained from $A$ by multiplying a particular row (column) of $A$ by
some $\lambda \in K$ and adding it to a different row (column) of $A$. Then $\det B = \det A$. (The determinant does not change when we add a multiple of a
row (column) to another.)

Proof: We prove the assertion about rows only. Suppose that the $k$-th
row in $A$ is multiplied by $\lambda \in K$ and added to the $m$-th row. Writing $A = (\alpha_{ij})$, $B = (\beta_{ij})$, we have $\beta_{mj} = \lambda\alpha_{kj} + \alpha_{mj}$ and $\beta_{ij} = \alpha_{ij}$ for $i \neq m$. Lemma 44.5
gives $\det B = \det C + \det A$, where $C \in \mathrm{Mat}_n(K)$ is identical with $A$ except
for the $m$-th row, which is $\lambda$ times the $k$-th row of $A$. By Lemma 44.4,
$\det C = \lambda\det D$, where $D \in \mathrm{Mat}_n(K)$ is identical with $A$ except that the
$m$-th row of $D$ = the $k$-th row of $A$ = the $k$-th row of $D$. Then $\det D = 0$ by
Lemma 44.9 and $\det B = \lambda\,\det D + \det A = \det A$.
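Lemmas 44.4, 44.7 and 44.10 describe exactly how the three elementary row operations affect a determinant, and they can be checked on a small integer matrix. The sketch below is ours; the determinant is computed from the permutation sum of Definition 44.1.

```python
from itertools import permutations
from math import prod

def det(A):
    n = len(A)
    def sgn(p):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        return -1 if inv % 2 else 1
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

A = [[1, 2, 0], [3, 1, 4], [2, 2, 5]]
d = det(A)

# Lemma 44.4: multiplying one row by lambda multiplies det by lambda.
assert det([[7 * x for x in A[0]], A[1], A[2]]) == 7 * d

# Lemma 44.7: interchanging two rows changes the sign.
assert det([A[1], A[0], A[2]]) == -d

# Lemma 44.10: adding a multiple of one row to another leaves det unchanged.
assert det([A[0], [y + 5 * x for x, y in zip(A[0], A[1])], A[2]]) == d
```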

44.11 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. If every entry in a
particular row (column) of $A$ is equal to $0 \in K$, then $\det A = 0$.

Proof: Let $A = (\alpha_{ij})$. Under the hypothesis of the lemma, each summand
$\varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}$ ($\sigma \in S_n$) of $\det A$ is zero, for one of the factors is
zero. Hence $\det A = 0$.

44.12 Lemma: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. If the rows (columns)
of $A$ are linearly dependent over $K$, then $\det A = 0$.

Proof: When $n = 1$, $A$ must be the matrix $(0)$ (see Example 42.2(a)), and
$\det A = 0$. Assume now $n \geq 2$. We prove the assertion about rows only.
If the rows of $A$ are $K$-linearly dependent, then there are $\lambda_1, \lambda_2, \ldots, \lambda_n$ in $K$
such that
$$\lambda_1(\text{1st row}) + \lambda_2(\text{2nd row}) + \ldots + \lambda_n(n\text{-th row}) = (0,0,\ldots,0)$$
and not all of $\lambda_1, \lambda_2, \ldots, \lambda_n$ are equal to $0 \in K$. Suppose $\lambda_k \neq 0$. Then $\lambda_k$ has
an inverse $\lambda_k^{-1}$ in $K$. We multiply the $i$-th row by $\lambda_i\lambda_k^{-1}$ and add it to the $k$-th
row; we do this for each $i \neq k$. Then we obtain a matrix $B$ whose
determinant is equal to $\det A$ by Lemma 44.10. On the other hand, the
$k$-th row of $B$ consists entirely of zeroes and $\det B = 0$ by Lemma 44.11.
Hence $\det A = 0$.

Now we want to discuss the calculation of determinants. In practice,
determinants are almost never computed from the definition
$$\sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}.$$
Rather, a determinant of an $n\times n$ matrix is expressed in terms of the
determinants of certain $(n-1)\times(n-1)$ matrices, these in turn in terms of
the determinants of certain $(n-2)\times(n-2)$ matrices and so on, until we
come to $2\times 2$ matrices, whose determinants are evaluated readily. This
reduction process is known as the expansion of a determinant along (or
by) a row (column). To describe this process, we introduce a definition.

44.13 Definition: Let $K$ be a field and $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, with $n \geq 2$.
Let $M_{ij}$ be the $(n-1)\times(n-1)$ matrix obtained from $A$ by deleting the $i$-th
row and the $j$-th column of $A$, which intersect at the entry $\alpha_{ij}$ of $A$. Then
$(-1)^{i+j}\det M_{ij}$ is called the cofactor of $\alpha_{ij}$ in $A$. We write $A_{ij}$ for the cofactor
of $\alpha_{ij}$ in $A$.

The following lemma justifies the terminology.

44.14 Lemma: Let $K$ be a field and $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, where $n \geq 2$.
Let $k,m$ be fixed elements of $\{1,2,\ldots,n\}$. Collecting together all terms
containing $\alpha_{km}$ in
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma},$$
we write
$$\det A = \alpha_{km}c_{km} + \text{terms not containing } \alpha_{km}.$$
The $c_{km}$ having been defined uniquely in this way, we claim:

(1) $c_{nn}$ = cofactor of $\alpha_{nn}$ = $A_{nn}$,
(2) $c_{nm} = A_{nm}$ for any $m = 1,2,\ldots,n$,
(3) $c_{km} = A_{km}$ for any $k,m = 1,2,\ldots,n$.

Proof: (1) We have
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}$$
$$= \sum_{\substack{\sigma \in S_n \\ n\sigma = n}} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n-1,(n-1)\sigma}\,\alpha_{nn} \;+\; \sum_{\substack{\sigma \in S_n \\ n\sigma \neq n}} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}$$
$$= \alpha_{nn} \sum_{\substack{\sigma \in S_n \\ n\sigma = n}} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n-1,(n-1)\sigma} \;+\; \text{terms not involving } \alpha_{nn}.$$
Any $\sigma \in S_n$ with $n\sigma = n$ can be regarded as a permutation in $S_{n-1}$, and any
permutation in $S_{n-1}$ can be regarded as a permutation in $S_n$ with $n\sigma = n$.
Here $\varepsilon(\sigma)$ is independent of whether we regard $\sigma$ as an element of $S_n$ or
of $S_{n-1}$. Hence
$$c_{nn} = \sum_{\substack{\sigma \in S_n \\ n\sigma = n}} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{n-1,(n-1)\sigma} = \sum_{\sigma \in S_{n-1}} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{n-1,(n-1)\sigma}$$
$$= \begin{vmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1,n-1} \\ \alpha_{21} & \alpha_{22} & \ldots & \alpha_{2,n-1} \\ \vdots & \vdots & & \vdots \\ \alpha_{n-1,1} & \alpha_{n-1,2} & \ldots & \alpha_{n-1,n-1} \end{vmatrix} = A_{nn}.$$

(2) We prove $c_{nm} = A_{nm}$ for all $m = 1,2,\ldots,n$. The case $m = n$ having been
settled in part (1) above, we assume $m \neq n$. Consider the matrix
$$B = \begin{pmatrix} \alpha_{11} & \ldots & \alpha_{1,m-1} & \alpha_{1n} & \alpha_{1,m+1} & \ldots & \alpha_{1,n-1} & \alpha_{1m} \\ \alpha_{21} & \ldots & \alpha_{2,m-1} & \alpha_{2n} & \alpha_{2,m+1} & \ldots & \alpha_{2,n-1} & \alpha_{2m} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots & \vdots \\ \alpha_{n-1,1} & \ldots & \alpha_{n-1,m-1} & \alpha_{n-1,n} & \alpha_{n-1,m+1} & \ldots & \alpha_{n-1,n-1} & \alpha_{n-1,m} \\ \alpha_{n1} & \ldots & \alpha_{n,m-1} & \alpha_{nn} & \alpha_{n,m+1} & \ldots & \alpha_{n,n-1} & \alpha_{nm} \end{pmatrix}$$
obtained from $A$ by interchanging the $m$-th and $n$-th columns. Then we
have $\det A = -\det B$ by Lemma 44.7 and, by part (1),
$$\det B = \alpha_{nm}\det M + \text{terms not involving } \alpha_{nm}, \tag{ii}$$
where $M$ is the $(n-1)\times(n-1)$ matrix we obtain from $B$ by deleting its $n$-th
row and $n$-th column. A glance at $B$ reveals that $M$ is obtained from
$$M_{nm} = \begin{pmatrix} \alpha_{11} & \ldots & \alpha_{1,m-1} & \alpha_{1,m+1} & \ldots & \alpha_{1,n-1} & \alpha_{1n} \\ \alpha_{21} & \ldots & \alpha_{2,m-1} & \alpha_{2,m+1} & \ldots & \alpha_{2,n-1} & \alpha_{2n} \\ \vdots & & \vdots & \vdots & & \vdots & \vdots \\ \alpha_{n-1,1} & \ldots & \alpha_{n-1,m-1} & \alpha_{n-1,m+1} & \ldots & \alpha_{n-1,n-1} & \alpha_{n-1,n} \end{pmatrix}$$
by $n-1-m$ interchanges of columns. Hence $\det M = (-1)^{n-1-m}\det M_{nm} = -(-1)^{n+m}\det M_{nm} = -A_{nm}$. Substituting this in (ii), we get
$$\det A = -\det B = \alpha_{nm}(-\det M) - \text{terms not involving } \alpha_{nm} = \alpha_{nm}A_{nm} + \text{terms not involving } \alpha_{nm},$$
as was to be shown.

(3) We now prove $c_{km} = A_{km}$ for all $k,m$. The case $k = n$ having been
settled in part (2), we assume $k \neq n$. We consider the matrix $C$ obtained
from $B$ by interchanging the $k$-th and $n$-th rows. Then $\det C = -\det B = \det A$ by Lemma 44.7 and, by part (1),
$$\det C = \alpha_{km}\det N + \text{terms not involving } \alpha_{km},$$
where $N$ is the $(n-1)\times(n-1)$ matrix we obtain from $C$ by deleting its $n$-th
row and $n$-th column. The matrix $N$ is obtained from $M_{km}$ by $n-m-1$
interchanges of columns and $n-k-1$ interchanges of rows. Hence
$$\det N = (-1)^{(n-m-1)+(n-k-1)}\det M_{km} = (-1)^{k+m}\det M_{km} = A_{km}$$
and $\det A = \det C = \alpha_{km}A_{km} + \text{terms not involving } \alpha_{km}$.

This completes the proof.

44.15 Theorem: Let $K$ be a field, $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, where $n \geq 2$. Let
$A_{ij}$ be the cofactor of $\alpha_{ij}$ in $A$. Then
$$\det A = \alpha_{i1}A_{i1} + \alpha_{i2}A_{i2} + \ldots + \alpha_{in}A_{in}$$
$$\det A = \alpha_{1j}A_{1j} + \alpha_{2j}A_{2j} + \ldots + \alpha_{nj}A_{nj}$$
for all $i,j$.

Proof: We have
$$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\,\alpha_{1,1\sigma}\,\alpha_{2,2\sigma} \ldots \alpha_{n,n\sigma}$$
$$= \sum_{\substack{\sigma \in S_n \\ i\sigma = 1}} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{n,n\sigma} + \sum_{\substack{\sigma \in S_n \\ i\sigma = 2}} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{n,n\sigma} + \ldots + \sum_{\substack{\sigma \in S_n \\ i\sigma = n}} \varepsilon(\sigma)\,\alpha_{1,1\sigma} \ldots \alpha_{n,n\sigma}$$
$$= \alpha_{i1}c_{i1} + \alpha_{i2}c_{i2} + \ldots + \alpha_{in}c_{in} = \alpha_{i1}A_{i1} + \alpha_{i2}A_{i2} + \ldots + \alpha_{in}A_{in}$$
for any $i$. This proves the first formula. Applying it with $A^t$, $j$ in place of
$A$, $i$, we obtain the second formula.

The first formula in Theorem 44.15 is known as the expansion of $\det A$
along the $i$-th row, the second as the expansion of $\det A$ along the $j$-th
column. Each element in the $i$-th row ($j$-th column) contributes a term;
more specifically, $\alpha_{ij}$ contributes $\alpha_{ij}A_{ij}$, where $A_{ij}$ is the determinant of the
$(n-1)\times(n-1)$ matrix obtained from $A$ by deleting the row and column of
$\alpha_{ij}$, times $1$ or $-1$, determined by the chessboard pattern
$$\begin{pmatrix} + & - & + & - & \ldots \\ - & + & - & + & \ldots \\ + & - & + & - & \ldots \\ - & + & - & + & \ldots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}.$$
The expansion along a row or column is sometimes given as a recursive
definition of determinants in terms of determinants of smaller size.

A specific determinant is computed as follows. If a row or column
consists of zeroes, the determinant is 0. Otherwise, we choose and fix a
row or column. It will be convenient to choose the row (or column)
which has the largest number of zeroes. At least one of the entries in
the fixed row (column), say $\alpha$, is distinct from 0. If a column (row)
intersects our fixed row (column) at the entry $\beta$, we add $-\beta\alpha^{-1}$ times the
column (row) of $\alpha$ to that column (row). We do this for each column
(row) other than that of $\alpha$. This does not change the determinant, but our
fixed row (column) will consist entirely of zeroes, except for the entry $\alpha$.
Expanding the determinant along the fixed row (column), we see that the
determinant is equal to $\alpha D$, where $D$ is the new cofactor of $\alpha$. We repeat
the same procedure with the determinant $D$, and obtain $D = \alpha'D'$, say.
Then we repeat the same process with $D'$, etc., until we come to a $2\times 2$
or $3\times 3$ determinant which can be computed easily.
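The recursive definition mentioned above (expansion along the first row) is easy to program. The sketch below is our illustration, not the book's; it runs in exponential time, which is fine for small matrices.

```python
def det(A):
    """Determinant by expansion along the first row (Theorem 44.15)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor M_{1j}: delete row 0 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)  # chessboard sign + - + - ...
    return total

assert det([[3]]) == 3
assert det([[1, 2], [3, 4]]) == -2
assert det([[2, 0, 1], [1, 3, 5], [0, 4, 2]]) == -24
```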

44.16 Examples: (a) Let $K$ be a field and $x_1,x_2,\ldots,x_n$ elements in $K$. We
evaluate the determinant
$$D_n = \begin{vmatrix} 1 & 1 & \ldots & 1 \\ x_1 & x_2 & \ldots & x_n \\ x_1^2 & x_2^2 & \ldots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \ldots & x_n^{n-1} \end{vmatrix}.$$
This is known as the Vandermonde determinant. Let us denote it by $D_n$.
We add $-x_n$ times the $i$-th row to the $(i+1)$-st row ($i = n-1, n-2, \ldots, 1$,
working from the bottom upwards, so that each step uses the original
$i$-th row). The only nonzero entry in the new last column will be the
entry 1 in the 1st row, $n$-th column. Expanding $D_n$ along the last column,
and taking the factor $x_i - x_n$ in the $i$-th column of the cofactor outside
the determinant sign by Lemma 44.4 ($i = 1,2,\ldots,n-1$), we obtain
$$D_n = (-1)^{n-1}(x_1 - x_n)(x_2 - x_n)\ldots(x_{n-1} - x_n)\,D_{n-1}.$$
This holds for any $n$. Thus
$$D_n = (-1)^{n-1}(x_1 - x_n)(x_2 - x_n)\ldots(x_{n-1} - x_n)$$
$$\cdot\,(-1)^{n-2}(x_1 - x_{n-1})(x_2 - x_{n-1})\ldots(x_{n-2} - x_{n-1})\,D_{n-2}$$
$$= \ldots$$
$$= (-1)^{(n-1)+(n-2)+\ldots+1}(x_1 - x_n)(x_2 - x_n)\ldots(x_{n-3} - x_n)(x_{n-2} - x_n)(x_{n-1} - x_n)$$
$$\cdot\,(x_1 - x_{n-1})(x_2 - x_{n-1})\ldots(x_{n-3} - x_{n-1})(x_{n-2} - x_{n-1})$$
$$\cdot\,(x_1 - x_{n-2})(x_2 - x_{n-2})\ldots(x_{n-3} - x_{n-2})$$
$$\vdots$$
$$\cdot\,(x_1 - x_2).$$
Changing the sign of the $\binom{n}{2}$ factors on the right hand side and noting
that $(n-1) + (n-2) + \ldots + 1 = \binom{n}{2}$, we finally get
$$D_n = \prod_{i<j} (x_j - x_i),$$
the product being over all $\binom{n}{2}$ pairs $(i,j)$, where $i,j = 1,2,\ldots,n$ and $i < j$.
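The product formula for $D_n$ can be confirmed numerically. The sketch below is ours: it builds the Vandermonde matrix for four integer points and compares the permutation-sum determinant of Definition 44.1 with $\prod_{i<j}(x_j - x_i)$.

```python
from itertools import permutations
from math import prod

def det(A):
    n = len(A)
    def sgn(p):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        return -1 if inv % 2 else 1
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

xs = [2, 5, 7, 11]
n = len(xs)
V = [[x ** i for x in xs] for i in range(n)]   # row i holds the i-th powers

vandermonde = prod(xs[j] - xs[i] for j in range(n) for i in range(j))
assert det(V) == vandermonde                   # D_n = prod_{i<j} (x_j - x_i)
```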

(b) Let $K$ be a field. The determinant of a matrix $(\alpha_{ij}) \in \mathrm{Mat}_n(K)$, where
$\alpha_{ij} = 0$ whenever $i > j$, which may be written symbolically
$$\begin{vmatrix} \alpha_{11} & \alpha_{12} & \alpha_{13} & \ldots & \alpha_{1n} \\ 0 & \alpha_{22} & \alpha_{23} & \ldots & \alpha_{2n} \\ 0 & 0 & \alpha_{33} & \ldots & \alpha_{3n} \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & \alpha_{nn} \end{vmatrix},$$
can be evaluated by expanding successively along the first columns. One
finds immediately that $|\alpha_{ij}| = \alpha_{11}\alpha_{22}\alpha_{33}\ldots\alpha_{nn}$. Likewise, the determinant
$$\begin{vmatrix} \alpha_{11} & 0 & 0 & \ldots & 0 \\ \alpha_{21} & \alpha_{22} & 0 & \ldots & 0 \\ \alpha_{31} & \alpha_{32} & \alpha_{33} & \ldots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \alpha_{n1} & \alpha_{n2} & \alpha_{n3} & \ldots & \alpha_{nn} \end{vmatrix}$$
is evaluated to be $\alpha_{11}\alpha_{22}\alpha_{33}\ldots\alpha_{nn}$. In particular, the determinant of a
diagonal matrix
$$\begin{vmatrix} \alpha_{11} & & & & \\ & \alpha_{22} & & & \\ & & \alpha_{33} & & \\ & & & \ddots & \\ & & & & \alpha_{nn} \end{vmatrix}$$
is $\alpha_{11}\alpha_{22}\alpha_{33}\ldots\alpha_{nn}$.

What happens if we use the cofactors of the elements in a different row
(column) in the expansion along a particular row (column)? We get zero.

44.17 Theorem: Let $K$ be a field, $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, $n \geq 2$. Then
$$\alpha_{i1}A_{k1} + \alpha_{i2}A_{k2} + \ldots + \alpha_{in}A_{kn} = 0$$
$$\alpha_{1j}A_{1m} + \alpha_{2j}A_{2m} + \ldots + \alpha_{nj}A_{nm} = 0$$
whenever $i \neq k$ and $j \neq m$.

Proof: The first (second) sum is the expansion, along the $k$-th row ($m$-th
column), of $\det B$, where $B$ is the matrix obtained from $A$ by replacing
the $k$-th row ($m$-th column) of $A$ by its $i$-th row ($j$-th column). Since two
rows (columns) of $B$ are identical, $\det B = 0$ by Lemma 44.9. The result
follows.

Using Kronecker's delta, which is defined by
$$\delta_{rs} = \begin{cases} 1 & \text{if } r = s \\ 0 & \text{if } r \neq s, \end{cases}$$
so that $(\delta_{ij})$ is the identity matrix $I$ in $\mathrm{Mat}_n(K)$, Theorem 44.15 and
Theorem 44.17 can be written
$$\alpha_{i1}A_{k1} + \alpha_{i2}A_{k2} + \ldots + \alpha_{in}A_{kn} = \delta_{ik}\det A$$
$$\alpha_{1j}A_{1m} + \alpha_{2j}A_{2m} + \ldots + \alpha_{nj}A_{nm} = \delta_{jm}\det A.$$

To express these equations more succinctly, we introduce a definition.

44.18 Definition: Let $K$ be a field and $A = (\alpha_{ij}) \in \mathrm{Mat}_n(K)$, where $n \geq 2$.
The $n\times n$ matrix obtained from $A$ by replacing the entry $\alpha_{ij}$ by the
cofactor $A_{ij}$ of $\alpha_{ij}$ in $A$ is called the adjoint of $A$. Hence
$$\text{the adjoint of } (\alpha_{ij}) = (A_{ij}).$$

Using this terminology, the equations above can be written as matrix
equations,
$$A\cdot(\text{adjoint of } A)^t = (\det A)I$$
$$A^t\cdot(\text{adjoint of } A) = (\det A)I.$$
Taking the transposes of both sides in the second equation, we obtain

44.19 Theorem: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$, where $n \geq 2$. Then
$$A\cdot(\text{adjoint of } A)^t = (\det A)I = (\text{adjoint of } A)^t\cdot A.$$

44.20 Theorem: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. Then $A$ is invertible
if and only if $\det A \in K^*$. If this is the case, the inverse $A^{-1}$ of $A$ is given
by the formula
$$A^{-1} = \frac{1}{\det A}\,(\text{adjoint of } A)^t,$$
where $\dfrac{1}{\det A}$ denotes the inverse of $\det A$ in $K$.

Proof: If $\det A = 0$, then $(\det A)I = 0 \in \mathrm{Mat}_n(K)$, hence, by Theorem
44.19, $A$ is a left zero divisor and a right zero divisor in the ring $\mathrm{Mat}_n(K)$.
From Lemma 29.10, we deduce that $A$ cannot have a left or right
inverse.

Otherwise, $\det A \neq 0$ and $\det A$ has an inverse $\dfrac{1}{\det A}$ in $K$. If $n = 1$, then
$A = (\det A)$ and $\Bigl(\dfrac{1}{\det A}\Bigr)$ is the inverse of $A$. If $n \geq 2$, we multiply the
members of the equations in Theorem 44.19 by $\dfrac{1}{\det A}$ and obtain
$$A\cdot\frac{1}{\det A}(\text{adjoint of } A)^t = I = \frac{1}{\det A}(\text{adjoint of } A)^t\cdot A.$$
This shows that $\dfrac{1}{\det A}(\text{adjoint of } A)^t$ is an inverse of $A$. So $A \in GL(n,K)$
and, since $GL(n,K)$ is a group, $A$ has a unique inverse. Hence
$\dfrac{1}{\det A}(\text{adjoint of } A)^t$ is the inverse $A^{-1}$ of $A$.

The next theorem is another testimony for the use of determinants.

44.21 Theorem: Let $K$ be a field and $A \in \mathrm{Mat}_n(K)$. Then $\det A = 0$ if
and only if the rows (columns) of $A$ are linearly dependent over $K$.

Proof: If the rows (columns) of $A$ are linearly dependent over $K$, then
$\det A = 0$ by Lemma 44.12.

Assume conversely that $\det A = 0$. Let $A = (\alpha_{ij})$. Let $V$ be an $n$-dimensional
$K$-vector space and let $\{v_1, v_2, \ldots, v_n\}$ be a $K$-basis of $V$. Then the $K$-linear
transformation $T \in L_K(V,V)$, given by
$$v_i T = \sum_{j=1}^{n} \alpha_{ij} v_j,$$
has the associated matrix $(\alpha_{ij}) = A$, which is not invertible since $\det A = 0$. So $A$ is not a unit in $\mathrm{Mat}_n(K)$ and $T$ is not a unit in $L_K(V,V)$. Thus $T$ is
not an isomorphism. From Theorem 42.22, we conclude that $T$ is not
one-to-one. Thus $\mathrm{Ker}\,T \neq \{0\}$. Let $v \in \mathrm{Ker}\,T$, $v \neq 0$. We have
$$v = \lambda_1 v_1 + \lambda_2 v_2 + \ldots + \lambda_n v_n$$
for some suitable scalars $\lambda_j \in K$. Here not all of the $\lambda_j$ are equal to 0,
because $v \neq 0$ and $\{v_1, v_2, \ldots, v_n\}$ is a $K$-basis of $V$. Then
$$0 = vT = \Bigl(\sum_{i=1}^{n} \lambda_i v_i\Bigr)T = \sum_{i=1}^{n} \lambda_i (v_i T) = \sum_{i=1}^{n} \lambda_i \sum_{j=1}^{n} \alpha_{ij} v_j = \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{n} \lambda_i \alpha_{ij}\Bigr) v_j,$$
so
$$\sum_{i=1}^{n} \lambda_i \alpha_{ij} = 0 \qquad\text{for } j = 1,2,\ldots,n,$$
since $\{v_1, v_2, \ldots, v_n\}$ is a $K$-basis of $V$. Thus
$$\lambda_1(\text{1st row}) + \lambda_2(\text{2nd row}) + \ldots + \lambda_n(n\text{-th row}) = (0,0,\ldots,0)$$
with scalars $\lambda_1, \lambda_2, \ldots, \lambda_n \in K$ which are not all equal to 0. So the rows of $A$
are $K$-linearly dependent. Repeating the same argument with $A^t$, we see
that the columns of $A$, too, are $K$-linearly dependent.

We now establish the multiplication rule for determinants.

44.22 Theorem: Let $K$ be a field, $n \in \mathbb{N}$.
(1) $\det(AB) = (\det A)(\det B)$ for all $A,B \in \mathrm{Mat}_n(K)$.
(2) $\det I = 1$.
(3) $\det A^{-1} = (\det A)^{-1}$ for all $A \in GL(n,K)$.

Proof: (2) That $\det I = 1$ is a special case of the formula for the
determinant of a diagonal matrix discussed in Example 44.16(b). And (3)
follows from (1) and (2): $(\det A^{-1})(\det A) = \det(A^{-1}A) = \det I = 1$.

We prove (1). Let $A = (\alpha_{ij})$, $B = (\beta_{ij})$, $AB = (\gamma_{ij})$, so that $\gamma_{ij} = \sum_{k=1}^{n} \alpha_{ik}\beta_{kj}$ for all
$i,j$. Then
$$\det(AB) = \begin{vmatrix} \gamma_{11} & \gamma_{12} & \ldots & \gamma_{1n} \\ \gamma_{21} & \gamma_{22} & \ldots & \gamma_{2n} \\ \vdots & \vdots & & \vdots \\ \gamma_{n1} & \gamma_{n2} & \ldots & \gamma_{nn} \end{vmatrix}
= \begin{vmatrix} \sum_{k_1} \alpha_{1k_1}\beta_{k_1 1} & \sum_{k_2} \alpha_{1k_2}\beta_{k_2 2} & \ldots & \sum_{k_n} \alpha_{1k_n}\beta_{k_n n} \\ \sum_{k_1} \alpha_{2k_1}\beta_{k_1 1} & \sum_{k_2} \alpha_{2k_2}\beta_{k_2 2} & \ldots & \sum_{k_n} \alpha_{2k_n}\beta_{k_n n} \\ \vdots & \vdots & & \vdots \\ \sum_{k_1} \alpha_{nk_1}\beta_{k_1 1} & \sum_{k_2} \alpha_{nk_2}\beta_{k_2 2} & \ldots & \sum_{k_n} \alpha_{nk_n}\beta_{k_n n} \end{vmatrix}$$
$$= \sum_{k_1=1}^{n}\sum_{k_2=1}^{n}\ldots\sum_{k_n=1}^{n}
\begin{vmatrix} \alpha_{1k_1}\beta_{k_1 1} & \alpha_{1k_2}\beta_{k_2 2} & \ldots & \alpha_{1k_n}\beta_{k_n n} \\ \alpha_{2k_1}\beta_{k_1 1} & \alpha_{2k_2}\beta_{k_2 2} & \ldots & \alpha_{2k_n}\beta_{k_n n} \\ \vdots & \vdots & & \vdots \\ \alpha_{nk_1}\beta_{k_1 1} & \alpha_{nk_2}\beta_{k_2 2} & \ldots & \alpha_{nk_n}\beta_{k_n n} \end{vmatrix}
\qquad\text{(Lemma 44.5)}$$
$$= \sum_{k_1=1}^{n}\sum_{k_2=1}^{n}\ldots\sum_{k_n=1}^{n} \beta_{k_1 1}\beta_{k_2 2}\ldots\beta_{k_n n}
\begin{vmatrix} \alpha_{1k_1} & \alpha_{1k_2} & \ldots & \alpha_{1k_n} \\ \alpha_{2k_1} & \alpha_{2k_2} & \ldots & \alpha_{2k_n} \\ \vdots & \vdots & & \vdots \\ \alpha_{nk_1} & \alpha_{nk_2} & \ldots & \alpha_{nk_n} \end{vmatrix}
\qquad\text{(Lemma 44.4)}.$$
In this $n$-fold sum, $k_1,k_2,\ldots,k_n$ run independently over $1,2,\ldots,n$. If,
however, any two of $k_1,k_2,\ldots,k_n$ are equal, then the determinant $|\alpha_{ik_j}|$ in
the $n$-fold sum has two identical columns and therefore vanishes (Lemma
44.9). So we may disregard those combinations of the indices $k_1,k_2,\ldots,k_n$
which contain two equal values, and restrict the $n$-fold summation to
those combinations of $k_1,k_2,\ldots,k_n$ such that $k_1,k_2,\ldots,k_n$ are all distinct.
Then the $n$-fold sum becomes
$$\sum_{\tau = \binom{1\ 2\ \ldots\ n}{k_1 k_2 \ldots k_n} \in S_n} \beta_{k_1 1}\beta_{k_2 2}\ldots\beta_{k_n n}
\begin{vmatrix} \alpha_{1k_1} & \alpha_{1k_2} & \ldots & \alpha_{1k_n} \\ \vdots & \vdots & & \vdots \\ \alpha_{nk_1} & \alpha_{nk_2} & \ldots & \alpha_{nk_n} \end{vmatrix}$$
$$= \sum_{\tau \in S_n} \beta_{1\tau,1}\,\beta_{2\tau,2}\ldots\beta_{n\tau,n}\;\varepsilon(\tau)
\begin{vmatrix} \alpha_{11} & \alpha_{12} & \ldots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \ldots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \ldots & \alpha_{nn} \end{vmatrix}
\qquad\text{(Lemma 44.8)}$$
$$= \sum_{\tau \in S_n} \varepsilon(\tau)\,\beta_{1\tau,1}\,\beta_{2\tau,2}\ldots\beta_{n\tau,n}\,(\det A) = (\det A)(\det B) \qquad\text{(Lemma 44.3)}.$$
Hence $\det AB = (\det A)(\det B)$.

The equation $\det AB = (\det A)(\det B)$ may also be written in the forms
$$\det AB = (\det A)(\det B^t),\qquad \det AB = (\det A^t)(\det B),\qquad \det AB = (\det A^t)(\det B^t).$$
So there are four versions of the multiplication rule for determinants,
known as the rows by columns multiplication, rows by rows multiplication,
columns by columns multiplication, columns by rows multiplication,
which are respectively described below:

If $K$ is a field, $(\alpha_{ij}), (\beta_{ij}), (\gamma_{ij}) \in \mathrm{Mat}_n(K)$, and if
$$\gamma_{ij} = \sum_{k=1}^{n} \alpha_{ik}\beta_{kj} \quad\text{for all } i,j,\qquad\text{or}$$
$$\gamma_{ij} = \sum_{k=1}^{n} \alpha_{ik}\beta_{jk} \quad\text{for all } i,j,\qquad\text{or}$$
$$\gamma_{ij} = \sum_{k=1}^{n} \alpha_{ki}\beta_{kj} \quad\text{for all } i,j,\qquad\text{or}$$
$$\gamma_{ij} = \sum_{k=1}^{n} \alpha_{ki}\beta_{jk} \quad\text{for all } i,j,$$
then $|\gamma_{ij}| = |\alpha_{ij}|\,|\beta_{ij}|$.

Restricting the mapping $\det\colon \mathrm{Mat}_n(K) \to K$ to
$$GL(n,K) = \{A \in \mathrm{Mat}_n(K): \det A \in K^*\}$$
(Theorem 44.20), we obtain a group homomorphism
$$\det\colon GL(n,K) \to K^*.$$
The kernel
$$\{A \in \mathrm{Mat}_n(K): \det A = 1\}$$
of this determinant homomorphism is a normal subgroup of $GL(n,K)$,
known as the special linear group of degree $n$ over $K$, and denoted as
$SL(n,K)$.
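The determinant homomorphism can be watched in action over a small finite field, say $\mathbb{Z}_5$. The $2\times 2$ sketch below is our illustration, not the book's.

```python
p = 5  # work over the field Z_5

def det2(A):
    return (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % p

def mul2(A, B):
    return [[(A[i][0] * B[0][j] + A[i][1] * B[1][j]) % p for j in range(2)]
            for i in range(2)]

A = [[1, 2], [3, 4]]   # det = 3 (mod 5), so A is in GL(2, Z_5)
B = [[2, 0], [1, 3]]   # det = 1 (mod 5)

# det is a homomorphism GL(2, K) -> K*
assert det2(mul2(A, B)) == (det2(A) * det2(B)) % p

# the kernel consists of the matrices of determinant 1, i.e. SL(2, K)
C = [[2, 1], [3, 2]]   # det = 4 - 3 = 1, so C lies in SL(2, Z_5)
assert det2(C) == 1
```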

Exercises

1. Verify that the determinant of a 3×3 matrix (αij) can be computed as
follows. We write the first column of the matrix to the right of the
matrix and the second column to the right of the last written copy of the
first column:

    α11  α12  α13  α11  α12
    α21  α22  α23  α21  α22
    α31  α32  α33  α31  α32

We take the products along the upper-left to lower-right diagonals (full
lines) unchanged, and the products along the lower-left to upper-right
diagonals (broken lines) with a minus sign. The sum of these six products
is the determinant of (αij). (This rule cannot be extended to n×n
matrices if n is greater than 3.)
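The diagonal rule of Exercise 1 is easy to transcribe; a small illustrative sketch (mine, not the book's) comparing it with cofactor expansion on a sample matrix:

```python
def sarrus(m):
    """3x3 determinant by the diagonal rule of Exercise 1."""
    (a, b, c), (d, e, f), (g, h, i) = m
    plus = a*e*i + b*f*g + c*d*h     # full-line diagonals, unchanged
    minus = g*e*c + h*f*a + i*d*b    # broken-line diagonals, with minus sign
    return plus - minus

def det3(m):
    """Cofactor expansion along the first row, for comparison."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

M = [[1, 3, 5], [0, 1, 6], [0, 0, 2]]
print(sarrus(M), det3(M))  # → 2 2
```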

2. Compute the determinants of the following matrices over ℚ:

        ⎛ 1  3  5 ⎞        ⎛ 1  4  5 ⎞        ⎛ 1  2  5 ⎞        ⎛ 1  0  1 ⎞
    (a) ⎜ 0  1  6 ⎟;   (b) ⎜ 1  1  0 ⎟;   (c) ⎜ 1  3  4 ⎟;   (d) ⎜ 1  2  1 ⎟;
        ⎝ 0  0  2 ⎠        ⎝ 1  2  2 ⎠        ⎝ 1  0  2 ⎠        ⎝ 3  4  2 ⎠

 1 1 1 2
1 0 0 7  3 
 2 1 1 0 2 3 0  4 4 0 1 0

(e) 3 1 0 ; (f) 1 1 0 1  ; (g) 2 8 2 1  4 .
 0 3 2
  
0 1 0 4   2 1 5 2
1 0 1 0
 1
2

 2 1 1
3. Find det A if A is the matrix  3 1 0 in Mat3( ).
 0 3 2 7
 

4. Expand along the third column:

    | 1  0  5  4  0 |
    | 3  2  1  3  1 |
    | 0  3  1  2  0 | .
    | 1  2  0  2  4 |
    | 6  1  2  1  1 |

 1 2 1 0
 1 2 3  0143 
5. Find the adjoints of  0 1 1 and 2 201
 2 0 4 
   5 1 6 2 

6. Find the inverses of the following matrices:

        ⎛ 1  0  2 ⎞                ⎛ 0  1  3 ⎞                ⎛ 1  2  0 ⎞
    (a) ⎜ 2  3  1 ⎟ over ℚ;    (b) ⎜ 5  4  2 ⎟ over ℚ;    (c) ⎜ 2  1  0 ⎟ over ℤ3;
        ⎝ 3  4  1 ⎠                ⎝ 2  1  1 ⎠                ⎝ 0  2  2 ⎠

        ⎛ 1  0  7  2  3  ⎞
        ⎜ 6  9  8  1  6  ⎟
    (d) ⎜ 8  5  9  4  10 ⎟ over ℤ11.
        ⎜ 1  0  3  2  7  ⎟
        ⎝ 2  7  1  5  4  ⎠

7. Let K be a field and n ≥ 2. Prove that

    adjoint of (adjoint of A) = (det A)^{n−2} A

for any A ∈ Matn(K).

8. Let K be a field and n ≥ 2. Let x, y ∈ K and put

    dn = | x+y   xy    0     0    . . . |
         | 1     x+y   xy    0    . . . |
         | 0     1     x+y   xy   . . . |
         | . . . . . . . . . . . . . . .|

(an n×n determinant with x+y on the diagonal, xy just above it and 1
just below it). Express dn in terms of dn−1 and dn−2, and evaluate it in
closed form.

9. Evaluate the determinant

    | C(c+m−1, 0)    C(c+m, 1)      C(c+m+1, 2)    . . .  C(c+m+m−1, m)   |
    | C(c+m, 0)      C(c+m+1, 1)    C(c+m+2, 2)    . . .  C(c+m+m, m)     |
    |    . . .                                                            |
    | C(c+m+m−1, 0)  C(c+m+m, 1)    C(c+m+m+1, 2)  . . .  C(c+m+m+m−1, m) |

where C(u, v) denotes the binomial coefficient "u choose v".

10. Let q ∈ ℕ and assume that K is a field with q elements. Using Theorem
44.21, find the orders of the groups GL(n,K) and SL(n,K) (cf. §17, Ex. 17).
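For small n and q the orders asked for in Exercise 10 can be found by brute force and compared with the product formula ∏_{i=0}^{n−1}(qⁿ − qⁱ) that the counting argument leads to; a sketch (mine; it assumes that formula, which the reader is asked to derive):

```python
from itertools import product

def order_gl(n, q):
    """Predicted order of GL(n,q): product of (q^n - q^i) for i = 0..n-1."""
    r = 1
    for i in range(n):
        r *= q**n - q**i
    return r

def brute_counts(q):
    """Count invertible and determinant-1 2x2 matrices over Z_q directly."""
    gl = sl = 0
    for a, b, c, d in product(range(q), repeat=4):
        det = (a*d - b*c) % q
        if det != 0:
            gl += 1
            if det == 1:
                sl += 1
    return gl, sl

gl, sl = brute_counts(3)
print(gl == order_gl(2, 3), sl == order_gl(2, 3) // (3 - 1))  # → True True
```

The second check reflects the fact that SL(n,K) is the kernel of det: GL(n,K) → K*, so its index in GL(n,K) is q − 1.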

§45
Linear Equations

Let K be a field and αij, βi ∈ K (i = 1,2,. . . ,m; j = 1,2,. . . ,n). We ask if there
are elements x1, x2, . . . , xn in K such that

    α11x1 + α12x2 + . . . + α1nxn = β1
    α21x1 + α22x2 + . . . + α2nxn = β2
    .....................                                          (1)
    αm1x1 + αm2x2 + . . . + αmnxn = βm.

(1) is said to be a system of linear equations. We will not treat the
general problem here. Our objective in this paragraph is to derive
necessary and sufficient conditions for the solvability of (1) in the
special case m = n. Concerning the case m ≠ n, we will prove only the
following consequence of Theorem 42.21.

45.1 Theorem: Let K be a field and αij ∈ K (i = 1,2,. . . ,m; j = 1,2,. . . ,n). If
n > m, that is to say, if there are more unknowns than equations in the
system

    α11x1 + α12x2 + . . . + α1nxn = 0
    α21x1 + α22x2 + . . . + α2nxn = 0
    .....................                                          (2)
    αm1x1 + αm2x2 + . . . + αmnxn = 0,

then there are elements x1, x2, . . . , xn in K, not all of them being zero,
which satisfy the system (2).

Proof: Of course x1 = x2 = . . . = xn = 0 is a solution of (2), called the
trivial solution. We ask whether nontrivial solutions of (2) exist. The
claim is that there do exist nontrivial solutions of (2) when n > m.

Let A = (αij) ∈ Mat_{m×1}(K) . . . more precisely A = (αij) ∈ Mat_{m×n}(K). Setting

        ⎛ x1 ⎞                               ⎛ 0 ⎞
    X = ⎜ x2 ⎟ ∈ Mat_{n×1}(K)   and   0 =   ⎜ 0 ⎟ ∈ Mat_{m×1}(K),
        ⎜ .. ⎟                               ⎜ . ⎟
        ⎝ xn ⎠                               ⎝ 0 ⎠

we may write (2) as a matrix equation:

    AX = 0.

The problem is thus: given A ∈ Mat_{m×n}(K), is there a nonzero X in
Mat_{n×1}(K) such that AX = 0 ∈ Mat_{m×1}(K)?

Since A(N + M) = AN + AM and A(λN) = λ(AN) for any N, M ∈ Mat_{n×1}(K)
and λ ∈ K, the mapping

    φ: Mat_{n×1}(K) → Mat_{m×1}(K)
           N ↦ AN

is a vector space homomorphism. From Theorem 42.21, we obtain

    n = dim_K Mat_{n×1}(K) = dim_K Ker φ + dim_K Im φ
                           ≤ dim_K Ker φ + dim_K Mat_{m×1}(K)
                           = dim_K Ker φ + m,

so dim_K Ker φ ≥ n − m > 0, hence Ker φ ≠ {0}, and there does exist an
X ≠ 0 in Ker φ. So there is a nonzero X ∈ Mat_{n×1}(K)
with AX = 0, as was to be shown.  □
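Theorem 45.1 is constructive in spirit: Gaussian elimination actually produces a nontrivial solution when n > m. A minimal sketch (mine; the proof above is dimension-theoretic, not algorithmic) over ℤ_p with p prime:

```python
def kernel_vector(A, p):
    """Return a nonzero solution of AX = 0 over Z_p (p prime) by Gaussian
    elimination; A has m rows and n > m columns, so Theorem 45.1 applies."""
    m, n = len(A), len(A[0])
    M = [row[:] for row in A]
    pivots = {}                      # column -> row holding its pivot
    r = 0
    for c in range(n):
        pr = next((i for i in range(r, m) if M[i][c] % p), None)
        if pr is None:
            continue                 # no pivot in this column
        M[r], M[pr] = M[pr], M[r]
        inv = pow(M[r][c], p - 2, p)  # inverse mod the prime p
        M[r] = [x * inv % p for x in M[r]]
        for i in range(m):
            if i != r and M[i][c] % p:
                M[i] = [(x - M[i][c] * y) % p for x, y in zip(M[i], M[r])]
        pivots[c] = r
        r += 1
    free = next(c for c in range(n) if c not in pivots)  # exists since n > m
    x = [0] * n
    x[free] = 1                      # set one free unknown to 1, solve the rest
    for c, row in pivots.items():
        x[c] = (-M[row][free]) % p
    return x

A = [[1, 2, 3], [4, 5, 6]]           # 2 equations, 3 unknowns over Z_7
x = kernel_vector(A, 7)
print(x, all(sum(a*b for a, b in zip(row, x)) % 7 == 0 for row in A))
```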

45.2 Theorem: Let K be a field, (αij) ∈ Matn(K) and let β1, β2, . . . , βn be
elements of K. If det (αij) ≠ 0, then the system

    α11x1 + α12x2 + . . . + α1nxn = β1
    α21x1 + α22x2 + . . . + α2nxn = β2
    .....................                                          (3)
    αn1x1 + αn2x2 + . . . + αnnxn = βn

has a unique solution in K, given by

    xj = det Bj / det (αij)        (j = 1,2,. . . ,n),

where Bj is the matrix obtained from (αij) by replacing its j-th column by

    ⎛ β1 ⎞
    ⎜ β2 ⎟
    ⎜ .. ⎟ .
    ⎝ βn ⎠

Proof: Let A = (αij),

        ⎛ x1 ⎞                          ⎛ β1 ⎞
    X = ⎜ x2 ⎟ ∈ Mat_{n×1}(K),     B = ⎜ β2 ⎟ ∈ Mat_{n×1}(K).
        ⎜ .. ⎟                          ⎜ .. ⎟
        ⎝ xn ⎠                          ⎝ βn ⎠

Then (3) can be written as a matrix equation:

    AX = B.                                                        (4)

Multiplying both sides of (4) on the left by A⁻¹ = (1/det A)(adjoint of A)ᵗ,
we obtain

    X = (1/det A)(adjoint of A)ᵗ B.                                (5)

Also, multiplying both sides of (5) on the left by A, and using Theorem
44.12, we obtain (4). Thus (4) and (5) are equivalent. So the system (3)
or (4) has a unique solution, given by (5). In more detail, when we write
Aij for the cofactor of αij in A, so that (adjoint of A) = (Aij), the solution is
given by

    ⎛ x1 ⎞              ⎛ A11  A21  . . .  An1 ⎞ ⎛ β1 ⎞
    ⎜ x2 ⎟ = (1/det A)  ⎜ A12  A22  . . .  An2 ⎟ ⎜ β2 ⎟
    ⎜ .. ⎟              ⎜  . . . . . . . . .   ⎟ ⎜ .. ⎟
    ⎝ xn ⎠              ⎝ A1n  A2n  . . .  Ann ⎠ ⎝ βn ⎠

                        ⎛ A11β1 + A21β2 + . . . + An1βn ⎞
             = (1/det A) ⎜ A12β1 + A22β2 + . . . + An2βn ⎟
                        ⎜  . . . . . . . . .            ⎟
                        ⎝ A1nβ1 + A2nβ2 + . . . + Annβn ⎠ .

So xj = (1/det A)(β1A1j + β2A2j + . . . + βnAnj) for j = 1,2,. . . ,n. Comparing the
expression in parentheses with the expansion

    α1jA1j + α2jA2j + . . . + αnjAnj

(Theorem 44.15) of det (αij) along the j-th column, we see that the
parenthetical expression is the expansion, along the j-th column, of the
determinant of the matrix Bj that is obtained from (αij) by replacing its
j-th column by B. Thus

    xj = det Bj / det (αij)        (j = 1,2,. . . ,n),

as claimed.  □

The formula xj = det Bj / det (αij) is known as Cramer's rule after
G. Cramer (1704-1752).
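Cramer's rule translates directly into code; a small sketch (mine, not the book's) over the rationals using exact arithmetic:

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0] * det([r[1:] for r in (m[:i] + m[i+1:])])
               for i in range(len(m)))

def cramer(A, b):
    """Solve AX = B by x_j = det(B_j) / det(A); requires det(A) != 0."""
    d = det(A)
    sols = []
    for j in range(len(A)):
        # B_j: replace the j-th column of A by the column B
        Bj = [row[:j] + [b[i]] + row[j+1:] for i, row in enumerate(A)]
        sols.append(Fraction(det(Bj), d))
    return sols

A = [[2, 1, 1], [1, 3, 2], [1, 0, 0]]
b = [4, 5, 6]
print(cramer(A, b))  # → [Fraction(6, 1), Fraction(15, 1), Fraction(-23, 1)]
```

As Remark 45.4 below notes, this is of theoretical rather than practical interest: each unknown costs a full determinant computation.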

45.3 Theorem: Let K be a field and (αij) ∈ Matn(K). The system

    α11x1 + α12x2 + . . . + α1nxn = 0
    α21x1 + α22x2 + . . . + α2nxn = 0
    .....................                                          (6)
    αn1x1 + αn2x2 + . . . + αnnxn = 0

has a nontrivial solution in K (i.e., a solution distinct from the obvious
one x1 = x2 = . . . = xn = 0) if and only if det (αij) = 0.

Proof: If det (αij) ≠ 0, then the system has a unique solution by Theorem
45.2, which must be x1 = x2 = . . . = xn = 0, as follows also from Cramer's
rule, for the numerator determinants, having a column consisting of
zeroes only, are all equal to 0. Thus, if the system has a nontrivial solu-
tion in K, then det (αij) must be zero.

Suppose conversely that det (αij) = 0. Then the columns of (αij) are linear-
ly dependent over K (Theorem 44.21): there are elements γ1, γ2, . . . , γn in
K, not all of them being zero, such that

       ⎛ α11 ⎞      ⎛ α12 ⎞              ⎛ α1n ⎞   ⎛ 0 ⎞
    γ1 ⎜ α21 ⎟ + γ2 ⎜ α22 ⎟ + . . . + γn ⎜ α2n ⎟ = ⎜ 0 ⎟ .
       ⎜ ... ⎟      ⎜ ... ⎟              ⎜ ... ⎟   ⎜ . ⎟
       ⎝ αn1 ⎠      ⎝ αn2 ⎠              ⎝ αnn ⎠   ⎝ 0 ⎠

Thus x1 = γ1, x2 = γ2, . . . , xn = γn is a nontrivial solution of (6).  □

45.4 Remark: The theorems in this paragraph are chiefly of theoretical
interest. Finding solutions of specific systems by the methods described
in this paragraph would be very tedious.

Exercises

1. Find all solutions of the following systems of linear equations:

(a) 3x + 4y − 5z = 1
    2x − 3y + z = 3
    2x + y + 6z = 0;

(b) 4x + y − 5z − u = 1
    6x + 2y − 3z + 3u = 8
    4x + 5y − 2z + u = 3
    2x − 7z − 3u = 0.

2. Using Cramer's rule, find the solutions in ℤ13 of the following systems
of linear equations, where the coefficients denote residue classes modulo 13:

(a) 2x + 11y + 4z = 1
    3x + 8y + 5z = 6
    9x + 12y + 4z = 7;

(b) 2x + 11y + 4z + 3u = 5
    8x + 10y + 6z + 7u = 2
    1x + 9y + 2z + 8u = 6
    3x + 1y + 0z + 5u = 4.

§46
Algebras

In this last paragraph of Chapter 4, we consider multiplication of vectors.
If, on a vector space, there is an associative multiplication which is dis-
tributive over addition and compatible with multiplication by scalars,
the vector space is said to be an algebra. The formal definition is as
follows.

46.1 Definition: Let K be a field and (V,+) an abelian group. A
quintuple (V,+,∘,K,·) is called an algebra over K, or a K-algebra, provided
(i) (V,+,∘) is a ring,
(ii) (V,+,K,·) is a vector space,
(iii) (λ·a) ∘ b = λ·(a ∘ b) = a ∘ (λ·b) for all λ ∈ K, a,b ∈ V.

It is implicit in this definition that ∘ is a binary operation on V, called
multiplication, and · is a mapping from K × V into V, called multiplication
by scalars. As usual, we drop these symbols and write λa for λ·a, and ab
for a ∘ b. Then (iii) becomes a kind of associativity law: (λa)b = λ(ab) =
a(λb). As usual, we shall call V, rather than the quintuple (V,+,∘,K,·), a K-
algebra.

46.2 Examples: (a) Let K be a field and L a field containing K. Then L is an
algebra over K.

(b) Let K be a field. Then Matn(K) is a K-vector space (Theorem 43.4)
and also a ring (Theorem 43.11). Since (λA)B = λ(AB) = A(λB) for all λ in
K and A,B in Matn(K) (see (e) in §43, p. 523), we conclude that Matn(K)
is a K-algebra.

(c) Let K be a field and V a vector space over K. Then L_K(V,V) is a K-
vector space (Theorem 43.1) and also a ring (Theorem 43.12). Moreover,
whenever λ ∈ K and T,S ∈ L_K(V,V), there hold

    ((λT)S)v = (λT)(Sv) = λ(T(Sv)) = λ((TS)v) = (λ(TS))v
and
    (T(λS))v = T((λS)v) = T(λ(Sv)) = λ(T(Sv)) = λ((TS)v) = (λ(TS))v

for all v ∈ V, thus (λT)S = λ(TS) = T(λS). Thus L_K(V,V) is a K-algebra.

(d) Let K be a field and x an indeterminate over K. Then K[x] is a vector
space over K (Example 39.2(d)) and also a ring. We have (af(x))g(x) =
a(f(x)g(x)) = f(x)(ag(x)) for all a ∈ K and f(x),g(x) ∈ K[x]. Thus K[x] is an
algebra over K. Likewise the ring K[x1,x2,. . . ,xn] of polynomials in n inde-
terminates is an algebra over K.

46.3 Lemma: Let K be a field and V a finite dimensional vector space
over K. Suppose there is a multiplication on V which is distributive over
addition, and suppose that

    (λa)c = λ(ac) = a(λc) for all λ ∈ K and a,c ∈ V

(thus all conditions for V to be an algebra over K are satisfied except
that associativity of multiplication is open).
Let B be a K-basis of V. Then multiplication on V is associative and V is a
K-algebra if and only if

    (bb′)b′′ = b(b′b′′) for all b, b′, b′′ ∈ B.

Proof: If multiplication on V is associative, then (bb′)b′′ = b(b′b′′) holds
for all elements b, b′, b′′ of V, in particular, for all b, b′, b′′ in B.

Assume conversely that (bb′)b′′ = b(b′b′′) for all b, b′, b′′ in B. We put B
= {b1, b2, . . . , bn}. If x, y, z are arbitrary elements of V, we write them as

    x = ∑_{i=1}^{n} ξi bi,     y = ∑_{j=1}^{n} ηj bj,     z = ∑_{k=1}^{n} ζk bk

with suitable scalars ξi, ηj, ζk. Using distributivity and (iii), we find

    (xy)z = ((∑_i ξi bi)(∑_j ηj bj))(∑_k ζk bk) = (∑_{i,j} (ξi bi)(ηj bj))(∑_k ζk bk)

          = (∑_{i,j} ξi (bi (ηj bj)))(∑_k ζk bk) = (∑_{i,j} ξi (ηj (bi bj)))(∑_k ζk bk)

          = (∑_{i,j} (ξi ηj)(bi bj))(∑_k ζk bk) = ∑_{i,j,k} ((ξi ηj)(bi bj))(ζk bk)

          = ∑_{i,j,k} (ξi ηj)((bi bj)(ζk bk)) = ∑_{i,j,k} (ξi ηj)(ζk ((bi bj) bk))

          = ∑_{i,j,k} ((ξi ηj) ζk)((bi bj) bk)

and likewise

    x(yz) = (∑_i ξi bi)((∑_j ηj bj)(∑_k ζk bk)) = (∑_i ξi bi)(∑_{j,k} (ηj ζk)(bj bk))

          = ∑_{i,j,k} (ξi (ηj ζk))(bi (bj bk)).

Now (ξi ηj)ζk = ξi (ηj ζk) since the multiplication on K is associative, and
(bi bj)bk = bi (bj bk) by hypothesis, so (xy)z = x(yz). Hence the multiplication
on V is also associative.  □
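Lemma 46.3 reduces associativity to finitely many checks on a basis. A sketch (mine) for the special case where every product of basis elements is ± a basis element, encoded as a structure-constants table; the table below is that of the quaternion algebra of Example 46.4(b):

```python
# table[a][b] = (index, sign) meaning  b_a * b_b = sign * b_index.
# Basis order: e, i, j, k  (quaternion multiplication table).
TABLE = [
    [(0, 1), (1, 1), (2, 1), (3, 1)],
    [(1, 1), (0, -1), (3, 1), (2, -1)],
    [(2, 1), (3, -1), (0, -1), (1, 1)],
    [(3, 1), (2, 1), (1, -1), (0, -1)],
]

def associative_on_basis(table):
    """Check (b b')b'' = b(b' b'') for all basis triples, as in Lemma 46.3."""
    n = len(table)
    for a in range(n):
        for b in range(n):
            for c in range(n):
                ab, s1 = table[a][b]
                left = (table[ab][c][0], s1 * table[ab][c][1])
                bc, s2 = table[b][c]
                right = (table[a][bc][0], s2 * table[a][bc][1])
                if left != right:
                    return False
    return True

print(associative_on_basis(TABLE))  # → True
```

By the lemma, these n³ checks suffice; nothing about general elements of V needs to be examined.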

46.4 Examples: (a) Let K be a field and G a finite multiplicative group.
Let KG denote the K-vector space that has G as a K-basis. Thus the
elements of KG are sums ∑_{i=1}^{|G|} λi gi, where G = {g1, g2, . . . , g_{|G|}}. It will be
convenient to modify this notation as ∑_{g∈G} λg g. Two elements ∑_{g∈G} λg g and
∑_{g∈G} μg g of KG are equal if and only if λg = μg for each g ∈ G. The sum of
∑_{g∈G} λg g and ∑_{g∈G} μg g is ∑_{g∈G} (λg + μg)g, and the product of λ ∈ K by ∑_{g∈G} λg g
is ∑_{g∈G} (λλg)g. We now define a multiplication on KG by extending the
multiplication on G using distributivity. More precisely, we define the
product of ∑_{g∈G} λg g by ∑_{h∈G} μh h to be

    (∑_{g∈G} λg g)(∑_{h∈G} μh h) = ∑_{g,h∈G} λg μh gh = ∑_{k∈G} ( ∑_{gh=k} λg μh ) k.

For any a = ∑_{g∈G} λg g, b = ∑_{h∈G} μh h in KG and λ ∈ K, we have

    (λa)b = (∑_{g∈G} (λλg) g)(∑_{h∈G} μh h) = ∑_{k∈G} ( ∑_{gh=k} (λλg) μh ) k
          = λ ∑_{k∈G} ( ∑_{gh=k} λg μh ) k = λ(ab)

and similarly a(λb) = λ(ab). The reader will verify that the distributivity
laws are valid.

Each element g0 of G can be regarded as the element ∑_{g∈G} λg g in KG, where
λg = 0 if g ≠ g0 and λ_{g0} = 1. Thus we regard G as a subset of KG. It is
checked easily that, for any g,h ∈ G, the product gh in KG is the
product gh in G. Since multiplication on G is associative, and since G is a
K-basis of KG, we learn from Lemma 46.3 that KG is an algebra over K. It
is called the group algebra of G over K.
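The product in KG is a convolution of coefficient functions; a minimal illustration (mine, not the book's) for the cyclic group of order 3, with elements of KG stored as coefficient dictionaries:

```python
from itertools import product

# Group algebra KG for the cyclic group G = Z_3, written additively,
# so the group product "gh" is (g + h) % 3.
G = [0, 1, 2]

def kg_mul(a, b):
    """Product in KG: coefficient of k is the sum over gh = k of lambda_g * mu_h."""
    out = {g: 0 for g in G}
    for (g, lam), (h, mu) in product(a.items(), b.items()):
        out[(g + h) % 3] += lam * mu
    return out

a = {0: 1, 1: 2, 2: 0}               # 1*g0 + 2*g1
b = {0: 0, 1: 1, 2: 1}               # g1 + g2
print(kg_mul(a, b))  # → {0: 2, 1: 1, 2: 3}
```

For e = ∑_{g∈G} g, the same function gives e·e = 3e, in line with Exercise 2 below.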

(b) Let ℍ = ℝ⁴ be the four-dimensional ℝ-vector space of ordered
quadruples, and let e = (1,0,0,0), i = (0,1,0,0), j = (0,0,1,0), k = (0,0,0,1).
Thus {e,i,j,k} is a basis of ℍ over ℝ. We give a multiplication table for
these basis elements:

    ·   e    i    j    k

    e   e    i    j    k
    i   i   −e    k   −j
    j   j   −k   −e    i
    k   k    j   −i   −e

Thus ea = ae = a for any a ∈ {i,j,k} and the products of i,j,k are like the
cross products of the vectors i,j,k in ℝ³. The product of two distinct
elements from {i,j,k} is equal to ± the third, the sign being "+" for
products taken in the cyclic order i → j → k → i of the accompanying
diagram, and "−" in the reverse order.

By distributivity, we have the product formula:

    (αe + βi + γj + δk)(α′e + β′i + γ′j + δ′k) = (αα′ − ββ′ − γγ′ − δδ′)e
                                               + (αβ′ + βα′ + γδ′ − δγ′)i
                                               + (αγ′ − βδ′ + γα′ + δβ′)j
                                               + (αδ′ + βγ′ − γβ′ + δα′)k,

which may be taken as the definition of multiplication on ℍ. One checks
that this multiplication is distributive over addition, and that e is an
identity element. To prove the associativity of multiplication, we must
only verify the 4³ = 64 equations (ab)c = a(bc), where a,b,c ∈ {e,i,j,k}
(Lemma 46.3). This verification is left to the reader. The multiplication is
thus seen to be associative. One also finds immediately (λa)b = λ(ab) =
a(λb) for any λ ∈ ℝ and a,b ∈ ℍ. Thus ℍ is an algebra over ℝ. This alge-
bra was discovered by the Irish mathematician W. R. Hamilton (1805-
1865). The elements of ℍ are called quaternions, and ℍ is known as the
Hamiltonian algebra of quaternions. It is not commutative, since ij = k ≠
−k = ji, for example.

Since e is the identity of ℍ, we will write 1 instead of e and α instead of
αe (here α ∈ ℝ). Then any real number α can be thought of as a
quaternion α1 = α + 0i + 0j + 0k. In like manner, any complex number
α + βi (where α, β ∈ ℝ) can be considered as a quaternion α + βi + 0j + 0k.
In this way, we may suppose that ℝ and ℂ are subrings of ℍ.

For any a ∈ ℍ, say a = α + βi + γj + δk with α, β, γ, δ ∈ ℝ, we say α is the real
part of a and βi + γj + δk is the imaginary part of a. We also put ā =
α − βi − γj − δk and call ā the conjugate of a. It is easily seen that the
conjugate of ab is b̄ā for any a,b ∈ ℍ (note the reversal of the conjugates).
We define the norm of a, denoted N(a), to be aā. Thus N(α + βi + γj + δk)
is equal to α² + β² + γ² + δ². Note that N(a) ∈ ℝ. Clearly N(a) = 0 if and
only if a = 0.

There holds N(ab) = (ab)(b̄ā) = a(bb̄)ā = aN(b)ā = N(b)aā = N(b)N(a) = N(a)N(b)
for any quaternions a,b ∈ ℍ. This is equivalent to the identity

    (α² + β² + γ² + δ²)(α′² + β′² + γ′² + δ′²) = (αα′ − ββ′ − γγ′ − δδ′)²
                                               + (αβ′ + βα′ + γδ′ − δγ′)²
                                               + (αγ′ − βδ′ + γα′ + δβ′)²
                                               + (αδ′ + βγ′ − γβ′ + δα′)²,

which holds in fact in any commutative ring. Thus the product of two
numbers, each of which is a sum of four squares, is also a sum of four
squares.
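The identity N(ab) = N(a)N(b), equivalently Euler's four-square identity, can be verified mechanically; a sketch (mine) using the product formula above on small integer quaternions:

```python
from itertools import product

def qmul(a, b):
    """Hamilton product of quaternions given as 4-tuples (alpha, beta, gamma, delta)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def norm(a):
    return sum(x*x for x in a)

# Exhaustive check over all quaternions with coefficients in {-1, 0, 1}.
ok = all(norm(qmul(a, b)) == norm(a) * norm(b)
         for a, b in product(product(range(-1, 2), repeat=4), repeat=2))
print(ok)  # → True
```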

Just like we divide a complex number a = α + βi by a nonzero complex
number b = γ + δi by multiplying the numerator and denominator of a/b
by the conjugate b̄ = γ − δi of b:

    a/b = (α + βi)/(γ + δi) = ((α + βi)(γ − δi))/((γ + δi)(γ − δi))
        = (αγ + βδ)/(γ² + δ²) + ((βγ − αδ)/(γ² + δ²))i,

we can divide any quaternion a by any nonzero quaternion b by multi-
plying the "numerator" and "denominator" of a/b by the conjugate b̄:

    a/b = ab̄/(bb̄) = ab̄/N(b).

More exactly, any nonzero quaternion b has a multiplicative inverse
(1/N(b))b̄. Thus ℍ is a division ring. An algebra which is a division ring
is called a division algebra. So ℍ is a division algebra.
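Division by a nonzero quaternion can be sketched the same way; an illustration (mine) of the inverse (1/N(b))b̄ with exact rational arithmetic:

```python
from fractions import Fraction

def conj(a):
    a0, a1, a2, a3 = a
    return (a0, -a1, -a2, -a3)

def inverse(b):
    """Multiplicative inverse (1/N(b)) * conj(b) of a nonzero quaternion."""
    n = sum(x*x for x in b)
    return tuple(Fraction(x, n) for x in conj(b))

def qmul(a, b):
    """Hamilton product, as in the text's product formula."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

b = (1, 2, -1, 3)
print(qmul(b, inverse(b)))  # equals the quaternion 1, i.e. (1, 0, 0, 0)
```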

An interesting theorem of F. G. Frobenius (1849-1917) states that ℝ, ℂ
and ℍ are the only finite dimensional division algebras over ℝ.

(c) The last example can be generalized. Let K be a field in which 1 + 1 is
distinct from 0. Let Q = K⁴ be the four-dimensional K-vector space of
ordered quadruples, and let e = (1,0,0,0), i = (0,1,0,0), j = (0,0,1,0), k =
(0,0,0,1). Thus {e,i,j,k} is a basis of Q over K. We define a multiplication
on Q by

    (αe + βi + γj + δk)(α′e + β′i + γ′j + δ′k) = (αα′ − ββ′ − γγ′ − δδ′)e
                                               + (αβ′ + βα′ + γδ′ − δγ′)i
                                               + (αγ′ − βδ′ + γα′ + δβ′)j
                                               + (αδ′ + βγ′ − γβ′ + δα′)k.

This multiplication is associative, distributive over addition, and e is an
identity element. One checks easily (λa)b = λ(ab) = a(λb) for any λ ∈ K
and a,b ∈ Q. Thus Q is a K-algebra. Q is called the algebra of quaternions
over K. This time it will be convenient not to identify λ ∈ K with λe ∈ Q.

The conjugate ā of a = αe + βi + γj + δk ∈ Q is defined to be αe − βi − γj − δk
and the norm N(a) of a to be aā. Thus N(αe + βi + γj + δk) = α² + β² + γ² + δ².
If K is a field such that α² + β² + γ² + δ² = 0 implies α = β = γ = δ = 0, then
any nonzero a ∈ Q has a multiplicative inverse (1/N(a))ā and Q is a
division algebra. Otherwise, Q has zero divisors: there is a nonzero a ∈ Q
such that aā = 0.
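Over a finite field the norm form always has nontrivial zeros, so the zero divisors just described really occur; a small search (mine; the choice p = 7 is arbitrary) exhibiting a nonzero a with aā = 0 in the quaternion algebra over ℤ₇:

```python
from itertools import product

P = 7  # arithmetic in the field Z_7

def qmul_modp(a, b, p=P):
    """The text's quaternion product formula, with coefficients reduced mod p."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return ((a0*b0 - a1*b1 - a2*b2 - a3*b3) % p,
            (a0*b1 + a1*b0 + a2*b3 - a3*b2) % p,
            (a0*b2 - a1*b3 + a2*b0 + a3*b1) % p,
            (a0*b3 + a1*b2 - a2*b1 + a3*b0) % p)

# Find a nonzero a with N(a) = 0 in Z_7; then a * conj(a) = N(a) e = 0,
# so the quaternion algebra over Z_7 has zero divisors.
a = next(q for q in product(range(P), repeat=4)
         if q != (0, 0, 0, 0) and sum(x*x for x in q) % P == 0)
conj_a = (a[0], -a[1] % P, -a[2] % P, -a[3] % P)
print(a, qmul_modp(a, conj_a))  # a is nonzero, yet a * conj(a) = 0
```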

Exercises

1. Multiply 2 + 3(12) + (13) − 2(23) + (123) − 3(132) by 1 + 2(12) + 4(13)
− 3(23) + 2(123) + (132) in the group algebra ℚS₃.

2. Let G be a finite group, K a field. Put e = ∑_{g∈G} g ∈ KG. Show that
e² = |G|e.

3. Let K be a field and A an algebra over K. Prove that the center Z(A) of
A (see §32, Ex. 1) is a subspace of A.

4. Let G be a finite group and K a field. Show that dim Z(KG) is equal to
the number of conjugacy classes in G.

5. For any a ∈ ℍ, show that there are real numbers t,n such that
a² − ta + n = 0.

6. Prove that a²iai + ia²ia − iaia² − aia²i = 0 for any a ∈ ℍ.

7. Let a,b ∈ ℍ. Show that ab = ba if and only if 1, a, b are linearly
dependent over ℝ.

8. Prove that {±1, ±i, ±j, ±k} is a group isomorphic to Q₈ (see §17,
Ex. 15) and that S = {±1, ±i, ±j, ±k, (±1 ± i ± j ± k)/2} is a group
isomorphic to SL(2,ℤ₃). Show that {±1} ⊴ S and S/{±1} ≅ A₄.

9. Prove that the quaternion algebra over ℂ is isomorphic (as ring and
ℂ-vector space) to the ℂ-algebra Mat₂(ℂ).

10. Let K be a field in which 1 + 1 ≠ 0 and α, β nonzero elements of K. Let
A be the four dimensional K-vector space with K-basis e,i,j,k. On A, we
define a multiplication by the multiplication table on the basis elements:

    ·   e     i     j     k

    e   e     i     j     k
    i   i     αe    k     αj
    j   j    −k     βe   −βi
    k   k   −αj     βi   −αβe

(a) Prove that this multiplication makes A into a K-algebra (ℍ is the
special case K = ℝ, α = β = −1).

(b) Show that the center of A is {κe ∈ A : κ ∈ K} and that A has no
ideals aside from 0 and A.

(c) Define the conjugate ā of a = κe + λi + μj + νk ∈ A to be
κe − λi − μj − νk and the norm N(a) of a to be aā. Verify that the
conjugate of ab is b̄ā and that N(ab) = N(a)N(b) for any a,b ∈ A.

(d) Prove that A is a division algebra if and only if N(a) ≠ 0 for any
nonzero a ∈ A, and that this holds if and only if κ₀² = ακ₁² + βκ₂² implies
κ₀ = κ₁ = κ₂ = 0 for any κ₀, κ₁, κ₂ ∈ K.

(e) If K is finite, say with q elements, show that there are (q + 1)/2
elements in {ακ₁² ∈ K : κ₁ ∈ K} and in {1 − βκ₂² ∈ K : κ₂ ∈ K}, and
conclude that A is not a division algebra. (This is a special case of an
important theorem due to J. H. M. Wedderburn (1882-1948) which states
that any finite division algebra is a field.)

11. If 1 + 1 = 0 in a field K and A is as in Ex. 10, show that the mapping
x ↦ x² is a ring homomorphism from A into A.

CHAPTER 5
Fields

§47
Historical Introduction

For a long time in the history of mathematics, algebra was understood to


be the study of roots of polynomials.

This must be clearly distinguished from the numerical computation of the
roots of a given specific polynomial. The Newton-Horner method is the
best known procedure for evaluating roots of polynomials. The actual
calculation of roots was (and is) a minor point. The principal object of
algebra was understanding the structure of the roots: how they depend
on the coefficients, whether they can be given by a formula, etc.

There is, of course, the related question concerning the existence of roots
of polynomials. Does every polynomial have a root? Here the coefficients
of polynomials were implicitly understood to be real numbers. A. Girard
(1595-1632) expressed that any polynomial has a root in some realm of
numbers (not necessarily in the realm of complex numbers), without
indicating any method of proof. R. Descartes (1596-1650) noted that x −
c is a divisor of a polynomial if c is a root of that polynomial, and gave a
rule for determining the number of real roots in a specified interval. He
makes an obscure remark about the existence of roots. Euler stated that
any polynomial has a root in complex numbers. This result came to be

called the fundamental theorem of algebra, a very inappropriate name.
Euler proved it rigorously for polynomials of degree ≤ 6. J. R. D'Alembert
(1717-1783), Lagrange and P. S. Laplace (1749-1827) made attempts to
prove this statement. As Gauss criticized, their proofs actually assume
the existence of a root in some realm of numbers, and show that the
root is in fact in ℂ. Gauss himself gave several proofs, some of which
cannot be accepted as rigorous by modern standards. Nevertheless,
Gauss has the credit for having given the first valid demonstration of the
so-called fundamental theorem of algebra. After Kronecker established
in 1882 that any polynomial has a root in some realm of "numbers" (see
§51), the earlier attempts became rigorous proofs. The really
fundamental theorem is Kronecker's theorem.

This assures the existence of roots, but does not bring insight to the
problem of understanding the nature of roots, any more than existence
theorems about differential equations give solutions of differential
equations or information about their analytic behavior, singularities,
asymptotic expansions, etc.

The solution of quadratic equations was known to many ancient
civilizations. Cubic and biquadratic polynomials (that is, polynomials
of degree four) were treated by Italian mathematicians. Scipione del
Ferro (1465-1526) succeeded in solving the cubic equation x³ + ax = b
(1515) in terms of radical expressions. In 1535, Tartaglia (1499/1500-
1557) solved the cubic polynomial of the form x³ + ax² = b. G. Cardan
(1501-1576) substituted x − (b/3) for x and transformed the general
cubic x³ + bx² + cx + d to a form in which the x² term is absent. Thus
assuming, with no loss of generality, the equation to be x³ + px + q = 0, a
formula for the roots is found to be

    x = ∛( −q/2 + √((q/2)² + (p/3)³) ) + ∛( −q/2 − √((q/2)² + (p/3)³) ).

This is known as Cardan's formula, but it is actually Tartaglia who found
it and divulged it to Cardan under pledge of secrecy; Cardan later broke
his promise and published it in his book Ars Magna (1545). Cardan's
originality lay in reducing the general cubic to one of the form x³ + px + q,
discussing the so-called irreducible case, noting that a cubic can have at
most three roots, and making an introduction to the theory of symmetric
polynomials.
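Cardan's formula can be checked numerically; a short sketch (the sample values of p and q are mine) for a depressed cubic with one real root:

```python
# Numerical check of Cardan's formula for x^3 + p*x + q = 0,
# in the case with a single real root, i.e. (q/2)^2 + (p/3)^3 > 0.
p, q = 3.0, -4.0

disc = (q / 2) ** 2 + (p / 3) ** 3            # (q/2)^2 + (p/3)^3
u = (-q / 2 + disc ** 0.5) ** (1 / 3)
w = -q / 2 - disc ** 0.5
v = -((-w) ** (1 / 3)) if w < 0 else w ** (1 / 3)  # real cube root of w
x = u + v

print(abs(x ** 3 + p * x + q) < 1e-9)  # → True: x is a root
```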

Cardan's book contains a method for finding roots of biquadratic poly-
nomials (that is, polynomials of degree four) discovered by his pupil L.
Ferrari (1522-1565) around 1540. This book made a great impact on the
development of algebra. Cardan even calculated with complex numbers,
which manifested themselves to be indispensable. Contrary to what one
may at first be inclined to believe, there was no need for complex
numbers as far as quadratic equations are concerned: mathematicians
had declared such equations as x² = −1 simply unsolvable. However, in
Cardan's formula, one has to take square roots of negative numbers even
if all the roots are real (the irreducible case). In fact, the roots of a cubic
polynomial whose three roots are real cannot be expressed by a formula
involving real radicals only (Lemma 59.30).

Thus the first half of the 16th century witnessed remarkable achievements
in algebra. As late as 1494, Fra Luca Pacioli had expressed that a cubic
equation cannot be solved by radicals, and by 1540 both the cubic and
the biquadratic equation were solved by radicals. The next step would be
to find a formula for the roots of a quintic polynomial (that is, a poly-
nomial of degree five) and, better still, of a polynomial of n-th degree
in general.

Other solutions of polynomial equations of degree ≤ 4 were later
given by Descartes, Walter von Tschirnhausen (ca. 1690) and Euler.
Noted mathematicians tried in vain to find a formula for the roots of a
quintic polynomial. Mathematicians began to suspect that a quintic
polynomial equation cannot be solved by radicals.

Lagrange published in 1770-1771 a long paper "Réflexions sur la résolu-
tion algébrique des équations" in which he studied extensively all
known methods of solution of polynomial equations. His aim was to
derive a general procedure from the known methods for finding roots of
polynomials. He treated quadratic, cubic and biquadratic polynomials in
detail, and succeeded in subsuming the various methods under one
general principle. The roots of a polynomial are expressed in terms of a
quantity t, called the resolvent, and the resolvent t itself is the root of an
auxiliary polynomial, called the resolvent polynomial. When the degree
of the given polynomial is n, the resolvent polynomial is of degree (n − 1)!
in xⁿ. For n ≤ 4, the auxiliary equation has therefore a smaller degree
than the given polynomial, and can be solved algebraically (by
induction), but for n ≥ 5, solving the auxiliary equation is not easier
than solving the original equation.

The resolvent is a function of the roots which is invariant under some
but not all of the permutations of the roots. For example, when the
degree is four, r₁r₂ + r₃r₄ does not change if the roots r₁,r₂ and r₃,r₄ are
interchanged. Lagrange was thus led to the permutations of the roots, i.e.,
he investigated, without appropriate terminology and notation, the
symmetric group on n letters. (Incidentally, the degree of the resolvent
polynomial is a divisor of n!, the order of the symmetric group. This is
how Lagrange came to Theorem 10.9.)

Lagrange noted that, in the successful cases n ≤ 4, the resolvent has the
form r₁ + ωr₂ + ω²r₃ + . . . + ωⁿ⁻¹rₙ, where the rᵢ are the roots of the
polynomial and ω is a root of xⁿ − 1. This type of resolvent does not work
in case n = 5, but it is conceivable that expressions of some other kind
could work as resolvents. Lagrange studied which types of expressions
could be resolvents.

In 1799, P. Ruffini (1765-1822) claimed a proof of the impossibility of
solving the general quintic equation algebraically, but whether his proof
was rigorous remained controversial. In 1826, Abel gave the first
complete proof of this impossibility theorem. His proof consists of two
parts. In the first part, he found that the general form of a resolvent must
be as in Lagrange's description for the cubic and biquadratic cases; in the
second part, he demonstrated that it can never be a root of a polynomial
of fifth degree. He added, without proof, that the general equation
cannot be solved algebraically if the degree is greater than 5. In addition
to the general equation, Abel also investigated which special equations
can be solved by radicals. He proved a theorem which reads, in modern
terminology, that an equation is solvable by radicals if the associated
Galois group is commutative. It is in this connection that commutative
groups are called abelian.

Abel thus finally demonstrated that the general equation cannot be
solved by radicals. "General polynomial" means that the coefficients are
independent variables or, more in the spirit of algebra, indeterminates.
Abel's theorem does not say anything about polynomials whose
coefficients are fixed complex numbers. But some polynomial equations
with constant coefficients of degree five or greater are solvable by
radicals. What is the criterion for a polynomial equation to be solvable
by radicals? This question was resolved by the French mathematician
Évariste Galois (1811-1832). With Galois, the principal subject matter of
algebra definitely ceased to be polynomial equations. Galois marks the
beginning of modern algebra, which means the study of algebraic struc-
tures (groups, rings, vector spaces, fields, and many others).

* *

Galois had a short and dramatic life. He began publishing articles while
he was a pupil in the Lycée (1828). He was a remarkable talent and a
difficult student. He wanted to enter the École Polytechnique, but failed
twice in the entrance examinations. The reason, he said later, was that
the questions were so simple that he refused to answer them. He later
entered the École Normale (1829), but was expelled from it due to a letter
in the student newspaper. His unbearable pride was notorious. He became
politicized, was sent to jail for some months, then began a liaison with
"une coquette de bas étage" and died in an obscure duel (1832).

Galois' achievements were not appreciated by his contemporaries.
He submitted several papers to the French Academy, but these were
rejected as unintelligible. It was not until J. Liouville (1809-1882)
published his memoirs in 1846 that the world came to know Galois and
realize him to be one of the greatest mathematicians of all time.

Galois associated, with each resolvent equation, a field intermediate
between the field of the coefficients of the polynomial and the field of
the roots. His ingenious idea was to associate, with the given polynomial
and the intermediate fields, a series of groups, and to translate assertions
about fields into group-theoretical statements. This involved the
clarification of the field and group concepts. The theory of groups was
founded by Galois. He proved that a polynomial equation is solvable by
radicals if and only if, in the series of groups, each group is normal and
of prime index in the next one, i.e., if and only if the group of the
polynomial is solvable in the sense of Definition 27.19 (Theorem 27.25).

It should be noted that this criterion is not an effective procedure to
determine actually whether a polynomial equation is solvable by
radicals. His contemporaries expected that "the condition of solvability, if
it exists, ought to have an external character which can be verified by
inspecting the coefficients of a given equation or, all the better, by
solving other equations of degrees lower than that of the equation to be
solved."¹ His is not a workable test that effectively decides if an
equation is solvable by radicals. Galois himself writes: "If now you give
me an equation that you have chosen at pleasure, and if you want to
know if it is or if it is not solvable by radicals, I need do nothing more
than indicate to you the means of answering your question, without
wanting to give myself or anyone else the task of doing it. In a word, the
calculations are impractical."² But this is the whole point. Who cares
about solvability of polynomial equations? What Galois achieved, and
what his contemporaries failed to appreciate, is a fascinating parallel
between the group and field structures. The group-theoretical
solvability condition is at best a trivial application of the theory.

This was too big a change in algebra and in mathematics, and heralded
the end of an era when mathematics was the science of numbers and
figures. Ever since the time of Gauss and Galois, mathematics has been the
science of structures. Galois theory is the first mathematical theory that
compares two different structures: fields and groups. It was not easy to
follow this development. Even mathematicians of later generations
conceived Galois theory as a tool for answering certain questions in the
theory of equations. The first writer on Galois theory who clearly
differentiated between the theory and its applications was Heinrich Weber
(1842-1913). In his famous textbook on algebra (1894), the exposition
of the theory occupies one chapter, its applications another.

The first writer on Galois theory was E. Betti (1823-1892). He published a
paper "Sulla risoluzione delle equazioni algebriche" in 1852, in which he
closely follows Galois' line. This is more of a commentary than an original
exposition. Here the concepts of conjugacy and of factor groups appeared
dimly. Another commentator on Galois theory is J. A. Serret
(1819-1885).

Camille Jordan (1838-1922) gave the first exposition of Galois theory
that does not follow Galois' own line. With Jordan, the emphasis shifted
from polynomials to groups. He made many important original contributions.
Among other things, he clarified the relationship between irreducible
polynomials and transitive groups, developed the theory of transitive
groups, defined factor groups as the group of the auxiliary equation,

introduced composition series, proved that the composition factors in
any two composition series of a solvable group are isomorphic. The
group concept became central, but solving polynomial equations still
remained the major concern.

At the same time, two German mathematicians, L. Kronecker and R.
Dedekind (1831-1916), were making very significant contributions to
field theory.

Dedekind lectured on Galois theory as early as 1856. He seems to be the
first mathematician who realized that the Galois group should be
regarded as an automorphism group of a field rather than as a group of
permutations. In fact, he uses the term "permutation" for what we now
call a field automorphism. This means, of course, that he very rightly
recognized the theory as a theory about fields, not as a theory about
polynomials. He introduced the notion of dependence/independence of
elements in an extension field over the base field.

Kronecker discussed adjunction in detail. He noted that it is possible to
adjoin transcendental elements as well as algebraic ones to a field, and
proved the important theorem that any polynomial splits into linear
factors in some extension field.

Weber carried Kronecker's and Dedekind's ideas further. His exposition,
the first modern treatment of the subject, is not restricted to ℚ, but
rather deals with an arbitrary field. He clearly states that the theory is
about field extensions and automorphism groups of these extensions.
Weber was far ahead of his time. Many mathematicians of his time
found his treatment abstract and difficult.

Then came Emil Artin (1898-1962). He combined techniques of linear
algebra and field theory. Extensions are sometimes regarded as fields,
sometimes as vector spaces, whichever may be convenient. He studied
automorphisms of fields, proved that the degree of an extension is equal
to the order of the automorphism group, introduced the notion of a
Galois extension, and abolished the role of the resolvent (primitive
element). This latter was an ugly aspect of the theory of field
extensions, a remnant of earlier times when the theory had been regarded
as one about polynomials. Artin then set up the correspondence between
intermediate fields and subgroups of the automorphism group. All
computations are eliminated from the theory. Where an earlier writer
would spend many pages on the step-by-step adjunction of resolvents
to construct a splitting field, we see Artin merely write: "Let E be a
splitting field of f(x)." With Artin, Galois theory lost all its connections
with its past. It is interesting to note what Artin does to applications of
the theory to polynomial equations. In his book Galois Theory, applica-
tions are sharply separated from the main text: they can be no more
than an appendix; but Artin does not even condescend to write the
appendix himself: this task is relegated to one of his students.

______________________________________________
¹ Poisson, quoted from Kiernan's article (see References), page 76.
² Galois, quoted from Edwards' book Galois Theory, page 81.

§48
Field Extensions

We recall a technical term from Example 39.2(f).

48.1 Definition: Let E be a field and let K be a nonempty subset of E. If
K itself is a field under the operations defined on E, then K is called a
subfield of E. In this case, E is called an extension field of K, or simply an
extension of K.

We write E/K to denote that E is an extension of K, and speak of the field
extension E/K. Confusion with a factor group or a factor space is not
likely. We will frequently employ Hasse diagrams (see §21) for field ex-
tensions. For example, the picture

E
|
K

will mean that K is a subfield of E.

As in the case of subgroups, subrings and subspaces, we have a subfield
criterion.

48.2 Lemma (Subfield criterion): Let E be a field and K a nonempty
subset of E. Then K is a subfield of E if and only if
(i) a + b ∈ K,
(ii) −b ∈ K,
(iii) ab ∈ K,
(iv) b⁻¹ ∈ K (in case b ≠ 0)
for all a, b ∈ K.

Proof: A field is a ring in which the nonzero elements form a commuta-
tive group under multiplication (see the remarks after Definition 29.13).
Thus E is a ring of this type, and K is a subfield of E if and only if K is a
subring of E such that the nonzero elements in K form a commutative
group under multiplication. Certainly, every subgroup of E* = E\{0} is
commutative. Thus K is a subfield of E if and only if K is a subring of E
and K\{0} is a subgroup of E*. Now K is a subring of E if and only if
(i), (ii), (iii) hold, and K\{0} is a subgroup of E* if and only if

(iii)′ ab ∈ K\{0} for all a, b ∈ K\{0}

and (iv) hold. Since K ⊆ E and the field E has no zero divisors, (iii)′ is
weaker than (iii), and we conclude that K is a subfield of E if and only if
(i), (ii), (iii), (iv) hold. □

From now on, we will write 1/b for the inverse b⁻¹ of a nonzero el-
ement in a field. Likewise, we will write a/b for the product ab⁻¹ =
b⁻¹a of two elements a, b in a field (assuming b ≠ 0). It follows from
Lemma 48.2 that, whenever K is a subfield of E and a, b ∈ K, then

a + b, a − b, ab, a/b

belong to K, it being assumed b ≠ 0 in the last case. A subfield of E is
therefore a nonempty subset of E that is closed under addition, subtrac-
tion, multiplication and division (by nonzero elements).
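The closure description above can be tested mechanically for the field ℚ(√2) of Example 48.3(d) below. The following sketch is our own illustration, not part of the text: it models an element x + y√2 as a pair of exact rationals and checks the four operations of Lemma 48.2 (the class name `QSqrt2` and its methods are invented here for the purpose).

```python
from fractions import Fraction

class QSqrt2:
    """An element x + y*sqrt(2) of Q(sqrt(2)), stored as exact rationals."""
    def __init__(self, x, y):
        self.x, self.y = Fraction(x), Fraction(y)
    def __add__(self, b):
        return QSqrt2(self.x + b.x, self.y + b.y)          # closure (i)
    def __neg__(self):
        return QSqrt2(-self.x, -self.y)                    # closure (ii)
    def __mul__(self, b):
        # (x + y*sqrt2)(z + u*sqrt2) = (xz + 2yu) + (xu + yz)*sqrt2: closure (iii)
        return QSqrt2(self.x * b.x + 2 * self.y * b.y,
                      self.x * b.y + self.y * b.x)
    def inverse(self):
        # 1/(z + u*sqrt2) = (z - u*sqrt2)/(z^2 - 2u^2); the denominator is
        # nonzero for (z, u) != (0, 0) since sqrt(2) is irrational: closure (iv)
        d = self.x * self.x - 2 * self.y * self.y
        return QSqrt2(self.x / d, -self.y / d)
    def __eq__(self, b):
        return self.x == b.x and self.y == b.y

a = QSqrt2(1, 2)      # 1 + 2*sqrt(2)
b = QSqrt2(3, -1)     # 3 - sqrt(2)
assert b * b.inverse() == QSqrt2(1, 0)   # b / b = 1 stays in the set
assert (a * b) * b.inverse() == a        # division via (iii) and (iv)
```

Because the coordinates are `Fraction`s, every check is exact; no floating-point approximation of √2 is involved.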

48.3 Examples: (a) ℝ is an extension of ℚ, and ℂ is an extension of ℝ.
Also ℚ is a subfield of ℂ.

(b) If K is any field and x an indeterminate over K, then K is a subfield
of K(x) (provided we identify, as usual, an element a of K with the
rational function a/1, where the numerator and denominator are elements
of K ⊆ K[x]). Similarly K is a subfield of K(x,y), where y is another
indeterminate over K.

(c) Let ℚ(i) := {x + yi ∈ ℂ : x, y ∈ ℚ}. For any a, b in ℚ(i), say a = x + yi
and b = z + ui with x, y, z, u ∈ ℚ, we have
(i) a + b = (x + z) + (y + u)i ∈ ℚ(i),
(ii) −b = (−z) + (−u)i ∈ ℚ(i),
(iii) ab = (xz − yu) + (xu + yz)i ∈ ℚ(i),
(iv) b⁻¹ = z/(z² + u²) − (u/(z² + u²))i ∈ ℚ(i), provided b = z + ui ≠
0 + 0i = 0.
So ℚ(i) is a subfield of ℂ. It is in fact the field of fractions of ℤ[i], and is
called the gaussian field.

(d) ℚ(√2) := {x + y√2 ∈ ℝ : x, y ∈ ℚ} is a subfield of ℝ. Indeed, for any
a, b in ℚ(√2), say a = x + y√2 and b = z + u√2 with x, y, z, u ∈ ℚ, we have
(i) a + b = (x + z) + (y + u)√2 ∈ ℚ(√2),
(ii) −b = (−z) + (−u)√2 ∈ ℚ(√2),
(iii) ab = (xz + 2yu) + (xu + yz)√2 ∈ ℚ(√2),
(iv) b⁻¹ = z/(z² − 2u²) − (u/(z² − 2u²))√2 ∈ ℚ(√2), provided b =
z + u√2 ≠ 0 + 0√2 = 0. Here we use the fact that √2 is an irrational
number (Example 35.11), so that z² − 2u² ≠ 0 whenever z, u ∈ ℚ are
not both zero.
(e) Let L = {x + y∛2 : x, y ∈ ℚ} ⊆ ℝ. Then L is not a subfield of ℝ
since, for example, ∛2 ∈ L but ∛2·∛2 ∉ L (why?). On the other hand,

ℚ(∛2) := {x + y∛2 + z∛4 : x, y, z ∈ ℚ} = {x + y∛2 + z(∛2)² : x, y, z ∈ ℚ}

is a subfield of ℝ. The proof of b ∈ ℚ(∛2)\{0} ⟹ 1/b ∈ ℚ(∛2) is left
to the reader.

(f) Let K be a field and let Kᵢ (i ∈ I) be a family of subfields of K. Then
⋂_{i∈I} Kᵢ is a subfield of K, for the closure properties in Lemma 48.2 hold
for ⋂_{i∈I} Kᵢ if they hold for each of the Kᵢ.

From the last example, we infer that the intersection of all subfields of a
field K is a subfield of K. Note that the intersection is taken over a
nonempty set, since at least K is a subfield of K.

48.4 Definition: Let K be a field. The intersection of all subfields of K is
called the prime subfield of K.

Thus every subfield of K contains (is an extension of) the prime subfield
of K. We want to describe the elements in the prime subfield of K. Let P
denote the prime subfield of K. In order to distinguish clearly between
the integer 1 and the identity element of K, we will denote, in this
discussion, the identity element of K by e. We know 0 ∈ P, e ∈ P and 0 ≠ e
because P is a field. Now P is a group under addition, so e + e = 2e, 2e + e
= 3e, 3e + e = 4e, . . . are elements of P, and so also are −e, −2e, −3e, −4e, . . . .

Hence . . . , −4e, −3e, −2e, −e, 0, e, 2e, 3e, 4e, . . .

all belong to P: we have {me ∈ K : m ∈ ℤ} ⊆ P. Moreover, P is closed
under division (by nonzero elements), and so P₀ := {me/ne ∈ K : m, n ∈ ℤ, ne ≠ 0}
is a subset of P. It is natural to expect that P₀ is a subfield of K (and thus
P₀ = P): for any me/ne, re/se ∈ P₀ with m, n, r, s ∈ ℤ, we presumably have

(i) me/ne + re/se = (ms + rn)e/(ns)e ∈ P₀,
(ii) −(re/se) = (−r)e/se ∈ P₀,
(iii) (me/ne)(re/se) = (mr)e/(ns)e ∈ P₀,
(iv) (re/se)⁻¹ = se/re ∈ P₀, provided re/se ≠ 0, i.e., re ≠ 0.

These are in fact true, but care must be exercised in justifying (i), (ii), (iii),
(iv). This is done in the next theorem, which states that P is isomorphic
either to ℚ or to ℤₚ for some prime number p.

48.5 Theorem: The prime subfield of any field K is isomorphic to ℚ or
to ℤₚ for some prime number p (ring isomorphism).

Proof: Let e be the identity of K and let P be the prime subfield of K.
Then 1e = e ≠ 0. We distinguish two cases, according as there
does or does not exist an integer n ∈ ℤ\{0} satisfying ne = 0.

Case 1. Assume there is a nonzero integer n such that ne = 0. Then there
are natural numbers k with ke = 0. Let p be the smallest natural number
such that pe = 0. We claim that the mapping

φ: ℤ → P
    n ↦ ne

is a ring homomorphism, that p is a prime number and that P ≅ ℤₚ.

For any m, n ∈ ℤ, we have (m + n)φ = (m + n)e = me + ne = mφ + nφ (this is not
distributivity!) and (mn)φ = (mn)e = (me)(ne) = mφ·nφ (here (mn)e =
(me)(ne) is distributivity!), so φ is a ring homomorphism.

If p were composite, say p = rs with r, s ∈ ℕ, 1 < r < p, 1 < s < p, then
0 = pe = (rs)e = (re)(se) would yield, since the field K has no zero
divisors, that re = 0 or se = 0, contradicting the definition of p as the
smallest natural number satisfying pe = 0. Also p ≠ 1, since 1e = e ≠ 0.
So p is a prime number.

To prove P ≅ ℤₚ, we will find Ker φ. From pe = 0, we have p ∈ Ker φ, so
pn ∈ Ker φ for all n ∈ ℤ (because Ker φ is an ideal of ℤ) and pℤ ⊆ Ker φ.
On the other hand, if m ∈ Ker φ, we divide m by p to get m = qp + r, with
q, r ∈ ℤ and 0 ≤ r < p. This gives 0 = me = (qp + r)e = (qp)e + re = 0 + re.
As 0 ≤ r < p, this forces r = 0, which means m = qp and m ∈ pℤ. So we
get Ker φ ⊆ pℤ. Therefore Ker φ = pℤ. [A more conceptual argument: Ker φ
is an ideal of ℤ and ℤ is a principal ideal domain, so Ker φ = dℤ for some
d ∈ ℤ. We have d ≠ 0 in Case 1. From pe = 0 we get p ∈ Ker φ = dℤ, so
d divides p. But p is a prime number, so d = 1 or d = p. The possibility d = 1 is
excluded, because 1e = e ≠ 0. Hence d = p and Ker φ = dℤ = pℤ.]

Thus ℤₚ = ℤ/pℤ = ℤ/Ker φ ≅ Im φ ⊆ P, and Im φ, being a ring isomorphic
to ℤₚ, is a field. So Im φ is a subfield of K, therefore P ⊆ Im φ. This yields
P = Im φ and ℤₚ ≅ P, as claimed.

Case 2. Assume there is no nonzero integer n such that ne = 0. We claim
that the mapping

φ: ℚ → P
    m/n ↦ me/ne

is a ring homomorphism and that P ≅ ℚ.

First we show that φ is well defined. If m/n = m′/n′ with m, n, m′, n′ ∈ ℤ
(n ≠ 0 ≠ n′), then mn′ = m′n in ℤ, so (mn′)e = (m′n)e in P, thus (me)(n′e)
= (m′e)(ne) in P. Multiplying both sides of this equation by (1/ne)(1/n′e) ∈ P,
we obtain me/ne = m′e/n′e. So φ is well defined.

φ is a ring homomorphism: for all m/n, r/s ∈ ℚ with m, n, r, s ∈ ℤ, n ≠ 0 ≠ s,
we have

(m/n + r/s)φ = ((ms + rn)/ns)φ = (ms + rn)e/(ns)e = ((ms)e + (rn)e)/(ne·se)
= ((me)(se) + (re)(ne))/(ne·se) = me/ne + re/se = (m/n)φ + (r/s)φ

and ((m/n)(r/s))φ = (mr/ns)φ = (mr)e/(ns)e = ((me)(re))/((ne)(se))
= (me/ne)(re/se) = (m/n)φ · (r/s)φ.

Since we assume that me ≠ 0 for m ∈ ℤ\{0} in Case 2, we obtain

Ker φ = {m/n ∈ ℚ : me/ne = 0 in K} = {m/n ∈ ℚ : me = 0 in K}
= {m/n ∈ ℚ : m = 0 in ℤ} = {0},

so ℚ ≅ ℚ/{0} = ℚ/Ker φ ≅ Im φ ⊆ P, and Im φ, being a ring isomorphic to ℚ,
is a field. So Im φ is a subfield of K, therefore P ⊆ Im φ. This yields P =
Im φ and ℚ ≅ P, as claimed. □

48.6 Definition: Let K be a field and let e be the identity element of K.


If there are nonzero integers n such that ne = 0, and if p is the smallest
natural number such that pe = 0, then K is said to be a field of charac-
teristic p and p is called the characteristic of K. If there is no nonzero
integer n such that ne = 0, then K is said to be a field of characteristic 0,
and 0 is called the characteristic of K.

Equivalently, K is of characteristic p or 0 according as its prime subfield
is isomorphic to ℤₚ or to ℚ. We write char K = p and char K = 0 in these
respective cases. For example, char ℤₚ = p and char ℚ(i) = char ℚ(√2) =
char ℝ = char ℂ = 0. We will usually identify ℤₚ or ℚ with the prime
subfield of K, as the case may be. In particular, we will write 1 instead
of e for the identity element of K. Thus K will be considered to be an
extension of ℤₚ or ℚ.

We remark that, if K is a field of characteristic p, then pa = 0 for any
element a of K. This follows from

pa = a + a + . . . + a = 1a + 1a + . . . + 1a = (1 + 1 + . . . + 1)a = (p·1)a = 0a = 0,

the sums having p terms. This result will be used in the sequel without
explicit mention.
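The characteristic can be computed mechanically for the finite rings ℤₙ. The following sketch is our own illustration, not part of the text: it finds the smallest k ≥ 1 with k·1 = 0 in ℤₙ by repeated addition of the identity, exactly as in the discussion of the prime subfield; for a prime n this is the characteristic of the field ℤₙ, while in a field of characteristic 0 such as ℚ the same search would never terminate.

```python
def characteristic(n):
    """Return the smallest k >= 1 with k*1 = 0 in the ring Z_n.

    For a prime n this is the characteristic of the field Z_n; in a field
    of characteristic 0 (such as Q) the loop below would never stop.
    """
    e = 1 % n              # the identity element of Z_n
    s, k = e, 1
    while s != 0:
        s = (s + e) % n    # e + e = 2e, 2e + e = 3e, ...
        k += 1
    return k

assert characteristic(7) == 7     # Z_7 is a field of characteristic 7
assert characteristic(12) == 12   # Z_12 is only a ring, but 12*1 = 0 still holds

# In a field of characteristic p, pa = 0 for every element a:
p = 5
assert all((p * a) % p == 0 for a in range(p))
```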

We make two conventions. Henceforward, we will write 𝔽ₚ in place of ℤₚ.
This will always remind us that 𝔽ₚ is a field (p prime). Secondly, we shall
drop the bars in the elements of 𝔽ₚ, as we have already done on several
occasions. For example, we will write 2 instead of 2̄ ∈ 𝔽₅. A notation such
as "2" is therefore ambiguous: it stands for the integer 2 ∈ ℤ, as well as
2̄ ∈ 𝔽₂, as well as 2̄ ∈ 𝔽₃, as well as 2̄ ∈ 𝔽₅, etc. It will be clear from the
context, however, which meaning is accorded to "2". The ambiguity is
therefore harmless.

We proceed to discuss field homomorphisms.

48.7 Lemma: If K is a field, then K and {0} are the only ideals of K.

Proof: If A is an ideal of K and A ≠ {0}, there is an a ∈ A with a ≠ 0. Then a
has an inverse 1/a in K, and (1/a)a = 1 ∈ A, because A is an ideal. Then we
get b = b·1 ∈ A for all b ∈ K, so K ⊆ A and A = K. □

48.8 Lemma: If K₁ and K₂ are fields and φ: K₁ → K₂ is a ring homo-
morphism, then either aφ = 0 for all a ∈ K₁ or φ is one-to-one.

Proof: Ker φ is an ideal of K₁, so either Ker φ = K₁ or Ker φ = {0} by
Lemma 48.7. In these respective cases, either aφ = 0 for all a ∈ K₁ or φ is
one-to-one. □

When we deal with fields and with ring homomorphisms from one field to
another, we naturally want to disregard the uninteresting ring homo-
morphism that maps every element of its domain to the zero element of
the other field. Any other ring homomorphism is one-to-one by Lemma
48.8. This leads us to the following definition.

48.9 Definition: If K₁ and K₂ are fields and φ: K₁ → K₂ is a one-to-one
ring homomorphism, then φ will be called a field homomorphism. If φ is
a field homomorphism onto K₂, then φ will be called a field isomorphism.
A field isomorphism from K onto the same field K will be called a (field)
automorphism of K.

If φ: K₁ → K₂ is a field isomorphism, then φ is a homomorphism of addi-
tive groups, so 0_{K₁}φ = 0_{K₂}, and also Ker φ = {0_{K₁}}, where 0_{K₁} and 0_{K₂} are
the zero elements of the fields K₁, K₂, respectively. Thus the restriction of
φ to K₁\{0} is a one-to-one mapping from K₁\{0} onto K₂\{0}. In addition,
(ab)φ = aφ·bφ for all a, b in K₁, so (ab)φ = aφ·bφ for all a, b ∈ K₁\{0}, and
therefore this restriction φ|_{K₁*}: K₁* → K₂* is a one-to-one homomorphism
of groups onto K₂*: we have K₁* ≅ K₂*. In particular, 1_{K₁}φ = 1_{K₂},
where 1_{K₁} and 1_{K₂} are the identities of the fields K₁, K₂, respectively.

48.10 Lemma: Let K₁, K₂, K₃ be fields.

(1) If φ: K₁ → K₂ and ψ: K₂ → K₃ are field homomorphisms, then φψ: K₁ → K₃
is a field homomorphism.

(2) If φ: K₁ → K₂ and ψ: K₂ → K₃ are field isomorphisms, then φψ: K₁ → K₃ is
a field isomorphism.

(3) If φ: K₁ → K₂ is a field isomorphism, then φ⁻¹: K₂ → K₁ is a field iso-
morphism.

Proof: (1) φψ is a ring homomorphism by Lemma 30.16(1) and one-to-
one by Theorem 3.11(2).

(2) φψ is a field homomorphism by part (1) and onto by Theorem
3.11(1).

(3) φ⁻¹ is a ring homomorphism by Lemma 30.16(2) and one-to-one by
Theorem 3.17(1). □

A field homomorphism φ: K₁ → K₂ can be characterized as a one-to-one
function such that

(a + b)φ = aφ + bφ,  (a − b)φ = aφ − bφ,  (ab)φ = aφ·bφ,  (a/b)φ = aφ/bφ

for all a, b ∈ K₁ (b ≠ 0 in the division). Let us consider some examples.

48.11 Examples: (a) The conjugation mapping ℂ → ℂ, x ↦ x̄, is an auto-
morphism of ℂ, because

(x + y)‾ = x̄ + ȳ,  (x − y)‾ = x̄ − ȳ,  (xy)‾ = x̄·ȳ,  (x/y)‾ = x̄/ȳ

for any x, y ∈ ℂ (y ≠ 0 in the last case).

(b) The mapping σ: ℚ(√2) → ℚ(√2), a + b√2 ↦ a − b√2, is an automorphism
of ℚ(√2) because

((a + b√2) + (c + d√2))σ = ((a + c) + (b + d)√2)σ = (a + c) − (b + d)√2
= (a − b√2) + (c − d√2) = (a + b√2)σ + (c + d√2)σ,

((a + b√2)(c + d√2))σ = ((ac + 2bd) + (ad + bc)√2)σ = (ac + 2bd) − (ad + bc)√2
= (ac + 2(−b)(−d)) + (a(−d) + (−b)c)√2 = (a − b√2)(c − d√2)
= (a + b√2)σ · (c + d√2)σ

for all a + b√2, c + d√2 ∈ ℚ(√2), where a, b, c, d ∈ ℚ, so that σ is a ring
homomorphism; and because of 1σ = (1 + 0√2)σ = 1 − 0√2 = 1 ≠ 0, the
kernel of σ is not all of ℚ(√2), so σ is one-to-one.

(c) Let K be a field and x an indeterminate over K. Then the mapping

τ: K(x) → K(x)
    p(x)/q(x) ↦ p(x²)/q(x²)

is a field homomorphism. Note that Im τ ≠ K(x). Thus K(x) is isomorphic
to a proper subset of itself (namely to Im τ).

Let E/K be a field extension. Then E is an additive group and

a(x + y) = ax + ay
(a + b)x = ax + bx
(ab)x = a(bx)
1x = x

for all x, y ∈ E and for all a, b ∈ K (in fact for all a, b ∈ E, but we do not
need this now). Hence E is a vector space over K, as we have already
noted in Example 39.2(h). Studying both the field and the vector space
structure of E will be very useful. In particular, the dimension of E over
K will play an important role.

48.12 Definition: Let E/K be a field extension. The dimension of E over
K is called the degree of E over K, or the degree of the extension E/K.

It will prove convenient to write [E:K] instead of dim_K E for the degree of
E over K. The field E is said to be a finite dimensional extension or an
infinite dimensional extension of K according as [E:K] is finite or infinite.
Most authors use the term "finite extension" for a finite dimensional
extension.

An important fact in the theory of fields is that a finite dimensional


extension of a finite dimensional extension is a finite dimensional exten-
sion, and that the degrees behave multiplicatively. More exactly, we
have the

48.13 Theorem: Let F/E and E/K be field extensions of finite degrees
[F:E] and [E:K]. Then F/K is a finite dimensional extension. In fact

[F:K] = [F:E][E:K],

and furthermore, if {f₁, f₂, . . . , fᵣ} is an E-basis of F and {e₁, e₂, . . . , eₛ} a K-
basis of E, then {fᵢeⱼ : i = 1, 2, . . . , r; j = 1, 2, . . . , s} is a K-basis of F.

Proof: If K is a subfield of E and E is a subfield of F, then certainly K is a
subfield of F. Thus F is an extension of K.

Now the claim about the degree. Put [F:E] = r and [E:K] = s for brevity. We
are to prove that the dimension of F over K is equal to rs. Let {f₁, f₂, . . . , fᵣ}
be an E-basis of F and {e₁, e₂, . . . , eₛ} a K-basis of E. We are to find a K-basis
of F having exactly rs elements. The most natural thing to do is to consi-
der the rs products fᵢeⱼ. We contend that {fᵢeⱼ : i = 1, 2, . . . , r; j = 1, 2, . . . , s}
is a K-basis of F.

First we show that {fᵢeⱼ} spans F over K. Indeed, let f be an arbitrary
element of F. Then

f = b₁f₁ + b₂f₂ + . . . + bᵣfᵣ

for some b₁, b₂, . . . , bᵣ ∈ E, because {fᵢ : i = 1, 2, . . . , r} spans F over E; and for
each i,

bᵢ = aᵢ₁e₁ + aᵢ₂e₂ + . . . + aᵢₛeₛ

for some aᵢ₁, aᵢ₂, . . . , aᵢₛ ∈ K, because {eⱼ : j = 1, 2, . . . , s} spans E over K. Hence

f = Σ_{i=1}^r bᵢfᵢ = Σ_{i=1}^r (Σ_{j=1}^s aᵢⱼeⱼ)fᵢ = Σ_{i,j} aᵢⱼ(eⱼfᵢ)

is a linear combination of the products eⱼfᵢ = fᵢeⱼ over K. Thus {fᵢeⱼ} spans F over K.

Furthermore, {fᵢeⱼ} is linearly independent over K. Indeed, if bᵢⱼ are
elements of K such that

Σ_{i,j} bᵢⱼfᵢeⱼ = 0,

then Σ_{i=1}^r (Σ_{j=1}^s bᵢⱼeⱼ)fᵢ = 0,

where Σ_{j=1}^s bᵢⱼeⱼ ∈ E for each i. Since {fᵢ : i = 1, 2, . . . , r} is linearly independ-
ent over E, we have Σ_{j=1}^s bᵢⱼeⱼ = 0 for each i. Since {eⱼ : j = 1, 2, . . . , s} is
linearly independent over K, we obtain bᵢⱼ = 0 for each i, j. Hence {fᵢeⱼ} is
linearly independent over K.

Thus {fᵢeⱼ} is a K-basis of F and [F:K] = rs = [F:E][E:K]. □

It follows from Theorem 48.13 by induction that

[Kₙ:K₁] = [Kₙ:Kₙ₋₁][Kₙ₋₁:Kₙ₋₂] . . . [K₂:K₁]

whenever Kₙ/Kₙ₋₁, Kₙ₋₁/Kₙ₋₂, . . . , K₂/K₁ are finite dimensional field exten-
sions. In fact, Theorem 48.13 and its generalization are true for infinite
dimensional extensions, too, but we will not need this.
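The product basis of Theorem 48.13 can be seen concretely in the tower ℚ ⊆ ℚ(√2) ⊆ ℚ(√2, √3). The sketch below is our own illustration, not part of the text: taking the ℚ(√2)-basis {1, √3} of ℚ(√2, √3) and the ℚ-basis {1, √2} of ℚ(√2), the products give the ℚ-basis {1, √2, √3, √6}, so the degree over ℚ is 2·2 = 4. The code represents elements by their four rational coordinates in this basis and checks that multiplication stays inside the span.

```python
# Coordinates (a, b, c, d) represent a + b*sqrt2 + c*sqrt3 + d*sqrt6 in the
# Q-basis {1, sqrt2, sqrt3, sqrt6}, i.e. the products f_i*e_j of the theorem.
def mul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 + 2*b1*b2 + 3*c1*c2 + 6*d1*d2,    # coefficient of 1
            a1*b2 + b1*a2 + 3*(c1*d2 + d1*c2),      # coefficient of sqrt2
            a1*c2 + c1*a2 + 2*(b1*d2 + d1*b2),      # coefficient of sqrt3
            a1*d2 + d1*a2 + b1*c2 + c1*b2)          # coefficient of sqrt6

sqrt2, sqrt3, sqrt6 = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert mul(sqrt2, sqrt3) == sqrt6           # sqrt2 * sqrt3 = sqrt6
assert mul(sqrt6, sqrt6) == (6, 0, 0, 0)    # (sqrt6)^2 = 6
assert mul(sqrt2, sqrt6) == (0, 0, 2, 0)    # sqrt2 * sqrt6 = 2*sqrt3

# The coordinates agree with ordinary real arithmetic:
import math
def value(p):
    return p[0] + p[1]*math.sqrt(2) + p[2]*math.sqrt(3) + p[3]*math.sqrt(6)
x, y = (1, 2, 3, 4), (5, -1, 0, 2)
assert abs(value(mul(x, y)) - value(x)*value(y)) < 1e-9
```

That products of two elements never leave these four coordinates is exactly the spanning half of the proof; linear independence of {1, √2, √3, √6} over ℚ is what the theorem's second half supplies.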

48.14 Lemma: Let F/E and E/K be field extensions. If [F:K] is finite,
then [F:E] and [E:K] are both finite. In fact, both of them are divisors of
[F:K], and [F:K] = [F:E][E:K].

Proof: Let n = [F:K] and let {fᵢ : i = 1, 2, . . . , n} be a basis of F over K. Then
{fᵢ : i = 1, 2, . . . , n} spans F over E, and so [F:E] ≤ n by Steinitz' replacement
theorem. Thus [F:E] is finite.

Now the finiteness of [E:K]. If E were infinite dimensional over K, there
would be n + 1 K-linearly independent elements of E, so there would be
n + 1 K-linearly independent elements of F, contradicting [F:K] = n. Thus
[E:K] is finite.

We now obtain n = [F:K] = [F:E][E:K] from Theorem 48.13. In particular,
[F:E] and [E:K] divide n. □

Exercises

1. Let E be a field and K ⊆ E. Show that K is a subfield of E if and only if
K is a subgroup of E under addition and K\{0} is a subgroup of E\{0}
under multiplication.

2. Let p be prime. Is ℤ_{p²} an extension of ℤₚ? Is ℤ_{p³} an extension of ℤ_{p²}?

3. Prove that ℚ(√5) = {x + y√5 : x, y ∈ ℚ} and ℚ(√5 i) = {x + y√5 i : x, y ∈ ℚ} are
subfields of ℂ.

4. Let K be a field and let Aut(K) be the set of all field automorphisms of
K. Show that Aut(K) is a group under composition.
5. Find all automorphisms of ℚ, 𝔽ₚ, ℚ(i), ℚ(√5), ℚ(√5 i), ℚ(∛2) (see Ex. 3).

6. Find three nonisomorphic infinite fields of characteristic p ≠ 0.

7. Find the degrees of the following extensions: ℂ/ℝ, ℂ/ℚ(i), ℚ(i)/ℚ,
ℝ/ℚ, ℚ(x)/ℚ.

8. Show that ℚ(√2, i) := {a + b√2 + ci + d√2 i : a, b, c, d ∈ ℚ} is an extension
field of both ℚ(i) and ℚ(√2). Find [ℚ(√2, i):ℚ] by two different methods.

9. Prove or disprove: If E/K₁ and E/K₂ are finite dimensional field exten-
sions, then E/(K₁ ∩ K₂) is finite dimensional, too.

10. Let K be a field and e the identity element of K. Show that char K = 0
or p according as the subring of K generated by e is isomorphic to ℤ or to
ℤₚ.

11. Find the prime subfields of the fields in §29, Ex. 8.

12. Let K be a field of characteristic p ≠ 0. Prove that φ: K → K, a ↦ aᵖ,
is a field homomorphism.

§49
Field Extensions (continued)

49.1 Definition: Let E be an extension field of K. If F is a field such that
K ⊆ F ⊆ E, then F is said to be an intermediate field of the extension E/K.

49.2 Definition: Let E/K be a field extension and let S be a subset of E.
The intersection of all subfields of E containing K ∪ S, which is a subfield
of E by Example 48.3(f), is called the subfield of E generated by S over K,
and is denoted by K(S).

It follows immediately from this definition that K ⊆ K(S) ⊆ E, so that K(S)
is an intermediate field of E/K. When S is a finite subset of E, say when
S = {a₁, a₂, . . . , aₙ}, we write K(a₁, a₂, . . . , aₙ) instead of K({a₁, a₂, . . . , aₙ}). In
particular, if a ∈ E, then K(a) is, by definition, the smallest subfield of E
containing both K and a. Notice that K(a₁, a₂, . . . , aₙ) = K(a_{i₁}, a_{i₂}, . . . , a_{iₙ})
for any permutation (i₁, i₂, . . . , iₙ) of (1, 2, . . . , n).
1 2

49.3 Definition: Let E/K be a field extension and let S be a subset of E.
The intersection of all subrings of E containing K ∪ S, which is a subring
of E by Example 30.3(c), is called the subring of E generated by S over
K, and is denoted by K[S].

Since every subfield of E containing K ∪ S is also a subring of E contain-
ing K ∪ S, we clearly have K ⊆ K[S] ⊆ K(S) ⊆ E. If S is a finite subset of E,
say S = {a₁, a₂, . . . , aₙ}, we write K[a₁, a₂, . . . , aₙ] instead of K[{a₁, a₂, . . . , aₙ}]. In
particular, if a ∈ E, then K[a] is, by definition, the smallest subring of E
containing both K and a. We have K[a₁, a₂, . . . , aₙ] = K[a_{i₁}, a_{i₂}, . . . , a_{iₙ}] for
any permutation (i₁, i₂, . . . , iₙ) of (1, 2, . . . , n).

49.4 Example: In the extension ℂ/ℚ, let us find the subfield of ℂ gene-
rated by i over ℚ. Any subfield of ℂ containing both ℚ and i contains
complex numbers of the form (a + bi)/(c + di), where a, b, c, d ∈ ℚ and
c + di ≠ 0. One verifies easily that F = {(a + bi)/(c + di) ∈ ℂ : a, b, c, d ∈ ℚ,
c + di ≠ 0} is a subfield of ℂ containing both ℚ and i. Hence F is the
subfield of ℂ generated by i over ℚ.

Let us note that any element of F can be written in the form x + yi, with
x, y ∈ ℚ. Thus {x + yi : x, y ∈ ℚ} = F and F is equal to the field ℚ(i) de-
fined in Example 48.3(c). So the notation of Example 48.3(c) is consistent
with that of Definition 49.2.

The description of the elements in a field generated by a subset over a


subfield resembles the preceding example.

49.5 Lemma: Let E/K be a field extension and a₁, a₂, . . . , aₙ ∈ E. Then
(1) K[a₁, a₂, . . . , aₙ] = {f(a₁, a₂, . . . , aₙ) ∈ E : f ∈ K[x₁, x₂, . . . , xₙ]};
(2) K(a₁, a₂, . . . , aₙ)
= {f(a₁, a₂, . . . , aₙ)/g(a₁, a₂, . . . , aₙ) ∈ E : f, g ∈ K[x₁, x₂, . . . , xₙ], g(a₁, a₂, . . . , aₙ) ≠ 0}.

Proof: (1) Let A be the set on the right hand side of the equation in (1).
Any subring of E containing K and {a₁, a₂, . . . , aₙ} will contain the elements
of the form k a₁^{m₁} a₂^{m₂} . . . aₙ^{mₙ}, where k ∈ K and m₁, m₂, . . . , mₙ are nonnega-
tive integers, hence also the elements of the form

Σ k_{m₁m₂...mₙ} a₁^{m₁} a₂^{m₂} . . . aₙ^{mₙ}    (*)

where the k_{m₁m₂...mₙ} ∈ K and m₁, m₂, . . . , mₙ are nonnegative integers. Of
course, (*) is nothing but the value of the polynomial

f(x₁, x₂, . . . , xₙ) = Σ k_{m₁m₂...mₙ} x₁^{m₁} x₂^{m₂} . . . xₙ^{mₙ} ∈ K[x₁, x₂, . . . , xₙ]
at (a₁, a₂, . . . , aₙ). So every element of A is in any subring of E containing K
and {a₁, a₂, . . . , aₙ}. This gives A ⊆ K[a₁, a₂, . . . , aₙ]. To prove the reverse
inclusion, it suffices, in view of K ∪ {a₁, a₂, . . . , aₙ} ⊆ A ⊆ E, to show that A
is a subring of E. But this is immediate: given any f(a₁, a₂, . . . , aₙ) and
g(a₁, a₂, . . . , aₙ) ∈ A, where f, g ∈ K[x₁, x₂, . . . , xₙ], we have
f(a₁, a₂, . . . , aₙ) + g(a₁, a₂, . . . , aₙ) = (f + g)(a₁, a₂, . . . , aₙ) ∈ A,
−g(a₁, a₂, . . . , aₙ) = (−g)(a₁, a₂, . . . , aₙ) ∈ A,
f(a₁, a₂, . . . , aₙ)g(a₁, a₂, . . . , aₙ) = (fg)(a₁, a₂, . . . , aₙ) ∈ A,
since f + g, −g, fg belong to K[x₁, x₂, . . . , xₙ] whenever f, g do. Thus A is a
subring of E by the subring criterion (Lemma 30.2). This proves
K[a₁, a₂, . . . , aₙ] = A.

(2) The reasoning is similar. Let B be the set on the right hand side of
the equation in (2). Clearly A ⊆ B. Note that B = {b/c ∈ E : b, c ∈ A, c ≠ 0} =
{bc⁻¹ ∈ E : b, c ∈ A, c ≠ 0}. Any subfield of E containing K and {a₁, a₂, . . . , aₙ}
will contain K[a₁, a₂, . . . , aₙ] = A and, since a subfield is closed under
division, it will also contain the elements b/c, where b, c ∈ A and c ≠ 0.
This means that B is contained in any subfield of E containing K and
{a₁, a₂, . . . , aₙ}. Hence B ⊆ K(a₁, a₂, . . . , aₙ). To prove the reverse inclusion, it
suffices, in view of K ∪ {a₁, a₂, . . . , aₙ} ⊆ B ⊆ E, to show that B is a subfield
of E. Indeed, given any b/c, d/e ∈ B, where b, c, d, e ∈ A, c, e ≠ 0, we have
b/c + d/e = (be + dc)/ce ∈ B,
−(d/e) = (−d)/e ∈ B,
(b/c)(d/e) = bd/ce ∈ B,
(d/e)⁻¹ = e/d ∈ B (provided d/e ≠ 0, i.e., d ≠ 0),
since be + dc, ce, −d, bd, e belong to A whenever b, c, d, e do, and ce ≠ 0
whenever c ≠ 0 ≠ e (A is a subring of the field E and has therefore no
zero divisors). Thus B is a subfield of E by the subfield criterion (Lemma
48.2). This proves K(a₁, a₂, . . . , aₙ) = B. □

The proof of Lemma 49.5 can be somewhat simplified by referring to


Theorem 31.8.

Let us take a new look at Example 49.4 in the light of Lemma 49.5. The
field F in Example 49.4 is exactly the field described in Lemma 49.5(2),
with K = ℚ, n = 1, a₁ = i. On the other hand, the field {x + yi : x, y ∈ ℚ}
is exactly the subring of ℂ described in Lemma 49.5(1), with K = ℚ,
n = 1, a₁ = i. Thus we have ℚ(i) = ℚ[i]. The reader will easily verify that
ℚ(√2) = ℚ[√2] also (cf. Theorem 50.6).
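Lemma 49.5(1) with K = ℚ, n = 1, a₁ = i says that ℚ[i] consists of the values f(i) with f ∈ ℚ[x], and the example shows that each such value collapses to the form x + yi. The sketch below is our own illustration, not part of the text: it evaluates an arbitrary polynomial with rational coefficients at i using exact arithmetic and returns that pair (x, y).

```python
from fractions import Fraction

def eval_at_i(coeffs):
    """Evaluate f(i) exactly for f = sum coeffs[k] * x^k with rational coeffs.

    Returns (x, y) with f(i) = x + y*i, illustrating that every element
    of Q[i] has the form x + y*i with x, y in Q.
    """
    x = y = Fraction(0)
    px, py = Fraction(1), Fraction(0)   # current power of i, starting at i^0 = 1
    for c in coeffs:
        x += c * px
        y += c * py
        px, py = -py, px                # multiply the current power by i
    return x, y

# f(x) = x^3 + 2x + 1: f(i) = -i + 2i + 1 = 1 + i
assert eval_at_i([1, 2, 0, 1]) == (1, 1)
# f(x) = x^2 + 1 has i as a root (cf. Example 48.3(c)), so f(i) = 0 + 0i
assert eval_at_i([1, 0, 1]) == (0, 0)
```

Since division never enters the computation, this also makes visible why ℚ[i], a priori only a subring, already exhausts ℚ(i).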

49.6 Lemma: Let E/K be a field extension and a, b, a₁, a₂, . . . , aₙ ∈ E.

(1) K(a) = K if and only if a ∈ K.
(2) K(a₁, a₂, . . . , aₙ₋₁, aₙ) = (K(a₁, a₂, . . . , aₙ₋₁))(aₙ) and K[a₁, a₂, . . . , aₙ₋₁, aₙ] =
(K[a₁, a₂, . . . , aₙ₋₁])[aₙ].
(3) K(a,b) = (K(a))(b) = (K(b))(a) and K[a,b] = (K[a])[b] = (K[b])[a].

Proof: (1) We have a ∈ K(a) by the definition of K(a), so, if K(a) = K, then
a ∈ K. Conversely, if a ∈ K, then K = K ∪ {a}, and K is then the intersection
of all subfields of E containing both K and a; thus K(a) = K.

(2) Let us write L = K(a₁, a₂, . . . , aₙ₋₁). Then L contains K and a₁, a₂, . . . , aₙ₋₁.
Now L(aₙ) is a subfield of E containing both L and aₙ, so L(aₙ) is a sub-
field of E containing K and a₁, a₂, . . . , aₙ₋₁ and aₙ. Then K(a₁, a₂, . . . , aₙ₋₁, aₙ),
being the intersection of all subfields of E containing K and a₁, a₂, . . . , aₙ₋₁, aₙ,
is a subfield of L(aₙ). This gives K(a₁, a₂, . . . , aₙ₋₁, aₙ) ⊆ L(aₙ). On the other
hand, K(a₁, a₂, . . . , aₙ₋₁, aₙ) is a subfield of E containing K, a₁, a₂, . . . , aₙ₋₁ and
also aₙ. So L ⊆ K(a₁, a₂, . . . , aₙ₋₁, aₙ) by the definition of L = K(a₁, a₂, . . . , aₙ₋₁);
and aₙ ∈ K(a₁, a₂, . . . , aₙ₋₁, aₙ). Hence K(a₁, a₂, . . . , aₙ₋₁, aₙ) is a subfield of E
containing both L and aₙ. Then L(aₙ) ⊆ K(a₁, a₂, . . . , aₙ₋₁, aₙ) by the defini-
tion of L(aₙ). We obtain K(a₁, a₂, . . . , aₙ₋₁, aₙ) = L(aₙ), as was to be proved.
The second assertion is proved in exactly the same way (read "subring"
in place of "subfield" in the foregoing argument).

(3) Using part (2) twice, we get (K(a))(b) = K(a,b) = K(b,a) = (K(b))(a) and
similarly (K[a])[b] = K[a,b] = K[b,a] = (K[b])[a]. □

We introduce a very important classification of field extensions:
algebraic vs. transcendental extensions. They behave very differently.

49.7 Definition: Let E/K be a field extension. An element a of E is said


to be algebraic over K if there is a nonzero polynomial f in K[x] such that
a is a root of f, i.e., f(a) = 0. An element a of E is said to be transcendental
over K if a is not algebraic, that is to say, if there is no nonzero poly-
nomial f in K[x] with f(a) = 0.

If every element of E is algebraic over K, then E is called an algebraic


extension of K and E/K is called an algebraic extension. In this case, E is
said to be algebraic over K. If E is not an algebraic extension of K, then E
is called a transcendental extension of K and E/K is called a transcenden-
tal extension. If so, that is to say, if E contains at least one element which
is not algebraic over K, then E is said to be transcendental over K.

49.8 Examples: (a) Let K be any field. Then, for any element a ∈ K, the
polynomial f_a(x) := x − a is in K[x], and a is a root of f_a. Thus any element
of K is algebraic over K, and K is an algebraic extension of K.

(b) i is a root of the polynomial x² + 1 ∈ ℚ[x]. Hence i is algebraic
over ℚ. Also, any element a + bi of ℚ(i), where a, b ∈ ℚ, is a root of

[x − (a + bi)][x − (a − bi)] = x² − 2ax + (a² + b²) ∈ ℚ[x]

and is therefore algebraic over ℚ. Hence ℚ(i)/ℚ is an algebraic extension.

(c) √2 is a root of the polynomial x² − 2 ∈ ℚ[x]. Hence √2 is
algebraic over ℚ. Also, any element a + b√2 of ℚ(√2), where a, b ∈ ℚ, is a
root of

[x − (a + b√2)][x − (a − b√2)] = x² − 2ax + (a² − 2b²) ∈ ℚ[x]

and is therefore algebraic over ℚ. Hence ℚ(√2)/ℚ is an algebraic exten-
sion.
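The verification in (c) is routine enough to automate. In this sketch, which is our own illustration and not part of the text, an element t = a + b√2 is stored as the pair (a, b) of exact rationals, and we check that t is a root of x² − 2ax + (a² − 2b²) for a sample of rational a, b; the helper names are invented here.

```python
from fractions import Fraction

def mul2(p, q):
    # (x + y*sqrt2)(z + u*sqrt2) = (xz + 2yu) + (xu + yz)*sqrt2
    return (p[0]*q[0] + 2*p[1]*q[1], p[0]*q[1] + p[1]*q[0])

def is_root(a, b):
    """Check that t = a + b*sqrt2 satisfies t^2 - 2a*t + (a^2 - 2b^2) = 0."""
    a, b = Fraction(a), Fraction(b)
    t = (a, b)
    t2 = mul2(t, t)
    # Collect the coefficients of 1 and of sqrt2 in t^2 - 2a*t + (a^2 - 2b^2):
    val = (t2[0] - 2*a*t[0] + (a*a - 2*b*b), t2[1] - 2*a*t[1])
    return val == (0, 0)

assert is_root(3, Fraction(1, 2))   # 3 + sqrt(2)/2 is algebraic over Q
assert all(is_root(a, b) for a in range(-3, 4) for b in range(-3, 4))
```

Expanding symbolically, t² = (a² + 2b²) + 2ab√2, so both the rational part and the √2-part of the polynomial value vanish identically, which is what the loop confirms on concrete values.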

(d) It is a fact that π and e are transcendental over ℚ. We
borrow this fact from number theory without proof. Thus ℝ/ℚ is a
transcendental extension. ℚ(π) and ℚ(e) are also transcendental exten-
sions of ℚ.

(e) Let K be a field and x an indeterminate over K. Then K(x) is an
extension field of K and x ∈ K(x). If f is any nonzero polynomial in K[x],
then f(x) = f ≠ 0 (Example 35.2(d)). Thus x is transcendental over K and
K(x)/K is a transcendental extension.
Likewise f(x²) ≠ 0 for any nonzero polynomial f in K[x], and x² is
transcendental over K. On the other hand, if y is another indeterminate
over K, then x is a root of the polynomial y² − x² ∈ (K(x²))[y], so x is
algebraic over K(x²). Thus an element may be transcendental over one
field and algebraic over another.

49.9 Definition: Let E/K be a field extension. If there is an element a
in E such that E = K(a), then E is called a simple extension of K. In this
case, any element a of E satisfying E = K(a) is called a primitive element
of the extension E/K. If there are finitely many elements a₁, a₂, . . . , aₙ in E
such that E = K(a₁, a₂, . . . , aₙ), then E is said to be finitely generated over K.

The reader should clearly distinguish between finite dimensional exten-


sions and finitely generated extensions.

We close this paragraph with a theorem that describes all simple tran-
scendental extensions up to isomorphism. Simple algebraic extensions
will be treated in the next paragraph.

49.10 Theorem: Let E/K be a field extension and let a ∈ E be transcendental over K. Then K(a) ≅ K(x), where x is an indeterminate over K.

Proof: We wish to find an isomorphism from K(x) onto K(a). What is
more natural than the extension
σ: K(x) → K(a)
   f/g ↦ f(a)/g(a)
of the substitution homomorphism? In any case, Lemma 49.5(2)
suggests that we try this mapping. Now σ is meaningful, for, given any
f/g ∈ K(x) with f,g ∈ K[x], g ≠ 0, we have g(a) ≠ 0 (a is transcendental
over K) and so (f/g)σ = f(a)/g(a) is a perfectly definite element of K(a).

We claim that σ is well defined. Indeed, if f/g = f1/g1 in K(x), where
f,g,f1,g1 ∈ K[x] and g ≠ 0 ≠ g1, then fg1 = f1g in K[x] and, by Lemma 35.3,
f(a)g1(a) = f1(a)g(a) in E, with g1(a) ≠ 0 ≠ g(a); multiplying this equation
by 1/g1(a)g(a), we obtain
(f/g)σ = f(a)/g(a) = f1(a)/g1(a) = (f1/g1)σ,
which shows that σ is well defined.

σ is a ring homomorphism because, from Lemma 35.3, we have

(f/g + p/q)σ = ((fq + pg)/gq)σ = (fq + pg)(a)/(gq)(a) = [f(a)q(a) + p(a)g(a)]/g(a)q(a)
= f(a)/g(a) + p(a)/q(a) = (f/g)σ + (p/q)σ

and ((f/g)·(p/q))σ = (fp/gq)σ = f(a)p(a)/g(a)q(a) = [f(a)/g(a)]·[p(a)/q(a)] = (f/g)σ · (p/q)σ

for any f/g, p/q ∈ K(x), where f,g,p,q ∈ K[x] and g ≠ 0 ≠ q, the last
condition ensuring g(a) ≠ 0 ≠ q(a).

Since Ker σ = {f/g ∈ K(x): f,g ∈ K[x], g ≠ 0 in K[x], f(a)/g(a) = 0}
= {f/g ∈ K(x): f,g ∈ K[x], g ≠ 0, f(a) = 0}
= {f/g ∈ K(x): f,g ∈ K[x], g ≠ 0, f = 0}
= {0},
σ is one-to-one. Hence σ is a field homomorphism. Lemma 49.5(2) states
that σ is onto K(a). So σ: K(x) → K(a) is a field isomorphism: K(x) ≅ K(a).

Exercises

1. Let E/K be a field extension and S ⊆ E, S ≠ ∅. Show that
K[S] = {f(s1,s2, . . . ,sn) ∈ E: n ∈ ℕ, f ∈ K[x1,x2, . . . ,xn] and s1,s2, . . . ,sn ∈ S}
and
K(S) = {f(s1,s2, . . . ,sn)/g(s1,s2, . . . ,sn) ∈ E: n ∈ ℕ, f,g ∈ K[x1,x2, . . . ,xn],
s1,s2, . . . ,sn ∈ S and g(s1,s2, . . . ,sn) ≠ 0}.

2. Let E/K be a field extension and S ⊆ E, S ≠ ∅. Using the definition of
K[S] and K(S) only (in particular, without using Ex. 1), prove that K(S) is
the field of fractions of K[S].

3. Let E/K be a field extension and S ⊆ E. Show that K(S) = K if and only if
S ⊆ K.

4. Let E/K be a field extension and a1,a2, . . . ,an ∈ E. Prove that
K(a1,a2, . . . ,an) = (K(a1, . . . ,ak))(ak+1, . . . ,an) for any k = 1,2, . . . ,n − 1.

5. Let a,b be arbitrary rational numbers. Find a polynomial in ℚ[x]
which admits a + b√5 as a root. Conclude that ℚ(√5)/ℚ is an algebraic
extension.

6. Show that √2 + i, √2 + √3, √2 + √3 + i are algebraic over ℚ by
exhibiting polynomials in ℚ[x] having these numbers among their roots.

7. Let K be a field. Prove that every element in K(x)\K is transcendental


over K.

8. Let E/K be a simple field extension and let a be a primitive element of
this extension. Let k,k´ ∈ K, with k ≠ 0. Show that ka + k´ is also a
primitive element of E/K.

9. Find a finitely generated field extension which is not finite
dimensional. Prove that every finite dimensional extension is finitely generated.

10. Prove or disprove: if E/K is a field extension and a,b ∈ E are
transcendental over K, then K(a,b) ≅ K(x,y), where x,y are indeterminates
over K.

§50
Algebraic Extensions

Let E/K be a field extension and let a ∈ E be algebraic over K. Then there
is a nonzero polynomial f in K[x] such that f(a) = 0. Hence the subset A =
{f ∈ K[x]: f(a) = 0} of K[x] does not consist only of 0. We observe that A is
an ideal of K[x], because A is the kernel of the substitution homomorphism Ta: K[x] → E.

Thus A is an ideal of K[x] and A ≠ {0}. Since K[x] is a principal ideal
domain, A = K[x]f0 =: (f0) for some nonzero polynomial f0 in K[x]. For any
polynomial g ∈ K[x], the relation (g) = A = (f0) holds if and only if g and f0
are associate in K[x], that is to say, if and only if g(x) = cf0(x) for some
nonzero c in K. There is a unique c0 ∈ K such that the leading coefficient of c0f0(x)
is equal to 1. With this c0, we put g0(x) = c0f0(x). Then g0 is the unique
monic polynomial in K[x] satisfying (g0) = A = {f ∈ K[x]: f(a) = 0}, and f(a)
= 0 for a polynomial f in K[x] if and only if g0 | f in K[x]. In particular, we
have deg g0 ≤ deg f for any nonzero f ∈ K[x] having a as a root.

In this way, we associate with a ∈ E a unique monic polynomial g0 in K[x].
This g0 is the monic polynomial in K[x] of least degree having a as a root.

g0 is irreducible over K: if there were polynomials p(x), q(x) in K[x] with
g0(x) = p(x)q(x), 1 ≤ deg p(x) < deg g0(x) and 1 ≤ deg q(x) < deg g0(x),
then 0 = g0(a) = p(a)q(a) would imply p(x) ∈ A or q(x) ∈ A, hence g0 | p or
g0 | q in K[x], which is impossible in view of the conditions on deg p(x) and
deg q(x).

We proved the following theorem.

50.1 Theorem: Let E/K be a field extension and a ∈ E. If a is algebraic
over K, then there is a unique nonzero monic polynomial g(x) in K[x]
such that

for all f(x) ∈ K[x], f(a) = 0 if and only if g(x) | f(x) in K[x].

In particular, a is a root of g(x) and g(x) has the smallest degree among
the nonzero polynomials in K[x] admitting a as a root. Moreover, g(x) is
irreducible over K.

50.2 Definition: Let E/K be a field extension and let a ∈ E be algebraic
over K. The unique polynomial g(x) of Theorem 50.1 is called the
minimal polynomial of a over K.

The minimal polynomial of a over K is also called the irreducible
polynomial of a over K. Given an element a of E, algebraic over K, and a
polynomial h(x) in K[x], in order to find out whether h(x) is the minimal
polynomial of a over K, it seems we would have to check whether h(x) | f(x) for all
the polynomials f(x) ∈ K[x] having a as a root. Fortunately, there is
another characterization of minimal polynomials.

50.3 Theorem: Let E/K be a field extension and a ∈ E. Assume that a is
algebraic over K. Let h(x) be a nonzero polynomial in K[x]. If
(i) h(x) is monic,
(ii) a is a root of h(x),
(iii) h(x) is irreducible over K,
then h(x) is the minimal polynomial of a over K.

Proof: We must show only that h(x) divides any polynomial f(x) ∈ K[x]
having a as a root. Let f(x) be a polynomial in K[x] and assume that a is a
root of f(x). We divide f(x) by h(x) and get
f(x) = q(x)h(x) + r(x),  r(x) = 0 or deg r(x) < deg h(x)
with suitable q(x),r(x) ∈ K[x]. Substituting a for x, we obtain
0 = f(a) = q(a)h(a) + r(a) = q(a)·0 + r(a) = r(a).
If r(x) were distinct from the zero polynomial in K[x], then the
irreducible polynomial h(x) would have a common root a with the
polynomial r(x), whose degree is smaller than the degree of h(x). This is
impossible by Theorem 35.18(4). Hence r(x) = 0 and f(x) = q(x)h(x). Therefore
h(x) | f(x) for any polynomial f(x) ∈ K[x] having a as a root, as was to be
proved.

50.4 Examples: (a) Let us find the minimal polynomial of i over ℚ.
Since i is a root of the polynomial x² + 1 ∈ ℚ[x], which is monic and
irreducible over ℚ, Theorem 50.3 tells us that x² + 1 is the minimal
polynomial of i over ℚ. In the same way, we see that x² + 1 ∈ ℝ[x] is
the minimal polynomial of i over ℝ. On the other hand, x² + 1 ∈ (ℚ(i))[x]
is not irreducible over ℚ(i), because x² + 1 = (x − i)(x + i) in (ℚ(i))[x]. Now
x − i is a monic irreducible polynomial in (ℚ(i))[x] having i as a root, and
thus x − i is the minimal polynomial of i over ℚ(i).

(b) Let us find the minimal polynomial of u = √2 + √3 over ℚ. The
calculations
u = √2 + √3
u − √2 = √3
u² − 2√2u + 2 = 3        (u)
u² − 1 = 2√2u
u⁴ − 2u² + 1 = 8u²
u⁴ − 10u² + 1 = 0
show that √2 + √3 is a root of the monic polynomial f(x) = x⁴ − 10x² + 1
in ℚ[x]. We will prove that f(x) is irreducible over ℚ. Theorem 50.3 will
then yield that f(x) is the minimal polynomial of √2 + √3 over ℚ.

In view of Lemma 34.11, it will be sufficient to show that f(x) is
irreducible over ℤ. Since the numbers ±1/±1 = ±1 are not roots of f(x), we
learn from Theorem 35.10 (rational root theorem) that f(x) has no
polynomial factor in ℤ[x] of degree one. If there were a factorization in ℤ[x]
of f(x) into two polynomials of degree two, which we may assume to be
x⁴ − 10x² + 1 = (x² + ax + b)(x² + cx + d)        (e)
without loss of generality, then the integers a,b,c,d would satisfy
a + c = 0,  d + ac + b = −10,  ad + bc = 0,  bd = 1,
and this would force b = d = ±1, and the first two equations would give
a + c = 0, ac = −12  or  a + c = 0, ac = −8,
so a² = 12 or a² = 8,
whereas no integer has a square equal to 8 or 12. Thus f(x) is irreducible
in ℤ[x] and, as remarked earlier, f(x) is therefore the minimal
polynomial of √2 + √3 over ℚ.

The irreducibility of f(x) of degree four over ℚ could be proved by
showing the irreducibility of another polynomial, of degree less than
four, over a field larger than ℚ. As this gives a deeper insight into the
problem at hand, we will discuss this method. The equation (u) states
that √2 + √3 is a root of the polynomial f2(x) = x² − 2√2x − 1 ∈
(ℚ(√2))[x]. Let g(x) ∈ (ℚ(√2))[x] be the minimal polynomial of √2 + √3
over ℚ(√2). Then g(x) | f2(x) in (ℚ(√2))[x] and, if g(x) ≠ f2(x), then deg
g(x) would be one and g(x) would be x − (√2 + √3), since the latter is the
unique monic polynomial of degree one having √2 + √3 as a root. But
g(x) ∈ (ℚ(√2))[x] and this would imply √2 + √3 ∈ ℚ(√2), so √3 ∈ ℚ(√2),
so √3 = m + n√2 with suitable m,n ∈ ℚ, where certainly m ≠ 0 ≠ n, so 3 =
m² + 2√2mn + 2n², so √2 = (3 − m² − 2n²)/2mn would be a rational
number, a contradiction. Thus f2(x) = g(x) is the minimal polynomial of
√2 + √3 over ℚ(√2).

Now the irreducibility of f(x) over ℚ follows very easily. f(x) has no
factor of degree one in ℚ[x]. If f(x) had a factorization (e) in ℚ[x], where
a,b,c,d are rational numbers (not necessarily integers), then √2 + √3
would be a root of one of the factors on the right hand side of (e), say of
x² + ax + b. But then x² + ax + b, being a polynomial in (ℚ(√2))[x] having
√2 + √3 as a root, would be divisible, in (ℚ(√2))[x], by the minimal
polynomial f2(x) = x² − 2√2x − 1 of √2 + √3 over ℚ(√2). Comparing
degrees and leading coefficients, we would obtain x² − 2√2x − 1 = x² + ax
+ b, so −2√2 = a ∈ ℚ, a contradiction. Hence f(x) is irreducible over ℚ.
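Both conclusions of this example can be checked by machine. A sketch with the third-party sympy library (its `minimal_polynomial` computes minimal polynomials over ℚ):

```python
from sympy import sqrt, minimal_polynomial, expand, symbols

x = symbols('x')

# The minimal polynomial of sqrt(2) + sqrt(3) over Q is x^4 - 10x^2 + 1.
f = minimal_polynomial(sqrt(2) + sqrt(3), x)
assert f == x**4 - 10*x**2 + 1

# Over Q(sqrt(2)) the quartic factors, with f2(x) = x^2 - 2*sqrt(2)*x - 1
# as one factor, exactly as the argument above predicts.
f2 = x**2 - 2*sqrt(2)*x - 1
assert expand(f2 * (x**2 + 2*sqrt(2)*x - 1)) == x**4 - 10*x**2 + 1
```

The second assertion exhibits the factorization of f(x) into two quadratics over ℚ(√2), so f(x) is reducible there even though it is irreducible over ℚ.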

The next lemma crystallizes the argument employed in the last example.

50.5 Lemma: Let K1 ⊆ K2 ⊆ E be fields and a ∈ E. If a is algebraic over
K1, then a is algebraic over K2. Moreover, if f1, f2 are, respectively, the
minimal polynomials of a over K1 and K2, then f2 | f1 in K2[x].

Proof: If a is algebraic over K1 and f1(x) is the minimal polynomial of a
over K1, then f1(a) = 0. Since f1(x) ∈ K1[x] ⊆ K2[x], we conclude that a is
algebraic over K2. Then, from f1(a) = 0 and f1(x) ∈ K2[x], we obtain
f2(x) | f1(x) in K2[x] by the very definition of the minimal polynomial f2(x)
of a over K2.

We proceed to describe simple algebraic extensions. Let us recall that
we found ℚ[i] = ℚ(i). This situation obtains whenever we consider a
simple extension generated by an algebraic element.

50.6 Theorem: Let E/K be a field extension and a ∈ E. Assume that a is
algebraic over K and let f be its minimal polynomial over K. We denote
by K[x]f =: (f) the principal ideal generated by f in K[x]. Then
K(a) = K[a] ≅ K[x]/(f).

Proof: Consider the substitution homomorphism Ta: K[x] → E. Here Ker Ta
= {h ∈ K[x]: h(a) = 0} = (f) by Theorem 50.1 and Im Ta = K[a] by Lemma
49.5(1). Hence K[x]/(f) = K[x]/Ker Ta ≅ Im Ta = K[a].

It remains to show K(a) = K[a]. Since K[a] ⊆ K(a), we must prove only
K(a) ⊆ K[a]. To this end, we need only prove that 1/g(a) ∈ K[a] for any
g(x) ∈ K[x] with g(a) ≠ 0 (Lemma 49.5). Indeed, if g(x) ∈ K[x] and g(a) ≠ 0,
then f ∤ g and, since f is irreducible in K[x], the polynomials f(x) and g(x)
are relatively prime in K[x] (Theorem 35.18(3)). Thus there are
polynomials r(x), s(x) in K[x] such that
f(x)r(x) + g(x)s(x) = 1.
Substituting a for x and using f(a) = 0, we obtain g(a)s(a) = 1. Hence
1/g(a) = s(a) ∈ K[a]. This proves K[a] = K(a). (Another proof: since K[x] is
a principal ideal domain and f is irreducible in K[x], the factor ring
K[x]/(f) is a field by Theorem 32.25; thus K[a], being a ring isomorphic to
the field K[x]/(f), is a subfield of E, and K[a] contains K and a. So K(a) ⊆
K[a] and K(a) = K[a].)

50.7 Theorem: Let E/K be a field extension and a ∈ E. Suppose that a is
algebraic over K and let f be its minimal polynomial over K. Then
|K(a):K| = deg f
(the degree of the field K(a) over K is the degree of the minimal
polynomial f in K[x]). In fact, if deg f = n, then {1,a,a², . . . ,aⁿ⁻¹} is a K-basis of
K(a) and every element in K(a) can be written in the form
k0 + k1a + k2a² + . . . + kn−1aⁿ⁻¹    (k0,k1,k2, . . . ,kn−1 ∈ K)
in a unique way.

Proof: We prove that {1,a,a², . . . ,aⁿ⁻¹} is a K-basis of K(a). Let us show
that it spans K(a) over K. We know K(a) = K[a] from Theorem 50.6 and
K[a] = {g(a) ∈ E: g ∈ K[x]} from Lemma 49.5(1). Thus any element u of
K(a) can be written as g(a), where g(x) is a suitable polynomial in K[x].
Dividing this polynomial g(x) by f(x), which has degree n, we get

g(x) = q(x)f(x) + r(x),  r(x) = 0 or deg r(x) ≤ n − 1

with some polynomials q(x), r(x) in K[x]. Substituting a for x, we obtain

u = g(a) = q(a)f(a) + r(a) = q(a)·0 + r(a) = r(a).

If, say, r(x) = k0 + k1x + k2x² + . . . + kn−1xⁿ⁻¹, where k0,k1,k2, . . . ,kn−1 ∈ K,
then u = k0 + k1a + k2a² + . . . + kn−1aⁿ⁻¹
and thus {1,a,a², . . . ,aⁿ⁻¹} spans K(a) over K.

Now let us show that {1,a,a², . . . ,aⁿ⁻¹} is linearly independent over K. If
k0,k1,k2, . . . ,kn−1 are elements of K such that
k0 + k1a + k2a² + . . . + kn−1aⁿ⁻¹ = 0,
then a is a root of the polynomial h(x) = k0 + k1x + k2x² + . . . + kn−1xⁿ⁻¹ in
K[x], so f(x) | h(x) by Theorem 50.1. Here h(x) ≠ 0 would yield the
contradiction n = deg f ≤ deg h ≤ n − 1. Therefore h(x) = 0, which means that
k0 = k1 = k2 = . . . = kn−1 = 0. Hence {1,a,a², . . . ,aⁿ⁻¹} is linearly independent
over K.

This proves {1,a,a², . . . ,aⁿ⁻¹} is a K-basis of K(a). It follows that
|K(a):K| = dimK K(a) = |{1,a,a², . . . ,aⁿ⁻¹}| = n = deg f(x)
and, by Theorem 42.8, every element of K(a) can be written uniquely in
the form
k0 + k1a + k2a² + . . . + kn−1aⁿ⁻¹.
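Reducing modulo f is exactly how one rewrites powers of a in the basis {1, a, a², . . . , aⁿ⁻¹}. A sketch with sympy's polynomial remainder `rem`, using f(x) = x⁴ − 10x² + 1 from Example 50.4(b) (so n = 4):

```python
from sympy import symbols, rem

y = symbols('y')
f = y**4 - 10*y**2 + 1  # minimal polynomial of a = sqrt(2) + sqrt(3)

# a^4 and a^5 rewritten in the basis {1, a, a^2, a^3}:
assert rem(y**4, f, y) == 10*y**2 - 1   # a^4 = 10a^2 - 1
assert rem(y**5, f, y) == 10*y**3 - y   # a^5 = 10a^3 - a
```

The remainder always has degree at most n − 1 = 3, which is the uniqueness statement of the theorem in computational form.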

50.8 Definition: Let E/K be a field extension and a ∈ E. Suppose a is
algebraic over K. Then the degree of its minimal polynomial over K,
which is also the degree of K(a) over K, is called the degree of a over K.

50.9 Examples: (a) The minimal polynomial of i over ℚ is the
polynomial x² + 1 in ℚ[x] (Example 50.4(a)), and x² + 1 has degree 2.
Thus i is (algebraic and) of degree 2 over ℚ. Likewise, the minimal
polynomial of i over ℝ is x² + 1 ∈ ℝ[x] and i has degree 2 over ℝ.

(b) The minimal polynomial of √2 + √3 over ℚ was found to be
x⁴ − 10x² + 1 ∈ ℚ[x] (Example 50.4(b)). Thus √2 + √3 has degree 4 over
ℚ. This follows also from Theorem 50.7. In fact, the numbers 1, √2 form
a ℚ-basis of the field ℚ(√2), hence |ℚ(√2):ℚ| = 2. Observe that

        ℚ(√2 + √3)
            |    x² − 2√2x − 1, degree 2
        ℚ(√2)                              x⁴ − 10x² + 1, degree 4
            |    x² − 2, degree 2
        ℚ

√2 = −(9/2)(√2 + √3) + (1/2)(√2 + √3)³, so √2 ∈ ℚ(√2 + √3) and therefore
ℚ(√2) ⊆ ℚ(√2 + √3). Thus ℚ(√2) is an intermediate field of the
extension ℚ(√2 + √3)/ℚ. From Theorem 48.13, we infer that

4 = |ℚ(√2 + √3):ℚ| = |ℚ(√2 + √3):ℚ(√2)| · |ℚ(√2):ℚ| = |ℚ(√2 + √3):ℚ(√2)| · 2,

so |ℚ(√2 + √3):ℚ(√2)| = 2 and √2 + √3 has degree 2 over ℚ(√2).
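The identity expressing √2 in powers of √2 + √3 can be verified directly; a short sympy check:

```python
from sympy import sqrt, expand, Rational

a = sqrt(2) + sqrt(3)

# sqrt(2) = -(9/2)*a + (1/2)*a^3, so sqrt(2) lies in Q(sqrt(2) + sqrt(3))
lhs = expand(-Rational(9, 2)*a + Rational(1, 2)*a**3)
assert lhs == sqrt(2)
```

Expanding a³ gives 11√2 + 9√3, and the √3 terms cancel, leaving exactly √2.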

(c) Since x² + 1 ∈ ℝ[x] is the minimal polynomial of i over ℝ, Theorem
50.6 states that ℝ[x]/(x² + 1) ≅ ℝ(i). In the ring ℝ[x]/(x² + 1), we
have the equality x² + ℝ[x](x² + 1) = −1 + ℝ[x](x² + 1), and calculations
are carried out just as in the ring ℝ[x], but we replace [x + ℝ[x](x² + 1)]²
= x² + ℝ[x](x² + 1) by −1 + ℝ[x](x² + 1). In the same way, calculations
are carried out in ℝ(i) = ℂ just as though i were an indeterminate over
ℝ, and we write −1 for i² wherever we see i². This is what the
isomorphism ℝ[x]/(x² + 1) ≅ ℝ(i) = ℂ means.

(d) Likewise, if E/K is a field extension and a ∈ E, and if a is algebraic
over K with the minimal polynomial xⁿ + cn−1xⁿ⁻¹ + cn−2xⁿ⁻² + . . . + c1x + c0
over K, so that
aⁿ = −cn−1aⁿ⁻¹ − cn−2aⁿ⁻² − . . . − c1a − c0,
then K(a) consists of the elements
k0 + k1a + . . . + kn−2aⁿ⁻² + kn−1aⁿ⁻¹    (k0,k1, . . . ,kn−2,kn−1 ∈ K)
and computations are carried out in K(a) just as though a were an
indeterminate over K, replacing aⁿ by −cn−1aⁿ⁻¹ − cn−2aⁿ⁻² − . . . − c1a − c0
wherever it occurs.

For instance, writing a for √2 + √3, we have a⁴ = 10a² − 1 in ℚ(a). If
t = 2 + a − a² + 3a³ ∈ ℚ(a) and u = a + a² + 2a³ ∈ ℚ(a), then

t + u = 2 + 2a + 5a³ ∈ ℚ(a)

and tu = (2 + a − a² + 3a³)(a + a² + 2a³)
= 2a + 2a² + 4a³ + a² + a³ + 2a⁴ − a³ − a⁴ − 2a⁵ + 3a⁴ + 3a⁵ + 6a⁶
= 2a + 3a² + 4a³ + 4a⁴ + a⁵ + 6a⁶
= 2a + 3a² + 4a³ + 4(10a² − 1) + a(10a² − 1) + 6a²(10a² − 1)
= 2a + 3a² + 4a³ + 40a² − 4 + 10a³ − a + 60(10a² − 1) − 6a²
= −64 + a + 637a² + 14a³ ∈ ℚ(a).
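This product can be double-checked mechanically: multiply the two polynomials and reduce modulo a⁴ − 10a² + 1. A sympy sketch:

```python
from sympy import symbols, rem, expand

a = symbols('a')
f = a**4 - 10*a**2 + 1  # the relation satisfied by a = sqrt(2) + sqrt(3)

t = 2 + a - a**2 + 3*a**3
u = a + a**2 + 2*a**3

assert expand(t + u) == 2 + 2*a + 5*a**3
assert rem(expand(t*u), f, a) == -64 + a + 637*a**2 + 14*a**3
```

The remainder reproduces the hand computation above, including the constant term −64.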

Let us find the inverse of a² + a + 1. According to Theorem 50.6, we must
find polynomials r(x), s(x) in ℚ[x] such that
(x⁴ − 10x² + 1)r(x) + (x² + x + 1)s(x) = 1
and this we do by the Euclidean algorithm:

x⁴ − 10x² + 1 = (x² − x − 10)(x² + x + 1) + (11x + 11)

x² + x + 1 = (1/11)x·(11x + 11) + 1,

so that
1 = (x² + x + 1) − (1/11)x·(11x + 11)
= (x² + x + 1) − (1/11)x·[(x⁴ − 10x² + 1) − (x² − x − 10)(x² + x + 1)]
= (x² + x + 1)[1 + (1/11)x·(x² − x − 10)] − (1/11)x·(x⁴ − 10x² + 1),

1 = (x² + x + 1)((1/11)x³ − (1/11)x² − (10/11)x + 1) − (1/11)x·(x⁴ − 10x² + 1)

and, substituting a for x, we get
1 = (a² + a + 1)((1/11)a³ − (1/11)a² − (10/11)a + 1),
so
1/(a² + a + 1) = (1/11)a³ − (1/11)a² − (10/11)a + 1.

Notice that a is treated here merely as a symbol that satisfies the
relation a⁴ − 10a² + 1 = 0. The numerical value of a = √2 + √3 = 3.14626337. . .
as a real number is totally ignored. This is algebra, the calculus of
symbols. This allows enormous flexibility: we can regard a as an element
in any extension field E of ℚ in which the polynomial x⁴ − 10x² + 1 has a
root. This idea will be pursued in the next paragraph.
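The extended Euclidean computation above can be reproduced with sympy's `gcdex`, which returns s, t, h with s·f + t·g = h = gcd(f, g):

```python
from sympy import symbols, gcdex, rem, expand

x = symbols('x')
f = x**2 + x + 1        # the element to invert, written as a polynomial
g = x**4 - 10*x**2 + 1  # minimal polynomial of a = sqrt(2) + sqrt(3)

s, t, h = gcdex(f, g, x)
assert h == 1                       # f and g are relatively prime
assert rem(expand(s*f), g, x) == 1  # so s(a) is the inverse of a^2 + a + 1
```

Substituting a for x in s(x) then yields exactly the inverse (1/11)a³ − (1/11)a² − (10/11)a + 1 found by hand.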

50.10 Theorem: Let E/K be a finite dimensional extension. Then E is


algebraic over K and also finitely generated over K.

Proof: Let |E:K| = n ∈ ℕ. To prove that E is algebraic over K, we must
show that every element of E is a root of a nonzero polynomial in K[x]. If
u is an arbitrary element of E, then the n + 1 elements 1,u,u², . . . ,uⁿ⁻¹,uⁿ
of E cannot be linearly independent over K, by Steinitz' replacement
theorem. Thus there are k0,k1,k2, . . . ,kn−1,kn in K, not all of them zero,
with
k0 + k1u + k2u² + . . . + kn−1uⁿ⁻¹ + knuⁿ = 0.
Then g(x) = k0 + k1x + k2x² + . . . + kn−1xⁿ⁻¹ + knxⁿ is a nonzero polynomial
in K[x], in fact of degree at most n, and u is a root of g(x). Thus u is algebraic
over K. Since u was arbitrary, E is algebraic over K.

Secondly, if {b1,b2, . . . ,bn} ⊆ E is a K-basis of E, then
E = sK(b1,b2, . . . ,bn) = {k1b1 + k2b2 + . . . + knbn: k1,k2, . . . ,kn ∈ K}
⊆ {f(b1,b2, . . . ,bn) ∈ E: f ∈ K[x1,x2, . . . ,xn]}
⊆ K(b1,b2, . . . ,bn)
⊆ E,
thus E = K(b1,b2, . . . ,bn) is finitely generated over K.

As a separate lemma, we record the fact that the polynomial g(x) in the
preceding proof has degree at most n.

50.11 Lemma: Let E/K be a field extension of degree |E:K| = n ∈ ℕ. Then
every element of E is algebraic over K and has degree over K at most
equal to n.
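As an illustration (with sympy, over ℚ): the extension ℚ(√2 + √3)/ℚ has degree 4, and its element (√2 + √3)² = 5 + 2√6 has degree 2, which is indeed at most 4:

```python
from sympy import sqrt, minimal_polynomial, symbols, degree

x = symbols('x')

# (sqrt(2) + sqrt(3))**2 = 5 + 2*sqrt(6) lies in a degree-4 extension of Q;
# its minimal polynomial over Q has degree 2 <= 4.
m = minimal_polynomial(5 + 2*sqrt(6), x)
assert m == x**2 - 10*x + 1
assert degree(m, x) <= 4
```

The roots of x² − 10x + 1 are 5 ± 2√6, confirming the minimal polynomial by hand.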

Next we show that an extension generated by algebraic elements is algebraic.

50.12 Theorem: Let E/K be a field extension and let a1,a2, . . . ,an−1,an be
finitely many elements in E. Suppose that a1,a2, . . . ,an−1,an are algebraic
over K. Then K(a1,a2, . . . ,an−1,an) is an algebraic extension of K. In fact,
K(a1,a2, . . . ,an−1,an) is a finite dimensional extension of K and
|K(a1,a2, . . . ,an−1,an):K| ≤ |K(a1):K| · |K(a2):K| · . . . · |K(an):K|.

Proof: Let r1 = |K(a1):K|. For each i = 2, . . . ,n−1,n, the element ai is
algebraic over K, hence also algebraic over K(a1, . . . ,ai−1) by Lemma 50.5. This
lemma yields, in addition, that the minimal polynomial of ai over the
field K(a1, . . . ,ai−1) is a divisor of the minimal polynomial of ai over K; so,
comparing the degrees of these minimal polynomials and using Theorem
50.7, we get ri := |(K(a1, . . . ,ai−1))(ai):K(a1, . . . ,ai−1)| ≤ |K(ai):K|, this for all
i = 2, . . . ,n−1,n. From

K ⊆ K(a1) ⊆ K(a1,a2) ⊆ . . . ⊆ K(a1,a2, . . . ,an−1) ⊆ K(a1,a2, . . . ,an−1,an)

and K(a1, . . . ,ai−1,ai) = (K(a1, . . . ,ai−1))(ai) for i = 2, . . . ,n−1,n

(Lemma 49.6(2)), we obtain

|K(a1,a2, . . . ,an−1,an):K| = rnrn−1 . . . r2r1    (Theorem 48.13)

≤ |K(an):K| · |K(an−1):K| · . . . · |K(a2):K| · |K(a1):K|.

Thus K(a1,a2, . . . ,an−1,an) is a finite dimensional extension of K and, by
Theorem 50.10, an algebraic extension of K.

50.13 Lemma: Let E/K be a field extension and a,b ∈ E. If a and b are
algebraic over K, then a + b, a − b, ab and a/b (in case b ≠ 0) are
algebraic over K.

Proof: If a and b are algebraic over K, then K(a,b) is an algebraic
extension of K by Theorem 50.12: every element of K(a,b) is algebraic over K.
Since a + b, a − b, ab and a/b are in K(a,b), they are algebraic over K.
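Lemma 50.13 can be illustrated with sympy: starting from the algebraic numbers √2 and √3, their sum, difference and product are again algebraic, each with an explicit minimal polynomial over ℚ:

```python
from sympy import sqrt, minimal_polynomial, symbols

x = symbols('x')
a, b = sqrt(2), sqrt(3)

assert minimal_polynomial(a + b, x) == x**4 - 10*x**2 + 1
assert minimal_polynomial(a - b, x) == x**4 - 10*x**2 + 1
assert minimal_polynomial(a * b, x) == x**2 - 6
```

Note that the proof of the lemma gives no polynomial explicitly; it only guarantees one exists inside K(a,b).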

50.14 Theorem: Let E/K be a field extension and let A be the set of all
elements of E which are algebraic over K. Then A is a subfield of E (and
an intermediate field of the extension E/K).

Proof: If a,b ∈ A, then a and b are algebraic over K, so a + b, −b, ab
and 1/b (the last in case b ≠ 0) are algebraic over K by Lemma 50.13,
and so A is a subfield of E by Lemma 48.2. Since any element of K is
algebraic over K (Example 49.8(a)), we have K ⊆ A. Thus A is an
intermediate field of E/K.

50.15 Definition: Let E/K be a field extension and let A be the subfield
of E in Theorem 50.14 consisting exactly of the elements of E which are
algebraic over K. Then A is called the algebraic closure of K in E.

A is of course an algebraic extension of K. In fact, if a ∈ E, then a is
algebraic over K if and only if a ∈ A; and if F is an intermediate field of E/K,
then F is algebraic over K if and only if F ⊆ A.

The last theorem in this paragraph states that an algebraic extension of


an algebraic extension is an algebraic extension, sometimes referred to
as the transitivity of algebraic extensions.

50.16 Theorem: Let F,E,K be fields. If F is an algebraic extension of E


and E is an algebraic extension of K, then F is an algebraic extension of K.

Proof: We must show that every element of F is algebraic over K. Let
u ∈ F. Since F is algebraic over E, its element u is algebraic over E, and
there is a nonzero polynomial f(x) ∈ E[x] with f(u) = 0, say
f(x) = e0 + e1x + . . . + enxⁿ.
We put L = K(e0,e1, . . . ,en). Then clearly f(x) ∈ L[x]. Since E is algebraic
over K, each of e0,e1, . . . ,en is algebraic over K and Theorem 50.12 tells us
that L/K is finite dimensional. Also, since f(u) = 0 and f(x) ∈ L[x], we see
that u is algebraic over L and Theorem 50.7 tells us that L(u)/L is finite
dimensional. So |L(u):K| = |L(u):L| · |K(e0,e1, . . . ,en):K| is a finite number: L(u)
is a finite dimensional extension of K. By Theorem 50.10, L(u) is an
algebraic extension of K. So every element of L(u) is algebraic over K. In
particular, since u ∈ L(u), we see that u is algebraic over K. Since u is an
arbitrary element of F, we conclude that F is an algebraic extension of K.
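A concrete instance of this transitivity, checked with sympy: u = √(1 + √2) is algebraic over ℚ(√2), being a root of x² − (1 + √2), and ℚ(√2) is algebraic over ℚ; the theorem then predicts u is algebraic over ℚ, and indeed its minimal polynomial over ℚ is x⁴ − 2x² − 1:

```python
from sympy import sqrt, minimal_polynomial, expand, symbols

x = symbols('x')
u = sqrt(1 + sqrt(2))

# u is a root of x^2 - (1 + sqrt(2)), a polynomial over Q(sqrt(2))
assert expand((x**2 - (1 + sqrt(2))).subs(x, u)) == 0

# and u is algebraic over Q itself, with a degree-4 minimal polynomial
assert minimal_polynomial(u, x) == x**4 - 2*x**2 - 1
```

The degree 4 = 2 · 2 matches the tower |ℚ(√2)(u):ℚ| = |ℚ(√2)(u):ℚ(√2)| · |ℚ(√2):ℚ| used in the proof.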

50.17 Definition: Let K and L be subfields of a field E. The subfield of
E generated by K ∪ L over P, where P is the prime subfield of E, is called
the compositum of K and L, and denoted by KL.

So KL = P(K ∪ L) by definition. It follows immediately from this
definition that KL = LK. The compositum KL is the smallest subfield of E
containing both K and L, whence KL = K(L) = L(K).

In order to define the compositum of two fields K and L, it is necessary


that these be contained in a larger field. If K and L are not subfields of a
common field, we cannot define the compositum KL.

If E/K is a field extension and a,b ∈ E, then the compositum K(a)K(b) of
K(a) and K(b) is K({a} ∪ {b}) = K(a,b).

Exercises

1. Find the minimal polynomials of the following numbers over the
fields indicated.
(a) √2 over ℚ, ℚ(√2), ℚ(√3).
(b) ∛2 over ℚ, ℚ(√2), ℚ(√3).
(c) √2 + √3 + √5 over ℚ, ℚ(√2), ℚ(√3), ℚ(√2 + √5).
(d) √2 + ∛2 over ℚ, ℚ(√2), ℚ(∛2), ℚ(⁴√2).
(e) ∛2 + √3 over ℚ, ℚ(√2), ℚ(∛2), ℚ(∛2 + √3).
(f) √(3 + √2) over ℚ, ℚ(√2).
(g) ∛(1 + √2) over ℚ, ℚ(√2), ℚ(i).
(h) ∛(1 − √2) over ℚ, ℚ(√2), ℚ(i).
(j) ∛(1 + √2) + ∛(1 − √2) over ℚ, ℚ(√2), ℚ(i).

2. Let E/K be an extension of fields and let D be an integral domain such
that K ⊆ D ⊆ E. Prove that, if E is algebraic over K, then D is a field.

3. Let E/K be an extension of fields and a1,a2, . . . ,am elements of E which


are algebraic over K. Prove that K[a1,a2, . . . ,am] = K(a1,a2, . . . ,am).

4. Let E/K be a field extension and a,b ∈ E. If a is algebraic of degree m
over K and b is algebraic of degree n over K, show that K(a,b) is an
algebraic extension of K and that |K(a,b):K| ≤ mn. If, in addition, m and n
are relatively prime, then in fact |K(a,b):K| = mn.

5. Let E/K be a field extension and L,M intermediate fields. Prove the
following statements.
(a) |LM:K| is finite if and only if both |L:K| and |M:K| are finite.
(b) If |LM:K| is finite, then |L:K| and |M:K| divide |LM:K|.
(c) If |L:K| and |M:K| are finite and relatively prime, then |LM:K| is
equal to |L:K| · |M:K|.
(d) If L and M are algebraic over K, then LM is algebraic over K.
(e) If L is algebraic over K, then LM is algebraic over M.

6. A complex number u is said to be an algebraic integer if u is the root
of a monic polynomial in ℤ[x]. Prove the following statements.
(a) If c ∈ ℂ is algebraic over ℚ, then there is a natural number n
such that nc is an algebraic integer.
(b) If u ∈ ℚ and u is an algebraic integer, then u ∈ ℤ.
(c) Let f(x) and g(x) be monic polynomials in ℚ[x]. If f(x)g(x) ∈ ℤ[x],
then f(x) and g(x) are in ℤ[x]. (Hint: consider contents.)
(d) If u is an algebraic integer, then the minimal polynomial of
u over ℚ is in fact a polynomial in ℤ[x].

§51
Kronecker's Theorem

In this paragraph, we prove an important theorem due to L. Kronecker


which states that any polynomial over a field has a root in some exten-
sion field. It can be regarded as the fundamental theorem of field exten-
sions. As might be expected from Kronecker's philosophical outlook, the
proof is constructive: we do not merely prove the existence of such an
extension in an unseen world; we actually describe what its elements
are and how to add, multiply and invert them.

In our discussions concerning the roots of polynomials, we assumed, up


to this point, that we are given: (1) a field K; (2) a polynomial f(x) in K[x];
(3) an extension field E of K; (4) an element a of E which is a root of f(x).
But in many cases, we are given only a field K and a polynomial f(x) in
K[x], and the problem is to find a root of f(x). In more detail, the problem
is to find a field E, an extension of K, and an element a in E such that f(a)
= 0. Not only are we to find a, but we are also to find E, which is not
given in advance. Kronecker's theorem tells us how to do this.

Let us consider a historical example, viz. the introduction of complex
numbers into mathematics in the 18th and 19th centuries. Mathematicians
had the field ℝ of real numbers, and the polynomial x² + 1 ∈
ℝ[x]. This polynomial has no root in ℝ, because there is no real number
whose square is −1. However, there were strong indications (for instance
Cardan's formula for the roots of a cubic polynomial) that a root of this
polynomial would be very welcome. What did mathematicians do, then?
They invented a symbol √−1, which they knew perfectly well not to be a real
number, and considered the expressions a + b√−1, where a,b ∈ ℝ. These
expressions were coined "complex numbers" (not a fortunate name, by
the way). Two complex numbers a + b√−1 and a´ + b´√−1 are regarded as
equal if and only if a = a´ and b = b´. The sum of two complex numbers is
defined in the obvious way. The product of two complex numbers
a + b√−1, c + d√−1 is found from the naïve calculation

(a + b√−1)(c + d√−1) = ac + ad√−1 + b√−1c + b√−1d√−1
= ac + bd(√−1)² + (ad + bc)√−1
= (ac − bd) + (ad + bc)√−1,

where we interpret (√−1)² as the real number −1. Thus √−1 is a
computational device: we multiply complex numbers using the usual field
properties of ℝ, and putting −1 for (√−1)² wherever (√−1)² occurs. The
rigorous foundation for complex numbers as ordered pairs of real
numbers, due to W. R. Hamilton, came in the middle of the 19th century,
but there was nothing basically wrong in the "definition" of complex
numbers used by the earlier mathematicians. The field ℂ was constructed
in this way as the extension field ℝ(i) of ℝ having a root of the
polynomial x² + 1 ∈ ℝ[x]. More specifically, the complex number 0 + 1√−1
is a root of x² + 1.

Another example. Given the field ℚ and the polynomial x⁴ − 10x² + 1 in
ℚ[x], we wish to find a root of this polynomial. What can we do? As
mentioned in Example 50.9(d), we invent a symbol a, subject it to the
condition a⁴ − 10a² + 1 = 0 and consider all expressions c0 + c1a + c2a² +
c3a³, as c0,c1,c2,c3 run independently over ℚ. These expressions are new
"numbers". These new "numbers" are multiplied using the usual field
properties of ℚ, and putting 0 for a⁴ − 10a² + 1 wherever a⁴ − 10a² + 1
occurs (equivalently, putting 10a² − 1 for a⁴ wherever a⁴ occurs). The
field ℚ(a) is constructed from ℚ and a as an extension field of ℚ having
a root of the polynomial x⁴ − 10x² + 1 ∈ ℚ[x]. More specifically, the
"number" 0 + 1a + 0a² + 0a³ is a root of x⁴ − 10x² + 1.

It is now clear what to do in the general case. Given a field K and an
irreducible polynomial f(x) in K[x], to find a root of f(x), we invent a
symbol u, subject it to the condition f(u) = 0 and consider the K-vector space
with the K-basis 1,u,u², . . . ,uⁿ⁻¹, where n = deg f(x) and 1,u,u², . . . ,uⁿ⁻¹ are
computational symbols. We multiply the elements of this K-vector space
by treating u as an indeterminate over K and writing 0 for f(u) wherever
f(u) occurs. The rigorous method of doing this is to consider the factor
ring K[x]/(f), as suggested by Theorem 50.6.

51.1 Theorem (Kronecker's theorem): Let K be a field and f(x) an
irreducible polynomial in K[x]. Then there is an extension field E of K
such that f(x) has a root in E.

Proof: Let E = K[x]/(f), the factor ring of K[x] modulo the principal ideal
generated by f(x) in K[x]. Since K[x] is a principal ideal domain and f(x) is
irreducible in K[x], the factor ring E = K[x]/(f) is a field (Theorem 32.25).

The mapping φ: K → E
            k ↦ k + (f)
is a ring homomorphism because
(k1 + k2)φ = (k1 + k2) + (f) = (k1 + (f)) + (k2 + (f)) = k1φ + k2φ
and
(k1k2)φ = k1k2 + (f) = (k1 + (f))(k2 + (f)) = k1φ · k2φ
for any k1,k2 ∈ K. Since f(x) is irreducible in K[x], it is not a unit in K[x],
thus 1φ = 1 + (f) ≠ 0 + (f) and φ is one-to-one by Lemma 48.8. So φ is a
field homomorphism. We identify K with its image Kφ in E. So we will
write k instead of k + (f) when k ∈ K. In this way, we regard K as a
subfield of E and E as an extension field of K.

Let us write u = x + (f) ∈ E for brevity. We claim that u is a root of f(x).
Indeed, if f(x) = b0 + b1x + b2x² + . . . + bnxⁿ ∈ K[x], bn ≠ 0, then
f(u) = b0 + b1u + b2u² + . . . + bnuⁿ
= (b0 + (f)) + (b1 + (f))(x + (f)) + (b2 + (f))(x + (f))² + . . . + (bn + (f))(x + (f))ⁿ
= (b0 + (f)) + (b1 + (f))(x + (f)) + (b2 + (f))(x² + (f)) + . . . + (bn + (f))(xⁿ + (f))
= b0 + b1x + b2x² + . . . + bnxⁿ + (f)
= f + (f)
= 0 + (f)
= 0 ∈ E
and so u ∈ E is a root of f(x). Thus E is an extension field of K containing
a root of f(x). (The identification of K with Kφ ⊆ E amounts to writing k
for k + 0u + 0u² + . . . + 0uⁿ⁻¹ ∈ E when k ∈ K.)
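Kronecker's construction is completely mechanical and can be programmed. A minimal sketch in plain Python, written for this illustration: take K to be the field of integers modulo 5 and f(x) = x² − 2, which is irreducible over that field; elements of E = K[x]/(f) are pairs (c0, c1) standing for c0 + c1u, and u² is replaced by 2 wherever it occurs:

```python
# Arithmetic in E = (Z mod 5)[x]/(x^2 - 2); an element c0 + c1*u is a pair (c0, c1).
P = 5  # x^2 - 2 has no root mod 5, so it is irreducible there

def add(s, t):
    return ((s[0] + t[0]) % P, (s[1] + t[1]) % P)

def mul(s, t):
    # (a0 + a1*u)(b0 + b1*u) = a0*b0 + (a0*b1 + a1*b0)*u + a1*b1*u^2,
    # and u^2 is replaced by 2 since u is a root of x^2 - 2
    a0, a1 = s
    b0, b1 = t
    return ((a0*b0 + 2*a1*b1) % P, (a0*b1 + a1*b0) % P)

# x^2 - 2 really has no root in the base field ...
assert all(pow(c, 2, P) != 2 for c in range(P))

# ... but u = (0, 1) is a root of x^2 - 2 in E: u^2 - 2 = 0
u = (0, 1)
assert add(mul(u, u), (-2 % P, 0)) == (0, 0)
```

This is exactly the situation of Example 51.6(a) below, carried out by pair arithmetic instead of cosets.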

Let us keep the notation of the preceding proof. Clearly K(u) ⊆ E. Also,
any element of E has the form c0 + c1x + c2x² + . . . + cmxᵐ + (f), and thus
equals
c0 + c1(x + (f)) + c2(x + (f))² + . . . + cm(x + (f))ᵐ = c0 + c1u + c2u² + . . . + cmuᵐ
and belongs to K(u). So E ⊆ K(u). This shows that E = K(u) is a simple
extension of K.

Now let F = K(t) be another simple extension of K, generated by a root t
in F of f(x) ∈ K[x]. By Theorem 50.6, we have the field isomorphisms

σ: K[x]/(f) → K(u)            τ: K[x]/(f) → K(t)
   g(x) + (f) ↦ g(u)             g(x) + (f) ↦ g(t)

induced from the substitution homomorphisms

Tu: K[x] → K(u)               Tt: K[x] → K(t)
    g(x) ↦ g(u)                   g(x) ↦ g(t)

(see Theorem 30.17). Hence

σ⁻¹τ: K(u) → K(t)
      g(u) ↦ g(t)

is a field isomorphism: K(u) ≅ K(t). Besides, since (k + (f))σ = kTu = k
and likewise (k + (f))τ = k for all k ∈ K, the restriction of σ⁻¹τ to K ⊆ K(u) is the
identity mapping on K. We proved the following strengthening of
Kronecker's theorem.

51.2 Theorem: Let K be a field and let f(x) ∈ K[x] be an irreducible
polynomial in K[x]. Then there is a simple extension K(u) of K such that
u ∈ K(u) is a root of f(x). Moreover, if K(t) is also a simple extension of K
such that t ∈ K(t) is a root of f(x), then K(u) ≅ K(t) and in fact there is an
isomorphism σ: K(u) → K(t) whose restriction to K is the identity
mapping on K.

51.3 Definition: Let K be a field and let f(x) ∈ K[x] be an irreducible
polynomial in K[x]. Then a simple extension K(u) of K, where u is a root
of f(x) (which field exists and is unique to within an isomorphism whose
restriction to K is the identity mapping on K by Theorem 51.2), is called
the field obtained by adjoining a root of f(x) to K.

51.4 Remark: Let K be a field and let f(x) ∈ K[x] be an irreducible
polynomial in K[x]. Suppose K(u) is the field obtained by adjoining a root
u of f(x) to K. Let c be the leading coefficient of f(x). From Theorem 50.3,
we learn that (1/c)f(x) is the minimal polynomial of u over K. Then it
follows from Theorem 50.7 that K(u):K = deg (1/c)f(x) = deg f(x): the
degree over K of the field obtained by adjoining to K a root of an irreduc-
ible polynomial f(x) ∈ K[x] is equal to the degree of f(x).

51.5 Theorem (Kronecker): Let K be a field and let f(x) be a poly-
nomial in K[x]\K (not necessarily irreducible over K) with deg f(x) = n.
Then there is an extension field E of K such that f(x) has a root in E and
E:K ≤ n.

Proof: From f(x) ∉ K, we know that f(x) is neither the zero polynomial
nor a unit in K[x]. As K[x] is a unique factorization domain, we can
decompose f(x) into irreducible polynomials, and adjoin a root of one of
the irreducible divisors of f(x) to K. The field E obtained in this way will
have a root of (that irreducible divisor of f(x), hence also of) f(x).
Moreover, E:K will be equal to the degree of that irreducible divisor of
f(x), hence will be smaller than or equal to deg f(x) = n.

51.6 Examples: (a) Consider the polynomial f(x) = x^2 − 2 ∈ ℤ_5[x]. It is
irreducible over ℤ_5, for otherwise f(x) would have a root in ℤ_5, whereas
there is no element in ℤ_5 whose square is 2 ∈ ℤ_5 (in the language of
elementary number theory, 2 is a quadratic nonresidue mod 5). Let
us adjoin a root u of f(x) to ℤ_5. The resulting field ℤ_5(u) is a ℤ_5-vector
space with a ℤ_5-basis {1,u}, and u^2 = 2 ∈ ℤ_5. Here are some sample com-
putations in ℤ_5(u):

(4 + 2u)(3 + u) = 12 + 4u + 6u + 2u^2 = 2 + 0u + 2·2 = 1,
(3 + 2u)(2 + 4u) = 6 + 12u + 4u + 8u^2 = 1 + 2u + 4u + 3·2 = 7 + 6u = 2 + u.

In view of the equation u^2 = 2 ∈ ℤ_5, we agree to write √2 in place of u in
ℤ_5(u). We keep in mind of course that √2 is just another name for our
computational device u: here √2 is not the real number 1.414… whose
square is the real number 2.

Let us express (1 + 2√2)(3 + √2) and (4 + √2)⁻¹ in terms of the ℤ_5-basis
{1, √2}.

(1 + 2√2)(3 + √2) = 3 + √2 + 6√2 + 2·2 = (3 + 4) + (1 + 6)√2 = 2 + 2√2,
1/(4 + √2) = (4 − √2)/((4 + √2)(4 − √2)) = (4 − √2)/(16 − 2) = (4 − √2)/14 = (1/4)(4 − √2)
= 4(4 − √2) = 16 − 4√2 = 1 + √2.
Check: (4 + √2)(1 + √2) = 4 + 4√2 + √2 + 2 = 6 + 5√2 = 1.
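These hand computations are easy to mechanize. Below is a minimal Python sketch (mine, not the book's): an element a + b√2 of ℤ_5(√2) is stored as a pair (a, b) of residues mod 5, and multiplication uses u^2 = 2. The helper names add, mul, inv are hypothetical.

```python
P, D = 5, 2  # base prime, and the square of the adjoined root: u^2 = 2

def add(x, y):
    # (a + bu) + (c + du) = (a + c) + (b + d)u
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def mul(x, y):
    # (a + bu)(c + du) = (ac + D*bd) + (ad + bc)u, since u^2 = D
    a, b = x
    c, d = y
    return ((a * c + D * b * d) % P, (a * d + b * c) % P)

def inv(x):
    # brute-force inverse; fine in a 25-element field
    for a in range(P):
        for b in range(P):
            if mul(x, (a, b)) == (1, 0):
                return (a, b)
    raise ZeroDivisionError

# the sample computations from the text
print(mul((3, 2), (2, 4)))   # (3 + 2u)(2 + 4u) = 2 + u      -> (2, 1)
print(mul((1, 2), (3, 1)))   # (1 + 2*sqrt2)(3 + sqrt2)      -> (2, 2)
print(inv((4, 1)))           # 1/(4 + sqrt2) = 1 + sqrt2     -> (1, 1)
```

The same two functions, with D = 3, model ℤ_5(√3) of part (b) below.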

Note that σ: ℤ_5(√2) → ℤ_5(√2), a + b√2 ↦ a − b√2, is an automorphism of
ℤ_5(√2), because

[(a + b√2) + (c + d√2)]σ = [(a + c) + (b + d)√2]σ
= (a + c) − (b + d)√2
= (a − b√2) + (c − d√2)
= (a + b√2)σ + (c + d√2)σ

and [(a + b√2)(c + d√2)]σ = [(ac + 2bd) + (ad + bc)√2]σ
= (ac + 2bd) − (ad + bc)√2
= (ac + 2(−b)(−d)) + (a(−d) + (−b)c)√2
= (a − b√2)(c − d√2)
= (a + b√2)σ · (c + d√2)σ

for all a + b√2, c + d√2 ∈ ℤ_5(√2); and σ is clearly onto and Ker σ ≠
ℤ_5(√2). By the binomial theorem (Theorem 29.16),

(a + b√2)^5 = a^5 + 5a^4·b√2 + 10a^3·b^2·2 + 10a^2·b^3·2√2 + 5a·b^4·4 + b^5·4√2
= a^5 + 4b^5√2 = a + 4b√2 = a − b√2

for all a + b√2 ∈ ℤ_5(√2). Thus σ can also be described as

σ: ℤ_5(√2) → ℤ_5(√2), t ↦ t^5.

(b) The polynomial g(x) = x^2 − 3 ∈ ℤ_5[x], too, is irreducible over ℤ_5 (3
is a quadratic nonresidue mod 5). Adjoining a root √3 of g(x) to ℤ_5, we
obtain the field ℤ_5(√3), which is a ℤ_5-vector space with a basis {1, √3}
over ℤ_5, and (√3)^2 = 3 ∈ ℤ_5. We do not forget, of course, that √3 is a com-
putational symbol only, and not the real number 1.732… whose square
is 3. In ℤ_5(√3), we have

(3 + 2√3)(1 + 4√3) = 3 + 12√3 + 2√3 + 8·3 = 27 + 14√3 = 2 + 4√3,
(2 + 3√3)(2 + 4√3) = 4 + 8√3 + 6√3 + 12·3 = 40 + 14√3 = 4√3,

1/(1 + 3√3) = (1 − 3√3)/((1 + 3√3)(1 − 3√3)) = (1 − 3√3)/(1 − 27) = (1 + 2√3)/4 = (1/4)(1 + 2√3)
= 4(1 + 2√3) = 4 + 3√3.

As 8 = 3 in ℤ_5, we may also write √8 for √3, with the understanding that
√8 ∈ ℤ_5(√3) is a computational device satisfying (√8)^2 = 8 = 3. Here √8
is not the real number 2.828… whose square is 8. We might be
tempted to write √8 = √(4·2) = 2√2. For the time being, this is not legit-
imate: as √8 ∈ ℤ_5(√3) and 2√2 ∈ ℤ_5(√2) are in different fields, and not in
their intersection ℤ_5, it is not meaningful to write √8 = 2√2.

However, this suggests that τ: ℤ_5(√3) → ℤ_5(√2), a + b√3 ↦ a + 2b√2,
might be an interesting mapping. Indeed,

[(a + b√3) + (c + d√3)]τ = [(a + c) + (b + d)√3]τ
= (a + c) + 2(b + d)√2
= (a + 2b√2) + (c + 2d√2)
= (a + b√3)τ + (c + d√3)τ

and [(a + b√3)(c + d√3)]τ = [(ac + 3bd) + (ad + bc)√3]τ
= (ac + 3bd) + 2(ad + bc)√2
= (ac + 2·2b·2d) + (a·2d + 2b·c)√2
= (a + 2b√2)(c + 2d√2)
= (a + b√3)τ · (c + d√3)τ

for all a + b√3, c + d√3 ∈ ℤ_5(√3), thus τ is a ring homomorphism. As it is
clearly one-to-one and onto, τ is a field isomorphism. Hence ℤ_5(√3) and
ℤ_5(√2) are isomorphic fields. We identify these two fields by the iso-
morphism τ, i.e., by declaring a + b√3 = a + 2b√2 for all a,b ∈ ℤ_5. Then,
but only then, can we write √3 = 2√2.

We could identify these fields by declaring a + b√3 = a − 2b√2 for all a,b
in ℤ_5, which amounts to identifying them by the isomorphism
τσ: ℤ_5(√3) → ℤ_5(√2). How we identify them is not important, but we
must consistently use one and the same identification.

When we identify ℤ_5(√3) and ℤ_5(√2) by declaring a + b√3 = a + 2b√2 for
all a,b ∈ ℤ_5, we can no longer interpret √18, for example, merely as a
computational device whose square is 18 ∈ ℤ_5, for there are two ele-
ments in ℤ_5(√3) = ℤ_5(√2) whose squares are 18, viz. 2√2 and −2√2 =
3√2. We must specify which of 2√2, 3√2 we mean by √18. Otherwise
we might commit such mistakes as

3√2 = √(9·2) = √18 = √(4·2) = 2√2 in ℤ_5(√2)

(as 9 = 4 in ℤ_5), which resembles the mistake

−7 = √((−7)^2) = √49 = 7 in ℝ.

In ℝ, there are two numbers whose squares are 49, namely 7 and −7,
and √49 is understood to be the positive one of the numbers 7, −7. Thus
when we write √49, we specify which of 7, −7 we mean by √49. This
prevents the mistake −7 = √((−7)^2). In ℤ_5(√2), specifying 2√2 or
3√2 as √18 prevents the mistake 3√2 = 2√2.

Exercises

1. Adjoin a root u of x^3 + 2x^2 − 2 ∈ ℚ[x] to ℚ and construct the field
ℚ(u). Express (u^2 + u − 1)(u^2 + 2u − 5), (u^2 − 3u + 1)/(u^2 + 2u + 3), (u^4 +
u^3)(u^3 − 1) in terms of the ℚ-basis {1, u, u^2} of ℚ(u).

2. Find all monic irreducible polynomials in ℤ_5[x] of degree two (aside
from x^2 − 2 and x^2 − 3, there are eight of them). Adjoining a root u of
these polynomials to ℤ_5, construct eight fields ℤ_5(u) of 25 elements.
Prove that each of these fields is isomorphic to ℤ_5(√2).

3. Prove that ℤ_5* and ℤ_5(√2)* are cyclic.

4. Find a field K of nine elements and show that K* is cyclic.

5. Prove the following statements.
(a) f(x) = x^2 + x + 1 ∈ ℤ_2[x] is irreducible over ℤ_2.
(b) g(x) = x^4 + x + 1 ∈ ℤ_2[x] is irreducible over ℤ_2.
Let i be a root of f(x) and u a root of g(x).
(c) h(x) = x^2 + ix + 1 ∈ ℤ_2(i)[x] is irreducible over ℤ_2(i).
Let t be a root of h(x) ∈ ℤ_2(i)[x].
(d) ℤ_2(i)(t)* and ℤ_2(u)* are cyclic.
(e) ℤ_2(i)(t) ≅ ℤ_2(u).

§52
Finite Fields

We have seen some examples of finite fields, i.e., fields with finitely
many elements. In this paragraph, we want to discuss some properties
of finite fields.

In modern times, it is customary to treat finite fields after the presenta-
tion of Galois theory. Our approach to finite fields will be more elemen-
tary and more concrete than usual. We hope this will prepare the way to
a better understanding of Galois theory. See also Example 54.18(c) and
Theorem 54.26.

We begin by restricting the order of a finite field to prime powers.

52.1 Lemma: Let q be a natural number and K a field with q elements.
Then q = p^n for some prime number p and for some natural number n.

Proof: Let K be a field with q elements. The prime subfield of K cannot
be (isomorphic to) ℚ, for then K would contain infinitely many elements.
Hence the prime subfield of K is (isomorphic to) ℤ_p for some prime
number p. We consider K as a ℤ_p-vector space. The dimension of K over
ℤ_p must be finite, say K:ℤ_p = n. Let {k1,k2,…,kn} be a ℤ_p-basis of K.
Then K consists of the elements

a1k1 + a2k2 + … + ankn

as a1,a2,…,an run independently through ℤ_p, and

a1k1 + a2k2 + … + ankn ≠ b1k1 + b2k2 + … + bnkn

whenever (a1,a2,…,an) ≠ (b1,b2,…,bn). Hence there are p possible choices
for each of a1,a2,…,an and there are precisely p·p·…·p = p^n elements in K.

Thus the condition q = p^n is a necessary condition for the existence of a
field with q elements. One of our main goals in this paragraph is to show
that it is also a sufficient condition.

By the proof of Lemma 52.1, we know that a field with p^n elements is of
characteristic p. We prove two lemmas about (not necessarily finite)
fields of prime characteristic.

52.2 Lemma: Let K be a field of characteristic p ≠ 0. Then

(a + b)^p = a^p + b^p and (ab)^p = a^p·b^p

for all a,b ∈ K.

Proof: We use the binomial theorem (Theorem 29.16). Here p is a prime
number and the binomial coefficients (p choose k) are divisible by p when k =
1,2,…,p−1: note that p! is divisible by p, so k!(p−k)!·(p choose k) is divisible by
p, but k!(p−k)! is relatively prime to p, so p divides (p choose k) by Theorem
5.12. Then, for any a,b ∈ K, we have

(a + b)^p = a^p + ∑_(k=1)^(p−1) (p choose k) a^(p−k) b^k + b^p = a^p + 0 + b^p = a^p + b^p

since p | (p choose k) and char K = p imply that (p choose k) a^(p−k) b^k = 0 for k = 1,2,…,p−1.
This proves (a + b)^p = a^p + b^p. The claim (ab)^p = a^p·b^p follows from Lemma
8.14(1).

Lemma 52.2 states that the mapping K → K, a ↦ a^p, is a field homomorphism
(clearly 1^p = 1 ≠ 0). By induction on m, we obtain

(a1 + a2 + … + am)^p = a1^p + a2^p + … + am^p

for any m elements a1,a2,…,am of a field of prime characteristic p.
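Lemma 52.2 (the "freshman's dream") can be spot-checked numerically. A small Python sketch (mine; the function name is hypothetical), using the three-argument form of pow for modular exponentiation:

```python
def frobenius_additive(m):
    # check (a + b)^m == a^m + b^m in Z_m for all a, b
    return all(pow(a + b, m, m) == (pow(a, m, m) + pow(b, m, m)) % m
               for a in range(m) for b in range(m))

# holds when m is prime, because p divides the binomial coefficients
# (p choose k) for 1 <= k <= p - 1; it fails for composite moduli:
print(frobenius_additive(7))   # True
print(frobenius_additive(6))   # False: (1 + 1)^6 = 64 = 4 in Z_6, but 1^6 + 1^6 = 2
```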

52.3 Lemma: Let K be a field of characteristic p ≠ 0 and n ∈ ℕ. Then

(a + b)^(p^n) = a^(p^n) + b^(p^n)

for any a,b ∈ K.

Proof: We make induction on n. The claim is established for n = 1 in
Lemma 52.2. If the assertion is true for n = k, then, for any a,b ∈ K,

(a + b)^(p^(k+1)) = [(a + b)^(p^k)]^p = [a^(p^k) + b^(p^k)]^p = (a^(p^k))^p + (b^(p^k))^p = a^(p^(k+1)) + b^(p^(k+1))

and it is true for n = k + 1 also. Hence it is true for all n ∈ ℕ.

52.4 Lemma: Let q ∈ ℕ and let K be a field with q elements.
(1) a^(q−1) = 1 for all a ∈ K*.
(2) a^q = a for all a ∈ K.
(3) x^q − x = ∏_(a∈K) (x − a) in K[x].
(4) Let f(x) be a nonzero polynomial of degree d in K[x]. If f(x) | (x^q − x) in
K[x], then f(x) has exactly d roots in K, and these roots are pairwise
distinct.

Proof: (1) K* is a multiplicative group of order |K*| = |K\{0}| = q − 1. Hence
a^(q−1) = 1 for any a ∈ K*.

(2) This follows from (1) if a ≠ 0 and from 0^q = 0 if a = 0.

(3) Any element a of K is a root of the polynomial x^q − x ∈ K[x] by part
(2). Thus both x^q − x and ∏_(a∈K) (x − a) are monic polynomials, in K[x], of
degree q having all the q elements of K as roots. If x^q − x were not equal
to ∏_(a∈K) (x − a), then (x^q − x) − ∏_(a∈K) (x − a) would be a nonzero polynomial
of degree less than q having at least q distinct roots, contrary to
Theorem 35.7. So x^q − x = ∏_(a∈K) (x − a).

(4) We put x^q − x = f(x)g(x), with g(x) ∈ K[x]. Then deg g(x) = q − d. The
roots of x^q − x are pairwise distinct by part (3) and, since any root of f(x)
is also a root of x^q − x, we see that the roots of f(x), too, are pairwise
distinct. Likewise the roots of g(x) are pairwise distinct. Now g(x) has at
most q − d roots in K (Theorem 35.7). If f(x) had r roots in K and r < d,
then x^q − x = f(x)g(x) would have at most r + (q − d) < q roots in K,
contrary to the fact that all q elements of K are roots of x^q − x. Thus f(x)
has at least d roots in K. But it can have at most d roots in K by Theorem
35.7. Hence f(x) has exactly d roots in K.

52.5 Lemma: Let L/K be a field extension and assume that K has q ele-
ments, q ∈ ℕ. Let b be an element of L. Then b ∈ K if and only if b^q = b.

Proof: b ∈ K if and only if b is a root of ∏_(a∈K) (x − a), so if and only if b is
a root of x^q − x, so if and only if b^q = b.

The last two lemmas will now be employed to get information about the
subfields of a finite field. If K1 ⊆ K2 are finite fields, with p^m1 and p^m2
elements, respectively, then K1* is a subgroup of K2*, hence p^m1 − 1 = |K1*|
divides |K2*| = p^m2 − 1 by Lagrange's theorem. We proceed to show that
this happens if and only if m1 divides m2.

52.6 Lemma: Let m,n ∈ ℕ and put d = (m,n).
(1) For any k ∈ ℕ, we have (k^m − 1, k^n − 1) = k^d − 1.
(2) If K is any field and x an indeterminate over K, then, in the unique
factorization domain K[x], we have (x^m − 1, x^n − 1) ∼ x^d − 1.

Proof: (1) We put e = (k^m − 1, k^n − 1). Since

k^m − 1 = (k^d − 1)((k^d)^((m/d)−1) + (k^d)^((m/d)−2) + … + k^d + 1),

we have k^d − 1 | k^m − 1. Likewise k^d − 1 | k^n − 1 and so k^d − 1 | e. On the other
hand, k^m ≡ 1 (mod e), so k^m = 1 in ℤ_e, so o(k) | m. Likewise o(k) | n, so
o(k) | d, so k^d = 1 in ℤ_e, so k^d ≡ 1 (mod e), so e | k^d − 1. From k^d − 1 | e and
e | k^d − 1, we obtain e = k^d − 1, as claimed.
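Part (1) is easy to test numerically; a quick Python check (my sketch) using math.gcd:

```python
from math import gcd

# check (k^m - 1, k^n - 1) = k^(m,n) - 1 for small k, m, n
for k in range(2, 6):
    for m in range(1, 9):
        for n in range(1, 9):
            d = gcd(m, n)
            assert gcd(k**m - 1, k**n - 1) == k**d - 1

print(gcd(2**12 - 1, 2**8 - 1))  # = 2^(12,8) - 1 = 2^4 - 1 = 15
```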

(2) We put f(x) = (x^m − 1, x^n − 1). Since

x^m − 1 = (x^d − 1)((x^d)^((m/d)−1) + (x^d)^((m/d)−2) + … + x^d + 1),

we have x^d − 1 | x^m − 1 in K[x]. Likewise x^d − 1 | x^n − 1 and so x^d − 1 | f(x). On
the other hand, f(x) | x^m − 1, so (x + (f))^m = x^m + (f) = 1 + (f) in K[x]/(f(x)),
hence x + (f) is a unit in K[x]/(f), and the order of x + (f) ∈ (K[x]/(f))*
divides m, likewise n, and therefore d. Thus x^d + (f) = (x +
(f))^d = 1 + (f), and f(x) | x^d − 1 in K[x]. From x^d − 1 | f(x) and f(x) | x^d − 1, we
get f(x) ∼ x^d − 1, as claimed.

52.7 Lemma: Let m,n ∈ ℕ, let p be a prime number, and let K be a field
and x an indeterminate over K.
(1) For any k ∈ ℕ, k ≥ 2, we have k^m − 1 | k^n − 1 if and only if m | n.
(2) In the polynomial ring K[x], we have x^m − 1 | x^n − 1 if and only if m | n.
(3) In the polynomial ring K[x], we have x^(p^m) − x | x^(p^n) − x if and only if m | n.

Proof: (1) k^m − 1 | k^n − 1 if and only if (k^m − 1, k^n − 1) = k^m − 1, so if and
only if k^((m,n)) − 1 = k^m − 1, so if and only if (m,n) = m, so if and only if
m | n.

(2) x^m − 1 | x^n − 1 in K[x] if and only if (x^m − 1, x^n − 1) ∼ x^m − 1, so if and
only if x^((m,n)) − 1 ∼ x^m − 1, so if and only if x^((m,n)) − 1 = x^m − 1, so if and only
if (m,n) = m, so if and only if m | n.

(3) We have x^(p^m) − x | x^(p^n) − x if and only if x^(p^m − 1) − 1 | x^(p^n − 1) − 1, so if and only
if p^m − 1 | p^n − 1 by part (2), so if and only if m | n by part (1).

52.8 Theorem: Let K be a field with p^n elements (p prime). Then K has
a subfield with p^m elements if and only if m | n. In this case, there is
exactly one subfield of K with p^m elements. This subfield is

{a ∈ K : a^(p^m) = a}.

Proof: As noted earlier, if K has a subfield H with p^m elements, then H*
is a subgroup of K*, so p^m − 1 = |H*| divides |K*| = p^n − 1 by Lagrange's
theorem. From p^m − 1 | p^n − 1, we get m | n by Lemma 52.7(1).

Suppose now m | n. We want to show that K has a subfield with p^m
elements. Lemma 52.5 leads us to consider the set of all elements a in K
satisfying a^(p^m) = a. So we put K1 = {a ∈ K : a^(p^m) = a}. Then K1 is not empty
and, for any a,b ∈ K1, we have

(a + b)^(p^m) = a^(p^m) + b^(p^m) = a + b, so a + b ∈ K1 (Lemma 52.3),
(−b)^(p^m) = (−1·b)^(p^m) = (−1)^(p^m)·b^(p^m) = (−1)b, so −b ∈ K1 (even when p = 2),
(ab)^(p^m) = a^(p^m)·b^(p^m) = ab, so ab ∈ K1,
(1/b)^(p^m) = 1/b^(p^m) = 1/b, so 1/b ∈ K1 (if b ≠ 0).

Thus K1 is a subfield of K. We now show that K1 has exactly p^m elements.
Since m | n, we have x^(p^m) − x | x^(p^n) − x in K[x] (Lemma 52.7(3)). Thus the
polynomial x^(p^m) − x has exactly p^m roots in K (and these are pairwise distinct)
(Lemma 52.4(4)), and the roots of x^(p^m) − x are precisely the elements in
K1. Hence K1 has indeed p^m elements.

This proves that K has a subfield K1 with p^m elements whenever m | n.
Moreover, there is only one subfield with p^m elements, for if K2 is a
subfield of K and |K2| = p^m, then any element b of K2 satisfies b^(p^m) = b by
Lemma 52.5, so K2 ⊆ K1, so K2 = K1. The proof is complete.

As an illustration of Theorem 52.8, assume that K4096 is a field with
4096 = 2^12 elements. Then all subfields of K4096 are as in the figure below,
where Kq denotes a field with q elements.

[Figure: the lattice of subfields of K4096 = K_(2^12). At the top K4096;
below it K_(2^6) and K_(2^4); below these K_(2^3) and K_(2^2); at the
bottom the prime field K2. Here K_(2^a) ⊆ K_(2^b) exactly when a | b.]
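The containments in the figure can be checked arithmetically: by Lagrange's theorem, K_(2^m) can sit inside K_(2^12) only if 2^m − 1 divides 2^12 − 1, and by Lemma 52.7(1) this happens exactly when m divides 12. A short Python sketch (mine):

```python
# subfield sizes of a field with 2^12 elements correspond to divisors of 12
n = 12
divisors = [m for m in range(1, n + 1) if n % m == 0]
print(divisors)  # [1, 2, 3, 4, 6, 12]

# for each divisor m, 2^m - 1 divides 2^n - 1, consistent with Lagrange
for m in divisors:
    assert (2**n - 1) % (2**m - 1) == 0

# for non-divisors of 12, the division leaves a nonzero remainder
print([(m, (2**n - 1) % (2**m - 1)) for m in range(2, 12) if n % m != 0])
```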

In particular, assuming the existence of a field with 4096 elements, we
can conclude the existence of a field with 2^1, 2^2, 2^3, 2^4, 2^6 elements, too.
However, we do not know whether a field with 4096 elements really
exists, so the foregoing argument is very weak. It is in fact true that
there is a field with p^n elements, for any prime number p and for any
natural number n. We wish to prove this assertion. We need some
results from elementary number theory.

* *

In the following, we use the notation ∑_(d|n) ad. This means that n ∈ ℕ and
that we take a sum of terms ad as d ranges through the positive divisors
of n, including 1 and n. For instance ∑_(d|12) ad = a1 + a2 + a3 + a4 + a6 + a12 and
∑_(d|15) ad = a1 + a3 + a5 + a15. We clearly have ∑_(d|n) ad = ∑_(d|n) a(n/d). The notations
∏_(d|n) ad and ⋃_(d|n) Sd will have similar meanings.

52.9 Lemma: Let φ be Euler's function. Then, for any natural number n,

∑_(d|n) φ(d) = n.

Proof: For any k ∈ ℕ, φ(k) is defined to be the number of positive
integers less than or equal to k that are relatively prime to k. The
greatest common divisor of any integer in {1,2,…,n} with n is a positive
divisor d of n. Hence we have the disjoint union

{1,2,…,n} = ⋃_(d|n) Sd, where Sd = {k ∈ ℕ : k ≤ n and (k,n) = d}.

Counting the number of elements, we get n = |{1,2,…,n}| = ∑_(d|n) |Sd|. Here

Sd = {k ∈ ℕ : k ≤ n and (k,n) = d}
= {k ∈ ℕ : d | k, k ≤ n and (k,n) = d}
= {k ∈ ℕ : k = db for some b ∈ ℕ, k ≤ n and (k,n) = d}
= {db : b ∈ ℕ, db ≤ n and (db,n) = d}
= {db : b ∈ ℕ, db ≤ n and (db, d·(n/d)) = d}
= {db : b ∈ ℕ, 1 ≤ b ≤ n/d and (b, n/d) = 1}.

Thus |Sd| is the number of positive integers b such that 1 ≤ b ≤ n/d and
(b, n/d) = 1, and this number is φ(n/d) by definition. We then obtain

n = ∑_(d|n) |Sd| = ∑_(d|n) φ(n/d) = ∑_(d|n) φ(d).
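Lemma 52.9 invites a brute-force numerical check; a short Python sketch (mine; the helper names are hypothetical):

```python
from math import gcd

def phi(k):
    # Euler's function: count 1 <= j <= k with (j, k) = 1
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def divisor_sum_of_phi(n):
    # sum of phi(d) over the positive divisors d of n
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)

print([divisor_sum_of_phi(n) for n in range(1, 13)])
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] -- the divisor sum returns n itself
```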

For ease in the formulation of the next lemma, we introduce some terminol-
ogy. Let m ∈ ℕ. A complete residue system mod m is defined to be a set
of m integers such that one and only one of them is congruent to each
one of 1,2,…,m. Thus a complete residue system mod m is a set
{r1,r2,…,rm} such that the residue classes mod m of r1,r2,…,rm make
up ℤ_m. In particular, the ri are then mutually incongruent mod m (and, a
fortiori, mutually distinct). If r1,r2,…,rm are integers mutually incongru-
ent mod m, then {r1,r2,…,rm} is a complete residue system mod m. Also,
if any integer is congruent, modulo m, to one of the integers r1,r2,…,rm,
then {r1,r2,…,rm} is a complete residue system mod m.

A reduced residue system mod m is defined to be a set of φ(m) integers
such that one and only one of them is congruent to each one of the
integers among 1,2,…,m that are relatively prime to m. Thus a reduced
residue system mod m is a set {a1,a2,…,aφ(m)} such that the residue
classes mod m of a1,a2,…,aφ(m) make up ℤ_m*. In particular, the ai are then
mutually incongruent mod m (and, a fortiori, mutually distinct). If
a1,a2,…,aφ(m) are integers relatively prime to m and mutually incongru-
ent mod m, then {a1,a2,…,aφ(m)} is a reduced residue system mod m. Also,
if any integer that is relatively prime to m is congruent, modulo m, to
one of the integers a1,a2,…,aφ(m), then {a1,a2,…,aφ(m)} is a reduced resi-
due system mod m.

52.10 Lemma: Let φ be Euler's function. Let m,n ∈ ℕ and (m,n) = 1.

(1) If {r1,r2,…,rm} is a complete residue system mod m and if
{s1,s2,…,sn} is a complete residue system mod n, then

{msi + nrj : i = 1,2,…,n, j = 1,2,…,m}

is a complete residue system mod mn.

(2) If {a1,a2,…,aφ(m)} is a reduced residue system mod m and if
{b1,b2,…,bφ(n)} is a reduced residue system mod n, then

{mbi + naj : i = 1,2,…,φ(n), j = 1,2,…,φ(m)}

is a reduced residue system mod mn.

(3) φ(mn) = φ(m)φ(n).

Proof: (1) It will be sufficient to show that any two distinct ones of the mn
numbers msi + nrj are incongruent modulo mn. Indeed, if

msi + nrj ≡ msi′ + nrj′ (mod mn),

then msi + nrj ≡ msi′ + nrj′ (mod m) and msi + nrj ≡ msi′ + nrj′ (mod n),
so nrj ≡ nrj′ (mod m) and msi ≡ msi′ (mod n),

so rj ≡ rj′ (mod m) and si ≡ si′ (mod n) (as (n,m) = 1 and (m,n) = 1),
so rj = rj′ and si = si′,
so msi + nrj = msi′ + nrj′.

(2) Let us take a complete residue system {r1,r2,…,rm} mod m such that
{a1,a2,…,aφ(m)} ⊆ {r1,r2,…,rm} and a complete residue system {s1,s2,…,sn}
mod n such that {b1,b2,…,bφ(n)} ⊆ {s1,s2,…,sn}. We have {a1,a2,…,aφ(m)} =
{rj : j = 1,2,…,m, (rj,m) = 1} and {b1,b2,…,bφ(n)} = {si : i = 1,2,…,n, (si,n) =
1}. Now {msi + nrj : i = 1,2,…,n, j = 1,2,…,m} is a complete residue
system mod mn. So it will be sufficient to show that msi + nrj is
relatively prime to mn if and only if (si,n) = 1 and (rj,m) = 1.

If (si,n) ≠ 1, then (si,n) divides both msi + nrj and mn, so (si,n) divides
(msi + nrj, mn) and (msi + nrj, mn) ≠ 1. Likewise (rj,m) ≠ 1 implies that
(msi + nrj, mn) ≠ 1.

On the other hand, if (si,n) = 1 and (rj,m) = 1, then (msi + nrj, mn) = 1. For
otherwise (msi + nrj, mn) would be divisible by a prime number p. Then
we would have p | mn, so p | m or p | n. Without loss of generality, assume
p | m. Also p | msi + nrj, so p | nrj. Since p | m and (m,n) = 1, we would get (p,n)
= 1. Then p | nrj and (p,n) = 1 would give p | rj and p would divide (rj,m),
contrary to (rj,m) = 1. So (si,n) = 1 and (rj,m) = 1 imply (msi + nrj, mn) =
1.

(3) From part (2), we learn that a reduced residue system modulo mn
has φ(m)φ(n) elements. Hence φ(mn) = φ(m)φ(n) whenever m and n are
relatively prime.

It follows by induction on k that φ(m1m2…mk) = φ(m1)φ(m2)…φ(mk) for
all natural numbers m1,m2,…,mk that are pairwise relatively prime. In
particular, if n ∈ ℕ and n = p1^a1·p2^a2·…·pk^ak is the canonical decomposition
of n into prime numbers, then φ(n) = φ(p1^a1)φ(p2^a2)…φ(pk^ak).

Now it is easy to find φ(p^a) in closed form if p is prime: among the p^a
integers 1,2,…,p^a, exactly p^(a−1) of them, namely

p·1, p·2, …, p·p^(a−1),

are not relatively prime to p, so exactly p^a − p^(a−1) of them are relatively
prime to p. This means φ(p^a) = p^a − p^(a−1). We can also write φ(p^a)
= p^a(1 − 1/p).

Therefore, if n ∈ ℕ, n ≠ 1 and n = p1^a1·p2^a2·…·pk^ak is the canonical decom-
position of n into prime numbers, then

φ(n) = (p1^a1 − p1^(a1−1))(p2^a2 − p2^(a2−1))…(pk^ak − pk^(ak−1))
= p1^(a1−1)·p2^(a2−1)·…·pk^(ak−1)·(p1 − 1)(p2 − 1)…(pk − 1)
= p1^a1·p2^a2·…·pk^ak·(1 − 1/p1)(1 − 1/p2)…(1 − 1/pk)
= n(1 − 1/p1)(1 − 1/p2)…(1 − 1/pk).

Expanding the last expression, we find

φ(n) = n − (n/p1 + n/p2 + … + n/pk) + (n/(p1p2) + n/(p1p3) + … + n/(p(k−1)pk)) + …
+ (−1)^k·(n/(p1p2…pk)).

Thus φ(n) is equal to a sum of terms of the form ±n/d, where d is a
product of distinct prime divisors of n, and the sign is + or − according as
the number of prime divisors is even or odd. Thus we can write

φ(n) = ∑_(d|n) μ(d)·(n/d),

where μ(d) = 0 if d is divisible by the square of some prime number,
and, if d is not divisible by the square of any prime number, μ(d) = 1 or
−1 according as the number of (distinct) prime divisors of d is even or
odd. This leads us to the function named after A. F. Möbius (1790-1868).

52.11 Definition: The function μ: ℕ → ℤ, where

μ(1) = 1,
μ(n) = (−1)^r if n is the product of r distinct prime numbers,
μ(n) = 0 otherwise, i.e., if n is divisible by the square of a
prime number,

is called the Möbius function.

For example, μ(1) = 1, μ(2) = −1, μ(3) = −1, μ(4) = 0, μ(5) = −1,
μ(6) = 1, μ(7) = −1, μ(8) = 0, μ(9) = 0, μ(10) = 1.
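In code, μ can be computed by trial-division factorization; a Python sketch (mine) reproducing the sample values above:

```python
def mobius(n):
    # mu(1) = 1; mu(n) = (-1)^r if n is a product of r distinct primes; else 0
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # n has a square factor
            result = -result      # one more distinct prime divisor
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

print([mobius(n) for n in range(1, 11)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
```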

The two formulas n = ∑_(d|n) φ(d) and φ(n) = ∑_(d|n) μ(d)·(n/d) are equivalent. This
is a special case of a formula, known as the Möbius inversion formula, that
connects a divisor sum ∑_(d|n) ad with the ad. To establish this formula, we
need a lemma.

52.12 Lemma: Let μ be the Möbius function and n ∈ ℕ. Then ∑_(d|n) μ(d) is
equal to 1 in case n = 1 and to 0 in case n ≠ 1.

Proof: If n = 1, then ∑_(d|n) μ(d) = ∑_(d|1) μ(d) = μ(1) = 1.

If n ≠ 1 and n = p1^a1·p2^a2·…·pk^ak is the canonical decomposition of n into
prime numbers, then

∑_(d|n) μ(d) = ∑_(d|p1p2…pk) μ(d) = μ(1) + (μ(p1) + μ(p2) + … + μ(pk))
+ (μ(p1p2) + μ(p1p3) + … + μ(p(k−1)pk))
+ (μ(p1p2p3) + … + μ(p(k−2)p(k−1)pk))
+ …
+ μ(p1p2…pk)
= 1 + (k choose 1)(−1)^1 + (k choose 2)(−1)^2 + (k choose 3)(−1)^3 + … + (k choose k)(−1)^k
= (1 − 1)^k = 0.

52.13 Lemma (Möbius inversion formula): Let K be a field and let
f: ℕ → K be any function. Define the function F: ℕ → K by declaring

F(n) = ∑_(d|n) f(d)

for all n ∈ ℕ. Then

f(n) = ∑_(d|n) μ(d)·F(n/d) = ∑_(d|n) μ(n/d)·F(d)

for all n ∈ ℕ.
Proof: Let n ∈ ℕ. For any positive divisor d of n, we have

F(n/d) = ∑_(b|(n/d)) f(b),
μ(d)·F(n/d) = ∑_(b|(n/d)) μ(d)f(b),
∑_(d|n) μ(d)·F(n/d) = ∑_(d|n) ∑_(b|(n/d)) μ(d)f(b).

The last sum is over all ordered pairs (d,b) of positive divisors of n such
that db | n. Hence it is also the sum over all ordered pairs (b,d) of positive
divisors of n such that bd | n and we get

∑_(d|n) μ(d)·F(n/d) = ∑_(b|n) ∑_(d|(n/b)) μ(d)f(b) = ∑_(b|n) f(b)·(∑_(d|(n/b)) μ(d)) = f(n),

since ∑_(d|(n/b)) μ(d) is equal to 1 when b = n and to 0 when b is a proper
divisor of n (Lemma 52.12).
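The inversion formula can be verified numerically for an arbitrary function f; a self-contained Python sketch (mine; mobius is redefined here so the block stands alone):

```python
def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

f = lambda n: n * n + 3                          # an arbitrary test function
F = lambda n: sum(f(d) for d in divisors(n))     # divisor sum F(n) = sum_{d|n} f(d)

# Moebius inversion recovers f:  f(n) = sum_{d|n} mu(d) F(n/d)
recovered = lambda n: sum(mobius(d) * F(n // d) for d in divisors(n))
print(all(recovered(n) == f(n) for n in range(1, 60)))  # True
```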

52.14 Lemma: Let K be a field and let f: ℕ → K* be any function. Define
the function F: ℕ → K* by declaring

F(n) = ∏_(d|n) f(d)

for all n ∈ ℕ. Then

f(n) = ∏_(d|n) F(n/d)^μ(d) = ∏_(d|n) F(d)^μ(n/d)

for all n ∈ ℕ.

Proof: Let n ∈ ℕ. We have

F(n/d) = ∏_(b|(n/d)) f(b),
F(n/d)^μ(d) = ∏_(b|(n/d)) f(b)^μ(d),
∏_(d|n) F(n/d)^μ(d) = ∏_(d|n) ∏_(b|(n/d)) f(b)^μ(d),

and so

∏_(d|n) F(n/d)^μ(d) = ∏_(b|n) ∏_(d|(n/b)) f(b)^μ(d) = ∏_(b|n) f(b)^(∑_(d|(n/b)) μ(d)) = f(n).

* *

We return to finite fields. We will prove that, for any prime number p
and natural number n, there is a finite field with p^n elements and that
any two finite fields with the same number of elements are isomorphic.
We begin by discussing the decomposition of x^(p^n) − x ∈ ℤ_p[x] into irreduc-
ible polynomials in the unique factorization domain ℤ_p[x]. It turns out
that all irreducible factors of x^(p^n) − x are distinct, and an irreducible poly-
nomial in ℤ_p[x] divides x^(p^n) − x if and only if its degree divides n.

52.15 Theorem: Let p be a positive prime number and let Fd(x) be the
product of all monic irreducible polynomials of degree d in ℤ_p[x] (if there
is no monic irreducible polynomial of degree d in ℤ_p[x], let Fd(x) be the
constant polynomial 1 ∈ ℤ_p[x]). Then

x^(p^n) − x = ∏_(d|n) Fd(x) in ℤ_p[x].

Proof: All roots of x^(p^n) − x are simple, because x^(p^n) − x is relatively prime
to its derivative −1. So x^(p^n) − x is not divisible by the square of
any polynomial in ℤ_p[x]. In particular, x^(p^n) − x is not divisible by the
square of any of its irreducible factors in ℤ_p[x].

Suppose f(x) ∈ ℤ_p[x] is a monic irreducible polynomial in ℤ_p[x] and let d =
deg f(x). We construct the field ℤ_p(a) by adjoining a root a of f(x) to ℤ_p.
Now f(x) is the minimal polynomial of a over ℤ_p, so ℤ_p(a):ℤ_p = deg f(x) =
d and ℤ_p(a) is a field of p^d elements. Therefore b^(p^d) = b for all b ∈ ℤ_p(a)
(Lemma 52.4(2)). We are to prove that f(x) | x^(p^n) − x in ℤ_p[x] if and only if
d | n in ℕ.

Assume d | n. As a ∈ ℤ_p(a), we have a^(p^d) = a, so a is a root of x^(p^d) − x ∈ ℤ_p[x].
But f(x) is the minimal polynomial of a over ℤ_p, hence f(x) | x^(p^d) − x in ℤ_p[x].
From d | n, it follows that x^(p^d) − x | x^(p^n) − x (Lemma 52.7(3)), so f(x) | x^(p^n) − x.

Assume now f(x) | x^(p^n) − x. Then f(x)g(x) = x^(p^n) − x for some g(x) ∈ ℤ_p[x], and
f(a)g(a) = a^(p^n) − a = 0. So a is a root of x^(p^n) − x. But then any element of
ℤ_p(a) is a root of x^(p^n) − x: if b ∈ ℤ_p(a), say b = f0 + f1a + f2a^2 + … + f(d−1)a^(d−1)
with f0,f1,f2,…,f(d−1) ∈ ℤ_p, then we get

b^(p^n) = (f0 + f1a + f2a^2 + … + f(d−1)a^(d−1))^(p^n)
= f0^(p^n) + f1^(p^n)·a^(p^n) + f2^(p^n)·(a^2)^(p^n) + … + f(d−1)^(p^n)·(a^(d−1))^(p^n)
= f0 + f1a + f2a^2 + … + f(d−1)a^(d−1) = b.

Since the elements of ℤ_p(a) coincide with the roots of x^(p^d) − x (Lemma
52.4(3)), we see that any root of x^(p^d) − x is also a root of x^(p^n) − x. Therefore
x^(p^d) − x divides x^(p^n) − x and, by Lemma 52.7(3), d divides n.

52.16 Lemma: Let p be a prime number and let Nd be the number of
monic irreducible polynomials of degree d in ℤ_p[x]. Let Fd(x) be the
product of all the Nd monic irreducible polynomials of degree d in ℤ_p[x]
(with the understanding Fd(x) = 1 in case Nd = 0; we prove presently that
Nd ≠ 0). For any n ∈ ℕ, we have

(1) p^n = ∑_(d|n) d·Nd;

(2) Fn(x) = ∏_(d|n) (x^(p^d) − x)^μ(n/d);

(3) Nn = (1/n) ∑_(d|n) μ(n/d)·p^d;

(4) Nn ≠ 0.

Proof: (1) This follows from x^(p^n) − x = ∏_(d|n) Fd(x) by equating the degrees
of the polynomials on both sides.

(2) This follows from the same equation by Lemma 52.14 (applied in the
field ℤ_p(x) of rational functions, with the function f: ℕ → ℤ_p(x)* that
maps n to Fn(x)).

(3) This follows from part (1) by the Möbius inversion formula (Lemma
52.13).

(4) Nn ≥ 0 by its definition. Also, if Nn = 0, we get ∑_(d|n) μ(n/d)·p^d = 0 from
part (3) and, dividing both sides by the smallest p^d for which μ(n/d) ≠ 0,
say by p^d0, we obtain an equation

−μ(n/d0) = ∑_(d|n, d>d0) μ(n/d)·p^(d−d0),

where the right hand side is, and the left hand side is not, divisible by p, a
contradiction. Hence Nn ≠ 0.
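Formulas (1) and (3) are easy to test together in Python; a sketch (mine), with num_irreducibles computing Nn from part (3):

```python
def mobius(n):
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def num_irreducibles(p, n):
    # N_n = (1/n) * sum_{d|n} mu(n/d) * p^d
    total = sum(mobius(n // d) * p**d for d in range(1, n + 1) if n % d == 0)
    assert total % n == 0  # the sum is always divisible by n
    return total // n

print(num_irreducibles(5, 2))   # 10 monic irreducible quadratics over Z_5
print(num_irreducibles(2, 4))   # 3 monic irreducible quartics over Z_2

# consistency with part (1): p^n = sum_{d|n} d * N_d
p, n = 3, 6
assert p**n == sum(d * num_irreducibles(p, d) for d in range(1, n + 1) if n % d == 0)
```

The value 10 for degree two over ℤ_5 matches Exercise 2 of §51 (x^2 − 2, x^2 − 3, and eight others).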

52.17 Theorem: Let n ∈ ℕ and let p be a prime number. Then there
exists a finite field with p^n elements.

Proof: By Lemma 52.16(4), there is an irreducible polynomial f(x) of
degree n in ℤ_p[x]. Let K be the field obtained by adjoining a root of f(x) to
ℤ_p. Then K:ℤ_p = n and K is a field with p^n elements (Theorem 50.7).

52.18 Theorem: Let K be a field and let G be a finite subgroup of K*.
Then G is cyclic. In particular, if K is a finite field, then K* is cyclic.

Proof: Let n = |G|. The order of any element g in G is a divisor of n.
Hence we have the disjoint union

G = ⋃_(d|n) {g ∈ G : o(g) = d},

from which we obtain

n = |G| = ∑_(d|n) ψ(d),

where ψ(d) is the number of elements in G of order d.

We claim that ψ(d) is either 0 or φ(d). If there is no element in G of order
d, then of course ψ(d) = 0. If there does exist an element g in G of order
d, then all the d elements h in the cyclic group ⟨g⟩ generated by g satisfy
h^d = 1. Hence they are roots of the polynomial x^d − 1 ∈ K[x] and this
polynomial has therefore at least d roots in K. On the other hand, it can
have at most d roots in K, thus it has exactly d roots in K, namely the
elements in ⟨g⟩. Thus any element in G that has order d, which
necessarily is a root of x^d − 1, is in the subgroup ⟨g⟩, and an element in
⟨g⟩ is of order d if and only if that element is a generator of ⟨g⟩. Thus
the elements in G of order d coincide with the generators of ⟨g⟩. There
are φ(d) generators of ⟨g⟩, so there are φ(d) elements in G of order d, i.e.,
ψ(d) = φ(d), as claimed.

Since ψ(d) ≤ φ(d) for any positive divisor d of n, we obtain n = ∑_(d|n) ψ(d)
≤ ∑_(d|n) φ(d) = n and this gives ψ(d) = φ(d) for all positive divisors d of n. In
particular, ψ(n) = φ(n) ≠ 0: there is an element a in G of order n. Thus G
is the cyclic group ⟨a⟩.
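For K = ℤ_p, the theorem can be illustrated by brute force: search for an element of multiplicative order p − 1. A Python sketch (mine; the helper names are hypothetical):

```python
def multiplicative_order(a, p):
    # smallest k >= 1 with a^k = 1 in Z_p (a must be nonzero mod p)
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

def generator(p):
    # Theorem 52.18 guarantees one exists: Z_p* is cyclic of order p - 1
    return next(a for a in range(2, p) if multiplicative_order(a, p) == p - 1)

print(generator(7))    # 3: the powers of 3 mod 7 are 3, 2, 6, 4, 5, 1
print(generator(23))   # 5
```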

52.19 Theorem: Let K be a field of p^n elements and let t be a generator
of the cyclic group K*. Then
(1) K = ℤ_p(t).
(2) The minimal polynomial of t over ℤ_p has degree n.
(3) If K1 is any field of p^n elements, then the minimal polynomial of t over
ℤ_p has a root in K1.

Proof: Since 0 ∈ ℤ_p(t) and since any nonzero element of K, being a power
of t, is in ℤ_p(t), we get K ⊆ ℤ_p(t); thus K = ℤ_p(t). This proves (1). Then the
degree of the minimal polynomial of t over ℤ_p is equal to ℤ_p(t):ℤ_p = K:ℤ_p
= n. This proves (2). Finally, since the degree of the minimal polynomial
of t over ℤ_p is equal to n, hence a divisor of n, this polynomial is a divisor
of x^(p^n) − x (Theorem 52.15) and has n distinct roots in K1 (Lemma
52.4(4)); in particular, there is a root of this polynomial in K1. This
proves (3).

52.20 Theorem: Any two finite fields with the same number of ele-
ments are isomorphic.

Proof: Let K and K1 be fields of p^n elements. Then K* is a cyclic group
(Theorem 52.18). Let t be a generator of K*. Then K = ℤ_p(t) by Theorem
52.19(1). Let f(x) ∈ ℤ_p[x] be the minimal polynomial of t over ℤ_p. Now
f(x) has a root c in K1 (Theorem 52.19(3)). Let ℤ_p(c) ⊆ K1 be the subfield
of K1 generated by c over ℤ_p. Then n = deg f(x) = ℤ_p(c):ℤ_p ≤ K1:ℤ_p = n
yields ℤ_p(c) = K1. We then get

K1 = ℤ_p(c) ≅ ℤ_p[x]/(f(x)) ≅ ℤ_p(t) = K

from Theorem 50.6. Hence K1 ≅ K.

In view of this theorem, we identify all finite fields with the same number
of elements. Thus there is a unique field of q elements (q = p^n), and this
field will be henceforward denoted by 𝔽_q.

Exercises

1. Find finite subgroups of ℂ* and show directly that they are cyclic.

2. Let E and K be finite fields, with K ⊆ E and |E:K| = 5. Let a ∈ K. If there
is no b ∈ K such that b² = a, show that there is no b ∈ E such that b² = a.

3. Let E and K be finite fields, with K ⊆ E, and let |E:K| = n. Let a ∈ K be
such that there is no b ∈ K such that b² = a. Prove that, if n is odd, there
is no b ∈ E such that b² = a and that, if n is even, there is a b ∈ E such
that b² = a.

4. Find all monic irreducible polynomials in 𝔽_2[x] of degree 2, 3 and 4.
Verify Theorem 52.
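Exercise 4 can be checked by brute force. The sketch below (our own helper names; polynomials over 𝔽_2 are encoded as bit masks, with degree = bit length − 1) marks every product of two smaller monic polynomials as reducible and counts what is left.

```python
def pmul(a, b):
    """Carry-less (mod-2) polynomial multiplication on bit masks."""
    r, i = 0, 0
    while b >> i:
        if (b >> i) & 1:
            r ^= a << i
        i += 1
    return r

N = 4
# monic[d] = all monic polynomials of degree d over F_2 (top bit forced on)
monic = {d: [m | (1 << d) for m in range(1 << d)] for d in range(N + 1)}

reducible = set()
for d1 in range(1, N):
    for d2 in range(d1, N + 1 - d1):
        for f in monic[d1]:
            for g in monic[d2]:
                reducible.add(pmul(f, g))

irreducible = {d: [f for f in monic[d] if f not in reducible]
               for d in range(2, N + 1)}
# One irreducible of degree 2, two of degree 3, three of degree 4.
assert [len(irreducible[d]) for d in (2, 3, 4)] == [1, 2, 3]
```

The counts 1, 2, 3 agree with the formula (1/n)·∑_{d|n} μ(d)·2^(n/d) for the number of monic irreducibles of degree n over 𝔽_2.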

5. Let p and q be distinct prime numbers. Find the number of monic
irreducible polynomials in 𝔽_p[x] of degree q.


6. Let K be a field with p^n elements. Let a ∈ K and put f(x) = ∏_{k=0}^{n−1} (x − a^(p^k)).
Show that f(x) ∈ 𝔽_p[x]. Conclude that a + a^p + a^(p²) + . . . + a^(p^(n−1)) ∈ 𝔽_p. This sum
a + a^p + a^(p²) + . . . + a^(p^(n−1)) is called the trace of a over 𝔽_p and is denoted by
T_{K/𝔽_p}(a). Prove that T_{K/𝔽_p}(a + b) = T_{K/𝔽_p}(a) + T_{K/𝔽_p}(b) and T_{K/𝔽_p}(ca) =
cT_{K/𝔽_p}(a) for all a,b ∈ K and c ∈ 𝔽_p, and show that there is an a ∈ K with
T_{K/𝔽_p}(a) ≠ 0.

7. Keep the notation of Ex. 6. Prove that g(x) = x^p − x − a ∈ K[x] is either
irreducible in K[x] or is a product of p polynomials of degree one. Prove
that the latter alternative holds if and only if T_{K/𝔽_p}(a) = 0.

8. Construct addition and multiplication tables for the finite fields 𝔽_4,
𝔽_8, 𝔽_9 and 𝔽_16.

9. Find a generator of the cyclic group K* when K = 𝔽_4, 𝔽_5, 𝔽_7, 𝔽_8, 𝔽_9,
𝔽_16, 𝔽_27.

10. Prove that a root of x² + 7x + 2 ∈ 𝔽_11[x] is a generator of the cyclic
group 𝔽_121*.
§53
Splitting Fields

Given a field K and a polynomial f(x) ∈ K[x]\K, is it possible to find an
extension field E of K such that f(x) can be written as a product of poly-
nomials in E[x] of first degree? In this paragraph, we study this problem.

This problem is related to another important question in the theory of
field extensions: whether a field isomorphism can be extended to a field
isomorphism of the extension field. More precisely, if E1/K1 and E2/K2 are
field extensions and if σ: K1 → K2 is a field isomorphism, can we find a
field isomorphism τ: E1 → E2 such that τ|K1 = σ? The answer is negative in
general, but in the important case of simple algebraic extensions, it
turns out to be positive.

Let us recall that, for any field isomorphism σ: K1 → K2, we have a ring
isomorphism σ̄: K1[x] → K2[x] given by (∑_{i=0}^{m} ai·x^i)σ̄ = ∑_{i=0}^{m} (aiσ)x^i (Lemma
33.7, Theorem 33.8).

53.1 Lemma: Let E1/K1 and E2/K2 be field extensions and let σ: K1 → K2
be a field isomorphism. Assume f1(x) ∈ K1[x] is an irreducible polynomial
in K1[x] and let f2(x) = (f1(x))σ̄ ∈ K2[x] be its image under σ̄. Let u1 ∈ E1 be
a root of f1(x) and u2 ∈ E2 a root of f2(x). Let K1(u1) ⊆ E1 be the subfield of
E1 generated by u1 and let K2(u2) ⊆ E2 be the subfield of E2 generated by
u2. Then σ extends to an isomorphism of fields K1(u1) → K2(u2) that maps
u1 to u2; that is, there is a field isomorphism τ: K1(u1) → K2(u2) such that
u1τ = u2 and τ|K1 = σ. Moreover, there is only one isomorphism τ with
these properties.

Proof: We make use of Theorem 50.6 and Theorem 30.18. Since u1 is a
root of f1(x) and f1(x) is irreducible in K1[x], we see that c0⁻¹f1(x) is the
minimal polynomial of u1 over K1, where c0 is the leading coefficient of
f1(x) (Theorem 50.3; as f1(x) is irreducible, it is not the zero polynomial
or a polynomial of degree zero). Now (c0⁻¹f1) = (f1) and from Theorem
50.6 and its proof (which depends on Theorem 30.17 and Theorem
30.18), we know that

μ: K1(u1) → K1[x]/(f1)
∑_i ai·u1^i ↦ ∑_i ai(x + (f1))^i

is a field isomorphism. Likewise there is a field isomorphism

ν: K2(u2) → K2[x]/(f2)
∑_i ai·u2^i ↦ ∑_i ai(x + (f2))^i.

Besides, we have an isomorphism of rings

σ̄: K1[x] → K2[x].

Here (f1) is an ideal of K1[x], therefore (f1)σ̄ = (f2) is an ideal of K2[x]
and K1[x]/(f1) ≅ K2[x]/(f1)σ̄ = K2[x]/(f2) by Theorem 30.19(7). More
specifically, we have the isomorphism

σ*: K1[x]/(f1) → K2[x]/(f2)
g + (f1) ↦ gσ̄ + (f2).

Hence μσ*ν⁻¹: K1(u1) → K2(u2) is a (ring, and therefore also a) field iso-
morphism. We write τ = μσ*ν⁻¹. Then aτ = ((aμ)σ*)ν⁻¹ = ([a + (f1)]σ*)ν⁻¹ =
[aσ + (f2)]ν⁻¹ = aσ for any a ∈ K1 (we regard K1 as a subfield of
K1[x]/(f1) and K2 as a subfield of K2[x]/(f2) as in Kronecker's theorem
(Theorem 51.1)) and u1τ = ((u1μ)σ*)ν⁻¹ = ((x + (f1))σ*)ν⁻¹ = (x + (f2))ν⁻¹
= u2. Thus τ is an extension of σ such that u1τ = u2.

The uniqueness of τ as an extension of σ with u1τ = u2 follows from the
fact that powers of u1 form a K1-basis of K1(u1) (Theorem 50.7). Indeed,
if ρ: K1(u1) → K2(u2) is a field isomorphism such that u1ρ = u2 and ρ|K1 = σ,
then ρ maps any element t = ∑_i ai·u1^i of K1(u1), where ai ∈ K1, to tρ =
∑_i (ai·u1^i)ρ = ∑_i aiσ·(u1ρ)^i = ∑_i aiσ·u2^i = (∑_i ai·u1^i)τ = tτ, and so ρ = τ.

53.2 Theorem: Let E1/K and E2/K be field extensions and let u1 ∈ E1
and u2 ∈ E2 be algebraic over K. Then the minimal polynomial of u1 over
K coincides with the minimal polynomial of u2 if and only if there is an
isomorphism (necessarily unique) of fields τ: K(u1) → K(u2) that maps u1
to u2 and whose restriction to K is the identity mapping on K.

Proof: If u1 and u2 have the same minimal polynomial over K, then we
apply Theorem 53.1 with the identity mapping ι: K → K in place of σ and
conclude that the identity isomorphism can be extended to a unique
isomorphism τ: K(u1) → K(u2) such that u1τ = u2.

Conversely, suppose that τ: K(u1) → K(u2) is a field isomorphism such that
u1τ = u2 and aτ = a for all a ∈ K. Let f(x) = ∑_{i=0}^{m} ai·x^i be the minimal poly-
nomial of u1 over K. Then 0 = f(u1) = ∑_{i=0}^{m} ai·u1^i. Hence 0 = 0τ = (∑_{i=0}^{m} ai·u1^i)τ
= ∑_{i=0}^{m} (ai·u1^i)τ = ∑_{i=0}^{m} aiτ·u1^iτ = ∑_{i=0}^{m} ai(u1τ)^i = ∑_{i=0}^{m} ai·u2^i = f(u2). Thus u2 is a
root of f(x), and f(x) ∈ K[x] is a monic irreducible polynomial, which
means that f(x) is the minimal polynomial of u2 over K.

53.3 Remark: Theorem 53.1 should not mislead the reader to believe
that any field isomorphism can be extended to larger fields. Consider, for
example, the isomorphism σ: ℚ(√2) → ℚ(√2) given by a + b√2 ↦ a − b√2
(a,b ∈ ℚ). Now ℚ(⁴√2) is an extension field of ℚ(√2). If σ: ℚ(√2) → ℚ(√2)
could be extended to an isomorphism τ: ℚ(⁴√2) → ℚ(⁴√2), we would have
−√2 = √2σ = √2τ = ((⁴√2)²)τ = ((⁴√2)τ)², a contradiction, since the
square of (⁴√2)τ ∈ ℚ(⁴√2) ⊆ ℝ has to be positive. So σ cannot be extended
to an isomorphism of ℚ(⁴√2).

The most important application of Theorem 53.1 is that any two splitting
fields of a polynomial are isomorphic. We now discuss this matter.

645
53.4 Definition: Let E/K be a field extension and f(x) ∈ K[x]\K. If f(x)
can be written as a product of linear polynomials in E[x], i.e., if there are
a0,a1,a2, . . . ,am in E such that f(x) = a0(x − a1)(x − a2) . . . (x − am), then f(x) is
said to split in E. If f(x) splits in E but not in any proper subfield of E
containing K, then E is called a splitting field of f(x) over K.

53.5 Examples: (a) Consider x² + 1 ∈ ℝ[x]. Now x² + 1 = (x − i)(x + i) in
ℂ[x], so x² + 1 splits in ℂ. It does not split in any proper subfield of ℂ
containing ℝ because ℝ is the only proper subfield of ℂ containing ℝ
and x² + 1 does not split in ℝ[x]. So ℂ is a splitting field of x² + 1 over ℝ.

ℂ is not a splitting field of x² + 1 over ℚ, because x² + 1 splits in the field
ℚ(i) ⊊ ℂ. Now x² + 1 does not split in ℚ, which is the only proper subfield
of ℚ(i) containing ℚ. Hence ℚ(i) is a splitting field of x² + 1 over ℚ.

(b) ℚ(√2) is a splitting field of x² − 2 ∈ ℚ[x] over ℚ.

(c) x³ − 2 ∈ ℚ[x] does not split in ℚ(∛2) because x³ − 2
= (x − ∛2)(x² + ∛2·x + (∛2)²) in ℚ(∛2)[x] and the second factor is
irreducible in ℚ(∛2)[x]. On the other hand, x³ − 2 = (x − ∛2)(x − ∛2ω)(x −
∛2ω²) in ℚ(∛2, ω)[x], where ω is a primitive third root of unity, so x³ − 2
splits in ℚ(∛2, ω)[x]. In fact ℚ(∛2, ω) is a splitting field of x³ − 2 over ℚ.
Notice that ℚ(∛2, ω) = ℚ(∛2, ∛2ω, ∛2ω²) is the field generated by the
roots of x³ − 2 over ℚ.

(d) Let E/K be a field extension and f(x) ∈ K[x] a polynomial of positive
degree n. Assume that E contains n roots a1,a2, . . . ,an of f(x) (counted with
multiplicity). Then H = K(a1,a2, . . . ,an) is a splitting field of f(x) over K.
Indeed, with the leading coefficient a0 ∈ K, we have the factorization f(x)
= a0(x − a1)(x − a2) . . . (x − an) in H[x] since each factor x − ak belongs to H[x].
Hence f(x) splits in H[x]. On the other hand, if L is any intermediate field
of E/K in which f(x) splits, then x − ak is in L[x] and so ak is in L for all k,
thus {a1,a2, . . . ,an} ⊆ L and H = K(a1,a2, . . . ,an) ⊆ L. Hence f(x) does not split
in any proper subfield of H containing K. Therefore, H is a splitting field
of f(x) over K. This argument shows in fact that K(a1,a2, . . . ,an) is the
unique intermediate field of E/K which is a splitting field of f(x) over K.
In particular, E is a splitting field of f(x) if and only if E = K(a1,a2, . . . ,an).

(e) Let E/K be a field extension, L an intermediate field of this extension
and f(x) ∈ K[x]\K. Assume that E is a splitting field of f(x) over K. Then E
is a splitting field of f(x) over L, too, since f(x) splits in E but not in any
proper subfield of E containing K, so that all the more so f(x) does not
split in any proper subfield of E containing L.
(f) Let p be prime. Any greatest common divisor of x^(p^n) − x with its
derivative p^n·x^(p^n − 1) − 1 = −1 is a unit in 𝔽_p[x]. Hence x^(p^n) − x ∈ 𝔽_p[x] has no
multiple roots (Theorem 35.18(2)). Thus an extension field of 𝔽_p in
which x^(p^n) − x splits must have at least the p^n distinct roots of this
polynomial. We know that x^(p^n) − x splits in the field 𝔽_(p^n) with p^n elements
(Lemma 52.4(3)). Thus 𝔽_(p^n) is a splitting field of x^(p^n) − x over 𝔽_p.

(g) Let E/K be a field extension and f(x) ∈ K[x]\K. Let a1 ∈ E be a root of
f(x) and let L = K(a1) be the subfield of E generated by a1 over K, so that
f(x) = (x − a1)g(x) for some g(x) ∈ L[x]. We claim that, if (g(x) has positive
degree and) E is a splitting field of g(x) over L, then E is also a splitting
field of f(x) over K. Indeed, if E is a splitting field of g(x) over L, then
g(x) = c(x − a2) . . . (x − an), where c ∈ K and a2, . . . ,an ∈ E. We know that E =
L(a2, . . . ,an) from Example 53.5(d). Then f(x) = c(x − a1)(x − a2) . . . (x − an) in
E[x] and f(x) splits in E[x]. On the other hand, if E´ is any intermediate
field of E/K and f(x) splits in E´, then f(x) = c(x − a1)(x − a2) . . . (x − an) in E´[x], so
a1 ∈ E´, so L = K(a1) ⊆ E´ and a2, . . . ,an ∈ E´, so L(a2, . . . ,an) ⊆ E´ and E ⊆ E´.
Thus f(x) cannot split in any proper subfield of E containing K and E is a
splitting field of f(x) over K.

(h) We saw in Example 53.5(c) that ℚ(∛2, ω) is a splitting field of x³ − 2
over ℚ. Likewise, ℚ(∛2)[y]/(y² + y + 1) and ℚ(ω)[y]/(y³ − 2) are splitting
fields of x³ − 2 over ℚ (here y is an indeterminate over ℚ). In these
fields, x³ − 2 splits as

[x − (∛2 + (y² + y + 1))]·[x − (∛2·y + (y² + y + 1))]·[x − (∛2·y² + (y² + y + 1))]

and

[x − (y + (y³ − 2))]·[x − (ωy + (y³ − 2))]·[x − (ω²y + (y³ − 2))],

respectively.
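Numerically, the three roots in Example 53.5(c) can be checked with complex arithmetic (a quick sanity check of ours, not part of the text): expanding (x − ∛2)(x − ∛2ω)(x − ∛2ω²) must return the coefficients of x³ − 2.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)        # primitive third root of unity
r = 2 ** (1 / 3)                        # real cube root of 2
roots = [r, r * w, r * w ** 2]

# Expand (x - r1)(x - r2)(x - r3) via elementary symmetric functions:
e1 = sum(roots)                                                 # -(x^2 coeff)
e2 = roots[0]*roots[1] + roots[0]*roots[2] + roots[1]*roots[2]  # +(x coeff)
e3 = roots[0] * roots[1] * roots[2]                             # -(constant)

# x^3 - 2 has e1 = 0, e2 = 0, e3 = 2 (up to floating-point error).
assert abs(e1) < 1e-9 and abs(e2) < 1e-9 and abs(e3 - 2) < 1e-9
```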

647
A natural question is whether any polynomial has a splitting field. We
show now this is indeed the case. The following theorem is due to
Kronecker.

53.6 Theorem: Let f(x) be an arbitrary polynomial of positive degree
over an arbitrary field K. Then there is an extension field E of K such
that |E:K| ≤ (deg f(x))! and E is a splitting field of f(x) over K.

Proof: We make induction on n = deg f(x). If n = 1, then f(x) = c(x − a)
for some c,a ∈ K and so K is a splitting field of f(x) over K and we have
|E:K| = 1 ≤ 1!. So the claim is established when n = 1.

Suppose now deg f(x) = n ≥ 2 and the theorem is true for any poly-
nomial over any field if its degree is n − 1. We construct an extension
field L of K in which f(x) has a root a and |L:K| ≤ n (Theorem 51.5;
possibly L = K). Then, by Theorem 35.6, f(x) = (x − a)g(x) for some g(x) in
L[x]. Now deg g(x) = n − 1 and, by induction, there is an extension field E
of L such that E is a splitting field of g(x) over L and |E:L| ≤ (n − 1)!.
From Example 53.5(g), we conclude that E is a splitting field of f(x) over
K. Moreover, |E:K| = |E:L|·|L:K| ≤ (n − 1)!·|L:K| ≤ (n − 1)!·n = n!.

We see that Theorem 53.6 is nothing but repeated application of
Kronecker's theorem (Theorem 51.5). We use Theorem 51.5 successively
until we find a field which contains all the roots of f(x). In the proof of
Theorem 53.6, the successive adjunction of roots is replaced by an
inductive argument.

We now turn to the question of uniqueness. Example 53.5(h) reveals


that there can be many distinct splitting fields of a polynomial. However,
as has already been remarked, all splitting fields of a polynomial are
isomorphic. We prove a slightly more general theorem.

53.7 Theorem: Let E1/K1 and E2/K2 be field extensions and let σ: K1 → K2
be a field isomorphism. Let f1(x) be a polynomial in K1[x]\K1 and
let f2(x) = (f1(x))σ̄ ∈ K2[x]\K2 be its image under σ̄. If E1 is a splitting field
of f1(x) over K1 and E2 is a splitting field of f2(x) over K2, then σ extends
to a field isomorphism τ: E1 → E2, and so E1 ≅ E2.

Proof: E1 is generated over K1 by the roots of f1(x). Since each root of
f1(x) is algebraic over K1 and there are finitely many roots, Theorem
50.12 yields that |E1:K1| is finite. We make induction on |E1:K1|.

If |E1:K1| = 1, then E1 = K1 and f1(x) splits in K1. Then f2(x) splits in K2 and
K2 = E2. Thus τ = σ: E1 = K1 → K2 = E2 is the desired isomorphism.

Suppose now |E1:K1| = n ≥ 2 and suppose that any field isomorphism can be
extended to an isomorphism of splitting fields of corresponding poly-
nomials whenever the degree of a splitting field is less than or equal to
n − 1. Since |E1:K1| ≥ 2 and E1 is generated over K1 by the roots of f1(x),
there must be a root of f1(x) in E1 which does not belong to K1. Let u1 be
a root of f1(x) in E1\K1. Assume g1(x) ∈ K1[x] is the minimal polynomial of
u1 over K1 and let u2 be a root of (g1(x))σ̄ = g2(x) ∈ K2[x] in E2. From
Lemma 53.1, we know that σ can be extended to an isomorphism
σ1: K1(u1) → K2(u2). Now u1 ∈ E1\K1, so |K1(u1):K1| > 1 and |E1:K1(u1)| ≤ n − 1
(Theorem 48.13). As E1 is a splitting field of f1(x) over K1(u1) and E2 is a
splitting field of f2(x) over K2(u2) (Example 53.5(e)), we conclude, by
induction, that σ1 can be extended to an isomorphism τ: E1 → E2. This is
the desired extension of σ.

53.8 Theorem: Let K be a field and let f(x) be any polynomial of


positive degree in K[x]. Then any two splitting fields of f(x) over K are
isomorphic. In fact, the splitting fields of f(x) are isomorphic by an
isomorphism fixing each element of K.

Proof: Let E1 and E2 be splitting fields of f(x) over K and apply Theorem
53.7 with K1 = K = K2 and σ the identity mapping on K.

In the remainder of this paragraph, we discuss algebraically closed


fields.

53.9 Definition: A field K is said to be algebraically closed if K has no
proper algebraic extension field, i.e., if any algebraic extension E of K
coincides with K.

53.10 Theorem: Let K be a field. The following statements are equivalent.
(i) K is algebraically closed.
(ii) Any irreducible polynomial in K[x] has degree one.
(iii) Every polynomial of positive degree in K[x] has a root in K.
(iv) Every polynomial of positive degree in K[x] splits in K.

Proof: (i) ⇒ (ii) Assume that K is algebraically closed. If there were an
irreducible polynomial f(x) in K[x] with deg f(x) > 1, then E = K[x]/(f)
would be an algebraic extension of K with K ⊊ E, contrary to the
assumption that K has no proper algebraic extension. Thus every
irreducible polynomial in K[x] has degree one.

(ii) ⇒ (i) Suppose that any irreducible polynomial in K[x] has degree one.
We want to show that K has no proper algebraic extension. If E were a
proper algebraic extension of K, then there would be an a ∈ E\K. Now a
is algebraic over K and K ⊊ K(a) since a ∉ K (Lemma 49.6(1)). This leads
to the contradiction

1 < |K(a):K| = degree of the minimal polynomial of a over K
= degree of an irreducible polynomial in K[x] = 1.

Thus K is algebraically closed.

(ii) ⇒ (iii) Suppose that any irreducible polynomial in K[x] has degree
one. Let f(x) be any polynomial of positive degree in K[x]. We show that
f(x) has a root in K. Indeed, any irreducible divisor of f(x) has the form
c(x − a) with c,a ∈ K and thus has a root a in K; so f(x), too, has a root a
in K.

(iii) ⇒ (iv) Assume that any polynomial of positive degree in K[x] has a
root in K and let f1(x) ∈ K[x]\K. Then f1(x) has a root a1 in K and f1(x) =
(x − a1)f2(x) for some f2(x) ∈ K[x]. If f2(x) has positive degree, then f2(x)
has a root a2 in K and f2(x) = (x − a2)f3(x) for some f3(x) ∈ K[x]; so f1(x) =
(x − a1)(x − a2)f3(x). If f3(x) has positive degree, then f3(x) has a root a3 in
K and f3(x) = (x − a3)f4(x) for some f4(x) ∈ K[x]; so f1(x)
= (x − a1)(x − a2)(x − a3)f4(x). Proceeding in this way, we will meet an fn(x)
of degree zero, and f1(x) = (x − a1)(x − a2)(x − a3) . . . (x − an−1)fn(x) splits in K.

(iv) ⇒ (ii) Suppose every polynomial of positive degree in K[x] splits in K
and let f(x) be an irreducible polynomial in K[x]. Then, by assumption,
f(x) is a product of deg f(x) polynomials of degree one. Since f(x) is
irreducible, the number deg f(x) of factors must be one: deg f(x) = 1. So
any irreducible polynomial in K[x] has degree one.
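The argument in (iii) ⇒ (iv) is effectively an algorithm: find a root, divide it off, repeat. The sketch below (our own code, not from the text) runs that loop over the finite field 𝔽_5 on x⁴ − 1, which has all of its roots in 𝔽_5 and so splits there; roots are found by brute-force search and divided off by synthetic division.

```python
P = 5  # work over F_5 = {0, 1, 2, 3, 4}

def deflate(coeffs, a):
    """Divide a polynomial (leading coefficient first) by (x - a) mod P.
    Returns the quotient; the remainder must be 0 since a is a root."""
    q, carry = [], 0
    for c in coeffs:
        carry = (carry * a + c) % P
        q.append(carry)
    assert q[-1] == 0            # remainder f(a)
    return q[:-1]

f = [1, 0, 0, 0, -1 % P]         # x^4 - 1 over F_5
roots = []
while len(f) > 1:                # while f has positive degree
    a = next(r for r in range(P) if
             sum(c * r ** i for i, c in enumerate(reversed(f))) % P == 0)
    roots.append(a)
    f = deflate(f, a)

# x^4 - 1 = (x-1)(x-2)(x-3)(x-4) over F_5 (Fermat: a^4 = 1 for a != 0).
assert sorted(roots) == [1, 2, 3, 4] and f == [1]
```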

An example of an algebraically closed field is ℂ. This is a consequence of
the result known as the fundamental theorem of algebra, which says
that any polynomial with complex coefficients has a root in ℂ. The name
'fundamental theorem of algebra' is grotesque, for this is neither a funda-
mental theorem nor a theorem of algebra! Any proof of this result is
bound to use some results from analysis.

53.11 Lemma: Let E/K be a field extension and assume that E is


algebraically closed. Let A be the algebraic closure of K in E (Definition
50.15). Then A is an algebraically closed field.

Proof: It suffices to prove that any polynomial in A[x]\A has a root in A.
Let f(x) be a polynomial of positive degree in A[x]. Then f(x) is a poly-
nomial of positive degree in E[x] and therefore has a root b in E
(Theorem 53.10). Then A(b) is an algebraic extension of A and A is an
algebraic extension of K, so A(b) is an algebraic extension of K (Theorem
50.16). Consequently b ∈ A(b) is algebraic over K and hence b ∈ A by
the definition of A.

53.12 Definition: Let E/K be a field extension. If E is an algebraic


extension of K and E is algebraically closed, then E is called an algebraic
closure of K.

651
Does every field K have an algebraic closure? The answer is 'yes' and its
proof requires Zorn's Lemma. There is no algebraic difficulty in the
proof, but there are certain set-theoretical subtleties and we will not
give the proof in this book. It is also true that an algebraic closure of a
field K is unique in the sense that any two algebraic closures of a field K
are isomorphic by an isomorphism that fixes each element of K.

Exercises

1. Construct a splitting field over ℚ of
(a) x² − 3;
(b) x² − 5;
(c) x² − p, where p is a prime number;
(d) x⁵ − 1;
(e) x^p − 1, where p is a prime number;
(f) x⁴ − 5x² + 6;
(g) x⁶ − 10x⁴ + 31x² − 30;
(h) x⁵ + 3x⁴ + x³ − 8x² − 6x + 4;
(i) x⁴ − x² + 1.

2. Let K be a field and let f(x) ∈ K[x] be of degree n > 0. If E is a splitting
field of f(x) over K, show that |E:K| divides n!.

3. What is the difference between an algebraic closure of a field K and


the algebraic closure of K in an extension field?

4. Prove that a finite field cannot be algebraically closed.

§54
Galois Theory

This paragraph gives an exposition of Galois theory. Given any field


extension E/K, we associate intermediate fields of E/K with subgroups of
a group, called the Galois group of the extension. Many questions about
the intermediate field structure of the extension can be thus reduced to
related questions about the subgroup structure of the Galois group. Our
exposition closely follows the treatment of I. Kaplansky.

If E/K is a field extension, then E is a field and also a K-vector space. It


will be very fruitful to study both the field and the vector space
structure of E at the same time. For this reason, we consider mappings
which preserve both of these structures.

Let E be a field. Let us recall that a field automorphism of E is a one-
to-one ring homomorphism from E onto E. Equivalently, a field
automorphism of E is an automorphism of the additive group (E,+) which
is also a ring isomorphism of E. Clearly the identity mapping on E is a
field automorphism of E, so the set of all field automorphisms of E is not
empty. Besides, if σ and τ are any two field automorphisms of E, then στ
and σ⁻¹ are automorphisms of the additive group (E,+) which are ring
isomorphisms from E onto E as well (Lemma 30.16); thus στ and σ⁻¹ are
field automorphisms of E. Therefore the set of all field automorphisms of
E is a subgroup of the group of all automorphisms of the additive group
(E,+). The group of all field automorphisms of E will be denoted by
Aut(E). Thus we use the same notation for the group of additive group
automorphisms of E and the group of field automorphisms of E. This is
not likely to cause confusion. Anyhow, Aut(E) will play a minor role in
the sequel.

Aut(E) is the collection of mappings from E onto E that preserve the field
structure of E. From these field automorphisms, we select the mappings
that preserve the vector space structure of E. We introduce some
terminology.

54.1 Definition: Let E/K and F/K be field extensions. A mapping σ: E →
F is called a K-homomorphism if σ is both a field homomorphism and a
K-vector space homomorphism. A K-homomorphism σ: E → F is called a
K-isomorphism if σ is one-to-one and onto F. A K-isomorphism from E
onto E is called a K-automorphism of E. The set of all K-automorphisms
of E will be denoted by Aut_K E or by G(E/K).

If σ: E → F is a K-homomorphism, then 1_Eσ = 1_F (see the remarks
following Definition 48.9), since σ is a field homomorphism; and, for any
k ∈ K, there holds kσ = (k1_E)σ = k(1_Eσ) = k1_F = k since σ is a K-linear
transformation. Thus kσ = k for all k ∈ K. Conversely, if σ: E → F is a
field homomorphism such that kσ = k for all k ∈ K, then (ke)σ = kσ·eσ = k(eσ)
for all k ∈ K and e ∈ E, and thus σ is a K-linear transformation, too.
Therefore a field homomorphism σ: E → F is a K-homomorphism if and
only if σ fixes every element of K.

54.2 Lemma: Let E/K be a field extension and let Aut_K E be the set of
all K-automorphisms of E over K. Then Aut_K E is a group.

Proof: We have ι ∈ Aut_K E ⊆ Aut(E), where ι is the identity mapping on
E, and Aut(E) is a group. Since the composition of two vector space
isomorphisms and also the inverse of a vector space isomorphism are
vector space isomorphisms (Theorem 41.10), Aut_K E is closed under
composition and forming of inverses. Thus Aut_K E is a subgroup of Aut(E).

54.3 Definition: Let E/K be a field extension. The group Aut_K E = G(E/K)
is called the Galois group of E over K.

54.4 Examples: (a) Let E be any field and let P be the prime subfield
of E. Any field automorphism σ of E fixes 1 ∈ E. This implies that σ fixes
each element in P. Therefore any field automorphism of E is a P-auto-
morphism of E and Aut(E) = Aut_P(E).

(b) The familiar complex conjugation mapping (a + bi ↦ a − bi, where
a,b ∈ ℝ) is an ℝ-automorphism of ℂ.

(c) The mapping σ: ℚ(√2) → ℚ(√2) that maps a + b√2 to a − b√2 (where
a,b ∈ ℚ) is a ℚ-automorphism of ℚ(√2).

(d) Let K be a field and x an indeterminate over K. Then K(x) is an
extension field of K. If a ∈ K*, then ax is transcendental over K and, by
Theorem 49.10, the mapping σ_a: K(x) → K(x) given by f(x)/g(x) ↦
f(ax)/g(ax) is a field automorphism of K(x). It is easy to see that σ_a is in
fact a K-automorphism of K(x). Likewise, for any b ∈ K, the mapping τ_b:
K(x) → K(x) given by f(x)/g(x) ↦ f(x + b)/g(x + b) is a K-automorphism of
K(x). As x(σ_aτ_b) = (ax)τ_b = a(x + b) ≠ ax + b = (x + b)σ_a = x(τ_bσ_a) unless a = 1 or
b = 0, we see that Aut_K K(x) is a nonabelian group.

We find Aut_K K(x). In the following, y and z are two additional distinct
indeterminates over K.

Let u be an arbitrary element in K(x)\K, say u = p(x)/q(x), where p(x)
and q(x) are relatively prime polynomials in K[x] and q(x) ≠ 0. We claim
that u is transcendental over K and K(x) is finite dimensional (hence
algebraic) over K(u).

We prove the first claim, viz. that u is transcendental over K. If u were
algebraic over K, then u would have a minimal polynomial

H(y) = y^k + c_{k−1}y^{k−1} + . . . + c1y + c0 ∈ K[y]

over K. Then, from H(u) = 0, we would get

(p(x)/q(x))^k + c_{k−1}(p(x)/q(x))^{k−1} + . . . + c1(p(x)/q(x)) + c0 = 0,
p(x)^k + c_{k−1}p(x)^{k−1}q(x) + . . . + c1p(x)q(x)^{k−1} + c0q(x)^k = 0,
q(x) | p(x)^k in K[x] and (p(x),q(x)) = 1,
q(x) is a unit in K[x], so q(x) ∈ K,
u = p(x)/q(x) ∈ K[x],
H(u) = u^k + c_{k−1}u^{k−1} + . . . + c1u + c0 is a polynomial of degree k·(deg p(x)),

contrary to H(u) = 0. Thus u is transcendental over K.

Secondly, we prove that |K(x):K(u)| is finite. Now u = p(x)/q(x). Let us put

p(x) = a_nx^n + a_{n−1}x^{n−1} + . . . + a1x + a0,  q(x) = b_mx^m + b_{m−1}x^{m−1} + . . . + b1x + b0,

with a_n ≠ 0 ≠ b_m. We note that x is a root of the polynomial

F(y) = u·q(y) − p(y)
= (b_mu)y^m + (b_{m−1}u)y^{m−1} + . . . + (b1u)y + b0u
− a_ny^n − a_{n−1}y^{n−1} − . . . − a1y − a0

in K(u)[y]. Thus x is algebraic over K(u). We see moreover that deg F(y) =
max(m,n) = max(deg p(x),deg q(x)), because b_mu − a_n ≠ 0 as u ∉ K. We
will show that F(y) is irreducible over K(u). This will imply that cF(y) is the
minimal polynomial of x over K(u), where 1/c is the leading coefficient
of F(y), and so |K(x):K(u)| = deg cF(y) = deg F(y) = max(deg p(x),deg
q(x)).

Now the irreducibility of F(y) over K(u). Since u is transcendental over K,
the substitution homomorphism z ↦ u is in fact a field isomorphism
from K(z) onto K(u) ⊆ K(x) (Theorem 49.10). So K(u) ≅ K(z) and Theorem
33.8 gives K(u)[y] ≅ K(z)[y]. Then F(y) ∈ K(u)[y] is irreducible in K(u)[y] if
and only if its image q(y)z − p(y) ∈ K(z)[y] is irreducible in K(z)[y]. From
Theorem 34.5(3) and Lemma 34.11, we conclude that this holds if and only
if q(y)z − p(y) is irreducible in K[z][y] = K[y][z]. But q(y)z − p(y)
is certainly irreducible in K[y][z] since q(y)z − p(y) is of degree one in
z and its coefficients q(y), p(y) are relatively prime in K[y] (for p(x)
and q(x) are relatively prime in K[x]).

Thus we get |K(x):K(u)| = max(deg p(x),deg q(x)) for any u = p(x)/q(x) in
K(x)\K, where p(x) and q(x) are relatively prime polynomials in K[x] and
q(x) ≠ 0.

Now let σ ∈ Aut_K K(x) and xσ = u. Write u = p(x)/q(x) as above. Since

K(u) = K(xσ) = {f(xσ)/g(xσ): f,g ∈ K[x], g ≠ 0}
= {f(x)σ/g(x)σ: f,g ∈ K[x], g ≠ 0}
= {(f(x)/g(x))σ: f,g ∈ K[x], g ≠ 0}
= (K(x))σ = K(x) ⊋ K,

we have u ∈ K(x)\K and

1 = |K(x):K(x)| = |K(x):K(u)| = max(deg p(x),deg q(x))

yields p(x) = ax + b, q(x) = cx + d for some a,b,c,d ∈ K. Here ad − bc ≠ 0, for
ad − bc = 0 implies the contradiction u = p(x)/q(x) = (ax + b)/(cx + d) ∈ K.

Thus every automorphism in Aut_K K(x) is a substitution homomorphism
that sends x to (ax + b)/(cx + d) for some a,b,c,d ∈ K satisfying ad − bc ≠
0. Conversely, if σ is a substitution homomorphism of this type, with xσ =
(ax + b)/(cx + d), a,b,c,d ∈ K, ad − bc ≠ 0, then (ax + b)/(cx + d) =: u is not
in K, so u is transcendental over K and σ is a field homomorphism from
K(x) onto K(u). Since ad − bc ≠ 0, both of a and c cannot be 0, so
|K(x):K(u)| = max(deg(ax + b),deg(cx + d)) = 1 and K(u) = K(x). Hence σ is a
field homomorphism from K(x) onto K(x). As σ fixes all elements in K, we
infer that σ is in Aut_K K(x). Therefore Aut_K K(x) consists exactly of the
substitution homomorphisms x ↦ (ax + b)/(cx + d), where a,b,c,d ∈ K and
ad − bc ≠ 0.
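The rule x ↦ (ax + b)/(cx + d) composes like 2×2 matrices: substituting one such expression into another corresponds to multiplying the matrices ((a,b),(c,d)). A quick sketch of ours (exact rational arithmetic; the helper names are assumptions, not from the text):

```python
from fractions import Fraction

def moebius(M, x):
    """Evaluate x -> (a*x + b)/(c*x + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * x + b) / (c * x + d)

def matmul(M, N):
    """Ordinary 2x2 matrix product."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

M = ((1, 1), (0, 1))   # x -> x + 1
N = ((2, 0), (0, 1))   # x -> 2x
# Substituting N into M corresponds to the matrix product M*N:
for x in (Fraction(1), Fraction(2), Fraction(-3, 2)):
    assert moebius(M, moebius(N, x)) == moebius(matmul(M, N), x)
# The two orders differ, matching the fact that Aut_K K(x) is nonabelian:
assert matmul(M, N) != matmul(N, M)
```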

The next lemma is a generalization of the familiar fact that the complex
conjugate of any root of a polynomial with real coefficients is also a root
of the same polynomial. In the terminology of §26, if E/K is a field
extension, Aut_K E acts on the set of distinct roots of a polynomial f(x)
over K.

54.5 Lemma: Let E/K be a field extension and f(x) ∈ K[x]. If u ∈ E is a
root of f(x), then, for any σ ∈ Aut_K E, the element uσ of E is also a root of
f(x).

Proof: If f(x) = ∑_{i=0}^{m} ai·x^i, then f(u) = 0 implies 0 = 0σ = (f(u))σ = (∑_{i=0}^{m} ai·u^i)σ
= ∑_{i=0}^{m} (aiσ)(u^iσ) = ∑_{i=0}^{m} ai(uσ)^i = f(uσ). Thus uσ is a root of f(x).

Let E/K be a finite dimensional extension and assume that {a1,a2, . . . ,am}
is a K-basis of E. Then any K-automorphism of E is completely
determined by its effect on the basis elements, for if σ and τ are K-
automorphisms of E and aiσ = aiτ for i = 1,2, . . . ,m, then, for any a ∈ E,
which we write in the form a = ∑_{i=1}^{m} ki·ai (ki ∈ K), we have aσ = (∑_{i=1}^{m} ki·ai)σ
= ∑_{i=1}^{m} ki(aiσ) = ∑_{i=1}^{m} ki(aiτ) = (∑_{i=1}^{m} ki·ai)τ = aτ. For this reason,
we will describe the K-automorphisms of E by describing the images of
the basis elements. Thus the conjugation mapping will be denoted by
i ↦ −i, the mapping of Example 54.4(c) by √2 ↦ −√2, etc.

In particular, if E/K is a simple extension and a is a primitive element,
then {1,a,a², . . . ,a^{n−1}} is a K-basis of E = K(a), where n is the degree of the
minimal polynomial of a over K (Theorem 50.7). Let σ ∈ Aut_K E. Since a^iσ
= (aσ)^i for any i = 0,1,2, . . . ,n − 1, the mapping σ is completely deter-
mined by its effect on a. Now aσ is a root in K(a) of the minimal polyno-
mial of a over K. Thus |Aut_K E| ≤ r, where r is the number of distinct
roots in K(a) of the minimal polynomial of a over K. We proved the
following lemma.

54.6 Lemma: Let K be a field. If a is algebraic over K with the minimal
polynomial f over K, and if r is the number of distinct roots of f in K(a),
then |Aut_K K(a)| ≤ r ≤ deg f = |K(a):K|.

54.7 Examples: (a) Let ∛2 be the positive real cube root of 2. Thus
ℚ(∛2) ⊆ ℝ. We find Aut_ℚ ℚ(∛2). If σ ∈ Aut_ℚ ℚ(∛2), then ∛2σ is a
root of the minimal polynomial x³ − 2 of ∛2 over ℚ. Since the roots of x³
− 2 other than ∛2 are complex, ∛2σ must be ∛2. Thus σ must be the
identity mapping on ℚ(∛2) and |Aut_ℚ ℚ(∛2)| = 1.

(b) ℂ = ℝ(i), and the minimal polynomial of i over ℝ is x² + 1, which has
two roots in ℂ. Thus |Aut_ℝ ℂ| ≤ 2. Since the identity mapping and the
conjugation mapping are ℝ-automorphisms of ℂ, |Aut_ℝ ℂ| = 2 and we
get Aut_ℝ ℂ ≅ C2. Likewise Aut_ℚ ℚ(√2) ≅ C2.

(c) We find Aut_ℚ ℚ(√2,√3). We have ℚ(√2,√3) = ℚ(√2)(√3). Here {1,√2}
is a ℚ-basis of ℚ(√2) and {1,√3} is a ℚ(√2)-basis of ℚ(√2)(√3) (because
x² − 3 is irreducible over ℚ(√2)); hence, by Theorem 48.13, {1,√2,√3,√6}
is a ℚ-basis of ℚ(√2,√3). Now any σ ∈ Aut_ℚ ℚ(√2,√3) maps √2 to √2 or
to −√2 and √3 to √3 or to −√3, and there are four possibilities for σ:

(a + b√2 + c√3 + d√6)σ1 = a + b√2 + c√3 + d√6
(a + b√2 + c√3 + d√6)σ2 = a + b√2 − c√3 − d√6
(a + b√2 + c√3 + d√6)σ3 = a − b√2 + c√3 − d√6
(a + b√2 + c√3 + d√6)σ4 = a − b√2 − c√3 + d√6

(a,b,c,d ∈ ℚ). It is easy to see that σ1, σ2, σ3, σ4 are indeed ℚ-automorph-
isms of ℚ(√2,√3), so that Aut_ℚ ℚ(√2,√3) = {σ1, σ2, σ3, σ4}. Here σ1 is the
identity mapping on ℚ(√2,√3) and σiσj = σk when {i,j,k} = {2,3,4}. Thus
Aut_ℚ ℚ(√2,√3) ≅ C2 × C2 ≅ V4.
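The group table of Example 54.7(c) can be machine-checked. In the sketch below (our own encoding, not from the text), an element a + b√2 + c√3 + d√6 is a coordinate tuple (a,b,c,d); each σ is a pair of signs (s,t) with √2 ↦ s√2 and √3 ↦ t√3 (so √6 = √2·√3 ↦ st√6), and composition multiplies the sign pairs.

```python
from itertools import product

# An automorphism is (s, t): the signs given to sqrt(2) and sqrt(3).
autos = {(s, t) for s in (1, -1) for t in (1, -1)}

def apply(sig, v):
    s, t = sig
    a, b, c, d = v
    return (a, s * b, t * c, s * t * d)

def compose(sig1, sig2):
    return (sig1[0] * sig2[0], sig1[1] * sig2[1])

def mul(v, w):
    """Multiply in Q(sqrt2, sqrt3) with basis 1, sqrt2, sqrt3, sqrt6."""
    a, b, c, d = v
    e, f, g, h = w
    return (a*e + 2*b*f + 3*c*g + 6*d*h,
            a*f + b*e + 3*(c*h + d*g),
            a*g + c*e + 2*(b*h + d*f),
            a*h + d*e + b*g + c*f)

# Each sigma is multiplicative, and the four form a Klein four-group.
for sig in autos:
    for v, w in product([(1, 2, 3, 4), (0, 1, -1, 2)], repeat=2):
        assert apply(sig, mul(v, w)) == mul(apply(sig, v), apply(sig, w))
    assert compose(sig, sig) == (1, 1)      # every element has order <= 2
assert {compose(s1, s2) for s1 in autos for s2 in autos} == autos
```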

We now proceed to establish the correspondence between intermediate


fields of an extension E/K and subgroups of AutK E.

54.8 Lemma: Let E/K be a field extension and put G = Aut_K E.
(1) If L is an intermediate field of E/K, then
L´ = {σ ∈ G: lσ = l for all l ∈ L}
is a subgroup of G.

(2) If H is a subgroup of G, then
H´ = {a ∈ E: aσ = a for all σ ∈ H}
is an intermediate field of E/K.

Proof: (1) Clearly the identity mapping ι on E is in L´, so L´ ≠ ∅. If σ, τ ∈ L´,
then l(στ) = (lσ)τ = lτ = l for all l ∈ L, so στ ∈ L´; and lσ = l gives lσ⁻¹ = l
for all l ∈ L, so σ⁻¹ ∈ L´. Thus L´ is a subgroup of Aut_K E. (In fact L´ = Aut_L E.)

(2) Since any σ ∈ H ⊆ Aut_K E fixes the elements of K, we have K ⊆ H´. If
a,b ∈ H´, then aσ = a and bσ = b for all σ ∈ H, so

(a + b)σ = aσ + bσ = a + b,  so a + b ∈ H´,
(−b)σ = −(bσ) = −b,  so −b ∈ H´,
(ab)σ = aσ·bσ = ab,  so ab ∈ H´,
(1/b)σ = 1/bσ = 1/b (provided b ≠ 0),  so 1/b ∈ H´.

So H´ is a subfield of E and therefore H´ is an intermediate field of E/K.

For example, in the notation of Example 54.7(c), we have

ℚ(√2,√3)´ = 1 ≤ G = Aut_ℚ ℚ(√2,√3),
ℚ(√2)´ = {σ1, σ2},  ℚ(√3)´ = {σ1, σ3},  ℚ(√6)´ = {σ1, σ4},
ℚ´ = G

and

1´ = ℚ(√2,√3),
{σ1, σ2}´ = ℚ(√2),  {σ1, σ3}´ = ℚ(√3),  {σ1, σ4}´ = ℚ(√6),
G´ = ℚ.

If E/K is a field extension and H ≤ Aut_K E, then H´ is called the fixed field
of H. Let us consider the four extreme cases of the priming correspond-
ence in Lemma 54.8.

54.9 Lemma: Let E/K be a field extension and G = Aut_K E. Then
(1) 1´ = E.
(2) E´ = 1.
(3) K´ = G.
(4) G´ contains K, and possibly K ⊊ G´.

Proof: (1) 1´ = {a ∈ E: aσ = a for all σ ∈ 1} = {a ∈ E: aι = a} = E.
(2) E´ = {σ ∈ G: aσ = a for all a ∈ E} = {ι} = 1.
(3) K´ = {σ ∈ G: aσ = a for all a ∈ K} = G.
(4) Of course K ⊆ G´. From Example 54.7(a), we know that Aut_ℚ ℚ(∛2) = 1,
so that, for the extension ℚ(∛2)/ℚ, we have G = 1 and K = ℚ ⊊ ℚ(∛2) =
1´ = G´. Thus G´ is not always equal to K.


54.10 Definition: Let E/K be a field extension and put G = Aut_K E. If G´
is equal to K, then E/K is said to be a Galois extension and E is said to be
Galois over K.

Equivalently, E/K is Galois if and only if for any element a of E\K, there
exists a σ ∈ Aut_K E such that aσ ≠ a. It is easy to verify that ℂ is a Galois
extension of ℝ and that ℚ(√2) and ℚ(√2,√3) are Galois extensions of ℚ.

54.11 Lemma: Let E/K be a field extension and put G = Aut_K E. Let L, M be intermediate fields of E/K and let H, J be subgroups of G. If X is an intermediate field of E/K or a subgroup of G, we denote (X′)′ shortly by X″. Then the following hold.

(1) If L ⊆ M, then M′ ≤ L′.
(2) If H ≤ J, then J′ ⊆ H′.
(3) L ⊆ L″ and H ≤ H″.
(4) L′′′ = L′ and H′′′ = H′.

Proof: (1) Suppose L ⊆ M. If σ ∈ M′, then aσ = a for all a ∈ M and a fortiori aσ = a for all a ∈ L, hence σ ∈ L′ and consequently M′ ≤ L′.

(2) Suppose H ≤ J. If a ∈ J′, then aσ = a for all σ ∈ J and a fortiori aσ = a for all σ ∈ H, hence a ∈ H′ and consequently J′ ⊆ H′.

(3) If a ∈ L, then aσ = a for all σ ∈ L′ by the definition of L′, so a is fixed by all the K-automorphisms in L′. Hence a is in the fixed field of L′ and a ∈ L″. This gives L ⊆ L″. If σ ∈ H, then aσ = a for all a ∈ H′ by the definition of H′, so σ fixes every element in H′, so σ ∈ H″. This gives H ≤ H″.

(4) By parts (1) and (2), priming reverses inclusion, therefore L ⊆ L″ and H ≤ H″ yield L′′′ ≤ L′ and H′′′ ⊆ H′. Also, using (3) with L replaced by H′ and H by L′, we get H′ ⊆ H′′′ and L′ ≤ L′′′. So L′′′ = L′ and H′′′ = H′.

(Schematically, priming reverses inclusions: the chain K ⊆ L ⊆ M ⊆ E corresponds to G ≥ L′ ≥ M′ ≥ E′ = 1, and the chain 1 ≤ H ≤ J ≤ G corresponds to E ⊇ H′ ⊇ J′ ⊇ G′.)

In general, L may very well be a proper subset of L″ and H a proper subgroup of H″. We introduce a term for the case of equality.

54.12 Definition: Let E/K be a field extension and G = Aut_K E. An intermediate field L of E/K is said to be closed if L = L″, and a subgroup H of G is said to be closed if H = H″.

So E is Galois over K if and only if K is closed. Lemma 54.11(4) states
that any primed object is closed.

54.13 Theorem: Let E/K be a field extension and G = Aut_K E. There is a one-to-one correspondence between the set of all closed intermediate fields of E/K and the set of all closed subgroups of G, given by L ↦ L′.

Proof: If L is a closed intermediate field of E/K, then L′ is a subgroup of G by Lemma 54.8(1) and L′ is closed by Lemma 54.11(4). Thus priming is a mapping from the set of all closed intermediate fields of the extension into the set of all closed subgroups of G. This mapping is one-to-one, for L′ = M′ implies L″ = M″, whence L = M because L and M are closed. Finally, the priming mapping is onto the set of all closed subgroups of G because, if H is any closed subgroup of G, then H′ is a closed intermediate field and (H′)′ = H″ = H. This completes the proof.

This theorem is "virtually useless" until we determine which intermediate fields and which subgroups are closed. In the most important case, when E/K is a finite dimensional Galois extension, all intermediate fields and all subgroups will turn out to be closed.

Our next goal is to show that an object is closed if it is "bigger than a


closed object by a finite amount" (Theorem 54.16). We need two
technical lemmas.

If E/K is a field extension and L, M are intermediate fields with L ⊆ M, then the dimension |M:L| of M over L will be called the relative dimension of L and M. If G is the Galois group of this extension and H, J are subgroups of G with H ≤ J, then the index |J:H| of H in J will be called the relative index of H and J.

54.14 Lemma: Let E/K be a field extension and L, M intermediate fields with L ⊆ M. If the relative dimension |M:L| of L and M is finite, then the relative index of M′ and L′ is also finite. In fact, |L′:M′| ≤ |M:L|. In particular, if E/K is a finite dimensional extension, then |Aut_K E| ≤ |E:K|.

Proof: We make induction on n = |M:L|. If n = 1, then M = L and L′ = M′, so |L′:M′| = 1. Suppose now n ≥ 2 and that the lemma has been proved for all i < n. Since |M:L| > 1, we can find an a ∈ M\L. Now |M:L| is finite and therefore M is an algebraic extension of L (Theorem 50.10), so a is algebraic over L. Let f(x) ∈ L[x] be the minimal polynomial of a over L and put k = deg f(x). We have k > 1 because a ∉ L (Lemma 49.6(1)). From Theorem 50.7, we deduce |L(a):L| = k, and Theorem 48.13 gives |M:L(a)| = n/k. The situation is depicted below.

M ———— M′
|  n/k
L(a) ———— L(a)′
|  k
L ———— L′

In case k < n, induction settles everything: from n/k < n and k < n, we obtain |L(a)′:M′| ≤ |M:L(a)| and |L′:L(a)′| ≤ |L(a):L|, and therefore |L′:M′| = |L′:L(a)′||L(a)′:M′| ≤ |L(a):L||M:L(a)| = k(n/k) = n = |M:L|. The case k = n requires a separate argument.

Suppose now k = n, so that |M:L(a)| = 1 and M = L(a). In order to prove |L′:M′| ≤ n, we construct a one-to-one mapping Φ from the set of all right cosets of M′ in L′ into the set of all distinct roots of f(x) in E. Since there are |L′:M′| right cosets, this will prove that |L′:M′| ≤ r, where r is the number of distinct roots of f(x) in E. As r ≤ deg f = |L(a):L| = |M:L|, the lemma will be thereby proved.

What the required mapping should be is suggested by Lemma 54.5. We put
Φ: M′σ ↦ aσ  (σ ∈ L′),
so Φ maps the right cosets of M′ in L′ into {b ∈ E : f(b) = 0}. Since a is a root of f(x) and σ ∈ L′ ≤ G = Aut_K E, Lemma 54.5 yields that aσ is indeed a root of f(x). The mapping Φ is well defined, for if M′σ = M′τ (σ, τ ∈ L′), then σ = μτ for some μ ∈ M′, so μ fixes every element of M, so μ fixes a and (M′σ)Φ = aσ = a(μτ) = (aμ)τ = aτ = (M′τ)Φ. Moreover, Φ is one-to-one, for if (M′σ)Φ = (M′τ)Φ, then aσ = aτ, so aστ⁻¹ = a, so στ⁻¹ fixes a, so στ⁻¹ fixes each element of L(a) = M, so στ⁻¹ ∈ M′ and M′σ = M′τ. This completes the proof of |L′:M′| ≤ |M:L|.

The assertion |Aut_K E| ≤ |E:K| follows easily: |Aut_K E| = |Aut_K E : 1| = |K′:E′| ≤ |E:K|.

54.15 Lemma: Let E/K be a field extension and H, J subgroups of G = Aut_K E with H ≤ J. If the relative index |J:H| of H and J is finite, then the relative dimension of J′ and H′ is also finite. In fact, |H′:J′| ≤ |J:H|.

Proof: Let |J:H| = n and assume, by way of contradiction, that |H′:J′| > n. Then there are n + 1 elements in H′ that are linearly independent over J′, say a₁, a₂, ..., aₙ, aₙ₊₁. Let J = Hτ₁ ∪ Hτ₂ ∪ ... ∪ Hτₙ be the disjoint decomposition of J as a union of right cosets of H.

We consider the system of n linear equations in n + 1 unknowns:

(a₁τ₁)x₁ + (a₂τ₁)x₂ + (a₃τ₁)x₃ + ... + (aₙ₊₁τ₁)xₙ₊₁ = 0
(a₁τ₂)x₁ + (a₂τ₂)x₂ + (a₃τ₂)x₃ + ... + (aₙ₊₁τ₂)xₙ₊₁ = 0
. . . . . . . . . . . . . . . . . . . . .      (b)
(a₁τₙ)x₁ + (a₂τₙ)x₂ + (a₃τₙ)x₃ + ... + (aₙ₊₁τₙ)xₙ₊₁ = 0

where the coefficients aᵢτⱼ are in the field E. Since the number of unknowns is greater than the number of equations, this system (b) has a nontrivial solution in E (Theorem 45.1). From the nontrivial solutions of (b), we choose one for which the number of zeroes among the xᵢ is as small as possible. Let x₁ = b₁, x₂ = b₂, x₃ = b₃, ..., xₙ₊₁ = bₙ₊₁ be such a solution. Assume r of the bⱼ are nonzero (r ≤ n + 1). By the choice of the bⱼ, there is no nontrivial solution x₁ = c₁, x₂ = c₂, x₃ = c₃, ..., xₙ₊₁ = cₙ₊₁ of (b) in which the number of nonzero cⱼ's is less than r.

After renumbering if necessary, we may assume that b₁, ..., bᵣ are distinct from zero and (in case n + 1 > r) bᵣ₊₁ = ... = bₙ₊₁ = 0. Also, we may assume that b₁ = 1, for otherwise we may take the solution b₁/b₁, b₂/b₁, b₃/b₁, ..., bₙ₊₁/b₁ instead of b₁, b₂, b₃, ..., bₙ₊₁. Of course the number r of nonzero elements in both solutions is the same.

Let σ ∈ J. We consider the system:

(a₁τ₁σ)x₁ + (a₂τ₁σ)x₂ + (a₃τ₁σ)x₃ + ... + (aₙ₊₁τ₁σ)xₙ₊₁ = 0
(a₁τ₂σ)x₁ + (a₂τ₂σ)x₂ + (a₃τ₂σ)x₃ + ... + (aₙ₊₁τ₂σ)xₙ₊₁ = 0
. . . . . . . . . . . . . . . . . . . . .      (s)
(a₁τₙσ)x₁ + (a₂τₙσ)x₂ + (a₃τₙσ)x₃ + ... + (aₙ₊₁τₙσ)xₙ₊₁ = 0

We make two remarks concerning (s). First, since x₁ = b₁ = 1, x₂ = b₂, x₃ = b₃, ..., xₙ₊₁ = bₙ₊₁ is a solution of (b) and σ is a homomorphism, it is clear that x₁ = b₁σ = 1, x₂ = b₂σ, x₃ = b₃σ, ..., xₙ₊₁ = bₙ₊₁σ is a solution of (s). Second, the system (s) is identical with (b), aside from the order of the equations. To prove the last assertion, we note that τ₁σ, τ₂σ, τ₃σ, ..., τₙσ are elements of distinct right cosets of H in J, for Hτᵢσ = Hτⱼσ implies (τᵢσ)(τⱼσ)⁻¹ ∈ H, so τᵢτⱼ⁻¹ ∈ H, so Hτᵢ = Hτⱼ, so i = j. Let us write then
Hτ₁σ = Hτᵢ₁, Hτ₂σ = Hτᵢ₂, Hτ₃σ = Hτᵢ₃, ..., Hτₙσ = Hτᵢₙ
so that τ₁σ = η₁τᵢ₁, τ₂σ = η₂τᵢ₂, τ₃σ = η₃τᵢ₃, ..., τₙσ = ηₙτᵢₙ for some η₁, η₂, η₃, ..., ηₙ ∈ H (where {i₁, i₂, i₃, ..., iₙ} = {1, 2, ..., n}). Thus each ηₖ fixes each aₘ in H′, and the iₖ-th equation

(a₁τᵢₖ)x₁ + (a₂τᵢₖ)x₂ + (a₃τᵢₖ)x₃ + ... + (aₙ₊₁τᵢₖ)xₙ₊₁ = 0

in (b) is identical with

(a₁ηₖτᵢₖ)x₁ + (a₂ηₖτᵢₖ)x₂ + (a₃ηₖτᵢₖ)x₃ + ... + (aₙ₊₁ηₖτᵢₖ)xₙ₊₁ = 0

and therefore with the k-th equation

(a₁τₖσ)x₁ + (a₂τₖσ)x₂ + (a₃τₖσ)x₃ + ... + (aₙ₊₁τₖσ)xₙ₊₁ = 0

in (s). This proves that (b) and (s) are identical systems.

Consequently, the solution x₁ = 1, x₂ = b₂σ, x₃ = b₃σ, ..., xₙ₊₁ = bₙ₊₁σ of (s) is also a solution of (b). Now x₁ = 1, x₂ = b₂, x₃ = b₃, ..., xₙ₊₁ = bₙ₊₁ is a solution of (b). Hence the difference of these solutions,
x₁ = 0, x₂ = b₂ − b₂σ, x₃ = b₃ − b₃σ, ..., xₙ₊₁ = bₙ₊₁ − bₙ₊₁σ,
i.e., x₁ = 0, x₂ = b₂ − b₂σ, ..., xᵣ = bᵣ − bᵣσ, xᵣ₊₁ = 0, ..., xₙ₊₁ = 0   (c)
is a solution of (b).

So far, σ was an arbitrary element of J. We now make a judicious choice of σ. One of the τ₁, τ₂, τ₃, ..., τₙ belongs to H, say τ₁ ∈ H, so aₘτ₁ = aₘ because aₘ ∈ H′ for m = 1, 2, ..., n, n + 1. Since x₁ = b₁, x₂ = b₂, x₃ = b₃, ..., xₙ₊₁ = bₙ₊₁ is a solution of (b), we get
a₁b₁ + a₂b₂ + a₃b₃ + ... + aₙ₊₁bₙ₊₁ = 0
from the first equation in (b). Here {a₁, a₂, a₃, ..., aₙ₊₁} is linearly independent over J′ and b₁ = 1 ≠ 0. Thus all of b₁, b₂, b₃, ..., bₙ₊₁ cannot be in J′: one of them, say b₂, is not in J′. So there is a σ ∈ J such that b₂σ ≠ b₂.

We choose σ ∈ J such that b₂σ ≠ b₂. Then the solution (c) of the system (b) is a nontrivial solution in which the number of nonzero elements is less than r, contrary to the meaning of r as the smallest number of nonzero elements in any nontrivial solution of (b). This contradiction shows that |H′:J′| > n is impossible. Hence |H′:J′| ≤ n = |J:H|.

54.16 Theorem: Let E/K be a field extension and G = Aut_K E. Let L, M be intermediate fields of E/K with L ⊆ M and let H, J be subgroups of G with H ≤ J.
(1) If L is closed and |M:L| is finite, then M is closed and |L′:M′| = |M:L|.
(2) If H is closed and |J:H| is finite, then J is closed and |H′:J′| = |J:H|.

Proof: (1) Here M ⊆ M″ by Lemma 54.11(3) and L = L″ by hypothesis, so
|M:L| ≤ |M″:M||M:L| = |M″:L| = |M″:L″| = |(M′)′:(L′)′| ≤ |L′:M′| ≤ |M:L|,
the last two inequalities by Lemma 54.15 and Lemma 54.14, respectively. Hence equality holds throughout; in particular |M″:M| = 1, so M = M″ is closed, and |L′:M′| = |M:L|. The proof of (2) is similar and will be omitted.

We are now in a position to state and prove the major theorem of this
paragraph.

54.17 Theorem (Fundamental theorem of Galois theory): Let E/K be a finite dimensional Galois extension of fields and G = Aut_K E. Then there is a one-to-one correspondence between the set of all intermediate fields of E/K and the set of all subgroups of G, given by L ↦ L′. In this correspondence, the relative dimension of two intermediate fields is equal to the relative index of the corresponding subgroups. In particular, |G| = |Aut_K E| = |E:K|.

Proof: By Theorem 54.13, there is a one-to-one correspondence between the set of all closed intermediate fields of E/K and the set of all closed subgroups of G, given by L ↦ L′. Now K is closed by hypothesis (E/K is a Galois extension) and all intermediate fields are closed by Theorem 54.16(1) since they are finite dimensional over K. Moreover, if M is any intermediate field, then |K′:M′| = |M:K|. In particular, E is closed and |Aut_K E| = |G| = |G:1| = |G:E′| = |K′:E′| = |E:K|. Hence G is finite. Since 1 is closed, it follows from Theorem 54.16(2) that all subgroups of G are closed, because they are finite subgroups of G. Hence the priming mapping is a one-to-one correspondence between the set of all intermediate fields of E/K and the set of all subgroups of G. Theorem 54.16 tells us that the relative dimension |M:L| of two intermediate fields L ⊆ M is equal to the relative index |L′:M′| of the corresponding subgroups of G, and that the relative index |J:H| of two subgroups H ≤ J of G is equal to the relative dimension |H′:J′| of the corresponding intermediate fields.

54.18 Examples: (a) Let ∛2 be the real cube root of 2, let ω be a primitive cube root of unity, and consider the extension ℚ(∛2, ω) over ℚ. The ℚ-automorphisms of ℚ(∛2, ω) are σ₁, σ₂, σ₃, σ₄, σ₅, σ₆, where

σ₁: ∛2 ↦ ∛2,  ω ↦ ω,
σ₂: ∛2 ↦ ∛2,  ω ↦ ω² = −1 − ω,
σ₃: ∛2 ↦ ∛2 ω,  ω ↦ ω,
σ₄: ∛2 ↦ ∛2 ω,  ω ↦ ω² = −1 − ω,
σ₅: ∛2 ↦ ∛2 ω² = ∛2(−1 − ω),  ω ↦ ω,
σ₆: ∛2 ↦ ∛2 ω² = ∛2(−1 − ω),  ω ↦ ω² = −1 − ω.

Any element u of ℚ(∛2, ω) can be written uniquely in the form
u = a + b∛2 + c∛4 + dω + e∛2 ω + f∛4 ω,
where a, b, c, d, e, f are rational numbers. We show that ℚ(∛2, ω) is Galois over ℚ. To this end, we have to show that the fixed field of G is exactly ℚ. Since
(a + b∛2 + c∛4 + dω + e∛2 ω + f∛4 ω)σ₂
= a + b∛2 + c∛4 + (d + e∛2 + f∛4)ω²
= a + b∛2 + c∛4 + (d + e∛2 + f∛4)(−1 − ω)
= (a − d) + (b − e)∛2 + (c − f)∛4 − dω − e∛2 ω − f∛4 ω,
we see that an element u = a + b∛2 + c∛4 + dω + e∛2 ω + f∛4 ω of ℚ(∛2, ω) is fixed by σ₂ if and only if
a = a − d,  d = −d,
b = b − e,  e = −e,
c = c − f,  f = −f.
So an element u of ℚ(∛2, ω) fixed by σ₂ has the form a + b∛2 + c∛4. If u is fixed also by σ₃, then
a + b∛2 + c∛4 = (a + b∛2 + c∛4)σ₃
= a + b∛2 ω + c∛4 ω²
= a + b∛2 ω + c∛4(−1 − ω)
= a − c∛4 + (b∛2 − c∛4)ω
yields b = 0 and c = −c = 0, and so u = a ∈ ℚ. Since an element u in the fixed field of G is necessarily fixed by σ₂ and σ₃, u has to be rational. Thus the fixed field of G is ℚ. This shows that ℚ(∛2, ω) is Galois over ℚ.

The multiplication table of G(ℚ(∛2, ω)/ℚ) can be constructed easily. Since ∛2(σ₂σ₃) = ∛2 σ₃ = ∛2 ω and ω(σ₂σ₃) = ω²σ₃ = ω², we have σ₂σ₃ = σ₄ etc., and the multiplication table of G(ℚ(∛2, ω)/ℚ) is

      σ₁  σ₂  σ₃  σ₄  σ₅  σ₆
σ₁    σ₁  σ₂  σ₃  σ₄  σ₅  σ₆
σ₂    σ₂  σ₁  σ₄  σ₃  σ₆  σ₅
σ₃    σ₃  σ₆  σ₅  σ₂  σ₁  σ₄
σ₄    σ₄  σ₅  σ₆  σ₁  σ₂  σ₃
σ₅    σ₅  σ₄  σ₁  σ₆  σ₃  σ₂
σ₆    σ₆  σ₃  σ₂  σ₅  σ₄  σ₁

So G(ℚ(∛2, ω)/ℚ) is a nonabelian group of order 6 and isomorphic to S₃, as can be easily seen by comparing the table above with the multiplication table of S₃:

         ι      (23)   (123)  (12)   (132)  (13)
ι        ι      (23)   (123)  (12)   (132)  (13)
(23)     (23)   ι      (12)   (123)  (13)   (132)
(123)    (123)  (13)   (132)  (23)   ι      (12)
(12)     (12)   (132)  (13)   ι      (23)   (123)
(132)    (132)  (12)   ι      (13)   (123)  (23)
(13)     (13)   (123)  (23)   (132)  (12)   ι

The isomorphism G(ℚ(∛2, ω)/ℚ) ≅ S₃ can be found in a better way by observing that any automorphism in G(ℚ(∛2, ω)/ℚ) is completely determined by its effect on the roots of x³ − 2. The roots of x³ − 2 are u₁ = ∛2, u₂ = ∛2 ω, u₃ = ∛2 ω². Now σ₂ maps u₁ to u₁, u₂ to u₃ and u₃ to u₂ and can therefore be represented, in a readily understood extension of the notation for permutations, as the map u₁ ↦ u₁, u₂ ↦ u₃, u₃ ↦ u₂, that is, as (u₁)(u₂u₃) = (u₂u₃). Dropping u and retaining only the indices, we see that σ₂ can be thought of as the permutation (23) in S₃. The other σⱼ can be thought of as permutations in S₃ in a similar way, and this gives the isomorphism G(ℚ(∛2, ω)/ℚ) ≅ S₃. In the multiplication tables above, σⱼ and its image in S₃ under this isomorphism occupy corresponding places.
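The identification of the σⱼ with permutations of the roots can be checked mechanically. The following sketch (the tuple encoding and the right-action composition convention are choices made here, matching the text's u(στ) = (uσ)τ) recomputes the σ₂-row of the multiplication table:

```python
# The six Q-automorphisms of Q(∛2, ω) viewed as permutations of the roots
# u1, u2, u3 of x^3 - 2, following the identification in the text
# (sigma_2 = (23), sigma_3 = (123), ...); each tuple lists the images of 1, 2, 3.
sigma = {
    1: (1, 2, 3),  # identity
    2: (1, 3, 2),  # (23)
    3: (2, 3, 1),  # (123)
    4: (2, 1, 3),  # (12)
    5: (3, 1, 2),  # (132)
    6: (3, 2, 1),  # (13)
}

def compose(p, q):
    """Right action as in the text: u(pq) = (up)q, i.e., apply p first."""
    return tuple(q[p[i] - 1] for i in range(3))

names = {perm: j for j, perm in sigma.items()}

# Recompute the sigma_2 row of the multiplication table.
row2 = [names[compose(sigma[2], sigma[j])] for j in range(1, 7)]
print(row2)  # [2, 1, 4, 3, 6, 5] -- matches the table in the text

print(names[compose(sigma[2], sigma[3])])  # 4 : sigma_2 sigma_3 = sigma_4
print(compose(sigma[3], sigma[4]) == compose(sigma[4], sigma[3]))  # False: nonabelian
```

Every entry of the table can be regenerated the same way, row by row.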

The subgroup structure of S₃ is well known. The subgroups of S₃ are depicted in the Hasse diagram below (a line joining A and B, with A above B, means A ≤ B).

1
{ι, (23)}   {ι, (13)}   {ι, (12)}
{ι, (123), (132)} = A₃
S₃

So the subgroups of G(ℚ(∛2, ω)/ℚ) are

1
{σ₁, σ₂}   {σ₁, σ₆}   {σ₁, σ₄}
{σ₁, σ₃, σ₅}
G(ℚ(∛2, ω)/ℚ)

and priming yields

ℚ(∛2, ω)
ℚ(∛2)   ℚ(∛2 ω)   ℚ(∛2 ω²)
ℚ(ω)
ℚ

(b) Let ⁴√2 be the real fourth root of 2 and consider the extension ℚ(⁴√2, i) over ℚ. The ℚ-automorphisms of ℚ(⁴√2, i) are σ₁, σ₂, σ₃, σ₄, σ₅, σ₆, σ₇, σ₈, where

σ₁: ⁴√2 ↦ ⁴√2,  i ↦ i,
σ₂: ⁴√2 ↦ ⁴√2,  i ↦ −i,
σ₃: ⁴√2 ↦ ⁴√2 i,  i ↦ i,
σ₄: ⁴√2 ↦ ⁴√2 i,  i ↦ −i,
σ₅: ⁴√2 ↦ −⁴√2,  i ↦ i,
σ₆: ⁴√2 ↦ −⁴√2,  i ↦ −i,
σ₇: ⁴√2 ↦ −⁴√2 i,  i ↦ i,
σ₈: ⁴√2 ↦ −⁴√2 i,  i ↦ −i.

We put σ₂ = τ and σ₃ = σ. Then o(τ) = 2, o(σ) = 4 and τσ = σ⁻¹τ. Thus G(ℚ(⁴√2, i)/ℚ) is a dihedral group of order 8. Since any automorphism in G(ℚ(⁴√2, i)/ℚ) is completely determined by its effect on the four roots u₁ = ⁴√2, u₂ = ⁴√2 i, u₃ = −⁴√2, u₄ = −⁴√2 i of x⁴ − 2, the group G(ℚ(⁴√2, i)/ℚ) is isomorphic to a subgroup of S₄. We see that σ is the map u₁ ↦ u₂, u₂ ↦ u₃, u₃ ↦ u₄, u₄ ↦ u₁, i.e., (u₁u₂u₃u₄), and that τ is the map u₁ ↦ u₁, u₂ ↦ u₄, u₃ ↦ u₃, u₄ ↦ u₂, i.e., (u₂u₄). So G(ℚ(⁴√2, i)/ℚ) ≅ ⟨(24), (1234)⟩ = {ι, (13), (24), (12)(34), (13)(24), (14)(23), (1234), (1432)} ≤ S₄ by an isomorphism with σ₂ ↦ (24), σ₃ ↦ (1234). The subgroups of G(ℚ(⁴√2, i)/ℚ) are

1
{1, τ}   {1, σ²τ}   {1, σ²}   {1, στ}   {1, σ³τ}
{1, σ², τ, σ²τ}   {1, σ, σ², σ³}   {1, σ², στ, σ³τ}
Aut_ℚ ℚ(⁴√2, i)
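The dihedral structure can be double-checked on the permutation side. The sketch below (generator choice as in the text; right-action composition is a convention assumed here) verifies the order of ⟨(24), (1234)⟩ and the relation τσ = σ⁻¹τ:

```python
# Check that tau = (24) and sigma = (1234) generate a dihedral group of
# order 8 in S4, with tau*sigma = sigma^(-1)*tau (right-action composition).
def compose(p, q):
    """Apply p first, then q; tuples list the images of 1..n."""
    return tuple(q[p[i] - 1] for i in range(len(p)))

sigma = (2, 3, 4, 1)      # (1234) on the roots u1, u2, u3, u4 of x^4 - 2
tau = (1, 4, 3, 2)        # (24)
identity = (1, 2, 3, 4)

# Close {identity} under right multiplication by the generators.
group = {identity}
while True:
    bigger = group | {compose(g, h) for g in group for h in (sigma, tau)}
    if bigger == group:
        break
    group = bigger
print(len(group))         # 8

inv_sigma = (4, 1, 2, 3)  # sigma^(-1) = (1432)
print(compose(tau, sigma) == compose(inv_sigma, tau))  # True
```

Since the group is finite, closing under the generators alone already produces all inverses, so the loop computes the full subgroup ⟨σ, τ⟩.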

Let us find the intermediate field of ℚ(⁴√2, i)/ℚ corresponding to {1, σ²τ}. We write u = ⁴√2 for brevity. We have u(σ²τ) = ((uσ)σ)τ = (ui)στ = ((uσ)(iσ))τ = (ui·i)τ = (−u)τ = −u and i(σ²τ) = (iσ²)τ = iτ = −i. Now let a, b, c, d, e, f, g, h ∈ ℚ and s = a + bu + cu² + du³ + ei + fui + gu²i + hu³i. Then
s(σ²τ) = (a + bu + cu² + du³ + ei + fui + gu²i + hu³i)(σ²τ)
= a + b(−u) + c(−u)² + d(−u)³ + e(−i) + f(−u)(−i) + g(−u)²(−i) + h(−u)³(−i)
= a − bu + cu² − du³ − ei + fui − gu²i + hu³i,
and so s is fixed under σ²τ if and only if
a = a, b = −b, c = c, d = −d,
e = −e, f = f, g = −g, h = −h,
so if and only if b = d = e = g = 0,
so if and only if s = a + cu² + fui + hu³i = a + f(ui) − c(ui)² − h(ui)³,
so if and only if s ∈ ℚ(ui).
Thus the intermediate field of ℚ(⁴√2, i)/ℚ corresponding to {1, σ²τ} is {1, σ²τ}′ = ℚ(ui) = ℚ(⁴√2 i). Similar computations yield that the Galois correspondence is as in the diagram below, where intermediate fields occupy the same relative position as the corresponding subgroups.

ℚ(⁴√2, i)
ℚ(⁴√2)   ℚ(⁴√2 i)   ℚ(√2, i)   ℚ(⁴√2(1+i))   ℚ(⁴√2(1−i))
ℚ(√2)   ℚ(i)   ℚ(√2 i)
ℚ

(c) Let p be a prime number and n ∈ ℕ. We consider the extension 𝔽_{p^n}/𝔽_p. The mapping
φ: 𝔽_{p^n} → 𝔽_{p^n},  a ↦ aᵖ,
is a field homomorphism (Lemma 52.2) and φ fixes every element in 𝔽_p (Theorem 12.7 or Theorem 52.8). Thus φ is 𝔽_p-linear and, since |𝔽_{p^n}:𝔽_p| is finite, φ is onto 𝔽_{p^n} (Theorem 42.22). So φ is an 𝔽_p-automorphism of 𝔽_{p^n}.

We put G = Aut_{𝔽_p} 𝔽_{p^n}. We want to show G = ⟨φ⟩. First we prove o(φ) = n. From aφⁿ = a^(pⁿ) = a for all a ∈ 𝔽_{p^n} (Lemma 52.4(2) or Theorem 52.8), we get φⁿ = 1, so o(φ) divides n. On the other hand, if m is a positive proper divisor of n, then 𝔽_{p^n} has a proper subfield 𝔽_{p^m} with pᵐ elements (Theorem 52.8) and there is a b ∈ 𝔽_{p^n}\𝔽_{p^m} with bφᵐ = b^(pᵐ) ≠ b, so φᵐ ≠ 1. So we conclude o(φ) = n. Since |𝔽_{p^n}:𝔽_p| is finite, we get
n = o(φ) = |⟨φ⟩| ≤ |G| = |G:1| = |(𝔽_p)′:(𝔽_{p^n})′| ≤ |𝔽_{p^n}:𝔽_p| = n
from Lemma 54.14, so |⟨φ⟩| = |G| = n and G = ⟨φ⟩.

It is now easy to show that 𝔽_{p^n} is Galois over 𝔽_p. We have
G′ = ⟨φ⟩′ = {a ∈ 𝔽_{p^n} : aφ = a} = 𝔽_p
by Theorem 52.8, and thus 𝔽_{p^n} is Galois over 𝔽_p.

The Galois correspondence is easy to describe. The subgroups of ⟨φ⟩ are in one-to-one correspondence with the positive divisors of n, and any subgroup H of G is of the form H = ⟨φᵐ⟩ with m a positive divisor of n (Theorem 11.8). The subfield of 𝔽_{p^n} corresponding to H = ⟨φᵐ⟩ is
H′ = ⟨φᵐ⟩′ = {a ∈ 𝔽_{p^n} : aφᵐ = a} = {a ∈ 𝔽_{p^n} : a^(pᵐ) = a} = 𝔽_{p^m},
the unique subfield of 𝔽_{p^n} with pᵐ elements.

𝔽_{p^n}          1
  | n/m            | n/m
𝔽_{p^m}          ⟨φᵐ⟩
  | m              | m
𝔽_p              G = ⟨φ⟩

In all these examples, we first determined the subgroups of the Galois group and then found the intermediate fields corresponding to them. One can of course reverse this, i.e., one can determine the intermediate fields in the first place and then find the subgroups corresponding to them. However, it is in general more difficult to find all intermediate fields of an extension, for it is likely that one overlooks some of them. Also, it is more difficult to avoid duplications. For instance, in Example 54.18(b), it is not immediately clear where ℚ(⁴√2(1+i)) and ℚ(⁴√2(1−i)) are, nor whether ℚ(⁴√2(1+i)) = ℚ(⁴√2(1−i)). It is far easier to list the subgroups than to list the intermediate fields.

It is natural to ask which intermediate fields correspond to the normal


subgroups of the Galois group of an extension. Also, what can be said
about the factor groups of the Galois group? We proceed to answer these
questions. We need a definition.

54.19 Definition: Let E/K be a field extension and let G = Aut_K E be its Galois group. An intermediate field L of this extension is said to be stable relative to K and E, or to be (K,E)-stable, if every K-automorphism σ ∈ Aut_K E of E maps L into L.

In the situation of Definition 54.19, if L is a (K,E)-stable intermediate field, then the inverse σ⁻¹ of any K-automorphism σ of E also maps L into L. Thus the restriction σ|_L to L of any K-automorphism σ of E is a K-automorphism of L. Thus we have a "restriction" mapping

res: Aut_K E → Aut_K L,  σ ↦ σ|_L.

A K-automorphism λ of L is said to be extendible to E if there is a K-automorphism σ of E such that λ = σ|_L. Therefore res is a mapping onto the set of all extendible K-automorphisms of L.

54.20 Theorem: Let E/K be a field extension.
(1) If L is a (K,E)-stable intermediate field, then L′ is a normal subgroup of the Galois group Aut_K E.
(2) If H is a normal subgroup of Aut_K E, then H′ is a (K,E)-stable intermediate field of the extension.

Proof: (1) We are to prove that σ⁻¹τσ ∈ L′ for all τ ∈ L′ and σ ∈ Aut_K E. Thus we must show that a(σ⁻¹τσ) = a for all a ∈ L. Indeed, if a ∈ L, τ ∈ L′ and σ ∈ Aut_K E, then aσ⁻¹ ∈ L since L is (K,E)-stable, so (aσ⁻¹)τ = aσ⁻¹, so a(σ⁻¹τσ) = ((aσ⁻¹)τ)σ = (aσ⁻¹)σ = a. Hence L′ ⊴ Aut_K E.

(2) We are to prove that aσ ∈ H′ for all a ∈ H′ and σ ∈ Aut_K E. Thus we must show that (aσ)τ = aσ for all τ ∈ H. Indeed, if a ∈ H′, τ ∈ H and σ ∈ Aut_K E, then στσ⁻¹ ∈ H since H ⊴ Aut_K E, so a(στσ⁻¹) = a, so a(στ) = aσ, so (aσ)τ = aσ. Hence H′ is (K,E)-stable.

54.21 Theorem: Let E/K be a Galois extension and L an intermediate field. If L is (K,E)-stable, then L is Galois over K.

Proof: For any a ∈ L\K, we must find a λ ∈ Aut_K L such that aλ ≠ a. Since E is Galois over K, there is a σ ∈ Aut_K E such that aσ ≠ a. Then σ|_L ∈ Aut_K L by stability of L relative to K and E. Thus σ|_L can be taken as λ.

54.22 Theorem: Let E/K be a Galois extension and f(x) ∈ K[x] be irreducible in K[x]. If f(x) has a root in E, then f(x) splits in E and the roots of f(x) are all simple.

Proof: Let a₁ be a root of f(x) in E. We put deg f(x) = n. We want to show that f(x) = c(x − a₁)(x − a₂)...(x − aₙ) for some elements c, a₁, a₂, ..., aₙ in E. For this purpose, we put g(x) = (x − a₁)(x − a₂)...(x − aₘ) ∈ E[x], where a₁, a₂, ..., aₘ are all the distinct roots of f(x) in E. We know m ≤ n from Theorem 35.7.

Any K-automorphism of E maps a root of f(x) to a root of f(x) (Lemma 54.5). Thus the coefficients of g(x), which are symmetric in the roots a₁, a₂, ..., aₘ of g(x), are fixed by any K-automorphism of E. This shows that the coefficients of g(x) lie in the fixed field G′ = K of G = Aut_K E, because E is Galois over K. Hence g(x) ∈ K[x]. Then f(x) and g(x) are two polynomials in K[x] with a common root a₁ and f(x) is irreducible over K. Theorem 35.18(1),(3) then gives f(x) | g(x) and consequently n = deg f(x) ≤ deg g(x) = m. We have m ≤ n also, thus n = m. From f(x) | g(x) and n = m we then get that f(x) and g(x) are associates. So f(x) = c(x − a₁)(x − a₂)...(x − aₙ) for some c ∈ K and the roots a₁, a₂, ..., aₙ ∈ E of f(x) are all distinct, i.e., all roots of f(x) are simple.

The next theorem is a kind of converse to Theorem 54.21. The result is
not necessarily true without the hypothesis that L is algebraic (cf. Ex. 8).

54.23 Theorem: Let E/K be a field extension and L an intermediate field. If L is algebraic and Galois over K, then L is (K,E)-stable.

Proof: We want to show that aσ ∈ L for any a ∈ L and any σ ∈ Aut_K E. If a ∈ L, then a is algebraic over K since L is algebraic over K. Let f(x) be the minimal polynomial of a over K. Then f(x) is a product of n distinct polynomials of degree one in L[x] because L is Galois over K (Theorem 54.22). Thus all roots of f(x) are in L. Now if σ ∈ Aut_K E, then aσ is a root of f(x), hence aσ ∈ L, as was to be proved.

Let E/K be a field extension and let L be a (K,E)-stable intermediate field of E/K. Let us consider the restriction mapping

res: Aut_K E → Aut_K L,  σ ↦ σ|_L.

Since (στ)|_L = σ|_L τ|_L for any two K-automorphisms σ, τ of E, we see that res is a homomorphism. Therefore (Aut_K E)/Ker res ≅ Im res. Now Im res is the set of all K-automorphisms of L that are extendible to E (hence the set of all K-automorphisms of L that are extendible to E is a subgroup of Aut_K L) and Ker res = {σ ∈ Aut_K E : σ|_L = ι_L} = {σ ∈ Aut_K E : aσ = a for all a ∈ L} = L′ = Aut_L E. Hence (Aut_K E)/(Aut_L E) is isomorphic to the group of all K-automorphisms of L that are extendible to E. We proved the

54.24 Theorem: Let E/K be a field extension and L an intermediate field. If L is (K,E)-stable, then (L′ = Aut_L E is normal in Aut_K E and) the quotient group G(E/K)/G(E/L) = (Aut_K E)/(Aut_L E) is isomorphic to the subgroup of Aut_K L consisting exactly of the K-automorphisms of L that are extendible to E.

We can now supplement the fundamental theorem by describing the


situation with respect to an intermediate field.

54.25 Theorem: Let E/K be a finite dimensional Galois extension of fields and G = Aut_K E. Let L be an intermediate field of E/K.
(1) E is Galois over L.
(2) L is Galois over K if and only if L′ = Aut_L E is normal in G = Aut_K E. In this case, G/L′ = (Aut_K E)/(Aut_L E) is isomorphic to the Galois group Aut_K L of L over K. Thus G(E/K)/G(E/L) ≅ G(L/K).

Proof: Here the hypotheses of the fundamental theorem are satisfied. The fundamental theorem states that any intermediate field of E/K and any subgroup of G is closed.

(1) In order to show that E is Galois over L, we must prove that L = L″, that is, that L is closed. This follows from the fundamental theorem.

(2) E/K is a finite dimensional extension by hypothesis and so L/K is also a finite dimensional extension. Thus L is algebraic over K (Theorem 50.10). If L is Galois over K, then L is (K,E)-stable by Theorem 54.23 and so L′ is normal in Aut_K E by Theorem 54.20(1). Conversely, if L′ is normal in Aut_K E, then L″ is a (K,E)-stable intermediate field by Theorem 54.20(2). Here L = L″ because all intermediate fields are closed. Thus L is (K,E)-stable. Theorem 54.21 tells us then that L is Galois over K. So L is Galois over K if and only if L′ is normal in G = Aut_K E.

Suppose now L is Galois over K and L′ ⊴ G = Aut_K E. Then |Aut_K L| = |L:K| by the fundamental theorem (with L in place of E). Theorem 54.24 states that G/L′ = (Aut_K E)/(Aut_L E) is isomorphic to a subgroup of Aut_K L. Using L = L″ (i.e., L is closed) and G′ = K (i.e., E is Galois over K), we see |G/L′| = |G:L′| = |L″:G′| = |L:K| = |Aut_K L| by the fundamental theorem. Thus G/L′, which is isomorphic to a subgroup of Aut_K L, has the same order as Aut_K L. Since |Aut_K L| = |L:K| is finite, this implies that G/L′ is actually isomorphic to Aut_K L itself, as was to be shown.

We end this paragraph with an important illustration of Theorem 54.25.

54.26 Theorem: Let 𝔽_q be a field of q elements and E a finite dimensional extension of 𝔽_q. Then E is Galois over 𝔽_q and Aut_{𝔽_q} E is cyclic, generated by the automorphism π, where π: a ↦ a^q for all a ∈ E.

Proof: Let |E:𝔽_q| = r and char 𝔽_q = p, so that 𝔽_p is the prime subfield of 𝔽_q (and of E). We have q = pᵐ, where m = |𝔽_q:𝔽_p|. We consider the extension E/𝔽_p. Since E is an r-dimensional vector space over 𝔽_q and 𝔽_q is an m-dimensional vector space over 𝔽_p, Theorem 48.13 says E is an rm-dimensional vector space over 𝔽_p, and so E = 𝔽_{p^rm}. Thus E is a finite field and E is Galois over 𝔽_p (Example 54.18(c)). Then E is Galois over any intermediate field of E/𝔽_p (Theorem 54.25(1)); in particular, E is Galois over 𝔽_q. Furthermore, we know from Example 54.18(c) that Aut_{𝔽_p} E = ⟨φ⟩, where φ is the field automorphism a ↦ aᵖ for all a ∈ E, and that the group (𝔽_q)′ corresponding to the intermediate field 𝔽_q with pᵐ elements is ⟨φᵐ⟩. Thus Aut_{𝔽_q} E = (𝔽_q)′ = ⟨φᵐ⟩ = ⟨π⟩, where π = φᵐ is the mapping a ↦ a^(pᵐ) = a^q for all a ∈ E.

Exercises

1. Find the Galois group Aut_K E and all its subgroups and describe the Galois correspondence between the subgroups of Aut_K E and the intermediate fields of E/K when

(a) E = ℚ(√2, √3) and K = ℚ;
(b) E = ℚ(∛2, ∛5) and K = ℚ, K = ℚ(∛2);
(c) E = ℚ(∛2, ⁴√3, i) and K = ℚ(i), K = ℚ(i, ⁴√3);
(d) E = ℚ(∛2, i) and K = ℚ(i);
(e) E = ℚ(√2, ∛5) and K = ℚ, ℚ(√2).

2. Let E/K be a field extension. Prove that if L is a (K,E)-stable intermediate field, so is L″, and that if H is a normal subgroup of Aut_K E, so is H″.

3. Let E/K be a field extension and G = Aut_K E. Let L, M be intermediate fields of E/K and let H, J be subgroups of G. Prove that ⟨H ∪ J⟩′ = H′ ∩ J′ and (LM)′ = L′ ∩ M′.

Show also that if, in addition, L is finite dimensional and Galois over K, then LM is finite dimensional and Galois over M and Aut_{L∩M} L ≅ Aut_M LM.

4. Let K be a field and x an indeterminate over K. Show that, if L is an intermediate field of K(x)/K and L ≠ K, then |K(x):L| is finite.

5. Prove that K(x) is Galois over K if and only if K is infinite.

6. Let K be an infinite field. Prove that a proper subgroup of Aut_K K(x) is closed if and only if it is a finite subgroup of Aut_K K(x).

7. Consider the extension ℚ(x)/ℚ. Prove that the intermediate field ℚ(x²) is closed and the intermediate field ℚ(x³) is not closed.

8. Let K be an infinite field and x, y two distinct indeterminates over K. Show that the intermediate field K(x) of the extension K(x,y)/K is Galois over K but K(x) is not stable relative to K and K(x,y).

§55
Separable Extensions

In §54, we established the foundations of Galois theory, but we have no


handy criterion for determining whether a given field extension is Galois
or not. Even in the quite simple cases such as in Example 54.18, we had
to study the effects of automorphisms on the elements in the extension
field, and this involved much calculation. The extension fields in
Example 54.18 were seen to be splitting fields of certain polynomials
over the base field. In this paragraph, we will learn that a finite
dimensional extension is Galois if and only if the extension field is a
splitting field of a polynomial whose irreducible factors have no multiple
roots. We give a name to irreducible polynomials of this kind.

55.1 Definition: Let K be a field and f(x) ∈ K[x]. If f(x) is irreducible over K and has no multiple roots (in any splitting field of f(x) over K), then f(x) is said to be separable over K.

Thus all the deg f(x) roots of a polynomial f(x) separable over K are
distinct and f(x) splits into distinct linear factors in any splitting field of
f(x) over K.

The existence of multiple roots can be decided by means of the derivative. If K is a field, f(x) an irreducible polynomial in K[x] and E a splitting field of f(x) over K, then Theorem 35.18(5) and Theorem 35.18(6) show that f(x) is separable over K if and only if f′(x) ≠ 0.

How can an irreducible polynomial f(x) have a zero derivative? Now f(x) is not 0 or a unit because of irreducibility, so deg f(x) =: m ≥ 1. Let f(x) = ∑_{i=0}^{m} aᵢxⁱ, with aₘ ≠ 0. Then f′(x) = ∑_{i=1}^{m} iaᵢxⁱ⁻¹ = 0 if and only if iaᵢ = 0 for all i = 1, 2, ..., m. In particular, (m1)aₘ = maₘ = 0. Since a field has no zero divisors and aₘ ≠ 0, this forces m1 = 0. This is impossible in case char K = 0 and is equivalent to p | m in case char K = p ≠ 0. Likewise, if aᵢ ≠ 0, the condition iaᵢ = 0 is equivalent to p | i in case char K = p. So for the terms aᵢxⁱ with aᵢ ≠ 0, we have i = pj for some j and we may write f(x) = ∑_{j=0}^{[m/p]} a_{pj}x^{pj}. Putting [m/p] = n, a_{pj} = bⱼ and g(x) = ∑_{j=0}^{n} bⱼxʲ, we obtain f(x) = g(xᵖ). Thus f(x) is actually a polynomial in xᵖ. Conversely, if f(x) = g(xᵖ), then f′(x) = g′(xᵖ)·pxᵖ⁻¹ = g′(xᵖ)·0 = 0 by Lemma 35.16. We summarize:

55.2 Lemma: Let K be a field. If char K = 0, then any polynomial irreducible over K is separable over K. If char K = p ≠ 0 and f(x) ∈ K[x] is irreducible over K, then f(x) is separable over K if and only if f(x) is not a polynomial in xᵖ, i.e., f(x) is not separable over K if and only if f(x) = g(xᵖ) for some g(x) ∈ K[x].
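The criterion of Lemma 55.2 can be tried out on coefficient lists. The sketch below is an illustration, not from the text; polynomials are stored low-degree-first, which is a representation choice made here:

```python
# Formal derivative over F_p on coefficient lists (low degree first).
def derivative_mod_p(coeffs, p):
    """Derivative of sum(coeffs[i] * x^i) with coefficients reduced mod p."""
    return [(i * coeffs[i]) % p for i in range(1, len(coeffs))]

p = 3
f = [1, 0, 0, 1, 0, 0, 1]      # f(x) = x^6 + x^3 + 1 = g(x^3) with g(y) = y^2 + y + 1
print(derivative_mod_p(f, p))  # [0, 0, 0, 0, 0, 0] : f'(x) = 0 in F_3[x]

h = [1, 0, 1]                  # h(x) = x^2 + 1, not a polynomial in x^3
print(derivative_mod_p(h, p))  # [0, 2] : h'(x) = 2x != 0
```

The polynomial in xᵖ has zero derivative because every exponent is a multiple of p, exactly as in the computation above the lemma.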

In terms of separable polynomials we now define separable elements


and separable field extensions.

55.3 Definition: Let E/K be a field extension and a ∈ E. If a is algebraic over K and the minimal polynomial of a over K is separable over K, then a is said to be separable over K.

Thus any element a of K is separable over K since the minimal polynomial of a over K is x − a ∈ K[x] and x − a is separable over K.

55.4 Definition: Let E/K be a field extension. If E is algebraic over K


and if every element of E is separable over K, then E is said to be
separable over K or a separable extension of K and E/K is called a
separable extension.

The polynomial x² + 1 ∈ ℚ[x] is separable over ℚ, because it is irreducible over ℚ and char ℚ = 0. On the other hand, x² + 1 ∈ 𝔽₂[x] is not separable over 𝔽₂ because x² + 1 = (x + 1)² is not even irreducible over 𝔽₂.

If E/K is an extension of fields of characteristic 0, then any element of E


that is algebraic over K is separable over K. Thus any algebraic extension
of a field of characteristic 0 is a separable extension of that field.

We compare separability over a field with separability over an intermediate field.

55.5 Lemma: Let E/K be a field extension and let L be an intermediate field of E/K. Let a ∈ E be algebraic over K. If a is separable over K, then a is separable over L.

Proof: Lemma 50.5 shows that a is algebraic over L. Let f(x) be the
minimal polynomial of a over K and g(x) the minimal polynomial of a
over L. By Lemma 50.5, g(x) is a divisor of f(x). Thus any root of g(x) is a
root of f(x). Since a is separable over K, the roots of f(x) are all simple,
hence, all the more so, the roots of g(x) are all simple and a is separable
over L.

55.6 Lemma: Let E/K be a field extension and let L be an intermediate field of E/K. If E is separable over K, then E is separable over L and L is separable over K.

Proof: Assume that E is separable over K. We are to show that (1) E is


algebraic over L and L is algebraic over K and (2) any element of E is
separable over L and any element of L is separable over K. Since E is
separable over K, we deduce E is algebraic over L (Lemma 50.5) and any
element of E, being (algebraic and) separable over K, is also separable
over L (Lemma 55.5). Thus E/L is a separable extension. Moreover, all
elements of E are separable over K, so, in particular, all elements in L are
separable over K and L/K is a separable extension.

The converse of Lemma 55.6 is also true and will be proved later in this paragraph (Theorem 55.19). Our next goal is to characterize Galois extensions as splitting fields of separable polynomials.

55.7 Theorem: Let E/K be a finite dimensional field extension. Then


the following statements are equivalent.
(1) E is Galois over K.
(2) E is a separable extension of K and the splitting field over K of a
polynomial in K[x].
(3) E is the splitting field of a polynomial in K[x] whose irreducible
factors are separable over K.

Proof: (1) ⇒ (2) We prove E/K is a separable extension. Since E/K is a


finite dimensional extension, E is algebraic over K. We have also to show
that the minimal polynomial over K of any element u in E is separable
over K. This follows immediately from Theorem 54.22. Hence E is a
separable extension of K.

We must now show that there is a polynomial g(x) in K[x] such that E is a splitting field of g(x) over K. Let {a1,a2, . . . ,am} be a K-basis of E and let fi(x) ∈ K[x] be the minimal polynomial of ai over K (i = 1,2, . . . ,m). We put g(x) = f1(x)f2(x). . . fm(x) ∈ K[x]. From Theorem 54.22 again, we learn that
each fi(x), hence also g(x), splits in E. Moreover, g(x) cannot split in any
proper subfield L of E containing K for if L is an intermediate field of E/K
and g(x) splits in L, then L contains all roots of g(x), hence L contains
a1,a2, . . . ,am and we have
E = sK(a1,a2, . . . ,am) ⊆ K(a1,a2, . . . ,am) ⊆ L,
so E = L. Thus E is indeed a splitting field of g(x) over K.

(2) ⇒ (3) Assume now E is separable over K and E is a splitting field over K of a polynomial g(x) in K[x]. We are to prove that the irreducible factors of g(x) in K[x] are separable over K. Let g(x) = f1(x)f2(x). . . fm(x) be the decomposition of g(x) into irreducible factors fi(x) in K[x]. Since g(x) splits in E, each fi(x) has a root ai ∈ E. Here ai is separable over K because E is separable over K. Thus the minimal polynomial of ai over K is a separable polynomial over K. But the minimal polynomial of ai over K is cifi(x) with some suitable ci ∈ K, because ai is a root of fi(x) and fi(x) is

irreducible in K[x]. So cifi(x) is separable over K and consequently fi(x) is
also separable over K.

(3) ⇒ (1) Suppose now E is a splitting field of a polynomial g(x) ∈ K[x] whose irreducible factors in K[x] are separable over K. We put

K0 = {a ∈ E : aσ = a for all σ ∈ Aut_K E}.

Clearly K0 ⊇ K. In fact K0 is the fixed field of Aut_K E, hence K0 is an intermediate field of the extension E/K. We prove that E is Galois over K by showing (i) E is Galois over K0; (ii) Aut_K0 E = Aut_K E; (iii) |E:K| = |Aut_K E|. These will indeed imply

|E:K0| = |Aut_K0 E| (by the fundamental theorem of Galois theory, since E/K0 is a finite dimensional Galois extension),
|Aut_K0 E| = |Aut_K E| (by (ii)),
|Aut_K E| = |E:K| (by (iii)),
so |E:K0| = |E:K|,
so K0 = K
and E is Galois over K (by (i)).

Since, for any σ ∈ Aut_K E, there holds aσ = a for all a ∈ K0, we see that Aut_K E ⊆ Aut_K0 E.

(i) In order to show that E is Galois over K0, we have to find, for each b ∈ E\K0, an automorphism τ ∈ Aut_K0 E such that bτ ≠ b. If b ∈ E\K0, then, by definition of K0, there is a σ ∈ Aut_K E such that bσ ≠ b. From Aut_K E ⊆ Aut_K0 E, we see σ ∈ Aut_K0 E and bσ ≠ b. Thus E is Galois over K0.

(ii) E/K is a finite dimensional extension, hence E/K0 is a finite dimensional extension and E/K0 is Galois. Therefore, by the fundamental theorem of Galois theory, the subgroup Aut_K E of Aut_K0 E is a closed subgroup of Aut_K0 E. Hence Aut_K0 E = K0´ = ((Aut_K E)´)´ = (Aut_K E)´´ = Aut_K E.

(iii) We prove |E:K| = |Aut_K E| by induction on n = |E:K|, the hypothesis being that E is a splitting field over K of a polynomial in K[x] whose irreducible factors (in K[x]) are separable over K.

If n = 1, then E = K, so Aut_K E = Aut_K K = {ι_K} and |E:K| = 1 = |{ι_K}| = |Aut_K E|.

Suppose now n ≥ 2 and suppose that |E1:K1| = |Aut_K1 E1| whenever E1/K1 is a finite dimensional extension with 1 ≤ |E1:K1| < n such that E1 is a splitting field of a polynomial in K1[x] whose irreducible factors (in K1[x]) are separable over K1.

Let g(x) ∈ K[x] be the polynomial of which E is a splitting field over K and let g(x) = f1(x)f2(x). . . fm(x) be the decomposition of g(x) into irreducible polynomials fi(x) in K[x]. The polynomials fi(x) cannot all be of first degree, for then the roots of the fi(x) would be in K and, as E is a splitting field of g(x) over K, the field E would coincide with K, against the hypothesis |E:K| = n ≥ 2. Thus at least one of the fi(x) has degree > 1. Let us assume deg f1(x) = r > 1 and let a ∈ E be a root of f1(x). We put L = K(a). Then |L:K| = r and |E:L| = n/r < n.

Now E is a splitting field of g(x) ∈ L[x] over L (Example 53.5(e)) and the irreducible factors (in L[x]) of g(x), being divisors of the fi(x), have no multiple roots and are therefore separable over L. Since |E:L| = n/r < n, we get |E:L| = |Aut_L E| = |L´| by induction.

In order to prove |E:K| = |Aut_K E|, i.e., in order to prove |E:L||L:K| = |Aut_K E : L´||L´|, it will thus be sufficient to show that r = |L:K| = |Aut_K E : L´|.

We show |Aut_K E : L´| = r by defining a one-to-one mapping A from the set of right cosets of L´ in Aut_K E onto the set of distinct roots of f1(x) in E. Let {a = a1,a2, . . . ,ar} be the distinct roots of f1(x) in E. We put

A: L´σ ↦ aσ  (from the right cosets of L´ in Aut_K E into {a1,a2, . . . ,ar})

(σ ∈ Aut_K E; we know aσ ∈ E is a root of f1(x) from Lemma 54.5). This mapping A is well defined, for if L´σ = L´τ, then

στ⁻¹ ∈ L´,
στ⁻¹ fixes each element of L = K(a),
στ⁻¹ fixes a,
a(στ⁻¹) = a,
(aσ)τ⁻¹ = a,
aσ = aτ,
(L´σ)A = (L´τ)A,

so A is well defined and, reading the lines backwards, we see that A is one-to-one as well. It remains to show that A is onto. Indeed, if ai is any root of f1(x) in E, then there is a field homomorphism σi: K(a) → K(ai) mapping a to ai and fixing each element of K (Theorem 53.1) and σi can be extended to a K-automorphism τi: E → E (Theorem 53.7). Then A sends the coset L´τi to aτi = aσi = ai. Hence A is onto. This gives |Aut_K E : L´| = |{a1,a2, . . . ,ar}| = r. The proof is complete.

Thus for finite dimensional extensions, being Galois is equivalent to


separability plus being a splitting field.

If E/K is a field extension and E is a splitting field of f(x) ∈ K[x] over K,


then all roots of the polynomial f(x) are in E. We show more generally
that, if there is a root in E of a polynomial over K, then all roots of that
polynomial are in E. This gives a characterization of splitting fields
without referring to any particular polynomial.

55.8 Theorem: Let E/K be a finite dimensional field extension. The


following statements are equivalent.
(1) There is a polynomial f(x) ∈ K[x] such that E is a splitting field of f(x)
over K.
(2) If g(x) is any irreducible polynomial in K[x], and if g(x) has a root in
E, then g(x) splits in E.

Proof: (1) ⇒ (2) Assume that g(x) ∈ K[x] is irreducible over K and that g(x) has a root u ∈ E. We want to show that all irreducible factors of g(x) in E[x] have degree one. Suppose, on the contrary, that h(x) ∈ E[x] is an irreducible (over E) factor of g(x) with deg h(x) = n > 1. We adjoin a root t of h(x) to E and thereby construct the field E(t).

Now u and t are roots of the irreducible polynomial g(x) in K[x], so there is a K-isomorphism σ: K(u) → K(t) (Theorem 53.2). Since E is a splitting field of f(x) over K(u) and E(t) is a splitting field of f(x) over K(t) (Example 53.5(e)), the K-isomorphism σ can be extended to a K-isomorphism τ: E → E(t) (Theorem 53.7). But then |E:K| = |E(t):K| = |E(t):E||E:K| = n|E:K| > |E:K|, a contradiction. Thus all irreducible factors of g(x) in E[x] have degree one and g(x) splits in E.

(2) ⇒ (1) Suppose now that any irreducible polynomial in K[x] splits in E whenever it has a root in E. Let {a1,a2, . . . ,am} be a K-basis of E and let fi(x) ∈ K[x] be the minimal polynomial of ai over K. We put f(x) = f1(x)f2(x). . . fm(x). We claim E is a splitting field of f(x) over K.

Each fi(x) has a root ai in E, so each fi(x) splits in E by hypothesis, so f(x) splits in E. Moreover, f(x) cannot split in a proper subfield of E containing K, for if L is an intermediate field of E/K and f(x) splits in L, then all roots of f(x) will be in L, in particular each ai will be in L, thus E = sK(a1,a2, . . . ,am) ⊆ K(a1,a2, . . . ,am) ⊆ L and L = E. Hence E is a splitting field of f(x) over K.

Theorem 55.8 leads us to

55.9 Definition: Let E/K be a field extension. If E is algebraic over K


and if every irreducible polynomial in K[x] that has a root in E in fact
splits in E, then E is said to be normal over K, and E/K is called a normal
extension.

With this terminology, Theorem 55.8 reads as follows.

55.8 Theorem: A finite dimensional extension E/K is a normal exten-


sion if and only if E is a splitting field over K of a polynomial in K[x].

55.10 Theorem: Let E/K be a finite dimensional field extension. E is


Galois over K if and only if E is normal and separable over K.

Proof: This is immediate from Theorem 55.7(2) and Theorem 55.8.

55.11 Theorem: Let E/K be a finite dimensional field extension. There


is an extension field N of E such that
(i) N is normal over K;
(ii) no proper subfield of N containing E is normal over K;
(iii) |N:K| is finite;
(iv) N is Galois over K if and only if E is separable over K.
Moreover, if N´ is another extension field of E with the same properties,
then N and N´ are E-isomorphic.

Proof: Let {a1,a2, . . . ,am} be a K-basis of E and let fi(x) ∈ K[x] be the minimal polynomial of ai over K. We put f(x) = f1(x)f2(x). . . fm(x) ∈ K[x]. Let N be a splitting field of f(x) over E, with |N:E| finite (Theorem 53.6). We claim N has the properties stated above. Since |N:E| and |E:K| are both finite, |N:K| is finite. This proves (iii).

To establish (i), we show that N is a splitting field of f(x) over K (Theorem 55.8). Certainly f(x) splits in N, because N is a splitting field of f(x) over E. Now we have to prove that f(x) does not split in any proper subfield of N containing K. If L is an intermediate field of N/K in which f(x) splits, then L contains all roots of f(x), hence {a1,a2, . . . ,am} ⊆ L, hence E = sK(a1,a2, . . . ,am) ⊆ K(a1,a2, . . . ,am) ⊆ L ⊆ N; so L, in which f(x) splits, is an intermediate field of N/E; so L = N since N is a splitting field of f(x) over E. Thus N is indeed a splitting field of f(x) over K.

Now (ii). If L is a proper subfield of N containing E, then L cannot be normal over K. Otherwise L, containing a root ai of fi(x), would in fact contain all roots of fi(x) by normality, hence L would contain all the roots of f(x), thus L would contain E and all roots of f(x). Then L would contain H, where H is the subfield of N generated by the roots of f(x) over E. But H is the unique splitting field of f(x) which is an intermediate field of N/E (Example 53.5(d)), so N = H ⊆ L and this forces L = N, contrary to the assumption that L is a proper subfield of N. This establishes (ii).

(iv) If N is Galois over K, then N is separable over K and the intermediate


field E of N/K is also separable over K (Lemma 55.6). Conversely, if E is separable over K, then the ai are separable over K, so the fi(x) are separable over K and N is a splitting field over K of a polynomial f(x) whose irreducible divisors are separable over K. Thus N is Galois over K (Theorem 55.7).

Finally, let N´ be any extension field of E satisfying (i), (ii), (iii). As ai ∈ E ⊆ N´ and


N´ is normal over K, the field N´ contains all roots of the minimal poly-
nomial fi(x) over K, hence N´ contains all roots of f(x), hence N´ contains a
splitting field H´ of f(x) over K. Then H´ is normal over K (Theorem 55.8).
Because of the condition (ii), we get H´ = N´. Hence N´ is a splitting field of
f(x) over K. From Example 53.5(e), we deduce that N´ is also a splitting
field of f(x) over E. Thus both N and N´ are splitting fields of f(x) over E
and therefore N and N´ are E-isomorphic (Theorem 53.8).

55.12 Definition: Let E/K be a finite dimensional field extension. An
extension field N of E as in Theorem 55.11 is called a normal closure of E
over K.

Since a normal closure of E over K is unique to within an E-isomorphism,


we sometimes speak of the normal closure of E over K.
The field ℚ(∛2, ω), where ω is a primitive cube root of unity, is a normal closure of ℚ(∛2) over ℚ. Likewise ℚ(⁴√2, i) is a normal closure of ℚ(⁴√2) over ℚ.
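Why ℚ(∛2) itself is not normal over ℚ can be seen numerically: two of the three complex roots of x^3 − 2 are non-real, and ℚ(∛2) lies inside ℝ. A quick check (a sketch using floating-point approximations):

```python
import cmath

# Roots of x^3 - 2 are 2^(1/3) * w^k for k = 0, 1, 2, where
# w = exp(2*pi*i/3) is a primitive cube root of unity.
w = cmath.exp(2j * cmath.pi / 3)
roots = [2 ** (1 / 3) * w ** k for k in range(3)]

assert all(abs(r ** 3 - 2) < 1e-9 for r in roots)

# Q(cbrt(2)) is a subfield of the reals, so it cannot contain the two
# non-real roots; x^3 - 2 has a root there but does not split there.
non_real = [r for r in roots if abs(r.imag) > 1e-9]
print(len(non_real))  # 2
```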

Our next topic is the so-called primitive element theorem, which states that a finitely generated separable extension is in fact a simple extension. This theorem is due to Abel, but the first complete proof was given by Galois. The elements of a finitely generated separable extension can therefore be expressed in the extremely convenient form ∑_i ai u^i, where u is a primitive element of the extension and the ai are in the base field.

55.13 Theorem: Let E/K be an algebraic separable extension of fields and a, b ∈ E. Then there is an element c in K(a,b) such that K(a,b) = K(c).

Proof: We distinguish two cases according as K is a finite or an infinite


field.

If K is finite, then K(a,b) is finite dimensional over K (Theorem 50.12) and has |K|^|K(a,b):K| elements. Hence K(a,b) is finite and its characteristic is p ≠ 0, thus ℤ_p ⊆ K(a,b) and K(a,b) = ℤ_p(c) for some c in K(a,b) (Theorem 52.19(1); c can be chosen as a generator of the cyclic group K(a,b)*). Then K(a,b) = K(c).

Assume now K is infinite. Let N be a normal closure of E over K (Theorem 55.11). Let f(x) ∈ K[x] be the minimal polynomial of a over K and g(x) ∈ K[x] the minimal polynomial of b over K. Since a, b ∈ E ⊆ N and N is normal over K, all roots of f(x) and g(x) lie in N.

Let a = a1,a2, . . . ,an ∈ N be the roots of f(x) and b = b1,b2, . . . ,bm ∈ N be the roots of g(x). Since E is separable over K, a and b are separable over K, so f(x) and g(x) are separable over K, so ai ≠ aj when i ≠ j (i,j = 1,2, . . . ,n) and bk ≠ bl when k ≠ l (k,l = 1,2, . . . ,m).

There are finitely many elements in N of the form

(bk − bl)/(ai − aj)  (i,j = 1,2, . . . ,n; k,l = 1,2, . . . ,m; i ≠ j).

Since K is assumed to be infinite, there is a u ∈ K which is distinct from all these (bk − bl)/(ai − aj). Hence

aiu + bl ≠ aju + bk unless i = j and k = l. (*)

With this u, we put c = au + b = a1u + b1. We claim K(a,b) = K(c). Certainly K(c) ⊆ K(a,b). In order to prove K(a,b) ⊆ K(c), we must show a, b ∈ K(c). Since b = c − au, the relation a ∈ K(c) implies b ∈ K(c). Hence we need only prove a ∈ K(c). We do this by showing x − a ∈ K(c)[x]. We shall see that x − a is a greatest common divisor of two polynomials in K(c)[x].

Now a = a1 is a root of f(x) and of g(c − ux). These are polynomials in K(c)[x]. Thus x − a is a divisor of the greatest common divisor of f(x) and g(c − ux). On the other hand, any root ai of f(x) distinct from a1 cannot be a root of g(c − ux), because then c − uai would be a root of g(x), hence c − uai would be equal to one of b = b1,b2, . . . ,bm, contrary to (*). Thus a = a1 is the only common root of f(x) and g(c − ux). Thus x − a is a greatest common divisor of the polynomials f(x) and g(c − ux) in K(c)[x] and x − a itself is in K(c)[x]. This gives a ∈ K(c) and completes the proof.
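The gcd computation in this proof can be traced numerically for K = ℚ, a = √2, b = √3 and u = 1 (so c = √2 + √3). The sketch below uses floating-point arithmetic in place of exact arithmetic in K(c):

```python
from math import isclose, sqrt

# f(x) = x^2 - 2, g(x) = x^2 - 3, u = 1, c = a*u + b.
c = sqrt(2) + sqrt(3)

# One Euclidean step: f(x) - g(c - x) = 2cx - (c^2 - 1), a linear
# remainder, so the gcd of f(x) and g(c - x) has the single root
# (c^2 - 1)/(2c) -- which is exactly a, confirming a lies in K(c).
root = (c * c - 1) / (2 * c)
print(isclose(root, sqrt(2)))  # True
```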

We can now prove that every finitely generated algebraic separable


extension is a simple extension.

55.14 Theorem: Let E/K be an algebraic separable extension of fields


and assume E = K(a1,a2, . . . ,am). Then there is an element c in E such that
E = K(c).

Proof: We make induction on m. The claim is true when m = 2 by Theorem 55.13 (with E = K(a,b)). If the assertion is proved for m − 1, then K(a1,a2, . . . ,a_{m−1}) = K(c1) for some c1 and therefore we have K(a1,a2, . . . ,am) = K(a1,a2, . . . ,a_{m−1})(am) = K(c1)(am) = K(c1,am) = K(c) for some c ∈ E.

We give a useful characterization of simple algebraic extensions. This


yields an alternative proof of Theorem 55.14.

55.15 Theorem: Let E/K be a finite dimensional extension of fields. E is a simple extension of K if and only if there are only finitely many intermediate fields of E/K.

Proof: Assume first that E is a simple extension of K, say E = K(c). We


want to show that there are finitely many intermediate fields. We will
show that each intermediate field of E/K is uniquely determined by a
divisor of the minimal polynomial of c over K.

Let f(x) ∈ K[x] be the minimal polynomial of c over K. Let L be an intermediate field of E/K and let g(x) ∈ L[x] be the minimal polynomial of c over L. The field L is generated over K by the coefficients of g(x). To see this, let g(x) = ∑_{i=0}^{m} ai x^i (with am = 1) and M = K(a0,a1, . . . ,am). Since g(x) is in L[x], we have {a0,a1, . . . ,am} ⊆ L and K(a0,a1, . . . ,am) ⊆ L. Thus M ⊆ L and |E:M| ≥ |E:L| = |K(c):L| = |L(c):L| = deg g(x) = m. On the other hand, c is a root of a polynomial g(x) in M[x] of degree m, so the degree of the minimal polynomial of c over M is at most m, so |E:M| = |K(c):M| = |M(c):M| ≤ m (Theorem 50.7). Therefore |E:M| = m = |L(c):L| = |E:L| and consequently |L:K| = |M:K|. Together with M ⊆ L, this gives M = L (Lemma 42.15(2)).

Therefore each intermediate field L of E/K is uniquely determined by the minimal polynomial g(x) of the primitive element c over that intermediate field L. We know g(x) divides f(x) in L[x] (Lemma 50.5). Let N be a normal closure of E over K. Then N contains all roots of f(x) and f(x) splits in N. Of course g(x) divides f(x) in N[x] and, since N[x] is a unique factorization domain, g(x) is a product of some of the linear factors of f(x) in N[x]. Since, in N[x], there are only finitely many monic divisors of f(x), there is only a finite number of possibilities for g(x) and there are only a finite number of intermediate fields L.

Assume conversely that there is only a finite number of intermediate fields of E/K. If K is finite, so is E, and E is a simple extension of its prime subfield and of K (Theorem 52.19(1)). So we may suppose K is infinite. We choose an element c in E such that |K(c):K| is as large as possible. In other words, |K(c):K| ≥ |K(b):K| for any b ∈ E. With this c, we claim E = K(c). Otherwise, there is an e ∈ E\K(c). As k ranges through the infinite set K, we get finitely many intermediate fields K(c + ek). Thus there are k and k´ in K such that k ≠ k´ and K(c + ek) = K(c + ek´). Then c + ek and c + ek´ are both in K(c + ek), then their difference e(k − k´) is in K(c + ek), then e is in K(c + ek), hence ek is also in K(c + ek) and finally c = (c + ek) − ek is in K(c + ek). Thus e, c ∈ K(c + ek). So K(c) ⊆ K(c + ek). Since e ∈ K(c + ek) and e ∉ K(c), we get K(c) ⊊ K(c + ek) and thus |K(c):K| < |K(c + ek):K| (Lemma 42.15(2)), a contradiction. Hence E = K(c).

Theorem 55.14 follows very easily from Theorem 55.15. Suppose E =


K(a1,a2, . . . ,am) is an algebraic separable extension of K. We find a normal
closure N of K over E. Then N is Galois over K and finite dimensional over
K (Theorem 55.11). The Galois group of the extension N/K is thus finite
and it has finitely many subgroups. By the fundamental theorem of
Galois theory, there are finitely many intermediate fields of N/K and so
finitely many intermediate fields of E/K. Theorem 55.15 states that E is
a simple extension of K.

We proceed to prove the converse of Lemma 55.6. We need some preparatory lemmas, which are of intrinsic interest as well.

55.16 Lemma: Let E/K be an extension of fields of characteristic p ≠ 0 and let a ∈ E. Assume a is algebraic over K. Then a is separable over K if and only if K(a) = K(a^p).

Proof: Suppose first that a is separable over K. Then a is also separable over K(a^p) by Lemma 55.5. Let g(x) ∈ K(a^p)[x] be the minimal polynomial of a over K(a^p). Thus all roots of g(x) are simple. Since a is a root of the polynomial x^p − a^p ∈ K(a^p)[x], we have g(x) | x^p − a^p in K(a^p)[x]. Therefore g(x) | x^p − a^p and g(x) | (x − a)^p in E[x], for x^p − a^p = (x − a)^p in characteristic p. So g(x) = (x − a)^m for some m such that 1 ≤ m ≤ p. Since g(x) has no multiple roots, we get m = 1. Then g(x) = x − a ∈ K(a^p)[x] and consequently a ∈ K(a^p). This gives K(a) ⊆ K(a^p) and, since K(a^p) ⊆ K(a) in any case, we obtain K(a) = K(a^p).

Conversely, suppose that K(a) = K(a^p). We want to show that a is separable over K. Let f(x) be the minimal polynomial of a over K. If a is not separable over K, then f(x) has the form f(x) = g(x^p) for some g(x) ∈ K[x]. Here g(x) is irreducible over K, because g(x) is not a unit in K[x] (for f(x), being irreducible over K, is not a unit in K[x]) and any factorization g(x) = r(x)s(x) with deg r(x) > 0 < deg s(x) would give a proper factorization f(x) = r(x^p)s(x^p) with deg r(x^p) > 0 < deg s(x^p), contrary to the irreducibility of f(x) over K. Clearly g(x) is a monic polynomial and, since 0 = f(a) = g(a^p), we see that a^p is a root of g(x). Thus g(x) is the minimal polynomial of a^p over K (Theorem 50.3). Of course deg f(x) = p·deg g(x) and

|K(a):K| = deg f(x) = p(deg g(x)) > deg g(x) = |K(a^p):K|.

Hence K(a^p) is a proper subspace of the K-vector space K(a) (Lemma 42.15(2)), contrary to the hypothesis K(a) = K(a^p). Consequently, K(a) = K(a^p) implies that a is separable over K.

55.17 Lemma: Let E/K be a finite dimensional extension of fields of characteristic p ≠ 0, say |E:K| = n. Then the following are equivalent.
(1) There is a K-basis {u1,u2, . . . ,un} of E such that {u1^p,u2^p, . . . ,un^p} is also a K-basis of E.
(2) For all K-bases {t1,t2, . . . ,tn} of E, {t1^p,t2^p, . . . ,tn^p} is also a K-basis of E.
(3) E is a separable extension of K.

Proof: (1) ⇒ (2) Let {u1,u2, . . . ,un} be such a K-basis of E that {u1^p,u2^p, . . . ,un^p} is also a K-basis of E and let {t1,t2, . . . ,tn} be an arbitrary K-basis of E. In order to show that {t1^p,t2^p, . . . ,tn^p} is a K-basis of E, it suffices to prove that {t1^p,t2^p, . . . ,tn^p} spans E over K (Lemma 42.13(2); the ti^p are mutually distinct since ti^p − tj^p = (ti − tj)^p ≠ 0 for i ≠ j) and thus it suffices to prove that ui^p ∈ sK(t1^p,t2^p, . . . ,tn^p) for all i = 1,2, . . . ,n. But this is obvious: we have

ui ∈ sK(t1,t2, . . . ,tn),
ui = k1t1 + k2t2 + . . . + kntn for some kj ∈ K,
ui^p = k1^p t1^p + k2^p t2^p + . . . + kn^p tn^p with kj^p ∈ K,
ui^p ∈ sK(t1^p,t2^p, . . . ,tn^p).

(2) ⇒ (1) This is trivial.

(2) ⇒ (3) Suppose now that {t1^p,t2^p, . . . ,tn^p} is a K-basis of E whenever {t1,t2, . . . ,tn} is. Every element in E is algebraic over K because E/K is a finite dimensional extension (Theorem 50.10). Thus we are to show that every element b of E is separable over K. We do this by proving K(b) = K(b^p) (Lemma 55.16).

Let b ∈ E. We put r = |K(b):K|. Then r ≤ n and {1,b,b^2, . . . ,b^(r−1)} is a K-basis of K(b). We extend the K-linearly independent subset {1,b,b^2, . . . ,b^(r−1)} of E to a K-basis {1,b,b^2, . . . ,b^(r−1),c_(r+1), . . . ,cn} of E, as is possible by virtue of Theorem 42.14. Then {1,b^p,(b^p)^2, . . . ,(b^p)^(r−1),c_(r+1)^p, . . . ,cn^p} is also a K-basis of E by hypothesis and so {1,b^p,(b^p)^2, . . . ,(b^p)^(r−1)} is a K-linearly independent subset of K(b). Lemma 42.13(1) states that {1,b^p,(b^p)^2, . . . ,(b^p)^(r−1)} spans K(b) over K. So K(b) ⊆ sK(1,b^p,(b^p)^2, . . . ,(b^p)^(r−1)) ⊆ K(b^p). This proves K(b) = K(b^p). Hence b is separable over K.

(3) ⇒ (2) We assume E is separable over K and {t1,t2, . . . ,tn} is a K-basis of E. We want to show that {t1^p,t2^p, . . . ,tn^p} is a K-basis of E. Since ti^p ≠ tj^p for i ≠ j, the set {t1^p,t2^p, . . . ,tn^p} has exactly n = |E:K| elements and, in view of Lemma 42.13, it suffices to prove that {t1^p,t2^p, . . . ,tn^p} spans E over K. So we put L = sK(t1^p,t2^p, . . . ,tn^p) and try to show L = E.

Our first step will be to establish that L is a subring of E. In order to prove this, we must only show that L is closed under multiplication. If a = ∑_{i=1}^{n} ai ti^p and b = ∑_{j=1}^{n} bj tj^p are elements of L (ai, bj ∈ K), then ab = ∑_{i,j=1}^{n} ai bj ti^p tj^p, and L will be closed under multiplication provided ti^p tj^p ∈ L. As {t1,t2, . . . ,tn} is a K-basis of E, there are elements cijk in K with ti tj = ∑_{k=1}^{n} cijk tk and so ti^p tj^p = ∑_{k=1}^{n} cijk^p tk^p ∈ sK(t1^p,t2^p, . . . ,tn^p) = L. Thus L is a subring of E.

Since L contains K and {t1^p,t2^p, . . . ,tn^p}, and L is contained in the ring K[t1^p,t2^p, . . . ,tn^p], we get L = K[t1^p,t2^p, . . . ,tn^p]. Now for each i = 2, . . . ,n, the element ti^p is algebraic over K, so algebraic over K(t1^p, . . . ,t_(i−1)^p), and so K(t1^p, . . . ,t_(i−1)^p)[ti^p] = K(t1^p, . . . ,t_(i−1)^p)(ti^p) (Theorem 50.6); repeated application of Lemma 49.6(2) and Lemma 49.6(3) gives L = K[t1^p,t2^p, . . . ,tn^p] = K(t1^p,t2^p, . . . ,tn^p). Thus L = K(t1^p,t2^p, . . . ,tn^p) and L is in fact a field.

We now prove E = K(t1^p,t2^p, . . . ,tn^p). Let a be an arbitrary element of E. Then a is algebraic over K and over L (Lemma 50.5). Let f(x) ∈ L[x] be the minimal polynomial of a over L. Since a ∈ sK(t1,t2, . . . ,tn) and therefore a^p ∈ sK(t1^p,t2^p, . . . ,tn^p) = L, we see x^p − a^p ∈ L[x] and a is a root of x^p − a^p. Thus f(x) divides x^p − a^p in L[x]. We put x^p − a^p = f(x)^e g(x), where e ≥ 1, g(x) ∈ L[x]\{0} and (f(x),g(x)) ~ 1 in L[x]. Taking derivatives, we obtain

0 = ef(x)^(e−1)f´(x)g(x) + f(x)^e g´(x),
g(x) divides f(x)^e g´(x) in L[x],
g(x) divides g´(x) in L[x],
g´(x) = 0,
0 = ef(x)^(e−1)f´(x)g(x),

and since E is separable over K, here f´(x) ≠ 0, so f(x)^(e−1)f´(x)g(x) ≠ 0 and

e = 0 in L,
p | e in ℤ,
e = pm for some m ∈ ℕ,
p = deg(x^p − a^p) = pm(deg f(x)) + deg g(x),
m = 1, deg f(x) = 1 and deg g(x) = 0,
e = p and g(x) = 1 (comparing leading coefficients),
(x − a)^p = x^p − a^p = f(x)^p,
x − a = f(x) ∈ L[x],

and a ∈ L. This proves E ⊆ L. Hence E = L = sK(t1^p,t2^p, . . . ,tn^p) and thus {t1^p,t2^p, . . . ,tn^p} is a K-basis of E, as was to be proved.
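The step from ui = k1t1 + . . . + kntn to ui^p = k1^p t1^p + . . . + kn^p tn^p, and the identity x^p − a^p = (x − a)^p used above, both rest on the additivity of the p-th power map in characteristic p: every binomial coefficient C(p,k) with 0 < k < p is divisible by p. A numerical spot check:

```python
from math import comb

# Each intermediate binomial coefficient C(p, k), 0 < k < p, is divisible
# by the prime p, which is why (a + b)^p = a^p + b^p in characteristic p.
p = 7
assert all(comb(p, k) % p == 0 for k in range(1, p))

# Spot-check additivity of u -> u^p modulo p.
additive = all(
    pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
    for a in range(25)
    for b in range(25)
)
print(additive)  # True
```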

55.18 Lemma: Let E/K be a field extension and a ∈ E. Then K(a) is a separable extension of K if and only if a is separable over K.

Proof: If K(a) is separable over K, then every element of K(a) is separable over K; in particular a is separable over K. Suppose now a is separable (thus algebraic) over K. We wish to prove that K(a) is separable over K. The case char K = 0 being trivial, we may assume char K = p ≠ 0. Let n = |K(a):K|. Then {1,a,a^2, . . . ,a^(n−1)} is a K-basis of K(a) (Theorem 50.7). Likewise {1,a^p,(a^p)^2, . . . ,(a^p)^(m−1)} is a K-basis of K(a^p), where m = |K(a^p):K|. Since a is separable over K, we have K(a^p) = K(a) (Lemma 55.16) and m = |K(a^p):K| = |K(a):K| = n. Thus {1,a^p,(a^p)^2, . . . ,(a^p)^(n−1)} = {1^p,a^p,(a^2)^p, . . . ,(a^(n−1))^p} is also a K-basis of K(a). Thus K(a) is separable over K by Lemma 55.17.

55.19 Theorem: Let E/K be a finite dimensional field extension and let
L be an intermediate field of E/K. Then E is separable over K if and only
if E is separable over L and L is separable over K.

Proof: If E is separable over K, then E is separable over L and L is separable over K (Lemma 55.6). Conversely, suppose that E is separable over L and L is separable over K. We are to show that (1) E is algebraic over K and that (2) any element in E is separable over K. Since E/L and L/K are separable extensions, they are algebraic extensions and E/K is also algebraic by Theorem 50.16. Now we prove the separability of E over K. The case char K = 0 being trivial, we assume char K = p ≠ 0. As E/K is a finite dimensional extension by hypothesis, |E:L| and |L:K| are finite (Lemma 48.14). Let |E:L| = n and |L:K| = m.

Since E is separable over L, there is an L-basis {a1,a2, . . . ,an} of E such that {a1^p,a2^p, . . . ,an^p} is also an L-basis of E and, since L is separable over K, there is a K-basis {b1,b2, . . . ,bm} of L such that {b1^p,b2^p, . . . ,bm^p} is also a K-basis of L (Lemma 55.17). Then {aibj} is a K-basis of E by the proof of Theorem 48.13, and likewise {ai^p bj^p} is a K-basis of E. Hence {aibj} is a K-basis of E such that {(aibj)^p} is also a K-basis of E. From Lemma 55.17, it follows that E is separable over K.

We close this paragraph with a brief discussion of perfect fields.

55.20 Definition: Let K be a field. If char K = 0, or if char K = p ≠ 0 and for each a ∈ K there is a b ∈ K such that a = b^p, then K is said to be perfect.

Thus in case char K = p ≠ 0, K is a perfect field if and only if the field homomorphism K → K, u ↦ u^p, is onto K. Then for each a ∈ K, there is a unique b ∈ K such that a = b^p, for u ↦ u^p is one-to-one. This unique b will be denoted by ᵖ√a.

For example, every finite field is perfect, for if 𝔽_q is a finite field and char 𝔽_q = p ≠ 0, then the one-to-one homomorphism 𝔽_q → 𝔽_q (u ↦ u^p) is ℤ_p-linear and thus onto 𝔽_q by Theorem 42.22 (or, more simply, because a one-to-one mapping from the finite set 𝔽_q into 𝔽_q must be onto 𝔽_q).
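This can be made concrete for the eight-element field. The sketch below realizes 𝔽_8 as ℤ_2[x]/(x^3 + x + 1), encoding elements as 3-bit integers (our own encoding, not the text's), and checks that the Frobenius map u ↦ u^2 is onto:

```python
MOD = 0b1011  # x^3 + x + 1, irreducible over Z_2

def gf8_mul(a, b):
    """Multiply in F_8 = Z_2[x]/(x^3 + x + 1); bit i encodes the x^i term."""
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i          # carry-less multiplication
    for i in range(4, 2, -1):       # reduce degrees 4 and 3
        if (prod >> i) & 1:
            prod ^= MOD << (i - 3)
    return prod

# The Frobenius map u -> u^2 hits every element: F_8 is perfect.
image = {gf8_mul(a, a) for a in range(8)}
print(sorted(image))  # [0, 1, 2, 3, 4, 5, 6, 7]
```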

55.21 Theorem: Let K be a field. K is perfect if and only if every irreducible polynomial in K[x] is separable over K.

Proof: The assertion is trivial in case char K = 0, so assume that char K = p ≠ 0.

Suppose first that K is perfect. Now, if f(x) ∈ K[x] is not separable over K, then f(x) = g(x^p) for some g(x) ∈ K[x], say g(x) = ∑_{i=0}^{m} ai x^i, and

f(x) = g(x^p) = ∑_{i=0}^{m} ai x^(ip) = ∑_{i=0}^{m} (ᵖ√ai)^p x^(ip) = (∑_{i=0}^{m} ᵖ√ai x^i)^p,

so f(x) cannot be irreducible over K. Thus, if K is a perfect field, then every irreducible polynomial in K[x] is separable over K.

Conversely, suppose that every irreducible polynomial in K[x] is separable over K. We want to show that K is perfect. Let a ∈ K. We must find a b in K with b^p = a. So we consider the polynomial x^p − a ∈ K[x]. We adjoin a root ᵖ√a of x^p − a to K and obtain the field K(ᵖ√a) (possibly K = K(ᵖ√a)). Then, in K(ᵖ√a)[x], we have the factorization x^p − a = (x − ᵖ√a)^p. The minimal polynomial of ᵖ√a over K is thus (x − ᵖ√a)^k for some k ∈ {1,2, . . . ,p}. This minimal polynomial is necessarily irreducible and, by hypothesis, separable over K, and has therefore no multiple roots. This forces k = 1. So the minimal polynomial of ᵖ√a is x − ᵖ√a ∈ K[x], which gives ᵖ√a ∈ K, as was to be proved.

Consequently every algebraic extension of a perfect field K is separable over K. Theorem 55.21 yields the corollary that every algebraically closed field is perfect, since any irreducible polynomial over an algebraically closed field is of first degree and has therefore no multiple roots (it is separable over that field).

Exercises

1. Find a normal closure of ℚ(√3, ⁵√7) over ℚ.

2. If E/K is a field extension and |E:K| = 2, show that E is normal over K.

3. Let E/K be a field extension with |E:K| = 3 and assume that E is not normal over K. Let N be a normal closure of E over K. Show that |N:K| = 6 and that there is a unique intermediate field L of N/K satisfying |L:K| = 2.

4. Let N/K be a field extension and assume that N is normal over K. Let L be an intermediate field of N/K. Prove that L is normal over K if and only if L is (K,N)-stable.

5. Find fields K ⊆ L ⊆ N such that N is normal over L, L is normal over K but N is not normal over K.

6. Find fields K ⊆ L ⊆ N such that |N:K| = 6, N is Galois over K but L is not Galois over K.

7. Find fields K ⊆ L ⊆ N such that |N:K| is finite, N is normal over K but L is not normal over K.

8. Find primitive elements for the extensions ℚ(√2,√3), ℚ(√2,√3,√5), ℚ(∛2,i), ℚ(√2,√5) of ℚ.

9. Find a splitting field K over ℤ_3 of (x^2 + 1)(x^2 + x + 2) ∈ ℤ_3[x] and a primitive element of K.

10. Let p be a prime number and x, y two distinct indeterminates over ℤ_p. Let E = ℤ_p(x,y) and K = ℤ_p(x^p,y^p). Show that E is not a simple extension of K and find infinitely many intermediate fields of E/K.

11. Prove the following generalization of Theorem 55.14. If K is a field,


K(a1,a2, . . . ,am) is an algebraic extension of K and a2, . . . ,am are separable
over K, then K(a1,a2, . . . ,am) is a simple extension of K.

12. Let K be a field and K(a_1, a_2, ..., a_m) a finitely generated extension
of K. Show that K(a_1, a_2, ..., a_m) is separable over K if and only if all
of a_1, a_2, ..., a_m are separable over K.

13. Prove that Theorem 55.19 is valid without the hypothesis that E be
finite dimensional over K. (Hint: Reduce the general case to the finite
dimensional case.)

14. Let L and M be intermediate fields of a field extension E/K. Prove
that, if L is separable over K, then LM is separable over M.

15. Prove that every finite dimensional extension of a perfect field is
perfect.

16. Let E/K be a finite dimensional field extension. If E is perfect, show
that K is also perfect.

§56
Galois Group of a Polynomial

In this paragraph, we give some applications of Galois theory to the
theory of equations. We shall introduce resultants and discriminants,
and then discuss polynomial equations f(x) = 0, where f(x) is of degree
2, 3, 4.

56.1 Lemma: Let K be a field and f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0 be nonzero polynomials in K[x]\K.
Assume that at least one of a_n, b_m is distinct from 0. Then f(x), g(x) have a
nonunit greatest common divisor in K[x] if and only if there are nonzero
polynomials g_1(x), f_1(x) ∈ K[x] such that
f(x)g_1(x) = g(x)f_1(x) and deg f_1(x) < n, deg g_1(x) < m.

Proof: One direction is clear. If f(x) and g(x) have a nonunit greatest
common divisor h(x) in K[x], then f(x) = h(x)f_1(x), g(x) = h(x)g_1(x) with
suitable f_1(x), g_1(x) in K[x] and
deg f_1(x) = deg f(x) − deg h(x) ≤ n − deg h(x) < n
since deg h(x) is greater than zero. Likewise deg g_1(x) < m. We have of
course f(x)g_1(x) = f_1(x)h(x)g_1(x) = f_1(x)g(x).

Conversely, assume f(x)g_1(x) = g(x)f_1(x) for some nonzero polynomials
f_1(x), g_1(x) in K[x] satisfying deg f_1(x) < n and deg g_1(x) < m. We put
h(x) = (f(x), g(x)). We want to prove deg h(x) > 0. Write f(x) = h(x)F(x),
g(x) = h(x)G(x). Then (F(x), G(x)) = 1 and f(x)g_1(x) = g(x)f_1(x) gives
F(x)g_1(x) = G(x)f_1(x). Suppose, without loss of generality, a_n ≠ 0, so that
deg f(x) = n. Now F(x) divides G(x)f_1(x) and, as (F(x), G(x)) = 1, F(x)
divides f_1(x); thus deg F(x) ≤ deg f_1(x) < n = deg f(x) = deg F(x) + deg h(x),
and we get deg h(x) > 0. This completes the proof. □

Let K be a field and
f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0
two polynomials in K[x], where a_n ≠ 0 or b_m ≠ 0, so that deg f(x) = n or
deg g(x) = m. From Lemma 56.1, we know that f(x) and g(x) have a
nonunit greatest common divisor in K[x] if and only if there are elements
c_{m−1}, c_{m−2}, ..., c_1, c_0, d_{n−1}, d_{n−2}, ..., d_1, d_0, where at least one c_i ≠ 0 and at least
one d_j ≠ 0, such that

(a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0)(c_{m−1} x^{m−1} + c_{m−2} x^{m−2} + . . . + c_1 x + c_0)

= (b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0)(d_{n−1} x^{n−1} + d_{n−2} x^{n−2} + . . . + d_1 x + d_0).  (*)

This polynomial equation is equivalent to the system of equations:

a_n c_{m−1} = b_m d_{n−1}
a_n c_{m−2} + a_{n−1} c_{m−1} = b_m d_{n−2} + b_{m−1} d_{n−1}
a_n c_{m−3} + a_{n−1} c_{m−2} + a_{n−2} c_{m−1} = b_m d_{n−3} + b_{m−1} d_{n−2} + b_{m−2} d_{n−1}
. . . . . . . . . . . . . . . . . . . . . . . .
a_1 c_0 + a_0 c_1 = b_1 d_0 + b_0 d_1
a_0 c_0 = b_0 d_0.

This system can be written as

a_n c_{m−1} − b_m d_{n−1} = 0
a_n c_{m−2} + a_{n−1} c_{m−1} − b_m d_{n−2} − b_{m−1} d_{n−1} = 0
a_n c_{m−3} + a_{n−1} c_{m−2} + a_{n−2} c_{m−1} − b_m d_{n−3} − b_{m−1} d_{n−2} − b_{m−2} d_{n−1} = 0
. . . . . . . . . . . . . . . . . . . . . . . .
a_1 c_0 + a_0 c_1 − b_1 d_0 − b_0 d_1 = 0
a_0 c_0 − b_0 d_0 = 0

or as

a_n c_{m−1} − b_m d_{n−1} = 0
a_{n−1} c_{m−1} + a_n c_{m−2} − b_{m−1} d_{n−1} − b_m d_{n−2} = 0
a_{n−2} c_{m−1} + a_{n−1} c_{m−2} + a_n c_{m−3} − b_{m−2} d_{n−1} − b_{m−1} d_{n−2} − b_m d_{n−3} = 0
. . . . . . . . . . . . . . . . . . . . . . . .
a_1 c_{m−1} + a_2 c_{m−2} + a_3 c_{m−3} + . . . . . . = 0
a_0 c_{m−1} + a_1 c_{m−2} + a_2 c_{m−3} + . . . . . . = 0
a_0 c_{m−2} + a_1 c_{m−3} + . . . . . . = 0
. . . . . . . . . . . . . . . . . . . . . . . .
a_0 c_1 + a_1 c_0 − b_0 d_1 − b_1 d_0 = 0
a_0 c_0 − b_0 d_0 = 0

We write this system in matrix form:

          m columns                          n columns
    [ a_n      0        0      ...  0     −b_m      0         0      ...  0   ] [ c_{m−1} ]
    [ a_{n−1}  a_n      0      ...  0     −b_{m−1}  −b_m      0      ...  0   ] [ c_{m−2} ]
    [ a_{n−2}  a_{n−1}  a_n    ...  0     −b_{m−2}  −b_{m−1}  −b_m   ...  0   ] [ c_{m−3} ]  =  0
    [ ....................................................................... ] [   ...   ]
    [ 0        0        0      ...  a_0    0         0         0     ... −b_0 ] [  d_0    ]

(the first m columns carry the coefficients of f(x), the last n columns the
negatives of the coefficients of g(x)).

Let A denote the matrix of this system. Then the polynomials f(x), g(x)
have a nonunit greatest common divisor if and only if the matrix
equation AX = 0 has a solution

X = (c_{m−1}, c_{m−2}, c_{m−3}, ..., c_1, c_0, d_{n−1}, d_{n−2}, d_{n−3}, ..., d_1, d_0)^t

in which at least one c_i ≠ 0 and at least one d_j ≠ 0. From the equation (*)
and the fact that K[x] has no zero divisors, we deduce that, in a solution
X = (c_{m−1}, ..., d_0)^t of AX = 0, there is at least one c_i ≠ 0 if and only if
there is at least one d_j ≠ 0. Thus the polynomials f(x), g(x) have a nonunit
greatest common divisor if and only if the matrix equation AX = 0 has a
nontrivial solution. This is the case if and only if det A = 0 (Theorem
45.3). Since det A = det A^t, we get that f(x), g(x) have a nonunit greatest
common divisor if and only if det A^t = 0. We have thus proved the

56.2 Theorem: Let K be a field and f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0 be polynomials in K[x]\K, where at
least one of a_n, b_m is distinct from 0. Then f(x) and g(x) have a nonunit
greatest common divisor in K[x] if and only if the determinant

    | a_n  a_{n−1}  ...  a_1  a_0  0    0    ...  0   |
    | 0    a_n  a_{n−1}  ...  a_1  a_0  0    ...  0   |   (m rows of a's)
    | .............................................   |
    | 0    ...  0   a_n  a_{n−1}  ...  a_1  a_0       |
    | b_m  b_{m−1}  ...  b_1  b_0  0    0    ...  0   |
    | 0    b_m  b_{m−1}  ...  b_1  b_0  0    ...  0   |   (n rows of b's)
    | .............................................   |
    | 0    ...  0   b_m  b_{m−1}  ...  b_1  b_0       |

is equal to zero.

56.3 Definition: Let K be a field and f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0 polynomials in K[x]. The determinant

    | a_n  a_{n−1}  . . .  a_1  a_0                         |
    |      a_n  a_{n−1}  . . .  a_1  a_0                    |   (m rows of a's)
    |           .........                                   |
    |           a_n  a_{n−1}  . . .  a_1  a_0               |
    | b_m  b_{m−1}  . . .  b_1  b_0                         |
    |      b_m  b_{m−1}  . . .  b_1  b_0                    |   (n rows of b's)
    |           .........                                   |
    |           b_m  b_{m−1}  . . .  b_1  b_0               |

(empty places are to be filled with zeroes) is called the resultant of f(x)
and g(x), and is denoted by R(f,g) or by R(f(x), g(x)).

56.4 Remark: Notice that a_n and b_m can be zero in Definition 56.3.
There is an ambiguity in this definition and notation: the resultant depends
not only on f(x) and g(x), but also on the number of apparent
coefficients, a point neglected in almost every book. For example, let
f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0 and g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0
again, and let b_{m+1} = 0, h(x) = b_{m+1} x^{m+1} + b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0.
Then of course g(x) = h(x), but R(f,h) has one more row and one more
column than R(f,g), and the expansion of R(f,h) along the first column
gives R(f,h) = a_n R(f,g), so R(f,h) ≠ R(f,g) (unless a_n = 1 or R(f,g) = 0).
Thus adding an initial term to g(x) with coefficient 0 changes R(f,g) to
a_n R(f,g). Consequently, if
f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0 and b_m = b_{m−1} = . . . = b_{k+1} = 0, b_k ≠ 0,
G(x) = b_k x^k + b_{k−1} x^{k−1} + . . . + b_1 x + b_0,
then g(x) is obtained from G(x) by adding the m − k initial terms b_m x^m,
b_{m−1} x^{m−1}, . . . , b_{k+1} x^{k+1} with coefficient 0, and so R(f,g) = a_n^{m−k} R(f,G).

Definition 56.3 gives a new formulation of Theorem 56.2.

56.2 Theorem: Let K be a field and f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0 be polynomials in K[x]\K, where at
least one of a_n, b_m is distinct from 0. Then f(x) and g(x) have a nonunit
greatest common divisor in K[x] if and only if R(f,g) = 0.
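The determinant in Definition 56.3 is mechanical to evaluate, so Theorem 56.2 can be checked numerically. The following Python sketch (an illustration, not part of the text; `det` and `resultant` are made-up helper names) builds the coefficient matrix row by row and computes its determinant exactly over the rationals:

```python
from fractions import Fraction

def det(M):
    # Exact determinant over the rationals by Gaussian elimination.
    M = [[Fraction(x) for x in row] for row in M]
    n, sign = len(M), 1
    for i in range(n):
        pivot = next((r for r in range(i, n) if M[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column => determinant 0
        if pivot != i:
            M[i], M[pivot] = M[pivot], M[i]
            sign = -sign                # a row swap flips the sign
        for r in range(i + 1, n):
            factor = M[r][i] / M[i][i]
            for c in range(i, n):
                M[r][c] -= factor * M[i][c]
    d = Fraction(sign)
    for i in range(n):
        d *= M[i][i]                    # product of the diagonal entries
    return d

def resultant(f, g):
    # f, g given as coefficient lists [a_n, ..., a_1, a_0].
    # The matrix of Definition 56.3: m shifted copies of f's coefficients
    # followed by n shifted copies of g's coefficients.
    n, m = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * j + list(g) + [0] * (n - 1 - j) for j in range(n)]
    return det(rows)
```

For instance, f(x) = x² − 1 and g(x) = x − 1 share the factor x − 1, so their resultant is 0, while f(x) = x² + 1 and g(x) = x − 1 have no common factor and resultant 2 ≠ 0.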

We give some product formulas for the resultant of two polynomials.
These formulas make it evident that the resultant is 0 if and only if the
polynomials have a nontrivial common factor.

56.5 Theorem: Let K be a field and u_1, u_2, ..., u_n, y_1, y_2, ..., y_m indetermi-
nates over K. Let a_n, b_m be nonzero elements of K and let x be an indeter-
minate over K distinct from all of u_1, u_2, ..., u_n, y_1, y_2, ..., y_m. Let f(x) and
g(x) be polynomials in K(u_1, u_2, ..., u_n, y_1, y_2, ..., y_m)[x] defined by
f(x) = a_n(x − u_1)(x − u_2) . . . (x − u_n),
g(x) = b_m(x − y_1)(x − y_2) . . . (x − y_m).
Then the following hold.

(1) R(f,g) is in P[a_n, u_1, u_2, ..., u_n, b_m, y_1, y_2, ..., y_m], where P is the prime
subfield of K.

(2) R(f,g) = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j).

(3) R(f,g) = a_n^m ∏_{i=1}^{n} g(u_i).

(4) R(f,g) = (−1)^{mn} b_m^n ∏_{j=1}^{m} f(y_j).

Proof: We put
f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0,
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0,
where a_i, b_j ∈ K(u_1, u_2, ..., u_n, y_1, y_2, ..., y_m). Thus R(f,g) is the determinant
of a matrix whose entries are a_0, a_1, a_2, ..., a_n, b_0, b_1, b_2, ..., b_m and 0. Hence the
entries of the matrix are in P[a_0, a_1, a_2, ..., a_n, b_0, b_1, b_2, ..., b_m] and the deter-
minant R(f,g) itself is also in P[a_0, a_1, a_2, ..., a_n, b_0, b_1, b_2, ..., b_m] (Remark
44.2(2)). Since each a_i/a_n, aside from a sign, is an elementary symmetric
polynomial in u_1, u_2, ..., u_n, and since the coefficients of elementary
symmetric polynomials are in the prime subfield P, we get

a_i/a_n ∈ P[u_1, u_2, ..., u_n] for all i = 0, 1, ..., n − 1.

So each a_i is in P[a_n, u_1, u_2, ..., u_n] ⊆ P[a_n, u_1, u_2, ..., u_n, b_m, y_1, y_2, ..., y_m].
Likewise each b_j is in P[a_n, u_1, u_2, ..., u_n, b_m, y_1, y_2, ..., y_m]. Consequently

R(f,g) ∈ P[a_0, a_1, ..., a_n, b_0, b_1, ..., b_m] ⊆ P[a_n, u_1, ..., u_n, b_m, y_1, ..., y_m].

This proves (1). Now let L = P[a_n, u_1, u_2, ..., u_n, b_m, y_1, y_2, ..., y_m]. We put

S = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j) ∈ L.


We have g(x) = b_m ∏_{j=1}^{m} (x − y_j), so

g(u_i) = b_m ∏_{j=1}^{m} (u_i − y_j),

∏_{i=1}^{n} g(u_i) = b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j),

and thus S = a_n^m ∏_{i=1}^{n} g(u_i).  (i)

In like manner, from f(x) = a_n ∏_{i=1}^{n} (x − u_i) = (−1)^n a_n ∏_{i=1}^{n} (u_i − x), we get

f(y_j) = (−1)^n a_n ∏_{i=1}^{n} (u_i − y_j),

∏_{j=1}^{m} f(y_j) = ∏_{j=1}^{m} ((−1)^n a_n ∏_{i=1}^{n} (u_i − y_j)),

∏_{j=1}^{m} f(y_j) = (−1)^{nm} a_n^m ∏_{j=1}^{m} ∏_{i=1}^{n} (u_i − y_j),

S = (−1)^{nm} b_m^n ∏_{j=1}^{m} f(y_j).  (ii)

Now let f_0(x) be the polynomial obtained by substituting y_j for u_i in f(x).
Thus f_0(x) = a_n(x − u_1) . . . (x − u_{i−1})(x − y_j)(x − u_{i+1}) . . . (x − u_n)
∈ P(a_n, u_1, ..., u_{i−1}, u_{i+1}, ..., u_n, b_m, y_1, y_2, ..., y_m)[x].

Then the polynomials f_0(x) and g(x) in

P(a_n, u_1, ..., u_{i−1}, u_{i+1}, ..., u_n, b_m, y_1, y_2, ..., y_m)[x]

have the common factor x − y_j and therefore R(f_0, g) = 0.

Thus R(f,g) ∈ L, regarded as a polynomial in

P[a_n, u_1, ..., u_{i−1}, u_{i+1}, ..., u_n, b_m, y_1, y_2, ..., y_m][u_i],

has the value R(f_0, g) = 0 when y_j is substituted for u_i. So R(f,g), as a
polynomial in u_i, has the root y_j, and u_i − y_j divides R(f,g) in

P[a_n, u_1, ..., u_{i−1}, u_{i+1}, ..., u_n, b_m, y_1, y_2, ..., y_m][u_i] = L.

This is true for all i = 1, 2, ..., n and for all j = 1, 2, ..., m. Since each u_i − y_j is
irreducible in L, and u_i − y_j is distinct from u_{i′} − y_{j′} whenever (i,j) ≠ (i′,j′),
the polynomials u_i − y_j are pairwise relatively prime. Thus R(f,g) is
divisible, in L, by their product

∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j).

It follows that R(f,g) is divisible by

S = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j)

in M[u_1, u_2, ..., u_n, y_1, y_2, ..., y_m], where we put M = P(a_n, b_m).

Let us write H = R(f,g)/S. Basically, we will argue that R(f,g) and S are
both homogeneous (§35, Ex. 4) of the same degree and conclude that H is
a constant. Comparison of a monomial appearing in these polynomials
will yield that this constant must be equal to 1, whence R(f,g) = S. The
details are rather tedious.


From (i), we see that

S/a_n^m = ∏_{i=1}^{n} g(u_i)
= (b_m u_1^m + . . . )(b_m u_2^m + . . . ) . . . (b_m u_n^m + . . . )
= b_m^n u_1^m u_2^m . . . u_n^m + . . . ∈ P[b_m, y_1, y_2, ..., y_m][u_1, u_2, ..., u_n]

is a symmetric polynomial in u_1, u_2, ..., u_n over P[b_m, y_1, y_2, ..., y_m] and
hence there is a unique polynomial h_1 in n indeterminates over the
integral domain P[b_m, y_1, y_2, ..., y_m] such that

S/a_n^m = h_1(−a_{n−1}/a_n, a_{n−2}/a_n, . . . , (−1)^{n−1} a_1/a_n, (−1)^n a_0/a_n).

Let us recall that h_1 is obtained from S/a_n^m by subtracting symmetric
polynomials of the form

y σ_1^{k_1−k_2} σ_2^{k_2−k_3} . . . σ_{n−1}^{k_{n−1}−k_n} σ_n^{k_n},  y ∈ P[b_m, y_1, y_2, ..., y_m],

where σ_1, ..., σ_n are the elementary symmetric polynomials in u_1, ..., u_n and
y u_1^{k_1} u_2^{k_2} . . . u_{n−1}^{k_{n−1}} u_n^{k_n} are certain monomials appearing in S/a_n^m.
We have m ≥ k_1 by Lemma 38.8(2), since the leading monomial of S/a_n^m
is b_m^n u_1^m u_2^m . . . u_n^m. A symmetric polynomial of the form above gives rise
to a term

y (−a_{n−1}/a_n)^{k_1−k_2} (a_{n−2}/a_n)^{k_2−k_3} . . . ((−1)^{n−1} a_1/a_n)^{k_{n−1}−k_n} ((−1)^n a_0/a_n)^{k_n},

which is (1/a_n)^{k_1} times a polynomial in P[b_m, y_1, y_2, ..., y_m][a_0, a_1, ..., a_{n−1}]. As
m ≥ k_1 for each of the terms in h_1, we see a_n^m h_1 is a polynomial in
P[b_m, y_1, y_2, ..., y_m][a_0, a_1, ..., a_{n−1}, a_n]. Thus

S = (a_n^m)(S/a_n^m) = a_n^m h_1(−a_{n−1}/a_n, a_{n−2}/a_n, . . . , (−1)^{n−1} a_1/a_n, (−1)^n a_0/a_n),

S ∈ P[b_m, y_1, y_2, ..., y_m][a_0, a_1, ..., a_{n−1}, a_n]  (iii)

and S = h(a_0, a_1, ..., a_{n−1}, a_n), where h is a polynomial in n + 1 indetermi-
nates over P[b_m, y_1, y_2, ..., y_m] (Lemma 49.5(1)).

Also R(f,g) ∈ P[b_0, b_1, b_2, ..., b_m][a_0, a_1, a_2, ..., a_n]
⊆ P[b_m, y_1, y_2, ..., y_m][a_0, a_1, ..., a_{n−1}, a_n]

and, together with (iii), we obtain

H = R(f,g)/S ∈ P[b_m, y_1, y_2, ..., y_m](a_0, a_1, ..., a_{n−1}, a_n).

Thus H ∈ M[y_1, y_2, ..., y_m][u_1, u_2, ..., u_n] is symmetric in u_1, u_2, ..., u_n and
therefore

H = k(−a_{n−1}/a_n, a_{n−2}/a_n, . . . , (−1)^{n−1} a_1/a_n, (−1)^n a_0/a_n)

for some polynomial k in n indeterminates over M[b_m, y_1, y_2, ..., y_m], which
gives H ∈ M[b_m, y_1, y_2, ..., y_m][a_0, a_1, ..., a_{n−1}, a_n] (Lemma 49.5(1)).

Now H = R(f,g)/S = R(f,g)/h(a_0, a_1, ..., a_{n−1}, a_n). Note that multiplying the
coefficients a_n, a_{n−1}, ..., a_1, a_0 of f(x) by an indeterminate t does not change
the roots u_1, u_2, ..., u_n of f(x) but, in view of (i), changes S to t^m S, so that

h(t a_n, t a_{n−1}, ..., t a_1, t a_0) = t^m h(a_n, a_{n−1}, ..., a_1, a_0).

Likewise multiplying the coefficients a_n, a_{n−1}, ..., a_1, a_0 of f(x) by an
indeterminate t changes R(f,g) to t^m R(f,g), as the determinant R(f,g) has
m rows consisting of zeroes and the coefficients of f. Thus H does not
change when the coefficients of f are multiplied by t. But any monomial

y a_0^{k_0} a_1^{k_1} . . . a_n^{k_n}  (y ∈ M[b_m, y_1, y_2, ..., y_m])

then changes to y(t a_0)^{k_0} (t a_1)^{k_1} . . . (t a_n)^{k_n} = t^{k_0 + k_1 + . . . + k_n} y a_0^{k_0} a_1^{k_1} . . . a_n^{k_n}. Thus
the exponent system of any monomial y a_0^{k_0} a_1^{k_1} . . . a_n^{k_n} appearing in H is
such that k_0 + k_1 + . . . + k_n = 0. This means k_0 = k_1 = . . . = k_n = 0 for all
monomials y a_0^{k_0} a_1^{k_1} . . . a_n^{k_n} appearing in H, and H is a "constant", i.e., H is
in M[b_m, y_1, y_2, ..., y_m].

Repeating the same argument with S/b_m^n in place of S/a_n^m, we get that H
is in M[a_n, u_1, u_2, ..., u_n]. So H ∈ M = P(a_n, b_m) ⊆ K.

Thus R(f,g) = HS for some H ∈ K. The constant term in S = a_n^m ∏_{i=1}^{n} g(u_i) is
equal to a_n^m b_0^n. So R(f,g) must have a term H a_n^m b_0^n. Now R(f,g) has the
term a_n^m b_0^n, the product of the entries in the principal diagonal. Hence
H = 1 and R(f,g) = S. This proves (2). From (i) and (ii), we get the
equations in (3) and (4). □

56.6 Lemma: Let K be a field and f(x), g(x) polynomials of positive
degree in K[x], say deg f(x) = n and deg g(x) = m. Let a_n be the leading
coefficient of f(x) and b_m the leading coefficient of g(x). Let r_1, r_2, ..., r_n be
the roots of f(x) and s_1, s_2, ..., s_m the roots of g(x) in a splitting field of
f(x)g(x) over K. Then

R(f,g) = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (r_i − s_j) = a_n^m ∏_{i=1}^{n} g(r_i) = (−1)^{mn} b_m^n ∏_{j=1}^{m} f(s_j).

Proof: In a splitting field of f(x)g(x) over K, we have the factorizations
f(x) = a_n(x − r_1)(x − r_2) . . . (x − r_n),
g(x) = b_m(x − s_1)(x − s_2) . . . (x − s_m).
Thus f(x) and g(x) are obtained from
F(x) = a_n(x − u_1)(x − u_2) . . . (x − u_n),
G(x) = b_m(x − y_1)(x − y_2) . . . (x − y_m),
where u_1, u_2, ..., u_n, y_1, y_2, ..., y_m are indeterminates over K, by substituting
r_i for u_i and s_j for y_j. Since

R(F,G) = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (u_i − y_j) = a_n^m ∏_{i=1}^{n} G(u_i) = (−1)^{mn} b_m^n ∏_{j=1}^{m} F(y_j)

by Theorem 56.5, this substitution gives

R(f,g) = a_n^m b_m^n ∏_{i=1}^{n} ∏_{j=1}^{m} (r_i − s_j) = a_n^m ∏_{i=1}^{n} g(r_i) = (−1)^{mn} b_m^n ∏_{j=1}^{m} f(s_j). □
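As a numerical sanity check of the three product formulas in Lemma 56.6 (an illustration, not from the text; the polynomials f(x) = 2(x − 1)(x − 2) and g(x) = 3(x − 5) are made-up examples with n = 2, m = 1), one can evaluate all three expressions and see that they agree:

```python
from functools import reduce
from operator import mul

# Hypothetical example: f(x) = 2(x - 1)(x - 2), g(x) = 3(x - 5)
an, roots_f = 2, [1, 2]     # leading coefficient and roots of f
bm, roots_g = 3, [5]        # leading coefficient and roots of g
n, m = len(roots_f), len(roots_g)

def prod(xs):
    return reduce(mul, xs, 1)

def f(x): return an * prod(x - r for r in roots_f)
def g(x): return bm * prod(x - s for s in roots_g)

# The three expressions of Lemma 56.6 for R(f, g):
expr1 = an**m * bm**n * prod(r - s for r in roots_f for s in roots_g)
expr2 = an**m * prod(g(r) for r in roots_f)
expr3 = (-1)**(m * n) * bm**n * prod(f(s) for s in roots_g)

print(expr1, expr2, expr3)   # all three agree (here: 216 216 216)
```

Here 2¹ · 3² · (1 − 5)(2 − 5) = 18 · 12 = 216, matching 2 · g(1)g(2) and (−1)² · 3² · f(5).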

56.7 Lemma: Let K be a field. Let
f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0
be a polynomial of degree n in K[x]\K and
g(x) = b_m x^m + b_{m−1} x^{m−1} + . . . + b_1 x + b_0
a polynomial in K[x]\K, possibly with b_m = 0. Let r_1, r_2, ..., r_n be the roots
of f(x) in some splitting field of f(x) over K. Then

R(f,g) = a_n^m ∏_{i=1}^{n} g(r_i).

Proof: Assume first b_m ≠ 0. Let F be a splitting field of f(x) over K in
which r_1, r_2, ..., r_n lie and let E be a splitting field of g(x) over F, so that
both f(x) and g(x) split completely in E. Then R(f,g) = a_n^m ∏_{i=1}^{n} g(r_i) by
Lemma 56.6.

Assume now b_m = 0 and let k be the largest index for which b_k ≠ 0. Thus
b_m = b_{m−1} = . . . = b_{k+1} = 0 and b_k ≠ 0. We put G(x) = b_k x^k + b_{k−1} x^{k−1} + . . . +
b_1 x + b_0. We get R(f,g) = a_n^{m−k} R(f,G) from Remark 56.4, and we have
R(f,G) = a_n^k ∏_{i=1}^{n} G(r_i) by what we have just proved. Since G(r_i) = g(r_i) for
any i = 1, 2, ..., n, we obtain

R(f,g) = a_n^{m−k} R(f,G) = a_n^{m−k} a_n^k ∏_{i=1}^{n} G(r_i) = a_n^m ∏_{i=1}^{n} G(r_i) = a_n^m ∏_{i=1}^{n} g(r_i).

This completes the proof. □

56.8 Definition: Let K be a field and f(x) a nonzero polynomial in K[x]
of positive degree n. Let a_n be the leading coefficient of f(x) and let
r_1, r_2, ..., r_n be the roots of f(x) in some splitting field E of f(x) over K.
Then

a_n^{2n−2} ∏_{i<j} (r_i − r_j)² ∈ E

is called the discriminant of f(x) and is denoted by D(f).

It seems as though the discriminant of f(x) depended on the splitting
field E we choose, and that we had to call it the discriminant of f(x) in
E and denote it by D_E(f). However, there is no need to refer to the
splitting field, since the discriminant is in fact an element of the field K.
This we prove in the next theorem.

In the next theorem, if f(x) = a_n x^n + a_{n−1} x^{n−1} + . . . + a_1 x + a_0 and f′(x) =
n a_n x^{n−1} + (n−1) a_{n−1} x^{n−2} + . . . + a_1, then R(f, f′) is understood to be the
determinant with n + (n − 1) rows, the first n − 1 rows being
a_n  a_{n−1}  . . .  a_1  a_0
surrounded with zeroes and the last n being
n a_n  (n−1) a_{n−1}  . . .  a_1
surrounded with zeroes, even if n a_n = 0, (n−1) a_{n−1} = 0, etc. (this happens
when char K = p ≠ 0 and p | n, a_{n−1} = 0, etc.). In other words, we define
R(f, f′) as if f′ were of degree n − 1, although the degree of f′ may be less
than n − 1 (cf. Remark 56.4).

56.9 Theorem: Let K be a field and f(x) a polynomial of positive degree
n in K[x], and let a_n be the leading coefficient of f(x). Then the discriminant
D(f) of f(x) is in K. In fact, R(f, f′) = (−1)^{n(n−1)/2} a_n D(f).

Proof: Let E be a splitting field of f(x) over K and let r_1, r_2, ..., r_n be the
roots of f(x) in E. We evaluate R(f, f′). We have R(f, f′) = a_n^{n−1} ∏_{i=1}^{n} f′(r_i) by
Lemma 56.7. We must find f′(r_i). From f(x) = a_n(x − r_1)(x − r_2) . . . (x − r_n),
we get

f′(x) = a_n ∑_{j=1}^{n} (x − r_1) . . . (x − r_{j−1})(x − r_{j+1}) . . . (x − r_n),

so f′(r_i) = a_n(r_i − r_1) . . . (r_i − r_{i−1})(r_i − r_{i+1}) . . . (r_i − r_n) = a_n ∏_{j=1, j≠i}^{n} (r_i − r_j).

Thus R(f, f′) = a_n^{n−1} ∏_{i=1}^{n} f′(r_i) = a_n^{n−1} ∏_{i=1}^{n} (a_n ∏_{j≠i} (r_i − r_j)) = a_n^{2n−1} ∏_{i≠j} (r_i − r_j)

= a_n · a_n^{2n−2} ∏_{i≠j} (r_i − r_j) = a_n · a_n^{2n−2} ∏_{i<j} (r_i − r_j) ∏_{i>j} (r_i − r_j)

= a_n · a_n^{2n−2} ∏_{i<j} (r_i − r_j) ∏_{i>j} (−1)(r_j − r_i)

= a_n · a_n^{2n−2} ∏_{i<j} (r_i − r_j) ∏_{i<j} (−1)(r_i − r_j)

= a_n · a_n^{2n−2} ∏_{i<j} (r_i − r_j) · (−1)^{(n−1) + (n−2) + . . . + 2 + 1} ∏_{i<j} (r_i − r_j)

= (−1)^{n(n−1)/2} a_n · a_n^{2n−2} ∏_{i<j} (r_i − r_j)² = (−1)^{n(n−1)/2} a_n D(f). □
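The relation R(f, f′) = (−1)^{n(n−1)/2} a_n D(f) can be tested on a concrete quadratic. The following Python fragment (an illustration, not part of the text) uses the hypothetical example f(x) = 2x² − 6x + 4 = 2(x − 1)(x − 2), computing R(f, f′) through the product formula of Lemma 56.7 and D(f) through Definition 56.8:

```python
# Hypothetical check of Theorem 56.9 for f(x) = 2x^2 - 6x + 4 = 2(x - 1)(x - 2)
a_n, roots = 2, [1, 2]
n = len(roots)

def fprime(x):
    # f'(x) = 4x - 6
    return 4 * x - 6

# R(f, f') = a_n^(n-1) * prod_i f'(r_i)        (Lemma 56.7)
R = a_n ** (n - 1) * fprime(roots[0]) * fprime(roots[1])

# D(f) = a_n^(2n-2) * prod_{i<j} (r_i - r_j)^2 (Definition 56.8)
D = a_n ** (2 * n - 2) * (roots[0] - roots[1]) ** 2

assert D == (-6) ** 2 - 4 * 2 * 4                   # agrees with b^2 - 4ac = 4
assert R == (-1) ** (n * (n - 1) // 2) * a_n * D    # Theorem 56.9: R = -8
```

Here R(f, f′) = 2 · f′(1) · f′(2) = 2 · (−2) · 2 = −8 and (−1)^{2·1/2} · 2 · 4 = −8, as the theorem predicts.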

56.10 Examples: (a) Let K be a field and f(x) = ax² + bx + c ∈ K[x], with
a ≠ 0. The discriminant of f(x) is (−1)^{2·1/2} a^{−1} = −a^{−1} times the resultant

    | a   b   c |     | a   b    c  |
    | 2a  b   0 |  =  | 0  −b  −2c  |  =  a | −b  −2c |  =  a(−b² + 4ac) = −a(b² − 4ac),
    | 0   2a  b |     | 0   2a   b  |      |  2a   b |

hence the discriminant of f(x) is b² − 4ac.

(b) Let K be a field and f(x) = x³ + px + q ∈ K[x]. The discriminant of f(x) is
(−1)^{3·2/2} · 1^{−1} = −1 times the resultant

    | 1  0  p  q  0 |
    | 0  1  0  p  q |
    | 3  0  p  0  0 |  =  4p³ + 27q².
    | 0  3  0  p  0 |
    | 0  0  3  0  p |

So the discriminant of f(x) is equal to −4p³ − 27q².
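The closed formula −4p³ − 27q² can be confirmed against Definition 56.8 for a cubic with known roots. Here is a small Python check (the polynomial x³ − 7x + 6 = (x − 1)(x − 2)(x + 3) is a made-up example, not from the text):

```python
# Hypothetical check of Example 56.10(b): for f(x) = x^3 + px + q with
# p = -7, q = 6 we have f(x) = (x - 1)(x - 2)(x + 3).
p, q = -7, 6
roots = [1, 2, -3]

# Discriminant from the roots (leading coefficient 1, Definition 56.8):
D_roots = 1
for i in range(3):
    for j in range(i + 1, 3):
        D_roots *= (roots[i] - roots[j]) ** 2

# Discriminant from the closed formula of Example 56.10(b):
D_formula = -4 * p**3 - 27 * q**2

assert D_roots == D_formula == 400
```

Both routes give (1 − 2)²(1 + 3)²(2 + 3)² = 1 · 16 · 25 = 400 = −4(−343) − 27 · 36.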

We now turn our attention to polynomial equations.

56.11 Lemma: (1) Let E/K, E₁/K₁ be field extensions. Assume that there
are field isomorphisms σ: K → K₁ and τ: E → E₁ and that τ is an extension
of σ. Then Aut_K E ≅ Aut_{K₁} E₁.

(2) Let K be a field and f(x) a polynomial in K[x]\K. Let E and F be two
splitting fields of f(x) over K. Then Aut_K E ≅ Aut_K F.

Proof: (1) For any α ∈ Aut_K E, consider the mapping τ^{−1}ατ: E₁ → E₁.
Clearly τ^{−1}ατ is a field isomorphism (Lemma 48.10). Moreover, for any
a₁ ∈ K₁, there is a unique a ∈ K with aσ = aτ = a₁, i.e., a₁τ^{−1} = a, and

a₁(τ^{−1}ατ) = ((a₁τ^{−1})α)τ = (aα)τ = aτ = a₁,

so τ^{−1}ατ is in fact a K₁-automorphism of E₁. Thus we have a mapping

A: Aut_K E → Aut_{K₁} E₁,  α ↦ τ^{−1}ατ.

Now (αβ)A = τ^{−1}(αβ)τ = (τ^{−1}ατ)(τ^{−1}βτ) = (αA)(βA) for any α, β ∈ Aut_K E, so A is a
group homomorphism. Repeating the same argument with K, E, τ and K₁,
E₁, τ^{−1} interchanged, we conclude that the mapping

B: Aut_{K₁} E₁ → Aut_K E,  α₁ ↦ τα₁τ^{−1}

is an inverse of A, so A is one-to-one and onto Aut_{K₁} E₁. Thus A is an
isomorphism and we get Aut_K E ≅ Aut_{K₁} E₁.

(2) The fields E and F are K-isomorphic by Theorem 53.8, so the claim
follows immediately from part (1). □

Thus the Galois groups of any two splitting fields of f(x) over K are iso-
morphic. This justifies the definite article in the next definition.

56.12 Definition: Let K be a field and f(x) a polynomial in K[x]\K. The
Galois group Aut_K E of a splitting field E of f(x) over K is called the Galois
group of f(x) ∈ K[x].

56.13 Examples: (a) ℚ(i) is a splitting field of x² + 1 ∈ ℚ[x] over ℚ and
hence the Galois group of x² + 1 ∈ ℚ[x] is Aut_ℚ ℚ(i) ≅ C₂.

(b) The Galois group of x³ − 2 ∈ ℚ[x] is {σ₁, σ₂, σ₃, σ₄, σ₅, σ₆} ≅ S₃. Here we
used the notation of Example 54.18(a).

(c) The Galois group of x⁴ − 2 ∈ ℚ[x] is {σ₁, σ₂, σ₃, σ₄, σ₅, σ₆, σ₇, σ₈} ≅ D₈. Here
we used the notation of Example 54.18(b). We know that D₈ ≅
{ι, (13), (24), (12)(34), (13)(24), (14)(23), (1234), (1432)} ≤ S₄.

(d) Let p be a prime number. The field 𝔽_{p^n} is a splitting field of x^{p^n} − x
over 𝔽_p (Example 53.5(f)). Hence the Galois group of x^{p^n} − x ∈ 𝔽_p[x] is
Aut_{𝔽_p} 𝔽_{p^n} = ⟨φ⟩, where φ is the homomorphism a ↦ a^p (Example
54.18(c)).

56.14 Theorem: Let K be a field, f(x) a polynomial in K[x]\K and let G
be the Galois group of f(x). Then G is isomorphic to a subgroup of a
symmetric group S_n.

Proof: Let E be a splitting field of f(x) over K and let a_1, a_2, ..., a_n be the
distinct roots of f(x) in E (1 ≤ n ≤ deg f(x)). Any σ ∈ G = Aut_K E maps
each a_i to some a_j and thus gives rise to a permutation σ̄ ∈ S_n, namely
i ↦ j. Thus σ̄ is given by a_iσ = a_{iσ̄}.

Now the mapping ¯ : G → S_n is a homomorphism of groups since, for any
σ, τ ∈ G, we have

a_{i(στ)¯} = a_i(στ) = (a_iσ)τ = a_{iσ̄}τ = a_jτ  (put iσ̄ = j)
= a_{jτ̄} = a_{(iσ̄)τ̄} = a_{i(σ̄τ̄)}

for i = 1, 2, ..., n, and so (στ)¯ = σ̄τ̄. Here σ ∈ Ker ¯ if and only if a_iσ = a_i for
all i = 1, 2, ..., n. Thus an automorphism in Ker ¯ fixes each element of K
and fixes each a_i. Since E is generated by the a_i over K (Example 53.5(d)),
we deduce that an automorphism in Ker ¯ fixes all elements of E. Thus
Ker ¯ = {ι_E}. So ¯ is one-to-one and G is isomorphic to Im ¯ ≤ S_n. □

The preceding proof is quite simple: G acts on the set U of the distinct
roots of f(x), and the resulting permutation representation is one-to-one;
thus G is isomorphic to a subgroup of S_U, and S_U itself is isomorphic to
S_n. We will often identify the Galois group of a polynomial with its
isomorphic images in S_U and in S_n.

The Galois group of a polynomial reflects many important properties of
that polynomial. We describe how irreducibility is reflected in the Galois
group. It turns out that the decomposition of f(x) into irreducible
polynomials is intimately connected with the partitioning of its roots
into disjoint orbits. Let us recall that a group G is said to act transitively
on a set X provided, for any x, y ∈ X, there is a g ∈ G such that xg = y
(Definition 25.11). If G ≤ S_n acts transitively on {1, 2, ..., n}, then we shall
call G a transitive subgroup of S_n. Thus G ≤ S_n is transitive if and only if,
for any i, j ∈ {1, 2, ..., n}, there is a σ ∈ G such that iσ = j.

56.15 Examples: (a) A subgroup G of S_n is transitive if and only if, for
any i ∈ {1, 2, ..., n}, there is a σ ∈ G such that 1σ = i. The necessity of this
condition is clear. Conversely, if the condition is satisfied and i, j are in
{1, 2, ..., n}, there are σ, τ ∈ G with 1σ = i and 1τ = j, so σ^{−1}τ ∈ G maps i to j;
hence the condition is also sufficient.

(b) If H ≤ G ≤ S_n and H is transitive, then G is also transitive.

(c) A₃ = {ι, (123), (132)} is a transitive subgroup of S₃, for there are
permutations σ_i in A₃ with 1σ_i = i for any i = 1, 2, 3, viz. σ₁ = ι, σ₂ = (123)
and σ₃ = (132). Then S₃ is of course another transitive subgroup of S₃. On
the other hand, {ι, (12)} is not a transitive subgroup of S₃, for there is no
permutation in {ι, (12)} that maps 1 to 3. Likewise {ι, (13)} and {ι, (23)}
are not transitive subgroups of S₃. Certainly {ι} is not a transitive
subgroup of S₃. Thus A₃ and S₃ are the only transitive subgroups of S₃.

(d) Let σ = (12 . . . n) ∈ S_n. Then ⟨σ⟩ is a transitive subgroup of S_n since
1σ^{i−1} = i for any i = 1, 2, ..., n.

(e) If G is a transitive subgroup of S_n, so is any conjugate of G. Indeed, if
G is transitive and π ∈ S_n, then, for any i, j ∈ {1, 2, ..., n}, there is a σ ∈ G
that maps iπ^{−1} to jπ^{−1}, i.e., iπ^{−1}σ = jπ^{−1}. Thus π^{−1}σπ ∈ G^π maps i to j
and G^π is therefore transitive.

(f) It follows from the last two examples that ⟨(1234)⟩ and its
conjugates ⟨(1324)⟩, ⟨(1243)⟩ are transitive subgroups of S₄. Also V₄ =
{ι, (12)(34), (13)(24), (14)(23)} is a transitive subgroup of S₄. From V₄ ≤
A₄ and V₄ ≤ S₄, we see that A₄ and S₄ are transitive subgroups of S₄.
Likewise D₈ = {ι, (13), (24), (12)(34), (13)(24), (14)(23), (1234), (1432)} and
its conjugates
{ι, (12), (34), (13)(24), (12)(34), (14)(23), (1324), (1423)},
{ι, (14), (23), (12)(34), (14)(23), (13)(24), (1243), (1342)}
are transitive subgroups of S₄. On the other hand, {ι, (12), (34), (12)(34)}
and its conjugates are not transitive subgroups of S₄.
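Transitivity is easy to test by machine: a subgroup is transitive exactly when the orbit of a single point is the whole set {1, 2, ..., n}. The following Python sketch (an illustration, not from the text; `is_transitive` is a made-up helper) checks V₄ and the non-transitive subgroup {ι, (12), (34), (12)(34)} of S₄:

```python
# A permutation p on n points is encoded 0-indexed as a tuple:
# p[i] is the image of the point i.

def is_transitive(perms, n):
    # The group is transitive iff the orbit of the point 0 is {0, ..., n-1}.
    orbit = {0}
    changed = True
    while changed:
        changed = False
        for p in perms:
            for i in list(orbit):
                if p[i] not in orbit:
                    orbit.add(p[i])
                    changed = True
    return len(orbit) == n

identity = (0, 1, 2, 3)
# V4 = {i, (12)(34), (13)(24), (14)(23)} -- transitive
V4 = {identity, (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}
# H = {i, (12), (34), (12)(34)} -- not transitive: no element maps 1 to 3
H = {identity, (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)}

print(is_transitive(V4, 4), is_transitive(H, 4))  # True False
```

For H the orbit of the first point is just {1, 2} (in the book's 1-indexed notation), which is why the test fails.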

56.16 Theorem: Let K be a field and let f(x) ∈ K[x] be a monic poly-
nomial having no multiple roots. Let E be a splitting field of f(x) over K
and G = Aut_K E the Galois group of f(x). Let r_1, r_2, ..., r_n ∈ E be the roots of
f(x). Let m_0 = 0 and m_k = n.
(1) Assume the notation so chosen that

{r_1, r_2, ..., r_{m_1}}, {r_{m_1+1}, r_{m_1+2}, ..., r_{m_2}}, {r_{m_2+1}, r_{m_2+2}, ..., r_{m_3}},
. . . , {r_{m_{k−1}+1}, r_{m_{k−1}+2}, ..., r_{m_k}}

are the disjoint orbits under the action of G. Put

f_i(x) = (x − r_{m_{i−1}+1})(x − r_{m_{i−1}+2}) . . . (x − r_{m_i}) ∈ E[x] for i = 1, 2, ..., k.

Then f_i(x) ∈ K[x] and f_i(x) is irreducible in K[x], so that

f(x) = f_1(x) f_2(x) . . . f_k(x)

is the canonical decomposition of f(x) into irreducible polynomials in K[x].

(2) Let f(x) = f_1(x) f_2(x) . . . f_k(x) be the canonical decomposition of f(x) into
monic irreducible polynomials in K[x] and let r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i} be the
roots of f_i(x) (i = 1, 2, ..., k). Then

{r_1, r_2, ..., r_n} = {r_1, r_2, ..., r_{m_1}} ∪ {r_{m_1+1}, r_{m_1+2}, ..., r_{m_2}} ∪ {r_{m_2+1}, r_{m_2+2}, ..., r_{m_3}}
∪ . . . ∪ {r_{m_{k−1}+1}, r_{m_{k−1}+2}, ..., r_{m_k}}

is the partitioning of {r_1, r_2, ..., r_n} into disjoint orbits under the action of G.

Proof: (1) We first prove that f_i(x) ∈ K[x]. The coefficients of f_i(x) =
(x − r_{m_{i−1}+1})(x − r_{m_{i−1}+2}) . . . (x − r_{m_i}) are, aside from signs, elementary
symmetric polynomials in r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i}. Any automorphism in G
maps each one of these r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i} to one of them again and
thus leaves the coefficients of f_i(x) unchanged. So the coefficients of f_i(x)
are in the fixed field of G. Now f(x) has no multiple roots, so the
irreducible divisors of f(x) are separable over K and, since E is a splitting
field of f(x) over K, we infer that E is a Galois extension of K (Theorem
55.7) and the fixed field of G is exactly K. Hence f_i(x) ∈ K[x].

We prove next that f_i(x) is irreducible in K[x]. Let g(x) ∈ K[x] be an
irreducible divisor of f_i(x). In E, there is a root of g(x), say r_{m_{i−1}+1}. Then,
for any σ ∈ G, r_{m_{i−1}+1}σ is also a root of g(x). But {r_{m_{i−1}+1}σ : σ ∈ G} = orbit of
r_{m_{i−1}+1} = {r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i}}. Thus each of r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i} is a
root of g(x). These roots are distinct, for f(x) has no multiple roots. Thus
g(x) has at least m_i − m_{i−1} distinct roots. Then m_i − m_{i−1} ≤ deg g(x) ≤
deg f_i(x) = m_i − m_{i−1} and so g(x) = f_i(x). Thus f_i(x) = g(x) is irreducible
in K[x].

It follows that f(x) = f_1(x) f_2(x) . . . f_k(x) is the canonical decomposition of
f(x) into irreducible polynomials in K[x].

(2) Suppose now f(x) = f_1(x) f_2(x) . . . f_k(x) is the canonical decomposition of
f(x) into irreducible polynomials in K[x]. We are to show that the roots of
f_i(x) make up the orbit of r_{m_{i−1}+1}. Indeed, if σ ∈ G, then r_{m_{i−1}+1}σ is also a
root of f_i(x) and thus:

orbit of r_{m_{i−1}+1} ⊆ {r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i}}.

On the other hand, if r ∈ E is any root of f_i(x), then K(r_{m_{i−1}+1}) ≅ K(r) by a
K-isomorphism that sends r_{m_{i−1}+1} to r (Theorem 53.2), and this K-iso-
morphism can be extended to a K-automorphism of E (Theorem 53.7; E is
a splitting field of f(x) over K(r_{m_{i−1}+1}) and over K(r) by Example 53.5(e)).
So there is a σ ∈ G with r_{m_{i−1}+1}σ = r, and any root r of f_i(x) is in the orbit
of r_{m_{i−1}+1}. Thus:

{r_{m_{i−1}+1}, r_{m_{i−1}+2}, ..., r_{m_i}} ⊆ orbit of r_{m_{i−1}+1}.

This completes the proof. □

56.17 Theorem: Let K be a field, f(x) a polynomial of positive degree n
in K[x] and let G be the Galois group of f(x). If f(x) is irreducible and
separable over K, then n divides |G| and G is isomorphic to a transitive
subgroup of S_n.

Proof: Let E be a splitting field of f(x) over K. Then E is a Galois exten-
sion of K (Theorem 55.7) and, since f(x) is irreducible, there is only one
orbit of the roots of f(x) under the action of G (Theorem 56.16). Thus G
acts transitively on the set of roots of f(x) and its isomorphic image in S_n
acts transitively on {1, 2, ..., n}. So G is isomorphic to a transitive
subgroup of S_n. Furthermore, if r ∈ E is any root of f(x), then K(r) is an
intermediate field of E/K and K(r):K = deg f = n (Theorem 50.7) and, by
the fundamental theorem of Galois theory, G has a subgroup K(r)′ of
index G:K(r)′ = K(r):K = n. So n divides |G| by Lagrange's theorem. □

We shall regard the Galois group as a subgroup of S_n. It will be interest-
ing to determine the role of A_n. This is connected with discriminants.

56.18 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x].
Assume deg f = n > 0 and let E be a splitting field of f(x) over K.
Suppose f(x) has n distinct roots r_1, r_2, ..., r_n in E. Put

Δ = ∏_{i<j} (r_i − r_j) = (r_1 − r_2)(r_1 − r_3) . . . (r_{n−1} − r_n)  and  d = Δ².

(1) For σ ∈ Aut_K E ≤ S_n, there holds Δσ = Δ if and only if σ is in A_n, and
Δσ = −Δ if and only if σ is in S_n\A_n.
(2) d, which is an element of E, is actually in K. In fact, d = a_n^{−(2n−2)} D(f),
where a_n is the leading coefficient and D(f) is the discriminant of f(x).

Proof: (1) We have Δσ = ∏_{i<j} (r_{i′} − r_{j′}) = (r_{1′} − r_{2′})(r_{1′} − r_{3′}) . . . (r_{(n−1)′} − r_{n′}),
where r_{i′} = r_iσ. We divide the ordered pairs (i,j) with i < j into two
classes according as i′ < j′ or i′ > j′. Then

Δσ = ∏_{i<j, i′<j′} (r_{i′} − r_{j′}) · ∏_{i<j, i′>j′} (r_{i′} − r_{j′})

= ∏_{i<j, i′<j′} (r_{i′} − r_{j′}) · ∏_{i<j, i′>j′} (−1)(r_{j′} − r_{i′})

= ∏_{i<j, i′<j′} (r_{i′} − r_{j′}) · ∏_{i>j, i′<j′} (−1)(r_{i′} − r_{j′})  (interchange the dummy indices i and j)

= ∏_{i<j, i′<j′} (r_{i′} − r_{j′}) · (−1)^s ∏_{i>j, i′<j′} (r_{i′} − r_{j′}),

where s is the number of factors in the second product; hence s is the
number of inversions of the permutation

( 1   2  . . .  n )
( 1′  2′ . . .  n′ )  =  σ ∈ Aut_K E ≤ S_n.

Therefore

Δσ = (−1)^s ∏_{i′<j′} (r_{i′} − r_{j′}) = ε(σ) ∏_{i<j} (r_i − r_j) = ε(σ)Δ,

where ε(σ) = (−1)^s is the sign of σ. This proves (1), since ε(σ) = 1 exactly
when σ ∈ A_n.

(2) The equation d = a_n^{−(2n−2)} D(f) is immediate from the definition of the
discriminant (Definition 56.8). This implies of course that d is in K, since
D(f), being, aside from sign, a_n^{−1} times the determinant of a matrix with
entries in K (Theorem 56.9), is an element of K. Alternatively, we have
Δσ = ±Δ and thus dσ = (Δ²)σ = (Δσ)² = (±Δ)² = Δ² = d for any σ ∈ Aut_K E. So
d is in the fixed field of Aut_K E. Since the roots of f(x) are simple by
hypothesis, the irreducible divisors of f(x) are separable over K and thus
E is Galois over K (Theorem 55.7), so the fixed field of Aut_K E is K and d
is in K. □

56.19 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x].
Assume deg f = n > 0 and let E be a splitting field of f(x) over K.
Suppose f(x) has n distinct roots r_1, r_2, ..., r_n in E, so that E is a Galois
extension of K (Theorem 55.7). Put Δ = ∏_{i<j} (r_i − r_j). Consider the Galois
group Aut_K E as a subgroup of S_n.

In the Galois correspondence, the intermediate field K(Δ) corresponds to
Aut_K E ∩ A_n. In particular, Aut_K E ≤ A_n if and only if Δ ∈ K.

Proof: In the Galois correspondence, the subgroup of Aut_K E correspond-
ing to the intermediate field K(Δ) is

K(Δ)′ = {σ ∈ Aut_K E : aσ = a for all a ∈ K(Δ)}
= {σ ∈ Aut_K E : Δσ = Δ}
= {σ ∈ Aut_K E : σ ∈ A_n}
= Aut_K E ∩ A_n

by Theorem 56.18. In particular, Aut_K E ≤ A_n if and only if Aut_K E ∩ A_n =
Aut_K E, so if and only if K(Δ)′ = Aut_K E = K′, hence if and only if K(Δ) = K,
hence if and only if Δ ∈ K. □

We now study Galois groups of polynomials of degree 2, 3, 4. We start
with quadratic polynomials.

56.20 Theorem: Let K be a field and f(x) an irreducible polynomial in
K[x] of degree 2. Let G be the Galois group of f(x), regarded as a
subgroup of S₂. If f(x) is separable over K, then G = S₂ ≅ C₂. If f(x) is not
separable over K, then G = 1.

Proof: If f(x) is separable over K, then G is a transitive subgroup of S₂
(Theorem 56.17). Since S₂ is the only transitive subgroup of S₂, the result
follows. If f(x) = ax² + bx + c is not separable over K, then f′(x) = 2ax + b =
0, so 2a = 0 = b (and a ≠ 0), so char K = 2 and f(x) = a(x² + e) for some e ∈
K, and a splitting field of f(x) over K is K(r), where r is a root of f(x).
Then any σ in G maps r to r and thus fixes K(r). This means G consists of
the identity mapping on K(r). Hence G = 1. □

56.21 Theorem: Let K be a field and f(x) an irreducible separable polynomial in K[x] of degree 3. Let G be the Galois group of f(x), regarded as a subgroup of S₃. Then G = S₃ or G = A₃. More specifically, if char K ≠ 2, then G = A₃ in case D(f) is the square of an element in K, and G = S₃ in case D(f) is not the square of any element in K.

Proof: G is a transitive subgroup of S₃ (Theorem 56.17). Since S₃ and A₃ are the only transitive subgroups of S₃ (Example 56.15(c)), the result follows.

Assume in addition char K ≠ 2. Then G = A₃ if and only if Δ ∈ K in the notation of Theorem 56.19. Since Δ² = a₃^{−4} D(f), where a₃ is the leading coefficient of f(x) (Theorem 56.18), we conclude G = A₃ if and only if a₃^{−4} D(f) is the square of an element in K, thus if and only if D(f) is the square of an element in K.

56.22 Examples: (a) Let x³ + 6x + 2 ∈ ℤ₇[x]. This polynomial has no root in ℤ₇, hence is irreducible and then clearly separable over ℤ₇. Its discriminant −4(6)³ − 27(2)² = 4 + 1·4 = 1 = 1² (Example 56.10(b)) is a square in ℤ₇, so the Galois group of x³ + 6x + 2 is A₃.

(b) Let x³ + 5x + 5 ∈ ℚ[x]. This polynomial is irreducible by Eisenstein's criterion and is separable over ℚ since char ℚ = 0. The discriminant is equal to −4(5)³ − 27(5)² = −1175, which is not a square in ℚ. So the Galois group of x³ + 5x + 5 is S₃.
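Both examples can be checked by machine. The sketch below uses the sympy library (our tooling choice, not the book's); for the ℤ₇ example it applies Euler's criterion d^((7−1)/2) ≡ 1 (mod 7) to test whether the discriminant is a nonzero square mod 7.

```python
from sympy import symbols, discriminant

x = symbols('x')

# Example (b): x^3 + 5x + 5 over Q; D(f) = -4*5**3 - 27*5**2
D = discriminant(x**3 + 5*x + 5, x)
print(D)                        # -1175, not a square in Q, so G = S3

# Example (a): x^3 + 6x + 2 over Z_7
d = (-4 * 6**3 - 27 * 2**2) % 7
print(d, pow(d, 3, 7))          # 1 1 -> d is a square in Z_7, so G = A3
```

The same two lines settle any of the cubic exercises below, as long as the cubic is first checked to be irreducible and separable.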

Next we investigate polynomials of degree four. Here S₄ will come into play. We know that V₄ = {ι, (12)(34), (13)(24), (14)(23)} is an important normal subgroup of S₄. It will be useful to find the intermediate field corresponding to V₄ in the Galois correspondence.

56.23 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x] be a polynomial of degree four. Let E be a splitting field of f(x) over K. Suppose f(x) has four distinct roots r₁, r₂, r₃, r₄ in E, so that E is a Galois extension of K (Theorem 55.7). We put α = r₁r₂ + r₃r₄, β = r₁r₃ + r₂r₄ and γ = r₁r₄ + r₂r₃ and consider the Galois group Aut_K E as a subgroup of S₄ (Theorem 56.14).
In the Galois correspondence, the intermediate field K(α,β,γ) corresponds to Aut_K E ∩ V₄.

Proof: In the Galois correspondence, the subgroup of AutK E correspond-


ing to the intermediate field K( , , ) is .
K( , , )´ = { AutK E: a = a for all a K( , , )}
={ AutK E: = , = , = }.

If = (12)(34) AutK E, then fixes since = (r1r2 + r3r4) = r2r1 + r4r3


= r1r2 + r3r4 = . Similarly = (r1r3 + r2r4) = r2r4 + r1r3 = and =
(r1r4 + r2r3) = r2r3 + r1r4 = . Thus (12)(34) K( , , )´ if (12)(34) is in
AutK E. In like manner, one verifies that (13)(24) and (14)(23) belong to
K( , , )´ whenever they are in AutK E. This proves V4 AutK E K( , , )´.

To complete the proof, we show, for any AutK E, that V4 implies


K( , , )´. Indeed if V4, then is in one of the cosets V4(12),
V4(13), V4(23), V4(123), V4(132) of V4 in S 4. If V4(12), then =
(12) for some V4 AutK E, therefore (r1r3 + r2r4) = (r1r3 + r2r4) (12)
=
(r1r3 + r2r4)(12) and does not fix since
r1r3 + r2r4 = = = (r1r3 + r2r4) = (r1r3 + r2r4)(12) = r2r3 + r1r4
yields (r1 r2)r3 = (r1 r2)r4 and so r1 = r2 or r3 = r4, contrary to the
hypothesis that the roots of f(x) are distinct. Similarly, if V4(13),
then does not fix and if V4(23), then does not fix . If
V4(123), then does not fix since .
r1r2 + r3r4 = = = (r1r2 + r3r4) = (r1r2 + r3r4)(123) = r2r3 + r1r4
yields (r1 r3)r2 = (r1 r3)r4 and so r1 = r3 or r2 = r4, contrary to the
hypothesis. Similarly, if V4(132), then does not fix . This proves
that no automorphism in AutK E\V4 can be in K( , , )´. Hence we obtain
K( , , )´ V4 AutK E, as was to be proved.

56.24 Definition: Let K be a field and let f(x) ∈ K[x] be a polynomial of degree four having four distinct roots r₁, r₂, r₃, r₄ in a splitting field of f(x) over K. We put α = r₁r₂ + r₃r₄, β = r₁r₃ + r₂r₄ and γ = r₁r₄ + r₂r₃. The polynomial (x − α)(x − β)(x − γ) ∈ K(α,β,γ)[x] is called the resolvent cubic of f(x).

56.25 Lemma: Let K be a field and let f(x) ∈ K[x] be a polynomial of degree four having four distinct roots in a splitting field of f(x) over K. Then the resolvent cubic of f(x) is a polynomial in K[x]. In fact, if f(x) = x⁴ + bx³ + cx² + dx + e, then the resolvent cubic of f(x) is equal to
x³ − cx² + (bd − 4e)x − (b²e − 4ce + d²).

Proof: This is routine computation. Let r₁, r₂, r₃, r₄ be the roots of f(x) in a splitting field of f(x) over K. The resolvent cubic of f(x) is
x³ − (α + β + γ)x² + (αβ + αγ + βγ)x − αβγ,
where α = r₁r₂ + r₃r₄, β = r₁r₃ + r₂r₄, γ = r₁r₄ + r₂r₃. Let σ_m be the m-th elementary symmetric polynomial in 4 indeterminates. Then we have
α + β + γ = r₁r₂ + r₃r₄ + r₁r₃ + r₂r₄ + r₁r₄ + r₂r₃ = σ₂(r₁,r₂,r₃,r₄) = c;
αβ + αγ + βγ = r₁²r₂r₃ + . . . = . . .
= (r₁ + r₂ + r₃ + r₄)(r₁r₂r₃ + r₁r₂r₄ + r₁r₃r₄ + r₂r₃r₄) − 4r₁r₂r₃r₄
= σ₁(r₁,r₂,r₃,r₄)σ₃(r₁,r₂,r₃,r₄) − 4σ₄(r₁,r₂,r₃,r₄) = bd − 4e;
αβγ = . . . = b²e − 4ce + d².
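The omitted expansions in this proof can be delegated to a computer algebra system. The sketch below (sympy; the helper name resolvent_check is ours) reads off b, c, d, e from the product (x − r₁)(x − r₂)(x − r₃)(x − r₄) and confirms that (x − α)(x − β)(x − γ) agrees identically with the cubic stated in the lemma.

```python
from sympy import symbols, expand, Poly

x, r1, r2, r3, r4 = symbols('x r1 r2 r3 r4')

# coefficients of f = (x - r1)(x - r2)(x - r3)(x - r4) = x^4 + b x^3 + c x^2 + d x + e
f = Poly(expand((x - r1)*(x - r2)*(x - r3)*(x - r4)), x)
_, b, c, d, e = f.all_coeffs()

alpha = r1*r2 + r3*r4
beta  = r1*r3 + r2*r4
gamma = r1*r4 + r2*r3

lhs = expand((x - alpha)*(x - beta)*(x - gamma))
rhs = expand(x**3 - c*x**2 + (b*d - 4*e)*x - (b**2*e - 4*c*e + d**2))
print(expand(lhs - rhs))   # 0: the two cubics agree identically
```

Since the identity is checked with the roots kept as indeterminates, this verifies the lemma for every quartic at once.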

56.26 Theorem: Let K be a field and let f(x) ∈ K[x] be a polynomial of degree four, which is irreducible and separable over K. Let E be a splitting field of f(x) over K and let r₁, r₂, r₃, r₄ be the (distinct) roots of f(x) in E. We put α = r₁r₂ + r₃r₄, β = r₁r₃ + r₂r₄ and γ = r₁r₄ + r₂r₃. Let G = Aut_K E be the Galois group of f(x), considered as a subgroup of S₄. We put |K(α,β,γ):K| = m. Then G can be described as follows.
G = S₄ ⟺ m = 6.
G = A₄ ⟺ m = 3.
G ≅ D₈ ⟺ m = 2 and f(x) is irreducible over K(α,β,γ).
G = V₄ ⟺ m = 1.
G ≅ C₄ ⟺ m = 2 and f(x) is reducible over K(α,β,γ).

Proof: Since f(x) is irreducible and separable over K, its roots are distinct. We know that G is a transitive subgroup of S₄ and 4 divides |G| (Theorem 56.17). The transitive subgroups of S₄ whose orders are divisible by 4 are S₄, A₄, the Sylow 2-subgroups of S₄ (isomorphic to D₈), V₄ and the cyclic groups generated by 4-cycles like (1234) (Example 56.15(f)). Thus G is one of S₄, A₄, D₈, V₄, C₄.

The intermediate field K(α,β,γ) corresponds to V₄ ∩ G (Theorem 56.23). Now E is Galois over K(α,β,γ) and the Galois group Aut_{K(α,β,γ)}E = K(α,β,γ)´ is V₄ ∩ G. Since V₄ ⊴ S₄, we have V₄ ∩ G ⊴ G and so K(α,β,γ) is a Galois extension of K and the Galois group of K(α,β,γ) over K is (isomorphic to) G/(G ∩ V₄) (Theorem 54.25(2)). We get
m = |K(α,β,γ):K| = |Aut_K K(α,β,γ)| = |G/(G ∩ V₄)| and

G = S₄ ⟹ m = |G/(G ∩ V₄)| = |S₄/V₄| = 6;
G = A₄ ⟹ m = |G/(G ∩ V₄)| = |A₄/V₄| = 3;
G ≅ D₈ ⟹ m = |G/(G ∩ V₄)| = |D₈/V₄| = 2; moreover, E is a splitting field of f(x) over K(α,β,γ) and Aut_{K(α,β,γ)}E = K(α,β,γ)´ = V₄ ∩ D₈ = V₄ is a transitive subgroup of S₄, so f(x) is irreducible over K(α,β,γ) by Theorem 56.16;
G = V₄ ⟹ m = |G/(G ∩ V₄)| = |V₄/V₄| = 1;
G ≅ C₄ ⟹ m = |G/(G ∩ V₄)| = |{ι,(1234),(13)(24),(1432)}/{ι,(13)(24)}| = 2
(eventually after renaming the roots, we may assume, without loss of generality, that G = {ι,(1234),(13)(24),(1432)}); moreover, Aut_{K(α,β,γ)}E = K(α,β,γ)´ = ⟨(1234)⟩ ∩ V₄ = ⟨(13)(24)⟩ is not a transitive subgroup of S₄, so f(x) is not irreducible over K(α,β,γ) by Theorem 56.16.

This proves the ⟹ assertions in the statement of the theorem. As the five cases are mutually exclusive, the converse assertions are also valid.

56.27 Examples: (a) The polynomial f(x) = x⁴ − 4x² + 1 ∈ ℚ[x] has no integer roots and is easily verified to have no quadratic factors in ℤ[x], so f(x) is irreducible over ℤ and over ℚ (Lemma 34.11). Since char ℚ = 0, f(x) is separable over ℚ. In order to determine its Galois group G, we find the resolvent cubic of f(x). The resolvent cubic of f(x) is
x³ − (−4)x² + (0·0 − 4·1)x − (0²·1 − 4(−4)(1) + 0²)
= x³ + 4x² − 4x − 16
= (x + 4)(x − 2)(x + 2)
and the roots α, β, γ of the resolvent cubic are −4, 2, −2. Thus ℚ(α,β,γ) = ℚ and m = |ℚ(α,β,γ):ℚ| = 1. Theorem 56.26 yields G = V₄.

From f(r) = 0 ⟺ (r² − 2)² = 3, we see that the roots (say in ℝ) of f(x) are
r₁ = √(2+√3), r₂ = √(2−√3), r₃ = −√(2+√3), r₄ = −√(2−√3).
Note that r₂ = 1/r₁, r₃ = −r₁ and r₄ = −1/r₁. Since
(12)(34) ∈ V₄ = G fixes r₁ + r₂ = √6,
(13)(24) ∈ V₄ = G fixes r₁², hence also r₁² − 2 = √3,
(14)(23) ∈ V₄ = G fixes r₁ + r₄ = √2,
the Galois correspondence is as follows: ℚ(√(2+√3)) corresponds to 1; the fields ℚ(√6), ℚ(√3), ℚ(√2) correspond to the subgroups ⟨(12)(34)⟩, ⟨(13)(24)⟩, ⟨(14)(23)⟩ respectively; and ℚ corresponds to V₄.
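The fixed elements listed in example (a) can be confirmed numerically; the short sympy sketch below (the root labels follow the example) checks each identity to high precision rather than symbolically.

```python
from sympy import sqrt, N

r1 = sqrt(2 + sqrt(3))
r2 = sqrt(2 - sqrt(3))
r3, r4 = -r1, -r2

assert abs(N(r1*r2 - 1)) < 1e-12              # r2 = 1/r1
assert abs(N((r1 + r2) - sqrt(6))) < 1e-12    # fixed by (12)(34)
assert abs(N((r1**2 - 2) - sqrt(3))) < 1e-12  # fixed by (13)(24)
assert abs(N((r1 + r4) - sqrt(2))) < 1e-12    # fixed by (14)(23)
```

For instance, (r₁ + r₂)² = (2+√3) + (2−√3) + 2√((2+√3)(2−√3)) = 4 + 2 = 6, which is what the first fixed-element assertion reflects.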

(b) Let f(x) = x⁴ + 5x² + 5 ∈ ℚ[x]. Then f(x) is irreducible over ℤ by Eisenstein's criterion and also over ℚ by Lemma 34.11. Thus f(x) is separable over ℚ. Let G be the Galois group of f(x). The resolvent cubic of f(x) is x³ − 5x² − 20x + 100 = (x − 5)(x² − 20) = (x − 5)(x − 2√5)(x + 2√5), with roots α, β, γ = 5, 2√5, −2√5. Hence ℚ(α,β,γ) = ℚ(√5). So Theorem 56.26 gives G ≅ D₈ or G ≅ C₄. In fact, since
f(x) = (x² + (5+√5)/2)(x² + (5−√5)/2)
is reducible over ℚ(√5), we have G ≅ C₄.

(c) Let f(x) = x⁴ − 2 ∈ ℚ[x]. Then f(x) is irreducible over ℚ by Eisenstein's criterion and Lemma 34.11. Let G be the Galois group of f(x). The resolvent cubic of f(x) is x³ + 8x, whose roots are α, β, γ = 0, 2√2·i, −2√2·i. Therefore m = |ℚ(√2·i):ℚ| = 2 and G ≅ D₈ or G ≅ C₄. It is easy to see that f(x) is irreducible over ℚ(√2·i), so we get G ≅ D₈ from Theorem 56.26.

Exercises

1. Find the resultant R(f,g) when f(x) = x⁴ + 4x³ − 3x² + x − 2 ∈ ℚ[x] and g(x) = x − 3 ∈ ℚ[x].

2. Let K be a field and f(x) = a_nxⁿ + a_{n−1}x^{n−1} + . . . + a₁x + a₀, g(x) = b₁x + b₀ polynomials in K[x], with b₁ ≠ 0. Show that R(f,g) = (−b₁)ⁿ f(−b₀/b₁).

3. Let K be a field and f(x) = a_nxⁿ + a_{n−1}x^{n−1} + . . . + a₁x + a₀, g(x) = b_mx^m + b_{m−1}x^{m−1} + . . . + b₁x + b₀. If n > m, show that R(f + cg, g) = R(f,g) for all c ∈ K.

4. Let K be a field and f,g,h ∈ K[x]. Prove that R(fh,g) = R(f,g)R(h,g).

5. Let K be a field and f,g ∈ K[x]. Prove that D(fg) = D(f)D(g)[R(f,g)]² and that D(f(x)) = D(f(x − c)) for any c ∈ K.

6. Let K be a field and f(x) = ax³ + bx² + cx + d ∈ K[x]. Prove that
D(f) = b²c² + 18abcd − 4ac³ − 4b³d − 27a²d².

7. Let K be a field and f(x) = x⁴ + ax² + bx + c ∈ K[x]. Prove that
D(f) = −4a³b² + 144ab²c + 16a⁴c − 128a²c² + 256c³ − 27b⁴.

8. Let K be a field, f(x) a polynomial of degree n in K[x] with leading coefficient a_n, and let r₁, r₂, . . . , r_n be the roots of f(x) in some splitting field of f(x) over K. Put s₀ = n and s_m = r₁^m + r₂^m + . . . + r_n^m for m ∈ ℕ. Show that

D(f) = a_n^{2n−2} ·
| s₀       s₁     s₂       . . .  s_{n−1}  |
| s₁       s₂     s₃       . . .  s_n      |
| .        .      .        . . .  .        |
| s_{n−1}  s_n    s_{n+1}  . . .  s_{2n−2} |

(Hint: multiply two Vandermonde determinants.)

8. Where did we use the hypothesis char K ≠ 2 in Theorem 56.18?

9. Find the discriminants and Galois groups of the following polynomials.
(a) x³ + 3x² − 1 ∈ ℚ[x].
(b) x³ − 2x² + 4x + 6 ∈ ℚ[x].
(c) x³ − x + 2 ∈ ℤ₃[x].
(d) x³ + 3x² − 3 ∈ ℤ₅[x].

10. Find the Galois groups of the following polynomials over the fields indicated.
(a) x⁴ − 2 over ℚ(√2) and over ℚ(√2·i).
(b) (x³ − 2)(x² − 5) over ℚ.
(c) x⁴ − 8x² + 15 over ℚ.
(d) x⁴ + 4x² + 2 over ℚ and over ℚ(√2).
(e) (x² − 2)(x² − 3)(x² − 5) over ℚ, over ℚ(√2), over ℚ(√6) and over ℚ(√2, √3).

11. Let K be any arbitrary field and f(x) = x³ − 3x + 1 ∈ K[x]. Show that f(x) is either irreducible over K or splits in K.

12. Let K be a field and f(x) an irreducible separable polynomial of degree three in K[x]. Suppose r₁, r₂, r₃ are the roots of f(x) in some splitting field of f(x) over K. If the Galois group of f(x) is S₃, show that, in the Galois correspondence, K(r_i) corresponds to the subgroup {ι,(jk)} of S₃, where {i,j,k} = {1,2,3}.

13. Prove that S4 has no transitive subgroup of order six.

14. Let p be a prime number and G ≤ S_p. Show that G is transitive if and only if p divides the order of G.

§57
Norm and Trace

In this paragraph, we introduce the norm and the trace of elements in an extension field. These can be defined for any finite dimensional extension, but we restrict ourselves to the important case where the extension is separable.

In order to define norm and trace, we need K-homomorphisms of an extension field of K. In the case of a separable extension, these are easy to describe.

Let K be a field, E a finite dimensional separable extension of K and N a normal closure of K over E, so that N is finite dimensional and Galois over K (Theorem 55.11). Let us put |E:K| = n. Since E is finite dimensional and hence finitely generated (Theorem 50.10) over K, there is an a ∈ E such that E = K(a) (Theorem 55.14). Let f(x) ∈ K[x] be the (separable) minimal polynomial of a over K, so that deg f(x) = |K(a):K| = |E:K| = n (Theorem 50.7). Since N is normal over K and f(x) has a root a in N, the polynomial f(x) splits in N, say
f(x) = (x − a₁)(x − a₂). . . (x − a_n),  a₁ = a,  a₁,a₂, . . . ,a_n ∈ N,
and a₁,a₂, . . . ,a_n are pairwise distinct. Then K(a) and K(a_i) ⊆ N are K-isomorphic by Theorem 53.2, namely by the K-isomorphism
σ_i: E = K(a) = K[a] → K[a_i] = K(a_i) ⊆ N
k₀ + k₁a + . . . + k_{n−2}a^{n−2} + k_{n−1}a^{n−1} ↦ k₀ + k₁a_i + . . . + k_{n−2}a_i^{n−2} + k_{n−1}a_i^{n−1},
where k₀,k₁, . . . ,k_{n−2},k_{n−1} ∈ K. Thus each σ_i is a K-homomorphism from E into N. Conversely, any K-homomorphism σ: E → N must map a to one of a₁,a₂, . . . ,a_n and must coincide with one of σ₁,σ₂, . . . ,σ_n. So {σ₁,σ₂, . . . ,σ_n} is the complete set of K-homomorphisms from E into N.

We give a generalization of this result.

57.1 Lemma: Let K be a field and E a finite dimensional separable extension of K. Let L be an intermediate field of E/K and let N be a normal closure of K over E. If φ: L → N is a K-homomorphism, then φ can be extended in exactly |E:L| ways to a K-homomorphism E → N.

Proof: Since E is finitely generated and separable over K, it is a simple extension of K, say E = K(a) (Theorem 55.14). Let |E:L| = m and g(x) ∈ L[x] the minimal polynomial of a over L, so that deg g(x) = m. Let f(x) be the minimal polynomial of a over K. Then f(x) splits in N because the irreducible polynomial f(x) ∈ K[x] has a root in N and N is normal over K. Since g(x) divides f(x) (in L[x]; Lemma 50.5), the roots of g(x) are all in N. Let a = a₁,a₂, . . . ,a_m ∈ N be the roots of g(x). Then any extension ψ: E → N (ψ a K-homomorphism) of φ must send a to one of a₁,a₂, . . . ,a_m and any l in L to lφ, and thus must coincide with one of the mappings
ψ_i: E = L(a) = L[a] → L[a_i] = L(a_i) ⊆ N
l₀ + l₁a + . . . + l_{m−2}a^{m−2} + l_{m−1}a^{m−1} ↦ (l₀φ) + (l₁φ)a_i + . . . + (l_{m−2}φ)a_i^{m−2} + (l_{m−1}φ)a_i^{m−1}
(l₀,l₁, . . . ,l_{m−2},l_{m−1} ∈ L), where i = 1,2, . . . ,m; and these mappings ψ_i are indeed extensions of φ (since ψ_i: l₀ ↦ l₀φ) and field homomorphisms (cf. Lemma 53.1, Theorem 53.2). Thus {ψ₁,ψ₂, . . . ,ψ_m} is the complete set of K-homomorphisms from E into N which are extensions of φ.

We can now give the definition of norm and trace.

57.2 Definition: Let K be a field and E a finite dimensional separable extension of K. Let a ∈ E. Choose a normal closure N of K over E and let {σ₁,σ₂, . . . ,σ_k} be the set of all K-homomorphisms from E into N (so k = |E:K|). The norm of a ∈ E over K is denoted by N_{E/K}(a) and is defined as
N_{E/K}(a) = (aσ₁)(aσ₂). . . (aσ_k).

The trace of a ∈ E over K is denoted by T_{E/K}(a) and is defined as
T_{E/K}(a) = aσ₁ + aσ₂ + . . . + aσ_k.

N_{E/K}(a) and T_{E/K}(a) therefore depend on E and K as well as on a. It seems as though N_{E/K}(a) and T_{E/K}(a) depended also on the normal closure N we choose, but they actually do not depend on N. This will be proved shortly (Lemma 57.4(3)).

In case E is Galois over K, the normal closure N is equal to E and then we have
N_{E/K}(a) = (aσ₁)(aσ₂). . . (aσ_k),  T_{E/K}(a) = aσ₁ + aσ₂ + . . . + aσ_k,
where {σ₁,σ₂, . . . ,σ_k} = Aut_K E.

57.3 Examples: (a) Consider the extension ℂ over ℝ. Now ℂ is Galois over ℝ and Aut_ℝ ℂ = {ι, κ}, where κ is the conjugation mapping. Thus N_{ℂ/ℝ}(a + bi) = (a + bi)((a + bi)κ) = (a + bi)(a − bi) = a² + b² and T_{ℂ/ℝ}(a + bi) = (a + bi) + ((a + bi)κ) = (a + bi) + (a − bi) = 2a for any a + bi ∈ ℂ (a,b ∈ ℝ).

(b) ℚ(√2) is a Galois extension of ℚ and Aut_ℚ ℚ(√2) = {ι, σ}, where σ is the homomorphism √2 ↦ −√2. Thus N_{ℚ(√2)/ℚ}(a + b√2) = (a + b√2)((a + b√2)σ) = (a + b√2)(a − b√2) = a² − 2b² and T_{ℚ(√2)/ℚ}(a + b√2) = (a + b√2) + ((a + b√2)σ) = (a + b√2) + (a − b√2) = 2a for any a + b√2 ∈ ℚ(√2) (a,b ∈ ℚ).

(c) ℚ(∛2) is a separable extension of ℚ, but not Galois over ℚ. A normal closure of ℚ over ℚ(∛2) is ℚ(∛2, ω), where ω is a primitive third root of unity. There are exactly three ℚ-homomorphisms from ℚ(∛2) into ℚ(∛2, ω), namely ∛2 ↦ ∛2 (the identity), ∛2 ↦ ∛2·ω and ∛2 ↦ ∛2·ω². So
N_{ℚ(∛2)/ℚ}(a + b∛2 + c(∛2)²)
= (a + b∛2 + c(∛2)²)(a + b∛2·ω + c(∛2)²ω²)(a + b∛2·ω² + c(∛2)²ω)
= . . . = a³ + 2b³ + 4c³ − 6abc
and T_{ℚ(∛2)/ℚ}(a + b∛2 + c(∛2)²)
= (a + b∛2 + c(∛2)²) + (a + b∛2·ω + c(∛2)²ω²) + (a + b∛2·ω² + c(∛2)²ω)
= 3a
for any a + b∛2 + c(∛2)² ∈ ℚ(∛2) (a,b,c ∈ ℚ).
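The product of conjugates elided in example (c) can be multiplied out mechanically. In the sympy sketch below (our tooling choice) ω is written explicitly as (−1 + √3·i)/2, so all simplifications reduce to exact rational arithmetic.

```python
from sympy import symbols, Rational, sqrt, I, root, expand

a, b, c = symbols('a b c')
t = root(2, 3)                      # the real cube root of 2
w = Rational(-1, 2) + sqrt(3)*I/2   # primitive third root of unity

# the three Q-homomorphic images of a + b*2^(1/3) + c*2^(2/3)
conj = [a + b*(t*w**j) + c*(t*w**j)**2 for j in range(3)]

norm  = expand(conj[0]*conj[1]*conj[2])
trace = expand(sum(conj))

assert expand(norm - (a**3 + 2*b**3 + 4*c**3 - 6*a*b*c)) == 0
assert trace == 3*a
```

All the cross terms carry a factor 1 + ω + ω² = 0, which is why only the four monomials of the norm and the single term 3a of the trace survive.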

In these examples, norm and trace are found to be in the base field. This is always true. In fact, the norm and the trace of an element are essentially coefficients of the minimal polynomial of that element. In particular, they are independent of the normal closure that we use in their definition. We now prove these assertions.

57.4 Lemma: Let K be a field and E a finite dimensional separable extension of K. Let a,b be arbitrary elements of E.

(1) N_{E/K}(ab) = N_{E/K}(a)N_{E/K}(b) and T_{E/K}(a + b) = T_{E/K}(a) + T_{E/K}(b).

(2) If b ∈ K, then N_{E/K}(b) = b^{|E:K|} and T_{E/K}(b) = |E:K|·b.

(3) If f(x) = xⁿ + a_{n−1}x^{n−1} + . . . + a₁x + a₀ ∈ K[x] is the minimal polynomial of b over K, then
N_{E/K}(b) = ((−1)ⁿa₀)^{|E:K(b)|} and T_{E/K}(b) = |E:K(b)|·(−a_{n−1}).

Proof: Let N be a normal closure of K over E and let {σ₁,σ₂, . . . ,σ_k} be the set of all K-homomorphisms from E into N. In view of the comments above, their number k is the number of roots of the minimal polynomial of a primitive element of the extension E/K, hence k = |E:K| (or use Lemma 57.1 with L = K).

(1) Clearly N_{E/K}(ab) = ((ab)σ₁)((ab)σ₂). . . ((ab)σ_k)
= (aσ₁·bσ₁)(aσ₂·bσ₂). . . (aσ_k·bσ_k)
= (aσ₁)(aσ₂). . . (aσ_k)·(bσ₁)(bσ₂). . . (bσ_k)
= N_{E/K}(a)N_{E/K}(b)
and T_{E/K}(a + b) = (a + b)σ₁ + (a + b)σ₂ + . . . + (a + b)σ_k
= (aσ₁ + bσ₁) + (aσ₂ + bσ₂) + . . . + (aσ_k + bσ_k)
= (aσ₁ + aσ₂ + . . . + aσ_k) + (bσ₁ + bσ₂ + . . . + bσ_k)
= T_{E/K}(a) + T_{E/K}(b).

(2) If b ∈ K, then bσ_i = b for all i = 1,2, . . . ,k and
N_{E/K}(b) = (bσ₁)(bσ₂). . . (bσ_k) = bb. . . b = b^k,
T_{E/K}(b) = bσ₁ + bσ₂ + . . . + bσ_k = b + b + . . . + b = kb.

(3) Let f(x) = xⁿ + a_{n−1}x^{n−1} + . . . + a₁x + a₀ ∈ K[x] be the minimal polynomial of b over K and let b = b₁,b₂, . . . ,b_n be the roots of f(x) in N. Then n = |K(b):K| and f(x) = (x − b₁)(x − b₂). . . (x − b_n). Thus
b₁ + b₂ + . . . + b_n = −a_{n−1} and b₁b₂. . . b_n = (−1)ⁿa₀.
Let us write |E:K(b)| = s, so that k = sn. There are exactly n K-homomorphisms τ₁,τ₂, . . . ,τ_n from K(b) into N (namely τ_i: b ↦ b_i). The restriction to K(b) of any σ_j (j = 1,2, . . . ,k) is one of these τ₁,τ₂, . . . ,τ_n, and each τ_i (i = 1,2, . . . ,n) can be extended to precisely s K-homomorphisms from E into N (Lemma 57.1). Let these extensions of τ_i be τ_i^(1), τ_i^(2), . . . , τ_i^(s). In this way, we obtain ns K-homomorphisms τ_i^(m): E → N (i = 1,2, . . . ,n; m = 1,2, . . . ,s). Since ns = k, we get
{σ₁,σ₂, . . . ,σ_k} = {τ_i^(m): i = 1,2, . . . ,n and m = 1,2, . . . ,s}.
Thus
N_{E/K}(b) = (bσ₁)(bσ₂). . . (bσ_k) = ∏_{i=1}^{n} ∏_{m=1}^{s} bτ_i^(m) = ∏_{i=1}^{n} (bτ_i)^s = (∏_{i=1}^{n} b_i)^s = ((−1)ⁿa₀)^s
and T_{E/K}(b) = bσ₁ + bσ₂ + . . . + bσ_k = ∑_{i=1}^{n} ∑_{m=1}^{s} bτ_i^(m) = ∑_{i=1}^{n} s(bτ_i) = s·∑_{i=1}^{n} b_i = s(−a_{n−1}).

We have already mentioned that N_{E/K}(a) and T_{E/K}(a) depend on the fields E and K. It is clear from the definition or from Lemma 57.4(3) that N_{E/K}(a) and T_{E/K}(a) will be distinct from N_{L/K}(a) and T_{L/K}(a) and also from N_{E/L}(a) and T_{E/L}(a) if L is an intermediate field (with a ∈ L in the first case).

Norm and trace behave very reasonably through intermediate fields: we have N_{E/K} = N_{L/K} ∘ N_{E/L} and T_{E/K} = T_{L/K} ∘ T_{E/L} for any intermediate field L of E/K. This is the content of the next theorem. Although we know the structure of extensions of homomorphisms in the separable case, we give a new argument that works in more general situations.

57.5 Lemma: Let K be a field, E a finite dimensional separable extension of K and L an intermediate field of E/K. Let a be an arbitrary element of E. Then
N_{E/K}(a) = N_{L/K}(N_{E/L}(a)) and T_{E/K}(a) = T_{L/K}(T_{E/L}(a)).

Proof: (The assertion is meaningful, for N_{E/L}(a) is an element of L by Lemma 57.4(3), thus we can take the norm of N_{E/L}(a) ∈ L over K. The claim is that this is equal to the norm of a ∈ E over K. Similarly for the trace.)

E
|  s, N_{E/L}
L
|  n, N_{L/K}
K

The proof has been foreshadowed in Lemma 57.4. Let N be a normal closure of K over E. Then N is Galois over K by Theorem 55.11 and N is Galois over L by Theorem 54.25(1). We choose a field M such that (i) E ⊆ M ⊆ N; (ii) M is Galois over L; (iii) A is not Galois over L for any field A with E ⊆ A ⊊ M. This is possible because N/K is Galois, N/E is also Galois and so N/E is separable (Theorem 55.10) and there are only finitely many intermediate fields of N/E (Theorem 55.15). M is a normal closure of L over E. Likewise, we choose a field R such that (i) L ⊆ R ⊆ N; (ii) R is Galois over K; (iii) A is not Galois over K for any field A with L ⊆ A ⊊ R. Thus R is a normal closure of K over L.

We put |E:K| = k, |E:L| = s and |L:K| = n, so that k = sn. Let
{σ₁,σ₂, . . . ,σ_k} be the set of all K-homomorphisms from E into N,
{ψ₁,ψ₂, . . . ,ψ_s} the set of all L-homomorphisms from E into M ⊆ N and
{τ₁,τ₂, . . . ,τ_n} the set of all K-homomorphisms from L into R ⊆ N.
Then N_{E/K}(a) = (aσ₁)(aσ₂). . . (aσ_k),
N_{E/L}(a) = (aψ₁)(aψ₂). . . (aψ_s),
N_{L/K}(b) = (bτ₁)(bτ₂). . . (bτ_n)
for any a ∈ E, b ∈ L.

N is a splitting field of a polynomial f(x) ∈ K[x] over K (Theorem 55.11 and Theorem 55.7) and therefore N is a splitting field of f(x) over L and over Lτ_i (Example 53.5(e)). The isomorphism τ_i: L → Lτ_i (⊆ N) can be extended to an isomorphism τ_i^(1): N → N (Theorem 53.7). Here of course τ_i^(1): N → N is a K-homomorphism.

We claim {σ₁,σ₂, . . . ,σ_k} = {ψ_j τ_i^(1): i = 1,2, . . . ,n; j = 1,2, . . . ,s}. Since k = ns, we must merely show that ψ_{j´}τ_{i´}^(1) ≠ ψ_j τ_i^(1) when (i,j) ≠ (i´,j´). Indeed, if ψ_{j´}τ_{i´}^(1) = ψ_j τ_i^(1), then the restrictions of ψ_{j´}τ_{i´}^(1) and ψ_j τ_i^(1) to L must be equal and, since ψ_j and ψ_{j´} fix each element in L, we get τ_{i´}^(1)|_L = τ_i^(1)|_L, so τ_{i´} = τ_i and i´ = i. Then, as τ_i^(1) is one-to-one, ψ_{j´}τ_{i´}^(1) = ψ_j τ_i^(1) implies ψ_{j´} = ψ_j and j´ = j. This establishes the claim.

Thus, since N_{E/L}(a) ∈ L by Lemma 57.4(3), we get

N_{E/K}(a) = (aσ₁)(aσ₂). . . (aσ_k) = ∏_{i=1}^{n} ∏_{j=1}^{s} aψ_j τ_i^(1)
= ∏_{i=1}^{n} (∏_{j=1}^{s} aψ_j)τ_i^(1) = ∏_{i=1}^{n} (N_{E/L}(a))τ_i^(1) = ∏_{i=1}^{n} (N_{E/L}(a))τ_i
= N_{L/K}(N_{E/L}(a))

and similarly T_{E/K}(a) = T_{L/K}(T_{E/L}(a)).

57.6 Definition: Let E be a field and let {σ₁,σ₂, . . . ,σ_k} be a finite set of field automorphisms of E. If, for any a₁,a₂, . . . ,a_k ∈ E,
a₁(bσ₁) + a₂(bσ₂) + . . . + a_k(bσ_k) = 0 for all b ∈ E
implies a₁ = a₂ = . . . = a_k = 0, then {σ₁,σ₂, . . . ,σ_k} is said to be linearly independent.

Equivalently, {σ₁,σ₂, . . . ,σ_k} is linearly independent provided, for each k-tuple (a₁,a₂, . . . ,a_k) of elements from E, where at least one a_i is distinct from 0, there is a b ∈ E such that
a₁(bσ₁) + a₂(bσ₂) + . . . + a_k(bσ_k) ≠ 0.
57.7 Lemma: Let E be a field and let {σ₁,σ₂, . . . ,σ_k} be a finite set of field automorphisms of E. If σ₁,σ₂, . . . ,σ_k are pairwise distinct, then {σ₁,σ₂, . . . ,σ_k} is linearly independent.

Proof: (cf. Lemma 54.15; note that we do not assume {σ₁,σ₂, . . . ,σ_k} is a group.) Suppose, by way of contradiction, that σ₁,σ₂, . . . ,σ_k are distinct automorphisms of E and that {σ₁,σ₂, . . . ,σ_k} is not linearly independent. Then there are elements a₁,a₂, . . . ,a_k, not all zero, in E such that
a₁(bσ₁) + a₂(bσ₂) + . . . + a_k(bσ_k) = 0 for all b ∈ E. (1)

Let r be the smallest number of nonzero components a_i in all the k-tuples (a₁,a₂, . . . ,a_k) ∈ E×E× . . . ×E \ {(0,0,. . . ,0)} satisfying (1) and choose a k-tuple (c₁,c₂, . . . ,c_k) with exactly r nonzero components. We have r ≥ 2, for if r = 1, then putting b = 1 in (1) yields that the single nonzero component is 0, a contradiction. Renumbering the automorphisms, we may assume c₁, . . . ,c_r are distinct from zero and (in case r ≠ k) c_{r+1} = . . . = c_k = 0.

Then c₁(bσ₁) + c₂(bσ₂) + . . . + c_r(bσ_r) = 0 for all b ∈ E. (2)

Since σ₁,σ₂, . . . ,σ_k are distinct, σ₁ ≠ σ₂ and there is a u ∈ E with uσ₁ ≠ uσ₂. Writing ub in place of b in (2) and using (ub)σ_i = (uσ_i)(bσ_i), we get

c₁(uσ₁)(bσ₁) + c₂(uσ₂)(bσ₂) + . . . + c_r(uσ_r)(bσ_r) = 0 for all b ∈ E. (3)

Multiplying (2) by uσ₁, we obtain

c₁(uσ₁)(bσ₁) + c₂(uσ₁)(bσ₂) + . . . + c_r(uσ₁)(bσ_r) = 0 for all b ∈ E. (4)

Subtraction gives

[c₂(uσ₂ − uσ₁)](bσ₂) + . . . + [c_r(uσ_r − uσ₁)](bσ_r) = 0 for all b ∈ E,

where at least c₂(uσ₂ − uσ₁) ≠ 0. Hence there is a k-tuple

(0, c₂(uσ₂ − uσ₁), . . . , c_r(uσ_r − uσ₁), 0, . . . ,0) ≠ (0,0, . . . ,0)

with at most r − 1 nonzero components satisfying (1), contrary to the definition of r. Therefore {σ₁,σ₂, . . . ,σ_k} is linearly independent.

We now characterize all elements with trace 0 and all elements with norm 1 in the case of a Galois extension with a finite cyclic Galois group. The second part of Theorem 57.9 (formulated for finite dimensional extensions of ℚ) is the theorem with number 90 in D. Hilbert's (1862-1943) famous report on algebraic number theory and is known as "Hilbert's theorem 90". It is the beginning of cohomology theory.

57.8 Definition: Let E/K be a field extension. If E is algebraic and Galois over K, and if the Galois group Aut_K E is cyclic, then E is called a cyclic extension of K and E/K is said to be cyclic.

57.9 Theorem: Let E/K be a finite dimensional cyclic extension and let σ be a generator of Aut_K E. Let a ∈ E.
(1) T_{E/K}(a) = 0 if and only if there is an element b ∈ E with a = b − bσ.
(2) N_{E/K}(a) = 1 if and only if there is an element b ∈ E\{0} with a = b/bσ.

Proof: Let |E:K| = n. Then |Aut_K E| = n by the fundamental theorem of Galois theory and Aut_K E = {1, σ, σ², . . . , σ^{n−2}, σ^{n−1}}, with σⁿ = 1 and o(σ) = n. For convenience, we write T instead of T_{E/K} and N instead of N_{E/K}.

(1) If a = b − bσ, then we have a telescoping sum:
T(a) = a + aσ + aσ² + . . . + aσ^{n−2} + aσ^{n−1}
= (b − bσ) + (b − bσ)σ + (b − bσ)σ² + . . . + (b − bσ)σ^{n−2} + (b − bσ)σ^{n−1}
= (b − bσ) + (bσ − bσ²) + (bσ² − bσ³) + . . . + (bσ^{n−2} − bσ^{n−1}) + (bσ^{n−1} − bσⁿ)
= b − bσⁿ = b − b = 0.

Conversely, assume that T(a) = 0. We first find an element c in E with T(c) = 1. Since o(σ) = n, the automorphisms 1, σ, σ², . . . , σ^{n−2}, σ^{n−1} are distinct and {1, σ, σ², . . . , σ^{n−2}, σ^{n−1}} is linearly independent by Lemma 57.7. So there is a u ∈ E with
T(u) = u + uσ + uσ² + . . . + uσ^{n−2} + uσ^{n−1} ≠ 0.
Let c = u/T(u). Since T(u) ∈ K and E is Galois over K, we have (T(u))σ^j = T(u) for any j = 0,1,2, . . . ,n−1 and thus
T(c) = u/T(u) + (u/T(u))σ + (u/T(u))σ² + . . . + (u/T(u))σ^{n−1}
= u/T(u) + (uσ)/T(u) + (uσ²)/T(u) + . . . + (uσ^{n−1})/T(u)
= (u + uσ + uσ² + . . . + uσ^{n−1})/T(u) = T(u)/T(u) = 1.

We put b = ac + (a + aσ)(cσ) + (a + aσ + aσ²)(cσ²) + (a + aσ + aσ² + aσ³)(cσ³)
+ . . . + (a + aσ + aσ² + . . . + aσ^{n−2})(cσ^{n−2}) ∈ E.
Then
b − bσ = ac + (a + aσ)(cσ) + (a + aσ + aσ²)(cσ²)
+ (a + aσ + aσ² + aσ³)(cσ³)
+ . . . + (a + aσ + aσ² + . . . + aσ^{n−2})(cσ^{n−2})
− (aσ)(cσ) − (aσ + aσ²)(cσ²) − (aσ + aσ² + aσ³)(cσ³)
− (aσ + aσ² + aσ³ + aσ⁴)(cσ⁴)
− . . . − (aσ + aσ² + aσ³ + . . . + aσ^{n−1})(cσ^{n−1})
= ac + a(cσ) + a(cσ²) + . . . + a(cσ^{n−2}) − (aσ + aσ² + . . . + aσ^{n−1})(cσ^{n−1})
= ac + a(cσ) + a(cσ²) + . . . + a(cσ^{n−2}) − (T(a) − a)(cσ^{n−1})
= ac + a(cσ) + a(cσ²) + a(cσ³) + . . . + a(cσ^{n−1}) = aT(c) = a.

Hence a = b − bσ for some b ∈ E when T(a) = 0.

(2) If a = b/bσ for some b ∈ E\{0}, then
N(a) = (b/bσ)·((b/bσ)σ)·((b/bσ)σ²). . . ((b/bσ)σ^{n−2})·((b/bσ)σ^{n−1})
= (b/bσ)·(bσ/bσ²)·(bσ²/bσ³). . . (bσ^{n−1}/bσⁿ) = b/bσⁿ = b/b = 1.

Conversely, assume N(a) = 1. Then of course a ≠ 0. From Lemma 57.7, it follows that there is a d ∈ E for which

b := ad + (a·aσ)(dσ) + (a·aσ·aσ²)(dσ²) + . . . + (a·aσ·aσ² . . . aσ^{n−2})(dσ^{n−2})
+ (a·aσ·aσ² . . . aσ^{n−2}·aσ^{n−1})(dσ^{n−1})

is distinct from 0. Then we get

a(bσ) = a(aσ)(dσ) + a(aσ·aσ²)(dσ²) + a(aσ·aσ²·aσ³)(dσ³)
+ . . . + a(aσ·aσ² . . . aσ^{n−1})(dσ^{n−1}) + a(aσ·aσ² . . . aσ^{n−1}·aσⁿ)(dσⁿ)

= (b − ad) + a·N(a)·(dσⁿ) = b − ad + a·1·d = b.

Since b ≠ 0, also bσ ≠ 0, and a(bσ) = b gives a = b/bσ.
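Both directions of Theorem 57.9 can be watched in the small cyclic extension E = ℚ(√2), K = ℚ, with σ: √2 ↦ −√2. The sympy sketch below uses particular elements of our own choosing (b = √2/2 for the trace part, b = 2 + √2 for the norm part).

```python
from sympy import sqrt, expand

def sigma(e):
    # the nontrivial automorphism of Q(sqrt(2)): sqrt(2) -> -sqrt(2)
    return e.subs(sqrt(2), -sqrt(2))

# (1) trace zero: a = sqrt(2) has T(a) = a + a*sigma = 0,
#     and indeed a = b - b*sigma with b = sqrt(2)/2
a = sqrt(2); b = sqrt(2)/2
assert expand(a + sigma(a)) == 0
assert expand((b - sigma(b)) - a) == 0

# (2) norm one: a = 3 + 2*sqrt(2) has N(a) = a*(a*sigma) = 1,
#     and indeed a = b / b*sigma with b = 2 + sqrt(2)
a = 3 + 2*sqrt(2); b = 2 + sqrt(2)
assert expand(a*sigma(a)) == 1
assert expand(a*sigma(b) - b) == 0   # i.e. a = b / (b sigma)
```

The element b in part (2) is found exactly as in the proof: a·d + (a·aσ)(dσ) with d = 1 gives a + a·aσ = 3 + 2√2 + 1 = 2(2 + √2), a nonzero multiple of 2 + √2.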

We close this paragraph with two applications of Theorem 57.9. We
describe cyclic extensions. The degree is the characteristic in the first
case and relatively prime to the characteristic in the second case.

57.10 Theorem: Let K be a field of characteristic p ≠ 0 and let E be a cyclic extension of K with |E:K| = p. Then there is an a ∈ K such that f(x) = x^p − x − a ∈ K[x] is irreducible in K[x] and E = K(t) for any root t of f(x).

Proof: By hypothesis, E is Galois over K and Aut_K E is cyclic of order p, say Aut_K E = ⟨σ⟩. Then T_{E/K}(1) = 1 + 1σ + 1σ² + . . . + 1σ^{p−1} = p·1 = 0, so 1 = b − bσ for some b ∈ E (Theorem 57.9(1)). Let u = −b. Then uσ = 1 + u and we get u^pσ = (uσ)^p = (1 + u)^p = 1^p + u^p = 1 + u^p. Hence
(u^p − u)σ = u^pσ − uσ = (1 + u^p) − (1 + u) = u^p − u
and u^p − u is fixed by σ and thus by all automorphisms in Aut_K E. Since E is Galois over K, this gives u^p − u ∈ K. Let us put u^p − u = a. Thus u is a root of f(x) = x^p − x − a ∈ K[x].

It remains to show that f(x) is irreducible over K and that E = K(t) for any root t of f(x). Since b is not fixed by σ, we see b ∉ K, so u ∉ K and thus K ⊊ K(u) ⊆ E. But |E:K| = p is prime and so there is no intermediate field of E/K distinct from K and E. This forces K(u) = E. Then deg f(x) = p = |E:K| = |K(u):K| = degree of the minimal polynomial of u over K. Since the minimal polynomial of u over K divides f(x), we deduce that f(x) is the minimal polynomial of u over K. In particular, f(x) is irreducible in K[x].

Now for any j ∈ ℤ_p ⊆ K, there holds j^p = j and consequently
f(u + j) = (u + j)^p − (u + j) − a = u^p + j^p − u − j − a = u^p − u − a = 0.
So u, u + 1, u + 2, . . . , u + (p − 1) ∈ E are roots of f(x). Since f(x) has p roots, any root t of f(x) is equal to u + j for some j ∈ ℤ_p. So we get K(t) = K(u + j) = K(u) = E for any root t of f(x).
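The dichotomy underlying this proof (see also Exercise 3 below: in characteristic p, a polynomial x^p − x − a either splits in K or is irreducible over K; over the prime field itself it splits exactly when a = 0, a standard fact we use here as an assumption of the example) can be checked for small p with sympy:

```python
from sympy import symbols, Poly

x = symbols('x')
p = 5

f = Poly(x**p - x - 1, x, modulus=p)   # a = 1 != 0: irreducible over F_5
g = Poly(x**p - x, x, modulus=p)       # a = 0: splits, x^5 - x = prod over j in F_5 of (x - j)

assert f.is_irreducible
assert not g.is_irreducible
# by the theorem, if t is one root of f in an extension field,
# the full root set is t, t+1, t+2, t+3, t+4
```

Adjoining a root of f thus produces the (unique) cyclic degree-5 extension of 𝔽₅ described by Theorem 57.10.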

57.11 Theorem: Let K be a field and let E be a cyclic extension of K of degree |E:K| = n. Assume that either char K = 0, or char K ≠ 0 but char K does not divide n. Assume, in addition, that xⁿ − 1 splits in K. Then there is an a ∈ K such that f(x) = xⁿ − a ∈ K[x] is irreducible in K[x] and E = K(u) for any root u of f(x).

Proof: By hypothesis, AutK E is a cyclic group, say AutK E = , and o( ) =
n
|Aut_K E| = |E:K| = n. All roots of the polynomial xⁿ − 1, which splits in K, are
simple since its derivative nxⁿ⁻¹ ≠ 0 in view of the assumption on char K.
Thus there are exactly n distinct roots of xⁿ − 1 in K. Since rⁿ = sⁿ = 1
implies (rs)ⁿ = 1, the roots of xⁿ − 1 make up a subgroup of K×. Any finite
subgroup of K× is cyclic (Theorem 52.18), so the roots of xⁿ − 1 form a
cyclic group of order n. Let r ∈ K be a generator of this group, so that the
n roots of xⁿ − 1 are 1, r, r², ..., rⁿ⁻¹.

We have N_{E/K}(r) = rⁿ = 1 (Lemma 57.4(2)) and so there is a b ∈ E with
r = b/bσ (Theorem 57.9(2)). Let u = 1/b. Then u ∈ E\{0} and uσ = ur. This
implies (uⁿ)σ = (uσ)ⁿ = (ur)ⁿ = uⁿrⁿ = uⁿ, so uⁿ is fixed by σ, so by Aut_K E, and
therefore uⁿ ∈ K. Let us put uⁿ = a.

Then xⁿ − a ∈ K[x] and this polynomial has n roots u, ur, ur², ..., urⁿ⁻¹ in
K(u), which are all distinct. So xⁿ − a splits in K(u), but not in a proper
subfield of K(u) containing K, since any intermediate field of K(u)/K in
which xⁿ − a splits must contain the root u and hence must be identical
with K(u). Thus K(u) is a splitting field of xⁿ − a over K. Since the roots of
xⁿ − a are distinct, the irreducible factors of xⁿ − a are separable over K
and thus K(u) is Galois over K (Theorem 55.7). In particular, |Aut_K K(u)| =
|K(u):K|.

Any K-automorphism σʲ ∈ Aut_K E (j = 0, 1, 2, ..., n−1) sends u to urʲ ∈ K(u),
thus the restriction of σʲ to K(u) is a K-automorphism of K(u) (Theorem
42.22). Since uσⁱ = urⁱ ≠ urʲ = uσʲ when i, j ∈ {0, 1, 2, ..., n−1} and i ≠ j, we
see that these K-automorphisms of K(u) are distinct. Hence there are at
least n K-automorphisms of K(u). This implies |K(u):K| = |Aut_K K(u)| ≥ n.
From n = |E:K| ≥ |K(u):K| ≥ n, we get |K(u):K| = n, whence E = K(u).

Finally, since the minimal polynomial of u over K divides xⁿ − a and

deg(xⁿ − a) = n = |K(u):K| = degree of the minimal polynomial of u over K,

we deduce that xⁿ − a is the minimal polynomial of u over K and xⁿ − a is
irreducible in K[x]. Moreover, any root t of xⁿ − a is equal to urʲ for some
j = 0, 1, 2, ..., n−1 and, since r ∈ K, we get K(t) = K(urʲ) = K(u) = E for any
root t of xⁿ − a.
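The construction in the proof can be traced numerically in a small case. The sketch below is my own illustration, not from the text: it takes K = ℚ(i), which contains the 4-th roots of unity (generated by r = i), and a = 2, so that the four roots of x⁴ − 2 are u, ur, ur², ur³ with u = 2^(1/4).

```python
import cmath

# K = Q(i) contains the 4-th roots of unity; r = i generates them.
n, a = 4, 2
r = 1j
u = a ** (1 / n)                       # one root of x^n - a (here 2^(1/4))
roots = [u * r**j for j in range(n)]   # the n roots u, ur, ur^2, ..., ur^(n-1)

# every u*r^j is a root of x^n - a ...
assert all(abs(z**n - a) < 1e-9 for z in roots)
# ... and the n roots are pairwise distinct
assert all(abs(roots[i] - roots[j]) > 0.1
           for i in range(n) for j in range(i + 1, n))
```

So K(u) already contains all n roots, matching the claim that K(u) is a splitting field of xⁿ − a over K.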

Exercises

1. Let K be a field and let E be a finite dimensional separable extension
of K. Prove that, for any k ∈ K, there is an a ∈ E such that T_{E/K}(a) = k.

2. Let K ⊆ L ⊆ E ⊆ N be fields and assume that N is normal over K. If s is
the cardinal number of L-homomorphisms from E into N and n is the
cardinal number of K-homomorphisms from L into N, prove that sn is
the cardinal number of K-homomorphisms from E into N.

3. Let K be a field of characteristic p ≠ 0 and f(x) = xᵖ − x − a ∈ K[x]. Show
that f(x) either splits in K or is irreducible in K[x].

4. Let K be a field of characteristic p ≠ 0 and f(x) = xᵖ − x − a ∈ K[x]. Prove
that if f(x) is irreducible in K[x] and u is a root of f(x), then K(u) is a cyclic
extension of K of degree p.

5. Let K be a field and n ∈ ℕ. Assume that either char K = 0 or char K ≠ 0
but char K does not divide n. Assume that xⁿ − 1 splits in K. Prove that, if
a ∈ K and u is a root of f(x) = xⁿ − a ∈ K[x], then K(u) is a cyclic extension of
K, |K(u):K| divides n and u^{|K(u):K|} ∈ K.

§58
Cyclotomic Fields

The theory of cyclotomy is concerned with the problem of dividing the
perimeter of a circle into a given number of equal parts (cyclotomy
means: circle-division). Consider the unit circle in the complex plane. The
points dividing this unit circle into n equal parts are the points e^{2πik/n} =
cos(2πk/n) + i sin(2πk/n), and the geometric problem of cyclotomy is
equivalent to studying the fields ℚ(e^{2πi/n}). The complex numbers
e^{2πik/n} are roots of the polynomial xⁿ − 1 and ℚ(e^{2πi/n}) is a splitting field
of xⁿ − 1. The splitting fields of such polynomials over any field K will be
called cyclotomic fields (although they may not be relevant to the geo-
metric problem of circle division).

58.1 Definition: Let K be a field and 1 ∈ K the identity element of K.
Let n ∈ ℕ. An extension field E of K is called a cyclotomic extension of K
(of order n) if E is a splitting field of xⁿ − 1 ∈ K[x] over K.

58.2 Definition: Let K be a field. A root of the polynomial xⁿ − 1 ∈ K[x]
is called an n-th root of unity or, if there is no need to be exact, simply a
root of unity.

58.3 Lemma: Let K be a field of characteristic p ≠ 0 and let n ∈ ℕ,
where n = pᵃm and (p,m) = 1. Let u be an element in an extension field
of K. Then u is an n-th root of unity if and only if u is an m-th root of
unity.

Proof: If u is an m-th root of unity, then uᵐ = 1, so uⁿ = (uᵐ)^{pᵃ} = 1^{pᵃ} = 1
and u is an n-th root of unity. If u is an n-th root of unity, then 0 = uⁿ −
1 = (uᵐ)^{pᵃ} − 1 = (uᵐ − 1)^{pᵃ}, so uᵐ − 1 = 0 and u is an m-th root of unity.

So in the situation of Lemma 58.3, a splitting field of xⁿ − 1 over K is also
a splitting field of xᵐ − 1 over K, and conversely. For this reason, in case
char K ≠ 0, it is no loss of generality to assume that the order of a
cyclotomic extension is relatively prime to the characteristic of K.

58.4 Lemma: Let K be a field and E an extension field of K containing
all n-th roots of unity. Assume char K = 0 or (char K,n) = 1. Then the set
of all n-th roots of unity is a cyclic group of order n under multiplication.

Proof: If u and t are n-th roots of unity, then (ut)ⁿ = uⁿtⁿ = 1·1 = 1 and
ut is also an n-th root of unity. Since the number of n-th roots of unity is
at most n (Theorem 35.7), it follows that the set of all n-th roots of unity
is a subgroup of E× (Lemma 9.3(1)). This group of n-th roots of unity is
cyclic by Theorem 52.18. To prove that the order of this group is equal
to n, we must only show that all roots of xⁿ − 1 are simple. This follows
from the fact that the derivative nxⁿ⁻¹ of xⁿ − 1 is distinct from zero
(because of the assumption char K = 0 or (char K,n) = 1), so that xⁿ − 1 and
nxⁿ⁻¹ have no common root.

58.5 Definition: Let K be a field and E an extension field of K


containing all n-th roots of unity. Assume char K = 0 or (char K,n) = 1. A
generator of the cyclic group of all n-th roots of unity is called a
primitive n-th root of unity.

ζ is a primitive n-th root of unity if and only if o(ζ) = n. If ζ is a primi-
tive n-th root of unity, then all n-th roots of unity are given without
duplication in the list
1 = ζ⁰, ζ¹, ζ², ζ³, ..., ζⁿ⁻¹
or in the list
ζ, ζ², ζ³, ..., ζⁿ⁻¹, ζⁿ = 1,
and ζʲ has order n/(n,j) (Lemma 11.9(2)). Hence ζʲ is a primitive n-th
root of unity if and only if (n,j) = 1. There are therefore φ(n) primitive n-
th roots of unity (cf. §11).
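A quick computation (my own illustration, not the book's) confirms this count: among the powers ζʲ of a primitive n-th root of unity, exactly the φ(n) powers with (j,n) = 1 have order n.

```python
import cmath
from math import gcd

def order(z, n, eps=1e-9):
    # multiplicative order of the root of unity z (smallest d with z^d = 1)
    return next(d for d in range(1, n + 1) if abs(z**d - 1) < eps)

n = 12
zeta = cmath.exp(2j * cmath.pi / n)        # a primitive n-th root of unity
primitive = [j for j in range(1, n + 1) if order(zeta**j, n) == n]

assert primitive == [j for j in range(1, n + 1) if gcd(j, n) == 1]
assert len(primitive) == 4                 # phi(12) = 4
```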

If u is a root of unity and o(u) = d, then, by definition, u is a primitive d-
th root of unity.

1 is a primitive first root of unity, −1 is a primitive second root of
unity, ω = (−1 + i√3)/2 and ω² are primitive third roots of unity, and i
and −i are primitive fourth roots of unity.

58.6 Definition: Let K be a field and n ∈ ℕ. Assume that char K = 0 or
(char K,n) = 1. Let ζ be a primitive n-th root of unity and

{ζ₁, ζ₂, ..., ζ_{φ(n)}} = {ζʲ : j = 1, 2, ..., n and (n,j) = 1}

the set of all primitive n-th roots of unity in some extension field of K.
The monic polynomial

(x − ζ₁)(x − ζ₂)···(x − ζ_{φ(n)})

of degree φ(n) is called the n-th cyclotomic polynomial over K and is
denoted by Φₙ(x).

For example, over ℚ, the first few cyclotomic polynomials are

Φ₁(x) = x − 1,  Φ₂(x) = x − (−1) = x + 1,
Φ₃(x) = (x − ω)(x − ω²) = x² + x + 1,  Φ₄(x) = (x − i)(x + i) = x² + 1

(ω = (−1 + i√3)/2 as above). We see that these are in fact polynomials in
ℤ[x]. This is true for any cyclotomic polynomial. The n-th cyclotomic
polynomial over K does not depend on the extension field of K in which
the primitive n-th roots of unity are assumed to lie. In fact, it does not
even depend on K (but only on char K).

58.7 Lemma: Let K be a field, n ∈ ℕ and assume that char K = 0 or
(char K,n) = 1. Then

(1) xⁿ − 1 = ∏_{d|n} Φ_d(x).
(2) Φₙ(x) ∈ ℤ[x] if char K = 0 and Φₙ(x) ∈ ℤₚ[x] if char K = p ≠ 0.

Proof: (1) Any root u of xⁿ − 1 is an n-th root of unity and o(u) = d for
some divisor d of n. Then u is a primitive d-th root of unity. Conversely, if
d|n, any primitive d-th root of unity u is an n-th root of unity with o(u) =
d. Thus Φ_d(x) = ∏_{uⁿ=1, o(u)=d} (x − u). Collecting together the roots of xⁿ − 1 with
order d, for each divisor d of n, we get

xⁿ − 1 = ∏_{uⁿ=1} (x − u) = ∏_{d|n} ∏_{uⁿ=1, o(u)=d} (x − u) = ∏_{d|n} Φ_d(x).

(2) Let D = ℤ in case char K = 0 and D = ℤₚ in case char K = p ≠ 0. We
prove Φₙ(x) ∈ D[x] by induction on n. Since Φ₁(x) = x − 1 and Φ₂(x) = x +
1, we have Φₙ(x) ∈ D[x] when n = 1, 2.

Suppose now n ≥ 3 and that Φ_d(x) ∈ D[x] for all d = 1, 2, ..., n−1. From
(1), we have xⁿ − 1 = Φₙ(x) ∏_{d|n, d≠n} Φ_d(x). Let us put f(x) = ∏_{d|n, d≠n} Φ_d(x). Then
f(x) is a monic polynomial and f(x) ∈ D[x] since, by induction, Φ_d(x) ∈ D[x]
for all divisors d of n which are distinct from n. As xⁿ − 1 ∈ D[x] and f(x)
is monic, there are unique polynomials q(x) and r(x) in D[x] such that

xⁿ − 1 = q(x)f(x) + r(x),   r(x) = 0 or deg r(x) < deg f(x)

(Theorem 34.4). Now let E be an extension field of K containing all roots
of xⁿ − 1. The division algorithm in E[x] reads

xⁿ − 1 = Φₙ(x)f(x) + 0.

Since D ⊆ K ⊆ E and the quotient and remainder are uniquely determin-
ed, the unique quotient q(x) in D[x] must be the unique quotient Φₙ(x) in
E[x] and the unique remainder r(x) in D[x] must be the unique remainder
0 in E[x]. Hence Φₙ(x) = q(x) ∈ D[x]. This completes the proof.

The equation Φₙ(x) = (xⁿ − 1) / ∏_{d|n, d≠n} Φ_d(x) is a recursive formula for Φₙ(x). Thus

Φ₆(x) = (x⁶ − 1)/Φ₁(x)Φ₂(x)Φ₃(x)
= (x⁶ − 1)/((x − 1)(x + 1)(x² + x + 1)) = x² − x + 1.
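This recursion is easy to carry out mechanically. The following sketch (my own code, not the book's) represents integer polynomials as coefficient lists, lowest degree first, and computes Φₙ(x) by exact division of xⁿ − 1 by the product of the lower cyclotomic polynomials.

```python
def poly_mul(p, q):
    # product of two coefficient lists (lowest degree first)
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_divexact(num, den):
    # exact division of integer polynomials; den is monic here
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        c = num[i + len(den) - 1] // den[-1]
        quot[i] = c
        for j, b in enumerate(den):
            num[i + j] -= c * b
    assert all(c == 0 for c in num), "division was not exact"
    return quot

def cyclotomic(n):
    # Phi_n(x) = (x^n - 1) / prod_{d | n, d < n} Phi_d(x)
    num = [-1] + [0] * (n - 1) + [1]       # x^n - 1
    den = [1]
    for d in range(1, n):
        if n % d == 0:
            den = poly_mul(den, cyclotomic(d))
    return poly_divexact(num, den)

# Phi_6(x) = x^2 - x + 1, as computed in the text
assert cyclotomic(6) == [1, -1, 1]
```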

Another recursive formula for Φₙ(x) is given in the next lemma.

58.8 Lemma: Let K be a field, n ∈ ℕ and assume that char K = 0 or
(char K,n) = 1. Then, μ denoting the Möbius function,

Φₙ(x) = ∏_{d|n} (xᵈ − 1)^{μ(n/d)} = ∏_{d|n} (x^{n/d} − 1)^{μ(d)}.

Proof: This follows immediately from Lemma 58.7(1) and Lemma 52.14
(in Lemma 52.14, let the field be K(x) and let the function F: ℕ → K(x)
be n ↦ Φₙ(x)).

For example, we have, over ℚ:

Φ₁₂(x) = (x¹² − 1)^{μ(1)} (x⁶ − 1)^{μ(2)} (x⁴ − 1)^{μ(3)} (x³ − 1)^{μ(4)} (x² − 1)^{μ(6)} (x − 1)^{μ(12)}
= (x¹² − 1)(x² − 1)/((x⁶ − 1)(x⁴ − 1)) = x⁴ − x² + 1,

Φ₁₅(x) = (x¹⁵ − 1)^{μ(1)} (x⁵ − 1)^{μ(3)} (x³ − 1)^{μ(5)} (x − 1)^{μ(15)}
= (x¹⁵ − 1)(x − 1)/((x⁵ − 1)(x³ − 1))
= x⁸ − x⁷ + x⁵ − x⁴ + x³ − x + 1.
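These products can be cross-checked numerically. The sketch below is my own (not the book's): it evaluates the Möbius product at an integer q with exact rational arithmetic, using a small hypothetical helper `mobius` that computes μ by trial factorization.

```python
from fractions import Fraction

def mobius(d):
    # Mobius function mu(d), by trial factorization
    primes = 0
    for p in range(2, d + 1):
        if d % p == 0:
            if d % (p * p) == 0:
                return 0
            primes += 1
            d //= p
    return (-1) ** primes

def phi_value(n, q):
    # Phi_n(q) = prod over d | n of (q^(n/d) - 1)^mu(d)
    val = Fraction(1)
    for d in range(1, n + 1):
        if n % d == 0:
            val *= Fraction(q ** (n // d) - 1) ** mobius(d)
    return val

# Phi_12(x) = x^4 - x^2 + 1 and Phi_15(x) = x^8 - x^7 + x^5 - x^4 + x^3 - x + 1
assert phi_value(12, 2) == 2**4 - 2**2 + 1
assert phi_value(15, 2) == 2**8 - 2**7 + 2**5 - 2**4 + 2**3 - 2 + 1
```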

58.10 Theorem: Let K be a field, n ∈ ℕ and assume that char K = 0 or
(char K,n) = 1. Let E be a cyclotomic extension of K of order n, let ζ ∈ E be
a primitive n-th root of unity and let f(x) ∈ K[x] be the minimal
polynomial of ζ over K. Then
(1) E = K(ζ);
(2) E is Galois over K;
(3) |Aut_K E| divides φ(n) and Aut_K E is isomorphic to a subgroup of ℤₙ×;
(4) Aut_K E ≅ ℤₙ× ⟺ |Aut_K E| = φ(n) ⟺ f(x) = Φₙ(x) ⟺
Φₙ(x) is irreducible in K[x].

Proof: (1) Let a₁, a₂, ..., a_k be the natural numbers less than n and
relatively prime to n (where k = φ(n)), so that ζ^{a₁}, ζ^{a₂}, ..., ζ^{a_k} are the roots
of Φₙ(x). Now E is a splitting field of xⁿ − 1 by definition, so E is generated
over K by the roots of xⁿ − 1 (Example 53.5(d)); these roots are powers of
ζ, so E = K(ζ^{a₁}, ζ^{a₂}, ..., ζ^{a_k}) = K(ζ).

(2) The roots of Φₙ(x) are simple because Φₙ(x) is a divisor of xⁿ − 1 and
the roots of xⁿ − 1 are simple (the derivative nxⁿ⁻¹ of xⁿ − 1, being distinct
from 0 since char K = 0 or (char K,n) = 1, is relatively prime to xⁿ − 1). So
the irreducible factors of Φₙ(x) are separable over K. Since E is a splitting
field of Φₙ(x), Theorem 55.7 shows that E is Galois over K.

(3) Since ζ is a root of Φₙ(x) ∈ K[x] and f(x) is the minimal polynomial of ζ
over K, we see f(x) divides Φₙ(x) in K[x] and the roots of f(x) are certain
of the roots of Φₙ(x). Let deg f(x) = s and ζ^{m₁}, ζ^{m₂}, ..., ζ^{m_s} be the roots of
f(x), where m₁, m₂, ..., m_s are some suitable natural numbers relatively
prime to n and less than n, and m₁ = 1, say. Thus

f(x) = (x − ζ^{m₁})(x − ζ^{m₂})···(x − ζ^{m_s}).

Here we have |Aut_K E| = |E:K| = |K(ζ):K| = deg f(x) = s because E is Galois
over K. Any K-automorphism of E maps ζ to one of ζ^{m₁}, ζ^{m₂}, ..., ζ^{m_s}. Let σ_{m_i}
be the K-automorphism ζ ↦ ζ^{m_i} (i = 1, 2, ..., s). Since

σ_{m_i} = σ_{m_j} ⟺ ζ^{m_i} = ζ^{m_j} ⟺ m_i ≡ m_j (mod n) ⟺ i = j,

σ_{m₁}, σ_{m₂}, ..., σ_{m_s} are pairwise distinct and Aut_K E = {σ_{m₁}, σ_{m₂}, ..., σ_{m_s}}.

Let m_i* be the residue class of m_i in ℤₙ. Since m_i and n are relatively
prime, there holds m_i* ∈ ℤₙ×. We put G = {m₁*, m₂*, ..., m_s*} ⊆ ℤₙ×. Consider
the mapping

α: G → Aut_K E,  m_i* ↦ σ_{m_i}.

As σ_{m_i} = σ_{m_j} ⟺ m_i* = m_j*, the mapping α is well defined and one-to-one.
Both G and Aut_K E have s elements, so α is also onto Aut_K E. Then α has an
inverse β:

β: Aut_K E → G ⊆ ℤₙ×,  σ_{m_i} ↦ m_i*.

Suppose σ_{m_i}σ_{m_j} = σ_{m_k}. Then

ζ^{m_k} = ζσ_{m_k} = ζσ_{m_i}σ_{m_j} = (ζσ_{m_i})σ_{m_j} = (ζ^{m_i})σ_{m_j} = (ζσ_{m_j})^{m_i} = (ζ^{m_j})^{m_i} = ζ^{m_i m_j},

so m_k ≡ m_i m_j (mod n), so m_i* m_j* = m_k* and therefore

(σ_{m_i}σ_{m_j})β = (σ_{m_k})β = m_k* = m_i* m_j* = (σ_{m_i})β (σ_{m_j})β.

Hence β: Aut_K E → ℤₙ× is a one-to-one group homomorphism, Im β = G
is a subgroup of ℤₙ× and β is an isomorphism from Aut_K E onto G. This
proves that Aut_K E is isomorphic to a subgroup of ℤₙ×. It follows from
Lagrange's theorem that |Aut_K E| = |G| divides |ℤₙ×| = φ(n).

(4) Since Aut_K E is isomorphic to a subgroup of ℤₙ× and |ℤₙ×| = φ(n) is finite,
we have the equivalence Aut_K E ≅ ℤₙ× ⟺ |Aut_K E| = φ(n).

We have |Aut_K E| = deg f(x) and φ(n) = deg Φₙ(x). Now f(x) divides Φₙ(x) in
K[x] and both f(x) and Φₙ(x) are monic, so f(x) = Φₙ(x) if and only if
deg f(x) = deg Φₙ(x), so if and only if |Aut_K E| = φ(n).

Finally, since Φₙ(x) is monic and ζ is a root of Φₙ(x), irreducibility of
Φₙ(x) in K[x] implies that Φₙ(x) is the minimal polynomial of ζ over K, i.e.,
that f(x) = Φₙ(x). Conversely, if f(x) = Φₙ(x), then Φₙ(x) is irreducible.

When the base field is ℚ, we have sharper results.

58.11 Theorem: For any n ∈ ℕ, the n-th cyclotomic polynomial Φₙ(x)
over ℚ is irreducible in ℤ[x].

Proof: Let n ∈ ℕ and let g(x) be an irreducible divisor of Φₙ(x) in ℤ[x],
with deg g(x) ≥ 1, so that Φₙ(x) = g(x)h(x), say, where g(x), h(x) ∈ ℤ[x]
are monic polynomials. Let ζ be a root of g(x). Thus g(x) is the minimal
polynomial of ζ over ℚ.

Our first step will be to show that ζᵖ is also a root of g(x) for any prime
number p relatively prime to n. Now ζ is a root of Φₙ(x), so o(ζ) = n,
and if p is a prime number such that (p,n) = 1, then o(ζᵖ) = n and ζᵖ is
also a primitive n-th root of unity: ζᵖ is a root of Φₙ(x), so ζᵖ is a root of
g(x) or of h(x). Let us assume, by way of contradiction, that ζᵖ is not a
root of g(x). Then ζᵖ is a root of h(x). Then ζ is a root of h(xᵖ) and h(xᵖ) is
divisible by the minimal polynomial g(x) of ζ over ℚ.

Let us write h(xᵖ) = g(x)p(x), where p(x) ∈ ℚ[x]. Let

h(xᵖ) = g(x)q(x) + r(x),   r(x) = 0 or deg r(x) < deg g(x)

be the division algorithm in ℤ[x] (g(x) is monic). The uniqueness of the
quotient and remainder in ℚ[x] ⊇ ℤ[x] implies p(x) = q(x) and r(x) = 0.
Thus we have h(xᵖ) = g(x)p(x), where p(x) ∈ ℤ[x].

Let ¯: ℤ → ℤₚ be the natural homomorphism and let ¯: ℤ[x] → ℤₚ[x] be the
homomorphism of Lemma 33.7. We shall write s̄(x) instead of (s(x))¯ for
s(x) ∈ ℤ[x]. Then h(xᵖ) = g(x)p(x) implies
h̄(xᵖ) = ḡ(x)p̄(x) in ℤₚ[x].
Since char ℤₚ = p, there holds h̄(xᵖ) = h̄(x)ᵖ in ℤₚ[x] and we get
h̄(x)ᵖ = ḡ(x)p̄(x) in ℤₚ[x].

So there is an irreducible factor of ḡ(x) in ℤₚ[x] which divides h̄(x)ᵖ and
which therefore divides h̄(x) in ℤₚ[x]. Thus ḡ(x) and h̄(x) have a common
factor in ℤₚ[x]. Since g(x)h(x) = Φₙ(x) divides xⁿ − 1 in ℤ[x], there is a k(x)
in ℤ[x] such that
g(x)h(x)k(x) = xⁿ − 1 in ℤ[x],
so ḡ(x)h̄(x)k̄(x) = x̄ⁿ − 1̄ = xⁿ − 1 in ℤₚ[x]

and xⁿ − 1 ∈ ℤₚ[x] has a multiple root. But the derivative of xⁿ − 1 ∈ ℤₚ[x]
is not 0 ∈ ℤₚ[x], so is relatively prime to xⁿ − 1, and xⁿ − 1 ∈ ℤₚ[x] has no
multiple roots. This contradiction shows that ζᵖ must be a root of g(x).

Hence if p is a prime number, (p,n) = 1, and ζ is a root of g(x), then ζᵖ
is a root of g(x).

Let m be any natural number satisfying 1 ≤ m ≤ n and (n,m) = 1. Then
m = p₁^{a₁} p₂^{a₂} ··· p_r^{a_r} with suitable prime numbers p_i relatively prime to n.
Repeated application of the result we have just proved shows that ζᵐ is
a root of g(x) when ζ is. This is true for each of the φ(n) natural numbers
m such that 1 ≤ m ≤ n and (n,m) = 1. Thus g(x) has φ(n) (distinct) roots
ζᵐ and g(x) is divisible by ∏_{1≤m≤n, (n,m)=1} (x − ζᵐ) = Φₙ(x). Hence Φₙ(x) = g(x) and
Φₙ(x) is irreducible in ℤ[x].

58.12 Theorem: Let n ∈ ℕ and let ζ be a primitive n-th root of unity
in some extension of ℚ. Then ℚ(ζ) is Galois over ℚ and Aut_ℚ ℚ(ζ) ≅ ℤₙ×.

Proof: Since Φₙ(x) is monic and irreducible in ℤ[x], it is irreducible in
ℚ[x] (Lemma 34.11). The claim follows now from Theorem 58.10.

We consider the special case of Theorem 58.12 where n is prime. Let p
be a prime number. Then the isomorphism ℤₚ× = ℤₚ\{0} ≅ Aut_ℚ ℚ(ζ) is given,
in the notation of the proof of Theorem 58.10, by m_i* ↦ σ_{m_i} ∈ Aut_ℚ ℚ(ζ),
where σ_{m_i}: ζ ↦ ζ^{m_i}. Both ℤₚ× and Aut_ℚ ℚ(ζ) are cyclic. Let g ∈ ℤ be such
that its residue class g* ∈ ℤₚ is a generator of ℤₚ×. Then Aut_ℚ ℚ(ζ) = ⟨σ⟩,
where σ = σ_g, i.e., σ is the automorphism ζ ↦ ζ^g.

Then the primitive p-th roots of unity are

ζ, ζ^g, ζ^{g²}, ζ^{g³}, ..., ζ^{g^{p−2}}

and we have σᵏ: ζ ↦ ζ^{gᵏ}. Let us put ζ_k = ζ^{gᵏ}. Then ζ_{k+(p−1)} = ζ^{g^{k+(p−1)}} = ζ^{gᵏ} = ζ_k,
so that any index k can be replaced by any j with k ≡ j (mod p − 1).
Now ζ_kσ = (ζ^{gᵏ})σ = (ζσ)^{gᵏ} = (ζ^g)^{gᵏ} = ζ^{g^{k+1}} = ζ_{k+1} and ζ_kσᵐ = (ζ^{gᵏ})σᵐ = (ζσᵐ)^{gᵏ} =
(ζ^{gᵐ})^{gᵏ} = ζ^{g^{k+m}} = ζ_{k+m}. Thus σ raises the index by 1 and, more generally,
σᵐ raises the index by m.

Let us find the intermediate fields of the extension ℚ(ζ)/ℚ. Since ℚ(ζ) is
Galois over ℚ, and since Aut_ℚ ℚ(ζ) = ⟨σ⟩ is cyclic of order p − 1, there is
one and only one intermediate field for each positive divisor e of p − 1,
namely the one that corresponds to the subgroup ⟨σᵉ⟩ of Aut_ℚ ℚ(ζ).
Hence this field, say K_e, is the fixed field of σᵉ and |K_e:ℚ| = |⟨σ⟩:⟨σᵉ⟩| = e. In
order to describe K_e explicitly, we note first that

{ζ₀, ζ₁, ζ₂, ζ₃, ..., ζ_{p−2}} = {ζ, ζ^g, ζ^{g²}, ζ^{g³}, ..., ζ^{g^{p−2}}} = {ζ, ζ², ζ³, ..., ζ^{p−1}}

is a ℚ-basis of ℚ(ζ): indeed {1, ζ, ζ², ζ³, ..., ζ^{p−2}} is a ℚ-basis of ℚ(ζ) by
Theorem 50.7, and multiplying each of its elements by the unit ζ yields
{ζ, ζ², ..., ζ^{p−1}}, which is therefore also a ℚ-basis. So any element u in
ℚ(ζ) can be written in the form

u = a₀ζ₀ + a₁ζ₁ + a₂ζ₂ + a₃ζ₃ + ... + a_{p−2}ζ_{p−2}

with uniquely determined a₀, a₁, a₂, ..., a_{p−2} ∈ ℚ. Here

uσᵉ = (a₀ζ₀ + a₁ζ₁ + a₂ζ₂ + a₃ζ₃ + ... + a_{p−2}ζ_{p−2})σᵉ
= a₀ζ_{e+0} + a₁ζ_{e+1} + a₂ζ_{e+2} + a₃ζ_{e+3} + ... + a_{p−2}ζ_{e+(p−2)}

and u is fixed by σᵉ, i.e., uσᵉ = u, if and only if

a₀ζ_{e+0} + a₁ζ_{e+1} + a₂ζ_{e+2} + ... + a_{p−2}ζ_{e+(p−2)}
= a_{e+0}ζ_{e+0} + a_{e+1}ζ_{e+1} + a_{e+2}ζ_{e+2} + ... + a_{e+(p−2)}ζ_{e+(p−2)}

(the indices of the coefficients being read mod p − 1), which is
equivalent, when we put f = (p − 1)/e, to

a₀ = a_{e+0} = a_{2e+0} = a_{3e+0} = ... = a_{(f−1)e+0}
a₁ = a_{e+1} = a_{2e+1} = a_{3e+1} = ... = a_{(f−1)e+1}
a₂ = a_{e+2} = a_{2e+2} = a_{3e+2} = ... = a_{(f−1)e+2}
. . . . . . . . . . . . . . .
a_{e−1} = a_{e+(e−1)} = a_{2e+(e−1)} = a_{3e+(e−1)} = ... = a_{(f−1)e+(e−1)}

and this means

u = a₀(ζ₀ + ζ_e + ζ_{2e} + ζ_{3e} + ... + ζ_{(f−1)e})
+ a₁(ζ₁ + ζ_{e+1} + ζ_{2e+1} + ζ_{3e+1} + ... + ζ_{(f−1)e+1})
+ a₂(ζ₂ + ζ_{e+2} + ζ_{2e+2} + ζ_{3e+2} + ... + ζ_{(f−1)e+2})
+ ...
+ a_{e−1}(ζ_{e−1} + ζ_{e+(e−1)} + ζ_{2e+(e−1)} + ζ_{3e+(e−1)} + ... + ζ_{(f−1)e+(e−1)}).

We put η_k = ζ_k + ζ_{e+k} + ζ_{2e+k} + ζ_{3e+k} + ... + ζ_{(f−1)e+k} (k = 0, 1, 2, ..., e−1). The
elements η_k are called the periods of f terms. We see u is fixed by σᵉ if
and only if u = a₀η₀ + a₁η₁ + a₂η₂ + ... + a_{e−1}η_{e−1} with a₀, a₁, a₂, ..., a_{e−1} ∈ ℚ.
So {η₀, η₁, η₂, ..., η_{e−1}} is a ℚ-basis of K_e.

Note that σ: η₀ ↦ η₁, η₁ ↦ η₂, η₂ ↦ η₃, ..., η_{e−2} ↦ η_{e−1}, η_{e−1} ↦ η₀. Thus each
of η₀, η₁, η₂, ..., η_{e−1} is fixed by σᵉ and by powers of σᵉ, but not by any
other automorphism of Aut_ℚ ℚ(ζ). Hence all the intermediate fields ℚ(η₀),
ℚ(η₁), ℚ(η₂), ..., ℚ(η_{e−1}) of ℚ(ζ)/ℚ correspond to the same subgroup ⟨σᵉ⟩
of Aut_ℚ ℚ(ζ). This forces ℚ(η₀) = ℚ(η₁) = ℚ(η₂) = ... = ℚ(η_{e−1}) = K_e. So any
period of f terms is a primitive element of K_e, the unique intermediate
field of ℚ(ζ)/ℚ with |K_e:ℚ| = e.

(Diagram: the correspondence ℚ ⊆ ℚ(η_k) ⊆ ℚ(ζ) with |ℚ(η_k):ℚ| = e and
|ℚ(ζ):ℚ(η_k)| = f, matching the subgroups ⟨σ⟩ ⊇ ⟨σᵉ⟩ ⊇ 1.)
We summarize our results.

58.13 Theorem: Let p be a prime number and ζ a primitive p-th root
of unity in some extension field of ℚ. Let g ∈ ℤ be such that its residue
class g* in ℤₚ is a generator of ℤₚ×. Then

(1) ℚ(ζ) is Galois over ℚ;

(2) Aut_ℚ ℚ(ζ) is a cyclic group of order p − 1. A generator of Aut_ℚ ℚ(ζ) is
the ℚ-automorphism σ: ζ ↦ ζ^g.

(3) Let e and f be natural numbers such that ef = p − 1, and put

η_k = ζ^{gᵏ} + ζ^{g^{e+k}} + ζ^{g^{2e+k}} + ... + ζ^{g^{(f−1)e+k}}   (k = 0, 1, 2, ..., e−1).

Then there is one and only one intermediate field of the extension
ℚ(ζ)/ℚ whose ℚ-dimension is equal to e. This field is ℚ(η_k) for any k =
0, 1, 2, ..., e−1. The set {η₀, η₁, η₂, ..., η_{e−1}} is a ℚ-basis of ℚ(η_k). All
intermediate fields of ℚ(ζ)/ℚ are found in this way as e ranges through
the positive divisors of p − 1.

58.14 Examples: (a) We find all intermediate fields of ℚ(ζ), where the
complex number ζ is a primitive 7-th root of unity. These are the
simple extensions of ℚ whose primitive elements are the periods. In
order to construct the periods, we need a generator of ℤ₇×. One checks
easily that the residue class of 3 is a generator of ℤ₇×. The images of ζ
under powers of the automorphism σ: ζ ↦ ζ³ are

ζ, ζ³, ζ², ζ⁶, ζ⁴, ζ⁵.

The 1-term periods are ζ, ζ³, ζ², ζ⁶, ζ⁴, ζ⁵ and ℚ(ζ) = ℚ(ζ³) = ℚ(ζ²) = ℚ(ζ⁶) =
ℚ(ζ⁴) = ℚ(ζ⁵) is the intermediate field with |ℚ(ζ):ℚ| = 6.

The 2-term periods are ζ + ζ⁶, ζ³ + ζ⁴, ζ² + ζ⁵ and ℚ(ζ + ζ⁶) is the
intermediate field with |ℚ(ζ + ζ⁶):ℚ| = 3. We also have ℚ(ζ + ζ⁶) = ℚ(ζ + ζ⁻¹) =
ℚ(ζ³ + ζ⁴) = ℚ(ζ² + ζ⁵).

The 3-term periods are η = ζ + ζ² + ζ⁴, η′ = ζ³ + ζ⁶ + ζ⁵ and ℚ(η) = ℚ(η′) is
the intermediate field with |ℚ(η):ℚ| = 2.

The 6-term period is ζ + ζ³ + ζ² + ζ⁶ + ζ⁴ + ζ⁵ = −1 and ℚ(−1) = ℚ is the
intermediate field with |ℚ(−1):ℚ| = 1.

(Diagram: the subfield lattice of ℚ(ζ): ℚ(ζ) lies above ℚ(ζ + ζ⁶) with index 2
and above ℚ(η) with index 3; ℚ(ζ + ζ⁶) and ℚ(η) lie above ℚ with degrees
3 and 2 respectively.)
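The relations used for the 3-term periods can be verified numerically. The following check is my own illustration (not the book's), with ζ = e^{2πi/7}: it confirms η + η′ = −1 and ηη′ = 2, so η and η′ are the roots of x² + x + 2, i.e., (−1 ± √−7)/2.

```python
import cmath

zeta = cmath.exp(2j * cmath.pi / 7)
eta  = zeta + zeta**2 + zeta**4          # 3-term period
etap = zeta**3 + zeta**6 + zeta**5       # its conjugate period

assert abs((eta + etap) - (-1)) < 1e-9   # eta + eta' = -1
assert abs(eta * etap - 2) < 1e-9        # eta * eta' = 2
# with this choice of zeta, eta = (-1 + sqrt(-7))/2
assert abs(eta - (-1 + cmath.sqrt(-7)) / 2) < 1e-9
```

So the degree-2 intermediate field ℚ(η) is ℚ(√−7).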

(b) We determine the intermediate fields of ℚ(ζ)/ℚ, where ζ is a
primitive 17-th root of unity. The divisors of 17 − 1 = 16 are 1, 2, 4, 8, 16
and there are five intermediate fields, of dimensions 1, 2, 4, 8, 16 over ℚ.

The residue class of 3 in ℤ₁₇ is a generator of ℤ₁₇×. The successive
powers of 3 are congruent, modulo 17, to

1, 3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6.

The 8-term periods are

η₀ = ζ + ζ⁹ + ζ¹³ + ζ¹⁵ + ζ¹⁶ + ζ⁸ + ζ⁴ + ζ²
η₁ = ζ³ + ζ¹⁰ + ζ⁵ + ζ¹¹ + ζ¹⁴ + ζ⁷ + ζ¹² + ζ⁶.

An elementary computation shows that η₀ + η₁ = −1 and η₀η₁ = −4. So η₀
and η₁ are the roots of x² + x − 4. Hence η₀, η₁ = (−1 ± √17)/2. Which of η₀, η₁
has the plus sign depends on the choice of ζ. We may assume ζ is a 17-th
root of unity that appears in the period with the plus sign (otherwise
replace ζ by one of the roots of unity that appear in the period with the
plus sign). Then η₀ = (−1 + √17)/2 and η₁ = (−1 − √17)/2.

The 4-term periods are (we write λ for them to keep the levels apart)

λ₀ = ζ + ζ¹³ + ζ¹⁶ + ζ⁴,   λ₂ = ζ⁹ + ζ¹⁵ + ζ⁸ + ζ²
λ₁ = ζ³ + ζ⁵ + ζ¹⁴ + ζ¹²,   λ₃ = ζ¹⁰ + ζ¹¹ + ζ⁷ + ζ⁶

and λ₀ + λ₂ = η₀, λ₀λ₂ = −1; λ₁ + λ₃ = η₁, λ₁λ₃ = −1.

Hence λ₀ and λ₂ are the roots of x² − η₀x − 1, and λ₁ and λ₃ are the roots
of x² − η₁x − 1. Here we may put

λ₀ = (η₀ + √(η₀² + 4))/2  and  λ₂ = (η₀ − √(η₀² + 4))/2

by assuming that ζ is a 17-th root of unity that appears in the period with
the plus sign. The signs of the radicals in λ₁, λ₃ = (η₁ ± √(η₁² + 4))/2, however,
can no longer be arbitrarily assigned by choosing ζ suitably. To determine
which of λ₁, λ₃ has the positive radical, we note

(λ₀ − λ₂)(λ₁ − λ₃) = 2(η₀ − η₁),  i.e.,  √(η₀² + 4)·(λ₁ − λ₃) = 2√17,

so that λ₁ − λ₃ is positive. This gives λ₁ = (η₁ + √(η₁² + 4))/2 and
λ₃ = (η₁ − √(η₁² + 4))/2.

The 2-term periods are (written μ)

μ₀ = ζ + ζ¹⁶,   μ₄ = ζ¹³ + ζ⁴
μ₁ = ζ³ + ζ¹⁴,  μ₅ = ζ⁵ + ζ¹²
μ₂ = ζ⁹ + ζ⁸,   μ₆ = ζ¹⁵ + ζ²
μ₃ = ζ¹⁰ + ζ⁷,  μ₇ = ζ¹¹ + ζ⁶.

Here μ₀ + μ₄ = λ₀ and μ₀μ₄ = λ₁, so μ₀ and μ₄ are the roots of x² − λ₀x + λ₁.
Thus μ₀, μ₄ = (λ₀ ± √(λ₀² − 4λ₁))/2. We put μ₀ = (λ₀ + √(λ₀² − 4λ₁))/2. In like
manner as above, one can find polynomials whose roots are the μ_j and
determine the roots without ambiguity.

A 1-term period is ζ, which is a root of x² − μ₀x + 1. Hence we may put

ζ = (μ₀ + √(μ₀² − 4))/2.

The subfield structure of ℚ(ζ) is depicted below.

(Diagram: the tower ℚ ⊂ ℚ(η₀) ⊂ ℚ(λ₀) ⊂ ℚ(μ₀) ⊂ ℚ(ζ), each step of
degree 2.)
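The bottom step of this tower can be verified numerically. The check below is my own (not the book's): with ζ = e^{2πi/17}, the 8-term periods satisfy η₀ + η₁ = −1 and η₀η₁ = −4, so η₀ = (−1 + √17)/2.

```python
import cmath
from math import sqrt

zeta = cmath.exp(2j * cmath.pi / 17)
eta0 = sum(zeta**e for e in (1, 9, 13, 15, 16, 8, 4, 2))   # 8-term period
eta1 = sum(zeta**e for e in (3, 10, 5, 11, 14, 7, 12, 6))  # its conjugate

assert abs(eta0 + eta1 + 1) < 1e-9        # eta0 + eta1 = -1
assert abs(eta0 * eta1 + 4) < 1e-9        # eta0 * eta1 = -4
assert abs(eta0 - (-1 + sqrt(17)) / 2) < 1e-9
assert abs(eta1 - (-1 - sqrt(17)) / 2) < 1e-9
```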

* *

We now prove an important theorem due to J. H. M. Wedderburn which
states that any finite division ring is commutative. The proof makes use
of the class equation (Lemma 25.16) of the multiplicative group of
nonzero elements in a finite division ring. Let us recall that the class
equation of any finite group G is

|G| = Σ_{i=1}^{k} |G:C_G(x_i)|,

where k is the number of distinct conjugacy classes, x₁, x₂, ..., x_k are
representatives of these classes and C_G(x_i) = {g ∈ G: x_i g = g x_i} are the
centralizers of x_i (i = 1, 2, ..., k).

In addition to these centralizer groups, we consider centralizer rings and
evaluate their dimensions to find the terms in the class equation. An
argument involving cyclotomic polynomials then shows that the class
equation cannot hold unless the division ring is commutative.

In order not to interrupt the main argument, we establish two lemmas
we will need.

58.15 Lemma: Let n be a natural number greater than one and let
Φₙ(x) be the n-th cyclotomic polynomial over ℚ. Then, for any proper
divisor d of n, we have

Φₙ(x) divides (xⁿ − 1)/(xᵈ − 1) = (xᵈ)^{(n/d)−1} + (xᵈ)^{(n/d)−2} + ... + xᵈ + 1 in ℤ[x]

and, for any natural number q,

Φₙ(q) divides (qⁿ − 1)/(qᵈ − 1) in ℤ.

Proof: Since Φₙ(x) | (xⁿ − 1) and xⁿ − 1 = (xᵈ − 1)[(xⁿ − 1)/(xᵈ − 1)], it is
sufficient to show that Φₙ(x) is relatively prime to xᵈ − 1 for any proper
divisor d of n. But this is clear, because Φₙ(x) and xᵈ − 1 have no root in
common: the roots of Φₙ(x) are primitive n-th roots of unity, whereas a
root of xᵈ − 1 cannot be a primitive n-th root of unity if d is a proper
divisor of n. This proves the divisibility relation in ℤ[x]. Substituting any
integer q for x (and using Φₙ(x), (xⁿ − 1)/(xᵈ − 1) ∈ ℤ[x]) we obtain the
divisibility relation in ℤ.
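Both this lemma and the next are easy to test on small values. The sketch below is mine (not the book's) and checks n = 6, q = 2.

```python
# Phi_6(q) divides (q^n - 1)/(q^d - 1) for every proper divisor d of n
# (Lemma 58.15), and Phi_6(q) > q - 1 (Lemma 58.16).
n, q = 6, 2
phi6 = q * q - q + 1                  # Phi_6(x) = x^2 - x + 1, so Phi_6(2) = 3
for d in (1, 2, 3):                   # the proper divisors of 6
    assert (q**n - 1) % (q**d - 1) == 0
    quotient = (q**n - 1) // (q**d - 1)
    assert quotient % phi6 == 0       # Lemma 58.15
assert phi6 > q - 1                   # Lemma 58.16
```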

58.16 Lemma: If n > 1 and Φₙ(x) is the n-th cyclotomic polynomial
over ℚ, then |Φₙ(q)| > q − 1 for all q ∈ ℤ with q ≥ 2.

Proof: We have Φₙ(x) = ∏_{k=1, (k,n)=1}^{n} (x − ζᵏ), where ζ is a primitive n-th root of
unity in some extension field of ℚ. For example, we may take ζ = e^{2πi/n}.
Each root ζᵏ = e^{2πki/n} lies on the unit circle and, since n > 1 and (k,n) = 1, is
distinct from 1; hence |q − e^{2πki/n}| > q − 1 for each such k (in the triangle
inequality |q − z| ≥ |q| − |z|, equality would force z to be a nonnegative
real number). Substituting q for x, we get

|Φₙ(q)| = ∏_{k=1, (k,n)=1}^{n} |q − e^{2πki/n}| > (q − 1)^{φ(n)} ≥ q − 1,

since q − 1 ≥ 1.

58.17 Theorem (Wedderburn's theorem): If D is a finite division
ring, then D is a field.

Proof: Let D be a division ring with finitely many elements. D× = D\{0}
is then a finite group under multiplication and the class equation of D× is

|D×| = Σ_{i=1}^{k} |D×:C_{D×}(x_i)|,

where k is the number of distinct conjugacy classes of D× and x₁, x₂, ..., x_k
are representatives of these classes.

We now put C_D(x_i) = {a ∈ D: x_i a = a x_i} = C_{D×}(x_i) ∪ {0} ⊆ D. Since a, b ∈ C_D(x_i)
implies x_i(a + b) = x_i a + x_i b = a x_i + b x_i = (a + b)x_i, we see C_D(x_i) is closed
under addition and thus C_D(x_i) is a subgroup of D under addition
(Lemma 9.3(2)). As C_D(x_i)\{0} = C_{D×}(x_i) is a subgroup of D×, we conclude
that C_D(x_i) is a division ring (a subdivision ring of D).

The same argument proves that the center of the ring D:

Z = {a ∈ D: xa = ax for all x ∈ D} = Z(D×) ∪ {0}

is a subdivision ring of D. But Z is a commutative division ring, i.e., Z is
a field. Then char Z = p for some prime number p and |Z| = pᵗ for some
natural number t. We put q = pᵗ = |Z| for brevity.

We have Z ⊆ C_D(x_i) ⊆ D. Since multiplication in D is associative and
distributive over addition, and since 1a = a for all a ∈ C_D(x_i), we get that
C_D(x_i) and D are vector spaces over Z. Let dim_Z C_D(x_i) = m_i and dim_Z D = n.
Then, as in Lemma 52.1, we have |C_D(x_i)| = |Z|^{m_i} = q^{m_i} and |D| = |Z|ⁿ = qⁿ.
This gives |C_{D×}(x_i)| = |C_D(x_i)\{0}| = |C_D(x_i)| − 1 = q^{m_i} − 1 and likewise |D×| =
|D\{0}| = |D| − 1 = qⁿ − 1. The class equation is therefore

qⁿ − 1 = |D×| = Σ_{i=1}^{k} |D×:C_{D×}(x_i)| = Σ_{i=1}^{k} |D×|/|C_{D×}(x_i)| = Σ_{i=1}^{k} (qⁿ − 1)/(q^{m_i} − 1).

Now |D×:C_{D×}(x_i)| is an integer, so q^{m_i} − 1 divides qⁿ − 1 and this implies
that m_i divides n (Lemma 52.7(1)).

We want to show that D is commutative, or, what is the same thing, that
Z = D. We will assume Z ≠ D and derive a contradiction. Well, if Z ≠ D, then
n ≠ 1 and there is at least one x_i such that |D×:C_{D×}(x_i)| ≠ 1, because
|D×:C_{D×}(x_i)| = 1 if and only if x_i ∈ Z(D×). We so choose the notation that
{x₁, x₂, ..., x_h} = Z(D×) and x_{h+1}, ..., x_k are not in the center of D×. Then the
class equation becomes

qⁿ − 1 = Σ_{i=1}^{h} |D×:C_{D×}(x_i)| + Σ_{i=h+1}^{k} |D×:C_{D×}(x_i)| = |Z(D×)| + Σ_{i=h+1}^{k} (qⁿ − 1)/(q^{m_i} − 1)

qⁿ − 1 = (q − 1) + Σ_{i=h+1}^{k} (qⁿ − 1)/(q^{m_i} − 1),

and m_i is a proper divisor of n for i = h + 1, ..., k. As n ≠ 1 by
assumption, Φₙ(q) divides (qⁿ − 1)/(q^{m_i} − 1) for all i = h + 1, ..., k (Lemma
58.15); Φₙ(q) divides also qⁿ − 1. We read from the class equation that Φₙ(q)
divides q − 1. But this is impossible, for |Φₙ(q)| > q − 1 by Lemma 58.16.

Thus n = 1 and D = Z is commutative.

Exercises

1. Find the m-th cyclotomic polynomial Φₘ(x) over ℚ for m ≤ 50.

2. Let Φₘ(x) denote the m-th cyclotomic polynomial over ℚ. Prove:
(a) Φ₂ₙ(x) = Φₙ(−x) if 2 ∤ n and n > 1.
(b) Φₚₙ(x) = Φₙ(xᵖ)/Φₙ(x) if p is an odd prime number and p ∤ n.

3. Evaluate the pᵏ-th cyclotomic polynomial Φ_{pᵏ}(x) over ℚ if p is a prime
number and k ∈ ℕ.

4. Let p, k ∈ ℕ and p be prime. Let Φₚ(x) denote the p-th cyclotomic
polynomial over ℚ. Prove that, if a prime d divides Φₚ(k), then d ≡ 1
(mod p) or d = p.

5. Let p be prime, k ∈ ℤ and let k* be the residue class of k in ℤₚ. Let
n ∈ ℕ and Φₙ(x) the n-th cyclotomic polynomial over ℚ. Suppose that
p ∤ n. Prove the following statements.
(a) p | Φₙ(k) if and only if o(k*) = n (the order of k* in ℤₚ× is n).
(b) There is an integer a with p | Φₙ(a) if and only if p ≡ 1 (mod n).

6. Let n ∈ ℕ and Φₙ(x) the n-th cyclotomic polynomial over ℚ, and let
p₁, p₂, ..., pₘ be prime numbers of the form tn + 1 (t, n ∈ ℕ). Use Ex. 5 and
prove the following statements.
(a) Φₙ(anp₁p₂···pₘ) ≡ 1 (mod np₁p₂···pₘ) for any a ∈ ℤ.
(b) Φₙ(anp₁p₂···pₘ) > 1 if a ∈ ℕ is sufficiently large.
(c) For some a ∈ ℕ, there is a prime divisor p of Φₙ(anp₁p₂···pₘ)
which is distinct from p₁, p₂, ..., pₘ.
(d) There are infinitely many prime numbers p of the form tn + 1.
(This is a special case of the following celebrated theorem of Dirichlet: if
a, b are any relatively prime integers, then there are infinitely many prime
numbers of the form an + b.)

7. Find all subfields of ℚ(ζ), where ζ is a primitive n-th root of unity
and n = 4, 5, 6, 8, 12. Prove that

e^{2πi/5} = (−1 + √5 + i√(10 + 2√5))/4.

8. Prove the formula due to Gauss:

cos(2π/17) = −1/16 + (1/16)√17 + (1/16)√(34 − 2√17)
+ (1/8)√(17 + 3√17 − √(34 − 2√17) − 2√(34 + 2√17)).

9. Under the hypotheses of Theorem 58.13, show that the set of periods
is independent of the integer g for which g* is a generator of ℤₚ×, but that
the indices of the individual periods do depend on g. Describe this
dependence.

10. Let the hypotheses of Theorem 58.13 be valid, with p an odd prime
number, and let η₀, η₁ be the [(p − 1)/2]-term periods. Prove that η₀η₁ =
−(p − 1)/4 or (p + 1)/4 according as p ≡ 1 (mod 4) or p ≡ 3 (mod 4). Show
that η₀ − η₁ = ±√((−1)^{(p−1)/2} p). (The sign depends on the primitive p-th
root of unity we take. If we choose ζ = e^{2πi/p}, then the sign is plus.
This is considerably difficult to prove. This exercise shows that
ℚ(√((−1)^{(p−1)/2} p)) is contained in the cyclotomic field ℚ(ζ). A theorem of class
field theory, known as the Kronecker-Weber theorem, states that any finite
dimensional Galois extension of ℚ whose Galois group is abelian is
contained in a suitable cyclotomic extension of ℚ.)

11. Let ζ_k denote a primitive k-th root of unity. Show that, if (n,m) =
1, then ℚ(ζₙ, ζₘ) = ℚ(ζₙₘ) and ℚ(ζₙ) ∩ ℚ(ζₘ) = ℚ.

12. Let ζ be a primitive n-th root of unity. Prove that all roots of
unity in ℚ(ζ) are ±ζʲ (j = 0, 1, 2, ..., n − 1).

13. Let p be a prime number and Φₚ(x) the p-th cyclotomic poly-
nomial over ℚ. Find the discriminant of Φₚ(x).

14. Show that any finite subring of a division ring is a division ring.

§ 59
Applications

This paragraph consists of five parts. In the first part, we give an exact
definition of solvability by radicals, discuss radical extensions and
establish the fundamental theorem due to Galois that a polynomial
equation is solvable by radicals if and only if the Galois group of the
polynomial is a solvable group. In the second part, we apply this
theorem to the general polynomial of degree n over a field and deduce
Abel's theorem: if n ≥ 5, then the general polynomial of degree n is not
solvable by radicals. In the third part, we discuss solvability of
equations when the degree is prime. In the fourth part, we give
formulas for the roots of polynomials of degree two, three and four. In
the last part, we examine which real numbers can be constructed by
ruler and compass.

* *

We study solvability of polynomials by an algebraic formula. We start
by clarifying what we mean by an algebraic formula. Intuitively, this is
an expression like

∛(−q/2 + √(q²/4 + p³/27)) + ∛(−q/2 − √(q²/4 + p³/27))

(for instance, the classical formula for a root of x³ + px + q) involving
addition, subtraction, multiplication, division and taking n-th roots, where
the terms in the innermost radicals are elements of the field to
which the coefficients of the polynomial belong. If the terms are from a
field K, the field operations addition, subtraction, multiplication, division
give rise to elements in the same field K, but extraction of an n-th root ⁿ√a
amounts to a field extension, namely to the adjunction of a root of xⁿ − a
to K. Thus a formula basically describes a sequence

K₀ ⊆ K₁ ⊆ K₂ ⊆ ... ⊆ Kₙ

of fields, K₀ being the field in which the coefficients of the given
polynomial lie, and each K_{i+1} is obtained from K_i by adjoining a root of a
polynomial of the form xⁿ − a ∈ K_i[x] to K_i. These considerations lead to
the following definitions.

59.1 Definition: Let E/K be a field extension. If there are elements
u₁, u₂, ..., uₙ in E such that
(1) E = K(u₁, u₂, ..., uₙ),
(2) there exist natural numbers h₁, h₂, ..., hₙ such that u₁^{h₁} ∈ K and
uᵢ^{hᵢ} ∈ K(u₁, ..., uᵢ₋₁) for i = 2, ..., n,
then E is called a radical extension of K.

59.2 Definition: Let K be a field and f(x) ∈ K[x]. We say the equation
f(x) = 0 is solvable by radicals provided there is a splitting field S of f(x)
over K and a radical extension R of K such that K ⊆ S ⊆ R.

Note we do not require the splitting field S itself to be a radical exten-
sion, rather that S be contained in some radical extension.

It follows from Definition 59.1 that a radical extension is a finitely
generated and in fact a finite dimensional extension. When we consider
radical extensions as in Definition 59.1, we agree, for uniformity in
notation, to read K(u₁, ..., uᵢ₋₁) as K when i = 1.

If, in the setup of Definition 59.1, hᵢ = rs and if we put uᵢʳ = uᵢ′ so that
uᵢ′ˢ ∈ K(u₁, ..., uᵢ₋₁), then we may insert the field K(u₁, ..., uᵢ₋₁, uᵢ′) between
K(u₁, ..., uᵢ₋₁) and K(u₁, ..., uᵢ₋₁, uᵢ):

K(u₁, ..., uᵢ₋₁) ⊆ K(u₁, ..., uᵢ₋₁, uᵢ′) ⊆ K(u₁, ..., uᵢ₋₁, uᵢ′, uᵢ) = K(u₁, ..., uᵢ₋₁, uᵢ),

without disturbing the condition (2) in Definition 59.1, because

uᵢ′ˢ ∈ K(u₁, ..., uᵢ₋₁) and uᵢʳ ∈ K(u₁, ..., uᵢ₋₁, uᵢ′).

Thus, inserting additional intermediate fields if necessary, we may
suppose that the hᵢ in Definition 59.1 are prime numbers whenever we
want to.
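As a concrete illustration (my own, not the book's example): x³ − 2 = 0 is solvable by radicals over ℚ. With u₁ = ∛2 (so u₁³ ∈ ℚ) and u₂ = ω a primitive cube root of unity (so u₂³ = 1 ∈ ℚ(u₁)), R = ℚ(u₁, u₂) is a radical extension of ℚ containing the splitting field of x³ − 2.

```python
import cmath

u1 = 2 ** (1 / 3)                    # u1^3 = 2 lies in Q
u2 = cmath.exp(2j * cmath.pi / 3)    # omega; u2^3 = 1 lies in Q(u1)

# the three roots of x^3 - 2 all lie in Q(u1, u2)
roots = [u1, u1 * u2, u1 * u2**2]
assert all(abs(z**3 - 2) < 1e-9 for z in roots)
assert abs(u2**3 - 1) < 1e-12
```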

One of the principal theorems in this paragraph is that, if a polynomial
equation f(x) = 0 is solvable by radicals, then the Galois group of f(x) is a
solvable group (Definition 27.19). In fact, we obtain more general
results. In the next three lemmas, we study radicality of some related
field extensions.

59.3 Lemma: Let K ⊆ L ⊆ E be fields.
(1) If E is a radical extension of K, then E is a radical extension of L.
(2) If L is a radical extension of K and if E is a radical extension of L,
then E is a radical extension of K.

Proof: (1) If E is a radical extension of K, there are u1, u2, ..., un in E such
that E = K(u1, u2, ..., un) and ui^hi ∈ K(u1, ..., u_{i-1}) for some natural numbers
hi (i = 1, 2, ..., n). Then E = L(u1, u2, ..., un), as K ⊆ L ⊆ E, and also u1^h1 ∈ L
and ui^hi ∈ L(u1, ..., u_{i-1}) for i = 2, ..., n. Thus E is a radical extension of L.

(2) If L is a radical extension of K, then there are elements u1, u2, ..., un in
L such that L = K(u1, u2, ..., un) and natural numbers h1, h2, ..., hn such that
u1^h1 ∈ K and ui^hi ∈ K(u1, ..., u_{i-1}). If E is a radical extension of L, then
there are elements t1, t2, ..., tm in E such that E = L(t1, t2, ..., tm) and natural
numbers k1, k2, ..., km such that t1^k1 ∈ L and ti^ki ∈ L(t1, ..., t_{i-1}). Thus there
are elements u1, u2, ..., un, t1, t2, ..., tm in E such that
E = K(u1, u2, ..., un, t1, t2, ..., tm) and natural numbers h1, h2, ..., hn, k1, k2, ..., km
such that
u1^h1 ∈ K and
ui^hi ∈ K(u1, ..., u_{i-1}) for i = 2, ..., n,
t1^k1 ∈ K(u1, u2, ..., un),
ti^ki ∈ K(u1, u2, ..., un, t1, ..., t_{i-1}) for i = 2, ..., m.
This shows that E is a radical extension of K.

59.4 Lemma: Let K be a field and L, M radical extensions of K contained
in some extension E of K. Then their compositum (see Definition 50.17) LM
is a radical extension of K.

Proof: Since L and M are radical extensions of K, we have

L = K(u1, u2, ..., un) and ui^hi ∈ K(u1, ..., u_{i-1}) (i = 1, 2, ..., n),
M = K(t1, t2, ..., tm) and tj^kj ∈ K(t1, ..., t_{j-1}) (j = 1, 2, ..., m),

with some suitable elements ui, tj and natural numbers hi, kj. Now LM is
the smallest subfield of E containing K and u1, u2, ..., un, t1, t2, ..., tm, so LM =
K(u1, u2, ..., un, t1, t2, ..., tm). Since ui^hi ∈ K(u1, ..., u_{i-1}) for i = 1, 2, ..., n and
likewise tj^kj ∈ K(u1, u2, ..., un, t1, ..., t_{j-1}) for j = 1, 2, ..., m, we conclude that
LM is a radical extension of K (where K(u1, u2, ..., un, t1, ..., t_{j-1}) is to be
read as K(u1, u2, ..., un) when j = 1).

59.5 Lemma: Let E/K be a radical field extension and let N be a normal
closure of E over K. Then N is a radical extension of K.

Proof: Let {a1, a2, ..., am} be a K-basis of E and let fi(x) ∈ K[x] be the
minimal polynomial of ai over K. We remind the reader of the fact that N
is a splitting field of f(x) := f1(x)f2(x)···fm(x) over K (see the proof of
Theorem 55.11).

Let a be a root of fj(x). There is a K-isomorphism σ: K(aj) → K(a) with ajσ =
a (Theorem 53.2). Since N is a splitting field of f(x) over K(aj) and over
K(a) (Example 53.5(e)), the isomorphism σ extends to a K-automorphism
τ: N → N of N (Theorem 53.7). Then Eτ is an intermediate field of N/K
which is K-isomorphic to E, and Eτ contains the root a = ajτ of fj(x). In this
way, we find, for each j = 1, 2, ..., m and for each root b of fj(x),
intermediate fields of N/K which are K-isomorphic to E and which
contain the root b of fj(x).

Let E1, E2, ..., Es be the fields obtained in this way. Then each Ei is K-
isomorphic to E and so a radical extension of K. Using Lemma 59.4
repeatedly, we get that the compositum E1E2···Es is a radical
extension of K. But this compositum is a subfield of N containing all roots
of f(x). Since N is a splitting field of f(x) over K, the compositum must
equal N. Thus N is a radical extension of K.

59.6 Lemma: Let E/K be a finite dimensional field extension, m ∈ ℕ.
Assume char K = 0 or (m, char K) = 1 and let ε be a primitive m-th root of
unity. If E is Galois over K, then E(ε) is also Galois over K.

Proof: We have a chain of fields K ⊆ E ⊆ E(ε). By Theorem 55.7, there is
a polynomial f(x) ∈ K[x] whose irreducible factors are separable over K
such that E is a splitting field of f(x) over K, and E(ε) is the splitting field
of the m-th cyclotomic polynomial Φm(x) over E, whose irreducible
factors, too, are separable over E (Theorem 58.10).

We claim E(ε) is a splitting field of f(x)Φm(x) ∈ K[x] over K (we have
Φm(x) ∈ K[x] by Lemma 58.7(2)). Since the irreducible factors of
f(x)Φm(x) have no multiple roots, they are separable over K, and the
claim will imply that E(ε) is a Galois extension of K (Theorem 55.7).

Any root of f(x)Φm(x) is in E(ε), so f(x)Φm(x) splits in E(ε). Now let F be a
subfield of E(ε) containing K such that f(x)Φm(x) splits in F. Then all roots
of f(x) are in F and, since E is generated over K by the roots of f(x)
(Example 53.5(d)), E ⊆ F. Moreover, F contains ε, so we have E(ε) ⊆ F.
Thus f(x)Φm(x) cannot split in any proper subfield of E(ε) containing K;
hence E(ε) is a splitting field of f(x)Φm(x) over K and is therefore Galois
over K.

59.7 Lemma: Let K be a field, n ∈ ℕ, and assume that char K = 0 or
(char K, n) = 1. Suppose that K contains a primitive n-th root of unity. Let
a ∈ K\{0} and let u be a root of x^n − a ∈ K[x]. Then

(1) K(u) is a cyclic extension of K;

(2) |K(u):K| divides n and u^|K(u):K| ∈ K.

Proof: (1) We must show K(u) is Galois over K and Aut_K K(u) is a cyclic
group. If ζ ∈ K is a primitive n-th root of unity, then u, ζu, ζ^2 u, ..., ζ^{n-1} u
are the roots of x^n − a. Thus K(u) is a splitting field of x^n − a over K. The
polynomial x^n − a has no multiple roots, so the irreducible divisors of
x^n − a are separable over K. Thus K(u) is Galois over K (Theorem 55.7).

We now show that Aut_K K(u) is cyclic. If σ ∈ Aut_K K(u), then uσ is a root of
x^n − a, so uσ = ζ_σ u for some (not necessarily primitive) n-th root of
unity ζ_σ. Since u(στ) = (uσ)τ = (ζ_σ u)τ = ζ_σ (uτ) = ζ_σ ζ_τ u, we get
ζ_{στ} = ζ_σ ζ_τ for any σ, τ ∈ Aut_K K(u), so the mapping φ: Aut_K K(u) → K*,
σ ↦ ζ_σ, is a homomorphism of groups. Here σ ∈ Ker φ if and only if ζ_σ = 1, i.e., if and
only if uσ = u, so if and only if σ is the identity mapping on K(u). Thus
Ker φ = 1 and φ is one-to-one. This shows that Aut_K K(u) is isomorphic to
a subgroup of K*. Since Aut_K K(u) is finite, Aut_K K(u) is a cyclic group by
Theorem 52.18.

(2) Let |K(u):K| = d. Since K(u) is Galois over K, we have |Aut_K K(u)| = d by
the fundamental theorem of Galois theory. So Aut_K K(u) is a cyclic group
of order d, say Aut_K K(u) = ⟨σ⟩. Now Aut_K K(u) is isomorphic to a subgroup
of the group of n-th roots of unity, which has order n. Hence d | n. Moreover,
o(ζ_σ) = o(σ) = d, so ζ_σ^d = 1 and (u^d)σ = (uσ)^d = (ζ_σ u)^d = ζ_σ^d u^d = u^d,
so u^d is fixed by σ
and hence by Aut_K K(u) = ⟨σ⟩; thus u^d ∈ K, since K(u) is Galois over K.
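The description of the roots in part (1) can be checked numerically over ℂ, which contains a primitive n-th root of unity for every n. A small sketch, not from the text, with sample values n = 5 and a = 3:

```python
import cmath

n, a = 5, 3.0                        # sample values, chosen for illustration
u = a ** (1 / n)                     # one root u of x^n - a
zeta = cmath.exp(2j * cmath.pi / n)  # a primitive n-th root of unity

# the roots of x^n - a are exactly u, zeta*u, ..., zeta^(n-1)*u
roots = [u * zeta**k for k in range(n)]
for r in roots:
    assert abs(r**n - a) < 1e-9
# and they are pairwise distinct
assert len({(round(r.real, 6), round(r.imag, 6)) for r in roots}) == n
```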

We now proceed to prove that the Galois group of a polynomial is a
solvable group if the polynomial equation is solvable by radicals. It will be seen
that it is sufficient to prove this under the assumption that a splitting
field of the polynomial is a radical extension (rather than a subfield of a
radical extension), and one may moreover suppose that the splitting field of
the polynomial is Galois over the base field. As a technical convenience,
we will bring a certain root of unity into the base field. Then the
subgroups of the Galois group corresponding to the intermediate fields
as in Definition 59.1 under the Galois correspondence will make up a
chain such that each group is normal in the next one and the factor
groups are cyclic by Lemma 59.7. This will give an abelian series of
the Galois group, which is therefore solvable.
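The group-theoretic fact used below, that S4 is solvable while S5 is not, can be verified directly by computing derived series. The helper functions in this sketch are our own, not part of the text:

```python
from itertools import permutations

def compose(p, q):
    # apply p first, then q (permutations as tuples acting on 0..n-1)
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def generated(gens, n):
    # subgroup generated by gens (finite, so closing under products suffices)
    group, frontier = {tuple(range(n))}, set(gens)
    while frontier:
        group |= frontier
        frontier = {compose(g, h) for g in group for h in gens} - group
    return group

def derived(group, n):
    # commutator subgroup, generated by all a^-1 b^-1 a b
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in group for b in group}
    return generated(comms, n)

def derived_series_sizes(n):
    G = set(permutations(range(n)))
    sizes = [len(G)]
    while True:
        D = derived(G, n)
        if len(D) == len(G):          # series has become stationary
            return sizes
        G = D
        sizes.append(len(G))

assert derived_series_sizes(4) == [24, 12, 4, 1]   # S4 > A4 > V4 > 1: solvable
assert derived_series_sizes(5) == [120, 60]        # stalls at A5: S5 not solvable
```

The series terminating at the trivial group is exactly an abelian series as used in the proofs that follow.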

59.8 Theorem: Let K be a field and E a Galois extension of K. If E is a
radical extension of K, then Aut_K E is a solvable group.

Proof: Since E is a radical extension of K, we have E = K(u1, u2, ..., un) and
there are natural numbers h1, h2, ..., hn such that ui^hi ∈ K(u1, ..., u_{i-1}) for
i = 1, 2, ..., n. Without loss of generality, we may suppose the hi are prime
numbers.

First we show that char K, if distinct from 0, can be assumed to be
distinct from all the prime numbers hi. Indeed, if 0 ≠ char K = p = hi,
then ui^p ∈ K(u1, ..., u_{i-1}). But E is Galois, hence separable over K and over
K(u1, ..., u_{i-1}) (Lemma 55.6), so ui is separable over K(u1, ..., u_{i-1}) and
K(u1, ..., u_{i-1}, ui) = K(u1, ..., u_{i-1})(ui) = K(u1, ..., u_{i-1})(ui^p) = K(u1, ..., u_{i-1}, ui^p) =
K(u1, ..., u_{i-1}) by Lemma 55.16. Thus K(u1, ..., u_{i-1}) = K(u1, ..., u_{i-1}, ui) and ui
can be deleted from the set of generators. We assume all generators of
this type have been deleted, so that all the prime numbers hi are
relatively prime to the characteristic of K in case char K = p ≠ 0.

Put m = h1h2···hn and let ε be a primitive m-th root of unity. We
consider the cyclotomic extensions E(ε) of E and K(ε) of K:

        E(ε)
       /    \
      E      K(ε)
       \    /
      E ∩ K(ε)
        |
        K

Since either char K = 0 or char K is relatively prime to m, Lemma 59.6
shows that E(ε)/K is Galois. (E is finite dimensional over K because E is a
radical extension of K.) Theorem 54.25(2) gives: Aut_E E(ε) ⊴ Aut_K E(ε) and
Aut_K E ≅ Aut_K E(ε) / Aut_E E(ε). We want to prove that Aut_K E is a solvable
group. If we can show that Aut_K E(ε) is solvable, then Aut_K E will also be
solvable, because a factor group of a solvable group is solvable (Lemma
27.20). Thus it is sufficient to prove that Aut_K E(ε) is a solvable group.

We make one further reduction. K(ε) is a Galois extension of K by
Theorem 58.10(2), so Aut_{K(ε)} E(ε) ⊴ Aut_K E(ε) and moreover Aut_K K(ε) ≅
Aut_K E(ε) / Aut_{K(ε)} E(ε). We know that Aut_K K(ε) is abelian (Theorem
58.10(3)). Thus Aut_K E(ε) / Aut_{K(ε)} E(ε) is abelian and solvable. If we can
show that Aut_{K(ε)} E(ε) is solvable, then Aut_K E(ε) will also be solvable in
view of Lemma 27.21. Thus it is sufficient to prove that Aut_{K(ε)} E(ε) is a
solvable group.

We put K(ε) = E0 and K(ε, u1, ..., ui) = Ei for i = 1, 2, ..., n. In particular, En =
K(ε, u1, u2, ..., un) = K(u1, u2, ..., un)(ε) = E(ε). Since En/K is Galois, En is Galois
over any intermediate field (Theorem 54.25(1)). Thus En is Galois over
E0. Let Gi ≤ Aut_{E0} En = Aut_{K(ε)} E(ε) be the subgroup Gi = Ei′ = Aut_{Ei} En
of Aut_{E0} En corresponding to Ei (i = 0, 1, 2, ..., n).

E(ε) = En        Gn = 1
    |                |
    Ei               Gi
    |                |
 E_{i-1}          G_{i-1}
    |                |
K(ε) = E0           G0

Now char E_{i-1} = 0 or is relatively prime to hi,

Ei = K(ε, u1, ..., ui) = K(ε, u1, ..., u_{i-1})(ui) = E_{i-1}(ui),
ui^hi ∈ K(u1, ..., u_{i-1}) ⊆ K(ε, u1, ..., u_{i-1}) = E_{i-1},
and E_{i-1} has a primitive hi-th root of unity,

since in fact E_{i-1} has a primitive m-th root of unity (i = 1, 2, ..., n). Thus
Lemma 59.7 applies and shows that Ei is a cyclic extension of E_{i-1} of
degree |Ei : E_{i-1}| = hi or 1. In particular, Ei is Galois over E_{i-1} and, since En is
also Galois over E_{i-1}, we get Gi ⊴ G_{i-1} and G_{i-1}/Gi ≅ Aut_{E_{i-1}} Ei from Theorem
54.25(2). Thus |G_{i-1}/Gi| = |Ei : E_{i-1}| = hi or 1 and G_{i-1}/Gi is cyclic (of prime
order hi or of order 1). Hence

1 = Gn ⊴ G_{n-1} ⊴ G_{n-2} ⊴ ... ⊴ G1 ⊴ G0 = Aut_{E0} En = Aut_{K(ε)} E(ε)

is an abelian series of Aut_{K(ε)} E(ε), and Aut_{K(ε)} E(ε) is a solvable group. This
completes the proof.

59.9 Lemma: Let E/K be a field extension and

K1 = {a ∈ E : aσ = a for all σ ∈ Aut_K E}.

Then Aut_{K1} E = Aut_K E and E is Galois over K1.

Proof: Clearly K1 is closed under addition, subtraction, multiplication
and division; so K1 is a field, and we have K ⊆ K1 by the very definition of
K1. Any K-automorphism of E fixes the elements of K1 and, since K ⊆ K1,
any K1-automorphism of E fixes K elementwise. Thus Aut_K E = Aut_{K1} E.

If b ∈ E is fixed by all K1-automorphisms of E, then b is fixed by all K-
automorphisms of E, so b ∈ K1. Hence K1 is the fixed field of Aut_{K1} E, which
means E is a Galois extension of K1.

59.10 Theorem: Let K ⊆ S ⊆ R be fields. If R is a radical extension of
K, then Aut_K S is a solvable group.

Proof: We put K1 = {a ∈ S : aσ = a for all σ ∈ Aut_K S}. Then Aut_K S = Aut_{K1} S
and S is a Galois extension of K1 by Lemma 59.9. Moreover, R is a radical
extension of K1 (Lemma 59.3(1)). Let N be a normal closure of R over K1.
Then N is a radical extension of K1 by Lemma 59.5.

Thus K1 ⊆ S ⊆ R ⊆ N, where S is Galois over K1 and R and N are radical
extensions of K1.

Now S is a Galois extension of K1, so S is (K1, N)-stable (Theorem 54.23).
Then Aut_S N ⊴ Aut_{K1} N and Aut_{K1} N / Aut_S N is isomorphic to the subgroup
of Aut_{K1} S consisting of all K1-automorphisms of S that are extendible to N
(Theorem 54.24). What is this subgroup of Aut_{K1} S? Since N is normal
over K1, there is a polynomial f(x) in K1[x] such that N is a splitting field
of f(x) over K1 (Theorem 55.8). Thus N is a splitting field of f(x) over S
(Example 53.5(e)) and any K1-automorphism of S can be extended to a
K1-automorphism of N (Theorem 53.7). So the subgroup of Aut_{K1} S consist-
ing of all K1-automorphisms of S that are extendible to N is actually the
whole of Aut_{K1} S. We get

Aut_K S = Aut_{K1} S ≅ Aut_{K1} N / Aut_S N.

Since any factor group of a solvable group is solvable (Lemma 27.20), it
suffices to prove that Aut_{K1} N is solvable. As in the first paragraph of this
proof, we replace the base field by another one to make the extension
Galois. We put K2 = {a ∈ N : aσ = a for all σ ∈ Aut_{K1} N}. Then K1 ⊆ K2
and Aut_{K1} N = Aut_{K2} N. Here N is a Galois, radical extension of K2 (Lemma
59.9, Lemma 59.3(1)) and Theorem 59.8 yields that Aut_{K1} N = Aut_{K2} N is a
solvable group.

From Definition 59.2 and Theorem 59.10, we get

59.11 Theorem: Let K be a field and f(x) ∈ K[x]. If the equation f(x) = 0
is solvable by radicals, then the Galois group of f(x) is a solvable group.

We now want to establish the converse of Theorem 59.11. Let E/K be a
Galois extension. If Aut_K E is solvable, then the composition factors of
Aut_K E are cyclic of prime order and the Galois correspondence gives rise
to a chain of intermediate fields in which two consecutive terms
represent a cyclic extension of prime degree. There are two types of
cyclic extensions of prime degree p: (1) extensions of the form K(u)/K,
where u is a root of x^p − a and char K = 0 or (p, char K) = 1, and (2) exten-
sions of the form K(u)/K, where u is a root of x^p − x − a and p = char K.
(Just as Lemma 59.7 is the converse of Theorem 57.11, Theorem 57.10
admits a converse; see § 57, Ex. 3, 4.) Hence the extensions of the second
type will creep into the intermediate field structure of E/K. There are
two ways of coping with this situation. Either we modify the definition
of radical extensions so as to include extensions of the second type as
admissible intermediate steps (see Ex. 1, 2) or we impose restrictive
hypotheses on the characteristic to prevent extensions of the second
type from coming up.

59.12 Theorem: Let K be a field and E a finite dimensional Galois
extension of K. Assume char K = 0, or char K ≠ 0 and char K does not
divide |E : K|. If Aut_K E is a solvable group, then there is a radical extension
R of K such that K ⊆ E ⊆ R.

Proof: We make induction on |E:K|. If |E:K| = 1, then E = K and K is a
radical extension of K containing E. Thus the theorem is proved in case
|E:K| = 1.

Let n = |E:K|. Assume n ≥ 2 and that the theorem is proved for all field
extensions of degree < n.

Since Aut_K E is a solvable group, Aut_K E =: G has a normal subgroup of prime
index, say H ⊴ G and |G:H| = p, where p is a prime number (Theorem
27.25). Here p divides |Aut_K E| = |E:K|, so p ≠ char K by hypothesis. Let ω
be a primitive p-th root of unity. The cyclotomic extension K(ω) is a
radical extension of K, so, if we can prove there is a radical extension R
of K(ω)

        E(ω)                1
       /    \               |
      E      K(ω)           H
       \    /               | p
     E ∩ K(ω)           Aut_K E
        |
        K

containing E(ω), then R will be a radical extension of K containing E
(Lemma 59.3(2)).
We show that E(ω) is contained in some radical extension of K(ω). Since E
is Galois over K, Lemma 59.6 yields that E(ω) is Galois over K, and Theorem
54.23 yields that E is (K, E(ω))-stable. So the restriction mapping σ ↦ σ|_E is a
homomorphism ρ: Aut_{K(ω)} E(ω) → Aut_K E. If σ ∈ Ker ρ, then σ|_E
fixes all elements of E, so σ fixes all elements of E and also ω; so σ is the
identity mapping on E(ω). Hence Ker ρ = 1 and ρ is one-to-one.

We distinguish two cases, according as Im ρ is a proper subgroup of
Aut_K E or equal to Aut_K E. Since E(ω) is Galois over K, it is Galois over K(ω)
by Theorem 54.25(1) and so |E(ω):K(ω)| = |Aut_{K(ω)} E(ω)|.

If Im ρ ≠ Aut_K E, then |E(ω):K(ω)| = |Aut_{K(ω)} E(ω)| = |Im ρ| < |Aut_K E| = n. Now
Aut_{K(ω)} E(ω), being isomorphic to a subgroup of the solvable group Aut_K E,
is itself a solvable group (Lemma 27.20), and E(ω) is Galois over K(ω), so,
by induction, there is a radical extension R of K(ω) containing E(ω). The
proof is complete in this case.

If Im ρ = Aut_K E, then ρ is an isomorphism and has an inverse isomorph-
ism ρ^{-1}: Aut_K E → Aut_{K(ω)} E(ω). We put J = Hρ^{-1}. Then J ⊴ Aut_{K(ω)} E(ω) and
|Aut_{K(ω)} E(ω) : J| = p. Since H is solvable, its isomorphic image J is solvable.
Let F = J′ be the intermediate field of the Galois extension
E(ω)/K(ω) corresponding to J under the Galois correspondence.

E(ω)      1                      1
 |        |                      |
 F        J                      H
 | p      | p                    | p
K(ω)   Aut_{K(ω)} E(ω)        Aut_K E

As J ⊴ Aut_{K(ω)} E(ω), Theorem 54.25(2) shows that F is Galois over K(ω) and
Aut_{K(ω)} F ≅ Aut_{K(ω)} E(ω) / Aut_F E(ω) = Aut_{K(ω)} E(ω) / J ≅ Cp. So F is a cyclic
extension of K(ω) of degree p, so F = K(ω)(u) for some root u of a suitable
polynomial of the form x^p − a in K(ω)[x] (Theorem 57.11). Thus F is a radical extension
of K(ω). Here Aut_F E(ω) = J is solvable, E(ω) is Galois over F (Theorem
54.25(1)) and |E(ω):F| < |E(ω):F| |F:K(ω)| = |E(ω):K(ω)| ≤ n, so, by induction,
there is a radical extension R of F with F ⊆ E(ω) ⊆ R. Since R is a radical
extension of F and F is a radical extension of K(ω), Lemma 59.3(2) yields
that R is a radical extension of K(ω) with K(ω) ⊆ E(ω) ⊆ R. This completes
the proof.

59.13 Theorem: Let K be a field and f(x) ∈ K[x] a polynomial of degree
n > 0. Suppose char K = 0 or 0 ≠ char K > n. Then the equation f(x) = 0
is solvable by radicals if and only if the Galois group of f(x) is a solvable
group.

Proof: If the equation f(x) = 0 is solvable by radicals, then the Galois


group of the polynomial f(x) is solvable by Theorem 59.11.

Conversely, let S be a splitting field of f(x) over K and assume that the
Galois group Aut_K S of f(x) is solvable. In order to prove that the equation
f(x) = 0 is solvable by radicals, i.e., in order to prove that there is a
radical extension R of K satisfying K ⊆ S ⊆ R, it suffices, in view of
Theorem 59.12, to show that S is Galois over K and that char K = 0 or char K
does not divide |S:K|.

To prove that S is Galois over K, we use Theorem 55.7. We need only
show that the irreducible factors of f(x) are separable over K. This is
clear in case char K = 0. If char K ≠ 0 and ax^d + ... ∈ K[x] is an irreducible
factor of f(x) with a ≠ 0, then d ≤ n < char K and da ≠ 0 in K, so its
derivative dax^{d-1} + ... is not equal to 0 ∈ K[x] and ax^d + ... is separable
over K.

Thus we are done in case char K = 0. In case char K ≠ 0, we have char K
> n, so the prime number char K does not divide n! and, as |S:K| divides n!, it
does not divide |S:K| either. The proof is complete.

* *

In this part, we prove the celebrated theorem due to Abel which states
that the general polynomial (over a field of characteristic 0) of degree n
is solvable by radicals if and only if n ≤ 4, together with some related results.
First of all, we must explain what we mean by the general polynomial of
degree n.

59.14 Definition: Let K be a field and let a1, a2, ..., a_{n-1}, an be n distinct
indeterminates over K. The polynomial

g(x) = x^n − a1 x^{n-1} + a2 x^{n-2} − a3 x^{n-3} + ... + (−1)^{n-1} a_{n-1} x + (−1)^n an

in K(a1, a2, ..., a_{n-1}, an)[x] is called the general polynomial of degree n over
K.

Any monic polynomial in K[x] can be obtained from g(x) by substituting
appropriate elements of K for the indeterminates. This justifies the
terminology. Note, however, a peculiarity: the general polynomial of
degree n over K is not a polynomial over K; that is, it is not in K[x], but in
K(a1, a2, ..., a_{n-1}, an)[x].

Alternating signs are attached to the coefficients aj for convenience in
computations. This makes it easier to compare the coefficients aj with
the elementary symmetric polynomials.
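The sign convention encodes Vieta's correspondence: expanding (x − r1)···(x − rn) gives (−1)^k e_k as the coefficient of x^{n-k}, where e_k is the k-th elementary symmetric polynomial of the ri. A small numerical sketch (sample roots and helper names are ours, not from the text):

```python
from itertools import combinations
from math import prod

roots = [2, 3, 5, 7]                      # sample "roots" r1, ..., r4
n = len(roots)

# elementary symmetric polynomials e_0, ..., e_n evaluated at the roots
e = [sum(prod(c) for c in combinations(roots, k)) for k in range(n + 1)]

def times_linear(coeffs, r):
    # multiply a polynomial (coefficients listed lowest degree first) by (x - r)
    shifted = [0] + coeffs                     # contribution of multiplying by x
    scaled = [-r * c for c in coeffs] + [0]    # contribution of multiplying by -r
    return [s + t for s, t in zip(shifted, scaled)]

coeffs = [1]                                   # the constant polynomial 1
for r in roots:
    coeffs = times_linear(coeffs, r)           # coeffs now encode prod (x - r_i)

# the coefficient of x^(n-k) is (-1)^k e_k, the sign pattern of Definition 59.14
for k in range(n + 1):
    assert coeffs[n - k] == (-1) ** k * e[k]
```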

Our main goal is to prove that the Galois group of the general polynomial
of degree n is the symmetric group Sn. After we establish some
preparatory lemmas, we prove that each permutation of the roots induces
an automorphism of the splitting field if the roots are indeterminates
(Theorem 59.17) and that we can indeed treat the roots of the general
polynomial as indeterminates (Theorem 59.18).

59.15 Lemma: Let D1, D2 be integral domains and F1, F2 the fields of frac-
tions of D1, D2, respectively. If φ: D1 → D2 is a ring isomorphism, then the
mapping

φ1: F1 → F2
a/b ↦ (aφ)/(bφ)

is a field isomorphism.

Proof: We are to show that φ1 is a one-to-one ring homomorphism from
F1 onto F2. Let a, b, c, d ∈ D1 and b, d ≠ 0. Then

a/b = c/d ⟺ ad = bc ⟺ (ad)φ = (bc)φ ⟺ aφ·dφ = bφ·cφ
⟺ (aφ)/(bφ) = (cφ)/(dφ) ⟺ (a/b)φ1 = (c/d)φ1,

which shows that φ1 is well defined and one-to-one. Moreover, if u ∈ F2,
then u = e/f for some e, f ∈ D2 with f ≠ 0; then e = aφ and f = bφ for some
a, b ∈ D1 with b ≠ 0, so u = e/f = (aφ)/(bφ) = (a/b)φ1 is the image of a/b ∈ F1
under φ1 and thus φ1 is onto F2.

It remains to prove that φ1 preserves addition and multiplication. This is
easy: if a, b, c, d ∈ D1 and b, d ≠ 0, then

[(a/b) + (c/d)]φ1 = [(ad + bc)/bd]φ1 = ((ad + bc)φ)/((bd)φ)
= (aφ·dφ + bφ·cφ)/(bφ·dφ) = (aφ)/(bφ) + (cφ)/(dφ)
= (a/b)φ1 + (c/d)φ1

and

[(a/b)(c/d)]φ1 = (ac/bd)φ1 = ((ac)φ)/((bd)φ) = (aφ·cφ)/(bφ·dφ)
= ((aφ)/(bφ))((cφ)/(dφ)) = (a/b)φ1 · (c/d)φ1.

Thus φ1 is a field isomorphism.

59.16 Lemma: Let K be a field and let x1, x2, ..., xn be n distinct indeter-
minates over K.
(1) For each permutation σ ∈ Sn, the mapping

σ′: K(x1, x2, ..., xn) → K(x1, x2, ..., xn)
f(x1, x2, ..., xn)/g(x1, x2, ..., xn) ↦ f(x_{1σ}, x_{2σ}, ..., x_{nσ})/g(x_{1σ}, x_{2σ}, ..., x_{nσ})

is a field automorphism of K(x1, x2, ..., xn).

(2) If σ, τ ∈ Sn and σ ≠ τ, then σ′ ≠ τ′.

Proof: (1) Let σ ∈ Sn. The mapping

σ″: K[x1, x2, ..., xn] → K[x1, x2, ..., xn]
f(x1, x2, ..., xn) ↦ f(x_{1σ}, x_{2σ}, ..., x_{nσ})

is the substitution homomorphism that substitutes x_{jσ} for xj (j = 1, 2, ..., n).
It has an inverse (σ″)^{-1} = (σ^{-1})″: f(x1, x2, ..., xn) ↦ f(x_{1σ^{-1}}, x_{2σ^{-1}}, ..., x_{nσ^{-1}}).
Thus σ″ is a ring isomorphism from the integral domain K[x1, x2, ..., xn]
onto itself. Lemma 59.15 gives that

(σ″)1: K(x1, x2, ..., xn) → K(x1, x2, ..., xn)
f(x1, x2, ..., xn)/g(x1, x2, ..., xn) ↦ (f(x1, x2, ..., xn)σ″)/(g(x1, x2, ..., xn)σ″)

is a field automorphism of the field K(x1, x2, ..., xn) of fractions of
K[x1, x2, ..., xn]. But (σ″)1 is nothing else than σ′. Hence σ′ is a field auto-
morphism of K(x1, x2, ..., xn).

(2) If σ ≠ τ, then there is a j ∈ {1, 2, ..., n} such that jσ ≠ jτ; then xj σ′ = x_{jσ}
≠ x_{jτ} = xj τ′, so σ′ ≠ τ′.

59.17 Theorem: Let K be a field, let x1, x2, ..., xn be n distinct indeter-
minates over K and let

f1 = Σ xi
f2 = Σ_{i<j} xi xj
f3 = Σ_{i<j<k} xi xj xk
...............
fn = x1 x2 ··· xn

be the elementary symmetric polynomials in K[x1, x2, ..., xn]. Then the
field of rational functions K(x1, x2, ..., xn) is a Galois extension of
K(f1, f2, ..., fn), the subfield of K(x1, x2, ..., xn) generated by f1, f2, ..., fn over K,
and Aut_{K(f1, f2, ..., fn)} K(x1, x2, ..., xn) ≅ Sn.

Proof: We put E = K(x1, x2, ..., xn) and L = K(f1, f2, ..., fn). Let x be a new
indeterminate over K. If

h(x) = x^n − f1(x1, x2, ..., xn) x^{n-1} + f2(x1, x2, ..., xn) x^{n-2} − ... + (−1)^n fn(x1, x2, ..., xn),

then h(x) ∈ L[x] and h(x) splits in E:

h(x) = (x − x1)(x − x2)···(x − xn).

Since E = L(x1, x2, ..., xn) is generated by the roots x1, x2, ..., xn of h(x) over L,
we deduce that E is a splitting field of h(x) over L (Example 53.5(d)). As
h(x) has no multiple roots, the irreducible factors of h(x) in L[x] are
separable over L. Theorem 55.7 now tells us that E is a Galois extension of L.

For each of the n! permutations σ in Sn, there is a σ′ ∈ Aut(E) by Lemma
59.16, and σ′ fixes f1, f2, ..., fn, as f1, f2, ..., fn are symmetric polynomials; so
σ′ fixes L = K(f1, f2, ..., fn) elementwise. This means σ′ ∈ Aut_L E. As σ′, τ′ ∈ Aut_L E are
distinct whenever σ, τ ∈ Sn are distinct, there are at least n! automorph-
isms in Aut_L E and |Aut_L E| ≥ n!. On the other hand, |Aut_L E| = |E:L| since E
is Galois over L, and |E:L| ≤ n! by Theorem 53.6 and Theorem 53.8. So
we have |Aut_L E| = n!. We know from Theorem 56.14 that Aut_L E is
isomorphic to a subgroup of Sn. In view of |Aut_L E| = n!, it must be
isomorphic to Sn.
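That each σ′ fixes f1, ..., fn can be seen concretely by evaluating the elementary symmetric polynomials at permuted arguments. A brute-force sketch with sample values (the helper name is ours):

```python
from itertools import combinations, permutations
from math import prod

def elem_sym(k, vals):
    # k-th elementary symmetric polynomial evaluated at vals
    return sum(prod(c) for c in combinations(vals, k))

vals = (2, 3, 5, 7)                   # sample values for x1, ..., x4
for sigma in permutations(vals):      # every permutation of the arguments
    for k in range(len(vals) + 1):
        # f_k is unchanged when the variables are permuted
        assert elem_sym(k, sigma) == elem_sym(k, vals)
```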

59.18 Theorem: Let K be a field and n ∈ ℕ. The Galois group of the
general polynomial of degree n over K is isomorphic to Sn.

Proof: Let a1, a2, ..., a_{n-1}, an be indeterminates over K, so that

g(x) = x^n − a1 x^{n-1} + a2 x^{n-2} − a3 x^{n-3} + ... + (−1)^{n-1} a_{n-1} x + (−1)^n an

is the general polynomial of degree n over K. We put L1 = K(a1, a2, ..., an).
Let E1 be a splitting field of g(x) over L1 and r1, r2, ..., rn ∈ E1 the
roots of g(x). Then E1 = L1(r1, r2, ..., rn) = K(a1, a2, ..., an, r1, r2, ..., rn)
= K(r1, r2, ..., rn) by Example 53.5(d). The Galois group of g(x) is Aut_{L1} E1.

Let x1, x2, ..., xn be n indeterminates over K which are distinct from the
a1, a2, ..., an. Let E = K(x1, x2, ..., xn), let f1, f2, ..., fn be the elementary
symmetric polynomials in K[x1, x2, ..., xn] and put L = K(f1, f2, ..., fn). We
know Aut_L E ≅ Sn from Theorem 59.17.

K(r1, r2, ..., rn) = E1        K(x1, x2, ..., xn) = E
        |                              |
K(a1, a2, ..., an) = L1  --φ1-->  K(f1, f2, ..., fn) = L
        |                              |
        K                              K

We show that there is a K-isomorphism φ1: L1 → L. First observe that we
have the substitution homomorphism φ: K[a1, a2, ..., an] → K[f1, f2, ..., fn] that
maps ai to fi and h(a1, a2, ..., an) to h(f1, f2, ..., fn). Clearly φ fixes all ele-
ments of K. Furthermore, φ is one-to-one, for if h1, h2 ∈ K[a1, a2, ..., an],
then h1 ≠ h2 implies h1φ = h1(f1, f2, ..., fn) ≠ h2(f1, f2, ..., fn) = h2φ by the
uniqueness assertion in the fundamental theorem on symmetric poly-
nomials (Theorem 38.4). Thus φ is a ring isomorphism from K[a1, a2, ..., an]
onto Im φ. Using Lemma 59.15, we extend φ to a field isomorphism φ1
from L1 = K(a1, a2, ..., an) onto the field of fractions of Im φ ⊆ L. Since L
= K(f1, f2, ..., fn) and {f1, f2, ..., fn} ⊆ Im φ, it follows that the field of
fractions of Im φ is equal to L and thus φ1 is onto L. Also φ1 fixes every
element of K. Hence φ1: L1 → L is a K-isomorphism.

The homomorphism φ1: L1[x] → L[x] of Lemma 33.7 maps

g(x) = x^n − a1 x^{n-1} + a2 x^{n-2} − a3 x^{n-3} + ... + (−1)^{n-1} a_{n-1} x + (−1)^n an

to

h(x) = x^n − f1 x^{n-1} + f2 x^{n-2} − f3 x^{n-3} + ... + (−1)^{n-1} f_{n-1} x + (−1)^n fn.

Here E1 is a splitting field of g(x) over L1 and E = L(x1, x2, ..., xn) is a split-
ting field of h(x) over L (Example 53.5(d)), so the isomorphism φ1: L1 → L
can be extended to an isomorphism ψ: E1 → E (Theorem 53.7). Lemma
56.11(1) and Theorem 59.17 now give Aut_{L1} E1 ≅ Aut_L E ≅ Sn. This
completes the proof.

59.19 Theorem (Abel): Let K be a field, n ∈ ℕ and g(x) the general
polynomial of degree n over K. If the equation g(x) = 0 is solvable by
radicals, then n ≤ 4. Conversely, if char K = 0 and n ≤ 4, then the equa-
tion g(x) = 0 is solvable by radicals.

Proof: The Galois group of g(x) is Sn (Theorem 59.18). If the equation
g(x) = 0 is solvable by radicals, then Sn is a solvable group (Theorem
59.11), so n ≤ 4 by Theorem 27.26. Conversely, if n ≤ 4 and char K =
0, then Sn is a solvable group (Example 27.10(a),(b), Theorem 27.25) and
the equation g(x) = 0 is solvable by radicals (Theorem 59.13).

Theorem 59.19 is a statement about general polynomials. It does not
state that specific polynomial equations of degree ≥ 5 cannot be
solvable by radicals.

* *

In this part, we examine solvability by radicals of polynomial equations
of prime degree. It is necessary to understand the solvable transitive
subgroups of Sp. These have a simple structure. After we give a
characterization of the solvable transitive subgroups of Sp, we prove the
curious result of Galois: "In order for an irreducible equation of prime
degree to be solvable by radicals, it is necessary and sufficient that once
any two of the roots are known the others can be deduced from them
rationally." (Edwards' translation.)

Let p be a prime number. It will be convenient to regard Sp as acting on
the p elements 1, 2, ..., p of ℤp. For any a ∈ ℤp* and b ∈ ℤp, we write

λ_{a,b}: ℤp → ℤp
u ↦ au + b.

Clearly λ_{a,b} ≠ λ_{c,d} whenever (a,b) ≠ (c,d). For any (a,b), (c,d) ∈ ℤp* × ℤp, we
have

u λ_{a,b} λ_{c,d} = (au + b) λ_{c,d} = c(au + b) + d = cau + cb + d
= (ac)u + (bc + d) = u λ_{ac, bc+d},

so λ_{a,b} λ_{c,d} = λ_{ac, bc+d}. So A(p) := {λ_{a,b} : a ∈ ℤp*, b ∈ ℤp} is closed under the
composition of mappings. Observe that λ_{1,0} ∈ A(p) is the identity
mapping and hence λ_{1/a, -b/a} ∈ A(p) is the inverse of λ_{a,b}. As the
composition of mappings is associative, A(p) is a group. In particular,
each λ_{a,b} is one-to-one and onto, and can be considered as a permutation
in Sp. Thus we shall regard A(p) as a subgroup of Sp. Then

=  1 2 . . . p 
a,b  a+b a2+b . . . ap+b ,

where the integers ought to be interpreted modulo p. The permutation


0,1
= (12. . . p) of order p will be denoted as .
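The composition rule and the order of A(p) are easy to verify by brute force for a small prime (p = 5 below; the function names are ours):

```python
p = 5

def lam(a, b):
    # the linear map lambda_{a,b}: u -> a*u + b on Z_p, stored as a tuple
    return tuple((a * u + b) % p for u in range(p))

def compose(f, g):
    # apply f first, then g (matching the postfix convention of the text)
    return tuple(g[f[u]] for u in range(p))

A = {lam(a, b) for a in range(1, p) for b in range(p)}
assert len(A) == p * (p - 1)          # |A(p)| = p(p - 1)

# the rule lambda_{a,b} lambda_{c,d} = lambda_{ac, bc+d}, checked exhaustively
for a in range(1, p):
    for b in range(p):
        for c in range(1, p):
            for d in range(p):
                lhs = compose(lam(a, b), lam(c, d))
                rhs = lam(a * c % p, (b * c + d) % p)
                assert lhs == rhs
```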

59.20 Definition: Let p be a prime number and π ∈ Sp. If there are
elements a ∈ ℤp* and b ∈ ℤp such that

π = ( 1      2      ...  p
      a+b   2a+b   ...  pa+b ),

then π is called a linear permutation in Sp. In this case, we shall denote
the permutation π as λ_{a,b}. Then A(p) = {λ_{a,b} : a ∈ ℤp*, b ∈ ℤp} is a subgroup
of Sp and is called the one dimensional affine group over ℤp. If σ = (1 2 ... p)
and ⟨σ⟩ ≤ G ≤ A(p), then G is called a linear subgroup of Sp.

59.21 Lemma: Let p be a prime number, σ = (1 2 ... p) ∈ Sp and let H be
a subgroup of Sp.
(1) If H is a linear subgroup of Sp, then the only elements of order p in H
are σ, σ^2, σ^3, ..., σ^{p-1}.
(2) If σ, σ^2, σ^3, ..., σ^{p-1} are the only elements of order p in H, then ⟨σ⟩ is a
characteristic and normal subgroup of H.
(3) If ⟨σ⟩ is a normal subgroup of H, then H is a linear subgroup of Sp.
(4) If H is a linear subgroup of Sp and H ⊴ K ≤ Sp, then K is a linear
subgroup of Sp.

Proof: (1) Assume H is a linear subgroup of Sp and let π ∈ H. Then π =
λ_{a,b} for suitable a, b ∈ ℤp, a ≠ 0.

If a = 1, then uπ = u + b = uσ^b for any u ∈ {1, 2, ..., p}, so π = σ^b, and
o(σ^b) = 1 in case b = 0 and o(σ^b) = p in case b = 1, 2, ..., p − 1. Thus the only
elements λ_{1,b} in H satisfying o(λ_{1,b}) = p are σ, σ^2, σ^3, ..., σ^{p-1}.

To complete the proof, we show that a ≠ 1 implies o(λ_{a,b}) ≠ p. If a ≠ 1 and
π = λ_{a,b}, then

uπ^2 = a(au + b) + b = a^2 u + (a + 1)b,
uπ^3 = a(a^2 u + (a + 1)b) + b = a^3 u + (a^2 + a + 1)b,

and similarly uπ^n = a^n u + (a^{n-1} + a^{n-2} + ... + a^2 + a + 1)b
for any n ∈ ℕ. As a − 1 ≠ 0 in ℤp, we can write

uπ^n = a^n u + ((a^n − 1)/(a − 1)) b,

from which we read that π^n = ι if and only if a^n = 1; so o(π) = o(a), the
order of a in the multiplicative group ℤp*. But ℤp* has order p − 1 and, by
Lagrange's theorem, there is no a in ℤp* with o(a) = p. Thus λ_{a,b} cannot be
of order p if a ≠ 1.

(2) By hypothesis, σ ∈ H. Let α ∈ Aut(H). Then σα is an element of
order p in H, so σα = σ^a for some a ∈ {1, 2, ..., p − 1}. Hence ⟨σ⟩α = ⟨σ^a⟩ =
⟨σ⟩, and ⟨σ⟩ is characteristic and therefore also normal in H.

(3) By hypothesis, ⟨σ⟩ ⊴ H and σ ∈ H. We must prove H ≤ A(p). Let π ∈ H. Then
π^{-1}σπ ∈ ⟨σ⟩ and π^{-1}σπ = σ^a for some a ∈ {1, 2, ..., p − 1}. So σπ = πσ^a and

(t + 1)π = tσπ = tπσ^a = tπ + a, for any t ∈ {1, 2, ..., p − 1, p}.

Then (t + 2)π = (t + 1)π + a = tπ + 2a,
(t + 3)π = (t + 2)π + a = tπ + 3a,

and similarly (t + u)π = tπ + ua for all t, u ∈ {1, 2, ..., p − 1, p}. Putting t = p
(so that t ≡ 0 mod p) and writing b = pπ, we get uπ = pπ + ua = au + b for any
u = 1, 2, ..., p − 1, p. There-
fore π = λ_{a,b} ∈ A(p). This proves H ≤ A(p), and, since σ ∈ H, H is a linear
subgroup of Sp.

(4) Assume now H is a linear subgroup of Sp and H ⊴ K ≤ Sp. We must
prove K is a linear subgroup of Sp. Now ⟨σ⟩ is a characteristic subgroup of H
by parts (1) and (2), and ⟨σ⟩ is thus a normal subgroup of K by Lemma 23.15,
so K is a linear subgroup of Sp by part (3).
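Part (1) of the lemma can be confirmed exhaustively inside A(p) for a small prime: the elements of order p are precisely the nontrivial translations λ_{1,b}, i.e. the powers of σ. A sketch with p = 7 (helper names ours):

```python
p = 7

def lam(a, b):
    # lambda_{a,b}: u -> a*u + b on Z_p, as a tuple
    return tuple((a * u + b) % p for u in range(p))

def compose(f, g):
    return tuple(g[f[u]] for u in range(p))

def order(f):
    # order of f in the symmetric group
    g, k = f, 1
    while g != tuple(range(p)):
        g, k = compose(g, f), k + 1
    return k

A = [lam(a, b) for a in range(1, p) for b in range(p)]
of_order_p = {f for f in A if order(f) == p}
powers_of_sigma = {lam(1, b) for b in range(1, p)}   # sigma^b = lambda_{1,b}, b != 0

assert of_order_p == powers_of_sigma                 # Lemma 59.21(1), for A(7)
```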

59.22 Lemma: Let p be a prime number and K a transitive subgroup
of Sp. If 1 ≠ H ⊴ K, then H is also transitive.

Proof: Let i, j ∈ {1, 2, ..., p}. We claim that the number of elements in the
H-orbit of i is equal to the number of elements in the H-orbit of j.
Indeed, since K is transitive, there is a π ∈ K with iπ = j, and

|H-orbit of i| = |H : Stab_H(i)| = |H : Stab_K(i) ∩ H| = |H^π : (Stab_K(i) ∩ H)^π|
= |H : Stab_K(i)^π ∩ H| = |H : Stab_K(iπ) ∩ H|
= |H : Stab_K(j) ∩ H| = |H : Stab_H(j)| = |H-orbit of j|

in view of Lemma 25.10 and Lemma 25.8 (here H^π = H because H ⊴ K). Thus
all orbits of H have the same number of elements, say m. If k is the number of
H-orbits, then {1, 2, ..., p} is partitioned into k subsets each of which has m
elements. Thus p = mk and k = p or k = 1. If k = p were true, i.e., if there
were p H-orbits, the H-orbits would consist of single elements and we
would get uη = u for any u ∈ {1, 2, ..., p}, η ∈ H. This would give H = 1,
contrary to the hypothesis. Hence k = 1 and H is transitive.

59.23 Lemma: Let p be a prime number and G ≤ Sp. Then G is
transitive if and only if p divides the order of G.

Proof: If p G , there is an element of G with o( ) = p. Then is a cycle


of length p, say (a1a2. . . ap). Then any ai is mapped to any aj by j i 1 G
and so G is transitive. Conversely, if G is transitive, there is, for each a =

779
1,2, . . . ,p, a permutation a
with 1 a
= a and we have the coset
decomposi-tion

p
G= [StabG(1)] a ,
i=1

whence G = StabG(1) p is divisible by p.
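Lemma 59.23 can be checked by brute force for a fixed prime. Below is a hedged Python illustration of ours (the names `generate` and `is_transitive` are not the book's): the subgroup generated by a 5-cycle has order divisible by 5 and is transitive, while a subgroup of S5 of order 4 is not.

```python
def compose(f, g):
    # Right action: t(fg) = (tf)g.
    return tuple(g[f[t]] for t in range(len(f)))

def generate(gens, n):
    # Subgroup generated by a set of permutations of {0, ..., n-1}:
    # close the set under composition (finite, so this terminates).
    group = {tuple(range(n))} | set(gens)
    while True:
        new = {compose(f, g) for f in group for g in group} - group
        if not new:
            return group
        group |= new

def is_transitive(group):
    n = len(next(iter(group)))
    return all(any(f[0] == j for f in group) for j in range(n))

# p = 5.  The subgroup generated by the 5-cycle (0 1 2 3 4) has order 5
# (divisible by p) and is transitive.
C5 = generate({(1, 2, 3, 4, 0)}, 5)
assert len(C5) == 5 and is_transitive(C5)

# A subgroup of order 4, generated by the disjoint transpositions
# (0 1) and (2 3), is not transitive: 5 does not divide 4.
V = generate({(1, 0, 2, 3, 4), (0, 1, 3, 2, 4)}, 5)
assert len(V) == 4 and not is_transitive(V)
```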

We can now find all solvable transitive subgroups of Sp. Basically, we use
Lemma 59.21 and Lemma 59.22 to go downwards and upwards along a
composition series of such a subgroup.

59.24 Theorem: Let p be a prime number and G ≤ Sp. Then G is a solv-
able transitive subgroup of Sp if and only if G is conjugate to a linear
subgroup of Sp.

Proof: Let G be a solvable transitive subgroup of Sp. Consider a
composition series of G, say

1 = H0 ⊴ H1 ⊴ H2 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G.

The composition factors Hi/Hi−1 are cyclic of prime order by Theorem
27.18. Since Hm is transitive, Hm−1 is also transitive by Lemma 59.22, and
then Hm−2 is transitive, then Hm−3 transitive and so on. In this way, we see
that H1 is transitive. Then p divides |H1| by Lemma 59.23 and we get |H1|
= p. So H1 is a cyclic group generated by a cycle (a1a2. . . ap). Replacing G
by a conjugate of G, we may assume H1 = ⟨σ⟩ = ⟨(12. . . p)⟩. Now Lemma
59.21(4) shows that H2 is a linear subgroup of Sp, so H3 is also a linear
subgroup of Sp, so H4 is also a linear subgroup of Sp and so on. In this
way, we conclude Hm = G is a linear subgroup of Sp.

Conversely, let G be a linear subgroup of Sp. Then ⟨σ⟩ is a subgroup of G
and so p divides |G| and Lemma 59.23 shows G is a transitive subgroup
of Sp. Now we have to prove G is solvable. As G ⊆ A(p), it will be
sufficient to prove that A(p) is solvable. In view of the multiplication
rule π_{a,b}π_{c,d} = π_{ac,bc+d}, the mapping

φ: A(p) → ℤp*,   π_{a,b} ↦ a

is a homomorphism onto the multiplicative group ℤp* of nonzero residues
mod p, with Ker φ = {π_{1,b} ∈ A(p): b ∈ ℤp} = ⟨σ⟩. Thus
⟨σ⟩ ⊴ A(p) and A(p)/⟨σ⟩ = A(p)/Ker φ ≅ Im φ = ℤp* is abelian. Then

1 ⊴ ⟨σ⟩ ⊴ A(p)

is an abelian series of A(p) and hence A(p) is solvable.
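The abelian series 1 ⊴ ⟨σ⟩ ⊴ A(p) can be verified numerically for a fixed prime by computing the derived series of A(p). The sketch below is our own illustration (helper names `derived`, `generate` are not the book's): for p = 5 the derived subgroup of A(5) is ⟨σ⟩, of order 5, and the second derived subgroup is trivial, so A(5) is solvable.

```python
def compose(f, g):
    return tuple(g[f[t]] for t in range(len(f)))

def inverse(f):
    inv = [0] * len(f)
    for t, ft in enumerate(f):
        inv[ft] = t
    return tuple(inv)

def generate(gens, n):
    group = {tuple(range(n))} | set(gens)
    while True:
        new = {compose(f, g) for f in group for g in group} - group
        if not new:
            return group
        group |= new

def derived(group, n):
    # Subgroup generated by all commutators [f, g] = f^-1 g^-1 f g.
    comms = {compose(compose(inverse(f), inverse(g)), compose(f, g))
             for f in group for g in group}
    return generate(comms, n)

p = 5
A = generate({tuple((a * t + b) % p for t in range(p))
              for a in range(1, p) for b in range(p)}, p)
assert len(A) == p * (p - 1)

A1 = derived(A, p)   # the translations t -> t + b, i.e. <sigma>, order p
A2 = derived(A1, p)  # trivial, since <sigma> is abelian
assert len(A1) == p and len(A2) == 1
```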

We give another group theoretical characterization of solvable transitive
subgroups of Sp. This will be translated into Galois' characterization of
polynomial equations of prime degree which are solvable by radicals.

59.25 Theorem: Let p be a prime number and G a transitive subgroup
of Sp. Then G is solvable if and only if the identity ι is the only permutation
in G that fixes two numbers from {1,2, . . . ,p}, i.e., if and only if

Stab_G(i) ∩ Stab_G(j) = 1

for any two distinct i,j from {1,2, . . . ,p}.

Proof: Suppose first that G is a solvable transitive subgroup of Sp. Then
G is conjugate to a subgroup of A(p), say G = H^τ, where H ⊆ A(p)
and τ ∈ Sp (Theorem 59.24). If i,j are distinct numbers from {1,2, . . . ,p}
and α ∈ G fixes both i and j, then α^(τ⁻¹) ∈ (H^τ)^(τ⁻¹) = H ⊆ A(p) fixes both
iτ⁻¹ and jτ⁻¹. But, aside from the identity, there is no permutation in A(p)
that fixes two distinct numbers from {1,2, . . . ,p}. Thus α^(τ⁻¹) = ι and α = ι. So
the identity permutation is the only permutation in G that fixes two
numbers in {1,2, . . . ,p}.

Now suppose conversely that G is a transitive subgroup of Sp with the
property that the identity is the only permutation in G that fixes two
numbers in {1,2, . . . ,p}. Let i,j be two distinct numbers in {1,2, . . . ,p} and
write

H = Stab_Sp(i) ∩ Stab_Sp(j) = {α ∈ Sp : iα = i and jα = j}.

The hypothesis gives H ∩ G = 1. If α1,α2 ∈ G and α1,α2 belong to the same
right coset of H in Sp, then α1α2⁻¹ belongs to H ∩ G = 1, so α1 = α2. Thus
there is at most one element of G in each right coset of H in Sp. So |G| is
less than or equal to the number |Sp:H| of right cosets of H in Sp and, as H
is isomorphic to Sp−2, we have |G| ≤ |Sp:H| = |Sp|/|H| = p!/(p − 2)! = p(p − 1).
Lemma 59.23 yields p divides |G|, so there is an element σ´ = (a1a2. . . ap)
of order p in G. If we write

τ = ( a1 a2 . . . ap )
    (  1  2 . . .  p )  ∈ Sp,

then (12. . . p) = σ = σ´^τ = τ⁻¹σ´τ is an element of order p in G^τ. Aside
from the powers of σ, there is no permutation of order p in G^τ, for if
ρ ∈ G^τ had order p and ρ ∉ ⟨σ⟩, then ⟨σ⟩ ∩ ⟨ρ⟩ = 1, so |⟨σ⟩⟨ρ⟩| =
|⟨σ⟩||⟨ρ⟩| = p² (Lemma 19.6) and there would be at
least p² distinct elements in G^τ, whereas |G^τ| = |G| is at most p² − p. So
⟨σ⟩ is a normal subgroup of G^τ by Lemma 59.21(2) and G^τ is a linear
subgroup of Sp by Lemma 59.21(3). Hence G is conjugate to a linear
subgroup of Sp and G is solvable by Theorem 59.24.
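The two-point-stabilizer criterion of Theorem 59.25 can be watched in action for p = 5. The Python check below is our own illustration (the name `fixes_two` is not the book's): in the solvable transitive group A(5) only the identity fixes two points, whereas S5, which is transitive but not solvable, contains non-identity permutations fixing two (indeed three) points.

```python
from itertools import permutations

def fixes_two(f):
    # Does the permutation f fix at least two points?
    return sum(1 for t, ft in enumerate(f) if ft == t) >= 2

p = 5
identity = tuple(range(p))

# In A(5), a*t + b = t for two distinct t forces a = 1 and b = 0:
A = {tuple((a * t + b) % p for t in range(p))
     for a in range(1, p) for b in range(p)}
assert all(f == identity for f in A if fixes_two(f))

# In S5 the criterion fails: the transposition (0 1) fixes 2, 3 and 4.
S5 = set(permutations(range(p)))
assert any(f != identity and fixes_two(f) for f in S5)
```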

For the sake of completeness, we prove Galois' theorem stating that a


polynomial equation of prime degree is solvable by radicals if and only
if "all roots can be expressed rationally in terms of any two of them."

59.26 Theorem: Let K be a field of characteristic 0 and let f(x) be an
irreducible polynomial of prime degree p in K[x]. The equation f(x) = 0 is
solvable by radicals if and only if, for any two distinct roots a,b of f(x),
K(a,b) is a splitting field of f(x) over K.

Proof: Let E be a splitting field of f(x) over K and G the Galois group of
f(x). Then E is a Galois extension of K (Theorem 55.7) and G is a
transitive subgroup of Sp (Theorem 56.17). Let a,b be two distinct roots
of f(x) and let J = K(a,b)´ be the subgroup of G corresponding to it.

        E            1
        |            |
     K(a,b)          J
        |            |
        K            G

J is the subgroup of G consisting precisely of the permutations of the
roots fixing a and b. Now we have the equivalences

E = K(a,b) ⟺ J = 1
           ⟺ 1 is the only permutation in G fixing a and b
           ⟺ G is a solvable subgroup of Sp
           ⟺ the equation f(x) = 0 is solvable by radicals.

This completes the proof.

* *

In this part, we give algebraic formulas for the roots of polynomials of


degree two, three and four. For the sake of generality, we assume the
coefficients are indeterminates, but, as will be clear from the arguments,
the formulas are valid if the coefficients are taken from the base field.

59.27 Theorem: Let K be a field with char K ≠ 2 and let a,b be
indeterminates over K so that

g(x) = x² − ax + b

is the general polynomial of degree two over K. Then the roots r1,r2 of
g(x) are given by

r1 = (a + √D)/2,    r2 = (a − √D)/2,

where D = a² − 4b.

Proof: The discriminant D of g(x) is (r1 − r2)² = (r1 + r2)² − 4r1r2 = a² − 4b.
Hence r1 + r2 = a and r1 − r2 = √D. Solving this system of linear equations
for r1,r2, we find

r1 = (a + √D)/2,    r2 = (a − √D)/2.

In particular, the primitive cube roots of unity, which are the roots of the
polynomial x² + x + 1, are given by ω1 = (−1 + √−3)/2, ω2 = (−1 − √−3)/2. This
will be used in the next theorem.
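A quick numerical check of Theorem 59.27 (our own Python illustration, not part of the book; note the sign convention, with a the sum and b the product of the roots):

```python
import cmath

def quadratic_roots(a, b):
    # Roots of g(x) = x^2 - a*x + b, as in Theorem 59.27.
    sqrt_D = cmath.sqrt(a * a - 4 * b)
    return (a + sqrt_D) / 2, (a - sqrt_D) / 2

# x^2 - 7x + 10 = (x - 2)(x - 5)
r1, r2 = quadratic_roots(7, 10)
assert {round(r1.real), round(r2.real)} == {2, 5}

# The primitive cube roots of unity solve x^2 + x + 1 = 0,
# i.e. x^2 - (-1)x + 1 in the notation above.
w1, w2 = quadratic_roots(-1, 1)
assert abs(w1 ** 3 - 1) < 1e-12 and abs(w1 - 1) > 1
```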

59.28 Theorem: Let K be a field with char K ≠ 2,3 and assume that K
contains a primitive cube root of unity, say ω. Let a,b,c be distinct
indeterminates over K and let

g(x) = x³ − ax² + bx − c

be the general cubic polynomial over K. Then the roots r1,r2,r3 of g(x) are
given by

r1 = (1/3)(a + u + v),   r2 = (1/3)(a + ω²u + ωv),   r3 = (1/3)(a + ωu + ω²v),

where

u = ∛(a³ − (9/2)ab + (27/2)c + (3/2)√(−3D)),

v = ∛(a³ − (9/2)ab + (27/2)c − (3/2)√(−3D))

are such that uv = a² − 3b, and D = (4/27)(a² − 3b)³ − (1/27)(2a³ − 9ab + 27c)².

Proof: Let E = K(r1,r2,r3) be a splitting field of g(x) over L = K(a,b,c). Then
E is a Galois extension of L and the Galois group Aut_L E of g(x) is S3
(Theorem 59.17). Since S3 is transitive, g(x) is irreducible over L, and
char K ≠ 3 implies that the derivative of g(x) is not zero, so g(x) has no
common root with its derivative and the roots r1,r2,r3 are distinct. Under
the Galois correspondence, the alternating group A3 corresponds to the
subfield L(δ) of E, where δ = (r1 − r2)(r1 − r3)(r2 − r3) is the square root of
the discriminant of g(x) (Theorem 56.18, Theorem 56.19).

Let D be the discriminant of g(x). We evaluate D. Observe that r1 − (a/3),
r2 − (a/3), r3 − (a/3) are the roots of g(x + (a/3)), so the root differences
and the discriminant of g(x) are the same as those of g(x + (a/3)). One
obtains easily g(x + (a/3)) = x³ + px + q, where p = (3b − a²)/3 and
q = (−2a³ + 9ab − 27c)/27. Then D = −4p³ − 27q² is computed to be

(4/27)(a² − 3b)³ − (1/27)(2a³ − 9ab + 27c)²

(Example 56.10(b)).

        E                1
        |                |
  L(δ) = L(√D)          A3
        |  2             |
        L               S3

Now E is a cyclic extension of L(√D), because E is Galois over L(√D)
(Theorem 54.25(1)) and its Galois group A3 is cyclic of order 3. Thus E is
obtained by adjoining a root u of a polynomial x³ − h ∈ L(√D)[x] to L(√D)
(Theorem 57.11). A generator σ of A3 maps as follows:

r1 ↦ r2,   r2 ↦ r3,   r3 ↦ r1;
u ↦ ω²u,   ω²u ↦ ωu,   ωu ↦ u.

An examination of the proof of Theorem 57.11 reveals that u should be
taken as a nonzero element of the form d + ω(dσ) + ω²(dσ²),
with d ∈ E. So we must find a d ∈ E such that d + ω(dσ) + ω²(dσ²) ≠ 0. We
choose d = r1. So let u = r1 + ωr2 + ω²r3. Similarly we put v = r1 + ω²r2 + ωr3.

We already know u³ ∈ L(√D). We now evaluate it. We have

u³ = (r1 + ωr2 + ω²r3)³ = r1³ + r2³ + r3³ + 3ωA + 3ω²B + 6r1r2r3,        (1)

where we put A = r1²r2 + r2²r3 + r3²r1 and B = r1r2² + r2r3² + r3r1² for
shortness. The method of §38 gives

r1³ + r2³ + r3³ = (r1 + r2 + r3)³ − 3(A + B) − 6r1r2r3,

and  A + B = r1²r2 + r2²r3 + r3²r1 + r1r2² + r2r3² + r3r1²
           = (r1 + r2 + r3)(r1r2 + r1r3 + r2r3) − 3r1r2r3 = ab − 3c,

     A − B = r1²r2 + r2²r3 + r3²r1 − r1r2² − r2r3² − r3r1²
           = (r1 − r2)(r1 − r3)(r2 − r3) = √D,

so (1) becomes

u³ = [(r1 + r2 + r3)³ − 3((ab − 3c + √D)/2 + (ab − 3c − √D)/2) − 6r1r2r3]
     + 3((−1 + √−3)/2)((ab − 3c + √D)/2) + 3((−1 − √−3)/2)((ab − 3c − √D)/2)
     + 6r1r2r3

   = a³ − (9/2)ab + (27/2)c + (3/2)√(−3D)

and a similar calculation yields

v³ = a³ − (9/2)ab + (27/2)c − (3/2)√(−3D).

So u and v are cube roots of the expressions found above. But there are
three cube roots of these expressions, and we must decide which cube
roots we should take. This is found from

uv = (r1 + ωr2 + ω²r3)(r1 + ω²r2 + ωr3) = r1² + r2² + r3² − r1r2 − r1r3 − r2r3
   = a² − 3b.

The cube roots must therefore be so chosen that their product will be
equal to a² − 3b. If

u = ∛(a³ − (9/2)ab + (27/2)c + (3/2)√(−3D)),

v = ∛(a³ − (9/2)ab + (27/2)c − (3/2)√(−3D))

denote cube roots with this property, then, solving the equations

a = r1 + r2 + r3
u = r1 + ωr2 + ω²r3
v = r1 + ω²r2 + ωr3

for r1,r2,r3, we get

r1 = (1/3)(a + u + v),   r2 = (1/3)(a + ω²u + ωv),   r3 = (1/3)(a + ωu + ω²v),

as was to be proved.
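The formulas of Theorem 59.28 can be exercised numerically. The Python sketch below is our own illustration (the function name `cubic_roots` is not the book's): it picks an arbitrary cube root for u and recovers v from the side condition uv = a² − 3b; a different choice of cube root only permutes the three roots, so the computed set of roots is well defined.

```python
import cmath

def cubic_roots(a, b, c):
    # Roots of g(x) = x^3 - a*x^2 + b*x - c by the formulas of Theorem 59.28.
    w = (-1 + cmath.sqrt(-3)) / 2            # a primitive cube root of unity
    D = (4 * (a * a - 3 * b) ** 3 - (2 * a ** 3 - 9 * a * b + 27 * c) ** 2) / 27
    u3 = a ** 3 - 4.5 * a * b + 13.5 * c + 1.5 * cmath.sqrt(-3 * D)
    u = u3 ** (1 / 3)                        # any cube root will do
    v = (a * a - 3 * b) / u                  # enforces u*v = a^2 - 3b (u != 0 here)
    return [(a + u + v) / 3,
            (a + w * w * u + w * v) / 3,
            (a + w * u + w * w * v) / 3]

# x^3 - 7x^2 + 14x - 8 = (x - 1)(x - 2)(x - 4)
roots = cubic_roots(7, 14, 8)
assert sorted(round(r.real) for r in roots) == [1, 2, 4]
assert all(abs(r.imag) < 1e-9 for r in roots)
```

Note the degenerate case u = 0 (which occurs when u³ = 0) would need separate handling; the sketch assumes u ≠ 0.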

A remarkable fact is that, if an irreducible cubic f(x) ∈ ℚ[x] has three real
roots, then the roots of f(x) cannot be expressed in terms of real radicals.
We want to discuss this matter. We need an elementary lemma.

59.29 Lemma: Let K be a field, a ∈ K and p a prime number. Assume
char K ≠ p. If x^p − a ∈ K[x] is reducible in K[x], then a = c^p for some c ∈ K.

Proof: In a splitting field of x^p − a over K, we have the decomposition

x^p − a = ∏ (x − ζᵏu)   (k = 0,1, . . . ,p − 1),

where u is a root of x^p − a and ζ is a primitive p-th root of unity. If x^p − a
is reducible in K[x] and f(x) ∈ K[x] is a factor of x^p − a with 1 ≤ deg f(x) <
p, then f(x) is a product of some of the x − ζᵏu, and the constant term
(−1)ʰb0 of f(x) is (−1)ʰζᵐuʰ for some m ∈ ℤ, where h = deg f(x). So b0 =
εuʰ for some p-th root of unity ε, so b0^p = u^(ph) = aʰ and, since (h,p) = 1,
there are integers k,n satisfying kh + np = 1. Thus a = a^(kh)a^(np) = (b0^p)ᵏa^(np) =
(b0ᵏaⁿ)^p and b0ᵏaⁿ ∈ K.

59.30 Lemma: Let K be a subfield of ℝ and f(x) an irreducible cubic
polynomial in K[x]. Let S be a splitting field of f(x) over K. If f(x) has
three distinct real roots, then there is no radical extension R of K such
that S ⊆ R ⊆ ℝ.

Proof: Let r1,r2,r3 be the roots and D = (r1 − r2)²(r1 − r3)²(r2 − r3)² the dis-
criminant of f(x). Then D is a positive real number. Put K1 = K(√D) ⊆ ℝ.
Clearly K1 is a subfield of S. We may assume that f(x) is monic.

Suppose, by way of contradiction, there is a radical extension R of K with
S ⊆ R ⊆ ℝ. Then RK1 is a radical extension of K1 (Lemma 59.4). So there is
a finite chain of fields

K1 ⊆ K2 ⊆ . . . ⊆ Kn−1 ⊆ Kn = RK1

such that Ki = Ki−1(ui) for some root ui of a polynomial of the form x^mi − ai
in Ki−1[x] (i = 2,3,. . . ,n). We may assume the mi are prime numbers. Moreover,
after deleting redundant fields, we may assume ui ∉ Ki−1. Thus we
assume the mi are prime, ui^mi ∈ Ki−1 and ui ∉ Ki−1. Then x^mi − ui^mi ∈ Ki−1[x] is
irreducible in Ki−1[x], for otherwise we had ui^mi = c^mi for some c ∈ Ki−1
(Lemma 59.29) and ui/c, which is distinct from 1 in view of ui ∉ Ki−1,
would be an mi-th root of unity distinct from 1: for mi = 2 this gives
ui = −c ∈ Ki−1, and for odd mi it makes ui/c a complex number with
nonzero imaginary part; either way a contradiction, since ui/c ∈ ℝ and
ui ∉ Ki−1. Therefore
x^mi − ui^mi ∈ Ki−1[x] is irreducible over Ki−1 and is in fact the minimal poly-
nomial of ui ∈ Ki over Ki−1. This gives |Ki:Ki−1| = mi.

f(x) is irreducible in K1[x], for f(x) is the minimal polynomial of any of its
roots over K and if r1, say, were in K1, then |K1:K| ≤ 2 = |K1:K(r1)||K(r1):K|·?
would be divisible by |K(r1):K| = deg f(x) = 3, which is nonsense. Now S is
a splitting field of f(x) over K1 (Example 53.5(e)) and since √D ∈ K1, the
Galois group Aut_K1 S is isomorphic to A3 (Theorem 56.21).

On the other hand, the roots of f(x) are in S ⊆ R ⊆ RK1 and f(x) is
reducible over RK1 = Kn. Let Ki be the first field in the chain above where f(x)
becomes reducible, that is to say, let i ∈ {2, . . . ,n} be such that f(x) is
irreducible over Ki−1 and reducible over Ki = Ki−1(ui). Then, f(x) being a
cubic, there is a root
of f(x) in Ki, say r1 ∈ Ki and, as above, f(x) is the minimal polynomial of r1
over Ki−1, so the prime number mi = |Ki:Ki−1| = |Ki:Ki−1(r1)||Ki−1(r1):Ki−1| is
divisible by |Ki−1(r1):Ki−1| = deg f(x) = 3 and so mi = 3. Thus Ki is an
extension of Ki−1 containing the root r1 of f(x).

Let N be a splitting field of f(x) over Ki−1. Then N/Ki−1 is a Galois exten-
sion (Theorem 55.7) and since √D ∈ Ki−1, the Galois group Aut_Ki−1 N is iso-
morphic to A3 (Theorem 56.21). So |N:Ki−1| = |Aut_Ki−1 N| = 3. From r1 ∉ Ki−1
and r1 ∈ N ∩ Ki, we get Ki−1 ⊂ N ∩ Ki ⊆ N and degree considerations force
N ∩ Ki = N, so N ⊆ Ki and, as |N:Ki−1| = 3 = |Ki:Ki−1|, we deduce N = Ki.

Theorem 55.10 now yields that Ki is normal over Ki−1 and since the irre-
ducible polynomial x^mi − ai in Ki−1[x] has a root ui in Ki, the other roots ωui
and ω²ui of x^mi − ai (ω a primitive cube root of unity) are in Ki, so
ω = (ωui)/ui ∈ Ki. This contradicts Ki ⊆ ℝ.

Thus there can be no radical extension R of K such that S ⊆ R ⊆ ℝ.

59.31 Theorem: Let K be a field with char K ≠ 2,3 and assume that K
contains a primitive cube root of unity. Let a,b,c,d be distinct
indeterminates over K and let

g(x) = x⁴ − ax³ + bx² − cx + d

be the general polynomial of degree four over K. Then the roots r1,r2,r3,r4
of g(x) are given by

r1 = (1/4)(a + √u + √v + √y)

r2 = (1/4)(a + √u − √v − √y)

r3 = (1/4)(a − √u + √v − √y)

r4 = (1/4)(a − √u − √v + √y),

where u = a² − 4b + 4α,  v = a² − 4b + 4β,  y = a² − 4b + 4γ,

α, β, γ are the roots of x³ − bx² + (ac − 4d)x − (a²d − 4bd + c²) and the

square roots are subject to the condition

√u √v √y = a³ − 4ab + 8c.

Proof: Let r1,r2,r3,r4 be the roots of g(x) and let α = r1r2 + r3r4,
β = r1r3 + r2r4, γ = r1r4 + r2r3. Then α,β,γ are the roots of the resolvent
cubic

x³ − bx² + (ac − 4d)x − (a²d − 4bd + c²)

and the Galois group of g(x), regarded as a polynomial over K(α,β,γ), is V4
(Theorem 56.23, Theorem 59.17, Lemma 56.25). We can solve for α,β,γ
in terms of radicals by the method of Theorem 59.28. As |V4| = 4, we can
find the roots r1,r2,r3,r4 by introducing two square roots. For this
purpose, we put

u = (r1 + r2 − r3 − r4)²
v = (r1 − r2 + r3 − r4)²
y = (r1 − r2 − r3 + r4)².

An easy computation gives

u = a² − 4b + 4α;   v = a² − 4b + 4β;   y = a² − 4b + 4γ

and (r1 + r2 − r3 − r4)(r1 − r2 + r3 − r4)(r1 − r2 − r3 + r4) is a symmetric poly-
nomial in the roots r1,r2,r3,r4, found easily to be a³ − 4ab + 8c. Hence we
have

a = r1 + r2 + r3 + r4
√u = r1 + r2 − r3 − r4
√v = r1 − r2 + r3 − r4
√y = r1 − r2 − r3 + r4,

provided we choose the square roots in such a way that √u √v √y
= a³ − 4ab + 8c. Solving this system of linear equations, we find

r1 = (1/4)(a + √u + √v + √y),   r2 = (1/4)(a + √u − √v − √y),

r3 = (1/4)(a − √u + √v − √y),   r4 = (1/4)(a − √u − √v + √y).

* *

In this part, we settle some famous problems.

A real number a will be called constructible if it is possible to draw a


line segment of length a using ruler and compass only in a finite
number of steps. Thus "constructible" means "constructible by ruler and
compass". Similarly, "to draw" will mean "to draw using ruler and
compass only". Each step in a ruler and compass construction is one of
the following types:
(i) finding the intersection point of two straight lines;
(ii) finding the intersection points of two circles;
(iii) finding the intersection points of a straight line and a circle.

From elementary geometry, it is known that, for any given line l and a
given point P, we can draw a line through P parallel to l and also a line
through P perpendicular to l.

We draw two perpendicular lines and regard them as coordinate axes.
Then we fix a unit length. Then we can draw line segments on the
coordinate axes with integral length. Since we can draw lines parallel
and/or perpendicular to the axes, we can locate all points in the
Euclidean plane with integer coordinates as intersections of lines parallel
to the axes.

After the introduction of a coordinate system on the plane, we see that a
real number a ≠ 0 is constructible if and only if the line segment
[0,a] × {0} or {0} × [0,a] is constructible. (Closed intervals. Here [0,a] is to
be read as [a,0] when a < 0.)

Assume a and b are constructible. Then a + b and a − b are constructible,
too. In addition, we can draw the line through (a,0) parallel to the line
segment joining (0,b) and (1,0), which intersects the y-axis at (0,ab).
Also, if b ≠ 0, we can draw the line through (0,1) parallel to the line
segment joining (0,b) and (a,0), which intersects the x-axis at (a/b,0).
Thus a + b, a − b, ab and a/b are constructible whenever a and b are
constructible (b ≠ 0 in case of division). Thus the constructible real
numbers form a subfield of ℝ. In particular, all rational numbers are
constructible, for ℚ is the prime subfield of the field of constructible real
numbers.
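The similar-triangle constructions of ab and a/b can be checked with coordinates. The Python sketch below is our own illustration (the function names are hypothetical): it computes the intercepts of the parallel lines described above and confirms they land at ab on the y-axis and a/b on the x-axis.

```python
def construct_product(a, b):
    # Slope of the segment from (0, b) to (1, 0) is -b; the parallel line
    # through (a, 0) is y = -b*(x - a), meeting the y-axis (x = 0) at y = a*b.
    slope = (0 - b) / (1 - 0)
    return slope * (0 - a)

def construct_quotient(a, b):
    # Slope of the segment from (0, b) to (a, 0) is -b/a; the parallel line
    # through (0, 1) is y = 1 + slope*x, meeting the x-axis at x = -1/slope = a/b.
    slope = (0 - b) / (a - 0)
    return -1 / slope

assert abs(construct_product(3.0, 2.0) - 6.0) < 1e-12
assert abs(construct_quotient(3.0, 2.0) - 1.5) < 1e-12
```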

A point (a,b) is said to be constructible if both a and b are constructible.


This is the case if and only if (a,b) can be determined by a finite
sequence of ruler and compass constructions starting from points with
integer coordinates. Hence all points with rational coordinates are
constructible.

In order to determine which numbers are constructible, i.e., in order to


determine the coordinates of constructible points, we must examine
what type of points arise after each of the ruler and compass
construction steps (i),(ii),(iii). It will be convenient to introduce some
terminology.

If K is a subfield of the real numbers, a point (a,b) in the plane is called a K-
point if both a and b are elements of K. A straight line through two
distinct K-points is called a K-line. A circle whose center is a K-point
and whose radius is an element of K is called a K-circle.

A K-line l has an equation of the form ax + by + c = 0, where a,b,c ∈ K,
for if l is the straight line through the K-points (x0,y0) and (x1,y1), then l
has the equation (y1 − y0)x + (x0 − x1)y + (x1y0 − y1x0) = 0. A K-circle C has
an equation of the form x² + y² + ax + by + c = 0, where a,b,c ∈ K, for if
the center of C is the K-point (x0,y0) and the radius of C is the K-number
r, then C has the equation x² + y² + (−2x0)x + (−2y0)y + (x0² + y0² − r²) = 0.

Now let K be a subfield of ℝ. We determine the nature of the intersection
points of K-lines and/or K-circles that arise as a result of one of the
steps (i),(ii),(iii).

If l and m are K-lines, say with equations ax + by + c = 0 and dx + ey + f =
0, where a,b,c,d,e,f ∈ K, then l and m intersect if and only if ae − bd ≠ 0
and their point of intersection can be found, on solving the system of
linear equations

ax + by = −c
dx + ey = −f

for x,y by Cramer's rule, to be the K-point ((−ce + bf)/(ae − bd), (−af +
cd)/(ae − bd)) (Theorem 45.2). Thus two K-lines intersect (if at all) at a
K-point.
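A minimal Python illustration of this step (ours, not the book's): Cramer's rule applied to two lines with rational coefficients returns a point whose coordinates involve only field operations on the coefficients, hence a K-point.

```python
def lines_intersection(a, b, c, d, e, f):
    # Intersection of a*x + b*y + c = 0 and d*x + e*y + f = 0 by Cramer's
    # rule; returns None when ae - bd = 0 (parallel or identical lines).
    det = a * e - b * d
    if det == 0:
        return None
    return ((-c * e + b * f) / det, (-a * f + c * d) / det)

# x + y - 3 = 0 and x - y - 1 = 0 meet at (2, 1); the coordinates are
# rational expressions in the (rational) coefficients.
assert lines_intersection(1, 1, -3, 1, -1, -1) == (2.0, 1.0)
```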

Let C be a K-circle and l a K-line, with equations x² + y² + ax + by + c = 0
and dx + ey + f = 0, say, where a,b,c,d,e,f ∈ K. We find the intersection
points (x,y) of C and l. Here d,e cannot both be 0, for then the equation
dx + ey + f = 0 would not represent a straight line. In case d = 0, we have
e ≠ 0 and y = −f/e, so x² + (−f/e)² + ax + b(−f/e) + c = 0, giving Ax² + Bx + E
= 0, with A,B,E ∈ K. Hence the x-coordinate of an intersection point of C
and l is a root of a quadratic polynomial over K. Let D be the discriminant
of this polynomial. Either D < 0 and this polynomial has no real roots, so
C and l do not intersect; or D ≥ 0 and it has two (possibly equal) roots
x1,x2 in the field K(√D), so C and l intersect at two (possibly identical)
K(√D)-points (x1, −f/e), (x2, −f/e). In case d ≠ 0, we put g = −e/d, h = −f/d
and use the equation x = gy + h of l. Here g,h ∈ K. Now (gy + h)² + y² +
a(gy + h) + by + c = 0 gives Ay² + By + E = 0, with A,B,E ∈ K (not the same
A,B,E as above). Hence the y-coordinate of an intersection point of C and
l is a root of a quadratic polynomial over K. Let D be the discriminant of
this polynomial. Either D < 0 and this polynomial has no real roots, so C
and l do not intersect; or D ≥ 0 and it has two (possibly equal) real roots
y1,y2 in the field K(√D), so C and l intersect at two (possibly identical)
K(√D)-points (gy1 + h, y1), (gy2 + h, y2).

Let C1,C2 be K-circles, say with equations x² + y² + ax + by + c = 0 and
x² + y² + dx + ey + f = 0, where a,b,c,d,e,f ∈ K. Then the intersection
points of C1 and C2 are the same as the intersection points of C1 and the
K-line (a − d)x + (b − e)y + (c − f) = 0. Thus either C1,C2 do not intersect or
they intersect at two (possibly identical) K(√D)-points, where D ∈ K.

So each step in a ruler and compass construction gives rise to a K-point
or a K(√D)-point for some D ∈ K, if K denotes the field generated by the
coefficients of the lines and circles used in that step.

A real number a is constructible if and only if the point (a,0) is
constructible, hence if and only if the point (a,0) can be obtained as a
result of a finite sequence of the steps (i),(ii),(iii) beginning with points
having rational coordinates. Thus a is constructible if and only if there is
a finite chain of fields

ℚ = K0 ⊆ K1 ⊆ K2 ⊆ . . . ⊆ Kn−1 ⊆ Kn

such that Ki = Ki−1(√Di) for some Di ∈ Ki−1 and a ∈ Kn. Here |Ki:Ki−1| = 1 or
2 according as √Di is or is not in Ki−1; hence |Kn:ℚ| is a power of two.
Moreover a ∈ Kn, so ℚ(a) ⊆ Kn and |ℚ(a):ℚ|, being a divisor of |Kn:ℚ|,
is also a power of two. Thus a is algebraic over ℚ and the degree of a is a
power of two. We have proved the following theorem.

59.32 Theorem: If a real number a is constructible, then a is algebraic
over ℚ and the degree of a over ℚ is a power of two.

The converse of Theorem 59.32 is also true. See Ex. 10. We are now in a
position to resolve some famous construction problems. The first one is
the construction of a cube whose volume is twice the volume of a given
cube (duplication of the cube). Choosing the length of a side of the given
cube as unit length, the side of the cube to be constructed has length ∛2.
Thus the problem is to construct the real number ∛2. Its minimal
polynomial is x³ − 2, since this polynomial is irreducible over ℚ by
Eisenstein's criterion. Thus ∛2 is algebraic over ℚ, but its degree over ℚ
is three, not a power of two. Hence ∛2 cannot be constructed: it is
impossible to duplicate a cube by ruler and compass alone.
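The irreducibility of x³ − 2 over ℚ can also be seen with the rational root theorem, since a reducible cubic must have a linear factor. A small Python check of ours (not from the book): the only possible rational roots are ±1 and ±2, and none works, so the degree of ∛2 over ℚ is indeed 3.

```python
from fractions import Fraction

def has_rational_root_x3_minus_2():
    # By the rational root theorem, a rational root p/q of x^3 - 2 in lowest
    # terms must have p | 2 and q | 1, so the candidates are +-1 and +-2.
    candidates = [Fraction(p) for p in (1, -1, 2, -2)]
    return any(x ** 3 - 2 == 0 for x in candidates)

# No rational root; a cubic with no rational root is irreducible over Q,
# so the degree of the real cube root of 2 over Q is 3, not a power of two.
assert not has_rational_root_x3_minus_2()
```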

The second problem is to divide a given angle into three equal parts
(trisection of an angle). An angle of θ radians is the circular arc of length
θ on the unit circle, which we may assume to issue from the point (1,0)
and terminate at the point (cos θ, sin θ). It is constructible if and only if
(cos θ, sin θ) is constructible. In view of sin θ = ±√(1 − (cos θ)²), we see
that an angle of θ radians is constructible if and only if cos θ is
constructible. The problem is thus equivalent to: given cos θ, construct
cos (θ/3). From the trigonometric identity

cos 3φ = 4cos³φ − 3cos φ,

we get, on writing a for cos θ, the polynomial equation

4x³ − 3x − a = 0

for cos (θ/3). The polynomial 4x³ − 3x − a ∈ ℚ(a)[x], where a is an inde-
terminate over ℚ, is known as the angle trisection polynomial. It is
irreducible over ℚ(a): to prove this, it will be sufficient to prove that it
is irreducible over ℚ[a] (Lemma 34.11); but 4x³ − 3x − a ∈ ℚ[a][x] =
ℚ[x][a] is certainly irreducible in ℚ[x][a], because it is of degree one in a
and its coefficients 4x³ − 3x, −1 ∈ ℚ[x] are relatively prime in ℚ[x]. So
the angle trisection polynomial is irreducible over ℚ(a). The general
trisection problem is whether it is possible to construct a root of

4x³ − 3x − a = 0,   a = cos θ,

in such a way that the construction remains valid when a is treated as
an indeterminate. Since a root of the angle trisection polynomial has
degree three over ℚ(a), the answer is negative: it is impossible to trisect
an arbitrary angle by ruler and compass alone. This does not mean of
course that no specific angle can be trisected. On the contrary, there are
angles like 90° that can very well be trisected.
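The trisection equation is easy to check numerically. Our own Python illustration (not from the book): cos 20° is a root of 4x³ − 3x − cos 60° = 0, and the trisection of 90° only requires the constructible number cos 30° = √3/2.

```python
import math

# cos(3t) = 4 cos^3(t) - 3 cos(t): for theta = 60 degrees, cos(theta/3)
# = cos(20 degrees) is a root of 4x^3 - 3x - cos(theta) = 0.
theta = math.radians(60)
x = math.cos(math.radians(20))
assert abs(4 * x ** 3 - 3 * x - math.cos(theta)) < 1e-12

# 90 degrees can be trisected: cos(30 degrees) = sqrt(3)/2 involves
# only a square root, hence is constructible.
assert abs(math.cos(math.radians(30)) - math.sqrt(3) / 2) < 1e-12
```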

The third problem is to draw a square whose area is the area of a given
circle (squaring the circle). Choosing the radius of the given circle as unit
length, the side of the square to be constructed has length √π. Thus the
problem is to construct the real number √π. But π, and all the more so
√π, are not algebraic over ℚ (Example 49.8(d)), let alone of degree a
power of two. Hence √π cannot be constructed: it is impossible to square
the circle by ruler and compass alone.

The final problem is to draw a regular n-gon. This is the same problem
as dividing the circle into n equal parts. Thus we are to divide the angle
of 2π radians into n equal parts, which means we are to construct the
number cos (2π/n). Now cos (2π/n) = (ζ + ζ⁻¹)/2, where ζ = e^(2πi/n) is
a primitive n-th root of unity. The field ℚ(cos (2π/n)) is fixed only by
the automorphisms ζ ↦ ζ and ζ ↦ ζ⁻¹ in the Galois group of the
cyclotomic extension ℚ(ζ)/ℚ, which is Galois and of degree φ(n) over ℚ
(Theorem 58.12). So ℚ(cos (2π/n)) is an intermediate field of ℚ(ζ)/ℚ
satisfying |ℚ(ζ):ℚ(cos (2π/n))| = |{ζ ↦ ζ, ζ ↦ ζ⁻¹}| = 2. Thus
|ℚ(cos (2π/n)):ℚ| = φ(n)/2. Hence, if cos (2π/n) is constructible, then
φ(n)/2 and consequently also φ(n) is a power of two. Let
n = 2^a0 p1^a1 p2^a2 . . . pr^ar be the
canonical decomposition of n into prime numbers, but possibly with a0 =
0. Then

φ(n) = 2^(a0−1) p1^(a1−1) p2^(a2−1) . . . pr^(ar−1) (p1 − 1)(p2 − 1). . . (pr − 1)

(the term 2^(a0−1) is to be deleted in case a0 = 0) and φ(n) is a power of two
if and only if a1 = a2 = . . . = ar = 1 and pi = 2^ki + 1 for some ki ∈ ℕ. Here ki
cannot be divisible by an odd number t > 1, for otherwise, writing ki = st
with t odd and t > 1, 2^s + 1 would be a proper divisor of the prime
number 2^ki + 1. Hence ki is a power of two, say ki = 2^mi.
Thus pi = 2^(2^mi) + 1. Prime numbers of the form 2^(2^m) + 1 are called Fermat
primes. It is easily verified that 2^(2^m) + 1 is prime when m = 0,1,2,3,4, but
2^(2^5) + 1 is not prime (it is divisible by 641). It is not known whether
there are infinitely or finitely many Fermat primes. In fact, the numbers
2^(2^m) + 1 are known to be prime only in the cases m = 0,1,2,3,4. We obtain:
if a regular n-gon is constructible, then n has the form n = 2^a0 p1p2. . . pr,
where the pi are distinct Fermat primes.
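The statements about Fermat numbers are quickly verified by computation (our own Python illustration, with a naive trial-division primality test sufficient at this size):

```python
def is_prime(n):
    # Naive trial division; fine for numbers up to 65537.
    if n < 2:
        return False
    k = 2
    while k * k <= n:
        if n % k == 0:
            return False
        k += 1
    return True

# F_0, ..., F_4 = 3, 5, 17, 257, 65537 are prime.
fermat = [2 ** (2 ** m) + 1 for m in range(5)]
assert fermat == [3, 5, 17, 257, 65537]
assert all(is_prime(F) for F in fermat)

# F_5 = 2^32 + 1 is composite: 641 divides it (Euler).
assert (2 ** 32 + 1) % 641 == 0
```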

Exercises

1. Let K be a field and define an extension E of K to be a radical
extension of K if E is separable over K and there are elements u1,u2, . . . ,un
in E such that E = K(u1,u2, . . . ,un) and one of the following is true:
(i) ui is a root of a polynomial of the form x^hi − ai over K(u1, . . . ,ui−1),
where either char K = 0 or char K = p ≠ 0 and p is relatively prime to hi;
(ii) ui is a root of a polynomial of the form x^p − x − ai over
K(u1, . . . ,ui−1), where p = char K ≠ 0.

Prove that Theorem 59.8, Theorem 59.9, Theorem 59.10 and Theorem
59.11 remain valid with this new definition of radical extensions.

2. Let K be a field and define an extension E of K to be a radical
extension of K if there are elements u1,u2, . . . ,un in E such that
E = K(u1,u2, . . . ,un) and one of the following is true:
(i) ui is a root of a polynomial of the form x^hi − ai over K(u1, . . . ,ui−1);
(ii) ui is a root of a polynomial of the form x^p − x − ai over
K(u1, . . . ,ui−1), where p = char K ≠ 0.

Prove that Theorem 59.8 remains valid with this new definition of
radical extensions if we assume E is normal over K. Discuss Theorem
59.9, Theorem 59.10 and Theorem 59.11.

3. Let K be a field, f(x) ∈ K[x] an irreducible polynomial of degree n ≥ 5,
E a splitting field of f(x) over K and r a root of f(x) in E. Assume that
Aut_K E ≅ S5. Prove the following assertions.
(i) K(r) is not a Galois extension of K.
(ii) |K(r):K| = n.
(iii) Aut_K K(r) = 1.
(iv) If N is a normal closure of K(r) over K, then there is a subfield
of N isomorphic to E.
(v) There is no radical extension R of K such that K ⊆ K(r) ⊆ R.
(This exercise shows the hypothesis that E be Galois over K is indispens-
able in Theorem 59.8.)

4. Prove that S5 and A5 are the only nonsolvable transitive subgroups of
S5. (Hint: Assume σ = (12345) is in such a subgroup G of S5. There is a
transposition or a 3-cycle in G. In the first case, G contains all transposi-
tions and G = S5. In the second case, with a 3-cycle in G, both 3 and 5
divide |G ∩ A5|, so there are at least 15 even permutations in G and
A5 ⊆ G.)

5. Let f(x) ∈ ℚ[x] be an irreducible polynomial of degree 5. Show that, if
f(x) has three real and two complex conjugate roots, then the Galois
group of f(x) is S5. (Hint: Use Ex. 4 and the discriminant.)

6. Find five irreducible polynomials in ℚ[x] whose Galois groups are S5.

7. Show that ∛(2 + √−121) + ∛(2 − √−121) = 4 (Raffael Bombelli (ca. 1526-
1572)).

8. Let K be a subfield of ℝ and let f(x) be a cubic polynomial in K[x]. Let
D be the discriminant of f(x). Prove that
(a) D > 0 if and only if f(x) has three distinct real roots;
(b) D < 0 if and only if f(x) has one real and two complex conjugate
roots;
(c) D = 0 if and only if f(x) has three real roots, one of which is
repeated.

9. Let K be a subfield of ℝ and let f(x) be an irreducible polynomial in
K[x] such that K(a) is a splitting field of f(x) over K, for any root a of f(x).
Show that there are no splitting field S of f(x) over K and radical
extension R of K satisfying S ⊆ R ⊆ ℝ.

10. Prove the converse of Theorem 59.32. (Hint: Show that, if the degree
of Kn over ℚ is a power of two, so is the degree over ℚ of the normal
closure of Kn over ℚ. Use the Galois correspondence and Ex. 12 in §26.)

11. Prove that the angle 90° can and the angle 60° cannot be trisected
by ruler and compass alone.

12. Show that, if n has the form n = 2^a0 p1p2. . . pr, where the pi are
distinct Fermat primes, then a regular n-gon is constructible by ruler and
compass alone.
