ALGEBRA
by
Ahmet K. Feyzioğlu
CHAPTER 1
Preliminaries
§1
Set Theory
We assume that the reader is familiar with basic set theory. In this
paragraph, we want to recall the relevant definitions and fix the notation.
Our approach to set theory will be informal. For our purposes, a set is a
collection of objects, taken as a whole. "Set" is therefore a collective term
like "family", "flock", "species", "army", "club", "team" etc. The objects
which make up a set are called the elements of that set. We write
x ∈ S
to denote that the object x is an element of the set S. This can be read "x
is an element of S", or "x is a member of S", or "x belongs to S", or "x is in
S", or "x is contained in S", or "S contains x". If x is not an element of S,
we write
x ∉ S.
A set S is called a subset of a set T if every element of S is also an
element of T. The notation
S ⊆ T
means that S is a subset of T. This is read "S is a subset of T", or "S is
included in T", or "S is contained in T". By convention, the empty set is
a subset of any set. If S is not a subset of T, we write
S ⊈ T.
This means there is at least one element of S which does not belong to T.
We write
S = T
if S and T are equal sets. Whenever we want to prove that two sets S
and T are equal, we must show that S is included in T and that T is
included in S. If S and T are not equal, we put
S ≠ T.
S ∪ T
way. In many cases, the elements of a set S are characterized by a
property P and the set is then written
{x : x has property P}.
In particular, S ∪ T = T ∪ S.
S1 ∪ S2 ∪ . . . ∪ Sn = {x : x ∈ S1 or x ∈ S2 or . . . or x ∈ Sn}.
We usually contract this notation into ⋃_{i=1}^{n} Si, just like we write ∑_{i=1}^{n} ai.
Given two sets S and T, we consider those objects which belong to S and
to T. Such objects will make up a new set. This set is called the
intersection of S and T and is denoted by S ∩ T. We remark here that
'and' in the definition of an intersection is the logical 'and'. Let us recall
that
'p and q' is true in case 'p' is true, 'q' is true;
and
'p and q' is false in case 'p' is true, 'q' is false;
                          'p' is false, 'q' is true;
                          'p' is false, 'q' is false.
Thus we have
S ∩ T = {x : x ∈ S and x ∈ T}.
In particular, S ∩ T = T ∩ S.
Two sets S and T are said to be disjoint if their intersection is empty:
S ∩ T = ∅. Given a family of sets Si, indexed by a set I, the sets Si are
called mutually disjoint if any two distinct sets among them are disjoint:
Si1 ∩ Si2 = ∅ for all i1, i2 ∈ I with Si1 ≠ Si2.
If the sets under consideration are all subsets of a fixed universal set U, we put
S′ = {x : x ∈ U and x ∉ S}
and call this set the complement of S. Given two sets S and T, we also put
T \ S = {x : x ∈ T and x ∉ S}
and call this set the relative complement of S in T, or the difference set T
minus S. The set S may or may not be a subset of T. Note that
T \ S = T ∩ S′.
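The set operations recalled so far can be tried out concretely. The following is a minimal sketch using Python's built-in sets; U, S and T are small example sets chosen here for the demonstration, not taken from the text:

```python
# Small illustration of union, intersection, complement and difference.
# U plays the role of a universal set; S and T are example subsets.
U = {1, 2, 3, 4, 5, 6}
S = {1, 2, 3}
T = {3, 4, 5}

print(S | T)   # union S ∪ T: {1, 2, 3, 4, 5}
print(S & T)   # intersection S ∩ T: {3}
print(U - S)   # complement S′ of S in U: {4, 5, 6}
print(T - S)   # relative complement T \ S: {4, 5}

# The identity T \ S = T ∩ S′ noted above:
assert T - S == T & (U - S)
```

The operators `|`, `&` and `-` are Python's standard set union, intersection and difference.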
According to our definition of equality, the sets {a,b} and {b,a} are equal.
Frequently, we want to distinguish between a,b and b,a. To this end, we
define ordered pairs. An ordered pair is a pair of objects a,b, enclosed
within parentheses and separated by a comma. Thus (a,b) is an ordered
pair. The adjective "ordered" is used to emphasize that the objects have
a status of being first and being second. a is called the first component of
the ordered pair (a,b), and b is called its second component. Two ordered
pairs are declared equal if their first components are equal and their
second components are equal. Thus (a,b) and (c,d) are equal if and only
if a = c and b = d, in which case we write (a,b) = (c,d). Notice that we
have (a,b) ≠ (b,a) unless a = b (here ≠ means the negation of equality).
The set of all ordered pairs, whose first components are the elements of
a set S and whose second components are the elements of a set T, is
called the cartesian product of S and T, and is denoted by S × T. Hence
S × T = {(s,t) : s ∈ S and t ∈ T}.
A set can have finitely many or infinitely many elements. The number
of elements in a set S is called the cardinality or the cardinal number of
S. The cardinality of S is denoted by |S|. The set S is said to be finite if |S|
is a finite number. S is said to be infinite if S is not finite. A rigorous
definition of finite and infinite sets must be based on the notion of one-
to-one correspondence between sets, which will be introduced in §3.
However, we will not make any attempt to give a rigorous definition of
finite and infinite sets. We shall be content with the suggestive
description above.
Exercises
2. Show that (R ∪ S) ∪ T = R ∪ (S ∪ T)
and (R ∩ S) ∩ T = R ∩ (S ∩ T).
3. Prove: S ∩ T = S if and only if S ⊆ T, and S ⊆ T if and only if S ∪ T = T.
11. Prove: if S is a finite set, then S has exactly 2^|S| subsets.
§2
Equivalence Relations
This definition presents the logical structure of an equivalence relation
very clearly, but we will almost never use this notation. We prefer to
write a ∼ b, or a ≡ b, or a ≈ b or some similar symbolism instead of
(a,b) ∈ R in order to express that a,b are related by an equivalence
relation R. Here a ∼ b can be read "a is equivalent to b". Our definition
then assumes the form below.
(b) Let A be the set of all points in the plane except the origin. For any
two points P and R in A, let us put P ∼ R if R lies on the line through the
origin and P.
(i) P ∼ P for all points P in A since any point lies on the line
through the origin and itself. Thus ∼ is reflexive.
(ii) If P ∼ R, then R lies on the line through the origin and P;
therefore the origin, P, R lie on one and the same line; therefore P lies
on the line through the origin and R; and R ∼ P. Thus ∼ is symmetric.
(iii) If P ∼ R and R ∼ T, then the line through the origin and
R contains the points P and T, so T lies on the line through the origin and
P, so we get P ∼ T. Thus ∼ is transitive.
This proves that ∼ is an equivalence relation on A.
(c) Let S be the set of all straight lines in the plane. Let us put m ∥ n if
the line m is parallel to the line n. It is easily seen that ∥ (parallelism) is
an equivalence relation on S.
(d) Let ℤ be the set of integers. For any two numbers a,b in ℤ, let us put
a ∼ b if a − b is even (divisible by 2).
(i) a ∼ a for all a ∈ ℤ since a − a = 0 is an even number.
(ii) If a ∼ b, then a − b is even, then b − a = −(a − b) is also
even, so b ∼ a.
(iii) If a ∼ b and b ∼ c, then a − b and b − c are even. Their
sum is also an even number. So a − c = (a − b) + (b − c) is even and a ∼ c.
We see that ∼ is an equivalence relation on ℤ.
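The three properties of this parity relation can be spot-checked mechanically. A small sketch; the range of integers tested is an arbitrary choice made here, and finitely many checks of course do not replace the proof above:

```python
# Spot-check of Example 2.3(d): a ~ b iff a - b is even.
def related(a, b):
    return (a - b) % 2 == 0

nums = range(-5, 6)
# reflexive
assert all(related(a, a) for a in nums)
# symmetric
assert all(related(b, a) for a in nums for b in nums if related(a, b))
# transitive
assert all(related(a, c)
           for a in nums for b in nums for c in nums
           if related(a, b) and related(b, c))
print("reflexive, symmetric and transitive on", list(nums))
```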
(f) Let S = ℤ × (ℤ\{0}). Thus S is the set of all ordered pairs of integers
whose second components are distinct from zero. Let us write
(a,b) ∼ (c,d) for (a,b), (c,d) ∈ S if ad = bc.
(i) (a,b) ∼ (a,b) for all (a,b) ∈ S, since ab = ba for all a ∈ ℤ,
b ∈ ℤ\{0}.
(ii) If (a,b) ∼ (c,d), then ad = bc, then da = cb, then cb = da, so
(c,d) ∼ (a,b).
(iii) If (a,b) ∼ (c,d) and (c,d) ∼ (e,f), then
ad = bc and cf = de
adf = bcf and bcf = bde
adf = bde
d(af − be) = 0
af − be = 0 (since d ≠ 0)
af = be
(a,b) ∼ (e,f).
Thus ∼ is an equivalence relation on S.
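This relation, too, can be spot-checked on a small grid of pairs. A sketch; the grid bounds are an arbitrary choice for the demonstration:

```python
# Spot-check of Example 2.3(f): (a,b) ~ (c,d) iff ad = bc, on Z x (Z \ {0}).
def related(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c

pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4) if b != 0]
# reflexive
assert all(related(p, p) for p in pairs)
# symmetric
assert all(related(q, p) for p in pairs for q in pairs if related(p, q))
# transitive
assert all(related(p, r)
           for p in pairs for q in pairs for r in pairs
           if related(p, q) and related(q, r))
# (1,2) ~ (2,4): equivalent pairs represent the same ratio
assert related((1, 2), (2, 4))
```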
(g) Let T be the set of all triangles in the Euclidean plane. Congruence of
triangles is an equivalence relation on T.
(h) Let S be the set of all continuous functions defined on the closed
interval [0,1]. For any two functions f,g in S, let us write f =ᴬ g if
∫₀¹ f(x)dx = ∫₀¹ g(x)dx.
Then =ᴬ is an equivalence relation on S.
Let us examine our examples under this light. In Example 2.3(b), the
points P and R may be different, but the lines they determine with the
origin are equal. In Example 2.3(c), the lines may be different, but their
directions are equal. In Example 2.3(d), the integers may be different,
but their parities are equal. In Example 2.3(e), the integers may be
different, but their remainders, when they are divided by n, are equal.
In Example 2.3(f), the pairs may be different, but the ratio of their
components are equal. In Example 2.3(g), the triangles may have
different locations in the plane, but their geometrical properties are the
same. In Example 2.3(h), the functions may be different, but the "areas
under their curves" are equal.
The equivalence classes [a] are subsets of A. The set of all equivalence
classes is sometimes denoted by A/∼. It will be a good exercise for the
reader to find the equivalence classes in Example 2.3.
A = ⋃_{a∈A} [a]   and if [a] ≠ [b], then [a] ∩ [b] = ∅.
Conversely, let
A = ⋃_{i∈I} Pi ,   Pi ∩ Pj = ∅ if i ≠ j.
Suppose [a] and [b] are not disjoint, and let c be a common element, so that
c ∼ a and c ∼ b
a ∼ c and c ∼ b (by symmetry)
(1) a ∼ b (by transitivity)
(2) b ∼ a (by symmetry).
We want to prove [a] = [b]. To this end, we have to prove [a] ⊆ [b] and
also [b] ⊆ [a]. Let us prove [a] ⊆ [b]. If x ∈ [a], then x ∼ a and a ∼ b by (1),
then x ∼ b by transitivity, then x ∈ [b], so [a] ⊆ [b]. Similarly, if y ∈ [b],
then y ∼ b, then y ∼ b and b ∼ a by (2), then y ∼ a by transitivity, then
y ∈ [a], so [b] ⊆ [a]. Hence [a] = [b] if [a] and [b] are not disjoint. This
completes the proof of the first assertion.
Now the converse. Let A = ⋃_{i∈I} Pi, where any two distinct Pi's are
disjoint. We want to define an equivalence relation on A and want the
Pi's to be the equivalence classes. How do we accomplish this? Well, if
the Pi are to be the equivalence classes, we had better call two elements
equivalent if they belong to one and the same Pi0. No element of A belongs to
two or more of the subsets Pi, for then the Pi would not be mutually disjoint.
So each element of A belongs to one and only one of the subsets Pi.
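Both directions of this correspondence between equivalence relations and partitions can be illustrated concretely. A sketch using the parity relation of Example 2.3(d); the set A below is an arbitrary small choice made here:

```python
# Equivalence classes of the parity relation partition a small set A.
A = set(range(10))

def cls(a):
    """The equivalence class [a] = {x in A : x ~ a}, for a ~ x iff a - x is even."""
    return frozenset(x for x in A if (x - a) % 2 == 0)

classes = {cls(a) for a in A}      # the set A/~ of equivalence classes
# The classes cover A ...
assert set().union(*classes) == A
# ... and distinct classes are disjoint.
assert all(p == q or not (p & q) for p in classes for q in classes)
assert len(classes) == 2           # the even and the odd numbers in A
```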
Exercises
3. Let ∼ and ≈ be two equivalence relations on a set A. We define ∧ by
declaring a ∧ b if and only if a ∼ b and a ≈ b; and we define ∨ by
declaring a ∨ b if and only if a ∼ b or a ≈ b. Determine whether ∧ and ∨
are equivalence relations on A.
§3
Mappings and Operations
One expects that we write f(a) = b in place of (a,b) ∈ f. This is the symbolism
that the reader is accustomed to, and reminds us of a mapping rule that
assigns b to a. However, we will rarely write f(a) = b. We prefer to write
(a)f = b or af = b, with the function symbol f on the right side of the
element a. This might seem odd, and the reader might wonder about this
strange order of elements and functions. It takes some time to get
accustomed to this way of writing functions on the right, but the
advantages of this notation will far outweigh the little trouble it causes at
first. This will be amply clear in the sequel. We remark that not every
algebraist conforms to this usage, and an isolated notation will have
different meanings according as the functions are written on
the right or on the left. We will point out these differences as occasions
arise.
Then g is not a function from A into B since 3 ∈ A is the first component
of two distinct ordered pairs in g ⊆ A × B.
(c) Let A and B be two nonempty sets and let b ∈ B be a fixed element
of B. Then f, defined by
uf = 1, xf = 2, yf = 2, zf = 1.
(f) Let A be a nonempty set and let S be the set of all subsets of A. For
any a ∈ A, put af = {a} ∈ S. Then f is a function from A into S.
(j) Let A be a nonempty set and let B be a fixed subset of A. For any a in
A, we put
χB(a) = { 0 if a ∉ B
        { 1 if a ∈ B.
Then χB is a function from A into {0,1}. It is called the characteristic
function of B. Here we wrote the function on the left.
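The characteristic function of a subset can be sketched directly; A and B below are arbitrary example sets chosen for the demonstration, and char_fn is a name introduced here:

```python
# Characteristic function of a subset B of A: 1 on B, 0 off B.
A = {1, 2, 3, 4, 5}
B = {2, 4}

def char_fn(a):
    """Return 1 if a lies in B, and 0 otherwise."""
    return 1 if a in B else 0

print({a: char_fn(a) for a in sorted(A)})   # {1: 0, 2: 1, 3: 0, 4: 1, 5: 0}
```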
f(x) = { 0 if x is irrational
       { 1 if x is rational.
A → A/∼
a ↦ [a]
So, in order that two functions f and f1 be equal, their domains must be
equal and the images of any element in this common domain under the
mappings f and f1 must be equal, too. In particular, if f: A → B is a
function and B ⊆ C, then the function g: A → C, defined by ag = af for all a ∈ A,
is equal to f. The ranges do not play any role in the definition of
equality. (In some branches of mathematics, for example in topology,
two functions with different ranges are sometimes considered distinct,
even if their domains and functional values coincide.)
In the definition of a mapping f: A B, we required that every element
of A be the first component of at least one ordered pair in f and also that
every element of A be the first component of at most one ordered pair
in f. There was no analogous requirement for the elements of B. If we
impose similar conditions on the elements of B, we get special types of
functions, which we now introduce.
The reader must be careful about the usage of the prepositions "into"
and "onto", for they are used with different meanings. That f is a
function from A onto B means that every element of B is the image of
some element of A. For an arbitrary mapping f: A → B, an element of B
has perhaps no preimage at all, but if f is a mapping from A onto B, then
each element of B has at least one preimage in A.
(b) Let ℝ⁺ denote the set of all positive real numbers. Then the
mapping f: ℝ⁺ → ℝ⁺, given by f(x) = x² for all x ∈ ℝ⁺, is onto.
1g = a, 2g = a, 3g = a, 4g = b, 5g = c
is onto.
(d) Let A be any nonempty set. Then ιA : A → A is onto, for any a ∈ A
has a preimage a in A under ιA since aιA = a.
(b) Let ℝ⁺ denote the set of all positive real numbers. Then the mapping
{(x,y): x² = y} is a one-to-one function from ℝ⁺ into ℝ⁺.
1g = b, 2g = d, 3g = a
is one-to-one.
h = {( a,(af)g ) ∈ A × C: a ∈ A} ⊆ A × C,
In order to compose two functions f and g, we must make sure that the
range of the first function f is a subset of the domain of the second
function g. Otherwise, their composition is not defined. Note the order of
the functions f and g. We apply f first, then g; and we write first f, then g
in the composition notation fg. One of the advantages of writing the
functions on the right becomes evident here. If we had written the
functions on the left, then fg would have meant: first apply g, then f [as
in the calculus, where (f ∘ g)(x) = f(g(x))] and we would have been reading
backwards. Notice also that the domain of fg is the domain of f.
1f = a        ag = U
2f = c        bg = x
3f = d        cg =
4f = b        dg = 5
Then we have
1(fg) = (1f)g = ag = U
2(fg) = (2f)g = cg =
3(fg) = (3f)g = dg = 5
4(fg) = (4f)g = bg = x.
f: 1 → a    and    g: a → y
   2 → a           b → z
   3 → b           c → z,
we have
fg: 1 → y
    2 → y
    3 → z.
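Finite mappings like these can be modelled as Python dictionaries, which makes the rule "first f, then g" and the fact that the domain of fg is the domain of f easy to see. A sketch mirroring the small example above (compose is a name introduced here):

```python
# Composition fg of finite functions: apply f first, then g.
f = {1: 'a', 2: 'a', 3: 'b'}
g = {'a': 'y', 'b': 'z', 'c': 'z'}

def compose(f, g):
    """Return fg; its domain is the domain of f."""
    return {x: g[f[x]] for x in f}

fg = compose(f, g)
print(fg)   # {1: 'y', 2: 'y', 3: 'z'}
```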
Composition of mappings is not commutative in general. However, it is associative.
Proof: We must prove that the domains of (fg)h and f(gh) are equal and
that an arbitrary element in the common domain is assigned to the same
element by (fg)h and by f(gh).
The domain of (fg)h is the domain of fg, which is the domain of f, which
is A. The domain of f(gh) is the domain of f, which is A. So the domains
of (fg)h and f(gh) coincide.
Onto mappings and one-to-one mappings behave very nicely when they
are composed.
Proof: (1) Suppose f and g are onto. For any c C, we must find a
preimage of c under fg. The only thing we know about C is that C is the
range of g. Now g is onto, so c has a preimage in B under g. Let b B be
such that bg = c. Since b B and B is the range of f, and f is onto, b has a
preimage a A under f, so that af = b. Then we get a(fg) = (af)g = bg = c.
So a is a preimage of c under fg. This proves that fg is onto. (Summary: a
preimage of a preimage is a preimage that works.)
        f          g                      f1          g1
{a,b,c} → {x,y,z} → {1,2}       {a,b} → {x,y,z} → {1,2,3}
  a → x     x → 1                 a → x     x → 1
  b → y     y → 2                 b → y     y → 2
  c → y     z → 2                           z → 2
Here fg is onto, but f is not onto; and f1g1 is one-to-one, but g1 is not
one-to-one.
(2) Assume fg is one-to-one. We wish to prove that f is one-to-one.
Suppose that af = a1f, where a, a1 ∈ A. Applying g to both sides of this
equation, we get (af)g = (a1f)g, therefore a(fg) = a1(fg). Since fg is one-to-
one by hypothesis, we get a = a1. This proves that af = a1f implies a = a1.
Thus f is one-to-one.
3.15 Definition: The mapping g of Theorem 3.14 is called the inverse
mapping of f, or simply the inverse of f. It is denoted by f⁻¹.
Proof: We must show that the domains and functional values coincide.
The domain of ff⁻¹ is the domain of f, which is A, and A is the domain of
ιA. Further, for any a ∈ A, we have a(ff⁻¹) = (af)f⁻¹ = a = aιA by the
definition of f⁻¹. This proves ff⁻¹ = ιA.

This proves f⁻¹f = ιB.
*    *    *
(b) Let M be a set and let S be the set of all subsets of M. Taking union
and taking intersection are binary operations on S. The usual notation
"A ∪ B", "A ∩ B" conforms to the remarks above.
(c) Let F be the set of all functions from a set A into A. The usual
composition of functions is a binary operation on F.
x ∗ y in any way, but this does not preclude ∗ from being a binary
operation.
(e) Let V be the set of all vectors in the three space ℝ³. Taking dot
product of two vectors is not a binary operation on V, since the result is
product of two vectors is not a binary operation on V, since the result is
a scalar (real number), not a vector. On the other hand, taking cross
product is a binary operation on V, since the result is a uniquely
determined vector in V.
(f) For any natural numbers m,n, let m • n denote their (positive)
greatest common divisor. Then • is a binary operation on ℕ.
(g) Let S be the set of all students in a classroom. For any students a,b
in S, let a · b be that student who sits in front of a. Then · is not a binary
operation on S, for a · b is not defined if a happens to sit in the foremost
row. Remember that a binary operation on S has to be defined for all
pairs in S S.
(h) For any ordered pairs (a,b), (c,d) of real numbers, we put
(a,b) + (c,d) = (a + c, b + d),
(a,b)·(c,d) = (ac − bd, ad + bc).
Then + and · are binary operations on ℝ × ℝ. Notice that one and the
same symbol "+" stands for two different binary operations, one on ℝ,
and one on ℝ × ℝ.
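The two operations of Example (h) can be sketched directly. (As an aside not made in the text: they are the familiar addition and multiplication rules of complex numbers written as pairs.)

```python
# The two binary operations of Example (h) on ordered pairs of reals.
def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

print(add((1, 2), (3, 4)))   # (4, 6)
print(mul((1, 2), (3, 4)))   # (-5, 10)
# (0,1) behaves like the imaginary unit: (0,1)·(0,1) = (-1,0)
assert mul((0, 1), (0, 1)) == (-1, 0)
```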
Exercises
A1 ⊆ f⁻¹(f(A1)),   f(f⁻¹(B1)) ⊆ B1
4. Keep the notation of Ex. 2. Assume that f is one-to-one and onto, and
let f⁻¹: B → A be its inverse. Show that
f⁻¹(B1) = (f⁻¹)(B1) and (f⁻¹)⁻¹(A1) = f(A1)
for any subsets B1 and A1 of B and A, respectively.
§4
Mathematical Induction
We can use this axiom to prove statements of the form 'pn for all n ∈ ℕ'
as follows. We let S be the set of all natural numbers n for which pn
is true. First we verify 1 ∈ S, that is, we verify that p1 is true. Second, we
assume that k ∈ S and under this hypothesis, which is called the
induction hypothesis, we prove that pk+1 is true. So we show that k ∈ S
implies k+1 ∈ S. By the axiom of mathematical induction, S = ℕ, so the
statement pn is true for all n ∈ ℕ. We formulate the axiom as an
operational procedure. We can prove
for all n ∈ ℕ, pn
by establishing that
I. p1 is true,
II. for all k ∈ ℕ, if pk is true, then pk+1 is true.
4.3 Examples: (a) Prove that 1 + 2 + . . . + n = n(n + 1)/2 for all n ∈ ℕ.
We use the principle of mathematical induction.
I. 1 = 1(1 + 1)/2, so the formula is true for n = 1.
II. Make the inductive hypothesis that 1 + 2 + . . . + k = k(k + 1)/2.
We want to establish 1 + 2 + . . . + k + (k + 1) = (k + 1)((k + 1) + 1)/2. We have
1 + 2 + . . . + k + (k + 1) = k(k + 1)/2 + (k + 1)    (by inductive hyp.)
                           = (k/2 + 1)(k + 1)
                           = (k + 1)(k + 2)/2,
so the formula is true for n = k + 1 if it is true for n = k. Hence
1 + 2 + . . . + n = n(n + 1)/2 for all n ∈ ℕ.
(b) Prove that 2 + 2² + 2³ + . . . + 2ⁿ = 2ⁿ⁺¹ − 2 for all n ∈ ℕ.
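A finite numerical spot-check of formulas like those in (a) and (b) is easy to run, and complements (but of course does not replace) the induction proofs. The bound 100 is an arbitrary choice made here:

```python
# Numerical spot-check of the two sum formulas for n = 1, ..., 100.
for n in range(1, 101):
    assert sum(range(1, n + 1)) == n * (n + 1) // 2               # Example (a)
    assert sum(2**i for i in range(1, n + 1)) == 2**(n + 1) - 2   # Example (b)
print("both formulas hold for n = 1, ..., 100")
```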
i. q1 is true,
ii. for all k ∈ ℕ, if q1, q2, q3, . . . , qk are true, then qk+1 is true.
Then qn is true for all n ∈ ℕ.
Proof: We prove the lemma by the principle of mathematical induction.
We put
p1 = q1
pk = q1 and q2 and . . . and qk (for all k ∈ ℕ, k ≥ 2).
Now induction.
I. p1 is true (by the hypothesis i.)
II. Make the inductive hypothesis that pk is true. Then
q1 and q2 and . . . and qk is true (definition of pk)
q1, q2, . . . , qk are all true (truth value of conjunction)
qk+1 is true (by the hypothesis ii.)
q1, q2, . . . , qk, qk+1 are all true
q1 and q2 and . . . and qk and qk+1 is true
pk+1 is true.
Hence, for all k ∈ ℕ, if pk is true, then pk+1 is true. By the principle of
mathematical induction, pn is true for all n ∈ ℕ. So
q1 and q2 and . . . and qn is true for all n ∈ ℕ.
In particular, qn is true for all n ∈ ℕ. This completes the proof.
The statement '2ⁿ ≥ n²' is not true for all natural numbers n, but true
for all natural numbers n ≥ 5. The principle of mathematical induction
can be used to prove this and similar propositions. Let a be a fixed
integer (positive, negative or zero) and let pn be a statement involving an
integer n ≥ a. We prove the truth of pn for all n ≥ a by showing that
1. pa is true
2. for all k ≥ a, if pk is true, then pk+1 is true.
This is easily seen when we put qn = pn+a−1 for n ∈ ℕ and use Principle
4.2 with qn in place of pn. There is a similar modification of Principle 4.5.
Exercises
Prove the assertions in Ex. 1-6 for all n ∈ ℕ by the principle of
mathematical induction.
1. 1 + 3 + . . . + (2n − 1) = n².
2. 1 + 4 + 7 + . . . + (3n − 2) = n(3n − 1)/2.
3. 1² + 2² + . . . + n² = n(n + 1)(2n + 1)/6.
4. 1³ + 2³ + . . . + n³ = n²(n + 1)²/4.
5. 1⁴ + 2⁴ + . . . + n⁴ = n(n + 1)(2n + 1)(3n² + 3n − 1)/30.
6. Prove that 2ⁿ ≥ n² for all n ≥ 5, n ∈ ℕ.
8. Prove that, for any n ∈ ℕ and for any positive real numbers
a1, a2, . . . , a2ⁿ,
²ⁿ√(a1a2 . . . a2ⁿ) ≤ (a1 + a2 + . . . + a2ⁿ)/2ⁿ.
9. Prove that, for any n ∈ ℕ and for any positive real numbers
a1, a2, . . . , an,
ⁿ√(a1a2 . . . an) ≤ (a1 + a2 + . . . + an)/n.
§5
Divisibility
(2) If a|b and b|c, then a ≠ 0 ≠ b and ak = b and bh = c for some k,h ∈ ℤ.
So a(kh) = bh = c and, since kh ∈ ℤ, we obtain a|c.
(6) This can be proved in the same way as (5). We might also observe
that a|−c if a|c by (1), hence a|b+(−c) by (5), so a|b−c.
(12) If a|b and b|a, then a ≠ 0 and b ≠ 0, so we may apply (11) to get
|a| ≤ |b| and |b| ≤ |a|. Thus |a| = |b|.
Proof: There are two claims in this theorem: (1) that there are integers
q,r with the stated properties and (2) that these are unique, that is, the
pair of integers q,r is the only one which has the stated properties. The
proof of this theorem will accordingly consist of two parts. In the first
part, we prove the existence of q,r; in the second part, their uniqueness.
(Long division scheme: dividing a by b yields the quotient q and the remainder r.)
We subtract b from a until we get a number r smaller than b. This is
exactly what happens when we perform division, and this is essentially
the proof of Theorem 5.3.
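The repeated-subtraction idea just described can be sketched directly; for simplicity the sketch assumes a ≥ 0 and b > 0 (divide is a name introduced here):

```python
# Division algorithm by repeated subtraction of b from a.
def divide(a, b):
    """Return (q, r) with a = qb + r and 0 <= r < b, for a >= 0, b > 0."""
    q, r = 0, a
    while r >= b:
        r -= b
        q += 1
    return q, r

q, r = divide(38, 7)
print(q, r)                      # 5 3, since 38 = 5*7 + 3
assert (q, r) == divmod(38, 7)   # agrees with Python's built-in divmod
```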
5.4 Theorem: Let a,b ∈ ℤ, not both zero. Then there is a unique integer
d such that
(i) d|a and d|b,
(ii) for all d1 ∈ ℤ, if d1|a and d1|b, then d1|d,
(iii) d ≥ 0.
Now the uniqueness of d. Suppose d′ satisfies the conditions (i), (ii), (iii),
too. Then d′|a, d′|b by (i), and so d′|d by (ii). Also, d|a, d|b by (i), and so
d|d′ by (ii). By Lemma 5.2(12), we obtain |d| = |d′|. From (iii), we get d ≥ 0,
d′ ≥ 0, which yields d = d′. Thus d is unique.
5.5 Definition: Let a,b ∈ ℤ, not both zero. The unique integer d in
Theorem 5.4 is called the greatest common divisor of a and b.
Definition 5.5 and the proof of Theorem 5.4 enable us to write the
5.6 Theorem: Let a,b ∈ ℤ, not both zero. Then (a,b) is the smallest
positive integer in the set {ax − by : x,y ∈ ℤ}.
We first observe that the set U in Theorem 5.6 does not change if we
write −a in place of a or −b in place of b. This yields
(a,b) = (−a,b) = (−a,−b) = (a,−b)
for all a,b, not both zero. Hence (a,b) = (|a|,|b|) and, when we want to find
(a,b), we may assume a ≥ 0, b ≥ 0 (the case a = 0, b = 0 is excluded)
without loss of generality. Moreover, the set U in Theorem 5.6 remains
unaltered if we interchange a and b. Thus
(a,b) = (b,a).
Therefore, when we want to find (a,b), we may assume a ≥ b without
loss of generality. (Instead of appealing to Theorem 5.6, we could use
the definition to obtain (a,b) = (−a,b) = (−a,−b) = (a,−b) = (b,a).)
We claim that rk, the last nonzero remainder, is the greatest common
divisor of a and b, and that it can be written in the form ax − by, where
x,y are integers.
a = q1b + r1,      0 ≤ r1 < b,
b = q2r1 + r2,     0 ≤ r2 < r1,
r1 = q3r2 + r3,    0 ≤ r3 < r2,
. . . . . . . . . . . . . . . . . . . . . . . .
be the equations we obtain when we use the division algorithm
(Theorem 5.3) successively until we get a zero remainder. (This chain
of equations is known as the Euclidean algorithm.) Then the last nonzero
remainder rk is the greatest common divisor of a and b. Moreover, rk can
be written in the form ri−1x − riy; x,y ∈ ℤ for i = k−1, k−2, . . . , 2, 1, 0 (we
put r0 = b, r−1 = a). In particular, there are integers x0, y0 such that (a,b)
= ax0 − by0, and eliminating r1, r2, . . . , rk−1 from the equations above gives
a systematic way of finding the integers x0, y0.
We prove (i) of Theorem 5.4, namely that rk|a and rk|b. We start from the
last equation in the algorithm and go up through the algorithm. From
the (k+1)-st equation, we get rk|rk−1. Using Lemma 5.2, we get rk|rk−2 from
the k-th equation. So rk|rk−1 and rk|rk−2. From the (k−1)-st equation, we
get rk|rk−3, so rk|rk−2 and rk|rk−3. In general, if we have rk|ri+1 and rk|ri, the
(i+1)-st equation gives rk|ri−1, so we have rk|ri and rk|ri−1. Going through
the equations in this way, we finally get rk|r0 and rk|r−1, that is, we get
rk|b and rk|a. This proves (i) of Theorem 5.4.
Now (ii) of Theorem 5.4. Assume e|a and e|b. We must prove e|rk. We start
from the first equation in the algorithm and go down through the
algorithm. From the first equation, we get e|a − q1b and e|r1 by Lemma
5.2. So e|b and e|r1. From the second equation, we get e|b − q2r1 and e|r2.
So e|r1 and e|r2. In general, if we have e|ri−1 and e|ri, the (i+1)-st equation
gives e|ri−1 − qi+1ri and e|ri+1. So e|ri and e|ri+1. Going through the
equations in this way, we finally get e|rk. This proves (ii) of Theorem 5.4.
rk = rk−2 − rk−1qk = rk−2 − (rk−3 − qk−1rk−2)qk
   = rk−3(−qk) + rk−2(1 + qk−1qk),
so rk can be represented as rk−3x − rk−2y, namely with x = −qk,
y = −(1 + qk−1qk). In general, if rk can be written in the form
rix − ri+1y,   x,y ∈ ℤ,
we get, using the (i+1)-st equation in the Euclidean algorithm,
rk = rix − ri+1y
   = rix − (ri−1 − qi+1ri)y
   = ri−1(−y) + ri(x + qi+1y),
which shows that rk can be written also in the form ri−1x1 − riy1, namely
with x1 = −y, y1 = −(x + qi+1y). Going through the equations in this way,
we finally obtain
rk = ax0 − by0
for some x0,y0 ∈ ℤ. This completes the proof.
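The back-substitution just carried out is exactly the extended Euclidean algorithm. A recursive Python sketch, written with the ax − by sign convention used above (a and b are taken positive here for simplicity; extended_gcd is a name introduced here):

```python
# Extended Euclidean algorithm: return (d, x0, y0) with d = (a, b)
# and d = a*x0 - b*y0, following the text's sign convention.
def extended_gcd(a, b):
    if b == 0:
        return a, 1, 0            # a = a*1 - b*0
    q, r = divmod(a, b)           # a = qb + r with 0 <= r < b
    d, x, y = extended_gcd(b, r)  # d = b*x - r*y
    # substitute r = a - qb:  d = b*x - (a - qb)*y = a*(-y) - b*(-(x + qy))
    return d, -y, -(x + q * y)

d, x0, y0 = extended_gcd(60, 42)
print(d, x0, y0)   # 6 -2 -3, and indeed 60*(-2) - 42*(-3) = 6
assert d == 6 and 60 * x0 - 42 * y0 == 6
```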
5.10 Lemma: Let a,b be integers, not both zero. Then a and b are
relatively prime if and only if there are integers x0,y0 such that ax0 − by0 = 1.
Proof: If (a,b) = 1, then there are integers x0,y0 such that ax0 − by0 = 1
by Theorem 5.6 or also by Theorem 5.7. Conversely, if there are
integers x0,y0 with ax0 − by0 = 1, then 1 is certainly the smallest positive
integer in the set {ax − by : x,y ∈ ℤ}, hence (a,b) = 1 by Theorem 5.6.
5.11 Lemma: Let a,b be integers, not both zero, and let d = (a,b). Then
a/d and b/d are relatively prime.
Proof: a/d, b/d are integers, not both of them zero. We have ax − by = d
for suitable integers x,y by Theorem 5.7. Dividing both sides of this
equation by d ≠ 0, we get
(a/d)x − (b/d)y = 1,
and so (a/d, b/d) = 1 by Lemma 5.10.
5.12 Theorem: Let a, b,c be integers. If a|bc and (a,b) = 1, then a|c.
We separate ℤ\{0} into three subsets: (1) units, (2) prime numbers, (3)
composite numbers. The numbers 1 and −1 will be called units. The units
divide every integer by Lemma 5.2(10). Any other integer a has at least
four divisors: 1, −1, a, −a. These are called the trivial divisors of a. A
divisor of a, which is not one of the four trivial divisors of a, is called a
proper divisor of a. If a nonzero integer a is not a unit and has no proper
divisors, then a is called a prime number. Thus 2, 3, 5, 7, 11 are prime
numbers. A nonzero integer, which is neither a unit nor a prime number,
will be called a composite number. So a ∈ ℤ\{0} is a composite number if
and only if there is a d ∈ ℤ with 1 < |d| < |a| and d|a.
Suppose now q2, q3, q4, . . . , qk−1 are true, so that 2,3,4, . . . , k−1 are either
prime numbers or products of prime numbers. We want to prove that k
is a prime number or a product of prime numbers. If k is prime, we are
done. If k is not prime, we have k = k1k2, 1 < k1 < k, 1 < k2 < k, for
some integers k1,k2. Since qk1 and qk2 are true by the induction
hypothesis, each of k1, k2 is either a prime number or a product of prime
numbers:
k1 = p1p2. . . pr,    k2 = p′1p′2. . . p′s,
where p1,p2, . . . , pr, p′1,p′2, . . . , p′s are prime numbers (r = 1 or s = 1 is
possible, in which case k1 = p1 or k2 = p′1 are prime numbers), and so
k = k1k2 = p1p2. . . prp′1p′2. . . p′s
is a product of prime numbers. Hence qk is true.
This proves the theorem for positive integers. For a negative integer n,
where n is not a unit, we have
−n = p1p2. . . pt
for some prime numbers p1,p2, . . . ,pt by what we proved above (possibly
t = 1). Hence
n = (−p1)p2. . . pt
is prime or is a product of prime numbers.
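The decomposition whose existence this argument establishes can actually be computed by trial division. A sketch (prime_factors is a name introduced here; negative n is handled by absorbing the sign into the first factor, as in the remark above):

```python
# Factor an integer n with |n| >= 2 into primes by trial division.
def prime_factors(n):
    """Return a list of primes (the first possibly negative) whose product is n."""
    sign = -1 if n < 0 else 1
    n = abs(n)
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:            # what remains is itself prime
        factors.append(n)
    factors[0] *= sign
    return factors

print(prime_factors(60))    # [2, 2, 3, 5]
print(prime_factors(-60))   # [-2, 2, 3, 5]
```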
After reading the proof of this theorem, it will be clear to the reader
that an abbreviation of the phrase "prime number or a product of prime
numbers" will be very useful. When we speak of a product, we mean a
product of two, three, or more terms. We now extend this to one factor.
A single term will be called a product of one factor. A
prime number is also a product of prime numbers with this convention.
Our theorem reads now more shortly as follows.
Now that we know any integer, which is not zero or a unit, can be ex-
pressed as a product of prime numbers, we ask if it can be written as a
product of prime numbers in different ways. By way of example, let us
begin decomposing 60 into prime numbers as in the proof of Theorem
5.13. We can begin from any decomposition of 60 into factors. For
instance,
60 = 10·6        60 = 15·4
Now we are to decompose each one of the factors 10,6,15,4 into smaller
factors until we get prime numbers. Will we reach the same prime
numbers if we use the two different decompositions as our starting
point? We know of course that further decomposition
60 = (2·5)(2·3)        60 = (3·5)(2·2)
yields the same prime numbers 2,2,3,5 (aside from order). Nevertheless,
our question should not be taken lightly. It is a very pertinent question.
We remark that Theorem 5.13 says nothing in this regard. Theorem 5.13
says that, after enough factorizations, the factors will be prime. As it is,
the prime numbers we obtain may very well be distinct if we start with
different factorizations. Indeed, if you start with different things, why
on earth should you end with the same things? If 60 can be written as a
product of factors in two different ways, as above, why should it not be
written as a product of prime factors in two different ways? The
reader's experience with the uniqueness of prime factors of integers
should not mislead him (or her) to believe the uniqueness is obvious. It
is anything but obvious.
With this understanding, we will prove that any integer (≠ 0, 1, −1) has a
unique decomposition into prime numbers. We need some lemmas.
Our proof will depend heavily on the following corollary to Theorem
5.12. It is Proposition 30 in Euclid's Elements, Book VII. We shall refer to
it as Euclid's lemma.
5.16 Lemma: Let a1,a2, . . . , an,p be integers. If p is prime and p|a1a2. . . an,
then p|a1 or p|a2 or . . . or p|an.
Assume first |n| = 2. Then n = 2 or n = −2 is prime and n itself is the
unique representation of n as a product of prime numbers (having only
one factor). So the theorem is true for n if |n| = 2.
Now we make the inductive hypothesis that |n| > 2 and that the
theorem is true for all k with 2 ≤ |k| ≤ |n| − 1, and prove it for n.
p1p2. . . pr−1 = q1q2. . . qs−1
r = s
m = p1^c1 p2^c2 . . . pr^cr,    n = p1^e1 p2^e2 . . . pr^er,        (*)
where p1, p2, . . . , pr are distinct prime numbers and ci ≥ 0, ei ≥ 0 for all
i = 1,2, . . . ,r, then (m,n) is given by
(m,n) = p1^t1 p2^t2 . . . pr^tr,
with ti = min{ci,ei} for all i = 1,2, . . . , r. Here min{x,y} denotes the smaller
(minimum) of x and y when x ≠ y and denotes x when x = y.
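This gcd formula can be checked against Python's built-in math.gcd. A sketch (factor_exponents and gcd_by_exponents are helper names introduced here):

```python
# gcd via prime factorizations: common primes to the minimum exponent.
from math import gcd, prod

def factor_exponents(n):
    """Map each prime p to its exponent in n (trial division, n >= 2)."""
    exps, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps

def gcd_by_exponents(m, n):
    cm, cn = factor_exponents(m), factor_exponents(n)
    return prod(p ** min(cm[p], cn[p]) for p in cm if p in cn)

print(gcd_by_exponents(360, 84))          # 12, i.e. 2^2 * 3
assert gcd_by_exponents(360, 84) == gcd(360, 84)
```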
It can be shown that ((a,b),c) = (a,(b,c)) for any a,b,c ∈ ℤ, provided a,b
are not both equal to zero and b,c are not both equal to zero. The
positive number ((a,b),c) is called the greatest common divisor of a,b,c,
and is denoted shortly by (a,b,c). One proves easily that (a,b,c) is the
unique integer d such that
ax + by + cz = (a,b,c).
is defined to be ((a1,a2, . . . , an−1), an). One can show that their greatest
common divisor (a1,a2, . . . , an−1, an) is the unique integer d such that
Our final topic in this paragraph will be the least common multiple of
two nonzero integers. If a,b and a|b, we say that b is a multiple of a.
5.19 Theorem: Let a,b ∈ ℤ, neither of them zero (i.e., a ≠ 0 ≠ b). Then
there is a unique integer m such that
Proof: The proof will be similar to that of Theorem 5.4. We consider the
set V = {n ∈ ℕ : a|n and b|n}. This set is not empty, since, for example,
|ab| is in V (here we use the hypothesis a ≠ 0 ≠ b). We choose the
smallest positive integer in V. Let it be called m. Thus m > 0 and m
satisfies (iii). Also, a|m and b|m since m ∈ V, and m satisfies (i). It
remains to show that m satisfies (ii).
in V smaller than the smallest natural number m in V, which is absurd.
Thus r = 0, so m1 = qm, and m|m1. This shows that m satisfies (ii).
Now the uniqueness of m. Suppose m′ satisfies the conditions (i), (ii), (iii),
too. Then a|m′, b|m′ by (i), and so m|m′ by (ii). Also, a|m, b|m by (i), and
so m′|m by (ii). Hence m|m′ and m′|m. By Lemma 5.2(12), we obtain
|m| = |m′|. From (iii), we have m ≥ 0, m′ ≥ 0, which yields m = m′. Thus m
is unique.
5.20 Definition: Let a,b ∈ ℤ, neither of them zero. The unique integer
m in Theorem 5.19 is called the least common multiple of a and b.
The least common multiple of a and b will be denoted by [a,b]. From the
proof of Theorem 5.19, we see that [a,b] is indeed the smallest of the
positive multiples of a and b ([a,b] is the smallest number in V). From
the fact that a|m and −a|m are equivalent, and likewise that b|m and
−b|m are equivalent, it follows that the defining conditions (i), (ii), (iii) do
not change when we replace a by −a or b by −b. Therefore, [a,b] = [−a,b] =
[−a,−b] = [a,−b]. In the same way, the conditions (i), (ii), (iii) in Theorem
5.19 are symmetric in a and b, and this gives [a,b] = [b,a].
The greatest common divisor and the least common multiple of two
integers will be connected in Lemma 5.22. We need a preliminary result.
Proof: As neither [a,b], nor (a,b), nor |ab| changes when we replace a
and b by their absolute values, we assume, without loss of generality,
that a > 0, b > 0. We put d = (a,b). We show that ab/d satisfies the
three conditions (i), (ii), (iii) in Theorem 5.19. Let a = a1d, b = b1d, so that
(a1,b1) = 1 by Lemma 5.11.
It can be shown that [[a,b],c] = [a,[b,c]] for any a,b,c ∈ ℤ, provided a,b,c
are all distinct from zero. The positive number [[a,b],c] is called the least
common multiple of a,b,c, and is denoted shortly by [a,b,c]. One proves
easily that [a,b,c] is the unique positive common multiple m of a,b,c
which divides every common multiple of a,b,c.
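The formula [a,b] = |ab|/(a,b) and the associativity of [ , ] can be spot-checked in Python (the sample values are ours):

```python
from math import gcd

def lcm(a, b):
    # [a, b] = |ab| / (a, b)
    return abs(a * b) // gcd(a, b)

assert lcm(4, 6) == 12
# Conditions (i)-(iii) are unchanged under sign changes:
# [a,b] = [-a,b] = [-a,-b] = [a,-b]
assert lcm(-4, 6) == lcm(-4, -6) == lcm(4, -6) == 12
# Associativity: [[a,b],c] = [a,[b,c]]
a, b, c = 4, 6, 10
assert lcm(lcm(a, b), c) == lcm(a, lcm(b, c)) == 60
```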
Exercises
2. Assume m,n ∈ ℕ and m ≠ n. What is (2^(2^m) + 1, 2^(2^n) + 1)?
3. Let a,b ∈ ℤ, neither of them equal to zero, and assume (a,b) = 1. Let
x0,y0 be integers such that ax0 + by0 = 1. Prove that all integer pairs x,y
satisfying ax + by = 1 are given by
x = x0 + bt, y = y0 − at
as t runs through all integers.
4. Let a,b,c be integers, none of them equal to zero, and let (a,b) = d.
Prove that there are integers x,y satisfying ax + by = c if and only if d|c.
Moreover, if d|c and x0,y0 are integers such that ax0 + by0 = c, prove that
all integer pairs x,y satisfying ax + by = c are given by
x = x0 + (b/d)t, y = y0 − (a/d)t
as t runs through all integers.
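A brute-force check of the parametrization in Exercise 4, with the sample values a = 6, b = 15, c = 9 (ours):

```python
from math import gcd

a, b, c = 6, 15, 9
d = gcd(a, b)                 # d = 3 divides c = 9, so solutions exist
x0, y0 = -1, 1                # a particular solution: 6*(-1) + 15*1 = 9

# Every x = x0 + (b/d)t, y = y0 - (a/d)t solves ax + by = c,
# because a*(b/d)t - b*(a/d)t = 0.
for t in range(-5, 6):
    x = x0 + (b // d) * t
    y = y0 - (a // d) * t
    assert a * x + b * y == c
```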
[m,n] = p1^u1 p2^u2 . . . pr^ur
with ui = max{ci,ei} for all i = 1,2, . . . , r. Here max{x,y} denotes the greater
(maximum) of x and y when x ≠ y and denotes x when x = y.
§6
Integers Modulo n
The set {0̄, 1̄, 2̄, . . . , (n−1)‾} of residue classes (mod n) will be denoted by
ℤn. An element of ℤn, that is, a residue class (mod n), is called an integer
modulo n, or an integer mod n. An integer mod n is not an integer, not
an element of ℤ; it is a subset of ℤ. An integer mod n is not an integer
with a property "mod n". It is an object whose name consists of the three
words "integer", "mod(ulo)", "n".
We put
ā ⊕ b̄ = (a + b)‾ (*)
ā ⊙ b̄ = (ab)‾ (**)
for all ā, b̄ ∈ ℤn (for all a,b ∈ ℤ). This is a very natural way of introducing
addition and multiplication on ℤn.
(*) and (**) seem quite innocent, but we must check that ⊕ and ⊙ are
really binary operations on ℤn. The reader might say at this point that
⊕ and ⊙ are clearly defined on ℤn and that there is nothing to check. But
yes, there is. Let us remember that a binary operation on ℤn is a
function from ℤn × ℤn into ℤn (Definition 3.18). As such, to each pair (ā,b̄)
in ℤn × ℤn, there must correspond a single element ā ⊕ b̄ and a single
element ā ⊙ b̄ if ⊕ and ⊙ are to be binary operations on ℤn (Definition
3.1). We must check that the rules (*) and (**) produce elements of ℤn
that are uniquely determined by ā and b̄.
The rules (*) and (**) above convey the wrong impression that ā ⊕ b̄
and ā ⊙ b̄ are uniquely determined by ā and b̄. In order to penetrate
into the matter, let us try to evaluate X ⊕ Y, where X,Y ∈ ℤn are not
given directly as the residue classes of integers a,b ∈ ℤ. (We discuss ⊕;
the discussion applies equally well to ⊙.) How do we find X ⊕ Y? Since
X,Y ∈ ℤn, there are integers a,b with ā = X, b̄ = Y. Now add a and b in ℤ
to get a + b ∈ ℤ, then take the residue class of a + b. The result is X ⊕ Y.
The result? The question is whether we have only one result to justify
the article "the". We summarize telegrammatically. To find X ⊕ Y,
1) choose a from X,
2) choose b from Y,
3) find a + b in ,
4) take the residue class of a + b.
This sounds a perfectly good recipe for finding X ⊕ Y, but notice that we
use some auxiliary objects, namely a and b, to find X ⊕ Y, which must be
determined by X and Y alone. Indeed, the result a + b depends explicitly
on the auxiliary objects a and b. We can use our recipe with different
auxiliary objects. Let us do it. 1) I choose a from X and you choose a1
from X. 2) I choose b from Y and you choose b1 from Y. 3) I compute
a + b and you compute a1 + b1. In general, a + b ≠ a1 + b1. Hence our recipe
gives, generally speaking, distinct elements a + b and a1 + b1. So far, both
of us followed the same recipe. I cannot claim that my computation is
correct and yours is false. Nor can you claim the contrary. Now we carry
out the fourth step. I find the residue class of a + b as X ⊕ Y, and you
find the residue class of a1 + b1 as X ⊕ Y. Since a + b ≠ a1 + b1 in ℤ, it can
very well happen that (a + b)‾ ≠ (a1 + b1)‾ in ℤn. On the other hand, if ⊕
is to be a binary operation on ℤn, we must have (a + b)‾ = (a1 + b1)‾. This is
the central issue. In order that ⊕ be a binary operation on ℤn, there
must work a mechanism which ensures (a + b)‾ = (a1 + b1)‾ whenever ā =
ā1, b̄ = b̄1, even if a + b ≠ a1 + b1. If there is such a mechanism, we say ⊕
is a well defined operation on ℤn. This means ⊕ is really a genuine
operation on ℤn: X ⊕ Y is uniquely determined by X and Y alone. Any
dependence of X ⊕ Y on auxiliary integers a ∈ X and b ∈ Y is only
apparent. We will prove that ⊕ and ⊙ are well defined operations on ℤn,
but before that, we discuss more generally the well definition of functions.
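The issue can be seen numerically: different representatives of the same residue classes give different integer sums, yet the same residue class, so the recipe's outcome depends on X and Y alone. A small Python sketch with n = 12 (the sample representatives are ours):

```python
n = 12
X, Y = 5, 8               # residue classes, represented by 5 and 8

# Different representatives of the same classes...
a, a1 = 5, 5 + 7 * n      # both lie in X
b, b1 = 8, 8 - 3 * n      # both lie in Y

# ...give different integer sums,
assert a + b != a1 + b1
# ...but the same residue class.
assert (a + b) % n == (a1 + b1) % n == 1
```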
A rule of this type uses an auxiliary object x. The result then depends on
a and x. At least, it seems so. This is due to the ambiguity in the second
step. This step states that we choose an x with such and such a property,
but there may be many objects x,y,z, . . . related to a in the prescribed
manner. The auxiliary objects x,y,z, . . . will, in general, produce different
results, so we should perhaps say that the result is f(a,x) (or f(a,y), f(a,z), . . . ).
In order for the above rule to be a function, it must produce the same
result. Hence we must have f(a,x) = f(a,y) = f(a,z) = . . . . The rule must be
so constructed that the same result will obtain even if we use different
auxiliary objects. If this is the case, the function is said to be well
defined.
6.2 Examples: (a) Let L be the set of all straight lines in the Euclidean
plane, on which we have a cartesian coordinate system. We consider the
"function" s: L → ℝ ∪ {∞}, which assigns the slope of the line l to l. How
do we find s(l)? As follows: 1) choose a point, say (x1,y1), on l; 2) choose
another point, say (x2,y2), on l; 3) evaluate x2 − x1 and y2 − y1; 4) put s(l)
= (y2 − y1)/(x2 − x1) if x1 ≠ x2 and s(l) = ∞ if x1 = x2. Clearly we can choose
the points in many ways. For example, we might choose (x1′,y1′) ≠ (x1,y1)
as the first point, (x2′,y2′) ≠ (x2,y2) as the second point. Then we have, in
general, x2′ − x1′ ≠ x2 − x1 and y2′ − y1′ ≠ y2 − y1, so we might suspect that
(y2′ − y1′)/(x2′ − x1′) ≠ (y2 − y1)/(x2 − x1). It is known from analytic geometry
that these two quotients are equal, hence s(l) depends only on l, and not
on the points we choose. Thus s is a well defined function. Ultimately,
this is due to the fact that there passes one and only one straight line
through two distinct points. The next example shows that well definition
breaks down if we modify the domain a little.
(b) Let C be the set of all curves in the Euclidean plane. We consider the
"function" s: C → ℝ ∪ {∞}, which assigns the "slope" of the curve c to c.
How do we find s(c)? As follows: 1) choose a point, say (x1,y1), on c; 2)
choose another point, say (x2,y2), on c; 3) evaluate x2 − x1 and y2 − y1; 4)
put s(c) = (y2 − y1)/(x2 − x1) if x1 ≠ x2 and s(c) = ∞ if x1 = x2. This is the
same rule as the rule in Example 6.2(a). Let us find the "slope" of the
curve y = x². 1) Choose a point on this curve, for example (0,0). If you
prefer, you might choose (−1,1). 2) Choose another point on this curve,
for example (2,4). You might choose (2,4) as well, of course. 3)
Evaluate the differences of coordinates. We find 2 − 0 and 4 − 0. You find
2 − (−1) and 4 − 1. Hence 4) the slope is 4/2. You find it to be 3/3. So s(c)
= 2 and s(c) = 1. This is nonsense. We see that different choices of the
points on the curve (different choices of the auxiliary objects) give rise
to different results. So the above rule is not a function. We do not say "s
is not a well defined function". s is simply not a function at all. s is not
defined.
(c) Let F be the set of all continuous functions on a closed interval [a,b].
We want to "define" an integral "function" I: F → ℝ, which assigns the
real number ∫_a^b f(x)dx to f ∈ F. So I(f) = ∫_a^b f(x)dx. I is a "function"
whose "domain" is a set of functions. How do we find I(f)? As follows. 1)
Choose an indefinite integral of f, that is, choose a function F on [a,b]
such that F′(x) = f(x) for all x ∈ [a,b] (we take one-sided derivatives at a
and b). 2) Evaluate F(a) and F(b). 3) Put I(f) = F(b) − F(a). There are
many functions F with F′(x) = f(x) for all x ∈ [a,b]. For two different
choices F1 and F2, we have F1(b) ≠ F2(b) and F1(a) ≠ F2(a) in general. So
we may suspect that F1(b) − F1(a) ≠ F2(b) − F2(a). In order to show that I
is a well defined function, we must prove F1(b) − F1(a) = F2(b) − F2(a)
whenever F1 and F2 are functions on [a,b] such that F1′(x) = f(x) = F2′(x)
for all x ∈ [a,b]. We know from the calculus that, when F1 and F2 have
this property, there is a constant c such that F1(x) = F2(x) + c for all
x ∈ [a,b]. So F1(b) − F1(a) = (F2(b) + c) − (F2(a) + c) = F2(b) − F2(a). There-
fore, I is well defined.
After this lengthy digression, we return to the integers mod n and to the
"operations" ⊕ and ⊙.
6.4 Lemma: For all ā, b̄, c̄ ∈ ℤn, the following hold.
(1) ā + b̄ ∈ ℤn;
(2) (ā + b̄) + c̄ = ā + (b̄ + c̄);
(3) ā + 0̄ = ā;
(4) ā + (−a)‾ = 0̄;
(5) ā + b̄ = b̄ + ā;
(6) ā·b̄ ∈ ℤn;
(7) (ā·b̄)·c̄ = ā·(b̄·c̄);
(8) ā·1̄ = ā;
(9) if (a,n) = 1, then there is an x̄ ∈ ℤn such that ā·x̄ = 1̄;
(10) ā·b̄ = b̄·ā;
(11) ā·(b̄ + c̄) = ā·b̄ + ā·c̄ and (b̄ + c̄)·ā = b̄·ā + c̄·ā;
(12) ā·0̄ = 0̄.
Proof: We prove (2) as a sample:
(ā + b̄) + c̄ = (a + b)‾ + c̄
= ((a + b) + c)‾
= (a + (b + c))‾
= ā + (b + c)‾
= ā + (b̄ + c̄).
The remaining assertions are proved in the same way by drawing bars
over integers in the corresponding equations in ℤ. We prove only (9),
which is not as straightforward as the other claims. If (a,n) = 1, then
there are integers x,y with ax + ny = 1 (Lemma 5.10). Using (3) and (12),
we get 1̄ = (ax + ny)‾ = (ax)‾ + (ny)‾ = ā·x̄ + n̄·ȳ = ā·x̄ + 0̄·ȳ = ā·x̄ + 0̄ = ā·x̄.
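The proof of (9) is constructive: the x and y of Lemma 5.10 come from the extended Euclidean algorithm, and x̄ is then the multiplicative inverse of ā. A Python sketch (the function names are ours):

```python
def ext_gcd(a, b):
    """Extended Euclid: return (d, x, y) with d = gcd(a, b) = ax + by."""
    if b == 0:
        return (a, 1, 0)
    d, x, y = ext_gcd(b, a % b)
    return (d, y, x - (a // b) * y)

def inverse_mod(a, n):
    # If (a, n) = 1 there are x, y with ax + ny = 1, and then
    # x represents the inverse of the class of a in Z_n.
    d, x, _ = ext_gcd(a, n)
    if d != 1:
        raise ValueError("a and n are not relatively prime")
    return x % n

assert (7 * inverse_mod(7, 12)) % 12 == 1
```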
Exercises
(e) g(ā) = (a³,169);
(f) g(ā) = (a,6);
(g) g(ā) = (a²,65);
where ā ∈ ℤ13 and a ∈ ℤ.
2) Let f: ℤ12 × ℤ12 → ℤ12 be such that (ā,b̄) ↦ (a² + ab + b²)‾. Is f well
defined?
(e) ℤ5 → ℤ6, ā ↦ (a + 1)‾.
CHAPTER 2
Groups
§7
Basic Definitions
7.1 Examples: (a) Consider the addition of integers. From the numer-
ous properties of this binary operation, we single out the following ones.
(i) + is a binary operation on ℤ, so, for any a,b ∈ ℤ, we have
a + b ∈ ℤ.
(ii) For all a, b, c ∈ ℤ, we have (a + b) + c = a + (b + c).
(iii) There is an integer, namely 0, which has the property
a + 0 = a for all a ∈ ℤ.
(iv) For all a ∈ ℤ, there is an integer, namely −a, such that
a + (−a) = 0.
(iii) There is a positive real number, namely 1, which has
the property
a·1 = a for all positive real numbers a.
(iv) For all positive real numbers a, there is a positive real
number, namely 1/a, such that
a·(1/a) = 1.
(d) Let X be a nonempty set and let SX be the set of all one-to-one
mappings from X onto X. Consider the composition o of mappings in SX.
(i) o is a binary operation on SX, for if α and β are one-to-one
mappings from X onto X, so is α o β by Theorem 3.13.
(ii) For all α, β, γ ∈ SX, we have (α o β) o γ = α o (β o γ) (Theorem
3.10).
(iii) There is a mapping in SX, namely ιX ∈ SX, such that
α o ιX = α for all α ∈ SX (Example 3.9(a)).
(iv) For all α ∈ SX, there is a mapping in SX, namely α⁻¹, such
that
α o α⁻¹ = ιX.
(See Theorem 3.14 and Theorem 3.16. That α⁻¹ ∈ SX follows from
Theorem 3.17(1).)
that set having the same properties as in the examples above. A group
will thus consist of two parts: a set and a binary operation. Formally, a
group is an ordered pair whose components are the set and the opera-
tion in question.
When (G, o ) is a group, we also say that G is (or builds, or forms) a group
with respect to o (or under o). Since a group is an ordered pair, two
groups (G, o) and (H, *) are equal if and only if G = H and the binary
operation o on G is equal to the binary operation * on H (i.e., o and * are
identical mappings from G × G into G). On one and the same set G, there
may be distinct binary operations o and * under which G is a group. In
this case, the groups (G, o) and (G, *) are distinct.
The four conditions (i)-(iv) of Definition 7.2 are known as the group ax-
ioms. The first axiom (i) is called the closure axiom. When (i) is true, we
say G is closed under o .
An element e of a set G, on which there is a binary operation o, is called
a right identity element or simply a right identity if a o e = a for all a in
G. The third group axiom (iii) ensures that a group G has a right identity
element. We will show presently that a group has precisely one right
identity element, but we have not proved it yet and we must be careful
not to use the uniqueness of the right identity before we prove it. All we
know at this stage is that a group has at least one right identity for which
(iv) holds. As it is, there may be many right identities. In addition, there
may be some right identities for which (iv) is true and also some for
which (iv) is false. For the time being, these possibilities are not
excluded.
They will be excluded in Lemma 7.3, where we will prove further that
our unique right identity is also a left identity. A left identity element
or a left identity of G, where G is a nonempty set with a binary
operation o on it, is by definition an element f of G such that f o a = a for
all a ∈ G. The group axioms say nothing about left identities. If (G, o) is a
group, we do not yet know if there is a left identity in G at all, nor do we
know any relation between right and left identities. For the time being,
there may be no, one, or many left identities in G. If there is only one
left identity, it may or may not be a right identity. If there are many left
identities, some or one or none of them may be right identities.
We mention all these possibilities so that the reader does not read in the
axioms more than what they really say. The group axioms say nothing
about left identities or about the uniqueness of the right identity.
Before we lose ourselves in chaos, we had better prove our lemma.
7.3 Lemma: Let (G, o) be a group and let e be a right identity element
of G such that, for all a ∈ G, there exists a suitable x in G with a o x = e.
The existence of e is assured by the group axioms (iii) and (iv).
(1) If g ∈ G is such that g o g = g, then g = e.
(2) e is the unique right identity in G.
(3) A right inverse of an element in G is also a left inverse of the same
element. In other words, if a o x = e, then x o a = e.
(4) e is a left identity in G. That is, e o a = a for all a ∈ G.
(5) e is the unique left identity in G.
(6) Each element has a unique right inverse in G.
(7) Each element has a unique left inverse in G.
(8) The unique right inverse of any a ∈ G is equal to the unique left
inverse of a.
(2) The claim is that e is the unique right identity in G. This means: if
f ∈ G is a right identity, that is, if a o f = a for all a ∈ G, then f = e.
Suppose f is a right identity. Then a o f = a for all a ∈ G. Writing f for a in
particular, we see f o f = f. Hence f = e by part (1).
a o x = x o a (by part (3))
(a o x) o a = (x o a) o a
a o (x o a) = (x o a) o a
a o e = e o a
a = e o a.
Therefore, e is a left identity as well. This proves part (4).
(5) The claim is that e is the unique left identity in G. This means: if f is
a left identity in G so that f o a = a for all a ∈ G, then f = e. We know that
the right identity e is a left identity (part (4)), and that e is the unique
right identity (part (2)). So we conclude that e is the unique left
identity. Is this correct? No, this is wrong. This would be correct if we
knew that any left identity is also a right identity (and so the unique
right identity by part (2)), which is not what part (4) states. For all we
proved up to now, there may very well be a unique right identity and
many left identities (among them the right identity). We are to show in
part (5) that this is impossible.
After so much fuss, now the correct proof, which is very short. Suppose
f o a = a for all a ∈ G. Write in particular f for a. Then f o f = f and part (1)
yields f = e.
(6) The claim is that each element a ∈ G has a unique right inverse in G.
We know that a has at least one right inverse, say x. We have a o x = e.
We are to show: if a o y = e, then y = x (here y ∈ G). Suppose then
a o x = e and a o y = e. We obtain
x o a = e (by part (3))
(x o a) o y = e o y
x o (a o y) = e o y
x o e = e o y
x = e o y
x = y (by part (4)).
(7) and (8) Let a ∈ G and let x be the unique right inverse of a. From
part (3), we know that x is a left inverse of a, so that x o a = e. We must
prove: if x o a = e and y o a = e, then y = x. Suppose then x o a = e and
y o a = e. Then
a o x = e
y o (a o x) = y o e
(y o a) o x = y
e o x = y
x = y.
This completes the proof.
According to Lemma 7.3, a group (G, o) has one and only one right
identity, which is also the unique left identity. Therefore, we can refer
to it as the identity of the group, without mentioning right or left. Simi-
larly, since any a ∈ G has a unique right inverse, which is also the
unique left inverse of a, we may call it the inverse of a. The inverse of a
is uniquely determined by a; for this reason, we introduce a notation
displaying the fact that it depends on a alone. We write a⁻¹ for the
inverse of a (read: a inverse). Thus a⁻¹ is the unique element of G such
that a o a⁻¹ = a⁻¹ o a = e, where e is the identity of the group.
Yes, this is true since + is an associative operation on ℤ. Hence o is asso-
ciative.
(iii) Is there an element in ℤ × ℤ, (a0,b0) say, such that
(a,b) o (a0,b0) = (a,b) for all (a,b) ∈ ℤ × ℤ?
Well, this is true if and only if (a,b + b0) = (a,b), which is equivalent to
b0 = 0. There is no condition on a0. For example,
(a,b) o (0,0) = (a,b + 0) = (a,b)
(a,b) o (1,0) = (a,b + 0) = (a,b)
for all (a,b) ∈ ℤ × ℤ, so (0,0) and (1,0) are right identities. In fact, any
(n,0) is a right identity.
From Lemma 7.3, we know that a group has one and only one right
identity, so ℤ × ℤ is not a group under o. On the other hand, with
respect to (0,0) for example (in fact, with respect to any right identity),
each element (a,b) of ℤ × ℤ has a left inverse (0,−b):
(0,−b) o (a,b) = (0,−b + b) = (0,0)
(with respect to (n,0), a left inverse of (a,b) is (n,−b)).
4) that for each a ∈ G, there is an a⁻¹ ∈ G such that a o a⁻¹ = e,
5) that a⁻¹ o a = e as well,
6) that this a⁻¹ is the unique element of G with a o a⁻¹ = e = a⁻¹ o a,
which more than doubles our work. With our Definition 7.2, we need
check only 1) and 4). The other items 2), 3), 5), 6) follow from 1) and 4)
automatically. We pay for our comfort by having to prove Lemma 7.3,
but, once this is over, we have less work to do in order to see whether a
given set G forms a group under a given operation o on it, as in the
following examples.
7.4 Examples: (a) For any two elements a,b of ℚ\{1}, we put
a o b = ab − a − b + 2. We ask if ℚ\{1} is a group under o. Let us check the
group axioms.
(i) For all a,b ∈ ℚ\{1}, we observe a o b = ab − a − b + 2 ∈ ℚ,
but this is not enough. We must prove a o b ≠ 1 also. Let a,b ∈ ℚ, a ≠ 1 ≠
b. We suppose a o b = 1 and try to reach a contradiction. If a o b = 1, then
ab − a − b + 2 = 1
ab − a − b + 1 = 0
(a − 1)(b − 1) = 0
a − 1 = 0 or b − 1 = 0
a = 1 or b = 1,
a contradiction. So a o b ∈ ℚ\{1} and o is a binary operation on ℚ\{1}.
(ii) For all a,b,c ∈ ℚ\{1}, we ask if (a o b) o c = a o (b o c), that is,
whether
(ab − a − b + 2) o c = a o (bc − b − c + 2),
(ab − a − b + 2)c − (ab − a − b + 2) − c + 2 = a(bc − b − c + 2) − a − (bc − b − c + 2) + 2,
abc − ac − bc + 2c − ab + a + b − 2 − c + 2 = abc − ab − ac + 2a − a − bc + b + c − 2 + 2.
The answer is "yes": both sides reduce to abc − ab − ac − bc + a + b + c. So
o is associative.
(iii) We are looking for an e ∈ ℚ\{1} such that a o e = a for all
a ∈ ℚ\{1}. Assuming such an e exists, we get
ae − a − e + 2 = a
ae − e = 2a − 2
(a − 1)e = 2(a − 1)
e = 2 (since a − 1 ≠ 0).
We have not proved that 2 ∈ ℚ\{1} is a right identity element. We
showed only that a right identity element, if it exists at all, has to be 2.
Let us see if 2 is really a right identity. We observe
a o 2 = a·2 − a − 2 + 2 = 2a − a = a
for all a ∈ ℚ\{1}. Since 2 ∈ ℚ\{1}, 2 is indeed a right identity in ℚ\{1}.
(iv) For all a ∈ ℚ\{1}, we must find an x ∈ ℚ\{1} such that
a o x = 2. Well, this gives
ax − a − x + 2 = 2
ax − a − x + 1 = 1
(a − 1)(x − 1) = 1
x − 1 = 1/(a − 1)
x = a/(a − 1),
which is meaningful since a ≠ 1. We have not proved yet that a/(a − 1) is
a right inverse of a. We showed only that a right inverse of a ∈ ℚ\{1}, if
it exists at all, has to be a/(a − 1). We must now show that a o a/(a − 1) =
2 for all a ∈ ℚ\{1} and also that a/(a − 1) ∈ ℚ\{1}. Good. We have
a o a/(a − 1) = a(a/(a − 1)) − a − (a/(a − 1)) + 2
= (a − 1)(a/(a − 1)) − a + 2
= 2,
and also a/(a − 1) ≠ 1, for a/(a − 1) ∈ ℚ and a/(a − 1) = 1 would imply
that a = a − 1, hence 0 = −1, which is absurd.
Since all the group axioms hold, ℚ\{1} is a group under o.
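The four axioms can also be spot-checked with exact rational arithmetic; a small Python sketch (the sample values are arbitrary):

```python
from fractions import Fraction
from itertools import product

def op(a, b):
    # a o b = ab - a - b + 2 on Q \ {1}
    return a * b - a - b + 2

sample = [Fraction(p, q) for p, q in [(2, 1), (3, 2), (-1, 3), (5, 4)]]
for a, b, c in product(sample, repeat=3):
    assert op(op(a, b), c) == op(a, op(b, c))   # associativity
for a in sample:
    assert op(a, 2) == a                        # 2 is the identity
    inv = a / (a - 1)
    assert op(a, inv) == 2 and inv != 1         # a/(a-1) inverts a
```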
x = −4 − a.
−4 − a is indeed a right inverse of a since a * (−4 − a) = a + (−4 − a) + 2 =
−2.
Therefore ℤ is a group with respect to *.
(c) Let A be a nonempty set and let P(A) be the set of all subsets of A. The
elements of P(A) are thus subsets of A. Consider the forming of symmetric
differences Δ (§1, Ex.7). P(A) is a group under Δ:
(i) For all S,T ∈ P(A), S Δ T is a subset of A, so S Δ T ∈ P(A) and P(A) is
closed under Δ.
(ii) Δ is associative (§1, Ex.8).
(iii) ∅ is a right identity (§1, Ex.8).
(iv) Each element S of P(A) has a right inverse, namely S itself,
as S Δ S = ∅ for all S ∈ P(A) (§1, Ex.8).
So P(A) is a group under Δ.
square is known as the Cayley table or the operation table
(multiplication or addition table, as the case may be) of the group (G, o ).
+ 0 1 2 3
0 0 1 2 3
1 1 2 3 0
2 2 3 0 1
3 3 0 1 2
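Such a table can be generated mechanically; a short Python sketch that rebuilds the addition table of ℤ4 above:

```python
n = 4
rows = [[(i + j) % n for j in range(n)] for i in range(n)]

# Print in the same layout as the table above.
print("+ " + " ".join(str(j) for j in range(n)))
for i, row in enumerate(rows):
    print(str(i) + " " + " ".join(str(v) for v in row))
```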
Proof: (1) We prove first that there can be at most one x ∈ G such that
a o x = b. Let a o x = b = a o x1. We prove x = x1. We have
a o x = a o x1
a⁻¹ o (a o x) = a⁻¹ o (a o x1)
(a⁻¹ o a) o x = (a⁻¹ o a) o x1
e o x = e o x1
x = x1
by Lemma 7.3. So there can be at most one x with a o x = b.
The proof of (2) is similar and is left to the reader.
o | e a b        * | e a b
e | e a b        e | e a b
a | a a a        a | a b e
b | b b b        b | b e a
To decide associativity, one checks whether each of the equations
(a o a) o a = a o (a o a)    (b o a) o a = b o (a o a)
(a o a) o b = a o (a o b)    (b o a) o b = b o (a o b)
(a o b) o a = a o (b o a)    (b o b) o a = b o (b o a)
(a o b) o b = a o (b o b)    (b o b) o b = b o (b o b)
holds.
We close this paragraph with some comments on the group axioms. The
reader might ask why we should study the structures (G, o ) where o
satisfies the axioms (i),(ii),(iii),(iv). Why do we not study structures (G, o )
where o satisfies the axioms (i),(iii),(iv),(v) or (i),(ii),(iii),(v)? What is the
reason for preferring the axioms (i),(ii),(iii),(iv) to some other combination
of (i),(ii),(iii),(iv),(v)? There is of course no reason why other combinations
ought to be excluded from study. As a matter of fact, all combinations
have a proper name and there are theories about them. However, they
are very far from having the same importance as the combination (i),(ii),
(iii),(iv).
Exercises
1. Determine whether the following sets build groups with respect to the
operations given. In each case, state which group axioms are satisfied.
(a) ℤ under subtraction, multiplication and division.
(b) ℚ\{0}, ℝ\{0}, ℂ\{0} under multiplication.
(c) {0,1}, {−1,1} under multiplication.
(d) {z ∈ ℂ : |z| ≤ 1} under multiplication.
(e) {z ∈ ℂ : |z| = 1} under multiplication.
(f) 5ℤ = {5z : z ∈ ℤ} under multiplication and addition.
(g) {x} under o, where x o x = x.
(h) {(t,u) : t,u ∈ ℤ, t² − 5u² = 4} under *, where * is defined by
(t1,u1) * (t2,u2) = ((t1t2 + 5u1u2)/2, (t1u2 + t2u1)/2)
for all (t1,u1),(t2,u2) in this set.
(i) ℤ6 and ℤ8 under multiplication and addition.
(j) ℤ7 and ℤ7\{0̄} under multiplication.
(k) {f,g} under the composition of mappings, where f: x ↦ x and
g: x ↦ 1/(1 − x) are functions from ℝ\{1} into ℝ\{1}.
(l) {f,g,h} under the composition of mappings, where f: x ↦ x and
g: x ↦ 1/(1 − x) and h: x ↦ (x − 1)/x are functions from ℝ\{1,0} into
ℝ\{1,0}.
(m) {f_a,b : a,b ∈ ℝ, a ≠ 0} under the composition of mappings,
where f_a,b is defined by f_a,b(x) = ax + b as a function from ℝ into ℝ.
§8
Conventions and Some Computational Lemmas
Proof: (1) If ab = ac, we multiply by a⁻¹ on the left and get a⁻¹(ab) =
a⁻¹(ac). Using associativity, we obtain (a⁻¹a)b = (a⁻¹a)c. So 1b = 1c. Since 1
is the identity element of G, we finally get b = c.
Now let us consider the product of four elements a,b,c,d. Their product
in this order will be defined by three successive multiplications of two
elements. This can be done in five distinct ways:
a(b(cd)), a((bc)d), (ab)(cd), ((ab)c)d, (a(bc))d,
but these five products are all equal by associativity. The first two
products are equal since b(cd) = (bc)d. The last two products are equal
since (ab)c = a(bc). Further, we have a (b(cd)) = (ab)(cd) [put cd = e, then
a(be) = (ab)e] and (ab)(cd) = ((ab)c )d [put ab = f, then f(cd) = (fc)d]. So
the five products are equal. This renders it possible to drop the
parentheses and write simply abcd. This is the product of a,b,c,d in the
given order.
8.3 Lemma: Let G be a nonempty set and let there be defined an
associative binary operation on G, denoted by juxtaposition. Let
a1,a2, . . . ,an ∈ G. Then the products of a1,a2, . . . ,an are independent of the
mode of putting parentheses. This means the following. We define
P1(a1) = {a1}
P2(a1,a2) = {a1a2}
P3(a1,a2,a3) = {xy : x ∈ P1(a1), y ∈ P2(a2,a3)} ∪ {xy : x ∈ P2(a1,a2), y ∈ P1(a3)}
P4(a1,a2,a3,a4) = {xy : x ∈ P1(a1), y ∈ P3(a2,a3,a4)} ∪ {xy : x ∈ P2(a1,a2),
y ∈ P2(a3,a4)} ∪ {xy : x ∈ P3(a1,a2,a3), y ∈ P1(a4)}
..........................................
Claim: For all n ∈ ℕ and for all a1,a2, . . . ,an ∈ G, the set Pn(a1,a2, . . . ,an)
contains one and only one element.
Proof: The proof will be by induction on n (in the form 4.5). For n =
1,2, it is evident that P1(a1), P2(a1,a2) each have exactly one element. For
n = 3, the claim is just the associativity of multiplication. For n = 4, the
argument preceding the lemma proves the claim. Notice that we used
only the associativity of multiplication there.
We prove u = v first under the assumption i = j. By induction, the set
Pi(a1,a2, . . . ,ai) contains one and only one element. Hence x = s. Also,
applying the induction hypothesis to n − i, with the elements ai+1, . . . ,an,
we conclude that Pn−i(ai+1, . . . ,an) has one and only one element. This
gives y = t. Then we get u = xy = sy = st = v. So the claim is proved in
case i = j.
or by ∏(i=1 to n) ai.
Using the notation of Definition 8.4, we can reformulate Lemma 8.3 as
follows. If G is a nonempty set with an associative multiplication on it,
and if a1, a2, . . . , an ∈ G, then
a1(a2 . . . an) = (a1a2)(a3 . . . an) = (a1a2a3)(a4 . . . an) = . . . = (a1a2 . . . an−1)an = a1a2 . . . an.
Proof: (1) We prove a^m a^n = a^(m+n). If m ≥ 1, n ≥ 1, Lemma 8.5 yields the
result. If m = 0, then a^0 a^n = 1·a^n = a^n = a^(0+n) for all n ∈ ℤ; and if n = 0,
then a^m a^0 = a^m·1 = a^m = a^(m+0) for all m ∈ ℤ. So we have
(a^m)^n = a^(mn) whenever m,n ≥ 0. (e′)
Now (a^(−m))^n = [(a⁻¹)^m]^n = (a⁻¹)^(mn) = a^(−(mn)) = a^((−m)n). This proves (i). We also
get (a^m)^(−n) = a^(−(mn)) = a^(m(−n)). This proves (ii). Finally, we have (a^(−m))^(−n) =
[(a^(−m))⁻¹]^n = ([(a^m)⁻¹]⁻¹)^n = (a^m)^n = a^(mn) = a^((−m)(−n)). This proves (iii).
Thus (a^m)^n = a^(mn) for all m,n ∈ ℤ,
as was to be proved.
Proof: (1) We prove ab^n = b^n a. The case n = 0 is trivial. Also, ab^1 = ab =
ba = b^1 a by hypothesis and the claim is true for n = 1. Suppose now
n ∈ ℕ, n ≥ 2 and the claim is proved for n − 1, so that ab^(n−1) = b^(n−1) a.
Then ab^n = a(b^(n−1) b) = (ab^(n−1))b = (b^(n−1) a)b = b^(n−1)(ab) = b^(n−1)(ba) = (b^(n−1) b)a =
b^n a. By induction, ab^n = b^n a for all n ∈ ℕ.
Let now n ≤ −1, so that −n ≥ 1 and ab^(−n) = b^(−n) a.
We multiply this relation by b^n on the left and on the right. This gives
b^n a = ab^n for n ≤ −1. So ab^n = b^n a is true also when n ≤ −1. So ab^n = b^n a
for all n ∈ ℤ.
(2) We have b^n a = ab^n by (1). We use this as a hypothesis and apply (1)
with a,b,n replaced by b^n,a,m, respectively. Then we obtain a^m b^n = b^n a^m
for all m,n ∈ ℤ.
= (b(a1a2 . . . am−1))am
= b((a1a2 . . . am−1)am)
= b(a1a2 . . . am−1 am),
as was to be proved.
. . . , an−1 and the arrangement k1, . . . ,kj−1, kj+1, . . . ,kn of the numbers 1,2, . . . ,
n−1, we have ak1 . . . akj−1 akj+1 . . . akn = a1a2 . . . an−1; therefore
ak1 ak2 . . . akn = (ak1 . . . akj−1 akj+1 . . . akn)an
= (a1a2 . . . an−1)an
= a1a2 . . . an−1 an
(3) That (ab)^n = a^n b^n is proved for n ≥ 0. We are to prove it also when
n ≤ −1. Replacing n by −n, we are to prove that (ab)^(−n) = a^(−n) b^(−n) for n ≥ 1.
We note that ab = ba implies b⁻¹a⁻¹ = (ab)⁻¹ = (ba)⁻¹ = a⁻¹b⁻¹, so the
hypothesis of (1) is satisfied when we replace a by a⁻¹ and b by b⁻¹.
Using (1) with a⁻¹,b⁻¹ in place of a,b, respectively, we obtain
8.15 Lemma: Let G be a commutative group. Then (ab)^n = a^n b^n for all
a,b ∈ G and for all n ∈ ℤ.
Proof: (1),(2),(3) follow from Lemma 8.7 and (4) from Lemma 8.15.
Notice that commutativity is essential for (4).
Exercises
1. Let G be a group such that a² = 1 for all a ∈ G. Prove that G is
commutative.
§9
Subgroups
Since H ⊆ G, we have all the more so (ab)c = a(bc) for all a,b,c ∈ H.
Indeed, if all the elements of G have a certain property, then all the
elements of H will have the same property. Thus associativity holds in H
automatically, so to speak. We do not have to check it.
In H, there must exist an identity, say 1_H ∈ H such that a1_H = a for all
a ∈ H. In particular, the identity 1_H of H has to be such that 1_H 1_H = 1_H.
Since 1_H ∈ H ⊆ G, Lemma 7.3(1) yields 1_H = 1_G = identity element of G.
So the identity element of G is also the identity element of H, provided it
belongs to H. Then we do not have to look for an identity element of H;
we must only check that the identity element of G does belong to H. We
write 1 for the identity element of H, since it is the identity element of
G.
So we can dispense with checking 1 ∈ H when we know H ≠ ∅. On the
other hand, when we do not know a priori that H ≠ ∅, the easiest way to
ascertain H ≠ ∅ may be to check that 1 ∈ H.
9.3 Lemma: (1) Let G be a group and let H be a nonempty finite subset
of G. Then H is a subgroup of G if and only if H is closed under
multiplication.
Proof: (1) We prove that 9.2(ii) follows from 9.2(i) when H is finite, so
that 9.2(i) and 9.2(ii) are together equivalent to 9.2(i), which is the
claim. So, for all a ∈ H, we must show that a⁻¹ ∈ H under the assumption
that H is finite and closed under multiplication.
(2) This follows from (1), since any subset of a finite set is finite.
9.4 Examples: (a) For any group G, the subsets {1} and G are
subgroups of G. Here {1} is called the trivial subgroup of G.
(ii) if x ∈ 4ℤ, then 4|x, so 4|(−x) and −x ∈ 4ℤ.
Hence 4ℤ is a subgroup of ℤ.
(g) Let S[0,1] be the set of all one-to-one mappings from [0,1] onto [0,1],
which is a group under the composition of mappings (Example 7.1(d)).
Consider
T = {α ∈ S[0,1] : α(0) = 0}.
(h) Let U = {1̄,3̄,5̄,7̄} ⊆ ℤ8 and consider the multiplication in ℤ8. We see
1̄1̄ = 1̄  1̄3̄ = 3̄  1̄5̄ = 5̄  1̄7̄ = 7̄
3̄1̄ = 3̄  3̄3̄ = 1̄  3̄5̄ = 7̄  3̄7̄ = 5̄
5̄1̄ = 5̄  5̄3̄ = 7̄  5̄5̄ = 1̄  5̄7̄ = 3̄
7̄1̄ = 7̄  7̄3̄ = 5̄  7̄5̄ = 3̄  7̄7̄ = 1̄
So U is a group. Let us find its subgroups. Now we can use Lemma 9.3.
This lemma shows that {1̄,3̄}, {1̄,5̄}, {1̄,7̄} are subgroups of U since they
are closed under multiplication. The reader will easily see that these are
the only nontrivial proper subgroups of U. Hence the subgroups of U
have orders 1, 2, 4, which are all divisors of the order |U| = 4 of U.
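Lemma 9.3 turns the subgroup search into a pure closure check, which can be confirmed by brute force in Python (we represent the classes 1̄,3̄,5̄,7̄ by the integers 1, 3, 5, 7):

```python
from itertools import combinations

U = (1, 3, 5, 7)

def mul(a, b):
    # multiplication of residue classes mod 8
    return (a * b) % 8

# By Lemma 9.3, a nonempty finite subset is a subgroup iff it is
# closed under multiplication.
def is_subgroup(s):
    return all(mul(a, b) in s for a in s for b in s)

subgroups = [set(c) for r in range(1, 5)
             for c in combinations(U, r) if is_subgroup(set(c))]
assert len(subgroups) == 5        # {1}, three subgroups of order 2, U itself
assert {1, 3} in subgroups and {1, 5} in subgroups and {1, 7} in subgroups
```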
(i) E := {1,−1,i,−i} ⊆ ℂ\{0} is a subgroup of the group ℂ\{0} of nonzero
complex numbers under multiplication by Lemma 9.3, as it is closed
under multiplication. The same lemma shows that {1,−1} is a subgroup
of E. Also, E has no other nontrivial proper subgroup, for any subgroup
of E that contains i or −i must contain i²,i³,i⁴ or (−i)²,(−i)³,(−i)⁴ and thus
must be E itself. So E has exactly three subgroups, one of order 1, one of
order 2, one of order 4. Here, too, the orders of the subgroups are
divisors of the order |E| = 4 of the group E.
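The closure argument for E can also be checked numerically; this sketch (my own illustration) uses Python's built-in complex numbers:

```python
# E = {1, -1, i, -i} is closed under complex multiplication, so it is a
# subgroup of the group of nonzero complex numbers by Lemma 9.3.
E = {1, -1, 1j, -1j}
assert all(a * b in E for a in E for b in E)

# {1, -1} is closed too, hence a subgroup of E:
assert all(a * b in {1, -1} for a in {1, -1} for b in {1, -1})

# but a proper subset containing i is not closed:
assert 1j * 1j not in {1, 1j}        # i^2 = -1 escapes {1, i}
```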
(j) Lemma 9.3 may be false if the subset is not finite. For example, ℤ is
a group under addition, ℕ is a subset of ℤ and ℕ is closed with respect
to addition. Still, ℕ is not a subgroup of ℤ, since there is no additive
identity in ℕ (0 ∉ ℕ).
Exercises
4. Let L = {1̄, 2̄, 4̄, 5̄, 7̄, 8̄} ⊆ ℤ₉. Show that L is a group under multiplication.
Find all subgroups of L. Do the orders of the subgroups divide the order
|L| = 6 of the group L?
§10
Lagrange's Theorem
Ha := {ha ∈ G : h ∈ H} ⊆ G
aH := {ah ∈ G : h ∈ H} ⊆ G
Right and left cosets of H are subsets of G. When the group is written
additively, we write H + a = {h + a ∈ G : h ∈ H} and a + H = {a + h ∈ G : h ∈
H} for the right and left cosets of H. A right coset is not necessarily a left
coset and a left coset is not necessarily a right coset. However, when the
group is commutative, the right and left cosets coincide, as is evident
from the definition. During a particular discussion, we usually fix a
subgroup H of a group G and consider its various (right or left) cosets.
Then we refer to Ha as the right coset of a ∈ G, or as the right coset of H
determined by a. We use similar expressions for aH.
Cosets are subsets of a group, so the equality of two cosets is defined by
mutual inclusion. We ask when two cosets are equal. The next lemma
gives an answer.
Proof: We prove only the assertions for right cosets and leave the
discussion of left cosets to the reader.
a = hb and b = h⁻¹a,
h′a = h′hb ∈ Hb and h′b = h′h⁻¹a ∈ Ha for all h′ ∈ H,
Ha ⊆ Hb and Hb ⊆ Ha,
so Ha = Hb.
(5) Ha = Hb if and only if a = hb for some h ∈ H, and there is a unique h
with a = hb, namely h = ab⁻¹ (Lemma 7.5(2)); thus a = hb for some h ∈ H
if and only if ab⁻¹ ∈ H.
Now we prove that the right cosets of H are mutually disjoint. Assume
Ha ∩ Hb ≠ ∅. We are to show Ha = Hb. Since Ha ∩ Hb ≠ ∅, we may take
c ∈ Ha ∩ Hb. Then c ∈ Ha and c ∈ Hb. So Ha = Hc and Hc = Hb by Lemma
10.2(4). We obtain Ha = Hb.
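The partition of a group into cosets is easy to see in a small concrete case. The following sketch (example and names mine) takes the subgroup H = {0, 4, 8} of ℤ₁₂, written additively, and checks that its distinct cosets are disjoint and cover the group:

```python
# Cosets H + a of H = {0, 4, 8} in Z_12 (additive notation).
G = range(12)
H = [0, 4, 8]

cosets = {frozenset((h + a) % 12 for h in H) for a in G}
assert len(cosets) == 4                      # the index |G : H|
assert sum(len(c) for c in cosets) == 12     # disjointness: sizes add up
assert set().union(*cosets) == set(G)        # the cosets cover G
```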
10.5 Lemma: Let H ≤ G. Right congruence modulo H and left congruence
modulo H are equivalence relations on G.
Proof: We give the proof for right congruence only. We check that it is
reflexive, symmetric and transitive.
Proof: We must find a one-to-one correspondence between the set ℛ of
right cosets of H in G and the set ℒ of left cosets of H in G. We put φ: ℛ → ℒ,
Ha ↦ a⁻¹H.
We show that φ is a one-to-one, onto mapping. First we prove that it is a
mapping at all. Indeed, how do we find Xφ if X ∈ ℛ? Well, we write
X = Ha, that is, we choose an a ∈ X; then we find the inverse of this a, and
"map" X = Ha to the left coset a⁻¹H of H determined by a⁻¹. So we must
show that Xφ is independent of the element a we choose from X, i.e., that
φ is a well defined function. We are to prove
Ha = Hb ⟹ (Ha)φ = (Hb)φ.
If Ha = Hb, then ab⁻¹ ∈ H by Lemma 10.2(5), then (ab⁻¹)⁻¹ ∈ H, so ba⁻¹ ∈
H, so a⁻¹H = b⁻¹H by Lemma 10.2(5), and (Ha)φ = (Hb)φ. Hence φ is indeed
a well defined function.
0 + 4ℤ, 1 + 4ℤ, 2 + 4ℤ, 3 + 4ℤ
of 4ℤ in ℤ are all the left cosets of 4ℤ in ℤ. Hence |ℤ : 4ℤ| = 4.
Incidentally, we see that Definition 10.4 is a natural generalization of the
congruence relation on ℤ.
Proof: We prove the lemma for right cosets only. For any a ∈ G, we
must find a one-to-one correspondence between H and Ha. What is more
natural than the mapping
ψ: H → Ha
h ↦ ha
from H into Ha? Now ψ is indeed a mapping from H into Ha. It is one-to-one,
for hψ = h′ψ (h, h′ ∈ H) implies ha = h′a, which gives h = h′ after cancelling
a (Lemma 8.1(2)). Also, ψ is onto by the very definition of Ha. So we get
|Ha| = |H|.
G = ⋃_{Ha ∈ ℛ} Ha,
where ℛ is the set of distinct right cosets of H in G. Since the cosets Ha
are disjoint, we obtain
|G| = Σ_{Ha ∈ ℛ} |Ha|
when we count the elements. Since |Ha| = |H| for all Ha ∈ ℛ by Lemma
10.8, we get
|G| = Σ_{Ha ∈ ℛ} |Ha| = Σ_{Ha ∈ ℛ} |H| = |ℛ||H| = |G:H||H|.
The basic idea of the preceding proof is simple. We have a disjoint union
G = ⋃_{Ha ∈ ℛ} Ha and we count the elements. Then we get |G| = Σ_{Ha ∈ ℛ} |Ha|.
In the sequel, we will prove some important results by a similar reasoning.
We will have a disjoint union S = ⋃_{i ∈ I} Tᵢ and, counting the elements, we
will get |S| = Σ_{i ∈ I} |Tᵢ|. See §§25,26.
Proof: We are to show that {1} and G are the only subgroups of G. Now
if H ≤ G, then |H| divides |G| by Lagrange's theorem, so |H| divides p and
|H| = 1 or p. If |H| = 1, then necessarily H = {1}. If |H| = p, then |H| = |G|
and H ⊆ G together yield H = G.
|G:H| = |G|/|H|,  |H:K| = |H|/|K|,  so  |G:H||H:K| = (|G|/|H|)(|H|/|K|) = |G|/|K| = |G:K|.
We give another proof of this result which works also in the case of
infinite groups and infinite indices.
We must prove |G:K| = |I × J|. Since |I × J| = |I||J|, this will be accomplished if
we can find a one-to-one correspondence between I × J and the set of
right cosets of K in G. How we find this correspondence will be clear
when we observe
(i,j) ↦ Kbⱼaᵢ
Exercises
1. Find the right cosets of all subgroups of U (Example 9.4(h)), of E
(Example 9.4(i)) and of L (§9,Ex.4).
§11
Cyclic Groups
(b) In §9, Ex.4, the reader proved that L = {1̄, 2̄, 4̄, 5̄, 7̄, 8̄} is a group under
multiplication (mod 9). Let us find the cyclic subgroup of L generated by
2̄. We have
2̄⁰ = 1̄, 2̄¹ = 2̄, 2̄² = 4̄, 2̄³ = 8̄, 2̄⁴ = 7̄, 2̄⁵ = 5̄,
so L = {1̄, 2̄, 4̄, 5̄, 7̄, 8̄} = {2̄ⁿ ∈ L : n = 0,1,2,3,4,5} ⊆ {2̄ⁿ ∈ L : n ∈ ℤ} = ⟨2̄⟩,
11.4 Lemma: Let G be a group and a ∈ G. Then o(a) is finite if and only
if there is a natural number n with aⁿ = 1. If this is the case, then o(a) is
the smallest natural number s such that aˢ = 1.
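In a finite group, o(a) can be computed directly as this smallest exponent. A small sketch (function name mine), working in the multiplicative group mod 7:

```python
def order_mod(a, n):
    """Order of a in the group of units mod n, i.e. the smallest
    natural number s with a^s = 1 (a must be coprime to n)."""
    s, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        s += 1
    return s

assert order_mod(3, 7) == 6   # powers of 3: 3, 2, 6, 4, 5, 1
assert order_mod(2, 7) == 3   # 2^3 = 8 = 1 (mod 7)
assert order_mod(6, 7) == 2   # 6 = -1 (mod 7)
```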
Suppose now there are natural numbers n with aⁿ = 1, that is, suppose
that A ≠ ∅. We prove that o(a) is finite, and is in fact the smallest natural
number in A. To this end, let s be the smallest natural number in A.
We show first s ≤ o(a) and then o(a) ≤ s.
Consider the s elements a⁰, a¹, a², . . . , a^(s−1) of ⟨a⟩. These are all distinct,
for if
a^i = a^j,  i ≠ j,  0 ≤ i,j ≤ s − 1,
say with i < j, then
a^(j−i) = 1,  0 < j − i ≤ s − 1,
so j − i ∈ A and j − i ≤ s − 1,
contradicting that s is the smallest natural number in A. So there are at
least s distinct elements in ⟨a⟩. This gives s ≤ |⟨a⟩| = o(a).
h = qs + r,  q,r ∈ ℤ,  0 ≤ r ≤ s − 1,
a^h = a^(qs+r) = a^(sq)a^r = (a^s)^q a^r = 1^q a^r = a^r,
so a^h ∈ {a⁰, a¹, a², . . . , a^(s−1)},
⟨a⟩ ⊆ {a⁰, a¹, a², . . . , a^(s−1)},
o(a) ≤ s.
Suppose now the condition in the lemma does not hold. Then there are
m,k ∈ ℤ with a^m = a^k, m ≠ k. Assume m > k without loss of generality.
Then m − k ∈ ℕ and a^(m−k) = 1. There is a natural number n, namely
n = m − k, with aⁿ = 1. Then o(a) is finite by Lemma 11.4. Hence o(a) = ∞
implies that a^m ≠ a^k whenever m ≠ k (m,k ∈ ℤ).
n = qs + r,  q,r ∈ ℤ,  0 ≤ r ≤ s − 1,
aⁿ ∈ H, for instance n = |m|. Thus the set {n ∈ ℕ : aⁿ ∈ H} is not empty.
From the natural numbers in this set, we choose the smallest one and
call it t.
n = tq + r,  q,r ∈ ℤ,  0 ≤ r ≤ t − 1,
a^r = a^(n−tq) = aⁿ(a^t)^(−q) ∈ H,
Proof: (1) Suppose o(a) = ∞. If o(a^k) were finite, say o(a^k) = m ∈ ℕ, then
(a^k)^m = 1, so a^(km) = 1 = a⁰, although km and 0 are distinct integers,
contrary to Lemma 11.5. So o(a) = ∞ implies o(a^k) = ∞.
o(a^k) = smallest natural number s such that n/(n,k) divides s
       (Lemma 5.11 and Theorem 5.12)
     = n/(n,k).
11.10 Lemma: Let G = ⟨a⟩ be a cyclic group of order |G| = n. For any
positive divisor m of n, there is a unique subgroup H of order |H| = m,
namely ⟨a^(n/m)⟩.
Lemma 11.10 implies that a finite cyclic group G has, for any positive
divisor k of |G|, a unique subgroup of index k. This reformulation of
Lemma 11.10 extends immediately to infinite cyclic groups.
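Lemma 11.10 can be confirmed by brute force in a small case. The sketch below (example and names mine) lists the subgroups of the additive cyclic group ℤ₁₂; since every subgroup of a cyclic group is cyclic, enumerating the subgroups ⟨g⟩ finds them all:

```python
n = 12

def generated(g):
    """Cyclic subgroup of Z_n generated by g, written additively."""
    return frozenset(g * k % n for k in range(n))

subgroups = {generated(g) for g in range(n)}
# exactly one subgroup per positive divisor of 12:
assert sorted(len(h) for h in subgroups) == [1, 2, 3, 4, 6, 12]
```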
can be written as a^(tq+r) with some uniquely determined integers q,r,
where 0 ≤ r ≤ t − 1. Thus any element aⁿ of G belongs to one and only
one of the subsets
⟨a^t⟩a⁰, ⟨a^t⟩a¹, ⟨a^t⟩a², . . . , ⟨a^t⟩a^(t−1)
La⁰, La¹, La², . . . , La^(t−1)
We have learned the structure of cyclic groups quite well, but we have
had only a few examples. We have not seen any cyclic group of order 5
or 7. For all we know about cyclic groups up to now, it is conceivable
that there is no cyclic group of order 5 or 7. We show next that there is
a cyclic group of any order. Incidentally, this shows that there are
groups of all orders.
ℤ (under addition) is a cyclic group of infinite order, as ℤ = {m1 : m ∈ ℤ}
= ⟨1⟩ is generated by 1 ∈ ℤ.
ℤₙ (under addition) is a cyclic group of order n, as ℤₙ = {m1̄ : m ∈ ℤ} =
⟨1̄⟩ is generated by 1̄ ∈ ℤₙ.
Exercises
4. Let G be a group and a ∈ G. Let n,k ∈ ℕ and let m = [n,k] be the least
common multiple of n and k. Prove that ⟨aⁿ⟩ ∩ ⟨a^k⟩ = ⟨a^m⟩.
and o(a1) = n1, o(a2) = n2.
6. Let G be a group and a,b ∈ G. Assume that o(a) ≠ ∞, o(b) ≠ ∞ and that
o(a), o(b) are relatively prime. Prove: if ab = ba, then o(ab) = o(a)o(b).
Prove also that o(ab) = o(a)o(b) is not necessarily true when the
hypothesis ab = ba is omitted.
§12
Group of Units Modulo n
By the definition of Euler's phi function, we conclude |ℤₙ×| = φ(n). So
φ(12) = 4 and in fact ℤ₁₂× = {1̄, 5̄, 7̄, 11̄}. Also, φ(15) = 8 and ℤ₁₅× =
{1̄, 2̄, 4̄, 7̄, 8̄, 11̄, 13̄, 14̄}.
p | ab and p | n
⟹ (p | a or p | b) and p | n  (Euclid's lemma)
⟹ (p | a and p | n) or (p | b and p | n)
⟹ p | (a,n) or p | (b,n),
(ii) Multiplication in ℤₙ× is associative, since it is in fact associative
in ℤₙ (Lemma 6.4(7)).
(iii) 1̄ ∈ ℤₙ× as (1,n) = 1, and ā·1̄ = ā for all ā ∈ ℤₙ×. Hence 1̄ is an
identity element of ℤₙ×.
get āx̄ = 1̄, so x̄ is an inverse of ā. Yes, but this is not enough. We must
further show that x̄ ∈ ℤₙ×, or equivalently that (x,n) = 1. This follows
from the equation ax + ny = 1, since d = (x,n) implies d | x, d | n, so
d | ax + ny, so d | 1, so d = 1.
Hence ℤₙ× is a group under multiplication.
ℤₙ× is a finite group of order φ(n). Using Lemma 11.7, we obtain ā^φ(n) = 1̄
for all ā ∈ ℤₙ×. Writing this in congruence notation, we get an important
theorem of number theory due to L. Euler.
12.5 Theorem (Euler's theorem): Let n ∈ ℕ. For all integers a that are
relatively prime to n, we have
a^φ(n) ≡ 1 (mod n).
for all integers a that are relatively prime to p (i.e., for all integers a
such that p ∤ a).
a^p ≡ a (mod p)
for all integers a.
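Euler's theorem (and Fermat's as the special case n = p) is easy to spot-check numerically; in this sketch (helper names mine), φ is computed by brute-force counting:

```python
from math import gcd

def phi(n):
    """Euler's phi function: the number of 1 <= k <= n with (k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

assert phi(12) == 4 and phi(15) == 8

for n in (12, 15, 7):
    for a in range(1, 3 * n):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1     # a^phi(n) = 1 (mod n)

# Fermat's version: a^p = a (mod p) for ALL integers a, p prime
assert all(pow(a, 7, 7) == a % 7 for a in range(-10, 20))
```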
Exercises
1. Prove that ℤₙ× is an abelian group under multiplication.
4. Show that ℤ₃×, ℤ₉×, ℤ₂₇×, ℤ₈₁× are cyclic.
8. Show that ℤ_{pq}× is not cyclic if p and q are distinct positive odd prime
numbers. (Hint: What is φ(pq), and what is a^((p−1)(q−1)/2) congruent to
(mod pq) if a is an integer relatively prime to pq?)
§13
Groups of Isometries
For any nonempty set X, the set S_X of all one-to-one mappings from X
onto X is a group under the composition of mappings (Example 7.1(d)).
In particular, if X happens to be the Euclidean plane E, then E is the set
of all points in the plane and S_E is a group. We note that E is not merely
an ordinary set of points. An important feature of E is that there is a
measure of distance between the points of E. Among the mappings in S_E,
we examine those functions which preserve the distance between any
two points. Clearly, such functions will be more important than other
ones in S_E, since such mappings respect an important structure of the
Euclidean plane E.
This word is derived from "isos" and "metron", meaning "equal" and
"measure" in Greek. The set of all isometries of E will be denoted by
Isom E. Since the identity mapping ι_E : E → E is evidently an isometry,
Isom E is a nonempty subset of S_E. In fact, Isom E ≤ S_E.
Proof: We must show that the product of two isometries and the
inverse of an isometry are isometries (Lemma 9.2).
(ii) Let σ ∈ Isom E and let P,Q be any two points in E. Since σ ∈ S_E,
there are uniquely determined points P′,Q′ in E such that P′σ = P,
Q′σ = Q. Thus Pσ⁻¹ = P′, Qσ⁻¹ = Q′. Then
d(Pσ⁻¹, Qσ⁻¹) = d(P′, Q′) = d(P′σ, Q′σ) = d(P, Q),
so σ⁻¹ preserves distance. Hence Isom E ≤ S_E.
Figure 1 (the translation τ_{a,b} maps (0,0) to (a,b) and (x,y) to (x+a, y+b))
(2) We have (x,y)τ_{0,0} = (x + 0, y + 0) = (x,y) = (x,y)ι for all (x,y) ∈ E. Thus
τ_{0,0} = ι.
Proof: First of all, we must show that any translation belongs to S_E. Let
τ_{a,b} be an arbitrary translation (a,b ∈ ℝ). There is a mapping σ: E → E
such that τ_{a,b}σ = ι = στ_{a,b}, namely σ = τ_{−a,−b}, by Lemma 13.4(3). Thus
τ_{a,b} is one-to-one and onto by Theorem 3.17(2). Hence τ_{a,b} ∈ S_E.
Next we show d((x₁,y₁),(x₂,y₂)) = d((x₁,y₁)τ_{a,b}, (x₂,y₂)τ_{a,b}) for any two
points (x₁,y₁), (x₂,y₂) in E. We have
d((x₁,y₁)τ_{a,b}, (x₂,y₂)τ_{a,b}) = d((x₁ + a, y₁ + b), (x₂ + a, y₂ + b))

x = r cos θ,  y = r sin θ.
Figure 2 (rotation about the center C through an angle α maps P = (r,θ)
to Q = (r, θ + α))
In our fixed cartesian coordinate system, the image of any point P = (x,y)
can be found as follows. If P has polar coordinates (r,θ), then its image
will have polar coordinates (r, θ + α), so the cartesian coordinates x′,y′
of Q := (r, θ + α) are
13.8 Lemma: Let ρ_α and ρ_β be arbitrary rotations about the origin.
(1) ρ_α ρ_β = ρ_{α+β}.
(2) ρ₀ = ι_E = ι.
(3) ρ_α ρ_{−α} = ι = ρ_{−α} ρ_α.
Proof: First of all, we must show that any rotation about the origin
belongs to S_E. Let ρ_α be an arbitrary rotation about the origin (α ∈ ℝ).
There is a mapping σ: E → E such that ρ_α σ = ι = σρ_α, namely σ = ρ_{−α},
by Lemma 13.8(3). Thus ρ_α is one-to-one and onto by Theorem 3.17(2).
Hence ρ_α ∈ S_E.
Now we prove that ρ_α preserves distance. For any two points (x,y),(u,v)
in E, we have
d²((x,y)ρ_α, (u,v)ρ_α)
= d²((x cos α − y sin α, x sin α + y cos α),(u cos α − v sin α, u sin α + v cos α))
= [(x − u)cos α − (y − v)sin α]² + [(x − u)sin α + (y − v)cos α]²
= (x − u)²cos²α − 2(x − u)(y − v)cos α sin α + (y − v)²sin²α
  + (x − u)²sin²α + 2(x − u)(y − v)cos α sin α + (y − v)²cos²α
= (x − u)²(cos²α + sin²α) + (y − v)²(sin²α + cos²α)
= (x − u)² + (y − v)²
= d²((x,y),(u,v)),
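The same fact can be spot-checked numerically; this is only a floating-point sketch (all names mine):

```python
from math import sin, cos, hypot, isclose

def rotate(point, alpha):
    """Rotation about the origin through the angle alpha."""
    x, y = point
    return (x * cos(alpha) - y * sin(alpha), x * sin(alpha) + y * cos(alpha))

p, q, alpha = (1.0, 2.0), (-3.0, 0.5), 0.73
d_before = hypot(p[0] - q[0], p[1] - q[1])
pr, qr = rotate(p, alpha), rotate(q, alpha)
d_after = hypot(pr[0] - qr[0], pr[1] - qr[1])
assert isclose(d_before, d_after)     # distance is preserved
```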
So far, we have been dealing with rotations about the origin. What about
rotations about an arbitrary point C, whose coordinates are (a,b), say? A
rotation about C through an angle α will map the point P with coordinates
(x + a, y + b) to the point Q with coordinates (x′ + a, y′ + b), where (x′,y′)
is the point to which (x,y) is mapped under the rotation about the origin
through the angle α. So the image of (x,y)τ_{a,b} will be (x′,y′)τ_{a,b}. See
Figure 3. This suggests the following formal definition.
13.11 Definition: Let C = (a,b) be a point in E. The mapping
(τ_{a,b})⁻¹ρ_α τ_{a,b}: E → E is called a rotation about C through an angle α.
Figure 3 (the rotation about C = (a,b) maps (x+a, y+b) to (x′+a, y′+b))
We put (τ_{a,b})⁻¹Rτ_{a,b} := {(τ_{a,b})⁻¹ρ_α τ_{a,b} : α ∈ ℝ}. This is the set of all
rotations about the point (a,b). It is a subgroup of Isom E. The proof of
this statement is left to the reader.
The geometric idea of a reflection is that there is a line m and that each
point P is mapped to its "mirror image" Q on the other side of m. So PQ is
perpendicular to m and d(P,R) = d(R,Q), where R is the point of
intersection of m and PQ. See Figure 4.
Figure 4 (P is mapped to its mirror image Q; R is the midpoint of PQ on m)
13.12 Definition: Let m be a straight line in E and let σ_m: E → E be the
mapping defined by
Pσ_m = P if P is on m
and
Pσ_m = Q if P is not on m and if m is the perpendicular bisector of PQ.
σ_m is called the reflection in the line m.

13.13 Lemma: Let σ_m be the reflection in a line m. Then σ_m ≠ ι and
σ_m² = ι.
Proof: Pσ_m ≠ P if P is not on the line m, and so σ_m ≠ ι. Now we prove
that σ_m² = ι. We have Pσ_m² = P(σ_mσ_m) = (Pσ_m)σ_m = Pσ_m = P when P is
a point on m, by definition. It remains to show Pσ_m² = P also when P is
not on m. Let P be a point not on m and let Q = Pσ_m, P₁ = Qσ_m. Then Q
is not on m, and m is the perpendicular bisector of PQ as well as of QP₁.
So PQ and QP₁ are parallel lines. Since they have a point Q in common,
they are identical lines. Let R be the point at which m and PQ intersect.
So P₁ ≠ Q, and P₁ is that point on PQ for which d(Q,R) = d(P₁,R). Since P
is on PQ and d(Q,R) = d(P,R), we obtain P = P₁, as was to be proved.
intersect m at S. Then d(Q,S) = d(S,Q₁) and d(P,S) = d(S,P₁), since P = P₁
and the angles PSQ and P₁SQ₁ are both right angles. By the side-angle-side
condition, the triangles QPS and Q₁P₁S are congruent. So the
corresponding sides PQ and P₁Q₁ have equal length. This means d(P,Q) =
d(P₁,Q₁).
From now on, assume that neither P nor Q is on m. Let m intersect PP1 at
N and QQ1 at S.
Figure 5 (Cases 2, 3 and 4)
Proof: {ι, σ_m} is a finite nonempty subset of Isom E by Lemma 13.14. It
is closed under multiplication by Lemma 13.13. So it is a subgroup of
Isom E by Lemma 9.3(1).
Proof: (1) When P = (a,b) and Q = (c,d), say, then τ_{m,n} maps P to Q if and
only if (a + m, b + n) = (c,d), i.e., if and only if m = c − a, n = d − b. So
τ_{c−a,d−b} is the unique translation that maps P to Q.
(2) We draw the circle whose center is at P and whose radius is equal to
d(P,Q). This circle passes through R by hypothesis. Let α be the angle
which the circular arc QR subtends at the center P. Then the rotation
about P through the angle α maps Q to R.
13.17 Lemma: Let P,Q,R be three distinct points in E that do not lie on
a straight line. Let σ,τ be isometries such that Pσ = Pτ, Qσ = Qτ, Rσ = Rτ.
Then σ = τ.
Figure 6
Proof: We put α = στ⁻¹. We suppose α ≠ ι and try to reach a contradiction.
If α ≠ ι, then there is a point N in E such that Nα =: N′ ≠ N. Since Pσ = Pτ
by hypothesis, Pα = P and so P ≠ N. Similarly Q ≠ N and R ≠ N. Now α is
an isometry, so d(P,N′) = d(Pα,Nα) = d(P,N), and likewise d(Q,N′) = d(Q,N)
and d(R,N′) = d(R,N). So the circle with center at P and radius d(P,N) and
the circle with center at Q and radius d(Q,N) intersect at the points N and
N′. Then PQ is the perpendicular bisector of NN′. Here we used N ≠ N′.
But d(R,N′) = d(R,N), and R lies therefore on the perpendicular bisector
of NN′, i.e., R lies on PQ, contrary to the hypothesis that P,Q,R do not lie
on a straight line. Hence necessarily α = ι and σ = τ.
13.18 Theorem: Let P,Q,R be three distinct points in E that do not lie
on a straight line and let P′,Q′,R′ be three distinct points in E. Assume
that d(P,Q) = d(P′,Q′), d(P,R) = d(P′,R′), d(Q,R) = d(Q′,R′). Then there are a
translation τ, a rotation ρ (about an appropriate point and through a
suitable angle) and a reflection σ such that
Pτρκ = P′, Qτρκ = Q′, Rτρκ = R′,
where κ denotes ι or σ.
R₁ = R₂.
Exercises
2. Let m and n be two distinct lines intersecting at a point P. Show that
σ_mσ_n is a rotation about P. Through which angle?
6. A halfturn σ_P about a point P = (a,b) is defined as the mapping given
by
(x,y) ↦ (2a − x, 2b − y)
for all points (x,y) in E. Show that any halfturn is an isometry of order
two. Prove that the product of three halfturns is a halfturn.
7. Prove that a halfturn σ_P is the product of any two reflections in lines
intersecting perpendicularly at P.
10. Prove that the product of four reflections can be written as a product
of two reflections.
12. Prove that every nonidentity translation is of infinite order and that
ρ_α is of finite order if and only if α is a rational multiple of π.
§14
Dihedral Groups
Let F be any nonempty subset of the Euclidean plane E. Here F might be
a set with a single point, a line, a geometric figure or an arbitrary subset
of E. Let σ ∈ S_E. We put
Fσ = {xσ ∈ E : x ∈ F}.
14.2 Definition: Let F be a nonempty subset of E and let σ ∈ Isom E. If
Fσ = F, then σ is called a symmetry of F.
Fτ_{1,m} = {fτ_{1,m} ∈ E : f ∈ F}
= {(x,y)τ_{1,m} ∈ E : y = mx}
= {(x+1, y+m) ∈ E : y = mx}
= {(x+1, (x+1)m) ∈ E : x ∈ ℝ}
= {(u,v) ∈ E : v = mu}
= F.
(ii) If σ ∈ Sym F, then Fσ = F, so Fσ⁻¹ = (Fσ)σ⁻¹ = F(σσ⁻¹) = Fι = F
by Lemma 14.1. Thus σ⁻¹ ∈ Sym F.
We now study the symmetry groups of regular polygons. For our
purposes, it will be convenient to define regular polygons as follows. Let
K be a circle and let P₁,P₂, . . . ,Pₙ be n points on this circle K such that
each one of the arcs P₁P₂, P₂P₃, . . . , Pₙ₋₁Pₙ subtends an angle of 2π/n
radians at the center of K (where n ≥ 3). So the points P₁,P₂, . . . ,Pₙ divide
the circle K into n circular arcs of equal length. The union of the line
segments P₁P₂, P₂P₃, . . . , Pₙ₋₁Pₙ, PₙP₁ is called a regular n-gon. The circle
K is called the circumscribing circle of this regular n-gon. This is justified
since a regular n-gon has a unique circumscribing circle. The center of
the circumscribing circle is called the center of the regular n-gon, and the
points P₁,P₂, . . . ,Pₙ are called the vertices of the regular n-gon.
Figure 1 (a regular n-gon with vertices P₁, P₂, . . . , Pₙ on its
circumscribing circle)
Now let σ ∈ Sym F. Then σ is completely determined by its effect on
three distinct points not on a straight line (Lemma 13.17). For example,
σ is determined by Cσ, P₁σ, P₂σ. We have already remarked that Cσ = C.
Also, P₁σ = Pₖ for some k ∈ {1,2, . . . ,n}. What about P₂σ? Since σ is an
isometry, P₂σ will be a vertex whose distance from Pₖ is equal to the
distance between P₁ and P₂. Thus P₂σ will be adjacent to Pₖ: it is either
Pₖ₋₁ or Pₖ₊₁. We see that there are n choices for P₁σ and, once the choice
for P₁σ has been made, there are two choices for P₂σ. Hence there are at
most n·2 = 2n isometries in Sym F. We exhibit 2n symmetries of F and
this will prove |Sym F| = 2n.
Figure 2

Figure 4
The discussion of a general regular polygon follows much the same lines.
Consider a rotation about the center of F through an angle of 2π/n
radians, which we denote by ρ. Under ρ, the vertices P₁,P₂, . . . ,Pₙ are
mapped respectively to P₂,P₃, . . . ,Pₙ,P₁. It is seen that ρ^k maps P₁,P₂, . . . ,Pₙ
respectively to Pₖ₊₁,Pₖ₊₂, . . . ,Pₖ₊ₙ (indices mod n), where k is any integer.
Thus ρ^k = ι if and only if Pₖ₊ᵢ = Pᵢ, that is, if and only if k + i ≡ i (mod n)
for all i, so if and only if n | k, from which we obtain o(ρ) = n. In this way,
we have found n symmetries of F, namely ρ, ρ², . . . , ρⁿ (= ι). Here ⟨ρ⟩ is a
cyclic subgroup of order n of Sym F.
Figure 5 (ρ maps each vertex Pᵢ to Pᵢ₊₁)
Now consider the reflection σ in the angular bisector of the angle PₙP₁P₂.
The bisector of this angle passes through P₍ₙ/₂₎₊₁ if n is even and through
the midpoint of P₍ₙ₊₁₎/₂P₍ₙ₊₃₎/₂ if n is odd. One reads off from Figure 6 that
Pₖσ = Pₙ₊₂₋ₖ for k = 1,2, . . . ,n (indices mod n).
Figure 6 (the reflection σ interchanges Pₖ and Pₙ₊₂₋ₖ)
Pⱼσρ = Pₙ₊₂₋ⱼρ = P₍ₙ₊₂₋ⱼ₎₊₁ = Pₙ₋ⱼ₊₃,
Pⱼρ⁻¹σ = Pⱼ₋₁σ = P₍ₙ₊₂₎₋₍ⱼ₋₁₎ = Pₙ₋ⱼ₊₃,
so σρ = ρ⁻¹σ, as can be seen from Figure 7 too.
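The relation σρ = ρ⁻¹σ can be checked on the vertex labels alone. In the sketch below (a model of my own, not the book's notation) the vertices are labelled 0, 1, . . . , n−1, ρ adds 1 mod n, and σ is the reflection i ↦ −i mod n; mappings act on the right, so "στ" means "first σ, then τ":

```python
n = 7
rho = lambda i: (i + 1) % n        # rotation: each vertex to the next
rho_inv = lambda i: (i - 1) % n    # its inverse
sigma = lambda i: (-i) % n         # a reflection fixing vertex 0

for i in range(n):
    # i(sigma rho) == i(rho^{-1} sigma) for every vertex i
    assert rho(sigma(i)) == sigma(rho_inv(i))
```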
Figure 7
14.6 Lemma: Let G be a group and let ρ,σ ∈ G be such that σ² = 1 and
σρ = ρ⁻¹σ. Then σρⁿ = ρ⁻ⁿσ for all n ∈ ℤ.
σρⁿ = (ρⁿ)⁻¹σ = ρ⁻ⁿσ for all n ∈ ℤ,
and thus σρⁿ = ρ⁻ⁿσ for all n ∈ ℤ. This completes the proof.

Sym F = {ι, ρ, ρ², . . . , ρⁿ⁻¹, σ, ρσ, ρ²σ, . . . , ρⁿ⁻¹σ}.
in "D2n" (whether it designates an arbitrary dihedral group or the
particular dihedral group Sym F) is harmless.
Some people use Definition 14.7 only when n ≥ 3. They do not consider
D4 as a dihedral group. This is consistent with the fact that D4 is not the
symmetry group of any regular polygon (see however Ex. 10). But then
they have to formulate the following theorem of Leonardo da Vinci (yes,
of Leonardo da Vinci (1452-1519)) less beautifully.
This theorem will not be used in the sequel and its proof is left to the
reader.
Exercises
2. Let σ be an isometry and P,Q two distinct points in E. Show that
(PQ)σ = PσQσ, both for the segment PQ and for the line PQ. (Hint:
PQ = {R ∈ E : d(P,R) + d(R,Q) = d(P,Q)}.)
5. Let P₁,P₂, . . . ,Pₙ be the vertices of a regular n-gon. If n happens to be
odd, let Qᵢ be the midpoint of the side P[(n−1)/2]+iP[(n+1)/2]+i (i = 1,2, . . . ,n).
Prove that the center C of the regular n-gon is uniquely determined as
the midpoint of PᵢPᵢ₊₍ₙ/₂₎ if n is even and as the midpoint of PᵢQᵢ if n is
odd. (As the radius of a circumscribing circle is equal to d(Pᵢ,C), this
proves that a circumscribing circle is completely determined by the
vertices. Hence there is a unique circumscribing circle of a regular n-gon.)
8. Let P be a point and m a line. Find all isometries that fix P, all
isometries that fix m and all isometries that fix m pointwise. Show
directly that these three sets are subgroups of Isom E.
(a) σ,τ ∈ S_ℝ.
(b) |xσ − yσ| = |xτ − yτ| = |x − y|. (So σ and τ preserve distance in ℝ.
For this reason, they are said to be isometries of ℝ.)
(c) o(σ) = ∞ and o(τ) = 2.
(d) τσ = σ⁻¹τ. (Thus σ,τ satisfy the conditions on a,b in Definition 14.7,
except that n is replaced by ∞ here. A dihedral group of infinite order is
a group D having elements a,b such that o(a) = ∞, o(b) = 2, ba = a⁻¹b and
D = {a^k b^r : k ∈ ℤ, r = 0,1}.
The group generated by σ,τ is an example of a dihedral group of infinite
order.)
14. Prove that a group generated by two distinct elements a,b such that
o(a) = 2 = o(b) is a dihedral group (of finite or infinite order).
15. Let n be any natural number or ∞. Find a group G and a,c ∈ G such
that o(a) = 2 = o(c) and o(ac) = n. (So o(ac) cannot be determined from
o(a) and o(c) alone.)
§15
Symmetric Groups
For any nonempty set X, the set S_X of all one-to-one mappings from X
onto X is a group under the composition of functions (Example 7.1(d)).
In particular, choosing X to be the set {1,2, . . . ,n} of the first n natural
numbers, we get a group S_{1,2,...,n}. We abbreviate this group as Sₙ.
The reader should not confuse the symmetric group with the symmetry
group of a figure in the Euclidean plane.
permutations in Sn.
We introduce a notation for permutations. Let n ∈ ℕ and σ ∈ Sₙ. Then
σ is a mapping σ: {1,2, . . . ,n} → {1,2, . . . ,n} and can be exhibited by
associating any number in {1,2, . . . ,n} with its image by an arrow. Thus
the σ ∈ S₅ for which 1σ = 3, 2σ = 1, 3σ = 2, 4σ = 5, 5σ = 4 can be displayed
as
1 → 3
2 → 1
3 → 2
4 → 5
5 → 4,
or more compactly as
σ = (1 2 3 4 5
     3 1 2 5 4)
for our σ. In general, we write any σ ∈ Sₙ as
σ = (. . . a  . . .
     . . . aσ . . .).
In this notation, there are two rows of n elements and n columns of two
elements. The first row consists of the numbers 1,2, . . . ,n. The image
under σ of any a ∈ {1,2, . . . ,n} is written just below a in the second row.
This notation is due to A. Cauchy (1789-1857). For example, the identity
permutation is
ι = (1 2 3 . . . n
     1 2 3 . . . n).
The inverse of any σ ∈ Sₙ is found easily. By definition, σ⁻¹ is the function
(permutation) that maps aσ to a, for all a ∈ {1,2, . . . ,n}. Let σ be
(. . . a  . . .
 . . . aσ . . .).
Then, under σ⁻¹, any element in the second row is mapped to the number
just above it. σ⁻¹ is therefore obtained by interchanging the rows of σ.
For instance, in S₇, we have
(1 2 3 4 5 6 7)⁻¹   (7 6 3 5 4 1 2)
(7 6 3 5 4 1 2)   = (1 2 3 4 5 6 7),
which may also be written as
(1 2 3 4 5 6 7
 6 7 3 5 4 2 1).
Two permutations in Sₙ, say σ and τ, are multiplied as follows. We have
σ = (. . . a  . . .        τ = (. . . b  . . .
     . . . aσ . . .)  and       . . . bτ . . .).
What is στ?
The remaining entries are found by the same method and we get
1 → 4
2 → 5
3 → 1
4 → 3
5 → 6
6 → 2
7 → 7,
which can be written more compactly as
1 → 4 → 3 → 1;  2 → 5 → 6 → 2;  7 → 7.
In the same way, (15234)(6897) ∈ S₉ denotes the permutation
(1 5 2 3 4 6 8 9 7)   (1 2 3 4 5 6 7 8 9)
(5 2 3 4 1 8 9 7 6) = (5 3 4 1 2 8 6 9 7).
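Multiplication of permutations can be sketched in code (the dict representation and helper name are mine); with mappings acting on the right, a(στ) means "first σ, then τ":

```python
def multiply(sigma, tau):
    """Product sigma*tau of permutations given as dicts:
    apply sigma first, then tau."""
    return {a: tau[sigma[a]] for a in sigma}

sigma = {1: 3, 2: 1, 3: 2, 4: 5, 5: 4}
tau   = {1: 2, 2: 3, 3: 1, 4: 4, 5: 5}
# 1 -> 3 -> 1, 2 -> 1 -> 2, 3 -> 2 -> 3, 4 -> 5 -> 5, 5 -> 4 -> 4:
assert multiply(sigma, tau) == {1: 1, 2: 2, 3: 3, 4: 5, 5: 4}
```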
to-one). If bσ = a, we close the parenthesis and obtain the expression
(ab). If bσ = c ≠ a, we write c after b. Now we have (abc. Here c ≠ b,
because σ is one-to-one. We evaluate cσ. If cσ = a, we close the
parenthesis and obtain the expression (abc). If cσ = d ≠ a, we repeat our
procedure, each time writing down the image of a number after that
number. Since we have n numbers at our disposal, we meet, after at
most n steps, one of the numbers for a second time. If this happens when
we have the expression
(abc. . . g,
where a,b,c,. . . ,g are all distinct but gσ is one of them, we conclude that
gσ ≠ b,c,. . . ,g, since b = aσ, c = bσ, . . . and σ is one-to-one. Hence gσ = a.
We close the parenthesis and obtain the expression (abc. . . g).
If a,b,c,. . . ,g exhaust all the numbers 1,2, . . . ,n, we are done. Otherwise,
we select an arbitrary number from 1,2, . . . ,n that is distinct from a,b,c,. . .
,g. Let us call it h. We open a new parenthesis starting with h and repeat
our procedure. After finitely many steps, we get an expression of the
form
(abc. . . g)(h. . . k). . . (t. . . x),
where {a,b,c, . . . ,g,h, . . . ,k, . . . ,t, . . . x} = {1,2, . . . ,n}. We call each one of the
expressions (abc. . . g),(h. . . k), . . . ,(t. . . x) a "cycle".
In this notation, the order of the cycles is not important. We can write
the permutation above also as
(5)(379)(8)(1264) or as (379)(5)(1264)(8).
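The decomposition procedure described above can be sketched as a short function (dict representation and names are my own):

```python
def cycles(sigma):
    """Decompose a permutation (a dict on {1,...,n}) into its cycles."""
    seen, result = set(), []
    for a in sorted(sigma):
        if a in seen:
            continue
        cycle, b = [a], sigma[a]
        while b != a:              # follow a -> a.sigma -> ... back to a
            cycle.append(b)
            b = sigma[b]
        seen.update(cycle)
        result.append(tuple(cycle))
    return result

# the permutation written above as (1264)(379)(5)(8):
sigma = {1: 2, 2: 6, 6: 4, 4: 1, 3: 7, 7: 9, 9: 3, 5: 5, 8: 8}
assert cycles(sigma) == [(1, 2, 6, 4), (3, 7, 9), (5,), (8,)]
```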
Besides, one can start a cycle with any number in that cycle. Our
permutation can thus be written as
(5)(793)(8)(6412) or as (937)(5)(2641)(8).
The inverse of a permutation is found easily. Let σ ∈ Sₙ and a,b ∈
{1,2, . . . ,n}. In the cycles of σ, the number aσ follows a. By definition,
bσ⁻¹ is that number a for which aσ = b. Hence bσ⁻¹ is the number which
is followed by b; stated otherwise, bσ⁻¹ is the number that comes just
before b in the cycles of σ. So σ⁻¹ consists of the same cycles, but with
the entries written in the reverse order. For example, the inverse of
(5)(379)(8)(1264) is (5)(973)(8)(4621).
Multiplying in cycle notation, we find (1256)(347)·(157)(24)(3)(6) = (14)(273)(56).
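That reversing each cycle inverts the permutation can be confirmed directly (dict sketch, names mine):

```python
# sigma = (1264)(379)(5)(8) as a dict:
sigma = {1: 2, 2: 6, 6: 4, 4: 1, 3: 7, 7: 9, 9: 3, 5: 5, 8: 8}

# inverting the mapping directly ...
inverse = {b: a for a, b in sigma.items()}

# ... gives exactly the reversed cycles (4621)(973)(5)(8):
reversed_cycles = {4: 6, 6: 2, 2: 1, 1: 4, 9: 7, 7: 3, 3: 9, 5: 5, 8: 8}
assert inverse == reversed_cycles
```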
Another example:
(152)(3476)·(1724)(563) = (1654273).
The alert reader will have noticed that the same symbol in cycle
notation stands for many different permutations. Thus (123)(45) stands
for (123)(45) in S₅, for (123)(45)(6) in S₆, for (123)(45)(6)(7) in S₇, etc.
So an isolated symbol (123)(45) is ambiguous. Also, our rule of thumb for
finding inverses in the cycle notation works only when the cycles are
disjoint. It is time to discuss these points rigorously.
150
15.4 Definition: Let σ,τ ∈ Sₙ. If the set of numbers moved by σ and the
set of numbers moved by τ are disjoint, then σ and τ are called disjoint
permutations. We also say σ is disjoint from τ in this case.
In case III, we have
m(στ) = (mσ)τ = mτ = m,
m(τσ) = (mτ)σ = mσ = m,
so m(στ) = m(τσ).
In all three cases, we have m(στ) = m(τσ). Since this holds for all m in the
set {1,2, . . . ,n}, we conclude στ = τσ.
(1432)(567)(8)(9,10)
15.7 Lemma: Let σ be a permutation in Sₙ. We put, for a,b ∈ {1,2, . . . ,n},
a ∼ b if and only if there is an integer k such that aσ^k = b.
Then ∼ is an equivalence relation on {1,2, . . . ,n}.
Proof: (i) For all a ∈ {1,2, . . . ,n}, we have aσ⁰ = a, with 0 ∈ ℤ. So a ∼ a
for all a, and ∼ is reflexive.
(ii) If a ∼ b, then aσ^k = b for some k ∈ ℤ, so bσ^(−k) = a with −k ∈ ℤ and
therefore b ∼ a. So ∼ is symmetric.
(iii) If a ∼ b and b ∼ c, then aσ^k = b and bσ^m = c for some k,m ∈ ℤ; then
aσ^(k+m) = aσ^kσ^m = bσ^m = c, with k + m ∈ ℤ, and therefore a ∼ c. So ∼ is
transitive.
The reader will check easily that {1,4,3,2}, {5,6,7}, {8}, {9,10} are the
equivalence classes of ∼ in {1,2,3,4,5,6,7,8,9,10} if σ denotes the
permutation
(1 2 3 4 5 6 7 8  9 10)
(4 1 2 3 6 7 5 8 10  9)
we treated above. So the equivalence relation of Lemma 15.7 seems
promising.
bσ_A = { bσ if b ∈ A
       { b  if b ∉ A.
For x ∈ A we find, step by step,
xσ_A^k = (xσ_A)σ_A^(k−1) = (xσ)σ_A^(k−1) = . . . = xσ^k
(note that xσ ∈ A, since A is an equivalence class of ∼). Choosing m ∈ ℕ
with σ^m = ι, we get
xσ_A^m = { xσ^m if x ∈ A
         { x    if x ∉ A
       = x,
hence σ_Aσ_A^(m−1) = ι = σ_A^(m−1)σ_A. By Theorem 3.17(2), σ_A is one-to-one
and onto. Thus σ_A ∈ Sₙ.
Proof: (1) The equivalence classes A₁,A₂, . . . ,A_h are pairwise disjoint
sets. Now σ_{Aᵢ} either moves no number at all (this happens if and only
if Aᵢ has exactly one element), or moves only the numbers in Aᵢ.
Therefore, the numbers moved by σ_{Aᵢ} and σ_{Aⱼ} make up disjoint sets
whenever i ≠ j. So the permutations σ_{A₁}, σ_{A₂}, . . . , σ_{A_h} are pairwise
disjoint permutations (Definition 15.4) and they commute by Theorem 15.6.
(2) We have σ_{A₁}σ_{A₂} . . . σ_{A_h} = σ_{A₁′}σ_{A₂′} . . . σ_{A_h′} for any arrangement
1′,2′, . . . ,h′ of the numbers 1,2, . . . ,h (Lemma 8.12). We want to show
bσ = bσ_{A₁}σ_{A₂} . . . σ_{A_h} for all b ∈ {1,2, . . . ,n} = A₁ ∪ A₂ ∪ . . . ∪ A_h.
So let b be in {1,2, . . . ,n}. Renumbering A₁,A₂, . . . ,A_h if need be, we may
assume without loss of generality that b ∈ A_h. Then b ∉ A₁, b ∉ A₂, . . . ,
b ∉ A_{h−1}, and thus bσ_{A₁} = bσ_{A₂} = . . . = bσ_{A_{h−1}} = b by the definition
of these functions. Thus bσ_{A₁}σ_{A₂} . . . σ_{A_h} = bσ_{A_h} and the proof will be
complete when we show bσ = bσ_{A_h}. But this follows immediately from
the definition of σ_{A_h}, since b ∈ A_h.
(567)(1)(2)(3)(4)(8)(9)(10) = (567)
(8)(1)(2)(3)(4)(5)(6)(7)(9)(10) = (8) (= ι)
(9,10)(1)(2)(3)(4)(5)(6)(7)(8) = (9,10).
In view of this, we define cycles as the associated permutations σ_A.
Cycles will be distinguished from other permutations by the property
stated in Lemma 15.8.
a₁σ = a₂, a₂σ = a₃, . . . , a_{m−1}σ = a_m, a_mσ = a₁.
In this case, we write (a₁a₂. . . a_m) for σ. Then m is called the length of
the cycle (a₁a₂. . . a_m), and (a₁a₂. . . a_m) = σ is called an m-cycle. The
identity permutation is called a 1-cycle.

them, too. Hence bσ^m = b for all b ∈ {1,2, . . . ,n}. Thus m is the smallest
natural number such that σ^m = ι. Using Lemma 11.4, we obtain the
following theorem, which is also true when m = 1.
15.12 Remarks: (1) The inverse of a cycle σ = (a₁a₂. . . a_m) ∈ Sₙ, for
which
a₁σ = a₂, a₂σ = a₃, . . . , a_{m−1}σ = a_m, a_mσ = a₁,
and which fixes any other number in {1,2, . . . ,n} (if any), is by definition
the mapping whose effect on a₁,a₂, . . . ,a_m is given by
a_mσ⁻¹ = a_{m−1}, . . . , a₃σ⁻¹ = a₂, a₂σ⁻¹ = a₁, a₁σ⁻¹ = a_m,
and which fixes the other numbers (if any). Thus σ⁻¹ is the cycle
(a_m a_{m−1}. . . a₂a₁).
15.13 Lemma: Let G be a group and a,b ∈ G. Suppose ab = ba and
assume that o(a) and o(b) are finite. Suppose further that ⟨a⟩ ∩ ⟨b⟩ =
{1}. Then o(ab) is finite. In fact, o(ab) is the least common multiple of
o(a) and o(b): we have o(ab) = [o(a),o(b)].
Proof: First we show that (ab)^k = 1 if and only if o(a) | k and o(b) | k
(where k ∈ ℤ). Indeed, if o(a) | k and o(b) | k, then a^k = 1 and b^k = 1
(Lemma 11.6) and so (ab)^k = a^k b^k = 1·1 = 1 (Lemma 8.14(3); here we use
ab = ba). Conversely, if (ab)^k = 1, then a^k b^k = 1, so a^k = b^(−k) ∈ ⟨a⟩ ∩ ⟨b⟩ =
{1}. So we have a^k = 1 = b^(−k), thus a^k = 1 = b^k, and o(a) | k and o(b) | k.
Therefore (ab)^k = 1 if and only if o(a) | k and o(b) | k.
Hence o(ab), the smallest natural number k with (ab)^k = 1, is the smallest
natural number divisible by both o(a) and o(b), that is, o(ab) = [o(a),o(b)].
Lemma 15.13 will be used to find the order of a product of disjoint
permutations. We need the following result.
(2) If σ1, σ2, . . . , σm are disjoint from τ, then σ1σ2. . .σm and τ are disjoint.
(3) If σ and τ are disjoint, then σ⁻¹ and τ are disjoint.
(4) If σ and τ are disjoint, then σ^m and τ are disjoint for all m ∈ ℤ.
(5) If σ and τ are disjoint, then σ^m and τ^r are disjoint for all m,r ∈ ℤ.
(2) This follows from (1) by induction on m. The details are left to the
reader.
(5) When σ and τ are disjoint and m,r ∈ ℤ, then σ^m and τ are disjoint by
(4), and using (4) with r, τ, σ^m respectively in place of m, σ, τ, we deduce
that τ^r and σ^m are disjoint. Hence σ^m and τ^r are disjoint for all m,r ∈ ℤ.
σ^m and τ^r are disjoint by Lemma 15.14(5), and there cannot be any number in
{1,2, . . . ,n} which is moved both by σ^m and by τ^r. This is a contradiction.
Thus ⟨σ⟩ ∩ ⟨τ⟩ = {ι}. As remarked above, this completes the proof.
Proof: The disjoint cycles are pairwise disjoint and the order of a cycle
is its length (Theorem 15.11). The claim follows now immediately from
Theorem 15.16.
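This corollary lends itself to a quick computational check. The sketch below (our own Python helpers, not the book's notation; a permutation is stored as a dict on {1, . . . ,n}) finds the orbits of a permutation and confirms that its order is the least common multiple of the cycle lengths:

```python
from math import lcm

def cycles(perm):
    """Decompose a permutation (a dict on {1..n}) into its orbits (cycles)."""
    seen, result = set(), []
    for start in perm:
        if start not in seen:
            orbit, x = [], start
            while x not in seen:
                seen.add(x)
                orbit.append(x)
                x = perm[x]
            result.append(tuple(orbit))
    return result

def order(perm):
    """Smallest m >= 1 with perm^m = identity, found by repeated composition."""
    current, m = dict(perm), 1
    while any(current[x] != x for x in current):
        current = {x: perm[current[x]] for x in current}
        m += 1
    return m

# sigma = (123)(45) in S5: its order should be lcm(3, 2) = 6
sigma = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}
assert order(sigma) == lcm(*(len(c) for c in cycles(sigma)))
```

The brute-force `order` agrees with the lcm of the cycle lengths, as the corollary predicts.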
Exercises
159
3. Write the permutations in Ex. 1 in cycle notation. Carry out the
multiplication in cycle notation and compare the results.
4. Write the permutations in Ex. 2 in double row notation. Carry out the
multiplication in double row notation and compare the results.
11. Let H ≤ Sn. For a,b ∈ {1,2, . . . ,n}, put a ~H b if and only if there is a σ ∈ H such
that aσ = b. Show that ~H is an equivalence relation on {1,2, . . . ,n}. (Lemma
15.7 is a special case when H = ⟨σ⟩.)
15. Show that, for any σ ∈ Sn, there holds σ⁻¹(123)σ = (abc) with suitable
a,b,c. How are a,b,c related to σ? (Work out some specific examples.)
Generalize your conclusion to σ⁻¹τσ.
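Specific examples for Ex. 15 are easy to generate by machine. In the hypothetical Python sketch below (our own encoding: mappings are composed left to right, matching the book's convention of writing mappings on the right), we conjugate (123) by one particular σ ∈ S5 and read off the resulting 3-cycle:

```python
def compose(s, t):
    """Product st: apply s first, then t (mappings written on the right)."""
    return {x: t[s[x]] for x in s}

def inverse(s):
    return {v: k for k, v in s.items()}

sigma = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}          # some sigma in S5
c123  = {1: 2, 2: 3, 3: 1, 4: 4, 5: 5}          # the cycle (123)

conj = compose(compose(inverse(sigma), c123), sigma)   # sigma^-1 (123) sigma

# Here the conjugate turns out to be the 3-cycle (3 5 4):
expected = {3: 5, 5: 4, 4: 3, 1: 1, 2: 2}
assert conj == expected
```

Comparing (3 5 4) with the images 1σ = 3, 2σ = 5, 3σ = 4 suggests the general pattern asked for in the exercise.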
§16
Alternating Groups
(132546) = (13)(12)(15)(14)(16) =
(24)(12)(14)(23)(46)(14)(16)(45)(16).
In fact, we can attach a product of two transpositions (ab)(ab) = ι at will
and increase the number of transpositions by 2. Hence a product of n
transpositions can be written also as a product of n + 2, n + 4, n + 6, . . .
transpositions. We note that this does not change the parity of the num-
ber of transpositions. The parity of the number of transpositions is
unique. If a permutation can be written as a product of an odd (even)
number of transpositions, then, in any representation of this permuta-
tion as a product of transpositions, the number of transpositions is odd
(even). A permutation cannot be written as a product of an odd number
of transpositions and also as a product of an even number of transpositions.
We proceed to prove this assertion. We need the notion of inversions of a
permutation.
Let σ ∈ Sn. We write σ in double row notation, where, in the first row,
the numbers 1,2, . . . ,n are in their natural order:
σ = (1 2 . . . n ; 1σ 2σ . . . nσ).
Corresponding to the correct inequalities
1 < 2, 1 < 3, . . . , 1 < n
2 < 3, . . . , 2 < n
. . . . . . . . . . . . . . . . .
(n − 1) < n,
we compare the entries of the second row in pairs:
1σ and 2σ, 1σ and 3σ, . . . , 1σ and nσ
2σ and 3σ, . . . , 2σ and nσ
. . . . . . . . . . . . . . . . .
(n − 1)σ and nσ.
A pair iσ, kσ with i < k but iσ > kσ is called an inversion of σ.
For example, for the permutation in S6 whose second row is 2 5 6 3 1 4,
the comparisons read
2 < 5, 2 < 6, 2 < 3, 2 > 1, 2 < 4
5 < 6, 5 > 3, 5 > 1, 5 > 4
6 > 3, 6 > 1, 6 > 4
3 > 1, 3 < 4
1 < 4,
so this permutation has eight inversions.
The second rows of σ and σ(ik) are identical, aside from the locations of
i and k. Here σ gives rise to the comparisons
1. h : i, h : k, where h ∈ {1, . . . ,i−1} =: H,
i : j, where j ∈ {i+1, . . . ,k−1} =: J,
i : k,
2. i : m, where m ∈ {k+1, . . . ,n} =: M,
j : k, where j ∈ J,
3. k : m, where m ∈ M,
and to certain other comparisons that do not involve i or k. And σ(ik)
gives rise to the comparisons
1. h : k, h : i, where h ∈ H,
k : j, where j ∈ J,
k : i,
3. k : m, where m ∈ M,
j : i, where j ∈ J,
2. i : m, where m ∈ M,
and to certain other comparisons that do not involve i or k.
Only the comparisons
I. i : j, i : k, j : k (where j ∈ J) of σ
and
II. k : j, k : i, j : i (where j ∈ J) of σ(ik)
As the number of inversions of a permutation is uniquely determined, it
is clear that a permutation cannot be both odd and even. With this
terminology, Lemma 16.3 reads as follows.
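The inversion count is easy to compute by machine. A small Python sketch (the function names are ours), using the six-element example above and the fact that multiplying by a transposition flips the parity:

```python
from itertools import combinations

def inversions(second_row):
    """Number of pairs that stand in the wrong order in the second row."""
    return sum(1 for a, b in combinations(second_row, 2) if a > b)

# the permutation in S6 with second row 2 5 6 3 1 4 (the example in the text)
row = [2, 5, 6, 3, 1, 4]
assert inversions(row) == 8              # eight inversions: an even permutation

# Swapping two entries of the second row changes the parity:
row_swapped = [5, 2, 6, 3, 1, 4]
assert inversions(row_swapped) % 2 == 1  # now odd
```

This matches the count of eight inversions found by hand above.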
16.7 Theorem: Let n ≥ 2. The product of two permutations in Sn has
the "parity" given by the following law:
(even)(even) = even, (even)(odd) = odd, (odd)(even) = odd, (odd)(odd) = even.
The assertion of Theorem 16.7 resembles the rule for finding the sign of
a product of two real numbers: the product of a negative number by a
negative number is positive, etc. In order to exploit this analogy, we
introduce a new term.
ε(σ) =  1 if σ is an even permutation,
ε(σ) = −1 if σ is an odd permutation.
With this definition, the content of Theorem 16.7 can be expressed more
succinctly.
Proof: We must find a one-to-one correspondence between the set of
odd permutations and the set of even permutations in Sn. Now
T: {σ ∈ Sn: ε(σ) = −1} → {σ ∈ Sn: ε(σ) = 1}
σ ↦ σ(12)
is a one-to-one mapping (by Lemma 8.1(1)) from the set of odd permutations
in Sn into the set of even permutations in Sn (by Lemma 16.3),
which is in fact onto, since any even permutation π is the image, under
T, of the odd permutation π(12) (Lemma 16.3). So T is a one-to-one
correspondence between these sets, and they contain an equal number of
elements, say k elements. Since these sets are disjoint, and their union
is Sn, there are 2k elements in Sn, whose order is n! by Theorem 15.2.
Hence k = n!/2.
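For small n one can confirm k = n!/2 by brute force (a Python sketch, with parity computed from inversions as above):

```python
from itertools import permutations
from math import factorial

def is_even(p):
    """A permutation (tuple of second-row entries) is even iff its
    number of inversions is even."""
    n = len(p)
    inv = sum(1 for i in range(n) for k in range(i + 1, n) if p[i] > p[k])
    return inv % 2 == 0

for n in range(2, 7):
    evens = sum(1 for p in permutations(range(1, n + 1)) if is_even(p))
    assert evens == factorial(n) // 2   # exactly half the permutations are even
```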
Exercises
4. Find the sign of (143)(1245)(243) and of (1435)(25643) without
evaluating these products.
8. Verify Lemma 16.3 by going through the argument in its proof in the
specific cases below.
σ = (1 2 3 4 5 6 7 ; 3 1 5 7 2 4 6), (ik) = (12), (14), (23), (26), (27), (67).
§17
Groups of Matrices
After having learned about fields in Chapter 3, the reader may check
that the theory in this paragraph carries over to the more general situa-
tion where the term "field" is used in the sense of Definition 29.13.
( a b )
( c d )
of four elements a,b,c,d of K, arranged in two rows and two columns, and
enclosed within parentheses. (The plural of "matrix" is "matrices".)
Thus (1 2 ; 4 0) is a matrix over ℤ (and also over ℚ and ℝ), and
(2 3 ; 5 4) is a matrix over ℤ7.
The set of all matrices over a field K will be denoted by Mat2(K). The
subscript 2 signifies that there are 2 rows and 2 columns in a matrix (in
the sense of Definition 17.2).
A = (a b ; c d) is equal to B = (a′ b′ ; c′ d′) if and only if a = a′, b = b′,
c = c′, d = d′.
In this definition of matrix equality, the locations of the entries are taken
into account. Thus (5 1 ; 0 2) and (2 5 ; 1 0) are different matrices, although
they are made up of the same four elements. The sum of A = (a b ; c d) and
B = (e f ; g h) is the matrix
(a+e b+f ; c+g d+h).
The sum of A and B will be denoted by A + B. Taking sums in Mat2(K)
will be called addition (of matrices).
Addition of matrices is essentially the addition in the underlying field,
carried out four times. Not surprisingly, many properties of addition in
the field are reflected in matrix addition. For example, just like a field is
a group under addition, matrices over a field form a group under
addition, too.
17.4 Theorem: Let K be a field. Then Mat2(K) is a commutative group
under addition.
(A + B) + C = A + (B + C), both sides being computed entrywise in K.
The matrix (0 0 ; 0 0), all of whose entries are the zero element of K, is called
the zero matrix (over K) and will be designated by the symbol 0. This
should not be confused with the zero element of the underlying field K.
(a b ; c d) + (−a −b ; −c −d) = (a+(−a) b+(−b) ; c+(−c) d+(−d)) = (0 0 ; 0 0) = 0.
Thus Mat2(K) is a group under addition. We finally check commutativity.
The additive group Mat2(K) is somewhat dull. It is just four copies of the
additive group K. More interesting matrix groups arise when the operation
is multiplication. We introduce this operation now.
The product of A = (a b ; c d) and B = (e f ; g h) in Mat2(K) is the matrix
(ae+bg af+bh ; ce+dg cf+dh).
The product of A and B will be denoted by A·B or simply by AB. Taking
products in Mat2(K) will be called multiplication (of matrices).
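Definition 17.5 can be experimented with directly. Below is a minimal Python sketch (our own class, taking the field to be ℚ via `Fraction`); it checks on sample matrices the identities established in this paragraph, and also that multiplication is not commutative:

```python
from fractions import Fraction

class Mat2:
    """A 2x2 matrix (a b ; c d) over the rationals."""
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = map(Fraction, (a, b, c, d))

    def __add__(self, o):
        return Mat2(self.a + o.a, self.b + o.b, self.c + o.c, self.d + o.d)

    def __mul__(self, o):
        # the row-by-column formula of Definition 17.5
        return Mat2(self.a * o.a + self.b * o.c, self.a * o.b + self.b * o.d,
                    self.c * o.a + self.d * o.c, self.c * o.b + self.d * o.d)

    def __eq__(self, o):
        return (self.a, self.b, self.c, self.d) == (o.a, o.b, o.c, o.d)

A, B, C = Mat2(1, 2, 3, 4), Mat2(0, 1, 1, 0), Mat2(2, 0, 0, 2)
I = Mat2(1, 0, 0, 1)
assert A * I == A and I * A == A       # I is an identity
assert (A * B) * C == A * (B * C)      # associativity
assert A * (B + C) == A * B + A * C    # distributivity
assert A * B != B * A                  # multiplication is not commutative
```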
This definition looks bizarre. One would expect the product of A and B,
with the notation of Definition 17.5, to be (ae bf ; cg dh). Some motivation for
Definition 17.5 can be gained as follows. With each matrix (a b ; c d) (over
K), we associate the linear substitution
x = ax′ + by′
y = cx′ + dy′.
If we carry out successively the substitutions
x = ax′ + by′        x′ = ex′′ + fy′′
y = cx′ + dy′        y′ = gx′′ + hy′′,
we obtain
x = (ae+bg)x′′ + (af+bh)y′′
y = (ce+dg)x′′ + (cf+dh)y′′,
so the product of the matrices is the one which is associated with the
successive application of the transformations.
(AB)C = [(a b ; c d)(e f ; g h)](k m ; n p) = (ae+bg af+bh ; ce+dg cf+dh)(k m ; n p)
(3) We compute
AI = (a b ; c d)(1 0 ; 0 1) = (a·1+b·0 a·0+b·1 ; c·1+d·0 c·0+d·1) = (a b ; c d) = A,
IA = (1 0 ; 0 1)(a b ; c d) = (1a+0c 1b+0d ; 0a+1c 0b+1d) = (a b ; c d) = A,
as claimed.
(4) We have
A(B + C) = (a b ; c d)[(e f ; g h) + (k m ; n p)]
= (a b ; c d)(e+k f+m ; g+n h+p)
= (a(e+k)+b(g+n) a(f+m)+b(h+p) ; c(e+k)+d(g+n) c(f+m)+d(h+p))
= (ae+ak+bg+bn af+am+bh+bp ; ce+ck+dg+dn cf+cm+dh+dp)
= (ae+bg+ak+bn af+bh+am+bp ; ce+dg+ck+dn cf+dh+cm+dp)
= (ae+bg af+bh ; ce+dg cf+dh) + (ak+bn am+bp ; ck+dn cm+dp)
= (a b ; c d)(e f ; g h) + (a b ; c d)(k m ; n p)
= AB + AC.
The proof of (B + C)A = BA + CA follows similar lines and is left to the
reader.
Theorem 17.6 seems promising. Three of the group axioms are satisfied,
with I as the identity. It remains to investigate whether every matrix
over a field has a right inverse.
Suppose K is a field and A = (a b ; c d) ∈ Mat2(K). Then A has a right inverse
X = (x y ; z u) in Mat2(K) if and only if AX = I, which is equivalent to
(1) ax + bz = 1, (2) ay + bu = 0,
(3) cx + dz = 0, (4) cy + du = 1.
We multiply the equation (1) by d, the equation (3) by −b, and add them
side by side. Using associativity of addition in K, distributivity of
multiplication over addition, and commutativity of multiplication in K, we get
(ad − bc)x = d.
17.7 Definition: Let K be a field and A = (a b ; c d) ∈ Mat2(K). Then the
element ad − bc of K is called the determinant of A and is written det A.
We have shown: if K is a field and A = (a b ; c d) ∈ Mat2(K), and if X = (x y ; z u)
in Mat2(K) is a right inverse of A, then
(det A)x = d, (det A)y = −b, (det A)z = −c, (det A)u = a. (D)
These equations impose certain conditions on a matrix having a right
inverse. We cannot expect that every matrix has a right inverse. Those
having a right inverse are characterized very simply as the matrices
with a nonzero determinant.
17.8 Theorem: Let K be a field and A = (a b ; c d) ∈ Mat2(K). Then A has a
right inverse if and only if det A ≠ 0. If this is the case, then there is a
unique right inverse of A, namely the matrix
( (det A)⁻¹d  −(det A)⁻¹b ; −(det A)⁻¹c  (det A)⁻¹a )
where (det A)⁻¹ is the inverse of det A ∈ K\{0} in the multiplicative
group K\{0}.
Proof: First we assume det A = 0 and show that A has no right inverse.
Indeed, if det A = 0 and A had a right inverse, then the equations (D)
would become
d = 0,  −b = 0,
−c = 0,  a = 0,
and A = (a b ; c d) would be the zero matrix (0 0 ; 0 0). The existence of a right
inverse X = (x y ; z u) would then yield I = AX = 0X = 0, which is absurd.
Now let us assume det A ≠ 0 and show that A has a unique right inverse.
Since det A ∈ K\{0} and K\{0} is a group under multiplication, det A has
an inverse in K\{0}, which we denote by (det A)⁻¹. This is the nonzero
element of the field K such that (det A)⁻¹(det A) = (det A)(det A)⁻¹ = 1 =
the identity element of K\{0}. So we can solve for x,y,z,u in (D) by
multiplying the equations in (D) by (det A)⁻¹. We get
x = (det A)⁻¹d,  y = −(det A)⁻¹b,  z = −(det A)⁻¹c,  u = (det A)⁻¹a
(so X is completely determined by A, and A has at most one
right inverse). It is easy to check that this matrix is indeed a right
inverse of A:
(a b ; c d)( (det A)⁻¹d  −(det A)⁻¹b ; −(det A)⁻¹c  (det A)⁻¹a )
= ( (det A)⁻¹(ad−bc)  (det A)⁻¹(−ab+ba) ; (det A)⁻¹(cd−dc)  (det A)⁻¹(−cb+da) )
= (1 0 ; 0 1) = I.
Hence A does have a unique right inverse and it is the matrix given in
this theorem.
We will prove presently that the matrices with right inverses form a
group under multiplication. From Lemma 7.3, it will then follow that the
unique right inverse of a matrix with a nonzero determinant is also the
unique left inverse of the same matrix. We shall refer to it as its inverse.
The rule for finding the inverse of A = (a b ; c d) is simple: interchange a and
d, then put a minus sign in front of b and c, and multiply each entry by
(det A)⁻¹ [i.e., divide each entry by det A]. For example, the inverse of
(5 2 ; 1 2) ∈ Mat2(ℚ), whose determinant is 8, is
(2/8 −2/8 ; −1/8 5/8) = (1/4 −1/4 ; −1/8 5/8).
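The inversion rule is easy to verify mechanically. A Python sketch over ℚ (the helper names are ours), using the worked example above:

```python
from fractions import Fraction

def det(m):
    (a, b), (c, d) = m
    return a * d - b * c

def inverse2(m):
    """Swap a and d, negate b and c, divide every entry by det (Theorem 17.8)."""
    (a, b), (c, d) = m
    r = Fraction(1, 1) / det(m)
    return ((r * d, -r * b), (-r * c, r * a))

def mul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

A = ((Fraction(5), Fraction(2)), (Fraction(1), Fraction(2)))   # det A = 8
I = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))
assert mul(A, inverse2(A)) == I and mul(inverse2(A), A) == I
assert inverse2(A) == ((Fraction(1, 4), Fraction(-1, 4)),
                       (Fraction(-1, 8), Fraction(5, 8)))
```

The computed inverse matches the matrix (1/4 −1/4 ; −1/8 5/8) found by hand.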
= (ad − bc)(eh − fg)
= (det A)(det B).
The formula det AB = (det A)(det B) is known as the multiplication rule
of determinants. Loosely speaking, the determinant of a product is the
product of the determinants. By induction on n, it is extended to n
factors: det (A1A2. . .An) = (det A1)(det A2). . .(det An).
Proof: We check the group axioms. Let us call our set G for brevity.
Therefore, G is a group.
17.11 Definition: Let K be a field. The group of Theorem 17.10 is
called the general linear group (of degree 2) over K, and is written as
GL(2,K).
17.14 Theorem: The set H = {(a b ; c d) ∈ Mat2(ℚ): a,b,c,d ∈ ℤ and ad − bc = 1}
is a group under multiplication.
Proof: (i) Suppose A = (a b ; c d) and B = (e f ; g h) are elements of H. Then
AB = (ae+bg af+bh ; ce+dg cf+dh). Here the entries of AB, namely ae+bg, ce+dg, af+bh,
cf+dh, are integers, and det AB = (det A)(det B) = 1·1 = 1. Hence AB ∈ H and
H is closed under multiplication.
(ii) Let A = (c d) H. Then det A = 1 and so A 1 = c ( a )
1
by Theorem 17.8. The entries d, b, c, a of A are integers, because a, b,
c, d are integers. Also, we have det A = 1, so det (A 1) = (det A) 1 = 1 1 = 1
by Theorem 17.9(3) (or det (A 1) = da ( b)( c) = ad bc = 1). So A 1 H
and H is closed under the forming of inverses.
Exercises
4. Find all elements of SL(2, ℤ3). What is the order of SL(2, ℤ3)?
6. Let K be a field and let A = (a b ; c d) ∈ Mat2(K). When a = 0 = b, we have
det A = 0. In case (a,b) ≠ (0,0), prove that det A = 0 if and only if there
is an element k in K such that c = ka, d = kb. Use this result and show
that |GL(2, ℤp)| = (p² − 1)(p² − p).
10. Let K be a field. For any A = (a b ; c d) ∈ Mat2(K), we define the trace of
A to be the element a + d of K (the sum of the entries on the upper-left
to lower-right diagonal). Show that the trace of AB is equal to the trace of
BA for all A,B ∈ Mat2(K).
11. Let K be a field. For any A = (a b ; c d) ∈ Mat2(K), we define the . . .
Put GL(2, ℤm) = {A ∈ Mat2(ℤm): det A is relatively prime to m}. Show
that GL(2, ℤm) is a group under multiplication.
13. Develop a theory of matrices over ℤ by modifying the theory of
matrices over a field. How do you define GL(2, ℤ)?
14. Let H = {(a b ; −b̄ ā): a,b ∈ ℂ} ⊆ Mat2(ℂ), where x̄ is the complex
conjugate of x ∈ ℂ. Prove that H is closed under addition and multiplication.
Show that H\{0} is a group under multiplication.
15. If K is a field and A = (a b ; c d) ∈ Mat2(K), we write −A = (−a −b ; −c −d). Let
1 = (1 0 ; 0 1), i = (i 0 ; 0 −i), j = (0 1 ; −1 0), k = (0 i ; i 0) ∈ Mat2(ℂ).
Thus 1 is the identity matrix over ℂ. Show that ij = k, jk = i, ki = j. Prove
that {1, −1, i, −i, j, −j, k, −k} is a group under multiplication, called a
quaternion group of order 8 and denoted by Q8. Show that Q8 has
exactly one element of order 2. Find all subgroups of Q8.
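The matrix relations of Ex. 15 can be checked with ordinary complex arithmetic (a Python sketch, matrices written as nested tuples):

```python
def mul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def neg(m):
    return tuple(tuple(-x for x in row) for row in m)

one = ((1, 0), (0, 1))
i   = ((1j, 0), (0, -1j))
j   = ((0, 1), (-1, 0))
k   = ((0, 1j), (1j, 0))

assert mul(i, j) == k and mul(j, k) == i and mul(k, i) == j
assert mul(i, i) == mul(j, j) == mul(k, k) == neg(one)   # i^2 = j^2 = k^2 = -1
assert mul(neg(one), neg(one)) == one                    # (-1)^2 = 1
```

Since i² = j² = k² = −1 while (−1)² = 1, the element −1 is the candidate for the unique element of order 2.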
§18
Factor Groups
Another question is about the operation. The central issue in §6, where
we introduced the operations on ℤn, was whether these operations were
well defined. Once we knew that addition in ℤn is a well defined operation,
it was straightforward to prove that ℤn is a group. Not surprisingly,
we have the same problem here. The main point of the following
discussion is to show that we have a well defined operation (Theorem
18.4). Once we know it, it is easy to show that our set of cosets is a
group (Theorem 18.7).
It turns out that these questions are intimately connected and they will
be resolved simultaneously.
Ha.Hb = Hab.
For this product to be well defined, Ha = Ha1 and Hb = Hb1 must force
Hab = Ha1b1, which simplifies to the requirement that b⁻¹hb ∈ H for all
h ∈ H, b ∈ G, or, equivalently, that b⁻¹Hb ⊆ H for all b ∈ G.
18.2 Lemma: Let H ≤ G. For a ∈ G, let a⁻¹Ha be the set
{a⁻¹ha ∈ G: h ∈ H} = {b ∈ G: aba⁻¹ ∈ H}.
The following are equivalent.
Proof: (1) ⟺ (2) This follows from the definition of the set a⁻¹Ha.
(2) ⟹ (3) Suppose a⁻¹Ha ⊆ H for all a ∈ G. Then, for any a ∈ G, it is true
that (a⁻¹)⁻¹Ha⁻¹ ⊆ H. Hence, for any h ∈ H, a ∈ G, we have aha⁻¹ ∈ H, so h =
a⁻¹(aha⁻¹)a ∈ a⁻¹Ha. Since this holds for all h ∈ H, we obtain H ⊆ a⁻¹Ha, for
all a ∈ G. Together with the hypothesis a⁻¹Ha ⊆ H for all a ∈ G, this yields
a⁻¹Ha = H for all a ∈ G.
We employ the symbol H ⊴ G to denote that H is a normal subgroup of
G. Also, H ⋬ G means that H is not a normal subgroup of G. If H is a
proper and normal subgroup of G, we write H ◁ G. Finally, H ⋪ G means
that H is not a proper normal subgroup of G.
Proof: This follows from (o), Lemma 18.2(5) and Definition 18.3.
18.5 Examples: (a) For any group G, it is clear that G ⊴ G. Also, {1} ⊴
G, since a⁻¹1a ∈ {1} for all a ∈ G. We make a convention here. The trivial
subgroup {1} will henceforward be written simply as 1. It will be clear
from the context whether 1 stands for the identity element or for the
trivial subgroup. Thus 1 ⊴ G and G ⊴ G.
Here h1 ≠ h in general and therefore hg = gh1 ≠ gh. The second inclusion
has a similar meaning.
H = {ι,(12)}            H = {ι,(12)}
H(13) = {(13),(123)}    (13)H = {(13),(132)}
H(23) = {(23),(132)}    (23)H = {(23),(123)}
and the right coset {(13),(123)} is not a left coset. So H ⋬ S3. In the same
way, {ι,(13)} and {ι,(23)} are not normal subgroups of S3.
(f) Let H = {ι,(12),(34),(12)(34)}. It is easy to see that H ≤ S4. Is H
normal in S4? We compare the right and left cosets of H in S4. Aside
from H, we see that the right coset
H(13)(24) = {(13)(24),(1423),(1324),(14)(23)}
is a left coset:
(13)(24)H = {(13)(24),(1324),(1423),(14)(23)} = H(13)(24).
However, the right coset
H(13) = {(13),(123),(341),(1234)}
is not a left coset, for the only left coset containing (13) is
(13)H = {(13),(132),(143),(1432)}, which is distinct from H(13). Hence H ⋬ S4.
For V4 = {ι,(12)(34),(13)(24),(14)(23)}, on the other hand, one compares the
cosets in the same way, and since each right coset is a left coset, V4 ⊴ S4.
For a more conceptual proof of this result, see Ex. 5 at the end of this
paragraph.
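These coset comparisons can be automated. A Python sketch (our own encoding: a permutation of {1,2,3,4} is the tuple of its images, products composed left to right) checks whether every right coset of a subgroup is also a left coset:

```python
from itertools import permutations

def compose(s, t):
    """Product st: apply s first, then t (images written on the right)."""
    return tuple(t[s[x - 1] - 1] for x in range(1, len(s) + 1))

S4 = list(permutations(range(1, 5)))
identity = (1, 2, 3, 4)

def is_normal(H, G):
    """H is normal in G iff Hg = gH for every g in G."""
    return all(frozenset(compose(h, g) for h in H) ==
               frozenset(compose(g, h) for h in H) for g in G)

V4 = [identity, (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
H  = [identity, (2, 1, 3, 4), (1, 2, 4, 3), (2, 1, 4, 3)]  # {i,(12),(34),(12)(34)}

assert is_normal(V4, S4)       # every right coset of V4 is a left coset
assert not is_normal(H, S4)    # but not so for H
```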
as well as the subgroup H when we speak about normality. It is possible
that H ⊴ G2 and H ⋬ G1 for two groups G1,G2 containing H. Here is an
example. Take
G1 = D8 = ⟨σ,τ : σ⁴ = 1, τ² = 1, τ⁻¹στ = σ⁻¹⟩,
G2 = ⟨σ²,τ⟩ = {1, σ², τ, σ²τ} ≤ G1,
H = ⟨τ⟩ = {1, τ}.
det (G⁻¹SG) = det (G⁻¹.SG) = det G⁻¹.det (SG) = (det G)⁻¹.(det S)(det G)
= (det G)⁻¹·1·(det G) = 1,
so
G⁻¹SG ∈ SL(2,K) for all S ∈ SL(2,K), G ∈ GL(2,K).
If h ∈ Hi for all i ∈ I, then
g⁻¹hg ∈ Hi for all i ∈ I,
so g⁻¹hg ∈ H.
18.6 Definition: When H ⊴ G, the set of all right cosets of H in G, which
is also the set of all left cosets of H in G by Lemma 18.2(4), will be
denoted by G/H, read G by H, or G modulo H, or G mod H.
Most authors do not insist on the condition H ⊴ G when they write G/H.
They write G/H for the set of right cosets of H in G (or for the set of
left cosets, especially when they write functions on the left) and employ
some other symbol for the set of left cosets (or for the set of
right cosets). Throughout this book, whenever we write G/H, it will be
tacitly supposed that H ⊴ G. The notation G/H is meaningless if H ⋬ G
and will not be used in this case.
"quotient group" is also used. The group operation is called multiplica-
tion (of cosets).
Please notice that G/H is not a subgroup of G. The elements of G/H are
subsets of G, not elements of G.
Proof: (1) The elements of G/H are the cosets of H in G and there are
|G:H| cosets of H in G by Definition 10.7. So the order of G/H is the index
of H in G. The second assertion follows from Lagrange's theorem.
(2) If G is abelian, then ab = ba for all a,b ∈ G and Ha.Hb = Hab = Hba =
Hb.Ha for all Ha, Hb ∈ G/H. Thus G/H is abelian, too.
The converses of the claims in Lemma 18.9 are false. The factor group
G/H can be finite (abelian, cyclic) without G being finite (abelian, cyclic).
and multiplication in G/H = G/1 is given by
{a}{b} = {ab}.
The factor group G/1 is governed by the same operation as G. Thus G/1
is almost the same group as G. The only difference is that the elements
of G are enclosed within braces in G/1.
H = {1, g³, g⁶, g⁹}, Hg = {g, g⁴, g⁷, g¹⁰}, Hg² = {g², g⁵, g⁸, g¹¹}.
H Hg Hg2
H H Hg Hg2
Hg Hg Hg2 H
Hg2 Hg2 H Hg
Thus S4/V4 is almost the same group as S3. They are not the same groups,
of course, for the underlying sets are different. Nevertheless, it is clear
from the tables above that the operations on S4/V4 and on S3 are closely
related. This will be made more precise in §20.
Exercises
{g GL(2, ): det g 5} in GL(2, )
{g GL(2, ): det g 0} in GL(2, )
{g GL(2, ): det g 0} in GL(2, )
{g GL(2, ): det g = 1} in GL(2, )
{g GL(2, ): (det g)18 = 1} in GL(2, )
{g ∈ GL(2, ℤ11): det g = 1 or 3 or 4 or 5 or 9} in GL(2, ℤ11).
§19
Product Sets in Groups
When X has only one element, say when X = {x}, we write xY instead of
{x}Y. Likewise, we write Xy instead of X{y}. This is consistent with the
definition of cosets (Definition 10.1).
(XY)Z = {uz ∈ G: u ∈ XY, z ∈ Z}
= {(xy)z ∈ G: x ∈ X, y ∈ Y, z ∈ Z}
= {x(yz) ∈ G: x ∈ X, y ∈ Y, z ∈ Z}
= {xv ∈ G: x ∈ X, v ∈ YZ}
= X(YZ).
Using Lemma 8.3, we may and do drop the parentheses in any product
set involving more than two subsets. For example, we write XYZUV for
(XY)(Z(UV)).
XV4 = {ι,(12)(34),(13)(24),(14)(23),(13),(1432),(24),(1234)}
is easily seen to be closed under multiplication, hence XV4 is a subgroup
of S4 (Lemma 9.3(2)), but not a normal subgroup of S4, for (13) ∈ XV4
but (12)⁻¹(13)(12) = (23) ∉ XV4 (Lemma 18.2(1)). We see that the
product of two subgroups is not necessarily a normal subgroup, even if
one of the factors is a normal subgroup.
In Example 19.3(f) above, it is easy to see XV4 = V4X. This is the basic
reason why XV4 turns out to be a subgroup of S4. The next lemma
describes the situation.
The first inclusion means, for any h ∈ H and k ∈ K, the element hk of G
belongs to KH, so that there are k1 ∈ K and h1 ∈ H such that hk = k1h1.
Similarly, the second inclusion means, for any k ∈ K and h ∈ H, there are
h2 ∈ H and k2 ∈ K such that kh = h2k2.
(a) Suppose first HK = KH. We prove that HK is closed under multiplica-
tion and the forming of inverses (Lemma 9.2).
201
g⁻¹xg = g⁻¹hkg = g⁻¹hg.g⁻¹kg ∈ HK, since g⁻¹hg ∈ H and g⁻¹kg ∈ K as H ⊴ G
and K ⊴ G. Hence HK ⊴ G.
We turn our attention to the product of two right cosets. The product of
two right cosets, as in Definition 19.1, is a subset of the group under
discussion. When is it a right coset? The next lemma gives the answer.
aH = 1aH ⊆ HaH = Ha,
and so aH ⊆ Ha for all a ∈ G. From Lemma 18.2(5), we obtain H ⊴ G.
19.6 Lemma: Let H,K ≤ G and assume that H and K are finite. Then HK
is a finite subset of G, whose cardinality is given by
|HK| = |H||K| / |H ∩ K|.
Proof: We list all products hk, where h and k run through H and K,
respectively. In this way, we get |H||K| elements of G. These are the
elements of HK. Naïvely, we expect |HK| to be equal to |H||K|, but there
may be repetitions in our list: the same element of HK may be written
more than once. We have to keep account of repetitions. We show that
each of the |H||K| products hk appears exactly n times in our list, where
n := |H ∩ K|. Thus there are |H||K|/n distinct elements in the list and |HK| =
|H||K|/n. In other words, the mapping
φ: H × K → HK
(h,k) ↦ hk
is n-to-one.
therefore if and only if h1⁻¹h2 = k1k2⁻¹ = s belongs to H ∩ K. Thus (h1,k1)
and (h2,k2) have the same image under φ if and only if h2 = h1s and
k2 = s⁻¹k1 for some s ∈ H ∩ K. Denoting by 1 = s1,s2, . . . ,sn the n = |H ∩ K|
elements of H ∩ K, we conclude that the n ordered pairs
(h1,k1), (h1s2,s2⁻¹k1), (h1s3,s3⁻¹k1), . . . , (h1sn,sn⁻¹k1),
and only these ordered pairs, have the image h1k1 under φ. This proves
that φ is indeed n-to-one, and consequently
|HK| = |H||K| / |H ∩ K|.
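Lemma 19.6 can be tested numerically. A Python sketch in S4 (helper names are ours), taking H = ⟨(12)⟩ and K = ⟨(1234)⟩, for which H ∩ K = 1:

```python
def compose(s, t):
    """Product st on tuples of images of 1..4: apply s first, then t."""
    return tuple(t[s[x - 1] - 1] for x in range(1, len(s) + 1))

def generated(gens):
    """Subgroup generated by gens, obtained by closing under multiplication."""
    elems = {(1, 2, 3, 4)}
    frontier = set(gens)
    while frontier:
        elems |= frontier
        frontier = {compose(a, g) for a in elems for g in gens} - elems
    return elems

H = generated([(2, 1, 3, 4)])          # <(12)>, order 2
K = generated([(2, 3, 4, 1)])          # <(1234)>, order 4
HK = {compose(h, k) for h in H for k in K}
assert len(HK) == len(H) * len(K) // len(H & K)   # |HK| = 2*4/1 = 8
```

Note that this HK has 8 elements although it need not be a subgroup of S4.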
Exercises
X ⊆ Y ⟺ gX ⊆ gY ⟺ g⁻¹Xg ⊆ g⁻¹Yg;
X = Y ⟺ gX = gY ⟺ Xg = Yg ⟺ g⁻¹Xg = g⁻¹Yg;
X = gY ⟺ g⁻¹X = Y.
|HgK| = |H||K| / |g⁻¹Hg ∩ K|.
§20
Group Homomorphisms
from the group ℝ⁺ of positive real numbers (under multiplication) into
the group ℝ of all real numbers (under addition). The homomorphism
property of the logarithm function is the well known identity
log ab = log a + log b
that holds for all a,b ∈ ℝ⁺.
ι: H → G
h ↦ h
be the inclusion mapping (Example 3.2(a)). Then
(ab)ι = ab = (aι)(bι)
for all a,b ∈ H. Hence ι is a homomorphism. Both mappings above are
one-to-one homomorphisms.
Proof: (1) Here we use the same symbol "1" with two different
meanings. In "1φ", 1 is the identity element of the group G. On the right
hand side, 1 is the identity element of the group G1. A more accurate
way of writing the claim is
(1_G)φ = 1_G1,
where 1_G and 1_G1 are the identity elements of G and G1,
respectively.
(2) For any a ∈ G, we have (aφ)(a⁻¹φ) = (aa⁻¹)φ = 1φ = 1 = (aφ)(aφ)⁻¹, hence
a⁻¹φ = (aφ)⁻¹.
φψ: G → G2
Proof: We are to show that (ab)(φψ) = (a(φψ))·(b(φψ)) for all a,b ∈ G. This
follows immediately:
(ab)(φψ) = ((ab)φ)ψ
= ((aφ)(bφ))ψ (φ is a homomorphism)
= ((aφ)ψ)((bφ)ψ) (ψ is a homomorphism)
= (a(φψ))·(b(φψ)) (definition of φψ)
{a ∈ G: aφ = 1}
of all elements of the domain G that are mapped to the identity of the
range group G1 is called the kernel of φ and is written as Ker φ.
(under φ) of an element of G, namely of a⁻¹ ∈ G. So x⁻¹ ∈ Im φ and Im φ is closed
under taking inverses.
Thus Im φ ≤ G1.
The elements of a group which have the same image under a homo-
morphism make up a coset of the kernel of that homomorphism.
by Lemma 10.2(5).
Since Ker φ ⊴ G by Theorem 20.6, we also have a(Ker φ) = {b ∈ G: bφ =
aφ}. Alternatively, one may prove a lemma analogous to Lemma 20.7,
stating that a and b have the same image under φ if and only if the left
cosets a(Ker φ) and b(Ker φ) are equal, and combine it with Lemma 20.7 to get
a(Ker φ) = (Ker φ)a, thereby proving Ker φ ⊴ G anew.
20.10 Examples: (a) The logarithm function is well known to be a one-
to-one function onto the set of real numbers. Thus
log: ℝ⁺ → ℝ
is an isomorphism.
φ: S3 → S4/V4
σ ↦ V4σ̄
(where, on the right hand side, σ̄ is the permutation in S4 that fixes 4
and maps 1,2,3 as σ ∈ S3 does) is a homomorphism. This is evident
from the tables in Example 18.10(d). Also, φ is clearly one-to-one and
onto. So φ is an isomorphism and S3 ≅ S4/V4.
(2) For any x,y ∈ G1, we must show (xy)φ⁻¹ = (xφ⁻¹)(yφ⁻¹). Since φ is onto,
there are a,b ∈ G such that aφ = x and bφ = y. Now a and b are unique
with this property, for φ is one-to-one, and a = xφ⁻¹, b = yφ⁻¹. This is the
definition of the inverse mapping. Since φ is a homomorphism, we have
(ab)φ = (aφ)(bφ) = xy.
Hence, by definition of φ⁻¹, we get ab = (xy)φ⁻¹. Thus
(xy)φ⁻¹ = ab = (xφ⁻¹)(yφ⁻¹)
and this holds for all x,y ∈ G1. So φ⁻¹: G1 → G is a homomorphism. As it is
one-to-one and onto by Theorem 3.17(1), φ⁻¹ is an isomorphism.
range group that we can construct out of G and its normal subgroup is
the factor group with respect to that normal subgroup.
ν: G → G/N
a ↦ Na
and G/N. This is done by showing that φ is essentially a natural homomorphism.
(Diagrams (a) and (b): φ: G → G1 factored through the natural homomorphism ν: G → G/N.)
φ̄: G/N → G1
Na ↦ aφ
From the definition of N = Ker φ and of φ̄, we see that this implication is
equivalent to
Ker φ̄ = {Na ∈ G/N: aφ = 1}
= {Na ∈ G/N: a ∈ Ker φ}
= {Na ∈ G/N: a ∈ N}
= {N}
The homomorphism φ: G → G1 factors as
G --ν--> G/Ker φ --φ′--> Im φ --ι--> G1,
where (a) ν is onto G/Ker φ; (b) φ′ is one-to-one and onto Im φ; and (c) ι
is one-to-one. So νφ′ is onto and φ′ι is one-to-one (Theorem 3.11). Hence,
if φ fails to be onto, it is only due to the fact that ι is not onto. Also, if φ
fails to be one-to-one, it is only due to the fact that ν is not one-to-one.
We see that any homomorphism φ is essentially an isomorphism φ′,
"diluted" by a natural homomorphism which (eventually) accounts for its
failure to be one-to-one and by an inclusion mapping which (eventually)
accounts for its failure to be onto. In fact, φ is one-to-one if and only if
the associated natural homomorphism ν: G → G/Ker φ is one-to-one, and
φ is onto if and only if the associated inclusion mapping ι: Im φ → G1 is
onto.
20.17 Examples: (a) Let ⟨a⟩ = {a^n: n ∈ ℤ} be a cyclic group. The
mapping
φ: ℤ → ⟨a⟩
n ↦ a^n
is an onto homomorphism:
(m + n)φ = a^(n+m) = a^m a^n = (mφ)(nφ).
Hence ℤ/Ker φ ≅ Im φ, that is,
ℤ/Ker φ ≅ ⟨a⟩.
If o(a) = k is finite, then
Ker φ = {n ∈ ℤ: nφ = 1 = a⁰}
= {n ∈ ℤ: a^n = 1}
= {n ∈ ℤ: o(a) | n} (Lemma 11.6)
= kℤ.
If o(a) is infinite, then
Ker φ = {n ∈ ℤ: nφ = 1 = a⁰}
= {n ∈ ℤ: a^n = 1}
= {0}. (Lemma 11.5)
(b) The determinant homomorphism (Example 20.2(b)) gives
GL(2,K)/SL(2,K) ≅ K\{0}. The sign homomorphism
ε: Sn → {1,−1} = C2
gives Sn/Ker ε ≅ Im ε. The absolute value homomorphism
ℝ\{0} → ℝ⁺
a ↦ |a|
gives (ℝ\{0})/C2 ≅ ℝ⁺.
ψ: ℝ → ℂ\{0}
x ↦ e^(2πxi)
is a homomorphism:
(x + y)ψ = e^(2π(x+y)i) = e^(2πxi) e^(2πyi) = (xψ)(yψ)
for all x,y ∈ ℝ. We have ℝ/Ker ψ ≅ Im ψ. The reader may verify that
Im ψ = {z ∈ ℂ: |z| = 1}.
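A floating-point sanity check of this homomorphism (a Python sketch using cmath; equalities are verified within a small tolerance):

```python
import cmath

def psi(x):
    """x -> e^(2 pi x i), a homomorphism from (R, +) to (C \\ {0}, *)."""
    return cmath.exp(2 * cmath.pi * x * 1j)

x, y = 0.3, 1.45
assert abs(psi(x + y) - psi(x) * psi(y)) < 1e-12   # homomorphism property
assert abs(abs(psi(x)) - 1) < 1e-12                # image lies on the unit circle
assert abs(psi(1) - 1) < 1e-12                     # the integers lie in Ker psi
```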
Exercises
8. Prove directly that any two cyclic groups of the same order are iso-
morphic.
9. Prove that any two dihedral groups of the same order are isomorphic.
10. Imitating Example 20.17(a), show that any dihedral group is
isomorphic to a factor group of D∞.
§21
Isomorphism Theorems
In more detail and more precise language, the claim is the following.
(Diagram: the subgroups H of G with Ker φ ≤ H ≤ G correspond to the
subgroups H1 of G1, with J ↔ J1, H ↔ H1 and Ker φ ↔ 1.)
Proof: (1) For each H ≤ G with Ker φ ≤ H, we are to find a subgroup of
G1. How can we find it? Well, the subgroup we are looking for will be
first of all a subset of G1. How can we associate with H a subset of G1? At
our disposal, we have only one means of transportation from G to G1,
namely the mapping φ. The only thing we can do, then, is form the set of
images of the elements of H under φ. Hence we put
H1 := {hφ ∈ G1: h ∈ H}.
H1 = {hφ ∈ G1: h ∈ H} = {hφ_H ∈ G1: h ∈ H} = Im φ_H
by definition, where φ_H is the restriction of φ to H. Theorem 20.6 gives
now Im φ_H ≤ G1, hence H1 ≤ G1. The description H1 = Im φ_H will be useful.
1 = (hφ)(jφ)⁻¹ = (hφ)(j⁻¹φ) = (hj⁻¹)φ,
hj⁻¹ ∈ Ker φ ⊆ J,
h ∈ Jj = J.
(5) For any S ≤ G1, we are to find an H ≤ G such that Ker φ ≤ H and
H1 = S. What can H be? As in part (1), there is only one thing we can do:
take the preimages of the elements in S. Hence we put
H = {a ∈ G: aφ ∈ S}.
Thus H is a subgroup of G.
This completes the proof of (5). (Part (5) shows that the correspondence
H H1 is onto.)
(6) First we assume H ⊴ G and show that H1 ⊴ G1. We are to show that
x⁻¹h1x ∈ H1 for all x ∈ G1 and for all h1 ∈ H1 (Lemma 18.2(1)). If x ∈ G1
and h1 ∈ H1, then there are a ∈ G with aφ = x and h ∈ H with hφ = h1. This
is so because φ is onto G1 and H1 is defined as Im φ_H. Then we are to
show (aφ)⁻¹(hφ)(aφ) ∈ H1. This is equivalent to (a⁻¹ha)φ ∈ H1. Since H ⊴ G,
we know a⁻¹ha ∈ H, so (a⁻¹ha)φ ∈ Im φ_H = H1. This proves H1 ⊴ G1.
G --φ--> G1 --ν′--> G1/H1, (i)
where ν′ is the natural homomorphism of G1 onto G1/H1.
(7) We saw that any one of H ⊴ G and H1 ⊴ G1 implies the other. Assume
that one, and hence both, of them are true. Then we have the homomorphism
φν′. From Theorem 20.16, we obtain
G/Ker φν′ ≅ Im φν′, (iii)
that is,
G/H ≅ G1/H1.
21.2 Theorem: Let N ⊴ G. The subgroups of G/N are the factor groups
S/N, where S runs through the subgroups of G satisfying N ≤ S. More
precisely, for each subgroup X of G/N, there is a unique subgroup S of G
satisfying N ≤ S such that X = S/N. When X1 and X2 are subgroups of
G/N, say X1 = S1/N and X2 = S2/N, where N ≤ S1 ≤ G and N ≤ S2 ≤ G,
then X1 ≤ X2 if and only if S1 ≤ S2. Furthermore, S/N ⊴ G/N if and
only if S ⊴ G. In this case, there holds
(G/N)/(S/N) ≅ G/S.
Proof: Since N G, we can build the factor group G/N. The natural
homomorphism : G G/N is onto by Theorem 20.12. We can therefore
apply Theorem 21.1.
Im ν_S = {sν ∈ G/N: s ∈ S}
= {Ns ∈ G/N: s ∈ S} = S/N
and Ker ν = N by Theorem 20.12 (notice that S/N is meaningful, for N ⊴
G and N ≤ S imply N ⊴ S; cf. Example 18.5(l)). Thus the subgroups of
G/N are given by S/N, where N ≤ S ≤ G. By Theorem 21.1(2),(3),(4),
S1/N ≤ S2/N if and only if S1 ≤ S2, and S1/N ≠ S2/N whenever S1 ≠ S2.
Finally, S/N ⊴ G/N if and only if S ⊴ G by Theorem 21.1(6), and in this
case (G/N)/(S/N) ≅ G/S.
groups of ℤ/nℤ are given by mℤ/nℤ, where m runs through all positive
divisors of n. For the factor group, we know
(ℤ/nℤ)/(mℤ/nℤ) ≅ ℤ/mℤ
from Theorem 21.2. So all factor groups of ℤ/nℤ and of Cn are cyclic (cf.
Lemma 18.9(3)). For each positive divisor m of n, there is a unique
factor group of order m of Cn = ⟨a⟩, namely ⟨a⟩/⟨a^m⟩, where ⟨a^m⟩ is the
unique subgroup of order n/m of Cn (Lemma 11.10). Under the
correspondence of Theorem 21.2, Cn ↔ ℤ/nℤ, ⟨a^m⟩ ↔ mℤ/nℤ and 1 ↔ nℤ/nℤ.
K/(H ∩ K) ≅ HK/H. (*)
228
(Diagram: the subgroup diamond formed by G, HK, H, K and H ∩ K.)
Exercises
§22
Direct Products
(h,k)(h1,k1) = (hh1,kk1).
Proof: Before beginning with the proof, it will not be amiss to formulate
the theorem in a more precise way. Suppose (H, ∘) and (K, ∗) are groups.
The claim is that (H × K, ·) is a group, where · is defined by
(h,k)·(h1,k1) = (h ∘ h1, k ∗ k1) for all (h,k),(h1,k1) ∈ H × K. We have
[(h,k)(h1,k1)](h2,k2) = (hh1,kk1)(h2,k2)
= ((hh1)h2,(kk1)k2)
= (h(h1h2),k(k1k2))
= (h,k)(h1h2,k1k2)
= (h,k)[(h1,k1)(h2,k2)]
and the operation on H K is associative.
Therefore, H K is a group.
Thus the notation "H × K" stands for the cartesian product of the sets H
and K as well as the direct product of the groups H and K. This ambiguity
will not lead to any confusion. The reader should be careful to
distinguish between HK and H × K. The former is defined only when H and K
are subgroups of a common group G, whereas H × K is a meaningful
group regardless of whether H and K are subgroups of a group. The
elements of HK are elements of the group that contains H and K; the
elements of H × K are ordered pairs.
When the groups H and K are written additively, we write the group of
Theorem 22.1 in the additive form, too. The operation is then given by
(h,k) + (h1,k1) = (h + h1, k + k1)
for all (h,k),(h1,k1) ∈ H × K. The operation is called addition in this case,
and the group is called the direct sum of H and K. We write the group as
H ⊕ K, to avoid confusion with H + K (which is HK in additive notation,
where H and K are subgroups of a group G).
φ: ℚ\{0} → C2 × ℚ⁺
q ↦ (sgn q, |q|)
is an isomorphism, so ℚ\{0} ≅ C2 × ℚ⁺.
ψ: ℂ → ℝ ⊕ ℝ
a + bi ↦ (a,b)
is an isomorphism (where ℂ is the group of complex numbers under
addition). Hence ℂ ≅ ℝ ⊕ ℝ.
H1 ≅ H, K1 ≅ K,
H1 ⊴ G, K1 ⊴ G,
H1K1 = G, H1 ∩ K1 = 1.
(hh′)μ1 = (hh′,1) = (h,1)(h′,1) = (hμ1)(h′μ1)
That H1K1 = G follows immediately from the fact that any (h,k) G can be
written as (h,1)(1,k) with (h,1) H1, (1,k) K1.
Finally, H1 ∩ K1 = 1. Indeed, if (h,k) ∈ H1 ∩ K1, then h = 1 as (h,k) ∈ K1 and
k = 1 as (h,k) ∈ H1, thus (h,k) = (1,1) and so H1 ∩ K1 ⊆ {(1,1)} = 1, yielding
H1 ∩ K1 = 1.
= h´⁻¹hh´ ∈ H
Proof: (1) follows from Theorem 22.4, Theorem 22.5, Theorem 22.6. As
for (2), we observe that G = H × K implies H ⊴ G, K ⊴ G, G = HK, H ∩ K =
1, so that G/H = HK/H ≅ K/(H ∩ K) = K/1 ≅ K by Theorem 21.3. The proof
of G/K ≅ H is similar.
ℤ/Ker φ ≅ Im φ
by Theorem 20.16. Now a ∈ Ker φ if and only if ā = 0̄ and a* = 0*, that is,
if and only if m|a and n|a. Since m and n are relatively prime, the latter
condition is equivalent to mn|a. Hence Ker φ = mnℤ and ℤ/mnℤ ≅ Im φ,
where Im φ is a subgroup of ℤ/mℤ ⊕ ℤ/nℤ. From
mn = |ℤ/mnℤ| = |Im φ| ≤ |ℤ/mℤ ⊕ ℤ/nℤ| = |ℤ/mℤ|.|ℤ/nℤ| = mn
we see that Im φ = ℤ/mℤ ⊕ ℤ/nℤ, so ℤ/mnℤ ≅ ℤ/mℤ ⊕ ℤ/nℤ. In
multiplicative notation:
Cmn ≅ Cm × Cn.
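In additive notation the isomorphism above is a ↦ (a mod m, a mod n). A quick sketch, ours rather than the book's (the helper name `crt_map` is an assumption), checks that this map is one-to-one onto ℤ/mℤ ⊕ ℤ/nℤ exactly when m and n are relatively prime, and that it respects addition:

```python
from math import gcd

def crt_map(m, n):
    """a + mnZ -> (a + mZ, a + nZ), tabulated on representatives 0..mn-1."""
    return {a: (a % m, a % n) for a in range(m * n)}

m, n = 4, 9                     # relatively prime
assert gcd(m, n) == 1
phi = crt_map(m, n)
assert len(set(phi.values())) == m * n     # one-to-one, hence onto
for a in range(m * n):                     # additive homomorphism
    for b in range(m * n):
        s = (a + b) % (m * n)
        assert phi[s] == ((phi[a][0] + phi[b][0]) % m,
                          (phi[a][1] + phi[b][1]) % n)
```

For m, n not relatively prime the map fails to be one-to-one, which is the computational face of Cmn ≇ Cm × Cn in that case.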
So far, we have examined the direct product of two groups. The construction
extends immediately to n groups, where n ≥ 2. We shall be
content with enunciating the appropriate theorems. Their proofs consist
in writing n-tuples in place of ordered pairs in the proofs above. The
only novel point is the extension of the previous condition H ∩ K = 1. This is
discussed in Theorem 22.12, whose proof we briefly sketch.
22.11 Definition: The group of Theorem 22.10 is called the direct product
of H1,H2, . . . ,Hn and is denoted by H1 × H2 × . . . × Hn. If the groups are
written additively, we call the group of Theorem 22.10 the direct sum
of H1,H2, . . . ,Hn and denote it by H1 ⊕ H2 ⊕ . . . ⊕ Hn.
Sketch of proof: Let Gi be the set {(1, . . . ,x, . . . ,1): x ∈ Hi} of all n-tuples
in G whose k-th components are equal to 1 ∈ Hk whenever k ≠ i. It is
easily verified that Gi is a subgroup of G, normal in G, isomorphic to Hi
and that G = G1G2. . .Gn. In fact, for all j = 2, . . . ,n,
22.14 Theorem: Let G be a group and let G1,G2, . . . ,Gn be subgroups of G.
Assume that Gi ⊴ G for all i = 1,2, . . . ,n, G = G1G2. . .Gn and G1G2. . .Gj−1 ∩ Gj = 1
for all j = 2, . . . ,n. Then G ≅ G1 × G2 × . . . × Gn.
22.16 Lemma: Let G1,G2, . . . ,Gn,H1,H2, . . . ,Hn be groups and assume that
G1 ≅ H1, G2 ≅ H2, . . . , Gn ≅ Hn. Then G1 × G2 × . . . × Gn ≅ H1 × H2 × . . . × Hn.

φ: G1 × G2 × . . . × Gn → H1 × H2 × . . . × Hn
(g1,g2, . . . ,gn) ↦ (g1φ1, g2φ2, . . . ,gnφn)
= (g1,g2, . . . ,gn)φ . (g1´,g2´, . . . ,gn´)φ,
so φ is a homomorphism, and
Ker φ = {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: (g1φ1,g2φ2, . . . ,gnφn) = (1,1, . . . ,1)}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1φ1 = 1, g2φ2 = 1, . . . , gnφn = 1}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1 = 1, g2 = 1, . . . , gn = 1}
= {(1,1, . . . ,1)} = 1,
= (g1g1´,g2g2´, . . . ,gngn´)ψ
= (H1g1g1´,H2g2g2´, . . . ,Hngngn´)

Ker ψ = {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: (g1,g2, . . . ,gn)ψ = (H1,H2, . . . ,Hn)}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: H1g1 = H1, H2g2 = H2, . . . ,Hngn = Hn}
= {(g1,g2, . . . ,gn) ∈ G1 × G2 × . . . × Gn: g1 ∈ H1, g2 ∈ H2, . . . , gn ∈ Hn}
= H1 × H2 × . . . × Hn.
Exercises
1 2 . . . n
(i i i n) in S n.
1 2 . . .
Let φi: Gi → Hi be a homomorphism for each i = 1,2, . . . ,n and define
φ: G1 × G2 × . . . × Gn → H1 × H2 × . . . × Hn
(g1,g2, . . . ,gn) ↦ (g1φ1, g2φ2, . . . ,gnφn)
(φ is sometimes denoted by φ1 × φ2 × . . . × φn). Show that φ is a homomorphism
and
Ker φ = Ker φ1 × Ker φ2 × . . . × Ker φn,  Im φ = Im φ1 × Im φ2 × . . . × Im φn.
8. For any abelian group A, let Â be the set of all homomorphisms from
A into ℂ\{0}. Prove that Â is an abelian group under the multiplication
given by
a(φψ) = aφ.aψ for all a ∈ A, φ,ψ ∈ Â.
§23
Center and Automorphisms
of Groups
Thus Z(G) ≤ G.
As any two elements of Z(G) commute, Z(G) is an abelian subgroup of G.
It is also a normal subgroup of G. We prove a slightly stronger result.
23.4 Examples: (a) Let K be a field and let us put G = GL(2,K) for
brevity. We want to find Z(G). Let A = (a b; c d) ∈ G (rows separated by
a semicolon). Then A ∈ Z(G) if and only if A(x y; z u) = (x y; z u)A
for all (x y; z u) ∈ G. In particular,
(a b; c d)(1 1; 0 1) = (1 1; 0 1)(a b; c d) and (a b; c d)(0 1; 1 0) = (0 1; 1 0)(a b; c d),
hence
a = a + c,  a + b = b + d,  c = c,  c + d = d  and  b = c,  a = d,  d = a,  c = b
for all (a b; c d) ∈ Z(G), so (a b; c d) = (a 0; 0 a), where a ≠ 0 since det(a b; c d) ≠ 0.
Conversely, the set on the right-hand side is contained in Z(G), since
matrices of the form aI commute with every matrix.
Thus Z(G) = {(a 0; 0 a) ∈ G: a ≠ 0}. The elements of Z(G) are called scalar
matrices.
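For a finite field the computation above can be confirmed by brute force. The sketch below is ours (matrix representation as nested tuples, p = 3 as a sample size): it lists GL(2,F3) and checks that its center is exactly the scalar matrices.

```python
from itertools import product

p = 3

def mat_mul(A, B):
    """Multiply 2x2 matrices over F_p."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def det(A):
    (a, b), (c, d) = A
    return (a*d - b*c) % p

G = [((a, b), (c, d))
     for a, b, c, d in product(range(p), repeat=4)
     if det(((a, b), (c, d))) != 0]

center = [A for A in G if all(mat_mul(A, B) == mat_mul(B, A) for B in G)]
# Z(GL(2, F_3)) = scalar matrices aI with a != 0
assert sorted(center) == sorted([((a, 0), (0, a)) for a in range(1, p)])
```

Here |GL(2,F3)| = 48 and the center has p − 1 = 2 elements, matching the description of Z(G) as the scalar matrices.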
(b) Let D4n = ⟨a,b: a^(2n) = 1, b² = 1, bab = a⁻¹⟩ be a dihedral group of order
4n ≥ 4. What is Z(D4n)? Well, let x ∈ Z(D4n). Then x = a^j or x = a^j b for
some j, 0 ≤ j ≤ 2n − 1. Since xa = ax and xb = bx, we get
the conditions (1) and (2). The equations in (1) are satisfied only when n | j, that is to say, only
when j = 0,n, so only when x = a^0, a^n. The first equation in (2) is never
satisfied, for n ≥ 1 by hypothesis. Thus Z(D4n) ⊆ {1,a^n}. The reader will
easily show the reverse inclusion. Hence Z(D4n) = {1,a^n} = ⟨a^n⟩.
(c) Let us find Z(S3). It is easy to see that ι and (12) are the only permutations
in S3 that commute with (12). Also, ι and (13) are the only
permutations in S3 that commute with (13). Hence ι is the only
permutation in S3 that commutes with both (12) and (13). A fortiori, Z(S3)
= 1.
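This can be verified mechanically. The sketch below is ours: S3 is listed as tuples (the tuple s sends i to s[i]) and composed left-to-right, matching the book's convention of writing maps on the right.

```python
from itertools import permutations

S3 = list(permutations(range(3)))      # sigma as a tuple: i -> sigma[i]

def compose(s, t):
    """x(st) = (xs)t: apply s first, then t."""
    return tuple(t[s[i]] for i in range(3))

center = [s for s in S3 if all(compose(s, t) == compose(t, s) for t in S3)]
assert center == [(0, 1, 2)]           # only the identity: Z(S3) = 1
```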
construct a homomorphism whose kernel is Z(G). We will need the
concept of automorphisms.

Proof: We can check the group axioms, but there is a shorter way. We
make use of Aut(G) ⊆ SG. Now SG is a group under the composition of
mappings (Example 7.1(d)), so all we have to do is show that Aut(G) is a
subgroup of SG.
23.8 Example: Let G be a group. We fix an arbitrary element g of G.
With each x ∈ G, we associate g⁻¹xg. This is a uniquely determined
element of G, so we have a mapping x ↦ g⁻¹xg, which we denote by τg. So
τg: G → G
x ↦ g⁻¹xg.
For all g,h ∈ G and x ∈ G we have x(τgτh) = h⁻¹(g⁻¹xg)h = (gh)⁻¹x(gh) = xτgh, so
τgτh = τgh. (1)
In particular τgτg⁻¹ = τ1 = ι = τg⁻¹τg. (2)
So τg is an automorphism of G.
The set
{τg ∈ Aut(G): g ∈ G}
of all inner automorphisms of G will be denoted by Inn(G).
The relation (1) has a deep significance. It states that the mapping
τ: G → Aut(G)
g ↦ τg
is a homomorphism. Its kernel is
Ker τ = {z ∈ G: τz = ι}
= {z ∈ G: gτz = g for all g ∈ G}
= {z ∈ G: z⁻¹gz = g for all g ∈ G}
= {z ∈ G: gz = zg for all g ∈ G}
= Z(G).
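The kernel computation can be watched on a small group. The sketch below is ours: D8, the symmetries of a square, realized inside S4 (the realization and helper names are assumptions); it confirms that the inner automorphisms τg collapse exactly along the center, so |Inn(G)| = |G : Z(G)|.

```python
from itertools import product

def compose(s, t):
    """x(st) = (xs)t: maps written on the right, applied left to right."""
    return tuple(t[s[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

# D8 on vertices 0,1,2,3: r = rotation, f = reflection
r, f = (1, 2, 3, 0), (0, 3, 2, 1)
G = {(0, 1, 2, 3), r, f}
while True:                                  # close under multiplication
    new = {compose(a, b) for a, b in product(G, repeat=2)} - G
    if not new:
        break
    G |= new

def tau(g):
    """The inner automorphism x -> g^-1 x g, as a frozen mapping."""
    gi = inverse(g)
    return frozenset((x, compose(compose(gi, x), g)) for x in G)

center = {g for g in G if all(compose(g, x) == compose(x, g) for x in G)}
inner = {tau(g) for g in G}
assert len(G) == 8
assert len(center) == 2                      # Z(D8) = {1, r^2}
assert len(inner) == len(G) // len(center)   # |Inn(G)| = |G : Z(G)|
```

Two elements g, h give the same τ exactly when gh⁻¹ ∈ Z(G), which is the statement Ker τ = Z(G) in computational form.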
For any α ∈ Aut(G) and g ∈ G,
x(α⁻¹τgα) = ((xα⁻¹)τg)α
= (g⁻¹(xα⁻¹)g)α
= (g⁻¹α)((xα⁻¹)α)(gα)
= (gα)⁻¹x(gα)
= xτgα,
thus α⁻¹τgα = τgα and α⁻¹τgα ∈ Inn(G). This proves Inn(G) ⊴ Aut(G).
of H. Then Kα = K, because K is characteristic in H. Thus Kα = K for all
α in Aut(G) and K is a characteristic subgroup of G.
Proof: We must show Z(G)α = Z(G) for all α ∈ Aut(G). If we can prove
Z(G)α ⊆ Z(G) for all α ∈ Aut(G), then we will have Z(G)α⁻¹ ⊆ Z(G), that is,
Z(G) ⊆ Z(G)α for any α ∈ Aut(G) also (cf. the proof of (2) ⇒ (3) in Lemma
18.2). So we need only prove Z(G)α ⊆ Z(G). For any z ∈ Z(G), we are to
show that (zα)g = g(zα) for all g ∈ G. As g runs through G, so does gα,
because α is onto G. Thus we need only show (zα)(gα) = (gα)(zα) for all
g ∈ G. But this is obvious: (zα)(gα) = (zg)α = (gz)α = (gα)(zα) since z ∈ Z(G)
and α is a homomorphism. Consequently, Z(G) is characteristic in G.
x^m for some m ∈ ℤ, and aα = (x^m)α = (xα)^m = (xβ)^m = (x^m)β = aβ. This proves
the claim.
Now αm is one-to-one if and only if Ker αm = 1 (Theorem 20.8) and
Ker αm = {g ∈ Cn: g^m = 1}
= {x^k: k ∈ ℤ and x^(km) = 1}
= {x^k: k ∈ ℤ and n | km}
= {x^k: k ∈ ℤ and n/(n,m) | km/(n,m)}
= {x^k: k ∈ ℤ and n/(n,m) | k}
= ⟨x^(n/(n,m))⟩,
so Ker αm = 1 = ⟨x^n⟩ if and only if (n,m) = 1. Thus αm is an automorphism
of Cn if and only if (n,m) = 1.
Hence Aut(Cn) = {αm: (n,m) = 1}.
With each αm we associate the residue class m̄, so that a ↦ a^m. In
this notation, we have
Aut(Cn) = {αm: m̄ ∈ ℤn*}
and m̄ = k̄ implies αm = αk. In other words, the mapping
θ: ℤn* → Aut(Cn)
m̄ ↦ αm
is well defined. It is a homomorphism, for
xαmk = x^(mk) = (x^m)^k = (x^m)αk = (xαm)αk = x(αmαk)
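The description Aut(Cn) = {αm: (n,m) = 1} says in particular that |Aut(Cn)| = φ(n). A quick check, ours rather than the book's (Cn is modeled additively and αm as multiplication by m):

```python
from math import gcd

def automorphisms(n):
    """The maps alpha_m: x -> m*x (mod n) that are bijections of Z/nZ."""
    maps = []
    for m in range(1, n):
        if len({(m * x) % n for x in range(n)}) == n:   # alpha_m one-to-one
            maps.append(m)
    return maps

for n in (5, 6, 12):
    # exactly the m prime to n survive, so |Aut(Cn)| = phi(n)
    assert automorphisms(n) == [m for m in range(1, n) if gcd(m, n) == 1]
```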
Exercises
9. Let α ∈ Aut(G) and H ≤ G. Prove that Hα is a subgroup of G and is
isomorphic to H.

12. Find all characteristic subgroups of D8. Prove that Inn(D8) ≠ 1 and
that Aut(D8) ≅ D8.

13. Prove that Aut(ℤ) ≅ C2, Aut(V4) ≅ S3, Aut(S3) ≅ S3, Aut(Q8) ≅ S4 (see
§17, Ex. 15).
§24
Generators and Commutators
The elements of ⟨X⟩ are described in the next lemma. See also Ex. 1 at
the end of this paragraph.

Proof: Let Y be the set on the right-hand side. We must show Y ⊆ ⟨X⟩
and ⟨X⟩ ⊆ Y.
In order to prove Y ⊆ ⟨X⟩, we show that Y ⊆ H for every H ≤ G such that
X ⊆ H. This follows from the closure properties of subgroups. If X ⊆ H
and H ≤ G, then, for any x ∈ X, there holds x^n ∈ H for any n ∈ ℕ since H
is closed under multiplication, and also x^(−n) ∈ H for any n ∈ ℕ since H is
closed under taking inverses, and x^0 = 1 ∈ H. Hence, for any k ∈ ℕ, any
x1,x2, . . . ,xk ∈ X, any m1,m2, . . . ,mk ∈ ℤ, we have x1^m1, x2^m2, . . . , xk^mk ∈ H and,
from the closure of H under multiplication, we get x1^m1 x2^m2 . . . xk^mk ∈ H.
Thus Y ⊆ H whenever X ⊆ H ≤ G. This proves Y ⊆ ⟨X⟩.
(b) Any element of the dihedral group D2n can be written in the form
a^m b^j, where m,j ∈ ℤ. Hence D2n = ⟨a,b⟩. So the notation of §14 is
consistent with Definition 24.1.
(d) SL(2,ℤ) is generated by {(1 1; 0 1), (0 −1; 1 0)}. A proof of this is outlined
in Ex. 9.
x⁻¹y⁻¹xy ∈ G

Some authors define [x,y] to be xyx⁻¹y⁻¹. In this book, [x,y] will always
stand for x⁻¹y⁻¹xy. Clearly, xy = yx[x,y] for any x,y ∈ G. In general, xy ≠
yx, and [x,y] is that element z in G for which xy = yx.z, whence the name
commutator.
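The defining identity xy = yx[x,y] can be exercised on S3 (sketch is ours; composition left-to-right, maps on the right, as before):

```python
from itertools import permutations

def compose(s, t):
    return tuple(t[s[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

def comm(x, y):
    """[x,y] = x^-1 y^-1 x y."""
    return compose(compose(compose(inverse(x), inverse(y)), x), y)

S3 = list(permutations(range(3)))
e = (0, 1, 2)
for x in S3:
    for y in S3:
        c = comm(x, y)
        assert compose(x, y) == compose(compose(y, x), c)   # xy = yx[x,y]
        assert (c == e) == (compose(x, y) == compose(y, x)) # [x,y]=1 iff commute
```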
24.8 Lemma: Let G be a group and x,y ∈ G.
(1) [x,y]⁻¹ = [y,x].
(2) [x,y] = 1 if and only if x and y commute: xy = yx.

[H,K] = ⟨[h,k] ∈ G: h ∈ H, k ∈ K⟩.

= ⟨[k,h] ∈ G: k ∈ K, h ∈ H⟩
= [K,H].
(Definition 10.4). The equivalence classes are the right cosets of this
subgroup; the subgroup is normal, so its cosets form a factor group. We
expect this factor group to be abelian. First we give a name to the subgroup.
Exercises
4. If H ⊴ G and G is finitely generated, show that G/H is also finitely
generated.
7. Let σ: ℚ → ℚ, u ↦ u + 1, and τ: ℚ → ℚ, u ↦ 2u, and put G = ⟨σ,τ⟩ ≤ Sym ℚ.
Let σn = τⁿστ⁻ⁿ for n ∈ ℕ. Show that σ(n+1)² = σn for all n ∈ ℕ. Show that
⟨σ1⟩ ⊆ ⟨σ2⟩ ⊆ ⟨σ3⟩ ⊆ . . . .
Prove that ⟨σn⟩ is a proper subgroup of A := ⟨∪ᵢ₌₁^∞ ⟨σi⟩⟩ for all n ∈ ℕ. Using
Ex. 6, conclude that A is not finitely generated. Thus a subgroup of a
finitely generated group need not be finitely generated.
10. Let H ≤ G. Prove that [H,G] = 1 if and only if H ⊆ Z(G) and also that
[H,G] ⊆ H if and only if H ⊴ G.

12. Let K ⊴ G. Prove that [xK,yK] = [x,y]K ∈ G/K for any x,y ∈ G. Then
prove that [HK/K,JK/K] = [H,J]K/K for all H,J ≤ G.
13. Let H1,H2 ≤ H and K1,K2 ≤ K. Show that [H1 × K1, H2 × K2] = [H1,H2]
× [K1,K2] as subgroups of H × K.

14. Show that [xy,z] = y⁻¹[x,z]y.[y,z] and [x,yz] = [x,z].z⁻¹[x,y]z for any
elements x,y,z of a group G. Deduce that [HJ,K] = [H,K][J,K] whenever H,J,K
are normal subgroups of G.

17. Give an example of a group G and three subgroups H,K,L of G such that
[[H,K],L] ≠ [H,[K,L]].

19. Find the derived subgroups of S3, S4, A4, D8, Q8 (see §17, Ex.
15), SL(2,ℤ3), GL(2,ℤ3), Sn, An (for n ≥ 2).

20. Let G be a group such that G´ ⊆ Z(G) and let a be a fixed element of G.
Prove that the mapping φ: G → G, x ↦ [x,a] is a homomorphism.
§25
Group Actions
Many of the important groups we have examined so far are groups of
functions. SX is the group of one-to-one mappings on the set X, Isom E is
the group of distance-preserving functions on the Euclidean plane, Aut(G)
is the group of multiplication-preserving functions on a group G. You will
see more examples later. In general, when X is a set with some structure
on it (algebraic, geometric, analytic, topological or of some other type),
the mappings on X that preserve this structure form a group. Up to now,
we neglected the functional character that the elements of a group
might have. In this paragraph, we consider groups whose elements can
be thought of as functions on a set X. This leads to the idea of group actions.
(x,y)(a b; c d) = (xa + yc, xb + yd)
for all (a b; c d), (e f; g h) ∈ G.
(d) Suppose G acts on X on the left and we denote the element of X corresponding
to the pair g,x (g ∈ G, x ∈ X) by g∗x. Then G acts on X on the
right when we put xg := g⁻¹∗x, because
(xg1)g2 = (g1⁻¹∗x)g2 = g2⁻¹∗(g1⁻¹∗x) = g2⁻¹g1⁻¹∗x = (g1g2)⁻¹∗x = x(g1g2)
and
x1 = 1⁻¹∗x = 1∗x = x
for all x ∈ X, g1,g2 ∈ G. We could not write xg := g∗x, for then we would
get (xg1)g2 = x(g2g1) instead of (xg1)g2 = x(g1g2). However, if G is commutative,
G acts on X on the right when we put xg := g∗x.
f(σ1σ2) = (fσ1)σ2 and fι = f
for all f ∈ F, σ1,σ2 ∈ Sym F.
(g) Let 𝒮 be the set of all nonempty subsets of the Euclidean plane E.
Then SE acts on 𝒮 since (Fσ)τ = F(στ) and Fι = F for all F ∈ 𝒮 and σ,τ ∈ SE
(Lemma 14.1).
(h) Assume that a group G acts on a set X. Then any subgroup of G also
acts on X.
In the next two theorems, we shall show that any group action on a set
X is essentially a homomorphism into SX.

φ: G → SX
g ↦ σg
is a homomorphism (called the permutation representation of G corresponding
to the action).
xσ(g1g2) = x(g1g2) = (xg1)g2 = (xg1)σg2 = (xσg1)σg2 = x(σg1σg2),
so σ(g1g2) = σg1σg2. (1)
Furthermore, xσ1 = x1 = x for all x ∈ X, hence
σ1 = ιX ∈ SX. (2)
From (1) and (2),
σgσ(g⁻¹) = σ(gg⁻¹) = σ1 = ιX ∈ SX and σ(g⁻¹)σg = σ(g⁻¹g) = σ1 = ιX ∈ SX,
and thus σg is one-to-one and onto (Theorem 3.17(2)). So σg ∈ SX for all g
in G.

Conversely, given a homomorphism φ: G → SX, g ↦ gφ, we put
xg = x(gφ)
for all x ∈ X, g ∈ G. Here we use the fact that ι ∈ SX is the identity
element of the group SX (Lemma 20.3(a)), which is the identity mapping
on X. Thus setting xg = x(gφ) does define a group action.
xσg = xg = x(gφ)
25.5 Lemma: Let G act on X. For any x,y ∈ X, we put x ~ y if and only if
there is an element g ∈ G such that xg = y. Then ~ is an equivalence
relation on X.

Proof: (cf. Lemma 15.7.) (i) Since 1 ∈ G and x1 = x for all x ∈ X, we have
x ~ x for all x ∈ X. Thus ~ is reflexive.
(ii) Suppose x,y ∈ X and x ~ y, say xg = y with g ∈ G. Then yg⁻¹ = (xg)g⁻¹ =
x(gg⁻¹) = x1 = x, so y ~ x. Thus ~ is symmetric.
(iii) Suppose x,y,z ∈ X and x ~ y, y ~ z. Then there are g,h ∈ G such that xg
= y and yh = z. Then x(gh) = (xg)h = yh = z. From gh ∈ G and x(gh) = z, we
conclude x ~ z. Thus ~ is transitive.
So ~ is an equivalence relation on X.
StabG(x) = {g ∈ G: xg = x}.
Proof: The proof is a routine application of our subgroup criterion.
(ii) Let g ∈ StabG(x). Then xg = x. So xg⁻¹ = (xg)g⁻¹ = x(gg⁻¹) =
x1 = x, so g⁻¹ ∈ StabG(x). Hence StabG(x) is closed under the forming of inverses.
Thus StabG(x) ≤ G.
StabG(xg) = g⁻¹StabG(x)g.
Ker φ = {g ∈ G: σg = ι ∈ SX}
= {g ∈ G: xσg = x for all x ∈ X}
= {g ∈ G: xg = x for all x ∈ X}
= ∩(x ∈ X) {g ∈ G: xg = x}
= ∩(x ∈ X) StabG(x).
|orbit of x| = |G : StabG(x)|.
Proof: The orbit of x is the set {xg ∈ X: g ∈ G}. The index |G : StabG(x)| is
the number of right cosets of StabG(x) in G, more precisely, the cardinal
number of 𝒞 = {StabG(x)g: g ∈ G}. We must find a one-to-one correspondence
between the orbit {xg ∈ X: g ∈ G} of x and the set 𝒞 =
{StabG(x)g: g ∈ G} of the right cosets of StabG(x) in G. The description of
these sets leads us to consider the mapping (writing S := StabG(x))
φ: orbit of x → 𝒞,
xg ↦ Sg.
Before that, however, we must check that φ is well defined, for one and
the same element in the orbit of x can have representations xg, xh with
g ≠ h. We must prove that xg = xh implies Sg = Sh. If xg = xh, then x(gh⁻¹)
= (xg)h⁻¹ = (xh)h⁻¹ = x(hh⁻¹) = x1 = x, so gh⁻¹ ∈ S and therefore Sg = Sh by
Lemma 10.2(5). Thus φ is well defined.
|orbit of x| = |G : StabG(x)|.
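The counting statement just proved is easy to confirm on the natural action of S4 on {0,1,2,3} (sketch is ours; the action is xg = g[x]):

```python
from itertools import permutations

G = list(permutations(range(4)))    # S4 acting on X = {0,1,2,3}: xg = g[x]
x = 0
orbit = {g[x] for g in G}
stab = [g for g in G if g[x] == x]
# |orbit of x| = |G : Stab_G(x)|
assert len(orbit) == len(G) // len(stab)
assert len(orbit) == 4 and len(stab) == 6
```

Here the stabilizer of 0 is a copy of S3 (the permutations fixing 0), and 24/6 = 4 recovers the orbit size.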
25.11 Definition: Let G act on X. We say G acts transitively on X, or that
the action of G on X is transitive, if, for any x,y ∈ X, there
is a g ∈ G such that xg = y. If G does not act transitively on X, then G is
said to act intransitively on X.

Thus G acts transitively on X if and only if there is one and only one orbit.
The whole set X is the single orbit of the action.
(Ha)1 = Ha1 = Ha
for all Ha ∈ 𝒞, g1,g2 ∈ G.
We have StabG(H) = {g ∈ G: Hg = H} = {g ∈ G: g ∈ H} = H
and StabG(Ha) = a⁻¹StabG(H)a = a⁻¹Ha
by Lemma 25.8.
N: S n S
1
confusion with right multiplication, we shall write x^g for g⁻¹xg. This
notation is standard. Since the stabilizer of x under this action is the
centralizer CG(x), as we know also from the proof of Theorem 23.10,
Lemma 25.10 assumes the following form in this case.
|G| = Σᵢ₌₁ᵏ |G : CG(xi)|.

Indeed,
G = ∪ᵢ₌₁ᵏ (conjugacy class of xi),
the union being disjoint. Counting the number of elements on both sides,
and using Lemma 25.15, we obtain
|G| = Σᵢ₌₁ᵏ |conjugacy class of xi| = Σᵢ₌₁ᵏ |G : CG(xi)|.
Proof: Let k be the number of conjugacy classes in G, and let x1,x2, . . . ,xk
be representatives of these classes. Then, in the class equation
|G| = Σᵢ₌₁ᵏ |G : CG(xi)|,
G (Theorem 23.3), we can build the factor group G/Z(G), which has order
p²/p = p and which is therefore cyclic by Theorem 11.13. Then G must
be abelian by Lemma 23.5, and |Z(G)| = p², contrary to the assumption
|Z(G)| = p. Thus |Z(G)| = p is impossible and there remains only the
possibility |Z(G)| = p². Hence G is abelian.
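The class equation used in this argument can be watched in action on a small group. The sketch below is ours: it splits S3 into conjugacy classes and checks that |G| = 6 decomposes as 1 + 2 + 3, with each class size an index and hence a divisor of |G|.

```python
from itertools import permutations

def compose(s, t):
    return tuple(t[s[i]] for i in range(len(s)))

def inverse(s):
    inv = [0] * len(s)
    for i, j in enumerate(s):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))

def conj_class(x):
    """All g^-1 x g for g in G."""
    return {compose(compose(inverse(g), x), g) for g in G}

classes, seen = [], set()
for x in G:
    if x not in seen:
        c = conj_class(x)
        classes.append(c)
        seen |= c

sizes = sorted(len(c) for c in classes)
assert sizes == [1, 2, 3]                 # class equation: 6 = 1 + 2 + 3
assert sum(sizes) == len(G)
assert all(len(G) % s == 0 for s in sizes)  # each size divides |G|
```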
We wish to present the basic idea in the proof of Theorem 25.17 in its
purest form. We need a definition.
Thus FixX(G) consists of all those elements in X which form an orbit with
only one element in it. When we count the number of elements in X as
the sum of the number of elements in each orbit, each element in FixX(G)
contributes 1 to this sum. Notice that, under the action of a group G on
itself by conjugation, FixG(G) is nothing else than Z(G).
Counting the number of elements on both sides, we get
|X| = Σᵢ₌₁ᵏ |orbit of xi|.
Hence, by Lemma 25.10, |X| = Σᵢ₌₁ᵏ |G : StabG(xi)|.
25.21 Example: Let G be a group and let 𝒮 be the set of all nonempty
subsets of G. For any U ∈ 𝒮 and g ∈ G, we put U^g = g⁻¹Ug.

The orbit {U^g: g ∈ G} = {g⁻¹Ug: g ∈ G} of U is called the conjugacy class
of U. We have
CG(U) = {g ∈ G: u^g = u for all u ∈ U}
= {g ∈ G: g⁻¹ug = u for all u ∈ U} = {g ∈ G: ug = gu for all u ∈ U},
so that
CG(U) = ∩(u ∈ U) CG(u).
The orbit of U is
{U^g: g ∈ G} = {g⁻¹Ug: g ∈ G}
Exercises
1. Prove that SL(2,ℤ) acts on X := {(a b/2; b/2 c): a,b,c ∈ ℤ} when we
3. Let G act on X and H act on Y. Prove that the direct product G × H acts
on the cartesian product X × Y.
§26
Sylow's Theorem
low's theorem states that every finite group has a Sylow p-subgroup, for
all prime numbers p.

There are clearly (pᵃm choose pᵃ) subsets of G in 𝒮. We are to prove
p ∤ (pᵃm choose pᵃ).
(pᵃm − s)/(pᵃ − s) = (pᵃm − pᵇt)/(pᵃ − pᵇt) = (p^(a−b)m − t)/(p^(a−b) − t)
when s = pᵇt with p ∤ t, so neither the numerators nor the denominators
contain p after cancellations are made. Hence their product (pᵃm choose pᵃ) is not
divisible by p.
divisible by p.
For example, with p = 3, pᵃ = 9, m = 2:
(18 choose 9) = (18/9)(17/8)(16/7)(15/6)(14/5)(13/4)(12/3)(11/2)(10/1)
= (2/1)(17/8)(16/7)(5/2)(14/5)(13/4)(4/1)(11/2)(10/1),
and after the cancellations no factor of 3 remains, so 3 ∤ (18 choose 9).
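Step 1's divisibility claim, p ∤ (pᵃm choose pᵃ) when p ∤ m, can be spot-checked directly (sketch is ours; `math.comb` computes binomial coefficients):

```python
from math import comb

def p_divides_binom(p, a, m):
    """Does p divide C(p^a * m, p^a)?"""
    return comb(p**a * m, p**a) % p == 0

# p = 3, a = 2, m = 2: the (18 choose 9) example above
assert not p_divides_binom(3, 2, 2)
assert comb(18, 9) == 48620 and 48620 % 3 != 0
# more cases with p not dividing m
assert not p_divides_binom(2, 3, 3)
assert not p_divides_binom(5, 1, 2)    # C(10, 5) = 252
```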
Step 3. There is an orbit of 𝒮 under the action of Step 2 such that the
number of elements (of 𝒮; equivalently, the number of subsets of G) in it
is not divisible by p:
𝒮 = 𝒪1 ∪ 𝒪2 ∪ . . . ∪ 𝒪k.
Counting the number of elements and keeping in mind that the orbits
are pairwise disjoint, we get
|𝒮| = |𝒪1| + |𝒪2| + . . . + |𝒪k|.
If |𝒪1|, |𝒪2|, . . . , |𝒪k| were all divisible by p, their sum |𝒮| would be divisible
by p, too, contrary to Step 1. Thus at least one of the numbers
|𝒪1|, |𝒪2|, . . . , |𝒪k| is not divisible by p, as contended.
Let U0 ∈ 𝒮 be such that the number of elements (of 𝒮) in its orbit is not
divisible by p. This is the judicious choice we have alluded to. We put
H = StabG(U0).
Step 4. H ≤ G and |H| = pᵃ:
Fix (J) 0
Fix (J) .
This completes the proof of part (2). In view of the remarks preceding
the proof, all Sylow p-subgroups of G are conjugate; and a normal Sylow
p-subgroup of a finite group is the unique Sylow p-subgroup of that
group.
Step 6. np = |G:N| ≡ 1 (mod p):
We have
|𝒪1| ≡ |Fix𝒪1(H)| (mod p).
Since np = |G:N| = |𝒪1|, the claim will be established when we show that
|Fix𝒪1(H)| = 1.

From the equivalences
it follows that Fix𝒪1(H) = {N}. Thus |Fix𝒪1(H)| = 1 and np ≡ 1 (mod p).
Proof: (1) From Theorem 25.17, we know Z(G) ≠ 1. Let z ∈ Z(G) with z ≠
1, and let o(z) = pᵏ (1 ≤ k ≤ a). Then o(z^(p^(k−1))) = p. Thus ⟨z^(p^(k−1))⟩ is a
subgroup of order p and is normal in G (Theorem 23.3).
Assume now that a ≥ 2 and that the claim is true for any finite p-group
of order p^(a−1). By part (1), there is H1 ⊴ G with |H1| = p. We consider the
factor group G/H1, which has order |G/H1| = |G|/|H1| = pᵃ/p = p^(a−1). By
induction, there are normal subgroups, say Hi+1/H1, of G/H1 with |Hi+1/H1|
= pⁱ (i = 0,1, . . . ,a−1) and
H1 ⊆ H2 ⊆ . . . ⊆ Ha−1 ⊆ Ha = G.
Here |Hi+1| = |Hi+1/H1||H1| = pⁱp = p^(i+1) for i = 0,1, . . . ,a−1. Thus, when we put
H0 = 1, the claim is proved for finite p-groups of order pᵃ.
Proof: Suppose p > q and let np be the number of Sylow p-subgroups
of G. Then np divides |G|/p = q, so np = 1 or q, and np ≡ 1 (mod p). So np =
q implies p | q − 1, which is not compatible with p > q. Thus np = q is
impossible and np = 1. Then there is a unique Sylow p-subgroup of G,
and it is normal in G.
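The arithmetic in this proof (np divides the p-free part of |G| and np ≡ 1 mod p) can be packaged as a small filter; with |G| = pq and p > q it always leaves np = 1, as claimed. The helper name is ours:

```python
def sylow_counts(p, order):
    """Candidates for n_p: divisors of order/p^a that are 1 mod p."""
    m = order
    while m % p == 0:
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

# |G| = pq with p > q: the only candidate is 1, so the Sylow
# p-subgroup is normal (as in the proof above).
for p, q in [(5, 3), (7, 2), (11, 5)]:
    assert sylow_counts(p, p * q) == [1]
# but for |G| = 6, p = 2 the candidates are 1 and 3 (S3 has n_2 = 3)
assert sylow_counts(2, 6) == [1, 3]
```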
where the union is taken over pairwise disjoint sets. Counting the
number of elements on the right-hand side, we see that there are exactly
p²(q − 1) elements of order q in G. So there are exactly |G| − p²(q − 1) = p²
elements in
Exercises
3. Let G be a finite group with exactly one Sylow p-subgroup. Prove that
every subgroup and every factor group of G, too, has exactly one Sylow
p-subgroup.
8. Let G be a finite group and H,J ⊴ G. Suppose J is a finite p-group and
|H| ≡ 0 (mod p), where p is a prime number. Prove that H ∩ CG(J) ≠ 1.
11. Let p,q,r be distinct prime numbers and let G be a group of order
pqr. Show that G has a nontrivial proper normal subgroup.
§27
Series
Thus a group G is simple if and only if G ≠ 1 and 1 and G are the only
normal subgroups of G. This resembles the definition of prime numbers.
Just as prime numbers are the building blocks of integers, simple groups
are the building blocks of certain groups, as will be seen below. Moreover,
the fundamental theorem of arithmetic has a counterpart, namely
the Jordan-Hölder theorem. This theorem states that, for any group G satisfying
certain conditions that will be specified later, the building blocks
of G are uniquely determined. However, this analogy should not be
pushed too far. For one thing, the building blocks may be combined in
various ways to produce different groups. Stated otherwise, different
groups may have the same building blocks. In fact, the problem of determining
a group from its building blocks, known as the extension
problem, still awaits its solution.
of prime order. Conversely, a cyclic group of prime order has no nontrivial
proper subgroup by Lagrange's theorem, and is therefore an
abelian simple group. We have proved the following theorem.

We prove next that the alternating groups An, where n ≥ 5, are simple.
We need a lemma. Let us recall that a 3-cycle is a permutation of the
form (abc), with a,b,c distinct.
There are three cases to consider, in which two or one or none of c,d is in
the set {a,b}. In the first case, {c,d} = {a,b}, hence (cd) = (ab) and (ab)(cd)
= (ab)(ab) = ι = (abe)(abe)(abe) is a product of 3-cycles, where e is
distinct from a and b (here we use the assumption n ≥ 3). In the second
case, we may assume c = a without loss of generality. Then (ab)(cd) =
(ab)(ad) = (abd) is a product of one 3-cycle. In the third case, a,b,c,d are
all distinct and (ab)(cd) = (abc)(adc) is a product of two 3-cycles. The
proof is complete.
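The three cases can be checked by direct composition (sketch is ours; left-to-right composition, and the `cycle` helper builds a cycle on 0..n−1):

```python
def compose(*perms):
    """Compose permutations left-to-right: x(st) = (xs)t."""
    n = len(perms[0])
    result = tuple(range(n))
    for p in perms:
        result = tuple(p[result[i]] for i in range(n))
    return result

def cycle(n, *pts):
    """The cycle (pts[0] pts[1] ...) as a tuple acting on 0..n-1."""
    s = list(range(n))
    for i, x in enumerate(pts):
        s[x] = pts[(i + 1) % len(pts)]
    return tuple(s)

n = 5
a, b, c, d, e = 0, 1, 2, 3, 4
# case 1: (ab)(ab) = identity = (abe)^3
assert compose(cycle(n, a, b), cycle(n, a, b)) == tuple(range(n))
assert compose(*[cycle(n, a, b, e)] * 3) == tuple(range(n))
# case 2: (ab)(ad) = (abd)
assert compose(cycle(n, a, b), cycle(n, a, d)) == cycle(n, a, b, d)
# case 3: (ab)(cd) = (abc)(adc)
assert compose(cycle(n, a, b), cycle(n, c, d)) == \
       compose(cycle(n, a, b, c), cycle(n, a, d, c))
```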
Proof: Let 1 ≠ N ⊴ An. We will prove N = An.
Hence, in the disjoint cycle decomposition of any nonidentity element of
N, there is no cycle of length ≥ 3. Combining this with what we proved
above, we conclude that any nonidentity element in N must be a product
of (an even number of) disjoint transpositions.
27.7 Definitions: Let H ≤ G. A finite sequence of subgroups of G,
including H and G, is called a series from H to G, or a series between H
and G, if each group in the sequence is a normal subgroup of the next
one. Thus a series from H to G can be written
H = H0 ⊴ H1 ⊴ . . . ⊴ Hn−1 ⊴ Hn = G. (1)
The subgroups H0,H1, . . . ,Hn−1,Hn are called the terms of the series (1). The
factor groups H1/H0, H2/H1, . . . ,Hn/Hn−1 are called the factors of the series
(1). A series from 1 to G will be called shortly a series of G.
If each term H0,H1, . . . ,Hn−1,Hn of the series (1) happens to be normal
(characteristic) in G, the series (1) will be called a normal (characteristic)
series.
A series
H = J0 ⊴ J1 ⊴ . . . ⊴ Jm−1 ⊴ Jm = G (2)
27.9 Lemma: A series
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn−1 ⊴ Gn = G
of a group G is a composition series of G if and only if all factors Gi/Gi−1
(i = 1,2, . . . ,n) are simple.

Proof: Suppose first that the given series is a composition series of G. By
definition, it is a proper series. So Gi−1 ≠ Gi and all factors Gi/Gi−1 are distinct
from the trivial group (i = 1,2, . . . ,n). If one of the factors, say
Gj/Gj−1, were not simple, Gj/Gj−1 would have a nontrivial proper normal
subgroup, which may be written as H/Gj−1, where Gj−1 ⊴ H ⊴ Gj by
Theorem 21.2. Hence Gj−1 ≠ H ≠ Gj and the given series
has a proper refinement which is obtained by inserting H between Gj−1
and Gj, contrary to our hypothesis that the given series is a composition
series. Hence Gi/Gi−1 are all simple (i = 1,2, . . . ,n).
Conversely, let us assume that all factors Gi/Gi−1 are simple (i = 1,2, . . . ,n).
Then Gi/Gi−1 is not trivial and so Gi−1 ≠ Gi for all i = 1,2, . . . ,n. Thus the
given series is proper. If it were not a composition series, it would have
a proper refinement. To fix the ideas, let us assume that such a refinement
has a term H between Gj and Gj−1, so that Gj−1 ⊴ H ⊴ Gj. By Theorem
21.2, H/Gj−1 would be a nontrivial proper normal subgroup of Gj/Gj−1,
contrary to the hypothesis that all factors, including Gj/Gj−1, are simple.
Hence the given series is a composition series.
(c) We want to find all composition series of Sn for n ≥ 5. For this
purpose, we determine all normal subgroups of Sn.
(a1a2)^((a1b1)(a2b2)) = (a1b1)(a2b2) (a1a2) (a1b1)(a2b2) = (b1b2),
(a1b1)^((a1c)) = (a1c)(a1b1)(a1c) = (b1c),
(d) Not every group has a composition series. For example, ℤ has no
composition series. Indeed, any series of ℤ is of the form
0 ⊴ m1ℤ ⊴ m2ℤ ⊴ . . . ⊴ mnℤ = ℤ, (3)
where m2 | m1, m3 | m2, . . . , mn | mn−1. If m0 is a multiple of m1 and m0 ≠ m1,
then
0 ⊴ m0ℤ ⊴ m1ℤ ⊴ m2ℤ ⊴ . . . ⊴ mnℤ = ℤ
is a proper refinement of (3). Thus any series of ℤ has a proper refinement.
Consequently, no series of ℤ can be a composition series of ℤ.
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn−1 ⊴ Gn = G
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G
27.12 Lemma (Dedekind's modular law): Let G be a group and let
A,B,C be subgroups of G such that A ⊆ C. Then
A(B ∩ C) = AB ∩ C.
(A(B ∩ C) and AB ∩ C are not necessarily subgroups of G.)
B ⊆ BA ⊆ NG(BA),
hence (BA)^b = BA for all b ∈ B.
Then, for any b ∈ B, c ∈ C, we obtain
(BA)^(bc) = [(BA)^b]^c = (BA)^c = B^cA^c = BA^c = BA
since B ⊴ G and A ⊴ C. Thus (BA)^x = BA for all x ∈ BC and BA ⊴ BC.
Using Theorem 21.3 with BC, BA, C in place of G,H,K, respectively, we get
AB ∩ C ⊴ C and C/(AB ∩ C) ≅ C(AB)/AB.
Since AB ∩ C = A(B ∩ C) and C(AB) = (CA)B = CB = BC, this isomorphism
means C/A(B ∩ C) ≅ BC/BA.
U1 ⊴ U2 ≤ G and V1 ⊴ V2 ≤ G.
Then U1(U2 ∩ V1) ⊴ U1(U2 ∩ V2), V1(U1 ∩ V2) ⊴ V1(U2 ∩ V2) and
Here U1E = U1(D12D21) = (U1D12)D21 = U1D21 and E(U1 ∩ D22) = E (because
U1 ∩ D22 ⊆ D12 ⊆ E), so (4) becomes
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn−1 ⊴ Gn = G (g)
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G (h)
are series of G, then there are series (g´) and (h´) of G such that (g´) is a
refinement of (g), (h´) is a refinement of (h), and (g´) and (h´) are equivalent.
Between Gi−1 and Gi we insert the groups Gij := Gi−1(Hj ∩ Gi), j = 0,1, . . . ,m:
Gi−1 = Gi−1(H0 ∩ Gi) ⊴ Gi−1(H1 ∩ Gi) ⊴ . . . ⊴ Gi−1(Hm−1 ∩ Gi) ⊴ Gi−1(Hm ∩ Gi) = Gi,
and, symmetrically, between Hj−1 and Hj we insert the groups
Hij := Hj−1(Gi ∩ Hj), i = 0,1, . . . ,n.
Thus Gi,j−1 ⊴ Gij, Hi−1,j ⊴ Hij and Gij/Gi,j−1 ≅ Hij/Hi−1,j.
factors. Here (g´) is a refinement of (g) and (h´) is a refinement of (h).
Finally, in view of the isomorphisms Gij/Gi,j−1 ≅ Hij/Hi−1,j, the series (g´)
and (h´) are equivalent.
Proof: Let
1 = G0 ⊴ G1 ⊴ . . . ⊴ Gn−1 ⊴ Gn = G (g)
be a proper series of G and let
1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G (h)
(1) (h´´) is a proper series and is a refinement of (h). But (h) has no
proper refinement, because (h) is a composition series. Hence (h´´) is
identical with (h). Thus (g) has a refinement (g´´) which is equivalent to
the composition series (h´´) = (h). Then the factors of (g´´), being iso-
morphic to the composition factors in (h), are all simple groups and (g´´)
itself is a composition series by Lemma 27.9. Therefore any proper
series (g) of G has a refinement (g´´) which is a composition series.
27.18 Definition: A series
H = H0 ⊴ H1 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G
from H to G is said to be an abelian series if all the factors
H1/H0, H2/H1, . . . , Hm/Hm−1
are abelian groups.

1 = H0 ⊴ H1 ⊴ . . . ⊴ Hm−1 ⊴ Hm = G.
is a series of G/N, and, for all i = 1,2, . . . ,m, (HiN/N)/(Hi−1N/N) ≅ HiN/Hi−1N

1 = N0 ⊴ N1 ⊴ . . . ⊴ Nm−1 ⊴ Nm = N
of N and an abelian series
1 = N0 ⊴ N1 ⊴ . . . ⊴ Nm−1 ⊴ Nm = N = H0 ⊴ H1 ⊴ . . . ⊴ Hk−1 ⊴ Hk = G
of G whose factors Hi/Hi−1 (i = 1,2, . . . ,a) are cyclic of order p (Theorem
26.3(2)). Thus G has an abelian series and G is solvable.
The following result will play a crucial role in proving that a polynomial
equation of degree greater than four cannot be solved by radicals.
27.26 Theorem: If n ≥ 5, then Sn is not solvable.
Exercises
1. Let {Gi: i } be a collection of simple groups such that Gi Gi+1 for all
(a) H/K is a chief factor of G if and only if H/K is a minimal normal
subgroup of G/K.
(b) If M is a minimal normal subgroup of G, then M has no
characteristic subgroup except 1 and M.
(c) If G has a composition series, then G has a chief series.
(d) 1 ⊴ V4 ⊴ A4 ⊴ S4 is the unique chief series of S4.
10. Repeat the proof of Schreier's theorem for the two series 1 ⊴ C18 ⊴
C36 and 1 ⊴ C4 ⊴ C12 ⊴ C36 of the cyclic group C36.

12. Prove that, if H,K ⊴ G and G/H, G/K are solvable, so is G/(H ∩ K).

14. Prove that a solvable group has a composition series if and only if it
is finite (cf. Ex. 6).
§28
Finitely Generated Abelian Groups
Proof: (1) Since o(1) = 1 ∈ ℕ, 1 ∈ T(G) and T(G) ≠ ∅. Suppose now a,b are
in T(G), say o(a) = n, o(b) = m (n,m ∈ ℕ). Then (ab)^(nm) = a^(nm)b^(nm) = 1.1 = 1,
so o(ab) | nm, thus ab ∈ T(G); and o(a⁻¹) = n, thus a⁻¹ ∈ T(G). By the
subgroup criterion, T(G) ≤ G.
(2) Since G is abelian, we can build the factor group G/T(G). If T(G)x in
G/T(G) has finite order, say n ∈ ℕ, then (T(G)x)^n = T(G), so T(G)x^n = T(G),
so x^n ∈ T(G), so o(x^n) is finite. Let o(x^n) = m ∈ ℕ. Then x^(nm) = (x^n)^m = 1, so
o(x) | nm. Thus o(x) is finite and x ∈ T(G). It follows that T(G)x = T(G) is
the identity element of G/T(G). Hence every nonidentity element of
G/T(G) has infinite order.
Thus 1 is the only group which is both a torsion group and torsion-free.
Every finite group is a torsion group, but there are also infinite torsion
groups, for example ℚ/ℤ.
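ℚ/ℤ is an infinite torsion group because the coset of a reduced fraction a/b has order b: b copies of it sum to an integer, and no fewer do. A quick sketch (ours; `Fraction` keeps fractions in lowest terms):

```python
from fractions import Fraction

def order_in_Q_mod_Z(q):
    """Order of the coset q + Z in Q/Z: least n >= 1 with n*q an integer."""
    n = 1
    while (n * q).denominator != 1:
        n += 1
    return n

assert order_in_Q_mod_Z(Fraction(1, 2)) == 2
assert order_in_Q_mod_Z(Fraction(3, 7)) == 7
assert order_in_Q_mod_Z(Fraction(4, 6)) == 3   # 4/6 = 2/3 in lowest terms
# every element has finite order, yet the orders are unbounded:
assert all(order_in_Q_mod_Z(Fraction(1, k)) == k for k in range(1, 20))
```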
28.4 Lemma: Let G be an abelian group and g1,g2, . . . ,gr be finitely many
elements of G, not necessarily distinct (r ≥ 1). Let B ≤ G.

(2) Suppose o(gi) = ki for each i = 1,2, . . . ,r. If g ∈ ⟨g1,g2, . . . ,gr⟩, then,
by part (1), g = g1^m1 g2^m2 . . . gr^mr with suitable integers mi. Dividing mi by ki,
we may write mi = kiqi + ti, where qi,ti ∈ ℤ and 0 ≤ ti < ki. Then gi^mi =
(gi^ki)^qi gi^ti = gi^ti and g = g1^t1 g2^t2 . . . gr^tr. Thus
⟨g1,g2, . . . ,gr⟩ ⊆ {g1^t1 g2^t2 . . . gr^tr : 0 ≤ ti < ki for all i = 1,2, . . . ,r}
and
|⟨g1,g2, . . . ,gr⟩| ≤ k1k2. . .kr.

(4) This follows from part (3) when we take A to be G/B and ν to be the
natural homomorphism ν: G → G/B.
(5) Suppose G/B = ⟨Bg1,Bg2, . . . ,Bgr⟩. Let g ∈ G. Then Bg ∈ G/B and, by part
(1) with G/B in place of G and Bgi in place of gi, we have
Bg = (Bg1)^m1 (Bg2)^m2 . . . (Bgr)^mr = B g1^m1 g2^m2 . . . gr^mr for some integers mi.
Hence g = b g1^m1 g2^m2 . . . gr^mr for some b ∈ B and g ∈ ⟨B ∪ {g1,g2, . . . ,gr}⟩. So G =
⟨B, g1,g2, . . . ,gr⟩. If, in addition, B = ⟨b1, . . . ,bs⟩, then
G = ⟨b1, . . . ,bs, g1,g2, . . . ,gr⟩.
(6) This follows from part (5) with a slight change in notation.

We want to show that every element of G has at most one such representation
if and only if {g1,g2, . . . ,gr} is independent. Equivalently, we will
prove that there is an element in G with two different representations if
and only if {g1,g2, . . . ,gr} is not independent. Indeed, there is an element
in G with two different representations if and only if g1^m1 g2^m2 . . . gr^mr =
g1^n1 g2^n2 . . . gr^nr for some integers such that gi^mi ≠ gi^ni for at least one
i ∈ {1,2, . . . ,r}. The latter condition holds if and only if
g1^(m1−n1) g2^(m2−n2) . . . gr^(mr−nr) = 1,
where not all of g1^(m1−n1), g2^(m2−n2), . . . , gr^(mr−nr) are equal to 1, that is, if and only
if {g1,g2, . . . ,gr} is not independent.
then G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩.
Proof: (1) If m1,m2, . . . ,mr are integers such that
g1^m1 g2^m2 . . . gr^mr = 1, (*)

(2) If G/B = ⟨Bg2⟩ × . . . × ⟨Bgr⟩, then G/B = ⟨Bg2, . . . ,Bgr⟩ and
{Bg2, . . . ,Bgr} is independent (Lemma 28.4(7)),
G = ⟨g1,g2, . . . ,gr⟩ (Lemma 28.4(6)),
{g1,g2, . . . ,gr} is independent (Lemma 28.5(1)),
G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩ (Lemma 28.4(7)).
where each ki runs through 0,1, . . . ,mi−1. Our list has thus m1m2. . .mn
entries. Every element of A appears in our list. Two entries a1^k1 a2^k2 . . . an^kn
and a1^s1 a2^s2 . . . an^sn are equal if and only if the entry a1^r1 a2^r2 . . . an^rn, where ri
is such that 0 ≤ ri ≤ mi−1 and ki − si ≡ ri (mod mi), is equal to the
identity element of A. Thus any element of A appears in our list as
many times as 1 does, say t times. The number of entries is therefore
m1m2. . .mn = nt. Since q divides n, we see q | m1m2. . .mn and q divides one
of the numbers m1,m2, . . . ,mn (Lemma 5.16), say q | m1. Let us put m1 =
qh, h ∈ ℤ. By Lemma 11.9(2), a1^h has order
o(a1^h) = o(a1)/(o(a1),h) = m1/(m1,h) = qh/(qh,h) = qh/h = q.
28.7 Theorem: Let G be a finite abelian group and let |G| = p1^a1 p2^a2 . . . ps^as
be the canonical decomposition of |G| into prime numbers (ai > 0).

(2) We must show that G = G1G2. . .Gs and G1. . .Gj−1 ∩ Gj = 1 for all j = 2, . . . ,s
(Theorem 22.12). We put |G|/pi^ai = mi (i = 1,2, . . . ,s). Here the integers
m1,m2, . . . ,ms are relatively prime and there are integers u1,u2, . . . ,us such
that u1m1 + u2m2 + . . . + usms = 1.
(3) By the very definition of Gi = G[pia i], the order of any element in Gi is
a divisor of pia i. Then, by Lemma 28.6, Gi is not divisible by any prime
number q distinct from pi. Thus Gi = pibi for some bi, 0 bi ai. From
p1b1p2b2. . . psbs = G1 G2 . . . Gs = G1 G2 ... Gs = p1a 1 p2a 2 . . . psa s, we get
pibi = Gi = pia i for all i = 1,2, . . . ,s.
(4) Let φ: G → H be an isomorphism. For any g ∈ Gi, we have g^(pi^ai) = 1, so (gφ)^(pi^ai) = (g^(pi^ai))φ = 1φ = 1. Thus gφ ∈ Hi and Giφ ⊆ Hi. Also, if h ∈ Hi, then h = gφ for some g ∈ G and (g^(pi^ai))φ = (gφ)^(pi^ai) = h^(pi^ai) = 1. Thus g^(pi^ai) ∈ Ker φ = 1, so g^(pi^ai) = 1, so g ∈ Gi and h = gφ ∈ Giφ. Hence Hi ⊆ Giφ. We obtain Giφ = Hi. Consequently, the restriction of φ to Gi is an isomorphism Gi → Hi and Gi ≅ Hi for all i = 1,2, . . . ,s.
Conversely, assume |G| = |H| and Gi ≅ Hi for all i = 1,2, . . . ,s. From part (2), we get G = G1 × G2 × . . . × Gs and H = H1 × H2 × . . . × Hs, and Lemma 22.16 gives G ≅ H.
(1) G^n ≤ G.
(2) If G = ⟨g1,g2, . . . ,gr⟩, then G^n = ⟨g1^n,g2^n, . . . ,gr^n⟩.
(3) If G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩, then G^n = ⟨g1^n⟩ × ⟨g2^n⟩ × . . . × ⟨gr^n⟩ and
G/G^n ≅ ⟨g1⟩/⟨g1^n⟩ × ⟨g2⟩/⟨g2^n⟩ × . . . × ⟨gr⟩/⟨gr^n⟩.
(4) Let H be an abelian group. If G ≅ H, then G^n ≅ H^n and G/G^n ≅ H/H^n.
Proof: (1) and (2) Since (ab)^n = a^n b^n for all a,b ∈ G, the mapping
φ: G → G^n
a ↦ a^n
is a homomorphism onto G^n. So G^n = Im φ ≤ G by Theorem 20.6. Also, if G = ⟨g1,g2, . . . ,gr⟩, then G^n = ⟨g1φ,g2φ, . . . ,grφ⟩ = ⟨g1^n,g2^n, . . . ,gr^n⟩ by Lemma 28.4(3).
We put z = g1^(tp^(k−m)) and g = z⁻¹x. Then z ∈ ⟨g1⟩ = B and Bg = Bx (Lemma 10.2(5)). From
x^(p^m) = g1^n = g1^(tp^k) = (g1^(tp^(k−m)))^(p^m) = z^(p^m),
g^(p^m) = (z⁻¹x)^(p^m) = (z^(p^m))⁻¹ x^(p^m) = 1,
we obtain o(g) | p^m. Also p^m = o(Bx) = o(Bg) | o(g). Thus o(g) = p^m. This completes the proof.
We can now describe finite abelian groups.
We choose an element g1 of G such that o(g1) ≥ o(a) for all a ∈ G and put ⟨g1⟩ = B. Since G ≠ 1, we have B ≠ 1. If G = B = ⟨g1⟩, the claim is established, so we suppose B ≠ G. Then G/B is a finite abelian p-group with 1 < |G/B| < |G|. By induction, there are elements Bx2, . . . ,Bxr of G/B, distinct from B1, such that
G/B = ⟨Bx2⟩ × . . . × ⟨Bxr⟩.
Let us put o(Bxi) = p^mi for i = 2, . . . ,r. Using Lemma 28.9, we find gi ∈ G such that Bxi = Bgi and o(gi) = p^mi (i = 2, . . . ,r). Let us write o(g1) = p^m1. Then G/B = ⟨Bg2⟩ × . . . × ⟨Bgr⟩ and, by Lemma 28.5(2),
G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩,
where g2, . . . ,gr are distinct from 1 since Bg2, . . . ,Bgr are distinct from B, and g1 is distinct from 1 since o(g1) ≥ o(a) for all a ∈ G and G ≠ 1. This completes the proof of part (1).
(2) and (3). For convenience, a t-tuple (p^a1,p^a2, . . . ,p^at) will be called a type of a nontrivial finite abelian p-group A if a1 ≥ a2 ≥ . . . ≥ at ≥ 1 and if A has a basis {f1,f2, . . . ,ft} with o(fk) = p^ak (k = 1,2, . . . ,t). We cannot say the type of A, for part (2) is not proved yet. The claim in part (2) is that all types of a nontrivial finite abelian p-group (arising from different bases) are equal.
Let G and H be nontrivial finite abelian p-groups, let (p^m1,p^m2, . . . ,p^mr) be a type of G, arising from a basis {g1,g2, . . . ,gr} of G, and let (p^n1,p^n2, . . . ,p^ns) be a type of H, arising from a basis {h1,h2, . . . ,hs} of H.
If r = s and (p^m1,p^m2, . . . ,p^mr) = (p^n1,p^n2, . . . ,p^ns), then ⟨gi⟩ ≅ C_{p^mi} ≅ ⟨hi⟩ for i = 1,2, . . . ,r and G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩ ≅ ⟨h1⟩ × ⟨h2⟩ × . . . × ⟨hr⟩ = H (Lemma 22.16). This proves the "if" part of (3).
Now the "only if" part of (3), which includes (2) as a particular case (when G = H): we will prove that G ≅ H implies r = s and (p^m1,p^m2, . . . ,p^mr) = (p^n1,p^n2, . . . ,p^ns).
(p^n1,p^n2, . . . ,p^ns) = (p^n1, . . . ,p^nl,p, . . . ,p),    (††)
with s − l entries equal to p, it being understood that the entries p should be deleted when k = r or l = s. By Lemma 28.8(3),
G^p = ⟨g1^p⟩ × ⟨g2^p⟩ × . . . × ⟨gr^p⟩
= ⟨g1^p⟩ × . . . × ⟨gk^p⟩ × 1 × . . . × 1    (r − k factors 1)
= ⟨g1^p⟩ × . . . × ⟨gk^p⟩
with o(gi^p) = p^(mi−1) ≠ 1 for i = 1, . . . ,k. Hence {g1^p, . . . ,gk^p} is a basis and (p^(m1−1), . . . ,p^(mk−1)) is a type of G^p. In the same way, (p^(n1−1), . . . ,p^(nl−1)) is a type of H^p. Here G^p is an abelian p-group with 1 ≤ |G^p| = p^((m1−1)+ . . . +(mk−1)) < p^(m1+m2+ . . . +mr) = |G|. Since G^p ≅ H^p by Lemma 28.8(4), our inductive hypothesis gives
k = l and (p^(m1−1), . . . ,p^(mk−1)) = (p^(n1−1), . . . ,p^(nl−1)).
(p^m1,p^m2, . . . ,p^mr) = (p^n1,p^n2, . . . ,p^nr). This completes the proof.
28.11 Examples: (a) We find all abelian groups of order p^5, where p is a prime number. An abelian group A of order p^5 is determined by its type (p^m1, . . . ,p^mr), where of course p^(m1+ . . . +mr) = |A| = p^5. Since mi ≥ 1 and m1 + . . . + mr = 5, the only possible types are
(p^5), (p^4,p), (p^3,p^2), (p^3,p,p), (p^2,p^2,p), (p^2,p,p,p), (p,p,p,p,p)
and any abelian group of order p^5 is isomorphic to one of
C_{p^5}, C_{p^4} × C_p, C_{p^3} × C_{p^2}, C_{p^3} × C_p × C_p, C_{p^2} × C_{p^2} × C_p, C_{p^2} × C_p × C_p × C_p, C_p × C_p × C_p × C_p × C_p.
In particular, there are exactly seven nonisomorphic abelian groups of order p^5.
the number of nonisomorphic abelian groups of order p^n is the number of partitions of n. Notice that this number depends only on n, not on p.
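This correspondence is easy to check by machine. The following Python sketch (our own illustration, not part of the text) lists the partitions of n; each partition (m1, . . . ,mr) corresponds to a type (p^m1, . . . ,p^mr):

```python
# Each partition of n (a non-increasing tuple of positive integers summing
# to n) corresponds to one isomorphism type of abelian group of order p^n.
def partitions(n, largest=None):
    """Return all partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        return [()]
    result = []
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            result.append((part,) + rest)
    return result

types_of_p5 = partitions(5)
print(len(types_of_p5))   # 7, matching the seven groups listed above
```

For n = 5 this produces exactly the seven exponent patterns (5), (4,1), (3,2), (3,1,1), (2,2,1), (2,1,1,1), (1,1,1,1,1) of Example 28.11(a).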
(c) Let us find all abelian groups of order 324 000 = 2^5·3^4·5^3 (to within isomorphism). An abelian group A of this order is the direct product A2 × A3 × A5, where Ap denotes the Sylow p-subgroup of A (p = 2,3,5). Here A2 has order 2^5 and is isomorphic to one of the seven groups of type
(2^5), (2^4,2), (2^3,2^2), (2^3,2,2), (2^2,2^2,2), (2^2,2,2,2), (2,2,2,2,2).
Likewise there are five possibilities for A3:
(3^4), (3^3,3), (3^2,3^2), (3^2,3,3), (3,3,3,3)
and three possibilities for A5:
(5^3), (5^2,5), (5,5,5).
The 7·5·3 = 105 various direct products A2 × A3 × A5 give us a complete list of nonisomorphic abelian groups of order 324 000.
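The count 7·5·3 can be reproduced for any order by multiplying partition numbers of the exponents in the prime factorization. A Python sketch (the helper names are ours):

```python
# Number of abelian groups of order n = p1^a1 ... ps^as, namely
# p(a1) * ... * p(as), where p() is the partition-counting function.
from functools import lru_cache

@lru_cache(maxsize=None)
def num_partitions(n, largest=None):
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(num_partitions(n - part, part)
               for part in range(min(n, largest), 0, -1))

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def num_abelian_groups(n):
    result = 1
    for exponent in factorize(n).values():
        result *= num_partitions(exponent)
    return result

print(factorize(324000))           # {2: 5, 3: 4, 5: 3}
print(num_abelian_groups(324000))  # 7 * 5 * 3 = 105
```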
G = B × ⟨y1⟩ × ⟨y2⟩ × . . . × ⟨yk⟩.
Proof: Let Y := ⟨y1,y2, . . . ,yk⟩ ≤ G. Then G/B = ⟨By1,By2, . . . ,Byk⟩ and, from Lemma 28.4(5), we obtain G = BY. We will show that G = B × Y and Y = ⟨y1⟩ × ⟨y2⟩ × . . . × ⟨yk⟩.
Finally, since Byi has infinite order, we see that yi also has infinite order and ⟨yi⟩ is an infinite cyclic group (i = 1,2, . . . ,k).
(1) G has a basis, that is, there are elements g1,g2, . . . ,gr in G\1 such that
G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩.
then G = ⟨u1⟩ is a nontrivial cyclic group and the claim is true (with r = 1, g1 = u1).
(see Lemma 28.4(4)) is a nontrivial abelian group, generated by n − 1 elements. Moreover, G/B ≅ (G/⟨u1⟩)/(B/⟨u1⟩) = (G/⟨u1⟩)/T(G/⟨u1⟩) is torsion-free by Lemma 28.1(2). So, by induction,
G = B × ⟨g2⟩ × . . . × ⟨gr⟩.
G = ⟨g1⟩ × ⟨g2⟩ × . . . × ⟨gr⟩.
(2) and (3) For convenience, a natural number r will be called a rank of a finitely generated nontrivial torsion-free abelian group A if A has a basis of r elements. We cannot say the rank of A, for part (2) is not proved yet. The claim in part (2) is that all ranks of a finitely generated nontrivial torsion-free abelian group (arising from different bases) are equal.
Now the "only if" part of (3), which includes (2) as a particular case (when G = H): we will prove that G ≅ H implies r = s. This is easy. Now
is a finite group of order 2^s. If G ≅ H, then G/G^2 ≅ H/H^2 (Lemma 28.8(4)), so 2^r = |G/G^2| = |H/H^2| = 2^s. Hence r = s.
As G = T(G) × I, the finitely generated abelian group G is determined uniquely to within isomorphism by T(G) and I. Now I is determined uniquely to within isomorphism by the integer r(I) (Theorem 28.13(3) and the definition r(1) = 0); and T(G), being a finite abelian group (Theorem 28.15), is determined uniquely to within isomorphism by its Sylow subgroups (Theorem 28.7(4)). Let s be the number of distinct prime divisors of |T(G)| (so s = 0 when T(G) = 1). Each one of the s Sylow subgroups (corresponding to the s distinct prime divisors) is determined uniquely to within isomorphism by its type (Theorem 28.10(3)). Thus the finitely generated abelian group G gives rise to the following system of nonnegative integers.
(i) A nonnegative integer r, namely the rank of G/T(G). Here r = 0 means that G is a finite group. If r ≠ 0, then G = T(G) × I, where I is a direct product of r cyclic groups of infinite order. The subgroup I is not uniquely determined by G, but its isomorphism type is.
(ii) A nonnegative integer s, namely the number of distinct prime divisors of |T(G)|. Here s = 0 means that T(G) = 1 and G is a torsion-free group.
(iii) In case s ≠ 0, a system p1,p2, . . . ,ps of prime numbers, namely the distinct prime divisors of |T(G)|; and for each i = 1,2, . . . ,s, a positive integer ti and ti positive integers mi1,mi2, . . . ,miti, so that (pi^mi1,pi^mi2, . . . ,pi^miti) is the type of the Sylow pi-subgroup of T(G).
Exercises
(b) T(G)/T(H) ≅ HT(G)/H ≤ T(G/H),
and that HT(G)/H need not be equal to T(G/H).
7. Keep the notation of Ex. 6. Prove that the integers o(g1),o(g2), . . . ,o(gr) determine the types of the Sylow p-subgroups of G uniquely, and conversely the types of the Sylow p-subgroups of G completely determine the integers o(g1),o(g2), . . . ,o(gr). (The integers o(g1),o(g2), . . . ,o(gr) are called the invariant factors of G. Two finite abelian groups are thus isomorphic if and only if they have the same invariant factors.)
CHAPTER 3
Rings
§29
Basic Definitions
29.1 Definition: Let R be a nonempty set and let + and . be two binary
operations defined on R. The ordered triple (R,+,.) is called a ring if the
following conditions (ring axioms) are satisfied.
(2) For all a,b,c ∈ R, (a.b).c = a.(b.c).
(D) For all a,b,c ∈ R, there hold
a.(b + c) = a.b + a.c and (b + c).a = b.a + c.a.
The conditions (i) and (1) assert that two binary operations + and . are
defined on R. We shall refer to + as addition and to . as multiplication.
Further, we shall call the element a + b the sum of a and b, and the
element ab the product of a and b. The conditions (i)-(v) say that R forms a group with respect to addition. The identity element 0 of this group will be called the zero element, or simply the zero of R. So 0 is an element of the set R and not necessarily the number zero. The inverse element −a of a ∈ R is called the opposite of a.
(b) A more interesting ring is (ℤ,+,.), where + and . are the usual addition and multiplication of integers.
(c) Let 2ℤ denote the set of even integers. Then (2ℤ,+,.), where + and . are the usual addition and multiplication of integers, is a ring. In the same way, if n ∈ ℕ and nℤ is the set of integers divisible by n, then (nℤ,+,.) is a ring.
(e) Let R := {a/b ∈ ℚ : (a,b) = 1 and 5 ∤ b}. With respect to the usual addition and multiplication of rational numbers, (R,+,.) is a ring.
(f) Let S := {a/b ∈ ℚ : (a,b) = 1 and 6 ∤ b}. With respect to the usual addition and multiplication of rational numbers, (S,+,.) is not a ring. The very first property (i) is not satisfied. For example
1/2 ∈ S, 1/3 ∈ S, but 1/2 + 1/3 = 5/6 ∉ S.
(g) Let p be a prime number and put T = {a/b ∈ ℚ : (a,b) = 1 and p ∤ b}. With respect to the usual addition and multiplication of rational numbers, (T,+,.) is a ring.
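The contrast between (f) and (g) can be probed numerically. A Python sketch (the helper names `in_S`, `in_T` are ours, not the book's): membership asks whether the reduced denominator is divisible by 6, respectively by the prime p.

```python
# Why (f) fails but (g) works: S is not closed under addition, while T is
# closed under + and . (checked here only on a finite sample, for p = 5).
from fractions import Fraction

def in_S(q):                 # S from Example 29.2(f): 6 does not divide b
    return q.denominator % 6 != 0

def in_T(q, p):              # T from Example 29.2(g): p does not divide b
    return q.denominator % p != 0

half, third = Fraction(1, 2), Fraction(1, 3)
assert in_S(half) and in_S(third) and not in_S(half + third)

samples = [Fraction(a, b) for a in range(-4, 5)
           for b in range(1, 10) if b % 5 != 0]
assert all(in_T(x + y, 5) and in_T(x * y, 5)
           for x in samples for y in samples)
```

`Fraction` keeps numbers in lowest terms automatically, which matches the representation (a,b) = 1 used in the text.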
(h) Let R be a ring. A matrix over R is an array (a b; c d) of four elements a,b,c,d of R, arranged in two rows and two columns (here the semicolon separates the rows) and enclosed within parentheses. The set of all matrices over R will be denoted by Mat2(R). If A,B ∈ Mat2(R), we say A is equal to B provided the corresponding entries in A and B are equal, and write A = B in this case. This is clearly an equivalence relation on Mat2(R).
Let A = (a b; c d), B = (e f; g h) ∈ Mat2(R). The sum A + B of A and B is defined to be the matrix (a+e b+f; c+g d+h) and the product AB of A and B is defined to be the matrix (ae+bg af+bh; ce+dg cf+dh).
The proof of Theorem 17.4 remains valid and shows that Mat2(R) is a commutative group under addition. The proof of Theorem 17.6(1),(2),(4) is also valid and establishes the ring axioms (1),(2),(D). So (Mat2(R),+,.) is a ring.
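A minimal Python sketch of Mat2(R) for R = ℤ (our own illustration, not the book's notation): the defining formulas for + and the row-by-column product, with spot checks of associativity and distributivity, and of the failure of commutativity.

```python
# 2x2 matrices over the integers with the operations of Example 29.2(h).
class Mat2:
    def __init__(self, a, b, c, d):
        # entries: (a b)
        #          (c d)
        self.a, self.b, self.c, self.d = a, b, c, d

    def __add__(self, other):
        return Mat2(self.a + other.a, self.b + other.b,
                    self.c + other.c, self.d + other.d)

    def __mul__(self, other):
        return Mat2(self.a * other.a + self.b * other.c,
                    self.a * other.b + self.b * other.d,
                    self.c * other.a + self.d * other.c,
                    self.c * other.b + self.d * other.d)

    def __eq__(self, other):
        return (self.a, self.b, self.c, self.d) == \
               (other.a, other.b, other.c, other.d)

A = Mat2(1, 0, 1, 1)
B = Mat2(0, 0, 1, 1)
C = Mat2(2, 3, 5, 7)
assert (A * B) * C == A * (B * C)      # axiom (2) on a sample
assert A * (B + C) == A * B + A * C    # axiom (D) on a sample
assert A * B != B * A                  # multiplication is not commutative
```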
(i) Let K be the set of all real-valued functions defined on the closed
interval [0,1]. We define operations + and . on K by
(f + g)(x) = f(x) + g(x), (f.g)(x) = f(x)g(x) for all x ∈ [0,1]
(f,g ∈ K). So f + g is that function that maps any x ∈ [0,1] to the sum of the values f(x) and g(x) of the functions f and g at x; and f.g is that function that maps any x ∈ [0,1] to the product of the values f(x) and
g(x). In "f + g", the sign "+" stands for the binary operation + we just
defined, and in "f(x) + g(x)", the sign "+" stands for the usual addition of
real numbers. It is easily verified that (K,+,.) is a ring. The sum f + g and
the product f.g are said to be defined pointwise. The operations + and .
are called pointwise addition and pointwise multiplication.
(j) Let S be any set and let (R,+,.) be any ring. Let L denote the set of all functions from S into R. For f,g ∈ L, we put
Let us find the zero elements of the rings in Example 29.2. This is the identity element of the commutative group R in Example 29.2(a); the number zero in the Examples 29.2(b),(c),(d),(e),(f),(g), except in the case of Example 29.2(d), where the zero element is the residue class 0 ∈ ℤn of 0; the so-called zero matrix (0 0; 0 0) ∈ Mat2(R), where 0 is the zero element of R.
The addition in a ring has all the desirable properties one could wish for:
it is associative, there is an identity element, all elements possess
inverses, and it is also commutative. As for multiplication, only one of
these properties, namely the associativity, is assumed to be satisfied. It
may happen, of course, that multiplication in a ring has some of these
properties. Then we make the following definitions.
(1 0; 1 1).(0 0; 1 1) = (0 0; 1 1) ≠ (0 0; 2 1) = (0 0; 1 1).(1 0; 1 1).
Likewise, a ring with identity is a ring with a multiplicative identity. The
additive identity exists in any ring anyway. Notice that e in Definition
29.4 must be both a right identity and a left identity. Since multiplica-
tion in a ring is not necessarily commutative, we cannot conclude, say,
from
ae = a for all a ∈ R
that the other condition
ea = a for all a ∈ R
also holds. In the case of groups, we proved that a right identity is also a left identity, but in the proof we made use of the existence of inverse elements. We cannot use the same argument in the case of rings, for we do not know anything about the existence of inverse elements. They may or may not exist for all a ∈ R. It is possible that a ring R has an element f such that
af = a for all a ∈ R
but fb ≠ b for some b ∈ R.
In short, R may have a multiplicative right identity which is not a left
identity. If each right identity in a ring fails to be a left identity, then
the ring is not a ring with identity.
In view of this lemma, we can speak of the identity. We shall follow the
convention of writing 1 for the multiplicative identity of a ring with
identity. 1 is therefore an element of the ring under study, and not
necessarily the number one. For instance, in the ring Mat2(ℤ), the element 1 is the matrix (1 0; 0 1), the identity matrix. The ring K of Example 29.2(i) is a ring with identity, and one checks easily that 1 here is the function h: [0,1] → ℝ such that h(x) = 1 (real number one) for all x ∈ [0,1].
(1) a0 = 0 for all a ∈ R.
(2) 0a = 0 for all a ∈ R.
(3) a(−b) = −(ab) for all a,b ∈ R.
(4) (−a)b = −(ab) for all a,b ∈ R.
(5) (−a)b = a(−b) for all a,b ∈ R.
(6) (−a)(−b) = ab for all a,b ∈ R.
The set {0} can be made into a ring if we define + and . in the only
possible way: 0 + 0 = 0 and 0.0 = 0. This is a commutative ring with
identity, the multiplicative identity being the additive identity 0. This
ring is called the null ring.
29.8 Lemma: Let R be a ring with identity 1. If R is not the null ring, then 1 ≠ 0.
(0 0; 0 1) ≠ 0 = (0 0; 0 0) and (1 0; 0 0) ≠ 0, but (0 0; 0 1)(1 0; 0 0) = (0 0; 0 0) = 0.
As a second example, consider the ring K of real-valued functions on
[0,1] with respect to pointwise addition and multiplication (Example
29.2(i)). The zero element in this ring is the function z, where z(x) = 0 for all x ∈ [0,1]. The functions a and b, where
a(x) = 0 if 0 ≤ x ≤ 1/2,  a(x) = 1 if 1/2 < x ≤ 1,
b(x) = 1 if 0 ≤ x < 1/2,  b(x) = 0 if 1/2 ≤ x ≤ 1,
are thus distinct from z, but their pointwise product is z, as a(x)b(x) = 0 for all x ∈ [0,1].
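Both zero-divisor examples can be replayed in a few lines of Python (a sketch under our own helper names, not the book's notation):

```python
# Nonzero elements whose product is zero: first in Mat2(Z) via nested
# lists, then the two step functions a, b on [0,1] from the text.
def mat_mul(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0],
             A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0],
             A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

X = [[0, 0], [0, 1]]
Y = [[1, 0], [0, 0]]
assert mat_mul(X, Y) == [[0, 0], [0, 0]]   # XY = 0 although X != 0 != Y

a = lambda x: 0 if x <= 0.5 else 1
b = lambda x: 1 if x < 0.5 else 0
# the pointwise product a.b is the zero function on a sample of points:
assert all(a(t) * b(t) == 0 for t in [0.0, 0.25, 0.5, 0.75, 1.0])
```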
29.10 Lemma: Let R be a ring with identity. If a is a left zero divisor,
then a does not have a multiplicative left inverse. If a is a right zero
divisor, then a does not have a multiplicative right inverse.
We know that the zero element in a ring distinct from the null ring
cannot have an inverse and we understand from Lemma 29.10 that
being a zero divisor is the very opposite of having an inverse. So if we
want a ring to have the property that every nonzero element in it has a
multiplicative inverse, the ring has to be free from zero divisors.
29.12 Definition: A ring with identity, which is distinct from the null
ring, and in which every nonzero element has a right inverse, is called a
division ring.
Recall that, in any group, right inverses are also left inverses and that they are unique (Lemma 7.3). Hence, in a division ring, every nonzero element has a left inverse as well, and the right and left inverses of an arbitrary element coincide. This will be called the inverse of that element.
For example, the units of ℤ are 1 and −1, so ℤ× = {1,−1}. The units in ℤn are the residue classes a for which there is a b ∈ ℤn such that ab = 1, and this holds if and only if (a,n) = 1. Hence ℤn× = {a ∈ ℤn : (a,n) = 1}, as in §11. We know that ℤ× = {1,−1} and ℤn× are groups under multiplication (Theorem 12.4). These are special cases of the following theorem.
The reader will check easily that, if R is a ring with identity, distinct from the null ring, then R is a division ring if and only if R× = R\{0}. Likewise, if K is a commutative ring with identity, distinct from the null ring, then K is a field if and only if K× = K\{0}.
If ab = ba, then (a + b)^n = Σ(k=0 to n) C(n,k) a^(n−k) b^k, where C(n,k) denotes the binomial coefficient.

We make induction on n. The formula (a + b)^1 = C(1,0)a^1b^0 + C(1,1)a^0b^1 is clearly true. We suppose that the formula is proved when the exponent of a + b is n. Then
(a + b)^(n+1) = (a + b)(a + b)^n = (a + b) Σ(k=0 to n) C(n,k) a^(n−k)b^k
= Σ(k=0 to n) C(n,k) a^(n+1−k)b^k + Σ(k=0 to n) C(n,k) a^(n−k)b^(k+1)
= C(n,0)a^(n+1)b^0 + Σ(k=1 to n) C(n,k)a^(n+1−k)b^k + Σ(k=0 to n−1) C(n,k)a^(n−k)b^(k+1) + C(n,n)a^0b^(n+1)
= C(n+1,0)a^(n+1)b^0 + Σ(k=1 to n) C(n,k)a^(n+1−k)b^k + Σ(k=1 to n) C(n,k−1)a^(n+1−k)b^k + C(n+1,n+1)a^0b^(n+1)
= C(n+1,0)a^(n+1)b^0 + Σ(k=1 to n) [C(n,k) + C(n,k−1)] a^(n+1−k)b^k + C(n+1,n+1)a^0b^(n+1)
= Σ(k=0 to n+1) C(n+1,k) a^(n+1−k)b^k,
using C(n,k) + C(n,k−1) = C(n+1,k).
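The formula can be verified exhaustively in a small commutative ring, say ℤm (a sketch; `binomial_side` is our own helper name):

```python
# Check (a+b)^n = sum C(n,k) a^(n-k) b^k in Z_m, all arithmetic mod m.
from math import comb

def binomial_side(a, b, n, m):
    return sum(comb(n, k) * pow(a, n - k, m) * pow(b, k, m)
               for k in range(n + 1)) % m

m = 36
for a in range(m):
    for b in range(m):
        for n in (1, 2, 5):
            assert pow(a + b, n, m) == binomial_side(a, b, n, m)
```

Note that commutativity of ℤm is what licenses the rearrangements a^(n−k)b^k in the proof; in Mat2(ℤ) the formula already fails for n = 2 when AB ≠ BA.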
Exercises
1. Let X = {a + b√2 : a,b ∈ ℤ} and Y = {a + b·∛2 : a,b ∈ ℤ}. Determine whether X and Y are rings under the usual addition and multiplication of real numbers.
4. Show that the set A = {a/b ∈ ℚ : (a,b) = 1, n ∤ b} is not a ring (under the usual addition and multiplication of rational numbers) if n is a composite number.
5. Prove that ℤn has zero divisors if n is composite, and that ℤn is a field if n is prime.
6. On the group R = ℤ ⊕ ℤ, we define a multiplication by
(a,b).(c,d) = (ac,ad)
for all (a,b),(c,d) ∈ ℤ ⊕ ℤ. Prove that, with this multiplication, R becomes a ring. Show that (1,0) is a left identity in R, but not a right identity; and that (1,0) is a right zero divisor, but not a left zero divisor. Is R a ring with identity?
8. On the group R = ℤn ⊕ ℤn, we define a multiplication by
(a,b).(c,d) = (ac − bd, ad + bc)
for all (a,b),(c,d) ∈ R. Show that R is a commutative ring with identity. Prove that R is a field when n = 3,7,11 and that R is not an integral domain if n = 5,13,17.
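The dichotomy in Ex. 8 can be confirmed by brute force; a Python sketch (assumption: we simply enumerate all pairs, relying on the fact that a finite commutative ring with identity and no zero divisors is a field):

```python
# Z_n x Z_n with (a,b)(c,d) = (ac - bd, ad + bc), i.e. "Gaussian integers
# mod n"; identity element (1,0).
def mul(x, y, n):
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % n, (a * d + b * c) % n)

def has_zero_divisors(n):
    nonzero = [(a, b) for a in range(n) for b in range(n)
               if (a, b) != (0, 0)]
    return any(mul(x, y, n) == (0, 0) for x in nonzero for y in nonzero)

def is_field(n):
    # finite commutative ring with identity: field iff no zero divisors
    return not has_zero_divisors(n)

assert all(is_field(n) for n in (3, 7, 11))
assert all(has_zero_divisors(n) for n in (5, 13, 17))
```

The pattern behind the two lists is whether −1 is a square mod n; e.g. 2^2 ≡ −1 (mod 5) yields the zero divisors (1,2) and (1,3).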
9. Let H = {(a b; −b̄ ā) : a,b ∈ ℂ} ⊆ Mat2(ℂ), where ā, b̄ denote complex conjugates. Prove that, under the usual matrix addition and multiplication, H is a division ring (cf. §17, Ex. 14).
10. Let R1,R2, . . . ,Rn be rings. Prove that the group R1 ⊕ R2 ⊕ . . . ⊕ Rn becomes a ring if multiplication is defined by
(r1,r2, . . . ,rn)(s1,s2, . . . ,sn) = (r1s1,r2s2, . . . ,rnsn)
for all (r1,r2, . . . ,rn),(s1,s2, . . . ,sn) ∈ R1 ⊕ R2 ⊕ . . . ⊕ Rn. Moreover, prove that R1 ⊕ R2 ⊕ . . . ⊕ Rn is a commutative ring if and only if each Rk is; and that R1 ⊕ R2 ⊕ . . . ⊕ Rn is a ring with identity if and only if each Rk is. The ring R1 ⊕ R2 ⊕ . . . ⊕ Rn is called the direct sum of R1,R2, . . . ,Rn.
§30
Subrings, Ideals and Homomorphisms
(ii) −a ∈ S for all a ∈ S,
(iii) a.b ∈ S for all a,b ∈ S.
(2) Assume R is not commutative. Then there are a,b ∈ R with ab ≠ ba. The point is that all such pairs a,b may be outside S, and that st = ts may hold for all s,t ∈ S. For example, R = Mat2(ℤ) is not commutative, but S =
(4) The point is that there may be an e in S such that es = se = s for all s in S, but er = re = r need not be true for all r ∈ R, i.e., er0 ≠ r0 or r0e ≠ r0 for a particular r0 in R. As an example, consider R = ℤ ⊕ ℤ, on which addition and multiplication are defined by declaring
(a,b) + (c,d) = (a + c,b + d)
(a,b)(c,d) = (ac,ad)
for all (a,b),(c,d) ∈ R, and which is easily verified to be a ring with respect to these operations. If (a,b) ∈ R is a left identity element of R, so that (a,b)(x,y) = (x,y) for all (x,y) ∈ R, then (ax,ay) = (x,y) for all (x,y) ∈ R, thus a = 1. But (1,b) is not a right identity element of R, because (x,y)(1,b) = (x,xb) ≠ (x,y) for any (x,y) ∈ R with y ≠ xb. Thus R is a ring without an identity. However, S = {(a,0) : a ∈ ℤ} is a subring of R with an identity (1,0) ∈ S, as (1,0)(a,0) = (a,0) = (a,0)(1,0) for any (a,0) ∈ S.
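The one-sided behaviour of (1,0) in this ring is easy to probe computationally; a Python sketch (our own illustration, checked only on a finite sample of pairs):

```python
# Z x Z with (a,b)(c,d) = (ac, ad): (1,0) is a left identity and a right
# zero divisor, but neither a right identity nor a left zero divisor.
def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c, a * d)

samples = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
assert all(mul((1, 0), s) == s for s in samples)        # left identity
assert mul((2, 3), (1, 0)) == (2, 0)                    # not a right identity
assert mul((0, 5), (1, 0)) == (0, 0)                    # right zero divisor
assert all(mul((1, 0), s) != (0, 0)                     # not a left one
           for s in samples if s != (0, 0))
```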
(6) If R has zero divisors, it may happen that all zero divisors fall outside S, and in this case S has no zero divisors. For instance, the ring R = Mat2(ℤ) has zero divisors, but its subset S = {(a 0; 0 0) : a ∈ ℤ} is a subring of R with no zero divisors. For if s,t ∈ S and st = 0, then s = (a 0; 0 0) and t = (b 0; 0 0) for some a,b ∈ ℤ, and st = 0 means (a 0; 0 0)(b 0; 0 0) = (ab 0; 0 0) = (0 0; 0 0), so ab = 0, so a = 0 or b = 0, that is, s = 0 or t = 0.
(7) and (8) Consider the division ring ℚ, which is a field as well. Its subring ℤ is neither a division ring nor a field.
(9) A subring S ≠ {0} of an integral domain R is commutative by (1) and has no zero divisors by (5). Hence S is an integral domain if and only if S has an identity. We claim S has an identity if and only if the identity of R belongs to S. Indeed, if S contains the identity element 1R of R, then of course 1R is an identity element of S. Conversely, if S has an identity element e, then ee = e = 1R e, so ee − 1R e = 0, so (e − 1R)e = 0 and, since e ≠ 0 (for S ≠ {0} by assumption) and R has no zero divisors, e − 1R = 0 and hence e must be equal to 1R.
r1 = r2 + s1, t1 = t2 + s2, s1,s2 ∈ S ⟹ r1t1 − r2t2 ∈ S (for all r1,r2,t1,t2 ∈ R),
i.e., if and only if
s1,s2 ∈ S ⟹ (r2 + s1)(t2 + s2) − r2t2 ∈ S (for all r2,t2 ∈ R),
i.e., if and only if
s1,s2 ∈ S ⟹ r2s2 + s1t2 + s1s2 ∈ S (for all r2,t2 ∈ R),
30.5 Theorem: Let R be a ring and S a subgroup of R under addition. The multiplication on the set R/S of right (and left) cosets of S, given by
(r + S)(u + S) = ru + S for all r,u ∈ R,
is well defined if and only if S is an ideal of R.
After giving some examples of ideals, we will prove that the multiplica-
tion on R/S makes R/S into a ring.
30.6 Examples: (a) In any ring R, the set {0} is an ideal (Lemma
29.6(1) and (2)). The set R itself is also an ideal of R since R is closed
under multiplication.
(e) ℤ is not an ideal of ℚ, since, for example, 1 ∈ ℤ, 1/2 ∈ ℚ, but 1.(1/2) = 1/2 ∉ ℤ.
Now let S1 = {(c 0; 0 0) : c ∈ ℤ} and S2 = {(a 0; 0 b) : a,b ∈ ℤ}. It is easy to see that S1 and S2 are subrings of Mat2(ℤ) and of course S1 ⊆ S2. Here S1 is an ideal of S2, because (a 0; 0 b)(c 0; 0 0) = (c 0; 0 0)(a 0; 0 b) = (ac 0; 0 0) ∈ S1 for any (a 0; 0 b) ∈ S2 and (c 0; 0 0) ∈ S1. On the other hand, S1 is not an ideal of Mat2(ℤ), because, for example, (1 0; 0 0) ∈ S1, (1 1; 0 0) ∈ Mat2(ℤ) and yet (1 0; 0 0)(1 1; 0 0) = (1 1; 0 0) ∉ S1. Thus S1 is an ideal of S2 but not an ideal of Mat2(ℤ). This shows that "idealness" is not an intrinsic property of a subring. A subring is never merely "an ideal", but an ideal of a ring that has to be clearly specified. Compare this with Example 18.5(i).
{za + ua + at + Σ(i=1 to n) ri a si : z ∈ ℤ, u,t,ri,si ∈ R, n ∈ ℕ}
(cf. Lemma 24.2). If R has an identity, this ideal can be written more simply as
{Σ(i=1 to n) ri a si : ri,si ∈ R, n ∈ ℕ}.
{za + ra : z ∈ ℤ, r ∈ R}.
{ra : r ∈ R} = {ar : r ∈ R}
30.8 Definition: Let A be an ideal of a ring R. The ring R/A of Theorem
30.7 is called the factor ring of R with respect to A, or the factor ring R
by A, or the factor ring R mod(ulo) A. Other names for R/A are: "quotient
ring", "difference ring", "residue class ring".
Ideals are the subrings with respect to which we can build factor rings,
just as normal subgroups are the subgroups with respect to which we
can build factor groups. We know that normal subgroups are exactly the
kernels of homomorphisms. We now show that ideals, too, are the ker-
nels of homomorphisms.
The operations on the left-hand sides are the operations on R, and those on the right-hand sides are the operations on R1. If the operations on R1 were denoted by ⊕ and ⊙, the equations would read (a + b)φ = aφ ⊕ bφ and (ab)φ = aφ ⊙ bφ.
= (rφ)ψ.(sφ)ψ
= r(φψ).s(φψ)
for all r,s ∈ R, so φψ does preserve multiplication and hence is a ring homomorphism.
Proof: The natural mapping ν: R → R/A is a group homomorphism from R onto R/A and Ker ν = A by Theorem 20.12. So we need only show that ν is a ring homomorphism, i.e., that ν preserves multiplication. This follows from the very definition of multiplication in R/A: we have
(rs)ν = rs + A = (r + A)(s + A) = rν.sν
for all r,s ∈ R. So ν is a ring homomorphism.
(2) We know that φ⁻¹ is a group isomorphism (Lemma 20.11(2)). We must also show that φ⁻¹ preserves products. For any x,y ∈ R1, we must show (xy)φ⁻¹ = xφ⁻¹.yφ⁻¹. Since φ is onto, there are a,b ∈ R such that aφ = x and bφ = y. Now a and b are unique with this property, for φ is one-to-one, and a = xφ⁻¹, b = yφ⁻¹. This is the definition of the inverse mapping. Since φ is a homomorphism, we have
(ab)φ = aφ.bφ
(ab)φ = xy
ab = (xy)φ⁻¹
xφ⁻¹.yφ⁻¹ = (xy)φ⁻¹
for all x,y ∈ R1. So φ⁻¹: R1 → R is a ring homomorphism and consequently φ is a ring isomorphism.
[Diagrams (a) and (b): the homomorphism φ: R → R1 factors through the factor ring R/Ker φ.]
Proof: From Theorem 20.15 and its proof, we know that the mapping
ψ: R/Ker φ → R1
r + Ker φ ↦ rφ
is a ring homomorphism (Theorem 30.17) and Im ψ = Im φ (see the proof of Theorem 20.16). Thus ψ is a one-to-one ring homomorphism onto Im φ and therefore R/Ker φ ≅ Im φ.
[Diagram: the corresponding subrings S of R (between Ker φ and R) and S1 of R1 (between 1 and R1), with the factor ring R1/S1.]
S = {r ∈ R : rφ ∈ U}
with Ker φ ⊆ S and Sφ = U. It remains to show that S is a subring of R. We need only check that S is closed under multiplication, and this is easy: if r,s ∈ S, then rφ, sφ ∈ U, then rφ.sφ ∈ U, then (rs)φ ∈ U, then rs ∈ S and S is multiplicatively closed.
(7) Assume that S is an ideal of R and S1 is an ideal of R1. From the ring homomorphism R → R1/S1 obtained by following φ with the natural homomorphism R1 → R1/S1, we get
30.20 Theorem: Let A be an ideal of R. The subrings of R/A are given by S/A, where S runs through the subrings of R containing A. In other words, for each subring U of R/A, there is a unique subring S of R such that A ⊆ S and U = S/A. When U1 and U2 are subrings of R/A, say with U1 = S1/A and U2 = S2/A, where S1, S2 are subrings of R containing A, then U1 ⊆ U2 if and only if S1 ⊆ S2. Furthermore, S/A is an ideal of R/A if and only if S is an ideal of R. In this case
(R/A)/(S/A) ≅ R/S (ring isomorphism).
of R/A is of the form S1 = Sν = {sν ∈ R/A : s ∈ S} = {s + A ∈ R/A : s ∈ S} = S/A for some subring S of R containing Ker ν = A (notice that S/A is meaningful, for A is an ideal of S when A ⊆ S and S is a subring of R). We know that U1 = S1ν ⊆ S2ν = U2 if and only if S1 ⊆ S2 (Theorem 30.19(2),(3)). Finally, S/A = Sν is an ideal of R/A if and only if S is an ideal of R, in which case (R/A)/(S/A) ≅ R/S (Theorem 30.19(6),(7)).
Exercises
1. Let R be a ring. The center of R is defined to be the set
Z(R) = {z R: za = az for all a R}.
Is Z(R) a subring or an ideal of R?
6. Show that, if K is a field, then {0} and K are the only ideals of K.
8. Let R be a ring and let End(R) be the set of all ring homomorphisms from R into R. For any φ,ψ ∈ End(R), we define φ + ψ: R → R by
r(φ + ψ) = rφ + rψ.
Show that φ + ψ ∈ End(R) and that (End(R),+,∘) is a ring (∘ is the composition of functions).
tion of functions).
AB ⊆ P ⟹ A ⊆ P or B ⊆ P
is valid (see Ex. 10). Prove the following statements.
(a) Let P be an ideal of R and P ≠ R. If, for any a,b ∈ R,
ab ∈ P ⟹ a ∈ P or b ∈ P,
then P is a prime ideal of R.
(b) Let R be commutative. If P is a prime ideal of R, then
ab ∈ P ⟹ a ∈ P or b ∈ P
for any a,b ∈ R.
(c) {0} is a prime ideal of any integral domain.
(d) Let R be a commutative ring with identity and P an ideal of R.
Then P is a prime ideal of R if and only if R/P is an integral domain.
12. Let R be a ring. An ideal (resp. right ideal, resp. left ideal) M of R is said to be a maximal ideal (resp. right ideal, resp. left ideal) of R if M ≠ R and if there is no ideal (resp. right ideal, resp. left ideal) N of R such that M ⊊ N ⊊ R. Prove the following statements.
(a) If R is a commutative ring with identity, then every maximal ideal of R is prime.
(b) If R is a ring with identity, distinct from the null ring, and if M is an ideal of R such that R/M is a division ring, then M is maximal.
(c) If R is a commutative ring with identity and M a maximal ideal of R, then R/M is a field.
(d) Find a noncommutative ring R with identity and a maximal
ideal M of R such that R/M is not a division ring.
14. Let R be a commutative ring. Show that the set N of all nilpotent ele-
ments in R is an ideal of R and that the factor ring R/N has no nilpotent
elements other than 0.
15. Find rings R,S with identities 1R, 1S respectively and a ring homomorphism φ: R → S such that (1R)φ ≠ 1S.
17. The notation being as in §29, Ex. 7, prove that the mapping r ↦ (r,0) is a one-to-one ring homomorphism from R into S.
§31
Field of Fractions of an Integral Domain
Proof: (cf. Lemma 9.3) Let D be an integral domain with finitely many
elements. We are to show that every nonzero element of D has a multi-
plicative inverse in D.
Let a ∈ D, a ≠ 0. Since D is finite, the elements
All these carry over to the more general case of an arbitrary integral domain D in place of ℤ, and give rise to a field F which is related to D in the same way as ℚ is related to ℤ. The elements of F will be like "fractions" of elements of D. We introduce them in the next two lemmas.
31.2 Lemma: Let D be an integral domain and put
S := {(a,b) : a,b ∈ D, b ≠ 0} = D × (D\{0}).
We define a relation ~ on S by declaring
(a,b) ~ (c,d) if and only if ad = bc
for all (a,b),(c,d) ∈ S. Then ~ is an equivalence relation on S.
Proof: (i) For all (a,b) ∈ S, we have (a,b) ~ (a,b) since ab = ba. So ~ is reflexive.
So ~ is an equivalence relation on S.
We must show that + and . are well defined operations on F. This means we must show that the implication
[a:b] = [x:y], [c:d] = [z:u] ⟹ [ad + bc : bd] = [xu + yz : yu], [ac : bd] = [xz : yu]
holds, i.e., that
(a,b) ~ (x,y), (c,d) ~ (z,u) ⟹ (ad + bc,bd) ~ (xu + yz,yu), (ac,bd) ~ (xz,yu).
Indeed, using ay = bx and cu = dz, we compute
(ad + bc)yu = ad.yu + bc.yu = ay.du + by.cu = bx.du + by.cu = bx.du + by.dz = bd.xu + bd.yz = bd(xu + yz) and ac.yu = ay.cu = bx.dz = bd.xz.
(iii) [0:1] is a right additive identity since [a:b] + [0:1] = [a1 + b0 : b1] = [a:b] for any [a:b] ∈ F. (Notice that [0:1] = [0:d] for all d ∈ D, d ≠ 0.)
(iv) Any [a:b] ∈ F has a right additive inverse: [−a:b] is the opposite of [a:b], for [a:b] + [−a:b] = [ab + b(−a) : b^2] = [0:b^2] = [0:1].
(2) . is associative since for any [a:b], [c:d], [e:f] ∈ F, we have
([a:b].[c:d]).[e:f] = [ac:bd].[e:f]
= [(ac)e : (bd)f]
= [a(ce) : b(df)]
= [a:b].[ce:df]
= [a:b].([c:d].[e:f]).
(5) For all [a:b] ∈ F\{0}, we show that [b:a] is a multiplicative inverse of [a:b]. First of all, since [a:b] ≠ [0:1] in F,
(a,b) is not equivalent to (0,1) in S,
a1 ≠ b0,
a ≠ 0,
(b,a) ∈ S
and [b:a] is an element of F. Secondly, [a:b].[b:a] = [ab:ba] = [ab:ab] = [1:1] = multiplicative identity of F. Thus [b:a] is a multiplicative inverse of [a:b] ≠ [0:1] in F.
31.5 Theorem: Let D be an integral domain and let (F,+,.) be the field of
Theorem 31.4. Then D is isomorphic to a subring of F.
From now on, we shall write a/b for [a:b]. The elements of F will be called fractions (of elements from D). Furthermore, we identify the integral domain D with its image under the mapping in Theorem 31.5. Thus we write a instead of a/1 and regard D as a subring of F. Then the inverse of b ∈ D ⊆ F is 1/b ∈ F and that of a/b is b/a (here a,b ∈ D, a,b ≠ 0). With these notations, calculations are carried out in the usual way.
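The whole construction can be sketched in a few lines of Python for D = ℤ (our own illustration; the class name `Frac` is hypothetical): classes [a:b] compared via ad = bc, with the operations of Theorem 31.4.

```python
# Field of fractions of Z: pairs (a,b) with b != 0, equality ad = bc,
# sum [ad+bc : bd], product [ac : bd], inverse [b : a].
class Frac:
    def __init__(self, a, b):
        assert b != 0
        self.a, self.b = a, b

    def __eq__(self, other):            # [a:b] = [c:d]  iff  ad = bc
        return self.a * other.b == self.b * other.a

    def __add__(self, other):
        return Frac(self.a * other.b + self.b * other.a,
                    self.b * other.b)

    def __mul__(self, other):
        return Frac(self.a * other.a, self.b * other.b)

    def inverse(self):
        assert self != Frac(0, 1)       # only nonzero classes are invertible
        return Frac(self.b, self.a)

assert Frac(2, 6) == Frac(1, 3)         # well defined up to equivalence
assert Frac(1, 2) + Frac(1, 3) == Frac(5, 6)
assert (Frac(1, 2) * Frac(2, 6)).inverse() == Frac(6, 1)
```

Note that no reduction to lowest terms is ever performed: equality of classes does all the work, exactly as in the proof of Lemma 31.2.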
Proof: We construct an isomorphism from F onto a subring of K. The elements of F are fractions a/b, where a,b ∈ D and b ≠ 0. Regarded as an element of K, b has an inverse b⁻¹ in K, so ab⁻¹ ∈ K. Let
φ: F → K
a/b ↦ ab⁻¹.
φ is a well defined mapping, for if a/b = c/d (a,b,c,d ∈ D, b ≠ 0 ≠ d), then ad = bc, so ad.b⁻¹d⁻¹ = bc.b⁻¹d⁻¹, so ab⁻¹ = cd⁻¹, so (a/b)φ = (c/d)φ. Now
(a/b + c/d)φ = ((ad + bc)/bd)φ = (ad + bc)(bd)⁻¹
= (ad + bc)d⁻¹b⁻¹ = ab⁻¹ + cd⁻¹ = (a/b)φ + (c/d)φ
and (a/b . c/d)φ = (ac/bd)φ = (ac)(bd)⁻¹ = ac.d⁻¹b⁻¹ = ab⁻¹.cd⁻¹ = (a/b)φ.(c/d)φ
for any two fractions a/b, c/d in F, and φ is a ring homomorphism. Here φ is one-to-one because Ker φ = {0}, for a/b ∈ Ker φ implies (a/b)φ = 0, so ab⁻¹ = 0, so a = (ab⁻¹)b = 0b = 0, so a/b = 0/b = 0/1 = 0. Hence F is isomorphic to the subring Im φ of K (Theorem 30.18).
Exercises
Show that ~ is an equivalence relation on R × M. Denote the equivalence
class of (r,m) by r/m and define, on the set M^{-1}R of all equivalence classes,
addition and multiplication by
r/m + r´/m´ = (rm´ + mr´)/mm´ and (r/m).(r´/m´) = rr´/mm´.
Prove that M^{-1}R is a commutative ring with identity under these
operations.
5. Discuss the rings in Example 29.2(e),(g) under the light of Ex. 3 and 4.
§32
Divisibility Theory in Integral Domains
(4) If α | β, then α | βγ.
(5) If α | β and α | γ, then α | β + γ.
(6) If α | β and β | γ, then α | γ.
(7) If α | β and α | γ, then α | βδ + γε.
(8) If α | β1, α | β2, . . . , α | βs, then α | β1γ1 + β2γ2 + . . . + βsγs.
(9) If α ≠ 0, then α | 0.
(10) 1 | α and −1 | α.
Proof: The claims are proved exactly as in the proof of Lemma 5.2.
Proof: The claims are proved exactly as in the proof of Lemma 5.2.
where ε, ε´ are units in D. So α = εε´γ, with εε´ a unit in D (Theorem 29.15),
thus α ~ γ and ~ is transitive.
The relation α | β holds if and only if the relation α1 | β1 holds for any
associate α1 of α and for any associate β1 of β. In other words, as far as
divisibility is concerned, associate elements play the same role.
(d) We put ℤ[i] := {a + bi : a,b ∈ ℤ}. One easily checks that ℤ[i] is a
subring of ℂ and that ℤ[i] is an integral domain. The elements of ℤ[i] are
called gaussian integers (after C. F. Gauss (1777-1855), who introduced
them in his investigations about the so-called biquadratic reciprocity
law).
(e) We put ω = (−1 + √3 i)/2. Thus ω = cos(2π/3) + i sin(2π/3). By de Moivre's
theorem, ω² = cos(4π/3) + i sin(4π/3) = (−1 − √3 i)/2 and
ω³ = cos 2π + i sin 2π = 1. So ω³ − 1 = 0, so (ω − 1)(ω² + ω + 1) = 0. Since
ω − 1 ≠ 0, we conclude ω² + ω + 1 = 0, which can also be verified directly.
From ω³ = 1, we obtain ω⁴ = ω, whence (ω²)² + ω² + 1 = 0.
for all a + bω, c + dω ∈ ℤ[ω]. (The ring ℤ[ω] was introduced independently
by C. G. J. Jacobi (1804-1851) and by G. Eisenstein (1823-1852) in their
investigations about the so-called cubic reciprocity law.)
rather than irreducible, but the term "prime" is reserved for another
property (Definition 32.20).
When the sequence (s) stops after a finite number of steps, we obtain an
irreducible divisor of α. However, we do not know that the sequence (s)
ever terminates. In the case of ℤ, the absolute values of the αi, which are
nonnegative integers, get smaller and smaller and, since there are
finitely many nonnegative integers less than |α|, the sequence (s) does
come to an end. But this argument cannot be extended to the general
case, for there is no absolute value concept. Let us suppose, however,
that there is associated a nonnegative integer d(αi) to each αi in such a
way that d(αi+1) < d(αi). If this is possible, we can conclude that the
sequence (s) does terminate.
Proof: (Cf. Theorem 5.3; note that γ and ρ are not claimed to be unique.)
The elements α, β of ℤ[i] (resp. of ℤ[ω]) are complex numbers, and β ≠ 0.
Thus α/β ∈ ℂ. Let us write
α/β = x + yi (resp. α/β = x + yω),
let a, b be integers nearest to x, y, so that |x − a| ≤ 1/2 and |y − b| ≤ 1/2,
and put γ = a + bi, ρ = α − γβ (resp. γ = a + bω, ρ = α − γβ). Then
N(ρ) = N(α − γβ) = N(β)N(α/β − γ) = N(β)N((x + yi) − (a + bi))
= N(β)N((x − a) + (y − b)i) = N(β)[(x − a)² + (y − b)²]
≤ N(β)[|x − a|² + |y − b|²] ≤ N(β)[(1/2)² + (1/2)²] = N(β)(2/4) < N(β)
in ℤ[i], and likewise
N(ρ) = N(β)N((x + yω) − (a + bω)) = N(β)[(x − a)² − (x − a)(y − b) + (y − b)²]
≤ N(β)[|x − a|² + |x − a||y − b| + |y − b|²] ≤ N(β)[(1/2)² + (1/2)(1/2) + (1/2)²]
= N(β)(3/4) < N(β)
in ℤ[ω]. This completes the proof.
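The division with remainder carried out in this proof can be sketched for ℤ[i]. This is an illustrative aside, not part of the text: Gaussian integers are modeled as Python complex numbers with integer parts, and the names gauss_divmod and N are ours.

```python
def N(z):
    """The norm N(x + yi) = x^2 + y^2."""
    return z.real ** 2 + z.imag ** 2

def gauss_divmod(alpha, beta):
    """alpha = gamma*beta + rho in Z[i] with N(rho) < N(beta):
    round the real and imaginary parts of alpha/beta to nearest integers."""
    assert beta != 0
    q = alpha / beta                  # alpha/beta = x + yi in C
    gamma = complex(round(q.real), round(q.imag))
    rho = alpha - gamma * beta
    return gamma, rho
```

For instance, dividing 7 + 5i by 2 + 3i yields γ = 2 − i and ρ = i, and indeed N(ρ) = 1 < 13 = N(2 + 3i).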
N(ρ) = N(α − γβ) = N(β)N(α/β − γ) = N(β)N((x − a) + (y − b)√5 i), where now
N(α/β − γ) = (x − a)² + 5(y − b)² ≤ (1/2)² + 5(1/2)² = 3/2 (†)
instead of N(α/β − γ) < 1 as in ℤ[i], ℤ[ω],
as claimed.
The first condition (i) assures that the d-value of a divisor α of β ∈ D\{0} is
less than or equal to the d-value of β. It follows that d(α) = d(α´) when-
ever α and α´ are associate. Using (ii) repeatedly, the analog of the
Euclidean algorithm is seen to be valid, and the last nonzero remainder
is a greatest common divisor. It will be a good exercise for the reader to
prove this result.
When A = {0}, we clearly have A = D0, and the claim is true. Assume
now A ≠ {0}. Then U = {d(α) ∈ ℕ ∪ {0} : α ∈ A, α ≠ 0} is a nonempty
subset of the set of nonnegative integers. Let m be the smallest integer
in U. Then m = d(δ) for some δ ∈ A, δ ≠ 0; and d(δ) ≤ d(α) for all α ∈ A,
α ≠ 0.
necessarily ρ = 0 and α = κδ ∈ Dδ. This shows α ∈ Dδ for all α ∈ A, pro-
vided α ≠ 0. Since 0 = 0δ ∈ Dδ as well, we get A ⊆ Dδ. Thus A = Dδ.
B0 ⊆ B1 ⊆ B2 ⊆ B3 ⊆ . . .
of ideals in D consists of finitely many distinct terms. An integral domain
satisfying the ascending chain condition is also called a noetherian domain
(in honor of Emmy Noether (1882-1935)).
370
A0 ⊆ A1 ⊆ A2 ⊆ A3 ⊆ . . .
be a chain of ideals of D. We must show there is an integer k such that
Ak = Ak+1 = Ak+2 = . . . .
Using Theorem 32.14, we shall prove the analog of Theorem 5.13 for an
arbitrary principal ideal domain.
of ideals of D (here Dαi ⊂ Dαi+1 because αi+1 is a proper divisor of αi).
Since D is noetherian (Theorem 32.14), this chain breaks off: the chain
consists only of the ideals
Dα = Dα0 ⊂ Dα1 ⊂ Dα2 ⊂ . . . ⊂ Dαk,
say. Hence
α = α0, α1, α2, . . . , αk
are the only elements in the sequence (s). We claim that αk is irreducible
in D. Otherwise, there would be proper divisors αk+1 and βk+1 of αk with
αk = αk+1βk+1, and the sequence (s) would contain the term αk+1 after αk,
and would not terminate with the term αk, a contradiction. Hence αk is
an irreducible divisor of α. We proved that every element in D, which
is neither zero nor a unit, has an irreducible divisor in D.
32.16 Definition: Let D be an arbitrary integral domain and let α, β ∈ D,
not both zero. An element δ of D is called a greatest common divisor of
α and β if
(i) δ | α and δ | β,
(ii) for all δ1 in D, if δ1 | α and δ1 | β, then δ1 | δ.
Notice that any associate of δ above satisfies the same conditions and
hence any associate of a greatest common divisor of α and β is also a
greatest common divisor of α and β. It is seen easily that any two great-
est common divisors of α and β, if α and β have a greatest common divi-
sor at all, are associates. So a greatest common divisor of α and β is not
uniquely determined and we have to say a greatest common divisor, not
the greatest common divisor.
Proof: (cf. Theorem 5.4) As in the proof of Theorem 5.4, we consider the
set A := {αξ + βη : ξ, η ∈ D}. A is a nonempty subset of D. We claim that A is
an ideal of D. To prove this, let γ1, γ2 be arbitrary elements of A. Then
γ1 = αξ1 + βη1, γ2 = αξ2 + βη2 for some ξ1, ξ2, η1, η2 ∈ D. Hence
γ1 + γ2 = (αξ1 + βη1) + (αξ2 + βη2) = α(ξ1 + ξ2) + β(η1 + η2) ∈ A,
−γ1 = −(αξ1 + βη1) = α(−ξ1) + β(−η1) ∈ A,
κγ1 = κ(αξ1 + βη1) = α(κξ1) + β(κη1) ∈ A for any κ ∈ D.
Since α, β are not both equal to zero, A ≠ {0}. Now D is a principal ideal
domain, so A = Dδ for some δ ∈ D, and A ≠ {0} implies δ ≠ 0. Also, since
δ = 1δ ∈ Dδ = A, there are ξ0, η0 ∈ D with δ = αξ0 + βη0. We prove now that
δ is a greatest common divisor of α and β.
get (since ) against our hypothesis . Thus is a unit and 1,
as claimed.
Proof: Omitted.
32.22 Theorem: Let D be a principal ideal domain. Every element of D,
which is not zero or a unit, can be expressed as a product of irreducible
elements of D in a unique way, apart from the order of the factors and
the ambiguity among associate elements.
π1π2 . . . πr = α = κ1κ2 . . . κs.
Now assume r ≥ 2 and that the theorem is proved for r − 1. This means,
whenever we have an equation
π1´π2´ . . . πr−1´ = κ1´κ2´ . . . κt´
with irreducible elements on both sides, then r − 1 = t and the factors on
the left are, in some order, associates of the factors on the right. From
π1π2 . . . πr−1(εκs) = α = κ1κ2 . . . κs
we obtain
π1π2 . . . (πr−1ε) = κ1κ2 . . . κs−1
and by induction, we get
r − 1 = s − 1;
π1, π2, . . . , (πr−1ε) are, in some order, associates of κ1, κ2, . . . , κs−1.
Hence r = s
and π1, π2, . . . , πr−1 are, in some order, associates of κ1, κ2, . . . , κs−1;
and πr is associate to κs. This completes the proof.
α = π1π2 . . . πr,
β = π1´π2´ . . . πs´, γ = π1´´π2´´ . . . πt´´,
π1´´π2´´ . . . πt´´ = γ = αβ = π1π2 . . . πr . π1´π2´ . . . πs´
There is the following generalization of Theorem 32.22. If D is an
integral domain in which every nonzero, nonunit element can be written
as a product of finitely many irreducible elements, and if every
irreducible element in D is prime, then D is a unique factorization
domain. The proof of Theorem 32.22 is valid in this more general case.
There are unique factorization domains which are not principal ideal
domains and there are principal ideal domains which are not Euclidean
domains.
(α + Dπ)(β + Dπ) = αβ + Dπ = π + Dπ = 0 + Dπ = zero element of D/Dπ.
Thus α + Dπ and β + Dπ are zero divisors in D/Dπ and D/Dπ cannot be a
field.
Exercises
5. Let τ = (1 + √7 i)/2. Show that ℤ[τ] := {a + bτ : a,b ∈ ℤ} is a Euclidean
domain.
11. Prove: an integral domain D is a principal ideal domain if and only if
there is a function d: D\{0} → ℕ ∪ {0} satisfying
(i) d(α) ≤ d(β) for any α, β ∈ D\{0} with α | β, and d(α) = d(β) if
and only if α ~ β;
(ii) for all α, β ∈ D\{0} with α ∤ β and β ∤ α, there are γ, δ, ε ∈ D
such that ε = αγ + βδ and d(ε) < min{d(α),d(β)}.
12. Let τ = (1 + √19 i)/2. Show that ℤ[τ] := {a + bτ : a,b ∈ ℤ} is a principal
ideal domain, but not a Euclidean domain.
§33
Polynomial Rings
x² + 2x + 5 and x³ + 2x² − 7x + 1
are polynomials. One learns how to add, subtract, multiply and divide
two polynomials. Although one acquires a working knowledge about
polynomials, a satisfactory definition of polynomials is hardly given. In
this paragraph, we give a rigorous definition of polynomials.
of elements a0, a1, a2, . . . in R, where only finitely many of them are
distinct from the zero element of R, is called a polynomial over R.
The terms a0, a1, a2, . . . are called the coefficients of the polynomial f =
(a0, a1, a2, . . . ). The term a0 will be referred to as the constant term of f.
The polynomial 0* = (0,0,0, . . . ) over R, whose terms are all equal to the
zero element 0 of R, is called the zero polynomial over R. The leading
coefficient and the degree of the zero polynomial are not defined. The
leading coefficient of any other polynomial is defined. The constant term
of the zero polynomial is defined, and is 0.
f = (a0,a1,a2, . . . ) and g = (b0,b1,b2, . . . )
fg = (c0,c1,c2, . . . )
c0 = a0b0
c1 = a0b1 + a1b0
c2 = a0b2 + a1b1 + a2b0
c3 = a0b3 + a1b2 + a2b1 + a3b0
.....................
ck = a0bk + a1bk−1 + a2bk−2 + . . . + ak−2b2 + ak−1b1 + akb0
..................... .
To find the k-th term ck in fg, we multiply all a's with all b's in such a
way that the sum of the indices is k, and add the results. We write
ck = ∑_{i=0}^{k} a_ib_{k−i}.
The summation variable i runs through different values for each k.
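This convolution rule for the coefficients of a product can be sketched in code. The following is an illustrative aside, not part of the text; poly_mul is our own name, and a polynomial (a0, a1, a2, . . . ) is represented by the list of its coefficients.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (a0, a1, ...):
    the k-th coefficient of fg is c_k = sum over i + j = k of a_i * b_j."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b        # the indices sum to i + j = k
    return c
```

For example, (1 + x)(1 + x + x²) = 1 + 2x + 2x² + x³, and indeed poly_mul([1, 1], [1, 1, 1]) returns [1, 2, 2, 1].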
deg(f + g) = max{m,n} in case m ≠ n,
deg(f + g) ≤ m in case m = n and f + g ≠ 0*.
(3) The product fg is a polynomial over R. If deg f = m and deg g = n,
then
deg fg ≤ m + n in case fg ≠ 0*,
deg fg = m + n in case R has no zero divisors.
(2) We must show that f + g has only finitely many terms distinct from
0. We proved it in part (1) when f = 0* or g = 0*. Now we assume f ≠ 0*
≠ g. Then f and g have degrees. Suppose deg f = m and deg g = n, so that
am ≠ 0, ar = 0 for all r > m and bn ≠ 0, br = 0 for all r > n.
33.4 Remark: The last argument shows in fact that the leading coeffi-
cient of fg is the leading coefficient of f times the leading coefficient of g,
provided R has no zero divisors.
Proof: First of all, we must prove that + makes the set of all polynomial
over R into an abelian group. The closure property was shown in Lemma
33.3(2). The associativity and commutativity of addition of polynomials
follow from the associativity and commutativity of addition in R. The
zero polynomial 0* is the zero element (Lemma 33.3(1)) and each poly-
nomial (a0,a1,a2, . . . ) over R has an opposite (−a0,−a1,−a2, . . . ). The details
are left to the reader.
Here we used the distributivity in R. Since (aibj)cl = ai(bjcl), the m-th
terms in (fg)h and f(gh) are equal, and this for all m. So (fg)h = f(gh) for
all polynomials f,g,h over R and the multiplication is associative.
= fg + fh
The ring of all polynomials over R will be denoted by R[x]. When f R[x],
we say f is a polynomial with coefficients in R.
Each one of the polynomials above has at most one nonzero coefficient. A
polynomial over R which has at most one nonzero coefficient will be
called a monomial over R. We can write monomials over R more com-
pactly as follows. If, for example, g is a monomial over R whose r-th
coefficient is a (the possibility a = 0 is not excluded) and whose other
coefficients are zero, then we can write g = (0,0,. . . ,a,0, . . . ) shortly as
(a,r). Here r denotes the index of the only possibly nonzero coefficient,
and a ∈ R is that possibly nonzero element in the r-th place.
Then our f would be written as (a0,0) + (a1,1) + (a2,2) + . . . + (ad,d). The
essential point is that a polynomial can be written as a sum of
monomials, and a monomial is determined as soon as the index r and the
possibly nonzero element a is given. We can choose other notations for
monomials, of course, as long as they display the index r and the
possibly nonzero element a. We prefer to write axr instead of (a,r) for
the monomial (0,0,. . . ,a,0, . . . ). In this notation, both the index r and the
element a are displayed. It should be noted that x does not have a
meaning by itself. It is like the comma in (a,r). In particular, xr is not the
r-th power of anything. r in axr is an index, a superscript showing where
the element a sits in. With this notation, our f is written as
indeterminate (over R). This does not mean that x fails to be determined
in some way. "Indeterminate" is just an odd name for a computational
device. Finally, we agree to write a0 for a0x^0 and a1x for a1x^1. In
particular, we write 0 for the zero polynomial 0*. This convention brings
f to the form
33.6 Lemma: Let R be a ring.
(1) If R is commutative, then R[x] is commutative.
(2) If R has an identity, then R[x] has an identity.
(3) If R has no zero divisors, then R[x] has no zero divisors.
(4) If R is an integral domain, then R[x] is an integral domain.
fg = ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_ib_j )x^k = ∑_{k=0}^{n+m} ( ∑_{j+i=k} b_ja_i )x^k = gf
(3) Assume now R has no zero divisors. Let us suppose also that f ≠ 0
and g ≠ 0. Without loss of generality, we may assume that am is the
leading coefficient of f and that bn is the leading coefficient of g. Then
am ≠ 0, bn ≠ 0. By Remark 33.4, the leading coefficient of fg is ambn and
ambn ≠ 0 since R has no zero divisors. Thus fg has a nonzero coefficient,
namely the (m + n)-th coefficient, and fg ≠ 0. This shows that R[x] has no
zero divisors.
33.7 Lemma: Let R and S be two rings and let φ: R → S be a ring
homomorphism. Then the mapping φ̄: R[x] → S[x], defined by
( ∑_{i=0}^{m} a_ix^i )φ̄ = ∑_{i=0}^{m} (a_iφ)x^i,
is a ring homomorphism.
Proof: Let f = ∑_{i=0}^{m} a_ix^i, g = ∑_{j=0}^{n} b_jx^j be arbitrary polynomials in
R[x]. We compute
(f + g)φ̄ = ( ∑_{i=0}^{m} a_ix^i + ∑_{j=0}^{n} b_jx^j )φ̄
= ( ∑_{i=0}^{m} a_ix^i + ∑_{i=0}^{m} b_ix^i )φ̄ (assuming n = m without loss of generality)
= ( ∑_{i=0}^{m} (a_i + b_i)x^i )φ̄
= ∑_{i=0}^{m} [(a_i + b_i)φ]x^i
= ∑_{i=0}^{m} (a_iφ + b_iφ)x^i
= ∑_{i=0}^{m} (a_iφ)x^i + ∑_{i=0}^{m} (b_iφ)x^i
= fφ̄ + gφ̄
and so φ̄ preserves addition. As for multiplication (here we do not have
to assume m = n), we observe
(fg)φ̄ = [ ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_ib_j )x^k ]φ̄
= ∑_{k=0}^{m+n} [( ∑_{i+j=k} a_ib_j )φ]x^k
= ∑_{k=0}^{m+n} ( ∑_{i+j=k} (a_ib_j)φ )x^k
= ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_iφ.b_jφ )x^k
= ( ∑_{i=0}^{m} (a_iφ)x^i )( ∑_{j=0}^{n} (b_jφ)x^j )
= fφ̄.gφ̄.
Thus φ̄ preserves multiplication as well. So φ̄ is a ring homomorphism.
fφ̄ = ∑_{i=0}^{m} (a_iφ)x^i is the zero polynomial in S[x] if and only if the
coefficients a_iφ are all equal to 0 ∈ S (i = 0,1, . . . ,m), so if and only if
a_i ∈ Ker φ for all i = 0,1, . . . ,m, so if and only if f = ∑_{i=0}^{m} a_ix^i ∈ (Ker φ)[x].
A polynomial ∑_{i=0}^{m} c_ix^i ∈ S[x] belongs to the image of φ̄ if and only if
∑_{i=0}^{m} c_ix^i = ( ∑_{i=0}^{n} a_ix^i )φ̄ for some ∑_{i=0}^{n} a_ix^i ∈ R[x], so (assuming
m = n without loss of generality) if and only if, for each i = 0,1, . . . ,m, there
is an a_i ∈ R such that c_i = a_iφ, so if and only if c_i ∈ Im φ for all
i = 0,1, . . . ,m, and so if and only if ∑_{i=0}^{m} c_ix^i ∈ (Im φ)[x].
whose image under φ̄ is
(30x⁷ − 24x⁶ − 3x⁵ + 23x⁴ + 15x³ − 21x² + 11x + 5)φ̄ = 2x⁴ + 2x + 2.
We have also (2x³ + 2x² + 2x + 1)(1x + 2) = 2x⁴ + 2x + 2.
33.8 Theorem: If R and S are isomorphic rings, then R[x] and S[x] are
isomorphic.
(R[x])[y] =: R[x][y]. The elements of R[x][y] are of the form ∑_{i=0}^{m} f_iy^i,
where f_i ∈ R[x].
33.9 Lemma: Let R be a ring and let x,y be two indeterminates over R.
Then R[x][y] ≅ R[y][x].
The mapping T sends ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i to ∑_{j=0}^{n} ( ∑_{i=0}^{m} a_{ij}y^i )x^j. We have
[ ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i + ∑_{i=0}^{r} ( ∑_{j=0}^{s} b_{ij}x^j )y^i ]T
= [ ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j + ∑_{j=0}^{s} b_{ij}x^j )y^i ]T (assuming r = m without loss of
generality)
= [ ∑_{i=0}^{m} ( ∑_{j=0}^{n} (a_{ij} + b_{ij})x^j )y^i ]T (assuming s = n without loss of generality)
= ∑_{j=0}^{n} ( ∑_{i=0}^{m} (a_{ij} + b_{ij})y^i )x^j
= ∑_{j=0}^{n} ( ∑_{i=0}^{m} a_{ij}y^i + ∑_{i=0}^{m} b_{ij}y^i )x^j
= ∑_{j=0}^{n} ( ∑_{i=0}^{m} a_{ij}y^i )x^j + ∑_{j=0}^{n} ( ∑_{i=0}^{m} b_{ij}y^i )x^j
= [ ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i ]T + [ ∑_{i=0}^{r} ( ∑_{j=0}^{s} b_{ij}x^j )y^i ]T
for all ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i, ∑_{i=0}^{r} ( ∑_{j=0}^{s} b_{ij}x^j )y^i ∈ R[x][y].
= ( ∑_{i,j} p_iq_j )T (by distributivity)
= ∑_{i,j} p_iT.q_jT (since T preserves products of monomials)
T is one-to-one, for if ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i ∈ R[x][y] is in the kernel of T,
then its image ∑_{j=0}^{n} ( ∑_{i=0}^{m} a_{ij}y^i )x^j is the zero polynomial in R[y][x], so
all the coefficients ∑_{i=0}^{m} a_{ij}y^i are equal to the zero polynomial in R[y],
so all ∑_{j=0}^{n} a_{ij}x^j are the zero polynomial in R[x], so ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i
is the zero polynomial in R[x][y].
Moreover, T is onto, for any polynomial ∑_{j=0}^{n} ( ∑_{i=0}^{m} a_{ij}y^i )x^j in R[y][x] is
the image of the polynomial ∑_{i=0}^{m} ( ∑_{j=0}^{n} a_{ij}x^j )y^i in R[x][y].
form ∑_{i,j} a_{ij}x^iy^j, where a_{ij} ∈ R and there are finitely many terms in the
sum.
We can of course adjoin a new indeterminate z to R[x,y] and obtain
(R[x,y])[z] = (R[x][y])[z] =: R[x][y][z]. We see
We regard these six rings as identical and write R[x,y,z] for it. The
notations "R[x,y,z]", "R[x,z,y]", "R[z,x,y]", "R[z,y,x]", "R[y,z,x]", "R[y,x,z]" will
mean the same ring.
∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{l=0}^{Nn} a_{ij...l}x_1^i x_2^j . . . x_n^l, a_{ij...l} ∈ R.
The polynomials in R[x1,x2, . . . ,xn] of the form ax_1^i x_2^j . . . x_n^l will be called
monomials over R. It is customary to omit the indeterminates with
exponent zero in a monomial. For example, ax_1^0 x_2^2 x_3^0 x_4^3 in R[x1,x2,x3,x4]
is written ax_2^2 x_4^3. An exponent is dropped when it is equal to 1. If R
does not have an identity, the indeterminates x1,x2, . . . ,xn are not
elements of R[x1,x2, . . . ,xn] and the expressions x_1^i x_2^j . . . x_n^l are not
polynomials.
The total degree of f = ∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{l=0}^{Nn} a_{ij...l}x_1^i x_2^j . . . x_n^l is
defined to be the maximum of the degrees of the monomials
a_{ij...l}x_1^i x_2^j . . . x_n^l with a_{ij...l} ≠ 0. The total degree of f will be denoted
by deg f. The degree of f, considered as an element of
R[x1, . . . ,x_{h−1},x_{h+1}, . . . ,xn][x_h], will be called the degree of f in x_h; this will
be written deg_h f (h = 1,2, . . . ,n). The analog of Lemma 33.3 holds for
polynomials in n indeterminates, both with the total degree and the
degree in x_h in place of deg f.
Exercises
[(0 1; 1 0)x⁴ + (1 2; 0 0)x² + (2 0; 1 1)][(1 0; 0 0)x² − (1 1; 0 0)x + (1 −1; 1 1)] in
(Mat2(ℤ))[x],
[(0 1; 1 0)x³ + (1 3; 0 2)x² + (1 0; 1 0)][(2 1; 0 3)x² + (1 5; 2 0)x + (2 4; 1 0)] in
(Mat2(ℤ7))[x]
(we dropped the bars for ease of notation; a matrix (a b; c d) has rows
(a b) and (c d)).
5. Let R be a ring and f = ∑_{i=0}^{N1} ∑_{j=0}^{N2} . . . ∑_{l=0}^{Nn} a_{ij...l}x_1^i x_2^j . . . x_n^l
∈ R[x1,x2, . . . ,xn]. Prove that deg_1 f is the largest i such that a_{ij...l} ≠ 0,
deg_2 f is the largest j such that a_{ij...l} ≠ 0, . . . , deg_n f is the largest l such
that a_{ij...l} ≠ 0.
§32
Divisibility Theory in Integral Domains
e ∈ D, h ∈ D,
eh = 1 holds in D,
e is a unit in D.
(Here e ≠ 0 ≠ h, because eh = 1 ≠ 0 and D is an integral domain.) So a unit in
D[x] is a unit in D.
Thus any unit in D[x] has degree 0 and the associates of a polynomial in
D[x] have the same degrees as the polynomial itself. Any proper divisor
of f ∈ D[x] is therefore of degree distinct from 0 and deg f.
there exist some g,h ∈ D1[x] such that f = gh, 0 < deg g < deg f.
Then f is irreducible in D[x], but not in D1[x]. This shows that irreducibil-
ity of f is not an intrinsic property of f. It is a property of f relative to
the polynomial domain D[x]. For this reason, we have to mention the
domain D whenever we speak about irreducible polynomials. We say f is
irreducible over D when f is irreducible in D[x]. For example, x² + 1 ∈
ℝ[x] is irreducible over ℝ since x² + 1 has no proper divisors in ℝ[x], but
x² + 1 is reducible in ℂ[x] since x² + 1 = (x − i)(x + i), with x − i, x + i ∈ ℂ[x]
and 0 < 1 = deg(x − i) < 2 = deg(x² + 1).
Now the converse. We suppose that a is irreducible in D[x] and show that
a is irreducible in D. First we must show that a is not a unit in D. Since a
is irreducible in D[x], it is not a unit in D[x]; as the units of D[x] are
exactly the units of D (Lemma 34.2), a is not a unit in D. Secondly we
must show that a = bc, where b,c ∈ D, implies either b or c is a unit in D.
We read a = bc as an equation in D[x]. Since a is irreducible in D[x], either
b or c is a unit in D[x], so, in view of Lemma 34.2, either b or c is a unit
in D. This proves that a is irreducible in D.
by degree considerations, the last statement means (Lemma 34.2,
Lemma 34.3): each element of D\{0} that is not a unit in D must be
expressible as a product of irreducible elements of D in a unique way. Thus
D must be a unique factorization domain. We shall prove conversely that
D[x] is a unique factorization domain whenever D is. The proof will make
use of the polynomial ring F[x], where F is the field of fractions of D
(§31). F[x] will turn out to be a Euclidean domain.
Proof: First we prove the existence of q and r. This is nothing but the
long division of polynomials. Suppose we divide f = x⁵ − 2x⁴ + 3x³ + x² − x +
2 by g = x² + x + 1. What do we do? We subtract x³ times g from f:
x⁵ − 2x⁴ + 3x³ + x² − x + 2 | x² + x + 1
x⁵ +  x⁴ +  x³              | x³ − 3x² + 5x − 1
   −3x⁴ + 2x³ + x² − x + 2
   −3x⁴ − 3x³ − 3x²
      5x³ + 4x² − x + 2
      5x³ + 5x² + 5x
         −x² − 6x + 2
         −x² −  x − 1
             −5x + 3.
Thus q = x³ − 3x² + 5x − 1 and r = −5x + 3. In general, at each step we
divide the leading monomial of the current dividend f by the leading
monomial of g; if the quotient is ax^m, we pass to f1 = f − ax^m g, which
has smaller degree.
Now let f,g be nonzero polynomials in D[x] and suppose that the leading
coefficient of g is a unit in D. We prove the existence of q and r by
induction on deg f.
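The division algorithm described here can be sketched in code for the case where the leading coefficient of g is 1. This is an illustrative aside, not part of the text; poly_divmod is our name, and coefficients are listed with the leading term first.

```python
def poly_divmod(f, g):
    """Long division f = q*g + r over the integers, coefficients listed
    from the leading term down; assumes the leading coefficient of g is 1
    and deg f >= deg g."""
    r = list(f)
    q = [0] * (len(f) - len(g) + 1)
    for k in range(len(q)):
        q[k] = r[k]                    # divide the leading terms (g[0] == 1)
        for j, c in enumerate(g):      # subtract the corresponding multiple of g
            r[k + j] -= q[k] * c
    return q, r[len(q):]
```

Dividing x⁵ − 2x⁴ + 3x³ + x² − x + 2 by x² + x + 1 reproduces the quotient x³ − 3x² + 5x − 1 and the remainder −5x + 3 of the worked example.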
II. Now the inductive step. We use the principle of induction in the
form 4.5. We assume that deg f = n ≥ 1 and that, for any nonzero
polynomial h with deg h < n, there are polynomials q1 and r1 in D[x]
such that
h = q1g + r1, r1 = 0 or deg r1 < deg g.
= deg (r´ − r) ≤ max{deg r´, deg r} < deg g
by Lemma 33.3. This forces q − q´ = 0, so q = q´, so r = f − qg = f − q´g = r´.
Thus q and r are uniquely determined.
34.6 Theorem: Let K be a field. Any two polynomials f,g in K[x], not
both zero, have a greatest common divisor d in K[x]. If d is a greatest
common divisor of f and g, then there are polynomials h and l in K[x]
such that d = hf + lg. Any two greatest common divisors of f and g are
associate. In particular, there is one and only one monic greatest
common divisor of f and g. (This unique monic greatest common divisor
of f and g is sometimes called the greatest common divisor of f and g).
Any irreducible polynomial in K[x] is prime in K[x] (Definition 32.20).
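For K = ℚ, the greatest common divisor in Theorem 34.6 can be computed by the Euclidean algorithm sketched below. This is an illustrative aside, not part of the text; poly_mod and poly_gcd are our names, and polynomials are coefficient lists with the leading coefficient first.

```python
from fractions import Fraction

def poly_mod(f, g):
    """Remainder of f on division by g in Q[x] (leading coefficient first)."""
    r = [Fraction(c) for c in f]
    while len(r) >= len(g) and any(r):
        q = r[0] / Fraction(g[0])      # quotient of the leading terms
        for j, c in enumerate(g):
            r[j] -= q * Fraction(c)
        r.pop(0)                       # the leading term has been cancelled
    while r and r[0] == 0:
        r.pop(0)
    return r

def poly_gcd(f, g):
    """Monic gcd of f and g: the last nonzero remainder, normalized."""
    while g:
        f, g = g, poly_mod(f, g)
    return [c / f[0] for c in f]       # divide by the leading coefficient
```

For example, the monic greatest common divisor of x² − 1 and x² − 3x + 2 is x − 1.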
Theorem 34.5 is very satisfactory. If the underlying ring is a field, then
the polynomial domain is a unique factorization domain. We turn our
attention to polynomials with coefficients in a unique factorization
domain. Let D be a unique factorization domain and let F be the field of
fractions of D. We recall that the elements of F are fractions a/b of
elements a,b ∈ D, b ≠ 0. We identify a ∈ D with a/1 ∈ F and thus regard D
as a subring of F. In this way, D[x] ⊆ F[x]. (If you find this and the
following discussion too abstract, you may just assume D = ℤ and F = ℚ.)
Let f ∈ D[x] ⊆ F[x]. Now, a priori, f may be irreducible over D and not ir-
reducible over F. See the comments preceding Lemma 34.3. In the case
where D is a unique factorization domain and F is the field of fractions of
D, it is in fact true that an irreducible polynomial in D[x] is also irre-
ducible in F[x]. After some preparation, this will be proved in Lemma
34.11. The hypothesis that D be a unique factorization domain is essen-
tial, for otherwise the following definition, which plays an important role
in the proof of Lemma 34.11, does not make sense.
Proof: First we remark that we cannot write C(fg) = C(f)C(g), for
contents are unique only up to associate elements.
Suppose now C(f) ~ 1, C(g) ~ 1 and C(fg) is not a unit. Then there is an
irreducible element π in D with π | C(fg). Since C(f) ~ 1 and C(g) ~ 1 by
assumption, π cannot divide all the coefficients of
f = a_nx^n + a_{n−1}x^{n−1} + . . . + a_1x + a_0
nor of
g = b_mx^m + b_{m−1}x^{m−1} + . . . + b_1x + b_0,
say. Let a_h be the coefficient of f with the largest index that is not
divisible by π and let b_k have a similar meaning for g. Then
π | a_n, a_{n−1}, . . . , a_{h+1}; π ∤ a_h (1)
π | b_m, b_{m−1}, . . . , b_{k+1}; π ∤ b_k. (2)
But π divides the coefficient
(. . . + a_{h+2}b_{k−2} + a_{h+1}b_{k−1}) + a_hb_k + [a_{h−1}b_{k+1} + a_{h−2}b_{k+2} + . . . ]
of x^{h+k} in fg. Because of (1) and (2), π divides the expressions in ( ) and
[ ]. So π divides a_hb_k as well. Thus π ∤ a_h, π ∤ b_k and π | a_hb_k, which tells us
that π is not a prime element in D. On the other hand, D is a unique
factorization domain and every irreducible element in D is prime
(Lemma 32.24), hence π is prime. This is a contradiction. We conclude
C(fg) ~ 1.
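This lemma can be checked numerically over D = ℤ. The sketch below is an illustrative aside, not part of the text; content and poly_mul are our names, and a content is computed as the nonnegative gcd of the coefficients, so it is determined only up to sign.

```python
from math import gcd
from functools import reduce

def content(f):
    """A content of f in Z[x]: a gcd of its coefficients (up to associates)."""
    return reduce(gcd, (abs(c) for c in f))

def poly_mul(f, g):
    """Product of coefficient lists (a0, a1, ...)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return c
```

For instance, the product of two primitive polynomials (content 1) is again primitive, and in general content(fg) equals content(f)·content(g) up to a unit.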
C(f) ~ C(g). Then f and g are associate in F[x] if and only if f and g are
associate in D[x].
If f and g are associate in D[x], then f = eg for some unit e in D[x]. Then e
is a unit in D, so e is a nonzero element of D, so e is a nonzero element of
F, so e is a unit in F, so e is a unit in F[x], so f and g are associate in F[x].
If f and g are associate in F[x], then f = ug for some unit u in F[x]. Thus
u ∈ F\{0} and so u = a/b, where a,b ∈ D\{0}. So bf = ag. Thus
bC(f) ~ C(bf) ~ C(ag) ~ aC(g) ~ aC(f)
and b ~ a in D. So a/b = u is a unit in D. Hence u is a unit in D[x] and f is
associate to g in D[x].
406
Thus e:= c1c2. . . cr/a1a2. . . ar is a unit in D and
f = (eh1)h2. . . hr.
407
34.12 Lemma: Let D be a unique factorization domain and let f be a
nonzero polynomial in D[x] such that C(f) ~ 1 and deg f ≥ 1. Then f can
be written as a product of irreducible polynomials in a unique way.
Proof: Let F be the field of fractions of D. We will use the fact that F[x] is
a unique factorization domain and the fact that irreducibility in D[x] and
in F[x] coincide (Theorem 34.5, Lemma 34.11).
where g1, g2, . . . ,gr are irreducible in F[x]. According to Lemma 34.10,
34.13 Theorem: If D is a unique factorization domain, then D[x] is a
unique factorization domain.
Proof: Given any nonzero polynomial f in D[x] which is not a unit in D[x],
we have to show that f can be written as a product of irreducible poly-
nomials in D[x], and that this representation is unique up to the order of
factors and ambiguity between associate polynomials.
f = a1a2. . . arq1q2. . . qs
In particular,
Exercises
4. Show that x⁴ + 1 ∈ ℤ2[x] is reducible over ℤ2.
6. Find a content of
8. Let D be a unique factorization domain and let f,g ∈ D[x]\D. Prove that
a greatest common divisor of f and g has degree ≥ 1 if and only if there
are polynomials h,k in D[x] satisfying deg h < deg g and deg k < deg f
such that fh = gk.
§35
Substitution and Differentiation
35.1 Definition: Let R be a ring and let f = ∑_{i=0}^{m} a_ix^i be an arbitrary
polynomial over R.
(b) Let h = 3x³ + 4x² + x − 1 ∈ ℤ[x]. Here ℚ is a ring that contains ℤ and
2/5 ∈ ℚ. We have h(2/5) = 3(2/5)³ + 4(2/5)² + (2/5) − 1 = 29/125.
(c) Let f = (0 1; 1 0)x² + (1 1; 0 1)x + (−2 0; 1 0) ∈ (Mat2(ℤ))[x]. Now Mat2(ℚ)
is a ring containing Mat2(ℤ) and (0 1; 1 0) ∈ Mat2(ℚ). Then
f((0 1; 1 0)) = (0 1; 1 0)(0 1; 1 0)² + (1 1; 0 1)(0 1; 1 0) + (−2 0; 1 0) ∈ Mat2(ℚ).
(d) Let R be a ring with identity and f = ∑_{i=0}^{m} a_ix^i ∈ R[x]. Then R ⊆ R[x]
and x ∈ R[x]. The value of f at x ∈ R[x] is f(x) = ∑_{i=0}^{m} a_ix^i = f, so
f(x) = f ∈ R[x].
From now on, the notations f and f(x) for a polynomial in R[x] will be
used interchangeably.
(e) Again let R be a ring with identity and f = ∑_{i=0}^{m} a_ix^i ∈ R[x] be a
polynomial. Then
f(y) = ∑_{i=0}^{m} a_iy^i ∈ R[y].
(g) Let R be a ring. For any f R[x], the value of f at g R[x] can be
found as in the last example, and it is a polynomial f(g(x)) in R[x].
Ts: R[x] → S
f ↦ f(s)
(f + g)Ts = ( ∑_{i=0}^{m} (a_i + b_i)x^i )Ts (assuming n = m without loss of generality)
= ∑_{i=0}^{m} (a_i + b_i)s^i
= ∑_{i=0}^{m} a_is^i + ∑_{i=0}^{m} b_is^i
= f(s) + g(s)
= fTs + gTs,
and further
(fg)Ts = [ ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_ib_j )x^k ]Ts = ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_ib_j )s^k,
fTs.gTs = ∑_{i,j} a_is^i.b_js^j = ∑_{k=0}^{m+n} ( ∑_{i+j=k} a_ib_j )s^k = (fg)Ts.
In the proof of Lemma 35.3, the commutativity of S is used in a crucial
way. If S is not commutative, then Ts need not be a homomorphism. For
example,
Ix² − (1 0; 0 1) = [Ix + (0 1; 1 0)][Ix − (0 1; 1 0)] in (Mat2(ℤ))[x],
but substituting (1 1; 0 0) for x does not preserve sums and products:
(1 1; 0 0)² − (1 0; 0 1) ≠ [(1 1; 0 0) + (0 1; 1 0)][(1 1; 0 0) − (0 1; 1 0)].
Proof: By the remainder theorem (with E in place of D), there is a
polynomial q in E[x] such that f(x) = q(x)(x − a) + f(a). If a is a root of f,
then f(a) = 0, so f(x) = q(x)(x − a) and (x − a) | f(x) in E[x]. Conversely, if
(x − a) | f(x) in E[x], then (x − a) | [f(x) − q(x)(x − a)] in E[x], so (x − a) | f(a) in
E[x]. Thus f(x) = u(x)(x − a) for some u(x) ∈ E[x]. Substituting a for x, we
get f(a) = u(a)(a − a) = 0. So a is a root of f.
The factor theorem puts an upper bound on the number of roots of poly-
nomials over integral domains, in particular of those over fields.
Suppose now n ≥ 2, deg f = n and that, for all integral domains D´, any
polynomial of degree n − 1 in D´[x] has at most n − 1 distinct roots in any
integral domain E´ that contains D´. If f has no roots in E, the theorem is
true. If f has a root a0 in E, we have
f(x) = q(x)(x − a0) for some q(x) ∈ E[x]
by the factor theorem. Here q(x) is of degree n − 1 by Lemma 33.3(3). By
our induction hypothesis, q(x) has at most n − 1 distinct roots in E. Now
let A be the set of all distinct roots of q(x) in E (possibly A = ∅) so that
|A| ≤ n − 1.
If b ∈ E is any root of f, then f(b) = 0, so q(b)(b − a0) = 0, so q(b) = 0 or
b = a0, so b ∈ A or b = a0. Hence B ⊆ A ∪ {a0}, where B is the set of all
distinct roots of f(x) in E. Thus |B| ≤ |A| + 1 ≤ (n − 1) + 1 = n and f has at
most n = deg f distinct roots in E. This completes the proof.
f = ∑_{i=0}^{n} b_i . (x − a0). . . (x − a_{i−1})(x − a_{i+1}). . . (x − a_n) /
(a_i − a0). . . (a_i − a_{i−1})(a_i − a_{i+1}). . . (a_i − a_n).
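This interpolation formula can be sketched in code over ℚ. The following is an illustrative aside, not part of the text; lagrange is our name, and a point list [(a0,b0), . . . ,(an,bn)] with distinct a_i is assumed.

```python
from fractions import Fraction

def lagrange(points):
    """Return a function evaluating the unique interpolating polynomial
    f = sum_i b_i * prod over j != i of (x - a_j)/(a_i - a_j) at a given t."""
    def f(t):
        total = Fraction(0)
        for i, (ai, bi) in enumerate(points):
            term = Fraction(bi)
            for j, (aj, _) in enumerate(points):
                if j != i:
                    term = term * (Fraction(t) - aj) / (ai - aj)
            total += term
        return total
    return f
```

Interpolating the three points (0,0), (1,1), (2,4) of x² recovers that polynomial exactly: the resulting function returns 9 at t = 3 and 25 at t = 5.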
polynomial h = f − g has at least n + 1 roots a0, a1, . . . ,an in K, and, if h ≠ 0,
then h has degree at most equal to n (Lemma 33.3(2)). This is not
compatible with Theorem 35.7, so h = 0 and g = f. Therefore f is the
unique polynomial satisfying the conditions above.
a^{p−1} − 1 = 0 in ℤp
if a ≠ 0.
0 = coefficient of x^0 in h
= (coefficient of x^0 in f) − (coefficient of x^0 in g)
= (−1) − ((−1)(−2). . . (−(p − 1)))
= −1 − (−1)^{p−1}(p − 1)!
= −((p − 1)! + 1) in ℤp
provided p is odd. Hence (p − 1)! + 1 ≡ 0 (mod p) when p is an odd prime
number. But this congruence holds also when p = 2. This completes the
proof.
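Wilson's congruence is easy to test numerically. The sketch below is an illustrative aside, not part of the text; wilson_holds is our name.

```python
from math import factorial

def wilson_holds(p):
    """Check whether (p-1)! + 1 = 0 (mod p); this holds exactly when p is prime."""
    return (factorial(p - 1) + 1) % p == 0
```

Running it over small integers, the congruence holds for 2, 3, 5, 7, 11, . . . and fails for every composite number tried.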
Proof: By hypothesis, a = b/c is a root of f so that
a_n(b^n/c^n) + a_{n−1}(b^{n−1}/c^{n−1}) + . . . + a_1(b/c) + a_0 = 0.
Multiplying both sides by c^n, we obtain
a_nb^n + (a_{n−1}b^{n−1}c + . . . + a_1bc^{n−1} + a_0c^n) = 0,
[a_nb^n + a_{n−1}b^{n−1}c + . . . + a_1bc^{n−1}] + a_0c^n = 0.
c divides the expression in ( ), so c | a_nb^n. As (b,c) ~ 1, we have (b^n,c) ~ 1.
From (b^n,c) ~ 1 and c | a_nb^n, we conclude c | a_n. Likewise, b divides the
expression in [ ], so b | a_0c^n. As (b,c) ~ 1, we have (b,c^n) ~ 1. From (b,c^n) ~ 1
and b | a_0c^n, we conclude b | a_0. In particular, if a_n is a unit in D, then c is
also a unit in D since c | a_n, so there is a c^{−1} ∈ D such that cc^{−1} = 1 and the
root a = b/c = bc^{−1}/cc^{−1} = bc^{−1}/1 = bc^{−1} ∈ D.
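Over D = ℤ the theorem leaves only finitely many candidates b/c, which can be enumerated. The sketch below is an illustrative aside, not part of the text; rational_roots is our name, and a nonzero constant term a_0 is assumed.

```python
from fractions import Fraction

def rational_roots(coeffs):
    """All rational roots of a_n x^n + ... + a_0 in Z[x] (leading coefficient
    first, a_0 != 0): for a root b/c in lowest terms, b | a_0 and c | a_n."""
    a0, an = coeffs[-1], coeffs[0]
    divisors = lambda m: [d for d in range(1, abs(m) + 1) if m % d == 0]
    roots = set()
    for b in divisors(a0):
        for c in divisors(an):
            for cand in (Fraction(b, c), Fraction(-b, c)):
                # evaluate the polynomial at the candidate
                if sum(a * cand ** k for k, a in enumerate(reversed(coeffs))) == 0:
                    roots.add(cand)
    return roots
```

For 2x² − x − 1 this yields the roots 1 and −1/2, while x² − 2 has no rational roots, in accordance with the theorem.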
fact a monic polynomial), any root of f in ℚ must be actually in ℤ by
Theorem 35.10. But
f(0) = −2 ≠ 0; f(±1) = −1 ≠ 0;
f(±m) = m² − 2 ≥ 2, so f(±m) ≠ 0 for m ≥ 2;
f(x) = (x − a)^{m1} q1(x), q1(x) ∈ E1[x], q1(a) ≠ 0,
f(x) = (x − a)^{m2} q2(x), q2(x) ∈ E2[x], q2(a) ≠ 0,
f(x) = (x − a)^{m0} q0(x), q0(x) ∈ (E1 ∩ E2)[x], q0(a) ≠ 0,
function ∑_{k=0}^{m} a_kx^k is the function ∑_{k=1}^{m} ka_kx^{k−1}. This suggests the
following definition.
35.13 Definition: Let R be an arbitrary ring and let f = ∑_{k=0}^{m} a_kx^k be an
arbitrary polynomial over R. Then
f´ = f´(x) = ∑_{k=1}^{m} ka_kx^{k−1} = ∑_{k=0}^{m−1} (k+1)a_{k+1}x^k ∈ R[x].
(b) Let g(x) = (1/3)x⁵ + (1/7)x⁴ + (2/5)x³ + (4/3)x − 3 ∈ ℚ[x]. Then
g´(x) = (5/3)x⁴ + (4/7)x³ + (6/5)x² + 4/3 ∈ ℚ[x].
(c) Let h(x) = (1 2; 3 4)x³ + (0 −1; −1 1)x² + (1 2; 0 3)x + (0 0; 1 0) ∈ (Mat2(ℤ))[x].
Then
h´(x) = 3(1 2; 3 4)x² + 2(0 −1; −1 1)x + (1 2; 0 3)
= (3 6; 9 12)x² + (0 −2; −2 2)x + (1 2; 0 3) ∈ (Mat2(ℤ))[x].
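The formal derivative of Definition 35.13 is a one-line computation on coefficient lists. The sketch below is an illustrative aside, not part of the text; derivative is our name, and a polynomial is the list (a0, a1, . . . , am) of its coefficients.

```python
from fractions import Fraction

def derivative(coeffs):
    """Formal derivative of a0 + a1 x + ... + am x^m:
    f' = sum over k = 1..m of k*a_k x^(k-1)."""
    return [k * a for k, a in enumerate(coeffs)][1:]
```

Applied to g(x) = (1/3)x⁵ + (1/7)x⁴ + (2/5)x³ + (4/3)x − 3 from example (b), it reproduces g´(x) = (5/3)x⁴ + (4/7)x³ + (6/5)x² + 4/3.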
Proof: Let f = ∑_{k=0}^{m} a_kx^k and g = ∑_{j=0}^{n} b_jx^j. We have
(f + g)´ = ( ∑_{k=0}^{m} a_kx^k + ∑_{j=0}^{n} b_jx^j )´
= ( ∑_{k=0}^{m} a_kx^k + ∑_{k=0}^{m} b_kx^k )´ (assuming n = m without loss of
generality)
= ( ∑_{k=0}^{m} (a_k + b_k)x^k )´
= ∑_{k=1}^{m} k(a_k + b_k)x^{k−1}
= ∑_{k=1}^{m} (ka_k + kb_k)x^{k−1}
= ∑_{k=1}^{m} ka_kx^{k−1} + ∑_{k=1}^{m} kb_kx^{k−1}
= f´ + g´,
(fg)´ = [( ∑_{k=0}^{m} a_kx^k )( ∑_{j=0}^{n} b_jx^j )]´
= [ ∑_{s=0}^{m+n} ( ∑_{k+j=s} a_kb_j )x^s ]´
= ∑_{s=1}^{m+n} s( ∑_{k+j=s} a_kb_j )x^{s−1}, (1)
f´g + fg´ = ∑_{s=1}^{m+n} ( ∑_{k+j=s} ka_kb_j )x^{s−1} + ∑_{s=1}^{m+n} ( ∑_{k+j=s} ja_kb_j )x^{s−1}
= ∑_{s=1}^{m+n} ( ∑_{k+j=s} (ka_kb_j + ja_kb_j) )x^{s−1}
= ∑_{s=1}^{m+n} ( ∑_{k+j=s} (k + j)a_kb_j )x^{s−1}
= ∑_{s=1}^{m+n} s( ∑_{k+j=s} a_kb_j )x^{s−1}. (2)
From (1) and (2), we conclude (fg)´ = f´g + fg´. This completes the proof.
f = Σ_{k=0}^{m} a_kx^k. Then f(g(x)) = Σ_{k=0}^{m} a_kg(x)^k ∈ R[x] and, by (1) and (3), the derivative of f(g(x)) is
Proof: Suppose c is a multiple root of f. Then it is a root of f. We wish to show that c is a root of f′ as well. We have f(x) = (x − c)²g(x) for some g(x) ∈ E[x]. Differentiating and substituting c for x, we obtain
f′(x) = 2(x − c)g(x) + (x − c)²g′(x),
f′(c) = 2(c − c)g(c) + (c − c)²g′(c) = 0,
and c is indeed a root of f′.
and prove that c is not a multiple root of f. Indeed, since f and f′ are relatively prime, f and f′ have no common root by part (1), so f′(c) ≠ 0 and c is not a multiple root of f by Theorem 35.17.
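Part (1) can be exploited computationally: a non-constant gcd(f, f′) signals a multiple root without finding any roots. A sketch over ℚ with Fraction coefficient lists (the divisor is assumed to have a nonzero leading coefficient; helper names are made up):

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Division with remainder for coefficient lists [a0, ..., an] over a field;
    b must have a nonzero leading coefficient."""
    a = a[:]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        shift = len(a) - len(b)
        factor = a[-1] / b[-1]
        q[shift] = factor
        for i, bi in enumerate(b):
            a[shift + i] -= factor * bi
        while a and a[-1] == 0:
            a.pop()
    return q, a

def poly_gcd(a, b):
    """Euclidean algorithm; a non-constant result means a repeated factor."""
    while any(b):
        a, b = b, poly_divmod(a, b)[1]
    return a

def derivative(c):
    return [k * x for k, x in enumerate(c)][1:]

# f(x) = (x - 1)^2 (x + 2) = x^3 - 3x + 2 has 1 as a multiple root,
# so gcd(f, f') is non-constant.
f = [Fraction(2), Fraction(-3), Fraction(0), Fraction(1)]
g = poly_gcd(f, derivative(f))
print(len(g) > 1)   # True
```

This is the standard squarefree test; over a field of characteristic 0 it detects multiple roots in any extension field.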
f = Σ_{i=0}^{N₁} Σ_{j=0}^{N₂} . . . Σ_{k=0}^{N_{n−1}} Σ_{l=0}^{N_n} a_{ij...kl} x₁^i x₂^j . . . x_{n−1}^k x_n^l,
and its value at (c₁,c₂, . . . ,c_n) is
Σ_{i=0}^{N₁} Σ_{j=0}^{N₂} . . . Σ_{k=0}^{N_{n−1}} Σ_{l=0}^{N_n} a_{ij...kl} c₁^i c₂^j . . . c_{n−1}^k c_n^l.
With the foregoing notation,
f = Σ_{i=0}^{N₁} Σ_{j=0}^{N₂} . . . Σ_{k=0}^{N_{n−1}} (Σ_{l=0}^{N_n} a_{ij...kl} x_n^l) x₁^i x₂^j . . . x_{n−1}^k
is a polynomial in R[x₁,x₂, . . . ,x_{n−1}][x_n]. Substituting c_n for x_n in the sense of Definition 35.1 (with S[x₁,x₂, . . . ,x_{n−1}], R[x₁,x₂, . . . ,x_{n−1}], x_n, c_n in place of S, R, x, c, respectively), we get an element of S[x₁,x₂, . . . ,x_{n−1}], namely
Σ_{i=0}^{N₁} Σ_{j=0}^{N₂} . . . Σ_{k=0}^{N_{n−1}} (Σ_{l=0}^{N_n} c_n^l a_{ij...kl}) x₁^i x₂^j . . . x_{n−1}^k ∈ S[x₁,x₂, . . . ,x_{n−2}][x_{n−1}].
Substituting now c_{n−1} for x_{n−1}, we get
Σ_{i=0}^{N₁} Σ_{j=0}^{N₂} . . . Σ_{j′=0}^{N_{n−2}} (Σ_{k=0}^{N_{n−1}} Σ_{l=0}^{N_n} c_{n−1}^k c_n^l a_{ij...j′kl}) x₁^i x₂^j . . . x_{n−2}^{j′}.
Exercises
f with respect to x and is written ∂f/∂x. Thus ∂f/∂x = Σ_{i,j,k} ia_{ijk}x^{i−1}y^jz^k. The
5. Let R be a ring and f ∈ R[x]. The derivative of f′ is called the second derivative of f, and is written as f′′ or as f^(2). More generally, the (n+1)-st derivative of f is defined recursively as the derivative of the n-th derivative f^(n) of f, and is written as f^(n+1). Thus f^(n+1) = (f^(n))′. We write f^(1) for f′ and f^(0) = f. Prove that, for any f,g ∈ R[x], any c ∈ R, any n
8. Let K be a field. We put M = Mat₂(K) for brevity. Let us recall that the determinant of (a b; c d) ∈ M is ad − bc and that A ∈ M is a unit in M if and

Let A(x), B(x) ∈ M[x] be nonzero polynomials and assume that the leading coefficient of B(x) has a nonzero determinant. Show that there are uniquely determined polynomials Q(x), R(x), Q†(x), R†(x) in M[x] such that
A(x) = Q(x)B(x) + R(x), with R(x) = 0 or deg R(x) < deg B(x),
and A(x) = B(x)Q†(x) + R†(x), with R†(x) = 0 or deg R†(x) < deg B(x).
Q(x) and R(x) are called the right quotient and right remainder, Q†(x) and R†(x) are called the left quotient and left remainder when A(x) is divided by B(x).
is called the left value of F(x) at A. Prove that the right (resp. left) remainder of F(x) ∈ M[x], when F(x) is divided by Ix − A, is equal to F(A) (resp. F†(A)).
9. Let R be a ring and Dᵢ: R[x] → R[x] be functions (i = 1,2) such that
§36
Fields of Rational Functions
The reader might have missed the familiar quotient rule
(f/g)′ = (f′g − fg′)/g²
in Lemma 35.15. It was missing because f/g is not a polynomial. We now introduce these quotients f/g.

Thus a rational function over D is a fraction f/g of two polynomials over D, with g ≠ 0. Two rational functions f₁/g₁ and f₂/g₂ are equal if and only if the polynomials f₁g₂ and g₁f₂ are equal. Two rational functions f₁/g₁ and f₂/g₂ are added and multiplied according to the rules
words "rational" and "function" do not play any role in Definition 36.1. A rational function is a fraction of polynomials over D. The reader should exercise caution about this point. One should not conclude that
(x² − 1)/(x − 1) and (x + 1)/1 in ℚ(x)
are different rational functions on the grounds that their domains are different, since the domain of the first one does not contain 1, whereas 1 is in the domain of the second one. Neither of them has a domain, for neither of them is a function. And these rational functions are equal because the polynomials (x² − 1)·1 and (x − 1)(x + 1) in ℚ[x] are equal.
Proof: F consists of the fractions a/b, where a,b ∈ D and b ≠ 0; and D(x) consists of the fractions f(x)/g(x), where f(x), g(x) ∈ D[x] and g(x) ≠ 0. An element a of D is identified with the fraction a/1 in F (Theorem 31.5), whence D ⊆ F. Thus D[x] ⊆ F[x] as sets. Note that two elements f(x)/g(x) and p(x)/q(x) of D(x) are equal in D(x) if and only if f(x)q(x) = g(x)p(x) in D[x], and this holds if and only if f(x)q(x) = g(x)p(x) in F[x], so if and only if f(x)/g(x) and p(x)/q(x) are equal in F(x). Thus every element of D(x) is in F(x) and equality in D(x) coincides with equality in F(x). So D(x) ⊆ F(x).
Next we show F(x) ⊆ D(x). Let p(x)/q(x) ∈ F(x), with p(x), q(x) ∈ F[x], q(x) ≠ 0. Then p(x) = Σ_{i=0}^{n} (a_i/b_i)x^i and q(x) = Σ_{j=0}^{m} (c_j/d_j)x^j, where a_i, b_i, c_j, d_j ∈ D, b_i ≠ 0, d_j ≠ 0 for all i,j and not all of the c_j are equal to 0 ∈ D. We put b = b₀b₁. . .b_{n−1}b_n and d = d₀d₁. . .d_{m−1}d_m. Then dbp(x) and dbq(x) are polynomials in D[x], and hence
p(x)/q(x) = dbp(x)/dbq(x) ∈ D(x).
So F(x) ⊆ D(x). This proves D(x) = F(x).
As an illustration of Lemma 36.2, observe that
((2/3)x² − (1/7)x + 1/4) / ((2/5)x² + (1/3)x − 1/2) ∈ ℚ(x)
is equal to the rational function
5(56x² − 12x + 21) / 14(12x² + 10x − 15) ∈ ℤ(x).
Also, we have D(x1,x2, . . . ,xn) = F(x1,x2, . . . ,xn), for this is true when n = 1
(Lemma 36.2) and, when it is true for n = k, so that D(x1,x2, . . . ,xk) =
F(x1,x2, . . . ,xk), it is also true for n = k + 1:
D(x1,x2, . . . ,xk,xk+1) = D(x1,x2, . . . ,xk)(xk+1)
= F(x1,x2, . . . ,xk)(xk+1)
= F(x1,x2, . . . ,xk,xk+1),
the last equation by the remark above, with F in place of D and k + 1 in
place of n.
In the remainder of this paragraph, we discuss partial fraction expansions of rational functions.
Proof: We first prove the existence of a(x) and b(x). Since q(x), r(x) are
relatively prime, there are polynomials h(x), k(x) in K[x] with
h(x)r(x) + k(x)q(x) = 1.
deg a(x)r(x) = deg a(x) + deg r(x)
< deg q(x) + deg r(x)
= deg q(x)r(x).
Thus s(x) + u(x), and consequently (s(x) + u(x))q(x)r(x), is the zero polynomial in K[x]. This gives a(x)r(x) + b(x)q(x) = f(x). It remains to show that a(x) and b(x) are distinct from the zero polynomial in K[x]. They cannot both be 0, for then f(x) would also be 0, which it is not by hypothesis. If one of them is 0, say a(x) = 0, then b(x) ≠ 0 and f(x) = b(x)q(x) would not be relatively prime to q(x)r(x) (because q(x) is of positive degree, hence not a unit in K[x]), against the hypothesis. This proves the existence of a(x), b(x).
36.5 Lemma: Let K be a field and let f(x)/g(x) be a nonzero rational function in K(x), with deg f(x) < deg g(x). Suppose that f(x) and g(x) are both monic and that f(x) is relatively prime to g(x). Assume g(x) = q(x)r(x), where q(x), r(x) are two relatively prime polynomials of positive degree in K[x]. Then there are uniquely determined nonzero polynomials a(x), b(x) in K[x] such that
f(x)/g(x) = f(x)/(q(x)r(x)) = a(x)/q(x) + b(x)/r(x).
Proof: If f(x)/g(x) is a nonzero rational function in K(x), then f(x) is a nonzero polynomial in K[x], and f(x) is relatively prime to g(x) = q(x)r(x). As f(x) and g(x) are monic, these conditions determine f(x) and g(x) uniquely. The polynomials q(x), r(x) are relatively prime and deg f(x) is smaller than deg q(x)r(x). So the hypotheses of Lemma 36.4 are satisfied and therefore there are uniquely determined nonzero polynomials a(x), b(x) in K[x] such that
f(x) = a(x)r(x) + b(x)q(x), with deg a(x) < deg q(x), deg b(x) < deg r(x).
Dividing both sides of the equation above by g(x) = q(x)r(x), we see that there are uniquely determined nonzero polynomials a(x), b(x) in K[x] such that
f(x)/g(x) = f(x)/(q(x)r(x)) = a(x)/q(x) + b(x)/r(x).
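The existence argument above (start from h(x)r(x) + k(x)q(x) = 1 and split f) is entirely constructive; a sketch over ℚ via the extended Euclidean algorithm, with coefficient lists of Fractions ordered from the constant term upward (helper names are made up, and q, r are assumed relatively prime):

```python
from fractions import Fraction

P = lambda *cs: [Fraction(c) for c in cs]   # coefficient list [c0, c1, ...]

def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def padd(a, b):
    n = max(len(a), len(b))
    return trim([(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
                 for i in range(n)])

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return trim(out)

def pdivmod(a, b):
    a, q = a[:], [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        f, s = a[-1] / b[-1], len(a) - len(b)
        q[s] = f
        for i, bi in enumerate(b):
            a[s + i] -= f * bi
        trim(a)
    return trim(q), trim(a)

def pxgcd(a, b):
    """Extended Euclid in Q[x]: returns (g, h, k) with h*a + k*b = g."""
    r0, r1, h0, h1, k0, k1 = a[:], b[:], P(1), P(0), P(0), P(1)
    while any(r1):
        q, r = pdivmod(r0, r1)
        nq = [-c for c in q]
        r0, r1 = r1, r
        h0, h1 = h1, padd(h0, pmul(nq, h1))
        k0, k1 = k1, padd(k0, pmul(nq, k1))
    return r0, h0, k0

def split(f, q, r):
    """f/(q*r) = a/q + b/r with deg a < deg q, for relatively prime q, r."""
    g, u, v = pxgcd(q, r)                 # u*q + v*r = g, a nonzero constant
    fv = pmul(f, [x / g[0] for x in v])
    _, a = pdivmod(fv, q)                 # a is f*(v/g) reduced mod q
    b, rem = pdivmod(padd(f, [-x for x in pmul(a, r)]), q)
    assert not any(rem)                   # (f - a*r) is divisible by q
    return a, b

# 1/(x(x+1)) = 1/x + (-1)/(x+1)
print(split(P(1), P(0, 1), P(1, 1)))
```

This mirrors the proof exactly: the Bézout pair (h, k) is produced by the Euclidean algorithm, and the degree reduction mod q gives the uniquely determined numerators.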
36.6 Lemma: Let K be a field and let f(x)/g(x) be a nonzero rational function in K(x), with deg f(x) < deg g(x). Suppose that f(x) and g(x) are both monic and that f(x) is relatively prime to g(x). Assume g(x) = q₁(x)q₂(x). . .q_m(x), where q₁(x), q₂(x), . . . , q_m(x) are pairwise relatively prime monic polynomials of positive degree in K[x]. Then there are uniquely determined nonzero polynomials a₁(x), a₂(x), . . . , a_m(x) in K[x] such that
f(x)/g(x) = f(x)/(q₁(x)q₂(x). . .q_m(x)) = a₁(x)/q₁(x) + a₂(x)/q₂(x) + . . . + a_m(x)/q_m(x).
36.7 Lemma: Let K be a field and x an indeterminate over K. Let g(x) be a polynomial in K[x] of degree ≥ 1. Then, for any f(x) ∈ K[x], there are uniquely determined polynomials r₀(x), r₁(x), r₂(x), . . . , r_n(x) such that
f(x) = r₀(x) + r₁(x)g(x) + r₂(x)g(x)² + . . . + r_n(x)g(x)ⁿ
and
rᵢ(x) = 0 or deg rᵢ(x) < deg g(x) for all i = 0, 1, 2, . . . , n.
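The expansion in Lemma 36.7 is computed exactly like writing an integer in base g: divide repeatedly by g(x) and collect the remainders. A sketch over ℚ (assumptions: Fraction coefficient lists, g of degree ≥ 1; names are illustrative):

```python
from fractions import Fraction

def pdivmod(a, b):
    """Polynomial division with remainder; coefficient lists of Fractions,
    constant term first; b must have a nonzero leading coefficient."""
    a, q = a[:], [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        f, s = a[-1] / b[-1], len(a) - len(b)
        q[s] = f
        for i, bi in enumerate(b):
            a[s + i] -= f * bi
        while len(a) > 1 and a[-1] == 0:
            a.pop()
    return q, a

def g_adic(f, g):
    """Digits r_0, r_1, ... of f in base g: f = sum r_i * g^i,
    each r_i = 0 or deg r_i < deg g."""
    digits = []
    while any(f):
        f, r = pdivmod(f, g)
        digits.append(r)
    return digits

P = lambda *cs: [Fraction(c) for c in cs]
# x^3 in base g(x) = x - 1: x^3 = 1 + 3(x-1) + 3(x-1)^2 + (x-1)^3
print(g_adic(P(0, 0, 0, 1), P(-1, 1)))   # digits 1, 3, 3, 1
```

For g(x) = x − c this recovers the Taylor expansion of f about c, which is the case used most often later.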
36.8 Theorem: Let K be a field and p(x)/q(x) a nonzero rational function in K(x), where p(x), q(x) ∈ K[x] are relatively prime in K[x]. Let u be the leading coefficient of q(x) and let q(x) = ug₁(x)^{m₁}g₂(x)^{m₂}. . .g_t(x)^{m_t} be the decomposition of q(x) into polynomials irreducible over K, where the gᵢ(x) are monic. Then there are uniquely determined polynomials G(x),
a₁^{(1)}(x), a₂^{(1)}(x), . . . , a_{m₁}^{(1)}(x), a₁^{(2)}(x), a₂^{(2)}(x), . . . , a_{m₂}^{(2)}(x), . . . , a₁^{(t)}(x), a₂^{(t)}(x),
Proof: We divide p(x) by q(x) and find unique polynomials G(x), H(x) in K[x] with p(x) = q(x)G(x) + H(x), deg H(x) < deg q(x) or H(x) = 0. In the latter case, everything is proved (aᵢ^{(k)}(x) = 0 for all i and k). If H(x) ≠ 0, let v be the leading coefficient of H(x) and put c = v/u. Then H(x) and q(x) are relatively prime (since p(x) and q(x) are). We have H(x) = vh(x), where h(x) is monic, relatively prime to q(x), and
p(x)/q(x) = G(x) + c·(h(x)/(g₁(x)^{m₁}g₂(x)^{m₂}. . .g_t(x)^{m_t}))
with deg h(x) < deg q(x). We may use Lemma 36.6 and get uniquely determined nonzero polynomials b₁(x), b₂(x), . . . , b_t(x) in K[x] such that
h(x)/(g₁(x)^{m₁}g₂(x)^{m₂}. . .g_t(x)^{m_t}) = b₁(x)/g₁(x)^{m₁} + b₂(x)/g₂(x)^{m₂} + . . . + b_t(x)/g_t(x)^{m_t}
and deg b_k(x) < deg g_k(x)^{m_k} for all k = 1,2, . . . ,t. We put f_k(x) = cb_k(x). Then
of f_k(x), the polynomials r_s(x) = 0 for s ≥ m_k. So let
Exercises
1. Let K be a field. For any nonzero rational function f/g in K(x), we define the degree of f/g, denoted by deg f/g, by deg f/g = deg f − deg g. Prove that the degree of a rational function is well defined. Can you extend the degree assertions in Lemma 33.3 to rational functions?
2. Let K be a field. For any rational function f/g in K(x), we define the derivative of f/g, denoted by (f/g)′, by declaring
(f/g)′ = (f′g − fg′)/g².
Prove that differentiation is well defined, i.e., prove that f/g = a/b implies (f/g)′ = (a/b)′.
4. Expand
(2x³ + 3x² + 8x + 6)/((x³ + 3x + 3)(x² + 2x + 3)) ∈ ℚ(x)
and
(4x³ + 3x² + x + 2)/(x⁵ + 4x⁴ + 4x³ + 2x + 2) ∈ ℤ₅(x)
in partial fractions.
§37
Irreducibility Criteria
p ∤ a_n,
p | a_{n−1}, . . . , p | a₁, p | a₀,
p² ∤ a₀,
then f is irreducible over D.
Also a_n = b_mc_k. Since p ∤ a_n, and so p ∤ b_mc_k by hypothesis, we have p ∤ b_m. Thus p | b₀ and p ∤ b_m. Let r be the smallest index for which the coefficient b_r in g(x) is not divisible by p, so that
p | b₀, p | b₁, . . . , p | b_{r−1}, p ∤ b_r   (*)
(possibly r = 1 or r = m).
(b) Let D = ℤ[i] and f(x) = 3x³ + 2x² + (4 − 2i)x + (1 + i) ∈ D[x]. Then D is a unique factorization domain and C(f) ~ 1. Moreover 1 + i ∈ D is a prime element in D and
1 + i ∤ 3,
1 + i | 2, 1 + i | 4 − 2i, 1 + i | 1 + i,
(1 + i)² ∤ 1 + i.
Hence f(x) is irreducible over D.
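Over D = ℤ, the three Eisenstein conditions are mechanical to check; a small sketch (assumes integer coefficients and a prime p; the function name is made up):

```python
def eisenstein(coeffs, p):
    """Check Eisenstein's criterion for f = a0 + a1 x + ... + an x^n
    at the prime p: p does not divide an, p divides each ai with i < n,
    and p^2 does not divide a0."""
    a0, an = coeffs[0], coeffs[-1]
    return (an % p != 0
            and all(a % p == 0 for a in coeffs[:-1])
            and a0 % (p * p) != 0)

# x^4 + 6x^3 + 9x + 3 is irreducible over Q by Eisenstein at p = 3
print(eisenstein([3, 9, 0, 6, 1], 3))   # True
```

A True answer certifies irreducibility over ℚ; a False answer is inconclusive, since the criterion is sufficient but not necessary.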
and, when we substitute x + 1 for x in both sides of this equation, we get
xΦ_p(x + 1) = (x + 1)^p − 1 = Σ_{k=0}^{p−1} C(p,k)x^{p−k},
Φ_p(x + 1) = x^{p−1} + C(p,1)x^{p−2} + C(p,2)x^{p−3} + . . . + C(p,p−1)
(writing C(p,k) for the binomial coefficient).
This implies that Φ_p(x) is also irreducible over ℚ, since Φ_p(x) is clearly not a unit in ℚ[x] and any factorization Φ_p(x) = f(x)g(x) of Φ_p(x) into nonunit polynomials f(x), g(x) ∈ ℚ[x] would give a factorization Φ_p(x + 1) = f(x + 1)g(x + 1) = f₁(x)g₁(x) of Φ_p(x + 1) into nonunit polynomials f₁(x), g₁(x) in ℚ[x], contrary to the irreducibility of Φ_p(x + 1) over ℚ.
Proof: (1) The mapping T: f(x) → f(αx + β) is just the substitution homomorphism T_{αx+β} (Lemma 35.3 with D, D[x], αx + β in place of R, S, s, respectively). We are to show that T is one-to-one and onto. To this end, we need only find an inverse of T (Theorem 3.17(2)). This is quite easy. We are tempted to substitute (x − β)/α for x. This idea is correct, but we must formulate it properly. Since α is a unit in D, there is an inverse α⁻¹ of α in D, and we put S: D[x] → D[x], f(x) → f(α⁻¹(x − β)). Then we have
f(x)TS = f(αx + β)S = f(α(α⁻¹(x − β)) + β) = f(x),
f(x)ST = f(α⁻¹(x − β))T = f(α⁻¹((αx + β) − β)) = f(x),
(3) If f(x) ∈ D[x]\{0} is not irreducible over D, then either f(x) is a unit in D[x], hence f(x) ∈ D is a unit in D and f(αx + β) = f(x) (by part (1)) is also a unit in D and in D[x]; or f(x) = g(x)h(x) for some polynomials g(x), h(x) in D[x] with 1 ≤ deg g(x) < deg f(x), and then f(αx + β) = g(αx + β)h(αx + β) with g(αx + β), h(αx + β) ∈ D[x] and 1 ≤ deg g(αx + β) = deg g(x) < deg f(x) = deg f(αx + β) (by part (2)), and thus f(αx + β) has a proper divisor. In either case, f(αx + β) is not irreducible over D. Repeating the same argument for the substitution x → α⁻¹(x − β), we conclude: if f(αx + β) is not irreducible over D, then f(x) is not irreducible over D.
f(αx + β) = C(n,0)a_nα^nx^n + (C(n,1)a_nα^{n−1}β + C(n−1,0)a_{n−1}α^{n−1})x^{n−1}
+ (C(n,2)a_nα^{n−2}β² + C(n−1,1)a_{n−1}α^{n−2}β + C(n−2,0)a_{n−2}α^{n−2})x^{n−2}
+ . . . .
(2) Suppose, on the contrary, that f = gh in D[x], with 0 < deg g < deg f. Then f̄ = ḡh̄ by (1). Since f̄ is irreducible in K[x], f̄ ≠ 0, so ḡ ≠ 0 ≠ h̄ and either deg ḡ = 0 or deg h̄ = 0. We get then
deg f = deg f̄ = deg ḡh̄ = deg ḡ + deg h̄ ≤ deg g + deg h = deg gh = deg f,
which forces deg ḡ = deg g and deg h̄ = deg h. Thus either deg g = 0 or deg h = 0, and so either 0 = deg g or deg g = deg f, against our hypothesis 0 < deg g < deg f.
(b) Lemma 37.4 can be useful even if f̄ is not irreducible. The factorization of f̄ in K[x] gives us information about possible factors of f in D[x] and restricts their number drastically.
As an illustration, consider f(x) = x⁵ + 5x⁴ + 4x³ + 16x² + 8x + 1 ∈ ℤ[x]. Under ¯: ℤ[x] → ℤ₃[x], where ¯: ℤ → ℤ₃ is the natural homomorphism, we have (we drop the bars for ease of notation)
f = x⁵ + 2x⁴ + x³ + x² + 2x + 1 ∈ ℤ₃[x]
= (x² + 2x + 1)(x³ + 1)
= (x + 1)²(x + 1)(x² − x + 1)
= (x + 1)²(x + 1)(x² + 2x + 1)
= (x + 1)⁵,
so any monic factor g of f in ℤ₃[x] with 1 ≤ deg g ≤ 2 satisfies
g = x + 1 ∈ ℤ₃[x] or g = (x + 1)² ∈ ℤ₃[x]
(ℤ₃[x] is a unique factorization domain).
g_m(1) | f(1) in ℤ,
3m + 4 | 35,
3m + 4 ∈ {1, 5, 7, 35, −1, −5, −7, −35},
3m + 4 = 1, 7, −5, −35,
3m + 2 = −1, 5, −7, −37,
g_m(x) = x² − x + 1 or x² + 5x + 1 or x² − 7x + 1 or x² − 37x + 1.
(c) Lemma 37.4 gives a very elegant proof of Eisenstein's criterion in case the underlying ring is a principal ideal domain. Suppose D is a principal ideal domain and
f(x) = a_nx^n + a_{n−1}x^{n−1} + . . . + a₁x + a₀
is a nonzero polynomial in D[x] with C(f) ~ 1 and p is a prime element of D such that
p ∤ a_n,
p | a_{n−1}, . . . , p | a₁, p | a₀,
p² ∤ a₀.
Since p is irreducible, the factor ring D/Dp is a field (Theorem 32.25). We can use Lemma 37.4 with the natural homomorphism ¯: D → D/Dp. The divisibility conditions on the coefficients of f imply
f̄ = ā_nx^n, ā_n ∈ D/Dp, ā_n ≠ 0.
If f had a proper factorization f = gh in D[x], where 0 < deg g < n, we would get
ḡh̄ = f̄ = ā_nx^n,
hence ḡ = b̄x^r, h̄ = c̄x^s with 0 < r < n, 0 < s < n and b̄c̄ = ā_n. Then the constant terms of g and h would be divisible by p, and p² would divide their product a₀, contrary to the hypothesis. Hence f is irreducible over D.
The idea (that g_m(x) | f(x) implies g_m(1) | f(1)) in Example 37.5(b) has been exploited by L. Kronecker (1823-1891). Let D be an integral domain and let f(x) be an arbitrary nonzero polynomial in D[x]. To find out whether f is irreducible over D, one must check whether g | f or g ∤ f holds for all polynomials g with deg g < deg f. If D happens to be finite (and thus a field; Theorem 31.1), there are finitely many g's with deg g < deg f, and the question whether f is irreducible over D can be decided by checking g | f for these finitely many g's. If D is not finite, this argument does not work, and we must, so it seems, check if g | f for infinitely many polynomials g ∈ D[x]. Kronecker showed that, if D is a unique factorization domain which possesses a finite number of units and if we have a method for finding the irreducible factors of any given nonzero element of D, then, to find out whether a given nonzero polynomial is irreducible or not, we need check g | f for only a finite number of polynomials g in D[x].
His idea is that, if g(x) | f(x) in D[x], then g(a) | f(a) in D for any a ∈ D, and that a polynomial g is determined uniquely if its values are known at more than deg g elements of D (Lagrange's interpolation formula).
From this list of polynomials, we delete those which are not in D[x]. If any polynomial g remains, we divide f by g in F[x]. Then f = qg + r, with q,r ∈ F[x]. If r ≠ 0, or if r = 0 but q ∉ D[x], we delete g from our list. We also delete from our list the polynomials which are units in D. If any polynomial g survives, it is a factor of f. Otherwise, f is irreducible over D.
When a proper divisor g of f is found in this way, the same procedure can be applied to g and f/g. Repeating this process, we can find all irreducible factors of f.
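Kronecker's procedure can be sketched in its simplest instance: hunting for a degree-one factor of f ∈ ℤ[x] from the two interpolation points 0 and 1. The sketch below assumes f(0) ≠ 0 ≠ f(1) (otherwise x or x − 1 already divides f); all names are illustrative:

```python
from fractions import Fraction
from itertools import product

def sdivs(n):
    """All (signed) divisors of a nonzero integer n."""
    ds = [d for d in range(1, abs(n) + 1) if n % d == 0]
    return [s * d for d in ds for s in (1, -1)]

def peval(f, x):
    v = 0
    for c in reversed(f):      # Horner evaluation
        v = v * x + c
    return v

def pdivmod(a, b):
    """Division in Q[x]; integer coefficient lists, constant term first."""
    a = [Fraction(c) for c in a]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        s = len(a) - len(b)
        q[s] = a[-1] / b[-1]
        for i, bi in enumerate(b):
            a[s + i] -= q[s] * bi
        while len(a) > 1 and a[-1] == 0:
            a.pop()
    return q, a

def kronecker_linear_factor(f):
    """Search for a degree-one factor of f in Z[x]: a divisor g must
    satisfy g(0) | f(0) and g(1) | f(1), and the line through
    (0, y0), (1, y1) is g(x) = y0 + (y1 - y0)x."""
    y0s, y1s = sdivs(peval(f, 0)), sdivs(peval(f, 1))
    for y0, y1 in product(y0s, y1s):
        g = [y0, y1 - y0]
        if g[1] == 0:
            continue                   # not of degree 1
        q, r = pdivmod(f, g)
        if not any(r) and all(c.denominator == 1 for c in q):
            return g
    return None

# x^3 + x + 2 = (x + 1)(x^2 - x + 2)
print(kronecker_linear_factor([2, 1, 0, 1]))   # [1, 1], i.e. x + 1
```

Higher-degree factors work the same way with more interpolation points, exactly as described in the text.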
Exercises
§38
Symmetric Polynomials
Let D be an integral domain and let f(x₁,x₂, . . . ,x_m) ∈ D[x₁,x₂, . . . ,x_m]. For any permutation σ = (1→i₁, 2→i₂, . . . , m→i_m) in S_m, the value of f at (x_{i₁},x_{i₂}, . . . ,x_{i_m})

and so h(x₁,x₂, . . . ,x_m) is a symmetric polynomial. The same argument works also when h = f − g and h = fg. This proves
We see that x₁,x₂, . . . ,x_m ∈ D[x₁,x₂, . . . ,x_m] are the roots of f(t). We have
f(t) = t^m − σ₁(x₁,x₂,. . . ,x_m)t^{m−1} + σ₂(x₁,x₂,. . . ,x_m)t^{m−2} − . . . + (−1)^m σ_m(x₁,x₂,. . . ,x_m)
for some σ₁, σ₂, . . . , σ_m in D[x₁,x₂, . . . ,x_m]. Since
f(t) = (t − x_{i₁})(t − x_{i₂}). . .(t − x_{i_m})
= t^m − σ₁(x_{i₁},x_{i₂},. . . ,x_{i_m})t^{m−1} + σ₂(x_{i₁},x_{i₂},. . . ,x_{i_m})t^{m−2} − . . . + (−1)^m σ_m(x_{i₁},x_{i₂},. . . ,x_{i_m})
for any permutation (1→i₁, 2→i₂, . . . , m→i_m) in S_m, we have
σ_j(x_{i₁},x_{i₂},. . . ,x_{i_m}) = σ_j(x₁,x₂,. . . ,x_m) for all j = 1,2, . . . ,m.
Thus σ₁, σ₂, . . . , σ_m are symmetric polynomials in D[x₁,x₂, . . . ,x_m].
σ₁ = x + y, σ₂ = xy in D[x,y];
σ₁ = x + y + z, σ₂ = xy + yz + zx, σ₃ = xyz in D[x,y,z];
σ₁ = x + y + z + u, σ₂ = xy + xz + xu + yz + yu + zu,
σ₃ = xyz + xyu + xzu + yzu, σ₄ = xyzu in D[x,y,z,u].
In general,
σ₁ = Σ_i x_i,
σ₂ = Σ_{i<j} x_ix_j,
σ₃ = Σ_{i<j<k} x_ix_jx_k,
. . . . . . . . . . . . . . . . . . . . . . . . . .
σ_m = x₁x₂. . .x_m.
Note that "σ_j" stands for many polynomials: σ_j in D[x₁,x₂, . . . ,x_m] is distinct from σ_j in D[x₁,x₂, . . . ,x_n] when m ≠ n. This ambiguity in notation will not cause any confusion if we pay attention to the number of indeterminates. When confusion is likely, we write σ_j(x₁,x₂, . . . ,x_m) instead of σ_j.
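The σ_j are immediate to evaluate at a concrete point, and expanding (t − x₁). . .(t − x_m) recovers them as coefficients, as in the discussion above; a small sketch (names are made up):

```python
from itertools import combinations
from math import prod

def sigma(xs, j):
    """Elementary symmetric polynomial sigma_j evaluated at xs; sigma_0 = 1."""
    return sum(prod(c) for c in combinations(xs, j))

def expand_roots(xs):
    """Coefficients of (t - x1)(t - x2)...(t - xm), highest power first:
    t^m - sigma_1 t^{m-1} + sigma_2 t^{m-2} - ... + (-1)^m sigma_m."""
    return [(-1)**j * sigma(xs, j) for j in range(len(xs) + 1)]

print(expand_roots([1, 2, 3]))   # [1, -6, 11, -6], i.e. t^3 - 6t^2 + 11t - 6
```

The alternating signs in the output are exactly the (−1)^j σ_j of the displayed expansion of f(t).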
nomial in D[x₁,x₂, . . . ,x_m]. Then there is a unique polynomial g(u₁,u₂, . . . ,u_m) in D[u₁,u₂, . . . ,u_m] such that f is the value of g at (σ₁,σ₂, . . . ,σ_m):
f(x₁,x₂, . . . ,x_m) = g(σ₁,σ₂, . . . ,σ_m) ∈ D[x₁,x₂, . . . ,x_m].

(a + b + c + . . . )x₁^{k₁}x₂^{k₂}. . .x_m^{k_m}. We assume this has been done for each of the exponent systems, so that each m-tuple (k₁,k₂, . . . ,k_m) occurs as an exponent system of a monomial at most once. If, after this collection process, a monomial ax₁^{k₁}x₂^{k₂}. . .x_m^{k_m} occurring in f has a nonzero coefficient a ∈ D, we will say that ax₁^{k₁}x₂^{k₂}. . .x_m^{k_m} appears in f.
system arises only from the product (ax₁^{k₁}x₂^{k₂}. . .x_m^{k_m})(bx₁^{n₁}x₂^{n₂}. . .x_m^{n_m}). This will imply
Since the exponent system (k₁ + n₁, k₂ + n₂, . . . , k_m + n_m) does arise from the product (ax₁^{k₁}x₂^{k₂}. . .x_m^{k_m})(bx₁^{n₁}x₂^{n₂}. . .x_m^{n_m}), it is indeed the highest exponent system of all the products (cx₁^{r₁}x₂^{r₂}. . .x_m^{r_m})(dx₁^{s₁}x₂^{s₂}. . .x_m^{s_m}), where cx₁^{r₁}x₂^{r₂}. . .x_m^{r_m} and dx₁^{s₁}x₂^{s₂}. . .x_m^{s_m} run through all monomials appearing in f and g, respectively. This proves our contention, and also the lemma.
By induction, we obtain
38.6 Lemma: Let D be an integral domain and f1,f2, . . . ,ft be nonzero
polynomials in D[x1,x2, . . . ,xm]. Then the leading monomial of f1f2. . . ft is
the product of the leading monomials of f1,f2, . . . ,ft.
38.7 Lemma: Let D be an integral domain, a ∈ D\{0}, and let σ₁, σ₂, . . . , σ_m be the elementary symmetric polynomials in D[x₁,x₂, . . . ,x_m]. If k₁ ≥ k₂ ≥ k₃ ≥ . . . ≥ k_m ≥ 0 are integers, then the leading monomial of
aσ₁^{k₁−k₂}σ₂^{k₂−k₃}. . .σ_{m−1}^{k_{m−1}−k_m}σ_m^{k_m}
is ax₁^{k₁}x₂^{k₂}. . .x_{m−1}^{k_{m−1}}x_m^{k_m}.
We need one more lemma for the proof of the fundamental theorem.
for all σ ∈ S_m, (k₁,k₂, . . . ,k_m) is higher than or equal to (k_{1σ},k_{2σ}, . . . ,k_{mσ}).

. . . ,k_m) is higher than or equal to (k₁,k₃,k₂, . . . ,k_m), so k₂ ≥ k₃. In like manner, when we choose σ = (34), . . . , (m−1 m) ∈ S_m, we get k₃ ≥ k₄, . . . , k_{m−1} ≥ k_m. This proves (1).
for all σ ∈ S_m, (k₁,k₂, . . . ,k_m) is higher than or equal to (r_{1σ},r_{2σ}, . . . ,r_{mσ}).
Now suppose that (k₁,k₂, . . . ,k_m) is higher than (0,0, . . . ,0) and that, for any nonzero symmetric polynomial f₁ ∈ D[x₁,x₂, . . . ,x_m] whose leading monomial has a lower exponent system than (k₁,k₂, . . . ,k_m), there is a polynomial g₁ in D[u₁,u₂, . . . ,u_m] such that f₁(x₁,x₂, . . . ,x_m) = g₁(σ₁,σ₂, . . . ,σ_m). Under this assumption, we will prove the existence of a polynomial g in D[u₁,u₂, . . . ,u_m] with f(x₁,x₂, . . . ,x_m) = g(σ₁,σ₂, . . . ,σ_m). This will establish the fundamental theorem, because (0,0, . . . ,0) is the lowest possible exponent system and the theorem has been proved in this case above. Moreover, as there are only a finite number of m-tuples lower than (k₁,k₂, . . . ,k_m), the method of proof can be used effectively to find the polynomial g explicitly in concrete cases. [Basically, we write the m-tuples L₁,L₂,L₃, . . . in alphabetical order and prove that (1) the theorem is true for all nonzero symmetric polynomials whose leading monomials have the exponent system L₁ = (0,0, . . . ,0) and that (2) for any s > 1, if the theorem is true for all nonzero symmetric polynomials whose leading monomials have exponent systems equal to one of L₁,L₂, . . . ,L_{s−1}, then the theorem is also true for all nonzero symmetric polynomials whose leading monomials have the exponent system L_s. Once the leading monomial of a symmetric polynomial is given, there can be only a finite number of exponent systems of monomials appearing in that symmetric polynomial (Lemma 38.8(2)).]
This completes the proof of the existence of g. It remains to show the uniqueness of g. Suppose now f is a nonzero symmetric polynomial in D[x₁,x₂, . . . ,x_m] and assume that g,h ∈ D[u₁,u₂, . . . ,u_m] with g(σ₁,σ₂, . . . ,σ_m) = f(x₁,x₂, . . . ,x_m) = h(σ₁,σ₂, . . . ,σ_m). If g were distinct from h, then g − h ≠ 0 would have a leading monomial, which we may write in the form
bu₁^{s₁−s₂}u₂^{s₂−s₃}. . .u_{m−1}^{s_{m−1}−s_m}u_m^{s_m}, where s₁ ≥ s₂ ≥ . . . ≥ s_{m−1} ≥ s_m.
Then 0 = f − f = g(σ₁,σ₂, . . . ,σ_m) − h(σ₁,σ₂, . . . ,σ_m) in D[x₁,x₂, . . . ,x_m] would have the leading monomial bx₁^{s₁}x₂^{s₂}. . .x_{m−1}^{s_{m−1}}x_m^{s_m}, a contradiction. Hence g = h, as was to be proved.
from f − σ₁σ₂ and get
(f − σ₁σ₂) − 2σ₃ = 2xyz − 2xyz = 0.
Hence f(x,y,z) = σ₁σ₂ + 2σ₃.
(b) We express
f(x,y,z,w) = x³ + y³ + z³ + w³ ∈ D[x,y,z,w]
in terms of σ₁, σ₂, σ₃, σ₄. The monomials are in alphabetical order, and the leading monomial of f is 1x³y⁰z⁰w⁰. So we subtract 1σ₁^{3−0}σ₂^{0−0}σ₃^{0−0}σ₄⁰ = σ₁³ from f.

The leading monomial of f − σ₁³ is −3x²y = −3x²y¹z⁰w⁰. We therefore subtract −3σ₁^{2−1}σ₂^{1−0}σ₃^{0−0}σ₄⁰ = −3σ₁σ₂ from f − σ₁³ and get
(f − σ₁³) − (−3σ₁σ₂) = (f − σ₁³) + 3(x + y + z + w)(xy + xz + xw + yz + yw + zw)
= . . . . . .
= 3xyz + 3xyw + 3xzw + 3yzw
= 3σ₃.
Hence f(x,y,z,w) = σ₁³ − 3σ₁σ₂ + 3σ₃.
and
0 = s_m − σ₁s_{m−1} + σ₂s_{m−2} − . . . + (−1)^{m−2}σ_{m−2}s₂ + (−1)^{m−1}σ_{m−1}s₁ + (−1)^m mσ_m,
0 = s_{m+1} − σ₁s_m + σ₂s_{m−1} − . . . + (−1)^{m−2}σ_{m−2}s₃ + (−1)^{m−1}σ_{m−1}s₂ + (−1)^m σ_ms₁,
0 = s_{m+2} − σ₁s_{m+1} + σ₂s_m − . . . + (−1)^{m−2}σ_{m−2}s₄ + (−1)^{m−1}σ_{m−1}s₃ + (−1)^m σ_ms₂,
0 = s_{m+3} − σ₁s_{m+2} + σ₂s_{m+1} − . . . + (−1)^{m−2}σ_{m−2}s₅ + (−1)^{m−1}σ_{m−1}s₄ + (−1)^m σ_ms₃,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
and that x₁,x₂, . . . ,x_m are the roots of f(t) ∈ D[x₁,x₂, . . . ,x_m][t]. Hence
0 = x_i^m − σ₁x_i^{m−1} + σ₂x_i^{m−2} − . . . + (−1)^{m−1}σ_{m−1}x_i + (−1)^m σ_m
for all i = 1,2, . . . ,m. Multiplying both sides of this equation by x_i^j, where j = 0,1,2,3, . . . , we get
0 = x_i^{m+j} − σ₁x_i^{m+j−1} + σ₂x_i^{m+j−2} − . . . + (−1)^{m−1}σ_{m−1}x_i^{j+1} + (−1)^m σ_mx_i^j
for all i = 1,2, . . . ,m. Adding these m equations side by side, we obtain
0 = Σ_{i=1}^{m} x_i^{m+j} − σ₁Σ_{i=1}^{m} x_i^{m+j−1} + σ₂Σ_{i=1}^{m} x_i^{m+j−2} − . . . + (−1)^{m−1}σ_{m−1}Σ_{i=1}^{m} x_i^{j+1} + (−1)^m σ_mΣ_{i=1}^{m} x_i^j,
i.e.,
0 = s_{m+j} − σ₁s_{m+j−1} + σ₂s_{m+j−2} − . . . + (−1)^{m−2}σ_{m−2}s_{j+2} + (−1)^{m−1}σ_{m−1}s_{j+1} + (−1)^m σ_ms_j.
This establishes all the equations except the first m − 1 of them. The first m − 1 equations will be established by a similar reasoning. This time we make use of the derivative of f(t). By Lemma 35.16(2), we have
f(t)/(t − x_i) = q_{m−1}^{(i)}t^{m−1} + q_{m−2}^{(i)}t^{m−2} + . . . + q₁^{(i)}t + q₀^{(i)}.
Then
f′(t) = Σ_{i=1}^{m} f(t)/(t − x_i) = Σ_{i=1}^{m} (q_{m−1}^{(i)}t^{m−1} + q_{m−2}^{(i)}t^{m−2} + . . . + q₁^{(i)}t + q₀^{(i)}),
so that
m = Σ_{i=1}^{m} q_{m−1}^{(i)}, −(m−1)σ₁ = Σ_{i=1}^{m} q_{m−2}^{(i)}, . . . , (−1)^{m−2}·2σ_{m−2} = Σ_{i=1}^{m} q₁^{(i)},
(−1)^{m−1}σ_{m−1} = Σ_{i=1}^{m} q₀^{(i)}.   (*)
On the other hand, comparing the coefficients in f(t) = (t − x_i)(q_{m−1}^{(i)}t^{m−1} + q_{m−2}^{(i)}t^{m−2} + . . . + q₁^{(i)}t + q₀^{(i)}), we find
1 = q_{m−1}^{(i)}
−σ₁ = q_{m−2}^{(i)} − q_{m−1}^{(i)}x_i
+σ₂ = q_{m−3}^{(i)} − q_{m−2}^{(i)}x_i
−σ₃ = q_{m−4}^{(i)} − q_{m−3}^{(i)}x_i
. . . . . . . . . . . .
(−1)^{m−1}σ_{m−1} = q₀^{(i)} − q₁^{(i)}x_i
(−1)^m σ_m = −q₀^{(i)}x_i,
that is,
q_{m−1}^{(i)} = 1
q_{m−2}^{(i)} = −σ₁ + q_{m−1}^{(i)}x_i
q_{m−3}^{(i)} = +σ₂ + q_{m−2}^{(i)}x_i
q_{m−4}^{(i)} = −σ₃ + q_{m−3}^{(i)}x_i
. . . . . . . . . . . .
q₀^{(i)} = (−1)^{m−1}σ_{m−1} + q₁^{(i)}x_i
0 = (−1)^m σ_m + q₀^{(i)}x_i.
Substituting each of these equations into the next one, we obtain
q_{m−2}^{(i)} = −σ₁ + x_i   (1)
q_{m−3}^{(i)} = +σ₂ + (−σ₁ + x_i)x_i = σ₂ − σ₁x_i + x_i²   (2)
q_{m−4}^{(i)} = −σ₃ + (σ₂ − σ₁x_i + x_i²)x_i = −σ₃ + σ₂x_i − σ₁x_i² + x_i³   (3)
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
q₀^{(i)} = (−1)^{m−1}σ_{m−1} + ((−1)^{m−2}σ_{m−2} + (−1)^{m−3}σ_{m−3}x_i + . . . + (−1)σ₁x_i^{m−3} + x_i^{m−2})x_i
= (−1)^{m−1}σ_{m−1} + (−1)^{m−2}σ_{m−2}x_i + (−1)^{m−3}σ_{m−3}x_i² + . . . + (−1)σ₁x_i^{m−2} + x_i^{m−1}.   (m−1)
We now add the m equations (1), the m equations (2), the m equations (3), . . . , the m equations (m−1) (for i = 1,2, . . . ,m). Using (*), we get
−(m−1)σ₁ = −mσ₁ + s₁
+(m−2)σ₂ = +mσ₂ − σ₁s₁ + s₂
−(m−3)σ₃ = −mσ₃ + σ₂s₁ − σ₁s₂ + s₃
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
(−1)^{m−1}σ_{m−1} = (−1)^{m−1}mσ_{m−1} + (−1)^{m−2}σ_{m−2}s₁ + (−1)^{m−3}σ_{m−3}s₂ + . . . + (−1)σ₁s_{m−2} + s_{m−1},
that is,
s₁ − σ₁ = 0
s₂ − σ₁s₁ + 2σ₂ = 0
s₃ − σ₁s₂ + σ₂s₁ − 3σ₃ = 0
. . . . . . . . . . . . . . . . . . . . . .
s_{m−1} − σ₁s_{m−2} + σ₂s_{m−3} + . . . + (−1)^{m−2}σ_{m−2}s₁ + (−1)^{m−1}(m−1)σ_{m−1} = 0.
From these equations we obtain successively
s₁ = σ₁,
s₂ = σ₁s₁ − 2σ₂ = σ₁σ₁ − 2σ₂ = σ₁² − 2σ₂,
s₃ = σ₁s₂ − σ₂s₁ + 3σ₃ = σ₁(σ₁² − 2σ₂) − σ₂σ₁ + 3σ₃ = σ₁³ − 3σ₁σ₂ + 3σ₃,
s₄ = σ₁s₃ − σ₂s₂ + σ₃s₁ − 4σ₄ = σ₁(σ₁³ − 3σ₁σ₂ + 3σ₃) − σ₂(σ₁² − 2σ₂) + σ₃σ₁ − 4σ₄
= σ₁⁴ − 4σ₁²σ₂ + 4σ₁σ₃ + 2σ₂² − 4σ₄
(here σ_j should be replaced by 0 when j > m).
p(t) = c₀(t − a₁)(t − a₂). . .(t − a_m) in E[t].
Hence p(t) ∈ E[t] is obtained from
c₀f(t) = c₀(t − x₁)(t − x₂). . .(t − x_m) ∈ D[x₁,x₂, . . . ,x_m][t]
by substituting a_i for x_i (i = 1,2, . . . ,m). Now
c₀f(t) = c₀(t^m − σ₁(x₁,x₂, . . . ,x_m)t^{m−1} + σ₂(x₁,x₂, . . . ,x_m)t^{m−2} − . . . + (−1)^m σ_m(x₁,x₂, . . . ,x_m))
and, since substitution is a homomorphism (Lemma 35.20), we have
p(t) = c₀(t^m − σ₁(a₁,a₂, . . . ,a_m)t^{m−1} + σ₂(a₁,a₂, . . . ,a_m)t^{m−2} − . . . + (−1)^m σ_m(a₁,a₂, . . . ,a_m)).
Therefore
c₁ = −c₀σ₁(a₁,a₂, . . . ,a_m)
c₂ = +c₀σ₂(a₁,a₂, . . . ,a_m)
c₃ = −c₀σ₃(a₁,a₂, . . . ,a_m)
. . . . . . . . . . . . . . .
c_{m−1} = (−1)^{m−1}c₀σ_{m−1}(a₁,a₂, . . . ,a_m)
c_m = (−1)^m c₀σ_m(a₁,a₂, . . . ,a_m);
in words: the values of the elementary symmetric polynomials at the roots of a polynomial are equal to the coefficients of that polynomial, except for a factor ±c₀, where c₀ is the leading coefficient of the polynomial. The equations above tell us that (i) σᵢ(a₁,a₂, . . . ,a_m) belong to D if c₀ is a unit in D; (ii) σᵢ(a₁,a₂, . . . ,a_m) belong to the field of fractions of D in any case; (iii) in particular, σᵢ(a₁,a₂, . . . ,a_m) belong to D if D is a field.
Moreover, if h(x₁,x₂, . . . ,x_m) ∈ D[x₁,x₂, . . . ,x_m] is a symmetric polynomial, then h(x₁,x₂, . . . ,x_m) = g(σ₁,σ₂, . . . ,σ_m) for some polynomial g in m indeterminates over D, and substitution yields
h(a₁,a₂, . . . ,a_m) = g(σ₁(a₁,a₂, . . . ,a_m), σ₂(a₁,a₂, . . . ,a_m), . . . , σ_m(a₁,a₂, . . . ,a_m)),
so that (i) h(a₁,a₂, . . . ,a_m) belongs to D if c₀ is a unit in D; (ii) h(a₁,a₂, . . . ,a_m) belongs to the field of fractions of D in any case; (iii) h(a₁,a₂, . . . ,a_m) belongs to D if D is a field. We summarize this discussion in the next theorem.
(1) cᵢ = (−1)^i c₀σᵢ(a₁,a₂, . . . ,a_m) for i = 1,2, . . . ,m.
(2) If h is any symmetric polynomial in m indeterminates over D, then h(a₁,a₂, . . . ,a_m), which is an element of the integral domain containing the roots of p(t), is in fact an element of the field of fractions of D.
(3) If, in addition, the leading coefficient of p(t) is a unit in D, then h(a₁,a₂, . . . ,a_m) belongs to D.
(4) If, in particular, D is a field, then h(a₁,a₂, . . . ,a_m) belongs to D.
(x²y² + . . . + z²u²) − σ₂² = −2x²yz − . . . − 6xyzu,
and (x²y² + . . . + z²u²) − σ₂² − (−2σ₁^{2−1}σ₂^{1−1}σ₃^{1−0}σ₄⁰ + 2σ₄) = . . . = 0,
so x²y² + x²z² + x²u² + y²z² + y²u² + z²u² = σ₂² − 2σ₁σ₃ + 2σ₄.
Then a²b² + a²c² + a²d² + b²c² + b²d² + c²d²
(b) We find a polynomial of degree three in ℚ[t] whose roots are the cubes of the roots of t³ + 2t² + 3t + 4 ∈ ℚ[t]. Let us denote the roots of this polynomial by a,b,c, so that a + b + c = −2, ab + ac + bc = 3, abc = −4.
We put t³ − q₁t² + q₂t − q₃ = (t − a³)(t − b³)(t − c³).
From Theorem 38.11, we know that
q₁ = a³ + b³ + c³, q₂ = a³b³ + a³c³ + b³c³, q₃ = a³b³c³.
Since s₃ = σ₁³ − 3σ₁σ₂ + 3σ₃, we conclude
q₁ = a³ + b³ + c³
= (a + b + c)³ − 3(a + b + c)(ab + ac + bc) + 3(abc)
= (−2)³ − 3(−2)(3) + 3(−4)
= −2.
We find easily that x³y³ + x³z³ + y³z³ = σ₂³ − 3σ₁σ₂σ₃ + 3σ₃²; hence
q₂ = a³b³ + a³c³ + b³c³
= (ab + ac + bc)³ − 3(a + b + c)(ab + ac + bc)(abc) + 3(abc)²
= (3)³ − 3(−2)(3)(−4) + 3(−4)²
= 3.
Finally, q₃ = a³b³c³
= (abc)³
= (−4)³
= −64.
Thus
t³ − (−2)t² + (3)t − (−64) = t³ + 2t² + 3t + 64 ∈ ℚ[t]
is a polynomial whose roots are the cubes of the roots of t³ + 2t² + 3t + 4.
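Example (b) follows a recipe that works for any monic cubic: express q₁, q₂, q₃ through σ₁, σ₂, σ₃ by the power-sum formulas. A sketch reproducing the computation (e₁, e₂, e₃ denote the elementary symmetric values at the roots, so here e₁ = −2, e₂ = 3, e₃ = −4; the function name is made up):

```python
def cubed_roots_poly(e1, e2, e3):
    """Monic cubic whose roots are the cubes of the roots of
    t^3 - e1*t^2 + e2*t - e3, where e_i = sigma_i at the roots."""
    q1 = e1**3 - 3*e1*e2 + 3*e3         # a^3 + b^3 + c^3  (the s_3 formula)
    q2 = e2**3 - 3*e1*e2*e3 + 3*e3**2   # a^3b^3 + a^3c^3 + b^3c^3
    q3 = e3**3                          # (abc)^3
    return [1, -q1, q2, -q3]            # t^3 - q1*t^2 + q2*t - q3

# t^3 + 2t^2 + 3t + 4 has e1 = -2, e2 = 3, e3 = -4
print(cubed_roots_poly(-2, 3, -4))      # [1, 2, 3, 64]
```

No root of the original cubic is ever computed: everything stays inside ℚ, which is precisely the point of Theorem 38.11.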
Exercises
3. Let K be a field. A rational function f(x₁,x₂, . . . ,x_m)/g(x₁,x₂, . . . ,x_m) in K(x₁,x₂, . . . ,x_m) is said to be a symmetric rational function over K if
f(x_{1σ},x_{2σ}, . . . ,x_{mσ})/g(x_{1σ},x_{2σ}, . . . ,x_{mσ}) = f(x₁,x₂, . . . ,x_m)/g(x₁,x₂, . . . ,x_m)
for all σ ∈ S_m. Prove that a symmetric rational function over K can be expressed as a fraction of two symmetric polynomials over K. Conclude that any symmetric rational function over K can be written as
p(σ₁,σ₂, . . . ,σ_m)/q(σ₁,σ₂, . . . ,σ_m)
with suitable polynomials p,q in K[u₁,u₂, . . . ,u_m]. (Loosely speaking, any symmetric rational function is a rational function of the elementary symmetric polynomials.)
CHAPTER 4
Vector Spaces
§39
Definition and Examples
The term "vector" is familiar to the reader from Physics. Such physical
magnitudes as displacement, force, torque, momentum etc. are vectors.
Vectors can be added (by the parallelogram law) and multiplied by real
numbers, which are called scalars in this context. In this chapter, we
introduce systems of objects which can be added and multiplied by
scalars.
are called vectors. The mapping (λ, v) → λv from K × V into V will be called multiplication by scalars.
From now on, the term "vector" will mean an element of a vector space.
We will see vectors which do not resemble the vectors of physics in any
way.
(b) The same construction can be carried out with n-tuples of elements
from any field K. Let K be a field and let V = K K ... K be the di-
rect sum of n copies of K, which is an abelian group under component-
wise addition. We define multiplication by scalars also componentwise:
λ(α₁, α₂, . . . , α_n) = (λα₁, λα₂, . . . , λα_n)   (for all λ ∈ K, (α₁, α₂, . . . , α_n) ∈ V).
It is easily verified that V is a K-vector space. It will be designated by
Kn, and will be called the K-vector space of n-tuples.
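The componentwise operations of Kⁿ are direct to mirror in code; a tiny sketch with K = ℚ (illustrative names):

```python
from fractions import Fraction

def vadd(u, v):
    """Addition in K^n, componentwise."""
    return tuple(a + b for a, b in zip(u, v))

def smul(lam, u):
    """Multiplication by the scalar lam, componentwise."""
    return tuple(lam * a for a in u)

u = (Fraction(1, 2), Fraction(1, 3))
v = (Fraction(1, 2), Fraction(2, 3))
print(vadd(u, v))            # componentwise sums 1/2 + 1/2 and 1/3 + 2/3
print(smul(Fraction(2), u))  # each component doubled
```

The vector-space axioms then hold componentwise because they hold in the field K itself.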
(c) Let V be the set of all real-valued functions defined on the interval
[0,1]. For any two functions f,g in V, we define a new function f + g in V
by (f + g)(x) = f(x) + g(x) for all x [0,1].
V is an abelian group under this addition (called the pointwise addition of functions). Now let K = ℝ and define a pointwise multiplication by scalars:
(λf)(x) = λf(x) for all x ∈ [0,1], λ ∈ ℝ.
Then V is a vector space over ℝ (cf. Example 29.2(i)).
(d) Let K be a field and let K[x] be the ring of all polynomials over K. Let
us forget that we can multiply two polynomials and concentrate on the
fact that we can add them and multiply them by the elements of K
(which are polynomials of degree zero, or the zero polynomial). It is
easily seen that K[x] is a K-vector space.
(e) Let n be a fixed natural number. Let K be a field and let V be the set
of all polynomials over K which have degree n. Is V a vector space over
K? No, because the sum of two polynomials of degree n is not always a
polynomial of degree n (when the leading coefficients are opposites of
each other). On the other hand, the set consisting of the zero polynomial
and of all polynomials over K whose degrees are less than or equal to n
is a vector space over K.
(f) Let V be a vector space over a field K, and let K1 be a field contained
in K (in this case, K1 is called a subfield of K). Then V is a vector space
over K1, too, since the requirements in Definition 39.1 are satisfied by
the elements of K1 if they are satisfied by the elements of K.
(h) It follows from Example 39.2(f) and Example 39.2(g) that, if K1 and K are fields such that K1 ⊆ K, then K is a vector space over K1: any field is a vector space over its subfields. For instance, ℝ is a vector space over ℚ, and ℂ is a vector space over ℝ.
39.3 Remarks: (1) A vector space is an abelian group and has an identity element, which we call zero and denote by 0. The underlying field K has a zero element, too, which is also denoted by 0. The reader should carefully distinguish between these zeroes. One of them is a vector, the other is a scalar. The vector zero is sometimes denoted by 0V to distinguish it.
39.4 Lemma: Let V be a vector space over a field K. For all λ, μ ∈ K and for all u, v, w ∈ V, the following hold.
(1) 0 + v = v.
(2) −v + v = 0.
(3) −0 = 0 (vector zero).
(4) u + v = u + w implies v = w.
(5) λ0 = 0.
(6) 0v = 0.
(7) λ(−v) = (−λ)v = −(λv); in particular, (−1)v = −v.
(8) (−λ)(−v) = λv.
(9) λ(u − v) = λu − λv.
(10) λv = 0 implies λ = 0 or v = 0.
(11) λv = μv implies λ = μ or v = 0.
(12) λu = λv implies λ = 0 or u = v.
Proof: (1),(2),(3),(4) hold in any group (Lemma 7.3, Lemma 8.2), also in
the abelian group (V,+).
(6) We are to prove 0v = 0 (on the left hand side, we have the scalar zero, on the right hand side, the vector zero). We observe
0v + 0 = 0v = (0 + 0)v = 0v + 0v,
hence 0 = 0v by (4).
(7) We have 0 = λ0 = λ(v + (−v)) = λv + λ(−v)
and 0 = 0v = (λ + (−λ))v = λv + (−λ)v
by (5) and (6), so λ(−v) and (−λ)v are both the opposite of λv. Thus
λ(−v) = −(λv) = (−λ)v.
(9) λ(u − v) = λ(u + (−v)) = λu + λ(−v) = λu + (−(λv)) = λu − λv.
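A few of these identities can be spot-checked in a concrete space such as ℝ², with vectors as pairs of numbers; this is an illustration of the lemma, not a proof:

```python
# Checking some identities of Lemma 39.4 in R^2 (vectors as tuples).

def vadd(u, v):
    """Componentwise addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

def smul(lam, v):
    """Multiplication by the scalar lam."""
    return (lam * v[0], lam * v[1])

v = (3.0, -2.0)
zero = (0.0, 0.0)

assert smul(0.0, v) == zero            # (6): 0v = 0
assert smul(5.0, zero) == zero         # (5): lam*0 = 0
assert smul(-1.0, v) == (-3.0, 2.0)    # (7): (-1)v = -v
assert vadd(smul(-1.0, v), v) == zero  # (2): -v + v = 0
```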
39.5 Lemma: Let V be a vector space over a field K. Then, for all λ, λ1, λ2, . . . , λn in K and v, v1, v2, . . . , vn in V, there hold
λ(v1 + v2 + . . . + vn) = λv1 + λv2 + . . . + λvn
and (λ1 + λ2 + . . . + λn)v = λ1v + λ2v + . . . + λnv.
Proof: This follows by induction on n. The details are left to the reader.
Just as there may be different group structures on a set, there may also
be different vector space structures on a set. Here is an example.
(1) c ∘ [(a,b) + (d,e)] = c ∘ (a + d, b + e)
= (c̄(a + d), c̄(b + e))
= (c̄a + c̄d, c̄b + c̄e)
= (c̄a, c̄b) + (c̄d, c̄e)
= c ∘ (a,b) + c ∘ (d,e),
(2) (c + f) ∘ (a,b) = ((c̄ + f̄)a, (c̄ + f̄)b)
= (c̄a + f̄a, c̄b + f̄b)
= (c̄a, c̄b) + (f̄a, f̄b)
= c ∘ (a,b) + f ∘ (a,b),
(3) (cf) ∘ (a,b) = (c̄f̄a, c̄f̄b)
= c ∘ (f̄a, f̄b)
= c ∘ (f ∘ (a,b)),
(4) 1 ∘ (a,b) = (1̄a, 1̄b) = (1a, 1b) = (a,b)
for all (a,b),(d,e) ∈ V, c,f ∈ ℂ. Thus (V,+,ℂ,∘) is a vector space, with the same set V, the same addition + on V, and the same underlying field as the vector space (V,+,ℂ,·) of Example 39.2(b) (in the case K = ℂ, n = 2), but (V,+,ℂ,∘) is distinct from (V,+,ℂ,·) since the multiplications by scalars in these vector spaces are different.
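A second structure of this kind can be spot-checked numerically. Our reading of the computation above is that the twisted multiplication uses complex conjugation, c ∘ v = c̄v on ℂ²; under that assumption:

```python
# Two multiplications by scalars on V = C^2 over C: the usual one, and
# one twisted by complex conjugation.  That conjugation yields a second
# vector-space structure is our reading of the example; the axioms are
# spot-checked below rather than proved.

def std(c, v):                      # usual: c.(a,b) = (ca, cb)
    return (c * v[0], c * v[1])

def twisted(c, v):                  # c o (a,b) = (conj(c)a, conj(c)b)
    cc = c.conjugate()
    return (cc * v[0], cc * v[1])

v = (1 + 2j, -1j)
c, f = 2 - 1j, 3 + 4j

# axiom (3): (cf) o v = c o (f o v), since conj(cf) = conj(c)conj(f)
assert twisted(c * f, v) == twisted(c, twisted(f, v))
# axiom (4): 1 o v = v
assert twisted(1, v) == v
# the two structures really differ:
assert std(c, v) != twisted(c, v)
```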
Exercises
3. Determine whether ℤ7 ⊕ ℤ7 is a ℤ7-vector space when
(a,b) + (c,d) = (a + c, b + d),  a(c,d) = (ac, 0)  for all a,b,c,d ∈ ℤ7.
4. Determine whether the set S of all sequences of real numbers is a vector space over ℝ if addition and multiplication by scalars are defined by
(an) + (bn) = (an + bn),  a(bn) = (abn)
for all (an),(bn) ∈ S, a ∈ ℝ (here (an) is the sequence a1, a2, a3, . . . ).
§40
Subspaces
Here (1) embraces two conditions: (i) W is closed under addition, (ii´) for any w ∈ W, the opposite −w of w also belongs to W. Thus W is a subspace if and only if (i), (ii´) and (ii) hold. One checks easily that (ii) implies (ii´): if (ii) holds and w ∈ W, then (−1)w ∈ W, hence −w ∈ W by Lemma 39.4(7), so (ii´) holds. Thus (ii´) is superfluous. We proved the following lemma.
So a nonempty subset of a vector space V is a subspace of V if and only if it is closed under addition and multiplication by scalars. The two closure properties of Lemma 40.2 can be combined into a single one. When (i) and (ii) of Lemma 40.2 hold, then
λ1w1 + λ2w2 ∈ W for all λ1, λ2 ∈ K, w1, w2 ∈ W (*)
since λ1w1, λ2w2 ∈ W by (ii) and λ1w1 + λ2w2 ∈ W by (i). Conversely, if (*) holds, then, choosing λ1 = 1, λ2 = 1, we see that (i) holds and, choosing λ2 = 0, we see that (ii) holds. Thus (i) and (ii) together are equivalent to (*). We then obtain another version of Lemma 40.2.
The expression λ1w1 + λ2w2 is said to be a linear combination of the vectors w1, w2. More generally, we have the
40.6 Examples: (a) Let V be any vector space. Then {0} and V are
subspaces of V.
(d) Consider ℝ³ over ℝ and let
A = {(α, β, γ) ∈ ℝ³ : γ ≥ 0}.
If (α1, β1, γ1) and (α2, β2, γ2) are in A, then γ1 ≥ 0, γ2 ≥ 0, so γ1 + γ2 ≥ 0 and
(α1, β1, γ1) + (α2, β2, γ2) = (α1 + α2, β1 + β2, γ1 + γ2)
belongs to A. Thus A is closed under addition. However, A is not a subspace of ℝ³, since, for instance, (0,0,1) ∈ A but (−1)(0,0,1) ∉ A. This example shows that a subset of a vector space can be closed under addition without being closed under multiplication by scalars.
(f) Consider the vector space K² over a field K and let λ, μ be two arbitrary but fixed elements of K. Put
R = {(α, β) ∈ K² : λα + μβ = 0} ⊆ K². Then R is a subspace of K²:
(i) If (α1, β1), (α2, β2) ∈ R, then λα1 + μβ1 = 0 = λα2 + μβ2, so
(λα1 + μβ1) + (λα2 + μβ2) = 0, so λ(α1 + α2) + μ(β1 + β2) = 0, so (α1, β1) + (α2, β2) = (α1 + α2, β1 + β2) ∈ R.
(ii) If κ ∈ K and (α, β) ∈ R, then λα + μβ = 0, so λ(κα) + μ(κβ) = κ(λα + μβ) = 0, so
κ(α, β) = (κα, κβ) ∈ R.
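The closure steps (i) and (ii) for R can be mirrored with exact rational arithmetic; the sample scalars λ = 3, μ = −2 below are our own arbitrary choices:

```python
# Example (f) over the rationals: R = {(a,b) in Q^2 : la*a + mu*b = 0}
# for fixed scalars la, mu.  The checks mirror steps (i) and (ii).
from fractions import Fraction as Q

la, mu = Q(3), Q(-2)      # sample values, arbitrary

def in_R(v):
    return la * v[0] + mu * v[1] == 0

u = (Q(2), Q(3))          # 3*2 - 2*3 = 0
w = (Q(4), Q(6))          # 3*4 - 2*6 = 0
assert in_R(u) and in_R(w)

s = (u[0] + w[0], u[1] + w[1])       # closure under addition
assert in_R(s)
k = Q(-7, 5)
assert in_R((k * u[0], k * u[1]))    # closure under scalars
```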
(g) Let V be a vector space over a field K and let {Wi : i ∈ I} be a collection of subspaces of V. Then their intersection W := ⋂i∈I Wi is a subspace of V. First of all, this intersection is not empty, since 0 ∈ Wi for all i ∈ I. Also, if λ1, λ2 ∈ K and w1, w2 ∈ W, then w1, w2 ∈ Wi for all i ∈ I, so λ1w1 + λ2w2 ∈ Wi for all i ∈ I, so λ1w1 + λ2w2 ∈ W.
(k) Let p(x) and q(x) be continuous functions, defined on [0,1]. We write
L = {f ∈ C²([0,1]) : f´´(x) + p(x)f´(x) + q(x)f(x) = 0 for all x ∈ [0,1]}.
L is a nonempty subset of C²([0,1]). If λ, μ ∈ ℝ and f,g ∈ L, then
(λf + μg)´´(x) + p(x)(λf + μg)´(x) + q(x)(λf + μg)(x)
= λf´´(x) + μg´´(x) + p(x)λf´(x) + p(x)μg´(x) + q(x)λf(x) + q(x)μg(x)
= λ(f´´(x) + p(x)f´(x) + q(x)f(x)) + μ(g´´(x) + p(x)g´(x) + q(x)g(x))
= λ0 + μ0 = 0
for all x ∈ [0,1], so λf + μg ∈ L. Thus L is a subspace of C²([0,1]).
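A solution space like L can be probed numerically. As a sample equation we take f´´ − 7f´ + 12f = 0 (our own choice; an equation of this shape appears in the exercises of §42), whose solutions include e^{3x} and e^{4x}; the residual of a linear combination should vanish:

```python
# Numerical spot-check that solutions of a linear homogeneous ODE are
# closed under linear combinations.  For f = a3*e^{3x} + a4*e^{4x} the
# derivatives are known exactly, so the residual can be evaluated.
import math

def residual(a3, a4, x):
    """Return f''(x) - 7f'(x) + 12f(x) for f = a3*e^{3x} + a4*e^{4x}."""
    f   =      a3 * math.exp(3 * x) +      a4 * math.exp(4 * x)
    fp  =  3 * a3 * math.exp(3 * x) +  4 * a4 * math.exp(4 * x)
    fpp =  9 * a3 * math.exp(3 * x) + 16 * a4 * math.exp(4 * x)
    return fpp - 7 * fp + 12 * f

# e^{3x}: 9 - 21 + 12 = 0, and e^{4x}: 16 - 28 + 12 = 0, so the
# combination 5*e^{3x} - 2*e^{4x} should give residual 0 as well.
for x in (0.0, 0.5, 1.0):
    assert abs(residual(5.0, -2.0, x)) < 1e-9
```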
belongs to W. Hence W is a subspace of V (Lemma 40.3). [Notice that v1, v2, . . . , vn are not assumed to be distinct.]
Here α = λ1vi1 + λ2vi2 + . . . + λnvin and β = μ1vj1 + μ2vj2 + . . . + μmvjm for suitable n,m ∈ ℕ, and
α + β = (λ1vi1 + λ2vi2 + . . . + λnvin) + (μ1vj1 + μ2vj2 + . . . + μmvjm)
= λ1vi1 + λ2vi2 + . . . + λnvin + μ1vj1 + μ2vj2 + . . . + μmvjm
40.10 Lemma: Let V be a vector space over a field K and let A V.
Then sK (A) is the smallest subspace of V which contains A. More exactly,
if U is a subspace of V and A U, then sK (A) U.
40.11 Lemma: Let V be a vector space over a field K and let A,B be subsets of V such that A ⊆ s(B) and B ⊆ s(A). Then s(A) = s(B).
40.12 Examples: (a) Let V be a vector space over a field K and let A be a subset of V having only one element, say A = {v}. Then the span s(v) of A is the set
{λv ∈ V : λ ∈ K}.
(b) Let V be a vector space over a field K and let v, w be two vectors in V. The span s(v,w) of these vectors is
{λv + μw ∈ V : λ, μ ∈ K}.
In case w is a scalar multiple of v, say w = κv, we have
s(v,w) = {λv + μw ∈ V : λ, μ ∈ K} = {(λ + μκ)v : λ, μ ∈ K} = {λv : λ ∈ K} = s(v).
We see it is possible that A ≠ B and s(A) = s(B). In case K = ℝ and V = ℝ³ and w is not a scalar multiple of v, this span is usually identified with the plane through the origin determined by v and w.
(d) Let V be a vector space over a field K and let A,B be subsets of V with A ⊆ B. Then A ⊆ B ⊆ s(B) and, since s(B) is a subspace of V, Lemma 40.10 yields s(A) ⊆ s(B). So A ⊆ B implies s(A) ⊆ s(B). We have seen in Example 40.12(b) and Example 40.12(c) that A ≠ B does not necessarily imply s(A) ≠ s(B).
Exercises
4. Determine whether
{(α, β, γ) ∈ ℝ³ : 5α − 4β + 2γ = 0},
{(α, β, γ) ∈ ℝ³ : 5α − 4β + 2γ ≥ 0},
{(α, β, γ) ∈ ℤ11³ : 5α − 4β + 2γ = 0},
{(α, β, γ) ∈ ℝ³ : 5α − 4β + 2γ ≠ 0}
are subspaces of the vector spaces indicated.
5. Is (1,0,1) ∈ ℝ³ in the ℝ-span of {(5,4,1), (3,2,2)} ⊆ ℝ³?
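A span-membership question of the kind in Exercise 5 can be settled with exact arithmetic: solve the first two coordinate equations for the coefficients and test the third. The helper in_span is ours and assumes the 2×2 system built from the first two coordinates is non-degenerate:

```python
# Is target = a*u + b*v solvable in Q^3?  Solve a 2x2 system by
# Cramer's rule over exact rationals, then check the third coordinate.
from fractions import Fraction as Q

def in_span(target, u, v):
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("first two coordinates do not determine a, b")
    a = (target[0] * v[1] - target[1] * v[0]) / det
    b = (u[0] * target[1] - u[1] * target[0]) / det
    return a * u[2] + b * v[2] == target[2]

u = (Q(5), Q(4), Q(1))
v = (Q(3), Q(2), Q(2))
assert in_span((Q(8), Q(6), Q(3)), u, v)        # u + v itself
assert not in_span((Q(1), Q(0), Q(1)), u, v)    # inconsistent system
```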
§41
Factor Spaces
We prove that ∘ is well defined. To this end, we must show the implication
v + W = v´ + W  ⟹  λv + W = λv´ + W,
which amounts to
v − v´ ∈ W  ⟹  λv − λv´ ∈ W.
(1) λ ∘ ((u + W) + (v + W)) = λ ∘ ((u + v) + W)
= λ(u + v) + W
= (λu + λv) + W
= (λu + W) + (λv + W)
= λ ∘ (u + W) + λ ∘ (v + W),
(2) (λ + μ) ∘ (v + W) = (λ + μ)v + W
= (λv + μv) + W
= (λv + W) + (μv + W)
= λ ∘ (v + W) + μ ∘ (v + W),
(3) (λμ) ∘ (v + W) = (λμ)v + W
= λ(μv) + W
= λ ∘ (μv + W)
= λ ∘ (μ ∘ (v + W)),
(4) 1 ∘ (v + W) = 1v + W
= v + W.
41.3 Definition: Let V and U be vector spaces over the same field K. A mapping φ: V → U is called a vector space homomorphism, or a K-linear transformation, or a K-linear mapping if
(v1 + v2)φ = v1φ + v2φ  and  (λv)φ = λ(vφ)
for all v1, v2, v ∈ V, λ ∈ K. When there is no need to emphasize the field of scalars, we speak simply of linear transformations or linear mappings.
More exactly, when (V,+,K,·) and (U,⊕,K,∘) are vector spaces, the mapping φ: V → U is a vector space homomorphism provided
(v1 + v2)φ = v1φ ⊕ v2φ  and  (λv)φ = λ ∘ (vφ)
for all v1, v2, v ∈ V, λ ∈ K. Notice that the fields of scalars of both vector spaces are the same. A linear transformation from V into U cannot be defined if V and U are vector spaces over different fields.
Proof: If φ is a K-linear mapping and λ1, λ2 ∈ K, v1, v2 ∈ V, then
(λ1v1 + λ2v2)φ = (λ1v1)φ + (λ2v2)φ = λ1(v1φ) + λ2(v2φ)
since φ is additive and homogeneous. Conversely, if we have (λ1v1 + λ2v2)φ = λ1(v1φ) + λ2(v2φ) for all λ1, λ2 ∈ K, v1, v2 ∈ V, then, choosing λ1 = λ2 = 1, we see that φ is additive and, choosing λ2 = 0, we see that φ is homogeneous.
41.5 Lemma: Let V,U be vector spaces over a field K and let φ: V → U be a vector space homomorphism.
(1) 0φ = 0.
(2) (−v)φ = −(vφ) for all v ∈ V.
(3) (λ1v1 + λ2v2 + . . . + λnvn)φ = λ1(v1φ) + λ2(v2φ) + . . . + λn(vnφ) for all λ1, λ2, . . . , λn ∈ K and for all v1, v2, . . . , vn ∈ V.
(4) (nv)φ = n(vφ) for all n ∈ ℤ.
(c) The mapping φ: C¹([0,1]) → ℝ given by f ↦ f(1/2) is ℝ-linear, because
(λf + μg)φ = (λf + μg)(1/2) = (λf)(1/2) + (μg)(1/2) = λ(f(1/2)) + μ(g(1/2)) = λ(fφ) + μ(gφ)
for all λ, μ ∈ ℝ, f,g ∈ C¹([0,1]). Likewise, for any κ ∈ [0,1], the mapping
φκ: C¹([0,1]) → ℝ, f ↦ f(κ),
is a vector space homomorphism.
(d) Let V,U be vector spaces over a field K and let W be a subspace of V. If φ: V → U is a vector space homomorphism, then its restriction
φ|W : W → U
to W is also a vector space homomorphism, because
(λ1w1 + λ2w2)φ = λ1(w1φ) + λ2(w2φ)
for all λ1, λ2 ∈ K, w1, w2 ∈ W, as this holds in fact for all λ1, λ2 ∈ K, v1, v2 ∈ V (Lemma 41.4).
(e) Let V,U be vector spaces over a field K and let K1 be a field contained in K. Then V,U are vector spaces over K1, too (Example 39.2(f)). If φ: V → U is a K-linear mapping, then φ is also a K1-linear mapping, because
(λ1v1 + λ2v2)φ = λ1(v1φ) + λ2(v2φ)
for all λ1, λ2 ∈ K1, v1, v2 ∈ V, as this holds in fact for all λ1, λ2 ∈ K, v1, v2 ∈ V (Lemma 41.4).
41.7 Theorem: Let V,U,W be vector spaces over a field K. Let φ: V → U and ψ: U → W be vector space homomorphisms. Then the composition mapping φψ: V → W is a vector space homomorphism from V into W.
41.8 Theorem: Let V,U be vector spaces over a field K and let φ: V → U be K-linear. Then Im φ = {vφ ∈ U : v ∈ V} is a subspace of U and Ker φ = {v ∈ V : vφ = 0} is a subspace of V.
41.10 Lemma: Let V,U,W be vector spaces over a field K and let φ: V → U and ψ: U → W be vector space isomorphisms.
(1) The composition φψ: V → W is a vector space isomorphism from V onto W.
(2) The inverse φ⁻¹: U → V of φ is a vector space isomorphism from U onto V.
Proof: ν is an additive group homomorphism from V onto V/W such that Ker ν = W (Theorem 20.12). Since (λv)ν = λv + W = λ(v + W) = λ(vν) for all λ ∈ K, v ∈ V, we see that ν is a vector space homomorphism.
41.13 Theorem: Let V,U be vector spaces over a field K and let φ: V → U be a vector space homomorphism. Then
V/Ker φ ≅ Im φ (as vector spaces).
41.14 Theorem: Let V,V1 be vector spaces over a field K and let φ: V → V1 be a vector space homomorphism from V onto V1.
(1) Each subspace W of V with Ker φ ⊆ W is mapped to a subspace of V1, which will be denoted by W1.
(2) If W,U are subspaces of V with Ker φ ⊆ W ⊆ U, then W1 ⊆ U1.
(3) If W,U are subspaces of V with Ker φ ⊆ W and Ker φ ⊆ U, and if W1 ⊆ U1, then W ⊆ U.
(4) If W,U are subspaces of V with Ker φ ⊆ W and Ker φ ⊆ U, and if W1 = U1, then W = U.
(5) If S is any subspace of V1, then there is a subspace W of V such that Ker φ ⊆ W and W1 = S.
(6) If U is a subspace of V with Ker φ ⊆ U, then V/U ≅ V1/U1.
Proof: The natural homomorphism ν: V → V/W is onto by Theorem 41.11. We may therefore apply Theorem 41.14. This theorem states that any subspace of V/W is of the form Uν for some subspace U of V with Ker ν ⊆ U. Now
Uν = {uν ∈ V/W : u ∈ U} = {u + W ∈ V/W : u ∈ U} = U/W
and Ker ν = W by Theorem 41.11. Thus the subspaces of V/W are given by U/W, where the U's are subspaces of V containing W. By Theorem 41.14(2),(3),(4), U1/W ⊆ U2/W if and only if U1 ⊆ U2, and U1/W ≠ U2/W whenever U1 ≠ U2. Finally, by Theorem 41.14(6),
V/U ≅ Vν/Uν = (V/W)/(U/W) as vector spaces.
41.16 Theorem: Let V be a vector space over a field K and let U,W be subspaces of V. Then U ∩ W and U + W are subspaces of V and
W/(U ∩ W) ≅ (U + W)/U (vector space isomorphism).
Exercises
1. Let V be a vector space over a field K and let W be a subgroup of the additive group (V,+). For all λ in K and for all v + W in the factor group V/W, we write λ ∘ (v + W) = λv + W. Prove that (λ, v + W) ↦ λ ∘ (v + W) is
§42
Dependence and Bases
λ1v1 + λ2v2 + . . . + λnvn = 0
(here 0 is the zero vector). If v1, v2, . . . , vn are not linearly dependent over K, then v1, v2, . . . , vn are said to be linearly independent over K.
In place of the phrase "linearly (in)dependent over K", we shall also use
the expression "K-linearly (in)dependent". When the field of scalars is
clear from the context, we drop the phrase "over K" or the prefix "K-".
for all λ1, λ2, . . . , λn ∈ K,  λ1v1 + λ2v2 + . . . + λnvn = 0  implies
λ1 = λ2 = . . . = λn = 0.
That is to say, v1, v2, . . . , vn are K-linearly independent if the zero vector can be written as a linear combination of v1, v2, . . . , vn only in the trivial way, where all the scalars are zero.
42.2 Examples: (a) Let V be a vector space over a field K and let v be a nonzero vector in V. Then λv = 0 implies λ = 0 (Lemma 39.4(10)). Hence v (and {v}) is linearly independent over K. On the other hand, {0} is linearly dependent over K because 1·0 = 0 and 1 ≠ 0.
λ1(1,0, . . . ,0) + λ2(0,1, . . . ,0) + . . . + λn(0,0, . . . ,1) = (0,0, . . . ,0)
(λ1,0, . . . ,0) + (0,λ2, . . . ,0) + . . . + (0,0, . . . ,λn) = (0,0, . . . ,0)
(λ1, λ2, . . . , λn) = (0,0, . . . ,0)
λ1 = λ2 = . . . = λn = 0.
The reader is probably acquainted with the vectors e1, e2, e3 in the vector space ℝ³ over ℝ under the names i, j, k.
(d) The vectors (1,0) and (−1,0) in the ℝ-vector space ℝ² are linearly dependent over ℝ because 1 ≠ 0 in ℝ and 1(1,0) + 1(−1,0) = (0,0) = zero vector in ℝ².
λ1v1 + λ2v2 + . . . + λmvm = 0;
hence, when we put (in case m < n) λm+1 = . . . = λn = 0, we obtain
λ1v1 + λ2v2 + . . . + λmvm + λm+1vm+1 + . . . + λnvn = 0,
where not all of λ1, λ2, . . . , λm, λm+1, . . . , λn are equal to zero, contradicting the assumption that v1, v2, . . . , vn are linearly independent over K. Thus
any nonempty subset of a linearly independent finite set of vectors is
linearly independent. But this statement is true also for infinite linearly
independent sets. Indeed, let A be an infinite linearly independent
subset of V and let B be a nonempty subset of A. If B is finite, then B is
linearly independent by definition. If B is infinite, then any finite subset
of B, being a finite subset of A, is linearly independent over K and hence
B itself is linearly independent over K. Thus we have shown that every
nonempty subset of a linearly independent set of vectors is linearly
independent. Equivalently, any set of vectors containing a linearly
dependent subset is linearly dependent.
(g) Let V be the vector space ℂ² over ℂ. The vectors (1,0), (−i,0) in V are linearly dependent over ℂ, because
i(1,0) + 1(−i,0) = (0,0) = zero vector in V.
However, when V is regarded as an ℝ-vector space, these two vectors are not linearly dependent: if λ, μ ∈ ℝ and λ(1,0) + μ(−i,0) = (0,0), then (λ − μi, 0) = (0,0), hence the complex number λ − μi is equal to 0, so λ = μ = 0. Thus (1,0), (−i,0) are linearly dependent over ℂ, but linearly independent over ℝ. This example shows that the field of scalars must be specified (unless it is clear from the context) whenever one discusses linear (in)dependence of vectors.
λ1v1 + λ2v2 + λ3v3 + . . . = 0 (with infinitely many summands).
This equation is meaningless, for its left hand side is not defined. What is defined (Definition 8.4) is a sum λ1v1 + λ2v2 + . . . + λnvn of a finite number n of vectors in V. The definition of an infinite sum would involve some
(i) Consider the vector space C¹([0,1]) over ℝ (Example 40.6(j)). The functions f: [0,1] → ℝ and g: [0,1] → ℝ, where f(x) = ex and g(x) = e2x for all x in [0,1], are vectors in C¹([0,1]). We claim that f and g are linearly independent over ℝ. To prove this, let us assume λ, μ ∈ ℝ and λf + μg = zero vector in C¹([0,1]). The zero vector in C¹([0,1]) is the function z: [0,1] → ℝ such that z(x) = 0 for all x in [0,1]. Hence
(λf + μg)(x) = 0 for all x ∈ [0,1],
λf(x) + μg(x) = 0 for all x ∈ [0,1],
λex + μe2x = 0 for all x ∈ [0,1].
Differentiating, we obtain
λex + 2μe2x = 0 for all x ∈ [0,1].
We have thus −λex = μe2x and −λex = 2μe2x for all x ∈ [0,1], hence μe2x = 2μe2x, so μ = 0, so λ = 0. Therefore f and g are linearly independent over ℝ.
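The independence of e^x and e^{2x} can also be seen by sampling: if λe^x + μe^{2x} vanishes at just the two points x = 0 and x = 1, the resulting 2×2 homogeneous system has nonzero determinant, so λ = μ = 0. Sampling is our shortcut; the text differentiates instead:

```python
# If l*e^x + m*e^{2x} = 0 at x = 0 and x = 1, then (l, m) solves a
# homogeneous 2x2 system whose coefficient matrix is [[1, 1], [e, e^2]].
# A nonzero determinant forces l = m = 0.
import math

e = math.e
det = 1 * (e * e) - 1 * e      # det [[1, 1], [e, e^2]] = e^2 - e
assert det != 0                # so l = m = 0 is the only solution

# sanity check: a nontrivial combination is NOT the zero function
l, m = 1.0, -1.0
assert abs(l * math.exp(1) + m * math.exp(2)) > 1.0
```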
Proof: We first assume that v1, v2, . . . , vn are linearly dependent over K. Then there are scalars λ1, λ2, . . . , λn, not all of them zero, such that
λ1v1 + λ2v2 + . . . + λnvn = 0.
To fix the ideas, let us suppose λ1 ≠ 0. Then λ1 has an inverse λ1⁻¹ in K and we obtain
λ1⁻¹(λ1v1 + λ2v2 + . . . + λnvn) = λ1⁻¹0 = 0,
v1 + λ1⁻¹λ2v2 + . . . + λ1⁻¹λnvn = 0,
v1 = μ2v2 + . . . + μnvn,
where we put μj = −λ1⁻¹λj ∈ K (j = 2, . . . ,n). So v1 is a K-linear combination of the vectors v2, . . . , vn.
Suppose now |A0| ≠ 1, or |A0| = 1 but A0 ≠ {0}. Then we may and do join nonzero vectors v1, v2, . . . , vn to A0 without disturbing the linear dependence and finiteness of A0. One of the vectors v0, v1, v2, . . . , vn, which we may assume to be v0 without loss of generality, is a K-linear combination of the others (Lemma 42.3). So there are scalars λ1, λ2, . . . , λn such that
v0 = λ1v1 + λ2v2 + . . . + λnvn.
In both cases, v0 ∈ sK(B). Thus sK(A) ⊆ sK(B) and sK(A) = sK(B), as was to be proved.
Since v ∈ A ⊆ sK(A) = sK(B), we conclude that there are vectors w1, w2, . . . , wm in B and scalars λ1, λ2, . . . , λm in K with
v = λ1w1 + λ2w2 + . . . + λmwm.
So the vector v in A is a K-linear combination of the vectors w1, w2, . . . , wm and the subset {v, w1, w2, . . . , wm} of A ∪ B is K-linearly dependent by Lemma 42.3. From Example 42.2(e), it follows that A ∪ B is K-linearly dependent.
42.7 Examples: (a) Consider the vector space Kn over a field K. The vectors e1 = (1,0, . . . ,0), e2 = (0,1, . . . ,0), . . . , en = (0,0, . . . ,1) are linearly independent over K (Example 42.2(c)). Moreover, {e1, e2, . . . , en} spans Kn over K because any vector (λ1, λ2, . . . , λn) in Kn is a K-linear combination
λ1e1 + λ2e2 + . . . + λnen
of the vectors e1, e2, . . . , en. Hence {e1, e2, . . . , en} is a basis of Kn over K.
42.8 Theorem: Let V be a vector space over a field K and let B = {v1, v2, . . . , vn} be a nonempty subset of V. Then B is a K-basis of V if and only if every element of V can be written in the form
λ1v1 + λ2v2 + . . . + λnvn  (λ1, λ2, . . . , λn ∈ K)
in one and only one way.
Hence λ1 − μ1 = λ2 − μ2 = . . . = λn − μn = 0. This proves uniqueness.
We prove next that any finitely spanned vector space has a basis.
Proof: If V happens to be the vector space {0}, then V has a K-basis, namely the empty set (Definition 42.6), and ∅ ⊆ T. Having disposed of this degenerate case, let us assume V ≠ {0}. Now sK(T) = V. Since V ≠ {0}, we have T ≠ ∅. If T is linearly independent over K, then T is a K-basis of V. Otherwise, there is a proper subset T1 of T with sK(T1) = sK(T) = V (Theorem 42.4). Here T1 ≠ ∅, because sK(T1) = V ≠ {0}. If T1 is linearly independent over K, then T1 is a K-basis of V. Otherwise, there is a proper subset T2 of T1 with sK(T2) = sK(T1) = V. Here T2 ≠ ∅, because sK(T2) = V ≠ {0}. If T2 is linearly independent over K, then T2 is a K-basis of V. Otherwise, there is a proper subset T3 of T2 with sK(T3) = sK(T2) = V. Here T3 ≠ ∅, because sK(T3) = V ≠ {0}. We continue in this way. Each time, we get a nonempty subset Ti+1 of Ti such that sK(Ti+1) = V and Ti+1 has fewer elements than Ti. Since T is a finite set, this process cannot go on indefinitely. Sooner or later, we will meet a K-linearly independent subset Tm of T with sK(Tm) = V. This Tm is therefore a K-basis of V, and of course Tm is finite.
sK(u1, . . . , uh, vh+1, . . . , vm) = sK(v1, . . . , vh, vh+1, . . . , vm)".
This will establish A1, A2, . . . , An−1, An. The second claim An in the theorem will be proved in this way.
Since u1 ∈ sK(v1, v2, . . . , vm),
{u1, v2, . . . , vm} ⊆ sK(v1, v2, . . . , vm). (i)
Since v1 ∈ sK(u1, v2, . . . , vm),
we also have {v1, v2, . . . , vm} ⊆ sK(u1, v2, . . . , vm). (ii)
Using (i) and (ii) and applying Lemma 40.11 (with A = {u1, v2, . . . , vm} and B = {v1, v2, . . . , vm}), we obtain
sK(u1, v2, . . . , vm) = sK(v1, v2, . . . , vm).
This proves A1.
Otherwise uh would be a K-linear combination λ1u1 + . . . + λh−1uh−1 of the vectors u1, . . . , uh−1 and the vectors u1, . . . , uh−1, uh would not be linearly independent over K (Lemma 42.3), so u1, u2, . . . , un would not be linearly independent over K (Example 42.2(e)), contrary to the hypothesis. So one of μh, . . . , μm is distinct from 0. Renaming vh, . . . , vm if necessary, we may suppose μh ≠ 0. Then μh has an inverse μh⁻¹ in K and we get
μhvh = uh − λ1u1 − . . . − λh−1uh−1 − μh+1vh+1 − . . . − μmvm,
vh = μh⁻¹(uh − λ1u1 − . . . − λh−1uh−1 − μh+1vh+1 − . . . − μmvm),
vh ∈ sK(u1, . . . , uh−1, uh, vh+1, . . . , vm).
Now each one of the vectors v1, . . . , vh−1, being an element of the span
sK(v1, v2, . . . , vm) = sK(u1, . . . , uh−1, vh, . . . , vm),
can be written in the form
κ1u1 + . . . + κh−1uh−1 + κhvh + κh+1vh+1 + . . . + κmvm
with scalars κ1, . . . , κh−1, κh, κh+1, . . . , κm ∈ K. Thus each one of v1, . . . , vh−1 can be written as
κ1u1 + . . . + κh−1uh−1 + κh(μh⁻¹(uh − λ1u1 − . . . − λh−1uh−1 − μh+1vh+1 − . . . − μmvm)) + κh+1vh+1 + . . . + κmvm,
and so {v1, . . . , vh−1} ⊆ sK(u1, . . . , uh−1, uh, vh+1, . . . , vm).
Therefore {v1, . . . , vh−1, vh, vh+1, . . . , vm} ⊆ sK(u1, . . . , uh−1, uh, vh+1, . . . , vm). (i´)
Since u1, . . . , uh−1, uh ∈ sK(v1, . . . , vh−1, vh, vh+1, . . . , vm),
we also have
{u1, . . . , uh−1, uh, vh+1, . . . , vm} ⊆ sK(v1, . . . , vh−1, vh, vh+1, . . . , vm). (ii´)
Thus Ah is true.
42.11 Theorem: Let V be a vector space over a field K and assume that
V has a finite K-basis. Then any two K-bases of V have the same number
of elements.
If n = 0, then B = ∅ and V = {0}. Thus ∅ and {0} are the only subsets of V, and B = ∅ is the only K-basis of V. Then any K-basis of V has exactly 0 elements.
Thus dimK Kn = n (Example 42.7(a)) and dimℝ V = 2, where V is the ℝ-vector space of Example 42.7(b).
Any finite set spanning a vector space can be stripped down to a basis of that vector space (Theorem 42.9). Similarly, any linearly independent subset of a vector space can be extended to a basis, as we show now.
Proof: Let {w1, w2, . . . , wm} be a K-basis of V. Then u1, u2, . . . , un are linearly independent vectors in V = sK(w1, w2, . . . , wm), and Steinitz' replacement theorem gives
sK(u1, u2, . . . , un, wn+1, . . . , wm) = V and n ≤ m
on indexing the w's suitably. Then the m = dimK V vectors
u1, u2, . . . , un, wn+1, . . . , wm
are linearly independent over K by Lemma 42.13(2). Hence
B = {u1, u2, . . . , un, wn+1, . . . , wm}
is a K-basis of V containing the vectors u1, u2, . . . , un.
This gives W ⊆ sK(w1, w2, . . . , wm). But w1, w2, . . . , wm belong to W, so sK(w1, w2, . . . , wm) ⊆ W (Lemma 40.5). The vectors w1, w2, . . . , wm therefore span W over K, so {w1, w2, . . . , wm} is a K-basis of W. Thus W is finite dimensional and in fact dimK W = m ≤ n = dimK V.
42.16 Lemma: Let V,U be vector spaces over a field K. Suppose V is finite dimensional and let φ: V → U be a vector space homomorphism. Let v1, v2, . . . , vn be vectors in V.
(1) If φ is one-to-one and {v1, v2, . . . , vn} is linearly independent over K, then {v1φ, v2φ, . . . , vnφ} is linearly independent over K.
(2) If φ is onto U and {v1, v2, . . . , vn} spans V over K, then {v1φ, v2φ, . . . , vnφ} spans U over K.
(3) If φ is a vector space isomorphism and {v1, v2, . . . , vn} is a K-basis of V, then {v1φ, v2φ, . . . , vnφ} is a K-basis of U. In particular, dimK U = dimK V.
as was to be proved.
From Lemma 42.16, it follows that dimK U = n whenever U ≅ Kn. The converse of this statement is also true.
Furthermore, v = λ1v1 + λ2v2 + . . . + λnvn ∈ V belongs to Ker φ if and only if (λ1, λ2, . . . , λn) = (0,0, . . . ,0), thus if and only if v = 0v1 + 0v2 + . . . + 0vn = 0. So Ker φ = {0} and φ is one-to-one.
42.18 Theorem: Let V and U be finite dimensional vector spaces over a field K. Then V ≅ U if and only if dimK V = dimK U.
λ1 = λ2 = . . . = λk = 0. Thus v1 + W, v2 + W, . . . , vk + W in V/W are linearly independent over K.
Secondly, these vectors span V/W. To see this, let us take an arbitrary vector v + W in V/W, where v ∈ V. Then
v = λ1v1 + λ2v2 + . . . + λkvk + μ1w1 + μ2w2 + . . . + μmwm,
where λ1, λ2, . . . , λk, μ1, μ2, . . . , μm are scalars, and thus
v + W = (λ1v1 + λ2v2 + . . . + λkvk + μ1w1 + μ2w2 + . . . + μmwm) + W
= λ1(v1 + W) + λ2(v2 + W) + . . . + λk(vk + W) + μ1(w1 + W) + μ2(w2 + W) + . . . + μm(wm + W)
= λ1(v1 + W) + λ2(v2 + W) + . . . + λk(vk + W)
∈ sK(v1 + W, v2 + W, . . . , vk + W) ⊆ V/W,
hence V/W = sK(v1 + W, v2 + W, . . . , vk + W). This proves that {v1 + W, v2 + W, . . . , vk + W} is a basis of V/W over K. As we remarked above, this gives dimK V = dimK W + dimK V/W.
Proof: Theorem 41.13 tells us V/Ker φ ≅ Im φ and Theorem 42.19 gives
dimK V − dimK Ker φ = dimK Im φ.
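This dimension formula can be checked on a small matrix over ℚ; the elimination code below is a minimal sketch of our own, adequate for small exact matrices:

```python
# Rank-nullity spot-check: for the linear map given by a matrix A over
# Q (rows = images of the basis vectors), dim V = dim Ker + dim Im.
from fractions import Fraction as Q

def rank(rows):
    """Row-reduce an exact matrix and count the nonzero pivot rows."""
    rows = [list(r) for r in rows]
    rk, col, ncols = 0, 0, len(rows[0])
    while rk < len(rows) and col < ncols:
        piv = next((i for i in range(rk, len(rows)) if rows[i][col]), None)
        if piv is None:
            col += 1
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][col] / rows[rk][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk, col = rk + 1, col + 1
    return rk

A = [[Q(1), Q(2), Q(3)],
     [Q(2), Q(4), Q(6)],      # a multiple of the first row
     [Q(0), Q(1), Q(1)]]
r = rank(A)
nullity = len(A) - r          # dim of the domain Q^3 minus the rank
assert r == 2 and r + nullity == 3
```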
42.22 Theorem: Let V,U be vector spaces over a field K and let φ: V → U be a K-linear mapping. Suppose that V and U have the same finite dimension. Then the following statements are equivalent.
(1) φ is one-to-one.
(2) φ is onto.
(3) φ is a vector space isomorphism.
Hence any one of (1),(2) implies the other, and these together imply (3). Conversely, if φ is an isomorphism, then of course φ is one-to-one and onto. Thus (3) implies both (1) and (2).
It is in fact true that every vector space has a basis, and a proof is given
in the appendix. The proof of this statement for infinite dimensional
vector spaces requires a fundamental tool known as Zorn's lemma. This
lemma can be used in a variety of situations to establish the existence of
certain objects.
The existence of bases having been assured by Zorn's lemma, we might
ask whether any two bases have the same cardinality. The answer
turned out to be "yes" in the finite dimensional case (Theorem 42.11),
and this was proved by using Steinitz' replacement theorem. The proof
of Steinitz' replacement theorem does not extend to the infinite dimen-
sional case. Nevertheless, theorems of set theory can be employed to
show that any two bases of a vector space have the same cardinal number. This renders it possible to define the dimension of a vector space as the cardinality of a basis. Hence it is possible to distinguish between various types of infinities. This is much finer than Definition 42.12, by which infinite dimensionality is merely a crude negation of finite dimensionality.
Theorem 42.14, which states that any linearly independent subset can
be extended to a basis, is true in the infinite dimensional case, too. The
proof makes use of Zorn's lemma.
Lemma 42.16 and its proof work in the infinite dimensional case. Lemma 42.19 and its proof work in the infinite dimensional case as well, provided we refer to the generalization of Theorem 42.14 at the appropriate place.
Exercises
W ∩ U = {0}. (U is called a direct complement of W in V. We then write V = W ⊕ U and call V the direct sum of W and U.)
2. Is {(1,1,1), (1,1,0), (1,0,0)} an ℝ-basis of ℝ³?
3. Is {(1,2,6), (0,0,1), (2,1,0)} a ℤ3-basis of ℤ3³?
4. Find an ℝ-basis of
{f ∈ C²([0,1]) : f´´(x) − 7f´(x) + 12f(x) = 0 for all x ∈ [0,1]}.
5. Find all ℝ-linear mappings from ℝ⁴ onto ℝ⁵.
6. Find all ℤ2-bases of ℤ2² and ℤ3-bases of ℤ3³.
7. Show that the vectors (1,2,1), (0,2,0), (1,2,−1) and also the vectors (1,1,0), (1,0,1), (1,1,1) in ℝ³ are linearly independent over ℝ.
8. Let fk(x) = sin kx for x ∈ [0,1] (k = 1,2,3, . . . ). Prove that the functions f1, f2, f3, . . . in C∞([0,1]) are linearly independent over ℝ.
§43
Linear Transformations and Matrices
Let T,S ∈ LK(V,W). How shall we define T + S? Well, the only natural way to define T + S is to put v(T + S) = vT + vS for all v ∈ V (pointwise addition). What about multiplication by scalars? Given λ ∈ K and T ∈ L(V,W), the mapping λT had better mean: first multiply by λ, then apply T, so that v(λT) := (λv)T (or: first apply T, then multiply by λ, so that v(λT) := λ(vT); but this is the same definition as before).
43.1 Theorem: Let V,W be vector spaces over a field K and let LK(V,W) be the set of all K-linear transformations from V into W. For any T,S in LK(V,W) and for any λ in K, we write
v(T + S) = vT + vS,  v(λT) = (λv)T  (v ∈ V).
= (λ1(v1T) + λ2(v2T)) + (λ1(v1S) + λ2(v2S))
= λ1(v1T + v1S) + λ2(v2T + v2S)
= λ1(v1(T + S)) + λ2(v2(T + S))
for all λ1, λ2 ∈ K and v1, v2 ∈ V. Thus T + S is K-linear and T + S ∈ LK(V,W). Therefore LK(V,W) is closed under addition.
Now the properties of multiplication by scalars. First we note that λT is in L(V,W) whenever λ ∈ K and T ∈ LK(V,W), because
(v1 + v2)(λT) = (λ(v1 + v2))T = (λv1 + λv2)T = (λv1)T + (λv2)T = v1(λT) + v2(λT)
for all v1, v2 ∈ V, so that λT is additive, and
(μv)(λT) = (λ(μv))T = ((λμ)v)T = ((μλ)v)T = (μ(λv))T = μ((λv)T) = μ(v(λT))
for all μ ∈ K, v ∈ V, so that λT is homogeneous. So λT belongs to LK(V,W).
v1T = α11w1 + α12w2 + . . . + α1mwm
v2T = α21w1 + α22w2 + . . . + α2mwm  (*)
. . . . . . . . . . . . . . . . . . . . . . . .
vnT = αn1w1 + αn2w2 + . . . + αnmwm
where α11, α12, . . . , αnm are scalars in K. The arrangement of the scalars in (*) deserves a name.
43.2 Definition: Let K be a field and n,m ∈ ℕ. An n by m matrix over K is an array
α11 α12 . . . α1m
α21 α22 . . . α2m
. . . . . . . . . . . .
αn1 αn2 . . . αnm  (1)
Sometimes we write "n × m" instead of "n by m". The horizontal lines
αi1 αi2 . . . αim  (2)
of a matrix over K are called the rows of that matrix. More specifically, (2) is the i-th row of the matrix (1). The vertical lines
α1j
α2j
...
αnj  (3)
are called the columns of that matrix; (3) is the j-th column of the matrix (1).
Two matrices (αij) ∈ Matn×m(K) and (βij) ∈ Matn´×m´(K) are declared to be equal if n = n´, m = m´ and αij = βij for all i = 1,2, . . . ,n; j = 1,2, . . . ,m. Thus two matrices are equal if and only if they have the same number of rows and columns, and have the same entries at corresponding places. We write then (αij) = (βij). Otherwise, we put (αij) ≠ (βij).
43.3 Definition: Let K be a field, λ ∈ K and let A,B ∈ Matn×m(K), say A = (αij), B = (βij). We write
A + B = C, C being the matrix (γij) in Matn×m(K), where γij = αij + βij,
and λA = E, E being the matrix (εij) in Matn×m(K), where εij = λαij.
In other words, (αij) + (βij) = (αij + βij) and λ(αij) = (λαij).
Proof: First we check that Matn×m(K) is an abelian group under addition.
A + B = (αij) + (βij) = (αij + βij) = (βij + αij) = (βij) + (αij) = B + A
and addition on Matn×m(K) is commutative.
This proves that Matn×m(K) is an abelian group under addition. Now the properties of multiplication by scalars. For any λ, μ ∈ K and A = (αij), B = (βij) ∈ Matn×m(K), we have
(4) 1A = 1(αij) = (1αij) = (αij) = A.
43.5 Lemma: Let Eij be the matrix in Matn×m(K) all of whose entries are 0, except for the single entry in the i-th row, j-th column, which entry is the identity element of K. Then the nm matrices Eij (where i = 1,2, . . . ,n; j = 1,2, . . . ,m) form a K-basis of Matn×m(K). In particular, dimK(Matn×m(K)) is equal to nm.
Proof: The matrices Eij span Matn×m(K) over K because any A = (αij) in Matn×m(K) can be written as a K-linear combination
A = (αij) = ∑i,j αijEij
of the matrices Eij. They are also linearly independent over K: if
∑i,j λijEij = 0,
then (λij) = 0 and λij = 0 for all i,j. Therefore {Eij : i = 1,2, . . . ,n; j = 1,2, . . . ,m} is a K-basis of Matn×m(K). In particular, dimK(Matn×m(K)) = nm.
viT = ∑j=1..m αijwj  (i = 1,2, . . . ,n),  (*)
where αij ∈ K.
The n × m matrix (αij) over K will be called the matrix associated with T (relative to the bases B and B´), and will be written M_{B,B´}(T).
In the following discussion, the bases will be fixed and we simply write M(T) instead of M_{B,B´}(T). The role of the bases will be discussed at the end of this paragraph.
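Relative to standard bases, the rows of M(T) are simply the coordinate tuples viT, as in (*). A sketch with a sample map T of our own choosing:

```python
# Building the matrix (a_ij) of (*) for T: Q^2 -> Q^3 with B, B' the
# standard bases: row i of the matrix lists the coordinates of v_i T.
from fractions import Fraction as Q

def T(v):                           # sample map: T(x, y) = (x + y, 2x, 3y)
    x, y = v
    return (x + y, 2 * x, 3 * y)

basis_V = [(Q(1), Q(0)), (Q(0), Q(1))]
M = [list(T(v)) for v in basis_V]   # row i = coordinates of v_i T
assert M == [[1, 2, 0], [1, 0, 3]]

# applying T via the matrix: (x, y)T = x*(row 1) + y*(row 2)
x, y = Q(4), Q(-1)
via_M = [x * M[0][j] + y * M[1][j] for j in range(3)]
assert tuple(via_M) == T((x, y))
```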
viT = ∑j=1..m αijwj  and  viS = ∑j=1..m βijwj.
Then vi(T + S) = viT + viS = ∑j=1..m αijwj + ∑j=1..m βijwj = ∑j=1..m (αij + βij)wj
and vi(λT) = (λvi)T = λ(viT) = λ∑j=1..m αijwj = ∑j=1..m (λαij)wj
Proof: We know that M: LK(V,W) → Matn×m(K) (in the notation of Theorem 43.7) is a vector space homomorphism. We will prove that M is in fact an isomorphism.
From viT = ∑j=1..m αijwj
we get (∑i=1..n λivi)T = ∑i=1..n λi(viT) = ∑i=1..n λi(∑j=1..m αijwj) = ∑j=1..m (∑i=1..n λiαij)wj.
With the hindsight gained from the chain of equations above, given any (αij) in Matn×m(K), we define a function T: V → W by
(∑i=1..n λivi)T = ∑j=1..m (∑i=1..n λiαij)wj.
Then, for any λ, μ ∈ K and v = ∑i=1..n λivi, v´ = ∑i=1..n μivi ∈ V, we have
(λv + μv´)T = (λ∑i=1..n λivi + μ∑i=1..n μivi)T = (∑i=1..n (λλi + μμi)vi)T
= ∑j=1..m (∑i=1..n (λλi + μμi)αij)wj = ∑j=1..m (∑i=1..n λλiαij + ∑i=1..n μμiαij)wj
= λ∑j=1..m (∑i=1..n λiαij)wj + μ∑j=1..m (∑i=1..n μiαij)wj = λ(vT) + μ(v´T),
so T is K-linear. Choosing λi0 = 1 and λi = 0 for i ≠ i0, we obtain
vi0T = ∑j=1..m αi0jwj  (i0 = 1,2, . . . ,n),
so M(T) = (αij). Thus every (αij) ∈ Matn×m(K) is the image, under M, of at least one T ∈ LK(V,W) and so M is onto. Consequently M is a vector space isomorphism: LK(V,W) ≅ Matn×m(K). From Theorem 42.18 and Lemma 43.5, we get dimK LK(V,W) = dimK Matn×m(K) = nm.
Now let U be a vector space over the field K, with dim_K U = k, and let B′′ = {u_1, u_2, . . . , u_k} be a basis of U over K. If T: V → W and S: W → U are K-linear transformations, whose associated matrices [relative to the K-bases B = {v_1, v_2, . . . , v_n}, B′ = {w_1, w_2, . . . , w_m} of V and W, and relative to the K-bases B′, B′′ of W and U] are (α_ij) ∈ Mat_{n×m}(K) and (β_jl) ∈ Mat_{m×k}(K), so that

v_iT = Σ_{j=1}^m α_ij w_j,    w_jS = Σ_{l=1}^k β_jl u_l,

then

v_i(TS) = (v_iT)S = (Σ_{j=1}^m α_ij w_j)S = Σ_{j=1}^m α_ij (w_jS)
        = Σ_{j=1}^m α_ij Σ_{l=1}^k β_jl u_l = Σ_{l=1}^k (Σ_{j=1}^m α_ij β_jl) u_l,

so that the matrix associated with TS [relative to the K-bases B, B′′] is the n×k matrix whose i-th row, l-th column entry is Σ_{j=1}^m α_ij β_jl. This leads us to
43.9 Definition: Let A = (α_ij) be an n×m matrix and let B = (β_jl) be an m×k matrix, with entries from a field K. Then the product of A and B, denoted by AB, is the n×k matrix (γ_il) over K, where γ_il = Σ_{j=1}^m α_ij β_jl. Stated otherwise,

(α_ij)(β_jl) := (Σ_{j=1}^m α_ij β_jl).

Writing

AB = E = (ε_il),  where ε_il = Σ_{j=1}^m α_ij β_jl,
BC = F = (φ_jr),  where φ_jr = Σ_{l=1}^k β_jl γ_lr,

and from

(AB)C = EC = (Σ_{l=1}^k ε_il γ_lr) = (Σ_{l=1}^k (Σ_{j=1}^m α_ij β_jl) γ_lr) = (Σ_{l=1}^k Σ_{j=1}^m (α_ij β_jl) γ_lr),

A(BC) = AF = (Σ_{j=1}^m α_ij φ_jr) = (Σ_{j=1}^m α_ij (Σ_{l=1}^k β_jl γ_lr))
      = (Σ_{j=1}^m Σ_{l=1}^k α_ij (β_jl γ_lr)) = (Σ_{l=1}^k Σ_{j=1}^m (α_ij β_jl) γ_lr),

we conclude that (AB)C = A(BC).
However, there is no hope for commutativity. For one thing, the product BA need not be defined even if the product AB happens to be defined. For instance, if A is a 2×3 and B is a 3×4 matrix, then AB is a 2×4 matrix, but BA is not even defined, let alone equal to AB. But also in cases where both products AB and BA are defined, they will, generally speaking, have different sizes, so they will fail to be equal on dimension grounds. For instance, if A is a 2×3 matrix and B is a 3×2 matrix, then AB is a 2×2 matrix and BA is a 3×3 matrix and AB ≠ BA, since a 2×2 matrix cannot be equal to a 3×3 matrix. Even if both AB and BA are defined and have the same size (this occurs only in case A and B are square matrices with the same number of rows), it usually happens that AB ≠ BA. For example

(0 0)(1 0)   ≠   (1 0)(0 0)
(1 0)(0 0)       (0 0)(1 0).
Let I_m be the square matrix over K with m rows, whose entries are all equal to 0 ∈ K, except for those on the main diagonal, which are all equal to 1 ∈ K (the main diagonal in any n×m matrix consists of the places where the i-th row and the i-th column intersect, i = 1,2, . . . ,min{n,m}). It is easily verified that AI_m = A for any A ∈ Mat_{n×m}(K). Likewise I_nA = A for any A ∈ Mat_{n×m}(K).
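The row-by-column rule of Definition 43.9 and the role of I_n can be sketched in Python; this is an illustrative aid, not part of the text, and the helper names `mat_mul` and `identity` are choices of this sketch.

```python
def mat_mul(a, b):
    """Product of an n×m and an m×k matrix: entry (i, l) is sum_j a[i][j]*b[j][l]."""
    n, m, k = len(a), len(b), len(b[0])
    assert all(len(row) == m for row in a), "inner dimensions must agree"
    return [[sum(a[i][j] * b[j][l] for j in range(m)) for l in range(k)]
            for i in range(n)]

def identity(n):
    """I_n: ones on the main diagonal, zeros elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# The 2×2 counterexample from the text: AB ≠ BA.
A = [[0, 0], [1, 0]]
B = [[1, 0], [0, 0]]
print(mat_mul(A, B))  # [[0, 0], [1, 0]]
print(mat_mul(B, A))  # [[0, 0], [0, 0]]
print(mat_mul(A, identity(2)) == A)  # True
```

The two products differ already in this smallest interesting case, which is exactly the failure of commutativity described above.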
Multiplication of matrices is distributive over addition. Indeed, for all A = (α_ij) ∈ Mat_{n×m}(K) and B = (β_jl), C = (γ_jl) ∈ Mat_{m×k}(K), we have

A(B + C) = (α_ij)(β_jl + γ_jl) = (Σ_{j=1}^m α_ij (β_jl + γ_jl)) = (Σ_{j=1}^m (α_ij β_jl + α_ij γ_jl))
         = (Σ_{j=1}^m α_ij β_jl) + (Σ_{j=1}^m α_ij γ_jl) = AB + AC.

In like manner, one proves (B + C)A = BA + CA for all A ∈ Mat_{n×k}(K) and B,C ∈ Mat_{m×n}(K).

One checks easily that, for all λ ∈ K, A ∈ Mat_{n×m}(K), B ∈ Mat_{m×k}(K),

(λA)B = λ(AB) = A(λB).    (e)

Indeed

(λA)B = (λα_ij)(β_jl) = (Σ_{j=1}^m (λα_ij) β_jl) = (λ Σ_{j=1}^m α_ij β_jl) = λ(Σ_{j=1}^m α_ij β_jl) = λ(AB)

and

A(λB) = (α_ij)(λβ_jl) = (Σ_{j=1}^m α_ij (λβ_jl)) = (λ Σ_{j=1}^m α_ij β_jl) = λ(AB).
Let us now consider the set Mat_n(K) of square matrices over a field K. From Theorem 43.3, we know that Mat_n(K) is an abelian group under addition. The product of any two n×n matrices is an n×n matrix. Since matrix multiplication is associative and distributive over addition, Mat_n(K) is a ring. We also know that AI_n = I_nA = A for any n×n matrix A. Thus we have proved:
The counterpart of Theorem 43.11 for linear transformations is also valid.

43.12 Theorem: Let V be a vector space over a field K and let L_K(V,V) be the set of all K-linear mappings from V into V. Then, under the pointwise addition and composition of K-linear transformations, L_K(V,V) is a ring with identity. The identity mapping ι: V → V is the identity element of this ring L_K(V,V). [Notice that there is no hypothesis about dim_K V.]

Proof: We must check the ring axioms. From Theorem 43.1, we know that L_K(V,V) is an abelian group under addition. Also, (1) L_K(V,V) is closed under the composition of mappings (Theorem 41.7), and (2) composition of mappings (whether K-linear or not) is associative (Theorem 3.10), and (D) composition is distributive over addition: when T,S,R are arbitrary elements of L_K(V,V), then
Let us recall that a unit in a ring with identity is an element of that ring possessing a (unique) right inverse which is also a left inverse. What are the units of L_K(V,V)? The units in L_K(V,V) are, by definition, those K-linear transformations T with an inverse T⁻¹ in L_K(V,V). The inverse of T, whenever it exists, is in L_K(V,V) by Lemma 41.10(2). Thus the units of L_K(V,V) are the K-linear transformations in L_K(V,V) which are one-to-one and onto: the units of L_K(V,V) are the vector space isomorphisms from V onto V. The set of all isomorphisms from V onto V will be denoted by GL(V). This is a group under the composition of mappings, called the general linear group of V. Thus the group of units of L_K(V,V) is GL(V).

The units in Mat_n(K) are the invertible matrices, that is to say, matrices A in Mat_n(K) for which an A⁻¹ ∈ Mat_n(K) exists such that AA⁻¹ = I_n = A⁻¹A. These are the matrices associated with isomorphisms from V onto V. In the next paragraph, we will give a necessary and sufficient condition for a matrix to be invertible (Theorem 44.20). The set of all invertible matrices in Mat_n(K) will be denoted by GL(n,K). This is a group under the multiplication of matrices, called the general linear group of degree n over K. Thus the group of units of Mat_n(K) is GL(n,K). When V is an n-dimensional vector space over K, the group GL(V) is isomorphic to the group GL(n,K).
We return to the more general case of L_K(V,W) and Mat_{n×m}(K). Suppose again that V and W are K-vector spaces of K-dimensions n and m, respectively, where n,m ∈ ℕ. Let B = {v_1, v_2, . . . , v_n} and B* = {v_1*, v_2*, . . . , v_n*} be K-bases of V and let B′ = {w_1, w_2, . . . , w_m} and B′* = {w_1*, w_2*, . . . , w_m*} be K-bases of W. With each K-linear transformation T: V → W, there is associated a matrix M_{B′}^{B}(T) relative to the bases B and B′, and a matrix M_{B′*}^{B*}(T) relative to the bases B* and B′*. We want to study the relationship between M_{B′}^{B}(T) and M_{B′*}^{B*}(T).

We recall that M_{B′}^{B}(T) = (α_ij) and M_{B′*}^{B*}(T) = (β_ij), where

v_iT = Σ_{j=1}^m α_ij w_j   and   v_i*T = Σ_{j=1}^m β_ij w_j*    (i = 1,2, . . . ,n).
We introduce transition matrices which describe the change of bases. Writing

v_i = Σ_{k=1}^n π_ik v_k*    (i = 1,2, . . . ,n),

we obtain a matrix (π_ik) in Mat_n(K), called the transition matrix from the K-basis B to the K-basis B* of V. Of course, (π_ik) = M_{B*}^{B}(ι), where ι is the identity mapping on V. We have the schema

    ι       ι
V  →  V  →  V        (bases B*, B, B*; associated matrices M_{B}^{B*}(ι), M_{B*}^{B}(ι)).

Now the composition ιι is the identity mapping ι. Relative to the bases B* and B*, the matrix associated with ι is the identity matrix I_n. This matrix is also equal to M_{B}^{B*}(ι)M_{B*}^{B}(ι) by Theorem 43.10: M_{B}^{B*}(ι)M_{B*}^{B}(ι) = I_n. Thus the transition matrix from B to B* is the inverse of the transition matrix from B* to B. Reading the same schema with the bases B, B*, B*, B, we get likewise M_{B*}^{B}(ι)M_{B}^{B*}(ι) = I_n, so

M_{B}^{B*}(ι) = (M_{B*}^{B}(ι))⁻¹.
Now let T: V → W be a K-linear transformation. Writing T = ι_V T ι_W, we have the schema

    ι_V     T       ι_W
V  →  V  →  W  →  W        (bases B*, B, B′, B′*; associated matrices M_{B}^{B*}(ι_V), M_{B′}^{B}(T), M_{B′*}^{B′}(ι_W)).

Now M_{B}^{B*}(ι_V) = [M_{B*}^{B}(ι_V)]⁻¹ = P⁻¹ and M_{B′*}^{B′}(ι_W) = Q. By Theorem 43.10, the matrix associated with T relative to the bases B* and B′* is P⁻¹M_{B′}^{B}(T)Q. Hence

M_{B′*}^{B*}(T) = P⁻¹M_{B′}^{B}(T)Q.

In the special case V = W, B′ = B, B′* = B*, the schema becomes

    ι       T       ι
V  →  V  →  V  →  V        (bases B*, B, B, B*; associated matrices P⁻¹, M(T), P),

and the proof follows immediately from Theorem 43.10.
43.17 Lemma: Let K be a field and let A,B ∈ Mat_{n×m}(K), C ∈ Mat_{m×k}(K), λ ∈ K. Then

(A + B)^t = A^t + B^t,   (λA)^t = λ(A^t),   (AC)^t = C^tA^t.

Proof (of the third relation): writing A = (α_ij), C = (γ_jl), A^t = (α′_ji) with α′_ji = α_ij and C^t = (γ′_lj) with γ′_lj = γ_jl,

(AC)^t = matrix whose l-th row, i-th column entry is the i-th row, l-th column entry in AC
       = matrix whose l-th row, i-th column entry is Σ_{j=1}^m α_ij γ_jl
       = matrix whose l-th row, i-th column entry is Σ_{j=1}^m γ′_lj α′_ji
       = C^tA^t.
43.18 Remark: The results in this paragraph are very natural. All operations discussed here are natural, and the vector spaces and rings of this paragraph arise naturally. Another natural item is the isomorphism in Theorem 43.10.

There is, however, a subtle point here. Theorem 43.10 is true only because we write the functions on the right of the elements on which they act! If we had written them on the left, Theorem 43.10 would read: M(TS) = M(S)M(T). Of course this is not as good as M(TS) = M(T)M(S). For this reason, people who write functions on the left define the associated matrices differently. If T ∈ L_K(V,W) and v_iT = Σ_{j=1}^m α_ij w_j as in Definition 43.6, they define the matrix associated with T (relative to the fixed K-bases {v_1, v_2, . . . , v_n} of V and {w_1, w_2, . . . , w_m} of W) to be (α_ij)^t. Thus their M(T) is our M(T)^t, and

their M(TS) = their M(first S, then T) = our M(first S, then T)^t = our M(ST)^t = our (M(S)M(T))^t = our M(T)^tM(S)^t = their M(T)M(S),

so that Theorem 43.10 is true in their notation, too. In some books, the forming of the transpose is included in the notation for the associated matrix. More clearly, some people write

v_iT = Σ_{j=1}^m α_ji w_j.
Exercises
2. Let K be a field and A,B ∈ Mat_n(K) with AB = BA. Prove that (A + B)² = A² + 2AB + B² and (A + B)(A − B) = A² − B². Show that these equations need not hold if AB ≠ BA.
    (0 1 1)              (0 1 1 1)
A = (0 0 1)   and   A =  (0 0 1 1)
    (0 0 0)              (0 0 0 1)
                         (0 0 0 0).

Generalize to square matrices of n rows.
8. Let φ: ℝ³ → ℝ³ be the ℝ-linear mapping for which e_1φ = (1,0,2), e_2φ = (0,1,1), e_3φ = (1,0,1), where, as usual, e_1 = (1,0,0), e_2 = (0,1,0), e_3 = (0,0,1). We put v_1 = (−1,1,0), v_2 = (1,2,3), v_3 = (0,1,2). Let B = {e_1, e_2, e_3} and B* = {v_1, v_2, v_3}. Show that B* is an ℝ-basis of ℝ³ and find the matrix of the ℝ-linear transformation φ relative to the bases (a) B and B; (b) B and B*; (c) B* and B; (d) B* and B*.
§44
Determinants
This pattern continues. The expressions we get in this way are called determinants. On changing a to 1, b to 2, c to 3, etc., the formal definition reads as follows:

det A = Σ_{σ∈S_n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}.

Hence det A is a sum of n! terms. These summands are obtained from the product α_11α_22 . . . α_nn of the entries in the main diagonal by permuting the second indices in all the n! ways and attaching a "+" or "−" sign according as the permutation is even or odd. Each summand, aside from its sign, is the product of n entries of the matrix, the entries being from distinct rows and distinct columns. The determinant can also be written

det A = Σ_{σ∈A_n} α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}  −  Σ_{σ∈S_n\A_n} α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}.
44.2 Remarks: (1) Determinants are defined for square matrices only. Nonsquare matrices do not have a determinant. Note that the determinant of the 1×1 matrix (α) is equal to α ∈ K.

We observe only: when R is a subring of K, and all entries of A ∈ Mat_n(K) are in R, then det A is in fact an element of R.
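The definition above translates directly into a short program; the following sketch (an illustrative aid, not from the book) computes the Leibniz sum over all n! permutations, with the sign obtained by counting inversions.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, given as a tuple of 0-based images:
    +1 if the number of inversions is even, -1 if odd."""
    n = len(perm)
    inv = sum(1 for x in range(n) for y in range(x + 1, n) if perm[x] > perm[y])
    return -1 if inv % 2 else 1

def det(a):
    """det A = sum over sigma in S_n of eps(sigma) * prod_i a[i][sigma(i)]."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total

def transpose(a):
    return [list(col) for col in zip(*a)]

A = [[1, 2], [3, 4]]
print(det(A))                      # -2: the n! = 2 summands are +1*4 and -2*3
print(det(A) == det(transpose(A)))  # True, as Lemma 44.3 below asserts
```

This brute-force evaluation is only practical for small n, but it mirrors the definition exactly.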
44.3 Lemma: Let K be a field and A ∈ Mat_n(K). Then det A = det A^t. (The determinant does not change when rows are changed to columns.)

Proof: As σ runs through S_n, so does σ⁻¹. Hence

det A = Σ_{σ∈S_n} ε(σ⁻¹) α_{1,1σ⁻¹} α_{2,2σ⁻¹} . . . α_{n,nσ⁻¹}.

Rearranging the factors of each summand so that the first indices run through 1σ, 2σ, . . . , nσ, we get

det A = Σ_{σ∈S_n} ε(σ⁻¹) α_{1σ,1} α_{2σ,2} . . . α_{nσ,n}.

Since ε(σ⁻¹) = ε(σ) for all σ ∈ S_n (in case n ≥ 2; if n = 1, there is nothing to prove, for then A = A^t),

det A = Σ_{σ∈S_n} ε(σ) α_{1σ,1} α_{2σ,2} . . . α_{nσ,n} = Σ_{σ∈S_n} ε(σ) α^t_{1,1σ} α^t_{2,2σ} . . . α^t_{n,nσ} = det A^t,

where A^t = (α^t_ij) with α^t_ij = α_ji.
Proof: In view of Lemma 44.3, it suffices to prove the statement about rows only. Let A = (α_ij). Assume that the elements of the k-th row are multiplied by λ. The new matrix is (β_ij), where β_ij = α_ij for i ≠ k and β_kj = λα_kj. Thus

det (β_ij) = Σ_{σ∈S_n} ε(σ) β_{1,1σ} . . . β_{k,kσ} . . . β_{n,nσ}
           = Σ_{σ∈S_n} ε(σ) α_{1,1σ} . . . (λα_{k,kσ}) . . . α_{n,nσ}
           = λ Σ_{σ∈S_n} ε(σ) α_{1,1σ} . . . α_{k,kσ} . . . α_{n,nσ}
           = λ det (α_ij).
|α_11          α_12          . . .  α_1n       |
|. . . . . . . . . . . . . . . . . . . . . . . |
|β_k1 + γ_k1   β_k2 + γ_k2   . . .  β_kn + γ_kn|
|. . . . . . . . . . . . . . . . . . . . . . . |
|α_n1          α_n2          . . .  α_nn       |

    |α_11  α_12  . . .  α_1n|     |α_11  α_12  . . .  α_1n|
    |. . . . . . . . . . . .|     |. . . . . . . . . . . .|
  = |β_k1  β_k2  . . .  β_kn|  +  |γ_k1  γ_k2  . . .  γ_kn|
    |. . . . . . . . . . . .|     |. . . . . . . . . . . .|
    |α_n1  α_n2  . . .  α_nn|     |α_n1  α_n2  . . .  α_nn|

and

|α_11  . . .  β_1k + γ_1k  . . .  α_1n|     |α_11  . . .  β_1k  . . .  α_1n|     |α_11  . . .  γ_1k  . . .  α_1n|
|α_21  . . .  β_2k + γ_2k  . . .  α_2n|  =  |α_21  . . .  β_2k  . . .  α_2n|  +  |α_21  . . .  γ_2k  . . .  α_2n|
|. . . . . . . . . . . . . . . . . . .|     |. . . . . . . . . . . . . . . |     |. . . . . . . . . . . . . . . |
|α_n1  . . .  β_nk + γ_nk  . . .  α_nn|     |α_n1  . . .  β_nk  . . .  α_nn|     |α_n1  . . .  γ_nk  . . .  α_nn|
Proof: The proof is shorter than the wording of the lemma. It will be sufficient to prove the assertion involving rows only, and this follows from summing

ε(σ) α_{1,1σ} . . . (β_{k,kσ} + γ_{k,kσ}) . . . α_{n,nσ}
   = ε(σ) α_{1,1σ} . . . β_{k,kσ} . . . α_{n,nσ} + ε(σ) α_{1,1σ} . . . γ_{k,kσ} . . . α_{n,nσ}

over all σ ∈ S_n.
Proof: We prove the statement about rows only. Assume A = (α_ij) and B = (β_ij), and assume that B is obtained from A by interchanging the k-th and m-th rows of A, so that β_ij = α_ij for all i,j with i ≠ k, i ≠ m and β_kj = α_mj, β_mj = α_kj for all j. Then

det B = Σ_{σ∈S_n} ε(σ) β_{1,1σ} . . . β_{k,kσ} . . . β_{m,mσ} . . . β_{n,nσ}
      = Σ_{σ∈S_n} ε(σ) α_{1,1σ} . . . α_{m,kσ} . . . α_{k,mσ} . . . α_{n,nσ}.

As σ ranges over S_n, so does τ = (km)σ. Here kτ = mσ, mτ = kσ, iτ = iσ for i ≠ k,m, and ε(τ) = −ε(σ), so the summand belonging to σ equals −ε(τ) α_{1,1τ} . . . α_{k,kτ} . . . α_{m,mτ} . . . α_{n,nτ}. Hence we have

det B = −Σ_{τ∈S_n} ε(τ) α_{1,1τ} α_{2,2τ} . . . α_{n,nτ} = −det A.
Proof: Let A = (α_ij) and B = (β_ij). We give a proof of the assertion about rows only. The hypothesis is that β_ij = α_{iτ,j} for some τ in S_n. We write τ as a product of transpositions:

τ = τ_1τ_2 . . . τ_s    (τ_1, τ_2, . . . , τ_s are transpositions in S_n),

so that ε(τ) = (−1)^s by definition. We introduce matrices

A = A_0, A_1, A_2, . . . , A_{s−1}, A_s = B,

where each A_r is obtained from A_{r−1} (r = 1,2, . . . ,s) by interchanging two rows:

A_0 = (α_ij), A_1 = (α_{iτ_1,j}), A_2 = (α_{iτ_1τ_2,j}), . . . , A_{s−1} = (α_{iτ_1τ_2...τ_{s−1},j}), A_s = (α_{iτ_1τ_2...τ_{s−1}τ_s,j}).
This conclusion is justified when we can divide by 2 in K, that is to say, if the multiplicative inverse of 2 exists in K. Let us recall that 2 is an abbreviation of 1_K + 1_K, where 1_K is the identity of K. Since any nonzero element of K has an inverse in K, the conclusion is valid when K is a field in which 1_K + 1_K ≠ 0. If, however, 1_K + 1_K = 0 (as in ℤ_2), this argument does not work.

Let A = (α_ij), with α_1j = α_2j for all j. If n = 2, then

det A = |α_11  α_12|  =  α_11α_12 − α_12α_11 = 0.
        |α_11  α_12|

Let us suppose now n ≥ 3. Then
of a matrix A = (α_ij) ∈ Mat_n(K), where K is a field, with the vector

(α_i1  α_i2  . . .  α_in)

in K^n = Mat_{1×n}(K). Similarly, the j-th column

(α_1j)
(α_2j)
( ⋮  )
(α_nj)

of A will be identified with the vector (matrix) in Mat_{n×1}(K). Thus it is meaningful to speak of K-linear (in)dependence of rows and columns of a matrix. Likewise, we can add two rows (columns) and multiply them by scalars.
Proof: We prove the assertion about rows only. Suppose that the k-th row in A is multiplied by λ ∈ K and added to the m-th row. Writing A = (α_ij), B = (β_ij), we have β_mj = λα_kj + α_mj and β_ij = α_ij for i ≠ m. Lemma 44.5 gives det B = det C + det A, where C ∈ Mat_n(K) is identical with A except for the m-th row, which is λ times the k-th row of A. By Lemma 44.4, det C = λ det D, where D ∈ Mat_n(K) is identical with A except that the m-th row of D = the k-th row of A = the k-th row of D. Then det D = 0 by Lemma 44.9 and det B = λ det D + det A = det A.
Proof: Let A = (α_ij). Under the hypothesis of the lemma, each summand ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ} (σ ∈ S_n) of det A is zero, for one of the factors is zero. Hence det A = 0.

λ_1(1st row) + λ_2(2nd row) + . . . + λ_n(n-th row) = (0,0, . . . ,0)

and not all of λ_1, λ_2, . . . , λ_n are equal to 0 ∈ K. Suppose λ_k ≠ 0. Then λ_k has an inverse λ_k⁻¹ in K. We multiply the i-th row by λ_iλ_k⁻¹ and add it to the k-th row; we do this for each i ≠ k. Then we obtain a matrix B whose determinant is equal to det A by Lemma 44.10. On the other hand, the k-th row of B consists entirely of zeroes and det B = 0 by Lemma 44.11. Hence det A = 0.
det A = Σ_{σ∈S_n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}

(−1)^{i+j} det M_ij is called the cofactor of α_ij in A. We write A_ij for the cofactor of α_ij in A.

det A = Σ_{σ∈S_n, nσ=n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n−1,(n−1)σ} α_{n,n} + Σ_{σ∈S_n, nσ≠n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}

      = α_nn Σ_{σ∈S_n, nσ=n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n−1,(n−1)σ} + terms not involving α_nn.

Any σ ∈ S_n with nσ = n can be regarded as a permutation in S_{n−1}, and any permutation in S_{n−1} can be regarded as a permutation in S_n with nσ = n. Here ε(σ) is independent of whether we regard σ as an element of S_n or of S_{n−1}. Hence
|α_11       α_12       . . .  α_1,n−1  |
|α_21       α_22       . . .  α_2,n−1  |
|. . . . . . . . . . . . . . . . . . . |  =  det M_nn  =  A_nn.
|α_{n−1,1}  α_{n−1,2}  . . .  α_{n−1,n−1}|
(2) We prove c_nm = A_nm for all m = 1,2, . . . ,n. The case m = n having been settled in part (1) above, we assume m ≠ n. Consider the matrix

     (α_11       . . .  α_1,m−1     α_1n       α_1,m+1     . . .  α_1,n−1     α_1m     )
     (α_21       . . .  α_2,m−1     α_2n       α_2,m+1     . . .  α_2,n−1     α_2m     )
B =  (. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  )
     (α_{n−1,1}  . . .  α_{n−1,m−1} α_{n−1,n}  α_{n−1,m+1} . . .  α_{n−1,n−1} α_{n−1,m})
     (α_n1       . . .  α_n,m−1     α_nn       α_n,m+1     . . .  α_n,n−1     α_nm     )

obtained from A by interchanging its m-th and n-th columns, so that det B = −det A. By part (1),

det B = α_nm det M + terms not involving α_nm,    (ii)

where M is the matrix obtained from B by deleting its n-th row and n-th column. Here

       (α_11       . . .  α_1,m−1     α_1,m+1     . . .  α_1,n−1     α_1n     )
       (α_21       . . .  α_2,m−1     α_2,m+1     . . .  α_2,n−1     α_2n     )
M_nm = (. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . )
       (α_{n−1,1}  . . .  α_{n−1,m−1} α_{n−1,m+1} . . .  α_{n−1,n−1} α_{n−1,n})

and M is obtained from M_nm by moving the last column of M_nm into the m-th position, which takes n−m−1 interchanges of columns, so det M = (−1)^{n−m−1} det M_nm. From (ii) we get

det A = −det B = α_nm(−det M) + terms not involving α_nm = α_nm(−1)^{n−m} det M_nm + terms not involving α_nm
      = α_nm A_nm + terms not involving α_nm,

as was to be shown.

(3) We now prove c_km = A_km for all k,m. The case k = n having been settled in part (2), we assume k ≠ n. We consider the matrix C obtained from B by interchanging the k-th and n-th rows. Then det C = −det B = det A by Lemma 44.7 and, by part (1),

det C = α_km det N + terms not involving α_km,
det A = Σ_{σ∈S_n, iσ=1} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ} + Σ_{σ∈S_n, iσ=2} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ} + . . .
        + Σ_{σ∈S_n, iσ=n} ε(σ) α_{1,1σ} α_{2,2σ} . . . α_{n,nσ}
      = α_i1 c_i1 + α_i2 c_i2 + . . . + α_in c_in
      = α_i1 A_i1 + α_i2 A_i2 + . . . + α_in A_in

for any i. This proves the first formula. Applying it with A^t, j in place of A, i, we obtain the second formula.
+ − + − . . .
− + − + . . .
+ − + − . . .
− + − + . . .
. . . . . . .

The expansion along a row or column is sometimes given as a recursive definition of determinants in terms of determinants of smaller size.
evaluate

det (x_j^{i−1}) = |1          1          . . .  1         |
                  |x_1        x_2        . . .  x_n       |
                  |x_1²       x_2²       . . .  x_n²      |
                  | ⋮                                     |
                  |x_1^{n−1}  x_2^{n−1}  . . .  x_n^{n−1} | .
This holds for any n. Thus

Changing the sign of the (n choose 2) factors on the right hand side and noting that (n−1) + (n−2) + . . . + 1 = (n choose 2), we finally get

D_n = Π_{i>j} (x_i − x_j),

the product being over all (n choose 2) pairs (i,j), where i,j = 1,2, . . . ,n and i > j.
|α_11  α_12  α_13  . . .  α_1n|          |α_11                        0   |
|      α_22  α_23  . . .  α_2n|          |α_21  α_22                      |
|            α_33  . . .  α_3n|   and    |α_31  α_32  α_33                |
|     0             ⋱    ⋮   |          | ⋮                  ⋱          |
|                         α_nn|          |α_n1  α_n2  α_n3  . . .  α_nn   |

is evaluated to be α_11α_22α_33 . . . α_nn. In particular, the determinant of a diagonal matrix

|α_11                    0  |
|      α_22                 |
|            α_33           |
|  0                ⋱      |
|                      α_nn |

is α_11α_22α_33 . . . α_nn.
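The triangular-matrix formula can be spot-checked with a small Leibniz-sum determinant (an illustrative sketch, not from the book): only the identity permutation avoids the zeros below the diagonal, so the sum collapses to the diagonal product.

```python
from itertools import permutations

def det(a):
    """Determinant via the Leibniz sum (adequate for small n)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for x in range(n) for y in range(x + 1, n) if p[x] > p[y])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

# Upper triangular: every non-identity permutation picks at least one
# entry below the diagonal, which is 0, so det = 2 * 3 * 4.
U = [[2, 5, 7],
     [0, 3, 1],
     [0, 0, 4]]
print(det(U))  # 24
```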
Proof: The first (second) sum is the expansion, along the i-th row (j-th column), of det B, where B is the matrix obtained from A by replacing the k-th row (m-th column) of A by its i-th row (j-th column). Since two rows (columns) of B are identical, det B = 0 by Lemma 44.9. The result follows.

δ_rs = { 1 if r = s
       { 0 if r ≠ s,
To express these equations more succinctly, we introduce a definition.

Otherwise, det A ≠ 0 and det A has an inverse 1/det A in K. If n = 1, then A = (det A) and (1/det A) is the inverse of A. If n ≥ 2, we multiply the members of the equations in Theorem 44.19 by 1/det A and obtain

A · (1/det A)(adjoint of A)^t = I = (1/det A)(adjoint of A)^t · A.

This shows that (1/det A)(adjoint of A)^t is an inverse of A. So A ∈ GL(n,K) and, since GL(n,K) is a group, A has a unique inverse. Hence (1/det A)(adjoint of A)^t is the inverse A⁻¹ of A.
v_iT = Σ_{j=1}^n α_ij v_j

has the associated matrix (α_ij) = A, which is not invertible since det A = 0. So A is not a unit in Mat_n(K) and T is not a unit in L_K(V,V). Thus T is not an isomorphism. From Theorem 42.22, we conclude that T is not one-to-one. Thus Ker T ≠ {0}. Let v ∈ Ker T, v ≠ 0. We have

v = λ_1v_1 + λ_2v_2 + . . . + λ_nv_n

for some suitable scalars λ_j ∈ K. Here not all of the λ_j are equal to 0, because v ≠ 0 and {v_1, v_2, . . . , v_n} is a K-basis of V. Then

0 = vT = (Σ_{i=1}^n λ_iv_i)T = Σ_{i=1}^n λ_i(v_iT) = Σ_{i=1}^n λ_i Σ_{j=1}^n α_ij v_j = Σ_{j=1}^n (Σ_{i=1}^n λ_iα_ij) v_j,

so

Σ_{i=1}^n λ_iα_ij = 0    for j = 1,2, . . . ,n,

since {v_1, v_2, . . . , v_n} is a K-basis of V. Thus

λ_1(1st row) + λ_2(2nd row) + . . . + λ_n(n-th row) = (0,0, . . . ,0)

with scalars λ_1, λ_2, . . . , λ_n ∈ K which are not all equal to 0. So the rows of A are K-linearly dependent. Repeating the same argument with A^t, we see that the columns of A, too, are K-linearly dependent.
Proof: (2) That det I = 1 is a special case of the formula for the determinant of a diagonal matrix discussed in Example 44.16(b). And (3) follows from (1) and (2): (det A⁻¹)(det A) = det (A⁻¹A) = det I = 1.

We prove (1). Let A = (α_ij), B = (β_ij), AB = (γ_ij), so that γ_ij = Σ_{k=1}^n α_ik β_kj for all i,j. Then

det (AB) = |γ_11  γ_12  . . .  γ_1n|
           |γ_21  γ_22  . . .  γ_2n|
           |. . . . . . . . . . . .|
           |γ_n1  γ_n2  . . .  γ_nn|

         = |Σ_{k_1} α_{1k_1}β_{k_1 1}   Σ_{k_2} α_{1k_2}β_{k_2 2}   . . .   Σ_{k_n} α_{1k_n}β_{k_n n}|
           |Σ_{k_1} α_{2k_1}β_{k_1 1}   Σ_{k_2} α_{2k_2}β_{k_2 2}   . . .   Σ_{k_n} α_{2k_n}β_{k_n n}|
           |. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .|
           |Σ_{k_1} α_{nk_1}β_{k_1 1}   Σ_{k_2} α_{nk_2}β_{k_2 2}   . . .   Σ_{k_n} α_{nk_n}β_{k_n n}|

         = Σ_{k_1=1}^n Σ_{k_2=1}^n . . . Σ_{k_n=1}^n |α_{1k_1}β_{k_1 1}   α_{1k_2}β_{k_2 2}   . . .   α_{1k_n}β_{k_n n}|    (Lemma 44.5)
                                                     |α_{2k_1}β_{k_1 1}   α_{2k_2}β_{k_2 2}   . . .   α_{2k_n}β_{k_n n}|
                                                     |. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . |
                                                     |α_{nk_1}β_{k_1 1}   α_{nk_2}β_{k_2 2}   . . .   α_{nk_n}β_{k_n n}|

         = Σ_{k_1=1}^n Σ_{k_2=1}^n . . . Σ_{k_n=1}^n β_{k_1 1}β_{k_2 2} . . . β_{k_n n} |α_{1k_1}  α_{1k_2}  . . .  α_{1k_n}|    (Lemma 44.4)
                                                                                        |α_{2k_1}  α_{2k_2}  . . .  α_{2k_n}|
                                                                                        |. . . . . . . . . . . . . . . . . .|
                                                                                        |α_{nk_1}  α_{nk_2}  . . .  α_{nk_n}|
In this n-fold sum, k_1, k_2, . . . , k_n run independently over 1,2, . . . ,n. If, however, any two of k_1, k_2, . . . , k_n are equal, then the determinant |α_{ik_j}| in the n-fold sum has two identical columns and therefore vanishes (Lemma 44.9). So we may disregard those combinations of the indices k_1, k_2, . . . , k_n which contain two equal values, and restrict the n-fold summation to those combinations of k_1, k_2, . . . , k_n such that k_1, k_2, . . . , k_n are all distinct. Then the n-fold sum becomes a sum over the permutations σ ∈ S_n with jσ = k_j:

Σ_{σ∈S_n} β_{1σ,1} β_{2σ,2} . . . β_{nσ,n} |α_{1,1σ}  α_{1,2σ}  . . .  α_{1,nσ}|
                                           |α_{2,1σ}  α_{2,2σ}  . . .  α_{2,nσ}|
                                           |. . . . . . . . . . . . . . . . . .|
                                           |α_{n,1σ}  α_{n,2σ}  . . .  α_{n,nσ}|

= Σ_{σ∈S_n} β_{1σ,1} β_{2σ,2} . . . β_{nσ,n} · ε(σ) |α_11  α_12  . . .  α_1n|    (Lemma 44.8)
                                                    |α_21  α_22  . . .  α_2n|
                                                    |. . . . . . . . . . . .|
                                                    |α_n1  α_n2  . . .  α_nn|

= Σ_{σ∈S_n} ε(σ) β_{1σ,1} β_{2σ,2} . . . β_{nσ,n} (det A) = (det A)(det B)    (Lemma 44.3).

The equation det AB = (det A)(det B) may also be written in the forms

det AB = (det A)(det B^t),
det AB = (det A^t)(det B),
det AB = (det A^t)(det B^t).

So there are four versions of the multiplication rule for determinants, known as the rows by columns multiplication, rows by rows multiplication, columns by columns multiplication, columns by rows multiplication, which are respectively described below:
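The multiplication rule det AB = (det A)(det B) can be spot-checked numerically; the sketch below (illustrative, not from the book) reuses a Leibniz-sum determinant and a row-by-column product.

```python
from itertools import permutations

def det(a):
    """Determinant via the Leibniz sum (adequate for small n)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for x in range(n) for y in range(x + 1, n) if p[x] > p[y])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def mat_mul(a, b):
    """Rows-by-columns product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 1, 1], [0, 1, 0], [1, 0, 2]]
print(det(mat_mul(A, B)) == det(A) * det(B))  # True
```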
If K is a field, (α_ij), (β_ij), (γ_ij) ∈ Mat_n(K), and if

γ_ij = Σ_{k=1}^n α_ik β_kj   for all i,j,   or
γ_ij = Σ_{k=1}^n α_ik β_jk   for all i,j,   or
γ_ij = Σ_{k=1}^n α_ki β_kj   for all i,j,   or
γ_ij = Σ_{k=1}^n α_ki β_jk   for all i,j,

then det (γ_ij) = det (α_ij) · det (β_ij).
Exercises
α_11  α_12  α_13 | α_11  α_12
α_21  α_22  α_23 | α_21  α_22
α_31  α_32  α_33 | α_31  α_32

We take the products of the upper-left, lower-right diagonals (full lines) unchanged, the products of the lower-right, upper-left diagonals (broken lines) with a minus sign. The sum of these six products is the determinant of (α_ij). (This rule cannot be extended to n×n matrices if n is greater than 3.)
1 1 1 2
1 0 0 7 3
2 1 1 0 2 3 0 4 4 0 1 0
(e) 3 1 0 ; (f) 1 1 0 1 ; (g) 2 8 2 1 4 .
0 3 2
0 1 0 4 2 1 5 2
1 0 1 0
1
2
                                (2 1 1)
3. Find det A if A is the matrix(3 1 0) in Mat_3(ℤ_7).
                                (0 3 2)
(1 0 5 4 0)
(3 2 1 3 1)
(0 3 1 2 0)
(1 2 0 2 4)
(6 1 2 1 1)
                        (1 2 3)     (1 2 1 0)
5. Find the adjoints of (0 1 1) and (0 1 4 3)
                        (2 0 4)     (2 2 0 1)
                                    (5 1 6 2).
d_n = |x+y   xy    0     0    . . .|
      |1     x+y   xy    0    . . .|
      |0     1     x+y   xy   . . .|
      |. . . . . . . . . . . . . . |

Express d_n in terms of d_{n−1} and d_{n−2}, and evaluate it in closed form.

|. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .|
|(c+m−1 choose 0)  (c+m choose 1)    (c+m+1 choose 2)  . . .  (c+m+m−1 choose m)        |
|(c+m choose 0)    (c+m+1 choose 1)  (c+m+2 choose 2)  . . .  (c+m+m choose m)          | .
§45
Linear Equations
Let K be a field and α_ij, β_i ∈ K (i = 1,2, . . . ,m; j = 1,2, . . . ,n). We ask if there are elements x_1, x_2, . . . , x_n in K such that

α_11x_1 + α_12x_2 + . . . + α_1nx_n = β_1
α_21x_1 + α_22x_2 + . . . + α_2nx_n = β_2
. . . . . . . . . . . . . . . . . . .                  (1)
α_m1x_1 + α_m2x_2 + . . . + α_mnx_n = β_m.

α_11x_1 + α_12x_2 + . . . + α_1nx_n = 0
α_21x_1 + α_22x_2 + . . . + α_2nx_n = 0
. . . . . . . . . . . . . . . . . . .                  (2)
α_m1x_1 + α_m2x_2 + . . . + α_mnx_n = 0,

then there are elements x_1, x_2, . . . , x_n in K, not all of them being zero, which satisfy the system (2).
Writing A = (α_ij) ∈ Mat_{m×n}(K),

    (x_1)                          (0)
X = (x_2)  ∈ Mat_{n×1}(K) and 0 =  (0)  ∈ Mat_{m×1}(K),
    ( ⋮ )                          (⋮)
    (x_n)                          (0)

we may write (2) as a matrix equation:

AX = 0.
α_11x_1 + α_12x_2 + . . . + α_1nx_n = β_1
α_21x_1 + α_22x_2 + . . . + α_2nx_n = β_2
. . . . . . . . . . . . . . . . . . .                  (3)
α_n1x_1 + α_n2x_2 + . . . + α_nnx_n = β_n

Proof: Let A = (α_ij),

    (x_1)                          (β_1)
X = (x_2)  ∈ Mat_{n×1}(K) and B =  (β_2)  ∈ Mat_{n×1}(K). Then (3) reads
    ( ⋮ )                          ( ⋮ )
    (x_n)                          (β_n)

AX = B.    (4)

Multiplying both sides of (4) on the left by A⁻¹ = (1/det A)(adjoint of A)^t, we obtain

X = (1/det A)(adjoint of A)^t B.    (5)
Also, multiplying both sides of (5) on the left by A, and using Theorem 44.12, we obtain (4). Thus (4) and (5) are equivalent. So the system (3) or (4) has a unique solution given by (5). In more detail, when we write A_ij for the cofactor of α_ij in A, so that (adjoint of A) = (A_ij), the solution is given by

(x_1)        1     (A_11  A_21  . . .  A_n1) (β_1)
(x_2)  =  ───────  (A_12  A_22  . . .  A_n2) (β_2)
( ⋮ )      det A   (. . . . . . . . . . . .) ( ⋮ )
(x_n)              (A_1n  A_2n  . . .  A_nn) (β_n),

so that

x_j = (1/det A)(β_1A_1j + β_2A_2j + . . . + β_nA_nj)    (j = 1,2, . . . ,n).

Comparing the sum β_1A_1j + β_2A_2j + . . . + β_nA_nj with the expansion (Theorem 44.15) of det (α_ij) along the j-th column, we see that the parenthetical expression is the expansion, along the j-th column, of the determinant of the matrix B_j that is obtained from (α_ij) by replacing its j-th column by B. Thus

x_j = det B_j / det (α_ij)    (j = 1,2, . . . ,n),

as claimed.
The formula x_j = det B_j / det (α_ij) is known as Cramer's rule after G. Cramer (1704-1752).
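Cramer's rule translates directly into a short routine; the sketch below (illustrative, not from the book) uses exact Fractions so no rounding error enters, and a Leibniz-sum determinant as the helper.

```python
from fractions import Fraction
from itertools import permutations

def det(a):
    """Determinant via the Leibniz sum (adequate for small n)."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        inv = sum(1 for x in range(n) for y in range(x + 1, n) if p[x] > p[y])
        term = -1 if inv % 2 else 1
        for i in range(n):
            term *= a[i][p[i]]
        total += term
    return total

def cramer(a, b):
    """Solve AX = B for square A with det A != 0: x_j = det B_j / det A,
    where B_j is A with its j-th column replaced by the column B."""
    n = len(a)
    d = det(a)
    if d == 0:
        raise ValueError("det A = 0: Cramer's rule does not apply")
    xs = []
    for j in range(n):
        bj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        xs.append(Fraction(det(bj), d))
    return xs

# 2x + y = 5 and x + 3y = 5 give x = 2, y = 1.
print(cramer([[2, 1], [1, 3]], [5, 5]))  # [Fraction(2, 1), Fraction(1, 1)]
```

As Remark 45.4 below notes, this is of theoretical rather than practical interest for large systems, since it evaluates n + 1 determinants.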
α_11x_1 + α_12x_2 + . . . + α_1nx_n = 0
α_21x_1 + α_22x_2 + . . . + α_2nx_n = 0
. . . . . . . . . . . . . . . . . . .                  (6)
α_n1x_1 + α_n2x_2 + . . . + α_nnx_n = 0

Proof: If det (α_ij) ≠ 0, then the system has a unique solution by Theorem 45.2, which must be x_1 = x_2 = . . . = x_n = 0, as follows also from Cramer's rule, for the numerator determinants, having a column consisting of zeroes only, are all equal to 0. Thus, if the system has a nontrivial solution in K, then det (α_ij) must be zero.

Suppose conversely that det (α_ij) = 0. Then the columns of (α_ij) are linearly dependent over K (Theorem 44.21): there are elements λ_1, λ_2, . . . , λ_n in K, not all of them being zero, such that

    (α_11)       (α_12)              (α_1n)   (0)
λ_1 (α_21) + λ_2 (α_22) + . . . + λ_n(α_2n) = (0)
    ( ⋮  )       ( ⋮  )              ( ⋮  )   (⋮)
    (α_n1)       (α_n2)              (α_nn)   (0).

Thus x_1 = λ_1, x_2 = λ_2, . . . , x_n = λ_n is a nontrivial solution of (6).
45.4 Remark: The theorems in this paragraph are chiefly of theoretical
interest. Finding solutions of specific systems by the methods described
in this paragraph would be very tedious.
Exercises
(a) 3x + 4y − 5z = 1
    2x − 3y + z = 3
    2x + y + 6z = 0;

(b) 4x + y − 5z − u = 1
    6x + 2y − 3z + 3u = 8
    4x + 5y − 2z + u = 3
    2x − 7z − 3u = 0.

(a) 2x + 11y + 4z = 1
    3x + 8y + 5z = 6
    9x + 12y + 4z = 7;

(b) 2x + 11y + 4z + 3u = 5
    8x + 10y + 6z + 7u = 2
    1x + 9y + 2z + 8u = 6
    3x + 1y + 0z + 5u = 4.
§46
Algebras
v((λT)S) = (v(λT))S = (λ(vT))S = λ((vT)S) = λ(v(TS))

and

v(T(λS)) = (vT)(λS) = λ((vT)S) = λ(v(TS))

for all v ∈ V; thus (λT)S = λ(TS) = T(λS). Thus L_K(V,V) is a K-algebra.
x = Σ_{i=1}^n λ_i b_i,   y = Σ_{j=1}^n μ_j b_j,   z = Σ_{k=1}^n ν_k b_k,

(xy)z = ((Σ_{i=1}^n λ_i b_i)(Σ_{j=1}^n μ_j b_j)) · Σ_{k=1}^n ν_k b_k = (Σ_{i,j=1}^n (λ_i b_i)(μ_j b_j)) · Σ_{k=1}^n ν_k b_k
      = (Σ_{i,j=1}^n (λ_i μ_j)(b_i b_j)) · Σ_{k=1}^n ν_k b_k = Σ_{i,j,k=1}^n ((λ_i μ_j)(b_i b_j))(ν_k b_k)
      = Σ_{i,j,k=1}^n ((λ_i μ_j)ν_k)((b_i b_j)b_k)

and likewise

x(yz) = Σ_{i=1}^n λ_i b_i · (Σ_{j=1}^n μ_j b_j Σ_{k=1}^n ν_k b_k) = Σ_{i=1}^n λ_i b_i · Σ_{j,k=1}^n (μ_j ν_k)(b_j b_k)
      = Σ_{i,j,k=1}^n (λ_i(μ_j ν_k))(b_i(b_j b_k)).

Now (λ_i μ_j)ν_k = λ_i(μ_j ν_k) since the multiplication on K is associative and (b_i b_j)b_k = b_i(b_j b_k) by hypothesis, so (xy)z = x(yz). Hence the multiplication on V is also associative.
elements of KG are sums Σ_{i=1}^{|G|} λ_i g_i, where G = {g_1, g_2, . . . , g_{|G|}}. It will be convenient to write such a sum as Σ_{g∈G} λ_g g. Two elements Σ_{g∈G} λ_g g and Σ_{g∈G} μ_g g of KG are equal if and only if λ_g = μ_g for each g ∈ G. The sum of Σ_{g∈G} λ_g g and Σ_{g∈G} μ_g g is Σ_{g∈G} (λ_g + μ_g)g, and the product of κ ∈ K by Σ_{g∈G} λ_g g is Σ_{g∈G} (κλ_g)g. We now define a multiplication on KG by extending the product of Σ_{g∈G} λ_g g by Σ_{g∈G} μ_g g = Σ_{h∈G} μ_h h to be

(Σ_{g∈G} λ_g g)(Σ_{h∈G} μ_h h) = Σ_{g,h∈G} λ_g μ_h gh = Σ_{k∈G} (Σ_{g,h∈G, gh=k} λ_g μ_h)k.

For any a = Σ_{g∈G} λ_g g, b = Σ_{g∈G} μ_g g in KG and κ ∈ K, we have

(κa)b = (Σ_{g∈G} (κλ_g)g)(Σ_{g∈G} μ_g g) = Σ_{k∈G} (Σ_{g,h∈G, gh=k} (κλ_g)μ_h)k = κ Σ_{k∈G} (Σ_{g,h∈G, gh=k} λ_g μ_h)k = κ(ab).

We identify g_0 ∈ G with the element Σ_{g∈G} λ_g g of KG for which λ_g = 0 if g ≠ g_0 and λ_{g_0} = 1. Thus we regard G as a subset of KG. It is
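The multiplication just defined can be sketched for the smallest interesting case, G the cyclic group of order 2 over K = ℚ; the dictionary encoding of Σ λ_g g below is an assumption of this illustration, not the book's notation. The example also exhibits zero divisors, since (1 + g)(1 − g) = 1 − g² = 0 when g² = 1.

```python
from itertools import product

def ga_mul(a, b, n=2):
    """Group-algebra product over the cyclic group of order n:
    (sum_g a_g g)(sum_h b_h h) = sum_k (sum_{gh=k} a_g b_h) k,
    where group elements are 0..n-1 and gh means (g + h) mod n."""
    c = {k: 0 for k in range(n)}
    for g, h in product(range(n), range(n)):
        c[(g + h) % n] += a.get(g, 0) * b.get(h, 0)
    return c

# (1 + g)(1 - g) = 1 - g^2 = 0 in Q[C2]: a product of nonzero elements is zero.
a = {0: 1, 1: 1}
b = {0: 1, 1: -1}
print(ga_mul(a, b))  # {0: 0, 1: 0}
```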
(b) Let ℍ = ℝ⁴ be the four-dimensional ℝ-vector space of ordered quadruples, and let e = (1,0,0,0), i = (0,1,0,0), j = (0,0,1,0), k = (0,0,0,1). Thus {e,i,j,k} is a basis of ℍ over ℝ. We give a multiplication table for these basis elements:

      e    i    j    k
 e    e    i    j    k
 i    i   −e    k   −j
 j    j   −k   −e    i
 k    k    j   −i   −e

Thus ea = ae = a for any a ∈ {i,j,k} and the products of i,j,k are like the cross product of the vectors i,j,k in ℝ³. The product of two distinct elements from {i,j,k} is equal to ±(the third), the sign being "+" for products taken in the order indicated in the accompanying diagram, and "−" in the reverse order.

(diagram: the cycle i → j → k → i)
(αe + βi + γj + δk)(α′e + β′i + γ′j + δ′k) = (αα′ − ββ′ − γγ′ − δδ′)e
                                             + (αβ′ + βα′ + γδ′ − δγ′)i
                                             + (αγ′ − βδ′ + γα′ + δβ′)j
                                             + (αδ′ + βγ′ − γβ′ + δα′)k,

which may be taken as the definition of multiplication on ℍ. One checks that this multiplication is distributive over addition, and that e is an identity element. To prove the associativity of multiplication, we must only verify the 4³ = 64 equations (ab)c = a(bc), where a,b,c ∈ {e,i,j,k} (Lemma 46.3). This verification is left to the reader. The multiplication is thus seen to be associative. One also finds immediately (λa)b = λ(ab) = a(λb) for any λ ∈ ℝ and a,b ∈ ℍ. Thus ℍ is an algebra over ℝ. This algebra was discovered by the Irish mathematician W. R. Hamilton (1805-1865). The elements of ℍ are called quaternions, and ℍ is known as the Hamiltonian algebra of quaternions. It is not commutative, since ij = k ≠ −k = ji, for example.
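The product formula can be sketched in Python, encoding a quaternion αe + βi + γj + δk as the 4-tuple (α, β, γ, δ); this encoding is a choice of this illustration, not the book's notation.

```python
def q_mul(p, q):
    """Quaternion product, following the multiplication table of {e, i, j, k}."""
    a, b, c, d = p
    x, y, z, w = q
    return (a*x - b*y - c*z - d*w,   # e component
            a*y + b*x + c*w - d*z,   # i component
            a*z - b*w + c*x + d*y,   # j component
            a*w + b*z - c*y + d*x)   # k component

e, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(q_mul(i, j))                    # (0, 0, 0, 1), i.e. k
print(q_mul(j, i))                    # (0, 0, 0, -1), i.e. -k: ij != ji
print(q_mul(i, i) == (-1, 0, 0, 0))   # True: i^2 = -e
```

Checking the 64 products of basis elements mechanically in this way is one route to the associativity verification left to the reader.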
Just like we divide a complex number a = α + βi by a nonzero complex number b = γ + δi by multiplying the numerator and denominator of a/b by the conjugate b̄ = γ − δi of b:

a/b = (α + βi)/(γ + δi) = (α + βi)(γ − δi)/((γ + δi)(γ − δi)) = (αγ + βδ)/(γ² + δ²) + ((βγ − αδ)/(γ² + δ²))i,

we divide a quaternion a by a nonzero quaternion b with the help of the conjugate b̄ of b:

a/b = ab̄/(bb̄) = ab̄/N(b).
Exercises
G e.
3. Let K be a field and A an algebra over K. Prove that the center Z(A) of
A (see §32, Ex. 1) is a subspace of A.
5. For any a ∈ ℍ, show that there are real numbers t,n such that a² − ta + n = 0.

7. Let a,b ∈ ℍ. Show that ab = ba if and only if 1,a,b are linearly dependent over ℝ.
8. Prove that {±1, ±i, ±j, ±k} is a group isomorphic to Q_8 (see §17, Ex.15) and that S = {±1, ±i, ±j, ±k, (±1 ± i ± j ± k)/2} is a group isomorphic to SL(2,3). Show that {±1} ⊴ S and S/{±1} ≅ A_4.
9. Prove that the quaternion algebra over ℂ is isomorphic (as ring and ℂ-vector space) to the ℂ-algebra Mat_2(ℂ).
e i j k
e e i j k
i i e k j
j j k e i
k k j i e
(d) Prove that A is a division algebra if and only if N(a) ≠ 0 for any nonzero a ∈ A, and this holds if and only if λ_0² = λ_1² + λ_2² implies λ_0 = λ_1 = λ_2 = 0 for any λ_0, λ_1, λ_2 ∈ K.
CHAPTER 5
Fields
§47
Historical Introduction
There is, of course, the related question concerning the existence of roots
of polynomials. Does every polynomial have a root? Here the coefficient
of polynomials were implicitly understood to be real numbers. A. Girard
(1595-1632) expressed that any polynomial has a root in some realm of
numbers (not neccessarily in the realm of complex numbers), without
indicating any method of proof. R. Descartes (1596-1650) noted that x
c is a divisor of a polynomial if c is a root of that polynomial and gave a
rule for determining the number of real roots in a specified interval. He
makes an obscure remark about the existence of roots. Euler stated that
any polynomial has a root in complex numbers. This result came to be
called the fundamental theorem of algebra, a very inappropriate name.
Euler proved it rigorously for polynomials of degree ≤ 6. J. R. D'Alembert
(1717-1783), Lagrange, P. S. Laplace (1749-1827) made attempts to
prove this statement. As Gauss criticized, their proofs actually assume
the existence of a root in some realm of numbers, and show that the
root is in fact in ℂ. Gauss himself gave several proofs, some of which
cannot be accepted as rigorous by modern standards. Nevertheless,
Gauss has the credit for having given the first valid demonstration of the
so-called fundamental theorem of algebra. After Kronecker established
in 1882 that any polynomial has a root in some realm of "numbers" (see
§51), the earlier attempts became rigorous proofs. The really
fundamental theorem is Kronecker's theorem.
This assures the existence of roots, but does not bring insight to the
problem of understanding the nature of roots any more than existence
theorems about differential equations give solutions of differential
equations or information about their analytic behavior, singularities,
asymptotic expansions, etc.
Cardan's book contains a method for finding roots of biquadratic poly-
nomials (that is, polynomials of degree four) discovered by his pupil L.
Ferrari (1522-1565) around 1540. This book made a great impact on the
development of algebra. Cardan even calculated with complex numbers,
which manifested themselves to be indispensable. Contrary to what one
may be at first inclined to believe, there was no need for complex
numbers as far as quadratic equations are concerned: mathematicians
had declared such equations as x² = −1 simply unsolvable. However, in
Cardan's formula, one has to take square roots of negative numbers even
if all the roots are real (the irreducible case). In fact, the roots of a cubic
polynomial whose three roots are real cannot be expressed by a formula
involving real radicals only (Lemma 59.30).
induction), but for n ≥ 5, solving the auxiliary equation is not easier
than solving the original equation.
Lagrange noted that, in the successful cases n ≤ 4, the resolvent has the
form r1 + ωr2 + . . . + ωⁿ⁻¹rn, where the ri are the roots of the polynomial and
ω is a root of xⁿ − 1. This type of resolvent does not work in case n = 5,
but it is conceivable that expressions of some other kind could work as
resolvents. Lagrange studied which type of expressions could be
resolvents.
radicals. What is the criterion for a polynomial equation to be solvable
by radicals? This question was resolved by the French mathematician
Évariste Galois (1811-1832). With Galois, the principal subject matter of
algebra definitely ceased to be polynomial equations. Galois marks the
beginning of modern algebra, which means the study of algebraic struc-
tures (groups, rings, vector spaces, fields, and many others).
* *
Galois had a short and dramatic life. He began publishing articles when
he was a pupil in Lycée (1828). He was a remarkable talent and a
difficult student. He wanted to enter the École Polytechnique, but failed
twice in the entrance examinations. The reason, he says later, was that
the questions were so simple that he refused to answer them. He later
entered the École Normale (1829), but was expelled from it due to a letter in
the student newspaper. His unbearable pride was notorious. He became
politicized, was sent to jail for some months, then began a liaison with
"une coquette de bas étage" and died in an obscure duel (1832).
it exists, ought to have an external character which can be verified by
inspecting the coefficients of a given equation or, all the better, by
solving other equations of degrees lower than that of the equation to be
solved."1 His is not a workable test that effectively decides if an
equation is solvable by radicals. Galois himself writes: "If now you give
me an equation that you have chosen at pleasure, and if you want to
know if it is or if it is not solvable by radicals, I need do nothing more
than indicate to you the means of answering your question, without
wanting to give myself or anyone else the task of doing it. In a word, the
calculations are impractical."2 But this is the whole point. Who cares
about solvability of polynomial equations? What Galois achieved, and
what his contemporaries failed to appreciate, is a fascinating parallel
between the group and field structures. The group-theoretical
solvability condition is at best a trivial application of the theory.
This was too big a change in algebra and in mathematics and heralded
the end of an era when mathematics was the science of numbers and
figures. Ever since the time of Gauss and Galois, mathematics is the
science of structures. Galois theory is the first mathematical theory that
compares two different structures: fields and groups. It was not easy to
follow this development. Even mathematicians of later generations
conceived Galois theory as a tool for answering certain questions in the
theory of equations. The first writer on Galois theory who clearly
differentiated between the theory and its applications is Heinrich Weber
(1842-1913). In his famous text-book on algebra (1894), the exposition
of the theory occupies one chapter, its applications another.
introduced composition series, proved that the composition factors in
any two composition series of a solvable group are isomorphic. The
group concept became central, but solving polynomial equations still
remained the major concern.
computations are eliminated from the theory. Where an earlier writer
would spend many pages on the step-by-step adjunction of resolvents
to construct a splitting field, we see Artin merely write: "Let E be a
splitting field of f(x)." With Artin, Galois theory lost all its connections
with its past. It is interesting to note Artin's attitude toward applications of
the theory to polynomial equations. In his book Galois Theory, applications
are harshly separated from the main text: they can be no more
than an appendix; but Artin does not even condescend to write the
appendix himself: this task is relegated to one of his students.
______________________________________________
1 Poisson, quoted from Kiernan's article (see References), page 76.
2 Galois, quoted from Edwards' book Galois Theory, page 81.
§48
Field Extensions
E/K
will mean that K is a subfield of E.
subring of E such that the nonzero elements in K form a commutative
group under multiplication. Certainly, every subgroup of E* = E\{0} is
commutative. Thus K is a subfield of E if and only if K is a subring of E
and K\{0} is a subgroup of E*. Now K is a subring of E if and only if
(i),(ii),(iii) hold and K\{0} is a subgroup of E* if and only if (iii)´
and (iv) hold. Since K ⊆ E and the field E has no zero divisors, (iii)´ is
weaker than (iii), and we conclude that K is a subfield of E if and only if
(i),(ii),(iii),(iv) hold.
1
From now on, we will write (or 1/b) for the inverse b 1 of a nonzero el-
b
a
ement in a field. Likewise, we will write or (a/b) for the product ab 1 =
b
1 1
b a of two elements a,b in a field (assuming b 0). It follows from
Lemma 48.2 that, whenever K is a subfield of E and a,b K, then
a
a + b, a b, ab,
b
belong to K, it being assumed b 0 in the last case. A subfield of E is
therefore a nonempty subset of E that is closed under addition, subtrac-
tion, multiplication and division (by nonzero elements).
(iii) ab = (xz − yu) + (xu + yz)i ∈ ℚ(i),
(iv) b⁻¹ = z/(z² + u²) − (u/(z² + u²))i ∈ ℚ(i), provided b = z + ui ≠
0 + 0i = 0.
So ℚ(i) is a subfield of ℂ. It is in fact the field of fractions of ℤ[i], and is
called the gaussian field.
From the last example, we infer that the intersection of all subfields of a
field K is a subfield of K. Note that the intersection is taken over a
nonempty set, since at least K is a subfield of K.
Thus every subfield of K contains (is an extension of) the prime subfield
of K. We want to describe the elements in the prime subfield of K. Let P
denote the prime subfield of K. In order to distinguish clearly between
the integer 1 and the identity element of K, we will denote in this
discussion the identity element of K as e. We know 0 ∈ P, e ∈ P and 0 ≠ e
because P is a field. Now P is a group under addition, so e + e = 2e, 2e + e
= 3e, 3e + e = 4e, . . . are elements of P, and also −e, −2e, −3e, −4e, . . . .
For any m,n ∈ ℤ, we have (m + n)φ = (m + n)e = me + ne = mφ + nφ (this is not
distributivity!) and (mn)φ = (mn)e = (me)(ne) = mφ.nφ (here (mn)e =
(me)(ne) is distributivity!), so φ is a ring homomorphism.
To prove P ≅ ℤp, we will find Ker φ. From pe = 0, we have p ∈ Ker φ, so
pn ∈ Ker φ for all n ∈ ℤ (because Ker φ is an ideal of ℤ) and pℤ ⊆ Ker φ.
On the other hand, if m ∈ Ker φ, we divide m by p to get m = qp + r, with
q,r ∈ ℤ and 0 ≤ r < p. This gives 0 = me = (qp + r)e = (qp)e + re = 0 + re.
As 0 ≤ r < p, this forces r = 0, which means m = qp and m ∈ pℤ. So we
get Ker φ ⊆ pℤ. Therefore Ker φ = pℤ. [A more conceptual argument: Ker φ
is an ideal of ℤ and ℤ is a principal ideal domain, so Ker φ = dℤ for some
d ∈ ℤ. We have d ≠ 0 in Case 1. From pe = 0 we get p ∈ Ker φ = dℤ, so
d | p. But p is a prime number, so d = 1 or d = p. The possibility d = 1 is
excluded, because 1e = e ≠ 0. Hence d = p and Ker φ = dℤ = pℤ.]
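The kernel computation can be illustrated in a small case. A brief sketch (an assumed toy check, with the hypothetical choice p = 7) realizes n ↦ ne as reduction mod p and confirms that the kernel is exactly pℤ:

```python
p = 7
e = 1                                # identity element of K = Z_7, represented mod p
phi = lambda n: (n * e) % p          # the map n |-> ne from Z into the prime subfield

# phi is a ring homomorphism and its kernel is exactly pZ:
rng = range(-3 * p, 3 * p + 1)
assert [n for n in rng if phi(n) == 0] == [n for n in rng if n % p == 0]
assert all(phi(m + n) == (phi(m) + phi(n)) % p for m in range(20) for n in range(20))
assert all(phi(m * n) == (phi(m) * phi(n)) % p for m in range(20) for n in range(20))
```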
First we show that ψ is well defined. If m/n = m´/n´ with m,n,m´,n´ ∈ ℤ
(n ≠ 0 ≠ n´), then mn´ = m´n in ℤ, so (mn´)e = (m´n)e in P, thus (me)(n´e)
= (m´e)(ne) in P. Multiplying both sides of this equation by (1/ne)(1/n´e) ∈ P,
we obtain me/ne = m´e/n´e. So ψ is well defined.
ψ is a ring homomorphism: for all m/n, r/s ∈ ℚ, with m,n,r,s ∈ ℤ, n ≠ 0 ≠ s,
we have
(m/n + r/s)ψ = ((ms + rn)/ns)ψ = (ms + rn)e/(ns)e = ((ms)e + (rn)e)/(ne.se)
= ((me)(se) + (re)(ne))/(ne.se) = me/ne + re/se = (m/n)ψ + (r/s)ψ
and
((m/n)(r/s))ψ = (mr/ns)ψ = (mr)e/(ns)e = (me)(re)/((ne)(se)) = (me/ne)(re/se) = (m/n)ψ.(r/s)ψ.
occasions. For example, we will write 2 instead of 2̄ ∈ ℤ5. A notation such
as "2" is therefore ambiguous: it stands for the integer 2 ∈ ℤ, as well as
2̄ ∈ ℤ2, as well as 2̄ ∈ ℤ3, as well as 2̄ ∈ ℤ5, etc. It will be clear from the
context, however, which meaning is accorded to "2". The ambiguity is
therefore harmless.
48.7 Lemma: If K is a field, then K and {0} are the only ideals of K.
If φ: K1 → K2 is a field isomorphism, then φ is a homomorphism of additive
groups, so 0₁φ = 0₂ and Ker φ = {0₁}, where 0₁ and 0₂ are
the zero elements of the fields K1, K2, respectively. Thus the restriction of φ
to K1\{0} is a one-to-one mapping from K1\{0} onto K2\{0}. In addition,
(ab)φ = aφ.bφ for all a,b in K1, so (ab)φ = aφ.bφ for all a,b ∈ K1\{0} and
therefore the restriction of φ is a one-to-one homomorphism of the group
K1\{0} onto K2\{0}: we have K1\{0} ≅ K2\{0}. In particular, 1₁φ = 1₂,
where 1₁ and 1₂ are the identities of the fields K1, K2, respectively.
(3) If φ: K1 → K2 is a field isomorphism, then φ⁻¹: K2 → K1 is a field
isomorphism.
48.11 Examples: (a) The conjugation mapping ℂ → ℂ, x ↦ x̄, is an
automorphism of ℂ, because
(x + y)‾ = x̄ + ȳ, (x − y)‾ = x̄ − ȳ, (xy)‾ = x̄.ȳ, (x/y)‾ = x̄/ȳ
for any x,y ∈ ℂ.
48.12 Definition: Let E/K be a field extension. The dimension of E over
K is called the degree of E over K, or the degree of the extension E/K.
It will prove convenient to write E:K instead of dimK E for the degree of
E over K. The field E is said to be a finite dimensional extension or an
infinite dimensional extension of K according as E:K is finite or infinite.
Most authors use the term "finite extension" for a finite dimensional
extension.
48.13 Theorem: Let F/E and E/K be field extensions of finite degrees
F:E and E:K. Then F/K is a finite dimensional extension. In fact
F:K = F:E · E:K.
Now the claim about the degree. Put F:E = r and E:K = s for brevity. We
are to prove that the dimension of F over K is equal to rs. Let {f1,f2, . . . ,fr}
be an E-basis of F and {e1,e2, . . . ,es} a K-basis of E. We are to find a K-basis
of F having exactly rs elements. The most natural thing to do is to consider
the rs products fiej. We contend that {fiej : i = 1,2, . . . ,r; j = 1,2, . . . ,s}
is a K-basis of F.
f = ∑i bifi = ∑i (∑j aijej)fi = ∑i,j aij(ejfi)
is a linear combination of the elements ejfi = fiej over K. Thus {fiej} spans F over K.
If ∑i,j bijfiej = 0 (bij ∈ K), then
∑i (∑j bijej)fi = 0,
where ∑j bijej ∈ E for each i. Since {fi : i = 1,2, . . . ,r} is linearly independent
over E, we have ∑j bijej = 0 for each i. Since {ej : j = 1,2, . . . ,s} is
linearly independent over K, we obtain bij = 0 for each i,j. Hence {fiej} is
linearly independent over K.
48.14 Lemma: Let F/E and E/K be field extensions. If F:K is finite,
then F:E and E:K are both finite. In fact, both of them are divisors of
F:K and F:K = F:E · E:K.
Proof: Let n = F:K and let {fi : i = 1,2, . . . ,n} be a basis of F over K. Then
{fi : i = 1,2, . . . ,n} spans F over E and so F:E ≤ n by Steinitz' replacement
theorem. Thus F:E is finite.
n + 1 K-linearly independent elements of F, contradicting F:K = n. Thus
E:K is finite.
Exercises
2. Let p be prime. Is ℤp² an extension of ℤp? Is ℤp³ an extension of ℤp²?
4. Let K be a field and let Aut(K) be the set of all field automorphisms of
K. Show that Aut(K) is a group under composition.
5. Find all automorphisms of ℚ, ℤp, ℚ(i), ℚ(∛2), ℚ(√5 i), ℚ(√2) (see Ex. 3).
9. Prove or disprove: If E/K1 and E/K2 are finite dimensional field extensions,
then E/(K1 ∩ K2) is finite dimensional, too.
10. Let K be a field and e the identity element of K. Show that char K = 0
or p according as the subring of K generated by e is isomorphic to ℤ or to
ℤp.
12. Let K be a field of characteristic p ≠ 0. Prove that φ: K → K,
a ↦ aᵖ, is a field homomorphism.
§49
Field Extensions (continued)
49.4 Example: In the extension ℂ/ℚ, let us find the subfield of ℂ generated
by i over ℚ. Any subfield of ℂ containing both ℚ and i contains
complex numbers of the form (a + bi)/(c + di), where a,b,c,d ∈ ℚ and c + di ≠ 0.
One verifies easily that F = {(a + bi)/(c + di) : a,b,c,d ∈ ℚ, c + di ≠ 0} is a
subfield of ℂ containing both ℚ and i. Hence F is the subfield of ℂ
generated by i over ℚ.
Let us note that any element of F can be written in the form x + yi, with
x,y ∈ ℚ. Thus {x + yi : x,y ∈ ℚ} = F and F is equal to the field ℚ(i) defined
in Example 48.3(c). So the notation of Example 48.3(c) is consistent
with that of Definition 49.2.
49.5 Lemma: Let E/K be a field extension and a1,a2, . . . ,an ∈ E. Then
(1) K[a1,a2, . . . ,an] = {f(a1,a2, . . . ,an) ∈ E : f ∈ K[x1,x2, . . . ,xn]};
(2) K(a1,a2, . . . ,an)
= {f(a1,a2, . . . ,an)/g(a1,a2, . . . ,an) ∈ E : f,g ∈ K[x1,x2, . . . ,xn], g(a1,a2, . . . ,an) ≠ 0}.
Proof: (1) Let A be the set on the right hand side of the equation in (1).
Any subring of E containing K and {a1,a2, . . . ,an} will contain the elements
of the form k a1^m1 a2^m2 . . . an^mn, where k ∈ K and m1,m2, . . . ,mn are nonnegative
integers, hence also the elements of the form
∑ k_{m1m2. . .mn} a1^m1 a2^m2 . . . an^mn,    (*)
that is, the values of the polynomials in K[x1,x2, . . . ,xn]
at (a1,a2, . . . ,an). So every element of A is in any subring of E containing K
and {a1,a2, . . . ,an}. This gives A ⊆ K[a1,a2, . . . ,an]. To prove the reverse
inclusion, it suffices, in view of K ∪ {a1,a2, . . . ,an} ⊆ A ⊆ E, to show that A
is a subring of E. But this is immediate: given any f(a1,a2, . . . ,an) and
g(a1,a2, . . . ,an) ∈ A, where f,g ∈ K[x1,x2, . . . ,xn], we have
f(a1,a2, . . . ,an) + g(a1,a2, . . . ,an) = (f + g)(a1,a2, . . . ,an) ∈ A,
−g(a1,a2, . . . ,an) = (−g)(a1,a2, . . . ,an) ∈ A,
f(a1,a2, . . . ,an)g(a1,a2, . . . ,an) = (fg)(a1,a2, . . . ,an) ∈ A,
since f + g, −g, fg belong to K[x1,x2, . . . ,xn] whenever f,g do. Thus A is a
subring of E by the subring criterion (Lemma 30.2). This proves
K[a1,a2, . . . ,an] = A.
(2) The reasoning is similar. Let B be the set on the right hand side of
the equation in (2). Clearly A ⊆ B. Note that B = {b/c ∈ E : b,c ∈ A, c ≠ 0} =
{bc⁻¹ ∈ E : b,c ∈ A, c ≠ 0}. Any subfield of E containing K and {a1,a2, . . . ,an}
will contain K[a1,a2, . . . ,an] = A and, since a subfield is closed under
division, it will contain also the elements b/c, where b,c ∈ A and c ≠ 0.
This means that B is contained in any subfield of E containing K and
{a1,a2, . . . ,an}. Hence B ⊆ K(a1,a2, . . . ,an). To prove the reverse inclusion, it
suffices, in view of K ∪ {a1,a2, . . . ,an} ⊆ B ⊆ E, to show that B is a subfield
of E. Indeed, given any b/c, d/e ∈ B, where b,c,d,e ∈ A, c,e ≠ 0, we have
b/c + d/e = (be + dc)/ce ∈ B,
−(d/e) = (−d)/e ∈ B,
(b/c)(d/e) = bd/ce ∈ B,
1/(d/e) = e/d ∈ B (provided d/e ≠ 0, i.e., d ≠ 0),
since be + dc, ce, −d, bd, e, d belong to A whenever b,c,d,e do and ce ≠ 0
whenever c ≠ 0 ≠ e (A is a subring of the field E and has therefore no
zero divisors). Thus B is a subfield of E by the subfield criterion (Lemma
48.2). This proves K(a1,a2, . . . ,an) = B.
Let us take a new look at Example 49.4 in the light of Lemma 49.5.
The field F in Example 49.4 is exactly the field described in Lemma
49.5, with K = ℚ, n = 1, a1 = i. On the other hand, the field
{x + yi : x,y ∈ ℚ} is exactly the subring of ℂ described in Lemma
49.5, with K = ℚ, n = 1, a1 = i. Thus we have ℚ(i) = ℚ[i]. The reader
will easily verify that ℚ(√2) = ℚ[√2] also (cf. Theorem 50.6).
(3) Using part (2) twice, we get (K(a))(b) = K(a,b) = K(b,a) = (K(b))(a) and
similarly [K[a]][b] = K[a,b] = K[b,a] = [K[b]][a].
We introduce a very important classification of field extensions:
algebraic vs. transcendental extensions. They behave very differently.
49.8 Examples: (a) Let K be any field. Then, for any element a ∈ K, the
polynomial fa(x) := x − a is in K[x], and a is a root of fa. Thus any element
of K is algebraic over K, and K is an algebraic extension of K.
(e) Let K be a field and x an indeterminate over K. Then K(x) is an
extension field of K and x ∈ K(x). If f is any nonzero polynomial in K[x],
then f(x) = f ≠ 0 (Example 35.2(d)). Thus x is transcendental over K and
K(x)/K is a transcendental extension.
Likewise f(x²) ≠ 0 for any nonzero polynomial f in K[x] and x² is
transcendental over K. On the other hand, if y is another indeterminate
over K, then x is a root of the polynomial y² − x² ∈ (K(x²))[y], so x is
algebraic over K(x²). Thus an element may be transcendental over a
field and algebraic over another field.
We close this paragraph with a theorem that describes all simple tran-
scendental extensions up to isomorphism. Simple algebraic extensions
will be treated in the next paragraph.
We claim that σ is well defined. Indeed, if f/g = f1/g1 in K(x), where
f,g,f1,g1 ∈ K[x] and g ≠ 0 ≠ g1, then fg1 = f1g in K[x] and, by Lemma 35.3,
f(a)g1(a) = f1(a)g(a) in E, with g1(a) ≠ 0 ≠ g(a); multiplying this equation
by 1/g1(a)g(a), we obtain
(f/g)σ = f(a)/g(a) = f1(a)/g1(a) = (f1/g1)σ,
which shows that σ is well defined.
Exercises
3. Let E/K be a field extension and S ⊆ E. Show that K(S) = K if and only if
S ⊆ K.
10. Prove or disprove: if E/K is a field extension and a,b ∈ E are
transcendental over K, then K(a,b) ≅ K(x,y), where x,y are indeterminates
over K.
§50
Algebraic Extensions
Let E/K be a field extension and let a ∈ E be algebraic over K. Then there
is a nonzero polynomial f in K[x] such that f(a) = 0. Hence the subset A =
{f ∈ K[x] : f(a) = 0} of K[x] does not consist only of 0. We observe that A is
an ideal of K[x], because A is the kernel of the substitution homomorphism
Ta: K[x] → E.
for all f(x) ∈ K[x], f(a) = 0 if and only if g(x) | f(x) in K[x].
In particular, a is a root of g(x) and g(x) has the smallest degree among
the nonzero polynomials in K[x] admitting a as a root. Moreover, g(x) is
irreducible over K.
Proof: We must show only that h(x) divides any polynomial f(x) ∈ K[x]
having a as a root. Let f(x) be a polynomial in K[x] and assume that a is a
root of f(x). We divide f(x) by h(x) and get
f(x) = q(x)h(x) + r(x),  r(x) = 0 or deg r(x) < deg h(x),
with suitable q(x),r(x) ∈ K[x]. Substituting a for x, we obtain
0 = f(a) = q(a)h(a) + r(a) = q(a)0 + r(a) = r(a).
If r(x) were distinct from the zero polynomial in K[x], then the irreducible
polynomial h(x) would have a common root a with the polynomial
r(x), whose degree is smaller than the degree of h(x). This is impossible
by Theorem 35.18(4). Hence r(x) = 0 and f(x) = q(x)h(x). Therefore
h(x) | f(x) for any polynomial f(x) ∈ K[x] having a as a root, as was to be
proved.
50.4 Examples: (a) Let us find the minimal polynomial of i over
ℚ. Since i is a root of the polynomial x² + 1 ∈ ℚ[x], which is monic and
irreducible over ℚ, Theorem 50.3 tells us that x² + 1 is the minimal
polynomial of i over ℚ. In the same way, we see that x² + 1 ∈ ℝ[x] is
the minimal polynomial of i over ℝ. On the other hand, x² + 1 ∈ (ℚ(i))[x]
is not irreducible over ℚ(i), because x² + 1 = (x − i)(x + i) in (ℚ(i))[x]. Now
x − i is a monic irreducible polynomial in (ℚ(i))[x] having i as a root, and
thus x − i is the minimal polynomial of i over ℚ(i).
The irreducibility of f(x) of degree four over ℚ could be proved by
showing the irreducibility of another polynomial, of degree less than
four, over a field larger than ℚ. As this gives a deeper insight into the
problem at hand, we will discuss this method. The equation (u) states
that √2 + √3 is a root of the polynomial f2(x) = x² − 2√2x − 1 ∈
(ℚ(√2))[x]. Let g(x) ∈ (ℚ(√2))[x] be the minimal polynomial of √2 + √3
over ℚ(√2). Then g(x) | f2(x) in (ℚ(√2))[x] and, if g(x) ≠ f2(x), then deg
g(x) would be one and g(x) would be x − (√2 + √3), since the latter is the
unique monic polynomial of degree one having √2 + √3 as a root. But
g(x) ∈ (ℚ(√2))[x] and this would imply √2 + √3 ∈ ℚ(√2), so √3 ∈ ℚ(√2),
so √3 = m + n√2 with suitable m,n ∈ ℚ, where certainly m ≠ 0 ≠ n, so 3 =
m² + 2√2mn + 2n², so √2 = (3 − m² − 2n²)/2mn would be a rational
number, a contradiction. Thus f2(x) = g(x) is the minimal polynomial of √2
+ √3 over ℚ(√2).
Now the irreducibility of f(x) over ℚ follows very easily. f(x) has no
factor of degree one in ℚ[x]. If f(x) had a factorization (e) in ℚ[x], where
a,b,c,d are rational numbers (not necessarily integers), then √2 + √3
would be a root of one of the factors on the right hand side of (e), say of
x² + ax + b. But then x² + ax + b, being a polynomial in (ℚ(√2))[x] having
√2 + √3 as a root, would be divisible, in (ℚ(√2))[x], by the minimal
polynomial f2(x) = x² − 2√2x − 1 of √2 + √3 over ℚ(√2). Comparing
degrees and leading coefficients, we would obtain x² − 2√2x − 1 = x² + ax
+ b, so −2√2 = a ∈ ℚ, a contradiction. Hence f(x) is irreducible over ℚ.
The next lemma crystallizes the argument employed in the last example.
We proceed to describe simple algebraic extensions. Let us recall that
we found ℚ[i] = ℚ(i). This situation obtains whenever we consider a
simple extension generated by an algebraic element.
It remains to show K(a) = K[a]. Since K[a] ⊆ K(a), we must prove only
K(a) ⊆ K[a]. To this end, we need only prove that 1/g(a) ∈ K[a] for any
g(x) ∈ K[x] with g(a) ≠ 0 (Lemma 49.5). Indeed, if g(x) ∈ K[x] and g(a) ≠ 0,
then f ∤ g and, since f is irreducible in K[x], the polynomials f(x) and g(x)
are relatively prime in K[x] (Theorem 35.18(3)). Thus there are polynomials
r(x), s(x) in K[x] such that
f(x)r(x) + g(x)s(x) = 1.
Substituting a for x and using f(a) = 0, we obtain g(a)s(a) = 1. Hence
1/g(a) = s(a) ∈ K[a]. This proves K[a] = K(a). (Another proof. Since K[x] is
a principal ideal domain and f is irreducible in K[x], the factor ring
K[x]/(f) is a field by Theorem 32.25; thus K[a], being a ring isomorphic to
the field K[x]/(f), is a subfield of E, and K[a] contains K and a. So K(a) ⊆
K[a] and K(a) = K[a].)
in a unique way.
50.9 Examples: (a) The minimal polynomial of i over ℚ is the
polynomial x² + 1 in ℚ[x] (Example 50.4(a)), and x² + 1 has degree 2.
Thus i is (algebraic and) has degree 2 over ℚ. Likewise, the minimal
polynomial of i over ℝ is x² + 1 ∈ ℝ[x] and i has degree 2 over ℝ.
ℚ(√2 + √3)
    |    x² − 2√2x − 1, degree 2
ℚ(√2)                 (x⁴ − 10x² + 1, degree 4 over ℚ)
    |    x² − 2, degree 2
ℚ
√2 = −(9/2)(√2 + √3) + (1/2)(√2 + √3)³, so √2 ∈ ℚ(√2 + √3) and therefore
ℚ(√2) ⊆ ℚ(√2 + √3). Thus ℚ(√2) is an intermediate field of the
extension ℚ(√2 + √3)/ℚ. From Theorem 48.13, we infer that
ℚ(√2 + √3):ℚ(√2) = 2
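The degrees in this tower can be double-checked with a computer algebra system; the following sketch assumes sympy is available:

```python
from sympy import sqrt, symbols, minimal_polynomial, degree

x = symbols('x')
# minimal polynomial of sqrt(2) + sqrt(3) over Q: degree 4
mp = minimal_polynomial(sqrt(2) + sqrt(3), x)
assert mp == x**4 - 10*x**2 + 1
assert degree(mp, x) == 4
# minimal polynomial of sqrt(2) over Q: degree 2, so the tower gives 4 = 2 * 2
assert minimal_polynomial(sqrt(2), x) == x**2 - 2
```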
(d) Likewise, if E/K is a field extension and a ∈ E, and if a is algebraic
over K with the minimal polynomial xⁿ + cn−1xⁿ⁻¹ + cn−2xⁿ⁻² + . . . + c1x + c0
over K, so that
aⁿ = −cn−1aⁿ⁻¹ − cn−2aⁿ⁻² − . . . − c1a − c0,
then K(a) consists of the elements
k0 + k1a + . . . + kn−2aⁿ⁻² + kn−1aⁿ⁻¹  (k0,k1, . . . ,kn−2,kn−1 ∈ K)
and computations are carried out in K(a) just as though a were an indeterminate
over K, replacing aⁿ by −cn−1aⁿ⁻¹ − cn−2aⁿ⁻² − . . . − c1a − c0
wherever it occurs.
t + u = 2 + 2a + 5a³ ∈ ℚ(a)
and tu = (2 + a − a² + 3a³)(a + a² + 2a³)
= 2a + 2a² + 4a³ + a² + a³ + 2a⁴ − a³ − a⁴ − 2a⁵ + 3a⁴ + 3a⁵ + 6a⁶
= 2a + 3a² + 4a³ + 4a⁴ + a⁵ + 6a⁶
= 2a + 3a² + 4a³ + 4(10a² − 1) + a(10a² − 1) + 6a²(10a² − 1)
= 2a + 3a² + 4a³ + 40a² − 4 + 10a³ − a + 60(10a² − 1) − 6a²
= −64 + a + 637a² + 14a³ ∈ ℚ(a).
so that 1 = (x² + x + 1) − ((1/11)x)(11x + 11)
= (x² + x + 1) − ((1/11)x)[(x⁴ − 10x² + 1) − (x² − x − 10)(x² + x + 1)]
= (x² + x + 1)(1 + (1/11)x(x² − x − 10)) − ((1/11)x)(x⁴ − 10x² + 1),
1 = (x² + x + 1)((1/11)x³ − (1/11)x² − (10/11)x + 1) − ((1/11)x)(x⁴ − 10x² + 1)
and, substituting a for x, we get
1 = (a² + a + 1)((1/11)a³ − (1/11)a² − (10/11)a + 1),
1/(a² + a + 1) = (1/11)a³ − (1/11)a² − (10/11)a + 1.
11 11
Notice that a is treated here merely as a symbol that satisfies the rela-
tion a 4 10a 2 + 1 = 0. The numerical value of a = 2 + 3 = 3.14626337. .
. as a real number is totaly ignored. This is algebra, the calculus of
symbols. This allows enormous flexibility: we can regard a as an element
in any extension field E of in which the polynomial x4 10x2 + 1 has a
root. This idea will be pursued in the next paragraph.
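The reduction rule a⁴ = 10a² − 1 is easy to mechanize. The following sketch (illustrative code, not from the book) represents elements of ℚ(a) by coefficient lists [c0, c1, c2, c3] and reproduces the two computations above:

```python
from fractions import Fraction

def mul(u, v):
    """Multiply c0 + c1*a + c2*a^2 + c3*a^3 elements, reducing with a^4 = 10*a^2 - 1."""
    prod = [Fraction(0)] * 7
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] += Fraction(ui) * Fraction(vj)
    for k in range(6, 3, -1):          # a^k = a^(k-4) * (10*a^2 - 1)
        prod[k - 2] += 10 * prod[k]
        prod[k - 4] -= prod[k]
        prod[k] = Fraction(0)
    return prod[:4]

t = [2, 1, -1, 3]                      # t = 2 + a - a^2 + 3a^3
u = [0, 1, 1, 2]                       # u = a + a^2 + 2a^3
assert mul(t, u) == [-64, 1, 637, 14]  # tu = -64 + a + 637a^2 + 14a^3

g = [1, 1, 1, 0]                       # a^2 + a + 1
s = [1, Fraction(-10, 11), Fraction(-1, 11), Fraction(1, 11)]
assert mul(g, s) == [1, 0, 0, 0]       # s is the inverse found above
```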
As a separate lemma, we record the fact that the polynomial g(x) in the
preceding proof has degree n.
50.11 Lemma: Let E/K be a field extension of degree E:K = n . Then
every element of E is algebraic over K and has degree over K at most
equal to n.
50.12 Theorem: Let E/K be a field extension and let a1,a2, . . . ,an−1,an be
finitely many elements in E. Suppose that a1,a2, . . . ,an−1,an are algebraic
over K. Then K(a1,a2, . . . ,an−1,an) is an algebraic extension of K. In fact,
K(a1,a2, . . . ,an−1,an) is a finite dimensional extension of K and
K(a1,a2, . . . ,an−1,an):K ≤ K(a1):K · K(a2):K · . . . · K(an):K.
50.13 Lemma: Let E/K be a field extension and a,b ∈ E. If a and b are
algebraic over K, then a + b, a − b, ab and a/b (in case b ≠ 0) are
algebraic over K.
50.14 Theorem: Let E/K be a field extension and let A be the set of all
elements of E which are algebraic over K. Then A is a subfield of E (and
an intermediate field of the extension E/K).
50.15 Definition: Let E/K be a field extension and let A be the subfield
of E in Theorem 50.14 consisting exactly of the elements of E which are
algebraic over K. Then A is called the algebraic closure of K in E.
Proof: We must show that every element of F is algebraic over K. Let
u ∈ F. Since F is algebraic over E, its element u is algebraic over E, and
there is a nonzero polynomial f(x) ∈ E[x] with f(u) = 0, say
f(x) = e0 + e1x + . . . + enxⁿ.
We put L = K(e0,e1, . . . ,en). Then clearly f(x) ∈ L[x]. Since E is algebraic
over K, each of e0,e1, . . . ,en is algebraic over K and Theorem 50.12 tells us
that L/K is finite dimensional. Also, since f(u) = 0 and f(x) ∈ L[x], we see
that u is algebraic over L and Theorem 50.7 tells us that L(u)/L is finite
dimensional. So L(u):K = L(u):L · K(e0,e1, . . . ,en):K is a finite number: L(u)
is a finite dimensional extension of K. By Theorem 50.10, L(u) is an algebraic
extension of K. So every element of L(u) is algebraic over K. In
particular, since u ∈ L(u), we see that u is algebraic over K. Since u is an
arbitrary element of F, we conclude that F is an algebraic extension of K.
Exercises
1. Find the minimal polynomials of the following numbers over the
fields indicated.
(a) √2 over ℚ, ℚ(√2), ℚ(√3).
(b) ∛2 over ℚ, ℚ(√2), ℚ(√3).
(c) √2 + √3 + √5 over ℚ, ℚ(√2), ℚ(√3), ℚ(√2 + √5).
(d) √2 + ∛2 over ℚ, ℚ(√2), ℚ(∛2), ℚ(⁴√2).
(e) ∛2 + √3 over ℚ, ℚ(√2), ℚ(∛2), ℚ(√2 + √3).
(f) √(3 + √2) over ℚ, ℚ(√2).
(g) ∛(1 + √2) over ℚ, ℚ(√2), ℚ(i).
(h) ∛(1 − √2) over ℚ, ℚ(√2), ℚ(i).
(j) ∛(1 + √2) + ∛(1 − √2) over ℚ, ℚ(√2), ℚ(i).
(c) Let f(x) and g(x) be monic polynomials in ℚ[x]. If f(x)g(x) ∈ ℤ[x],
then f(x) and g(x) are in ℤ[x]. (Hint: consider contents.)
(d) If u is an algebraic integer, then the minimal polynomial of
u over ℚ is in fact a polynomial in ℤ[x].
§51
Kronecker's Theorem
(a + b√−1)(c + d√−1) = ac + ad√−1 + bc√−1 + bd√−1√−1
= ac + bd(√−1)² + (ad + bc)√−1
= (ac − bd) + (ad + bc)√−1,
It is now clear what to do in the general case. Given a field K and an irreducible
polynomial f(x) in K[x], to find a root of f(x), we invent a symbol
u, subject it to the condition f(u) = 0 and consider the K-vector space
with the K-basis 1,u,u², . . . ,uⁿ⁻¹, where n = deg f(x) and 1,u,u², . . . ,uⁿ⁻¹ are
computational symbols. We multiply the elements of this K-vector space
by treating u as an indeterminate over K and writing 0 for f(u) wherever
f(u) occurs. The rigorous method of doing this is to consider the factor
ring K[x]/(f), as suggested by Theorem 50.6.
51.1 Theorem (Kronecker's theorem): Let K be a field and f(x) an
irreducible polynomial in K[x]. Then there is an extension field E of K
such that f(x) has a root in E.
Proof: Let E = K[x]/(f), the factor ring of K[x] modulo the principal ideal
generated by f(x) in K[x]. Since K[x] is a principal ideal domain and f(x) is
irreducible in K[x], the factor ring E = K[x]/(f) is a field (Theorem 32.25).
The mapping φ: K → E
k ↦ k + (f)
is a ring homomorphism because
(k1 + k2)φ = (k1 + k2) + (f) = (k1 + (f)) + (k2 + (f)) = k1φ + k2φ
and
(k1k2)φ = k1k2 + (f) = (k1 + (f))(k2 + (f)) = k1φ.k2φ
for any k1,k2 ∈ K. Since f(x) is irreducible in K[x], it is not a unit in K[x],
thus 1φ = 1 + (f) ≠ 0 + (f) and φ is one-to-one by Lemma 48.8. So φ is a
field homomorphism. We identify K with its image Kφ in E. So we will
write k instead of k + (f) when k ∈ K. In this way, we regard K as a
subfield of E and E as an extension field of K.
Let us keep the notation of the preceding proof. Clearly K(u) ⊆ E. Also,
any element of E has the form c0 + c1x + c2x² + . . . + cmxᵐ + (f), and thus
equals
c0 + c1(x + (f)) + c2(x + (f))² + . . . + cm(x + (f))ᵐ = c0 + c1u + c2u² + . . . + cmuᵐ
and belongs to K(u). So E ⊆ K(u). This shows that E = K(u) is a simple
extension of K.
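The construction K[x]/(f) is concrete enough to program. A sketch for the illustrative choice K = ℤ2 and f = x² + x + 1 (irreducible over ℤ2), where residue classes are represented by remainders c0 + c1x, i.e. pairs (c0, c1), and u = x + (f) satisfies f(u) = 0:

```python
P = 2   # K = Z_2; E = K[x]/(x^2 + x + 1) has P^2 = 4 elements

def mul(a, b):
    # (a0 + a1*u)(b0 + b1*u) = a0*b0 + (a0*b1 + a1*b0)*u + a1*b1*u^2, with u^2 = u + 1
    c0 = a[0]*b[0] + a[1]*b[1]
    c1 = a[0]*b[1] + a[1]*b[0] + a[1]*b[1]
    return (c0 % P, c1 % P)

u = (0, 1)
assert mul(u, u) == (1, 1)            # u^2 = 1 + u, i.e. f(u) = u^2 + u + 1 = 0
elements = [(a, b) for a in range(P) for b in range(P)]
# every nonzero element has an inverse, so E is a field (of degree deg f = 2 over K)
for a in [e for e in elements if e != (0, 0)]:
    assert any(mul(a, b) == (1, 0) for b in elements)
```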
σ⁻¹: K(u) → K(t)
g(u) ↦ g(t)
51.4 Remark: Let K be a field and let f(x) ∈ K[x] be an irreducible
polynomial in K[x]. Suppose K(u) is the field obtained by adjoining a root
u of f(x) to K. Let c be the leading coefficient of f(x). From Theorem 50.3,
we learn that (1/c)f(x) is the minimal polynomial of u over K. Then it
follows from Theorem 50.7 that K(u):K = deg (1/c)f(x) = deg f(x): the
degree over K of the field obtained by adjoining to K a root of an irreducible
polynomial f(x) ∈ K[x] is equal to the degree of f(x).
Proof: From f(x) ∉ K, we know that f(x) is neither the zero polynomial
nor a unit in K[x]. As K[x] is a unique factorization domain, we can
decompose f(x) into irreducible polynomials, and adjoin a root of one of
the irreducible divisors of f(x) to K. The field E obtained in this way will
have a root of (that irreducible divisor of f(x), hence also of) f(x).
Moreover, E:K will be equal to the degree of that irreducible divisor of
f(x), hence will be smaller than or equal to deg f(x) = n.
ℤ5(u). We keep in mind of course that √2 is just another name for our
computational device u: here √2 is not the real number 1.414. . . whose
square is the real number 2.
Let us express (1 + 2√2)(3 + √2) and (4 + √2)⁻¹ in terms of the ℤ5-basis
{1, √2}.
(1 + 2√2)(3 + √2) = 3 + √2 + 6√2 + 2·2 = (3 + 4) + (1 + 6)√2 = 2 + 2√2,
1/(4 + √2) = (4 − √2)/((4 + √2)(4 − √2)) = (4 − √2)/(16 − 2) = (4 − √2)/14 = (1/4)(4 − √2)
= 4(4 − √2) = 16 − 4√2 = 1 + √2.
Check: (4 + √2)(1 + √2) = 4 + 4√2 + √2 + 2 = 6 + 5√2 = 1.
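In coordinates, the arithmetic of ℤ5(√2) is (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2, with everything read mod 5. A brief sketch reproducing the two computations above (pairs (a, b) stand for a + b√2):

```python
P = 5   # Z_5(sqrt(2)): pairs (a, b) meaning a + b*sqrt(2), with (sqrt(2))^2 = 2

def mul(x, y):
    a, b = x
    c, d = y
    return ((a*c + 2*b*d) % P, (a*d + b*c) % P)

assert mul((1, 2), (3, 1)) == (2, 2)   # (1 + 2*sqrt2)(3 + sqrt2) = 2 + 2*sqrt2
assert mul((4, 1), (1, 1)) == (1, 0)   # check: (4 + sqrt2)(1 + sqrt2) = 1
```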
Note that σ: ℤ5(√2) → ℤ5(√2), a + b√2 ↦ a − b√2, is an automorphism of ℤ5(√2), because
[(a + b√2) + (c + d√2)]σ = [(a + c) + (b + d)√2]σ
= (a + c) − (b + d)√2
= (a − b√2) + (c − d√2)
= (a + b√2)σ + (c + d√2)σ
and [(a + b√2)(c + d√2)]σ = [(ac + 2bd) + (ad + bc)√2]σ
= (ac + 2bd) − (ad + bc)√2
= (ac + 2(−b)(−d)) + (a(−d) + (−b)c)√2
= (a − b√2)(c − d√2)
= (a + b√2)σ.(c + d√2)σ
for all a + b√2, c + d√2 ∈ ℤ5(√2); and σ is clearly onto and Ker σ ≠
ℤ5(√2). By the binomial theorem (Theorem 29.16),
(a + b√2)⁵ = a⁵ + 5a⁴b√2 + 10a³b²·2 + 10a²b³·2√2 + 5ab⁴·4 + b⁵·4√2
= a⁵ + 4b⁵√2 = a + 4b√2 = a − b√2
for all a + b√2 ∈ ℤ5(√2). Thus σ can also be described as
σ: ℤ5(√2) → ℤ5(√2)
t ↦ t⁵.
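That σ is the fifth-power map can be verified element by element; the following sketch checks t⁵ = tσ for all 25 elements of ℤ5(√2):

```python
P = 5   # Z_5(sqrt(2)): pairs (a, b) meaning a + b*sqrt(2)

def mul(x, y):
    a, b = x
    c, d = y
    return ((a*c + 2*b*d) % P, (a*d + b*c) % P)

def power(x, n):
    r = (1, 0)
    for _ in range(n):
        r = mul(r, x)
    return r

# sigma(a + b*sqrt2) = a - b*sqrt2 coincides with the 5th-power (Frobenius) map:
for a in range(P):
    for b in range(P):
        assert power((a, b), 5) == (a, (-b) % P)
```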
(3 + 2√3)(1 + 4√3) = 3 + 12√3 + 2√3 + 8·3 = 27 + 14√3 = 2 + 4√3,
(2 + 3√3)(2 + 4√3) = 4 + 8√3 + 6√3 + 12·3 = 40 + 14√3 = 4√3,
622
1 1 1 3 3 1 3 3 1+2 3 1
= = = = (1 + 2 3)
1+3 3 1+3 3 1 3 3 1 27 4 4
= 4(1 + 2 3) = 4 + 3 3.
As 8 = 3 in ℤ₅, we may also write √8 for √3, with the understanding that √8 ∈ ℤ₅(√3) is a computational device satisfying (√8)² = 8 = 3. Here √8 is not the real number 2.828. . . whose square is 8. We might be tempted to write √8 = √(4·2) = 2√2. For the time being, this is not legitimate: as √8 ∈ ℤ₅(√3) and 2√2 ∈ ℤ₅(√2) are in different fields, and not in their intersection ℤ₅, it is not meaningful to write √8 = 2√2.
Nevertheless, ℤ₅(√3) and ℤ₅(√2) are isomorphic fields. We identify these two fields by the isomorphism a + b√3 ↦ a + 2b√2, i.e., by declaring a + b√3 = a + 2b√2 for all a,b ∈ ℤ₅. Then, but only then, can we write √3 = 2√2.
When we identify ℤ₅(√3) and ℤ₅(√2) by declaring a + b√3 = a + 2b√2 for all a,b ∈ ℤ₅, we can no longer interpret √18, for example, merely as a computational device whose square is 18 ∈ ℤ₅, for there are two elements in ℤ₅(√3) = ℤ₅(√2) whose squares are 18, viz. 2√2 and −2√2 = 3√2. We must specify which of 2√2, 3√2 we mean by √18. Otherwise we might commit such mistakes as
3√2 = √(9·2) = √18 = √(4·2) = 2√2 in ℤ₅(√2),
which resembles the mistake
−7 = √((−7)²) = √49 = 7 in ℝ.
In ℝ, there are two numbers whose squares are 49, namely 7 and −7, and √49 is understood to be the positive one of the numbers 7, −7. Thus when we write √49, we specify which of 7, −7 we mean by √49. This prevents the mistake −7 = √((−7)²). In ℤ₅(√2), specifying 2√2 or 3√2 as √18 prevents the mistake 3√2 = 2√2.
Exercises
3. Prove that ℤ₅* and ℤ₅(√2)* are cyclic.
§52
Finite Fields
We have seen some examples of finite fields, i.e., fields with finitely
many elements. In this paragraph, we want to discuss some properties
of finite fields.
By the proof of Lemma 52.1, we know that a field with pn elements is of
characteristic p. We prove two lemmas about (not necessarily finite)
fields of prime characteristic.
(a + b)^p = a^p + ∑_{k=1}^{p−1} (p choose k)a^{p−k}b^k + b^p = a^p + 0 + b^p = a^p + b^p
since p | (p choose k) and char K = p imply that (p choose k)a^{p−k}b^k = 0 for k = 1,2, . . . ,p − 1.
This proves (a + b)p = a p + bp. The claim (ab)p = a pbp follows from Lemma
8.14(1).
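The divisibility of the binomial coefficients that drives this lemma is easy to check numerically; a small sketch (the choice p = 7 is mine):

```python
# In characteristic p, (a + b)^p = a^p + b^p because p divides C(p, k) for 0 < k < p.

from math import comb

p = 7
assert all(comb(p, k) % p == 0 for k in range(1, p))          # p | C(p, k)
assert all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
           for a in range(p) for b in range(p))
print("(a + b)^p = a^p + b^p holds mod", p)
```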
and it is true for n = k + 1 also. Hence it is true for all n ∈ ℕ.
(3) x^q − x = ∏_{a∈K} (x − a) in K[x].
Every element a of K is a root of x^q − x, and x^q − x, being of degree q, cannot have more than q roots in K by Theorem 35.7. So x^q − x = ∏_{a∈K} (x − a).
(4) We put x^q − x = f(x)g(x), with g(x) ∈ K[x]. Then deg g(x) = q − d. The roots of x^q − x are pairwise distinct by part (3) and, since any root of f(x) is also a root of x^q − x, we see that the roots of f(x), too, are pairwise distinct. Likewise the roots of g(x) are pairwise distinct. Now g(x) has at most q − d roots in K (Theorem 35.7). If f(x) had r roots in K and r < d, then x^q − x = f(x)g(x) would have at most r + (q − d) < q roots in K, contrary to the fact that all q elements of K are roots of x^q − x. Thus f(x) has at least d roots in K. But it can have at most d roots in K by Theorem 35.7. Hence f(x) has exactly d roots in K.
52.5 Lemma: Let L/K be a field extension and assume that K has q elements, q ∈ ℕ. Let b be an element of L. Then b ∈ K if and only if b^q = b.
The last two lemmas will now be employed to get information about the subfields of a finite field. If K1 ⊆ K2 are finite fields, with p^m1 and p^m2 elements, respectively, then K1* is a subgroup of K2*, hence p^m1 − 1 = |K1*| divides |K2*| = p^m2 − 1 by Lagrange's theorem. We proceed to show that this happens if and only if m1 divides m2.
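The divisibility pattern being claimed can be seen by brute force before it is proved; a sketch (the ranges chosen are mine):

```python
# Check: p^m1 - 1 divides p^m2 - 1 exactly when m1 divides m2.

p = 2
for m1 in range(1, 7):
    for m2 in range(1, 13):
        divides = (p**m2 - 1) % (p**m1 - 1) == 0
        assert divides == (m2 % m1 == 0)
print("p^m1 - 1 | p^m2 - 1  iff  m1 | m2  (checked for p = 2)")
```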
52.7 Lemma: Let m,n,p ∈ ℕ and let K be a field and x an indeterminate over K.
(1) For any k ∈ ℕ with k ≥ 2, we have k^m − 1 | k^n − 1 if and only if m | n.
(2) In the polynomial ring K[x], we have x^m − 1 | x^n − 1 if and only if m | n.
(3) In the polynomial ring K[x], we have x^{p^m} − x | x^{p^n} − x if and only if m | n.
polynomial x^{p^m} − x has exactly p^m roots (and these are pairwise distinct) (Lemma 52.4(4)) and the roots of x^{p^m} − x are precisely the elements in K1. Hence K1 has indeed p^m elements.
(Diagram: the lattice of subfields of K4096 = K_{2^12}, with the intermediate fields K_{2^6}, K_{2^4}, K_{2^3}, K_{2^2} corresponding to the divisors 6, 4, 3, 2 of 12.)
* *
In the following, we use the notation ∑_{d|n} a_d. This means that n ∈ ℕ and that we take a sum of terms a_d as d ranges through the positive divisors
of n, including 1 and n. For instance ∑_{d|12} a_d = a1 + a2 + a3 + a4 + a6 + a12.
52.9 Lemma: Let φ be Euler's function. Then, for any natural number n,
∑_{d|n} φ(d) = n.
Sd = {k ∈ ℕ : k ≤ n and (k,n) = d}
= {k ∈ ℕ : d | k, k ≤ n and (k,n) = d}
= {k ∈ ℕ : k = db for some b ∈ ℕ, k ≤ n and (k,n) = d}
= {db : b ∈ ℕ, db ≤ n and (db,n) = d}
= {db : b ∈ ℕ, db ≤ n and (db, d·(n/d)) = d}
= {db : 1 ≤ b ≤ n/d and (b, n/d) = 1}.
n = ∑_{d|n} |Sd| = ∑_{d|n} φ(n/d) = ∑_{d|n} φ(d).
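The identity just proved is easy to test numerically; a sketch (function names are mine):

```python
# Lemma 52.9: summing Euler's phi over the positive divisors of n gives n.

from math import gcd

def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def divisor_phi_sum(n):
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)

for n in range(1, 50):
    assert divisor_phi_sum(n) == n
print("sum of phi(d) over d | n equals n for n = 1..49")
```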
of m integers such that one and only one of them is congruent to each one of 1,2, . . . ,m. Thus a complete residue system mod m is a set {r1,r2, . . . ,rm} such that the residue classes mod m of r1,r2, . . . ,rm make up ℤm. In particular, the ri are then mutually incongruent mod m (and, a fortiori, mutually distinct). If r1,r2, . . . ,rm are integers mutually incongruent mod m, then {r1,r2, . . . ,rm} is a complete residue system mod m. Also, if any integer is congruent, modulo m, to one of the integers r1,r2, . . . ,rm, then {r1,r2, . . . ,rm} is a complete residue system mod m.
Proof: (1) It will be sufficient to show that any two distinct of the mn numbers msi + nrj are incongruent modulo mn. Indeed, if
msi + nrj ≡ msi´ + nrj´ (mod mn),
then msi + nrj ≡ msi´ + nrj´ (mod m) and msi + nrj ≡ msi´ + nrj´ (mod n),
so nrj ≡ nrj´ (mod m) and msi ≡ msi´ (mod n),
so rj ≡ rj´ (mod m) and si ≡ si´ (mod n),
so rj = rj´ and si = si´,
so msi + nrj = msi´ + nrj´.
(2) Let us take a complete residue system {r1,r2, . . . ,rm} mod m such that {a1,a2, . . . ,aφ(m)} ⊆ {r1,r2, . . . ,rm} and a complete residue system {s1,s2, . . . ,sn} mod n such that {b1,b2, . . . ,bφ(n)} ⊆ {s1,s2, . . . ,sn}. We have {a1,a2, . . . ,aφ(m)} = {rj : j = 1,2, . . . ,m, (rj,m) = 1} and {b1,b2, . . . ,bφ(n)} = {si : i = 1,2, . . . ,n, (si,n) = 1}. Now {msi + nrj : j = 1,2, . . . ,m, i = 1,2, . . . ,n} is a complete residue system mod mn. So it will be sufficient to show that msi + nrj is relatively prime to mn if and only if (si,n) = 1 and (rj,m) = 1.
If (si,n) ≠ 1, then (si,n) divides both msi + nrj and mn, so (si,n) divides (msi + nrj, mn) and (msi + nrj, mn) ≠ 1. Likewise (rj,m) ≠ 1 implies that (msi + nrj, mn) ≠ 1.
On the other hand, if (si,n) = 1 and (rj,m) = 1, then (msi + nrj, mn) = 1. For otherwise (msi + nrj, mn) would be divisible by a prime number p. Then we would have p | mn, so p | m or p | n. Without loss of generality, assume p | m. Also p | msi + nrj, so p | nrj. Since p | m and (m,n) = 1, we would get (p,n) = 1. Then p | nrj and (p,n) = 1 would give p | rj, and p would divide (rj,m), contrary to (rj,m) = 1. So (si,n) = 1 and (rj,m) = 1 implies (msi + nrj, mn) = 1.
(3) From part (2), we learn that a reduced residue system modulo mn has φ(m)φ(n) elements. Hence φ(mn) = φ(m)φ(n) whenever m and n are relatively prime.
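The construction in the proof can be replayed concretely; the sketch below (moduli 8 and 15 are my choice) builds the residue system {ms + nr} and recovers the multiplicativity of φ.

```python
# If (m, n) = 1, the mn numbers m*s + n*r (r mod m, s mod n) form a complete
# residue system mod mn, and the reduced ones number phi(m)*phi(n) = phi(mn).

from math import gcd

def phi(k):
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

m, n = 8, 15
assert gcd(m, n) == 1
system = {(m * s + n * r) % (m * n) for r in range(m) for s in range(n)}
assert len(system) == m * n            # all mn values are incongruent mod mn

reduced = [x for x in system if gcd(x, m * n) == 1]
assert len(reduced) == phi(m) * phi(n) == phi(m * n)
print("phi(120) =", phi(m * n))        # -> phi(120) = 32
```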
= p^a(1 − 1/p).
φ(n) = ∑_{d|n} μ(d)·(n/d),
where μ(d) = 0 if d is divisible by the square of some prime number, and, if d is not divisible by the square of any prime number, μ(d) = 1 or −1 according as the number of (distinct) prime divisors of d is even or odd. This leads us to the function μ named after A. F. Möbius (1790-1868).
μ(6) = 1, μ(7) = −1, μ(8) = 0, μ(9) = 0, μ(10) = 1.
The two formulas n = ∑_{d|n} φ(d) and φ(n) = ∑_{d|n} μ(d)·(n/d) are equivalent. This is a special case of a formula known as the Möbius inversion formula, for whose proof we need a lemma.
F(n) = ∑_{d|n} f(d)
for all n ∈ ℕ. Then
f(n) = ∑_{d|n} μ(d)F(n/d) = ∑_{d|n} μ(n/d)F(d)
for all n ∈ ℕ.
Proof: Let n ∈ ℕ. For any positive divisor d of n, we have
F(n/d) = ∑_{b|(n/d)} f(b),
so μ(d)F(n/d) = ∑_{b|(n/d)} μ(d)f(b)
and ∑_{d|n} μ(d)F(n/d) = ∑_{d|n} ∑_{b|(n/d)} μ(d)f(b).
The last sum is over all ordered pairs (d,b) of positive divisors of n such that db | n. Hence it is also the sum over all ordered pairs (b,d) of positive divisors of n such that bd | n and we get
∑_{d|n} μ(d)F(n/d) = ∑_{b|n} f(b)·∑_{d|(n/b)} μ(d) = f(n),
the inner sum ∑_{d|(n/b)} μ(d) being zero unless n/b = 1.
F(n) = ∏_{d|n} f(d)
for all n ∈ ℕ. Then
f(n) = ∏_{d|n} F(n/d)^{μ(d)} = ∏_{d|n} F(d)^{μ(n/d)}
for all n ∈ ℕ.
Proof: Let n ∈ ℕ. We have
F(n/d) = ∏_{b|(n/d)} f(b),
so F(n/d)^{μ(d)} = ∏_{b|(n/d)} f(b)^{μ(d)}
and ∏_{d|n} F(n/d)^{μ(d)} = ∏_{d|n} ∏_{b|(n/d)} f(b)^{μ(d)},
and so
∏_{d|n} F(n/d)^{μ(d)} = ∏_{b|n} ∏_{d|(n/b)} f(b)^{μ(d)} = ∏_{b|n} f(b)^{∑_{d|(n/b)} μ(d)} = f(n).
* *
We return to finite fields. We will prove that, for any prime number p
and natural number n, there is a finite field with pn elements and that
any two finite fields with the same number of elements are isomorphic.
We begin by discussing the decomposition of x^{p^n} − x ∈ ℤp[x] into irreducible polynomials in the unique factorization domain ℤp[x]. It turns out that all irreducible factors of x^{p^n} − x are distinct, and an irreducible polynomial in ℤp[x] divides x^{p^n} − x if and only if its degree divides n.
52.15 Theorem: Let p be a positive prime number and let Fd(x) be the product of all monic irreducible polynomials of degree d in ℤp[x] (if there is no monic irreducible polynomial of degree d in ℤp[x], let Fd(x) be the constant polynomial 1 ∈ ℤp[x]). Then
x^{p^n} − x = ∏_{d|n} Fd(x) in ℤp[x].
Proof: All roots of x^{p^n} − x are simple, because x^{p^n} − x is relatively prime to its derivative −1. So x^{p^n} − x is not divisible by the square of any polynomial in ℤp[x]. In particular, x^{p^n} − x is not divisible by the square of any of its irreducible factors in ℤp[x].
Suppose f(x) ∈ ℤp[x] is a monic irreducible polynomial in ℤp[x] and let d = deg f(x). We construct the field ℤp(a) by adjoining a root a of f(x) to ℤp. Now f(x) is the minimal polynomial of a over ℤp, so ℤp(a):ℤp = deg f(x) = d and ℤp(a) is a field of p^d elements. Therefore b^{p^d} = b for all b ∈ ℤp(a) (Lemma 52.4(3)). We are to prove that f(x) | x^{p^n} − x in ℤp[x] if and only if d | n in ℕ.
Assume d | n. As a ∈ ℤp(a), we have a^{p^d} = a, so a is a root of x^{p^d} − x ∈ ℤp[x]. But f(x) is the minimal polynomial of a over ℤp, hence f(x) | x^{p^d} − x in ℤp[x]. From d | n, it follows that x^{p^d} − x | x^{p^n} − x (Lemma 52.7(3)), so f(x) | x^{p^n} − x.
Assume now f(x) | x^{p^n} − x. Then f(x)g(x) = x^{p^n} − x for some g(x) ∈ ℤp[x], and f(a)g(a) = a^{p^n} − a = 0. So a is a root of x^{p^n} − x. But then any element of ℤp(a) is a root of x^{p^n} − x: if b ∈ ℤp(a), say b = f0 + f1a + f2a² + . . . + f_{d−1}a^{d−1} with f0,f1,f2, . . . ,f_{d−1} ∈ ℤp, then we get
b^{p^n} = (f0 + f1a + f2a² + . . . + f_{d−1}a^{d−1})^{p^n}
= f0^{p^n} + f1^{p^n}(a)^{p^n} + f2^{p^n}(a²)^{p^n} + . . . + f_{d−1}^{p^n}(a^{d−1})^{p^n}
= f0 + f1a + f2a² + . . . + f_{d−1}a^{d−1} = b.
Since the elements of ℤp(a) coincide with the roots of x^{p^d} − x (Lemma 52.4(3)), we see that any root of x^{p^d} − x is also a root of x^{p^n} − x. Therefore x^{p^d} − x divides x^{p^n} − x and, by Lemma 52.7(3), d divides n.
(1) p^n = ∑_{d|n} dNd;
(2) Fn(x) = ∏_{d|n} (x^{p^d} − x)^{μ(n/d)};
(3) Nn = (1/n)∑_{d|n} μ(n/d)p^d;
(4) Nn ≠ 0.
Proof: (1) This follows from x^{p^n} − x = ∏_{d|n} Fd(x) by equating the degrees.
(2) This follows from the same equation by Lemma 52.14 (with the function F: ℕ → ℤp(x) that maps n to Fn(x)).
(3) This follows from part (1) by the Möbius inversion formula (Lemma 52.13).
(4) Nn ≥ 0 by its definition. Also, if Nn = 0, we get ∑_{d|n} μ(n/d)p^d = 0 from part (3) and, dividing both sides by the smallest p^d for which μ(n/d) ≠ 0, say by p^{d0}, we obtain an equation μ(n/d0) = −∑_{d|n, d≠d0} μ(n/d)p^{d−d0}, where the right hand side is, and the left hand side is not, divisible by p, a contradiction. Hence Nn ≠ 0.
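Formula (3) can be compared against a direct count of irreducible polynomials; below is a brute-force sketch for p = 2 (all names and the sieve idea are mine, not the book's).

```python
# N_n = (1/n) * sum_{d|n} mu(n/d) p^d, checked against a brute-force count of
# monic irreducible polynomials over Z_2 (coefficient tuples, low-to-high).

from itertools import product

def mu(n):
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def N(p, n):
    return sum(mu(n // d) * p**d for d in range(1, n + 1) if n % d == 0) // n

def count_irreducible(p, n):
    # a monic polynomial of degree n is reducible iff it is a product of two
    # monic polynomials of lower degree
    def polys(deg):
        return [t + (1,) for t in product(range(p), repeat=deg)]

    def pmul(f, g):
        out = [0] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                out[i + j] = (out[i + j] + a * b) % p
        return tuple(out)

    reducible = {pmul(f, g)
                 for d in range(1, n // 2 + 1)
                 for f in polys(d)
                 for g in polys(n - d)}
    return len(polys(n)) - len(reducible)

for n in range(1, 6):
    assert N(2, n) == count_irreducible(2, n)
print([N(2, n) for n in range(1, 9)])    # [2, 1, 2, 3, 6, 9, 18, 30]
```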
ℤp. Then K:ℤp = n and K is a field with p^n elements (Theorem 50.7).
n = |G| = ∑_{d|n} ψ(d),
g^d = 1. Hence they are roots of the polynomial x^d − 1 ∈ K[x] and this polynomial has therefore at least d roots in K. On the other hand, it can have at most d roots in K, thus it has exactly d roots in K, namely the elements in ⟨g⟩. Thus any element in G that has order d, which necessarily is a root of x^d − 1, is in the subgroup ⟨g⟩, and an element in ⟨g⟩ is of order d if and only if that element is a generator of ⟨g⟩. Thus the elements in G of order d coincide with the generators of ⟨g⟩. There are φ(d) generators of ⟨g⟩, so there are φ(d) elements in G of order d, i.e., ψ(d) = φ(d), as claimed.
∑_{d|n} φ(d) = n and this gives ψ(d) = φ(d) for all positive divisors d of n.
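The counting that drives this argument can be watched in a small case; the sketch below (names and the choice p = 13 are mine) tallies element orders in ℤ13* and compares them with Euler's φ.

```python
# In the cyclic group Z_13* of order 12, there are exactly phi(d) elements of
# order d for each divisor d of 12.

from math import gcd

p = 13

def order(g):
    k, x = 1, g
    while x != 1:
        x = (x * g) % p
        k += 1
    return k

def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

counts = {}
for g in range(1, p):
    counts[order(g)] = counts.get(order(g), 0) + 1

for d in [d for d in range(1, 13) if 12 % d == 0]:
    assert counts.get(d, 0) == phi(d)
print(counts)
```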
Proof: Since 0 ∈ ℤp(t) and since any nonzero element of K, being a power of t, is in ℤp(t), we get K ⊆ ℤp(t); thus K = ℤp(t). This proves (1). Then the degree of the minimal polynomial of t over ℤp is equal to ℤp(t):ℤp = K:ℤp = n. This proves (2). Finally, since the degree of the minimal polynomial of t over ℤp is equal to n, hence a divisor of n, this polynomial is a divisor of x^{p^n} − x (Theorem 52.15) and has n distinct roots in K1 (Lemma 52.4(3)); in particular, there is a root of this polynomial in K1. This proves (3).
52.20 Theorem: Any two finite fields with the same number of ele-
ments are isomorphic.
Proof: Let K and K1 be fields of p^n elements. Then K* is a cyclic group (Theorem 52.18). Let t be a generator of K*. Then K = ℤp(t) by Theorem 52.19(1). Let f(x) ∈ ℤp[x] be the minimal polynomial of t over ℤp. Now f(x) has a root c in K1 (Theorem 52.19(3)). Let ℤp(c) ⊆ K1 be the subfield of K1 generated by c over ℤp. Then n = deg f(x) = ℤp(c):ℤp ≤ K1:ℤp = n
yields ℤp(c) = K1. We then get
K1 = ℤp(c) ≅ ℤp[x]/(f(x)) ≅ ℤp(t) = K
from Theorem 50.6. Hence K1 ≅ K.
In view of this theorem, we identify all finite fields with the same number of elements. Thus there is a unique field of q elements (q = p^n), and this field will be henceforward denoted by 𝔽q.
Exercises
1. Find the finite subgroups of ℂ* and show directly that they are cyclic.
6. Let K be a field with p^n elements. Let a ∈ K and put f(x) = ∏_{k=0}^{n−1} (x − a^{p^k}).
Show that f(x) ∈ ℤp[x]. Conclude that a + a^p + a^{p²} + . . . + a^{p^{n−1}} ∈ ℤp. This sum a + a^p + a^{p²} + . . . + a^{p^{n−1}} is called the trace of a over ℤp and is denoted by T_{K/ℤp}(a). Prove that T_{K/ℤp}(a + b) = T_{K/ℤp}(a) + T_{K/ℤp}(b) and T_{K/ℤp}(ca) = cT_{K/ℤp}(a) for all a,b ∈ K and c ∈ ℤp, and show that there is an a ∈ K with T_{K/ℤp}(a) ≠ 0.
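As an illustration of this exercise (not a solution), here is a numeric check in the field with 9 elements, modelled as ℤ₃[x]/(x² + 1); the pair encoding and names are mine.

```python
# The field with 9 elements as pairs (a, b) = a + b*i with i^2 = -1 over Z_3.
# With p = 3, n = 2, the trace of z is z + z^3.

P = 3

def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def add(x, y):
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)

def cube(x):
    return mul(x, mul(x, x))

def trace(x):                       # x + x^p, here p = 3
    return add(x, cube(x))

elems = [(a, b) for a in range(3) for b in range(3)]
# the trace lands in the prime field Z_3 (second coordinate 0) ...
assert all(trace(z)[1] == 0 for z in elems)
# ... is additive ...
assert all(trace(add(y, z)) == add(trace(y), trace(z)) for y in elems for z in elems)
# ... and is not identically zero:
assert any(trace(z) != (0, 0) for z in elems)
print("trace properties verified in the field with 9 elements")
```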
§53
Splitting Fields
Let us recall that, for any field isomorphism σ: K1 → K2, we have a ring isomorphism σ̄: K1[x] → K2[x] given by (∑_{i=0}^{m} aix^i)σ̄ = ∑_{i=0}^{m} (aiσ)x^i (Lemma …).
53.1 Lemma: Let E1/K1 and E2/K2 be field extensions and let σ: K1 → K2 be a field isomorphism. Assume f1(x) ∈ K1[x] is an irreducible polynomial in K1[x] and let f2(x) = (f1(x))σ̄ ∈ K2[x] be its image under σ̄. Let u1 ∈ E1 be a root of f1(x) and u2 ∈ E2 a root of f2(x). Let K1(u1) ⊆ E1 be the subfield of E1 generated by u1 and let K2(u2) ⊆ E2 be the subfield of E2 generated by u2. Then σ extends to an isomorphism of fields K1(u1) → K2(u2) that maps u1 to u2; that is, there is a field isomorphism τ: K1(u1) → K2(u2) such that u1τ = u2 and the restriction of τ to K1 is σ. Moreover, there is only one isomorphism τ with these properties.
φ1: K1(u1) → K1[x]/(f1),
φ2: K2(u2) → K2[x]/(f2),
σ̄: K1[x] → K2[x],
σ̃: K1[x]/(f1) → K2[x]/(f2), g + (f1) ↦ gσ̄ + (f2).
Hence φ1σ̃φ2⁻¹: K1(u1) → K2(u2) is a (ring, and therefore also a) field isomorphism. We write τ = φ1σ̃φ2⁻¹. Then aτ = (aφ1)σ̃φ2⁻¹ = [a + (f1)]σ̃φ2⁻¹ = [aσ + (f2)]φ2⁻¹ = aσ for any a ∈ K1 (we regard K1 as a subfield of K1[x]/(f1) and K2 as a subfield of K2[x]/(f2) as in Kronecker's theorem (Theorem 51.1)) and u1τ = (u1φ1)σ̃φ2⁻¹ = (x + (f1))σ̃φ2⁻¹ = (x + (f2))φ2⁻¹ = u2.
Thus τ is an extension of σ such that u1τ = u2.
53.2 Theorem: Let E1/K and E2/K be field extensions and let u1 ∈ E1 and u2 ∈ E2 be algebraic over K. Then the minimal polynomial of u1 over K coincides with the minimal polynomial of u2 if and only if there is an isomorphism (necessarily unique) of fields σ: K(u1) → K(u2) that maps u1 to u2 and whose restriction to K is the identity mapping on K.
0 = 0σ = f(u1)σ = (∑_{i=0}^{m} aiu1^i)σ = ∑_{i=0}^{m} (aiσ)(u1^iσ) = ∑_{i=0}^{m} ai(u1σ)^i = ∑_{i=0}^{m} aiu2^i = f(u2). Thus u2 is a root of f(x), and f(x) ∈ K[x] is a monic irreducible polynomial, which means that f(x) is the minimal polynomial of u2 over K.
53.3 Remark: Theorem 53.1 should not mislead the reader to believe that any field isomorphism can be extended to larger fields. Consider, for example, the isomorphism σ: ℚ(√2) → ℚ(√2) given by a + b√2 ↦ a − b√2 (a,b ∈ ℚ). Now ℚ(⁴√2) is an extension field of ℚ(√2). If σ: ℚ(√2) → ℚ(√2) could be extended to an isomorphism τ: ℚ(⁴√2) → ℚ(⁴√2), we would have
−√2 = √2σ = √2τ = ((⁴√2)²)τ = ((⁴√2)τ)²,
a contradiction, since the square of (⁴√2)τ ∈ ℚ(⁴√2) has to be positive. So σ cannot be extended to an isomorphism of ℚ(⁴√2).
The most important application of Theorem 53.1 is that any two splitting
fields of a polynomial are isomorphic. We now discuss this matter.
53.4 Definition: Let E/K be a field extension and f(x) ∈ K[x]\K. If f(x) can be written as a product of linear polynomials in E[x], i.e., if there are a0,a1,a2, . . . ,am in E such that f(x) = a0(x − a1)(x − a2). . .(x − am), then f(x) is said to split in E. If f(x) splits in E but not in any proper subfield of E containing K, then E is called a splitting field of f(x) over K.
(c) x³ − 2 ∈ ℚ[x] does not split in ℚ(∛2) because x³ − 2 = (x − ∛2)(x² + ∛2·x + (∛2)²) in ℚ(∛2)[x] and the second factor is irreducible in ℚ(∛2)[x]. On the other hand, x³ − 2 = (x − ∛2)(x − ∛2ω)(x − ∛2ω²) in ℚ(∛2, ω)[x], so x³ − 2 splits in ℚ(∛2, ω)[x] (ω being a primitive third root of unity). In fact ℚ(∛2, ω) is a splitting field of x³ − 2 over ℚ. Notice that ℚ(∛2, ω) = ℚ(∛2, ∛2ω, ∛2ω²) is the field generated by the roots of x³ − 2 over ℚ.
(d) Let E/K be a field extension and f(x) ∈ K[x] a polynomial of positive degree n. Assume that E contains n roots a1,a2, . . . ,an of f(x) (counted with multiplicity). Then H = K(a1,a2, . . . ,an) is a splitting field of f(x) over K. Indeed, with the leading coefficient a0 ∈ K, we have the factorization f(x) = a0(x − a1)(x − a2). . .(x − an) in H[x] since each factor x − ak belongs to H[x]. Hence f(x) splits in H[x]. On the other hand, if L is any intermediate field of E/K in which f(x) splits, then x − ak is in L[x] and so ak is in L for all k, thus {a1,a2, . . . ,an} ⊆ L and H = K(a1,a2, . . . ,an) ⊆ L. Hence f(x) does not split in any proper subfield of H containing K. Therefore, H is a splitting field of f(x) over K. This argument shows in fact that K(a1,a2, . . . ,an) is the unique intermediate field of E/K which is a splitting field of f(x) over K. In particular, E is a splitting field of f(x) if and only if E = K(a1,a2, . . . ,an).
(e) Let E/K be a field extension, L an intermediate field of this extension and f(x) ∈ K[x]\K. Assume that E is a splitting field of f(x) over K. Then E is a splitting field of f(x) over L, too, since f(x) splits in E but not in any proper subfield of E containing K, so that all the more so f(x) does not split in any proper subfield of E containing L.
(f) Let p be prime. Any greatest common divisor of x^{p^n} − x with its derivative p^n·x^{p^n − 1} − 1 = −1 is a unit in ℤp[x]. Hence x^{p^n} − x ∈ ℤp[x] has no multiple roots (Theorem 35.18(2)). Thus an extension field of ℤp in which x^{p^n} − x splits must have at least the p^n distinct roots of x^{p^n} − x. We know that x^{p^n} − x splits in the field 𝔽_{p^n} with p^n elements (Lemma 52.4(3)). Thus 𝔽_{p^n} is a splitting field of x^{p^n} − x over ℤp.
(g) Let E/K be a field extension and f(x) ∈ K[x]\K. Let a1 ∈ E be a root of f(x) and let L = K(a1) be the subfield of E generated by a1 over K, so that f(x) = (x − a1)g(x) for some g(x) ∈ L[x]. We claim that, if (g(x) has positive degree and) E is a splitting field of g(x) over L, then E is also a splitting field of f(x) over K. Indeed, if E is a splitting field of g(x) over L, then g(x) = c(x − a2). . .(x − an) where c ∈ K and a2, . . . ,an ∈ E. We know that E = L(a2, . . . ,an) from Example 53.5(d). Then f(x) = c(x − a1)(x − a2). . .(x − an) in E[x] and f(x) splits in E[x]. On the other hand, if E´ is any intermediate field of E/K and f(x) splits in E´, then f(x) = c(x − a1)(x − a2). . .(x − an) in E´[x], so a1 ∈ E´, so L = K(a1) ⊆ E´ and a2, . . . ,an ∈ E´, so L(a2, . . . ,an) ⊆ E´ and E ⊆ E´. Thus f(x) cannot split in any proper subfield of E containing K and E is a splitting field of f(x) over K.
(h) We saw in Example 53.5(c) that ℚ(∛2, ω) is a splitting field of x³ − 2 over ℚ. Likewise, ℚ(∛2)[y]/(y² + y + 1) and ℚ(ω)[y]/(y³ − 2) are splitting fields of x³ − 2 over ℚ (here y is an indeterminate over ℚ). In these fields, x³ − 2 splits as
[x − (∛2 + (y² + y + 1))]·[x − (∛2y + (y² + y + 1))]·[x − (∛2y² + (y² + y + 1))]
and
[x − (y + (y³ − 2))]·[x − (ωy + (y³ − 2))]·[x − (ω²y + (y³ − 2))],
respectively.
A natural question is whether any polynomial has a splitting field. We
show now that this is indeed the case. The following theorem is due to
Kronecker.
Suppose now deg f(x) = n ≥ 2 and the theorem is true for any polynomial over any field if its degree is n − 1. We construct an extension field L of K in which f(x) has a root a and L:K ≤ n (Theorem 51.5; possibly L = K). Then, by Theorem 35.6, f(x) = (x − a)g(x) for some g(x) in L[x]. Now deg g(x) = n − 1 and, by induction, there is an extension field E of L such that E is a splitting field of g(x) over L and E:L ≤ (n − 1)!. From Example 53.5(g), we conclude that E is a splitting field of f(x) over K. Moreover, E:K = E:L · L:K ≤ (n − 1)! · L:K ≤ (n − 1)!·n = n!.
53.7 Theorem: Let E1/K1 and E2/K2 be field extensions and let σ: K1 → K2 be a field isomorphism. Let f1(x) ∈ K1[x] be a polynomial in K1[x]\K1 and let f2(x) = (f1(x))σ̄ ∈ K2[x]\K2 be its image under σ̄. If E1 is a splitting field of f1(x) over K1 and E2 is a splitting field of f2(x) over K2, then σ extends to a field isomorphism τ: E1 → E2 and so E1 ≅ E2.
If E1:K1 = 1, then E1 = K1 and f1(x) splits in K1. Then f2(x) splits in K2 and K2 = E2. Thus σ: E1 = K1 → K2 = E2 is the desired isomorphism.
Suppose now E1:K1 ≥ 2 and suppose that any field isomorphism can be extended to an isomorphism of splitting fields of corresponding polynomials whenever the degree of a splitting field is less than or equal to n − 1. Since E1:K1 ≥ 2 and E1 is generated over K1 by the roots of f1(x), there must be a root of f1(x) in E1 which does not belong to K1. Let u1 be a root of f1(x) in E1\K1. Assume g1(x) ∈ K1[x] is the minimal polynomial of u1 over K1 and let u2 be a root of (g1(x))σ̄ = g2(x) ∈ K2[x] in E2. From Lemma 53.1, we know that σ can be extended to an isomorphism σ1: K1(u1) → K2(u2). Now u1 ∈ E1\K1, so K1(u1):K1 > 1 and E1:K1(u1) < n (Theorem 48.13). As E1 is a splitting field of f1(x) over K1(u1) and E2 is a splitting field of f2(x) over K2(u2) (Example 53.5(e)), we conclude, by induction, that σ1 can be extended to an isomorphism τ: E1 → E2. This is the desired extension of σ.
Proof: Let E1 and E2 be splitting fields of f(x) over K and apply Theorem 53.7 with K1 = K = K2 and σ = the identity mapping on K.
53.9 Definition: A field K is said to be algebraically closed if K has no
proper algebraic extension field, i.e., if any algebraic extension E of K
coincides with K.
(ii) ⇒ (i) Suppose that any irreducible polynomial in K[x] has degree one. We want to show that K has no proper algebraic extension. If E were a proper algebraic extension of K, then there would be an a ∈ E\K. Now a is algebraic over K and K ⊊ K(a) since a ∉ K (Lemma 49.6(1)). This leads to the contradiction
1 < K(a):K = degree of the minimal polynomial of a over K
= degree of an irreducible polynomial in K[x] = 1.
Thus K is algebraically closed.
(ii) ⇒ (iii) Suppose that any irreducible polynomial in K[x] has degree one. Let f(x) be any polynomial of positive degree in K[x]. We show that f(x) has a root in K. Indeed, any irreducible divisor of f(x) has the form c(x − a) with c,a ∈ K and thus has a root a in K, so f(x), too, has a root in K.
(iii) ⇒ (iv) Assume that any polynomial of positive degree in K[x] has a root in K and let f1(x) ∈ K[x]\K. Then f1(x) has a root a1 in K and f1(x) = (x − a1)f2(x) for some f2(x) ∈ K[x]. If f2(x) has positive degree, then f2(x) has a root a2 in K and f2(x) = (x − a2)f3(x) for some f3(x) ∈ K[x]; so f1(x) = (x − a1)(x − a2)f3(x). If f3(x) has positive degree, then f3(x) has a root a3 in K and f3(x) = (x − a3)f4(x) for some f4(x) ∈ K[x]; so f1(x) = (x − a1)(x − a2)(x − a3)f4(x). Proceeding in this way, we will meet an f_{n+1}(x) of degree zero and f1(x) = (x − a1)(x − a2)(x − a3). . .(x − an)f_{n+1} splits in K.
Does every field K have an algebraic closure? The answer is 'yes' and its proof requires Zorn's Lemma. There is no algebraic difficulty in the proof, but there are certain set-theoretical subtleties and we will not give the proof in this book. It is also true that an algebraic closure of a field K is unique in the sense that any two algebraic closures of a field K are isomorphic by an isomorphism that fixes each element of K.
Exercises
§54
Galois Theory
Aut(E) is the collection of mappings from E onto E that preserve the field structure of E. From these field automorphisms, we select the mappings that also preserve the vector space structure of E. We introduce some terminology.
54.1 Definition: Let E/K and F/K be field extensions. A mapping σ: E → F is called a K-homomorphism if σ is both a field homomorphism and a K-vector space homomorphism. A K-homomorphism σ: E → F is called a K-isomorphism if σ is one-to-one and onto F. A K-isomorphism from E onto E is called a K-automorphism of E. The set of all K-automorphisms of E will be denoted by AutK E or by G(E/K).
54.2 Lemma: Let E/K be a field extension and let AutK E be the set of
all K-automorphisms of E over K. Then AutK E is a group.
54.3 Definition: Let E/K be a field extension. The group AutK E = G(E/K) is called the Galois group of E over K.
54.4 Examples: (a) Let E be any field and let P be the prime subfield of E. Any field automorphism σ of E fixes 1 ∈ E. This implies that σ fixes each element in P. Therefore any field automorphism of E is a P-automorphism of E and Aut(E) = AutP(E).
(b) The familiar complex conjugation mapping (a + bi ↦ a − bi, where a,b ∈ ℝ) is an ℝ-automorphism of ℂ.
We find AutK K(x). In the following, y and z are two additional distinct indeterminates over K.
in K(u)[y]. Thus x is algebraic over K(u). We see moreover that deg F(y) = max(m,n) = max(deg p(x), deg q(x)), because b_m u − a_n ≠ 0 as u ∉ K. We will show that F(y) is irreducible over K(u). This will imply that cF(y) is the minimal polynomial of x over K(u), where 1/c is the leading coefficient of F(y), and so K(x):K(u) = deg cF(y) = deg F(y) = max(deg p(x), deg q(x)).
Thus we get K(x):K(u) = max(deg p(x), deg q(x)) for any u = p(x)/q(x) in K(x)\K, where p(x) and q(x) are relatively prime polynomials in K[x] and q(x) ≠ 0.
infer that σ is in AutK K(x). Therefore AutK K(x) consists exactly of the substitution homomorphisms x ↦ (ax + b)/(cx + d), where a,b,c,d ∈ K and ad − bc ≠ 0.
The next lemma is a generalization of the familiar fact that the complex conjugate of any root of a polynomial with real coefficients is also a root of the same polynomial. In the terminology of §26, if E/K is a field extension, AutK E acts on the set of distinct roots of a polynomial f(x) over K.
0 = 0σ = f(u)σ = ∑_{i=0}^{m} (aiσ)(u^iσ) = ∑_{i=0}^{m} ai(uσ)^i = f(uσ). Thus uσ is a root of f(x).
Let E/K be a finite dimensional extension and assume that {a1,a2, . . . ,am} is a K-basis of E. Then any K-automorphism of E is completely determined by its effect on the basis elements, for if σ and τ are K-automorphisms of E and aiσ = aiτ for i = 1,2, . . . ,m, then, for any a ∈ E, which we write in the form ∑_{i=1}^{m} kiai, we have
aσ = (∑_{i=1}^{m} kiai)σ = ∑_{i=1}^{m} ki(aiσ) = ∑_{i=1}^{m} ki(aiτ) = (∑_{i=1}^{m} kiai)τ = aτ. For this reason,
a^iσ = (aσ)^i for any i = 0,1,2, . . . ,n − 1, the mapping σ is completely determined by its effect on a. Now aσ is a root in K(a) of the minimal polynomial of a over K. Thus |AutK E| ≤ r, where r is the number of distinct roots in K(a) of the minimal polynomial of a over K. We proved the following lemma.
54.7 Examples: (a) Let ∛2 be the positive real cube root of 2. Thus ℚ(∛2) ⊆ ℝ. We find Autℚ ℚ(∛2). If σ ∈ Autℚ ℚ(∛2), then ∛2σ is a root of the minimal polynomial x³ − 2 of ∛2 over ℚ. Since the roots of x³ − 2 other than ∛2 are complex, ∛2σ must be ∛2. Thus σ must be the identity mapping on ℚ(∛2) and Autℚ ℚ(∛2) = 1.
(a + b√2 + c√3 + d√6)σ1 = a + b√2 + c√3 + d√6,
(a + b√2 + c√3 + d√6)σ2 = a + b√2 − c√3 − d√6,
(a + b√2 + c√3 + d√6)σ3 = a − b√2 + c√3 − d√6,
(a + b√2 + c√3 + d√6)σ4 = a − b√2 − c√3 + d√6.
σ1 is the identity mapping on ℚ(√2,√3) and σiσj = σk when {i,j,k} = {2,3,4}. Thus
Autℚ ℚ(√2,√3) ≅ C2 × C2 ≅ V4.
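The group structure can be checked mechanically. In the sketch below (encoding is mine), each automorphism is recorded by the signs it puts on √2, √3, √6; note that the sign on √6 is forced to be the product of the other two, since √6 = √2·√3.

```python
# The four Q-automorphisms of Q(sqrt2, sqrt3), encoded by their signs on
# (sqrt2, sqrt3, sqrt6).  Composition is componentwise sign multiplication.

s1 = (1, 1, 1)      # identity
s2 = (1, -1, -1)    # fixes sqrt2
s3 = (-1, 1, -1)    # fixes sqrt3
s4 = (-1, -1, 1)    # fixes sqrt6

def compose(s, t):
    return tuple(a * b for a, b in zip(s, t))

G = [s1, s2, s3, s4]
assert all(compose(s, t) in G for s in G for t in G)   # closed under composition
assert all(compose(s, s) == s1 for s in G)             # every element squares to 1
assert compose(s2, s3) == s4                           # sigma2 * sigma3 = sigma4
print("Aut(Q(sqrt2, sqrt3)/Q) is the Klein four-group V4")
```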
ℚ(√2,√3)´ = 1 ≤ G = Autℚ ℚ(√2,√3),
ℚ(√2)´ = {σ1, σ2}, ℚ(√3)´ = {σ1, σ3}, ℚ(√6)´ = {σ1, σ4},
ℚ´ = G,
and 1´ = ℚ(√2,√3),
{σ1, σ2}´ = ℚ(√2), {σ1, σ3}´ = ℚ(√3), {σ1, σ4}´ = ℚ(√6),
G´ = ℚ.
If E/K is a field extension and H ≤ AutK E, then H´ is called the fixed field of H. Let us consider the four extreme cases of the priming correspondence in Lemma 54.8.
(Diagram: E ↔ 1 and K ↔ G under priming.)
Equivalently, E/K is Galois if and only if for any element a of E\K, there exists a σ ∈ AutK E such that aσ ≠ a. It is easy to verify that ℂ is a Galois extension of ℝ and that ℚ(√2) and ℚ(√2,√3) are Galois extensions of ℚ.
54.11 Lemma: Let E/K be a field extension and put G = AutK E. Let L,M
be intermediate fields of E/K and let H,J be subgroups of G. If X is an
intermediate field of E/K or a subgroup of G, we denote (X´)´ shortly by
X´´. Then the following hold.
(4) By parts (1) and (2), priming reverses inclusion, therefore L ⊆ L´´ and H ⊆ H´´ yield L´´´ ⊆ L´ and H´´´ ⊆ H´. Also, using (3) with L replaced by H´ and H by L´, we get H´ ⊆ H´´´ and L´ ⊆ L´´´. So L´´´ = L´ and H´´´ = H´.
(Diagram: the towers K ⊆ L ⊆ M ⊆ E and 1 ≤ H ≤ J ≤ G, with the primed objects M´ ⊆ L´ and J´ ⊆ H´ reversing the inclusions.)
In general, L may very well be a proper subset of L´´ and H a proper subset of H´´. We introduce a term for the case of equality.
So E is Galois over K if and only if K is closed. Lemma 54.11(4) states that any primed object is closed.
then the relative index of M´ and L´ is also finite. In fact, L´:M´ ≤ M:L. In particular, if E/K is a finite dimensional extension, then |AutK E| ≤ E:K.
(Diagram: the tower L ⊆ L(a) ⊆ M with L(a):L = k and M:L(a) = n/k, and the corresponding groups L´ ⊇ L(a)´ ⊇ M´.)
aστ⁻¹ = a, so στ⁻¹ fixes a, so στ⁻¹ fixes each element of L(a) = M, so στ⁻¹ ∈ M´ and M´σ = M´τ. This completes the proof of L´:M´ ≤ M:L.
The assertion |AutK E| ≤ E:K follows easily: |AutK E| = AutK E:1 = K´:1 = K´:E´ ≤ E:K.
54.15 Lemma: Let E/K be a field extension and let H,J be subgroups of G = AutK E with H ≤ J. If the relative index J:H of H and J is finite, then the relative dimension of J´ and H´ is also finite. In fact, H´:J´ ≤ J:H.
(a1σ1)x1 + (a2σ1)x2 + (a3σ1)x3 + . . . + (an+1σ1)xn+1 = 0
(a1σ2)x1 + (a2σ2)x2 + (a3σ2)x3 + . . . + (an+1σ2)xn+1 = 0
. . . . . . . . . . . . . . . . . . . . .           (b)
(a1σn)x1 + (a2σn)x2 + (a3σn)x3 + . . . + (an+1σn)xn+1 = 0
Eventually after renumbering, we may assume that b1, . . . ,br are distinct from zero and (in case n + 1 > r) br+1 = . . . = bn+1 = 0. Also, we may assume that b1 = 1, for otherwise we may take the solution b1/b1, b2/b1, b3/b1, . . . , bn+1/b1 instead of b1,b2,b3, . . . ,bn+1. Of course the number r of nonzero elements in both solutions is the same.
(a1τ1σ)x1 + (a2τ1σ)x2 + (a3τ1σ)x3 + . . . + (an+1τ1σ)xn+1 = 0
(a1τ2σ)x1 + (a2τ2σ)x2 + (a3τ2σ)x3 + . . . + (an+1τ2σ)xn+1 = 0
. . . . . . . . . . . . . . . . . . . . .           (s)
(a1τnσ)x1 + (a2τnσ)x2 + (a3τnσ)x3 + . . . + (an+1τnσ)xn+1 = 0
so that τ1σ = ρ1τi1, τ2σ = ρ2τi2, τ3σ = ρ3τi3, . . . , τnσ = ρnτin
for some ρ1, ρ2, ρ3, . . . , ρn ∈ H (where {i1,i2,i3, . . . ,in} = {1,2, . . . ,n}). Thus each ρk fixes each am ∈ H´ and the ik-th equation becomes
(a1τik)x1 + (a2τik)x2 + (a3τik)x3 + . . . + (an+1τik)xn+1 = 0
from the first equation in (b). Here {a1,a2,a3, . . . ,an+1} is linearly independent over J´ and b1 = 1 ≠ 0. Thus all of b1,b2,b3, . . . ,bn+1 cannot be in J´: one of them, say b2, is not in J´. So there is a σ ∈ J such that b2σ ≠ b2.
We choose σ ∈ J such that b2σ ≠ b2. Then the solution (c) of the system (b) is a nontrivial solution in which the number of nonzero elements is less than r, contrary to the meaning of r as the smallest number of nonzero elements in any solution of (b). This contradiction shows that H´:J´ > n is impossible. Hence H´:J´ ≤ n = J:H.
54.16 Theorem: Let E/K be a field extension and G = AutK E. Let L,M be intermediate fields of E/K with L ⊆ M and let H,J be subgroups of G with H ≤ J.
(1) If L is closed and M:L is finite, then M is closed and L´:M´ = M:L.
(2) If H is closed and J:H is finite, then J is closed and H´:J´ = J:H.
We are now in a position to state and prove the major theorem of this
paragraph.
Proof: By Theorem 54.13, there is a one-to-one correspondence between the set of all closed intermediate fields of E/K and the set of all closed subgroups of G, given by L ↦ L´. Now K is closed by hypothesis (E/K is a Galois extension) and all intermediate fields are closed by Theorem 54.16(1) since they are finite dimensional over K. Moreover, if M is any intermediate field, then K´:M´ = M:K. In particular, E is closed and |AutK E| = |G| = G:1 = G:E´ = K´:E´ = E:K. Hence G is finite. Since 1 is closed, it follows from Theorem 54.16(2) that all subgroups of G are closed, because they are finite subgroups of G. Hence the priming mapping is a one-to-one correspondence between the set of all intermediate fields of E/K and the set of all subgroups of G. Theorem 54.16 tells us that the relative dimension M:L of two intermediate fields L ⊆ M is equal to the relative index L´:M´ of the corresponding subgroups of G and that the relative index J:H of two subgroups H ≤ J of G is equal to the relative dimension H´:J´ of the corresponding intermediate fields.
54.18 Examples: (a) Let ³√2 be the real cube root of 2, let ω be a primitive
third root of unity, and consider the extension ℚ(³√2,ω) over ℚ. The
ℚ-automorphisms of ℚ(³√2,ω) are σ₁, σ₂, σ₃, σ₄, σ₅, σ₆, where
σ₁: ³√2 → ³√2,   ω → ω,
σ₂: ³√2 → ³√2,   ω → ω² = −1 − ω,
σ₃: ³√2 → ³√2ω,  ω → ω,
σ₄: ³√2 → ³√2ω,  ω → ω² = −1 − ω,
σ₅: ³√2 → ³√2ω² = ³√2(−1 − ω),  ω → ω,
σ₆: ³√2 → ³√2ω² = ³√2(−1 − ω),  ω → ω² = −1 − ω.
Any element u of ℚ(³√2,ω) can be written uniquely in the form
u = a + b·³√2 + c·³√4 + dω + e·³√2ω + f·³√4ω,
where a,b,c,d,e,f are rational numbers. We show that ℚ(³√2,ω) is Galois
over ℚ. To this end, we have to show that the fixed field of G is exactly
ℚ. Since
(a + b·³√2 + c·³√4 + dω + e·³√2ω + f·³√4ω)σ₂
= a + b·³√2 + c·³√4 + (d + e·³√2 + f·³√4)ω²
= a + b·³√2 + c·³√4 + (d + e·³√2 + f·³√4)(−1 − ω)
= (a − d) + (b − e)·³√2 + (c − f)·³√4 − dω − e·³√2ω − f·³√4ω,
we see that an element u = a + b·³√2 + c·³√4 + dω + e·³√2ω + f·³√4ω of ℚ(³√2,ω)
is fixed by σ₂ if and only if
a = a − d,  d = −d,
b = b − e,  e = −e,
c = c − f,  f = −f.
So an element u of ℚ(³√2,ω) fixed by σ₂ has the form a + b·³√2 + c·³√4. If u
is fixed also by σ₃, then a + b·³√2 + c·³√4 = (a + b·³√2 + c·³√4)σ₃
= a + b·³√2ω + c·³√4ω²
= a + b·³√2ω + c·³√4(−1 − ω)
= (a − c·³√4) + (b·³√2 − c·³√4)ω,
so, comparing coefficients, b = c = 0 and u = a ∈ ℚ. Hence the fixed field
of G is ℚ and ℚ(³√2,ω) is Galois over ℚ.
The multiplication table of G(ℚ(³√2,ω)/ℚ) can be constructed easily.
Since ³√2(σ₂σ₃) = (³√2σ₂)σ₃ = ³√2σ₃ = ³√2ω and ω(σ₂σ₃) = (ωσ₂)σ₃ = ω²σ₃ = ω²,
we have σ₂σ₃ = σ₄ etc., and the multiplication table of G(ℚ(³√2,ω)/ℚ) is

      σ₁  σ₂  σ₃  σ₄  σ₅  σ₆
  σ₁  σ₁  σ₂  σ₃  σ₄  σ₅  σ₆
  σ₂  σ₂  σ₁  σ₄  σ₃  σ₆  σ₅
  σ₃  σ₃  σ₆  σ₅  σ₂  σ₁  σ₄
  σ₄  σ₄  σ₅  σ₆  σ₁  σ₂  σ₃
  σ₅  σ₅  σ₄  σ₁  σ₆  σ₃  σ₂
  σ₆  σ₆  σ₃  σ₂  σ₅  σ₄  σ₁
So G(ℚ(³√2,ω)/ℚ) is a nonabelian group of order 6 and isomorphic to S₃,
as can be easily seen by comparing the table above with the multiplica-
tion table of S₃.
The isomorphism G(ℚ(³√2,ω)/ℚ) ≅ S₃ can be found in a better way by
observing that any automorphism in G(ℚ(³√2,ω)/ℚ) is completely deter-
mined by its effect on the roots of x³ − 2. The roots of x³ − 2 are u₁ = ³√2,
u₂ = ³√2ω, u₃ = ³√2ω². Now σ₂ maps u₁ to u₁, u₂ to u₃ and u₃ to u₂ and can
therefore be represented, in a readily understood extension of the nota-
tion for permutations, as (u₁ u₂ u₃ ↦ u₁ u₃ u₂) = (u₁)(u₂u₃) = (u₂u₃). Dropping u
and retaining only the indices, we see that σ₂ can be thought of as the
permutation (23) in S₃. The other σⱼ can be thought of as permutations in
S₃ in a similar way and this gives the isomorphism G(ℚ(³√2,ω)/ℚ) ≅ S₃.
In the multiplication tables above, σⱼ and its image in S₃ under this iso-
morphism occupy corresponding places.
The subgroups of S₃ are

  {ι,(23)}  {ι,(13)}  {ι,(12)}
  {ι,(123),(132)} = A₃
  S₃

So the subgroups of G(ℚ(³√2,ω)/ℚ) are

  {σ₁,σ₂}  {σ₁,σ₆}  {σ₁,σ₄}
  {σ₁,σ₃,σ₅}
  G(ℚ(³√2,ω)/ℚ)

and the corresponding intermediate fields are

  ℚ(³√2,ω)
  ℚ(³√2)  ℚ(³√2ω)  ℚ(³√2ω²)
  ℚ(ω)
  ℚ
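The verifications above are easy to mechanize. The following sketch is my own illustration (not part of the text) and assumes the SymPy library: it encodes the images of σ₂,…,σ₆ on the roots u₁,u₂,u₃ as 0-indexed permutations and checks the table entry σ₂σ₃ = σ₄ together with the order and nonabelianness of the group. Note that SymPy composes permutations left to right, matching the book's right-operator convention a(στ) = (aσ)τ.

```python
from sympy.combinatorics import Permutation, PermutationGroup

# Images of the sigma_j on the roots u1, u2, u3 of x^3 - 2,
# written 0-indexed: the book's (23) becomes [0, 2, 1], etc.
s2 = Permutation([0, 2, 1])   # (23)
s3 = Permutation([1, 2, 0])   # (123)
s4 = Permutation([1, 0, 2])   # (12)
s5 = Permutation([2, 0, 1])   # (132)
s6 = Permutation([2, 1, 0])   # (13)

# SymPy's p*q applies p first, then q -- the book's convention.
assert s2 * s3 == s4          # row sigma_2, column sigma_3 of the table

G = PermutationGroup([s2, s3])
assert G.order() == 6         # |G| = 6
assert not G.is_abelian       # G is the nonabelian group S_3
```

The two generators σ₂ ↦ (23) and σ₃ ↦ (123) already generate the whole group, which is why the table needs only these two images to be checked.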
(b) Let ⁴√2 be the real fourth root of 2 and consider the extension
ℚ(⁴√2,i) over ℚ. The ℚ-automorphisms of ℚ(⁴√2,i) are σ₁, σ₂, σ₃, σ₄, σ₅, σ₆, σ₇, σ₈,
where
σ₁: ⁴√2 → ⁴√2,    i → i,
σ₂: ⁴√2 → ⁴√2,    i → −i,
σ₃: ⁴√2 → ⁴√2·i,  i → i,
σ₄: ⁴√2 → ⁴√2·i,  i → −i,
σ₅: ⁴√2 → −⁴√2,   i → i,
σ₆: ⁴√2 → −⁴√2,   i → −i,
σ₇: ⁴√2 → −⁴√2·i, i → i,
σ₈: ⁴√2 → −⁴√2·i, i → −i.
We put σ₂ = τ and σ₃ = σ. Then o(τ) = 2, o(σ) = 4 and στ = τσ⁻¹. Thus
G(ℚ(⁴√2,i)/ℚ) is a dihedral group of order 8. Since any automorphism in
G(ℚ(⁴√2,i)/ℚ) is completely determined by its effect on the four roots
u₁ = ⁴√2, u₂ = ⁴√2·i, u₃ = −⁴√2, u₄ = −⁴√2·i of x⁴ − 2, the group G(ℚ(⁴√2,i)/ℚ) is
isomorphic to a subgroup of S₄. We see σ = (u₁ u₂ u₃ u₄ ↦ u₂ u₃ u₄ u₁) = (u₁u₂u₃u₄)
and τ = (u₁ u₂ u₃ u₄ ↦ u₁ u₄ u₃ u₂) = (u₂u₄). So G(ℚ(⁴√2,i)/ℚ) ≅ ⟨(24),(1234)⟩ =
{ι,(13),(24),(12)(34),(13)(24),(14)(23),(1234),(1432)} ⊆ S₄ by an
isomorphism with σ₂ ↦ (24), σ₃ ↦ (1234). The subgroups of
G(ℚ(⁴√2,i)/ℚ) are

  1
  {1,τ}  {1,σ²τ}  {1,σ²}  {1,στ}  {1,σ³τ}
  {1,σ²,τ,σ²τ}  {1,σ,σ²,σ³}  {1,σ²,στ,σ³τ}
  G(ℚ(⁴√2,i)/ℚ)
Let us find the intermediate field of ℚ(⁴√2,i)/ℚ corresponding to {1,σ²τ}.
We write u = ⁴√2 for brevity. We have u(σ²τ) = ((uσ)σ)τ = ((ui)σ)τ =
((uσ)(iσ))τ = ((ui)i)τ = (−u)τ = −u and i(σ²τ) = ((iσ)σ)τ = (iσ)τ = iτ = −i. Now let
a,b,c,d,e,f,g,h ∈ ℚ and s = a + bu + cu² + du³ + ei + fui + gu²i + hu³i. Then
s(σ²τ) = (a + bu + cu² + du³ + ei + fui + gu²i + hu³i)(σ²τ)
= a + b(−u) + c(−u)² + d(−u)³ + e(−i) + f(−u)(−i) + g(−u)²(−i) + h(−u)³(−i)
= a − bu + cu² − du³ − ei + fui − gu²i + hu³i
and so s is fixed under σ²τ if and only if
a = a,  b = −b,  c = c,  d = −d,
e = −e,  f = f,  g = −g,  h = h,
so if and only if b = d = e = g = 0,
so if and only if s = a + cu² + fui + hu³i = a + f(ui) − c(ui)² − h(ui)³,
so if and only if s ∈ ℚ(ui).
Thus the intermediate field of ℚ(⁴√2,i)/ℚ corresponding to {1,σ²τ} is
{1,σ²τ}´ = ℚ(ui) = ℚ(⁴√2·i). Similar computations yield that the Galois
correspondence is as in the diagram below, where intermediate fields
occupy the same relative position as the corresponding subgroups.
  ℚ(⁴√2,i)
  ℚ(⁴√2)  ℚ(⁴√2·i)  ℚ(√2,i)  ℚ(⁴√2(1+i))  ℚ(⁴√2(1−i))
  ℚ(√2)  ℚ(i)  ℚ(√2·i)
  ℚ
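The dihedral structure claimed above can be confirmed from the two permutation images alone. The sketch below is my own illustration (assuming SymPy): σ ↦ (1234) and τ ↦ (24), written 0-indexed, generate a group of order 8 satisfying the dihedral relation στ = τσ⁻¹.

```python
from sympy.combinatorics import Permutation, PermutationGroup

# 0-indexed images on the roots u1..u4 of x^4 - 2:
# the book's sigma = (1234) becomes [1,2,3,0]; tau = (24) becomes [0,3,2,1].
sigma = Permutation([1, 2, 3, 0])
tau = Permutation([0, 3, 2, 1])
identity = Permutation([0, 1, 2, 3])

G = PermutationGroup([sigma, tau])
assert G.order() == 8                     # dihedral group of order 8
assert tau**2 == identity and sigma**4 == identity
assert sigma * tau == tau * sigma**-1     # the dihedral relation
```

Since the relation and the orders o(τ) = 2, o(σ) = 4 hold, the group generated really is dihedral of order 8, as the text asserts.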
(c) Let p be a prime number and n ∈ ℕ. We consider the extension
𝔽_{p^n}/𝔽_p. The mapping φ: 𝔽_{p^n} → 𝔽_{p^n}, a ↦ aᵖ, is an 𝔽_p-automorphism
of 𝔽_{p^n} (the Frobenius automorphism).
We put G = Aut_{𝔽_p} 𝔽_{p^n}. We want to show G = ⟨φ⟩. First we prove o(φ) = n.
From aφⁿ = a^{p^n} = a for all a ∈ 𝔽_{p^n} (Lemma 52.4(2) or Theorem 52.8), we
get φⁿ = 1, so o(φ) | n. On the other hand, if m is a positive proper divisor
of n, then 𝔽_{p^n} has a proper subfield 𝔽_{p^m} with p^m elements (Theorem 52.8)
and there is a b ∈ 𝔽_{p^n}\𝔽_{p^m} with bφᵐ = b^{p^m} ≠ b, so φᵐ ≠ 1. So we conclude
o(φ) = n. Since |𝔽_{p^n}:𝔽_p| is finite, we get
n = o(φ) = |⟨φ⟩| ≤ |G| = |G:1| = |(𝔽_p)´:(𝔽_{p^n})´| ≤ |𝔽_{p^n}:𝔽_p| = n
from Lemma 54.14, so |⟨φ⟩| = |G| = n and G = ⟨φ⟩.
Under the Galois correspondence, the subgroup ⟨φᵐ⟩ of G = ⟨φ⟩, of order
n/m, corresponds to the intermediate field 𝔽_{p^m} of 𝔽_{p^n}/𝔽_p; in particular,
1 corresponds to 𝔽_{p^n} and G = ⟨φ⟩ corresponds to 𝔽_p.
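The claims o(φ) = n and "the fixed field of ⟨φ⟩ is 𝔽_p" can be tested on a concrete finite field. Below is a small hand-rolled model of 𝔽₈ = 𝔽₂[x]/(x³ + x + 1) — my own illustration, where the modulus x³ + x + 1 is one arbitrary irreducible choice — on which the Frobenius a ↦ a² is iterated.

```python
# Elements of F_8 are the integers 0..7, read as bit vectors of
# polynomial coefficients over F_2.

def gf8_mul(a, b):
    """Multiply two elements of F_8: carry-free product, then reduce."""
    r = 0
    for i in range(3):
        if (b >> i) & 1:
            r ^= a << i              # polynomial multiplication over F_2
    for i in (4, 3):
        if (r >> i) & 1:
            r ^= 0b1011 << (i - 3)   # reduce modulo x^3 + x + 1
    return r

def frob(a):
    """The Frobenius automorphism a -> a^p with p = 2."""
    return gf8_mul(a, a)

# o(phi) = n = 3: phi^3 is the identity, while phi and phi^2 are not.
assert all(frob(frob(frob(a))) == a for a in range(8))
assert any(frob(a) != a for a in range(8))
assert any(frob(frob(a)) != a for a in range(8))

# The fixed field of <phi> is exactly the prime field F_2 = {0, 1}.
assert {a for a in range(8) if frob(a) == a} == {0, 1}
```

The same experiment with any other n and p illustrates the proof's two halves: φⁿ = 1 because a^{pⁿ} = a, while φᵐ moves some element outside the subfield 𝔽_{pᵐ} for every proper divisor m of n.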
them. However, it is in general more difficult to find all intermediate
fields of an extension, for it is likely that one overlooks some of them.
Also, it is more difficult to avoid duplications. For instance, in Example
54.18(b), it is not immediately clear where ℚ(⁴√2(1+i)) and ℚ(⁴√2(1−i))
are, nor whether ℚ(⁴√2(1+i)) = ℚ(⁴√2(1−i)). It is far easier to list the
subgroups than to list the intermediate fields.
54.19 Definition: Let E/K be a field extension and let G = AutK E be its
Galois group. An intermediate field L of this extension is said to be
stable relative to K and E, or to be (K,E)-stable, if every K-automorphism
σ ∈ AutK E of E maps L into L.
Proof: (1) We are to prove that σ⁻¹λσ ∈ L´ for all λ ∈ L´ and σ ∈ AutK E.
Thus we must show that a(σ⁻¹λσ) = a for all a ∈ L. Indeed, if a ∈ L, λ ∈ L´
and σ ∈ AutK E, then aσ⁻¹ ∈ L since L is (K,E)-stable, so (aσ⁻¹)λ = aσ⁻¹, so
a(σ⁻¹λσ) = ((aσ⁻¹)λ)σ = (aσ⁻¹)σ = a. Hence L´ ⊴ AutK E.
Proof: For any a ∈ L\K, we must find a τ ∈ AutK L such that aτ ≠ a. Since
E is Galois over K, there is a σ ∈ AutK E such that aσ ≠ a. Then σ|L ∈ AutK L
by stability of L relative to K and E. Thus σ|L can be taken as τ.
The next theorem is a kind of converse to Theorem 54.21. The result is
not necessarily true without the hypothesis that L is algebraic (cf. Ex. 8).
54.25 Theorem: Let E/K be a finite dimensional Galois extension of
fields and G = AutK E. Let L be an intermediate field of E/K.
(1) E is Galois over L.
(2) L is Galois over K if and only if L´ = AutL E is normal in G = AutK E. In
this case, G/L´ = (AutK E)/(AutL E) is isomorphic to the Galois group AutK L
of L over K. Thus G(E/K)/G(E/L) ≅ G(L/K).
Proof: (1) In order to show that E is Galois over L, we must prove that
L = L´´, that is, that L is closed. This follows from the fundamental theorem.
54.26 Theorem: Let 𝔽_q be a field of q elements and E a finite dimen-
sional extension of 𝔽_q. Then E is Galois over 𝔽_q and Aut_{𝔽_q} E is cyclic,
generated by the automorphism ψ, where ψ: a ↦ a^q for all a ∈ E.
Proof: Let |E:𝔽_q| = r and char 𝔽_q = p, so that 𝔽_p is the prime subfield of 𝔽_q
(and of E). We have q = p^m, where m = |𝔽_q:𝔽_p|. We consider the extension
E/𝔽_p. Since E is an r-dimensional vector space over 𝔽_q and 𝔽_q is an m-
dimensional vector space over 𝔽_p, Theorem 48.13 says E is an rm-dimen-
sional vector space over 𝔽_p and so |E| = p^{rm}. Thus E is a finite field and E
is Galois over 𝔽_p (Example 54.18(c)). Then E is Galois over any
intermediate field of E/𝔽_p (Theorem 54.25(1)); in particular, E is Galois
over 𝔽_q. Furthermore, we know from Example 54.18(c) that Aut_{𝔽_p} E = ⟨φ⟩,
where φ is the automorphism a ↦ aᵖ for all a ∈ E, and that the subgroup
(𝔽_q)´ corresponding to the intermediate field 𝔽_q with p^m elements is ⟨φᵐ⟩.
Thus Aut_{𝔽_q} E = (𝔽_q)´ = ⟨ψ⟩, where ψ = φᵐ is the mapping a ↦ a^{p^m} = a^q
for all a ∈ E.
Exercises
1. Find the Galois group AutK E and all its subgroups and describe the
Galois correspondence between the subgroups of AutK E and the inter-
mediate fields of E/K when
(a) E = ℚ(√2,√3) and K = ℚ;
(b) E = ℚ(³√2,³√5) and K = ℚ, K = ℚ(³√2);
(c) E = ℚ(³√2,⁴√3,i) and K = ℚ(i), K = ℚ(i,⁴√3);
(d) E = ℚ(³√2,i) and K = ℚ(i);
(e) E = ℚ(³√2,√5) and K = ℚ, K = ℚ(³√2).
3. Let E/K be a field extension and G = AutK E. Let L, M be intermediate
fields of E/K and let H,J be subgroups of G. Prove that ⟨H ∪ J⟩´ = H´ ∩ J´
and (LM)´ = L´ ∩ M´.
§55
Separable Extensions
Thus all the deg f(x) roots of a polynomial f(x) separable over K are
distinct and f(x) splits into distinct linear factors in any splitting field of
f(x) over K.
How can an irreducible polynomial f(x) have a zero derivative? Now f(x)
is not 0 or a unit because of irreducibility, so deg f(x) =: m ≥ 1. Let f(x) =
aₘxᵐ + . . . + a₁x + a₀. The condition iaᵢ = 0 is equivalent to p | i in case
char K = p. So for terms aᵢxⁱ with aᵢ ≠ 0, we have i = pj for some j and we
may write f(x) = Σⱼ₌₀^{[m/p]} a_{pj}x^{pj}.
The polynomial x² + 1 ∈ ℚ[x] is separable over ℚ, because it is irreduc-
ible over ℚ and char ℚ = 0. On the other hand, x² + 1 ∈ 𝔽₂[x] is not separ-
able over 𝔽₂ because x² + 1 = (x + 1)² is not even irreducible over 𝔽₂.
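These two examples can be checked mechanically. The sketch below is my own illustration (assuming SymPy): over ℚ the derivative criterion gives a trivial gcd, while over 𝔽₂ the derivative of x² + 1 vanishes identically and the polynomial is a square.

```python
from sympy import Poly, symbols

x = symbols('x')

# Over Q: x^2 + 1 is irreducible with nonzero derivative, hence separable.
f_q = Poly(x**2 + 1, x)
assert f_q.gcd(f_q.diff(x)).degree() == 0   # no repeated roots

# Over F_2: the derivative 2x vanishes identically, and
# x^2 + 1 = (x + 1)^2 has the repeated root 1.
f_2 = Poly(x**2 + 1, x, modulus=2)
assert f_2.diff(x).is_zero
assert f_2 == Poly(x + 1, x, modulus=2)**2
```

The gcd test gcd(f, f´) is exactly the multiple-root criterion this paragraph develops: a nontrivial gcd (or a vanishing derivative of an irreducible polynomial) signals inseparability.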
Proof: Lemma 50.5 shows that a is algebraic over L. Let f(x) be the
minimal polynomial of a over K and g(x) the minimal polynomial of a
over L. By Lemma 50.5, g(x) is a divisor of f(x). Thus any root of g(x) is a
root of f(x). Since a is separable over K, the roots of f(x) are all simple,
hence, all the more so, the roots of g(x) are all simple and a is separable
over L.
The converse of Lemma 55.6 is also true and will be proved later in this
paragraph (Theorem 55.19). Our next goal is to characterize Galois
extensions as splitting fields of separable polynomials.
We must now show that there is a polynomial g(x) in K[x] such that E is a
splitting field of g(x) over K. Let {a₁,a₂, . . . ,aₘ} be a K-basis of E and let
fᵢ(x) ∈ K[x] be the minimal polynomial of aᵢ over K (i = 1,2, . . . ,m). We put
g(x) = f₁(x)f₂(x). . . fₘ(x) ∈ K[x]. From Theorem 54.22 again, we learn that
each fᵢ(x), hence also g(x), splits in E. Moreover, g(x) cannot split in any
proper subfield L of E containing K, for if L is an intermediate field of E/K
and g(x) splits in L, then L contains all roots of g(x), hence L contains
a₁,a₂, . . . ,aₘ and we have
E = s_K(a₁,a₂, . . . ,aₘ) ⊆ K(a₁,a₂, . . . ,aₘ) ⊆ L,
so E = L. Thus E is indeed a splitting field of g(x) over K.
irreducible in K[x]. So cᵢfᵢ(x) is separable over K and consequently fᵢ(x) is
also separable over K.
Since, for any σ ∈ AutK E, there holds aσ = a for all a ∈ K₀, we see that
AutK E ⊆ AutK₀ E.
(i) In order to show that E is Galois over K₀, we have to find, for
each b ∈ E\K₀, an automorphism τ ∈ AutK₀ E such that bτ ≠ b. If b ∈ E\K₀,
then, by definition of K₀, there is a σ ∈ AutK E such that bσ ≠ b. From
AutK E ⊆ AutK₀ E, we see σ ∈ AutK₀ E and bσ ≠ b. Thus E is Galois over K₀.
group of AutK E. Hence AutK₀ E = K₀´ = ((AutK E)´)´ = (AutK E)´´ = AutK E.
Suppose now n ≥ 2 and suppose that |E₁:K₁| = |AutK₁ E₁| whenever E₁/K₁ is
a finite dimensional extension with 1 ≤ |E₁:K₁| < n such that E₁ is a split-
ting field of a polynomial in K₁[x] whose irreducible factors (in K₁[x]) are
separable over K₁.
Now E is a splitting field of g(x) ∈ L[x] over L (Example 53.5(e)) and the
irreducible factors (in L[x]) of g(x), being divisors of the fᵢ(x), have no
multiple roots and are therefore separable over L. Since |E:L| = n/r < n,
we get |E:L| = |AutL E| = |L´| by induction.
A: {L´σ : σ ∈ AutK E} → {a₁,a₂, . . . ,aᵣ}
        L´σ ↦ aσ
(σ ∈ AutK E; we know aσ ∈ E is a root of f₁(x) from Lemma 54.5). This
mapping A is well defined, for if L´σ = L´τ, then
στ⁻¹ ∈ L´,
στ⁻¹ fixes each element of L = K(a),
στ⁻¹ fixes a,
a(στ⁻¹) = a,
(aσ)τ⁻¹ = a,
aσ = aτ,
(L´σ)A = (L´τ)A,
so A is well defined and, reading the lines backwards, we see that A is
one-to-one as well. It remains to show that A is onto. Indeed, if aᵢ is any
root of f₁(x) in E, then there is a field homomorphism λᵢ: K(a) → K(aᵢ)
mapping a to aᵢ and fixing each element of K (Theorem 53.1) and λᵢ can
be extended to a K-automorphism σᵢ: E → E (Theorem 53.7). Then A
sends the coset L´σᵢ to aσᵢ = aλᵢ = aᵢ. Hence A is onto. This gives
|AutK E:L´| = |{a₁,a₂, . . . ,aᵣ}| = r. The proof is complete.
Proof: (1) ⇒ (2) Assume that g(x) ∈ K[x] is irreducible over K and that
g(x) has a root u ∈ E. We want to show that all irreducible factors of g(x)
in E[x] have degree one. Suppose, on the contrary, that h(x) ∈ E[x] is an
irreducible (over E) factor of g(x) with deg h(x) = n > 1. We adjoin a root
t of h(x) to E and thereby construct the field E(t).
Now u and t are roots of the irreducible polynomial g(x) in K[x], so there
is a K-isomorphism σ: K(u) → K(t) (Theorem 53.2). Since E is a splitting
field of f(x) over K(u) and E(t) is a splitting field of f(x) over K(t)
(Example 53.5(e)), the K-isomorphism σ can be extended to a K-
isomorphism τ: E → E(t) (Theorem 53.7). But then |E:K| = |E(t):K| =
|E(t):E||E:K| = n|E:K| > |E:K|, a contradiction. Thus all irreducible factors of
g(x) in E[x] have degree one and g(x) splits in E.
(2) ⇒ (1) Suppose now that any irreducible polynomial in K[x] splits in E
whenever it has a root in E. Let {a₁,a₂, . . . ,aₘ} be a K-basis of E and let
fᵢ(x) ∈ K[x] be the minimal polynomial of aᵢ over K. We put f(x) =
f₁(x)f₂(x). . . fₘ(x). We claim E is a splitting field of f(x) over K.
Proof: Let {a₁,a₂, . . . ,aₘ} be a K-basis of E and let fᵢ(x) ∈ K[x] be the
minimal polynomial of aᵢ over K. We put f(x) = f₁(x)f₂(x). . . fₘ(x) ∈ K[x].
Let N be a splitting field of f(x) over E, with |N:E| finite (Theorem 53.6).
We claim N has the properties stated above. Since |N:E| and |E:K| are both
finite, |N:K| is finite. This proves (iii).
55.12 Definition: Let E/K be a finite dimensional field extension. An
extension field N of E as in Theorem 55.11 is called a normal closure of E
over K.
Our next topic is the so-called primitive element theorem which states
that a finitely generated separable extension is in fact a simple exten-
sion. This theorem is due to Abel, but the first complete proof was given
by Galois. The elements of a finitely generated separable extension can
Let a = a₁,a₂, . . . ,aₙ ∈ N be the roots of f(x) and b = b₁,b₂, . . . ,bₘ ∈ N be the
roots of g(x). Since E is separable over K, a and b are separable over K, so
f(x) and g(x) are separable over K, so aᵢ ≠ aⱼ when i ≠ j (i,j = 1,2, . . . ,n) and
bₖ ≠ bₗ when k ≠ l (k,l = 1,2, . . . ,m).
then K(a₁,a₂, . . . ,aₘ₋₁) = K(c₁) for some c₁ and therefore we have
K(a₁,a₂, . . . ,aₘ) = K(a₁,a₂, . . . ,aₘ₋₁)(aₘ) = K(c₁)(aₘ) = K(c₁,aₘ) = K(c) for some c ∈ E.
f(x), there is only a finite number of possibilities for g(x) and there are
only a finite number of intermediate fields L.
polynomial xᵖ − aᵖ ∈ K(aᵖ)[x], we have g(x) | xᵖ − aᵖ in K(aᵖ)[x]. Therefore
g(x) | xᵖ − aᵖ and g(x) | (x − a)ᵖ in E[x]. So g(x) = (x − a)ᵐ for some m such
that 1 ≤ m ≤ p. Since g(x) has no multiple roots, we get m = 1. Then
g(x) = x − a ∈ K(aᵖ)[x] and consequently a ∈ K(aᵖ). This gives K(a) ⊆ K(aᵖ)
and, since K(aᵖ) ⊆ K(a) in any case, we obtain K(a) = K(aᵖ).
Conversely, suppose that K(a) = K(aᵖ). We want to show that a is separ-
able over K. Let f(x) be the minimal polynomial of a over K. If a is not
separable over K, then f(x) has the form f(x) = g(xᵖ) for some g(x) ∈ K[x].
Here g(x) is irreducible over K because g(x) is not a unit in K[x] (for f(x),
being irreducible over K, is not a unit in K[x]) and any factorization g(x) =
r(x)s(x) with deg r(x) ≠ 0 ≠ deg s(x) would give a proper factorization
f(x) = r(xᵖ)s(xᵖ) with deg r(xᵖ) ≠ 0 ≠ deg s(xᵖ), contrary to the irre-
ducibility of f(x) over K. Clearly g(x) is a monic polynomial and, since 0 =
f(a) = g(aᵖ), we see that aᵖ is a root of g(x). Thus g(x) is the minimal
polynomial of aᵖ over K (Theorem 50.3). Of course deg f(x) = p·deg g(x)
and
|K(a):K| = deg f(x) = p(deg g(x)) > deg g(x) = |K(aᵖ):K|.
Hence K(aᵖ) is a proper subspace of the K-vector space K(a) (Lemma
42.15(2)), contrary to the hypothesis K(a) = K(aᵖ). Consequently, K(a) =
K(aᵖ) implies that a is separable over K.
uᵢ = k₁t₁ + k₂t₂ + . . . + kₙtₙ for some kⱼ ∈ K,
uᵢᵖ = k₁ᵖt₁ᵖ + k₂ᵖt₂ᵖ + . . . + kₙᵖtₙᵖ with kⱼᵖ ∈ K,
uᵢᵖ ∈ s_K(t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ).
Since {t₁,t₂, . . . ,tₙ} is a K-basis of E, there are elements cᵢⱼₖ in K with
tᵢtⱼ = Σₖ₌₁ⁿ cᵢⱼₖtₖ and so
tᵢᵖtⱼᵖ = Σₖ₌₁ⁿ cᵢⱼₖᵖtₖᵖ ∈ s_K(t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ) = L. Thus L is a subring of E.
Since L contains K and {t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ}, and L is contained in the ring
K[t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ], we get L = K[t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ]. Now for each i = 2, . . . ,n, the
element tᵢᵖ is algebraic over K, so algebraic over K(t₁ᵖ, . . . ,tᵢ₋₁ᵖ) and so
K(t₁ᵖ, . . . ,tᵢ₋₁ᵖ)[tᵢᵖ] = K(t₁ᵖ, . . . ,tᵢ₋₁ᵖ)(tᵢᵖ) (Theorem 50.6) and repeated
application of Lemma 49.6(2), Lemma 49.6(3) gives L = K[t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ]
= K(t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ). Thus L = K(t₁ᵖ,t₂ᵖ, . . . ,tₙᵖ) and L is in fact a field.
over K. The case char K = 0 being trivial, we may assume char K = p ≠ 0.
Let n = |K(a):K|. Then {1,a,a², . . . ,aⁿ⁻¹} is a K-basis of K(a) (Theorem 50.7).
Likewise {1,aᵖ,(aᵖ)², . . . ,(aᵖ)ᵐ⁻¹} is a K-basis of K(aᵖ), where m = |K(aᵖ):K|.
Since a is separable over K, we have K(aᵖ) = K(a) (Lemma 55.16) and m =
|K(aᵖ):K| = |K(a):K| = n. Thus {1,aᵖ,(aᵖ)², . . . ,(aᵖ)ⁿ⁻¹} = {1ᵖ,(a)ᵖ,(a²)ᵖ, . . .
,(aⁿ⁻¹)ᵖ} is also a K-basis of K(a). Thus K(a) is separable over K by Lemma
55.17.
55.19 Theorem: Let E/K be a finite dimensional field extension and let
L be an intermediate field of E/K. Then E is separable over K if and only
if E is separable over L and L is separable over K.
Thus in case char K = p ≠ 0, K is a perfect field if and only if the field
homomorphism φ: K → K, u ↦ uᵖ, is onto K. Then for each a ∈ K, there is a
unique b ∈ K such that a = bᵖ, for φ is one-to-one. This unique b will be
denoted by ᵖ√a.
Suppose first that K is perfect. Now, if f(x) ∈ K[x] is not separable over K,
then f(x) = g(xᵖ) for some g(x) ∈ K[x], say g(x) = Σᵢ₌₀ᵐ aᵢxⁱ, and
f(x) = g(xᵖ) = Σᵢ₌₀ᵐ aᵢx^{ip} = Σᵢ₌₀ᵐ (ᵖ√aᵢ)ᵖx^{ip} = (Σᵢ₌₀ᵐ ᵖ√aᵢ xⁱ)ᵖ,
Consequently every algebraic extension of a perfect field K is separable
over K. Theorem 55.21 yields the corollary that every algebraically
closed field is perfect, since any irreducible polynomial in an algebraic-
ally closed field is of first degree and has therefore no multiple roots (is
separable over that field).
Exercises
1. Find a normal closure of ℚ(⁵√3,√7) over ℚ.
3. Let E/K be a field extension with |E:K| = 3 and assume that E is not
normal over K. Let N be a normal closure of E over K. Show that |N:K| = 6
and that there is a unique intermediate field L of N/K satisfying |L:K| = 2.
4. Let N/K be a field extension and assume that N is normal over K. Let L
be an intermediate field of N/K. Prove that L is normal over K if and
only if L is (K,N)-stable.
10. Let p be a prime number and x,y two distinct indeterminates over
𝔽_p. Let E = 𝔽_p(x,y) and K = 𝔽_p(xᵖ,yᵖ). Show that E is not a simple extension
of K and find infinitely many intermediate fields of E/K.
13. Prove that Theorem 55.19 is valid without the hypothesis that E be
finite dimensional over K. (Hint: Reduce the general case to the finite
dimensional case.)
§56
Galois Group of a Polynomial
56.1 Lemma: Let K be a field and f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + . . . + a₁x + a₀,
g(x) = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + . . . + b₁x + b₀ be nonzero polynomials in K[x]\K.
Assume that at least one of aₙ,bₘ is distinct from 0. Then f(x),g(x) have a
nonunit greatest common divisor in K[x] if and only if there are nonzero
polynomials g₁(x),f₁(x) ∈ K[x] such that
f(x)g₁(x) = g(x)f₁(x) and deg f₁(x) < n, deg g₁(x) < m.
Proof: One direction is clear. If f(x) and g(x) have a nonunit greatest
common divisor h(x) in K[x], then f(x) = h(x)f₁(x), g(x) = h(x)g₁(x) with
some suitable f₁(x),g₁(x) in K[x] and
deg f₁(x) = deg f(x) − deg h(x) ≤ n − deg h(x) < n
since deg h(x) is greater than zero. Likewise deg g₁(x) < m. We have of
course f(x)g₁(x) = f₁(x)h(x)g₁(x) = f₁(x)g(x).
two polynomials in K[x], where aₙ ≠ 0 or bₘ ≠ 0, so that deg f(x) = n or
deg g(x) = m. From Lemma 56.1, we know that f(x) and g(x) have a
nonunit greatest common divisor in K[x] if and only if there are elements
cₘ₋₁,cₘ₋₂, . . . ,c₁,c₀,dₙ₋₁,dₙ₋₂, . . . ,d₁,d₀, where at least one cᵢ ≠ 0 and at least
one dⱼ ≠ 0, such that
aₙcₘ₋₁ = bₘdₙ₋₁
aₙcₘ₋₂ + aₙ₋₁cₘ₋₁ = bₘdₙ₋₂ + bₘ₋₁dₙ₋₁
aₙcₘ₋₃ + aₙ₋₁cₘ₋₂ + aₙ₋₂cₘ₋₁ = bₘdₙ₋₃ + bₘ₋₁dₙ₋₂ + bₘ₋₂dₙ₋₁
........................
a₁c₀ + a₀c₁ = b₁d₀ + b₀d₁
a₀c₀ = b₀d₀.
We can write this system as
aₙcₘ₋₁ − bₘdₙ₋₁ = 0
aₙcₘ₋₂ + aₙ₋₁cₘ₋₁ − bₘdₙ₋₂ − bₘ₋₁dₙ₋₁ = 0
aₙcₘ₋₃ + aₙ₋₁cₘ₋₂ + aₙ₋₂cₘ₋₁ − bₘdₙ₋₃ − bₘ₋₁dₙ₋₂ − bₘ₋₂dₙ₋₁ = 0
........................
a₁c₀ + a₀c₁ − b₁d₀ − b₀d₁ = 0
a₀c₀ − b₀d₀ = 0
or as
aₙcₘ₋₁ − bₘdₙ₋₁ = 0
aₙ₋₁cₘ₋₁ + aₙcₘ₋₂ − bₘ₋₁dₙ₋₁ − bₘdₙ₋₂ = 0
aₙ₋₂cₘ₋₁ + aₙ₋₁cₘ₋₂ + aₙcₘ₋₃ − bₘ₋₂dₙ₋₁ − bₘ₋₁dₙ₋₂ − bₘdₙ₋₃ = 0
........................
a₁cₘ₋₁ + a₂cₘ₋₂ + a₃cₘ₋₃ + . . . − b₁dₙ₋₁ − b₂dₙ₋₂ − . . . = 0
a₀cₘ₋₁ + a₁cₘ₋₂ + a₂cₘ₋₃ + . . . − b₀dₙ₋₁ − b₁dₙ₋₂ − . . . = 0
a₀cₘ₋₂ + a₁cₘ₋₃ + . . . − b₀dₙ₋₂ − . . . = 0
........................
a₀c₁ + a₁c₀ − b₀d₁ − b₁d₀ = 0
a₀c₀ − b₀d₀ = 0
We write this system in matrix form:

       m columns                      n columns
  | aₙ    0    0   . . . 0    −bₘ    0    0   . . . 0  |   | cₘ₋₁ |   | 0 |
  | aₙ₋₁  aₙ   0   . . . 0    −bₘ₋₁ −bₘ   0   . . . 0  |   | cₘ₋₂ |   | 0 |
  | aₙ₋₂  aₙ₋₁ aₙ  . . . 0    −bₘ₋₂ −bₘ₋₁ −bₘ . . . 0  | · | cₘ₋₃ | = | 0 |
  | . . . . . . . . . . . . . . . . . . . . . . . . .  |   | . . .|   |. . |
  | 0     0    0   . . . a₀    0     0    0   . . . −b₀|   | d₀   |   | 0 |

Let A denote the matrix of this system. Then the polynomials f(x),g(x)
have a nonunit greatest common divisor if and only if the matrix
equation AX = 0 has a solution
X = (cₘ₋₁, cₘ₋₂, . . . , c₁, c₀, dₙ₋₁, dₙ₋₂, . . . , d₁, d₀)ᵗ
in which at least one cᵢ ≠ 0 and at least one dⱼ ≠ 0. From the equation (*)
and the fact that K[x] has no zero divisors, we deduce that, in a solution
X = (cₘ₋₁, . . . , d₀)ᵗ of AX = 0, there is at least one cᵢ ≠ 0 if and only if
there is at least one dⱼ ≠ 0. Thus the polynomials f(x),g(x) have a nonunit
greatest common divisor if and only if the matrix equation AX = 0 has a
nontrivial solution. This is the case if and only if det A = 0 (Theorem
45.3). Since det A = det Aᵗ, we get that f(x),g(x) have a nonunit greatest
common divisor if and only if det Aᵗ = 0. We proved the
56.2 Theorem: Let K be a field and f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + . . . + a₁x + a₀,
g(x) = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + . . . + b₁x + b₀ be polynomials in K[x]\K, where at
least one of aₙ,bₘ is distinct from 0. Then f(x) and g(x) have a nonunit
greatest common divisor in K[x] if and only if the determinant

  | aₙ  aₙ₋₁ . . .  a₁   a₀                      |
  |     aₙ   aₙ₋₁ . . .  a₁   a₀                 |  (m rows)
  |          . . . . . . . . . . . .             |
  |               aₙ  aₙ₋₁ . . .  a₁   a₀        |
  | bₘ  bₘ₋₁ . . .  b₁   b₀                      |
  |     bₘ   bₘ₋₁ . . .  b₁   b₀                 |  (n rows)
  |          . . . . . . . . . . . .             |
  |               bₘ  bₘ₋₁ . . .  b₁   b₀        |

(empty places are to be filled with zeroes) is equal to zero.
56.3 Definition: Let K be a field and f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + . . . + a₁x + a₀,
g(x) = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + . . . + b₁x + b₀ be polynomials in K[x]\K. The
determinant

  | aₙ  aₙ₋₁ . . .  a₁   a₀                      |
  |     aₙ   aₙ₋₁ . . .  a₁   a₀                 |  (m rows)
  |          . . . . . . . . . . . .             |
  |               aₙ  aₙ₋₁ . . .  a₁   a₀        |
  | bₘ  bₘ₋₁ . . .  b₁   b₀                      |
  |     bₘ   bₘ₋₁ . . .  b₁   b₀                 |  (n rows)
  |          . . . . . . . . . . . .             |
  |               bₘ  bₘ₋₁ . . .  b₁   b₀        |

(empty places are to be filled with zeroes) is called the resultant of f(x)
and g(x), and is denoted by R(f,g) or by R(f(x),g(x)).
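This determinant is the classical Sylvester determinant, and computer algebra systems compute the same quantity. The sketch below is my own illustration (assuming SymPy, whose `resultant` follows the same convention): it builds the matrix directly from the two coefficient lists and compares its determinant with the library's resultant; the polynomials f and g are arbitrary examples of mine.

```python
from sympy import Matrix, resultant, symbols

x = symbols('x')

def sylvester(fc, gc):
    """Resultant matrix of f (coeffs high-to-low, degree n) and
    g (degree m): m shifted rows of f's coefficients over n shifted
    rows of g's coefficients, as in Definition 56.3."""
    n, m = len(fc) - 1, len(gc) - 1
    rows = [[0] * i + list(fc) + [0] * (m - 1 - i) for i in range(m)]
    rows += [[0] * i + list(gc) + [0] * (n - 1 - i) for i in range(n)]
    return Matrix(rows)

f = x**3 + 2*x + 5          # coefficients [1, 0, 2, 5], n = 3
g = x**2 - 3*x + 1          # coefficients [1, -3, 1], m = 2
S = sylvester([1, 0, 2, 5], [1, -3, 1])
assert S.shape == (5, 5)    # (n + m) x (n + m)
assert S.det() == resultant(f, g, x) == 164
```

The value 164 can also be obtained from formula (4) below: with bₘ = 1 and mn even, R(f,g) = f(y₁)f(y₂) over the roots y₁,y₂ of g.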
then g(x) is obtained from G(x) by adding m − k initial terms bₘxᵐ,
bₘ₋₁xᵐ⁻¹, . . . , bₖ₊₁xᵏ⁺¹ with coefficient 0, and so R(f,g) = aₙᵐ⁻ᵏR(f,G).
56.2 Theorem: Let K be a field and f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + . . . + a₁x + a₀,
g(x) = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + . . . + b₁x + b₀ be polynomials in K[x]\K, where at
least one of aₙ,bₘ is distinct from 0. Then f(x) and g(x) have a nonunit
greatest common divisor in K[x] if and only if R(f,g) = 0.
(2) R(f,g) = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ).
(3) R(f,g) = aₙᵐ ∏ᵢ₌₁ⁿ g(uᵢ).
(4) R(f,g) = (−1)ᵐⁿ bₘⁿ ∏ⱼ₌₁ᵐ f(yⱼ).
Proof: We put
f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + . . . + a₁x + a₀,
g(x) = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + . . . + b₁x + b₀,
S = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ) ∈ L.
We have g(x) = bₘ ∏ⱼ₌₁ᵐ (x − yⱼ),
g(uᵢ) = bₘ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ),
∏ᵢ₌₁ⁿ g(uᵢ) = bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ),
and thus S = aₙᵐ ∏ᵢ₌₁ⁿ g(uᵢ). (i)
In like manner, from f(x) = aₙ ∏ᵢ₌₁ⁿ (x − uᵢ) = (−1)ⁿaₙ ∏ᵢ₌₁ⁿ (uᵢ − x), we get
f(yⱼ) = (−1)ⁿaₙ ∏ᵢ₌₁ⁿ (uᵢ − yⱼ),
∏ⱼ₌₁ᵐ f(yⱼ) = ∏ⱼ₌₁ᵐ ((−1)ⁿaₙ ∏ᵢ₌₁ⁿ (uᵢ − yⱼ)),
∏ⱼ₌₁ᵐ f(yⱼ) = (−1)ⁿᵐ aₙᵐ ∏ⱼ₌₁ᵐ ∏ᵢ₌₁ⁿ (uᵢ − yⱼ),
and thus S = (−1)ⁿᵐ bₘⁿ ∏ⱼ₌₁ᵐ f(yⱼ). (ii)
has the value R(f₀,g) = 0 when yⱼ is substituted for uᵢ. So R(f,g), regarded
as a polynomial in uᵢ, has the root yⱼ. So uᵢ − yⱼ divides R(f,g).
This is true for all i = 1,2, . . . ,n and for all j = 1,2, . . . ,m. Since any uᵢ − yⱼ is
irreducible, and uᵢ − yⱼ is distinct from u_{i´} − y_{j´} whenever (i,j) ≠ (i´,j´),
the polynomials uᵢ − yⱼ are pairwise relatively prime. Thus R(f,g) is
divisible by their product
∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ),
which, up to the factor aₙᵐ bₘⁿ, is
S = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ).
Let us write H = R(f,g)/S. Basically, we will argue that R(f,g) and S are
both homogeneous (§35, Ex. 4) of the same degree and conclude that H is
a constant. Comparison of a monomial appearing in these polynomials
will yield that this constant must be equal to 1, whence R(f,g) = S. The
details are rather tedious.
From (i), we see that
S/aₙᵐ = ∏ᵢ₌₁ⁿ g(uᵢ)
= (bₘu₁ᵐ + . . . )(bₘu₂ᵐ + . . . ). . . (bₘuₙᵐ + . . . )
= bₘⁿ u₁ᵐu₂ᵐ. . . uₙᵐ + . . . ∈ P[bₘ,y₁,y₂, . . . ,yₘ][u₁,u₂, . . . ,uₙ]
is symmetric in u₁,u₂, . . . ,uₙ, hence a sum of terms
yσ₁^{k₁−k₂}σ₂^{k₂−k₃}. . . σₙ₋₁^{kₙ₋₁−kₙ}σₙ^{kₙ},  y ∈ P[bₘ,y₁,y₂, . . . ,yₘ],
in the elementary symmetric polynomials σ₁,σ₂, . . . ,σₙ of u₁,u₂, . . . ,uₙ, so that
S/aₙᵐ = h₁(aₙ₋₁/aₙ, aₙ₋₂/aₙ, . . . , a₁/aₙ, a₀/aₙ)
for a suitable polynomial h₁, and
S = (aₙᵐ)(S/aₙᵐ) = aₙᵐ h₁(aₙ₋₁/aₙ, aₙ₋₂/aₙ, . . . , a₁/aₙ, a₀/aₙ),
Thus H ∈ M[y₁,y₂, . . . ,yₘ][u₁,u₂, . . . ,uₙ] is symmetric in u₁,u₂, . . . ,uₙ and
therefore a sum of monomials y a₀^{k₀}a₁^{k₁}. . . aₙ^{kₙ}. Under the substitution
aᵢ ↦ taᵢ, such a monomial
changes then to y(ta₀)^{k₀}(ta₁)^{k₁}. . . (taₙ)^{kₙ} = t^{k₀+k₁+ . . . +kₙ} y a₀^{k₀}a₁^{k₁}. . . aₙ^{kₙ}. Thus
the exponent system of any monomial y a₀^{k₀}a₁^{k₁}. . . aₙ^{kₙ} appearing in H is
such that k₀ + k₁ + . . . + kₙ = 0. This means k₀ = k₁ = . . . = kₙ = 0 for all
monomials y a₀^{k₀}a₁^{k₁}. . . aₙ^{kₙ} appearing in H and H is a "constant", i.e., H is
in M[bₘ,y₁,y₂, . . . ,yₘ].
Thus R(f,g) = HS for some H ∈ K. The constant term in S = aₙᵐ ∏ᵢ₌₁ⁿ g(uᵢ) is
equal to aₙᵐb₀ⁿ. So R(f,g) must have a term H aₙᵐb₀ⁿ. Now R(f,g) has the
term aₙᵐb₀ⁿ, the product of the entries in the principal diagonal. Hence H = 1
and R(f,g) = S. This proves (2). From (i) and (ii), we get the equations in
(3) and (4).
coefficient of f(x) and bₘ the leading coefficient of g(x). Let r₁,r₂, . . . ,rₙ be
roots of f(x) and s₁,s₂, . . . ,sₘ roots of g(x) in a splitting field of f(x)g(x)
over K. Then
R(f,g) = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (rᵢ − sⱼ) = aₙᵐ ∏ᵢ₌₁ⁿ g(rᵢ) = (−1)ᵐⁿ bₘⁿ ∏ⱼ₌₁ᵐ f(sⱼ).
From
R(F,G) = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (uᵢ − yⱼ) = aₙᵐ ∏ᵢ₌₁ⁿ g(uᵢ) = (−1)ᵐⁿ bₘⁿ ∏ⱼ₌₁ᵐ f(yⱼ)
we obtain, on substituting the roots rᵢ and sⱼ for the indeterminates uᵢ and yⱼ,
R(f,g) = aₙᵐ bₘⁿ ∏ᵢ₌₁ⁿ ∏ⱼ₌₁ᵐ (rᵢ − sⱼ) = aₙᵐ ∏ᵢ₌₁ⁿ g(rᵢ) = (−1)ᵐⁿ bₘⁿ ∏ⱼ₌₁ᵐ f(sⱼ)
when both f(x) and g(x) split completely in E. Then R(f,g) = aₙᵐ ∏ᵢ₌₁ⁿ g(rᵢ) by
Lemma 56.6.
Assume now bₘ = 0 and let k be the largest index for which bₖ ≠ 0. Thus
bₘ = bₘ₋₁ = . . . = bₖ₊₁ = 0 and bₖ ≠ 0. We put G(x) = bₖxᵏ + bₖ₋₁xᵏ⁻¹ + . . . +
b₁x + b₀. We get R(f,g) = aₙᵐ⁻ᵏ R(f,G) from Remark 56.4 and we have R(f,G)
56.9 Theorem: Let K be a field and f(x) a polynomial of positive degree
n and let aₙ be the leading coefficient of f(x). Then the discriminant D(f)
of f(x) is in K. In fact, R(f,f´) = (−1)^{n(n−1)/2} aₙ D(f).
Proof: Let E be a splitting field of f(x) over K and let r₁,r₂, . . . ,rₙ be the
roots of f(x) in E. Then R(f,f´) = aₙⁿ⁻¹ ∏ᵢ₌₁ⁿ f´(rᵢ) by
Lemma 56.7. We must find f´(rᵢ). From f(x) = aₙ(x − r₁)(x − r₂). . . (x − rₙ),
we get
f´(x) = aₙ Σⱼ₌₁ⁿ (x − r₁). . . (x − rⱼ₋₁)(x − rⱼ₊₁). . . (x − rₙ),
so f´(rᵢ) = aₙ(rᵢ − r₁). . . (rᵢ − rᵢ₋₁)(rᵢ − rᵢ₊₁). . . (rᵢ − rₙ) = aₙ ∏_{j≠i} (rᵢ − rⱼ).
Hence
R(f,f´) = aₙⁿ⁻¹ ∏ᵢ₌₁ⁿ f´(rᵢ)
= aₙ·aₙ^{2n−2} ∏_{i≠j} (rᵢ − rⱼ) = aₙ·aₙ^{2n−2} ∏_{i<j} (rᵢ − rⱼ) ∏_{i>j} (rᵢ − rⱼ)
= aₙ·aₙ^{2n−2} ∏_{i<j} (rᵢ − rⱼ) ∏_{i<j} (−1)(rᵢ − rⱼ)
= aₙ·aₙ^{2n−2} ∏_{i<j} (rᵢ − rⱼ) · (−1)^{(n−1)+(n−2)+ . . . +2+1} ∏_{i<j} (rᵢ − rⱼ)
= (−1)^{n(n−1)/2} aₙ·aₙ^{2n−2} ∏_{i<j} (rᵢ − rⱼ)² = (−1)^{n(n−1)/2} aₙ D(f).
For f(x) = ax² + bx + c, we have f´(x) = 2ax + b and

  | a   b   c |     | a    b    c |
  | 2a  b   0 |  =  | 0   −b  −2c |  = a(−b² + 4ac) = −a(b² − 4ac),
  | 0   2a  b |     | 0   2a    b |

hence the discriminant of f(x) is b² − 4ac.
For f(x) = x³ + px + q, we have f´(x) = 3x² + p and

  | 1  0  p  q  0 |
  | 0  1  0  p  q |
  | 3  0  p  0  0 |  =  4p³ + 27q²,
  | 0  3  0  p  0 |
  | 0  0  3  0  p |

so R(f,f´) = 4p³ + 27q² = (−1)^{3·2/2}·1·D(f) = −D(f), and the discriminant of
f(x) is equal to −4p³ − 27q².
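Both computations can be replayed with a computer algebra system. The sketch below is my own illustration (assuming SymPy, whose `discriminant` follows the same normalization R(f,f´) = (−1)^{n(n−1)/2} aₙ D(f)).

```python
from sympy import symbols, discriminant, resultant, expand

x, p, q, a, b, c = symbols('x p q a b c')

# Quadratic: discriminant b^2 - 4ac.
assert expand(discriminant(a*x**2 + b*x + c, x) - (b**2 - 4*a*c)) == 0

# Cubic x^3 + px + q: discriminant -4p^3 - 27q^2, and for n = 3
# R(f, f') = (-1)^{n(n-1)/2} a_n D(f) = -D(f) = 4p^3 + 27q^2.
f = x**3 + p*x + q
D = discriminant(f, x)
assert expand(D - (-4*p**3 - 27*q**2)) == 0
assert expand(resultant(f, f.diff(x), x) + D) == 0
```

The last assertion is exactly Theorem 56.9 for the monic cubic: the resultant of f with its derivative is −D(f).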
56.11 Lemma: (1) Let E/K, E₁/K₁ be field extensions. Assume that there
are field isomorphisms σ: K → K₁ and ψ: E → E₁ and that ψ is an extension
of σ. Then AutK E ≅ AutK₁ E₁.
(2) Let K be a field and f(x) a polynomial in K[x]\K. Let E and F be two
splitting fields of f(x) over K. Then AutK E ≅ AutK F.
Proof: (1) For any φ ∈ AutK E, consider the mapping ψ⁻¹φψ: E₁ → E₁.
Clearly ψ⁻¹φψ is a field isomorphism (Lemma 48.10). Moreover, for any
a₁ ∈ K₁, there is a unique a ∈ K with aσ = aψ = a₁, i.e., a₁σ⁻¹ = a₁ψ⁻¹ = a, and
a₁(ψ⁻¹φψ) = ((a₁ψ⁻¹)φ)ψ = (aφ)ψ = aψ = a₁, so ψ⁻¹φψ is in fact a K₁-
automorphism of E₁. Thus we have a mapping
A: AutK E → AutK₁ E₁
        φ ↦ ψ⁻¹φψ
and likewise a mapping
B: AutK₁ E₁ → AutK E
        φ₁ ↦ ψφ₁ψ⁻¹,
and A and B are mutually inverse.
(2) The fields E and F are K-isomorphic by Theorem 53.8, so the claim
follows immediately from part (1).
Thus Galois groups of any two splitting fields (over K) of f(x) are iso-
morphic. This justifies the definite article in the next definition.
(d) Let p be a prime number. The field 𝔽_{p^n} is a splitting field of x^{p^n} − x
over 𝔽_p (Example 53.5(f)). Hence the Galois group of x^{p^n} − x ∈ 𝔽_p[x] is
Aut_{𝔽_p} 𝔽_{p^n} = ⟨φ⟩, where φ is the automorphism a ↦ aᵖ (Example
54.18(c)).
Proof: Let E be a splitting field of f(x) over K and let a₁,a₂, . . . ,aₙ be the
distinct roots of f(x) in E (1 ≤ n ≤ deg f(x)). Any σ ∈ G = AutK E maps
any aᵢ to an aⱼ and thus gives rise to a permutation σ* ∈ Sₙ, namely iσ* = j
when aᵢσ = aⱼ. Thus σ* is given by aᵢσ = a_{iσ*}. For σ,τ ∈ G, we have
a_{i(στ)*} = aᵢ(στ) = (aᵢσ)τ = a_{iσ*}τ = aⱼτ (put iσ* = j) = a_{jτ*} = a_{(iσ*)τ*}
for i = 1,2, . . . ,n and so (στ)* = σ*τ*. Here σ ∈ Ker * if and only if aᵢσ = aᵢ for
all i = 1,2, . . . ,n. Thus an automorphism in Ker * fixes each element of K
and fixes each aᵢ. Since E is generated by the aᵢ over K (Example 53.5(d)), we
deduce that an automorphism in Ker * fixes all elements of E. Thus
Ker * = 1. So * is one-to-one and G is isomorphic to Im * ⊆ Sₙ.
The preceding proof is quite simple: G acts on the set U of distinct roots of
f(x), and the permutation representation is one-to-one; thus G is
isomorphic to a subgroup of S_U, and S_U itself is isomorphic to Sₙ. We will
often identify the Galois group of a polynomial with its isomorphic
images in S_U and in Sₙ.
{1,2, . . . ,n}, there are σ,τ ∈ G with 1σ = i and 1τ = j, so σ⁻¹τ ∈ G maps i to j;
hence the condition is also sufficient.
(d) Let σ = (12. . . n) ∈ Sₙ. Then ⟨σ⟩ is a transitive subgroup of Sₙ since
1σ^{i−1} = i for any i = 1,2, . . . ,n.
(f) It follows from the last two examples that ⟨(1234)⟩ and its
conjugates ⟨(1324)⟩, ⟨(1243)⟩ are transitive subgroups of S₄. Also V₄ =
{ι,(12)(34),(13)(24),(14)(23)} is a transitive subgroup of S₄. From V₄ ⊆
A₄ and V₄ ⊆ S₄, we see that A₄ and S₄ are transitive subgroups of S₄.
Likewise D = {ι,(13),(24),(12)(34),(13)(24),(14)(23),(1234),(1432)} and
its conjugates
{ι,(12),(34),(12)(34),(13)(24),(14)(23),(1324),(1423)}
{ι,(14),(23),(12)(34),(13)(24),(14)(23),(1243),(1342)}
are transitive subgroups of S₄. On the other hand, {ι,(12),(34),(12)(34)}
and its conjugates are not transitive subgroups of S₄.
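Transitivity of these subgroups is quick to verify by machine. The sketch below is my own illustration (assuming SymPy) of the examples just listed, with all permutations written 0-indexed.

```python
from sympy.combinatorics import Permutation, PermutationGroup

# <(1234)> is transitive (0-indexed: [1,2,3,0]).
C4 = PermutationGroup([Permutation([1, 2, 3, 0])])
assert C4.is_transitive()

# V4 = {1,(12)(34),(13)(24),(14)(23)} is transitive of order 4.
V4 = PermutationGroup([Permutation([1, 0, 3, 2]), Permutation([2, 3, 0, 1])])
assert V4.order() == 4 and V4.is_transitive()

# D = <(1234),(24)> has order 8 and is transitive.
D = PermutationGroup([Permutation([1, 2, 3, 0]), Permutation([0, 3, 2, 1])])
assert D.order() == 8 and D.is_transitive()

# {1,(12),(34),(12)(34)} preserves the blocks {1,2} and {3,4} separately,
# so it is not transitive.
H = PermutationGroup([Permutation([1, 0, 2, 3]), Permutation([0, 1, 3, 2])])
assert H.order() == 4 and not H.is_transitive()
```

This list — cyclic C₄, the Klein group V₄, the dihedral groups, A₄ and S₄ — exhausts the transitive subgroups of S₄ up to conjugacy, which is why these are the candidate Galois groups of irreducible quartics.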
56.16 Theorem: Let K be a field and let f(x) ∈ K[x] be a monic poly-
nomial having no multiple roots. Let E be a splitting field of f(x) and G =
AutK E the Galois group of f(x). Let r₁,r₂, . . . ,rₙ ∈ E be the roots of f(x). Let
m₀ = 0 and mₖ = n.
(1) Assume the notation so chosen that
{r₁,r₂, . . . ,r_{m₁}}, {r_{m₁+1},r_{m₁+2}, . . . ,r_{m₂}}, {r_{m₂+1},r_{m₂+2}, . . . ,r_{m₃}},
. . . , {r_{m_{k−1}+1},r_{m_{k−1}+2}, . . . ,r_{mₖ}}
(2) Let f(x) = f₁(x)f₂(x). . . fₖ(x) be the canonical decomposition of f(x) into
monic irreducible polynomials in K[x] and let r_{m_{i−1}+1}, r_{m_{i−1}+2}, . . . ,r_{mᵢ} be the
roots of fᵢ(x) (i = 1,2, . . . ,k). Then
{r₁,r₂, . . . ,rₙ} = {r₁,r₂, . . . ,r_{m₁}} ∪ {r_{m₁+1},r_{m₁+2}, . . . ,r_{m₂}} ∪ {r_{m₂+1},r_{m₂+2}, . . . ,r_{m₃}}
∪ . . . ∪ {r_{m_{k−1}+1},r_{m_{k−1}+2}, . . . ,r_{mₖ}}
is the partitioning of {r₁,r₂, . . . ,rₙ} into disjoint orbits under the action of G.
Proof: (1) We first prove that fi(x) ∈ K[x]. The coefficients of fi(x) =
(x − r_{m_{i-1}+1})(x − r_{m_{i-1}+2}). . . (x − r_{m_i}) are elementary symmetric polynomials in
r_{m_{i-1}+1}, r_{m_{i-1}+2}, . . . ,r_{m_i}. Any automorphism in G maps each one of these
r_{m_{i-1}+1}, r_{m_{i-1}+2}, . . . ,r_{m_i} to one of them again and thus leaves the coefficients of
fi(x) unchanged. So the coefficients of fi(x) are in the fixed field of G. Now
f(x) has no multiple roots, so the irreducible divisors of f(x) are
separable over K and, since E is a splitting field of f(x) over K, we infer E
is a Galois extension of K (Theorem 55.7) and the fixed field of G is
exactly K. Hence fi(x) ∈ K[x].
g(x). These roots are distinct, for f(x) has no multiple roots. Thus g(x) has
at least m_i − m_{i-1} distinct roots. Then m_i − m_{i-1} ≤ deg g(x) ≤ deg fi(x) =
m_i − m_{i-1} and so g(x) = fi(x). Thus fi(x) = g(x) is irreducible in K[x].
fi(x) make up the orbit of r_{m_{i-1}+1}. Indeed, if σ ∈ G, then r_{m_{i-1}+1}σ is also a
root of fi(x) and thus:
orbit of r_{m_{i-1}+1} ⊆ {r_{m_{i-1}+1}, r_{m_{i-1}+2}, . . . ,r_{m_i}}.
56.18 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x].
Assume deg f = n > 0 and let E be a splitting field of f(x) over K.
Suppose f(x) has n distinct roots r1,r2, . . . ,rn in E. Put
Δ = ∏_{i<j} (ri − rj) = (r1 − r2)(r1 − r3). . . (r_{n−1} − rn)  and  d = Δ².
Proof: (1) We have Δσ = ∏_{i<j} (ri´ − rj´) = (r1´ − r2´)(r1´ − r3´). . . (r_{n−1}´ − rn´),
where ri´ = riσ. We divide the ordered pairs (i,j) with i < j into two
are separable over K and thus E is Galois over K (Theorem 55.7), so the
fixed field of AutK E is K and d is in K.
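Since d = Δ² is a symmetric function of the roots, it can be computed from the coefficients alone; it is the classical discriminant. A quick spot-check (an illustration assuming SymPy, not part of the original text):

```python
from sympy import symbols, discriminant

x = symbols('x')
# d = product over i < j of (ri - rj)^2 for a monic polynomial
assert discriminant(x**3 - 2, x) == -108       # not a square in Q
assert discriminant(x**3 - 3*x + 1, x) == 81   # = 9^2, a square in Q
```

By the standard criterion (the Galois group lies in An exactly when d is a square in K), the first cubic has Galois group S3 over ℚ and the second has Galois group A3.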
56.19 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x].
Assume deg f = n > 0 and let E be a splitting field of f(x) over K.
Suppose f(x) has n distinct roots r1,r2, . . . ,rn in E so that E is a Galois
Then any σ in G maps u to u and thus fixes K(u). This means G consists of
the identity mapping on K(u). Hence G = 1.
56.23 Theorem: Let K be a field such that char K ≠ 2 and let f(x) ∈ K[x]
be a polynomial of degree four. Let E be a splitting field of f(x) over K.
Suppose f(x) has four distinct roots r1,r2,r3,r4 in E so that E is a Galois
extension of K (Theorem 55.7). We put α = r1r2 + r3r4, β = r1r3 + r2r4 and
γ = r1r4 + r2r3 and consider the Galois group AutK E as a subgroup of S4
(Theorem 56.14).
In the Galois correspondence, the intermediate field K(α,β,γ) corresponds
to AutK E ∩ V4.
56.24 Definition: Let K be a field and let f(x) ∈ K[x] be a polynomial of
degree four having four distinct roots r1,r2,r3,r4 in a splitting field of f(x)
over K. We put α = r1r2 + r3r4, β = r1r3 + r2r4 and γ = r1r4 + r2r3. The
polynomial (x − α)(x − β)(x − γ) ∈ K(α,β,γ)[x] is called the resolvent cubic
of f(x).
Proof: Since f(x) is irreducible and separable over K, its roots are
distinct. We know that G is a transitive subgroup of S4 and 4 divides
|G| (Theorem 56.17). The transitive subgroups of S4 whose orders are
divisible by 4 are S4, A4, the Sylow 2-subgroups of S4 (isomorphic to D8),
V4 and the cyclic groups generated by 4-cycles like (1234) (Example
56.15(f)). Thus G is one of S4, A4, D8, V4, C4.
and the roots α, β, γ of the resolvent cubic are −4, 2, −2. Thus K(α,β,γ) =
K and m = |K(α,β,γ):K| = 1. Theorem 56.26 yields G = V4.
From f(r) = 0 ⟺ (r² − 2)² = 3, we see that the roots (say in ℂ) of f(x) are
r1 = √(2+√3), r2 = √(2−√3), r3 = −√(2+√3), r4 = −√(2−√3).
Note that r2 = 1/r1, r3 = −r1 and r4 = −1/r1. Since
(12)(34) ∈ V4 = G fixes r1 + r2 = √6,
(13)(24) ∈ V4 = G fixes r1², hence also r1² − 2 = √3,
(14)(23) ∈ V4 = G fixes r1 + r4 = √2, the Galois correspondence
is as depicted below.
[diagram: the lattice of intermediate fields of ℚ(√(2+√3))/ℚ and the corresponding subgroups of V4]
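This computation can be replicated mechanically. For a depressed quartic x⁴ + px² + qx + r, the resolvent cubic with roots α, β, γ is x³ − px² − 4rx + (4pr − q²); a sketch (an illustration assuming SymPy) for f = x⁴ − 4x² + 1:

```python
from sympy import symbols, roots

x = symbols('x')
p, q, r = -4, 0, 1                      # f = x**4 - 4*x**2 + 1
resolvent = x**3 - p*x**2 - 4*r*x + (4*p*r - q**2)
# the resolvent cubic splits over Q with roots -4, 2, -2, so G = V4
assert set(roots(resolvent, x)) == {-4, 2, -2}
```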
Exercises
1. Find the resultant R(f,g) when f(x) = x⁴ + 4x³ − 3x² + x − 2 ∈ ℚ[x] and
g(x) = x − 3 ∈ ℚ[x].
2. Let K be a field and f(x) = anx^n + a_{n−1}x^{n−1} + . . . + a1x + a0, g(x) = b1x + b0
polynomials in K[x], with b1 ≠ 0. Show that R(f,g) = (−b1)^n f(−b0/b1).
5. Let K be a field and f,g,h ∈ K[x]. Prove that D(fg) = D(f)D(g)[R(f,g)]² and
that D(f(x)) = D(f(x − c)) for any c ∈ K.
D(f) = a^{2n−2} det
| s0       s1     s2      . . .  s_{n−1}  |
| s1       s2     s3      . . .  s_n      |
| . . .                                   |
| s_{n−1}  s_n    s_{n+1} . . .  s_{2n−2} |
(where a is the leading coefficient of f and s_i denotes the sum of the i-th powers of the roots of f).
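The determinant formula can be tested on a concrete cubic. The sketch below (an illustration assuming SymPy) builds the power sums s_i of f = x³ − 3x + 1 via Newton's identities and compares the Hankel determinant with the discriminant:

```python
from sympy import symbols, Matrix, discriminant

x = symbols('x')
# f = x^3 - 3x + 1 (monic): elementary symmetric values e1 = 0, e2 = -3, e3 = -1
e1, e2, e3 = 0, -3, -1
s = [3, e1]                              # s0 = 3 (the degree), s1 = e1
s.append(e1*s[1] - 2*e2)                 # s2 by Newton's identity
s.append(e1*s[2] - e2*s[1] + 3*e3)       # s3
s.append(e1*s[3] - e2*s[2] + e3*s[1])    # s4
D = Matrix(3, 3, lambda i, j: s[i + j]).det()   # leading coefficient a = 1
assert D == 81 == discriminant(x**3 - 3*x + 1, x)
```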
(b) x³ − 2x² + 4x + 6 ∈ ℚ[x].
(c) x³ − x + 2 ∈ ℤ3[x].
(d) x³ + 3x² − 3 ∈ ℤ5[x].
10. Find the Galois groups of the following polynomials over the fields
indicated.
(a) x⁴ − 2 over ℚ(√2) and over ℚ(√2 i).
(b) (x³ − 2)(x² − 5) over ℚ.
(c) x⁴ − 8x² + 15 over ℚ.
(d) x⁴ + 4x² + 2 over ℚ and over ℚ(√2).
(e) (x² − 2)(x² − 3)(x² − 5) over ℚ, over ℚ(√2), over ℚ(√6) and over
ℚ(√2,√3).
11. Let K be an arbitrary field and f(x) = x³ − 3x + 1 ∈ K[x]. Show that
f(x) is either irreducible over K or splits in K.
14. Let p be a prime number and G ≤ Sp. Show that G is transitive if and
only if p divides the order of G.
§57
Norm and Trace
σi : E = K(a) = K[a] → K[ai] = K(ai) ⊆ N
k0 + k1a + . . . + k_{n−2}a^{n−2} + k_{n−1}a^{n−1} ↦ k0 + k1ai + . . . + k_{n−2}ai^{n−2} + k_{n−1}ai^{n−1},
57.1 Lemma: Let K be a field and E a finite dimensional separable
extension of K. Let L be an intermediate field of E/K and let N be a
normal closure of K over E. If φ: L → N is a K-homomorphism, then φ can
be extended in exactly |E:L| ways to a K-homomorphism E → N.
σi : a ↦ ai, l ↦ lφ (i = 1,2, . . . ,m; l ∈ L)
σi : E = L(a) = L[a] → L[ai] = L(ai) ⊆ N
l0 + l1a + . . . + l_{m−2}a^{m−2} + l_{m−1}a^{m−1} ↦ (l0φ) + (l1φ)ai + . . . + (l_{m−2}φ)ai^{m−2} + (l_{m−1}φ)ai^{m−1}
(l0,l1, . . . ,l_{m−2},l_{m−1} ∈ L), where i = 1,2, . . . ,m; and these mappings σi are
indeed extensions of φ (since σi: l0 ↦ l0φ) and field homomorphisms (cf.
Lemma 53.1, Theorem 53.2). Thus {σ1,σ2, . . . ,σm} is the complete set of K-
homomorphisms from E into N which are extensions of φ.
N_{E/K}(a) = (aσ1)(aσ2). . . (aσk).
T_{E/K}(a) = aσ1 + aσ2 + . . . + aσk.
T_{ℚ(∛2)/ℚ}(a + b∛2 + c(∛2)²) = 3a
for any a + b∛2 + c(∛2)² ∈ ℚ(∛2) (a,b,c ∈ ℚ).
In these examples, norm and trace are found to be in the base field. This
is always true. In fact, the norm and trace of an element are essentially
coefficients of the minimal polynomial of that element. In particular,
they are independent of the normal closure that we use in their defini-
tion. We now prove these assertions.
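A concrete way to see these assertions: for a finite extension, the norm and trace of u coincide with the determinant and trace of the K-linear map "multiplication by u" on E. A sketch for E = ℚ(∛2) with basis {1, t, t²}, t³ = 2 (an illustration assuming SymPy):

```python
from sympy import symbols, Matrix

a, b, c = symbols('a b c')
# multiplication by u = a + b*t + c*t^2 (with t^3 = 2) in the basis {1, t, t^2};
# the columns are the images of 1, t and t^2
M = Matrix([[a, 2*c, 2*b],
            [b, a,   2*c],
            [c, b,   a  ]])
assert M.trace() == 3*a                                       # T(u) = 3a
assert M.det().expand() == a**3 + 2*b**3 + 4*c**3 - 6*a*b*c   # N(u)
```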
(1) N_{E/K}(ab) = N_{E/K}(a)N_{E/K}(b) and T_{E/K}(a + b) = T_{E/K}(a) + T_{E/K}(b).
(2) If b ∈ K, then bσi = b for all i = 1,2, . . . ,k and
N_{E/K}(b) = (bσ1)(bσ2). . . (bσk) = bb. . . b = b^k,
T_{E/K}(b) = bσ1 + bσ2 + . . . + bσk = b + b + . . . + b = kb.
{σ1, σ2, . . . , σk} = {σi^(m) : i = 1,2, . . . ,n and m = 1,2, . . . ,s}.
Thus
N_{E/K}(b) = (bσ1)(bσ2). . . (bσk) = ∏_{i=1}^{n} ∏_{m=1}^{s} bσi^(m) = ∏_{i=1}^{n} (bσi)^s
and
T_{E/K}(b) = bσ1 + bσ2 + . . . + bσk = ∑_{i=1}^{n} ∑_{m=1}^{s} bσi^(m) = ∑_{i=1}^{n} s(bσi) = s ∑_{i=1}^{n} bi = s(−a_{n−1}).
We have already mentioned that N_{E/K}(a) and T_{E/K}(a) depend on the
fields E and K. It is clear from the definition or from Lemma 57.4(3) that
N_{E/K}(a) and T_{E/K}(a) will in general be distinct from N_{L/K}(a) and T_{L/K}(a) and also
from N_{E/L}(a) and T_{E/L}(a) if L is an intermediate field (with a ∈ L in the
first case).
give a new argument that works in more general situations.
[diagram: the tower K ⊆ L ⊆ E with |E:L| = s and |L:K| = n, together with the maps N_{E/L}, N_{L/K} and N_{E/K}]
N is a splitting field of a polynomial f(x) ∈ K[x] over K (Theorem 55.11
and Theorem 55.7) and therefore N is a splitting field of f(x) over L and
over Lσi (Example 53.5(e)). The isomorphism σi: L → Lσi (⊆ N) can be
extended to an isomorphism σi^(1): N → N (Theorem 53.7). Here of course
σi^(1): N → N is a K-homomorphism.
We claim {τ1, τ2, . . . , τk} = {τjσi^(1) : i = 1,2, . . . ,n; j = 1,2, . . . ,s}. Since k = ns,
we must merely show that τj´σi´^(1) ≠ τjσi^(1) when (i,j) ≠ (i´,j´). Indeed, if
τj´σi´^(1) = τjσi^(1), then the restrictions of τj´σi´^(1) and τjσi^(1) to L must be
equal and, since τj and τj´ fix each element in L, we get σi´^(1)|L = σi^(1)|L,
so σi´ = σi and i´ = i. Then, as σi^(1) is one-to-one, τj´σi´^(1) = τjσi^(1)
implies τj´ = τj and j´ = j. This establishes the claim.
N_{E/K}(a) = (aτ1)(aτ2). . . (aτk) = ∏_{i=1}^{n} ∏_{j=1}^{s} aτjσi^(1)
= ∏_{i=1}^{n} (∏_{j=1}^{s} aτj)σi^(1) = ∏_{i=1}^{n} (N_{E/L}(a))σi^(1) = ∏_{i=1}^{n} (N_{E/L}(a))σi
= N_{L/K}(N_{E/L}(a))
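The transitivity formula just proved can be checked on the tower ℚ ⊆ ℚ(√2) ⊆ ℚ(√2,√3) with a = √2 + √3 (an illustration assuming SymPy; the conjugates over ℚ are the four sign choices):

```python
from sympy import sqrt, expand

conj = [s2*sqrt(2) + s3*sqrt(3) for s2 in (1, -1) for s3 in (1, -1)]
N_EK = expand(conj[0]*conj[1]*conj[2]*conj[3])          # norm from E to Q
N_EL = expand((sqrt(2) + sqrt(3))*(sqrt(2) - sqrt(3)))  # norm from E to L = Q(sqrt(2))
assert N_EL == -1
# N_{L/K}(-1) = (-1)^2 since -1 already lies in Q and |L:Q| = 2
assert N_EK == N_EL**2 == 1
```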
57.7 Lemma: Let E be a field and let {σ1, σ2, . . . , σk} be a finite set of field
automorphisms of E. If σ1, σ2, . . . , σk are pairwise distinct, then {σ1, σ2, . . . , σk}
is linearly independent.
Then c1(bσ1) + c2(bσ2) + . . . + cr(bσr) = 0 for all b ∈ E. (2)
Since σ1, σ2, . . . , σk are distinct, σ1 ≠ σ2 and there is a u ∈ E with uσ1 ≠ uσ2.
Writing ub in place of b in (2) and using (ub)σi = uσi·bσi, we get
c1(uσ1)(bσ1) + c2(uσ2)(bσ2) + . . . + cr(uσr)(bσr) = 0 for all b ∈ E. (3)
Multiplying (2) by uσ1, we obtain
c1(uσ1)(bσ1) + c2(uσ1)(bσ2) + . . . + cr(uσ1)(bσr) = 0 for all b ∈ E. (4)
Subtraction gives
[c2(uσ2 − uσ1)](bσ2) + . . . + [cr(uσr − uσ1)](bσr) = 0 for all b ∈ E.
(0, c2(uσ2 − uσ1), . . . , cr(uσr − uσ1), 0, . . . ,0) ≠ (0,0, . . . ,0)
We now characterize all elements with trace 0 and all elements with
norm 1 in the case of a Galois extension with a finite cyclic Galois group. The
second part of Theorem 57.9 (formulated for finite dimensional
extensions of ℚ) is the theorem with number 90 in D. Hilbert's (1862-
1943) famous report on algebraic number theory and is known as
"Hilbert's theorem 90". It is the beginning of cohomology theory.
57.9 Theorem: Let E/K be a finite dimensional cyclic extension and let
σ be a generator of AutK E. Let a ∈ E.
(1) T_{E/K}(a) = 0 if and only if there is an element b ∈ E with a = b − bσ.
(2) N_{E/K}(a) = 1 if and only if there is an element b ∈ E\{0} with a = b/bσ.
u/T(u) + uσ/T(u) + uσ²/T(u) + . . . + uσ^{n−2}/T(u) + uσ^{n−1}/T(u) = T(u)/T(u) = 1.
b := ad + (a·aσ)dσ + (a·aσ·aσ²)dσ² + . . . + (a·aσ·aσ²· · ·aσ^{n−2})dσ^{n−2} + (a·aσ·aσ²· · ·aσ^{n−2}·aσ^{n−1})dσ^{n−1}
We close this paragraph with two applications of Theorem 57.9. We
describe cyclic extensions. The degree is the characteristic in the first
case and relatively prime to the characteristic in the second case.
It remains to show that f(x) is irreducible over K and that E = K(t) for
any root t of f(x). Since b is not fixed by σ, we see b ∉ K, so u ∉ K and
thus K ⊊ K(u) ⊆ E. But |E:K| = p is prime and so there is no intermediate
field of E/K distinct from K and E. This forces K(u) = E. Then deg f(x) = p
= |E:K| = |K(u):K| = degree of the minimal polynomial of u over K. Since the
minimal polynomial of u over K divides f(x), we deduce that f(x) is the
minimal polynomial of u over K. In particular, f(x) is irreducible in K[x].
Proof: By hypothesis, AutK E is a cyclic group, say AutK E = ⟨σ⟩, and o(σ) =
|AutK E| = |E:K| = n. All roots of the polynomial x^n − 1, which splits in K, are
simple since its derivative nx^{n−1} ≠ 0 in view of the assumption on char K.
Thus there are exactly n distinct roots of x^n − 1 in K. Since r^n = s^n = 1
implies (rs)^n = 1, the roots of x^n − 1 make up a subgroup of K*. Any finite
subgroup of K* is cyclic (Theorem 52.18), so the roots of x^n − 1 form a
cyclic group of order n. Let r ∈ K be a generator of this group so that the
n roots of x^n − 1 are 1,r,r², . . . ,r^{n−1}.
Any K-automorphism σ^j ∈ AutK E (j = 0,1,2, . . . ,n−1) sends u to ur^j ∈ K(u),
thus the restriction of σ^j to K(u) is a K-automorphism of K(u) (Theorem
42.22). Since uσ^i = ur^i ≠ ur^j = uσ^j when i,j ∈ {0,1,2, . . . ,n−1} and i ≠ j, we
see that these K-automorphisms of K(u) are distinct. Hence there are at
least n K-automorphisms of K(u). This implies |K(u):K| ≥ |AutK K(u)| ≥ n.
From n = |E:K| ≥ |K(u):K| ≥ n, we get |K(u):K| = n, whence E = K(u).
Exercises
§58
Cyclotomic Fields
So in the situation of Lemma 58.3, a splitting field of x^n − 1 over K is also
a splitting field of x^m − 1 over K, and conversely. For this reason, in case
char K ≠ 0, it is no loss of generality to assume that the order of a
cyclotomic extension is relatively prime to the characteristic of K.
Proof: If u and t are n-th roots of unity, then (ut)^n = u^n t^n = 1·1 = 1 and
ut is also an n-th root of unity. Since the number of n-th roots of unity is
at most n (Theorem 35.7), it follows that the set of all n-th roots of unity
is a subgroup of K* (Lemma 9.3(1)). This group of n-th roots of unity is
cyclic by Theorem 52.18. To prove that the order of this group is equal
to n, we must only show that all roots of x^n − 1 are simple. This follows
from the fact that the derivative nx^{n−1} of x^n − 1 is distinct from zero
(because of the assumption char K = 0 or (char K,n) = 1), so that x^n − 1 and
nx^{n−1} have no common root.
If u is a root of unity and o(u) = d, then, by definition, u is a primitive d-
th root of unity.
Φ1(x) = x − 1, Φ2(x) = x − (−1) = x + 1,
Φ3(x) = (x − ω)(x − ω̄) = x² + x + 1, Φ4(x) = (x − i)(x + i) = x² + 1.
We see that these are in fact polynomials in ℤ[x]. This is true for any
cyclotomic polynomial. The n-th cyclotomic polynomial over K does not
depend on the extension field of K in which the primitive n-th roots of
unity are assumed to lie. In fact, it does not even depend on K (but only
on char K).
(1) x^n − 1 = ∏_{d|n} Φd(x).
(2) Φn(x) ∈ ℤ[x] if char K = 0 and Φn(x) ∈ 𝔽p[x] = ℤp[x] if char K = p ≠ 0.
Proof: (1) Any root u of x^n − 1 is an n-th root of unity and o(u) = d for
some divisor d of n. Then u is a primitive d-th root of unity. Conversely, if
d|n, any primitive d-th root of unity is an n-th root of unity u with o(u) =
d. Thus Φd(x) = ∏_{u^n=1, o(u)=d} (x − u). Collecting together the roots of x^n − 1 with
the same order, we obtain
x^n − 1 = ∏_{u^n=1} (x − u) = ∏_{d|n} ∏_{u^n=1, o(u)=d} (x − u) = ∏_{d|n} Φd(x).
x^n − 1 = Φn(x)f(x) + 0.
The equation Φn(x) = (x^n − 1) / ∏_{d|n, d≠n} Φd(x) is a recursive formula for Φn(x). Thus
Φ6(x) = (x⁶ − 1)/Φ1(x)Φ2(x)Φ3(x)
= (x⁶ − 1)/(x − 1)(x + 1)(x² + x + 1) = x² − x + 1.
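The recursion runs mechanically; a sketch using SymPy (whose built-in cyclotomic_poly serves as an independent check — an illustration, not part of the text):

```python
from sympy import symbols, div, divisors, prod, cyclotomic_poly

x = symbols('x')
n = 6
num = x**n - 1
# divide out Phi_d(x) for every proper divisor d of n
den = prod(cyclotomic_poly(d, x) for d in divisors(n) if d != n)
q, r = div(num, den, x)
assert r == 0
assert q == x**2 - x + 1 == cyclotomic_poly(6, x)
```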
58.8 Lemma: Let K be a field, n ∈ ℕ and assume that char K = 0 or
(char K,n) = 1. Then
Φn(x) = ∏_{d|n} (x^d − 1)^{μ(n/d)} = ∏_{d|n} (x^{n/d} − 1)^{μ(d)}.
Proof: This follows immediately from Lemma 58.7(1) and Lemma 52.14
(in Lemma 52.14, let the field be K(x) and let the function F: ℕ → K(x)*
be n ↦ Φn(x)).
Φ12(x) = (x¹² − 1)^{μ(1)} (x⁶ − 1)^{μ(2)} (x⁴ − 1)^{μ(3)} (x³ − 1)^{μ(4)} (x² − 1)^{μ(6)} (x − 1)^{μ(12)}
= (x¹² − 1)(x² − 1)/(x⁶ − 1)(x⁴ − 1) = x⁴ − x² + 1,
Φ15(x) = (x¹⁵ − 1)^{μ(1)} (x⁵ − 1)^{μ(3)} (x³ − 1)^{μ(5)} (x − 1)^{μ(15)}
= (x¹⁵ − 1)(x − 1)/(x⁵ − 1)(x³ − 1) = x⁸ − x⁷ + x⁵ − x⁴ + x³ − x + 1.
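The Möbius formula of Lemma 58.8 reproduces both computations; a sketch (an illustration assuming SymPy's mobius and cancel):

```python
from sympy import symbols, cancel, divisors, cyclotomic_poly
from sympy.ntheory import mobius

x = symbols('x')

def phi(n):
    # Phi_n(x) = prod over d | n of (x^(n/d) - 1)^mu(d)
    expr = 1
    for d in divisors(n):
        expr *= (x**(n // d) - 1)**mobius(d)
    return cancel(expr)

assert phi(12) == x**4 - x**2 + 1
assert phi(15) == cyclotomic_poly(15, x)
```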
Proof: (1) Let a1,a2, . . . ,ak be the natural numbers less than n and
relatively prime to n (where k = φ(n)), so that ζ^{a1}, ζ^{a2}, . . . ,ζ^{ak} are the roots
of Φn(x). Now E is a splitting field of Φn(x) by definition, so E is generated
by the roots of Φn(x) over K (Example 53.5(d)) and E = K(ζ^{a1}, ζ^{a2}, . . . ,ζ^{ak}) =
K(ζ).
(2) The roots of Φn(x) are simple because Φn(x) is a divisor of x^n − 1 and
the roots of x^n − 1 are simple (the derivative of x^n − 1, being distinct
from 0 since char K = 0 or (char K,n) = 1, is relatively prime to x^n − 1). So
the irreducible factors of Φn(x) are separable over K. Since E is a splitting
field of Φn(x), Theorem 55.7 shows that E is Galois over K.
(3) Since ζ is a root of Φn(x) ∈ K[x] and f(x) is the minimal polynomial of
ζ over K, we see f(x) divides Φn(x) in K[x] and the roots of f(x) are certain
of the roots of Φn(x). Let deg f(x) = s and ζ^{m1}, ζ^{m2}, . . . ,ζ^{ms} be the roots of
f(x), where m1,m2, . . . ,ms are some suitable natural numbers relatively
prime to n and less than n and m1 = 1, say. Thus
f(x) = (x − ζ^{m1})(x − ζ^{m2}). . . (x − ζ^{ms}).
Since ζ^{mi} ≠ ζ^{mj} when i ≠ j (for ζ^{mi} = ζ^{mj} ⟺ mi ≡ mj (mod n) ⟺ i = j),
the automorphisms σ_{m1}, σ_{m2}, . . . , σ_{ms} are pairwise distinct and AutK E = {σ_{m1}, σ_{m2}, . . . , σ_{ms}}.
ψ: G → AutK E
mi* ↦ σ_{mi}.
As σ_{mi} = σ_{mj} ⟺ mi* = mj*, the mapping ψ is well defined and one-to-one.
Both G and AutK E have s elements, so ψ is also onto AutK E. Then ψ has an
inverse φ:
φ: AutK E → G ⊆ ℤn*
σ_{mi} ↦ mi*.
Suppose σ_{mi}σ_{mj} = σ_{mk}. Then
ζ^{mk} = ζσ_{mk} = ζσ_{mi}σ_{mj} = (ζσ_{mi})σ_{mj} = (ζ^{mi})σ_{mj} = (ζσ_{mj})^{mi} = (ζ^{mj})^{mi} = ζ^{mi mj},
so
(σ_{mi}σ_{mj})φ = (σ_{mk})φ = mk* = mi*mj* = (σ_{mi})φ·(σ_{mj})φ.
Hence φ: AutK E → ℤn* is a one-to-one group homomorphism, Im φ = G
is a subgroup of ℤn* and φ is an isomorphism from AutK E onto G. This
proves that AutK E is isomorphic to a subgroup of ℤn*. It follows from
Lagrange's theorem that |AutK E| = |G| divides |ℤn*| = φ(n).
We have |AutK E| = deg f(x) and φ(n) = deg Φn(x). Now f(x) divides Φn(x) in
K[x] and both f(x) and Φn(x) are monic, so f(x) = Φn(x) if and only if
deg f(x) = deg Φn(x), so if and only if |AutK E| = φ(n).
Our first step will be to show that ζ^p is also a root of g(x) for any prime
number p relatively prime to n. Now ζ is a root of Φn(x), so o(ζ) = n,
and if p is a prime number such that (p,n) = 1, then o(ζ^p) = n and ζ^p is
also a primitive n-th root of unity: ζ^p is a root of Φn(x), so ζ^p is a root of
g(x) or of h(x). Let us assume, by way of contradiction, that ζ^p is not a
root of g(x). Then ζ^p is a root of h(x). Then ζ is a root of h(x^p) and h(x^p) is
divisible by the minimal polynomial g(x) of ζ over ℚ.
h(x^p) = g(x)q(x) + r(x), r(x) = 0 or deg r(x) < deg g(x).
Let ¯ : ℤ → ℤp be the natural homomorphism and let ¯ : ℤ[x] → ℤp[x] be the
homomorphism of Lemma 33.7. We shall write s̄(x) instead of (s(x))¯ for
s(x) ∈ ℤ[x]. Then h(x^p) = g(x)q(x) implies
h̄(x^p) = ḡ(x)q̄(x) in ℤp[x].
Since char ℤp = p, there holds h̄(x^p) = h̄(x)^p in ℤp[x] and we get
h̄(x)^p = ḡ(x)q̄(x) in ℤp[x].
and x^n − 1 ∈ ℤp[x] has a multiple root. But the derivative of x^n − 1 ∈ ℤp[x]
is not 0 ∈ ℤp[x] and is relatively prime to x^n − 1, so x^n − 1 ∈ ℤp[x] has no
multiple roots. This contradiction shows that ζ^p must be a root of g(x).
Φn(x) is irreducible in ℚ[x].
58.12 Theorem: Let n ∈ ℕ and let ζ be a primitive n-th root of unity
in some extension of ℚ. Then ℚ(ζ) is Galois over ℚ and Aut_ℚ ℚ(ζ) ≅ ℤn*.
Proof: Since Φn(x) is monic and irreducible in ℤ[x], it is irreducible in
ℚ[x] (Lemma 34.11). The claim follows now from Theorem 58.10.
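As a sanity check, deg Φn = φ(n) and Φn is irreducible over ℚ, so |Aut_ℚ ℚ(ζ)| = φ(n) = |ℤn*|. In SymPy (an illustration):

```python
from sympy import symbols, totient, cyclotomic_poly, Poly

x = symbols('x')
for n in (5, 8, 12, 15):
    P = Poly(cyclotomic_poly(n, x), x)
    assert P.degree() == totient(n)   # |Q(zeta_n) : Q| = phi(n)
    assert P.is_irreducible
```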
ζ, ζ^g, ζ^{g²}, ζ^{g³}, . . . , ζ^{g^{p−2}}
and we have σk: ζ ↦ ζ^{g^k}. Let us put λk = ζ^{g^k}. Then λ_{k+(p−1)} = ζ^{g^{k+(p−1)}} = ζ^{g^k} = λk,
so that any index k can be replaced by any j with k ≡ j (mod p−1).
Now λkσ = (ζ^{g^k})σ = (ζσ)^{g^k} = (ζ^g)^{g^k} = ζ^{g^{k+1}} = λ_{k+1} and λkσ^m = (ζ^{g^k})σ^m =
(ζσ^m)^{g^k} = (ζ^{g^m})^{g^k} = ζ^{g^{k+m}} = λ_{k+m}. Thus σ raises the index by 1 and, more generally,
σ^m raises the index by m.
{1, ζ^g, ζ^{g²}, ζ^{g³}, . . . , ζ^{g^{p−2}}, ζ} = {1, λ0, λ1, λ2, λ3, . . . , λ_{p−2}} = {1, ζ, ζ², ζ³, . . . , ζ^{p−1}}
u = a0λ0 + a1λ1 + a2λ2 + a3λ3 + . . . + a_{p−2}λ_{p−2}
with uniquely determined a0,a1,a2, . . . ,a_{p−2} ∈ ℚ. Here
uσ^e = (a0λ0 + a1λ1 + a2λ2 + a3λ3 + . . . + a_{p−2}λ_{p−2})σ^e
= a0λ_{e+0} + a1λ_{e+1} + a2λ_{e+2} + a3λ_{e+3} + . . . + a_{p−2}λ_{e+(p−2)}
and u is fixed by σ^e, i.e., uσ^e = u, if and only if
a0λ_{e+0} + a1λ_{e+1} + a2λ_{e+2} + a3λ_{e+3} + . . . + a_{p−2}λ_{e+(p−2)}
= a_{e+0}λ_{e+0} + a_{e+1}λ_{e+1} + a_{e+2}λ_{e+2} + a_{e+3}λ_{e+3} + . . . + a_{e+(p−2)}λ_{e+(p−2)}
We put ηk = λk + λ_{e+k} + λ_{2e+k} + λ_{3e+k} + . . . + λ_{(f−1)e+k} (k = 0,1,2, . . . ,e−1). The
elements ηk are called the periods of f terms. We see u is fixed by σ^e if
and only if u = a0η0 + a1η1 + a2η2 + . . . + a_{e−1}η_{e−1} with a0,a1,a2, . . . ,a_{e−1} ∈ ℚ.
So {η0, η1, η2, . . . , η_{e−1}} is a ℚ-basis of Ke.
Note that σ: η0 ↦ η1, η1 ↦ η2, η2 ↦ η3, . . . , η_{e−2} ↦ η_{e−1}, η_{e−1} ↦ η0. Thus each
of η0, η1, η2, . . . , η_{e−1} is fixed by σ^e and by powers of σ^e, but not by any
other automorphism in Aut_ℚ ℚ(ζ). Hence all the intermediate fields ℚ(η0),
ℚ(η1), ℚ(η2), . . . , ℚ(η_{e−1}) of ℚ(ζ)/ℚ correspond to the same subgroup ⟨σ^e⟩
of Aut_ℚ ℚ(ζ). This forces ℚ(η0) = ℚ(η1) = ℚ(η2) = . . . = ℚ(η_{e−1}) = Ke. So any
period of f terms is a primitive element of Ke, the unique intermediate
field of ℚ(ζ)/ℚ with |Ke:ℚ| = e.
[diagram: ℚ(ζ) corresponds to 1 and Ke = ℚ(ηk) to ⟨σ^e⟩, with |ℚ(ζ):Ke| = f and |Ke:ℚ| = e]
Then there is one and only one intermediate field of the extension
ℚ(ζ)/ℚ whose ℚ-dimension is equal to e. This field is ℚ(ηk) for any k =
0,1,2, . . . ,e−1. The set {η0, η1, η2, . . . , η_{e−1}} is a ℚ-basis of ℚ(ηk). All
intermediate fields of ℚ(ζ)/ℚ are found in this way as e ranges through
the positive divisors of p − 1.
ζ, ζ³, ζ², ζ⁶, ζ⁴, ζ⁵.
The 2-term periods are ζ + ζ⁶, ζ³ + ζ⁴, ζ² + ζ⁵, and ℚ(ζ + ζ⁶) is the
intermediate field with |ℚ(ζ + ζ⁶):ℚ| = 3. We also have ℚ(ζ + ζ⁶) = ℚ(ζ + ζ⁻¹) =
ℚ(ζ³ + ζ⁴) = ℚ(ζ² + ζ⁵).
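Indeed ζ + ζ⁶ = 2cos(2π/7), and the three 2-term periods are conjugate roots of a single cubic over ℚ; a sketch (an illustration assuming SymPy's minimal_polynomial):

```python
from sympy import exp, I, pi, symbols, minimal_polynomial

x = symbols('x')
z = exp(2*pi*I/7)          # a primitive 7-th root of unity
eta = z + z**6             # 2-term period, equal to 2*cos(2*pi/7)
# the periods zeta+zeta^6, zeta^3+zeta^4, zeta^2+zeta^5 share this cubic
assert minimal_polynomial(eta, x) == x**3 + x**2 - 2*x - 1
```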
[diagram: the subfield lattice of ℚ(ζ)/ℚ for p = 7, with the degree-3 subfield ℚ(ζ + ζ⁶) and a degree-2 subfield]
has the plus sign depends on the choice of ζ. We may assume ζ is a 17-th
root of unity that appears in the period with the plus sign (otherwise
replace ζ by one of the roots of unity that appear in the period with the
plus sign). Then
η0 = (−1 + √17)/2 and η1 = (−1 − √17)/2.
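A numerical check of the 8-term periods for p = 17 (plain Python; 3 is taken as a primitive root mod 17):

```python
import cmath

z = cmath.exp(2j*cmath.pi/17)
idx = [pow(3, k, 17) for k in range(16)]        # 3^k mod 17, k = 0..15
eta0 = sum(z**idx[k] for k in range(0, 16, 2))  # period of the even indices
eta1 = sum(z**idx[k] for k in range(1, 16, 2))  # period of the odd indices
assert abs(eta0 + eta1 + 1) < 1e-9              # eta0 + eta1 = -1
assert abs(eta0*eta1 + 4) < 1e-9                # eta0*eta1 = -4
assert abs(eta0 - (-1 + 17**0.5)/2) < 1e-9      # eta0 = (-1 + sqrt(17))/2
```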
η1 = ζ³ + ζ⁵ + ζ¹⁴ + ζ¹², η3 = ζ¹⁰ + ζ¹¹ + ζ⁷ + ζ⁶
and the sums η0 + η2 and η1 + η3 are the two 8-term periods found above,
while η0η2 = −1 and η1η3 = −1.
Hence, writing s = η0 + η2 and t = η1 + η3, the numbers η0 and η2 are the
roots of x² − sx − 1, and η1 and η3 are the roots of x² − tx − 1. Here we may put
η0 = (s + √(s² + 4))/2 and η2 = (s − √(s² + 4))/2
by assuming that ζ is a 17-th root of unity that appears in the period with
the plus sign. The signs of the radicals in η1, η3 = (t ± √(t² + 4))/2, however, can no
longer be arbitrarily assigned by choosing ζ suitably. To determine
which of η1, η3 has the positive radical, we note
(η0 − η2)(η1 − η3) = 2(s − t) = 2√17,
so that η1 − η3 is positive. This gives η1 = (t + √(t² + 4))/2 and η3 = (t − √(t² + 4))/2.
[diagram: the tower of fields between ℚ and ℚ(ζ) cut out by the periods, each step of degree 2]
* *
|G| = ∑_{i=1}^{k} |G : C_G(xi)|,
58.15 Lemma: Let n be a natural number greater than one and let
Φn(x) be the n-th cyclotomic polynomial over ℚ. Then, for any proper
divisor d of n, we have
Φn(x) divides (x^n − 1)/(x^d − 1) = (x^d)^{(n/d)−1} + (x^d)^{(n/d)−2} + . . . + x^d + 1 in ℤ[x]
and
Φn(q) divides (q^n − 1)/(q^d − 1) in ℤ.
Proof: Since Φn(x) | (x^n − 1) and x^n − 1 = (x^d − 1)[(x^n − 1)/(x^d − 1)], it is
sufficient to show that Φn(x) is relatively prime to x^d − 1 for any proper
divisor d of n. But this is clear, because Φn(x) and x^d − 1 have no root in
common: the roots of Φn(x) are primitive n-th roots of unity, whereas a
root of x^d − 1 cannot be a primitive n-th root of unity if d is a proper
divisor of n. This proves the divisibility relation in ℤ[x]. Substituting any
integer q for x (and using Φn(x), (x^n − 1)/(x^d − 1) ∈ ℤ[x]) we obtain the
divisibility relation in ℤ.
Proof: We have Φn(x) = ∏_{k=1, (k,n)=1}^{n} (x − ζ^k), where ζ is a primitive n-th root of
unity. Thus
|Φn(q)| = ∏_{k=1, (k,n)=1}^{n} |q − ζ^k| = ∏_{k=1, (k,n)=1}^{n} |q − e^{2πki/n}|
> ∏_{k=1, (k,n)=1}^{n} (q − 1) = (q − 1)^{φ(n)} = (q − 1)·(q − 1)^{φ(n)−1} ≥ q − 1
in case φ(n) − 1 ≥ 1, since q ≥ 2. In case φ(n) − 1 = 0, we have n = 2 and
Φ2(q) = q + 1 > q − 1.
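This inequality (|Φn(q)| > q − 1 for n > 1 and q ≥ 2), the key step in Wedderburn's theorem below, can be spot-checked mechanically (an illustration assuming SymPy):

```python
from sympy import symbols, cyclotomic_poly

x = symbols('x')
for n in range(2, 13):
    for q in (2, 3, 4, 5):
        # the absolute value of Phi_n at an integer q >= 2 exceeds q - 1
        assert abs(cyclotomic_poly(n, x).subs(x, q)) > q - 1
```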
|D*| = ∑_{i=1}^{k} |D* : C_{D*}(xi)|.
We now put C_D(xi) = {a ∈ D : xia = axi} = C_{D*}(xi) ∪ {0} ⊆ D. Since a,b ∈ C_D(xi)
implies xi(a + b) = xia + xib = axi + bxi = (a + b)xi, we see C_D(xi) is closed
under addition and thus C_D(xi) is a subgroup of D under addition
(Lemma 9.3(2)). As C_D(xi)\{0} = C_{D*}(xi) is a subgroup of D*, we conclude
that C_D(xi) is a division ring (a subdivision ring of D).
q^n − 1 = |D*| = ∑_{i=1}^{k} |D* : C_{D*}(xi)| = ∑_{i=1}^{k} |D*|/|C_{D*}(xi)| = ∑_{i=1}^{k} (q^n − 1)/(q^{mi} − 1).
Now |D* : C_{D*}(xi)| is an integer, so q^{mi} − 1 divides q^n − 1 and this implies
that mi divides n (Lemma 52.7(1)).
We want to show that D is commutative or, what is the same thing, that
Z = D. We will assume Z ≠ D and derive a contradiction. Well, if Z ≠ D, then
n ≠ 1 and there is at least one xi such that |D* : C_{D*}(xi)| ≠ 1, because
|D* : C_{D*}(xi)| = 1 if and only if xi ∈ Z(D*). We so choose the notation that
{x1,x2, . . . ,xh} ⊆ Z(D*) and x_{h+1}, . . . ,xk are not in the center of D*. Then the
class equation becomes
q^n − 1 = ∑_{i=1}^{h} |D* : C_{D*}(xi)| + ∑_{i=h+1}^{k} |D* : C_{D*}(xi)| = |Z(D*)| + ∑_{i=h+1}^{k} (q^n − 1)/(q^{mi} − 1),
q^n − 1 = (q − 1) + ∑_{i=h+1}^{k} (q^n − 1)/(q^{mi} − 1).
Exercises
4. Let p,k ∈ ℕ and p be prime. Let Φp(x) denote the p-th cyclotomic
polynomial over ℚ. Prove that, if a prime d divides Φp(k), then d ≡ 1 (mod p) or d = p.
5. Let p be prime, k ∈ ℤ and let k* be the residue class of k in ℤp. Let
n ∈ ℕ and Φn(x) the n-th cyclotomic polynomial over ℚ. Suppose that
p ∤ n. Prove the following statements.
(a) p | Φn(k) if and only if o(k*) = n (the order of k* in ℤp* is n).
(b) There is an integer a with p | Φn(a) if and only if p ≡ 1 (mod n).
6. Let n ∈ ℕ and Φn(x) the n-th cyclotomic polynomial over ℚ and let
p1,p2, . . . ,pm be prime numbers of the form tn + 1 (t ∈ ℕ). Use Ex. 5 and
prove the following statements.
(a) Φn(anp1p2. . . pm) ≡ 1 (mod np1p2. . . pm) for any a ∈ ℤ.
(b) Φn(anp1p2. . . pm) ≠ 1 if a is sufficiently large.
(c) For some a ∈ ℕ, there is a prime divisor p of Φn(anp1p2. . . pm)
which is distinct from p1,p2, . . . ,pm.
(d) There are infinitely many prime numbers p of the form tn + 1.
(This is a special case of the following celebrated theorem of Dirichlet: if
a,b are any relatively prime integers, then there are infinitely many prime
numbers of the form an + b.)
cos(2π/17) = −1/16 + (1/16)√17 + (1/16)√(34 − 2√17)
+ (1/8)√(17 + 3√17 − √(34 − 2√17) − 2√(34 + 2√17))
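The nested radicals can be evaluated in floating point to confirm the formula (plain Python):

```python
import math

s17 = math.sqrt(17)
a = math.sqrt(34 - 2*s17)
b = math.sqrt(34 + 2*s17)
c = (-1 + s17 + a + 2*math.sqrt(17 + 3*s17 - a - 2*b)) / 16
assert abs(c - math.cos(2*math.pi/17)) < 1e-12
```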
9. Under the hypotheses of Theorem 58.13, show that the set of periods
is independent of the integer g for which g* is a generator of ℤp*, but that
the indices of individual periods do depend on g. Describe this dependence.
10. Let the hypotheses of Theorem 58.13 be valid, with p an odd prime
number, and let η0, η1 be the [(p − 1)/2]-term periods. Prove that η0η1 =
(p − 1)/4 or (p + 1)/4 according as p ≡ 1 (mod 4) or p ≡ 3 (mod 4). Show
that η0 − η1 = ±√((−1)^{(p−1)/2} p). (The sign depends on the primitive p-th
root of unity we take. If we choose ζ = e^{2πi/p}, then the sign is plus.
This is considerably difficult to prove. This exercise shows ℚ(√((−1)^{(p−1)/2} p))
is contained in the cyclotomic field ℚ(ζ). A theorem of class
field theory, known as the Kronecker-Weber theorem, states that any finite
dimensional Galois extension of ℚ whose Galois group is abelian is
contained in a suitable cyclotomic extension of ℚ.)
11. Let ζk denote a primitive k-th root of unity. Show that, if (n,m) =
1, then ℚ(ζn, ζm) = ℚ(ζnm) and ℚ(ζn) ∩ ℚ(ζm) = ℚ.
12. Let ζ be a primitive n-th root of unity. Prove that all roots of
unity in ℚ(ζ) are ±ζ^j (j = 0,1,2, . . . ,n−1).
14. Show that any finite subring of a division ring is a division ring.
§ 59
Applications
This paragraph consists of five parts. In the first part, we give an exact
definition of solvability by radicals, discuss radical extensions and
establish the fundamental theorem due to Galois that a polynomial
equation is solvable by radicals if and only if the Galois group of the
polynomial is a solvable group. In the second part, we apply this
theorem to the general polynomial of degree n over a field and deduce
Abel's theorem: if n 5, then the general polynomial of degree n is not
solvable by radicals. In the third part, we discuss solvability of
equations when the degree is prime. In the fourth part, we give
formulas for the roots of polynomials of degree two, three and four. In
the last part, we examine which real numbers can be constructed by
ruler and compass.
* *
K0 ⊆ K1 ⊆ K2 ⊆ . . . ⊆ Kn
of fields, K0 being the field in which the coefficients of the given
polynomial lie and each Ki+1 being obtained from Ki by adjoining a root of a
polynomial of the form x^n − a ∈ Ki[x] to Ki. These considerations lead to
the following definitions.
59.2 Definition: Let K be a field and f(x) ∈ K[x]. We say the equation
f(x) = 0 is solvable by radicals provided there is a splitting field S of f(x)
over K and a radical extension R of K such that K ⊆ S ⊆ R.
If, in the setup of Definition 59.1, hi = rs and if we put ui^r = ui´ so that
ui´^s ∈ K(u1, . . . ,u_{i−1}), then we may insert the field K(u1, . . . ,u_{i−1},ui´) between
K(u1, . . . ,u_{i−1}) and K(u1, . . . ,u_{i−1},ui):
K(u1, . . . ,u_{i−1}) ⊆ K(u1, . . . ,u_{i−1},ui´) ⊆ K(u1, . . . ,u_{i−1},ui´,ui) = K(u1, . . . ,u_{i−1},ui),
One of the principal theorems in this paragraph is that, if a polynomial
equation f(x) = 0 is solvable by radicals, then the Galois group of f(x) is a
solvable group (Definition 27.19). In fact, we obtain more general
results. In the next three lemmas, we study radicality of some related
field extensions.
L = K(u1,u2, . . . ,un) and ui^{hi} ∈ K(u1, . . . ,u_{i−1}) (i = 1,2, . . . ,n),
M = K(t1,t2, . . . ,tm) and tj^{kj} ∈ K(t1, . . . ,t_{j−1}) (j = 1,2, . . . ,m)
with some suitable elements ui, tj and natural numbers hi, kj. Now LM is
the smallest subfield of E containing K and u1,u2, . . . ,un,t1,t2, . . . ,tm, so LM =
K(u1,u2, . . . ,un,t1,t2, . . . ,tm). Since ui^{hi} ∈ K(u1, . . . ,u_{i−1}) for i = 1,2, . . . ,n and
likewise tj^{kj} ∈ K(u1,u2, . . . ,un,t1, . . . ,t_{j−1}) for j = 1,2, . . . ,m, we conclude that
LM is a radical extension of K (where K(u1,u2, . . . ,un,t1, . . . ,t_{j−1}) is to be
read as K(u1,u2, . . . ,un) when j = 1).
59.5 Lemma: Let E/K be a radical field extension and let N be a normal
closure of K over E. Then N is a radical extension of K.
Proof: Let {a1,a2, . . . ,am} be a K-basis of E and let fi(x) K[x] be the
minimal polynomial of ai over K. We remind the reader of the fact that N
is a splitting field of f(x):= f1(x)f2(x). . . fm(x) over K (see the proof of
Theorem 55.11).
Let E1,E2, . . . ,Es be the fields obtained in this way. Then each Ei is K-
isomorphic to E and so a radical extension of K. Using Lemma 59.4
repeatedly, we get that the compositum E1(E2(E3( . . . Es)). . . ) is a radical
extension of K. But this compositum is a subfield of N containing all roots
of f(x). Since N is a splitting field of f(x) over K, the compositum must
equal N. Thus N is a radical extension of K.
59.6 Lemma: Let E/K be a finite dimensional field extension, m ∈ ℕ.
Assume char K = 0 or (m,char K) = 1 and let ζ be a primitive m-th root of
unity. If E is Galois over K, then E(ζ) is also Galois over K.
(Φm(x) ∈ K[x] by Lemma 58.7(2).) Since the irreducible factors of
f(x)Φm(x) have no multiple roots, they are separable over K and the
claim will imply that E(ζ) is a Galois extension of K (Theorem 55.7).
Proof: (1) We must show K(u) is Galois over K and AutK K(u) is a cyclic
group. If ζ ∈ K is a primitive n-th root of unity, then u, ζu, ζ²u, . . . , ζ^{n−1}u
are the roots of x^n − a. Thus K(u) is a splitting field of x^n − a over K. The
polynomial x^n − a has no multiple roots, so the irreducible divisors of
x^n − a are separable over K. Thus K(u) is Galois over K (Theorem 55.7).
We now show that AutK K(u) is cyclic. If σ ∈ AutK K(u), then uσ is a root of
x^n − a, so uσ = ζ_σ u for some (not necessarily primitive) n-th root of
unity ζ_σ. Since u(στ) = (uσ)τ = (ζ_σ u)τ = ζ_σ (uτ) = ζ_σ ζ_τ u, we have ζ_{στ} = ζ_σ ζ_τ
for any σ, τ ∈ AutK K(u), and the mapping σ ↦ ζ_σ from AutK K(u) into K* is a
(2) Let |K(u):K| = d. Since K(u) is Galois over K, we have |AutK K(u)| = d by
the fundamental theorem of Galois theory. So AutK K(u) is a cyclic group
of order d, say AutK K(u) = ⟨σ⟩. Now ⟨σ⟩ is isomorphic to a subgroup
of ⟨ζ⟩ and ⟨ζ⟩ has order n. Hence d | n. Moreover, o(ζ_σ) = o(σ) = d, so ζ_σ^d = 1
and (u^d)σ = (uσ)^d = (ζ_σ u)^d = ζ_σ^d u^d = u^d, so u^d is fixed by σ
and by AutK K(u), so u^d ∈ K since K(u) is Galois over K.
First we show that char K, if distinct from 0, can be assumed to be
distinct from all the prime numbers hi. Indeed, if 0 ≠ char K = p = hi,
then ui^p ∈ K(u1, . . . ,u_{i−1}). But E is Galois, hence separable over K and over
K(u1, . . . ,u_{i−1}) (Lemma 55.6), so ui is separable over K(u1, . . . ,u_{i−1}) and
K(u1, . . . ,u_{i−1},ui) = K(u1, . . . ,u_{i−1})(ui) = K(u1, . . . ,u_{i−1})(ui^p) = K(u1, . . . ,u_{i−1},ui^p) =
K(u1, . . . ,u_{i−1}) by Lemma 55.16. Thus K(u1, . . . ,u_{i−1}) = K(u1, . . . ,u_{i−1},ui) and ui
can be deleted from the set of generators. We assume all generators of
this type have been deleted and thus all the prime numbers hi are
relatively prime to the characteristic of K in case char K = p ≠ 0.
[diagram: the lattice formed by K, K(ζ), E, E ∩ K(ζ) and E(ζ)]
E0. Let Gi = AutEi En be the subgroup of AutE0 En = AutK(ζ) E(ζ)
corresponding to Ei (i = 0,1,2, . . . ,n).
E(ζ) = En    Gn = 1
Ei           Gi
E_{i−1}      G_{i−1}
K(ζ) = E0    G0
since in fact E_{i−1} has a primitive m-th root of unity (i = 1,2, . . . ,n). Thus
Lemma 59.7 applies and shows that Ei is a cyclic extension of E_{i−1} of
degree |Ei : E_{i−1}| = hi or 1. In particular, Ei is Galois over E_{i−1} and, since En is
also Galois over E_{i−1}, we get Gi ⊴ G_{i−1} and G_{i−1}/Gi ≅ AutE_{i−1} Ei from Theorem
54.25(2). Thus |G_{i−1}/Gi| = |Ei : E_{i−1}| = hi or 1 and G_{i−1}/Gi is cyclic (of prime
order hi or of order 1). Hence
1 = Gn ⊴ G_{n−1} ⊴ G_{n−2} ⊴ . . . ⊴ G1 ⊴ G0 = AutE0 En = AutK(ζ) E(ζ)
is an abelian series of AutK(ζ) E(ζ) and AutK(ζ) E(ζ) is a solvable group. This
completes the proof.
If b ∈ E is fixed by all K1-automorphisms of E, then b is fixed by all K-
automorphisms of E, so b ∈ K1. Hence K1 is the fixed field of AutK1 E, which
means E is a Galois extension of K1.
[diagram: the fields K1 ⊆ S with the radical and Galois extensions involved]
AutK1 S ≅ AutK1 N / AutS N.
59.9, Lemma 59.3(1)) and Theorem 59.8 yields that AutK1 N = AutK2 N is a
solvable group.
59.11 Theorem: Let K be a field and f(x) K[x]. If the equation f(x) = 0
is solvable by radicals, then the Galois group of f(x) is a solvable group.
59.12 Theorem: Let K be a field and E a finite dimensional Galois
extension of K. Assume char K = 0 or 0 ≠ char K and char K does not
divide |E : K|. If AutK E is a solvable group, then there is a radical extension
R of K such that K ⊆ E ⊆ R.
Let n = |E:K|. Assume n ≥ 2 and the theorem proved for all field
extensions of degree less than n.
[diagram: the fields K, K(ζ), E, E ∩ K(ζ), E(ζ) and the corresponding groups]
If Im φ = AutK E, then |E(ζ):K(ζ)| = |AutK(ζ) E(ζ)| = |Im φ| = |AutK E| = n. Now
AutK(ζ) E(ζ), being isomorphic to a subgroup of the solvable group AutK E,
is itself a solvable group (Lemma 27.20) and E(ζ) is Galois over K(ζ), so,
by induction, there is a radical extension R of K(ζ) containing E(ζ). The
proof is complete in this case.
[diagram: intermediate fields of E(ζ)/K(ζ) of index p and the corresponding subgroups of AutK(ζ) E(ζ) and AutK E]
Conversely, let S be a splitting field of f(x) over K and assume that the
Galois group AutK S of f(x) is solvable. In order to prove that the equation
f(x) = 0 is solvable by radicals, i.e., in order to prove that there is a
radical extension R of K satisfying K ⊆ S ⊆ R, it suffices, in view of
Theorem 59.12, to show that S is Galois over K and that char K = 0 or char K
does not divide |S:K|.
over K.
* *
In this part, we prove the celebrated theorem due to Abel which states
that the general polynomial (over a field of characteristic 0) of degree n
is solvable by radicals if and only if n ≤ 4, and some related results.
First of all, we must explain what we mean by the general polynomial of
degree n.
59.14 Definition: Let K be a field and let a1,a2, . . . ,an-1,an be n distinct
indeterminates over K. The polynomial

g(x) = x^n - a1x^(n-1) + a2x^(n-2) - a3x^(n-3) + . . . + (-1)^(n-1)an-1x + (-1)^n an

in K(a1,a2, . . . ,an-1,an)[x] is called the general polynomial of degree n over
K.
Note that the general polynomial of degree n over K is not a polynomial over K, that is, it is not in K[x], but in
K(a1,a2, . . . ,an-1,an)[x].

Our main goal is to prove that the Galois group of the general polynomial
of degree n is the symmetric group Sn. After we have established some
preparatory lemmas, we prove that each permutation of the roots induces
an automorphism of the splitting field if the roots are indeterminates
(Theorem 59.17) and that we can indeed treat the roots of the general
polynomial as indeterminates (Theorem 59.18).
59.15 Lemma: Let D1,D2 be integral domains and F1,F2 the fields of frac-
tions of D1,D2, respectively. If σ: D1 → D2 is a ring isomorphism, then the
mapping

σ1: F1 → F2
a/b ↦ (aσ)/(bσ)

is a field isomorphism.

[(a/b) + (c/d)]σ1 = [(ad + bc)/bd]σ1 = (ad + bc)σ/(bd)σ
= (aσdσ + bσcσ)/(bσdσ) = (aσ/bσ) + (cσ/dσ)
= (a/b)σ1 + (c/d)σ1

and

[(a/b)(c/d)]σ1 = (ac/bd)σ1 = (ac)σ/(bd)σ = (aσcσ)/(bσdσ)
= (aσ/bσ)(cσ/dσ) = (a/b)σ1(c/d)σ1.

Thus σ1 is a field isomorphism.
59.16 Lemma: Let K be a field and let x1,x2, . . . ,xn be n distinct indeter-
minates over K.
(1) For each permutation σ ∈ Sn, the mapping . . .

Proof: (1) Let σ ∈ Sn. The mapping σ´´: K[x1,x2, . . . ,xn] → K[x1,x2, . . . ,xn]

f(x1,x2, . . . ,xn) ↦ f(x1σ,x2σ, . . . ,xnσ)

f1 = Σ xi
f2 = Σ xixj (i < j)
f3 = Σ xixjxk (i < j < k)
. . . . . . . . . . . . . . .
fn = x1x2 . . . xn
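The fi above can be computed mechanically. As a small illustration (a Python sketch, not part of the text; the sample roots are an arbitrary choice), the elementary symmetric polynomials evaluated at concrete roots recover the coefficients of the monic polynomial having those roots:

```python
from itertools import combinations
from math import prod

def elementary_symmetric(roots, k):
    # f_k = sum of all products of k distinct roots
    return sum(prod(c) for c in combinations(roots, k))

# For the roots 1, 2, 3, 4 we have
# (x-1)(x-2)(x-3)(x-4) = x^4 - f1*x^3 + f2*x^2 - f3*x + f4.
roots = [1, 2, 3, 4]
f = [elementary_symmetric(roots, k) for k in range(1, 5)]
print(f)  # [10, 35, 50, 24]
```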
Since E = L(x1,x2,. . . ,xn) is generated by the roots x1,x2,. . . ,xn of g(x) over L,
we deduce that E is a splitting field of h(x) over L (Example 53.5(d)). As
h(x) has no multiple roots, the irreducible factors of h(x) in L[x] are
separable over L. Theorem 55.7 now tells us that E is a Galois extension of L.
Proof: Let a1,a2, . . . ,an-1,an be indeterminates over K, so that g(x) is the general polynomial of degree n over K.
Let x1,x2, . . . ,xn be n indeterminates over K which are distinct from the
a1,a2, . . . ,an. Let E = K(x1,x2, . . . ,xn) and let f1,f2,. . . ,fn be the elementary
symmetric polynomials in K[x1,x2, . . . ,xn] and put L = K(f1,f2, . . . ,fn). We
know Aut_L E ≅ Sn from Theorem 59.17.

K(a1,a2, . . . ,an) = L1 ≅ K(f1,f2, . . . ,fn) = L    (both extensions of K)

The homomorphism σ1: L1[x] → L[x] of Lemma 33.7 maps g(x)
to h(x) = x^n - f1x^(n-1) + f2x^(n-2) - f3x^(n-3) + . . . + (-1)^(n-1)fn-1x + (-1)^n fn.
* *
776
Let p be a prime number. It will be convenient to regard Sp as acting on
the p elements 1,2, . . . ,p of ℤp. For any a ∈ ℤp, a ≠ 0, and b ∈ ℤp, we write

σa,b: ℤp → ℤp
u ↦ au + b

Then

uσa,bσc,d = (au + b)σc,d = c(au + b) + d = cau + cb + d
= (ac)u + (bc + d) = uσac,bc+d,

so σa,bσc,d = σac,bc+d. In the two-row notation,

σa,b = ( 1    2     . . .  p
         a+b  2a+b  . . .  pa+b ).
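The composition rule σa,bσc,d = σac,bc+d is easy to check by machine. Here is a small Python sanity check (not from the text; the prime p = 7 and the parameters a, b, c, d are arbitrary choices), with maps applied on the right as above, so σa,b acts first:

```python
p = 7  # any prime works

def sigma(a, b):
    # the affine permutation u -> a*u + b of Z_p (a not 0 mod p)
    return lambda u: (a * u + b) % p

a, b, c, d = 3, 2, 5, 4
# u(sigma_{a,b} sigma_{c,d}): apply sigma_{a,b} first, then sigma_{c,d}
lhs = [sigma(c, d)(sigma(a, b)(u)) for u in range(p)]
rhs = [sigma((a * c) % p, (b * c + d) % p)(u) for u in range(p)]
print(lhs == rhs)  # True
```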
(2) If σ, σ^2, σ^3, . . . , σ^(p-1) are the only elements of order p in the subgroup H
of Sp, then ⟨σ⟩ is a normal subgroup of H.

uσa,b^n = a^n u + (a^(n-1) + . . . + a + 1)b

(t + 1)ψ = tσψ = tψσ^a = tψ + a, for any t ∈ {1,2, . . . ,p-1,p}.
Then (t + 2)ψ = (t + 1)ψ + a = tψ + 2a,
(t + 3)ψ = (t + 2)ψ + a = tψ + 3a,
and similarly (t + u)ψ = tψ + ua for all t,u ∈ {1,2, . . . ,p-1,p}. Putting t = p ≡ 0
and b = pψ, we get uψ = pψ + ua = au + b for any u = 1,2, . . . ,p-1,p. There-
fore ψ = σa,b ∈ A(p). This proves H ⊆ A(p).
Proof: Let i,j ∈ {1,2, . . . ,p}. We claim that the number of elements in the
H-orbit of i is equal to the number of elements in the H-orbit of j.
Indeed, since K is transitive, there is an α ∈ K with iα = j and the claim follows
in view of Lemma 25.10 and Lemma 25.8. Thus all orbits of H have the
same number of elements, say m. If k is the number of H-orbits, then
{1,2, . . . ,p} is partitioned into k subsets each of which has m
elements. Thus p = mk and k = p or k = 1. If k = p were true, i.e., if there
were p H-orbits, the H-orbits would consist of single elements and we
would get uψ = u for any u ∈ {1,2, . . . ,p}, ψ ∈ H. This would give H = 1,
contrary to the hypothesis. Hence k = 1 and H is transitive.
1,2, . . . ,p, a permutation σa with 1σa = a, and we have the coset
decomposition

G = [StabG(1)]σ1 ∪ [StabG(1)]σ2 ∪ . . . ∪ [StabG(1)]σp,
We can now find all solvable transitive subgroups of Sp. Basically, we use
Lemma 59.21 and Lemma 59.22 to go downwards and upwards along a
composition series of such subgroups.
1 = H0 ⊴ H1 ⊴ H2 ⊴ . . . ⊴ Hm-1 ⊴ Hm = G.

A(p) → ℤp
σa,b ↦ a
less than or equal to the number |Sp:H| of right cosets of H in Sp and, as H
is isomorphic to Sp-2, we have |G| ≤ |Sp:H| = |Sp|/|H| = p!/(p-2)! = p(p-1).
Lemma 59.23 yields that p divides |G|, so there is an element σ´ = (a1a2. . . ap)
of order p in G. If we write

ψ = ( a1 a2 . . . ap
      1  2  . . . p  ) ∈ Sp,

then σ = (12. . . p) = σ´^ψ is an element of order p in G^ψ. Aside from the
powers of σ, there is no permutation of order p in G^ψ, for if τ ∈ G^ψ had order p
and ⟨τ⟩ ∩ ⟨σ⟩ = 1, then |⟨σ⟩⟨τ⟩| = |⟨σ⟩||⟨τ⟩| = p^2 (Lemma 19.6) and so there would be at
least p^2 distinct elements in G^ψ, whereas |G^ψ| = |G| is at most p^2 - p. So ⟨σ⟩
is a normal subgroup of G^ψ by Lemma 59.21(2) and G^ψ is a linear
subgroup of Sp by Lemma 59.21(3). Hence G is conjugate to a linear
subgroup of Sp and G is solvable by Theorem 59.24.
Proof: Let E be a splitting field of f(x) over K and G the Galois group of
f(x). Then E is a Galois extension of K (Theorem 55.7) and G is a
transitive subgroup of Sp (Theorem 56.17). Let a,b be two distinct roots
of f(x) and let J = K(a,b)´ be the subgroup of G corresponding to K(a,b).

E ↔ 1
K(a,b) ↔ J
K ↔ G

E = K(a,b) ⟺ J = 1 ⟺ 1 is the only permutation in G fixing a and b
⟹ G is a solvable subgroup of Sp
⟹ the equation f(x) = 0 is solvable by radicals.
* *
g(x) = x^2 - ax + b

is the general polynomial of degree two over K. Then the roots r1,r2 of
g(x) are given by

r1 = (a + √D)/2,    r2 = (a - √D)/2,

where D = a^2 - 4b.

Proof: The discriminant D of g(x) is (r1 - r2)^2 = (r1 + r2)^2 - 4r1r2 = a^2 - 4b.
Hence r1 + r2 = a and r1 - r2 = √D. Solving this system of linear equations
for r1,r2, we find

r1 = (a + √D)/2,    r2 = (a - √D)/2.

In particular, the primitive cube roots of unity, which are the roots of the poly-
nomial x^2 + x + 1, are given by ω1 = (-1 + √-3)/2, ω2 = (-1 - √-3)/2. This
will be used in the next theorem.
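Numerically, the formulas just proved can be checked with Python's cmath (a sketch, not from the text; the sample coefficients are arbitrary):

```python
import cmath

def quadratic_roots(a, b):
    # roots of g(x) = x^2 - a*x + b, via D = a^2 - 4b
    D = a * a - 4 * b
    s = cmath.sqrt(D)
    return (a + s) / 2, (a - s) / 2

# x^2 - 5x + 6 = (x - 2)(x - 3)
r1, r2 = quadratic_roots(5, 6)
print(r1.real, r2.real)  # 3.0 2.0

# x^2 + x + 1 (here a = -1, b = 1): the primitive cube roots of unity
w1, w2 = quadratic_roots(-1, 1)
print(abs(w1**3 - 1) < 1e-12, abs(w2**3 - 1) < 1e-12)  # True True
```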
59.28 Theorem: Let K be a field with char K ≠ 2,3 and assume that K
contains a primitive cube root of unity, say ω. Let a,b,c be distinct
indeterminates over K and let

g(x) = x^3 - ax^2 + bx - c

be the general cubic polynomial over K. Then the roots r1,r2,r3 of g(x) are
given by

r1 = (1/3)(a + u + v),    r2 = (1/3)(a + ω^2 u + ωv),    r3 = (1/3)(a + ωu + ω^2 v),

where

u = ∛(a^3 - (9/2)ab + (27/2)c + (3/2)√(-3D))
v = ∛(a^3 - (9/2)ab + (27/2)c - (3/2)√(-3D))

are such that uv = a^2 - 3b and D = (4/27)(a^2 - 3b)^3 - (1/27)(2a^3 - 9ab + 27c)^2.
(Diagram: the Galois correspondence E ↔ 1, L(√D) ↔ A3, L ↔ S3, with |L(√D):L| = 2.)
u^3 = (r1 + ωr2 + ω^2 r3)^3 = r1^3 + r2^3 + r3^3 + 3ωA + 3ω^2 B + 6r1r2r3,    (1)

where we put A = r1^2 r2 + r2^2 r3 + r3^2 r1 and B = r1r2^2 + r2r3^2 + r3r1^2 for
shortness. The method of §38 gives

A = (ab - 3c + √D)/2,    B = (ab - 3c - √D)/2,

so (1) becomes

u^3 = [(r1 + r2 + r3)^3 - 3[((ab - 3c + √D)/2) + ((ab - 3c - √D)/2)] - 6r1r2r3]
+ 3((-1 + √-3)/2)((ab - 3c + √D)/2) + 3((-1 - √-3)/2)((ab - 3c - √D)/2) + 6r1r2r3

= a^3 - (9/2)ab + (27/2)c + (3/2)√(-3D)

and likewise

v^3 = a^3 - (9/2)ab + (27/2)c - (3/2)√(-3D).
So u and v are cube roots of the expressions found above. But there are
three cube roots of these expressions, and we must decide which cube
roots we should take. This is found from

uv = (r1 + ωr2 + ω^2 r3)(r1 + ω^2 r2 + ωr3) = r1^2 + r2^2 + r3^2 - r1r2 - r1r3 - r2r3
= a^2 - 3b.

The cube roots must therefore be so chosen that their product is
equal to a^2 - 3b. If

u = ∛(a^3 - (9/2)ab + (27/2)c + (3/2)√(-3D))
v = ∛(a^3 - (9/2)ab + (27/2)c - (3/2)√(-3D))

denote cube roots with this property, then, solving the equations

a = r1 + r2 + r3
u = r1 + ωr2 + ω^2 r3
v = r1 + ω^2 r2 + ωr3

for r1,r2,r3, we get

r1 = (1/3)(a + u + v),    r2 = (1/3)(a + ω^2 u + ωv),    r3 = (1/3)(a + ωu + ω^2 v),

as was to be proved.
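The cube-root formulas can be exercised numerically. The following Python sketch (not from the text; the test polynomial is an arbitrary choice) implements them directly, forcing uv = a^2 - 3b by setting v = (a^2 - 3b)/u after one cube root u has been picked:

```python
import cmath

def cubic_roots(a, b, c):
    # roots of g(x) = x^3 - a*x^2 + b*x - c by the formulas above
    D = (4 * (a**2 - 3*b)**3 - (2*a**3 - 9*a*b + 27*c)**2) / 27
    w = (-1 + cmath.sqrt(-3)) / 2            # primitive cube root of unity
    u3 = a**3 - 4.5*a*b + 13.5*c + 1.5*cmath.sqrt(-3*D)
    u = u3 ** (1 / 3)                        # one cube root of u^3
    v = (a*a - 3*b) / u if u != 0 else 0j    # forces u*v = a^2 - 3b
    return ((a + u + v) / 3,
            (a + w*w*u + w*v) / 3,
            (a + w*u + w*w*v) / 3)

# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
roots = cubic_roots(6, 11, 6)
print(sorted(round(r.real) for r in roots))  # [1, 2, 3]
```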
A remarkable fact is that, if f(x) ∈ ℚ[x] has three real roots, then the
roots of f(x) cannot be expressed in terms of real radicals. We want to
discuss this matter. We need an elementary lemma.
59.29 Lemma: Let K be a field, a ∈ K and p a prime number. Assume
char K ≠ p. If x^p - a ∈ K[x] is reducible in K[x], then a = c^p for some c ∈ K.

x^p - a = (x - u)(x - ωu)(x - ω^2 u) . . . (x - ω^(p-1) u)
Proof: Let r1,r2,r3 be the roots and D = (r1 - r2)^2 (r1 - r3)^2 (r2 - r3)^2 the dis-
criminant of f(x). Then D is a positive real number. Put K1 = K(√D).
Clearly K1 is a subfield of S. We may assume that f(x) is monic.

K1 ⊆ K2 ⊆ . . . ⊆ Kn-1 ⊆ Kn = RK1

such that Ki = Ki-1(ui) for some root ui of a polynomial of the form x^mi - ai
in Ki-1[x] (i = 2,3,. . . ,n). We may assume the mi are prime numbers. Moreover,
after deleting redundant fields, we may assume ui ∉ Ki-1. Thus we
assume the mi are prime, ui^mi ∈ Ki-1 and ui ∉ Ki-1. Then x^mi - ui^mi ∈ Ki-1[x] is
irreducible in Ki-1[x], for otherwise we would have ui^mi = c^mi for some c ∈ Ki-1
(Lemma 59.29) and ui/c, which is distinct from 1 in view of ui ∉ Ki-1,
would be a primitive mi-th root of unity, so ui/c ∈ Ki-1 would be a
complex number with nonzero imaginary part, a contradiction. Therefore
x^mi - ui^mi ∈ Ki-1[x] is irreducible over Ki-1 and is in fact the minimal poly-
nomial of ui ∈ Ki over Ki-1. This gives |Ki:Ki-1| = mi.

f(x) is irreducible in K1[x], for f(x) is the minimal polynomial of any of its
roots over K and if r1, say, were in K1, then 2 = |K1:K| = |K1:K(r1)||K(r1):K|
would be divisible by |K(r1):K| = deg f(x) = 3, which is nonsense. Now S is
a splitting field of f(x) over K1 (Example 53.5(e)) and since √D ∈ K1, the
Galois group Aut_{K1} S is isomorphic to A3 (Theorem 56.21).

On the other hand, the roots of f(x) are in S ∩ R ⊆ RK1 and f(x) is
reducible over RK1 = Kn. Let Ki be the field in the chain above where f(x)
becomes reducible, that is to say, let i ∈ {2, . . . ,n} be such that f(x) is
irreducible over Ki-1 and reducible over Ki = Ki-1(ui). Then there is a root
of f(x) in Ki, say r1 ∈ Ki and, as above, f(x) is the minimal polynomial of r1
over Ki-1, so the prime number mi = |Ki:Ki-1| = |Ki:Ki-1(r1)||Ki-1(r1):Ki-1| is
divisible by |Ki-1(r1):Ki-1| = deg f(x) = 3 and so mi = 3. Thus Ki is an
extension of Ki-1 containing the root r1 of f(x).

Theorem 55.10 now yields that Ki is normal over Ki-1 and since the irre-
ducible polynomial x^mi - ai in Ki-1[x] has a root ui in Ki, the other roots ωui
and ω^2 ui of x^mi - ai are in Ki, so ω = ωui/ui ∈ Ki. This contradicts Ki ⊆ ℝ.
59.31 Theorem: Let K be a field with char K ≠ 2,3 and assume that K
contains a primitive cube root of unity. Let a,b,c,d be distinct
indeterminates over K and let

g(x) = x^4 - ax^3 + bx^2 - cx + d

be the general quartic polynomial over K. Then the roots r1,r2,r3,r4 of g(x)
are given by

r1 = (1/4)(a + √u + √v + √y)
r2 = (1/4)(a + √u - √v - √y)
r3 = (1/4)(a - √u + √v - √y)
r4 = (1/4)(a - √u - √v + √y),

where u = a^2 - 4b + 4α, v = a^2 - 4b + 4β, y = a^2 - 4b + 4γ, and the square
roots are so chosen that

√u √v √y = a^3 - 4ab + 8c.

Proof: Let r1,r2,r3,r4 be the roots of g(x) and let α = r1r2 + r3r4,
β = r1r3 + r2r4, γ = r1r4 + r2r3. Then α, β, γ are the roots of the resolvent
cubic x^3 - bx^2 + (ac - 4d)x - (a^2 d - 4bd + c^2). Put

u = (r1 + r2 - r3 - r4)^2
v = (r1 - r2 + r3 - r4)^2
y = (r1 - r2 - r3 + r4)^2.

Then u = a^2 - 4b + 4α, v = a^2 - 4b + 4β, y = a^2 - 4b + 4γ. Solving the equations

a = r1 + r2 + r3 + r4
√u = r1 + r2 - r3 - r4
√v = r1 - r2 + r3 - r4
√y = r1 - r2 - r3 + r4,

we get

r1 = (1/4)(a + √u + √v + √y),    r2 = (1/4)(a + √u - √v - √y),
r3 = (1/4)(a - √u + √v - √y),    r4 = (1/4)(a - √u - √v + √y).
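The identities used in this proof can be spot-checked numerically; a short Python sketch (not from the book; the sample roots are arbitrary):

```python
from itertools import combinations
from math import prod

r = [1, 2, 4, 7]                       # arbitrary sample roots
a = sum(r)
b = sum(prod(t) for t in combinations(r, 2))
c = sum(prod(t) for t in combinations(r, 3))
alpha = r[0]*r[1] + r[2]*r[3]

# u = (r1 + r2 - r3 - r4)^2 equals a^2 - 4b + 4*alpha
su = r[0] + r[1] - r[2] - r[3]
print(su**2 == a*a - 4*b + 4*alpha)    # True

# the product of the three chosen square roots is a^3 - 4ab + 8c
sv = r[0] - r[1] + r[2] - r[3]
sy = r[0] - r[1] - r[2] + r[3]
print(su * sv * sy, a**3 - 4*a*b + 8*c)  # 64 64
```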
* *
From elementary geometry, it is known that, for any given line l and any
given point P, we can draw a line through P parallel to l and also a line
through P perpendicular to l.
After the introduction of a coordinate system on the plane, we see that a
real number a ≠ 0 is constructible if and only if the line segment
[0,a] × {0} or {0} × [0,a] is constructible. (Closed intervals. Here [0,a] is to
be read as [a,0] when a < 0.)
Now let K be a subfield of ℝ. We determine the nature of the intersection
points of two K-straight lines and/or K-circles that arise as a result
of one of the steps (i), (ii), (iii).
So each step in a ruler and compass construction gives rise to a K-point
or a K(√D)-point for some D ∈ K, if K denotes the field of the lines/circles
used in that step.

ℚ = K0 ⊆ K1 ⊆ K2 ⊆ . . . ⊆ Kn-1 ⊆ Kn
The converse of Theorem 59.32 is also true. See Ex. 10. We are now in a
position to resolve some famous construction problems. The first one is
the construction of a cube whose volume is twice the volume of a given
cube (duplication of a cube). Choosing the length of a side of the given
cube as unit length, the side of the cube to be constructed has length ∛2.
Thus the problem is to construct the real number ∛2. Its minimal
polynomial is x^3 - 2, since this polynomial is irreducible over ℚ by
Eisenstein's criterion. Thus ∛2 is algebraic over ℚ, but its degree over ℚ
is three, not a power of two. Hence ∛2 cannot be constructed: it is
impossible to duplicate a cube by ruler and compass alone.
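Since a cubic is reducible over ℚ exactly when it has a rational root, the irreducibility of x^3 - 2 can also be confirmed by the rational root test; a Python sketch (not from the text):

```python
from fractions import Fraction

def has_rational_root(coeffs):
    # rational root test for a monic cubic with integer coefficients:
    # any rational root must be an integer dividing the constant term
    const = coeffs[-1]
    candidates = [Fraction(s * d) for d in range(1, abs(const) + 1)
                  if const % d == 0 for s in (1, -1)]
    def value(x):
        acc = Fraction(0)
        for c in coeffs:
            acc = acc * x + c
        return acc
    return any(value(x) == 0 for x in candidates)

# x^3 - 2: no rational root, hence irreducible over Q,
# so 2^(1/3) has degree 3 over Q -- not a power of two.
print(has_rational_root([1, 0, 0, -2]))  # False
```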
The second problem is to divide a given angle into three equal parts
(trisection of an angle). An angle of θ radians is the circular arc of length θ
on the unit circle, which we may assume to issue from the point (1,0)
and terminate at the point (cos θ, sin θ). It is constructible if and only if
(cos θ, sin θ) is constructible. In view of sin θ = √(1 - (cos θ)^2), we see
that an angle of θ radians is constructible if and only if cos θ is
constructible. The problem is thus equivalent to: given cos θ, construct
cos (θ/3). From the trigonometric identity cos θ = 4(cos (θ/3))^3 - 3 cos (θ/3),
we see that cos (θ/3) is a root of the equation

4x^3 - 3x - a = 0,    where a = cos θ.
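For θ = 60°, this cubic shows trisection is impossible: cos 20° satisfies 4x^3 - 3x - 1/2 = 0, i.e. 8x^3 - 6x - 1 = 0, which has no rational root and hence is irreducible over ℚ, so cos 20° has degree 3 over ℚ, not a power of two. A Python sketch of both facts (not from the text):

```python
from fractions import Fraction
from math import cos, pi

# cos 20 deg is a root of 8x^3 - 6x - 1 = 0 (from cos 60 deg = 1/2)
x = cos(pi / 9)
print(abs(8 * x**3 - 6 * x - 1) < 1e-9)  # True

# rational root test: any rational root p/q has p | 1 and q | 8
candidates = [Fraction(s, q) for s in (1, -1) for q in (1, 2, 4, 8)]
print(any(8 * t**3 - 6 * t - 1 == 0 for t in candidates))  # False
```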
The third problem is to draw a square whose area is the area of a given
circle (squaring the circle). Choosing the radius of the given circle as unit
length, the side of the square to be constructed has length √π. Thus the
problem is to construct the real number √π. But π, and all the more so √π,
are not algebraic over ℚ (Example 49.8(d)), let alone of degree a
power of two. Hence √π cannot be constructed: it is impossible to square
the circle by ruler and compass alone.
The final problem is to draw a regular n-gon. This is the same problem
as dividing the circle into n equal parts. Thus we are to divide the angle
of 2π radians into n equal parts, which means we are to construct the
number cos (2π/n). Now cos (2π/n) = (ζ + ζ^-1)/2, where ζ = e^(2πi/n) is
a primitive n-th root of unity. The field ℚ(cos (2π/n)) is fixed only by
the automorphisms ζ ↦ ζ and ζ ↦ ζ^-1 in the Galois group of the
cyclotomic extension ℚ(ζ)/ℚ, which is Galois and of degree φ(n) over ℚ
(Theorem 58.12). So ℚ(cos (2π/n)) is an intermediate field of ℚ(ζ)/ℚ
satisfying |ℚ(ζ):ℚ(cos (2π/n))| = |{ζ ↦ ζ, ζ ↦ ζ^-1}| = 2. Thus |ℚ(cos (2π/n)):ℚ| =
φ(n)/2. Hence, if cos (2π/n) is constructible, then φ(n)/2 and
consequently also φ(n) is a power of two. Let n = 2^a0 p1^a1 p2^a2 . . . pr^ar be the
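Together with Ex. 12 (the converse direction), the criterion says a regular n-gon is constructible exactly when φ(n) is a power of two. A Python sketch (not from the text) listing the small constructible n:

```python
def phi(n):
    # Euler's totient via the prime factorization of n
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        result -= result // m
    return result

def is_power_of_two(k):
    return k > 0 and k & (k - 1) == 0

constructible = [n for n in range(3, 21) if is_power_of_two(phi(n))]
print(constructible)  # [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```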
Exercises
Prove that Theorem 59.8, Theorem 59.9, Theorem 59.10 and Theorem
59.11 remain valid with this new definition of radical extensions.
Prove that Theorem 59.8 remains valid with this new definition of
radical extensions if we assume E is normal over K. Discuss Theorem
59.9, Theorem 59.10 and Theorem 59.11.
6. Find five irreducible polynomials in ℚ[x] whose Galois groups are S5.
7. Show that ∛(2 + √-121) + ∛(2 - √-121) = 4 (Rafael Bombelli (ca. 1520-
1572)).
8. Let K be a subfield of ℝ and let f(x) be a cubic polynomial in K[x]. Let
D be the discriminant of f(x). Prove that
(a) D > 0 if and only if f(x) has three distinct real roots;
(b) D < 0 if and only if f(x) has one real and two complex conjugate
roots;
(c) D = 0 if and only if f(x) has three real roots, one of which is
repeated.
10. Prove the converse of Theorem 59.32. (Hint: Show that, if the degree
of Kn over ℚ is a power of two, so is the degree over ℚ of the normal
closure of Kn over ℚ. Use the Galois correspondence and Ex. 12 in §26.)
11. Prove that the angle 90° can and the angle 60° cannot be trisected
by ruler and compass alone.
12. Show that, if n has the form n = 2^a0 p1p2. . . pr, where the pi are distinct
Fermat primes, then a regular n-gon is constructible by ruler and
compass alone.