
# Free Encyclopedia of Mathematics (0.0.1) – volume 2
Chapter 242

16-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

242.1 direct product of modules

Let $\{X_i : i \in I\}$ be a collection of modules in some category of modules. Then the
direct product $\prod_{i \in I} X_i$ of that collection is the module whose underlying set is the cartesian product
of the $X_i$, with componentwise addition and scalar multiplication. For example, in a category
of left modules:
$(x_i) + (y_i) = (x_i + y_i),$
$r(x_i) = (r x_i).$

For each $j \in I$ we have a projection $p_j : \prod_{i \in I} X_i \to X_j$ defined by $(x_i) \mapsto x_j$, and an injection
$\lambda_j : X_j \to \prod_{i \in I} X_i$ where an element $x_j$ of $X_j$ maps to the element of $\prod_{i \in I} X_i$ whose jth
term is $x_j$ and every other term is zero.

The direct product $\prod_{i \in I} X_i$ satisfies a certain universal property. Namely, if Y is a module
and there exist homomorphisms $f_i : Y \to X_i$ for all $i \in I$, then there exists a unique
homomorphism $\varphi : Y \to \prod_{i \in I} X_i$ satisfying $p_i \varphi = f_i$ for all $i \in I$.

[Diagram: commutative triangle with $f_i : Y \to X_i$, $\varphi : Y \to \prod_{i \in I} X_i$, and $p_i : \prod_{i \in I} X_i \to X_i$.]

The direct product is often referred to as the complete direct sum, or the strong direct sum,
or simply the product.

Compare this to the direct sum of modules.

Version: 3 Owner: antizeus Author(s): antizeus

242.2 direct sum

Let $\{X_i : i \in I\}$ be a collection of modules in some category of modules. Then the direct
sum $\bigoplus_{i \in I} X_i$ of that collection is the submodule of the direct product of the $X_i$ consisting
of all elements $(x_i)$ such that all but a finite number of the $x_i$ are zero.

For each $j \in I$ we have a projection $p_j : \bigoplus_{i \in I} X_i \to X_j$ defined by $(x_i) \mapsto x_j$, and an injection
$\lambda_j : X_j \to \bigoplus_{i \in I} X_i$ where an element $x_j$ of $X_j$ maps to the element of $\bigoplus_{i \in I} X_i$ whose jth
term is $x_j$ and every other term is zero.

The direct sum $\bigoplus_{i \in I} X_i$ satisfies a certain universal property. Namely, if Y is a module
and there exist homomorphisms $f_i : X_i \to Y$ for all $i \in I$, then there exists a unique
homomorphism $\varphi : \bigoplus_{i \in I} X_i \to Y$ satisfying $\varphi \lambda_i = f_i$ for all $i \in I$.

[Diagram: commutative triangle with $f_i : X_i \to Y$, $\lambda_i : X_i \to \bigoplus_{i \in I} X_i$, and $\varphi : \bigoplus_{i \in I} X_i \to Y$.]

The direct sum is often referred to as the weak direct sum or simply the sum.
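
As a small illustration of how the direct sum sits inside the direct product, one can take the index set to be the natural numbers and every module to be Z. This choice is an assumption made here for concreteness and is not part of the entry itself.

```latex
\[
  \bigoplus_{i \in \mathbb{N}} \mathbb{Z}
  \;\subsetneq\;
  \prod_{i \in \mathbb{N}} \mathbb{Z},
  \qquad
  (1, 1, 1, \dots) \in \prod_{i \in \mathbb{N}} \mathbb{Z}
  \quad\text{but}\quad
  (1, 1, 1, \dots) \notin \bigoplus_{i \in \mathbb{N}} \mathbb{Z}.
\]
```

The constant sequence has infinitely many nonzero terms, so it lies in the product only. When the index set is finite, the two constructions have the same underlying module.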

Compare this to the direct product of modules.

Version: 3 Owner: antizeus Author(s): antizeus

242.3 exact sequence

If we have two homomorphisms f : A → B and g : B → C in some category of modules,
then we say that f and g are exact at B if the image of f is equal to the kernel of g.

A sequence of homomorphisms
$$\cdots \to A_{n+1} \xrightarrow{f_{n+1}} A_n \xrightarrow{f_n} A_{n-1} \to \cdots$$
is said to be exact if each pair of adjacent homomorphisms $(f_{n+1}, f_n)$ is exact, in other
words if $\operatorname{im} f_{n+1} = \ker f_n$ for all n.
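
A standard example, chosen here for illustration (the maps and the name $\pi$ are assumptions of this sketch), is multiplication by 2 on the integers followed by reduction modulo 2:

```latex
\[
  0 \longrightarrow \mathbb{Z} \xrightarrow{\;\cdot 2\;} \mathbb{Z}
    \xrightarrow{\;\pi\;} \mathbb{Z}/2\mathbb{Z} \longrightarrow 0,
  \qquad
  \operatorname{im}(\cdot 2) = 2\mathbb{Z} = \ker \pi .
\]
```

Exactness at the left-hand Z says that multiplication by 2 is injective, and exactness at Z/2Z says that $\pi$ is surjective.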

Compare this to the notion of a chain complex.

Version: 2 Owner: antizeus Author(s): antizeus

242.4 quotient ring

Definition.
Let R be a ring and let I be a two-sided ideal of R. To define the quotient ring R/I, let us first
define an equivalence relation in R. We say that the elements a, b ∈ R are equivalent, written
as a ∼ b, if and only if a − b ∈ I. If a is an element of R, we denote the corresponding
equivalence class by [a]. Thus [a] = [b] if and only if a − b ∈ I. The quotient ring of R
modulo I is the set R/I = {[a] | a ∈ R}, with a ring structure defined as follows. If [a], [b]
are equivalence classes in R/I, then

• [a] + [b] := [a + b],

• [a] · [b] := [a · b].

Here a and b are some elements in R that represent [a] and [b]. By construction, every
element in R/I has such a representative in R. Moreover, since I is a two-sided ideal (closed
under addition and under multiplication by arbitrary elements of R on either side), one can
verify that the ring structure in R/I is well defined.
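
The verification can be sketched as follows. The names $a'$, $b'$, $i$, $j$ are introduced here only for the computation: suppose $[a] = [a']$ and $[b] = [b']$, so that $a' = a + i$ and $b' = b + j$ with $i, j \in I$.

```latex
\begin{align*}
  (a' + b') - (a + b) &= i + j \in I, \\
  a'b' - ab &= (a + i)(b + j) - ab = aj + ib + ij \in I .
\end{align*}
```

Here $aj \in I$ uses that I is a left ideal and $ib \in I$ uses that I is a right ideal, which is why a two-sided ideal is needed. Hence $[a' + b'] = [a + b]$ and $[a'b'] = [ab]$.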

Properties.

1. If R is commutative, then R/I is commutative.

Examples.

1. For any ring R, we have that R/R = {0} and R/{0} ≅ R.

2. Let R = Z, and let I be the set of even numbers. Then R/I contains only two classes;
one for even numbers, and one for odd numbers.

Version: 3 Owner: matte Author(s): matte, djao

Chapter 243

16D10 – General module theory

243.1 annihilator

Let R be a ring.

Suppose that M is a left R-module.

If X is a subset of M, then we define the left annihilator of X in R:
l.ann(X) = {r ∈ R | rx = 0 for all x ∈ X}.

If Z is a subset of R, then we define the right annihilator of Z in M:
r.ann_M(Z) = {m ∈ M | zm = 0 for all z ∈ Z}.

Suppose that N is a right R-module.

If Y is a subset of N, then we define the right annihilator of Y in R:
r.ann(Y ) = {r ∈ R | yr = 0 for all y ∈ Y }.

If Z is a subset of R, then we define the left annihilator of Z in N:
l.ann_N(Z) = {n ∈ N | nz = 0 for all z ∈ Z}.

Version: 3 Owner: antizeus Author(s): antizeus

243.2 annihilator is an ideal

The right annihilator of a right R-module $M_R$ in R is an ideal.

By the distributive law for modules, it is easy to see that $\mathrm{r.ann}(M_R)$ is closed under
addition and right multiplication. Now take $x \in \mathrm{r.ann}(M_R)$ and $r \in R$.

Take any $m \in M_R$. Then $mr \in M_R$, so $(mr)x = 0$ since $x \in \mathrm{r.ann}(M_R)$. Hence $m(rx) = 0$
and $rx \in \mathrm{r.ann}(M_R)$.

An equivalent result holds for left annihilators.

Version: 2 Owner: saforres Author(s): saforres

243.3 artinian

A module M is artinian if it satisfies the following equivalent conditions:

• the descending chain condition holds for submodules of M;
• every nonempty family of submodules of M has a minimal element.

A ring R is left artinian if it is artinian as a left module over itself (i.e. if ${}_RR$ is an artinian
module), and right artinian if it is artinian as a right module over itself (i.e. if $R_R$ is an
artinian module), and simply artinian if both conditions hold.

Version: 3 Owner: antizeus Author(s): antizeus

243.4 composition series

Let R be a ring and let M be a (right or left) R-module. A series of submodules
M = M0 ⊃ M1 ⊃ M2 ⊃ · · · ⊃ Mn = 0
in which each quotient Mi /Mi+1 is simple is called a composition series for M.

A module need not have a composition series. For example, the ring of integers, Z,
considered as a module over itself, does not have a composition series.

A necessary and sufficient condition for a module to have a composition series is that it is
both noetherian and artinian.

If a module does have a composition series, then all composition series are the same length.
This length (the number n above) is called the composition length of the module.
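
For a concrete case, consider Z/12Z as a module over Z. The module is chosen here purely as an illustration; it is not taken from the entry. Two different composition series are:

```latex
\[
  \mathbb{Z}/12\mathbb{Z} \;\supset\; 2\mathbb{Z}/12\mathbb{Z}
  \;\supset\; 4\mathbb{Z}/12\mathbb{Z} \;\supset\; 0,
  \qquad
  \text{quotients } \mathbb{Z}/2\mathbb{Z},\ \mathbb{Z}/2\mathbb{Z},\ \mathbb{Z}/3\mathbb{Z},
\]
\[
  \mathbb{Z}/12\mathbb{Z} \;\supset\; 3\mathbb{Z}/12\mathbb{Z}
  \;\supset\; 6\mathbb{Z}/12\mathbb{Z} \;\supset\; 0,
  \qquad
  \text{quotients } \mathbb{Z}/3\mathbb{Z},\ \mathbb{Z}/2\mathbb{Z},\ \mathbb{Z}/2\mathbb{Z}.
\]
```

Both series have length 3, and the simple quotients agree up to order.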

If R is a semisimple Artinian ring, then $R_R$ and ${}_RR$ always have composition series.

Version: 1 Owner: mclase Author(s): mclase

243.5 conjugate module

If M is a right module over a ring R, and α is an endomorphism of R, we define the
conjugate module $M^\alpha$ to be the right R-module whose underlying set is $\{m^\alpha \mid m \in M\}$,
with abelian group structure identical to that of M (i.e. $(m - n)^\alpha = m^\alpha - n^\alpha$), and scalar
multiplication given by $m^\alpha \cdot r = (m \cdot \alpha(r))^\alpha$ for all m in M and r in R.

In other words, if $\varphi : R \to \operatorname{End}_{\mathbb{Z}}(M)$ is the ring homomorphism that describes the right module action
of R upon M, then $\varphi\alpha$ describes the right module action of R upon $M^\alpha$.

If N is a left R-module, we define ${}^\alpha N$ similarly, with $r \cdot {}^\alpha n = {}^\alpha(\alpha(r) \cdot n)$.
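
The fact that the twisted multiplication on the conjugate of a right module is again a module action can be checked directly. The following is a short sketch, with r and s arbitrary elements of R; the only property of α that it uses is multiplicativity.

```latex
\begin{align*}
  (m^\alpha \cdot r) \cdot s
    &= \bigl(m \cdot \alpha(r)\bigr)^\alpha \cdot s
     = \bigl((m \cdot \alpha(r)) \cdot \alpha(s)\bigr)^\alpha \\
    &= \bigl(m \cdot \alpha(r)\alpha(s)\bigr)^\alpha
     = \bigl(m \cdot \alpha(rs)\bigr)^\alpha
     = m^\alpha \cdot (rs).
\end{align*}
```

Additivity in r follows in the same way from the additivity of α.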

Version: 4 Owner: antizeus Author(s): antizeus

243.6 modular law

Let ${}_RM$ be a left R-module with submodules A, B, C, and suppose C ⊆ B. Then
$$C + (B \cap A) = B \cap (C + A).$$
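
The identity can be checked by brute force on a small example. The sketch below is an illustration only: it assumes the module is Z/nZ over Z, whose submodules are exactly the cyclic subgroups, and the function names are hypothetical helpers introduced for this example.

```python
from itertools import product


def submodule(d, n):
    """Submodule of Z/nZ generated by d, as a frozenset of residues."""
    return frozenset((d * k) % n for k in range(n))


def add(A, B, n):
    """Sum A + B of two submodules of Z/nZ."""
    return frozenset((a + b) % n for a in A for b in B)


def modular_law_holds(n):
    """Check C + (B ∩ A) == B ∩ (C + A) for all submodules with C ⊆ B."""
    # Every subgroup of Z/nZ is cyclic, so these are all the submodules.
    subs = {submodule(d, n) for d in range(n)}
    return all(
        add(C, B & A, n) == B & add(C, A, n)
        for A, B, C in product(subs, repeat=3)
        if C <= B
    )
```

A finite check of this kind is no substitute for a proof, but it makes the roles of A, B and C in the statement concrete.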

Version: 1 Owner: saforres Author(s): saforres

243.7 module

Let R be a ring, and let M be an abelian group.

We say that M is a left R-module if there exists a ring homomorphism $\varphi : R \to \operatorname{End}_{\mathbb{Z}}(M)$
from R to the ring of abelian group endomorphisms on M (in which multiplication of
endomorphisms is composition, using left function notation). We typically denote this function
using a multiplication notation:

[φ(r)](m) = r · m = rm

This ring homomorphism defines what is called a left module action of R upon M.

If R is a unital ring (i.e. a ring with identity), then we typically demand that the ring
homomorphism map the unit 1 ∈ R to the identity endomorphism on M, so that 1 · m = m
for all m ∈ M. In this case we may say that the module is unital.

Typically the abelian group structure on M is expressed in additive terms, i.e. with operator
+, identity element $0_M$ (or just 0), and inverses written in the form −m for m ∈ M.

Right module actions are defined similarly, only with the elements of R being written on
the right sides of elements of M. In this case we either need to use an anti-homomorphism
$R \to \operatorname{End}_{\mathbb{Z}}(M)$, or switch to right notation for writing functions.

Version: 7 Owner: antizeus Author(s): antizeus

243.8 proof of modular law
First we show $C + (B \cap A) \subseteq B \cap (C + A)$:
Note that $C \subseteq B$ and $B \cap A \subseteq B$, and therefore $C + (B \cap A) \subseteq B$.
Further, $C \subseteq C + A$ and $B \cap A \subseteq C + A$, thus $C + (B \cap A) \subseteq C + A$.

Next we show $B \cap (C + A) \subseteq C + (B \cap A)$:
Let $b \in B \cap (C + A)$. Then $b = c + a$ for some $c \in C$ and $a \in A$. Hence $a = b - c$, and so
$a \in B$ since $b \in B$ and $c \in C \subseteq B$.
Hence $a \in B \cap A$, so $b = c + a \in C + (B \cap A)$.

Version: 5 Owner: saforres Author(s): saforres

243.9 zero module

Let R be a ring.

The abelian group which contains only an identity element (zero) gains a trivial R-module
structure, which we call the zero module.

Every R-module M has a zero element and thus a submodule consisting of that element.
This is called the zero submodule of M.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 244

16D20 – Bimodules

244.1 bimodule

Suppose that R and S are rings. An (R, S)-bimodule is an abelian group M which has a left
R-module action as well as a right S-module action, which satisfy the relation r(ms) = (rm)s
for every choice of elements r of R, s of S, and m of M.

An (R, S)-subbimodule of M is a subgroup which is also a left R-submodule and a right
S-submodule.

Version: 3 Owner: antizeus Author(s): antizeus

Chapter 245

16D25 – Ideals

245.1 associated prime

Let R be a ring, and let M be an R-module. A prime ideal P of R is an annihilator prime
for M if P is equal to the annihilator of some nonzero submodule X of M.

Note that if this is the case, then the module $\operatorname{ann}_M(P)$ contains X, has P as its annihilator,
and is a faithful (R/P)-module.

If, in addition, P is equal to the annihilator of a submodule of M that is a fully faithful
(R/P )-module, then we call P an associated prime of M.

Version: 2 Owner: antizeus Author(s): antizeus

245.2 nilpotent ideal

A left (right) ideal I of a ring R is a nilpotent ideal if $I^n = 0$ for some positive integer n.
Here $I^n$ denotes a product of ideals, $I \cdot I \cdots I$.

Version: 2 Owner: antizeus Author(s): antizeus

245.3 primitive ideal

Let R be a ring, and let I be an ideal of R. We say that I is a left (right) primitive ideal if
there exists a simple left (right) R-module X such that I is the annihilator of X in R.

We say that R is a left (right) primitive ring if the zero ideal is a left (right) primitive ideal
of R.

Note that I is a left (right) primitive ideal if and only if R/I is a left (right) primitive ring.

Version: 2 Owner: antizeus Author(s): antizeus

245.4 product of ideals

Let R be a ring, and let A and B be left (right) ideals of R. Then the product of the
ideals A and B, which we denote AB, is the left (right) ideal generated by the products
{ab | a ∈ A, b ∈ B}.

Version: 2 Owner: antizeus Author(s): antizeus

245.5 proper ideal

Suppose R is a ring and I is an ideal of R. We say that I is a proper ideal if I is not equal
to R.

Version: 2 Owner: antizeus Author(s): antizeus

245.6 semiprime ideal

Let R be a ring. An ideal I of R is a semiprime ideal if it satisfies the following equivalent
conditions:

(a) I can be expressed as an intersection of prime ideals of R;

(b) if x ∈ R, and xRx ⊂ I, then x ∈ I;

(c) if J is a two-sided ideal of R and $J^2 \subset I$, then J ⊂ I as well;

(d) if J is a left ideal of R and $J^2 \subset I$, then J ⊂ I as well;

(e) if J is a right ideal of R and $J^2 \subset I$, then J ⊂ I as well.

Here $J^2$ is the product of ideals J · J.

The ring R itself satisfies all of these conditions (including being expressed as an intersection
of an empty family of prime ideals) and is thus semiprime.

A ring R is said to be a semiprime ring if its zero ideal is a semiprime ideal.

Note that an ideal I of R is semiprime if and only if the quotient ring R/I is a semiprime
ring.

Version: 7 Owner: antizeus Author(s): antizeus

245.7 zero ideal

In any ring, the set consisting only of the zero element (i.e. the additive identity) is an ideal
of the left, right, and two-sided varieties. It is the smallest ideal in any ring.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 246

16D40 – Free, projective, and flat
modules and ideals

246.1 finitely generated projective module

Let R be a unital ring. A finitely generated projective right R-module is of the form $eR^n$,
$n \in \mathbb{N}$, where e is an idempotent in $\operatorname{End}_R(R^n)$.

Let A be a unital $C^*$-algebra and p be a projection in $\operatorname{End}_A(A^n)$, $n \in \mathbb{N}$. Then $E = pA^n$ is
a finitely generated projective right A-module. Further, E is a pre-Hilbert A-module with
(A-valued) inner product
$$\langle u, v \rangle = \sum_{i=1}^{n} u_i^* v_i, \qquad u, v \in E.$$

Version: 3 Owner: mhale Author(s): mhale

246.2 flat module

A right module M over a ring R is flat if the tensor product functor M ⊗R (−) is an
exact functor.

Similarly, a left module N over R is flat if the tensor product functor (−) ⊗R N is an exact
functor.

Version: 2 Owner: antizeus Author(s): antizeus

246.3 free module

Let R be a commutative ring. A free module over R is a direct sum of copies of R. In
particular, as every abelian group is a Z-module, a free abelian group is a direct sum of
copies of Z. This is equivalent to saying that the module has a free basis, i.e. a set of
elements with the property that every element of the module can be uniquely expressed as
a linear combination over R of elements of the free basis. In the case that a free module
over R is a sum of finitely many copies of R, then the number of copies is called the rank of
the free module.

An alternative definition of a free module is via its universal property: Given a set X, the
free R-module F (X) on the set X is equipped with a function i : X → F (X) satisfying the
property that for any other R-module A and any function f : X → A, there exists a unique
R-module map h : F (X) → A such that (h ◦ i) = f .
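
One concrete model of this universal property, sketched here under the assumption that R is unital, takes F(X) to be the direct sum of copies of R indexed by X. The symbols $e_x$ and $r_x$ are names introduced for this sketch.

```latex
\[
  F(X) = \bigoplus_{x \in X} R, \qquad
  i(x) = e_x, \qquad
  h\Bigl(\sum_{x \in X} r_x e_x\Bigr) = \sum_{x \in X} r_x f(x),
\]
```

where $e_x$ has 1 in position x and 0 elsewhere, and only finitely many $r_x$ are nonzero. Then $h \circ i = f$ by construction, and h is unique because the $e_x$ generate F(X).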

Version: 4 Owner: mathcam Author(s): mathcam, antizeus

246.4 free module

Let R be a ring. A free module over R is a direct sum of copies of R.

Similarly, as an abelian group is simply a module over Z, a free abelian group is a direct sum
of copies of Z.

This is equivalent to saying that the module has a free basis, i.e. a set of elements with the
property that every element of the module can be uniquely expressed as a linear combination
over R of elements of the free basis.

Version: 1 Owner: antizeus Author(s): antizeus

246.5 projective cover

Let X and P be modules. We say that P is a projective cover of X if P is a projective module
and there exists an epimorphism p : P → X such that ker p is a superfluous submodule of P .

Equivalently, P is a projective cover of X if P is projective, and there is an epimorphism
p : P → X, and if g : P′ → X is an epimorphism from a projective module P′ to X, then
there exists an epimorphism h : P′ → P such that ph = g.

[Diagram: h : P′ → P and g : P′ → X over the exact row P → X → 0 given by p, with ph = g.]

Version: 2 Owner: antizeus Author(s): antizeus

246.6 projective module

A module P is projective if it satisfies the following equivalent conditions:

(a) Every short exact sequence of the form 0 → A → B → P → 0 is split;

(b) The functor Hom(P, −) is exact;

(c) If f : X → Y is an epimorphism and there exists a homomorphism g : P → Y , then
there exists a homomorphism h : P → X such that f h = g.

[Diagram: g : P → Y lifts to h : P → X along the exact row X → Y → 0 given by f, with f h = g.]

(d) The module P is a direct summand of a free module.

Version: 3 Owner: antizeus Author(s): antizeus

Chapter 247

16D50 – Injective modules,
self-injective rings

247.1 injective hull

Let X and Q be modules. We say that Q is an injective hull or injective envelope of X if Q
is both an injective module and an essential extension of X.

Equivalently, Q is an injective hull of X if Q is injective, and X is a submodule of Q, and
if g : X → Q′ is a monomorphism from X to an injective module Q′, then there exists a
monomorphism h : Q → Q′ such that h(x) = g(x) for all x ∈ X.

[Diagram: exact row 0 → X → Q given by the inclusion i, with g : X → Q′ and h : Q → Q′ satisfying hi = g.]

Version: 2 Owner: antizeus Author(s): antizeus

247.2 injective module

A module Q is injective if it satisfies the following equivalent conditions:

(a) Every short exact sequence of the form 0 → Q → B → C → 0 is split;

(b) The functor Hom(−, Q) is exact;

(c) If f : X → Y is a monomorphism and there exists a homomorphism g : X → Q, then
there exists a homomorphism h : Y → Q such that hf = g.
[Diagram: exact row 0 → X → Y given by f, with g : X → Q extending to h : Y → Q, so that hf = g.]

Version: 3 Owner: antizeus Author(s): antizeus

Chapter 248

16D60 – Simple and semisimple
modules, primitive rings and ideals

248.1 central simple algebra

Let K be a field. A central simple algebra A (over K) is an algebra A over K, which is
finite dimensional as a vector space over K, such that

• A has an identity element, as a ring

• A is central: the center of A equals K (for all z ∈ A, we have z · a = a · z for all a ∈ A
if and only if z ∈ K)

• A is simple: for any two-sided ideal I of A, either I = {0} or I = A

By a theorem of Brauer, for every central simple algebra A over K, there exists a unique (up
to isomorphism) division ring D containing K and a unique natural number n such that A
is isomorphic to the ring of n × n matrices with coefficients in D.

Version: 2 Owner: djao Author(s): djao

248.2 completely reducible

A module M is called completely reducible (or semisimple) if it is a direct sum of irreducible
(or simple) modules.

Version: 1 Owner: bwebste Author(s): bwebste

248.3 simple ring

A nonzero ring R is said to be a simple ring if it has no (two-sided) ideal other than the
zero ideal and R itself.

This is equivalent to saying that the zero ideal is a maximal ideal.

If R is a commutative ring with unit, then this is equivalent to being a field.

Version: 4 Owner: antizeus Author(s): antizeus

Chapter 249

16D80 – Other classes of modules and
ideals

249.1 essential submodule

Let X be a submodule of a module Y. We say that X is an essential submodule of Y, and
that Y is an essential extension of X, if whenever A is a nonzero submodule of Y, then
A ∩ X is also nonzero.

A monomorphism f : X → Y is an essential monomorphism if the image imf is an essential
submodule of Y .

Version: 2 Owner: antizeus Author(s): antizeus

249.2 faithful module

Let R be a ring, and let M be an R-module. We say that M is a faithful R-module if its
annihilator annR (M) is the zero ideal.

We say that M is a fully faithful R-module if every nonzero R-submodule of M is faithful.

Version: 3 Owner: antizeus Author(s): antizeus

249.3 minimal prime ideal

A prime ideal P of a ring R is called a minimal prime ideal if it does not properly contain
any other prime ideal of R.

If R is a prime ring, then the zero ideal is a prime ideal, and is thus the unique minimal
prime ideal of R.

Version: 2 Owner: antizeus Author(s): antizeus

249.4 module of finite rank

Let M be a module, and let E(M) be the injective hull of M. Then we say that M has finite
rank if E(M) is a finite direct sum of indecomposable submodules.

This turns out to be equivalent to the property that M has no infinite direct sums of nonzero
submodules.

Version: 3 Owner: antizeus Author(s): antizeus

249.5 simple module

Let R be a ring, and let M be a nonzero R-module. We say that M is a simple or irreducible module
if it contains no submodules other than itself and the zero module.

Version: 2 Owner: antizeus Author(s): antizeus

249.6 superfluous submodule

Let X be a submodule of a module Y . We say that X is a superfluous submodule of Y if
whenever A is a submodule of Y such that A + X = Y , then A = Y .

Version: 2 Owner: antizeus Author(s): antizeus

249.7 uniform module

A module M is said to be uniform if any two nonzero submodules of M must have a nonzero
intersection. This is equivalent to saying that any nonzero submodule is an essential submodule.

Version: 3 Owner: antizeus Author(s): antizeus

Chapter 250

16E05 – Syzygies, resolutions,
complexes

250.1 n-chain

An n-chain on a topological space X is a finite formal sum of n-simplices on X. The group
of such chains is denoted $C_n(X)$. For a CW-complex Y, $C_n(Y) = H_n(Y^n, Y^{n-1})$, where $H_n$
denotes the nth homology group.

The boundary of an n-chain is the (n − 1)-chain given by the formal sum of the boundaries
of its constituent simplices. An n-chain is closed if its boundary is 0 and exact if it is the
boundary of some (n + 1)-chain.

Version: 3 Owner: mathcam Author(s): mathcam

250.2 chain complex

A sequence of modules and homomorphisms
$$\cdots \to A_{n+1} \xrightarrow{d_{n+1}} A_n \xrightarrow{d_n} A_{n-1} \to \cdots$$

is said to be a chain complex or complex if each pair of adjacent homomorphisms $(d_{n+1}, d_n)$
satisfies the relation $d_n d_{n+1} = 0$. This is equivalent to saying that $\operatorname{im} d_{n+1} \subset \ker d_n$. We
often denote such a complex as (A, d) or simply A.
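
To see that a chain complex need not be exact, here is a small example constructed for illustration (it is not from the entry): take every $A_n$ to be Z/8Z and every $d_n$ to be multiplication by 4.

```latex
\[
  \cdots \to \mathbb{Z}/8\mathbb{Z} \xrightarrow{\;\cdot 4\;} \mathbb{Z}/8\mathbb{Z}
         \xrightarrow{\;\cdot 4\;} \mathbb{Z}/8\mathbb{Z} \to \cdots,
  \qquad
  d_n d_{n+1} = (\cdot 16) = 0,
\]
\[
  \operatorname{im} d_{n+1} = \{0, 4\}
  \;\subsetneq\;
  \{0, 2, 4, 6\} = \ker d_n .
\]
```

The composite of adjacent maps vanishes, so this is a complex, but the image is strictly smaller than the kernel, so the sequence is not exact.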

Compare this to the notion of an exact sequence.

Version: 4 Owner: antizeus Author(s): antizeus

250.3 flat resolution

Let M be a module. A flat resolution of M is an exact sequence of the form

· · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0

where each Fn is a flat module.

Version: 2 Owner: antizeus Author(s): antizeus

250.4 free resolution

Let M be a module. A free resolution of M is an exact sequence of the form

· · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0

where each Fn is a free module.

Version: 2 Owner: antizeus Author(s): antizeus

250.5 injective resolution

Let M be a module. An injective resolution of M is an exact sequence of the form

0 → M → Q0 → Q1 → · · · → Qn−1 → Qn → · · ·

where each Qn is an injective module.

Version: 2 Owner: antizeus Author(s): antizeus

250.6 projective resolution

Let M be a module. A projective resolution of M is an exact sequence of the form

· · · → Pn → Pn−1 → · · · → P1 → P0 → M → 0

where each Pn is a projective module.

Version: 2 Owner: antizeus Author(s): antizeus

250.7 short exact sequence

A short exact sequence is an exact sequence of the form

0 → A → B → C → 0.

Note that in this case, the homomorphism A → B must be a monomorphism, and the
homomorphism B → C must be an epimorphism.

Version: 2 Owner: antizeus Author(s): antizeus

250.8 split short exact sequence
In an abelian category, a short exact sequence $0 \to A \xrightarrow{f} B \xrightarrow{g} C \to 0$ is split if it satisfies
the following equivalent conditions:

(a) there exists a homomorphism h : C → B such that $gh = 1_C$;

(b) there exists a homomorphism j : B → A such that $jf = 1_A$;

(c) B is isomorphic to the direct sum A ⊕ C.

In this case, we say that h and j are backmaps or splitting backmaps.
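
Two small examples over Z, chosen here for illustration with explicitly assumed maps, show both behaviours.

```latex
\[
  0 \to \mathbb{Z}/2\mathbb{Z} \xrightarrow{\;f\;} \mathbb{Z}/6\mathbb{Z}
    \xrightarrow{\;g\;} \mathbb{Z}/3\mathbb{Z} \to 0,
  \qquad f(1) = 3, \quad g(x) = x \bmod 3,
\]
\[
  h(1) = 4 \;\Rightarrow\; g(h(1)) = 1,
  \qquad
  j(x) = x \bmod 2 \;\Rightarrow\; j(f(1)) = 1 .
\]
\[
  0 \to \mathbb{Z}/2\mathbb{Z} \xrightarrow{\;\cdot 2\;} \mathbb{Z}/4\mathbb{Z}
    \to \mathbb{Z}/2\mathbb{Z} \to 0
  \quad\text{does not split, since }
  \mathbb{Z}/4\mathbb{Z} \not\cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}.
\]
```

In the first sequence h and j are backmaps; in the second, Z/4Z contains an element of order 4 while the direct sum does not.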

Version: 4 Owner: antizeus Author(s): antizeus

250.9 von Neumann regular

An element a of a ring R is said to be von Neumann regular if there exists b ∈ R such
that aba = a.

A ring R is said to be a von Neumann regular ring (or simply a regular ring, if the
meaning is clear from context) if every element of R is von Neumann regular.

Note that regular ring in the sense of von Neumann should not be confused with regular
ring in the sense of commutative algebra.
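
The defining condition is easy to test by exhaustive search in a finite ring. The sketch below assumes the ring is Z/nZ, and the function names are hypothetical helpers introduced for this example.

```python
def is_von_neumann_regular(a, n):
    """True if some b in Z/nZ satisfies a*b*a == a (mod n)."""
    return any((a * b * a) % n == a % n for b in range(n))


def regular_elements(n):
    """List the von Neumann regular elements of Z/nZ."""
    return [a for a in range(n) if is_von_neumann_regular(a, n)]
```

For instance, every element of Z/6Z is regular, whereas in Z/4Z the element 2 is not, because 2·b·2 = 4b is always 0 modulo 4.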

Version: 1 Owner: igor Author(s): igor

Chapter 251

16K20 – Finite-dimensional

251.1 quaternion algebra

A quaternion algebra over a field K is a central simple algebra over K which is four-dimensional
as a vector space over K.

Examples:

• For any field K, the ring $M_{2 \times 2}(K)$ of 2 × 2 matrices with entries in K is a quaternion
algebra over K. If K is algebraically closed, then all quaternion algebras over K are
isomorphic to $M_{2 \times 2}(K)$.

• For K = R, the well known algebra H of Hamiltonian quaternions is a quaternion
algebra over R. The two algebras H and $M_{2 \times 2}(\mathbb{R})$ are the only quaternion algebras
over R, up to isomorphism.

• When K is a number field, there are infinitely many non-isomorphic quaternion algebras
over K. In fact, there is one such quaternion algebra for every even-sized finite
collection of finite primes or real primes of K. The proof of this deep fact leads to
many of the major results of class field theory.

Version: 1 Owner: djao Author(s): djao

Chapter 252

16K50 – Brauer groups

252.1 Brauer group

Let K be a field. The Brauer group Br(K) of K is the set of all equivalence classes of
central simple algebras over K, where two central simple algebras A and B are equivalent
if there exists a division ring D over K and natural numbers n, m such that A (resp. B) is
isomorphic to the ring of n × n (resp. m × m) matrices with coefficients in D.

The group operation in Br(K) is given by tensor product: for any two central simple
algebras A, B over K, their product in Br(K) is the central simple algebra $A \otimes_K B$. The
identity element in Br(K) is the class of K itself, and the inverse of a central simple algebra
A is the opposite algebra $A^{\mathrm{opp}}$ defined by reversing the order of the multiplication operation
of A.

Version: 5 Owner: djao Author(s): djao

Chapter 253

16K99 – Miscellaneous

253.1 division ring

A division ring is a ring D with identity such that

• 1 ≠ 0

• For all nonzero a ∈ D, there exists b ∈ D with a · b = b · a = 1

A field is equivalent to a commutative division ring.

Version: 3 Owner: djao Author(s): djao

Chapter 254

16N20 – Jacobson radical, quasimultiplication

254.1 Jacobson radical

The Jacobson radical J(R) of a ring R is the intersection of the annihilators of irreducible
left R-modules.

The following are alternate characterizations of the Jacobson radical J(R):

1. The intersection of all left primitive ideals.

2. The intersection of all maximal left ideals.

3. The set of all t ∈ R such that for all r ∈ R, 1 − rt is left invertible (i.e. there exists u
such that u(1 − rt) = 1).

4. The largest ideal I such that for all v ∈ I, 1 − v is a unit in R.

5. (1) - (3) with “left” replaced by “right” and rt replaced by tr.

Note that if R is commutative and finitely generated, then

$$J(R) = \{x \in R \mid x^n = 0 \text{ for some } n \in \mathbb{N}\} = \operatorname{Nil}(R)$$
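
Characterization 3 lends itself to a direct computation in a finite commutative ring, where left invertible simply means invertible. The sketch below assumes the ring is Z/nZ and compares the result with the set of nilpotent elements; the function names are hypothetical helpers introduced for this example.

```python
from math import gcd


def jacobson_radical(n):
    """Elements t of Z/nZ such that 1 - r*t is a unit for every r."""
    def is_unit(x):
        # x is invertible modulo n exactly when gcd(x, n) == 1.
        return gcd(x % n, n) == 1
    return [t for t in range(n) if all(is_unit(1 - r * t) for r in range(n))]


def nilradical(n):
    """Elements x of Z/nZ with x**k == 0 (mod n) for some k >= 1."""
    return [x for x in range(n)
            if any(pow(x, k, n) == 0 for k in range(1, n + 1))]
```

For n = 12 both functions return the classes of 0 and 6, in agreement with the note that J(R) = Nil(R) in this setting.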

Version: 13 Owner: saforres Author(s): saforres

254.2 a ring modulo its Jacobson radical is semiprimitive

Let R be a ring. Then J(R/J(R)) = (0).

Let [u] ∈ J(R/J(R)). Then by one of the alternate characterizations of the Jacobson radical,
1 − [r][u] is left invertible for all r ∈ R, so there exists v ∈ R such that [v](1 − [r][u]) = 1.

Then v(1 − ru) = 1 − a for some a ∈ J(R).

So wv(1 − ru) = 1, since w(1 − a) = 1 for some w ∈ R. Since this holds for all r ∈ R, we have
u ∈ J(R), and hence [u] = 0.

Version: 3 Owner: saforres Author(s): saforres

254.3 examples of semiprimitive rings

Examples of semiprimitive rings:

The integers Z:

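Since Z is commutative, any left ideal is two-sided. So the maximal left ideals of Z are
the maximal ideals of Z, which are the ideals pZ for p prime. Note that $p\mathbb{Z} \cap q\mathbb{Z} = pq\mathbb{Z}$ for
distinct primes p and q, and that a nonzero integer has only finitely many prime divisors, so
no nonzero integer lies in every pZ.
Hence $J(\mathbb{Z}) = \bigcap_p p\mathbb{Z} = (0)$.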

A matrix ring $M_n(D)$ over a division ring D:

The ring $M_n(D)$ is simple, so the only proper ideal is (0). Thus $J(M_n(D)) = (0)$.

A polynomial ring R[x] over a domain R:

Take a ∈ J(R[x]) with a ≠ 0. Then ax ∈ J(R[x]), since J(R[x]) is an ideal, and deg(ax) ≥ 1.
By one of the alternate characterizations of the Jacobson radical, 1 − ax is a unit. But
deg(1 − ax) = max{deg(1), deg(ax)} ≥ 1, and since R is a domain every unit of R[x] has degree 0.
So 1 − ax is not a unit, and by this contradiction we see that J(R[x]) = (0).

Version: 5 Owner: saforres Author(s): saforres

254.4 proof of Characterizations of the Jacobson radical

First, note that by definition a left primitive ideal is the annihilator of an irreducible left R-
module, so clearly characterization 1) is equivalent to the definition of the Jacobson radical.

Next, we will prove cyclical containment. Observe that 5) follows after the equivalence of 1)
- 4) is established, since 4) is independent of the choice of left or right ideals.

1) ⊂ 2) We know that every left primitive ideal is the largest ideal contained in a maximal
left ideal. So the intersection of all left primitive ideals will be contained in the
intersection of all maximal left ideals.

2) ⊂ 3) Let $S = \{M : M \text{ a maximal left ideal of } R\}$ and take r ∈ R. Let $t \in \bigcap_{M \in S} M$. Then
$rt \in \bigcap_{M \in S} M$.
Assume 1 − rt is not left invertible; therefore there exists a maximal left ideal $M_0$ of
R such that $R(1 - rt) \subseteq M_0$.
Note then that $1 - rt \in M_0$. Also, by definition of t, we have $rt \in M_0$. Therefore
$1 \in M_0$; this contradiction implies 1 − rt is left invertible.

3) ⊂ 4) We claim that 3) satisfies the condition of 4).
Let K = {t ∈ R : 1 − rt is left invertible for all r ∈ R}.
We shall first show that K is an ideal.
Clearly if t ∈ K, then rt ∈ K. If $t_1, t_2 \in K$, then

$$1 - r(t_1 + t_2) = (1 - rt_1) - rt_2$$

Now there exists $u_1$ such that $u_1(1 - rt_1) = 1$, hence

$$u_1((1 - rt_1) - rt_2) = 1 - u_1 r t_2$$

Similarly, there exists $u_2$ such that $u_2(1 - u_1 r t_2) = 1$, therefore

$$u_2 u_1 (1 - r(t_1 + t_2)) = 1$$

Hence $t_1 + t_2 \in K$.
Now if t ∈ K, r ∈ R, to show that tr ∈ K it suffices to show that 1−tr is left invertible.
Suppose u(1 − rt) = 1, hence u − urt = 1, then tur − turtr = tr.
So (1 + tur)(1 − tr) = 1 + tur − tr − turtr = 1.
Therefore K is an ideal.
Now let v ∈ K. Then there exists u such that u(1 − v) = 1, hence 1 − u = −uv ∈ K,
so u = 1 − (1 − u) is left invertible.
So there exists w such that wu = 1, hence wu(1 − v) = w, then 1 − v = w. Thus
(1 − v)u = 1 and therefore 1 − v is a unit.
Let J be the largest ideal such that, for all v ∈ J, 1 − v is a unit. We claim that
K ⊆ J.
Suppose this were not true; in this case K + J strictly contains J. Consider rx + sy ∈
K + J with x ∈ K, y ∈ J and r, s ∈ R. Now 1 − (rx + sy) = (1 − rx) − sy, and since
rx ∈ K, then 1 − rx = u for some unit u ∈ R.
So $1 - (rx + sy) = u - sy = u(1 - u^{-1}sy)$, and clearly $u^{-1}sy \in J$ since y ∈ J. Hence
$1 - u^{-1}sy$ is also a unit, and thus 1 − (rx + sy) is a unit.
Thus 1 − v is a unit for all v ∈ K + J. But this contradicts the assumption that J is
the largest such ideal. So we must have K ⊆ J.

4) ⊂ 1) We must show that if I is an ideal such that for all u ∈ I, 1 − u is a unit, then
$I \subset \operatorname{ann}({}_RM)$ for every irreducible left R-module ${}_RM$.

Suppose this is not the case, so there exists ${}_RM$ such that $I \not\subset \operatorname{ann}({}_RM)$. Now we
know that $\operatorname{ann}({}_RM)$ is the largest ideal inside some maximal left ideal J of R. Thus
we must also have $I \not\subset J$, or else this would contradict the maximality of $\operatorname{ann}({}_RM)$
inside J.
But since $I \not\subset J$, then by maximality I + J = R, hence there exist u ∈ I and v ∈ J
such that u + v = 1. Then v = 1 − u, so v is a unit and J = R. But since J is a proper
left ideal, this is a contradiction.

Version: 25 Owner: saforres Author(s): saforres

254.5 properties of the Jacobson radical

Theorem:
Let R, T be rings and ϕ : R → T be a surjective homomorphism. Then ϕ(J(R)) ⊆ J(T ).

We shall use the characterization of the Jacobson radical as the set of all a ∈ R such that
for all r ∈ R, 1 − ra is left invertible.

Let a ∈ J(R), t ∈ T . We claim that 1 − tϕ(a) is left invertible:

Since ϕ is surjective, t = ϕ(r) for some r ∈ R. Since a ∈ J(R), we know 1 − ra is left
invertible, so there exists u ∈ R such that u(1 − ra) = 1. Then we have

ϕ(u) (ϕ(1) − ϕ(r)ϕ(a)) = ϕ(u)ϕ(1 − ra) = ϕ(1) = 1

So ϕ(a) ∈ J(T ) as required.

Theorem:
Let R, T be rings. Then J(R × T ) ⊆ J(R) × J(T ).

Let $\pi_1 : R \times T \to R$ be a (surjective) projection. By the previous theorem, $\pi_1(J(R \times T)) \subseteq J(R)$.

Similarly let $\pi_2 : R \times T \to T$ be a (surjective) projection. We see that $\pi_2(J(R \times T)) \subseteq J(T)$.

Now take (a, b) ∈ J(R × T). Note that $a = \pi_1(a, b) \in J(R)$ and $b = \pi_2(a, b) \in J(T)$. Hence
(a, b) ∈ J(R) × J(T) as required.

Version: 8 Owner: saforres Author(s): saforres

254.6 quasi-regularity

An element x of a ring is called right quasi-regular [resp. left quasi-regular] if there is
an element y in the ring such that x + y + xy = 0 [resp. x + y + yx = 0].

For calculations with quasi-regularity, it is useful to introduce the operation ∗ defined:
x ∗ y = x + y + xy.
Thus x is right quasi-regular if there is an element y such that x ∗ y = 0. The operation ∗ is
easily demonstrated to be associative, and x ∗ 0 = 0 ∗ x = x for all x.

An element x is called quasi-regular if it is both left and right quasi-regular. In this case,
there are elements x and y such that x+y+xy = 0 = x+z+zx (equivalently, x∗y = z∗x = 0).
A calculation shows that
y = 0 ∗ y = (z ∗ x) ∗ y = z ∗ (x ∗ y) = z.
So y = z is a unique element, depending on x, called the quasi-inverse of x.
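
The following is a minimal sketch, assuming the ring Z/n (chosen here only for illustration), of the operation ∗ and a brute-force search for quasi-inverses. The names circ and quasi_inverse are hypothetical helpers, not notation from the entry.

```python
def circ(x, y, n):
    """The operation x * y = x + y + xy, computed in Z/n."""
    return (x + y + x * y) % n

def quasi_inverse(x, n):
    """Return y with x * y = y * x = 0 in Z/n, or None if x is not quasi-regular."""
    for y in range(n):
        if circ(x, y, n) == 0 and circ(y, x, n) == 0:
            return y
    return None

n = 8
# 0 is the identity for the operation, and the operation is associative.
assert all(circ(x, 0, n) == x == circ(0, x, n) for x in range(n))
assert all(circ(circ(x, y, n), z, n) == circ(x, circ(y, z, n), n)
           for x in range(n) for y in range(n) for z in range(n))
print(quasi_inverse(2, n), quasi_inverse(1, n))
```

In Z/8 the element 2 is quasi-regular because 1 + 2 = 3 is a unit, while 1 is not because 1 + 1 = 2 is not a unit.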

An ideal (one- or two-sided) of a ring is called quasi-regular if each of its elements is quasi-
regular. Similarly, a ring is called quasi-regular if each of its elements is quasi-regular (such
rings cannot have an identity element).
Lemma 1. Let A be an ideal (one- or two-sided) in a ring R. If each element of A is right
quasi-regular, then A is a quasi-regular ideal.

This lemma means that there is no extra generality gained in defining terms such as right
quasi-regular left ideal, etc.

Quasi-regularity is important because it provides elementary characterizations of the Jacobson radical
for rings without an identity element:

• The Jacobson radical of a ring is the sum of all quasi-regular left (or right) ideals.

• The Jacobson radical of a ring is the largest quasi-regular ideal of the ring.

For rings with an identity element, note that x is [right, left] quasi-regular if and only if 1 + x
is [right, left] invertible in the ring.

Version: 1 Owner: mclase Author(s): mclase

254.7 semiprimitive ring

A ring R is said to be semiprimitive (sometimes semisimple) if its Jacobson radical is the
zero ideal.

Any simple ring is automatically semiprimitive.

A finite direct product of matrix rings over division rings can be shown to be semiprimitive
and both left and right artinian.

The Artin-Wedderburn Theorem states that any semiprimitive ring which is left or right
Artinian is isomorphic to a finite direct product of matrix rings over division rings.

Version: 11 Owner: saforres Author(s): saforres

Chapter 255

16N40 – Nil and nilpotent radicals,
sets, ideals, rings

255.1 Koethe conjecture

The Koethe Conjecture is the statement that for any pair of nil right ideals A and B in any
ring R, the sum A + B is also nil.

If either of A or B is a two-sided ideal, it is easy to see that A + B is nil. Suppose A is a
two-sided ideal, and let x ∈ A + B. The quotient (A + B)/A is nil since it is a homomorphic
image of B. So there is an n > 0 with x^n ∈ A. Then there is an m > 0 such that x^{nm} = (x^n)^m = 0,
because A is nil.

In particular, this means that the Koethe conjecture is true for commutative rings.

It has been shown to be true for many classes of rings, but the general statement is still
unproven, and no counterexample has been found.

Version: 1 Owner: mclase Author(s): mclase

255.2 nil and nilpotent ideals

An element x of a ring is nilpotent if x^n = 0 for some positive integer n.

A ring R is nil if every element in R is nilpotent. Similarly, a one- or two-sided ideal is
called nil if each of its elements is nilpotent.

A ring R [resp. a one- or two-sided ideal A] is nilpotent if R^n = 0 [resp. A^n = 0] for some
positive integer n.

A ring or an ideal is locally nilpotent if every finitely generated subring is nilpotent.

The following implications hold for rings (or ideals):

nilpotent ⇒ locally nilpotent ⇒ nil
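
As a small illustration (an assumption of this sketch, not an example from the entry), nilpotent elements can be found by direct computation in Z/n. The helper name nilpotency_index is hypothetical.

```python
def nilpotency_index(x, n):
    """Smallest k >= 1 with x**k == 0 in Z/n, or None if x is not nilpotent."""
    power = x % n
    for k in range(1, n + 1):
        if power == 0:
            return k
        power = (power * x) % n
    return None

# The ideal (2) = {0, 2, 4, 6} of Z/8 is nil: each element has a finite index.
print({x: nilpotency_index(x, 8) for x in (0, 2, 4, 6)})
# The nilpotent elements of Z/12 are the multiples of 6.
print([x for x in range(12) if nilpotency_index(x, 12) is not None])
```

Here the ideal (2) of Z/8 is even nilpotent, since the product of any three of its elements is divisible by 8.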

Version: 3 Owner: mclase Author(s): mclase

Chapter 256

16N60 – Prime and semiprime rings

256.1 prime ring

A ring R is said to be a prime ring if the zero ideal is a prime ideal.

If R is commutative, this is equivalent to being an integral domain.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 257

16N80 – General radicals and rings

257.1 prime radical

The prime radical of a ring R is the intersection of all the prime ideals of R.

Note that the prime radical is the smallest semiprime ideal of R, and that R is a semiprime ring
if and only if its prime radical is the zero ideal.

Version: 2 Owner: antizeus Author(s): antizeus

257.2 radical theory

Let x◦ be a property which defines a class of rings, which we will call the x◦ -rings.

Then x◦ is a radical property if it satisfies:

1. The class of x◦ -rings is closed under homomorphic images.
2. Every ring R has a largest ideal in the class of x◦ -rings; this ideal is written x◦ (R).
3. x◦ (R/x◦ (R)) = 0.

Note: it is extremely important when interpreting the above definition that your definition
of a ring does not require an identity element.

The ideal x◦ (R) is called the x◦ -radical of R. A ring is called x◦ -radical if x◦ (R) = R, and
is called x◦ -semisimple if x◦ (R) = 0.

If x◦ is a radical property, then the class of x◦ -rings is also called the class of x◦ -radical
rings.

The class of x◦ -radical rings is closed under ideal extensions. That is, if A is an ideal of R,
and A and R/A are x◦ -radical, then so is R.

Radical theory is the study of radical properties and their interrelations. There are several
well-known radicals which are of independent interest in ring theory (See examples – to
follow).

The class of all radicals is however very large. Indeed, it is possible to show that any
partition of the class of simple rings into two classes, R and S gives rise to a radical x◦ with
the property that all rings in R are x◦ -radical and all rings in S are x◦ -semisimple.

A radical x◦ is hereditary if every ideal of an x◦ -radical ring is also x◦ -radical.

A radical x◦ is supernilpotent if the class of x◦ -rings contains all nilpotent rings.

Version: 2 Owner: mclase Author(s): mclase

Chapter 258

16P40 – Noetherian rings and
modules

258.1 Noetherian ring

A ring R is right noetherian (or left noetherian) if R is noetherian as a right module (or
left module), i.e., if the following three equivalent conditions hold:

1. right ideals (or left ideals) are finitely generated

2. the ascending chain condition holds on right ideals (or left ideals)

3. every nonempty family of right ideals (or left ideals) has a maximal element.

We say that R is noetherian if it is both left noetherian and right noetherian. Examples of
Noetherian rings include any field (as the only ideals are 0 and the whole ring) and the ring
Z of integers (each ideal is generated by a single integer, the greatest common divisor of the
elements of the ideal). The Hilbert basis theorem says that a ring R is noetherian iff the
polynomial ring R[x] is.

Version: 10 Owner: KimJ Author(s): KimJ

258.2 noetherian

A module M is noetherian if it satisfies the following equivalent conditions:

• the ascending chain condition holds for submodules of M ;

• every nonempty family of submodules of M has a maximal element;

• every submodule of M is finitely generated.

A ring R is left noetherian if it is noetherian as a left module over itself (i.e. if _R R is a
noetherian module), and right noetherian if it is noetherian as a right module over itself (i.e.
if R_R is a noetherian module), and simply noetherian if both conditions hold.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 259

16P60 – Chain conditions on
annihilators and summands:
Goldie-type conditions, Krull
dimension

259.1 Goldie ring

Let R be a ring. If the set of annihilators {r. ann(x) | x ∈ R} satisfies the ascending chain condition,
then R is said to satisfy the ascending chain condition on right annihilators.

A ring R is called a right Goldie ring if it satisfies the ascending chain condition on right
annihilators and R_R is a module of finite rank.

A left Goldie ring is defined similarly. If the context makes it clear on which side the ring
operates, then such a ring is simply called a Goldie ring.

A right noetherian ring is right Goldie.

Version: 3 Owner: mclase Author(s): mclase

259.2 uniform dimension

Let M be a module over a ring R, and suppose that M contains no infinite direct sums of
non-zero submodules. (This is the same as saying that M is a module of finite rank.)

Then there exists an integer n such that M contains an essential submodule N where

N = U1 ⊕ U2 ⊕ · · · ⊕ Un

is a direct sum of n uniform submodules.

This number n does not depend on the choice of N or the decomposition into uniform
submodules.

We call n the uniform dimension of M. Sometimes this is written u-dim M = n.

If R is a field K, and M is a finite-dimensional vector space over K, then u-dim M = dim_K M.

u-dim M = 0 if and only if M = 0.

Version: 3 Owner: mclase Author(s): mclase

Chapter 260

16S10 – Rings determined by
universal properties (free algebras,
etc.)

260.1 Ore domain

Let R be a domain. We say that R is a right Ore domain if any two nonzero elements of R
have a nonzero common right multiple, i.e. for every pair of nonzero x and y, there exists a
pair of elements r and s of R such that xr = ys ≠ 0.

This condition turns out to be equivalent to the following conditions on R when viewed as
a right R-module:
(a) R_R is a uniform module.
(b) R_R is a module of finite rank.

The definition of a left Ore domain is similar.

If R is a commutative domain, then it is a right (and left) Ore domain.

Version: 6 Owner: antizeus Author(s): antizeus

Chapter 261

16S34 – Group rings, Laurent
polynomial rings

261.1 support

Let R[G] be the group ring of a group G over a ring R.
Let x = Σ_g x_g g be an element of R[G]. The support of x, often written supp(x), is the set
of elements of G which occur with non-zero coefficient in the expansion of x.

Thus:
supp(x) = {g ∈ G | x_g ≠ 0}.
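
A minimal sketch of the definition, assuming (for illustration only) that an element of R[G] is stored as a dictionary from group elements to coefficients, with G the cyclic group of order 3 written additively and R = Z. The names support and multiply are hypothetical.

```python
def support(x):
    """Group elements occurring with non-zero coefficient in x = {g: x_g}."""
    return {g for g, coeff in x.items() if coeff != 0}

def multiply(x, y, op):
    """Product in the group ring; op is the group operation."""
    result = {}
    for g, a in x.items():
        for h, b in y.items():
            k = op(g, h)
            result[k] = result.get(k, 0) + a * b
    return result

def add_mod3(g, h):
    return (g + h) % 3

x = {0: 1, 1: -1}          # 1 - g
y = {0: 1, 1: 1, 2: 1}     # 1 + g + g^2
print(support(x), support(y), support(multiply(x, y, add_mod3)))
```

The product (1 − g)(1 + g + g^2) is zero in this group ring, so its support is empty even though the stored dictionary still has keys; this is why the definition filters on the coefficient rather than on the key set.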

Version: 2 Owner: mclase Author(s): mclase

Chapter 262

16S36 – Ordinary and skew
polynomial rings and semigroup rings

262.1 Gaussian polynomials

For an indeterminate u and integers n ≥ m ≥ 0 we define the following:

(a) (m)_u = u^{m−1} + u^{m−2} + · · · + 1 for m > 0,

(b) (m!)_u = (m)_u (m − 1)_u · · · (1)_u for m > 0, and (0!)_u = 1,

(c) \binom{n}{m}_u = \frac{(n!)_u}{(m!)_u ((n−m)!)_u}. If m > n then we define \binom{n}{m}_u = 0.

The expressions \binom{n}{m}_u are called u-binomial coefficients or Gaussian polynomials.

Note: if we replace u with 1, then we obtain the familiar integers, factorials, and binomial coefficients.
Specifically,

(a) (m)_1 = m,

(b) (m!)_1 = m!,

(c) \binom{n}{m}_1 = \binom{n}{m}.
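
The definitions (a) to (c) can be sketched numerically by substituting a positive integer for the indeterminate u; the quotient in (c) is then an exact integer division, because the Gaussian polynomial has integer coefficients. The function names u_int, u_factorial and u_binomial are hypothetical, and restricting u to positive integers is an assumption of this sketch (it avoids zero denominators).

```python
from math import comb

def u_int(m, u):
    """(m)_u = u^(m-1) + ... + u + 1, evaluated at a positive integer u."""
    return sum(u ** k for k in range(m))

def u_factorial(m, u):
    """(m!)_u = (m)_u (m-1)_u ... (1)_u, with (0!)_u = 1."""
    result = 1
    for k in range(1, m + 1):
        result *= u_int(k, u)
    return result

def u_binomial(n, m, u):
    """Gaussian binomial coefficient evaluated at u; zero when m > n."""
    if m > n:
        return 0
    return u_factorial(n, u) // (u_factorial(m, u) * u_factorial(n - m, u))

print(u_binomial(4, 2, 2))                      # value of the polynomial at u = 2
print(u_binomial(5, 2, 1) == comb(5, 2))        # u = 1 gives the usual binomial
```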

Version: 3 Owner: antizeus Author(s): antizeus

262.2 q skew derivation

Let (σ, δ) be a skew derivation on a ring R. Let q be a central (σ, δ)-constant. Suppose
further that δσ = q · σδ. Then we say that (σ, δ) is a q-skew derivation.

Version: 5 Owner: antizeus Author(s): antizeus

262.3 q skew polynomial ring

If (σ, δ) is a q-skew derivation on R, then we say that the skew polynomial ring R[θ; σ, δ] is
a q-skew polynomial ring.

Version: 3 Owner: antizeus Author(s): antizeus

262.4 sigma derivation

If σ is a ring endomorphism on a ring R, then a (left) σ-derivation is an additive map δ on
R such that δ(x · y) = σ(x) · δ(y) + δ(x) · y for all x, y in R.

Version: 7 Owner: antizeus Author(s): antizeus

262.5 sigma, delta constant

If (σ, δ) is a skew derivation on a ring R, then a (σ, δ)-constant is an element q of R such
that σ(q) = q and δ(q) = 0.

Note: If q is a (σ, δ)-constant, then it follows that σ(q · x) = q · σ(x) and δ(q · x) = q · δ(x)
for all x in R.

Version: 3 Owner: antizeus Author(s): antizeus

262.6 skew derivation

A (left) skew derivation on a ring R is a pair (σ, δ), where σ is a ring endomorphism of R,
and δ is a left σ-derivation on R.

Version: 4 Owner: antizeus Author(s): antizeus

262.7 skew polynomial ring

If (σ, δ) is a left skew derivation on R, then we can construct the (left) skew polynomial ring
R[θ; σ, δ], which is made up of polynomials in an indeterminate θ and left-hand coefficients
from R, with multiplication satisfying the relation

θ · r = σ(r) · θ + δ(r)

for all r in R.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 263

16S99 – Miscellaneous

263.1 algebra

Let A be a ring with identity. An algebra over A is a ring B with identity together with a
ring homomorphism f : A −→ Z(B), where Z(B) denotes the center of B.

Equivalently, an algebra over A is an A–module B which is a ring and satisfies the property

a · (x ∗ y) = (a · x) ∗ y = x ∗ (a · y)

for all a ∈ A and all x, y ∈ B. Here · denotes A–module multiplication and ∗ denotes
ring multiplication in B. One passes between the two definitions as follows: given any ring
homomorphism f : A −→ Z(B), the scalar multiplication rule

a · b := f (a) ∗ b

makes B into an A–module in the sense of the second definition.

Version: 5 Owner: djao Author(s): djao

263.2 algebra (module)

Given a commutative ring R, an algebra over R is a module M over R, endowed with a law
of composition
f :M ×M →M
which is R-bilinear.

Most of the important algebras in mathematics belong to one or the other of two classes:
the unital associative algebras, and the Lie algebras.

263.2.1 Unital associative algebras

In these cases, the "product" (as it is called) of two elements v and w of the module is
denoted simply by vw or v · w or the like.

Any unital associative algebra is an algebra in the sense of djao's entry 263.1 above (a sense which is also used
by Lang in his book Algebra (Springer-Verlag)).

Examples of unital associative algebras:

– tensor algebras and quotients of them

– Cayley algebras, such as the ring of quaternions

– polynomial rings

– the ring of endomorphisms of a vector space, in which the bilinear product of two
mappings is simply the composite mapping.

263.2.2 Lie algebras

In these cases the bilinear product is denoted by [v, w], and satisfies

[v, v] = 0 for all v ∈ M

[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 for all v, w, x ∈ M
The second of these formulas is called the Jacobi identity. One proves easily

[v, w] + [w, v] = 0 for all v, w ∈ M

for any Lie algebra M.

Lie algebras arise naturally from Lie groups, q.v.

Version: 1 Owner: karthik Author(s): Larry Hammick

Chapter 264

16U10 – Integral domains

264.1 Prüfer domain

An integral domain R is a Prüfer domain if every nonzero finitely generated ideal I of R is invertible.

For a prime ideal I of R, let R_I denote the localization of R at I. Then the following statements are equivalent:

• i) R is a Prüfer domain.
• ii) For every prime ideal P in R, R_P is a valuation domain.
• iii) For every maximal ideal M in R, R_M is a valuation domain.

A Prüfer domain is a Dedekind domain if and only if it is noetherian.

If R is a Prüfer domain with quotient field K, then any domain S such that R ⊂ S ⊂ K is
Prüfer.

REFERENCES
1. Thomas W. Hungerford. Algebra. Springer-Verlag, 1974. New York, NY.

Version: 2 Owner: mathcam Author(s): mathcam

264.2 valuation domain

An integral domain R is a valuation domain if for all a, b ∈ R, either a|b or b|a.

Version: 3 Owner: mathcam Author(s): mathcam

Chapter 265

16U20 – Ore rings, multiplicative sets,
Ore localization

265.1 Goldie’s Theorem

Let R be a ring with an identity. Then R has a right classical ring of quotients Q which
is semisimple Artinian if and only if R is a semiprime right Goldie ring. If this is the case,
then the composition length of Q is equal to the uniform dimension of R.

An immediate corollary of this is that a semiprime right noetherian ring always has a right
classical ring of quotients.

This result was discovered by Alfred Goldie in the late 1950’s.

Version: 3 Owner: mclase Author(s): mclase

265.2 Ore condition

A ring R satisfies the left Ore condition (resp. right Ore condition) if and only if for
all elements x and y with x regular, there exist elements u and v with v regular such that

ux = vy (resp. xu = yv).

A ring which satisfies the (left, right) Ore condition is called a (left, right) Ore ring.

Version: 3 Owner: mclase Author(s): mclase

265.3 Ore’s theorem

A ring has a (left, right) classical ring of quotients if and only if it satisfies the (left, right)
Ore condition.

Version: 3 Owner: mclase Author(s): mclase

265.4 classical ring of quotients

Let R be a ring. An element of R is called regular if it is not a right zero divisor or a
left zero divisor in R.

A ring Q ⊃ R is a left classical ring of quotients for R (resp. right classical ring of
quotients for R) if it satisfies:

• every regular element of R is invertible in Q

• every element of Q can be written in the form x^{-1} y (resp. y x^{-1}) with x, y ∈ R and x
regular.

If a ring R has a left or right classical ring of quotients, then it is unique up to isomorphism.

If R is a commutative integral domain, then the left and right classical rings of quotients
always exist – they are the field of fractions of R.

For non-commutative rings, necessary and sufficient conditions are given by Ore’s theorem.

Note that the goal here is to construct a ring which is not too different from R, but in
which more elements are invertible. The first condition says which elements we want to be
invertible. The second condition says that Q should contain just enough extra elements to
make the regular elements invertible.

Such rings are called classical rings of quotients, because there are other rings of quotients.
These all attempt to enlarge R somehow to make more elements invertible (or sometimes to
make ideals invertible).

Finally, note that a ring of quotients is not the same as a quotient ring.

Version: 2 Owner: mclase Author(s): mclase

265.5 saturated

Let S be a multiplicative subset of a ring A. We say that S is saturated if

ab ∈ S ⇒ a, b ∈ S.

When A is an integral domain, S is saturated if and only if its complement A\S is
a union of prime ideals.

Version: 1 Owner: drini Author(s): drini

Chapter 266

16U70 – Center, normalizer (invariant
elements)

266.1 center (rings)

If A is a ring, the center of A, sometimes denoted Z(A), is the set of all elements in A that
commute with all other elements of A. That is,

Z(A) = {a ∈ A | ax = xa for all x ∈ A}

Note that 0 ∈ Z(A) so the center is non-empty. If we assume that A is a ring with a
multiplicative unity 1, then 1 is in the center as well. The center of A is also a subring of A.
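
As a hedged illustration, the center can be computed by exhaustive search in a small non-commutative ring. The sketch below assumes A is the ring of 2 × 2 matrices over Z/p, with a matrix stored as a tuple (a11, a12, a21, a22); the helper names matmul2 and center are hypothetical.

```python
from itertools import product

def matmul2(a, b, p):
    """Product of two 2x2 matrices over Z/p, each given as (a11, a12, a21, a22)."""
    a11, a12, a21, a22 = a
    b11, b12, b21, b22 = b
    return ((a11 * b11 + a12 * b21) % p, (a11 * b12 + a12 * b22) % p,
            (a21 * b11 + a22 * b21) % p, (a21 * b12 + a22 * b22) % p)

def center(p):
    """All matrices that commute with every 2x2 matrix over Z/p."""
    ring = list(product(range(p), repeat=4))
    return [a for a in ring
            if all(matmul2(a, x, p) == matmul2(x, a, p) for x in ring)]

print(center(2))
print(center(3))
```

In both cases only the scalar matrices survive, and they visibly contain 0 and 1 and are closed under the ring operations.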

Version: 3 Owner: dublisk Author(s): dublisk

Chapter 267

16U99 – Miscellaneous

267.1 anti-idempotent

An element x of a ring is called an anti-idempotent element, or simply an anti-idempotent
if x^2 = −x.

The term is most often used in linear algebra. Every anti-idempotent matrix over a field is
diagonalizable. Two anti-idempotent matrices are similar if and only if they have the same
rank.

Version: 1 Owner: mathcam Author(s): mathcam

Chapter 268

16W20 – Automorphisms and
endomorphisms

268.1 ring of endomorphisms

Let R be a ring and let M be a right R-module.

An endomorphism of M is an R-module homomorphism from M to itself. We shall write
endomorphisms on the left, so that f : M → M maps x ↦ f(x). If f, g : M → M are two
endomorphisms, we can add them
f + g : x ↦ f(x) + g(x)
and multiply them
fg : x ↦ f(g(x))
With these operations, the set of endomorphisms of M becomes a ring, which we call the
ring of endomorphisms of M, written End_R(M).

Instead of writing endomorphisms as functions, it is often convenient to write them multiplicatively:
we simply write the application of the endomorphism f as x ↦ fx. Then the
fact that each f is an R-module homomorphism can be expressed as:
f(xr) = (fx)r
for all x ∈ M and r ∈ R and f ∈ End_R(M). With this notation, it is clear that M becomes
an End_R(M)-R-bimodule.

Now, let N be a left R-module. We can construct the ring End_R(N) in the same way. There
is a complication, however, if we still think of endomorphisms as functions written on the
left. In order to make N into a bimodule, we need to define an action of End_R(N) on the
right of N: say
x · f = f(x)

But then we have a problem with the multiplication:

x · fg = fg(x) = f(g(x))

but
(x · f) · g = f(x) · g = g(f(x)).
In order to make this work, we need to reverse the order of composition when we define
multiplication in the ring End_R(N) when it acts on the right.

There are essentially two different ways to go from here. One is to define the multiplication in
End_R(N) the other way, which is most natural if we write the endomorphisms as functions
on the right. This is the approach taken in many older books.

The other is to leave the multiplication in End_R(N) the way it is, but to use the opposite ring
to define the bimodule. This is the approach that is generally taken in more recent works.
Using this approach, we conclude that N is an R-End_R(N)^op-bimodule. We will adopt this
convention for the lemma below.

Considering R as a right and a left module over itself, we can construct the two endomorphism
rings End_R(R_R) and End_R(_R R).

Lemma 2. Let R be a ring with an identity element. Then R ≅ End_R(R_R) and R ≅
End_R(_R R)^op.

Define ρ_r ∈ End_R(_R R) by x ↦ xr.

A calculation shows that ρ_{rs} = ρ_s ρ_r (functions written on the left) from which it is easily
seen that the map θ : r ↦ ρ_r is a ring homomorphism from R to End_R(_R R)^op.

We must show that this is an isomorphism.

If ρ_r = 0, then r = 1r = ρ_r(1) = 0. So θ is injective.

Let f be an arbitrary element of End_R(_R R), and let r = f(1). Then for any x ∈ R,
f(x) = f(x1) = xf(1) = xr = ρ_r(x), so f = ρ_r = θ(r), and θ is surjective.

The proof of the other isomorphism is similar.

Version: 4 Owner: mclase Author(s): mclase

Chapter 269

16W30 – Coalgebras, bialgebras, Hopf
algebras; rings, modules, etc. on
which these act

269.1 Hopf algebra

A Hopf algebra is a bialgebra A over a field K with a K-linear map S : A → A, called the
antipode, such that

m ◦ (S ⊗ id) ◦ ∆ = η ◦ ε = m ◦ (id ⊗ S) ◦ ∆, (269.1.1)

where m : A ⊗ A → A is the multiplication map m(a ⊗ b) = ab and η : K → A is the unit
map η(k) = k 1_A.

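In terms of a commutative diagram:

[Commutative diagram for (269.1.1): the two composites A → A ⊗ A → A ⊗ A → A, given by ∆, then S ⊗ id (respectively id ⊗ S), then m, both agree with η ◦ ε, which passes from A through the ground field (labelled C in the diagram) back to A.]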

Example 1 (Algebra of functions on a finite group). Let A = C(G) be the algebra of complex-valued
functions on a finite group G and identify C(G × G) with A ⊗ A. Then, A is a
Hopf algebra with comultiplication (∆(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode
(S(f))(x) = f(x^{-1}).
Example 2 (Group algebra of a finite group). Let A = CG be the complex group algebra
of a finite group G. Then, A is a Hopf algebra with comultiplication ∆(g) = g ⊗ g, counit
ε(g) = 1, and antipode S(g) = g^{-1}.
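
As a sketch only, the antipode identity m ◦ (S ⊗ id) ◦ ∆ = η ◦ ε can be checked on the group algebra of the cyclic group Z/n written additively, with an element stored as a dictionary {g: coefficient}. The cyclic group, the dictionary representation and the names antipode_side and unit_counit are assumptions of this illustration.

```python
def antipode_side(x, n):
    """Apply m . (S (x) id) . Delta to x = {g: coeff} in the group algebra of Z/n."""
    out = {}
    for g, c in x.items():
        # Delta(g) = g (x) g, S(g) = -g (the inverse in Z/n), m combines the two factors.
        k = ((-g) + g) % n
        out[k] = out.get(k, 0) + c
    return out

def unit_counit(x):
    """Apply eta . epsilon to x: epsilon(g) = 1 on group elements, eta(k) = k times the identity."""
    return {0: sum(x.values())}

x = {0: 2, 1: 3, 2: -1}
print(antipode_side(x, 3), unit_counit(x))
```

Both sides collapse every group element onto the identity and add up the coefficients, which is the content of the identity in this example.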

The above two examples are dual to one another. Define a bilinear form C(G) ⊗ CG → C
by ⟨f, x⟩ = f(x). Then,
⟨fg, x⟩ = ⟨f ⊗ g, ∆(x)⟩,
⟨1, x⟩ = ε(x),
⟨∆(f), x ⊗ y⟩ = ⟨f, xy⟩,
ε(f) = ⟨f, e⟩,
⟨S(f), x⟩ = ⟨f, S(x)⟩.
Example 3 (Polynomial functions on a Lie group). Let A = Poly(G) be the algebra of
complex-valued polynomial functions on a complex Lie group G and identify Poly(G × G)
with A ⊗ A. Then, A is a Hopf algebra with comultiplication (∆(f ))(x, y) = f (xy), counit
ε(f) = f(e), and antipode (S(f))(x) = f(x^{-1}).
Example 4 (Universal enveloping algebra of a Lie algebra). Let A = U(g) be the universal enveloping algebra
of a complex Lie algebra g. Then, A is a Hopf algebra with comultiplication ∆(X) =
X ⊗ 1 + 1 ⊗ X, counit ε(X) = 0, and antipode S(X) = −X.

The above two examples are dual to one another (if g is the Lie algebra of G). Define a
bilinear form Poly(G) ⊗ U(g) → C by ⟨f, X⟩ = (d/dt)|_{t=0} f(exp(tX)).

Version: 6 Owner: mhale Author(s): mhale

269.2 almost cocommutative bialgebra

A bialgebra A is called almost cocommutative if there is a unit R ∈ A ⊗ A such that
R∆(a) = ∆^op(a)R
for all a ∈ A, where ∆^op is the opposite comultiplication (the usual comultiplication, composed with the
flip map of the tensor product A ⊗ A). The element R is often called the R-matrix of A.

The significance of the almost cocommutative condition is that σ_{V,W} = σ ◦ R : V ⊗ W →
W ⊗ V, where σ is the flip map, gives a natural isomorphism of bialgebra representations, where V and W are A-modules,
making the category of A-modules into a quasi-tensor or braided monoidal category.
Note that σ_{W,V} ◦ σ_{V,W} is not necessarily the identity (this is the braiding of the category).

Version: 2 Owner: bwebste Author(s): bwebste

269.3 bialgebra

A bialgebra is a vector space that is both a unital algebra and a coalgebra, such
that the comultiplication and counit are unital algebra homomorphisms.

Version: 2 Owner: mhale Author(s): mhale

269.4 coalgebra

A coalgebra is a vector space A over a field K with a K-linear map ∆ : A →
A ⊗ A, called the comultiplication, and a (non-zero) K-linear map ε : A → K, called the
counit, such that
(∆ ⊗ id) ◦ ∆ = (id ⊗ ∆) ◦ ∆ (coassociativity), (269.4.1)

(ε ⊗ id) ◦ ∆ = id = (id ⊗ ε) ◦ ∆. (269.4.2)

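In terms of commutative diagrams:

[Commutative diagram for (269.4.1): the two composites A → A ⊗ A → A ⊗ A ⊗ A, given by ∆ followed by ∆ ⊗ id or by id ⊗ ∆, agree.]

[Commutative diagram for (269.4.2): the two composites A → A ⊗ A → A, given by ∆ followed by ε ⊗ id or by id ⊗ ε, both equal id.]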

Let σ : A ⊗ A → A ⊗ A be the flip map σ(a ⊗ b) = b ⊗ a. A coalgebra is said to be
cocommutative if σ ◦ ∆ = ∆.

Version: 4 Owner: mhale Author(s): mhale

269.5 coinvariant

Let V be a comodule with a right coaction t : V → V ⊗ A of a coalgebra A. An element
v ∈ V is right coinvariant if
t(v) = v ⊗ 1_A. (269.5.1)

Version: 1 Owner: mhale Author(s): mhale

269.6 comodule

Let (A, ∆, ε) be a coalgebra. A right A-comodule is a vector space V with a linear map t : V → V ⊗ A, called
the right coaction, satisfying
(t ⊗ id) ◦ t = (id ⊗ ∆) ◦ t, (id ⊗ ε) ◦ t = id. (269.6.1)
An A-comodule is also referred to as a corepresentation of A.

Let V and W be two right A-comodules. Then V ⊕ W is also a right A-comodule. If A is
a bialgebra then V ⊗ W is a right A-comodule as well (make use of the multiplication map
A ⊗ A → A).

Version: 2 Owner: mhale Author(s): mhale

269.7 comodule algebra

Let H be a bialgebra. A right H-comodule algebra is a unital algebra A which is a right
H-comodule satisfying

t(ab) = t(a)t(b) = Σ a^{(1)} b^{(1)} ⊗ a^{(2)} b^{(2)}, t(1_A) = 1_A ⊗ 1_H, (269.7.1)

for all a, b ∈ A, where the coaction is written t(a) = Σ a^{(1)} ⊗ a^{(2)}.

There is a dual notion of an H-module coalgebra.
Example 5. Let H be a bialgebra. Then H is itself an H-comodule algebra for the right
regular coaction t(h) = ∆(h).

Version: 5 Owner: mhale Author(s): mhale

269.8 comodule coalgebra

Let H be a bialgebra. A right H-comodule coalgebra is a coalgebra A which is a right
H-comodule satisfying

(∆ ⊗ id)t(a) = Σ a_{(1)}^{(1)} ⊗ a_{(2)}^{(1)} ⊗ a_{(1)}^{(2)} a_{(2)}^{(2)}, (ε ⊗ id)t(a) = ε(a)1_H, (269.8.1)
for all a ∈ A, where ∆(a) = Σ a_{(1)} ⊗ a_{(2)} and t(a) = Σ a^{(1)} ⊗ a^{(2)}.

There is a dual notion of an H-module algebra.
Example 6. Let H be a Hopf algebra. Then H is itself an H-comodule coalgebra for the
adjoint coaction t(h) = Σ h_{(2)} ⊗ S(h_{(1)})h_{(3)}.

Version: 4 Owner: mhale Author(s): mhale

269.9 module algebra

Let H be a bialgebra. A left H-module algebra is a unital algebra A which is a left
H-module satisfying

h . (ab) = Σ (h_{(1)} . a)(h_{(2)} . b), h . 1_A = ε(h)1_A, (269.9.1)
for all h ∈ H and a, b ∈ A.

There is a dual notion of an H-comodule coalgebra.
Example 7. Let H be a Hopf algebra. Then H is itself an H-module algebra for the adjoint action
g . h = Σ g_{(1)} h S(g_{(2)}).

Version: 4 Owner: mhale Author(s): mhale

269.10 module coalgebra

Let H be a bialgebra. A left H-module coalgebra is a coalgebra A which is a left H-module
satisfying

∆(h . a) = Σ (h_{(1)} . a_{(1)}) ⊗ (h_{(2)} . a_{(2)}), ε(h . a) = ε(h)ε(a), (269.10.1)
for all h ∈ H and a ∈ A.

There is a dual notion of an H-comodule algebra.
Example 8. Let H be a bialgebra. Then H is itself an H-module coalgebra for the left regular
action g . h = gh.

Version: 5 Owner: mhale Author(s): mhale

Chapter 270

16W50 – Graded rings and modules

270.1 graded algebra

An algebra A is graded if it is a graded module and satisfies

A^p · A^q ⊆ A^{p+q}

Examples of graded algebras include the polynomial ring k[X] being an N-graded k-algebra,
and the exterior algebra.

Version: 1 Owner: dublisk Author(s): dublisk

270.2 graded module

If R = R_0 ⊕ R_1 ⊕ · · · is a graded ring, then a graded module over R is a module M of the
form M = ⊕_{i=−∞}^{∞} M_i that satisfies R_i M_j ⊆ M_{i+j} for all i, j.

Version: 4 Owner: KimJ Author(s): KimJ

270.3 supercommutative

Let R be a Z_2-graded ring. Then R is supercommutative if for any homogeneous elements a
and b ∈ R:

ab = (−1)^{deg a deg b} ba.

That is, even homogeneous elements are in the center of the ring, and odd homogeneous
elements anti-commute.

Common examples of supercommutative rings are the exterior algebra of a module over a
commutative ring (in particular, a vector space) and the cohomology ring of a topological space
(both with the standard grading by degree reduced mod 2).

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 271

16W55 – “Super” (or “skew”)
structure

271.1 super tensor product

If A and B are Z-graded algebras, we define the super tensor product A ⊗_su B to be the
ordinary tensor product as graded modules, but with multiplication, called the super product,
defined by
(a ⊗ b)(a′ ⊗ b′) = (−1)^{(deg b)(deg a′)} aa′ ⊗ bb′
where a, a′, b, b′ are homogeneous. The super tensor product of A and B is itself a graded algebra,
as we grade the super tensor product of A and B as follows:

(A ⊗_su B)^n = ⊕_{p,q : p+q=n} A^p ⊗ B^q

Version: 4 Owner: dublisk Author(s): dublisk

271.2 superalgebra

A graded algebra A is said to be a superalgebra if it has a Z/2Z grading.

Version: 2 Owner: dublisk Author(s): dublisk

271.3 supernumber

Let Λ_N be the Grassmann algebra generated by θ_i, i = 1 . . . N, such that θ_i θ_j = −θ_j θ_i and
(θ_i)^2 = 0. Denote by Λ_∞ the case of an infinite number of generators θ_i. A
supernumber is an element of Λ_N or Λ_∞.

Any supernumber z can be expressed uniquely in the form

z = z_0 + z_i θ_i + (1/2) z_{ij} θ_i θ_j + . . . + (1/n!) z_{i_1...i_n} θ_{i_1} . . . θ_{i_n} + . . . ,

where the coefficients z_{i_1...i_n} ∈ C are antisymmetric in their indices. The
body of z is defined as z_B = z_0, and its
soul is defined as z_S = z − z_B. If z_B ≠ 0 then z has an inverse given by

z^{-1} = (1/z_B) Σ_{k=0}^{∞} (−z_S/z_B)^k.

A supernumber can be decomposed into the even and odd parts

z_even = z_0 + (1/2) z_{ij} θ_i θ_j + . . . + (1/(2n)!) z_{i_1...i_{2n}} θ_{i_1} . . . θ_{i_{2n}} + . . . ,

z_odd = z_i θ_i + (1/6) z_{ijk} θ_i θ_j θ_k + . . . + (1/(2n+1)!) z_{i_1...i_{2n+1}} θ_{i_1} . . . θ_{i_{2n+1}} + . . . .

Purely even supernumbers are called c-numbers, and odd supernumbers are called
a-numbers. The superalgebra Λ_N thus has a decomposition Λ_N = C_c ⊕ C_a,
where C_c is the space of c-numbers, and C_a is the space of a-numbers.

Supernumbers are the generalisation of complex numbers to a commutative superalgebra of
commuting and anticommuting “numbers”. They are primarily used in the description of
fermionic fields in quantum field theory.

Version: 5 Owner: mhale Author(s): mhale

Chapter 272

16W99 – Miscellaneous

272.1 Hamiltonian quaternions

Definition of Q

We define a unital associative algebra Q over R, of dimension 4, by the basis {1, i, j, k} and
the multiplication table

1 i j k
i −1 k −j
j −k −1 i
k j −i −1

(where the element in row x and column y is xy, not yx). Thus an arbitrary element of Q
is of the form
a1 + bi + cj + dk, a, b, c, d ∈ R
(sometimes denoted by ⟨a, b, c, d⟩ or by a + ⟨b, c, d⟩) and the product of two elements ⟨a, b, c, d⟩
and ⟨α, β, γ, δ⟩ is ⟨w, x, y, z⟩ where

w = aα − bβ − cγ − dδ
x = aβ + bα + cδ − dγ
y = aγ − bδ + cα + dβ
z = aδ + bγ − cβ + dα
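
The product formula can be sketched directly in code, with a quaternion stored as a 4-tuple (a, b, c, d). The function name qmul is hypothetical, and the last terms of y and z are taken to be dβ and dα, which is what the multiplication table requires.

```python
def qmul(p, q):
    """Product of quaternions p = (a, b, c, d) and q = (al, be, ga, de)."""
    a, b, c, d = p
    al, be, ga, de = q
    return (a * al - b * be - c * ga - d * de,
            a * be + b * al + c * de - d * ga,
            a * ga - b * de + c * al + d * be,
            a * de + b * ga - c * be + d * al)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j), qmul(j, i), qmul(k, i), qmul(i, i))
```

The printed products agree with the rows of the multiplication table: ij = k, ji = −k, ki = j and ii = −1.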

The elements of Q are known as Hamiltonian quaternions.

Clearly the subspaces of Q generated by {1} and by {1, i} are subalgebras isomorphic to R
and C respectively. R is customarily identified with the corresponding subalgebra of Q. (We
shall see in a moment that there are other and less obvious embeddings of C in Q.) The
real numbers commute with all the elements of Q, and we have

λ · ⟨a, b, c, d⟩ = ⟨λa, λb, λc, λd⟩

for λ ∈ R and ⟨a, b, c, d⟩ ∈ Q.

norm, conjugate, and inverse of a quaternion

Like the complex numbers (C), the quaternions have a natural involution called the quaternion
conjugate. If q = a1 + bi + cj + dk, then the quaternion conjugate of q, denoted q̄, is
simply q̄ = a1 − bi − cj − dk.

One can readily verify that if q = a1 + bi + cj + dk, then q q̄ = (a^2 + b^2 + c^2 + d^2)1. (See
Euler four-square identity.) This product is used to form a norm ‖·‖ on the algebra (or the
ring) Q: we define ‖q‖ = √s where q q̄ = s1.

If v, w ∈ Q and λ ∈ R, then

1. ‖v‖ ≥ 0 with equality only if v = ⟨0, 0, 0, 0⟩ = 0

2. ‖λv‖ = |λ|‖v‖

3. ‖v + w‖ ≤ ‖v‖ + ‖w‖

4. ‖v · w‖ = ‖v‖ · ‖w‖

which means that Q qualifies as a normed algebra when we give it the norm ‖·‖.

Because the norm of any nonzero quaternion q is real and nonzero, we have
\[ \frac{q\bar{q}}{\|q\|^2} = \frac{\bar{q}q}{\|q\|^2} = \langle 1, 0, 0, 0 \rangle \]
which shows that any nonzero quaternion has an inverse:
\[ q^{-1} = \frac{\bar{q}}{\|q\|^2}. \]
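As a small illustration of the conjugate, norm and inverse, here is a sketch using exact rational arithmetic. The 4-tuple encoding and the names `qmul`, `conj`, `norm_sq` and `inverse` are assumptions made for this example only.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (a, b, c, d)."""
    a, b, c, d = p
    al, be, ga, de = q
    return (
        a * al - b * be - c * ga - d * de,
        a * be + b * al + c * de - d * ga,
        a * ga - b * de + c * al + d * be,
        a * de + b * ga - c * be + d * al,
    )


def conj(q):
    """Quaternion conjugate: negate the i, j, k coefficients."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm_sq(q):
    """Squared norm a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in q)


def inverse(q):
    """Inverse of a nonzero quaternion: conjugate divided by squared norm."""
    n2 = norm_sq(q)
    if n2 == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(Fraction(t, n2) for t in conj(q))


q, r = (1, 2, 3, 4), (2, -1, 0, 5)
print(qmul(q, inverse(q)) == (1, 0, 0, 0))                # q times its inverse is 1
print(norm_sq(qmul(q, r)) == norm_sq(q) * norm_sq(r))     # the norm is multiplicative
```

Working with the squared norm avoids square roots, so both checks are exact for integer or rational inputs.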

Other embeddings of C into Q

One can use any non-zero q ∈ Q to define an embedding of C into Q. If n : C → Q is the natural
embedding of C into Q, then the map

z ↦ q n(z) q⁻¹

is also an embedding of C into Q. Because Q is an associative algebra, it is obvious that:

(q n(a) q⁻¹)(q n(b) q⁻¹) = q (n(a) n(b)) q⁻¹

and with the distributive laws, it is easy to check that

(q n(a) q⁻¹) + (q n(b) q⁻¹) = q (n(a) + n(b)) q⁻¹

Rotations in 3-space

Let us write
U = {q ∈ Q : ||q|| = 1}
With multiplication, U is a group. Let us briefly sketch the relation between U and the
group SO(3) of rotations (about the origin) in 3-space.

An arbitrary element q of U can be expressed as cos(θ/2) + sin(θ/2)(ai + bj + ck), for some real
numbers θ, a, b, c such that a² + b² + c² = 1. Identify R³ with the pure quaternions xi + yj + zk.
The map v ↦ qvq⁻¹ sends pure quaternions to pure quaternions and preserves the norm, and thus
gives rise to a permutation of the real sphere. It turns out that that permutation is a rotation. Its axis
is the line through (0, 0, 0) and (a, b, c), and the angle through which it rotates the sphere
is θ. If rotations F and G correspond to quaternions q and r respectively, then clearly the
permutation v ↦ (qr)v(qr)⁻¹ corresponds to the composite rotation F ◦ G. Thus this mapping of
U onto SO(3) is a group homomorphism. Its kernel is the subset {1, −1} of U, and thus U
is a double cover of SO(3). The kernel has a geometric interpretation as well: two
unit vectors in opposite directions determine the same axis of rotation, and a rotation through
2π − θ about (−a, −b, −c) is the same as a rotation through θ about (a, b, c).
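The correspondence can be tried numerically. This sketch assumes that the rotation attached to a unit quaternion q acts on a vector v, viewed as the pure quaternion with coefficients v, by conjugation v ↦ qvq⁻¹; the function names are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (a, b, c, d)."""
    a, b, c, d = p
    al, be, ga, de = q
    return (
        a * al - b * be - c * ga - d * de,
        a * be + b * al + c * de - d * ga,
        a * ga - b * de + c * al + d * be,
        a * de + b * ga - c * be + d * al,
    )


def unit_quaternion(axis, theta):
    """cos(theta/2) + sin(theta/2)(ai + bj + ck) for a unit axis (a, b, c)."""
    a, b, c = axis
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), s * a, s * b, s * c)


def rotate(q, v):
    """Conjugate the pure quaternion built from v by the unit quaternion q."""
    q_inv = (q[0], -q[1], -q[2], -q[3])  # for a unit quaternion the inverse is the conjugate
    w = qmul(qmul(q, (0.0, v[0], v[1], v[2])), q_inv)
    return w[1:]


q = unit_quaternion((0.0, 0.0, 1.0), math.pi / 2)  # quarter turn about the z-axis
print(rotate(q, (1.0, 0.0, 0.0)))                  # close to (0, 1, 0)
minus_q = tuple(-t for t in q)
print(rotate(minus_q, (1.0, 0.0, 0.0)))            # q and -q give the same rotation
```

The second call illustrates the kernel {1, −1}: negating q changes neither factor's effect under conjugation.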

Version: 3 Owner: mathcam Author(s): Larry Hammick, patrickwonders

Chapter 273

16Y30 – Near-rings

273.1 near-ring

A near-ring is a set N together with two binary operations, denoted + : N × N → N and
· : N × N → N, such that

1. (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c) for all a, b, c ∈ N (associativity of
both operations)

2. There exists an element 0 ∈ N such that a + 0 = 0 + a = a for all a ∈ N (additive
identity)

3. For all a ∈ N, there exists b ∈ N such that a + b = b + a = 0 (additive inverse)

4. (a + b) · c = (a · c) + (b · c) for all a, b, c ∈ N (right distributive law)

Note that the axioms of a near-ring differ from those of a ring in that they do not require
addition to be commutative, and only require distributivity on one side.

Every element a in a near-ring has a unique additive inverse, denoted −a.

We say N has an identity element if there exists an element 1 ∈ N such that a · 1 = 1 · a = a
for all a ∈ N. We say N is distributive if a · (b + c) = (a · b) + (a · c) holds for all a, b, c ∈ N.
We say N is commutative if a · b = b · a for all a, b ∈ N.

A natural example of a near-ring is the following. Let (G, +) be a group (not necessarily
abelian), and let M be the set of all functions from G to G. For two functions f and g in M
define f + g ∈ M by (f + g)(x) = f (x) + g(x) for all x ∈ G. Then (M, +, ◦) is a near-ring
with identity, where ◦ denotes composition of functions.
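For a concrete instance of this example, one can enumerate all functions on a small group and test the distributive laws by brute force. The sketch below assumes G = Z/3Z (which happens to be abelian) and encodes a function f as the tuple (f(0), f(1), f(2)); these choices and the names used are illustrative.

```python
from itertools import product

N = 3                                    # G = Z/3Z, an assumption made for this sketch
M = list(product(range(N), repeat=N))    # all 27 functions G -> G, f stored as (f(0), f(1), f(2))


def add(f, g):
    """Pointwise sum (f + g)(x) = f(x) + g(x) in G."""
    return tuple((f[x] + g[x]) % N for x in range(N))


def compose(f, g):
    """Composition (f o g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in range(N))


IDENTITY = tuple(range(N))

right_distributive = all(
    compose(add(f, g), h) == add(compose(f, h), compose(g, h))
    for f in M for g in M for h in M
)
left_distributive = all(
    compose(f, add(g, h)) == add(compose(f, g), compose(f, h))
    for f in M for g in M for h in M
)
print(right_distributive)   # the right distributive law holds
print(left_distributive)    # the left distributive law fails, e.g. for a constant f
```

A constant function f ≡ 1 already breaks the left law: f ◦ (g + h) is constantly 1 while f ◦ g + f ◦ h is constantly 2.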

Version: 13 Owner: yark Author(s): yark, juergen

Chapter 274

17A01 – General theory

274.1 commutator bracket

Let A be an associative algebra over a field K. For a, b ∈ A, the element of A defined by
[a, b] = ab − ba
is called the commutator of a and b. The corresponding bilinear operation
[−, −] : A × A → A
is called the commutator bracket.

The commutator bracket is bilinear, skew-symmetric, and also satisfies the Jacobi identity.
To wit, for a, b, c ∈ A we have
[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.
The proof of this assertion is straightforward. Each of the brackets in the left-hand side
expands to 4 terms, and then everything cancels.
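For readers who want to see the cancellation, the expansion can be written out as follows (a worked computation using only associativity of the product in A):

```latex
\begin{align*}
[a,[b,c]] &= abc - acb - bca + cba,\\
[b,[c,a]] &= bca - bac - cab + acb,\\
[c,[a,b]] &= cab - cba - abc + bac.
\end{align*}
```

Each of the six monomials abc, acb, bac, bca, cab, cba appears exactly once with a plus sign and once with a minus sign, so the three lines sum to zero.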

In categorical terms, what we have here is a functor from the category of associative algebras
to the category of Lie algebras over a fixed field. The action of this functor is to turn an
associative algebra A into a Lie algebra that has the same underlying vector space as A, but
whose multiplication operation is given by the commutator bracket. It must be noted that
this functor is right-adjoint to the universal enveloping algebra functor.

Examples

• Let V be a vector space. Composition endows the vector space of endomorphisms
End V with the structure of an associative algebra. However, we could also regard
End V as a Lie algebra relative to the commutator bracket:
[X, Y ] = XY − Y X, X, Y ∈ End V.

• The algebra of differential operators has some interesting properties when viewed as a
Lie algebra. Even though the composition of differential
operators is a non-commutative operation, it is commutative when restricted to the
highest order terms of the involved operators. Thus, if X, Y are differential operators
of order p and q, respectively, the compositions XY and Y X have order p + q. Their
highest order terms coincide, and hence the commutator [X, Y ] has order at most p + q − 1.

• In light of the preceding comments, it is evident that the vector space of first-order
differential operators is closed with respect to the commutator bracket. Specializing
even further, we remark that a vector field is just a homogeneous first-order differential
operator, and that the commutator bracket for vector fields, when viewed as first-order
operators, coincides with the usual, geometrically motivated vector field bracket.

Version: 4 Owner: rmilson Author(s): rmilson

Chapter 275

17B05 – Structure theory

275.1 Killing form

Let g be a finite dimensional Lie algebra over a field k, and adX : g → g be the adjoint action,
adX Y = [X, Y ]. Then the Killing form on g is a bilinear map
Bg : g × g → k
given by
Bg (X, Y ) = tr(adX ◦ adY ).

The Killing form is invariant and symmetric (since trace is symmetric).
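To make the definition concrete, the following sketch evaluates the trace of a product of two adjoint maps for the Lie algebra of traceless 2 × 2 matrices. The basis E, H, F, the coordinate convention, and all function names are assumptions chosen for this illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def bracket(A, B):
    """Commutator AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(len(A))] for i in range(len(A))]


E = [[0, 1], [0, 0]]
H = [[1, 0], [0, -1]]
F = [[0, 0], [1, 0]]
BASIS = [E, H, F]


def coords(m):
    """Coordinates of a traceless 2x2 matrix in the basis (E, H, F)."""
    return [m[0][1], m[0][0], m[1][0]]


def ad(X):
    """Matrix of ad_X in the basis (E, H, F); column j holds the coordinates of [X, BASIS[j]]."""
    cols = [coords(bracket(X, B)) for B in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]


def killing(X, Y):
    """Trace of ad_X composed with ad_Y."""
    P = matmul(ad(X), ad(Y))
    return sum(P[i][i] for i in range(3))


gram = [[killing(X, Y) for Y in BASIS] for X in BASIS]
print(gram)
```

The resulting Gram matrix is symmetric, and its only nonzero entries pair E with F and H with itself.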

Version: 4 Owner: bwebste Author(s): bwebste

275.2 Levi’s theorem

Let g be a complex Lie algebra, r its radical. Then the extension 0 → r → g → g/r → 0 is
split, i.e., there exists a subalgebra h of g mapping isomorphically to g/r under the natural
projection.

Version: 2 Owner: bwebste Author(s): bwebste

Let g be a Lie algebra. Then the nilradical n of g is defined to be the intersection of
the kernels of all the irreducible representations of g. Equivalently, n = [g, g] ∩ rad g, the
intersection of the derived ideal and radical of g.

Version: 1 Owner: bwebste Author(s): bwebste

Let g be a finite dimensional Lie algebra. Since the sum of any two solvable ideals of g is in turn solvable, there
is a unique maximal solvable ideal of g. This ideal is called the radical of g.
Note that g/rad g has no non-zero solvable ideals, and is thus semi-simple. Thus, every Lie algebra is
an extension of a semi-simple algebra by a solvable one.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 276

17B10 – Representations, algebraic
theory (weights)

Every finite dimensional Lie algebra has a faithful finite dimensional representation. In other
words, every finite dimensional Lie algebra is a matrix algebra.

This result is not true for Lie groups.

Version: 2 Owner: bwebste Author(s): bwebste

276.2 Lie algebra representation

A representation of a Lie algebra g is a Lie algebra homomorphism

ρ : g → End V,

where End V is the commutator Lie algebra of some vector space V . In other words, ρ is a
linear mapping that satisfies

ρ([a, b]) = ρ(a)ρ(b) − ρ(b)ρ(a), a, b ∈ g

Alternatively, one calls V a g-module, and calls ρ(a), a ∈ g the action of a on V .

We call the representation faithful if ρ is injective.

An invariant subspace or sub-module W ⊂ V is a subspace of V satisfying ρ(a)(W ) ⊂ W for
all a ∈ g. A representation is called irreducible or simple if its only invariant subspaces are
{0} and the whole representation.

The dimension of V is called the dimension of the representation. If V is infinite-dimensional,
then one speaks of an infinite-dimensional representation.

Given a representation or pair of representations, there are a couple of operations which will
produce other representations:

First there is direct sum. If ρ : g → End(V ) and σ : g → End(W ) are representations,
then V ⊕ W has the obvious Lie algebra action, by the embedding End(V ) × End(W ) ↪
End(V ⊕ W ).

Version: 9 Owner: bwebste Author(s): bwebste, rmilson

Let g be a Lie algebra. For every a ∈ g we define the adjoint endomorphism, a.k.a. the
adjoint action, ad(a) : g → g to be the linear transformation with action

ad(a) : b ↦ [a, b], b ∈ g.

The linear mapping ad : g → End(g) with action

a ↦ ad(a), a ∈ g

is called the adjoint representation of g. The fact that ad defines a representation is a
straight-forward consequence of the Jacobi identity axiom. Indeed, let a, b ∈ g be given. We
wish to show that

ad([a, b]) = [ad(a), ad(b)],

where the bracket on the left is the g multiplication structure, and the bracket on the right
is the commutator bracket. For all c ∈ g the left hand side maps c to

[[a, b], c],

while the right hand side maps c to

[a, [b, c]] − [b, [a, c]].

Taking skew-symmetry of the bracket as a given, the equality of these two expressions is
logically equivalent to the Jacobi identity:

[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.

Version: 2 Owner: rmilson Author(s): rmilson

276.4 examples of non-matrix Lie groups

While most well-known Lie groups are matrix groups, there do in fact exist Lie groups which
are not matrix groups. That is, they have no faithful finite dimensional representations.

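For example, let H be the real Heisenberg group

\[ H = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; a, b, c \in \mathbb{R} \right\}, \]

and Γ the discrete subgroup

\[ \Gamma = \left\{ \begin{pmatrix} 1 & 0 & n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\middle|\; n \in \mathbb{Z} \right\}. \]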

The subgroup Γ is central, and thus normal. The Lie group H/Γ has no faithful finite
dimensional representations over R or C.

Another example is the universal cover of SL2 R. SL2 R is homotopy equivalent to a circle,
and thus π1 (SL2 R) ≅ Z, so its universal cover is infinite-sheeted. Any finite dimensional real or
complex representation of the universal cover factors through the projection map to SL2 R,
and so cannot be faithful.

Version: 3 Owner: bwebste Author(s): bwebste

276.5 isotropy representation

Let g be a Lie algebra, and h ⊂ g a subalgebra. The isotropy representation of h relative to
g is the naturally defined action of h on the quotient vector space g/h.

Here is a synopsis of the technical details. As is customary, we will use

b + h, b ∈ g

to denote the coset elements of g/h. Let a ∈ h be given. Since h is invariant with respect to
adg(a), the adjoint action factors through the quotient to give a well defined endomorphism
of g/h. The action is given by

b + h 7→ [a, b] + h, b ∈ g.

This is the action alluded to in the first paragraph.

Version: 3 Owner: rmilson Author(s): rmilson

Chapter 277

17B15 – Representations, analytic
theory

277.1 invariant form (Lie algebras)

Let V be a representation of a Lie algebra g over a field k. Then a bilinear form B : V ×V →
k is invariant if
B(Xv, w) + B(v, Xw) = 0.
for all X ∈ g, v, w ∈ V . This criterion seems a little odd, but in the context of Lie algebras,
it makes sense. For example, the map B̃ : V → V ∗ given by v 7→ B(·, v) is equivariant if and
only if B is an invariant form.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 278

17B20 – Simple, semisimple,
reductive (super)algebras (roots)

278.1 Borel subalgebra

Let g be a semi-simple Lie algebra, h a Cartan subalgebra, R the associated root system and
R+ ⊂ R a set of positive roots. We have a root decomposition into the Cartan subalgebra
and the root spaces gα

\[ \mathfrak{g} = \mathfrak{h} \oplus \Bigl( \bigoplus_{\alpha \in R} \mathfrak{g}_\alpha \Bigr). \]

Now let b be the direct sum of the Cartan subalgebra and the positive root spaces.

\[ \mathfrak{b} = \mathfrak{h} \oplus \Bigl( \bigoplus_{\beta \in R^+} \mathfrak{g}_\beta \Bigr). \]

This is called a Borel subalgebra.

Version: 2 Owner: bwebste Author(s): bwebste

278.2 Borel subgroup

Let G be a complex semi-simple Lie group. Then any maximal connected solvable subgroup B ≤ G
is called a Borel subgroup. All Borel subgroups of a given group are conjugate. Any Borel
subgroup is equal to its own normalizer, and contains a Cartan subgroup.
The intersection of B with a maximal compact subgroup K of G is a maximal torus of K.

If G = SLn C, then the standard Borel subgroup is the set of upper triangular matrices in G.

Version: 2 Owner: bwebste Author(s): bwebste

278.3 Cartan matrix

Let R ⊂ E be a reduced root system, with E a euclidean vector space, with inner product
(·, ·), and let Π = {α1 , · · · , αn } be a base of this root system. Then the Cartan matrix of
the root system is the matrix
\[ C_{i,j} = \frac{2(\alpha_i, \alpha_j)}{(\alpha_i, \alpha_i)}. \]
The Cartan matrix uniquely determines the root system, and is unique up to simultaneous
permutation of the rows and columns. It is also the basis change matrix from the basis of
fundamental weights to the basis of simple roots in E.
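As an illustration, the entries 2(αi, αj)/(αi, αi) can be computed directly from coordinates of the simple roots. The sketch below assumes particular coordinates for bases of A2 and B2 and uses the standard dot product; the names `cartan_matrix`, `A2` and `B2` are illustrative.

```python
from fractions import Fraction


def dot(u, v):
    """Standard inner product of two coordinate vectors."""
    return sum(x * y for x, y in zip(u, v))


def cartan_matrix(simple_roots):
    """Matrix with (i, j) entry 2(a_i, a_j)/(a_i, a_i) for the given base."""
    rows = []
    for a in simple_roots:
        row = []
        for b in simple_roots:
            entry = Fraction(2 * dot(a, b), dot(a, a))
            if entry.denominator != 1:
                raise ValueError("not a base of a root system: non-integer entry")
            row.append(int(entry))
        rows.append(row)
    return rows


A2 = [(1, -1, 0), (0, 1, -1)]   # assumed coordinates for a base of A2
B2 = [(1, -1), (0, 1)]          # assumed coordinates for a base of B2 (long root first)

print(cartan_matrix(A2))
print(cartan_matrix(B2))
```

The A2 matrix is symmetric because both simple roots have the same length; the B2 matrix is not, which reflects the two root lengths.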

Version: 1 Owner: bwebste Author(s): bwebste

278.4 Cartan subalgebra

Let g be a finite dimensional Lie algebra. Then a Cartan subalgebra is a nilpotent subalgebra h
of g which is self-normalizing, that is, if [g, h] ∈ h for all h ∈ h, then g ∈ h as well. If g is
semi-simple, any Cartan subalgebra h is abelian. All Cartan subalgebras of a complex Lie algebra
are conjugate by the adjoint action of any connected Lie group with algebra g.

Version: 3 Owner: bwebste Author(s): bwebste

278.5 Cartan’s criterion

A Lie algebra g is semi-simple if and only if its Killing form Bg is nondegenerate.

Version: 2 Owner: bwebste Author(s): bwebste

278.6 Casimir operator

Let g be a semisimple Lie algebra, and let (·, ·) denote the Killing form. If {g_i} is a basis of
g, then there is a dual basis {g^i} with respect to the Killing form, i.e., (g_i, g^j) = δ_ij. Consider
the element Ω = Σ_i g_i g^i of the universal enveloping algebra of g. This element, called the
Casimir operator, is central in the enveloping algebra, and thus commutes with the g action
on any representation.
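A worked example may help. Assume g = sl2 with the standard basis e, h, f satisfying [h, e] = 2e, [h, f] = −2f, [e, f] = h, for which the only nonzero values of the Killing form on basis elements are (e, f) = (f, e) = 4 and (h, h) = 8. Under these assumptions the dual basis and the Casimir operator are:

```latex
\begin{align*}
e^{*} &= \tfrac{1}{4} f, \qquad h^{*} = \tfrac{1}{8} h, \qquad f^{*} = \tfrac{1}{4} e,\\
\Omega &= e\,e^{*} + h\,h^{*} + f\,f^{*}
        = \tfrac{1}{4}(ef + fe) + \tfrac{1}{8} h^{2}
        = \tfrac{1}{8}\bigl(h^{2} + 2h + 4fe\bigr).
\end{align*}
```

The last form uses ef = fe + h. On a highest weight vector v with hv = nv and ev = 0 it gives Ωv = ((n² + 2n)/8) v, so in this normalization Ω acts as the identity on the adjoint representation (n = 2).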

Version: 2 Owner: bwebste Author(s): bwebste

278.7 Dynkin diagram

Dynkin diagrams are a combinatorial way of representing the information in a root system.
Their primary advantage is that they are easier to write down, remember, and analyze than
explicit representations of a root system. They are an important tool in the classification of
simple Lie algebras.

Given a reduced root system R ⊂ E, with E an inner-product space, choose a base of
simple roots Π (or equivalently, a set of positive roots R+ ). The Dynkin diagram associated
to R is a graph whose vertices are the elements of Π. If πi and πj are distinct simple roots, we
add
\[ m_{ij} = \frac{4(\pi_i, \pi_j)^2}{(\pi_i, \pi_i)(\pi_j, \pi_j)} \]
lines between them. This number is obviously non-negative, and an integer
since it is the product of 2 quantities that the axioms of a root system require to be integers.
By the Cauchy-Schwarz inequality, and the fact that distinct simple roots are never parallel or
anti-parallel (they are linearly independent), mij ∈ {0, 1, 2, 3}. Thus Dynkin diagrams
are finite graphs, with single, double or triple edges. In fact, the criteria are much stronger
than this: if the multiple edges are counted as single edges, all connected Dynkin diagrams are trees,
and have at most one multiple edge. In fact, all connected Dynkin diagrams fall into 4 infinite families,
and 5 exceptional cases, in exact parallel to the classification of simple Lie algebras.
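The number of lines between two vertices can be computed from coordinates of the simple roots. This sketch assumes the edge count 4(πi, πj)²/((πi, πi)(πj, πj)) and particular coordinates for bases of A2, B2 and G2; the function name `edge_count` is illustrative.

```python
def dot(u, v):
    """Standard inner product of two coordinate vectors."""
    return sum(x * y for x, y in zip(u, v))


def edge_count(a, b):
    """Number of lines joining the vertices of two distinct simple roots a and b."""
    numerator = 4 * dot(a, b) ** 2
    denominator = dot(a, a) * dot(b, b)
    if numerator % denominator != 0:
        raise ValueError("inputs do not satisfy the integrality axiom")
    return numerator // denominator


# Assumed coordinates for pairs of simple roots.
A2 = ((1, -1, 0), (0, 1, -1))
B2 = ((1, -1), (0, 1))
G2 = ((1, -1, 0), (-2, 1, 1))
A1xA1 = ((1, 0), (0, 1))        # two orthogonal roots: no edge

for name, (a, b) in [("A2", A2), ("B2", B2), ("G2", G2), ("A1 x A1", A1xA1)]:
    print(name, edge_count(a, b))
```

The four cases produce a single, a double, a triple edge and no edge respectively.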


Version: 1 Owner: bwebste Author(s): bwebste

278.8 Verma module

Let g be a semi-simple Lie algebra, h a Cartan subalgebra, and b a Borel subalgebra containing h. Let
Fλ for a weight λ ∈ h∗ be the 1-dimensional b-module on which h acts by multiplication
by λ, and the positive root spaces act trivially. Now, the Verma module Mλ of the weight λ
is the g-module

Mλ = Fλ ⊗_{U(b)} U(g).

This is an infinite dimensional representation, and it has a very important property: If V
is any representation with highest weight λ, there is a surjective homomorphism Mλ → V .
That is, all representations with highest weight λ are quotients of Mλ . Also, Mλ has a unique
maximal submodule, so there is a unique irreducible representation with highest weight λ.

Version: 1 Owner: bwebste Author(s): bwebste

278.9 Weyl chamber

If R ⊂ E is a root system, with E a euclidean vector space, and R+ is a set of positive roots,
then the positive Weyl chamber is the set

C = {e ∈ E | (e, α) ≥ 0 ∀α ∈ R+ }.

The interior of C is a fundamental domain for the action of the Weyl group on E. The image
w(C) of C under any element w of the Weyl group is called a Weyl chamber. The Weyl
group W acts simply transitively on the set of Weyl chambers.

A weight which lies inside the positive Weyl chamber is called dominant.

Version: 2 Owner: bwebste Author(s): bwebste

278.10 Weyl group

The Weyl group WR of a root system R ⊂ E, where E is a euclidean vector space, is the
subgroup of GL(E) generated by reflection in the hyperplanes perpendicular to the roots.
The map of reflection in a root α is given by
\[ r_\alpha(v) = v - 2\,\frac{(v, \alpha)}{(\alpha, \alpha)}\,\alpha. \]

The Weyl group is generated by reflections in the simple roots for any choice of a set of
positive roots. There is a well-defined length function ℓ : WR → Z, where ℓ(w) is the
minimal number of reflections in simple roots whose product is w. This is also the
number of positive roots that w takes to negative roots.
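The statement about the length function can be checked by brute force on a small example. The sketch below assumes the root system B2 with the coordinates listed in `ROOTS`, the base `SIMPLE`, and positivity measured against the vector (2, 1); it encodes a Weyl group element by the images of the roots. All of these names and choices are illustrative.

```python
from collections import deque

# Assumed model of B2: the eight roots, a base, and a vector defining positivity.
ROOTS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
SIMPLE = [(1, -1), (0, 1)]
POSITIVITY_VECTOR = (2, 1)


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def reflect(alpha, v):
    """Reflection of the root v in the hyperplane orthogonal to the root alpha."""
    n = 2 * dot(v, alpha) // dot(alpha, alpha)   # exact: this quotient is an integer for roots
    return tuple(x - n * a for x, a in zip(v, alpha))


def is_positive(root):
    return dot(root, POSITIVITY_VECTOR) > 0


def weyl_lengths():
    """Breadth-first search over products of simple reflections.

    An element w is stored as the tuple of images w(r) for r in ROOTS;
    the search depth at which w first appears is its length.
    """
    identity = tuple(ROOTS)
    lengths = {identity: 0}
    queue = deque([identity])
    while queue:
        w = queue.popleft()
        for alpha in SIMPLE:
            sw = tuple(reflect(alpha, image) for image in w)  # simple reflection applied after w
            if sw not in lengths:
                lengths[sw] = lengths[w] + 1
                queue.append(sw)
    return lengths


def inversions(w):
    """Number of positive roots that w sends to negative roots."""
    return sum(1 for root, image in zip(ROOTS, w)
               if is_positive(root) and not is_positive(image))


LENGTHS = weyl_lengths()
print(len(LENGTHS))
print(all(inversions(w) == n for w, n in LENGTHS.items()))
```

The search composes on the left with simple reflections; since an element and its inverse have the same length, this yields the same length function as composing on the right.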

Version: 1 Owner: bwebste Author(s): bwebste

278.11 Weyl’s theorem

Let g be a finite dimensional semi-simple Lie algebra. Then any finite dimensional representation
of g is completely reducible.

Version: 1 Owner: bwebste Author(s): bwebste

278.12 classification of finite-dimensional representations of semi-simple Lie algebras

If g is a semi-simple Lie algebra, then we say that an irreducible representation V has highest
weight λ, if there is a vector v ∈ Vλ , the weight space of λ, such that Xv = 0 for X in any
positive root space, and v is called a highest vector, or vector of highest weight.

There is a unique (up to isomorphism) irreducible finite dimensional representation of g with
highest weight λ for any dominant weight λ ∈ ΛW , where ΛW is the weight lattice of g, and
every irreducible representation of g is of this type.

Version: 1 Owner: bwebste Author(s): bwebste

278.13 cohomology of semi-simple Lie algebras

There are some important facts that make the cohomology of semi-simple Lie algebras easier
to deal with than general Lie algebra cohomology.

In particular, there are a number of vanishing theorems. First of all, let g be a finite-dimensional,
semi-simple Lie algebra over C.

Theorem. Let M be a non-trivial irreducible finite dimensional representation of g. Then H^n(g, M) = 0 for all n.

Whitehead’s lemmata. Let M be any finite dimensional representation of g, then H^1(g, M) = H^2(g, M) = 0.

Whitehead’s lemmata lead to two very important results. From the vanishing of H^1, we
can derive Weyl’s theorem, the fact that representations of semi-simple Lie algebras are
completely reducible, since extensions of M by N are classified by H^1(g, Hom(M, N)). And
from the vanishing of H^2, we obtain Levi’s theorem, which states that every Lie algebra
is a split extension of a semi-simple algebra by a solvable algebra, since H^2(g, M) classifies
extensions of g by M with a specified action of g on M.

Version: 2 Owner: bwebste Author(s): bwebste

278.14 nilpotent cone

Let g be a finite dimensional semisimple Lie algebra. Then the nilpotent cone N of g is the set
of elements which act nilpotently on all representations of g. This is an irreducible subvariety
of g (considered as a k-vector space), which is invariant under the adjoint action of G on g
(here G is the adjoint group associated to g).

Version: 3 Owner: bwebste Author(s): bwebste

278.15 parabolic subgroup

Let G be a complex semi-simple Lie group. Then any subgroup P of G containing a Borel subgroup
B is called parabolic. Parabolics are classified in the following manner. Let g be the
Lie algebra of G, b the Lie algebra of B, h a Cartan subalgebra contained in b, R the
set of roots corresponding to this choice of Cartan, R+ the set of positive roots whose
root spaces are contained in b, and p the Lie algebra of P . Then there exists a unique
subset ΠP of Π, the base of simple roots associated to this choice of positive roots, such that
b together with the root spaces g−α , α ∈ ΠP , generates p. In other words, parabolics containing a
fixed Borel subgroup are classified by subsets of the vertices of the Dynkin diagram, with the empty
set corresponding to the Borel, and the whole graph corresponding to the group G.


Version: 1 Owner: bwebste Author(s): bwebste

278.16 pictures of Dynkin diagrams

Here is a complete list of connected Dynkin diagrams. In general if the name of a diagram
has n as a subscript then there are n dots in the diagram. There are four infinite series that
correspond to classical complex (that is over C) simple Lie algebras. No pun intended.

• An , for n ≥ 1 represents the simple complex Lie algebra sl_{n+1} :

[Figure: Dynkin diagrams A1, A2, A3, An]

• Bn , for n ≥ 1 represents the simple complex Lie algebra so_{2n+1} :

[Figure: Dynkin diagrams B1, B2, B3, Bn]

• Cn , for n ≥ 1 represents the simple complex Lie algebra sp_{2n} :

[Figure: Dynkin diagrams C1, C2, C3, Cn]


• Dn , for n ≥ 3 represents the simple complex Lie algebra so_{2n} :

[Figure: Dynkin diagrams D3, D4, D5, Dn]

And then there are the exceptional cases that come in finite families. The corresponding
Lie algebras are usually called by the name of the diagram.

• There is the E series that has three members: E6 which represents a 78–dimensional Lie
algebra, E7 which represents a 133–dimensional Lie algebra, and E8 which represents
a 248–dimensional Lie algebra.

[Figure: Dynkin diagrams E6, E7, E8]

• There is the F4 diagram which represents a 52–dimensional complex simple Lie algebra:

[Figure: Dynkin diagram F4]

• And finally there is G2 that represents a 14–dimensional Lie algebra.

[Figure: Dynkin diagram G2]

Notice the low dimensional coincidences:
A1 = B1 = C1
which reflects the exceptional isomorphisms
sl2 ≅ so3 ≅ sp2 .
Also
B2 ≅ C2
reflecting the isomorphism
so5 ≅ sp4 .
And,
A3 ≅ D3
reflecting
sl4 ≅ so6 .


Remark 1. Often in the literature the listing of Dynkin diagrams is arranged so that there
are no “intersections” between different families. However by allowing intersections one gets
a graphical representation of the low degree isomorphisms. In the same vein there is a
graphical representation of the isomorphism
so4 ≅ sl2 × sl2 .
Namely, if not for the requirement that the families consist of connected diagrams, one could
start the D family with

[Figure: Dynkin diagram D2]

which consists of two disjoint copies of A1 .

Version: 9 Owner: Dr Absentius Author(s): Dr Absentius

278.17 positive root

If R ⊂ E is a root system, with E a euclidean vector space, then a subset R+ ⊂ R is called a
set of positive roots if there is a vector v ∈ E such that (α, v) > 0 if α ∈ R+ , and (α, v) < 0
if α ∈ R\R+ . Roots which are not positive are called negative. Since −α is negative exactly
when α is positive, exactly half the roots must be positive.

Version: 2 Owner: bwebste Author(s): bwebste

278.18 rank

Let g be a finite dimensional Lie algebra. One can show that all Cartan subalgebras h ⊂ g
have the same dimension. The rank of g is defined to be this dimension.

Version: 5 Owner: rmilson Author(s): rmilson

278.19 root lattice

If R ⊂ E is a root system, and E a euclidean vector space, then the root lattice ΛR of R
is the subset of E generated by R as an abelian group. In fact, this group is free on the
simple roots, and is thus a full sublattice of E.

Version: 1 Owner: bwebste Author(s): bwebste

278.20 root system

Root systems are sets of vectors in a Euclidean space which are used to classify simple Lie algebras,
and to understand their representation theory, and also in the theory of reflection groups.

Axiomatically, an (abstract) root system R is a finite set of non-zero vectors in a euclidean vector space E
with inner product (·, ·), such that:

1. R spans the vector space E.
2. if α ∈ R, then reflection in the hyperplane orthogonal to α preserves R.
3. if α, β ∈ R, then 2(α, β)/(α, α) is an integer.

Axiom 3 is sometimes dropped when dealing with reflection groups, but it is necessary for
the root systems which arise in connection with Lie algebras.

Additionally, a root system is called reduced if for all α ∈ R, if kα ∈ R, then k = ±1.

We call a root system indecomposable if there is no proper non-empty subset R′ ⊂ R such that every
vector in R′ is orthogonal to every vector in R \ R′.

Root systems arise in the classification of semi-simple Lie algebras in the following manner:
If g is a semi-simple complex Lie algebra, then one can choose a nilpotent self-normalizing
subalgebra of g (alternatively, this is the commutant of an element with commutant of
minimal dimension), called a Cartan subalgebra, traditionally denoted h. Its elements act on g by
the adjoint action by diagonalizable linear maps. Since these maps all commute, they are
simultaneously diagonalizable. The simultaneous eigenspaces of this action with non-zero
eigenvalue are called root spaces, and the decomposition of g into h and the root spaces is
called a root decomposition of g. It turns out that all root spaces are one dimensional. Now,
for each root space, we have a map λ : h → C, given by [H, v] = λ(H)v for v an element of that
root space. The set R ⊂ h∗ of these λ is called the root system of the algebra g. The Killing form
restricts to a non-degenerate bilinear form on h, which in turn induces one on h∗ ; on the real
span of R it is an inner product. With respect to this inner product, the root system R is an
abstract root system, in the sense defined above.

Conversely, given any reduced abstract root system R, there is a unique (up to isomorphism)
semi-simple complex Lie algebra g such that R is its root system. Thus to classify complex
semi-simple Lie algebras, we need only classify root systems, a somewhat easier task. Really, we
only need to classify indecomposable root systems, since all other root systems are built out of
these. The Lie algebra corresponding to a root system is simple if and only if the associated
root system is indecomposable.

By convention e1 , . . . , en are orthonormal vectors, and the subscript on the name of the root
system is the dimension of the space it is contained in, also called the rank of the system,
and the indices i and j will run from 1 to n. There are four infinite series of indecomposable
root systems :

• An = {ei − ej }i≠j , where for this family only i and j run from 1 to n + 1; these vectors span
the n-dimensional subspace orthogonal to δ = e1 + · · · + en+1 . This system corresponds to sl_{n+1} C.

• Bn = {±ei ± ej }i<j ∪ {±ei }. This system corresponds to so_{2n+1} C.

• Cn = {±ei ± ej }i<j ∪ {±2ei }. This system corresponds to sp_{2n} C.

• Dn = {±ei ± ej }i<j . This system corresponds to so_{2n} C.

and there are five exceptional root systems G2 , F4 , E6 , E7 , E8 , with five corresponding excep-
tional algebras, generally denoted by the same letter in lower-case Fraktur (g2 , etc.).
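The families Bn, Cn and Dn are easy to generate and test against the axioms. The sketch below assumes the coordinate descriptions ±ei ± ej (i < j) together with ±ei for Bn and ±2ei for Cn; it checks axioms 2 and 3 by brute force and compares rank plus number of roots with the dimension of the corresponding matrix Lie algebra. Function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def roots(kind, n):
    """Roots of B_n, C_n or D_n in Z^n, under the coordinate conventions assumed above."""
    result = set()
    for i, j in combinations(range(n), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * n
            v[i], v[j] = si, sj
            result.add(tuple(v))
    axis_length = {"B": 1, "C": 2, "D": 0}[kind]   # extra roots on the coordinate axes
    if axis_length:
        for i in range(n):
            for s in (1, -1):
                v = [0] * n
                v[i] = s * axis_length
                result.add(tuple(v))
    return result


def satisfies_axioms(R):
    """Check integrality of 2(a, b)/(a, a) and invariance of R under all reflections."""
    for a in R:
        for b in R:
            c = Fraction(2 * dot(a, b), dot(a, a))
            if c.denominator != 1:
                return False
            reflected = tuple(y - int(c) * x for x, y in zip(a, b))
            if reflected not in R:
                return False
    return True


for kind, n in [("B", 3), ("C", 3), ("D", 4)]:
    R = roots(kind, n)
    print(kind, n, len(R), satisfies_axioms(R), n + len(R))
```

The last printed number is rank plus number of roots, which should agree with dim so7 = 21, dim sp6 = 21 and dim so8 = 28 respectively.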

Version: 3 Owner: bwebste Author(s): bwebste

278.21 simple and semi-simple Lie algebras

A Lie algebra is called simple if it has no proper non-zero ideals and is not abelian. A Lie algebra is
called semisimple if it has no non-zero solvable ideals.

Let k = R or C. Examples of simple algebras are sln k (n ≥ 2), the Lie algebra of the special linear group
(traceless matrices), son k (n = 3 or n ≥ 5), the Lie algebra of the special orthogonal group (skew-symmetric matrices),
and sp2n k the Lie algebra of the symplectic group. Over R, there are other simple Lie
algebras, such as sun , the Lie algebra of the special unitary group (traceless skew-Hermitian matrices).
Any semisimple Lie algebra is a direct product of simple Lie algebras.

Simple and semi-simple Lie algebras are one of the most widely studied classes of algebras for
a number of reasons. First of all, many of the most interesting Lie groups have semi-simple
Lie algebras. Secondly, their representation theory is very well understood. Finally, there is
a beautiful classification of simple Lie algebras.

Over C, there are 3 infinite series of simple Lie algebras: sln , son and sp2n , and 5 exceptional
simple Lie algebras g2 , f4 , e6 , e7 , and e8 . Over R the picture is more complicated, as several
different Lie algebras can have the same complexification (for example, sun and sln R both
have complexification sln C).

Version: 3 Owner: bwebste Author(s): bwebste

278.22 simple root

Let R ⊂ E be a root system, with E a euclidean vector space. If R+ is a set of positive roots,
then a root is called simple if it is positive, and not the sum of any two positive roots. The
simple roots form a basis of the vector space E, and any positive root is a non-negative integer
linear combination of simple roots.

A set of roots which is simple with respect to some choice of a set of positive roots is called
a base. The Weyl group of the root system acts simply transitively on the set of bases.

Version: 1 Owner: bwebste Author(s): bwebste

278.23 weight (Lie algebras)

Let g be a semi-simple Lie algebra. Choose a Cartan subalgebra h. Then a weight is simply
an element of the dual h∗ . Weights arise in the representation theory of semi-simple Lie
algebras in the following manner: Let V be a finite dimensional representation of g. The
elements of h must act on V by diagonalizable (also
called semi-simple) linear transformations. Since h is abelian, these must be simultaneously
diagonalizable. Thus, V decomposes as the direct sum of simultaneous eigenspaces for h.
Let W be such an eigenspace. Then the map λ defined by λ(H)v = Hv for v ∈ W is a linear functional
on h, and thus a weight, as defined above. The maximal eigenspace Vλ with weight λ is called
the weight space of λ. The dimension of Vλ is called the multiplicity of λ. A finite dimensional
representation of a semi-simple algebra is determined up to isomorphism by the multiplicities of its weights.

Version: 3 Owner: bwebste Author(s): bwebste

278.24 weight lattice

The weight lattice ΛW of a root system R ⊂ E is the dual lattice to the lattice generated by the
coroots α∨ = 2α/(α, α), α ∈ R. That is,
ΛW = {e ∈ E | 2(e, α)/(α, α) ∈ Z for all α ∈ R}.
(When all roots satisfy (α, α) = 2, this is the dual lattice to ΛR , the root lattice of R.)
Weights which lie in the weight lattice are called integral. Since the simple coroots are free
generators of the coroot lattice, one need only check that 2(e, π)/(π, π) ∈ Z for all simple roots π. If
R ⊂ h∗ is the root system of a semi-simple Lie algebra g with Cartan subalgebra h, then ΛW
is exactly the set of weights appearing in finite dimensional representations of g.

Version: 4 Owner: bwebste Author(s): bwebste

Chapter 279

17B30 – Solvable, nilpotent
(super)algebras

279.1 Engel’s theorem

Before proceeding, it will be useful to recall the definition of a nilpotent Lie algebra. Let g
be a Lie algebra. The lower central series of g is defined to be the filtration of ideals

D0 g ⊃ D1 g ⊃ D2 g ⊃ . . . ,

where
D0 g = g, Dk+1g = [g, Dk g], k ∈ N.
To say that g is nilpotent is to say that the lower central series has a trivial termination, i.e.
that there exists a k such that
Dk g = 0,
or equivalently, that k nested bracket operations always vanish.

Theorem 1 (Engel). Let g ⊂ End V be a Lie algebra of endomorphisms of a finite-dimensional
vector space V . Suppose that all elements of g are nilpotent transformations. Then, g is a
nilpotent Lie algebra.

Lemma 1. Let X : V → V be a nilpotent endomorphism of a vector space V . Then, the
adjoint action
ad(X) : End V → End V
is also a nilpotent endomorphism.

Proof. Suppose that
X^k = 0
for some k ∈ N. We will show that
ad(X)^{2k−1} = 0.
Note that
ad(X) = l(X) − r(X),
where
l(X), r(X) : End V → End V,
are the endomorphisms corresponding, respectively, to left and right multiplication by X.
These two endomorphisms commute, and hence we can use the binomial formula to write
\[ \mathrm{ad}(X)^{2k-1} = \sum_{i=0}^{2k-1} (-1)^i \binom{2k-1}{i}\, l(X)^{2k-1-i}\, r(X)^i . \]
Each of the terms in the above sum vanishes because in each term either 2k − 1 − i ≥ k or i ≥ k, and

l(X)^k = r(X)^k = 0.

QED
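The bound in the lemma can be observed on a small case. The sketch below assumes X is the 3 × 3 nilpotent Jordan block, so that k = 3, and applies ad(X) repeatedly to every elementary matrix; the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def ad(X, Y):
    """ad(X) applied to Y, that is XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def ad_power_vanishes(X, m):
    """True if ad(X)^m kills every elementary matrix, hence all of End V."""
    n = len(X)
    for r in range(n):
        for c in range(n):
            Y = [[1 if (i, j) == (r, c) else 0 for j in range(n)] for i in range(n)]
            for _ in range(m):
                Y = ad(X, Y)
            if any(any(row) for row in Y):
                return False
    return True


X = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]          # X^3 = 0 but X^2 != 0, so k = 3

print(ad_power_vanishes(X, 5))   # exponent 2k - 1 = 5
print(ad_power_vanishes(X, 4))   # one power fewer
```

For this X the exponent 2k − 1 = 5 is sharp: the fifth power of ad(X) vanishes while the fourth does not.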

Lemma 2. Let g be as in the theorem, and suppose, in addition, that g is a nilpotent Lie
algebra. Then the joint kernel,
\[ \ker \mathfrak{g} = \bigcap_{a \in \mathfrak{g}} \ker a, \]
is non-trivial.

Proof. We proceed by induction on the dimension of g. The claim is true for dimen-
sion 1, because then g is generated by a single nilpotent transformation, and all nilpotent
transformations are singular.

Suppose then that the claim is true for all Lie algebras of dimension less than n = dim g.
We note that D1 g fits the hypotheses of the lemma, and has dimension less than n, because
g is nilpotent. Hence, by the induction hypothesis

V0 = ker D1 g

is non-trivial. Now, if we restrict all actions to V0 , we obtain a representation of g by abelian
transformations. This is because for all a, b ∈ g and v ∈ V0 we have

abv − bav = [a, b]v = 0.

Now a finite number of mutually commuting linear endomorphisms admits a mutual eigenspace
decomposition. In particular, if all of the commuting endomorphisms are singular, their joint
kernel will be non-trivial. We apply this result to a basis of g/D1 g acting on V0 , and the
desired conclusion follows. QED

Proof of the theorem. We proceed by induction on the dimension of g. The theorem is
true in dimension 1, because in that circumstance D1 g is trivial.

Next, suppose that the theorem holds for all Lie algebras of dimension less than n = dim g.
Let h ⊂ g be a properly contained subalgebra of minimum codimension. We claim that there
exists an a ∈ g but not in h such that [a, h] ⊂ h.

By the induction hypothesis, h is nilpotent. To prove the claim consider the isotropy representation
of h on g/h. By Lemma 1, the action of each b ∈ h on g/h is a nilpotent endomorphism.
Hence, we can apply Lemma 2 to deduce that the joint kernel of all these actions is non-trivial,
i.e. there exists an a ∈ g but not in h such that

[b, a] ≡ 0 (mod h),

for all b ∈ h. Equivalently, [h, a] ⊂ h and the claim is proved.

Evidently then, the span of a and h is a subalgebra of g. Since h has minimum codimension,
we infer that h and a span all of g, and that

D1 g ⊂ h. (279.1.1)

Next, we claim that all the Dk h are ideals of g. It is enough to show that

[a, Dk h] ⊂ Dk h.

We argue by induction on k. Suppose the claim is true for some k. Let b ∈ h, c ∈ Dk h be
given. By the Jacobi identity

[a, [b, c]] = [[a, b], c] + [b, [a, c]].

The first term on the right-hand side is in D_{k+1} h because [a, b] ∈ h. The second term is in
D_{k+1} h by the induction hypothesis. In this way the claim is established.

Now a is nilpotent, and hence by Lemma 1,

ad(a)^n = 0 (279.1.2)

for some n ∈ N. We now claim that

D_{n+1} g ⊂ D_1 h.

By (279.1.1) it suffices to show that

\overbrace{[g, [. . . [g, h] . . .]]}^{n times} ⊂ D_1 h.

Putting
g_1 = g/D_1 h, h_1 = h/D_1 h,
this is equivalent to

\overbrace{[g_1, [. . . [g_1, h_1] . . .]]}^{n times} = 0.

However, h_1 is abelian, and hence, the above follows directly from (279.1.2).

Adapting this argument in the obvious fashion we can show that

D_{kn+1} g ⊂ D_k h.

Since h is nilpotent, g must be nilpotent as well. QED

Historical remark. In the traditional formulation of Engel’s theorem, the hypotheses are
the same, but the conclusion is that there exists a basis B of V , such that all elements of g
are represented by strictly upper triangular matrices relative to B.

Let us put this another way. The vector space Nil of strictly upper triangular matrices is a
nilpotent Lie algebra, and indeed all subalgebras of Nil are nilpotent Lie algebras. Engel’s theorem asserts
that the converse holds, i.e. if all elements of a Lie algebra g are nilpotent transformations,
then g is isomorphic to a subalgebra of Nil.

The classical result follows straightforwardly from our version of the Theorem and from
Lemma 2. Indeed, let V1 be the joint kernel of g. We then let U2 be the joint kernel of g acting
on V /V1 , and let V2 ⊂ V be the subspace obtained by pulling U2 back to V . We do this a
finite number of times and obtain a flag of subspaces

0 = V0 ⊂ V1 ⊂ V2 ⊂ . . . ⊂ Vn = V,

such that
gVk+1 ⊂ Vk
for all k. Then choose an adapted basis relative to this flag, and we’re done.

Version: 2 Owner: rmilson Author(s): rmilson

279.2 Lie’s theorem

Let g be a finite dimensional complex solvable Lie algebra, and V a nonzero finite-dimensional
complex representation of g. Then there exists a nonzero element of V which is a simultaneous
eigenvector for all elements of g.

Applying this result inductively, we find that there is a basis of V with respect to which all
elements of g are upper triangular.

Version: 3 Owner: bwebste Author(s): bwebste

279.3 solvable Lie algebra

Let g be a Lie algebra. The lower central series of g is the filtration of subalgebras

D_1 g ⊃ D_2 g ⊃ D_3 g ⊃ · · · ⊃ D_k g ⊃ · · ·

of g, inductively defined for every natural number k ≥ 2 as follows:

D_1 g := [g, g]
D_k g := [g, D_{k−1} g]

The derived series of g is the filtration

D^1 g ⊃ D^2 g ⊃ D^3 g ⊃ · · · ⊃ D^k g ⊃ · · ·

defined inductively by

D^1 g := [g, g]
D^k g := [D^{k−1} g, D^{k−1} g]

In fact both D_k g and D^k g are ideals of g, and D^k g ⊂ D_k g for all k. The Lie algebra g is
defined to be nilpotent if D_k g = 0 for some k ∈ N, and solvable if D^k g = 0 for some k ∈ N.

A subalgebra h of g is said to be nilpotent or solvable if h is nilpotent or solvable when
considered as a Lie algebra in its own right. The terms may also be applied to ideals of g,
since every ideal of g is also a subalgebra.
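
To make the difference between the two series concrete, here is a small Python sketch that computes the dimensions of their terms for matrix Lie algebras given by a basis. The first series is obtained by repeatedly bracketing with all of g, the second by bracketing each term with itself. The function names and the two test algebras (upper triangular 2 × 2 matrices and strictly upper triangular 3 × 3 matrices) are assumptions chosen for illustration, not part of the entry.

```python
from fractions import Fraction


def matmul(a, b, n):
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(a, b, n):
    """Commutator [a, b] = ab - ba of two n x n matrices."""
    ab, ba = matmul(a, b, n), matmul(b, a, n)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def flatten(m):
    return [entry for row in m for entry in row]


def unflatten(v, n):
    return [list(v[i * n:(i + 1) * n]) for i in range(n)]


def span_basis(vectors):
    """Basis of the span of the given vectors, by Gaussian elimination."""
    basis = []  # pairs (pivot index, vector with a 1 at the pivot)
    for v in vectors:
        v = [Fraction(x) for x in v]
        for pivot, b in basis:
            if v[pivot] != 0:
                c = v[pivot]
                v = [vi - c * bi for vi, bi in zip(v, b)]
        nonzero = next((i for i, x in enumerate(v) if x != 0), None)
        if nonzero is not None:
            lead = v[nonzero]
            basis.append((nonzero, [x / lead for x in v]))
    return [b for _, b in basis]


def bracket_space(A, B, n):
    """Basis of [A, B], where A and B are lists of flattened basis matrices."""
    return span_basis([flatten(bracket(unflatten(a, n), unflatten(b, n), n))
                       for a in A for b in B])


def lower_central_dims(g, n, steps):
    """Dimensions of [g, g], [g, [g, g]], ... (bracket with all of g)."""
    dims, current = [], g
    for _ in range(steps):
        current = bracket_space(g, current, n)
        dims.append(len(current))
    return dims


def derived_dims(g, n, steps):
    """Dimensions of the series obtained by bracketing each term with itself."""
    dims, current = [], g
    for _ in range(steps):
        current = bracket_space(current, current, n)
        dims.append(len(current))
    return dims


# Upper triangular 2x2 matrices: basis E11, E12, E22 (flattened row by row).
upper2 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
# Strictly upper triangular 3x3 matrices: basis E12, E13, E23.
strict3 = [[0, 1, 0, 0, 0, 0, 0, 0, 0],
           [0, 0, 1, 0, 0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0, 1, 0, 0, 0]]

print(lower_central_dims(upper2, 2, 3), derived_dims(upper2, 2, 3))
print(lower_central_dims(strict3, 3, 3), derived_dims(strict3, 3, 3))
```

In this sketch the upper triangular algebra has a first series that stabilizes at dimension 1 while its second series reaches 0, so it is solvable but not nilpotent; the strictly upper triangular algebra is nilpotent.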

Version: 1 Owner: djao Author(s): djao

Chapter 280

17B35 – Universal enveloping
(super)algebras

280.1 Poincaré-Birkhoff-Witt theorem

Let g be a Lie algebra over a field k, and let B be a k-basis of g equipped with a linear order
≤. The Poincaré-Birkhoff-Witt theorem (often abbreviated to PBW-theorem) states that
the monomials
x1 x2 · · · xn with x1 ≤ x2 ≤ . . . ≤ xn elements of B
constitute a k-basis of the universal enveloping algebra U(g) of g. Such monomials are often
called ordered monomials or PBW-monomials.

It is easy to see that they span U(g): for all n ∈ N, let Mn denote the set

Mn = {(x1 , . . . , xn ) | x1 ≤ . . . ≤ xn } ⊂ B^n ,

and denote by π : \bigcup_{n=0}^{∞} B^n → U(g) the multiplication map. Clearly it suffices to prove that

π(B^n) ⊆ \sum_{i=0}^{n} π(M_i)

for all n ∈ N; to this end, we proceed by induction. For n = 0 the statement is clear. Assume
that it holds for n − 1 ≥ 0, and consider a list (x1 , . . . , xn ) ∈ B^n . If it is an element of Mn ,
then we are done. Otherwise, there exists an index i such that xi > xi+1 . Now we have

π(x1 , . . . , xn ) = π(x1 , . . . , xi−1 , xi+1 , xi , xi+2 , . . . , xn )
+ x1 · · · xi−1 [xi , xi+1 ]xi+2 · · · xn .

As B is a basis of g, [xi , xi+1 ] is a linear combination of elements of B. Using this to expand the second
term above, we find that it is in \sum_{i=0}^{n−1} π(M_i) by the induction hypothesis. The argument
of π in the first term, on the other hand, is lexicographically smaller than (x1 , . . . , xn ), but
contains the same entries. Clearly this rewriting process must end, and this concludes the
induction step.
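
The rewriting procedure in the spanning argument can be sketched in Python. The example below is only an illustration under stated assumptions: it uses the three-dimensional Lie algebra with basis p, q, e ordered as 0, 1, 2 and bracket [p, q] = e (all other brackets of basis elements zero), and the names BRACKET, bracket and straighten are hypothetical. An element of U(g) is stored as a dictionary from words (tuples of basis indices) to coefficients.

```python
# Assumed structure constants: basis p=0, q=1, e=2 with [p, q] = e.
BRACKET = {(0, 1): {2: 1}, (1, 0): {2: -1}}


def bracket(i, j):
    """Return [x_i, x_j] as a dict {basis index: coefficient}."""
    return BRACKET.get((i, j), {})


def straighten(word):
    """Rewrite a word as a linear combination of ordered monomials."""
    result = {}
    todo = [(tuple(word), 1)]
    while todo:
        w, c = todo.pop()
        # Look for a position where the word is out of order.
        i = next((i for i in range(len(w) - 1) if w[i] > w[i + 1]), None)
        if i is None:
            result[w] = result.get(w, 0) + c
            continue
        # Use x_i x_{i+1} = x_{i+1} x_i + [x_i, x_{i+1}].
        todo.append((w[:i] + (w[i + 1], w[i]) + w[i + 2:], c))
        for k, coeff in bracket(w[i], w[i + 1]).items():
            todo.append((w[:i] + (k,) + w[i + 2:], c * coeff))
    return {w: c for w, c in result.items() if c != 0}


print(straighten((1, 0)))     # q p rewritten in ordered monomials
print(straighten((2, 1, 0)))  # e q p rewritten in ordered monomials
```

Each swap reduces the number of out-of-order pairs and each bracket term is a shorter word, which is why the loop terminates, mirroring the induction in the text.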

The proof of linear independence of the PBW-monomials is slightly more difficult.

Version: 1 Owner: draisma Author(s): draisma

280.2 universal enveloping algebra

A universal enveloping algebra of a Lie algebra g over a field k is an associative algebra U
(with unity) over k, together with a Lie algebra homomorphism ι : g → U (where the Lie
algebra structure on U is given by the commutator), such that if A is another associative
algebra over k and φ : g → A is another Lie algebra homomorphism, then there exists a
unique homomorphism ψ : U → A of associative algebras such that the diagram

        ι
    g ─────→ U
     ╲       │
    φ ╲      │ ψ
       ↘     ↓
          A

commutes, i.e. ψ ◦ ι = φ. Any g has a universal enveloping algebra: let T be the associative tensor algebra
generated by the vector space g, and let I be the two-sided ideal of T generated by elements
of the form
xy − yx − [x, y] for x, y ∈ g;
then U = T /I is a universal enveloping algebra of g. Moreover, the universal property above
ensures that all universal enveloping algebras of g are canonically isomorphic; this justifies
the standard notation U(g).

Some remarks:

1. By the Poincaré-Birkhoff-Witt theorem, the map ι is injective; usually g is identified
with ι(g). From the construction above it is clear that this space generates U(g) as an
associative algebra with unity.
2. By definition, the (left) representation theory of U(g) is identical to that of g. In
particular, any irreducible g-module corresponds to a maximal left ideal of U(g).

Example: let g be the Lie algebra generated by the elements p, q, and e with Lie bracket
determined by [p, q] = e and [p, e] = [q, e] = 0. Then U(g)/(e − 1) (where (e − 1) denotes the
two-sided ideal generated by e − 1) is isomorphic to the skew polynomial algebra k[x, ∂/∂x],
the isomorphism being determined by

p + (e − 1) ↦ ∂/∂x and
q + (e − 1) ↦ x.
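
The relation behind this example, namely that differentiation and multiplication by x have commutator equal to the identity (matching [p, q] = e with e sent to 1), can be checked on polynomials with a short Python sketch. Polynomials are represented as coefficient lists [a0, a1, ...]; this representation and the function names are assumptions made for illustration.

```python
def d_dx(poly):
    """Derivative of a polynomial given as a coefficient list [a0, a1, ...]."""
    return [i * a for i, a in enumerate(poly)][1:] or [0]


def times_x(poly):
    """Multiplication by x shifts every coefficient up by one degree."""
    return [0] + list(poly)


def subtract(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def trim(poly):
    """Drop trailing zero coefficients, keeping at least one entry."""
    poly = list(poly)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly


def commutator_p_q(poly):
    """Apply pq - qp, with p acting as d/dx and q as multiplication by x."""
    return trim(subtract(d_dx(times_x(poly)), times_x(d_dx(poly))))


sample = [5, 0, 3, 7]  # 5 + 3x^2 + 7x^3
print(commutator_p_q(sample) == sample)
```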

Version: 1 Owner: draisma Author(s): draisma

Chapter 281

17B56 – Cohomology of Lie
(super)algebras

281.1 Lie algebra cohomology

Let g be a Lie algebra, and M a representation of g. Let

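M^g = {m ∈ M : Xm = 0 for all X ∈ g}.

The assignment M ↦ M^g is clearly a covariant functor, and it is left exact. Its right derived
functors R^i((−)^g) = H^i(g, −) are called the Lie algebra cohomology of g; the group H^i(g, M)
is the cohomology of g with coefficients in M.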

These cohomology groups have certain interpretations. For any Lie algebra g over a field k,
H^1(g, k) ≅ (g/[g, g])^*, the dual of the abelianization of g, and H^2(g, M) is in natural
bijection with equivalence classes of Lie algebra
extensions (thinking of M as an abelian Lie algebra) 0 → M → f → g → 0 such that
the action of g on M induced by that of f coincides with that already specified.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 282

17B67 – Kac-Moody (super)algebras
(structure and representation theory)

282.1 Kac-Moody algebra

Let A be an n × n generalized Cartan matrix. If n − r is the rank of A, then let h be an
(n + r)-dimensional complex vector space. Choose n linearly independent elements α1 , . . . , αn ∈ h∗
(called roots), and α̌1 , . . . , α̌n ∈ h (called coroots) such that ⟨αi , α̌j ⟩ = aij , where ⟨·, ·⟩ is the
natural pairing of h∗ and h. This choice is unique up to automorphisms of h.

Then the Kac-Moody algebra associated to A, denoted g(A), is the Lie algebra generated by elements
X1 , . . . , Xn , Y1 , . . . , Yn and h, with the relations

[Xi , Yi ] = α̌i ,    [Xi , Yj ] = 0 for i ≠ j,
[Xi , h] = αi (h)Xi ,    [Yi , h] = −αi (h)Yi ,
[Xi , [Xi , · · · , [Xi , Xj ] · · · ]] = 0,    [Yi , [Yi , · · · , [Yi , Yj ] · · · ]] = 0 for i ≠ j,

where in each of the last two relations the bracket with Xi (respectively Yi ) is taken 1 − aij times.

If the matrix A is positive-definite, we obtain a finite dimensional semi-simple Lie algebra,
and A is the Cartan matrix associated to a Dynkin diagram. Otherwise, the algebra we
obtain is infinite dimensional and has an r-dimensional center.

Version: 2 Owner: bwebste Author(s): bwebste

282.2 generalized Cartan matrix

A generalized Cartan matrix is a square matrix A = (aij ) whose diagonal entries are all 2, and whose
off-diagonal entries are nonpositive integers, such that aij = 0 if and only if aji = 0. Such a
matrix is called symmetrizable if there is an invertible diagonal matrix B such that AB is symmetric.
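
The defining conditions are easy to test mechanically. The following Python sketch checks them for a matrix given as a list of rows, and also checks whether a proposed diagonal matrix B (given by its diagonal entries) makes AB symmetric. The function names and the sample matrix, the Cartan matrix of type G2, are assumptions chosen for illustration.

```python
def is_generalized_cartan(a):
    """Check the conditions in the definition for a square integer matrix."""
    n = len(a)
    if any(len(row) != n for row in a):
        return False
    for i in range(n):
        for j in range(n):
            if not isinstance(a[i][j], int):
                return False
            if i == j:
                if a[i][j] != 2:
                    return False
            else:
                if a[i][j] > 0:
                    return False
                # a_ij = 0 if and only if a_ji = 0
                if (a[i][j] == 0) != (a[j][i] == 0):
                    return False
    return True


def symmetrizes(a, diagonal):
    """Check that A * diag(diagonal) is symmetric, with nonzero diagonal."""
    n = len(a)
    if any(d == 0 for d in diagonal):
        return False
    # The (i, j) entry of A * B is a[i][j] * diagonal[j].
    return all(a[i][j] * diagonal[j] == a[j][i] * diagonal[i]
               for i in range(n) for j in range(n))


G2 = [[2, -1], [-3, 2]]
print(is_generalized_cartan(G2), symmetrizes(G2, (1, 3)))
```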

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 283

17B99 – Miscellaneous

283.1 Jacobi identity interpretations

The Jacobi identity in a Lie algebra g has various interpretations that are more transparent,
whence easier to remember, than the usual form

[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0.

One is the fact that the adjoint representation ad : g → End(g) really is a representation.
Yet another way to formulate the identity is

ad(x)[y, z] = [ad(x)y, z] + [y, ad(x)z],

i.e., ad(x) is a derivation on g for all x ∈ g.

Version: 2 Owner: draisma Author(s): draisma

283.2 Lie algebra

A Lie algebra over a field k is a vector space g with a bilinear map [ , ] : g × g → g, called
the Lie bracket and denoted (x, y) ↦ [x, y]. It is required to satisfy:

1. [x, x] = 0 for all x ∈ g.

2. The Jacobi identity: [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 for all x, y, z ∈ g.

283.2.1 Subalgebras & Ideals

A vector subspace h of the Lie algebra g is a subalgebra if h is closed under the Lie bracket
operation, or, equivalently, if h itself is a Lie algebra under the same bracket operation as g.
An ideal of g is a subspace h for which [x, y] ∈ h whenever either x ∈ h or y ∈ h. Note that
every ideal is also a subalgebra.

Some general examples of subalgebras:

• The center of g, defined by Z(g) := {x ∈ g | [x, y] = 0 for all y ∈ g}. It is an ideal of g.

• The normalizer of a subalgebra h is the set N(h) := {x ∈ g | [x, h] ⊂ h}. The Jacobi
identity guarantees that N(h) is always a subalgebra of g.

• The centralizer of a subset X ⊂ g is the set C(X) := {x ∈ g | [x, X] = 0}. Again, the
Jacobi identity implies that C(X) is a subalgebra of g.

283.2.2 Homomorphisms

Given two Lie algebras g and g′ over the field k, a homomorphism from g to g′ is a
linear transformation φ : g → g′ such that φ([x, y]) = [φ(x), φ(y)] for all x, y ∈ g. An
injective homomorphism is called a monomorphism, and a surjective homomorphism is called
an epimorphism.

The kernel of a homomorphism φ : g → g′ (considered as a linear transformation) is denoted
ker (φ). It is always an ideal in g.

283.2.3 Examples
• Any vector space can be made into a Lie algebra simply by setting [x, y] = 0 for all x and y.
The resulting Lie algebra is called an abelian Lie algebra.

• If G is a Lie group, then the tangent space at the identity forms a Lie algebra over the
real numbers.

• R3 with the cross product operation is a nonabelian three dimensional Lie algebra over
R.
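
The cross product example can be verified directly. The Python sketch below checks the two Lie algebra axioms on the standard basis of R3, which suffices because the bracket is bilinear, and confirms that the bracket is not identically zero. The function names are illustrative assumptions.

```python
from itertools import product


def cross(u, v):
    """Cross product of two vectors in R^3, used here as the Lie bracket."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def add(*vectors):
    return tuple(sum(coords) for coords in zip(*vectors))


BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
ZERO = (0, 0, 0)


def is_alternating():
    """[x, x] = 0 and [x, y] = -[y, x] on basis vectors."""
    return (all(cross(x, x) == ZERO for x in BASIS) and
            all(add(cross(x, y), cross(y, x)) == ZERO
                for x, y in product(BASIS, repeat=2)))


def jacobi_holds():
    """[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 on basis vectors."""
    return all(add(cross(x, cross(y, z)),
                   cross(y, cross(z, x)),
                   cross(z, cross(x, y))) == ZERO
               for x, y, z in product(BASIS, repeat=3))


print(is_alternating(), jacobi_holds(), cross(BASIS[0], BASIS[1]))
```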

283.2.4 Historical Note

Lie algebras are so named in honour of Sophus Lie, a Norwegian mathematician who pioneered
the study of these mathematical objects. Lie’s discovery was tied to his investigation
of continuous transformation groups and symmetries. One joint project with Felix Klein
called for the classification of all finite-dimensional groups acting on the plane. The task
seemed hopeless owing to the generally non-linear nature of such group actions. However,
Lie was able to solve the problem by remarking that a transformation group can be locally
reconstructed from its corresponding “infinitesimal generators”, that is to say vector fields
corresponding to various 1–parameter subgroups. In terms of this geometric correspondence,
the group composition operation manifests itself as the bracket of vector fields, and this is
very much a linear operation. Thus the task of classifying group actions in the plane became
the task of classifying all finite-dimensional Lie algebras of planar vector fields, a project that
Lie brought to a successful conclusion.

This “linearization trick” proved to be incredibly fruitful and led to great advances in
geometry and differential equations. Such advances are based, however, on various results
from the theory of Lie algebras. Lie was the first to make significant contributions to this
purely algebraic theory, but he was surely not the last.

Version: 10 Owner: djao Author(s): djao, rmilson, nerdy2

283.3 real form

Let G be a complex Lie group. A real Lie group K is called a real form of G if g ≅ C ⊗R k,
where g and k are the Lie algebras of G and K, respectively.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 284

18-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

284.1 Grothendieck spectral sequence

If F : C → D and G : D → E are two covariant left exact functors between abelian categories,
and if F takes injective objects of C to G-acyclic objects of D, then there is a spectral sequence
for each object A of C:

E_2^{pq} = (R^p G ◦ R^q F )(A) ⇒ R^{p+q} (G ◦ F )(A)

For example, let X and Y be topological spaces, let C = Ab(X) and D = Ab(Y ) be the categories
of sheaves of abelian groups on X and on Y , and let E = Ab be the category of abelian groups.
A continuous map f : X → Y gives a functor f∗ : Ab(X) → Ab(Y ), the direct image functor. We also
have the global section functors ΓX : Ab(X) → Ab and ΓY : Ab(Y ) → Ab. We have
ΓY ◦ f∗ = ΓX , and the hypothesis can be verified (injectives are flasque, direct images of
flasque sheaves are flasque, and flasque sheaves are acyclic for the global section functor), so
the sequence in this case becomes

H^p (Y, R^q f∗ F) ⇒ H^{p+q} (X, F)

for a sheaf F of abelian groups on X, which is exactly the Leray spectral sequence.

I can recommend no better book than Weibel’s book on homological algebra. Sheaf theory
can be found in Hartshorne or in Godement’s book.

Version: 5 Owner: bwebste Author(s): Manoj, ceps, nerdy2

284.2 category of sets

The category of sets has as its objects all sets and as its morphisms functions between sets.
(This works if a category’s objects are only required to be part of a class, as the class of all
sets exists.)

Alternately one can specify a universe, containing all sets of interest in the situation, and
take the category to contain only sets in that universe and functions between those sets.

Version: 1 Owner: nerdy2 Author(s): nerdy2

284.3 functor

Given two categories C and D, a covariant functor T : C → D consists of an assignment
to each object X of C of an object T (X) of D (i.e. a “function” T : Ob(C) → Ob(D)),
together with an assignment to every morphism f ∈ HomC(A, B) of a morphism T (f ) ∈
HomD(T (A), T (B)), such that:

• T (1A ) = 1T (A) where 1X denotes the identity morphism on the object X (in the respective
category).

• T (g ◦ f ) = T (g) ◦ T (f ), whenever the composition g ◦ f is defined.

A contravariant functor T : C → D is just a covariant functor T : Cop → D from the
opposite category. In other words, the assignment reverses the direction of maps. If f ∈
HomC(A, B), then T (f ) ∈ HomD(T (B), T (A)) and T (g ◦ f ) = T (f ) ◦ T (g) whenever the
composition is defined (the domain of g is the same as the codomain of f ).

Given a category C and an object X we always have the functor T : C → Sets to the
category of sets defined on objects by T (A) = Hom(X, A). If f : A → B is a morphism of C,
then we define T (f ) : Hom(X, A) → Hom(X, B) by g ↦ f ◦ g. This is a covariant functor,
denoted by Hom(X, −).

Similarly, one can define a contravariant functor Hom(−, X) : C → Sets.
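
For finite sets the covariant functor Hom(X, −) can be written out explicitly. The following Python sketch is a minimal illustration under the assumption that functions between finite sets are stored as dictionaries; the names hom, compose, identity and hom_functor_on_morphism are hypothetical. It checks the two functor laws on a small example.

```python
from itertools import product


def hom(X, A):
    """All functions X -> A between finite sets, each stored as a dict."""
    xs = sorted(X)
    return [dict(zip(xs, values))
            for values in product(sorted(A), repeat=len(xs))]


def compose(f, g):
    """Return f ∘ g, the function x -> f[g[x]]."""
    return {x: f[g[x]] for x in g}


def identity(A):
    return {a: a for a in A}


def hom_functor_on_morphism(f):
    """T(f): Hom(X, A) -> Hom(X, B), sending g to f ∘ g."""
    return lambda g: compose(f, g)


X = {0, 1}
A = {"a", "b"}
f = {"a": "u", "b": "w"}            # a function A -> B with B = {u, v, w}
h = {"u": "s", "v": "s", "w": "t"}  # a function B -> C with C = {s, t}

T_id = hom_functor_on_morphism(identity(A))
T_f = hom_functor_on_morphism(f)
T_h = hom_functor_on_morphism(h)
T_hf = hom_functor_on_morphism(compose(h, f))

print(all(T_id(g) == g for g in hom(X, A)))              # T(1_A) = identity
print(all(T_hf(g) == T_h(T_f(g)) for g in hom(X, A)))    # T(h ∘ f) = T(h) ∘ T(f)
```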

Version: 3 Owner: nerdy2 Author(s): nerdy2

284.4 monic

A morphism f : A → B in a category is called monic if for any object C and any morphisms
g1 , g2 : C → A, if f ◦ g1 = f ◦ g2 then g1 = g2 .

A monic in the category of sets is simply a one-to-one function.
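
The statement that monics in the category of sets are exactly the one-to-one functions can be tested by brute force on small finite sets. The Python sketch below compares the cancellation property, checked against test objects C with one and two elements, with injectivity for every function between two three-element sets. The dictionary encoding of functions and all function names are assumptions for illustration.

```python
from itertools import product


def hom(X, A):
    """All functions X -> A between finite sets, each stored as a dict."""
    xs = sorted(X)
    return [dict(zip(xs, values))
            for values in product(sorted(A), repeat=len(xs))]


def compose(f, g):
    """Return f ∘ g, the function x -> f[g[x]]."""
    return {x: f[g[x]] for x in g}


def is_monic(f, A, test_objects):
    """Check that f ∘ g1 = f ∘ g2 forces g1 = g2 for maps from each test object."""
    for C in test_objects:
        maps = hom(C, A)
        for g1 in maps:
            for g2 in maps:
                if compose(f, g1) == compose(f, g2) and g1 != g2:
                    return False
    return True


def is_injective(f):
    return len(set(f.values())) == len(f)


A = {0, 1, 2}
B = {0, 1, 2}
TEST_OBJECTS = [{0}, {0, 1}]

agree = all(is_monic(f, A, TEST_OBJECTS) == is_injective(f) for f in hom(A, B))
print(agree)
```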

Version: 1 Owner: nerdy2 Author(s): nerdy2

284.5 natural equivalence

A natural transformation between functors τ : F → G is called a natural equivalence (or a
natural isomorphism) if there is a natural transformation σ : G → F such that τ ◦ σ = idG
and σ ◦ τ = idF where idF is the identity natural transformation on F (which for each object
A gives the identity map F (A) → F (A)), and composition is defined in the obvious way
(for each object compose the morphisms and it’s easy to see that this results in a natural
transformation).

Version: 2 Owner: mathcam Author(s): mathcam, nerdy2

284.6 representable functor

A contravariant functor T : C → Sets from a category to the category of sets is representable
if there is an object X of C such that T is naturally isomorphic to the functor X^• =
Hom(−, X).

Similarly, a covariant functor T is called representable if it is naturally isomorphic to X_• = Hom(X, −).

We say that the object X represents T . X is unique up to canonical isomorphism.

A vast number of important objects in mathematics are defined as representing functors.
For example, if F : C → D is any functor, then the adjoint G : D → C (if it exists)
can be defined as follows. For Y in D, G(Y ) is the object of C representing the functor
X ↦ Hom(F (X), Y ) if G is right adjoint to F or X ↦ Hom(Y, F (X)) if G is left adjoint.

Thus, for example, if R is a ring, then N ⊗ M represents the functor L ↦ HomR (N, HomR (M, L)).

Version: 3 Owner: bwebste Author(s): bwebste, nerdy2

284.7 supplemental axioms for an Abelian category

These are axioms introduced by Alexander Grothendieck for an abelian category. The first
two are satisfied by definition in an abelian category, and the others may or may not be.

(Ab1) Every morphism has a kernel and a cokernel.

(Ab2) Every monic is the kernel of its cokernel.

(Ab3) Coproducts exist. (Coproducts are also called direct sums.) If this axiom is satisfied
the category is often just called cocomplete.

(Ab3*) Products exist. If this axiom is satisfied the category is often just called complete.

(Ab4) Coproducts exist and the coproduct of monics is a monic.

(Ab4*) Products exist and the product of epics is an epic.

(Ab5) Coproducts exist and filtered colimits of exact sequences are exact.

(Ab5*) Products exist and filtered inverse limits of exact sequences are exact.

Grothendieck introduced these in his homological algebra paper in the Tôhoku Mathematical
Journal. They can also be found in Weibel’s excellent homological algebra book.

Version: 5 Owner: nerdy2 Author(s): nerdy2

Chapter 285

18A05 – Definitions, generalizations

285.1 autofunctor

Let F : C → C be an endofunctor on a category C. If F is a bijection on objects, Ob(C), and
morphisms, Mor(C), then it is an autofunctor.

In short, an autofunctor is a full and faithful endofunctor F : C → C such that the mapping
Ob(C) → Ob(C) which is induced by F is a bijection.

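An autofunctor F : C → C is an isomorphism of categories: it has an inverse functor F^{−1} with
F ◦ F^{−1} = F^{−1} ◦ F = idC. It need not be naturally isomorphic to the identity functor idC; for
instance, a non-inner automorphism of a group, viewed as an autofunctor of the corresponding
one-object category, is not.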

Version: 10 Owner: mathcam Author(s): mathcam, mhale, yark, gorun manolescu

285.2 automorphism

Roughly, an automorphism is a map from a mathematical object onto itself such that: 1.
There exists an “inverse” map such that the composition of the two is the identity map of
the object, and 2. any relevant structure related to the object in question is preserved.

In category theory, an automorphism of an object A in a category C is a morphism ψ ∈
Mor(A, A) such that there exists another morphism φ ∈ Mor(A, A) with ψ ◦ φ = φ ◦ ψ = idA .

For example, in the category of groups an automorphism is just a bijective (inverse exists and
composition gives the identity) group homomorphism (group structure is preserved). Concretely,
the map x ↦ −x is an automorphism of the additive group of real numbers. In the
category of topological spaces an automorphism would be a bijective, continuous map such
that its inverse map is also continuous (not guaranteed as in the group case). Concretely,
the map ψ : S^1 → S^1 where ψ(α) = α + θ for some fixed angle θ is an automorphism of the
topological space that is the circle.

Version: 4 Owner: benjaminfjones Author(s): benjaminfjones

285.3 category

A category C consists of the following data:

1. a collection ob(C) of objects (of C)

2. for each ordered pair (A, B) of objects of C, a collection (we will assume it is a set)
Hom(A, B) of morphisms from the domain A to the codomain B

3. a function ◦ : Hom(A, B) × Hom(B, C) → Hom(A, C) called composition.

We normally denote ◦(f, g) by g ◦ f for morphisms f, g. The above data must satisfy the
following axioms: for objects A, B, C, D,

A1: Hom(A, B) ∩ Hom(C, D) = ∅ whenever A ≠ C or B ≠ D

A2: (associativity) if f ∈ Hom(A, B), g ∈ Hom(B, C) and h ∈ Hom(C, D), then h ◦ (g ◦ f ) =
(h ◦ g) ◦ f

A3: (existence of an identity morphism) for each object A there exists an identity morphism
idA ∈ Hom(A, A) such that f ◦ idA = f for every f ∈ Hom(A, B) and idA ◦ g = g for every
g ∈ Hom(B, A).

Some examples of categories:

• 0 is the empty category with no objects or morphisms, 1 is the category with one
object and one (identity) morphism.

• If we assume we have a universe U which contains all sets encountered in “everyday”
mathematics, Set is the category of all such small sets with morphisms being set
functions

• Top is the category of all small topological spaces with morphisms continuous functions

• Grp is the category of all small groups whose morphisms are group homomorphisms

Version: 9 Owner: mathcam Author(s): mathcam, RevBobo

285.4 category example (arrow category)

Let C be a category, and let D be the category whose objects are the arrows of C. A
morphism between two morphisms f : A → B and g : A′ → B′ is defined to be a pair
of morphisms (h, k), where h ∈ Hom(A, A′ ) and k ∈ Hom(B, B′ ) such that the following
diagram

        h
    A ─────→ A′
    │        │
  f │        │ g
    ↓        ↓
    B ─────→ B′
        k

commutes, i.e. g ◦ h = k ◦ f . The resulting category D is called the arrow category of C.

Version: 6 Owner: n3o Author(s): n3o

285.5 commutative diagram

Definition 15. Let C be a category. A diagram in C is a directed graph Γ with vertex
set V and edge set E, (“loops” and “parallel edges” are allowed) together with two maps
o : V → Obj(C), m : E → Morph(C) such that if e ∈ E has source s(e) ∈ V and target
t(e) ∈ V then m(e) ∈ HomC (s(e), t(e)).

Usually diagrams are denoted by drawing the corresponding graph and labeling its vertices
(respectively edges) with their images under o (respectively m). For example, if f : A → B is
a morphism, then

        f
    A ─────→ B

is a diagram. Often (as in the previous example) the vertices themselves are not drawn since
their position can be deduced from the position of their labels.
Definition 16. Let D = (Γ, o, m) be a diagram in the category C and γ = (e1 , . . . , en ) be a
path in Γ. Then the composition along γ is the following morphism of C
◦(γ) := m(en ) ◦ · · · ◦ m(e1 ) .
We say that D is commutative or that it commutes if for any two objects in the image of
o, say A = o(v1 ) and B = o(v2 ), and any two paths γ1 and γ2 that connect v1 to v2 we have
◦(γ1 ) = ◦(γ2 ) .

For example the commutativity of the triangle

        f
    A ─────→ B
     ╲       │
    h ╲      │ g
       ↘     ↓
          C

translates to h = g ◦ f , while the commutativity of the square

        f
    A ─────→ B
    │        │
  k │        │ g
    ↓        ↓
    C ─────→ D
        h

translates to g ◦ f = h ◦ k.

Version: 3 Owner: Dr Absentius Author(s): Dr Absentius

285.6 double dual embedding

Let V be a vector space over a field K. Recall that V ∗ , the dual space, is defined to be the
vector space of all linear forms on V . There is a natural embedding of V into V ∗∗ , the dual
of its dual space. In the language of categories, this embedding is a natural transformation
between the identity functor and the double dual functor, both endofunctors operating on
VK , the category of vector spaces over K.

Turning to the details, let
I, D : VK → VK
denote the identity and the dual functors, respectively. Recall that for a linear mapping
L : U → V (a morphism in VK ), the dual homomorphism D[L] : V ∗ → U ∗ is defined by

D[L](α) : u ↦ α(Lu), u ∈ U, α ∈ V ∗ .

The double dual embedding is a natural transformation

δ : I → D^2 ,

that associates to every V ∈ VK a linear homomorphism δV ∈ Hom(V, V ∗∗ ) described by

δV (v) : α ↦ α(v), v ∈ V, α ∈ V ∗

To show that this transformation is natural, let L : U → V be a linear mapping. We must
show that the following diagram commutes:

        δU
    U ─────→ U ∗∗
    │         │
  L │         │ D^2 [L]
    ↓         ↓
    V ─────→ V ∗∗
        δV

Let u ∈ U and α ∈ V ∗ be given. Following the arrows down and right we have that

(δV ◦ L)(u) : α ↦ α(Lu).

Following the arrows right, then down we have that

(D[D[L]] ◦ δU )(u) : α ↦ (δU u)(D[L]α)
= (D[L]α)(u)
= α(Lu),

as desired.

Let us also note that for every non-zero v ∈ V , there exists an α ∈ V ∗ such that α(v) ≠ 0.
Hence δV (v) ≠ 0, and hence δV is an embedding, i.e. it is one-to-one. If V is finite dimensional,
then V ∗ has the same dimension as V . Consequently, for finite-dimensional V , the natural
embedding δV is, in fact, an isomorphism.

Version: 1 Owner: rmilson Author(s): rmilson

285.7 dual category

Let C be a category. The dual category C∗ of C is the category which has the same objects
as C, but in which all morphisms are “reversed”. That is to say if A, B are objects of C and
we have a morphism f : A → B, then f ∗ : B → A is a morphism in C∗ . The dual category
is sometimes called the opposite category and is denoted Cop .

Version: 3 Owner: RevBobo Author(s): RevBobo

285.8 duality principle

Let Σ be any statement of the elementary theory of an abstract category. We form the dual
of Σ as follows:

1. Replace each occurrence of “domain” in Σ with “codomain” and conversely.

2. Replace each occurrence of g ◦ f = h with f ◦ g = h

Informally, these conditions state that the dual of a statement is formed by reversing arrows
and compositions. For example, consider the following statements about a category C:

• f : A → B

• f is monic, i.e. for all morphisms g, h for which composition makes sense, f ◦ g = f ◦ h
implies g = h.

The respective dual statements are

• f : B → A
• f is epi, i.e. for all morphisms g, h for which composition makes sense, g ◦ f = h ◦ f
implies g = h.

The duality principle asserts that if a statement is a theorem, then the dual statement is
also a theorem. We take “theorem” here to mean provable from the axioms of the elementary
theory of an abstract category. In practice, for a valid statement about a particular category
C, the dual statement is valid in the dual category C∗ (Cop ).

Version: 3 Owner: RevBobo Author(s): RevBobo

285.9 endofunctor

Given a category C, an endofunctor is a functor T : C → C.

Version: 2 Owner: rmilson Author(s): NeuRet, Logan

285.10 examples of initial objects, terminal objects and
zero objects

Examples of initial objects, terminal objects and zero objects of categories include:

• The empty set is the unique initial object in the category of sets; every one-element
set is a terminal object in this category; there are no zero objects. Similarly, the empty
space is the unique initial object in the category of topological spaces; every one-point
space is a terminal object in this category.

• In the category of non-empty sets, there are no initial objects. The singletons are not
initial: while every non-empty set admits a function from a singleton, this function is
in general not unique.

• In the category of pointed sets (whose objects are non-empty sets together with a distin-
guished point; a morphism from (A, a) to (B, b) is a function f : A → B with f (a) = b)
every singleton serves as a zero object. Similarly, in the category of pointed topological spaces,
every singleton is a zero object.

• In the category of groups, any trivial group (consisting only of its identity element) is
a zero object. The same is true for the category of abelian groups as well as for the
category of modules over a fixed ring. This is the origin of the term “zero object”.

• In the category of rings with identity, the ring of integers (and any ring isomorphic to
it) serves as an initial object. The trivial ring consisting only of a single element 0 = 1
is a terminal object.

• In the category of schemes, the prime spectrum of the integers Spec(Z) is a terminal
object. The empty scheme (which is the prime spectrum of the trivial ring) is an initial
object.

• In the category of fields, there are no initial or terminal objects.

• Any partially ordered set (P, ≤) can be interpreted as a category: the objects are the
elements of P , and there is a single morphism from x to y if and only if x ≤ y. This
category has an initial object if and only if P has a smallest element; it has a terminal
object if and only if P has a largest element. This explains the terminology.

• In the category of graphs, the null graph is an initial object. There are no terminal
objects, unless we allow our graphs to have loops (edges starting and ending at the
same vertex), in which case the one-point-one-loop graph is terminal.

• Similarly, the category of all small categories with functors as morphisms has the empty
category as initial object and the one-object-one-morphism category as terminal object.

• Any topological space X can be viewed as a category X̂ by taking the open sets as
objects, and a single morphism between two open sets U and V if and only if U ⊂ V .
The empty set is the initial object of this category, and X is the terminal object.

• If X is a topological space and C is some small category, we can form the category of
all contravariant functors from X̂ to C, using natural transformations as morphisms.
This category is called the category of presheaves on X with values in C. If C
has an initial object c, then the constant functor which sends every open set to c is an
initial object in the category of presheaves. Similarly, if C has a terminal object, then
the corresponding constant functor serves as a terminal presheaf.

• If we fix a homomorphism f : A → B of abelian groups, we can consider the category
C consisting of all pairs (X, φ) where X is an abelian group and φ : X → A is a
group homomorphism with f φ = 0. A morphism from the pair (X, φ) to the pair
(Y, ψ) is defined to be a group homomorphism r : X → Y with the property ψr = φ:

    X ──φ──╮
    │      ↓      f
  r │      A ─────→ B
    ↓      ↑
    Y ──ψ──╯

The kernel of f is a terminal object in this category; this expresses the universal prop-
erty of kernels. With an analogous construction, cokernels can be retrieved as initial
objects of a suitable category.

• The previous example can be generalized to arbitrary limits of functors: if F : I → C is
a functor, we define a new category F̂ as follows: its objects are pairs (X, (φi )) where
X is an object of C and for every object i of I, φi : X → F (i) is a morphism in C such
that for every morphism ρ : i → j in I, we have F (ρ)φi = φj . A morphism between
pairs (X, (φi )) and (Y, (ψi )) is defined to be a morphism r : X → Y such that ψi r = φi
for all objects i of I. The universal property of the limit can then be expressed as
saying: any terminal object of F̂ is a limit of F and vice versa (note that F̂ need not
contain a terminal object, just like F need not have a limit).

Version: 11 Owner: AxelBoldt Author(s): AxelBoldt

285.11 forgetful functor

Let C and D be categories such that each object c of C can be regarded as an object of D
by suitably ignoring structures c may have as a C-object but not a D-object. A functor
U : C → D which operates on objects of C by “forgetting” any imposed mathematical
structure is called a forgetful functor. The following are examples of forgetful functors:

1. U : Grp → Set takes groups into their underlying sets and group homomorphisms to
set maps.

2. U : Top → Set takes topological spaces into their underlying sets and continuous maps
to set maps.

3. U : Ab → Grp takes abelian groups to groups and acts as identity on arrows.

Forgetful functors are often instrumental in studying adjoint functors.

Version: 1 Owner: RevBobo Author(s): RevBobo

285.12 isomorphism

A morphism f : A −→ B in a category is an isomorphism if there exists a morphism
f −1 : B −→ A which is its inverse. The objects A and B are isomorphic if there is an
isomorphism between them.

Examples:

• In the category of sets and functions, a function f : A −→ B is an isomorphism if and
only if it is bijective.

• In the category of groups and group homomorphisms (or rings and ring homomorphisms),
a homomorphism φ : G −→ H is an isomorphism if it has an inverse map φ−1 : H −→ G
which is also a homomorphism.

• In the category of vector spaces and linear transformations, a linear transformation is
an isomorphism if and only if it is an invertible linear transformation.

• In the category of topological spaces and continuous maps, a continuous map is an
isomorphism if and only if it is a homeomorphism.

Version: 2 Owner: djao Author(s): djao

285.13 natural transformation

Let A, B be categories and T, S : A → B functors. A natural transformation τ : S → T
is a family of morphisms τ = {τ_A : S(A) → T(A)} such that for each object A of A,
τ_A : S(A) → T(A) is a morphism of B, and for each morphism f : A → A′ in A the following
diagram commutes:
\[
\xymatrix{
S(A) \ar[r]^{\tau_A} \ar[d]_{Sf} & T(A) \ar[d]^{Tf} \\
S(A') \ar[r]_{\tau_{A'}} & T(A')
}
\]

Version: 6 Owner: RevBobo Author(s): RevBobo

285.14 types of homomorphisms

Often in a category of algebraic structures, those structures are generated by certain
elements, and subject to certain relations. One often refers to functions between structures
which are said to preserve those relations. These functions are typically called
homomorphisms.

An example is the category of groups. Suppose that f : A → B is a function between two
groups. We say that f is a group homomorphism if:

(a) the binary operator is preserved: f (a1 · a2 ) = f (a1 ) · f (a2 ) for all a1 , a2 ∈ A;

(b) the identity element is preserved: f (eA ) = eB ;

(c) inverses of elements are preserved: $f(a^{-1}) = [f(a)]^{-1}$ for all a ∈ A.

One can define similar natural concepts of homomorphisms for other algebraic structures,
giving us ring homomorphisms, module homomorphisms, and a host of others.
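
For a finite group, conditions (a), (b) and (c) can be checked by brute force. The following is a minimal sketch, assuming the groups are given by explicit operation, identity and inverse functions; the helper name `is_group_homomorphism` and the two test maps are illustrative choices, not part of the entry.

```python
from itertools import product

def is_group_homomorphism(f, elements, op_a, e_a, inv_a, op_b, e_b, inv_b):
    """Check conditions (a)-(c) for f : A -> B by brute force over a finite group A."""
    preserves_op = all(f(op_a(x, y)) == op_b(f(x), f(y))
                       for x, y in product(elements, repeat=2))
    preserves_identity = f(e_a) == e_b
    preserves_inverses = all(f(inv_a(x)) == inv_b(f(x)) for x in elements)
    return preserves_op and preserves_identity and preserves_inverses

# A = Z_6 and B = Z_3, both written additively.
Z6 = range(6)
add6 = lambda x, y: (x + y) % 6
neg6 = lambda x: (-x) % 6
add3 = lambda x, y: (x + y) % 3
neg3 = lambda x: (-x) % 3

reduce_mod_3 = lambda x: x % 3      # reduction modulo 3
shift = lambda x: (x + 1) % 3       # does not send the identity to the identity

is_hom_reduce = is_group_homomorphism(reduce_mod_3, Z6, add6, 0, neg6, add3, 0, neg3)
is_hom_shift = is_group_homomorphism(shift, Z6, add6, 0, neg6, add3, 0, neg3)
```

Here reduction modulo 3 passes all three checks, while the shifted map already fails condition (b).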

We give special names to homomorphisms when their functions have interesting properties.

If a homomorphism is an injective function (i.e. one-to-one), then we say that it is a
monomorphism. These are typically monic in their category.

If a homomorphism is a surjective function (i.e. onto), then we say that it is an epimorphism.
These are typically epic in their category.

If a homomorphism is a bijective function (i.e. both one-to-one and onto), then we say that
it is an isomorphism.

If the domain of a homomorphism is the same as its codomain (e.g. a homomorphism
f : A → A), then we say that it is an endomorphism. We often denote the collection of
endomorphisms on A as End(A).

If a homomorphism is both an endomorphism and an isomorphism, then we say that it is an
automorphism. We often denote the collection of automorphisms on A as Aut(A).

Version: 4 Owner: antizeus Author(s): antizeus

285.15 zero object

An initial object in a category C is an object A in C such that, for every object X in C, there
is exactly one morphism A −→ X.

A terminal object in a category C is an object B in C such that, for every object X in C,
there is exactly one morphism X −→ B.

A zero object in a category C is an object 0 that is both an initial object and a terminal
object.

All initial objects (respectively, terminal objects, and zero objects), if they exist, are isomorphic
in C.
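
As a small illustration, consider a finite partially ordered set viewed as a category, with exactly one morphism a → x when a ≤ x and none otherwise. Under that assumption, initial and terminal objects are least and greatest elements, and they can be found by direct search. The function names below are hypothetical.

```python
def initial_objects(elements, leq):
    """Objects a with a morphism a -> x for every x, i.e. a <= x for all x."""
    return [a for a in elements if all(leq(a, x) for x in elements)]

def terminal_objects(elements, leq):
    """Objects b with a morphism x -> b for every x, i.e. x <= b for all x."""
    return [b for b in elements if all(leq(x, b) for x in elements)]

def divides(a, b):
    return b % a == 0

# The divisors of 12, ordered by divisibility.
divisors_of_12 = [1, 2, 3, 4, 6, 12]
initial = initial_objects(divisors_of_12, divides)
terminal = terminal_objects(divisors_of_12, divides)
zero = [x for x in initial if x in terminal]
```

In this example 1 is initial and 12 is terminal; since they differ, the category has no zero object.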

Version: 2 Owner: djao Author(s): djao

Chapter 286

18A22 – Special properties of functors
(faithful, full, etc.)

286.1 exact functor

A covariant functor F is said to be left exact if whenever
\[ 0 \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C \]
is an exact sequence, then
\[ 0 \to FA \xrightarrow{F\alpha} FB \xrightarrow{F\beta} FC \]
is also an exact sequence.

A covariant functor F is said to be right exact if whenever
\[ A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to 0 \]
is an exact sequence, then
\[ FA \xrightarrow{F\alpha} FB \xrightarrow{F\beta} FC \to 0 \]
is also an exact sequence.

A contravariant functor F is said to be left exact if whenever
\[ A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to 0 \]
is an exact sequence, then
\[ 0 \to FC \xrightarrow{F\beta} FB \xrightarrow{F\alpha} FA \]
is also an exact sequence.

A contravariant functor F is said to be right exact if whenever
\[ 0 \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C \]
is an exact sequence, then
\[ FC \xrightarrow{F\beta} FB \xrightarrow{F\alpha} FA \to 0 \]
is also an exact sequence.

A (covariant or contravariant) functor is said to be exact if it is both left exact and right
exact.

Version: 3 Owner: antizeus Author(s): antizeus

Chapter 287

18A25 – Functor categories, comma
categories

287.1 Yoneda embedding

If C is a category, write Ĉ for the category of contravariant functors from C to Sets, the
category of sets. The morphisms in Ĉ are natural transformations of functors.

(To avoid set theoretical concerns, one can take a universe U and take all categories to be
U-small.)

For any object X of C, there is the functor $h_X = \mathrm{Hom}(-, X)$. Then $X \mapsto h_X$ is a covariant
functor C → Ĉ, which embeds C faithfully as a full subcategory of Ĉ.

Version: 4 Owner: nerdy2 Author(s): nerdy2

Chapter 288

18A30 – Limits and colimits
(products, sums, directed limits,
pushouts, fiber products, equalizers,
kernels, ends and coends, etc.)

288.1 categorical direct product

Let $\{C_i\}_{i \in I}$ be a set of objects in a category C. A direct product of the collection $\{C_i\}_{i \in I}$ is
an object $\prod_{i \in I} C_i$ of C, with morphisms $\pi_i : \prod_{j \in I} C_j \to C_i$ for each i ∈ I, such that:

For every object A in C, and any collection of morphisms $f_i : A \to C_i$ for every i ∈ I, there
exists a unique morphism $f : A \to \prod_{i \in I} C_i$ making the following diagram commute for all
i ∈ I.
\[
\xymatrix{
A \ar[r]^{f_i} \ar[d]_{f} & C_i \\
\prod_{j \in I} C_j \ar[ur]_{\pi_i} &
}
\]

Version: 4 Owner: djao Author(s): djao

288.2 categorical direct sum

Let $\{C_i\}_{i \in I}$ be a set of objects in a category C. A direct sum of the collection $\{C_i\}_{i \in I}$ is an
object $\coprod_{i \in I} C_i$ of C, with morphisms $\iota_i : C_i \to \coprod_{j \in I} C_j$ for each i ∈ I, such that:

For every object A in C, and any collection of morphisms $f_i : C_i \to A$ for every i ∈ I, there
exists a unique morphism $f : \coprod_{i \in I} C_i \to A$ making the following diagram commute for all
i ∈ I.
\[
\xymatrix{
C_i \ar[r]^{f_i} \ar[d]_{\iota_i} & A \\
\coprod_{j \in I} C_j \ar[ur]_{f} &
}
\]

Version: 4 Owner: djao Author(s): djao

288.3 kernel

Let f : X → Y be a function and let Y have some sort of zero, neutral or null element
that we'll denote as e. (Examples are groups, vector spaces, modules, etc.)

The kernel of f is the set:
ker f = {x ∈ X : f (x) = e}
that is, the set of elements in X whose image is e. This set can also be denoted as
$f^{-1}(e)$ (this does not mean that f has an inverse function; it is just notation), which is read
as "the kernel is the preimage of the neutral element". Let's see an example. Let X = Z
and Y = Z_6, and let f be the function that sends each integer n to its residue class modulo 6. So
f(4) = 4, f(20) = 2, f(−5) = 1. The kernel of f consists precisely of the multiples of 6 (since
they have residue 0, we have f(6k) = 0).

This is also an example of kernel of a group homomorphism, and since the sets are also rings,
the function f is also a homomorphism between rings and the kernel is also the kernel of a
ring homomorphism.

Usually we are interested in sets with certain algebraic structure. In particular, the following
theorem holds for homomorphisms between pairs of vector spaces, groups, rings and fields (and
some other algebraic structures):

A homomorphism f : X → Y is injective if and only if ker f = {0} (the zero of X).
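
The example of reduction modulo 6 can be checked numerically. The sketch below computes the kernel only on a finite window of integers, since Z itself is infinite; the names `kernel`, `residue_mod_6` and `window` are illustrative.

```python
def kernel(f, domain, zero):
    """Return the elements of `domain` that f sends to `zero`."""
    return [x for x in domain if f(x) == zero]

def residue_mod_6(n):
    # Python's % returns a representative in 0..5, also for negative n.
    return n % 6

# A finite window of Z standing in for the whole domain.
window = range(-12, 13)
ker = kernel(residue_mod_6, window, 0)
```

On this window the kernel is the set of multiples of 6. It contains more than the zero element, which matches the fact that the map is not injective (for instance, 0 and 6 have the same image).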

Version: 4 Owner: drini Author(s): drini

Chapter 289

18A40 – Adjoint functors (universal
constructions, reflective
subcategories, Kan extensions, etc.)

289.1 adjoint functor

Let C, D be categories and T : C → D, S : D → C be covariant functors. T is said to be a
left adjoint functor to S (equivalently, S is a right adjoint functor to T ) if there exists
ν = νC,D such that

\[ \nu : \mathrm{Hom}_{\mathcal{D}}(T(C), D) \cong \mathrm{Hom}_{\mathcal{C}}(C, S(D)) \]

is a natural bijection of hom-sets for all objects C of C and D of D.

An adjoint to a functor, if it exists, is unique up to natural isomorphism.

Examples:

1. Let U : Top → Set be the forgetful functor (i.e. U takes topological spaces to their
underlying sets, and continuous maps to set functions). Then U is right adjoint to the
functor F : Set → Top which gives each set the discrete topology.

2. If U : Grp → Set is again the forgetful functor, this time on the category of groups,
the functor F : Set → Grp which takes a set A to the free group generated by A is
left adjoint to U.

3. If $U_N$ : R-mod → R-mod is the functor $M \mapsto N \otimes M$ for an R-module N, then $U_N$
is the left adjoint to the functor $F_N$ : R-mod → R-mod given by $L \mapsto \mathrm{Hom}_R(N, L)$.

Version: 8 Owner: bwebste Author(s): bwebste, RevBobo

289.2 equivalence of categories

Definition 17. Let C and D be two categories with functors F : C → D and G : D → C. The
functors F and G are an equivalence of categories if there are natural isomorphisms
$FG \cong \mathrm{id}_D$ and $GF \cong \mathrm{id}_C$.

Note, F is left adjoint to G, and G is right adjoint to F as
\[ \mathrm{Hom}_D(F(c), d) \xrightarrow{G} \mathrm{Hom}_C(GF(c), G(d)) \longleftrightarrow \mathrm{Hom}_C(c, G(d)). \]

And, F is right adjoint to G, and G is left adjoint to F as
\[ \mathrm{Hom}_C(G(d), c) \xrightarrow{F} \mathrm{Hom}_D(FG(d), F(c)) \longleftrightarrow \mathrm{Hom}_D(d, F(c)). \]

In practical terms, two categories are equivalent if there is a fully faithful functor F : C → D,
such that every object d ∈ D is isomorphic to an object F (c), for some c ∈ C.

Version: 2 Owner: mhale Author(s): mhale

Chapter 290

18B40 – Groupoids, semigroupoids,
semigroups, groups (viewed as
categories)

290.1 groupoid (category theoretic)

A groupoid, also known as a virtual group, is a small category where every morphism is
invertible.

There is also a group-theoretic concept with the same name.

Version: 6 Owner: akrowne Author(s): akrowne

Chapter 291

18E10 – Exact categories, abelian
categories

291.1 abelian category

An abelian category is a category A satisfying the following axioms. Because the later axioms
rely on terms whose definitions involve the earlier axioms, we will intersperse the statements
of the axioms with such auxiliary definitions as needed.

Axiom 1. For any two objects A, B in A, the set of morphisms Hom(A, B) is an abelian group.

The identity element in the group Hom(·, ·) will be denoted by 0, and the group operation
by +.

Axiom 2. Composition of morphisms distributes over addition in Hom(·, ·). That is, given
any diagram of morphisms
\[ A \xrightarrow{f} B \underset{g_2}{\overset{g_1}{\rightrightarrows}} C \xrightarrow{h} D \]

we have (g1 + g2 )f = g1 f + g2 f and h(g1 + g2 ) = hg1 + hg2 .

Axiom 3. A has a zero object.

Axiom 4. For any two objects A, B in A, the categorical direct product A × B exists in A.

Given a morphism f : A −→ B in A, a kernel of f is a morphism i : X −→ A such that:

• f i = 0.

• For any other morphism j : X′ −→ A such that f j = 0, there exists a unique morphism
j′ : X′ −→ X such that the diagram
\[
\xymatrix{
X' \ar[d]_{j'} \ar[dr]^{j} & & \\
X \ar[r]_{i} & A \ar[r]_{f} & B
}
\]
commutes.

Likewise, a cokernel of f is a morphism p : B −→ Y such that:

• pf = 0.

• For any other morphism j : B −→ Y′ such that jf = 0, there exists a unique morphism
j′ : Y −→ Y′ such that the diagram
\[
\xymatrix{
A \ar[r]^{f} & B \ar[r]^{p} \ar[dr]_{j} & Y \ar[d]^{j'} \\
 & & Y'
}
\]
commutes.

Axiom 5. Every morphism in A has a kernel and a cokernel.

The kernel and cokernel of a morphism f in A will be denoted ker (f ) and cok(f ), respectively.

A morphism f : A −→ B in A is called a monomorphism if, for every morphism g : X −→ A
such that f g = 0, we have g = 0. Similarly, the morphism f is called an epimorphism if, for
every morphism h : B −→ Y such that hf = 0, we have h = 0.

Axiom 6. ker (cok(f )) = f for every monomorphism f in A.

Axiom 7. cok(ker (f )) = f for every epimorphism f in A.

Version: 6 Owner: djao Author(s): djao

291.2 exact sequence

Let A be an abelian category. We begin with a preliminary definition.

Definition 1. For any morphism f : A −→ B in A, let m : X −→ B be the morphism
equal to ker (cok(f )). Then the object X is called the image of f , and denoted Im(f ). The
morphism m is called the image morphism of f , and denoted i(f ).

Note that Im(f ) is not the same as i(f ): the former is an object of A, while the latter is a
morphism of A. We note that f factors through i(f ):
\[
\xymatrix{
A \ar[r]^-{e} \ar@/_1.5pc/[rr]_{f} & \mathrm{Im}(f) \ar[r]^-{i(f)} & B
}
\]

The proof is as follows: by definition of cokernel, cok(f )f = 0; therefore by definition of
kernel, the morphism f factors through ker (cok(f )) = i(f ), and this factor is the morphism
e above. Furthermore m is a monomorphism and e is an epimorphism, although we do not
prove these facts.
Definition 2. A sequence
\[ \cdots \to A \xrightarrow{f} B \xrightarrow{g} C \to \cdots \]
of morphisms in A is exact at B if ker (g) = i(f ).

Version: 3 Owner: djao Author(s): djao

291.3 derived category

Let A be an abelian category, and let K(A) be the category of chain complexes in A, with
morphisms being chain homotopy classes of maps. Call a morphism of chain complexes a
quasi-isomorphism if it induces an isomorphism on homology groups of the complexes. For
example, any chain homotopy equivalence is a quasi-isomorphism, but not conversely. Now let
the derived category D(A) be the category obtained from K(A) by adding a formal inverse to every
quasi-isomorphism (technically this is called a localization of the category).

Derived categories seem somewhat obscure, but in fact, many mathematicians believe they
are the appropriate place to do homological algebra. One of their great advantages is that the
important functors of homological algebra which are left or right exact (Hom, $N \otimes_k -$, where
N is a fixed k-module, the global section functor Γ, etc.) become exact on the level of
derived functors (with an appropriately modified definition of exact).

See Methods of Homological Algebra, by Gelfand and Manin for more details.

Version: 2 Owner: bwebste Author(s): bwebste

291.4 enough injectives

An abelian category is said to have enough injectives if for every object X, there is a
monomorphism 0 → X → I where I is an injective object.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 292

18F20 – Presheaves and sheaves

292.1 locally ringed space

292.1.1 Definitions

A locally ringed space is a topological space X together with a sheaf of rings OX with the
property that, for every point p ∈ X, the stalk (OX )p is a local ring.

A morphism of locally ringed spaces from (X, OX ) to (Y, OY ) is a continuous map f : X −→
Y together with a morphism of sheaves φ : OY −→ OX with respect to f such that, for every
point p ∈ X, the induced ring homomorphism on stalks φp : (OY )f (p) −→ (OX )p is a local
homomorphism. That is,
φp (y) ∈ mp for every y ∈ mf (p) ,
where mp (respectively, mf (p) ) is the maximal ideal of the ring (OX )p (respectively, (OY )f (p) ).

292.1.2 Applications

Locally ringed spaces are encountered in many natural contexts. Basically, every sheaf on
the topological space X consisting of continuous functions with values in some field is a
locally ringed space. Indeed, any such function which is not zero at a point p ∈ X is
nonzero and thus invertible in some neighborhood of p, which implies that the only maximal
ideal of the stalk at p is the set of germs of functions which vanish at p. The utility of
this definition lies in the fact that one can then form constructions in familiar instances of
locally ringed spaces which readily generalize in ways that would not necessarily be obvious
without this framework. For example, given a manifold X and its locally ringed space DX
of real–valued differentiable functions, one can show that the space of all tangent vectors to
X at p is naturally isomorphic to the real vector space $(m_p/m_p^2)^*$, where the ∗ indicates the
dual vector space. We then see that, in general, for any locally ringed space X, the space
of tangent vectors at p should be defined as the k–vector space $(m_p/m_p^2)^*$, where k is the
residue field $(O_X)_p/m_p$ and ∗ denotes dual with respect to k as before. It turns out that
this definition is the correct definition even in esoteric contexts like algebraic geometry over
finite fields which at first sight lack the differential structure needed for constructions such
as tangent vector.

Another useful application of locally ringed spaces is in the construction of schemes. The
forgetful functor assigning to each locally ringed space (X, OX ) the ring OX (X) is adjoint to
the ”prime spectrum” functor taking each ring R to its prime spectrum Spec(R), and this
correspondence is essentially why the category of locally ringed spaces is the proper building
block to use in the formulation of the notion of scheme.

Version: 9 Owner: djao Author(s): djao

292.2 presheaf

For a topological space X, a presheaf F with values in a category C associates to each open set
U ⊂ X an object F(U) of C, and to each inclusion U ⊂ V a morphism of C, $\rho_{UV} : F(V) \to F(U)$,
the restriction morphism. It is required that $\rho_{UU} = 1_{F(U)}$ and $\rho_{UW} = \rho_{UV} \circ \rho_{VW}$ for
any U ⊂ V ⊂ W.

A presheaf with values in the category of sets (or abelian groups) is called a presheaf of sets
(or abelian groups). If no target category is specified, either the category of sets or abelian
groups is most likely understood.

A more categorical way to state it is as follows. For X form the category Top(X) whose
objects are open sets of X and whose morphisms are the inclusions. Then a presheaf is
merely a contravariant functor Top(X) → C.
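
The two requirements on the restriction morphisms can be verified mechanically on a small example. The following sketch assumes a two-point space with open sets ∅, {1} and {1, 2}, and takes F(U) to be the set of all functions U → {0, 1}, stored as dictionaries; these choices and the function names are illustrative only.

```python
from itertools import product

X = frozenset({1, 2})
open_sets = [frozenset(), frozenset({1}), X]   # a topology on {1, 2}

def sections(U, values=(0, 1)):
    """F(U): all functions U -> values, each stored as a dict."""
    points = sorted(U)
    return [dict(zip(points, choice))
            for choice in product(values, repeat=len(points))]

def restrict(s, U):
    """Restriction morphism: restrict a section s to an open subset U of its domain."""
    return {p: s[p] for p in U}

def satisfies_presheaf_axioms():
    # Restricting from U to U must be the identity.
    for U in open_sets:
        if any(restrict(s, U) != s for s in sections(U)):
            return False
    # Restricting W -> V -> U must agree with restricting W -> U directly.
    for U, V, W in product(open_sets, repeat=3):
        if U <= V <= W:
            if any(restrict(restrict(s, V), U) != restrict(s, U)
                   for s in sections(W)):
                return False
    return True
```

Note that F(∅) has exactly one element here, the empty function.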

Version: 2 Owner: nerdy2 Author(s): nerdy2

292.3 sheaf

292.3.1 Presheaves

Let X be a topological space and let A be a category. A presheaf on X with values in A is
a contravariant functor F from the category of open sets in X and inclusion morphisms to
the category A.

As this definition may be less than helpful to many readers, we offer the following equivalent
(but longer) definition. A presheaf F on X consists of the following data:

1. An object F (U) in A, for each open set U ⊂ X

2. A morphism resV,U : F (V ) −→ F (U) for each pair of open sets U ⊂ V in X (called
the restriction morphism), such that:

(a) For every open set U ⊂ X, the morphism resU,U is the identity morphism.
(b) For any open sets U ⊂ V ⊂ W in X, the diagram
\[
\xymatrix{
F(W) \ar[r]_-{\mathrm{res}_{W,V}} \ar@/^1.5pc/[rr]^{\mathrm{res}_{W,U}} & F(V) \ar[r]_-{\mathrm{res}_{V,U}} & F(U)
}
\]

commutes.

If the object F (U) of A is a set, its elements are called sections of F over U.

292.3.2 Morphisms of Presheaves

Let f : X −→ Y be a continuous map of topological spaces. Suppose FX is a presheaf on
X, and GY is a presheaf on Y (with FX and GY both having values in A). We define a
morphism of presheaves φ from GY to FX , relative to f , to be a collection of morphisms
φU : GY (U) −→ FX (f −1 (U)) in A, one for every open set U ⊂ Y , such that the diagram

\[
\xymatrix{
G_Y(V) \ar[r]^-{\phi_V} \ar[d]_{\mathrm{res}_{V,U}} & F_X(f^{-1}(V)) \ar[d]^{\mathrm{res}_{f^{-1}(V), f^{-1}(U)}} \\
G_Y(U) \ar[r]_-{\phi_U} & F_X(f^{-1}(U))
}
\]

commutes, for each pair of open sets U ⊂ V in Y .

In the special case that f is the identity map id : X −→ X, we omit mention of the map f ,
and speak of φ as simply a morphism of presheaves on X. Form the category whose objects
are presheaves on X and whose morphisms are morphisms of presheaves on X. Then an
isomorphism of presheaves φ on X is a morphism of presheaves on X which is an isomorphism
in this category; that is, there exists a morphism φ−1 whose composition with φ both ways
is the identity morphism.

More generally, if f : X −→ Y is any homeomorphism of topological spaces, a morphism of
presheaves φ relative to f is an isomorphism if it admits a two–sided inverse morphism of
presheaves φ−1 relative to f −1 .

292.3.3 Sheaves

We now assume that the category A is a concrete category. A sheaf is a presheaf F on X,
with values in A, such that for every open set U ⊂ X, and every open cover {Ui } of U, the
following two conditions hold:

1. Any two elements f1 , f2 ∈ F (U) which have identical restrictions to each Ui are equal.
That is, if resU,Ui f1 = resU,Ui f2 for every i, then f1 = f2 .

2. Any collection of elements $f_i \in F(U_i)$ that have common restrictions can be realized
as the collective restrictions of a single element of F(U). That is, if
$\mathrm{res}_{U_i, U_i \cap U_j} f_i = \mathrm{res}_{U_j, U_i \cap U_j} f_j$ for every i and j, then there exists an element f ∈ F(U) such that
$\mathrm{res}_{U, U_i} f = f_i$ for all i.
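
For the presheaf of all functions with values in a fixed set, the two conditions amount to the familiar fact that functions agreeing on overlaps glue to a unique function on the union. The sketch below illustrates this under the assumption that sections are stored as dictionaries keyed by points; the name `glue` is hypothetical.

```python
def glue(local_sections):
    """Glue sections (dicts from points to values) that agree on overlaps.

    Returns the glued section on the union of the domains, or None if two
    local sections disagree at a common point.
    """
    glued = {}
    for s in local_sections:
        for point, value in s.items():
            # setdefault stores the value the first time a point is seen and
            # returns the stored value afterwards, so a mismatch is detected.
            if glued.setdefault(point, value) != value:
                return None
    return glued

# U1 = {1, 2} and U2 = {2, 3} cover U = {1, 2, 3}; the overlap is {2}.
compatible = glue([{1: "a", 2: "b"}, {2: "b", 3: "c"}])
incompatible = glue([{1: "a", 2: "b"}, {2: "x", 3: "c"}])
```

The glued section is unique because its value at each point is forced by whichever local section contains that point, which corresponds to condition 1.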

292.3.4 Sheaves in abelian categories

If A is a concrete abelian category, then a presheaf F is a sheaf if and only if for every open
subset U of X, the sequence
\[ 0 \to F(U) \xrightarrow{\mathrm{incl}} \prod_i F(U_i) \xrightarrow{\mathrm{diff}} \prod_{i,j} F(U_i \cap U_j) \tag{292.3.1} \]

is an exact sequence of morphisms in A for every open cover {Ui } of U in X. This diagram
requires some explanation, because we owe the reader a definition of the morphisms incl and
diff. We start with incl (short for “inclusion”). The restriction morphisms $F(U) \to F(U_i)$
induce a morphism
\[ F(U) \to \prod_i F(U_i) \]
to the categorical direct product $\prod_i F(U_i)$, which we define to be incl. The map diff (called
“difference”) is defined as follows. For each $U_i$, form the morphism
\[ \alpha_i : F(U_i) \to \prod_j F(U_i \cap U_j). \]

By the universal properties of categorical direct product, there exists a unique morphism
\[ \alpha : \prod_i F(U_i) \to \prod_i \prod_j F(U_i \cap U_j) \]
such that $\pi_i \alpha = \alpha_i \pi_i$ for all i, where $\pi_i$ is projection onto the ith factor. In a similar manner,
form the morphism
\[ \beta : \prod_j F(U_j) \to \prod_j \prod_i F(U_i \cap U_j). \]

Then α and β are both elements of the set
\[ \mathrm{Hom}\Big( \prod_i F(U_i), \; \prod_{i,j} F(U_i \cap U_j) \Big), \]

which is an abelian group since A is an abelian category. Take the difference α − β in this
group, and define this morphism to be diff.

Note that exactness of the sequence (292.3.1) is an element free condition, and therefore
makes sense for any abelian category A, even if A is not concrete. Accordingly, for any
abelian category A, we define a sheaf to be a presheaf F for which the sequence (292.3.1) is
always exact.

292.3.5 Examples

It’s high time that we give some examples of sheaves and presheaves. We begin with some
of the standard ones.
Example 9. If F is a presheaf on X, and U ⊂ X is an open subset, then one can define a
presheaf F |U on U by restricting the functor F to the subcategory of open sets of X in U
and inclusion morphisms. In other words, for open subsets of U, define F |U to be exactly
what F was, and ignore open subsets of X that are not open subsets of U. The resulting
presheaf is called, for obvious reasons, the restriction presheaf of F to U, or the restriction
sheaf if F was a sheaf to begin with.
Example 10. For any topological space X, let cX be the presheaf on X, with values in the
category of rings, given by

• cX (U) := the ring of continuous real–valued functions U −→ R,
• $\mathrm{res}_{V,U} f$ := the restriction of f to U, for every element f : V −→ R of cX (V ) and every
open subset U of V .

Then cX is actually a sheaf of rings, because continuous functions are uniquely specified by
their values on an open cover. The sheaf cX is called the sheaf of continuous real–valued
functions on X.
Example 11. Let X be a smooth differentiable manifold. Let DX be the presheaf on X, with
values in the category of real vector spaces, defined by setting DX (U) to be the space of
smooth real–valued functions on U, for each open set U, and with the restriction morphism
given by restriction of functions as before. Then DX is a sheaf as well, called the sheaf of
smooth real–valued functions on X.

Much more surprising is that the construct DX can actually be used to define the concept
of smooth manifold! That is, one can define a smooth manifold to be a locally Euclidean
n–dimensional second countable topological space X, together with a sheaf F , such that
there exists an open cover {Ui } of X where:

For every i, there exists a homeomorphism $f_i : U_i \to \mathbb{R}^n$ and an isomorphism
of sheaves $\phi_i : D_{\mathbb{R}^n} \to F|_{U_i}$ relative to $f_i$.

The idea here is that not only does every smooth manifold X have a sheaf DX of smooth functions,
but specifying this sheaf of smooth functions is sufficient to fully describe the smooth
manifold structure on X. While this phenomenon may seem little more than a toy curiosity
for differential geometry, it arises in full force in the field of algebraic geometry where the
coordinate functions are often unwieldy and algebraic structures in many cases can only be
satisfactorily described by way of sheaves and schemes.
Example 12. Similarly, for a complex analytic manifold X, one can form the sheaf HX of
holomorphic functions by setting HX (U) equal to the complex vector space of C–valued
holomorphic functions on U, with the restriction morphism being restriction of functions as
before.
Example 13. The algebraic geometry analogue of the sheaf DX of differential geometry is the
prime spectrum Spec(R) of a commutative ring R. However, the construction of the sheaf
Spec(R) is beyond the scope of this discussion and merits a separate article.
Example 14. For an example of a presheaf that is not a sheaf, consider the presheaf F on X,
with values in the category of real vector spaces, whose sections on U are locally constant
real–valued functions on U modulo constant functions on U. Then every section f ∈ F (U)
is locally zero in some fine enough open cover {Ui } (it is enough to take a cover where each
Ui is connected), whereas f may be nonzero if U is not connected.

We conclude with some interesting examples of morphisms of sheaves, chosen to illustrate the
unifying power of the language of schemes across various diverse branches of mathematics.

1. For any continuous function f : X −→ Y , the map $\phi_U : c_Y(U) \to c_X(f^{-1}(U))$ given
by $\phi_U(g) := g \circ f$ defines a morphism of sheaves from cY to cX with respect to f .
2. For any continuous function f : X −→ Y of smooth differentiable manifolds, the map
given by φU (g) := gf has the property
\[ g \in D_Y(U) \Rightarrow \phi_U(g) \in D_X(f^{-1}(U)) \]
if and only if f is a smooth function.
3. For any continuous function f : X −→ Y of complex analytic manifolds, the map given
by φU (g) := gf has the property
\[ g \in H_Y(U) \Rightarrow \phi_U(g) \in H_X(f^{-1}(U)) \]
if and only if f is a holomorphic function.
4. For any Zariski continuous function f : X −→ Y of algebraic varieties over a field k,
the map given by φU (g) := gf has the property
\[ g \in O_Y(U) \Rightarrow \phi_U(g) \in O_X(f^{-1}(U)) \]
if and only if f is a regular function. Here OX denotes the sheaf of k–valued regular
functions on the algebraic variety X.

REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer–
Verlag, 1999 (LNM 1358).
2. Charles Weibel, An Introduction to Homological Algebra, Cambridge University Press, 1994.

Version: 9 Owner: djao Author(s): djao

292.4 sheafification

Let F be a presheaf over a topological space X with values in a category A for which sheaves
are defined. The sheafification of F , if it exists, is a sheaf F′ over X together with a morphism
θ : F −→ F′ satisfying the following universal property:

For any sheaf G over X and any morphism of presheaves φ : F −→ G over X,
there exists a unique morphism of sheaves ψ : F′ −→ G such that the diagram
\[
\xymatrix{
F \ar[r]^{\theta} \ar@/_1.5pc/[rr]_{\phi} & F' \ar[r]^{\psi} & G
}
\]

commutes.

In light of the universal property, the sheafification of F is uniquely defined up to canonical
isomorphism whenever it exists. In the case where A is a concrete category (one consisting
of sets and set functions), the sheafification of any presheaf F can be constructed by taking
F′(U) to be the set of all functions $s : U \to \bigcup_{p \in U} F_p$ such that

1. s(p) ∈ Fp for all p ∈ U

2. For all p ∈ U, there is a neighborhood V ⊂ U of p and a section t ∈ F (V ) such that,
for all q ∈ V , the induced element tq ∈ Fq equals s(q)

for all open sets U ⊂ X. Here Fp denotes the stalk of the presheaf F at the point p.

The following quote, taken from [1], is perhaps the best explanation of sheafification to be
found anywhere:

F′ is "the best possible sheaf you can get from F ". It is easy to imagine how
to get it: first identify things which have the same restrictions, and then add in
all the things which can be patched together.

REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer–
Verlag, 1999 (LNM 1358)

Version: 4 Owner: djao Author(s): djao

292.5 stalk

Let F be a presheaf over a topological space X with values in an abelian category A, and
suppose direct limits exist in A. For any point p ∈ X, the stalk Fp of F at p is defined to
be the object in A which is the direct limit of the objects F (U) over the directed set of all
open sets U ⊂ X containing p, with respect to the restriction morphisms of F . In other
words,
\[ F_p := \varinjlim_{U \ni p} F(U). \]

If A is a category consisting of sets, the stalk Fp can be viewed as the set of all germs of
sections of F at the point p. That is, the set Fp consists of all the equivalence classes of
ordered pairs (U, s) where p ∈ U and s ∈ F (U), under the equivalence relation (U, s) ∼ (V, t)
if there exists a neighborhood $W \subset U \cap V$ of p such that $\mathrm{res}_{U,W} s = \mathrm{res}_{V,W} t$.

By universal properties of direct limit, a morphism φ : F −→ G of presheaves over X induces
a morphism φp : Fp −→ Gp on each stalk Fp of F . Stalks are most useful in the context
of sheaves, since they encapsulate all of the local data of the sheaf at the point p (recall
that sheaves are basically defined as presheaves which have the property of being completely
characterized by their local behavior). Indeed, in many of the standard examples of sheaves
that take values in rings (such as the sheaf DX of smooth functions, or the sheaf OX of
regular functions), the ring Fp is a local ring, and much of geometry is devoted to the study
of sheaves whose stalks are local rings (so-called ”locally ringed spaces”).

We mention here a few illustrations of how stalks accurately reflect the local behavior of a
sheaf; all of these are drawn from [1].

• A morphism of sheaves φ : F −→ G over X is an isomorphism if and only if the
induced morphism φp is an isomorphism on each stalk.

• A sequence F −→ G −→ H of morphisms of sheaves over X is an exact sequence at
G if and only if the induced sequence Fp −→ Gp −→ Hp is exact at each stalk Gp .

• The sheafification F′ of a presheaf F has stalk equal to Fp at every point p.

REFERENCES
1. Robin Hartshorne, Algebraic Geometry, Springer–Verlag New York Inc., 1977 (GTM 52).

Version: 4 Owner: djao Author(s): djao

Chapter 293

18F30 – Grothendieck groups

293.1 Grothendieck group

Let S be an abelian semigroup. The Grothendieck group of S is K(S) = (S × S)/∼, where
∼ is the equivalence relation: (s, t) ∼ (u, v) if there exists r ∈ S such that s + v + r =
t + u + r. This is indeed an abelian group with zero element (s, s) (any s ∈ S) and inverse
−(s, t) = (t, s).

The Grothendieck group construction is a functor from the category of abelian semigroups
to the category of abelian groups. A morphism f : S → T induces a morphism K(f ) :
K(S) → K(T ).
Example 15. K(N) = Z.
Example 16. Let G be an abelian group, then $K(G) \cong G$ via (g, h) ↔ g − h.
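
To make Example 15 concrete, the following sketch models elements of K(N) as pairs of natural numbers. It relies on the assumption that N is cancellative, so the witness r in the definition of ∼ can always be taken to be 0; the function names are illustrative.

```python
def equivalent(p, q):
    """(s, t) ~ (u, v) in K(N); cancellation in N lets us take r = 0."""
    (s, t), (u, v) = p, q
    return s + v == t + u

def add(p, q):
    """Componentwise addition of representatives."""
    return (p[0] + q[0], p[1] + q[1])

def negate(p):
    """The inverse of the class of (s, t) is the class of (t, s)."""
    return (p[1], p[0])

def to_integer(p):
    """The identification K(N) = Z sends the class of (s, t) to s - t."""
    return p[0] - p[1]

a, b = (2, 5), (4, 1)
same_class = equivalent(a, (0, 3))
sum_is_zero = equivalent(add(a, negate(a)), (0, 0))
additive = to_integer(add(a, b)) == to_integer(a) + to_integer(b)
```

The pair (2, 5) represents the integer −3, and adding a pair to its swap always lands in the class of (0, 0).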

Let C be a symmetric monoidal category. Its Grothendieck group is K([C]), i.e. the
Grothendieck group of the isomorphism classes of objects of C.

Version: 2 Owner: mhale Author(s): mhale

Chapter 294

18G10 – Resolutions; derived functors

294.1 derived functor

There are two objects called derived functors. First, there are classical derived functors. Let
𝒜, ℬ be abelian categories, and F : 𝒜 → ℬ be a covariant left-exact functor. Note that a
completely analogous construction can be done for right-exact and contravariant functors,
but it is traditional to only describe one case, as doing the other mostly consists of reversing
arrows. Given an object A ∈ 𝒜, we can construct an injective resolution:

A → I^• : 0 → A → I^1 → I^2 → ···

which is unique up to chain homotopy equivalence. Then we apply the functor F to the
injectives in the resolution to get a complex

F(I^•) : 0 → F(I^1) → F(I^2) → ···

(notice that the term involving A has been left out; this is not an accident, in fact it is
crucial). This complex is also independent of the choice of I's (up to chain homotopy
equivalence). Now, we define the classical right derived functors R^i F(A) to be the cohomology
groups H^i(F(I^•)). These depend only on A.

Important properties of the classical derived functors are these: if the sequence 0 → A →
A′ → A″ → 0 is exact, then there is a long exact sequence

0 → F(A) → F(A′) → F(A″) → R^1 F(A) → ···

which is natural (a morphism of short exact sequences induces a morphism of long exact
sequences). This, along with a couple of other properties, determines the derived functors
completely, giving an axiomatic definition, though the construction used above is usually
necessary to show existence.

From the definition, one can see immediately that the following are equivalent:

1. F is exact.

2. R^n F(A) = 0 for n ≥ 1 and all A ∈ 𝒜.

3. R^1 F(A) = 0 for all A ∈ 𝒜.

However, R^1 F(A) = 0 for a particular A does not imply that R^n F(A) = 0 for all n ≥ 1.

Important examples are Ext^n, the derived functor of Hom; Tor_n, the derived functor of
tensor product; and sheaf cohomology, the derived functor of the global section functor on
sheaves.

(Coming soon: the derived categories definition)

Version: 4 Owner: bwebste Author(s): bwebste

Chapter 295

18G15 – Ext and Tor, generalizations,
Künneth formula

295.1 Ext

For a ring R and an R-module A, we have a covariant functor Hom_R(A, −). The functors
Ext^n_R(A, −) are defined to be the right derived functors of Hom_R(A, −), that is,
Ext^n_R(A, −) = R^n Hom_R(A, −).

Ext gets its name from the following fact: there is a natural bijection between elements of
Ext^1_R(A, B) and extensions of B by A up to isomorphism of short exact sequences, where an
extension of B by A is an exact sequence

0 → B → C → A → 0.

For example,

Ext^1_Z(Z/nZ, Z) ≅ Z/nZ,

with 0 corresponding to the trivial extension 0 → Z → Z ⊕ Z/nZ → Z/nZ → 0, and m ≠ 0
(for m coprime to n) corresponding to

0 → Z → Z → Z/nZ → 0,

where the first map is multiplication by n and the second sends 1 to the class of m.

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 296

18G30 – Simplicial sets, simplicial
objects (in a category)

296.1 nerve

The nerve of a category C is the simplicial set Hom(i(−), C), where i : ∆ → Cat
is the fully faithful functor that takes each ordered set [n] in the simplicial category ∆ to
the pre-order n + 1. The nerve is a functor Cat → Set^(∆^op).

Version: 1 Owner: mhale Author(s): mhale

296.2 simplicial category

The simplicial category ∆ is defined as the small category whose objects are the totally ordered
finite sets
[n] = {0 < 1 < 2 < ··· < n}, n ≥ 0, (296.2.1)
and whose morphisms are monotonic non-decreasing (order-preserving) maps. It is generated
by two families of morphisms:

δ_i^n : [n − 1] → [n] is the injection missing i ∈ [n],
σ_i^n : [n + 1] → [n] is the surjection such that σ_i^n(i) = σ_i^n(i + 1) = i ∈ [n].

The δ_i^n morphisms are called face maps, and the σ_i^n morphisms are called degeneracy
maps. They satisfy the following relations:

δ_j^{n+1} δ_i^n = δ_i^{n+1} δ_{j−1}^n for i < j, (296.2.2)
σ_j^{n−1} σ_i^n = σ_i^{n−1} σ_{j+1}^n for i ≤ j, (296.2.3)
σ_j^n δ_i^{n+1} = δ_i^n σ_{j−1}^{n−1} if i < j,
σ_j^n δ_i^{n+1} = id_n if i = j or i = j + 1, (296.2.4)
σ_j^n δ_i^{n+1} = δ_{i−1}^n σ_j^{n−1} if i > j + 1.

All morphisms [n] → [0] factor through σ_0^0, so [0] is terminal.
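
The face and degeneracy relations can be checked mechanically for small n. The sketch below is an illustration rather than part of the original entry: a monotone map [m] → [n] is stored as the tuple of its values, and the function names `delta`, `sigma`, `compose` and `identities_hold` are assumptions made for this example.

```python
def delta(n, i):
    """Face map delta_i^n : [n-1] -> [n], the increasing injection missing i."""
    return tuple(k if k < i else k + 1 for k in range(n))


def sigma(n, i):
    """Degeneracy map sigma_i^n : [n+1] -> [n], the surjection hitting i twice."""
    return tuple(k if k <= i else k - 1 for k in range(n + 2))


def compose(g, f):
    """Composite g after f, with maps stored as tuples of values."""
    return tuple(g[x] for x in f)


def identities_hold(n):
    """Check relations (296.2.2)-(296.2.4) for a fixed n >= 1."""
    ok = True
    for j in range(n + 2):            # face/face relation, i < j
        for i in range(j):
            ok = ok and (compose(delta(n + 1, j), delta(n, i))
                         == compose(delta(n + 1, i), delta(n, j - 1)))
    for j in range(n):                # degeneracy/degeneracy relation, i <= j
        for i in range(j + 1):
            ok = ok and (compose(sigma(n - 1, j), sigma(n, i))
                         == compose(sigma(n - 1, i), sigma(n, j + 1)))
    identity = tuple(range(n + 1))
    for j in range(n + 1):            # mixed relation, three cases
        for i in range(n + 2):
            lhs = compose(sigma(n, j), delta(n + 1, i))
            if i < j:
                rhs = compose(delta(n, i), sigma(n - 1, j - 1))
            elif i in (j, j + 1):
                rhs = identity
            else:
                rhs = compose(delta(n, i - 1), sigma(n - 1, j))
            ok = ok and lhs == rhs
    return ok


print(all(identities_hold(n) for n in range(1, 6)))  # True
```

This is only a finite check for n up to 5, not a proof, but it confirms that the index conventions in the three relations are mutually consistent.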

There is a bifunctor + : ∆ × ∆ → ∆ defined by

[m] + [n] = [m + n + 1], (296.2.5)

(f + g)(i) = f(i) if 0 ≤ i ≤ m,
(f + g)(i) = g(i − m − 1) + m′ + 1 if m < i ≤ m + n + 1, (296.2.6)

where f : [m] → [m′] and g : [n] → [n′]. Sometimes, the simplicial category is defined to
include the empty set [−1] = ∅, which provides an initial object for the category. This makes
∆ a strict monoidal category as ∅ is a unit for the bifunctor: ∅ + [n] = [n] = [n] + ∅ and
id_∅ + f = f = f + id_∅. Further, ∆ is then the free monoidal category on a monoid object
(the monoid object being [0], with product σ_0^0 : [0] + [0] → [0]).

There is a faithful functor from ∆ to Top, which sends each object [n] to an oriented n-simplex.
The face maps then embed an (n − 1)-simplex in an n-simplex, and the degeneracy
maps collapse an (n + 1)-simplex to an n-simplex. The bifunctor forms a simplex from the
disjoint union of two simplices by joining their vertices together in a way compatible with
their orientations.

There is also a fully faithful functor from ∆ to Cat, which sends each object [n] to the
pre-order n + 1. The pre-order n is the category consisting of n totally ordered objects, with
exactly one morphism a → b iff a ≤ b.

Version: 4 Owner: mhale Author(s): mhale

296.3 simplicial object

A simplicial object in a category C is a contravariant functor from the simplicial category
∆ to C. Such a functor X is uniquely specified by the morphisms X(δ_i^n) : X([n]) → X([n − 1])
and X(σ_i^n) : X([n]) → X([n + 1]), which satisfy

X(δ_i^{n−1}) X(δ_j^n) = X(δ_{j−1}^{n−1}) X(δ_i^n) for i < j, (296.3.1)
X(σ_i^{n+1}) X(σ_j^n) = X(σ_{j+1}^{n+1}) X(σ_i^n) for i ≤ j, (296.3.2)
X(δ_i^{n+1}) X(σ_j^n) = X(σ_{j−1}^{n−1}) X(δ_i^n) if i < j,
X(δ_i^{n+1}) X(σ_j^n) = id_n if i = j or i = j + 1, (296.3.3)
X(δ_i^{n+1}) X(σ_j^n) = X(σ_j^{n−1}) X(δ_{i−1}^n) if i > j + 1.

In particular, a simplicial set is a simplicial object in Set. Equivalently, one could say that a
simplicial set is a presheaf on ∆. The object X([n]) of a simplicial set is a set of n-simplices,
and is called the n-skeleton.

Version: 2 Owner: mhale Author(s): mhale

Chapter 297

18G35 – Chain complexes

297.1 5-lemma

If Ai , Bi for i = 1, . . . , 5 are objects in an abelian category (for example, modules over a ring
R) such that there is a commutative diagram

A_1 → A_2 → A_3 → A_4 → A_5
 ↓γ_1   ↓γ_2   ↓γ_3   ↓γ_4   ↓γ_5
B_1 → B_2 → B_3 → B_4 → B_5

with the rows exact, and γ_1 is surjective, γ_5 is injective, and γ_2 and γ_4 are isomorphisms,
then γ_3 is an isomorphism as well.

Version: 2 Owner: bwebste Author(s): bwebste

297.2 9-lemma

If A_i, B_i, C_i, for i = 1, 2, 3, are objects of an abelian category such that there is a commutative diagram

     0     0     0
     ↓     ↓     ↓
0 → A_1 → B_1 → C_1 → 0
     ↓     ↓     ↓
0 → A_2 → B_2 → C_2 → 0
     ↓     ↓     ↓
0 → A_3 → B_3 → C_3 → 0
     ↓     ↓     ↓
     0     0     0

in which the columns and the bottom two rows are exact, then the top row is exact as well.

Version: 2 Owner: bwebste Author(s): bwebste

297.3 Snake lemma

There are two versions of the snake lemma: (1) Given a commutative diagram as below,
with exact rows

0 → A_1 → B_1 → C_1 → 0
     ↓α    ↓β    ↓γ
0 → A_2 → B_2 → C_2 → 0

there is an exact sequence

0 → ker α → ker β → ker γ → coker α → coker β → coker γ → 0

where ker denotes the kernel of a map and coker its cokernel.

(2) Applying this result inductively to a short exact sequence of chain complexes, we
obtain the following: Let A, B, C be chain complexes, and let

0→A→B→C→0

be a short exact sequence. Then there is a long exact sequence of homology groups

· · · → Hn (A) → Hn (B) → Hn (C) → Hn−1 (A) → · · ·

Version: 5 Owner: bwebste Author(s): bwebste

297.4 chain homotopy
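Let (A, d) and (A′, d′) be chain complexes and f : A → A′, g : A → A′ be chain maps. A
chain homotopy D between f and g is a sequence of homomorphisms {D_n : A_n → A′_{n+1}}
such that d′_{n+1} ∘ D_n + D_{n−1} ∘ d_n = f_n − g_n for each n. Thus, we have a diagram:

A_{n+1} → A_n → A_{n−1}      (top row, with maps d_{n+1} and d_n)
A′_{n+1} → A′_n → A′_{n−1}   (bottom row, with maps d′_{n+1} and d′_n)

with vertical maps f_{n+1} − g_{n+1} : A_{n+1} → A′_{n+1} and f_{n−1} − g_{n−1} : A_{n−1} → A′_{n−1},
and diagonal maps D_n : A_n → A′_{n+1} and D_{n−1} : A_{n−1} → A′_n.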

Version: 4 Owner: RevBobo Author(s): RevBobo

297.5 chain map
Let (A, d) and (A′, d′) be chain complexes. A chain map f : A → A′ is a sequence of
homomorphisms {f_n} such that d′_n ∘ f_n = f_{n−1} ∘ d_n for each n. Diagrammatically, this says
that the following diagram commutes:

A_n  --d_n-->  A_{n−1}
 ↓f_n           ↓f_{n−1}
A′_n --d′_n--> A′_{n−1}

Version: 3 Owner: RevBobo Author(s): RevBobo

297.6 homology (chain complex)

If (A, d) is a chain complex

··· ←d_{n−1}− A_{n−1} ←d_n− A_n ←d_{n+1}− A_{n+1} ←d_{n+2}− ···

then the n-th homology group H_n(A, d) (or module) of the chain complex A is the quotient

H_n(A, d) = ker d_n / im d_{n+1}.
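
As a small worked illustration, which is not in the original entry, the sketch below computes the ranks of the homology groups of the simplicial chain complex of a hollow triangle (a circle) with rational coefficients. The choice of example, the helper names `rank` and `betti`, and the use of exact fractions are all assumptions of this sketch; it computes only dimensions over Q, not torsion.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rational matrix (list of rows) by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def betti(dim_n, d_n, d_n_plus_1):
    """dim H_n = dim ker d_n - rank d_{n+1}; an empty list stands for a zero map."""
    return (dim_n - rank(d_n)) - rank(d_n_plus_1)


# d_1 : C_1 -> C_0 for the hollow triangle; columns are the edges
# e01, e02, e12 and rows are the vertices v0, v1, v2.
d1 = [[-1, -1, 0],
      [1, 0, -1],
      [0, 1, 1]]

print(betti(3, [], d1))  # 1  (one connected component)
print(betti(3, d1, []))  # 1  (one independent loop)
```

Here d_0 and d_2 are zero maps, so H_0 has rank 3 − rank d_1 and H_1 has rank dim ker d_1, in line with the quotient ker d_n / im d_{n+1}.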

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 298

18G40 – Spectral sequences,
hypercohomology

298.1 spectral sequence

A spectral sequence is a collection of R-modules (or more generally, objects of an abelian category)
{E^r_{p,q}} for all r ∈ N, p, q ∈ Z, equipped with maps d^r_{p,q} : E^r_{p,q} → E^r_{p−r,q+r−1}
such that each E^r is a chain complex (that is, d^r ∘ d^r = 0), and the E^{r+1}'s are its
homology, that is,

E^{r+1}_{p,q} ≅ ker(d^r_{p,q}) / im(d^r_{p+r,q−r+1}).

(Note: what I have defined above is a homology spectral sequence. Cohomology spectral
sequences are identical, except that all the arrows go in the other direction.)

Most interesting spectral sequences are upper right quadrant, meaning that E^r_{p,q} = 0 if p
or q < 0. If this is the case then for any p, q, both d^r_{p,q} and d^r_{p+r,q−r+1} are 0 for sufficiently
large r since the target or source is out of the upper right quadrant, so that for all r ≥ r_0,
E^r_{p,q} = E^{r+1}_{p,q} = ···. This group is called E^∞_{p,q}.

An upper right quadrant spectral sequence {E^r_{p,q}} is said to converge to a sequence F_n of
R-modules if there is an exhaustive filtration F_{n,0} = 0 ⊂ F_{n,1} ⊂ ··· of each F_n such that

F_{p+q,q+1} / F_{p+q,q} ≅ E^∞_{p,q}.

This is typically written E^r_{p,q} ⇒ F_{p+q}.

Typically spectral sequences are used in the following manner: we find an interpretation of
E r for a small value of r, typically 1, and of E ∞ , and then in cases where enough groups
and differentials are 0, we can obtain information about one from the other.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 299

19-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

299.1 Algebraic K-theory

Algebraic K-theory is a series of functors on the category of rings. It classifies ring invariants,
i.e. ring properties that are Morita invariant.

The functor K0

Let R be a ring and denote by M_∞(R) the algebraic direct limit of matrix algebras M_n(R)
under the embeddings M_n(R) → M_{n+1}(R) : a ↦ diag(a, 0), the block matrix with a in the
upper-left corner and zeros elsewhere. The zeroth K-group of
R, K_0(R), is the Grothendieck group (abelian group of formal differences) of the unitary
equivalence classes of projections in M_∞(R). The addition of two equivalence classes [p] and
[q] is given by the direct summation of the projections p and q: [p] + [q] = [p ⊕ q].

The functor K1

[To Do: coauthor?]

The functor K2

[To Do: coauthor?]

Higher K-functors

Higher K-groups are defined using the Quillen plus construction,

K_n^alg(R) = π_n(BGL_∞(R)^+), (299.1.1)

where GL_∞(R) is the infinite general linear group over R (defined in a similar way to
M_∞(R)), and BGL_∞(R) is its classifying space.

Algebraic K-theory has a product structure,

K_i(R) ⊗ K_j(S) → K_{i+j}(R ⊗ S). (299.1.2)

Version: 2 Owner: mhale Author(s): mhale

299.2 K-theory

Topological K-theory is a generalised cohomology theory on the category of compact Hausdorff
spaces. It classifies the vector bundles over a space X up to stable equivalences. Equivalently,
via the Serre-Swan theorem, it classifies the finitely generated projective modules over the
C*-algebra C(X).

Let A be a unital C*-algebra over C and denote by M_∞(A) the algebraic direct limit of
matrix algebras M_n(A) under the embeddings M_n(A) → M_{n+1}(A) : a ↦ diag(a, 0), the block
matrix with a in the upper-left corner and zeros elsewhere. The
K_0(A) group is the Grothendieck group (abelian group of formal differences) of the homotopy
classes of the projections in M_∞(A). Two projections p and q are homotopic if p = uqu^{−1} for
some unitary u ∈ M_∞(A). Addition of homotopy classes is given by the direct summation
of projections: [p] + [q] = [p ⊕ q].

Denote by U_∞(A) the direct limit of unitary groups U_n(A) under the embeddings U_n(A) →
U_{n+1}(A) : u ↦ diag(u, 1). Give U_∞(A) the direct limit topology, i.e. a subset U of U_∞(A)
is open if and only if U ∩ U_n(A) is an open subset of U_n(A), for all n. The K_1(A) group
is the Grothendieck group (abelian group of formal differences) of the homotopy classes of
the unitaries in U_∞(A). Two unitaries u and v are homotopic if uv^{−1} lies in the identity
component of U_∞(A). Addition of homotopy classes is given by the direct summation of
unitaries: [u] + [v] = [u ⊕ v]. Equivalently, one can work with invertibles in GL_∞(A) (an
invertible g is connected to the unitary u = g|g|^{−1} via the homotopy t ↦ g|g|^{−t}).

Higher K-groups can be defined through repeated suspensions,

K_n(A) = K_0(S^n A). (299.2.1)

But, the Bott periodicity theorem means that

K_1(SA) ≅ K_0(A). (299.2.2)

The main properties of K_i are:

K_i(A ⊕ B) = K_i(A) ⊕ K_i(B), (299.2.3)
K_i(M_n(A)) = K_i(A) (Morita invariance), (299.2.4)
K_i(A ⊗ K) = K_i(A) (stability), (299.2.5)
K_{i+2}(A) = K_i(A) (Bott periodicity). (299.2.6)

There are three flavours of topological K-theory to handle the cases of A being complex (over
C), real (over R) or Real (with a given real structure).

K_i(C(X, C)) = KU^{−i}(X) (complex/unitary), (299.2.7)
K_i(C(X, R)) = KO^{−i}(X) (real/orthogonal), (299.2.8)
KR_i(C(X), J) = KR^{−i}(X, J) (Real). (299.2.9)

Real K-theory has a Bott period of 8, rather than 2.

REFERENCES
1. N. E. Wegge-Olsen, K-theory and C ∗ -algebras. Oxford science publications. Oxford University
Press, 1993.
2. B. Blackadar, K-Theory for Operator Algebras. Cambridge University Press, 2nd ed., 1998.

Version: 12 Owner: mhale Author(s): mhale

299.3 examples of algebraic K-theory groups

R     K_0(R)   K_1(R)   K_2(R)   K_3(R)   K_4(R)
Z     Z        Z/2      Z/2      Z/48     0
R     Z        R×
C     Z        C×

Algebraic K-theory of some common rings.

Version: 2 Owner: mhale Author(s): mhale

Chapter 300

19K33 – EXT and K-homology

300.1 Fredholm module

Fredholm modules represent abstract elliptic pseudo-differential operators.
Definition 3. An odd Fredholm module (H, F) over a C*-algebra A is given by an involutive
representation π of A on a Hilbert space H, together with an operator F on H such that
F = F*, F^2 = 1 and [F, π(a)] ∈ K(H) for all a ∈ A.
Definition 4. An even Fredholm module (H, F, Γ) is given by an odd Fredholm module (H, F)
together with a Z_2-grading Γ on H, Γ = Γ*, Γ^2 = 1, such that Γπ(a) = π(a)Γ and ΓF =
−FΓ.
Definition 5. A Fredholm module is called degenerate if [F, π(a)] = 0 for all a ∈ A.
Degenerate Fredholm modules are homotopic to the 0-module.
Example 17 (Fredholm modules over C). An even Fredholm module (H, F, Γ) over C is given
by

H = C^k ⊕ C^k with π(a) = ( a·1_k  0 ; 0  0 ),
F = ( 0  1_k ; 1_k  0 ),
Γ = ( 1_k  0 ; 0  −1_k ),

where 1_k is the k × k identity matrix and each 2 × 2 block matrix is written row by row.

Version: 3 Owner: mhale Author(s): mhale

300.2 K-homology

K-homology is a homology theory on the category of compact Hausdorff spaces. It classifies
the elliptic pseudo-differential operators acting on the vector bundles over a space. In terms
of C ∗ -algebras, it classifies the Fredholm modules over an algebra.

The K^0(A) group is the abelian group of homotopy classes of even Fredholm modules over
A. The K^1(A) group is the abelian group of homotopy classes of odd Fredholm modules over
A. Addition is given by direct summation of Fredholm modules, and the inverse of (H, F, Γ)
is (H, −F, −Γ).

Version: 1 Owner: mhale Author(s): mhale

Chapter 301

19K99 – Miscellaneous

301.1 examples of K-theory groups

A               K_0(A)         K_1(A)
C               Z              0
M_n(C)          Z              0
H               Z              0
K               Z              0
B               0              0
B/K             0              Z
C_0((0, 1))     0              Z
C_0(R^{2n})     Z              0
C_0(R^{2n+1})   0              Z
C([0, 1])       Z              0
C(T^n)          Z^{2^{n−1}}    Z^{2^{n−1}}
C(S^{2n})       Z^2            0
C(S^{2n+1})     Z              Z
C(CP^n)         Z^{n+1}        0
O_n             Z/(n − 1)      0
A_θ             Z^2            Z^2
C*(H_3)         Z^3            Z^3

Topological K-theory of some common C ∗ -algebras.

Version: 5 Owner: mhale Author(s): mhale

Chapter 302

20-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

302.1 alternating group is a normal subgroup of the
symmetric group

Theorem 2. The alternating group A_n is a normal subgroup of the symmetric group S_n.

Define the epimorphism f : S_n → Z_2 (for n ≥ 2) by σ ↦ 0 if σ is an even permutation and
σ ↦ 1 if σ is an odd permutation. Then A_n is the kernel of f, and so it is a normal subgroup
of the domain S_n. Furthermore, S_n/A_n ≅ Z_2 by the first isomorphism theorem. So by
Lagrange's theorem
|S_n| = |A_n| |S_n/A_n|.
Therefore, |A_n| = n!/2. That is, there are n!/2 elements in A_n.
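
The count can be confirmed by brute force for a small n. The following Python sketch is illustrative only: the parity map f is realized by counting inversions, which is one standard way to compute the sign of a permutation, and the names `sign` and `compose` are assumptions of this example.

```python
from itertools import permutations
from math import factorial


def sign(perm):
    """The map f: 0 for an even permutation, 1 for an odd one (inversion count mod 2)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2


def compose(p, q):
    """The permutation p after q, with permutations stored as tuples of images."""
    return tuple(p[i] for i in q)


n = 4
S_n = list(permutations(range(n)))
A_n = [p for p in S_n if sign(p) == 0]

# f is a homomorphism into Z_2: f(pq) = f(p) + f(q) mod 2.
is_homomorphism = all(
    sign(compose(p, q)) == (sign(p) + sign(q)) % 2 for p in S_n for q in S_n
)
print(len(A_n), factorial(n) // 2, is_homomorphism)  # 12 12 True
```

Since the kernel of any homomorphism is normal, the homomorphism check together with the count reproduces both conclusions of the theorem for n = 4.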

Version: 1 Owner: tensorking Author(s): tensorking

302.2 associative

Let (S, φ) be a set with binary operation φ. φ is said to be associative over S if

φ(a, φ(b, c)) = φ(φ(a, b), c)

for all a, b, c ∈ S.

Examples of associative operations are addition and multiplication over the integers (or
reals), or addition or multiplication over n × n matrices.

We can construct an operation which is not associative. Let S be the integers, and define
ν(a, b) = a^2 + b. Then ν(ν(a, b), c) = ν(a^2 + b, c) = a^4 + 2ba^2 + b^2 + c. But ν(a, ν(b, c)) =
ν(a, b^2 + c) = a^2 + b^2 + c, hence ν(ν(a, b), c) ≠ ν(a, ν(b, c)) in general (for a = b = 1 and
c = 0 the two sides are 4 and 2).

Note, however, that if we were to take S = {0}, ν would be associative over S. This
illustrates the fact that the set the operation is taken with respect to is very important.

Example. We show that the division operation over nonzero reals is non-associative. All
we need is a counter-example: so let us compare 1/(1/2) and (1/1)/2. The first expression
is equal to 2, the second to 1/2, hence division over the nonzero reals is not associative.
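
Both counter-examples, and the remark about S = {0}, are easy to test by exhaustive search over a small set. This is a minimal sketch; the function name `is_associative` and the sample ranges are assumptions chosen for illustration, and exact fractions are used so that division is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import product


def nu(a, b):
    """The operation nu(a, b) = a^2 + b from the text."""
    return a * a + b


def divide(a, b):
    """Division on nonzero rationals."""
    return a / b


def is_associative(op, elements):
    """Brute-force check of op(op(a, b), c) == op(a, op(b, c)) on a finite set."""
    return all(
        op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(elements, repeat=3)
    )


print(is_associative(nu, range(-2, 3)))    # False
print(is_associative(nu, [0]))             # True
print(nu(nu(1, 1), 0), nu(1, nu(1, 0)))    # 4 2
print(is_associative(divide, [Fraction(1), Fraction(2)]))  # False
```

A finite search can only refute associativity, not prove it on an infinite set; it proves it here only for the one-element set {0}.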

Version: 6 Owner: akrowne Author(s): akrowne

302.3 canonical projection

Given a group G and a normal subgroup N ⊴ G there is an epimorphism

π : G → G/N

defined by sending an element g ∈ G to its coset gN. The epimorphism π is referred to as
the canonical projection.

Version: 4 Owner: Dr Absentius Author(s): Dr Absentius

302.4 centralizer

For a given group G, the centralizer of an element a ∈ G is defined to be the set

C(a) = {x ∈ G | xa = ax}

We note that if x, y ∈ C(a) then xy^{−1}a = xay^{−1} = axy^{−1}, so that xy^{−1} ∈ C(a). Thus C(a)
is a subgroup of G containing at least e and a. To illustrate an application of this
concept we prove the following lemma.

There exists a bijection between the right cosets of C(a) and the conjugates of a.

If x, y ∈ G are in the same right coset, then y = cx for some c ∈ C(a). Thus y^{−1}ay =
x^{−1}c^{−1}acx = x^{−1}c^{−1}cax = x^{−1}ax. Conversely, if y^{−1}ay = x^{−1}ax then xy^{−1}a = axy^{−1} and
xy^{−1} ∈ C(a), so x and y are in the same right coset. Let [a] denote the conjugacy class of a.
It follows that, for finite G, |[a]| = [G : C(a)] and |[a]| divides |G|.

We remark that a ∈ Z(G) ⇔ |C(a)| = |G| ⇔ |[a]| = 1, where Z(G) denotes the center of G.

Now let G be a p-group, i.e. a finite group of order p^n, where p is a prime and n > 0. Let
z = |Z(G)|. Summing over representatives a of the distinct conjugacy classes, we have
p^n = Σ |[a]| = z + Σ_{a ∉ Z(G)} |[a]|, since the center consists precisely of the elements whose
conjugacy classes have cardinality 1. For a ∉ Z(G) we have |[a]| > 1 and |[a]| divides p^n, so p
divides |[a]|, and hence p | z. However, Z(G) is certainly non-empty, so we conclude that every
p-group has a non-trivial center.
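
The counting argument can be watched in action on a small 2-group. The sketch below is an illustration under stated assumptions: it models the dihedral group of order 8 as permutations of the four vertices of a square, and the generators `r`, `s` and all function names are choices made for this example.

```python
def compose(p, q):
    """The permutation p after q, with permutations stored as tuples of images."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generate(generators):
    """Closure of the generators under multiplication (enough for a finite group)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def centralizer(group, a):
    return {x for x in group if compose(x, a) == compose(a, x)}


def conjugacy_class(group, a):
    """All conjugates y^{-1} a y, as in the text."""
    return {compose(compose(inverse(y), a), y) for y in group}


r = (1, 2, 3, 0)   # rotation of the square by a quarter turn
s = (0, 3, 2, 1)   # a reflection
G = generate([r, s])
center = {a for a in G if len(conjugacy_class(G, a)) == 1}
class_sizes = sorted(len(c) for c in {frozenset(conjugacy_class(G, a)) for a in G})
print(len(G), len(center), class_sizes)  # 8 2 [1, 1, 2, 2, 2]
```

The class sizes sum to 8 = 2^3, each non-central class has size divisible by 2, and so the centre has order divisible by 2, exactly as in the argument above.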

The groups C(gag^{−1}) and C(a), for any g, are isomorphic.

Version: 5 Owner: mathcam Author(s): Larry Hammick, vitriol

302.5 commutative

Let (S, φ) be a set with binary operation φ. φ is said to be commutative if

φ(a, b) = φ(b, a)

for all a, b ∈ S.

Some operations which are commutative are addition over the integers, multiplication over
the integers, addition over n × n matrices, and multiplication over the reals.

An example of a non-commutative operation is multiplication over n × n matrices.

Version: 3 Owner: akrowne Author(s): akrowne

302.6 examples of groups

Groups are ubiquitous throughout mathematics. Many “naturally occurring” groups are
either groups of numbers (typically abelian) or groups of symmetries (typically non-abelian).

Groups of numbers

• The most important group is the group of integers Z with addition as operation.
• The integers modulo n, often denoted by Z_n, form a group under addition. Like Z
itself, this is a cyclic group; any cyclic group is isomorphic to one of these.

• The rational (or real, or complex) numbers form a group under addition.

• The positive rationals form a group under multiplication, and so do the non-zero
rationals. The same is true for the reals.

• The non-zero complex numbers form a group under multiplication. So do the non-zero
quaternions. The latter is our first example of a non-abelian group.

• More generally, any (skew) field gives rise to two groups: the additive group of all field
elements, and the multiplicative group of all non-zero field elements.

• The complex numbers of absolute value 1 form a group under multiplication, best
thought of as the unit circle. The quaternions of absolute value 1 form a group under
multiplication, best thought of as the three-dimensional unit sphere S^3. The
two-dimensional sphere S^2 however is not a group in any natural way.

Most groups of numbers carry natural topologies turning them into topological groups.

Symmetry groups

• The symmetric group of degree n, denoted by Sn , consists of all permutations of n
items and has n! elements. Every finite group is isomorphic to a subgroup of some Sn .

• An important subgroup of the symmetric group of degree n is the alternating group,
denoted An . This consists of all even permutations on n items. A permutation is said
to be even if it can be written as the product of an even number of transpositions.
The alternating group is normal in S_n, of index 2, and it is an interesting fact that A_n
is simple for n ≥ 5. See the proof on the simplicity of the alternating groups. By the
Jordan-Hölder theorem, this means that A_n is the only proper non-trivial normal subgroup
of S_n for n ≥ 5.

• If any geometrical object is given, one can consider its symmetry group consisting
of all rotations and reflections which leave the object unchanged. For example, the
group of rotational symmetries of a cone is isomorphic to S^1.

• The set of all automorphisms of a given group (or field, or graph, or topological space,
or object in any category) forms a group with operation given by the composition of
homomorphisms. These are called automorphism groups; they capture the internal
symmetries of the given objects.

• In Galois theory, the symmetry groups of field extensions (or equivalently: the symme-
try groups of solutions to polynomial equations) are the central object of study; they
are called Galois groups.

• Several matrix groups describe various aspects of the symmetry of n-space:

– The general linear group GL(n, R) of all real invertible n × n matrices (with
matrix multiplication as operation) contains rotations, reflections, dilations, shear
transformations, and their combinations.
– The orthogonal group O(n, R) of all real orthogonal n × n matrices contains the
rotations and reflections of n-space.
– The special orthogonal group SO(n, R) of all real orthogonal n × n matrices with
determinant 1 contains the rotations of n-space.

All these matrix groups are Lie groups: groups which are differentiable manifolds such
that the group operations are smooth maps.

Other groups

• The trivial group consists only of its identity element.

• If X is a topological space and x is a point of X, we can define the fundamental group
of X at x. It consists of (equivalence classes of) continuous paths starting and ending
at x and describes the structure of the “holes” in X accessible from x.

• The free groups are important in algebraic topology. In a sense, they are the most
general groups, having only those relations among their elements that are absolutely
required by the group axioms.

• If A and B are two abelian groups (or modules over the same ring), then the set
Hom(A, B) of all homomorphisms from A to B is an abelian group (since the sum
and difference of two homomorphisms is again a homomorphism). Note that the
commutativity of B is crucial here: without it, one couldn't prove that the sum of
two homomorphisms is again a homomorphism.

• The set of all invertible n × n matrices over some ring R forms a group denoted by
GL(n, R).

• The positive integers less than n which are coprime to n form a group if the operation
is defined as multiplication modulo n. This is an abelian group, not necessarily cyclic
(for n = 8 every element squares to 1), whose order is given by the Euler phi-function φ(n).

• Generalizing the last two examples, every ring (and every monoid) contains a group,
its group of units (invertible elements), where the group operation is ring (monoid)
multiplication.

• If K is a number field, then multiplication of (equivalence classes of) non-zero ideals
in the ring of algebraic integers OK gives rise to the ideal class group of K.

• The set of arithmetic functions that take a value other than 0 at 1 form an abelian group
under Dirichlet convolution. They include as a subgroup the set of multiplicative functions.

• Consider the curve C = {(x, y) ∈ R^2 | y^2 = x^3 − x}. Every straight line intersects
this set in three points (counting a point twice if the line is tangent, and allowing for a
point at infinity). If we require that those three points add up to zero for any straight
line, then we have defined an abelian group structure on C. Groups like these are
called abelian varieties; the most prominent examples are elliptic curves of which C is
the simplest one.
• In the classification of all finite simple groups, several “sporadic” groups occur which
don't follow any discernible pattern. The largest of these is the monster group with
some 8 · 10^53 elements.
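
One item in the list above, the group of positive integers less than n and coprime to n under multiplication modulo n, lends itself to a quick computation. The sketch below is illustrative; the helper names `units`, `order` and `is_cyclic` are assumptions of this example, and it is meant for n ≥ 2.

```python
from math import gcd


def units(n):
    """Positive integers less than n that are coprime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def order(a, n):
    """Multiplicative order of the unit a modulo n."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k


def is_cyclic(n):
    """True if some unit generates the whole group of units modulo n."""
    group = units(n)
    return any(order(a, n) == len(group) for a in group)


print(units(8), is_cyclic(8))    # [1, 3, 5, 7] False
print(units(7), is_cyclic(7))    # [1, 2, 3, 4, 5, 6] True
print(len(units(12)))            # 4
```

The length of `units(n)` is the value φ(n) of the Euler phi-function, and the case n = 8 shows that the group need not be cyclic.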

Version: 14 Owner: AxelBoldt Author(s): AxelBoldt, NeuRet

302.7 group

Group.
A group is a pair (G, ∗) where G is a non-empty set and ∗ is a binary operation on G that
satisfies the following conditions.

• For any a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c). (Associativity of the operation).
• For any a, b in G, a ∗ b belongs to G. (The operation ∗ is closed).
• There is an element e ∈ G such that g ∗ e = e ∗ g = g for any g ∈ G. (Existence of
identity element).
• For any g ∈ G there exists an element h ∈ G such that g ∗ h = h ∗ g = e. (Existence of inverses).

Usually the symbol ∗ is omitted and we write ab for a ∗ b. Sometimes, the symbol + is used
to represent the operation, especially when the group is abelian.

It can be proved that there is only one identity element, and that for every element there
is only one inverse. Because of this we usually denote the inverse of a as a^{−1}, or −a when we
are using additive notation. The identity element is also called the neutral element due to its
behavior with respect to the operation.

Version: 10 Owner: drini Author(s): drini

302.8 quotient group

Let (G, ∗) be a group and H a normal subgroup. The relation ∼ given by a ∼ b when ab−1 ∈
H is an equivalence relation. The equivalence classes are called cosets. The equivalence class
of a is denoted as aH (or a + H if additive notation is being used).

We can induce a group structure on the cosets with the following operation:

(aH) ⋆ (bH) = (a ∗ b)H.

The collection of cosets is denoted as G/H and together with the ⋆ operation forms the
quotient group or factor group of G by H.
Example. Consider the group Z and the subgroup

3Z = {n ∈ Z : n = 3k, k ∈ Z}.

Since Z is abelian, 3Z is then also a normal subgroup.

Using additive notation, the equivalence relation becomes n ∼ m when (n−m) ∈ 3Z, that is,
3 divides n − m. So the relation is actually congruence modulo 3. Therefore the equivalence
classes (the cosets) are:

3Z = . . . , −9, −6, −3, 0, 3, 6, 9, . . .
1 + 3Z = . . . , −8, −5, −2, 1, 4, 7, 10, . . .
2 + 3Z = . . . , −7, −4, −1, 2, 5, 8, 11, . . .

which we’ll represent as 0̄, 1̄ and 2̄.

Then we can check that Z/3Z is actually the integers modulo 3 (that is, Z/3Z ≅ Z_3).
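
The example can be sketched in a few lines of Python. Since a coset of 3Z is infinite, the code below only lists the part of each coset lying in a finite window; the window, the function names `coset` and `add_cosets`, and the sampled ranges are assumptions made for illustration.

```python
def coset(a, modulus=3, window=range(-9, 12)):
    """The part of the coset a + 3Z that lies in a finite window of integers."""
    return frozenset(x for x in window if (x - a) % modulus == 0)


def add_cosets(a, b, modulus=3):
    """(a + 3Z) + (b + 3Z) = (a + b) + 3Z, computed on representatives."""
    return (a + b) % modulus


print(sorted(coset(1)))                            # [-8, -5, -2, 1, 4, 7, 10]
print(coset(1) == coset(4) == coset(-2))           # True
print(len({coset(a) for a in range(-9, 12)}))      # 3
print(add_cosets(1, 2), add_cosets(4, -1))         # 0 0
```

The last line illustrates that the induced operation does not depend on the chosen representatives: 1 and 4 lie in the same coset, as do 2 and −1, and both sums land in the coset 3Z.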
Version: 6 Owner: drini Author(s): drini

Chapter 303

20-02 – Research exposition
(monographs, survey articles)

303.1 length function

Let G be a group. A length function on G is a function L : G → R+ satisfying:

L(e) = 0,
L(g) = L(g^{−1}), ∀g ∈ G,
L(g_1 g_2) ≤ L(g_1) + L(g_2), ∀g_1, g_2 ∈ G.

Version: 2 Owner: mhale Author(s): mhale

Chapter 304

20-XX – Group theory and
generalizations

304.1 free product with amalgamated subgroup

Definition 26. Let G_k, k = 0, 1, 2 be groups and i_k : G_0 → G_k, k = 1, 2 be monomorphisms.
The free product of G_1 and G_2 with amalgamated subgroup G_0 is defined to be a group G
that has the following two properties:

1. there are homomorphisms j_k : G_k → G, k = 1, 2 that make the following diagram
commute (that is, j_1 ∘ i_1 = j_2 ∘ i_2)

G_0 --i_1--> G_1
 ↓i_2          ↓j_1
G_2 --j_2--> G

2. G is universal with respect to the previous property, that is, for any other group G′ and
homomorphisms j′_k : G_k → G′, k = 1, 2 that fit in such a commutative diagram there
is a unique homomorphism G → G′ (marked ! below) so that the following diagram commutes

G_0 --i_1--> G_1
 ↓i_2          ↓j_1
G_2 --j_2--> G --!--> G′

together with j′_1 : G_1 → G′ and j′_2 : G_2 → G′, so that the composite of j_k with the
map G → G′ equals j′_k for k = 1, 2.

It follows by "general nonsense" that the free product of G_1 and G_2 with amalgamated
subgroup G_0, if it exists, is "unique up to unique isomorphism." The free product of G_1 and
G_2 with amalgamated subgroup G_0 is denoted by G_1 ⋆_{G_0} G_2. The following theorem asserts
its existence.

Theorem 1. G_1 ⋆_{G_0} G_2 exists for any groups G_k, k = 0, 1, 2 and monomorphisms i_k : G_0 →
G_k, k = 1, 2.

[Sketch of proof] Without loss of generality assume that G_0 is a subgroup of G_k and
that i_k is the inclusion for k = 1, 2. Let

G_k = ⟨(x_{k;s})_{s∈S} | (r_{k;t})_{t∈T}⟩

be a presentation of G_k for k = 1, 2. Each g ∈ G_0 can be expressed as a word in the generators
of G_k; denote that word by w_k(g) and let N be the normal closure of {w_1(g) w_2(g)^{−1} | g ∈ G_0}
in the free product G_1 ⋆ G_2. Define

G_1 ⋆_{G_0} G_2 := (G_1 ⋆ G_2)/N

and for k = 1, 2 define j_k to be the inclusion into the free product followed by the canonical projection.
Clearly (1) is satisfied, while (2) follows from the universal properties of the free product
and the quotient group.

Notice that in the above proof it would be sufficient to divide by the relations w_1(g) w_2(g)^{−1}
for g in a generating set of G_0. This is useful in practice when one is interested in obtaining
a presentation of G_1 ⋆_{G_0} G_2.

In case the i_k's are not injective, the above still goes through verbatim. The group thus
obtained is called a "pushout".

Examples of free products with amalgamated subgroups are provided by Van Kampen’s theorem.

Version: 1 Owner: Dr Absentius Author(s): Dr Absentius

304.2 nonabelian group

Let (G, ∗) be a group. If a ∗ b ≠ b ∗ a for some a, b ∈ G, we say that the group is nonabelian
or noncommutative.

Proposition. There is a nonabelian group for which x ↦ x^3 is a homomorphism.

Version: 2 Owner: drini Author(s): drini, apmxi

Chapter 305

20A05 – Axiomatics and elementary
properties

305.1 Feit-Thompson theorem

An important result in the classification of all finite simple groups, the Feit-Thompson
theorem states that every finite non-abelian simple group must have even order.

The proof requires 255 pages.

Version: 1 Owner: mathcam Author(s): mathcam

305.2 Proof: The orbit of any element of a group is a
subgroup

Following is a proof that, if G is a group and g ∈ G, then ⟨g⟩ ≤ G. Here ⟨g⟩ is the orbit of
g and is defined as
⟨g⟩ = {g^n : n ∈ Z}

Since g ∈ ⟨g⟩, the set ⟨g⟩ is nonempty.

Let a, b ∈ ⟨g⟩. Then there exist x, y ∈ Z such that a = g^x and b = g^y. Since ab^{−1} =
g^x (g^y)^{−1} = g^x g^{−y} = g^{x−y} ∈ ⟨g⟩, it follows that ⟨g⟩ ≤ G.

Version: 3 Owner: drini Author(s): drini, Wkbj79

305.3 center

The center of a group G is the subgroup of elements which commute with every other element.
Formally
Z(G) = {x ∈ G | xg = gx, ∀ g ∈ G}

It can be shown that the center has the following properties

• It is non-empty since it contains at least the identity element

• It consists of those conjugacy classes containing just one element

• The center of an abelian group is the entire group

• It is normal in G

• Every p-group has a non-trivial center

Version: 5 Owner: vitriol Author(s): vitriol

305.4 characteristic subgroup

If (G, ∗) is a group, then a subgroup H is a characteristic subgroup of G (H char G) if every
automorphism of G maps H to itself. That is:

∀f ∈ Aut(G) ∀h ∈ H : f(h) ∈ H

or, equivalently:

∀f ∈ Aut(G) : f[H] = H

A few properties of characteristic subgroups:

(a) If H char G then H is a normal subgroup of G

(b) If G has only one subgroup of a given size then that subgroup is characteristic

(c) If K char H and H ⊴ G then K ⊴ G (contrast with the fact that normality of subgroups is not transitive)

(d) If K char H and H char G then K char G

Proofs of these properties:

(a) Consider H char G under the inner automorphisms of G. Since every automorphism
preserves H, in particular every inner automorphism preserves H, and therefore g ∗
h ∗ g^{-1} ∈ H for any g ∈ G and h ∈ H. This is precisely the definition of a normal
subgroup.

(b) Suppose H is the only subgroup of G of order n. In general, homomorphisms take
subgroups to subgroups, and of course isomorphisms take subgroups to subgroups of
the same order. But since there is only one subgroup of G of order n, any automorphism
must take H to H, and so H char G.

(c) Take K char H and H ⊴ G, and consider the inner automorphisms of G (automorphisms
of the form h ↦ g ∗ h ∗ g^{-1} for some g ∈ G). These all preserve H, and so restrict to
automorphisms of H. But any automorphism of H preserves K, so for any g ∈ G and
k ∈ K, g ∗ k ∗ g^{-1} ∈ K.

(d) Let K char H and H char G, and let φ be an automorphism of G. Since H char G,
φ[H] = H, so φ|_H, the restriction of φ to H, is an automorphism of H. Since K char H,
φ|_H[K] = K. But φ|_H is just a restriction of φ, so φ[K] = K. Hence K char G.

Version: 1 Owner: Henry Author(s): Henry

305.5 class function

Given a field K, a K–valued class function on a group G is a function f : G −→ K such
that f (g) = f (h) whenever g and h are elements of the same conjugacy class of G.

An important example of a class function is the character of a group representation. Over
the complex numbers, the characters of the irreducible representations of G form a
basis for the vector space of all C–valued class functions when G is a finite group; when G is a
compact Lie group, they form an orthonormal basis of the Hilbert space of square-integrable
class functions.

Relation to the convolution algebra. Class functions are also known as central functions,
because they correspond to functions f in the convolution algebra C*(G) that have
the property f ∗ g = g ∗ f for all g ∈ C*(G) (i.e., they commute with everything under the
convolution operation). More precisely, the set of measurable complex valued class functions
f is equal to the set of central elements of the convolution algebra C*(G), for G a
locally compact group admitting a Haar measure.

Version: 5 Owner: djao Author(s): djao

305.6 conjugacy class

Two elements g and g′ of a group G are said to be conjugate if there exists h ∈ G such that
g′ = h g h^{-1}. Conjugacy of elements is an equivalence relation, and the equivalence classes of
G are called conjugacy classes.

Two subsets S and T of G are said to be conjugate if there exists g ∈ G such that

T = {g s g^{-1} | s ∈ S} ⊂ G.

In this situation, it is common to write gSg^{-1} for T to denote the fact that everything in T
has the form g s g^{-1} for some s ∈ S. We say that two subgroups of G are conjugate if they
are conjugate as subsets.

Version: 2 Owner: djao Author(s): djao

305.7 conjugacy class formula

The conjugacy classes of a group form a partition of its elements. In a finite group, this
means that the order of the group is the sum of the numbers of elements of the distinct
conjugacy classes. For an element g of a group G, we denote the conjugacy class of g by C_g
and the normalizer in G of g by N_G(g); for a single element this is the set of all h ∈ G with
h g h^{-1} = g, also known as the centralizer of g. The number of elements in C_g equals [G : N_G(g)], the
index of the normalizer of g in G. For an element g of the center Z(G) of G, the conjugacy
class of g is the singleton {g}. Putting this together gives us the conjugacy class
formula

|G| = |Z(G)| + Σ_{i=1}^{m} [G : N_G(x_i)]

where x_1, . . . , x_m are representatives, one from each, of the distinct conjugacy classes contained in G − Z(G).
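
As a sanity check, the formula can be verified numerically for the symmetric group on four symbols. This is only a sketch under the assumption that permutations are stored as tuples of images; the helper names are illustrative. The function centralizer computes the set of h with h g h^{-1} = g, which is what the normalizer of a single element amounts to.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_class(g, group):
    """All conjugates h g h^{-1} with h ranging over `group`."""
    return frozenset(compose(compose(h, g), inverse(h)) for h in group)

def centralizer(g, group):
    """All h in `group` that commute with g."""
    return [h for h in group if compose(h, g) == compose(g, h)]

G = list(permutations(range(4)))
classes = {conjugacy_class(g, G) for g in G}

# Singleton classes are exactly the central elements.
center_size = sum(1 for c in classes if len(c) == 1)
noncentral = [c for c in classes if len(c) > 1]

# Index of the centralizer of one representative per non-central class.
indices = sorted(len(G) // len(centralizer(next(iter(c)), G)) for c in noncentral)
class_equation_rhs = center_size + sum(indices)
```

Here the right-hand side comes out as 1 + (3 + 6 + 6 + 8), matching |G| = 24.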

Version: 3 Owner: lieven Author(s): lieven

305.8 conjugate stabilizer subgroups

Let · be a right group action of G on a set M. Then

G_{α·g} = g^{-1} G_α g

for any α ∈ M and g ∈ G, where G_α denotes the stabilizer subgroup of α ∈ M.

Proof:

x ∈ G_{α·g} ↔ α · (gx) = α · g ↔ α · (g x g^{-1}) = α ↔ g x g^{-1} ∈ G_α ↔ x ∈ g^{-1} G_α g

and therefore G_{α·g} = g^{-1} G_α g.

Thus all stabilizer subgroups for elements of the orbit G(α) of α are conjugate to G_α.

Version: 4 Owner: Thomas Heye Author(s): Thomas Heye

305.9 coset

Let H be a subgroup of a group G, and let a ∈ G. The left coset of a with respect to H in
G is defined to be the set
aH := {ah | h ∈ H}.
The right coset of a with respect to H in G is defined to be the set
Ha := {ha | h ∈ H}.
Two left cosets aH and bH of H in G are either identical or disjoint. Indeed, if c ∈ aH ∩ bH,
then c = a h_1 and c = b h_2 for some h_1, h_2 ∈ H, whence b^{-1} a = h_2 h_1^{-1} ∈ H. But then, given
any ah ∈ aH, we have ah = (b b^{-1}) a h = b (b^{-1} a) h ∈ bH, so aH ⊂ bH, and similarly
bH ⊂ aH. Therefore aH = bH.

Similarly, any two right cosets Ha and Hb of H in G are either identical or disjoint. Accordingly,
the collection of left cosets (or right cosets) partitions the group G; the corresponding
equivalence relation for left cosets can be described succinctly by the relation a ∼ b if
a^{-1} b ∈ H, and for right cosets by a ∼ b if a b^{-1} ∈ H.

The index of H in G, denoted [G : H], is the cardinality of the set G/H of left cosets of H
in G.
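
The partition into cosets and the index can be computed directly for a small example. The sketch below assumes G is the symmetric group on {0, 1, 2} and H is the two-element subgroup generated by the transposition of 0 and 1; the function names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_coset(a, H):
    """The left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in H)

def right_coset(H, a):
    """The right coset Ha as a frozenset."""
    return frozenset(compose(h, a) for h in H)

G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 0, 2)]   # identity and the transposition swapping 0 and 1

# Collecting cosets in a set removes repeats: aH = bH is counted once.
left_cosets = {left_coset(a, H) for a in G}
right_cosets = {right_coset(H, a) for a in G}
index = len(left_cosets)     # [G : H]
```

In this example both families partition G into three blocks of size two, but the two partitions differ, which reflects the fact that this H is not normal in G.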

Version: 5 Owner: djao Author(s): rmilson, djao

305.10 cyclic group

A group G is said to be cyclic if it is generated entirely by some x ∈ G. That is, if G has
infinite order then every g ∈ G can be expressed as x^k with k ∈ Z. If G has finite order
then every g ∈ G can be expressed as x^k with k ∈ N_0, and G has exactly φ(|G|) generators,
where φ is the Euler totient function.

It is a corollary of Lagrange’s theorem that every group of prime order is cyclic. All cyclic
groups of the same order are isomorphic to each other. Consequently cyclic groups of order
n are often denoted by C_n. Every cyclic group is abelian.

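Examples of cyclic groups are (Z_m, +_m) and (Z_p^*, ×_p), where p is prime. More generally, the
group (R_m, ×_m) with R_m = {n ∈ N : (n, m) = 1, n ≤ m} is cyclic exactly when m is 1, 2, 4, p^k
or 2p^k for an odd prime p; for example, R_8 = {1, 3, 5, 7} is not cyclic, since every element squares to 1.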

Version: 10 Owner: yark Author(s): yark, Larry Hammick, vitriol

305.11 derived subgroup

Let G be a group and a, b ∈ G. The group element a b a^{-1} b^{-1} is called the commutator of a
and b. An element of G is called a commutator if it is the commutator of some a, b ∈ G.

The subgroup of G generated by all the commutators in G is called the derived subgroup
of G, and also the commutator subgroup. It is commonly denoted by G′ and also by G^(1).
Alternatively, one may define G′ as the smallest subgroup that contains all the commutators.

Note that the commutator of a, b ∈ G is trivial, i.e.

a b a^{-1} b^{-1} = 1

if and only if a and b commute. Thus, in a fashion, the derived subgroup measures the degree
to which a group fails to be abelian.

Proposition 1. The derived subgroup G′ is normal in G, and the factor group G/G′ is
abelian. Indeed, G is abelian if and only if G′ is the trivial subgroup.

One can of course form the derived subgroup of the derived subgroup; this is called the
second derived subgroup, and denoted by G″ or by G^(2). Proceeding inductively one defines
the nth derived subgroup G^(n) as the derived subgroup of G^(n−1). In this fashion one obtains a
sequence of subgroups, called the derived series of G:

G = G^(0) ⊇ G^(1) ⊇ G^(2) ⊇ . . .

Proposition 2. The group G is solvable if and only if the derived series terminates in the
trivial group {1} after a finite number of steps. In this case, if G is finite, one can refine the derived series
to obtain a composition series (a.k.a. a Jordan-Hölder decomposition) of G.
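
For small permutation groups the derived series can be computed by brute force, which gives a concrete test of solvability. The sketch below assumes permutations stored as tuples of images; generated_subgroup, derived_subgroup and derived_series are illustrative helper names, and the loop stops as soon as the series stabilizes.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, degree):
    """Closure of `generators` under composition (the generated subgroup, since everything is finite)."""
    identity = tuple(range(degree))
    elements = {identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for a in frontier:
            for s in generators:
                b = compose(a, s)
                if b not in elements:
                    elements.add(b)
                    next_frontier.append(b)
        frontier = next_frontier
    return sorted(elements)

def derived_subgroup(G):
    """Subgroup generated by all commutators a b a^{-1} b^{-1}."""
    degree = len(G[0])
    commutators = {compose(compose(a, b), compose(inverse(a), inverse(b)))
                   for a in G for b in G}
    return generated_subgroup(sorted(commutators), degree)

def derived_series(G):
    """List G^(0), G^(1), ... until two consecutive terms coincide."""
    series = [sorted(G)]
    while True:
        nxt = derived_subgroup(series[-1])
        if len(nxt) == len(series[-1]):
            return series
        series.append(nxt)

orders_S4 = [len(H) for H in derived_series(list(permutations(range(4))))]
orders_S5 = [len(H) for H in derived_series(list(permutations(range(5))))]
```

For the symmetric group on four symbols the orders are 24, 12, 4, 1, so the series reaches the trivial group; for five symbols it stalls at order 60, the alternating group, which is therefore not solvable.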

Version: 4 Owner: rmilson Author(s): rmilson

305.12 equivariant

Let G be a group, and X and Y left (resp. right) homogeneous spaces of G. Then a map
f : X → Y is called equivariant if g(f (x)) = f (gx) (resp. (f (x))g = f (xg)) for all g ∈ G.

Version: 1 Owner: bwebste Author(s): bwebste

305.13 examples of finite simple groups


All groups considered here are finite.

It is now widely believed that the classification of all finite simple groups up to isomorphism
is finished. The proof runs for at least 10,000 printed pages, and as of the writing of this
entry, has not yet been published in its entirety.

Abelian groups

• The simplest examples of simple groups are the cyclic groups of prime order. It is
not difficult to see (say, by Cauchy’s theorem) that these are the only abelian simple
groups.

Alternating groups

• The alternating group on n symbols is the set of all even permutations in S_n, the
symmetric group on n symbols. It is usually denoted by A_n, or sometimes by Alt(n).
This is a normal subgroup of S_n, namely the kernel of the homomorphism that sends
every even permutation to 1 and every odd permutation to −1. Because every permutation
is either even or odd, and there is a bijection between the two kinds (multiply
every even permutation by a fixed transposition), the index of A_n in S_n is 2. A_3 is simple
because it has only three elements, and the simplicity of A_n for n ≥ 5 can be proved
by an elementary argument. The simplicity of the alternating groups is an important
fact that Évariste Galois required in order to prove the insolubility by radicals of the
general polynomial of degree higher than four.

Groups of Lie type

• Projective special linear groups

• Other groups of Lie type.

Sporadic groups There are twenty-six sporadic groups (no more, no less!) that do not
fit into any of the infinite sequences of simple groups considered above. These often arise as
the group of automorphisms of strongly regular graphs.

• Mathieu groups.

• Janko groups.

• The baby monster.

• The monster.

Version: 8 Owner: drini Author(s): bbukh, yark, NeuRet

305.14 finitely generated group

A group G is finitely generated if there is a finite subset X ⊆ G such that X generates G.
That is, every element of G is a product of elements of X and inverses of elements of X. Or,
equivalently, no proper subgroup of G contains X.

Every finite group is finitely generated, as we can take X = G. Every finitely generated
group is countable.

Version: 6 Owner: yark Author(s): yark, nerdy2

305.15 first isomorphism theorem

If f : G → H is a homomorphism of groups (or rings, or modules), then it induces an
isomorphism G/ker f ≅ im f.

Version: 2 Owner: nerdy2 Author(s): nerdy2

305.16 fourth isomorphism theorem


1: X group

2: N ⊴ X

3: A set of subgroups of X that contain N

4: B set of subgroups of X/N

5: ∃ϕ : A → B bijection : ∀Y, Z ≤ X : (N ≤ Y & N ≤ Z) ⇒ { (Y ≤ Z ⇔ Y/N ≤ Z/N)
& (Z ≤ Y ⇒ [Y : Z] = [Y/N : Z/N]) & (⟨Y, Z⟩/N = ⟨Y/N, Z/N⟩) & ((Y ∩ Z)/N =
Y/N ∩ Z/N) & (Y ⊴ X ⇔ Y/N ⊴ X/N) }

Note: This is a “seed” entry written using a short-hand format described in this FAQ.

Version: 2 Owner: bwebste Author(s): yark, apmxi

305.17 generator

If G is a cyclic group and g ∈ G, then g is a generator of G if ⟨g⟩ = G.

All infinite cyclic groups have exactly 2 generators. Let G be an infinite cyclic group and
g be a generator of G. Let z ∈ Z be such that g^z is a generator of G. Then ⟨g^z⟩ = G. Since
g ∈ G, we have g ∈ ⟨g^z⟩. Thus, there exists n ∈ Z with g = (g^z)^n = g^{nz}. Thus, g^{nz−1} = e_G.
Since G is infinite, |g| = |⟨g⟩| = |G| must be infinite, so nz − 1 = 0. Since nz = 1 and
n and z are integers, n = z = 1 or n = z = −1. It follows that the only generators of
G are g and g^{-1}.

A finite cyclic group of order n has exactly ϕ(n) generators, where ϕ is the Euler totient function.
Let G be a finite cyclic group of order n and g be a generator of G. Then |g| = |⟨g⟩| = |G| = n.
Let z ∈ Z be such that g^z is a generator of G. By the division algorithm, there exist q, r ∈ Z with
0 ≤ r < n such that z = qn + r. Thus, g^z = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r = (e_G)^q g^r = e_G g^r = g^r.
Since g^r is a generator of G, we have ⟨g^r⟩ = G. Thus, n = |G| = |⟨g^r⟩| = |g^r| = |g| / gcd(r, |g|) = n / gcd(r, n).
Therefore, gcd(r, n) = 1. Conversely, the same formula shows that g^r is a generator whenever
gcd(r, n) = 1, and the result follows.
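
The count of generators can be checked by brute force in the additive group of integers modulo n, where the generator g corresponds to 1 and g^r to the residue r. This is a small sketch; the function names generators_of_Zn and euler_phi are illustrative.

```python
from math import gcd

def generators_of_Zn(n):
    """Residues r whose multiples cover all of Z_n (additive notation)."""
    full = set(range(n))
    return [r for r in range(n) if {(k * r) % n for k in range(n)} == full]

def euler_phi(n):
    """Euler totient: number of r in 1..n with gcd(r, n) = 1."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)

gens_12 = generators_of_Zn(12)   # [1, 5, 7, 11]
```

The brute-force list agrees with the residues coprime to n, as the argument above predicts.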

Version: 3 Owner: Wkbj79 Author(s): Wkbj79

305.18 group actions and homomorphisms

Notes on group actions and homomorphisms. Let G be a group, X a non-empty set and S_X
the symmetric group of X, i.e. the group of all bijective maps on X. Let · denote a left
group action of G on X.

1. For each g ∈ G we define
f_g : X → X, f_g(x) = g · x ∀ x ∈ X.
Since f_{g^{-1}}(f_g(x)) = g^{-1} · (g · x) = x ∀ x ∈ X, and likewise f_g(f_{g^{-1}}(x)) = x, the map f_{g^{-1}} is
the inverse of f_g, so f_g is bijective and
thus an element of S_X. We define F : G → S_X, F(g) = f_g for all g ∈ G. This mapping
is a group homomorphism: Let g, h ∈ G, x ∈ X. Then
F(gh)(x) = f_{gh}(x) = (gh) · x = g · (h · x) = (f_g ◦ f_h)(x) = (F(g) ◦ F(h))(x)
implies F(gh) = F(g) ◦ F(h). The analogous statement holds for a right group action, with
the order of composition reversed.
2. Now let F : G → S_X be a group homomorphism. Then the map f : G × X → X, (g, x) ↦
F(g)(x) satisfies
(a) f(1_G, x) = F(1_G)(x) = x ∀ x ∈ X and
(b) f(gh, x) = F(gh)(x) = (F(g) ◦ F(h))(x) = F(g)(F(h)(x)) = f(g, f(h, x)),
so f is a group action induced by F.

Characterization of group actions

Let G be a group acting on a set X. Using the same notation as above, we have for each
g ∈ G

g ∈ ker(F) ↔ F(g) = id_X ↔ g · x = x ∀x ∈ X ↔ g ∈ ⋂_{x∈X} G_x (305.18.1)

and it follows that

ker(F) = ⋂_{x∈X} G_x.

Let G act transitively on X. Then for any x ∈ X, X is the orbit G(x) of x. As shown in
“conjugate stabilizer subgroups”, all stabilizer subgroups of elements y ∈ G(x) are conjugate subgroups
to G_x in G. From the above it follows that

ker(F) = ⋂_{g∈G} g G_x g^{-1}.

For a faithful operation of G the condition g · x = x ∀x ∈ X → g = 1_G is equivalent to
ker(F) = {1_G}
and therefore F : G → S_X is a monomorphism.

For the trivial operation of G on X given by g · x = x ∀g ∈ G the stabilizer subgroup G_x is
G for all x ∈ X, and thus
ker(F) = G.
The corresponding homomorphism is g ↦ id_X ∀g ∈ G.

If the operation of G on X is free, then G_x = {1_G} ∀ x ∈ X, thus the kernel of F is {1_G}, as
for a faithful operation. The converse fails:

Let X = {1, . . . , n} with n ≥ 3 and G = S_n. Then the operation of G on X given by
π · i := π(i) ∀i ∈ X, π ∈ S_n
is faithful but not free.
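
The last example can be checked mechanically for n = 3. In the sketch below the points are relabelled 0, 1, 2 for convenience (an assumption of this illustration), and the kernel is computed as the set of permutations lying in every stabilizer, i.e. fixing every point.

```python
from itertools import permutations

X = range(3)
G = list(permutations(X))   # S_3 acting on X by pi . i = pi[i]
identity = tuple(X)

def stabilizer(x):
    """All group elements fixing the point x."""
    return {g for g in G if g[x] == x}

# The kernel of the action: elements that fix every point of X.
kernel = set.intersection(*(stabilizer(x) for x in X))

is_faithful = kernel == {identity}
is_free = all(stabilizer(x) == {identity} for x in X)
```

Each stabilizer here has two elements, so the action is not free, while only the identity lies in all three stabilizers, so it is faithful.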

Version: 5 Owner: Thomas Heye Author(s): Thomas Heye

305.19 group homomorphism

Let (G, ∗_G) and (K, ∗_K) be two groups. A group homomorphism is a function φ : G → K
such that φ(s ∗_G t) = φ(s) ∗_K φ(t) for all s, t ∈ G.

The composition of group homomorphisms is again a homomorphism.

Let φ : G → K be a group homomorphism. Then

• φ(e_G) = e_K where e_G and e_K are the respective identity elements for G and K.
• φ(g)^{-1} = φ(g^{-1}) for all g ∈ G
• φ(g)^z = φ(g^z) for all g ∈ G and for all z ∈ Z

The kernel of φ is a subgroup of G and its image is a subgroup of K.

Some special homomorphisms have special names. If φ : G → K is injective, we say that φ is
a monomorphism, and if φ is onto we call it an epimorphism. When φ is both injective and
surjective (that is, bijective) we call it an isomorphism. In the latter case we also say that G
and K are isomorphic, meaning they are basically the same group (have the same structure).
A homomorphism from G to itself is called an endomorphism, and if it is bijective, then it is
called an automorphism.

Version: 15 Owner: drini Author(s): saforres, drini

305.20 homogeneous space

Overview and definition. Let G be a group acting transitively on a set X. In other
words, we consider a homomorphism φ : G → Perm(X), where the latter denotes the group
of all bijections of X. If we consider G as being, in some sense, the automorphisms of X,
the transitivity assumption means that it is impossible to distinguish a particular element
of X from any other element. Since the elements of X are indistinguishable, we call X a
homogeneous space. Indeed, the concept of a homogeneous space is logically equivalent
to the concept of a transitive group action.

Action on cosets. Let G be a group, H < G a subgroup, and let G/H denote the set of
left cosets, as above. For every g ∈ G we consider the mapping ψ_H(g) : G/H → G/H with
action
aH ↦ gaH, a ∈ G.
Proposition 3. The mapping ψ_H(g) is a bijection. The corresponding mapping ψ_H : G →
Perm(G/H) is a group homomorphism, specifying a transitive group action of G on G/H.

Thus, G/H has the natural structure of a homogeneous space. Indeed, we shall see that
every homogeneous space X is isomorphic to G/H, for some subgroup H.

N.B. In geometric applications, we want the homogeneous space X to have some extra
structure, like a topology or a differential structure. Correspondingly, the group of automorphisms
is either a continuous group or a Lie group. In order for the quotient space X to
have a Hausdorff topology, we need to assume that the subgroup H is closed in G.

The isotropy subgroup and the basepoint identification. Let X be a homogeneous
space. For x ∈ X, the subgroup
H_x = {h ∈ G : hx = x},
consisting of all elements of G that fix x, is called the isotropy subgroup at the basepoint x. We
identify the space of cosets G/H_x with the homogeneous space by means of the mapping
τ_x : G/H_x → X, defined by
τ_x(aH_x) = ax, a ∈ G.
Proposition 4. The above mapping is a well-defined bijection.

To show that τ_x is well defined, let a, b ∈ G be members of the same left coset, i.e. there
exists an h ∈ H_x such that b = ah. Consequently
bx = a(hx) = ax,
as desired. The mapping τ_x is onto because the action of G on X is assumed to be transitive.
To show that τ_x is one-to-one, consider two cosets aH_x, bH_x, a, b ∈ G such that ax = bx.
It follows that b^{-1}a fixes x, and hence is an element of H_x. Therefore aH_x and bH_x are the
same coset.

The homogeneous space as a quotient. Next, let us show that τ_x is equivariant relative
to the action of G on X and the action of G on the quotient G/H_x.
Proposition 5. We have that
φ(g) ◦ τ_x = τ_x ◦ ψ_{H_x}(g)
for all g ∈ G.

To prove this, let g, a ∈ G be given, and note that
ψ_{H_x}(g)(aH_x) = gaH_x.
The latter coset corresponds under τ_x to the point gax, as desired.

Finally, let us note that τ_x identifies the point x ∈ X with the coset of the identity element
eH_x, that is to say, with the subgroup H_x itself. For this reason, the point x is often called
the basepoint of the identification τ_x : G/H_x → X.

The choice of basepoint. Next, we consider the effect of the choice of basepoint on the
quotient structure of a homogeneous space. Let X be a homogeneous space.
Proposition 6. The set of all isotropy subgroups {H_x : x ∈ X} forms a single conjugacy class
of subgroups in G.

To show this, let x_0, x_1 ∈ X be given. By the transitivity of the action we may choose a
ĝ ∈ G such that x_1 = ĝx_0. Hence, for all h ∈ G satisfying hx_0 = x_0, we have
(ĝhĝ^{-1})x_1 = ĝ(h(ĝ^{-1}x_1)) = ĝ(hx_0) = ĝx_0 = x_1.
Similarly, for all h ∈ H_{x_1} we have that ĝ^{-1}hĝ fixes x_0. Therefore,
ĝ(H_{x_0})ĝ^{-1} = H_{x_1};
or what is equivalent, for all x ∈ X and g ∈ G we have
gH_x g^{-1} = H_{gx}.

Equivariance. Since we can identify a homogeneous space X with G/H_x for every possible
x ∈ X, it stands to reason that there exist equivariant bijections between the different G/H_x.
To describe these, let H_0, H_1 < G be conjugate subgroups with
H_1 = ĝH_0ĝ^{-1}
for some fixed ĝ ∈ G. Let us set
X = G/H_0,
and let x_0 denote the identity coset H_0, and x_1 the coset ĝH_0. What is the subgroup of G
that fixes x_1? In other words, what are all the h ∈ G such that
hĝH_0 = ĝH_0,
or what is equivalent, all h ∈ G such that
ĝ^{-1}hĝ ∈ H_0.
The collection of all such h is precisely the subgroup H_1. Hence, τ_{x_1} : G/H_1 → G/H_0 is the
desired equivariant bijection. This is a well defined mapping from the set of H_1-cosets to
the set of H_0-cosets, with action given by
τ_{x_1}(aH_1) = aĝH_0, a ∈ G.

Let ψ_0 : G → Perm(G/H_0) and ψ_1 : G → Perm(G/H_1) denote the corresponding coset
G-actions.
Proposition 7. For all g ∈ G we have that
τ_{x_1} ◦ ψ_1(g) = ψ_0(g) ◦ τ_{x_1}.
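
The basepoint identification between cosets of the isotropy subgroup and points of the space can be tested on a small transitive action. The sketch below assumes the symmetric group on {0, 1, 2, 3} acting on those four points with basepoint 0; the dictionary tau plays the role of the mapping from cosets to points, and all names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

X = range(4)
G = list(permutations(X))                 # acts transitively on X by g . i = g[i]
x = 0                                     # chosen basepoint
H_x = [h for h in G if h[x] == x]         # isotropy subgroup at x

def coset(a):
    """The left coset a H_x as a frozenset."""
    return frozenset(compose(a, h) for h in H_x)

# Build tau: coset a H_x -> point a . x, checking that it does not depend on a.
tau = {}
well_defined = True
for a in G:
    c = coset(a)
    if c in tau and tau[c] != a[x]:
        well_defined = False
    tau[c] = a[x]

is_bijection = len(tau) == len(X) and set(tau.values()) == set(X)

# Equivariance: tau(g . aH_x) equals g . tau(aH_x) for all g and a.
is_equivariant = all(tau[coset(compose(g, a))] == g[tau[coset(a)]]
                     for g in G for a in G)
```

With 24 group elements and an isotropy subgroup of order 6, there are four cosets, one for each point.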

Version: 3 Owner: rmilson Author(s): rmilson

305.21 identity element

Let G be a groupoid, that is, a set with a binary operation G × G → G, written multiplicatively
so that (x, y) ↦ xy.

An identity element for G is an element e such that ge = eg = g for all g ∈ G.

The symbol e is most commonly used for identity elements. Another common symbol for
an identity element is 1, particularly in semigroup theory (and ring theory, considering the
multiplicative structure as a semigroup).

Groups, monoids, and loops are classes of groupoids that, by definition, always have an
identity element.

Version: 6 Owner: mclase Author(s): mclase, vypertd, imran

305.22 inner automorphism

Let G be a group. For every x ∈ G, we define a mapping

φ_x : G → G, y ↦ x y x^{-1}, y ∈ G,

called conjugation by x. It is easy to show that the conjugation map is, in fact, a group automorphism.

An automorphism of G that corresponds to the conjugation by some x ∈ G is called inner.
An automorphism that isn’t inner is called an outer automorphism.

The composition operation gives the set of all automorphisms of G the structure of a group,
Aut(G). The inner automorphisms also form a group, Inn(G), which is a normal subgroup
of Aut(G). Indeed, if φ_x, x ∈ G, is an inner automorphism and π : G → G an arbitrary
automorphism, then
π ◦ φ_x ◦ π^{-1} = φ_{π(x)}.
Let us also note that the mapping

x ↦ φ_x, x ∈ G

is a surjective group homomorphism from G onto Inn(G) with kernel Z(G), the centre subgroup. Consequently,
Inn(G) is naturally isomorphic to the quotient G/Z(G).
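
One consequence for finite groups is that the number of distinct inner automorphisms equals |G| divided by the order of the centre. The sketch below checks this for the eight symmetries of a square, encoded as permutations of the vertices 0..3; the encoding and helper names are assumptions of this illustration.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Symmetries of a square: r is a quarter turn, f a reflection.
e = (0, 1, 2, 3)
r = (1, 2, 3, 0)
f = (0, 3, 2, 1)
rotations = [e]
for _ in range(3):
    rotations.append(compose(r, rotations[-1]))
D4 = sorted(rotations + [compose(rot, f) for rot in rotations])

def conjugation(x, group):
    """The map y -> x y x^{-1}, recorded as the tuple of its values on `group`."""
    x_inv = inverse(x)
    return tuple(compose(compose(x, y), x_inv) for y in group)

# Two elements give the same inner automorphism iff their value tuples agree.
inner_automorphisms = {conjugation(x, D4) for x in D4}
centre = [x for x in D4 if all(compose(x, g) == compose(g, x) for g in D4)]
```

Here the centre has order 2 and there are 4 distinct conjugation maps, in line with |Inn(G)| = |G| / |Z(G)|.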

Version: 7 Owner: rmilson Author(s): rmilson, tensorking

305.23 kernel

Let ρ : G → K be a group homomorphism. The preimage of the codomain identity element
e_K ∈ K forms a subgroup of the domain G, called the kernel of the homomorphism:
ker(ρ) = {s ∈ G | ρ(s) = e_K}

The kernel is a normal subgroup. It is the trivial subgroup if and only if ρ is a monomorphism.

Version: 9 Owner: rmilson Author(s): rmilson, Daume

305.24 maximal

Let G be a group. A subgroup H of G is said to be maximal if H ≠ G and whenever K is
a subgroup of G with H ⊆ K ⊆ G then K = H or K = G.

Version: 1 Owner: Evandar Author(s): Evandar

305.25 normal subgroup

A subgroup H of a group G is normal if aH = Ha for all a ∈ G. Equivalently, H ⊂ G is
normal if and only if aHa^{-1} = H for all a ∈ G, i.e., if and only if each conjugacy class of G
is either entirely inside H or entirely outside H.

The notation H ⊴ G or H ◁ G is often used to denote that H is a normal subgroup of G.

The kernel ker(f) of any group homomorphism f : G → G′ is a normal subgroup of G.
More surprisingly, the converse is also true: any normal subgroup H ⊂ G is the kernel of
some homomorphism (one of these being the projection map ρ : G → G/H, where G/H is
the quotient group).

Version: 6 Owner: djao Author(s): djao

305.26 normality of subgroups is not transitive

Let G be a group. Obviously, a subgroup K ≤ H of a subgroup H ≤ G of G is a subgroup
K ≤ G of G. It seems plausible that a similar situation would also hold for normal subgroups.

This is not true. Even when K ⊴ H and H ⊴ G, it is possible that K ⋬ G. Here are two
examples:

1. Let G be the group of orientation-preserving isometries of the plane R^2 (G consists of just
the rotations and translations), let H be the subgroup of G of translations, and let K
be the subgroup of H of integer translations τ_{i,j}(x, y) = (x + i, y + j), where i, j ∈ Z.
Any element g ∈ G may be represented as g = r_1 ◦ t_1 = t_2 ◦ r_2, where r_1, r_2 are rotations
and t_1, t_2 are translations. So for any translation t ∈ H we may write

g^{-1} ◦ t ◦ g = r^{-1} ◦ t′ ◦ r,

where t′ ∈ H is some other translation and r is some rotation. But this is an orientation-preserving
isometry of the plane that does not rotate, so it too must be a translation.
Thus g^{-1}Hg ⊆ H for every g ∈ G, and H ⊴ G.
H is an abelian group, so all its subgroups, K included, are normal.
We claim that K ⋬ G. Indeed, if ρ ∈ G is rotation by 45° about the origin, then
ρ^{-1} ◦ τ_{1,0} ◦ ρ is not an integer translation.

2. A related example uses finite groups. Let G = D_4 be the dihedral group with eight
elements (the group of automorphisms of the graph of the square C_4). Then

D_4 = ⟨r, f | f^2 = 1, r^4 = 1, fr = r^{-1}f⟩

is generated by r, rotation, and f, flipping.

The subgroup
H = ⟨rf, fr⟩ = {1, rf, r^2, fr} ≅ C_2 × C_2
is isomorphic to the Klein 4-group – an identity and 3 elements of order 2. H ⊴ G
since [G : H] = 2. Finally, take

K = ⟨rf⟩ = {1, rf} ⊴ H.

We claim that K ⋬ G. And indeed,

f ◦ rf ◦ f = fr ∉ K.
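
The finite example can be verified mechanically. The sketch below realizes r and f as permutations of the vertices 0..3 of a square (one possible encoding, chosen for this illustration) and tests normality by checking closure under conjugation; is_normal is an illustrative helper name.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(K, G):
    """True if g k g^{-1} lies in K for all g in G and k in K."""
    K = set(K)
    return all(compose(compose(g, k), inverse(g)) in K for g in G for k in K)

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)          # rotation by a quarter turn
f = (0, 3, 2, 1)          # flip fixing vertices 0 and 2

rotations = [e]
for _ in range(3):
    rotations.append(compose(r, rotations[-1]))
G = rotations + [compose(rot, f) for rot in rotations]   # all eight symmetries

rf = compose(r, f)
fr = compose(f, r)
r2 = compose(r, r)
H = [e, rf, r2, fr]
K = [e, rf]

H_normal_in_G = is_normal(H, G)
K_normal_in_H = is_normal(K, H)
K_normal_in_G = is_normal(K, G)
```

With this encoding the first two checks succeed and the third fails, the failure being witnessed by conjugating rf with f.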

Version: 4 Owner: ariels Author(s): ariels

305.27 normalizer

Let G be a group, and let H ⊆ G. The normalizer of H in G, written N_G(H), is the set

{g ∈ G | gHg^{-1} = H}

If H is a subgroup of G, then N_G(H) is a subgroup of G containing H.

Note that H is a normal subgroup of N_G(H); in fact, N_G(H) is the largest subgroup of
G of which H is a normal subgroup. In particular, if H is a normal subgroup of G, then
N_G(H) = G.

Version: 6 Owner: saforres Author(s): saforres

305.28 order (of a group)

The order of a group G is the number of elements of G, denoted |G|; if |G| is finite, then G
is said to be a finite group.

The order of an element g ∈ G is the smallest positive integer n such that g^n = e, where e
is the identity element; if there is no such n, then g is said to be of infinite order.

Version: 5 Owner: saforres Author(s): saforres

305.29 presentation of a group

A presentation of a group G is a description of G in terms of generators and relations. We
say that the group is finitely presented if it can be described in terms of a finite number
of generators and a finite number of defining relations. A collection of group elements
g_i ∈ G, i ∈ I, is said to generate G if every element of G can be specified as a product of the
g_i and of their inverses. A relation is a word over the alphabet consisting of the generators
g_i and their inverses, with the property that it multiplies out to the identity in G. A set of
relations r_j, j ∈ J, is said to be defining if all relations in G can be given as a product of
the r_j, their inverses, and the G-conjugates of these.

The standard notation for the presentation of a group is
G = ⟨g_i | r_j⟩,
meaning that G is generated by generators g_i, subject to relations r_j. Equivalently, one has
a short exact sequence of groups
1 → N → F[I] → G → 1,
where F[I] denotes the free group generated by the g_i, and where N is the smallest normal subgroup
containing all the r_j. By the Nielsen-Schreier theorem, the kernel N is itself a free group, and
hence we assume without loss of generality that there are no relations among the relations.

Example. The symmetric group on n elements 1, . . . , n admits the following finite presentation
(Note: this presentation is not canonical. Other presentations are known.) As
generators take
g_i = (i, i + 1), i = 1, . . . , n − 1,
the transpositions of adjacent elements. As defining relations take
(g_i g_j)^{n_{i,j}} = id, i, j = 1, . . . , n − 1,
where
n_{i,i} = 1
n_{i,i+1} = 3
n_{i,j} = 2, i + 1 < j.
This means that a finite symmetric group is a Coxeter group.
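
That the adjacent transpositions satisfy these relations can be checked directly: the product of two generators has order 1, 3 or 2 according to whether their indices are equal, adjacent, or further apart. The sketch below does this for n = 5 with 0-based indices (an assumption of this illustration). It only confirms that the relations hold, not that they are defining.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def adjacent_transposition(i, n):
    """Permutation of range(n) swapping positions i and i + 1 (0-based)."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

def order(p):
    """Smallest k >= 1 such that p^k is the identity."""
    identity = tuple(range(len(p)))
    k, q = 1, p
    while q != identity:
        q = compose(p, q)
        k += 1
    return k

def coxeter_exponent(i, j):
    """Exponent n_{i,j}: 1 if i = j, 3 if the indices are adjacent, 2 otherwise."""
    if i == j:
        return 1
    if abs(i - j) == 1:
        return 3
    return 2

n = 5
gens = [adjacent_transposition(i, n) for i in range(n - 1)]
relations_hold = all(
    order(compose(gens[i], gens[j])) == coxeter_exponent(i, j)
    for i in range(n - 1) for j in range(n - 1)
)
```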

Version: 11 Owner: rmilson Author(s): rmilson

305.30 proof of first isomorphism theorem

Let K denote ker f. K is a normal subgroup of G because, by the following calculation,
gkg^{-1} ∈ K for all g ∈ G and k ∈ K (rules of homomorphism imply the first equality,
definition of K the second):
f(gkg^{-1}) = f(g)f(k)f(g)^{-1} = f(g) 1_H f(g)^{-1} = 1_H
Therefore, G/K is well defined.

Define a map θ : G/K → im f by:
θ(gK) = f(g)

We argue that θ is an isomorphism.

First, θ is well defined. Take two representatives, g_1 and g_2, of the same coset. Then
g_1 g_2^{-1} is in K (since K is normal, left and right cosets agree). Hence, f sends g_1 g_2^{-1} to 1
(all elements of K are sent by f to 1). Consequently, the next calculation is valid:
f(g_1)f(g_2)^{-1} = f(g_1 g_2^{-1}) = 1, but this is the
same as saying that f(g_1) = f(g_2). And we are done because the last equality indicates that
θ(g_1 K) is equal to θ(g_2 K).

Reversing the last argument, we get that θ is also an injection: if θ(g_1 K) is equal to
θ(g_2 K) then f(g_1) = f(g_2) and hence g_1 g_2^{-1} ∈ K (exactly as in the previous part), which implies
an equality between g_1 K and g_2 K.

Now, θ is a homomorphism. We need to show that θ(g_1 K · g_2 K) = θ(g_1 K)θ(g_2 K) and that
θ((gK)^{-1}) = (θ(gK))^{-1}. And indeed:
θ(g_1 K · g_2 K) = θ(g_1 g_2 K) = f(g_1 g_2) = f(g_1)f(g_2) = θ(g_1 K)θ(g_2 K)

θ((gK)^{-1}) = θ(g^{-1} K) = f(g^{-1}) = (f(g))^{-1} = (θ(gK))^{-1}

To conclude, θ is surjective. Take h to be an element of im f and g a pre-image of h under f. Since
h = f(g), we have that h is also the image of gK under θ.
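
The steps of the proof can be traced on a concrete homomorphism. The sketch below assumes G is the additive group of integers modulo 12 and f is multiplication by 3, a choice made purely for illustration; theta_map is the induced map from cosets of the kernel to the image.

```python
n = 12
G = range(n)

def f(x):
    """Homomorphism Z_12 -> Z_12 given by multiplication by 3."""
    return (3 * x) % n

kernel = [g for g in G if f(g) == 0]
image = sorted({f(g) for g in G})

def coset(g):
    """The coset g + K as a frozenset (additive notation)."""
    return frozenset((g + k) % n for k in kernel)

# Collect every value f takes on each coset; well-definedness means one value each.
values_on_coset = {}
for g in G:
    values_on_coset.setdefault(coset(g), set()).add(f(g))

well_defined = all(len(values) == 1 for values in values_on_coset.values())
theta_map = {c: next(iter(values)) for c, values in values_on_coset.items()}

injective = len(set(theta_map.values())) == len(theta_map)
surjective = sorted(theta_map.values()) == image
homomorphism = all(
    theta_map[coset((a + b) % n)] == (theta_map[coset(a)] + theta_map[coset(b)]) % n
    for a in G for b in G
)
```

In this example the kernel is {0, 4, 8}, the image is {0, 3, 6, 9}, and the four cosets of the kernel correspond one-to-one with the four elements of the image.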

Version: 3 Owner: uriw Author(s): uriw

305.31 proof of second isomorphism theorem

First, we shall prove that HK is a subgroup of G: Since e ∈ H and e ∈ K, clearly e = e^2 ∈
HK. Take h_1, h_2 ∈ H, k_1, k_2 ∈ K. Clearly h_1 k_1, h_2 k_2 ∈ HK. Further,

h_1 k_1 h_2 k_2 = h_1 (h_2 h_2^{-1}) k_1 h_2 k_2 = h_1 h_2 (h_2^{-1} k_1 h_2) k_2

Since K is a normal subgroup of G and h_2 ∈ G, we have h_2^{-1} k_1 h_2 ∈ K. Therefore
h_1 h_2 (h_2^{-1} k_1 h_2) k_2 ∈ HK, so HK is closed under multiplication.

Also, (hk)^{-1} ∈ HK for h ∈ H, k ∈ K, since

(hk)^{-1} = k^{-1} h^{-1} = h^{-1} (h k^{-1} h^{-1})

and h k^{-1} h^{-1} ∈ K since K is a normal subgroup of G. So HK is closed under inverses, and
is thus a subgroup of G.

Since HK is a subgroup of G, the normality of K in HK follows immediately from the
normality of K in G.
Clearly H ∩ K is a subgroup of G, since it is the intersection of two subgroups of G.

Finally, define φ : H → HK/K by φ(h) = hK. We claim that φ is a surjective homomorphism
from H to HK/K. It is a homomorphism because φ(h_1 h_2) = h_1 h_2 K = (h_1 K)(h_2 K). For
surjectivity, let h_0 k_0 K be some element of HK/K; since k_0 ∈ K, we have h_0 k_0 K = h_0 K,
and φ(h_0) = h_0 K. Now

ker(φ) = {h ∈ H | φ(h) = K} = {h ∈ H | hK = K}

and if hK = K, then we must have h ∈ K. So

ker(φ) = {h ∈ H | h ∈ K} = H ∩ K

Thus, since φ(H) = HK/K and ker φ = H ∩ K, by the first isomorphism theorem we see
that H ∩ K is normal in H and that there is a natural isomorphism between H/(H ∩ K)
and HK/K.

Version: 8 Owner: saforres Author(s): saforres

305.32 proof that all cyclic groups are abelian

Following is a proof that all cyclic groups are abelian.

Let G be a cyclic group and g be a generator of G. Let a, b ∈ G. Then there exist x, y ∈ Z
such that a = g^x and b = g^y. Since ab = g^x g^y = g^{x+y} = g^{y+x} = g^y g^x = ba, it follows that G
is abelian.

Version: 2 Owner: Wkbj79 Author(s): Wkbj79

305.33 proof that all cyclic groups of the same order
are isomorphic to each other

The following is a proof that all cyclic groups of the same order are isomorphic to each other.

Let G be a cyclic group and g be a generator of G. Define ϕ : Z → G by ϕ(c) = g^c. Since
ϕ(a + b) = g^{a+b} = g^a g^b = ϕ(a)ϕ(b), ϕ is a group homomorphism. If h ∈ G, then there
exists x ∈ Z such that h = g^x. Since ϕ(x) = g^x = h, ϕ is surjective.

ker ϕ = {c ∈ Z | ϕ(c) = e_G} = {c ∈ Z | g^c = e_G}

If G is infinite, then ker ϕ = {0}, and ϕ is injective. Hence, ϕ is a group isomorphism, and
G ≅ Z.

If G is finite, then let |G| = n. Thus, |g| = |⟨g⟩| = |G| = n. We have g^c = e_G if and only if n divides c.
Therefore, ker ϕ = nZ. By the first isomorphism theorem, G ≅ Z/nZ ≅ Z_n.

Let H and K be cyclic groups of the same order. If H and K are infinite, then, by the
above argument, H ≅ Z and K ≅ Z. If H and K are finite of order n, then, by the above
argument, H ≅ Z_n and K ≅ Z_n. In any case, it follows that H ≅ K.

Version: 1 Owner: Wkbj79 Author(s): Wkbj79

305.34 proof that all subgroups of a cyclic group are
cyclic

Following is a proof that all subgroups of a cyclic group are cyclic.

Let G be a cyclic group and H ≤ G. If G is trivial, then H = G, and H is cyclic. If H is
the trivial subgroup, then H = {eG } = heG i, and H is cyclic. Thus, for the remainder of the
proof, it will be assumed that both G and H are nontrivial.

1274
Let g be a generator of G. Let n be the smallest positive integer such that g n ∈ H.

Claim: H = ⟨g^n⟩

Let a ∈ ⟨g^n⟩. Then there exists z ∈ Z with a = (g^n)^z. Since g^n ∈ H, then (g^n)^z ∈ H. Thus,
a ∈ H. Hence, ⟨g^n⟩ ⊆ H.

Let h ∈ H. Then h ∈ G. Let x ∈ Z with h = g^x. By the division algorithm, there exist
q, r ∈ Z with 0 ≤ r < n such that x = qn + r. Since h = g^x = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r, then
g^r = (g^n)^{−q} h. Since h, g^n ∈ H, then g^r ∈ H. By choice of n, r cannot be positive. Thus,
r = 0. Therefore, h = (g^n)^q g^0 = (g^n)^q e_G = (g^n)^q ∈ ⟨g^n⟩. Hence, H ⊆ ⟨g^n⟩.

Since ⟨g^n⟩ ⊆ H and H ⊆ ⟨g^n⟩, then H = ⟨g^n⟩. It follows that every subgroup of G is cyclic.
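
The argument above can be checked by brute force in a small case. The following is a minimal sketch, assuming the cyclic group is modelled as Z_n under addition mod n (so "g^x" becomes the residue x); the helper names `generated_subgroup`, `cyclic_generator`, and `all_subgroups_cyclic` are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def generated_subgroup(n, gens):
    """Subgroup of (Z_n, +) generated by the residues in gens."""
    elems = {0}
    frontier = [0]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x + g) % n
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def cyclic_generator(n, subgroup):
    """Smallest positive residue in the subgroup (n itself if the subgroup is trivial)."""
    positives = [x for x in subgroup if x > 0]
    return min(positives) if positives else n

def all_subgroups_cyclic(n):
    """Check that every subgroup generated by one or two elements of Z_n
    is already generated by its smallest positive element."""
    for k in (1, 2):
        for gens in combinations(range(n), k):
            h = generated_subgroup(n, gens)
            d = cyclic_generator(n, h)
            if generated_subgroup(n, [d % n]) != h:
                return False
    return True
```

For instance, the subgroup of Z_12 generated by 8 and 6 is the set of even residues, and its smallest positive element 2 plays the role of g^n in the proof.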

Version: 3 Owner: Wkbj79 Author(s): Wkbj79

305.35 regular group action

Let G be a group acting on a set X. The action is called regular if for any pair α, β ∈ X
there exists exactly one g ∈ G such that g · α = β. (For a right group action it is defined
correspondingly.)

Version: 3 Owner: Thomas Heye Author(s): Thomas Heye

305.36 second isomorphism theorem

Let (G, ∗) be a group. Let H be a subgroup of G and let K be a normal subgroup of G.
Then

• HK := {h ∗ k | h ∈ H, k ∈ K} is a subgroup of G,

• K is a normal subgroup of HK,

• H ∩ K is a normal subgroup of H,

• There is a natural group isomorphism H/(H ∩ K) ≅ HK/K.

The same statement also holds in the category of modules over a fixed ring (where normality is
neither needed nor relevant), and indeed can be formulated so as to hold in any abelian category.

Version: 4 Owner: djao Author(s): djao

305.37 simple group

Let G be a group. G is said to be simple if the only normal subgroups of G are {1} and G
itself.

Version: 3 Owner: Evandar Author(s): Evandar

305.38 solvable group

A group G is solvable if it has a subnormal series

G = G_0 ⊃ G_1 ⊃ · · · ⊃ G_n = {1},

with each G_{i+1} normal in G_i, where all the quotient groups G_i/G_{i+1} are abelian.

Version: 4 Owner: djao Author(s): djao

305.39 subgroup

Definition:
Let (G, ∗) be a group and let K be a subset of G. Then K is a subgroup of G under
the same operation if K is a group by itself (with respect to ∗), that is:

• K is closed under the ∗ operation.

• There exists an identity element e ∈ K such that for all k ∈ K, k ∗ e = k = e ∗ k.

• For every k ∈ K there exists an inverse k^{−1} ∈ K such that k^{−1} ∗ k = e = k ∗ k^{−1}.

The subgroup is denoted likewise (K, ∗). We denote K being a subgroup of G by writing
K ≤ G.

Properties:

• The set {e} whose only element is the identity is a subgroup of any group. It is called
the trivial subgroup.

• Every group is a subgroup of itself.

• The null set {} is never a subgroup (since the definition of group states that the set
must be non-empty).

There is a very useful theorem that allows proving a given subset is a subgroup.

Theorem:
Let K be a nonempty subset of the group G. Then K is a subgroup of G if and only if
s, t ∈ K implies that st^{−1} ∈ K.

Proof: First we need to show that if K is a subgroup of G and s, t ∈ K, then st^{−1} ∈ K. This holds
because K is a group by itself: t^{−1} ∈ K, and K is closed under the operation.
Now suppose that for any s, t ∈ K ⊆ G we have st^{−1} ∈ K. We want to show that K is a
subgroup, which we will accomplish by proving that it satisfies the group axioms. (Associativity is inherited from G.)

Since K is nonempty, we may pick t ∈ K. Then tt^{−1} ∈ K by hypothesis, so the identity element is in K: e ∈ K.
(Existence of identity)

Now that we know e ∈ K, for all t in K we have that et^{−1} = t^{−1} ∈ K, so the inverses of
elements in K are also in K. (Existence of inverses)

Let s, t ∈ K. Then we know that t^{−1} ∈ K by the last step. Applying the hypothesis shows that
s(t^{−1})^{−1} = st ∈ K,
so K is closed under the operation. QED
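
The criterion can be compared against the three defining conditions on a small example. This is only a sketch, assuming the ambient group is (Z_n, +), where st^{−1} becomes s − t mod n; the function names are hypothetical.

```python
from itertools import combinations

def is_subgroup_by_criterion(n, subset):
    """One-step test in (Z_n, +): nonempty, and s - t lies in the subset for all s, t."""
    subset = set(subset)
    return bool(subset) and all((s - t) % n in subset for s in subset for t in subset)

def is_subgroup_by_axioms(n, subset):
    """Direct test: contains the identity 0, closed under addition and under negation."""
    subset = set(subset)
    if 0 not in subset:
        return False
    closed = all((s + t) % n in subset for s in subset for t in subset)
    inverses = all((-s) % n in subset for s in subset)
    return closed and inverses

def criterion_agrees_with_axioms(n):
    """Compare both tests on every subset of Z_n (only sensible for small n)."""
    elements = range(n)
    for size in range(n + 1):
        for subset in combinations(elements, size):
            if is_subgroup_by_criterion(n, subset) != is_subgroup_by_axioms(n, subset):
                return False
    return True
```

The empty subset fails both tests, which matches the nonemptiness hypothesis of the theorem.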

Example:

• Consider the group (Z, +). Show that (2Z, +) is a subgroup.
The subgroup is closed under addition since the sum of even integers is even.
The identity 0 of Z is also in 2Z since 2 divides 0. For every k ∈ 2Z there is a
−k ∈ 2Z which is the inverse under addition and satisfies −k + k = 0 = k + (−k).
Therefore (2Z, +) is a subgroup of (Z, +).
Another way to show (2Z, +) is a subgroup is by using the theorem stated above. If
s, t ∈ 2Z then s, t are even numbers and s − t ∈ 2Z since the difference of even numbers
is always an even number.

• Wikipedia, subgroup

Version: 7 Owner: Daume Author(s): Daume

305.40 third isomorphism theorem

If G is a group (or ring, or module) and H ⊂ K are normal subgroups (or ideals, or submodules)
of G, so that H is in particular normal (or an ideal, or a submodule) in K, then there is a natural isomorphism

(G/H)/(K/H) ≅ G/K.

It is not uncommon to see the second and third isomorphism theorems numbered in the opposite order.

Version: 2 Owner: nerdy2 Author(s): nerdy2

Chapter 306

20A99 – Miscellaneous

306.1 Cayley table

A Cayley table for a group is essentially the “multiplication table” of the group.[1] The
columns and rows of the table (or matrix) are labeled with the elements of the group, and
the cells represent the result of applying the group operation to the row-th and column-th
elements.

Formally, let G be our group, with group operation ◦. Let C be the Cayley
table for the group, with C(i, j) denoting the element at row i and column j. Then

C(i, j) = e_i ◦ e_j

where e_i is the ith element of the group, and e_j is the jth element.

Note that for an abelian group, we have e_i ◦ e_j = e_j ◦ e_i, hence the Cayley table is a
symmetric matrix.

All Cayley tables for isomorphic groups are isomorphic (that is, they are the same up to
relabeling and reordering of the group elements).
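
The definition C(i, j) = e_i ◦ e_j translates directly into code. The sketch below is illustrative only: `cayley_table` and `is_symmetric` are hypothetical helper names, and the group is passed in as a list of elements plus a binary operation.

```python
def cayley_table(elements, op):
    """Return the table C with C[i][j] = op(elements[i], elements[j])."""
    return [[op(a, b) for b in elements] for a in elements]

def is_symmetric(table):
    """True when the table equals its transpose, as happens for abelian groups."""
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))

# Integers modulo 4 under addition.
Z4 = [0, 1, 2, 3]
Z4_TABLE = cayley_table(Z4, lambda a, b: (a + b) % 4)
```

Since Z_4 is abelian, `is_symmetric(Z4_TABLE)` holds, in line with the remark on symmetric matrices.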

306.1.1 Examples.
• The Cayley table for Z_4, the group of integers modulo 4 (under addition), would be

        [0]  [1]  [2]  [3]
  [0]   [0]  [1]  [2]  [3]
  [1]   [1]  [2]  [3]  [0]
  [2]   [2]  [3]  [0]  [1]
  [3]   [3]  [0]  [1]  [2]

• The Cayley table for S_3, the symmetric group on 3 letters, is

          (1)    (123)  (132)  (12)   (13)   (23)
  (1)     (1)    (123)  (132)  (12)   (13)   (23)
  (123)   (123)  (132)  (1)    (13)   (23)   (12)
  (132)   (132)  (1)    (123)  (23)   (12)   (13)
  (12)    (12)   (23)   (13)   (1)    (132)  (123)
  (13)    (13)   (12)   (23)   (123)  (1)    (132)
  (23)    (23)   (13)   (12)   (132)  (123)  (1)

[1] A caveat to novices in group theory: multiplication is usually used notationally to represent the group
operation, but the operation needn’t resemble multiplication in the reals. Hence, you should take “multiplication
table” with a grain or two of salt.

Version: 6 Owner: akrowne Author(s): akrowne

306.2 proper subgroup

A group H is a proper subgroup of a group G if and only if H is a subgroup of G and

H ≠ G. (306.2.1)

Similarly a group H is an improper subgroup of a group G if and only if H is a subgroup
of G and
H = G. (306.2.2)

Version: 2 Owner: imran Author(s): imran

306.3 quaternion group

The quaternion group, or quaternionic group, is a noncommutative group with eight elements.
It is traditionally denoted by Q (not to be confused with ℚ) or by Q_8. This group
is defined by the presentation
⟨i, j; i^4, i^2 j^2, i j i^{−1} j⟩
or, equivalently, defined by the multiplication table

· 1 i j k −i −j −k −1
1 1 i j k −i −j −k −1
i i −1 k −j 1 −k j −i
j j −k −1 i k 1 −i −j
k k j −i −1 −j i 1 −k
−i −i 1 −k j −1 k −j i
−j −j k 1 −i −k −1 i j
−k −k −j i 1 j −i −1 k
−1 −1 −i −j −k i j k 1

where we have put each product xy into row x and column y. The minus signs are justified
by the fact that {1, −1} is a subgroup contained in the center of Q. Every subgroup of Q is
normal and, except for the trivial subgroup {1}, contains {1, −1}. The dihedral group D4
(the group of symmetries of a square) is the only other noncommutative group of order 8.

Since i^2 = j^2 = k^2 = −1, the elements i, j, and k are known as the imaginary units, by
analogy with i ∈ C. Any pair of distinct imaginary units generates the group. Better, given distinct
x, y ∈ {i, j, k}, any element of Q is expressible in the form x^m y^n.
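
The claim about the form x^m y^n can be checked numerically. The sketch below assumes quaternions are stored as integer 4-tuples (a, b, c, d) standing for a + bi + cj + dk and multiplied with the usual Hamilton product; the names `qmul`, `qpow`, and `words` are hypothetical.

```python
def qmul(a, b):
    """Hamilton product of two quaternions given as 4-tuples (1, i, j, k components)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)

ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

def qpow(x, m):
    """x multiplied by itself m times (m >= 0)."""
    result = ONE
    for _ in range(m):
        result = qmul(result, x)
    return result

def words(x, y):
    """All elements of the form x^m y^n with 0 <= m, n <= 3."""
    return {qmul(qpow(x, m), qpow(y, n)) for m in range(4) for n in range(4)}
```

With this encoding, `words(I, J)`, `words(J, K)`, and `words(K, I)` each contain exactly the eight elements ±1, ±i, ±j, ±k.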

Q is identified with the group of units (invertible elements) of the ring of quaternions over
Z. That ring is not identical to the group ring Z[Q], which has dimension 8 (not 4) over Z.
Likewise the usual quaternion algebra is not quite the same thing as the group algebra R[Q].

Quaternions were known to Gauss in 1819 or 1820, but he did not publicize this discovery,
and quaternions weren’t rediscovered until 1843, with Hamilton. For an excellent account of
this famous story, see http://math.ucr.edu/home/baez/Octonions/node1.html.

Version: 6 Owner: vernondalhart Author(s): vernondalhart, Larry Hammick, patrickwonders

Chapter 307

20B05 – General theory for finite
groups

307.1 cycle notation

The cycle notation is a useful convention for writing down a permutation in terms of its
constituent cycles. Let S be a finite set, and
a_1, . . . , a_k, k ≥ 2
distinct elements of S. The expression (a_1, . . . , a_k) denotes the cycle whose action is
a_1 ↦ a_2 ↦ a_3 ↦ . . . ↦ a_k ↦ a_1,
and which fixes every other element of S.
Note there are k different expressions for the same cycle; the following all represent the same
cycle:
(a_1, a_2, a_3, . . . , a_k) = (a_2, a_3, . . . , a_k, a_1) = . . . = (a_k, a_1, a_2, . . . , a_{k−1}).
Also note that a 1-element cycle is the same thing as the identity permutation, and thus
there is not much point in writing down such things. Rather, it is customary to express the
identity permutation simply as ().

Let π be a permutation of S, and let
S_1, . . . , S_k ⊂ S, k ∈ N
be the orbits of π with more than 1 element. For each j = 1, . . . , k let n_j denote the
cardinality of S_j. Also, choose an a_{1,j} ∈ S_j, and define
a_{i+1,j} = π(a_{i,j}), i ∈ N.
We can now express π as a product of disjoint cycles, namely
π = (a_{1,1}, . . . , a_{n_1,1})(a_{1,2}, . . . , a_{n_2,2}) . . . (a_{1,k}, . . . , a_{n_k,k}).
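
The construction above (pick a starting point in each orbit and follow π until the orbit closes) can be sketched as follows. The representation of a permutation as a Python dict from elements to images, and the name `cycle_decomposition`, are assumptions made for this illustration.

```python
def cycle_decomposition(perm):
    """Return the disjoint cycles of length >= 2 of a permutation.

    perm is a dict mapping each element of a finite set to its image.
    Fixed points are omitted, so the identity yields an empty list.
    """
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = perm[start]
        while nxt != start:          # follow the orbit of `start`
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles
```

For example, the permutation of {1, ..., 6} sending 1 ↦ 2 ↦ 3 ↦ 1, swapping 4 and 5, and fixing 6 decomposes as (1, 2, 3)(4, 5).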

By way of illustration, here are the 24 elements of the symmetric group on {1, 2, 3, 4} expressed
using the cycle notation, and grouped according to their conjugacy classes:

()
(12), (13), (14), (23), (24), (34)
(123), (132), (124), (142), (134), (143), (234), (243)
(12)(34), (13)(24), (14)(23)
(1234), (1243), (1324), (1342), (1423), (1432)

Version: 1 Owner: rmilson Author(s): rmilson

307.2 permutation group

A permutation group is a pair (G, X) where G is an abstract group, and X is a set on
which G acts faithfully. Alternatively, this can be thought of as a group G equipped with an
injective homomorphism into Sym(X), the symmetric group on X.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 308

20B15 – Primitive groups

308.1 primitive transitive permutation group

1: A finite set

2: G transitive permutation group on A

3: ∀B ⊂ A block ⇒ B = A or |B| = 1

example

1: S4 is a primitive transitive permutation group on {1, 2, 3, 4}

counterexample

1: D8 is not a primitive transitive permutation group on the vertices of a square

stabilizer maximal necessary and sufficient for primitivity

1: A finite set

2: G transitive permutation group on A

3: G primitive ⇔ ∀a ∈ A : H ≤ G & H ⊃ Stab_G(a) ⇒ H = G or H = Stab_G(a)

Note: This was a “seed” entry written using a short-hand format described in this FAQ.

Version: 4 Owner: Thomas Heye Author(s): yark, apmxi

Chapter 309

20B20 – Multiply transitive finite
groups

309.1 Jordan’s theorem (multiply transitive groups)

Let G be a sharply n-transitive permutation group, with n ≥ 4. Then

1. G is similar to S_n, S_{n+1}, or A_{n+2} with the standard action or

2. n = 4 and G is similar to M_11, the Mathieu group of degree 11 or

3. n = 5 and G is similar to M_12, the Mathieu group of degree 12.

Version: 1 Owner: bwebste Author(s): bwebste

309.2 multiply transitive

Let G be a group, X a set on which it acts. Let X^(n) be the set of ordered n-tuples of distinct
elements of X. This is a G-set by the diagonal action:

g · (x_1, . . . , x_n) = (g · x_1, . . . , g · x_n)

The action of G on X is said to be n-transitive if it acts transitively on X^(n).

For example, the standard action of S_n, the symmetric group, is n-transitive, and the standard
action of A_n, the alternating group, is (n − 2)-transitive.
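
For small n these statements can be verified by enumeration. The sketch below assumes a permutation of {0, ..., m−1} is stored as a tuple g with g[x] the image of x; the function names are hypothetical. It uses the fact that an action is transitive on X^(n) exactly when the orbit of a single n-tuple is all of X^(n).

```python
from itertools import permutations

def is_even(p):
    """Parity of a permutation tuple, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

def is_n_transitive(group, points, n):
    """True when the diagonal action on ordered n-tuples of distinct points is transitive."""
    tuples = list(permutations(points, n))
    base = tuples[0]
    orbit = {tuple(g[x] for x in base) for g in group}
    return orbit == set(tuples)

def is_sharply_n_transitive(group, points, n):
    """Transitive on n-tuples with exactly one group element per target tuple."""
    count = len(list(permutations(points, n)))
    return is_n_transitive(group, points, n) and len(group) == count

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if is_even(p)]
```

Here S4 is 4-transitive on four points, while A4 is 2-transitive but not 3-transitive, matching the (n − 2) bound for the alternating group.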

Version: 2 Owner: bwebste Author(s): bwebste

309.3 sharply multiply transitive

Let G be a group, and X a set that G acts on, and let X^(n) be the set of ordered n-tuples of
distinct elements of X. Then the action of G on X is sharply n-transitive if G acts regularly
on X^(n).

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 310

20B25 – Finite automorphism groups
of algebraic, geometric, or
combinatorial structures

310.1 diamond theory

Diamond theory is the theory of affine groups over GF(2) acting on small square and cubic
arrays. In the simplest case, the symmetric group of degree 4 acts on a two-colored diamond
figure like that in Plato’s Meno dialogue, yielding 24 distinct patterns, each of which has
some ordinary or color-interchange symmetry.

This can be generalized to (at least) a group of order approximately 1.3 trillion acting on a
4x4x4 array of cubes, with each of the resulting patterns still having nontrivial symmetry.

The theory has applications to finite geometry and to the construction of the large Witt
design underlying the Mathieu group of degree 24.

• “Diamond Theory,” http://m759.freeservers.com/

Version: 4 Owner: m759 Author(s): akrowne, m759

Chapter 311

20B30 – Symmetric groups

311.1 symmetric group

Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions
on X). Then the act of taking the composition of two permutations induces a group structure
on S(X). We call this group the symmetric group and it is often denoted Sym(X).

Version: 5 Owner: bwebste Author(s): bwebste, antizeus

311.2 symmetric group

Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions
on X). Then the act of taking the composition of two permutations induces a group structure
on S(X). We call this group the symmetric group and it is often denoted Sym(X).

When X has a finite number n of elements, we often refer to the symmetric group as Sn ,
and describe the elements by using cycle notation.

Version: 2 Owner: antizeus Author(s): antizeus

Chapter 312

20B35 – Subgroups of symmetric
groups

312.1 Cayley’s theorem

Let G be a group. Then G is isomorphic to a subgroup of the permutation group S_G.

If G is finite and of order n, then G is isomorphic to a subgroup of the permutation group
S_n.

Furthermore, suppose H is a proper subgroup of G. Let X = {Hg | g ∈ G} be the set of
right cosets of H in G. The map θ : G → S_X given by θ(x)(Hg) = Hgx^{−1} is a homomorphism. Its
kernel is the largest normal subgroup of G contained in H. We note that |S_X| = [G : H]!. Consequently, if
G is finite and |G| doesn’t divide [G : H]!, then θ is not injective, so H contains a non-trivial normal
subgroup of G, namely the kernel of θ.

Version: 4 Owner: vitriol Author(s): vitriol

Chapter 313

20B99 – Miscellaneous

313.1 (p, q) shuffle

Definition.
Let p and q be positive natural numbers. Further, let S(k) be the set of permutations of the
numbers {1, . . . , k}. A permutation τ ∈ S(p + q) is a (p, q) shuffle if
τ (1) < · · · < τ (p),
τ (p + 1) < · · · < τ (p + q).
The set of all (p, q) shuffles is denoted by S(p, q).

It is clear that S(p, q) ⊂ S(p + q). Since a (p, q) shuffle is completely determined by how
the p first elements are mapped, the cardinality of S(p, q) is \binom{p+q}{p}. The wedge product of a
p-form and a q-form can be defined as a sum over (p, q) shuffles.
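
The cardinality claim is easy to confirm by enumeration for small p and q. In this sketch a permutation τ of {1, ..., p+q} is stored as the tuple (τ(1), ..., τ(p+q)); the function name `shuffles` is hypothetical.

```python
from itertools import permutations
from math import comb

def shuffles(p, q):
    """List all (p, q) shuffles as tuples (tau(1), ..., tau(p+q))."""
    n = p + q
    result = []
    for tau in permutations(range(1, n + 1)):
        first, second = tau[:p], tau[p:]
        # Entries are distinct, so "sorted" is the same as strictly increasing.
        if list(first) == sorted(first) and list(second) == sorted(second):
            result.append(tau)
    return result
```

Comparing `len(shuffles(p, q))` with `comb(p + q, p)` reproduces the binomial count, since each shuffle is determined by the p-element set {τ(1), ..., τ(p)}.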

Version: 3 Owner: matte Author(s): matte

313.2 Frobenius group

A permutation group G on a set X is Frobenius if no non-trivial element of G fixes more
than one element of X. Generally, one also makes the restriction that at least one non-trivial
element fix a point. In this case the Frobenius group is called non-regular.

The stabilizer of any point in X is called a Frobenius complement, and has the remarkable
property that it intersects trivially with any conjugate by an element not in the subgroup. Conversely,
if any finite group G has such a subgroup, then the action on cosets of that subgroup makes
G into a Frobenius group.

Version: 2 Owner: bwebste Author(s): bwebste

313.3 permutation

A permutation of a set {a_1, a_2, . . . , a_n} is an arrangement of its elements. For example, if
S = {A, B, C} then ABC, CAB, CBA are three different permutations of S.

The number of permutations of a set with n elements is n!.

A permutation can also be seen as a bijective function of a set into itself. For example, the
permutation CAB could be seen as the function that assigns:

f (A) = C, f (B) = A, f (C) = B.

In fact, every bijection of a set into itself gives a permutation, and any permutation gives
rise to a bijective function.

Therefore, we can say that there are n! bijective functions from a set with n elements into
itself.

Using the function approach, it can be proved that any permutation can be expressed
as a composition of disjoint cycles and also as composition of (not necessarily disjoint)
transpositions.

Moreover, if σ = τ_1 τ_2 · · · τ_m = ρ_1 ρ_2 · · · ρ_n are two factorizations of a permutation σ into
transpositions, then m and n must be both even or both odd. So we can label permutations
as even or odd depending on the number of transpositions in any decomposition.

Permutations (as functions) of a set with n elements form a group with function composition as binary operation,
called the symmetric group on n letters; it is non-abelian for n ≥ 3. The subset of even permutations is a subgroup
called the alternating group on n letters.

Version: 3 Owner: drini Author(s): drini

313.4 proof of Cayley’s theorem

Let G be a group, and let S_G be the permutation group of the underlying set G. For each
g ∈ G, define ρ_g : G → G by ρ_g(h) = gh. Then ρ_g is invertible with inverse ρ_{g^{−1}}, and so is a
permutation of the set G.

Define Φ : G → S_G by Φ(g) = ρ_g. Then Φ is a homomorphism, since

(Φ(gh))(x) = ρ_{gh}(x) = ghx = ρ_g(hx) = (ρ_g ◦ ρ_h)(x) = ((Φ(g))(Φ(h)))(x).

And Φ is injective, since if Φ(g) = Φ(h) then ρ_g = ρ_h, so gx = hx for all x ∈ G, and so
g = h as required.

So Φ is an embedding of G into its own permutation group. If G is finite of order n, then
simply numbering the elements of G gives an embedding from G to S_n.
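
The embedding Φ can be written out concretely for a finite group. This is a sketch under stated assumptions: the group is given as a list of elements with a binary operation, each ρ_g is encoded as a tuple of indices after numbering the elements, and the names `left_regular` and `compose` are hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """Composition of index tuples: (s ∘ t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

def left_regular(elements, op):
    """Map each g to the permutation h -> g*h, written on indices 0..n-1."""
    index = {x: i for i, x in enumerate(elements)}
    return {g: tuple(index[op(g, h)] for h in elements) for g in elements}

# Example group: S_3, with elements stored as tuples and `compose` as the operation.
S3 = list(permutations(range(3)))
PHI = left_regular(S3, compose)
```

Here `PHI` sends each of the 6 elements of S_3 to a permutation of 6 indices, that is, to an element of S_6.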

Version: 2 Owner: Evandar Author(s): Evandar

Chapter 314

20C05 – Group rings of finite groups
and their modules

314.1 group ring

For any group G, the group ring Z[G] is defined to be the ring whose additive group is the
abelian group of formal integer linear combinations of elements of G, and whose multiplica-
tion operation is defined by multiplication in G, extended Z–linearly to Z[G].

More generally, for any ring R, the group ring of G over R is the ring R[G] whose additive
group is the abelian group of formal R–linear combinations of elements of G, i.e.:
R[G] := \left\{ \sum_{i=1}^{n} r_i g_i \;\middle|\; r_i ∈ R, g_i ∈ G \right\},

and whose multiplication operation is defined by R–linearly extending the group multiplica-
tion operation of G. In the case where K is a field, the group ring K[G] is usually called a
group algebra.
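
As an illustration of "extending the group multiplication linearly", the sketch below multiplies two elements of a group ring. It assumes an element is stored as a dict from group elements to coefficients; the name `group_ring_mul` and the choice of the cyclic group C_3 (written additively as residues mod 3) are made up for this example.

```python
def group_ring_mul(a, b, op):
    """Product in R[G]: (sum r_g g)(sum s_h h) = sum r_g s_h (g*h).

    a and b map group elements to coefficients; op is the group operation.
    Terms with zero coefficient are dropped from the result.
    """
    out = {}
    for g, r in a.items():
        for h, s in b.items():
            gh = op(g, h)
            out[gh] = out.get(gh, 0) + r * s
    return {g: c for g, c in out.items() if c != 0}

def c3_op(g, h):
    """Group operation of C_3, with the generator x written as 1 and x^2 as 2."""
    return (g + h) % 3

ONE_MINUS_X = {0: 1, 1: -1}        # 1 - x
NORM = {0: 1, 1: 1, 2: 1}          # 1 + x + x^2
```

In Z[C_3] the product (1 − x)(1 + x + x^2) equals 1 − x^3 = 0, so `group_ring_mul(ONE_MINUS_X, NORM, c3_op)` returns the empty dict; group rings of nontrivial finite groups have zero divisors.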

Version: 4 Owner: djao Author(s): djao

Chapter 315

20C15 – Ordinary representations and
characters

315.1 Maschke’s theorem

Let G be a finite group, and k a field of characteristic not dividing |G|. Then any representation
V of G over k is completely reducible.

We need only show that any subrepresentation has a complement, and the result follows
by induction.

Let V be a representation of G and W a subrepresentation. Let π : V → W be an arbitrary
projection, and let

π′(v) = \frac{1}{|G|} \sum_{g∈G} g^{−1} π(g v)

(the division by |G| is possible because the characteristic of k does not divide |G|).

This map is obviously G-equivariant, and is the identity on W, and its image is contained in
W, since W is invariant under G. Thus it is an equivariant projection to W, and its kernel
is a complement to W.
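
The averaging step can be seen in a tiny example. The sketch below assumes G = C_2 acting on Q^2 by swapping coordinates, W the line spanned by (1, 1), and π the (non-equivariant) projection (x, y) ↦ (x, x); matrices are 2×2 nested tuples and all helper names are hypothetical. Because the group elements here are permutation matrices, the inverse of g is taken to be its transpose.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def transpose(a):
    """Transpose of a 2x2 matrix; equals the inverse for permutation matrices."""
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def averaged_projection(group, pi):
    """Compute (1/|G|) * sum over g of g^{-1} pi g for a group of permutation matrices."""
    total = [[Fraction(0)] * 2 for _ in range(2)]
    for g in group:
        term = matmul(matmul(transpose(g), pi), g)
        for i in range(2):
            for j in range(2):
                total[i][j] += term[i][j]
    return tuple(tuple(x / len(group) for x in row) for row in total)

E = ((1, 0), (0, 1))            # identity
S = ((0, 1), (1, 0))            # swap of the two coordinates
C2 = [E, S]
PI = ((1, 0), (1, 0))           # (x, y) -> (x, x), a projection onto W
PI_AVG = averaged_projection(C2, PI)
```

The averaged map sends (x, y) to ((x+y)/2, (x+y)/2); it commutes with the swap, and its kernel, the line spanned by (1, −1), is an invariant complement to W.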

Version: 5 Owner: bwebste Author(s): bwebste

315.2 a representation which is not completely reducible

If G is a finite group, and k is a field whose characteristic does divide the order of the group,
then Maschke’s theorem fails. For example let V be the regular representation of G, which
can be thought of as functions from G to k, with the G action (g · ϕ)(g′) = ϕ(g^{−1} g′). Then
this representation is not completely reducible.

There is an obvious trivial subrepresentation W of V, consisting of the constant functions. I
claim that there is no complementary invariant subspace to this one. If W′ is such a subspace,
then there is a homomorphism ϕ : V → V/W′ ≅ k. Now consider the characteristic function
of the identity e ∈ G,

δ_e(g) = \begin{cases} 1 & g = e \\ 0 & g ≠ e \end{cases}

and ℓ = ϕ(δ_e) in V/W′. This is not zero since δ_e generates the representation V. By
G-equivariance, ϕ(δ_g) = ℓ for all g ∈ G. Since

η = \sum_{g∈G} η(g) δ_g

for all η ∈ V,

ϕ(η) = ℓ \left( \sum_{g∈G} η(g) \right).

Thus,

W′ = ker ϕ = \{ η ∈ V \mid \sum_{g∈G} η(g) = 0 \}.

But since the characteristic of the field k divides the order of G, W ≤ W′, and thus W′ could
not possibly be complementary to it.

For example, if G = C_2 = {e, f} then the invariant subspace of V is spanned by e + f. For
characteristics other than 2, e − f spans a complementary subspace, but over characteristic
2, these elements are the same.

Version: 1 Owner: bwebste Author(s): bwebste

315.3 orthogonality relations

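First orthogonality relations: Let χ_1, χ_2 be characters of representations V_1, V_2 of a finite group
G over a field k of characteristic 0. Then

(χ_1, χ_2) = \frac{1}{|G|} \sum_{g∈G} \overline{χ_1(g)} χ_2(g) = \dim(Hom_G(V_1, V_2)),

where \overline{χ_1(g)} denotes χ_1(g^{−1}) (the complex conjugate of χ_1(g) when k = C).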

First of all, consider the special case where V_1 = k with the trivial action of the group.
Then Hom_G(k, V_2) ≅ V_2^G, the fixed points. On the other hand, consider the map

φ = \frac{1}{|G|} \sum_{g∈G} g : V_2 → V_2

(with the sum in End(V_2)). Clearly, the image of this map is contained in V_2^G, and it is
the identity restricted to V_2^G. Thus, it is a projection with image V_2^G. Now, the rank of a
projection (over a field of characteristic 0) is its trace. Thus,

\dim_k Hom_G(k, V_2) = \dim V_2^G = tr(φ) = \frac{1}{|G|} \sum_{g∈G} χ_2(g)

which is exactly the orthogonality formula for V_1 = k.

which is exactly the orthogonality formula for V1 = k.

Now, in general, Hom(V_1, V_2) ≅ V_1^* ⊗ V_2 is a representation, and Hom_G(V_1, V_2) = (Hom(V_1, V_2))^G.
Since χ_{V_1^* ⊗ V_2} = \overline{χ_1} χ_2,

\dim_k Hom_G(V_1, V_2) = \dim_k (Hom(V_1, V_2))^G = \frac{1}{|G|} \sum_{g∈G} \overline{χ_1(g)} χ_2(g)

which is exactly the relation we desired.

In particular, if V_1, V_2 are irreducible, by Schur’s lemma

Hom_G(V_1, V_2) = \begin{cases} D & V_1 ≅ V_2 \\ 0 & V_1 ≇ V_2 \end{cases}

where D is a division algebra. In particular, non-isomorphic irreducible representations have
orthogonal characters. Thus, for any representation V, the multiplicities n_i in the unique
decomposition of V into the direct sum of irreducibles

V ≅ V_1^{⊕n_1} ⊕ · · · ⊕ V_m^{⊕n_m}

where V_i ranges over irreducible representations of G over k, can be determined in terms of
the character inner product:

n_i = \frac{(ψ, χ_i)}{(χ_i, χ_i)}

where ψ is the character of V and χ_i the character of V_i. In particular, representations over
a field of characteristic zero are determined by their character. Note: This is not true over
fields of positive characteristic.

If the field k is algebraically closed, the only finite-dimensional division algebra over k is k itself, so the
characters of irreducible representations form an orthonormal basis for the vector space of
class functions with respect to this inner product. Since (χ_i, χ_i) = 1 for all irreducibles, the
multiplicity formula above reduces to n_i = (ψ, χ_i).

Second orthogonality relations: We assume now that k is algebraically closed. Let g, g′ be
elements of a finite group G. Then

\sum_χ \overline{χ(g)} χ(g′) = \begin{cases} |C_G(g)| & g ∼ g′ \\ 0 & g ≁ g′ \end{cases}

where the sum is over the characters of irreducible representations, and C_G(g) is the centralizer
of g.

Let χ_1, . . . , χ_n be the characters of the irreducible representations, and let g_1, . . . , g_n be
representatives of the conjugacy classes.

Let A be the matrix whose ijth entry is \sqrt{|G : C_G(g_j)|} χ_i(g_j). By first orthogonality,
AA^* = |G|I (here ∗ denotes conjugate transpose), where I is the identity matrix. Since left
inverses are right inverses, A^*A = |G|I. Thus,

\sum_{j=1}^{n} \sqrt{|G : C_G(g_i)| |G : C_G(g_k)|} \overline{χ_j(g_i)} χ_j(g_k) = |G| δ_{ik}.

Replacing g_i or g_k with any conjugate will not change the expression above. Thus, if our
two elements are not conjugate, we obtain that Σ_χ \overline{χ(g)} χ(g′) = 0. On the other hand, if
g ∼ g′, then i = k in the sum above, which reduces to the expression we desired.

A special case of this result, applied to 1, is that |G| = Σ_χ χ(1)^2, that is, the sum of the
squares of the dimensions of the irreducible representations of any finite group is the order
of the group.
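
Both sets of relations can be checked on the character table of S_3. The table used below (trivial, sign, and 2-dimensional standard character, on the classes of the identity, the transpositions, and the 3-cycles) is a standard fact supplied here as an assumption rather than taken from the text; the names `inner` and `column_sum` are hypothetical. All values are real, so complex conjugation plays no role in this example.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
ORDER = sum(CLASS_SIZES)  # |S_3| = 6

CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}

def inner(chi, psi):
    """(chi, psi) = (1/|G|) * sum over g of chi(g) psi(g), grouped by conjugacy class."""
    return Fraction(sum(s * a * b for s, a, b in zip(CLASS_SIZES, chi, psi)), ORDER)

def column_sum(i, k):
    """Sum over irreducible characters of chi(g_i) chi(g_k) for class representatives g_i, g_k."""
    return sum(chi[i] * chi[k] for chi in CHARACTERS.values())
```

The rows are orthonormal under `inner`, and `column_sum(i, i)` returns the centralizer order 6 / CLASS_SIZES[i], while distinct columns give 0. In particular `column_sum(0, 0)` is 1 + 1 + 4 = 6, the order of the group.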

Version: 8 Owner: bwebste Author(s): bwebste

Chapter 316

20C30 – Representations of finite
symmetric groups

316.1 example of immanent

If χ = 1 we obtain the permanent. If χ = sgn we obtain the determinant.

Version: 1 Owner: gholmes74 Author(s): gholmes74

316.2 immanent

Let χ : S_n → C be a complex character. For any n × n matrix A define

Imm_χ(A) = \sum_{σ∈S_n} χ(σ) \prod_{j=1}^{n} A(j, σj).

Functions obtained in this way are called immanents.
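
The defining sum can be evaluated directly for small matrices. The sketch below assumes A is a list of rows, a permutation σ of {0, ..., n−1} is a tuple, and χ is passed in as a Python function on such tuples; the names `sign` and `immanent` are hypothetical. The sum has n! terms, so this is only practical for small n.

```python
from itertools import permutations

def sign(sigma):
    """Sign character: +1 for even permutations, -1 for odd ones (by counting inversions)."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def immanent(A, chi):
    """Sum over sigma in S_n of chi(sigma) * prod_j A[j][sigma[j]]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        term = chi(sigma)
        for j in range(n):
            term *= A[j][sigma[j]]
        total += term
    return total
```

Taking χ to be the constant function 1 gives the permanent, and taking χ = `sign` gives the determinant; for A = [[1, 2], [3, 4]] these are 1·4 + 2·3 and 1·4 − 2·3 respectively.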

Version: 4 Owner: gholmes74 Author(s): gholmes74

316.3 permanent

The permanent of an n × n matrix A over C is the number

per(A) = \sum_{σ∈S_n} \prod_{j=1}^{n} A(j, σj)

Version: 2 Owner: gholmes74 Author(s): gholmes74

Chapter 317

20C99 – Miscellaneous

317.1 Frobenius reciprocity

Let V be a finite-dimensional representation of a finite group G, and let W be a represen-
tation of a subgroup H ⊂ G. Then the characters of V and W satisfy the inner product
relation
(χInd(W ) , χV ) = (χW , χRes(V ) )
where Ind and Res denote the induced representation Ind_H^G and the restriction representation
Res_H^G.

The Frobenius reciprocity theorem is often given in the stronger form which states that
Res and Ind are adjoint functors between the category of G–modules and the category of
H–modules:
HomH (W, Res(V )) = HomG (Ind(W ), V ),
or, equivalently
V ⊗ Ind(W ) = Ind(Res(V ) ⊗ W ).

Version: 4 Owner: djao Author(s): rmilson, djao

317.2 Schur’s lemma

Schur’s lemma in representation theory is an almost trivial observation for irreducible modules,
but deserves respect because of its profound applications and implications.

Lemma 5 (Schur’s lemma). Let G be a finite group represented on irreducible G-modules
V and W . Any G-module homomorphism f : V → W is either invertible or the zero map.

The only insight here is that both ker f and im f are G-submodules of V and W respectively.
This is routine. However, because V is irreducible, ker f is either trivial or all of V.
In the former case, f is injective and im f is a nonzero submodule of W, hence all of W because W is
irreducible, so f is invertible. In the latter case, f is the zero map.

The following corollary is a very useful form of Schur’s lemma, in case that our representations
are over an algebraically closed field.
Corollary 1. If G is represented over an algebraically closed field F on a finite-dimensional irreducible G-module
V, then any G-module homomorphism f : V → V is a scalar.

The insight in this case is to consider the module V as a vector space over F. Notice
then that the homomorphism f is a linear transformation and therefore has an eigenvalue λ
in our algebraically closed F. Hence, f − λ1 is not invertible. By Schur’s lemma, f − λ1 = 0.
In other words, f = λ, a scalar.

Version: 14 Owner: rmilson Author(s): rmilson, NeuRet

317.3 character

Let ρ : G −→ GL(V ) be a finite dimensional representation of a group G (i.e., V is a
finite dimensional vector space over its scalar field K). The character of ρ is the function
χV : G −→ K defined by
χV (g) := Tr(ρ(g))
where Tr is the trace function.

Properties:

• χV (g) = χV (h) if g is conjugate to h in G. (Equivalently, a character is a class function
on G.)

• If G is finite, the characters of the irreducible representations of G over the complex numbers
form a basis of the vector space of all class functions on G (with pointwise addition
and scalar multiplication).

• Over the complex numbers, the characters of the irreducible representations of G are
orthonormal under the inner product

(χ_1, χ_2) := \frac{1}{|G|} \sum_{g∈G} \overline{χ_1(g)} χ_2(g)

Version: 4 Owner: djao Author(s): djao

317.4 group representation

Let G be a group, and let V be a vector space. A representation of G in V is a group homomorphism
ρ : G −→ GL(V ) from G to the general linear group GL(V ) of invertible linear transformations
of V .

Equivalently, a representation of G is a vector space V which is a (left) module over the
group ring Z[G]. The equivalence is achieved by assigning to each homomorphism ρ : G −→
GL(V ) the module structure whose scalar multiplication is defined by g · v := (ρ(g))(v), and
extending linearly.

Special kinds of representations (preserving all notation from above)

A representation is faithful if either of the following equivalent conditions is satisfied:

• ρ : G −→ GL(V ) is injective

• V is a faithful left Z[G]–module

A subrepresentation of V is a subspace W of V which is a left Z[G]–submodule of V ; or,
equivalently, a subspace W of V with the property that

(ρ(g))(w) ∈ W for all w ∈ W.

A representation V is called irreducible if it has no subrepresentations other than itself and
the zero module.

Version: 2 Owner: djao Author(s): djao

317.5 induced representation

Let G be a group, H ⊂ G a subgroup, and V a representation of H, considered as a Z[H]–module.
The induced representation of V on G, denoted Ind_H^G(V), is the Z[G]–module whose
underlying vector space is the direct sum

\bigoplus_{σ∈G/H} σV

of formal translates of V by left cosets σ in G/H, and whose multiplication operation is
defined by choosing a set {g_σ}_{σ∈G/H} of coset representatives and setting

g(σv) := τ(hv)

where τ is the unique left coset of G/H containing g · g_σ (i.e., such that g · g_σ = g_τ · h for
some h ∈ H).

One easily verifies that the representation Ind_H^G(V) is independent of the choice of coset
representatives {g_σ}.

Version: 1 Owner: djao Author(s): djao

317.6 regular representation

Given a group G, the regular representation of G over a field K is the representation
ρ : G −→ GL(K^G) whose underlying vector space K^G is the K–vector space of formal
linear combinations of elements of G, defined by

ρ(g) \left( \sum_{i=1}^{n} k_i g_i \right) := \sum_{i=1}^{n} k_i (g g_i)

for k_i ∈ K, g, g_i ∈ G.

Equivalently, the regular representation is the induced representation on G of the trivial
representation on the subgroup {1} of G.

Version: 2 Owner: djao Author(s): djao

317.7 restriction representation

Let ρ : G −→ GL(V) be a representation of a group G. The restriction representation of ρ
to a subgroup H of G, denoted Res_H^G(V), is the representation ρ|_H : H −→ GL(V) obtained
by restricting the function ρ to the subset H ⊂ G.

Version: 1 Owner: djao Author(s): djao

Chapter 318

20D05 – Classification of simple and
nonsolvable groups

318.1 Burnside p − q theorem

If a finite group G is not solvable, the order of G is divisible by at least 3 distinct primes.
Alternatively, any finite group whose order is divisible by only two distinct primes is solvable
(these two distinct primes are the p and q of the title).

Version: 2 Owner: bwebste Author(s): bwebste

318.2 classification of semisimple groups

For every semisimple group G there is a normal subgroup H of G (called the centerless completely
reducible radical) which is isomorphic to a direct product of nonabelian simple groups
such that conjugation on H gives an injection into Aut(H). Thus G is isomorphic to a
subgroup of Aut(H) containing the inner automorphisms, and for every group H isomorphic
to a direct product of non-abelian simple groups, every such subgroup is semisimple.

Version: 1 Owner: bwebste Author(s): bwebste

318.3 semisimple group

A group G is called semisimple if it has no nontrivial normal solvable subgroups. Every group
is an extension of a semisimple group by a solvable one.

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 319

groups

319.1 Janko groups

The Janko groups denoted by J_1, J_2, J_3, and J_4 are four of the 26 sporadic groups. They were
discovered by Z. Janko in 1966 and published in the article “A new finite simple group with
abelian Sylow subgroups and its characterization.” (Journal of Algebra, 1966, 32: 147-186).

Each of these groups has very intricate matrix representations as maps into large general linear groups.
For example, the matrix K corresponding to J_4 gives a representation of J_4 in GL_112(2).

Version: 7 Owner: mathcam Author(s): mathcam, Thomas Heye

Chapter 320

20D10 – Solvable groups, theory of
formations, Schunck classes, Fitting
classes, π-length, ranks

320.1 Čuhinin’s Theorem

Let G be a finite, π-separable group, for some set π of primes. Then if H is a maximal
π-subgroup of G, the index of H in G, |G : H|, is coprime to all elements of π and all such
subgroups are conjugate. Such a subgroup is called a Hall π-subgroup. For π = {p}, this
essentially reduces to the Sylow theorems (with unnecessary hypotheses).

If G is solvable, it is π-separable for all π, so such subgroups exist for all π. This result is
often called Hall’s theorem.

Version: 4 Owner: bwebste Author(s): bwebste

320.2 separable

Let π be a set of primes. A finite group G is called π-separable if there exists a composition series

$\{1\} = G_0 \lhd \cdots \lhd G_n = G$

such that each $G_{i+1}/G_i$ is a π-group or a π′-group. π-separability can be thought of as a
generalization of solvability; a group is π-separable for all sets of primes π if and only if it is solvable.

Version: 3 Owner: bwebste Author(s): bwebste

320.3 supersolvable group

A group G is supersolvable if it has a finite normal series

$G = G_0 \rhd G_1 \rhd \cdots \rhd G_n = 1$

with the property that each factor group $G_{i-1}/G_i$ is cyclic.

A supersolvable group is solvable.

Finitely generated nilpotent groups are supersolvable.

Version: 1 Owner: mclase Author(s): mclase

Chapter 321

20D15 – Nilpotent groups, p-groups

321.1 Burnside basis theorem

If G is a p-group, then $\operatorname{Frat} G = G'G^p$, where Frat G is the Frattini subgroup, $G'$ the
commutator subgroup, and $G^p$ the subgroup generated by p-th powers.

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 322

20D20 – Sylow subgroups, Sylow
properties, π-groups, π-structure

322.1 π-groups and π′-groups

Let π be a set of primes. A finite group G is called a π-group if all the primes dividing |G|
are elements of π, and a π′-group if none of them are. Typically, if π is a singleton π = {p},
we write p-group and p′-group for these.

Version: 2 Owner: bwebste Author(s): bwebste

322.2 p-subgroup

Let G be a finite group with order n, and let p be a prime integer. We can write $n = p^k m$
for some integers k ≥ 0 and m such that p and m are coprime (that is, $p^k$ is the highest power of p
that divides n). Any subgroup of G whose order is $p^k$ is called a Sylow p-subgroup or simply
p-subgroup.

While there is no a priori reason for such subgroups to exist in a given finite group, the fact is that
every finite group G has one for every prime p that divides |G|. This statement is the First
Sylow theorem. When $|G| = p^k$ we simply say that G is a p-group.

Version: 2 Owner: drini Author(s): drini, apmxi

322.3 Burnside normal complement theorem

Let G be a finite group, and S a Sylow subgroup such that $C_G(S) = N_G(S)$. Then S has a
normal complement. That is, there exists a normal subgroup $N \lhd G$ such that $S \cap N = \{1\}$
and SN = G.

Version: 1 Owner: bwebste Author(s): bwebste

322.4 Frattini argument

If H is a normal subgroup of a finite group G, and S is a Sylow subgroup of H, then G =
HNG (S), where NG (S) is the normalizer of S in G.

Version: 1 Owner: bwebste Author(s): bwebste

322.5 Sylow p-subgroup

If (G, ∗) is a group then any subgroup of order $p^a$ for some integer a ≥ 0 is called a p-subgroup.
If $|G| = p^a m$, where $p \nmid m$, then any subgroup S of G with $|S| = p^a$ is a Sylow p-subgroup.
We use $\mathrm{Syl}_p(G)$ for the set of Sylow p-subgroups of G.

Version: 3 Owner: Henry Author(s): Henry

322.6 Sylow theorems

Let G be a finite group whose order is divisible by the prime p. Suppose $p^m$ is the highest
power of p which is a factor of |G| and set $k = |G|/p^m$.

• The group G contains at least one subgroup of order $p^m$.

• Any two subgroups of G of order $p^m$ are conjugate.

• The number of subgroups of G of order $p^m$ is congruent to 1 modulo p and is a factor
of k.

Version: 1 Owner: vitriol Author(s): vitriol

322.7 Sylow’s first theorem

existence of subgroups of prime-power order

1: G finite group

2: p prime

3: $p^k$ divides |G|

4: $\exists H \le G : |H| = p^k$

Note: This is a “seed” entry written using a short-hand format described in this FAQ.

Version: 2 Owner: bwebste Author(s): yark, apmxi

322.8 Sylow’s third theorem

Let G be a finite group, and let n be the number of Sylow p-subgroups of G. Then $n \equiv 1 \pmod{p}$,
and any two Sylow p-subgroups of G are conjugate to one another.

Version: 8 Owner: bwebste Author(s): yark, apmxi

322.9 application of Sylow’s theorems to groups of order pq

We can use Sylow’s theorems to examine a group G of order pq, where p and q are primes
and p < q.

Let $n_q$ denote the number of Sylow q-subgroups of G. Then Sylow’s theorems tell us that $n_q$
is of the form 1 + kq for some integer k ≥ 0 and that $n_q$ divides pq. But p and q are prime and p < q,
so this implies that $n_q = 1$. So there is exactly one Sylow q-subgroup, which is therefore
normal (indeed, characteristic) in G.

Denoting the Sylow q-subgroup by Q, and letting P be a Sylow p-subgroup, we have $Q \cap P = \{1\}$
and QP = G, so G is a semidirect product of Q and P. In particular, if there is only
one Sylow p-subgroup, then G is a direct product of Q and P, and is therefore cyclic.
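
The counting argument above can be tried out numerically. The following is a minimal sketch, not part of the original entry: the helper name `sylow_count_candidates` is hypothetical, and it only lists the values that the Sylow theorems allow for the number of Sylow p-subgroups (it does not decide which of them actually occur).

```python
def sylow_count_candidates(order: int, p: int) -> list[int]:
    """List every n allowed by the Sylow theorems for a group of the given order:
    n must divide k = order / p^m (p not dividing k) and satisfy n = 1 (mod p)."""
    k = order
    while k % p == 0:          # strip the full power of p from the order
        k //= p
    return [n for n in range(1, k + 1) if k % n == 0 and n % p == 1]


# Order 15 = 3 * 5: both Sylow subgroups are forced to be unique.
print(sylow_count_candidates(15, 5))  # [1]
print(sylow_count_candidates(15, 3))  # [1]
# Order 21 = 3 * 7: the Sylow 7-subgroup is unique, the Sylow 3-subgroup need not be.
print(sylow_count_candidates(21, 7))  # [1]
print(sylow_count_candidates(21, 3))  # [1, 7]
```

For order 15 both lists are `[1]`, which matches the conclusion that such a group is a direct product of its Sylow subgroups and hence cyclic.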

Version: 9 Owner: yark Author(s): yark, Manoj, Henry

322.10 p-primary component

Definition 27. Let G be a finite abelian group and let p ∈ N be a prime. The p-primary
component of G, Πp , is the subgroup of all elements whose order is a power of p.

Note: The p-primary component of an abelian group G coincides with the unique Sylow
p-subgroup of G.

Version: 2 Owner: alozano Author(s): alozano

322.11 proof of Frattini argument

Let g ∈ G be any element. Since H is normal, $gSg^{-1} \subset H$. Since S is a Sylow subgroup
of H, $gSg^{-1} = hSh^{-1}$ for some h ∈ H, by Sylow’s theorems. Thus $n = h^{-1}g$ normalizes S,
and so g = hn for h ∈ H and $n \in N_G(S)$.

Version: 1 Owner: bwebste Author(s): bwebste

322.12 proof of Sylow theorems

We let G be a group of order $p^m k$ where $p \nmid k$ and prove Sylow’s theorems.

First, a fact which will be used several times in the proof:
Proposition 8. If p divides |G| and p divides the size of every conjugacy class outside the center,
then p divides the order of the center.

Proof: This follows from the class equation:

$|G| = |Z(G)| + \sum_{[a] \not\subseteq Z(G)} |[a]|$,

where the sum runs over the conjugacy classes [a] not contained in the center.
If p divides the left hand side, and divides all but one entry on the right hand side, it must
divide every entry on the right side of the equation, so $p \mid |Z(G)|$.
Proposition 9. G has a Sylow p-subgroup

Proof: By induction on |G|. If |G| = 1 then there is no p which divides its order, so the
condition is trivial.

Suppose $|G| = p^m k$, $p \nmid k$, and the proposition holds for all groups of smaller order. Then
we can consider whether p divides the order of the center, Z(G).

If it does then, by Cauchy’s theorem, there is an element z of Z(G) of order p, and therefore
a cyclic subgroup $\langle z \rangle$ generated by z, also of order p. Since this is a subgroup of the center,
it is normal, so $G/\langle z \rangle$ is well-defined and of order $p^{m-1}k$. By the inductive hypothesis, this
group has a subgroup $P/\langle z \rangle$ of order $p^{m-1}$. Then there is a corresponding subgroup P of G
which has $|P| = |P/\langle z \rangle| \cdot |\langle z \rangle| = p^m$.

On the other hand, if $p \nmid |Z(G)|$ then consider the conjugacy classes not in the center. By
the proposition above, since |Z(G)| is not divisible by p, the size of at least one such conjugacy
class is not divisible by p either. If a is a representative of this class then we have
$p \nmid |[a]| = [G : C(a)]$, and since $|C(a)| \cdot [G : C(a)] = |G|$, $p^m \mid |C(a)|$. But $C(a) \neq G$, since
$a \notin Z(G)$, so by the inductive hypothesis C(a) has a subgroup
of order $p^m$, and this is also a subgroup of G.

Proposition 10. The intersection of a Sylow p-subgroup with the normalizer of a Sylow
p-subgroup is the intersection of the subgroups. That is, $Q \cap N_G(P) = Q \cap P$.

Proof: If P and Q are Sylow p-subgroups, consider $R = Q \cap N_G(P)$. Obviously $Q \cap P \subseteq R$.
In addition, since $R \subseteq N_G(P)$, the second isomorphism theorem tells us that RP is a group,
and $|RP| = \frac{|R| \cdot |P|}{|R \cap P|}$. P is a subgroup of RP, so $p^m \mid |RP|$. But R is a subgroup of Q and P
is a Sylow p-subgroup, so $|R| \cdot |P|$ is a power of p. Then it must be that $|RP| = p^m$, and
therefore P = RP, and so $R \subseteq P$. Obviously $R \subseteq Q$, so $R \subseteq Q \cap P$.

The following construction will be used in the remainder of the proof:

Given any Sylow p-subgroup P, consider the set C of its conjugates. Then X ∈ C if and only if
$X = xPx^{-1} = \{xyx^{-1} : y \in P\}$ for some x ∈ G. Observe that every X ∈ C is a Sylow p-subgroup
(and we will show that the converse holds as well). We define a group action of G
on C by:
$g \cdot X = g \cdot xPx^{-1} = gxPx^{-1}g^{-1} = (gx)P(gx)^{-1}$
This is clearly a group action, so we can consider the orbits of C under it. Of course, if all of G
is used then there is only one orbit, so we restrict the action to a Sylow p-subgroup Q. Name
the orbits $O_1, \ldots, O_s$, and let $P_1, \ldots, P_s$ be representatives of the corresponding orbits. By
the orbit-stabilizer theorem, the size of an orbit is the index of the stabilizer, and under this
action the stabilizer of any $P_i$ is just $N_Q(P_i) = Q \cap N_G(P_i) = Q \cap P_i$, so $|O_i| = [Q : Q \cap P_i]$.

There are two easy results on this construction. If $Q = P_i$ then $|O_i| = [P_i : P_i \cap P_i] = 1$. If
$Q \neq P_i$ then $[Q : Q \cap P_i] > 1$, and since the index of any subgroup of Q divides |Q|, $p \mid |O_i|$.

Proposition 11. The number of conjugates of any Sylow p-subgroup of G is congruent to 1
modulo p

In the construction above, let $Q = P_1$. Then $|O_1| = 1$ and $p \mid |O_i|$ for $i \neq 1$. Since the
number of conjugates of P is the sum of the number in each orbit, the number of conjugates
is of the form $1 + k_2 p + k_3 p + \cdots + k_s p$, which is obviously congruent to 1 modulo p.

Proposition 12. Any two Sylow p-subgroups are conjugate

Proof: Given a Sylow p-subgroup P and any other Sylow p-subgroup Q, consider again
the construction given above. If Q is not conjugate to P then $Q \neq P_i$ for every i, and
therefore $p \mid |O_i|$ for every orbit. But then the number of conjugates of P is divisible by p,
contradicting the previous result. Therefore Q must be conjugate to P.

Proposition 13. The number of subgroups of G of order $p^m$ is congruent to 1 modulo p and
is a factor of k

Proof: Since conjugates of a Sylow p-subgroup are precisely the Sylow p-subgroups, and since
the number of conjugates of a Sylow p-subgroup is congruent to 1 modulo p, so is the number
of Sylow p-subgroups.

Since the number of conjugates is the index of the normalizer, it must be $|G : N_G(P)|$. Since
P is a subgroup of its normalizer, $p^m \mid |N_G(P)|$, and therefore $|G : N_G(P)| \mid k$.

Version: 3 Owner: Henry Author(s): Henry

322.13 subgroups containing the normalizers of Sylow
subgroups normalize themselves

Let G be a finite group, and S a Sylow subgroup. Let M be a subgroup such that $N_G(S) \subset M$.
Then $M = N_G(M)$.

By order considerations, S is a Sylow subgroup of M. Since M is normal in $N_G(M)$, by
the Frattini argument, $N_G(M) = N_G(S)M = M$.

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 323

20D25 – Special subgroups (Frattini,
Fitting, etc.)

323.1 Fitting’s theorem

If G is a finite group and M and N are normal nilpotent subgroups, then MN is also a
normal nilpotent subgroup.

Thus, any finite group has a maximal normal nilpotent subgroup, called its Fitting subgroup.

Version: 1 Owner: bwebste Author(s): bwebste

323.2 characteristically simple group

A group G is called characteristically simple if its only characteristic subgroups are {1}
and G. Any finite characteristically simple group is the direct product of several
isomorphic copies of a simple group.

Version: 3 Owner: bwebste Author(s): bwebste

323.3 the Frattini subgroup is nilpotent

The Frattini subgroup Frat G of any finite group G is nilpotent.

Let S be a Sylow p-subgroup of Frat G. Then by the Frattini argument, $(\operatorname{Frat} G) N_G(S) = G$.
Since the Frattini subgroup is formed of non-generators, $N_G(S) = G$. Thus S is normal in
G, and thus in Frat G. Any finite group whose Sylow subgroups are all normal is nilpotent.

Version: 4 Owner: bwebste Author(s): bwebste

Chapter 324

20D30 – Series and lattices of
subgroups

324.1 maximal condition

A group is said to satisfy the maximal condition if every strictly ascending chain of
subgroups
G1 ⊂ G2 ⊂ G3 ⊂ · · ·
is finite.

This is also called the ascending chain condition.

A group satisfies the maximal condition if and only if the group and all its subgroups are
finitely generated.

Similar properties are useful in other classes of algebraic structures: see for example the
noetherian condition for rings and modules.

Version: 2 Owner: mclase Author(s): mclase

324.2 minimal condition

A group is said to satisfy the minimal condition if every strictly descending chain of
subgroups
G1 ⊃ G2 ⊃ G3 ⊃ · · ·
is finite.

This is also called the descending chain condition.

A group which satisfies the minimal condition is necessarily periodic. For if it contained an
element x of infinite order, then

$\langle x \rangle \supset \langle x^2 \rangle \supset \langle x^4 \rangle \supset \cdots \supset \langle x^{2^n} \rangle \supset \cdots$

is an infinite descending chain of subgroups.

Similar properties are useful in other classes of algebraic structures: see for example the
artinian condition for rings and modules.

Version: 1 Owner: mclase Author(s): mclase

324.3 subnormal series

Let G be a group with a subgroup H, and let

$G = G_0 \rhd G_1 \rhd \cdots \rhd G_n = H$ (324.3.1)

be a series of subgroups with each $G_i$ a normal subgroup of $G_{i-1}$. Such a series is called a
subnormal series or a subinvariant series.

If in addition, each Gi is a normal subgroup of G, then the series is called a normal series.

A subnormal series in which each $G_i$ is a maximal normal subgroup of $G_{i-1}$ is called a
composition series.

A normal series in which $G_i$ is a maximal normal subgroup of G contained in $G_{i-1}$ is called
a principal series or a chief series.

Note that a composition series need not end in the trivial group 1. One speaks of a composition
series (324.3.1) as a composition series from G to H. But the term composition series
for G generally means a composition series from G to 1.

Similar remarks apply to principal series.

Version: 1 Owner: mclase Author(s): mclase

Chapter 325

20D35 – Subnormal subgroups

325.1 subnormal subgroup

Let G be a group, and H a subgroup of G. Then H is subnormal if there exists a finite series

$H = H_0 \lhd H_1 \lhd \cdots \lhd H_n = G$

with $H_i$ a normal subgroup of $H_{i+1}$.

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 326

20D99 – Miscellaneous

326.1 Cauchy’s theorem

Let G be a finite group and let p be a prime dividing |G|. Then there is an element of G of
order p.

Version: 1 Owner: Evandar Author(s): Evandar

326.2 Lagrange’s theorem

Let G be a finite group and let H be a subgroup of G. Then the order of H divides the
order of G.

Version: 2 Owner: Evandar Author(s): Evandar

326.3 exponent

If G is a finite group, then the exponent of G, denoted exp G, is the smallest positive integer
n such that, for every g ∈ G, $g^n = e_G$. Thus, for every finite group G, exp G divides |G|, and, for
every g ∈ G, |g| divides exp G.

The concept of exponent for finite groups is similar to that of characteristic for rings.

If G is a finite abelian group, then there exists g ∈ G with |g| = exp G. Indeed, as a result of the
fundamental theorem of finite abelian groups, there exist $a_1, \ldots, a_n$ with $a_i$ dividing $a_{i+1}$ for
every integer i with 1 ≤ i < n such that $G \cong \mathbb{Z}_{a_1} \oplus \cdots \oplus \mathbb{Z}_{a_n}$. Since, for every c ∈ G,
$c^{a_n} = e_G$, then exp G ≤ $a_n$. Since |(0, . . . , 0, 1)| = $a_n$, then exp G = $a_n$, and the result
follows.

Following are some examples of exponents of nonabelian groups.

Since |(12)| = 2, |(123)| = 3, and |S3 | = 6, then exp S3 = 6.

In $Q_8$ = {1, −1, i, −i, j, −j, k, −k}, the quaternion group of order eight, since |i| = |−i| =
|j| = |−j| = |k| = |−k| = 4 and $1^4 = (-1)^4 = 1$, then exp $Q_8$ = 4.

Since the order of a product of two disjoint transpositions is 2, the order of a three cycle is
3, and the only nonidentity elements of A4 are products of two disjoint transpositions and
three cycles, then exp A4 = 6.

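Since |(123)| = 3 and |(1234)| = 4, then exp $S_4$ ≥ 12. Every element of $S_4$ is the identity, a
transposition, a product of two disjoint transpositions, a three cycle, or a four cycle, so every
element has order 1, 2, 3, or 4, each of which divides 12. It follows that exp $S_4$ = 12.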
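
The exponents quoted in these examples can be checked by brute force. The sketch below is an illustration only and makes its own modelling choices: permutations are stored as tuples on {0, ..., n−1}, and the helper names `perm_order`, `is_even`, and `exponent` are hypothetical. The exponent is computed as the least common multiple of the element orders.

```python
from itertools import permutations
from math import lcm  # available in Python 3.9+


def perm_order(perm):
    """Order of a permutation (perm[i] is the image of i): lcm of its cycle lengths."""
    seen, result = set(), 1
    for start in range(len(perm)):
        length, j = 0, start
        while j not in seen:       # walk the cycle containing `start`
            seen.add(j)
            j = perm[j]
            length += 1
        if length:
            result = lcm(result, length)
    return result


def is_even(perm):
    """A permutation is even when it has an even number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2 == 0


def exponent(elements):
    """Exponent of a finite group given as a list of permutations."""
    return lcm(*(perm_order(g) for g in elements))


S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))
A4 = [g for g in S4 if is_even(g)]
print(exponent(S3), exponent(A4), exponent(S4))  # 6 6 12
```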

Version: 5 Owner: Wkbj79 Author(s): Wkbj79

326.4 fully invariant subgroup

A subgroup H of a group G is fully invariant if f(H) ⊆ H for all endomorphisms f : G → G.

This is a stronger condition than being a characteristic subgroup.

The derived subgroup is fully invariant.

Version: 1 Owner: mclase Author(s): mclase

326.5 proof of Cauchy’s theorem

Let G be a finite group and p be a prime divisor of |G|. Consider the set X of all ordered
strings $(x_1, x_2, \ldots, x_p)$ for which $x_1 x_2 \cdots x_p = e$. Note $|X| = |G|^{p-1}$, i.e. a multiple of
p. There is a natural group action of $\mathbb{Z}_p$ on X: $m \in \mathbb{Z}_p$ sends the string $(x_1, x_2, \ldots, x_p)$
to $(x_{m+1}, \ldots, x_p, x_1, \ldots, x_m)$. By the orbit-stabilizer theorem each orbit contains exactly 1 or
p strings. Since (e, e, . . . , e) has an orbit of cardinality 1, and the orbits partition X, the
cardinality of which is divisible by p, there must exist at least one other string $(x_1, x_2, \ldots, x_p)$
which is left fixed by every element of $\mathbb{Z}_p$, i.e. $x_1 = x_2 = \cdots = x_p$, and so there exists an
element of order p as required.
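
The counting in this proof can be illustrated on a small case. The following sketch assumes G = S3 realised as permutation tuples; the helper names `compose` and `cauchy_count` are hypothetical. It builds the set X, and counts the strings fixed by rotation, that is, the constant strings (x, ..., x) with $x^p = e$.

```python
from functools import reduce
from itertools import permutations, product


def compose(a, b):
    """Product of permutations as tuples: (a*b)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)


def cauchy_count(group, identity, p):
    """Return (|X|, number of rotation-fixed strings) for the set X of the proof."""
    strings = [s for s in product(group, repeat=p) if reduce(compose, s) == identity]
    fixed = [s for s in strings if all(x == s[0] for x in s)]
    return len(strings), len(fixed)


S3 = list(permutations(range(3)))
e = (0, 1, 2)
print(cauchy_count(S3, e, 3))  # (36, 3): |X| = 6^2, fixed strings: e and two 3-cycles
print(cauchy_count(S3, e, 2))  # (6, 4):  |X| = 6^1, fixed strings: e and three transpositions
```

In both cases the number of fixed strings is a multiple of p and larger than 1, so some non-identity x satisfies $x^p = e$.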

Version: 1 Owner: vitriol Author(s): vitriol

326.6 proof of Lagrange’s theorem

We know that the cosets Hg form a partition of G (see the coset entry for proof of this.)
Since G is finite, we know it can be completely decomposed into a finite number of cosets.
Call this number n. We denote the ith coset by Hai and write G as

$G = Ha_1 \cup Ha_2 \cup \cdots \cup Ha_n$

Since each coset has |H| elements, we have

|G| = |H| · n

and so |H| divides |G|, which proves Lagrange’s theorem.

Version: 2 Owner: akrowne Author(s): akrowne

326.7 proof of the converse of Lagrange’s theorem for
finite cyclic groups

Following is a proof that, if G is a finite cyclic group and n ∈ Z+ is a divisor of |G|, then G
has a subgroup of order n.

Let g be a generator of G. Then $|g| = |\langle g \rangle| = |G|$. Let z ∈ Z be such that nz = |G| = |g|.
Consider $\langle g^z \rangle$. Since g ∈ G, then $g^z \in G$. Thus, $\langle g^z \rangle \le G$. Since

$|\langle g^z \rangle| = |g^z| = \frac{|g|}{\gcd(z, |g|)} = \frac{nz}{\gcd(z, nz)} = \frac{nz}{z} = n$,

it follows that $\langle g^z \rangle$ is a subgroup of G of order n.
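
As a concrete check of this construction, the sketch below works in the additive cyclic group $\mathbb{Z}_N$ with generator 1, so that $g^z$ corresponds to the residue z = N/n. This modelling choice and the helper name `subgroup_of_order` are assumptions made for illustration.

```python
from math import gcd


def subgroup_of_order(N: int, n: int) -> list[int]:
    """Elements of the subgroup of Z_N generated by z = N // n (n must divide N)."""
    if N % n != 0:
        raise ValueError("n must divide N")
    z = N // n
    return sorted({(z * t) % N for t in range(N)})


H = subgroup_of_order(12, 4)
print(H)                               # [0, 3, 6, 9]
print(len(H) == 12 // gcd(3, 12))      # True: |<g^z>| = |g| / gcd(z, |g|)
```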

Version: 3 Owner: Wkbj79 Author(s): Wkbj79

326.8 proof that expG divides |G|

Following is a proof that exp G divides |G| for every finite group G.

By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < exp G such that |G| =
q(exp G) + r. Let g ∈ G. Then $e_G = g^{|G|} = g^{q(\exp G)+r} = g^{q(\exp G)} g^r = (g^{\exp G})^q g^r =
(e_G)^q g^r = e_G g^r = g^r$. Thus, for every g ∈ G, $g^r = e_G$. By the definition of exponent, r
cannot be positive. Thus, r = 0. It follows that exp G divides |G|.

Version: 4 Owner: Wkbj79 Author(s): Wkbj79

326.9 proof that |g| divides expG

Following is a proof that, for every finite group G and for every g ∈ G, |g| divides exp G.

By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < |g| such that exp G = q|g| + r.
Since $e_G = g^{\exp G} = g^{q|g|+r} = (g^{|g|})^q g^r = (e_G)^q g^r = e_G g^r = g^r$, then, by definition of the
order of an element, r cannot be positive. Thus, r = 0. It follows that |g| divides exp G.

Version: 2 Owner: Wkbj79 Author(s): Wkbj79

326.10 proof that every group of prime order is cyclic

Following is a proof that every group of prime order is cyclic.

Let p be a prime and G be a group such that |G| = p. Then G contains more than one
element. Let g ∈ G be such that $g \neq e_G$. Then $\langle g \rangle$ contains more than one element. Since
$\langle g \rangle \le G$, then by Lagrange’s theorem, $|\langle g \rangle|$ divides p. Since $|\langle g \rangle| > 1$ and $|\langle g \rangle|$ divides a
prime, then $|\langle g \rangle| = p = |G|$. Hence, $\langle g \rangle = G$. It follows that G is cyclic.

Version: 3 Owner: Wkbj79 Author(s): Wkbj79

Chapter 327

20E05 – Free nonabelian groups

327.1 Nielsen-Schreier theorem

Let G be a free group and H a subgroup of G. Then H is free.

Version: 1 Owner: Evandar Author(s): Evandar

327.2 Schreier index formula

Let G be a free group and H a subgroup of finite index |G : H| = n. By the Nielsen-Schreier theorem,
H is free. The Schreier index formula states that

rank(H) = n(rank(G) − 1) + 1.

This implies, more generally, that if G′ is any group generated by m elements, then any subgroup
of index n can be generated by nm − n + 1 elements.
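
A short numerical illustration of the formula follows; the function name `schreier_rank` is hypothetical and the snippet only evaluates the stated expression.

```python
def schreier_rank(rank_g: int, index: int) -> int:
    """Rank of an index-n subgroup of a free group of rank rank_g: n(rank(G) - 1) + 1."""
    if index < 1:
        raise ValueError("the index must be a positive integer")
    return index * (rank_g - 1) + 1


print(schreier_rank(2, 2))  # 3: an index-2 subgroup of the free group on 2 letters
print(schreier_rank(2, 3))  # 4
print(schreier_rank(1, 5))  # 1: finite-index subgroups of Z are again infinite cyclic
```

Note that for rank(G) ≥ 2 the rank of H grows with the index, so a free group of rank 2 contains free subgroups of every larger finite rank.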

Version: 1 Owner: bwebste Author(s): bwebste

327.3 free group

Let A be a set with elements $a_i$ for i in some index set I. We refer to A as an alphabet and the
elements of A as letters. A syllable is a symbol of the form $a_i^n$ for n ∈ Z. It is customary
to write $a_i$ for $a_i^1$. Define a word to be a finite ordered string, or sequence, of syllables made
up of elements of A. For example,

$a_2^3 a_1 a_4^{-1} a_3^2 a_2^{-3}$

is a five-syllable word. Notice that there exists a unique empty word, i.e. the word with
no syllables, usually written simply as 1. Denote the set of all words formed from elements
of A by W[A].

Define a binary operation, called the product, on W[A] by concatenation of words. To
illustrate, if $a_2^3 a_1$ and $a_1^{-1} a_3^4$ are elements of W[A] then their product is simply $a_2^3 a_1 a_1^{-1} a_3^4$.
This gives W[A] the structure of a semigroup with identity. The empty word 1 acts as a
right and left identity in W[A], and is the only element which has an inverse. In order to
give W[A] the structure of a group, two more ideas are needed.

If $v = u_1 a_i^0 u_2$ is a word where $u_1, u_2$ are also words and $a_i$ is some element of A, an elementary
contraction of type I replaces the occurrence of $a_i^0$ by 1. Thus, after this type of
contraction we get another word $w = u_1 u_2$. If $v = u_1 a_i^p a_i^q u_2$ is a word, an elementary
contraction of type II replaces the occurrence of $a_i^p a_i^q$ by $a_i^{p+q}$, which results in $w = u_1 a_i^{p+q} u_2$.
In either of these cases, we also say that w is obtained from v by an elementary contraction,
or that v is obtained from w by an elementary expansion.

Call two words u, v equivalent (denoted u ∼ v) if one can be obtained from the other by
a finite sequence of elementary contractions or expansions. This is an equivalence relation
on W[A]. Let F[A] be the set of equivalence classes of words in W[A]. Then F[A] is a group
under the operation
$[u][v] = [uv]$
where [u], [v] ∈ F[A]. The inverse $[u]^{-1}$ of an element [u] is obtained by reversing the order of
the syllables of u and changing the sign of the exponent of each syllable. For example, if
$[u] = [a_1 a_3^2]$, then $[u]^{-1} = [a_3^{-2} a_1^{-1}]$.
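
The contractions and the group operation above can be sketched in code. This is a minimal illustration under stated assumptions: a word is stored as a list of `(letter_index, exponent)` syllables, each equivalence class is represented by a fully contracted word, and the function names `reduce_word`, `multiply`, and `inverse` are hypothetical.

```python
def reduce_word(word):
    """Apply elementary contractions of types I and II until none is possible."""
    reduced = []
    for letter, exp in word:
        if exp == 0:                               # type I: drop a_i^0
            continue
        if reduced and reduced[-1][0] == letter:   # type II: a_i^p a_i^q -> a_i^(p+q)
            total = reduced.pop()[1] + exp
            if total != 0:                         # a_i^0 produced by type II is dropped too
                reduced.append((letter, total))
        else:
            reduced.append((letter, exp))
    return reduced


def multiply(u, v):
    """Product [u][v] = [uv]: concatenate, then contract."""
    return reduce_word(list(u) + list(v))


def inverse(u):
    """Reverse the syllables and negate every exponent."""
    return [(letter, -exp) for letter, exp in reversed(u)]


u = [(2, 3), (1, 1)]      # a_2^3 a_1
v = [(1, -1), (3, 4)]     # a_1^-1 a_3^4
print(multiply(u, v))     # [(2, 3), (3, 4)], i.e. a_2^3 a_3^4
w = [(1, 1), (3, 2)]      # a_1 a_3^2
print(inverse(w))         # [(3, -2), (1, -1)]
print(multiply(w, inverse(w)))  # [] (the empty word 1)
```

Keeping the partially contracted word on a stack lets a cancellation expose the previous syllable to the next one, so a single left-to-right pass suffices.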

We call F[A] the free group on the alphabet A or the free group generated by A.
A given group G is free if G is isomorphic to F[A] for some A. This seemingly ad hoc
construction gives an important result: Every group is the homomorphic image of some free
group.

Version: 4 Owner: jihemme Author(s): jihemme, rmilson, djao

327.4 proof of Nielsen-Schreier theorem and Schreier
index formula

While there are purely algebraic proofs of this fact, a much easier proof is available through
geometric group theory.

Let G be a group which is free on a set X. Any group acts freely on its Cayley graph, and
the Cayley graph of G is a 2|X|-regular tree, which we will call T.

If H is any subgroup of G, then H also acts freely on T by restriction. Since groups that act freely on trees are free,
H is free.

Moreover, we can obtain the rank of H (the size of the set on which it is free). If Γ is a finite
connected graph, then $\pi_1(\Gamma)$ is free of rank $1 - \chi(\Gamma)$, where $\chi(\Gamma)$ denotes the Euler characteristic of
Γ. Since $H \cong \pi_1(H\backslash T)$, the rank of H is $1 - \chi(H\backslash T)$. If H is of finite index n in G, then $H\backslash T$
is finite, and $\chi(H\backslash T) = n\chi(G\backslash T)$. Of course $1 - \chi(G\backslash T)$ is the rank of G. Substituting,
we find that
rank(H) = n(rank(G) − 1) + 1.

Version: 2 Owner: bwebste Author(s): bwebste

327.5 Jordan–Hölder decomposition

A Jordan–Hölder decomposition of a group G is a filtration

G = G1 ⊃ G2 ⊃ · · · ⊃ Gn = {1}

such that Gi+1 is a normal subgroup of Gi and the quotient Gi /Gi+1 is a simple group for
each i.

Version: 4 Owner: djao Author(s): djao

327.6 profinite group

A topological group G is profinite if it is isomorphic to the inverse limit of some projective
system of finite groups. In other words, G is profinite if there exists a directed set I, a
collection of finite groups $\{H_i\}_{i \in I}$, and homomorphisms $\alpha_{ij} : H_j \to H_i$ for each pair i, j ∈ I
with i ≤ j, satisfying

1. $\alpha_{ii} = 1$ for all i ∈ I,

2. $\alpha_{ij} \circ \alpha_{jk} = \alpha_{ik}$ for all i, j, k ∈ I with i ≤ j ≤ k,

with the property that:

• G is isomorphic as a group to the projective limit

$\varprojlim H_i := \left\{ (h_i) \in \prod_{i \in I} H_i \;\middle|\; \alpha_{ij}(h_j) = h_i \text{ for all } i \le j \right\}$

under componentwise multiplication.

• The isomorphism from G to $\varprojlim H_i$ (considered as a subspace of $\prod H_i$) is a homeomorphism
of topological spaces, where each $H_i$ is given the discrete topology and $\prod H_i$ is given
the product topology.

The topology on a profinite group is called the profinite topology.

Version: 3 Owner: djao Author(s): djao

327.7 extension

A short exact sequence 0 → A → B → C → 0 is sometimes called an extension of C by A.
This term is also applied to an object B which fits into such an exact sequence.

Version: 1 Owner: bwebste Author(s): bwebste

327.8 holomorph

Let K be a group, and let θ : Aut(K) → Aut(K) be the identity map. The holomorph of
K, denoted Hol(K), is the semidirect product $K \rtimes_\theta \operatorname{Aut}(K)$. Then K is a normal subgroup of
Hol(K), and any automorphism of K is the restriction of an inner automorphism of Hol(K).
For if φ ∈ Aut(K), then

$(1, \phi) \cdot (k, 1) \cdot (1, \phi^{-1}) = (1 \cdot k^{\theta(\phi)}, \phi) \cdot (1, \phi^{-1})$
$= (k^{\theta(\phi)} \cdot 1^{\theta(\phi)}, \phi\phi^{-1})$
$= (\phi(k), 1).$

Version: 2 Owner: dublisk Author(s): dublisk

327.9 proof of the Jordan–Hölder decomposition theorem

Let |G| = N. We first prove existence, using induction on N. If N = 1 (or, more generally,
if G is simple) the result is clear. Now suppose G is not simple. Choose a maximal proper
normal subgroup G1 of G. Then G1 has a Jordan–Hölder decomposition by induction, which
produces a Jordan–Hölder decomposition for G.

To prove uniqueness, we use induction on the length n of the decomposition series. If n = 1
then G is simple and we are done. For n > 1, suppose that

G ⊃ G1 ⊃ G2 ⊃ · · · ⊃ Gn = {1}

and
$G \supset G'_1 \supset G'_2 \supset \cdots \supset G'_m = \{1\}$
are two decompositions of G. If $G_1 = G'_1$ then we’re done (apply the induction hypothesis
to $G_1$), so assume $G_1 \neq G'_1$. Set $H := G_1 \cap G'_1$ and choose a decomposition series

$H \supset H_1 \supset \cdots \supset H_k = \{1\}$

for H. By the second isomorphism theorem, $G_1/H = G_1G'_1/G'_1 = G/G'_1$ (the last equality
is because $G_1G'_1$ is a normal subgroup of G properly containing $G_1$). In particular, H is a
normal subgroup of $G_1$ with simple quotient. But then

$G_1 \supset G_2 \supset \cdots \supset G_n$

and
$G_1 \supset H \supset \cdots \supset H_k$
are two decomposition series for $G_1$, and hence have the same simple quotients by the
induction hypothesis; likewise for the $G'_1$ series. Therefore n = m. Moreover, since $G/G_1 =
G'_1/H$ and $G/G'_1 = G_1/H$ (by the second isomorphism theorem), we have now accounted for
all of the simple quotients, and shown that they are the same.

Version: 4 Owner: djao Author(s): djao

327.10 semidirect product of groups

The goal of this exposition is to carefully explain the correspondence between the notions
of external and internal semi–direct products of groups, as well as the connection between
semi–direct products and short exact sequences.

Definition 6. Let H and Q be groups and let θ : Q −→ Aut(H) be a group homomorphism.
The semi–direct product $H \rtimes_\theta Q$ is defined to be the group with underlying set
$\{(h, q) : h \in H,\ q \in Q\}$ and group operation $(h, q)(h', q') := (h\,\theta(q)(h'),\ qq')$.

We leave it to the reader to check that $H \rtimes_\theta Q$ is really a group. It helps to know that the
inverse of (h, q) is $(\theta(q^{-1})(h^{-1}),\ q^{-1})$.
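
The check left to the reader can be carried out by brute force on a small case. The sketch below assumes H = $\mathbb{Z}_3$ and Q = $\mathbb{Z}_2$ written additively, with θ(1) acting on H by negation; these choices, and the names `theta`, `mul`, and `inv`, are illustrative assumptions rather than part of the definition.

```python
from itertools import product

N_H, N_Q = 3, 2   # H = Z_3, Q = Z_2 (both written additively)


def theta(q, h):
    """theta(q) in Aut(H): the identity for q = 0, negation for q = 1."""
    return h % N_H if q % N_Q == 0 else (-h) % N_H


def mul(x, y):
    """(h, q)(h', q') := (h theta(q)(h'), q q'), in additive notation."""
    (h, q), (h2, q2) = x, y
    return ((h + theta(q, h2)) % N_H, (q + q2) % N_Q)


def inv(x):
    """Inverse of (h, q): (theta(q^-1)(h^-1), q^-1)."""
    h, q = x
    q_inv = (-q) % N_Q
    return (theta(q_inv, (-h) % N_H), q_inv)


G = list(product(range(N_H), range(N_Q)))
identity = (0, 0)

associative = all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in G for y in G for z in G)
inverses_ok = all(mul(x, inv(x)) == identity == mul(inv(x), x) for x in G)
abelian = all(mul(x, y) == mul(y, x) for x in G for y in G)
print(associative, inverses_ok, abelian)  # True True False
```

The resulting group has order 6 and is nonabelian, so with these choices the construction yields a group isomorphic to S3.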

For the remainder of this article, we omit θ from the notation whenever this map is clear
from the context.

Set $G := H \rtimes Q$. There exist canonical monomorphisms H −→ G and Q −→ G, given by

$h \mapsto (h, 1_Q)$, h ∈ H
$q \mapsto (1_H, q)$, q ∈ Q

where $1_H$ (resp. $1_Q$) is the identity element of H (resp. Q). These monomorphisms are so
natural that we will treat H and Q as subgroups of G under these inclusions.

Theorem 3. Let $G := H \rtimes Q$ as above. Then:

• H is a normal subgroup of G.

• HQ = G.

• $H \cap Q = \{1_G\}$.

Let p : G −→ Q be the projection map defined by p(h, q) = q. Then p is a homomorphism
with kernel H. Therefore H is a normal subgroup of G.

Every (h, q) ∈ G can be written as $(h, 1_Q)(1_H, q)$. Therefore HQ = G.

Finally, it is evident that $(1_H, 1_Q)$ is the only element of G that is of the form $(h, 1_Q)$ for
h ∈ H and $(1_H, q)$ for q ∈ Q.

This result motivates the definition of internal semi–direct products.

Definition 7. Let G be a group with subgroups H and Q. We say G is the internal semi–
direct product of H and Q if:

• H is a normal subgroup of G.

• HQ = G.
• $H \cap Q = \{1_G\}$.

We know an external semi–direct product is an internal semi–direct product (Theorem 3).
Now we prove a converse (Theorem 4), namely, that an internal semi–direct product is an
external semi–direct product.
Lemma 6. Let G be a group with subgroups H and Q. Suppose G = HQ and $H \cap Q = \{1_G\}$.
Then every element g of G can be written uniquely in the form hq, for h ∈ H and q ∈ Q.

Since G = HQ, we know that g can be written as hq. Suppose it can also be written as
$h'q'$. Then $hq = h'q'$ so $h'^{-1}h = q'q^{-1} \in H \cap Q = \{1_G\}$. Therefore $h = h'$ and $q = q'$.

Theorem 4. Suppose G is a group with subgroups H and Q, and G is the internal semi–
direct product of H and Q. Then $G \cong H \rtimes_\theta Q$ where θ : Q −→ Aut(H) is given by

$\theta(q)(h) := qhq^{-1}$, q ∈ Q, h ∈ H.

By Lemma 6, every element g of G can be written uniquely in the form hq, with h ∈ H
and q ∈ Q. Therefore, the map $\phi : H \rtimes Q \to G$ given by φ(h, q) = hq is a bijection from
$H \rtimes Q$ to G. It only remains to show that this bijection is a homomorphism.

Given elements (h, q) and (h′, q′) in $H \rtimes Q$, we have

$\phi((h, q)(h', q')) = \phi(h\,\theta(q)(h'),\ qq') = \phi(hqh'q^{-1},\ qq') = hqh'q' = \phi(h, q)\phi(h', q')$.

Therefore φ is an isomorphism.

Consider the external semi–direct product $G := H \rtimes_\theta Q$ with subgroups H and Q. We know
from Theorem 4 that G is isomorphic to the external semi–direct product $H \rtimes_{\theta'} Q$, where
we are temporarily writing θ′ for the conjugation map $\theta'(q)(h) := qhq^{-1}$ of Theorem 4. But
in fact the two maps θ and θ′ are the same:

$\theta'(q)(h) = (1_H, q)(h, 1_Q)(1_H, q^{-1}) = (\theta(q)(h), 1_Q) = \theta(q)(h)$.

In summary, one may use Theorems 3 and 4 to pass freely between the notions of internal
semi–direct product and external semi–direct product.

Finally, we discuss the correspondence between semi–direct products and split exact sequences
of groups.
Definition 8. An exact sequence of groups

$1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ j\ } Q \longrightarrow 1$

is split if there exists a homomorphism k : Q −→ G such that j ◦ k is the identity map on Q.
Theorem 5. Let G, H, and Q be groups. Then G is isomorphic to a semi–direct product
$H \rtimes Q$ if and only if there exists a split exact sequence

$1 \longrightarrow H \xrightarrow{\ i\ } G \xrightarrow{\ j\ } Q \longrightarrow 1.$

First suppose $G \cong H \rtimes Q$. Let i : H −→ G be the inclusion map $i(h) = (h, 1_Q)$ and let
j : G −→ Q be the projection map j(h, q) = q. Let the splitting map k : Q −→ G be the
inclusion map $k(q) = (1_H, q)$. Then the sequence above is clearly split exact.

Now suppose we have the split exact sequence above. Let k : Q −→ G be the splitting map.
Then:

• i(H) = ker j, so i(H) is normal in G.

• For any g ∈ G, set q := k(j(g)). Then $j(gq^{-1}) = j(g)\,j(k(j(g)))^{-1} = 1_Q$, so $gq^{-1} \in \operatorname{Im} i$.
Set $h := gq^{-1}$. Then g = hq. Therefore G = i(H)k(Q).

• Suppose g ∈ G is in both i(H) and k(Q). Write g = k(q). Then k(q) ∈ Im i = ker j,
so $q = j(k(q)) = 1_Q$. Therefore $g = k(q) = k(1_Q) = 1_G$, so $i(H) \cap k(Q) = \{1_G\}$.

This proves that G is the internal semi–direct product of i(H) and k(Q). These are
isomorphic to H and Q, respectively. Therefore G is isomorphic to a semi–direct product
$H \rtimes Q$.

Thus, not all normal subgroups H ⊂ G give rise to an (internal) semi–direct product
$G = H \rtimes G/H$. More specifically, if H is a normal subgroup of G, we have the canonical exact
sequence
$1 \longrightarrow H \longrightarrow G \longrightarrow G/H \longrightarrow 1.$
We see that G can be decomposed into $H \rtimes G/H$ as an internal semi–direct product if and
only if the canonical exact sequence splits.

Version: 5 Owner: djao Author(s): djao

327.11 wreath product

Let A and B be groups, and let B act on the set Γ. Let $A^\Gamma$ be the set of all functions from
Γ to A. Endow $A^\Gamma$ with a group operation by pointwise multiplication. In other words, for
any $f_1, f_2 \in A^\Gamma$,
$(f_1 f_2)(\gamma) = f_1(\gamma) f_2(\gamma) \quad \forall \gamma \in \Gamma$
where the operation on the right hand side above takes place in A, of course. Define the
action of B on $A^\Gamma$ by
$(bf)(\gamma) := f(b\gamma)$,
for any f : Γ → A and all γ ∈ Γ.

The wreath product of A and B according to the action of B on Γ, sometimes denoted
$A \wr_\Gamma B$, is the following semidirect product of groups:

$A^\Gamma \rtimes B$.

Before going into further constructions, let us pause for a moment to unwind this definition.
Let W := A ≀_Γ B. The elements of W are ordered pairs (f, b), for some function f : Γ → A
and some b ∈ B. The group operation in the semidirect product, for any (f_1, b_1), (f_2, b_2) ∈ W,
is
(f_1, b_1)(f_2, b_2) = (f, b_1 b_2), where f(γ) = f_1(γ) f_2(b_1 γ) ∀γ ∈ Γ.
The set A^Γ can be interpreted as the cartesian product of copies of A indexed by Γ.
That is to say, Γ here plays the role of an index set for the cartesian product. If Γ is finite,
for instance, say Γ = {1, 2, . . . , n}, then any f ∈ A^Γ is an n-tuple, and we can think of
any (f, b) ∈ W as the following ordered pair:

((a1 , a2 , . . . , an ), b) where a1 , a2 , . . . , an ∈ A

The action of B on Γ in the semidirect product has the effect of permuting the entries of
the n-tuple f, and the group operation defined on A^Γ gives pointwise multiplication. To be
explicit, suppose (f, a), (g, b) ∈ W, and for j ∈ Γ, f(j) = r_j ∈ A and g(j) = s_j ∈ A. Writing
aj for the image of j ∈ Γ under a ∈ B, we have

(f, a)(g, b) = ((r_1, r_2, . . . , r_n), a)((s_1, s_2, . . . , s_n), b)
= ((r_1, r_2, . . . , r_n)(s_{a1}, s_{a2}, . . . , s_{an}), ab) (Notice the permutation of the indices!)
= ((r_1 s_{a1}, r_2 s_{a2}, . . . , r_n s_{an}), ab).

A moment’s thought to understand this slightly messy notation will be illuminating (and
might also shed some light on the choice of terminology, “wreath” product).
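
The multiplication rule above can be tried out on a small finite case. The following is only a sketch under stated assumptions: A = Z/2Z written additively, Γ = {0, 1, 2}, and B the full symmetric group on Γ. The names `b_mul`, `w_mul`, `W` and `IDENTITY` are illustrative. One convention is assumed and matters for associativity: the product b1 b2 in B is taken to mean "apply b1 first, then b2".

```python
from itertools import permutations, product

N = 3          # Γ = {0, 1, 2} (assumed index set)
A_ORDER = 2    # A = Z/2Z, written additively (assumed)

def b_mul(b1, b2):
    """Product b1 b2 in B, assumed to mean: apply b1 first, then b2."""
    return tuple(b2[b1[g]] for g in range(N))

def w_mul(x, y):
    """(f1, b1)(f2, b2) = (f, b1 b2) with f(γ) = f1(γ) f2(b1 γ)."""
    (f1, b1), (f2, b2) = x, y
    f = tuple((f1[g] + f2[b1[g]]) % A_ORDER for g in range(N))
    return (f, b_mul(b1, b2))

# All elements (f, b): f is a tuple of N entries of A, b a permutation of Γ.
B = list(permutations(range(N)))
W = [(f, b) for f in product(range(A_ORDER), repeat=N) for b in B]
IDENTITY = ((0,) * N, tuple(range(N)))

print(len(W))  # |A|^|Γ| * |B| = 8 * 6 = 48
```

With the opposite composition convention in `b_mul` the rule is no longer associative for non-abelian B, which is why the convention is spelled out here.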

Version: 11 Owner: bwebste Author(s): NeuRet

327.12 Jordan–Hölder decomposition theorem

Every finite group G has a filtration

G = G_0 ⊃ G_1 ⊃ · · · ⊃ G_n = {1},

where each G_{i+1} is normal in G_i and each quotient group G_i/G_{i+1} is a simple group. Any two
such decompositions of G have the same multiset of simple groups G_i/G_{i+1} up to ordering.

A filtration of G satisfying the properties above is called a Jordan–Hölder decomposition of
G.

Version: 4 Owner: djao Author(s): djao

327.13 simplicity of the alternating groups

This is an elementary proof that for n ≥ 5 the alternating group on n symbols, A_n, is simple.

Throughout this discussion, fix n ≥ 5. We will extensively employ cycle notation, with
composition on the left, as is usual. The following observation will also be useful. Let π be
a permutation written as disjoint cycles

π = (a_1, a_2, . . . , a_k)(b_1, b_2, . . . , b_l)(. . .) . . .

It is easy to check that for any other permutation σ ∈ S_n

σπσ^{-1} = (σ(a_1), σ(a_2), . . . , σ(a_k))(σ(b_1), σ(b_2), . . . , σ(b_l))(. . .) . . .

In particular, two permutations of S_n are conjugate exactly when they have the same cycle
type.

Two preliminary results are necessary.
Lemma 7. An is generated by all cycles of length 3.

A product of 3-cycles is an even permutation, so the subgroup generated by all 3-cycles is
contained in A_n. For the reverse inclusion, by definition every even permutation
is the product of an even number of transpositions. Thus, it suffices to show that the product
of two transpositions can be written as a product of 3-cycles. There are two possibilities.
Either the two transpositions move an element in common, say (a, b) and (a, c), or the two
transpositions are disjoint, say (a, b) and (c, d). In the former case,

(a, b)(a, c) = (a, c, b),

and in the latter,
(a, b)(c, d) = (a, b, d)(c, b, d).
This establishes the first lemma.
Lemma 8. If a normal subgroup N ⊴ A_n contains a 3-cycle, then N = A_n.

We will show that if (a, b, c) ∈ N, then the assumption of normality implies that any other
3-cycle (a′, b′, c′) ∈ N. This is easy to show, because there is some permutation σ ∈ S_n that under
conjugation takes (a, b, c) to (a′, b′, c′), that is

σ(a, b, c)σ^{-1} = (σ(a), σ(b), σ(c)) = (a′, b′, c′).

In case σ is odd, then (because n ≥ 5) we can choose some transposition (d, e) ∈ S_n disjoint
from (a′, b′, c′) so that
σ(a, b, c)σ^{-1} = (d, e)(a′, b′, c′)(d, e),
that is,
σ′(a, b, c)σ′^{-1} = (d, e)σ(a, b, c)σ^{-1}(d, e) = (a′, b′, c′)
where σ′ = (d, e)σ is even. This means that N contains all 3-cycles, as N ⊴ A_n. Hence, by the previous
lemma, N = A_n as required.

The rest of the proof proceeds by an exhaustive verification of all the possible cases. Suppose
there is some nontrivial N ⊴ A_n. We will show that N = A_n. In each case we will suppose N
contains a particular kind of element, and the normality will imply that N also contains a
certain conjugate of the element in A_n, thereby reducing the situation to a previously solved
case.

Case 1 Suppose N contains a permutation π that when written as disjoint cycles has a
cycle of length at least 4, say

π = (a_1, a_2, a_3, a_4, . . .) . . .

Upon conjugation by (a_1, a_2, a_3) ∈ A_n, we obtain

π′ = (a_1, a_2, a_3)π(a_3, a_2, a_1) = (a_2, a_3, a_1, a_4, . . .) . . .

so that π′ ∈ N, and also π′π^{-1} = (a_1, a_2, a_4) ∈ N. Notice that the rest of the cycles cancel.
By Lemma 8, N = A_n.

Case 2 The cycle decompositions of elements of N involve only cycles of length 2 and 3, and
some element of N has at least two cycles of length 3. Consider then π = (a, b, c)(d, e, f) . . . Conjugation by (c, d, e)
implies that N also contains

π′ = (c, d, e)π(e, d, c) = (a, b, d)(e, c, f) . . . ,

and hence N also contains π′π = (a, d, c, b, f) . . ., which reduces to Case 1.

Case 3 There is an element of N whose cyclic decomposition only involves transpositions
and exactly one 3-cycle. Upon squaring, this element becomes a 3-cycle and Lemma 8
applies.

Case 4 There is an element of N of the form π = (a, b)(c, d). Conjugating by (a, e, b) with
e distinct from a, b, c, d (such an e exists, as n ≥ 5) yields

π′ = (a, e, b)π(b, e, a) = (a, e)(c, d) ∈ N.

Hence π′π = (a, b, e) ∈ N. Lemma 8 applies and N = A_n.

Case 5 Every element of N is the product of at least four transpositions. Suppose N
contains π = (a_1, b_1)(a_2, b_2)(a_3, b_3)(a_4, b_4) . . ., the number of transpositions being even, of
course. This time we conjugate by (a_2, b_1)(a_3, b_2).

π′ = (a_2, b_1)(a_3, b_2)π(a_3, b_2)(a_2, b_1) = (a_1, a_2)(a_3, b_1)(b_2, b_3)(a_4, b_4) . . . ,

and π′π = (a_1, a_3, b_2)(a_2, b_3, b_1) ∈ N which is Case 2.

Since this covers all possible cases, N = An and the alternating group contains no proper
nontrivial normal subgroups. QED.
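
For small n the statement can also be checked by brute force, which is a useful sanity test of the case analysis. The sketch below is not part of the proof: it represents permutations of {0, ..., n−1} as tuples and checks that the normal closure of every non-identity element of A_n is all of A_n. The function names are illustrative, and the approach is only practical for very small n.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_even(p):
    """Parity of a permutation, computed by counting inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def generated_subgroup(generators, identity):
    """Subgroup generated by a finite set of permutations (closure under products)."""
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def alternating_is_simple(n):
    """True if every non-identity element of A_n has normal closure equal to A_n."""
    identity = tuple(range(n))
    a_n = [p for p in permutations(range(n)) if is_even(p)]
    for g in a_n:
        if g == identity:
            continue
        conjugates = {compose(compose(h, g), inverse(h)) for h in a_n}
        if len(generated_subgroup(conjugates, identity)) != len(a_n):
            return False
    return True

print(alternating_is_simple(5), alternating_is_simple(4))
```

For n = 4 the check fails, as expected, because the double transpositions together with the identity form a proper normal subgroup.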

Version: 8 Owner: rmilson Author(s): NeuRet

327.14 abelian groups of order 120

Here we present an application of the fundamental theorem of finitely generated abelian groups.

Example (abelian groups of order 120):

Let G be an abelian group of order n = 120. Since the group is finite it is obviously
finitely generated, so we can apply the theorem. There exist n_1, n_2, . . . , n_s with

G ≅ Z/n_1Z ⊕ Z/n_2Z ⊕ . . . ⊕ Z/n_sZ
n_i ≥ 2 for all i; n_{i+1} | n_i for 1 ≤ i ≤ s − 1
Notice that in the case of a finite group, r, as in the statement of the theorem, must be equal
to 0. We have

n = 120 = 2^3 · 3 · 5 = ∏_{i=1}^{s} n_i = n_1 · n_2 · . . . · n_s

and by the divisibility properties of the n_i, every prime divisor of n must
divide n_1. Thus the possibilities for n_1 are the following

2 · 3 · 5, 2^2 · 3 · 5, 2^3 · 3 · 5

If n_1 = 2^3 · 3 · 5 = 120 then s = 1. In the case that n_1 = 2^2 · 3 · 5 = 60 then n_2 = 2 and
s = 2. It remains to analyze the case n_1 = 2 · 3 · 5 = 30. Here n_2 · . . . · n_s = 4 and n_2 must
divide n_1 = 30, which rules out n_2 = 4; hence n_2 = n_3 = 2 and s = 3.

Hence if G is an abelian group of order 120 it must be (up to isomorphism) one of the
following:

Z/120Z, Z/60Z ⊕ Z/2Z, Z/30Z ⊕ Z/2Z ⊕ Z/2Z

Also notice that they are all non-isomorphic. This is because

Z/(n · m)Z ≅ Z/nZ ⊕ Z/mZ ⇔ gcd(n, m) = 1
which is due to the Chinese remainder theorem.
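
The enumeration of invariant factors can be mechanised. The following sketch lists every sequence n_1, n_2, ..., n_s with product n, each n_i ≥ 2, and n_{i+1} | n_i; by the fundamental theorem each such sequence corresponds to exactly one isomorphism class. The function name `invariant_factor_lists` is illustrative, and the search is a naive recursion intended only for small n.

```python
def invariant_factor_lists(n, bound=None):
    """Yield lists [n1, n2, ...] with product n, every ni >= 2 and n_{i+1} | n_i.

    `bound` is the previous factor; the next factor must divide it.
    """
    if n == 1:
        yield []
        return
    for d in range(2, n + 1):
        if n % d == 0 and (bound is None or bound % d == 0):
            for rest in invariant_factor_lists(n // d, d):
                yield [d] + rest

for factors in sorted(invariant_factor_lists(120)):
    print(" ⊕ ".join(f"Z/{d}Z" for d in factors))
```

For n = 120 the sequences found are [120], [60, 2] and [30, 2, 2]; a candidate such as [30, 4] is rejected because 4 does not divide 30.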

Version: 1 Owner: alozano Author(s): alozano

327.15 fundamental theorem of finitely generated abelian
groups

Theorem 2 (Fundamental Theorem of finitely generated abelian groups). Let G
be a finitely generated abelian group. Then there is a unique expression of the form:

G ≅ Z^r ⊕ Z/n_1Z ⊕ Z/n_2Z ⊕ . . . ⊕ Z/n_sZ

for some integers r, n_i satisfying:

r ≥ 0; n_i ≥ 2 for all i; n_{i+1} | n_i for 1 ≤ i ≤ s − 1

Version: 1 Owner: bwebste Author(s): alozano

327.16 conjugacy class

Let G be a group, and consider its operation (action) on itself given by conjugation, that is, the
mapping
(g, x) ↦ gxg^{-1}

Since conjugacy is an equivalence relation, we obtain a partition of G into equivalence classes,
called conjugacy classes. So, the conjugacy class of x ∈ G (denoted C_x or C(x)) is given by

C_x = {y ∈ G : y = gxg^{-1} for some g ∈ G}
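
As an illustration, the conjugacy classes of a small permutation group can be computed directly from this definition. The sketch below assumes the group is given as a list of permutations of {0, ..., n−1} written as tuples; the helper names are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition `group` into the classes {g x g^-1 : g in group}."""
    remaining = set(group)
    classes = []
    while remaining:
        x = min(remaining)  # deterministic choice of representative
        cls = frozenset(compose(compose(g, x), inverse(g)) for g in group)
        classes.append(cls)
        remaining -= cls
    return classes

S3 = list(permutations(range(3)))
print(sorted(len(c) for c in conjugacy_classes(S3)))  # class sizes: [1, 2, 3]
```

The three classes of S_3 are the identity, the two 3-cycles, and the three transpositions.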

Version: 2 Owner: drini Author(s): drini, apmxi

327.17 Frattini subgroup

Let G be a group. The Frattini subgroup Φ(G) of G is the intersection of all maximal subgroups
of G.

Equivalently, Φ(G) is the subgroup of non-generators of G.

Version: 1 Owner: Evandar Author(s): Evandar

327.18 non-generator

Let G be a group. An element g ∈ G is said to be a non-generator if whenever X is a generating set
for G then X ∖ {g} is also a generating set for G.

Version: 1 Owner: Evandar Author(s): Evandar

Chapter 328

20Exx – Structure and classification
of infinite or finite groups

328.1 faithful group action

Let A be a G-set, that is, a set on which a group G acts (or operates).

The map m_g : A → A defined as
m_g(x) = ψ(g, x)
where g ∈ G and ψ is the action, is a permutation of A (in other words, a bijective function
from A to itself) and so an element of S_A. We even get a homomorphism from G to S_A by the rule
g ↦ m_g.

If for any pair g, h ∈ G with g ≠ h we have m_g ≠ m_h, in other words, if the homomorphism g ↦ m_g
is injective, we say that the action is faithful.

Version: 3 Owner: drini Author(s): drini, apmxi

Chapter 329

20F18 – Nilpotent groups

329.1 classification of finite nilpotent groups

Let G be a finite group. The following are equivalent:

1. G is nilpotent.

2. Every subgroup of G is subnormal.

3. Every proper subgroup H < G is properly contained in its normalizer.

4. Every maximal subgroup is normal.

5. Every Sylow subgroup is normal.

6. G is a direct product of p-groups.

Version: 1 Owner: bwebste Author(s): bwebste

329.2 nilpotent group

We define the lower central series of a group G to be the filtration of subgroups

G = G_1 ⊃ G_2 ⊃ · · ·

defined inductively by:

G_1 := G,
G_i := [G_{i−1}, G], i > 1,

where [G_{i−1}, G] denotes the subgroup of G generated by all commutators of the form hkh^{-1}k^{-1}
where h ∈ G_{i−1} and k ∈ G. The group G is said to be nilpotent if G_i = 1 for some i.
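
For a small finite permutation group the lower central series can be computed directly from this definition. The sketch below is illustrative only: groups are sets of permutations of {0, ..., n−1} written as tuples, the helper names are hypothetical, and the dihedral group of order 8 is realised (as an assumption of the example) as the subgroup of S_4 generated by a 4-cycle and a reflection of the square.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """Subgroup of S_n generated by a finite set of permutations."""
    identity = tuple(range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return frozenset(seen)

def commutator_subgroup(H, G, n):
    """[H, G]: the subgroup generated by all h k h^-1 k^-1 with h in H, k in G."""
    comms = {compose(compose(h, k), compose(inverse(h), inverse(k)))
             for h in H for k in G}
    return generated_subgroup(comms, n)

def lower_central_series(G, n):
    """Return [G_1, G_2, ...], stopping once the series stabilises."""
    series = [frozenset(G)]
    while True:
        nxt = commutator_subgroup(series[-1], G, n)
        if nxt == series[-1]:
            return series
        series.append(nxt)

def is_nilpotent(G, n):
    return len(lower_central_series(G, n)[-1]) == 1

D4 = generated_subgroup([(1, 2, 3, 0), (0, 3, 2, 1)], 4)  # symmetries of a square
S3 = frozenset(permutations(range(3)))
print([len(H) for H in lower_central_series(D4, 4)], is_nilpotent(D4, 4))
print([len(H) for H in lower_central_series(S3, 3)], is_nilpotent(S3, 3))
```

The series of the dihedral group of order 8 reaches the trivial group, while the series of S_3 stabilises at its subgroup of order 3, so S_3 is not nilpotent.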

Nilpotent groups can also be equivalently defined by means of upper central series. For a
group G, the upper central series of G is the filtration of subgroups

C_1 ⊂ C_2 ⊂ · · ·

defined by setting C_1 to be the center of G, and inductively taking C_i to be the unique
subgroup of G such that C_i/C_{i−1} is the center of G/C_{i−1}, for each i > 1. The group G is
nilpotent if and only if G = C_i for some i.

Nilpotent groups are related to nilpotent Lie algebras in that a connected Lie group is nilpotent as
a group if and only if its corresponding Lie algebra is nilpotent. The analogy extends to
solvable groups as well: every nilpotent group is solvable, because the upper central series is
a filtration with abelian quotients.

Version: 3 Owner: djao Author(s): djao

Chapter 330

20F22 – Other classes of groups
defined by subgroup chains

330.1 inverse limit

Let $\{G_i\}_{i=0}^{\infty}$ be a sequence of groups which are related by a chain of surjective homomorphisms
$f_i : G_i \to G_{i-1}$:
$$G_0 \xleftarrow{\;f_1\;} G_1 \xleftarrow{\;f_2\;} G_2 \xleftarrow{\;f_3\;} G_3 \xleftarrow{\;f_4\;} \cdots$$

Definition 28. The inverse limit of $(G_i, f_i)$, denoted by
$$\varprojlim (G_i, f_i), \quad \text{or} \quad \varprojlim G_i,$$
is the subset of $\prod_{i=0}^{\infty} G_i$ formed by the elements
$$(g_0, g_1, g_2, g_3, \ldots), \quad \text{with } g_i \in G_i, \; f_i(g_i) = g_{i-1}.$$
Note: The inverse limit of the $G_i$ can be checked to be a subgroup of the product $\prod_{i=0}^{\infty} G_i$. See
below for a more general definition.

Examples:

1. Let p ∈ N be a prime. Let G_0 = {0} and G_i = Z/p^iZ. Define the connecting
homomorphisms f_i, for i ≥ 2, to be “reduction modulo p^{i−1}” i.e.
f_i : Z/p^iZ → Z/p^{i−1}Z
f_i(x mod p^i) = x mod p^{i−1}
which are obviously surjective homomorphisms. The inverse limit of (Z/p^iZ, f_i) is
called the p-adic integers and denoted by
$$\mathbb{Z}_p = \varprojlim \mathbb{Z}/p^i\mathbb{Z}$$

2. Let E be an elliptic curve defined over C. Let p be a prime and for any natural number
n write E[n] for the n-torsion group, i.e.

E[n] = {Q ∈ E | n · Q = O}

In this case we define G_i = E[p^i], and

f_i : E[p^i] → E[p^{i−1}], f_i(Q) = p · Q

The inverse limit of (E[p^i], f_i) is called the Tate module of E and denoted

$$T_p(E) = \varprojlim E[p^i]$$
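
To make the first example concrete, an ordinary integer x determines an element of the p-adic integers through its residues modulo p^i, and the compatibility condition f_i(g_i) = g_{i−1} can be checked on a finite truncation. The sketch below is illustrative; the function names are assumptions and only finitely many coordinates are ever inspected.

```python
def padic_truncations(x, p, depth):
    """Residues of the integer x in Z/p^i Z for i = 0, ..., depth (i = 0 is the trivial group)."""
    return [x % p**i for i in range(depth + 1)]

def is_compatible(seq, p):
    """Check f_i(g_i) = g_{i-1}, where f_i is reduction modulo p^(i-1)."""
    return all(seq[i] % p**(i - 1) == seq[i - 1] for i in range(1, len(seq)))

minus_one = padic_truncations(-1, 5, 3)
print(minus_one, is_compatible(minus_one, 5))    # [0, 4, 24, 124] True
print(is_compatible([0, 1, 7, 3], 5))            # False: 7 mod 5 is 2, not 1
```

The sequence (0, 4, 24, 124, ...) is the image of −1 in Z_5; an arbitrary sequence of residues is generally not compatible and so does not define an element of the inverse limit.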

The concept of inverse limit can be defined in far more generality. Let (S, ≤) be a directed set
and let C be a category. Let {G_α}_{α∈S} be a collection of objects in the category C and let

{f_{α,β} : G_β → G_α | α, β ∈ S, α ≤ β}

be a collection of morphisms satisfying:

1. For all α ∈ S, f_{α,α} = Id_{G_α}, the identity morphism.

2. For all α, β, γ ∈ S such that α ≤ β ≤ γ, we have f_{α,γ} = f_{α,β} ◦ f_{β,γ} (composition of
morphisms).

Definition 29. The inverse limit of ({G_α}_{α∈S}, {f_{α,β}}), denoted by

$$\varprojlim (G_\alpha, f_{\alpha,\beta}), \quad \text{or} \quad \varprojlim G_\alpha,$$
is defined to be the set of all $(g_\alpha) \in \prod_{\alpha \in S} G_\alpha$ such that for all α, β ∈ S

α ≤ β ⇒ f_{α,β}(g_β) = g_α

For a good example of this more general construction, see infinite Galois theory.

Version: 6 Owner: alozano Author(s): alozano

Chapter 331

20F28 – Automorphism groups of
groups

331.1 outer automorphism group

The outer automorphism group of a group is the quotient of its automorphism group by
its inner automorphism group:

Out(G) = Aut(G)/Inn(G).

Version: 7 Owner: Thomas Heye Author(s): yark, apmxi

Chapter 332

20F36 – Braid groups; Artin groups

332.1 braid group

Consider two sets of n points in C^2, of the form (1, 0), . . . , (n, 0), and
of the form (1, 1), . . . , (n, 1). We connect these two sets of points via a series of paths
f_i : I → C^2, such that f_i(t) ≠ f_j(t) for i ≠ j and any t ∈ [0, 1]. Also, each f_i may only
intersect the planes (0, z) and (1, z) for t = 0 and 1 respectively. Thus, the picture looks
like a bunch of strings connecting the two sets of points, but possibly tangled. The path
f = (f_1, . . . , f_n) determines a homotopy class [f], where we require homotopies to satisfy the
same conditions on the f_i. Such a homotopy class [f] is called a braid on n strands. We can
obtain a group structure on the set of braids on n strands as follows. Multiplication of two
braids [f], [g] is done by simply following f first, then g, but doing each twice as fast. That is,
[f] · [g] is the homotopy class of the path
$$(fg)(t) = \begin{cases} f(2t) & \text{if } 0 \le t \le 1/2 \\ g(2t - 1) & \text{if } 1/2 \le t \le 1 \end{cases}$$
where f and g are representatives for [f] and [g] respectively. Inverses are done by following
the same braid backwards, and the identity element is the braid represented by straight
lines down. The result is known as the braid group on n strands; it is denoted by B_n.

The braid group determines a homomorphism φ : B_n → S_n, where S_n is the symmetric group
on n letters. For [f] ∈ B_n, we get an element of S_n from the map sending i ↦ p_1(f_i(1)), where f
is a representative of the homotopy class [f], and p_1 is the projection onto the first factor. This
works because of our requirement on the points at which the braids start and end, and since our
homotopies fix basepoints. The kernel of φ consists of the braids that return each strand to
its original position. This kernel gives us the pure braid group on n strands, and is denoted
by P_n. Hence, we have a short exact sequence

1 → P_n → B_n → S_n → 1.

We can also describe braid groups as certain fundamental groups, and in more generality.
Let M be a manifold. The configuration space of n ordered points on M is defined to
be F_n(M) = {(a_1, . . . , a_n) ∈ M^n | a_i ≠ a_j for i ≠ j}. The group S_n acts on F_n(M) by
permuting coordinates, and the corresponding quotient space C_n(M) = F_n(M)/S_n is called
the configuration space of n unordered points on M. In the case that M = C, we obtain the
regular and pure braid groups as π_1(C_n(M)) and π_1(F_n(M)) respectively.

The group B_n can be given the following presentation. The presentation was given in Artin’s
first paper [1] on the braid group. Label the strands 1 through n as before. Let σ_i be the
braid that twists strands i and i + 1, with i passing beneath i + 1. Then the σ_i generate B_n,
and the only relations needed are

σ_i σ_j = σ_j σ_i for |i − j| ≥ 2, 1 ≤ i, j ≤ n − 1
σ_i σ_{i+1} σ_i = σ_{i+1} σ_i σ_{i+1} for 1 ≤ i ≤ n − 2

The pure braid group has a presentation with generators
$$a_{ij} = \sigma_{j-1}\sigma_{j-2}\cdots\sigma_{i+1}\,\sigma_i^{2}\,\sigma_{i+1}^{-1}\cdots\sigma_{j-2}^{-1}\sigma_{j-1}^{-1} \quad \text{for } 1 \le i < j \le n$$
and defining relations
$$a_{rs}^{-1} a_{ij} a_{rs} = \begin{cases}
a_{ij} & \text{if } i < r < s < j \text{ or } r < s < i < j \\
a_{rj} a_{ij} a_{rj}^{-1} & \text{if } r < i = s < j \\
a_{rj} a_{sj} a_{ij} a_{sj}^{-1} a_{rj}^{-1} & \text{if } i = r < s < j \\
a_{rj} a_{sj} a_{rj}^{-1} a_{sj}^{-1} a_{ij} a_{sj} a_{rj} a_{sj}^{-1} a_{rj}^{-1} & \text{if } r < i < s < j
\end{cases}$$

REFERENCES
1. E. Artin Theorie der Zöpfe. Abh. Math. Sem. Univ. Hamburg 4(1925), 42-72.
2. V.L. Hansen Braids and Coverings. London Mathematical Society Student Texts 18. Cambridge
University Press. 1989.

Version: 7 Owner: dublisk Author(s): dublisk

Chapter 333

20F55 – Reflection and Coxeter
groups

333.1 cycle

Let S be a set. A cycle is a permutation f of S (a bijective function of S onto itself) such that
there exist distinct elements a_1, a_2, . . . , a_k of S such that
f(a_i) = a_{i+1} for 1 ≤ i < k and f(a_k) = a_1
that is
f(a_1) = a_2
f(a_2) = a_3
⋮
f(a_k) = a_1

and f(x) = x for any other element of S.

This can also be pictured as
a_1 ↦ a_2 ↦ a_3 ↦ · · · ↦ a_k ↦ a_1
and
x ↦ x
for any other element x ∈ S, where ↦ represents the action of f.

One of the basic results on symmetric groups says that any permutation can be expressed
as a product of disjoint cycles.
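
The decomposition into disjoint cycles is easy to compute for a permutation of a finite set: follow each element until it returns to its starting point. The sketch below assumes the permutation is given as a dictionary mapping each element to its image; the name `disjoint_cycles` is illustrative, and fixed points are omitted from the output.

```python
def disjoint_cycles(perm):
    """Return the nontrivial disjoint cycles of a permutation given as a dict."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue
        # Follow start -> perm[start] -> ... until we come back to start.
        cycle = [start]
        seen.add(start)
        nxt = perm[start]
        while nxt != start:
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        if len(cycle) > 1:
            cycles.append(tuple(cycle))
    return cycles

f = {1: 3, 2: 2, 3: 5, 4: 6, 5: 1, 6: 4}
print(disjoint_cycles(f))  # [(1, 3, 5), (4, 6)]
```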

Version: 6 Owner: drini Author(s): drini

333.2 dihedral group

The nth dihedral group, D_n, is the symmetry group of the regular n-sided polygon. The
group consists of n reflections, n − 1 rotations, and the identity transformation. Letting
ω = exp(2πi/n) denote a primitive nth root of unity, and assuming the polygon is centered
at the origin, the rotations R_k, k = 0, . . . , n − 1 (Note: R_0 denotes the identity) are given
by
R_k : z ↦ ω^k z, z ∈ C,
and the reflections M_k, k = 0, . . . , n − 1 by
M_k : z ↦ ω^k z̄, z ∈ C.
The abstract group structure is given by
R_k R_l = R_{k+l}, R_k M_l = M_{k+l}
M_k M_l = R_{k−l}, M_k R_l = M_{k−l},
where the addition and subtraction is carried out modulo n.
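
These four composition rules can be checked numerically by realising the rotations and reflections as maps on complex numbers. The sketch below is a floating-point sanity check rather than a proof; the function names and the sample points are illustrative choices.

```python
import cmath

def rotation(k, n):
    """R_k : z -> w^k z with w = exp(2πi/n)."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return lambda z: w * z

def reflection(k, n):
    """M_k : z -> w^k * conj(z)."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return lambda z: w * z.conjugate()

def same_map(f, g, samples=(1 + 0j, 0.3 + 0.7j, -2 + 1.5j)):
    """Compare two maps on a few sample points, up to rounding error."""
    return all(abs(f(z) - g(z)) < 1e-9 for z in samples)

def check_relations(n):
    """Verify the four multiplication rules of D_n, with indices taken modulo n."""
    for k in range(n):
        for l in range(n):
            Rk, Rl = rotation(k, n), rotation(l, n)
            Mk, Ml = reflection(k, n), reflection(l, n)
            ok = (
                same_map(lambda z: Rk(Rl(z)), rotation((k + l) % n, n))
                and same_map(lambda z: Rk(Ml(z)), reflection((k + l) % n, n))
                and same_map(lambda z: Mk(Ml(z)), rotation((k - l) % n, n))
                and same_map(lambda z: Mk(Rl(z)), reflection((k - l) % n, n))
            )
            if not ok:
                return False
    return True

print(check_relations(5), check_relations(6))
```

Here a product such as R_k M_l is read as "apply M_l first, then R_k", which is the convention under which the stated rules hold.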

The group can also be described in terms of generators and relations as
(M_0)^2 = (M_1)^2 = (M_1 M_0)^n = id.
This means that D_n is a rank-2 Coxeter group.

Since the group acts by linear transformations
(x, y) → (x̂, ŷ), (x, y) ∈ R^2
there is a corresponding action on polynomials p → p̂, defined by
p̂(x̂, ŷ) = p(x, y), p ∈ R[x, y].
The polynomials left invariant by all the group transformations form an algebra. This algebra
is freely generated by the following two basic invariants:
$$x^2 + y^2, \qquad x^n - \binom{n}{2} x^{n-2} y^2 + \cdots,$$
the latter polynomial being the real part of (x + iy)^n. It is easy to check that these two
polynomials are invariant. The first polynomial describes the squared distance of a point from the
origin, and this is unaltered by rotations about the origin and by reflections in lines through the origin. The second
polynomial is unaltered by a rotation through 2π/n radians, and is also invariant with respect to
complex conjugation. These two transformations generate the nth dihedral group. Showing
that these two invariants polynomially generate the full algebra of invariants is somewhat
trickier, and is best done as an application of Chevalley’s theorem regarding the invariants
of a finite reflection group.

Version: 8 Owner: rmilson Author(s): rmilson

Chapter 334

20F65 – Geometric group theory

334.1 groups that act freely on trees are free

Let X be a tree, and Γ a group acting freely and faithfully by graph automorphisms on X.
Then Γ is a free group.

Since Γ acts freely on X, the quotient graph X/Γ is well-defined, and X is the universal cover
of X/Γ since X is contractible. Thus Γ ≅ π_1(X/Γ). Since any graph is homotopy equivalent
to a wedge of circles, and the fundamental group of such a space is free by Van Kampen’s theorem,
Γ is free.

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 335

20F99 – Miscellaneous

335.1 perfect group

A group G is called perfect if G = [G, G], where [G, G] is the derived subgroup of G, or
equivalently, if the abelianization of G is trivial.

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 336

20G15 – Linear algebraic groups over
arbitrary fields

336.1 Nagao’s theorem

For any integral domain k, the group of n×n invertible matrices with coefficients in k[t] is the
amalgamated free product of invertible matrices over k and invertible upper triangular matrices
over k[t], amalgamated over the upper triangular matrices of k. More compactly

GL_n(k[t]) ≅ GL_n(k) ∗_{B(k)} B(k[t]).

Version: 3 Owner: bwebste Author(s): bwebste

336.2 computation of the order of GL(n, Fq )

GL(n, F_q) is the group of n × n matrices over a finite field F_q with non-zero determinant.
Here is a proof that |GL(n, F_q)| = (q^n − 1)(q^n − q) · · · (q^n − q^{n−1}).

Each element A ∈ GL(n, F_q) is given by a collection of n column vectors that are linearly independent over F_q. If
one chooses the first column vector of A from (F_q)^n there are q^n choices, but one can’t choose
the zero vector since this would make the determinant of A zero. So there are really only
(q^n − 1) choices. To choose an i-th vector from (F_q)^n which is linearly independent from the (i − 1)
already chosen linearly independent vectors {V_1, · · · , V_{i−1}} one must choose a vector not in
the span of {V_1, · · · , V_{i−1}}. There are q^{i−1} vectors in this span, so the number of choices is
(q^n − q^{i−1}). Thus the number of ordered linearly independent collections of n vectors in (F_q)^n
is: (q^n − 1)(q^n − q) · · · (q^n − q^{n−1}).
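
The formula can be compared against a direct count for very small cases. The sketch below evaluates the product and, separately, counts invertible matrices by brute force. Two assumptions are made for the brute-force part: q is a prime p, so that F_p can be modelled by integers modulo p, and Python 3.8 or later is available for the modular inverse `pow(x, -1, p)`. All function names are illustrative.

```python
from itertools import product

def gl_order(n, q):
    """(q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    result = 1
    for i in range(n):
        result *= q**n - q**i
    return result

def det_mod_p(matrix, p):
    """Determinant modulo a prime p, by Gaussian elimination over F_p."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap changes the sign
        det = det * m[col][col] % p
        inv = pow(m[col][col], -1, p)
        for r in range(col + 1, n):
            factor = m[r][col] * inv % p
            for c in range(col, n):
                m[r][c] = (m[r][c] - factor * m[col][c]) % p
    return det % p

def count_invertible(n, p):
    """Count n x n matrices over F_p with non-zero determinant (tiny cases only)."""
    count = 0
    for entries in product(range(p), repeat=n * n):
        matrix = [list(entries[i * n:(i + 1) * n]) for i in range(n)]
        if det_mod_p(matrix, p) != 0:
            count += 1
    return count

print(gl_order(2, 3), count_invertible(2, 3))  # 48 48
```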

Version: 5 Owner: benjaminfjones Author(s): benjaminfjones

336.3 general linear group

Given a vector space V, the general linear group GL(V) is defined to be the group of
invertible linear transformations from V to V. The group operation is defined by
composition: given T : V −→ V and T′ : V −→ V in GL(V), the product TT′ is just the
composition of the maps T and T′.

If V = F^n for some field F, then the group GL(V) is often denoted GL(n, F) or GL_n(F).
In this case, if one identifies each linear transformation T : V −→ V with its matrix with
respect to the standard basis, the group GL(n, F) becomes the group of invertible n × n
matrices with entries in F, under the group operation of matrix multiplication.

Version: 3 Owner: djao Author(s): djao

336.4 order of the general linear group over a finite
field

GL(n, F_q) is a finite group when F_q is a finite field with q elements. Furthermore, |GL(n, F_q)| =
(q^n − 1)(q^n − q) · · · (q^n − q^{n−1}).

Version: 16 Owner: benjaminfjones Author(s): benjaminfjones

336.5 special linear group

Given a vector space V , the special linear group SL(V ) is defined to be the subgroup of the
general linear group GL(V ) consisting of all invertible linear transformations T : V −→ V
in GL(V ) that have determinant 1.

If V = F^n for some field F, then the group SL(V) is often denoted SL(n, F) or SL_n(F), and if
one identifies each linear transformation with its matrix with respect to the standard basis,
then SL(n, F) consists of all n × n matrices with entries in F that have determinant 1.

Version: 2 Owner: djao Author(s): djao

Chapter 337

20G20 – Linear algebraic groups over
the reals, the complexes, the
quaternions

337.1 orthogonal group

Let Q be a non-degenerate symmetric bilinear form over the real vector space V = R^n. A linear transformation
T : V −→ V is said to preserve Q if Q(Tx, Ty) = Q(x, y) for all vectors x, y ∈ V. The
subgroup of the general linear group GL(V) consisting of all linear transformations that
preserve Q is called the orthogonal group with respect to Q, and denoted O(n, Q).

If Q is also positive definite (i.e., Q is an inner product), then O(n, Q) is isomorphic to the
group of invertible linear transformations that preserve the standard inner product on R^n,
and in this case it is usually denoted O(n). One can show that a transformation T is in O(n)
if and only if T^{-1} = T^T (the inverse of T equals the transpose of T).

Version: 2 Owner: djao Author(s): djao

Chapter 338

20G25 – Linear algebraic groups over
local fields and their integers

338.1 Ihara’s theorem

Let Γ be a discrete, torsion-free subgroup of SL_2(Q_p) (where Q_p is the field of p-adic numbers).
Then Γ is free.

[Proof, or a sketch thereof] There exists a (p + 1)-regular tree X on which SL_2(Q_p) acts, with
stabilizer SL_2(Z_p) (here, Z_p denotes the ring of p-adic integers). Since Z_p is compact in its
profinite topology, so is SL_2(Z_p). Thus, SL_2(Z_p) ∩ Γ must be compact, discrete and torsion-free.
Since compact and discrete implies finite, the only such group is trivial. Thus, Γ acts freely
on X. Since groups acting freely on trees are free, Γ is free.

Version: 6 Owner: bwebste Author(s): bwebste

Chapter 339

20G40 – Linear algebraic groups over
finite fields

339.1 SL2(F3)

The special linear group over the finite field F3 is represented by SL2 (F3 ) and consists of the
2 × 2 invertible matrices with determinant equal to 1 and whose entries belong to F3 .

Version: 6 Owner: drini Author(s): drini, apmxi

Chapter 340

20J06 – Cohomology of groups

340.1 group cohomology

Let G be a group and let M be a (left) G-module. The 0th cohomology group of the
G-module M is
H^0(G, M) = {m ∈ M : ∀σ ∈ G, σm = m}
which is the set of elements of M which are G-invariant, also denoted by M^G.

A map φ : G → M is said to be a crossed homomorphism (or 1-cocycle) if
φ(αβ) = φ(α) + αφ(β)
for all α, β ∈ G. If we fix m ∈ M, the map ρ : G → M defined by
ρ(α) = αm − m
is clearly a crossed homomorphism, said to be principal (or 1-coboundary). We define
the following groups:
Z^1(G, M) = {φ : G → M : φ is a 1-cocycle}
B^1(G, M) = {ρ : G → M : ρ is a 1-coboundary}
Finally, the 1st cohomology group of the G-module M is defined to be the quotient group:
H^1(G, M) = Z^1(G, M)/B^1(G, M)

The following proposition is very useful when trying to compute cohomology groups:
Proposition 1. Let G be a group and let A, B, C be G-modules related by an exact sequence:
0 → A → B → C → 0
Then there is a long exact sequence in cohomology:
0 → H^0(G, A) → H^0(G, B) → H^0(G, C) → H^1(G, A) → H^1(G, B) → H^1(G, C)

In general, the cohomology groups H^n(G, M) can be defined as follows:
Definition 30. Define C^0(G, M) = M and for n ≥ 1 define the additive group:
C^n(G, M) = {φ : G^n → M}
The elements of C^n(G, M) are called n-cochains. Also, for n ≥ 0 define the nth coboundary
homomorphism d_n : C^n(G, M) → C^{n+1}(G, M):
$$d_n(f)(g_1, \ldots, g_{n+1}) = g_1 \cdot f(g_2, \ldots, g_{n+1}) + \sum_{i=1}^{n} (-1)^i f(g_1, \ldots, g_{i-1}, g_i g_{i+1}, g_{i+2}, \ldots, g_{n+1}) + (-1)^{n+1} f(g_1, \ldots, g_n)$$
Let Z^n(G, M) = ker d_n for n ≥ 0, the set of n-cocycles. Also, let B^0(G, M) = 0 and for n ≥ 1
let B^n(G, M) = image d_{n−1}, the set of n-coboundaries.

Finally we define the nth cohomology group of G with coefficients in M to be
H^n(G, M) = Z^n(G, M)/B^n(G, M)

REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. James Milne, Elliptic Curves, online course notes.
3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.

Version: 4 Owner: alozano Author(s): alozano

340.2 stronger Hilbert theorem 90

Let K be a field and let K̄ be an algebraic closure of K. By K̄^+ we denote the abelian group
(K̄, +) and similarly K̄^* = (K̄ ∖ {0}, ·) (here the operation is multiplication). Also we let
G_{K̄/K} = Gal(K̄/K)
be the absolute Galois group of K.

Theorem 3 (Hilbert 90). Let K be a field.

1. H^1(G_{K̄/K}, K̄^+) = 0

2. H^1(G_{K̄/K}, K̄^*) = 0

3. If char(K), the characteristic of K, does not divide m (or char(K) = 0) then

H^1(G_{K̄/K}, μ_m) ≅ K^*/K^{*m}

where μ_m denotes the set of all mth roots of unity.

REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. J.P. Serre, Local Fields, Springer-Verlag, New York.

Version: 2 Owner: alozano Author(s): alozano

Chapter 341

20J15 – Category of groups

341.1 variety of groups

A variety of groups is the class of groups G such that all elements x_1, . . . , x_n ∈ G satisfy a set
of equationally defined relations
r_i(x_1, . . . , x_n) = 1 ∀i ∈ I,
where I is an index set.

For example, abelian groups are a variety defined by the equations
{[x_1, x_2] = 1},
where [x, y] = xyx^{-1}y^{-1}.

Nilpotent groups of class < c are a variety defined by
{[[· · · [[x_1, x_2], x_3] · · · ], x_c] = 1}.

Analogously, solvable groups of length < c are a variety. Abelian groups are a special case
of both of these.

Groups of exponent dividing n are a variety, defined by {x_1^n = 1}.

A variety of groups is a full subcategory of the category of groups, and there is a free group
on any set of elements in the variety, which is the usual free group modulo the relations of the
variety applied to all elements. This satisfies the usual universal property of the free group
on groups in the variety, and is thus left adjoint to the forgetful functor to the category of sets.
In the variety of abelian groups, we get back the usual free abelian groups. In the variety of
groups of exponent dividing n, we get the Burnside groups.

Version: 1 Owner: bwebste Author(s): bwebste

Chapter 342

20K01 – Finite abelian groups

342.1 Schinzel’s theorem

Let a ∈ Q, not zero or 1 or −1. For any prime p which does not divide the numerator or
denominator of a in reduced form, a can be viewed as an element of the multiplicative group
(Z/pZ)^×. Let n_p be the order of this element in the multiplicative group.

Then the set of n_p over all such primes has finite complement in the set of positive integers.

One can generalize this as follows:

If K is a number field, choose a not zero or a root of unity in K. Then for any finite
place (discrete valuation) p with v_p(a) = 0, we can view a as an element of the residue field
at p, and take the order n_p of this element in the multiplicative group.

Then the set of n_p over all such places has finite complement in the set of positive integers.

Silverman also generalized this to elliptic curves over number fields.

References to come soon.

Version: 4 Owner: mathcam Author(s): Manoj, nerdy2

Chapter 343

20K10 – Torsion groups, primary
groups and generalized primary
groups

343.1 torsion

Definition 31. The torsion of a group G is the set

Tor(G) = {g ∈ G : g^n = e for some positive integer n}.

Definition 32. A group is said to be torsion-free if Tor(G) = {e}, i.e. the torsion consists only of the identity element.

Definition 33. If G is abelian then Tor(G) is a subgroup of G, called the torsion group of G.
Example 18 (Torsion of a cyclic group). For a cyclic group Z_p, Tor(Z_p) = Z_p.

In general, if G is a finite group then Tor(G) = G.

Version: 2 Owner: mhale Author(s): mhale

Chapter 344

20K25 – Direct sums, direct products,
etc.

344.1 direct product of groups

The external direct product G × H of two groups G and H is defined to be the set of
ordered pairs (g, h), with g ∈ G and h ∈ H. The group operation is defined by

(g, h)(g′, h′) = (gg′, hh′)

It can be shown that G × H obeys the group axioms. More generally, we can define the
external direct product of n groups, in the obvious way. Let G = G1 × . . . × Gn be the
set of all ordered n-tuples {(g1 , g2 . . . , gn ) | gi ∈ Gi } and define the group operation by
componentwise multiplication as before.

Version: 4 Owner: vitriol Author(s): vitriol

Chapter 345

20K99 – Miscellaneous

345.1 Klein 4-group

The Klein 4-group is the subgroup V (Vierergruppe) of S4 (see symmetric group) consisting
of the following 4 permutations:
(), (12)(34), (13)(24), (14)(23).
(see cycle notation). This is an abelian group, isomorphic to the product Z/2Z × Z/2Z. The
group is named after Felix Klein, a pioneering figure in the field of geometric group theory.

The Klein 4-group enjoys a number of interesting properties, some of which are listed below.

1. It is the automorphism group of the graph consisting of two disjoint edges.
2. It is the unique 4-element group (up to isomorphism) in which every element is its own inverse, that is, x² = e for all x.
3. It is the symmetry group of a planar ellipse.
4. Consider the action of S4, the permutation group of 4 elements, on the set of partitions
of {1, 2, 3, 4} into two groups of two elements. There are 3 such partitions, which we denote by
(12, 34) (13, 24) (14, 23).
Thus, the action of S4 on these partitions induces a homomorphism from S4 to S3; the
kernel is the Klein 4-group. This homomorphism is quite exceptional, and corresponds
to the fact that A4 (the alternating group) is not a simple group (notice that V is
actually a subgroup of A4). The alternating groups An with n ≥ 5 are all simple.
5. A more geometric way to see the above is the following: S4 is the group of symmetries
of a tetrahedron. There is an induced action of S4 on the six edges of the tetrahedron.
Observing that this action preserves incidence relations one gets an action of S4 on the
three pairs of opposite edges (see figure).
6. It is the symmetry group of the Riemannian curvature tensor.

[Figure: a tetrahedron with vertices labeled 1, 2, 3, 4.]

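
The action of S4 on the three partitions of {1, 2, 3, 4} into pairs can be checked by brute force. The following Python sketch (the names `act`, `compose` and `kernel` are illustrative assumptions) enumerates S4, keeps the permutations that fix every partition, and confirms that this kernel has four elements, each of which squares to the identity.

```python
from itertools import permutations

points = (1, 2, 3, 4)

# The three partitions of {1, 2, 3, 4} into two pairs.
partitions = [
    frozenset({frozenset({1, 2}), frozenset({3, 4})}),
    frozenset({frozenset({1, 3}), frozenset({2, 4})}),
    frozenset({frozenset({1, 4}), frozenset({2, 3})}),
]

def act(perm, partition):
    """Apply a permutation (dict: point -> image) to a partition."""
    return frozenset(frozenset(perm[x] for x in block) for block in partition)

def compose(g, h):
    """Composite permutation x -> g(h(x))."""
    return {x: g[h[x]] for x in points}

S4 = [dict(zip(points, image)) for image in permutations(points)]
identity = {x: x for x in points}

# Kernel of the induced map S4 -> S3: permutations fixing all three partitions.
kernel = [g for g in S4 if all(act(g, P) == P for P in partitions)]

print(len(kernel))
print(all(compose(g, g) == identity for g in kernel))
```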
Version: 7 Owner: rmilson Author(s): Dr Absentius, rmilson, imran

345.2 divisible group

An abelian group D is said to be divisible if for any x ∈ D, n ∈ Z+, there exists an element
x′ ∈ D such that nx′ = x.

Some noteworthy facts:

• An abelian group is injective (as a Z-module) if and only if it is divisible.
• Every group is isomorphic to a subgroup of a divisible group.

• Any divisible abelian group is isomorphic to the direct sum of its torsion subgroup and
n copies of the group of rationals (for some cardinal number n).

Version: 4 Owner: mathcam Author(s): mathcam

345.3 example of divisible group

Let G denote the group of rational numbers taking the operation to be addition. Then for
any p/q ∈ G and n ∈ Z+, we have p/(nq) ∈ G satisfying n · p/(nq) = p/q, so the group is divisible.
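
As a small numerical illustration, Python's exact rational type can be used to produce the element p/(nq). The helper name `divide_in_Q` is an assumption for this sketch, not notation from the entry.

```python
from fractions import Fraction

def divide_in_Q(x, n):
    """Return y in Q with n * y == x, for a positive integer n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return Fraction(x) / n

x = Fraction(3, 7)
for n in (1, 2, 5, 12):
    y = divide_in_Q(x, n)
    assert n * y == x      # y really is an "n-th part" of x
    print(n, y)
```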

Version: 1 Owner: mathcam Author(s): mathcam

345.4 locally cyclic group

A locally cyclic (or generalized cyclic) group is a group in which any pair of elements generates
a cyclic subgroup.

Every locally cyclic group is abelian.

If G is a locally cyclic group, then every finite subset of G generates a cyclic subgroup.
Therefore, the only finitely-generated locally cyclic groups are the cyclic groups themselves.
The group (Q, +) is an example of a locally cyclic group that is not cyclic.

Subgroups and quotients of locally cyclic groups are also locally cyclic.

A group is locally cyclic if and only if its lattice of subgroups is distributive.

Version: 10 Owner: yark Author(s): yark

Chapter 346

20Kxx – Abelian groups

346.1 abelian group

Let (G, ∗) be a group. If for any a, b ∈ G we have a ∗ b = b ∗ a, we say that the group is
abelian. Sometimes the expression commutative group is used, but this is less frequent.

Abelian groups have several interesting properties.
Theorem 4. If ϕ : G → G defined by ϕ(x) = x² is a homomorphism, then G is abelian.

Proof. If ϕ is a homomorphism, then for all x, y ∈ G we have

(xy)² = ϕ(xy) = ϕ(x)ϕ(y) = x²y²,

that is, xyxy = xxyy. Left-multiplying by x⁻¹ and right-multiplying by y⁻¹ we are led to
yx = xy and thus the group is abelian. QED
Theorem 5. Any subgroup of an abelian group is normal.

Proof. Let H be a subgroup of the abelian group G. Since ah = ha for any a ∈ G and any
h ∈ H we get aH = Ha. That is, H is normal in G. QED
Theorem 6. Quotient groups of abelian groups are also abelian.

Proof. Let H be a subgroup of G. Since G is abelian, H is normal and we can form the quotient
group G/H whose elements are the equivalence classes for the relation a ∼ b if ab⁻¹ ∈ H.

The operation on the quotient group is given by aH · bH = (ab)H. But bH · aH = (ba)H =
(ab)H, therefore the quotient group is also commutative. QED

Version: 12 Owner: drini Author(s): drini, yark, akrowne, apmxi

Chapter 347

20M10 – General structure theory

347.1 existence of maximal semilattice decomposition

Let S be a semigroup. A maximal semilattice decomposition for S is a surjective
homomorphism φ : S → Γ onto a semilattice Γ with the property that any other semilattice
decomposition factors through φ. So if φ′ : S → Γ′ is any other semilattice decomposition of
S, then there is a homomorphism Γ → Γ′ such that the following diagram commutes:

    S ───φ───> Γ
     \         |
      φ′       |
       \       v
        `────> Γ′

Proposition 14. Every semigroup has a maximal semilattice decomposition.

Recall that each semilattice decomposition determines a semilattice congruence. If {ρ_i |
i ∈ I} is the family of all semilattice congruences on S, then define ρ = ⋂_{i∈I} ρ_i. (Here, we
consider the congruences as subsets of S × S, and take their intersection as sets.)

It is easy to see that ρ is also a semilattice congruence, which is contained in all other
semilattice congruences.

Therefore each of the homomorphisms S → S/ρi factors through S → S/ρ.

Version: 2 Owner: mclase Author(s): mclase

347.2 semilattice decomposition of a semigroup

A semigroup S has a semilattice decomposition if we can write S = ⋃_{γ∈Γ} S_γ as a disjoint
union of subsemigroups, indexed by elements of a semilattice Γ, with the additional condition
that x ∈ S_α and y ∈ S_β implies xy ∈ S_αβ.

Semilattice decompositions arise from homomorphisms of semigroups onto semilattices. If
φ : S → Γ is a surjective homomorphism, then it is easy to see that we get a semilattice
decomposition by putting S_γ = φ⁻¹(γ) for each γ ∈ Γ. Conversely, every semilattice
decomposition defines a map from S to the indexing set Γ which is easily seen to be a
homomorphism.

A third way to look at semilattice decompositions is to consider the congruence ρ defined
by the homomorphism φ : S → Γ. Because Γ is a semilattice, φ(x²) = φ(x) for all x, and
so ρ satisfies the constraint that x ρ x² for all x ∈ S. Also, φ(xy) = φ(yx) so that xy ρ yx
for all x, y ∈ S. A congruence ρ which satisfies these two conditions is called a semilattice
congruence.

Conversely, a semilattice congruence ρ on S gives rise to a homomorphism from S to a
semilattice S/ρ. The ρ-classes are the components of the decomposition.

Version: 3 Owner: mclase Author(s): mclase

347.3 simple semigroup

Let S be a semigroup. If S has no ideals other than itself, then S is said to be simple.

If S has no left ideals [resp. right ideals] other than itself, then S is said to be left simple
[resp. right simple].

Right simple and left simple are stronger conditions than simple.

A semigroup S is left simple if and only if Sa = S for all a ∈ S. A semigroup is both left
and right simple if and only if it is a group.

If S has a zero element θ, then 0 = {θ} is always an ideal of S, so S is not simple (unless it
has only one element). So in studying semigroups with a zero, a slightly weaker definition is
required.

Let S be a semigroup with a zero. Then S is zero simple, or 0-simple, if the following
conditions hold:

• S² ≠ 0
• S has no ideals except 0 and S itself

The condition S² ≠ 0 really only eliminates one semigroup: the 2-element null semigroup.
Excluding this semigroup makes parts of the structure theory of semigroups cleaner.

Version: 1 Owner: mclase Author(s): mclase

Chapter 348

20M12 – Ideal theory

348.1 Rees factor

Let I be an ideal of a semigroup S. Define a congruence ∼ by x ∼ y iff x = y or x, y ∈ I.

Then the Rees factor of S by I is the quotient S/ ∼. As a matter of notation, the
congruence ∼ is normally suppressed, and the quotient is simply written S/I.

Note that a Rees factor always has a zero element. Intuitively, the quotient identifies all
elements of I, and the resulting element is a zero element.

Version: 1 Owner: mclase Author(s): mclase

348.2 ideal

Let S be a semigroup. An ideal of S is a non-empty subset of S which is closed under
multiplication on either side by elements of S. Formally, I is an ideal of S if I is non-empty,
and for all x ∈ I and s ∈ S, we have sx ∈ I and xs ∈ I.

One-sided ideals are defined similarly. A non-empty subset A of S is a left ideal (resp.
right ideal) of S if for all a ∈ A and s ∈ S, we have sa ∈ A (resp. as ∈ A).

A principal left ideal of S is a left ideal generated by a single element. If a ∈ S, then the
principal left ideal of S generated by a is S¹a = Sa ∪ {a}. (The notation S¹ is explained
in the entry on adjoining an identity to a semigroup.)

Similarly, the principal right ideal generated by a is aS¹ = aS ∪ {a}.

The notations L(a) and R(a) are also common for the principal left and right ideals generated
by a, respectively.

A principal ideal of S is an ideal generated by a single element. The ideal generated by a
is

S¹aS¹ = SaS ∪ Sa ∪ aS ∪ {a}.

The notation J(a) = S¹aS¹ is also common.

Version: 5 Owner: mclase Author(s): mclase

Chapter 349

20M14 – Commutative semigroups

349.1 Archimedean semigroup

Let S be a commutative semigroup. We say an element x divides an element y, written
x | y, if there is an element z such that xz = y.

An Archimedean semigroup S is a commutative semigroup with the property that for all
x, y ∈ S there is a natural number n such that x | y^n.

This is related to the Archimedean property of positive real numbers R+ : if x, y > 0 then
there is a natural number n such that x < ny. Except that the notation is additive rather
than multiplicative, this is the same as saying that (R+ , +) is an Archimedean semigroup.

Version: 1 Owner: mclase Author(s): mclase

349.2 commutative semigroup

A semigroup S is commutative if the defining binary operation is commutative. That is,
for all x, y ∈ S, the identity xy = yx holds.

Although the term Abelian semigroup is sometimes used, it is more common simply to
refer to such semigroups as commutative semigroups.

A monoid which is also a commutative semigroup is called a commutative monoid.

Version: 1 Owner: mclase Author(s): mclase

Chapter 350

20M20 – Semigroups of
transformations, etc.

350.1 semigroup of transformations

Let X be a set. A transformation of X is a function from X to X.

If α and β are transformations on X, then their product αβ is defined (writing functions on
the right) by (x)(αβ) = ((x)α)β.

With this definition, the set of all transformations on X becomes a semigroup, the full
semigroup of transformations on X, denoted T_X.

More generally, a semigroup of transformations is any subsemigroup of a full semigroup of
transformations.

When X is finite, say X = {x₁, x₂, ..., xₙ}, then the transformation α which maps xᵢ to yᵢ
(with yᵢ ∈ X, of course) is often written:

α = ( x₁ x₂ ... xₙ )
    ( y₁ y₂ ... yₙ )

With this notation it is quite easy to calculate products. For example, if X = {1, 2, 3, 4},
then

( 1 2 3 4 ) ( 1 2 3 4 )   ( 1 2 3 4 )
( 3 2 1 2 ) ( 2 3 3 4 ) = ( 3 3 2 3 )
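
The product above can be reproduced mechanically. In this Python sketch a transformation is stored as a dictionary from points to images, and the helper name `compose_right` is an assumption chosen to reflect the convention of writing functions on the right.

```python
def compose_right(alpha, beta):
    """Product alpha*beta with functions on the right: (x)(alpha beta) = ((x)alpha)beta."""
    return {x: beta[alpha[x]] for x in alpha}

# The two transformations of X = {1, 2, 3, 4} from the example.
alpha = {1: 3, 2: 2, 3: 1, 4: 2}
beta = {1: 2, 2: 3, 3: 3, 4: 4}

result = compose_right(alpha, beta)
print([result[x] for x in (1, 2, 3, 4)])   # bottom row of the product: [3, 3, 2, 3]
```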

When X is infinite, say X = {1, 2, 3, ...}, then this notation is still useful for illustration
in cases where the transformation pattern is apparent. For example, if α ∈ T_X is given by
α : n ↦ n + 1, we can write

α = ( 1 2 3 4 ... )
    ( 2 3 4 5 ... )

Version: 3 Owner: mclase Author(s): mclase

Chapter 351

20M30 – Representation of
semigroups; actions of semigroups on
sets

351.1 counting theorem

Given a group action of a finite group G on a set X, the following expression gives the
number of distinct orbits:

(1/|G|) Σ_{g∈G} stab_g(X),

where stab_g(X) is the number of elements fixed by the action of g.
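
A minimal sketch of the formula in Python, under an assumed example that is not part of the entry: the cyclic group of rotations of 4 beads acting on the 16 two-colourings of a necklace. The average number of fixed points is compared with a direct enumeration of the orbits.

```python
from itertools import product
from fractions import Fraction

n = 4
X = list(product((0, 1), repeat=n))   # all 2-colourings of n beads
G = range(n)                          # rotation by k positions, k = 0..n-1

def rotate(k, x):
    """Action of the rotation k on a colouring x."""
    return x[k:] + x[:k]

# Number of colourings fixed by each group element.
fixed_counts = [sum(1 for x in X if rotate(k, x) == x) for k in G]
by_formula = Fraction(sum(fixed_counts), len(G))

# Direct count of the orbits, for comparison.
orbits = {frozenset(rotate(k, x) for k in G) for x in X}

print(fixed_counts)
print(by_formula, len(orbits))
```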

Version: 8 Owner: mathcam Author(s): Larry Hammick, vitriol

351.2 example of group action

Let a, b, c be integers and let [a, b, c] denote the mapping

[a, b, c] : Z × Z → Z, (x, y) ↦ ax² + bxy + cy².

Let G be the group of 2 × 2 integer matrices A = (aᵢⱼ) with det A = ±1. For A ∈ G, the
substitution

(x, y)ᵗ ↦ A · (x, y)ᵗ

leads to

[a, b, c](a₁₁x + a₁₂y, a₂₁x + a₂₂y) = a′x² + b′xy + c′y²,

where

a′ = a · a₁₁² + b · a₁₁a₂₁ + c · a₂₁²      (351.2.1)
b′ = 2a · a₁₁a₁₂ + 2c · a₂₁a₂₂ + b(a₁₁a₂₂ + a₁₂a₂₁)
c′ = a · a₁₂² + b · a₁₂a₂₂ + c · a₂₂²

So we define

[a, b, c] ∗ A := [a′, b′, c′]

to be the binary quadratic form with coefficients a′, b′, c′ of x², xy, y², respectively, as in
(351.2.1). Putting in the identity matrix

A = ( 1 0 )
    ( 0 1 )

we have [a, b, c] ∗ A = [a, b, c] for any binary quadratic form [a, b, c]. Now let B be another
matrix in G. We must show that

[a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B.
Set [a, b, c] ∗ (AB) := [a″, b″, c″]. So we have

a″ = a · (a₁₁b₁₁ + a₁₂b₂₁)² + c · (a₂₁b₁₁ + a₂₂b₂₁)² + b · (a₁₁b₁₁ + a₁₂b₂₁)(a₂₁b₁₁ + a₂₂b₂₁)      (351.2.2)
   = a′ · b₁₁² + c′ · b₂₁² + (2a · a₁₁a₁₂ + 2c · a₂₁a₂₂ + b(a₁₁a₂₂ + a₁₂a₂₁)) b₁₁b₂₁

c″ = a · (a₁₁b₁₂ + a₁₂b₂₂)² + c · (a₂₁b₁₂ + a₂₂b₂₂)² + b · (a₁₁b₁₂ + a₁₂b₂₂)(a₂₁b₁₂ + a₂₂b₂₂)      (351.2.3)
   = a′ · b₁₂² + c′ · b₂₂² + (2a · a₁₁a₁₂ + 2c · a₂₁a₂₂ + b(a₁₁a₂₂ + a₁₂a₂₁)) b₁₂b₂₂

as desired. For the coefficient b″ we get

b″ = 2a · (a₁₁b₁₁ + a₁₂b₂₁)(a₁₁b₁₂ + a₁₂b₂₂)
   + 2c · (a₂₁b₁₁ + a₂₂b₂₁)(a₂₁b₁₂ + a₂₂b₂₂)
   + b · ((a₁₁b₁₁ + a₁₂b₂₁)(a₂₁b₁₂ + a₂₂b₂₂) + (a₁₁b₁₂ + a₁₂b₂₂)(a₂₁b₁₁ + a₂₂b₂₁))

and by evaluating the factors of b₁₁b₁₂, b₂₁b₂₂, and b₁₁b₂₂ + b₂₁b₁₂, it can be checked that

b″ = 2a′ b₁₁b₁₂ + 2c′ b₂₁b₂₂ + (b₁₁b₂₂ + b₂₁b₁₂)(2a · a₁₁a₁₂ + 2c · a₂₁a₂₂ + b(a₁₁a₂₂ + a₁₂a₂₁)).

This shows that

[a″, b″, c″] = [a′, b′, c′] ∗ B      (351.2.4)

and therefore [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B. Thus, (351.2.1) defines an action of G on
the set of (integer) binary quadratic forms. Furthermore, the discriminant of each quadratic
form in the orbit of [a, b, c] under G is b² − 4ac.
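
The compatibility condition and the invariance of the discriminant can be spot-checked numerically. In the following Python sketch the function names `act`, `matmul` and `disc`, as well as the sample form and matrices, are assumptions chosen for illustration; `act` transcribes the three coefficient formulas for a′, b′, c′ given above.

```python
def act(form, A):
    """Return [a, b, c] * A using the coefficient formulas for a', b', c'."""
    a, b, c = form
    (a11, a12), (a21, a22) = A
    a1 = a * a11**2 + b * a11 * a21 + c * a21**2
    b1 = 2 * a * a11 * a12 + 2 * c * a21 * a22 + b * (a11 * a22 + a12 * a21)
    c1 = a * a12**2 + b * a12 * a22 + c * a22**2
    return (a1, b1, c1)

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def disc(form):
    """Discriminant b^2 - 4ac of the form [a, b, c]."""
    a, b, c = form
    return b * b - 4 * a * c

form = (2, 3, 5)              # sample form 2x^2 + 3xy + 5y^2
A = ((1, 1), (0, 1))          # det A = 1
B = ((0, 1), (1, 0))          # det B = -1
I = ((1, 0), (0, 1))

print(act(form, I) == form)                                # identity acts trivially
print(act(form, matmul(A, B)) == act(act(form, A), B))     # compatibility
print(disc(form), disc(act(form, A)), disc(act(form, matmul(A, B))))
```

A check on sample matrices is of course no substitute for the algebraic verification in the text; it only guards against transcription slips in the formulas.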

Version: 5 Owner: Thomas Heye Author(s): Thomas Heye

351.3 group action

Let G be a group and let X be a set. A left group action is a function · : G × X −→ X such
that:

1. 1G · x = x for all x ∈ X
2. (g1 g2 ) · x = g1 · (g2 · x) for all g1 , g2 ∈ G and x ∈ X

A right group action is a function · : X × G −→ X such that:

1. x · 1G = x for all x ∈ X
2. x · (g1 g2 ) = (x · g1 ) · g2 for all g1 , g2 ∈ G and x ∈ X

There is a correspondence between left actions and right actions, given by associating the
right action x · g with the left action g · x := x · g⁻¹. In many (but not all) contexts, it is
useful to identify right actions with their corresponding left actions, and speak only of left
actions.

Special types of group actions

A left action is said to be effective, or faithful, if the function x ↦ g · x is the identity function
on X only when g = 1G.

A left action is said to be transitive if, for every x1 , x2 ∈ X, there exists a group element
g ∈ G such that g · x1 = x2 .

A left action is free if, for every x ∈ X, the only element of G that stabilizes x is the identity;
that is, g · x = x implies g = 1G .

Faithful, transitive, and free right actions are defined similarly.
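
For finite groups and sets these three properties can be tested directly from the definitions. The Python sketch below uses two assumed example actions of Z/4Z (by translation on {0, 1, 2, 3}, and by translation modulo 2 on {0, 1}); the predicate names are illustrative.

```python
def is_faithful(G, X, act, e):
    """Only the identity e acts as the identity function on X."""
    return all(g == e or any(act(g, x) != x for x in X) for g in G)

def is_transitive(G, X, act):
    """Any point can be moved to any other point by some group element."""
    return all(any(act(g, x1) == x2 for g in G) for x1 in X for x2 in X)

def is_free(G, X, act, e):
    """g . x == x for some x forces g == e."""
    return all(g == e for g in G for x in X if act(g, x) == x)

Z4 = range(4)   # Z/4Z under addition modulo 4, identity element 0

# Example 1: Z/4Z acting on {0, 1, 2, 3} by translation.
translate = lambda g, x: (g + x) % 4
# Example 2: Z/4Z acting on {0, 1} by translation modulo 2.
parity = lambda g, x: (g + x) % 2

for name, X, act in (("translate", range(4), translate),
                     ("parity", range(2), parity)):
    print(name, is_faithful(Z4, X, act, 0),
          is_transitive(Z4, X, act), is_free(Z4, X, act, 0))
```

In the second example the element 2 acts trivially, so that action is transitive but neither faithful nor free.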

Version: 3 Owner: djao Author(s): djao

351.4 orbit

Let G be a group, X a set, and · : G × X −→ X a group action. For any x ∈ X, the orbit
of x under the group action is the set
{g · x | g ∈ G} ⊂ X.

Version: 2 Owner: djao Author(s): djao

351.5 proof of counting theorem

Let N be the cardinality of the set of all the couples (g, x) such that g · x = x. For each
g ∈ G, there exist stab_g(X) couples with g as the first element, while for each x, there are
|G_x| couples with x as the second element. Hence the following equality holds:

N = Σ_{g∈G} stab_g(X) = Σ_{x∈X} |G_x|.

From the orbit-stabilizer theorem it follows that:

N = |G| Σ_{x∈X} 1/|G(x)|.

Since all the x belonging to the same orbit G(x) together contribute

|G(x)| · 1/|G(x)| = 1

to the sum, Σ_{x∈X} 1/|G(x)| precisely equals the number of distinct orbits s. We have
therefore

Σ_{g∈G} stab_g(X) = |G| s,

which proves the theorem.

Version: 2 Owner: n3o Author(s): n3o

351.6 stabilizer

Let G be a group, X a set, and · : G × X −→ X a group action. For any subset S of X, the
stabilizer of S, denoted Stab(S), is the subgroup

Stab(S) := {g ∈ G | g · s ∈ S for all s ∈ S}.

The stabilizer of a single point x in X is often denoted G_x.

Version: 3 Owner: djao Author(s): djao

Chapter 352

20M99 – Miscellaneous

352.1 a semilattice is a commutative band

This note explains how a semilattice is the same as a commutative band.

Let S be a semilattice, with partial order ≤ and each pair of elements x and y having
a greatest lower bound x ∧ y. Then it is easy to see that the operation ∧ defines a
binary operation on S which makes it a commutative semigroup, and that every element is
idempotent since x ∧ x = x.

Conversely, if S is such a semigroup, define x ≤ y iff x = xy. Again, it is easy to see that
this defines a partial order on S, and that greatest lower bounds exist with respect to this
partial order, and that in fact x ∧ y = xy.
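
The converse direction can be illustrated on a small commutative band. In this Python sketch the example (the divisors of 12 under gcd) and the helper names `leq` and `greatest_lower_bound` are assumptions made for illustration; the order x ≤ y iff x = xy then becomes divisibility, and the greatest lower bound is recovered as the product.

```python
from math import gcd
from itertools import product

S = [1, 2, 3, 4, 6, 12]   # divisors of 12, closed under gcd
mul = gcd                 # the band operation xy

def leq(x, y):
    """The order defined in the text: x <= y iff x = xy."""
    return x == mul(x, y)

def greatest_lower_bound(x, y):
    """Find the greatest lower bound of x and y directly from the order."""
    lower = [z for z in S if leq(z, x) and leq(z, y)]
    # the greatest lower bound lies above every other lower bound
    [glb] = [z for z in lower if all(leq(w, z) for w in lower)]
    return glb

is_band = all(mul(x, x) == x for x in S)
is_commutative = all(mul(x, y) == mul(y, x) for x, y in product(S, repeat=2))
meets_agree = all(greatest_lower_bound(x, y) == mul(x, y)
                  for x, y in product(S, repeat=2))

print(is_band, is_commutative, meets_agree)
```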

Version: 3 Owner: mclase Author(s): mclase

352.2 adjoining an identity to a semigroup

It is possible to formally adjoin an identity element to any semigroup to make it into a
monoid.

Suppose S is a semigroup without an identity, and consider the set S ∪ {1} where 1 is
a symbol not in S. Extend the semigroup operation from S to S ∪ {1} by additionally
defining:

s · 1 = s = 1 · s, for all s ∈ S ∪ {1}.
It is easy to verify that this defines a semigroup (associativity is the only thing that needs
to be checked).
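
A minimal sketch of the construction in Python. The symbol `ONE`, the function name `adjoin_identity` and the two-element left zero semigroup used as input are assumptions for illustration only.

```python
ONE = "1"   # hypothetical symbol, assumed not to be an element of S

def adjoin_identity(S, op):
    """Return the set S ∪ {1} together with the extended operation."""
    if ONE in S:
        raise ValueError("the symbol for the new identity must not lie in S")

    def op1(x, y):
        if x == ONE:
            return y
        if y == ONE:
            return x
        return op(x, y)

    return list(S) + [ONE], op1

# Illustrative S: a two-element left zero semigroup (xy = x), which has no identity.
S = ["a", "b"]
left_zero = lambda x, y: x

S1, op1 = adjoin_identity(S, left_zero)

associative = all(op1(op1(x, y), z) == op1(x, op1(y, z))
                  for x in S1 for y in S1 for z in S1)
is_identity = all(op1(ONE, s) == s == op1(s, ONE) for s in S1)
print(associative, is_identity)
```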

As a matter of notation, it is customary to write S¹ for the semigroup S with an identity
adjoined in this manner, if S does not already have one, and to agree that S¹ = S, if S does.

Despite the simplicity of this construction, however, it rarely allows one to simplify a problem
by considering monoids instead of semigroups. As soon as one starts to look at the structure
of the semigroup, it is almost invariably the case that one needs to consider subsemigroups
and ideals of the semigroup which do not contain the identity.

Version: 2 Owner: mclase Author(s): mclase

352.3 band

A band is a semigroup in which every element is idempotent.

A commutative band is called a semilattice.

Version: 1 Owner: mclase Author(s): mclase

352.4 bicyclic semigroup

The bicyclic semigroup C(p, q) is the monoid generated by {p, q} with the single relation
pq = 1.

The elements of C(p, q) are all words of the form q^n p^m for m, n ≥ 0 (with the understanding
p⁰ = q⁰ = 1). These words are multiplied as follows:

q^n p^m q^k p^l = q^(n+k−m) p^l   if m ≤ k,
q^n p^m q^k p^l = q^n p^(l+m−k)   if m ≥ k.

It is apparent that C(p, q) is simple, for if q^n p^m is an element of S = C(p, q), then
1 = p^n (q^n p^m) q^m and so S¹ q^n p^m S¹ = S.
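
The multiplication rule is easy to experiment with if the word q^n p^m is stored as the pair (n, m). This encoding and the names `mul` and `power` are assumptions of the following Python sketch; it checks pq = 1, qp ≠ 1, associativity on a small range of elements, and the identity p^n (q^n p^m) q^m = 1 used in the simplicity argument.

```python
def mul(u, v):
    """Multiply q^n p^m (stored as (n, m)) by q^k p^l (stored as (k, l))."""
    n, m = u
    k, l = v
    if m <= k:
        return (n + k - m, l)
    return (n, l + m - k)

ONE, P, Q = (0, 0), (0, 1), (1, 0)

def power(x, e):
    """e-th power of x, with x^0 = 1."""
    result = ONE
    for _ in range(e):
        result = mul(result, x)
    return result

print(mul(P, Q))   # pq, the identity (0, 0)
print(mul(Q, P))   # qp, the element (1, 1), which is not the identity

small = [(n, m) for n in range(4) for m in range(4)]
print(all(mul(mul(x, y), z) == mul(x, mul(y, z))
          for x in small for y in small for z in small))
print(all(mul(mul(power(P, n), (n, m)), power(Q, m)) == ONE
          for (n, m) in small))
```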

It is useful to picture some further properties of C(p, q) by arranging the elements in a table:

1    p     p²     p³     p⁴     ...
q    qp    qp²    qp³    qp⁴    ...
q²   q²p   q²p²   q²p³   q²p⁴   ...
q³   q³p   q³p²   q³p³   q³p⁴   ...
q⁴   q⁴p   q⁴p²   q⁴p³   q⁴p⁴   ...
⋮    ⋮     ⋮      ⋮      ⋮      ⋱

Then the elements below any horizontal line drawn through this table form a right ideal and
the elements to the right of any vertical line form a left ideal. Further, the elements on the
diagonal are all idempotents and their standard ordering is

1 > qp > q²p² > q³p³ > · · · .

Version: 3 Owner: mclase Author(s): mclase

352.5 congruence

Let S be a semigroup. An equivalence relation ∼ defined on S is called a congruence if it is
preserved under the semigroup operation. That is, for all x, y, z ∈ S, if x ∼ y then xz ∼ yz
and zx ∼ zy.

If ∼ satisfies only x ∼ y implies xz ∼ yz (resp. zx ∼ zy) then ∼ is called a right congruence
(resp. left congruence).
Example 19. Suppose f : S → T is a semigroup homomorphism. Define ∼ by x ∼ y iff
f (x) = f (y). Then it is easy to see that ∼ is a congruence.

If ∼ is a congruence, defined on a semigroup S, write [x] for the equivalence class of x
under ∼. Then it is easy to see that [x] · [y] = [xy] is a well-defined operation on the set of
equivalence classes, and that in fact this set becomes a semigroup with this operation. This
semigroup is called the quotient of S by ∼ and is written S/ ∼.

Thus semigroup congruences are related to homomorphic images of semigroups in the same
way that normal subgroups are related to homomorphic images of groups. More precisely, in
the group case, the congruence is the coset relation, rather than the normal subgroup itself.

Version: 3 Owner: mclase Author(s): mclase

352.6 cyclic semigroup

A semigroup which is generated by a single element is called a cyclic semigroup.

Let S = ⟨x⟩ be a cyclic semigroup. Then as a set, S = {x^n | n > 0}.

If all powers of x are distinct, then S = {x, x², x³, ...} is (countably) infinite.

Otherwise, there is a least integer n > 0 such that x^n = x^m for some m < n. It is clear then
that the elements x, x², ..., x^(n−1) are distinct, but that for any j ≥ n, we must have x^j = x^i
for some i, m ≤ i ≤ n − 1. So S has n − 1 elements.

Unlike in the group case, however, there are in general multiple non-isomorphic cyclic
semigroups with the same number of elements. In fact, there are t non-isomorphic cyclic
semigroups with t elements: these correspond to the different choices of m in the above (with
n = t + 1).

The integer m is called the index of S, and n − m is called the period of S.

The elements K = {x^m, x^(m+1), ..., x^(n−1)} form a subsemigroup of S. In fact, K is a cyclic group.

A concrete representation of the semigroup with index m and period r as a semigroup of transformations
can be obtained as follows. Let X = {1, 2, 3, ..., m + r}. Let

φ = ( 1 2 3 ... m+r−1 m+r )
    ( 2 3 4 ... m+r   m+1 )

Then φ generates a subsemigroup S of the full semigroup of transformations T_X, and S is
cyclic with index m and period r.
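
The index and period of a transformation can be found by listing its powers until one repeats. In the Python sketch below, the names `generator` and `index_and_period` are assumptions; the generator sends i to i + 1 for i < m + r and sends the last point m + r to m + 1, so that the point 1 has a tail of length m followed by a cycle of length r.

```python
def generator(m, r):
    """Transformation of {1, ..., m+r}: i -> i+1, with m+r -> m+1."""
    phi = {i: i + 1 for i in range(1, m + r)}
    phi[m + r] = m + 1
    return phi

def index_and_period(phi):
    """Return (m, n - m), where n is least with phi^n == phi^m for some m < n."""
    points = sorted(phi)
    current = dict(phi)   # phi^1
    seen = {}             # images of a power (as a tuple) -> its exponent
    k = 1
    while True:
        key = tuple(current[x] for x in points)
        if key in seen:
            m = seen[key]
            return m, k - m
        seen[key] = k
        current = {x: phi[current[x]] for x in points}   # next power of phi
        k += 1

print(index_and_period(generator(2, 3)))
```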

Version: 3 Owner: mclase Author(s): mclase

352.7 idempotent

An element x of a ring is called an idempotent element, or simply an idempotent, if
x² = x.

The set of idempotents of a ring can be partially ordered by putting e ≤ f iff e = ef = f e.

The element 0 is a minimum element in this partial order. If the ring has an identity element,
1, then 1 is a maximum element in this partial order.

Since these definitions refer only to the multiplicative structure of the ring, they also hold
for semigroups (with the proviso, of course, that a semigroup may not have a zero element).

In the special case of a semilattice, this partial order is the same as the one described in the
entry for semilattice.

If a ring has an identity, then 1 − e is always an idempotent whenever e is an idempotent,
and e(1 − e) = (1 − e)e = 0.

In a ring with an identity, two idempotents e and f are called a pair of orthogonal
idempotents if e + f = 1, and ef = f e = 0. Obviously, this is just a fancy way of saying
that f = 1 − e.

More generally, a set {e₁, e₂, ..., eₙ} of idempotents is called a complete set of orthogonal
idempotents if eᵢeⱼ = eⱼeᵢ = 0 whenever i ≠ j and if 1 = e₁ + e₂ + · · · + eₙ.

Version: 3 Owner: mclase Author(s): mclase

352.8 null semigroup

A left zero semigroup is a semigroup in which every element is a left zero element. In
other words, it is a set S with a product defined as xy = x for all x, y ∈ S.

A right zero semigroup is defined similarly.

Let S be a semigroup. Then S is a null semigroup if it has a zero element and if the
product of any two elements is zero. In other words, there is an element θ ∈ S such that
xy = θ for all x, y ∈ S.

Version: 1 Owner: mclase Author(s): mclase

352.9 semigroup

A semigroup G is a set together with a binary operation · : G × G −→ G which satisfies the
associative property: (a · b) · c = a · (b · c) for all a, b, c ∈ G.

Version: 2 Owner: djao Author(s): djao

352.10 semilattice

A lower semilattice is a partially ordered set S in which each pair of elements has a
greatest lower bound.

An upper semilattice is a partially ordered set S in which each pair of elements has a least
upper bound.

Note that it is not normally necessary to distinguish lower from upper semilattices, because
one may be converted to the other by reversing the partial order. It is normal practice to
refer to either structure as a semilattice and it should be clear from the context whether
greatest lower bounds or least upper bounds exist.

Alternatively, a semilattice can be considered to be a commutative band, that is a semigroup
which is commutative, and in which every element is idempotent. In this context, semilattices
are important elements of semigroup theory and play a key role in the structure theory of
commutative semigroups.

A partially ordered set which is both a lower semilattice and an upper semilattice is a lattice.

Version: 3 Owner: mclase Author(s): mclase

352.11 subsemigroup, submonoid, and subgroup

Let S be a semigroup, and let T be a subset of S.

T is a subsemigroup of S if T is closed under the operation of S; that is, if xy ∈ T for all
x, y ∈ T.

T is a submonoid of S if T is a subsemigroup, and T has an identity element.

T is a subgroup of S if T is a submonoid which is a group.

Note that submonoids and subgroups do not have to have the same identity element as
S itself (indeed, S may not have an identity element). The identity element may be any
idempotent element of S.

Let e ∈ S be an idempotent element. Then there is a maximal subsemigroup of S for which
e is the identity:
eSe = {exe | x ∈ S}.
In addition, there is a maximal subgroup for which e is the identity:

U(eSe) = {x ∈ eSe | ∃y ∈ eSe such that xy = yx = e}.

Subgroups with different identity elements are disjoint. To see this, suppose that G and H
are subgroups of a semigroup S with identity elements e and f respectively, and suppose
x ∈ G ∩ H. Then x has an inverse y ∈ G, and an inverse z ∈ H. We have:

e = xy = f xy = f e = zxe = zx = f.

Thus intersecting subgroups have the same identity element.

Version: 2 Owner: mclase Author(s): mclase

352.12 zero elements

Let S be a semigroup. An element z is called a right zero [resp. left zero] if xz = z [resp.
zx = z] for all x ∈ S.

An element which is both a left and a right zero is called a zero element.

A semigroup may have many left zeros or right zeros, but if it has at least one of each, then
they are necessarily equal, giving a unique (two-sided) zero element.

It is customary to use the symbol θ for the zero element of a semigroup.

Version: 1 Owner: mclase Author(s): mclase

Chapter 353

20N02 – Sets with a single binary
operation (groupoids)

353.1 groupoid

A groupoid G is a set together with a binary operation · : G × G −→ G. The groupoid (or
“magma”) is closed under the operation.

There is also a separate, category-theoretic definition of “groupoid.”

Version: 7 Owner: akrowne Author(s): akrowne

353.2 idempotency

If (S, ∗) is a magma, then an element x ∈ S is said to be idempotent if x ∗ x = x. If every
element of S is idempotent, then the binary operation ∗ (or the magma itself) is said to
be idempotent. For example, the ∧ and ∨ operations in a lattice are idempotent, because
x ∧ x = x and x ∨ x = x for all x in the lattice.

A function f : D → D is said to be idempotent if f ◦ f = f . (This is just a special case
of the above definition, the magma in question being (D^D, ◦), the monoid of all functions
from D to D, with the operation of function composition.) In other words, f is idempotent
iff repeated application of f has the same effect as a single application: f (f (x)) = f (x)
for all x ∈ D. An idempotent linear transformation from a vector space to itself is called a
projection.

Version: 12 Owner: yark Author(s): yark, Logan

353.3 left identity and right identity

Let G be a groupoid. An element e ∈ G is called a left identity element if ex = x for all
x ∈ G. Similarly, e is a right identity element if xe = x for all x ∈ G.

An element which is both a left and a right identity is an identity element.

A groupoid may have more than one left identity element: in fact the operation defined by
xy = y for all x, y ∈ G defines a groupoid (in fact, a semigroup) on any set G, and every
element is a left identity.

But as soon as a groupoid has both a left and a right identity, they are necessarily unique
and equal. For if e is a left identity and f is a right identity, then f = ef = e.

Version: 2 Owner: mclase Author(s): mclase

Chapter 354

20N05 – Loops, quasigroups

354.1 Moufang loop

Proposition: Let Q be a nonempty quasigroup.

I) The following conditions are equivalent.

(x(yz))x = (xy)(zx) ∀x, y, z ∈ Q (354.1.1)
((xy)z)x = x(y(zx)) ∀x, y, z ∈ Q (354.1.2)
(yx)(zy) = (y(xz))y ∀x, y, z ∈ Q (354.1.3)
y(x(yz)) = ((yx)y)z ∀x, y, z ∈ Q (354.1.4)

II) If Q satisfies those conditions, then Q has an identity element (i.e. Q is a loop).

For a proof, we refer the reader to the two references. Kunen in [1] shows that any of
the four conditions implies the existence of an identity element. And Bol and Bruck [2] show
that the four conditions are equivalent for loops.

Definition: A nonempty quasigroup satisfying the conditions (354.1.1) through (354.1.4) is called a Moufang
quasigroup or, equivalently, a Moufang loop (after Ruth Moufang, 1905-1977).

The 16-element set of unit octonions over Z is an example of a nonassociative Moufang loop.
Other examples appear in projective geometry, coding theory, and elsewhere.

References

[1] K. Kunen, Moufang Quasigroups, J. Algebra 83 (1996) 231-234

[2] R. H. Bruck, A Survey of Binary Systems, Springer-Verlag, 1958

Version: 3 Owner: yark Author(s): Larry Hammick

354.2 loop and quasigroup

A quasigroup is a groupoid G with the property that for every x, y ∈ G, there are unique
elements w, z ∈ G such that xw = y and zx = y.

A loop is a quasigroup which has an identity element.

What distinguishes a loop from a group is that the former need not satisfy the associative
law.

Version: 1 Owner: mclase Author(s): mclase

Chapter 355

22-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

355.1 fixed-point subspace

Let Σ ⊂ Γ be a subgroup where Γ is a compact Lie group acting on a vector space V . The
fixed-point subspace of Σ is

Fix(Σ) = {x ∈ V | σx = x, ∀σ ∈ Σ}

Fix(Σ) is a linear subspace of V since

Fix(Σ) = ⋂_{σ∈Σ} ker(σ − I)

where I is the identity. If it is important to specify the space V we use the notation
Fix_V(Σ).

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation
Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume

Chapter 356

22-XX – Topological groups, Lie
groups

356.1 Cantor space

Cantor space, denoted C, is the set of all infinite binary sequences with the product topology.
It is a perfect Polish space. It is a compact subspace of Baire space, which is the set of all
infinite sequences of integers with the natural product topology.

REFERENCES
1. Moschovakis, Yiannis N. Descriptive Set Theory. North-Holland Pub. Co., Amsterdam; New York, 1980.

Version: 8 Owner: xiaoyanggu Author(s): xiaoyanggu

Chapter 357

22A05 – Structure of general
topological groups

357.1 topological group

A topological group is a triple (G, ·, T) where (G, ·) is a group and T is a topology on
G such that under T, the group operation (x, y) ↦ x · y is continuous with respect to the
product topology on G × G and the inverse map x ↦ x⁻¹ is continuous on G.

Version: 3 Owner: Evandar Author(s): Evandar

Chapter 358

22C05 – Compact groups

358.1 n-torus

The n-torus, denoted T^n, is a smooth orientable n-dimensional manifold which is the product
of n 1-spheres, i.e. T^n = S¹ × · · · × S¹ (n factors).

Equivalently, the n-Torus can be considered to be Rn modulo the action (vector addition)
of the integer lattice Zn .

The n-torus is in addition a topological group. If we think of S¹ as the unit circle in C
and T^n = S¹ × · · · × S¹ (n factors), then S¹ is a topological group and so is T^n by coordinate-wise
multiplication. That is,

(z1 , z2 , . . . , zn ) · (w1 , w2 , . . . , wn ) = (z1 w1 , z2 w2 , . . . , zn wn )

Version: 2 Owner: ack Author(s): ack, apmxi

358.2 reductive

Let G be a Lie group or algebraic group. G is called reductive over a field k if every
representation of G over k is completely reducible. For example, a finite group is reductive
over a field k if and only if its order is not divisible by the characteristic of k (by
Maschke’s theorem). A complex Lie group is reductive if and only if it is a direct product
of a semisimple group and an algebraic torus.

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 359

22D05 – General properties and
structure of locally compact groups

359.1 Γ-simple

A representation V of Γ is Γ-simple if either

• V ≅ W1 ⊕ W2 where W1 , W2 are absolutely irreducible for Γ and are Γ-isomorphic, or

• V is non-absolutely irreducible for Γ.

[GSS]

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation
Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume

Chapter 360

22D15 – Group algebras of locally
compact groups

360.1 group C*-algebra

Let C[G] be the group ring of a discrete group G. It has two completions to a C*-algebra:

Reduced group C*-algebra. The reduced group C*-algebra, $C^*_r(G)$, is obtained by completing
C[G] in the operator norm for its regular representation on $\ell^2(G)$.

Maximal group C*-algebra. The maximal group C*-algebra, $C^*_{\max}(G)$ or just $C^*(G)$, is
defined by the following universal property: any *-homomorphism from C[G] to some
B(H) (the C*-algebra of bounded operators on some Hilbert space H) factors through
the inclusion $C[G] \hookrightarrow C^*_{\max}(G)$.

If G is amenable then $C^*_r(G) \cong C^*_{\max}(G)$.

Version: 3 Owner: mhale Author(s): mhale

Chapter 361

22E10 – General properties and
structure of complex Lie groups

361.1 existence and uniqueness of compact real form

Let G be a semisimple complex Lie group. Then there exists a unique (up to isomorphism)
real Lie group K such that K is compact and a real form of G. Conversely, if K is compact,
semisimple and real, it is the real form of a unique semisimple complex Lie group G. The
group K can be realized as the set of fixed points of a special involution of G, called the
Cartan involution.

For example, the compact real form of SLn C, the complex special linear group, is SU(n), the
special unitary group. Note that SLn R is also a real form of SLn C, but is not compact.

The compact real form of SOn C, the complex special orthogonal group, is SOn R, the real
special orthogonal group. SOn C also has other, non-compact real forms, called the pseudo-orthogonal
groups.

The compact real form of Sp2n C, the complex symplectic group, is less well-known. It is
(unfortunately) also usually denoted Sp(2n), and consists of n × n “unitary” quaternion
matrices, that is,
$\mathrm{Sp}(2n) = \{ M \in GL_n(\mathbb{H}) \mid M M^* = I \}$
where M* denotes the conjugate transpose of M. This is different from the real symplectic group
Sp2n R.

Version: 2 Owner: bwebste Author(s): bwebste

361.2 maximal torus

Let K be a compact group, and let t ∈ K be an element whose centralizer has minimal
dimension (such elements are dense in K). Let T be the centralizer of t. This subgroup is
closed since T = ϕ⁻¹(t) where ϕ : K → K is the map k ↦ ktk⁻¹, and abelian since it is the
intersection of K with the Cartan subgroup of its complexification, and hence a torus, since
K (and thus T) is compact. We call T a maximal torus of K.

This term is also applied to the corresponding maximal abelian subgroup of a complex
semisimple group, which is an algebraic torus.

Version: 2 Owner: bwebste Author(s): bwebste

361.3 Lie group

A Lie group is a group endowed with a compatible analytic structure. To be more precise,
a Lie group structure consists of two kinds of data:

• a finite-dimensional, real-analytic manifold G
• and two analytic maps, one for multiplication G × G → G and one for inversion G → G,
which obey the appropriate group axioms.

Thus, a homomorphism in the category of Lie groups is a group homomorphism that is
simultaneously an analytic mapping between two real-analytic manifolds.

Next, we describe a natural construction that associates a certain Lie algebra g to every Lie
group G. Let e ∈ G denote the identity element of G.

For g ∈ G let λg : G → G denote the diffeomorphisms corresponding to left multiplication
by g.
Definition 9. A vector-field V on G is called left-invariant if V is invariant with respect to
all left multiplications. To be more precise, V is left-invariant if and only if
(λg )∗ (V ) = V
(see push-forward of a vector-field) for all g ∈ G.
Proposition 15. The vector-field bracket of two left-invariant vector fields is again a
left-invariant vector field.

Proof. Let V1 , V2 be left-invariant vector fields, and let g ∈ G. The bracket operation is
covariant with respect to diffeomorphism, and in particular
(λg )∗ [V1 , V2 ] = [(λg )∗ V1 , (λg )∗ V2 ] = [V1 , V2 ].

Q.E.D.

Definition 10. The Lie algebra of G, denoted hereafter by g, is the vector space of all
left-invariant vector fields equipped with the vector-field bracket.

Now a right multiplication is invariant with respect to all left multiplications, and it turns
out that we can characterize a left-invariant vector field as being an infinitesimal right mul-
tiplication.

Proposition 16. Let a ∈ Te G and let V be a left-invariant vector-field such that Ve = a.
Then for all g ∈ G we have
Vg = (λg )∗ (a).

The intuition here is that a gives an infinitesimal displacement from the identity element
and that Vg gives a corresponding infinitesimal right displacement away from g. Indeed
consider a curve
$\gamma : (-\epsilon, \epsilon) \to G$
passing through the identity element with velocity a; i.e.

$\gamma(0) = e, \quad \gamma'(0) = a.$

The above proposition is then saying that the curve

$t \mapsto g\gamma(t), \quad t \in (-\epsilon, \epsilon)$

passes through g at t = 0 with velocity Vg .

Thus we see that a left-invariant vector-field is completely determined by the value it takes
at e, and that therefore g is isomorphic, as a vector space, to Te G.

Of course, we can also consider the Lie algebra of right-invariant vector fields. The resulting
Lie-algebra is anti-isomorphic (the order in the bracket is reversed) to the Lie algebra of
left-invariant vector fields. Now it is a general principle that the group inverse operation
gives an anti-isomorphism between left and right group actions. So, as one may well expect,
the anti-isomorphism between the Lie algebras of left and right-invariant vector fields can
be realized by considering the linear action of the inverse operation on Te G.

Finally, let us remark that one can induce the Lie algebra structure directly on Te G by
considering the adjoint action of G on Te G.

Examples. [Coming soon.]

Notes.

1. No generality is lost in assuming that a Lie group has analytic, rather than C^∞ or
even C^k, k = 1, 2, . . . structure. Indeed, given a C^1 differential manifold with a C^1
multiplication rule, one can show that the exponential mapping endows this manifold
with a compatible real-analytic structure.
Indeed, one can go even further and show that even C^0 suffices. In other words, a
topological group that is also a finite-dimensional topological manifold possesses a compatible
analytic structure. This result was formulated by Hilbert as his fifth problem,
and proved in the 1950s by Montgomery and Zippin.
2. One can also speak of a complex Lie group, in which case G and the multiplication
mapping are both complex-analytic. The theory of complex Lie groups requires the
notion of a holomorphic vector-field. Notwithstanding this complication, most of the
essential features of the real theory carry over to the complex case.
3. The name “Lie group” honours the Norwegian mathematician Sophus Lie who pio-
neered and developed the theory of continuous transformation groups and the corre-
sponding theory of Lie algebras of vector fields (the group’s infinitesimal generators,
as Lie termed them). Lie’s original impetus was the study of continuous symmetry of
geometric objects and differential equations.
The scope of the theory has grown enormously in the 100+ years of its existence. The
contributions of Elie Cartan and Claude Chevalley figure prominently in this evolution.
Cartan is responsible for the celebrated ADE classification of simple Lie algebras, as
well as for charting the essential role played by Lie groups in differential geometry and
mathematical physics. Chevalley made key foundational contributions to the analytic
theory, and did much to pioneer the related theory of algebraic groups. Armand Borel’s
book “Essays in the History of Lie groups and algebraic groups” is the definitive source
on the evolution of the Lie group concept. Sophus Lie’s contributions are the subject
of a number of excellent articles by T. Hawkins.

Version: 6 Owner: rmilson Author(s): rmilson

361.4 complexification

Let G be a real Lie group. Then the complexification GC of G is the unique complex Lie
group equipped with a map ϕ : G → GC such that any map G → H where H is a complex Lie
group, extends to a holomorphic map GC → H. If g and gC are the respective Lie algebras,
gC ≅ g ⊗R C.

For simply connected groups, the construction is obvious: we simply take the simply con-
nected complex group with Lie algebra gC , and ϕ to be the map induced by the inclusion
g → gC .

If γ ∈ G is central, then its image is central in GC since g ↦ γgγ⁻¹ is a map extending
ϕ, and thus must be the identity by the uniqueness half of the universal property. Thus, if
Γ ⊂ G is a discrete central subgroup, then we get a map G/Γ → GC /ϕ(Γ), which gives a
complexification for G/Γ. Since every Lie group is of this form, this shows existence.

Some easy examples: the complexification of both SLn R and SU(n) is SLn C. The
complexification of R is C and of $S^1$ is C∗ .

The map ϕ : G → GC is not always injective. For example, if G is the universal cover
of SLn R (whose fundamental group is nontrivial: Z for n = 2 and Z/2 for n ≥ 3), then
GC ≅ SLn C, and ϕ factors through the covering G → SLn R.

Version: 3 Owner: bwebste Author(s): bwebste

361.5 Hilbert-Weyl theorem

theorem:

Let Γ be a compact Lie group acting on V . Then there exists a finite Hilbert
basis for the ring P(Γ) (the set of invariant polynomials). [GSS]

proof:

In [GSS] on page 54.

theorem: (as stated by Hermann Weyl)

The (absolute) invariants J(x, y, . . .) corresponding to a given set of representations
of a finite or a compact Lie group have a finite integrity basis. [HW]

proof:

In [HW] on page 274.

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation
Theory (Volume II). Springer-Verlag, New York, 1988.
[HW] Weyl, Hermann: The Classical Groups: Their Invariants and Representations. Princeton
University Press, New Jersey, 1946.

Version: 3 Owner: Daume Author(s): Daume

361.6 the connection between Lie groups and Lie algebras

Given a finite dimensional Lie group G, it has an associated Lie algebra g = Lie(G). The
Lie algebra encodes a great deal of information about the Lie group. I’ve collected a few
results on this topic:

Theorem 7. (Existence) Let g be a finite dimensional Lie algebra over R or C. Then there
exists a finite dimensional real or complex Lie group G with Lie(G) = g.

Theorem 8. (Uniqueness) There is a unique connected simply-connected Lie group G with
any given finite-dimensional Lie algebra. Every connected Lie group with this Lie algebra is
a quotient G/Γ by a discrete central subgroup Γ.

Even more important is the fact that the correspondence G ↦ g is functorial: given a
homomorphism ϕ : G → H of Lie groups, there is a natural homomorphism defined on Lie
algebras ϕ∗ : g → h, which is just the derivative of the map ϕ at the identity (since the Lie
algebra is canonically identified with the tangent space at the identity).

There are analogous existence and uniqueness theorems for maps:

Theorem 9. (Existence) Let ψ : g → h be a homomorphism of Lie algebras. Then if G is
the unique connected, simply-connected group with Lie algebra g, and H is any Lie group
with Lie algebra h, there exists a homomorphism of Lie groups ϕ : G → H with ϕ∗ = ψ.

Theorem 10. (Uniqueness) Let G be a connected Lie group and H an arbitrary Lie group.
Then if two maps ϕ, ϕ′ : G → H induce the same maps on Lie algebras, then they are equal.

Essentially, what these theorems tell us is that the correspondence g ↦ G from Lie algebras to
simply-connected Lie groups is functorial, and left adjoint to the functor H ↦ Lie(H) from
Lie groups to Lie algebras.

Version: 6 Owner: bwebste Author(s): bwebste

Chapter 362

26-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

362.1 derivative notation

This is the list of known standard representations and their nuances.

$\frac{du}{dv}$, $\frac{df}{dx}$, $\frac{dy}{dx}$ − The most common notation, this is read as the derivative of u with respect
to v. Exponents indicate the order of the derivative; for example, $\frac{d^2 y}{dx^2}$ is the second derivative of y with
respect to x.

$f'(x)$, $\vec{f}'(x)$, $y''$ − This is read as f prime of x. The number of primes gives the order of the derivative,
i.e. $f'''(x)$ is the third derivative of f(x) with respect to x. Note that in higher dimensions,
this may be a tensor of a rank equal to the order of the derivative.

$D_x f(x)$, $F_y(x)$, $f_{xy}(x)$ − These notations are rather arcane, and should not be used generally,
as they have other meanings. For example, $F_y$ can easily be the y component of a
vector-valued function. The subscript in this case means “with respect to”, so $F_{yy}$ would be
the second derivative of F with respect to y.

$D_1 f(x)$, $F_2(x)$, $f_{12}(x)$ − The subscripts in these cases refer to the derivative with respect to
the nth variable. For example, $F_2(x, y, z)$ would be the derivative of F with respect to y.
They can easily represent higher derivatives, i.e. $D_{21} f(x)$ is the derivative with respect to
the first variable of the derivative with respect to the second variable.

$\frac{\partial u}{\partial v}$, $\frac{\partial f}{\partial x}$ − The partial derivative of u with respect to v. This symbol can be manipulated as
in $\frac{du}{dv}$ for higher partials.

$\frac{d}{dv}$, $\frac{\partial}{\partial v}$ − This is the operator version of the derivative. Usually you will see it acting on
something such as $\frac{d}{dv}(v^2 + 3u) = 2v$.

[Jf(x)], [Df(x)] − The first of these represents the Jacobian of f, which is a matrix of partial
derivatives such that
$$[Jf(x)] = \begin{pmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{pmatrix}$$
where $f_n$ represents the nth component function of a vector-valued function. The second of these
notations represents the derivative matrix, which in most cases is the Jacobian, but in some
cases does not exist, even though the Jacobian exists. Note that the directional derivative
in the direction $\vec{v}$ is simply $[Jf(x)]\vec{v}$.
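The Jacobian and the directional-derivative formula can be checked numerically. The sketch below approximates the partial derivatives by central differences; the function names, the step size and the sample map f(x, y) = (x²y, x + 3y) are all assumptions made for this illustration, and the result is only an approximation of the true matrix.

```python
def jacobian(f, x, step=1e-6):
    """Approximate [Jf(x)] by central differences.

    Row i of the result holds D_1 f_i(x), ..., D_n f_i(x).
    """
    m = len(f(x))
    n = len(x)
    columns = []
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += step
        backward[j] -= step
        f_plus, f_minus = f(forward), f(backward)
        columns.append([(f_plus[i] - f_minus[i]) / (2 * step) for i in range(m)])
    # columns[j][i] approximates D_j f_i; transpose into rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def directional_derivative(f, x, v):
    """Approximate the directional derivative as the product [Jf(x)] v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in jacobian(f, x)]

def f(point):
    x, y = point
    return [x * x * y, x + 3.0 * y]

J = jacobian(f, [1.0, 2.0])
dv = directional_derivative(f, [1.0, 2.0], [1.0, 1.0])
```

At the point (1, 2) the exact Jacobian of this sample map has rows (4, 1) and (1, 3), so the product with the direction (1, 1) should be close to (5, 4).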

Version: 7 Owner: slider142 Author(s): slider142

362.2 fundamental theorems of calculus

The Fundamental Theorems of Calculus serve to demonstrate that integration and
differentiation are inverse processes.

First Fundamental Theorem:

Suppose that F is a differentiable function on the interval [a, b] and that F′ is Riemann
integrable on [a, b]. Then $\int_a^b F'(x)\,dx = F(b) - F(a)$.

Second Fundamental Theorem:

Let f be a continuous function on the interval [a, b], let c be an arbitrary point in this interval
and assume f is integrable on the intervals of the form [c, x] for all x ∈ [a, b]. Let F be defined
as $F(x) = \int_c^x f(t)\,dt$ for every x in (a, b). Then F is differentiable and $F'(x) = f(x)$.

This result is about Riemann integrals. When dealing with Lebesgue integrals we get a
generalization with Lebesgue’s differentiation theorem.

Version: 9 Owner: mathcam Author(s): drini, greg

362.3 logarithm

Definition. Three real numbers x, y, p, with x, y > 0 and x ≠ 1, are said to obey the logarithmic
relation
$\log_x(y) = p$
if they obey the corresponding exponential relation:

$x^p = y.$

Note that by the monotonicity and continuity property of the exponential operation, for
given x and y there exists a unique p satisfying the above relation. We are therefore able to
say that p is the logarithm of y relative to the base x.

Properties. There are a number of basic algebraic identities involving logarithms.

$\log_x(yz) = \log_x(y) + \log_x(z)$
$\log_x(y/z) = \log_x(y) - \log_x(z)$
$\log_x(y^z) = z \log_x(y)$
$\log_x(1) = 0$
$\log_x(x) = 1$
$\log_x(y) \log_y(x) = 1$
$\log_y(z) = \frac{\log_x(z)}{\log_x(y)}$
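These identities are easy to spot-check numerically. The following sketch is only an illustration: the helper `log_base` is a hypothetical name, the sample values are arbitrary, and the comparisons use a floating-point tolerance rather than exact equality.

```python
import math

def log_base(x, y):
    """Logarithm of y relative to the base x (x > 0, x != 1, y > 0)."""
    return math.log(y) / math.log(x)

x, y, z = 2.0, 8.0, 32.0

product_rule = math.isclose(log_base(x, y * z), log_base(x, y) + log_base(x, z))
quotient_rule = math.isclose(log_base(x, y / z), log_base(x, y) - log_base(x, z))
power_rule = math.isclose(log_base(x, y ** z), z * log_base(x, y))
reciprocal_rule = math.isclose(log_base(x, y) * log_base(y, x), 1.0)
change_of_base = math.isclose(log_base(y, z), log_base(x, z) / log_base(x, y))
```

The helper itself is an instance of the last identity, taking the natural logarithm as the intermediate base.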

Notes. In essence, logarithms convert multiplication to addition, and exponentiation to
multiplication. Historically, these properties of the logarithm made it a useful tool for doing
numerical calculations. Before the advent of electronic calculators and computers, tables of
logarithms and the logarithmic slide rule were essential computational aids.

Scientific applications predominantly make use of logarithms whose base is the Eulerian number
e = 2.71828 . . .. Such logarithms are called natural logarithms and are commonly denoted
by the symbol ln, e.g.
ln(e) = 1.
Natural logarithms naturally give rise to the natural logarithm function.

A frequent convention, seen in elementary mathematics texts and on calculators, is that
logarithms that do not give a base explicitly are assumed to be base 10, e.g.

log(100) = 2.

This is far from universal. In Rudin’s “Real and Complex Analysis”, for example, we see
a baseless log used to refer to the natural logarithm. By contrast, computer science and
information theory texts often assume 2 as the default logarithm base. This is motivated
by the fact that $\log_2(N)$ is the approximate number of bits required to encode N different
messages.

The invention of logarithms is commonly credited to John Napier.

Version: 13 Owner: rmilson Author(s): rmilson

362.4 proof of the first fundamental theorem of calculus

Let us make a subdivision of the interval [a, b], $\Delta : \{a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b\}$.
From this, we can say $F(b) - F(a) = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$.

From the mean-value theorem, we have that for any two points $\bar{x} < \bar{\bar{x}}$ there exists $\xi \in (\bar{x}, \bar{\bar{x}})$ such that
$F(\bar{\bar{x}}) - F(\bar{x}) = F'(\xi)(\bar{\bar{x}} - \bar{x})$. If we use $x_i$ as $\bar{\bar{x}}$ and $x_{i-1}$ as $\bar{x}$, calling our intermediate point
$\xi_i$, we get $F(x_i) - F(x_{i-1}) = F'(\xi_i)(x_i - x_{i-1})$.

Combining these, and using the abbreviation $\Delta_i x = x_i - x_{i-1}$, we have
$F(b) - F(a) = \sum_{i=1}^{n} F'(\xi_i)\Delta_i x$.

From the definition of an integral, $\forall \epsilon > 0\ \exists \delta > 0$ such that $\left| \sum_{i=1}^{n} F'(\xi_i)\Delta_i x - \int_a^b F'(x)\,dx \right| < \epsilon$
when $\|\Delta\| < \delta$. Thus, $\forall \epsilon > 0$, $\left| F(b) - F(a) - \int_a^b F'(x)\,dx \right| < \epsilon$.

Hence $\lim_{\epsilon \to 0} \left| F(b) - F(a) - \int_a^b F'(x)\,dx \right| = 0$, but $F(b) - F(a) - \int_a^b F'(x)\,dx$ is constant with
respect to $\epsilon$, which can only mean that $\left| F(b) - F(a) - \int_a^b F'(x)\,dx \right| = 0$, and so we have the
first fundamental theorem of calculus $F(b) - F(a) = \int_a^b F'(x)\,dx$.
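The convergence of the sums in the proof can be observed numerically. The sketch below is an illustration under assumptions: it takes F = sin and F′ = cos on [0, 2], uses uniform subdivisions, and tags each subinterval with its midpoint rather than with the mean-value point ξ_i from the proof (which is not known explicitly). The function name `riemann_sum` is hypothetical.

```python
import math

def riemann_sum(derivative, a, b, n):
    """Sum of derivative(tag_i) * (x_i - x_{i-1}) over a uniform subdivision
    of [a, b] into n pieces, with the midpoint of each piece as its tag."""
    width = (b - a) / n
    return sum(derivative(a + (i + 0.5) * width) * width for i in range(n))

F = math.sin         # a differentiable function
F_prime = math.cos   # its derivative
a, b = 0.0, 2.0

exact = F(b) - F(a)
# The gap between the Riemann sum and F(b) - F(a) shrinks as the mesh gets finer.
errors = [abs(riemann_sum(F_prime, a, b, n) - exact) for n in (10, 100, 1000)]
```

With the exact mean-value tags the sum would equal F(b) − F(a) for every subdivision; with other tags it only converges to that value as the mesh tends to zero.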

Version: 4 Owner: greg Author(s): greg

362.5 proof of the second fundamental theorem of calculus

Recall that a continuous function is Riemann integrable, so the integral

$F(x) = \int_c^x f(t)\,dt$

is well defined.

Consider the increment of F :

$F(x + h) - F(x) = \int_c^{x+h} f(t)\,dt - \int_c^x f(t)\,dt = \int_x^{x+h} f(t)\,dt$

(we have used the linearity of the integral with respect to the function and the additivity
with respect to the domain).

Now let M be the maximum of f on [x, x + h] and m be the minimum. Clearly we have

$mh \le \int_x^{x+h} f(t)\,dt \le Mh$

(this is due to the monotonicity of the integral with respect to the integrand) which can be
written as
$\frac{F(x + h) - F(x)}{h} = \frac{\int_x^{x+h} f(t)\,dt}{h} \in [m, M].$

Since f is continuous, by the intermediate value theorem there exists $\xi_h \in [x, x + h]$ such that
$f(\xi_h) = \frac{F(x + h) - F(x)}{h}$, so that

$F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \lim_{h \to 0} f(\xi_h) = f(x)$

since $\xi_h \to x$ as $h \to 0$.
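The limit in the last step can be illustrated numerically. In the sketch below, f = exp and x = 1 are arbitrary choices, and the integral over [x, x + h] is approximated by a midpoint rule; the function name `integral` and the number of sample points are assumptions of this example, not part of the proof.

```python
import math

def integral(f, lower, upper, n=1000):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    width = (upper - lower) / n
    return sum(f(lower + (i + 0.5) * width) for i in range(n)) * width

f = math.exp
x = 1.0

# (F(x + h) - F(x)) / h equals the average of f over [x, x + h].
quotients = [integral(f, x, x + h) / h for h in (0.1, 0.01, 0.001)]
gaps = [abs(q - f(x)) for q in quotients]
```

The gaps shrink roughly in proportion to h, consistent with the difference quotient tending to f(x).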

Version: 1 Owner: paolini Author(s): paolini

362.6 root-mean-square

If x1 , x2 , . . . , xn are real numbers, we define their root-mean-square or quadratic mean
as
$R(x_1, x_2, \ldots, x_n) = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}.$
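A direct transcription of this formula into Python might look as follows. The function name `root_mean_square` is a hypothetical choice, and rejecting an empty input is an assumption of this sketch (the formula is undefined for n = 0).

```python
import math

def root_mean_square(values):
    """Square root of the arithmetic mean of the squares of the values."""
    values = list(values)
    if not values:
        raise ValueError("root-mean-square of an empty collection is undefined")
    return math.sqrt(sum(v * v for v in values) / len(values))

rms = root_mean_square([3.0, 4.0])   # sqrt((9 + 16) / 2)
```

Because every value is squared, the signs of the inputs do not affect the result.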

The root-mean-square of a random variable X is defined as the square root of the expectation
of $X^2$:
$R(X) = \sqrt{E(X^2)}$

If X1 , X2 , . . . , Xn are independent random variables with standard deviations σ1 , σ2 , . . . , σn , then the
standard deviation of their arithmetic mean, $\frac{X_1 + X_2 + \cdots + X_n}{n}$, is the root-mean-square of σ1 , σ2 , . . . , σn
divided by $\sqrt{n}$.

Version: 1 Owner: pbruin Author(s): pbruin

362.7 square

The square of a number x is the number obtained by multiplying x by itself. It is denoted $x^2$.

Some examples:

$5^2 = 25$
$\left(\frac{1}{3}\right)^2 = \frac{1}{9}$
$0^2 = 0$
$0.5^2 = 0.25$

Version: 2 Owner: drini Author(s): drini

Chapter 363

26-XX – Real functions

363.1 abelian function

An abelian or hyperelliptic function is a generalisation of an elliptic function. It is a function
of two variables with four periods. In a similar way to an elliptic function it can also be
regarded as the inverse function to certain integrals (called abelian or hyperelliptic integrals)
of the form
$\int \frac{dz}{\sqrt{R(z)}}$
where R is a polynomial of degree greater than 4.

363.2 full-width at half maximum

The full-width at half maximum (FWHM) is a parameter used to describe the width of a
bump on a function (or curve). The FWHM is given by the distance between the points where
the function reaches half of its maximum value.

For example, consider the function
$f(x) = \frac{10}{x^2 + 1}.$

f reaches its maximum at x = 0 (f(0) = 10), so f reaches half of its maximum value at
x = 1 and x = −1 (f(1) = f(−1) = 5). So the FWHM for f, in this case, is 2, because
the distance between A(1, 5) and B(−1, 5) is 2.
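The same value can be recovered numerically. The following sketch locates the two half-maximum points of f by bisection; it assumes the location of the peak is known, that the function is monotone between the peak and each outer bound, and that the outer bounds ±10 lie below half maximum. The helper name `half_max_crossing` is hypothetical.

```python
def f(x):
    return 10.0 / (x * x + 1.0)

def half_max_crossing(func, peak_x, far_x, tol=1e-12):
    """Bisect between the peak and a far point where func is below half maximum."""
    half = func(peak_x) / 2.0
    inside, outside = peak_x, far_x   # func(inside) > half > func(outside)
    while abs(outside - inside) > tol:
        mid = (inside + outside) / 2.0
        if func(mid) > half:
            inside = mid
        else:
            outside = mid
    return (inside + outside) / 2.0

right = half_max_crossing(f, 0.0, 10.0)
left = half_max_crossing(f, 0.0, -10.0)
fwhm = right - left
```

For measured data one would typically interpolate between samples instead of bisecting a formula.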

The function
$f(x) = \frac{10}{x^2 + 1}$
is called ‘the Agnesi curve’, after Maria Gaetana Agnesi (1718–1799).

Chapter 364

26A03 – Foundations: limits and
generalizations, elementary topology
of the line

364.1 Cauchy sequence

A sequence x0 , x1 , x2 , . . . in a metric space (X, d) is a Cauchy sequence if, for every real number
ε > 0, there exists a natural number N such that d(xn , xm ) < ε whenever n, m > N.
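To make the quantifiers concrete, the sketch below takes the sequence x_n = 1/(n + 1) in the real line with the usual distance and produces, for a given ε, an index N that works. The choice of sequence and the helper names are assumptions of this example, and the final check only samples finitely many pairs (n, m), so it illustrates the definition rather than proving anything.

```python
from fractions import Fraction
import math

def x(n):
    """The n-th term of the sample sequence x_n = 1 / (n + 1)."""
    return Fraction(1, n + 1)

def index_for(eps):
    """A natural number N with |x_n - x_m| < eps whenever n, m > N.

    For n, m > N both terms lie in (0, 1/(N + 2)], so their distance is
    below 1/N, and N >= 1/eps makes that at most eps.
    """
    return math.ceil(1 / eps)

eps = Fraction(1, 100)
N = index_for(eps)
indices = range(N + 1, N + 60)
worst_gap = max(abs(x(n) - x(m)) for n in indices for m in indices)
```

Exact rational arithmetic via `Fraction` avoids rounding questions in the comparison with ε.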

Version: 4 Owner: djao Author(s): djao, rmilson

364.2 Dedekind cuts

The purpose of Dedekind cuts is to provide a sound logical foundation for the real number
system. Dedekind’s motivating observation is that a real number α, intuitively,
is completely determined by the rationals strictly smaller than α and those strictly
larger than α. Concerning the completeness or continuity of the real line, Dedekind notes in
[2] that

If all points of the straight line fall into two classes such that every point of the
first class lies to the left of every point of the second class, then there exists one
and only one point which produces this division of all points into two classes,
this severing of the straight line into two portions.

Dedekind defines a point to produce the division of the real line if this point is either the
least or greatest element of either one of the classes mentioned above. He further notes that
the completeness property, as he just phrased it, is deficient in the rationals, which motivates
the definition of reals as cuts of rationals. Because all rationals greater than α are really just
excess baggage, we prefer to sway somewhat from Dedekind’s original definition. Instead,
Definition 34. A Dedekind cut is a subset α of the rational numbers Q that satisfies
these properties:

1. α is not empty.

2. Q \ α is not empty.

3. α contains no greatest element.

4. For x, y ∈ Q, if x ∈ α and y < x, then y ∈ α as well.

Dedekind cuts are particularly appealing for two reasons. First, they make it very easy to
prove the completeness, or continuity of the real line. Also, they make it quite plain to
distinguish the rationals from the irrationals on the real line, and put the latter on a firm
logical foundation. In the construction of the real numbers from Dedekind cuts, we make
the following definition:
Definition 35. A real number is a Dedekind cut. We denote the set of all real numbers
by R and we order them by set-theoretic inclusion, that is to say, for any α, β ∈ R,

α < β if and only if α ⊂ β

where the inclusion is strict. We further define α = β as real numbers if α and β are equal
as sets. As usual, we write α ≤ β if α < β or α = β. Moreover, a real number α is said to
be irrational if Q \ α contains no least element.

The Dedekind completeness property of real numbers, expressed as the supremum property,
now becomes straightforward to prove. In what follows, we will reserve Greek variables for
real numbers, and Roman variables for rationals.
Theorem 11. Every nonempty subset of real numbers that is bounded above has a least upper bound.

Proof. Let A be a nonempty set of real numbers that is bounded above, that is, there is a
real number γ such that α ≤ γ for every α ∈ A. Now define the set

$\sup A = \bigcup_{\alpha \in A} \alpha.$

We must show that this set is a real number. This amounts to checking the four conditions
of a Dedekind cut.

1. sup A is clearly not empty, for it is the nonempty union of nonempty sets.
2. Because γ is a real number, there is some rational x that is not in γ. Since every α ∈ A
is a subset of γ, x is not in any α, so x ∉ sup A either. Thus, Q \ sup A is nonempty.
3. If sup A had a greatest element g, then g ∈ α for some α ∈ A. Then g would be
a greatest element of α, which is impossible because α is a real number. Hence sup A has no
greatest element.
4. Lastly, if x ∈ sup A, then x ∈ α for some α ∈ A. Given any y < x, we have y ∈ α because
α is a real number, whence y ∈ sup A.

Thus, sup A is a real number. Trivially, sup A is an upper bound of A, for every α ∈ A satisfies α ⊆ sup A.
It now suffices to prove that sup A ≤ γ, because γ was an arbitrary upper bound. But this
is easy, because every x ∈ sup A is an element of α for some α ∈ A, so because α ⊆ γ, x ∈ γ.
Thus, sup A is the least upper bound of A. We call this real number the supremum of A.

To finish the construction of the real numbers, we must endow them with algebraic opera-
tions, define the additive and multiplicative identity elements, prove that these definitions
give a field, and prove further results about the order of the reals (such as the totality of this
order) – in short, build a complete ordered field. This task is somewhat laborious, but we
include here the appropriate definitions. Verifying their correctness can be an instructive,
albeit tiresome, exercise. We use the same symbols for the operations on the reals as for the
rational numbers; this should cause no confusion in context.
Definition 36. Given two real numbers α and β, we define

• The additive identity, denoted 0, is
0 := {x ∈ Q : x < 0}

• The multiplicative identity, denoted 1, is
1 := {x ∈ Q : x < 1}

• Addition of α and β denoted α + β is
α + β := {x + y : x ∈ α, y ∈ β}

• The opposite of α, denoted −α, is
−α := {x ∈ Q : −x ∉ α, but −x is not the least element of Q \ α}

• The absolute value of α, denoted |α|, is
$|\alpha| := \begin{cases} \alpha, & \text{if } \alpha \ge 0 \\ -\alpha, & \text{if } \alpha \le 0 \end{cases}$

• If α, β > 0, then multiplication of α and β, denoted α · β, is
α · β := {z ∈ Q : z ≤ 0 or z = xy for some x ∈ α, y ∈ β with x, y > 0}
In general,
$\alpha \cdot \beta := \begin{cases} 0, & \text{if } \alpha = 0 \text{ or } \beta = 0 \\ |\alpha| \cdot |\beta|, & \text{if } \alpha > 0, \beta > 0 \text{ or } \alpha < 0, \beta < 0 \\ -(|\alpha| \cdot |\beta|), & \text{if } \alpha > 0, \beta < 0 \text{ or } \alpha < 0, \beta > 0 \end{cases}$

• The inverse of α > 0, denoted α⁻¹, is
α⁻¹ := {x ∈ Q : x ≤ 0 or x > 0 and (1/x) ∉ α, but 1/x is not the least element of Q \ α}
If α < 0,
α⁻¹ := −(|α|)⁻¹

All that remains (!) is to check that the above definitions do indeed define a complete ordered
field, and that all the sets implied to be real numbers are indeed so. The properties of R
as an ordered field follow from these definitions and the properties of Q as an ordered field.
It is important to point out that in two steps, in showing that inverses and opposites are
properly defined, we require an extra property of Q, not merely in its capacity as an ordered
field. This requirement is the Archimedean property.

Moreover, because R is a field of characteristic 0, it contains an isomorphic copy of Q. The
rationals correspond to the Dedekind cuts α for which Q \ α contains a least member.

REFERENCES
1. Courant, Richard and Robbins, Herbert. What is Mathematics? pp. 68-72 Oxford University
Press, Oxford, 1969
2. Dedekind, Richard. Essays on the Theory of Numbers Dover Publications Inc, New York 1963
3. Rudin, Walter Principles of Mathematical Analysis pp. 17-21 McGraw-Hill Inc, New York,
1976
4. Spivak, Michael. Calculus pp. 569-596 Publish or Perish, Inc. Houston, 1994

Version: 20 Owner: rmilson Author(s): rmilson, NeuRet

364.3 binomial proof of positive integer power rule

We will use the difference quotient in this proof of the power rule for positive integers. Let
$f(x) = x^n$ for some integer n > 0. Then we have
$$f'(x) = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}.$$

We can use the binomial theorem to expand the numerator
$$f'(x) = \lim_{h \to 0} \frac{C^n_0 x^0 h^n + C^n_1 x^1 h^{n-1} + \cdots + C^n_{n-1} x^{n-1} h^1 + C^n_n x^n h^0 - x^n}{h}$$
where $C^n_k = \frac{n!}{k!(n-k)!}$. We can now simplify the above
$$\begin{aligned} f'(x) &= \lim_{h \to 0} \frac{h^n + n x h^{n-1} + \cdots + n x^{n-1} h + x^n - x^n}{h} \\ &= \lim_{h \to 0} \left( h^{n-1} + n x h^{n-2} + \cdots + n x^{n-1} \right) \\ &= n x^{n-1}. \end{aligned}$$
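The bookkeeping in the expansion can be mirrored in a few lines of Python. The sketch below records, for each power of h in the difference quotient, the binomial coefficient and the power of x attached to it; the function name and the dictionary representation are assumptions of this illustration.

```python
from math import comb

def difference_quotient_coefficients(n):
    """Terms of ((x + h)**n - x**n) / h as a polynomial in h.

    Maps the exponent j of h (j = 0, ..., n - 1) to a pair
    (binomial coefficient, exponent of x).
    """
    # (x + h)**n = sum over k of C(n, k) * x**k * h**(n - k); the k = n term
    # cancels against -x**n, and dividing by h lowers each power of h by one.
    return {n - k - 1: (comb(n, k), k) for k in range(n)}

n = 5
coeffs = difference_quotient_coefficients(n)
constant_term = coeffs[0]   # the only term that survives as h -> 0
```

For n = 5 the surviving term is 5·x⁴, in agreement with the formula n·xⁿ⁻¹.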

Version: 4 Owner: mathcam Author(s): mathcam, slider142

364.4 exponential

Preamble. We use R+ ⊂ R to denote the set of positive real numbers. Our aim is to
define the exponential, or the generalized power operation,

$x^p, \quad x \in \mathbb{R}^+, \; p \in \mathbb{R}.$

The power p in the above expression is called the exponent. We take it as proven that R is
a complete, ordered field. No other properties of the real numbers are invoked.

Definition. For x ∈ R+ and n ∈ Z we define $x^n$ in terms of repeated multiplication. To
be more precise, we inductively characterize natural number powers as follows:

$x^0 = 1, \quad x^{n+1} = x \cdot x^n, \quad n \in \mathbb{N}.$

The existence of the reciprocal is guaranteed by the assumption that R is a field. Thus, for
negative exponents, we can define

$x^{-n} = (x^{-1})^n, \quad n \in \mathbb{N},$

where $x^{-1}$ is the reciprocal of x.

The case of arbitrary exponents is somewhat more complicated. A possible strategy is to
define roots, then rational powers, and then extend by continuity. Our approach is different.
For x ∈ R+ and p ∈ R, we define the set of all reals that one would want to be smaller than
$x^p$, and then define the latter as the least upper bound of this set. To be more precise, let
x > 1 and define

$L(x, p) = \{ z \in \mathbb{R}^+ : z^n < x^m \text{ for some } m \in \mathbb{Z},\ n \in \mathbb{N} \text{ such that } m < pn \}.$

We then define $x^p$ to be the least upper bound of L(x, p). For x < 1 we define

$x^p = (x^{-1})^{-p}.$

The exponential operation possesses a number of important properties, some of which char-
acterize it up to uniqueness.

Note. It is also possible to define the exponential operation in terms of the exponential function
and the natural logarithm. Since these concepts require the context of differential theory, it
seems preferable to give a basic definition that relies only on the foundational property of
the reals.

Version: 11 Owner: rmilson Author(s): rmilson

364.5 interleave sequence

Let S be a set, and let {xi }, i = 0, 1, 2, . . . and {yi }, i = 0, 1, 2, . . . be two sequences in S.
The interleave sequence is defined to be the sequence x0 , y0 , x1 , y1 , . . . . Formally, it is the
sequence {zi }, i = 0, 1, 2, . . . given by
$z_i := \begin{cases} x_k & \text{if } i = 2k \text{ is even,} \\ y_k & \text{if } i = 2k + 1 \text{ is odd.} \end{cases}$
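In a programming setting the interleave sequence is naturally expressed as a generator. The sketch below is one possible rendering in Python; the name `interleave` is hypothetical, and the sample sequences (even and odd numbers) are chosen only for illustration.

```python
from itertools import count, islice

def interleave(xs, ys):
    """Yield x0, y0, x1, y1, ... from two (possibly infinite) iterables."""
    for x, y in zip(xs, ys):
        yield x   # z_{2k} = x_k
        yield y   # z_{2k+1} = y_k

evens = count(0, 2)   # 0, 2, 4, ...
odds = count(1, 2)    # 1, 3, 5, ...
first_terms = list(islice(interleave(evens, odds), 8))
```

Interleaving the even and odd numbers in this way gives back the natural numbers in order.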

Version: 2 Owner: djao Author(s): djao

364.6 limit inferior

Let S ⊂ R be a set of real numbers. Recall that a limit point of S is a real number x ∈ R
such that for all ε > 0 there exist infinitely many y ∈ S such that

|x − y| < ε.

We define lim inf S, pronounced the limit inferior of S, to be the infimum of all the limit
points of S. If there are no limit points, we define the limit inferior to be +∞.

The two most common notations for the limit inferior are

$\liminf S$

and
$\underline{\lim}\, S.$

An alternative, but equivalent, definition is available in the case of an infinite sequence of
real numbers x0 , x1 , x2 , . . .. For each k ∈ N, let $y_k$ be the infimum of the kth tail,
$y_k = \inf_{j \ge k} x_j.$
This construction produces a non-decreasing sequence
$y_0 \le y_1 \le y_2 \le \ldots,$
which either converges to its supremum, or diverges to +∞. We define the limit inferior of
the original sequence to be this limit;
$\liminf_{k} x_k = \lim_{k} y_k.$
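The tail construction can be imitated on a finite prefix of a sequence. The sketch below is an approximation only: a computer can hold just finitely many terms, so the computed tail infima agree with the true ones only when the truncated part of the sequence does not go lower, as is the case for the sample sequence x_k = (−1)^k (1 + 1/(k + 1)) chosen here. The function name `tail_infima` is hypothetical.

```python
def tail_infima(values):
    """Return the list of y_k = min(values[k:]), computed from right to left."""
    infima = []
    running = float("inf")
    for v in reversed(values):
        running = min(running, v)
        infima.append(running)
    infima.reverse()
    return infima

# Sample sequence with limit inferior -1 (the odd-indexed terms increase to -1).
xs = [(-1) ** k * (1 + 1 / (k + 1)) for k in range(2000)]
ys = tail_infima(xs)
is_non_decreasing = all(a <= b for a, b in zip(ys, ys[1:]))
```

The entries of `ys` increase towards −1, the limit inferior of the sample sequence.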

Version: 7 Owner: rmilson Author(s): rmilson

364.7 limit superior

Let S ⊂ R be a set of real numbers. Recall that a limit point of S is a real number x ∈ R
such that for all ε > 0 there exist infinitely many y ∈ S such that
|x − y| < ε.
We define lim sup S, pronounced the limit superior of S, to be the supremum of all the limit
points of S. If there are no limit points, we define the limit superior to be −∞.

The two most common notations for the limit superior are
$\limsup S$
and
$\overline{\lim}\, S.$

An alternative, but equivalent, definition is available in the case of an infinite sequence of
real numbers x0 , x1 , x2 , . . .. For each k ∈ N, let $y_k$ be the supremum of the kth tail,
$y_k = \sup_{j \ge k} x_j.$

This construction produces a non-increasing sequence
$y_0 \ge y_1 \ge y_2 \ge \ldots,$
which either converges to its infimum, or diverges to −∞. We define the limit superior of
the original sequence to be this limit;
$\limsup_{k} x_k = \lim_{k} y_k.$

Version: 7 Owner: rmilson Author(s): rmilson

364.8 power rule

The power rule states that

\[ \frac{D}{Dx} x^p = p x^{p-1}, \qquad p \in \mathbb{R}. \]

This rule, when combined with the chain rule, product rule, and sum rule, makes calculating
many derivatives far more tractable. For positive integer p, this rule can be derived by repeated
application of the product rule. See the proof of the power rule.

Repeated use of the above formula gives

\[
\frac{d^i}{dx^i} x^k = \begin{cases} 0 & i > k, \\ \dfrac{k!}{(k-i)!}\, x^{k-i} & i \le k, \end{cases}
\]

for integers i, k ≥ 0.

Examples

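\[
\begin{aligned}
\frac{D}{Dx} x^0 &= \frac{0}{x} = 0 = \frac{D}{Dx} 1 \\
\frac{D}{Dx} x^1 &= 1 x^0 = 1 = \frac{D}{Dx} x \\
\frac{D}{Dx} x^2 &= 2x \\
\frac{D}{Dx} x^3 &= 3x^2 \\
\frac{D}{Dx} \sqrt{x} &= \frac{D}{Dx} x^{1/2} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}} \\
\frac{D}{Dx} 2x^e &= 2e x^{e-1}
\end{aligned}
\]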
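
The examples above can be spot-checked numerically. The sketch below compares the power-rule value with a central finite difference; the helper names (`central_difference`, `power_rule`, `agrees`), the step size, and the tolerance are assumptions chosen for illustration, and a numerical check of this kind is of course not a proof.

```python
import math


def central_difference(f, x, h=1e-6):
    """Approximate f'(x) by the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


def power_rule(p, x):
    """Derivative of x**p at x according to the power rule."""
    return p * x ** (p - 1)


def agrees(p, x, rel_tol=1e-6):
    """Check that the finite-difference estimate matches p * x**(p - 1)."""
    numeric = central_difference(lambda t: t ** p, x)
    return math.isclose(numeric, power_rule(p, x), rel_tol=rel_tol, abs_tol=1e-9)


if __name__ == "__main__":
    for p, x in [(2, 3.0), (3, 2.0), (0.5, 4.0), (math.e, 1.5)]:
        print(p, x, agrees(p, x))
```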

Version: 4 Owner: mathcam Author(s): mathcam, Logan

364.9 properties of the exponential

The exponential operation possesses the following properties.

• Homogeneity. For x, y ∈ R+, p ∈ R we have

\[ (xy)^p = x^p y^p. \]

• Exponent additivity. For x ∈ R+ we have

\[ x^0 = 1, \qquad x^1 = x. \]

Furthermore
\[ x^{p+q} = x^p x^q, \qquad p, q \in \mathbb{R}. \]

• Monotonicity. For x, y ∈ R+ with x < y and p ∈ R+ we have

\[ x^p < y^p, \qquad x^{-p} > y^{-p}. \]

• Continuity. The exponential operation is continuous with respect to its arguments.
To be more precise, the following function is continuous:

\[ P : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \qquad P(x, y) = x^y. \]

Let us also note that the exponential operation is characterized (in the sense of existence and
uniqueness) by the additivity and continuity properties. [Author’s note: One can probably
get away with substantially less, but I haven’t given this enough thought.]

Version: 10 Owner: rmilson Author(s): rmilson

364.10 squeeze rule

Squeeze rule for sequences

Let f, g, h : N → R be three sequences of real numbers such that

f (n) ≤ g(n) ≤ h(n)

for all n. If limn→∞ f (n) and limn→∞ h(n) exist and are equal, say to a, then limn→∞ g(n)
also exists and equals a.

The proof is fairly straightforward. Let e be any real number > 0. By hypothesis there exist
M, N ∈ N such that
|a − f (n)| < e for all n ≥ M
|a − h(n)| < e for all n ≥ N
Write L = max(M, N). For n ≥ L we have

• if g(n) ≥ a:
|g(n) − a| = g(n) − a ≤ h(n) − a < e

• else g(n) < a and:
|g(n) − a| = a − g(n) ≤ a − f (n) < e

So, for all n ≥ L, we have |g(n) − a| < e, which is the desired conclusion.

Squeeze rule for functions

Let f, g, h : S → R be three real-valued functions on a neighbourhood S of a real number b,
such that
f (x) ≤ g(x) ≤ h(x)
for all x ∈ S − {b}. If limx→b f (x) and limx→b h(x) exist and are equal, say to a, then
limx→b g(x) also exists and equals a.

Again let e be an arbitrary positive real number. Find positive reals α and β such that

|a − f (x)| < e whenever 0 < |b − x| < α

|a − h(x)| < e whenever 0 < |b − x| < β
Write δ = min(α, β). Now, for any x such that 0 < |b − x| < δ, we have

• if g(x) ≥ a:
|g(x) − a| = g(x) − a ≤ h(x) − a < e

• else g(x) < a and:
|g(x) − a| = a − g(x) ≤ a − f (x) < e

and we are done.

Version: 1 Owner: Daume Author(s): Larry Hammick

Chapter 365

26A06 – One-variable calculus

365.1 Darboux’s theorem (analysis)

Let f : [a, b] → R be a real-valued continuous function on [a, b], which is differentiable on
(a, b), differentiable from the right at a, and differentiable from the left at b. Then f′ satisfies
the intermediate value theorem: for every t between f′_+(a) and f′_−(b), there is some x ∈ [a, b]
such that f′(x) = t.

Note that when f is continuously differentiable (f ∈ C^1([a, b])), this is trivially true by the
intermediate value theorem. But even when f′ is not continuous, Darboux’s theorem places
a severe restriction on what it can be.

Version: 3 Owner: mathwizard Author(s): mathwizard, ariels

365.2 Fermat’s Theorem (stationary points)

Let f : (a, b) → R be a continuous function and suppose that x_0 ∈ (a, b) is a local extremum
of f. If f is differentiable at x_0 then f′(x_0) = 0.

Version: 2 Owner: paolini Author(s): paolini

365.3 Heaviside step function

The Heaviside step function is the function H : R → R defined as
\[
H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}
\]
Here, there are many conventions for the value at x = 0. The motivation for setting H(0) =
1/2 is that we can then write H as a function of the signum function (see this page). In
applications, such as the Laplace transform, where the Heaviside function is used extensively,
the value of H(0) is irrelevant.

The function is named after Oliver Heaviside (1850-1925) [1]. However, the function was
already used by Cauchy [2], who defined the function as
\[ u(t) = \frac{1}{2}\left( 1 + t/\sqrt{t^2} \right) \]
and called it a coefficient limitateur [1].

REFERENCES
1. The MacTutor History of Mathematics archive, Oliver Heaviside.
2. The MacTutor History of Mathematics archive, Augustin Louis Cauchy.
3. R.F. Hoskins, Generalised functions, Ellis Horwood Series: Mathematics and its applications,
John Wiley & Sons, 1979.

Version: 1 Owner: Koro Author(s): matte

365.4 Leibniz’ rule

Theorem [Leibniz’ rule] ([1] page 592) Let f and g be real (or complex) valued functions
that are defined on an open interval of R. If f and g are k times differentiable, then
\[ (fg)^{(k)} = \sum_{r=0}^{k} \binom{k}{r} f^{(k-r)} g^{(r)}. \]

For multi-indices, Leibniz’ rule has the following generalization:

Theorem [2] If f, g : R^n → C are smooth functions, and j is a multi-index, then
\[ \partial^j (fg) = \sum_{i \le j} \binom{j}{i}\, \partial^i(f)\, \partial^{j-i}(g), \]
where i is a multi-index.

REFERENCES
1. R. Adams, Calculus, a complete course, Addison-Wesley Publishers Ltd, 3rd ed.
2. http://www.math.umn.edu/ jodeit/course/TmprDist1.pdf

Version: 3 Owner: matte Author(s): matte

365.5 Rolle’s theorem

Rolle’s theorem. If f is a continuous function on [a, b] that is differentiable on (a, b) and
satisfies f(a) = f(b) = 0, then there exists a point c ∈ (a, b) such that f′(c) = 0.

Version: 8 Owner: drini Author(s): drini

365.6 binomial formula

The binomial formula gives the power series expansion of the pth power function for every
real power p. To wit,
\[ (1 + x)^p = \sum_{n=0}^{\infty} p^{\underline{n}}\, \frac{x^n}{n!}, \qquad x \in \mathbb{R},\ |x| < 1, \]
where
\[ p^{\underline{n}} = p(p-1)\cdots(p-n+1) \]
denotes the nth falling factorial of p.

Note that for p ∈ N the power series reduces to a polynomial. The above formula is therefore
a generalization of the binomial theorem.
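
As an informal check of the formula, partial sums of the series can be compared with (1 + x)^p. The sketch below is an illustration only: the function name `binomial_series` and the default number of terms are assumptions, and each term is obtained from the previous one through the ratio (p − n) x / (n + 1), which follows from the definition of the falling factorial.

```python
def binomial_series(p, x, terms=60):
    """Partial sum of sum_n p^(falling n) * x**n / n! using `terms` terms."""
    if abs(x) >= 1:
        raise ValueError("the series is stated for |x| < 1")
    total = 0.0
    term = 1.0  # n = 0 term: empty falling factorial times x**0 / 0!
    for n in range(terms):
        total += term
        # term_{n+1} = term_n * (p - n) * x / (n + 1)
        term *= (p - n) * x / (n + 1)
    return total


if __name__ == "__main__":
    for p, x in [(0.5, 0.2), (3, 0.5), (-1, 0.5)]:
        print(p, x, binomial_series(p, x), (1 + x) ** p)
```

For a natural number p the factor (p − n) vanishes at n = p, so all later terms are zero and the sum is the polynomial given by the binomial theorem.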

Version: 4 Owner: rmilson Author(s): rmilson

365.7 chain rule

Let f (x), g(x) be differentiable, real-valued functions. The derivative of the composition
(f ◦ g)(x) can be found using the chain rule, which asserts that:

\[ (f \circ g)'(x) = f'(g(x))\, g'(x). \]

The chain rule has a particularly suggestive appearance in terms of the Leibniz formalism.
Suppose that z depends differentiably on y, and that y in turn depends differentiably on x.

Then,
\[ \frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx}. \]
The apparent cancellation of the dy term is at best a formal mnemonic, and does not con-
stitute a rigorous proof of this result. Rather, the Leibniz format is well suited to the
interpretation of the chain rule in terms of related rates. To wit:

The instantaneous rate of change of z relative to x is equal to the rate of change
of z relative to y times the rate of change of y relative to x.

Version: 5 Owner: rmilson Author(s): rmilson

365.8 complex Rolle’s theorem

Theorem [1] Suppose Ω is an open convex set in C, suppose f is a holomorphic function
f : Ω → C, and suppose f(a) = f(b) = 0 for distinct points a, b in Ω. Then there exist
points u, v on L_{ab} (the straight line connecting a and b not containing the endpoints), such
that
\[ \operatorname{Re}\{f'(u)\} = 0 \quad\text{and}\quad \operatorname{Im}\{f'(v)\} = 0. \]

REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle’s Theorem, American Mathematical Monthly,
Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.

Version: 4 Owner: matte Author(s): matte

365.9 complex mean-value theorem

Theorem [1] Suppose Ω is an open convex set in C, suppose f is a holomorphic function
f : Ω → C, and suppose a, b are distinct points in Ω. Then there exist points u, v on L_{ab}
(the straight line connecting a and b not containing the endpoints), such that

\[
\begin{aligned}
\operatorname{Re}\left\{ \frac{f(b) - f(a)}{b - a} \right\} &= \operatorname{Re}\{f'(u)\}, \\
\operatorname{Im}\left\{ \frac{f(b) - f(a)}{b - a} \right\} &= \operatorname{Im}\{f'(v)\}.
\end{aligned}
\]

REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle’s Theorem, American Mathematical Monthly,
Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.

Version: 2 Owner: matte Author(s): matte

365.10 definite integral

The definite integral with respect to x of some function f(x) over the closed interval [a, b]
is defined to be the “area under the graph of f(x) with respect to x” (if f(x) is negative,
then you have a negative area). It is written as:
\[ \int_a^b f(x)\,dx. \]
One way to find the value of the integral is to take a limit of an approximation technique as
the precision increases to infinity.

For example, use a Riemann sum which approximates the area by dividing it into n intervals
of equal widths, and then calculating the area of rectangles with the width of the interval
and height dependent on the function’s value in the interval. Let R_n be this approximation,
which can be written as
\[ R_n = \sum_{i=1}^{n} f(x_i^*)\,\Delta x, \]

where x_i^* is some x inside the ith interval.

Then, the integral would be
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} R_n = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x. \]
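
The Riemann sum described above can be sketched in a few lines. In this illustration the sample point in each interval is taken to be the midpoint; that choice, like the function name `riemann_sum`, is an assumption of the sketch, since the definition allows any point of the ith interval.

```python
def riemann_sum(f, a, b, n):
    """Riemann sum R_n for f over [a, b] with n intervals of equal width.

    The sample point of the ith interval is its midpoint (one admissible
    choice among many).
    """
    dx = (b - a) / n
    # a + (i - 0.5) * dx is the midpoint of the ith interval, i = 1, ..., n
    return sum(f(a + (i - 0.5) * dx) for i in range(1, n + 1)) * dx


if __name__ == "__main__":
    # The approximations should approach 1/3 as n grows.
    for n in (10, 100, 1000):
        print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

Calling the function with the endpoints swapped changes the sign of the width and hence of the sum, which mirrors the rule for reversing the limits of integration.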

We can use this definition to arrive at some important properties of definite integrals (a, b,
c are constant with respect to x):
\[
\begin{aligned}
\int_a^b \bigl(f(x) + g(x)\bigr)\,dx &= \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \\
\int_a^b \bigl(f(x) - g(x)\bigr)\,dx &= \int_a^b f(x)\,dx - \int_a^b g(x)\,dx \\
\int_a^b f(x)\,dx &= -\int_b^a f(x)\,dx \\
\int_a^b f(x)\,dx &= \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \\
\int_a^b c f(x)\,dx &= c \int_a^b f(x)\,dx
\end{aligned}
\]

There are other generalisations about integrals, but many require the fundamental theorem of calculus.

Version: 4 Owner: xriso Author(s): xriso

365.11 derivative of even/odd function (proof )

Suppose f(x) = ±f(−x). We need to show that f′(x) = ∓f′(−x). To do this, let us
define the auxiliary function m : R → R, m(x) = −x. The condition on f is then f(x) =
±(f ◦ m)(x). Using the chain rule, we have that

\[
\begin{aligned}
f'(x) &= \pm (f \circ m)'(x) \\
&= \pm f'(m(x))\, m'(x) \\
&= \mp f'(-x),
\end{aligned}
\]

and the claim follows. □

Version: 2 Owner: mathcam Author(s): matte

365.12 direct sum of even/odd functions (example)

Example. direct sum of even and odd functions

Let us define the sets

F = {f | f is a function from R to R},
F+ = {f ∈ F | f(x) = f(−x) for all x ∈ R},
F− = {f ∈ F | f(x) = −f(−x) for all x ∈ R}.

In other words, F contains all functions from R to R, F+ ⊂ F contains all even functions, and
F− ⊂ F contains all odd functions. All of these spaces have a natural vector space structure:
for functions f and g we define f + g as the function x 7→ f (x) + g(x). Similarly, if c is a
real constant, then cf is the function x 7→ cf (x). With these operations, the zero vector is
the mapping x 7→ 0.

We claim that F is the direct sum of F+ and F− , i.e., that

F = F+ ⊕ F− . (365.12.1)

To prove this claim, let us first note that F± are vector subspaces of F . Second, given an
arbitrary function f in F, we can define
\[
\begin{aligned}
f_+(x) &= \tfrac{1}{2}\bigl( f(x) + f(-x) \bigr), \\
f_-(x) &= \tfrac{1}{2}\bigl( f(x) - f(-x) \bigr).
\end{aligned}
\]
Now f+ and f− are even and odd functions, respectively, and f = f+ + f−. Thus any function in F can be
split into two components f+ and f−, such that f+ ∈ F+ and f− ∈ F−. To show that the sum
is direct, suppose f is an element in F+ ∩ F−. Then we have that f(x) = −f(−x) = −f(x),
so f(x) = 0 for all x, i.e., f is the zero vector in F. We have established equation 365.12.1.
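
The splitting used in the proof is easy to try out. In the sketch below the names `even_part` and `odd_part` are illustrative assumptions; applied to the exponential function they should reproduce cosh and sinh, the familiar even and odd parts of e^x.

```python
import math


def even_part(f):
    """Return the function x -> (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))


def odd_part(f):
    """Return the function x -> (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))


if __name__ == "__main__":
    f_plus, f_minus = even_part(math.exp), odd_part(math.exp)
    x = 1.3
    print(f_plus(x), math.cosh(x))   # even part of exp compared with cosh
    print(f_minus(x), math.sinh(x))  # odd part of exp compared with sinh
    print(f_plus(x) + f_minus(x), math.exp(x))
```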

Version: 2 Owner: mathcam Author(s): matte

365.13 even/odd function

Definition.
Let f be a function from R to R. If f (x) = f (−x) for all x ∈ R, then f is an even function.
Similarly, if f (x) = −f (−x) for all x ∈ R, then f is an odd function.

Example.

1. The trigonometric functions sin and cos are odd and even, respectively.

properties.

1. The vector space of real functions can be written as the direct sum of even and odd
functions.

2. Let f : R → R be a differentiable function.

(a) If f is an even function, then the derivative f 0 is an odd function.
(b) If f is an odd function, then the derivative f 0 is an even function.

(proof)

3. Let f : R → R be a smooth function. Then there exist smooth functions g, h : R → R
such that
f(x) = g(x^2) + x h(x^2)
for all x ∈ R. Thus, if f is even, we have f(x) = g(x^2), and if f is odd, we have
f(x) = x h(x^2) ([1], Exercise 1.2)

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.

Version: 4 Owner: mathcam Author(s): matte

365.14 example of chain rule

Suppose we wanted to differentiate
\[ h(x) = \sqrt{\sin(x)}. \]

Here, h(x) is given by the composition

\[ h(x) = f(g(x)), \]

where
\[ f(x) = \sqrt{x} \quad\text{and}\quad g(x) = \sin(x). \]
Then the chain rule says that
\[ h'(x) = f'(g(x))\, g'(x). \]

Since
\[ f'(x) = \frac{1}{2\sqrt{x}} \quad\text{and}\quad g'(x) = \cos(x), \]
we have by the chain rule
\[ h'(x) = \left( \frac{1}{2\sqrt{\sin x}} \right) \cos x = \frac{\cos x}{2\sqrt{\sin x}}. \]

Using the Leibniz formalism, the above calculation would have the following appearance.
First we describe the functional relation as
\[ z = \sqrt{\sin(x)}. \]

Next, we introduce an auxiliary variable y, and write

\[ z = \sqrt{y}, \qquad y = \sin(x). \]

We then have
\[ \frac{dz}{dy} = \frac{1}{2\sqrt{y}}, \qquad \frac{dy}{dx} = \cos(x), \]
and hence the chain rule gives
\[
\begin{aligned}
\frac{dz}{dx} &= \frac{1}{2\sqrt{y}} \cos(x) \\
&= \frac{1}{2}\, \frac{\cos(x)}{\sqrt{\sin(x)}}.
\end{aligned}
\]
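
The result can be sanity-checked numerically at a point where sin(x) > 0. This is a sketch only: the helper name `central_difference`, the step size, and the test point are assumptions made for the illustration.

```python
import math


def h(x):
    """The composite function sqrt(sin(x)), defined where sin(x) >= 0."""
    return math.sqrt(math.sin(x))


def h_prime(x):
    """The derivative obtained from the chain rule: cos(x) / (2 sqrt(sin(x)))."""
    return math.cos(x) / (2 * math.sqrt(math.sin(x)))


def central_difference(f, x, step=1e-6):
    """Approximate f'(x) by a symmetric difference quotient."""
    return (f(x + step) - f(x - step)) / (2 * step)


if __name__ == "__main__":
    x = 1.0
    print(central_difference(h, x), h_prime(x))
```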

Version: 1 Owner: rmilson Author(s): rmilson

365.15 example of increasing/decreasing/monotone function

The function f(x) = e^x is strictly increasing and hence strictly monotone. Similarly g(x) =
e^{−x} is strictly decreasing and hence strictly monotone. Consider the function h : [1, 10] →
[1, 5] where
\[ h(x) = \sqrt{x - 4\sqrt{x-1} + 3} + \sqrt{x - 6\sqrt{x-1} + 8}. \]
It is not strictly monotone since it is constant on an interval, however it is decreasing and
hence monotone.
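
The behaviour of h can be seen by tabulating it. Writing s = √(x − 1), the two radicands are (s − 2)^2 and (s − 3)^2, so h(x) = |s − 2| + |s − 3|; this falls from 5 to 1 on [1, 5] and equals 1 on [5, 10]. The sketch below evaluates the formula directly; the clamping with `max(0.0, ...)` is an assumption added to guard against tiny negative radicands caused by floating-point rounding.

```python
import math


def h(x):
    """Evaluate h(x) = sqrt(x - 4 sqrt(x-1) + 3) + sqrt(x - 6 sqrt(x-1) + 8) on [1, 10]."""
    s = math.sqrt(x - 1)
    # Clamp at 0.0: the exact radicands are squares, but rounding may
    # produce values slightly below zero near x = 5 and x = 10.
    first = math.sqrt(max(0.0, x - 4 * s + 3))
    second = math.sqrt(max(0.0, x - 6 * s + 8))
    return first + second


if __name__ == "__main__":
    for x in range(1, 11):
        print(x, h(x))
```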

Version: 1 Owner: Johan Author(s): Johan

365.16 extended mean-value theorem

Let f : [a, b] → R and g : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Then
there exists some number ξ ∈ (a, b) satisfying:

\[ (f(b) - f(a))\, g'(\xi) = (g(b) - g(a))\, f'(\xi). \]

If g is linear this becomes the usual mean-value theorem.

Version: 6 Owner: mathwizard Author(s): mathwizard

365.17 increasing/decreasing/monotone function

Definition Let A be a subset of R, and let f be a function f : A → R. Then

1. f is increasing, if x ≤ y implies that f(x) ≤ f(y) (for all x and y in A).

2. f is strictly increasing, if x < y implies that f(x) < f(y).

3. f is decreasing, if x ≤ y implies that f(x) ≥ f(y).

4. f is strictly decreasing, if x < y implies that f(x) > f(y).

5. f is monotone, if f is either increasing or decreasing.

6. f is strictly monotone, if f is either strictly increasing or strictly decreasing.

Theorem Let X be a bounded or unbounded open interval of R. In other words, let X be
an interval of the form X = (a, b), where a, b ∈ R ∪ {−∞, ∞}. Further, let f : X → R be a
monotone function.

1. The set of points where f is discontinuous is at most countable [1, 1].

2. (Lebesgue) f is differentiable almost everywhere ([1], p. 514).

REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.
2. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.

Version: 3 Owner: matte Author(s): matte

365.18 intermediate value theorem

Let f be a continuous function on the interval [a, b]. Let x_1 and x_2 be points with a ≤ x_1 <
x_2 ≤ b such that f(x_1) ≠ f(x_2). Then for each value y between f(x_1) and f(x_2), there is a
c ∈ (x_1, x_2) such that f(c) = y.

Bolzano’s theorem is a special case of this one.

Version: 2 Owner: drini Author(s): drini

365.19 limit

Let f : X \ {a} −→ Y be a function between two metric spaces X and Y , defined everywhere
except at some a ∈ X. For L ∈ Y , we say the limit of f (x) as x approaches a is equal to L,
or
\[ \lim_{x \to a} f(x) = L \]

if, for every real number ε > 0, there exists a real number δ > 0 such that, whenever x ∈ X
with 0 < dX (x, a) < δ, then dY (f (x), L) < ε.

The formal definition of limit as given above has a well-deserved reputation for being
notoriously hard for inexperienced students to master. There is no easy fix for this problem,
since the concept of a limit is inherently difficult to state precisely (and indeed this wasn’t
accomplished historically until the 1800’s by Cauchy, well after the invention of calculus in
the 1600’s by Newton and Leibniz). However, there are a number of related definitions, which,
taken together, may shed some light on the nature of the concept.

• The notion of a limit can be generalized to mappings between arbitrary topological spaces.
In this context we say that lim_{x→a} f(x) = L if and only if, for every neighborhood V
of L (in Y), there is a deleted neighborhood U of a (in X) which is mapped into V by
f.
• Let a_n, n ∈ N be a sequence of elements in a metric space X. We say that L ∈ X is
the limit of the sequence, if for every ε > 0 there exists a natural number N such that
d(a_n, L) < ε for all natural numbers n > N.
• The definition of the limit of a mapping can be based on the limit of a sequence. To
wit, lim_{x→a} f(x) = L if and only if, for every sequence of points x_n in X converging to
a (that is, x_n → a, x_n ≠ a), the sequence of points f(x_n) in Y converges to L.

In calculus, X and Y are frequently taken to be Euclidean spaces Rn and Rm , in which case
the distance functions dX and dY cited above are just Euclidean distance.

Version: 5 Owner: djao Author(s): rmilson, djao

365.20 mean value theorem

Mean value theorem Let f : [a, b] → R be a continuous function differentiable on (a, b).

Then there is some real number x_0 ∈ (a, b) such that
\[ f'(x_0) = \frac{f(b) - f(a)}{b - a}. \]

Version: 3 Owner: drini Author(s): drini, apmxi

365.21 mean-value theorem

Let f : R → R be a function which is continuous on the interval [a, b] and differentiable on
(a, b). Then there exists a number c with a < c < b such that

\[ f'(c) = \frac{f(b) - f(a)}{b - a}. \tag{365.21.1} \]
Geometrically, the theorem says that the tangent line to the graph of f at some point c of
(a, b) is parallel to the chord joining (a, f(a)) and (b, f(b)).

This is often used in the integral context: ∃c ∈ [a, b] such that

\[ (b - a) f(c) = \int_a^b f(x)\,dx. \tag{365.21.2} \]

Version: 4 Owner: mathwizard Author(s): mathwizard, drummond

365.22 monotonicity criterion

Suppose that f : [a, b] → R is a function which is continuous on [a, b] and differentiable on
(a, b).

Then the following relations hold.

1. f′(x) ≥ 0 for all x ∈ (a, b) ⇔ f is an increasing function on [a, b];

2. f′(x) ≤ 0 for all x ∈ (a, b) ⇔ f is a decreasing function on [a, b];

3. f′(x) > 0 for all x ∈ (a, b) ⇒ f is a strictly increasing function on [a, b];

4. f′(x) < 0 for all x ∈ (a, b) ⇒ f is a strictly decreasing function on [a, b].

Notice that the converses of the third and fourth statements do not hold. As an example consider the
function f : [−1, 1] → R, f(x) = x^3. This is a strictly increasing function, but f′(0) = 0.

Version: 4 Owner: paolini Author(s): paolini

365.23 nabla

Let f : R^n → R be a C^1(R^n) function, that is, a function with continuous partial derivatives
in all its coordinates. The symbol ∇, named nabla, represents the gradient operator whose action on
f(x_1, x_2, ..., x_n) is given by

\[
\nabla f = (f_{x_1}, f_{x_2}, \ldots, f_{x_n})
= \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right).
\]

Version: 2 Owner: drini Author(s): drini, apmxi

365.24 one-sided limit

Let f be a real-valued function defined on S ⊆ R. The left-hand one-sided limit at a is
defined to be the real number L^− such that for every ε > 0 there exists a δ > 0 such that
|f(x) − L^−| < ε whenever 0 < a − x < δ.

Analogously, the right-hand one-sided limit at a is the real number L^+ such that for
every ε > 0 there exists a δ > 0 such that |f(x) − L^+| < ε whenever 0 < x − a < δ.

Common notations for the one-sided limits are

\[ L^+ = f(a+) = \lim_{x \to a^+} f(x) = \lim_{x \searrow a} f(x), \]

\[ L^- = f(a-) = \lim_{x \to a^-} f(x) = \lim_{x \nearrow a} f(x). \]

Sometimes, left-handed limits are referred to as limits from below while right-handed limits
are from above.

Theorem The ordinary limit of a function exists at a point if and only if both one-sided
limits exist at this point and are equal (to the ordinary limit).

e.g., The Heaviside unit step function, sometimes colloquially referred to as the diving board
function, defined by
\[ H(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x > 0, \end{cases} \]
has the simplest kind of discontinuity at x = 0, a jump discontinuity. Its ordinary limit does
not exist at this point, but the one-sided limits do exist, and are

\[ \lim_{x \to 0^-} H(x) = 0 \quad\text{and}\quad \lim_{x \to 0^+} H(x) = 1. \]

Version: 5 Owner: matte Author(s): matte, NeuRet

365.25 product rule

The product rule states that if f : R → R and g : R → R are functions in one variable
both differentiable at a point x_0, then the derivative of the product of the two functions,
denoted f · g, at x_0 is given by
\[ \frac{D}{Dx}(f \cdot g)(x_0) = f(x_0)\, g'(x_0) + f'(x_0)\, g(x_0). \]

Proof

See the proof of the product rule.

365.25.1 Generalized Product Rule

More generally, for functions f_1, f_2, ..., f_n in one variable, all differentiable at
x_0, we have
\[ D(f_1 \cdots f_n)(x_0) = \sum_{i=1}^{n} \bigl( f_1(x_0) \cdots f_{i-1}(x_0) \cdot Df_i(x_0) \cdot f_{i+1}(x_0) \cdots f_n(x_0) \bigr). \]

Also see Leibniz’ rule.

Example

The derivative of x ln |x| can be found by application of this rule. Let f(x) = x, g(x) = ln |x|,
so that f(x)g(x) = x ln |x|. Then f′(x) = 1 and g′(x) = 1/x. Therefore, by the product rule,

\[
\begin{aligned}
\frac{D}{Dx}(x \ln|x|) &= f(x)\, g'(x) + f'(x)\, g(x) \\
&= \frac{x}{x} + 1 \cdot \ln|x| \\
&= \ln|x| + 1.
\end{aligned}
\]

Version: 8 Owner: mathcam Author(s): mathcam, Logan

365.26 proof of Darboux’s theorem

WLOG, assume f′_+(a) > t > f′_−(b). Let g(x) = f(x) − tx. Then g′(x) = f′(x) − t,
g′_+(a) > 0 > g′_−(b), and we wish to find a zero of g′.

g is a continuous function on [a, b], so it attains a maximum on [a, b]. This maximum cannot
be at a, since g′_+(a) > 0 so g is locally increasing at a. Similarly, g′_−(b) < 0, so g is locally
decreasing at b and cannot have a maximum at b. So the maximum is attained at some
c ∈ (a, b). But then g′(c) = 0 by Fermat’s theorem.

Version: 2 Owner: paolini Author(s): paolini, ariels

365.27 proof of Fermat’s Theorem (stationary points)

Suppose that x_0 is a local maximum (a similar proof applies if x_0 is a local minimum). Then
there exists δ > 0 such that (x_0 − δ, x_0 + δ) ⊂ (a, b) and such that we have f(x_0) ≥ f(x) for
all x with |x − x_0| < δ. Hence for h ∈ (0, δ) we notice that
\[ \frac{f(x_0 + h) - f(x_0)}{h} \le 0. \]
Since the limit of this ratio as h → 0+ exists and is equal to f′(x_0) we conclude that
f′(x_0) ≤ 0. On the other hand for h ∈ (−δ, 0) we notice that
\[ \frac{f(x_0 + h) - f(x_0)}{h} \ge 0, \]
but again the limit as h → 0− exists and is equal to f′(x_0) so we also have f′(x_0) ≥ 0.

Hence we conclude that f′(x_0) = 0.

Version: 1 Owner: paolini Author(s): paolini

365.28 proof of Rolle’s theorem

Because f is continuous on a compact (closed and bounded) interval I = [a, b], it attains its
maximum and minimum values. In case f(a) = f(b) is both the maximum and the minimum,
then there is nothing more to say, for then f is a constant function and f′ ≡ 0 on the whole
interval I. So suppose otherwise, and f attains an extremum in the open interval (a, b), and
without loss of generality, let this extremum be a maximum, considering −f in lieu of f as
necessary. We claim that at this extremum f(c) we have f′(c) = 0, with a < c < b.

To show this, note that f(x) − f(c) ≤ 0 for all x ∈ I, because f(c) is the maximum. By
definition of the derivative, we have that
\[ f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}. \]
Looking at the one-sided limits, we note that
\[ R = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0 \]
because the numerator in the limit is nonpositive in the interval I, yet x − c > 0, as x
approaches c from the right. Similarly,
\[ L = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0. \]
Since f is differentiable at c, the left and right limits must coincide, so 0 ≤ L = R ≤ 0, that
is to say, f′(c) = 0.

Version: 1 Owner: rmilson Author(s): NeuRet

365.29 proof of Taylor’s Theorem

Let n be a natural number and I be the closed interval [a, b]. We have that f : I → R has
n continuous derivatives and its (n + 1)-st derivative exists. Suppose that c ∈ I, and x ∈ I
is arbitrary. Let J be the closed interval with endpoints c and x.

Define F : J → R by
\[ F(t) := f(x) - \sum_{k=0}^{n} \frac{(x - t)^k}{k!} f^{(k)}(t) \tag{365.29.1} \]
so that
\[
\begin{aligned}
F'(t) &= -f'(t) - \sum_{k=1}^{n} \left( \frac{(x - t)^k}{k!} f^{(k+1)}(t) - \frac{(x - t)^{k-1}}{(k-1)!} f^{(k)}(t) \right) \\
&= -\frac{(x - t)^n}{n!} f^{(n+1)}(t)
\end{aligned}
\]
since the sum telescopes. Now, define G on J by
\[ G(t) := F(t) - \left( \frac{x - t}{x - c} \right)^{n+1} F(c) \]

and notice that G(c) = G(x) = 0. Hence, Rolle’s theorem gives us a ζ strictly between x
and c such that
\[ 0 = G'(\zeta) = F'(\zeta) + (n + 1) \frac{(x - \zeta)^n}{(x - c)^{n+1}} F(c), \]
which yields

\[
\begin{aligned}
F(c) &= -\frac{1}{n + 1} \frac{(x - c)^{n+1}}{(x - \zeta)^n} F'(\zeta) \\
&= \frac{1}{n + 1} \frac{(x - c)^{n+1}}{(x - \zeta)^n} \frac{(x - \zeta)^n}{n!} f^{(n+1)}(\zeta) \\
&= \frac{f^{(n+1)}(\zeta)}{(n + 1)!} (x - c)^{n+1},
\end{aligned}
\]

from which we conclude, recalling (365.29.1),
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n+1)}(\zeta)}{(n + 1)!} (x - c)^{n+1}. \]

Version: 3 Owner: rmilson Author(s): NeuRet

365.30 proof of binomial formula

Let p ∈ R and x ∈ R, |x| < 1 be given. We wish to show that
\[ (1 + x)^p = \sum_{n=0}^{\infty} p^{\underline{n}}\, \frac{x^n}{n!}, \]
where p^{\underline{n}} denotes the nth falling factorial of p.

The convergence of the series in the right-hand side of the above equation is a
straightforward consequence of the ratio test. Set
\[ f(x) = (1 + x)^p \]
and note that
\[ f^{(n)}(x) = p^{\underline{n}}\, (1 + x)^{p - n}. \]
The desired equality now follows from Taylor’s Theorem. Q.E.D.

Version: 2 Owner: rmilson Author(s): rmilson

365.31 proof of chain rule

Let’s say that g is differentiable at x_0 and f is differentiable at y_0 = g(x_0). We define:
\[
\varphi(y) = \begin{cases} \dfrac{f(y) - f(y_0)}{y - y_0} & \text{if } y \ne y_0, \\ f'(y_0) & \text{if } y = y_0. \end{cases}
\]

Since f is differentiable at y_0, ϕ is continuous at y_0. We observe that, for x ≠ x_0,
\[ \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \varphi(g(x))\, \frac{g(x) - g(x_0)}{x - x_0}; \]
in fact, if g(x) ≠ g(x_0), it follows at once from the definition of ϕ, while if g(x) = g(x_0),
both members of the equation are 0.

Since g is continuous at x_0, and ϕ is continuous at y_0,
\[ \lim_{x \to x_0} \varphi(g(x)) = \varphi(g(x_0)) = f'(g(x_0)), \]

hence
\[
\begin{aligned}
(f \circ g)'(x_0) &= \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} \\
&= \lim_{x \to x_0} \varphi(g(x))\, \frac{g(x) - g(x_0)}{x - x_0} \\
&= f'(g(x_0))\, g'(x_0).
\end{aligned}
\]

Version: 3 Owner: n3o Author(s): n3o

365.32 proof of extended mean-value theorem

Let f : [a, b] → R and g : [a, b] → R be continuous on [a, b] and differentiable on (a, b).
Define the function

h(x) = f (x) (g(b) − g(a)) − g(x) (f (b) − f (a)) − f (a)g(b) + f (b)g(a).

Because f and g are continuous on [a, b] and differentiable on (a, b), so is h. Furthermore,
h(a) = h(b) = 0, so by Rolle’s theorem there exists a ξ ∈ (a, b) such that h′(ξ) = 0. This
implies that
\[ f'(\xi)\,(g(b) - g(a)) - g'(\xi)\,(f(b) - f(a)) = 0 \]
and, if g(b) ≠ g(a) and g′(ξ) ≠ 0,
\[ \frac{f'(\xi)}{g'(\xi)} = \frac{f(b) - f(a)}{g(b) - g(a)}. \]

Version: 3 Owner: pbruin Author(s): pbruin

365.33 proof of intermediate value theorem

We first prove the following lemma.

If f : [a, b] → R is a continuous function with f (a) ≤ 0 ≤ f (b) then ∃c ∈ [a, b] such that
f (c) = 0.

Define the sequences (a_n) and (b_n) inductively, as follows.

\[ a_0 = a, \qquad b_0 = b, \qquad c_n = \frac{a_n + b_n}{2}, \]
\[
(a_n, b_n) = \begin{cases} (a_{n-1}, c_{n-1}) & f(c_{n-1}) \ge 0, \\ (c_{n-1}, b_{n-1}) & f(c_{n-1}) < 0. \end{cases}
\]

We note that

\[ a_0 \le a_1 \le \cdots \le a_n \le b_n \le \cdots \le b_1 \le b_0, \]

\[ b_n - a_n = 2^{-n} (b_0 - a_0), \tag{365.33.1} \]

\[ f(a_n) \le 0 \le f(b_n). \tag{365.33.2} \]

By the fundamental axiom of analysis (a_n) → α and (b_n) → β. But (b_n − a_n) → 0 so α = β.
By continuity of f,
\[ f(a_n) \to f(\alpha), \qquad f(b_n) \to f(\alpha). \]
But we have f(α) ≤ 0 and f(α) ≥ 0 so that f(α) = 0. Furthermore we have a ≤ α ≤ b,
proving the assertion.

Set g(x) = f(x) − k where f(a) ≤ k ≤ f(b). Then g satisfies the same conditions as before, so
there exists c such that g(c) = 0, that is, f(c) = k. This proves the more general result.
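
The inductive construction in the lemma is the bisection method, and it can be run for finitely many steps to approximate the point α. The sketch below follows the same case split on the sign of f at the midpoint; the function name `bisect_root`, the tolerance, and the step limit are assumptions of the illustration.

```python
def bisect_root(f, a, b, tol=1e-12, max_steps=200):
    """Approximate a zero of f on [a, b], assuming f(a) <= 0 <= f(b).

    Each step keeps the invariant f(a_n) <= 0 <= f(b_n) and halves the
    length b_n - a_n, as in the sequences (a_n) and (b_n) of the proof.
    """
    if not (f(a) <= 0 <= f(b)):
        raise ValueError("expected f(a) <= 0 <= f(b)")
    for _ in range(max_steps):
        if b - a <= tol:
            break
        c = (a + b) / 2
        if f(c) >= 0:
            b = c  # new interval (a_{n-1}, c_{n-1})
        else:
            a = c  # new interval (c_{n-1}, b_{n-1})
    return (a + b) / 2


if __name__ == "__main__":
    # Solve x**2 = 2 on [0, 2] by applying the lemma to g(x) = x**2 - 2.
    print(bisect_root(lambda x: x * x - 2, 0.0, 2.0))
```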

Version: 2 Owner: vitriol Author(s): vitriol

365.34 proof of mean value theorem

Define h(x) on [a, b] by
\[ h(x) = f(x) - f(a) - \left( \frac{f(b) - f(a)}{b - a} \right)(x - a). \]

Clearly, h is continuous on [a, b], differentiable on (a, b), and

\[
\begin{aligned}
h(a) &= f(a) - f(a) = 0, \\
h(b) &= f(b) - f(a) - \left( \frac{f(b) - f(a)}{b - a} \right)(b - a) = 0.
\end{aligned}
\]

Notice that h satisfies the conditions of Rolle’s theorem. Therefore, by Rolle’s Theorem
there exists c ∈ (a, b) such that h′(c) = 0.

However, from the definition of h we obtain by differentiation that
\[ h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}. \]
Since h′(c) = 0, we therefore have

\[ f'(c) = \frac{f(b) - f(a)}{b - a} \]
as required.

REFERENCES
1. Michael Spivak, Calculus, 3rd ed., Publish or Perish Inc., 1994.

Version: 2 Owner: saforres Author(s): saforres

365.35 proof of monotonicity criterion

Let us start from the implications “⇒”.

Suppose that f′(x) ≥ 0 for all x ∈ (a, b). We want to prove that therefore f is increasing. So
take x_1, x_2 ∈ [a, b] with x_1 < x_2. Applying the mean-value theorem on the interval [x_1, x_2]
we know that there exists a point x ∈ (x_1, x_2) such that

\[ f(x_2) - f(x_1) = f'(x)(x_2 - x_1) \]

and since f′(x) ≥ 0 we conclude that f(x_2) ≥ f(x_1).

This proves the first claim. The other three cases can be achieved with minor modifications:
replace all “≥” respectively with ≤, > and <.

Let us now prove the implication “⇐” for the first and second statement.

Given x ∈ (a, b) consider the ratio
\[ \frac{f(x + h) - f(x)}{h}. \]
If f is increasing the numerator of this ratio is ≥ 0 when h > 0 and is ≤ 0 when h < 0.
In either case the ratio is ≥ 0 since the denominator has the same sign as the numerator. Since
we know by hypothesis that the function f is differentiable at x we can pass to the limit to
conclude that
\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \ge 0. \]

If f is decreasing the ratio considered turns out to be ≤ 0 hence the conclusion f′(x) ≤ 0.

Notice that if we suppose that f is strictly increasing we obtain that this ratio is > 0, but
passing to the limit as h → 0 we cannot conclude that f′(x) > 0 but only (again) f′(x) ≥ 0.

Version: 2 Owner: paolini Author(s): paolini

365.36 proof of quotient rule

Let F(x) = f(x)/g(x). Then

\[
\begin{aligned}
F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h}
&= \lim_{h \to 0} \frac{\frac{f(x+h)}{g(x+h)} - \frac{f(x)}{g(x)}}{h} \\
&= \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x + h)}{h\, g(x + h)\, g(x)}.
\end{aligned}
\]

Like the product rule, the key to this proof is subtracting and adding the same quantity. We
separate f and g in the above expression by subtracting and adding the term f(x)g(x) in
the numerator.

\[
\begin{aligned}
F'(x) &= \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x + h)}{h\, g(x + h)\, g(x)} \\
&= \lim_{h \to 0} \frac{g(x) \frac{f(x+h) - f(x)}{h} - f(x) \frac{g(x+h) - g(x)}{h}}{g(x + h)\, g(x)} \\
&= \frac{\lim_{h \to 0} g(x) \cdot \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} - \lim_{h \to 0} f(x) \cdot \lim_{h \to 0} \frac{g(x+h) - g(x)}{h}}{\lim_{h \to 0} g(x + h) \cdot \lim_{h \to 0} g(x)} \\
&= \frac{g(x) f'(x) - f(x) g'(x)}{[g(x)]^2}.
\end{aligned}
\]

Version: 1 Owner: Luci Author(s): Luci

365.37 quotient rule

The quotient rule says that the derivative of the quotient f/g of two differentiable functions
f and g exists at all values of x as long as g(x) ≠ 0 and is given by the formula
\[ \frac{d}{dx}\left( \frac{f(x)}{g(x)} \right) = \frac{g(x) f'(x) - f(x) g'(x)}{[g(x)]^2}. \]

The Quotient Rule and the other differentiation formulas allow us to compute the derivative
of any rational function.

Version: 10 Owner: Luci Author(s): Luci

365.38 signum function

The signum function is the function sign : R → R
\[
\operatorname{sign}(x) = \begin{cases} -1 & \text{when } x < 0, \\ 0 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}
\]

The following properties hold:

1. For all x ∈ R, sign(−x) = − sign(x).

2. For all x ∈ R, |x| = sign(x)x.

3. For all x ≠ 0, d/dx |x| = sign(x).

Here, we should point out that the signum function is often defined simply as 1 for x > 0 and
−1 for x < 0. Thus, at x = 0, it is left undefined. See e.g. [2]. In applications, such as the
Laplace transform, this definition is adequate since the value of a function at a single point
does not change the analysis. One could then, in fact, set sign(0) to any value. However,
setting sign(0) = 0 is motivated by the above relations.

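A related function is the Heaviside step function defined as
\[
H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}
\]

Again, this function is sometimes left undefined at x = 0. The motivation for setting
H(0) = 1/2 is that for all x ∈ R, we then have the relations
\[
\begin{aligned}
H(x) &= \tfrac{1}{2}(\operatorname{sign}(x) + 1), \\
H(-x) &= 1 - H(x).
\end{aligned}
\]

This first relation is clear. For the second, we have
\[
\begin{aligned}
1 - H(x) &= 1 - \tfrac{1}{2}(\operatorname{sign}(x) + 1) \\
&= \tfrac{1}{2}(1 - \operatorname{sign}(x)) \\
&= \tfrac{1}{2}(1 + \operatorname{sign}(-x)) \\
&= H(-x).
\end{aligned}
\]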

Example Let a < b be real numbers, and let f : R → R be the piecewise defined function
\[
f(x) =
\begin{cases}
4 & \text{when } x \in (a, b), \\
0 & \text{otherwise.}
\end{cases}
\]

Using the Heaviside step function, we can write

\[ f(x) = 4\,\bigl(H(x - a) - H(x - b)\bigr) \tag{365.38.1} \]

almost everywhere. Indeed, if we evaluate the right-hand side of equation 365.38.1 we obtain 4 for
x ∈ (a, b), 0 for x ∉ [a, b], and 2 at x = a and x = b. Therefore, equation 365.38.1 holds
at all points except a and b. □
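
The relations above can be checked numerically. The following Python sketch is only an illustration; the function names `sign`, `heaviside` and `box` are our own choices and do not come from the entry.

```python
def sign(x: float) -> float:
    """Real signum function with the convention sign(0) = 0."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def heaviside(x: float) -> float:
    """Heaviside step function with H(0) = 1/2, via H(x) = (sign(x) + 1) / 2."""
    return 0.5 * (sign(x) + 1.0)


def box(x: float, a: float, b: float) -> float:
    """Right-hand side of (365.38.1): 4 * (H(x - a) - H(x - b))."""
    return 4.0 * (heaviside(x - a) - heaviside(x - b))


if __name__ == "__main__":
    for x in (-2.0, 0.0, 3.5):
        # H(-x) = 1 - H(x) and |x| = sign(x) * x
        assert heaviside(-x) == 1.0 - heaviside(x)
        assert abs(x) == sign(x) * x
    # Inside (a, b), at the endpoint a, and outside [a, b], with a = 1, b = 2.
    print(box(1.5, 1.0, 2.0), box(1.0, 1.0, 2.0), box(5.0, 1.0, 2.0))
```

With a = 1 and b = 2 the three evaluations correspond to the three cases discussed in the example: a point of (a, b), an endpoint, and a point outside [a, b].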

365.38.1 Signum function for complex arguments

For a complex number z, the signum function is defined as [1]
\[
\operatorname{sign}(z) =
\begin{cases}
0 & \text{when } z = 0, \\
z/|z| & \text{when } z \neq 0.
\end{cases}
\]

In other words, if z is non-zero, then sign z is the projection of z onto the unit circle {z ∈
C | |z| = 1}. Clearly, the complex signum function reduces to the real signum function for
real arguments. For all z ∈ C, we have
\[ \bar{z}\,\operatorname{sign} z = |z|, \]
where $\bar{z}$ is the complex conjugate of z.

REFERENCES
1. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.

Version: 4 Owner: mathcam Author(s): matte

Chapter 366

26A09 – Elementary functions

366.1 definitions in trigonometry

Informal definitions

Given a triangle ABC with a signed angle x at A and a right angle at B, the ratios
\[ \frac{BC}{AC}, \qquad \frac{AB}{AC}, \qquad \frac{BC}{AB} \]
are dependent only on the angle x, and therefore define functions, denoted by
\[ \sin x, \qquad \cos x, \qquad \tan x \]
respectively, where the names are short for sine, cosine and tangent. Their reciprocals are
rather less important, but also have names:
\[
\begin{aligned}
\cot x &= AB/BC = \frac{1}{\tan x} && \text{(cotangent)} \\
\csc x &= AC/BC = \frac{1}{\sin x} && \text{(cosecant)} \\
\sec x &= AC/AB = \frac{1}{\cos x} && \text{(secant)}
\end{aligned}
\]
From Pythagoras’s theorem we have $\cos^2 x + \sin^2 x = 1$ for all (real) x. Also it is “clear”
from the diagram that the functions cos and sin are periodic with period 2π. However:

Formal definitions

The above definitions are not fully rigorous, because we have not defined the word angle.
We will sketch a more rigorous approach.

The power series
\[ \sum_{n=0}^{\infty} \frac{x^n}{n!} \]
converges uniformly on compact subsets of C and its sum, denoted by exp(x) or by $e^x$, is
therefore an entire function of x, called the exponential function. f (x) = exp(x) is the
unique solution of the initial value problem
\[ f(0) = 1, \qquad f'(x) = f(x) \]

on R. The sine and cosine functions, for real arguments, are defined in terms of exp, simply
by
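\[ \exp(ix) = \cos x + i \sin x. \]
Thus
\[
\begin{aligned}
\cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \\
\sin x &= \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots
\end{aligned}
\]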
Although it is not self-evident, cos and sin are periodic functions on the real line, and have
the same period. That period is denoted by 2π.
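
As an illustration of the formal definition, the Python sketch below sums the exponential series at the purely imaginary argument ix and reads cos x and sin x off the real and imaginary parts. The names `exp_series` and `cos_sin` and the default of 40 terms are assumptions made for this example only.

```python
import math
from typing import Tuple


def exp_series(z: complex, terms: int = 40) -> complex:
    """Partial sum of sum_{n >= 0} z**n / n! using the first `terms` terms."""
    total = 0j
    term = 1 + 0j  # z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # turns z**n / n! into z**(n+1) / (n+1)!
    return total


def cos_sin(x: float, terms: int = 40) -> Tuple[float, float]:
    """Approximate (cos x, sin x) from exp(ix) = cos x + i sin x."""
    w = exp_series(1j * x, terms)
    return w.real, w.imag


if __name__ == "__main__":
    c, s = cos_sin(1.0)
    # Compare with the library functions and with cos^2 + sin^2 = 1.
    print(c - math.cos(1.0), s - math.sin(1.0), c * c + s * s)
```

For moderate |x| a few dozen terms suffice; for large |x| the partial sums suffer from cancellation, so this is a demonstration of the definition rather than a practical way to evaluate the functions.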

Version: 3 Owner: Daume Author(s): Larry Hammick

366.2 hyperbolic functions

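The hyperbolic functions sinh x and cosh x are defined as follows:
\[
\begin{aligned}
\sinh x &:= \frac{e^x - e^{-x}}{2} \\
\cosh x &:= \frac{e^x + e^{-x}}{2}.
\end{aligned}
\]
One can then also define the functions tanh x and coth x in analogy to the definitions of
tan x and cot x:
\[
\begin{aligned}
\tanh x &:= \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}} \\
\coth x &:= \frac{\cosh x}{\sinh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}.
\end{aligned}
\]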
The hyperbolic functions are named in that way because the hyperbola
\[ \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \]
can be written in parametric form with the equations:
\[ x = a \cosh t, \qquad y = b \sinh t. \]
This is because of the equation
\[ \cosh^2 x - \sinh^2 x = 1. \]

There are also addition formulas which are like the ones for trigonometric functions:

sinh(x ± y) = sinh x cosh y ± cosh x sinh y
cosh(x ± y) = cosh x cosh y ± sinh x sinh y.

The Taylor series for the hyperbolic functions are:
\[
\begin{aligned}
\sinh x &= \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} \\
\cosh x &= \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}.
\end{aligned}
\]

Using complex numbers we can use the hyperbolic functions to express the trigonometric
functions:
\[
\begin{aligned}
\sin x &= \frac{\sinh(ix)}{i} \\
\cos x &= \cosh(ix).
\end{aligned}
\]
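
The identities of this entry can be spot-checked in floating point. In the Python sketch below the helper names (`sinh_def`, `cosh_def`, `sin_via_sinh`, `cos_via_cosh`) are hypothetical; they implement the exponential definitions directly using the standard `cmath.exp`.

```python
import cmath
import math


def sinh_def(x: complex) -> complex:
    """sinh x := (e^x - e^(-x)) / 2."""
    return (cmath.exp(x) - cmath.exp(-x)) / 2


def cosh_def(x: complex) -> complex:
    """cosh x := (e^x + e^(-x)) / 2."""
    return (cmath.exp(x) + cmath.exp(-x)) / 2


def sin_via_sinh(x: float) -> float:
    """sin x = sinh(ix) / i."""
    return (sinh_def(1j * x) / 1j).real


def cos_via_cosh(x: float) -> float:
    """cos x = cosh(ix)."""
    return cosh_def(1j * x).real


if __name__ == "__main__":
    x, y = 0.7, 0.3
    # cosh^2 x - sinh^2 x = 1
    print(abs(cosh_def(x) ** 2 - sinh_def(x) ** 2 - 1))
    # sinh(x + y) = sinh x cosh y + cosh x sinh y
    print(abs(sinh_def(x + y)
              - (sinh_def(x) * cosh_def(y) + cosh_def(x) * sinh_def(y))))
    # Relation to the trigonometric functions.
    print(sin_via_sinh(x) - math.sin(x), cos_via_cosh(x) - math.cos(x))
```

All printed quantities should be at the level of rounding error.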

Version: 2 Owner: mathwizard Author(s): mathwizard

Chapter 367

26A12 – Rate of growth of functions,
orders of infinity, slowly varying
functions

367.1 Landau notation

Given two functions f and g from $\mathbb{R}^+$ to $\mathbb{R}^+$, the notation
\[ f = O(g) \]
means that the ratio $\frac{f(x)}{g(x)}$ stays bounded as x → ∞. If moreover that ratio approaches zero,
we write
\[ f = o(g). \]

It is legitimate to write, say, $2x = O(x) = O(x^2)$, with the understanding that we are using
the equality sign in an unsymmetric (and informal) way, in that we do not have, for example,
$O(x^2) = O(x)$.

The notation
\[ f = \Omega(g) \]
means that the ratio $\frac{f(x)}{g(x)}$ is bounded away from zero as x → ∞, or equivalently g = O(f ).

If both f = O(g) and f = Ω(g), we write f = Θ(g).

One more notational convention in this group is

f (x) ∼ g(x),

meaning $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$.

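In analysis, such notation is useful in describing error estimates. For example, the Riemann hypothesis
is equivalent to the conjecture
\[ \pi(x) = \operatorname{Li}(x) + O(\sqrt{x}\,\log x), \]
where $\operatorname{Li}(x) = \int_2^x \frac{dt}{\log t}$ is the logarithmic integral. (The cruder approximation $x/\log x$
cannot be used here, since it differs from $\pi(x)$ by a term of order $x/\log^2 x$.)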

Landau notation is also handy in applied mathematics, e.g. in describing the efficiency of an
algorithm. It is common to say that an algorithm requires $O(x^3)$ steps, for example, without
needing to specify exactly what is a step; for if $f = O(x^3)$, then $f = O(Ax^3)$ for any positive
constant A.

Version: 8 Owner: mathcam Author(s): Larry Hammick, Logan

Chapter 368

26A15 – Continuity and related
questions (modulus of continuity,
semicontinuity, discontinuities, etc.)

368.1 Dirichlet’s function

Dirichlet’s function f : R → R is defined as
\[
f(x) =
\begin{cases}
\frac{1}{q} & \text{if } x = \frac{p}{q} \text{ is a rational number in lowest terms with } q > 0, \\
0 & \text{if } x \text{ is an irrational number.}
\end{cases}
\]

This function has the property that it is continuous at every irrational number and discontinuous
at every rational one.

Version: 3 Owner: urz Author(s): urz

368.2 semi-continuous

A real function f : A → R, where A ⊆ R is said to be lower semi-continuous in x0 if

∀ε > 0 ∃δ > 0 ∀x ∈ A |x − x0 | < δ ⇒ f (x) > f (x0 ) − ε,

and f is said to be upper semi-continuous if

∀ε > 0 ∃δ > 0 ∀x ∈ A |x − x0 | < δ ⇒ f (x) < f (x0 ) + ε.

Remark A real function is continuous in x0 if and only if it is both upper and lower semicontinuous
in x0 .

We can generalize the definition to arbitrary topological spaces as follows.

Let A be a topological space. f : A → R is lower semicontinuous at x0 if, for each ε > 0
there is a neighborhood U of x0 such that x ∈ U implies f (x) > f (x0 ) − ε.

Theorem Let f : [a, b] → R be a lower (upper) semi-continuous function. Then f has a
minimum (maximum) in [a, b].

Version: 3 Owner: drini Author(s): drini, n3o

368.3 semicontinuous

Definition [1] Suppose X is a topological space, and f is a function from X into the
extended real numbers $\overline{\mathbb{R}}$; f : X → $\overline{\mathbb{R}}$. Then:

1. If {x ∈ X | f (x) > α} is an open set in X for all α ∈ R, then f is said to be lower
semicontinuous.

2. If {x ∈ X | f (x) < α} is an open set in X for all α ∈ R, then f is said to be upper
semicontinuous.

Properties

1. If X is a topological space and f is a function f : X → R, then f is continuous if and
only if f is upper and lower semicontinuous [1, 3].

2. The characteristic function of an open set is lower semicontinuous [1, 3].

3. The characteristic function of a closed set is upper semicontinuous [1, 3].

4. If f and g are lower semicontinuous, then f + g is also lower semicontinuous [3].

REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. D.L. Cohn, Measure Theory, Birkhäuser, 1980.

Version: 2 Owner: bwebste Author(s): matte, apmxi

368.4 uniformly continuous

Let f : A → R be a real function defined on a subset A of the real line. We say that f is
uniformly continuous if, given an arbitrarily small positive ε, there exists a positive δ such
that whenever two points in A differ by less than δ, they are mapped by f into points which
differ by less than ε. In symbols:

∀ε > 0 ∃δ > 0 ∀x, y ∈ A |x − y| < δ ⇒ |f (x) − f (y)| < ε.

Every uniformly continuous function is also continuous, while the converse does not always
hold. For instance, the function f :]0, +∞[→ R defined by f (x) = 1/x is continuous in its
domain, but not uniformly.

A more general definition of uniform continuity applies to functions between metric spaces
(there are even more general environments for uniformly continuous functions, e.g. uniform spaces).
Given a function f : X → Y , where X and Y are metric spaces with distances dX and dY ,
we say that f is uniformly continuous if

∀ε > 0 ∃δ > 0 ∀x, y ∈ X dX (x, y) < δ ⇒ dY (f (x), f (y)) < ε.

Uniformly continuous functions have the property that they map Cauchy sequences to Cauchy
sequences and that they preserve uniform convergence of sequences of functions.

Any continuous function defined on a compact space is uniformly continuous (see Heine-Cantor theorem).

Version: 10 Owner: n3o Author(s): n3o

Chapter 369

26A16 – Lipschitz (Hölder) classes

369.1 Lipschitz condition

A mapping f : X → Y between metric spaces is said to satisfy the Lipschitz condition if
there exists a real constant α > 0 such that

\[ d_Y(f(p), f(q)) \le \alpha\, d_X(p, q), \quad \text{for all } p, q \in X. \]

Proposition 17. A Lipschitz mapping f : X → Y is uniformly continuous.

Proof. Let f be a Lipschitz mapping and α > 0 a corresponding Lipschitz constant. For
every given ε > 0, choose δ > 0 such that
\[ \delta\alpha < \varepsilon. \]
Let p, q ∈ X such that
\[ d_X(p, q) < \delta \]
be given. By assumption,
\[ d_Y(f(p), f(q)) \le \alpha\, d_X(p, q) \le \alpha\delta < \varepsilon, \]
as desired. QED

Notes. More generally, one says that a mapping satisfies a Lipschitz condition of order β > 0
if there exists a real constant α > 0 such that
\[ d_Y(f(p), f(q)) \le \alpha\, d_X(p, q)^{\beta}, \quad \text{for all } p, q \in X. \]

Version: 17 Owner: rmilson Author(s): rmilson, slider142

369.2 Lipschitz condition and differentiability

If X and Y are Banach spaces, e.g. $\mathbb{R}^n$, one can inquire about the relation between
differentiability and the Lipschitz condition. The latter is the weaker condition. If f is Lipschitz,
the ratio
\[ \frac{\|f(q) - f(p)\|}{\|q - p\|}, \quad p, q \in X, \; p \neq q, \]
is bounded but is not assumed to converge to a limit. Indeed, differentiability is the stronger
condition.

Proposition 18. Let f : X → Y be a continuously differentiable mapping between Banach
spaces. If K ⊂ X is a compact subset, then the restriction f : K → Y satisfies the Lipschitz
condition.

Proof. Let Lin(X, Y ) denote the Banach space of bounded linear maps from X to Y . Recall
that the norm $\|T\|$ of a linear mapping T ∈ Lin(X, Y ) is defined by
\[ \|T\| = \sup\left\{ \frac{\|Tu\|}{\|u\|} : u \neq 0 \right\}. \]

Let Df : X → Lin(X, Y ) denote the derivative of f . By assumption Df is continuous, so
in particular $\|Df\|$ : X → R is a continuous function. Since K ⊂ X is compact, there
exists a finite upper bound $B_1 > 0$ for $\|Df\|$ restricted to K. In particular, this means that
\[ \|Df(p)u\| \le \|Df(p)\|\,\|u\| \le B_1 \|u\|, \]
for all p ∈ K, u ∈ X.

Next, consider the secant mapping s : X × X → R defined by
\[
s(p, q) =
\begin{cases}
\dfrac{\|f(q) - f(p) - Df(p)(q - p)\|}{\|q - p\|} & q \neq p, \\
0 & p = q.
\end{cases}
\]

This mapping is continuous, because f is assumed to be continuously differentiable. Hence,
there is a finite upper bound $B_2 > 0$ for s restricted to the compact K × K. It follows that
for all p, q ∈ K we have
\[
\begin{aligned}
\|f(q) - f(p)\| &\le \|f(q) - f(p) - Df(p)(q - p)\| + \|Df(p)(q - p)\| \\
&\le B_2 \|q - p\| + B_1 \|q - p\| \\
&= (B_1 + B_2)\|q - p\|.
\end{aligned}
\]

Therefore $B_1 + B_2$ is the desired Lipschitz constant. QED

Version: 22 Owner: rmilson Author(s): rmilson, slider142

369.3 Lipschitz condition and differentiability result

About Lipschitz continuity of differentiable functions the following holds.

Theorem 6. Let X, Y be Banach spaces and let A be a convex (see convex set), open subset
of X. Let f : A → Y be a function which is continuous in A and differentiable in A. Then
f is Lipschitz continuous on A if and only if the derivative Df is bounded on A, i.e.
\[ \sup_{x \in A} \|Df(x)\| < +\infty. \]

Suppose that f is Lipschitz continuous:
\[ \|f(x) - f(y)\| \le L\|x - y\|. \]
Then given any x ∈ A and any v ∈ X, for all small non-zero h ∈ R we have
\[ \left\| \frac{f(x + hv) - f(x)}{h} \right\| \le L\|v\|. \]
Hence, passing to the limit h → 0, we get $\|Df(x)v\| \le L\|v\|$ for every v ∈ X, that is, $\|Df(x)\| \le L$.

On the other hand suppose that Df is bounded on A:
\[ \|Df(x)\| \le L, \quad \forall x \in A. \]
Given any two points x, y ∈ A and given any $\alpha \in Y^*$ consider the function G : [0, 1] → R
\[ G(t) = \langle \alpha, f((1 - t)x + ty) \rangle. \]
For t ∈ (0, 1) it holds
\[ G'(t) = \langle \alpha, Df((1 - t)x + ty)[y - x] \rangle \]
and hence
\[ |G'(t)| \le L \|\alpha\|\, \|y - x\|. \]
Applying the Lagrange mean-value theorem to G we know that there exists ξ ∈ (0, 1) such that
\[ |\langle \alpha, f(y) - f(x) \rangle| = |G(1) - G(0)| = |G'(\xi)| \le \|\alpha\| L \|y - x\| \]
and since this is true for all $\alpha \in Y^*$ we get
\[ \|f(y) - f(x)\| \le L\|y - x\| \]
which is the desired claim.

Version: 1 Owner: paolini Author(s): paolini

Chapter 370

26A18 – Iteration

370.1 iteration

Let f : X → X be a function, X being any set. The n-th iteration of a function is the
function which is obtained if f is applied n times, and is denoted by $f^n$. More formally we
define:
\[ f^0(x) = x \]
and
\[ f^{n+1}(x) = f(f^n(x)) \]
for nonnegative integers n. If f is invertible, then by going backwards we can define the
iterate also for negative n.
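
The recursive definition translates directly into code. The following Python sketch is one possible rendering; the name `iterate` is our own, and negative n is rejected because the sketch does not assume that an inverse of f is available.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def iterate(f: Callable[[T], T], n: int) -> Callable[[T], T]:
    """Return the n-th iterate f^n of f for a nonnegative integer n."""
    if n < 0:
        raise ValueError("negative n would require an inverse of f")

    def f_n(x: T) -> T:
        # Apply f exactly n times; for n = 0 this is the identity map.
        for _ in range(n):
            x = f(x)
        return x

    return f_n


if __name__ == "__main__":
    double = lambda x: 2 * x
    print(iterate(double, 0)(5))  # f^0 is the identity
    print(iterate(double, 3)(1))  # three doublings of 1
```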

Version: 6 Owner: mathwizard Author(s): mathwizard

370.2 periodic point

Let f : X → X be a function and $f^n$ its n-th iteration. A point x is called a periodic point
of period n of f if it is a fixed point of $f^n$. The least n for which x is a fixed point of $f^n$ is
called the prime period or least period.

If f is a function mapping R to R or C to C then a periodic point x of prime period n is
called hyperbolic if $|(f^n)'(x)| \neq 1$, attractive if $|(f^n)'(x)| < 1$ and repelling if $|(f^n)'(x)| > 1$.
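
A small worked example may help. The Python sketch below finds the prime period of a point by direct iteration and classifies it with the chain rule, $(f^n)'(x) = \prod_{k=0}^{n-1} f'(f^k(x))$. The function names, the tolerance `tol` and the search bound `max_n` are assumptions of this illustration, and the comparison with 1 is done in floating point, so borderline cases are not reliable.

```python
def prime_period(f, x, max_n=50, tol=1e-12):
    """Least n in 1..max_n with |f^n(x) - x| <= tol, or None if none is found."""
    y = x
    for n in range(1, max_n + 1):
        y = f(y)
        if abs(y - x) <= tol:
            return n
    return None


def multiplier(f, df, x, n):
    """(f^n)'(x) computed by the chain rule along the orbit of x."""
    prod = 1.0
    for _ in range(n):
        prod *= df(x)
        x = f(x)
    return prod


def classify(f, df, x, n):
    """Classify a periodic point x of prime period n."""
    m = abs(multiplier(f, df, x, n))
    if m < 1:
        return "attractive"
    if m > 1:
        return "repelling"
    return "not hyperbolic"


if __name__ == "__main__":
    f = lambda x: x * x - 1
    df = lambda x: 2 * x
    # 0 -> -1 -> 0 is a cycle of prime period 2 with multiplier 0.
    n = prime_period(f, 0.0)
    print(n, classify(f, df, 0.0, n))
    # The fixed point (1 + sqrt(5)) / 2 has multiplier 1 + sqrt(5) > 1.
    phi = (1 + 5 ** 0.5) / 2
    print(prime_period(f, phi), classify(f, df, phi, 1))
```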

Version: 11 Owner: mathwizard Author(s): mathwizard

Chapter 371

26A24 – Differentiation (functions of
one variable): general theory,
generalized derivatives, mean-value
theorems

371.1 Leibniz notation

Leibniz notation centers around the concept of a differential element. The differential
element of x is represented by dx. You might think of dx as being an infinitesimal change
in x. It is important to note that d is an operator, not a variable. So, when you see $\frac{dy}{dx}$, you
can’t automatically write $\frac{y}{x}$ as a replacement.

We use $\frac{df(x)}{dx}$ or $\frac{d}{dx} f(x)$ to represent the derivative of a function f (x) with respect to x.
\[ \frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \]
We are dividing two numbers infinitely close to 0, and arriving at a finite answer. Δ is
another operator; Δx can be thought of as just a change in x. When we take the limit of Δx
as Δx approaches 0, we get an infinitesimal change dx.

Leibniz notation shows a wonderful use in the following example:
\[ \frac{dy}{dx} = \frac{dy}{dx}\,\frac{du}{du} = \frac{dy}{du}\,\frac{du}{dx} \]
The two dus can be cancelled out to arrive at the original derivative. This is the Leibniz
notation for the chain rule.

Leibniz notation shows up in the most common way of representing an integral,
\[ F(x) = \int f(x)\,dx \]
The dx is in fact a differential element. Let’s start with a derivative that we know (since
F (x) is an antiderivative of f (x)).
\[
\begin{aligned}
\frac{dF(x)}{dx} &= f(x) \\
dF(x) &= f(x)\,dx \\
\int dF(x) &= \int f(x)\,dx \\
F(x) &= \int f(x)\,dx
\end{aligned}
\]
We can think of dF (x) as the differential element of area. Since dF (x) = f (x)dx, the element
of area is a rectangle, with f (x) × dx as its dimensions. Integration is the sum of all these
infinitely thin elements of area along a certain interval. The result: a finite number.

(a diagram is deserved here)

One clear advantage of this notation is seen when finding the length s of a curve. The
formula is often seen as the following:
\[ s = \int ds \]
The length is the sum of all the elements, ds, of length. If we have a function f (x), the
length element is usually written as $ds = \sqrt{1 + \left[\frac{df(x)}{dx}\right]^2}\,dx$. If we modify this a bit, we get
$ds = \sqrt{[dx]^2 + [df(x)]^2}$. Graphically, we could say that the length element is the hypotenuse
of a right triangle with one leg being the x element, and the other leg being the f (x) element.

(another diagram would be nice!)

There are a few caveats, such as if you want to take the value of a derivative. Compare to
the prime notation.
\[ f'(a) = \left.\frac{df(x)}{dx}\right|_{x=a} \]

A second derivative is represented as follows:
\[ \frac{d}{dx}\frac{dy}{dx} = \frac{d^2 y}{dx^2} \]
The other derivatives follow as can be expected: $\frac{d^3 y}{dx^3}$, etc. You might think this is a little
sneaky, but it is the notation. Properly using these terms can be interesting. For example,
what is $\int \frac{d^2 y}{dx}$? We could turn it into $\int \frac{d^2 y}{dx^2}\,dx$ or $\int d\,\frac{dy}{dx}$. Either way, we get $\frac{dy}{dx}$.

Version: 2 Owner: xriso Author(s): xriso

371.2 derivative

Qualitatively the derivative is a measure of the change of a function in a small region around
a specified point.

Motivation

The idea behind the derivative comes from the straight line. What characterizes a straight
line is the fact that it has constant “slope”.

Figure 371.1: The straight line y = mx + b

In other words for a line given by the equation y = mx + b, as in Fig. 371.1, the ratio of ∆y
over ∆x is always constant and has the value $\frac{\Delta y}{\Delta x} = m$.

Figure 371.2: The parabola $y = x^2$ and its tangent at $(x_0, y_0)$

For other curves we cannot define a “slope”, like for the straight line, since such a quantity
would not be constant. However, for sufficiently smooth curves, each point on a curve has a
tangent line. For example consider the curve $y = x^2$, as in Fig. 371.2. At the point $(x_0, y_0)$
on the curve, we can draw a tangent of slope m given by the equation $y - y_0 = m(x - x_0)$.

Suppose we have a curve of the form y = f (x), and at the point (x0 , f (x0 )) we have a tangent
given by y − y0 = m(x − x0 ). Note that for values of x sufficiently close to x0 we can make
the approximation f (x) ≈ m(x − x0 ) + y0. So the slope m of the tangent describes how much
f (x) changes in the vicinity of x0 . It is the slope of the tangent that will be associated with
the derivative of the function f (x).

Formal definition

More formally for any real function f : R → R, we define the derivative of f at the point x
as the following limit (if it exists)

\[ f'(x) := \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \]
This definition turns out to be consistent with the motivation introduced above.

The derivatives for some elementary functions are (cf. Derivative notation)

1. $\frac{d}{dx} c = 0$, where c is constant;
2. $\frac{d}{dx} x^n = n x^{n-1}$;
3. $\frac{d}{dx} \sin x = \cos x$;
4. $\frac{d}{dx} \cos x = -\sin x$;
5. $\frac{d}{dx} e^x = e^x$;
6. $\frac{d}{dx} \ln x = \frac{1}{x}$.

Derivatives of more complicated expressions can be calculated algorithmically using
the following rules:

Linearity $\frac{d}{dx}\,(af(x) + bg(x)) = af'(x) + bg'(x)$;
Product rule $\frac{d}{dx}\,(f(x)g(x)) = f'(x)g(x) + f(x)g'(x)$;
Chain rule $\frac{d}{dx}\, g(f(x)) = g'(f(x))\,f'(x)$;
Quotient rule $\frac{d}{dx}\, \frac{f(x)}{g(x)} = \frac{f'(x)g(x) - f(x)g'(x)}{g(x)^2}$.

Note that the quotient rule, although given as much importance as the other rules in
elementary calculus, can be derived by successively applying the product rule and the chain rule
to $\frac{f(x)}{g(x)} = f(x)\,\frac{1}{g(x)}$. Also the quotient rule does not generalize as well as the other ones.

Since the derivative $f'(x)$ of f (x) is also a function of x, higher derivatives can be obtained by
applying the same procedure to $f'(x)$ and so on.
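
The rules can be sanity-checked against the limit definition. The Python sketch below compares the quotient rule with a numerical difference quotient. Note the substitution: it uses the symmetric quotient (f(x + h) − f(x − h))/(2h) instead of the one-sided quotient of the definition, because it is more accurate in floating point. The names `derivative` and `quotient_rule` and the step `h` are choices made for this example.

```python
import math


def derivative(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def quotient_rule(f, df, g, dg, x):
    """(f/g)'(x) = (f'(x) g(x) - f(x) g'(x)) / g(x)**2, assuming g(x) != 0."""
    return (df(x) * g(x) - f(x) * dg(x)) / g(x) ** 2


if __name__ == "__main__":
    x = 0.5
    # f = sin, g = exp, so (f/g)'(x) = (cos x - sin x) / e^x.
    exact = quotient_rule(math.sin, math.cos, math.exp, math.exp, x)
    approx = derivative(lambda t: math.sin(t) / math.exp(t), x)
    print(exact, approx, abs(exact - approx))
```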

Generalization

Banach Spaces

Unfortunately the notion of the “slope of the tangent” does not directly generalize to more
abstract situations. What we can do is keep in mind the facts that the tangent is a linear
function and that it approximates the function near the point of tangency, as well as the
formal definition above.

Very general conditions under which we can define a derivative in a manner much similar to
the above are as follows. Let f : V → W, where V and W are Banach spaces. Suppose that
h ∈ V and h ≠ 0; then we define the directional derivative $(D_h f)(x)$ at x as the following
limit
\[ (D_h f)(x) := \lim_{\epsilon \to 0} \frac{f(x + \epsilon h) - f(x)}{\epsilon}, \]
where ε is a scalar. Note that $f(x + \epsilon h) \approx f(x) + \epsilon\,(D_h f)(x)$, which is consistent with our
original motivation. This directional derivative is also called the Gâteaux derivative.

Finally we define the derivative at x as the bounded linear map (Df )(x) : V → W such that
for any non-zero h ∈ V
\[ \lim_{\|h\| \to 0} \frac{(f(x + h) - f(x)) - (Df)(x) \cdot h}{\|h\|} = 0. \]

Once again we have f (x + h) ≈ f (x) + (Df )(x) · h. In fact, if the derivative (Df )(x)
exists, the directional derivatives can be obtained as $(D_h f)(x) = (Df)(x) \cdot h$.¹ However, the
existence of $(D_h f)(x)$ for each non-zero h ∈ V does not guarantee the existence of (Df )(x).
This derivative is also called the Fréchet derivative. In the more familiar case
$f : \mathbb{R}^n \to \mathbb{R}^m$, the derivative Df is simply the Jacobian of f .

Under these general conditions the following properties of the derivative remain valid:

1. Dh = 0, where h is a constant;

2. D(A · x) = A, where A is linear.

Linearity D(af (x) + bg(x)) · h = a(Df )(x) · h + b(Dg)(x) · h;

“Product” rule D(B(f (x), g(x))) · h = B((Df )(x) · h, g(x)) + B(f (x), (Dg)(x) · h), where
B is bilinear;

Chain rule D(g(f (x))) · h = (Dg)(f (x)) · ((Df )(x) · h).

Note that the derivative of f can be seen as a function Df : V → L(V, W) given by Df : x 7→
(Df )(x), where L(V, W) is the space of bounded linear maps from V to W. Since L(V, W)
can be considered a Banach space itself with the norm taken as the operator norm, higher
derivatives can be obtained by applying the same procedure to Df and so on.

Manifolds

A manifold is a topological space that is locally homeomorphic to a Banach space V (for
finite dimensional manifolds V = $\mathbb{R}^n$) and is endowed with enough structure to define
derivatives. Since the notion of a manifold was constructed specifically to generalize the notion of
a derivative, this seems like the end of the road for this entry. The following discussion is
rather technical; a more intuitive explanation of the same concept can be found in the entry
on related rates.

Consider manifolds $V$ and $W$ modeled on Banach spaces $\mathbf{V}$ and $\mathbf{W}$, respectively (bold letters
denote the model spaces and the chart maps). Say we have y = f (x) for some $x \in V$ and $y \in W$;
then, by definition of a manifold, we can find charts $(X, \mathbf{x})$ and $(Y, \mathbf{y})$, where X and Y are
neighborhoods of x and y, respectively. These charts provide us with canonical isomorphisms
between the Banach spaces $\mathbf{V}$ and $\mathbf{W}$, and the respective tangent spaces $T_x V$ and $T_y W$:
\[ d\mathbf{x}_x : T_x V \to \mathbf{V}, \qquad d\mathbf{y}_y : T_y W \to \mathbf{W}. \]

Now consider a map $f : V \to W$ between the manifolds. By composing it with the chart
maps we construct the map
\[ g_{(X,\mathbf{x})}^{(Y,\mathbf{y})} = \mathbf{y} \circ f \circ \mathbf{x}^{-1} : \mathbf{V} \to \mathbf{W}, \]
defined on an appropriately restricted domain. Since we now have a map between Banach
spaces, we can define its derivative at $\mathbf{x}(x)$ in the sense defined above, namely $Dg_{(X,\mathbf{x})}^{(Y,\mathbf{y})}(\mathbf{x}(x))$.
If this derivative exists for every choice of admissible charts $(X, \mathbf{x})$ and $(Y, \mathbf{y})$, we can say
that the derivative Df (x) of f at x is defined and given by
\[ Df(x) = d\mathbf{y}_y^{-1} \circ Dg_{(X,\mathbf{x})}^{(Y,\mathbf{y})}(\mathbf{x}(x)) \circ d\mathbf{x}_x \]
(it can be shown that this is well defined and independent of the choice of charts).

¹ The notation A · h is used when h is a vector and A a linear operator. This notation can be considered
advantageous to the usual notation A(h), since the latter is rather bulky and the former incorporates the
intuitive distributive properties of linear operators also associated with usual multiplication.

Note that the derivative is now a map between the tangent spaces of the two manifolds
$Df(x) : T_x V \to T_y W$. Because of this a common notation for the derivative of f at x is
$T_x f$. Another alternative notation for the derivative is $f_{*,x}$ because of its connection to the
category-theoretical pushforward.

Version: 15 Owner: igor Author(s): igor

371.3 l’Hôpital’s rule

L’Hôpital’s rule states that given an unresolvable limit of the form $\frac{0}{0}$ or $\frac{\infty}{\infty}$, the ratio of
functions $\frac{f(x)}{g(x)}$ will have the same limit at c as the ratio $\frac{f'(x)}{g'(x)}$. In short, if the limit of a ratio
of functions approaches an indeterminate form, then
\[ \lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)} \]
provided this last limit exists. L’Hôpital’s rule may be applied repeatedly as long as the
conditions still hold. However it is important to note that the nonexistence of $\lim \frac{f'(x)}{g'(x)}$ does
not prove the nonexistence of $\lim \frac{f(x)}{g(x)}$.

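Example: We try to determine the value of
\[ \lim_{x \to \infty} \frac{x^2}{e^x}. \]

As x approaches ∞ the expression becomes an indeterminate form $\frac{\infty}{\infty}$. By applying L’Hôpital’s
rule twice we get
\[ \lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{2x}{e^x} = \lim_{x \to \infty} \frac{2}{e^x} = 0. \]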
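
The behaviour in this example can be observed numerically. The short Python sketch below tabulates the three ratios that appear in the computation; the function name `lhopital_chain` and the sample points are arbitrary choices for illustration, and a table of values is of course no substitute for the proof.

```python
import math
from typing import Tuple


def lhopital_chain(x: float) -> Tuple[float, float, float]:
    """Return x^2 / e^x, 2x / e^x and 2 / e^x, the three ratios in the example."""
    e = math.exp(x)
    return x * x / e, 2 * x / e, 2 / e


if __name__ == "__main__":
    for x in (1.0, 10.0, 20.0, 50.0):
        # All three ratios shrink towards 0 as x grows.
        print(x, lhopital_chain(x))
```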

Version: 8 Owner: mathwizard Author(s): mathwizard, slider142

371.4 proof of De l’Hôpital’s rule

Let $x_0 \in \mathbb{R}$, I be an interval containing $x_0$ and let f and g be two differentiable functions
defined on $I \setminus \{x_0\}$ with $g'(x) \neq 0$ for all $x \in I \setminus \{x_0\}$. Suppose that
\[ \lim_{x \to x_0} f(x) = 0, \qquad \lim_{x \to x_0} g(x) = 0 \]
and that
\[ \lim_{x \to x_0} \frac{f'(x)}{g'(x)} = m. \]

We want to prove that $g(x) \neq 0$ for all $x \in I \setminus \{x_0\}$ and that
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = m. \]

First of all (with a little abuse of notation) we suppose that f and g are defined also in the
point $x_0$ by $f(x_0) = 0$ and $g(x_0) = 0$. The resulting functions are continuous in $x_0$ and hence
in the whole interval I.

Let us first prove that $g(x) \neq 0$ for all $x \in I \setminus \{x_0\}$. Suppose, by contradiction, that $g(\bar{x}) = 0$
for some $\bar{x} \in I \setminus \{x_0\}$. Since we also have $g(x_0) = 0$, by Rolle’s theorem we get that $g'(\xi) = 0$
for some ξ strictly between $x_0$ and $\bar{x}$, which is against our hypotheses.

Consider now any sequence $x_n \to x_0$ with $x_n \in I \setminus \{x_0\}$. By Cauchy’s mean value theorem
there exists a sequence $x'_n$ such that
\[ \frac{f(x_n)}{g(x_n)} = \frac{f(x_n) - f(x_0)}{g(x_n) - g(x_0)} = \frac{f'(x'_n)}{g'(x'_n)}. \]
But as $x_n \to x_0$ and since $x'_n$ lies strictly between $x_0$ and $x_n$ we get that $x'_n \to x_0$ and hence
\[ \lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(x'_n)}{g'(x'_n)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)} = m. \]

Since this is true for any given sequence $x_n \to x_0$ we conclude that
\[ \lim_{x \to x_0} \frac{f(x)}{g(x)} = m. \]

Version: 5 Owner: paolini Author(s): paolini

371.5 related rates

The notion of a derivative has numerous interpretations and applications. A well-known
geometric interpretation is that of a slope, or more generally that of a linear approximation
to a mapping between linear spaces (see here). Another useful interpretation comes from
physics and is based on the idea of related rates. This second point of view is quite general,
and sheds light on the definition of the derivative of a manifold mapping (the latter is
described in the pushforward entry).

Consider two physical quantities x and y that are somehow coupled. For example:

• the quantities x and y could be the coordinates of a point as it moves along the
unit circle;

• the quantity x could be the radius of a sphere and y the sphere’s surface area;

• the quantity x could be the horizontal position of a point on a given curve and y the
distance traversed by that point as it moves from some fixed starting position;

• the quantity x could be depth of water in a conical tank and y the rate at which the
water flows out the bottom.

Regardless of the application, the situation is such that a change in the value of one quantity
is accompanied by a change in the value of the other quantity. So let’s imagine that we
take control of one of the quantities, say x, and change it in any way we like. As we do so,
quantity y follows suit and changes along with x. Now the analytical relation between the
values of x and y could be quite complicated and non-linear, but the relation between the
instantaneous rates of change of x and y is linear.

It does not matter how we vary the two quantities, the ratio of the rates of change depends
only on the values of x and y. This ratio is, of course, the derivative of the function that
maps the values of x to the values of y. Letting ẋ, ẏ denote the rates of change of the two
quantities, we describe this conception of the derivative as
\[ \frac{dy}{dx} = \frac{\dot{y}}{\dot{x}}, \]
or equivalently as
\[ \dot{y} = \frac{dy}{dx}\,\dot{x}. \tag{371.5.1} \]

Next, let us generalize the discussion and suppose that the two quantities x and y represent
physical states with multiple degrees of freedom. For example, x could be a point on the
earth’s surface, and y the position of a point 1 kilometer to the north of x. Again, the
dependence of y and x is, in general, non-linear, but the rate of change of y does have a
linear dependence on the rate of change of x. We would like to say that the derivative is
precisely this linear relation, but we must first contend with the following complication. The
rates of change are no longer scalars, but rather velocity vectors, and therefore the derivative
must be regarded as a linear transformation that changes one vector into another.

In order to formalize this generalized notion of the derivative we must consider x and y
to be points on manifolds X and Y , and the relation between them a manifold mapping
φ : X → Y . A varying x is formally described by a trajectory

γ : I → X, I ⊂ R.

The corresponding velocities take their value in the tangent spaces of X:
\[ \gamma'(t) \in T_{\gamma(t)} X. \]

The “coupling” of the two quantities is described by the composition
\[ \phi \circ \gamma : I \to Y. \]

The derivative of φ at any given x ∈ X is a linear mapping
\[ \phi_*(x) : T_x X \to T_{\phi(x)} Y, \]
called the pushforward of φ at x, with the property that for every trajectory γ passing
through x at time t, we have
\[ (\phi \circ \gamma)'(t) = \phi_*(x)\,\gamma'(t). \]
The above is the multi-dimensional and coordinate-free generalization of the related rates
relation (371.5.1).

All of the above has a perfectly rigorous presentation in terms of manifold theory. The
approach of the present entry is more informal; our ambition was merely to motivate the
notion of a derivative by describing it as a linear transformation between velocity vectors.

Version: 2 Owner: rmilson Author(s): rmilson

Chapter 372

26A27 – Nondifferentiability
(nondifferentiable functions, points of
nondifferentiability), discontinuous
derivatives

372.1 Weierstrass function

The Weierstrass function is a continuous function that is nowhere differentiable, and hence
is not an analytic function. The formula for the Weierstrass function is
\[ f(x) = \sum_{n=1}^{\infty} b^n \cos(a^n \pi x) \]
with a odd, 0 < b < 1, and $ab > 1 + \frac{3}{2}\pi$.
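
Partial sums of the series are easy to evaluate. The Python sketch below is an illustration under stated assumptions: the parameter values a = 13 and b = 0.5 are our own choice (they satisfy ab = 6.5 > 1 + 3π/2 ≈ 5.71), and the function name and the number of terms are arbitrary. Because $a^n$ grows quickly, the cosine arguments lose floating-point accuracy for large n, so only a modest number of terms is meaningful.

```python
import math


def weierstrass_partial(x: float, a: int = 13, b: float = 0.5, terms: int = 10) -> float:
    """Partial sum of sum_{n=1}^{terms} b**n * cos(a**n * pi * x)."""
    if a % 2 == 0 or not 0 < b < 1 or a * b <= 1 + 1.5 * math.pi:
        raise ValueError("need a odd, 0 < b < 1 and a*b > 1 + 3*pi/2")
    return sum(b ** n * math.cos(a ** n * math.pi * x) for n in range(1, terms + 1))


if __name__ == "__main__":
    # At x = 0 every cosine equals 1, so the partial sum is a geometric sum.
    print(weierstrass_partial(0.0))
    # Sample values on a coarse grid; each is bounded by b / (1 - b) = 1 in absolute value.
    print([round(weierstrass_partial(k / 8), 4) for k in range(9)])
```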

Another example of an everywhere continuous but nowhere differentiable curve is the fractal
Koch curve.

[insert plot of Weierstrass function]

Version: 5 Owner: akrowne Author(s): akrowne

Chapter 373

26A36 – Antidifferentiation

373.1 antiderivative

The function F (x) is called an antiderivative of a function f (x) if (and only if) the
derivative of F is equal to f :
\[ F'(x) = f(x). \]
Note that a function f (x) that has an antiderivative has infinitely many of them, since any
constant can be added to or subtracted from any valid antiderivative to yield another equally
valid antiderivative. To account for this, we express the general antiderivative, or
indefinite integral, as follows:
\[ \int f(x)\,dx = F(x) + C \]
where C is an arbitrary constant called the constant of integration. The dx portion means
“with respect to x”, because after all, our functions F and f are functions of x.

Version: 4 Owner: xriso Author(s): xriso

373.2 integration by parts

When one has an integral of a product of two functions, it is sometimes preferable to simplify
the integrand by integrating one of the functions and differentiating the other. This process
is called integrating by parts, and is defined in the following way, where u and v are functions
of x.
\[ \int u \cdot v'\,dx = u \cdot v - \int v \cdot u'\,dx \]
This process may be repeated indefinitely, and in some cases it may be used to solve for the
original integral algebraically. For definite integrals, the rule appears as
\[ \int_a^b u(x) \cdot v'(x)\,dx = \bigl(u(b) \cdot v(b) - u(a) \cdot v(a)\bigr) - \int_a^b v(x) \cdot u'(x)\,dx \]

Proof: Integration by parts is simply the antiderivative of a product rule. Let G(x) =
u(x) · v(x). Then,
\[ G'(x) = u'(x)v(x) + u(x)v'(x). \]
Therefore,
\[ G'(x) - v(x)u'(x) = u(x)v'(x). \]
We can now integrate both sides with respect to x to get
\[ G(x) - \int v(x)u'(x)\,dx = \int u(x)v'(x)\,dx \]
which is just integration by parts rearranged.

Example: We integrate the function f (x) = x sin x. We define u(x) := x and
$v'(x) := \sin x$, so that $v(x) = -\cos x$. Integration by parts then yields:
\[ \int x \sin x\,dx = -x \cos x + \int \cos x\,dx = -x \cos x + \sin x + C. \]
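
The result of the example can be checked against a numerical integral over [0, π], where the definite form of the rule gives $[-x\cos x + \sin x]_0^{\pi} = \pi$. The Python sketch below uses a plain midpoint rule; the helper names `midpoint_integral` and `antiderivative` and the number of subintervals are assumptions of this illustration.

```python
import math


def midpoint_integral(f, a: float, b: float, n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def antiderivative(x: float) -> float:
    """Antiderivative of x sin x obtained by parts: -x cos x + sin x."""
    return -x * math.cos(x) + math.sin(x)


if __name__ == "__main__":
    numeric = midpoint_integral(lambda x: x * math.sin(x), 0.0, math.pi)
    by_parts = antiderivative(math.pi) - antiderivative(0.0)
    # Both values should be close to pi.
    print(numeric, by_parts)
```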

Version: 5 Owner: mathwizard Author(s): mathwizard, slider142

373.3 integrations by parts for the Lebesgue integral

Theorem [1, 2] Suppose f, g are complex valued functions on a bounded interval [a, b]. If f
and g are absolutely continuous, then

\[ \int_{[a,b]} f'g = -\int_{[a,b]} f g' + f(b)g(b) - f(a)g(a), \]

where both integrals are Lebesgue integrals.

Remark Any absolutely continuous function can be differentiated almost everywhere. Thus,
in the above, the functions f′ and g′ make sense.

Proof. Since f, g and fg are absolutely continuous, they are almost everywhere differentiable
with Lebesgue integrable derivatives. Hence

\[ (fg)' = f'g + fg' \]

almost everywhere, and

\[ \int_{[a,b]} (fg)' = \int_{[a,b]} \bigl(f'g + fg'\bigr) = \int_{[a,b]} f'g + \int_{[a,b]} fg'. \]

The last equality is justified since f′g and fg′ are integrable. For instance, we have

\[ \int_{[a,b]} |f'g| \le \max_{x\in[a,b]} |g(x)| \int_{[a,b]} |f'|, \]

which is finite since g is continuous and f′ is Lebesgue integrable. Now the claim follows
from the fundamental theorem of calculus for the Lebesgue integral, which gives
$\int_{[a,b]} (fg)' = f(b)g(b) - f(a)g(a)$. □

REFERENCES
1. Jones, F., Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993.
2. Ng, Tze Beng, Integration by Parts, online.

Version: 4 Owner: matte Author(s): matte

Chapter 374

26A42 – Integrals of Riemann,
Stieltjes and Lebesgue type

374.1 Riemann sum

Suppose f : I → R is a function, where I = [a, b] is a closed interval and f is bounded
on I. If we have a finite set of points $\{x_0, x_1, x_2, \dots, x_n\}$ such that $a = x_0 < x_1 < x_2 < \cdots < x_n = b$,
then this set creates a partition $P = \{[x_0, x_1), [x_1, x_2), \dots, [x_{n-1}, x_n]\}$ of I. If P is a
partition of I with n ∈ N elements, then the Riemann sum of f over I with the partition P
is defined as

\[ S = \sum_{i=1}^{n} f(y_i)(x_i - x_{i-1}), \]

where $x_{i-1} \le y_i \le x_i$. The choice of $y_i$ is arbitrary. If $y_i = x_{i-1}$ for all i, then S is called a
left Riemann sum. If $y_i = x_i$ for all i, then S is called a right Riemann sum. Suppose we have

\[ S = \sum_{i=1}^{n} b_i (x_i - x_{i-1}), \]

where $b_i$ is the supremum of f over $[x_{i-1}, x_i]$; then S is defined to be an upper Riemann sum.
Similarly, if $b_i$ is the infimum of f over $[x_{i-1}, x_i]$, then S is a lower Riemann sum.
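As an illustration of the definitions above, here is a minimal sketch that computes left and right Riemann sums. The function name `riemann_sum`, the test function f(x) = x² on [0, 1], and the uniform partition with 100 subintervals are assumptions chosen for the example.

```python
def riemann_sum(f, points, tag="left"):
    """Riemann sum of f over the partition given by the sorted list `points`.

    tag="left" uses y_i = x_{i-1}; tag="right" uses y_i = x_i.
    """
    total = 0.0
    for x_prev, x_next in zip(points, points[1:]):
        y = x_prev if tag == "left" else x_next
        total += f(y) * (x_next - x_prev)
    return total

n = 100
points = [i / n for i in range(n + 1)]  # uniform partition of [0, 1]
f = lambda x: x * x

left = riemann_sum(f, points, "left")
right = riemann_sum(f, points, "right")
print(left, right)
```

Because this particular f is increasing on [0, 1], the left sum coincides with the lower Riemann sum and the right sum with the upper Riemann sum, so the two values bracket the integral 1/3.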

Version: 3 Owner: mathcam Author(s): mathcam, vampyr

374.2 Riemann-Stieltjes integral

Let f and α be bounded, real-valued functions defined upon a closed finite interval I = [a, b]
of R (a ≠ b), $P = \{x_0, \dots, x_n\}$ a partition of I, and $t_i$ a point of the subinterval $[x_{i-1}, x_i]$. A
sum of the form

\[ S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i)\bigl(\alpha(x_i) - \alpha(x_{i-1})\bigr) \]

is called a Riemann-Stieltjes sum of f with respect to α. f is said to be Riemann integrable
with respect to α on I if there exists A ∈ R such that given any ε > 0 there exists a
partition $P_\varepsilon$ of I for which, for all P finer than $P_\varepsilon$ and for every choice of points $t_i$, we have

\[ |S(P, f, \alpha) - A| < \varepsilon. \]

If such an A exists, then it is unique and is known as the Riemann-Stieltjes integral of
f with respect to α. f is known as the integrand and α the integrator. The integral is
denoted by

\[ \int_a^b f\,d\alpha \quad\text{or}\quad \int_a^b f(x)\,d\alpha(x). \]
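A Riemann-Stieltjes sum is straightforward to compute directly from the definition. The sketch below is illustrative only: the name `riemann_stieltjes_sum`, the choice f(x) = x with integrator α(x) = x² on [0, 1], and the midpoint tags are all assumptions. For this smooth α the integral equals ∫ x · 2x dx = 2/3.

```python
def riemann_stieltjes_sum(f, alpha, points, tags):
    """S(P, f, alpha) = sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(points, points[1:], tags)
    )

n = 1000
points = [i / n for i in range(n + 1)]                        # partition of [0, 1]
tags = [(x0 + x1) / 2 for x0, x1 in zip(points, points[1:])]  # t_i = midpoint

S = riemann_stieltjes_sum(lambda x: x, lambda x: x * x, points, tags)
print(S)  # close to 2/3
```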

Version: 3 Owner: vypertd Author(s): vypertd

374.3 continuous functions are Riemann integrable

Let f : [a, b] → R be a continuous function. Then f is Riemann integrable.

Version: 2 Owner: paolini Author(s): paolini

374.4 generalized Riemann integral

A function f : [a, b] → R is said to be generalized Riemann integrable on [a, b] if there
exists a number L ∈ R such that for every ε > 0 there exists a gauge $\delta_\varepsilon$ on [a, b] such that if
Ṗ is any $\delta_\varepsilon$-fine partition of [a, b], then

\[ |S(f; \dot{P}) - L| < \varepsilon, \]

where S(f; Ṗ) is the Riemann sum for f using the partition Ṗ. The collection of all
generalized Riemann integrable functions is usually denoted by R∗[a, b].

If f ∈ R∗ [a, b] then the number L is uniquely determined, and is called the generalized
Riemann integral of f over [a, b].

Version: 3 Owner: vypertd Author(s): vypertd

374.5 proof of Continuous functions are Riemann integrable

Recall the definition of Riemann integral. To prove that f is integrable we have to prove
that $\lim_{\delta\to 0^+} \bigl(S^*(\delta) - S_*(\delta)\bigr) = 0$. Since $S^*(\delta)$ is decreasing and $S_*(\delta)$ is increasing it is enough
to show that given ε > 0 there exists δ > 0 such that $S^*(\delta) - S_*(\delta) < \varepsilon$.

So let ε > 0 be fixed.

By the Heine-Cantor theorem f is uniformly continuous, i.e.

\[ \exists\,\delta > 0 : \quad |x - y| < \delta \;\Rightarrow\; |f(x) - f(y)| < \frac{\varepsilon}{b-a}. \]

Let now P be any partition of [a, b] in C(δ), i.e. a partition $\{x_0 = a, x_1, \dots, x_N = b\}$ such
that $x_{i+1} - x_i < \delta$. In any small interval $[x_i, x_{i+1}]$ the function f (being continuous) has a
maximum $M_i$ and minimum $m_i$. Since f is uniformly continuous and $x_{i+1} - x_i < \delta$, we
hence have $M_i - m_i < \varepsilon/(b-a)$. So the difference between upper and lower Riemann sums
is
\[ \sum_i M_i (x_{i+1} - x_i) - \sum_i m_i (x_{i+1} - x_i) < \frac{\varepsilon}{b-a} \sum_i (x_{i+1} - x_i) = \varepsilon. \]

Since this is true for every partition P in C(δ), we conclude that $S^*(\delta) - S_*(\delta) < \varepsilon$.

Version: 1 Owner: paolini Author(s): paolini

Chapter 375

26A51 – Convexity, generalizations

375.1 concave function

Let f(x) be a continuous function defined on an interval [a, b]. Then we say that f is a concave
function on [a, b] if, for any $x_1, x_2$ in [a, b] and any λ ∈ [0, 1], we have

\[ f\bigl(\lambda x_1 + (1-\lambda)x_2\bigr) \ge \lambda f(x_1) + (1-\lambda) f(x_2). \]

The definition is equivalent to the statements:

• For all $x_1, x_2$ in [a, b],
\[ f\!\left(\frac{x_1 + x_2}{2}\right) \ge \frac{f(x_1) + f(x_2)}{2}. \]

• The second derivative of f is nonpositive on [a, b].

• The derivative of f is monotone decreasing.

Obviously, the last two items apply provided f has the required derivatives.

An example of a concave function is f(x) = −x² on the interval [−5, 5].
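The defining inequality can be spot-checked on a grid of sample points. This is only a sketch: the helper name `violates_concavity`, the grid sizes, and the tolerance are illustrative assumptions, and a finite sample can refute concavity but never prove it.

```python
def violates_concavity(f, a, b, samples=41, lambdas=11, tol=1e-9):
    """Return True if some sampled (x1, x2, lam) breaks
    f(lam*x1 + (1-lam)*x2) >= lam*f(x1) + (1-lam)*f(x2)."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    ls = [j / (lambdas - 1) for j in range(lambdas)]
    for x1 in xs:
        for x2 in xs:
            for lam in ls:
                lhs = f(lam * x1 + (1 - lam) * x2)
                rhs = lam * f(x1) + (1 - lam) * f(x2)
                if lhs < rhs - tol:
                    return True
    return False

concave_ok = not violates_concavity(lambda x: -x * x, -5.0, 5.0)  # f(x) = -x^2
convex_fails = violates_concavity(lambda x: x * x, -5.0, 5.0)     # f(x) = x^2
print(concave_ok, convex_fails)
```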

Version: 5 Owner: drini Author(s): drini

Chapter 376

26Axx – Functions of one variable

376.1 function centroid

Let f : D ⊂ R → R be an arbitrary function. By analogy with the geometric centroid, the
centroid of a function f is defined as:

\[ \langle x \rangle = \frac{\int x f(x)\,dx}{\int f(x)\,dx}, \]

where the integrals are taken over the domain D.

Chapter 377

26B05 – Continuity and
differentiation questions

377.1 C0∞ (U ) is not empty

Theorem If U is a non-empty open set in Rn , then the set of smooth functions with compact support
C0∞ (U) is not empty.

The proof is divided into three sub-claims:

Claim 1 Let a < b be real numbers. Then there exists a smooth non-negative function
f : R → R, whose support is the compact set [a, b].

To prove Claim 1, we need the following lemma:

Lemma ([4], pp. 14) If

\[ \phi(x) = \begin{cases} 0 & \text{for } x \le 0, \\ e^{-1/x} & \text{for } x > 0, \end{cases} \]
then φ : R → R is a non-negative smooth function.

(A proof of the Lemma can be found in [4].)

Proof of Claim 1. Using the lemma, let us define

f (x) = φ(x − a)φ(b − x).

Since φ is smooth, it follows that f is smooth. Also, from the definition of φ, we see that
φ(x − a) = 0 precisely when x ≤ a, and φ(b − x) = 0 precisely when x ≥ b. Thus the support
of f is indeed [a, b]. □
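The construction in Claim 1 can be evaluated directly. In this sketch the names `phi` and `bump` and the sample interval [0, 1] are illustrative assumptions; the code only evaluates the function and does not establish smoothness.

```python
import math

def phi(x):
    """phi(x) = 0 for x <= 0 and exp(-1/x) for x > 0 (the function of the Lemma)."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def bump(x, a, b):
    """f(x) = phi(x - a) * phi(b - x): positive on (a, b), zero elsewhere."""
    return phi(x - a) * phi(b - x)

a, b = 0.0, 1.0
values = {x: bump(x, a, b) for x in (-1.0, 0.0, 0.5, 1.0, 2.0)}
print(values)
```

At the midpoint x = 0.5 both factors equal e^{-2}, so the value there is e^{-4}; at and outside the endpoints the function vanishes.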

Claim 2 Let $a_i, b_i$ be real numbers with $a_i < b_i$ for all i = 1, . . . , n. Then there exists a
smooth non-negative function f : Rn → R whose support is the compact set $[a_1, b_1] \times \cdots \times [a_n, b_n]$.

Proof of Claim 2. Using Claim 1, we can for each i = 1, . . . , n construct a smooth non-negative
function $f_i$ : R → R whose support is $[a_i, b_i]$. Then
\[ f(x_1, \dots, x_n) = f_1(x_1) f_2(x_2) \cdots f_n(x_n) \]
gives a smooth function with the sought properties. □

Claim 3 If U is a non-empty open set in Rn , then there are real numbers ai < bi for
i = 1, . . . , n such that [a1 , b1 ] × · · · × [an , bn ] is a subset of U.

Proof of Claim 3. Here, of course, we assume that Rn is equipped with the usual topology
induced by the open balls of the Euclidean metric.

Since U is non-empty, there exists some point x in U. Since U is open and the topology has a
basis consisting of open balls, there exist a point y ∈ U and ε > 0 such that x is contained in
the open ball B(y, ε) and B(y, ε) ⊂ U. Let us now set
$a_i = y_i - \frac{\varepsilon}{2\sqrt{n}}$ and $b_i = y_i + \frac{\varepsilon}{2\sqrt{n}}$ for all i = 1, . . . , n. Then $D = [a_1, b_1] \times \cdots \times [a_n, b_n]$ can be
parametrized as
\[ D = \Bigl\{\, y + (\lambda_1, \dots, \lambda_n)\frac{\varepsilon}{2\sqrt{n}} \;\Big|\; \lambda_i \in [-1, 1] \text{ for all } i = 1, \dots, n \Bigr\}. \]
For an arbitrary point in D, we have
\[ \Bigl| y + (\lambda_1, \dots, \lambda_n)\frac{\varepsilon}{2\sqrt{n}} - y \Bigr| = |(\lambda_1, \dots, \lambda_n)|\,\frac{\varepsilon}{2\sqrt{n}} = \frac{\varepsilon}{2\sqrt{n}} \sqrt{\lambda_1^2 + \cdots + \lambda_n^2} \le \frac{\varepsilon}{2} < \varepsilon, \]
so D ⊂ B(y, ε) ⊂ U, and Claim 3 follows. □

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.

Version: 3 Owner: matte Author(s): matte

Let f : Rn → R be any Lipschitz continuous function. Then f is differentiable at almost
every x ∈ Rn.

Version: 1 Owner: paolini Author(s): paolini

377.3 smooth functions with compact support

Definition [3] Let U be an open set in Rn. Then the set of smooth functions with
compact support (in U) is the set of functions f : Rn → C which are smooth (i.e.,
$\partial^\alpha f$ : Rn → C is a continuous function for all multi-indices α) and for which supp f is compact and
contained in U. This function space is denoted by $C_0^\infty(U)$.

Remarks

1. A proof that $C_0^\infty(U)$ is not empty is given in the entry “$C_0^\infty(U)$ is not empty”.
2. With the usual point-wise addition and point-wise multiplication by a scalar, C0∞ (U)
is a vector space over the field C.
3. Suppose U and V are open subsets in Rn and U ⊂ V . Then C0∞ (U) is a vector subspace
of C0∞ (V ). In particular, C0∞ (U) ⊂ C0∞ (V ).

It is possible to equip $C_0^\infty(U)$ with a topology which makes $C_0^\infty(U)$ into a locally convex
topological vector space. The definition of this topology is rather involved (see e.g. [3]).
However, the next theorem shows when a sequence converges in this topology.

Theorem 1 Suppose that U is an open set in Rn, and that $\{\phi_i\}_{i=1}^{\infty}$ is a sequence of functions
in $C_0^\infty(U)$. Then $\{\phi_i\}$ converges (in the aforementioned topology) to a function $\phi \in C_0^\infty(U)$
if and only if the following conditions hold:

1. There is a compact set K ⊂ U such that supp $\phi_i$ ⊂ K for all i = 1, 2, . . ..
2. For every multi-index α,
\[ \partial^\alpha \phi_i \to \partial^\alpha \phi \]
in the sup-norm.

Theorem 2 Suppose that U is an open set in Rn, that Γ is a locally convex topological
vector space, and that $L : C_0^\infty(U) \to \Gamma$ is a linear map. Then L is a continuous map if and
only if the following condition holds:

If K is a compact subset of U, and $\{\phi_i\}_{i=1}^{\infty}$ is a sequence of functions in $C_0^\infty(U)$ such
that supp $\phi_i$ ⊂ K for all i, and $\phi_i \to \phi$ (in $C_0^\infty(U)$) for some $\phi \in C_0^\infty(U)$, then $L\phi_i \to L\phi$
(in Γ).

The above theorems are stated without proof in [1].

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.

Version: 3 Owner: matte Author(s): matte

Chapter 378

26B10 – Implicit function theorems,
Jacobians, transformations with
several variables

378.1 Jacobian matrix

The Jacobian $[J\vec f(x)]$ of a function $\vec f$ : Rn → Rm is the matrix of partial derivatives
\[ [J\vec f(x)] = \begin{pmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{pmatrix}. \]
A more concise way of writing it is
\[ [J\vec f(x)] = \bigl[\overrightarrow{D_1 f}, \cdots, \overrightarrow{D_n f}\bigr] = \begin{pmatrix} \vec\nabla f_1 \\ \vdots \\ \vec\nabla f_m \end{pmatrix}, \]
where $\overrightarrow{D_n f}$ is the partial derivative with respect to the nth variable and $\vec\nabla f_m$ is the gradient
of the mth component of f. The Jacobian matrix represents the full derivative matrix [Df(x)]
of f at x iff f is differentiable at x. In that case, [Jf(x)] = [Df(x)]
and the directional derivative in the direction $\vec v$ is $[Df(x)]\vec v$.
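A Jacobian matrix can be approximated column by column with difference quotients. The sketch below is an illustration under stated assumptions: the helper name `jacobian`, the central-difference scheme, the step size, and the sample map f(x, y) = (x²y, 5x + sin y) are not taken from the entry.

```python
import math

def jacobian(f, x, h=1e-6):
    """Approximate the m x n Jacobian of f at x by central differences.

    Column j holds the partial derivatives with respect to the j-th variable.
    """
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def f(p):
    x, y = p
    return [x * x * y, 5 * x + math.sin(y)]

J = jacobian(f, [1.0, 2.0])
# Hand-computed partials at (1, 2): rows are (2xy, x^2) and (5, cos y).
exact = [[4.0, 1.0], [5.0, math.cos(2.0)]]
print(J)
```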

Version: 9 Owner: slider142 Author(s): slider142

378.2 directional derivative

Partial derivatives measure the rate at which a multivariable function $\vec f$ varies as the variable
moves in the direction of the standard basis vectors. Directional derivatives measure the rate
at which $\vec f$ varies when the variable moves in the direction $\vec v$. Thus the directional derivative
of $\vec f$ at a in the direction $\vec v$ is represented as

\[ D_{\vec v} \vec f(a) = \frac{\partial \vec f(a)}{\partial \vec v} = \lim_{h\to 0} \frac{\vec f(a + h\vec v) - \vec f(a)}{h}. \]

For example, if $f(x, y, z) = x^2 + 3y^2 z$, and we wanted to find the derivative at the point
$a = (1, 2, 3)^T$ in the direction $\vec v = (1, 1, 1)^T$, our equation would be

\[ \lim_{h\to 0} \frac{1}{h}\bigl((1 + h)^2 + 3(2 + h)^2(3 + h) - 37\bigr) = \lim_{h\to 0} \frac{1}{h}\bigl(3h^3 + 22h^2 + 50h\bigr) = \lim_{h\to 0} \bigl(3h^2 + 22h + 50\bigr) = 50. \]
One may also use the Jacobian matrix, if the function is differentiable, to find the derivative
in the direction $\vec v$ as $[Jf(x)]\vec v$.
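The worked example can be checked numerically in two ways: with a difference quotient and with the Jacobian (here a gradient) applied to the direction vector. This is a sketch; the helper name `directional_derivative`, the symmetric difference quotient, and the step size are assumptions.

```python
def f(p):
    x, y, z = p
    return x**2 + 3 * y**2 * z

def directional_derivative(f, a, v, h=1e-6):
    """Symmetric difference quotient (f(a + h v) - f(a - h v)) / (2 h)."""
    ap = [ai + h * vi for ai, vi in zip(a, v)]
    am = [ai - h * vi for ai, vi in zip(a, v)]
    return (f(ap) - f(am)) / (2 * h)

a = [1.0, 2.0, 3.0]
v = [1.0, 1.0, 1.0]

numeric = directional_derivative(f, a, v)

# Hand-computed gradient (2x, 6yz, 3y^2) at a, applied to v.
grad = [2 * a[0], 6 * a[1] * a[2], 3 * a[1] ** 2]
via_gradient = sum(g * vi for g, vi in zip(grad, v))
print(numeric, via_gradient)
```

Both values should agree with the limit computed above.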

Version: 6 Owner: slider142 Author(s): slider142

378.3 gradient

Summary. The gradient is a first-order differential operator that maps functions to vector fields.
It is a generalization of the ordinary derivative, and as such conveys information about the
rate of change of a function relative to small variations in the independent variables. The
gradient of a function f is customarily denoted by ∇f or by grad f .

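Definition: Euclidean space Consider n-dimensional Euclidean space with orthogonal
coordinates $x_1, \dots, x_n$, and corresponding unit vectors $e_1, \dots, e_n$. In this setting, the
gradient of a function $f(x_1, \dots, x_n)$ is defined to be the vector field given by
\[ \nabla f = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, e_i. \]

It is also useful to represent the gradient operator as the vector-valued differential operator
\[ \nabla = \sum_{i=1}^{n} e_i \frac{\partial}{\partial x_i}, \]
or, in the context of Euclidean 3-space, as
\[ \nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}, \]
where i, j, k are the unit vectors lying along the positive direction of the x, y, z axes,
respectively. Using this formalism, the ∇ symbol can be used to express the divergence operator
as ∇·, the curl operator as ∇×, and the Laplacian operator as $\nabla^2$. To wit, for a given vector
field
\[ \mathbf{A} = A_x \mathbf{i} + A_y \mathbf{j} + A_z \mathbf{k}, \]
and a given function f we have
\begin{align*}
\nabla \cdot \mathbf{A} &= \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}, \\
\nabla \times \mathbf{A} &= \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf{k}, \\
\nabla^2 f &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.
\end{align*}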

Definition: Riemannian geometry More generally still, consider a Riemannian manifold
with metric tensor $g_{ij}$ and inverse $g^{ij}$. In this setting the gradient X = grad f of a function
f relative to a general coordinate system is given by
\[ X^j = g^{ij} f_{,i}. \tag{378.3.1} \]
Note that the Einstein summation convention is in force above. Also note that $f_{,i}$ denotes
the partial derivative of f with respect to the ith coordinate.

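Definition (378.3.1) is useful even in the Euclidean setting, because it can be used to derive
the formula for the gradient in various generalized coordinate systems. For example, in the
cylindrical system of coordinates (r, θ, z) we have
\[ g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
while for the system of spherical coordinates (ρ, φ, θ) we have
\[ g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & \rho^2 \sin^2\phi \end{pmatrix}. \]

Hence, for a given function f we have
\begin{align*}
\nabla f &= \frac{\partial f}{\partial r}\, e_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\, e_\theta + \frac{\partial f}{\partial z}\, \mathbf{k} && \text{Cylindrical} \\
\nabla f &= \frac{\partial f}{\partial \rho}\, e_\rho + \frac{1}{\rho}\frac{\partial f}{\partial \phi}\, e_\phi + \frac{1}{\rho \sin\phi}\frac{\partial f}{\partial \theta}\, e_\theta && \text{Spherical},
\end{align*}
where for the cylindrical system
\begin{align*}
e_r &= \frac{\partial}{\partial r} = \frac{x}{r}\,\mathbf{i} + \frac{y}{r}\,\mathbf{j}, \\
e_\theta &= \frac{1}{r}\frac{\partial}{\partial \theta} = -\frac{y}{r}\,\mathbf{i} + \frac{x}{r}\,\mathbf{j}
\end{align*}
are the unit vectors in the direction of increase of r and θ, respectively, and for the spherical
system (with $r = \sqrt{x^2 + y^2}$)
\begin{align*}
e_\rho &= \frac{\partial}{\partial \rho} = \frac{x}{\rho}\,\mathbf{i} + \frac{y}{\rho}\,\mathbf{j} + \frac{z}{\rho}\,\mathbf{k}, \\
e_\phi &= \frac{1}{\rho}\frac{\partial}{\partial \phi} = \frac{zx}{r\rho}\,\mathbf{i} + \frac{zy}{r\rho}\,\mathbf{j} - \frac{r}{\rho}\,\mathbf{k}, \\
e_\theta &= \frac{1}{\rho \sin\phi}\frac{\partial}{\partial \theta} = -\frac{y}{r}\,\mathbf{i} + \frac{x}{r}\,\mathbf{j}
\end{align*}
are the unit vectors in the direction of increase of ρ, φ, θ, respectively.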

Physical Interpretation. In the simplest case, we consider the Euclidean plane with
Cartesian coordinates x, y. The gradient of a function f(x, y) is given by
\[ \nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}, \]
where i, j denote, respectively, the standard unit horizontal and vertical vectors. The gradient
vectors have the following geometric interpretation. Consider the graph z = f(x, y) as a
surface in 3-space. The direction of the gradient vector ∇f is the direction of steepest
ascent, while the magnitude is the slope in that direction. Thus,
\[ \|\nabla f\| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \]
describes the steepness of the hill z = f(x, y) at a point on the hill located at (x, y, f(x, y)).

A more general conception of the gradient is based on the interpretation of a function f as
a potential corresponding to some conservative physical force. The negation of the gradient,
−∇f , is then interpreted as the corresponding force field.

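Differential identities. Several properties of the one-dimensional derivative generalize to
a multi-dimensional setting:
\begin{align*}
\nabla(af + bg) &= a\nabla f + b\nabla g && \text{Linearity} \\
\nabla(fg) &= f\nabla g + g\nabla f && \text{Product rule} \\
\nabla(\phi \circ f) &= (\phi' \circ f)\nabla f && \text{Chain rule}
\end{align*}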

Version: 9 Owner: rmilson Author(s): rmilson, slider142

378.4 implicit differentiation

Implicit differentiation is a tool used to analyze functions that cannot be conveniently put
into a form y = f (x) where x = (x1 , x2 , ..., xn ). To use implicit differentiation meaningfully,
you must be certain that your function is of the form f (x) = 0 (it can be written as
a level set) and that it satisfies the implicit function theorem (f must be continuous, its
first partial derivatives must be continuous, and the derivative with respect to the implicit
function must be non-zero). To actually differentiate implicitly, we use the chain rule to
differentiate the entire equation.

Example: The first step is to identify the implicit function. For simplicity in the example,
we will assume f(x, y) = 0 and y is an implicit function of x. Let f(x, y) = x² + y² + xy = 0.
(Since this is a two dimensional equation, all one has to check is that the graph of y may
be an implicit function of x in local neighborhoods.) Then, to differentiate implicitly, we
differentiate both sides of the equation with respect to x. We will get
\[ 2x + 2y \cdot \frac{dy}{dx} + x \cdot \frac{dy}{dx} + y = 0. \]
Note how the chain rule was used in the above equation. Next, we simply solve for our
implicit derivative
\[ \frac{dy}{dx} = -\frac{2x + y}{2y + x}. \]
Note that the derivative depends on both the variable x and
the implicit function y. Most of your derivatives will be functions of one or all the variables,
including the implicit function itself.
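The implicit derivative formula can be compared against a direct difference quotient. The sketch below makes one deliberate change, stated as an assumption: it uses the level set x² + y² + xy = 3 instead of 0, because over the reals the zero level set consists only of the origin, while the level 3 passes through (1, 1). Differentiating either equation gives the same formula for dy/dx. The helper names `y_of_x` and `implicit_slope` are illustrative.

```python
import math

def y_of_x(x, c=3.0):
    """Upper branch of x^2 + y^2 + x*y = c, solved for y with the quadratic formula."""
    return (-x + math.sqrt(x * x - 4.0 * (x * x - c))) / 2.0

def implicit_slope(x, y):
    """dy/dx = -(2x + y) / (2y + x), obtained by implicit differentiation."""
    return -(2.0 * x + y) / (2.0 * y + x)

x0 = 1.0
y0 = y_of_x(x0)  # the point (1, 1) lies on the curve

h = 1e-6
numeric = (y_of_x(x0 + h) - y_of_x(x0 - h)) / (2 * h)  # slope of the explicit branch
formula = implicit_slope(x0, y0)
print(numeric, formula)
```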


Version: 2 Owner: slider142 Author(s): slider142

378.5 implicit function theorem

Let f = (f1 , ..., fn ) be a continuously differentiable, vector-valued function mapping an
open set E ⊂ Rn+m into Rn . Let (a, b) = (a1 , ..., an , b1 , ..., bm ) be a point in E for which
f(a, b) = 0 and such that the n × n determinant

\[ |D_j f_i(a, b)| \neq 0 \]

for i, j = 1, ..., n. Then there exists an m-dimensional neighbourhood W of b and a unique
continuously differentiable function g : W → Rn such that g(b) = a and

f(g(t), t) = 0

for all t ∈ W .

Simplest case

When n = m = 1, the theorem reduces to: Let F be a continuously differentiable, real-valued
function defined on an open set E ⊂ R2 and let $(x_0, y_0)$ be a point in E for which
$F(x_0, y_0) = 0$ and such that
\[ \frac{\partial F}{\partial x}\Big|_{(x_0, y_0)} \neq 0. \]
Then there exists an open interval I containing y0 , and a unique function f : I → R which
is continuously differentiable and such that f (y0 ) = x0 and

F (f (y), y) = 0

for all y ∈ I.
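The simplest case can be illustrated by solving F(x, y) = 0 for x numerically near a point where ∂F/∂x ≠ 0. Everything specific in this sketch is an assumption made for illustration: the function F(x, y) = x² + y² − 1, the base point (x0, y0) = (1, 0), the helper name `implicit_f`, and the use of Newton's method in x.

```python
import math

def F(x, y):
    return x * x + y * y - 1.0

def dFdx(x, y):
    return 2.0 * x

def implicit_f(y, x_start=1.0, steps=50):
    """Solve F(x, y) = 0 for x by Newton's method, starting from x0 = 1."""
    x = x_start
    for _ in range(steps):
        x -= F(x, y) / dFdx(x, y)
    return x

# Sample the implicit function x = f(y) on an interval around y0 = 0.
ys = [-0.5 + 0.1 * k for k in range(11)]
residuals = [abs(F(implicit_f(y), y)) for y in ys]
print(max(residuals))
```

For this particular F the implicit function is known in closed form, f(y) = √(1 − y²), which gives an independent check of the numerical values.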

Note

The inverse function theorem is a special case of the implicit function theorem where the
dimension of each variable is the same.

Version: 7 Owner: vypertd Author(s): vypertd

378.6 proof of implicit function theorem

Consider the function F : E → Rn × Rm defined by

\[ F(x, y) = (f(x, y), y). \]

Setting $A_{jk} = \frac{\partial f_j}{\partial x_k}(a, b)$ and $M_{ji} = \frac{\partial f_j}{\partial y_i}(a, b)$, A is an n × n matrix and M is n × m. It holds

\[ Df(a, b) = (A \mid M) \]

and hence
\[ DF(a, b) = \begin{pmatrix} A & M \\ 0 & I_m \end{pmatrix}. \]

Since det A ≠ 0 by hypothesis, A is invertible and hence DF(a, b) is invertible too. Applying the
inverse function theorem to F, and noting that F(a, b) = (0, b), we find that there exist a
neighbourhood V of 0 in Rn and W of b in Rm
and a function $G \in C^1(V \times W, R^{n+m})$ such that F(G(x, y)) = (x, y) for all (x, y) ∈ V × W.
Letting $G(x, y) = (G_1(x, y), G_2(x, y))$ (so that $G_1$ : V × W → Rn, $G_2$ : V × W → Rm) we
hence have

\[ (x, y) = F(G_1(x, y), G_2(x, y)) = \bigl(f(G_1(x, y), G_2(x, y)),\, G_2(x, y)\bigr) \]

and hence $y = G_2(x, y)$ and $x = f(G_1(x, y), G_2(x, y)) = f(G_1(x, y), y)$. So we only have to
set $g(y) = G_1(0, y)$ to obtain

\[ f(g(y), y) = 0, \quad \forall y \in W. \]

Version: 1 Owner: paolini Author(s): paolini

Chapter 379

26B12 – Calculus of vector functions

379.1 Clairaut’s theorem

Theorem. (Clairaut’s Theorem) If f : Rn → Rm is a function whose second partial derivatives
exist and are continuous on a set S ⊆ Rn, then
\[ \frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i} \]
on S (where 1 ≤ i, j ≤ n).

This theorem is commonly referred to as simply ’the equality of mixed partials’. It is usually
first presented in a vector calculus course, and is useful in this context for proving basic
properties of the interrelations of gradient, divergence, and curl. For example, if F : R3 → R3 is a
function satisfying the hypothesis, then ∇ · (∇ × F) = 0. Or, if f : R3 → R is a function
satisfying the hypothesis, ∇ × ∇f = 0.

Version: 10 Owner: flynnheiss Author(s): flynnheiss

379.2 Fubini’s Theorem

Fubini’s Theorem Let $I \subset R^N$ and $J \subset R^M$ be compact intervals, and let $f : I \times J \to R^K$
be a Riemann integrable function such that, for each x ∈ I, the integral

\[ F(x) := \int_J f(x, y)\,d\mu_J(y) \]

exists. Then $F : I \to R^K$ is Riemann integrable, and

\[ \int_I F = \int_{I \times J} f. \]

This theorem effectively states that, given a function of N variables, you may integrate it
one variable at a time, and that the order of integration does not affect the result.

Example Let I := [0, π/2] × [0, π/2], and let f : I → R, (x, y) ↦ sin(x) cos(y) be a function.
Then
\begin{align*}
\int_I f &= \iint_{[0,\pi/2]\times[0,\pi/2]} \sin(x)\cos(y) \\
&= \int_0^{\pi/2} \left( \int_0^{\pi/2} \sin(x)\cos(y)\,dy \right) dx \\
&= \int_0^{\pi/2} \sin(x)\,(1 - 0)\,dx = (0 - (-1)) = 1.
\end{align*}

Note that it is often simpler (and no less correct) to write $\int \cdots \int_I f$ as $\int_I f$.
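The example can be reproduced numerically by iterating one-dimensional quadrature in both orders. This is a sketch only: the helper name `midpoint`, the midpoint rule itself, and the number of subintervals are assumptions.

```python
import math

def midpoint(g, a, b, n=400):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

f = lambda x, y: math.sin(x) * math.cos(y)
a, b = 0.0, math.pi / 2

# Integrate over y first, then x; and the other way round.
dy_then_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), a, b), a, b)
dx_then_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), a, b), a, b)
print(dy_then_dx, dx_then_dy)
```

Both orders of integration should give a value close to 1, in agreement with the computation above.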

Version: 3 Owner: vernondalhart Author(s): vernondalhart

379.3 Generalised N-dimensional Riemann Sum

Let I = [a1 , b1 ] × · · · × [aN , bN ] be an N-cell in RN . For each j = 1, . . . , N, let aj = tj,0 <
. . . < tj,N = bj be a partition Pj of [aj , bj ]. We define a partition P of I as

P := P1 × · · · × PN

Each partition P of I generates a subdivision of I (denoted by (Iν )ν ) of the form

Iν = [t1,j , t1,j+1 ] × · · · × [tN,k , tN,k+1 ]

Let $f : U \to R^M$ be such that I ⊂ U, and let $(I_\nu)_\nu$ be the corresponding subdivision of a
partition P of I. For each ν, choose $x_\nu \in I_\nu$. Define

\[ S(f, P) := \sum_{\nu} f(x_\nu)\,\mu(I_\nu) \]

as the Riemann sum of f corresponding to the partition P.

A partition Q of I is called a refinement of P if P ⊂ Q.

Version: 1 Owner: vernondalhart Author(s): vernondalhart

379.4 Generalized N-dimensional Riemann Integral

Let $I = [a_1, b_1] \times \cdots \times [a_N, b_N] \subset R^N$ be a compact interval, and let $f : I \to R^M$ be a
function. If there exists a $y \in R^M$ such that for every ε > 0 there is a partition $P_\varepsilon$ of I such that
for each refinement P of $P_\varepsilon$ (and corresponding Riemann sum S(f, P)),
\[ \|S(f, P) - y\| < \varepsilon, \]
then we say that f is Riemann integrable over I, that y is the Riemann integral of f over
I, and we write
\[ \int_I f := \int_I f\,d\mu := y. \]

Note also that it is possible to extend this definition to more arbitrary sets; for any bounded
set D, one can find a compact interval I such that D ⊂ I, and define a function
\[ \tilde f : I \to R^M, \qquad x \mapsto \begin{cases} f(x), & x \in D \\ 0, & x \notin D \end{cases} \]
in which case we define
\[ \int_D f := \int_I \tilde f. \]

Version: 3 Owner: vernondalhart Author(s): vernondalhart

379.5 Helmholtz equation

The Helmholtz equation is a partial differential equation which, in scalar form, is
\[ \nabla^2 f + k^2 f = 0, \]
or in vector form is
\[ \nabla^2 A + k^2 A = 0, \]
where $\nabla^2$ is the Laplacian. The solutions of this equation represent the time-harmonic solutions of the
wave equation, which is of great interest in physics.

Consider a wave equation
\[ \frac{\partial^2 \psi}{\partial t^2} = c^2 \nabla^2 \psi \]
with wave speed c. If we look for time harmonic standing waves of frequency ω,
\[ \psi(x, t) = e^{-j\omega t} \phi(x), \]
we find that φ(x) satisfies the Helmholtz equation:
\[ (\nabla^2 + k^2)\phi = 0, \]
where k = ω/c is the wave number.

Usually the Helmholtz equation is solved by the separation of variables method, in Cartesian,
spherical or cylindrical coordinates.

Version: 3 Owner: giri Author(s): giri

379.6 Hessian matrix

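The Hessian of a scalar function of a vector is the matrix of second partial derivatives. So
the Hessian matrix of a function f : Rn → R is:

\[ \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix} \tag{379.6.1} \]

Note that the Hessian is symmetric whenever the second partial derivatives are continuous, because of the equality of mixed partials.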
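A Hessian can be approximated entry by entry with second-order difference quotients. The following sketch rests on assumptions: the helper name `hessian`, the four-point central-difference formula, the step size, and the sample function f(x, y) = x²y + y³ are illustrative choices.

```python
def hessian(f, x, h=1e-4):
    """Approximate the Hessian of a scalar function f at x.

    Entry (i, j) uses the four-point central difference
    [f(x+he_i+he_j) - f(x+he_i-he_j) - f(x-he_i+he_j) + f(x-he_i-he_j)] / (4 h^2).
    """
    n = len(x)

    def shifted(i, si, j, sj):
        p = list(x)
        p[i] += si * h
        p[j] += sj * h
        return f(p)

    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = (
                shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)
            ) / (4 * h * h)
    return H

def f(p):
    x, y = p
    return x * x * y + y**3

H = hessian(f, [1.0, 2.0])
# Hand-computed Hessian [[2y, 2x], [2x, 6y]] at (1, 2).
exact = [[4.0, 2.0], [2.0, 12.0]]
print(H)
```

The computed matrix is symmetric up to rounding, as expected for a function with continuous second partial derivatives.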

Version: 2 Owner: bshanks Author(s): akrowne, bshanks

379.7 Jordan Content of an N-cell

Let $I = [a_1, b_1] \times \cdots \times [a_N, b_N]$ be an N-cell in $R^N$. Then the Jordan content (denoted µ(I))
of I is defined as
\[ \mu(I) := \prod_{j=1}^{N} (b_j - a_j). \]

Version: 1 Owner: vernondalhart Author(s): vernondalhart

379.8 Laplace equation

The scalar form of Laplace’s equation is the partial differential equation

∇2 f = 0

and the vector form is
∇2 A = 0,
where ∇2 is the Laplacian. It is a special case of the Helmholtz differential equation with
k = 0.

A function f which satisfies Laplace’s equation is said to be harmonic. Since Laplace’s
equation is linear, the superposition of any two solutions is also a solution.

Version: 3 Owner: giri Author(s): giri

379.9 chain rule (several variables)

The chain rule is a theorem of analysis that governs derivatives of composed functions. The
basic theorem is the chain rule for functions of one variable. This entry is devoted
to the more general version involving functions of several variables and partial derivatives.
Note: the symbol $D_k$ will be used to denote the partial derivative with respect to the kth
variable.

Let F (x1 , . . . , xn ) and G1 (x1 , . . . , xm ), . . . , Gn (x1 , . . . , xm ) be differentiable functions of sev-
eral variables, and let

H(x1 , . . . , xm ) = F (G1 (x1 , . . . , xm ), . . . , Gn (x1 , . . . , xm ))

be the function determined by the composition of F with $G_1, \dots, G_n$. The partial derivatives
of H are given by
\[ (D_k H)(x_1, \dots, x_m) = \sum_{i=1}^{n} (D_i F)\bigl(G_1(x_1, \dots, x_m), \dots, G_n(x_1, \dots, x_m)\bigr)\,(D_k G_i)(x_1, \dots, x_m). \]

The chain rule can be more compactly (albeit less precisely) expressed in terms of the
Jacobi-Legendre partial derivative symbols (historical note). Just as in the Leibniz system, the basic
idea is that of one quantity (i.e. variable) depending on one or more other quantities. Thus
we would speak about a variable z depending differentiably on $y_1, \dots, y_n$, which in turn depend
differentiably on variables $x_1, \dots, x_m$. We would then write the chain rule as
\[ \frac{\partial z}{\partial x_j} = \sum_{i=1}^{n} \frac{\partial z}{\partial y_i} \frac{\partial y_i}{\partial x_j}, \qquad j = 1, \dots, m. \]

The most general and conceptually clear approach to the multi-variable chain rule is based on the
notion of a differentiable mapping, with the Jacobian matrix of partial derivatives playing
the role of generalized derivative. Let $X \subset R^m$ and $Y \subset R^n$ be open domains and let

\[ F : Y \to R^l, \qquad G : X \to Y \]

be differentiable mappings. In essence, the symbol F represents l functions of n variables
each:
\[ F = (F_1, \dots, F_l), \qquad F_i = F_i(x_1, \dots, x_n), \]
whereas $G = (G_1, \dots, G_n)$ represents n functions of m variables each. The derivative of such
mappings is no longer a function, but rather a matrix of partial derivatives, customarily called
the Jacobian matrix. Thus
\[ DF = \begin{pmatrix} D_1 F_1 & \cdots & D_n F_1 \\ \vdots & \ddots & \vdots \\ D_1 F_l & \cdots & D_n F_l \end{pmatrix}, \qquad
DG = \begin{pmatrix} D_1 G_1 & \cdots & D_m G_1 \\ \vdots & \ddots & \vdots \\ D_1 G_n & \cdots & D_m G_n \end{pmatrix}. \]

The chain rule now takes the same form as it did for functions of one variable:
\[ D(F \circ G) = ((DF) \circ G)\,(DG), \]
albeit with matrix multiplication taking the place of ordinary multiplication.
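The matrix form of the chain rule can be checked numerically by comparing the Jacobian of the composition with the product of the two Jacobians. In this sketch the helper names `jacobian` and `matmul`, the central-difference approximation, and the sample maps G : R² → R³ and F : R³ → R² are all assumptions made for illustration.

```python
def jacobian(f, x, h=1e-6):
    """Central-difference approximation of the Jacobian of f at x."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def G(p):            # G : R^2 -> R^3
    x, y = p
    return [x + y, x * y, x - y]

def F(q):            # F : R^3 -> R^2
    u, v, w = q
    return [u * v, v + w * w]

x0 = [1.0, 2.0]
lhs = jacobian(lambda p: F(G(p)), x0)                 # D(F o G) at x0
rhs = matmul(jacobian(F, G(x0)), jacobian(G, x0))     # (DF)(G(x0)) * DG(x0)
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(lhs, rhs, max_diff)
```

Here F∘G(x, y) = ((x + y)xy, xy + (x − y)²), whose Jacobian at (1, 2) works out by hand to the rows (8, 5) and (0, 3).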

This form of the chain rule also generalizes quite nicely to the even more general setting
where one is interested in describing the derivative of a composition of mappings between
manifolds.

Version: 7 Owner: rmilson Author(s): rmilson

379.10 divergence

Basic Definition. Let x, y, z be a system of Cartesian coordinates on 3-dimensional Euclidean space,
and let i, j, k be the corresponding basis of unit vectors. The divergence of a continuously differentiable
vector field
\[ \mathbf{F} = F^1 \mathbf{i} + F^2 \mathbf{j} + F^3 \mathbf{k}, \]
is defined to be the function
\[ \operatorname{div} \mathbf{F} = \frac{\partial F^1}{\partial x} + \frac{\partial F^2}{\partial y} + \frac{\partial F^3}{\partial z}. \]
Another common notation for the divergence is ∇ · F (see gradient), a convenient mnemonic.

Physical interpretation. In physical terms, the divergence of a vector field is the extent
to which the vector field flow behaves like a source or a sink at a given point. Indeed, an
alternative, but logically equivalent definition, gives the divergence as the limit of the
net flow of the vector field across the surface of a small sphere relative to the volume enclosed by
the sphere. To wit,
\[ (\operatorname{div} \mathbf{F})(p) = \lim_{r\to 0} \frac{\int_S (\mathbf{F} \cdot \mathbf{N})\,dS}{\tfrac{4}{3}\pi r^3}, \]
where S denotes the sphere of radius r about a point p ∈ R3, and the integral is a surface
integral taken with respect to N, the outward unit normal to that sphere.

The non-infinitesimal interpretation of divergence is given by Gauss’s Theorem. This the-
orem is a conservation law, stating that the volume total of all sinks and sources, i.e. the
volume integral of the divergence, is equal to the net flow across the volume’s boundary. In
symbols,
\[ \int_V \operatorname{div} \mathbf{F}\,dV = \int_S (\mathbf{F} \cdot \mathbf{N})\,dS, \]
where V ⊂ R3 is a compact region with a smooth boundary, and S = ∂V is that boundary
oriented by outward-pointing normals. We note that Gauss’s theorem follows from the more
general Stokes’ theorem, which itself generalizes the fundamental theorem of calculus.

In light of the physical interpretation, a vector field with constant zero divergence is called
incompressible – in this case, no net flow can occur across any closed surface.

General definition. The notion of divergence has meaning in the more general setting of
Riemannian geometry. To that end, let V be a vector field on a Riemannian manifold. The
covariant derivative of V is a type (1, 1) tensor field. We define the divergence of V to be the
trace of that field. In terms of coordinates (see tensor and Einstein summation convention),
we have
\[ \operatorname{div} V = V^i{}_{;i}. \]

Version: 6 Owner: rmilson Author(s): rmilson, jaswenso

379.11 extremum

Extrema are minima and maxima. The singular forms of these words are extremum, mini-
mum, and maximum.

Extrema may be “global” or “local”. A global minimum of a function f is the lowest value
that f ever achieves. If you imagine the function as a surface, then a global minimum is the
lowest point on that surface. Formally, it is said that f : U → V has a global minimum at x
if ∀u ∈ U, f(x) ≤ f(u).

A local minimum of a function f is a point x at which the value of f is no greater than at all points “next
to” it. If you imagine the function as a surface, then a local minimum is the bottom of a
“valley” or “bowl” in the surface somewhere. Formally, it is said that f : U → V has a local
minimum at x if ∃ a neighborhood N of x such that ∀y ∈ N, f(x) ≤ f(y).

If you flip the ≤ signs above to ≥, you get the definitions of global and local maxima.

A “strict local minimum” or “strict local maximum” means that the value at the point is strictly less
than (respectively, strictly greater than) the values at nearby points, rather than ≤ or ≥. For instance, a strict
local minimum at x has a neighborhood N such that ∀y ∈ N, (f(x) < f(y) or y = x).

Related concepts are plateau and saddle point.

Finding minima or maxima is an important task which is part of the field of optimization.

Version: 9 Owner: bshanks Author(s): bshanks, bbukh

379.12 irrotational field

Suppose Ω is an open set in R3, and V is a vector field on Ω with differentiable real (or possibly
complex) valued component functions. If ∇ × V = 0, then V is called an irrotational vector
field, or curl free field.

If U and V are irrotational, then U × V is solenoidal.

Version: 6 Owner: matte Author(s): matte, giri

379.13 partial derivative

The partial derivative of a multivariable function f is simply its derivative with respect to
only one variable, keeping all other variables (which are not functions of the variable
in question) constant. The formal definition is

\[ D_i f(a) = \frac{\partial f}{\partial a_i} = \lim_{h\to 0} \frac{1}{h}\left( f\begin{pmatrix} a_1 \\ \vdots \\ a_i + h \\ \vdots \\ a_n \end{pmatrix} - f(a) \right) = \lim_{h\to 0} \frac{f(a + h\vec e_i) - f(a)}{h}, \]

where $\vec e_i$ is the standard basis vector of the ith variable. Since this only affects the ith
variable, one can differentiate the function using common rules and tables, treating all other variables
(which are not functions of $a_i$) as constants. For example, if $f(x, y, z) = x^2 + 2xy + y^2 + y^3 z$,
then

(1) $\frac{\partial f}{\partial x} = 2x + 2y$

(2) $\frac{\partial f}{\partial y} = 2x + 2y + 3y^2 z$

(3) $\frac{\partial f}{\partial z} = y^3$

Note that in equation (1), we treated y as a constant, since we were differentiating with
respect to x; recall that $\frac{d(c \cdot x)}{dx} = c$. The partial derivative of a vector-valued function $\vec f(x)$ with respect
to variable $a_i$ is a vector $\overrightarrow{D_i f} = \frac{\partial \vec f}{\partial a_i}$.
Multiple Partials:
Multiple partial derivatives can be treated just like multiple derivatives. There is an additional degree of freedom though, as you can compound derivatives with respect to different variables. For example, using the above function,

(4) $\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}(2x + 2y) = 2$

(5) $\frac{\partial^2 f}{\partial z \partial y} = \frac{\partial}{\partial z}(2x + 2y + 3y^2 z) = 3y^2$

(6) $\frac{\partial^2 f}{\partial y \partial z} = \frac{\partial}{\partial y}(y^3) = 3y^2$

$D_{12}$ is another way of writing $\frac{\partial^2}{\partial x_1 \partial x_2}$. If the second partial derivatives of f are continuous in a neighborhood of x, it can be shown that $D_{ij} f(x) = D_{ji} f(x)$, where i, j refer to the ith and jth variables. In fact, as long as an equal number of partials are taken with respect to each variable, changing the order of differentiation will produce the same results under the above condition (with continuity required of all partials up to the order taken).
Another form of notation is $f^{(a,b,c,\ldots)}(x)$, where a is the number of times the partial derivative is taken with respect to the first variable, b the number of times with respect to the second variable, etc.
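
The limit definition above can be checked numerically. The following is a minimal sketch, not part of the original entry: the helper name `partial` and the step size `h` are illustrative assumptions, and a central difference is used in place of the one-sided quotient because it is more accurate for a fixed step.

```python
def partial(f, point, i, h=1e-6):
    """Central-difference estimate of D_i f at `point` (i is a 0-based index)."""
    up = list(point)
    down = list(point)
    up[i] += h      # move only the ith variable, keep the others fixed
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def f(x, y, z):
    # The example function from the text: x^2 + 2xy + y^2 + y^3 z
    return x**2 + 2 * x * y + y**2 + y**3 * z


a = (1.0, 2.0, 3.0)
grad = [partial(f, a, i) for i in range(3)]
# Exact values from equations (1)-(3) at (1, 2, 3):
# 2x + 2y = 6, 2x + 2y + 3y^2 z = 42, y^3 = 8
print(grad)
```

The estimates agree with the closed forms (1) to (3) up to rounding error.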

Version: 17 Owner: slider142 Author(s): slider142

379.14 plateau

A plateau of a function is a region where a function has constant value.

More formally, let U and V be topological spaces. A plateau for a scalar function f : U → V
is a path-connected set of points P ⊆ U such that for some y we have

∀p ∈ P, f (p) = y (379.14.1)

Please take note that this entry is not authoritative. If you know of a more standard definition of "plateau", please contribute it, thank you.

Version: 4 Owner: bshanks Author(s): bshanks

379.15 proof of Green’s theorem

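Consider the region R bounded by the closed curve P in a well-connected space. P can be given by a vector valued function $\vec{F}(x, y) = (f(x, y), g(x, y))$. The region R can then be described by
\[ \iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dA = \iint_R \frac{\partial g}{\partial x} \, dA - \iint_R \frac{\partial f}{\partial y} \, dA \]
The double integrals above can be evaluated separately. Let's look at
\[ \iint_R \frac{\partial g}{\partial x} \, dA = \int_a^b \int_{A(y)}^{B(y)} \frac{\partial g}{\partial x} \, dx \, dy \]
Evaluating the inner integral, we get
\[ \int_a^b \left( g(B(y), y) - g(A(y), y) \right) dy = \int_b^a g(A(y), y) \, dy + \int_a^b g(B(y), y) \, dy \]
By the definition of the line integral, the right-hand side is the line integral of the function $\vec{F}_1(x, y) = (0, g(x, y))$ over a path $P = P_1 + P_2$, where $P_1 = (A(y), y)$ is traversed from y = b to y = a and $P_2 = (B(y), y)$ from y = a to y = b, so that P is traversed counterclockwise:
\[ \int_b^a g(A(y), y) \, dy + \int_a^b g(B(y), y) \, dy = \int_{P_1} \vec{F}_1 \cdot d\vec{t} + \int_{P_2} \vec{F}_1 \cdot d\vec{t} = \oint_P \vec{F}_1 \cdot d\vec{t} \]
Thus we have
\[ \iint_R \frac{\partial g}{\partial x} \, dA = \oint_P \vec{F}_1 \cdot d\vec{t} \]
By a similar argument, we can show that
\[ \iint_R \frac{\partial f}{\partial y} \, dA = - \oint_P \vec{F}_2 \cdot d\vec{t} \]
where $\vec{F}_2 = (f(x, y), 0)$. Putting all of the above together, we can see that
\[ \iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dA = \oint_P \vec{F}_1 \cdot d\vec{t} + \oint_P \vec{F}_2 \cdot d\vec{t} = \oint_P (\vec{F}_1 + \vec{F}_2) \cdot d\vec{t} = \oint_P (f(x, y), g(x, y)) \cdot d\vec{t} \]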

which is Green’s theorem.

Version: 7 Owner: slider142 Author(s): slider142

379.16 relations between Hessian matrix and local ex-
trema

Let x be a vector, and let H(x) be the Hessian of f at the point x. Let a neighborhood of x be contained in the domain of f , and let f have continuous partial derivatives of first and second order. Let ∇f (x) = 0.

If H(x) is positive definite, then x is a strict local minimum for f .

If x is a local minimum for f , then H(x) is positive semidefinite.

If H(x) is negative definite, then x is a strict local maximum for f .

If x is a local maximum for f , then H(x) is negative semidefinite.

If H(x) is indefinite, x is a nondegenerate saddle point.

In the case when the dimension of x is 1 (i.e. f : R → R), this reduces to the Second Derivative Test, which is as follows:

Let a neighborhood of x be contained in the domain of f , and let f have continuous derivatives of first and second order. Let f′(x) = 0. If f″(x) > 0, then x is a strict local minimum. If f″(x) < 0, then x is a strict local maximum.
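
For a function of two variables the definiteness of the Hessian can be read off from its determinant and one diagonal entry. The sketch below is an illustration only: the function name `classify_critical_point` and the returned labels are assumptions, and the point is assumed to already satisfy ∇f = 0.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Classify a critical point of f(x, y) from its Hessian [[fxx, fxy], [fxy, fyy]]."""
    det = fxx * fyy - fxy * fxy
    if det > 0:
        # Both eigenvalues share a sign, and that sign is the sign of fxx.
        return "strict local minimum" if fxx > 0 else "strict local maximum"
    if det < 0:
        # Eigenvalues of opposite sign: the Hessian is indefinite.
        return "saddle point"
    # A zero eigenvalue: the Hessian alone does not decide.
    return "inconclusive"


print(classify_critical_point(2.0, 0.0, 2.0))    # f = x^2 + y^2 at the origin
print(classify_critical_point(-2.0, 0.0, -2.0))  # f = -(x^2 + y^2) at the origin
print(classify_critical_point(2.0, 0.0, -2.0))   # f = x^2 - y^2 at the origin
```

The semidefinite case is reported as inconclusive on purpose, matching the one-way implications stated above.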

Version: 6 Owner: bshanks Author(s): bshanks

379.17 solenoidal field

A solenoidal vector field is one that satisfies

∇·B=0

at every point where the vector field B is defined. Here ∇ · B is the divergence.

This condition implies that, at least locally (for instance on any star-shaped region), there exists a vector field A, known as the vector potential, such that
B = ∇ × A.

For a function f satisfying Laplace’s equation

∇²f = 0,

it follows that ∇f is solenoidal.

Version: 4 Owner: giri Author(s): giri

Chapter 380

26B15 – Integration: length, area,
volume

380.1 arc length

Arc length is the length of a section of a differentiable curve. Finding arc length is useful in
many applications, for the length of a curve can be attributed to distance traveled, work, etc.
It is commonly represented as S or the differential ds if one is differentiating or integrating
with respect to change in arclength.

If one knows the vector function or parametric equations of a curve, finding the arc length is simple, as it is given by the integral of the length of the tangent vector to the curve:
\[ S = \int_a^b |\vec{F}'(t)| \, dt \]
Note that t is an independent parameter. In Cartesian coordinates, arclength can be calculated by the formula
\[ S = \int_a^b \sqrt{1 + (f'(x))^2} \, dx \]
This formula is derived by viewing arclength as the limit of the Riemann sum
\[ \lim_{\Delta x \to 0} \sum_{i=1}^{n} \sqrt{1 + (f'(x_i))^2} \, \Delta x \]
The term being summed approximates the length of a secant to the curve over the distance ∆x. As ∆x vanishes, the sum approaches the arclength, hence the formula. Arclength can also be derived for polar coordinates from the general formula for vector functions given above. The result is
\[ L = \int_a^b \sqrt{r(\theta)^2 + (r'(\theta))^2} \, d\theta \]
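
The secant-sum picture behind these formulas translates directly into a numerical approximation. This is a rough sketch under stated assumptions: the function name `arc_length` and the number of segments `n` are illustrative choices, and the curve is assumed to be given as a parametrisation returning a coordinate tuple.

```python
import math


def arc_length(curve, a, b, n=10_000):
    """Approximate the length of t -> curve(t) on [a, b] by summing n secant lengths."""
    total = 0.0
    prev = curve(a)
    for k in range(1, n + 1):
        cur = curve(a + (b - a) * k / n)
        total += math.dist(prev, cur)  # length of one approximating secant
        prev = cur
    return total


# Unit circle, parametrised by angle: the length should be close to 2*pi.
circle = arc_length(lambda t: (math.cos(t), math.sin(t)), 0.0, 2 * math.pi)

# Graph of y = x^2 on [0, 1]: compare with the closed form of
# the integral of sqrt(1 + 4x^2), namely sqrt(5)/2 + asinh(2)/4.
parabola = arc_length(lambda x: (x, x * x), 0.0, 1.0)
exact_parabola = math.sqrt(5) / 2 + math.asinh(2) / 4
print(circle, parabola, exact_parabola)
```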

Version: 5 Owner: slider142 Author(s): slider142

Chapter 381

26B20 – Integral formulas (Stokes,
Gauss, Green, etc.)

381.1 Green’s theorem

Green’s theorem provides a connection between path integrals over a well-connected region in the plane and the area of the region bounded in the plane. Given a closed path P bounding a region R with area A, and a vector-valued function $\vec{F} = (f(x, y), g(x, y))$ over the plane,
\[ \oint_P \vec{F} \cdot d\vec{x} = \iint_R [g_1(x, y) - f_2(x, y)] \, dA \]
where $a_n$ is the derivative of a with respect to the nth variable.

Corollary: The closed path integral over a gradient of a function with continuous partial derivatives
is always zero. Thus, gradients are conservative vector fields. The smooth function is called
the potential of the vector field.

Proof: The corollary states that
\[ \oint_P \vec{\nabla} h \cdot d\vec{x} = 0 \]
We can easily prove this using Green’s theorem. Writing $\vec{\nabla} h = (f, g) = (h_1, h_2)$,
\[ \oint_P \vec{\nabla} h \cdot d\vec{x} = \iint_R [g_1(x, y) - f_2(x, y)] \, dA \]
But since this is a gradient,
\[ \iint_R [g_1(x, y) - f_2(x, y)] \, dA = \iint_R [h_{21}(x, y) - h_{12}(x, y)] \, dA \]
Since $h_{12} = h_{21}$ for any function with continuous second partials, the corollary is proven.
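
Both the theorem and the corollary can be sanity-checked numerically on the unit circle. The sketch below is illustrative only: the helper `line_integral`, the midpoint rule, and the two test fields are assumptions chosen for the example, not part of the entry.

```python
import math


def line_integral(f, g, path, dpath, a, b, n=20_000):
    """Midpoint-rule estimate of the integral of (f, g) . d(x, y) along path(t), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y = path(t)
        dx, dy = dpath(t)
        total += (f(x, y) * dx + g(x, y) * dy) * h
    return total


def unit_circle(t):
    return (math.cos(t), math.sin(t))


def unit_circle_velocity(t):
    return (-math.sin(t), math.cos(t))


# F = (-y, x): here g_1 - f_2 = 2, so the double integral is 2 * (area of disc) = 2*pi.
rotation = line_integral(lambda x, y: -y, lambda x, y: x,
                         unit_circle, unit_circle_velocity, 0.0, 2 * math.pi)

# F = grad(x^2 y) = (2xy, x^2): a gradient field, so the closed integral should vanish.
gradient = line_integral(lambda x, y: 2 * x * y, lambda x, y: x * x,
                         unit_circle, unit_circle_velocity, 0.0, 2 * math.pi)
print(rotation, gradient)
```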

Version: 4 Owner: slider142 Author(s): slider142

Chapter 382

26B25 – Convexity, generalizations

382.1 convex function

Definition Suppose Ω is a convex set in a vector space over R (or C), and suppose f is a function f : Ω → R. If for any x, y ∈ Ω and any λ ∈ (0, 1), we have
\[ f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y), \]
we say that f is a convex function. If for any x, y ∈ Ω and any λ ∈ (0, 1), we have
\[ f(\lambda x + (1 - \lambda) y) \ge \lambda f(x) + (1 - \lambda) f(y), \]
we say that f is a concave function. If the respective inequality is strict whenever x ≠ y, then we say that f is a strictly convex function, or a strictly concave function, respectively.

Properties

• A function f is a (strictly) convex function if and only if −f is a (strictly) concave
function.

• On R, a continuous function f is convex if and only if for all x, y ∈ R, we have
\[ f\left( \frac{x + y}{2} \right) \le \frac{f(x) + f(y)}{2}. \]

• A twice continuously differentiable function f on R is convex if and only if f″(x) ≥ 0 for all x ∈ R.

• A local minimum of a convex function is a global minimum. See entry 382.2.

Examples

• $e^x$, $e^{-x}$, and $x^2$ are convex functions on R.
• A norm is a convex function.
• On $R^2$, the 1-norm and the ∞-norm (i.e., $\|(x, y)\|_1 = |x| + |y|$ and $\|(x, y)\|_\infty = \max\{|x|, |y|\}$) are not strictly convex ([2], pp. 334-335).

REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons,
1978.

Version: 11 Owner: matte Author(s): matte, drini

382.2 extremal value of convex/concave functions

Theorem. Let U be a convex set in a normed (real or complex) vector space. If f : U → R
is a convex function on U, then a local minimum of f is a global minimum.

Proof. Suppose x is a local minimum for f , i.e., there is an open ball B ⊂ U with radius ε and center x such that f (x) ≤ f (ξ) for all ξ ∈ B. Let us fix some y ∈ U with y ∉ B. Our aim is to prove that f (x) ≤ f (y). We define $\lambda = \frac{\varepsilon}{2\|x - y\|}$, where ‖ · ‖ is the norm on U. Since y ∉ B, we have ‖x − y‖ ≥ ε and hence 0 < λ ≤ 1/2.

Then
\[ \|\lambda y + (1 - \lambda) x - x\| = \|\lambda y - \lambda x\| = |\lambda| \, \|x - y\| = \frac{\varepsilon}{2}, \]
so λy + (1 − λ)x ∈ B. It follows that f (x) ≤ f (λy + (1 − λ)x). Since f is convex, we then get
\[ f(x) \le f(\lambda y + (1 - \lambda) x) \le \lambda f(y) + (1 - \lambda) f(x), \]
hence λf (x) ≤ λf (y), and f (x) ≤ f (y) as claimed. □

The analogous theorem for concave functions is as follows.

Theorem. Let U be a convex set in a normed (real or complex) vector space. If f : U → R
is a concave function on U, then a local maximum of f is a global maximum.

Proof. Consider the convex function −f . If x is a local maximum of f , then it is a local minimum of −f . By the previous theorem, x is then a global minimum of −f . Hence x is a global maximum of f . □

Version: 1 Owner: matte Author(s): matte

Chapter 383

26B30 – Absolutely continuous
functions, functions of bounded
variation

383.1 absolutely continuous function

Definition [1, 1] Let [a, b] be a closed bounded interval of R. Then a function f : [a, b] → C is absolutely continuous on [a, b], if for any ε > 0, there is a δ > 0 such that the following condition holds:

(∗) If $(a_1, b_1), \ldots, (a_n, b_n)$ is a finite collection of disjoint open intervals in [a, b] such that
\[ \sum_{i=1}^{n} (b_i - a_i) < \delta, \]
then
\[ \sum_{i=1}^{n} |f(b_i) - f(a_i)| < \varepsilon. \]

Basic results for absolutely continuous functions are as follows.

Theorem

1. A function f : [a, b] → C is absolutely continuous if and only if Re{f } and Im{f } are
absolutely continuous real functions.

2. If f : [a, b] → C is a function which is everywhere differentiable and f′ is bounded, then f is absolutely continuous [1].

3. Any absolutely continuous function f : [a, b] → C is continuous on [a, b] and is of bounded variation [1, 1].

4. If f, g are absolutely continuous functions, then so are f g, f + g, $|f|^\gamma$ (if γ ≥ 1), and f /g (if g is never zero) [1].

5. If f, g are real valued absolutely continuous functions, then so are max{f, g} and min{f, g}. If f (x) > 0 for all x and γ ∈ R, then $f^\gamma$ is absolutely continuous [1].

Property (2), which is readily proven using the mean value theorem, implies that any smooth function with compact support on R is absolutely continuous. By property (3), any absolutely continuous function is of bounded variation. Hence, from properties of functions of bounded variation, the following theorem follows:

Theorem ([1], p. 536) Let f : [a, b] → C be an absolutely continuous function. Then f is differentiable almost everywhere, and |f′| is Lebesgue integrable.

We have the following characterization of absolutely continuous functions:

Theorem [Fundamental theorem of calculus for the Lebesgue integral] ([1], pp. 550, [1]) Let f : [a, b] → C be a function. Then f is absolutely continuous if and only if there is a function $g \in L^1(a, b)$ (i.e. a g : (a, b) → C with $\int_{(a,b)} |g| < \infty$), such that
\[ f(x) = f(a) + \int_a^x g(t) \, dt \]
for all x ∈ [a, b]. What is more, if f and g are as above, then f′ = g almost everywhere. (Above, both integrals are Lebesgue integrals.)

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993.
4. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.

Version: 5 Owner: matte Author(s): matte

383.2 total variation

Let γ : [a, b] → X be a function mapping an interval [a, b] to a metric space (X, d). We say that γ is of bounded variation if there is a constant M such that, for each partition $P = \{a = t_0 < t_1 < \cdots < t_n = b\}$ of [a, b],
\[ v(\gamma, P) = \sum_{k=1}^{n} d(\gamma(t_k), \gamma(t_{k-1})) \le M. \]

The total variation Vγ of γ is defined by

Vγ = sup{v(γ, P ) : P is a partition of [a, b]}.

It can be shown that, if X is either R or C, every smooth (or piecewise smooth) function γ : [a, b] → X is of bounded variation, and
\[ V_\gamma = \int_a^b |\gamma'(t)| \, dt. \]
Also, if γ is of bounded variation and f : [a, b] → X is continuous, then the Riemann-Stieltjes integral $\int_a^b f \, d\gamma$ exists and is finite.

If γ is also continuous, it is said to be a rectifiable path, and $V_\gamma$ is the length of its trace.

If X = R, it can be shown that γ is of bounded variation if and only if it is the difference of
two monotonic functions.
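
The sum v(γ, P) is easy to compute for a real-valued function. The following sketch is only an illustration: the helper names `variation` and `uniform_partition` are assumptions, and uniform partitions are used purely for convenience (the supremum in the definition ranges over all partitions).

```python
import math


def variation(gamma, partition):
    """v(gamma, P): sum of |gamma(t_k) - gamma(t_{k-1})| over consecutive partition points."""
    return sum(abs(gamma(t1) - gamma(t0)) for t0, t1 in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """The n + 1 equally spaced points a = t_0 < t_1 < ... < t_n = b."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# sin on [0, 2*pi] rises by 1, falls by 2, rises by 1: total variation 4,
# which also equals the integral of |cos t| over the interval.
too_coarse = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
fine = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 4000))
print(too_coarse, fine)
```

A partition that misses the turning points, such as the three-interval one above, gives a strictly smaller sum, which is why the total variation is defined as a supremum.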

Version: 3 Owner: Koro Author(s): Koro

Chapter 384

26B99 – Miscellaneous

384.1 derivation of zeroth weighted power mean

Let $x_1, x_2, \ldots, x_n$ be positive real numbers, and let $w_1, w_2, \ldots, w_n$ be positive real numbers such that $w_1 + w_2 + \cdots + w_n = 1$. For r ≠ 0, the r-th weighted power mean of $x_1, x_2, \ldots, x_n$ is
\[ M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r}. \]
Using the Taylor series expansion $e^t = 1 + t + O(t^2)$, where $O(t^2)$ is Landau notation for terms of order $t^2$ and higher, we can write $x_i^r$ as
\[ x_i^r = e^{r \log x_i} = 1 + r \log x_i + O(r^2). \]

By substituting this into the definition of $M_w^r$, we get
\[ M_w^r(x_1, x_2, \ldots, x_n) = \left[ w_1 (1 + r \log x_1) + \cdots + w_n (1 + r \log x_n) + O(r^2) \right]^{1/r} \]
\[ = \left[ 1 + r (w_1 \log x_1 + \cdots + w_n \log x_n) + O(r^2) \right]^{1/r} \]
\[ = \left[ 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right]^{1/r} \]
\[ = \exp\left\{ \frac{1}{r} \log\left[ 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right] \right\}. \]
Again using a Taylor series, this time $\log(1 + t) = t + O(t^2)$, we get
\[ M_w^r(x_1, x_2, \ldots, x_n) = \exp\left\{ \frac{1}{r} \left[ r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right] \right\} \]
\[ = \exp\left[ \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r) \right]. \]

Taking the limit r → 0, we find
\[ M_w^0(x_1, x_2, \ldots, x_n) = \exp\left[ \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) \right] = x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}. \]
In particular, if we choose all the weights to be $\frac{1}{n}$,
\[ M^0(x_1, x_2, \ldots, x_n) = \sqrt[n]{x_1 x_2 \cdots x_n}, \]
the geometric mean of $x_1, x_2, \ldots, x_n$.

Version: 3 Owner: pbruin Author(s): pbruin

384.2 weighted power mean

If $w_1, w_2, \ldots, w_n$ are positive real numbers such that $w_1 + w_2 + \cdots + w_n = 1$, we define the r-th weighted power mean of the positive real numbers $x_i$ as:
\[ M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r}. \]
When all the $w_i = \frac{1}{n}$ we get the standard power mean. The weighted power mean is a continuous function of r, and taking the limit when r → 0 gives us
\[ M_w^0 = x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}. \]
We can use weighted power means to generalize the power means inequality: If w is a set of weights, and if r < s then
\[ M_w^r \le M_w^s. \]
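
The definition, the limiting case r → 0, and the monotonicity in r can all be observed numerically. This is a minimal sketch: the function name `weighted_power_mean` and the sample data are assumptions made for illustration.

```python
import math


def weighted_power_mean(xs, ws, r):
    """r-th weighted power mean of positive xs with positive weights ws summing to 1."""
    if r == 0:
        # Limiting case r -> 0: the weighted geometric mean.
        return math.exp(sum(w * math.log(x) for x, w in zip(xs, ws)))
    return sum(w * x**r for x, w in zip(xs, ws)) ** (1.0 / r)


xs = [1.0, 4.0, 9.0]
ws = [0.5, 0.25, 0.25]
exponents = (-1, -1e-6, 0, 1e-6, 1, 2)
means = [weighted_power_mean(xs, ws, r) for r in exponents]
for r, m in zip(exponents, means):
    print(r, m)
```

The printed values increase with r, and the values for r = ±1e-6 sit very close to the weighted geometric mean obtained for r = 0.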

Version: 6 Owner: drini Author(s): drini

Chapter 385

26C15 – Rational functions

385.1 rational function

A real function R(x) of a single variable x is called rational if it can be written as a quotient

\[ R(x) = \frac{P(x)}{Q(x)}, \]

where P (x) and Q(x) are polynomials in x with real coefficients.

In general, a rational function R(x1 , . . . , xn ) has the form

\[ R(x_1, \ldots, x_n) = \frac{P(x_1, \ldots, x_n)}{Q(x_1, \ldots, x_n)}, \]

where P (x1 , . . . , xn ) and Q(x1 , . . . , xn ) are polynomials in the variables (x1 , . . . , xn ) with
coefficients in some field or ring S.

In this sense, R(x1 , . . . , xn ) can be regarded as an element of the fraction field S(x1 , . . . , xn )
of the polynomial ring S[x1 , . . . , xn ].

Version: 1 Owner: igor Author(s): igor

Chapter 386

26C99 – Miscellaneous

386.1 Laguerre Polynomial

A Laguerre Polynomial is a polynomial of the form:
\[ L_n(x) = \frac{e^x}{n!} \frac{d^n}{dx^n} \left( e^{-x} x^n \right). \]
Associated to this is the Laguerre differential equation, the solutions of which are called associated Laguerre Polynomials:
\[ L_n^k(x) = \frac{e^x x^{-k}}{n!} \frac{d^n}{dx^n} \left( e^{-x} x^{n+k} \right). \]
Of course
\[ L_n^0(x) = L_n(x). \]
The associated Laguerre Polynomials are orthogonal over [0, ∞) with respect to the weighting function $x^k e^{-x}$:
\[ \int_0^\infty e^{-x} x^k L_n^k(x) L_m^k(x) \, dx = \frac{(n + k)!}{n!} \delta_{nm}. \]
n!

Version: 2 Owner: mathwizard Author(s): mathwizard

Chapter 387

26D05 – Inequalities for trigonometric
functions and polynomials

387.1 Weierstrass product inequality

For any finite family $(a_i)_{i \in I}$ of real numbers in the interval [0, 1], we have
\[ \prod_i (1 - a_i) \ge 1 - \sum_i a_i. \]

Proof: Write
\[ f = \prod_i (1 - a_i) + \sum_i a_i. \]

For any k ∈ I, and any fixed values of the $a_i$ for i ≠ k, f is a polynomial of the first degree in $a_k$. Consequently f is minimal either at $a_k = 0$ or $a_k = 1$. That brings us down to two cases: all the $a_i$ are zero, or at least one of them is 1. But in both cases it is clear that f ≥ 1, QED.

Version: 2 Owner: Daume Author(s): Larry Hammick

387.2 proof of Jordan’s Inequality

To prove that
\[ \frac{2}{\pi} x \le \sin(x) \le x \quad \forall x \in \left[ 0, \frac{\pi}{2} \right] \qquad (387.2.1) \]
consider a unit circle (circle with radius = 1 unit). Take any point P on the circumference
of the circle.

Drop the perpendicular from P to the horizontal line, M being the foot of the perpendicular
and Q the reflection of P at M. (refer to figure)

Let x = ∠P OM.

For x to be in [0, π/2], the point P lies in the first quadrant, as shown.

The length of line segment P M is sin(x). Construct a circle of radius MP , with M as the
center.

Length of line segment P Q is 2 sin(x).

Length of arc P AQ is 2x.

Length of arc P BQ is π sin(x).

Since P Q ≤ length of arc P AQ (equality holds when x = 0) we have 2 sin(x) ≤ 2x. This implies
\[ \sin(x) \le x \]

Since length of arc P AQ is ≤ length of arc P BQ (equality holds true when x = 0 or x = π/2), we have 2x ≤ π sin(x). This implies
\[ \frac{2}{\pi} x \le \sin(x) \]

Thus we have
\[ \frac{2}{\pi} x \le \sin(x) \le x \quad \forall x \in \left[ 0, \frac{\pi}{2} \right] \qquad (387.2.2) \]
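
As a quick numerical companion to the geometric argument, the two bounds can be sampled on a grid. This is only a spot check under assumptions (the grid size and the small tolerance absorbing floating-point rounding at the endpoint are arbitrary choices), not a proof.

```python
import math

# Sample 1001 equally spaced points in [0, pi/2].
xs = [k * (math.pi / 2) / 1000 for k in range(1001)]

# Lower bound (2/pi) x <= sin x, with a tiny tolerance for rounding at x = pi/2.
lower_ok = all(2 * x / math.pi <= math.sin(x) + 1e-12 for x in xs)

# Upper bound sin x <= x.
upper_ok = all(math.sin(x) <= x for x in xs)

print(lower_ok, upper_ok)
```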

Version: 12 Owner: giri Author(s): giri

Chapter 388

26D10 – Inequalities involving
derivatives and differential and
integral operators

388.1 Gronwall’s lemma

If, for $t_0 \le t \le t_1$, φ(t) ≥ 0 and ψ(t) ≥ 0 are continuous functions such that the inequality
\[ \varphi(t) \le K + L \int_{t_0}^{t} \psi(s) \varphi(s) \, ds \]
holds on $t_0 \le t \le t_1$, with K and L positive constants, then
\[ \varphi(t) \le K \exp\left( L \int_{t_0}^{t} \psi(s) \, ds \right) \]
on $t_0 \le t \le t_1$.

Version: 1 Owner: jarino Author(s): jarino

388.2 proof of Gronwall’s lemma

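The inequality
\[ \varphi(t) \le K + L \int_{t_0}^{t} \psi(s) \varphi(s) \, ds \qquad (388.2.1) \]
is equivalent to
\[ \frac{\varphi(t)}{K + L \int_{t_0}^{t} \psi(s) \varphi(s) \, ds} \le 1 \]
Multiply by Lψ(t) and integrate, giving
\[ \int_{t_0}^{t} \frac{L \psi(s) \varphi(s)}{K + L \int_{t_0}^{s} \psi(\tau) \varphi(\tau) \, d\tau} \, ds \le L \int_{t_0}^{t} \psi(s) \, ds \]
Thus
\[ \ln\left( K + L \int_{t_0}^{t} \psi(s) \varphi(s) \, ds \right) - \ln K \le L \int_{t_0}^{t} \psi(s) \, ds \]
and finally
\[ K + L \int_{t_0}^{t} \psi(s) \varphi(s) \, ds \le K \exp\left( L \int_{t_0}^{t} \psi(s) \, ds \right) \]
Using (388.2.1) in the left hand side of this inequality gives the result.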

Version: 2 Owner: jarino Author(s): jarino

Chapter 389

26D15 – Inequalities for sums, series
and integrals

389.1 Carleman’s inequality

Theorem ([4], pp. 24) For positive real numbers $\{a_n\}_{n=1}^{\infty}$, Carleman’s inequality states that
\[ \sum_{n=1}^{\infty} (a_1 a_2 \cdots a_n)^{1/n} \le e \sum_{n=1}^{\infty} a_n. \]

Although the constant e (the natural log base) is optimal, it is possible to refine Carleman’s
inequality by decreasing the weight coefficients on the right hand side [2].

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. B.Q. Yuan, Refinements of Carleman’s inequality, Journal of Inequalities in Pure and Applied Mathematics, Vol. 2, Issue 2, 2001, Article 21.

Version: 2 Owner: matte Author(s): matte

389.2 Chebyshev’s inequality

Let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be two sequences of real numbers (at least one of them consisting of positive numbers).

• If $x_1 < x_2 < \cdots < x_n$ and $y_1 < y_2 < \cdots < y_n$ then
\[ \left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \le \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}. \]

• If $x_1 < x_2 < \cdots < x_n$ and $y_1 > y_2 > \cdots > y_n$ then
\[ \left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \ge \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}. \]

Version: 1 Owner: drini Author(s): drini

389.3 MacLaurin’s Inequality

Let $a_1, a_2, \ldots, a_n$ be positive real numbers, and define the sums $S_k$ as follows:
\[ S_k = \frac{\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} a_{i_1} a_{i_2} \cdots a_{i_k}}{\binom{n}{k}} \]
Then the following chain of inequalities is true:
\[ S_1 \ge \sqrt{S_2} \ge \sqrt[3]{S_3} \ge \cdots \ge \sqrt[n]{S_n} \]
Note: the $S_k$ are called the averages of the elementary symmetric sums.
This inequality is in fact important because it shows that the Arithmetic-Geometric Mean inequality is nothing but a consequence of a chain of stronger inequalities.
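
The chain can be computed directly from the definition for a small example. This sketch is illustrative: the function name `symmetric_average` and the sample values are assumptions, and the brute-force enumeration of k-subsets is only practical for small n.

```python
from itertools import combinations
from math import comb, prod


def symmetric_average(a, k):
    """S_k: the k-th elementary symmetric sum of a divided by the number of k-subsets."""
    return sum(prod(c) for c in combinations(a, k)) / comb(len(a), k)


a = [1.0, 2.0, 3.0, 4.0]
# chain[k - 1] holds the k-th root of S_k, for k = 1, ..., n.
chain = [symmetric_average(a, k) ** (1.0 / k) for k in range(1, len(a) + 1)]
print(chain)
```

The first entry is the arithmetic mean of the sample and the last is its geometric mean, with the intermediate roots decreasing in between.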

Version: 2 Owner: drini Author(s): drini, slash

389.4 Minkowski inequality

If p ≥ 1 and $a_k$, $b_k$ are real numbers for k = 1, . . . , n, then
\[ \left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1/p} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} + \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p} \]

The Minkowski inequality is in fact valid for all Lp norms with p ≥ 1 on arbitrary measure spaces.
This covers the case of Rn listed here as well as spaces of sequences and spaces of functions,
and also complex Lp spaces.

Version: 8 Owner: drini Author(s): drini, saforres


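389.5 Muirhead’s theorem

Let $0 \le s_1 \le \cdots \le s_n$ and $0 \le t_1 \le \cdots \le t_n$ be real numbers such that
\[ \sum_{i=1}^{n} s_i = \sum_{i=1}^{n} t_i \quad \text{and} \quad \sum_{i=1}^{k} s_i \le \sum_{i=1}^{k} t_i \quad (k = 1, \ldots, n - 1) \]

Then for any nonnegative numbers $x_1, \ldots, x_n$,
\[ \sum_{\sigma} x_1^{s_{\sigma(1)}} \cdots x_n^{s_{\sigma(n)}} \ge \sum_{\sigma} x_1^{t_{\sigma(1)}} \cdots x_n^{t_{\sigma(n)}} \]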

where the sums run over all permutations σ of {1, 2, . . . , n}.

Version: 3 Owner: Koro Author(s): Koro

389.6 Schur’s inequality

If a, b, and c are positive real numbers and k ≥ 1 a fixed real constant, then the following inequality holds:
\[ a^k (a - b)(a - c) + b^k (b - c)(b - a) + c^k (c - a)(c - b) \ge 0 \]

Taking k = 1, we get the well-known
\[ a^3 + b^3 + c^3 + 3abc \ge ab(a + b) + ac(a + c) + bc(b + c) \]

We can assume without loss of generality that c ≤ b ≤ a via a permutation of the variables (as both sides are symmetric in those variables). Then collecting terms, the inequality states that
\[ (a - b) \left( a^k (a - c) - b^k (b - c) \right) + c^k (a - c)(b - c) \ge 0 \]
which is clearly true as every term on the left is nonnegative.

Version: 3 Owner: mathcam Author(s): mathcam, slash

389.7 Young’s inequality

Let φ : R → R be a continuous, strictly increasing function such that φ(0) = 0. Then for all a, b ≥ 0 (with b in the range of φ) the following inequality holds:
\[ ab \le \int_0^a \varphi(x) \, dx + \int_0^b \varphi^{-1}(y) \, dy \]
The inequality is easy to prove by drawing the graph of φ(x) and by observing that the sum of the two areas represented by the integrals above is at least the area of a rectangle of sides a and b.

Version: 2 Owner: slash Author(s): slash

389.8 arithmetic-geometric-harmonic means inequality

Let $x_1, x_2, \ldots, x_n$ be positive numbers. Then
\[ \max\{x_1, x_2, \ldots, x_n\} \ge \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \ge \min\{x_1, x_2, \ldots, x_n\} \]

There are several generalizations to this inequality using power means and weighted power means.
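
A small numerical illustration of the chain above follows. It is a sketch only: the three helper functions and the sample data are assumptions introduced for the example.

```python
import math


def arithmetic_mean(xs):
    return sum(xs) / len(xs)


def geometric_mean(xs):
    return math.prod(xs) ** (1.0 / len(xs))


def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)


xs = [1.0, 2.0, 4.0]
# max >= arithmetic >= geometric >= harmonic >= min
chain = [max(xs), arithmetic_mean(xs), geometric_mean(xs), harmonic_mean(xs), min(xs)]
print(chain)
```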

Version: 4 Owner: drini Author(s): drini

389.9 general means inequality

The power means inequality is a generalization of the arithmetic-geometric means inequality.

If r ≠ 0 is a real number, the r-mean (or r-th power mean) of the nonnegative numbers $a_1, \ldots, a_n$ is defined as
\[ M^r(a_1, a_2, \ldots, a_n) = \left( \frac{1}{n} \sum_{k=1}^{n} a_k^r \right)^{1/r} \]

Given real numbers x, y such that xy ≠ 0 and x < y, we have
\[ M^x \le M^y \]
and the equality holds if and only if $a_1 = \cdots = a_n$.

Additionally, if we define $M^0$ to be the geometric mean $(a_1 a_2 \cdots a_n)^{1/n}$, we have that the inequality above holds for arbitrary real numbers x < y.

The arithmetic-geometric-harmonic means inequality is a special case of this one, since $M^1$ is the arithmetic mean, $M^0$ is the geometric mean and $M^{-1}$ is the harmonic mean.

This inequality can be further generalized using weighted power means.

Version: 3 Owner: drini Author(s): drini

389.10 power mean

The r-th power mean of the numbers $x_1, x_2, \ldots, x_n$ is defined as:
\[ M^r(x_1, x_2, \ldots, x_n) = \left( \frac{x_1^r + x_2^r + \cdots + x_n^r}{n} \right)^{1/r}. \]

The arithmetic mean is a special case when r = 1. The power mean is a continuous function of r, and taking the limit when r → 0 gives us the geometric mean:
\[ M^0(x_1, x_2, \ldots, x_n) = \sqrt[n]{x_1 x_2 \cdots x_n}. \]

Also, when r = −1 we get
\[ M^{-1}(x_1, x_2, \ldots, x_n) = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \]
the harmonic mean.

Weighted power means are a generalization of power means.

Version: 8 Owner: drini Author(s): drini

389.11 proof of Chebyshev’s inequality

Let x1 , x2 , . . . , xn and y1 , y2, . . . , yn be real numbers such that x1 ≤ x2 ≤ · · · ≤ xn . Write
the product (x1 + x2 + · · · + xn )(y1 + y2 + · · · + yn ) as

(x1 y1 + x2 y2 + · · · + xn yn )
+ (x1 y2 + x2 y3 + · · · + xn−1 yn + xn y1 )
+ (x1 y3 + x2 y4 + · · · + xn−2 yn + xn−1 y1 + xn y2 )
+ ···
+ (x1 yn + x2 y1 + x3 y2 + · · · + xn yn−1 ). (389.11.1)

• If $y_1 \le y_2 \le \cdots \le y_n$, each of the n terms in parentheses is less than or equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$, according to the rearrangement inequality. From this, it follows that
\[ (x_1 + x_2 + \cdots + x_n)(y_1 + y_2 + \cdots + y_n) \le n (x_1 y_1 + x_2 y_2 + \cdots + x_n y_n) \]
or (dividing by $n^2$)
\[ \left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \le \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}. \]

• If $y_1 \ge y_2 \ge \cdots \ge y_n$, the same reasoning gives
\[ \left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \ge \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}. \]

It is clear that equality holds if $x_1 = x_2 = \cdots = x_n$ or $y_1 = y_2 = \cdots = y_n$. To see that this condition is also necessary, suppose that not all $y_i$’s are equal, so that $y_1 \ne y_n$. Then the second term in parentheses of (389.11.1) can only be equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ if $x_{n-1} = x_n$, the third term only if $x_{n-2} = x_{n-1}$, and so on, until the last term which can only be equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ if $x_1 = x_2$. This implies that $x_1 = x_2 = \cdots = x_n$. Therefore, Chebyshev’s inequality is an equality if and only if $x_1 = x_2 = \cdots = x_n$ or $y_1 = y_2 = \cdots = y_n$.

Version: 1 Owner: pbruin Author(s): pbruin

389.12 proof of Minkowski inequality

For p = 1 the result follows immediately from the triangle inequality, so we may assume
p > 1.

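We have
\[ |a_k + b_k|^p = |a_k + b_k| \, |a_k + b_k|^{p-1} \le (|a_k| + |b_k|) \, |a_k + b_k|^{p-1} \]
by the triangle inequality. Therefore we have
\[ |a_k + b_k|^p \le |a_k| \, |a_k + b_k|^{p-1} + |b_k| \, |a_k + b_k|^{p-1} \]
Set $q = \frac{p}{p-1}$. Then $\frac{1}{p} + \frac{1}{q} = 1$, so by the Hölder inequality we have
\[ \sum_{k=1}^{n} |a_k| \, |a_k + b_k|^{p-1} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} \left( \sum_{k=1}^{n} |a_k + b_k|^{(p-1)q} \right)^{1/q} \]
\[ \sum_{k=1}^{n} |b_k| \, |a_k + b_k|^{p-1} \le \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p} \left( \sum_{k=1}^{n} |a_k + b_k|^{(p-1)q} \right)^{1/q} \]
Adding these two inequalities, dividing by the factor common to the right sides of both (if that factor is zero the result is trivial), and observing that (p − 1)q = p by definition, we have
\[ \left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1 - \frac{1}{q}} \le \frac{\sum_{k=1}^{n} (|a_k| + |b_k|) \, |a_k + b_k|^{p-1}}{\left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1/q}} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} + \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p} \]

Finally, observe that $1 - \frac{1}{q} = \frac{1}{p}$, and the result follows as required. The proof for the integral version is analogous.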

Version: 4 Owner: saforres Author(s): saforres

389.13 proof of arithmetic-geometric-harmonic means
inequality

Let M be max{x1 , x2 , x3 , . . . , xn } and let m be min{x1 , x2 , x3 , . . . , xn }.

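Then
\[ M = \frac{M + M + M + \cdots + M}{n} \ge \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} \]
\[ m = \frac{n}{\frac{n}{m}} = \frac{n}{\frac{1}{m} + \frac{1}{m} + \frac{1}{m} + \cdots + \frac{1}{m}} \le \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}} \]
where all the summations have n terms. So we have proved in this way the two inequalities at the extremes.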

Now we shall prove the inequality between arithmetic mean and geometric mean. We do first the case n = 2.
\[ (\sqrt{x_1} - \sqrt{x_2})^2 \ge 0 \]
\[ x_1 - 2\sqrt{x_1 x_2} + x_2 \ge 0 \]
\[ x_1 + x_2 \ge 2\sqrt{x_1 x_2} \]
\[ \frac{x_1 + x_2}{2} \ge \sqrt{x_1 x_2} \]

Now we prove the inequality for any power of 2 (that is, $n = 2^k$ for some integer k) by using mathematical induction.
\[ \frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} = \frac{\left( \frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} \right) + \left( \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k} \right)}{2} \]
and using the case n = 2 on the last expression we can state the following inequality
\[ \frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} \ge \sqrt{ \left( \frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} \right) \left( \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k} \right) } \ge \sqrt{ \sqrt[2^k]{x_1 x_2 \cdots x_{2^k}} \; \sqrt[2^k]{x_{2^k+1} x_{2^k+2} \cdots x_{2^{k+1}}} } \]
where the last inequality was obtained by applying the induction hypothesis with $n = 2^k$.

Finally, we see that the last expression is equal to $\sqrt[2^{k+1}]{x_1 x_2 x_3 \cdots x_{2^{k+1}}}$ and so we have proved the truth of the inequality when the number of terms is a power of two.

Finally, we prove that if the inequality holds for some n, it must also hold for n − 1, and this proposition, combined with the preceding proof for powers of 2, is enough to prove the inequality for any positive integer.

Suppose that
\[ \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n} \]
is known for a given value of n (we just proved that it is true for powers of two, as an example). Then we can replace $x_n$ with the average of the first n − 1 numbers. So
\[ \frac{x_1 + x_2 + \cdots + x_{n-1} + \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1}}{n} = \frac{(n-1) x_1 + (n-1) x_2 + \cdots + (n-1) x_{n-1} + x_1 + x_2 + \cdots + x_{n-1}}{n(n-1)} = \frac{n x_1 + n x_2 + \cdots + n x_{n-1}}{n(n-1)} = \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \]

On the other hand
\[ \sqrt[n]{x_1 x_2 \cdots x_{n-1} \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)} = \sqrt[n]{x_1 x_2 \cdots x_{n-1}} \; \sqrt[n]{\frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1}} \]
which, by the inequality stated for n and the observations made above, leads to:
\[ \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)^n \ge (x_1 x_2 \cdots x_{n-1}) \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right) \]
and so
\[ \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)^{n-1} \ge x_1 x_2 \cdots x_{n-1} \]
from where we get that
\[ \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \ge \sqrt[n-1]{x_1 x_2 \cdots x_{n-1}}. \]

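So far we have proved the inequality between the arithmetic mean and the geometric mean. The geometric-harmonic inequality is easier. Let $t_i$ be $1/x_i$.

From
\[ \frac{t_1 + t_2 + \cdots + t_n}{n} \ge \sqrt[n]{t_1 t_2 t_3 \cdots t_n} \]
we obtain
\[ \frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}{n} \ge \sqrt[n]{\frac{1}{x_1} \frac{1}{x_2} \frac{1}{x_3} \cdots \frac{1}{x_n}} \]
and therefore
\[ \sqrt[n]{x_1 x_2 x_3 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}} \]
and so, our proof is completed.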

Version: 2 Owner: drini Author(s): drini

389.14 proof of general means inequality

Let r < s be real numbers, and let w1 , w2 , . . . , wn be positive real numbers such that w1 +
w2 + · · · + wn = 1. We will prove the weighted power means inequality, which states that
for positive real numbers x1 , x2 , . . . , xn ,
Mwr (x1 , x2 , . . . , xn ) ≤ Mws (x1 , x2 , . . . , xn ).

First, suppose that r and s are nonzero. Then the r-th weighted power mean of $x_1, x_2, \ldots, x_n$ is
\[ M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r} \]
and $M_w^s$ is defined similarly.

Let $t = \frac{s}{r}$, and let $y_i = x_i^r$ for 1 ≤ i ≤ n; this implies $y_i^t = x_i^s$. Define the function f on (0, ∞) by $f(x) = x^t$. The second derivative of f is $f''(x) = t(t-1) x^{t-2}$. There are three cases for the signs of r and s: r < s < 0, r < 0 < s, and 0 < r < s. We will prove the inequality for the case 0 < r < s; the other cases are almost identical.

In the case that r and s are both positive, t > 1. Since $f''(x) = t(t-1) x^{t-2} > 0$ for all x > 0, f is a strictly convex function. Therefore, according to Jensen’s inequality,
\[ (w_1 y_1 + w_2 y_2 + \cdots + w_n y_n)^t = f(w_1 y_1 + w_2 y_2 + \cdots + w_n y_n) \le w_1 f(y_1) + w_2 f(y_2) + \cdots + w_n f(y_n) = w_1 y_1^t + w_2 y_2^t + \cdots + w_n y_n^t, \]
with equality if and only if $y_1 = y_2 = \cdots = y_n$. By substituting $t = \frac{s}{r}$ and $y_i = x_i^r$ back into this inequality, we get

\[ (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{s/r} \le w_1 x_1^s + w_2 x_2^s + \cdots + w_n x_n^s \]

with equality if and only if $x_1 = x_2 = \cdots = x_n$. Since s is positive, the function $x \mapsto x^{1/s}$ is strictly increasing, so raising both sides to the power 1/s preserves the inequality:
\[ (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r} \le (w_1 x_1^s + w_2 x_2^s + \cdots + w_n x_n^s)^{1/s}, \]

which is the inequality we had to prove. Equality holds if and only if all the $x_i$ are equal.

If r = 0, the inequality is still correct: $M_w^0$ is defined as $\lim_{r \to 0} M_w^r$, and since $M_w^r \le M_w^s$ for all r < s with r ≠ 0, the same holds for the limit r → 0. We can show by an identical argument that $M_w^r \le M_w^0$ for all r < 0. Therefore, for all real numbers r and s such that r < s,
\[ M_w^r(x_1, x_2, \ldots, x_n) \le M_w^s(x_1, x_2, \ldots, x_n). \]

Version: 1 Owner: pbruin Author(s): pbruin

389.15 proof of rearrangement inequality

We first prove the rearrangement inequality for the case n = 2. Let x1 , x2 , y1 , y2 be real numbers
such that x1 ≤ x2 and y1 ≤ y2 . Then

(x2 − x1 )(y2 − y1 ) ≥ 0,

and therefore
x1 y1 + x2 y2 ≥ x1 y2 + x2 y1 .
Equality holds iff x1 = x2 or y1 = y2 .

For the general case, let x1 , x2 , . . . , xn and y1 , y2, . . . , yn be real numbers such that x1 ≤ x2 ≤
· · · ≤ xn . Suppose that (z1 , z2 , . . . , zn ) is a permutation (rearrangement) of {y1, y2 , . . . , yn }
such that the sum
x1 z1 + x2 z2 + · · · + xn zn
is maximized. If there exists a pair i < j with zi > zj , then xi zj + xj zi ≥ xi zi + xj zj (the
n = 2 case); equality holds iff xi = xj . Therefore, x1 z1 + x2 z2 + · · · + xn zn is not maximal
unless z1 ≤ z2 ≤ · · · ≤ zn or xi = xj for all pairs i < j such that zi > zj . In the latter case, we
can consecutively interchange these pairs until z1 ≤ z2 ≤ · · · ≤ zn (this is possible because
the number of pairs i < j with zi > zj decreases with each step). So x1 z1 + x2 z2 + · · · + xn zn
is maximized if
z1 ≤ z2 ≤ · · · ≤ zn .
To show that x1 z1 + x2 z2 + · · · + xn zn is minimal for a permutation (z1 , z2 , . . . , zn ) of
{y1 , y2 , . . . , yn } if z1 ≥ z2 ≥ · · · ≥ zn , observe that −(x1 z1 + x2 z2 + · · · + xn zn ) = x1 (−z1 ) +
x2 (−z2 ) + · · · + xn (−zn ) is maximized if −z1 ≤ −z2 ≤ · · · ≤ −zn . This implies that
x1 z1 + x2 z2 + · · · + xn zn is minimized if

z1 ≥ z2 ≥ · · · ≥ zn .

Version: 1 Owner: pbruin Author(s): pbruin

389.16 rearrangement inequality

Let x1 , x2 , . . . , xn and y1 , y2 , . . . , yn be two sequences of positive real numbers. Then the sum

x1 y1 + x2 y2 + · · · + xn yn

is maximized when the two sequences are ordered in the same way (i.e. x1 ≤ x2 ≤ · · · ≤ xn
and y1 ≤ y2 ≤ · · · ≤ yn ) and is minimized when the two sequences are ordered in the opposite
way (i.e. x1 ≤ x2 ≤ · · · ≤ xn and y1 ≥ y2 ≥ · · · ≥ yn ).

This can be seen intuitively as follows: if x1 , x2 , . . . , xn are the prices of n kinds of items, and
y1 , y2 , . . . , yn the number of units sold of each, then the highest profit occurs when you sell more
items with high prices and fewer items with low prices (same ordering), and the lowest profit
occurs when you sell more items with low prices and fewer items with high prices (opposite
ordering).
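
The statement can be checked by brute force on a small example. The sketch below, with hypothetical helper names and sample data chosen for illustration, compares every pairing of two short sequences against the same-order and opposite-order pairings.

```python
from itertools import permutations

def pairing_sum(xs, zs):
    """Sum x1*z1 + ... + xn*zn for one particular pairing."""
    return sum(x * z for x, z in zip(xs, zs))

def extreme_pairings(xs, ys):
    """Smallest and largest pairing sums over all rearrangements of ys."""
    xs = sorted(xs)
    sums = [pairing_sum(xs, p) for p in permutations(ys)]
    return min(sums), max(sums)

# Illustrative data (assumed): prices and units sold.
xs = [1, 3, 4, 7]
ys = [2, 5, 6, 9]

lo, hi = extreme_pairings(xs, ys)
same_order = pairing_sum(sorted(xs), sorted(ys))
opposite_order = pairing_sum(sorted(xs), sorted(ys, reverse=True))
```

Enumerating all permutations is only feasible for small n; the point of the inequality is that sorting alone identifies the extreme pairings.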

Version: 4 Owner: drini Author(s): drini

Chapter 390

26D99 – Miscellaneous

390.1 Bernoulli’s inequality

Let x and r be real numbers. If r > 1 and x > −1 then

$$(1 + x)^r \ge 1 + rx.$$

The inequality also holds for every real x when r is a nonnegative even integer.

Version: 3 Owner: drini Author(s): drini

390.2 proof of Bernoulli’s inequality

Let I be the interval (−1, ∞) and f : I → R the function defined as:

$$f(x) = (1 + x)^{\alpha} - 1 - \alpha x$$

with α ∈ R \ {0, 1} fixed. Then f is differentiable and its derivative is

$$f'(x) = \alpha(1 + x)^{\alpha - 1} - \alpha, \quad \text{for all } x \in I,$$

from which it follows that f′(x) = 0 ⇔ x = 0.

1. If 0 < α < 1 then f′(x) < 0 for all x ∈ (0, ∞) and f′(x) > 0 for all x ∈ (−1, 0),
which means that 0 is a global maximum point for f . Therefore f (x) < f (0) = 0 for all
x ∈ I \ {0}, which means that $(1 + x)^{\alpha} < 1 + \alpha x$ for all x ∈ I \ {0}.

2. If α ∉ [0, 1] then f′(x) > 0 for all x ∈ (0, ∞) and f′(x) < 0 for all x ∈ (−1, 0), meaning
that 0 is a global minimum point for f . This implies that f (x) > f (0) = 0 for all x ∈ I \ {0},
which means that $(1 + x)^{\alpha} > 1 + \alpha x$ for all x ∈ I \ {0}.

Checking that equality holds for x = 0 or for α ∈ {0, 1} ends the proof.
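
The two cases of the proof can be spot-checked numerically. This is only a sketch: the helper `bernoulli_gap`, the sample grid, and the chosen exponents are assumptions made for illustration and prove nothing beyond the sampled points.

```python
def bernoulli_gap(x, r):
    """Value of (1 + x)^r - (1 + r*x); its sign is what the proof determines."""
    return (1 + x) ** r - (1 + r * x)

# Sample points in the interval (-1, infinity); the grid is an arbitrary choice.
xs = [k / 10 for k in range(-9, 31)]

# Exponents outside [0, 1]: the gap should be nonnegative (case 2).
gaps_outside = [bernoulli_gap(x, r) for x in xs for r in (1.5, 2, 3.7, -0.5, -2)]

# Exponents strictly between 0 and 1: the gap should be nonpositive (case 1).
gaps_inside = [bernoulli_gap(x, a) for x in xs for a in (0.1, 0.5, 0.9)]
```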

Version: 3 Owner: danielm Author(s): danielm

Chapter 391

26E35 – Nonstandard analysis

391.1 hyperreal

An ultrafilter F on a set I is called nonprincipal if no finite subsets of I are in F.

Fix once and for all a nonprincipal ultrafilter F on the set N of natural numbers. Let ∼ be
the equivalence relation on the set $\mathbb{R}^{\mathbb{N}}$ of sequences of real numbers given by

{an } ∼ {bn } ⇔ {n ∈ N | an = bn } ∈ F

Let ∗R be the set of equivalence classes of $\mathbb{R}^{\mathbb{N}}$ under the equivalence relation ∼. The set ∗R
is called the set of hyperreals. It is a field under coordinatewise addition and multiplication:

{an } + {bn } = {an + bn }
{an } · {bn } = {an · bn }

The field ∗ R is an ordered field under the ordering relation

{an } ≤ {bn } ⇔ {n ∈ N | an ≤ bn } ∈ F

The real numbers embed into ∗ R by the map sending the real number x ∈ R to the equivalence
class of the constant sequence given by xn := x for all n. In what follows, we adopt the
convention of treating R as a subset of ∗ R under this embedding.

A hyperreal x ∈ ∗ R is:

• limited if a < x < b for some real numbers a, b ∈ R

• positive unlimited if x > a for all real numbers a ∈ R

• negative unlimited if x < a for all real numbers a ∈ R

• unlimited if it is either positive unlimited or negative unlimited

• positive infinitesimal if 0 < x < a for all positive real numbers a ∈ R+

• negative infinitesimal if a < x < 0 for all negative real numbers a ∈ R−

• infinitesimal if it is either positive infinitesimal or negative infinitesimal

For any subset A of R, the set ∗ A is defined to be the subset of ∗ R consisting of equivalence
classes of sequences {an } such that

{n ∈ N | an ∈ A} ∈ F.

The sets ∗ N, ∗ Z, and ∗ Q are called hypernaturals, hyperintegers, and hyperrationals, respec-
tively. An element of ∗ N is also sometimes called hyperfinite.

Version: 1 Owner: djao Author(s): djao

391.2 e is not a quadratic irrational

Looking at the Taylor series for $e^x$, we see that

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

This converges for every x ∈ R, so $e = \sum_{k=0}^{\infty} \frac{1}{k!}$ and $e^{-1} = \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!}$. Arguing by
contradiction, assume $ae^2 + be + c = 0$ for integers a, b and c, not all zero. That is the same as
$ae + b + ce^{-1} = 0$.

Fix n > |a| + |c|, then a, c | n! and ∀k ≤ n, k! | n! . Consider

$$0 = n!\,(ae + b + ce^{-1}) = a\,n! \sum_{k=0}^{\infty} \frac{1}{k!} + b\,n! + c\,n! \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!}$$

$$= b\,n! + \sum_{k=0}^{n} (a + c(-1)^k) \frac{n!}{k!} + \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!}.$$

Since k! | n! for k ≤ n, the first two terms are integers. So the third term should be an
integer. However,

$$\left| \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!} \right| \le (|a| + |c|) \sum_{k=n+1}^{\infty} \frac{n!}{k!}$$

$$= (|a| + |c|) \sum_{k=n+1}^{\infty} \frac{1}{(n+1)(n+2) \cdots k}$$

$$\le (|a| + |c|) \sum_{k=n+1}^{\infty} (n+1)^{n-k}$$

$$= (|a| + |c|) \sum_{t=1}^{\infty} (n+1)^{-t}$$

$$= (|a| + |c|) \frac{1}{n}$$

is less than 1 by our assumption that n > |a| + |c|. Since there is only one integer which is
less than 1 in absolute value, this means that $\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{1}{k!} = 0$ for every sufficiently
large n, which is not the case because

$$\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{1}{k!} - \sum_{k=n+2}^{\infty} (a + c(-1)^k) \frac{1}{k!} = (a + c(-1)^{n+1}) \frac{1}{(n+1)!}$$

is not identically zero (if a = c = 0, then b = 0 as well, which is excluded). The contradiction completes the proof.

Version: 6 Owner: thedagit Author(s): bbukh, thedagit

391.3 zero of a function

Definition Suppose X is a set, and suppose f is a complex-valued function f : X → C.
Then a zero of f is an element x ∈ X such that f (x) = 0. The zero set of f is the set

Z(f ) = {x ∈ X | f (x) = 0}.

Remark

When X is a “simple” space, such as R or C, a zero is also called a root. However, in pure
mathematics, and especially if Z(f ) is infinite, it seems to be customary to talk of zeroes and
the zero set instead of roots.

Examples

• Suppose p is a polynomial p : C → C of degree n ≥ 1. Then p has at most n zeroes.
That is, |Z(p)| ≤ n.

• If f and g are functions f : X → C and g : X → C, then

Z(f g) = Z(f ) ∪ Z(g),
Z(f g) ⊃ Z(f ),

where f g is the function x ↦ f (x)g(x).

• If X is a topological space and f : X → C is a function, then

$$\operatorname{supp} f = \overline{Z(f)^{\complement}},$$

that is, the support of f is the closure of the complement of Z(f ).

Further, if f is continuous, then Z(f ) is closed in X (assuming that C is given the
usual topology of the complex plane where {0} is a closed set).

Version: 21 Owner: mathcam Author(s): matte, yark, say 10, apmxi

Chapter 392

28-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

392.1 extended real numbers

The extended real numbers are the real numbers together with +∞ (or simply ∞) and
−∞. This set is usually denoted by R̄ or [−∞, ∞] [1], and the elements +∞ and −∞ are
called plus infinity and minus infinity, respectively. Following [1], let us next extend the order
relation <, the addition and multiplication operations, and the absolute value from R to
R̄. In other words, let us define how these operations should behave when some of their
arguments are ∞ or −∞.

Order on R̄

The order relation on R extends to R̄ by defining that for any x ∈ R, we have

−∞ < x,
x < ∞,

and that −∞ < ∞.

Addition

For any real number x, we define

x + (±∞) = (±∞) + x = ±∞,

and for +∞ and −∞, we define

(±∞) + (±∞) = ±∞.

It should be pointed out that sums like (+∞) + (−∞) are left undefined.

Multiplication

If x is a positive real number, then

x · (±∞) = (±∞) · x = ±∞.

Similarly, if x is a negative real number, then

x · (±∞) = (±∞) · x = ∓∞.

Furthermore, for ∞ and −∞, we define

(+∞) · (+∞) = (−∞) · (−∞) = +∞,
(+∞) · (−∞) = (−∞) · (+∞) = −∞.

In many areas of mathematics, products like 0 · ∞ are left undefined. However, a special
case is measure theory, where it is convenient to define [1]

0 · (±∞) = (±∞) · 0 = 0.

Absolute value

For ∞ and −∞, the absolute value is defined as

| ± ∞| = +∞.

Examples

1. By taking x = −1 in the product rule, we obtain the relations

(−1) · (±∞) = ∓∞.

REFERENCES
1. D.L. Cohn, Measure Theory, Birkhäuser, 1980.

Version: 1 Owner: matte Author(s): matte

Chapter 393

28-XX – Measure and integration

393.1 Riemann integral

Suppose there is a function f : D → R where D, R ⊆ ℝ and that there is a closed interval I =
[a, b] such that I ⊆ D. For any finite set of points $\{x_0, x_1, x_2, \ldots, x_n\}$ such that $a = x_0 < x_1 <
x_2 < \cdots < x_n = b$, there is a corresponding partition $P = \{[x_0, x_1), [x_1, x_2), \ldots, [x_{n-1}, x_n]\}$ of
I.

Let C(ε) be the set of all partitions of I with $\max(x_{i+1} - x_i) < \varepsilon$. Then let $S^*(\varepsilon)$ be the
infimum of the set of upper Riemann sums with each partition in C(ε), and let $S_*(\varepsilon)$ be the
supremum of the set of lower Riemann sums with each partition in C(ε). If $\varepsilon_1 < \varepsilon_2$, then
$C(\varepsilon_1) \subset C(\varepsilon_2)$, so $S^*(\varepsilon)$ and $S_*(\varepsilon)$ are monotone in ε, and (for bounded f ) the limits
$S^* = \lim_{\varepsilon \to 0} S^*(\varepsilon)$ and $S_* = \lim_{\varepsilon \to 0} S_*(\varepsilon)$ exist. If $S^* = S_*$, then f is
Riemann-integrable over I, and the Riemann integral of f over I is defined by
$$\int_a^b f(x)\,dx = S^* = S_*.$$
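
To make the upper and lower sums concrete, here is a minimal sketch for a nondecreasing function, where the supremum and infimum on each subinterval sit at the right and left endpoints. It uses only uniform partitions, which is a simplification of the definition above (the definition ranges over all partitions of mesh below ε); the function name `riemann_bounds` is a hypothetical choice.

```python
def riemann_bounds(f, a, b, n):
    """Lower and upper Riemann sums of a nondecreasing f on a uniform partition.

    For nondecreasing f the infimum on each piece is attained at its left
    endpoint and the supremum at its right endpoint.
    """
    h = (b - a) / n
    lower = sum(f(a + i * h) * h for i in range(n))
    upper = sum(f(a + (i + 1) * h) * h for i in range(n))
    return lower, upper

# Example: f(x) = x^2 on [0, 1], whose integral is 1/3.
n = 1000
lower, upper = riemann_bounds(lambda x: x * x, 0.0, 1.0, n)
gap = upper - lower  # equals (f(b) - f(a)) * (b - a) / n for monotone f
```

The gap between the two sums shrinks like 1/n here, so both limits agree and the function is Riemann-integrable on [0, 1].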

Version: 4 Owner: bbukh Author(s): bbukh, vampyr

393.2 martingale

Let ν be a probability measure on Cantor space C, and let s ∈ [0, ∞).

1. A ν-s-supergale is a function d : {0, 1}∗ → [0, ∞) that satisfies the condition
$$d(w)\,\nu(w)^s \ge d(w0)\,\nu(w0)^s + d(w1)\,\nu(w1)^s \qquad (393.2.1)$$
for all w ∈ {0, 1}∗.
2. A ν-s-gale is a ν-s-supergale that satisfies the condition with equality for all w ∈ {0, 1}∗.

3. A ν-supermartingale is a ν-1-supergale.

4. A ν-martingale is a ν-1-gale.

5. An s-supergale is a µ-s-supergale, where µ is the uniform probability measure.

6. An s-gale is a µ-s-gale.

7. A supermartingale is a 1-supergale.

8. A martingale is a 1-gale.

Put in another way, a martingale is a function d : {0, 1}∗ → [0, ∞) such that, for all
w ∈ {0, 1}∗, d(w) = (d(w0) + d(w1))/2.

Let d be a ν-s-supergale, where ν is a probability measure on C and s ∈ [0, ∞). We say that
d succeeds on a sequence S ∈ C if

$$\limsup_{n \to \infty} d(S[0..n-1]) = \infty.$$

The success set of d is $S^{\infty}[d] = \{S \in C \mid d \text{ succeeds on } S\}$. d succeeds on a language A ⊆
{0, 1}∗ if d succeeds on the characteristic sequence χA of A. We say that d succeeds strongly
on a sequence S ∈ C if

$$\liminf_{n \to \infty} d(S[0..n-1]) = \infty.$$

The strong success set of d is $S^{\infty}_{\mathrm{str}}[d] = \{S \in C \mid d \text{ succeeds strongly on } S\}$.

Intuitively, a supergale d is a betting strategy that bets on the next bit of a sequence when
the previous bits are known. s is the parameter that tunes the fairness of the betting. The
smaller s is, the less fair the betting is. If d succeeds on a sequence, then the bonus we can
get from applying d as the betting strategy on the sequence is unbounded. If d succeeds
strongly on a sequence, then the bonus goes to infinity.
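
The betting interpretation can be sketched in code. The strategy below is a hypothetical example, not taken from the text: it always stakes a fixed fraction of the current capital on the next bit being 1. With the uniform measure and s = 1 this gives a martingale in the sense of the averaging condition d(w) = (d(w0) + d(w1))/2.

```python
from itertools import product

def d(w, bet=0.5):
    """Capital after seeing the bit string w, starting from capital 1.

    At each step a fraction `bet` of the capital is staked on the next bit
    being '1'; the stake is doubled on a win and lost otherwise.
    """
    capital = 1.0
    for bit in w:
        capital *= (1 + bet) if bit == "1" else (1 - bet)
    return capital

def is_martingale_up_to(length):
    """Check d(w) = (d(w0) + d(w1)) / 2 for all strings w up to a given length."""
    for n in range(length + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if abs(d(w) - (d(w + "0") + d(w + "1")) / 2) > 1e-12:
                return False
    return True

# On the all-ones sequence the capital grows like 1.5^n, so d succeeds there.
capital_on_ones = [d("1" * n) for n in range(0, 21, 5)]
```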

Version: 10 Owner: xiaoyanggu Author(s): xiaoyanggu

Chapter 394

28A05 – Classes of sets (Borel fields,
σ-rings, etc.), measurable sets, Suslin
sets, analytic sets

394.1 Borel σ-algebra

For any topological space X, the Borel sigma algebra of X is the σ–algebra B generated by
the open sets of X. An element of B is called a Borel subset of X, or a Borel set.

Version: 5 Owner: djao Author(s): djao, rmilson

Chapter 395

28A10 – Real- or complex-valued set
functions

395.1 σ-finite

A measure space (Ω, B, µ) is σ-finite if the total space is the union of a finite or countable
family of sets of finite measure; i.e. if there exists a finite or countable set F ⊂ B such that
µ(A) < ∞ for each A ∈ F, and $\Omega = \bigcup_{A \in F} A$. In this case we also say that µ is a σ-finite
measure. If µ is not σ-finite, we say that it is σ-infinite.

Examples. Any finite measure space is σ-finite. A more interesting example is the Lebesgue measure
µ in $\mathbb{R}^n$: it is σ-finite but not finite. In fact
$$\mathbb{R}^n = \bigcup_{k \in \mathbb{N}} [-k, k]^n$$
($[-k, k]^n$ is a cube with center at 0 and side length 2k, and its measure is $(2k)^n$), but
$\mu(\mathbb{R}^n) = \infty$.

Version: 6 Owner: Koro Author(s): Koro, drummond

395.2 Argand diagram

An Argand diagram is the graphical representation of complex numbers written in polar
coordinates.

Argand is the name of Jean-Robert Argand, the Frenchman who is credited with the
geometric interpretation of the complex numbers.

Version: 3 Owner: drini Author(s): drini

395.3 Hahn-Kolmogorov theorem

Let A0 be an algebra of subsets of a set X. If a finitely additive measure µ0 : A0 → R ∪ {∞}
satisfies
$$\mu_0\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu_0(A_n)$$
for any disjoint family {An : n ∈ N} of elements of A0 such that $\bigcup_{n=1}^{\infty} A_n \in A_0$, then µ0
extends uniquely to a measure defined on the σ-algebra A generated by A0 ; i.e. there exists
a unique measure µ : A → R ∪ {∞} such that its restriction to A0 coincides with µ0 .

Version: 3 Owner: Koro Author(s): Koro

395.4 measure

Let (E, B(E)) be a measurable space. A measure on (E, B(E)) is a function µ : B(E) → R ∪ {∞}
with values in the extended real numbers such that:

1. µ(A) ≥ 0 for A ∈ B(E), with equality if A = ∅

2. $\mu\big(\bigcup_{i=0}^{\infty} A_i\big) = \sum_{i=0}^{\infty} \mu(A_i)$ for any sequence of disjoint sets Ai ∈ B(E).

The second property is called countable additivity. A finitely additive measure µ has the
same definition except that B(E) is only required to be an algebra and the second property
above is only required to hold for finite unions. Note the slight abuse of terminology: a
finitely additive measure is not necessarily a measure.

The triple (E, B, µ) is called a measure space. If µ(E) = 1, then it is called a probability
space, and the measure µ is called a probability measure.

Lebesgue measure on Rn is one important example of a measure.

Version: 8 Owner: djao Author(s): djao

395.5 outer measure

Definition [1, 2, 1] Let X be a set, and let P(X) be the power set of X. An outer measure
on X is a function µ∗ : P(X) → [0, ∞] satisfying the properties

1. µ∗ (∅) = 0.

2. If A ⊂ B are subsets in X, then µ∗ (A) ≤ µ∗ (B).

3. If {Ai } is a countable collection of subsets of X, then
$$\mu^*\Big(\bigcup_i A_i\Big) \le \sum_i \mu^*(A_i).$$

Here, we can make two remarks. First, from (1) and (2), it follows that µ∗ is a positive
function on P(X). Second, property (3) also holds for any finite collection of subsets since
we can always append an infinite sequence of empty sets to such a collection.

Examples

• [1, 2] On a set X, let us define µ∗ : P(X) → [0, ∞] as

$$\mu^*(E) = \begin{cases} 1 & \text{when } E \ne \emptyset, \\ 0 & \text{when } E = \emptyset. \end{cases}$$

Then µ∗ is an outer measure.

• [1] On an uncountable set X, let us define µ∗ : P(X) → [0, ∞] as

$$\mu^*(E) = \begin{cases} 1 & \text{when } E \text{ is uncountable}, \\ 0 & \text{when } E \text{ is countable}. \end{cases}$$

Then µ∗ is an outer measure.

Theorem [1, 2, 1] Let X be a set, and let F be a collection of subsets of X such that ∅ ∈ F
and X ∈ F. Further, let ρ : F → [0, ∞] be a mapping such that ρ(∅) = 0. If A ⊂ X, let

$$\mu^*(A) = \inf \sum_{i=1}^{\infty} \rho(F_i),$$

where the infimum is taken over all collections $\{F_i\}_{i=1}^{\infty} \subset F$ such that $A \subset \bigcup_{i=1}^{\infty} F_i$. Then
µ∗ : P(X) → [0, ∞] is an outer measure.

REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.

Version: 1 Owner: mathcam Author(s): matte

395.6 properties for measure

Theorem [1, 1, 3, 2] Let (E, B, µ) be a measure space, i.e., let E be a set, let B be a
σ-algebra of sets in E, and let µ be a measure on B. Then the following properties hold:

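1. Monotonicity: If A, B ∈ B, and A ⊂ B, then µ(A) ≤ µ(B).

2. If A, B in B, A ⊂ B, and µ(A) < ∞, then
$$\mu(B \setminus A) = \mu(B) - \mu(A).$$

3. For any A, B in B, we have
$$\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B).$$

4. Countable subadditivity: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B, then
$$\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) \le \sum_{i=1}^{\infty} \mu(A_i).$$

5. Continuity from below: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B such that $A_i \subset A_{i+1}$
for all i, then
$$\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \lim_{i \to \infty} \mu(A_i).$$

6. Continuity from above: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B such that $\mu(A_1) < \infty$,
and $A_i \supset A_{i+1}$ for all i, then
$$\mu\Big(\bigcap_{i=1}^{\infty} A_i\Big) = \lim_{i \to \infty} \mu(A_i).$$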

Remarks In (2), the assumption µ(A) < ∞ assures that the right hand side is always
well defined, i.e., not of the form ∞ − ∞. Without the assumption we can prove that
µ(B) = µ(A) + µ(B \ A) (see below). In (3), it is tempting to move the term µ(A ∩ B) to
the other side for aesthetic reasons. However, this is only possible if the term is finite.

Proof. For (1), suppose A ⊂ B. We can then write B as the disjoint union B = A ∪ (B \ A),
whence
$$\mu(B) = \mu(A \cup (B \setminus A)) = \mu(A) + \mu(B \setminus A).$$

Since µ(B \ A) ≥ 0, the claim follows. Property (2) follows from the above equation; since
µ(A) < ∞, we can subtract this quantity from both sides. For property (3), we can write
A ∪ B = A ∪ (B \ A), whence
$$\mu(A \cup B) = \mu(A) + \mu(B \setminus A) \le \mu(A) + \mu(B).$$

If µ(A ∪ B) is infinite, the last inequality must be an equality, and at least one of µ(A) and µ(B) must
be infinite. Together with (1), we obtain that if any of the quantities µ(A), µ(B), µ(A ∪ B)
or µ(A ∩ B) is infinite, then both sides of (3) are infinite, whence the claim clearly holds. We
can therefore without loss of generality assume that all quantities are finite. From A ∪ B =
B ∪ (A \ B), we have
$$\mu(A \cup B) = \mu(B) + \mu(A \setminus B)$$
and thus
$$2\mu(A \cup B) = \mu(A) + \mu(B) + \mu(A \setminus B) + \mu(B \setminus A).$$
For the last two terms we have
$$\mu(A \setminus B) + \mu(B \setminus A) = \mu((A \setminus B) \cup (B \setminus A)) = \mu((A \cup B) \setminus (A \cap B)) = \mu(A \cup B) - \mu(A \cap B),$$
where, in the second equality we have used properties of the symmetric set difference, and
the last equality follows from property (2). This completes the proof of property (3). For
property (4), let us define the sequence $\{D_i\}_{i=1}^{\infty}$ as
$$D_1 = A_1, \qquad D_i = A_i \setminus \bigcup_{k=1}^{i-1} A_k.$$
Now $D_i \cap D_j = \emptyset$ for i < j, so {Di } is a sequence of disjoint sets. Since $\bigcup_{i=1}^{\infty} D_i = \bigcup_{i=1}^{\infty} A_i$,
and since Di ⊂ Ai , we have
$$\mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \mu\Big(\bigcup_{i=1}^{\infty} D_i\Big) = \sum_{i=1}^{\infty} \mu(D_i) \le \sum_{i=1}^{\infty} \mu(A_i),$$

and property (4) follows.
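
The disjointification step used for property (4) is easy to sketch for finite sets, with the counting measure (the number of elements) standing in for µ. The function name `disjointify` and the sample sets are hypothetical choices for this illustration.

```python
def disjointify(sets):
    """Return D_1, D_2, ... with D_i = A_i minus the union of the earlier A_k."""
    seen = set()
    result = []
    for a in sets:
        result.append(set(a) - seen)  # remove everything already covered
        seen |= set(a)
    return result

# Illustrative overlapping sets (assumed data).
A = [{1, 2, 3}, {2, 3, 4}, {4, 5}, {1, 5, 6}]
D = disjointify(A)

union_A = set().union(*A)
union_D = set().union(*D)

# Counting measure: the measure of a finite set is its number of elements.
measure_of_union = len(union_A)
sum_of_D = sum(len(s) for s in D)
sum_of_A = sum(len(s) for s in A)
```

The pieces D_i are pairwise disjoint and have the same union as the A_i, so their sizes add up exactly to the size of the union, which is at most the sum of the sizes of the A_i.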

TODO: proofs for (5)-(6).

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
2. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
3. D.L. Cohn, Measure Theory, Birkhäuser, 1980.
4. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.

Version: 2 Owner: matte Author(s): matte

Chapter 396

28A12 – Contents, measures, outer
measures, capacities

396.1 Hahn decomposition theorem

Let µ be a signed measure in the measurable space (Ω, S). There are two measurable sets A
and B such that:

1. A ∪ B = Ω and A ∩ B = ∅;

2. µ(E) ≥ 0 for each E ∈ S such that E ⊂ A;

3. µ(E) ≤ 0 for each E ∈ S such that E ⊂ B.

The pair (A, B) is called a Hahn decomposition for µ. This decomposition is not unique,
but any other such decomposition (A′, B′) satisfies µ(A′ △ A) = µ(B △ B′) = 0 (where △
denotes the symmetric difference), so the two decompositions differ in a set of measure 0.

Version: 6 Owner: Koro Author(s): Koro

396.2 Jordan decomposition

Let (Ω, S, µ) be a signed measure space, and let (A, B) be a Hahn decomposition for µ. We
define $\mu^+$ and $\mu^-$ by
$$\mu^+(E) = \mu(A \cap E) \quad \text{and} \quad \mu^-(E) = -\mu(B \cap E).$$

This definition is easily shown to be independent of the chosen Hahn decomposition.

It is clear that $\mu^+$ is a positive measure, and it is called the positive variation of µ. On
the other hand, $\mu^-$ is a positive finite measure, called the negative variation of µ. The
measure $|\mu| = \mu^+ + \mu^-$ is called the total variation of µ.

Notice that µ = µ+ − µ− . This decomposition of µ into its positive and negative parts is
called the Jordan decomposition of µ.
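
On a finite set, a signed measure is determined by a signed weight at each point, and a Hahn decomposition simply separates the points of nonnegative weight from those of negative weight. The following sketch uses hypothetical weights and function names to illustrate the Jordan decomposition in that setting.

```python
from itertools import combinations

# A signed measure on a five-point set, given by point weights (assumed data).
weights = {"a": 3, "b": -2, "c": 0, "d": 5, "e": -4}

def mu(E):
    """Signed measure of a subset E: the sum of its point weights."""
    return sum(weights[x] for x in E)

# A Hahn decomposition: A carries the nonnegative weights, B the negative ones.
A = {x for x, w in weights.items() if w >= 0}
B = set(weights) - A

def mu_plus(E):
    return mu(A & set(E))

def mu_minus(E):
    return -mu(B & set(E))

def subsets(points):
    """All subsets of a finite collection of points."""
    points = list(points)
    for k in range(len(points) + 1):
        for combo in combinations(points, k):
            yield set(combo)

# Check mu = mu_plus - mu_minus with both parts nonnegative, on every subset.
jordan_holds = all(
    mu(E) == mu_plus(E) - mu_minus(E) and mu_plus(E) >= 0 and mu_minus(E) >= 0
    for E in subsets(weights)
)
total_variation = mu_plus(weights) + mu_minus(weights)
```

The point "c" has weight zero, so it could be moved from A to B without changing either variation, which illustrates the non-uniqueness of the Hahn decomposition up to sets of measure 0.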

Version: 6 Owner: Koro Author(s): Koro

396.3 Lebesgue decomposition theorem

Let µ and ν be two σ-finite signed measures in the measurable space (Ω, S). There exist two
σ-finite signed measures ν0 and ν1 such that:

1. ν = ν0 + ν1 ;

2. ν0 ≪ µ (i.e. ν0 is absolutely continuous with respect to µ);

3. ν1 ⊥ µ (i.e. ν1 and µ are singular).

These two measures are uniquely determined.

Version: 5 Owner: Koro Author(s): Koro

396.4 Lebesgue outer measure

Let S be some arbitrary subset of R. Let L(I) be the traditional definition of the length of
an interval I ⊆ R. If I = (a, b), then L(I) = b − a. Let M be the set of all sums

$$\sum_{A \in C} L(A)$$

taken over countable collections C of open intervals that cover S (that is, $S \subseteq \bigcup C$). Then the
Lebesgue outer measure of S is defined by:

m∗ (S) = inf(M)

Note that (R, P(R), m∗ ) is “almost” a measure space. In particular:

• Lebesgue outer measure is defined for any subset of R (and P(R) is a σ-algebra).

• m∗ A ≥ 0 for any A ⊆ R, and m∗ ∅ = 0.

• If A and B are disjoint sets, then m∗ (A ∪ B) ≤ m∗ A + m∗ B. More generally, if ⟨Ai ⟩
is a countable sequence of disjoint sets, then $m^*\big(\bigcup_i A_i\big) \le \sum_i m^* A_i$. This property is
known as countable subadditivity and is weaker than countable additivity. In fact,
m∗ is not countably additive.

Lebesgue outer measure has other nice properties:

• The outer measure of an interval is its length: m∗ (a, b) = b − a.

• m∗ is translation invariant. That is, if we define A + y to be the set {x + y : x ∈ A},
we have m∗ A = m∗ (A + y) for any y ∈ R.
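
The definition as an infimum over interval covers can be illustrated with the standard argument that a countable set has outer measure zero: cover the k-th point by an open interval of length ε/2^k, so the total length stays below ε. The sketch below does this for a finite list of rationals using exact arithmetic; the helper names `cover` and `total_length` and the sample points are hypothetical.

```python
from fractions import Fraction

def cover(points, eps):
    """Open intervals (a, b) around each point; the k-th has length eps / 2^k."""
    intervals = []
    for k, p in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        intervals.append((p - half, p + half))
    return intervals

def total_length(intervals):
    """Sum of the lengths L(I) = b - a of the covering intervals."""
    return sum(b - a for a, b in intervals)

# Some rationals in [0, 1] (repeats are harmless for the argument).
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
eps = Fraction(1, 100)

intervals = cover(points, eps)
length = total_length(intervals)
```

Since such a cover exists for every ε > 0, the infimum defining m∗ is 0 for this set; the same construction works for any countable set.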

Version: 4 Owner: vampyr Author(s): vampyr

396.5 absolutely continuous

Given two signed measures µ and ν on the same measurable space (Ω, S), we say that ν is
absolutely continuous with respect to µ if, for each A ∈ S such that |µ|(A) = 0, it holds that
ν(A) = 0. This is usually denoted by ν ≪ µ.

Remarks.

If $(\nu^+, \nu^-)$ is the Jordan decomposition of ν, the following propositions are equivalent:

1. ν ≪ µ;

2. $\nu^+ \ll \mu$ and $\nu^- \ll \mu$;

3. |ν| ≪ |µ|.

If ν is a finite signed measure and ν ≪ µ, the following useful property holds: for each ε > 0,
there is a δ > 0 such that |ν|(E) < ε whenever |µ|(E) < δ.

Version: 5 Owner: Koro Author(s): Koro

396.6 counting measure

Let (X, B) be a measurable space. We call a measure µ counting measure on X if, for every A ∈ B,

$$\mu(A) = \begin{cases} n & \text{if } A \text{ has exactly } n \text{ elements}, \\ \infty & \text{otherwise}. \end{cases}$$

Generally, counting measure is applied on N or Z.

Version: 2 Owner: mathwizard Author(s): mathwizard, drummond

396.7 measurable set

Let (X, F, µ) be a measure space with a sigma algebra F. A measurable set with respect
to µ in X is an element of F. These are also sometimes called µ-measurable sets. Any subset
Y ⊂ X with Y ∉ F is said to be nonmeasurable with respect to µ, or non-µ-measurable.

Version: 2 Owner: mathcam Author(s): mathcam, drummond

396.8 outer regular

Let X be a locally compact Hausdorff topological space with Borel σ–algebra B, and suppose
µ is a measure on (X, B). For any Borel set B ∈ B, the measure µ is said to be outer regular
on B if
µ(B) = inf {µ(U) | U ⊃ B, U open}.
We say µ is inner regular on B if

µ(B) = sup {µ(K) | K ⊂ B, K compact}.

Version: 1 Owner: djao Author(s): djao

396.9 signed measure

A signed measure on a measurable space (Ω, S) is a function µ : S → R ∪ {+∞} which is
σ-additive and such that µ(∅) = 0.

Remarks.

1. The usual (positive) measure is a particular case of signed measure, in which |µ| = µ
(see Jordan decomposition.)

2. Notice that the value −∞ is not allowed.

3. An important example of signed measures arises from the usual measures in the following
way: Let (Ω, S, µ) be a measure space, and let f be a (real valued) measurable function
such that
$$\int_{\{x \in \Omega : f(x) < 0\}} |f|\,d\mu < \infty.$$
Then a signed measure is defined by

$$A \mapsto \int_A f\,d\mu.$$

Version: 4 Owner: Koro Author(s): Koro

396.10 singular measure

Two measures µ and ν in a measurable space (Ω, A) are called singular if there exist two
disjoint sets A and B in A such that A ∪ B = Ω and µ(B) = ν(A) = 0. This is denoted by
µ ⊥ ν.

Version: 4 Owner: Koro Author(s): Koro

Chapter 397

28A15 – Abstract differentiation
theory, differentiation of set functions

397.1 Hardy-Littlewood maximal theorem

There is a constant K > 0 such that for each Lebesgue integrable function $f \in L^1(\mathbb{R}^n)$, and
each t > 0,
$$m(\{x : Mf(x) > t\}) \le \frac{K}{t} \|f\|_1 = \frac{K}{t} \int_{\mathbb{R}^n} |f(x)|\,dx,$$
where Mf is the Hardy-Littlewood maximal function of f .

Remark. The theorem holds for the constant $K = 3^n$.

Version: 1 Owner: Koro Author(s): Koro

397.2 Lebesgue differentiation theorem

Let f be a locally integrable function on $\mathbb{R}^n$ with Lebesgue measure m, i.e. $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.
Lebesgue’s differentiation theorem basically says that for almost every x, the averages
$$\frac{1}{m(Q)} \int_Q |f(y) - f(x)|\,dy$$
converge to 0 when Q is a cube containing x and m(Q) → 0.

Formally, this means that there is a set $N \subset \mathbb{R}^n$ with m(N) = 0, such that for every x ∉ N
and ε > 0, there exists δ > 0 such that, for each cube Q with x ∈ Q and m(Q) < δ, we have
$$\frac{1}{m(Q)} \int_Q |f(y) - f(x)|\,dy < \varepsilon.$$

For n = 1, this can be restated as an analogue of the fundamental theorem of calculus for
Lebesgue integrals. Given $x_0 \in \mathbb{R}$,
$$\frac{d}{dx} \int_{x_0}^{x} f(t)\,dt = f(x)$$
for almost every x.

Version: 6 Owner: Koro Author(s): Koro

397.3 Radon-Nikodym theorem

Let µ and ν be two σ-finite measures on the same measurable space (Ω, S), such that ν ≪ µ
(i.e. ν is absolutely continuous with respect to µ). Then there exists a measurable function
f , which is nonnegative and finite, such that for each A ∈ S,
$$\nu(A) = \int_A f\,d\mu.$$
This function is unique (any other function satisfying these conditions is equal to f µ-almost
everywhere), and it is called the Radon-Nikodym derivative of ν with respect to µ,
denoted by $f = \frac{d\nu}{d\mu}$.

Remark. The theorem also holds if ν is a signed measure. Even if ν is not σ-finite the
theorem holds, with the exception that f is not necessarily finite.

Some properties of the Radon-Nikodym derivative

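Let ν, µ, and λ be σ-finite measures in (Ω, S).

1. If ν ≪ λ and µ ≪ λ, then
$$\frac{d(\nu + \mu)}{d\lambda} = \frac{d\nu}{d\lambda} + \frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere};$$
2. If ν ≪ µ ≪ λ, then
$$\frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu} \frac{d\mu}{d\lambda} \quad \lambda\text{-almost everywhere};$$
3. If µ ≪ λ and g is a µ-integrable function, then
$$\int_{\Omega} g\,d\mu = \int_{\Omega} g\,\frac{d\mu}{d\lambda}\,d\lambda;$$
4. If µ ≪ ν and ν ≪ µ, then
$$\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1}.$$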
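
On a finite set where every point has positive mass, the Radon-Nikodym derivative is just the pointwise ratio of point masses, which makes the properties above easy to check by hand or by machine. The sketch below assumes three hypothetical measures given by exact point masses; the names `derivative` and `integrate` are illustrative.

```python
from fractions import Fraction

# Three measures on a three-point set, given by point masses (assumed data).
points = ["a", "b", "c"]
lam = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(3)}
mu = {"a": Fraction(1, 2), "b": Fraction(4), "c": Fraction(3, 4)}
nu = {"a": Fraction(5), "b": Fraction(1, 3), "c": Fraction(2)}

def derivative(top, bottom):
    """Pointwise ratio of point masses, playing the role of d(top)/d(bottom)."""
    return {x: top[x] / bottom[x] for x in points}

def integrate(g, measure, subset):
    """Integral of g over `subset` with respect to `measure` (a finite sum)."""
    return sum(g[x] * measure[x] for x in subset)

dnu_dmu = derivative(nu, mu)
dmu_dlam = derivative(mu, lam)
dnu_dlam = derivative(nu, lam)
dmu_dnu = derivative(mu, nu)

# nu(A) recovered as the integral of dnu/dmu over A with respect to mu.
A = ["a", "c"]
nu_of_A = integrate(dnu_dmu, mu, A)
```

Because all point masses are positive here, every measure is absolutely continuous with respect to every other, so the chain rule and the inverse rule both apply at each point.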

Version: 5 Owner: Koro Author(s): Koro

397.4 integral depending on a parameter

Suppose (E, B, µ) is a measure space, suppose I is an open interval in R, and suppose we
are given a function
f : E × I → R̄,
(x, t) ↦ f (x, t),
where R̄ is the extended real numbers. Further, suppose that for each t ∈ I, the mapping
x ↦ f (x, t) is in $L^1(E)$. (Here, $L^1(E)$ is the set of measurable functions f : E → R̄ with
finite Lebesgue integral; $\int_E |f(x)|\,d\mu < \infty$.) Then we can define a function F : I → R by
$$F(t) = \int_E f(x, t)\,d\mu.$$

Continuity of F

Let t0 ∈ I. In addition to the above, suppose:

1. For almost all x ∈ E, the mapping t ↦ f (x, t) is continuous at t = t0 .
2. There is a function $g \in L^1(E)$ such that for almost all x ∈ E,
|f (x, t)| ≤ g(x)
for all t ∈ I.

Then F is continuous at t0 .

Differentiation under the integral sign

Suppose that the assumptions given in the introduction hold, and suppose:

1. For almost all x ∈ E, the mapping t ↦ f (x, t) is differentiable for all t ∈ I.
2. There is a function $g \in L^1(E)$ such that for almost all x ∈ E,
$$\left| \frac{d}{dt} f(x, t) \right| \le g(x)$$
for all t ∈ I.

Then F is differentiable on I, $\frac{d}{dt} f(\cdot, t)$ is in $L^1(E)$, and for all t ∈ I,
$$\frac{d}{dt} F(t) = \int_E \frac{d}{dt} f(x, t)\,d\mu. \qquad (397.4.1)$$
dt dt

The above results can be found in [1, 1].
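
Formula (397.4.1) can be checked numerically on a simple example. The sketch below assumes E = [0, 1] with Lebesgue measure and the hypothetical integrand f(x, t) = exp(−t x²), whose t-derivative is bounded in absolute value by g(x) = x² for t > 0. Integrals are approximated by a midpoint rule, so the comparison is only approximate.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

def F(t):
    """F(t) = integral over [0, 1] of exp(-t * x^2) dx."""
    return midpoint_integral(lambda x: math.exp(-t * x * x), 0.0, 1.0)

def integral_of_derivative(t):
    """Integral over [0, 1] of d/dt exp(-t * x^2) = -x^2 * exp(-t * x^2)."""
    return midpoint_integral(lambda x: -x * x * math.exp(-t * x * x), 0.0, 1.0)

t0 = 1.0
step = 1e-4
# Central finite difference approximating F'(t0).
finite_difference = (F(t0 + step) - F(t0 - step)) / (2 * step)
gap = abs(finite_difference - integral_of_derivative(t0))
```

The two numbers agree up to discretization error, as the theorem predicts when a dominating function g exists.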

REFERENCES
1. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
2. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press,
1990.

Version: 1 Owner: matte Author(s): matte

Chapter 398

28A20 – Measurable and
nonmeasurable functions, sequences of
measurable functions, modes of
convergence

398.1 Egorov’s theorem

Let (X, S, µ) be a measure space, and let E be a subset of X of finite measure. If fn is a
sequence of measurable functions converging to f almost everywhere, then for each δ > 0
there exists a set Eδ such that µ(Eδ ) < δ and fn → f uniformly on E − Eδ .

Version: 2 Owner: Koro Author(s): Koro

398.2 Fatou’s lemma

If f1 , f2 , . . . is a sequence of nonnegative measurable functions in a measure space X, then

$$\int_X \liminf_{n \to \infty} f_n \le \liminf_{n \to \infty} \int_X f_n$$

Version: 3 Owner: Koro Author(s): Koro

398.3 Fatou-Lebesgue theorem

Let X be a measure space. If Φ is a measurable function with $\int_X \Phi < \infty$, and if f1 , f2 , . . .
is a sequence of measurable functions such that |fn | ≤ Φ for each n, then

$$g = \liminf_{n \to \infty} f_n \quad \text{and} \quad h = \limsup_{n \to \infty} f_n$$

are both integrable, and

$$-\infty < \int_X g \le \liminf_{n \to \infty} \int_X f_n \le \limsup_{n \to \infty} \int_X f_n \le \int_X h < \infty.$$

Version: 3 Owner: Koro Author(s): Koro

398.4 dominated convergence theorem

Let X be a measure space, and let Φ, f1 , f2 , . . . be measurable functions such that $\int_X \Phi < \infty$
and |fn | ≤ Φ for each n. If fn → f almost everywhere, then f is integrable and

$$\lim_{n \to \infty} \int_X f_n = \int_X f.$$

This theorem is a corollary of the Fatou-Lebesgue theorem.

A possible generalization is that if {fr : r ∈ R} is a family of measurable functions such that
|fr | ≤ |Φ| for each r ∈ R and fr → f as r → 0, then f is integrable and

$$\lim_{r \to 0} \int_X f_r = \int_X f.$$

Version: 8 Owner: Koro Author(s): Koro

398.5 measurable function

Let f : X → R̄ be a function defined on a measure space X. We say that f is measurable
if {x ∈ X | f (x) > a} is a measurable set for all a ∈ R.

Version: 5 Owner: vypertd Author(s): vypertd

398.6 monotone convergence theorem

Let X be a measure space, and let 0 ≤ f1 ≤ f2 ≤ · · · be a monotone increasing sequence
of nonnegative measurable functions. Let f be the function defined almost everywhere by
$f(x) = \lim_{n \to \infty} f_n(x)$. Then f is measurable, and

$$\lim_{n \to \infty} \int_X f_n = \int_X f.$$

Remark. This theorem is the first of several theorems which allow us to “exchange inte-
gration and limits”. It requires the use of the Lebesgue integral: with the Riemann integral,
we cannot even formulate the theorem, lacking, as we do, the concept of “almost every-
where”. For instance, the characteristic function of the rational numbers in [0, 1] is not
Riemann integrable, despite being the limit of an increasing sequence of Riemann integrable
functions.

Version: 5 Owner: Koro Author(s): Koro, ariels

398.7 proof of Egorov’s theorem

Let $E_{i,j} = \{x \in E : |f_j(x) - f(x)| < 1/i\}$. Since fn → f almost everywhere, there is a set
S with µ(S) = 0 such that, given i ∈ N and x ∈ E − S, there is m ∈ N such that j > m
implies $|f_j(x) - f(x)| < 1/i$. This can be expressed by
$$E - S \subset \bigcup_{m \in \mathbb{N}} \bigcap_{j > m} E_{i,j},$$
or, in other words,
$$\bigcap_{m \in \mathbb{N}} \bigcup_{j > m} (E - E_{i,j}) \subset S.$$
Since $\{\bigcup_{j > m} (E - E_{i,j})\}_{m \in \mathbb{N}}$ is a decreasing nested sequence of sets, each of which has finite
measure, and such that its intersection has measure 0, by continuity from above we know
that
$$\mu\Big(\bigcup_{j > m} (E - E_{i,j})\Big) \to 0 \quad \text{as } m \to \infty.$$

Therefore, for each i ∈ N, we can choose $m_i$ such that
$$\mu\Big(\bigcup_{j > m_i} (E - E_{i,j})\Big) < \frac{\delta}{2^i}.$$

Let
$$E_\delta = \bigcup_{i \in \mathbb{N}} \bigcup_{j > m_i} (E - E_{i,j}).$$

Then
$$\mu(E_\delta) \le \sum_{i=1}^{\infty} \mu\Big(\bigcup_{j > m_i} (E - E_{i,j})\Big) < \sum_{i=1}^{\infty} \frac{\delta}{2^i} = \delta.$$

We claim that fn → f uniformly on E − Eδ . In fact, given ε > 0, choose n such that 1/n < ε.
If x ∈ E − Eδ , we have
$$x \in \bigcap_{i \in \mathbb{N}} \bigcap_{j > m_i} E_{i,j},$$
which in particular implies that, if $j > m_n$, then $x \in E_{n,j}$; that is, $|f_j(x) - f(x)| < 1/n < \varepsilon$.
Hence, for each ε > 0 there is N (which is given by $m_n$ above) such that j > N implies
$|f_j(x) - f(x)| < \varepsilon$ for each x ∈ E − Eδ , as required. This completes the proof.

Version: 3 Owner: Koro Author(s): Koro

398.8 proof of Fatou’s lemma

Let $f(x) = \liminf_{n \to \infty} f_n(x)$ and let $g_n(x) = \inf_{k \ge n} f_k(x)$ so that we have

$$f(x) = \sup_n g_n(x).$$

As gn is an increasing sequence of measurable nonnegative functions we can apply the
monotone convergence theorem to obtain

$$\int_X f\,d\mu = \lim_{n \to \infty} \int_X g_n\,d\mu.$$

On the other hand, since gn ≤ fn , we conclude by observing

$$\lim_{n \to \infty} \int_X g_n\,d\mu = \liminf_{n \to \infty} \int_X g_n\,d\mu \le \liminf_{n \to \infty} \int_X f_n\,d\mu.$$

Version: 1 Owner: paolini Author(s): paolini

398.9 proof of Fatou-Lebesgue theorem

By Fatou’s lemma we have
$$\int_X g \le \liminf_{n \to \infty} \int_X f_n$$

and (recall that lim sup f = − lim inf(−f ))

$$\limsup_{n \to \infty} \int_X f_n \le \int_X h.$$

On the other hand by the properties of lim inf and lim sup we have

g ≥ −Φ, h ≤ Φ

and hence
$$\int_X g \ge \int_X -\Phi > -\infty, \qquad \int_X h \le \int_X \Phi < +\infty.$$

Version: 1 Owner: paolini Author(s): paolini

398.10 proof of dominated convergence theorem

It is not difficult to prove that f is measurable. In fact we can write

\[ f(x) = \sup_n \inf_{k\ge n} f_k(x) \]

and we know that measurable functions are closed under the sup and inf operations.

Consider the sequence gn(x) = 2Φ(x) − |f(x) − fn(x)|. Clearly the gn are nonnegative functions,
since |f − fn| ≤ 2Φ. So, applying Fatou’s lemma, we obtain

\begin{align*}
\lim_{n\to\infty} \int_X |f - f_n| \, d\mu &\le \limsup_{n\to\infty} \int_X |f - f_n| \, d\mu \\
&= -\liminf_{n\to\infty} \int_X -|f - f_n| \, d\mu \\
&= \int_X 2\Phi \, d\mu - \liminf_{n\to\infty} \int_X \bigl(2\Phi - |f - f_n|\bigr) \, d\mu \\
&\le \int_X 2\Phi \, d\mu - \int_X \bigl(2\Phi - \limsup_{n\to\infty} |f - f_n|\bigr) \, d\mu \\
&= \int_X 2\Phi \, d\mu - \int_X 2\Phi \, d\mu = 0.
\end{align*}

Version: 1 Owner: paolini Author(s): paolini

398.11 proof of monotone convergence theorem

It is enough to prove the following
Theorem 7. Let (X, µ) be a measure space and let fk : X → R ∪ {+∞} be a monotone increasing
sequence of nonnegative measurable functions (i.e. 0 ≤ f1 ≤ f2 ≤ . . .). Then f(x) = limk→∞ fk(x)
is measurable and
\[ \lim_{k\to\infty} \int_X f_k \, d\mu = \int_X f \, d\mu. \]

First of all by the monotonicity of the sequence we have

\[ f(x) = \sup_k f_k(x), \]

hence we know that f is measurable. Moreover, since fk ≤ f for all k, the monotonicity
of the integral immediately gives

\[ \sup_k \int_X f_k \, d\mu \le \int_X f \, d\mu. \]

For the reverse inequality, take any simple measurable function s such that 0 ≤ s ≤ f. Given also 0 < α < 1, define

\[ E_k = \{x \in X : f_k(x) \ge \alpha s(x)\}. \]

The sequence Ek is an increasing sequence of measurable sets. Moreover the union of all Ek
is the whole space X: if s(x) = 0 then x ∈ Ek for every k, and otherwise
limk→∞ fk(x) = f(x) ≥ s(x) > αs(x). Moreover it holds

\[ \int_X f_k \, d\mu \ge \int_{E_k} f_k \, d\mu \ge \alpha \int_{E_k} s \, d\mu. \]

Since s is a simple measurable function, it is easy to check that $E \mapsto \int_E s \, d\mu$ is a measure, so
$\int_{E_k} s \, d\mu \to \int_X s \, d\mu$ by continuity from below along the increasing sets Ek, and
hence
\[ \sup_k \int_X f_k \, d\mu \ge \alpha \int_X s \, d\mu. \]

But this last inequality holds for every α < 1 and for all simple measurable functions s with
0 ≤ s ≤ f. Hence by the definition of Lebesgue integral

\[ \sup_k \int_X f_k \, d\mu \ge \int_X f \, d\mu \]

which completes the proof.

Version: 1 Owner: paolini Author(s): paolini

Chapter 399

28A25 – Integration with respect to
measures and other set functions

399.1 L∞(X, dµ)

The L∞ space, L∞ (X, dµ), is a vector space consisting of equivalence classes of functions
f : X → C with norm given by

\[ \|f\|_\infty = \operatorname{ess\,sup} |f(t)|, \]

the essential supremum of |f|. Additionally, we require that $\|f\|_\infty < \infty$.

The equivalence classes of L∞ (X, dµ) are given by saying that f, g : X → C are equivalent
iff f and g differ on a set of µ measure zero.

Version: 3 Owner: ack Author(s): bbukh, ack, apmxi

399.2 Hardy-Littlewood maximal operator

The Hardy-Littlewood maximal operator in $\mathbb{R}^n$ is an operator defined on $L^1_{loc}(\mathbb{R}^n)$ (the
space of locally integrable functions in $\mathbb{R}^n$ with the Lebesgue measure) which maps each
locally integrable function f to another function Mf, defined for each $x \in \mathbb{R}^n$ by
\[ Mf(x) = \sup_Q \frac{1}{m(Q)} \int_Q |f(y)| \, dy, \]

where the supremum is taken over all cubes Q containing x. This function is lower semicontinuous
(and hence measurable), and it is called the Hardy-Littlewood maximal function of f .

The operator M is sublinear, which means that
M(af + bg) ≤ |a|Mf + |b|Mg
for each pair of locally integrable functions f, g and scalars a, b.

Version: 3 Owner: Koro Author(s): Koro

399.3 Lebesgue integral
The integral of a measurable function f : X → R ∪ {±∞} on a measure space (X, B, µ) is
written

\[ \int_X f \, d\mu \quad\text{or just}\quad \int f. \tag{399.3.1} \]

It is defined via the following steps:

• If f = $\chi_A$ is the characteristic function of a set A ∈ B, then set
\[ \int_X \chi_A \, d\mu := \mu(A). \tag{399.3.2} \]

• If f is a simple function (i.e. if f can be written as
\[ f = \sum_{k=1}^{n} c_k \chi_{A_k}, \qquad c_k \in \mathbb{R} \tag{399.3.3} \]
for some finite collection Ak ∈ B), then define
\[ \int_X f \, d\mu := \sum_{k=1}^{n} c_k \int_X \chi_{A_k} \, d\mu = \sum_{k=1}^{n} c_k \mu(A_k). \tag{399.3.4} \]

• If f is a nonnegative measurable function (possibly attaining the value ∞ at some
points), then we define
\[ \int_X f \, d\mu := \sup \Bigl\{ \int_X h \, d\mu : h \text{ is simple and } h(x) \le f(x) \text{ for all } x \in X \Bigr\}. \tag{399.3.5} \]

• For any measurable function f (possibly attaining the values ∞ or −∞ at some points),
write $f = f^+ - f^-$ where
\[ f^+ := \max(f, 0) \quad\text{and}\quad f^- := \max(-f, 0), \tag{399.3.6} \]
and define the integral of f as
\[ \int_X f \, d\mu := \int_X f^+ \, d\mu - \int_X f^- \, d\mu, \tag{399.3.7} \]
provided that $\int_X f^+ \, d\mu$ and $\int_X f^- \, d\mu$ are not both ∞.
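
The simple-function step and the split into positive and negative parts can be sketched as follows. This is only an illustration under stated assumptions: the sets A_k are taken to be pairwise disjoint intervals (a, b) measured by their length, and the names `integrate_simple`, `positive_part` and `negative_part` are hypothetical helpers, not notation from the entry.

```python
from fractions import Fraction as F

def length(interval):
    """Lebesgue measure of an interval given as a pair (a, b) with a <= b."""
    a, b = interval
    return b - a

def integrate_simple(terms):
    """Integral of sum_k c_k * chi_{A_k}, with terms given as pairs (c_k, A_k)."""
    return sum((c * length(A) for c, A in terms), F(0))

def positive_part(terms):
    """f+ = max(f, 0); valid term by term because the A_k are assumed disjoint."""
    return [(max(c, 0), A) for c, A in terms]

def negative_part(terms):
    """f- = max(-f, 0), again assuming the A_k are disjoint."""
    return [(max(-c, 0), A) for c, A in terms]

# f = 2 on (0, 1/2) and -3 on (1/2, 1)
f_terms = [(F(2), (F(0), F(1, 2))), (F(-3), (F(1, 2), F(1)))]
int_pos = integrate_simple(positive_part(f_terms))   # integral of f+
int_neg = integrate_simple(negative_part(f_terms))   # integral of f-
int_f = int_pos - int_neg
```

For this f the two routes agree: integrating the signed coefficients directly gives the same value as the difference of the integrals of f+ and f−.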

If µ is Lebesgue measure and X is any interval in Rn then the integral is called the Lebesgue
integral. If the Lebesgue integral of a function f on a set A exists, f is said to be Lebesgue
integrable. The Lebesgue integral equals the Riemann integral everywhere the latter is
defined; the advantage to the Lebesgue integral is that many Lebesgue-integrable functions
are not Riemann-integrable. For example, the Riemann integral of the characteristic function
of the rationals in [0, 1] is undefined, while the Lebesgue integral of this function is simply
the measure of the rationals in [0, 1], which is 0.

Version: 12 Owner: djao Author(s): djao, drummond

Chapter 400

28A60 – Measures on Boolean rings,
measure algebras

400.1 σ-algebra

Let X be a set. A σ-algebra is a collection M of subsets of X such that

• X∈M

• If A ∈ M then X − A ∈ M.

• If A1 , A2 , A3 , . . . is a countable subcollection of M, that is, Aj ∈ M for j = 1, 2, 3, . . .
(the subcollection can be finite) then the union of all of them is also in M:

\[ \bigcup_{j=1}^{\infty} A_j \in M. \]

Version: 3 Owner: drini Author(s): drini, apmxi

400.2 σ-algebra

Given a set E, a sigma algebra (or σ–algebra) in E is a collection B(E) of subsets of E such
that:

• ∅ ∈ B(E)

• Any countable union of elements of B(E) is in B(E)

• The complement of any element of B(E) in E is in B(E)

Given any collection C of subsets of E, the σ–algebra generated by C is defined to be
the smallest σ–algebra in E containing C.
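
On a finite set the generated σ–algebra can be computed by brute force, since countable unions reduce to finite ones. The following is a minimal sketch under that finiteness assumption; the function name `generated_sigma_algebra` is hypothetical.

```python
from itertools import combinations

def generated_sigma_algebra(universe, collection):
    """Smallest family containing `collection` that is closed under
    complement and union (enough for a sigma-algebra on a finite set)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            complement = universe - a
            if complement not in sets:
                sets.add(complement)
                changed = True
        for a, b in combinations(current, 2):
            union = a | b
            if union not in sets:
                sets.add(union)
                changed = True
    return sets

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

In this example the atoms are {1}, {2} and {3, 4}, so the generated family consists of all unions of these three atoms.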

Version: 5 Owner: djao Author(s): djao

400.3 algebra

Given a set E, an algebra in E is a collection B(E) of subsets of E such that:

• ∅ ∈ B(E)

• Any finite union of elements of B(E) is in B(E)

• The complement of any element of B(E) in E is in B(E)

Given any collection C of subsets of E, the algebra generated by C is defined to be the
smallest algebra in E containing C.

Version: 2 Owner: djao Author(s): djao

400.4 measurable set (for outer measure)

Definition [1, 2, 1] Let µ∗ be an outer measure on a set X. A set E ⊂ X is said to be
measurable, or µ∗ -measurable, if for all A ⊂ X, we have
\[ \mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^\complement). \tag{400.4.1} \]

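Remark If A, E ⊂ X, we have, from the properties of the outer measure,
\begin{align*}
\mu^*(A) &= \mu^*\bigl(A \cap (E \cup E^\complement)\bigr) \\
&= \mu^*\bigl((A \cap E) \cup (A \cap E^\complement)\bigr) \\
&\le \mu^*(A \cap E) + \mu^*(A \cap E^\complement).
\end{align*}

Hence equation (400.4.1) is equivalent to the inequality [1, 2, 1]
\[ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^\complement). \]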

Of course, this inequality is trivially satisfied if µ∗(A) = ∞. Thus a set E ⊂ X is
µ∗-measurable in X if and only if the above inequality holds for all A ⊂ X for which µ∗(A) < ∞
[1].

Theorem [Carathéodory’s theorem] [1, 2, 1] Suppose µ∗ is an outer measure on a set
X, and suppose M is the set of all µ∗ -measurable sets in X. Then M is a σ-algebra, and µ∗
restricted to M is a measure (on M).

Example Let µ∗ be an outer measure on a set X.

1. Any null set (a set E with µ∗(E) = 0) is measurable. Indeed, suppose µ∗(E) = 0, and
A ⊂ X. Then, since A ∩ E ⊂ E, we have µ∗(A ∩ E) = 0, and since $A \cap E^\complement \subset A$, we
have $\mu^*(A) \ge \mu^*(A \cap E^\complement)$, so
\begin{align*}
\mu^*(A) &\ge \mu^*(A \cap E^\complement) \\
&= \mu^*(A \cap E) + \mu^*(A \cap E^\complement).
\end{align*}

Thus E is measurable.

2. If $\{B_i\}_{i=1}^{\infty}$ is a countable collection of null sets, then $\bigcup_{i=1}^{\infty} B_i$ is a null set. This follows
directly from the last property of the outer measure (countable subadditivity).

REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.

Version: 1 Owner: matte Author(s): matte

Chapter 401

28A75 – Length, area, volume, other
geometric measure theory

401.1 Lebesgue density theorem

Let µ be the Lebesgue measure on R. If µ(Y) > 0, then there exists X ⊂ Y such that
µ(Y − X) = 0 and for all x ∈ X
\[ \lim_{\epsilon\to+0} \frac{\mu(X \cap [x - \epsilon, x + \epsilon])}{2\epsilon} = 1. \]

Version: 2 Owner: bbukh Author(s): bbukh

Chapter 402

28A80 – Fractals

402.1 Cantor set

The Cantor set C is the canonical example of an uncountable set of measure zero. We
construct C as follows.

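Begin with the unit interval C0 = [0, 1], and remove the open segment R1 := (1/3, 2/3) from the
middle. We define C1 as the two remaining pieces
\[ C_1 := C_0 \setminus R_1 = \left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right] \tag{402.1.1} \]
Now repeat the process on each remaining segment, removing the open set
\[ R_2 := \left(\tfrac{1}{9}, \tfrac{2}{9}\right) \cup \left(\tfrac{7}{9}, \tfrac{8}{9}\right) \tag{402.1.2} \]
to form the four-piece set
\[ C_2 := C_1 \setminus R_2 = \left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right] \tag{402.1.3} \]

Continue the process, forming C3, C4, . . . Note that Ck has $2^k$ pieces.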

Figure 402.1: The sets C0 through C5 in the construction of the Cantor set

Also note that at each step, the endpoints of each closed segment will stay in the set forever;
for example, the point 2/3 isn’t touched as we remove sets.

The Cantor set is defined as

\[ C := \bigcap_{k=1}^{\infty} C_k = C_0 \setminus \bigcup_{n=1}^{\infty} R_n \tag{402.1.4} \]

Cardinality of the Cantor set To establish cardinality, we want a bijection between
some set whose cardinality we know (e.g. Z, R) and the points in the Cantor set. We’ll be
aggressive and try the reals.

Start at C1 , which has two pieces. Mark the left-hand segment “0” and the right-hand
segment “1”. Then continue to C2 , and consider only the leftmost pair. Again, mark the
segments “0” and “1”, and do the same for the rightmost pair.

Keep doing this all the way down the Ck , starting at the left side and marking the segments
0, 1, 0, 1, 0, 1 as you encounter them, until you’ve labeled the entire Cantor set.

Now, pick a path through the tree starting at C0 and going left-left-right-left. . . and so on.
Mark a decimal point for C0 , and record the zeros and ones as you proceed. Each path has a
unique number based on your decision at each step. For example, the figure represents your
choice of left-left-right-left-right at the first five steps, representing the number beginning
0.00101...
Figure 402.2: One possible path through C5 : 0.00101

Every point in the Cantor set will have a unique address dependent solely on the pattern
of lefts and rights, 0’s and 1’s, required to reach it. Each point thus has a unique number,
the real number whose binary expansion is that sequence of zeros and ones. Every infinite
stream of binary digits can be found among these paths, and in fact the binary expansion of
every real number is a path to a unique point in the Cantor set.

Some caution is justified, as two binary expansions may refer to the same real number; for
example, 0.011111 . . . = 0.100000 . . . = 1/2. However, each one of these duplicates must
correspond to a rational number. To see this, suppose we have a number x in [0, 1] whose binary
expansion becomes all zeros or all ones at digit k (both are the same number, remember).
Then we can multiply that number by $2^k$ and get an integer, so it must be a (binary) rational number.
There are only countably many rationals, and not even all of those are the double-covered
numbers we’re worried about (see, e.g., 1/3 = 0.0101010 . . .), so we have at most countably
many duplicated reals. Thus, the cardinality of the Cantor set is equal to that of the reals.
(If we want to be really picky, map (0, 1) to the reals with, say, f(x) = 1/x + 1/(x − 1), and
the end points really don’t matter much.)

Return, for a moment, to the earlier observation that numbers such as 1/3, 2/9, the endpoints
of deleted intervals, are themselves never deleted. In particular, consider the first deleted

interval: the ternary expansions of its constituent numbers are precisely those that begin
0.1, and proceed thence with at least one non-zero “ternary” digit (just digit for us) further
along. Note also that the point 1/3, with ternary expansion 0.1, may also be written 0.0222 . . .
(the digit 2 repeating), which has no digits 1. Similar descriptions apply to further deleted intervals.
The result is that the Cantor set is precisely the set of numbers in [0, 1] that have a ternary
expansion containing no digits 1.
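
For rational numbers the ternary criterion can be tested exactly. The sketch below is an illustration only (the function name `in_cantor_set` is hypothetical): it repeatedly zooms into the left or right third, which corresponds to reading off ternary digits 0 and 2, and rejects a point as soon as it falls in an open middle third. For a rational input the orbit takes finitely many values, so the loop stops once a value repeats.

```python
from fractions import Fraction

def in_cantor_set(x):
    """Decide membership of a rational x in the middle-thirds Cantor set."""
    seen = set()
    while x not in seen:
        seen.add(x)
        if x < 0 or x > 1:
            return False
        if x <= Fraction(1, 3):
            x = 3 * x          # left third: ternary digit 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # right third: ternary digit 2
        else:
            return False       # open middle third: a digit 1 is unavoidable
    return True                # the digit pattern repeats without hitting a gap

endpoint_checks = [in_cantor_set(Fraction(1, 3)), in_cantor_set(Fraction(2, 9))]
quarter = in_cantor_set(Fraction(1, 4))   # 1/4 = 0.020202... in base 3
half = in_cantor_set(Fraction(1, 2))      # 1/2 = 0.111... in base 3
```

Note that 1/4 belongs to the Cantor set even though it is not an endpoint of any deleted interval.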

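Measure of the Cantor set Let µ be Lebesgue measure. The measures of the sets Rk
that we remove during the construction of the Cantor set are
\begin{align*}
\mu(R_1) &= \frac{2}{3} - \frac{1}{3} = \frac{1}{3} \tag{402.1.5} \\
\mu(R_2) &= \left(\frac{2}{9} - \frac{1}{9}\right) + \left(\frac{8}{9} - \frac{7}{9}\right) = \frac{2}{9} \tag{402.1.6} \\
&\;\;\vdots \tag{402.1.7} \\
\mu(R_k) &= \frac{2^{k-1}}{3^k} \tag{402.1.8}
\end{align*}

Note that the R’s are disjoint, which will allow us to sum their measures without worry: the
total measure removed after k steps is $\sum_{n=1}^{k} 2^{n-1}/3^n$. In the limit k → ∞, this gives us

\[ \mu\left(\bigcup_{n=1}^{\infty} R_n\right) = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1. \tag{402.1.9} \]

But we have µ(C0) = 1 as well, so this means

\[ \mu(C) = \mu\left(C_0 \setminus \bigcup_{n=1}^{\infty} R_n\right) = \mu(C_0) - \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1 - 1 = 0. \tag{402.1.10} \]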
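
The geometric sum can be checked with exact rational arithmetic. In this short sketch (function names are illustrative assumptions), the length removed after k steps is compared with the closed form 1 − (2/3)^k, so the remaining length (2/3)^k tends to 0.

```python
from fractions import Fraction

def removed_measure(k):
    """Total length removed after k steps: sum of 2^(n-1) / 3^n for n = 1..k."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, k + 1))

def remaining_measure(k):
    """Length of C_k, the set left after k removal steps."""
    return 1 - removed_measure(k)

first_steps = [removed_measure(k) for k in (1, 2, 3)]
```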

Thus we have seen that the measure of C is zero (though see below for more on this topic).
How many points are there in C? Lots, as we saw above.

So we have a set of measure zero (very tiny) with uncountably many points (very big). This
non-intuitive result is what makes Cantor sets so interesting.

Cantor sets with positive measure Clearly, Cantor sets can be constructed for all sorts
of “removals”: we can remove middle halves, or thirds, or any fraction 1/r, r > 1, we like. All
of these Cantor sets have measure zero, since at each step n we end up with
\[ L_n = \left(1 - \frac{1}{r}\right)^n \tag{402.1.11} \]

of what we started with, and limn→∞ Ln = 0 for any r > 1. With apologies, the figure above
is drawn for the case r = 2, rather than the r = 3 which seems to be the publicly favored
example.

However, it is possible to construct Cantor sets with positive measure as well; the key is to
remove less and less as we proceed. These Cantor sets have the same “shape” (topology) as
the Cantor set we first constructed, and the same cardinality, but a different “size.”

Again, start with the unit interval for C0, and choose a number 0 < p < 1. Let
\[ R_1 := \left(\frac{2-p}{4}, \frac{2+p}{4}\right) \tag{402.1.12} \]

which has length (measure) p/2. Again, define C1 := C0 \ R1. Now define
\[ R_2 := \left(\frac{2-p}{16}, \frac{2+p}{16}\right) \cup \left(\frac{14-p}{16}, \frac{14+p}{16}\right) \tag{402.1.13} \]

which has measure p/4. Continue as before, such that each Rk has measure $p/2^k$; note again
that all the Rk are disjoint. The resulting Cantor set has measure

\[ \mu\left(C_0 \setminus \bigcup_{n=1}^{\infty} R_n\right) = 1 - \sum_{n=1}^{\infty} \mu(R_n) = 1 - p \sum_{n=1}^{\infty} 2^{-n} = 1 - p > 0. \]

Thus we have a whole family of Cantor sets of positive measure to accompany their vanishing
brethren.

Version: 19 Owner: drini Author(s): drini, quincynoodles, drummond

402.2 Hausdorff dimension

Let Θ be a bounded subset of $\mathbb{R}^n$, and let $N_\Theta(\epsilon)$ be the minimum number of balls of radius ε
required to cover Θ. Then define the Hausdorff dimension dH of Θ to be

\[ d_H(\Theta) := -\lim_{\epsilon\to 0} \frac{\log N_\Theta(\epsilon)}{\log \epsilon}. \]

Hausdorff dimension is easy to calculate for simple objects like the Sierpinski gasket or a
Koch curve. Each of these may be covered with a collection of scaled-down copies of itself.
In fact, in the case of the Sierpinski gasket, one can take the individual triangles in each
approximation as balls in the covering. At stage n, there are $3^n$ triangles of radius $1/2^n$, and
so the Hausdorff dimension of the Sierpinski triangle is
\[ -\frac{n \log 3}{n \log (1/2)} = \frac{\log 3}{\log 2}. \]
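
As a quick numerical check of this ratio, the sketch below evaluates −log N / log ε for the stage-n covers of the Sierpinski gasket described above (3^n pieces of radius 2^(−n)). The function name is a hypothetical helper, and the computation only illustrates the formula; it does not establish that these covers are minimal.

```python
import math

def dimension_estimate(count, radius):
    """-log(count) / log(radius) for a cover by `count` balls of radius `radius`."""
    return -math.log(count) / math.log(radius)

# Stage n of the gasket: 3**n triangles, each of radius 2**(-n).
estimates = [dimension_estimate(3 ** n, 0.5 ** n) for n in range(1, 11)]
target = math.log(3) / math.log(2)   # approximately 1.585
```

Every stage gives the same value, since the factor n cancels in the ratio.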

From some notes from Koro This definition can be extended to a general metric space
X with distance function d.

Define the diameter |C| of a bounded subset C of X to be $\sup_{x,y\in C} d(x, y)$, and define a
countable r-cover of X to be a collection of subsets Ci of X, each of diameter less than r,
indexed by some countable set I, such that $X = \bigcup_{i\in I} C_i$. We also define the handy function
\[ H_r^D(X) = \inf \sum_{i\in I} |C_i|^D \]

where the infimum is over all countable r-covers of X. The Hausdorff dimension of X may
then be defined as
\[ d_H(X) = \inf\{D \mid \lim_{r\to 0} H_r^D(X) = 0\}. \]
When X is a subset of $\mathbb{R}^n$ with any restricted norm-induced metric, then this definition
reduces to that given above.

Version: 8 Owner: drini Author(s): drini, quincynoodles

402.3 Koch curve

A Koch curve is a fractal generated by a replacement rule. This rule is, at each step, to
replace the middle 1/3 of each line segment with two sides of an equilateral triangle having sides of
length equal to the replaced segment. Two applications of this rule on a single line segment
gives us:

To generate the Koch curve, the rule is applied indefinitely, with a starting line segment.
Note that, if the length of the initial line segment is l, the length LK of the Koch curve at
the nth step will be

\[ L_K = \left(\frac{4}{3}\right)^n l \]

This quantity increases without bound; hence the Koch curve has infinite length. However,
the curve still bounds a finite area. We can prove this by noting that in each step, we add an
amount of area equal to the area of all the equilateral triangles we have just created. We can
bound the area of each triangle of side length s by $s^2$ (the area of a square containing the triangle).
Hence, at step n, the area AK “under” the Koch curve (assuming l = 1) satisfies

Figure 402.3: Sierpinski gasket stage 0, a single triangle

Figure 402.4: Stage 1, three triangles

 2  2  2
1 1 1
AK < +3 +9 +···
3 9 27
Xn
1
= i−1
i=1
3

but this is a geometric series of ratio less than one, so it converges. Hence a Koch curve has
infinite length and bounds a finite area.
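
The two behaviours can be compared numerically. This is a minimal sketch with hypothetical function names, assuming l = 1 and counting one new triangle per segment of the previous stage (4^(i−1) triangles of side 3^(−i) at step i), each bounded by the square of its side length: the length grows without bound while the area bound stays below 1/5.

```python
from fractions import Fraction

def koch_length(n, l=Fraction(1)):
    """Length after n replacement steps: (4/3)^n * l."""
    return Fraction(4, 3) ** n * l

def koch_area_bound(n):
    """Upper bound on the added area after n steps: sum of 4^(i-1) / 9^i."""
    return sum(Fraction(4 ** (i - 1), 9 ** i) for i in range(1, n + 1))

lengths = [koch_length(n) for n in (1, 2, 3)]
bounds = [koch_area_bound(n) for n in (1, 2, 3)]
```

The bound is a geometric series with ratio 4/9, whose partial sums equal (1/5)(1 − (4/9)^n).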

A Koch snowflake is the figure generated by applying the Koch replacement rule to an
equilateral triangle indefinitely.

Version: 3 Owner: akrowne Author(s): akrowne

402.4 Sierpinski gasket

Let S0 be a triangular area, and define Sn+1 to be obtained from Sn by replacing each triangular
area in Sn with three similar and similarly oriented triangular areas, each intersecting
each of the other two at exactly one vertex, and each one half the linear scale of the original.
The limiting set as n → ∞ (alternately the intersection of all these sets) is a
Sierpinski gasket, also known as a Sierpinski triangle.

Version: 3 Owner: quincynoodles Author(s): quincynoodles

402.5 fractal

Option 1: Some equivalence class of subsets of $\mathbb{R}^n$. A usual equivalence is postulated when
some generalised “distance” is zero. For example, let F, G ⊂ $\mathbb{R}^n$, and let d(x, y) be the usual
distance (x, y ∈ $\mathbb{R}^n$). Define the distance D between F and G as

\[ D(F, G) := \sup_{g\in G} \inf_{f\in F} d(f, g) + \sup_{f\in F} \inf_{g\in G} d(f, g). \]

Then in this case we have, as fractals, that Q and R are equivalent.

Figure 402.5: Stage 2, nine triangles

Figure 402.6: Stage n, $3^n$ triangles

Option 2: A subset of Rn with non-integral Hausdorff dimension. Examples: (we think) the
coast of Britain, a Koch snowflake.

Option 3: A “self-similar object”. That is, one which can be covered by copies of itself using
a set of (usually two or more) transformation mappings. Another way to say this would
be “an object with a discrete approximate scaling symmetry.” Example: A square region,
a Koch curve, a fern frond. This isn’t much different from Option 1 because of the collage
theorem.

A cursory description of some relationships between options 2 and 3 is given towards the
end of the entry on Hausdorff dimension. The use of option 1 is that it permits one to talk
about how ”close” two fractals are to one another. This becomes quite handy when one
wants to talk about approximating fractals, especially approximating option 3 type fractals
with pictures that can be drawn in finite time. A simple example: one can talk about how
close one of the line drawings in the Koch curve entry is to an actual Koch curve.

Version: 7 Owner: quincynoodles Author(s): quincynoodles

Chapter 403

28Axx – Classical measure theory

403.1 Vitali’s Theorem

There exists a set V ⊂ [0, 1] which is not Lebesgue measurable.

Version: 1 Owner: paolini Author(s): paolini

403.2 proof of Vitali’s Theorem

Consider the equivalence relation in [0, 1) given by
\[ x \sim y \iff x - y \in \mathbb{Q} \]
and let F be the family of all equivalence classes of ∼. Let V be a section of F, i.e. put in V
an element for each equivalence class of ∼ (notice that we are using the axiom of choice).

Given q ∈ Q ∩ [0, 1) define
\[ V_q = \bigl((V + q) \cap [0, 1)\bigr) \cup \bigl((V + q - 1) \cap [0, 1)\bigr), \]
that is, Vq is obtained by translating V by a quantity q to the right and then cutting the piece
which goes beyond the point 1 and putting it on the left, starting from 0.

Now notice that given x ∈ [0, 1) there exists y ∈ V such that x ∼ y (because V is a section
of ∼) and hence there exists q ∈ Q ∩ [0, 1) such that x ∈ Vq. So
\[ \bigcup_{q\in\mathbb{Q}\cap[0,1)} V_q = [0, 1). \]

Moreover all the Vq are disjoint. In fact, if x ∈ Vq ∩ Vp with p ≠ q, then x − q (modulo 1) and x − p (modulo 1)
are both in V, which is not possible since they are distinct and differ by a rational quantity q − p (or q − p + 1).

Now if V is Lebesgue measurable, clearly the Vq are also measurable and µ(Vq) = µ(V). Moreover,
by the countable additivity of µ we have
\[ \mu([0, 1)) = \sum_{q\in\mathbb{Q}\cap[0,1)} \mu(V_q) = \sum_{q\in\mathbb{Q}\cap[0,1)} \mu(V). \]

So if µ(V) = 0 we would have µ([0, 1)) = 0, and if µ(V) > 0 we would have µ([0, 1)) = +∞,
whereas in fact µ([0, 1)) = 1.

So the only possibility is that V is not Lebesgue measurable.

Version: 1 Owner: paolini Author(s): paolini

Chapter 404

28B15 – Set functions, measures and
integrals with values in ordered spaces

404.1 Lp-space

Definition Let (X, B, µ) be a measure space. The $L^p$-norm of a function f : X → R is
defined as
\[ \|f\|_p := \left( \int_X |f|^p \, d\mu \right)^{1/p} \tag{404.1.1} \]
when the integral exists. The set of functions with finite $L^p$-norm forms a vector space V
with the usual pointwise addition and scalar multiplication of functions. In particular, the
set of functions with zero $L^p$-norm forms a linear subspace of V, which for this article will be
called K. We are then interested in the quotient space V/K, which consists of real functions
on X with finite $L^p$-norm, identified up to equivalence almost everywhere. This quotient
space is the real $L^p$-space on X.

Theorem The vector space V /K is complete with respect to the Lp norm.

The space $L^\infty$. The space $L^\infty$ is somewhat special, and may be defined without explicit
reference to an integral. First, the $L^\infty$-norm of f is defined to be the essential supremum of
|f|:
\[ \|f\|_\infty := \operatorname{ess\,sup} |f| = \inf \{a \in \mathbb{R} : \mu(\{x : |f(x)| > a\}) = 0\} \tag{404.1.2} \]
The definitions of V, K, and $L^\infty$ then proceed as above. Functions in $L^\infty$ are also called
essentially bounded.

Example Let X = [0, 1] and $f(x) = 1/\sqrt{x}$. Then $f \in L^1(X)$ but $f \notin L^2(X)$.
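
The example can be made concrete by integrating over [ε, 1] and letting ε shrink. The sketch below uses the elementary antiderivatives 2√x and log x; the function names and the choice of cutoffs are assumptions made for illustration.

```python
import math

def l1_truncated(eps):
    """Integral of x**(-1/2) over [eps, 1], via the antiderivative 2*sqrt(x)."""
    return 2.0 - 2.0 * math.sqrt(eps)

def l2_truncated(eps):
    """Integral of (x**(-1/2))**2 = 1/x over [eps, 1], via the antiderivative log(x)."""
    return -math.log(eps)

cutoffs = [10.0 ** (-k) for k in range(1, 9)]
l1_values = [l1_truncated(e) for e in cutoffs]   # increases towards 2
l2_values = [l2_truncated(e) for e in cutoffs]   # grows like k * log(10)
```

The first list stays below 2, so the integral of |f| is finite, while the second grows without bound, so the integral of |f|^2 diverges.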

Version: 18 Owner: mathcam Author(s): Manoj, quincynoodles, drummond

404.2 locally integrable function

Definition [4, 1, 2] Suppose that U is an open set in $\mathbb{R}^n$, and f : U → C is a Lebesgue measurable
function. If the Lebesgue integral
\[ \int_K |f| \, dx \]
is finite for all compact subsets K in U, then f is locally integrable. The set of all such
functions is denoted by $L^1_{loc}(U)$.

Example

1. L1 (U) ⊂ L1loc (U), where L1 (U) is the set of integrable functions.

Theorem Suppose f and g are locally integrable functions on an open subset U ⊂ $\mathbb{R}^n$, and
suppose that
\[ \int_U f\phi \, dx = \int_U g\phi \, dx \]
for all smooth functions with compact support $\phi \in C_0^\infty(U)$. Then f = g almost everywhere.

A proof based on the Lebesgue differentiation theorem is given in [4] p. 15. Another proof
is given in [2] p. 276.

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution
theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
3. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.

Version: 3 Owner: matte Author(s): matte

Chapter 405

28C05 – Integration theory via linear
integrals, etc.), representing set
functions and measures

405.1 Haar integral

Let Γ be a locally compact topological group and C be the algebra of all continuous real-valued
functions on Γ with compact support. In addition we define C+ to be the set of
non-negative functions that belong to C. The Haar integral for Γ is a real linear map I of C into
the field of real numbers that satisfies:

• I is not the zero map

• I only takes non-negative values on C+

• I has the following property: I(γ · f) = I(f) for all elements f of C and all elements γ
of Γ.

The Haar integral may be denoted in the following way (there are also other ways):

\[ \int_{\gamma\in\Gamma} f(\gamma) \quad\text{or}\quad \int_\Gamma f \quad\text{or}\quad \int_\Gamma f \, d\gamma \quad\text{or}\quad I(f) \]

For the Haar integral to exist and to be unique, the following condition is
necessary and sufficient: there exists a real-valued function $I^+$ on C+ satisfying the
following conditions:

1. (Linearity). $I^+(\lambda f + \mu g) = \lambda I^+(f) + \mu I^+(g)$ where f, g ∈ C+ and λ, µ ∈ R+.

2. (Positivity). If f(γ) > 0 for all γ ∈ Γ then $I^+(f(\gamma)) > 0$.

3. (Translation-Invariance). $I^+(f(\delta\gamma)) = I^+(f(\gamma))$ for any fixed δ ∈ Γ and every f in C+.

An additional property is that if Γ is a compact group then the Haar integral is also right
translation-invariant: $\int_{\gamma\in\Gamma} f(\gamma\delta) = \int_{\gamma\in\Gamma} f(\gamma)$ for any fixed δ ∈ Γ. In addition, we can define the
normalized Haar integral by requiring $\int_\Gamma 1 = 1$; since Γ is compact, $\int_\Gamma 1$ is finite.
(The proof of existence and uniqueness of the Haar integral is presented in [PV] on page
9.)

(The information in this entry is in part quoted and paraphrased from [GSS].)

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation
Theory (Volume II). Springer-Verlag, New York, 1988.
[HG] Hochschild, G.: The Structure of Lie Groups. Holden-Day, San Francisco, 1965.

Version: 4 Owner: Daume Author(s): Daume

Chapter 406

28C10 – Set functions and measures
on topological groups, Haar measures,
invariant measures

406.1 Haar measure

406.1.1 Definition of Haar measures

Let G be a locally compact topological group. A left Haar measure on G is a measure µ on
the Borel sigma algebra B of G which is:

1. outer regular on all Borel sets B ∈ B

2. inner regular on all open sets U ⊂ G

3. finite on all compact sets K ⊂ G

4. invariant under left translation: µ(gB) = µ(B) for all Borel sets B ∈ B

A right Haar measure on G is defined similarly, except with left translation invariance re-
placed by right translation invariance (µ(Bg) = µ(B) for all Borel sets B ∈ B). A bi–
invariant Haar measure is a Haar measure that is both left invariant and right invariant.

406.1.2 Existence of Haar measures

For any finite group G, the counting measure on G is a bi–invariant Haar measure. More
generally, every locally compact topological group G has a left Haar measure µ, which is
unique up to scalar multiples. (G also has a right Haar measure, although the right and left
Haar measures on G are not necessarily equal unless G is abelian.) The Haar measure plays
an important role in the development of Fourier analysis and representation theory on
locally compact groups such as Lie groups and profinite groups.

Version: 1 Owner: djao Author(s): djao

Chapter 407

28C20 – Set functions and measures
and integrals in infinite-dimensional
spaces (Wiener measure, Gaussian
measure, etc.)

407.1 essential supremum

Let (X, B, µ) be a measure space and let f : X → R be a function. The essential supre-
mum of f is the smallest number a ∈ R for which f only exceeds a on a set of measure zero.
This allows us to generalize the maximum of a function in a useful way.

More formally, we define ess sup f as follows. Let a ∈ R, and define

Ma = {x : f (x) > a} , (407.1.1)
the subset of X where f (x) is greater than a. Then let

A0 = {a ∈ R : µ(Ma ) = 0} , (407.1.2)

the set of real numbers for which Ma has measure zero. If A0 = ∅, then the essential
supremum is defined to be ∞. Otherwise, the essential supremum of f is

ess sup f := inf A0 . (407.1.3)
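
On a finite set with point masses the definition reduces to a maximum over the points of positive mass. The following sketch assumes exactly that setting (a finite X, with `weights[x]` playing the role of µ({x})); the function name `ess_sup` and the data are illustrative.

```python
def ess_sup(values, weights):
    """Essential supremum of f on a finite set X.

    values[x] is f(x) and weights[x] >= 0 is the mass of the point x.
    M_a has measure zero exactly when no point of positive mass has f(x) > a,
    so inf A_0 is the largest value taken on a point of positive mass.
    """
    charged = [values[x] for x in values if weights[x] > 0]
    # If every point has mass zero, A_0 is all of R and its infimum is -infinity.
    return max(charged) if charged else float("-inf")

f_values = {"a": 1.0, "b": 5.0, "c": 3.0}
mu = {"a": 1.0, "b": 0.0, "c": 2.0}
result = ess_sup(f_values, mu)       # ignores the null point "b"
plain_max = max(f_values.values())   # the ordinary maximum does not
```

Here the ordinary maximum is attained only on a set of measure zero, so the essential supremum is strictly smaller.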

Version: 1 Owner: drummond Author(s): drummond

Chapter 408

28D05 – Measure-preserving
transformations

408.1 measure-preserving

Let (X, B, µ) be a measure space, and T : X → X be a (possibly non-invertible) measurable
transformation. We call T measure-preserving if for all A ∈ B,

\[ \mu(T^{-1}(A)) = \mu(A), \]

where $T^{-1}(A)$ is defined to be the set of points x ∈ X such that T(x) ∈ A.

A measure-preserving transformation is also called an endomorphism of the measure space.
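
A standard non-invertible example is the doubling map T(x) = 2x mod 1 on [0, 1) with Lebesgue measure. The sketch below, with hypothetical function names, checks the defining identity on half-open intervals [a, b), whose preimage consists of two intervals of half the length; it illustrates the definition on intervals only and is not a proof for all measurable sets.

```python
from fractions import Fraction

def doubling_map(x):
    """T(x) = 2x mod 1 on [0, 1)."""
    y = 2 * x
    return y - 1 if y >= 1 else y

def preimage_of_interval(a, b):
    """Preimage under T of [a, b), for 0 <= a <= b <= 1, as two disjoint intervals."""
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]

def total_length(intervals):
    """Lebesgue measure of a disjoint union of intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)

a, b = Fraction(1, 4), Fraction(3, 4)
pre = preimage_of_interval(a, b)
```

Note that the identity uses the preimage: the image of [0, 1/2) under T is all of [0, 1), so the measure of T(A) need not equal the measure of A.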

Version: 5 Owner: mathcam Author(s): mathcam, drummond

Chapter 409

30-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)

409.1 domain

A non-empty open set in C is called a domain.

The topology considered is the Euclidean one (viewing C as $\mathbb{R}^2$). Consequently, a
domain D is connected if and only if it is path-connected.

Every component of a domain D is a region. Since each component is open, it contains a point
with rational coordinates, so every domain has at most countably many components.

Version: 4 Owner: drini Author(s): drini

409.2 region

A region is a connected domain.

Since every domain in C is the union of at most countably many components, each of which
is a region, regions play a major role in complex analysis.

Version: 2 Owner: drini Author(s): drini

409.3 regular region

Let E be an n-dimensional Euclidean space with the topology induced by the Euclidean metric.
Then a set in E is a regular region, if it can be written as the closure of a non-empty region
with a piecewise smooth boundary.

Version: 10 Owner: ottocolori Author(s): ottocolori

409.4 topology of the complex plane

The usual topology for the complex plane C is the topology induced by the metric d(x, y) =
|x − y| for x, y ∈ C. Here, | · | is the complex modulus.

If we identify R2 and C, it is clear that the above topology coincides with topology induced
by the Euclidean metric on R2 .

Version: 1 Owner: matte Author(s): matte

Chapter 410

30-XX – Functions of a complex
variable

410.1 z0 is a pole of f

Let f be an analytic function on a punctured neighborhood of z0 ∈ C, that is, f analytic on

\[ \{z \in \mathbb{C} : 0 < |z - z_0| < \varepsilon\} \]

for some ε > 0 and such that
\[ \lim_{z\to z_0} f(z) = \infty. \]

We say then that z0 is a pole for f.

Version: 2 Owner: drini Author(s): drini, apmxi

Chapter 411

30A99 – Miscellaneous

411.1 Riemann mapping theorem

Let U be a simply connected open proper subset of C, and let a ∈ U. There is a unique
analytic function f : U → C such that

1. f(a) = 0, and f′(a) is real and positive;

2. f is injective;

3. f(U) = {z ∈ C : |z| < 1}.

Remark. As a consequence of this theorem, any two simply connected regions, neither of
which is the whole plane, are conformally equivalent.

Version: 2 Owner: Koro Author(s): Koro

411.2 Runge’s theorem
Let K be a compact subset of C, and let E be a subset of $\mathbb{C}_\infty = \mathbb{C} \cup \{\infty\}$ (the extended
complex plane) which intersects every connected component of $\mathbb{C}_\infty - K$. If f is an analytic
function in an open set containing K, given ε > 0, there is a rational function R(z) whose
only poles are in E, such that |f(z) − R(z)| < ε for all z ∈ K.

Version: 2 Owner: Koro Author(s): Koro

411.3 Weierstrass M-test

Let X be a topological space, {fn}n∈N a sequence of real or complex valued functions on
X and {Mn}n∈N a sequence of non-negative real numbers. Suppose that, for each n ∈ N
and x ∈ X, we have |fn(x)| ≤ Mn. Then $f = \sum_{n=1}^{\infty} f_n$ converges uniformly if $\sum_{n=1}^{\infty} M_n$
converges.
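
As an illustration (the series and all names below are assumptions chosen for the example, not part of the entry), take f_n(x) = sin(nx)/n² with M_n = 1/n². The gap between two partial sums is then bounded, uniformly in x, by the corresponding tail of the majorant series.

```python
import math

def partial_sum(x, n_terms):
    """Sum of sin(n x) / n**2 for n = 1..n_terms."""
    return sum(math.sin(n * x) / n ** 2 for n in range(1, n_terms + 1))

def majorant_tail(n_from, n_to):
    """Sum of M_n = 1 / n**2 for n_from < n <= n_to."""
    return sum(1.0 / n ** 2 for n in range(n_from + 1, n_to + 1))

grid = [2 * math.pi * k / 100 for k in range(101)]   # sample points in [0, 2*pi]
N, M = 10, 200
worst_gap = max(abs(partial_sum(x, M) - partial_sum(x, N)) for x in grid)
bound = majorant_tail(N, M)
```

The bound does not depend on x, which is exactly what makes the convergence uniform.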

Version: 8 Owner: vypertd Author(s): vypertd, igor

411.4 annulus

Briefly, an annulus is the region bounded between two (usually concentric) circles.

An open annulus, or just annulus for short, is a domain in the complex plane of the form
\[ A = A_w(r, R) = \{z \in \mathbb{C} \mid r < |z - w| < R\}, \]
where w is an arbitrary complex number, and r and R are real numbers with 0 < r < R.
Such a set is often called an annular region.

More generally, one can allow r = 0 or R = ∞. (This makes sense for the purposes of the
bound on |z − w| above.) This would make an annulus include the cases of a punctured disc,
and some unbounded domains.

Analogously, one can define a closed annulus to be a set of the form
\[ A = A_w(r, R) = \{z \in \mathbb{C} \mid r \le |z - w| \le R\}, \]
where w ∈ C, and r and R are real numbers with 0 < r < R.

One can show that two annuli $A_w(r, R)$ and $A_{w'}(r', R')$ are conformally equivalent if and
only if R/r = R′/r′. More generally, the complement of any closed disk in an open disk is
conformally equivalent to precisely one annulus of the form $A_0(r, 1)$.

Version: 1 Owner: jay Author(s): jay

411.5 conformally equivalent

A region G is conformally equivalent to a set S if there is an analytic bijective function
mapping G to S.

Conformal equivalence is an equivalence relation.

Version: 1 Owner: Koro Author(s): Koro

411.6 contour integral

Let f be a complex-valued function defined on the image of a curve α: [a, b] → C, let
$P = \{a_0, \ldots, a_n\}$ be a partition of [a, b]. If the sum

$$\sum_{i=1}^{n} f(z_i)\,\bigl(\alpha(a_i) - \alpha(a_{i-1})\bigr),$$

where $z_i$ is some point $\alpha(t_i)$ such that $a_{i-1} \le t_i \le a_i$, tends to a unique limit l as n tends
to infinity and the greatest of the numbers $a_i - a_{i-1}$ tends to zero, then we say that the
contour integral of f along α exists and has value l. The contour integral is denoted by

$$\int_\alpha f(z)\,dz.$$

Note

(i) If Im(α) is a segment of the real axis, then this definition reduces to that of the
Riemann integral of f(x) between α(a) and α(b)

(ii) An alternative definition, making use of the Riemann-Stieltjes integral, is based on the
fact that the definition of this can be extended without any other changes in the wording to
cover the cases where f and α are complex-valued functions.

Now let α be any curve [a, b] → R2 . Then α can be expressed in terms of the components
(α1 , α2 ) and can be associated with the complex valued function

z(t) = α1 (t) + iα2 (t)

Given any complex-valued function of a complex variable, f say, defined on Im(α) we define
the contour integral of f along α, denoted by

$$\int_\alpha f(z)\,dz$$

by

$$\int_\alpha f(z)\,dz = \int_a^b f(z(t))\,dz(t)$$
whenever the complex Riemann-Stieltjes integral on the right exists.

(iii) Reversing the direction of the curve changes the sign of the integral.

(iv) The contour integral always exists if α is rectifiable and f is continuous.

(v) If α is piecewise smooth and the contour integral of f along α exists, then

$$\int_\alpha f\,dz = \int_a^b f(z(t))\,z'(t)\,dt.$$
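
The Riemann-sum definition at the start of this entry can be tried out numerically. The sketch below is only an illustration under stated assumptions: it uses a uniform partition and takes $t_i$ to be the midpoint of each subinterval, and the names `contour_integral` and `unit_circle` are invented for this example.

```python
import cmath
import math


def contour_integral(f, alpha, a, b, n=20000):
    """Riemann sum  sum_i f(alpha(t_i)) * (alpha(a_i) - alpha(a_{i-1})).

    Uses a uniform partition a_0 < ... < a_n of [a, b] and the midpoint
    t_i of each subinterval (one admissible choice among many).
    """
    h = (b - a) / n
    total = 0j
    prev = alpha(a)
    for i in range(1, n + 1):
        a_i = a + i * h
        cur = alpha(a_i)
        t_i = a_i - h / 2
        total += f(alpha(t_i)) * (cur - prev)
        prev = cur
    return total


def unit_circle(t):
    """Counterclockwise parametrization of the unit circle, t in [0, 2*pi]."""
    return cmath.exp(1j * t)
```

For instance, `contour_integral(lambda z: 1 / z, unit_circle, 0.0, 2 * math.pi)` comes out close to $2\pi i$, and traversing the circle in the opposite direction (`lambda t: unit_circle(-t)`) flips the sign, in line with note (iii).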

Version: 4 Owner: vypertd Author(s): vypertd

411.7 orientation

Let α be a rectifiable Jordan curve in R^2 and let z0 be a point of R^2 − Im(α) lying inside α,
with winding number W [α : z0 ]. Then W [α : z0 ] = ±1; all points inside α have the same
index, and we define the orientation of a Jordan curve α by saying that α is positively
oriented if the index of every point inside α is +1 and negatively oriented if it is −1.

Version: 3 Owner: vypertd Author(s): vypertd

411.8 proof of Weierstrass M-test
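Consider the sequence of partial sums $s_n = \sum_{m=1}^{n} f_m$. Since the sums are finite, each $s_n$ is
continuous whenever the $f_m$ are. Take any $p, q \in \mathbb{N}$ such that $p \le q$; then, for every $x \in X$, we have

$$|s_q(x) - s_p(x)| = \Bigl|\sum_{m=p+1}^{q} f_m(x)\Bigr| \le \sum_{m=p+1}^{q} |f_m(x)| \le \sum_{m=p+1}^{q} M_m.$$

But since $\sum_{n=1}^{\infty} M_n$ converges, for any $\varepsilon > 0$ we can find an $N \in \mathbb{N}$ such that, for any
$p, q > N$ and $x \in X$, we have $|s_q(x) - s_p(x)| \le \sum_{m=p+1}^{q} M_m < \varepsilon$. Hence the sequence $s_n$
is uniformly Cauchy, so it converges uniformly to $\sum_{n=1}^{\infty} f_n$, and the function $f = \sum_{n=1}^{\infty} f_n$ is
continuous whenever the $f_n$ are.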

Version: 1 Owner: igor Author(s): igor

411.9 unit disk

The unit disk in the complex plane, denoted ∆, is defined as {z ∈ C : |z| < 1}. The unit
circle, denoted ∂∆ or S^1, is the boundary {z ∈ C : |z| = 1} of the unit disk ∆. Every element
z ∈ ∂∆ can be written as z = e^{iθ} for some real value of θ.

Version: 5 Owner: brianbirgen Author(s): brianbirgen

411.10 upper half plane

The upper half plane in the complex plane, abbreviated UHP, is defined as {z ∈ C : Im(z) >
0}.

Version: 4 Owner: brianbirgen Author(s): brianbirgen

411.11 winding number and fundamental group

The winding number is an analytic way to define an explicit isomorphism

W [• : z0 ] : π_1(C \ {z0 }) → Z

from the fundamental group of the punctured (at z0 ) complex plane to the group of integers.

Version: 1 Owner: Dr Absentius Author(s): Dr Absentius

Chapter 412

30B10 – Power series (including
lacunary series)

412.1 Euler relation

Euler’s relation (also known as Euler’s formula) is considered the first bridge between the
fields of algebra and geometry, as it relates the exponential function to the trigonometric
sine and cosine functions.

The goal is to prove
$$e^{ix} = \cos(x) + i \sin(x).$$

It’s easy to show that

$$i^{4n} = 1, \qquad i^{4n+1} = i, \qquad i^{4n+2} = -1, \qquad i^{4n+3} = -i.$$

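Now, using the Taylor series expansions of sin x, cos x and $e^x$, we can show that

$$e^{ix} = \sum_{n=0}^{\infty} \frac{i^n x^n}{n!},$$

$$e^{ix} = \sum_{n=0}^{\infty} \left( \frac{x^{4n}}{(4n)!} + \frac{i x^{4n+1}}{(4n+1)!} - \frac{x^{4n+2}}{(4n+2)!} - \frac{i x^{4n+3}}{(4n+3)!} \right).$$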

Because the series expansion above is absolutely convergent for all x, we can rearrange the
terms of the series as follows:

$$e^{ix} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!},$$

$$e^{ix} = \cos(x) + i \sin(x).$$
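
The identity can also be checked numerically. The sketch below is an illustration only: it sums the first terms of $\sum_n (ix)^n / n!$ and compares the result with $\cos x + i \sin x$. The function name and the default of 40 terms are arbitrary choices made for this example.

```python
import math


def exp_i_series(x, n_terms=40):
    """Partial sum of sum_{n>=0} (i x)^n / n! using the first n_terms terms."""
    total = 0j
    term = 1 + 0j  # the n = 0 term
    for n in range(n_terms):
        total += term
        term *= 1j * x / (n + 1)  # turn (ix)^n / n! into (ix)^(n+1) / (n+1)!
    return total


def euler_rhs(x):
    """Right-hand side cos(x) + i sin(x) of Euler's relation."""
    return complex(math.cos(x), math.sin(x))
```

For moderate real x the two functions agree to within rounding error; with x = π the partial sum is close to −1.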

Version: 8 Owner: drini Author(s): drini, fiziko, igor

412.2 analytic

Let U be a domain in the complex numbers (resp., real numbers). A function f : U −→ C
(resp., f : U −→ R) is analytic (resp., real analytic) if f has a Taylor series about each point
x ∈ U that converges to the function f in an open neighborhood of x.

412.2.1 On Analyticity and Holomorphicity

A complex function is analytic if and only if it is holomorphic. Because of this equivalence,
an analytic function in the complex case is often defined to be one that is holomorphic,
instead of one having a Taylor series as above. Although the two definitions are equivalent,
it is not an easy matter to prove their equivalence, and a reader who does not yet have this
result available will have to pay attention as to which definition of analytic is being used.

Version: 4 Owner: djao Author(s): djao

412.3 existence of power series

In this entry we shall demonstrate the logical equivalence of the holomorphic and analytic
concepts. As is the case with so many basic results in complex analysis, the proof of these
facts hinges on the Cauchy integral theorem, and the Cauchy integral formula.

Holomorphic implies analytic.

Theorem 8. Let U ⊂ C be an open domain that contains the origin, and let f : U → C be
a function such that the complex derivative

$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$$

exists for all z ∈ U. Then, there exists a power series representation

$$f(z) = \sum_{k=0}^{\infty} a_k z^k, \qquad \|z\| < R, \quad a_k \in \mathbb{C},$$

for a sufficiently small radius of convergence R > 0.

Note: it is just as easy to show the existence of a power series representation around every
basepoint z0 ∈ U; one need only consider the holomorphic function z ↦ f (z + z0 ), which is
defined near the origin.

Proof. Choose an R > 0 sufficiently small so that the disk $\|z\| \le R$ is contained in U. By
the Cauchy integral formula we have that

$$f(z) = \frac{1}{2\pi i} \oint_{\|\zeta\| = R} \frac{f(\zeta)}{\zeta - z}\, d\zeta, \qquad \|z\| < R,$$

where, as usual, the integration contour is oriented counterclockwise. For every ζ of modulus
R, we can expand the integrand as a geometric power series in z, namely

$$\frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)/\zeta}{1 - z/\zeta} = \sum_{k=0}^{\infty} \frac{f(\zeta)}{\zeta^{k+1}}\, z^k, \qquad \|z\| < R.$$

The circle of radius R is a compact set; hence f (ζ) is bounded on it; and hence, the power
series above converges uniformly with respect to ζ. Consequently, the order of the infinite
summation and the integration operations can be interchanged. Hence,

$$f(z) = \sum_{k=0}^{\infty} a_k z^k, \qquad \|z\| < R,$$

where

$$a_k = \frac{1}{2\pi i} \oint_{\|\zeta\| = R} \frac{f(\zeta)}{\zeta^{k+1}}\, d\zeta,$$

as desired. QED

Analytic implies holomorphic.
Theorem 9. Let

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \qquad a_n \in \mathbb{C}, \quad \|z\| < \varepsilon,$$

be a power series, converging in $D = D_\varepsilon(0)$, the open disk of radius $\varepsilon > 0$ about the origin.
Then the complex derivative

$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$$

exists for all z ∈ D, i.e. the function f : D → C is holomorphic.

Note: this theorem generalizes immediately to shifted power series in z − z0 , z0 ∈ C.

Proof. For every z0 ∈ D, the function f (z) can be recast as a power series centered at z0 .
Hence, without loss of generality it suffices to prove the theorem for z = 0. The power series

$$\sum_{n=0}^{\infty} a_{n+1} \zeta^n, \qquad \zeta \in D,$$

converges, and equals (f (ζ) − f (0))/ζ for ζ ≠ 0. Consequently, the complex derivative f′(0)
exists; indeed it is equal to $a_1$. QED

Version: 2 Owner: rmilson Author(s): rmilson

412.4 infinitely-differentiable function that is not analytic

If f ∈ C∞ , then we can certainly write a Taylor series for f . However, analyticity requires
that this Taylor series actually converge (at least across some radius of convergence) to f .
It is not necessary that the power series for f converge to f , as the following example shows.

Let

$$f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0. \end{cases}$$

Then f ∈ C∞ , and for any n ≥ 0, $f^{(n)}(0) = 0$ (see below). So the Taylor series for f around
0 is 0; since f (x) > 0 for all x ≠ 0, clearly it does not converge to f .

Proof that f (n) (0) = 0

Let p(x), q(x) ∈ R[x] be polynomials, and define

$$g(x) = \frac{p(x)}{q(x)} \cdot f(x).$$

Then, for x ≠ 0,

$$g'(x) = \frac{\left(p'(x) + p(x)\,\frac{2}{x^3}\right) q(x) - q'(x)\,p(x)}{q^2(x)} \cdot e^{-1/x^2}.$$

Computing (e.g. by applying L’Hôpital’s rule), we see that $g'(0) = \lim_{x \to 0} g'(x) = 0$.

Define $p_0(x) = q_0(x) = 1$. Applying the above inductively, we see that we may write
$f^{(n)}(x) = \frac{p_n(x)}{q_n(x)} f(x)$. So $f^{(n)}(0) = 0$, as required.

Version: 2 Owner: ariels Author(s): ariels

412.5 power series

A power series is a series of the form

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k,$$

with ak , x0 ∈ R or ∈ C. The ak are called the coefficients and x0 the center of the power
series. Where it converges it defines a function, which can thus be represented by a power
series. This is what power series are usually used for. Every power series is convergent at
least at x = x0 where it converges to a0 . In addition it is absolutely convergent in the region
{x | |x − x0 | < r}, with

$$r = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}}.$$
It is divergent for every x with |x − x0 | > r. For |x − x0 | = r no general predictions can be
made. If r = ∞, the power series converges absolutely for every real or complex x. The real
number r is called the radius of convergence of the power series.

Examples of power series are:

• Taylor series, for example:

$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$

• The geometric series:

$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k,$$

with |x| < 1.

Power series have some important properties:

• If a power series converges for a z0 ∈ C then it also converges for all z ∈ C with
|z − x0 | < |z0 − x0 |.

• Also, if a power series diverges for some z0 ∈ C then it diverges for all z ∈ C with
|z − x0 | > |z0 − x0 |.

• For |x − x0 | < r, power series can be added by adding coefficients and multiplied in the
obvious way:

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k \cdot \sum_{j=0}^{\infty} b_j (x - x_0)^j = a_0 b_0 + (a_0 b_1 + a_1 b_0)(x - x_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0)(x - x_0)^2 + \cdots.$$

• (Uniqueness) If two power series are equal and their centers are the same, then their
coefficients must be equal.

• Power series can be termwise differentiated and integrated. These operations keep the
radius of convergence.

Version: 13 Owner: mathwizard Author(s): mathwizard, AxelBoldt

412.6 proof of radius of convergence

According to Cauchy’s root test a power series is absolutely convergent if

$$\limsup_{k \to \infty} \sqrt[k]{|a_k (x - x_0)^k|} = |x - x_0| \limsup_{k \to \infty} \sqrt[k]{|a_k|} < 1.$$

This is obviously true if

$$|x - x_0| < \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}} = \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|a_k|}}.$$

In the same way we see that the series is divergent if

$$|x - x_0| > \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|a_k|}},$$

which means that the right hand side is the radius of convergence of the power series.

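Now from the ratio test we see that the power series is absolutely convergent if

$$\lim_{k \to \infty} \left| \frac{a_{k+1} (x - x_0)^{k+1}}{a_k (x - x_0)^k} \right| = |x - x_0| \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| < 1.$$

Again this is true if

$$|x - x_0| < \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|.$$

The series is divergent if

$$|x - x_0| > \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|,$$

as follows from the ratio test in the same way. So we see that in this way too we can calculate
the radius of convergence, provided the limit exists.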

Version: 1 Owner: mathwizard Author(s): mathwizard

412.7 radius of convergence

To the power series

$$\sum_{k=0}^{\infty} a_k (x - x_0)^k \tag{412.7.1}$$

there exists a number r ∈ [0, ∞], its radius of convergence, such that the series converges absolutely
for all (real or complex) numbers x with |x − x0 | < r and diverges whenever |x − x0 | > r.
(For |x − x0 | = r no general statements can be made, except that there always exists at least
one complex number x with |x − x0 | = r such that the series diverges.)

The radius of convergence is given by:

$$r = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}} \tag{412.7.2}$$

and can also be computed as

$$r = \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|, \tag{412.7.3}$$

if this limit exists.
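
The two formulas can be explored numerically. The sketch below is a heuristic illustration only: it evaluates the expressions in (412.7.2) and (412.7.3) at a single large index k rather than taking an actual lim inf or limit, and the function names are invented for this example.

```python
import math


def root_estimate(a, k):
    """Value of 1 / |a_k|^(1/k), the quantity whose lim inf is r in (412.7.2)."""
    return 1.0 / abs(a(k)) ** (1.0 / k)


def ratio_estimate(a, k):
    """Value of |a_k / a_{k+1}|, the quantity whose limit is r in (412.7.3)."""
    return abs(a(k) / a(k + 1))


def geometric(k):
    """Coefficients a_k = 2^(-k); the series has radius of convergence 2."""
    return 2.0 ** (-k)


def exponential(k):
    """Coefficients a_k = 1/k! of the exponential series (r is infinite)."""
    return 1.0 / math.factorial(k)
```

For the `geometric` coefficients both estimates sit at 2 for every k, while for the `exponential` coefficients the ratio estimate equals k + 1 and grows without bound, in keeping with r = ∞.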

Version: 6 Owner: mathwizard Author(s): mathwizard, AxelBoldt

Chapter 413

30B50 – Dirichlet series and other
series expansions, exponential series

413.1 Dirichlet series

Let (λn )n≥1 be an increasing sequence of positive real numbers tending to ∞. A Dirichlet
series with exponents (λn ) is a series of the form
$$\sum_n a_n e^{-\lambda_n z}$$

where z and all the an are complex numbers.

An ordinary Dirichlet series is one having λn = log n for all n. It is written
$$\sum_n \frac{a_n}{n^z}.$$

The best-known examples are the Riemann zeta function (in which $a_n$ is the constant 1)
and the more general Dirichlet L-series (in which the mapping $n \mapsto a_n$ is multiplicative and
periodic).

When λn = n, the Dirichlet series is just a power series in the variable e^{−z}.

The following are the basic convergence properties of Dirichlet series. There is nothing
profound about their proofs, which can be found in [1] and in various other works on complex
analysis and analytic number theory.
Let $f(z) = \sum_n a_n e^{-\lambda_n z}$ be a Dirichlet series.

1. If f converges at z = z0 , then f converges uniformly in the region

$$\operatorname{Re}(z - z_0) \ge 0, \qquad -\alpha \le \arg(z - z_0) \le \alpha,$$

where α is any real number such that 0 < α < π/2. (Such a region is known as a
“Stolz angle”.)

2. Therefore, if f converges at z0 , its sum defines a holomorphic function on the region
Re(z) > Re(z0 ), and moreover f (z) → f (z0 ) as z → z0 within any Stolz angle.

3. f = 0 identically iff all the an are zero.

So, if f converges somewhere but not everywhere in C, then the domain of its convergence
is the region Re(z) > ρ for some real number ρ, which is called the abscissa of convergence
of the Dirichlet series. The abscissa of convergence of the series $\sum_n |a_n| e^{-\lambda_n z}$, if it
exists, is called the abscissa of absolute convergence of f .

Now suppose that the coefficients an are all real and ≥ 0. If the series f converges for
Re(z) > ρ, and the resulting function admits an analytic extension to a neighbourhood
of ρ, then the series f converges in a neighbourhood of ρ. Consequently, the domain of
convergence of f (unless it is the whole of C) is bounded by a singularity at a point on the
real axis.

Now let the $a_n$ be any complex numbers, but suppose $\lambda_n = \log n$, so
f is an ordinary Dirichlet series $\sum_n \frac{a_n}{n^z}$.

1. If the sequence (an ) is bounded, then f converges absolutely in the region Re(z) > 1.
2. If the partial sums $\sum_{n=k}^{l} a_n$ are bounded, then f converges (not necessarily absolutely)
in the region Re(z) > 0.

Reference:

[1] Serre, J.-P., A Course in Arithmetic, Chapter VI, Springer-Verlag, 1973.

Version: 2 Owner: bbukh Author(s): Larry Hammick

Chapter 414

30C15 – Zeros of polynomials,
rational functions, and other analytic
functions (e.g. zeros of functions with
bounded Dirichlet integral)

414.1 Mason-Stothers theorem

Mason’s theorem is often described as the polynomial case of the (currently unproven)
ABC conjecture.

Theorem 1 (Mason-Stothers). Let f (z), g(z), h(z) ∈ C[z] be such that f (z) + g(z) = h(z)
for all z, and such that f , g, and h are pair-wise relatively prime. Denote the number of
distinct roots of the product f gh(z) by N. Then

max deg{f, g, h} + 1 ≤ N.

Version: 1 Owner: mathcam Author(s): mathcam

414.2 zeroes of analytic functions are isolated

The zeroes of a non-constant analytic function on C are isolated. Let f be an analytic
function defined in some domain D ⊂ C and let f (z0 ) = 0 for some z0 ∈ D. Because f is
analytic, there is a Taylor series expansion for f around z0 which converges on an open disk
|z − z0 | < R. Write it as $f(z) = \sum_{n=k}^{\infty} a_n (z - z_0)^n$, with $a_k \neq 0$ and k > 0 ($a_k$ is the first
non-zero term). One can factor the series so that $f(z) = (z - z_0)^k \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n$ and
define $g(z) = \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n$ so that $f(z) = (z - z_0)^k g(z)$. Observe that g(z) is analytic
on |z − z0 | < R.

To show that z0 is an isolated zero of f , we must find ε > 0 so that f is non-zero on
0 < |z − z0 | < ε. It is enough to find ε > 0 so that g is non-zero on |z − z0 | < ε by the
relation $f(z) = (z - z_0)^k g(z)$. Because g(z) is analytic, it is continuous at z0 . Notice that
$g(z_0) = a_k \neq 0$, so there exists an ε > 0 so that for all z with |z − z0 | < ε it follows that
$|g(z) - a_k| < \frac{|a_k|}{2}$. This implies that g(z) is non-zero in this set.

Version: 5 Owner: brianbirgen Author(s): brianbirgen

Chapter 415

30C20 – Conformal mappings of
special domains

415.1 automorphisms of unit disk

All automorphisms of the complex unit disk ∆ = {z ∈ C : |z| < 1} can be written
in the form $f_a(z) = e^{i\theta}\, \frac{z - a}{1 - \bar{a} z}$, where a ∈ ∆ and $e^{i\theta} \in S^1$.

This map sends a to 0, $1/\bar{a}$ to ∞ and the unit circle to the unit circle.
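
A small numerical sketch of these maps follows, for illustration only. It assumes the standard form of a disk automorphism, $f_a(z) = e^{i\theta}(z - a)/(1 - \bar{a} z)$ with the complex conjugate of a in the denominator; the name `disk_automorphism` is made up for this example.

```python
import cmath


def disk_automorphism(a, theta):
    """Return the map z -> e^{i theta} (z - a) / (1 - conj(a) z), for |a| < 1."""
    if abs(a) >= 1:
        raise ValueError("a must lie in the open unit disk")
    rotation = cmath.exp(1j * theta)
    a_conj = complex(a).conjugate()

    def f(z):
        return rotation * (z - a) / (1 - a_conj * z)

    return f
```

With `f = disk_automorphism(0.3 + 0.4j, 0.7)`, the value `f(0.3 + 0.4j)` is 0, points of modulus 1 are sent to points of modulus 1 (up to rounding), and points of the open disk stay inside it.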

Version: 3 Owner: brianbirgen Author(s): brianbirgen

415.2 unit disk upper half plane conformal equivalence theorem

Theorem: There is a conformal map from ∆, the unit disk, to UHP , the upper half plane.

Proof: Define f : C → C, $f(z) = \frac{z - i}{z + i}$. Notice that $f^{-1}(w) = i\,\frac{1 + w}{1 - w}$ and that f (and therefore
$f^{-1}$) is a Möbius transformation.

Notice that f (0) = −1, $f(1) = \frac{1 - i}{1 + i} = -i$ and $f(-1) = \frac{-1 - i}{-1 + i} = i$. By the Möbius circle transformation theorem,
f takes the real axis to the unit circle. Since f (i) = 0, f maps UHP to ∆ and $f^{-1}$ : ∆ →
UHP .
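
The values used in the proof are easy to confirm numerically. The following sketch is an illustration only; the names `cayley` and `cayley_inverse` are chosen for this example and do not appear in the entry.

```python
def cayley(z):
    """The map f(z) = (z - i) / (z + i) from the proof (undefined at z = -i)."""
    return (z - 1j) / (z + 1j)


def cayley_inverse(w):
    """The inverse map f^{-1}(w) = i (1 + w) / (1 - w) (undefined at w = 1)."""
    return 1j * (1 + w) / (1 - w)
```

Real inputs land on the unit circle, a point with positive imaginary part such as `2 + 3j` lands inside the unit disk, and `cayley_inverse(cayley(z))` returns z up to rounding.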

Version: 3 Owner: brianbirgen Author(s): brianbirgen

Chapter 416

30C35 – General theory of conformal
mappings

416.1 proof of conformal mapping theorem

Let D ⊂ C be a domain, and let f : D → C be an analytic function. By identifying the
complex plane C with R^2, we can view f as a function from R^2 to itself:

$$\tilde{f}(x, y) := (\operatorname{Re} f(x + iy), \operatorname{Im} f(x + iy)) = (u(x, y), v(x, y))$$

with u and v real functions. The Jacobian matrix of $\tilde{f}$ is

$$J(x, y) = \frac{\partial(u, v)}{\partial(x, y)} = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}.$$

As an analytic function, f satisfies the Cauchy-Riemann equations, so that ux = vy and
uy = −vx . At a fixed point z = x + iy ∈ D, we can therefore define a = ux (x, y) = vy (x, y)
and $b = u_y(x, y) = -v_x(x, y)$. We write (a, b) in polar coordinates as (r cos θ, r sin θ) and get

$$J(x, y) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} = r \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$

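Now we consider two smooth curves through (x, y), which we parametrize by $\gamma_1(t) =
(u_1(t), v_1(t))$ and $\gamma_2(t) = (u_2(t), v_2(t))$. We can choose the parametrization such that
$\gamma_1(0) = \gamma_2(0) = z$. The images of these curves under $\tilde{f}$ are $\tilde{f} \circ \gamma_1$ and $\tilde{f} \circ \gamma_2$, respectively,
and their derivatives at t = 0 are

$$(\tilde{f} \circ \gamma_1)'(0) = \frac{\partial(u, v)}{\partial(x, y)}(\gamma_1(0)) \cdot \frac{d\gamma_1}{dt}(0) = J(x, y) \begin{pmatrix} \frac{du_1}{dt} \\ \frac{dv_1}{dt} \end{pmatrix}$$

and, similarly,

$$(\tilde{f} \circ \gamma_2)'(0) = J(x, y) \begin{pmatrix} \frac{du_2}{dt} \\ \frac{dv_2}{dt} \end{pmatrix}$$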

by the chain rule. We see that if f′(z) ≠ 0, f transforms the tangent vectors to γ1 and γ2 at
t = 0 (and therefore in z) by the orthogonal matrix

$$J/r = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$

and scales them by a factor of r. In particular, the transformation by an orthogonal matrix
implies that the angle between the tangent vectors is preserved. Since the determinant of
J/r is 1, the transformation also preserves orientation (the direction of the angle between
the tangent vectors). We conclude that f is a conformal mapping.
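
The angle-preservation argument can be illustrated numerically. The sketch below is not part of the proof: it approximates the image tangent vectors by central differences (an assumption of this example, with an arbitrarily chosen step h) and compares the signed angle between two directions before and after applying a map. The function names are invented for the illustration.

```python
import cmath


def image_tangent(f, z, direction, h=1e-6):
    """Central-difference approximation to the tangent at f(z) of the image
    under f of a curve passing through z with tangent vector `direction`."""
    return (f(z + h * direction) - f(z - h * direction)) / (2 * h)


def signed_angle(v1, v2):
    """Signed angle (in radians) from the vector v1 to the vector v2,
    with vectors in the plane written as complex numbers."""
    return cmath.phase(v2 / v1)


def angle_after(f, z, d1, d2):
    """Signed angle between the images under f of the directions d1, d2 at z."""
    return signed_angle(image_tangent(f, z, d1), image_tangent(f, z, d2))
```

For an analytic map with non-zero derivative at z, such as `lambda z: z**2 + cmath.exp(z)` at `1 + 0.5j`, `angle_after` reproduces the original angle. For complex conjugation, which is not analytic, the angle changes sign, reflecting the reversal of orientation.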

Version: 3 Owner: pbruin Author(s): pbruin

Chapter 417

30C80 – Maximum principle;
Schwarz’s lemma, Lindelöf principle,
analogues and generalizations;
subordination

417.1 Schwarz lemma

Let ∆ = {z : |z| < 1} be the open unit disk in the complex plane C. Let f : ∆ → ∆ be
a holomorphic function with f (0) = 0. Then |f (z)| ≤ |z| for all z ∈ ∆, and |f′(0)| ≤ 1. If
equality |f (z)| = |z| holds for some z ≠ 0 or |f′(0)| = 1, then f is a rotation: f (z) = az with
|a| = 1.

This lemma is less celebrated than the bigger guns (such as the Riemann mapping theorem,
which it helps prove); however, it is one of the simplest results capturing the “rigidity” of
holomorphic functions. No similar result exists for real functions, of course.

Version: 2 Owner: ariels Author(s): ariels

417.2 maximum principle
Maximum principle Let f : U → R (where U ⊆ R^d ) be a harmonic function. Then f
attains its extremal values on any compact K ⊆ U on the boundary ∂K of K. If f
attains an extremal value anywhere inside int K, then it is constant.
Maximal modulus principle Let f : U → C (where U ⊆ C) be a holomorphic function.
Then |f | attains its maximal value on any compact K ⊆ U on the boundary ∂K of K.
If |f | attains its maximal value anywhere inside int K, then it is constant.

Version: 1 Owner: ariels Author(s): ariels

417.3 proof of Schwarz lemma

Define g(z) = f (z)/z for z ≠ 0 and g(0) = f′(0). Since f (0) = 0, the singularity of f (z)/z at 0
is removable, so g : ∆ → C is a holomorphic function. The Schwarz lemma is
just an application of the maximal modulus principle to g.

For any 1 > ε > 0, by the maximal modulus principle |g| must attain its maximum on the
closed disk {z : |z| ≤ 1 − ε} at its boundary {z : |z| = 1 − ε}, say at some point $z_\varepsilon$. But then
$|g(z)| \le |g(z_\varepsilon)| \le \frac{1}{1 - \varepsilon}$ for any |z| ≤ 1 − ε. Taking an infimum as ε → 0, we see that values
of g are bounded: |g(z)| ≤ 1.

Thus |f (z)| ≤ |z|. Additionally, f′(0) = g(0), so we see that |f′(0)| = |g(0)| ≤ 1. This is the
first part of the lemma.

Now suppose, as per the premise of the second part of the lemma, that |g(w)| = 1 for some
w ∈ ∆. For any r with |w| < r < 1, |g| attains its maximal modulus (1) inside the
disk {z : |z| ≤ r}, and it follows that g must be constant inside the entire open disk ∆. So
g(z) ≡ a for a = g(w) of size 1, and f (z) = az, as required.

Version: 2 Owner: ariels Author(s): ariels

Chapter 418

30D20 – Entire functions, general
theory

418.1 Liouville’s theorem

A bounded entire function is constant. That is, a bounded complex function f : C → C
which is holomorphic on the entire complex plane is always a constant function.

More generally, any holomorphic function f : C → C which satisfies a polynomial bound
condition of the form
|f (z)| < c · |z|^n
for some c ∈ R, n ∈ Z, and all z ∈ C with |z| sufficiently large is necessarily equal to a
polynomial function.

Liouville’s theorem is a vivid example of how stringent the holomorphicity condition on
a complex function really is. One has only to compare the theorem to the corresponding
statement for real functions (namely, that a bounded differentiable real function is constant,
a patently false statement) to see how much stronger the complex differentiability condition
is compared to real differentiability.

Applications of Liouville’s theorem include proofs of the fundamental theorem of algebra and
of the partial fraction decomposition theorem for rational functions.

Version: 4 Owner: djao Author(s): djao

418.2 Morera’s theorem

Morera’s theorem provides the converse of Cauchy’s integral theorem.

Theorem [1] Suppose G is a region in C, and f : G → C is a continuous function. If for
every closed triangle ∆ in G, we have

$$\int_{\partial\Delta} f\, dz = 0,$$

then f is analytic on G. (Here, ∂∆ is the piecewise linear boundary of ∆.)

In particular, if for every rectifiable closed curve Γ in G, we have $\int_\Gamma f\, dz = 0$, then f is
analytic on G. Proofs of this can be found in [2, 3].

REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
3. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.

Version: 7 Owner: matte Author(s): matte, drini, nerdy2

418.3 entire

A function f : C −→ C is entire if it is holomorphic.

Version: 2 Owner: djao Author(s): djao

418.4 holomorphic

Let U ⊂ C be a domain in the complex numbers. A function f : U −→ C is holomorphic if
f has a complex derivative at every point of U, i.e. if

$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$$

exists for all z0 ∈ U.

Version: 5 Owner: djao Author(s): djao, rmilson

418.5 proof of Liouville’s theorem

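Let f : C → C be a bounded, entire function. Then by Taylor’s Theorem,

$$f(z) = \sum_{n=0}^{\infty} c_n z^n \quad \text{where} \quad c_n = \frac{1}{2\pi i} \int_{\Gamma_r} \frac{f(w)}{w^{n+1}}\, dw,$$

where $\Gamma_r$ is the circle of radius r about 0, for r > 0. Then $c_n$ can be estimated as

$$|c_n| \le \frac{1}{2\pi}\, \operatorname{length}(\Gamma_r)\, \sup\left\{ \left| \frac{f(w)}{w^{n+1}} \right| : w \in \Gamma_r \right\} = \frac{1}{2\pi}\, 2\pi r\, \frac{M_r}{r^{n+1}} = \frac{M_r}{r^n},$$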

where Mr = sup{|f (w)| : w ∈ Γr }.

But f is bounded, so there is M such that $M_r \le M$ for all r. Then $|c_n| \le \frac{M}{r^n}$ for all n and
all r > 0. But since r is arbitrary, letting r → ∞ gives $c_n = 0$ whenever n > 0. So $f(z) = c_0$ for all z,
so f is constant.

Version: 2 Owner: Evandar Author(s): Evandar

Chapter 419

30D30 – Meromorphic functions,
general theory

419.1 Casorati-Weierstrass theorem

Let U ⊂ C be a domain, a ∈ U, and let f : U \ {a} → C be holomorphic. Then a is an
essential singularity of f if and only if the image of any punctured neighborhood of a under
f is dense in C.

Version: 2 Owner: pbruin Author(s): pbruin

419.2 Mittag-Leffler’s theorem

Let G be an open subset of C, let {ak } be a sequence of distinct points in G which has no
limit point in G. For each k, let $A_{1k}, \ldots, A_{m_k k}$ be arbitrary complex coefficients, and define

$$S_k(z) = \sum_{j=1}^{m_k} \frac{A_{jk}}{(z - a_k)^j}.$$

Then there exists a meromorphic function f on G whose poles are exactly the points {ak }
and such that the singular part of f at ak is Sk (z), for each k.

Version: 1 Owner: Koro Author(s): Koro

419.3 Riemann’s removable singularity theorem

Let U ⊂ C be a domain, a ∈ U, and let f : U \ {a} → C be holomorphic. Then a is a
removable singularity of f if and only if

$$\lim_{z \to a} (z - a) f(z) = 0.$$

In particular, a is a removable singularity of f if f is bounded near a, i.e. if there is a
punctured neighborhood V of a and a real number M > 0 such that |f (z)| < M for all
z ∈ V.

Version: 1 Owner: pbruin Author(s): pbruin

419.4 essential singularity

Let U ⊂ C be a domain, a ∈ U, and let f : U \{a} → C be holomorphic. If the Laurent series
expansion of f (z) around a contains infinitely many terms with negative powers of z −a, then
a is said to be an essential singularity of f . Any singularity of f is a removable singularity,
a pole or an essential singularity.

If a is an essential singularity of f , then the image of any punctured neighborhood of a under
f is dense in C (the Casorati-Weierstrass theorem). In fact, an even stronger statement is
true: according to Picard’s theorem, the image of any punctured neighborhood of a is C,
with the possible exception of a single point.

Version: 4 Owner: pbruin Author(s): pbruin

419.5 meromorphic

Let U ⊂ C be a domain. A function f : U −→ C is meromorphic if f is holomorphic except
at an isolated set of poles.

It can be proven that if f is meromorphic then its set of poles does not have an accumulation point.

Version: 2 Owner: djao Author(s): djao

419.6 pole

Let U ⊂ C be a domain and let a ∈ C. A function f : U −→ C has a pole at a if it can
be represented by a Laurent series centered about a with only finitely many negative terms;
that is,

$$f(z) = \sum_{k=-n}^{\infty} c_k (z - a)^k$$

in some nonempty deleted neighborhood of a, for some n ∈ N.

Version: 2 Owner: djao Author(s): djao

419.7 proof of Casorati-Weierstrass theorem

Assume that a is an essential singularity of f . Let V ⊂ U be a punctured neighborhood of
a, and let λ ∈ C. We have to show that λ is a limit point of f (V ). Suppose it is not, then
there is an ε > 0 such that |f (z) − λ| > ε for all z ∈ V , and the function

$$g : V \to \mathbb{C}, \quad z \mapsto \frac{1}{f(z) - \lambda}$$

is bounded, since $|g(z)| = \frac{1}{|f(z) - \lambda|} < \varepsilon^{-1}$ for all z ∈ V . According to Riemann’s removable singularity theorem
this implies that a is a removable singularity of g, so that g can be extended to a holomorphic
function $\bar{g} : V \cup \{a\} \to \mathbb{C}$. Now

$$f(z) = \frac{1}{\bar{g}(z)} + \lambda$$

for z ≠ a, and a is either a removable singularity of f (if $\bar{g}(a) \neq 0$) or a pole of order n (if $\bar{g}$
has a zero of order n at a). This contradicts our assumption that a is an essential singularity,
which means that λ must be a limit point of f (V ). The argument holds for all λ ∈ C, so
f (V ) is dense in C for any punctured neighborhood V of a.

To prove the converse, assume that f (V ) is dense in C for any punctured neighborhood V
of a. If a is a removable singularity, then f is bounded near a, and if a is a pole, f (z) → ∞
as z → a. Either of these possibilities contradicts the assumption that the image of any
punctured neighborhood of a under f is dense in C, so a must be an essential singularity of
f.

Version: 1 Owner: pbruin Author(s): pbruin

419.8 proof of Riemann’s removable singularity theorem

Suppose that f is holomorphic on U \ {a} and limz→a (z − a)f (z) = 0. Let

$$f(z) = \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$

be the Laurent series of f centered at a. We will show that $c_k = 0$ for k < 0, so that f can
be holomorphically extended to all of U by defining $f(a) = c_0$.

For $n \in \mathbb{N}_0$, the residue of $(z - a)^n f(z)$ at a is

$$\operatorname{Res}((z - a)^n f(z), a) = \frac{1}{2\pi i} \lim_{\delta \to 0^+} \oint_{|z - a| = \delta} (z - a)^n f(z)\, dz.$$

This is equal to zero, because

$$\left| \oint_{|z - a| = \delta} (z - a)^n f(z)\, dz \right| \le 2\pi\delta \max_{|z - a| = \delta} |(z - a)^n f(z)| = 2\pi\delta^n \max_{|z - a| = \delta} |(z - a) f(z)|,$$

which, by our assumption, goes to zero as δ → 0. Since the residue of $(z - a)^n f(z)$ at a is
also equal to $c_{-n-1}$, the coefficients of all negative powers of z − a in the Laurent series vanish.

Conversely, if a is a removable singularity of f , then f can be expanded in a power series
centered at a, so that
lim (z − a)f (z) = 0
z→a

because the constant term in the power series of (z − a)f (z) is zero.

A corollary of this theorem is the following: if f is bounded near a, then

|(z − a)f (z)| ≤ |z − a|M

for some M > 0. This implies that (z − a)f (z) → 0 as z → a, so a is a removable singularity
of f .

Version: 1 Owner: pbruin Author(s): pbruin

419.9 residue

Let U ⊂ C be a domain and let f : U −→ C be a function represented by a Laurent series

$$f(z) := \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$

centered about a. The coefficient $c_{-1}$ of the above Laurent series is called the residue of f at a,
and denoted Res(f ; a).

Version: 2 Owner: djao Author(s): djao

419.10 simple pole

A simple pole is a pole of order 1. That is, a meromorphic function f has a simple pole at
x0 ∈ C if

$$f(z) = \frac{a}{z - x_0} + g(z),$$

where a ∈ C, a ≠ 0, and g is holomorphic at x0 .

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 420

30E20 – Integration, integrals of
Cauchy type, integral representations
of analytic functions

420.1 Cauchy integral formula

The formulas. Let $D = \{z \in \mathbb{C} : \|z - z_0\| < R\}$ be an open disk in the complex plane,
and let f (z) be a holomorphic¹ function defined on some open domain that contains D and
its boundary. Then, for every z ∈ D we have

$$f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$

$$f'(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta,$$

$$\vdots$$

$$f^{(n)}(z) = \frac{n!}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\, d\zeta.$$

Here C = ∂D is the corresponding circular boundary contour, oriented counterclockwise,
with the most obvious parameterization given by

$$\zeta = z_0 + R e^{it}, \qquad 0 \le t \le 2\pi.$$
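
The formulas lend themselves to a numerical check. The sketch below is an illustration under stated assumptions: it discretizes the circle with the parameterization given above using n equally spaced values of t (a plain Riemann sum, chosen here for simplicity), and the function name `cauchy_formula` and its parameters are invented for this example.

```python
import cmath
import math


def cauchy_formula(f, z, z0=0j, R=1.0, order=0, n=2000):
    """Approximate  order!/(2 pi i) * integral over |zeta - z0| = R of
    f(zeta) / (zeta - z)^(order + 1) d zeta,  for a point z with |z - z0| < R.

    The circle is parameterized as zeta = z0 + R e^{it}, 0 <= t <= 2 pi,
    and the integral is replaced by a sum over n equally spaced t values.
    """
    total = 0j
    dt = 2 * math.pi / n
    for j in range(n):
        t = j * dt
        zeta = z0 + R * cmath.exp(1j * t)
        dzeta = 1j * R * cmath.exp(1j * t) * dt  # zeta'(t) dt
        total += f(zeta) / (zeta - z) ** (order + 1) * dzeta
    return math.factorial(order) * total / (2j * math.pi)
```

For example, `cauchy_formula(cmath.exp, 0.3 + 0.2j)` is close to `cmath.exp(0.3 + 0.2j)`, and with `order=1` it is close to the same value again, since the exponential function is its own derivative.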

Discussion. The first of the above formulas underscores the “rigidity” of holomorphic
functions. Indeed, the values of the holomorphic function inside a disk D are completely
specified by its values on the boundary of the disk. The second formula is useful, because it
gives the derivative in terms of an integral, rather than as the outcome of a limit process.

¹ It is necessary to draw a distinction between holomorphic functions (those having a complex derivative)
and analytic functions (those representable by power series). The two concepts are, in fact, equivalent, but
the standard proof of this fact uses the Cauchy Integral Formula with the (apparently) weaker holomorphicity
hypothesis.

Generalization. The following technical generalization of the formula is needed for the
treatment of removable singularities. Let S be a finite subset of D, and suppose that f (z)
is holomorphic for all z ∉ S, but also that f (z) is bounded near all z ∈ S. Then, the above
formulas are valid for all z ∈ D \ S.

Using the Cauchy residue theorem, one can further generalize the integral formula to the
situation where D is any domain and C is any closed rectifiable curve in D; in this case, the
formula becomes

$$\eta(C, z)\, f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$

where η(C, z) denotes the winding number of C about z. It is valid for all points z ∈ D \ S which
are not on the curve C.

Version: 19 Owner: djao Author(s): djao, rmilson

420.2 Cauchy integral theorem

Theorem 10. Let U ⊂ C be an open, simply connected domain, and let f : U → C be a
function whose complex derivative, that is

$$\lim_{w \to z} \frac{f(w) - f(z)}{w - z},$$

exists for all z ∈ U. Then, the integral around every closed contour γ ⊂ U vanishes; in
symbols

$$\oint_\gamma f(z)\, dz = 0.$$

We also have the following, technically important generalization involving removable singularities.
Theorem 11. Let U ⊂ C be an open, simply connected domain, and S ⊂ U a finite subset.
Let f : U\S → C be a function whose complex derivative exists for all z ∈ U\S, and that
is bounded near all z ∈ S. Then, the integral around every closed contour γ ⊂ U\S that
avoids the exceptional points vanishes.

Cauchy’s theorem is an essential stepping stone in the theory of complex analysis. It is
required for the proof of the Cauchy integral formula, which in turn is required for the proof
that the existence of a complex derivative implies a power series representation.

The original version of the theorem, as stated by Cauchy in the early 1800s, requires that the
derivative $f'(z)$ exist and be continuous. The existence of $f'(z)$ implies the Cauchy-Riemann equations,
which in turn can be restated as the fact that the complex-valued differential f(z) dz is closed.
The original proof makes use of this fact, and calls on Green’s theorem to conclude that the
contour integral vanishes. The proof of Green’s theorem, however, involves an interchange
of order in a double integral, and this can only be justified if the integrand, which involves the
real and imaginary parts of $f'(z)$, is assumed to be continuous. To this day, many authors
prove the theorem this way, but fail to mention the continuity assumption.

In the latter part of the 19th century E. Goursat found a proof of the integral theorem that
merely required that $f'(z)$ exist. Continuity of the derivative, as well as the existence of all
higher derivatives, then follows as a consequence of the Cauchy integral formula. Not only
is Goursat’s version a sharper result, but it is also more elementary and self-contained, in
the sense that it does not require Green’s theorem. Goursat’s argument makes use of
rectangular contours (many authors use triangles though), but the extension to an arbitrary
simply connected domain is relatively straightforward.
Theorem 12 (Goursat). Let U be an open domain containing a rectangle
$$R = \{x + iy \in \mathbb{C} : a \le x \le b,\ c \le y \le d\}.$$
If the complex derivative of a function f : U → C exists at all points of U, then the contour
integral of f around the boundary of R vanishes; in symbols
$$\oint_{\partial R} f(z)\, dz = 0.$$

Bibliography.

• L. V. Ahlfors, “Complex Analysis”.

Version: 7 Owner: rmilson Author(s): rmilson

420.3 Cauchy residue theorem

Let U ⊂ C be a simply connected domain, and suppose f is a complex valued function
which is defined and analytic on all but finitely many points a1 , . . . , am of U. Let C be a
closed curve in U which does not intersect any of the ai . Then
$$\oint_C f(z)\, dz = 2\pi i \sum_{i=1}^{m} \eta(C, a_i)\, \operatorname{Res}(f; a_i),$$
where
$$\eta(C, a_i) := \frac{1}{2\pi i} \oint_C \frac{dz}{z - a_i}$$
is the winding number of C about $a_i$, and $\operatorname{Res}(f; a_i)$ denotes the residue of f at $a_i$.
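
As an illustration, both sides of the residue formula can be compared numerically. This is only a sketch under stated assumptions: the example function f(z) = 1/(z(z − 2)), its hand-computed residues, the circular contours, and the helper names are all choices made here and do not come from the entry.

```python
import cmath
import math


def circle_integral(g, center, radius, n=2000):
    """Trapezoid-rule approximation of the integral of g over the
    counterclockwise circle of the given center and radius."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += g(center + radius * e) * 1j * radius * e * (2 * math.pi / n)
    return total


def winding_number(a, center, radius):
    """eta(C, a) = (1 / (2 pi i)) * integral over C of dz / (z - a)."""
    return circle_integral(lambda z: 1 / (z - a), center, radius) / (2j * math.pi)


def f(z):
    # Assumed example: simple poles at 0 and 2.
    return 1 / (z * (z - 2))


# Residues computed by hand for this particular f.
residues = {0j: -0.5, 2 + 0j: 0.5}


def both_sides(center, radius):
    """Return (contour integral, 2 pi i * sum of winding number * residue)."""
    lhs = circle_integral(f, center, radius)
    rhs = 2j * math.pi * sum(
        winding_number(a, center, radius) * res for a, res in residues.items()
    )
    return lhs, rhs


lhs_small, rhs_small = both_sides(0j, 1.0)  # circle encloses only the pole at 0
lhs_large, rhs_large = both_sides(0j, 3.0)  # circle encloses both poles
print(lhs_small, rhs_small)
print(lhs_large, rhs_large)
```

For the small circle both sides should be close to −πi, and for the large circle both should be close to 0, since the two residues cancel.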

Version: 4 Owner: djao Author(s): djao, rmilson

420.4 Gauss’ mean value theorem

Let Ω be a domain in C and suppose f is an analytic function on Ω. Furthermore, let C be
a circle with center $z_0$ and radius r such that the closed disk bounded by C lies inside Ω.
Then $f(z_0)$ is the mean value of f along C, that is,
$$f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + r e^{i\theta})\, d\theta.$$
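
The mean value property is easy to test numerically. The sketch below assumes the analytic test function exp and, for contrast, the non-analytic function |z|²; both choices and the helper name `circle_mean` are illustrative assumptions.

```python
import cmath
import math


def circle_mean(f, z0, r, n=1000):
    """Average of f over n equally spaced points of the circle |z - z0| = r."""
    return sum(f(z0 + r * cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n


z0, r = 1 + 1j, 2.0

# Analytic case: the mean over the circle reproduces the value at the center.
analytic_mean = circle_mean(cmath.exp, z0, r)

# Non-analytic case: the mean of |z|^2 is |z0|^2 + r^2, not |z0|^2.
nonanalytic_mean = circle_mean(lambda z: abs(z) ** 2, z0, r)

print(abs(analytic_mean - cmath.exp(z0)))
print(nonanalytic_mean, abs(z0) ** 2)
```

The second comparison shows that analyticity is essential: for |z|² the circle average exceeds the central value by r².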

Version: 7 Owner: Johan Author(s): Johan

420.5 Möbius circle transformation theorem

Möbius transformations always transform circles into circles, where a straight line is
regarded as a circle passing through the point at infinity.

Version: 1 Owner: Johan Author(s): Johan

420.6 Möbius transformation cross-ratio preservation
theorem

A Möbius transformation f : z ↦ w preserves cross-ratios, i.e. if $w_k = f(z_k)$ for k = 1, . . . , 4, then

$$\frac{(w_1 - w_2)(w_3 - w_4)}{(w_1 - w_4)(w_3 - w_2)} = \frac{(z_1 - z_2)(z_3 - z_4)}{(z_1 - z_4)(z_3 - z_2)}$$
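
A quick numerical check of this invariance, as a sketch: the coefficients of the transformation and the four sample points below are arbitrary assumptions, and `mobius` and `cross_ratio` are hypothetical helper names.

```python
def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d), requiring ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    return lambda z: (a * z + b) / (c * z + d)


def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio in the convention used in the formula above."""
    return ((z1 - z2) * (z3 - z4)) / ((z1 - z4) * (z3 - z2))


# Assumed sample data: none of the points is sent to infinity by f.
f = mobius(2 + 1j, -1, 1j, 3)
zs = [0.5 + 1j, -2 + 0j, 1 - 1j, 2j]
ws = [f(z) for z in zs]

before = cross_ratio(*zs)
after = cross_ratio(*ws)
print(before, after)
```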

Version: 3 Owner: Johan Author(s): Johan

420.7 Rouché’s theorem

Let f, g be analytic on and inside a simple closed curve C. Suppose |f(z)| > |g(z)| on C.
Then f and f + g have the same number of zeros inside C, counted with multiplicity.
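
The statement can be illustrated by counting zeros with the argument principle, i.e. by integrating h′/h around C. The sketch below assumes the example f(z) = z⁵, g(z) = 3z + 1 on the circle |z| = 2, where |f| = 32 and |g| ≤ 7; the example and the helper name `count_zeros` are assumptions made here.

```python
import cmath
import math


def count_zeros(h, dh, radius, n=4000):
    """Approximate (1 / (2 pi i)) * integral of h'/h over |z| = radius,
    which counts the zeros of h inside the circle with multiplicity."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = radius * e
        total += dh(z) / h(z) * 1j * radius * e * (2 * math.pi / n)
    return total / (2j * math.pi)


f = lambda z: z ** 5
df = lambda z: 5 * z ** 4
g = lambda z: 3 * z + 1
h = lambda z: f(z) + g(z)
dh = lambda z: 5 * z ** 4 + 3

radius = 2.0
samples = [radius * cmath.exp(2j * math.pi * k / 360) for k in range(360)]
hypothesis_holds = all(abs(f(z)) > abs(g(z)) for z in samples)

zeros_f = count_zeros(f, df, radius)
zeros_sum = count_zeros(h, dh, radius)
print(hypothesis_holds, round(zeros_f.real), round(zeros_sum.real))
```

Both counts should round to 5: the five-fold zero of z⁵ at the origin corresponds to five zeros of z⁵ + 3z + 1 inside the circle.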

Version: 2 Owner: Johan Author(s): Johan

420.8 absolute convergence implies convergence for an
infinite product

If an infinite product is absolutely convergent then it is convergent.

Version: 2 Owner: Johan Author(s): Johan

420.9 absolute convergence of infinite product
An infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ is said to be absolutely convergent if
$\prod_{n=1}^{\infty} (1 + |a_n|)$ converges.

Version: 4 Owner: mathcam Author(s): mathcam, Johan

420.10 closed curve theorem

Let U ⊂ C be a simply connected domain, and suppose f : U → C is holomorphic. Then

$$\oint_C f(z)\, dz = 0$$

for any smooth closed curve C in U.

More generally, if U is any domain, and $C_1$ and $C_2$ are two homotopic smooth closed curves
in U, then
$$\oint_{C_1} f(z)\, dz = \oint_{C_2} f(z)\, dz$$
for any holomorphic function f : U → C.

Version: 3 Owner: djao Author(s): djao

420.11 conformal Möbius circle map theorem

Any conformal map that maps the interior of the unit disc onto itself is a Möbius transformation.

Version: 4 Owner: Johan Author(s): Johan

420.12 conformal mapping

A mapping f : C → C which preserves the size and orientation of the angles (at $z_0$) between
any two curves which intersect in a given point $z_0$ is said to be conformal at $z_0$. A mapping
that is conformal at every point in a domain D is said to be conformal in D.

Version: 4 Owner: Johan Author(s): Johan

420.13 conformal mapping theorem

Let f(z) be analytic in a domain D. Then it is conformal at any point z ∈ D where $f'(z) \neq 0$.

Version: 2 Owner: Johan Author(s): Johan

420.14 convergence/divergence for an infinite product
Consider $\prod_{n=1}^{\infty} p_n$. We say that this infinite product converges iff the finite products
$P_m = \prod_{n=1}^{m} p_n$ converge to a limit $P \neq 0$, or at most a finite number of factors
$p_{n_k}$, k = 1, . . . , K, are zero and the product of the remaining factors converges in this sense.
Otherwise the infinite product is called divergent.

Note: The infinite product vanishes only if a factor is zero.

Version: 6 Owner: Johan Author(s): Johan

420.15 example of conformal mapping

Consider the four curves A = {t}, B = {t + it}, C = {it} and D = {−t + it}, t ∈ [−10, 10].
Suppose there is a mapping f : C → C which maps A to D and B to C. Is f conformal
at $z_0 = 0$? The size of the angles between A and B at the point of intersection $z_0 = 0$ is
preserved, however the orientation is not. Therefore f is not conformal at $z_0 = 0$. Now
suppose there is a function g : C → C which maps A to C and B to D. In this case we
see not only that the size of the angles is preserved, but also the orientation. Therefore g is
conformal at $z_0 = 0$.
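
The angle comparison in this example can be made explicit by computing signed angles between tangent directions at t = 0. The sketch assumes that each map carries the parameterization by t of one curve onto that of its image, so that tangent directions correspond; the helper name `signed_angle` is hypothetical.

```python
import cmath
import math


def signed_angle(u, v):
    """Signed angle in (-pi, pi] of the rotation taking direction u to direction v."""
    return cmath.phase(v / u)


# Tangent directions at t = 0, i.e. derivatives of the parameterizations.
dA, dB, dC, dD = 1 + 0j, 1 + 1j, 1j, -1 + 1j

angle_AB = signed_angle(dA, dB)  # angle from A to B before mapping
angle_f = signed_angle(dD, dC)   # f sends A -> D and B -> C
angle_g = signed_angle(dC, dD)   # g sends A -> C and B -> D

print(math.degrees(angle_AB), math.degrees(angle_f), math.degrees(angle_g))
```

The angle from A to B and the angle under g have the same sign and size, while under f the size agrees but the sign flips, which is the loss of orientation described above.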

Version: 3 Owner: Johan Author(s): Johan

420.16 examples of infinite products

A classic example is the Riemann zeta function. For Re(z) > 1 we have
$$\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-z}}.$$

With the help of a Fourier series, or in other ways, one can prove this infinite product
expansion of the sine function:
$$\sin z = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2 \pi^2}\right) \tag{420.16.1}$$
where z is an arbitrary complex number. Taking the logarithmic derivative (a frequent move
in connection with infinite products) we get a decomposition of the cotangent into partial
fractions:
$$\pi \cot \pi z = \frac{1}{z} + \sum_{n=1}^{\infty} \left(\frac{1}{z + n} + \frac{1}{z - n}\right). \tag{420.16.2}$$
The equation (420.16.2), in turn, has some interesting uses, e.g. to get the Taylor expansion
of an Eisenstein series, or to evaluate ζ(2n) for positive integers n.
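
Both expansions can be sanity-checked with truncated products and sums. This is a rough sketch: the sample point z and the truncation length are arbitrary assumptions, and since the tails decay only like 1/N the agreement is limited to a few digits.

```python
import cmath
import math


def sine_product(z, terms=20000):
    """Truncation of z * prod_{n>=1} (1 - z^2 / (n^2 pi^2))."""
    p = z
    for n in range(1, terms + 1):
        p *= 1 - z * z / (n * n * math.pi ** 2)
    return p


def cot_partial_fractions(z, terms=20000):
    """Truncation of 1/z + sum_{n>=1} (1/(z+n) + 1/(z-n))."""
    s = 1 / z
    for n in range(1, terms + 1):
        s += 1 / (z + n) + 1 / (z - n)
    return s


z = 0.7 + 0.4j  # assumed sample point, not an integer
sine_error = abs(sine_product(z) - cmath.sin(z))
exact_cot = math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)
cot_error = abs(cot_partial_fractions(z) - exact_cot)
print(sine_error, cot_error)
```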

Version: 1 Owner: mathcam Author(s): Larry Hammick

420.17 link between infinite products and sums

Let
$$\prod_{k=1}^{\infty} p_k$$
be an infinite product such that $p_k > 0$ for all k. Then the infinite product converges if and
only if the infinite sum
$$\sum_{k=1}^{\infty} \log p_k$$
converges. Moreover
$$\prod_{k=1}^{\infty} p_k = \exp \sum_{k=1}^{\infty} \log p_k.$$

Proof.

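Simply notice that
$$\prod_{k=1}^{N} p_k = \exp \sum_{k=1}^{N} \log p_k.$$
If the infinite sum converges then, by continuity of the exponential,
$$\lim_{N \to \infty} \prod_{k=1}^{N} p_k = \lim_{N \to \infty} \exp \sum_{k=1}^{N} \log p_k = \exp \sum_{k=1}^{\infty} \log p_k$$
and so the infinite product converges. Conversely, if the infinite product converges to a limit P,
which is nonzero by the definition of convergence and hence positive, then by continuity of the
logarithm the partial sums $\sum_{k=1}^{N} \log p_k = \log \prod_{k=1}^{N} p_k$ converge to log P.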
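
A small numerical illustration of the identity, as a sketch: the factors $p_k = 1 + 1/k^2$ are an assumed example, whose product is known to equal sinh(π)/π by the sine product expansion evaluated at z = iπ.

```python
import math


def partial_product(p, n_terms):
    """Product of p(1), ..., p(n_terms), computed directly."""
    result = 1.0
    for k in range(1, n_terms + 1):
        result *= p(k)
    return result


def exp_of_log_sum(p, n_terms):
    """exp of the sum of log p(k) for k = 1, ..., n_terms."""
    return math.exp(sum(math.log(p(k)) for k in range(1, n_terms + 1)))


p = lambda k: 1 + 1 / k ** 2  # assumed example with positive factors
N = 10000

direct = partial_product(p, N)
via_logs = exp_of_log_sum(p, N)
limit = math.sinh(math.pi) / math.pi

print(direct, via_logs, limit)
```

The first two numbers agree to rounding error for every N, and both approach the third as N grows.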

Version: 1 Owner: paolini Author(s): paolini

420.18 proof of Cauchy integral formula

Let $D = \{z \in \mathbb{C} : |z - z_0| < R\}$ be a disk in the complex plane, S ⊂ D a finite subset, and
U ⊂ C an open domain that contains the closed disk $\bar{D}$. Suppose that

• f : U\S → C is holomorphic, and that

• f (z) is bounded on D\S.

Let z ∈ D\S be given, and set

$$g(\zeta) = \frac{f(\zeta) - f(z)}{\zeta - z}, \quad \zeta \in D \setminus S',$$
where $S' = S \cup \{z\}$. Note that $g(\zeta)$ is holomorphic and bounded on $D \setminus S'$. The second
assertion is true, because
$$g(\zeta) \to f'(z), \quad \text{as } \zeta \to z.$$
Therefore, by the Cauchy integral theorem
$$\oint_C g(\zeta)\, d\zeta = 0,$$

where C is the counterclockwise circular contour parameterized by

$$\zeta = z_0 + R e^{it}, \quad 0 \le t \le 2\pi.$$

Hence,
$$\oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \oint_C \frac{f(z)}{\zeta - z}\, d\zeta. \tag{420.18.1}$$

Lemma. If z ∈ C is such that $|z| \neq 1$, then
$$\oint_{|\zeta| = 1} \frac{d\zeta}{\zeta - z} = \begin{cases} 0 & \text{if } |z| > 1, \\ 2\pi i & \text{if } |z| < 1. \end{cases}$$

The proof is a fun exercise in elementary integral calculus, an application of the half-angle
trigonometric substitutions.
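
Short of doing the trigonometric substitution, the lemma can be checked numerically. The sketch below uses a trapezoid rule on the unit circle; the two sample points and the helper name `unit_circle_integral` are assumptions made for illustration.

```python
import cmath
import math


def unit_circle_integral(z, n=2000):
    """Trapezoid-rule approximation of the integral of d(zeta) / (zeta - z)
    over the counterclockwise unit circle."""
    total = 0j
    for k in range(n):
        zeta = cmath.exp(2j * math.pi * k / n)
        dzeta = 1j * zeta * (2 * math.pi / n)  # zeta'(t) * dt
        total += dzeta / (zeta - z)
    return total


inside = unit_circle_integral(0.5 + 0.2j)  # |z| < 1
outside = unit_circle_integral(2 - 1j)     # |z| > 1
print(inside, outside)
```

The first value should be close to 2πi and the second close to 0.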

Thanks to the Lemma, applied after rescaling C to the unit circle, the right hand side of
(420.18.1) evaluates to 2πi f(z). Dividing through by 2πi, we obtain
$$f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$
as desired.

Since a circle is a compact set, the defining limit for the derivative

$$\frac{d}{dz} \frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$

converges uniformly for ζ ∈ ∂D. Thanks to the uniform convergence, the order of the
derivative and the integral operations can be interchanged. In this way we obtain the second
formula:
$$f'(z) = \frac{1}{2\pi i} \frac{d}{dz} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta.$$

Version: 9 Owner: rmilson Author(s): rmilson, stawn

420.19 proof of Cauchy residue theorem

Since f is holomorphic, the Cauchy-Riemann equations imply that the differential form f(z) dz is closed.
So by the lemma about closed differential forms on a simply connected domain we know that
the integral $\oint_C f(z)\, dz$ is equal to $\oint_{C'} f(z)\, dz$ if $C'$ is any curve which is homotopic to C.
In particular we can consider a curve $C'$ which turns around the points $a_j$ along small circles
and joins these small circles with segments. Since the curve $C'$ follows each segment twice
with opposite orientations, it is enough to sum the integrals of f around the small circles.

So letting $z = a_j + \rho e^{i\theta}$ be a parameterization of the curve around the point $a_j$, we have
$dz = \rho i e^{i\theta}\, d\theta$ and hence
$$\oint_C f(z)\, dz = \oint_{C'} f(z)\, dz = \sum_j \eta(C, a_j) \oint_{\partial B_\rho(a_j)} f(z)\, dz
= \sum_j \eta(C, a_j) \int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho i e^{i\theta}\, d\theta$$

where ρ > 0 is chosen so small that the balls $B_\rho(a_j)$ are all disjoint and all contained in the
domain U. So by linearity, it is enough to prove that for all j

$$i \int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho e^{i\theta}\, d\theta = 2\pi i \operatorname{Res}(f, a_j).$$

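Now let j be fixed and consider the Laurent series for f at $a_j$:
$$f(z) = \sum_{k \in \mathbb{Z}} c_k (z - a_j)^k$$

so that $\operatorname{Res}(f, a_j) = c_{-1}$. We have
$$\int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho e^{i\theta}\, d\theta
= \int_0^{2\pi} \sum_k c_k (\rho e^{i\theta})^k \rho e^{i\theta}\, d\theta
= \sum_k \rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta.$$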

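Notice now that if k = −1 we have

$$\rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = c_{-1} \int_0^{2\pi} d\theta = 2\pi c_{-1} = 2\pi \operatorname{Res}(f, a_j)$$

while for $k \neq -1$ we have
$$\int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = \left[ \frac{e^{i(k+1)\theta}}{i(k+1)} \right]_0^{2\pi} = 0.$$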

Hence the result follows.
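
The key step, that only the k = −1 term of the Laurent series survives the integration, can be observed numerically. The sketch below assumes a finite Laurent expansion with $c_{-2} = 3$, $c_{-1} = 5$, $c_0 = 2$, $c_1 = 1$ about the point a = 1 + i; these values and the helper name `ring_integral` are illustrative assumptions.

```python
import cmath
import math


def ring_integral(f, a, rho, n=1000):
    """Approximate i * integral_0^{2 pi} f(a + rho e^{i theta}) rho e^{i theta} d(theta)."""
    total = 0j
    for m in range(n):
        e = cmath.exp(2j * math.pi * m / n)
        total += f(a + rho * e) * rho * e
    return 1j * total * (2 * math.pi / n)


a = 1 + 1j


def f(z):
    # Assumed Laurent expansion about a; the residue is the coefficient 5.
    w = z - a
    return 3 / w ** 2 + 5 / w + 2 + w


small = ring_integral(f, a, 0.5)
large = ring_integral(f, a, 2.0)
print(small, large, 2j * math.pi * 5)
```

Both radii should give approximately 2πi · 5, independent of ρ, as the computation in the proof predicts.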

Version: 2 Owner: paolini Author(s): paolini

420.20 proof of Gauss’ mean value theorem

We can parametrize the circle by letting $z = z_0 + r e^{i\phi}$. Then $dz = i r e^{i\phi}\, d\phi$. Using the
Cauchy integral formula we can express $f(z_0)$ in the following way:
$$f(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\, dz
= \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + r e^{i\phi})}{r e^{i\phi}}\, i r e^{i\phi}\, d\phi
= \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + r e^{i\phi})\, d\phi.$$

Version: 12 Owner: Johan Author(s): Johan

420.21 proof of Goursat’s theorem

We argue by contradiction. Set
$$\eta = \oint_{\partial R} f(z)\, dz,$$
and suppose that $\eta \neq 0$. Divide R into four congruent rectangles $R_1, R_2, R_3, R_4$ (see Figure
1), and set
$$\eta_i = \oint_{\partial R_i} f(z)\, dz.$$

Figure 1: subdivision of the rectangle contour.

Now subdivide each of the four sub-rectangles, to get 16 congruent sub-sub-rectangles
$R_{i_1 i_2}$, $i_1, i_2 = 1, \ldots, 4$, and then continue ad infinitum to obtain a sequence of nested
families of rectangles $R_{i_1 \ldots i_k}$, with $\eta_{i_1 \ldots i_k}$ the values of f(z) integrated along the
corresponding contour.

Orienting the boundary of R and all the sub-rectangles in the usual counter-clockwise fashion
we have
$$\eta = \eta_1 + \eta_2 + \eta_3 + \eta_4,$$
and more generally
$$\eta_{i_1 \ldots i_k} = \eta_{i_1 \ldots i_k 1} + \eta_{i_1 \ldots i_k 2} + \eta_{i_1 \ldots i_k 3} + \eta_{i_1 \ldots i_k 4}.$$
Inasmuch as the integrals along oppositely oriented line segments cancel, the contributions
from the interior segments cancel, and that is why the right-hand side reduces to the integrals
along the segments at the boundary of the composite rectangle.
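
The cancellation of interior segments does not depend on holomorphicity, which the following sketch makes visible. It assumes the rectangle [0, 2] × [0, 1] and two test integrands: the non-holomorphic conjugate function, whose boundary integral is 2i times the enclosed area, and the holomorphic function z². The helper names are hypothetical, and a midpoint rule stands in for the exact line integrals.

```python
def segment_integral(f, z_start, z_end, n=200):
    """Midpoint-rule approximation of the integral of f along a straight segment."""
    step = (z_end - z_start) / n
    return sum(f(z_start + (k + 0.5) * step) for k in range(n)) * step


def rectangle_integral(f, a, b, c, d):
    """Integral of f around the counterclockwise boundary of [a, b] x [c, d]."""
    corners = [complex(a, c), complex(b, c), complex(b, d), complex(a, d)]
    return sum(
        segment_integral(f, corners[i], corners[(i + 1) % 4]) for i in range(4)
    )


def quarter_integrals(f, a, b, c, d):
    """Boundary integrals over the four congruent sub-rectangles."""
    mx, my = (a + b) / 2, (c + d) / 2
    return [
        rectangle_integral(f, a, mx, c, my),
        rectangle_integral(f, mx, b, c, my),
        rectangle_integral(f, mx, b, my, d),
        rectangle_integral(f, a, mx, my, d),
    ]


conj = lambda z: z.conjugate()  # not holomorphic
square = lambda z: z * z        # holomorphic

eta = rectangle_integral(conj, 0, 2, 0, 1)
etas = quarter_integrals(conj, 0, 2, 0, 1)
eta_holomorphic = rectangle_integral(square, 0, 2, 0, 1)
print(eta, sum(etas), eta_holomorphic)
```

For the conjugate function the whole integral and the sum over the four quarters agree at the nonzero value 4i, while for z² the boundary integral vanishes, as Goursat's theorem asserts.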

Let $j_1 \in \{1, 2, 3, 4\}$ be such that $|\eta_{j_1}|$ is the maximum of $|\eta_i|$, i = 1, . . . , 4. By the triangle inequality
we have
$$|\eta_1| + |\eta_2| + |\eta_3| + |\eta_4| \ge |\eta|,$$
and hence
$$|\eta_{j_1}| \ge \tfrac{1}{4} |\eta|.$$
Continuing inductively, let $j_{k+1}$ be such that $|\eta_{j_1 \ldots j_k j_{k+1}}|$ is the maximum of $|\eta_{j_1 \ldots j_k i}|$,
i = 1, . . . , 4. We then have
$$|\eta_{j_1 \ldots j_k j_{k+1}}| \ge 4^{-(k+1)} |\eta|. \tag{420.21.1}$$

Now the sequence of nested rectangles $R_{j_1 \ldots j_k}$ converges to some point $z_0 \in R$; more formally

$$\{z_0\} = \bigcap_{k=1}^{\infty} R_{j_1 \ldots j_k}.$$

The derivative $f'(z_0)$ is assumed to exist, and hence for every ε > 0 there exists a k sufficiently
large, so that for all $z \in R_{j_1 \ldots j_k}$ we have
$$|f(z) - f(z_0) - f'(z_0)(z - z_0)| \le \epsilon\, |z - z_0|.$$
Now we make use of the following.
Lemma 9. Let Q ⊂ C be a rectangle, let a, b ∈ C, and let f(z) be a continuous, complex
valued function defined and bounded in a domain containing Q. Then,
$$\oint_{\partial Q} (az + b)\, dz = 0,$$
$$\left| \oint_{\partial Q} f(z)\, dz \right| \le MP,$$
where M is an upper bound for |f(z)| and where P is the length of ∂Q.

The first of these assertions follows by the fundamental theorem of calculus; after all the
function az + b has an anti-derivative. The second assertion follows from the fact that the
absolute value of an integral is no greater than the integral of the absolute value of the integrand,
a standard result in integration theory.

Using the lemma and the fact that the perimeter of a rectangle is greater than its diameter
we infer that for every ε > 0 there exists a k sufficiently large that
$$|\eta_{j_1 \ldots j_k}| = \left| \oint_{\partial R_{j_1 \ldots j_k}} f(z)\, dz \right| \le \epsilon\, |\partial R_{j_1 \ldots j_k}|^2 = 4^{-k} \epsilon\, |\partial R|^2,$$
where |∂R| denotes the length of the perimeter of the rectangle R. Since ε is arbitrary, choosing
$\epsilon < |\eta| / |\partial R|^2$ contradicts the earlier estimate (420.21.1). Therefore η = 0.

Version: 10 Owner: rmilson Author(s): rmilson

420.22 proof of Möbius circle transformation theorem

Case 1: f (z) = az + b.

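Case 1a: The points on $|z - C| = R$ can be written as $z = C + R e^{i\theta}$. They are mapped to
the points $w = aC + b + aR e^{i\theta}$, which all lie on the circle $|w - (aC + b)| = |a| R$.

Case 1b: The line $\operatorname{Re}(e^{i\theta} z) = k$ is mapped to the line
$\operatorname{Re}\left( \frac{e^{i\theta} w}{a} \right) = k + \operatorname{Re}\left( \frac{e^{i\theta} b}{a} \right)$.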

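Case 2: $f(z) = \frac{1}{z}$.

Case 2a: Consider a circle passing through the origin. This can be written as $|z - C| = |C|$.
This circle is mapped to the line $\operatorname{Re}(Cw) = \frac{1}{2}$, which does not pass through the origin. To
show this, write $z = C + |C| e^{i\theta}$, so that $w = \frac{1}{z} = \frac{1}{C + |C| e^{i\theta}}$. Then

$$\operatorname{Re}(Cw) = \frac{1}{2} \left( Cw + \overline{Cw} \right)
= \frac{1}{2} \left( \frac{C}{C + |C| e^{i\theta}} + \frac{\bar{C}}{\bar{C} + |C| e^{-i\theta}} \right)$$

$$= \frac{1}{2} \left( \frac{C}{C + |C| e^{i\theta}} + \frac{\bar{C}}{\bar{C} + |C| e^{-i\theta}} \cdot \frac{e^{i\theta} C / |C|}{e^{i\theta} C / |C|} \right)
= \frac{1}{2} \left( \frac{C}{C + |C| e^{i\theta}} + \frac{|C| e^{i\theta}}{|C| e^{i\theta} + C} \right) = \frac{1}{2}$$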
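
Case 2a can be spot-checked numerically. The sketch assumes the sample center C = 1 + 2i and a handful of angles θ chosen to avoid the point of the circle at the origin; both are arbitrary choices made here.

```python
import cmath

C = 1 + 2j  # assumed center; the circle |z - C| = |C| passes through 0

# Angles 0.1, 0.6, ..., 5.6 stay away from arg(-C), where z would be 0.
thetas = [0.1 + 0.5 * k for k in range(12)]
points = [C + abs(C) * cmath.exp(1j * t) for t in thetas]

# Image points under w = 1/z, and the quantity Re(C w) for each of them.
images = [1 / z for z in points]
real_parts = [(C * w).real for w in images]
print(real_parts)
```

Every printed value should equal 1/2 up to rounding, so the images all lie on the line Re(Cw) = 1/2.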

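Case 2b: Consider a line which does not pass through the origin. This can be written as
$\operatorname{Re}(az) = 1$ for $a \neq 0$. Then $az + \bar{a}\bar{z} = 2$, which is mapped to $\frac{a}{w} + \frac{\bar{a}}{\bar{w}} = 2$. This is simplified
as $a\bar{w} + \bar{a}w = 2w\bar{w}$, which becomes $(w - a/2)(\bar{w} - \bar{a}/2) = a\bar{a}/4$ or $\left| w - \frac{a}{2} \right| = \frac{|a|}{2}$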