Free Encyclopedia of Mathematics (0.0.1) – volume 2

Chapter 242 16-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
242.1 direct product of modules

Let {Xi : i ∈ I} be a collection of modules in some category of modules. Then the direct product ∏i∈I Xi of that collection is the module whose underlying set is the cartesian product of the Xi, with componentwise addition and scalar multiplication. For example, in a category of left modules: (xi ) + (yi ) = (xi + yi ) and r(xi ) = (rxi ). For each j ∈ I we have a projection pj : ∏i∈I Xi → Xj defined by (xi ) → xj , and an injection λj : Xj → ∏i∈I Xi , where an element xj of Xj maps to the element of ∏i∈I Xi whose jth term is xj and every other term is zero. The direct product ∏i∈I Xi satisfies the following universal property: if Y is a module and there exist homomorphisms fi : Y → Xi for all i ∈ I, then there exists a unique homomorphism φ : Y → ∏i∈I Xi satisfying pi φ = fi for all i ∈ I.

The direct product is often referred to as the complete direct sum, or the strong direct sum, or simply the product.

Compare this to the direct sum of modules. Version: 3 Owner: antizeus Author(s): antizeus

242.2 direct sum

Let {Xi : i ∈ I} be a collection of modules in some category of modules. Then the direct sum ⊕i∈I Xi of that collection is the submodule of the direct product of the Xi consisting of all elements (xi ) such that all but a finite number of the xi are zero. For each j ∈ I we have a projection pj : ⊕i∈I Xi → Xj defined by (xi ) → xj , and an injection λj : Xj → ⊕i∈I Xi , where an element xj of Xj maps to the element of ⊕i∈I Xi whose jth term is xj and every other term is zero. The direct sum ⊕i∈I Xi satisfies the following universal property: if Y is a module and there exist homomorphisms fi : Xi → Y for all i ∈ I, then there exists a unique homomorphism φ : ⊕i∈I Xi → Y satisfying φλi = fi for all i ∈ I.

The direct sum is often referred to as the weak direct sum or simply the sum. Compare this to the direct product of modules. Version: 3 Owner: antizeus Author(s): antizeus
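
The distinction between sum and product only matters for infinite index sets: an element of the product is an arbitrary family (xi ), while an element of the sum has finite support. The following sketch models the direct sum of copies of Z; the dict-based finite-support representation is our own illustrative device, not part of the entry.

```python
# Elements of a direct sum of copies of Z, indexed by an arbitrary set I,
# represented as dicts with finite support: missing keys are implicitly zero.
# This models the "all but a finite number of the x_i are zero" condition.

def add(x, y):
    """Componentwise addition (x_i) + (y_i) = (x_i + y_i)."""
    result = dict(x)
    for i, v in y.items():
        result[i] = result.get(i, 0) + v
        if result[i] == 0:
            del result[i]  # keep the support finite and the form canonical
    return result

def scale(r, x):
    """Scalar multiplication r(x_i) = (r x_i) for r in Z."""
    return {i: r * v for i, v in x.items() if r * v != 0}

def injection(j, xj):
    """lambda_j : X_j -> direct sum; j-th term x_j, every other term zero."""
    return {j: xj} if xj != 0 else {}

def projection(j, x):
    """p_j : direct sum -> X_j, (x_i) -> x_j."""
    return x.get(j, 0)
```

Note that the composite p_j λ_j is the identity on X_j, while p_k λ_j = 0 for k ≠ j.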

242.3 exact sequence

If we have two homomorphisms f : A → B and g : B → C in some category of modules, then we say that f and g are exact at B if the image of f is equal to the kernel of g. A sequence of homomorphisms

· · · → An+1 −fn+1→ An −fn→ An−1 → · · ·

is said to be exact if each pair of adjacent homomorphisms (fn+1 , fn ) is exact – in other words, if im fn+1 = ker fn for all n. Compare this to the notion of a chain complex. Version: 2 Owner: antizeus Author(s): antizeus
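
As a concrete illustration (not from the entry), take f : Z/2 → Z/4, x → 2x, and g : Z/4 → Z/2, x → x mod 2. The pair is exact at Z/4, which can be verified by enumeration:

```python
# Exactness at Z/4 for f : Z/2 -> Z/4, x -> 2x  and  g : Z/4 -> Z/2, x -> x mod 2:
# check that the image of f equals the kernel of g.

def f(x):          # homomorphism Z/2 -> Z/4
    return (2 * x) % 4

def g(x):          # homomorphism Z/4 -> Z/2
    return x % 2

image_f = {f(x) for x in range(2)}
kernel_g = {x for x in range(4) if g(x) == 0}
exact_at_B = (image_f == kernel_g)
```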

242.4 quotient ring

Definition. Let R be a ring and let I be a two-sided ideal of R. To define the quotient ring R/I, let us first define an equivalence relation in R. We say that the elements a, b ∈ R are equivalent, written as a ∼ b, if and only if a − b ∈ I. If a is an element of R, we denote the corresponding equivalence class by [a]. Thus [a] = [b] if and only if a − b ∈ I. The quotient ring of R modulo I is the set R/I = {[a] | a ∈ R}, with a ring structure defined as follows. If [a], [b] are equivalence classes in R/I, then • [a] + [b] := [a + b], • [a] · [b] := [a · b]. Here a and b are some elements in R that represent [a] and [b]. By construction, every element in R/I has such a representative in R. Moreover, since I is closed under addition and multiplication, one can verify that the ring structure in R/I is well defined. Properties. 1. If R is commutative, then R/I is commutative. Examples. 1. For any ring R, we have that R/R = {0} and R/{0} = R. 2. Let R = Z, and let I be the set of even numbers. Then R/I contains only two classes: one for even numbers, and one for odd numbers. Version: 3 Owner: matte Author(s): matte, djao
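
The well-definedness claim can be checked mechanically for R = Z and I = nZ. In this illustrative sketch (our own, not from the entry) each class [a] is represented by the canonical representative a mod n:

```python
# Quotient ring Z/nZ: classes [a] represented by a % n.  The point of the
# well-definedness check is that [a] + [b] and [a] * [b] do not depend on
# which representatives a, b are chosen from their classes.

N = 6

def cls(a):
    return a % N          # canonical representative of the class [a]

def add_cls(a, b):
    return cls(a + b)     # [a] + [b] := [a + b]

def mul_cls(a, b):
    return cls(a * b)     # [a] * [b] := [a * b]

# well-definedness: replacing a by a + N (the same class) gives the same answer
well_defined = all(
    add_cls(a, b) == add_cls(a + N, b) and mul_cls(a, b) == mul_cls(a + N, b)
    for a in range(N) for b in range(N)
)
```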


Chapter 243 16D10 – General module theory
243.1 annihilator

Let R be a ring. Suppose that M is a left R-module. If X is a subset of M, then we define the left annihilator of X in R: l.ann(X) = {r ∈ R | rx = 0 for all x ∈ X}. If Z is a subset of R, then we define the right annihilator of Z in M: r.annM (Z) = {m ∈ M | zm = 0 for all z ∈ Z}. Suppose that N is a right R-module. If Y is a subset of N, then we define the right annihilator of Y in R: r.ann(Y ) = {r ∈ R | yr = 0 for all y ∈ Y }. If Z is a subset of R, then we define the left annihilator of Z in N: l.annN (Z) = {n ∈ N | nz = 0 for all z ∈ Z}. Version: 3 Owner: antizeus Author(s): antizeus
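
For example, the annihilator in Z of the Z-module M = Z/6 ⊕ Z/4 is the ideal 12Z, since r must be divisible by both 6 and 4. A brute-force check (our own illustration):

```python
# Annihilator of M = Z/6 ⊕ Z/4 in R = Z: the integers r with r*m = 0 for
# every m in M.  Expected: the ideal 12Z, because lcm(6, 4) = 12.

M = [(a, b) for a in range(6) for b in range(4)]   # elements of Z/6 ⊕ Z/4

def annihilates(r):
    """True if r kills every element of M."""
    return all((r * a) % 6 == 0 and (r * b) % 4 == 0 for a, b in M)

ann_in_window = [r for r in range(1, 25) if annihilates(r)]
```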

243.2 annihilator is an ideal

The right annihilator of a right R-module MR in R is an ideal.

By the distributive law for modules, it is easy to see that r. ann(MR ) is closed under addition and right multiplication. Now take x ∈ r. ann(MR ) and r ∈ R. Take any m ∈ MR . Then mr ∈ MR , but then (mr)x = 0 since x ∈ r. ann(MR ). So m(rx) = 0 and rx ∈ r. ann(MR ). An equivalent result holds for left annihilators. Version: 2 Owner: saforres Author(s): saforres

243.3 artinian

A module M is artinian if it satisfies the following equivalent conditions: • the descending chain condition holds for submodules of M; • every nonempty family of submodules of M has a minimal element. A ring R is left artinian if it is artinian as a left module over itself (i.e. if R R is an artinian module), and right artinian if it is artinian as a right module over itself (i.e. if RR is an artinian module), and simply artinian if both conditions hold. Version: 3 Owner: antizeus Author(s): antizeus

243.4 composition series

Let R be a ring and let M be a (right or left) R-module. A series of submodules M = M0 ⊃ M1 ⊃ M2 ⊃ · · · ⊃ Mn = 0 in which each quotient Mi /Mi+1 is simple is called a composition series for M. A module need not have a composition series. For example, the ring of integers, Z, considered as a module over itself, does not have a composition series. A necessary and sufficient condition for a module to have a composition series is that it is both noetherian and artinian. If a module does have a composition series, then all of its composition series have the same length. This length (the number n above) is called the composition length of the module. If R is a semisimple Artinian ring, then RR and R R always have composition series. Version: 1 Owner: mclase Author(s): mclase
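
For M = Z/n over Z the simple subquotients are the Z/p, so the composition length is the number of prime factors of n counted with multiplicity; for instance Z/12 ⊃ 2Z/12 ⊃ 4Z/12 ⊃ 0 has quotients Z/2, Z/2, Z/3, giving length 3. A quick computation (our own illustration):

```python
# Composition length of Z/n as a Z-module: the simple Z-modules are the Z/p,
# so the length equals the number of prime factors of n with multiplicity.

def composition_length(n):
    length = 0
    p = 2
    while n > 1:
        while n % p == 0:
            n //= p
            length += 1   # one simple quotient Z/p per prime factor
        p += 1
    return length
```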

243.5 conjugate module

If M is a right module over a ring R, and α is an endomorphism of R, we define the conjugate module M α to be the right R-module whose underlying set is {mα | m ∈ M}, with abelian group structure identical to that of M (i.e. (m − n)α = mα − nα ), and scalar multiplication given by mα · r = (m · α(r))α for all m in M and r in R. In other words, if φ : R → EndZ (M) is the ring homomorphism that describes the right module action of R upon M, then φα describes the right module action of R upon M α . If N is a left R-module, we define α N similarly, with r · α n = α (α(r) · n). Version: 4 Owner: antizeus Author(s): antizeus

243.6 modular law

Let R M be a left R-module with submodules A, B, C, and suppose C ⊆ B. Then C + (B ∩ A) = B ∩ (C + A)

Version: 1 Owner: saforres Author(s): saforres
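
The identity can be checked numerically inside the Z-module Z/24, whose submodules are the sets of multiples of a divisor (an illustrative script of our own):

```python
# Check the modular law C + (B ∩ A) = B ∩ (C + A), with C ⊆ B, inside Z/24.
# Submodules of Z/24 are the sets of multiples of a divisor d.

N = 24

def submodule(d):
    """The submodule dZ/24: multiples of d in Z/24."""
    return {x for x in range(N) if x % d == 0}

def plus(X, Y):
    """Sum of two submodules: all pairwise sums."""
    return {(x + y) % N for x in X for y in Y}

A, B, C = submodule(3), submodule(2), submodule(4)  # C = 4Z/24 ⊆ 2Z/24 = B
lhs = plus(C, B & A)        # C + (B ∩ A)
rhs = B & plus(C, A)        # B ∩ (C + A)
```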

243.7 module

Let R be a ring, and let M be an abelian group. We say that M is a left R-module if there exists a ring homomorphism φ : R → EndZ (M) from R to the ring of abelian group endomorphisms on M (in which multiplication of endomorphisms is composition, using left function notation). We typically denote this function using a multiplication notation: [φ(r)](m) = r · m = rm. This ring homomorphism defines what is called a left module action of R upon M. If R is a unital ring (i.e. a ring with identity), then we typically demand that the ring homomorphism map the unit 1 ∈ R to the identity endomorphism on M, so that 1 · m = m for all m ∈ M. In this case we may say that the module is unital. Typically the abelian group structure on M is expressed in additive terms, i.e. with operator +, identity element 0M (or just 0), and inverses written in the form −m for m ∈ M.

Right module actions are defined similarly, only with the elements of R being written on the right sides of elements of M. In this case we either need to use an anti-homomorphism R → EndZ (M), or switch to right notation for writing functions. Version: 7 Owner: antizeus Author(s): antizeus

243.8 proof of modular law

First we show C + (B ∩ A) ⊆ B ∩ (C + A): Note that C ⊆ B and B ∩ A ⊆ B, and therefore C + (B ∩ A) ⊆ B. Further, C ⊆ C + A and B ∩ A ⊆ C + A, thus C + (B ∩ A) ⊆ C + A. Next we show B ∩ (C + A) ⊆ C + (B ∩ A): Let b ∈ B ∩ (C + A). Then b = c + a for some c ∈ C and a ∈ A. Hence a = b − c, and so a ∈ B since b ∈ B and c ∈ C ⊆ B. Hence a ∈ B ∩ A, so b = c + a ∈ C + (B ∩ A). Version: 5 Owner: saforres Author(s): saforres

243.9 zero module

Let R be a ring. The abelian group which contains only an identity element (zero) gains a trivial R-module structure, which we call the zero module. Every R-module M has a zero element and thus a submodule consisting of that element. This is called the zero submodule of M. Version: 2 Owner: antizeus Author(s): antizeus


Chapter 244 16D20 – Bimodules
244.1 bimodule

Suppose that R and S are rings. An (R, S)-bimodule is an abelian group M which has a left R-module action as well as a right S-module action, which satisfy the relation r(ms) = (rm)s for every choice of elements r of R, s of S, and m of M. An (R, S)-sub-bimodule of M is a subgroup which is also a left R-submodule and a right S-submodule. Version: 3 Owner: antizeus Author(s): antizeus


Chapter 245 16D25 – Ideals
245.1 associated prime

Let R be a ring, and let M be an R-module. A prime ideal P of R is an annihilator prime for M if P is equal to the annihilator of some nonzero submodule X of M. Note that if this is the case, then the module r.annM (P ) contains X, has P as its annihilator, and is a faithful (R/P )-module. If, in addition, P is equal to the annihilator of a submodule of M that is a fully faithful (R/P )-module, then we call P an associated prime of M. Version: 2 Owner: antizeus Author(s): antizeus

245.2 nilpotent ideal

A left (right) ideal I of a ring R is a nilpotent ideal if I n = 0 for some positive integer n. Here I n denotes a product of ideals – I · I · · · I. Version: 2 Owner: antizeus Author(s): antizeus

245.3 primitive ideal

Let R be a ring, and let I be an ideal of R. We say that I is a left (right) primitive ideal if there exists a simple left (right) R-module X such that I is the annihilator of X in R. We say that R is a left (right) primitive ring if the zero ideal is a left (right) primitive ideal

of R. Note that I is a left (right) primitive ideal if and only if R/I is a left (right) primitive ring. Version: 2 Owner: antizeus Author(s): antizeus

245.4 product of ideals

Let R be a ring, and let A and B be left (right) ideals of R. Then the product of the ideals A and B, which we denote AB, is the left (right) ideal generated by the products {ab | a ∈ A, b ∈ B}. Version: 2 Owner: antizeus Author(s): antizeus

245.5 proper ideal

Suppose R is a ring and I is an ideal of R. We say that I is a proper ideal if I is not equal to R. Version: 2 Owner: antizeus Author(s): antizeus

245.6 semiprime ideal

Let R be a ring. An ideal I of R is a semiprime ideal if it satisfies the following equivalent conditions: (a) I can be expressed as an intersection of prime ideals of R; (b) if x ∈ R, and xRx ⊂ I, then x ∈ I; (c) if J is a two-sided ideal of R and J 2 ⊂ I, then J ⊂ I as well; (d) if J is a left ideal of R and J 2 ⊂ I, then J ⊂ I as well; (e) if J is a right ideal of R and J 2 ⊂ I, then J ⊂ I as well. Here J 2 is the product of ideals J · J. The ring R itself satisfies all of these conditions (including being expressed as an intersection of an empty family of prime ideals) and is thus semiprime. A ring R is said to be a semiprime ring if its zero ideal is a semiprime ideal.

Note that an ideal I of R is semiprime if and only if the quotient ring R/I is a semiprime ring. Version: 7 Owner: antizeus Author(s): antizeus
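
In Z/m the zero ideal is semiprime exactly when m is squarefree, and condition (b) can be tested directly by enumeration (an illustration of ours, not from the entry):

```python
# Condition (b) for the zero ideal of R = Z/m: semiprime means that
# xRx ⊆ (0) forces x = 0.  This holds exactly when m is squarefree.

def zero_ideal_is_semiprime(m):
    for x in range(m):
        if all((x * r * x) % m == 0 for r in range(m)):   # xRx ⊆ (0)
            if x % m != 0:                                # ... but x ∉ (0)
                return False
    return True
```

For m = 12 the element x = 6 satisfies 6·r·6 ≡ 0 (mod 12) for every r yet is nonzero, so the zero ideal of Z/12 is not semiprime.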

245.7 zero ideal

In any ring, the set consisting only of the zero element (i.e. the additive identity) is an ideal of the left, right, and two-sided varieties. It is the smallest ideal in any ring. Version: 2 Owner: antizeus Author(s): antizeus


Chapter 246 16D40 – Free, projective, and flat modules and ideals
246.1 finitely generated projective module

Let R be a unital ring. A finitely generated projective right R-module is of the form eRn , n ∈ N, where e is an idempotent in EndR (Rn ). Let A be a unital C ∗ -algebra and p be a projection in EndA (An ), n ∈ N. Then, E = pAn is a finitely generated projective right A-module. Further, E is a pre-Hilbert A-module with (A-valued) inner product

⟨u, v⟩ = ∑_{i=1}^{n} u_i∗ v_i ,  for u, v ∈ E.
Version: 3 Owner: mhale Author(s): mhale

246.2 flat module

A right module M over a ring R is flat if the tensor product functor M ⊗R (−) is an exact functor. Similarly, a left module N over R is flat if the tensor product functor (−) ⊗R N is an exact functor. Version: 2 Owner: antizeus Author(s): antizeus


246.3 free module

Let R be a commutative ring. A free module over R is a direct sum of copies of R. In particular, as every abelian group is a Z-module, a free abelian group is a direct sum of copies of Z. This is equivalent to saying that the module has a free basis, i.e. a set of elements with the property that every element of the module can be uniquely expressed as a linear combination over R of elements of the free basis. In the case that a free module over R is a sum of finitely many copies of R, then the number of copies is called the rank of the free module. An alternative definition of a free module is via its universal property: Given a set X, the free R-module F (X) on the set X is equipped with a function i : X → F (X) satisfying the property that for any other R-module A and any function f : X → A, there exists a unique R-module map h : F (X) → A such that (h ◦ i) = f . Version: 4 Owner: mathcam Author(s): mathcam, antizeus

246.4 free module

Let R be a ring. A free module over R is a direct sum of copies of R. Similarly, as an abelian group is simply a module over Z, a free abelian group is a direct sum of copies of Z. This is equivalent to saying that the module has a free basis, i.e. a set of elements with the property that every element of the module can be uniquely expressed as a linear combination over R of elements of the free basis. Version: 1 Owner: antizeus Author(s): antizeus

246.5 projective cover

Let X and P be modules. We say that P is a projective cover of X if P is a projective module and there exists an epimorphism p : P → X such that ker p is a superfluous submodule of P . Equivalently, P is a projective cover of X if P is projective, and there is an epimorphism p : P → X, such that whenever g : P ′ → X is an epimorphism from a projective module P ′ to X, there exists an epimorphism h : P ′ → P such that ph = g.

Version: 2 Owner: antizeus Author(s): antizeus

246.6 projective module

A module P is projective if it satisfies the following equivalent conditions: (a) Every short exact sequence of the form 0 → A → B → P → 0 is split; (b) The functor Hom(P, −) is exact; (c) If f : X → Y is an epimorphism and there exists a homomorphism g : P → Y , then there exists a homomorphism h : P → X such that f h = g; (d) The module P is a direct summand of a free module. Version: 3 Owner: antizeus Author(s): antizeus
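
Condition (d) can be seen concretely in R = Z/6: the element e = 3 is idempotent, so R splits as eR ⊕ (1 − e)R, exhibiting each summand as a projective R-module. A brute-force illustration (our own sketch):

```python
# Condition (d) in R = Z/6: the idempotent e = 3 (since 3*3 = 9 ≡ 3 mod 6)
# splits the free module R as R = eR ⊕ (1-e)R, so each summand is projective.

N, e = 6, 3
f = (1 - e) % N                                  # complementary idempotent, f = 4

eR = {(e * r) % N for r in range(N)}             # the summand {0, 3}
fR = {(f * r) % N for r in range(N)}             # the summand {0, 2, 4}

sums = {(a + b) % N for a in eR for b in fR}     # eR + fR
is_direct = len(eR) * len(fR) == N               # direct sum: decompositions are unique
```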


Chapter 247 16D50 – Injective modules, self-injective rings
247.1 injective hull

Let X and Q be modules. We say that Q is an injective hull or injective envelope of X if Q is both an injective module and an essential extension of X. Equivalently, Q is an injective hull of X if Q is injective, and X is a submodule of Q, and whenever g : X → Q′ is a monomorphism from X to an injective module Q′ , there exists a monomorphism h : Q → Q′ such that h(x) = g(x) for all x ∈ X. Version: 2 Owner: antizeus Author(s): antizeus

247.2 injective module

A module Q is injective if it satisfies the following equivalent conditions: (a) Every short exact sequence of the form 0 → Q → B → C → 0 is split; (b) The functor Hom(−, Q) is exact; (c) If f : X → Y is a monomorphism and there exists a homomorphism g : X → Q, then there exists a homomorphism h : Y → Q such that hf = g. Version: 3 Owner: antizeus Author(s): antizeus


Chapter 248 16D60 – Simple and semisimple modules, primitive rings and ideals
248.1 central simple algebra

Let K be a field. A central simple algebra A (over K) is an algebra A over K, which is finite dimensional as a vector space over K, such that • A has an identity element, as a ring • A is central: the center of A equals K (for all z ∈ A, we have z · a = a · z for all a ∈ A if and only if z ∈ K) • A is simple: for any two sided ideal I of A, either I = {0} or I = A By a theorem of Wedderburn, for every central simple algebra A over K, there exists a unique (up to isomorphism) division ring D containing K and a unique natural number n such that A is isomorphic to the ring of n × n matrices with coefficients in D. Version: 2 Owner: djao Author(s): djao

248.2 completely reducible

A module M is called completely reducible (or semisimple) if it is a direct sum of irreducible (or simple) modules. Version: 1 Owner: bwebste Author(s): bwebste


248.3 simple ring

A nonzero ring R is said to be a simple ring if it has no (two-sided) ideal other than the zero ideal and R itself. This is equivalent to saying that the zero ideal is a maximal ideal. If R is a commutative ring with unit, then this is equivalent to being a field. Version: 4 Owner: antizeus Author(s): antizeus


Chapter 249 16D80 – Other classes of modules and ideals
249.1 essential submodule

Let X be a submodule of a module Y . We say that X is an essential submodule of Y , and that Y is an essential extension of X, if whenever A is a nonzero submodule of Y , then A ∩ X is also nonzero. A monomorphism f : X → Y is an essential monomorphism if the image im f is an essential submodule of Y . Version: 2 Owner: antizeus Author(s): antizeus

249.2 faithful module

Let R be a ring, and let M be an R-module. We say that M is a faithful R-module if its annihilator annR (M) is the zero ideal. We say that M is a fully faithful R-module if every nonzero R-submodule of M is faithful. Version: 3 Owner: antizeus Author(s): antizeus


249.3 minimal prime ideal

A prime ideal P of a ring R is called a minimal prime ideal if it does not properly contain any other prime ideal of R. If R is a prime ring, then the zero ideal is a prime ideal, and is thus the unique minimal prime ideal of R. Version: 2 Owner: antizeus Author(s): antizeus

249.4 module of finite rank

Let M be a module, and let E(M) be the injective hull of M. Then we say that M has finite rank if E(M) is a finite direct sum of indecomposable submodules. This turns out to be equivalent to the property that M contains no infinite direct sum of nonzero submodules. Version: 3 Owner: antizeus Author(s): antizeus

249.5 simple module

Let R be a ring, and let M be an R-module. We say that M is a simple or irreducible module if it contains no submodules other than itself and the zero module. Version: 2 Owner: antizeus Author(s): antizeus

249.6 superfluous submodule

Let X be a submodule of a module Y . We say that X is a superfluous submodule of Y if whenever A is a submodule of Y such that A + X = Y , then A = Y . Version: 2 Owner: antizeus Author(s): antizeus


249.7 uniform module

A module M is said to be uniform if any two nonzero submodules of M must have a nonzero intersection. This is equivalent to saying that any nonzero submodule is an essential submodule. Version: 3 Owner: antizeus Author(s): antizeus


Chapter 250 16E05 – Syzygies, resolutions, complexes
250.1 n-chain

An n-chain on a topological space X is a finite formal sum of n-simplices on X. The group of such chains is denoted Cn (X). For a CW-complex Y, Cn (Y ) = Hn (Y n , Y n−1 ), where Hn denotes the nth homology group. The boundary of an n-chain is the (n − 1)-chain given by the formal sum of the boundaries of its constituent simplices. An n-chain is closed if its boundary is 0 and exact if it is the boundary of some (n + 1)-chain. Version: 3 Owner: mathcam Author(s): mathcam

250.2 chain complex

A sequence of modules and homomorphisms

· · · → An+1 −dn+1→ An −dn→ An−1 → · · ·

is said to be a chain complex or complex if each pair of adjacent homomorphisms (dn+1 , dn ) satisfies the relation dn dn+1 = 0. This is equivalent to saying that im dn+1 ⊂ ker dn . We often denote such a complex as (A, d) or simply A. Compare this to the notion of an exact sequence. Version: 4 Owner: antizeus Author(s): antizeus


250.3 flat resolution

Let M be a module. A flat resolution of M is an exact sequence of the form · · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0 where each Fn is a flat module. Version: 2 Owner: antizeus Author(s): antizeus

250.4 free resolution

Let M be a module. A free resolution of M is an exact sequence of the form · · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0 where each Fn is a free module. Version: 2 Owner: antizeus Author(s): antizeus
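
A standard example is the free resolution 0 → Z → Z → Z/n → 0 of Z/n over Z, where the first map is multiplication by n and the second is reduction mod n. Exactness can be spot-checked on a finite window of integers (our own illustration):

```python
# Free resolution of Z/n over Z:
#   0 -> Z --(multiplication by n)--> Z --(reduction mod n)--> Z/n -> 0
# Both nonzero terms are free of rank 1; we spot-check exactness on a window.

n = 5
SAMPLE = set(range(-50, 51))

def mult_n(x):      # the map Z -> Z, x -> n*x (injective)
    return n * x

def mod_n(x):       # the map Z -> Z/n, x -> x mod n (surjective)
    return x % n

injective  = len({mult_n(x) for x in SAMPLE}) == len(SAMPLE)
surjective = {mod_n(x) for x in SAMPLE} == set(range(n))
image  = {mult_n(x) for x in SAMPLE} & SAMPLE    # im(mult_n) inside the window
kernel = {y for y in SAMPLE if mod_n(y) == 0}    # ker(mod_n) inside the window
```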

250.5 injective resolution

Let M be a module. An injective resolution of M is an exact sequence of the form 0 → M → Q0 → Q1 → · · · → Qn−1 → Qn → · · · where each Qn is an injective module. Version: 2 Owner: antizeus Author(s): antizeus

250.6 projective resolution

Let M be a module. A projective resolution of M is an exact sequence of the form · · · → Pn → Pn−1 → · · · → P1 → P0 → M → 0 where each Pn is a projective module. Version: 2 Owner: antizeus Author(s): antizeus


250.7 short exact sequence

A short exact sequence is an exact sequence of the form 0 → A → B → C → 0. Note that in this case, the homomorphism A → B must be a monomorphism, and the homomorphism B → C must be an epimorphism. Version: 2 Owner: antizeus Author(s): antizeus

250.8 split short exact sequence

In an abelian category, a short exact sequence 0 → A −f→ B −g→ C → 0 is split if it satisfies the following equivalent conditions: (a) there exists a homomorphism h : C → B such that gh = 1C ; (b) there exists a homomorphism j : B → A such that jf = 1A ; (c) B is isomorphic to the direct sum A ⊕ C. In this case, we say that h and j are backmaps or splitting backmaps. Version: 4 Owner: antizeus Author(s): antizeus
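
Splitting is a genuine dichotomy: 0 → Z/3 → Z/6 → Z/2 → 0 splits, while 0 → Z/2 → Z/4 → Z/2 → 0 does not, because no backmap h with gh = 1 exists. A brute-force search over homomorphisms (our own illustration):

```python
# Search for a splitting backmap h : Z/c -> Z/b of the epimorphism
# g : Z/b -> Z/c.  A homomorphism h is determined by h(1) = m, subject to
# the compatibility condition c*m ≡ 0 (mod b).

def splits(g, b, c):
    """True if some homomorphism h : Z/c -> Z/b satisfies g(h(x)) = x for all x."""
    for m in range(b):
        if (c * m) % b != 0:
            continue                         # m does not define a homomorphism
        if all(g((x * m) % b) == x for x in range(c)):
            return True
    return False

g = lambda x: x % 2
nonsplit_example = not splits(g, 4, 2)   # 0 -> Z/2 -> Z/4 -> Z/2 -> 0: not split
split_example = splits(g, 6, 2)          # 0 -> Z/3 -> Z/6 -> Z/2 -> 0: split
```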

250.9 von Neumann regular

An element a of a ring R is said to be von Neumann regular if there exists b ∈ R such that aba = a. A ring R is said to be a von Neumann regular ring (or simply a regular ring, if the meaning is clear from context) if every element of R is von Neumann regular. Note that regular ring in the sense of von Neumann should not be confused with regular ring in the sense of commutative algebra. Version: 1 Owner: igor Author(s): igor
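
In Z/n the element b can be found by direct search; Z/6 turns out to be von Neumann regular, while Z/4 is not, because of the nilpotent element 2. An illustrative brute force (our own sketch):

```python
# Brute-force von Neumann regularity in Z/n: a is regular if some b
# satisfies a*b*a = a (mod n).

def is_regular_element(a, n):
    return any((a * b * a) % n == a % n for b in range(n))

def is_regular_ring(n):
    return all(is_regular_element(a, n) for a in range(n))
```

For a = 2 in Z/4, a·b·a = 4b ≡ 0 for every b, so the equation aba = a has no solution.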


Chapter 251 16K20 – Finite-dimensional division rings
251.1 quaternion algebra

A quaternion algebra over a field K is a central simple algebra over K which is four dimensional as a vector space over K. Examples: • For any field K, the ring M2×2 (K) of 2 × 2 matrices with entries in K is a quaternion algebra over K. If K is algebraically closed, then all quaternion algebras over K are isomorphic to M2×2 (K). • For K = R, the well known algebra H of Hamiltonian quaternions is a quaternion algebra over R. The two algebras H and M2×2 (R) are the only quaternion algebras over R, up to isomorphism. • When K is a number field, there are infinitely many non–isomorphic quaternion algebras over K. In fact, there is one such quaternion algebra for every even sized finite collection of finite primes or real primes of K. The proof of this deep fact leads to many of the major results of class field theory. Version: 1 Owner: djao Author(s): djao
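
The Hamiltonian quaternions H embed into M2×2 (C) via the standard representation 1 → I, i → diag(i, −i), j → [[0, 1], [−1, 0]], k = ij; the defining relations can then be verified by matrix multiplication. The code below is our own illustration of this well-known representation:

```python
# Quaternions as 2x2 complex matrices (tuples of row tuples); check the
# Hamilton relations i^2 = j^2 = k^2 = -1 and ji = -ij.

def matmul(A, B):
    """Product of two 2x2 complex matrices."""
    return tuple(
        tuple(sum(A[r][t] * B[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

ONE = ((1, 0), (0, 1))
NEG = ((-1, 0), (0, -1))
I = ((1j, 0), (0, -1j))     # image of the quaternion i
J = ((0, 1), (-1, 0))       # image of the quaternion j
K = matmul(I, J)            # k := ij

rel_i = matmul(I, I) == NEG
rel_j = matmul(J, J) == NEG
rel_k = matmul(K, K) == NEG
anti  = matmul(J, I) == matmul(NEG, K)   # ji = -k
```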


Chapter 252 16K50 – Brauer groups
252.1 Brauer group

Let K be a field. The Brauer group Br(K) of K is the set of all equivalence classes of central simple algebras over K, where two central simple algebras A and B are equivalent if there exists a division ring D over K and natural numbers n, m such that A (resp. B) is isomorphic to the ring of n × n (resp. m × m) matrices with coefficients in D. The group operation in Br(K) is given by tensor product: for any two central simple algebras A, B over K, their product in Br(K) is the central simple algebra A ⊗K B. The identity element in Br(K) is the class of K itself, and the inverse of a central simple algebra A is the opposite algebra Aopp defined by reversing the order of the multiplication operation of A. Version: 5 Owner: djao Author(s): djao


Chapter 253 16K99 – Miscellaneous
253.1 division ring

A division ring is a ring D with identity such that • 1 ≠ 0 • For all nonzero a ∈ D, there exists b ∈ D with a · b = b · a = 1 A field is precisely a commutative division ring. Version: 3 Owner: djao Author(s): djao


Chapter 254 16N20 – Jacobson radical, quasimultiplication
254.1 Jacobson radical

The Jacobson radical J(R) of a ring R is the intersection of the annihilators of irreducible left R-modules. The following are alternate characterizations of the Jacobson radical J(R): 1. The intersection of all left primitive ideals. 2. The intersection of all maximal left ideals. 3. The set of all t ∈ R such that for all r ∈ R, 1 − rt is left invertible (i.e. there exists u such that u(1 − rt) = 1). 4. The largest ideal I such that for all v ∈ I, 1 − v is a unit in R. 5. Characterizations 1.–3. with “left” replaced by “right” and rt replaced by tr. Note that if R is commutative and finitely generated, then J(R) = {x ∈ R | xn = 0 for some n ∈ N} = Nil(R) Version: 13 Owner: saforres Author(s): saforres
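
For R = Z/n the characterizations can be verified by brute force: J(Z/12) = 6Z/12 = {0, 6}, the intersection of the maximal ideals 2Z/12 and 3Z/12. An illustrative computation (our own, using characterization 3; in a finite commutative ring "left invertible" is just "invertible"):

```python
# J(Z/n) via characterization 3: t lies in the radical iff 1 - r*t is
# invertible for every r in Z/n.

def is_unit(x, n):
    return any((x * u) % n == 1 for u in range(n))

def jacobson_radical(n):
    return {t for t in range(n)
            if all(is_unit((1 - r * t) % n, n) for r in range(n))}
```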


254.2 a ring modulo its Jacobson radical is semiprimitive

Let R be a ring. Then J(R/J(R)) = (0).
Let [u] ∈ J(R/J(R)). Then by one of the alternate characterizations of the Jacobson radical, 1 − [r][u] is left invertible for all r ∈ R, so there exists v ∈ R such that [v](1 − [r][u]) = 1.

Then v(1 − ru) = 1 − a for some a ∈ J(R). Since a ∈ J(R), there exists w ∈ R with w(1 − a) = 1, so wv(1 − ru) = 1. Since this holds for all r ∈ R, we have u ∈ J(R), and hence [u] = 0. Version: 3 Owner: saforres Author(s): saforres

254.3 examples of semiprimitive rings

Examples of semiprimitive rings:

The integers Z: Since Z is commutative, any left ideal is two-sided. So the maximal left ideals of Z are the maximal ideals of Z, which are the ideals pZ for p prime. Since no nonzero integer is divisible by every prime, ⋂p pZ = (0). Hence J(Z) = (0).

A matrix ring Mn (D) over a division ring D: The ring Mn (D) is simple, so the only proper ideal is (0). Thus J(Mn (D)) = (0).

A polynomial ring R[x] over a domain R: Take a ∈ J(R[x]) with a ≠ 0. Then ax ∈ J(R[x]), since J(R[x]) is an ideal, and deg(ax) ≥ 1. By one of the alternate characterizations of the Jacobson radical, 1 − ax is a unit. But deg(1 − ax) = max{deg(1), deg(ax)} ≥ 1, so 1 − ax is not a unit. This contradiction shows that J(R[x]) = (0). Version: 5 Owner: saforres Author(s): saforres


254.4 proof of Characterizations of the Jacobson radical

First, note that by definition a left primitive ideal is the annihilator of an irreducible left R-module, so clearly characterization 1) is equivalent to the definition of the Jacobson radical.

Next, we will prove cyclical containment. Observe that 5) follows after the equivalence of 1) - 4) is established, since 4) is independent of the choice of left or right ideals.

1) ⊂ 2): We know that every left primitive ideal is the largest ideal contained in a maximal left ideal. So the intersection of all left primitive ideals will be contained in the intersection of all maximal left ideals.

2) ⊂ 3): Let S = {M : M a maximal left ideal of R} and take r ∈ R. Let t ∈ ⋂M ∈S M. Then rt ∈ ⋂M ∈S M. Assume 1 − rt is not left invertible; then there exists a maximal left ideal M0 of R such that R(1 − rt) ⊆ M0 . Note then that 1 − rt ∈ M0 . Also, by definition of t, we have rt ∈ M0 . Therefore 1 ∈ M0 ; this contradiction implies 1 − rt is left invertible.

3) ⊂ 4): We claim that 3) satisfies the condition of 4). Let K = {t ∈ R : 1 − rt is left invertible for all r ∈ R}. We shall first show that K is an ideal. Clearly if t ∈ K, then rt ∈ K. If t1 , t2 ∈ K, then 1 − r(t1 + t2 ) = (1 − rt1 ) − rt2 . Now there exists u1 such that u1 (1 − rt1 ) = 1, hence u1 ((1 − rt1 ) − rt2 ) = 1 − u1 rt2 . Similarly, there exists u2 such that u2 (1 − u1 rt2 ) = 1, therefore u2 u1 (1 − r(t1 + t2 )) = 1. Hence t1 + t2 ∈ K. Now if t ∈ K and r ∈ R, to show that tr ∈ K it suffices to show that 1 − tr is left invertible. Suppose u(1 − rt) = 1, hence u − urt = 1, then tur − turtr = tr. So (1 + tur)(1 − tr) = 1 + tur − tr − turtr = 1. Therefore K is an ideal.

Now let v ∈ K. Then there exists u such that u(1 − v) = 1, hence 1 − u = −uv ∈ K, so u = 1 − (1 − u) is left invertible. So there exists w such that wu = 1, hence wu(1 − v) = w, then 1 − v = w. Thus (1 − v)u = 1 and therefore 1 − v is a unit.

Let J be the largest ideal such that, for all v ∈ J, 1 − v is a unit. We claim that K ⊆ J. Suppose this were not true; in this case K + J strictly contains J. Consider rx + sy ∈ K + J with x ∈ K, y ∈ J and r, s ∈ R. Now 1 − (rx + sy) = (1 − rx) − sy, and since rx ∈ K, then 1 − rx = u for some unit u ∈ R. So 1 − (rx + sy) = u − sy = u(1 − u−1 sy), and clearly u−1 sy ∈ J since y ∈ J. Hence 1 − u−1 sy is also a unit, and thus 1 − (rx + sy) is a unit. Thus 1 − v is a unit for all v ∈ K + J. But this contradicts the assumption that J is the largest such ideal. So we must have K ⊆ J.

4) ⊂ 1): We must show that if I is an ideal such that for all u ∈ I, 1 − u is a unit, then I ⊆ ann(R M) for every irreducible left R-module R M. Suppose this is not the case, so there exists R M such that I ⊄ ann(R M). Now we know that ann(R M) is the largest ideal inside some maximal left ideal J of R. Thus we must also have I ⊄ J, or else this would contradict the maximality of ann(R M) inside J. But since I ⊄ J, then by maximality I + J = R, hence there exist u ∈ I and v ∈ J such that u + v = 1. Then v = 1 − u, so v is a unit and J = R. But since J is a proper left ideal, this is a contradiction. Version: 25 Owner: saforres Author(s): saforres

254.5 properties of the Jacobson radical

Theorem: Let R, T be rings and ϕ : R → T be a surjective homomorphism. Then ϕ(J(R)) ⊆ J(T ).

We shall use the characterization of the Jacobson radical as the set of all a ∈ R such that for all r ∈ R, 1 − ra is left invertible.

Let a ∈ J(R), t ∈ T . We claim that 1 − tϕ(a) is left invertible: Since ϕ is surjective, t = ϕ(r) for some r ∈ R. Since a ∈ J(R), we know 1 − ra is left invertible, so there exists u ∈ R such that u(1 − ra) = 1. Then we have ϕ(u) (ϕ(1) − ϕ(r)ϕ(a)) = ϕ(u)ϕ(1 − ra) = ϕ(1) = 1. So ϕ(a) ∈ J(T ) as required.

Theorem: Let R, T be rings. Then J(R × T ) ⊆ J(R) × J(T ).

Let π1 : R × T → R be the (surjective) projection. By the previous theorem, π1 (J(R × T )) ⊆ J(R). Similarly let π2 : R × T → T be the (surjective) projection; we see that π2 (J(R × T )) ⊆ J(T ). Now take (a, b) ∈ J(R × T ). Note that a = π1 (a, b) ∈ J(R) and b = π2 (a, b) ∈ J(T ). Hence (a, b) ∈ J(R) × J(T ) as required. Version: 8 Owner: saforres Author(s): saforres

254.6 quasi-regularity

An element x of a ring is called right quasi-regular [resp. left quasi-regular] if there is an element y in the ring such that x + y + xy = 0 [resp. x + y + yx = 0]. For calculations with quasi-regularity, it is useful to introduce the operation ∗ defined: x ∗ y = x + y + xy. Thus x is right quasi-regular if there is an element y such that x ∗ y = 0. The operation ∗ is easily demonstrated to be associative, and x ∗ 0 = 0 ∗ x = 0 for all x. An element x is called quasi-regular if it is both left and right quasi-regular. In this case, there are elements x and y such that x+y+xy = 0 = x+z+zx (equivalently, x∗y = z∗x = 0). A calculation shows that y = 0 ∗ y = (z ∗ x) ∗ y = z ∗ (x ∗ y) = z. So y = z is a unique element, depending on x, called the quasi-inverse of x. An ideal (one- or two-sided) of a ring is called quasi-regular if each of its elements is quasiregular. Similarly, a ring is called quasi-regular if each of its elements is quasi-regular (such rings cannot have an identity element). Lemma 1. Let A be an ideal (one- or two-sided) in a ring R. If each element of A is right quasi-regular, then A is a quasi-regular ideal. This lemma means that there is no extra generality gained in defining terms such as right quasi-regular left ideal, etc. Quasi-regularity is important because it provides elementary characterizations of the Jacobson radical for rings without an identity element: • The Jacobson radical of a ring is the sum of all quasi-regular left (or right) ideals. 1119

• The Jacobson radical of a ring is the largest quasi-regular ideal of the ring. For rings with an identity element, note that x is [right, left] quasi-regular if and only if 1 + x is [right, left] invertible in the ring. Version: 1 Owner: mclase Author(s): mclase
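The quasi-inverse is easy to compute in a concrete ring with identity: when 1 + x is invertible, y = −x(1 + x)⁻¹ satisfies x ∗ y = 0. A small sketch in Z/8Z (helper names are ours):

```python
# Quasi-regularity in Z/8Z: x * y means x + y + x*y (mod 8).
n = 8

def star(x, y):
    return (x + y + x * y) % n

def inv(u):                 # inverse of a unit mod n (Python 3.8+)
    return pow(u, -1, n)

x = 2                       # nilpotent: 2**3 ≡ 0 (mod 8), so 1 + x is a unit
y = (-x * inv(1 + x)) % n   # quasi-inverse candidate
assert star(x, y) == 0 == star(y, x)
print(y)                    # 2: here x happens to be its own quasi-inverse
```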

254.7

semiprimitive ring

A ring R is said to be semiprimitive (sometimes semisimple) if its Jacobson radical is the zero ideal. Any simple ring is automatically semiprimitive. A finite direct product of matrix rings over division rings can be shown to be semiprimitive and both left and right artinian. The Artin-Wedderburn theorem states that any semiprimitive ring which is left or right artinian is isomorphic to a finite direct product of matrix rings over division rings. Version: 11 Owner: saforres Author(s): saforres


Chapter 255 16N40 – Nil and nilpotent radicals, sets, ideals, rings
255.1 Koethe conjecture

The Koethe Conjecture is the statement that for any pair of nil right ideals A and B in any ring R, the sum A + B is also nil.

If either of A or B is a two-sided ideal, it is easy to see that A + B is nil. Suppose A is a two-sided ideal, and let x ∈ A + B. The quotient (A + B)/A is nil, since it is a homomorphic image of B. So there is an n > 0 with x^n ∈ A. Then there is an m > 0 such that x^{nm} = 0, because A is nil. In particular, this means that the Koethe conjecture is true for commutative rings.

It has been shown to be true for many classes of rings, but the general statement is still unproven, and no counterexample has been found. Version: 1 Owner: mclase Author(s): mclase
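In the commutative case the key fact is that a sum of nilpotent elements is nilpotent (by the binomial theorem). A brute-force illustration in Z/16Z, where the nilpotents are exactly the even residues:

```python
# In a commutative ring the sum of two nilpotents is nilpotent.
# Brute-force check in Z/16Z.
n = 16

def is_nilpotent(x):
    return any(pow(x, k, n) == 0 for k in range(1, n + 1))

nils = [x for x in range(n) if is_nilpotent(x)]
print(nils)  # [0, 2, 4, 6, 8, 10, 12, 14]
for a in nils:
    for b in nils:
        assert is_nilpotent((a + b) % n)
```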

255.2

nil and nilpotent ideals

An element x of a ring is nilpotent if x^n = 0 for some positive integer n. A ring R is nil if every element in R is nilpotent. Similarly, a one- or two-sided ideal is called nil if each of its elements is nilpotent. A ring R [resp. a one- or two-sided ideal A] is nilpotent if R^n = 0 [resp. A^n = 0] for some positive integer n.

A ring or an ideal is locally nilpotent if every finitely generated subring is nilpotent. The following implications hold for rings (or ideals): nilpotent ⇒ locally nilpotent ⇒ nil

Version: 3 Owner: mclase Author(s): mclase
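A standard example of a nilpotent (hence locally nilpotent, hence nil) set of elements is the strictly upper-triangular matrices: any product of n of them vanishes in the n × n case. A quick numpy check:

```python
# Strictly upper-triangular n x n matrices: any product of n of them is zero.
import numpy as np

n = 3
A = np.triu(np.arange(1, 10).reshape(3, 3), k=1)  # strictly upper triangular
B = np.triu(np.ones((3, 3)), k=1)
# A product of three strictly upper-triangular 3x3 factors vanishes:
assert np.all(A @ B @ A == 0)
# A itself is nilpotent of index at most n:
assert np.all(np.linalg.matrix_power(A, n) == 0)
```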


Chapter 256 16N60 – Prime and semiprime rings
256.1 prime ring

A ring R is said to be a prime ring if the zero ideal is a prime ideal. If R is commutative, this is equivalent to being an integral domain. Version: 2 Owner: antizeus Author(s): antizeus


Chapter 257 16N80 – General radicals and rings
257.1 prime radical

The prime radical of a ring R is the intersection of all the prime ideals of R. Note that the prime radical is the smallest semiprime ideal of R, and that R is a semiprime ring if and only if its prime radical is the zero ideal. Version: 2 Owner: antizeus Author(s): antizeus

257.2

radical theory

Let x◦ be a property which defines a class of rings, which we will call the x◦-rings. Then x◦ is a radical property if it satisfies:

1. The class of x◦-rings is closed under homomorphic images.
2. Every ring R has a largest ideal in the class of x◦-rings; this ideal is written x◦(R).
3. x◦(R/x◦(R)) = 0.

Note: it is extremely important when interpreting the above definition that your definition of a ring does not require an identity element.

The ideal x◦(R) is called the x◦-radical of R. A ring is called x◦-radical if x◦(R) = R, and is called x◦-semisimple if x◦(R) = 0. If x◦ is a radical property, then the class of x◦-rings is also called the class of x◦-radical rings.

The class of x◦-radical rings is closed under ideal extensions. That is, if A is an ideal of R, and A and R/A are x◦-radical, then so is R.

Radical theory is the study of radical properties and their interrelations. There are several well-known radicals which are of independent interest in ring theory (see examples to follow). The class of all radicals is however very large. Indeed, it is possible to show that any partition of the class of simple rings into two classes, R and S, gives rise to a radical x◦ with the property that all rings in R are x◦-radical and all rings in S are x◦-semisimple.

A radical x◦ is hereditary if every ideal of an x◦-radical ring is also x◦-radical. A radical x◦ is supernilpotent if the class of x◦-rings contains all nilpotent rings.

Version: 2 Owner: mclase Author(s): mclase


Chapter 258 16P40 – Noetherian rings and modules
258.1 Noetherian ring

A ring R is right noetherian (or left noetherian) if R is noetherian as a right module (or left module) over itself, i.e., if the three equivalent conditions hold:

1. right ideals (or left ideals) are finitely generated;
2. the ascending chain condition holds on right ideals (or left ideals);
3. every nonempty family of right ideals (or left ideals) has a maximal element.

We say that R is noetherian if it is both left noetherian and right noetherian.

Examples of noetherian rings include any field (as the only ideals are 0 and the whole ring) and the ring Z of integers (each ideal is generated by a single integer, the greatest common divisor of the elements of the ideal). The Hilbert basis theorem says that a ring R is noetherian iff the polynomial ring R[x] is. Version: 10 Owner: KimJ Author(s): KimJ
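The Z example can be made concrete: since every ideal of Z is generated by the gcd of its elements, an ascending chain (a_1) ⊆ (a_1, a_2) ⊆ … corresponds to a sequence of shrinking gcds, which must stabilize. A short sketch:

```python
# Ascending chains of ideals in Z stabilize: the ideal (a_1, ..., a_k) is
# generated by gcd(a_1, ..., a_k), and the gcds can only shrink finitely often.
from math import gcd
from functools import reduce

elements = [360, 84, 90, 25, 7, 11, 13]   # generators added one at a time
chain = [reduce(gcd, elements[:k]) for k in range(1, len(elements) + 1)]
print(chain)  # [360, 12, 6, 1, 1, 1, 1] -- stabilizes once the gcd hits 1
```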

258.2

noetherian

A module M is noetherian if it satisfies the following equivalent conditions:


• the ascending chain condition holds for submodules of M;
• every nonempty family of submodules of M has a maximal element;
• every submodule of M is finitely generated.

A ring R is left noetherian if it is noetherian as a left module over itself, right noetherian if it is noetherian as a right module over itself, and simply noetherian if both conditions hold. Version: 2 Owner: antizeus Author(s): antizeus


Chapter 259 16P60 – Chain conditions on annihilators and summands: Goldie-type conditions, Krull dimension
259.1 Goldie ring

Let R be a ring. If the set of right annihilators {r.ann(x) | x ∈ R} satisfies the ascending chain condition, then R is said to satisfy the ascending chain condition on right annihilators.

A ring R is called a right Goldie ring if it satisfies the ascending chain condition on right annihilators and R_R is a module of finite rank. A left Goldie ring is defined similarly. If the context makes it clear on which side the ring operates, then such a ring is simply called a Goldie ring.

A right noetherian ring is right Goldie. Version: 3 Owner: mclase Author(s): mclase

259.2

uniform dimension

Let M be a module over a ring R, and suppose that M contains no infinite direct sums of non-zero submodules. (This is the same as saying that M is a module of finite rank.)


Then there exists an integer n such that M contains an essential submodule N, where N = U_1 ⊕ U_2 ⊕ · · · ⊕ U_n is a direct sum of n uniform submodules. This number n does not depend on the choice of N or on the decomposition into uniform submodules.

We call n the uniform dimension of M. Sometimes this is written u-dim M = n.

If R is a field K, and M is a finite-dimensional vector space over K, then u-dim M = dim_K M.

u-dim M = 0 if and only if M = 0. Version: 3 Owner: mclase Author(s): mclase


Chapter 260 16S10 – Rings determined by universal properties (free algebras, coproducts, adjunction of inverses, etc.)
260.1 Ore domain

Let R be a domain. We say that R is a right Ore domain if any two nonzero elements of R have a nonzero common right multiple, i.e. for every pair of nonzero x and y, there exists a pair of elements r and s of R such that xr = ys ≠ 0.

This condition turns out to be equivalent to the following conditions on R when viewed as a right R-module: (a) R_R is a uniform module. (b) R_R is a module of finite rank.

The definition of a left Ore domain is similar. If R is a commutative domain, then it is a right (and left) Ore domain. Version: 6 Owner: antizeus Author(s): antizeus


Chapter 261 16S34 – Group rings , Laurent polynomial rings
261.1 support

Let R[G] be the group ring of a group G over a ring R. Let x = Σ_{g∈G} x_g g be an element of R[G]. The support of x, often written supp(x), is the set of elements of G which occur with non-zero coefficient in the expansion of x. Thus:

supp(x) = {g ∈ G | x_g ≠ 0}.

Version: 2 Owner: mclase Author(s): mclase


Chapter 262 16S36 – Ordinary and skew polynomial rings and semigroup rings
262.1 Gaussian polynomials

For an indeterminate u and integers n ≥ m ≥ 0 we define the following:

(a) (m)_u = u^{m−1} + u^{m−2} + · · · + 1 for m > 0,
(b) (m!)_u = (m)_u (m − 1)_u · · · (1)_u for m > 0, and (0!)_u = 1,
(c) \binom{n}{m}_u = (n!)_u / ((m!)_u ((n − m)!)_u).

If m > n then we define \binom{n}{m}_u = 0.

The expressions \binom{n}{m}_u are called u-binomial coefficients or Gaussian polynomials.

Note: if we replace u with 1, then we obtain the familiar integers, factorials, and binomial coefficients. Specifically,

(a) (m)_1 = m,
(b) (m!)_1 = m!,
(c) \binom{n}{m}_1 = \binom{n}{m}.

Version: 3 Owner: antizeus Author(s): antizeus
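The definitions translate directly into code. The sketch below (function names `q_int`, `q_factorial`, `q_binomial` are ours) builds the u-binomials with sympy and checks the specialization at u = 1:

```python
# u-binomial coefficients built straight from the definition, using sympy.
from sympy import symbols, prod, simplify, binomial, Integer

u = symbols('u')

def q_int(m):          # (m)_u = u^{m-1} + ... + 1
    return sum(u**k for k in range(m)) if m > 0 else Integer(0)

def q_factorial(m):    # (m!)_u, with (0!)_u = 1
    return prod(q_int(k) for k in range(1, m + 1)) if m > 0 else Integer(1)

def q_binomial(n, m):
    if m > n:
        return Integer(0)
    return simplify(q_factorial(n) / (q_factorial(m) * q_factorial(n - m)))

b = q_binomial(4, 2)
print(b.expand())              # u**4 + u**3 + 2*u**2 + u + 1
assert b.subs(u, 1) == binomial(4, 2)   # specializes to the ordinary 6
```

Note that the division in (c) is exact, so each `q_binomial(n, m)` simplifies to an actual polynomial in u.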


262.2

q skew derivation

Let (σ, δ) be a skew derivation on a ring R. Let q be a central (σ, δ)-constant. Suppose further that δσ = q · σδ. Then we say that (σ, δ) is a q-skew derivation. Version: 5 Owner: antizeus Author(s): antizeus

262.3

q skew polynomial ring

If (σ, δ) is a q-skew derivation on R, then we say that the skew polynomial ring R[θ; σ, δ] is a q-skew polynomial ring. Version: 3 Owner: antizeus Author(s): antizeus

262.4

sigma derivation

If σ is a ring endomorphism on a ring R, then a (left) σ-derivation is an additive map δ on R such that δ(x · y) = σ(x) · δ(y) + δ(x) · y for all x, y in R. Version: 7 Owner: antizeus Author(s): antizeus

262.5

sigma, delta constant

If (σ, δ) is a skew derivation on a ring R, then a (σ, δ)-constant is an element q of R such that σ(q) = q and δ(q) = 0. Note: If q is a (σ, δ)-constant, then it follows that σ(q · x) = q · σ(x) and δ(q · x) = q · δ(x) for all x in R. Version: 3 Owner: antizeus Author(s): antizeus

262.6

skew derivation

A (left) skew derivation on a ring R is a pair (σ, δ), where σ is a ring endomorphism of R, and δ is a left σ-derivation on R. Version: 4 Owner: antizeus Author(s): antizeus

262.7

skew polynomial ring

If (σ, δ) is a left skew derivation on R, then we can construct the (left) skew polynomial ring R[θ; σ, δ], which is made up of polynomials in an indeterminate θ and left-hand coefficients from R, with multiplication satisfying the relation θ · r = σ(r) · θ + δ(r) for all r in R. Version: 2 Owner: antizeus Author(s): antizeus
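In the special case δ = 0, the relation reduces to θ · r = σ(r) · θ, and multiplication of skew polynomials is easy to implement. The sketch below (representation and names are ours) takes R = C and σ = complex conjugation, so that (a θ^i)(b θ^j) = a σ^i(b) θ^{i+j}:

```python
# Skew polynomial multiplication over C with sigma = complex conjugation and
# delta = 0: theta * r = sigma(r) * theta.

def sigma(z, times=1):
    return z.conjugate() if times % 2 else z

def skew_mul(p, q):
    """p, q: lists of complex left-hand coefficients, p[i] goes with theta^i."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * sigma(b, i)   # move b through theta^i
    return out

r = 2 + 3j
print(skew_mul([0, 1], [r]))   # theta * r  ->  sigma(r) * theta = (2-3j)*theta
print(skew_mul([r], [0, 1]))   # r * theta  ->  (2+3j)*theta
```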


Chapter 263 16S99 – Miscellaneous
263.1 algebra

Let A be a ring with identity. An algebra over A is a ring B with identity together with a ring homomorphism f : A −→ Z(B), where Z(B) denotes the center of B. Equivalently, an algebra over A is an A–module B which is a ring and satisfies the property a · (x ∗ y) = (a · x) ∗ y = x ∗ (a · y) for all a ∈ A and all x, y ∈ B. Here · denotes A–module multiplication and ∗ denotes ring multiplication in B. One passes between the two definitions as follows: given any ring homomorphism f : A −→ Z(B), the scalar multiplication rule a · b := f (a) ∗ b makes B into an A–module in the sense of the second definition. Version: 5 Owner: djao Author(s): djao

263.2

algebra (module)

Given a commutative ring R, an algebra over R is a module M over R, endowed with a law of composition

f : M × M → M

which is R-bilinear. Most of the important algebras in mathematics belong to one or the other of two classes: the unital associative algebras, and the Lie algebras.

263.2.1

Unital associative algebras

In these cases, the "product" (as it is called) of two elements v and w of the module is denoted simply by vw or v·w or the like. Any unital associative algebra is an algebra in the sense of djao (a sense which is also used by Lang in his book Algebra (Springer-Verlag)).

Examples of unital associative algebras:

– tensor algebras and quotients of them
– Cayley algebras, such as the ring of quaternions
– polynomial rings
– the ring of endomorphisms of a vector space, in which the bilinear product of two mappings is simply the composite mapping.

263.2.2

Lie algebras

In these cases the bilinear product is denoted by [v, w], and satisfies

[v, v] = 0 for all v ∈ M

[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 for all v, w, x ∈ M

The second of these formulas is called the Jacobi identity. One proves easily

[v, w] + [w, v] = 0 for all v, w ∈ M

for any Lie algebra M. Lie algebras arise naturally from Lie groups, q.v.

Version: 1 Owner: karthik Author(s): Larry Hammick


Chapter 264 16U10 – Integral domains
264.1 Prüfer domain

An integral domain R is a Prüfer domain if every finitely generated ideal I of R is invertible. For a prime ideal P of R, let R_P denote the localization of R at P. Then the following statements are equivalent:

• i) R is a Prüfer domain.
• ii) For every prime ideal P in R, R_P is a valuation domain.
• iii) For every maximal ideal M in R, R_M is a valuation domain.

A Prüfer domain is a Dedekind domain if and only if it is noetherian.

If R is a Prüfer domain with quotient field K, then any domain S such that R ⊂ S ⊂ K is Prüfer.

REFERENCES
1. Thomas W. Hungerford. Algebra. Springer-Verlag, 1974. New York, NY.

Version: 2 Owner: mathcam Author(s): mathcam

264.2

valuation domain

An integral domain R is a valuation domain if for all a, b ∈ R, either a|b or b|a.

Version: 3 Owner: mathcam Author(s): mathcam


Chapter 265 16U20 – Ore rings, multiplicative sets, Ore localization
265.1 Goldie’s Theorem

Let R be a ring with an identity. Then R has a right classical ring of quotients Q which is semisimple Artinian if and only if R is a semiprime right Goldie ring. If this is the case, then the composition length of Q is equal to the uniform dimension of R. An immediate corollary of this is that a semiprime right noetherian ring always has a right classical ring of quotients. This result was discovered by Alfred Goldie in the late 1950’s. Version: 3 Owner: mclase Author(s): mclase

265.2

Ore condition

A ring R satisfies the left Ore condition (resp. right Ore condition) if and only if for all elements x and y with x regular, there exist elements u and v with v regular such that

ux = vy   (resp. xu = yv).

A ring which satisfies the (left, right) Ore condition is called a (left, right) Ore ring. Version: 3 Owner: mclase Author(s): mclase


265.3

Ore’s theorem

A ring has a (left, right) classical ring of quotients if and only if it satisfies the (left, right) Ore condition. Version: 3 Owner: mclase Author(s): mclase

265.4

classical ring of quotients

Let R be a ring. An element of R is called regular if it is not a right zero divisor or a left zero divisor in R.

A ring Q ⊃ R is a left classical ring of quotients for R (resp. right classical ring of quotients for R) if it satisfies:

• every regular element of R is invertible in Q
• every element of Q can be written in the form x^{−1}y (resp. yx^{−1}) with x, y ∈ R and x regular.

If a ring R has a left or right classical ring of quotients, then it is unique up to isomorphism. If R is a commutative integral domain, then the left and right classical rings of quotients always exist – they are the field of fractions of R. For non-commutative rings, necessary and sufficient conditions are given by Ore's theorem.

Note that the goal here is to construct a ring which is not too different from R, but in which more elements are invertible. The first condition says which elements we want to be invertible. The second condition says that Q should contain just enough extra elements to make the regular elements invertible.

Such rings are called classical rings of quotients, because there are other rings of quotients. These all attempt to enlarge R somehow to make more elements invertible (or sometimes to make ideals invertible). Finally, note that a ring of quotients is not the same as a quotient ring. Version: 2 Owner: mclase Author(s): mclase


265.5

saturated

Let S be a multiplicative subset of A. We say that S is saturated if ab ∈ S ⇒ a, b ∈ S.

When A is an integral domain, S is saturated if and only if its complement A∖S is a union of prime ideals. Version: 1 Owner: drini Author(s): drini


Chapter 266 16U70 – Center, normalizer (invariant elements)
266.1 center (rings)

If A is a ring, the center of A, sometimes denoted Z(A), is the set of all elements in A that commute with all other elements of A. That is,

Z(A) = {a ∈ A | ax = xa for all x ∈ A}.

Note that 0 ∈ Z(A), so the center is non-empty. If we assume that A is a ring with a multiplicative unity 1, then 1 is in the center as well. The center of A is also a subring of A. Version: 3 Owner: dublisk Author(s): dublisk
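For a small finite ring the center can be computed by brute force. The sketch below (helper `mul` is ours) finds the center of the 2 × 2 matrices over F_2, which consists of exactly the scalar matrices 0 and I:

```python
# Brute-force the center of the ring of 2x2 matrices over F_2.
from itertools import product

def mul(A, B):  # 2x2 matrix product over F_2
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) % 2
                       for j in range(2)) for i in range(2))

ring = [((a, b), (c, d)) for a, b, c, d in product(range(2), repeat=4)]
center = [A for A in ring if all(mul(A, B) == mul(B, A) for B in ring)]
print(center)  # [((0, 0), (0, 0)), ((1, 0), (0, 1))] -- only 0 and I
```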


Chapter 267 16U99 – Miscellaneous
267.1 anti-idempotent

An element x of a ring is called an anti-idempotent element, or simply an anti-idempotent if x2 = −x. The term is most often used in linear algebra. Every anti-idempotent matrix over a field is diagonalizable. Two anti-idempotent matrices are similar if and only if they have the same rank. Version: 1 Owner: mathcam Author(s): mathcam
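A quick numerical illustration of the definition and of diagonalizability (the matrix chosen here is our own example):

```python
# An anti-idempotent matrix: x @ x == -x; its eigenvalues lie in {0, -1}.
import numpy as np

x = np.array([[-1.0, 1.0],
              [ 0.0, 0.0]])      # rank-1 anti-idempotent
assert np.allclose(x @ x, -x)
vals = np.sort(np.linalg.eigvals(x).real)
print(vals)                      # eigenvalues -1 and 0
```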


Chapter 268 16W20 – Automorphisms and endomorphisms
268.1 ring of endomorphisms

Let R be a ring and let M be a right R-module. An endomorphism of M is an R-module homomorphism from M to itself. We shall write endomorphisms on the left, so that f : M → M maps x → f(x). If f, g : M → M are two endomorphisms, we can add them:

f + g : x → f(x) + g(x)

and multiply them:

f g : x → f(g(x))

With these operations, the set of endomorphisms of M becomes a ring, which we call the ring of endomorphisms of M, written End_R(M).

Instead of writing endomorphisms as functions, it is often convenient to write them multiplicatively: we simply write the application of the endomorphism f as x → f x. Then the fact that each f is an R-module homomorphism can be expressed as:

f(xr) = (f x)r for all x ∈ M and r ∈ R and f ∈ End_R(M).

With this notation, it is clear that M becomes an End_R(M)-R-bimodule.

Now, let N be a left R-module. We can construct the ring End_R(N) in the same way. There is a complication, however, if we still think of endomorphisms as functions written on the left. In order to make N into a bimodule, we need to define an action of End_R(N) on the right of N: say

x · f = f(x)

But then we have a problem with the multiplication: x · f g = f g(x) = f(g(x)), but (x · f) · g = f(x) · g = g(f(x))!

In order to make this work, we need to reverse the order of composition when we define multiplication in the ring End_R(N) when it acts on the right. There are essentially two different ways to go from here. One is to define the multiplication in End_R(N) the other way, which is most natural if we write the endomorphisms as functions on the right. This is the approach taken in many older books. The other is to leave the multiplication in End_R(N) the way it is, but to use the opposite ring to define the bimodule. This is the approach that is generally taken in more recent works. Using this approach, we conclude that N is an R-End_R(N)^op-bimodule. We will adopt this convention for the lemma below.

Considering R as a right and a left module over itself, we can construct the two endomorphism rings End_R(R_R) and End_R(_RR).

Lemma 2. Let R be a ring with an identity element. Then R ≅ End_R(R_R) and R ≅ End_R(_RR)^op.

Define ρ_r ∈ End_R(_RR) by x → xr.

A calculation shows that ρ_{rs} = ρ_s ρ_r (functions written on the left), from which it is easily seen that the map θ : r → ρ_r is a ring homomorphism from R to End_R(_RR)^op. We must show that this is an isomorphism.

If ρ_r = 0, then r = 1r = ρ_r(1) = 0. So θ is injective.

Let f be an arbitrary element of End_R(_RR), and let r = f(1). Then for any x ∈ R, f(x) = f(x1) = xf(1) = xr = ρ_r(x), so f = ρ_r = θ(r).

The proof of the other isomorphism is similar.

Version: 4 Owner: mclase Author(s): mclase


Chapter 269 16W30 – Coalgebras, bialgebras, Hopf algebras ; rings, modules, etc. on which these act
269.1 Hopf algebra

A Hopf algebra is a bialgebra A over a field K with a K-linear map S : A → A, called the antipode, such that

m ◦ (S ⊗ id) ◦ ∆ = η ◦ ε = m ◦ (id ⊗ S) ◦ ∆,   (269.1.1)

where m : A ⊗ A → A is the multiplication map m(a ⊗ b) = ab and η : K → A is the unit map η(k) = k1_A.

(Equation (269.1.1) can also be expressed as a commutative diagram: the two paths A → A ⊗ A → A ⊗ A → A, applying ∆, then S ⊗ id or id ⊗ S, then m, both agree with η ◦ ε.)

Example 1 (Algebra of functions on a finite group). Let A = C(G) be the algebra of complex-valued functions on a finite group G and identify C(G × G) with A ⊗ A. Then, A is a Hopf algebra with comultiplication (∆(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode (S(f))(x) = f(x^{−1}).

Example 2 (Group algebra of a finite group). Let A = CG be the complex group algebra of a finite group G. Then, A is a Hopf algebra with comultiplication ∆(g) = g ⊗ g, counit ε(g) = 1, and antipode S(g) = g^{−1}.

The above two examples are dual to one another. Define a bilinear form C(G) ⊗ CG → C by ⟨f, x⟩ = f(x). Then,

⟨fg, x⟩ = ⟨f ⊗ g, ∆(x)⟩,
⟨1, x⟩ = ε(x),
⟨∆(f), x ⊗ y⟩ = ⟨f, xy⟩,
ε(f) = ⟨f, e⟩,
⟨S(f), x⟩ = ⟨f, S(x)⟩.

Example 3 (Polynomial functions on a Lie group). Let A = Poly(G) be the algebra of complex-valued polynomial functions on a complex Lie group G and identify Poly(G × G) with A ⊗ A. Then, A is a Hopf algebra with comultiplication (∆(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode (S(f))(x) = f(x^{−1}).

Example 4 (Universal enveloping algebra of a Lie algebra). Let A = U(g) be the universal enveloping algebra of a complex Lie algebra g. Then, A is a Hopf algebra with comultiplication ∆(X) = X ⊗ 1 + 1 ⊗ X, counit ε(X) = 0, and antipode S(X) = −X.

The above two examples are dual to one another (if g is the Lie algebra of G). Define a bilinear form Poly(G) ⊗ U(g) → C by ⟨f, X⟩ = (d/dt)|_{t=0} f(exp(tX)). Version: 6 Owner: mhale Author(s): mhale

269.2

almost cocommutative bialgebra
A bialgebra A is called almost cocommutative if there is a unit R ∈ A ⊗ A such that

R∆(a) = ∆^op(a)R

where ∆^op is the opposite comultiplication (the usual comultiplication, composed with the flip map of the tensor product A ⊗ A). The element R is often called the R-matrix of A. The significance of the almost cocommutative condition is that σ_{V,W} = σ ◦ R : V ⊗ W → W ⊗ V gives a natural isomorphism of bialgebra representations, where V and W are A-modules, making the category of A-modules into a quasi-tensor or braided monoidal category. Note that σ_{W,V} ◦ σ_{V,W} is not necessarily the identity (this is the braiding of the category). Version: 2 Owner: bwebste Author(s): bwebste

269.3
bialgebra

A bialgebra is a vector space that is both a unital algebra and a coalgebra, such that the comultiplication and counit are unital algebra homomorphisms. Version: 2 Owner: mhale Author(s): mhale

269.4
coalgebra

A coalgebra is a vector space A over a field K with a K-linear map ∆ : A → A ⊗ A, called the comultiplication, and a (non-zero) K-linear map ε : A → K, called the counit, such that

(∆ ⊗ id) ◦ ∆ = (id ⊗ ∆) ◦ ∆   (coassociativity),   (269.4.1)
(ε ⊗ id) ◦ ∆ = id = (id ⊗ ε) ◦ ∆.   (269.4.2)

(These identities are usually displayed as commutative diagrams involving A, A ⊗ A and A ⊗ A ⊗ A.)

Let σ : A ⊗ A → A ⊗ A be the flip map σ(a ⊗ b) = b ⊗ a. A coalgebra is said to be cocommutative if σ ◦ ∆ = ∆. Version: 4 Owner: mhale Author(s): mhale

269.5

coinvariant

Let V be a comodule with a right coaction t : V → V ⊗ A of a coalgebra A. An element v ∈ V is right coinvariant if

t(v) = v ⊗ 1_A.   (269.5.1)

Version: 1 Owner: mhale Author(s): mhale

269.6

comodule

Let (A, ∆, ε) be a coalgebra. A right A-comodule is a vector space V with a linear map t : V → V ⊗ A, called the right coaction, satisfying

(t ⊗ id) ◦ t = (id ⊗ ∆) ◦ t,   (id ⊗ ε) ◦ t = id.   (269.6.1)

An A-comodule is also referred to as a corepresentation of A.

Let V and W be two right A-comodules. Then V ⊕ W is also a right A-comodule. If A is a bialgebra then V ⊗ W is a right A-comodule as well (make use of the multiplication map A ⊗ A → A). Version: 2 Owner: mhale Author(s): mhale

269.7

comodule algebra

Let H be a bialgebra. A right H-comodule algebra is a unital algebra A which is a right H-comodule satisfying

t(ab) = t(a)t(b) = Σ a_(1)b_(1) ⊗ a_(2)b_(2),   t(1_A) = 1_A ⊗ 1_H,   (269.7.1)

for all a, b ∈ A. There is a dual notion of an H-module coalgebra.

Example 5. Let H be a bialgebra. Then H is itself an H-comodule algebra for the right regular coaction t(h) = ∆(h). Version: 5 Owner: mhale Author(s): mhale

269.8

comodule coalgebra

Let H be a bialgebra. A right H-comodule coalgebra is a coalgebra A which is a right H-comodule satisfying

(∆ ⊗ id)t(a) = Σ a_(1)(1) ⊗ a_(2)(1) ⊗ a_(1)(2) a_(2)(2),   (ε ⊗ id)t(a) = ε(a)1_H,   (269.8.1)

for all a ∈ A. There is a dual notion of an H-module algebra.

Example 6. Let H be a Hopf algebra. Then H is itself an H-comodule coalgebra for the adjoint coaction t(h) = Σ h_(2) ⊗ S(h_(1))h_(3). Version: 4 Owner: mhale Author(s): mhale

269.9

module algebra

Let H be a bialgebra. A left H-module algebra is a unital algebra A which is a left H-module satisfying

h ▷ (ab) = Σ (h_(1) ▷ a)(h_(2) ▷ b),   h ▷ 1_A = ε(h)1_A,   (269.9.1)

for all h ∈ H and a, b ∈ A. There is a dual notion of an H-comodule coalgebra.

Example 7. Let H be a Hopf algebra. Then H is itself an H-module algebra for the adjoint action g ▷ h = Σ g_(1) h S(g_(2)). Version: 4 Owner: mhale Author(s): mhale

269.10

module coalgebra

Let H be a bialgebra. A left H-module coalgebra is a coalgebra A which is a left H-module satisfying

∆(h ▷ a) = Σ (h_(1) ▷ a_(1)) ⊗ (h_(2) ▷ a_(2)),   ε(h ▷ a) = ε(h)ε(a),   (269.10.1)

for all h ∈ H and a ∈ A. There is a dual notion of an H-comodule algebra.

Example 8. Let H be a bialgebra. Then H is itself an H-module coalgebra for the left regular action g ▷ h = gh. Version: 5 Owner: mhale Author(s): mhale

Chapter 270 16W50 – Graded rings and modules
270.1 graded algebra

An algebra A is graded if it is a graded module and satisfies

A_p · A_q ⊆ A_{p+q}.

Examples of graded algebras include the polynomial ring k[X], an N-graded k-algebra, and the exterior algebra. Version: 1 Owner: dublisk Author(s): dublisk

270.2

graded module

If R = R_0 ⊕ R_1 ⊕ · · · is a graded ring, then a graded module over R is a module M of the form M = ⊕_{i=−∞}^{∞} M_i which satisfies R_i M_j ⊆ M_{i+j} for all i, j. Version: 4 Owner: KimJ Author(s): KimJ

270.3

supercommutative

Let R be a Z_2-graded ring. Then R is supercommutative if for any homogeneous elements a and b ∈ R:

ab = (−1)^{deg a · deg b} ba.

That is, even homogeneous elements are in the center of the ring, and odd homogeneous elements anti-commute. Common examples of supercommutative rings are the exterior algebra of a module over a commutative ring (in particular, a vector space) and the cohomology ring of a topological space (both with the standard grading by degree reduced mod 2). Version: 1 Owner: bwebste Author(s): bwebste


Chapter 271 16W55 – “Super” (or “skew”) structure
271.1 super tensor product

If A and B are Z-graded algebras, we define the super tensor product A ⊗_su B to be the ordinary tensor product as graded modules, but with multiplication – called the super product – defined by

(a ⊗ b)(a′ ⊗ b′) = (−1)^{(deg b)(deg a′)} aa′ ⊗ bb′

where a, a′, b, b′ are homogeneous. The super tensor product of A and B is itself a graded algebra, as we grade the super tensor product of A and B as follows:

(A ⊗_su B)_n = ⊕_{p+q=n} A_p ⊗ B_q

Version: 4 Owner: dublisk Author(s): dublisk

271.2

superalgebra

A graded algebra A is said to be a super algebra if it has a Z/2Z grading. Version: 2 Owner: dublisk Author(s): dublisk


271.3

supernumber

Let Λ_N be the Grassmann algebra generated by θ^i, i = 1 . . . N, such that θ^i θ^j = −θ^j θ^i and (θ^i)^2 = 0. Denote by Λ_∞ the case of an infinite number of generators θ^i. A supernumber is an element of Λ_N or Λ_∞.

Any supernumber z can be expressed uniquely in the form

z = z_0 + z_i θ^i + (1/2!) z_{ij} θ^i θ^j + . . . + (1/n!) z_{i_1...i_n} θ^{i_1} . . . θ^{i_n} + . . . ,

where the coefficients z_{i_1...i_n} ∈ C are antisymmetric in their indices. The body of z is defined as z_B = z_0, and its soul is defined as z_S = z − z_B. If z_B ≠ 0 then z has an inverse given by

z^{−1} = (1/z_B) Σ_{k=0}^{∞} (−z_S/z_B)^k.

A supernumber can be decomposed into the even and odd parts

z_even = z_0 + (1/2!) z_{ij} θ^i θ^j + . . . + (1/(2n)!) z_{i_1...i_{2n}} θ^{i_1} . . . θ^{i_{2n}} + . . . ,
z_odd = z_i θ^i + (1/3!) z_{ijk} θ^i θ^j θ^k + . . . + (1/(2n+1)!) z_{i_1...i_{2n+1}} θ^{i_1} . . . θ^{i_{2n+1}} + . . . .

Purely even supernumbers are called c-numbers, and odd supernumbers are called a-numbers. The superalgebra Λ_N thus has a decomposition Λ_N = C_c ⊕ C_a, where C_c is the space of c-numbers, and C_a is the space of a-numbers.

Supernumbers are the generalisation of complex numbers to a commutative superalgebra of commuting and anticommuting "numbers". They are primarily used in the description of fermionic fields in quantum field theory. Version: 5 Owner: mhale Author(s): mhale


Chapter 272 16W99 – Miscellaneous
272.1 Hamiltonian quaternions

Definition of Q

We define a unital associative algebra Q over R, of dimension 4, by the basis {1, i, j, k} and the multiplication table

      | 1    i    j    k
    1 | 1    i    j    k
    i | i    −1   k    −j
    j | j    −k   −1   i
    k | k    j    −i   −1

(where the element in row x and column y is xy, not yx). Thus an arbitrary element of Q is of the form

a1 + bi + cj + dk,   a, b, c, d ∈ R

(sometimes denoted by ⟨a, b, c, d⟩ or by a + ⟨b, c, d⟩) and the product of two elements ⟨a, b, c, d⟩ and ⟨α, β, γ, δ⟩ is ⟨w, x, y, z⟩ where

w = aα − bβ − cγ − dδ
x = aβ + bα + cδ − dγ
y = aγ − bδ + cα + dβ
z = aδ + bγ − cβ + dα

The elements of Q are known as Hamiltonian quaternions. Clearly the subspaces of Q generated by {1} and by {1, i} are subalgebras isomorphic to R and C respectively. R is customarily identified with the corresponding subalgebra of Q. (We


shall see in a moment that there are other and less obvious embeddings of C in Q.) The real numbers commute with all the elements of Q, and we have λ · ⟨a, b, c, d⟩ = ⟨λa, λb, λc, λd⟩ for λ ∈ R and ⟨a, b, c, d⟩ ∈ Q.

Norm, conjugate, and inverse of a quaternion

Like the complex numbers (C), the quaternions have a natural involution called the quaternion conjugate. If q = a1 + bi + cj + dk, then the quaternion conjugate of q, denoted q̄, is simply q̄ = a1 − bi − cj − dk.

One can readily verify that if q = a1 + bi + cj + dk, then qq̄ = (a² + b² + c² + d²)1. (See Euler four-square identity.) This product is used to form a norm ‖·‖ on the algebra (or the ring) Q: we define ‖q‖ = √s where qq̄ = s1.

If v, w ∈ Q and λ ∈ R, then

1. ‖v‖ ≥ 0, with equality only if v = ⟨0, 0, 0, 0⟩ = 0
2. ‖λv‖ = |λ| ‖v‖
3. ‖v + w‖ ≤ ‖v‖ + ‖w‖
4. ‖v · w‖ = ‖v‖ · ‖w‖

which means that Q qualifies as a normed algebra when we give it the norm ‖·‖.

Because the norm of any nonzero quaternion q is real and nonzero, we have

qq̄ / ‖q‖² = q̄q / ‖q‖² = ⟨1, 0, 0, 0⟩ = 1,

which shows that any nonzero quaternion has an inverse:

q⁻¹ = q̄ / ‖q‖².

Other embeddings of C into Q

One can use any non-zero q to define an embedding of C into Q. If n(z) is a natural embedding of z ∈ C into Q, then the embedding

z → q n(z) q⁻¹

is also an embedding into Q. Because Q is an associative algebra, it is obvious that

(q n(a) q⁻¹)(q n(b) q⁻¹) = q (n(a) n(b)) q⁻¹

and with the distributive laws, it is easy to check that

(q n(a) q⁻¹) + (q n(b) q⁻¹) = q (n(a) + n(b)) q⁻¹.

Rotations in 3-space

Let us write U = {q ∈ Q : ‖q‖ = 1}. With multiplication, U is a group. Let us briefly sketch the relation between U and the group SO(3) of rotations (about the origin) in 3-space.
An arbitrary element q of U can be expressed as cos(θ/2) + sin(θ/2)(ai + bj + ck), for some real numbers θ, a, b, c such that a² + b² + c² = 1. The permutation v → qvq⁻¹ of Q preserves the subspace spanned by {i, j, k} and thus gives rise to a permutation of the real sphere. It turns out that that permutation is a rotation. Its axis is the line through (0, 0, 0) and (a, b, c), and the angle through which it rotates the sphere is θ. If rotations F and G correspond to quaternions q and r respectively, then clearly the permutation v → (qr)v(qr)⁻¹ corresponds to the composite rotation F ◦ G. Thus this mapping of U onto SO(3) is a group homomorphism. Its kernel is the subset {1, −1} of U, and thus it comprises a double cover of SO(3). The kernel has a geometric interpretation as well: two unit vectors in opposite directions determine the same axis of rotation.

Version: 3 Owner: mathcam Author(s): Larry Hammick, patrickwonders


Chapter 273 16Y30 – Near-rings
273.1 near-ring

A near-ring is a set N together with two binary operations, denoted + : N × N → N and · : N × N → N, such that

1. (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c) for all a, b, c ∈ N (associativity of both operations)
2. There exists an element 0 ∈ N such that a + 0 = 0 + a = a for all a ∈ N (additive identity)
3. For all a ∈ N, there exists b ∈ N such that a + b = b + a = 0 (additive inverse)
4. (a + b) · c = (a · c) + (b · c) for all a, b, c ∈ N (right distributive law)

Note that the axioms of a near-ring differ from those of a ring in that they do not require addition to be commutative, and only require distributivity on one side.

Every element a in a near-ring has a unique additive inverse, denoted −a. We say N has an identity element if there exists an element 1 ∈ N such that a · 1 = 1 · a = a for all a ∈ N. We say N is distributive if a · (b + c) = (a · b) + (a · c) holds for all a, b, c ∈ N. We say N is commutative if a · b = b · a for all a, b ∈ N.

A natural example of a near-ring is the following. Let (G, +) be a group (not necessarily abelian), and let M be the set of all functions from G to G. For two functions f and g in M define f + g ∈ M by (f + g)(x) = f (x) + g(x) for all x ∈ G. Then (M, +, ◦) is a near-ring with identity, where ◦ denotes composition of functions.

Version: 13 Owner: yark Author(s): yark, juergen
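The example above can be made concrete. The small Python sketch below (our own encoding: a map G → G is stored as a tuple of its values, with G = Z/4Z) checks that right distributivity holds while left distributivity fails for a non-additive map:

```python
# The near-ring M of all maps from G = Z/4Z to itself:
# (f + g)(x) = f(x) + g(x) mod 4, and multiplication is composition.
n = 4
def add(f, g):
    return tuple((f[x] + g[x]) % n for x in range(n))
def comp(f, g):
    # (f . g)(x) = f(g(x))
    return tuple(f[g[x]] for x in range(n))

f = tuple(range(n))                     # the identity map
g = (1,) * n                            # the constant map 1
s = tuple(x * x % n for x in range(n))  # squaring, a non-additive map

# Right distributivity (f + g) . s = (f . s) + (g . s) holds for all maps:
right_ok = comp(add(f, g), s) == add(comp(f, s), comp(g, s))
# Left distributivity s . (f + g) = (s . f) + (s . g) fails here:
left_ok = comp(s, add(f, g)) == add(comp(s, f), comp(s, g))
```

This illustrates why only the right distributive law appears among the axioms: composition evaluated after a pointwise sum distributes automatically, while a non-additive map composed with a sum does not.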

Chapter 274 17A01 – General theory
274.1 commutator bracket
Let A be an associative algebra over a field K. For a, b ∈ A, the element of A defined by

[a, b] = ab − ba

is called the commutator of a and b. The corresponding bilinear operation

[−, −] : A × A → A

is called the commutator bracket.

The commutator bracket is bilinear, skew-symmetric, and also satisfies the Jacobi identity. To wit, for a, b, c ∈ A we have

[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.

The proof of this assertion is straightforward. Each of the brackets in the left-hand side expands to 4 terms, and then everything cancels.

In categorical terms, what we have here is a functor from the category of associative algebras to the category of Lie algebras over a fixed field. The action of this functor is to turn an associative algebra A into a Lie algebra that has the same underlying vector space as A, but whose multiplication operation is given by the commutator bracket. It must be noted that this functor is right-adjoint to the universal enveloping algebra functor.

Examples

• Let V be a vector space. Composition endows the vector space of endomorphisms End V with the structure of an associative algebra. However, we could also regard End V as a Lie algebra relative to the commutator bracket:

[X, Y] = XY − YX, for X, Y ∈ End V.

• The algebra of differential operators has some interesting properties when viewed as a Lie algebra. The fact is that even though the composition of differential operators is a non-commutative operation, it is commutative when restricted to the highest order terms of the involved operators. Thus, if X, Y are differential operators of order p and q, respectively, the compositions XY and YX have order p + q. Their highest order terms coincide, and hence the commutator [X, Y] has order p + q − 1.

• In light of the preceding comments, it is evident that the vector space of first-order differential operators is closed with respect to the commutator bracket. Specializing even further, we remark that a vector field is just a homogeneous first-order differential operator, and that the commutator bracket for vector fields, when viewed as first-order operators, coincides with the usual, geometrically motivated vector field bracket.

Version: 4 Owner: rmilson Author(s): rmilson
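Skew-symmetry and the Jacobi identity are easy to verify on concrete matrices. A minimal Python sketch (plain nested lists, no external libraries; the sample matrices are arbitrary):

```python
# Check skew-symmetry and the Jacobi identity of the commutator bracket
# [X, Y] = XY - YX on some concrete 2x2 integer matrices.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]
def addm(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]
def br(X, Y):
    return sub(mul(X, Y), mul(Y, X))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, -1]]
zero = [[0, 0], [0, 0]]

skew = br(A, B) == sub(zero, br(B, A))                     # [A, B] = -[B, A]
jacobi = addm(addm(br(A, br(B, C)),                        # [A,[B,C]] + [B,[C,A]] + [C,[A,B]] = 0
                   br(B, br(C, A))),
              br(C, br(A, B))) == zero
```

Since the arithmetic is over the integers, both identities hold exactly, mirroring the term-by-term cancellation mentioned in the proof sketch above.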


Chapter 275 17B05 – Structure theory
275.1 Killing form

Let g be a finite dimensional Lie algebra over a field k, and adX : g → g be the adjoint action, adX Y = [X, Y ]. Then the Killing form on g is a bilinear map Bg : g × g → k given by Bg(X, Y ) = tr(adX ◦ adY ). The Killing form is invariant and symmetric (since trace is symmetric). Version: 4 Owner: bwebste Author(s): bwebste
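As a concrete check, the Killing form of sl2 can be computed by hand. In the basis (e, h, f) with brackets [h, e] = 2e, [e, f] = h, [h, f] = −2f (the standard sl2 structure constants), the ad matrices are written out below and Bg(X, Y) = tr(adX ◦ adY) gives B(h, h) = 8 and B(e, f) = 4. A small Python sketch:

```python
# Killing form of sl2 in the basis (e, h, f); the ad matrices are
# written column-by-column from [h, e] = 2e, [e, f] = h, [h, f] = -2f.
ad = {
    'e': [[0, -2, 0], [0, 0, 1], [0, 0, 0]],
    'h': [[2, 0, 0], [0, 0, 0], [0, 0, -2]],
    'f': [[0, 0, 0], [-1, 0, 0], [0, 2, 0]],
}

def killing(x, y):
    # B(x, y) = tr(ad x . ad y), with tr(XY) = sum_i sum_k X[i][k] Y[k][i]
    X, Y = ad[x], ad[y]
    return sum(X[i][k] * Y[k][i] for i in range(3) for k in range(3))
```

Symmetry of B is visible here as killing('e', 'f') == killing('f', 'e'), reflecting the symmetry of the trace.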

275.2

Levi’s theorem

Let g be a complex Lie algebra, r its radical. Then the extension 0 → r → g → g/r → 0 is split, i.e., there exists a subalgebra h of g mapping isomorphically to g/r under the natural projection. Version: 2 Owner: bwebste Author(s): bwebste

275.3

nilradical

Let g be a Lie algebra. Then the nilradical n of g is defined to be the intersection of the kernels of all the irreducible representations of g. Equivalently, n = [g, g] ∩ rad g, the intersection of the derived ideal and the radical of g.

Version: 1 Owner: bwebste Author(s): bwebste

275.4

radical

Let g be a Lie algebra. Since the sum of any two solvable ideals of g is in turn solvable, there is a unique maximal solvable ideal in any Lie algebra. This ideal is called the radical of g, denoted rad g. Note that g/rad g has no nonzero solvable ideals, and is thus semi-simple. Thus, every Lie algebra is an extension of a semi-simple algebra by a solvable one. Version: 2 Owner: bwebste Author(s): bwebste


Chapter 276 17B10 – Representations, algebraic theory (weights)
276.1 Ado’s theorem

Every finite dimensional Lie algebra has a faithful finite dimensional representation. In other words, every finite dimensional Lie algebra is isomorphic to a matrix Lie algebra. The analogous result is not true for Lie groups. Version: 2 Owner: bwebste Author(s): bwebste

276.2

Lie algebra representation

A representation of a Lie algebra g is a Lie algebra homomorphism ρ : g → End V, where End V is the commutator Lie algebra of some vector space V. In other words, ρ is a linear mapping that satisfies

ρ([a, b]) = ρ(a)ρ(b) − ρ(b)ρ(a), for all a, b ∈ g.

We call the representation faithful if ρ is injective. An invariant subspace or sub-module W ⊂ V is a subspace of V satisfying ρ(a)(W) ⊂ W for all a ∈ g. A representation is called irreducible or simple if its only invariant subspaces are {0} and the whole representation.

Alternatively, one calls V a g-module, and calls ρ(a), a ∈ g the action of a on V .

The dimension of V is called the dimension of the representation. If V is infinite-dimensional, then one speaks of an infinite-dimensional representation.

Given a representation, or a pair of representations, there are several operations which will produce other representations. First there is the direct sum: if ρ : g → End(V) and σ : g → End(W) are representations, then V ⊕ W has the obvious Lie algebra action, via the embedding End(V) × End(W) → End(V ⊕ W).

Version: 9 Owner: bwebste Author(s): bwebste, rmilson

276.3

adjoint representation

Let g be a Lie algebra. For every a ∈ g we define the adjoint endomorphism, a.k.a. the adjoint action, ad(a) : g → g to be the linear transformation with action

ad(a) : b → [a, b], b ∈ g.

The linear mapping ad : g → End(g) with action a → ad(a) is called the adjoint representation of g. The fact that ad defines a representation is a straightforward consequence of the Jacobi identity axiom. Indeed, let a, b ∈ g be given. We wish to show that

ad([a, b]) = [ad(a), ad(b)],

where the bracket on the left is the g multiplication structure, and the bracket on the right is the commutator bracket. For all c ∈ g the left hand side maps c to [[a, b], c], while the right hand side maps c to [a, [b, c]] + [b, [a, c]]. Taking skew-symmetry of the bracket as a given, the equality of these two expressions is logically equivalent to the Jacobi identity:

[a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0.

Version: 2 Owner: rmilson Author(s): rmilson

276.4

examples of non-matrix Lie groups

While most well-known Lie groups are matrix groups, there do in fact exist Lie groups which are not matrix groups. That is, they have no faithful finite dimensional representations.

For example, let H be the real Heisenberg group

H = { [1 a b; 0 1 c; 0 0 1] : a, b, c ∈ R },

and Γ the discrete subgroup

Γ = { [1 0 n; 0 1 0; 0 0 1] : n ∈ Z }.

The subgroup Γ is central, and thus normal. The Lie group H/Γ has no faithful finite dimensional representations over R or C.

Another example is the universal cover of SL2 R. SL2 R is homotopy equivalent to a circle, and thus π1(SL2 R) ≅ Z, so SL2 R has an infinite-sheeted cover. Any real or complex representation of the universal cover factors through the projection map to SL2 R. Version: 3 Owner: bwebste Author(s): bwebste

276.5

isotropy representation

Let g be a Lie algebra, and h ⊂ g a subalgebra. The isotropy representation of h relative to g is the naturally defined action of h on the quotient vector space g/h. Here is a synopsis of the technical details. As is customary, we will use b + h, b ∈ g, to denote the coset elements of g/h. Let a ∈ h be given. Since h is invariant with respect to adg(a), the adjoint action factors through the quotient to give a well defined endomorphism of g/h. The action is given by

b + h → [a, b] + h, b ∈ g.

This is the action alluded to in the first paragraph. Version: 3 Owner: rmilson Author(s): rmilson

Chapter 277 17B15 – Representations, analytic theory
277.1 invariant form (Lie algebras)

Let V be a representation of a Lie algebra g over a field k. Then a bilinear form B : V × V → k is invariant if

B(Xv, w) + B(v, Xw) = 0

for all X ∈ g, v, w ∈ V. This criterion seems a little odd, but in the context of Lie algebras, it makes sense. For example, the map B̃ : V → V∗ given by v → B(·, v) is equivariant if and only if B is an invariant form. Version: 2 Owner: bwebste Author(s): bwebste


Chapter 278 17B20 – Simple, semisimple, reductive (super)algebras (roots)
278.1 Borel subalgebra

Let g be a semi-simple Lie algebra, h a Cartan subalgebra, R the associated root system and R+ ⊂ R a set of positive roots. We have a root decomposition into the Cartan subalgebra and the root spaces gα:

g = h ⊕ (⊕α∈R gα).

Now let b be the direct sum of the Cartan subalgebra and the positive root spaces:

b = h ⊕ (⊕β∈R+ gβ).

This is called a Borel subalgebra.

Version: 2 Owner: bwebste Author(s): bwebste

278.2

Borel subgroup

Let G be a complex semi-simple Lie group. Then any maximal solvable subgroup B ⊂ G is called a Borel subgroup. All Borel subgroups of a given group are conjugate. Any Borel subgroup is connected and equal to its own normalizer, and contains a unique Cartan subgroup. The intersection of B with a maximal compact subgroup K of G is the maximal torus of K. If G = SLn C, then the standard Borel subgroup is the set of upper triangular matrices.

Version: 2 Owner: bwebste Author(s): bwebste

278.3

Cartan matrix

Let R ⊂ E be a reduced root system, with E a euclidean vector space with inner product (·, ·), and let Π = {α1, · · · , αn} be a base of this root system. Then the Cartan matrix of the root system is the matrix

Ci,j = 2(αi, αj) / (αi, αi).

The Cartan matrix uniquely determines the root system, and is unique up to simultaneous permutation of the rows and columns. It is also the basis change matrix from the basis of fundamental weights to the basis of simple roots in E. Version: 1 Owner: bwebste Author(s): bwebste
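The formula can be evaluated directly from a concrete base. The Python sketch below (our own choice of realization) uses the simple roots of A2 as two unit vectors at 120° and recovers the familiar A2 Cartan matrix:

```python
import math

# Simple roots of A2 realized in the plane: alpha1 = (1, 0),
# alpha2 = (-1/2, sqrt(3)/2); equal lengths, angle 120 degrees.
def ip(a, b):
    return sum(x * y for x, y in zip(a, b))

simple = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2)]
# C[i][j] = 2 (alpha_i, alpha_j) / (alpha_i, alpha_i); the entries are
# integers by the root system axioms, so rounding just strips the float.
cartan = [[round(2 * ip(a, b) / ip(a, a)) for b in simple] for a in simple]
```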

278.4

Cartan subalgebra

Let g be a Lie algebra. Then a Cartan subalgebra is a maximal subalgebra h of g which is self-normalizing, that is, if [g, h] ∈ h for all h ∈ h, then g ∈ h as well. Any Cartan subalgebra h is nilpotent, and if g is semi-simple, it is abelian. All Cartan subalgebras of a Lie algebra are conjugate under the adjoint action of any Lie group with algebra g. Version: 3 Owner: bwebste Author(s): bwebste

278.5

Cartan’s criterion

A Lie algebra g is semi-simple if and only if its Killing form Bg is nondegenerate. Version: 2 Owner: bwebste Author(s): bwebste

278.6

Casimir operator

Let g be a semisimple Lie algebra, and let (·, ·) denote the Killing form. If {gi} is a basis of g, then there is a dual basis {g^i} with respect to the Killing form, i.e., (gi, g^j) = δij. Consider the element Ω = Σi gi g^i of the universal enveloping algebra of g. This element, called the Casimir operator, is central in the enveloping algebra, and thus commutes with the g-action on any representation.
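A worked example for sl2 (the dual basis below is computed by hand from the standard Killing form values B(e, f) = 4, B(h, h) = 8, so e^ = f/4, h^ = h/8, f^ = e/4): in the 2-dimensional defining representation the Casimir acts by the scalar 3/8, and in particular commutes with the action.

```python
from fractions import Fraction as F

# Casimir of sl2 in the 2-dimensional defining representation:
# Omega = e f/4 + h h/8 + f e/4, using the dual basis for the Killing form.
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
def addm(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]
def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

e = [[F(0), F(1)], [F(0), F(0)]]
f = [[F(0), F(0)], [F(1), F(0)]]
h = [[F(1), F(0)], [F(0), F(-1)]]

omega = addm(addm(scale(F(1, 4), mul(e, f)),
                  scale(F(1, 8), mul(h, h))),
             scale(F(1, 4), mul(f, e)))
# omega turns out to be (3/8) times the identity matrix
```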

Version: 2 Owner: bwebste Author(s): bwebste

278.7

Dynkin diagram

Dynkin diagrams are a combinatorial way of representing the information in a root system. Their primary advantage is that they are easier to write down, remember, and analyze than explicit representations of a root system. They are an important tool in the classification of simple Lie algebras.

Given a reduced root system R ⊂ E, with E an inner-product space, choose a base of simple roots Π (or equivalently, a set of positive roots R+). The Dynkin diagram associated to R is a graph whose vertices are Π. If πi and πj are distinct elements of the root system, we add

mij = 4(πi, πj)² / ((πi, πi)(πj, πj))

lines between them. This number is obviously nonnegative, and an integer since it is the product of two quantities that the axioms of a root system require to be integers. By the Cauchy-Schwarz inequality, and the fact that simple roots are never anti-parallel (they are all strictly contained in some half space), mij ∈ {0, 1, 2, 3}. Thus Dynkin diagrams are finite graphs, with single, double or triple edges. In fact, the constraints are much stronger than this: if the multiple edges are counted as single edges, all Dynkin diagrams are trees, and have at most one multiple edge. Indeed, all connected Dynkin diagrams fall into 4 infinite families and 5 exceptional cases, in exact parallel to the classification of simple Lie algebras.

(Does anyone have good Dynkin diagram pictures? I'd love to put some up, but am decidedly lacking.) Version: 1 Owner: bwebste Author(s): bwebste

278.8

Verma module

Let g be a semi-simple Lie algebra, h a Cartan subalgebra, and b a Borel subalgebra. Let Fλ, for a weight λ ∈ h∗, be the 1-dimensional b-module on which h acts by multiplication by λ, and the positive root spaces act trivially. Now, the Verma module Mλ of the weight λ is the g-module Mλ = Fλ ⊗U(b) U(g). This is an infinite dimensional representation, and it has a very important property: if V is any representation with highest weight λ, there is a surjective homomorphism Mλ → V. That is, all representations with highest weight λ are quotients of Mλ. Also, Mλ has a unique maximal submodule, so there is a unique irreducible representation with highest weight λ. Version: 1 Owner: bwebste Author(s): bwebste

278.9

Weyl chamber

If R ⊂ E is a root system, with E a euclidean vector space, and R+ is a set of positive roots, then the positive Weyl chamber is the set

C = {e ∈ E | (e, α) ≥ 0 for all α ∈ R+}.

The interior of C is a fundamental domain for the action of the Weyl group on E. The image w(C) of C under any element w of the Weyl group is called a Weyl chamber. The Weyl group W acts simply transitively on the set of Weyl chambers. A weight which lies inside the positive Weyl chamber is called dominant.

Version: 2 Owner: bwebste Author(s): bwebste

278.10

Weyl group

The Weyl group WR of a root system R ⊂ E, where E is a euclidean vector space, is the subgroup of GL(E) generated by reflection in the hyperplanes perpendicular to the roots. The reflection in a root α is given by

rα(v) = v − 2 ((v, α)/(α, α)) α.

The Weyl group is generated by reflections in the simple roots for any choice of a set of positive roots. There is a well-defined length function ℓ : WR → Z, where ℓ(w) is the minimal number of reflections in simple roots of which w can be written as a product. This is also the number of positive roots that w takes to negative roots. Version: 1 Owner: bwebste Author(s): bwebste
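The reflection formula can be used to generate a Weyl group by brute force. The sketch below (our own encoding: a group element is stored as the pair of images of the standard basis vectors) closes the simple reflections of A2 under composition and finds a group of order 6, i.e. the symmetric group S3:

```python
import math

# Generate the Weyl group of A2 by closing the simple reflections
# r_alpha(v) = v - 2 (v, alpha)/(alpha, alpha) alpha  under composition.
def reflect(alpha, v):
    c = 2 * sum(a * x for a, x in zip(alpha, v)) / sum(a * a for a in alpha)
    return tuple(x - c * a for a, x in zip(alpha, v))

simple = [(1.0, 0.0), (-0.5, math.sqrt(3) / 2)]

def key(M):
    # hashable, rounded form of a linear map (floating error is ~1e-15,
    # far below the rounding threshold)
    return tuple(round(x, 9) for v in M for x in v)

identity = [(1.0, 0.0), (0.0, 1.0)]
group = {key(identity): identity}
frontier = [identity]
while frontier:
    M = frontier.pop()
    for alpha in simple:
        N = [reflect(alpha, v) for v in M]   # compose r_alpha with M
        if key(N) not in group:
            group[key(N)] = N
            frontier.append(N)

order = len(group)
```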

278.11

Weyl’s theorem

Let g be a finite dimensional semi-simple Lie algebra. Then any finite dimensional representation of g is completely reducible. Version: 1 Owner: bwebste Author(s): bwebste


278.12

classification of finite-dimensional representations of semi-simple Lie algebras

If g is a semi-simple Lie algebra, then we say that an irreducible representation V has highest weight λ, if there is a vector v ∈ Vλ , the weight space of λ, such that Xv = 0 for X in any positive root space, and v is called a highest vector, or vector of highest weight. There is a unique (up to isomorphism) irreducible finite dimensional representation of g with highest weight λ for any dominant weight λ ∈ ΛW , where ΛW is the weight lattice of g, and every irreducible representation of g is of this type. Version: 1 Owner: bwebste Author(s): bwebste

278.13

cohomology of semi-simple Lie algebras

There are some important facts that make the cohomology of semi-simple Lie algebras easier to deal with than general Lie algebra cohomology. In particular, there are a number of vanishing theorems. First of all, let g be a finite-dimensional, semi-simple Lie algebra over C.

Theorem. Let M be a non-trivial irreducible representation of g. Then H^n(g, M) = 0 for all n.

Whitehead's lemmata. Let M be any representation of g. Then H^1(g, M) = H^2(g, M) = 0.

Whitehead's lemmata lead to two very important results. From the vanishing of H^1, we can derive Weyl's theorem, the fact that representations of semi-simple Lie algebras are completely reducible, since extensions of M by N are classified by H^1(g, Hom(M, N)). And from the vanishing of H^2, we obtain Levi's theorem, which states that every Lie algebra is a split extension of a semi-simple algebra by a solvable algebra, since H^2(g, M) classifies extensions of g by M with a specified action of g on M. Version: 2 Owner: bwebste Author(s): bwebste

278.14

nilpotent cone

Let g be a finite dimensional semisimple Lie algebra. Then the nilpotent cone N of g is the set of elements which act nilpotently on all representations of g. This is an irreducible subvariety of g (considered as a k-vector space), which is invariant under the adjoint action of G on g (here G is the adjoint group associated to g).


Version: 3 Owner: bwebste Author(s): bwebste

278.15

parabolic subgroup

Let G be a complex semi-simple Lie group. Then any subgroup P of G containing a Borel subgroup B is called parabolic. Parabolics are classified in the following manner. Let g be the Lie algebra of G, h the unique Cartan subalgebra contained in b, the Lie algebra of B, R the set of roots corresponding to this choice of Cartan subalgebra, R+ the set of positive roots whose root spaces are contained in b, and p the Lie algebra of P. Then there exists a unique subset ΠP of Π, the base of simple roots associated to this choice of positive roots, such that {b, g−α}α∈ΠP generates p. In other words, parabolics containing a single Borel subgroup are classified by subsets of the Dynkin diagram, with the empty set corresponding to the Borel subgroup, and the whole graph corresponding to the group G. Version: 1 Owner: bwebste Author(s): bwebste

278.16

pictures of Dynkin diagrams

Here is a complete list of connected Dynkin diagrams. In general, if the name of a diagram has n as a subscript, then there are n dots in the diagram. There are four infinite series that correspond to classical complex (that is, over C) simple Lie algebras. No pun intended.

• An, for n ≥ 1, represents the simple complex Lie algebra sln+1. Its diagram is a chain of n dots joined by single edges (A1, A2, A3, . . . ).

• Bn, for n ≥ 1, represents the simple complex Lie algebra so2n+1. Its diagram is a chain of n dots whose last edge is a double edge (B1, B2, B3, . . . ).

• Cn, for n ≥ 1, represents the simple complex Lie algebra sp2n. Its diagram has the same underlying graph as that of Bn, with the double edge oriented the other way (C1, C2, C3, . . . ).

• Dn, for n ≥ 3, represents the simple complex Lie algebra so2n. Its diagram is a chain ending in a fork of two dots (D3, D4, D5, . . . ).

And then there are the exceptional cases that come in finite families. The corresponding Lie algebras are usually called by the name of the diagram.

• There is the E series that has three members: E6, which represents a 78-dimensional Lie algebra, E7, which represents a 133-dimensional Lie algebra, and E8, which represents a 248-dimensional Lie algebra.

• There is the F4 diagram, which represents a 52-dimensional complex simple Lie algebra.

• And finally there is G2, which represents a 14-dimensional Lie algebra.

Notice the low dimensional coincidences:

A1 = B1 = C1,

which reflects the exceptional isomorphisms

sl2 ≅ so3 ≅ sp2.

Also, B2 ≅ C2, reflecting the isomorphism

so5 ≅ sp4.

And A3 ≅ D3, reflecting

sl4 ≅ so6.

Remark 1. Often in the literature the listing of Dynkin diagrams is arranged so that there are no "intersections" between different families. However, by allowing intersections one gets a graphical representation of the low degree isomorphisms. In the same vein there is a graphical representation of the isomorphism

so4 ≅ sl2 × sl2.

Namely, if not for the requirement that the families consist of connected diagrams, one could start the D family with D2, which consists of two disjoint copies of A1.

Version: 9 Owner: Dr Absentius Author(s): Dr Absentius

278.17

positive root

If R ⊂ E is a root system, with E a euclidean vector space, then a subset R+ ⊂ R is called a set of positive roots if there is a vector v ∈ E such that (α, v) > 0 if α ∈ R+, and (α, v) < 0 if α ∈ R∖R+. Roots which are not positive are called negative. Since −α is negative exactly when α is positive, exactly half the roots must be positive. Version: 2 Owner: bwebste Author(s): bwebste

278.18

rank

Let g be a finite dimensional Lie algebra. One can show that all Cartan subalgebras h ⊂ g have the same dimension. The rank of g is defined to be this dimension. Version: 5 Owner: rmilson Author(s): rmilson

278.19

root lattice

If R ⊂ E is a root system, and E a euclidean vector space, then the root lattice ΛR of R is the subgroup of E generated by R. In fact, this group is free on the simple roots, and is thus a full sublattice of E.


Version: 1 Owner: bwebste Author(s): bwebste

278.20

root system

Root systems are sets of vectors in a Euclidean space which are used to classify simple Lie algebras and to understand their representation theory, and which also appear in the theory of reflection groups. Axiomatically, an (abstract) root system R is a set of vectors in a euclidean vector space E with inner product (·, ·), such that:

1. R spans the vector space E.
2. If α ∈ R, then reflection in the hyperplane orthogonal to α preserves R.
3. If α, β ∈ R, then 2(α, β)/(α, α) is an integer.

Axiom 3 is sometimes dropped when dealing with reflection groups, but it is necessary for the root systems which arise in connection with Lie algebras. Additionally, a root system is called reduced if for all α ∈ R, kα ∈ R implies k = ±1. We call a root system indecomposable if there is no proper subset R′ ⊂ R such that every vector in R′ is orthogonal to every vector in R∖R′.

Root systems arise in the classification of semi-simple Lie algebras in the following manner: if g is a semi-simple complex Lie algebra, then one can choose a maximal self-normalizing subalgebra of g (alternatively, this is the commutant of an element whose commutant has minimal dimension), called a Cartan subalgebra, traditionally denoted h. These elements act on g by the adjoint action by diagonalizable linear maps. Since these maps all commute, they are simultaneously diagonalizable. The simultaneous eigenspaces of this action are called root spaces, and the decomposition of g into h and the root spaces is called a root decomposition of g. It turns out that all root spaces are one dimensional. Now, for each eigenspace, we have a map λ : h → C, given by Hv = λ(H)v for v an element of that eigenspace. The set R ⊂ h∗ of these λ is called the root system of the algebra g. The Cartan subalgebra h has a natural inner product (the Killing form), which in turn induces an inner product on h∗. With respect to this inner product, the root system R is an abstract root system, in the sense defined above.

Conversely, given any abstract root system R, there is a unique semi-simple complex Lie algebra g such that R is its root system. Thus to classify complex semi-simple Lie algebras, we need only classify root systems, a somewhat easier task. Really, we only need to classify indecomposable root systems, since all other root systems are built out of these. The Lie algebra corresponding to a root system is simple if and only if the associated root system is indecomposable.

By convention e1, . . . , en are orthonormal vectors, the subscript on the name of the root system is its rank (the dimension of the span of the system), and the indices i and j run from 1 to n unless otherwise indicated. There are four infinite series of indecomposable root systems:

• An = {ei − ej}i≠j, with 1 ≤ i, j ≤ n + 1, taken inside the subspace of Rn+1 on which the coordinates sum to zero. This system corresponds to sln+1 C.

• Bn = {±ei ± ej}i<j ∪ {±ei}. This system corresponds to so2n+1 C.

• Cn = {±ei ± ej}i<j ∪ {±2ei}. This system corresponds to sp2n C.

• Dn = {±ei ± ej}i<j. This system corresponds to so2n C.

and there are five exceptional root systems G2, F4, E6, E7, E8, with five corresponding exceptional algebras, generally denoted by the same letter in lower-case Fraktur (g2, etc.). Version: 3 Owner: bwebste Author(s): bwebste
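The axioms are easy to verify computationally for a small example. The Python sketch below (our own encoding of B2 as integer pairs) checks that B2 is closed under its root reflections and satisfies the integrality axiom:

```python
# The root system B2 = {±e1 ± e2} ∪ {±e1, ±e2}: check closure under
# root reflections and the integrality axiom 2(a, b)/(a, a) ∈ Z.
roots = [(1, 1), (1, -1), (-1, 1), (-1, -1),
         (1, 0), (-1, 0), (0, 1), (0, -1)]

def ip(a, b):
    return a[0] * b[0] + a[1] * b[1]

def reflect(alpha, v):
    c = 2 * ip(v, alpha) / ip(alpha, alpha)
    return (v[0] - c * alpha[0], v[1] - c * alpha[1])

closed = all(reflect(a, b) in roots for a in roots for b in roots)
integral = all((2 * ip(a, b)) % ip(a, a) == 0 for a in roots for b in roots)
```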

278.21

simple and semi-simple Lie algebras

A Lie algebra is called simple if it has no nonzero proper ideals and is not abelian. A Lie algebra is called semisimple if it has no nonzero solvable ideals. Let k = R or C. Examples of simple algebras are sln k, the Lie algebra of the special linear group (traceless matrices), son k, the Lie algebra of the special orthogonal group (skew-symmetric matrices), and sp2n k, the Lie algebra of the symplectic group. Over R, there are other simple Lie algebras, such as sun, the Lie algebra of the special unitary group (skew-Hermitian matrices). Any semisimple Lie algebra is a direct product of simple Lie algebras.

Simple and semi-simple Lie algebras are one of the most widely studied classes of algebras, for a number of reasons. First of all, many of the most interesting Lie groups have semi-simple Lie algebras. Secondly, their representation theory is very well understood. Finally, there is a beautiful classification of simple Lie algebras. Over C, there are 3 infinite series of simple Lie algebras: sln, son and sp2n, and 5 exceptional simple Lie algebras g2, f4, e6, e7, and e8. Over R the picture is more complicated, as several different Lie algebras can have the same complexification (for example, sun and sln R both have complexification sln C). Version: 3 Owner: bwebste Author(s): bwebste


278.22

simple root

Let R ⊂ E be a root system, with E a euclidean vector space. If R+ is a set of positive roots, then a root is called simple if it is positive, and not the sum of two positive roots. The simple roots form a basis of the vector space E, and any positive root is a linear combination of simple roots with nonnegative integer coefficients. A set of roots which is simple with respect to some choice of a set of positive roots is called a base. The Weyl group of the root system acts simply transitively on the set of bases. Version: 1 Owner: bwebste Author(s): bwebste

278.23

weight (Lie algebras)

Let g be a semi-simple Lie algebra, and choose a Cartan subalgebra h. Then a weight is simply an element of the dual space h∗.

Weights arise in the representation theory of semi-simple Lie algebras in the following manner: if V is a finite dimensional representation of g, then the elements of h must act on V by diagonalizable (also called semi-simple) linear transformations. Since h is abelian, these transformations must be simultaneously diagonalizable. Thus, V decomposes as the direct sum of simultaneous eigenspaces for h. If W is such an eigenspace, then the map λ defined by λ(H)w = Hw for w ∈ W is a linear functional on h, and thus a weight, as defined above. The maximal eigenspace Vλ with weight λ is called the weight space of λ. The dimension of Vλ is called the multiplicity of λ. A representation of a semi-simple algebra is determined by the multiplicities of its weights. Version: 3 Owner: bwebste Author(s): bwebste

278.24

weight lattice

The weight lattice ΛW of a root system R ⊂ E is the dual lattice to ΛR, the root lattice of R. That is,

ΛW = {e ∈ E | (e, r) ∈ Z for all r ∈ ΛR}.

Weights which lie in the weight lattice are called integral. Since the simple roots are free generators of the root lattice, one need only check that (e, π) ∈ Z for all simple roots π. If R ⊂ h∗ is the root system of a semi-simple Lie algebra g with Cartan subalgebra h, then ΛW is exactly the set of weights appearing in finite dimensional representations of g. Version: 4 Owner: bwebste Author(s): bwebste


Chapter 279 17B30 – Solvable, nilpotent (super)algebras
279.1 Engel’s theorem

Before proceeding, it will be useful to recall the definition of a nilpotent Lie algebra. Let g be a Lie algebra. The lower central series of g is defined to be the filtration of ideals D0 g ⊃ D1 g ⊃ D2 g ⊃ . . . , where D0 g = g, Dk+1g = [g, Dk g], k ∈ N. To say that g is nilpotent is to say that the lower central series has a trivial termination, i.e. that there exists a k such that Dk g = 0, or equivalently, that k nested bracket operations always vanish. Theorem 1 (Engel). Let g ⊂ End V be a Lie algebra of endomorphisms of a finite-dimensional vector space V . Suppose that all elements of g are nilpotent transformations. Then, g is a nilpotent Lie algebra. Lemma 3. Let X : V → V be a nilpotent endomorphism of a vector space V . Then, the adjoint action ad(X) : End V → End V is also a nilpotent endomorphism.

Proof. Suppose that Xk = 0 1179

for some k ∈ N. We will show that ad(X)2k−1 = 0. Note that ad(X) = l(X) − r(X), where l(X), r(X) : End V → End V, are the endomorphisms corresponding, respectively, to left and right multiplication by X. These two endomorphisms commute, and hence we can use the binomial formula to write
2k−1

ad(X)

2k−1

=
i=0

(−1)i l(X)2k−1−i r(X)i .

Each of terms in the above sum vanishes because l(X)k = r(X)k = 0. QED Lemma 4. Let g be as in the theorem, and suppose, in addition, that g is a nilpotent Lie algebra. Then the joint kernel, ker g = ker a,
a∈g

is non-trivial.

Proof. We proceed by induction on the dimension of g. The claim is true for dimension 1, because then g is generated by a single nilpotent transformation, and all nilpotent transformations are singular. Suppose then that the claim is true for all Lie algebras of dimension less than n = dim g. We note that D1 g fits the hypotheses of the lemma, and has dimension less than n, because g is nilpotent. Hence, by the induction hypothesis V0 = ker D1 g is non-trivial. Now, if we restrict all actions to V0 , we obtain a representation of g by abelian transformations. This is because for all a, b ∈ g and v ∈ V0 we have abv − bav = [a, b]v = 0. Now a finite number of mutually commuting linear endomorphisms admits a mutual eigenspace decomposition. In particular, if all of the commuting endomorphisms are singular, their joint kernel will be non-trivial. We apply this result to a basis of g/D1 g acting on V0 , and the desired conclusion follows. QED 1180

Proof of the theorem. We proceed by induction on the dimension of g. The theorem is true in dimension 1, because in that circumstance D1 g is trivial. Next, suppose that the theorem holds for all Lie algebras of dimension less than n = dim g. Let h ⊂ g be a properly contained subalgebra of minimum codimension. We claim that there exists an a ∈ g but not in h such that [a, h] ⊂ h. By the induction hypothesis, h is nilpotent. To prove the claim, consider the isotropy representation of h on g/h. By Lemma 1, the action of each a ∈ h on g/h is a nilpotent endomorphism. Hence, we can apply Lemma 2 to deduce that the joint kernel of all these actions is non-trivial, i.e. there exists an a ∈ g but not in h such that

[b, a] ≡ 0 (mod h), for all b ∈ h.

Equivalently, [h, a] ⊂ h, and the claim is proved. Evidently then, the span of a and h is a subalgebra of g. Since h has minimum codimension, we infer that h and a span all of g, and that D1 g ⊂ h.

Next, we claim that all the Dk h are ideals of g. It is enough to show that [a, Dk h] ⊂ Dk h. We argue by induction on k. Suppose the claim is true for some k. Let b ∈ h, c ∈ Dk h be given. By the Jacobi identity,

[a, [b, c]] = [[a, b], c] + [b, [a, c]].

The first term on the right-hand side is in Dk+1 h because [a, b] ∈ h. The second term is in Dk+1 h by the induction hypothesis. In this way the claim is established.

Now a is nilpotent, and hence by Lemma 1, ad(a)^n = 0 for some n ∈ N. We now claim that Dn+1 g ⊂ D1 h. By (278.1.1) it suffices to show that

[g, [. . . [g, h] . . .]] ⊂ D1 h (n times). (279.1.1)

Putting g1 = g/D1 h, h1 = h/D1 h, this is equivalent to

[g1 , [. . . [g1 , h1 ] . . .]] = 0 (n times). (279.1.2)

However, h1 is abelian, and hence the above follows directly from (278.1.2). Adapting this argument in the obvious fashion we can show that Dkn+1 g ⊂ Dk h. Since h is nilpotent, g must be nilpotent as well. QED

Historical remark. In the traditional formulation of Engel's theorem, the hypotheses are the same, but the conclusion is that there exists a basis B of V such that all elements of g are represented by nilpotent matrices relative to B. Let us put this another way. The vector space Nil of strictly upper triangular matrices is a nilpotent Lie algebra, and indeed all subalgebras of Nil are nilpotent Lie algebras. Engel's theorem asserts that the converse holds, i.e. if all elements of a Lie algebra g are nilpotent transformations, then g is isomorphic to a subalgebra of Nil. The classical result follows straightforwardly from our version of the theorem and from Lemma 2. Indeed, let V1 be the joint kernel of g. We then let U2 be the joint kernel of g acting on V /V1 , and let V2 ⊂ V be the subspace obtained by pulling U2 back to V . Repeating this a finite number of times, we obtain a flag of subspaces 0 = V0 ⊂ V1 ⊂ V2 ⊂ . . . ⊂ Vn = V such that gVk+1 ⊂ Vk for all k. We then choose a basis adapted to this flag, and we are done. Version: 2 Owner: rmilson Author(s): rmilson
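The classical formulation can be illustrated numerically for Nil, the strictly upper triangular 3 × 3 matrices; a small sketch (the particular element x and basis layout are our choice, not part of the original entry):

```python
import numpy as np

def E(i, j, n=3):
    """Matrix unit e_{ij} of size n x n."""
    m = np.zeros((n, n))
    m[i, j] = 1.0
    return m

# Basis of Nil, the strictly upper triangular 3x3 matrices.
basis = [E(0, 1), E(0, 2), E(1, 2)]

# Every element of Nil is a nilpotent transformation: x^3 = 0.
x = 2 * E(0, 1) - 3 * E(0, 2) + 5 * E(1, 2)
assert np.allclose(np.linalg.matrix_power(x, 3), 0)

# The flag 0 = V0 < V1 < V2 < V3 = R^3 with V_k = span(e_1, ..., e_k):
# every basis element sends V1 to 0 and V2 into V1.
e1, e2, e3 = np.eye(3)
for b in basis:
    assert np.allclose(b @ e1, 0)          # b V1 = 0
    assert np.allclose((b @ e2)[1:], 0)    # b V2 is contained in V1
```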

279.2 Lie's theorem

Let g be a finite dimensional complex solvable Lie algebra, and V a finite dimensional representation of g. Then there exists a nonzero element of V which is a simultaneous eigenvector for all elements of g. Applying this result inductively, we find that there is a basis of V with respect to which all elements of g are upper triangular. Version: 3 Owner: bwebste Author(s): bwebste
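A minimal illustration, assuming the representation is already given by upper triangular matrices (the specific matrices are our choice): the first standard basis vector is a simultaneous eigenvector.

```python
import numpy as np

# A spanning set for a solvable Lie algebra of upper triangular 2x2 matrices.
g = [np.array([[a, b], [0, d]], dtype=complex)
     for a, b, d in [(1, 2, 3), (0, 1, 0), (2, 0, -1)]]

# e1 = (1, 0) is an eigenvector of every element: x e1 = x[0,0] * e1.
e1 = np.array([1, 0], dtype=complex)
for x in g:
    assert np.allclose(x @ e1, x[0, 0] * e1)
```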


279.3 solvable Lie algebra

Let g be a Lie algebra. The lower central series of g is the filtration of subalgebras

D_1 g ⊃ D_2 g ⊃ D_3 g ⊃ · · · ⊃ D_k g ⊃ · · ·

of g, inductively defined for every natural number k as follows:

D_1 g := [g, g], D_k g := [g, D_{k−1} g].

The derived series of g is the filtration

D^1 g ⊃ D^2 g ⊃ D^3 g ⊃ · · · ⊃ D^k g ⊃ · · ·

defined inductively by

D^1 g := [g, g], D^k g := [D^{k−1} g, D^{k−1} g].

In fact both D_k g and D^k g are ideals of g, and D^k g ⊂ D_k g for all k. The Lie algebra g is defined to be nilpotent if D_k g = 0 for some k ∈ N, and solvable if D^k g = 0 for some k ∈ N. A subalgebra h of g is said to be nilpotent or solvable if h is nilpotent or solvable when considered as a Lie algebra in its own right. The terms may also be applied to ideals of g, since every ideal of g is also a subalgebra. Version: 1 Owner: djao Author(s): djao
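Both series can be computed mechanically for a matrix Lie algebra; a sketch for the upper triangular 2 × 2 matrices (helper names are ours):

```python
import numpy as np

def bracket(x, y):
    return x @ y - y @ x

def span_dim(mats):
    """Dimension of the linear span of a list of matrices."""
    if not mats:
        return 0
    return int(np.linalg.matrix_rank(np.array([m.flatten() for m in mats])))

def derived_subalgebra(basis):
    """A spanning set for [h, h], given a spanning set of h."""
    return [bracket(x, y) for x in basis for y in basis]

def E(i, j):
    m = np.zeros((2, 2)); m[i, j] = 1.0; return m

# g = upper triangular 2x2 matrices, with basis {e11, e12, e22}.
g = [E(0, 0), E(0, 1), E(1, 1)]

D1 = derived_subalgebra(g)    # D^1 g = [g, g], spanned by e12
D2 = derived_subalgebra(D1)   # D^2 g = [D^1 g, D^1 g] = 0
assert span_dim(D1) == 1 and span_dim(D2) == 0   # g is solvable

# But the lower central series stabilizes: [g, D_1 g] = span(e12) != 0,
# so g is solvable without being nilpotent.
lower2 = [bracket(x, y) for x in g for y in D1]
assert span_dim(lower2) == 1
```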


Chapter 280 17B35 – Universal enveloping (super)algebras
280.1 Poincaré-Birkhoff-Witt theorem

Let g be a Lie algebra over a field k, and let B be a k-basis of g equipped with a linear order ≤. The Poincaré-Birkhoff-Witt theorem (often abbreviated to PBW theorem) states that the monomials x1 x2 · · · xn with x1 ≤ x2 ≤ . . . ≤ xn elements of B constitute a k-basis of the universal enveloping algebra U(g) of g. Such monomials are often called ordered monomials or PBW-monomials.

It is easy to see that they span U(g): for all n ∈ N, let Mn denote the set

Mn = {(x1 , . . . , xn ) ∈ B^n | x1 ≤ · · · ≤ xn },

and denote by π : ⋃_{n=0}^∞ B^n → U(g) the multiplication map. Clearly it suffices to prove that

π(B^n) ⊆ ⋃_{i=0}^{n} π(Mi)

for all n ∈ N; to this end, we proceed by induction. For n = 0 the statement is clear. Assume that it holds for n − 1 ≥ 0, and consider a list (x1 , . . . , xn ) ∈ B^n. If it is an element of Mn , then we are done. Otherwise, there exists an index i such that xi > xi+1 . Now we have

π(x1 , . . . , xn ) = π(x1 , . . . , xi−1 , xi+1 , xi , xi+2 , . . . , xn ) + x1 · · · xi−1 [xi , xi+1 ]xi+2 · · · xn .

As B is a basis of g, [xi , xi+1 ] is a linear combination of B. Using this to expand the second term above, we find that it is in ⋃_{i=0}^{n−1} π(Mi ) by the induction hypothesis. The argument of π in the first term, on the other hand, is lexicographically smaller than (x1 , . . . , xn ), but contains the same entries. Clearly this rewriting process must end, and this concludes the induction step. The proof of linear independence of the PBW-monomials is slightly more difficult. Version: 1 Owner: draisma Author(s): draisma
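The rewriting step of the spanning argument can be sketched in code for a small example, the Heisenberg algebra with ordered basis p < q < e, [p, q] = e and [p, e] = [q, e] = 0 (the representation of elements and all helper names are ours):

```python
# Basis letters and their linear order p < q < e.
ORDER = {"p": 0, "q": 1, "e": 2}

def lie_bracket(x, y):
    """[x, y] as a dict {basis element: coefficient}."""
    if (x, y) == ("p", "q"):
        return {"e": 1}
    if (x, y) == ("q", "p"):
        return {"e": -1}
    return {}

def pbw_normal_form(element):
    """Rewrite a linear combination of words (tuples of basis letters)
    into a combination of non-decreasing (PBW) words."""
    result = {}
    todo = dict(element)
    while todo:
        word, coeff = todo.popitem()
        i = next((i for i in range(len(word) - 1)
                  if ORDER[word[i]] > ORDER[word[i + 1]]), None)
        if i is None:  # already an ordered monomial
            result[word] = result.get(word, 0) + coeff
            continue
        # x1..xi x(i+1)..xn = (word with xi, x(i+1) swapped)
        #                     + x1..x(i-1) [xi, x(i+1)] x(i+2)..xn
        swapped = word[:i] + (word[i + 1], word[i]) + word[i + 2:]
        todo[swapped] = todo.get(swapped, 0) + coeff
        for b, c in lie_bracket(word[i], word[i + 1]).items():
            shorter = word[:i] + (b,) + word[i + 2:]
            todo[shorter] = todo.get(shorter, 0) + c * coeff
    return {w: c for w, c in result.items() if c != 0}

# In U(g): q p = p q - e.
assert pbw_normal_form({("q", "p"): 1}) == {("p", "q"): 1, ("e",): -1}
```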

280.2 universal enveloping algebra

A universal enveloping algebra of a Lie algebra g over a field k is an associative algebra U (with unity) over k, together with a Lie algebra homomorphism ι : g → U (where the Lie algebra structure on U is given by the commutator), such that if A is another associative algebra over k and φ : g → A is another Lie algebra homomorphism, then there exists a unique homomorphism ψ : U → A of associative algebras such that ψ ◦ ι = φ, i.e. such that the triangle formed by ι : g → U , φ : g → A, and ψ : U → A commutes.

Any g has a universal enveloping algebra: let T be the associative tensor algebra generated by the vector space g, and let I be the two-sided ideal of T generated by elements of the form xy − yx − [x, y] for x, y ∈ g; then U = T /I is a universal enveloping algebra of g. Moreover, the universal property above ensures that all universal enveloping algebras of g are canonically isomorphic; this justifies the standard notation U(g). Some remarks:

1. By the Poincaré-Birkhoff-Witt theorem, the map ι is injective; usually g is identified with ι(g). From the construction above it is clear that this space generates U(g) as an associative algebra with unity.

2. By definition, the (left) representation theory of U(g) is identical to that of g. In particular, any irreducible g-module corresponds to a maximal left ideal of U(g).

Example: let g be the Lie algebra generated by the elements p, q, and e with Lie bracket determined by [p, q] = e and [p, e] = [q, e] = 0. Then U(g)/(e − 1) (where (e − 1) denotes the two-sided ideal generated by e − 1) is isomorphic to the skew polynomial algebra k[x, ∂/∂x], the isomorphism being determined by

p + (e − 1) → ∂/∂x and q + (e − 1) → x.

Version: 1 Owner: draisma Author(s): draisma


Chapter 281 17B56 – Cohomology of Lie (super)algebras
281.1 Lie algebra cohomology

Let g be a Lie algebra, and M a representation of g. Let M^g = {m ∈ M : Xm = 0 for all X ∈ g}. The assignment M → M^g is clearly a covariant functor. Call its derived functors R^i (−^g) = H^i (g, −) the Lie algebra cohomology of g with coefficients in M. These cohomology groups have certain interpretations. For any Lie algebra, H^1 (g, k) ≅ g/[g, g], the abelianization of g, and H^2 (g, M) is in natural bijection with Lie algebra extensions (thinking of M as an abelian Lie algebra)

0 → M → f → g → 0

such that the action of g on M induced by that of f coincides with that already specified. Version: 2 Owner: bwebste Author(s): bwebste


Chapter 282 17B67 – Kac-Moody (super)algebras (structure and representation theory)
282.1 Kac-Moody algebra

Let A be an n × n generalized Cartan matrix, and suppose that the rank of A is n − r. Let h be an (n + r)-dimensional complex vector space. Choose n linearly independent elements α1 , . . . , αn ∈ h^∗ (called roots) and n linearly independent elements α̌1 , . . . , α̌n ∈ h (called coroots) such that ⟨α̌i , αj ⟩ = aij , where ⟨·, ·⟩ is the natural pairing of h and h^∗. This choice is unique up to automorphisms of h. Then the Kac-Moody algebra g(A) associated to A is the Lie algebra generated by elements X1 , . . . , Xn , Y1 , . . . , Yn and h, with the relations

[Xi , Yi ] = α̌i , [Xi , Yj ] = 0 for i ≠ j,

[h, Xi ] = αi (h)Xi , [h, Yi ] = −αi (h)Yi for h ∈ h,

[Xi , [Xi , · · · , [Xi , Xj ] · · · ]] = 0, [Yi , [Yi , · · · , [Yi , Yj ] · · · ]] = 0 (1 − aij brackets, i ≠ j).

If the matrix A is positive-definite, we obtain a finite dimensional semi-simple Lie algebra, and A is the Cartan matrix associated to a Dynkin diagram. Otherwise, the algebra we obtain is infinite dimensional and has an r-dimensional center. Version: 2 Owner: bwebste Author(s): bwebste

282.2 generalized Cartan matrix

A generalized Cartan matrix is a square matrix A = (aij ) whose diagonal entries are all 2, and whose off-diagonal entries are nonpositive integers, such that aij = 0 if and only if aji = 0. Such a matrix is called symmetrizable if there is an invertible diagonal matrix B such that AB is symmetric. Version: 2 Owner: bwebste Author(s): bwebste
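The defining conditions are easy to check mechanically; a small sketch (the function name and examples are ours):

```python
def is_generalized_cartan(A):
    """Check the defining conditions of a generalized Cartan matrix."""
    n = len(A)
    for i in range(n):
        if A[i][i] != 2:
            return False
        for j in range(n):
            if i != j:
                if A[i][j] > 0 or int(A[i][j]) != A[i][j]:
                    return False
                if (A[i][j] == 0) != (A[j][i] == 0):
                    return False
    return True

# The Cartan matrix of type A2 and the affine matrix of type A1^(1) both qualify.
A2 = [[2, -1], [-1, 2]]
A1_affine = [[2, -2], [-2, 2]]
assert is_generalized_cartan(A2) and is_generalized_cartan(A1_affine)
# Fails: a12 = -1 but a21 = 0, violating the "zero iff zero" condition.
assert not is_generalized_cartan([[2, -1], [0, 2]])
```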


Chapter 283 17B99 – Miscellaneous
283.1 Jacobi identity interpretations

The Jacobi identity in a Lie algebra g has various interpretations that are more transparent, whence easier to remember, than the usual form [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0. One is the fact that the adjoint representation ad : g → End(g) really is a representation. Yet another way to formulate the identity is ad(x)[y, z] = [ad(x)y, z] + [y, ad(x)z], i.e., ad(x) is a derivation on g for all x ∈ g. Version: 2 Owner: draisma Author(s): draisma
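The derivation form of the identity can be checked directly for a matrix Lie algebra, where the bracket is the commutator (the specific matrices below are our choice):

```python
import numpy as np

def br(a, b):
    """Commutator bracket [a, b] = ab - ba."""
    return a @ b - b @ a

rng = np.random.default_rng(0)
x, y, z = (rng.integers(-5, 5, (3, 3)) for _ in range(3))

# ad(x)[y, z] = [ad(x)y, z] + [y, ad(x)z], i.e. ad(x) is a derivation.
assert np.array_equal(br(x, br(y, z)), br(br(x, y), z) + br(y, br(x, z)))
```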

283.2 Lie algebra

A Lie algebra over a field k is a vector space g with a bilinear map [ , ] : g × g → g, called the Lie bracket and denoted (x, y) → [x, y]. It is required to satisfy: 1. [x, x] = 0 for all x ∈ g. 2. The Jacobi identity: [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 for all x, y, z ∈ g.


283.2.1 Subalgebras & Ideals

A vector subspace h of the Lie algebra g is a subalgebra if h is closed under the Lie bracket operation, or, equivalently, if h itself is a Lie algebra under the same bracket operation as g. An ideal of g is a subspace h for which [x, y] ∈ h whenever either x ∈ h or y ∈ h. Note that every ideal is also a subalgebra. Some general examples of subalgebras:

• The center of g, defined by Z(g) := {x ∈ g | [x, y] = 0 for all y ∈ g}. It is an ideal of g.

• The normalizer of a subalgebra h is the set N(h) := {x ∈ g | [x, h] ⊂ h}. The Jacobi identity guarantees that N(h) is always a subalgebra of g.

• The centralizer of a subset X ⊂ g is the set C(X) := {x ∈ g | [x, X] = 0}. Again, the Jacobi identity implies that C(X) is a subalgebra of g.

283.2.2 Homomorphisms

Given two Lie algebras g and g′ over the field k, a homomorphism from g to g′ is a linear transformation φ : g → g′ such that φ([x, y]) = [φ(x), φ(y)] for all x, y ∈ g. An injective homomorphism is called a monomorphism, and a surjective homomorphism is called an epimorphism. The kernel of a homomorphism φ : g → g′ (considered as a linear transformation) is denoted ker(φ). It is always an ideal in g.

283.2.3 Examples

• Any vector space V can be made into a Lie algebra simply by setting [x, y] = 0 for all x, y ∈ V . The resulting Lie algebra is called an abelian Lie algebra.

• If G is a Lie group, then the tangent space at the identity forms a Lie algebra over the real numbers.

• R3 with the cross product operation is a nonabelian three dimensional Lie algebra over R.
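The cross-product example can be verified numerically; a small sketch (the specific vectors are our choice):

```python
import numpy as np

# R^3 with [x, y] = x x y (the cross product): [x, x] = 0 and the
# Jacobi identity both hold, so this is a three-dimensional Lie algebra.
x, y, z = np.array([1, 2, 3]), np.array([-4, 0, 5]), np.array([2, -1, 7])

assert np.array_equal(np.cross(x, x), np.zeros(3, dtype=int))
jacobi = (np.cross(x, np.cross(y, z))
          + np.cross(y, np.cross(z, x))
          + np.cross(z, np.cross(x, y)))
assert np.array_equal(jacobi, np.zeros(3, dtype=int))
```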

283.2.4 Historical Note

Lie algebras are so-named in honour of Sophus Lie, a Norwegian mathematician who pioneered the study of these mathematical objects. Lie's discovery was tied to his investigation of continuous transformation groups and symmetries. One joint project with Felix Klein called for the classification of all finite-dimensional groups acting on the plane. The task seemed hopeless owing to the generally non-linear nature of such group actions. However, Lie was able to solve the problem by remarking that a transformation group can be locally reconstructed from its corresponding “infinitesimal generators”, that is to say vector fields corresponding to various 1-parameter subgroups. In terms of this geometric correspondence, the group composition operation manifests itself as the bracket of vector fields, and this is very much a linear operation. Thus the task of classifying group actions in the plane became the task of classifying all finite-dimensional Lie algebras of planar vector fields; a project that Lie brought to a successful conclusion. This “linearization trick” proved to be incredibly fruitful and led to great advances in geometry and differential equations. Such advances are based, however, on various results from the theory of Lie algebras. Lie was the first to make significant contributions to this purely algebraic theory, but he was surely not the last. Version: 10 Owner: djao Author(s): djao, rmilson, nerdy2

283.3 real form

Let G be a complex Lie group. A real Lie group K is called a real form of G if g ≅ C ⊗R k, where g and k are the Lie algebras of G and K, respectively. Version: 2 Owner: bwebste Author(s): bwebste
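A standard illustration (not from the original entry): both su(2) and sl(2, R) complexify to sl(2, C),

```latex
\mathfrak{sl}(2,\mathbb{C}) \;\cong\; \mathbb{C}\otimes_{\mathbb{R}}\mathfrak{su}(2)
\;\cong\; \mathbb{C}\otimes_{\mathbb{R}}\mathfrak{sl}(2,\mathbb{R}),
```

so SU(2) and SL(2, R) are both real forms of SL(2, C). A complex Lie group can thus admit several non-isomorphic real forms.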


Chapter 284 18-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
284.1 Grothendieck spectral sequence

If F : C → D and G : D → E are two covariant left exact functors between abelian categories, and if F takes injective objects of C to G-acyclic objects of D, then there is a spectral sequence for each object A of C:

E2^{pq} = (R^p G ◦ R^q F )(A) ⇒ R^{p+q} (G ◦ F )(A).

If X and Y are topological spaces and C = Ab(X) is the category of sheaves of abelian groups on X and D = Ab(Y ) and E = Ab is the category of abelian groups, then for a continuous map f : X → Y we have a functor f∗ : Ab(X) → Ab(Y ), the direct image functor. We also have the global section functors ΓX : Ab(X) → Ab and ΓY : Ab(Y ) → Ab. Then since ΓY ◦ f∗ = ΓX and we can verify the hypotheses (injectives are flasque, direct images of flasque sheaves are flasque, and flasque sheaves are acyclic for the global section functor), the sequence in this case becomes

H^p (Y, R^q f∗ F) ⇒ H^{p+q} (X, F)

for a sheaf F of abelian groups on X, exactly the Leray spectral sequence. I can recommend no better book than Weibel's book on homological algebra. Sheaf theory can be found in Hartshorne or in Godement's book. Version: 5 Owner: bwebste Author(s): Manoj, ceps, nerdy2


284.2 category of sets

The category of sets has as its objects all sets and as its morphisms functions between sets. (This works if a category’s objects are only required to be part of a class, as the class of all sets exists.) Alternately one can specify a universe, containing all sets of interest in the situation, and take the category to contain only sets in that universe and functions between those sets. Version: 1 Owner: nerdy2 Author(s): nerdy2

284.3 functor

Given two categories C and D, a covariant functor T : C → D consists of an assignment for each object X of C an object T (X) of D (i.e. a “function” T : Ob(C) → Ob(D)) together with an assignment for every morphism f ∈ HomC(A, B), to a morphism T (f ) ∈ HomD(T (A), T (B)), such that: • T (1A ) = 1T (A) where 1X denotes the identity morphism on the object X (in the respective category). • T (g ◦ f ) = T (g) ◦ T (f ), whenever the composition g ◦ f is defined. A contravariant functor T : C → D is just a covariant functor T : Cop → D from the opposite category. In other words, the assignment reverses the direction of maps. If f ∈ HomC(A, B), then T (f ) ∈ HomD(T (B), T (A)) and T (g ◦ f ) = T (f ) ◦ T (g) whenever the composition is defined (the domain of g is the same as the codomain of f ). Given a category C and an object X we always have the functor T : C → Sets to the category of sets defined on objects by T (A) = Hom(X, A). If f : A → B is a morphism of C, then we define T (f ) : Hom(X, A) → Hom(X, B) by g → f ◦ g. This is a covariant functor, denoted by Hom(X, −). Similarly, one can define a contravariant functor Hom(−, X) : C → Sets. Version: 3 Owner: nerdy2 Author(s): nerdy2
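As a concrete sketch of the two functor axioms, here is the covariant powerset (direct image) functor on finite sets; all helper names and the specific sets are our own illustration:

```python
from itertools import chain, combinations

def powerset(A):
    """All subsets of A, as frozensets."""
    A = list(A)
    return [frozenset(c)
            for c in chain.from_iterable(combinations(A, r)
                                         for r in range(len(A) + 1))]

def P(f):
    """Direct-image map P(f) : P(A) -> P(B), for f given as a dict."""
    return lambda S: frozenset(f[a] for a in S)

A = {1, 2}
f = {1: "x", 2: "x"}   # f : A -> B with B = {"x"}
g = {"x": 0}           # g : B -> C with C = {0}

# Functor axioms: P(id_A) = id_{P(A)} and P(g . f) = P(g) . P(f).
identity = {a: a for a in A}
gf = {a: g[f[a]] for a in A}
for S in powerset(A):
    assert P(identity)(S) == S
    assert P(gf)(S) == P(g)(P(f)(S))
```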

284.4 monic

A morphism f : A → B in a category is called monic if for any object C and any morphisms g1 , g2 : C → A, if f ◦ g1 = f ◦ g2 then g1 = g2 .

A monic in the category of sets is simply a one-to-one function. Version: 1 Owner: nerdy2 Author(s): nerdy2

284.5 natural equivalence

A natural transformation between functors τ : F → G is called a natural equivalence (or a natural isomorphism) if there is a natural transformation σ : G → F such that τ ◦ σ = idG and σ ◦ τ = idF where idF is the identity natural transformation on F (which for each object A gives the identity map F (A) → F (A)), and composition is defined in the obvious way (for each object compose the morphisms and it’s easy to see that this results in a natural transformation). Version: 2 Owner: mathcam Author(s): mathcam, nerdy2

284.6 representable functor

A contravariant functor T : C → Sets between a category and the category of sets is representable if there is an object X of C such that T is isomorphic to the functor X • = Hom(−, X). Similarly, a covariant functor T is called representable if it is isomorphic to X• = Hom(X, −). We say that the object X represents T . X is unique up to canonical isomorphism. A vast number of important objects in mathematics are defined as representing functors. For example, if F : C → D is any functor, then the adjoint G : D → C (if it exists) can be defined as follows. For Y in D, G(Y ) is the object of C representing the functor X → Hom(F (X), Y ) if G is right adjoint to F , or X → Hom(Y, F (X)) if G is left adjoint. Thus, for example, if R is a ring, then N⊗M represents the functor L → HomR (N, HomR (M, L)). Version: 3 Owner: bwebste Author(s): bwebste, nerdy2

284.7 supplemental axioms for an Abelian category

These are axioms introduced by Alexandre Grothendieck for an abelian category. The first two are satisfied by definition in an Abelian category, and others may or may not be.

(Ab1) Every morphism has a kernel and a cokernel.

(Ab2) Every monic is the kernel of its cokernel.

(Ab3) Coproducts exist. (Coproducts are also called direct sums.) If this axiom is satisfied the category is often just called cocomplete.

(Ab3*) Products exist. If this axiom is satisfied the category is often just called complete.

(Ab4) Coproducts exist and the coproduct of monics is a monic.

(Ab4*) Products exist and the product of epics is an epic.

(Ab5) Coproducts exist and filtered colimits of exact sequences are exact.

(Ab5*) Products exist and filtered inverse limits of exact sequences are exact.

Grothendieck introduced these in his homological algebra paper in the Tôhoku Mathematical Journal. They can also be found in Weibel's excellent homological algebra book. Version: 5 Owner: nerdy2 Author(s): nerdy2


Chapter 285 18A05 – Definitions, generalizations
285.1 autofunctor

Let F : C → C be an endofunctor on a category C. If F is a bijection on the objects, Ob(C), and on the morphisms, Mor(C), then it is an autofunctor. In short, an autofunctor is a full and faithful endofunctor F : C → C such that the mapping Ob(C) → Ob(C) which is induced by F is a bijection. An autofunctor is thus precisely an automorphism of the category C; in particular it is an equivalence of C with itself, though it need not be naturally isomorphic to the identity functor idC . Version: 10 Owner: mathcam Author(s): mathcam, mhale, yark, gorun manolescu

285.2 automorphism

Roughly, an automorphism is a map from a mathematical object onto itself such that: 1. there exists an “inverse” map such that the composition of the two is the identity map of the object, and 2. any relevant structure related to the object in question is preserved. In category theory, an automorphism of an object A in a category C is a morphism ψ ∈ Mor(A, A) such that there exists another morphism φ ∈ Mor(A, A) with ψ ◦ φ = φ ◦ ψ = idA . For example, in the category of groups an automorphism is just a bijective (inverse exists and composition gives the identity) group homomorphism (group structure is preserved). Concretely, the map x → −x is an automorphism of the additive group of real numbers. In the category of topological spaces an automorphism would be a bijective, continuous map such that its inverse map is also continuous (not guaranteed as in the group case). Concretely, the map ψ : S 1 → S 1 where ψ(α) = α + θ for some fixed angle θ is an automorphism of the topological space that is the circle.

Version: 4 Owner: benjaminfjones Author(s): benjaminfjones

285.3 category

A category C consists of the following data:

1. a collection ob(C) of objects (of C)

2. for each ordered pair (A, B) of objects of C, a collection (we will assume it is a set) Hom(A, B) of morphisms from the domain A to the codomain B

3. a function ◦ : Hom(A, B) × Hom(B, C) → Hom(A, C) called composition. We normally denote ◦(f, g) by g ◦ f for morphisms f, g.

The above data must satisfy the following axioms: for objects A, B, C, D,

A1: Hom(A, B) ∩ Hom(C, D) = ∅ whenever (A, B) ≠ (C, D)

A2: (associativity) if f ∈ Hom(A, B), g ∈ Hom(B, C) and h ∈ Hom(C, D), then h ◦ (g ◦ f ) = (h ◦ g) ◦ f

A3: (existence of an identity morphism) for each object A there exists an identity morphism idA ∈ Hom(A, A) such that f ◦ idA = f for every f ∈ Hom(A, B), and idA ◦ g = g for every g ∈ Hom(B, A).

Some examples of categories:

• 0 is the empty category with no objects or morphisms, 1 is the category with one object and one (identity) morphism.

• If we assume we have a universe U which contains all sets encountered in “everyday” mathematics, Set is the category of all such small sets with morphisms being set functions.

• Top is the category of all small topological spaces with morphisms continuous functions.

• Grp is the category of all small groups whose morphisms are group homomorphisms.

Version: 9 Owner: mathcam Author(s): mathcam, RevBobo


285.4 category example (arrow category)

Let C be a category, and let D be the category whose objects are the arrows of C. A morphism between two morphisms f : A → B and g : A′ → B′ is defined to be a pair of morphisms (h, k), where h ∈ Hom(A, A′ ) and k ∈ Hom(B, B′ ), such that the square they form commutes, i.e.

k ◦ f = g ◦ h.

The resulting category D is called the arrow category of C. Version: 6 Owner: n3o Author(s): n3o

285.5 commutative diagram

Definition 15. Let C be a category. A diagram in C is a directed graph Γ with vertex set V and edge set E (“loops” and “parallel edges” are allowed), together with two maps o : V → Obj(C), m : E → Morph(C) such that if e ∈ E has source s(e) ∈ V and target t(e) ∈ V , then m(e) ∈ HomC (o(s(e)), o(t(e))). Usually diagrams are denoted by drawing the corresponding graph and labeling its vertices (respectively edges) with their images under o (respectively m); for example, if f : A → B is a morphism, the graph with two vertices labeled A and B and one edge from A to B labeled f is a diagram. Often (as in the previous example) the vertices themselves are not drawn since their position can be deduced from the position of their labels.

Definition 16. Let D = (Γ, o, m) be a diagram in the category C and γ = (e1 , . . . , en ) be a path in Γ. Then the composition along γ is the following morphism of C:

◦(γ) := m(en ) ◦ · · · ◦ m(e1 ).

We say that D is commutative or that it commutes if for any two objects in the image of o, say A = o(v1 ) and B = o(v2 ), and any two paths γ1 and γ2 that connect v1 to v2 , we have

◦(γ1 ) = ◦(γ2 ).

For example, the commutativity of the triangle with edges f : A → B, g : B → C, and h : A → C translates to h = g ◦ f , while the commutativity of the square with edges f : A → B, k : A → C, g : B → D, and h : C → D translates to g ◦ f = h ◦ k.

Version: 3 Owner: Dr Absentius Author(s): Dr Absentius

285.6 double dual embedding

Let V be a vector space over a field K. Recall that V ∗ , the dual space, is defined to be the vector space of all linear forms on V . There is a natural embedding of V into V ∗∗ , the dual of its dual space. In the language of categories, this embedding is a natural transformation between the identity functor and the double dual functor, both endofunctors operating on VK , the category of vector spaces over K.

Turning to the details, let I, D : VK → VK denote the identity and the dual functors, respectively. Recall that for a linear mapping L : U → V (a morphism in VK ), the dual homomorphism D[L] : V ∗ → U ∗ is defined by

D[L](α) : u → α(Lu), u ∈ U, α ∈ V ∗ .

The double dual embedding is a natural transformation δ : I → D2 , that associates to every V ∈ VK a linear homomorphism δV ∈ Hom(V, V ∗∗ ) described by

δV (v) : α → α(v), v ∈ V, α ∈ V ∗ .

To show that this transformation is natural, let L : U → V be a linear mapping. We must show that the square with horizontal arrows δU : U → U ∗∗ and δV : V → V ∗∗ and vertical arrows L : U → V and D2 [L] : U ∗∗ → V ∗∗ commutes. Let u ∈ U and α ∈ V ∗ be given. Following the arrows down and right we have that

(δV ◦ L)(u) : α → α(Lu).

Following the arrows right, then down we have that

(D[D[L]] ◦ δU )(u) : α → (δU u)(D[L]α) = (D[L]α)(u) = α(Lu),

as desired. Let us also note that for every non-zero v ∈ V , there exists an α ∈ V ∗ such that α(v) ≠ 0. Hence δV (v) ≠ 0, and hence δV is an embedding, i.e. it is one-to-one. If V is finite dimensional, then V ∗ has the same dimension as V . Consequently, for finite-dimensional V , the natural embedding δV is, in fact, an isomorphism. Version: 1 Owner: rmilson Author(s): rmilson
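In finite dimensions the naturality computation can be checked in coordinates: a functional is a row vector, D[L] is given by the transpose, and the commuting square reduces to associativity of matrix multiplication (the matrices and vectors below are our choice):

```python
import numpy as np

rng = np.random.default_rng(1)
L = rng.integers(-3, 3, (3, 2))   # L : U = R^2 -> V = R^3
u = rng.integers(-3, 3, 2)        # a vector in U
alpha = rng.integers(-3, 3, 3)    # a functional on V, as a row vector

# delta_V(L u) evaluated at alpha  ==  delta_U(u) evaluated at D[L] alpha,
# i.e. alpha(L u) == (L^T alpha)(u).
assert alpha @ (L @ u) == (L.T @ alpha) @ u
```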

285.7 dual category

Let C be a category. The dual category C∗ of C is the category which has the same objects as C, but in which all morphisms are “reversed”. That is to say, if A, B are objects of C and we have a morphism f : A → B, then f ∗ : B → A is a morphism in C∗ . The dual category is sometimes called the opposite category and is denoted Cop . Version: 3 Owner: RevBobo Author(s): RevBobo

285.8 duality principle

Let Σ be any statement of the elementary theory of an abstract category. We form the dual of Σ as follows:

1. Replace each occurrence of “domain” in Σ with “codomain” and conversely.

2. Replace each occurrence of g ◦ f = h with f ◦ g = h.

Informally, these conditions state that the dual of a statement is formed by reversing arrows and compositions. For example, consider the following statements about a category C:

• f : A → B

• f is monic, i.e. for all morphisms g, h for which composition makes sense, f ◦ g = f ◦ h implies g = h.

The respective dual statements are

• f : B → A

• f is epi, i.e. for all morphisms g, h for which composition makes sense, g ◦ f = h ◦ f implies g = h.

The duality principle asserts that if a statement is a theorem, then the dual statement is also a theorem. We take “theorem” here to mean provable from the axioms of the elementary theory of an abstract category. In practice, for a valid statement about a particular category C, the dual statement is valid in the dual category C∗ (Cop ). Version: 3 Owner: RevBobo Author(s): RevBobo

285.9 endofunctor

Given a category C, an endofunctor is a functor T : C → C. Version: 2 Owner: rmilson Author(s): NeuRet, Logan

285.10 examples of initial objects, terminal objects and zero objects

Examples of initial objects, terminal objects and zero objects of categories include: • The empty set is the unique initial object in the category of sets; every one-element set is a terminal object in this category; there are no zero objects. Similarly, the empty space is the unique initial object in the category of topological spaces; every one-point space is a terminal object in this category. • In the category of non-empty sets, there are no initial objects. The singletons are not initial: while every non-empty set admits a function from a singleton, this function is in general not unique. • In the category of pointed sets (whose objects are non-empty sets together with a distinguished point; a morphism from (A, a) to (B, b) is a function f : A → B with f (a) = b) every singleton serves as a zero object. Similarly, in the category of pointed topological spaces, every singleton is a zero object.


• In the category of groups, any trivial group (consisting only of its identity element) is a zero object. The same is true for the category of abelian groups as well as for the category of modules over a fixed ring. This is the origin of the term “zero object”.

• In the category of rings with identity, the ring of integers (and any ring isomorphic to it) serves as an initial object. The trivial ring consisting only of a single element 0 = 1 is a terminal object.

• In the category of schemes, the prime spectrum of the integers Spec(Z) is a terminal object. The empty scheme (which is the prime spectrum of the trivial ring) is an initial object.

• In the category of fields, there are no initial or terminal objects.

• Any partially ordered set (P, ≤) can be interpreted as a category: the objects are the elements of P , and there is a single morphism from x to y if and only if x ≤ y. This category has an initial object if and only if P has a smallest element; it has a terminal object if and only if P has a largest element. This explains the terminology.

• In the category of graphs, the null graph is an initial object. There are no terminal objects, unless we allow our graphs to have loops (edges starting and ending at the same vertex), in which case the one-point-one-loop graph is terminal.

• Similarly, the category of all small categories with functors as morphisms has the empty category as initial object and the one-object-one-morphism category as terminal object.

• Any topological space X can be viewed as a category X̂ by taking the open sets as objects, and a single morphism between two open sets U and V if and only if U ⊂ V . The empty set is the initial object of this category, and X is the terminal object.

• If X is a topological space and C is some small category, we can form the category of all contravariant functors from X̂ to C, using natural transformations as morphisms. This category is called the category of presheaves on X with values in C.
If C has an initial object c, then the constant functor which sends every open set to c is an initial object in the category of presheaves. Similarly, if C has a terminal object, then the corresponding constant functor serves as a terminal presheaf.

• If we fix a homomorphism f : A → B of abelian groups, we can consider the category C consisting of all pairs (X, φ) where X is an abelian group and φ : X → A is a group homomorphism with f φ = 0. A morphism from the pair (X, φ) to the pair (Y, ψ) is defined to be a group homomorphism r : X → Y with the property ψr = φ. The kernel of f is a terminal object in this category; this expresses the universal property of kernels. With an analogous construction, cokernels can be retrieved as initial objects of a suitable category.

• The previous example can be generalized to arbitrary limits of functors: if F : I → C is a functor, we define a new category F̂ as follows: its objects are pairs (X, (φi )) where X is an object of C and for every object i of I, φi : X → F (i) is a morphism in C such that for every morphism ρ : i → j in I, we have F (ρ)φi = φj . A morphism between pairs (X, (φi )) and (Y, (ψi )) is defined to be a morphism r : X → Y such that ψi r = φi for all objects i of I. The universal property of the limit can then be expressed as saying: any terminal object of F̂ is a limit of F and vice versa (note that F̂ need not contain a terminal object, just like F need not have a limit).

Version: 11 Owner: AxelBoldt Author(s): AxelBoldt

285.11 forgetful functor

Let C and D be categories such that each object c of C can be regarded an object of D by suitably ignoring structures c may have as a C-object but not a D-object. A functor U : C → D which operates on objects of C by “forgetting” any imposed mathematical structure is called a forgetful functor. The following are examples of forgetful functors: 1. U : Grp → Set takes groups into their underlying sets and group homomorphisms to set maps. 2. U : Top → Set takes topological spaces into their underlying sets and continuous maps to set maps. 3. U : Ab → Grp takes abelian groups to groups and acts as identity on arrows. Forgetful functors are often instrumental in studying adjoint functors. Version: 1 Owner: RevBobo Author(s): RevBobo


285.12 isomorphism

A morphism f : A −→ B in a category is an isomorphism if there exists a morphism f −1 : B −→ A which is its inverse. The objects A and B are isomorphic if there is an isomorphism between them. Examples: • In the category of sets and functions, a function f : A −→ B is an isomorphism if and only if it is bijective. • In the category of groups and group homomorphisms (or rings and ring homomorphisms), a homomorphism φ : G −→ H is an isomorphism if it has an inverse map φ−1 : H −→ G which is also a homomorphism. • In the category of vector spaces and linear transformations, a linear transformation is an isomorphism if and only if it is an invertible linear transformation. • In the category of topological spaces and continuous maps, a continuous map is an isomorphism if and only if it is a homeomorphism. Version: 2 Owner: djao Author(s): djao

285.13

natural transformation

Let A, B be categories and S, T : A → B functors. A natural transformation τ : S → T is a family of morphisms τ = {τ_A : S(A) → T(A)}, one morphism of B for each object A of A, such that for each morphism f : A → A′ in A the following diagram commutes:

S(A) --τ_A--> T(A)
  |S(f)         |T(f)
S(A′) --τ_{A′}--> T(A′)

that is, T(f) ∘ τ_A = τ_{A′} ∘ S(f).
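The commuting square can be checked concretely in a familiar setting. Below is a hypothetical sketch (not part of the original entry): in the category of sets, list reversal is a natural transformation from the list functor to itself, and naturality reads `map(f, reverse(xs)) == reverse(map(f, xs))`.

```python
# Naturality check for list reversal as a natural transformation
# from the list functor to itself (an illustrative sketch).

def fmap(f, xs):
    """Action of the list functor on a morphism f : A -> A'."""
    return [f(x) for x in xs]

def tau(xs):
    """Component of the transformation at any set: reverse the list."""
    return list(reversed(xs))

f = lambda n: n * n          # a morphism A -> A' in Set
xs = [1, 2, 3, 4]
# T(f) . tau_A == tau_A' . S(f): the naturality square commutes.
assert fmap(f, tau(xs)) == tau(fmap(f, xs))
```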

Version: 6 Owner: RevBobo Author(s): RevBobo

285.14

types of homomorphisms

Often in a category of algebraic structures, those structures are generated by certain elements and subject to certain relations. One often considers functions between structures which preserve those relations. These functions are typically called homomorphisms. An example is the category of groups. Suppose that f : A → B is a function between two groups. We say that f is a group homomorphism if:

(a) the binary operation is preserved: f(a1 · a2) = f(a1) · f(a2) for all a1, a2 ∈ A;
(b) the identity element is preserved: f(e_A) = e_B;
(c) inverses of elements are preserved: f(a⁻¹) = [f(a)]⁻¹ for all a ∈ A.

One can define similar natural concepts of homomorphisms for other algebraic structures, giving us ring homomorphisms, module homomorphisms, and a host of others. We give special names to homomorphisms whose underlying functions have interesting properties. If a homomorphism is an injective function (i.e. one-to-one), then we say that it is a monomorphism. These are typically monic in their category. If a homomorphism is a surjective function (i.e. onto), then we say that it is an epimorphism. These are typically epic in their category. If a homomorphism is a bijective function (i.e. both one-to-one and onto), then we say that it is an isomorphism. If the domain of a homomorphism is the same as its codomain (e.g. a homomorphism f : A → A), then we say that it is an endomorphism. We often denote the collection of endomorphisms on A as End(A). If a homomorphism is both an endomorphism and an isomorphism, then we say that it is an automorphism. We often denote the collection of automorphisms on A as Aut(A). Version: 4 Owner: antizeus Author(s): antizeus
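The three defining properties (a)–(c) can be verified numerically for a specific map. The following is a hypothetical sketch (the map and names are ours, not the entry's): f(n) = 2ⁿ is a group homomorphism from (Z, +) to the multiplicative group of positive rationals.

```python
from fractions import Fraction

# Check the three group-homomorphism properties for f(n) = 2^n,
# a map from (Z, +) to (Q_{>0}, *), on a finite sample of elements.

def f(n):
    return Fraction(2) ** n

sample = range(-8, 9)
for a in sample:
    for b in sample:
        assert f(a + b) == f(a) * f(b)    # (a) binary operation preserved
assert f(0) == 1                          # (b) identity element preserved
for a in sample:
    assert f(-a) == 1 / f(a)              # (c) inverses preserved
# f is injective, hence a monomorphism; it is not surjective onto Q_{>0}.
```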

285.15

zero object

An initial object in a category C is an object A in C such that, for every object X in C, there is exactly one morphism A −→ X. A terminal object in a category C is an object B in C such that, for every object X in C, there is exactly one morphism X −→ B. A zero object in a category C is an object 0 that is both an initial object and a terminal object.


All initial objects (respectively, terminal objects, and zero objects), if they exist, are isomorphic in C. Version: 2 Owner: djao Author(s): djao


Chapter 286 18A22 – Special properties of functors (faithful, full, etc.)
286.1 exact functor

A covariant functor F is said to be left exact if whenever

0 → A --α→ B --β→ C

is an exact sequence, then

0 → F A --F α→ F B --F β→ F C

is also an exact sequence. A covariant functor F is said to be right exact if whenever

A --α→ B --β→ C → 0

is an exact sequence, then

F A --F α→ F B --F β→ F C → 0

is also an exact sequence. A contravariant functor F is said to be left exact if whenever

A --α→ B --β→ C → 0

is an exact sequence, then

0 → F C --F β→ F B --F α→ F A

is also an exact sequence. A contravariant functor F is said to be right exact if whenever

0 → A --α→ B --β→ C

is an exact sequence, then

F C --F β→ F B --F α→ F A → 0

is also an exact sequence. A (covariant or contravariant) functor is said to be exact if it is both left exact and right exact. Version: 3 Owner: antizeus Author(s): antizeus


Chapter 287 18A25 – Functor categories, comma categories
287.1 Yoneda embedding

If C is a category, write Ĉ for the category of contravariant functors from C to Sets, the category of sets. The morphisms in Ĉ are natural transformations of functors. (To avoid set-theoretical concerns, one can take a universe U and take all categories to be U-small.) For any object X of C, there is the functor h_X = Hom(−, X). Then X ↦ h_X is a covariant functor C → Ĉ, which embeds C faithfully as a full subcategory of Ĉ. Version: 4 Owner: nerdy2 Author(s): nerdy2


Chapter 288 18A30 – Limits and colimits (products, sums, directed limits, pushouts, fiber products, equalizers, kernels, ends and coends, etc.)
288.1 categorical direct product

Let {C_i}_{i∈I} be a set of objects in a category C. A direct product of the collection {C_i}_{i∈I} is an object ∏_{i∈I} C_i of C, together with morphisms π_i : ∏_{j∈I} C_j −→ C_i for each i ∈ I, such that: for every object A in C, and any collection of morphisms f_i : A −→ C_i for every i ∈ I, there exists a unique morphism f : A −→ ∏_{i∈I} C_i making the diagram

A --f→ ∏_{j∈I} C_j --π_i→ C_i

commute for all i ∈ I; that is, π_i ∘ f = f_i.
Version: 4 Owner: djao Author(s): djao

288.2

categorical direct sum

Let {C_i}_{i∈I} be a set of objects in a category C. A direct sum of the collection {C_i}_{i∈I} is an object ⊕_{i∈I} C_i of C, together with morphisms ι_i : C_i −→ ⊕_{j∈I} C_j for each i ∈ I, such that:


For every object A in C, and any collection of morphisms f_i : C_i −→ A for every i ∈ I, there exists a unique morphism f : ⊕_{i∈I} C_i −→ A making the diagram

C_i --ι_i→ ⊕_{j∈I} C_j --f→ A

commute for all i ∈ I; that is, f ∘ ι_i = f_i.
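In the category of sets the direct sum (coproduct) is the disjoint union. Below is a hypothetical sketch (the sets and maps are ours): elements are tagged by their index so overlapping sets stay disjoint, and the induced morphism f is case analysis on the tag, satisfying f ∘ ι_i = f_i.

```python
# The categorical direct sum (coproduct) in Set: disjoint union with
# tagging injections, and the induced copairing map.

C1 = {1, 2}
C2 = {2, 3}
S = {(1, c) for c in C1} | {(2, c) for c in C2}   # disjoint union

iota1 = lambda c: (1, c)     # injections tag each element by its index
iota2 = lambda c: (2, c)

f1 = lambda c: c * 10        # some morphisms f_i : C_i -> A (A = integers)
f2 = lambda c: c + 100

def f(tagged):
    """The induced morphism out of the coproduct: case analysis on the tag."""
    i, c = tagged
    return f1(c) if i == 1 else f2(c)

assert all(f(iota1(c)) == f1(c) for c in C1)
assert all(f(iota2(c)) == f2(c) for c in C2)
```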

Version: 4 Owner: djao Author(s): djao

288.3

kernel

Let f : X → Y be a function, and suppose Y has some sort of zero, neutral, or null element that we denote e (examples are groups, vector spaces, modules, etc.). The kernel of f is the set

ker f = {x ∈ X : f(x) = e},

that is, the set of elements of X whose image is e. This set is also denoted f⁻¹(e) (this does not mean f has an inverse function; it is just notation), which is read as "the kernel is the preimage of the neutral element". Let's see an example. If X = Z and Y = Z_6, let f be the function that sends each integer n to its residue class modulo 6. So f(4) = 4, f(20) = 2, f(−5) = 1. The kernel of f consists precisely of the multiples of 6 (since they have residue 0, we have f(6k) = 0). This is also an example of the kernel of a group homomorphism, and since the sets are also rings, the function f is also a homomorphism between rings and the kernel is also the kernel of a ring homomorphism. Usually we are interested in sets with a certain algebraic structure. In particular, the following theorem holds for maps between pairs of vector spaces, groups, rings and fields (and some other algebraic structures): a map f : X → Y is injective if and only if ker f = {e_X}, where e_X is the neutral element of X. Version: 4 Owner: drini Author(s): drini
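The mod-6 example above is easy to compute directly. This hypothetical sketch (the sample range is ours) recovers the values stated in the text and exhibits the kernel as the multiples of 6.

```python
# The kernel of f(n) = n mod 6, viewed as a map Z -> Z_6, computed
# on a finite sample of integers.

def f(n):
    return n % 6

assert (f(4), f(20), f(-5)) == (4, 2, 1)            # the values from the text

sample = range(-36, 37)
kernel = {n for n in sample if f(n) == 0}
assert kernel == {n for n in sample if n % 6 == 0}  # exactly the multiples of 6
# f is not injective, matching the criterion: its kernel is not just {0}.
assert kernel != {0}
```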


Chapter 289 18A40 – Adjoint functors (universal constructions, reflective subcategories, Kan extensions, etc.)
289.1 adjoint functor

Let C, D be categories and T : C → D, S : D → C be covariant functors. T is said to be a left adjoint functor to S (equivalently, S is a right adjoint functor to T) if there exists ν = ν_{C,D} such that

ν : Hom_D(T(C), D) ≅ Hom_C(C, S(D))

is a natural bijection of hom-sets for all objects C of C and D of D. An adjoint to any functor is unique up to natural isomorphism. Examples:

1. Let U : Top → Set be the forgetful functor (i.e. U takes topological spaces to their underlying sets, and continuous maps to set functions). Then U is right adjoint to the functor F : Set → Top which gives each set the discrete topology.
2. If U : Grp → Set is again the forgetful functor, this time on the category of groups, the functor F : Set → Grp which takes a set A to the free group generated by A is left adjoint to U.
3. If U_N : R-mod → R-mod is the functor M ↦ N ⊗ M for an R-module N, then U_N is the left adjoint to the functor F_N : R-mod → R-mod given by L ↦ Hom_R(N, L).
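The naturality of ν, implicit in the phrase "natural bijection", can be written out explicitly. The following LaTeX sketch (our notation for the induced maps, not taken from the entry) records the two commuting squares, one for each variable: for f : C′ → C in C and g : D → D′ in D,

```latex
% Naturality of the adjunction bijection \nu in each variable.
\[
\begin{array}{ccc}
\operatorname{Hom}_{\mathcal{D}}(T(C), D) & \xrightarrow{\ \nu_{C,D}\ } &
\operatorname{Hom}_{\mathcal{C}}(C, S(D)) \\[2pt]
\big\downarrow{\scriptstyle (-)\circ T(f)} & &
\big\downarrow{\scriptstyle (-)\circ f} \\[2pt]
\operatorname{Hom}_{\mathcal{D}}(T(C'), D) & \xrightarrow{\ \nu_{C',D}\ } &
\operatorname{Hom}_{\mathcal{C}}(C', S(D))
\end{array}
\qquad
\begin{array}{ccc}
\operatorname{Hom}_{\mathcal{D}}(T(C), D) & \xrightarrow{\ \nu_{C,D}\ } &
\operatorname{Hom}_{\mathcal{C}}(C, S(D)) \\[2pt]
\big\downarrow{\scriptstyle g\circ(-)} & &
\big\downarrow{\scriptstyle S(g)\circ(-)} \\[2pt]
\operatorname{Hom}_{\mathcal{D}}(T(C), D') & \xrightarrow{\ \nu_{C,D'}\ } &
\operatorname{Hom}_{\mathcal{C}}(C, S(D'))
\end{array}
\]
```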

Version: 8 Owner: bwebste Author(s): bwebste, RevBobo

289.2

equivalence of categories

Let C and D be two categories with functors F : C → D and G : D → C.

Definition 17. The functors F and G are an equivalence of categories if there are natural isomorphisms FG ≅ id_D and GF ≅ id_C.

Note, F is left adjoint to G, and G is right adjoint to F, as

Hom_D(F(c), d) --G→ Hom_C(GF(c), G(d)) ≅ Hom_C(c, G(d)).

And F is right adjoint to G, and G is left adjoint to F, as

Hom_C(G(d), c) --F→ Hom_D(FG(d), F(c)) ≅ Hom_D(d, F(c)).

In practical terms, two categories are equivalent if there is a fully faithful functor F : C → D such that every object d ∈ D is isomorphic to an object F(c), for some c ∈ C. Version: 2 Owner: mhale Author(s): mhale


Chapter 290 18B40 – Groupoids, semigroupoids, semigroups, groups (viewed as categories)
290.1 groupoid (category theoretic)

A groupoid, also known as a virtual group, is a small category where every morphism is invertible. There is also a group-theoretic concept with the same name. Version: 6 Owner: akrowne Author(s): akrowne


Chapter 291 18E10 – Exact categories, abelian categories
291.1 abelian category

An abelian category is a category A satisfying the following axioms. Because the later axioms rely on terms whose definitions involve the earlier axioms, we will intersperse the statements of the axioms with such auxiliary definitions as needed.

Axiom 1. For any two objects A, B in A, the set of morphisms Hom(A, B) is an abelian group. The identity element in the group Hom(·, ·) will be denoted by 0, and the group operation by +.

Axiom 2. Composition of morphisms distributes over addition in Hom(·, ·). That is, given morphisms f : A −→ B, g1, g2 : B −→ C, and h : C −→ D, we have (g1 + g2)f = g1 f + g2 f and h(g1 + g2) = h g1 + h g2.

Axiom 3. A has a zero object.

Axiom 4. For any two objects A, B in A, the categorical direct product A × B exists in A.

Given a morphism f : A −→ B in A, a kernel of f is a morphism i : X −→ A such that:

• f i = 0.
• For any other morphism j : X′ −→ A such that f j = 0, there exists a unique morphism j′ : X′ −→ X such that i j′ = j.

Likewise, a cokernel of f is a morphism p : B −→ Y such that:

• p f = 0.
• For any other morphism j : B −→ Y′ such that j f = 0, there exists a unique morphism j′ : Y −→ Y′ such that j′ p = j.

Axiom 5. Every morphism in A has a kernel and a cokernel.

The kernel and cokernel of a morphism f in A will be denoted ker(f) and cok(f), respectively. A morphism f : A −→ B in A is called a monomorphism if, for every morphism g : X −→ A such that f g = 0, we have g = 0. Similarly, the morphism f is called an epimorphism if, for every morphism h : B −→ Y such that h f = 0, we have h = 0.

Axiom 6. ker(cok(f)) = f for every monomorphism f in A.

Axiom 7. cok(ker(f)) = f for every epimorphism f in A.

Version: 6 Owner: djao Author(s): djao

291.2

exact sequence

Let A be an abelian category. We begin with a preliminary definition.

Definition 1. For any morphism f : A −→ B in A, let m : X −→ B be the morphism equal to ker(cok(f)). Then the object X is called the image of f, and denoted Im(f). The morphism m is called the image morphism of f, and denoted i(f).

Note that Im(f) is not the same as i(f): the former is an object of A, while the latter is a morphism of A. We note that f factors through i(f), as

A --e→ Im(f) --i(f)→ B,

that is, f = i(f) ∘ e. The proof is as follows: by definition of cokernel, cok(f) f = 0; therefore by definition of kernel, the morphism f factors through ker(cok(f)) = i(f), and this factor is the morphism e above. Furthermore m is a monomorphism and e is an epimorphism, although we do not prove these facts.

Definition 2. A sequence

⋯ → A --f→ B --g→ C → ⋯

of morphisms in A is exact at B if ker(g) = i(f). Version: 3 Owner: djao Author(s): djao

291.3

derived category

Let A be an abelian category, and let K(A) be the category of chain complexes in A, with morphisms the chain homotopy classes of maps. Call a morphism of chain complexes a quasi-isomorphism if it induces an isomorphism on the homology groups of the complexes. For example, any chain homotopy equivalence is a quasi-isomorphism, but not conversely. Now let the derived category D(A) be the category obtained from K(A) by adding a formal inverse to every quasi-isomorphism (technically this is called a localization of the category). Derived categories seem somewhat obscure, but in fact many mathematicians believe they are the appropriate place to do homological algebra. One of their great advantages is that the important functors of homological algebra which are left or right exact (Hom, N ⊗_k −, where N is a fixed k-module, the global section functor Γ, etc.) become exact on the level of derived functors (with an appropriately modified definition of exact). See Methods of Homological Algebra, by Gelfand and Manin, for more details. Version: 2 Owner: bwebste Author(s): bwebste

291.4

enough injectives

An abelian category is said to have enough injectives if for every object X there is a monomorphism X → I, where I is an injective object. Version: 2 Owner: bwebste Author(s): bwebste

Chapter 292 18F20 – Presheaves and sheaves
292.1
292.1.1

locally ringed space
Definitions

A locally ringed space is a topological space X together with a sheaf of rings O_X with the property that, for every point p ∈ X, the stalk (O_X)_p is a local ring [1]. A morphism of locally ringed spaces from (X, O_X) to (Y, O_Y) is a continuous map f : X −→ Y together with a morphism of sheaves φ : O_Y −→ O_X with respect to f such that, for every point p ∈ X, the induced ring homomorphism on stalks φ_p : (O_Y)_{f(p)} −→ (O_X)_p is a local homomorphism. That is, φ_p(y) ∈ m_p for every y ∈ m_{f(p)}, where m_p (respectively, m_{f(p)}) is the maximal ideal of the ring (O_X)_p (respectively, (O_Y)_{f(p)}).

292.1.2

Applications

Locally ringed spaces are encountered in many natural contexts. Basically, every sheaf on the topological space X consisting of continuous functions with values in some field is a locally ringed space. Indeed, any such function which is not zero at a point p ∈ X is nonzero and thus invertible in some neighborhood of p, which implies that the only maximal ideal of the stalk at p is the set of germs of functions which vanish at p. The utility of this definition lies in the fact that one can then form constructions in familiar instances of locally ringed spaces which readily generalize in ways that would not necessarily be obvious without this framework. For example, given a manifold X and its locally ringed space D_X of real-valued differentiable functions, one can show that the space of all tangent vectors to X at p is naturally isomorphic to the real vector space (m_p/m_p²)*, where the * indicates the dual vector space. We then see that, in general, for any locally ringed space X, the space of tangent vectors at p should be defined as the k-vector space (m_p/m_p²)*, where k is the residue field (O_X)_p/m_p and * denotes the dual with respect to k as before. It turns out that this definition is the correct definition even in esoteric contexts like algebraic geometry over finite fields, which at first sight lack the differential structure needed for constructions such as tangent vectors. Another useful application of locally ringed spaces is in the construction of schemes. The forgetful functor assigning to each locally ringed space (X, O_X) the ring O_X(X) is adjoint to the "prime spectrum" functor taking each ring R to its prime spectrum Spec(R), and this correspondence is essentially why the category of locally ringed spaces is the proper building block to use in the formulation of the notion of scheme.

[1] All rings mentioned in this article are required to be commutative.

Version: 9 Owner: djao Author(s): djao

292.2

presheaf

For a topological space X a presheaf F with values in a category C associates to each open set U ⊂ X, an object F (U) of C and to each inclusion U ⊂ V a morphism of C, ρU V : F (V ) → F (U), the restriction morphism. It is required that ρU U = 1F (U ) and ρU W = ρU V ◦ ρV W for any U ⊂ V ⊂ W . A presheaf with values in the category of sets (or abelian groups) is called a presheaf of sets (or abelian groups). If no target category is specified, either the category of sets or abelian groups is most likely understood. A more categorical way to state it is as follows. For X form the category Top(X) whose objects are open sets of X and whose morphisms are the inclusions. Then a presheaf is merely a contravariant functor Top(X) → C. Version: 2 Owner: nerdy2 Author(s): nerdy2

292.3
292.3.1

sheaf
Presheaves

Let X be a topological space and let A be a category. A presheaf on X with values in A is a contravariant functor F from the category of open sets in X and inclusion morphisms to the category A. As this definition may be less than helpful to many readers, we offer the following equivalent (but longer) definition. A presheaf F on X consists of the following data:

1. An object F(U) in A, for each open set U ⊂ X.
2. A morphism res_{V,U} : F(V) −→ F(U) for each pair of open sets U ⊂ V in X (called the restriction morphism), such that:
(a) For every open set U ⊂ X, the morphism res_{U,U} is the identity morphism.
(b) For any open sets U ⊂ V ⊂ W in X, the diagram

F(W) --res_{W,V}→ F(V) --res_{V,U}→ F(U)

commutes with res_{W,U}; that is, res_{W,U} = res_{V,U} ∘ res_{W,V}.

If the object F(U) of A is a set, its elements are called sections of U.

292.3.2

Morphisms of Presheaves

Let f : X −→ Y be a continuous map of topological spaces. Suppose F_X is a presheaf on X, and G_Y is a presheaf on Y (with F_X and G_Y both having values in A). We define a morphism of presheaves φ from G_Y to F_X, relative to f, to be a collection of morphisms φ_U : G_Y(U) −→ F_X(f⁻¹(U)) in A, one for every open set U ⊂ Y, such that the diagram

G_Y(V) --φ_V--> F_X(f⁻¹(V))
  |res_{V,U}        |res_{f⁻¹(V),f⁻¹(U)}
G_Y(U) --φ_U--> F_X(f⁻¹(U))

commutes, for each pair of open sets U ⊂ V in Y. In the special case that f is the identity map id : X −→ X, we omit mention of the map f, and speak of φ as simply a morphism of presheaves on X. Form the category whose objects are presheaves on X and whose morphisms are morphisms of presheaves on X. Then an isomorphism of presheaves φ on X is a morphism of presheaves on X which is an isomorphism in this category; that is, there exists a morphism φ⁻¹ whose composition with φ both ways is the identity morphism. More generally, if f : X −→ Y is any homeomorphism of topological spaces, a morphism of presheaves φ relative to f is an isomorphism if it admits a two-sided inverse morphism of presheaves φ⁻¹ relative to f⁻¹.


292.3.3

Sheaves

We now assume that the category A is a concrete category. A sheaf is a presheaf F on X, with values in A, such that for every open set U ⊂ X, and every open cover {U_i} of U, the following two conditions hold:

1. Any two elements f1, f2 ∈ F(U) which have identical restrictions to each U_i are equal. That is, if res_{U,U_i} f1 = res_{U,U_i} f2 for every i, then f1 = f2.
2. Any collection of elements f_i ∈ F(U_i) that have common restrictions can be realized as the collective restrictions of a single element of F(U). That is, if res_{U_i, U_i ∩ U_j} f_i = res_{U_j, U_i ∩ U_j} f_j for every i and j, then there exists an element f ∈ F(U) such that res_{U,U_i} f = f_i for all i.
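The two conditions can be exercised on a finite toy model. The sketch below is hypothetical (the space, cover, and sections are ours, not from the entry): it models the sheaf of arbitrary functions on a discrete space, where a section over U is a dict of values on the points of U.

```python
# A finite toy model of the sheaf of functions on a discrete space X,
# with sections represented as dicts {point: value}.  We check gluing
# (condition 2) and separation (condition 1) for one open cover.

cover = [{1, 2}, {2, 3}, {3, 4}]          # open cover of U = {1,2,3,4}

def restrict(s, V):
    """Restriction morphism res_{U,V}: forget values outside V."""
    return {p: s[p] for p in V}

def glue(sections):
    """Glue sections (one per cover element) if they agree on overlaps."""
    for (Ui, si) in sections:
        for (Uj, sj) in sections:
            inter = Ui & Uj
            if restrict(si, inter) != restrict(sj, inter):
                raise ValueError("sections do not agree on overlaps")
    glued = {}
    for (Ui, si) in sections:
        glued.update(si)
    return glued

# Compatible sections over the cover:
s1 = {1: 'a', 2: 'b'}
s2 = {2: 'b', 3: 'c'}
s3 = {3: 'c', 4: 'd'}
f = glue([({1, 2}, s1), ({2, 3}, s2), ({3, 4}, s3)])
assert restrict(f, {1, 2}) == s1          # condition 2: f restricts to the s_i
# Condition 1 (separation): a section is determined by its restrictions.
g = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
assert all(restrict(f, Ui) == restrict(g, Ui) for Ui in cover) and f == g
```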

292.3.4

Sheaves in abelian categories

If A is a concrete abelian category, then a presheaf F is a sheaf if and only if for every open subset U of X, the sequence

0 → F(U) --incl→ ∏_i F(U_i) --diff→ ∏_{i,j} F(U_i ∩ U_j)    (292.3.1)

is an exact sequence of morphisms in A for every open cover {U_i} of U in X. This diagram requires some explanation, because we owe the reader a definition of the morphisms incl and diff. We start with incl (short for “inclusion”). The restriction morphisms F(U) −→ F(U_i) induce a morphism

F(U) −→ ∏_i F(U_i)

to the categorical direct product ∏_i F(U_i), which we define to be incl. The map diff (called “difference”) is defined as follows. For each U_i, form the morphism

α_i : F(U_i) −→ ∏_j F(U_i ∩ U_j).

By the universal properties of the categorical direct product, there exists a unique morphism

α : ∏_i F(U_i) −→ ∏_{i,j} F(U_i ∩ U_j)

such that π_i α = α_i π_i for all i, where π_i is projection onto the ith factor. In a similar manner, form the morphism

β : ∏_j F(U_j) −→ ∏_{i,j} F(U_i ∩ U_j).

Then α and β are both elements of the set

Hom(∏_i F(U_i), ∏_{i,j} F(U_i ∩ U_j)),

which is an abelian group since A is an abelian category. Take the difference α − β in this group, and define this morphism to be diff. Note that exactness of the sequence (292.3.1) is an element-free condition, and therefore makes sense for any abelian category A, even if A is not concrete. Accordingly, for any abelian category A, we define a sheaf to be a presheaf F for which the sequence (292.3.1) is always exact.

292.3.5

Examples

It’s high time that we give some examples of sheaves and presheaves. We begin with some of the standard ones.

Example 9. If F is a presheaf on X, and U ⊂ X is an open subset, then one can define a presheaf F|_U on U by restricting the functor F to the subcategory of open sets of X contained in U and inclusion morphisms. In other words, for open subsets of U, define F|_U to be exactly what F was, and ignore open subsets of X that are not open subsets of U. The resulting presheaf is called, for obvious reasons, the restriction presheaf of F to U, or the restriction sheaf if F was a sheaf to begin with.

Example 10. For any topological space X, let c_X be the presheaf on X, with values in the category of rings, given by
• c_X(U) := the ring of continuous real-valued functions U −→ R,
• res_{V,U} f := the restriction of f to U, for every element f : V −→ R of c_X(V) and every open subset U of V.
Then c_X is actually a sheaf of rings, because continuous functions are uniquely specified by their values on an open cover. The sheaf c_X is called the sheaf of continuous real-valued functions on X.

Example 11. Let X be a smooth differentiable manifold. Let D_X be the presheaf on X, with values in the category of real vector spaces, defined by setting D_X(U) to be the space of smooth real-valued functions on U, for each open set U, and with the restriction morphism given by restriction of functions as before. Then D_X is a sheaf as well, called the sheaf of smooth real-valued functions on X. Much more surprising is that the construct D_X can actually be used to define the concept of smooth manifold! That is, one can define a smooth manifold to be a locally Euclidean n-dimensional second countable topological space X, together with a sheaf F, such that there exists an open cover {U_i} of X where: for every i, there exists a homeomorphism f_i : U_i −→ R^n and an isomorphism of sheaves φ_i : D_{R^n} −→ F|_{U_i} relative to f_i. The idea here is that not only does every smooth manifold X have a sheaf D_X of smooth functions, but specifying this sheaf of smooth functions is sufficient to fully describe the smooth manifold structure on X. While this phenomenon may seem little more than a toy curiosity for differential geometry, it arises in full force in the field of algebraic geometry, where the coordinate functions are often unwieldy and algebraic structures in many cases can only be satisfactorily described by way of sheaves and schemes.

Example 12. Similarly, for a complex analytic manifold X, one can form the sheaf H_X of holomorphic functions by setting H_X(U) equal to the complex vector space of C-valued holomorphic functions on U, with the restriction morphism being restriction of functions as before.

Example 13. The algebraic geometry analogue of the sheaf D_X of differential geometry is the prime spectrum Spec(R) of a commutative ring R. However, the construction of the sheaf Spec(R) is beyond the scope of this discussion and merits a separate article.

Example 14. For an example of a presheaf that is not a sheaf, consider the presheaf F on X, with values in the category of real vector spaces, whose sections on U are locally constant real-valued functions on U modulo constant functions on U. Then every section f ∈ F(U) is locally zero in some fine enough open cover {U_i} (it is enough to take a cover where each U_i is connected), whereas f may be nonzero if U is not connected.

We conclude with some interesting examples of morphisms of sheaves, chosen to illustrate the unifying power of the language of schemes across various diverse branches of mathematics.

1. For any continuous function f : X −→ Y, the map φ_U : c_Y(U) −→ c_X(f⁻¹(U)) given by φ_U(g) := gf defines a morphism of sheaves from c_Y to c_X with respect to f.

2. For any continuous function f : X −→ Y of smooth differentiable manifolds, the map g ∈ D_Y(U) ↦ φ_U(g) ∈ D_X(f⁻¹(U)) given by φ_U(g) := gf has this property if and only if f is a smooth function.

3. For any continuous function f : X −→ Y of complex analytic manifolds, the map g ∈ H_Y(U) ↦ φ_U(g) ∈ H_X(f⁻¹(U)) given by φ_U(g) := gf has this property if and only if f is a holomorphic function.

4. For any Zariski continuous function f : X −→ Y of algebraic varieties over a field k, the map g ∈ O_Y(U) ↦ φ_U(g) ∈ O_X(f⁻¹(U)) given by φ_U(g) := gf has this property if and only if f is a regular function. Here O_X denotes the sheaf of k-valued regular functions on the algebraic variety X.

REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer– Verlag, 1999 (LNM 1358). 2. Charles Weibel, An Introduction to Homological Algebra, Cambridge University Press, 1994.

Version: 9 Owner: djao Author(s): djao

292.4

sheafification

Let F be a presheaf over a topological space X with values in a category A for which sheaves are defined. The sheafification of F, if it exists, is a sheaf F′ over X together with a morphism θ : F −→ F′ satisfying the following universal property: for any sheaf G over X and any morphism of presheaves φ : F −→ G over X, there exists a unique morphism of sheaves ψ : F′ −→ G such that the diagram

F --θ→ F′ --ψ→ G

commutes; that is, ψ ∘ θ = φ. In light of the universal property, the sheafification of F is uniquely defined up to canonical isomorphism whenever it exists. In the case where A is a concrete category (one consisting of sets and set functions), the sheafification of any presheaf F can be constructed by taking F′(U) to be the set of all functions s : U −→ ⊔_{p∈U} F_p such that

1. s(p) ∈ F_p for all p ∈ U;
2. for all p ∈ U, there is a neighborhood V ⊂ U of p and a section t ∈ F(V) such that, for all q ∈ V, the induced element t_q ∈ F_q equals s(q),

for all open sets U ⊂ X. Here F_p denotes the stalk of the presheaf F at the point p. The following quote, taken from [1], is perhaps the best explanation of sheafification to be found anywhere: F′ is ”the best possible sheaf you can get from F”. It is easy to imagine how to get it: first identify things which have the same restrictions, and then add in all the things which can be patched together.

REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer– Verlag, 1999 (LNM 1358)

Version: 4 Owner: djao Author(s): djao

292.5

stalk

Let F be a presheaf over a topological space X with values in an abelian category A, and suppose direct limits exist in A. For any point p ∈ X, the stalk F_p of F at p is defined to be the object in A which is the direct limit of the objects F(U) over the directed set of all open sets U ⊂ X containing p, with respect to the restriction morphisms of F. In other words,

F_p := lim_{−→, U∋p} F(U).

If A is a category consisting of sets, the stalk F_p can be viewed as the set of all germs of sections of F at the point p. That is, the set F_p consists of all the equivalence classes of ordered pairs (U, s), where p ∈ U and s ∈ F(U), under the equivalence relation (U, s) ∼ (V, t) if there exists a neighborhood W ⊂ U ∩ V of p such that res_{U,W} s = res_{V,W} t. By the universal properties of the direct limit, a morphism φ : F −→ G of presheaves over X induces a morphism φ_p : F_p −→ G_p on each stalk F_p of F. Stalks are most useful in the context of sheaves, since they encapsulate all of the local data of the sheaf at the point p (recall that sheaves are basically defined as presheaves which have the property of being completely characterized by their local behavior). Indeed, in many of the standard examples of sheaves that take values in rings (such as the sheaf D_X of smooth functions, or the sheaf O_X of regular functions), the ring F_p is a local ring, and much of geometry is devoted to the study of sheaves whose stalks are local rings (so-called ”locally ringed spaces”). We mention here a few illustrations of how stalks accurately reflect the local behavior of a sheaf; all of these are drawn from [1].

• A morphism of sheaves φ : F −→ G over X is an isomorphism if and only if the induced morphism φ_p is an isomorphism on each stalk.
• A sequence F −→ G −→ H of morphisms of sheaves over X is an exact sequence at G if and only if the induced sequence F_p −→ G_p −→ H_p is exact at each stalk G_p.
• The sheafification F′ of a presheaf F has stalk equal to F_p at every point p.


REFERENCES
1. Robin Hartshorne, Algebraic Geometry, Springer–Verlag New York Inc., 1977 (GTM 52).

Version: 4 Owner: djao Author(s): djao


Chapter 293 18F30 – Grothendieck groups
293.1 Grothendieck group

Let S be an abelian semigroup. The Grothendieck group of S is K(S) = S × S/∼, where ∼ is the equivalence relation: (s, t) ∼ (u, v) if there exists r ∈ S such that s + v + r = t + u + r. This is indeed an abelian group with zero element (s, s) (for any s ∈ S) and inverse −(s, t) = (t, s). The Grothendieck group construction is a functor from the category of abelian semigroups to the category of abelian groups. A morphism f : S → T induces a morphism K(f) : K(S) → K(T). Example 15. K(N) = Z. Example 16. Let G be an abelian group; then K(G) ≅ G via (g, h) ↔ g − h. Let C be a symmetric monoidal category. Its Grothendieck group is K([C]), i.e. the Grothendieck group of the isomorphism classes of objects of C. Version: 2 Owner: mhale Author(s): mhale
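Example 15 can be made concrete. The sketch below is hypothetical (the representation is ours): since N is cancellative, (s, t) ∼ (u, v) iff s + v = t + u, each class has a canonical representative of the form (d, 0) or (0, d), and the class of (s, t) behaves like the integer s − t, recovering K(N) = Z.

```python
# The Grothendieck group of the abelian semigroup (N, +), with classes
# represented canonically; (s, t) plays the role of the integer s - t.

def canonical(s, t):
    """Canonical representative of the class of (s, t)."""
    d = s - t
    return (d, 0) if d >= 0 else (0, -d)

def add(p, q):
    """Addition in K(N): componentwise, then reduce to canonical form."""
    return canonical(p[0] + q[0], p[1] + q[1])

def neg(p):
    """Inverse: -(s, t) = (t, s)."""
    return canonical(p[1], p[0])

x = canonical(2, 5)                  # represents the integer -3
y = canonical(7, 0)                  # represents the integer 7
assert add(x, y) == (4, 0)           # -3 + 7 = 4
assert add(x, neg(x)) == (0, 0)      # the zero element class (s, s)
```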


Chapter 294 18G10 – Resolutions; derived functors
294.1 derived functor

There are two objects called derived functors. First, there are classical derived functors. Let A, B be abelian categories, and F : A → B be a covariant left exact functor. Note that a completely analogous construction can be done for right exact and contravariant functors, but it is traditional to only describe one case, as doing the other mostly consists of reversing arrows. Given an object A ∈ A, we can construct an injective resolution

0 → A → I¹ → I² → ⋯,

which is unique up to chain homotopy equivalence. Then we apply the functor F to the injectives in the resolution to get a complex

F(A)• : 0 → F(I¹) → F(I²) → ⋯

(notice that the term involving A has been left out; this is not an accident, in fact it is crucial). This complex also is independent of the choice of the I's (up to chain homotopy equivalence). Now, we define the classical right derived functors R^i F(A) to be the cohomology groups H^i(F(A)•). These only depend on A. Important properties of the classical derived functors are these: if the sequence

0 → A′ → A → A″ → 0

is exact, then there is a long exact sequence

0 → F(A′) → F(A) → F(A″) → R¹F(A′) → ⋯

which is natural (a morphism of short exact sequences induces a morphism of long exact sequences). This, along with a couple of other properties, determines the derived functors completely, giving an axiomatic definition, though the construction used above is usually necessary to show existence. From the definition, one can see immediately that the following are equivalent:

1. F is exact.
2. R^n F(A) = 0 for n ≥ 1 and all A ∈ A.
3. R¹F(A) = 0 for all A ∈ A.

However, R¹F(A) = 0 for a particular A does not imply that R^n F(A) = 0 for all n ≥ 1.

Important examples are Ext^n, the derived functor of Hom, Tor_n, the derived functor of the tensor product, and sheaf cohomology, the derived functor of the global section functor on sheaves. (Coming soon: the derived categories definition.) Version: 4 Owner: bwebste Author(s): bwebste


Chapter 295 18G15 – Ext and Tor, generalizations, K¨nneth formula u
295.1 Ext

For a ring R, and R-module A, we have a covariant functor HomA − R. Extn (A, −) are R defined to be the right derived functors of HomA − R (Extn (A, −) = Rn HomA − R). R Ext gets its name from the following fact: There is a natural bijection between elements of Ext1 (A, B) and extensions of B by A up to isomorphism of short exact sequences, where an R extension of B by A is an exact sequence 0→B→C→A→0 . For example, Ext1 (Z/nZ, Z) ∼ Z/nZ = Z

, with 0 corresponding to the trivial extension 0 → Z → Z ⊕ Z/nZ → Z/nZ → 0, and m ≠ 0 corresponding to the extension

0 → Z → Z → Z/nZ → 0

in which the first map is multiplication by n and the second sends 1 to m. Version: 3 Owner: bwebste Author(s): bwebste


Chapter 296 18G30 – Simplicial sets, simplicial objects (in a category)
296.1 nerve

Definition 18. The nerve of a category C is the simplicial set Hom(i(−), C), where i : ∆ → Cat is the fully faithful functor that takes each ordered set [n] in the simplicial category ∆ to the pre-order n + 1. The nerve is a functor Cat → Set^{∆^op}.

Version: 1 Owner: mhale Author(s): mhale

296.2 simplicial category

The simplicial category ∆ is defined as the small category whose objects are the totally ordered finite sets

[n] = {0 < 1 < 2 < · · · < n}, n ≥ 0, (296.2.1)

and whose morphisms are monotonic non-decreasing (order-preserving) maps. It is generated by two families of morphisms:
δ_i^n : [n − 1] → [n], the injection missing i ∈ [n],
σ_i^n : [n + 1] → [n], the surjection such that σ_i^n(i) = σ_i^n(i + 1) = i ∈ [n].

The δ_i^n morphisms are called face maps (Definition 19), and the σ_i^n morphisms are called degeneracy maps (Definition 20). They satisfy the following relations:

δ_j^{n+1} δ_i^n = δ_i^{n+1} δ_{j−1}^n for i < j, (296.2.2)
σ_j^{n−1} σ_i^n = σ_i^{n−1} σ_{j+1}^n for i ≤ j, (296.2.3)
σ_j^n δ_i^{n+1} = δ_i^n σ_{j−1}^{n−1} if i < j; id_n if i = j or i = j + 1; δ_{i−1}^n σ_j^{n−1} if i > j + 1. (296.2.4)

All morphisms [n] → [0] factor through σ_0^0, so [0] is terminal.

There is a bifunctor + : ∆ × ∆ → ∆ defined by

[m] + [n] = [m + n + 1], (296.2.5)
(f + g)(i) = f(i) if 0 ≤ i ≤ m, and (f + g)(i) = g(i − m − 1) + m + 1 if m < i ≤ m + n + 1, (296.2.6)

where f : [m] → [m′] and g : [n] → [n′]. Sometimes, the simplicial category is defined to include the empty set [−1] = ∅, which provides an initial object for the category. This makes ∆ a strict monoidal category, as ∅ is a unit for the bifunctor: ∅ + [n] = [n] = [n] + ∅ and id_∅ + f = f = f + id_∅. Further, ∆ is then the free monoidal category on a monoid object (the monoid object being [0], with product σ_0^0 : [0] + [0] → [0]). There is a fully faithful functor from ∆ to Top, which sends each object [n] to an oriented n-simplex. The face maps then embed an (n − 1)-simplex in an n-simplex, and the degeneracy maps collapse an (n + 1)-simplex to an n-simplex. The bifunctor forms a simplex from the disjoint union of two simplices by joining their vertices together in a way compatible with their orientations. There is also a fully faithful functor from ∆ to Cat, which sends each object [n] to the pre-order n + 1. The pre-order n is the category consisting of n partially-ordered objects, with one morphism a → b iff a ≤ b. Version: 4 Owner: mhale Author(s): mhale
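The generator-and-relation identities above can be checked numerically. The following Python sketch (my own illustration, not part of the original entry) represents a monotone map [m] → [n] by the tuple of its values and verifies the three families of relations for a small n.

```python
# Morphisms of the simplicial category as value tuples:
# a monotone map f : [m] -> [n] is stored as (f(0), ..., f(m)).

def delta(n, i):
    """Face map delta_i^n : [n-1] -> [n], the injection missing i."""
    return tuple(k if k < i else k + 1 for k in range(n))

def sigma(n, i):
    """Degeneracy map sigma_i^n : [n+1] -> [n], hitting i twice."""
    return tuple(k if k <= i else k - 1 for k in range(n + 2))

def compose(g, f):
    """(g o f)(x) = g(f(x)) for maps given as value tuples."""
    return tuple(g[x] for x in f)

n = 4
identity = tuple(range(n + 1))  # id on [n]

# delta_j delta_i = delta_i delta_{j-1} for i < j
faces_ok = all(
    compose(delta(n + 1, j), delta(n, i)) == compose(delta(n + 1, i), delta(n, j - 1))
    for j in range(n + 2) for i in range(j))

# sigma_j sigma_i = sigma_i sigma_{j+1} for i <= j
degen_ok = all(
    compose(sigma(n - 1, j), sigma(n, i)) == compose(sigma(n - 1, i), sigma(n, j + 1))
    for j in range(n) for i in range(j + 1))

# sigma_j delta_i is the identity when i = j or i = j + 1
mixed_ok = all(
    compose(sigma(n, j), delta(n + 1, i)) == identity
    for j in range(n + 1) for i in (j, j + 1))
```

Running the three checks for one value of n is of course not a proof, but it is a useful sanity check on the index conventions, which are easy to get wrong.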

296.3 simplicial object

Definition 21. A simplicial object in a category C is a contravariant functor from the simplicial category ∆ to C. Such a functor X is uniquely specified by the morphisms X(δ_i^n) : X([n]) → X([n − 1]) and


X(σ_i^n) : X([n]) → X([n + 1]), which satisfy

X(δ_i^{n−1}) X(δ_j^n) = X(δ_{j−1}^{n−1}) X(δ_i^n) for i < j, (296.3.1)
X(σ_i^{n+1}) X(σ_j^n) = X(σ_{j+1}^{n+1}) X(σ_i^n) for i ≤ j, (296.3.2)
X(δ_i^{n+1}) X(σ_j^n) = X(σ_{j−1}^{n−1}) X(δ_i^n) if i < j; id_n if i = j or i = j + 1; X(σ_j^{n−1}) X(δ_{i−1}^n) if i > j + 1. (296.3.3)

In particular, a simplicial set (Definition 22) is a simplicial object in Set. Equivalently, one could say that a simplicial set is a presheaf on ∆. The object X([n]) of a simplicial set is a set of n-simplices, and is called the n-skeleton.

Version: 2 Owner: mhale Author(s): mhale


Chapter 297 18G35 – Chain complexes
297.1 5-lemma

If A_i, B_i for i = 1, . . . , 5 are objects in an abelian category (for example, modules over a ring R) such that there is a commutative diagram with exact rows

A1 → A2 → A3 → A4 → A5
↓γ1  ↓γ2  ↓γ3  ↓γ4  ↓γ5
B1 → B2 → B3 → B4 → B5

where γ1 is surjective, γ5 is injective, and γ2 and γ4 are isomorphisms, then γ3 is an isomorphism as well.

Version: 2 Owner: bwebste Author(s): bwebste


297.2 9-lemma

If A_i, B_i, C_i, for i = 1, 2, 3 are objects of an abelian category such that there is a commutative diagram

0 → A1 → A2 → A3 → 0
0 → B1 → B2 → B3 → 0
0 → C1 → C2 → C3 → 0

(with vertical maps A_i → B_i → C_i, each column extended by 0 at top and bottom), in which the columns and the bottom two rows are exact, then the top row is exact as well.

Version: 2 Owner: bwebste Author(s): bwebste

297.3 Snake lemma

There are two versions of the snake lemma.

(1) Given a commutative diagram with exact rows

0 → A1 → B1 → C1 → 0
    ↓α   ↓β   ↓γ
0 → A2 → B2 → C2 → 0

there is an exact sequence

0 → ker α → ker β → ker γ → coker α → coker β → coker γ → 0,

where ker denotes the kernel of a map and coker its cokernel.

(2) Applying this result inductively to a short exact sequence of chain complexes, we obtain the following: let A, B, C be chain complexes, and let 0 → A → B → C → 0 be a short exact sequence. Then there is a long exact sequence of homology groups

· · · → Hn(A) → Hn(B) → Hn(C) → Hn−1(A) → · · ·

Version: 5 Owner: bwebste Author(s): bwebste

297.4 chain homotopy

Let (A, d) and (A′, d′) be chain complexes and f : A → A′, g : A → A′ chain maps. A chain homotopy D between f and g is a sequence of homomorphisms {D_n : A_n → A′_{n+1}} so that

d′_{n+1} ◦ D_n + D_{n−1} ◦ d_n = f_n − g_n

for each n. In the corresponding diagram, the rows are the complexes A and A′, the vertical maps are the differences f_n − g_n, and the maps D_n : A_n → A′_{n+1} run diagonally.

Version: 4 Owner: RevBobo Author(s): RevBobo

297.5 chain map

Let (A, d) and (A′, d′) be chain complexes. A chain map f : A → A′ is a sequence of homomorphisms {f_n : A_n → A′_n} such that d′_n ◦ f_n = f_{n−1} ◦ d_n for each n. Diagrammatically, this says that the square

A_n  --d_n-->  A_{n−1}
 |f_n           |f_{n−1}
A′_n --d′_n--> A′_{n−1}

commutes.

Version: 3 Owner: RevBobo Author(s): RevBobo

297.6 homology (chain complex)

If (A, d) is a chain complex

· · · → A_{n+1} --d_{n+1}--> A_n --d_n--> A_{n−1} → · · ·

then the n-th homology group H_n(A, d) (or module) of the chain complex A is the quotient

H_n(A, d) = ker d_n / im d_{n+1}.

Version: 2 Owner: bwebste Author(s): bwebste
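As a concrete illustration of the quotient ker d_n / im d_{n+1}, the sketch below (my own example, with coefficients taken in Q rather than Z) computes the Betti numbers dim ker d_n − rank d_{n+1} of the simplicial chain complex of a hollow triangle.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) over Q by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Hollow triangle: vertices 0,1,2 and edges (01),(02),(12); no 2-cells.
# d1 sends an edge (ab) to b - a; columns are edges, rows are vertices.
d1 = [[-1, -1,  0],
      [ 1,  0, -1],
      [ 0,  1,  1]]

b0 = 3 - rank(d1)          # dim ker d0 - rank d1, with d0 = 0
b1 = (3 - rank(d1)) - 0    # dim ker d1 - rank d2, with d2 = 0
```

Both Betti numbers come out to 1, matching the fact that a hollow triangle is connected and has one loop.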

Chapter 298 18G40 – Spectral sequences, hypercohomology
298.1 spectral sequence

A spectral sequence is a collection of R-modules (or more generally, objects of an abelian category) {E^r_{p,q}} for all r ∈ N, p, q ∈ Z, equipped with maps d^r : E^r_{p,q} → E^r_{p−r,q+r−1} such that (E^r, d^r) is a chain complex (d^r ◦ d^r = 0), and the E^{r+1}'s are its homology; that is,

E^{r+1}_{p,q} ≅ ker(d^r_{p,q}) / im(d^r_{p+r,q−r+1}).

(Note: what I have defined above is a homology spectral sequence. Cohomology spectral sequences are identical, except that all the arrows go in the other direction.)

Most interesting spectral sequences are upper right quadrant, meaning that E^r_{p,q} = 0 if p < 0 or q < 0. If this is the case, then for any p, q, both d^r_{p,q} and d^r_{p+r,q−r+1} are 0 for sufficiently large r, since the target or source is out of the upper right quadrant, so that for all r > r_0, E^r_{p,q} = E^{r+1}_{p,q} = · · ·. This group is called E^∞_{p,q}.

An upper right quadrant spectral sequence {E^r_{p,q}} is said to converge to a sequence F_n of R-modules if there is an exhaustive filtration F_{n,0} = 0 ⊂ F_{n,1} ⊂ · · · of each F_n such that

F_{p+q,q+1} / F_{p+q,q} ≅ E^∞_{p,q}.

This is typically written E^r_{p,q} ⇒ F_{p+q}.

Typically spectral sequences are used in the following manner: we find an interpretation of E^r for a small value of r, typically 1, and of E^∞, and then in cases where enough groups and differentials are 0, we can obtain information about one from the other.

Version: 2 Owner: bwebste Author(s): bwebste

Chapter 299 19-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
299.1 Algebraic K-theory

Algebraic K-theory is a series of functors on the category of rings. It classifies ring invariants, i.e. ring properties that are Morita invariant.

The functor K0. Let R be a ring and denote by M∞(R) the algebraic direct limit of matrix algebras M_n(R) under the embeddings M_n(R) → M_{n+1}(R) : a ↦ ( a 0 ; 0 0 ). The zeroth K-group of R, K0(R), is the Grothendieck group (abelian group of formal differences) of the unitary equivalence classes of projections in M∞(R). The addition of two equivalence classes [p] and [q] is given by the direct summation of the projections p and q: [p] + [q] = [p ⊕ q].

The functor K1. [To Do: coauthor?]

The functor K2. [To Do: coauthor?]

Higher K-functors. Higher K-groups are defined using the Quillen plus construction,

K_n^{alg}(R) = π_n(BGL∞(R)^+), (299.1.1)

where GL∞(R) is the infinite general linear group over R (defined in a similar way to M∞(R)), and BGL∞(R) is its classifying space. Algebraic K-theory has a product structure,

K_i(R) ⊗ K_j(S) → K_{i+j}(R ⊗ S). (299.1.2)

Version: 2 Owner: mhale Author(s): mhale

299.2 K-theory

Topological K-theory is a generalised cohomology theory on the category of compact Hausdorff spaces. It classifies the vector bundles over a space X up to stable equivalence. Equivalently, via the Serre-Swan theorem, it classifies the finitely generated projective modules over the C*-algebra C(X).

Let A be a unital C*-algebra over C and denote by M∞(A) the algebraic direct limit of matrix algebras M_n(A) under the embeddings M_n(A) → M_{n+1}(A) : a ↦ ( a 0 ; 0 0 ). The K0(A) group is the Grothendieck group (abelian group of formal differences) of the homotopy classes of the projections in M∞(A). Two projections p and q are homotopic if p = uqu^{-1} for some unitary u ∈ M∞(A). Addition of homotopy classes is given by the direct summation of projections: [p] + [q] = [p ⊕ q].

Denote by U∞(A) the direct limit of unitary groups U_n(A) under the embeddings U_n(A) → U_{n+1}(A) : u ↦ ( u 0 ; 0 1 ). Give U∞(A) the direct limit topology, i.e. a subset U of U∞(A) is open if and only if U ∩ U_n(A) is an open subset of U_n(A) for all n. The K1(A) group is the Grothendieck group (abelian group of formal differences) of the homotopy classes of the unitaries in U∞(A). Two unitaries u and v are homotopic if uv^{-1} lies in the identity component of U∞(A). Addition of homotopy classes is given by the direct summation of unitaries: [u] + [v] = [u ⊕ v]. Equivalently, one can work with invertibles in GL∞(A) (an invertible g is connected to the unitary u = g|g|^{-1} via the homotopy t ↦ g|g|^{-t}).

Higher K-groups can be defined through repeated suspensions,

K_n(A) = K0(S^n A). (299.2.1)

But the Bott periodicity theorem means that

K1(SA) ≅ K0(A). (299.2.2)

The main properties of K_i are:

K_i(A ⊕ B) = K_i(A) ⊕ K_i(B), (299.2.3)
K_i(M_n(A)) = K_i(A) (Morita invariance), (299.2.4)
K_i(A ⊗ K) = K_i(A) (stability), (299.2.5)
K_{i+2}(A) = K_i(A) (Bott periodicity). (299.2.6)

There are three flavours of topological K-theory to handle the cases of A being complex (over C), real (over R) or Real (with a given real structure).

K_i(C(X, C)) = KU^{−i}(X) (complex/unitary), (299.2.7)
K_i(C(X, R)) = KO^{−i}(X) (real/orthogonal), (299.2.8)
KR_i(C(X), J) = KR^{−i}(X, J) (Real). (299.2.9)

Real K-theory has a Bott period of 8, rather than 2.
REFERENCES
1. N. E. Wegge-Olsen, K-theory and C ∗ -algebras. Oxford science publications. Oxford University Press, 1993. 2. B. Blackadar, K-Theory for Operator Algebras. Cambridge University Press, 2nd ed., 1998.

Version: 12 Owner: mhale Author(s): mhale

299.3 examples of algebraic K-theory groups

Algebraic K-theory of some common rings:

R    K0(R)   K1(R)   K2(R)   K3(R)   K4(R)
Z    Z       Z/2     Z/2     Z/48    0
R    Z       R^×
C    Z       C^×

Version: 2 Owner: mhale Author(s): mhale


Chapter 300 19K33 – EXT and K-homology
300.1 Fredholm module

Fredholm modules represent abstract elliptic pseudo-differential operators.

Definition 23. An odd Fredholm module (H, F) over a C*-algebra A is given by an involutive representation π of A on a Hilbert space H, together with an operator F on H such that F = F*, F^2 = 1 and [F, π(a)] ∈ K(H) for all a ∈ A.

Definition 24. An even Fredholm module (H, F, Γ) is given by an odd Fredholm module (H, F) together with a Z_2-grading Γ on H, Γ = Γ*, Γ^2 = 1, such that Γπ(a) = π(a)Γ and ΓF = −FΓ.

Definition 25. A Fredholm module is called degenerate if [F, π(a)] = 0 for all a ∈ A. Degenerate Fredholm modules are homotopic to the 0-module.

Example 17 (Fredholm modules over C). An even Fredholm module (H, F, Γ) over C is given by

H = C^k ⊕ C^k, π(a) = ( a1_k 0 ; 0 0 ), F = ( 0 1_k ; 1_k 0 ), Γ = ( 1_k 0 ; 0 −1_k ).

Version: 3 Owner: mhale Author(s): mhale

300.2 K-homology

K-homology is a homology theory on the category of compact Hausdorff spaces. It classifies the elliptic pseudo-differential operators acting on the vector bundles over a space. In terms of C*-algebras, it classifies the Fredholm modules over an algebra. The K^0(A) group is the abelian group of homotopy classes of even Fredholm modules over A. The K^1(A) group is the abelian group of homotopy classes of odd Fredholm modules over A. Addition is given by direct summation of Fredholm modules, and the inverse of (H, F, Γ) is (H, −F, −Γ).

Version: 1 Owner: mhale Author(s): mhale


Chapter 301 19K99 – Miscellaneous
301.1 examples of K-theory groups
A               K0(A)         K1(A)
C               Z             0
Mn(C)           Z             0
H               Z             0
K               Z             0
B               0             0
B/K             0             Z
C0((0,1))       0             Z
C0(R^{2n})      Z             0
C0(R^{2n+1})    0             Z
C([0,1])        Z             0
C(T^n)          Z^{2^{n−1}}   Z^{2^{n−1}}
C(S^{2n})       Z^2           0
C(S^{2n+1})     Z             Z
C(CP^n)         Z^{n+1}       0
O_n             Z/(n−1)       0
A_θ             Z^2           Z^2
C*(H3)          Z^3           Z^3

Topological K-theory of some common C*-algebras.

Version: 5 Owner: mhale Author(s): mhale


Chapter 302 20-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
302.1 alternating group is a normal subgroup of the symmetric group

Theorem 2. The alternating group A_n is a normal subgroup of the symmetric group S_n.

Proof. Define the epimorphism f : S_n → Z_2 by σ ↦ 0 if σ is an even permutation and σ ↦ 1 if σ is an odd permutation. Then A_n is the kernel of f, and so it is a normal subgroup of the domain S_n. Furthermore, S_n/A_n ≅ Z_2 by the first isomorphism theorem, so by Lagrange's theorem

|S_n| = |A_n| |S_n/A_n|.

Therefore, |A_n| = n!/2. That is, there are n!/2 elements in A_n.

Version: 1 Owner: tensorking Author(s): tensorking
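Both conclusions, |A_n| = n!/2 and the normality of A_n, are easy to confirm by brute force for a small n. The Python sketch below (my own illustration, not part of the entry) does so for n = 4.

```python
from itertools import permutations

n = 4
S = list(permutations(range(n)))   # the symmetric group S_4, 24 elements

def parity(p):
    """0 for an even permutation, 1 for an odd one (count inversions)."""
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2

def mul(g, h):
    """Composition: apply h first, then g."""
    return tuple(g[x] for x in h)

def inv(g):
    out = [0] * n
    for i, x in enumerate(g):
        out[x] = i
    return tuple(out)

A = {p for p in S if parity(p) == 0}   # the alternating group A_4

# A_n is closed under conjugation by every element of S_n, i.e. normal
is_normal = all(mul(mul(g, a), inv(g)) in A for g in S for a in A)
```

Here |A_4| comes out to 4!/2 = 12, and the conjugation check confirms normality without appealing to the kernel argument.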

302.2 associative

Let (S, φ) be a set with a binary operation φ. The operation φ is said to be associative over S if

φ(a, φ(b, c)) = φ(φ(a, b), c)


for all a, b, c ∈ S. Examples of associative operations are addition and multiplication over the integers (or reals), and addition and multiplication of n × n matrices.

We can construct an operation which is not associative. Let S be the integers, and define ν(a, b) = a^2 + b. Then

ν(ν(a, b), c) = ν(a^2 + b, c) = a^4 + 2a^2 b + b^2 + c,

but

ν(a, ν(b, c)) = ν(a, b^2 + c) = a^2 + b^2 + c,

hence ν(ν(a, b), c) ≠ ν(a, ν(b, c)) in general. Note, however, that if we were to take S = {0}, ν would be associative over S! This illustrates the fact that the set the operation is taken with respect to is very important.

Example. We show that the division operation over the nonzero reals is non-associative. All we need is a counter-example: so let us compare 1/(1/2) and (1/1)/2. The first expression is equal to 2, the second to 1/2, hence division over the nonzero reals is not associative.

Version: 6 Owner: akrowne Author(s): akrowne
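The counterexample above is easy to check mechanically; this small Python sketch (illustrative only) evaluates both groupings of ν at a concrete triple.

```python
def nu(a, b):
    """The non-associative operation nu(a, b) = a^2 + b on the integers."""
    return a * a + b

left = nu(nu(1, 2), 3)    # (1^2 + 2)^2 + 3 = 12
right = nu(1, nu(2, 3))   # 1^2 + (2^2 + 3) = 8

# Restricted to S = {0} the operation is associative:
# both groupings of (0, 0, 0) evaluate to 0.
restricted_ok = nu(nu(0, 0), 0) == nu(0, nu(0, 0)) == 0
```

Since left ≠ right, a single triple already witnesses non-associativity over the integers.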

302.3 canonical projection

Given a group G and a normal subgroup N ◁ G, there is an epimorphism π : G → G/N defined by sending an element g ∈ G to its coset gN. The epimorphism π is referred to as the canonical projection.

Version: 4 Owner: Dr Absentius Author(s): Dr Absentius

302.4 centralizer

For a given group G, the centralizer of an element a ∈ G is defined to be the set

C(a) = {x ∈ G | xa = ax}.

We note that if x, y ∈ C(a) then xy^{-1}a = xay^{-1} = axy^{-1}, so that xy^{-1} ∈ C(a). Thus C(a) is a non-trivial subgroup of G containing at least {e, a}.

To illustrate an application of this concept we prove the following lemma.

Lemma. There exists a bijection between the right cosets of C(a) and the conjugates of a.

Proof. If x, y ∈ G are in the same right coset, then y = cx for some c ∈ C(a). Thus y^{-1}ay = x^{-1}c^{-1}acx = x^{-1}c^{-1}cax = x^{-1}ax. Conversely, if y^{-1}ay = x^{-1}ax then xy^{-1}a = axy^{-1}, so xy^{-1} ∈ C(a), giving that x and y are in the same right coset.

Let [a] denote the conjugacy class of a. It follows that |[a]| = [G : C(a)], and hence |[a]| divides |G|. We remark that a ∈ Z(G) ⇔ |C(a)| = |G| ⇔ |[a]| = 1, where Z(G) denotes the center of G.

Now let G be a p-group, i.e. a finite group of order p^n, where p is a prime and n > 0. Let z = |Z(G)|. Summing over elements in distinct conjugacy classes, we have

p^n = Σ |[a]| = z + Σ_{a ∉ Z(G)} |[a]|,

since the center consists precisely of the conjugacy classes of cardinality 1. Each term |[a]| in the second sum divides p^n and is greater than 1, so p divides the second sum; hence p | z. However, Z(G) is certainly non-empty, so we conclude that every p-group has a non-trivial center.

The groups C(gag^{-1}) and C(a), for any g, are isomorphic.

Version: 5 Owner: mathcam Author(s): Larry Hammick, vitriol
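The identity |[a]| = [G : C(a)] can be verified directly in a small group. The sketch below (my own illustration) computes the centralizer and conjugacy class of a transposition in S_3.

```python
from itertools import permutations

G = list(permutations(range(3)))   # S_3 as value tuples

def mul(g, h):
    return tuple(g[x] for x in h)

def inv(g):
    out = [0] * 3
    for i, x in enumerate(g):
        out[x] = i
    return tuple(out)

a = (1, 0, 2)                                  # the transposition swapping 0 and 1
C = [x for x in G if mul(x, a) == mul(a, x)]   # centralizer C(a) = {e, a}
cls = {mul(mul(g, a), inv(g)) for g in G}      # conjugacy class [a]

# |[a]| = [G : C(a)] = |G| / |C(a)|
index_matches = (len(G) // len(C) == len(cls))
```

For a transposition in S_3 the centralizer has 2 elements and the class has 3, so the index formula 6/2 = 3 checks out.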

302.5 commutative

Let (S, φ) be a set with a binary operation φ. The operation φ is said to be commutative if φ(a, b) = φ(b, a) for all a, b ∈ S. Some operations which are commutative are addition over the integers, multiplication over the integers, addition over n × n matrices, and multiplication over the reals. An example of a non-commutative operation is multiplication over n × n matrices. Version: 3 Owner: akrowne Author(s): akrowne

302.6 examples of groups

Groups are ubiquitous throughout mathematics. Many “naturally occurring” groups are either groups of numbers (typically abelian) or groups of symmetries (typically non-abelian).

Groups of numbers

• The most important group is the group of integers Z with addition as operation.

• The integers modulo n, often denoted by Z_n, form a group under addition. Like Z itself, this is a cyclic group; any cyclic group is isomorphic to one of these.

• The rational (or real, or complex) numbers form a group under addition.

• The positive rationals form a group under multiplication, and so do the non-zero rationals. The same is true for the reals.

• The non-zero complex numbers form a group under multiplication. So do the non-zero quaternions. The latter is our first example of a non-abelian group.

• More generally, any (skew) field gives rise to two groups: the additive group of all field elements, and the multiplicative group of all non-zero field elements.

• The complex numbers of absolute value 1 form a group under multiplication, best thought of as the unit circle. The quaternions of absolute value 1 form a group under multiplication, best thought of as the three-dimensional unit sphere S^3. The two-dimensional sphere S^2, however, is not a group in any natural way.

Most groups of numbers carry natural topologies turning them into topological groups.

Symmetry groups

• The symmetric group of degree n, denoted by S_n, consists of all permutations of n items and has n! elements. Every finite group is isomorphic to a subgroup of some S_n.

• An important subgroup of the symmetric group of degree n is the alternating group, denoted A_n. This consists of all even permutations on n items. A permutation is said to be even if it can be written as the product of an even number of transpositions. The alternating group is normal in S_n, of index 2, and it is an interesting fact that A_n is simple for n ≥ 5. See the proof on the simplicity of the alternating groups. By the Jordan-Hölder theorem, this means that this is the only normal subgroup of S_n.

• If any geometrical object is given, one can consider its symmetry group consisting of all rotations and reflections which leave the object unchanged. For example, the symmetry group of a cone is isomorphic to S^1.

• The set of all automorphisms of a given group (or field, or graph, or topological space, or object in any category) forms a group with operation given by the composition of homomorphisms. These are called automorphism groups; they capture the internal symmetries of the given objects.

• In Galois theory, the symmetry groups of field extensions (or equivalently: the symmetry groups of solutions to polynomial equations) are the central object of study; they are called Galois groups.

• Several matrix groups describe various aspects of the symmetry of n-space:


– The general linear group GL(n, R) of all real invertible n × n matrices (with matrix multiplication as operation) contains rotations, reflections, dilations, shear transformations, and their combinations. – The orthogonal group O(n, R) of all real orthogonal n × n matrices contains the rotations and reflections of n-space. – The special orthogonal group SO(n, R) of all real orthogonal n × n matrices with determinant 1 contains the rotations of n-space. All these matrix groups are Lie groups: groups which are differentiable manifolds such that the group operations are smooth maps.

Other groups

• The trivial group consists only of its identity element.

• If X is a topological space and x is a point of X, we can define the fundamental group of X at x. It consists of (equivalence classes of) continuous paths starting and ending at x and describes the structure of the “holes” in X accessible from x.

• The free groups are important in algebraic topology. In a sense, they are the most general groups, having only those relations among their elements that are absolutely required by the group axioms.

• If A and B are two abelian groups (or modules over the same ring), then the set Hom(A, B) of all homomorphisms from A to B is an abelian group (since the sum and difference of two homomorphisms is again a homomorphism). Note that the commutativity of B is crucial here: without it, one couldn't prove that the sum of two homomorphisms is again a homomorphism.

• The set of all invertible n × n matrices over some ring R forms a group denoted by GL(n, R).

• The positive integers less than n which are coprime to n form a group if the operation is defined as multiplication modulo n. This is an abelian group whose order is given by the Euler phi-function φ(n); it is cyclic exactly when n is 1, 2, 4, p^k or 2p^k for an odd prime p.

• Generalizing the last two examples, every ring (and every monoid) contains a group, its group of units (invertible elements), where the group operation is ring (monoid) multiplication.

• If K is a number field, then multiplication of (equivalence classes of) non-zero ideals in the ring of algebraic integers O_K gives rise to the ideal class group of K.

• The set of arithmetic functions that take a value other than 0 at 1 form an abelian group under Dirichlet convolution. They include as a subgroup the set of multiplicative functions.

• Consider the curve C = {(x, y) ∈ R^2 | y^2 = x^3 − x}. Every straight line intersects this set in three points (counting a point twice if the line is tangent, and allowing for a point at infinity). If we require that those three points add up to zero for any straight line, then we have defined an abelian group structure on C. Groups like these are called abelian varieties; the most prominent examples are elliptic curves, of which C is the simplest one.

• In the classification of all finite simple groups, several “sporadic” groups occur which don't follow any discernible pattern. The largest of these is the monster group, with some 8 · 10^53 elements.

Version: 14 Owner: AxelBoldt Author(s): AxelBoldt, NeuRet
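One of the examples above, the integers coprime to n under multiplication modulo n, can be checked directly. A small Python sketch (my own illustration, not from the entry):

```python
from math import gcd

def units(n):
    """The residues 1 <= a < n that are coprime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

n = 12
U = units(n)

# group axioms under multiplication mod n
closed = all((a * b) % n in U for a in U for b in U)
has_identity = 1 in U
has_inverses = all(any((a * b) % n == 1 for b in U) for a in U)

phi = len(U)   # the order is the Euler phi-function: phi(12) = 4
```

For n = 12 the group is {1, 5, 7, 11}; every element squares to 1, which also illustrates that this group is not cyclic.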

302.7 group

A group is a pair (G, ∗) where G is a non-empty set and ∗ is a binary operation on G satisfying the following conditions:

• For any a, b, c ∈ G, (a ∗ b) ∗ c = a ∗ (b ∗ c). (Associativity of the operation.)

• For any a, b ∈ G, a ∗ b belongs to G. (The operation ∗ is closed.)

• There is an element e ∈ G such that g ∗ e = e ∗ g = g for any g ∈ G. (Existence of an identity element.)

• For any g ∈ G there exists an element h such that g ∗ h = h ∗ g = e. (Existence of inverses.)

Usually the symbol ∗ is omitted and we write ab for a ∗ b. Sometimes the symbol + is used to represent the operation, especially when the group is abelian. It can be proved that there is only one identity element, and that for every element there is only one inverse. Because of this we usually denote the inverse of a as a^{-1}, or −a when we are using additive notation. The identity element is also called the neutral element due to its behavior with respect to the operation.

Version: 10 Owner: drini Author(s): drini

302.8 quotient group

Let (G, ∗) be a group and H a normal subgroup. The relation ∼ given by a ∼ b when ab^{-1} ∈ H is an equivalence relation. The equivalence classes are called cosets. The equivalence class of a is denoted aH (or a + H if additive notation is being used).

We can induce a group structure on the cosets with the following operation:

(aH)(bH) = (a ∗ b)H.

The collection of cosets is denoted G/H, and together with this operation it forms the quotient group or factor group of G with H.

Example. Consider the group Z and the subgroup 3Z = {n ∈ Z : n = 3k, k ∈ Z}. Since Z is abelian, 3Z is also a normal subgroup. Using additive notation, the equivalence relation becomes n ∼ m when (n − m) ∈ 3Z, that is, when 3 divides n − m. So the relation is actually congruence modulo 3. Therefore the equivalence classes (the cosets) are:

3Z = {. . . , −9, −6, −3, 0, 3, 6, 9, . . .}
1 + 3Z = {. . . , −8, −5, −2, 1, 4, 7, 10, . . .}
2 + 3Z = {. . . , −7, −4, −1, 2, 5, 8, 11, . . .}

which we'll represent as 0̄, 1̄ and 2̄. Then we can check that Z/3Z is actually the integers modulo 3 (that is, Z/3Z ≅ Z_3).

Version: 6 Owner: drini Author(s): drini
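The key point that the induced operation is well defined (the coset of a sum does not depend on the representatives chosen) can be spot-checked for 3Z in Z with a quick Python sketch (illustrative only):

```python
# Cosets of 3Z in Z are determined by the remainder mod 3.
def coset(a):
    return a % 3

reps = range(-9, 10)

# (aH)(bH) = (a+b)H is well defined: changing representatives
# within a coset never changes the coset of the sum.
well_defined = all(
    coset(a + b) == coset(a2 + b2)
    for a in reps for b in reps
    for a2 in reps for b2 in reps
    if coset(a) == coset(a2) and coset(b) == coset(b2))

# the resulting addition table is that of the integers mod 3
table = [[coset(a + b) for b in range(3)] for a in range(3)]
```

The brute-force check ranges over a finite window of representatives, which is enough to exercise every pair of cosets.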


Chapter 303 20-02 – Research exposition (monographs, survey articles)
303.1 length function

Let G be a group. A length function on G is a function L : G → R^+ satisfying:

L(e) = 0,
L(g) = L(g^{-1}), ∀g ∈ G,
L(g_1 g_2) ≤ L(g_1) + L(g_2), ∀g_1, g_2 ∈ G.

Version: 2 Owner: mhale Author(s): mhale
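A standard example is the word length on (Z, +) with respect to the generating set {1, −1}, namely L(n) = |n|. The Python sketch below (my own illustration) checks the three axioms on a finite window of integers.

```python
# L(n) = |n| is a length function on the group (Z, +):
# it is the word length with respect to the generators {1, -1}.
L = abs

window = range(-20, 21)

axiom_identity = (L(0) == 0)                                # L(e) = 0
axiom_symmetry = all(L(g) == L(-g) for g in window)         # L(g) = L(g^{-1})
axiom_subadditive = all(L(g + h) <= L(g) + L(h)
                        for g in window for h in window)    # triangle inequality
```

Checking on a window is only a sanity test; for L = |·| all three axioms of course hold on all of Z.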


Chapter 304 20-XX – Group theory and generalizations
304.1 free product with amalgamated subgroup

Definition 26. Let G_k, k = 0, 1, 2 be groups and i_k : G_0 → G_k, k = 1, 2 be monomorphisms. The free product of G_1 and G_2 with amalgamated subgroup G_0 is defined to be a group G that has the following two properties:

1. there are homomorphisms j_k : G_k → G, k = 1, 2, with j_1 ◦ i_1 = j_2 ◦ i_2, so that the square formed by i_1, i_2, j_1, j_2 commutes;

2. G is universal with respect to the previous property; that is, for any other group G′ and homomorphisms j′_k : G_k → G′, k = 1, 2, with j′_1 ◦ i_1 = j′_2 ◦ i_2, there is a unique homomorphism G → G′ making the whole diagram commute.

It follows by “general nonsense” that the free product of G_1 and G_2 with amalgamated subgroup G_0, if it exists, is “unique up to unique isomorphism.” The free product of G_1 and G_2 with amalgamated subgroup G_0 is denoted by G_1 ∗_{G_0} G_2. The following theorem asserts its existence.

Theorem 1. G_1 ∗_{G_0} G_2 exists for any groups G_k, k = 0, 1, 2 and monomorphisms i_k : G_0 → G_k, k = 1, 2.

Sketch of proof. Without loss of generality, assume that G_0 is a subgroup of G_k and that i_k is the inclusion for k = 1, 2. Let

G_k = ⟨ (x_{k;s})_{s∈S} | (r_{k;t})_{t∈T} ⟩

be a presentation of G_k for k = 1, 2. Each g ∈ G_0 can be expressed as a word in the generators of G_k; denote that word by w_k(g), and let N be the normal closure of {w_1(g) w_2(g)^{-1} | g ∈ G_0} in the free product G_1 ∗ G_2. Define

G_1 ∗_{G_0} G_2 := (G_1 ∗ G_2)/N,

and for k = 1, 2 define j_k to be the inclusion into the free product followed by the canonical projection. Clearly (1) is satisfied, while (2) follows from the universal properties of the free product and the quotient group.

Notice that in the above proof it would be sufficient to divide by the relations w_1(g) w_2(g)^{-1} for g in a generating set of G_0. This is useful in practice when one is interested in obtaining a presentation of G_1 ∗_{G_0} G_2.

In case the i_k's are not injective the above still goes through verbatim. The group thus obtained is called a “pushout”. Examples of free products with amalgamated subgroups are provided by Van Kampen's theorem.

Version: 1 Owner: Dr Absentius Author(s): Dr Absentius

304.2 nonabelian group

Let (G, ∗) be a group. If a ∗ b ≠ b ∗ a for some a, b ∈ G, we say that the group is nonabelian or noncommutative.

Proposition. There is a nonabelian group for which x ↦ x^3 is a homomorphism.

Version: 2 Owner: drini Author(s): drini, apmxi


Chapter 305 20A05 – Axiomatics and elementary properties
305.1 Feit-Thompson theorem

An important result in the classification of all finite simple groups, the Feit-Thompson theorem states that every finite non-abelian simple group must have even order; equivalently, every group of odd order is solvable. The proof requires 255 pages. Version: 1 Owner: mathcam Author(s): mathcam

305.2 Proof: The orbit of any element of a group is a subgroup

Following is a proof that, if G is a group and g ∈ G, then ⟨g⟩ ≤ G. Here ⟨g⟩ is the orbit of g and is defined as

⟨g⟩ = {g^n : n ∈ Z}.

Since g ∈ ⟨g⟩, the set ⟨g⟩ is nonempty. Let a, b ∈ ⟨g⟩. Then there exist x, y ∈ Z such that a = g^x and b = g^y. Since ab^{-1} = g^x (g^y)^{-1} = g^x g^{-y} = g^{x−y} ∈ ⟨g⟩, it follows that ⟨g⟩ ≤ G. Version: 3 Owner: drini Author(s): drini, Wkbj79
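The same subgroup test can be run mechanically: the sketch below (my own illustration) builds ⟨g⟩ for a 3-cycle in S_4 and confirms closure under the group operation.

```python
n = 4

def mul(g, h):
    """Composition of permutations stored as value tuples."""
    return tuple(g[x] for x in h)

e = tuple(range(n))
g = (1, 2, 0, 3)          # the 3-cycle (0 1 2) in S_4

# collect the powers of g, i.e. the orbit <g>
powers = {e}
x = g
while x != e:
    powers.add(x)
    x = mul(x, g)

# closure under the operation; for a finite subset of a group this
# already implies closure under inverses, so <g> is a subgroup
closed = all(mul(a, b) in powers for a in powers for b in powers)
```

For this g the orbit has 3 elements (e, g, g^2), matching the order of the 3-cycle.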


305.3 center

The center of a group G is the subgroup of elements which commute with every other element. Formally,

Z(G) = {x ∈ G | xg = gx, ∀g ∈ G}.

It can be shown that the center has the following properties:

• It is non-empty, since it contains at least the identity element.
• It consists of those conjugacy classes containing just one element.
• The center of an abelian group is the entire group.
• It is normal in G.
• Every p-group has a non-trivial center.

Version: 5 Owner: vitriol Author(s): vitriol
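Two of these properties are easy to confirm computationally: Z(S_3) is trivial, while the dihedral group of the square (a 2-group of order 8) has the non-trivial center {e, r^2}. A Python sketch (my own illustration, not part of the entry):

```python
from itertools import permutations

def center(G, mul):
    """Elements of G commuting with every element of G."""
    return [z for z in G if all(mul(z, g) == mul(g, z) for g in G)]

mul = lambda g, h: tuple(g[x] for x in h)   # composition of value tuples

S3 = list(permutations(range(3)))
Z_S3 = center(S3, mul)                      # only the identity

# D4 as permutations of the square's vertices, generated by
# the rotation r and the reflection s; build it by closure.
r, s = (1, 2, 3, 0), (0, 3, 2, 1)
D4, frontier = set(), [tuple(range(4))]
while frontier:
    g = frontier.pop()
    if g in D4:
        continue
    D4.add(g)
    frontier += [mul(g, r), mul(g, s)]

Z_D4 = center(sorted(D4), mul)              # {e, r^2}: non-trivial, as for every p-group
```

The group D4 has order 8 = 2^3, so the non-trivial center found here is exactly what the p-group property predicts.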

305.4 characteristic subgroup

If (G, ∗) is a group, then H is a characteristic subgroup of G (written H char G) if every automorphism of G maps H to itself. That is,

∀f ∈ Aut(G) ∀h ∈ H : f(h) ∈ H

or, equivalently,

∀f ∈ Aut(G) : f[H] = H.

A few properties of characteristic subgroups:

(a) If H char G then H is a normal subgroup of G.

(b) If G has only one subgroup of a given size, then that subgroup is characteristic.

(c) If K char H and H ◁ G then K ◁ G (contrast with the fact that normality of subgroups is not transitive).

(d) If K char H and H char G then K char G.

Proofs of these properties:

(a) Consider H char G under the inner automorphisms of G. Since every automorphism preserves H, in particular every inner automorphism preserves H, and therefore g ∗ h ∗ g^{-1} ∈ H for any g ∈ G and h ∈ H. This is precisely the definition of a normal subgroup.

(b) Suppose H is the only subgroup of G of order n. In general, homomorphisms take subgroups to subgroups, and of course isomorphisms take subgroups to subgroups of the same order. But since there is only one subgroup of G of order n, any automorphism must take H to H, and so H char G.

(c) Take K char H and H ◁ G, and consider the inner automorphisms of G (automorphisms of the form h ↦ g ∗ h ∗ g^{-1} for some g ∈ G). These all preserve H, and so restrict to automorphisms of H. But any automorphism of H preserves K, so for any g ∈ G and k ∈ K, g ∗ k ∗ g^{-1} ∈ K.

(d) Let K char H and H char G, and let φ be an automorphism of G. Since H char G, φ[H] = H, so φ_H, the restriction of φ to H, is an automorphism of H. Since K char H, φ_H[K] = K. But φ_H is just a restriction of φ, so φ[K] = K. Hence K char G.

Version: 1 Owner: Henry Author(s): Henry

305.5 class function

Given a field K, a K–valued class function on a group G is a function f : G −→ K such that f (g) = f (h) whenever g and h are elements of the same conjugacy class of G. An important example of a class function is the character of a group representation. Over the complex numbers, the set of characters of the irreducible representations of G form a basis for the vector space of all C–valued class functions, when G is a compact Lie group. Relation to the convolution algebra Class functions are also known as central functions, because they correspond to functions f in the convolution algebra C ∗ (G) that have the property f ∗ g = g ∗ f for all g ∈ C ∗ (G) (i.e., they commute with everything under the convolution operation). More precisely, the set of measurable complex valued class functions f is equal to the set of central elements of the convolution algebra C ∗ (G), for G a locally compact group admitting a Haar measure. Version: 5 Owner: djao Author(s): djao


305.6 conjugacy class

Two elements g and g′ of a group G are said to be conjugate if there exists h ∈ G such that g′ = hgh^{-1}. Conjugacy of elements is an equivalence relation, and the equivalence classes of G are called conjugacy classes. Two subsets S and T of G are said to be conjugate if there exists g ∈ G such that T = {gsg^{-1} | s ∈ S} ⊂ G. In this situation, it is common to write gSg^{-1} for T to denote the fact that everything in T has the form gsg^{-1} for some s ∈ S. We say that two subgroups of G are conjugate if they are conjugate as subsets. Version: 2 Owner: djao Author(s): djao

305.7

conjugacy class formula

The conjugacy classes of a group form a partition of its elements. In a finite group, this means that the order of the group is the sum of the numbers of elements of the distinct conjugacy classes. For an element g of a group G, we denote the conjugacy class of g as Cg and the normalizer in G of g as NG (g). The number of elements in Cg equals [G : NG (g)], the index of the normalizer of g in G. For an element g of the center Z(G) of G, the conjugacy class of g consists of the singleton {g}. Putting this together gives us the conjugacy class formula

|G| = |Z(G)| + ∑_{i=1}^{m} [G : NG (xi)]

where the xi are elements of the distinct conjugacy classes contained in G − Z(G). Version: 3 Owner: lieven Author(s): lieven
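The formula can be verified computationally on a small group. The following Python sketch (the permutation helpers `compose` and `inverse` are ad hoc, written for this illustration and not from any library) checks the class equation for the symmetric group S3:

```python
# Verify the conjugacy class formula for S3, with permutations as tuples
# (p[i] is the image of i).  All helper names here are illustrative.
from itertools import permutations

def compose(p, q):          # (p o q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))            # S3, order 6

def conjugacy_class(g):
    return {compose(compose(h, g), inverse(h)) for h in G}

center = [g for g in G if all(compose(g, h) == compose(h, g) for h in G)]

# Distinct conjugacy classes, and the class formula over the non-central ones
classes = {frozenset(conjugacy_class(g)) for g in G}
noncentral = [c for c in classes if len(c) > 1]
assert len(G) == len(center) + sum(len(c) for c in noncentral)

# |Cg| equals the index [G : NG(g)] of the normalizer (centralizer) of g
for g in G:
    N = [h for h in G if compose(compose(h, g), inverse(h)) == g]
    assert len(conjugacy_class(g)) == len(G) // len(N)
```

For S3 the classes have sizes 1 (identity), 3 (transpositions) and 2 (3-cycles), so the formula reads 6 = 1 + 3 + 2.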

305.8

conjugate stabilizer subgroups

Let · be a right group action of G on a set M, and let Gα denote the stabilizer subgroup of α ∈ M. Then Gα·g = g⁻¹Gα g for any α ∈ M and g ∈ G. Proof:

x ∈ Gα·g ⟺ α · (gx) = α · g ⟺ α · (gxg⁻¹) = α ⟺ gxg⁻¹ ∈ Gα ⟺ x ∈ g⁻¹Gα g,

and therefore Gα·g = g⁻¹Gα g. Thus all stabilizer subgroups for elements of the orbit G(α) of α are conjugate to Gα. Version: 4 Owner: Thomas Heye Author(s): Thomas Heye
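For a left action the analogous identity reads G(g·α) = g Gα g⁻¹. The Python sketch below (ad hoc helpers, assuming the natural action of S3 on {0, 1, 2}; nothing here comes from the entry itself) verifies it exhaustively:

```python
# Check G_{g·α} = g G_α g⁻¹ for the natural left action of S3 on {0,1,2}.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))

def stab(a):
    # stabilizer subgroup of the point a
    return {g for g in G if g[a] == a}

for g in G:
    for a in range(3):
        conj = {compose(compose(g, h), inverse(g)) for h in stab(a)}
        assert conj == stab(g[a])   # G_{g·α} = g G_α g⁻¹
```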

305.9

coset

Let H be a subgroup of a group G, and let a ∈ G. The left coset of a with respect to H in G is defined to be the set aH := {ah | h ∈ H}. The right coset of a with respect to H in G is defined to be the set Ha := {ha | h ∈ H}. Two left cosets aH and bH of H in G are either identical or disjoint. Indeed, if c ∈ aH ∩ bH, then c = ah1 and c = bh2 for some h1, h2 ∈ H, whence b⁻¹a = h2 h1⁻¹ ∈ H. But then, given any ah ∈ aH, we have ah = (bb⁻¹)ah = b(b⁻¹a)h ∈ bH, so aH ⊂ bH, and similarly bH ⊂ aH. Therefore aH = bH. Similarly, any two right cosets Ha and Hb of H in G are either identical or disjoint. Accordingly, the collection of left cosets (or right cosets) partitions the group G; the corresponding equivalence relation for left cosets can be described succinctly by the relation a ∼ b if a⁻¹b ∈ H, and for right cosets by a ∼ b if ab⁻¹ ∈ H. The index of H in G, denoted [G : H], is the cardinality of the set G/H of left cosets of H in G. Version: 5 Owner: djao Author(s): rmilson, djao
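As a concrete illustration (not part of the entry), the cosets of the subgroup of multiples of 3 in the additive group Z12 can be enumerated in Python, confirming that they partition the group:

```python
# Cosets a + H in the additive group Z12, with H = {0, 3, 6, 9}.
n = 12
G = list(range(n))
H = [x for x in G if x % 3 == 0]          # subgroup {0, 3, 6, 9}

cosets = {frozenset((a + h) % n for h in H) for a in G}

# Any two cosets are identical or disjoint, and together they cover G
assert all(c1 == c2 or not (c1 & c2) for c1 in cosets for c2 in cosets)
assert set().union(*cosets) == set(G)
assert len(cosets) == len(G) // len(H)    # the index [G : H] is 3
```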

305.10

cyclic group

A group G is said to be cyclic if it is generated by a single element x ∈ G. That is, if G has infinite order then every g ∈ G can be expressed as x^k with k ∈ Z. If G has finite order then every g ∈ G can be expressed as x^k with k ∈ N0, and G has exactly φ(|G|) generators, where φ is the Euler totient function. It is a corollary of Lagrange’s theorem that every group of prime order is cyclic. All cyclic groups of the same order are isomorphic to each other. Consequently cyclic groups of order n are often denoted by Cn. Every cyclic group is abelian.

Examples of cyclic groups are (Zm, +m), (Zp∗, ×p) and (Rm, ×m), where p is prime and Rm = {n ∈ N : (n, m) = 1, n ≤ m}. Version: 10 Owner: yark Author(s): yark, Larry Hammick, vitriol

305.11

derived subgroup

Let G be a group and a, b ∈ G. The group element aba⁻¹b⁻¹ is called the commutator of a and b. An element of G is called a commutator if it is the commutator of some a, b ∈ G. The subgroup of G generated by all the commutators in G is called the derived subgroup of G, and also the commutator subgroup. It is commonly denoted by G′ and also by G(1). Alternatively, one may define G′ as the smallest subgroup that contains all the commutators. Note that the commutator of a, b ∈ G is trivial, i.e. aba⁻¹b⁻¹ = 1, if and only if a and b commute. Thus, in a fashion, the derived subgroup measures the degree to which a group fails to be abelian. Proposition 1. The derived subgroup G′ is normal in G, and the factor group G/G′ is abelian. Indeed, G is abelian if and only if G′ is the trivial subgroup. One can of course form the derived subgroup of the derived subgroup; this is called the second derived subgroup, and denoted by G′′ or by G(2). Proceeding inductively one defines the nth derived subgroup as the derived subgroup of G(n−1). In this fashion one obtains a sequence of subgroups, called the derived series of G: G = G(0) ⊇ G(1) ⊇ G(2) ⊇ . . . Proposition 2. The group G is solvable if and only if the derived series terminates in the trivial group {1} after a finite number of steps. In this case, one can refine the derived series to obtain a composition series (a.k.a. a Jordan-Hölder decomposition) of G. Version: 4 Owner: rmilson Author(s): rmilson
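As a small computational check (illustrative Python with ad hoc permutation helpers, not part of the entry), the derived subgroup of S3 turns out to be A3, reflecting that S3 is non-abelian while S3/A3 is abelian:

```python
# Compute the derived subgroup of S3 as the subgroup generated by all
# commutators a b a⁻¹ b⁻¹.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = tuple(range(3))

def commutator(a, b):
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def generated(gens):
    # naive closure under the group operation (fine for tiny groups)
    S = {e}
    frontier = set(gens)
    while frontier:
        S |= frontier
        frontier = {compose(a, b) for a in S for b in S} - S
    return S

derived = generated({commutator(a, b) for a in G for b in G})
assert len(derived) == 3    # the three even permutations, i.e. A3
```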

305.12

equivariant

Let G be a group, and X and Y left (resp. right) homogeneous spaces of G. Then a map f : X → Y is called equivariant if g(f (x)) = f (gx) (resp. (f (x))g = f (xg)) for all g ∈ G. Version: 1 Owner: bwebste Author(s): bwebste

305.13

examples of finite simple groups

This entry is under construction. If I take too long to finish it, nag me about it, or fill in the rest yourself. All groups considered here are finite. It is now widely believed that the classification of all finite simple groups up to isomorphism is finished. The proof runs to at least 10,000 printed pages and, as of the writing of this entry, has not yet been published in its entirety.

Abelian groups • The first trivial examples of simple groups are the cyclic groups of prime order. It is not difficult to see (say, by Cauchy’s theorem) that these are the only abelian simple groups.

Alternating groups • The alternating group on n symbols is the set of all even permutations of Sn, the symmetric group on n symbols. It is usually denoted by An, or sometimes by Alt(n). This is a normal subgroup of Sn, namely the kernel of the homomorphism that sends every even permutation to 1 and the odd permutations to −1. Because every permutation is either even or odd, and there is a bijection between the two (multiply every even permutation by a transposition), the index of An in Sn is 2. A3 is simple because it only has three elements, and the simplicity of An for n ≥ 5 can be proved by an elementary argument. The simplicity of the alternating groups is an important fact that Évariste Galois required in order to prove the insolubility by radicals of the general polynomial of degree higher than four.

Groups of Lie type • Projective special linear groups • Other groups of Lie type. Sporadic groups There are twenty-six sporadic groups (no more, no less!) that do not fit into any of the infinite sequences of simple groups considered above. These often arise as the group of automorphisms of strongly regular graphs.


• Mathieu groups. • Janko groups. • The baby monster. • The monster. Version: 8 Owner: drini Author(s): bbukh, yark, NeuRet

305.14

finitely generated group

A group G is finitely generated if there is a finite subset X ⊆ G such that X generates G. That is, every element of G is a product of elements of X and inverses of elements of X. Or, equivalently, no proper subgroup of G contains X. Every finite group is finitely generated, as we can take X = G. Every finitely generated group is countable. Version: 6 Owner: yark Author(s): yark, nerdy2

305.15

first isomorphism theorem

If f : G → H is a homomorphism of groups (or rings, or modules), then it induces an isomorphism G/ker f ≅ im f. Version: 2 Owner: nerdy2 Author(s): nerdy2

305.16

fourth isomorphism theorem

fourth isomorphism theorem
1: X a group
2: N ◁ X
3: A the set of subgroups of X that contain N
4: B the set of subgroups of X/N

5: ∃ϕ : A → B bijection such that ∀Y, Z ≤ X with N ≤ Y and N ≤ Z: Y ≤ Z ⇔ Y/N ≤ Z/N & Z ≤ Y ⇒ [Y : Z] = [Y/N : Z/N] & ⟨Y, Z⟩/N = ⟨Y/N, Z/N⟩ & (Y ∩ Z)/N = Y/N ∩ Z/N & Y ◁ X ⇔ Y/N ◁ X/N Note: This is a “seed” entry written using a short-hand format described in this FAQ. Version: 2 Owner: bwebste Author(s): yark, apmxi

305.17

generator

If G is a cyclic group and g ∈ G, then g is a generator of G if ⟨g⟩ = G. All infinite cyclic groups have exactly 2 generators. Let G be an infinite cyclic group and g be a generator of G. Let z ∈ Z such that g^z is a generator of G. Then ⟨g^z⟩ = G. Since g ∈ G, then g ∈ ⟨g^z⟩. Thus, there exists n ∈ Z with g = (g^z)^n = g^{nz}. Thus, g^{nz−1} = eG. Since G is infinite and |g| = |⟨g⟩| = |G| must be infinite, then nz − 1 = 0. Since nz = 1 and n and z are integers, then n = z = 1 or n = z = −1. It follows that the only generators of G are g and g⁻¹. A finite cyclic group of order n has exactly ϕ(n) generators, where ϕ is the Euler totient function. Let G be a finite cyclic group of order n and g be a generator of G. Then |g| = |⟨g⟩| = |G| = n. Let z ∈ Z such that g^z is a generator of G. By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < n such that z = qn + r. Thus, g^z = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r = (eG)^q g^r = eG g^r = g^r. Since g^r is a generator of G, then ⟨g^r⟩ = G. Thus, n = |G| = |⟨g^r⟩| = |g^r| = |g|/gcd(r, |g|) = n/gcd(r, n). Therefore, gcd(r, n) = 1, and the result follows. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
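The finite case can be checked numerically. In additive notation Zn is generated by 1, and z is a generator exactly when gcd(z, n) = 1; the Python sketch below (illustrative helper names, not from the entry) confirms the count ϕ(n):

```python
# In the additive cyclic group Zn, z generates the group iff gcd(z, n) = 1,
# so the number of generators is the Euler totient phi(n).
from math import gcd

def generators(n):
    # brute force: z generates Zn iff its multiples cover the whole group
    return [z for z in range(n)
            if {(z * k) % n for k in range(n)} == set(range(n))]

def phi(n):
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

for n in range(1, 30):
    gens = generators(n)
    assert len(gens) == phi(n)
    assert all(gcd(z, n) == 1 for z in gens if n > 1)
```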

305.18

group actions and homomorphisms

Notes on group actions and homomorphisms. Let G be a group, X a non-empty set and SX the symmetric group of X, i.e. the group of all bijective maps on X. Let · denote a left group action of G on X. 1. For each g ∈ G we define fg : X −→ X, fg(x) = g · x for all x ∈ X. Then f_{g⁻¹}(fg(x)) = g⁻¹ · (g · x) = x for all x ∈ X, so f_{g⁻¹} is the inverse of fg; hence fg is bijective and thus an element of SX. We define F : G −→ SX, F (g) = fg for all g ∈ G. This mapping is a group homomorphism: for g, h ∈ G and x ∈ X,

F (gh)(x) = f_{gh}(x) = (gh) · x = g · (h · x) = (fg ◦ fh)(x) = (F (g) ◦ F (h))(x)

implies F (gh) = F (g) ◦ F (h). The same is obviously true for a right group action. 2. Now let F : G −→ SX be a group homomorphism. Then the map f : G × X −→ X, (g, x) −→ F (g)(x) satisfies (a) f (1G, x) = F (1G)(x) = x for all x ∈ X, and

(b) f (gh, x) = F (gh)(x) = (F (g) ◦ F (h))(x) = F (g)(F (h)(x)) = f (g, f (h, x)),

so f is a group action induced by F.

Characterization of group actions
Let G be a group acting on a set X. Using the same notation as above, we have for each g ∈ G:

g ∈ ker(F ) ⟺ fg = F (g) = id_X ⟺ g · x = x for all x ∈ X ⟺ g ∈ ⋂_{x∈X} Gx, (305.18.1)

and it follows that ker(F ) = ⋂_{x∈X} Gx.

Let G act transitively on X. Then for any x ∈ X, X is the orbit G(x) of x. As shown in “conjugate stabilizer subgroups”, all stabilizer subgroups of elements y ∈ G(x) are conjugate to Gx in G. From the above it follows that

ker(F ) = ⋂_{g∈G} g Gx g⁻¹.

For a faithful operation of G the condition (g · x = x for all x ∈ X → g = 1G) is equivalent to ker(F ) = {1G}, and therefore F : G −→ SX is a monomorphism. For the trivial operation of G on X, given by g · x = x for all g ∈ G, the stabilizer subgroup Gx is G for all x ∈ X, and thus ker(F ) = G. The corresponding homomorphism is g −→ id_X for all g ∈ G. If the operation of G on X is free, then Gx = {1G} for all x ∈ X, and thus the kernel of F is {1G}, as for a faithful operation. But: Let X = {1, . . . , n} and G = Sn. Then the operation of G on X given by π · i := π(i) for all i ∈ X, π ∈ Sn is faithful but not free. Version: 5 Owner: Thomas Heye Author(s): Thomas Heye

305.19

group homomorphism

Let (G, ∗g) and (K, ∗k) be two groups. A group homomorphism is a function φ : G → K such that φ(s ∗g t) = φ(s) ∗k φ(t) for all s, t ∈ G. The composition of group homomorphisms is again a homomorphism. Let φ : G → K be a group homomorphism. Then • φ(eg) = ek, where eg and ek are the respective identity elements for G and K. • φ(g)⁻¹ = φ(g⁻¹) for all g ∈ G • φ(g)^z = φ(g^z) for all g ∈ G and for all z ∈ Z. The kernel of φ is a subgroup of G and its image is a subgroup of K. Some special homomorphisms have special names. If φ : G → K is injective, we say that φ is a monomorphism, and if φ is onto we call it an epimorphism. When φ is both injective and surjective (that is, bijective) we call it an isomorphism. In the latter case we also say that G and K are isomorphic, meaning they are basically the same group (have the same structure). A homomorphism from G to itself is called an endomorphism, and if it is bijective, then it is called an automorphism. Version: 15 Owner: drini Author(s): saforres, drini
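A minimal worked example (not from the entry): the reduction map from Z6 to Z3, written additively. The sketch checks the homomorphism property and exhibits the kernel and image:

```python
# The reduction map phi : Z6 -> Z3, phi(x) = x mod 3, is a homomorphism of
# additive groups; its kernel is a subgroup of Z6, its image a subgroup of Z3.
def phi(x):
    return x % 3

# homomorphism property: phi(s + t) = phi(s) + phi(t), mod the group orders
assert all(phi((s + t) % 6) == (phi(s) + phi(t)) % 3
           for s in range(6) for t in range(6))

kernel = [x for x in range(6) if phi(x) == 0]
image = sorted({phi(x) for x in range(6)})
assert kernel == [0, 3]      # a subgroup of Z6
assert image == [0, 1, 2]    # phi is onto, i.e. an epimorphism
```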

305.20

homogeneous space

Overview and definition. Let G be a group acting transitively on a set X. In other words, we consider a homomorphism φ : G → Perm(X), where the latter denotes the group of all bijections of X. If we consider G as being, in some sense, the automorphisms of X, the transitivity assumption means that it is impossible to distinguish a particular element of X from any other element. Since the elements of X are indistinguishable, we call X a homogeneous space. Indeed, the concept of a homogeneous space is logically equivalent to the concept of a transitive group action. Action on cosets. Let G be a group, H < G a subgroup, and let G/H denote the set of left cosets, as above. For every g ∈ G we consider the mapping ψH (g) : G/H → G/H with action aH → gaH, a ∈ G. Proposition 3. The mapping ψH (g) is a bijection. The corresponding mapping ψH : G → Perm(G/H) is a group homomorphism, specifying a transitive group action of G on G/H.

Thus, G/H has the natural structure of a homogeneous space. Indeed, we shall see that every homogeneous space X is isomorphic to G/H for some subgroup H. N.B. In geometric applications, we want the homogeneous space X to have some extra structure, like a topology or a differential structure. Correspondingly, the group of automorphisms is either a continuous group or a Lie group. In order for the quotient space X to have a Hausdorff topology, we need to assume that the subgroup H is closed in G. The isotropy subgroup and the basepoint identification. Let X be a homogeneous space. For x ∈ X, the subgroup Hx = {h ∈ G : hx = x}, consisting of all G-actions that fix x, is called the isotropy subgroup at the basepoint x. We identify the space of cosets G/Hx with the homogeneous space by means of the mapping τx : G/Hx → X, defined by τx(aHx) = ax, a ∈ G. Proposition 4. The above mapping is a well-defined bijection. To show that τx is well defined, let a, b ∈ G be members of the same left coset, i.e. there exists an h ∈ Hx such that b = ah. Consequently bx = a(hx) = ax, as desired. The mapping τx is onto because the action of G on X is assumed to be transitive. To show that τx is one-to-one, consider two cosets aHx, bHx, a, b ∈ G such that ax = bx. It follows that b⁻¹a fixes x, and hence is an element of Hx. Therefore aHx and bHx are the same coset. The homogeneous space as a quotient. Next, let us show that τx is equivariant relative to the action of G on X and the action of G on the quotient G/Hx. Proposition 5. We have that φ(g) ◦ τx = τx ◦ ψHx(g) for all g ∈ G. To prove this, let g, a ∈ G be given, and note that ψHx(g)(aHx) = gaHx. The latter coset corresponds under τx to the point gax, as desired. Finally, let us note that τx identifies the point x ∈ X with the coset of the identity element eHx, that is to say, with the subgroup Hx itself. For this reason, the point x is often called the basepoint of the identification τx : G/Hx → X.

The choice of basepoint. Next, we consider the effect of the choice of basepoint on the quotient structure of a homogeneous space. Let X be a homogeneous space. Proposition 6. The set of all isotropy subgroups {Hx : x ∈ X} forms a single conjugacy class of subgroups in G. To show this, let x0, x1 ∈ X be given. By the transitivity of the action we may choose ĝ ∈ G such that x1 = ĝ x0. Hence, for all h ∈ G satisfying h x0 = x0, we have (ĝ h ĝ⁻¹) x1 = ĝ (h (ĝ⁻¹ x1)) = ĝ x0 = x1. Similarly, for all h ∈ Hx1 we have that ĝ⁻¹ h ĝ fixes x0. Therefore ĝ (Hx0) ĝ⁻¹ = Hx1; or what is equivalent, for all x ∈ X and g ∈ G we have g Hx g⁻¹ = Hgx.

Equivariance. Since we can identify a homogeneous space X with G/Hx for every possible x ∈ X, it stands to reason that there exist equivariant bijections between the different G/Hx. To describe these, let H0, H1 < G be conjugate subgroups with H1 = ĝ H0 ĝ⁻¹ for some fixed ĝ ∈ G. Let us set X = G/H0, and let x0 denote the identity coset H0, and x1 the coset ĝ H0. What is the subgroup of G that fixes x1? In other words, what are all the h ∈ G such that h ĝ H0 = ĝ H0, or what is equivalent, all h ∈ G such that ĝ⁻¹ h ĝ ∈ H0.

The collection of all such h is precisely the subgroup H1. Hence, τx1 : G/H1 → G/H0 is the desired equivariant bijection. This is a well defined mapping from the set of H1-cosets to the set of H0-cosets, with action given by τx1(aH1) = a ĝ H0, a ∈ G.

Let ψ0 : G → Perm(G/H0) and ψ1 : G → Perm(G/H1) denote the corresponding coset G-actions. Proposition 7. For all g ∈ G we have that τx1 ◦ ψ1(g) = ψ0(g) ◦ τx1. Version: 3 Owner: rmilson Author(s): rmilson

305.21

identity element

Let G be a groupoid, that is, a set with a binary operation G×G → G, written multiplicatively so that (x, y) → xy. An identity element for G is an element e such that ge = eg = g for all g ∈ G. The symbol e is most commonly used for identity elements. Another common symbol for an identity element is 1, particularly in semigroup theory (and ring theory, considering the multiplicative structure as a semigroup). Groups, monoids, and loops are classes of groupoids that, by definition, always have an identity element. Version: 6 Owner: mclase Author(s): mclase, vypertd, imran

305.22

inner automorphism

Let G be a group. For every x ∈ G, we define a mapping φx : G → G, y → xyx−1 , y ∈ G,

called conjugation by x. It is easy to show that the conjugation map is, in fact, a group automorphism. An automorphism of G that corresponds to conjugation by some x ∈ G is called inner. An automorphism that isn’t inner is called an outer automorphism. The composition operation gives the set of all automorphisms of G the structure of a group, Aut(G). The inner automorphisms also form a group, Inn(G), which is a normal subgroup of Aut(G). Indeed, if φx, x ∈ G is an inner automorphism and π : G → G an arbitrary automorphism, then π ◦ φx ◦ π⁻¹ = φπ(x). Let us also note that the mapping x → φx, x ∈ G

is a surjective group homomorphism with kernel Z(G), the centre of G. Consequently, Inn(G) is naturally isomorphic to the quotient G/Z(G). Version: 7 Owner: rmilson Author(s): rmilson, tensorking
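As an illustration (ad hoc Python, not part of the entry): for G = S3 the center is trivial, so the map x → φx is injective and there are |G/Z(G)| = 6 distinct inner automorphisms:

```python
# For S3: count the distinct conjugation maps phi_x and compare with |G/Z(G)|.
from itertools import permutations

def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

G = list(permutations(range(3)))

def inner(x):
    # phi_x recorded as a table: the tuple of values x y x⁻¹ for y in G
    return tuple(compose(compose(x, y), inverse(x)) for y in G)

center = [x for x in G if all(compose(x, y) == compose(y, x) for y in G)]
inner_autos = {inner(x) for x in G}

assert len(center) == 1                            # Z(S3) is trivial
assert len(inner_autos) == len(G) // len(center)   # |Inn(G)| = |G/Z(G)| = 6
```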


305.23

kernel

Let ρ : G → K be a group homomorphism. The preimage of the codomain identity element eK ∈ K forms a subgroup of the domain G, called the kernel of the homomorphism; ker(ρ) = {s ∈ G | ρ(s) = eK } The kernel is a normal subgroup. It is the trivial subgroup if and only if ρ is a monomorphism. Version: 9 Owner: rmilson Author(s): rmilson, Daume

305.24

maximal

Let G be a group. A subgroup H of G is said to be maximal if H ≠ G and whenever K is a subgroup of G with H ⊆ K ⊆ G then K = H or K = G. Version: 1 Owner: Evandar Author(s): Evandar

305.25

normal subgroup

A subgroup H of a group G is normal if aH = Ha for all a ∈ G. Equivalently, H ⊂ G is normal if and only if aHa⁻¹ = H for all a ∈ G, i.e., if and only if each conjugacy class of G is either entirely inside H or entirely outside H. The notation H ◁ G or H ⊴ G is often used to denote that H is a normal subgroup of G.

The kernel ker (f ) of any group homomorphism f : G −→ G is a normal subgroup of G. More surprisingly, the converse is also true: any normal subgroup H ⊂ G is the kernel of some homomorphism (one of these being the projection map ρ : G −→ G/H, where G/H is the quotient group). Version: 6 Owner: djao Author(s): djao

305.26

normality of subgroups is not transitive

Let G be a group. Obviously, a subgroup K ≤ H of a subgroup H ≤ G of G is a subgroup K ≤ G of G. It seems plausible that a similar situation would also hold for normal subgroups. This is not true. Even when K ◁ H and H ◁ G, it is possible that K is not normal in G. Here are two examples:


1. Let G be the subgroup of orientation-preserving isometries of the plane R2 (G is just all rotations and translations), let H be the subgroup of G of translations, and let K be the subgroup of H of integer translations τi,j (x, y) = (x + i, y + j), where i, j ∈ Z.

Any element g ∈ G may be represented as g = r1 ◦ t1 = t2 ◦ r2, where r1, r2 are rotations and t1, t2 are translations. So for any translation t ∈ H we may write g⁻¹ ◦ t ◦ g = r⁻¹ ◦ t′ ◦ r, where t′ ∈ H is some other translation and r is some rotation. But this is an orientation-preserving isometry of the plane that does not rotate, so it too must be a translation. Thus g⁻¹Hg = H, and H ◁ G. H is an abelian group, so all its subgroups, K included, are normal. We claim that K is not normal in G. Indeed, if ρ ∈ G is rotation by 45◦ about the origin, then ρ⁻¹ ◦ τ1,0 ◦ ρ is not an integer translation.

2. A related example uses finite subgroups. Let G = D4 be the dihedral group of order 8 (the group of automorphisms of the graph of the square C4). Then D4 = ⟨r, f | f² = 1, r⁴ = 1, f r = r⁻¹f⟩ is generated by r, rotation, and f, flipping.

The subgroup

H = ⟨rf, f r⟩ = {1, rf, r², f r} ≅ C2 × C2

is isomorphic to the Klein 4-group – an identity and 3 elements of order 2. H ◁ G since [G : H] = 2. Finally, take K = ⟨rf⟩ = {1, rf}; K ◁ H since H is abelian. We claim that K is not normal in G. And indeed, f ◦ rf ◦ f = f r ∉ K. Version: 4 Owner: ariels Author(s): ariels
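The second example can be verified mechanically by realizing D4 as permutations of the square’s vertices 0–3 (an illustrative encoding, not from the entry: r rotates, f reflects across the 0–2 diagonal):

```python
# D4 on the square's vertices: check K ◁ H and H ◁ G but K not normal in G.
def compose(p, q):
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

e = (0, 1, 2, 3)
r = (1, 2, 3, 0)            # rotation by 90 degrees
f = (0, 3, 2, 1)            # reflection fixing vertices 0 and 2

def generated(gens):
    S = {e}
    frontier = set(gens)
    while frontier:
        S |= frontier
        frontier = {compose(a, b) for a in S for b in S} - S
    return S

G = generated({r, f})
rf, fr, r2 = compose(r, f), compose(f, r), compose(r, r)
H = {e, rf, r2, fr}
K = {e, rf}

def is_normal(N, bigger):
    return all(compose(compose(g, n), inverse(g)) in N
               for g in bigger for n in N)

assert len(G) == 8
assert is_normal(H, G)          # H is normal in G (index 2)
assert is_normal(K, H)          # K is normal in H (H is abelian)
assert not is_normal(K, G)      # but K is not normal in G
assert compose(compose(f, rf), f) == fr   # the witness: f∘rf∘f = fr, not in K
```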

305.27

normalizer

Let G be a group, and let H ⊆ G. The normalizer of H in G, written NG (H), is the set {g ∈ G | gHg⁻¹ = H}

If H is a subgroup of G, then NG (H) is a subgroup of G containing H. Note that H is a normal subgroup of NG (H); in fact, NG (H) is the largest subgroup of G of which H is a normal subgroup. In particular, if H is a normal subgroup of G, then NG (H) = G. Version: 6 Owner: saforres Author(s): saforres

305.28

order (of a group)

The order of a group G is the number of elements of G, denoted |G|; if |G| is finite, then G is said to be a finite group. The order of an element g ∈ G is the smallest positive integer n such that g n = e, where e is the identity element; if there is no such n, then g is said to be of infinite order. Version: 5 Owner: saforres Author(s): saforres

305.29

presentation of a group

A presentation of a group G is a description of G in terms of generators and relations. We say that the group is finitely presented, if it can be described in terms of a finite number of generators and a finite number of defining relations. A collection of group elements gi ∈ G, i ∈ I is said to generate G if every element of G can be specified as a product of the gi, and of their inverses. A relation is a word over the alphabet consisting of the generators gi and their inverses, with the property that it multiplies out to the identity in G. A set of relations rj, j ∈ J is said to be defining, if all relations in G can be given as a product of the rj, their inverses, and the G-conjugates of these. The standard notation for the presentation of a group is G = ⟨gi | rj⟩, meaning that G is generated by generators gi, subject to relations rj. Equivalently, one has a short exact sequence of groups 1 → N → F [I] → G → 1, where F [I] denotes the free group generated by the gi, and where N is the smallest normal subgroup containing all the rj. By the Nielsen-Schreier theorem, the kernel N is itself a free group, and hence we assume without loss of generality that there are no relations among the relations. Example. The symmetric group on n elements 1, . . . , n admits the following finite presentation (Note: this presentation is not canonical. Other presentations are known.) As

generators take gi = (i, i + 1), i = 1, . . . , n − 1, the transpositions of adjacent elements. As defining relations take

(gi gj)^{nij} = id, i, j = 1, . . . , n − 1,

where nii = 1, ni,i+1 = 3, and nij = 2 for |i − j| ≥ 2. This means that a finite symmetric group is a Coxeter group. Version: 11 Owner: rmilson Author(s): rmilson

305.30

proof of first isomorphism theorem

Let K denote ker f. K is a normal subgroup of G because, by the following calculation, gkg⁻¹ ∈ K for all g ∈ G and k ∈ K (the rules of homomorphisms imply the first equality, the definition of K the second): f (gkg⁻¹) = f (g)f (k)f (g)⁻¹ = f (g)1H f (g)⁻¹ = 1H. Therefore, G/K is well defined. Define a group homomorphism θ : G/K → im f given by: θ(gK) = f (g). We argue that θ is an isomorphism. First, θ is well defined. Take two representatives, g1 and g2, of the same modulo class. By definition, g1 g2⁻¹ is in K. Hence, f sends g1 g2⁻¹ to 1 (all elements of K are sent by f to 1). Consequently, the next calculation is valid: f (g1)f (g2)⁻¹ = f (g1 g2⁻¹) = 1, but this is the same as saying that f (g1) = f (g2). And we are done, because the last equality indicates that θ(g1K) is equal to θ(g2K). Going backward through the last argument, we get that θ is also an injection: If θ(g1K) is equal to θ(g2K) then f (g1) = f (g2) and hence g1 g2⁻¹ ∈ K (exactly as in the previous part), which implies an equality between g1K and g2K. Now, θ is a homomorphism. We need to show that θ(g1K · g2K) = θ(g1K)θ(g2K) and that θ((gK)⁻¹) = (θ(gK))⁻¹. And indeed: θ(g1K · g2K) = θ(g1g2K) = f (g1g2) = f (g1)f (g2) = θ(g1K)θ(g2K)

θ((gK)⁻¹) = θ(g⁻¹K) = f (g⁻¹) = (f (g))⁻¹ = (θ(gK))⁻¹. To conclude, θ is surjective. Take h to be an element of im f and g its pre-image. Since h = f (g), we have that h is also the image of gK under θ. Version: 3 Owner: uriw Author(s): uriw

305.31

proof of second isomorphism theorem

First, we shall prove that HK is a subgroup of G: Since e ∈ H and e ∈ K, clearly e = e² ∈ HK. Take h1, h2 ∈ H, k1, k2 ∈ K. Clearly h1k1, h2k2 ∈ HK. Further, h1k1h2k2 = h1(h2h2⁻¹)k1h2k2 = h1h2(h2⁻¹k1h2)k2. Since K is a normal subgroup of G and h2 ∈ G, then h2⁻¹k1h2 ∈ K. Therefore h1h2(h2⁻¹k1h2)k2 ∈ HK, so HK is closed under multiplication. Also, (hk)⁻¹ ∈ HK for h ∈ H, k ∈ K, since (hk)⁻¹ = k⁻¹h⁻¹ = h⁻¹(hk⁻¹h⁻¹) and hk⁻¹h⁻¹ ∈ K since K is a normal subgroup of G. So HK is closed under inverses, and is thus a subgroup of G. Since HK is a subgroup of G, the normality of K in HK follows immediately from the normality of K in G. Clearly H ∩ K is a subgroup of G, since it is the intersection of two subgroups of G.

Finally, define φ : H → HK/K by φ(h) = hK. We claim that φ is a surjective homomorphism from H to HK/K. Let h0k0K be some element of HK/K; since k0 ∈ K, then h0k0K = h0K, and φ(h0) = h0K. Now ker (φ) = {h ∈ H | φ(h) = K} = {h ∈ H | hK = K}, and if hK = K, then we must have h ∈ K. So ker (φ) = {h ∈ H | h ∈ K} = H ∩ K

Thus, since φ(H) = HK/K and ker φ = H ∩ K, by the first isomorphism theorem we see that H ∩ K is normal in H and that there is a natural isomorphism between H/(H ∩ K) and HK/K. Version: 8 Owner: saforres Author(s): saforres
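A quick numerical sanity check of the conclusion (illustrative only, in the additive group Z12 with H = ⟨4⟩ and K = ⟨6⟩, where normality is automatic since the group is abelian):

```python
# Compare |HK/K| with |H/(H ∩ K)| in Z12 (written additively).
n = 12
H = {0, 4, 8}
K = {0, 6}
HK = {(h + k) % n for h in H for k in K}
HcapK = H & K

cosets_HK_K = {frozenset((a + k) % n for k in K) for a in HK}
cosets_H_cap = {frozenset((a + c) % n for c in HcapK) for a in H}

# the two quotients have the same size, as the theorem predicts
assert len(cosets_HK_K) == len(cosets_H_cap) == 3
```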

305.32

proof that all cyclic groups are abelian

Following is a proof that all cyclic groups are abelian. Let G be a cyclic group and g be a generator of G. Let a, b ∈ G. Then there exist x, y ∈ Z such that a = g x and b = g y . Since ab = g x g y = g x+y = g y+x = g y g x = ba, it follows that G is abelian. Version: 2 Owner: Wkbj79 Author(s): Wkbj79

305.33

proof that all cyclic groups of the same order are isomorphic to each other

The following is a proof that all cyclic groups of the same order are isomorphic to each other. Let G be a cyclic group and g be a generator of G. Define ϕ : Z → G by ϕ(c) = g^c. Since ϕ(a + b) = g^{a+b} = g^a g^b = ϕ(a)ϕ(b), then ϕ is a group homomorphism. If h ∈ G, then there exists x ∈ Z such that h = g^x. Since ϕ(x) = g^x = h, then ϕ is surjective. ker ϕ = {c ∈ Z | ϕ(c) = eG} = {c ∈ Z | g^c = eG}. If G is infinite, then ker ϕ = {0}, and ϕ is injective. Hence, ϕ is a group isomorphism, and G ≅ Z. If G is finite, then let |G| = n. Thus, |g| = |⟨g⟩| = |G| = n. If g^c = eG, then n divides c. Therefore, ker ϕ = nZ. By the first isomorphism theorem, G ≅ Z/nZ ≅ Zn. Let H and K be cyclic groups of the same order. If H and K are infinite, then, by the above argument, H ≅ Z and K ≅ Z. If H and K are finite of order n, then, by the above argument, H ≅ Zn and K ≅ Zn. In any case, it follows that H ≅ K. Version: 1 Owner: Wkbj79 Author(s): Wkbj79

305.34

proof that all subgroups of a cyclic group are cyclic

Following is a proof that all subgroups of a cyclic group are cyclic. Let G be a cyclic group and H ≤ G. If G is trivial, then H = G, and H is cyclic. If H is the trivial subgroup, then H = {eG} = ⟨eG⟩, and H is cyclic. Thus, for the remainder of the proof, it will be assumed that both G and H are nontrivial.

Let g be a generator of G. Let n be the smallest positive integer such that g^n ∈ H. Claim: H = ⟨g^n⟩. Let a ∈ ⟨g^n⟩. Then there exists z ∈ Z with a = (g^n)^z. Since g^n ∈ H, then (g^n)^z ∈ H. Thus, a ∈ H. Hence, ⟨g^n⟩ ⊆ H. Let h ∈ H. Then h ∈ G. Let x ∈ Z with h = g^x. By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < n such that x = qn + r. Since h = g^x = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r, then g^r = h(g^n)^{−q}. Since h, g^n ∈ H, then g^r ∈ H. By choice of n, r cannot be positive. Thus, r = 0. Therefore, h = (g^n)^q g^0 = (g^n)^q eG = (g^n)^q ∈ ⟨g^n⟩. Hence, H ⊆ ⟨g^n⟩. Since ⟨g^n⟩ ⊆ H and H ⊆ ⟨g^n⟩, then H = ⟨g^n⟩. It follows that every subgroup of G is cyclic. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
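The claim can be tested by brute force on a small example (illustrative Python over Z12, written additively, not from the entry): every operation-closed subset turns out to be generated by its smallest positive element, mirroring the choice of n in the proof:

```python
# Enumerate all subgroups of Z12 by brute force and check each is cyclic,
# generated by its smallest positive element.
from itertools import combinations

n = 12
G = range(n)

def is_subgroup(S):
    # in a finite group, a nonempty subset closed under the operation
    # is automatically a subgroup
    return 0 in S and all((a + b) % n in S for a in S for b in S)

def cyclic(d):
    return frozenset((d * k) % n for k in range(n))

subgroups = [frozenset(S) for r in range(1, n + 1)
             for S in combinations(G, r) if is_subgroup(set(S))]

for H in subgroups:
    gen = min(x for x in H if x > 0) if len(H) > 1 else 0
    assert H == cyclic(gen)   # H = <g^n> in the entry's multiplicative notation
```

Z12 has exactly six subgroups, one for each divisor of 12.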

305.35

regular group action

Let G be a group acting on a set X. The action is called regular if for any pair α, β ∈ X there exists exactly one g ∈ G such that g · α = β. (For a right group action it is defined correspondingly.) Version: 3 Owner: Thomas Heye Author(s): Thomas Heye

305.36

second isomorphism theorem

Let (G, ∗) be a group. Let H be a subgroup of G and let K be a normal subgroup of G. Then • HK := {h ∗ k | h ∈ H, k ∈ K} is a subgroup of G, • K is a normal subgroup of HK, • H ∩ K is a normal subgroup of H, • There is a natural group isomorphism H/(H ∩ K) ≅ HK/K.

The same statement also holds in the category of modules over a fixed ring (where normality is neither needed nor relevant), and indeed can be formulated so as to hold in any abelian category. Version: 4 Owner: djao Author(s): djao


305.37

simple group

Let G be a group. G is said to be simple if the only normal subgroups of G are {1} and G itself. Version: 3 Owner: Evandar Author(s): Evandar

305.38

solvable group

A group G is solvable if it has a subnormal series G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {1}, with each Gi+1 normal in Gi, in which all the quotient groups Gi/Gi+1 are abelian. Version: 4 Owner: djao Author(s): djao

305.39

subgroup

Definition: Let (G, ∗) be a group and let K be a subset of G. Then K is a subgroup of G under the same operation if K is a group by itself (with respect to ∗), that is: • K is closed under the ∗ operation. • There exists an identity element e ∈ K such that for all k ∈ K, k ∗ e = k = e ∗ k. • For each k ∈ K there exists an inverse k⁻¹ ∈ K such that k⁻¹ ∗ k = e = k ∗ k⁻¹. The subgroup is denoted likewise (K, ∗). We denote K being a subgroup of G by writing K ≤ G. Properties: • The set {e} whose only element is the identity is a subgroup of any group. It is called the trivial subgroup. • Every group is a subgroup of itself. • The empty set {} is never a subgroup (since the definition of group states that the set must be non-empty).

There is a very useful theorem that allows proving a given subset is a subgroup. Theorem: Let K be a nonempty subset of the group G. Then K is a subgroup of G if and only if s, t ∈ K implies that st⁻¹ ∈ K. Proof: First we need to show that if K is a subgroup of G then st⁻¹ ∈ K for all s, t ∈ K. Since s, t ∈ K, then st⁻¹ ∈ K, because K is a group by itself. Now, suppose that for any s, t ∈ K ⊆ G we have st⁻¹ ∈ K. We want to show that K is a subgroup, which we will accomplish by proving it satisfies the group axioms. Since tt⁻¹ ∈ K by hypothesis, we conclude that the identity element is in K: e ∈ K. (Existence of identity.) Now that we know e ∈ K, for all t in K we have that et⁻¹ = t⁻¹ ∈ K, so the inverses of elements in K are also in K. (Existence of inverses.) Let s, t ∈ K. Then we know that t⁻¹ ∈ K by the last step. Applying the hypothesis shows that s(t⁻¹)⁻¹ = st ∈ K, so K is closed under the operation. QED Example: • Consider the group (Z, +). Show that (2Z, +) is a subgroup.

The subgroup is closed under addition since the sum of even integers is even.

The identity 0 of Z is also in 2Z since 2 divides 0. For every k ∈ 2Z there is a −k ∈ 2Z which is the inverse under addition and satisfies −k + k = 0 = k + (−k). Therefore (2Z, +) is a subgroup of (Z, +). Another way to show that (2Z, +) is a subgroup is to use the proposition stated above: if s, t ∈ 2Z then s, t are even numbers, and s − t ∈ 2Z since the difference of even numbers is always an even number. See also: • Wikipedia, subgroup Version: 7 Owner: Daume Author(s): Daume
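The one-step criterion above is easy to check by machine on a small finite group. The following Python sketch (the helper names are our own, not from any library) tests the criterion st^−1 ∈ K on subsets of Z/12Z, a finite stand-in for the 2Z example:

```python
def is_subgroup(K, op, inv):
    # one-step subgroup criterion: K nonempty and closed under (s, t) -> s * t^(-1)
    return len(K) > 0 and all(op(s, inv(t)) in K for s in K for t in K)

# ambient group (Z/12Z, +)
add = lambda a, b: (a + b) % 12
neg = lambda a: (-a) % 12

evens = {x for x in range(12) if x % 2 == 0}   # the analogue of 2Z
odds = {x for x in range(12) if x % 2 == 1}

print(is_subgroup(evens, add, neg))  # True: a difference of evens is even
print(is_subgroup(odds, add, neg))   # False: e.g. 1 - 1 = 0 is not odd
```

Note that the criterion never checks associativity; that is inherited from the ambient group, exactly as in the proof.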

305.40

third isomorphism theorem

If G is a group (or ring, or module) and H ⊂ K are normal subgroups (or ideals, or submodules) of G, then there is a natural isomorphism

(G/H)/(K/H) ≅ G/K. It is not uncommon to see the third and second isomorphism theorems permuted. Version: 2 Owner: nerdy2 Author(s): nerdy2


Chapter 306 20A99 – Miscellaneous
306.1 Cayley table

A Cayley table for a group is essentially the “multiplication table” of the group.1 The columns and rows of the table (or matrix) are labeled with the elements of the group, and the cells represent the result of applying the group operation to the row-th and column-th elements. Formally, let G be our group, with group operation ◦. Let C be the Cayley table for the group, with C(i, j) denoting the element at row i and column j. Then C(i, j) = ei ◦ ej, where ei is the ith element of the group and ej is the jth. Note that for an abelian group we have ei ◦ ej = ej ◦ ei, hence the Cayley table is a symmetric matrix. The Cayley tables of isomorphic groups are the same, up to relabeling and reordering of the group elements.

306.1.1

Examples.

• The Cayley table for Z4, the group of integers modulo 4 (under addition), would be

        [0]  [1]  [2]  [3]
  [0]   [0]  [1]  [2]  [3]
  [1]   [1]  [2]  [3]  [0]
  [2]   [2]  [3]  [0]  [1]
  [3]   [3]  [0]  [1]  [2]

• The Cayley table for S3, the symmetric group on three letters, is

          (1)    (123)  (132)  (12)   (13)   (23)
  (1)     (1)    (123)  (132)  (12)   (13)   (23)
  (123)   (123)  (132)  (1)    (13)   (23)   (12)
  (132)   (132)  (1)    (123)  (23)   (12)   (13)
  (12)    (12)   (23)   (13)   (1)    (132)  (123)
  (13)    (13)   (12)   (23)   (123)  (1)    (132)
  (23)    (23)   (13)   (12)   (132)  (123)  (1)

1 A caveat to novices in group theory: multiplication is usually used notationally to represent the group operation, but the operation needn’t resemble multiplication in the reals. Hence, you should take “multiplication table” with a grain or two of salt.

Version: 6 Owner: akrowne Author(s): akrowne
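Tables like the ones above can be generated mechanically. A small Python sketch (the function name is ours), using Z4 from the first example:

```python
def cayley_table(elements, op):
    # C(i, j) = e_i o e_j, as in the definition above
    return [[op(a, b) for b in elements] for a in elements]

Z4 = [0, 1, 2, 3]
table = cayley_table(Z4, lambda a, b: (a + b) % 4)
for row in table:
    print(row)

# Z4 is abelian, so the table is a symmetric matrix
assert all(table[i][j] == table[j][i] for i in Z4 for j in Z4)
```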

306.2

proper subgroup

A group H is a proper subgroup of a group G if and only if H is a subgroup of G and H ≠ G. (306.2.1)

Similarly a group H is an improper subgroup of a group G if and only if H is a subgroup of G and H = G. (306.2.2) Version: 2 Owner: imran Author(s): imran

306.3

quaternion group

The quaternion group, or quaternionic group, is a noncommutative group with eight elements. It is traditionally denoted by Q (not to be confused with Q) or by Q8. This group is defined by the presentation {i, j; i^4, i^2j^2, ijij^−1} (that is, by the relations i^4 = 1, i^2 = j^2, and j^−1ij = i^−1) or, equivalently, defined by the multiplication table


·     1    i    j    k    −i   −j   −k   −1
1     1    i    j    k    −i   −j   −k   −1
i     i    −1   k    −j   1    −k   j    −i
j     j    −k   −1   i    k    1    −i   −j
k     k    j    −i   −1   −j   i    1    −k
−i    −i   1    −k   j    −1   k    −j   i
−j    −j   k    1    −i   −k   −1   i    j
−k    −k   −j   i    1    j    −i   −1   k
−1    −1   −i   −j   −k   i    j    k    1

where we have put each product xy into row x and column y. The minus signs are justified by the fact that {1, −1} is a subgroup contained in the center of Q. Every subgroup of Q is normal and, except for the trivial subgroup {1}, contains {1, −1}. The dihedral group D4 (the group of symmetries of a square) is the only other noncommutative group of order 8. Since i^2 = j^2 = k^2 = −1, the elements i, j, and k are known as the imaginary units, by analogy with i ∈ C. Any pair of distinct imaginary units generates the group. Better still, given distinct x, y ∈ {i, j, k}, any element of Q is expressible in the form x^m y^n. Q is identified with the group of units (invertible elements) of the ring of quaternions over Z. That ring is not identical to the group ring Z[Q], which has dimension 8 (not 4) over Z. Likewise the usual quaternion algebra is not quite the same thing as the group algebra R[Q]. Quaternions were known to Gauss in 1819 or 1820, but he did not publicize this discovery, and quaternions weren’t rediscovered until 1843, with Hamilton. For an excellent account of this famous story, see http://math.ucr.edu/home/baez/Octonions/node1.html. Version: 6 Owner: vernondalhart Author(s): vernondalhart, Larry Hammick, patrickwonders
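One can realize Q concretely inside the quaternions over Z and verify the table above. The sketch below (plain tuples, no external library) encodes 1, i, j, k as 4-tuples and multiplies with the Hamilton product:

```python
def qmul(a, b):
    # Hamilton product of quaternions (a0 + a1 i + a2 j + a3 k)
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

one = (1, 0, 0, 0); i = (0, 1, 0, 0); j = (0, 0, 1, 0); k = (0, 0, 0, 1)
neg = lambda q: tuple(-x for x in q)

assert qmul(i, i) == qmul(j, j) == qmul(k, k) == neg(one)
assert qmul(i, j) == k and qmul(j, i) == neg(k)   # noncommutativity

# the eight elements are closed under multiplication, so they form a group
Q8 = [one, i, j, k, neg(one), neg(i), neg(j), neg(k)]
assert all(qmul(x, y) in Q8 for x in Q8 for y in Q8)
```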


Chapter 307 20B05 – General theory for finite groups
307.1 cycle notation

The cycle notation is a useful convention for writing down a permutation in terms of its constituent cycles. Let S be a finite set, and a1, . . . , ak, k ≥ 2,

distinct elements of S. The expression (a1, . . . , ak) denotes the cycle whose action is a1 → a2 → a3 → . . . → ak → a1. Note there are k different expressions for the same cycle; the following all represent the same cycle: (a1, a2, a3, . . . , ak) = (a2, a3, . . . , ak, a1) = . . . = (ak, a1, a2, . . . , ak−1). Also note that a 1-element cycle is the same thing as the identity permutation, and thus there is not much point in writing down such things. Rather, it is customary to express the identity permutation simply as (). Let π be a permutation of S, and let S1, . . . , Sk ⊂ S, k ∈ N,

be the orbits of π with more than 1 element. For each j = 1, . . . , k let nj denote the cardinality of Sj. Also, choose an a1,j ∈ Sj, and define ai+1,j = π(ai,j), i ∈ N. We can now express π as a product of disjoint cycles, namely π = (a1,1, . . . , an1,1)(a1,2, . . . , an2,2) . . . (a1,k, . . . , ank,k).

By way of illustration, here are the 24 elements of the symmetric group on {1, 2, 3, 4} expressed using the cycle notation, and grouped according to their conjugacy classes:

()
(12), (13), (14), (23), (24), (34)
(123), (132), (124), (142), (134), (143), (234), (243)
(12)(34), (13)(24), (14)(23)
(1234), (1243), (1324), (1342), (1423), (1432)

Version: 1 Owner: rmilson Author(s): rmilson
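The decomposition into disjoint cycles described above is straightforward to compute. A Python sketch (the function name is ours), representing a permutation as a dict x → π(x):

```python
def cycle_decomposition(perm):
    # decompose a permutation (dict mapping x -> perm(x)) into disjoint
    # cycles, omitting 1-element cycles as in the convention above
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        x = perm[start]
        while x != start:
            cycle.append(x)
            seen.add(x)
            x = perm[x]
        if len(cycle) > 1:   # a 1-cycle is the identity, so drop it
            cycles.append(tuple(cycle))
    return cycles

# the permutation 1->2, 2->1, 3->4, 4->3 of {1, 2, 3, 4}:
print(cycle_decomposition({1: 2, 2: 1, 3: 4, 4: 3}))  # [(1, 2), (3, 4)]
```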

307.2

permutation group

A permutation group is a pair (G, X) where G is an abstract group and X is a set on which G acts faithfully. Alternatively, this can be thought of as a group G equipped with an injective homomorphism into Sym(X), the symmetric group on X. Version: 2 Owner: bwebste Author(s): bwebste


Chapter 308 20B15 – Primitive groups
308.1 primitive transitive permutation group

1: A a finite set 2: G a transitive permutation group on A 3: every block B ⊆ A of G satisfies B = A or |B| = 1

example
1: S4 is a primitive transitive permutation group on {1, 2, 3, 4}

counterexample
1: D8 is not a primitive transitive permutation group on the vertices of a square

stabilizer maximal necessary and sufficient for primitivity
1: A a finite set 2: G a transitive permutation group on A 3: G primitive ⇔ ∀a ∈ A : (H ≤ G & H ⊇ StabG(a)) ⇒ H = G or H = StabG(a)

Note: This was a “seed” entry written using a short-hand format described in this FAQ. Version: 4 Owner: Thomas Heye Author(s): yark, apmxi


Chapter 309 20B20 – Multiply transitive finite groups
309.1 Jordan’s theorem (multiply transitive groups)
Let G be a sharply n-transitive permutation group, with n ≥ 4. Then

1. G is similar to Sn with the standard action, or
2. n = 4 and G is similar to M11, the Mathieu group of degree 11, or
3. n = 5 and G is similar to M12, the Mathieu group of degree 12.

Version: 1 Owner: bwebste Author(s): bwebste

309.2

multiply transitive

Let G be a group, X a set on which it acts. Let X^(n) be the set of ordered n-tuples of distinct elements of X. This is a G-set by the diagonal action: g · (x1, . . . , xn) = (g · x1, . . . , g · xn). The action of G on X is said to be n-transitive if it acts transitively on X^(n). For example, the standard action of Sn, the symmetric group, is n-transitive, and the standard action of An, the alternating group, is (n − 2)-transitive. Version: 2 Owner: bwebste Author(s): bwebste

309.3

sharply multiply transitive

Let G be a group, and X a set that G acts on, and let X^(n) be the set of ordered n-tuples of distinct elements of X. Then the action of G on X is sharply n-transitive if G acts regularly on X^(n). Version: 1 Owner: bwebste Author(s): bwebste


Chapter 310 20B25 – Finite automorphism groups of algebraic, geometric, or combinatorial structures
310.1 diamond theory

Diamond theory is the theory of affine groups over GF (2) acting on small square and cubic arrays. In the simplest case, the symmetric group of order 4 acts on a two-colored Diamond figure like that in Plato’s Meno dialogue, yielding 24 distinct patterns, each of which has some ordinary or color-interchange symmetry. This can be generalized to (at least) a group of order approximately 1.3 trillion acting on a 4x4x4 array of cubes, with each of the resulting patterns still having nontrivial symmetry. The theory has applications to finite geometry and to the construction of the large Witt design underlying the Mathieu group of degree 24.

Further Reading • ”Diamond Theory,” http://m759.freeservers.com/ Version: 4 Owner: m759 Author(s): akrowne, m759


Chapter 311 20B30 – Symmetric groups
311.1 symmetric group

Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions on X). Then the act of taking the composition of two permutations induces a group structure on S(X). We call this group the symmetric group and it is often denoted Sym(X). Version: 5 Owner: bwebste Author(s): bwebste, antizeus

311.2

symmetric group

Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions on X). Then the act of taking the composition of two permutations induces a group structure on S(X). We call this group the symmetric group and it is often denoted Sym(X). When X has a finite number n of elements, we often refer to the symmetric group as Sn , and describe the elements by using cycle notation. Version: 2 Owner: antizeus Author(s): antizeus


Chapter 312 20B35 – Subgroups of symmetric groups
312.1 Cayley’s theorem

Let G be a group; then G is isomorphic to a subgroup of the permutation group SG. If G is finite and of order n, then G is isomorphic to a subgroup of the permutation group Sn. Furthermore, suppose H is a proper subgroup of G. Let X = {Hg | g ∈ G} be the set of right cosets of H in G. The map θ : G → SX given by θ(x)(Hg) = Hgx is a homomorphism. Its kernel is the largest normal subgroup of G contained in H. We note that |SX| = [G : H]!. Consequently, if |G| doesn’t divide [G : H]!, then θ is not injective, so G contains a non-trivial normal subgroup contained in H, namely the kernel of θ. Version: 4 Owner: vitriol Author(s): vitriol
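The embedding of Cayley's theorem, each g acting by left translation, can be observed directly on a small group. A Python sketch (names are ours) for Z3:

```python
def left_translations(elements, op):
    # realize each g as the permutation h -> g*h of the underlying set,
    # following the proof of Cayley's theorem
    return {g: tuple(op(g, h) for h in elements) for g in elements}

Z3 = (0, 1, 2)
rho = left_translations(Z3, lambda a, b: (a + b) % 3)

# each translation is a bijection (a row of the Cayley table) ...
assert all(sorted(p) == list(Z3) for p in rho.values())
# ... and distinct elements give distinct permutations (injectivity)
assert len(set(rho.values())) == len(Z3)
print(rho)
```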


Chapter 313 20B99 – Miscellaneous
313.1 (p, q) shuffle

Definition. Let p and q be positive natural numbers. Further, let S(k) be the set of permutations of the numbers {1, . . . , k}. A permutation τ ∈ S(p + q) is a (p, q) shuffle if τ (1) < · · · < τ (p), τ (p + 1) < · · · < τ (p + q).

The set of all (p, q) shuffles is denoted by S(p, q).

It is clear that S(p, q) ⊂ S(p + q). Since a (p, q) shuffle is completely determined by how the first p elements are mapped, the cardinality of S(p, q) is the binomial coefficient (p+q choose p). The wedge product of a p-form and a q-form can be defined as a sum over (p, q) shuffles. Version: 3 Owner: matte Author(s): matte
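Since a (p, q) shuffle is determined by the image of {1, . . . , p}, the shuffles can be enumerated from the p-element subsets of {1, . . . , p + q}. A Python sketch (function name ours):

```python
from itertools import combinations
from math import comb

def shuffles(p, q):
    # enumerate (p, q) shuffles as tuples (tau(1), ..., tau(p+q)): each
    # shuffle is fixed by choosing which p values receive 1..p, in order
    result = []
    for spots in combinations(range(p + q), p):
        tau = [0] * (p + q)
        rest = [m for m in range(p + q) if m not in spots]
        for val, pos in enumerate(spots):    # tau(1) < ... < tau(p)
            tau[val] = pos + 1
        for val, pos in enumerate(rest):     # tau(p+1) < ... < tau(p+q)
            tau[p + val] = pos + 1
        result.append(tuple(tau))
    return result

# every shuffle is a genuine permutation of {1, ..., p+q} ...
assert all(sorted(t) == list(range(1, 5)) for t in shuffles(2, 2))
# ... and the count is the binomial coefficient (p+q choose p)
assert len(shuffles(2, 2)) == comb(4, 2)
```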

313.2

Frobenius group

A permutation group G on a set X is Frobenius if no non-trivial element of G fixes more than one element of X. Generally, one also makes the restriction that at least one non-trivial element fix a point. In this case the Frobenius group is called non-regular. The stabilizer of any point in X is called a Frobenius complement, and has the remarkable property that it intersects trivially with any of its conjugates by an element outside the subgroup. Conversely, if a finite group G has such a subgroup, then the action on the cosets of that subgroup makes G into a Frobenius group. Version: 2 Owner: bwebste Author(s): bwebste

313.3

permutation

A permutation of a set {a1, a2, . . . , an} is an arrangement of its elements. For example, if S = {A, B, C} then ABC, CAB, CBA are three different permutations of S. The number of permutations of a set with n elements is n!. A permutation can also be seen as a bijective function of a set into itself. For example, the permutation CAB could be seen as the function that assigns: f(A) = C, f(B) = A, f(C) = B.

In fact, every bijection of a set into itself gives a permutation, and any permutation gives rise to a bijective function. Therefore, we can say that there are n! bijective functions from a set with n elements into itself. Using the function approach, it can be proved that any permutation can be expressed as a composition of disjoint cycles and also as a composition of (not necessarily disjoint) transpositions. Moreover, if σ = τ1τ2 · · · τm = ρ1ρ2 · · · ρn are two factorizations of a permutation σ into transpositions, then m and n must be both even or both odd. So we can label permutations as even or odd, depending on the parity of the number of transpositions in any decomposition. Permutations (as functions) form a non-abelian group, with function composition as the binary operation, called the symmetric group of degree n. The subset of even permutations forms a subgroup called the alternating group of degree n. Version: 3 Owner: drini Author(s): drini
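The parity of a permutation can be computed from its cycle structure, since a k-cycle is a product of k − 1 transpositions. A Python sketch (the function name is ours):

```python
def sign(perm):
    # parity of a permutation given as a dict x -> perm(x): +1 if it is a
    # product of an even number of transpositions, -1 otherwise; each
    # k-cycle contributes k - 1 transpositions
    seen, swaps = set(), 0
    for start in perm:
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        swaps += length - 1
    return -1 if swaps % 2 else 1

assert sign({1: 2, 2: 3, 3: 1}) == 1    # a 3-cycle is even
assert sign({1: 2, 2: 1, 3: 3}) == -1   # a transposition is odd
```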

313.4

proof of Cayley’s theorem

Let G be a group, and let SG be the permutation group of the underlying set G. For each g ∈ G, define ρg : G → G by ρg(h) = gh. Then ρg is invertible with inverse ρg−1, and so is a permutation of the set G. Define Φ : G → SG by Φ(g) = ρg. Then Φ is a homomorphism, since (Φ(gh))(x) = ρgh(x) = ghx = ρg(hx) = (ρg ◦ ρh)(x) = (Φ(g) ◦ Φ(h))(x). And Φ is injective, since if Φ(g) = Φ(h) then ρg = ρh, so gx = hx for all x ∈ G, and so g = h as required.

So Φ is an embedding of G into its own permutation group. If G is finite of order n, then simply numbering the elements of G gives an embedding from G to Sn . Version: 2 Owner: Evandar Author(s): Evandar


Chapter 314 20C05 – Group rings of finite groups and their modules
314.1 group ring

For any group G, the group ring Z[G] is defined to be the ring whose additive group is the abelian group of formal integer linear combinations of elements of G, and whose multiplication operation is defined by multiplication in G, extended Z–linearly to Z[G]. More generally, for any ring R, the group ring of G over R is the ring R[G] whose additive group is the abelian group of formal R–linear combinations of elements of G, i.e.:

R[G] := { r1 g1 + · · · + rn gn | n ∈ N, ri ∈ R, gi ∈ G },

and whose multiplication operation is defined by R–linearly extending the group multiplication operation of G. In the case where K is a field, the group ring K[G] is usually called a group algebra. Version: 4 Owner: djao Author(s): djao
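A toy implementation makes the definition concrete. The sketch below (our own names) represents an element of R[G] as a dict g → coefficient and multiplies by R-linear extension; the test exhibits the zero divisors (1 + g)(1 − g) = 0 in Z[C2], since g^2 = 1:

```python
def group_ring_mul(x, y, op):
    # multiply two elements of R[G], given as dicts g -> coefficient,
    # by R-linearly extending the group operation op
    out = {}
    for g, r in x.items():
        for h, s in y.items():
            gh = op(g, h)
            out[gh] = out.get(gh, 0) + r * s
    return {g: c for g, c in out.items() if c != 0}

# C2 = {0, 1} under addition mod 2; in Z[C2]:
# (1 + g)(1 - g) = 1 - g + g - g^2 = 0
mul = lambda a, b: (a + b) % 2
print(group_ring_mul({0: 1, 1: 1}, {0: 1, 1: -1}, mul))  # {}
```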


Chapter 315 20C15 – Ordinary representations and characters
315.1 Maschke’s theorem

Let G be a finite group, and k a field of characteristic not dividing |G|. Then any representation V of G over k is completely reducible.

We need only show that any subrepresentation has a complement, and the result follows by induction.

Let V be a representation of G and W a subrepresentation. Let π : V → W be an arbitrary projection, and let

π′(v) = (1/|G|) Σ_{g∈G} g^−1 π(gv).

This map is obviously G-equivariant, is the identity on W, and its image is contained in W, since W is invariant under G. Thus it is an equivariant projection onto W, and its kernel is a complement to W. Version: 5 Owner: bwebste Author(s): bwebste
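The averaging trick in the proof can be watched on the smallest nontrivial example: C2 acting on k^2 by swapping coordinates, with W the line of constant vectors. The sketch below (all names ours) uses exact rationals so that dividing by |G| = 2 is legitimate:

```python
from fractions import Fraction

def swap(v):
    # the nontrivial element of C2 acting on k^2
    return (v[1], v[0])

def pi(v):
    # an arbitrary (non-equivariant) projection onto W = span{(1, 1)}
    return (v[0], v[0])

def averaged(v):
    # pi'(v) = (1/|G|) * (pi(v) + g^{-1} pi(g v)),  g the swap
    half = Fraction(1, 2)
    u = pi(v)
    w = swap(pi(swap(v)))
    return (half * (u[0] + w[0]), half * (u[1] + w[1]))

v = (Fraction(3), Fraction(7))
# the averaged map is equivariant ...
assert averaged(swap(v)) == swap(averaged(v))
# ... and is the identity on W
assert averaged((Fraction(5), Fraction(5))) == (Fraction(5), Fraction(5))
```

Over a field of characteristic 2 the division by |G| = 2 is impossible, which is exactly where the hypothesis on the characteristic enters.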

315.2

a representation which is not completely reducible

If G is a finite group, and k is a field whose characteristic does divide the order of the group, then Maschke’s theorem fails. For example, let V be the regular representation of G, which can be thought of as the space of functions from G to k, with the G-action g · ϕ(g′) = ϕ(g^−1 g′). Then this representation is not completely reducible.

There is an obvious trivial subrepresentation W of V, consisting of the constant functions. I claim that there is no complementary invariant subspace to this one. If W′ is such a subspace, then there is a homomorphism ϕ : V → V/W′ ≅ k. Now consider the characteristic function of the identity e ∈ G,

δe(g) = 1 if g = e, and δe(g) = 0 if g ≠ e,

and let c = ϕ(δe) ∈ V/W′. This is not zero, since δe generates the representation V. By G-equivariance, ϕ(δg) = c for all g ∈ G. Since

η = Σ_{g∈G} η(g) δg

for all η ∈ V, we get ϕ(η) = (Σ_{g∈G} η(g)) c. Thus,

ker ϕ = {η ∈ V | Σ_{g∈G} η(g) = 0}.

But since the characteristic of the field k divides the order of G, every constant function lies in this kernel; that is, W ⊆ ker ϕ = W′, and thus W′ could not possibly be complementary to W.

For example, if G = C2 = {e, f}, then the invariant subspace of V is spanned by e + f. In characteristic other than 2, e − f spans a complementary subspace, but in characteristic 2 these two elements are the same. Version: 1 Owner: bwebste Author(s): bwebste

315.3

orthogonality relations

First orthogonality relations: Let χ1, χ2 be characters of representations V1, V2 of a finite group G over a field k of characteristic 0. Then

(χ1, χ2) := (1/|G|) Σ_{g∈G} χ1(g^−1) χ2(g) = dimk HomG(V1, V2).

First of all, consider the special case where V1 = k with the trivial action of the group. Then HomG(k, V2) ≅ V2^G, the fixed points. On the other hand, consider the map

φ = (1/|G|) Σ_{g∈G} g : V2 → V2

(with the sum taken in End(V2)). Clearly, the image of this map is contained in V2^G, and it is the identity restricted to V2^G. Thus, it is a projection with image V2^G. Now, the rank of a projection (over a field of characteristic 0) is its trace. Thus,

dimk HomG(k, V2) = dim V2^G = tr(φ) = (1/|G|) Σ_{g∈G} χ2(g),

which is exactly the orthogonality formula for V1 = k. Now, in general, Hom(V1, V2) ≅ V1∗ ⊗ V2 as a representation, and HomG(V1, V2) = (Hom(V1, V2))^G. Since χ_{V1∗⊗V2}(g) = χ1(g^−1)χ2(g),

dimk HomG(V1, V2) = dimk (Hom(V1, V2))^G = (1/|G|) Σ_{g∈G} χ1(g^−1)χ2(g),

which is exactly the relation we desired.

In particular, if V1 and V2 are irreducible, then by Schur’s lemma

HomG(V1, V2) = D if V1 ≅ V2, and 0 if V1 ≇ V2,

where D is a division algebra. In particular, non-isomorphic irreducible representations have orthogonal characters. Thus, for any representation V, the multiplicities ni in the unique decomposition of V into a direct sum of irreducibles

V ≅ V1^⊕n1 ⊕ · · · ⊕ Vm^⊕nm,

where the Vi range over the irreducible representations of G over k, can be determined in terms of the character inner product:

ni = (ψ, χi) / (χi, χi),

where ψ is the character of V and χi the character of Vi. In particular, representations over a field of characteristic zero are determined by their characters. Note: this is not true over fields of positive characteristic. If the field k is algebraically closed, the only finite-dimensional division algebra over k is k itself, so the characters of irreducible representations form an orthonormal basis for the vector space of class functions with respect to this inner product. Since (χi, χi) = 1 for all irreducibles, the multiplicity formula above reduces to ni = (ψ, χi).


Second orthogonality relations: We now assume that k is algebraically closed. Let g, g′ be elements of a finite group G. Then

Σ_χ χ(g)χ(g′^−1) = |CG(g)| if g ∼ g′, and 0 if g ≁ g′,

where the sum is over the characters of the irreducible representations, and CG(g) is the centralizer of g.

Let χ1, . . . , χn be the characters of the irreducible representations, and let g1, . . . , gn be representatives of the conjugacy classes.

Let A be the matrix whose ij-th entry is √(|G : CG(gj)|) χi(gj), and let A∗ be the matrix whose ji-th entry is √(|G : CG(gj)|) χi(gj^−1). By first orthogonality, AA∗ = |G| I, where I is the identity matrix. Since left inverses are right inverses, A∗A = |G| I. Thus,

√(|G : CG(gi)| |G : CG(gk)|) Σ_{j=1}^{n} χj(gi^−1) χj(gk) = |G| δik.

Replacing gi or gk with any conjugate will not change the expression above. Thus, if our two elements are not conjugate, we obtain Σ_χ χ(g)χ(g′^−1) = 0. On the other hand, if g ∼ g′, then i = k in the sum above, which reduces to the expression we desired.

A special case of this result, applied to g = g′ = 1, is that |G| = Σ_χ χ(1)^2; that is, the sum of the squares of the dimensions of the irreducible representations of any finite group is the order of the group. Version: 8 Owner: bwebste Author(s): bwebste
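Both orthogonality relations can be verified numerically on the standard character table of S3 (classes of e, (12), (123), with class sizes 1, 3, 2; all character values are real here, so no conjugation is needed). A Python sketch with our own names:

```python
from fractions import Fraction

sizes = [1, 3, 2]          # class sizes of S3
order = sum(sizes)         # |S3| = 6
chars = {
    'trivial':  [1,  1,  1],
    'sign':     [1, -1,  1],
    'standard': [2,  0, -1],
}

def inner(a, b):
    # (a, b) = (1/|G|) * sum over classes of (class size) * a(g) * b(g)
    return Fraction(sum(s * x * y for s, x, y in zip(sizes, a, b)), order)

# first orthogonality: the irreducible characters are orthonormal
for n1, c1 in chars.items():
    for n2, c2 in chars.items():
        assert inner(c1, c2) == (1 if n1 == n2 else 0)

# special case of second orthogonality: sum of squared dimensions is |G|
assert sum(c[0] ** 2 for c in chars.values()) == order
```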


Chapter 316 20C30 – Representations of finite symmetric groups
316.1 example of immanent

If χ = 1 we obtain the permanent. If χ = sgn we obtain the determinant. Version: 1 Owner: gholmes74 Author(s): gholmes74

316.2

immanent

Let χ : Sn → C be a complex character. For any n × n matrix A define

Immχ(A) = Σ_{σ∈Sn} χ(σ) Π_{j=1}^{n} A(j, σ(j)).

Functions obtained in this way are called immanents. Version: 4 Owner: gholmes74 Author(s): gholmes74

316.3

permanent

The permanent of an n × n matrix A over C is the number

per(A) = Σ_{σ∈Sn} Π_{j=1}^{n} A(j, σ(j)).

Version: 2 Owner: gholmes74 Author(s): gholmes74
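Both the permanent and the determinant arise as immanents. The Python sketch below (names ours) evaluates Immχ directly from the definition, with χ = 1 and χ = sgn:

```python
from itertools import permutations

def immanant(A, chi):
    # Imm_chi(A) = sum over sigma of chi(sigma) * prod_j A[j][sigma(j)]
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        prod = 1
        for j in range(n):
            prod *= A[j][sigma[j]]
        total += chi(sigma) * prod
    return total

def sgn(sigma):
    # sign of a permutation, via counting inversions
    inv = sum(1 for a in range(len(sigma)) for b in range(a + 1, len(sigma))
              if sigma[a] > sigma[b])
    return -1 if inv % 2 else 1

A = [[1, 2], [3, 4]]
print(immanant(A, lambda s: 1))  # permanent: 1*4 + 2*3 = 10
print(immanant(A, sgn))          # determinant: 1*4 - 2*3 = -2
```

This brute-force evaluation costs n! terms; it is only meant to illustrate the definition, not to compute permanents efficiently.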


Chapter 317 20C99 – Miscellaneous
317.1 Frobenius reciprocity

Let V be a finite-dimensional representation of a finite group G, and let W be a representation of a subgroup H ⊂ G. Then the characters of V and W satisfy the inner product relation (χInd(W), χV) = (χW, χRes(V)), where Ind and Res denote the induced representation Ind_H^G and the restriction representation Res_H^G. The Frobenius reciprocity theorem is often given in the stronger form which states that Res and Ind are adjoint functors between the category of G–modules and the category of H–modules: HomH(W, Res(V)) = HomG(Ind(W), V), or, equivalently, V ⊗ Ind(W) = Ind(Res(V) ⊗ W). Version: 4 Owner: djao Author(s): rmilson, djao

317.2

Schur’s lemma

Schur’s lemma in representation theory is an almost trivial observation for irreducible modules, but deserves respect because of its profound applications and implications. Lemma 5 (Schur’s lemma). Let G be a finite group represented on irreducible G-modules V and W . Any G-module homomorphism f : V → W is either invertible or the zero map. 1301

The only insight here is that both ker f and im f are G-submodules of V and W respectively. This is routine. However, because V is irreducible, ker f is either trivial or all of V. In the former case, im f is a nonzero submodule of W, hence all of W because W is irreducible, so f is invertible. In the latter case, f is the zero map.

The following corollary is a very useful form of Schur’s lemma, in case that our representations are over an algebraically closed field. Corollary 1. If G is represented over an algebraically closed field F on irreducible G-modules V and W , then any G-module homomorphism f : V → W is a scalar.
The insight in this case is to consider the modules V and W as vector spaces over F. Notice then that the homomorphism f is a linear transformation and therefore has an eigenvalue λ in the algebraically closed field F. Hence, f − λ1 is not invertible. By Schur’s lemma, f − λ1 = 0. In other words, f = λ, a scalar.

Version: 14 Owner: rmilson Author(s): rmilson, NeuRet

317.3

character

Let ρ : G −→ GL(V) be a finite dimensional representation of a group G (i.e., V is a finite dimensional vector space over its scalar field K). The character of ρ is the function χV : G −→ K defined by χV(g) := Tr(ρ(g)), where Tr is the trace function. Properties: • χV(g) = χV(h) if g is conjugate to h in G. (Equivalently, a character is a class function on G.) • If G is finite, the characters of the irreducible representations of G over the complex numbers form a basis of the vector space of all class functions on G (with pointwise addition and scalar multiplication). • Over the complex numbers, the characters of the irreducible representations of G are orthonormal under the inner product

(χ1, χ2) := (1/|G|) Σ_{g∈G} χ1(g) χ2(g^−1).

Version: 4 Owner: djao Author(s): djao

317.4

group representation

Let G be a group, and let V be a vector space. A representation of G in V is a group homomorphism ρ : G −→ GL(V ) from G to the general linear group GL(V ) of invertible linear transformations of V . Equivalently, a representation of G is a vector space V which is a (left) module over the group ring Z[G]. The equivalence is achieved by assigning to each homomorphism ρ : G −→ GL(V ) the module structure whose scalar multiplication is defined by g · v := (ρ(g))(v), and extending linearly.

Special kinds of representations (preserving all notation from above) A representation is faithful if either of the following equivalent conditions is satisfied: • ρ : G −→ GL(V ) is injective • V is a faithful left Z[G]–module A subrepresentation of V is a subspace W of V which is a left Z[G]–submodule of V ; or, equivalently, a subspace W of V with the property that (ρ(g))(w) ∈ W for all w ∈ W. A representation V is called irreducible if it has no subrepresentations other than itself and the zero module. Version: 2 Owner: djao Author(s): djao

317.5

induced representation

Let G be a group, H ⊂ G a subgroup, and V a representation of H, considered as a Z[H]–module. The induced representation of V on G, denoted Ind_H^G(V), is the Z[G]–module whose underlying vector space is the direct sum

⊕_{σ∈G/H} σV

of formal translates of V by left cosets σ in G/H, and whose multiplication operation is defined by choosing a set {gσ}_{σ∈G/H} of coset representatives and setting

g(σv) := τ(hv),

where τ is the unique left coset of G/H containing g · gσ (i.e., such that g · gσ = gτ · h for some h ∈ H). One easily verifies that the representation Ind_H^G(V) is independent of the choice of coset representatives {gσ}. Version: 1 Owner: djao Author(s): djao

317.6

regular representation

Given a group G, the regular representation of G over a field K is the representation ρ : G −→ GL(K[G]) whose underlying vector space K[G] is the K–vector space of formal linear combinations of elements of G, defined by

ρ(g) (k1 g1 + · · · + kn gn) := k1 (g g1) + · · · + kn (g gn)

for ki ∈ K, g, gi ∈ G. Equivalently, the regular representation is the induced representation on G of the trivial representation on the subgroup {1} of G. Version: 2 Owner: djao Author(s): djao
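Concretely, the regular representation sends each group element to a permutation matrix acting on the basis {e_h : h ∈ G} of K[G] by e_h → e_{gh}. A Python sketch (names ours) for G = Z3 verifies the homomorphism property:

```python
def regular_rep(elements, op):
    # each g acts on the basis {e_h} of K[G] by e_h -> e_{g h},
    # i.e. by a permutation matrix
    idx = {h: i for i, h in enumerate(elements)}
    n = len(elements)
    def rho(g):
        M = [[0] * n for _ in range(n)]
        for h in elements:
            M[idx[op(g, h)]][idx[h]] = 1
        return M
    return rho

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

Z3 = [0, 1, 2]
rho = regular_rep(Z3, lambda a, b: (a + b) % 3)

# rho is a homomorphism: rho(g) rho(h) = rho(g + h)
for g in Z3:
    for h in Z3:
        assert matmul(rho(g), rho(h)) == rho((g + h) % 3)
```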

317.7

restriction representation

Let ρ : G −→ GL(V) be a representation of a group G. The restriction representation of ρ to a subgroup H of G, denoted Res_H^G(V), is the representation ρ|H : H −→ GL(V) obtained by restricting the function ρ to the subset H ⊂ G. Version: 1 Owner: djao Author(s): djao


Chapter 318 20D05 – Classification of simple and nonsolvable groups
318.1 Burnside p − q theorem

If a finite group G is not solvable, then the order of G is divisible by at least 3 distinct primes. Alternatively: any group whose order is divisible by only two distinct primes is solvable (these two distinct primes are the p and q of the title). Version: 2 Owner: bwebste Author(s): bwebste

318.2

classification of semisimple groups

For every semisimple group G there is a normal subgroup H of G (called the centerless completely reducible radical) which is isomorphic to a direct product of nonabelian simple groups, such that conjugation on H gives an injection of G into Aut(H). Thus G is isomorphic to a subgroup of Aut(H) containing the inner automorphisms, and for every group H isomorphic to a direct product of non-abelian simple groups, every such subgroup is semisimple. Version: 1 Owner: bwebste Author(s): bwebste

318.3

semisimple group

A group G is called semisimple if it has no proper normal solvable subgroups. Every group is an extension of a semisimple group by a solvable one.


Version: 1 Owner: bwebste Author(s): bwebste


Chapter 319 20D08 – Simple groups: sporadic groups
319.1 Janko groups

The Janko groups, denoted by J1, J2, J3, and J4, are four of the 26 sporadic groups. They were discovered by Z. Janko in 1966 and published in the article ”A new finite simple group with abelian Sylow subgroups and its characterization.” (Journal of Algebra, 1966, 32: 147-186). Each of these groups has very intricate matrix representations as maps into large general linear groups. For example, the matrix K corresponding to J4 gives a representation of J4 in GL112(2). Version: 7 Owner: mathcam Author(s): mathcam, Thomas Heye


Chapter 320 20D10 – Solvable groups, theory of formations, Schunck classes, Fitting classes, π-length, ranks
320.1 Čuhinin’s Theorem

Let G be a finite, π-separable group, for some set π of primes. Then if H is a maximal π-subgroup of G, the index of H in G, |G : H|, is coprime to all elements of π and all such subgroups are conjugate. Such a subgroup is called a Hall π-subgroup. For π = {p}, this essentially reduces to the Sylow theorems (with unnecessary hypotheses). If G is solvable, it is π-separable for all π, so such subgroups exist for all π. This result is often called Hall’s theorem. Version: 4 Owner: bwebste Author(s): bwebste

320.2

separable

Let π be a set of primes. A finite group G is called π-separable if there exists a subnormal series {1} = G0 ⊴ G1 ⊴ · · · ⊴ Gn = G such that each quotient Gi+1/Gi is a π-group or a π′-group. π-separability can be thought of as a generalization of solvability; a group is π-separable for all sets of primes if and only if it is solvable. Version: 3 Owner: bwebste Author(s): bwebste


320.3

supersolvable group

A group G is supersolvable if it has a finite normal series G = G0 ⊵ G1 ⊵ · · · ⊵ Gn = 1 with the property that each factor group Gi−1/Gi is cyclic. A supersolvable group is solvable. Finitely generated nilpotent groups are supersolvable. Version: 1 Owner: mclase Author(s): mclase


Chapter 321 20D15 – Nilpotent groups, p-groups
321.1 Burnside basis theorem

If G is a p-group, then Frat G = G′G^p, where Frat G is the Frattini subgroup, G′ the commutator subgroup, and G^p the subgroup generated by p-th powers. Version: 1 Owner: bwebste Author(s): bwebste


Chapter 322 20D20 – Sylow subgroups, Sylow properties, π-groups, π-structure
322.1 π-groups and π′-groups

Let π be a set of primes. A finite group G is called a π-group if all the primes dividing |G| are elements of π, and a π′-group if none of them are. Typically, if π is a singleton π = {p}, we write p-group and p′-group for these. Version: 2 Owner: bwebste Author(s): bwebste

322.2

p-subgroup

Let G be a finite group with order n, and let p be a prime integer. We can write n = p^k m for some integers k, m with p ∤ m (that is, p^k is the highest power of p that divides n). Any subgroup of G whose order is p^k is called a Sylow p-subgroup; more generally, any subgroup whose order is a power of p is called a p-subgroup. While there is no a priori reason for Sylow p-subgroups to exist for an arbitrary finite group, the fact is that every group has a Sylow p-subgroup for every prime p that divides |G|. This statement is the first Sylow theorem. When |G| = p^k we simply say that G is a p-group. Version: 2 Owner: drini Author(s): drini, apmxi


322.3

Burnside normal complement theorem

Let G be a finite group, and S a Sylow subgroup such that CG(S) = NG(S). Then S has a normal complement. That is, there exists a normal subgroup N ◁ G such that S ∩ N = {1} and SN = G. Version: 1 Owner: bwebste Author(s): bwebste

322.4

Frattini argument

If H is a normal subgroup of a finite group G, and S is a Sylow subgroup of H, then G = HNG (S), where NG (S) is the normalizer of S in G. Version: 1 Owner: bwebste Author(s): bwebste

322.5

Sylow p-subgroup

If (G, ∗) is a group, then any subgroup of order p^a, for any integer a, is called a p-subgroup. If |G| = p^a m, where p ∤ m, then any subgroup S of G with |S| = p^a is a Sylow p-subgroup. We use Sylp(G) for the set of Sylow p-subgroups of G. Version: 3 Owner: Henry Author(s): Henry

322.6

Sylow theorems

Let G be a finite group whose order is divisible by the prime p. Suppose p^m is the highest power of p which is a factor of |G|, and set k = |G|/p^m. • The group G contains at least one subgroup of order p^m. • Any two subgroups of G of order p^m are conjugate. • The number of subgroups of G of order p^m is congruent to 1 modulo p and is a factor of k. Version: 1 Owner: vitriol Author(s): vitriol


322.7

Sylow’s first theorem

existence of subgroups of prime-power order
1: G a finite group 2: p a prime 3: p^k divides |G| 4: ∃ H ≤ G : |H| = p^k

Note: This is a “seed” entry written using a short-hand format described in this FAQ. Version: 2 Owner: bwebste Author(s): yark, apmxi

322.8

Sylow’s third theorem

Let G be a finite group, and let n be the number of Sylow p-subgroups of G. Then n ≡ 1 (mod p), and any two Sylow p-subgroups of G are conjugate to one another. Version: 8 Owner: bwebste Author(s): yark, apmxi

322.9

application of Sylow’s theorems to groups of order pq

We can use Sylow's theorems to examine a group G of order pq, where p and q are primes and p < q. Let n_q denote the number of Sylow q-subgroups of G. Then Sylow's theorems tell us that n_q is of the form 1 + kq for some integer k ≥ 0, and that n_q divides pq. The divisors of pq are 1, p, q, and pq; since p < q, the only one of the form 1 + kq is 1, so n_q = 1. So there is exactly one Sylow q-subgroup, which is therefore normal (indeed, characteristic) in G. Denoting the Sylow q-subgroup by Q, and letting P be a Sylow p-subgroup, we have Q ∩ P = {1} and QP = G, so G is a semidirect product of Q and P. In particular, if there is also only one Sylow p-subgroup, then G is the direct product of Q and P, and is therefore cyclic. Version: 9 Owner: yark Author(s): yark, Manoj, Henry
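The counting step can be made concrete: among the divisors of pq, only 1 is congruent to 1 modulo q when p < q. A small sketch (the function name is ours):

```python
def sylow_q_count_options(p: int, q: int):
    """Divisors of pq that are congruent to 1 mod q: candidates for n_q."""
    n = p * q
    return [d for d in range(1, n + 1) if n % d == 0 and d % q == 1]

# For primes p < q the only candidate is 1, so the Sylow q-subgroup is normal.
```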


322.10

p-primary component

Definition 27. Let G be a finite abelian group and let p ∈ N be a prime. The p-primary component of G, Πp , is the subgroup of all elements whose order is a power of p. Note: The p-primary component of an abelian group G coincides with the unique Sylow p-subgroup of G. Version: 2 Owner: alozano Author(s): alozano
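For a cyclic group Z_n the p-primary component can be listed directly, since the additive order of x is n/gcd(x, n). A sketch (function names are ours):

```python
from math import gcd

def primary_component(n: int, p: int):
    """Elements of Z_n (written additively) whose order is a power of p."""
    def order(x):
        return n // gcd(x, n)
    def is_p_power(m):
        while m % p == 0:
            m //= p
        return m == 1
    return sorted(x for x in range(n) if is_p_power(order(x)))
```

For n = 12 the 2-primary component {0, 3, 6, 9} has order 4 = 2^2, which is exactly the Sylow 2-subgroup, as the note above says.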

322.11

proof of Frattini argument

Let g ∈ G be any element. Since H is normal, gSg^{-1} ⊆ H. Since S is a Sylow subgroup of H, gSg^{-1} = hSh^{-1} for some h ∈ H, by Sylow's theorems. Thus n := h^{-1}g normalizes S, and so g = hn with h ∈ H and n ∈ N_G(S). Version: 1 Owner: bwebste Author(s): bwebste

322.12

proof of Sylow theorems

We let G be a group of order p^m k, where p ∤ k, and prove Sylow's theorems. First, a fact which will be used several times in the proof:

Proposition 8. If p divides the size of every conjugacy class outside the center, then p divides the order of the center.

Proof: This follows from the class equation (see the centralizer entry):

|G| = |Z(G)| + Σ_{a ∉ Z(G)} |[a]|

If p divides the left-hand side and divides every term |[a]| of the sum on the right-hand side, it must divide the one remaining term as well, so p | |Z(G)|.

Proposition 9. G has a Sylow p-subgroup.

Proof: By induction on |G|. If |G| = 1 then no prime divides its order, so the condition is trivial. Suppose |G| = p^m k, p ∤ k, and the proposition holds for all groups of smaller order. Then we can consider whether p divides the order of the center, Z(G).

If it does, then by Cauchy's theorem there is an element a of Z(G) of order p, and therefore a cyclic subgroup ⟨a⟩ of order p. Since this is a subgroup of the center, it is normal, so G/⟨a⟩ is well-defined and of order p^{m-1}k. By the inductive hypothesis, this group has a subgroup P/⟨a⟩ of order p^{m-1}. Then the corresponding subgroup P of G has |P| = |P/⟨a⟩| · |⟨a⟩| = p^m.

On the other hand, if p ∤ |Z(G)|, consider the conjugacy classes outside the center. By the proposition above, since |Z(G)| is not divisible by p, at least one conjugacy class size can't be. If a is a representative of such a class then p ∤ |[a]| = [G : C(a)], and since |C(a)| · [G : C(a)] = |G|, we get p^m | |C(a)|. But C(a) ≠ G, since a ∉ Z(G), so by induction C(a) has a subgroup of order p^m, and this is also a subgroup of G.

Proposition 10. The intersection of a Sylow p-subgroup Q with the normalizer of a Sylow p-subgroup P is the intersection of the two subgroups. That is, Q ∩ N_G(P) = Q ∩ P.

Proof: Consider R = Q ∩ N_G(P). Obviously Q ∩ P ⊆ R. In addition, since R ⊆ N_G(P), the second isomorphism theorem tells us that RP is a group, and |RP| = |R| · |P| / |R ∩ P|. Now P is a subgroup of RP, so p^m | |RP|. But R is a subgroup of Q, so |R| is a power of p, and therefore |RP| is a power of p. Since |RP| divides |G| = p^m k, it must be that |RP| = p^m, and therefore RP = P, and so R ⊆ P. Obviously R ⊆ Q, so R ⊆ Q ∩ P.

The following construction will be used in the remainder of the proof. Given any Sylow p-subgroup P, consider the set C of its conjugates: X ∈ C if and only if X = xPx^{-1} = {xpx^{-1} : p ∈ P} for some x ∈ G. Observe that every X ∈ C is a Sylow p-subgroup (and we will show that the converse holds as well). We define a group action of G on C by

g · X = g · (xPx^{-1}) = (gx)P(gx)^{-1}

This is clearly a group action, so we can consider its orbits. Of course, if all of G acts then there is only one orbit, so we restrict the action to a Sylow p-subgroup Q.
Name the orbits O_1, ..., O_s, and let P_1, ..., P_s be representatives of the corresponding orbits. By the orbit-stabilizer theorem, the size of an orbit is the index of the stabilizer, and under this action the stabilizer of any P_i is just N_Q(P_i) = Q ∩ N_G(P_i) = Q ∩ P_i, so |O_i| = [Q : Q ∩ P_i]. There are two easy results on this construction. If Q = P_i then |O_i| = [P_i : P_i ∩ P_i] = 1. If Q ≠ P_i then [Q : Q ∩ P_i] > 1, and since the index of any subgroup of Q divides |Q|, p | |O_i|.

Proposition 11. The number of conjugates of any Sylow p-subgroup of G is congruent to 1 modulo p.

In the construction above, let Q = P_1. Then |O_1| = 1 and p | |O_i| for i ≠ 1. Since the number of conjugates of P is the sum of the sizes of the orbits, the number of conjugates is of the form 1 + k_2 p + k_3 p + ... + k_s p, which is obviously congruent to 1 modulo p.

Proposition 12. Any two Sylow p-subgroups are conjugate.

Proof: Given a Sylow p-subgroup P and any other Sylow p-subgroup Q, consider again the construction given above. If Q is not conjugate to P then Q ≠ P_i for every i, and therefore p | |O_i| for every orbit. But then the number of conjugates of P is divisible by p, contradicting the previous result. Therefore Q must be conjugate to P.

Proposition 13. The number of subgroups of G of order p^m is congruent to 1 modulo p and is a factor of k.

Proof: Since the conjugates of a Sylow p-subgroup are precisely the Sylow p-subgroups, and since a Sylow p-subgroup has 1 modulo p conjugates, there are 1 modulo p Sylow p-subgroups. Since the number of conjugates is the index of the normalizer, it equals [G : N_G(P)]. Since P is a subgroup of its normalizer, p^m | |N_G(P)|, and therefore [G : N_G(P)] | k. Version: 3 Owner: Henry Author(s): Henry

322.13

subgroups containing the normalizers of Sylow subgroups normalize themselves

Let G be a finite group, and S a Sylow subgroup. Let M be a subgroup such that N_G(S) ⊆ M. Then M = N_G(M). By order considerations, S is a Sylow subgroup of M. Since M is normal in N_G(M), by the Frattini argument, N_G(M) = N_G(S)M = M.

Version: 3 Owner: bwebste Author(s): bwebste


Chapter 323 20D25 – Special subgroups (Frattini, Fitting, etc.)
323.1 Fitting’s theorem

If G is a finite group and M and N are normal nilpotent subgroups, then MN is also a normal nilpotent subgroup. Thus, any finite group has a maximal normal nilpotent subgroup, called its Fitting subgroup. Version: 1 Owner: bwebste Author(s): bwebste

323.2

characteristically simple group

A group G is called characteristically simple if its only characteristic subgroups are {1} and G. Any finite characteristically simple group is the direct product of several copies of isomorphic simple groups. Version: 3 Owner: bwebste Author(s): bwebste

323.3

the Frattini subgroup is nilpotent

The Frattini subgroup Frat G of any finite group G is nilpotent. Let S be a Sylow p-subgroup of Frat G. Then by the Frattini argument, (Frat G)N_G(S) = G. Since the Frattini subgroup consists of non-generators, N_G(S) = G. Thus S is normal in G, and thus in Frat G. Any finite group whose Sylow subgroups are all normal is nilpotent. Version: 4 Owner: bwebste Author(s): bwebste


Chapter 324 20D30 – Series and lattices of subgroups
324.1 maximal condition

A group is said to satisfy the maximal condition if every strictly ascending chain of subgroups G_1 ⊂ G_2 ⊂ G_3 ⊂ ⋯ is finite. This is also called the ascending chain condition. A group satisfies the maximal condition if and only if the group and all its subgroups are finitely generated. Similar properties are useful in other classes of algebraic structures: see for example the noetherian condition for rings and modules. Version: 2 Owner: mclase Author(s): mclase

324.2

minimal condition

A group is said to satisfy the minimal condition if every strictly descending chain of subgroups G_1 ⊃ G_2 ⊃ G_3 ⊃ ⋯ is finite. This is also called the descending chain condition.

A group which satisfies the minimal condition is necessarily periodic. For if it contained an element x of infinite order, then ⟨x⟩ ⊃ ⟨x^2⟩ ⊃ ⟨x^4⟩ ⊃ ⋯ ⊃ ⟨x^{2^n}⟩ ⊃ ⋯ is an infinite descending chain of subgroups. Similar properties are useful in other classes of algebraic structures: see for example the artinian condition for rings and modules. Version: 1 Owner: mclase Author(s): mclase

324.3

subnormal series

Let G be a group with a subgroup H, and let

G = G_0 ⊵ G_1 ⊵ ⋯ ⊵ G_n = H    (324.3.1)

be a series of subgroups with each G_i a normal subgroup of G_{i-1}. Such a series is called a subnormal series or a subinvariant series. If in addition each G_i is a normal subgroup of G, then the series is called a normal series. A subnormal series in which each G_i is a maximal normal subgroup of G_{i-1} is called a composition series. A normal series in which G_i is a maximal normal subgroup of G contained in G_{i-1} is called a principal series or a chief series. Note that a composition series need not end in the trivial group 1. One speaks of the series (324.3.1) as a composition series from G to H, but the term composition series for G generally means a composition series from G to 1. Similar remarks apply to principal series. Version: 1 Owner: mclase Author(s): mclase


Chapter 325 20D35 – Subnormal subgroups
325.1 subnormal subgroup

Let G be a group, and H a subgroup of G. Then H is subnormal if there exists a finite series H = H_0 ⊴ H_1 ⊴ ⋯ ⊴ H_n = G with each H_i a normal subgroup of H_{i+1}. Version: 1 Owner: bwebste Author(s): bwebste


Chapter 326 20D99 – Miscellaneous
326.1 Cauchy’s theorem

Let G be a finite group and let p be a prime dividing |G|. Then there is an element of G of order p. Version: 1 Owner: Evandar Author(s): Evandar

326.2

Lagrange’s theorem

Let G be a finite group and let H be a subgroup of G. Then the order of H divides the order of G. Version: 2 Owner: Evandar Author(s): Evandar

326.3

exponent

If G is a finite group, then the exponent of G, denoted exp G, is the smallest positive integer n such that, for every g ∈ G, g^n = e_G. Thus, for every finite group G, exp G divides |G|, and, for every g ∈ G, |g| divides exp G. The concept of exponent for finite groups is similar to that of characteristic for rings. If G is a finite abelian group, then there exists g ∈ G with |g| = exp G. As a result of the fundamental theorem of finite abelian groups, there exist a_1, ..., a_n with a_i dividing a_{i+1} for every integer i between 1 and n − 1 such that G ≅ Z_{a_1} ⊕ ⋯ ⊕ Z_{a_n}. Since c^{a_n} = e_G for every c ∈ G, we have exp G ≤ a_n. Since |(0, ..., 0, 1)| = a_n, we get exp G = a_n, and the result follows.

Following are some examples of exponents of nonabelian groups. Since |(12)| = 2, |(123)| = 3, and |S_4... omitted...| wait
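The exponent of a small permutation group can be computed as the lcm of its element orders; a sketch using permutations-as-tuples (helper names are ours; `math.lcm` needs Python 3.9+):

```python
from itertools import permutations
from math import lcm  # Python 3.9+

def compose(s, t):
    return tuple(s[t[x]] for x in range(len(s)))

def order(g, identity):
    k, x = 1, g
    while x != identity:
        x, k = compose(x, g), k + 1
    return k

def exponent(elems, identity):
    # exp G is the lcm of the orders of the elements of G.
    return lcm(*(order(g, identity) for g in elems))

S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))
```

This confirms exp S_3 = 6 and exp S_4 = 12, matching the examples above.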

326.4

fully invariant subgroup

A subgroup H of a group G is fully invariant if f(H) ⊆ H for all endomorphisms f : G → G. This is a stronger condition than being a characteristic subgroup. The derived subgroup is fully invariant. Version: 1 Owner: mclase Author(s): mclase

326.5

proof of Cauchy’s theorem

Let G be a finite group and p be a prime divisor of |G|. Consider the set X of all ordered strings (x_1, x_2, ..., x_p) for which x_1 x_2 ⋯ x_p = e. Note |X| = |G|^{p-1}, i.e. a multiple of p. There is a natural group action of Z_p on X: m ∈ Z_p sends the string (x_1, x_2, ..., x_p) to (x_{m+1}, ..., x_p, x_1, ..., x_m). By the orbit-stabilizer theorem each orbit contains exactly 1 or p strings. Since (e, e, ..., e) has an orbit of cardinality 1, and the orbits partition X, whose cardinality is divisible by p, there must exist at least one other string (x_1, x_2, ..., x_p) which is left fixed by every element of Z_p, i.e. x_1 = x_2 = ... = x_p ≠ e. Then x_1^p = e, so there exists an element of order p as required. Version: 1 Owner: vitriol Author(s): vitriol
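The counting argument can be watched in action on the abelian example G = Z_6 with p = 3: the set X of triples summing to the identity has |G|^{p-1} elements, and the strings fixed by the cyclic shift are exactly the constant ones. A sketch (variable names are ours):

```python
from itertools import product

# G = Z_6 (written additively) and p = 3, a prime dividing |G| = 6.
p, n = 3, 6
X = [t for t in product(range(n), repeat=p) if sum(t) % n == 0]

# Strings fixed by the cyclic shift are the constant tuples (x, x, x);
# each non-identity fixed string yields an element of order p.
fixed = [t for t in X if t[1:] + t[:1] == t]
```

Here the fixed strings are (0,0,0), (2,2,2), (4,4,4), and indeed 2 and 4 have order 3 in Z_6.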


326.6

proof of Lagrange’s theorem

We know that the cosets Hg form a partition of G (see the coset entry for a proof of this). Since G is finite, we know it can be completely decomposed into a finite number of cosets. Call this number n, and denote the ith coset by Ha_i, so that G = Ha_1 ∪ Ha_2 ∪ ⋯ ∪ Ha_n. Since each coset has |H| elements, we have |G| = |H| · n, and so |H| divides |G|, which proves Lagrange's theorem. Version: 2 Owner: akrowne Author(s): akrowne
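The coset decomposition is easy to exhibit for H = {0, 4, 8} inside Z_12; a sketch (names are ours):

```python
# Cosets of H = {0, 4, 8} in G = Z_12 partition G into |G|/|H| = 4 cells.
G = set(range(12))
H = {0, 4, 8}
cosets = {frozenset((h + g) % 12 for h in H) for g in G}
```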

326.7

proof of the converse of Lagrange’s theorem for finite cyclic groups

Following is a proof that, if G is a finite cyclic group and n ∈ Z+ is a divisor of |G|, then G has a subgroup of order n. Let g be a generator of G. Then |g| = |⟨g⟩| = |G|. Let z ∈ Z such that nz = |G| = |g|. Consider g^z. Since g ∈ G, we have g^z ∈ G, and thus ⟨g^z⟩ ≤ G. Since |⟨g^z⟩| = |g^z| = |g| / gcd(z, |g|) = nz / gcd(z, nz) = nz/z = n, it follows that ⟨g^z⟩ is a subgroup of G of order n. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
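The construction in the proof is directly computable for Z_n written additively; a sketch (the function name is ours):

```python
def subgroup_of_order(n: int, d: int):
    # In the cyclic group Z_n (additive), the subgroup of order d
    # (for d dividing n) is generated by z = n // d.
    assert n % d == 0
    z = n // d
    return sorted({(k * z) % n for k in range(n)})
```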

326.8

proof that exp G divides |G|

Following is a proof that exp G divides |G| for every finite group G. By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < exp G such that |G| = q(exp G) + r. Let g ∈ G. Then e_G = g^{|G|} = g^{q(exp G)+r} = (g^{exp G})^q g^r = (e_G)^q g^r = g^r. Thus, for every g ∈ G, g^r = e_G. By the definition of exponent, r cannot be positive. Thus, r = 0. It follows that exp G divides |G|. Version: 4 Owner: Wkbj79 Author(s): Wkbj79

326.9

proof that |g| divides exp G

Following is a proof that, for every finite group G and for every g ∈ G, |g| divides exp G. By the division algorithm, there exist q, r ∈ Z with 0 ≤ r < |g| such that exp G = q|g| + r. Since e_G = g^{exp G} = g^{q|g|+r} = (g^{|g|})^q g^r = (e_G)^q g^r = g^r, by the definition of the order of an element r cannot be positive. Thus, r = 0. It follows that |g| divides exp G. Version: 2 Owner: Wkbj79 Author(s): Wkbj79

326.10

proof that every group of prime order is cyclic

Following is a proof that every group of prime order is cyclic. Let p be a prime and G be a group such that |G| = p. Then G contains more than one element. Let g ∈ G such that g ≠ e_G. Then ⟨g⟩ contains more than one element. Since ⟨g⟩ ≤ G, by Lagrange's theorem |⟨g⟩| divides p. Since |⟨g⟩| > 1 and |⟨g⟩| divides a prime, |⟨g⟩| = p = |G|. Hence ⟨g⟩ = G. It follows that G is cyclic. Version: 3 Owner: Wkbj79 Author(s): Wkbj79


Chapter 327 20E05 – Free nonabelian groups
327.1 Nielsen-Schreier theorem

Let G be a free group and H a subgroup of G. Then H is free. Version: 1 Owner: Evandar Author(s): Evandar

327.2

Schreier index formula

Let G be a free group and H a subgroup of finite index [G : H] = n. By the Nielsen-Schreier theorem, H is free. The Schreier index formula states that rank(H) = n(rank(G) − 1) + 1. This implies, more generally, that if G is any group generated by m elements, then any subgroup of index n can be generated by nm − n + 1 elements. Version: 1 Owner: bwebste Author(s): bwebste

327.3

free group

Let A be a set with elements a_i for some index set I. We refer to A as an alphabet and the elements of A as letters. A syllable is a symbol of the form a_i^n for n ∈ Z. It is customary to write a_i for a_i^1. Define a word to be a finite ordered string, or sequence, of syllables made up of elements of A. For example,

a_2^3 a_4 a_3^{-1} a_2^2 a_1^{-3}

is a five-syllable word. Notice that there exists a unique empty word, i.e. the word with no syllables, usually written simply as 1. Denote the set of all words formed from elements of A by W[A]. Define a binary operation, called the product, on W[A] by concatenation of words. To illustrate, if a_1^3 a_2 and a_2^{-1} a_3^4 are elements of W[A], then their product is simply a_1^3 a_2 a_2^{-1} a_3^4. This gives W[A] the structure of a semigroup with identity. The empty word 1 acts as a right and left identity in W[A], and is the only element which has an inverse. In order to give W[A] the structure of a group, two more ideas are needed. If v = u_1 a_i^0 u_2 is a word, where u_1, u_2 are also words and a_i is some element of A, an elementary contraction of type I replaces the occurrence of a_i^0 by 1. Thus, after this type of contraction we get another word w = u_1 u_2. If v = u_1 a_i^p a_i^q u_2 is a word, an elementary contraction of type II replaces the occurrence of a_i^p a_i^q by a_i^{p+q}, which results in w = u_1 a_i^{p+q} u_2. In either of these cases, we also say that w is obtained from v by an elementary contraction, or that v is obtained from w by an elementary expansion. Call two words u, v equivalent (denoted u ∼ v) if one can be obtained from the other by a finite sequence of elementary contractions or expansions. This is an equivalence relation on W[A]. Let F[A] be the set of equivalence classes of words in W[A]. Then F[A] is a group under the operation [u][v] = [uv] for [u], [v] ∈ F[A]. The inverse [u]^{-1} of an element [u] is obtained by reversing the order of the syllables of [u] and changing the sign of each exponent. For example, if [u] = [a_1 a_3^2], then [u]^{-1} = [a_3^{-2} a_1^{-1}]. We call F[A] the free group on the alphabet A, or the free group generated by A. A given group G is free if G is isomorphic to F[A] for some A. This seemingly ad hoc construction gives an important result: every group is the homomorphic image of some free group.
Version: 4 Owner: jihemme Author(s): jihemme, rmilson, djao
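The reduction to a canonical representative of each equivalence class can be sketched with a stack, encoding a syllable a_i^n as a pair (i, n); the function names are ours:

```python
def reduce_word(word):
    """Freely reduce a word: a list of syllables (i, n) standing for a_i^n.
    Type I contractions drop a_i^0; type II contractions merge adjacent
    syllables with the same letter.  A stack handles cascading cancellations."""
    stack = []
    for i, n in word:
        if n == 0:                       # type I
            continue
        if stack and stack[-1][0] == i:  # type II
            j, m = stack.pop()
            if m + n != 0:
                stack.append((j, m + n))
        else:
            stack.append((i, n))
    return stack

def inverse(word):
    """Reverse the syllables and negate each exponent."""
    return [(i, -n) for i, n in reversed(word)]
```

A word followed by its inverse reduces to the empty word, mirroring the group structure of F[A].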

327.4

proof of Nielsen-Schreier theorem and Schreier index formula

While there are purely algebraic proofs of this fact, a much easier proof is available through geometric group theory. Let G be a group which is free on a set X. Any group acts freely on its Cayley graph, and the Cayley graph of G is a 2|X|-regular tree, which we will call T.

If H is any subgroup of G, then H also acts freely on T by restriction. Since groups that act freely on trees are free, H is free.

Moreover, we can obtain the rank of H (the size of the set on which it is free). If Γ is a finite graph, then π_1(Γ) is free of rank 1 − χ(Γ), where χ(Γ) denotes the Euler characteristic of Γ. Since H ≅ π_1(H\T), the rank of H is 1 − χ(H\T). If H is of finite index n in G, then H\T is finite, and χ(H\T) = nχ(G\T). Of course 1 − χ(G\T) is the rank of G. Substituting, we find that rank(H) = n(rank(G) − 1) + 1. Version: 2 Owner: bwebste Author(s): bwebste

327.5

Jordan-Hölder decomposition

A Jordan–Hölder decomposition of a group G is a filtration G = G_1 ⊃ G_2 ⊃ ⋯ ⊃ G_n = {1} such that G_{i+1} is a normal subgroup of G_i and the quotient G_i/G_{i+1} is a simple group for each i. Version: 4 Owner: djao Author(s): djao

327.6

profinite group

A topological group G is profinite if it is isomorphic to the inverse limit of some projective system of finite groups. In other words, G is profinite if there exists a directed set I, a collection of finite groups {H_i}_{i∈I}, and homomorphisms α_ij : H_j → H_i for each pair i, j ∈ I with i ≤ j, satisfying

1. α_ii = 1 for all i ∈ I,
2. α_ij ∘ α_jk = α_ik for all i, j, k ∈ I with i ≤ j ≤ k,

with the property that:

• G is isomorphic as a group to the projective limit

lim←− H_i := { (h_i) ∈ ∏_{i∈I} H_i : α_ij(h_j) = h_i for all i ≤ j }

under componentwise multiplication.

• The isomorphism from G to lim←− H_i (considered as a subspace of ∏ H_i) is a homeomorphism of topological spaces, where each H_i is given the discrete topology and ∏ H_i is given the product topology.

The topology on a profinite group is called the profinite topology. Version: 3 Owner: djao Author(s): djao

327.7

extension

A short exact sequence 0 → A → B → C → 0 is sometimes called an extension of C by A. This term is also applied to an object B which fits into such an exact sequence. Version: 1 Owner: bwebste Author(s): bwebste

327.8

holomorph

Let K be a group, and let θ : Aut(K) → Aut(K) be the identity map. The holomorph of K, denoted Hol(K), is the semidirect product K ⋊_θ Aut(K). Then K is a normal subgroup of Hol(K), and any automorphism of K is the restriction of an inner automorphism of Hol(K). For if φ ∈ Aut(K), then

(1, φ) · (k, 1) · (1, φ^{-1}) = (θ(φ)(k), φ) · (1, φ^{-1}) = (φ(k), φφ^{-1}) = (φ(k), 1).

Version: 2 Owner: dublisk Author(s): dublisk

327.9

proof of the Jordan Holder decomposition theorem

Let |G| = N. We first prove existence, using induction on N. If N = 1 (or, more generally, if G is simple) the result is clear. Now suppose G is not simple. Choose a maximal proper normal subgroup G_1 of G. Then G_1 has a Jordan–Hölder decomposition by induction, which produces a Jordan–Hölder decomposition for G.

To prove uniqueness, we use induction on the length n of the decomposition series. If n = 1 then G is simple and we are done. For n > 1, suppose that G ⊃ G_1 ⊃ G_2 ⊃ ⋯ ⊃ G_n = {1} and G ⊃ G′_1 ⊃ G′_2 ⊃ ⋯ ⊃ G′_m = {1} are two decompositions of G. If G_1 = G′_1 then we're done (apply the induction hypothesis to G_1), so assume G_1 ≠ G′_1. Set H := G_1 ∩ G′_1 and choose a decomposition series H ⊃ H_1 ⊃ ⋯ ⊃ H_k = {1} for H. By the second isomorphism theorem, G_1/H = G_1 G′_1/G′_1 = G/G′_1 (the last equality is because G_1 G′_1 is a normal subgroup of G properly containing G_1). In particular, H is a normal subgroup of G_1 with simple quotient. But then G_1 ⊃ G_2 ⊃ ⋯ ⊃ G_n and G_1 ⊃ H ⊃ ⋯ ⊃ H_k are two decomposition series for G_1, and hence have the same simple quotients by the induction hypothesis; likewise for the G′_1 series. Therefore n = m. Moreover, since G/G_1 = G′_1/H and G/G′_1 = G_1/H (by the second isomorphism theorem), we have now accounted for all of the simple quotients, and shown that they are the same. Version: 4 Owner: djao Author(s): djao

327.10

semidirect product of groups

The goal of this exposition is to carefully explain the correspondence between the notions of external and internal semi–direct products of groups, as well as the connection between semi–direct products and short exact sequences. Naturally, we start with the construction of semi–direct products.

Definition 6. Let H and Q be groups and let θ : Q −→ Aut(H) be a group homomorphism. The semi–direct product H ⋊_θ Q is defined to be the group with underlying set {(h, q) such that h ∈ H, q ∈ Q} and group operation (h, q)(h′, q′) := (h θ(q)(h′), qq′). We leave it to the reader to check that H ⋊_θ Q is really a group. It helps to know that the inverse of (h, q) is (θ(q^{-1})(h^{-1}), q^{-1}).

For the remainder of this article, we omit θ from the notation whenever this map is clear from the context.

Set G := H ⋊ Q. There exist canonical monomorphisms H −→ G and Q −→ G, given by

h → (h, 1_Q), for h ∈ H,
q → (1_H, q), for q ∈ Q,

where 1_H (resp. 1_Q) is the identity element of H (resp. Q). These monomorphisms are so natural that we will treat H and Q as subgroups of G under these inclusions.

Theorem 3. Let G := H ⋊ Q as above. Then:
• H is a normal subgroup of G.
• HQ = G.
• H ∩ Q = {1_G}.

Let p : G −→ Q be the projection map defined by p(h, q) = q. Then p is a homomorphism with kernel H. Therefore H is a normal subgroup of G. Every (h, q) ∈ G can be written as (h, 1_Q)(1_H, q). Therefore HQ = G. Finally, it is evident that (1_H, 1_Q) is the only element of G that is of the form (h, 1_Q) for h ∈ H and also of the form (1_H, q) for q ∈ Q.

This result motivates the definition of internal semi–direct products.

Definition 7. Let G be a group with subgroups H and Q. We say G is the internal semi–direct product of H and Q if:
• H is a normal subgroup of G.
• HQ = G.
• H ∩ Q = {1_G}.

We know an external semi–direct product is an internal semi–direct product (Theorem 3). Now we prove a converse (Theorem 4), namely, that an internal semi–direct product is an external semi–direct product.

Lemma 6. Let G be a group with subgroups H and Q. Suppose G = HQ and H ∩ Q = {1_G}. Then every element g of G can be written uniquely in the form hq, for h ∈ H and q ∈ Q.

Since G = HQ, we know that g can be written as hq. Suppose it can also be written as h′q′. Then hq = h′q′, so h′^{-1}h = q′q^{-1} ∈ H ∩ Q = {1_G}. Therefore h = h′ and q = q′.

Theorem 4. Suppose G is a group with subgroups H and Q, and G is the internal semi–direct product of H and Q. Then G ≅ H ⋊_θ Q, where θ : Q −→ Aut(H) is given by θ(q)(h) := qhq^{-1}, for q ∈ Q, h ∈ H.

By Lemma 6, every element g of G can be written uniquely in the form hq, with h ∈ H and q ∈ Q. Therefore, the map φ : H ⋊ Q −→ G given by φ(h, q) = hq is a bijection from H ⋊ Q to G. It only remains to show that this bijection is a homomorphism.

Given elements (h, q) and (h′, q′) in H ⋊ Q, we have

φ((h, q)(h′, q′)) = φ((h θ(q)(h′), qq′)) = φ((h qh′q^{-1}, qq′)) = h qh′q^{-1} qq′ = hq h′q′ = φ(h, q)φ(h′, q′).

Therefore φ is an isomorphism.

Consider the external semi–direct product G := H ⋊_θ Q with subgroups H and Q. We know from Theorem 4 that G is isomorphic to the external semi–direct product H ⋊_{θ′} Q, where we are temporarily writing θ′ for the conjugation map θ′(q)(h) := qhq^{-1} of Theorem 4. But in fact the two maps θ and θ′ are the same:

θ′(q)(h) = (1_H, q)(h, 1_Q)(1_H, q^{-1}) = (θ(q)(h), 1_Q) = θ(q)(h).

In summary, one may use Theorems 3 and 4 to pass freely between the notions of internal semi–direct product and external semi–direct product. Finally, we discuss the correspondence between semi–direct products and split exact sequences of groups.

Definition 8. An exact sequence of groups

1 −→ H →(i) G →(j) Q −→ 1

is split if there exists a homomorphism k : Q −→ G such that j ∘ k is the identity map on Q.

Theorem 5. Let G, H, and Q be groups. Then G is isomorphic to a semi–direct product H ⋊ Q if and only if there exists a split exact sequence

1 −→ H →(i) G →(j) Q −→ 1.

First suppose G ≅ H ⋊ Q. Let i : H −→ G be the inclusion map i(h) = (h, 1_Q) and let j : G −→ Q be the projection map j(h, q) = q. Let the splitting map k : Q −→ G be the inclusion map k(q) = (1_H, q). Then the sequence above is clearly split exact.

Now suppose we have the split exact sequence above. Let k : Q −→ G be the splitting map. Then:
• i(H) = ker j, so i(H) is normal in G.
• For any g ∈ G, set q := k(j(g)). Then j(gq^{-1}) = j(g)j(k(j(g)))^{-1} = 1_Q, so gq^{-1} ∈ Im i. Set h := gq^{-1}. Then g = hq. Therefore G = i(H)k(Q).
• Suppose g ∈ G is in both i(H) and k(Q). Write g = k(q). Then k(q) ∈ Im i = ker j, so q = j(k(q)) = 1_Q. Therefore g = k(q) = k(1_Q) = 1_G, so i(H) ∩ k(Q) = {1_G}.

This proves that G is the internal semi–direct product of i(H) and k(Q). These are isomorphic to H and Q, respectively. Therefore G is isomorphic to a semi–direct product H ⋊ Q.

Thus, not all normal subgroups H ⊂ G give rise to an (internal) semi–direct product G = H ⋊ G/H. More specifically, if H is a normal subgroup of G, we have the canonical exact sequence

1 −→ H −→ G −→ G/H −→ 1.

We see that G can be decomposed into H ⋊ G/H as an internal semi–direct product if and only if the canonical exact sequence splits. Version: 5 Owner: djao Author(s): djao
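As a concrete instance of Definition 6, taking H = Z_4, Q = Z_2, and θ(1) acting by inversion yields a nonabelian group of order 8 (the dihedral group); a sketch in which all names are ours:

```python
# External semidirect product H ⋊_θ Q with H = Z_4, Q = Z_2, where θ(1)
# acts on H by inversion.  The result is the dihedral group of order 8.
H, Q = range(4), range(2)

def theta(q, h):
    # θ(0) is the identity automorphism; θ(1) is inversion h ↦ -h.
    return h if q == 0 else (-h) % 4

def mult(x, y):
    (h1, q1), (h2, q2) = x, y
    # (h1, q1)(h2, q2) = (h1 · θ(q1)(h2), q1 q2), as in Definition 6.
    return ((h1 + theta(q1, h2)) % 4, (q1 + q2) % 2)

G = [(h, q) for h in H for q in Q]
```

A full associativity check over all 8³ triples confirms the group axioms for this small case.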

327.11

wreath product

Let A and B be groups, and let B act on the set Γ. Let A^Γ be the set of all functions from Γ to A. Endow A^Γ with a group operation by pointwise multiplication. In other words, for any f_1, f_2 ∈ A^Γ,

(f_1 f_2)(γ) = f_1(γ) f_2(γ) for all γ ∈ Γ,

where the operation on the right-hand side takes place in A, of course. Define the action of B on A^Γ by

(b f)(γ) := f(bγ),

for any f : Γ → A and all γ ∈ Γ. The wreath product of A and B according to the action of B on Γ, sometimes denoted A ≀_Γ B, is the following semidirect product of groups: A^Γ ⋊ B.

Before going into further constructions, let us pause for a moment to unwind this definition. Let W := A ≀_Γ B. The elements of W are ordered pairs (f, b), for some function f : Γ → A and some b ∈ B. The group operation in the semidirect product, for any (f_1, b_1), (f_2, b_2) ∈ W, is

(f_1, b_1)(f_2, b_2) = (f_1 · (b_1 f_2), b_1 b_2), where (f_1 · (b_1 f_2))(γ) = f_1(γ) f_2(b_1 γ) for all γ ∈ Γ.

The set A^Γ can be interpreted as the cartesian product of A with itself, with Γ playing the role of an index set for the cartesian product. If Γ is finite, for instance, say Γ = {1, 2, ..., n}, then any f ∈ A^Γ is an n-tuple, and we can think of any (f, b) ∈ W as the ordered pair ((a_1, a_2, ..., a_n), b) where a_1, a_2, ..., a_n ∈ A. The action of B on Γ in the semidirect product has the effect of permuting the entries of the n-tuple f, and the group operation defined on A^Γ gives pointwise multiplication. To be explicit, suppose (f, a), (g, b) ∈ W, and for j ∈ Γ, f(j) = r_j ∈ A and g(j) = s_j ∈ A. Then

(f, a)(g, b) = ((r_1, r_2, ..., r_n), a)((s_1, s_2, ..., s_n), b)
= ((r_1, r_2, ..., r_n)(s_{a(1)}, s_{a(2)}, ..., s_{a(n)}), ab) (notice the permutation of the indices!)
= ((r_1 s_{a(1)}, r_2 s_{a(2)}, ..., r_n s_{a(n)}), ab).

A moment's thought to understand this slightly messy notation will be illuminating (and might also shed some light on the choice of terminology, "wreath" product). Version: 11 Owner: bwebste Author(s): NeuRet
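The finite case Γ = {0, 1}, A = Z_2, B = S_2 is small enough to compute with directly; a sketch of the multiplication above (names are ours; A is written additively):

```python
from itertools import product

# The wreath product Z_2 ≀ S_2: A = Z_2 (additive), Γ = {0, 1}, and
# B = S_2 acting on Γ.  Functions f : Γ → A are stored as pairs.
funcs = list(product((0, 1), repeat=2))   # A^Γ, four functions
perms = [(0, 1), (1, 0)]                  # S_2: identity and the swap

def wmult(x, y):
    (f1, b1), (f2, b2) = x, y
    # (f1, b1)(f2, b2) = (f1 · (b1 f2), b1 b2) with (b1 f2)(γ) = f2(b1 γ)
    f = tuple((f1[g] + f2[b1[g]]) % 2 for g in range(2))
    b = tuple(b1[b2[g]] for g in range(2))
    return (f, b)

W = [(f, b) for f in funcs for b in perms]  # |W| = |A|**2 * |B| = 8
```

Since S_2 is abelian, the action is a genuine homomorphism into Aut(A^Γ) and the 8-element group obtained is nonabelian.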

327.12

Jordan-Hölder decomposition theorem

Every finite group G has a filtration G = G_0 ⊃ G_1 ⊃ ⋯ ⊃ G_n = {1}, where each G_{i+1} is normal in G_i and each quotient group G_i/G_{i+1} is a simple group. Any two such decompositions of G have the same multiset of simple groups G_i/G_{i+1}, up to ordering. A filtration of G satisfying the properties above is called a Jordan–Hölder decomposition of G. Version: 4 Owner: djao Author(s): djao

327.13

simplicity of the alternating groups
This is an elementary proof that for n ≥ 5 the alternating group on n symbols, A_n, is simple.

Throughout this discussion, fix n ≥ 5. We will extensively employ cycle notation, with composition on the left, as is usual. The following observation will also be useful. Let π be a permutation written as disjoint cycles

π = (a_1, a_2, ..., a_k)(b_1, b_2, ..., b_l)(...) ...

It is easy to check that for any other permutation σ ∈ S_n,

σπσ^{-1} = (σ(a_1), σ(a_2), ..., σ(a_k))(σ(b_1), σ(b_2), ...)(...) ...

In particular, two permutations of S_n are conjugate exactly when they have the same cycle type. Two preliminary results are necessary.

Lemma 7. A_n is generated by all cycles of length 3.
A product of 3-cycles is an even permutation, so the subgroup generated by all 3-cycles is therefore contained in An . For the reverse inclusion, by definition every even permutation is the product of even number of transpositions. Thus, it suffices to show that the product of two transpositions can be written as a product of 3-cycles. There are two possibilities. Either the two transpositions move an element in common, say (a, b) and (a, c), or the two transpositions are disjoint, say (a, b) and (c, d). In the former case,

(a, b)(a, c) = (a, c, b), and in the latter, (a, b)(c, d) = (a, b, d)(c, b, d). This establishes the first lemma.

Lemma 8. If a normal subgroup N ⊴ A_n contains a 3-cycle, then N = A_n.

We will show that if (a, b, c) ∈ N, then normality implies that every other 3-cycle (a′, b′, c′) also lies in N. This is easy to show, because there is some permutation σ ∈ Sn that under conjugation takes (a, b, c) to (a′, b′, c′), that is,

σ(a, b, c)σ −1 = (σ(a), σ(b), σ(c)) = (a′, b′, c′).

In case σ is odd, then (because n ≥ 5) we can choose some transposition (d, e) disjoint from (a′, b′, c′), so that (d, e)(a′, b′, c′)(d, e) = (a′, b′, c′), and hence

σ′(a, b, c)σ′ −1 = (d, e)σ(a, b, c)σ −1 (d, e) = (a′, b′, c′),

where σ′ = (d, e)σ is even. This means that N contains all 3-cycles, since N ◁ An . Hence, by the previous lemma, N = An , as required.
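The conjugation identity used throughout this proof is easy to spot-check by machine. The following Python sketch (all helper names are ours, not part of the entry) represents a permutation of {0, . . . , n − 1} as a tuple and checks that conjugating a 3-cycle relabels its entries:

```python
def compose(p, q):
    """(p o q)(x) = p(q(x)): apply q first, then p."""
    return tuple(p[q[x]] for x in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return tuple(inv)

def from_cycle(cycle, n):
    """The permutation of {0,...,n-1} given by one cycle."""
    p = list(range(n))
    for i, a in enumerate(cycle):
        p[a] = cycle[(i + 1) % len(cycle)]
    return tuple(p)

n = 5
pi = from_cycle((0, 1, 2), n)          # the 3-cycle (0,1,2)
sigma = from_cycle((1, 3, 4, 2), n)    # some other permutation

# sigma pi sigma^{-1} should be the cycle (sigma(0), sigma(1), sigma(2))
lhs = compose(compose(sigma, pi), inverse(sigma))
rhs = from_cycle(tuple(sigma[a] for a in (0, 1, 2)), n)
assert lhs == rhs
```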

The rest of the proof proceeds by an exhaustive verification of all the possible cases. Suppose there is some nontrivial normal subgroup N ◁ An . We will show that N = An . In each case we will suppose N contains a particular kind of element, and normality will imply that N also contains a certain conjugate of the element, thereby reducing the situation to a previously solved case.

Case 1. Suppose N contains a permutation π that, when written as disjoint cycles, has a cycle of length at least 4, say

π = (a1 , a2 , a3 , a4 , . . .) . . .

Upon conjugation by (a1 , a2 , a3 ) ∈ An , we obtain

π′ = (a1 , a2 , a3 )π(a3 , a2 , a1 ) = (a2 , a3 , a1 , a4 , . . .) . . .

so that π′ ∈ N, and also π′π −1 = (a1 , a2 , a4 ) ∈ N. Notice that the rest of the cycles cancel. By Lemma 8, N = An .

Case 2. The cycle decompositions of elements of N involve only cycles of length at most 3, with at least two cycles of length 3. Consider then π = (a, b, c)(d, e, f ) . . . Conjugation by (c, d, e) implies that N also contains

π′ = (c, d, e)π(e, d, c) = (a, b, d)(e, c, f ) . . . ,

and hence N also contains π′π = (a, d, c, b, f ) . . ., which reduces to Case 1.

Case 3. There is an element of N whose cycle decomposition involves only transpositions and exactly one 3-cycle. Upon squaring, this element becomes a 3-cycle, and Lemma 8 applies.

Case 4. There is an element of N of the form π = (a, b)(c, d). Conjugating by (a, e, b) with e distinct from a, b, c, d (at least one such e exists, as n ≥ 5) yields

π′ = (a, e, b)π(b, e, a) = (a, e)(c, d) ∈ N.

Hence π′π = (a, b, e) ∈ N, Lemma 8 applies, and N = An .

Case 5. Every element of N is the product of at least four disjoint transpositions. Suppose N contains π = (a1 , b1 )(a2 , b2 )(a3 , b3 )(a4 , b4 ) . . ., the number of transpositions being even, of course. This time we conjugate by (a2 , b1 )(a3 , b2 ):

π′ = (a2 , b1 )(a3 , b2 )π(a3 , b2 )(a2 , b1 ) = (a1 , a2 )(a3 , b1 )(b2 , b3 )(a4 , b4 ) . . . ,

and π′π = (a1 , a3 , b2 )(a2 , b3 , b1 ) ∈ N, which is Case 2. Since this covers all possible cases, N = An , and the alternating group contains no proper nontrivial normal subgroups. QED

Version: 8 Owner: rmilson Author(s): NeuRet

327.14

abelian groups of order 120

Here we present an application of the fundamental theorem of finitely generated abelian groups.

Example (abelian groups of order 120): Let G be an abelian group of order n = 120. Since the group is finite it is obviously finitely generated, so we can apply the theorem. There exist n1 , n2 , . . . , ns with

G ≅ Z/n1 Z ⊕ Z/n2 Z ⊕ . . . ⊕ Z/ns Z

Notice that in the case of a finite group, r, as in the statement of the theorem, must be equal to 0. We have

n = 120 = 2^3 · 3 · 5 = n1 · n2 · . . . · ns ,   with ni ≥ 2 for all i, and ni+1 | ni for 1 ≤ i ≤ s − 1,

and by the divisibility properties of the ni we must have that every prime divisor of n divides n1 . Thus the possibilities for n1 are the following:

2 · 3 · 5,  2^2 · 3 · 5,  2^3 · 3 · 5

If n1 = 2^3 · 3 · 5 = 120 then s = 1. In the case that n1 = 2^2 · 3 · 5 = 60, then n2 = 2 and s = 2. It remains to analyze the case n1 = 2 · 3 · 5 = 30. Here the remaining factors multiply to 4 and each must divide n1 = 30, so n2 = 4 is impossible, and the only choice is n2 = n3 = 2, with s = 3. Hence if G is an abelian group of order 120 it must be (up to isomorphism) one of the following:

Z/120Z,  Z/60Z ⊕ Z/2Z,  Z/30Z ⊕ Z/2Z ⊕ Z/2Z

Also notice that these are pairwise non-isomorphic. This is because Z/(n · m)Z ≅ Z/nZ ⊕ Z/mZ ⇔ gcd(n, m) = 1, which is due to the Chinese remainder theorem. Version: 1 Owner: alozano Author(s): alozano
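The case analysis above can be double-checked by brute force. The Python sketch below (our own code, not part of the entry) enumerates every sequence (n1 , . . . , ns ) with product 120, each ni ≥ 2, and ni+1 | ni :

```python
def invariant_factor_chains(n):
    """All tuples (n1, ..., ns) with n1*...*ns = n, each ni >= 2,
    and n_{i+1} dividing n_i."""
    chains = []

    def extend(chain, remaining, bound):
        if remaining == 1:
            chains.append(tuple(chain))
            return
        # the next factor must divide both the previous factor ("bound")
        # and the remaining product
        for d in range(2, bound + 1):
            if remaining % d == 0 and bound % d == 0:
                extend(chain + [d], remaining // d, d)

    for d in range(2, n + 1):
        if n % d == 0:
            extend([d], n // d, d)
    return chains

# exactly the three groups found in the entry
assert sorted(invariant_factor_chains(120)) == [(30, 2, 2), (60, 2), (120,)]
```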

327.15

fundamental theorem of finitely generated abelian groups

Theorem 2 (Fundamental theorem of finitely generated abelian groups). Let G be a finitely generated abelian group. Then there is a unique expression of the form:

G ≅ Z^r ⊕ Z/n1 Z ⊕ Z/n2 Z ⊕ . . . ⊕ Z/ns Z

for some integers r, ni satisfying: r ≥ 0; ni ≥ 2 for all i; and ni+1 | ni for 1 ≤ i ≤ s − 1.

Version: 1 Owner: bwebste Author(s): alozano

327.16

conjugacy class

Let G be a group, and consider its operation (action) on itself given by conjugation, that is, the mapping (g, x) → gxg −1 . Conjugacy is an equivalence relation, so we obtain a partition of G into equivalence classes, called conjugacy classes. So, the conjugacy class of x ∈ G (represented Cx or C(x)) is given by

Cx = {y ∈ G : y = gxg −1 for some g ∈ G}

Version: 2 Owner: drini Author(s): drini, apmxi
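As a concrete illustration, the conjugacy classes of S3 can be computed directly from the definition. In the Python sketch below (our own code, with permutations on {0, 1, 2} as tuples) the classes have sizes 1 (identity), 3 (transpositions), and 2 (3-cycles):

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(x) = p(q(x))."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

G = list(permutations(range(3)))
# C_x = { g x g^{-1} : g in G }, collected as a set of classes
classes = {frozenset(compose(compose(g, x), inverse(g)) for g in G) for x in G}
sizes = sorted(len(c) for c in classes)
assert sizes == [1, 2, 3]
```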

327.17

Frattini subgroup

Let G be a group. The Frattini subgroup Φ(G) of G is the intersection of all maximal subgroups of G. Equivalently, Φ(G) is the subgroup of non-generators of G. Version: 1 Owner: Evandar Author(s): Evandar

327.18

non-generator

Let G be a group. An element g ∈ G is said to be a non-generator if whenever X is a generating set for G, then X \ {g} is also a generating set for G. Version: 1 Owner: Evandar Author(s): Evandar


Chapter 328 20Exx – Structure and classification of infinite or finite groups
328.1 faithful group action

Let A be a G-set, that is, a set on which a group G acts (or operates). The map mg : A → A defined by

mg (x) = ψ(g, x),

where g ∈ G and ψ is the action, is a permutation of A (in other words, a bijective function from A to itself) and so an element of SA . We even get a homomorphism from G to SA by the rule g → mg . If for any pair g, h ∈ G with g ≠ h we have mg ≠ mh , in other words, if the homomorphism g → mg is injective, we say that the action is faithful. Version: 3 Owner: drini Author(s): drini, apmxi


Chapter 329 20F18 – Nilpotent groups
329.1 classification of finite nilpotent groups

Let G be a finite group. The following are equivalent: 1. G is nilpotent. 2. Every subgroup of G is subnormal. 3. Every proper subgroup H < G is properly contained in its normalizer.

4. Every maximal subgroup is normal. 5. Every Sylow subgroup is normal. 6. G is a direct product of p-groups. Version: 1 Owner: bwebste Author(s): bwebste

329.2

nilpotent group

We define the lower central series of a group G to be the filtration of subgroups G = G1 ⊃ G2 ⊃ · · · defined inductively by: G1 := G, Gi := [Gi−1 , G] for i > 1,

where [Gi−1 , G] denotes the subgroup of G generated by all commutators of the form hkh−1 k −1 where h ∈ Gi−1 and k ∈ G. The group G is said to be nilpotent if Gi = 1 for some i. Nilpotent groups can also be equivalently defined by means of upper central series. For a group G, the upper central series of G is the filtration of subgroups C1 ⊂ C2 ⊂ · · · defined by setting C1 to be the center of G, and inductively taking Ci to be the unique subgroup of G such that Ci /Ci−1 is the center of G/Ci−1 , for each i > 1. The group G is nilpotent if and only if G = Ci for some i. Nilpotent groups are related to nilpotent Lie algebras in that a Lie group is nilpotent as a group if and only if its corresponding Lie algebra is nilpotent. The analogy extends to solvable groups as well: every nilpotent group is solvable, because the upper central series is a filtration with abelian quotients. Version: 3 Owner: djao Author(s): djao


Chapter 330 20F22 – Other classes of groups defined by subgroup chains
330.1 inverse limit

Let {Gi }, i ≥ 0, be a sequence of groups which are related by a chain of surjective homomorphisms fi : Gi → Gi−1 , such that

G0 ← G1 ← G2 ← G3 ← · · ·

with the arrow from Gi to Gi−1 given by fi .

Definition 28. The inverse limit of (Gi , fi ), denoted by lim←(Gi , fi ) or lim← Gi , is the subset of the direct product of the Gi formed by the elements

(g0 , g1 , g2 , g3 , . . .), with gi ∈ Gi , satisfying fi (gi ) = gi−1 for all i ≥ 1.

Note: The inverse limit of the Gi can be checked to be a subgroup of the direct product of the Gi . See below for a more general definition.

Examples:

1. Let p ∈ N be a prime. Let G0 = {0} and Gi = Z/pi Z. Define the connecting homomorphisms fi : Z/pi Z → Z/pi−1 Z, for i ≥ 2, to be "reduction modulo pi−1 ", i.e. fi (x mod pi ) = x mod pi−1 , which are obviously surjective homomorphisms. The inverse limit of (Z/pi Z, fi ) is called the p-adic integers and denoted by Zp = lim← Z/pi Z.
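For a concrete compatible sequence, consider the element −1 of Zp , whose component at level i is p^i − 1 in Z/pi Z. The Python sketch below (our own check, not part of the entry) verifies the compatibility condition fi (gi ) = gi−1 :

```python
# g_i = p^i - 1 represents -1 in Z/p^i Z; reducing mod p^{i-1}
# should recover g_{i-1}, since p^i is 0 mod p^{i-1}.
p = 5
levels = 7
g = [p**i - 1 for i in range(1, levels + 1)]
for i in range(1, levels):
    # f(g_i) = g_i mod p^{i-1} ... with our 0-based indexing:
    assert g[i] % p**i == g[i - 1]
```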

2. Let E be an elliptic curve defined over C. Let p be a prime and for any natural number n write E[n] for the n-torsion group, i.e. E[n] = {Q ∈ E | n · Q = O}. In this case we define Gi = E[pi ] and

fi : E[pi ] → E[pi−1 ],  fi (Q) = p · Q.

The inverse limit of (E[pi ], fi ) is called the Tate module of E and denoted Tp (E) = lim← E[pi ].

The concept of inverse limit can be defined in far more generality. Let (S, ≤) be a directed set and let C be a category. Let {Gα }α∈S be a collection of objects in the category C and let

{fα,β : Gβ → Gα | α, β ∈ S, α ≤ β}

be a collection of morphisms satisfying:

1. For all α ∈ S, fα,α = IdGα , the identity morphism.

2. For all α, β, γ ∈ S such that α ≤ β ≤ γ, we have fα,γ = fα,β ◦ fβ,γ (composition of morphisms).

Definition 29. The inverse limit of ({Gα }α∈S , {fα,β }), denoted by lim←(Gα , fα,β ) or lim← Gα , is defined to be the set of all (gα ) in the product of the Gα over α ∈ S such that for all α, β ∈ S,

α ≤ β ⇒ fα,β (gβ ) = gα .

For a good example of this more general construction, see infinite Galois theory. Version: 6 Owner: alozano Author(s): alozano


Chapter 331 20F28 – Automorphism groups of groups
331.1 outer automorphism group

The outer automorphism group of a group is the quotient of its automorphism group by its inner automorphism group: Out(G) = Aut(G)/Inn(G). Version: 7 Owner: Thomas Heye Author(s): yark, apmxi


Chapter 332 20F36 – Braid groups; Artin groups
332.1 braid group

Consider two sets of n points, of the form (1, 0), . . . , (n, 0) and of the form (1, 1), . . . , (n, 1). We connect these two sets of points via a collection of paths fi : I → C × R, such that fi (t) ≠ fj (t) for i ≠ j and any t ∈ [0, 1]. Also, each fi may only intersect the planes C × {0} and C × {1} at t = 0 and t = 1 respectively. Thus, the picture looks like a bunch of strings connecting the two sets of points, but possibly tangled. The path f = (f1 , . . . , fn ) determines a homotopy class [f], where we require homotopies to satisfy the same conditions on the fi . Such a homotopy class [f] is called a braid on n strands.

We can obtain a group structure on the set of braids on n strands as follows. Multiplication of two braids [f], [g] is done by simply following f first, then g, but doing each twice as fast. That is, [f] · [g] is the homotopy class of the path

(f g)(t) = f (2t) if 0 ≤ t ≤ 1/2,  g(2t − 1) if 1/2 ≤ t ≤ 1,

where f and g are representatives of the respective homotopy classes. Inverses are given by following the same strands backwards, and the identity element is the braid represented by n straight lines down. The result is known as the braid group on n strands; it is denoted by Bn .

The braid group determines a homomorphism φ : Bn → Sn , where Sn is the symmetric group on n letters. For [f] ∈ Bn , we get an element of Sn from the map sending i → p1 (fi (1)), where f is a representative of the homotopy class and p1 is the projection onto the first factor. This works because of our requirement on the points where the braids start and end, and since our homotopies fix basepoints. The kernel of φ consists of the braids that bring each strand back to its original position. This kernel gives us the pure braid group on n strands, and is denoted by Pn . Hence, we have a short exact sequence

1 → Pn → Bn → Sn → 1.
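Under φ, the generator σi from the presentation below maps to the transposition (i, i + 1), so the braid relations must hold among these transpositions in Sn . The Python sketch below (our own code, permutations as tuples on {0, . . . , n − 1}) checks this:

```python
def transposition(i, n):
    """The permutation of {0,...,n-1} swapping i and i+1."""
    p = list(range(n))
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

def compose(p, q):
    """(p o q)(x) = p(q(x))."""
    return tuple(p[q[x]] for x in range(len(p)))

n = 5
s = [transposition(i, n) for i in range(n - 1)]

# image of sigma_i sigma_{i+1} sigma_i = sigma_{i+1} sigma_i sigma_{i+1}
for i in range(n - 2):
    assert compose(s[i], compose(s[i + 1], s[i])) == \
           compose(s[i + 1], compose(s[i], s[i + 1]))

# image of sigma_i sigma_j = sigma_j sigma_i for |i - j| >= 2
for i in range(n - 1):
    for j in range(n - 1):
        if abs(i - j) >= 2:
            assert compose(s[i], s[j]) == compose(s[j], s[i])
```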

We can also describe braid groups as certain fundamental groups, and in more generality. Let M be a manifold. The configuration space of n ordered points on M is defined to be

Fn (M) = {(a1 , . . . , an ) ∈ M^n | ai ≠ aj for i ≠ j}.

The group Sn acts on Fn (M) by permuting coordinates, and the corresponding quotient space Cn (M) = Fn (M)/Sn is called the configuration space of n unordered points on M. In the case that M = C, we obtain the regular and pure braid groups as π1 (Cn (M)) and π1 (Fn (M)) respectively.

The group Bn can be given the following presentation. The presentation was given in Artin's first paper [1] on the braid group. Label the braids 1 through n as before. Let σi be the braid that twists strands i and i + 1, with i passing beneath i + 1. Then the σi generate Bn , and the only relations needed are

σi σj = σj σi for |i − j| ≥ 2, 1 ≤ i, j ≤ n − 1
σi σi+1 σi = σi+1 σi σi+1 for 1 ≤ i ≤ n − 2

The pure braid group has a presentation with generators

aij = σj−1 σj−2 · · · σi+1 σi^2 σi+1^−1 · · · σj−2^−1 σj−1^−1 for 1 ≤ i < j ≤ n

and defining relations

ars aij ars^−1 =
  aij                                            if i < r < s < j or r < s < i < j
  arj aij arj^−1                                 if r < i = s < j
  arj asj aij asj^−1 arj^−1                      if i = r < s < j
  arj asj arj^−1 asj^−1 aij asj arj asj^−1 arj^−1   if r < i < s < j
REFERENCES
1. E. Artin, Theorie der Zöpfe. Abh. Math. Sem. Univ. Hamburg 4 (1925), 42–72. 2. V.L. Hansen, Braids and Coverings. London Mathematical Society Student Texts 18. Cambridge University Press, 1989.

Version: 7 Owner: dublisk Author(s): dublisk


Chapter 333 20F55 – Reflection and Coxeter groups
333.1 cycle

Let S be a set. A cycle is a permutation (bijective function of a set onto itself) for which there exist distinct elements a1 , a2 , . . . , ak of S such that

f (a1 ) = a2 , f (a2 ) = a3 , . . . , f (ak−1 ) = ak , f (ak ) = a1 ,

and f (x) = x for any other element of S. This can also be pictured as

a1 → a2 → a3 → · · · → ak → a1

and x → x for any other element x ∈ S, where → represents the action of f . One of the basic results on symmetric groups says that any permutation can be expressed as a product of disjoint cycles. Version: 6 Owner: drini Author(s): drini
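The disjoint-cycle decomposition mentioned above is easy to compute; the Python sketch below (function name is ours) walks each orbit of a permutation given as a tuple on {0, . . . , n − 1}:

```python
def cycle_decomposition(p):
    """Disjoint cycles of the permutation p, omitting fixed points."""
    seen, cycles = set(), []
    for start in range(len(p)):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:      # follow start -> p(start) -> ... until closed
            seen.add(x)
            cycle.append(x)
            x = p[x]
        if len(cycle) > 1:        # a 1-cycle is a fixed point
            cycles.append(tuple(cycle))
    return cycles

# the permutation 0 -> 1 -> 2 -> 0 together with the transposition 3 <-> 4
assert cycle_decomposition((1, 2, 0, 4, 3)) == [(0, 1, 2), (3, 4)]
```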

333.2

dihedral group

The nth dihedral group Dn is the symmetry group of the regular n-sided polygon. The group consists of n reflections, n − 1 nontrivial rotations, and the identity transformation. Letting ω = exp(2πi/n) denote a primitive nth root of unity, and assuming the polygon is centered at the origin, the rotations Rk , k = 0, . . . , n − 1 (note: R0 denotes the identity) are given by

Rk : z → ω^k z,  z ∈ C,

and the reflections Mk , k = 0, . . . , n − 1 by

Mk : z → ω^k z̄,  z ∈ C.

The abstract group structure is given by

Rk Rl = Rk+l ,  Rk Ml = Mk+l ,
Mk Rl = Mk−l ,  Mk Ml = Rk−l ,

where the addition and subtraction are carried out modulo n. The group can also be described in terms of generators and relations as

(M0 )^2 = (M1 )^2 = (M1 M0 )^n = id.

This means that Dn is a rank-2 Coxeter group. Since the group acts by linear transformations

(x, y) → (x̂, ŷ),  (x, y) ∈ R^2,

there is a corresponding action on polynomials p → p̂, defined by

p̂(x̂, ŷ) = p(x, y),  p ∈ R[x, y].

The polynomials left invariant by all the group transformations form an algebra. This algebra is freely generated by the following two basic invariants:

x^2 + y^2 ,  x^n − (n choose 2) x^{n−2} y^2 + . . . ,

the latter polynomial being the real part of (x + iy)^n . It is easy to check that these two polynomials are invariant. The first polynomial describes the distance of a point from the origin, and this is unaltered by Euclidean reflections through the origin. The second polynomial is unaltered by a rotation through 2π/n radians, and is also invariant with respect to complex conjugation. These two transformations generate the nth dihedral group. Showing that these two invariants polynomially generate the full algebra of invariants is somewhat trickier, and is best done as an application of Chevalley's theorem regarding the invariants of a finite reflection group. Version: 8 Owner: rmilson Author(s): rmilson

Chapter 334 20F65 – Geometric group theory
334.1 groups that act freely on trees are free

Let X be a tree, and Γ a group acting freely and faithfully by graph automorphisms on X. Then Γ is a free group.

Since Γ acts freely on X, the quotient graph X/Γ is well-defined, and X is the universal cover of X/Γ since X is contractible. Thus Γ ≅ π1 (X/Γ). Since any graph is homotopy equivalent to a wedge of circles, and the fundamental group of such a space is free by Van Kampen's theorem, Γ is free.

Version: 3 Owner: bwebste Author(s): bwebste


Chapter 335 20F99 – Miscellaneous
335.1 perfect group

A group G is called perfect if G = [G, G], where [G, G] is the derived subgroup of G, or equivalently, if the abelianization of G is trivial. Version: 1 Owner: bwebste Author(s): bwebste


Chapter 336 20G15 – Linear algebraic groups over arbitrary fields
336.1 Nagao’s theorem

For any integral domain k, the group of n × n invertible matrices with coefficients in k[t] is the amalgamated free product of the invertible matrices over k and the invertible upper triangular matrices over k[t], amalgamated over the invertible upper triangular matrices over k. More compactly,

GLn (k[t]) ≅ GLn (k) ∗B(k) B(k[t]).

Version: 3 Owner: bwebste Author(s): bwebste

336.2

computation of the order of GL(n, Fq )

GL(n, Fq ) is the group of n × n matrices over a finite field Fq with non-zero determinant. Here is a proof that

|GL(n, Fq )| = (q^n − 1)(q^n − q) · · · (q^n − q^{n−1}).

Each element A ∈ GL(n, Fq ) is given by a collection of n linearly independent column vectors over Fq . If one chooses the first column vector of A from (Fq )^n there are q^n choices, but one can't choose the zero vector, since this would make the determinant of A zero. So there are really only q^n − 1 choices. To choose an ith vector from (Fq )^n which is linearly independent from i − 1 already chosen linearly independent vectors {V1 , · · · , Vi−1 }, one must choose a vector not in the span of {V1 , · · · , Vi−1 }. There are q^{i−1} vectors in this span, so the number of choices is clearly q^n − q^{i−1}. Thus the number of linearly independent collections of n vectors in (Fq )^n is (q^n − 1)(q^n − q) · · · (q^n − q^{n−1}). Version: 5 Owner: benjaminfjones Author(s): benjaminfjones
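The counting argument is easy to confirm for small cases. The Python sketch below (our own code) evaluates the formula and checks it against a brute-force count of invertible 2 × 2 matrices over F2 :

```python
from itertools import product

def order_gl(n, q):
    """(q^n - 1)(q^n - q) ... (q^n - q^{n-1})."""
    result = 1
    for i in range(n):
        result *= q**n - q**i
    return result

# brute force over F_2, n = 2: a matrix is invertible iff det != 0 mod 2
count = sum(1 for a, b, c, d in product(range(2), repeat=4)
            if (a * d - b * c) % 2 != 0)
assert count == order_gl(2, 2) == 6
```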

336.3

general linear group

Given a vector space V , the general linear group GL(V ) is defined to be the group of invertible linear transformations from V to V . The group operation is defined by composition: given T : V −→ V and T′ : V −→ V in GL(V ), the product T T′ is just the composition of the maps T and T′ . If V = F^n for some field F, then the group GL(V ) is often denoted GL(n, F) or GLn (F). In this case, if one identifies each linear transformation T : V −→ V with its matrix with respect to the standard basis, the group GL(n, F) becomes the group of invertible n × n matrices with entries in F, under the group operation of matrix multiplication. Version: 3 Owner: djao Author(s): djao

336.4

order of the general linear group over a finite field

GL(n, Fq ) is a finite group when Fq is a finite field with q elements. Furthermore, |GL(n, Fq )| = (q n − 1)(q n − q) · · · (q n − q n−1 ). Version: 16 Owner: benjaminfjones Author(s): benjaminfjones

336.5

special linear group

Given a vector space V , the special linear group SL(V ) is defined to be the subgroup of the general linear group GL(V ) consisting of all invertible linear transformations T : V −→ V in GL(V ) that have determinant 1. If V = Fn for some field F, then the group SL(V ) is often denoted SL(n, F) or SLn (F), and if one identifies each linear transformation with its matrix with respect to the standard basis, then SL(n, F) consists of all n × n matrices with entries in F that have determinant 1. Version: 2 Owner: djao Author(s): djao


Chapter 337 20G20 – Linear algebraic groups over the reals, the complexes, the quaternions
337.1 orthogonal group

Let Q be a non-degenerate symmetric bilinear form over the real vector space Rn . A linear transformation T : V −→ V is said to preserve Q if Q(T x, T y) = Q(x, y) for all vectors x, y ∈ V . The subgroup of the general linear group GL(V ) consisting of all linear transformations that preserve Q is called the orthogonal group with respect to Q, and denoted O(n, Q). If Q is also positive definite (i.e., Q is an inner product), then O(n, Q) is equivalent to the group of invertible linear transformations that preserve the standard inner product on Rn , and in this case it is usually denoted O(n). One can show that a transformation T is in O(n) if and only if T^−1 = T^T (the inverse of T equals the transpose of T ). Version: 2 Owner: djao Author(s): djao


Chapter 338 20G25 – Linear algebraic groups over local fields and their integers
338.1 Ihara’s theorem

Let Γ be a discrete, torsion-free subgroup of SL2 (Qp ) (where Qp is the field of p-adic numbers). Then Γ is free.

[Proof, or a sketch thereof] There exists a (p + 1)-regular tree X on which SL2 (Qp ) acts, with vertex stabilizer SL2 (Zp ) (here, Zp denotes the ring of p-adic integers). Since Zp is compact in its profinite topology, so is SL2 (Zp ). Thus, SL2 (Zp ) ∩ Γ must be compact, discrete and torsion-free. Since compact and discrete implies finite, the only such group is trivial. Thus, Γ acts freely on X. Since groups acting freely on trees are free, Γ is free.

Version: 6 Owner: bwebste Author(s): bwebste


Chapter 339 20G40 – Linear algebraic groups over finite fields
339.1 SL2(F3)

The special linear group over the finite field F3 is represented by SL2 (F3 ) and consists of the 2 × 2 invertible matrices with determinant equal to 1 and whose entries belong to F3 . Version: 6 Owner: drini Author(s): drini, apmxi


Chapter 340 20J06 – Cohomology of groups
340.1 group cohomology

Let G be a group and let M be a (left) G-module. The 0th cohomology group of the G-module M is

H^0 (G, M) = {m ∈ M : σm = m for all σ ∈ G},

which is the set of elements of M which are G-invariant, also denoted by M^G .

A map φ : G → M is said to be a crossed homomorphism (or 1-cocycle) if φ(αβ) = φ(α) + αφ(β) for all α, β ∈ G. If we fix m ∈ M, the map ρ : G → M defined by ρ(α) = αm − m is clearly a crossed homomorphism, said to be principal (or a 1-coboundary). We define the following groups:

Z^1 (G, M) = {φ : G → M : φ is a 1-cocycle}
B^1 (G, M) = {ρ : G → M : ρ is a 1-coboundary}

and the 1st cohomology group of the G-module M is defined to be the quotient group H^1 (G, M) = Z^1 (G, M)/B^1 (G, M).

The following proposition is very useful when trying to compute cohomology groups:

Proposition 1. Let G be a group and let A, B, C be G-modules related by an exact sequence:

0 → A → B → C → 0

Then there is a long exact sequence in cohomology:

0 → H^0 (G, A) → H^0 (G, B) → H^0 (G, C) → H^1 (G, A) → H^1 (G, B) → H^1 (G, C)

In general, the cohomology groups H^n (G, M) can be defined as follows:

Definition 30. Define C^0 (G, M) = M and for n ≥ 1 define the additive group

C^n (G, M) = {φ : G^n → M}.

The elements of C^n (G, M) are called n-cochains. Also, for n ≥ 0 define the nth coboundary homomorphism d^n : C^n (G, M) → C^{n+1} (G, M) by

d^n (f )(g1 , ..., gn+1 ) = g1 · f (g2 , ..., gn+1 )
  + Σ_{i=1}^{n} (−1)^i f (g1 , ..., gi−1 , gi gi+1 , gi+2 , ..., gn+1 )
  + (−1)^{n+1} f (g1 , ..., gn ).

Let Z^n (G, M) = ker d^n for n ≥ 0, the set of n-cocycles. Also, let B^0 (G, M) = 1 and for n ≥ 1 let B^n (G, M) = image d^{n−1} , the set of n-coboundaries. Finally we define the nth cohomology group of G with coefficients in M to be

H^n (G, M) = Z^n (G, M)/B^n (G, M).

REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York. 2. James Milne, Elliptic Curves, online course notes. 3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.

Version: 4 Owner: alozano Author(s): alozano

340.2

stronger Hilbert theorem 90

Let K be a field and let K̄ be an algebraic closure of K. By K̄^+ we denote the abelian group (K̄, +), and similarly K̄^∗ = (K̄ \ {0}, ·) (here the operation is multiplication). Also we let

G_{K̄/K} = Gal(K̄/K)

be the absolute Galois group of K.

Theorem 3 (Hilbert 90). Let K be a field.

1. H^1 (G_{K̄/K}, K̄^+) = 0

2. H^1 (G_{K̄/K}, K̄^∗) = 0

3. If char(K), the characteristic of K, does not divide m (or char(K) = 0), then

H^1 (G_{K̄/K}, μm ) ≅ K^∗/(K^∗)^m

where μm denotes the set of all mth roots of unity.

REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York. 2. J.P. Serre, Local Fields, Springer-Verlag, New York.

Version: 2 Owner: alozano Author(s): alozano


Chapter 341 20J15 – Category of groups
341.1 variety of groups

A variety of groups is the class of groups G such that all elements x1 , . . . , xn ∈ G satisfy a set of equationally defined relations ri (x1 , . . . , xn ) = 1 for all i ∈ I, where I is an index set. For example, abelian groups are a variety defined by the equations {[x1 , x2 ] = 1}, where [x, y] = xyx−1 y −1 . Nilpotent groups of class < c are a variety defined by {[[· · · [[x1 , x2 ], x3 ] · · · ], xc ] = 1}. Analogously, solvable groups of length < c are a variety. Abelian groups are a special case of both of these. Groups of exponent n are a variety, defined by {x1^n = 1}. A variety of groups is a full subcategory of the category of groups, and there is a free group on any set of elements in the variety, which is the usual free group modulo the relations of the variety applied to all elements. This satisfies the usual universal property of the free group with respect to groups in the variety, and the construction is thus adjoint to the forgetful functor to the category of sets. In the variety of abelian groups, we get back the usual free abelian groups. In the variety of groups of exponent n, we get the Burnside groups. Version: 1 Owner: bwebste Author(s): bwebste

Chapter 342 20K01 – Finite abelian groups
342.1 Schinzel’s theorem

Let a ∈ Q, with a not equal to 0, 1, or −1. For any prime p which does not divide the numerator or denominator of a in reduced form, a can be viewed as an element of the multiplicative group (Z/pZ)^∗. Let np be the order of this element in the multiplicative group. Then the set of np over all such primes has finite complement in the set of positive integers. One can generalize this as follows: if K is a number field, choose a ∈ K not zero and not a root of unity. Then for any finite place (discrete valuation) p with vp (a) = 0, we can view a as an element of the residue field at p, and take the order np of this element in the multiplicative group of the residue field. Then the set of np over all such places has finite complement in the set of positive integers. Silverman also generalized this to elliptic curves over number fields. References to come soon. Version: 4 Owner: mathcam Author(s): Manoj, nerdy2
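The exceptional set can be computed directly for a = 2. In the Python sketch below (our own code, with a small search bound), the orders np = ordp(2) over primes p < 2000 already realize every n ≤ 12 except 1 and 6; indeed ordp(2) = 6 would force p | 2^6 − 1 = 63 = 3^2 · 7, but ord3(2) = 2 and ord7(2) = 3:

```python
def primes_below(n):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * n
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, ok in enumerate(sieve) if ok]

def mult_order(a, p):
    """Order of a in (Z/pZ)^*, assuming gcd(a, p) = 1."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

orders = {mult_order(2, p) for p in primes_below(2000) if p != 2}
missing = [n for n in range(1, 13) if n not in orders]
assert missing == [1, 6]
```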


Chapter 343 20K10 – Torsion groups, primary groups and generalized primary groups
343.1

torsion

The torsion of a group G is the set Tor(G) = {g ∈ G : g^n = e for some n ∈ N}. A group is said to be torsion-free if Tor(G) = {e}, i.e. the torsion consists only of the identity element. If G is abelian then Tor(G) is a subgroup (the torsion subgroup) of G.

Example 18 (Torsion of a cyclic group). For a finite cyclic group Zp , Tor(Zp ) = Zp . In general, if G is a finite group then Tor(G) = G.

Version: 2 Owner: mhale Author(s): mhale


Chapter 344 20K25 – Direct sums, direct products, etc.
344.1 direct product of groups

The external direct product G × H of two groups G and H is defined to be the set of ordered pairs (g, h), with g ∈ G and h ∈ H. The group operation is defined by

(g, h)(g′, h′) = (gg′, hh′).

It can be shown that G × H obeys the group axioms. More generally, we can define the external direct product of n groups in the obvious way. Let G = G1 × . . . × Gn be the set of all ordered n-tuples {(g1 , g2 , . . . , gn ) | gi ∈ Gi } and define the group operation by componentwise multiplication as before. Version: 4 Owner: vitriol Author(s): vitriol
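As a quick illustration, the direct product Z/2Z × Z/3Z with componentwise addition is cyclic of order 6, in accordance with the Chinese remainder theorem. A Python sketch (our own code):

```python
from itertools import product

# elements of Z/2 x Z/3 as ordered pairs, componentwise addition
G = list(product(range(2), range(3)))
op = lambda x, y: ((x[0] + y[0]) % 2, (x[1] + y[1]) % 3)

def order(g):
    """Smallest n >= 1 with n.g = identity."""
    n, x = 1, g
    while x != (0, 0):
        x = op(x, g)
        n += 1
    return n

assert len(G) == 6
assert order((1, 1)) == 6          # a generator, so the product is cyclic
```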


Chapter 345 20K99 – Miscellaneous
345.1 Klein 4-group

The Klein 4-group is the subgroup V (Vierergruppe) of S4 (see symmetric group) consisting of the following 4 permutations:

(), (12)(34), (13)(24), (14)(23)

(see cycle notation). This is an abelian group, isomorphic to the product Z/2Z × Z/2Z. The group is named after Felix Klein, a pioneering figure in the field of geometric group theory. The Klein 4-group enjoys a number of interesting properties, some of which are listed below.

1. It is the automorphism group of the graph consisting of two disjoint edges.

2. It is the unique 4-element group in which every element is its own inverse.

3. It is the symmetry group of a planar ellipse.

4. Consider the action of S4 , the permutation group of 4 elements, on the set of partitions of {1, 2, 3, 4} into two pairs. There are 3 such partitions, which we denote by (12, 34), (13, 24), (14, 23). The action of S4 on these partitions induces a homomorphism from S4 to S3 ; the kernel is the Klein 4-group. This homomorphism is quite exceptional, and corresponds to the fact that A4 (the alternating group) is not a simple group (notice that V is actually a subgroup of A4 ). The alternating groups An for n ≥ 5 are simple.

5. A more geometric way to see the above is the following: S4 is the group of symmetries of a tetrahedron. There is an induced action of S4 on the six edges of the tetrahedron. Observing that this action preserves incidence relations, one gets an action of S4 on the three pairs of opposite edges (see figure).

6. It is the symmetry group of the Riemannian curvature tensor.

[Figure: a tetrahedron with vertices labeled 1 through 4.]
Version: 7 Owner: rmilson Author(s): Dr Absentius, rmilson, imran

345.2

divisible group

An abelian group D is said to be divisible if for any x ∈ D and n ∈ Z+ , there exists an element y ∈ D such that ny = x. Some noteworthy facts: • An abelian group is injective (as a Z-module) if and only if it is divisible. • Every abelian group is isomorphic to a subgroup of a divisible group. • Any divisible abelian group is isomorphic to the direct sum of its torsion subgroup and n copies of the group of rationals (for some cardinal number n). Version: 4 Owner: mathcam Author(s): mathcam

345.3

example of divisible group

Let G denote the group of rational numbers, taking the operation to be addition. Then for any p/q ∈ G and n ∈ Z+ , we have p/(nq) ∈ G satisfying n · (p/(nq)) = p/q, so the group is divisible. Version: 1 Owner: mathcam Author(s): mathcam

345.4

locally cyclic group

A locally cyclic (or generalized cyclic) group is a group in which any pair of elements generates a cyclic subgroup.

Every locally cyclic group is abelian. If G is a locally cyclic group, then every finite subset of G generates a cyclic subgroup. Therefore, the only finitely-generated locally cyclic groups are the cyclic groups themselves. The group (Q, +) is an example of a locally cyclic group that is not cyclic. Subgroups and quotients of locally cyclic groups are also locally cyclic. A group is locally cyclic if and only if its lattice of subgroups is distributive. Version: 10 Owner: yark Author(s): yark


Chapter 346 20Kxx – Abelian groups
346.1 abelian group

Let (G, ∗) be a group. If for any a, b ∈ G we have a ∗ b = b ∗ a, we say that the group is abelian. Sometimes the expression commutative group is used, but this is less frequent. Abelian groups have several interesting properties.

Theorem 4. If ϕ : G → G defined by ϕ(x) = x^2 is a homomorphism, then G is abelian.

Proof. If such a function were a homomorphism, we would have (xy)^2 = ϕ(xy) = ϕ(x)ϕ(y) = x^2 y^2 , that is, xyxy = xxyy. Left-multiplying by x−1 and right-multiplying by y −1 we are led to yx = xy, and thus the group is abelian. QED

Theorem 5. Any subgroup of an abelian group is normal.

Proof. Let H be a subgroup of the abelian group G. Since ah = ha for any a ∈ G and any h ∈ H, we get aH = Ha. That is, H is normal in G. QED

Theorem 6. Quotient groups of abelian groups are also abelian.

Proof. Let H be a subgroup of G. Since G is abelian, H is normal and we can form the quotient group G/H, whose elements are the equivalence classes for a ∼ b if ab−1 ∈ H. The operation on the quotient group is given by aH · bH = (ab)H. But bH · aH = (ba)H = (ab)H, therefore the quotient group is also commutative. QED

Version: 12 Owner: drini Author(s): drini, yark, akrowne, apmxi

Chapter 347 20M10 – General structure theory
347.1 existence of maximal semilattice decomposition

Let S be a semigroup. A maximal semilattice decomposition for S is a surjective homomorphism φ : S → Γ onto a semilattice Γ with the property that any other semilattice decomposition factors through φ. So if φ′ : S → Γ′ is any other semilattice decomposition of S, then there is a homomorphism ψ : Γ → Γ′ such that φ′ = ψ ◦ φ.

Proposition 14. Every semigroup has a maximal semilattice decomposition.

Recall that each semilattice decomposition determines a semilattice congruence. If {ρi | i ∈ I} is the family of all semilattice congruences on S, then define ρ to be the intersection of the ρi over i ∈ I. (Here, we consider the congruences as subsets of S × S, and take their intersection as sets.)

It is easy to see that ρ is also a semilattice congruence, which is contained in all other semilattice congruences. Therefore each of the homomorphisms S → S/ρi factors through S → S/ρ. Version: 2 Owner: mclase Author(s): mclase


347.2 semilattice decomposition of a semigroup

A semigroup S has a semilattice decomposition if we can write S = ⋃_{γ∈Γ} Sγ as a disjoint union of subsemigroups, indexed by elements of a semilattice Γ, with the additional condition that x ∈ Sα and y ∈ Sβ implies xy ∈ Sαβ.

Semilattice decompositions arise from homomorphisms of semigroups onto semilattices. If φ : S → Γ is a surjective homomorphism, then it is easy to see that we get a semilattice decomposition by putting Sγ = φ^{-1}(γ) for each γ ∈ Γ. Conversely, every semilattice decomposition defines a map from S to the indexing set Γ which is easily seen to be a homomorphism.

A third way to look at semilattice decompositions is to consider the congruence ρ defined by the homomorphism φ : S → Γ. Because Γ is a semilattice, φ(x^2) = φ(x) for all x, and so ρ satisfies the constraint that x ρ x^2 for all x ∈ S. Also, φ(xy) = φ(yx), so that xy ρ yx for all x, y ∈ S. A congruence ρ which satisfies these two conditions is called a semilattice congruence. Conversely, a semilattice congruence ρ on S gives rise to a homomorphism from S to a semilattice S/ρ. The ρ-classes are the components of the decomposition.

Version: 3 Owner: mclase Author(s): mclase

347.3 simple semigroup

Let S be a semigroup. If S has no ideals other than itself, then S is said to be simple. If S has no left ideals [resp. right ideals] other than itself, then S is said to be left simple [resp. right simple]. Left simple and right simple are stronger conditions than simple.

A semigroup S is left simple if and only if Sa = S for all a ∈ S. A semigroup is both left and right simple if and only if it is a group.

If S has a zero element θ, then 0 = {θ} is always an ideal of S, so S is not simple (unless it has only one element). So in studying semigroups with a zero, a slightly weaker definition is required. Let S be a semigroup with a zero. Then S is zero simple, or 0-simple, if the following conditions hold:

• S^2 ≠ 0

• S has no ideals except 0 and S itself

The condition S^2 ≠ 0 really only eliminates one semigroup: the 2-element null semigroup. Excluding this semigroup makes parts of the structure theory of semigroups cleaner. Version: 1 Owner: mclase Author(s): mclase
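The criterion Sa = S for left simplicity can be seen in a small example. The following Python sketch (the three-element left zero semigroup, with product xy = x, is an illustrative choice of ours) checks that such a semigroup is left simple but not right simple.

```python
# The left zero semigroup on {0, 1, 2}: xy = x for all x, y.
S = [0, 1, 2]
op = lambda x, y: x

# Sa = {xa : x in S} and aS = {ax : x in S} for each a
left_translates = {a: {op(x, a) for x in S} for a in S}
right_translates = {a: {op(a, x) for x in S} for a in S}

assert all(t == set(S) for t in left_translates.values())   # Sa = S: left simple
assert all(t == {a} for a, t in right_translates.items())   # aS = {a}: not right simple
```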


Chapter 348 20M12 – Ideal theory
348.1 Rees factor

Let I be an ideal of a semigroup S. Define a congruence ∼ by x ∼ y iff x = y or x, y ∈ I. Then the Rees factor of S by I is the quotient S/∼. As a matter of notation, the congruence ∼ is normally suppressed, and the quotient is simply written S/I. Note that a Rees factor always has a zero element. Intuitively, the quotient identifies all elements of I, and the resulting element is a zero element. Version: 1 Owner: mclase Author(s): mclase
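A small computational sketch may help. Below (in Python; the semigroup of residues mod 6 under multiplication and the ideal {0, 3} are illustrative choices of ours) the ideal is collapsed to a single zero class, exactly as in the Rees congruence.

```python
# S = ({0,...,5}, multiplication mod 6), I = {0, 3} an ideal of S.
S = range(6)
I = {0, 3}
op = lambda x, y: (x * y) % 6

# check I is an ideal: sx and xs stay in I
assert all(op(s, x) in I and op(x, s) in I for s in S for x in I)

ZERO = "theta"                               # the collapsed class of I
cls = lambda x: ZERO if x in I else x        # Rees congruence classes
# quotient product; any representative of the zero class works, so use 0
quot_op = lambda a, b: cls(op(0 if a == ZERO else a, 0 if b == ZERO else b))

assert quot_op(2, 5) == 4        # 2*5 = 10 = 4 mod 6, not in I
assert quot_op(2, ZERO) == ZERO  # the class of I is a zero element
```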

348.2 ideal

Let S be a semigroup. An ideal of S is a non-empty subset of S which is closed under multiplication on either side by elements of S. Formally, I is an ideal of S if I is non-empty, and for all x ∈ I and s ∈ S, we have sx ∈ I and xs ∈ I.

One-sided ideals are defined similarly. A non-empty subset A of S is a left ideal (resp. right ideal) of S if for all a ∈ A and s ∈ S, we have sa ∈ A (resp. as ∈ A).

A principal left ideal of S is a left ideal generated by a single element. If a ∈ S, then the principal left ideal of S generated by a is S^1 a = Sa ∪ {a}. (The notation S^1 is explained here.) Similarly, the principal right ideal generated by a is aS^1 = aS ∪ {a}.

The notations L(a) and R(a) are also common for the principal left and right ideals generated by a respectively.

A principal ideal of S is an ideal generated by a single element. The ideal generated by a is S^1 a S^1 = SaS ∪ Sa ∪ aS ∪ {a}. The notation J(a) = S^1 a S^1 is also common.

Version: 5 Owner: mclase Author(s): mclase
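Principal ideals are easy to compute in a finite example. The sketch below (Python; the commutative semigroup of residues mod 6 under multiplication is an illustrative choice of ours, so Sa ∪ {a} already gives the two-sided principal ideal) computes J(a) for a few generators.

```python
# S = ({0,...,5}, multiplication mod 6), a commutative semigroup.
S = range(6)
op = lambda x, y: (x * y) % 6

def principal_ideal(a):
    """J(a) = SaS u Sa u aS u {a}; S is commutative here, so Sa u {a} suffices."""
    return {op(s, a) for s in S} | {a}

assert principal_ideal(2) == {0, 2, 4}
assert principal_ideal(3) == {0, 3}
assert principal_ideal(1) == set(S)   # a unit generates the whole semigroup
```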


Chapter 349 20M14 – Commutative semigroups
349.1 Archimedean semigroup

Let S be a commutative semigroup. We say an element x divides an element y, written x | y, if there is an element z such that xz = y. An Archimedean semigroup S is a commutative semigroup with the property that for all x, y ∈ S there is a natural number n such that x | y^n.

This is related to the Archimedean property of positive real numbers R+: if x, y > 0 then there is a natural number n such that x < ny. Except that the notation is additive rather than multiplicative, this is the same as saying that (R+, +) is an Archimedean semigroup.

Version: 1 Owner: mclase Author(s): mclase
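The additive analogy can be made concrete. In the sketch below (Python; the choice of (positive integers, +) as the semigroup is ours), "x divides y" in additive notation means x + z = y for some positive z, i.e. y > x, and the Archimedean condition asks for n with x | ny.

```python
from itertools import count

# (positive integers, +) as an Archimedean semigroup, additive notation.
def divides_additive(x, y):
    """x | y additively: there is z > 0 in the semigroup with x + z = y."""
    return y - x > 0

def archimedean_n(x, y):
    """Least n with x | n*y (the additive reading of x | y^n)."""
    return next(n for n in count(1) if divides_additive(x, n * y))

assert archimedean_n(7, 2) == 4     # 4*2 = 8 > 7
assert archimedean_n(100, 1) == 101
```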

349.2 commutative semigroup

A semigroup S is commutative if the defining binary operation is commutative. That is, for all x, y ∈ S, the identity xy = yx holds. Although the term Abelian semigroup is sometimes used, it is more common simply to refer to such semigroups as commutative semigroups. A monoid which is also a commutative semigroup is called a commutative monoid. Version: 1 Owner: mclase Author(s): mclase


Chapter 350 20M20 – Semigroups of transformations, etc.
350.1 semigroup of transformations

Let X be a set. A transformation of X is a function from X to X. If α and β are transformations on X, then their product αβ is defined (writing functions on the right) by (x)(αβ) = ((x)α)β. With this definition, the set of all transformations on X becomes a semigroup, the full semigroup of transformations on X, denoted T_X. More generally, a semigroup of transformations is any subsemigroup of a full semigroup of transformations. When X is finite, say X = {x1, x2, ..., xn}, then the transformation α which maps xi to yi (with yi ∈ X, of course) is often written:

α = ( x1 x2 ... xn )
    ( y1 y2 ... yn )

With this notation it is quite easy to calculate products. For example, if X = {1, 2, 3, 4}, then

( 1 2 3 4 ) ( 1 2 3 4 )  =  ( 1 2 3 4 )
( 3 2 1 2 ) ( 2 3 3 4 )     ( 3 3 2 3 )

When X is infinite, say X = {1, 2, 3, ...}, then this notation is still useful for illustration in cases where the transformation pattern is apparent. For example, if α ∈ T_X is given by


α : n → n + 1, we can write

α = ( 1 2 3 4 ... )
    ( 2 3 4 5 ... )
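The worked product in this entry can be verified mechanically. In the Python sketch below (our own representation: a transformation on {1, ..., n} is a tuple whose (i-1)-th entry is the image of i), the product follows the entry's right-action convention (x)(αβ) = ((x)α)β.

```python
# Transformations on X = {1,2,3,4} as tuples: t[i-1] is the image of i.
def compose(a, b):
    """Product ab with functions acting on the right: (x)(ab) = ((x)a)b."""
    return tuple(b[a[i] - 1] for i in range(len(a)))

alpha = (3, 2, 1, 2)  # 1->3, 2->2, 3->1, 4->2
beta  = (2, 3, 3, 4)  # 1->2, 2->3, 3->3, 4->4

print(compose(alpha, beta))  # (3, 3, 2, 3), matching the worked product
```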

Version: 3 Owner: mclase Author(s): mclase


Chapter 351 20M30 – Representation of semigroups; actions of semigroups on sets
351.1 counting theorem

Given a group action of a finite group G on a set X, the number of distinct orbits is given by

(1/|G|) Σ_{g∈G} stab_g(X),

where stab_g(X) is the number of elements of X fixed by the action of g.

Version: 8 Owner: mathcam Author(s): Larry Hammick, vitriol
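The formula is easy to test on a classical example. The sketch below (Python; the cyclic group of rotations acting on binary colorings of 4 beads, i.e. binary necklaces, is an illustrative choice of ours) compares the counting-theorem value against a direct enumeration of orbits.

```python
from itertools import product

# C_4 acting on binary colorings of 4 beads by rotation.
def rotate(coloring, k):
    """Rotate a tuple coloring left by k positions."""
    n = len(coloring)
    return tuple(coloring[(i + k) % n] for i in range(n))

colorings = list(product([0, 1], repeat=4))
group = range(4)   # rotations by 0, 1, 2, 3

# counting theorem: (1/|G|) * sum over g of |Fix(g)|
fixed_counts = [sum(1 for c in colorings if rotate(c, k) == c) for k in group]
burnside = sum(fixed_counts) // len(group)

# direct count of distinct orbits
orbits = {frozenset(rotate(c, k) for k in group) for c in colorings}

print(burnside, len(orbits))  # 6 6: the number of binary necklaces of length 4
```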

351.2 example of group action

Let a, b, c be integers and let [a, b, c] denote the mapping

[a, b, c] : Z × Z → Z, (x, y) → ax^2 + bxy + cy^2.

Let G be the group of 2 × 2 integer matrices A = (a_ij) such that det A = ±1. The substitution (x, y)^t → A · (x, y)^t leads to

[a, b, c](a11 x + a12 y, a21 x + a22 y) = a′x^2 + b′xy + c′y^2,

where

a′ = a · a11^2 + b · a11 a21 + c · a21^2
b′ = 2a · a11 a12 + 2c · a21 a22 + b (a11 a22 + a12 a21)     (351.2.1)
c′ = a · a12^2 + b · a12 a22 + c · a22^2

So we define [a, b, c] ∗ A := [a′, b′, c′], the binary quadratic form with coefficients a′, b′, c′ of x^2, xy, y^2, respectively, as in (351.2.1). Putting in A = (1 0; 0 1) we have [a, b, c] ∗ A = [a, b, c] for any binary quadratic form [a, b, c].

Now let B be another matrix in G. We must show that [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B. Set [a, b, c] ∗ (AB) := [a″, b″, c″]. So we have

a″ = a · (a11 b11 + a12 b21)^2 + b · (a11 b11 + a12 b21)(a21 b11 + a22 b21) + c · (a21 b11 + a22 b21)^2     (351.2.2)
   = a′ · b11^2 + b′ · b11 b21 + c′ · b21^2

c″ = a · (a11 b12 + a12 b22)^2 + b · (a11 b12 + a12 b22)(a21 b12 + a22 b22) + c · (a21 b12 + a22 b22)^2     (351.2.3)
   = a′ · b12^2 + b′ · b12 b22 + c′ · b22^2

as desired. For the coefficient b″ we get

b″ = 2a · (a11 b11 + a12 b21)(a11 b12 + a12 b22) + 2c · (a21 b11 + a22 b21)(a21 b12 + a22 b22)
     + b · ((a11 b11 + a12 b21)(a21 b12 + a22 b22) + (a11 b12 + a12 b22)(a21 b11 + a22 b21)),

and by evaluating the factors of b11 b12, b21 b22, and b11 b22 + b21 b12, it can be checked that

b″ = 2a′ · b11 b12 + 2c′ · b21 b22 + b′ · (b11 b22 + b21 b12).

This shows that

[a″, b″, c″] = [a′, b′, c′] ∗ B     (351.2.4)

and therefore [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B. Thus, (351.2.1) defines an action of G on the set of (integer) binary quadratic forms. Furthermore, the discriminant of each quadratic form in the orbit of [a, b, c] under G is b^2 − 4ac.

Version: 5 Owner: Thomas Heye Author(s): Thomas Heye
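The compatibility computation can be spot-checked numerically. The Python sketch below implements the coefficient formulas (351.2.1) directly (the particular form and matrices used for the check are our own choices) and also confirms that the discriminant b^2 − 4ac is constant on the orbit.

```python
# Action of 2x2 integer matrices on binary quadratic forms [a, b, c].
def act(form, A):
    a, b, c = form
    (a11, a12), (a21, a22) = A
    # the coefficient formulas of (351.2.1)
    return (a*a11**2 + b*a11*a21 + c*a21**2,
            2*a*a11*a12 + 2*c*a21*a22 + b*(a11*a22 + a12*a21),
            a*a12**2 + b*a12*a22 + c*a22**2)

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

form = (2, 1, 3)
A = ((1, 1), (0, 1))    # det 1
B = ((0, 1), (-1, 0))   # det 1

assert act(form, matmul(A, B)) == act(act(form, A), B)   # compatibility
disc = lambda f: f[1]**2 - 4*f[0]*f[2]
assert disc(act(form, A)) == disc(form)                  # discriminant invariant
```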

351.3 group action

Let G be a group and let X be a set. A left group action is a function · : G × X −→ X such that:

1. 1G · x = x for all x ∈ X
2. (g1 g2) · x = g1 · (g2 · x) for all g1, g2 ∈ G and x ∈ X

A right group action is a function · : X × G −→ X such that:

1. x · 1G = x for all x ∈ X
2. x · (g1 g2) = (x · g1) · g2 for all g1, g2 ∈ G and x ∈ X

There is a correspondence between left actions and right actions, given by associating the right action x · g with the left action g · x := x · g^{-1}. In many (but not all) contexts, it is useful to identify right actions with their corresponding left actions, and speak only of left actions.

Special types of group actions

A left action is said to be effective, or faithful, if the function x → g·x is the identity function on X only when g = 1G.

A left action is said to be transitive if, for every x1, x2 ∈ X, there exists a group element g ∈ G such that g · x1 = x2.

A left action is free if, for every x ∈ X, the only element of G that stabilizes x is the identity; that is, g · x = x implies g = 1G.

Faithful, transitive, and free right actions are defined similarly.

Version: 3 Owner: djao Author(s): djao
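Both axioms can be checked exhaustively on a small example. The sketch below (Python; the natural action of S_3 on {0, 1, 2} is an illustrative choice of ours) verifies the two left-action axioms and, as a bonus, the orbit-stabilizer relation |orbit| · |stabilizer| = |G| for one point.

```python
from itertools import permutations

# S_3 acting on X = {0, 1, 2} by g . x = g(x), permutations as tuples.
G = list(permutations(range(3)))
act = lambda g, x: g[x]

identity = (0, 1, 2)
comp = lambda g, h: tuple(g[h[i]] for i in range(3))   # (g h)(x) = g(h(x))

assert all(act(identity, x) == x for x in range(3))            # axiom 1
assert all(act(comp(g, h), x) == act(g, act(h, x))             # axiom 2
           for g in G for h in G for x in range(3))

orbit_0 = {act(g, 0) for g in G}          # this action is transitive
stab_0 = [g for g in G if act(g, 0) == 0]
assert orbit_0 == {0, 1, 2}
assert len(orbit_0) * len(stab_0) == len(G)   # orbit-stabilizer relation
```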

351.4 orbit

Let G be a group, X a set, and · : G × X −→ X a group action. For any x ∈ X, the orbit of x under the group action is the set {g · x | g ∈ G} ⊂ X. Version: 2 Owner: djao Author(s): djao

351.5 proof of counting theorem

Let N be the cardinality of the set of all pairs (g, x) such that g · x = x. For each g ∈ G, there exist stab_g(X) pairs with g as the first element, while for each x, there are |G_x| pairs with x as the second element. Hence the following equality holds:

N = Σ_{g∈G} stab_g(X) = Σ_{x∈X} |G_x|.

From the orbit-stabilizer theorem it follows that:

N = |G| Σ_{x∈X} 1/|G(x)|.

Since all the x belonging to the same orbit G(x) contribute |G(x)| · (1/|G(x)|) = 1 to the sum, Σ_{x∈X} 1/|G(x)| precisely equals the number of distinct orbits s. We therefore have

Σ_{g∈G} stab_g(X) = |G| · s,

which proves the theorem.

Version: 2 Owner: n3o Author(s): n3o

351.6 stabilizer

Let G be a group, X a set, and · : G × X −→ X a group action. For any subset S of X, the stabilizer of S, denoted Stab(S), is the subgroup Stab(S) := {g ∈ G | g · s ∈ S for all s ∈ S}. The stabilizer of a single point x in X is often denoted Gx. Version: 3 Owner: djao Author(s): djao


Chapter 352 20M99 – Miscellaneous
352.1 a semilattice is a commutative band

This note explains how a semilattice is the same as a commutative band.

Let S be a semilattice, with partial order ≤ and each pair of elements x and y having a greatest lower bound x ∧ y. Then it is easy to see that the operation ∧ defines a binary operation on S which makes it a commutative semigroup, and that every element is idempotent since x ∧ x = x.

Conversely, if S is such a semigroup, define x ≤ y iff x = xy. Again, it is easy to see that this defines a partial order on S, that greatest lower bounds exist with respect to this partial order, and that in fact x ∧ y = xy.

Version: 3 Owner: mclase Author(s): mclase
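A concrete instance may help. In the Python sketch below (the divisors of 12 under gcd are an illustrative choice of ours), gcd is a commutative band, and the derived order x ≤ y iff x = xy recovers divisibility, under which gcd is indeed the greatest lower bound.

```python
from math import gcd

# gcd on the divisors of 12 is a commutative band.
S = [1, 2, 3, 4, 6, 12]

assert all(gcd(x, x) == x for x in S)                      # idempotent
assert all(gcd(x, y) == gcd(y, x) for x in S for y in S)   # commutative

# the derived partial order x <= y iff x = xy, here x == gcd(x, y)
leq = lambda x, y: x == gcd(x, y)
# this order is exactly divisibility
assert all(leq(x, y) == (y % x == 0) for x in S for y in S)
```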

352.2 adjoining an identity to a semigroup

It is possible to formally adjoin an identity element to any semigroup to make it into a monoid. Suppose S is a semigroup without an identity, and consider the set S ∪ {1} where 1 is a symbol not in S. Extend the semigroup operation from S to S ∪ {1} by additionally defining:

s · 1 = s = 1 · s, for all s ∈ S ∪ {1}

It is easy to verify that this defines a semigroup (associativity is the only thing that needs to be checked).


As a matter of notation, it is customary to write S^1 for the semigroup S with an identity adjoined in this manner, if S does not already have one, and to agree that S^1 = S, if S does already have an identity.

Despite the simplicity of this construction, however, it rarely allows one to simplify a problem by considering monoids instead of semigroups. As soon as one starts to look at the structure of the semigroup, it is almost invariably the case that one needs to consider subsemigroups and ideals of the semigroup which do not contain the identity.

Version: 2 Owner: mclase Author(s): mclase
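The construction is mechanical enough to sketch in code. Below (Python; the helper name and the choice of a left zero semigroup as the example are ours), a fresh symbol is adjoined and the product extended exactly as in the entry.

```python
# Adjoin an identity to a semigroup given as (elements, product function).
def adjoin_identity(elements, op):
    """Return (elements of S^1, extended product, the new identity)."""
    one = object()   # a fresh symbol not in S
    def op1(x, y):
        if x is one:
            return y
        if y is one:
            return x
        return op(x, y)
    return list(elements) + [one], op1, one

# Example: the left zero semigroup on {"a", "b"} (xy = x) has no identity.
S1, op1, one = adjoin_identity(["a", "b"], lambda x, y: x)
assert op1("a", "b") == "a"            # the old product is unchanged
assert op1(one, "b") == "b" and op1("a", one) == "a"   # 1 is a two-sided identity
```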

352.3 band

A band is a semigroup in which every element is idempotent. A commutative band is called a semilattice. Version: 1 Owner: mclase Author(s): mclase

352.4 bicyclic semigroup

The bicyclic semigroup C(p, q) is the monoid generated by {p, q} with the single relation pq = 1. The elements of C(p, q) are all words of the form q^n p^m for m, n ≥ 0 (with the understanding that p^0 = q^0 = 1). These words are multiplied as follows:

q^n p^m q^k p^l = { q^{n+k−m} p^l    if m ≤ k,
                  { q^n p^{l+m−k}    if m ≥ k.

It is apparent that C(p, q) is simple, for if q^n p^m is an element of C(p, q), then 1 = p^n (q^n p^m) q^m and so S^1 q^n p^m S^1 = S.

It is useful to picture some further properties of C(p, q) by arranging the elements in a table:

1     p       p^2      p^3      p^4      ...
q     qp      qp^2     qp^3     qp^4     ...
q^2   q^2 p   q^2 p^2  q^2 p^3  q^2 p^4  ...
q^3   q^3 p   q^3 p^2  q^3 p^3  q^3 p^4  ...
q^4   q^4 p   q^4 p^2  q^4 p^3  q^4 p^4  ...
...   ...     ...      ...      ...

Then the elements below any horizontal line drawn through this table form a right ideal and the elements to the right of any vertical line form a left ideal. Further, the elements on the diagonal are all idempotents and their standard ordering is

1 > qp > q^2 p^2 > q^3 p^3 > · · · .

Version: 3 Owner: mclase Author(s): mclase
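The multiplication rule is easy to implement. In the Python sketch below, an element q^n p^m is represented by the exponent pair (n, m) (this bookkeeping is ours, not notation from the entry).

```python
# q^n p^m represented as the pair (n, m); multiplication per the entry's rule.
def mult(a, b):
    n, m = a
    k, l = b
    if m <= k:
        return (n + k - m, l)
    return (n, l + m - k)

p, q, one = (0, 1), (1, 0), (0, 0)
assert mult(p, q) == one          # the defining relation pq = 1
assert mult(q, p) == (1, 1)       # but qp is the idempotent q p, not 1
assert mult((2, 3), (1, 2)) == (2, 4)   # q^2 p^3 . q p^2 = q^2 p^4
```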

352.5 congruence

Let S be a semigroup. An equivalence relation ∼ defined on S is called a congruence if it is preserved under the semigroup operation. That is, for all x, y, z ∈ S, if x ∼ y then xz ∼ yz and zx ∼ zy. If ∼ satisfies only x ∼ y implies xz ∼ yz (resp. zx ∼ zy) then ∼ is called a right congruence (resp. left congruence). Example 19. Suppose f : S → T is a semigroup homomorphism. Define ∼ by x ∼ y iff f (x) = f (y). Then it is easy to see that ∼ is a congruence. If ∼ is a congruence, defined on a semigroup S, write [x] for the equivalence class of x under ∼. Then it is easy to see that [x] · [y] = [xy] is a well-defined operation on the set of equivalence classes, and that in fact this set becomes a semigroup with this operation. This semigroup is called the quotient of S by ∼ and is written S/ ∼. Thus semigroup congruences are related to homomorphic images of semigroups in the same way that normal subgroups are related to homomorphic images of groups. More precisely, in the group case, the congruence is the coset relation, rather than the normal subgroup itself. Version: 3 Owner: mclase Author(s): mclase
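Example 19 can be checked directly in a small case. In the Python sketch below (the homomorphism f : (Z_12, +) → (Z_4, +), f(x) = x mod 4, is an illustrative choice of ours), the relation x ∼ y iff f(x) = f(y) is verified to be preserved by the operation.

```python
# The congruence induced by the homomorphism f(x) = x mod 4 on (Z_12, +).
S = range(12)
add = lambda x, y: (x + y) % 12
f = lambda x: x % 4
cong = lambda x, y: f(x) == f(y)

# x ~ y implies xz ~ yz and zx ~ zy
assert all(cong(add(x, z), add(y, z)) and cong(add(z, x), add(z, y))
           for x in S for y in S for z in S if cong(x, y))

# the quotient S/~ has one element per value of f
classes = {f(x) for x in S}
assert classes == {0, 1, 2, 3}
```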

352.6 cyclic semigroup

A semigroup which is generated by a single element is called a cyclic semigroup. Let S = ⟨x⟩ be a cyclic semigroup. Then as a set, S = {x^n | n > 0}. If all powers of x are distinct, then S = {x, x^2, x^3, ...} is (countably) infinite. Otherwise, there is a least integer n > 0 such that x^n = x^m for some m < n. It is clear then that the elements x, x^2, ..., x^{n−1} are distinct, but that for any j ≥ n, we must have x^j = x^i for some i with m ≤ i ≤ n − 1. So S has n − 1 elements.

Unlike in the group case, however, there are in general multiple non-isomorphic cyclic semigroups with the same number of elements. In fact, there are t non-isomorphic cyclic semigroups with t elements: these correspond to the different choices of m in the above (with n = t + 1).

The integer m is called the index of S, and r = n − m is called the period of S. The elements K = {x^m, x^{m+1}, ..., x^{n−1}} form a subsemigroup of S. In fact, K is a cyclic group.

A concrete representation of the semigroup with index m and period r as a semigroup of transformations can be obtained as follows. Let X = {1, 2, 3, ..., m + r}. Let

φ = ( 1 2 3 ... m+r−1 m+r )
    ( 2 3 4 ... m+r   m+1 )

Then φ generates a subsemigroup S of the full semigroup of transformations T_X, and S is cyclic with index m and period r.

Version: 3 Owner: mclase Author(s): mclase
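The index and period can be computed by brute force. The sketch below (Python; the concrete values m = 2, r = 3 are our own choice) builds the transformation i → i+1 on {1, ..., m+r}, with m+r wrapping back into the cycle, lists its distinct powers, and recovers the index and period.

```python
# Cyclic transformation semigroup with index m = 2 and period r = 3.
def compose(a, b):
    """Right-action product: (x)(ab) = ((x)a)b, tuples 1-indexed by position."""
    return tuple(b[a[i] - 1] for i in range(len(a)))

m, r = 2, 3
phi = tuple(i + 2 if i + 2 <= m + r else m + 1 for i in range(m + r))
# phi == (2, 3, 4, 5, 3): 1->2, 2->3, 3->4, 4->5, 5->3

powers = [phi]                       # phi^1, phi^2, ...
while compose(powers[-1], phi) not in powers:
    powers.append(compose(powers[-1], phi))

index = powers.index(compose(powers[-1], phi)) + 1   # first repeated power
period = len(powers) - index + 1

print(index, period)  # 2 3
```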

352.7 idempotent

An element x of a ring is called an idempotent element, or simply an idempotent, if x^2 = x.

The set of idempotents of a ring can be partially ordered by putting e ≤ f iff e = ef = fe. The element 0 is a minimum element in this partial order. If the ring has an identity element, 1, then 1 is a maximum element in this partial order.

Since these definitions refer only to the multiplicative structure of the ring, they also hold for semigroups (with the proviso, of course, that a semigroup may not have a zero element). In the special case of a semilattice, this partial order is the same as the one described in the entry for semilattice.

If a ring has an identity, then 1 − e is always an idempotent whenever e is an idempotent, and e(1 − e) = (1 − e)e = 0.

In a ring with an identity, two idempotents e and f are called a pair of orthogonal idempotents if e + f = 1, and ef = fe = 0. Obviously, this is just a fancy way of saying that f = 1 − e. More generally, a set {e1, e2, ..., en} of idempotents is called a complete set of orthogonal idempotents if ei ej = ej ei = 0 whenever i ≠ j and if 1 = e1 + e2 + · · · + en.
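These notions are easy to see in a small ring. In the sketch below (Python; the ring Z_6 is an illustrative choice of ours), the idempotents are 0, 1, 3, 4; the partial order has 0 as minimum and 1 as maximum with 3 and 4 incomparable; and 3, 4 form a pair of orthogonal idempotents.

```python
# Idempotents of the ring Z_6 and their partial order e <= f iff e = ef = fe.
R = range(6)
mul = lambda x, y: (x * y) % 6

idem = [e for e in R if mul(e, e) == e]
assert idem == [0, 1, 3, 4]

leq = lambda e, f: e == mul(e, f) == mul(f, e)
assert all(leq(0, e) for e in idem)        # 0 is the minimum
assert all(leq(e, 1) for e in idem)        # 1 is the maximum
assert not leq(3, 4) and not leq(4, 3)     # 3 and 4 are incomparable

# 3 and 4 are a pair of orthogonal idempotents: 3 + 4 = 1 and 3*4 = 0 in Z_6
assert (3 + 4) % 6 == 1 and mul(3, 4) == 0
```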

Version: 3 Owner: mclase Author(s): mclase

352.8 null semigroup

A left zero semigroup is a semigroup in which every element is a left zero element. In other words, it is a set S with a product defined as xy = x for all x, y ∈ S. A right zero semigroup is defined similarly. Let S be a semigroup. Then S is a null semigroup if it has a zero element and if the product of any two elements is zero. In other words, there is an element θ ∈ S such that xy = θ for all x, y ∈ S. Version: 1 Owner: mclase Author(s): mclase

352.9 semigroup

A semigroup G is a set together with a binary operation · : G × G −→ G which satisfies the associative property: (a · b) · c = a · (b · c) for all a, b, c ∈ G. Version: 2 Owner: djao Author(s): djao

352.10 semilattice

A lower semilattice is a partially ordered set S in which each pair of elements has a greatest lower bound. An upper semilattice is a partially ordered set S in which each pair of elements has a least upper bound.

Note that it is not normally necessary to distinguish lower from upper semilattices, because one may be converted to the other by reversing the partial order. It is normal practice to refer to either structure as a semilattice, and it should be clear from the context whether greatest lower bounds or least upper bounds exist.

Alternatively, a semilattice can be considered to be a commutative band, that is, a semigroup which is commutative and in which every element is idempotent. In this context, semilattices are important elements of semigroup theory and play a key role in the structure theory of commutative semigroups.

A partially ordered set which is both a lower semilattice and an upper semilattice is a lattice.

Version: 3 Owner: mclase Author(s): mclase

352.11 subsemigroup, submonoid, and subgroup

Let S be a semigroup, and let T be a subset of S.

T is a subsemigroup of S if T is closed under the operation of S; that is, if xy ∈ T for all x, y ∈ T.

T is a submonoid of S if T is a subsemigroup, and T has an identity element.

T is a subgroup of S if T is a submonoid which is a group.

Note that submonoids and subgroups do not have to have the same identity element as S itself (indeed, S may not have an identity element). The identity element may be any idempotent element of S.

Let e ∈ S be an idempotent element. Then there is a maximal subsemigroup of S for which e is the identity:

eSe = {exe | x ∈ S}.

In addition, there is a maximal subgroup for which e is the identity:

U(eSe) = {x ∈ eSe | ∃y ∈ eSe such that xy = yx = e}.

Subgroups with different identity elements are disjoint. To see this, suppose that G and H are subgroups of a semigroup S with identity elements e and f respectively, and suppose x ∈ G ∩ H. Then x has an inverse y ∈ G, and an inverse z ∈ H. We have:

e = xy = fxy = fe = zxe = zx = f.

Thus intersecting subgroups have the same identity element.

Version: 2 Owner: mclase Author(s): mclase

352.12 zero elements

Let S be a semigroup. An element z is called a right zero [resp. left zero] if xz = z [resp. zx = z] for all x ∈ S. An element which is both a left and a right zero is called a zero element. A semigroup may have many left zeros or right zeros, but if it has at least one of each, then they are necessarily equal, giving a unique (two-sided) zero element.

It is customary to use the symbol θ for the zero element of a semigroup. Version: 1 Owner: mclase Author(s): mclase


Chapter 353 20N02 – Sets with a single binary operation (groupoids)
353.1 groupoid

A groupoid G is a set together with a binary operation · : G × G −→ G. The groupoid (or “magma”) is closed under the operation. There is also a separate, category-theoretic definition of “groupoid.” Version: 7 Owner: akrowne Author(s): akrowne

353.2 idempotency

If (S, ∗) is a magma, then an element x ∈ S is said to be idempotent if x ∗ x = x. If every element of S is idempotent, then the binary operation ∗ (or the magma itself) is said to be idempotent. For example, the ∧ and ∨ operations in a lattice are idempotent, because x ∧ x = x and x ∨ x = x for all x in the lattice.

A function f : D → D is said to be idempotent if f ◦ f = f. (This is just a special case of the above definition, the magma in question being (D^D, ◦), the monoid of all functions from D to D, with the operation of function composition.) In other words, f is idempotent iff repeated application of f has the same effect as a single application: f(f(x)) = f(x) for all x ∈ D. An idempotent linear transformation from a vector space to itself is called a projection.

Version: 12 Owner: yark Author(s): yark, Logan


353.3 left identity and right identity

Let G be a groupoid. An element e ∈ G is called a left identity element if ex = x for all x ∈ G. Similarly, e is a right identity element if xe = x for all x ∈ G. An element which is both a left and a right identity is an identity element.

A groupoid may have more than one left identity element: in fact the operation defined by xy = y for all x, y ∈ G defines a groupoid (in fact, a semigroup) on any set G, and every element is a left identity.

But as soon as a groupoid has both a left and a right identity, they are necessarily unique and equal. For if e is a left identity and f is a right identity, then f = ef = e.

Version: 2 Owner: mclase Author(s): mclase
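The right zero example mentioned above is easy to verify exhaustively; a short Python sketch (the three-element carrier set is our own choice):

```python
# The right zero semigroup on {0, 1, 2}: xy = y for all x, y.
S = [0, 1, 2]
op = lambda x, y: y

left_identities = [e for e in S if all(op(e, x) == x for x in S)]
right_identities = [e for e in S if all(op(x, e) == x for x in S)]

assert left_identities == S        # every element is a left identity
assert right_identities == []      # but there is no right identity
```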


Chapter 354 20N05 – Loops, quasigroups
354.1 Moufang loop

Proposition: Let Q be a nonempty quasigroup.

I) The following conditions are equivalent:

(x(yz))x = (xy)(zx)   for all x, y, z ∈ Q   (354.1.1)
((xy)z)x = x(y(zx))   for all x, y, z ∈ Q   (354.1.2)
(yx)(zy) = (y(xz))y   for all x, y, z ∈ Q   (354.1.3)
y(x(yz)) = ((yx)y)z   for all x, y, z ∈ Q   (354.1.4)

II) If Q satisfies those conditions, then Q has an identity element (i.e. Q is a loop).

For a proof, we refer the reader to the two references. Kunen in [1] shows that any of the four conditions implies the existence of an identity element. And Bol and Bruck [2] show that the four conditions are equivalent for loops.

Definition: A nonempty quasigroup satisfying the conditions (354.1.1)-(354.1.4) is called a Moufang quasigroup or, equivalently, a Moufang loop (after Ruth Moufang, 1905-1977).

The 16-element set of unit octonions over Z is an example of a nonassociative Moufang loop. Other examples appear in projective geometry, coding theory, and elsewhere.

References

[1] K. Kunen, Moufang Quasigroups, J. Algebra 183 (1996) 231-234

[2] R. H. Bruck, A Survey of Binary Systems, Springer-Verlag, 1958

Version: 3 Owner: yark Author(s): Larry Hammick

354.2 loop and quasigroup

A quasigroup is a groupoid G with the property that for every x, y ∈ G, there are unique elements w, z ∈ G such that xw = y and zx = y. A loop is a quasigroup which has an identity element. What distinguishes a loop from a group is that the former need not satisfy the associative law. Version: 1 Owner: mclase Author(s): mclase
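A quasigroup is exactly a groupoid whose Cayley table is a Latin square. The sketch below (Python; the choice of subtraction mod 5 is ours) exhibits a quasigroup that is neither associative nor a loop: every equation xw = y and zx = y is uniquely solvable, yet there is no identity element.

```python
# (Z_5, subtraction mod 5) is a quasigroup but not a loop.
n = 5
S = range(n)
op = lambda x, y: (x - y) % n

# unique solvability: each row and each column of the Cayley table
# is a permutation of S
assert all({op(x, w) for w in S} == set(S) for x in S)   # xw = y solvable
assert all({op(z, x) for z in S} == set(S) for x in S)   # zx = y solvable

assert op(op(1, 2), 3) != op(1, op(2, 3))                # not associative
# no two-sided identity element exists
assert not any(all(op(e, x) == x and op(x, e) == x for x in S) for e in S)
```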


Chapter 355 22-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
355.1 fixed-point subspace

Let Σ ⊂ Γ be a subgroup, where Γ is a compact Lie group acting on a vector space V. The fixed-point subspace of Σ is

Fix(Σ) = {x ∈ V | σx = x, ∀σ ∈ Σ}

Fix(Σ) is a linear subspace of V since

Fix(Σ) = ⋂_{σ∈Σ} ker(σ − I)

where I is the identity. If it is important to specify the space V we use the following notation: Fix_V(Σ).

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume


Chapter 356 22-XX – Topological groups, Lie groups
356.1 Cantor space

Cantor space, denoted C, is the set of all infinite binary sequences with the product topology. It is a perfect Polish space. It is a compact subspace of Baire space, which is the set of all infinite sequences of integers with the natural product topology.

REFERENCES
1. Moschovakis, Yiannis N. Descriptive set theory, 1980, Amsterdam ; New York : North-Holland Pub. Co.

Version: 8 Owner: xiaoyanggu Author(s): xiaoyanggu


Chapter 357 22A05 – Structure of general topological groups
357.1 topological group

A topological group is a triple (G, ·, T) where (G, ·) is a group and T is a topology on G such that under T, the group operation (x, y) → x · y is continuous with respect to the product topology on G × G and the inverse map x → x−1 is continuous on G. Version: 3 Owner: Evandar Author(s): Evandar


Chapter 358 22C05 – Compact groups
358.1 n-torus

The n-Torus, denoted T^n, is a smooth orientable n-dimensional manifold which is the product of n 1-spheres, i.e.

T^n = S^1 × · · · × S^1    (n factors).

Equivalently, the n-Torus can be considered to be R^n modulo the action (vector addition) of the integer lattice Z^n.

The n-Torus is in addition a topological group. If we think of S^1 as the unit circle in C and T^n = S^1 × · · · × S^1, then S^1 is a topological group and so is T^n by coordinate-wise multiplication. That is,

(z1, z2, ..., zn) · (w1, w2, ..., wn) = (z1 w1, z2 w2, ..., zn wn)

Version: 2 Owner: ack Author(s): ack, apmxi

358.2 reductive

Let G be a Lie group or algebraic group. G is called reductive over a field k if every representation of G over k is completely reducible.

For example, a finite group is reductive over a field k if and only if its order is not divisible by the characteristic of k (by Maschke's theorem). A complex Lie group is reductive if and only if it is a direct product of a semisimple group and an algebraic torus.

Version: 3 Owner: bwebste Author(s): bwebste

Chapter 359 22D05 – General properties and structure of locally compact groups
359.1 Γ-simple

A representation V of Γ is Γ-simple if either

• V ≅ W1 ⊕ W2, where W1, W2 are absolutely irreducible for Γ and are Γ-isomorphic, or

• V is non-absolutely irreducible for Γ. [GSS]

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume


Chapter 360 22D15 – Group algebras of locally compact groups
360.1 group C ∗-algebra

Let C[G] be the group ring of a discrete group G. It has two completions to a C*-algebra:

• Reduced group C*-algebra. The reduced group C*-algebra, C*_r(G), is obtained by completing C[G] in the operator norm for its regular representation on l^2(G).

• Maximal group C*-algebra. The maximal group C*-algebra, C*_max(G) or just C*(G), is defined by the following universal property: any *-homomorphism from C[G] to some B(H) (the C*-algebra of bounded operators on some Hilbert space H) factors through the inclusion C[G] → C*_max(G).

If G is amenable then C*_r(G) ≅ C*_max(G).

Version: 3 Owner: mhale Author(s): mhale


Chapter 361 22E10 – General properties and structure of complex Lie groups
361.1 existence and uniqueness of compact real form

Let G be a semisimple complex Lie group. Then there exists a unique (up to isomorphism) real Lie group K such that K is compact and a real form of G. Conversely, if K is compact, semisimple and real, it is the real form of a unique semisimple complex Lie group G. The group K can be realized as the set of fixed points of a special involution of G, called the Cartan involution.

For example, the compact real form of SL_n C, the complex special linear group, is SU(n), the special unitary group. Note that SL_n R is also a real form of SL_n C, but is not compact.

The compact real form of SO_n C, the complex special orthogonal group, is SO_n R, the real orthogonal group. SO_n C also has other, non-compact real forms, called the pseudo-orthogonal groups.

The compact real form of Sp_{2n} C, the complex symplectic group, is less well-known. It is (unfortunately) also usually denoted Sp(2n), and consists of n × n "unitary" quaternion matrices, that is,

Sp(2n) = {M ∈ GL_n H | MM* = I}

where M* denotes the conjugate transpose of M. This is different from the real symplectic group Sp_{2n} R.

Version: 2 Owner: bwebste Author(s): bwebste


361.2 maximal torus

Let K be a compact group, and let t ∈ K be an element whose centralizer has minimal dimension (such elements are dense in K). Let T be the centralizer of t. This subgroup is closed, since T = ϕ^{-1}(t) where ϕ : K → K is the map k → ktk^{-1}; abelian, since it is the intersection of K with the Cartan subgroup of its complexification; and hence a torus, since K (and thus T) is compact. We call T a maximal torus of K.

This term is also applied to the corresponding maximal abelian subgroup of a complex semisimple group, which is an algebraic torus.

Version: 2 Owner: bwebste Author(s): bwebste

361.3 Lie group

A Lie group is a group endowed with a compatible analytic structure. To be more precise, a Lie group structure consists of two kinds of data:

• a finite-dimensional, real-analytic manifold G;

• and two analytic maps, one for multiplication G × G → G and one for inversion G → G, which obey the appropriate group axioms.

Thus, a homomorphism in the category of Lie groups is a group homomorphism that is simultaneously an analytic mapping between two real-analytic manifolds.

Next, we describe a natural construction that associates a certain Lie algebra g to every Lie group G. Let e ∈ G denote the identity element of G. For g ∈ G let λg : G → G denote the diffeomorphism corresponding to left multiplication by g.

Definition 9. A vector-field V on G is called left-invariant if V is invariant with respect to all left multiplications. To be more precise, V is left-invariant if and only if

(λg)∗(V) = V    (see push-forward of a vector-field)

for all g ∈ G.

Proposition 15. The vector-field bracket of two left-invariant vector fields is again a left-invariant vector field.

Proof. Let V1, V2 be left-invariant vector fields, and let g ∈ G. The bracket operation is covariant with respect to diffeomorphisms, and in particular

(λg)∗[V1, V2] = [(λg)∗V1, (λg)∗V2] = [V1, V2].

Q.E.D.

Definition 10. The Lie algebra of G, denoted hereafter by g, is the vector space of all left-invariant vector fields equipped with the vector-field bracket.

Now a right multiplication is invariant with respect to all left multiplications, and it turns out that we can characterize a left-invariant vector field as being an infinitesimal right multiplication.

Proposition 16. Let a ∈ TeG and let V be a left-invariant vector-field such that Ve = a. Then for all g ∈ G we have Vg = (λg)∗(a).

The intuition here is that a gives an infinitesimal displacement from the identity element and that Vg gives a corresponding infinitesimal right displacement away from g. Indeed, consider a curve γ : (−ε, ε) → G passing through the identity element with velocity a; i.e.

γ(0) = e, γ′(0) = a.

The above proposition is then saying that the curve t → gγ(t), t ∈ (−ε, ε), passes through g at t = 0 with velocity Vg. Thus we see that a left-invariant vector-field is completely determined by the value it takes at e, and that therefore g is isomorphic, as a vector space, to TeG. Of course, we can also consider the Lie algebra of right-invariant vector fields. The resulting Lie algebra is anti-isomorphic (the order in the bracket is reversed) to the Lie algebra of left-invariant vector fields. Now it is a general principle that the group inverse operation gives an anti-isomorphism between left and right group actions. So, as one may well expect, the anti-isomorphism between the Lie algebras of left and right-invariant vector fields can be realized by considering the linear action of the inverse operation on TeG. Finally, let us remark that one can induce the Lie algebra structure directly on TeG by considering the adjoint action of G on TeG. Examples. [Coming soon.]
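As a simple worked example (an illustration, not from the original entry), take the abelian group G = (R^n, +):

```latex
% Left translation \lambda_g(x) = g + x has identity differential, so a
% vector field V = \sum_i a_i(x)\,\partial_i is left-invariant iff each
% coefficient a_i is constant.  Hence
\mathfrak{g} \cong T_e G \cong \mathbb{R}^n,
\qquad [V, W] = 0 \quad \text{for all } V, W \in \mathfrak{g}.
```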


Notes.

1. No generality is lost in assuming that a Lie group has an analytic, rather than C∞ or even Ck, k = 1, 2, . . ., structure. Indeed, given a C1 differential manifold with a C1 multiplication rule, one can show that the exponential mapping endows this manifold with a compatible real-analytic structure. Indeed, one can go even further and show that even C0 suffices. In other words, a topological group that is also a finite-dimensional topological manifold possesses a compatible analytic structure. This result was formulated by Hilbert as his fifth problem, and proved in the 50's by Montgomery and Zippin.

2. One can also speak of a complex Lie group, in which case G and the multiplication mapping are both complex-analytic. The theory of complex Lie groups requires the notion of a holomorphic vector-field. Notwithstanding this complication, most of the essential features of the real theory carry over to the complex case.

3. The name "Lie group" honours the Norwegian mathematician Sophus Lie, who pioneered and developed the theory of continuous transformation groups and the corresponding theory of Lie algebras of vector fields (the group's infinitesimal generators, as Lie termed them). Lie's original impetus was the study of continuous symmetry of geometric objects and differential equations. The scope of the theory has grown enormously in the 100+ years of its existence. The contributions of Elie Cartan and Claude Chevalley figure prominently in this evolution. Cartan is responsible for the celebrated classification of simple Lie algebras, as well as for charting the essential role played by Lie groups in differential geometry and mathematical physics. Chevalley made key foundational contributions to the analytic theory, and did much to pioneer the related theory of algebraic groups. Armand Borel's book "Essays in the History of Lie groups and algebraic groups" is the definitive source on the evolution of the Lie group concept.
Sophus Lie’s contributions are the subject of a number of excellent articles by T. Hawkins. Version: 6 Owner: rmilson Author(s): rmilson

361.4

complexification

Let G be a real Lie group. Then the complexification G_C of G is the unique complex Lie group equipped with a map ϕ : G → G_C such that any map G → H, where H is a complex Lie group, extends to a holomorphic map G_C → H. If g and g_C are the respective Lie algebras, then g_C ≅ g ⊗_R C. For simply connected groups, the construction is obvious: we simply take the simply connected complex group with Lie algebra g_C, and ϕ to be the map induced by the inclusion g → g_C.

If γ ∈ G is central, then its image is central in G_C, since g → γgγ⁻¹ is a map extending ϕ, and thus must be the identity by the uniqueness half of the universal property. Thus, if Γ ⊂ G is a discrete central subgroup, then we get a map G/Γ → G_C/ϕ(Γ), which gives a complexification for G/Γ. Since every Lie group is of this form, this shows existence. Some easy examples: the complexification both of SLn R and of SU(n) is SLn C. The complexification of R is C, and that of S¹ is C*. The map ϕ : G → G_C is not always injective. For example, if G is the universal cover of SLn R (which has fundamental group Z), then G_C ≅ SLn C, and ϕ factors through the covering G → SLn R. Version: 3 Owner: bwebste Author(s): bwebste

361.5

Hilbert-Weyl theorem

theorem: Let Γ be a compact Lie group acting on V. Then there exists a finite Hilbert basis for the ring P(Γ) (the set of invariant polynomials). [GSS]

proof: In [GSS] on page 54.

theorem: (as stated by Hermann Weyl) The (absolute) invariants J(x, y, . . .) corresponding to a given set of representations of a finite or a compact Lie group have a finite integrity basis. [PV]

proof: In [PV] on page 274.

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988. [HW] Weyl, Hermann: The Classical Groups: Their Invariants and Representations. Princeton University Press, New Jersey, 1946.

Version: 3 Owner: Daume Author(s): Daume

361.6

the connection between Lie groups and Lie algebras

Given a finite dimensional Lie group G, it has an associated Lie algebra g = Lie(G). The Lie algebra encodes a great deal of information about the Lie group. I've collected a few results on this topic:

Theorem 7. (Existence) Let g be a finite dimensional Lie algebra over R or C. Then there exists a finite dimensional real or complex Lie group G with Lie(G) = g.

Theorem 8. (Uniqueness) There is a unique connected simply-connected Lie group G with any given finite-dimensional Lie algebra. Every connected Lie group with this Lie algebra is a quotient G/Γ by a discrete central subgroup Γ.

Even more important is the fact that the correspondence G → g is functorial: given a homomorphism ϕ : G → H of Lie groups, there is a natural homomorphism defined on Lie algebras, ϕ∗ : g → h, which is just the derivative of the map ϕ at the identity (since the Lie algebra is canonically identified with the tangent space at the identity). There are analogous existence and uniqueness theorems for maps:

Theorem 9. (Existence) Let ψ : g → h be a homomorphism of Lie algebras. Then if G is the unique connected, simply-connected group with Lie algebra g, and H is any Lie group with Lie algebra h, there exists a homomorphism of Lie groups ϕ : G → H with ϕ∗ = ψ.

Theorem 10. (Uniqueness) Let G be a connected Lie group and H an arbitrary Lie group. Then if two maps ϕ, ϕ′ : G → H induce the same maps on Lie algebras, then they are equal.

Essentially, what these theorems tell us is that the correspondence g → G from Lie algebras to simply-connected Lie groups is functorial, and right adjoint to the functor H → Lie(H) from Lie groups to Lie algebras. Version: 6 Owner: bwebste Author(s): bwebste
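A minimal illustration of the correspondence (standard material, not part of the original entry) is the one-dimensional abelian Lie algebra:

```latex
% g = R: the unique connected simply-connected group is (R, +); every other
% connected group with this Lie algebra is a quotient by a discrete central
% subgroup \Gamma = c\mathbb{Z}, for example
\mathbb{R}/\mathbb{Z} \cong S^1, \qquad \operatorname{Lie}(S^1) \cong \mathbb{R}.
```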


Chapter 362 26-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
362.1 derivative notation

This is the list of known standard representations and their nuances.

du/dv, df/dx, dy/dx − The most common notation; this is read as the derivative of u with respect to v. Exponents indicate iterated derivatives; for example, d²y/dx² is the second derivative of y with respect to x.

f′(x), f″(x), y′ − This is read as f prime of x. The number of primes tells the derivative; i.e., f‴(x) is the third derivative of f(x) with respect to x. Note that in higher dimensions, this may be a tensor of a rank equal to the derivative.

Dx f(x), Fy(x), fxy(x) − These notations are rather arcane, and should not be used generally, as they have other meanings. For example, Fy can easily be the y component of a vector-valued function. The subscript in this case means "with respect to", so Fyy would be the second derivative of F with respect to y.

D1 f(x), F2(x), f12(x) − The subscripts in these cases refer to the derivative with respect to the nth variable. For example, F2(x, y, z) would be the derivative of F with respect to y. They can easily represent higher derivatives; i.e., D21 f(x) is the derivative with respect to the first variable of the derivative with respect to the second variable.


∂u/∂v, ∂f/∂x − The partial derivative of u with respect to v. This symbol can be manipulated as in du/dv for higher partials.

d/dv, ∂/∂v − This is the operator version of the derivative. Usually you will see it acting on something, such as d/dv (v² + 3u) = 2v.

[Jf(x)], [Df(x)] − The first of these represents the Jacobian of f, which is the matrix of partial derivatives whose (i, j) entry is Dj fi(x):

[Jf(x)] =
| D1 f1(x)  · · ·  Dn f1(x) |
|    ...     ...      ...    |
| D1 fm(x)  · · ·  Dn fm(x) |

where fn represents the nth component function of a vector-valued function. The second of these notations represents the derivative matrix, which in most cases is the Jacobian, but in some cases does not exist even though the Jacobian exists. Note that the directional derivative in the direction v is simply [Jf(x)]v.

Version: 7 Owner: slider142 Author(s): slider142

362.2

fundamental theorems of calculus

The Fundamental Theorems of Calculus serve to demonstrate that integration and differentiation are inverse processes.

First Fundamental Theorem: Suppose that F is a differentiable function on the interval [a, b]. Then

∫_a^b F′(x) dx = F(b) − F(a).

Second Fundamental Theorem: Let f be a continuous function on the interval [a, b], let c be an arbitrary point in this interval, and assume f is integrable on the intervals of the form [a, x] for all x ∈ [a, b]. Let F be defined as

F(x) = ∫_c^x f(t) dt

for every x in (a, b). Then F is differentiable and F′(x) = f(x).

This result is about Riemann integrals. When dealing with Lebesgue integrals we get a generalization with Lebesgue's differentiation theorem. Version: 9 Owner: mathcam Author(s): drini, greg
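The first theorem can be checked numerically (an illustrative sketch, not part of the original entry): for F(x) = x³ on [0, 2], a Riemann sum of F′(x) = 3x² should approach F(2) − F(0) = 8.

```python
# Numerical sanity check of the first fundamental theorem of calculus.

def riemann_sum(f, a, b, n):
    """Left Riemann sum of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + i * h) for i in range(n)) * h

Fprime = lambda x: 3 * x ** 2
approx = riemann_sum(Fprime, 0.0, 2.0, 100_000)
print(abs(approx - 8.0) < 1e-3)  # True: the sum converges to F(b) - F(a)
```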


362.3

logarithm

Definition. Three real numbers x, y, p, with x, y > 0, are said to obey the logarithmic relation logx(y) = p if they obey the corresponding exponential relation: x^p = y. Note that by the monotonicity and continuity property of the exponential operation, for given x and y there exists a unique p satisfying the above relation. We are therefore able to say that p is the logarithm of y relative to the base x.

Properties. There are a number of basic algebraic identities involving logarithms.

log_x(yz) = log_x(y) + log_x(z)
log_x(y/z) = log_x(y) − log_x(z)
log_x(y^z) = z log_x(y)
log_x(1) = 0
log_x(x) = 1
log_x(y) log_y(x) = 1
log_y(z) = log_x(z) / log_x(y)

Notes. In essence, logarithms convert multiplication to addition, and exponentiation to multiplication. Historically, these properties of the logarithm made it a useful tool for doing numerical calculations. Before the advent of electronic calculators and computers, tables of logarithms and the logarithmic slide rule were essential computational aids. Scientific applications predominantly make use of logarithms whose base is the Eulerian number e = 2.71828 . . .. Such logarithms are called natural logarithms and are commonly denoted by the symbol ln, e.g. ln(e) = 1. Natural logarithms naturally give rise to the natural logarithm function. A frequent convention, seen in elementary mathematics texts and on calculators, is that logarithms that do not give a base explicitly are assumed to be base 10, e.g. log(100) = 2. This is far from universal. In Rudin's "Real and Complex Analysis", for example, we see a baseless log used to refer to the natural logarithm. By contrast, computer science and

information theory texts often assume 2 as the default logarithm base. This is motivated by the fact that log2(N) is the approximate number of bits required to encode N different messages. The invention of logarithms is commonly credited to John Napier. Version: 13 Owner: rmilson Author(s): rmilson
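The identities listed above can be checked numerically (a sketch, not part of the original entry; the base x = 3 and arguments y = 8, z = 5 are arbitrary choices):

```python
import math

# Numerical check of the logarithm identities; math.log(a, b) computes
# the base-b logarithm of a.
x, y, z = 3.0, 8.0, 5.0
log = math.log

assert math.isclose(log(y * z, x), log(y, x) + log(z, x))
assert math.isclose(log(y / z, x), log(y, x) - log(z, x))
assert math.isclose(log(y ** z, x), z * log(y, x))
assert math.isclose(log(1, x), 0, abs_tol=1e-12)
assert math.isclose(log(x, x), 1)
assert math.isclose(log(y, x) * log(x, y), 1)
assert math.isclose(log(z, y), log(z, x) / log(y, x))
print("all identities hold")
```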

362.4

proof of the first fundamental theorem of calculus

Let us make a subdivision of the interval [a, b]:

Δ : a = x0 < x1 < x2 < · · · < xn−1 < xn = b.

From this, we can say

F(b) − F(a) = Σ_{i=1}^n [F(xi) − F(xi−1)].

From the mean-value theorem, we have that for any two points x̄ < x there exists ξ ∈ (x̄, x) such that

F(x) − F(x̄) = F′(ξ)(x − x̄).

If we use xi as x and xi−1 as x̄, calling our intermediate point ξi, we get

F(xi) − F(xi−1) = F′(ξi)(xi − xi−1).

Combining these, and using the abbreviation Δi x = xi − xi−1, we have

F(b) − F(a) = Σ_{i=1}^n F′(ξi) Δi x.

From the definition of the integral, for every ε > 0 there exists δ > 0 such that

|Σ_{i=1}^n F′(ξi) Δi x − ∫_a^b F′(x) dx| < ε

when ‖Δ‖ < δ. Thus, for every ε > 0,

|F(b) − F(a) − ∫_a^b F′(x) dx| < ε.

But F(b) − F(a) − ∫_a^b F′(x) dx is constant with respect to ε, which can only mean that |F(b) − F(a) − ∫_a^b F′(x) dx| = 0, and so we have the first fundamental theorem of calculus:

F(b) − F(a) = ∫_a^b F′(x) dx.

Version: 4 Owner: greg Author(s): greg

362.5

proof of the second fundamental theorem of calculus

Recall that a continuous function is Riemann integrable, so the integral F(x) = ∫_c^x f(t) dt is well defined. Consider the increment of F:

F(x + h) − F(x) = ∫_c^{x+h} f(t) dt − ∫_c^x f(t) dt = ∫_x^{x+h} f(t) dt

(we have used the linearity of the integral with respect to the function and the additivity with respect to the domain). Now let M be the maximum of f on [x, x + h] and m be the minimum. Clearly we have

mh ≤ ∫_x^{x+h} f(t) dt ≤ Mh

(this is due to the monotonicity of the integral with respect to the integrand), which can be written as

(F(x + h) − F(x))/h = (1/h) ∫_x^{x+h} f(t) dt ∈ [m, M].

Since f is continuous, by the mean-value theorem there exists ξh ∈ [x, x + h] such that f(ξh) = (F(x + h) − F(x))/h, so that

F′(x) = lim_{h→0} (F(x + h) − F(x))/h = lim_{h→0} f(ξh) = f(x)

since ξh → x as h → 0. Version: 1 Owner: paolini Author(s): paolini

362.6

root-mean-square

If x1, x2, . . . , xn are real numbers, we define their root-mean-square or quadratic mean as

R(x1, x2, . . . , xn) = √((x1² + x2² + · · · + xn²)/n).

The root-mean-square of a random variable X is defined as the square root of the expectation of X²:

R(X) = √(E(X²)).

If X1, X2, . . . , Xn are pairwise uncorrelated random variables with standard deviations σ1, σ2, . . . , σn, then the standard deviation of their arithmetic mean, (X1 + X2 + · · · + Xn)/n, is R(σ1, σ2, . . . , σn)/√n, i.e. the root-mean-square of σ1, σ2, . . . , σn divided by √n. Version: 1 Owner: pbruin Author(s): pbruin
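The definition translates directly into a few lines of code (a small sketch, not from the original entry):

```python
import math

def rms(xs):
    """Root-mean-square (quadratic mean) of a list of real numbers."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))

print(rms([3.0, 4.0]))  # sqrt((9 + 16)/2) = sqrt(12.5), about 3.5355
```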

362.7

square

The square of a number x is the number obtained by multiplying x by itself. It is denoted x².


Some examples:

5² = 25
(1/3)² = 1/9
0² = 0
(.5)² = .25

Version: 2 Owner: drini Author(s): drini


Chapter 363 26-XX – Real functions
363.1 abelian function

An abelian or hyperelliptic function is a generalisation of an elliptic function. It is a function of two variables with four periods. In a similar way to an elliptic function, it can also be regarded as the inverse function to certain integrals (called abelian or hyperelliptic integrals) of the form

∫ dz/√R(z)

where R is a polynomial of degree greater than 4. Version: 2 Owner: vladm Author(s): vladm

363.2

full-width at half maximum

The full-width at half maximum (FWHM) is a parameter used to describe the width of a bump on a function (or curve). The FWHM is given by the distance between the points where the function reaches half of its maximum value.

For example, consider the function

f(x) = 10/(x² + 1).

f reaches its maximum at x = 0 (f(0) = 10), so f reaches half of its maximum value at x = 1 and x = −1 (f(1) = f(−1) = 5). So the FWHM for f, in this case, is 2, because the distance between A(1, 5) and B(−1, 5) is 2.

The function f(x) = 10/(x² + 1) is called 'the Agnesi curve', after Maria Gaetana Agnesi (1718 - 1799). Version: 2 Owner: vladm Author(s): vladm
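The numbers in the FWHM example for f(x) = 10/(x² + 1) can be checked with a short script (a sketch that scans a grid for the half-maximum crossings, not part of the original entry):

```python
# Numerical check of the FWHM of f(x) = 10/(x**2 + 1).

def f(x):
    return 10.0 / (x * x + 1.0)

xs = [i / 1000.0 for i in range(-5000, 5001)]   # grid on [-5, 5]
fmax = max(f(x) for x in xs)                    # = 10, attained at x = 0
half = fmax / 2.0                               # = 5
above = [x for x in xs if f(x) >= half]         # points with f(x) >= 5
fwhm = max(above) - min(above)
print(fwhm)  # 2.0, the distance between x = -1 and x = 1
```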


Chapter 364 26A03 – Foundations: limits and generalizations, elementary topology of the line
364.1 Cauchy sequence

A sequence x0, x1, x2, . . . in a metric space (X, d) is a Cauchy sequence if, for every real number ε > 0, there exists a natural number N such that d(xn, xm) < ε whenever n, m > N. Version: 4 Owner: djao Author(s): djao, rmilson
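As an illustration (not part of the original entry), the sequence x_n = 1/(n+1) in R with d(x, y) = |x − y| is Cauchy: given ε, any N with 1/(N+1) < ε works, since |x_n − x_m| ≤ max(x_n, x_m) < ε for n, m > N. A finite probe of the definition:

```python
# Finite probe of the Cauchy condition for x_n = 1/(n+1); this only samples
# pairs n, m > N, so it is an illustration, not a proof.

def is_cauchy_witness(x, eps, N, probe=10_000):
    """Check d(x_n, x_m) < eps for sampled n, m > N."""
    return all(abs(x(n) - x(m)) < eps
               for n in range(N + 1, N + probe, 97)
               for m in range(N + 1, N + probe, 89))

x = lambda n: 1.0 / (n + 1)
eps = 1e-3
N = 1000  # chosen so that 1/(N+1) < eps
print(is_cauchy_witness(x, eps, N))  # True
```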

364.2

Dedekind cuts

The purpose of Dedekind cuts is to provide a sound logical foundation for the real number system. Dedekind's motivation behind this project is the observation that a real number α, intuitively, is completely determined by the rationals strictly smaller than α and those strictly larger than α. Concerning the completeness or continuity of the real line, Dedekind notes in [2] that If all points of the straight line fall into two classes such that every point of the first class lies to the left of every point of the second class, then there exists one and only one point which produces this division of all points into two classes, this severing of the straight line into two portions. Dedekind defines a point to produce the division of the real line if this point is either the least or greatest element of either one of the classes mentioned above. He further notes that

the completeness property, as he just phrased it, is deficient in the rationals, which motivates the definition of reals as cuts of rationals. Because all rationals greater than α are really just excess baggage, we prefer to sway somewhat from Dedekind's original definition. Instead, we adopt the following definition.

Definition 34. A Dedekind cut is a subset α of the rational numbers Q that satisfies these properties:

1. α is not empty.
2. Q \ α is not empty.
3. α contains no greatest element.
4. For x, y ∈ Q, if x ∈ α and y < x, then y ∈ α as well.

Dedekind cuts are particularly appealing for two reasons. First, they make it very easy to prove the completeness, or continuity, of the real line. Also, they make it quite plain to distinguish the rationals from the irrationals on the real line, and put the latter on a firm logical foundation. In the construction of the real numbers from Dedekind cuts, we make the following definition:

Definition 35. A real number is a Dedekind cut. We denote the set of all real numbers by R and we order them by set-theoretic inclusion, that is to say, for any α, β ∈ R, α < β if and only if α ⊂ β where the inclusion is strict. We further define α = β as real numbers if α and β are equal as sets. As usual, we write α ≤ β if α < β or α = β. Moreover, a real number α is said to be irrational if Q \ α contains no least element.

The Dedekind completeness property of real numbers, expressed as the supremum property, now becomes straightforward to prove. In what follows, we will reserve Greek variables for real numbers, and Roman variables for rationals.

Theorem 11. Every nonempty subset of real numbers that is bounded above has a least upper bound.
Let A be a nonempty set of real numbers, such that for every α ∈ A we have α ≤ γ for some real number γ. Now define the set

sup A = ⋃_{α∈A} α.

We must show that this set is a real number. This amounts to checking the four conditions of a Dedekind cut.

1. sup A is clearly not empty, for it is the nonempty union of nonempty sets.

2. Because γ is a real number, there is some rational x that is not in γ. Since every α ∈ A is a subset of γ, x is not in any α, so x ∉ sup A either. Thus, Q \ sup A is nonempty.

3. If sup A had a greatest element g, then g ∈ α for some α ∈ A. Then g would be a greatest element of α, but α is a real number, so by contrapositive, sup A has no greatest element.

4. Lastly, if x ∈ sup A, then x ∈ α for some α, so given any y < x, we have y ∈ α because α is a real number, whence y ∈ sup A.

Thus, sup A is a real number. Trivially, sup A is an upper bound of A, for every α ⊆ sup A. It now suffices to prove that sup A ≤ γ, because γ was an arbitrary upper bound. But this is easy, because every x ∈ sup A is an element of α for some α ∈ A, so because α ⊆ γ, x ∈ γ. Thus, sup A is the least upper bound of A. We call this real number the supremum of A.

To finish the construction of the real numbers, we must endow them with algebraic operations, define the additive and multiplicative identity elements, prove that these definitions give a field, and prove further results about the order of the reals (such as the totality of this order) – in short, build a complete ordered field. This task is somewhat laborious, but we include here the appropriate definitions. Verifying their correctness can be an instructive, albeit tiresome, exercise. We use the same symbols for the operations on the reals as for the rational numbers; this should cause no confusion in context.

Definition 36. Given two real numbers α and β, we define

• The additive identity, denoted 0, is 0 := {x ∈ Q : x < 0}

• The multiplicative identity, denoted 1, is 1 := {x ∈ Q : x < 1}

• Addition of α and β, denoted α + β, is α + β := {x + y : x ∈ α, y ∈ β}

• The opposite of α, denoted −α, is −α := {x ∈ Q : −x ∉ α, but −x is not the least element of Q \ α}

• The absolute value of α, denoted |α|, is |α| := α if α ≥ 0, and |α| := −α if α ≤ 0

• If α, β > 0, then multiplication of α and β, denoted α · β, is α · β := {z ∈ Q : z ≤ 0 or z = xy for some x ∈ α, y ∈ β with x, y > 0}

In general,

α · β := 0 if α = 0 or β = 0; |α| · |β| if α > 0, β > 0 or α < 0, β < 0; −(|α| · |β|) if α > 0, β < 0 or α < 0, β > 0

• The inverse of α > 0, denoted α⁻¹, is α⁻¹ := {x ∈ Q : x ≤ 0, or x > 0 and (1/x) ∉ α, but 1/x is not the least element of Q \ α}

If α < 0, α⁻¹ := −(|α|)⁻¹

All that remains (!) is to check that the above definitions do indeed define a complete ordered field, and that all the sets implied to be real numbers are indeed so. The properties of R as an ordered field follow from these definitions and the properties of Q as an ordered field. It is important to point out that in two steps, in showing that inverses and opposites are properly defined, we require an extra property of Q, not merely in its capacity as an ordered field. This requirement is the Archimedean property. Moreover, because R is a field of characteristic 0, it contains an isomorphic copy of Q. The rationals correspond to the Dedekind cuts α for which Q \ α contains a least member.

REFERENCES
1. Courant, Richard and Robbins, Herbert. What is Mathematics? pp. 68-72 Oxford University Press, Oxford, 1969 2. Dedekind, Richard. Essays on the Theory of Numbers Dover Publications Inc, New York 1963 3. Rudin, Walter Principles of Mathematical Analysis pp. 17-21 McGraw-Hill Inc, New York, 1976 4. Spivak, Michael. Calculus pp. 569-596 Publish or Perish, Inc. Houston, 1994

Version: 20 Owner: rmilson Author(s): rmilson, NeuRet
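The cut construction can be made executable in a very small way (an illustrative sketch only, not part of the original entry): represent a cut by its membership predicate on the rationals, and compare cuts by inclusion on a finite sample of rationals.

```python
from fractions import Fraction

def rational_cut(q):
    """The cut of a rational q: all rationals strictly below q."""
    return lambda x: x < q

def sqrt2_cut(x):
    """The cut for sqrt(2): negative rationals plus positives with x^2 < 2."""
    return x < 0 or x * x < 2

samples = [Fraction(n, 100) for n in range(-300, 301)]

def subset_of(alpha, beta):
    """alpha <= beta as real numbers means alpha is a subset of beta
    (checked only on the finite sample)."""
    return all(beta(x) for x in samples if alpha(x))

print(subset_of(rational_cut(Fraction(1)), sqrt2_cut))   # True:  1 <= sqrt(2)
print(subset_of(sqrt2_cut, rational_cut(Fraction(2))))   # True:  sqrt(2) <= 2
print(subset_of(rational_cut(Fraction(2)), sqrt2_cut))   # False: 2 > sqrt(2)
```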

364.3

binomial proof of positive integer power rule

We will use the difference quotient in this proof of the power rule for positive integers. Let f(x) = x^n for some integer n > 0. Then we have

f′(x) = lim_{h→0} ((x + h)^n − x^n)/h.

We can use the binomial theorem to expand the numerator:

f′(x) = lim_{h→0} (C_0^n x^0 h^n + C_1^n x^1 h^{n−1} + · · · + C_{n−1}^n x^{n−1} h^1 + C_n^n x^n h^0 − x^n)/h

where C_k^n = n!/(k!(n−k)!). We can now simplify the above:

f′(x) = lim_{h→0} (h^n + nxh^{n−1} + · · · + nx^{n−1}h + x^n − x^n)/h
      = lim_{h→0} (h^{n−1} + nxh^{n−2} + · · · + nx^{n−1})
      = nx^{n−1}.

Version: 4 Owner: mathcam Author(s): mathcam, slider142
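The difference-quotient limit can be illustrated numerically (a sketch, not part of the original entry): for f(x) = x⁵, the quotient ((x+h)⁵ − x⁵)/h approaches 5x⁴ as h shrinks.

```python
# The error of the difference quotient for f(x) = x**5 at x = 2 shrinks
# roughly linearly in h (the leading error term is (n(n-1)/2) x^(n-2) h).

def diff_quotient(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 5
x = 2.0
exact = 5 * x ** 4  # = 80
for h in (1e-2, 1e-4, 1e-6):
    print(abs(diff_quotient(f, x, h) - exact) < 100 * h)  # True for each h
```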

364.4

exponential

Preamble. We use R+ ⊂ R to denote the set of non-negative real numbers. Our aim is to define the exponential, or the generalized power operation, xp , x ∈ R+ , p ∈ R.

The power p in the above expression is called the exponent. We take it as proven that R is a complete, ordered field. No other properties of the real numbers are invoked. Definition. For x ∈ R+ and n ∈ Z we define xn in terms of repeated multiplication. To be more precise, we inductively characterize natural number powers as follows: x0 = 1, xn+1 = x · xn , n ∈ N.

The existence of the reciprocal is guaranteed by the assumption that R is a field. Thus, for negative exponents, we can define x^{−n} = (x^{−1})^n, n ∈ N, where x^{−1} is the reciprocal of x. The case of arbitrary exponents is somewhat more complicated. A possible strategy is to define roots, then rational powers, and then extend by continuity. Our approach is different. For x ∈ R+ and p ∈ R, we define the set of all reals that one would want to be smaller than x^p, and then define the latter as the least upper bound of this set. To be more precise, let x > 1 and define

L(x, p) = {z ∈ R+ : z^n < x^m for all m ∈ Z, n ∈ N such that pn < m}.

We then define x^p to be the least upper bound of L(x, p). For x < 1 we define x^p = (x^{−1})^{−p}. The exponential operation possesses a number of important properties, some of which characterize it up to uniqueness. Note. It is also possible to define the exponential operation in terms of the exponential function and the natural logarithm. Since these concepts require the context of differential theory, it seems preferable to give a basic definition that relies only on the foundational property of the reals. Version: 11 Owner: rmilson Author(s): rmilson

364.5

interleave sequence

Let S be a set, and let {xi}, i = 0, 1, 2, . . . and {yi}, i = 0, 1, 2, . . . be two sequences in S. The interleave sequence is defined to be the sequence x0, y0, x1, y1, . . . . Formally, it is the sequence {zi}, i = 0, 1, 2, . . . given by

zi := xk if i = 2k is even, and zi := yk if i = 2k + 1 is odd.

Version: 2 Owner: djao Author(s): djao
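The definition is easy to realize for finite sequences (a small sketch, not part of the original entry):

```python
# Interleave two finite sequences of equal length: z_{2k} = x_k, z_{2k+1} = y_k.

def interleave(xs, ys):
    z = []
    for x, y in zip(xs, ys):
        z.extend((x, y))
    return z

print(interleave([0, 1, 2], ['a', 'b', 'c']))  # [0, 'a', 1, 'b', 2, 'c']
```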

364.6

limit inferior

Let S ⊂ R be a set of real numbers. Recall that a limit point of S is a real number x ∈ R such that for all ε > 0 there exist infinitely many y ∈ S such that |x − y| < ε. We define lim inf S, pronounced the limit inferior of S, to be the infimum of all the limit points of S. If there are no limit points, we define the limit inferior to be +∞. The two most common notations for the limit inferior are lim inf S and lim S.

An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers x0, x1, x2, . . .. For each k ∈ N, let yk be the infimum of the kth tail,

yk = inf_{j ≥ k} xj.

This construction produces a non-decreasing sequence

y0 ≤ y1 ≤ y2 ≤ . . . ,

which either converges to its supremum, or diverges to +∞. We define the limit inferior of the original sequence to be this limit;

lim inf_k xk = lim_k yk.

Version: 7 Owner: rmilson Author(s): rmilson
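The tail-infimum definition can be approximated numerically (a finite truncation, so only a sketch, not part of the original entry): for x_k = (−1)^k + 1/(k+1), the limit inferior is −1.

```python
# Approximate lim inf of x_k = (-1)**k + 1/(k+1) via tail infima.

x = lambda k: (-1) ** k + 1.0 / (k + 1)
K = 10_000  # finite window standing in for the infinite tail

def y(k):
    """y_k = inf of the k-th tail (computed over the finite window)."""
    return min(x(j) for j in range(k, K))

print(abs(y(5000) - (-1.0)) < 1e-3)  # True: the tail infima approach -1
```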

364.7

limit superior

Let S ⊂ R be a set of real numbers. Recall that a limit point of S is a real number x ∈ R such that for all ε > 0 there exist infinitely many y ∈ S such that |x − y| < ε. We define lim sup S, pronounced the limit superior of S, to be the supremum of all the limit points of S. If there are no limit points, we define the limit superior to be −∞. The two most common notations for the limit superior are lim sup S and lim S.

An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers x0, x1, x2, . . .. For each k ∈ N, let yk be the supremum of the kth tail,

yk = sup_{j ≥ k} xj.

This construction produces a non-increasing sequence

y0 ≥ y1 ≥ y2 ≥ . . . ,

which either converges to its infimum, or diverges to −∞. We define the limit superior of the original sequence to be this limit;

lim sup_k xk = lim_k yk.

Version: 7 Owner: rmilson Author(s): rmilson

364.8

power rule

The power rule states that

D/Dx x^p = p x^(p−1),  p ∈ R.

This rule, when combined with the chain rule, product rule, and sum rule, makes calculating many derivatives far more tractable. This rule can be derived by repeated application of the product rule. See the proof of the power rule. Repeated use of the above formula gives

d^i/dx^i x^k = 0 if i > k,  and  d^i/dx^i x^k = (k!/(k−i)!) x^(k−i) if i ≤ k,

for nonnegative integers i, k.

Examples

D/Dx x^0 = 0
D/Dx x = 1·x^0 = 1
D/Dx x^2 = 2x
D/Dx x^3 = 3x^2
D/Dx √x = D/Dx x^(1/2) = (1/2) x^(−1/2) = 1/(2√x)
D/Dx 2x^e = 2e x^(e−1)

Version: 4 Owner: mathcam Author(s): mathcam, Logan

364.9

properties of the exponential

The exponential operation possesses the following properties.

• Homogeneity. For x, y ∈ R+, p ∈ R we have (xy)^p = x^p y^p.

• Exponent additivity. For x ∈ R+ we have x^0 = 1 and x^1 = x. Furthermore, x^{p+q} = x^p x^q for p, q ∈ R.

• Monotonicity. For x, y ∈ R+ with x < y and p ∈ R+ we have x^p < y^p and x^{−p} > y^{−p}.

• Continuity. The exponential operation is continuous with respect to its arguments. To be more precise, the following function is continuous:

P : R+ × R → R,  P(x, y) = x^y.

Let us also note that the exponential operation is characterized (in the sense of existence and uniqueness) by the additivity and continuity properties. [Author’s note: One can probably get away with substantially less, but I haven’t given this enough thought.] Version: 10 Owner: rmilson Author(s): rmilson

364.10

squeeze rule

Squeeze rule for sequences

Let f, g, h : N → R be three sequences of real numbers such that f(n) ≤ g(n) ≤ h(n) for all n. If lim_{n→∞} f(n) and lim_{n→∞} h(n) exist and are equal, say to a, then lim_{n→∞} g(n) also exists and equals a.

The proof is fairly straightforward. Let e be any real number > 0. By hypothesis there exist M, N ∈ N such that

|a − f(n)| < e for all n ≥ M
|a − h(n)| < e for all n ≥ N

Write L = max(M, N). For n ≥ L we have


• if g(n) ≥ a: |g(n) − a| = g(n) − a ≤ h(n) − a < e
• else g(n) < a, and: |g(n) − a| = a − g(n) ≤ a − f(n) < e

So, for all n ≥ L, we have |g(n) − a| < e, which is the desired conclusion.

Squeeze rule for functions

Let f, g, h : S → R be three real-valued functions on a neighbourhood S of a real number b, such that f(x) ≤ g(x) ≤ h(x) for all x ∈ S − {b}. If lim_{x→b} f(x) and lim_{x→b} h(x) exist and are equal, say to a, then lim_{x→b} g(x) also exists and equals a.

Again let e be an arbitrary positive real number. Find positive reals α and β such that

|a − f(x)| < e whenever 0 < |b − x| < α
|a − h(x)| < e whenever 0 < |b − x| < β

Write δ = min(α, β). Now, for any x such that 0 < |b − x| < δ, we have

• if g(x) ≥ a: |g(x) − a| = g(x) − a ≤ h(x) − a < e
• else g(x) < a, and: |g(x) − a| = a − g(x) ≤ a − f(x) < e

and we are done. Version: 1 Owner: Daume Author(s): Larry Hammick
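The classic squeeze example (an illustration, not from the original entry) is g(x) = x² sin(1/x) near b = 0: it is pinned between −x² and x², both of which tend to 0.

```python
import math

# Verify the squeeze -x**2 <= x**2*sin(1/x) <= x**2 at several points near 0.
g = lambda x: x * x * math.sin(1.0 / x)
for x in (1e-1, 1e-3, 1e-5):
    assert -x * x <= g(x) <= x * x      # the squeeze holds
print(abs(g(1e-5)) <= 1e-10)            # True: g is forced toward 0
```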


Chapter 365 26A06 – One-variable calculus
365.1 Darboux’s theorem (analysis)

Let f : [a, b] → R be a real-valued continuous function on [a, b], which is differentiable on (a, b), differentiable from the right at a, and differentiable from the left at b. Then f′ satisfies the intermediate value property: for every t between f′+(a) and f′−(b), there is some x ∈ [a, b] such that f′(x) = t. Note that when f is continuously differentiable (f ∈ C¹([a, b])), this is trivially true by the intermediate value theorem. But even when f′ is not continuous, Darboux's theorem places a severe restriction on what it can be. Version: 3 Owner: mathwizard Author(s): mathwizard, ariels

365.2

Fermat’s Theorem (stationary points)

Let f : (a, b) → R be a continuous function and suppose that x0 ∈ (a, b) is a local extremum of f. If f is differentiable at x0, then f′(x0) = 0. Version: 2 Owner: paolini Author(s): paolini


365.3

Heaviside step function
The Heaviside step function is the function H : R → R defined as

H(x) = 0 when x < 0, 1/2 when x = 0, 1 when x > 0.

Here, there are many conventions for the value at x = 0. The motivation for setting H(0) = 1/2 is that we can then write H as a function of the signum function (see this page). In applications, such as the Laplace transform, where the Heaviside function is used extensively, the value of H(0) is irrelevant. The function is named after Oliver Heaviside (1850–1925) [1]. However, the function was already used by Cauchy [2], who defined the function as

u(t) = (1/2)(1 + t/√(t²))

and called it a coefficient limitateur [1].
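A minimal Python sketch of this definition, using the H(0) = 1/2 convention described above (the function name is our own choice):

```python
def heaviside(x):
    """Heaviside step function with the H(0) = 1/2 convention."""
    if x < 0:
        return 0.0
    if x == 0:
        return 0.5
    return 1.0

assert heaviside(-3.2) == 0.0
assert heaviside(0) == 0.5
assert heaviside(7) == 1.0
```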

REFERENCES
1. The MacTutor History of Mathematics archive, Oliver Heaviside. 2. The MacTutor History of Mathematics archive, Augustin Louis Cauchy. 3. R.F. Hoskins, Generalised functions, Ellis Horwood Series: Mathematics and its applications, John Wiley & Sons, 1979.

Version: 1 Owner: Koro Author(s): matte

365.4

Leibniz’ rule

Theorem [Leibniz’ rule] ([1] page 592) Let f and g be real (or complex) valued functions that are defined on an open interval of R. If f and g are k times differentiable, then

(f g)^(k) = Σ_{r=0}^{k} (k choose r) f^(k−r) g^(r).

For multi-indices, Leibniz’ rule has the following generalization:

Theorem [2] If f, g : Rn → C are smooth functions, and j is a multi-index, then

∂^j (f g) = Σ_{i≤j} (j choose i) ∂^i (f ) ∂^{j−i} (g),

where i is a multi-index.
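The one-variable rule can be sanity-checked numerically. A convenient test case (our own choice) is f(x) = e^{2x}, g(x) = e^{3x}: then f^(k−r)(0) = 2^{k−r}, g^(r)(0) = 3^r, and since f g = e^{5x}, the kth derivative of the product at 0 must equal 5^k, so Leibniz' rule reduces to the binomial theorem:

```python
from math import comb

# Leibniz' rule at x = 0 for f(x) = e^{2x}, g(x) = e^{3x}:
# sum_{r=0}^{k} C(k, r) * 2^{k-r} * 3^r should equal (fg)^{(k)}(0) = 5^k.
def leibniz_kth_derivative_at_0(k):
    return sum(comb(k, r) * 2 ** (k - r) * 3 ** r for r in range(k + 1))

for k in range(10):
    assert leibniz_kth_derivative_at_0(k) == 5 ** k
```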

REFERENCES
1. R. Adams, Calculus, a complete course, Addison-Wesley Publishers Ltd, 3rd ed. 2. http://www.math.umn.edu/ jodeit/course/TmprDist1.pdf

Version: 3 Owner: matte Author(s): matte

365.5

Rolle’s theorem

Rolle’s theorem. If f is a continuous function on [a, b], differentiable on (a, b), and such that f (a) = f (b) = 0, then there exists a point c ∈ (a, b) such that f′(c) = 0. Version: 8 Owner: drini Author(s): drini
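As an illustrative sketch (our own example, not from the entry), take f(x) = x(x − 1) on [0, 1], so f(0) = f(1) = 0; a stationary point c with f′(c) = 0 can be located by bisecting on the sign of f′(x) = 2x − 1:

```python
# Rolle's theorem illustrated for f(x) = x(x - 1) on [0, 1]:
# f(0) = f(1) = 0, so some c in (0, 1) has f'(c) = 0.
def fprime(x):
    return 2.0 * x - 1.0   # derivative of x(x - 1)

lo, hi = 0.0, 1.0
for _ in range(60):        # bisect on the sign change of f'
    mid = (lo + hi) / 2.0
    if fprime(lo) * fprime(mid) <= 0:
        hi = mid
    else:
        lo = mid
c = (lo + hi) / 2.0
assert 0.0 < c < 1.0
assert abs(fprime(c)) < 1e-12   # c is (numerically) a stationary point
```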

365.6

binomial formula

The binomial formula gives the power series expansion of the pth power function for every real power p. To wit,

(1 + x)^p = Σ_{n=0}^{∞} (p)_n x^n / n!,  x ∈ R, |x| < 1,

where (p)_n = p(p − 1) . . . (p − n + 1) denotes the nth falling factorial of p.

Note that for p ∈ N the power series reduces to a polynomial. The above formula is therefore a generalization of the binomial theorem. Version: 4 Owner: rmilson Author(s): rmilson
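The series above can be checked numerically; here is a sketch for the sample values p = 0.5 and x = 0.3 (our own choice), comparing a partial sum against (1 + x)^p directly:

```python
# Partial sums of the binomial series for (1 + x)^p, built from the
# falling factorial (p)_n = p(p-1)...(p-n+1).
def binomial_series(p, x, terms=60):
    total, falling, factorial = 0.0, 1.0, 1.0
    for n in range(terms):
        if n > 0:
            falling *= p - (n - 1)   # extend the falling factorial
            factorial *= n           # extend n!
        total += falling * x ** n / factorial
    return total

p, x = 0.5, 0.3
assert abs(binomial_series(p, x) - (1 + x) ** p) < 1e-12
```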

365.7

chain rule

Let f (x), g(x) be differentiable, real-valued functions. The derivative of the composition (f ◦ g)(x) can be found using the chain rule, which asserts that:

(f ◦ g)′(x) = f′(g(x)) g′(x)

The chain rule has a particularly suggestive appearance in terms of the Leibniz formalism. Suppose that z depends differentiably on y, and that y in turn depends differentiably on x.

Then,

dz dz dy = dx dy dx

The apparent cancellation of the dy term is at best a formal mnemonic, and does not constitute a rigorous proof of this result. Rather, the Leibniz format is well suited to the interpretation of the chain rule in terms of related rates. To wit: The instantaneous rate of change of z relative to x is equal to the rate of change of z relative to y times the rate of change of y relative to x. Version: 5 Owner: rmilson Author(s): rmilson

365.8

complex Rolle’s theorem

Theorem [1] Suppose Ω is an open convex set in C, suppose f is a holomorphic function f : Ω → C, and suppose f (a) = f (b) = 0 for distinct points a, b in Ω. Then there exist points u, v on Lab (the straight line connecting a and b, not containing the endpoints), such that Re{f′(u)} = 0 and Im{f′(v)} = 0.

REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle’s Theorem, American Mathematical Monthly, Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.

Version: 4 Owner: matte Author(s): matte

365.9

complex mean-value theorem

Theorem [1] Suppose Ω is an open convex set in C, suppose f is a holomorphic function f : Ω → C, and suppose a, b are distinct points in Ω. Then there exist points u, v on Lab (the straight line connecting a and b, not containing the endpoints), such that

Re{(f (b) − f (a))/(b − a)} = Re{f′(u)},
Im{(f (b) − f (a))/(b − a)} = Im{f′(v)}.


REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle’s Theorem, American Mathematical Monthly, Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.

Version: 2 Owner: matte Author(s): matte

365.10

definite integral

The definite integral with respect to x of some function f (x) over the closed interval [a, b] is defined to be the “area under the graph of f (x) with respect to x” (if f (x) is negative, then you have a negative area). It is written as:

∫_a^b f (x) dx

One way to find the value of the integral is to take a limit of an approximation technique as the precision increases to infinity. For example, use a Riemann sum, which approximates the area by dividing it into n intervals of equal widths, and then calculating the area of rectangles with the width of the interval and height dependent on the function’s value in the interval. Let Rn be this approximation, which can be written as

Rn = Σ_{i=1}^{n} f (x*_i) ∆x

where x*_i is some x inside the ith interval. Then, the integral would be

∫_a^b f (x) dx = lim_{n→∞} Rn = lim_{n→∞} Σ_{i=1}^{n} f (x*_i) ∆x

We can use this definition to arrive at some important properties of definite integrals (a, b, c are constant with respect to x):

∫_a^b (f (x) + g(x)) dx = ∫_a^b f (x) dx + ∫_a^b g(x) dx
∫_a^b (f (x) − g(x)) dx = ∫_a^b f (x) dx − ∫_a^b g(x) dx
∫_a^b f (x) dx = −∫_b^a f (x) dx
∫_a^b f (x) dx = ∫_a^c f (x) dx + ∫_c^b f (x) dx
∫_a^b c f (x) dx = c ∫_a^b f (x) dx
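The Riemann-sum limit can be sketched numerically. Here x*_i is taken as the midpoint of each interval (one valid choice among many), and the test integral ∫_0^1 x² dx = 1/3 is our own example:

```python
# Riemann-sum sketch: approximate the integral of f over [a, b] with n
# equal-width rectangles, sampling f at each interval's midpoint.
def riemann_sum(f, a, b, n):
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) * dx for i in range(n))

# integral of x^2 over [0, 1] is exactly 1/3
approx = riemann_sum(lambda x: x * x, 0.0, 1.0, 100000)
assert abs(approx - 1.0 / 3.0) < 1e-9
```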

There are other generalisations about integrals, but many require the fundamental theorem of calculus. Version: 4 Owner: xriso Author(s): xriso

365.11

derivative of even/odd function (proof )

Suppose f (x) = ±f (−x). We need to show that f′(x) = ∓f′(−x). To do this, let us define the auxiliary function m : R → R, m(x) = −x. The condition on f is then f (x) = ±(f ◦ m)(x). Using the chain rule, we have that

f′(x) = ±(f ◦ m)′(x) = ±f′(m(x)) m′(x) = ∓f′(−x),

and the claim follows. □ Version: 2 Owner: mathcam Author(s): matte

365.12

direct sum of even/odd functions (example)

Example. direct sum of even and odd functions

Let us define the sets

F = {f | f is a function from R to R},
F+ = {f ∈ F | f (x) = f (−x) for all x ∈ R},
F− = {f ∈ F | f (x) = −f (−x) for all x ∈ R}.

In other words, F contains all functions from R to R, F+ ⊂ F contains all even functions, and F− ⊂ F contains all odd functions. All of these spaces have a natural vector space structure: for functions f and g we define f + g as the function x → f (x) + g(x). Similarly, if c is a real constant, then cf is the function x → cf (x). With these operations, the zero vector is the mapping x → 0. We claim that F is the direct sum of F+ and F− , i.e., that

F = F+ ⊕ F− . (365.12.1)

To prove this claim, let us first note that F± are vector subspaces of F . Second, given an arbitrary function f in F , we can define

f+ (x) = (1/2)( f (x) + f (−x) ),
f− (x) = (1/2)( f (x) − f (−x) ).

Now f+ and f− are even and odd functions, and f = f+ + f− . Thus any function in F can be split into two components f+ and f− , such that f+ ∈ F+ and f− ∈ F− . To show that the sum is direct, suppose f is an element in F+ ∩ F− . Then we have that f (x) = −f (−x) = −f (x), so f (x) = 0 for all x, i.e., f is the zero vector in F . We have established equation 365.12.1. Version: 2 Owner: mathcam Author(s): matte
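The even/odd splitting f = f+ + f− can be sketched in code. The example below decomposes exp, whose even and odd parts are cosh and sinh (a standard fact used as the test case here):

```python
import math

# Split f into its even part (f(x)+f(-x))/2 and odd part (f(x)-f(-x))/2.
def even_part(f):
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    return lambda x: 0.5 * (f(x) - f(-x))

f = math.exp                       # exp = cosh + sinh
fp, fm = even_part(f), odd_part(f)
for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    assert abs(fp(x) - fp(-x)) < 1e-12         # f+ is even
    assert abs(fm(x) + fm(-x)) < 1e-12         # f- is odd
    assert abs(fp(x) + fm(x) - f(x)) < 1e-12   # f = f+ + f-
    assert abs(fp(x) - math.cosh(x)) < 1e-9    # here f+ = cosh
```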

365.13

even/odd function

Definition. Let f be a function from R to R. If f (x) = f (−x) for all x ∈ R, then f is an even function. Similarly, if f (x) = −f (−x) for all x ∈ R, then f is an odd function.

Example.

1. The trigonometric functions sin and cos are odd and even, respectively.

Properties.

1. The vector space of real functions can be written as the direct sum of even and odd functions. (See this page.)
2. Let f : R → R be a differentiable function. (a) If f is an even function, then the derivative f′ is an odd function. (b) If f is an odd function, then the derivative f′ is an even function. (proof)
3. Let f : R → R be a smooth function. Then there exist smooth functions g, h : R → R such that f (x) = g(x²) + xh(x²) for all x ∈ R. Thus, if f is even, we have f (x) = g(x²), and if f is odd, we have f (x) = xh(x²) ([1], Exercise 1.2)

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.

Version: 4 Owner: mathcam Author(s): matte


365.14

example of chain rule

Suppose we wanted to differentiate

h(x) = √(sin(x)).

Here, h(x) is given by the composition h(x) = f (g(x)), where f (x) = √x and g(x) = sin(x). Then the chain rule says that

h′(x) = f′(g(x)) g′(x).

Since

f′(x) = 1/(2√x)  and  g′(x) = cos(x),

we have by the chain rule

h′(x) = (1/(2√(sin x))) cos x = cos x / (2√(sin x)).

Using the Leibniz formalism, the above calculation would have the following appearance. First we describe the functional relation as

z = √(sin(x)).

Next, we introduce an auxiliary variable y, and write

z = √y,  y = sin(x).

We then have

dz/dy = 1/(2√y),  dy/dx = cos(x),

and hence the chain rule gives

dz/dx = (1/(2√y)) cos(x) = cos(x)/(2√(sin(x))).

Version: 1 Owner: rmilson Author(s): rmilson


365.15

example of increasing/decreasing/monotone function

The function f (x) = e^x is strictly increasing and hence strictly monotone. Similarly g(x) = e^{−x} is strictly decreasing and hence strictly monotone. Consider the function h : [1, 10] → [1, 5] where

h(x) = √(x − 4√(x − 1) + 3) + √(x − 6√(x − 1) + 8).

It is not strictly monotone since it is constant on an interval; however, it is decreasing and hence monotone. Version: 1 Owner: Johan Author(s): Johan
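The behaviour of h can be sketched numerically. Writing t = √(x − 1), both radicands become perfect squares and h(x) = |t − 2| + |t − 3|, so h decreases on [1, 5] and is constant (equal to 1) on [5, 10]; the check below assumes this reading of the formula:

```python
import math

# h(x) = sqrt(x - 4*sqrt(x-1) + 3) + sqrt(x - 6*sqrt(x-1) + 8) on [1, 10].
# With t = sqrt(x-1) this is |t - 2| + |t - 3|: decreasing, then constant.
def h(x):
    t = math.sqrt(x - 1.0)
    return math.sqrt(x - 4.0 * t + 3.0) + math.sqrt(x - 6.0 * t + 8.0)

assert abs(h(1.0) - 5.0) < 1e-9
assert abs(h(5.0) - 1.0) < 1e-9
# constant on [5, 10], so h is not strictly monotone
assert all(abs(h(x) - 1.0) < 1e-9 for x in [5.0, 6.5, 8.0, 10.0])
# decreasing on [1, 5]
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
assert all(h(a) >= h(b) for a, b in zip(xs, xs[1:]))
```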

365.16

extended mean-value theorem

Let f : [a, b] → R and g : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Then there exists some number ξ ∈ (a, b) satisfying:

(f (b) − f (a)) g′(ξ) = (g(b) − g(a)) f′(ξ).

If g is linear this becomes the usual mean-value theorem. Version: 6 Owner: mathwizard Author(s): mathwizard

365.17

increasing/decreasing/monotone function

Definition Let A be a subset of R, and let f : A → R be a function. Then 1. f is increasing, if x ≤ y implies that f (x) ≤ f (y) (for all x and y in A). 2. f is strictly increasing, if x < y implies that f (x) < f (y). 3. f is decreasing, if x ≥ y implies that f (x) ≥ f (y). 4. f is strictly decreasing, if x > y implies that f (x) > f (y). 5. f is monotone, if f is either increasing or decreasing. 6. f is strictly monotone, if f is either strictly increasing or strictly decreasing. Theorem Let X be a bounded or unbounded open interval of R. In other words, let X be an interval of the form X = (a, b), where a, b ∈ R ∪ {−∞, ∞}. Further, let f : X → R be a monotone function.


1. The set of points where f is discontinuous is at most countable [1, 2]. 2. (Lebesgue) f is differentiable almost everywhere ([3], pp. 514).

REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990. 2. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976. 3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.

Version: 3 Owner: matte Author(s): matte

365.18

intermediate value theorem

Let f be a continuous function on the interval [a, b]. Let x1 and x2 be points with a ≤ x1 < x2 ≤ b such that f (x1 ) ≠ f (x2 ). Then for each value y between f (x1 ) and f (x2 ), there is a c ∈ (x1 , x2 ) such that f (c) = y. Bolzano’s theorem is a special case of this one. Version: 2 Owner: drini Author(s): drini

365.19

limit

Let f : X \ {a} −→ Y be a function between two metric spaces X and Y , defined everywhere except at some a ∈ X. For L ∈ Y , we say the limit of f (x) as x approaches a is equal to L, or lim f (x) = L
x→a

if, for every real number ε > 0, there exists a real number δ > 0 such that, whenever x ∈ X with 0 < dX (x, a) < δ, then dY (f (x), L) < ε. The formal definition of limit as given above has a well-deserved reputation for being notoriously hard for inexperienced students to master. There is no easy fix for this problem, since the concept of a limit is inherently difficult to state precisely (and indeed wasn’t even accomplished historically until the 1800s by Cauchy, well after the invention of calculus in the 1600s by Newton and Leibniz). However, there are a number of related definitions, which, taken together, may shed some light on the nature of the concept.


• The notion of a limit can be generalized to mappings between arbitrary topological spaces. In this context we say that limx→a f (x) = L if and only if, for every neighborhood V of L (in Y ), there is a deleted neighborhood U of a (in X) which is mapped into V by f. • Let an , n ∈ N be a sequence of elements in a metric space X. We say that L ∈ X is the limit of the sequence, if for every ε > 0 there exists a natural number N such that d(an , L) < ε for all natural numbers n > N. • The definition of the limit of a mapping can be based on the limit of a sequence. To wit, limx→a f (x) = L if and only if, for every sequence of points xn in X converging to a (that is, xn → a, xn = a), the sequence of points f (xn ) in Y converges to L. In calculus, X and Y are frequently taken to be Euclidean spaces Rn and Rm , in which case the distance functions dX and dY cited above are just Euclidean distance. Version: 5 Owner: djao Author(s): rmilson, djao

365.20

mean value theorem

Mean value theorem Let f : [a, b] → R be a continuous function differentiable on (a, b). Then there is some real number x0 ∈ (a, b) such that

f′(x0 ) = (f (b) − f (a))/(b − a).

Version: 3 Owner: drini Author(s): drini, apmxi

365.21

mean-value theorem

Let f : R → R be a function which is continuous on the interval [a, b] and differentiable on (a, b). Then there exists a number c, a < c < b, such that

f′(c) = (f (b) − f (a))/(b − a).    (365.21.1)

The geometrical meaning of this theorem is illustrated in the picture:


This is often used in the integral context: there exists c ∈ [a, b] such that

(b − a) f (c) = ∫_a^b f (x) dx.    (365.21.2)

Version: 4 Owner: mathwizard Author(s): mathwizard, drummond

365.22

monotonicity criterion

Suppose that f : [a, b] → R is a function which is continuous on [a, b] and differentiable on (a, b). Then the following relations hold. 1. f′(x) ≥ 0 for all x ∈ (a, b) ⇔ f is an increasing function on [a, b]; 2. f′(x) ≤ 0 for all x ∈ (a, b) ⇔ f is a decreasing function on [a, b]; 3. f′(x) > 0 for all x ∈ (a, b) ⇒ f is a strictly increasing function on [a, b]; 4. f′(x) < 0 for all x ∈ (a, b) ⇒ f is a strictly decreasing function on [a, b]. Notice that the third and fourth statements cannot be inverted. As an example consider the function f : [−1, 1] → R, f (x) = x³. This is a strictly increasing function, but f′(0) = 0. Version: 4 Owner: paolini Author(s): paolini

365.23

nabla

Let f : Rn → R be a C¹(Rn) function, that is, a partially differentiable function in all its coordinates. The symbol ∇, named nabla, represents the gradient operator, whose action on f (x1 , x2 , . . . , xn ) is given by

∇f = (fx1 , fx2 , . . . , fxn ) = (∂f/∂x1 , ∂f/∂x2 , . . . , ∂f/∂xn )

Version: 2 Owner: drini Author(s): drini, apmxi
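The gradient can be approximated by central differences. The test function f(x1, x2) = x1² + 3 x1 x2, with ∇f = (2x1 + 3x2, 3x1), is our own example:

```python
# Central-difference sketch of the gradient (nabla f).
def grad(f, point, h=1e-6):
    g = []
    for k in range(len(point)):
        up = list(point); up[k] += h
        dn = list(point); dn[k] -= h
        g.append((f(up) - f(dn)) / (2 * h))   # partial derivative in x_k
    return g

def f(p):
    x1, x2 = p
    return x1 * x1 + 3 * x1 * x2   # grad f = (2*x1 + 3*x2, 3*x1)

gx = grad(f, [1.0, 2.0])
assert abs(gx[0] - 8.0) < 1e-6    # 2*1 + 3*2 = 8
assert abs(gx[1] - 3.0) < 1e-6    # 3*1 = 3
```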


365.24

one-sided limit

Let f be a real-valued function defined on S ⊆ R. The left-hand one-sided limit at a is defined to be the real number L− such that for every ε > 0 there exists a δ > 0 such that |f (x) − L− | < ε whenever 0 < a − x < δ. Analogously, the right-hand one-sided limit at a is the real number L+ such that for every ε > 0 there exists a δ > 0 such that |f (x) − L+ | < ε whenever 0 < x − a < δ. Common notations for the one-sided limits are

L+ = f (x+) = lim_{x→a+} f (x) = lim_{x↓a} f (x),
L− = f (x−) = lim_{x→a−} f (x) = lim_{x↑a} f (x).

Sometimes, left-handed limits are referred to as limits from below while right-handed limits are from above.

Theorem The ordinary limit of a function exists at a point if and only if both one-sided limits exist at this point and are equal (to the ordinary limit).

e.g., The Heaviside unit step function, sometimes colloquially referred to as the diving board function, defined by

H(x) = 0 if x < 0, 1 if x ≥ 0

has the simplest kind of discontinuity at x = 0, a jump discontinuity. Its ordinary limit does not exist at this point, but the one-sided limits do exist, and are

lim_{x→0−} H(x) = 0 and lim_{x→0+} H(x) = 1.
Version: 5 Owner: matte Author(s): matte, NeuRet

365.25

product rule

The product rule states that if f : R → R and g : R → R are functions in one variable both differentiable at a point x0 , then the derivative of the product of the two functions, denoted f · g, at x0 is given by

D/Dx (f · g)(x0 ) = f (x0 )g′(x0 ) + f′(x0 )g(x0 ).


Proof See the proof of the product rule.

365.25.1

Generalized Product Rule

More generally, for differentiable functions f1 , f2 , . . . , fn in one variable, all differentiable at x0 , we have

D(f1 · · · fn )(x0 ) = Σ_{i=1}^{n} ( f1 (x0 ) · · · fi−1 (x0 ) · Dfi (x0 ) · fi+1 (x0 ) · · · fn (x0 ) ).

Also see Leibniz’ rule.

Example The derivative of x ln |x| can be found by application of this rule. Let f (x) = x, g(x) = ln |x|, so that f (x)g(x) = x ln |x|. Then f′(x) = 1 and g′(x) = 1/x. Therefore, by the product rule,

D/Dx (x ln |x|) = f (x)g′(x) + f′(x)g(x) = x · (1/x) + 1 · ln |x| = ln |x| + 1

Version: 8 Owner: mathcam Author(s): mathcam, Logan
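The worked example can be double-checked against a central-difference derivative; the helper name and the sample point x0 = 2 below are our own choices:

```python
import math

# Product-rule check for f(x) = x, g(x) = ln|x|:
# D/Dx (x ln|x|) should equal ln|x| + 1.
def numeric_derivative(func, x, h=1e-6):
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 2.0
lhs = numeric_derivative(lambda x: x * math.log(abs(x)), x0)
rhs = math.log(abs(x0)) + 1.0
assert abs(lhs - rhs) < 1e-8
```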

365.26

proof of Darboux’s theorem

WLOG, assume f′+(a) > t > f′−(b). Let g(x) = f (x) − tx. Then g′(x) = f′(x) − t, g′+(a) > 0 > g′−(b), and we wish to find a zero of g′. g is a continuous function on [a, b], so it attains a maximum on [a, b]. This maximum cannot be at a, since g′+(a) > 0, so g is locally increasing at a. Similarly, g′−(b) < 0, so g is locally decreasing at b and cannot have a maximum at b. So the maximum is attained at some c ∈ (a, b). But then g′(c) = 0 by Fermat’s theorem. Version: 2 Owner: paolini Author(s): paolini, ariels

365.27

proof of Fermat’s Theorem (stationary points)

Suppose that x0 is a local maximum (a similar proof applies if x0 is a local minimum). Then there exists δ > 0 such that (x0 − δ, x0 + δ) ⊂ (a, b) and such that we have f (x0 ) ≥ f (x) for all x with |x − x0 | < δ. Hence for h ∈ (0, δ) we notice that it holds

(f (x0 + h) − f (x0 ))/h ≤ 0.

Since the limit of this ratio as h → 0+ exists and is equal to f′(x0 ) we conclude that f′(x0 ) ≤ 0. On the other hand for h ∈ (−δ, 0) we notice that

(f (x0 + h) − f (x0 ))/h ≥ 0;

but again the limit as h → 0− exists and is equal to f′(x0 ), so we also have f′(x0 ) ≥ 0. Hence we conclude that f′(x0 ) = 0. Version: 1 Owner: paolini Author(s): paolini

365.28

proof of Rolle’s theorem

Because f is continuous on a compact (closed and bounded) interval I = [a, b], it attains its maximum and minimum values. In case f (a) = f (b) is both the maximum and the minimum, then there is nothing more to say, for then f is a constant function and f′ ≡ 0 on the whole interval I. So suppose otherwise, and f attains an extremum in the open interval (a, b), and without loss of generality, let this extremum be a maximum, considering −f in lieu of f as necessary. We claim that at this extremum f (c) we have f′(c) = 0, with a < c < b.

To show this, note that f (x) − f (c) ≤ 0 for all x ∈ I, because f (c) is the maximum. By definition of the derivative, we have that

f′(c) = lim_{x→c} (f (x) − f (c))/(x − c).

Looking at the one-sided limits, we note that

R = lim_{x→c+} (f (x) − f (c))/(x − c) ≤ 0

because the numerator in the limit is nonpositive in the interval I, yet x − c > 0, as x approaches c from the right. Similarly,

L = lim_{x→c−} (f (x) − f (c))/(x − c) ≥ 0.

Since f is differentiable at c, the left and right limits must coincide, so 0 ≤ L = R ≤ 0, that is to say, f′(c) = 0. Version: 1 Owner: rmilson Author(s): NeuRet

365.29

proof of Taylor’s Theorem

Let n be a natural number and I be the closed interval [a, b]. We have that f : I → R has n continuous derivatives and its (n + 1)-st derivative exists. Suppose that c ∈ I, and x ∈ I is arbitrary. Let J be the closed interval with endpoints c and x. Define F : J → R by

F (t) := f (x) − Σ_{k=0}^{n} ((x − t)^k / k!) f^(k)(t)    (365.29.1)

so that

F′(t) = −f′(t) − Σ_{k=1}^{n} ( ((x − t)^k / k!) f^(k+1)(t) − ((x − t)^{k−1} / (k − 1)!) f^(k)(t) ) = − ((x − t)^n / n!) f^(n+1)(t)

since the sum telescopes. Now, define G on J by

G(t) := F (t) − ((x − t)/(x − c))^{n+1} F (c)

and notice that G(c) = G(x) = 0. Hence, Rolle’s theorem gives us a ζ strictly between x and c such that

0 = G′(ζ) = F′(ζ) + (n + 1) ((x − ζ)^n / (x − c)^{n+1}) F (c)

that yields

F (c) = − (1/(n + 1)) ((x − c)^{n+1} / (x − ζ)^n) F′(ζ) = (1/(n + 1)) ((x − c)^{n+1} / (x − ζ)^n) ((x − ζ)^n / n!) f^(n+1)(ζ) = (f^(n+1)(ζ) / (n + 1)!) (x − c)^{n+1}

from which we conclude, recalling (365.29.1),

f (x) = Σ_{k=0}^{n} (f^(k)(c) / k!) (x − c)^k + (f^(n+1)(ζ) / (n + 1)!) (x − c)^{n+1}
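The remainder term can be sketched numerically. For f(x) = e^x about c = 0 (our own test case), the remainder is e^ζ x^(n+1)/(n + 1)! for some ζ between 0 and x, hence bounded by e^|x| |x|^(n+1)/(n + 1)!:

```python
import math

# Taylor's theorem sketch: partial sum of e^x about c = 0 with n+1 terms;
# the remainder is bounded by e^|x| * |x|^(n+1) / (n+1)!.
def taylor_exp(x, n):
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

x, n = 1.0, 10
remainder = abs(math.exp(x) - taylor_exp(x, n))
bound = math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)
assert remainder <= bound
assert remainder < 1e-7
```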

Version: 3 Owner: rmilson Author(s): NeuRet


365.30

proof of binomial formula

Let p ∈ R and x ∈ R, |x| < 1 be given. We wish to show that

(1 + x)^p = Σ_{n=0}^{∞} (p)_n x^n / n!,

where (p)_n denotes the nth falling factorial of p.

The convergence of the series in the right-hand side of the above equation is a straightforward consequence of the ratio test. Set f (x) = (1 + x)^p, and note that

f^(n)(x) = (p)_n (1 + x)^{p−n}.

The desired equality now follows from Taylor’s Theorem. Q.E.D. Version: 2 Owner: rmilson Author(s): rmilson

365.31

proof of chain rule
Let’s say that g is differentiable at x0 and f is differentiable at y0 = g(x0 ). We define:

ϕ(y) = (f (y) − f (y0 ))/(y − y0 ) if y ≠ y0 ,  ϕ(y) = f′(y0 ) if y = y0 .

Since f is differentiable at y0 , ϕ is continuous. We observe that, for x ≠ x0 ,

(f (g(x)) − f (g(x0 )))/(x − x0 ) = ϕ(g(x)) (g(x) − g(x0 ))/(x − x0 );

in fact, if g(x) ≠ g(x0 ), it follows at once from the definition of ϕ, while if g(x) = g(x0 ), both members of the equation are 0. Since g is continuous at x0 , and ϕ is continuous at y0 ,

lim_{x→x0} ϕ(g(x)) = ϕ(g(x0 )) = f′(g(x0 )),

hence

(f ◦ g)′(x0 ) = lim_{x→x0} (f (g(x)) − f (g(x0 )))/(x − x0 ) = lim_{x→x0} ϕ(g(x)) (g(x) − g(x0 ))/(x − x0 ) = f′(g(x0 )) g′(x0 ).

Version: 3 Owner: n3o Author(s): n3o

365.32

proof of extended mean-value theorem

Let f : [a, b] → R and g : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Define the function

h(x) = f (x)(g(b) − g(a)) − g(x)(f (b) − f (a)) − f (a)g(b) + f (b)g(a).

Because f and g are continuous on [a, b] and differentiable on (a, b), so is h. Furthermore, h(a) = h(b) = 0, so by Rolle’s theorem there exists a ξ ∈ (a, b) such that h′(ξ) = 0. This implies that

f′(ξ)(g(b) − g(a)) − g′(ξ)(f (b) − f (a)) = 0

and, if g(b) ≠ g(a),

f′(ξ)/g′(ξ) = (f (b) − f (a))/(g(b) − g(a)).

Version: 3 Owner: pbruin Author(s): pbruin

365.33

proof of intermediate value theorem

We first prove the following lemma. If f : [a, b] → R is a continuous function with f (a) ≤ 0 ≤ f (b) then there exists c ∈ [a, b] such that f (c) = 0.

Define the sequences (an ) and (bn ) inductively, as follows:

a0 = a, b0 = b, cn = (an + bn )/2,

(an , bn ) = (an−1 , cn−1 ) if f (cn−1 ) ≥ 0, and (an , bn ) = (cn−1 , bn−1 ) if f (cn−1 ) < 0.

We note that

a0 ≤ a1 ≤ . . . ≤ an ≤ bn ≤ . . . ≤ b1 ≤ b0 ,    (365.33.1)
bn − an = 2^{−n} (b0 − a0 ),    (365.33.2)
f (an ) ≤ 0 ≤ f (bn ).

By the fundamental axiom of analysis (an ) → α and (bn ) → β. But (bn − an ) → 0, so α = β. By continuity of f,

(f (an )) → f (α), (f (bn )) → f (α).

But we have f (α) ≤ 0 and f (α) ≥ 0, so that f (α) = 0. Furthermore we have a ≤ α ≤ b, proving the assertion.

Now set g(x) = f (x) − k where f (a) ≤ k ≤ f (b). Then g satisfies the same conditions as before, so there exists c such that g(c) = 0, i.e., f (c) = k. This proves the more general result. Version: 2 Owner: vitriol Author(s): vitriol
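The halving construction in this proof is exactly the bisection method. A minimal sketch, with f(x) = x³ − 2 on [1, 2] as our own test case (so the root is 2^(1/3)):

```python
# Bisection sketch of the proof: halve [a, b] keeping f(a_n) <= 0 <= f(b_n).
def bisect(f, a, b, steps=80):
    assert f(a) <= 0 <= f(b)
    for _ in range(steps):
        c = (a + b) / 2.0
        if f(c) >= 0:
            b = c        # keep the left half, as in the proof
        else:
            a = c        # keep the right half
    return (a + b) / 2.0

root = bisect(lambda x: x ** 3 - 2.0, 1.0, 2.0)
assert abs(root ** 3 - 2.0) < 1e-12
```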

365.34

proof of mean value theorem

Define h(x) on [a, b] by

h(x) = f (x) − f (a) − ((f (b) − f (a))/(b − a)) (x − a).

Clearly, h is continuous on [a, b], differentiable on (a, b), and

h(a) = f (a) − f (a) = 0,
h(b) = f (b) − f (a) − ((f (b) − f (a))/(b − a)) (b − a) = 0.

Notice that h satisfies the conditions of Rolle’s theorem. Therefore, by Rolle’s theorem there exists c ∈ (a, b) such that h′(c) = 0. However, from the definition of h we obtain by differentiation that

h′(x) = f′(x) − (f (b) − f (a))/(b − a).

Since h′(c) = 0, we therefore have

f′(c) = (f (b) − f (a))/(b − a)

as required.

REFERENCES
1. Michael Spivak, Calculus, 3rd ed., Publish or Perish Inc., 1994.

Version: 2 Owner: saforres Author(s): saforres

365.35

proof of monotonicity criterion

Let us start from the implications “⇒”. Suppose that f′(x) ≥ 0 for all x ∈ (a, b). We want to prove that f is therefore increasing. So take x1 , x2 ∈ [a, b] with x1 < x2 . Applying the mean-value theorem on the interval [x1 , x2 ] we know that there exists a point x ∈ (x1 , x2 ) such that

f (x2 ) − f (x1 ) = f′(x)(x2 − x1 )

and, f′(x) being ≥ 0, we conclude that f (x2 ) ≥ f (x1 ). This proves the first claim. The other three cases can be achieved with minor modifications: replace all “≥” respectively with ≤, > and <.

Let us now prove the implication “⇐” for the first and second statements. Given x ∈ (a, b) consider the ratio

(f (x + h) − f (x))/h.

If f is increasing, the numerator of this ratio is ≥ 0 when h > 0 and is ≤ 0 when h < 0. In either case the ratio is ≥ 0, since the denominator has the same sign as the numerator. Since we know by hypothesis that the function f is differentiable at x, we can pass to the limit to conclude that

f′(x) = lim_{h→0} (f (x + h) − f (x))/h ≥ 0.

If f is decreasing, the ratio considered turns out to be ≤ 0, hence the conclusion f′(x) ≤ 0. Notice that if we suppose that f is strictly increasing, we obtain that this ratio is > 0, but passing to the limit as h → 0 we cannot conclude that f′(x) > 0 but only (again) f′(x) ≥ 0. Version: 2 Owner: paolini Author(s): paolini

365.36

proof of quotient rule

Let F (x) = f (x)/g(x). Then

F′(x) = lim_{h→0} (F (x + h) − F (x))/h = lim_{h→0} ( f (x + h)/g(x + h) − f (x)/g(x) )/h = lim_{h→0} (f (x + h)g(x) − f (x)g(x + h))/(h g(x + h)g(x)).

Like the product rule, the key to this proof is subtracting and adding the same quantity. We separate f and g in the above expression by subtracting and adding the term f (x)g(x) in the numerator:

F′(x) = lim_{h→0} (f (x + h)g(x) − f (x)g(x) + f (x)g(x) − f (x)g(x + h))/(h g(x + h)g(x))
= lim_{h→0} ( g(x) (f (x + h) − f (x))/h − f (x) (g(x + h) − g(x))/h ) / (g(x + h)g(x))
= ( lim_{h→0} g(x) · lim_{h→0} (f (x + h) − f (x))/h − lim_{h→0} f (x) · lim_{h→0} (g(x + h) − g(x))/h ) / ( lim_{h→0} g(x + h) · lim_{h→0} g(x) )
= (g(x)f′(x) − f (x)g′(x)) / [g(x)]²
Version: 1 Owner: Luci Author(s): Luci

365.37

quotient rule

The quotient rule says that the derivative of the quotient f /g of two differentiable functions f and g exists at all values of x as long as g(x) ≠ 0, and is given by the formula

d/dx ( f (x)/g(x) ) = (g(x)f′(x) − f (x)g′(x)) / [g(x)]²

The Quotient Rule and the other differentiation formulas allow us to compute the derivative of any rational function. Version: 10 Owner: Luci Author(s): Luci

365.38

signum function

The signum function is the function sign : R → R

sign(x) = −1 when x < 0, 0 when x = 0, 1 when x > 0.

The following properties hold:

1. For all x ∈ R, sign(−x) = − sign(x).
2. For all x ∈ R, |x| = sign(x)x.
3. For all x ≠ 0, d|x|/dx = sign(x).

Here, we should point out that the signum function is often defined simply as 1 for x > 0 and −1 for x < 0. Thus, at x = 0, it is left undefined. See e.g. [2]. In applications, such as the Laplace transform, where the signum function is used, this definition is adequate, since the value of a function at a single point does not change the analysis. One could then, in fact, set sign(0) to any value. However, setting sign(0) = 0 is motivated by the above relations.

A related function is the Heaviside step function defined as

H(x) = 0 when x < 0, 1/2 when x = 0, 1 when x > 0.

Again, this function is sometimes left undefined at x = 0. The motivation for setting H(0) = 1/2 is that for all x ∈ R, we then have the relations

H(x) = (1/2)(sign(x) + 1),
H(−x) = 1 − H(x).

The first relation is clear. For the second, we have

1 − H(x) = 1 − (1/2)(sign(x) + 1) = (1/2)(1 − sign(x)) = (1/2)(1 + sign(−x)) = H(−x).

Example Let a < b be real numbers, and let f : R → R be the piecewise defined function

f (x) = 4 when x ∈ (a, b), 0 otherwise.

Using the Heaviside step function, we can write

f (x) = 4 ( H(x − a) − H(x − b) )    (365.38.1)

almost everywhere. Indeed, if we calculate f using equation 365.38.1 we obtain f (x) = 4 for x ∈ (a, b), f (x) = 0 for x ∉ [a, b], and f (a) = f (b) = 2. Therefore, equation 365.38.1 holds at all points except a and b. □

365.38.1

Signum function for complex arguments

For a complex number z, the signum function is defined as [1]

sign(z) = 0 when z = 0, z/|z| when z ≠ 0.

In other words, if z is non-zero, then sign z is the projection of z onto the unit circle {z ∈ C | |z| = 1}. Clearly, the complex signum function reduces to the real signum function for real arguments. For all z ∈ C, we have z̄ sign z = |z|, where z̄ is the complex conjugate of z.

REFERENCES
1. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed. 2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.

Version: 4 Owner: mathcam Author(s): matte


Chapter 366 26A09 – Elementary functions
366.1 definitions in trigonometry

Informal definitions Given a triangle ABC with a signed angle x at A and a right angle at B, the ratios

BC/AC, AB/AC, BC/AB

are dependent only on the angle x, and therefore define functions, denoted by

sin x, cos x, tan x

respectively, where the names are short for sine, cosine and tangent. Their reciprocals are rather less important, but also have names:

cot x = AB/BC = 1/tan x (cotangent)
csc x = AC/BC = 1/sin x (cosecant)
sec x = AC/AB = 1/cos x (secant)

From Pythagoras’s theorem we have cos²x + sin²x = 1 for all (real) x. Also it is “clear” from the diagram at left that the functions cos and sin are periodic with period 2π. However:

Formal definitions The above definitions are not fully rigorous, because we have not defined the word angle. We will sketch a more rigorous approach. The power series

Σ_{n=0}^{∞} x^n/n!

converges uniformly on compact subsets of C and its sum, denoted by exp(x) or by e^x, is therefore an entire function of x, called the exponential function. f (x) = exp(x) is the unique solution of the initial value problem

f (0) = 1, f′(x) = f (x)

on R. The sine and cosine functions, for real arguments, are defined in terms of exp, simply by

exp(ix) = cos x + i(sin x).

Thus

cos x = 1 − x²/2! + x⁴/4! − x⁶/6! + . . .
sin x = x/1! − x³/3! + x⁵/5! − . . .

Although it is not self-evident, cos and sin are periodic functions on the real line, and have the same period. That period is denoted by 2π.
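The series definitions can be sketched directly in code and compared with a library implementation; the helper names and sample points are our own:

```python
import math

# sin and cos from their power series, compared with math.sin / math.cos.
def sin_series(x, terms=20):
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

def cos_series(x, terms=20):
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

for x in [0.0, 0.5, 1.0, 2.0]:
    assert abs(sin_series(x) - math.sin(x)) < 1e-12
    assert abs(cos_series(x) - math.cos(x)) < 1e-12
    # the Pythagorean identity cos^2 x + sin^2 x = 1
    assert abs(cos_series(x) ** 2 + sin_series(x) ** 2 - 1.0) < 1e-12
```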

Version: 3 Owner: Daume Author(s): Larry Hammick

366.2

hyperbolic functions

The hyperbolic functions sinh x and cosh x are defined as follows:

sinh x := (e^x − e^{−x})/2,
cosh x := (e^x + e^{−x})/2.

One can then also define the functions tanh x and coth x in analogy to the definitions of tan x and cot x:

tanh x := sinh x / cosh x = (e^x − e^{−x})/(e^x + e^{−x}),
coth x := cosh x / sinh x = (e^x + e^{−x})/(e^x − e^{−x}).

The hyperbolic functions are named in that way because the hyperbola

x²/a² − y²/b² = 1

can be written in parametrical form with the equations: x = a cosh t, y = b sinh t. This is because of the equation cosh² x − sinh² x = 1. There are also addition formulas which are like the ones for trigonometric functions:

sinh(x ± y) = sinh x cosh y ± cosh x sinh y,
cosh(x ± y) = cosh x cosh y ± sinh x sinh y.

The Taylor series for the hyperbolic functions are:

sinh x = Σ_{n=0}^{∞} x^{2n+1}/(2n + 1)!,
cosh x = Σ_{n=0}^{∞} x^{2n}/(2n)!.

Using complex numbers we can use the hyperbolic functions to express the trigonometric functions:

sin x = sinh(ix)/i,
cos x = cosh(ix).

Version: 2 Owner: mathwizard Author(s): mathwizard
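The exponential definitions and the identity cosh² x − sinh² x = 1 can be sketched as follows; the sample points are our own choice:

```python
import math

# sinh and cosh from their exponential definitions, plus two identities.
def sinh(x): return (math.exp(x) - math.exp(-x)) / 2.0
def cosh(x): return (math.exp(x) + math.exp(-x)) / 2.0

# cosh^2 x - sinh^2 x = 1 (the hyperbola identity)
for x in [-1.5, 0.0, 0.7, 2.0]:
    assert abs(cosh(x) ** 2 - sinh(x) ** 2 - 1.0) < 1e-9

# addition formula: sinh(x + y) = sinh x cosh y + cosh x sinh y
x, y = 0.7, 1.1
assert abs(sinh(x + y) - (sinh(x) * cosh(y) + cosh(x) * sinh(y))) < 1e-9
```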


Chapter 367 26A12 – Rate of growth of functions, orders of infinity, slowly varying functions
367.1 Landau notation

Given two functions f and g from R+ to R+ , the notation

f = O(g)

means that the ratio f (x)/g(x) stays bounded as x → ∞. If moreover that ratio approaches zero, we write

f = o(g).

It is legitimate to write, say, 2x = O(x) = O(x²), with the understanding that we are using the equality sign in an unsymmetric (and informal) way, in that we do not have, for example, O(x²) = O(x).

The notation

f = Ω(g)

means that the ratio f (x)/g(x) is bounded away from zero as x → ∞, or equivalently g = O(f ).

If both f = O(g) and f = Ω(g), we write f = Θ(g).

One more notational convention in this group is

f (x) ∼ g(x),

meaning limx→∞ f (x)/g(x) = 1.

In analysis, such notation is useful in describing error estimates. For example, the Riemann hypothesis is equivalent to the conjecture

π(x) = x/log x + O(√x log x)

Landau notation is also handy in applied mathematics, e.g. in describing the efficiency of an algorithm. It is common to say that an algorithm requires O(x3 ) steps, for example, without needing to specify exactly what is a step; for if f = O(x3 ), then f = O(Ax3 ) for any positive constant A. Version: 8 Owner: mathcam Author(s): Larry Hammick, Logan


Chapter 368 26A15 – Continuity and related questions (modulus of continuity, semicontinuity, discontinuities, etc.)
368.1 Dirichlet’s function

Dirichlet’s function f : R → R is defined as

f(x) = 1/q if x = p/q is a rational number in lowest terms,
f(x) = 0 if x is an irrational number.

This function has the property that it is continuous at every irrational number and discontinuous at every rational one. Version: 3 Owner: urz Author(s): urz
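The rational branch of this function is easy to evaluate exactly. The sketch below (an illustration, not part of the original entry; irrational arguments cannot be represented exactly in this setting, so only the rational branch f(p/q) = 1/q is computed) uses Python's Fraction type, which automatically reduces to lowest terms:

```python
from fractions import Fraction

def dirichlet(x):
    """Value of Dirichlet's function at a rational x = p/q in lowest terms."""
    x = Fraction(x)             # Fraction normalizes to lowest terms
    return Fraction(1, x.denominator)

assert dirichlet(Fraction(3, 6)) == Fraction(1, 2)   # 3/6 = 1/2 in lowest terms
assert dirichlet(2) == 1                             # 2 = 2/1, so f = 1/1
assert dirichlet(Fraction(22, 7)) == Fraction(1, 7)
```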

368.2 semi-continuous

A real function f : A → R, where A ⊆ R, is said to be lower semi-continuous at x₀ if

∀ε > 0 ∃δ > 0 ∀x ∈ A: |x − x₀| < δ ⇒ f(x) > f(x₀) − ε,

and f is said to be upper semi-continuous at x₀ if

∀ε > 0 ∃δ > 0 ∀x ∈ A: |x − x₀| < δ ⇒ f(x) < f(x₀) + ε.


Remark. A real function is continuous at x₀ if and only if it is both upper and lower semi-continuous at x₀.

We can generalize the definition to arbitrary topological spaces as follows. Let A be a topological space. f : A → R is lower semi-continuous at x₀ if, for each ε > 0, there is a neighborhood U of x₀ such that x ∈ U implies f(x) > f(x₀) − ε.

Theorem. Let f : [a, b] → R be a lower (upper) semi-continuous function. Then f has a minimum (maximum) in [a, b].

Version: 3 Owner: drini Author(s): drini, n3o

368.3 semicontinuous

Definition [1] Suppose X is a topological space, and f is a function from X into the extended real numbers; f : X → R̄. Then:

1. If {x ∈ X | f(x) > α} is an open set in X for all α ∈ R, then f is said to be lower semicontinuous.
2. If {x ∈ X | f(x) < α} is an open set in X for all α ∈ R, then f is said to be upper semicontinuous.

Properties

1. If X is a topological space and f is a function f : X → R, then f is continuous if and only if f is upper and lower semicontinuous [1, 3].
2. The characteristic function of an open set is lower semicontinuous [1, 3].
3. The characteristic function of a closed set is upper semicontinuous [1, 3].
4. If f and g are lower semicontinuous, then f + g is also lower semicontinuous [3].

REFERENCES
1. W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. D.L. Cohn, Measure Theory, Birkhäuser, 1980.

Version: 2 Owner: bwebste Author(s): matte, apmxi

368.4 uniformly continuous

Let f : A → R be a real function defined on a subset A of the real line. We say that f is uniformly continuous if, given an arbitrarily small positive ε, there exists a positive δ such that whenever two points in A differ by less than δ, they are mapped by f into points which differ by less than ε. In symbols:

∀ε > 0 ∃δ > 0 ∀x, y ∈ A: |x − y| < δ ⇒ |f(x) − f(y)| < ε.

Every uniformly continuous function is also continuous, while the converse does not always hold. For instance, the function f : ]0, +∞[ → R defined by f(x) = 1/x is continuous on its domain, but not uniformly continuous.

A more general definition of uniform continuity applies to functions between metric spaces (there are even more general settings for uniformly continuous functions, e.g. uniform spaces). Given a function f : X → Y, where X and Y are metric spaces with distances dX and dY, we say that f is uniformly continuous if

∀ε > 0 ∃δ > 0 ∀x, y ∈ X: dX(x, y) < δ ⇒ dY(f(x), f(y)) < ε.

Uniformly continuous functions have the property that they map Cauchy sequences to Cauchy sequences and that they preserve uniform convergence of sequences of functions. Any continuous function defined on a compact space is uniformly continuous (see the Heine-Cantor theorem).

Version: 10 Owner: n3o Author(s): n3o
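The failure of uniform continuity for f(x) = 1/x can be seen concretely. In the sketch below (an illustration, not part of the original entry), for ε = 1 and any candidate δ, the pair x = δ, y = δ/2 lies within δ of each other, yet |f(x) − f(y)| = 1/δ, which blows up as δ shrinks:

```python
# f(x) = 1/x on (0, ∞) is continuous but not uniformly continuous:
# no single δ works for ε = 1, since |f(δ) - f(δ/2)| = 1/δ.
f = lambda x: 1.0 / x

for delta in [0.5, 0.1, 0.01]:
    x, y = delta, delta / 2
    assert abs(x - y) < delta                           # the points are δ-close
    assert abs(abs(f(x) - f(y)) - 1.0 / delta) < 1e-9   # yet |f(x)-f(y)| = 1/δ
    assert abs(f(x) - f(y)) > 1.0                       # ε = 1 is violated
```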


Chapter 369 26A16 – Lipschitz (Hölder) classes
369.1 Lipschitz condition

A mapping f : X → Y between metric spaces is said to satisfy the Lipschitz condition if there exists a real constant α ≥ 0 such that

dY(f(p), f(q)) ≤ α dX(p, q), for all p, q ∈ X.

Proposition 17. A Lipschitz mapping f : X → Y is uniformly continuous.

Proof. Let f be a Lipschitz mapping and α ≥ 0 a corresponding Lipschitz constant. For every given ε > 0, choose δ > 0 such that δα < ε. Let p, q ∈ X with dX(p, q) < δ be given. By assumption,

dY(f(p), f(q)) ≤ α dX(p, q) ≤ αδ < ε,

as desired. QED

Notes. More generally, one says that a mapping satisfies a Lipschitz condition of order β > 0 if there exists a real constant α ≥ 0 such that

dY(f(p), f(q)) ≤ α dX(p, q)^β, for all p, q ∈ X.

Version: 17 Owner: rmilson Author(s): rmilson, slider142
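A Lipschitz constant can be probed numerically by sampling difference quotients. The sketch below (illustrative only; the function `lipschitz_estimate` is a name introduced here, not from the entry) checks that sin, whose derivative cos is bounded by 1, has sampled quotients never exceeding the Lipschitz constant 1:

```python
import math

def lipschitz_estimate(f, a, b, n=2000):
    """Estimate sup |f(p)-f(q)| / |p-q| over adjacent sample pairs in [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    best = 0.0
    for p, q in zip(xs, xs[1:]):
        best = max(best, abs(f(p) - f(q)) / (q - p))
    return best

# sin satisfies a Lipschitz condition with constant 1 (since |cos| <= 1).
L = lipschitz_estimate(math.sin, 0.0, 2 * math.pi)
assert L <= 1.0 + 1e-9
assert L > 0.9   # the bound 1 is nearly attained near x = 0
```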


369.2 Lipschitz condition and differentiability

If X and Y are Banach spaces, e.g. Rⁿ, one can inquire about the relation between differentiability and the Lipschitz condition. The latter is the weaker condition. If f is Lipschitz, then the ratio

‖f(q) − f(p)‖ / ‖q − p‖,  p, q ∈ X,

is bounded but is not assumed to converge to a limit. Indeed, differentiability is the stronger condition.

Proposition 18. Let f : X → Y be a continuously differentiable mapping between Banach spaces. If K ⊂ X is a compact subset, then the restriction f : K → Y satisfies the Lipschitz condition.

Proof. Let Lin(X, Y) denote the Banach space of bounded linear maps from X to Y. Recall that the norm ‖T‖ of a linear mapping T ∈ Lin(X, Y) is defined by

‖T‖ = sup{ ‖Tu‖ / ‖u‖ : u ≠ 0 }.

Let Df : X → Lin(X, Y) denote the derivative of f. By definition Df is continuous, which really means that ‖Df‖ : X → R is a continuous function. Since K ⊂ X is compact, there exists a finite upper bound B₁ > 0 for ‖Df‖ restricted to K. In particular,

‖Df(p) u‖ ≤ ‖Df(p)‖ ‖u‖ ≤ B₁ ‖u‖, for all p ∈ K, u ∈ X.

Next, consider the secant mapping s : X × X → R defined by

s(p, q) = ‖f(q) − f(p) − Df(p)(q − p)‖ / ‖q − p‖ for q ≠ p,
s(p, q) = 0 for p = q.

This mapping is continuous, because f is assumed to be continuously differentiable. Hence, there is a finite upper bound B₂ > 0 for s restricted to the compact K × K. It follows that for all p, q ∈ K we have

‖f(q) − f(p)‖ ≤ ‖f(q) − f(p) − Df(p)(q − p)‖ + ‖Df(p)(q − p)‖ ≤ B₂ ‖q − p‖ + B₁ ‖q − p‖ = (B₁ + B₂) ‖q − p‖.

Therefore B₁ + B₂ is the desired Lipschitz constant. QED

Version: 22 Owner: rmilson Author(s): rmilson, slider142

369.3 Lipschitz condition and differentiability result

About Lipschitz continuity of differentiable functions the following holds.

Theorem 6. Let X, Y be Banach spaces and let A be a convex (see convex set), open subset of X. Let f : A → Y be a function which is continuous in A and differentiable in A. Then f is Lipschitz continuous on A if and only if the derivative Df is bounded on A, i.e.

sup_{x ∈ A} ‖Df(x)‖ < +∞.

Proof. Suppose that f is Lipschitz continuous:

‖f(x) − f(y)‖ ≤ L ‖x − y‖.

Then given any x ∈ A and any v ∈ X with ‖v‖ = 1, for all small h ∈ R we have

‖f(x + hv) − f(x)‖ / |h| ≤ L.

Hence, passing to the limit h → 0, it must hold that ‖Df(x)v‖ ≤ L, and taking the supremum over unit vectors v gives ‖Df(x)‖ ≤ L.

On the other hand, suppose that Df is bounded on A:

‖Df(x)‖ ≤ L, ∀x ∈ A.

Given any two points x, y ∈ A and given any α ∈ Y*, consider the function G : [0, 1] → R,

G(t) = ⟨α, f((1 − t)x + ty)⟩.

For t ∈ (0, 1) it holds that

G′(t) = ⟨α, Df((1 − t)x + ty)[y − x]⟩

and hence |G′(t)| ≤ L ‖α‖ ‖y − x‖. Applying the Lagrange mean-value theorem to G, we know that there exists ξ ∈ (0, 1) such that

|⟨α, f(y) − f(x)⟩| = |G(1) − G(0)| = |G′(ξ)| ≤ ‖α‖ L ‖y − x‖,

and since this is true for all α ∈ Y* (in particular those of unit norm), we get

‖f(y) − f(x)‖ ≤ L ‖y − x‖,

which is the desired claim.

Version: 1 Owner: paolini Author(s): paolini


Chapter 370 26A18 – Iteration
370.1 iteration

Let f : X → X be a function, X being any set. The n-th iteration of f is the function which is obtained if f is applied n times, and is denoted by fⁿ. More formally we define

f⁰(x) = x and f^(n+1)(x) = f(fⁿ(x))

for nonnegative integers n. If f is invertible, then by going backwards we can define the iterate also for negative n.

Version: 6 Owner: mathwizard Author(s): mathwizard
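The recursive definition translates directly into code. A minimal Python sketch (illustrative only; the helper name `iterate` is introduced here):

```python
def iterate(f, n, x):
    """Compute the n-th iteration f^n(x): f applied n times to x."""
    for _ in range(n):
        x = f(x)
    return x

double = lambda x: 2 * x
assert iterate(double, 0, 5) == 5    # f^0 is the identity
assert iterate(double, 3, 5) == 40   # 5 -> 10 -> 20 -> 40
# The defining recursion f^(n+1)(x) = f(f^n(x)):
assert iterate(double, 4, 5) == double(iterate(double, 3, 5))
```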

370.2 periodic point

Let f : X → X be a function and fⁿ its n-th iteration. A point x is called a periodic point of period n of f if it is a fixed point of fⁿ. The least n for which x is a fixed point of fⁿ is called the prime period or least period. If f is a function mapping R to R or C to C, then a periodic point x of prime period n is called hyperbolic if |(fⁿ)′(x)| ≠ 1, attractive if |(fⁿ)′(x)| < 1, and repelling if |(fⁿ)′(x)| > 1.

Version: 11 Owner: mathwizard Author(s): mathwizard
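As a concrete illustration (a sketch chosen for this entry, not an example from the original text), the map f(x) = x² − 1 has the orbit {0, −1} of prime period 2, and the chain rule gives (f²)′(x) = f′(f(x)) · f′(x):

```python
# f(x) = x^2 - 1 has the periodic orbit {0, -1} of prime period 2.
f = lambda x: x * x - 1.0
fprime = lambda x: 2.0 * x

# 0 is a fixed point of f^2 but not of f: prime period 2.
assert f(f(0.0)) == 0.0 and f(0.0) != 0.0

# Multiplier of the orbit: |(f^2)'(0)| = |f'(-1) * f'(0)| = 0 < 1,
# so this periodic point is attractive (and hyperbolic, since 0 != 1).
mult = abs(fprime(f(0.0)) * fprime(0.0))
assert mult < 1.0
```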


Chapter 371 26A24 – Differentiation (functions of one variable): general theory, generalized derivatives, mean-value theorems
371.1 Leibniz notation

Leibniz notation centers around the concept of a differential element. The differential element of x is represented by dx. You might think of dx as being an infinitesimal change in x. It is important to note that d is an operator, not a variable. So, when you see dy/dx, you can't automatically replace it with y/x.

We use df(x)/dx or (d/dx) f(x) to represent the derivative of a function f(x) with respect to x:

df(x)/dx = lim_{Δx→0} [f(x + Δx) − f(x)] / Δx.

We are dividing two numbers infinitely close to 0, and arriving at a finite answer. Δ is another operator that can be thought of as just "a change in" x. When we take the limit of Δx as Δx approaches 0, we get an infinitesimal change dx.

Leibniz notation shows a wonderful use in the following example:

dy/dx = (dy/du)(du/dx).

The two du's can be cancelled out to arrive at the original derivative. This is the Leibniz notation for the chain rule.

Leibniz notation shows up in the most common way of representing an integral,

F(x) = ∫ f(x) dx.

The dx is in fact a differential element. Let's start with a derivative that we know (since F(x) is an antiderivative of f(x)):

dF(x)/dx = f(x)
dF(x) = f(x) dx
∫ dF(x) = ∫ f(x) dx
F(x) = ∫ f(x) dx

We can think of dF(x) as the differential element of area. Since dF(x) = f(x) dx, the element of area is a rectangle, with f(x) × dx as its dimensions. Integration is the sum of all these infinitely thin elements of area along a certain interval. The result: a finite number. (a diagram is deserved here)

One clear advantage of this notation is seen when finding the length s of a curve. The formula is often seen as the following:

s = ∫ ds.

The length is the sum of all the elements, ds, of length. If we have a function f(x), the length element is usually written as ds = √(1 + [df(x)/dx]²) dx. If we modify this a bit, we get ds = √([dx]² + [df(x)]²). Graphically, we could say that the length element is the hypotenuse of a right triangle with one leg being the dx element, and the other leg being the df(x) element. (another diagram would be nice!)

There are a few caveats, such as if you want to take the value of a derivative. Compare to the prime notation:

f′(a) = df(x)/dx |_{x=a}.

A second derivative is represented as follows:

(d/dx)(dy/dx) = d²y/dx².

The other derivatives follow as can be expected: d³y/dx³, etc. You might think this is a little sneaky, but it is the notation. Properly using these terms can be interesting. For example, what is ∫ (d²y/dx²) dx? We could turn it into ∫ (d/dx)(dy/dx) dx or ∫ d(dy/dx). Either way, we get dy/dx.

Version: 2 Owner: xriso Author(s): xriso
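The limit of Δy/Δx underlying the notation can be watched numerically. A short Python sketch (illustrative only) tracks the difference quotient of f(x) = x² at x = 3, where the derivative is 6:

```python
# Difference quotients Δy/Δx approach dy/dx as Δx -> 0.
f = lambda x: x * x
x = 3.0

errors = []
for dx in [1e-1, 1e-2, 1e-3, 1e-4]:
    quotient = (f(x + dx) - f(x)) / dx   # Δy / Δx
    errors.append(abs(quotient - 6.0))

# For f(x) = x^2 the error of the quotient is exactly Δx,
# so the errors shrink along with Δx.
assert all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))
assert errors[-1] < 1e-3
```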

371.2 derivative

Qualitatively the derivative is a measure of the change of a function in a small region around a specified point.

Motivation
The idea behind the derivative comes from the straight line. What characterizes a straight line is the fact that it has constant “slope”. Figure 371.1: The straight line y = mx + b

In other words, for a line given by the equation y = mx + b, as in Fig. 371.1, the ratio of ∆y over ∆x is always constant and has the value ∆y/∆x = m.

Figure 371.2: The parabola y = x² and its tangent at (x₀, y₀)

For other curves we cannot define a "slope", like for the straight line, since such a quantity would not be constant. However, for sufficiently smooth curves, each point on a curve has a tangent line. For example consider the curve y = x², as in Fig. 371.2. At the point (x₀, y₀) on the curve, we can draw a tangent of slope m given by the equation y − y₀ = m(x − x₀).

Suppose we have a curve of the form y = f(x), and at the point (x₀, f(x₀)) we have a tangent given by y − y₀ = m(x − x₀). Note that for values of x sufficiently close to x₀ we can make the approximation f(x) ≈ m(x − x₀) + y₀. So the slope m of the tangent describes how much f(x) changes in the vicinity of x₀. It is the slope of the tangent that will be associated with the derivative of the function f(x).

Formal definition
More formally, for any real function f : R → R, we define the derivative of f at the point x as the following limit (if it exists):

f′(x) := lim_{h→0} [f(x + h) − f(x)] / h.
This definition turns out to be consistent with the motivation introduced above. The derivatives for some elementary functions are (cf. derivative notation):

1. (d/dx) c = 0, where c is constant;
2. (d/dx) xⁿ = n x^(n−1);
3. (d/dx) sin x = cos x;
4. (d/dx) cos x = − sin x;
5. (d/dx) eˣ = eˣ;
6. (d/dx) ln x = 1/x.
Derivatives of more complicated expressions can be calculated algorithmically using the following rules:

Linearity: (d/dx)(a f(x) + b g(x)) = a f′(x) + b g′(x);
Product rule: (d/dx)(f(x) g(x)) = f′(x) g(x) + f(x) g′(x);
Chain rule: (d/dx) g(f(x)) = g′(f(x)) f′(x);
Quotient rule: (d/dx)(f(x)/g(x)) = (f′(x) g(x) − f(x) g′(x)) / g(x)².

Note that the quotient rule, although given as much importance as the other rules in elementary calculus, can be derived by successively applying the product rule and the chain rule to f(x)/g(x) = f(x) · (1/g(x)). Also, the quotient rule does not generalize as well as the other ones.

Since the derivative f′(x) of f(x) is also a function of x, higher derivatives can be obtained by applying the same procedure to f′(x) and so on.
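These rules can be spot-checked against a numerical derivative. The sketch below (illustrative only; `numderiv` is a helper name introduced here) verifies the product rule and the chain rule at a sample point:

```python
import math

def numderiv(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.7

# Product rule: (f g)' = f' g + f g', for f = sin, g = exp.
lhs = numderiv(lambda t: math.sin(t) * math.exp(t), x)
rhs = math.cos(x) * math.exp(x) + math.sin(x) * math.exp(x)
assert abs(lhs - rhs) < 1e-6

# Chain rule: (g∘f)' = g'(f(x)) f'(x), for g(u) = u^2, f = sin.
lhs = numderiv(lambda t: math.sin(t) ** 2, x)
rhs = 2 * math.sin(x) * math.cos(x)
assert abs(lhs - rhs) < 1e-6
```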

Generalization

Banach Spaces

Unfortunately the notion of the "slope of the tangent" does not directly generalize to more abstract situations. What we can do is keep in mind the facts that the tangent is a linear function and that it approximates the function near the point of tangency, as well as the formal definition above. Very general conditions under which we can define a derivative in a manner much similar to the above are as follows.

Let f : V → W, where V and W are Banach spaces. Suppose that h ∈ V and h ≠ 0; then we define the directional derivative (D_h f)(x) at x as the following limit:

(D_h f)(x) := lim_{ε→0} [f(x + εh) − f(x)] / ε,

where ε is a scalar. Note that f(x + εh) ≈ f(x) + ε (D_h f)(x), which is consistent with our original motivation. This directional derivative is also called the Gâteaux derivative.


Finally we define the derivative at x as the bounded linear map (Df)(x) : V → W such that

lim_{h→0} ‖(f(x + h) − f(x)) − (Df)(x) · h‖ / ‖h‖ = 0.

Once again we have f(x + h) ≈ f(x) + (Df)(x) · h. In fact, if the derivative (Df)(x) exists, the directional derivatives can be obtained as (D_h f)(x) = (Df)(x) · h.¹ However, the existence of (D_h f)(x) for each nonzero h ∈ V does not guarantee the existence of (Df)(x). This derivative is also called the Fréchet derivative. In the more familiar case f : Rⁿ → Rᵐ, the derivative Df is simply the Jacobian of f.

Under these general conditions the following properties of the derivative remain:

1. Dh = 0, where h is a constant;
2. D(A · x) = A, where A is linear.

Linearity: D(a f(x) + b g(x)) · h = a (Df)(x) · h + b (Dg)(x) · h;
"Product" rule: D(B(f(x), g(x))) · h = B((Df)(x) · h, g(x)) + B(f(x), (Dg)(x) · h), where B is bilinear;
Chain rule: D(g(f(x))) · h = (Dg)(f(x)) · ((Df)(x) · h).

Note that the derivative of f can be seen as a function Df : V → L(V, W) given by Df : x ↦ (Df)(x), where L(V, W) is the space of bounded linear maps from V to W. Since L(V, W) can be considered a Banach space itself with the norm taken as the operator norm, higher derivatives can be obtained by applying the same procedure to Df and so on.

Manifolds
A manifold is a topological space that is locally homeomorphic to a Banach space V (for finite dimensional manifolds V = Rn ) and is endowed with enough structure to define derivatives. Since the notion of a manifold was constructed specifically to generalize the notion of a derivative, this seems like the end of the road for this entry. The following discussion is rather technical, a more intuitive explanation of the same concept can be found in the entry on related rates. Consider manifolds V and W modeled on Banach spaces V and W, respectively. Say we have y = f (x) for some x ∈ V and y ∈ W , then, by definition of a manifold, we can find
1 The notation A · h is used when h is a vector and A a linear operator. This notation can be considered advantageous to the usual notation A(h), since the latter is rather bulky and the former incorporates the intuitive distributive properties of linear operators also associated with usual multiplication.


charts (X, x) and (Y, y), where X and Y are neighborhoods of x and y, respectively. These charts provide us with canonical isomorphisms between the Banach spaces V and W, and the respective tangent spaces Tx V and Ty W : dxx : Tx V → V, dyy : Ty W → W.

Now consider a map f : V → W between the manifolds. By composing it with the chart maps we construct the map

g_{(X,x)}^{(Y,y)} = y ∘ f ∘ x⁻¹ : V → W,

defined on an appropriately restricted domain. Since we now have a map between Banach spaces, we can define its derivative at x(x) in the sense defined above, namely Dg_{(X,x)}^{(Y,y)}(x(x)). If this derivative exists for every choice of admissible charts (X, x) and (Y, y), we can say that the derivative Df(x) of f at x is defined and given by

Df(x) = dy_y⁻¹ ∘ Dg_{(X,x)}^{(Y,y)}(x(x)) ∘ dx_x

(it can be shown that this is well defined and independent of the choice of charts). Note that the derivative is now a map between the tangent spaces of the two manifolds, Df(x) : T_x V → T_y W. Because of this a common notation for the derivative of f at x is T_x f. Another alternative notation for the derivative is f_{*,x} because of its connection to the category-theoretical pushforward.

Version: 15 Owner: igor Author(s): igor

371.3 l'Hôpital's rule

L'Hôpital's rule states that given an unresolvable limit of the form 0/0 or ∞/∞, the ratio of functions f(x)/g(x) will have the same limit at c as the ratio f′(x)/g′(x). In short, if the limit of a ratio of functions approaches an indeterminate form, then

lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x)

provided this last limit exists. L'Hôpital's rule may be applied indefinitely as long as the conditions still exist. However it is important to note that the nonexistence of lim f′(x)/g′(x) does not prove the nonexistence of lim f(x)/g(x).

Example: We try to determine the value of

lim_{x→∞} x²/eˣ.

As x approaches ∞ the expression becomes an indeterminate form ∞/∞. By applying L'Hôpital's rule twice we get

lim_{x→∞} x²/eˣ = lim_{x→∞} 2x/eˣ = lim_{x→∞} 2/eˣ = 0.

Version: 8 Owner: mathwizard Author(s): mathwizard, slider142
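The limit in the example can be checked numerically. The sketch below (illustrative only, not part of the original entry) evaluates the ratio x²/eˣ at increasing x and sees it fall toward the limit 0 found by two applications of the rule:

```python
import math

# x^2 / e^x: both numerator and denominator tend to ∞,
# but the ratio tends to 0, matching the l'Hôpital computation.
vals = [x * x / math.exp(x) for x in [10.0, 20.0, 40.0, 80.0]]

assert all(b < a for a, b in zip(vals, vals[1:]))   # decreasing along the samples
assert vals[-1] < 1e-30                             # already tiny at x = 80
```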

371.4 proof of De l'Hôpital's rule

Let x₀ ∈ R, let I be an interval containing x₀, and let f and g be two differentiable functions defined on I \ {x₀} with g′(x) ≠ 0 for all x ∈ I \ {x₀}. Suppose that

lim_{x→x₀} f(x) = 0,  lim_{x→x₀} g(x) = 0

and that

lim_{x→x₀} f′(x)/g′(x) = m.

We want to prove that hence g(x) ≠ 0 for all x ∈ I \ {x₀} and

lim_{x→x₀} f(x)/g(x) = m.

First of all (with a little abuse of notation) we suppose that f and g are defined also at the point x₀ by f(x₀) = 0 and g(x₀) = 0. The resulting functions are continuous at x₀ and hence on the whole interval I.

Let us first prove that g(x) ≠ 0 for all x ∈ I \ {x₀}. If by contradiction g(x̄) = 0, since we also have g(x₀) = 0, by Rolle's theorem we get that g′(ξ) = 0 for some ξ ∈ (x₀, x̄), which is against our hypotheses.

Consider now any sequence xₙ → x₀ with xₙ ∈ I \ {x₀}. By Cauchy's mean value theorem there exists a sequence x′ₙ such that

f(xₙ)/g(xₙ) = [f(xₙ) − f(x₀)] / [g(xₙ) − g(x₀)] = f′(x′ₙ)/g′(x′ₙ).

But as xₙ → x₀, and since x′ₙ ∈ (x₀, xₙ), we get that x′ₙ → x₀ and hence

lim_{n→∞} f(xₙ)/g(xₙ) = lim_{n→∞} f′(x′ₙ)/g′(x′ₙ) = lim_{x→x₀} f′(x)/g′(x) = m.

Since this is true for any given sequence xₙ → x₀, we conclude that

lim_{x→x₀} f(x)/g(x) = m.

Version: 5 Owner: paolini Author(s): paolini

371.5 related rates

The notion of a derivative has numerous interpretations and applications. A well-known geometric interpretation is that of a slope, or more generally that of a linear approximation to a mapping between linear spaces (see here). Another useful interpretation comes from physics and is based on the idea of related rates. This second point of view is quite general, and sheds light on the definition of the derivative of a manifold mapping (the latter is described in the pushforward entry).

Consider two physical quantities x and y that are somehow coupled. For example:

• the quantities x and y could be the coordinates of a point as it moves along the unit circle;
• the quantity x could be the radius of a sphere and y the sphere's surface area;
• the quantity x could be the horizontal position of a point on a given curve and y the distance traversed by that point as it moves from some fixed starting position;
• the quantity x could be the depth of water in a conical tank and y the rate at which the water flows out the bottom.

Regardless of the application, the situation is such that a change in the value of one quantity is accompanied by a change in the value of the other quantity. So let's imagine that we take control of one of the quantities, say x, and change it in any way we like. As we do so, quantity y follows suit and changes along with x. Now the analytical relation between the values of x and y could be quite complicated and non-linear, but the relation between the instantaneous rates of change of x and y is linear. It does not matter how we vary the two quantities: the ratio of the rates of change depends only on the values of x and y. This ratio is, of course, the derivative of the function that maps the values of x to the values of y. Letting ẋ, ẏ denote the rates of change of the two quantities, we describe this conception of the derivative as

dy/dx = ẏ/ẋ,    or equivalently as    ẏ = (dy/dx) ẋ.    (371.5.1)

Next, let us generalize the discussion and suppose that the two quantities x and y represent physical states with multiple degrees of freedom. For example, x could be a point on the earth's surface, and y the position of a point 1 kilometer to the north of x. Again, the dependence of y on x is, in general, non-linear, but the rate of change of y does have a linear dependence on the rate of change of x. We would like to say that the derivative is

precisely this linear relation, but we must first contend with the following complication. The rates of change are no longer scalars, but rather velocity vectors, and therefore the derivative must be regarded as a linear transformation that changes one vector into another. In order to formalize this generalized notion of the derivative we must consider x and y to be points on manifolds X and Y, and the relation between them a manifold mapping φ : X → Y. A varying x is formally described by a trajectory

γ : I → X,  I ⊂ R.

The corresponding velocities take their value in the tangent spaces of X: γ′(t) ∈ T_{γ(t)} X. The "coupling" of the two quantities is described by the composition

φ ∘ γ : I → Y.

The derivative of φ at any given x ∈ X is a linear mapping

φ∗(x) : T_x X → T_{φ(x)} Y,

called the pushforward of φ at x, with the property that for every trajectory γ passing through x at time t, we have

(φ ∘ γ)′(t) = φ∗(x) γ′(t).

The above is the multi-dimensional and coordinate-free generalization of the related rates relation (371.5.1). All of the above has a perfectly rigorous presentation in terms of manifold theory. The approach of the present entry is more informal; our ambition was merely to motivate the notion of a derivative by describing it as a linear transformation between velocity vectors.

Version: 2 Owner: rmilson Author(s): rmilson
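Relation (371.5.1) can be seen at work on the unit-circle example. The sketch below (illustrative only, not part of the original entry) takes x = cos t, y = sin t and compares the ratio of rates ẏ/ẋ with the slope dy/dx computed directly from y = √(1 − x²) on the upper semicircle:

```python
import math

# x = cos t, y = sin t on the unit circle, so ẋ = -sin t, ẏ = cos t.
t = 1.1
x, y = math.cos(t), math.sin(t)
xdot, ydot = -math.sin(t), math.cos(t)

# Ratio of the rates of change (ẋ != 0 here).
ratio_of_rates = ydot / xdot

# Direct slope of y = sqrt(1 - x^2): dy/dx = -x / sqrt(1 - x^2).
dydx = -x / math.sqrt(1.0 - x * x)

# Both express the same derivative, as in (371.5.1).
assert abs(ratio_of_rates - dydx) < 1e-9
```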


Chapter 372 26A27 – Nondifferentiability (nondifferentiable functions, points of nondifferentiability), discontinuous derivatives
372.1 Weierstrass function

The Weierstrass function is a continuous function that is nowhere differentiable, and hence is not an analytic function. The formula for the Weierstrass function is

f(x) = Σ_{n=1}^∞ bⁿ cos(aⁿ π x)

with a odd, 0 < b < 1, and ab > 1 + (3/2)π.

Another example of an everywhere continuous but nowhere differentiable curve is the fractal Koch curve.

[insert plot of Weierstrass function]

Version: 5 Owner: akrowne Author(s): akrowne
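Partial sums of the series are easy to evaluate; the example parameters below (a = 13, b = 1/2, chosen here so that ab = 6.5 > 1 + 3π/2 ≈ 5.71) are an assumption for illustration, not values from the entry:

```python
import math

def weierstrass(x, a=13, b=0.5, terms=40):
    """Partial sum of the Weierstrass series sum_{n>=1} b^n cos(a^n pi x)."""
    return sum(b**n * math.cos(a**n * math.pi * x) for n in range(1, terms + 1))

# The series is dominated by sum b^n = b/(1-b) = 1, so partial sums are
# uniformly bounded; at x = 0 every cosine equals 1 and the sum is 1 - 2^-40.
assert abs(weierstrass(0.0) - (1.0 - 0.5**40)) < 1e-12
assert all(abs(weierstrass(x)) <= 1.0 + 1e-12 for x in [0.1, 0.25, 0.7])
```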


Chapter 373 26A36 – Antidifferentiation
373.1 antiderivative

The function F(x) is called an antiderivative of a function f(x) if (and only if) the derivative of F is equal to f:

F′(x) = f(x).

Note that there are an infinite number of antiderivatives for any function f(x), since any constant can be added or subtracted from any valid antiderivative to yield another equally valid antiderivative. To account for this, we express the general antiderivative, or indefinite integral, as follows:

∫ f(x) dx = F(x) + C

where C is an arbitrary constant called the constant of integration. The dx portion means "with respect to x", because, after all, our functions F and f are functions of x.

Version: 4 Owner: xriso Author(s): xriso

373.2 integration by parts

When one has an integral of a product of two functions, it is sometimes preferable to simplify the integrand by integrating one of the functions and differentiating the other. This process is called integrating by parts, and is defined in the following way, where u and v are functions of x:

∫ u · v′ dx = u · v − ∫ v · u′ dx.

This process may be repeated indefinitely, and in some cases it may be used to solve for the original integral algebraically. For definite integrals, the rule appears as

∫_a^b u(x) · v′(x) dx = (u(b) · v(b) − u(a) · v(a)) − ∫_a^b v(x) · u′(x) dx.

Proof: Integration by parts is simply the antiderivative form of the product rule. Let G(x) = u(x) · v(x). Then

G′(x) = u′(x) v(x) + u(x) v′(x).

Therefore

G′(x) − v(x) u′(x) = u(x) v′(x).

We can now integrate both sides with respect to x to get

G(x) − ∫ v(x) u′(x) dx = ∫ u(x) v′(x) dx,

which is just integration by parts rearranged.

Example: We integrate the function f(x) = x sin x. We define u(x) := x and v′(x) := sin x. So integration by parts yields

∫ x sin x dx = −x cos x + ∫ cos x dx = −x cos x + sin x.

Version: 5 Owner: mathwizard Author(s): mathwizard, slider142
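The example can be cross-checked numerically: by parts, ∫₀^π x sin x dx = [−x cos x + sin x]₀^π = π. The sketch below (illustrative only; `simpson` is a helper introduced here) compares a quadrature value against that answer:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# By parts: the antiderivative of x sin x is -x cos x + sin x,
# so the integral over [0, π] equals (-π·cos π + sin π) - 0 = π.
numeric = simpson(lambda x: x * math.sin(x), 0.0, math.pi)
assert abs(numeric - math.pi) < 1e-8
```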

373.3 integration by parts for the Lebesgue integral

Theorem [1, 2] Suppose f, g are complex valued functions on a bounded interval [a, b]. If f and g are absolutely continuous, then

∫_{[a,b]} f′g = −∫_{[a,b]} f g′ + f(b)g(b) − f(a)g(a),

where both integrals are Lebesgue integrals.

Remark. Any absolutely continuous function can be differentiated almost everywhere. Thus, in the above, the functions f′ and g′ make sense.

Proof. Since f, g and fg are almost everywhere differentiable with Lebesgue integrable derivatives (see this page), we have (fg)′ = f′g + fg′ almost everywhere, and

∫_{[a,b]} (fg)′ = ∫_{[a,b]} (f′g + fg′) = ∫_{[a,b]} f′g + ∫_{[a,b]} fg′.

The last equality is justified since f′g and fg′ are integrable. For instance, we have

∫_{[a,b]} |f′g| ≤ max_{x∈[a,b]} |g(x)| ∫_{[a,b]} |f′|,

which is finite since g is continuous and f′ is Lebesgue integrable. Now the claim follows from the Fundamental theorem of calculus for the Lebesgue integral. ∎

REFERENCES
1. Jones, F., Lebesgue Integration on Euclidean Space, Jones and Bartlett Publishers, 1993.
2. Ng, Tze Beng, Integration by Parts, online.

Version: 4 Owner: matte Author(s): matte


Chapter 374 26A42 – Integrals of Riemann, Stieltjes and Lebesgue type
374.1 Riemann sum

Suppose there is a function f : I → R where I = [a, b] is a closed interval, and f is bounded on I. If we have a finite set of points {x₀, x₁, x₂, ..., xₙ} such that a = x₀ < x₁ < x₂ < ··· < xₙ = b, then this set creates a partition P = {[x₀, x₁), [x₁, x₂), ..., [xₙ₋₁, xₙ]} of I. If P is a partition with n ∈ N elements of I, then the Riemann sum of f over I with the partition P is defined as

S = Σ_{i=1}^n f(yᵢ)(xᵢ − xᵢ₋₁)

where xᵢ₋₁ ≤ yᵢ ≤ xᵢ. The choice of yᵢ is arbitrary. If yᵢ = xᵢ₋₁ for all i, then S is called a left Riemann sum. If yᵢ = xᵢ, then S is called a right Riemann sum.

Suppose we have

S = Σ_{i=1}^n b(xᵢ − xᵢ₋₁)

where b is the supremum of f over [xᵢ₋₁, xᵢ]; then S is defined to be an upper Riemann sum. Similarly, if b is the infimum of f over [xᵢ₋₁, xᵢ], then S is a lower Riemann sum.

Version: 3 Owner: mathcam Author(s): mathcam, vampyr
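The definitions translate directly into code. The sketch below (illustrative only; the helper name `riemann_sum` is introduced here) computes left and right sums for f(x) = x² on [0, 1]; since f is increasing there, these coincide with the lower and upper sums and bracket ∫₀¹ x² dx = 1/3:

```python
def riemann_sum(f, partition, rule="left"):
    """Riemann sum of f for a partition [x0, ..., xn], tagging each
    subinterval with its left or right endpoint."""
    total = 0.0
    for x0, x1 in zip(partition, partition[1:]):
        y = x0 if rule == "left" else x1
        total += f(y) * (x1 - x0)
    return total

f = lambda x: x * x
n = 10000
P = [i / n for i in range(n + 1)]   # uniform partition of [0, 1]
left = riemann_sum(f, P, "left")
right = riemann_sum(f, P, "right")

# For increasing f, left = lower sum and right = upper sum.
assert left < 1/3 < right
assert abs(left - 1/3) < 1e-4 and abs(right - 1/3) < 1e-4
```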


374.2 Riemann-Stieltjes integral

Let f and α be bounded, real-valued functions defined upon a closed finite interval I = [a, b] of R (a ≠ b), P = {x₀, ..., xₙ} a partition of I, and tᵢ a point of the subinterval [xᵢ₋₁, xᵢ]. A sum of the form

S(P, f, α) = Σ_{i=1}^n f(tᵢ)(α(xᵢ) − α(xᵢ₋₁))

is called a Riemann-Stieltjes sum of f with respect to α. f is said to be Riemann integrable with respect to α on I if there exists A ∈ R such that given any ε > 0 there exists a partition Pε of I for which, for all P finer than Pε and for every choice of points tᵢ, we have

|S(P, f, α) − A| < ε.

If such an A exists, then it is unique and is known as the Riemann-Stieltjes integral of f with respect to α. f is known as the integrand and α the integrator. The integral is denoted by

∫_a^b f dα or ∫_a^b f(x) dα(x).

Version: 3 Owner: vypertd Author(s): vypertd

374.3 continuous functions are Riemann integrable

Let f : [a, b] → R be a continuous function. Then f is Riemann integrable. Version: 2 Owner: paolini Author(s): paolini

374.4 generalized Riemann integral

A function f : [a, b] → R is said to be generalized Riemann integrable on [a, b] if there exists a number L ∈ R such that for every ε > 0 there exists a gauge δ_ε on [a, b] such that if Ṗ is any δ_ε-fine partition of [a, b], then

|S(f; Ṗ) − L| < ε,

where S(f; Ṗ) is the Riemann sum for f using the partition Ṗ. The collection of all generalized Riemann integrable functions is usually denoted by R*[a, b]. If f ∈ R*[a, b] then the number L is uniquely determined, and is called the generalized Riemann integral of f over [a, b].

Version: 3 Owner: vypertd Author(s): vypertd

374.5 proof of continuous functions are Riemann integrable

Recall the definition of Riemann integral. To prove that f is integrable we have to prove that lim_{δ→0⁺} S*(δ) − S∗(δ) = 0. Since S*(δ) is decreasing and S∗(δ) is increasing, it is enough to show that given ε > 0 there exists δ > 0 such that S*(δ) − S∗(δ) < ε. So let ε > 0 be fixed.

By the Heine-Cantor theorem f is uniformly continuous, i.e.

∃δ > 0: |x − y| < δ ⇒ |f(x) − f(y)| < ε/(b − a).

Let now P be any partition of [a, b] in C(δ), i.e. a partition {x₀ = a, x₁, ..., x_N = b} such that xᵢ₊₁ − xᵢ < δ. In any small interval [xᵢ, xᵢ₊₁] the function f (being continuous) has a maximum Mᵢ and minimum mᵢ. Since f is uniformly continuous and xᵢ₊₁ − xᵢ < δ, we hence have Mᵢ − mᵢ < ε/(b − a). So the difference between upper and lower Riemann sums is

Σᵢ Mᵢ(xᵢ₊₁ − xᵢ) − Σᵢ mᵢ(xᵢ₊₁ − xᵢ) ≤ (ε/(b − a)) Σᵢ (xᵢ₊₁ − xᵢ) = ε.

Being this true for every partition P in C(δ), we conclude that S*(δ) − S∗(δ) ≤ ε.

Version: 1 Owner: paolini Author(s): paolini


Chapter 375 26A51 – Convexity, generalizations
375.1 concave function

Let f(x) be a continuous function defined on an interval [a, b]. Then we say that f is a concave function on [a, b] if, for any x₁, x₂ in [a, b] and any λ ∈ [0, 1], we have

f(λx₁ + (1 − λ)x₂) ≥ λf(x₁) + (1 − λ)f(x₂).

The definition is equivalent to the statements:

• For all x₁, x₂ in [a, b],

f((x₁ + x₂)/2) ≥ (f(x₁) + f(x₂))/2.

• The second derivative of f is nonpositive on [a, b].
• f has a derivative which is monotone decreasing.

Obviously, the last two items apply provided f has the required derivatives. An example of a concave function is f(x) = −x² on the interval [−5, 5].

Version: 5 Owner: drini Author(s): drini
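The midpoint form of the inequality is easy to test exhaustively on a grid. The sketch below (illustrative only, not part of the original entry) checks it for the example f(x) = −x² on [−5, 5]:

```python
# Midpoint-concavity check for f(x) = -x^2 on [-5, 5]:
# f((x1+x2)/2) >= (f(x1)+f(x2))/2 must hold for every pair of points.
f = lambda x: -x * x

pts = [-5 + i * 0.5 for i in range(21)]   # grid on [-5, 5]
for x1 in pts:
    for x2 in pts:
        mid = f((x1 + x2) / 2)
        avg = (f(x1) + f(x2)) / 2
        # Equivalent to (x1 - x2)^2 >= 0, so it always holds.
        assert mid >= avg - 1e-12
```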


Chapter 376 26Axx – Functions of one variable
376.1 function centroid

Let f : D ⊂ R → R be an arbitrary function. By analogy with the geometric centroid, the centroid of a function f is defined as:

x̄ = ∫ x f(x) dx / ∫ f(x) dx,

where the integrals are taken over the domain D. Version: 1 Owner: vladm Author(s): vladm
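For a concrete case (an illustration chosen here, not from the entry): for f(x) = x² on D = [0, 1] we have ∫x·f = 1/4 and ∫f = 1/3, so x̄ = 3/4. A quick numerical check with a midpoint rule:

```python
def integrate(g, a, b, n=100000):
    """Composite midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: x * x

# Centroid x̄ = ∫ x f(x) dx / ∫ f(x) dx over D = [0, 1].
xbar = integrate(lambda x: x * f(x), 0, 1) / integrate(f, 0, 1)
assert abs(xbar - 0.75) < 1e-6   # exact value is (1/4)/(1/3) = 3/4
```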


Chapter 377 26B05 – Continuity and differentiation questions
377.1
C0∞(U) is not empty

Theorem. If U is a non-empty open set in Rn, then the set of smooth functions with compact support C0∞(U) is not empty. The proof is divided into three sub-claims:

Claim 1. Let a < b be real numbers. Then there exists a smooth non-negative function f : R → R whose support is the compact set [a, b]. To prove Claim 1, we need the following lemma:

Lemma ([4], pp. 14). If φ(x) = 0 for x ≤ 0, and φ(x) = e^(−1/x) for x > 0,

then φ : R → R is a non-negative smooth function. (A proof of the Lemma can be found in [4].) Proof of Claim 1. Using the lemma, let us define f(x) = φ(x − a)φ(b − x). Since φ is smooth, it follows that f is smooth. Also, from the definition of φ, we see that φ(x − a) = 0 precisely when x ≤ a, and φ(b − x) = 0 precisely when x ≥ b. Thus the support of f is indeed [a, b]. □ Claim 2. Let ai, bi be real numbers with ai < bi for all i = 1, . . . , n. Then there exists a

smooth non-negative function f : Rn → R whose support is the compact set [a1, b1] × · · · × [an, bn]. Proof of Claim 2. Using Claim 1, we can for each i = 1, . . . , n construct a function fi with support in [ai, bi]. Then f(x1, . . . , xn) = f1(x1)f2(x2) · · · fn(xn) gives a smooth function with the sought properties. □ Claim 3. If U is a non-empty open set in Rn, then there are real numbers ai < bi for i = 1, . . . , n such that [a1, b1] × · · · × [an, bn] is a subset of U. Proof of Claim 3. Here, of course, we assume that Rn is equipped with the usual topology induced by the open balls of the Euclidean metric. Since U is non-empty, there exists some point x in U. Since the topology has a basis consisting of open balls, there exist y ∈ U and ε > 0 such that x is contained in the open ball B(y, ε). Let us now set ai = yi − ε/(2√n) and bi = yi + ε/(2√n) for all i = 1, . . . , n. Then D = [a1, b1] × · · · × [an, bn] can be parametrized as

D = { y + (λ1, . . . , λn) ε/(2√n) | λi ∈ [−1, 1] for all i = 1, . . . , n }.

For an arbitrary point in D we have

|y + (λ1, . . . , λn) ε/(2√n) − y| = (ε/(2√n)) √(λ1² + · · · + λn²) ≤ ε/2 < ε,

so D ⊂ B(y, ε) ⊂ U, and Claim 3 follows. □
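The bump function of Claim 1 can be written down directly; the sketch below (names are ours) builds φ and f(x) = φ(x − a)φ(b − x) and checks that f is non-negative and vanishes outside [a, b]:

```python
import math

def phi(x):
    """The building block of Claim 1's lemma:
    phi(x) = 0 for x <= 0, exp(-1/x) for x > 0 (smooth, non-negative)."""
    return 0.0 if x <= 0 else math.exp(-1.0 / x)

def bump(x, a=-1.0, b=1.0):
    """f(x) = phi(x - a) * phi(b - x): smooth, non-negative, support [a, b]."""
    return phi(x - a) * phi(b - x)
```

Evaluating `bump` at the endpoints and outside [−1, 1] returns exactly 0, while interior points give strictly positive values.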

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed., Springer-Verlag, 1990.

Version: 3 Owner: matte Author(s): matte

377.2

Rademacher’s Theorem

Let f : Rn → R be any Lipschitz continuous function. Then f is differentiable at almost every x ∈ Rn.

Version: 1 Owner: paolini Author(s): paolini

377.3

smooth functions with compact support

Definition [3]. Let U be an open set in Rn. Then the set of smooth functions with compact support (in U) is the set of functions f : Rn → C which are smooth (i.e., ∂αf : Rn → C is a continuous function for all multi-indices α) and such that supp f is compact and contained in U. This function space is denoted by C0∞(U). Remarks
1. A proof that C0∞(U) is not empty can be found here.
2. With the usual point-wise addition and point-wise multiplication by a scalar, C0∞(U) is a vector space over the field C.
3. Suppose U and V are open subsets in Rn and U ⊂ V. Then C0∞(U) is a vector subspace of C0∞(V). In particular, C0∞(U) ⊂ C0∞(V).

It is possible to equip C0∞(U) with a topology which makes C0∞(U) into a locally convex topological vector space. The definition of this topology is, however, rather involved (see e.g. [3]). The next theorem shows when a sequence converges in this topology.

Theorem 1. Suppose that U is an open set in Rn, and that {φi}∞i=1 is a sequence of functions in C0∞(U). Then {φi} converges (in the aforementioned topology) to a function φ ∈ C0∞(U) if and only if the following conditions hold:

1. There is a compact set K ⊂ U such that supp φi ⊂ K for all i = 1, 2, . . ..
2. For every multi-index α, ∂αφi → ∂αφ in the sup-norm.

Theorem 2. Suppose that U is an open set in Rn, that Γ is a locally convex topological vector space, and that L : C0∞(U) → Γ is a linear map. Then L is a continuous map if and only if the following condition holds:

If K is a compact subset of U, and {φi}∞i=1 is a sequence of functions in C0∞(U) such that supp φi ⊂ K for all i, and φi → φ (in C0∞(U)) for some φ ∈ C0∞(U), then Lφi → Lφ (in Γ).

The above theorems are stated without proof in [1].

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.

Version: 3 Owner: matte Author(s): matte


Chapter 378 26B10 – Implicit function theorems, Jacobians, transformations with several variables
378.1 Jacobian matrix

The Jacobian matrix [Jf(x)] of a function f : Rn → Rm is the matrix of partial derivatives whose (i, j) entry is Dj fi(x):

[Jf(x)] = ( D1f1(x) · · · Dnf1(x)
                ⋮      ⋱      ⋮
            D1fm(x) · · · Dnfm(x) )

A more concise way of writing it is

[Jf(x)] = [D1f, · · · , Dnf]   (columns)   =   (∇f1; . . . ; ∇fm)   (rows),

where Dnf is the vector of partial derivatives with respect to the nth variable and ∇fm is the gradient of the mth component of f. The Jacobian matrix represents the full derivative matrix [Df(x)] of f at x iff f is differentiable at x. Also, if f is differentiable at x, then [Jf(x)] = [Df(x)] and the directional derivative in the direction v is [Df(x)]v. Version: 9 Owner: slider142 Author(s): slider142
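A finite-difference approximation makes the entry layout (i, j) = Dj fi concrete. This is an illustrative sketch written here (central differences with a fixed step, not a library routine):

```python
def jacobian(f, x, h=1e-6):
    """Numerically approximate the Jacobian [Jf(x)] of f: R^n -> R^m by
    central differences; entry (i, j) is D_j f_i(x)."""
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

# f(x, y) = (x*y, x + y) has exact Jacobian [[y, x], [1, 1]]
J = jacobian(lambda v: [v[0] * v[1], v[0] + v[1]], [2.0, 3.0])
```

At (2, 3) the approximation recovers [[3, 2], [1, 1]] to within the truncation error of the difference scheme.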

378.2

directional derivative

Partial derivatives measure the rate at which a multivariable function f varies as the variable moves in the direction of the standard basis vectors. Directional derivatives measure the rate at which f varies when the variable moves in the direction v. Thus the directional derivative of f at a in the direction v is

Dv f(a) = (∂f/∂v)(a) = lim_{h→0} (f(a + hv) − f(a))/h.

For example, if f(x, y, z) = x² + 3y²z, and we want the derivative at the point a = (1, 2, 3) in the direction v = (1, 1, 1), our equation would be

lim_{h→0} (1/h)((1 + h)² + 3(2 + h)²(3 + h) − 37)
= lim_{h→0} (1/h)(3h³ + 22h² + 50h)
= lim_{h→0} (3h² + 22h + 50) = 50.

One may also use the Jacobian matrix, if the function is differentiable, to find the derivative in the direction v as [Jf(x)]v. Version: 6 Owner: slider142 Author(s): slider142
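The limit computation above can be reproduced numerically. The sketch below differences along t ↦ f(a + tv) (a central difference is used here as an approximation choice; the definition itself is one-sided) and recovers the value 50:

```python
def directional_derivative(f, a, v, h=1e-6):
    """Approximate D_v f(a) by a central difference of t -> f(a + t v) at t = 0."""
    fp = f([ai + h * vi for ai, vi in zip(a, v)])
    fm = f([ai - h * vi for ai, vi in zip(a, v)])
    return (fp - fm) / (2 * h)

f = lambda p: p[0] ** 2 + 3 * p[1] ** 2 * p[2]   # f(x, y, z) = x^2 + 3 y^2 z
d = directional_derivative(f, [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```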

378.3

gradient

Summary. The gradient is a first-order differential operator that maps functions to vector fields. It is a generalization of the ordinary derivative, and as such conveys information about the rate of change of a function relative to small variations in the independent variables. The gradient of a function f is customarily denoted by ∇f or by grad f.

Definition: Euclidean space. Consider n-dimensional Euclidean space with orthogonal coordinates x1, . . . , xn, and corresponding unit vectors e1, . . . , en. In this setting, the gradient of a function f(x1, . . . , xn) is defined to be the vector field given by

∇f = ∑_{i=1}^{n} (∂f/∂xi) ei.

It is also useful to represent the gradient operator as the vector-valued differential operator

∇ = ∑_{i=1}^{n} ei ∂/∂xi,

or, in the context of Euclidean 3-space, as

∇ = i ∂/∂x + j ∂/∂y + k ∂/∂z,

where i, j, k are the unit vectors lying along the positive direction of the x, y, z axes, respectively. Using this formalism, the symbol ∇ can be used to express the divergence operator as ∇·, the curl operator as ∇×, and the Laplacian operator as ∇². To wit, for a given vector field A = Ax i + Ay j + Az k, and a given function f, we have

∇ · A = ∂Ax/∂x + ∂Ay/∂y + ∂Az/∂z,
∇ × A = (∂Az/∂y − ∂Ay/∂z) i + (∂Ax/∂z − ∂Az/∂x) j + (∂Ay/∂x − ∂Ax/∂y) k,
∇² f = ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z².
Definition: Riemannian geometry. More generally still, consider a Riemannian manifold with metric tensor gij and inverse g^ij. In this setting the gradient X = grad f of a function f relative to a general coordinate system is given by

X^j = g^{ij} f,i.        (378.3.1)

Note that the Einstein summation convention is in force above. Also note that f,i denotes the partial derivative of f with respect to the ith coordinate. Definition (378.3.1) is useful even in the Euclidean setting, because it can be used to derive the formula for the gradient in various generalized coordinate systems. For example, in the cylindrical system of coordinates (r, θ, z) we have

gij = diag(1, r², 1),

while for the system of spherical coordinates (ρ, φ, θ) we have

gij = diag(1, ρ², ρ² sin² φ).

Hence, for a given function f we have

∇f = (∂f/∂r) er + (1/r)(∂f/∂θ) eθ + (∂f/∂z) k        (cylindrical),
∇f = (∂f/∂ρ) eρ + (1/ρ)(∂f/∂φ) eφ + (1/(ρ sin φ))(∂f/∂θ) eθ        (spherical),

where for the cylindrical system

er = ∂/∂r = (x/r) i + (y/r) j,
eθ = (1/r) ∂/∂θ = −(y/r) i + (x/r) j

are the unit vectors in the direction of increase of r and θ, respectively, and for the spherical system (with r = √(x² + y²))

eρ = ∂/∂ρ = (x/ρ) i + (y/ρ) j + (z/ρ) k,
eφ = (1/ρ) ∂/∂φ = (zx/(rρ)) i + (zy/(rρ)) j − (r/ρ) k,
eθ = (1/(ρ sin φ)) ∂/∂θ = −(y/r) i + (x/r) j

are the unit vectors in the direction of increase of ρ, φ, θ, respectively.

Physical Interpretation. In the simplest case, we consider the Euclidean plane with Cartesian coordinates x, y. The gradient of a function f(x, y) is given by

∇f = (∂f/∂x) i + (∂f/∂y) j,

where i, j denote, respectively, the standard unit horizontal and vertical vectors. The gradient vectors have the following geometric interpretation. Consider the graph z = f(x, y) as a surface in 3-space. The direction of the gradient vector ∇f is the direction of steepest ascent, while the magnitude

|∇f| = √((∂f/∂x)² + (∂f/∂y)²)

is the slope in that direction, i.e. the steepness of the hill z = f(x, y) at a point on the hill located at (x, y, f(x, y)). A more general conception of the gradient is based on the interpretation of a function f as a potential corresponding to some conservative physical force. The negation of the gradient, −∇f, is then interpreted as the corresponding force field.

Differential identities. Several properties of the one-dimensional derivative generalize to the multi-dimensional setting:

∇(af + bg) = a ∇f + b ∇g        (linearity)
∇(fg) = f ∇g + g ∇f        (product rule)
∇(φ ◦ f) = (φ′ ◦ f) ∇f        (chain rule)

Version: 9 Owner: rmilson Author(s): rmilson, slider142

378.4

implicit differentiation

Implicit differentiation is a tool used to analyze functions that cannot be conveniently put into a form y = f(x), where x = (x1, x2, ..., xn). To use implicit differentiation meaningfully, you must be certain that your function is of the form f(x) = 0 (it can be written as a level set) and that it satisfies the implicit function theorem (f must be continuous, its first partial derivatives must be continuous, and the derivative with respect to the implicit function must be non-zero). To actually differentiate implicitly, we use the chain rule to differentiate the entire equation.

Example: The first step is to identify the implicit function. For simplicity in the example, we will assume f(x, y) = 0 and y is an implicit function of x. Let f(x, y) = x² + y² + xy = 0. (Since this is a two-dimensional equation, all one has to check is that the graph of y may be an implicit function of x in local neighborhoods.) Then, to differentiate implicitly, we differentiate both sides of the equation with respect to x. We get

2x + 2y · dy/dx + x · dy/dx + y = 0.

Do you see how we used the chain rule in the above equation? Next, we simply solve for our implicit derivative: dy/dx = −(2x + y)/(2y + x). Note that the derivative depends on both the variable and the implicit function y. Most of your derivatives will be functions of one or all the variables, including the implicit function itself. [better example and ?multidimensional? coming] Version: 2 Owner: slider142 Author(s): slider142
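The recipe "differentiate both sides, then solve" always yields dy/dx = −f_x/f_y where f_y ≠ 0. Below is a minimal numeric sketch of that recipe (the unit circle x² + y² − 1 = 0 is used here as an illustrative curve with an explicit branch y = √(1 − x²) to compare against):

```python
import math

def implicit_dydx(fx, fy, x, y):
    """dy/dx for a level curve f(x, y) = 0, via the chain rule:
    f_x + f_y * dy/dx = 0  =>  dy/dx = -f_x / f_y  (requires f_y != 0)."""
    return -fx(x, y) / fy(x, y)

# Level curve f(x, y) = x^2 + y^2 - 1 = 0, upper branch y(x) = sqrt(1 - x^2)
x0 = 0.6
y0 = math.sqrt(1 - x0 ** 2)                     # 0.8
slope = implicit_dydx(lambda x, y: 2 * x, lambda x, y: 2 * y, x0, y0)
# Explicit derivative of the branch: y'(x) = -x / sqrt(1 - x^2)
explicit = -x0 / math.sqrt(1 - x0 ** 2)
```

Both computations give −0.75 at (0.6, 0.8), as they must.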

378.5

implicit function theorem

Let f = (f1, ..., fn) be a continuously differentiable, vector-valued function mapping an open set E ⊂ Rn+m into Rn. Let (a, b) = (a1, ..., an, b1, ..., bm) be a point in E for which f(a, b) = 0 and such that the n × n determinant |Dj fi(a, b)| ≠ 0 for i, j = 1, ..., n. Then there exists an m-dimensional neighbourhood W of b and a unique continuously differentiable function g : W → Rn such that g(b) = a and f(g(t), t) = 0 for all t ∈ W.


Simplest case. When n = m = 1, the theorem reduces to the following: Let F be a continuously differentiable, real-valued function defined on an open set E ⊂ R² and let (x0, y0) be a point in E for which F(x0, y0) = 0 and such that

(∂F/∂x)(x0, y0) ≠ 0.

Then there exists an open interval I containing y0, and a unique function f : I → R which is continuously differentiable and such that f(y0) = x0 and F(f(y), y) = 0 for all y ∈ I. Note: The inverse function theorem is a special case of the implicit function theorem where the dimension of each variable is the same. Version: 7 Owner: vypertd Author(s): vypertd

378.6

proof of implicit function theorem

Consider the function F : E → Rn × Rm defined by F(x, y) = (f(x, y), y). Setting

M_{ji} = ∂fj/∂xi (a, b),    A_{jk} = ∂fj/∂yk (a, b),

M is an n × n matrix and A is n × m. It holds that Df(a, b) = (M | A) and hence

DF(a, b) = ( M  A
             0  I_m ).

Since det M ≠ 0, M is invertible and hence DF(a, b) is invertible too. Applying the inverse function theorem to F, we find that there exist a neighbourhood V of 0 = f(a, b) in Rn and a neighbourhood W of b, and a function G ∈ C¹(V × W, Rn+m), such that F(G(x, y)) = (x, y) for all (x, y) ∈ V × W. Writing G(x, y) = (G1(x, y), G2(x, y)) (so that G1 : V × W → Rn and G2 : V × W → Rm), we hence have

(x, y) = F(G1(x, y), G2(x, y)) = (f(G1(x, y), G2(x, y)), G2(x, y)),

and hence y = G2(x, y) and x = f(G1(x, y), G2(x, y)) = f(G1(x, y), y). So we only have to set g(y) = G1(0, y) to obtain

f(g(y), y) = 0    for all y ∈ W.

Version: 1 Owner: paolini Author(s): paolini


Chapter 379 26B12 – Calculus of vector functions
379.1 Clairaut’s theorem

Theorem. (Clairaut's Theorem) If F : Rn → Rm is a function whose second partial derivatives exist and are continuous on a set S ⊆ Rn, then

∂²F/∂xi∂xj = ∂²F/∂xj∂xi

on S (where 1 ≤ i, j ≤ n).

This theorem is commonly referred to as simply 'the equality of mixed partials'. It is usually first presented in a vector calculus course, and is useful in this context for proving basic properties of the interrelations of gradient, divergence, and curl. I.e., if F : R³ → R³ is a function satisfying the hypothesis, then ∇ · (∇ × F) = 0. Or, if f : R³ → R is a function satisfying the hypothesis, ∇ × ∇f = 0. Version: 10 Owner: flynnheiss Author(s): flynnheiss
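The equality of mixed partials can be observed numerically. Below is a sketch written for this entry (nested central differences; step size and test function are illustrative) comparing the two orders of differentiation for f(x, y) = x³y² + xy:

```python
def mixed_partial(f, x, y, order, h=1e-4):
    """Second mixed partial of f(x, y) by nested central differences.
    order 'xy' differentiates in x first, then y; 'yx' the reverse."""
    def dfdx(x, y):
        return (f(x + h, y) - f(x - h, y)) / (2 * h)
    def dfdy(x, y):
        return (f(x, y + h) - f(x, y - h)) / (2 * h)
    if order == "xy":
        return (dfdx(x, y + h) - dfdx(x, y - h)) / (2 * h)
    return (dfdy(x + h, y) - dfdy(x - h, y)) / (2 * h)

f = lambda x, y: x ** 3 * y ** 2 + x * y   # mixed partial: 6 x^2 y + 1
fxy = mixed_partial(f, 1.0, 2.0, "xy")
fyx = mixed_partial(f, 1.0, 2.0, "yx")
```

At (1, 2) both orders approximate 6x²y + 1 = 13, in line with the theorem's continuity hypothesis.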

379.2

Fubini’s Theorem

Fubini's Theorem. Let I ⊂ RN and J ⊂ RM be compact intervals, and let f : I × J → RK be a Riemann integrable function such that, for each x ∈ I, the integral F(x) := ∫J f(x, y) dμJ(y) exists. Then F : I → RK is Riemann integrable, and

∫I F = ∫I×J f.

This theorem effectively states that, given a function of N variables, you may integrate it one variable at a time, and that the order of integration does not affect the result.

Example. Let I := [0, π/2] × [0, π/2], and let f : I → R, (x, y) ↦ sin(x) cos(y), be a function. Then

∫I f = ∫_{[0,π/2]×[0,π/2]} sin(x) cos(y)
     = ∫0^{π/2} ∫0^{π/2} sin(x) cos(y) dy dx
     = ∫0^{π/2} sin(x) (1 − 0) dx
     = (0 − (−1)) = 1.

Note that it is often simpler (and no less correct) to write ∫· · ·∫_I f as ∫I f. Version: 3 Owner: vernondalhart Author(s): vernondalhart
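The example can be checked by computing the iterated integral in both orders; the midpoint-rule sketch below (grid size and helper names are illustrative choices) returns 1 either way:

```python
import math

def iterated_integral(f, ax, bx, ay, by, n=400):
    """Midpoint-rule iterated integral: integrate over y first, then x."""
    hx = (bx - ax) / n
    hy = (by - ay) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        inner = sum(f(x, ay + (j + 0.5) * hy) * hy for j in range(n))
        total += inner * hx
    return total

f = lambda x, y: math.sin(x) * math.cos(y)
I_xy = iterated_integral(f, 0, math.pi / 2, 0, math.pi / 2)
# Swapping the roles of the variables integrates in the other order
I_yx = iterated_integral(lambda x, y: f(y, x), 0, math.pi / 2, 0, math.pi / 2)
```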

379.3

Generalised N-dimensional Riemann Sum

Let I = [a1, b1] × · · · × [aN, bN] be an N-cell in RN. For each j = 1, . . . , N, let aj = tj,0 < . . . < tj,Nj = bj be a partition Pj of [aj, bj]. We define a partition P of I as

P := P1 × · · · × PN.

Each partition P of I generates a subdivision of I (denoted by (Iν)ν) of the form

Iν = [t1,j, t1,j+1] × · · · × [tN,k, tN,k+1].

Let f : U → RM be such that I ⊂ U, and let (Iν)ν be the corresponding subdivision of a partition P of I. For each ν, choose xν ∈ Iν. Define

S(f, P) := ∑ν f(xν) μ(Iν)

as the Riemann sum of f corresponding to the partition P. A partition Q of I is called a refinement of P if P ⊂ Q. Version: 1 Owner: vernondalhart Author(s): vernondalhart

379.4

Generalized N-dimensional Riemann Integral

Let I = [a1, b1] × · · · × [aN, bN] ⊂ RN be a compact interval, and let f : I → RM be a function. If there exists a y ∈ RM such that for every ε > 0 there is a partition P of I such that for each refinement P′ of P (and corresponding Riemann sum S(f, P′)) we have

|S(f, P′) − y| < ε,

then we say that f is Riemann integrable over I, that y is the Riemann integral of f over I, and we write

∫I f := ∫I f dμ := y.

Note also that it is possible to extend this definition to more arbitrary sets; for any bounded set D, one can find a compact interval I such that D ⊂ I, and define a function f̃ : I → RM by

f̃(x) = f(x) for x ∈ D,    f̃(x) = 0 for x ∉ D,

in which case we define

∫D f := ∫I f̃.

Version: 3 Owner: vernondalhart Author(s): vernondalhart

379.5

Helmholtz equation

It is a partial differential equation which, in scalar form, is

∇²f + k²f = 0,

or in vector form is

∇²A + k²A = 0,

where ∇² is the Laplacian. The solutions of this equation represent solutions of the wave equation, which is of great interest in physics. Consider a wave equation

∂²ψ/∂t² = c² ∇²ψ

with wave speed c. If we look for time-harmonic standing waves of frequency ω,

ψ(x, t) = e^{−jωt} φ(x),

we find that φ(x) satisfies the Helmholtz equation

(∇² + k²) φ = 0,

where k = ω/c is the wave number. Usually the Helmholtz equation is solved by the separation of variables method, in Cartesian, spherical or cylindrical coordinates. Version: 3 Owner: giri Author(s): giri

379.6

Hessian matrix

The Hessian of a scalar function of a vector is the matrix of second partial derivatives. So the Hessian matrix of a function f : Rn → R is:

( ∂²f/∂x1²       ∂²f/∂x1∂x2   . . .   ∂²f/∂x1∂xn
  ∂²f/∂x2∂x1    ∂²f/∂x2²      . . .   ∂²f/∂x2∂xn
      ⋮               ⋮         ⋱          ⋮
  ∂²f/∂xn∂x1    ∂²f/∂xn∂x2   . . .   ∂²f/∂xn² )        (379.6.1)

Note that the Hessian is symmetric because of the equality of mixed partials (provided the second partial derivatives are continuous). Version: 2 Owner: bshanks Author(s): akrowne, bshanks
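A numeric Hessian makes the symmetry claim easy to observe. The following sketch (written here; it uses the standard four-point central-difference stencil for ∂²f/∂xi∂xj) recovers the exact Hessian of a polynomial:

```python
def hessian(f, x, h=1e-4):
    """Numerically approximate the Hessian of f: R^n -> R;
    H[i][j] ~ the second partial of f in coordinates i and j."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def g(si, sj):
                p = list(x)
                p[i] += si * h
                p[j] += sj * h
                return f(p)
            H[i][j] = (g(1, 1) - g(1, -1) - g(-1, 1) + g(-1, -1)) / (4 * h * h)
    return H

# f(x, y) = x^2 y + y^3 has exact Hessian [[2y, 2x], [2x, 6y]]
H = hessian(lambda p: p[0] ** 2 * p[1] + p[1] ** 3, [1.0, 2.0])
```

At (1, 2) this gives approximately [[4, 2], [2, 12]], with H[0][1] = H[1][0] as the entry asserts.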

379.7

Jordan Content of an N-cell

Let I = [a1, b1] × · · · × [aN, bN] be an N-cell in RN. Then the Jordan content (denoted μ(I)) of I is defined as

μ(I) := ∏_{j=1}^{N} (bj − aj).

Version: 1 Owner: vernondalhart Author(s): vernondalhart

379.8

Laplace equation

The scalar form of Laplace's equation is the partial differential equation

∇²f = 0,

and the vector form is

∇²A = 0,

where ∇² is the Laplacian. It is a special case of the Helmholtz differential equation with k = 0.

A function f which satisfies Laplace's equation is said to be harmonic. Since Laplace's equation is linear, the superposition of any two solutions is also a solution. Version: 3 Owner: giri Author(s): giri

379.9

chain rule (several variables)

The chain rule is a theorem of analysis that governs derivatives of composed functions. The basic theorem is the chain rule for functions of one variable (see here). This entry is devoted to the more general version involving functions of several variables and partial derivatives. Note: the symbol Dk will be used to denote the partial derivative with respect to the kth variable.

Let F(x1, . . . , xn) and G1(x1, . . . , xm), . . . , Gn(x1, . . . , xm) be differentiable functions of several variables, and let

H(x1, . . . , xm) = F(G1(x1, . . . , xm), . . . , Gn(x1, . . . , xm))

be the function determined by the composition of F with G1, . . . , Gn. The partial derivatives of H are given by

(Dk H)(x1, . . . , xm) = ∑_{i=1}^{n} (Di F)(G1(x1, . . . , xm), . . . , Gn(x1, . . . , xm)) (Dk Gi)(x1, . . . , xm).

The chain rule can be more compactly (albeit less precisely) expressed in terms of the Jacobi–Legendre partial derivative symbols (historical note). Just as in the Leibniz system, the basic idea is that of one quantity (i.e. variable) depending on one or more other quantities. Thus we would speak about a variable z that depends differentiably on y1, . . . , yn, which in turn depend differentiably on variables x1, . . . , xm. We would then write the chain rule as

∂z/∂xj = ∑_{i=1}^{n} (∂z/∂yi)(∂yi/∂xj),    j = 1, . . . , m.

The most general, and conceptually clearest, approach to the multi-variable chain rule is based on the notion of a differentiable mapping, with the Jacobian matrix of partial derivatives playing the role of the generalized derivative. Let X ⊂ Rm and Y ⊂ Rn be open domains and let

F : Y → Rl,    G : X → Y

be differentiable mappings. In essence, the symbol F represents l functions of n variables each: F = (F1, . . . , Fl), Fi = Fi(y1, . . . , yn), whereas G = (G1, . . . , Gn) represents n functions of m variables each. The derivative of such mappings is no longer a function, but rather a matrix of partial derivatives, customarily called the Jacobian matrix. Thus

DF = ( D1F1 . . . DnF1            DG = ( D1G1 . . . DmG1
         ⋮     ⋱    ⋮                      ⋮     ⋱    ⋮
       D1Fl . . . DnFl ),                D1Gn . . . DmGn ).

The chain rule now takes the same form as it did for functions of one variable:

D(F ◦ G) = ((DF) ◦ G) (DG),

albeit with matrix multiplication taking the place of ordinary multiplication. This form of the chain rule also generalizes quite nicely to the even more general setting where one is interested in describing the derivative of a composition of mappings between manifolds. Version: 7 Owner: rmilson Author(s): rmilson
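The matrix form D(F ◦ G) = ((DF) ◦ G)(DG) can be verified numerically. In the sketch below (finite-difference Jacobians and helper names are ours) the two sides agree for a small example:

```python
def jac(f, x, h=1e-6):
    """Central-difference Jacobian of f: R^n -> R^m (rows = components)."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

G = lambda x: [x[0] + x[1], x[0] * x[1]]     # G: R^2 -> R^2
F = lambda y: [y[0] ** 2 * y[1]]             # F: R^2 -> R^1
x0 = [1.0, 2.0]
lhs = jac(lambda x: F(G(x)), x0)             # D(F o G)(x0)
rhs = matmul(jac(F, G(x0)), jac(G, x0))      # ((DF) o G)(x0) * DG(x0)
```

Both sides evaluate to approximately [30, 21] at x0 = (1, 2).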

379.10

divergence

Basic Definition. Let x, y, z be a system of Cartesian coordinates on 3-dimensional Euclidean space, and let i, j, k be the corresponding basis of unit vectors. The divergence of a continuously differentiable vector field

F = F¹ i + F² j + F³ k

is defined to be the function

div F = ∂F¹/∂x + ∂F²/∂y + ∂F³/∂z.

Another common notation for the divergence is ∇ · F (see gradient), a convenient mnemonic.

Physical interpretation. In physical terms, the divergence of a vector field is the extent to which the vector field flow behaves like a source or a sink at a given point. Indeed, an alternative, but logically equivalent definition, gives the divergence as the limit of the net flow of the vector field across the surface of a small sphere relative to the volume of the sphere. To wit,

(div F)(p) = lim_{r→0} [ ∫S (F · N) dS ] / ((4/3)πr³),

where S denotes the sphere of radius r about a point p ∈ R³, and the integral is a surface integral taken with respect to N, the normal to that sphere. The non-infinitesimal interpretation of divergence is given by Gauss's Theorem. This theorem is a conservation law, stating that the volume total of all sinks and sources, i.e. the volume integral of the divergence, is equal to the net flow across the volume's boundary. In symbols,

∫V div F dV = ∫S (F · N) dS,

where V ⊂ R³ is a compact region with a smooth boundary, and S = ∂V is that boundary oriented by outward-pointing normals. We note that Gauss's theorem follows from the more general Stokes' theorem, which itself generalizes the fundamental theorem of calculus. In light of the physical interpretation, a vector field with constant zero divergence is called incompressible – in this case, no net flow can occur across any closed surface.

General definition. The notion of divergence has meaning in the more general setting of Riemannian geometry. To that end, let V be a vector field on a Riemannian manifold. The covariant derivative of V is a type (1, 1) tensor field. We define the divergence of V to be the trace of that field. In terms of coordinates (see tensor and Einstein summation convention), we have

div V = Vⁱ;ᵢ.

Version: 6 Owner: rmilson Author(s): rmilson, jaswenso
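As a sketch, the coordinate formula for div F is easy to approximate with central differences (the step size and the test field are illustrative choices made here):

```python
def divergence(F, p, h=1e-6):
    """Numerical divergence  div F = dF1/dx + dF2/dy + dF3/dz  at point p."""
    total = 0.0
    for i in range(3):
        pp = list(p); pp[i] += h
        pm = list(p); pm[i] -= h
        total += (F(pp)[i] - F(pm)[i]) / (2 * h)
    return total

# F(x, y, z) = (x^2, y^2, z^2) has div F = 2x + 2y + 2z
d = divergence(lambda p: [p[0] ** 2, p[1] ** 2, p[2] ** 2], [1.0, 2.0, 3.0])
```

At (1, 2, 3) the exact value is 12.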

379.11

extremum

Extrema are minima and maxima. The singular forms of these words are extremum, minimum, and maximum. Extrema may be “global” or “local”. A global minimum of a function f is the lowest value that f ever achieves. If you imagine the function as a surface, then a global minimum is the lowest point on that surface. Formally, it is said that f : U → V has a global minimum at x if ∀u ∈ U, f(x) ≤ f(u). A local minimum of a function f is a point x which has less value than all points ”next to” it. If you imagine the function as a surface, then a local minimum is the bottom of a “valley” or “bowl” in the surface somewhere. Formally, it is said that f : U → V has a local minimum at x if there exists a neighborhood N of x such that ∀y ∈ N, f(x) ≤ f(y). If you flip the signs above to ≥, you get the definitions of global and local maxima.

A ”strict local minimum” or ”strict local maximum” means that nearby points are strictly less than or strictly greater than the critical point, rather than ≤ or ≥. For instance, a strict local minimum at x has a neighborhood N such that ∀y ∈ N, (f(x) < f(y) or y = x). Related concepts are plateau and saddle point. Finding minima or maxima is an important task which is part of the field of optimization. Version: 9 Owner: bshanks Author(s): bshanks, bbukh

379.12

irrotational field

Suppose Ω is an open set in R³, and V is a vector field with differentiable real (or possibly complex) valued component functions. If ∇ × V = 0, then V is called an irrotational vector field, or curl free field. If U and V are irrotational, then U × V is solenoidal.

Version: 6 Owner: matte Author(s): matte, giri

379.13

partial derivative

The partial derivative of a multivariable function f is simply its derivative with respect to only one variable, keeping all other variables constant (which are not functions of the variable in question). The formal definition is

Di f(a) = ∂f/∂ai = lim_{h→0} (f(a + h ei) − f(a))/h,

where ei is the standard basis vector of the ith variable. Since this only affects the ith variable, one can derive the function using common rules and tables (e.g. d(c·x)/dx = c), treating all other variables (which are not functions of ai) as constants. For example, if f(x) = x² + 2xy + y² + y³z, then

(1) ∂f/∂x = 2x + 2y
(2) ∂f/∂y = 2x + 2y + 3y²z
(3) ∂f/∂z = y³

Note that in equation (1), we treated y as a constant, since we were differentiating with respect to x. The partial derivative of a vector-valued function f(x) with respect to variable ai is a vector Di f = ∂f/∂ai.

Multiple Partials: Multiple partial derivatives can be treated just like multiple derivatives. There is an additional degree of freedom though, as you can compound derivatives with respect to different variables. For example, using the above function,

(4) ∂²f/∂x² = ∂/∂x (2x + 2y) = 2
(5) ∂²f/∂z∂y = ∂/∂z (2x + 2y + 3y²z) = 3y²
(6) ∂²f/∂y∂z = ∂/∂y (y³) = 3y²

D12 is another way of writing ∂²/∂x1∂x2. If the second partial derivatives of f are continuous in the neighborhood of x, it can be shown that Dij f(x) = Dji f(x), where i, j are the ith and jth variables. In fact, as long as an equal number of partials are taken with respect to each variable, changing the order of differentiation will produce the same results under the above condition. Another form of notation is f^(a,b,c,...)(x), where a is the number of partial derivatives with respect to the first variable, b the number with respect to the second variable, etc. Version: 17 Owner: slider142 Author(s): slider142
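Equations (1)–(3) can be confirmed numerically by differencing along one coordinate at a time, which is exactly what the definition prescribes (a sketch; names are ours):

```python
def partial(f, a, i, h=1e-6):
    """Central-difference approximation of D_i f(a) (the i-th partial)."""
    ap = list(a); ap[i] += h
    am = list(a); am[i] -= h
    return (f(ap) - f(am)) / (2 * h)

f = lambda p: p[0] ** 2 + 2 * p[0] * p[1] + p[1] ** 2 + p[1] ** 3 * p[2]
a = [1.0, 2.0, 3.0]
fx = partial(f, a, 0)   # expect 2x + 2y        = 6
fy = partial(f, a, 1)   # expect 2x + 2y + 3y^2 z = 42
fz = partial(f, a, 2)   # expect y^3             = 8
```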

379.14

plateau

A plateau of a function is a region where a function has constant value. More formally, let U and V be topological spaces. A plateau for a scalar function f : U → V is a path-connected set of points P ⊆ U such that for some y we have ∀p ∈ P, f (p) = y (379.14.1)

Please take note that this entry is not authoritative. If you know of a more standard definition of ”plateau”, please contribute it, thank you. Version: 4 Owner: bshanks Author(s): bshanks

379.15

proof of Green’s theorem

Consider the region R bounded by the closed curve P in a simply connected space. The field is given by a vector-valued function F(x, y) = (f(x, y), g(x, y)), and suppose R can be described by A(y) ≤ x ≤ B(y) for a ≤ y ≤ b. We wish to evaluate

∬R (∂g/∂x − ∂f/∂y) dA = ∬R ∂g/∂x dA − ∬R ∂f/∂y dA.

The double integrals above can be evaluated separately. Let's look at

∬R ∂g/∂x dA = ∫a^b ∫_{A(y)}^{B(y)} ∂g/∂x dx dy.

Evaluating the above double integral, we get

∫a^b (g(B(y), y) − g(A(y), y)) dy = ∫a^b g(B(y), y) dy − ∫a^b g(A(y), y) dy.

According to the fundamental theorem of line integrals, the above expression is exactly the line integral of the function F1(x, y) = (0, g(x, y)) over the closed path P = P1 + P2, where P2 = (B(y), y) is traversed with increasing y and P1 = (A(y), y) with decreasing y:

∫a^b g(B(y), y) dy − ∫a^b g(A(y), y) dy = ∫P1 F1 · dt + ∫P2 F1 · dt = ∮P F1 · dt.

Thus we have

∬R ∂g/∂x dA = ∮P F1 · dt.

By a similar argument, we can show that

∬R ∂f/∂y dA = −∮P F2 · dt,

where F2 = (f(x, y), 0). Putting all of the above together, we can see that

∬R (∂g/∂x − ∂f/∂y) dA = ∮P F1 · dt + ∮P F2 · dt = ∮P (F1 + F2) · dt = ∮P (f(x, y), g(x, y)) · dt,

which is Green's theorem. Version: 7 Owner: slider142 Author(s): slider142

379.16

relations between Hessian matrix and local extrema

Let x be a vector, and let H(x) be the Hessian of f at the point x. Let the neighborhood of x be in the domain of f, and let f have continuous partial derivatives of first and second order. Let ∇f(x) = 0. If H(x) is positive definite, then x is a strict local minimum for f. If x is a local minimum for f, then H(x) is positive semidefinite. If H(x) is negative definite, then x is a strict local maximum for f. If x is a local maximum for f, then H(x) is negative semidefinite. If H(x) is indefinite, x is a nondegenerate saddle point. In the case when the dimension of x is 1 (i.e. f : R → R), this reduces to the second derivative test, which is as follows: Let the neighborhood of x be in the domain of f, and let f have continuous derivatives of first and second order. Let f′(x) = 0. If f″(x) > 0, then x is a strict local minimum. If f″(x) < 0, then x is a strict local maximum. Version: 6 Owner: bshanks Author(s): bshanks
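For n = 2 the definiteness tests reduce to the familiar determinant (second partials) test, which the sketch below implements (the classification labels and tolerance are choices made here):

```python
def classify_2x2(H, tol=1e-9):
    """Classify a critical point from a symmetric 2x2 Hessian:
    det > 0 with H[0][0] > 0 -> strict local min (positive definite);
    det > 0 with H[0][0] < 0 -> strict local max (negative definite);
    det < 0 -> saddle (indefinite); det ~ 0 -> test is inconclusive."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if det > tol:
        return "min" if H[0][0] > 0 else "max"
    if det < -tol:
        return "saddle"
    return "degenerate"

# f(x, y) = x^2 + y^2 has Hessian [[2, 0], [0, 2]] at its critical point (0, 0);
# f(x, y) = x^2 - y^2 has Hessian [[2, 0], [0, -2]] there.
kind_min = classify_2x2([[2.0, 0.0], [0.0, 2.0]])
kind_saddle = classify_2x2([[2.0, 0.0], [0.0, -2.0]])
```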


379.17

solenoidal field

A solenoidal vector field is one that satisfies

∇ · B = 0

at every point where the vector field B is defined. Here ∇ · B is the divergence.

This condition actually implies that there exists a vector field A, known as the vector potential, such that B = ∇ × A. For a function f satisfying Laplace's equation

∇²f = 0,

it follows that ∇f is solenoidal.

Version: 4 Owner: giri Author(s): giri


Chapter 380 26B15 – Integration: length, area, volume
380.1 arc length

Arc length is the length of a section of a differentiable curve. Finding arc length is useful in many applications, for the length of a curve can be attributed to distance traveled, work, etc. It is commonly represented as S, or as the differential ds if one is differentiating or integrating with respect to change in arc length. If one knows the vector function or parametric equations of a curve, finding the arc length is simple, as it is given by the integral of the length of the tangent vector to the curve:

S = ∫a^b |F′(t)| dt.

Note that t is an independent parameter. In Cartesian coordinates, arc length can be calculated by the formula

S = ∫a^b √(1 + (f′(x))²) dx.

This formula is derived by viewing arc length as the Riemann sum

lim_{Δx→0} ∑_{i=1}^{n} √(1 + f′(xi)²) Δx.

The term being summed is the length of an approximating secant to the curve over the distance Δx. As Δx vanishes, the sum approaches the arc length, thus the formula. Arc length can also be derived for polar coordinates from the general formula for vector functions given above. The result is

L = ∫a^b √(r(θ)² + (r′(θ))²) dθ.

Version: 5 Owner: slider142 Author(s): slider142
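The Cartesian formula can be checked on a curve whose arc length is known in closed form: for the catenary y = cosh(x) the integrand √(1 + sinh(x)²) simplifies to cosh(x), so the length on [0, 1] is sinh(1) ≈ 1.1752 (midpoint-rule sketch; parameters are illustrative):

```python
import math

def arc_length(fprime, a, b, n=20000):
    """S = integral of sqrt(1 + f'(x)^2) over [a, b], by the midpoint rule."""
    h = (b - a) / n
    return sum(
        math.sqrt(1 + fprime(a + (i + 0.5) * h) ** 2) * h for i in range(n)
    )

# y = cosh(x): y' = sinh(x), and sqrt(1 + sinh^2) = cosh, so S = sinh(1) on [0, 1]
L = arc_length(math.sinh, 0.0, 1.0)
```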


Chapter 381 26B20 – Integral formulas (Stokes, Gauss, Green, etc.)
381.1 Green’s theorem

Green's theorem provides a connection between path integrals over a well-connected region in the plane and the area of the region bounded in the plane. Given a closed path P bounding a region R with area A, and a vector-valued function F = (f(x, y), g(x, y)) over the plane,

∮P F · dx = ∬R [g1(x, y) − f2(x, y)] dA,

where a subscript n denotes the derivative with respect to the nth variable.

Corollary: The closed path integral over a gradient of a function with continuous partial derivatives is always zero. Thus, gradients are conservative vector fields. The smooth function is called the potential of the vector field.
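The theorem can be spot-checked numerically; the sketch below (an illustration added here, with F = (f, g) = (−y, x) on the unit disk chosen for convenience) compares the path integral around the unit circle with the double integral, which is ∬_R 2 dA = 2π since g₁ − f₂ = 1 − (−1) = 2.

```python
import math

# Check Green's theorem for F = (f, g) = (-y, x) on the unit disk.
# Path integral of F around the unit circle, parametrized by t:
n = 10_000
dt = 2 * math.pi / n
path_integral = 0.0
for i in range(n):
    t = (i + 0.5) * dt
    x, y = math.cos(t), math.sin(t)
    dx_dt, dy_dt = -math.sin(t), math.cos(t)
    # F . dr = f dx + g dy with (f, g) = (-y, x)
    path_integral += (-y * dx_dt + x * dy_dt) * dt

# Double integral: (g_1 - f_2) = 2 over a region of area pi
double_integral = 2 * math.pi
```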

Proof: The corollary states that

∮_P ∇h · dx = 0.

We can easily prove this using Green’s theorem:

∮_P ∇h · dx = ∬_R [g₁(x, y) − f₂(x, y)] dA.

But since this is a gradient, with (f, g) = (h₁, h₂),

∬_R [g₁(x, y) − f₂(x, y)] dA = ∬_R [h₂₁(x, y) − h₁₂(x, y)] dA.

Since h₁₂ = h₂₁ for any function with continuous partials, the corollary is proven.

Version: 4 Owner: slider142 Author(s): slider142


Chapter 382 26B25 – Convexity, generalizations
382.1 convex function

Definition Suppose Ω is a convex set in a vector space over R (or C), and suppose f is a function f : Ω → R. If for any a, b ∈ Ω and any λ ∈ (0, 1), we have

f(λa + (1 − λ)b) ≤ λf(a) + (1 − λ)f(b),

we say that f is a convex function. If for any a, b ∈ Ω and any λ ∈ (0, 1), we have

f(λa + (1 − λ)b) ≥ λf(a) + (1 − λ)f(b),

we say that f is a concave function. If either of the inequalities is strict, then we say that f is a strictly convex function, or a strictly concave function, respectively.

Properties

• A function f is a (strictly) convex function if and only if −f is a (strictly) concave function.
• On R, a continuous function is convex if and only if for all x, y ∈ R, we have f((x + y)/2) ≤ (f(x) + f(y))/2.
• A twice continuously differentiable function f on R is convex if and only if f''(x) ≥ 0 for all x ∈ R.
• A local minimum of a convex function is a global minimum. See this page.

Examples

• eˣ, e⁻ˣ, and x² are convex functions on R.
• A norm is a convex function.
• On R², the 1-norm and the ∞-norm (i.e., ||(x, y)||₁ = |x| + |y| and ||(x, y)||∞ = max{|x|, |y|}) are not strictly convex ([2], pp. 334–335).
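The definition and the last example can be spot-checked numerically; the sketch below (an addition, with f(x) = x² and the sample points chosen arbitrarily) tests the defining inequality and exhibits an equality case that keeps the 1-norm on R² from being strictly convex.

```python
import random

# Spot-check the defining inequality f(l a + (1-l) b) <= l f(a) + (1-l) f(b)
# for f(x) = x^2, and show the 1-norm attains equality at distinct points.
random.seed(0)
f = lambda x: x**2

convex_ok = True
for _ in range(1000):
    a, b = random.uniform(-5, 5), random.uniform(-5, 5)
    lam = random.random()
    if f(lam*a + (1 - lam)*b) > lam*f(a) + (1 - lam)*f(b) + 1e-9:
        convex_ok = False

one_norm = lambda v: abs(v[0]) + abs(v[1])
u, v = (1.0, 0.0), (2.0, 0.0)                     # distinct points
midpoint_value = one_norm(((u[0] + v[0]) / 2, (u[1] + v[1]) / 2))
average_value = (one_norm(u) + one_norm(v)) / 2   # equality => not strictly convex
```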

REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons, 1978.

Version: 11 Owner: matte Author(s): matte, drini

382.2

extremal value of convex/concave functions

Theorem. Let U be a convex set in a normed (real or complex) vector space. If f : U → R is a convex function on U, then a local minimum of f is a global minimum.

Proof. Suppose x is a local minimum for f, i.e., there is an open ball B ⊂ U with radius ε and center x such that f(x) ≤ f(ξ) for all ξ ∈ B. Let us fix some y ∈ U with y ∉ B. Our aim is to prove that f(x) ≤ f(y). We define λ = ε/(2||x − y||), where || · || is the norm on U. Then

||λy + (1 − λ)x − x|| = ||λy − λx|| = λ||x − y|| = ε/2,

so λy + (1 − λ)x ∈ B. It follows that f(x) ≤ f(λy + (1 − λ)x). Since f is convex, we then get

f(x) ≤ f(λy + (1 − λ)x) ≤ λf(y) + (1 − λ)f(x),

and f(x) ≤ f(y) as claimed. □

The analogous theorem for concave functions is as follows.

Theorem. Let U be a convex set in a normed (real or complex) vector space. If f : U → R is a concave function on U, then a local maximum of f is a global maximum.

Proof. Consider the convex function −f. If x is a local maximum of f, then it is a local minimum of −f. By the previous theorem, x is then a global minimum of −f. Hence x is a global maximum of f. □

Version: 1 Owner: matte Author(s): matte


Chapter 383 26B30 – Absolutely continuous functions, functions of bounded variation
383.1 absolutely continuous function

Definition Let [a, b] be a closed bounded interval of R. Then a function f : [a, b] → C is absolutely continuous on [a, b] if for any ε > 0, there is a δ > 0 such that the following condition holds:

(∗) If (a₁, b₁), . . . , (aₙ, bₙ) is a finite collection of disjoint open intervals in [a, b] such that

Σ_{i=1}^n (bᵢ − aᵢ) < δ,

then

Σ_{i=1}^n |f(bᵢ) − f(aᵢ)| < ε.

Basic results for absolutely continuous functions are as follows.

Theorem
1. A function f : [a, b] → C is absolutely continuous if and only if Re{f} and Im{f} are absolutely continuous real functions.
2. If f : [a, b] → C is a function which is everywhere differentiable and f' is bounded, then f is absolutely continuous [1].
3. Any absolutely continuous function f : [a, b] → C is continuous on [a, b] and of bounded variation [1].
4. If f, g are absolutely continuous functions, then so are fg, f + g, |f|^γ (if γ ≥ 1), and f/g (if g is never zero) [1].
5. If f, g are real-valued absolutely continuous functions, then so are max{f, g} and min{f, g}. If f(x) > 0 for all x and γ ∈ R, then f^γ is absolutely continuous [1].

Property (2), which is readily proven using the mean value theorem, implies that any smooth function with compact support on R is absolutely continuous. By property (3), any absolutely continuous function is of bounded variation. Hence, from properties of functions of bounded variation, the following theorem follows:

Theorem ([1], pp. 536) Let f : [a, b] → C be an absolutely continuous function. Then f is differentiable almost everywhere, and f' is Lebesgue integrable.

We have the following characterization of absolutely continuous functions.

Theorem [Fundamental theorem of calculus for the Lebesgue integral] ([1], pp. 550) Let f : [a, b] → C be a function. Then f is absolutely continuous if and only if there is a function g ∈ L¹(a, b) (i.e., a g : (a, b) → C with ∫_(a,b) |g| < ∞) such that

f(x) = f(a) + ∫_a^x g(t) dt

for all x ∈ [a, b]. What is more, if f and g are as above, then f' = g almost everywhere. (Above, both integrals are Lebesgue integrals.)

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987. 3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993. 4. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.

Version: 5 Owner: matte Author(s): matte

383.2

total variation

Let γ : [a, b] → X be a function mapping an interval [a, b] to a metric space (X, d). We say that γ is of bounded variation if there is a constant M such that, for each partition P = {a = t₀ < t₁ < · · · < tₙ = b} of [a, b],

v(γ, P) = Σ_{k=1}^n d(γ(tₖ), γ(tₖ₋₁)) ≤ M.

The total variation V_γ of γ is defined by

V_γ = sup{v(γ, P) : P is a partition of [a, b]}.

It can be shown that, if X is either R or C, every smooth (or piecewise smooth) function γ : [a, b] → X is of bounded variation, and

V_γ = ∫_a^b |γ'(t)| dt.

Also, if γ is of bounded variation and f : [a, b] → X is continuous, then the Riemann-Stieltjes integral ∫_a^b f dγ is finite.

If γ is also continuous, it is said to be a rectifiable path, and V(γ) is the length of its trace.

If X = R, it can be shown that γ is of bounded variation if and only if it is the difference of two monotonic functions.

Version: 3 Owner: Koro Author(s): Koro
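A small numerical sketch (an illustration added here, not part of the original entry) approximates v(γ, P) for the circle γ(t) = (cos t, sin t) on a fine uniform partition of [0, 2π]; the partition sums stay below and approach V_γ = ∫₀^{2π} |γ'(t)| dt = 2π.

```python
import math

# Partition sum v(gamma, P) for gamma(t) = (cos t, sin t) on [0, 2*pi].
# The chord sum over a fine uniform partition approximates V_gamma = 2*pi.
def gamma(t):
    return (math.cos(t), math.sin(t))

n = 50_000
ts = [2 * math.pi * i / n for i in range(n + 1)]
v = sum(math.dist(gamma(ts[k]), gamma(ts[k - 1])) for k in range(1, n + 1))
```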


Chapter 384 26B99 – Miscellaneous
384.1 derivation of zeroth weighted power mean

Let x₁, x₂, . . . , xₙ be positive real numbers, and let w₁, w₂, . . . , wₙ be positive real numbers such that w₁ + w₂ + · · · + wₙ = 1. For r ≠ 0, the r-th weighted power mean of x₁, x₂, . . . , xₙ is

M_w^r(x₁, x₂, . . . , xₙ) = (w₁x₁^r + w₂x₂^r + · · · + wₙxₙ^r)^{1/r}.

Using the Taylor series expansion e^t = 1 + t + O(t²), where O(t²) is Landau notation for terms of order t² and higher, we can write xᵢ^r as

xᵢ^r = e^{r log xᵢ} = 1 + r log xᵢ + O(r²).

By substituting this into the definition of M_w^r, we get

M_w^r(x₁, x₂, . . . , xₙ) = [w₁(1 + r log x₁) + · · · + wₙ(1 + r log xₙ) + O(r²)]^{1/r}
 = [1 + r(w₁ log x₁ + · · · + wₙ log xₙ) + O(r²)]^{1/r}
 = [1 + r log(x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}) + O(r²)]^{1/r}
 = exp( (1/r) log[1 + r log(x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}) + O(r²)] ).

Again using a Taylor series, this time log(1 + t) = t + O(t²), we get

M_w^r(x₁, x₂, . . . , xₙ) = exp( (1/r)[r log(x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}) + O(r²)] )
 = exp( log(x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}) + O(r) ).

Taking the limit r → 0, we find

M_w^0(x₁, x₂, . . . , xₙ) = exp( log(x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}) ) = x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}.

In particular, if we choose all the weights to be 1/n,

M^0(x₁, x₂, . . . , xₙ) = ⁿ√(x₁x₂ · · · xₙ),

the geometric mean of x₁, x₂, . . . , xₙ.

Version: 3 Owner: pbruin Author(s): pbruin

384.2

weighted power mean

If w₁, w₂, . . . , wₙ are positive real numbers such that w₁ + w₂ + · · · + wₙ = 1, we define the r-th weighted power mean of the xᵢ as:

M_w^r(x₁, x₂, . . . , xₙ) = (w₁x₁^r + w₂x₂^r + · · · + wₙxₙ^r)^{1/r}.

When all the wᵢ = 1/n we get the standard power mean. The weighted power mean is a continuous function of r, and taking the limit when r → 0 gives us

M_w^0 = x₁^{w₁}x₂^{w₂} · · · xₙ^{wₙ}.

We can use weighted power means to generalize the power means inequality: if w is a set of weights and r < s, then

M_w^r ≤ M_w^s.

Version: 6 Owner: drini Author(s): drini
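The limit r → 0 can be observed numerically; in the sketch below (an illustrative addition, with weights and data chosen arbitrarily), the weighted power mean at a tiny exponent is compared against the weighted geometric mean, and the monotonicity in r is spot-checked.

```python
import math

# The limit r -> 0 of the weighted power mean is the weighted geometric
# mean: M_w^r = (w1 x1^r + ... + wn xn^r)^(1/r) -> x1^w1 * ... * xn^wn.
x = [1.0, 2.0, 3.0]
w = [0.2, 0.3, 0.5]

def M(r):
    return sum(wi * xi**r for wi, xi in zip(w, x)) ** (1.0 / r)

geometric = math.prod(xi**wi for wi, xi in zip(w, x))
near_zero = M(1e-8)
```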


Chapter 385 26C15 – Rational functions
385.1 rational function

A real function R(x) of a single variable x is called rational if it can be written as a quotient

R(x) = P(x)/Q(x),

where P(x) and Q(x) are polynomials in x with real coefficients. In general, a rational function R(x₁, . . . , xₙ) has the form

R(x₁, . . . , xₙ) = P(x₁, . . . , xₙ)/Q(x₁, . . . , xₙ),

where P(x₁, . . . , xₙ) and Q(x₁, . . . , xₙ) are polynomials in the variables (x₁, . . . , xₙ) with coefficients in some field or ring S. In this sense, R(x₁, . . . , xₙ) can be regarded as an element of the fraction field S(x₁, . . . , xₙ) of the polynomial ring S[x₁, . . . , xₙ].

Version: 1 Owner: igor Author(s): igor


Chapter 386 26C99 – Miscellaneous
386.1 Laguerre Polynomial

A Laguerre polynomial is a polynomial of the form:

Lₙ(x) = (eˣ/n!) dⁿ/dxⁿ (e⁻ˣ xⁿ).

Associated to this is the Laguerre differential equation, the solutions of which are called associated Laguerre polynomials:

Lₙᵏ(x) = (eˣ x⁻ᵏ/n!) dⁿ/dxⁿ (e⁻ˣ xⁿ⁺ᵏ).

Of course Lₙ⁰(x) = Lₙ(x).

The associated Laguerre polynomials are orthogonal over [0, ∞) with respect to the weighting function xᵏe⁻ˣ:

∫₀^∞ e⁻ˣ xᵏ Lₙᵏ(x) Lₘᵏ(x) dx = ((n + k)!/n!) δₙₘ.

Version: 2 Owner: mathwizard Author(s): mathwizard
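For the plain (k = 0) case, the formula above gives L₀ = 1, L₁ = 1 − x, L₂ = 1 − 2x + x²/2, and the orthogonality relation (with normalization (n + 0)!/n! = 1) can be checked by numerical integration. The sketch below is an illustrative addition; the truncation of [0, ∞) at 50 and the step count are arbitrary choices.

```python
import math

# Orthogonality check for the first Laguerre polynomials (k = 0 case):
# L0 = 1, L1 = 1 - x, L2 = 1 - 2x + x^2/2, weight e^{-x} on [0, inf).
# Expected: <Ln, Lm> = delta_{nm}, since (n+0)!/n! = 1.
L = [lambda x: 1.0,
     lambda x: 1.0 - x,
     lambda x: 1.0 - 2*x + x*x/2]

def inner(n, m, upper=50.0, steps=200_000):
    h = upper / steps
    return sum(math.exp(-x) * L[n](x) * L[m](x) * h
               for x in ((i + 0.5) * h for i in range(steps)))

orth = inner(1, 2)    # should be ~0
norm1 = inner(1, 1)   # should be ~1
```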


Chapter 387 26D05 – Inequalities for trigonometric functions and polynomials
387.1 Weierstrass product inequality

For any finite family (aᵢ)_{i∈I} of real numbers in the interval [0, 1], we have

∏ᵢ (1 − aᵢ) ≥ 1 − Σᵢ aᵢ.

Proof: Write

f = ∏ᵢ (1 − aᵢ) + Σᵢ aᵢ.

For any k ∈ I, and any fixed values of the aᵢ for i ≠ k, f is a polynomial of the first degree in aₖ. Consequently f is minimal either at aₖ = 0 or aₖ = 1. That brings us down to two cases: all the aᵢ are zero, or at least one of them is 1. But in both cases it is clear that f ≥ 1, QED.

Version: 2 Owner: Daume Author(s): Larry Hammick
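A quick random spot-check of the inequality (an illustrative addition, not part of the original entry):

```python
import math
import random

# Spot-check the Weierstrass product inequality
#   prod(1 - a_i) >= 1 - sum(a_i)   for a_i in [0, 1].
random.seed(1)
ok = True
for _ in range(1000):
    a = [random.random() for _ in range(random.randint(1, 10))]
    lhs = math.prod(1 - ai for ai in a)
    rhs = 1 - sum(a)
    if lhs < rhs - 1e-12:
        ok = False
```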

387.2

proof of Jordan’s Inequality

To prove that

(2/π) x ≤ sin(x) ≤ x for all x ∈ [0, π/2],   (387.2.1)

consider a unit circle (circle with radius = 1 unit). Take any point P on the circumference of the circle.

Drop the perpendicular from P to the horizontal line, M being the foot of the perpendicular and Q the reflection of P at M. (Refer to figure.) Let x = ∠POM. For x to be in [0, π/2], the point P lies in the first quadrant, as shown.

The length of line segment PM is sin(x). Construct a circle of radius MP, with M as the center.

The length of line segment PQ is 2 sin(x). The length of arc PAQ is 2x. The length of arc PBQ is π sin(x).

Since the length of segment PQ is at most the length of arc PAQ (equality holds when x = 0), we have 2 sin(x) ≤ 2x. This implies sin(x) ≤ x.

Since the length of arc PAQ is at most the length of arc PBQ (equality holds when x = 0 or x = π/2), we have 2x ≤ π sin(x). This implies (2/π) x ≤ sin(x).

Thus we have

(2/π) x ≤ sin(x) ≤ x for all x ∈ [0, π/2].   (387.2.2)

Version: 12 Owner: giri Author(s): giri
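A numerical spot-check of Jordan's inequality on a grid of [0, π/2] (an illustrative addition):

```python
import math

# Check (2/pi) x <= sin x <= x on a grid of [0, pi/2].
ok = all(
    (2/math.pi)*x <= math.sin(x) + 1e-12 and math.sin(x) <= x + 1e-12
    for x in (k * (math.pi/2) / 1000 for k in range(1001))
)
```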


Chapter 388 26D10 – Inequalities involving derivatives and differential and integral operators
388.1 Gronwall’s lemma

If, for t₀ ≤ t ≤ t₁, φ(t) ≥ 0 and ψ(t) ≥ 0 are continuous functions such that the inequality

φ(t) ≤ K + L ∫_{t₀}^t ψ(s)φ(s) ds

holds on t₀ ≤ t ≤ t₁, with K and L positive constants, then

φ(t) ≤ K exp( L ∫_{t₀}^t ψ(s) ds )

on t₀ ≤ t ≤ t₁.

Version: 1 Owner: jarino Author(s): jarino

388.2 proof of Gronwall’s lemma

The hypothesis is

φ(t) ≤ K + L ∫_{t₀}^t ψ(s)φ(s) ds.   (388.2.1)

This inequality is equivalent to

φ(t) / ( K + L ∫_{t₀}^t ψ(s)φ(s) ds ) ≤ 1.

Multiply by Lψ(t) and integrate, giving

∫_{t₀}^t Lψ(s)φ(s) / ( K + L ∫_{t₀}^s ψ(τ)φ(τ) dτ ) ds ≤ L ∫_{t₀}^t ψ(s) ds.

Thus

ln( K + L ∫_{t₀}^t ψ(s)φ(s) ds ) − ln K ≤ L ∫_{t₀}^t ψ(s) ds,

and finally

K + L ∫_{t₀}^t ψ(s)φ(s) ds ≤ K exp( L ∫_{t₀}^t ψ(s) ds ).

Using (388.2.1) in the left hand side of this inequality gives the result.

Version: 2 Owner: jarino Author(s): jarino


Chapter 389 26D15 – Inequalities for sums, series and integrals
389.1 Carleman’s inequality

Theorem ([4], pp. 24) For positive real numbers {aₙ}_{n=1}^∞, Carleman’s inequality states that

Σ_{n=1}^∞ (a₁a₂ · · · aₙ)^{1/n} ≤ e Σ_{n=1}^∞ aₙ.

Although the constant e (the natural log base) is optimal, it is possible to refine Carleman’s inequality by decreasing the weight coefficients on the right hand side [2].

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
2. B.Q. Yuan, Refinements of Carleman’s inequality, Journal of Inequalities in Pure and Applied Mathematics, Vol. 2, Issue 2, 2001, Article 21. online

Version: 2 Owner: matte Author(s): matte

389.2

Chebyshev’s inequality

If x₁, x₂, . . . , xₙ and y₁, y₂, . . . , yₙ are two sequences (at least one of them consisting of positive numbers):

• if x₁ < x₂ < · · · < xₙ and y₁ < y₂ < · · · < yₙ then

((x₁ + x₂ + · · · + xₙ)/n)((y₁ + y₂ + · · · + yₙ)/n) ≤ (x₁y₁ + x₂y₂ + · · · + xₙyₙ)/n.

• if x₁ < x₂ < · · · < xₙ and y₁ > y₂ > · · · > yₙ then

((x₁ + x₂ + · · · + xₙ)/n)((y₁ + y₂ + · · · + yₙ)/n) ≥ (x₁y₁ + x₂y₂ + · · · + xₙyₙ)/n.
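Both cases can be spot-checked on random monotone sequences (an illustrative sketch added here):

```python
import random

# Spot-check Chebyshev's sum inequality on random increasing sequences.
random.seed(2)
n = 50
x = sorted(random.uniform(0, 10) for _ in range(n))
y = sorted(random.uniform(0, 10) for _ in range(n))
mean = lambda s: sum(s) / len(s)

same_order = mean(x) * mean(y) <= mean([a*b for a, b in zip(x, y)]) + 1e-12
opposite = mean(x) * mean(y) >= mean([a*b for a, b in zip(x, y[::-1])]) - 1e-12
```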

Version: 1 Owner: drini Author(s): drini

389.3

MacLaurin’s Inequality

Let a₁, a₂, . . . , aₙ be positive real numbers, and define the sums Sₖ as follows:

Sₖ = ( Σ_{1 ≤ i₁ < i₂ < · · · < iₖ ≤ n} a_{i₁} a_{i₂} · · · a_{iₖ} ) / (n choose k).

Then the following chain of inequalities is true:

S₁ ≥ √S₂ ≥ ³√S₃ ≥ · · · ≥ ⁿ√Sₙ.

Note: the Sₖ are called the averages of the elementary symmetric sums. This inequality is in fact important because it shows that the arithmetic-geometric mean inequality is nothing but a consequence of a chain of stronger inequalities.

Version: 2 Owner: drini Author(s): drini, slash
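The chain can be checked numerically; the sketch below (an illustrative addition, with a = (1, 2, 3, 4, 5) chosen arbitrarily) computes each Sₖ by brute force over combinations. Note that the first term of the chain is the arithmetic mean and the last is the geometric mean.

```python
import math
from itertools import combinations

# Check MacLaurin's chain S1 >= S2^(1/2) >= ... >= Sn^(1/n).
a = [1.0, 2.0, 3.0, 4.0, 5.0]
n = len(a)

def S(k):
    # average of the k-th elementary symmetric sum over C(n, k) terms
    total = sum(math.prod(c) for c in combinations(a, k))
    return total / math.comb(n, k)

chain = [S(k) ** (1.0 / k) for k in range(1, n + 1)]
decreasing = all(chain[i] >= chain[i + 1] - 1e-12 for i in range(n - 1))
```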

389.4 Minkowski inequality

If p ≥ 1 and aₖ, bₖ are real numbers for k = 1, . . . , n, then

( Σ_{k=1}^n |aₖ + bₖ|^p )^{1/p} ≤ ( Σ_{k=1}^n |aₖ|^p )^{1/p} + ( Σ_{k=1}^n |bₖ|^p )^{1/p}.

The Minkowski inequality is in fact valid for all Lᵖ norms with p ≥ 1 on arbitrary measure spaces. This covers the case of Rⁿ listed here as well as spaces of sequences and spaces of functions, and also complex Lᵖ spaces.

Version: 8 Owner: drini Author(s): drini, saforres
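A random spot-check of the finite-sum form for several values of p ≥ 1 (an illustrative addition):

```python
import random

# Spot-check the Minkowski inequality for several p >= 1.
random.seed(3)
a = [random.uniform(-5, 5) for _ in range(20)]
b = [random.uniform(-5, 5) for _ in range(20)]

def p_norm(v, p):
    return sum(abs(t)**p for t in v) ** (1.0 / p)

ok = all(
    p_norm([ai + bi for ai, bi in zip(a, b)], p)
    <= p_norm(a, p) + p_norm(b, p) + 1e-9
    for p in (1, 1.5, 2, 3, 10)
)
```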

389.5 Muirhead’s theorem

Let 0 ≤ s₁ ≤ · · · ≤ sₙ and 0 ≤ t₁ ≤ · · · ≤ tₙ be real numbers such that

Σ_{i=1}^n sᵢ = Σ_{i=1}^n tᵢ and Σ_{i=1}^k sᵢ ≤ Σ_{i=1}^k tᵢ (k = 1, . . . , n − 1).

Then for any nonnegative numbers x₁, . . . , xₙ,

Σ_σ x₁^{s_{σ(1)}} · · · xₙ^{s_{σ(n)}} ≥ Σ_σ x₁^{t_{σ(1)}} · · · xₙ^{t_{σ(n)}},

where the sums run over all permutations σ of {1, 2, . . . , n}.

Version: 3 Owner: Koro Author(s): Koro

389.6 Schur’s inequality

If a, b, and c are positive real numbers and k ≥ 1 a fixed real constant, then the following inequality holds:

aᵏ(a − b)(a − c) + bᵏ(b − a)(b − c) + cᵏ(c − a)(c − b) ≥ 0.

Taking k = 1, we get the well-known

a³ + b³ + c³ + 3abc ≥ ab(a + b) + ac(a + c) + bc(b + c).

We can assume without loss of generality that c ≤ b ≤ a via a permutation of the variables (as both sides are symmetric in those variables). Then collecting terms, the lemma states that

(a − b)[aᵏ(a − c) − bᵏ(b − c)] + cᵏ(a − c)(b − c) ≥ 0,

which is clearly true as every term on the left is nonnegative.

Version: 3 Owner: mathcam Author(s): mathcam, slash
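A random spot-check of the inequality aᵏ(a − b)(a − c) + bᵏ(b − a)(b − c) + cᵏ(c − a)(c − b) ≥ 0 for several k ≥ 1 (an illustrative addition):

```python
import random

# Spot-check Schur's inequality for positive a, b, c and several k >= 1.
random.seed(4)
ok = True
for _ in range(2000):
    a, b, c = (random.uniform(0.01, 10) for _ in range(3))
    for k in (1, 2, 3.5):
        lhs = (a**k * (a - b) * (a - c)
               + b**k * (b - a) * (b - c)
               + c**k * (c - a) * (c - b))
        if lhs < -1e-6:
            ok = False
```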

389.7 Young’s inequality

Let φ : R → R be a continuous, strictly increasing function such that φ(0) = 0. Then the following inequality holds:

ab ≤ ∫₀^a φ(x) dx + ∫₀^b φ⁻¹(y) dy.

The inequality is trivial to prove by drawing the graph of φ(x) and by observing that the sum of the two areas represented by the integrals above is greater than the area of a rectangle of sides a and b.

Version: 2 Owner: slash Author(s): slash
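The inequality can be checked numerically for a concrete choice; in the sketch below (an illustrative addition) φ(x) = x³, so φ⁻¹(y) = y^{1/3}, and both integrals are approximated by midpoint Riemann sums.

```python
# Numeric check of Young's inequality with phi(x) = x^3, phi^{-1}(y) = y^(1/3):
#   a*b <= \int_0^a x^3 dx + \int_0^b y^(1/3) dy.
def midpoint_integral(f, upper, steps=10_000):
    h = upper / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h

a, b = 1.7, 0.9
lhs = a * b
rhs = (midpoint_integral(lambda x: x**3, a)
       + midpoint_integral(lambda y: y**(1/3), b))
```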

389.8

arithmetic-geometric-harmonic means inequality

Let x₁, x₂, . . . , xₙ be positive numbers. Then

max{x₁, x₂, . . . , xₙ}
 ≥ (x₁ + x₂ + · · · + xₙ)/n
 ≥ ⁿ√(x₁x₂ · · · xₙ)
 ≥ n/(1/x₁ + 1/x₂ + · · · + 1/xₙ)
 ≥ min{x₁, x₂, . . . , xₙ}.

There are several generalizations of this inequality using power means and weighted power means.

Version: 4 Owner: drini Author(s): drini
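A random spot-check of the whole chain (an illustrative addition):

```python
import math
import random

# Spot-check max >= AM >= GM >= HM >= min on random positive data.
random.seed(5)
x = [random.uniform(0.1, 10) for _ in range(25)]
n = len(x)

am = sum(x) / n
gm = math.prod(x) ** (1.0 / n)
hm = n / sum(1 / t for t in x)

chain_ok = max(x) >= am >= gm >= hm >= min(x)
```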

389.9

general means inequality

The power means inequality is a generalization of the arithmetic-geometric means inequality. If 0 ≠ r ∈ R, the r-mean (or r-th power mean) of the nonnegative numbers a₁, . . . , aₙ is defined as

M^r(a₁, a₂, . . . , aₙ) = ( (1/n) Σ_{k=1}^n aₖ^r )^{1/r}.

Given real numbers x, y such that xy ≠ 0 and x < y, we have

M^x ≤ M^y,

and the equality holds if and only if a₁ = · · · = aₙ. Additionally, if we define M⁰ to be the geometric mean (a₁a₂ · · · aₙ)^{1/n}, we have that the inequality above holds for arbitrary real numbers x < y.

The mentioned inequality is a special case of this one, since M¹ is the arithmetic mean, M⁰ is the geometric mean, and M⁻¹ is the harmonic mean.

This inequality can be further generalized using weighted power means.

Version: 3 Owner: drini Author(s): drini

389.10

power mean

The r-th power mean of the numbers x₁, x₂, . . . , xₙ is defined as:

M^r(x₁, x₂, . . . , xₙ) = ( (x₁^r + x₂^r + · · · + xₙ^r)/n )^{1/r}.

The arithmetic mean is a special case when r = 1. The power mean is a continuous function of r, and taking the limit when r → 0 gives us the geometric mean:

M⁰(x₁, x₂, . . . , xₙ) = ⁿ√(x₁x₂ · · · xₙ).

Also, when r = −1 we get

M⁻¹(x₁, x₂, . . . , xₙ) = n/(1/x₁ + 1/x₂ + · · · + 1/xₙ),

the harmonic mean.

A generalization of power means are weighted power means.

Version: 8 Owner: drini Author(s): drini

389.11

proof of Chebyshev’s inequality

Let x₁, x₂, . . . , xₙ and y₁, y₂, . . . , yₙ be real numbers such that x₁ ≤ x₂ ≤ · · · ≤ xₙ. Write the product (x₁ + x₂ + · · · + xₙ)(y₁ + y₂ + · · · + yₙ) as

(x₁y₁ + x₂y₂ + · · · + xₙyₙ)
+ (x₁y₂ + x₂y₃ + · · · + xₙ₋₁yₙ + xₙy₁)
+ (x₁y₃ + x₂y₄ + · · · + xₙ₋₂yₙ + xₙ₋₁y₁ + xₙy₂)
+ · · ·
+ (x₁yₙ + x₂y₁ + x₃y₂ + · · · + xₙyₙ₋₁).   (389.11.1)

• If y₁ ≤ y₂ ≤ · · · ≤ yₙ, each of the n terms in parentheses is less than or equal to x₁y₁ + x₂y₂ + · · · + xₙyₙ, according to the rearrangement inequality. From this, it follows that

(x₁ + x₂ + · · · + xₙ)(y₁ + y₂ + · · · + yₙ) ≤ n(x₁y₁ + x₂y₂ + · · · + xₙyₙ)

or (dividing by n²)

((x₁ + x₂ + · · · + xₙ)/n)((y₁ + y₂ + · · · + yₙ)/n) ≤ (x₁y₁ + x₂y₂ + · · · + xₙyₙ)/n.

• If y₁ ≥ y₂ ≥ · · · ≥ yₙ, the same reasoning gives

((x₁ + x₂ + · · · + xₙ)/n)((y₁ + y₂ + · · · + yₙ)/n) ≥ (x₁y₁ + x₂y₂ + · · · + xₙyₙ)/n.

It is clear that equality holds if x₁ = x₂ = · · · = xₙ or y₁ = y₂ = · · · = yₙ. To see that this condition is also necessary, suppose that not all yᵢ's are equal, so that y₁ ≠ yₙ. Then the second term in parentheses of (389.11.1) can only be equal to x₁y₁ + x₂y₂ + · · · + xₙyₙ if xₙ₋₁ = xₙ, the third term only if xₙ₋₂ = xₙ₋₁, and so on, until the last term, which can only be equal to x₁y₁ + x₂y₂ + · · · + xₙyₙ if x₁ = x₂. This implies that x₁ = x₂ = · · · = xₙ. Therefore, Chebyshev's inequality is an equality if and only if x₁ = x₂ = · · · = xₙ or y₁ = y₂ = · · · = yₙ.

Version: 1 Owner: pbruin Author(s): pbruin

389.12

proof of Minkowski inequality

For p = 1 the result follows immediately from the triangle inequality, so we may assume p > 1. We have

|aₖ + bₖ|^p = |aₖ + bₖ||aₖ + bₖ|^{p−1} ≤ (|aₖ| + |bₖ|)|aₖ + bₖ|^{p−1}

by the triangle inequality. Therefore we have

|aₖ + bₖ|^p ≤ |aₖ||aₖ + bₖ|^{p−1} + |bₖ||aₖ + bₖ|^{p−1}.

Set q = p/(p − 1). Then 1/p + 1/q = 1, so by the Hölder inequality we have

Σ_{k=0}^n |aₖ||aₖ + bₖ|^{p−1} ≤ ( Σ_{k=0}^n |aₖ|^p )^{1/p} ( Σ_{k=0}^n |aₖ + bₖ|^{(p−1)q} )^{1/q},

Σ_{k=0}^n |bₖ||aₖ + bₖ|^{p−1} ≤ ( Σ_{k=0}^n |bₖ|^p )^{1/p} ( Σ_{k=0}^n |aₖ + bₖ|^{(p−1)q} )^{1/q}.

Adding these two inequalities, dividing by the factor common to the right sides of both, and observing that (p − 1)q = p by definition, we have

( Σ_{k=0}^n |aₖ + bₖ|^p )^{1−1/q} ≤ ( Σ_{k=0}^n |aₖ|^p )^{1/p} + ( Σ_{k=0}^n |bₖ|^p )^{1/p}.

Finally, observe that 1 − 1/q = 1/p, and the result follows as required. The proof for the integral version is analogous.

Version: 4 Owner: saforres Author(s): saforres

389.13

proof of arithmetic-geometric-harmonic means inequality

Let M be max{x₁, x₂, x₃, . . . , xₙ} and let m be min{x₁, x₂, x₃, . . . , xₙ}. Then

M = (M + M + M + · · · + M)/n ≥ (x₁ + x₂ + x₃ + · · · + xₙ)/n

and

m = n/(1/m + 1/m + · · · + 1/m) ≤ n/(1/x₁ + 1/x₂ + 1/x₃ + · · · + 1/xₙ),

where all the summations have n terms. So we have proved in this way the two inequalities at the extremes.

Now we shall prove the inequality between the arithmetic mean and the geometric mean. We do first the case n = 2:

0 ≤ (√x₁ − √x₂)²
0 ≤ x₁ − 2√(x₁x₂) + x₂
2√(x₁x₂) ≤ x₁ + x₂
√(x₁x₂) ≤ (x₁ + x₂)/2.

Now we prove the inequality for any power of 2 (that is, n = 2ᵏ for some integer k) by using mathematical induction:

(x₁ + x₂ + · · · + x_{2ᵏ} + x_{2ᵏ+1} + · · · + x_{2^{k+1}})/2^{k+1}
 = [ (x₁ + x₂ + · · · + x_{2ᵏ})/2ᵏ + (x_{2ᵏ+1} + x_{2ᵏ+2} + · · · + x_{2^{k+1}})/2ᵏ ] / 2,

and using the case n = 2 on the last expression we can state the following inequality:

(x₁ + x₂ + · · · + x_{2^{k+1}})/2^{k+1}
 ≥ √[ ((x₁ + x₂ + · · · + x_{2ᵏ})/2ᵏ) · ((x_{2ᵏ+1} + x_{2ᵏ+2} + · · · + x_{2^{k+1}})/2ᵏ) ]
 ≥ √[ (x₁x₂ · · · x_{2ᵏ})^{1/2ᵏ} · (x_{2ᵏ+1}x_{2ᵏ+2} · · · x_{2^{k+1}})^{1/2ᵏ} ],

where the last inequality was obtained by applying the induction hypothesis with n = 2ᵏ. Finally, we see that the last expression is equal to (x₁x₂x₃ · · · x_{2^{k+1}})^{1/2^{k+1}}, and so we have proved the truth of the inequality when the number of terms is a power of two.

Finally, we prove that if the inequality holds for any n, it must also hold for n − 1, and this proposition, combined with the preceding proof for powers of 2, is enough to prove the inequality for any positive integer. Suppose that

(x₁ + x₂ + · · · + xₙ)/n ≥ ⁿ√(x₁x₂ · · · xₙ)

is known for a given value of n (we just proved that it is true for powers of two, for example). Then we can replace xₙ with the average of the first n − 1 numbers. So

(x₁ + x₂ + · · · + xₙ₋₁ + (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1)) / n
 = ((n − 1)x₁ + (n − 1)x₂ + · · · + (n − 1)xₙ₋₁ + x₁ + x₂ + · · · + xₙ₋₁) / (n(n − 1))
 = (nx₁ + nx₂ + · · · + nxₙ₋₁) / (n(n − 1))
 = (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1).

On the other hand, taking the n numbers x₁, . . . , xₙ₋₁ together with xₙ = (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1), the geometric mean is

ⁿ√( x₁x₂ · · · xₙ₋₁ · (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) ),

which, by the inequality stated for n and the observations made above, leads to:

(x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) ≥ ⁿ√( x₁x₂ · · · xₙ₋₁ · (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) ),

and so

( (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) )ⁿ ≥ x₁x₂ · · · xₙ₋₁ · (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1),

from where we get

( (x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) )^{n−1} ≥ x₁x₂ · · · xₙ₋₁,

that is,

(x₁ + x₂ + · · · + xₙ₋₁)/(n − 1) ≥ ⁿ⁻¹√(x₁x₂ · · · xₙ₋₁).

So far we have proved the inequality between the arithmetic mean and the geometric mean. The geometric-harmonic inequality is easier. Let tᵢ be 1/xᵢ. From

(t₁ + t₂ + · · · + tₙ)/n ≥ ⁿ√(t₁t₂t₃ · · · tₙ)

we obtain

(1/x₁ + 1/x₂ + 1/x₃ + · · · + 1/xₙ)/n ≥ ⁿ√( (1/x₁)(1/x₂)(1/x₃) · · · (1/xₙ) )

and therefore

ⁿ√(x₁x₂x₃ · · · xₙ) ≥ n/(1/x₁ + 1/x₂ + · · · + 1/xₙ),

and so our proof is completed.

Version: 2 Owner: drini Author(s): drini

389.14

proof of general means inequality

Let r < s be real numbers, and let w₁, w₂, . . . , wₙ be positive real numbers such that w₁ + w₂ + · · · + wₙ = 1. We will prove the weighted power means inequality, which states that for positive real numbers x₁, x₂, . . . , xₙ,

M_w^r(x₁, x₂, . . . , xₙ) ≤ M_w^s(x₁, x₂, . . . , xₙ).

First, suppose that r and s are nonzero. Then the r-th weighted power mean of x₁, x₂, . . . , xₙ is

M_w^r(x₁, x₂, . . . , xₙ) = (w₁x₁^r + w₂x₂^r + · · · + wₙxₙ^r)^{1/r},

and M_w^s is defined similarly.

Let t = s/r, and let yᵢ = xᵢ^r for 1 ≤ i ≤ n; this implies yᵢ^t = xᵢ^s. Define the function f on (0, ∞) by f(x) = x^t. The second derivative of f is f''(x) = t(t − 1)x^{t−2}. There are three cases for the signs of r and s: r < s < 0, r < 0 < s, and 0 < r < s. We will prove the inequality for the case 0 < r < s; the other cases are almost identical.

In the case that r and s are both positive, t > 1. Since f''(x) = t(t − 1)x^{t−2} > 0 for all x > 0, f is a strictly convex function. Therefore, according to Jensen's inequality,

(w₁y₁ + w₂y₂ + · · · + wₙyₙ)^t = f(w₁y₁ + w₂y₂ + · · · + wₙyₙ)
 ≤ w₁f(y₁) + w₂f(y₂) + · · · + wₙf(yₙ)
 = w₁y₁^t + w₂y₂^t + · · · + wₙyₙ^t,

with equality if and only if y₁ = y₂ = · · · = yₙ. By substituting t = s/r and yᵢ = xᵢ^r back into this inequality, we get

(w₁x₁^r + w₂x₂^r + · · · + wₙxₙ^r)^{s/r} ≤ w₁x₁^s + w₂x₂^s + · · · + wₙxₙ^s,

with equality if and only if x₁ = x₂ = · · · = xₙ. Since s is positive, the function x → x^{1/s} is strictly increasing, so raising both sides to the power 1/s preserves the inequality:

(w₁x₁^r + w₂x₂^r + · · · + wₙxₙ^r)^{1/r} ≤ (w₁x₁^s + w₂x₂^s + · · · + wₙxₙ^s)^{1/s},

which is the inequality we had to prove. Equality holds if and only if all the xᵢ are equal.

If r = 0, the inequality is still correct: M_w^0 is defined as lim_{r→0} M_w^r, and since M_w^r ≤ M_w^s for all r < s with r ≠ 0, the same holds for the limit r → 0. We can show by an identical argument that M_w^r ≤ M_w^0 for all r < 0. Therefore, for all real numbers r and s such that r < s,

M_w^r(x₁, x₂, . . . , xₙ) ≤ M_w^s(x₁, x₂, . . . , xₙ).

Version: 1 Owner: pbruin Author(s): pbruin

389.15

proof of rearrangement inequality

We first prove the rearrangement inequality for the case n = 2. Let x₁, x₂, y₁, y₂ be real numbers such that x₁ ≤ x₂ and y₁ ≤ y₂. Then (x₂ − x₁)(y₂ − y₁) ≥ 0, and therefore

x₁y₁ + x₂y₂ ≥ x₁y₂ + x₂y₁.

Equality holds iff x₁ = x₂ or y₁ = y₂.

For the general case, let x₁, x₂, . . . , xₙ and y₁, y₂, . . . , yₙ be real numbers such that x₁ ≤ x₂ ≤ · · · ≤ xₙ. Suppose that (z₁, z₂, . . . , zₙ) is a permutation (rearrangement) of {y₁, y₂, . . . , yₙ} such that the sum x₁z₁ + x₂z₂ + · · · + xₙzₙ is maximized. If there exists a pair i < j with zᵢ > zⱼ, then xᵢzⱼ + xⱼzᵢ ≥ xᵢzᵢ + xⱼzⱼ (the n = 2 case); equality holds iff xᵢ = xⱼ. Therefore, x₁z₁ + x₂z₂ + · · · + xₙzₙ is not maximal unless z₁ ≤ z₂ ≤ · · · ≤ zₙ or xᵢ = xⱼ for all pairs i < j such that zᵢ > zⱼ. In the latter case, we can consecutively interchange these pairs until z₁ ≤ z₂ ≤ · · · ≤ zₙ (this is possible because the number of pairs i < j with zᵢ > zⱼ decreases with each step). So x₁z₁ + x₂z₂ + · · · + xₙzₙ is maximized if z₁ ≤ z₂ ≤ · · · ≤ zₙ.

To show that x₁z₁ + x₂z₂ + · · · + xₙzₙ is minimal for a permutation (z₁, z₂, . . . , zₙ) of {y₁, y₂, . . . , yₙ} if z₁ ≥ z₂ ≥ · · · ≥ zₙ, observe that −(x₁z₁ + x₂z₂ + · · · + xₙzₙ) = x₁(−z₁) + x₂(−z₂) + · · · + xₙ(−zₙ) is maximized if −z₁ ≤ −z₂ ≤ · · · ≤ −zₙ. This implies that x₁z₁ + x₂z₂ + · · · + xₙzₙ is minimized if z₁ ≥ z₂ ≥ · · · ≥ zₙ.

Version: 1 Owner: pbruin Author(s): pbruin

389.16

rearrangement inequality

Let x₁, x₂, . . . , xₙ and y₁, y₂, . . . , yₙ be two sequences of positive real numbers. Then the sum x₁y₁ + x₂y₂ + · · · + xₙyₙ is maximized when the two sequences are ordered in the same way (i.e. x₁ ≤ x₂ ≤ · · · ≤ xₙ and y₁ ≤ y₂ ≤ · · · ≤ yₙ) and is minimized when the two sequences are ordered in the opposite way (i.e. x₁ ≤ x₂ ≤ · · · ≤ xₙ and y₁ ≥ y₂ ≥ · · · ≥ yₙ).

This can be seen intuitively as follows: if x₁, x₂, . . . , xₙ are the prices of n kinds of items, and y₁, y₂, . . . , yₙ the numbers of units sold of each, then the highest profit occurs when you sell more items with high prices and fewer items with low prices (same ordering), and the lowest profit occurs when you sell more items with low prices and fewer items with high prices (opposite orderings).

Version: 4 Owner: drini Author(s): drini
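For small n the claim can be verified by brute force over all permutations (an illustrative sketch added here; the sequences are arbitrary):

```python
from itertools import permutations

# Brute-force check of the rearrangement inequality for small sequences:
# the same-ordered pairing maximizes, the opposite pairing minimizes.
x = [1, 3, 5, 7]   # increasing
y = [2, 4, 6, 8]   # increasing

sums = [sum(xi * zi for xi, zi in zip(x, z)) for z in permutations(y)]
same_order = sum(xi * yi for xi, yi in zip(x, y))
opposite = sum(xi * yi for xi, yi in zip(x, reversed(y)))
```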


Chapter 390 26D99 – Miscellaneous
390.1 Bernoulli’s inequality

Let x and r be real numbers. If r > 1 and x > −1, then

(1 + x)ʳ ≥ 1 + xr.

The inequality also holds for all real x when r is a positive even integer.

Version: 3 Owner: drini Author(s): drini
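A random spot-check of the case r > 1, x > −1 (an illustrative addition):

```python
import random

# Spot-check Bernoulli's inequality (1+x)^r >= 1 + r*x for r > 1, x > -1.
random.seed(6)
ok = all(
    (1 + x)**r >= 1 + r*x - 1e-6
    for x, r in ((random.uniform(-0.999, 10), random.uniform(1.001, 5))
                 for _ in range(2000))
)
```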

390.2

proof of Bernoulli’s inequality

Let I be the interval (−1, ∞) and f : I → R the function defined as:

f(x) = (1 + x)^α − 1 − αx

with α ∈ R \ {0, 1} fixed. Then f is differentiable and its derivative is

f'(x) = α(1 + x)^{α−1} − α for all x ∈ I,

from which it follows that f'(x) = 0 ⇔ x = 0.

1. If 0 < α < 1 then f'(x) < 0 for all x ∈ (0, ∞) and f'(x) > 0 for all x ∈ (−1, 0), which means that 0 is a global maximum point for f. Therefore f(x) < f(0) for all x ∈ I \ {0}, which means that (1 + x)^α < 1 + αx for all x ∈ I \ {0}.

2. If α ∉ [0, 1] then f'(x) > 0 for all x ∈ (0, ∞) and f'(x) < 0 for all x ∈ (−1, 0), meaning that 0 is a global minimum point for f. This implies that f(x) > f(0) for all x ∈ I \ {0}, which means that (1 + x)^α > 1 + αx for all x ∈ I \ {0}.

Checking that equality is satisfied for x = 0 or for α ∈ {0, 1} ends the proof.

Version: 3 Owner: danielm Author(s): danielm


Chapter 391 26E35 – Nonstandard analysis
391.1 hyperreal

An ultrafilter F on a set I is called nonprincipal if no finite subsets of I are in F. Fix once and for all a nonprincipal ultrafilter F on the set N of natural numbers. Let ∼ be the equivalence relation on the set R^N of sequences of real numbers given by

{aₙ} ∼ {bₙ} ⇔ {n ∈ N | aₙ = bₙ} ∈ F.

Let ∗R be the set of equivalence classes of R^N under the equivalence relation ∼. The set ∗R is called the set of hyperreals. It is a field under coordinatewise addition and multiplication:

{aₙ} + {bₙ} = {aₙ + bₙ}
{aₙ} · {bₙ} = {aₙ · bₙ}.

The field ∗R is an ordered field under the ordering relation

{aₙ} ≤ {bₙ} ⇔ {n ∈ N | aₙ ≤ bₙ} ∈ F.

The real numbers embed into ∗R by the map sending the real number x ∈ R to the equivalence class of the constant sequence given by xₙ := x for all n. In what follows, we adopt the convention of treating R as a subset of ∗R under this embedding. A hyperreal x ∈ ∗R is:

• limited if a < x < b for some real numbers a, b ∈ R
• positive unlimited if x > a for all real numbers a ∈ R
• negative unlimited if x < a for all real numbers a ∈ R

• unlimited if it is either positive unlimited or negative unlimited • positive infinitesimal if 0 < x < a for all positive real numbers a ∈ R+ • negative infinitesimal if a < x < 0 for all negative real numbers a ∈ R− • infinitesimal if it is either positive infinitesimal or negative infinitesimal For any subset A of R, the set ∗ A is defined to be the subset of ∗ R consisting of equivalence classes of sequences {an } such that {n ∈ N | an ∈ A} ∈ F. The sets ∗ N, ∗ Z, and ∗ Q are called hypernaturals, hyperintegers, and hyperrationals, respectively. An element of ∗ N is also sometimes called hyperfinite. Version: 1 Owner: djao Author(s): djao

391.2

e is not a quadratic irrational

Looking at the Taylor series for eˣ, we see that

eˣ = Σ_{k=0}^∞ xᵏ/k!.

This converges for every x ∈ R, so e = Σ_{k=0}^∞ 1/k! and e⁻¹ = Σ_{k=0}^∞ (−1)ᵏ/k!. Arguing by contradiction, assume ae² + be + c = 0 for integers a, b and c. That is the same as ae + b + ce⁻¹ = 0.

Fix n > |a| + |c|; then a, c | n! and for all k ≤ n, k! | n!. Consider

0 = n!(ae + b + ce⁻¹)
 = an! Σ_{k=0}^∞ 1/k! + bn! + cn! Σ_{k=0}^∞ (−1)ᵏ/k!
 = bn! + Σ_{k=0}^n (a + c(−1)ᵏ) n!/k! + Σ_{k=n+1}^∞ (a + c(−1)ᵏ) n!/k!.

Since k! | n! for k ≤ n, the first two terms are integers. So the third term should be an integer. However,

| Σ_{k=n+1}^∞ (a + c(−1)ᵏ) n!/k! | ≤ (|a| + |c|) Σ_{k=n+1}^∞ n!/k!
 = (|a| + |c|) Σ_{k=n+1}^∞ 1/((n + 1)(n + 2) · · · k)
 ≤ (|a| + |c|) Σ_{k=n+1}^∞ (n + 1)^{n−k}
 = (|a| + |c|) Σ_{t=1}^∞ (n + 1)^{−t}
 = (|a| + |c|) · 1/n,

which is less than 1 by our assumption that n > |a| + |c|. Since there is only one integer which is less than 1 in absolute value, this means that Σ_{k=n+1}^∞ (a + c(−1)ᵏ) n!/k! = 0 for every sufficiently large n, which is not the case because

Σ_{k=n+1}^∞ (a + c(−1)ᵏ)/k! − Σ_{k=n+2}^∞ (a + c(−1)ᵏ)/k! = (a + c(−1)^{n+1})/(n + 1)!

is not identically zero. The contradiction completes the proof.

Version: 6 Owner: thedagit Author(s): bbukh, thedagit

391.3 zero of a function

Definition Suppose X is a set, and suppose f is a complex-valued function f : X → C. Then a zero of f is an element x ∈ X such that f(x) = 0. The zero set of f is the set Z(f) = {x ∈ X | f(x) = 0}.

Remark When X is a “simple” space, such as R or C, a zero is also called a root. However, in pure mathematics, and especially if Z(f) is infinite, it seems to be customary to talk of zeroes and the zero set instead of roots.

Examples

• Suppose p is a polynomial p : C → C of degree n ≥ 1. Then p has at most n zeroes. That is, |Z(p)| ≤ n.

• If f and g are functions f : X → C and g : X → C, then Z(fg) = Z(f) ∪ Z(g) and Z(fg) ⊃ Z(f), where fg is the function x → f(x)g(x).

• If X is a topological space and f : X → C is a function, then supp f is the closure of X \ Z(f). Further, if f is continuous, then Z(f) is closed in X (assuming that C is given the usual topology of the complex plane, where {0} is a closed set).

Version: 21 Owner: mathcam Author(s): matte, yark, say 10, apmxi
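The identity Z(fg) = Z(f) ∪ Z(g) can be checked directly on a finite sample domain. A minimal Python sketch (the domain and the two functions are arbitrary illustrative choices):

```python
def zero_set(f, domain):
    """Z(f) restricted to a finite sample domain."""
    return {x for x in domain if f(x) == 0}

domain = range(-5, 6)
f = lambda x: x - 2          # Z(f) = {2}
g = lambda x: (x + 1) * x    # Z(g) = {-1, 0}
fg = lambda x: f(x) * g(x)

assert zero_set(fg, domain) == zero_set(f, domain) | zero_set(g, domain)
assert zero_set(fg, domain) >= zero_set(f, domain)   # Z(fg) contains Z(f)
```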


Chapter 392 28-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
392.1 extended real numbers

The extended real numbers are the real numbers together with +∞ (or simply ∞) and −∞. This set is usually denoted by R̄ or [−∞, ∞] [3], and the elements +∞ and −∞ are called plus infinity and minus infinity, respectively.

Following [3], let us next extend the order relation <, the addition and multiplication operations, and the absolute value from R to R̄. In other words, let us define how these operations should behave when some of their arguments are ∞ or −∞.

Order on R̄ The order relation on R extends to R̄ by defining that for any x ∈ R, we have −∞ < x and x < ∞, and that −∞ < ∞.

Addition For any real number x, we define x + (±∞) = (±∞) + x = ±∞, and for +∞ and −∞, we define (±∞) + (±∞) = ±∞. It should be pointed out that sums like (+∞) + (−∞) are left undefined.

Multiplication If x is a positive real number, then x · (±∞) = (±∞) · x = ±∞. Similarly, if x is a negative real number, then x · (±∞) = (±∞) · x = ∓∞. Furthermore, for ∞ and −∞, we define (+∞) · (+∞) = (−∞) · (−∞) = +∞ and (+∞) · (−∞) = (−∞) · (+∞) = −∞. In many areas of mathematics, products like 0 · ∞ are left undefined. However, a special case is measure theory, where it is convenient to define [3] 0 · (±∞) = (±∞) · 0 = 0.

Absolute value For ∞ and −∞, the absolute value is defined as |±∞| = +∞.

Examples

1. By taking x = −1 in the product rule, we obtain the relations (−1) · (±∞) = ∓∞.

REFERENCES
1. D.L. Cohn, Measure Theory, Birkhäuser, 1980.

Version: 1 Owner: matte Author(s): matte
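Python's float type (IEEE 754) already implements most of these conventions, except that 0 · ∞ evaluates to NaN rather than the measure-theoretic 0. A small sketch with a helper xmul (our name) encoding the convention used here:

```python
import math

inf = math.inf

def xmul(x, y):
    """Extended-real product with the measure-theoretic convention 0 * (±inf) = 0."""
    if 0 in (x, y):
        return 0.0
    return x * y

# Order and addition behave as defined above.
assert -inf < 42 < inf
assert 42 + inf == inf and (-inf) + (-inf) == -inf
# Multiplication by nonzero reals and by infinities.
assert xmul(-3, inf) == -inf and xmul(-inf, -inf) == inf
# The special measure-theoretic case, where plain float arithmetic gives nan.
assert xmul(0, inf) == 0.0 and math.isnan(0 * inf)
```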

Chapter 393 28-XX – Measure and integration
393.1 Riemann integral

Suppose there is a function f : D → R where D, R ⊆ R and that there is a closed interval I = [a, b] such that I ⊆ D. For any finite set of points {x0, x1, x2, . . . , xn} such that a = x0 < x1 < x2 < · · · < xn = b, there is a corresponding partition P = {[x0, x1), [x1, x2), . . . , [x_{n−1}, xn]} of I.

Let C(ε) be the set of all partitions of I with max(x_{i+1} − x_i) < ε. Then let S*(ε) be the infimum of the set of upper Riemann sums with each partition in C(ε), and let S_*(ε) be the supremum of the set of lower Riemann sums with each partition in C(ε). If ε1 < ε2, then C(ε1) ⊂ C(ε2), so S* = lim_{ε→0} S*(ε) and S_* = lim_{ε→0} S_*(ε) exist. If S* = S_*, then f is Riemann-integrable over I, and the Riemann integral of f over I is defined by

    ∫_a^b f(x) dx = S* = S_*.

Version: 4 Owner: bbukh Author(s): bbukh, vampyr
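For a well-behaved f, the upper and lower sums over a uniform partition already converge to the common value. A Python sketch for f(x) = x² on [0, 1], whose integral is 1/3 (the uniform partition and endpoint sampling are simplifications; the definition takes an infimum/supremum over all fine partitions):

```python
def riemann_sums(f, a, b, n):
    """Upper and lower Riemann sums of f over [a, b] with n uniform subintervals.

    Endpoint sampling suffices here because x**2 is monotone on [0, 1]."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]
    upper = sum(max(f(xs[i]), f(xs[i + 1])) * h for i in range(n))
    lower = sum(min(f(xs[i]), f(xs[i + 1])) * h for i in range(n))
    return upper, lower

upper, lower = riemann_sums(lambda x: x * x, 0.0, 1.0, 1000)
assert lower <= 1 / 3 <= upper
assert upper - lower < 1e-2   # the gap shrinks as the mesh goes to 0
```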

393.2 martingale

Let ν be a probability measure on Cantor space C, and let s ∈ [0, ∞).

1. A ν-s-supergale is a function d : {0, 1}* → [0, ∞) that satisfies the condition

       d(w)ν(w)^s ≥ d(w0)ν(w0)^s + d(w1)ν(w1)^s                (393.2.1)

   for all w ∈ {0, 1}*.

2. A ν-s-gale is a ν-s-supergale that satisfies the condition with equality for all w ∈ {0, 1}*.

3. A ν-supermartingale is a ν-1-supergale.

4. A ν-martingale is a ν-1-gale.

5. An s-supergale is a µ-s-supergale, where µ is the uniform probability measure.

6. An s-gale is a µ-s-gale.

7. A supermartingale is a 1-supergale.

8. A martingale is a 1-gale.

Put in another way, a martingale is a function d : {0, 1}* → [0, ∞) such that, for all w ∈ {0, 1}*, d(w) = (d(w0) + d(w1))/2.

Let d be a ν-s-supergale, where ν is a probability measure on C and s ∈ [0, ∞). We say that d succeeds on a sequence S ∈ C if

    lim sup_{n→∞} d(S[0..n − 1]) = ∞.

The success set of d is S^∞[d] = {S ∈ C | d succeeds on S}. d succeeds on a language A ⊆ {0, 1}* if d succeeds on the characteristic sequence χ_A of A. We say that d succeeds strongly on a sequence S ∈ C if

    lim inf_{n→∞} d(S[0..n − 1]) = ∞.

The strong success set of d is S^∞_str[d] = {S ∈ C | d succeeds strongly on S}.

Intuitively, a supergale d is a betting strategy that bets on the next bit of a sequence when the previous bits are known. The parameter s tunes the fairness of the betting: the smaller s is, the less fair the betting is. If d succeeds on a sequence, then the bonus we can get from applying d as the betting strategy on the sequence is unbounded. If d succeeds strongly on a sequence, then the bonus goes to infinity.

Version: 10 Owner: xiaoyanggu Author(s): xiaoyanggu


Chapter 394 28A05 – Classes of sets (Borel fields, σ-rings, etc.), measurable sets, Suslin sets, analytic sets
394.1 Borel σ-algebra

For any topological space X, the Borel sigma algebra of X is the σ–algebra B generated by the open sets of X. An element of B is called a Borel subset of X, or a Borel set. Version: 5 Owner: djao Author(s): djao, rmilson


Chapter 395 28A10 – Real- or complex-valued set functions
395.1 σ-finite

A measure space (Ω, B, µ) is σ-finite if the total space is the union of a finite or countable family of sets of finite measure; i.e., if there exists a finite or countable set F ⊂ B such that µ(A) < ∞ for each A ∈ F, and Ω = ⋃_{A∈F} A. In this case we also say that µ is a σ-finite measure. If µ is not σ-finite, we say that it is σ-infinite.

Examples. Any finite measure space is σ-finite. A more interesting example is the Lebesgue measure µ on R^n: it is σ-finite but not finite. In fact,

    R^n = ⋃_{k∈N} [−k, k]^n

([−k, k]^n is a cube with center at 0 and side length 2k, and its measure is (2k)^n), but µ(R^n) = ∞.

Version: 6 Owner: Koro Author(s): Koro, drummond

395.2 Argand diagram

An Argand diagram is the graphical representation of complex numbers written in polar coordinates. It is named after Jean-Robert Argand, the Frenchman who is credited with the geometric interpretation of the complex numbers [Biography].

Version: 3 Owner: drini Author(s): drini

395.3 Hahn-Kolmogorov theorem

Let A0 be an algebra of subsets of a set X. If a finitely additive measure µ0 : A0 → R ∪ {∞} satisfies

    µ0(⋃_{n=1}^∞ An) = Σ_{n=1}^∞ µ0(An)

for any disjoint family {An : n ∈ N} of elements of A0 such that ⋃_{n=1}^∞ An ∈ A0, then µ0 extends uniquely to a measure defined on the σ-algebra A generated by A0; i.e., there exists a unique measure µ : A → R ∪ {∞} such that its restriction to A0 coincides with µ0.

Version: 3 Owner: Koro Author(s): Koro

395.4 measure

Let (E, B(E)) be a measurable space. A measure on (E, B(E)) is a function µ : B(E) → R ∪ {∞} with values in the extended real numbers such that:

1. µ(A) ≥ 0 for A ∈ B(E), with equality if A = ∅;

2. µ(⋃_{i=0}^∞ Ai) = Σ_{i=0}^∞ µ(Ai) for any sequence of disjoint sets Ai ∈ B(E).

The second property is called countable additivity. A finitely additive measure µ has the same definition except that B(E) is only required to be an algebra and the second property above is only required to hold for finite unions. Note the slight abuse of terminology: a finitely additive measure is not necessarily a measure.

The triple (E, B, µ) is called a measure space. If µ(E) = 1, then it is called a probability space, and the measure µ is called a probability measure. Lebesgue measure on R^n is one important example of a measure.

Version: 8 Owner: djao Author(s): djao

395.5 outer measure

Definition [1, 2, 1] Let X be a set, and let P(X) be the power set of X. An outer measure on X is a function µ* : P(X) → [0, ∞] satisfying the properties:

1. µ*(∅) = 0.

2. If A ⊂ B are subsets in X, then µ*(A) ≤ µ*(B).

3. If {Ai} is a countable collection of subsets of X, then µ*(⋃_i Ai) ≤ Σ_i µ*(Ai).

Here, we can make two remarks. First, from (1) and (2), it follows that µ* is a positive function on P(X). Second, property (3) also holds for any finite collection of subsets, since we can always append an infinite sequence of empty sets to such a collection.

Examples

• [1, 2] On a set X, let us define µ* : P(X) → [0, ∞] as

      µ*(E) = 1 when E ≠ ∅,   µ*(E) = 0 when E = ∅.

  Then µ* is an outer measure.

• [1] On an uncountable set X, let us define µ* : P(X) → [0, ∞] as

      µ*(E) = 1 when E is uncountable,   µ*(E) = 0 when E is countable.

  Then µ* is an outer measure.

Theorem [1, 2, 1] Let X be a set, and let F be a collection of subsets of X such that ∅ ∈ F and X ∈ F. Further, let ρ : F → [0, ∞] be a mapping such that ρ(∅) = 0. If A ⊂ X, let

    µ*(A) = inf Σ_{i=1}^∞ ρ(Fi),

where the infimum is taken over all collections {Fi}_{i=1}^∞ ⊂ F such that A ⊂ ⋃_{i=1}^∞ Fi. Then µ* : P(X) → [0, ∞] is an outer measure.

REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.

Version: 1 Owner: mathcam Author(s): matte
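The first example (µ*(E) = 1 for nonempty E, 0 for ∅) can be verified exhaustively on a small finite set, where properties (1)–(3) reduce to finitely many checks. A Python sketch:

```python
from itertools import chain, combinations

def subsets(xs):
    """All subsets of the finite set xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

def mu_star(e):
    """Example outer measure: 0 on the empty set, 1 otherwise."""
    return 0 if not e else 1

X = subsets({1, 2, 3})
assert mu_star(frozenset()) == 0                         # property 1
for a in X:
    for b in X:
        if a <= b:
            assert mu_star(a) <= mu_star(b)              # property 2 (monotonicity)
        # property 3 for pairs (finite subadditivity)
        assert mu_star(a | b) <= mu_star(a) + mu_star(b)
```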

395.6 properties for measure

Theorem [1, 1, 3, 2] Let (E, B, µ) be a measure space, i.e., let E be a set, let B be a σ-algebra of sets in E, and let µ be a measure on B. Then the following properties hold:

1. Monotonicity: If A, B ∈ B, and A ⊂ B, then µ(A) ≤ µ(B).

2. If A, B ∈ B, A ⊂ B, and µ(A) < ∞, then µ(B \ A) = µ(B) − µ(A).

3. For any A, B ∈ B, we have µ(A ∪ B) + µ(A ∩ B) = µ(A) + µ(B).

4. Subadditivity: If {Ai}_{i=1}^∞ is a collection of sets from B, then

       µ(⋃_{i=1}^∞ Ai) ≤ Σ_{i=1}^∞ µ(Ai).

5. Continuity from below: If {Ai}_{i=1}^∞ is a collection of sets from B such that Ai ⊂ A_{i+1} for all i, then

       µ(⋃_{i=1}^∞ Ai) = lim_{i→∞} µ(Ai).

6. Continuity from above: If {Ai}_{i=1}^∞ is a collection of sets from B such that µ(A1) < ∞, and Ai ⊃ A_{i+1} for all i, then

       µ(⋂_{i=1}^∞ Ai) = lim_{i→∞} µ(Ai).

Remarks In (2), the assumption µ(A) < ∞ assures that the right hand side is always well defined, i.e., not of the form ∞ − ∞. Without the assumption we can only prove that µ(B) = µ(A) + µ(B \ A) (see below). In (3), it is tempting to move the term µ(A ∩ B) to the other side for aesthetic reasons. However, this is only possible if the term is finite.

Proof. For (1), suppose A ⊂ B. We can then write B as the disjoint union B = A ∪ (B \ A), whence

    µ(B) = µ(A ∪ (B \ A)) = µ(A) + µ(B \ A).

Since µ(B \ A) ≥ 0, the claim follows. Property (2) follows from the above equation; since µ(A) < ∞, we can subtract this quantity from both sides.

For property (3), we can write A ∪ B = A ∪ (B \ A), whence

    µ(A ∪ B) = µ(A) + µ(B \ A) ≤ µ(A) + µ(B).

If any of the quantities µ(A), µ(B), µ(A ∪ B) or µ(A ∩ B) is infinite, then, by the last inequality and monotonicity (1), both sides of the claimed equality are infinite, whence the claim clearly holds. We can therefore without loss of generality assume that all quantities are finite. From A ∪ B = B ∪ (A \ B), we have µ(A ∪ B) = µ(B) + µ(A \ B), and thus

    2µ(A ∪ B) = µ(A) + µ(B) + µ(A \ B) + µ(B \ A).

For the last two terms we have

    µ(A \ B) + µ(B \ A) = µ((A \ B) ∪ (B \ A))
                        = µ((A ∪ B) \ (A ∩ B))
                        = µ(A ∪ B) − µ(A ∩ B),

where, in the second equality, we have used properties of the symmetric set difference, and the last equality follows from property (2). This completes the proof of property (3).

For property (4), let us define the sequence {Di}_{i=1}^∞ as

    D1 = A1,   Di = Ai \ ⋃_{k=1}^{i−1} Ak.

Now Di ∩ Dj = ∅ for i < j, so {Di} is a sequence of disjoint sets. Since ⋃_{i=1}^∞ Di = ⋃_{i=1}^∞ Ai, and since Di ⊂ Ai, we have

    µ(⋃_{i=1}^∞ Ai) = µ(⋃_{i=1}^∞ Di) = Σ_{i=1}^∞ µ(Di) ≤ Σ_{i=1}^∞ µ(Ai),

and property (4) follows.

TODO: proofs for (5)-(6).

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.
2. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.
3. D.L. Cohn, Measure Theory, Birkhäuser, 1980.
4. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.

Version: 2 Owner: matte Author(s): matte
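Properties (1)–(4), in their finite forms, can be checked exhaustively for the counting measure on a small finite set. A Python sketch:

```python
from itertools import chain, combinations

def subsets(xs):
    xs = list(xs)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))]

mu = len   # counting measure on a finite set

E = frozenset({1, 2, 3, 4})
for a in subsets(E):
    for b in subsets(E):
        if a <= b:
            assert mu(a) <= mu(b)                       # (1) monotonicity
            assert mu(b - a) == mu(b) - mu(a)           # (2)
        assert mu(a | b) + mu(a & b) == mu(a) + mu(b)   # (3)
        assert mu(a | b) <= mu(a) + mu(b)               # (4) finite subadditivity
```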

Chapter 396 28A12 – Contents, measures, outer measures, capacities
396.1 Hahn decomposition theorem

Let µ be a signed measure in the measurable space (Ω, S). There are two measurable sets A and B such that:

1. A ∪ B = Ω and A ∩ B = ∅;

2. µ(E) ≥ 0 for each E ∈ S such that E ⊂ A;

3. µ(E) ≤ 0 for each E ∈ S such that E ⊂ B.

The pair (A, B) is called a Hahn decomposition for µ. This decomposition is not unique, but any other such decomposition (A', B') satisfies µ(A' △ A) = µ(B △ B') = 0 (where △ denotes the symmetric difference), so the two decompositions differ in a set of measure 0.

Version: 6 Owner: Koro Author(s): Koro

396.2 Jordan decomposition

Let (Ω, S, µ) be a signed measure space, and let (A, B) be a Hahn decomposition for µ. We define µ+ and µ− by

    µ+(E) = µ(A ∩ E)   and   µ−(E) = −µ(B ∩ E).

This definition is easily shown to be independent of the chosen Hahn decomposition.

It is clear that µ+ is a positive measure, and it is called the positive variation of µ. On the other hand, µ− is a positive finite measure, called the negative variation of µ. The measure |µ| = µ+ + µ− is called the total variation of µ.

Notice that µ = µ+ − µ−. This decomposition of µ into its positive and negative parts is called the Jordan decomposition of µ.

Version: 6 Owner: Koro Author(s): Koro
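For a signed measure concentrated on finitely many atoms, a Hahn decomposition is immediate: put the atoms of positive mass in A and the rest in B. A Python sketch (the weights are an arbitrary example):

```python
# Signed measure on a finite Ω given by point masses.
mass = {"p": 2.0, "q": -3.0, "r": 0.5, "s": -0.25}

def mu(e):
    return sum(mass[x] for x in e)

A = {x for x, m in mass.items() if m > 0}     # carries the positive part
B = set(mass) - A                             # carries the negative part

mu_plus = lambda e: mu(A & set(e))            # positive variation
mu_minus = lambda e: -mu(B & set(e))          # negative variation

omega = set(mass)
assert mu_plus(omega) == 2.5
assert mu_minus(omega) == 3.25
# Jordan decomposition: mu = mu+ − mu−.
assert mu(omega) == mu_plus(omega) - mu_minus(omega)
```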

396.3 Lebesgue decomposition theorem

Let µ and ν be two σ-finite signed measures in the measurable space (Ω, S). There exist two σ-finite signed measures ν0 and ν1 such that:

1. ν = ν0 + ν1;

2. ν0 ≪ µ (i.e., ν0 is absolutely continuous with respect to µ);

3. ν1 ⊥ µ (i.e., ν1 and µ are singular).

These two measures are uniquely determined.

Version: 5 Owner: Koro Author(s): Koro

396.4 Lebesgue outer measure

Let S be some arbitrary subset of R. Let L(I) be the traditional definition of the length of an interval I ⊆ R: if I = (a, b), then L(I) = b − a. Let M be the set containing the sums

    Σ_{A∈C} L(A)

for any countable collection of open intervals C that covers S (that is, S ⊆ ⋃ C). Then the Lebesgue outer measure of S is defined by:

    m*(S) = inf(M).

Note that (R, P(R), m*) is “almost” a measure space. In particular:

• Lebesgue outer measure is defined for any subset of R (and P(R) is a σ-algebra).

• m*(A) ≥ 0 for any A ⊆ R, and m*(∅) = 0.

• If A and B are disjoint sets, then m*(A ∪ B) ≤ m*(A) + m*(B). More generally, if {Ai} is a countable sequence of disjoint sets, then m*(⋃ Ai) ≤ Σ m*(Ai). This property is known as countable subadditivity and is weaker than countable additivity. In fact, m* is not countably additive.

Lebesgue outer measure has other nice properties:

• The outer measure of an interval is its length: m*((a, b)) = b − a.

• m* is translation invariant. That is, if we define A + y to be the set {x + y : x ∈ A}, we have m*(A) = m*(A + y) for any y ∈ R.

Version: 4 Owner: vampyr Author(s): vampyr

396.5 absolutely continuous

Given two signed measures µ and ν on the same measurable space (Ω, S), we say that ν is absolutely continuous with respect to µ if, for each A ∈ S such that |µ|(A) = 0, it holds that ν(A) = 0. This is usually denoted by ν ≪ µ.

Remarks. If (ν+, ν−) is the Jordan decomposition of ν, the following propositions are equivalent:

1. ν ≪ µ;

2. ν+ ≪ µ and ν− ≪ µ;

3. |ν| ≪ |µ|.

If ν is a finite signed measure and ν ≪ µ, the following useful property holds: for each ε > 0, there is a δ > 0 such that |ν|(E) < ε whenever |µ|(E) < δ.

Version: 5 Owner: Koro Author(s): Koro


396.6 counting measure

Let (X, B) be a measurable space. We call a measure µ the counting measure on X if, for A ∈ B,

    µ(A) = n   if A has exactly n elements,
    µ(A) = ∞   otherwise.

Generally, counting measure is applied on N or Z.

Version: 2 Owner: mathwizard Author(s): mathwizard, drummond
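A direct Python transcription for finite sets, together with a check of additivity over a disjoint family:

```python
def counting_measure(a):
    """Counting measure of a finite set A: the number of its elements.

    (For an infinite A the measure is infinity; that case is omitted here.)"""
    return len(set(a))

# Additivity, checked for a finite disjoint family.
family = [{0, 1}, {2}, {3, 4, 5}]
union = set().union(*family)
assert counting_measure(union) == sum(counting_measure(a) for a in family)
assert counting_measure(set()) == 0
```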

396.7 measurable set

Let (X, F, µ) be a measure space with a sigma algebra F. A measurable set with respect to µ in X is an element of F. These are also sometimes called µ-measurable sets. Any subset Y ⊂ X with Y ∉ F is said to be nonmeasurable with respect to µ, or non-µ-measurable.

Version: 2 Owner: mathcam Author(s): mathcam, drummond

396.8 outer regular

Let X be a locally compact Hausdorff topological space with Borel σ-algebra B, and suppose µ is a measure on (X, B). For any Borel set B ∈ B, the measure µ is said to be outer regular on B if

    µ(B) = inf {µ(U) | U ⊃ B, U open}.

We say µ is inner regular on B if

    µ(B) = sup {µ(K) | K ⊂ B, K compact}.

Version: 1 Owner: djao Author(s): djao

396.9 signed measure

A signed measure on a measurable space (Ω, S) is a function µ : S → R ∪ {+∞} which is σ-additive and such that µ(∅) = 0.

Remarks.

1. The usual (positive) measure is a particular case of signed measure, in which |µ| = µ (see Jordan decomposition).

2. Notice that the value −∞ is not allowed.

3. An important example of signed measures arises from the usual measures in the following way: Let (Ω, S, µ) be a measure space, and let f be a (real valued) measurable function such that

       ∫_{x∈Ω : f(x)<0} |f| dµ < ∞.

   Then a signed measure is defined by A → ∫_A f dµ.

Version: 4 Owner: Koro Author(s): Koro

396.10 singular measure

Two measures µ and ν in a measurable space (Ω, A) are called singular if there exist two disjoint sets A and B in A such that A ∪ B = Ω and µ(B) = ν(A) = 0. This is denoted by µ ⊥ ν.

Version: 4 Owner: Koro Author(s): Koro


Chapter 397 28A15 – Abstract differentiation theory, differentiation of set functions
397.1 Hardy-Littlewood maximal theorem

There is a constant K > 0 such that for each Lebesgue integrable function f ∈ L¹(R^n), and each t > 0,

    m({x : Mf(x) > t}) ≤ (K/t) ‖f‖₁ = (K/t) ∫_{R^n} |f(x)| dx,

where Mf is the Hardy-Littlewood maximal function of f.

Remark. The theorem holds for the constant K = 3^n.

Version: 1 Owner: Koro Author(s): Koro

397.2 Lebesgue differentiation theorem

Let f be a locally integrable function on R^n with Lebesgue measure m, i.e., f ∈ L¹_loc(R^n). Lebesgue's differentiation theorem basically says that for almost every x, the averages

    (1/m(Q)) ∫_Q |f(y) − f(x)| dy

converge to 0 when Q is a cube containing x and m(Q) → 0.

Formally, this means that there is a set N ⊂ R^n with m(N) = 0, such that for every x ∉ N and ε > 0, there exists δ > 0 such that, for each cube Q with x ∈ Q and m(Q) < δ, we have

    (1/m(Q)) ∫_Q |f(y) − f(x)| dy < ε.

For n = 1, this can be restated as an analogue of the fundamental theorem of calculus for Lebesgue integrals: given x0 ∈ R,

    (d/dx) ∫_{x0}^x f(t) dt = f(x)

for almost every x.

Version: 6 Owner: Koro Author(s): Koro

397.3 Radon-Nikodym theorem

Let µ and ν be two σ-finite measures on the same measurable space (Ω, S), such that ν ≪ µ (i.e., ν is absolutely continuous with respect to µ). Then there exists a measurable function f, which is nonnegative and finite, such that for each A ∈ S,

    ν(A) = ∫_A f dµ.

This function is unique (any other function satisfying these conditions is equal to f µ-almost everywhere), and it is called the Radon-Nikodym derivative of ν with respect to µ, denoted by f = dν/dµ.

Remark. The theorem also holds if ν is a signed measure. Even if ν is not σ-finite the theorem holds, with the exception that f is not necessarily finite.

Some properties of the Radon-Nikodym derivative

Let ν, µ, and λ be σ-finite measures in (Ω, S).

1. If ν ≪ λ and µ ≪ λ, then

       d(ν + µ)/dλ = dν/dλ + dµ/dλ   λ-almost everywhere;

2. If ν ≪ µ ≪ λ, then

       dν/dλ = (dν/dµ)(dµ/dλ)   λ-almost everywhere;

3. If µ ≪ λ and g is a µ-integrable function, then

       ∫_Ω g dµ = ∫_Ω g (dµ/dλ) dλ;

4. If µ ≪ ν and ν ≪ µ, then

       dµ/dν = (dν/dµ)^{−1}.

Version: 5 Owner: Koro Author(s): Koro
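On a finite space where µ gives every point positive mass, the Radon-Nikodym derivative is just the pointwise ratio of masses. A Python sketch (the two measures are arbitrary illustrative data):

```python
# Finite measurable space: every subset is measurable.
mu = {"a": 1.0, "b": 2.0, "c": 4.0}          # reference measure, mu({x}) > 0
nu = {"a": 0.5, "b": 6.0, "c": 2.0}          # nu << mu holds automatically here

f = {x: nu[x] / mu[x] for x in mu}           # Radon-Nikodym derivative dnu/dmu

def integral(g, m, a):
    """Integral of g over a subset a of a finite space, against the measure m."""
    return sum(g[x] * m[x] for x in a)

# nu(A) = integral of f over A with respect to mu, for every subset A.
for a in ({"a"}, {"b", "c"}, {"a", "b", "c"}, set()):
    assert abs(sum(nu[x] for x in a) - integral(f, mu, a)) < 1e-12
```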

397.4 integral depending on a parameter

Suppose (E, B, µ) is a measure space, suppose I is an open interval in R, and suppose we are given a function

    f : E × I → R̄,   (x, t) → f(x, t),

where R̄ is the extended real numbers. Further, suppose that for each t ∈ I, the mapping x → f(x, t) is in L¹(E). (Here, L¹(E) is the set of measurable functions f : E → R̄ with finite Lebesgue integral; ∫_E |f(x)| dµ < ∞.) Then we can define a function F : I → R by

    F(t) = ∫_E f(x, t) dµ.

Continuity of F

Let t0 ∈ I. In addition to the above, suppose:

1. For almost all x ∈ E, the mapping t → f(x, t) is continuous at t = t0.

2. There is a function g ∈ L¹(E) such that for almost all x ∈ E, |f(x, t)| ≤ g(x) for all t ∈ I.

Then F is continuous at t0.

Differentiation under the integral sign

Suppose that the assumptions given in the introduction hold, and suppose:

1. For almost all x ∈ E, the mapping t → f(x, t) is differentiable for all t ∈ I.

2. There is a function g ∈ L¹(E) such that for almost all x ∈ E, |(d/dt) f(x, t)| ≤ g(x) for all t ∈ I.

Then F is differentiable on I, the mapping x → (d/dt) f(x, t) is in L¹(E), and for all t ∈ I,

    (d/dt) F(t) = ∫_E (d/dt) f(x, t) dµ.                (397.4.1)

The above results can be found in [1, 1].

REFERENCES
1. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Barlett Publishers, 1993.
2. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.

Version: 1 Owner: matte Author(s): matte


Chapter 398 28A20 – Measurable and nonmeasurable functions, sequences of measurable functions, modes of convergence
398.1 Egorov’s theorem

Let (X, S, µ) be a measure space, and let E be a subset of X of finite measure. If fn is a sequence of measurable functions converging to f almost everywhere, then for each δ > 0 there exists a set Eδ such that µ(Eδ ) < δ and fn → f uniformly on E − Eδ . Version: 2 Owner: Koro Author(s): Koro

398.2 Fatou's lemma

If f1, f2, . . . is a sequence of nonnegative measurable functions in a measure space X, then

    ∫_X lim inf_{n→∞} fn ≤ lim inf_{n→∞} ∫_X fn.

Version: 3 Owner: Koro Author(s): Koro


398.3 Fatou-Lebesgue theorem

Let X be a measure space. If Φ is a measurable function with ∫_X Φ < ∞, and if f1, f2, . . . is a sequence of measurable functions such that |fn| ≤ Φ for each n, then

    g = lim inf_{n→∞} fn   and   h = lim sup_{n→∞} fn

are both integrable, and

    −∞ < ∫_X g ≤ lim inf_{n→∞} ∫_X fn ≤ lim sup_{n→∞} ∫_X fn ≤ ∫_X h < ∞.

Version: 3 Owner: Koro Author(s): Koro

398.4 dominated convergence theorem

Let X be a measure space, and let Φ, f1, f2, . . . be measurable functions such that ∫_X Φ < ∞ and |fn| ≤ Φ for each n. If fn → f almost everywhere, then f is integrable and

    lim_{n→∞} ∫_X fn = ∫_X f.

This theorem is a corollary of the Fatou-Lebesgue theorem.

A possible generalization is that if {fr : r ∈ R} is a family of measurable functions such that |fr| ≤ Φ for each r ∈ R and fr → f as r → 0, then f is integrable and

    lim_{r→0} ∫_X fr = ∫_X f.

Version: 8 Owner: Koro Author(s): Koro

398.5 measurable function

Let f : X → R̄ be a function defined on a measure space X. We say that f is measurable if {x ∈ X | f(x) > a} is a measurable set for all a ∈ R.

Version: 5 Owner: vypertd Author(s): vypertd

398.6 monotone convergence theorem

Let X be a measure space, and let 0 ≤ f1 ≤ f2 ≤ · · · be a monotone increasing sequence of nonnegative measurable functions. Let f be the function defined almost everywhere by f(x) = lim_{n→∞} fn(x). Then f is measurable, and

    lim_{n→∞} ∫_X fn = ∫_X f.

Remark. This theorem is the first of several theorems which allow us to “exchange integration and limits”. It requires the use of the Lebesgue integral: with the Riemann integral, we cannot even formulate the theorem, lacking, as we do, the concept of “almost everywhere”. For instance, the characteristic function of the rational numbers in [0, 1] is not Riemann integrable, despite being the limit of an increasing sequence of Riemann integrable functions.

Version: 5 Owner: Koro Author(s): Koro, ariels
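With the counting measure on N, integrals are sums, and the theorem specializes to the familiar fact that the partial sums of a nonnegative series increase to its sum. A Python sketch with f(k) = 1/k², whose integral is π²/6 (the finite support bound is a numerical proxy for N):

```python
import math

def f(k):
    return 1.0 / k**2

def f_n(n):
    """Increasing truncations: f_n(k) = f(k) for k <= n, and 0 otherwise."""
    return lambda k: f(k) if k <= n else 0.0

def integral(g, support=10_000):
    """Integral against counting measure on {1, ..., support}."""
    return sum(g(k) for k in range(1, support + 1))

values = [integral(f_n(n)) for n in (10, 100, 1000)]
# 0 <= f_1 <= f_2 <= ... and f_n -> f pointwise, so the integrals increase to the integral of f.
assert values == sorted(values)
assert abs(values[-1] - math.pi**2 / 6) < 1e-2
```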

398.7 proof of Egorov's theorem

Let Ei,j = {x ∈ E : |fj(x) − f(x)| < 1/i}. Since fn → f almost everywhere, there is a set S with µ(S) = 0 such that, given i ∈ N and x ∈ E − S, there is m ∈ N such that j > m implies |fj(x) − f(x)| < 1/i. This can be expressed by

    E − S ⊂ ⋃_{m∈N} ⋂_{j>m} Ei,j,

or, in other words,

    ⋂_{m∈N} ⋃_{j>m} (E − Ei,j) ⊂ S.

Since {⋃_{j>m} (E − Ei,j)}_{m∈N} is a decreasing nested sequence of sets, each of which has finite measure, and such that its intersection has measure 0, by continuity from above we know that

    µ(⋃_{j>m} (E − Ei,j)) → 0   as m → ∞.

Therefore, for each i ∈ N, we can choose mi such that

    µ(⋃_{j>mi} (E − Ei,j)) < δ/2^i.

Let

    Eδ = ⋃_{i∈N} ⋃_{j>mi} (E − Ei,j).

Then

    µ(Eδ) ≤ Σ_{i=1}^∞ µ(⋃_{j>mi} (E − Ei,j)) < Σ_{i=1}^∞ δ/2^i = δ.

We claim that fn → f uniformly on E − Eδ. In fact, given ε > 0, choose n such that 1/n < ε. If x ∈ E − Eδ, we have

    x ∈ ⋂_{i∈N} ⋂_{j>mi} Ei,j,

which in particular implies that, if j > mn, then x ∈ En,j; that is, |fj(x) − f(x)| < 1/n < ε. Hence, for each ε > 0 there is N (which is given by mn above) such that j > N implies |fj(x) − f(x)| < ε for each x ∈ E − Eδ, as required. This completes the proof.

Version: 3 Owner: Koro Author(s): Koro

398.8 proof of Fatou's lemma

Let f(x) = lim inf_{n→∞} fn(x) and let gn(x) = inf_{k≥n} fk(x), so that we have

    f(x) = sup_n gn(x).

As gn is an increasing sequence of measurable nonnegative functions, we can apply the monotone convergence theorem to obtain

    ∫_X f dµ = lim_{n→∞} ∫_X gn dµ.

On the other hand, being gn ≤ fn, we conclude by observing

    lim_{n→∞} ∫_X gn dµ = lim inf_{n→∞} ∫_X gn dµ ≤ lim inf_{n→∞} ∫_X fn dµ.

Version: 1 Owner: paolini Author(s): paolini

398.9 proof of Fatou-Lebesgue theorem

By Fatou's lemma we have

    ∫_X g ≤ lim inf_{n→∞} ∫_X fn,

and (recall that lim sup f = − lim inf(−f))

    lim sup_{n→∞} ∫_X fn ≤ ∫_X h.

On the other hand, by the properties of lim inf and lim sup we have g ≥ −Φ and h ≤ Φ, and hence

    ∫_X g ≥ ∫_X −Φ > −∞,   ∫_X h ≤ ∫_X Φ < +∞.

Version: 1 Owner: paolini Author(s): paolini

398.10 proof of dominated convergence theorem

It is not difficult to prove that f is measurable. In fact we can write

    f(x) = sup_n inf_{k≥n} fk(x),

and we know that measurable functions are closed under the sup and inf operations.

Consider the sequence gn(x) = 2Φ(x) − |f(x) − fn(x)|. Clearly the gn are nonnegative functions, since |f − fn| ≤ 2Φ. So, applying Fatou's lemma (and recalling that fn → f almost everywhere, so lim sup |f − fn| = 0 a.e.), we obtain

    lim_{n→∞} ∫_X |f − fn| dµ ≤ lim sup_{n→∞} ∫_X |f − fn| dµ
        = − lim inf_{n→∞} ∫_X −|f − fn| dµ
        = ∫_X 2Φ dµ − lim inf_{n→∞} ∫_X (2Φ − |f − fn|) dµ
        ≤ ∫_X 2Φ dµ − ∫_X lim inf_{n→∞} (2Φ − |f − fn|) dµ
        = ∫_X 2Φ dµ − ∫_X (2Φ − lim sup_{n→∞} |f − fn|) dµ
        = ∫_X 2Φ dµ − ∫_X 2Φ dµ = 0.

Version: 1 Owner: paolini Author(s): paolini

398.11 proof of monotone convergence theorem

It is enough to prove the following

Theorem 7. Let (X, µ) be a measure space and let fk : X → R ∪ {+∞} be a monotone increasing sequence of nonnegative measurable functions (i.e. 0 ≤ f1 ≤ f2 ≤ . . .). Then f(x) = lim_{k→∞} fk(x) is measurable and

    lim_{k→∞} ∫_X fk dµ = ∫_X f dµ.

First of all, by the monotonicity of the sequence we have

    f(x) = sup_k fk(x),

hence we know that f is measurable. Moreover, being fk ≤ f for all k, by the monotonicity of the integral we immediately get

    sup_k ∫_X fk dµ ≤ ∫_X f dµ.

So take any simple measurable function s such that 0 ≤ s ≤ f. Given also α < 1, define

    Ek = {x ∈ X : fk(x) ≥ αs(x)}.

The sequence Ek is an increasing sequence of measurable sets. Moreover the union of all Ek is the whole space X, since lim_{k→∞} fk(x) = f(x) ≥ s(x) > αs(x). Moreover it holds

    ∫_X fk dµ ≥ ∫_{Ek} fk dµ ≥ α ∫_{Ek} s dµ.

Being s a simple measurable function, it is easy to check that E → ∫_E s dµ is a measure, and hence

    sup_k ∫_X fk dµ ≥ α ∫_X s dµ.

But this last inequality holds for every α < 1 and for all simple measurable functions s with s ≤ f. Hence by the definition of the Lebesgue integral

    sup_k ∫_X fk dµ ≥ ∫_X f dµ,

which completes the proof.

Version: 1 Owner: paolini Author(s): paolini


Chapter 399 28A25 – Integration with respect to measures and other set functions
399.1 L∞(X, dµ)

The L∞ space, L∞(X, dµ), is a vector space consisting of equivalence classes of functions f : X → C with norm given by

    ‖f‖∞ = ess sup |f(t)|,

the essential supremum of |f|. Additionally, we require that ‖f‖∞ < ∞.

The equivalence classes of L∞(X, dµ) are given by saying that f, g : X → C are equivalent iff f and g differ on a set of µ-measure zero.

Version: 3 Owner: ack Author(s): bbukh, ack, apmxi
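On a finite space with point masses, the essential supremum simply ignores values carried by null atoms. A Python sketch (the weights and values are illustrative):

```python
def ess_sup(values, weight):
    """Essential supremum of |f| on a finite space: sup of |f(x)| over points of
    positive measure (points of weight 0 form a null set and are ignored)."""
    return max(abs(v) for x, v in values.items() if weight[x] > 0)

f = {"a": 100.0, "b": -3.0, "c": 2.0}
w = {"a": 0.0, "b": 1.0, "c": 1.0}   # the atom "a" is a null set

assert ess_sup(f, w) == 3.0          # the huge value on the null atom is invisible
assert max(abs(v) for v in f.values()) == 100.0   # the plain sup differs
```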

399.2 Hardy-Littlewood maximal operator

The Hardy-Littlewood maximal operator in R^n is an operator defined on L¹_loc(R^n) (the space of locally integrable functions in R^n with the Lebesgue measure) which maps each locally integrable function f to another function Mf, defined for each x ∈ R^n by

    Mf(x) = sup_Q (1/m(Q)) ∫_Q |f(y)| dy,

where the supremum is taken over all cubes Q containing x. This function is lower semicontinuous (and hence measurable), and it is called the Hardy-Littlewood maximal function of f.

The operator M is sublinear, which means that

    M(af + bg) ≤ |a| Mf + |b| Mg

for each pair of locally integrable functions f, g and scalars a, b.

Version: 3 Owner: Koro Author(s): Koro
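A discrete analogue replaces cubes by intervals of indices and integrals by averages, which makes the definition computable by brute force. A Python sketch:

```python
def maximal_function(f, i):
    """Discrete Hardy-Littlewood maximal function: the largest average of |f|
    over any index interval [lo, hi] containing position i."""
    best = 0.0
    for lo in range(0, i + 1):
        for hi in range(i, len(f)):
            window = f[lo:hi + 1]
            best = max(best, sum(abs(v) for v in window) / len(window))
    return best

f = [0.0, 0.0, 4.0, 0.0, 0.0]
Mf = [maximal_function(f, i) for i in range(len(f))]

assert Mf[2] == 4.0                # the singleton interval {2} is optimal there
assert abs(Mf[0] - 4 / 3) < 1e-12  # best interval containing 0 is [0, 2]
assert all(Mf[i] >= abs(f[i]) for i in range(len(f)))
```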

399.3 Lebesgue integral

The integral of a measurable function f : X → R ∪ {±∞} on a measure space (X, B, µ) is written

    ∫_X f dµ   or just   ∫ f.                (399.3.1)

It is defined via the following steps:

• If f = χ_A is the characteristic function of a set A ∈ B, then set

      ∫_X χ_A dµ := µ(A).                (399.3.2)

• If f is a simple function (i.e. if f can be written as

      f = Σ_{k=1}^n ck χ_{Ak},   ck ∈ R                (399.3.3)

  for some finite collection Ak ∈ B), then define

      ∫_X f dµ := Σ_{k=1}^n ck ∫_X χ_{Ak} dµ = Σ_{k=1}^n ck µ(Ak).                (399.3.4)

• If f is a nonnegative measurable function (possibly attaining the value ∞ at some points), then we define

      ∫_X f dµ := sup {∫_X h dµ : h is simple and h(x) ≤ f(x) for all x ∈ X}.                (399.3.5)

• For any measurable function f (possibly attaining the values ∞ or −∞ at some points), write f = f+ − f−, where

      f+ := max(f, 0)   and   f− := max(−f, 0),                (399.3.6)

  and define the integral of f as

      ∫_X f dµ := ∫_X f+ dµ − ∫_X f− dµ,                (399.3.7)

  provided that ∫_X f+ dµ and ∫_X f− dµ are not both ∞.

If µ is Lebesgue measure and X is any interval in R^n then the integral is called the Lebesgue integral. If the Lebesgue integral of a function f on a set A exists, f is said to be Lebesgue integrable. The Lebesgue integral equals the Riemann integral everywhere the latter is defined; the advantage of the Lebesgue integral is that many Lebesgue-integrable functions are not Riemann-integrable. For example, the Riemann integral of the characteristic function of the rationals in [0, 1] is undefined, while the Lebesgue integral of this function is simply the measure of the rationals in [0, 1], which is 0.

Version: 12 Owner: djao Author(s): djao, drummond
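The simple-function step is directly computable on a finite measure space. A Python sketch integrating a simple function Σ ck χ_{Ak} against point masses (the data is an arbitrary example):

```python
def integrate_simple(terms, mu):
    """Integral of a simple function sum_k c_k * chi_{A_k} against a measure
    given by point masses mu (a dict point -> mass). The A_k here are disjoint."""
    return sum(c * sum(mu[x] for x in a) for c, a in terms)

mu = {1: 0.5, 2: 0.5, 3: 2.0}
# f = 3*chi_{1,2} + (-1)*chi_{3}
terms = [(3.0, {1, 2}), (-1.0, {3})]

assert integrate_simple(terms, mu) == 3.0 * 1.0 - 1.0 * 2.0   # = 1.0
# Integrating a characteristic function alone recovers mu(A).
assert integrate_simple([(1.0, {1, 3})], mu) == 2.5
```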


Chapter 400 28A60 – Measures on Boolean rings, measure algebras
400.1 σ-algebra

Let X be a set. A σ-algebra is a collection M of subsets of X such that

• X ∈ M.

• If A ∈ M, then X − A ∈ M.

• If A1, A2, A3, . . . is a countable subcollection of M, that is, Aj ∈ M for j = 1, 2, 3, . . . (the subcollection can be finite), then the union of all of them is also in M:

      ⋃_{j=1}^∞ Aj ∈ M.

Version: 3 Owner: drini Author(s): drini, apmxi

400.2 σ-algebra

Given a set E, a sigma algebra (or σ-algebra) in E is a collection B(E) of subsets of E such that:

• ∅ ∈ B(E).

• Any countable union of elements of B(E) is in B(E).

• The complement of any element of B(E) in E is in B(E).

Given any collection C of subsets of E, the σ-algebra generated by C is defined to be the smallest σ-algebra in E containing C.

Version: 5 Owner: djao Author(s): djao

400.3 algebra

Given a set E, an algebra in E is a collection B(E) of subsets of E such that:

• ∅ ∈ B(E).

• Any finite union of elements of B(E) is in B(E).

• The complement of any element of B(E) in E is in B(E).

Given any collection C of subsets of E, the algebra generated by C is defined to be the smallest algebra in E containing C.

Version: 2 Owner: djao Author(s): djao

400.4 measurable set (for outer measure)

Definition [1, 2, 3] Let µ* be an outer measure on a set X. A set E ⊂ X is said to be measurable, or µ*-measurable, if for all A ⊂ X, we have
µ*(A) = µ*(A ∩ E) + µ*(A ∩ Eᶜ). (400.4.1)

Remark If A, E ⊂ X, we have, from the properties of the outer measure,
µ*(A) = µ*(A ∩ (E ∪ Eᶜ)) = µ*((A ∩ E) ∪ (A ∩ Eᶜ)) ≤ µ*(A ∩ E) + µ*(A ∩ Eᶜ).
Hence equation (400.4.1) is equivalent to the inequality [1, 2, 3]
µ*(A) ≥ µ*(A ∩ E) + µ*(A ∩ Eᶜ).


Of course, this inequality is trivially satisfied if µ*(A) = ∞. Thus a set E ⊂ X is µ*-measurable in X if and only if the above inequality holds for all A ⊂ X for which µ*(A) < ∞ [1].

Theorem [Carathéodory's theorem] [1, 2, 3] Suppose µ* is an outer measure on a set X, and suppose M is the set of all µ*-measurable sets in X. Then M is a σ-algebra, and µ* restricted to M is a measure (on M).

Example Let µ* be an outer measure on a set X.
1. Any null set (a set E with µ*(E) = 0) is measurable. Indeed, suppose µ*(E) = 0, and A ⊂ X. Then, since A ∩ E ⊂ E, we have µ*(A ∩ E) = 0, and since A ∩ Eᶜ ⊂ A, we have µ*(A) ≥ µ*(A ∩ Eᶜ), so
µ*(A) ≥ µ*(A ∩ E) + µ*(A ∩ Eᶜ).
Thus E is measurable.
2. If {Bi}_{i=1}^∞ is a countable collection of null sets, then ⋃_{i=1}^∞ Bi is a null set. This follows directly from the last property (countable subadditivity) of the outer measure.
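The Carathéodory criterion can be exercised on a toy outer measure. In the sketch below (our own construction), X = {0, 1} and µ*(A) = 1 for every nonempty A; this is monotone and subadditive, but only ∅ and X pass the measurability test:

```python
from itertools import chain, combinations

def subsets(X):
    X = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(X, r) for r in range(len(X) + 1))]

def outer(A):
    """Toy outer measure on X = {0, 1}: 0 on the empty set, 1 otherwise."""
    return 0 if not A else 1

X = frozenset({0, 1})

def measurable(E):
    """Caratheodory criterion: mu*(A) = mu*(A & E) + mu*(A - E) for all A."""
    return all(outer(A) == outer(A & E) + outer(A - E) for A in subsets(X))

M = [E for E in subsets(X) if measurable(E)]
assert set(M) == {frozenset(), X}   # only the trivial sets are measurable
```

For E = {0} and A = X the test fails (1 ≠ 1 + 1), so the singletons are not µ*-measurable, and the restriction of µ* to M = {∅, X} is indeed additive.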

REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional Analysis, Plenum Press, 1978.
2. A. Friedman, Foundations of Modern Analysis, Dover Publications, 1982.
3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.

Version: 1 Owner: matte Author(s): matte


Chapter 401 28A75 – Length, area, volume, other geometric measure theory
401.1 Lebesgue density theorem

Let µ be the Lebesgue measure on R. If µ(Y) > 0, then there exists X ⊂ Y such that µ(Y − X) = 0 and, for all x ∈ X,
lim_{ε→0⁺} µ(X ∩ [x − ε, x + ε]) / (2ε) = 1.
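A quick numerical illustration (assuming X = [0, 1], for which the density conclusion holds at every interior point): the ratio µ(X ∩ [x − ε, x + ε])/(2ε) tends to 1 as ε → 0⁺, while a boundary point (a set of measure zero) only attains 1/2.

```python
def measure_cap_interval(a, b, lo, hi):
    """Lebesgue measure of [a, b] ∩ [lo, hi]."""
    return max(0.0, min(b, hi) - max(a, lo))

def density(x, eps, a=0.0, b=1.0):
    """Density ratio of X = [a, b] at x over the window [x - eps, x + eps]."""
    return measure_cap_interval(a, b, x - eps, x + eps) / (2 * eps)

# Interior point: ratio -> 1; boundary point: ratio -> 1/2.
assert abs(density(0.5, 1e-6) - 1.0) < 1e-9
assert abs(density(0.0, 1e-6) - 0.5) < 1e-9
```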

Version: 2 Owner: bbukh Author(s): bbukh


Chapter 402 28A80 – Fractals
402.1 Cantor set

The Cantor set C is the canonical example of an uncountable set of measure zero. We construct C as follows.
Begin with the unit interval C0 = [0, 1], and remove the open segment R1 := (1/3, 2/3) from the middle. We define C1 as the two remaining pieces
C1 := C0 \ R1 = [0, 1/3] ∪ [2/3, 1]. (402.1.1)
Now repeat the process on each remaining segment, removing the open set
R2 := (1/9, 2/9) ∪ (7/9, 8/9) (402.1.2)
to form the four-piece set
C2 := C1 \ R2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1]. (402.1.3)

Continue the process, forming C3, C4, . . . Note that Ck has 2^k pieces.

Figure 402.1: The sets C0 through C5 in the construction of the Cantor set

Also note that at each step, the endpoints of each closed segment will stay in the set forever; e.g., the point 2/3 is never touched as we remove sets.

The Cantor set is defined as
C := ⋂_{k=1}^∞ Ck = C0 \ ⋃_{n=1}^∞ Rn. (402.1.4)

Cardinality of the Cantor set
To establish cardinality, we want a bijection between some set whose cardinality we know (e.g. Z, R) and the points in the Cantor set. We'll be aggressive and try the reals.
Start at C1, which has two pieces. Mark the left-hand segment "0" and the right-hand segment "1". Then continue to C2, and consider only the leftmost pair. Again, mark the segments "0" and "1", and do the same for the rightmost pair. Keep doing this all the way down the Ck, starting at the left side and marking the segments 0, 1, 0, 1, 0, 1 as you encounter them, until you've labeled the entire Cantor set.
Now, pick a path through the tree starting at C0 and going left-left-right-left. . . and so on. Mark a decimal point for C0, and record the zeros and ones as you proceed. Each path has a unique number based on your decision at each step. For example, the figure represents your choice of left-left-right-left-right at the first five steps, representing the number beginning 0.00101...

Figure 402.2: One possible path through C5: 0.00101

Every point in the Cantor set will have a unique address dependent solely on the pattern of lefts and rights, 0's and 1's, required to reach it. Each point thus has a unique number, the real number whose binary expansion is that sequence of zeros and ones. Every infinite stream of binary digits can be found among these paths, and in fact the binary expansion of every real number is a path to a unique point in the Cantor set.
Some caution is justified, as two binary expansions may refer to the same real number; for example, 0.011111... = 0.100000... = 1/2. However, each one of these duplicates must correspond to a rational number. To see this, suppose we have a number x in [0, 1] whose binary expansion becomes all zeros or all ones at digit k (both are the same number, remember). Then we can multiply that number by 2^k and get an integer, so it must be a (binary) rational number. There are only countably many rationals, and not even all of those are the double-covered numbers we're worried about (see, e.g., 1/3 = 0.0101010...), so we have at most countably many duplicated reals.
Thus, the cardinality of the Cantor set is equal to that of the reals. (If we want to be really picky, map (0, 1) to the reals with, say, f(x) = 1/x + 1/(x − 1), and the end points really don't matter much.)
Return, for a moment, to the earlier observation that numbers such as 1/3 and 2/9, the endpoints of deleted intervals, are themselves never deleted. In particular, consider the first deleted interval: the ternary expansions of its constituent numbers are precisely those that begin 0.1 and proceed thence with at least one non-zero ternary digit further along. Note also that the point 1/3, with ternary expansion 0.1, may also be written 0.0222... (that is, 0.02 with the 2 repeating), which has no digit 1. Similar descriptions apply to further deleted intervals. The result is that the Cantor set is precisely the set of numbers in [0, 1] having a ternary expansion that contains no digit 1.
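The left/right addressing above can be made computational: send a binary address (b1, b2, ...) to the point with ternary digits (2b1, 2b2, ...). A sketch with exact rational arithmetic (function names are ours); membership in the k-th construction stage Ck is tested by the standard rescaling recursion:

```python
from fractions import Fraction

def cantor_point(bits):
    """Map a finite binary address to a Cantor-set point:
    binary digit b in {0, 1} becomes ternary digit 2b."""
    return sum(Fraction(2 * b, 3 ** (n + 1)) for n, b in enumerate(bits))

def in_Ck(x, k):
    """Is x in the k-th stage C_k of the Cantor construction?"""
    if x < 0 or x > 1:
        return False
    if k == 0:
        return True
    if x <= Fraction(1, 3):
        return in_Ck(3 * x, k - 1)        # left third, rescaled to [0,1]
    if x >= Fraction(2, 3):
        return in_Ck(3 * x - 2, k - 1)    # right third, rescaled to [0,1]
    return False                          # middle third was removed

x = cantor_point([0, 0, 1, 0, 1])         # the address 0.00101 from the figure
assert all(in_Ck(x, k) for k in range(5))
assert not in_Ck(Fraction(1, 2), 2)       # 1/2 lies in a removed middle third
```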

Measure of the Cantor set
Let µ be Lebesgue measure. The measures of the sets Rk that we remove during the construction of the Cantor set are
µ(R1) = 2/3 − 1/3 = 1/3 (402.1.5)
µ(R2) = (2/9 − 1/9) + (8/9 − 7/9) = 2/9 (402.1.6)
. . . (402.1.7)
µ(Rk) = 2^(k−1)/3^k. (402.1.8)

Note that the R's are disjoint, which will allow us to sum their measures without worry. In the limit k → ∞, this gives us
µ(⋃_{n=1}^∞ Rn) = Σ_{n=1}^∞ 2^(n−1)/3^n = 1. (402.1.9)

But we have µ(C0) = 1 as well, so this means
µ(C) = µ(C0 \ ⋃_{n=1}^∞ Rn) = µ(C0) − Σ_{n=1}^∞ 2^(n−1)/3^n = 1 − 1 = 0. (402.1.10)

Thus we have seen that the measure of C is zero (though see below for more on this topic). How many points are there in C? Lots, as we saw above. So we have a set of measure zero (very tiny) with uncountably many points (very big). This non-intuitive result is what makes Cantor sets so interesting.
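The computation is easy to check numerically: the removed lengths 2^(n−1)/3^n sum to 1, and the measure of the surviving stage Ck, namely (2/3)^k, shrinks to 0. A sketch with exact rationals:

```python
from fractions import Fraction

def removed_measure(k):
    """Total length removed after k steps: sum of 2^(n-1)/3^n for n = 1..k."""
    return sum(Fraction(2 ** (n - 1), 3 ** n) for n in range(1, k + 1))

def stage_measure(k):
    """Measure of C_k: 2^k intervals of length 3^-k each."""
    return Fraction(2 ** k, 3 ** k)

for k in range(1, 20):
    # invariant: length removed so far + length remaining = 1
    assert removed_measure(k) + stage_measure(k) == 1
assert float(stage_measure(40)) < 1e-6   # µ(C_k) -> 0, so µ(C) = 0
```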

Cantor sets with positive measure
Clearly, Cantor sets can be constructed for all sorts of "removals": we can remove middle halves, or thirds, or any middle fraction 1/r, r > 1, we like. All of these Cantor sets have measure zero, since at each step n we end up with
Ln = (1 − 1/r)^n (402.1.11)

of what we started with, and lim_{n→∞} Ln = 0 for any r > 1. With apologies, the figure above is drawn for the case r = 2, rather than the r = 3 which seems to be the publicly favored example. However, it is possible to construct Cantor sets with positive measure as well; the key is to remove less and less as we proceed. These Cantor sets have the same "shape" (topology) as the Cantor set we first constructed, and the same cardinality, but a different "size."
Again, start with the unit interval for C0, and choose a number 0 < p < 1. Let
R1 := ((2 − p)/4, (2 + p)/4), (402.1.12)
which has length (measure) p/2. Again, define C1 := C0 \ R1. Now define
R2 := ((2 − p)/16, (2 + p)/16) ∪ ((14 − p)/16, (14 + p)/16), (402.1.13)
which has measure p/4. Continue as before, such that each Rk has measure p/2^k; note again that all the Rk are disjoint. The resulting Cantor set has measure
µ(C0 \ ⋃_{n=1}^∞ Rn) = 1 − Σ_{n=1}^∞ µ(Rn) = 1 − Σ_{n=1}^∞ p·2^(−n) = 1 − p > 0.
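The positive-measure construction can be simulated directly on intervals. The sketch below (our own code) uses a symmetric variant of the construction in the text: at step k it removes a centered gap from each of the 2^(k−1) surviving intervals so that the total removed at step k is p/2^k; the measure of stage Ck is then exactly 1 − p(1 − 2^(−k)), tending to 1 − p (illustrated with p = 1/2):

```python
from fractions import Fraction

def fat_cantor_stage(p, steps):
    """Stage C_k intervals: at step k remove, from the middle of each of
    the 2^(k-1) intervals, an open gap of length p/2^(2k-1), so the
    total length removed at step k is p/2^k."""
    intervals = [(Fraction(0), Fraction(1))]
    for k in range(1, steps + 1):
        gap = Fraction(p) / 2 ** (2 * k - 1)
        new = []
        for a, b in intervals:
            mid = (a + b) / 2
            new += [(a, mid - gap / 2), (mid + gap / 2, b)]
        intervals = new
    return intervals

def total_length(intervals):
    return sum(b - a for a, b in intervals)

p = Fraction(1, 2)
for k in range(1, 8):
    assert total_length(fat_cantor_stage(p, k)) == 1 - p * (1 - Fraction(1, 2 ** k))
```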

Thus we have a whole family of Cantor sets of positive measure to accompany their vanishing brethren. Version: 19 Owner: drini Author(s): drini, quincynoodles, drummond

402.2 Hausdorff dimension

Let Θ be a bounded subset of R^n, and let N_Θ(ε) be the minimum number of balls of radius ε required to cover Θ. Then define the Hausdorff dimension dH of Θ to be
dH(Θ) := − lim_{ε→0} log N_Θ(ε) / log ε.

Hausdorff dimension is easy to calculate for simple objects like the Sierpinski gasket or a Koch curve. Each of these may be covered with a collection of scaled-down copies of itself. In fact, in the case of the Sierpinski gasket, one can take the individual triangles in each approximation as balls in the covering. At stage n, there are 3^n triangles of radius 1/2^n, and so the Hausdorff dimension of the Sierpinski triangle is −(n log 3)/(n log(1/2)) = log 3/log 2.

From some notes from Koro
This definition can be extended to a general metric space X with distance function d. Define the diameter |C| of a bounded subset C of X to be sup_{x,y∈C} d(x, y), and define a countable r-cover of X to be a collection of subsets Ci of X indexed by some countable set I, each of diameter at most r, such that X = ⋃_{i∈I} Ci. We also define the handy function
H_r^D(X) = inf Σ_{i∈I} |Ci|^D,
where the infimum is over all countable r-covers of X. The Hausdorff dimension of X may then be defined as
dH(X) = inf{D | lim_{r→0} H_r^D(X) = 0}.
When X is a subset of R^n with any restricted norm-induced metric, then this definition reduces to that given above. Version: 8 Owner: drini Author(s): drini, quincynoodles
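For the Cantor set the covering count in the first definition is explicit: stage Ck is covered by N = 2^k intervals of radius ε = 3^(−k), so −log N/log ε evaluates to log 2/log 3 ≈ 0.6309 at every stage. A sketch (our own code):

```python
import math

def dimension_estimate(k):
    """-log N(eps)/log eps for the Cantor set, using the stage-k cover
    by N = 2^k intervals of radius eps = 3^-k."""
    N = 2 ** k
    eps = 3.0 ** (-k)
    return -math.log(N) / math.log(eps)

est = dimension_estimate(30)
assert abs(est - math.log(2) / math.log(3)) < 1e-12
```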

402.3 Koch curve

A Koch curve is a fractal generated by a replacement rule. This rule is, at each step, to replace the middle 1/3 of each line segment with two sides of an equilateral triangle having sides of length equal to the replaced segment. Two applications of this rule on a single line segment give us:

To generate the Koch curve, the rule is applied indefinitely, starting with a line segment. Note that, if the length of the initial line segment is l, the length LK of the Koch curve at the nth step will be
LK = (4/3)^n · l.
This quantity increases without bound; hence the Koch curve has infinite length. However, the curve still bounds a finite area. We can prove this by noting that at each step, we add an amount of area equal to the area of all the equilateral triangles we have just created. We can bound the area of each triangle of side length s by s² (the square containing the triangle). Hence, at step n, the area AK "under" the Koch curve (assuming l = 1) satisfies


Figure 402.3: Sierpinski gasket stage 0, a single triangle
Figure 402.4: Stage 1, three triangles

AK < (1/3)² + 4(1/9)² + 16(1/27)² + · · · = Σ_{i=1}^∞ 4^(i−1) (1/3^i)²,
but this is a geometric series of ratio less than one (namely 4/9), so it converges. Hence a Koch curve has infinite length and bounds a finite area.
A Koch snowflake is the figure generated by applying the Koch replacement rule to an equilateral triangle indefinitely. Version: 3 Owner: akrowne Author(s): akrowne
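Both facts are easy to tabulate. In the sketch below (our own code) we count 4^(i−1) new triangles of side 3^(−i) at step i, since each of the 4^(i−1) segments present before step i spawns one triangle; with the s² bound, the area series then sums to 1/5 while the length diverges:

```python
from fractions import Fraction

def koch_length(n, l=1):
    """Length of the curve after n replacement steps: (4/3)^n * l."""
    return Fraction(4, 3) ** n * l

def area_bound(n):
    """Bound on the area added in the first n steps: 4^(i-1) triangles of
    side 3^-i at step i, each triangle's area bounded by (side)^2."""
    return sum(Fraction(4 ** (i - 1), 9 ** i) for i in range(1, n + 1))

assert koch_length(5) == Fraction(1024, 243)   # length grows without bound
assert area_bound(100) < Fraction(1, 5)        # partial sums stay below 1/5
assert Fraction(1, 5) - area_bound(100) < Fraction(1, 10 ** 30)
```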

402.4 Sierpinski gasket

Let S0 be a triangular area, and define Sn+1 to be obtained from Sn by replacing each triangular area in Sn with three similar and similarly oriented triangular areas, each intersecting each of the other two at exactly one vertex, and each one half the linear scale of the original. The limiting set as n → ∞ (alternatively, the intersection of all these sets) is a Sierpinski gasket, also known as a Sierpinski triangle. Version: 3 Owner: quincynoodles Author(s): quincynoodles

402.5 fractal

Option 1: Some equivalence class of subsets of R^n. A usual equivalence is postulated when some generalised "distance" is zero. For example, let F, G ⊂ R^n, and let d(x, y) be the usual distance (x, y ∈ R^n). Define the distance D between F and G as
D(F, G) := sup_{f∈F} inf_{g∈G} d(f, g) + sup_{g∈G} inf_{f∈F} d(f, g).

Figure 402.5: Stage 2, nine triangles

Figure 402.6: Stage n, 3^n triangles

Then in this case we have, as fractals, that Q and R are equivalent.
Option 2: A subset of R^n with non-integral Hausdorff dimension. Examples: (we think) the coast of Britain, a Koch snowflake.
Option 3: A "self-similar object". That is, one which can be covered by copies of itself using a set of (usually two or more) transformation mappings. Another way to say this would be "an object with a discrete approximate scaling symmetry." Examples: a square region, a Koch curve, a fern frond. This isn't much different from Option 1 because of the collage theorem.
A cursory description of some relationships between options 2 and 3 is given towards the end of the entry on Hausdorff dimension. The use of option 1 is that it permits one to talk about how "close" two fractals are to one another. This becomes quite handy when one wants to talk about approximating fractals, especially approximating option 3 type fractals with pictures that can be drawn in finite time. A simple example: one can talk about how close one of the line drawings in the Koch curve entry is to an actual Koch curve. Version: 7 Owner: quincynoodles Author(s): quincynoodles
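For finite point sets the Option 1 distance is directly computable. A sketch (our own code, using the symmetric sum-of-deviations form given above, in one dimension): a fine grid is "close" to a much finer sample of the same interval, and the distance shrinks with the mesh.

```python
def set_distance(F, G):
    """D(F, G) = sup_{f in F} inf_{g in G} d(f, g) + sup_{g in G} inf_{f in F} d(f, g)
    for finite subsets of R with the Euclidean distance."""
    dev_FG = max(min(abs(f - g) for g in G) for f in F)
    dev_GF = max(min(abs(f - g) for f in F) for g in G)
    return dev_FG + dev_GF

grid = [i / 100 for i in range(101)]              # coarse approximation of [0, 1]
interval_sample = [i / 1000 for i in range(1001)] # finer approximation of [0, 1]
assert set_distance(grid, grid) == 0
assert set_distance(grid, interval_sample) <= 0.01
```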


Chapter 403 28Axx – Classical measure theory
403.1 Vitali’s Theorem

There exists a set V ⊂ [0, 1] which is not Lebesgue measurable. Version: 1 Owner: paolini Author(s): paolini

403.2 proof of Vitali’s Theorem

Consider the equivalence relation in [0, 1) given by
x ∼ y ⟺ x − y ∈ Q,
and let F be the family of all equivalence classes of ∼. Let V be a section of F, i.e. put in V one element from each equivalence class of ∼ (notice that we are using the axiom of choice). Given q ∈ Q ∩ [0, 1), define
Vq = ((V + q) ∩ [0, 1)) ∪ ((V + q − 1) ∩ [0, 1)),
that is, Vq is obtained by translating V by a quantity q to the right and then cutting the piece which goes beyond the point 1 and putting it on the left, starting from 0.
Now notice that given x ∈ [0, 1) there exists y ∈ V such that x ∼ y (because V is a section of ∼) and hence there exists q ∈ Q ∩ [0, 1) such that x ∈ Vq. So
⋃_{q∈Q∩[0,1)} Vq = [0, 1).
Moreover all the Vq are disjoint. In fact if x ∈ Vq ∩ Vp then x − q (modulo [0, 1)) and x − p are both in V, which is not possible since they differ by the rational quantity q − p (or q − p + 1).

Now if V were Lebesgue measurable, clearly the Vq would also be measurable, with µ(Vq) = µ(V). Moreover by the countable additivity of µ we would have
µ([0, 1)) = Σ_{q∈Q∩[0,1)} µ(Vq) = Σ_q µ(V).
So if µ(V) = 0 we would have µ([0, 1)) = 0, and if µ(V) > 0 we would have µ([0, 1)) = +∞. Since µ([0, 1)) = 1, the only possibility is that V is not Lebesgue measurable. Version: 1 Owner: paolini Author(s): paolini


Chapter 404 28B15 – Set functions, measures and integrals with values in ordered spaces
404.1 Lp-space

Definition Let (X, B, µ) be a measure space. The Lp-norm of a function f : X → R is defined as
‖f‖_p := (∫_X |f|^p dµ)^(1/p), (404.1.1)
when the integral exists. The set of functions with finite Lp-norm forms a vector space V with the usual pointwise addition and scalar multiplication of functions. In particular, the set of functions with zero Lp-norm forms a linear subspace of V, which for this article will be called K. We are then interested in the quotient space V/K, which consists of real functions on X with finite Lp-norm, identified up to equivalence almost everywhere. This quotient space is the real Lp-space on X.

Theorem The vector space V/K is complete with respect to the Lp-norm.

The space L∞. The space L∞ is somewhat special, and may be defined without explicit reference to an integral. First, the L∞-norm of f is defined to be the essential supremum of |f|:
‖f‖_∞ := ess sup |f| = inf {a ∈ R : µ({x : |f(x)| > a}) = 0}. (404.1.2)
The definitions of V, K, and L∞ then proceed as above. Functions in L∞ are also called essentially bounded.

Example Let X = [0, 1] and f(x) = 1/√x. Then f ∈ L¹(X) but f ∉ L²(X).
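The example f(x) = 1/√x is easy to check with antiderivatives: ∫_δ^1 x^(−1/2) dx = 2 − 2√δ stays bounded as δ → 0, while ∫_δ^1 (x^(−1/2))² dx = −ln δ blows up. A quick sketch (our own code):

```python
import math

def L1_mass(delta):
    """Integral of x^(-1/2) over [delta, 1], from the antiderivative 2*sqrt(x)."""
    return 2.0 - 2.0 * math.sqrt(delta)

def L2_mass(delta):
    """Integral of (x^(-1/2))^2 = 1/x over [delta, 1]: -ln(delta)."""
    return -math.log(delta)

# The L1 mass stays bounded (tends to 2), so f is in L1([0,1]) ...
assert all(L1_mass(10.0 ** -k) < 2.0 for k in range(1, 16))
# ... while the squared L2 mass grows without bound, so f is not in L2([0,1]).
assert L2_mass(1e-12) > 27.0
```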

Version: 18 Owner: mathcam Author(s): Manoj, quincynoodles, drummond

404.2 locally integrable function

Definition [4, 1, 2] Suppose that U is an open set in R^n, and f : U → C is a Lebesgue integrable function. If the Lebesgue integral
∫_K |f| dx
is finite for all compact subsets K in U, then f is locally integrable. The set of all such functions is denoted by L¹_loc(U).

Example
1. L¹(U) ⊂ L¹_loc(U), where L¹(U) is the set of integrable functions.

Theorem Suppose f and g are locally integrable functions on an open subset U ⊂ R^n, and suppose that
∫_U fφ dx = ∫_U gφ dx
for all smooth functions with compact support φ ∈ C₀^∞(U). Then f = g almost everywhere.

A proof based on the Lebesgue differentiation theorem is given in [4] pp. 15. Another proof is given in [2] pp. 276.
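The test functions φ ∈ C₀^∞(U) in the theorem can be made concrete: the standard bump φ(x) = exp(−1/(1 − x²)) for |x| < 1, extended by 0, is smooth with compact support [−1, 1]. A sketch (our own code):

```python
import math

def bump(x):
    """Standard C-infinity bump: exp(-1/(1 - x^2)) on (-1, 1), else 0."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

# Supported in [-1, 1], maximal at 0, decaying to 0 at the boundary.
assert bump(2.0) == 0.0 and bump(-1.0) == 0.0
assert bump(0.0) == math.exp(-1.0)
assert 0.0 < bump(0.999) < 1e-100
```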

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 3. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.

Version: 3 Owner: matte Author(s): matte


Chapter 405 28C05 – Integration theory via linear functionals (Radon measures, Daniell integrals, etc.), representing set functions and measures
405.1 Haar integral

Let Γ be a locally compact topological group and C be the algebra of all continuous real-valued functions on Γ with compact support. In addition we define C+ to be the set of non-negative functions that belong to C. The Haar integral is a real linear map I of C into the field of real numbers for Γ if it satisfies:
• I is not the zero map
• I only takes non-negative values on C+
• I has the following property: I(γ · f) = I(f) for all elements f of C and all elements γ of Γ.
The Haar integral may be denoted in the following way (there are also other ways): ∫_{γ∈Γ} f(γ) or ∫_Γ f or ∫_Γ f dγ or I(f).
In order for the Haar integral to exist and to be unique, the following conditions are necessary and sufficient: that there exists a real-valued function I+ on C+ satisfying the following conditions:

1. (Linearity). I+(λf + µg) = λI+(f) + µI+(g), where f, g ∈ C+ and λ, µ ∈ R+.
2. (Positivity). If f(γ) ≥ 0 for all γ ∈ Γ then I+(f(γ)) ≥ 0.
3. (Translation-Invariance). I(f(δγ)) = I(f(γ)) for any fixed δ ∈ Γ and every f in C+.
An additional property: if Γ is a compact group, then the Haar integral also has right translation-invariance: ∫_{γ∈Γ} f(γδ) = ∫_{γ∈Γ} f(γ) for any fixed δ ∈ Γ. In addition we can define the normalized Haar integral to be ∫_Γ 1 = 1; since Γ is compact, ∫_Γ 1 is finite. (The proof of existence and uniqueness of the Haar integral is presented in [PV] on page 9.)
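For a finite group the Haar integral is just the normalized average over group elements, and translation invariance becomes a finite computation. A sketch on the cyclic group Z/5 (our own illustration):

```python
n = 5  # the cyclic group Z/5 under addition mod n

def haar(f):
    """Normalized Haar integral on a finite group: the average of f."""
    return sum(f(g) for g in range(n)) / n

f = lambda g: (g * g + 1) % n + 0.5   # an arbitrary function on the group

# Translation invariance: integrating f(delta + gamma) gives the same value.
for delta in range(n):
    shifted = haar(lambda g: f((delta + g) % n))
    assert abs(shifted - haar(f)) < 1e-12

assert haar(lambda g: 1) == 1.0       # normalized: the integral of 1 is 1
```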

( the information of this entry is in part quoted and paraphrased from [GSS])

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988. [HG] Hochschild, G.: The Structure of Lie Groups. Holden-Day, San Francisco, 1965.

Version: 4 Owner: Daume Author(s): Daume


Chapter 406 28C10 – Set functions and measures on topological groups, Haar measures, invariant measures
406.1 Haar measure
406.1.1 Definition of Haar measures

Let G be a locally compact topological group. A left Haar measure on G is a measure µ on the Borel sigma algebra B of G which is:
1. outer regular on all Borel sets B ∈ B
2. inner regular on all open sets U ⊂ G
3. finite on all compact sets K ⊂ G
4. invariant under left translation: µ(gB) = µ(B) for all Borel sets B ∈ B
A right Haar measure on G is defined similarly, except with left translation invariance replaced by right translation invariance (µ(Bg) = µ(B) for all Borel sets B ∈ B). A bi-invariant Haar measure is a Haar measure that is both left invariant and right invariant.


406.1.2 Existence of Haar measures

For any finite group G, the counting measure on G is a bi-invariant Haar measure. More generally, every locally compact topological group G has a left Haar measure µ (see the footnote below), which is unique up to scalar multiples. The Haar measure plays an important role in the development of Fourier analysis and representation theory on locally compact groups such as Lie groups and profinite groups. Version: 1 Owner: djao Author(s): djao

Footnote 1: G also has a right Haar measure, although the right and left Haar measures on G are not necessarily equal unless G is abelian.


Chapter 407 28C20 – Set functions and measures and integrals in infinite-dimensional spaces (Wiener measure, Gaussian measure, etc.)
407.1 essential supremum

Let (X, B, µ) be a measure space and let f : X → R be a function. The essential supremum of f is the smallest number a ∈ R for which f only exceeds a on a set of measure zero. This allows us to generalize the maximum of a function in a useful way. More formally, we define ess sup f as follows. Let a ∈ R, and define
Ma = {x : f(x) > a}, (407.1.1)
the subset of X where f(x) is greater than a. Then let
A0 = {a ∈ R : µ(Ma) = 0}, (407.1.2)
the set of real numbers for which Ma has measure zero. If A0 = ∅, then the essential supremum is defined to be ∞. Otherwise, the essential supremum of f is
ess sup f := inf A0. (407.1.3)
Version: 1 Owner: drummond Author(s): drummond
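On a finite measure space the definition can be evaluated directly. In the sketch below (our own construction), the point where f = 100 carries measure 0, so it is invisible to the essential supremum:

```python
def ess_sup(points, f, mu):
    """inf{a : mu({x : f(x) > a}) = 0} on a finite measure space; the
    infimum is attained at one of the values of f."""
    values = sorted({f(x) for x in points})
    for a in values:
        if sum(mu[x] for x in points if f(x) > a) == 0:
            return a
    return float("inf")

points = [0, 1, 2, 3]
f = {0: 1.0, 1: 7.0, 2: 3.0, 3: 100.0}.__getitem__
mu = {0: 0.25, 1: 0.5, 2: 0.25, 3: 0.0}   # the point with f = 100 is null

assert ess_sup(points, f, mu) == 7.0      # sup f = 100, but ess sup f = 7
```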


Chapter 408 28D05 – Measure-preserving transformations
408.1 measure-preserving

Let (X, B, µ) be a measure space, and T : X → X be a (possibly non-invertible) measurable transformation. We call T measure-preserving if for all A ∈ B, µ(T −1(A)) = µ(A), where T −1 (A) is defined to be the set of points x ∈ X such that T (x) ∈ A. A measure-preserving transformation is also called an endomorphism of the measure space. Version: 5 Owner: mathcam Author(s): mathcam, drummond
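The doubling map T(x) = 2x mod 1 on [0, 1) with Lebesgue measure is the standard example of a non-invertible measure-preserving transformation: the preimage T⁻¹([a, b]) consists of two intervals of half the length each. A sketch (our own code):

```python
def preimage_measure(a, b):
    """Lebesgue measure of T^{-1}([a, b]) for T(x) = 2x mod 1 on [0, 1):
    the preimage is [a/2, b/2] together with [(a+1)/2, (b+1)/2]."""
    return (b / 2 - a / 2) + ((b + 1) / 2 - (a + 1) / 2)

# mu(T^{-1}(A)) = mu(A) for intervals A, as the definition requires.
for a, b in [(0.0, 0.5), (0.25, 0.3), (0.0, 1.0)]:
    assert abs(preimage_measure(a, b) - (b - a)) < 1e-12
```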


Chapter 409 30-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
409.1 domain

A non-empty open set in C is called a domain. The topology considered is the Euclidean one (viewing C as R²). Thus, for a domain D, being connected is equivalent to being path-connected. Since every component of a domain D is a region, every domain has at most countably many components. Version: 4 Owner: drini Author(s): drini

409.2 region

A region is a connected domain. Since every domain of C can be seen as the union of countably many components and each component is a region, we have that regions play a major role in complex analysis. Version: 2 Owner: drini Author(s): drini


409.3 regular region

Let E be an n-dimensional Euclidean space with the topology induced by the Euclidean metric. Then a set in E is a regular region if it can be written as the closure of a non-empty region with a piecewise smooth boundary. Version: 10 Owner: ottocolori Author(s): ottocolori

409.4 topology of the complex plane

The usual topology for the complex plane C is the topology induced by the metric d(x, y) = |x − y| for x, y ∈ C. Here, | · | is the complex modulus. If we identify R² and C, it is clear that the above topology coincides with the topology induced by the Euclidean metric on R². Version: 1 Owner: matte Author(s): matte


Chapter 410 30-XX – Functions of a complex variable
410.1 z0 is a pole of f

Let f be an analytic function on a punctured neighborhood of z0 ∈ C, that is, f analytic on {z ∈ C : 0 < |z − z0| < ε} for some ε > 0, and such that
lim_{z→z0} f(z) = ∞.
We then say that z0 is a pole of f. Version: 2 Owner: drini Author(s): drini, apmxi


Chapter 411 30A99 – Miscellaneous
411.1 Riemann mapping theorem

Let U be a simply connected open proper subset of C, and let a ∈ U. There is a unique analytic function f : U → C such that
1. f(a) = 0, and f′(a) is real and positive;
2. f is injective;
3. f(U) = {z ∈ C : |z| < 1}.
Remark. As a consequence of this theorem, any two simply connected regions, neither of which is the whole plane, are conformally equivalent. Version: 2 Owner: Koro Author(s): Koro

411.2 Runge’s theorem

Let K be a compact subset of C, and let E be a subset of C∞ = C ∪ {∞} (the extended complex plane) which intersects every connected component of C∞ − K. If f is an analytic function in an open set containing K, then, given ε > 0, there is a rational function R(z) whose only poles are in E such that |f(z) − R(z)| < ε for all z ∈ K. Version: 2 Owner: Koro Author(s): Koro


411.3 Weierstrass M-test

Let X be a topological space, {fn}_{n∈N} a sequence of real or complex valued functions on X, and {Mn}_{n∈N} a sequence of non-negative real numbers. Suppose that, for each n ∈ N and x ∈ X, we have |fn(x)| ≤ Mn. Then the series f = Σ_{n=1}^∞ fn converges uniformly if Σ_{n=1}^∞ Mn converges. Version: 8 Owner: vypertd Author(s): vypertd, igor
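A standard application (our own illustration): fn(x) = sin(nx)/n² with Mn = 1/n². Since Σ 1/n² converges, Σ fn converges uniformly on R, and numerically the gap between distant partial sums is bounded by a tail of Σ Mn at every sampled x:

```python
import math

def partial_sum(x, N):
    """S_N(x) = sum of sin(n x)/n^2 for n = 1..N, dominated by M_n = 1/n^2."""
    return sum(math.sin(n * x) / n ** 2 for n in range(1, N + 1))

def tail_bound(N, M=100000):
    """Sum of 1/n^2 for n = N+1..M: a (truncated) bound for the tail of sum M_n."""
    return sum(1.0 / n ** 2 for n in range(N + 1, M + 1))

for x in (0.0, 0.7, 3.1, -2.0):
    diff = abs(partial_sum(x, 2000) - partial_sum(x, 100))
    assert diff <= tail_bound(100) + 1e-12   # uniform in the sampled x
```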

411.4 annulus

Briefly, an annulus is the region bounded between two (usually concentric) circles.
An open annulus, or just annulus for short, is a domain in the complex plane of the form
A = Aw(r, R) = {z ∈ C | r < |z − w| < R},
where w is an arbitrary complex number, and r and R are real numbers with 0 < r < R. Such a set is often called an annular region.
More generally, one can allow r = 0 or R = ∞. (This makes sense for the purposes of the bound on |z − w| above.) This would make an annulus include the cases of a punctured disc and some unbounded domains.
Analogously, one can define a closed annulus to be a set of the form
A = Aw(r, R) = {z ∈ C | r ≤ |z − w| ≤ R},
where w ∈ C, and r and R are real numbers with 0 < r < R.
One can show that two annuli Aw(r, R) and Aw′(r′, R′) are conformally equivalent if and only if R/r = R′/r′. More generally, the complement of any closed disk in an open disk is conformally equivalent to precisely one annulus of the form A0(r, 1). Version: 1 Owner: jay Author(s): jay

411.5 conformally equivalent

A region G is conformally equivalent to a set S if there is an analytic bijective function mapping G to S. Conformal equivalence is an equivalence relation. Version: 1 Owner: Koro Author(s): Koro

411.6 contour integral

Let f be a complex-valued function defined on the image of a curve α : [a, b] → C, and let P = {a0, ..., an} be a partition of [a, b]. If the sum
Σ_{i=1}^n f(zi)(α(ai) − α(ai−1)),
where zi is some point α(ti) such that ai−1 ≤ ti ≤ ai, tends to a unique limit l as n tends to infinity and the greatest of the numbers ai − ai−1 tends to zero, then we say that the contour integral of f along α exists and has value l. The contour integral is denoted by
∫_α f(z) dz.
Note
(i) If Im(α) is a segment of the real axis, then this definition reduces to that of the Riemann integral of f(x) between α(a) and α(b).
(ii) An alternative definition, making use of the Riemann-Stieltjes integral, is based on the fact that its definition can be extended without any other changes in the wording to cover the cases where f and α are complex-valued functions. Now let α be any curve [a, b] → R². Then α can be expressed in terms of its components (α1, α2) and can be associated with the complex-valued function
z(t) = α1(t) + iα2(t).
Given any complex-valued function of a complex variable, f say, defined on Im(α), we define the contour integral of f along α, denoted by ∫_α f(z) dz, by
∫_α f(z) dz = ∫_a^b f(z(t)) dz(t)
whenever the complex Riemann-Stieltjes integral on the right exists.
(iii) Reversing the direction of the curve changes the sign of the integral.

(iv) The contour integral always exists if α is rectifiable and f is continuous.
(v) If α is piecewise smooth and the contour integral of f along α exists, then
∫_α f dz = ∫_a^b f(z(t)) z′(t) dt.
Version: 4 Owner: vypertd Author(s): vypertd
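Note (v) makes the definition computable for smooth curves. A sketch (our own code) for α(t) = e^(it), t ∈ [0, 2π], and f(z) = 1/z, where the contour integral is 2πi:

```python
import cmath, math

def contour_integral(f, z, dz, a, b, n=20000):
    """Approximate the integral of f(z(t)) z'(t) dt over [a, b] by midpoints."""
    h = (b - a) / n
    total = 0j
    for i in range(n):
        t = a + (i + 0.5) * h
        total += f(z(t)) * dz(t)
    return total * h

z = lambda t: cmath.exp(1j * t)        # the unit circle
dz = lambda t: 1j * cmath.exp(1j * t)  # z'(t)
val = contour_integral(lambda w: 1 / w, z, dz, 0.0, 2 * math.pi)
assert abs(val - 2j * math.pi) < 1e-9
```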

411.7 orientation

Let α be a rectifiable Jordan curve in R², let z0 be a point in R² − Im(α), and let α have winding number W[α : z0]. Then W[α : z0] = ±1; all points inside α have the same index, and we define the orientation of a Jordan curve α by saying that α is positively oriented if the index of every point inside α is +1, and negatively oriented if it is −1. Version: 3 Owner: vypertd Author(s): vypertd

411.8 proof of Weierstrass M-test

Consider the sequence of partial sums sn = Σ_{m=1}^n fm. Since the sums are finite, each sn is continuous. Take any p, q ∈ N such that p ≤ q; then, for every x ∈ X, we have
|sq(x) − sp(x)| = |Σ_{m=p+1}^q fm(x)| ≤ Σ_{m=p+1}^q |fm(x)| ≤ Σ_{m=p+1}^q Mm.
But since Σ_{n=1}^∞ Mn converges, for any ε > 0 we can find an N ∈ N such that, for any q ≥ p > N and x ∈ X, we have |sq(x) − sp(x)| ≤ Σ_{m=p+1}^q Mm < ε. Hence the sequence sn converges uniformly to Σ_{n=1}^∞ fn, and the function f = Σ_{n=1}^∞ fn is continuous.

Version: 1 Owner: igor Author(s): igor


411.9 unit disk

The unit disk in the complex plane, denoted ∆, is defined as {z ∈ C : |z| < 1}. The unit circle, denoted ∂∆ or S 1 is the boundary {z ∈ C : |z| = 1} of the unit disk ∆. Every element z ∈ ∂∆ can be written as z = eiθ for some real value of θ. Version: 5 Owner: brianbirgen Author(s): brianbirgen

411.10 upper half plane

The upper half plane in the complex plane, abbreviated UHP, is defined as {z ∈ C : Im(z) > 0}. Version: 4 Owner: brianbirgen Author(s): brianbirgen

411.11 winding number and fundamental group

The winding number is an analytic way to define an explicit isomorphism W [• : z0 ] : π1 (C \ z0 ) → Z from the fundamental group of the punctured (at z0 ) complex plane to the group of integers. Version: 1 Owner: Dr Absentius Author(s): Dr Absentius


Chapter 412 30B10 – Power series (including lacunary series)
412.1 Euler relation

Euler’s relation (also known as Euler’s formula) is considered the first bridge between the fields of algebra and geometry, as it relates the exponential function to the trigonometric sine and cosine functions. The goal is to prove
e^(ix) = cos(x) + i sin(x).
It’s easy to show that
i^(4n) = 1, i^(4n+1) = i, i^(4n+2) = −1, i^(4n+3) = −i.

Now, using the Taylor series expansions of sin x, cos x and e^x, we can show that
e^(ix) = Σ_{n=0}^∞ (i^n x^n)/n!
e^(ix) = Σ_{n=0}^∞ [ x^(4n)/(4n)! + i·x^(4n+1)/(4n+1)! − x^(4n+2)/(4n+2)! − i·x^(4n+3)/(4n+3)! ].
Because the series expansion above is absolutely convergent for all x, we can rearrange the terms of the series as follows:
e^(ix) = Σ_{n=0}^∞ (−1)^n x^(2n)/(2n)! + i Σ_{n=0}^∞ (−1)^n x^(2n+1)/(2n+1)!
e^(ix) = cos(x) + i sin(x).

Version: 8 Owner: drini Author(s): drini, fiziko, igor
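The identity is easy to sanity-check numerically, either with the standard library's complex exponential or by summing the rearranged cosine and sine series directly (our own sketch):

```python
import cmath, math

def euler_partial(x, terms=30):
    """Partial sums of the rearranged series: cos(x) + i sin(x)."""
    c = sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n) for n in range(terms))
    s = sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1) for n in range(terms))
    return complex(c, s)

for x in (0.0, 1.0, math.pi, -2.5):
    assert abs(cmath.exp(1j * x) - euler_partial(x)) < 1e-12
    assert abs(euler_partial(x) - complex(math.cos(x), math.sin(x))) < 1e-12
```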

412.2

analytic

Let U be a domain in the complex numbers (resp., real numbers). A function f : U −→ C (resp., f : U −→ R) is analytic (resp., real analytic) if f has a Taylor series about each point x ∈ U that converges to the function f in an open neighborhood of x.

412.2.1

On Analyticity and Holomorphicity

A complex function is analytic if and only if it is holomorphic. Because of this equivalence, an analytic function in the complex case is often defined to be one that is holomorphic, instead of one having a Taylor series as above. Although the two definitions are equivalent, it is not an easy matter to prove their equivalence, and a reader who does not yet have this result available will have to pay attention as to which definition of analytic is being used. Version: 4 Owner: djao Author(s): djao

412.3

existence of power series

In this entry we shall demonstrate the logical equivalence of the holomorphic and analytic concepts. As is the case with so many basic results in complex analysis, the proof of these facts hinges on the Cauchy integral theorem, and the Cauchy integral formula.

Holomorphic implies analytic.

Theorem 8. Let U ⊂ C be an open domain that contains the origin, and let f : U → C be a function such that the complex derivative

    f′(z) = lim_{ζ→0} [f(z + ζ) − f(z)] / ζ

exists for all z ∈ U. Then there exists a power series representation

    f(z) = Σ_{k=0}^∞ a_k z^k,    |z| < R,    a_k ∈ C,

for a sufficiently small radius of convergence R > 0.

Note: it is just as easy to show the existence of a power series representation around every basepoint z₀ ∈ U; one need only consider the holomorphic function f(z − z₀).

Proof. Choose R > 0 sufficiently small so that the closed disk |z| ≤ R is contained in U. By the Cauchy integral formula we have

    f(z) = (1/2πi) ∮_{|ζ|=R} f(ζ)/(ζ − z) dζ,    |z| < R,

where, as usual, the integration contour is oriented counterclockwise. For every ζ of modulus R, we can expand the integrand as a geometric power series in z, namely

    f(ζ)/(ζ − z) = [f(ζ)/ζ] / (1 − z/ζ) = Σ_{k=0}^∞ [f(ζ)/ζ^{k+1}] z^k,    |z| < R.

The circle of radius R is a compact set; hence f(ζ) is bounded on it; and hence the power series above converges uniformly with respect to ζ. Consequently, the order of the infinite summation and the integration operations can be interchanged. Hence

    f(z) = Σ_{k=0}^∞ a_k z^k,    |z| < R,

where

    a_k = (1/2πi) ∮_{|ζ|=R} f(ζ)/ζ^{k+1} dζ,

as desired. QED

Analytic implies holomorphic.

Theorem 9. Let

    f(z) = Σ_{n=0}^∞ a_n z^n,    a_n ∈ C,    |z| < ε,

be a power series, converging in D = D_ε(0), the open disk of radius ε > 0 about the origin. Then the complex derivative

    f′(z) = lim_{ζ→0} [f(z + ζ) − f(z)] / ζ

exists for all z ∈ D, i.e. the function f : D → C is holomorphic.

Note: this theorem generalizes immediately to shifted power series in z − z₀, z₀ ∈ C.

Proof. For every z₀ ∈ D, the function f(z) can be recast as a power series centered at z₀. Hence, without loss of generality it suffices to prove the theorem for z = 0. The power series

    Σ_{n=0}^∞ a_{n+1} ζ^n,    ζ ∈ D,

converges, and equals (f(ζ) − f(0))/ζ for ζ ≠ 0. Consequently, the complex derivative f′(0) exists; indeed it is equal to a₁. QED

Version: 2 Owner: rmilson Author(s): rmilson
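The coefficient formula above lends itself to a quick numerical check (our own illustration, not part of the original entry): approximating a_k = (1/2πi) ∮_{|ζ|=R} f(ζ)/ζ^{k+1} dζ by an equally spaced Riemann sum over the circle recovers the Taylor coefficients of a sample holomorphic function. The test function and all parameters below are our choices.

```python
import cmath

def taylor_coefficient(f, k, R=0.5, N=2000):
    """Approximate a_k = (1/2πi) ∮_{|ζ|=R} f(ζ)/ζ^{k+1} dζ by a Riemann sum.

    Parametrizing ζ = R e^{it}, dζ = iζ dt, the integral becomes
    (1/2π) ∫_0^{2π} f(ζ) ζ^{-k} dt, which we sum at N equispaced angles.
    """
    total = 0.0 + 0.0j
    for j in range(N):
        t = 2 * cmath.pi * j / N
        zeta = R * cmath.exp(1j * t)
        total += f(zeta) * zeta ** (-k)
    return total / N

# f(z) = 1/(1 - z) is holomorphic on |z| < 1; its Taylor coefficients are a_k = 1.
f = lambda z: 1 / (1 - z)
coeffs = [taylor_coefficient(f, k) for k in range(5)]
```

For an integrand analytic in an annulus around the contour, this equispaced sum converges extremely fast, so even modest N gives near machine-precision coefficients.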

412.4

infinitely-differentiable function that is not analytic

If f ∈ C∞, then we can certainly write a Taylor series for f. However, analyticity requires that this Taylor series actually converge (at least within some radius of convergence) to f. It is not necessary that the power series for f converge to f, as the following example shows.

Let

    f(x) = e^{−1/x²} for x ≠ 0,    f(0) = 0.

Then f ∈ C∞, and for any n ≥ 0, f^{(n)}(0) = 0 (see below). So the Taylor series for f around 0 is identically 0; since f(x) > 0 for all x ≠ 0, clearly it does not converge to f.

Proof that f^{(n)}(0) = 0

Let p(x), q(x) ∈ R[x] be polynomials, and define

    g(x) = [p(x)/q(x)] · f(x).

Then, for x ≠ 0,

    g′(x) = [(p′(x) + p(x)·(2/x³))·q(x) − q′(x)·p(x)] / q(x)² · e^{−1/x²}.

Computing limits (e.g. by applying L'Hôpital's rule), we see that g′(0) = lim_{x→0} g′(x) = 0. Define p₀(x) = q₀(x) = 1. Applying the above inductively, we see that for each n we may write f^{(n)}(x) = [p_n(x)/q_n(x)] · f(x). So f^{(n)}(0) = 0, as required. Version: 2 Owner: ariels Author(s): ariels
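A numerical illustration (ours, not part of the original entry) of how flat this function is at the origin: f(x)/x^n still tends to 0 for every n, which is exactly why all derivatives at 0 vanish while f itself is positive away from 0.

```python
import math

def f(x):
    """The standard smooth-but-not-analytic function: e^{-1/x^2} for x != 0, 0 at 0."""
    return math.exp(-1.0 / x**2) if x != 0 else 0.0

# f vanishes to infinite order at 0: f(x)/x^n is still tiny for every n,
# yet f(x) > 0 for x != 0, so the (zero) Taylor series cannot converge to f.
small = 0.05
ratios = [f(small) / small**n for n in range(1, 8)]
```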

412.5

power series

A power series is a series of the form

    Σ_{k=0}^∞ a_k (x − x₀)^k,

with a_k, x₀ ∈ R or ∈ C. The a_k are called the coefficients and x₀ the center of the power series. Where it converges it defines a function, which can thus be represented by a power series. This is what power series are usually used for.

Every power series is convergent at least at x = x₀, where it converges to a₀. In addition it is absolutely convergent in the region {x : |x − x₀| < r}, with

    r = liminf_{k→∞} 1/|a_k|^{1/k}.

It is divergent for every x with |x − x₀| > r. For |x − x₀| = r no general predictions can be made. If r = ∞, the power series converges absolutely for every real or complex x. The number r is called the radius of convergence of the power series.

Examples of power series are:

• Taylor series, for example:

    e^x = Σ_{k=0}^∞ x^k / k!.

• The geometric series:

    1/(1 − x) = Σ_{k=0}^∞ x^k,    with |x| < 1.

Power series have some important properties:

• If a power series converges for some z₀ ∈ C, then it also converges for all z ∈ C with |z − x₀| < |z₀ − x₀|.

• Also, if a power series diverges for some z₀ ∈ C, then it diverges for all z ∈ C with |z − x₀| > |z₀ − x₀|.

• For |x − x₀| < r, power series can be added by adding coefficients, and multiplied in the obvious way:

    (Σ_{k=0}^∞ a_k (x − x₀)^k) · (Σ_{l=0}^∞ b_l (x − x₀)^l) = a₀b₀ + (a₀b₁ + a₁b₀)(x − x₀) + (a₀b₂ + a₁b₁ + a₂b₀)(x − x₀)² + ⋯

• (Uniqueness) If two power series are equal and their centers are the same, then their coefficients must be equal.

• Power series can be termwise differentiated and integrated. These operations keep the radius of convergence. Version: 13 Owner: mathwizard Author(s): mathwizard, AxelBoldt
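The multiplication rule for coefficients (the Cauchy product) can be sketched in a few lines; the helper name and the example series are our own choices, not part of the original entry.

```python
def cauchy_product(a, b):
    """Multiply two power series given as coefficient lists (lowest degree first):
    c_k = sum_{i+j=k} a_i * b_j, truncated to the reliable common length."""
    n = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]

# Example: 1/(1-x) has coefficients [1, 1, 1, ...]; squaring it gives
# 1/(1-x)^2 = sum (n+1) x^n, i.e. coefficients [1, 2, 3, ...].
geom = [1] * 8
square = cauchy_product(geom, geom)
```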

412.6

proof of radius of convergence

According to Cauchy's root test, a power series is absolutely convergent if

    limsup_{k→∞} |a_k (x − x₀)^k|^{1/k} = |x − x₀| · limsup_{k→∞} |a_k|^{1/k} < 1.

This is obviously true if

    |x − x₀| < 1 / limsup_{k→∞} |a_k|^{1/k} = liminf_{k→∞} 1/|a_k|^{1/k}.

In the same way we see that the series is divergent if

    |x − x₀| > liminf_{k→∞} 1/|a_k|^{1/k},

which means that the right-hand side is the radius of convergence of the power series.

Now from the ratio test we see that the power series is absolutely convergent if

    lim_{k→∞} |a_{k+1} (x − x₀)^{k+1}| / |a_k (x − x₀)^k| = |x − x₀| · lim_{k→∞} |a_{k+1}/a_k| < 1.

Again this is true if

    |x − x₀| < lim_{k→∞} |a_k / a_{k+1}|.

The series is divergent if

    |x − x₀| > lim_{k→∞} |a_k / a_{k+1}|,

as follows from the ratio test in the same way. So we see that in this way too we can calculate the radius of convergence. Version: 1 Owner: mathwizard Author(s): mathwizard
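Both recipes can be checked numerically (our sketch, not part of the original entry) on a concrete coefficient sequence whose radius of convergence is known in closed form.

```python
def radius_root(a, K):
    """Root-test estimate of the radius: r ≈ 1 / |a_K|^(1/K) for large K."""
    return 1.0 / abs(a(K)) ** (1.0 / K)

def radius_ratio(a, K):
    """Ratio-test estimate of the radius: r ≈ |a_K / a_{K+1}| for large K."""
    return abs(a(K) / a(K + 1))

# a_k = k * 2^k: the geometric factor dominates, so the radius is r = 1/2.
a = lambda k: k * 2.0 ** k
r_root = radius_root(a, 200)
r_ratio = radius_ratio(a, 200)
```

The ratio estimate converges faster here because k^{1/k} → 1 only slowly in the root test.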


412.7

radius of convergence
To the power series

    Σ_{k=0}^∞ a_k (x − x₀)^k    (412.7.1)

there exists a number r ∈ [0, ∞], its radius of convergence, such that the series converges absolutely for all (real or complex) numbers x with |x − x₀| < r and diverges whenever |x − x₀| > r. (For |x − x₀| = r no general statements can be made, except that there always exists at least one complex number x with |x − x₀| = r such that the series diverges.) The radius of convergence is given by

    r = liminf_{k→∞} 1/|a_k|^{1/k}    (412.7.2)

and can also be computed as

    r = lim_{k→∞} |a_k / a_{k+1}|    (412.7.3)

if this limit exists. Version: 6 Owner: mathwizard Author(s): mathwizard, AxelBoldt


Chapter 413 30B50 – Dirichlet series and other series expansions, exponential series
413.1 Dirichlet series

Let (λ_n)_{n≥1} be an increasing sequence of positive real numbers tending to ∞. A Dirichlet series with exponents (λ_n) is a series of the form

    Σ_n a_n e^{−λ_n z}

where z and all the a_n are complex numbers.

An ordinary Dirichlet series is one having λ_n = log n for all n. It is written

    Σ_n a_n / n^z.

The best-known examples are the Riemann zeta function (in which a_n is the constant 1) and the more general Dirichlet L-series (in which the mapping n → a_n is multiplicative and periodic). When λ_n = n, the Dirichlet series is just a power series in the variable e^{−z}.

The following are the basic convergence properties of Dirichlet series. There is nothing profound about their proofs, which can be found in [1] and in various other works on complex analysis and analytic number theory.

Let f(z) = Σ_n a_n e^{−λ_n z} be a Dirichlet series.

1. If f converges at z = z₀, then f converges uniformly in the region

    Re(z − z₀) ≥ 0,    −α ≤ arg(z − z₀) ≤ α,

where α is any real number such that 0 < α < π/2. (Such a region is known as a "Stoltz angle".)

2. Therefore, if f converges at z₀, its sum defines a holomorphic function on the region Re(z) > Re(z₀), and moreover f(z) → f(z₀) as z → z₀ within any Stoltz angle.

3. f = 0 identically iff all the a_n are zero.

So, if f converges somewhere but not everywhere in C, then the domain of its convergence is the region Re(z) > ρ for some real number ρ, which is called the abscissa of convergence of the Dirichlet series. The abscissa of convergence of the series Σ_n |a_n| e^{−λ_n z}, if it exists, is called the abscissa of absolute convergence of f.

Now suppose that the coefficients a_n are all real and ≥ 0. If the series f converges for Re(z) > ρ, and the resulting function admits an analytic extension to a neighbourhood of ρ, then the series f converges in a neighbourhood of ρ. Consequently, the domain of convergence of f (unless it is the whole of C) is bounded by a singularity at a point on the real axis.

Finally, return to the general case of any complex coefficients (a_n), but suppose λ_n = log n, so f is an ordinary Dirichlet series Σ_n a_n / n^z.

1. If the sequence (a_n) is bounded, then f converges absolutely in the region Re(z) > 1.

2. If the partial sums Σ_{n=k}^{l} a_n are bounded, then f converges (not necessarily absolutely) in the region Re(z) > 0.

Reference: [1] Serre, J.-P., A Course in Arithmetic, Chapter VI, Springer-Verlag, 1973.

Version: 2 Owner: bbukh Author(s): Larry Hammick
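As a numerical aside (ours, not part of the original entry), the equality of the ordinary Dirichlet series ζ(z) = Σ 1/n^z and its Euler product can be observed at z = 2, where both approach π²/6. The helper functions below are our own.

```python
import math

def zeta_partial(z, N):
    """Partial sum of the ordinary Dirichlet series for zeta."""
    return sum(1.0 / n ** z for n in range(1, N + 1))

def primes_upto(N):
    """Sieve of Eratosthenes."""
    sieve = [True] * (N + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, N + 1, p):
                sieve[m] = False
    return [p for p in range(2, N + 1) if sieve[p]]

def euler_product(z, N):
    """Partial Euler product over the primes up to N."""
    prod = 1.0
    for p in primes_upto(N):
        prod *= 1.0 / (1.0 - p ** (-z))
    return prod

s = zeta_partial(2, 100000)
e = euler_product(2, 1000)
target = math.pi ** 2 / 6  # zeta(2)
```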


Chapter 414 30C15 – Zeros of polynomials, rational functions, and other analytic functions (e.g. zeros of functions with bounded Dirichlet integral)
414.1 Mason-Stothers theorem

Mason's theorem is often described as the polynomial case of the (currently unproven) ABC conjecture.

Theorem 1 (Mason-Stothers). Let f(z), g(z), h(z) ∈ C[z] be such that f(z) + g(z) = h(z) for all z, and such that f, g, and h are pairwise relatively prime. Denote the number of distinct roots of the product f(z)g(z)h(z) by N. Then

    max{deg f, deg g, deg h} + 1 ≤ N.

Version: 1 Owner: mathcam Author(s): mathcam
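The inequality can be verified on a concrete triple (our illustration, not part of the original entry; the polynomial helpers below are ad hoc, using exact rational arithmetic). For f = z², g = 1 − z², h = 1, the product fgh has the three distinct roots 0, 1, −1, and max deg + 1 = 3 ≤ N = 3.

```python
from fractions import Fraction

# Polynomials as coefficient lists over Q, lowest degree first.
def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def deriv(p):
    return trim([Fraction(i) * c for i, c in enumerate(p)][1:]) or [Fraction(0)]

def rem(p, q):
    # remainder of polynomial division p mod q
    p, q = p[:], trim(q)
    while len(trim(p)) >= len(q) and trim(p) != [Fraction(0)]:
        p = trim(p)
        shift = len(p) - len(q)
        factor = p[-1] / q[-1]
        for i, c in enumerate(q):
            p[shift + i] -= factor * c
    return trim(p)

def gcd_poly(p, q):
    p, q = trim(p), trim(q)
    while q != [Fraction(0)]:
        p, q = q, rem(p, q)
    return [c / p[-1] for c in p]  # monic

def deg(p):
    p = trim(p)
    return len(p) - 1 if p != [Fraction(0)] else -1

def num_distinct_roots(p):
    # number of distinct roots = deg p - deg gcd(p, p')
    return deg(p) - deg(gcd_poly(p, deriv(p)))

F = Fraction
f = [F(0), F(0), F(1)]    # z^2
g = [F(1), F(0), F(-1)]   # 1 - z^2
h = add(f, g)             # f + g = 1
N = num_distinct_roots(mul(mul(f, g), h))  # roots of f*g*h: {0, 1, -1}
bound_ok = max(deg(f), deg(g), deg(h)) + 1 <= N
```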

414.2

zeroes of analytic functions are isolated

The zeroes of a non-constant analytic function on C are isolated. Let f be an analytic function defined in some domain D ⊂ C and let f(z₀) = 0 for some z₀ ∈ D. Because f is analytic, there is a Taylor series expansion for f around z₀ which converges on an open disk |z − z₀| < R. Write it as

    f(z) = Σ_{n=k}^∞ a_n (z − z₀)^n,

with a_k ≠ 0 and k > 0 (a_k is the first non-zero term). One can factor the series so that

    f(z) = (z − z₀)^k Σ_{n=0}^∞ a_{n+k} (z − z₀)^n

and define g(z) = Σ_{n=0}^∞ a_{n+k} (z − z₀)^n, so that f(z) = (z − z₀)^k g(z). Observe that g(z) is analytic on |z − z₀| < R.

To show that z₀ is an isolated zero of f, we must find ε > 0 so that f is non-zero on 0 < |z − z₀| < ε. It is enough to find ε > 0 so that g is non-zero on |z − z₀| < ε, by the relation f(z) = (z − z₀)^k g(z). Because g(z) is analytic, it is continuous at z₀. Notice that g(z₀) = a_k ≠ 0, so there exists an ε > 0 so that for all z with |z − z₀| < ε it follows that |g(z) − a_k| < |a_k|/2. This implies that g(z) is non-zero on this set. Version: 5 Owner: brianbirgen Author(s): brianbirgen


Chapter 415 30C20 – Conformal mappings of special domains
415.1 automorphisms of unit disk

All automorphisms of the complex unit disk ∆ = {z ∈ C : |z| < 1} onto itself can be written in the form

    f_a(z) = e^{iθ} (z − a)/(1 − āz),

where a ∈ ∆ and e^{iθ} ∈ S¹. This map sends a to 0, 1/ā to ∞, and the unit circle to the unit circle. Version: 3 Owner: brianbirgen Author(s): brianbirgen
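A quick numerical check (ours, not part of the original entry): the map below uses the conjugate ā in the denominator, a bar that is easily lost in plain text. It sends a to 0 and keeps the unit circle on the unit circle.

```python
import cmath

def disk_automorphism(a, theta):
    """f_a(z) = e^{iθ} (z - a) / (1 - conj(a) z), an automorphism of the unit disk."""
    return lambda z: cmath.exp(1j * theta) * (z - a) / (1 - a.conjugate() * z)

a = 0.3 + 0.4j          # |a| = 0.5 < 1, so a is inside the disk
f = disk_automorphism(a, 1.2)

# f sends a to 0 ...
image_of_a = f(a)
# ... and points of the unit circle stay on the unit circle.
boundary_moduli = [abs(f(cmath.exp(1j * (0.1 + 0.5 * k)))) for k in range(10)]
```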

415.2

unit disk upper half plane conformal equivalence theorem

Theorem: There is a conformal map from ∆, the unit disk, to UHP, the upper half plane.

Proof: Define f : C → C by f(z) = (z − i)/(z + i). Notice that f⁻¹(w) = i(1 + w)/(1 − w) and that f (and therefore f⁻¹) is a Möbius transformation.

Notice that f(0) = −1, f(1) = (1 − i)/(1 + i) = −i and f(−1) = (−1 − i)/(−1 + i) = i. By the Möbius circle transformation theorem, f takes the real axis to the unit circle. Since f(i) = 0, f maps UHP to ∆ and f⁻¹ : ∆ → UHP.

Version: 3 Owner: brianbirgen Author(s): brianbirgen
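The special values and the inverse computed in the proof can be confirmed numerically (our sketch, not part of the original entry; the sample points are arbitrary).

```python
# The Cayley map f(z) = (z - i)/(z + i) from the upper half plane to the
# unit disk, and its inverse w -> i(1 + w)/(1 - w).
def f(z):
    return (z - 1j) / (z + 1j)

def f_inv(w):
    return 1j * (1 + w) / (1 - w)

# Points in the upper half plane land inside the unit disk, and f_inv undoes f.
samples = [0.5 + 0.5j, -2 + 0.1j, 3 + 4j]
inside = [abs(f(z)) < 1 for z in samples]
roundtrip = [abs(f_inv(f(z)) - z) for z in samples]
```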


Chapter 416 30C35 – General theory of conformal mappings
416.1 proof of conformal mapping theorem

Let D ⊂ C be a domain, and let f : D → C be an analytic function. By identifying the complex plane C with R², we can view f as a function f̃ from R² to itself:

    f̃(x, y) := (Re f(x + iy), Im f(x + iy)) = (u(x, y), v(x, y))

with u and v real functions. The Jacobian matrix of f̃ is

    J(x, y) = ∂(u, v)/∂(x, y) = ( u_x  u_y ; v_x  v_y ).

As an analytic function, f satisfies the Cauchy-Riemann equations, so that u_x = v_y and u_y = −v_x. At a fixed point z = x + iy ∈ D, we can therefore define a = u_x(x, y) = v_y(x, y) and b = u_y(x, y) = −v_x(x, y). We write (a, b) in polar coordinates as (r cos θ, r sin θ) and get

    J(x, y) = ( a  b ; −b  a ) = r ( cos θ  sin θ ; −sin θ  cos θ ).

Now we consider two smooth curves through (x, y), which we parametrize by γ₁(t) = (u₁(t), v₁(t)) and γ₂(t) = (u₂(t), v₂(t)). We can choose the parametrizations such that γ₁(0) = γ₂(0) = z. The images of these curves under f̃ are f̃ ∘ γ₁ and f̃ ∘ γ₂, respectively, and their derivatives at t = 0 are

    (f̃ ∘ γ₁)′(0) = [∂(u, v)/∂(x, y)](γ₁(0)) · (dγ₁/dt)(0) = J(x, y) (du₁/dt, dv₁/dt)ᵀ

and, similarly,

    (f̃ ∘ γ₂)′(0) = J(x, y) (du₂/dt, dv₂/dt)ᵀ

by the chain rule. We see that if f′(z) ≠ 0, then f transforms the tangent vectors to γ₁ and γ₂ at t = 0 (and therefore at z) by the orthogonal matrix

    J/r = ( cos θ  sin θ ; −sin θ  cos θ )

and scales them by a factor of r. In particular, the transformation by an orthogonal matrix implies that the angle between the tangent vectors is preserved. Since the determinant of J/r is 1, the transformation also preserves orientation (the direction of the angle between the tangent vectors). We conclude that f is a conformal mapping. Version: 3 Owner: pbruin Author(s): pbruin


Chapter 417 30C80 – Maximum principle; Schwarz's lemma, Lindelöf principle, analogues and generalizations; subordination
417.1 Schwarz lemma

Let ∆ = {z : |z| < 1} be the open unit disk in the complex plane C. Let f : ∆ → ∆ be a holomorphic function with f(0) = 0. Then |f(z)| ≤ |z| for all z ∈ ∆, and |f′(0)| ≤ 1. If equality |f(z)| = |z| holds for some z ≠ 0, or if |f′(0)| = 1, then f is a rotation: f(z) = az with |a| = 1. This lemma is less celebrated than the bigger guns (such as the Riemann mapping theorem, which it helps prove); however, it is one of the simplest results capturing the "rigidity" of holomorphic functions. No similar result exists for real functions, of course. Version: 2 Owner: ariels Author(s): ariels

417.2

maximum principle

Maximum principle. Let f : U → R (where U ⊆ R^d) be a harmonic function. Then f attains its extremal values on any compact K ⊆ U on the boundary ∂K of K. If f attains an extremal value anywhere inside int K, then it is constant.

Maximum modulus principle. Let f : U → C (where U ⊆ C) be a holomorphic function. Then |f| attains its maximal value on any compact K ⊆ U on the boundary ∂K of K. If |f| attains its maximal value anywhere inside int K, then it is constant.

Version: 1 Owner: ariels Author(s): ariels

417.3

proof of Schwarz lemma

Define g(z) = f(z)/z. Then g : ∆ → C is a holomorphic function. The Schwarz lemma is just an application of the maximum modulus principle to g.

For any 1 > ε > 0, by the maximum modulus principle |g| must attain its maximum on the closed disk {z : |z| ≤ 1 − ε} at its boundary {z : |z| = 1 − ε}, say at some point z_ε. But then

    |g(z)| ≤ |g(z_ε)| ≤ 1/(1 − ε)

for any |z| ≤ 1 − ε. Taking the infimum as ε → 0, we see that the values of g are bounded: |g(z)| ≤ 1. Thus |f(z)| ≤ |z|. Additionally, f′(0) = g(0), so we see that |f′(0)| = |g(0)| ≤ 1. This is the first part of the lemma.

Now suppose, as per the premise of the second part of the lemma, that |g(w)| = 1 for some w ∈ ∆. For any r > |w|, it must be that |g| attains its maximal modulus (1) inside the disk {z : |z| ≤ r}, and it follows that g must be constant inside the entire open disk ∆. So g(z) ≡ a for a = g(w) of modulus 1, and f(z) = az, as required. Version: 2 Owner: ariels Author(s): ariels


Chapter 418 30D20 – Entire functions, general theory
418.1 Liouville’s theorem

A bounded entire function is constant. That is, a bounded complex function f : C → C which is holomorphic on the entire complex plane is always a constant function. More generally, any holomorphic function f : C → C which satisfies a polynomial bound condition of the form

    |f(z)| < c · |z|^n

for some c ∈ R, n ∈ Z, and all z ∈ C with |z| sufficiently large, is necessarily equal to a polynomial function.

Liouville’s theorem is a vivid example of how stringent the holomorphicity condition on a complex function really is. One has only to compare the theorem to the corresponding statement for real functions (namely, that a bounded differentiable real function is constant, a patently false statement) to see how much stronger the complex differentiability condition is compared to real differentiability. Applications of Liouville’s theorem include proofs of the fundamental theorem of algebra and of the partial fraction decomposition theorem for rational functions. Version: 4 Owner: djao Author(s): djao

418.2

Morera’s theorem

Morera's theorem provides the converse of Cauchy's integral theorem.

Theorem [1] Suppose G is a region in C, and f : G → C is a continuous function. If for every closed triangle ∆ in G we have

    ∫_{∂∆} f dz = 0,

then f is analytic on G. (Here, ∂∆ is the piecewise linear boundary of ∆.) In particular, if for every rectifiable closed curve Γ in G we have ∫_Γ f dz = 0, then f is analytic on G. Proofs of this can be found in [2, 3].

REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. E. Kreyszig, Advanced Engineering Mathematics, 7th ed., John Wiley & Sons, 1993.
3. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.

Version: 7 Owner: matte Author(s): matte, drini, nerdy2

418.3

entire

A function f : C −→ C is entire if it is holomorphic. Version: 2 Owner: djao Author(s): djao

418.4

holomorphic

Let U ⊂ C be a domain in the complex numbers. A function f : U → C is holomorphic if f has a complex derivative at every point of U, i.e. if

    lim_{z→z₀} [f(z) − f(z₀)] / (z − z₀)

exists for all z₀ ∈ U. Version: 5 Owner: djao Author(s): djao, rmilson

418.5

proof of Liouville’s theorem

Let f : C → C be a bounded, entire function. Then by Taylor's theorem,

    f(z) = Σ_{n=0}^∞ c_n z^n,    where    c_n = (1/2πi) ∮_{Γ_r} f(w)/w^{n+1} dw

and Γ_r is the circle of radius r about 0, for r > 0. Then c_n can be estimated as

    |c_n| ≤ (1/2π) · length(Γ_r) · sup{ |f(w)/w^{n+1}| : w ∈ Γ_r } = (1/2π) · 2πr · (M_r / r^{n+1}) = M_r / r^n,

where M_r = sup{ |f(w)| : w ∈ Γ_r }.

But f is bounded, so there is M such that M_r ≤ M for all r. Then |c_n| ≤ M/r^n for all n and all r > 0. But since r is arbitrary, this gives c_n = 0 whenever n > 0. So f(z) = c₀ for all z, so f is constant.

Version: 2 Owner: Evandar Author(s): Evandar


Chapter 419 30D30 – Meromorphic functions, general theory
419.1 Casorati-Weierstrass theorem

Let U ⊂ C be a domain, a ∈ U, and let f : U \ {a} → C be holomorphic. Then a is an essential singularity of f if and only if the image of any punctured neighborhood of a under f is dense in C. Version: 2 Owner: pbruin Author(s): pbruin
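A concrete illustration (ours, not part of the original entry): g(z) = exp(1/z) has an essential singularity at 0, and for any nonzero target w the solutions z_k = 1/(Log w + 2πik) of exp(1/z) = w accumulate at 0, exhibiting the density asserted by the theorem.

```python
import cmath

target = 2.0 + 3.0j  # an arbitrary nonzero target value

def solutions_near_zero(w, ks):
    """Points z_k with exp(1/z_k) = w; they tend to 0 as k grows."""
    return [1 / (cmath.log(w) + 2j * cmath.pi * k) for k in ks]

zs = solutions_near_zero(target, range(1, 6))
values = [cmath.exp(1 / z) for z in zs]   # all (numerically) equal to target
radii = [abs(z) for z in zs]              # shrinking toward 0
```

So every punctured disk around 0, however small, contains points whose image is (essentially) the prescribed value.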

419.2

Mittag-Leffler’s theorem

Let G be an open subset of C, and let {a_k} be a sequence of distinct points in G which has no limit point in G. For each k, let A_{1k}, …, A_{m_k k} be arbitrary complex coefficients, and define

    S_k(z) = Σ_{j=1}^{m_k} A_{jk} / (z − a_k)^j.

Then there exists a meromorphic function f on G whose poles are exactly the points {a_k} and such that the singular part of f at a_k is S_k(z), for each k. Version: 1 Owner: Koro Author(s): Koro


419.3

Riemann’s removable singularity theorem

Let U ⊂ C be a domain, a ∈ U, and let f : U \ {a} → C be holomorphic. Then a is a removable singularity of f if and only if

    lim_{z→a} (z − a) f(z) = 0.

In particular, a is a removable singularity of f if f is bounded near a, i.e. if there is a punctured neighborhood V of a and a real number M > 0 such that |f(z)| < M for all z ∈ V. Version: 1 Owner: pbruin Author(s): pbruin

419.4

essential singularity

Let U ⊂ C be a domain, a ∈ U, and let f : U \{a} → C be holomorphic. If the Laurent series expansion of f (z) around a contains infinitely many terms with negative powers of z −a, then a is said to be an essential singularity of f . Any singularity of f is a removable singularity, a pole or an essential singularity. If a is an essential singularity of f , then the image of any punctured neighborhood of a under f is dense in C (the Casorati-Weierstrass theorem). In fact, an even stronger statement is true: according to Picard’s theorem, the image of any punctured neighborhood of a is C, with the possible exception of a single point. Version: 4 Owner: pbruin Author(s): pbruin

419.5

meromorphic

Let U ⊂ C be a domain. A function f : U −→ C is meromorphic if f is holomorphic except at an isolated set of poles. It can be proven that if f is meromorphic then its set of poles does not have an accumulation point. Version: 2 Owner: djao Author(s): djao

419.6

pole

Let U ⊂ C be a domain and let a ∈ C. A function f : U → C has a pole at a if it can be represented by a Laurent series centered about a with only finitely many negative terms; that is,

    f(z) = Σ_{k=−n}^∞ c_k (z − a)^k

in some nonempty deleted neighborhood of a, for some n ∈ N. Version: 2 Owner: djao Author(s): djao

419.7

proof of Casorati-Weierstrass theorem

Assume that a is an essential singularity of f. Let V ⊂ U be a punctured neighborhood of a, and let λ ∈ C. We have to show that λ is a limit point of f(V). Suppose it is not; then there is an ε > 0 such that |f(z) − λ| > ε for all z ∈ V, and the function

    g : V → C,    z ↦ 1/(f(z) − λ)

is bounded, since |g(z)| = 1/|f(z) − λ| < ε⁻¹ for all z ∈ V. According to Riemann's removable singularity theorem, this implies that a is a removable singularity of g, so that g can be extended to a holomorphic function ḡ : V ∪ {a} → C. Now

    f(z) = 1/ḡ(z) + λ

for z ≠ a, and a is either a removable singularity of f (if ḡ(a) ≠ 0) or a pole of order n (if ḡ has a zero of order n at a). This contradicts our assumption that a is an essential singularity, which means that λ must be a limit point of f(V). The argument holds for all λ ∈ C, so f(V) is dense in C for any punctured neighborhood V of a.

To prove the converse, assume that f(V) is dense in C for any punctured neighborhood V of a. If a is a removable singularity, then f is bounded near a, and if a is a pole, f(z) → ∞ as z → a. Either of these possibilities contradicts the assumption that the image of any punctured neighborhood of a under f is dense in C, so a must be an essential singularity of f. Version: 1 Owner: pbruin Author(s): pbruin

419.8

proof of Riemann’s removable singularity theorem

Suppose that f is holomorphic on U \ {a} and lim_{z→a} (z − a) f(z) = 0. Let

    f(z) = Σ_{k=−∞}^∞ c_k (z − a)^k

be the Laurent series of f centered at a. We will show that c_k = 0 for k < 0, so that f can be holomorphically extended to all of U by defining f(a) = c₀.

For n ∈ N₀, the residue of (z − a)^n f(z) at a is

    Res((z − a)^n f(z), a) = (1/2πi) lim_{δ→0⁺} ∮_{|z−a|=δ} (z − a)^n f(z) dz.

This is equal to zero, because

    |∮_{|z−a|=δ} (z − a)^n f(z) dz| ≤ 2πδ · max_{|z−a|=δ} |(z − a)^n f(z)| = 2πδ^n · max_{|z−a|=δ} |(z − a) f(z)|,

which, by our assumption, goes to zero as δ → 0. Since the residue of (z − a)^n f(z) at a is also equal to c_{−n−1}, the coefficients of all negative powers of z − a in the Laurent series vanish.

Conversely, if a is a removable singularity of f, then f can be expanded in a power series centered at a, so that

    lim_{z→a} (z − a) f(z) = 0,

because the constant term in the power series of (z − a) f(z) is zero.

A corollary of this theorem is the following: if f is bounded near a, then |(z − a) f(z)| ≤ |z − a| · M for some M > 0. This implies that (z − a) f(z) → 0 as z → a, so a is a removable singularity of f. Version: 1 Owner: pbruin Author(s): pbruin

419.9

residue

Let U ⊂ C be a domain and let f : U → C be a function represented by a Laurent series

    f(z) := Σ_{k=−∞}^∞ c_k (z − a)^k

centered about a. The coefficient c₋₁ of the above Laurent series is called the residue of f at a, and denoted Res(f; a). Version: 2 Owner: djao Author(s): djao
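The residue can also be extracted numerically (our sketch, not part of the original entry) via the contour integral Res(f; a) = (1/2πi) ∮ f(z) dz over a small circle around a; the sample function and parameters are our own choices.

```python
import cmath

def residue_numeric(f, a, radius=0.5, n=4000):
    """Approximate Res(f; a) = (1/2πi) ∮ f(z) dz over the circle z = a + radius·e^{it}."""
    total = 0j
    for j in range(n):
        t = 2 * cmath.pi * j / n
        z = a + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2 * cmath.pi / n)
        total += f(z) * dz
    return total / (2j * cmath.pi)

# f(z) = exp(z)/(z - 1) has a simple pole at 1 with residue e.
f = lambda z: cmath.exp(z) / (z - 1)
res = residue_numeric(f, 1.0)
```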


419.10

simple pole

A simple pole is a pole of order 1. That is, a meromorphic function f has a simple pole at x₀ ∈ C if

    f(z) = a/(z − x₀) + g(z)

where a ≠ 0, a ∈ C, and g is holomorphic at x₀. Version: 3 Owner: bwebste Author(s): bwebste


Chapter 420 30E20 – Integration, integrals of Cauchy type, integral representations of analytic functions
420.1 Cauchy integral formula

The formulas. Let D = {z ∈ C : |z − z₀| < R} be an open disk in the complex plane, and let f(z) be a holomorphic¹ function defined on some open domain that contains D and its boundary. Then, for every z ∈ D we have

    f(z) = (1/2πi) ∮_C f(ζ)/(ζ − z) dζ
    f′(z) = (1/2πi) ∮_C f(ζ)/(ζ − z)² dζ
    ⋯
    f^{(n)}(z) = (n!/2πi) ∮_C f(ζ)/(ζ − z)^{n+1} dζ.

Here C = ∂D is the corresponding circular boundary contour, oriented counterclockwise, with the most obvious parameterization given by ζ = z₀ + Re^{it}, 0 ≤ t ≤ 2π.

¹ It is necessary to draw a distinction between holomorphic functions (those having a complex derivative) and analytic functions (those representable by power series). The two concepts are, in fact, equivalent, but the standard proof of this fact uses the Cauchy integral formula with the (apparently) weaker holomorphicity hypothesis.

Discussion. The first of the above formulas underscores the "rigidity" of holomorphic functions. Indeed, the values of the holomorphic function inside a disk D are completely specified by its values on the boundary of the disk. The second formula is useful, because it gives the derivative in terms of an integral, rather than as the outcome of a limit process.

Generalization. The following technical generalization of the formula is needed for the treatment of removable singularities. Let S be a finite subset of D, and suppose that f(z) is holomorphic for all z ∉ S, but also that f(z) is bounded near all z ∈ S. Then, the above formulas are valid for all z ∈ D \ S.

Using the Cauchy residue theorem, one can further generalize the integral formula to the situation where D is any domain and C is any closed rectifiable curve in D; in this case, the formula becomes

    η(C, z) f(z) = (1/2πi) ∮_C f(ζ)/(ζ − z) dζ,

where η(C, z) denotes the winding number of C. It is valid for all points z ∈ D \ S which are not on the curve C. Version: 19 Owner: djao Author(s): djao, rmilson

420.2

Cauchy integral theorem

Theorem 10. Let U ⊂ C be an open, simply connected domain, and let f : U → C be a function whose complex derivative, that is

    lim_{w→z} [f(w) − f(z)] / (w − z),

exists for all z ∈ U. Then, the integral around every closed contour γ ⊂ U vanishes; in symbols

    ∮_γ f(z) dz = 0.

We also have the following, technically important generalization involving removable singularities.

Theorem 11. Let U ⊂ C be an open, simply connected domain, and S ⊂ U a finite subset. Let f : U \ S → C be a function whose complex derivative exists for all z ∈ U \ S, and that is bounded near all z ∈ S. Then, the integral around every closed contour γ ⊂ U \ S that avoids the exceptional points vanishes.

Cauchy's theorem is an essential stepping stone in the theory of complex analysis. It is required for the proof of the Cauchy integral formula, which in turn is required for the proof that the existence of a complex derivative implies a power series representation.

The original version of the theorem, as stated by Cauchy in the early 1800s, requires that the derivative f′(z) exist and be continuous. The existence of f′(z) implies the Cauchy-Riemann equations, which in turn can be restated as the fact that the complex-valued differential f(z) dz is closed. The original proof makes use of this fact, and calls on Green's theorem to conclude that the contour integral vanishes. The proof of Green's theorem, however, involves an interchange of order in a double integral, and this can only be justified if the integrand, which involves the real and imaginary parts of f(z), is assumed to be continuous. To this date, many authors prove the theorem this way, but erroneously fail to mention the continuity assumption.

In the latter part of the 19th century E. Goursat found a proof of the integral theorem that merely requires that f′(z) exist. Continuity of the derivative, as well as the existence of all higher derivatives, then follows as a consequence of the Cauchy integral formula. Not only is Goursat's version a sharper result, but it is also more elementary and self-contained, in the sense that it does not require Green's theorem. Goursat's argument makes use of rectangular contours (many authors use triangles though), but the extension to an arbitrary simply connected domain is relatively straightforward.

Theorem 12 (Goursat). Let U be an open domain containing a rectangle

    R = {x + iy ∈ C : a ≤ x ≤ b, c ≤ y ≤ d}.

If the complex derivative of a function f : U → C exists at all points of U, then the contour integral of f around the boundary of R vanishes; in symbols

    ∮_{∂R} f(z) dz = 0.

Bibliography.
• L. Ahlfors, "Complex Analysis".

Version: 7 Owner: rmilson Author(s): rmilson

420.3

Cauchy residue theorem

Let U ⊂ C be a simply connected domain, and suppose f is a complex-valued function which is defined and analytic on all but finitely many points a₁, …, a_m of U. Let C be a closed curve in U which does not intersect any of the a_i. Then

    ∮_C f(z) dz = 2πi Σ_{i=1}^m η(C, a_i) Res(f; a_i),

where

    η(C, a_i) := (1/2πi) ∮_C dz/(z − a_i)

is the winding number of C about a_i, and Res(f; a_i) denotes the residue of f at a_i.

Version: 4 Owner: djao Author(s): djao, rmilson

420.4

Gauss’ mean value theorem

Let Ω be a domain in C and suppose f is an analytic function on Ω. Furthermore, let C be a circle inside Ω with center z₀ and radius r. Then f(z₀) is the mean value of f along C, that is,

    f(z₀) = (1/2π) ∫₀^{2π} f(z₀ + re^{iθ}) dθ.

Version: 7 Owner: Johan Author(s): Johan
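A numerical check of the mean value property (ours, not part of the original entry); the sample function, center, and radius are arbitrary choices.

```python
import cmath

def circle_mean(f, z0, r, n=10000):
    """Approximate (1/2π) ∫_0^{2π} f(z0 + r e^{iθ}) dθ by a Riemann sum."""
    return sum(f(z0 + r * cmath.exp(2j * cmath.pi * k / n))
               for k in range(n)) / n

# For the entire function f(z) = z^2 + exp(z), the mean over any circle
# centered at z0 equals f(z0).
f = lambda z: z * z + cmath.exp(z)
z0 = 0.5 + 0.25j
mean = circle_mean(f, z0, 2.0)
```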

420.5

Möbius circle transformation theorem

Möbius transformations always transform circles into circles (where straight lines are counted as circles through ∞). Version: 1 Owner: Johan Author(s): Johan

420.6

Möbius transformation cross-ratio preservation theorem

A Möbius transformation f : z ↦ w preserves the cross-ratios, i.e.

    [(w₁ − w₂)(w₃ − w₄)] / [(w₁ − w₄)(w₃ − w₂)] = [(z₁ − z₂)(z₃ − z₄)] / [(z₁ − z₄)(z₃ − z₂)].

Version: 3 Owner: Johan Author(s): Johan

420.7

Rouché's theorem

Let f, g be analytic on and inside a simple closed curve C. Suppose |f (z)| > |g(z)| on C. Then f and f + g have the same number of zeros inside C. Version: 2 Owner: Johan Author(s): Johan


420.8

absolute convergence implies convergence for an infinite product

If an infinite product is absolutely convergent then it is convergent. Version: 2 Owner: Johan Author(s): Johan

420.9

absolute convergence of infinite product
An infinite product ∏_{n=1}^∞ (1 + a_n) is said to be absolutely convergent if ∏_{n=1}^∞ (1 + |a_n|) converges.

Version: 4 Owner: mathcam Author(s): mathcam, Johan

420.10

closed curve theorem

Let U ⊂ C be a simply connected domain, and suppose f : U → C is holomorphic. Then

    ∮_C f(z) dz = 0

for any smooth closed curve C in U. More generally, if U is any domain, and C₁ and C₂ are two homotopic smooth closed curves in U, then

    ∮_{C₁} f(z) dz = ∮_{C₂} f(z) dz

for any holomorphic function f : U → C. Version: 3 Owner: djao Author(s): djao

420.11

conformal Möbius circle map theorem

Any conformal map that maps the interior of the unit disc onto itself is a Möbius transformation. Version: 4 Owner: Johan Author(s): Johan


420.12

conformal mapping

A mapping f : C → C which preserves the size and orientation of the angles (at z₀) between any two curves which intersect in a given point z₀ is said to be conformal at z₀. A mapping that is conformal at every point of a domain D is said to be conformal in D. Version: 4 Owner: Johan Author(s): Johan

420.13

conformal mapping theorem

Let f(z) be analytic in a domain D. Then it is conformal at any point z ∈ D where f′(z) ≠ 0. Version: 2 Owner: Johan Author(s): Johan
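A numerical illustration (ours, not part of the original entry): where f′(z₀) ≠ 0, the angle between two directions through z₀ matches the angle between the images of the corresponding rays. The function, point, and directions below are arbitrary.

```python
import cmath

def direction_of_image(f, z0, d, h=1e-6):
    """Approximate the unit direction in which f maps the ray z0 + t*d."""
    w = (f(z0 + h * d) - f(z0)) / h
    return w / abs(w)

f = lambda z: z ** 3 + 2 * z   # analytic; f'(1+1j) = 2 + 6j != 0
z0 = 1 + 1j
d1, d2 = 1.0, cmath.exp(1j * cmath.pi / 4)

angle_before = cmath.phase(d2 / d1)
angle_after = cmath.phase(direction_of_image(f, z0, d2) /
                          direction_of_image(f, z0, d1))
```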

420.14

convergence/divergence for an infinite product

Consider ∏_{n=1}^∞ p_n. We say that this infinite product converges iff the partial products P_m = ∏_{n=1}^m p_n converge to some P ≠ 0, or iff at most a finite number of factors vanish, p_{n_k} = 0, k = 1, …, K, and the partial products of the remaining factors converge to a nonzero limit. Otherwise the infinite product is called divergent. Note: the infinite product vanishes only if a factor is zero. Version: 6 Owner: Johan Author(s): Johan

420.15

example of conformal mapping

Consider the four curves A = {t}, B = {t + it}, C = {it} and D = {−t + it}, t ∈ [−10, 10]. Suppose there is a mapping f : C → C which maps A to D and B to C. Is f conformal at z0 = 0? The size of the angles between A and B at the point of intersection z0 = 0 is preserved, however the orientation is not. Therefore f is not conformal at z0 = 0. Now suppose there is a function g : C → C which maps A to C and B to D. In this case we see not only that the size of the angles is preserved, but also the orientation. Therefore g is conformal at z0 = 0. Version: 3 Owner: Johan Author(s): Johan


420.16

examples of infinite products

A classic example is the Riemann zeta function. For $\mathrm{Re}(z) > 1$ we have
$$\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-z}}.$$
With the help of a Fourier series, or in other ways, one can prove the following infinite product expansion of the sine function:
$$\sin z = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2 \pi^2}\right) \qquad (420.16.1)$$
where $z$ is an arbitrary complex number. Taking the logarithmic derivative (a frequent move in connection with infinite products) we get a decomposition of the cotangent into partial fractions:
$$\pi \cot \pi z = \frac{1}{z} + \sum_{n=1}^{\infty} \left(\frac{1}{z+n} + \frac{1}{z-n}\right). \qquad (420.16.2)$$
The equation (420.16.2), in turn, has some interesting uses, e.g. to get the Taylor expansion of an Eisenstein series, or to evaluate $\zeta(2n)$ for positive integers $n$. Version: 1 Owner: mathcam Author(s): Larry Hammick
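The product expansion (420.16.1) can be sanity-checked by truncating at a large N (our own numerical sketch):

```python
import math

# Truncate sin z = z * prod_{n>=1} (1 - z^2/(n^2 pi^2)) at N factors
# and compare with math.sin at a real sample point.
def sin_product(z, N=20000):
    prod = z
    for n in range(1, N + 1):
        prod *= 1.0 - z * z / (n * n * math.pi * math.pi)
    return prod

approx = sin_product(1.0)
error = abs(approx - math.sin(1.0))
```

The truncation error decays like 1/N, so N = 20000 already gives several correct digits.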

420.17

link between infinite products and sums

Let $\prod_{k=1}^{\infty} p_k$ be an infinite product such that $p_k > 0$ for all $k$. Then the infinite product converges if and only if the infinite sum $\sum_{k=1}^{\infty} \log p_k$ converges. Moreover
$$\prod_{k=1}^{\infty} p_k = \exp\left(\sum_{k=1}^{\infty} \log p_k\right).$$

Proof. Simply notice that
$$\prod_{k=1}^{N} p_k = \exp\left(\sum_{k=1}^{N} \log p_k\right).$$
If the infinite sum converges then
$$\lim_{N \to \infty} \prod_{k=1}^{N} p_k = \lim_{N \to \infty} \exp\left(\sum_{k=1}^{N} \log p_k\right) = \exp\left(\sum_{k=1}^{\infty} \log p_k\right)$$
and also the infinite product converges. Version: 1 Owner: paolini Author(s): paolini
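The finite identity used in the proof is easy to check numerically (our own sketch, with arbitrary positive terms):

```python
import math

# For a finite product of positive terms: prod p_k == exp(sum log p_k).
p = [1.5, 0.8, 2.0, 1.1, 0.9]

product = 1.0
for x in p:
    product *= x

reconstructed = math.exp(sum(math.log(x) for x in p))
difference = abs(product - reconstructed)
```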

420.18

proof of Cauchy integral formula

Let $D = \{z \in \mathbb{C} : |z - z_0| < R\}$ be a disk in the complex plane, $S \subset D$ a finite subset, and $U \subset \mathbb{C}$ an open domain that contains the closed disk $\bar{D}$. Suppose that

• $f : U \setminus S \to \mathbb{C}$ is holomorphic, and that

• $f(z)$ is bounded on $D \setminus S$.

Let $z \in D \setminus S$ be given, and set
$$g(\zeta) = \frac{f(\zeta) - f(z)}{\zeta - z}, \qquad \zeta \in D \setminus S',$$
where $S' = S \cup \{z\}$. Note that $g(\zeta)$ is holomorphic and bounded on $D \setminus S'$. The second assertion is true because $g(\zeta) \to f'(z)$ as $\zeta \to z$. Therefore, by the Cauchy integral theorem
$$\int_C g(\zeta)\, d\zeta = 0,$$
where $C$ is the counterclockwise circular contour parameterized by $\zeta = z_0 + Re^{it}$, $0 \le t \le 2\pi$. Hence,
$$\int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \int_C \frac{f(z)}{\zeta - z}\, d\zeta. \qquad (420.18.1)$$

Lemma. If $z \in \mathbb{C}$ is such that $|z| \neq 1$, then
$$\int_{|\zeta|=1} \frac{d\zeta}{\zeta - z} = \begin{cases} 0 & \text{if } |z| > 1, \\ 2\pi i & \text{if } |z| < 1. \end{cases}$$

The proof is a fun exercise in elementary integral calculus, an application of the half-angle trigonometric substitutions. Thanks to the Lemma, the right hand side of (420.18.1) evaluates to $2\pi i f(z)$. Dividing through by $2\pi i$, we obtain
$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$
as desired. Since a circle is a compact set, the defining limit for the derivative
$$\frac{d}{dz} \frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$
converges uniformly for $\zeta \in \partial D$. Thanks to the uniform convergence, the order of the derivative and the integral operations can be interchanged. In this way we obtain the second formula:
$$f'(z) = \frac{1}{2\pi i} \frac{d}{dz} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta.$$
Version: 9 Owner: rmilson Author(s): rmilson, stawn
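The Cauchy integral formula can be verified numerically; the trapezoidal discretization of the contour integral below is our own illustration, not part of the proof:

```python
import cmath
import math

# Approximate (1/2πi) ∮_C f(ζ)/(ζ - z) dζ on the circle |ζ| = radius,
# using N equally spaced points (the trapezoid rule is spectrally
# accurate for smooth periodic integrands).
def cauchy_formula(f, z, radius=2.0, N=400):
    total = 0j
    for k in range(N):
        t = 2 * math.pi * k / N
        zeta = radius * cmath.exp(1j * t)
        dzeta = 1j * zeta * (2 * math.pi / N)  # dζ = i ζ dt
        total += f(zeta) / (zeta - z) * dzeta
    return total / (2j * math.pi)

z = 0.3 + 0.4j
error = abs(cauchy_formula(cmath.exp, z) - cmath.exp(z))
```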

420.19

proof of Cauchy residue theorem

Since $f$ is holomorphic, by the Cauchy-Riemann equations the differential form $f(z)\, dz$ is closed. So by the lemma about closed differential forms on a simply connected domain we know that the integral $\int_C f(z)\, dz$ is equal to $\int_{C'} f(z)\, dz$ if $C'$ is any curve which is homotopic to $C$. In particular we can consider a curve $C'$ which turns around the points $a_j$ along small circles and joins these small circles with segments. Since the curve $C'$ follows each segment twice with opposite orientation, it is enough to sum the integrals of $f$ around the small circles. So letting $z = a_j + \rho e^{i\theta}$ be a parameterization of the curve around the point $a_j$, we have $dz = \rho i e^{i\theta}\, d\theta$ and hence
$$\int_C f(z)\, dz = \int_{C'} f(z)\, dz = \sum_j \eta(C, a_j) \int_{\partial B_\rho(a_j)} f(z)\, dz = \sum_j \eta(C, a_j) \int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho i e^{i\theta}\, d\theta,$$
where $\rho > 0$ is chosen so small that the balls $B_\rho(a_j)$ are all disjoint and all contained in the domain $U$. So by linearity, it is enough to prove that for all $j$
$$i \int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho e^{i\theta}\, d\theta = 2\pi i\, \mathrm{Res}(f, a_j).$$

Let now $j$ be fixed and consider the Laurent series for $f$ at $a_j$:
$$f(z) = \sum_{k \in \mathbb{Z}} c_k (z - a_j)^k,$$
so that $\mathrm{Res}(f, a_j) = c_{-1}$. We have
$$\int_0^{2\pi} f(a_j + \rho e^{i\theta})\, \rho e^{i\theta}\, d\theta = \sum_k \int_0^{2\pi} c_k (\rho e^{i\theta})^k \rho e^{i\theta}\, d\theta = \sum_k \rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta.$$
Notice now that for $k = -1$ we have
$$\rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = c_{-1} \int_0^{2\pi} d\theta = 2\pi c_{-1} = 2\pi\, \mathrm{Res}(f, a_j),$$
while for $k \neq -1$ we have
$$\int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = \left[\frac{e^{i(k+1)\theta}}{i(k+1)}\right]_0^{2\pi} = 0.$$
Hence the result follows. Version: 2 Owner: paolini Author(s): paolini
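As a concrete numerical check (our own example), integrate a function with a known residue around a small circle:

```python
import cmath
import math

# f(z) = 1/(z*(z - 2)) has residue -1/2 at z = 0, since
# 1/(z - 2) -> -1/2 as z -> 0.
def f(z):
    return 1.0 / (z * (z - 2.0))

# Integrate around |z| = 0.5 with the trapezoid rule.
N = 1000
rho = 0.5
total = 0j
for k in range(N):
    theta = 2 * math.pi * k / N
    z = rho * cmath.exp(1j * theta)
    dz = 1j * z * (2 * math.pi / N)
    total += f(z) * dz

expected = 2j * math.pi * (-0.5)  # 2πi * Res(f, 0)
error = abs(total - expected)
```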

420.20

proof of Gauss' mean value theorem

We can parametrize the circle by letting $z = z_0 + re^{i\varphi}$. Then $dz = ire^{i\varphi}\, d\varphi$. Using the Cauchy integral formula we can express $f(z_0)$ in the following way:
$$f(z_0) = \frac{1}{2\pi i} \int_C \frac{f(z)}{z - z_0}\, dz = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + re^{i\varphi})}{re^{i\varphi}}\, ire^{i\varphi}\, d\varphi = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{i\varphi})\, d\varphi.$$
Version: 12 Owner: Johan Author(s): Johan
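Numerically, the average of a holomorphic function over a circle equals its value at the center (our own sketch, with an arbitrary polynomial and circle):

```python
import cmath
import math

# Mean value of f(z) = z^2 + 3z + 1 over the circle |z - z0| = r
# should equal f(z0).
def f(z):
    return z * z + 3 * z + 1

z0 = 1.0 + 2.0j
r = 0.7
N = 500
mean = sum(f(z0 + r * cmath.exp(2j * math.pi * k / N))
           for k in range(N)) / N

error = abs(mean - f(z0))
```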

420.21

proof of Goursat's theorem

We argue by contradiction. Set
$$\eta = \int_{\partial R} f(z)\, dz,$$
and suppose that $\eta \neq 0$. Divide $R$ into four congruent rectangles $R_1, R_2, R_3, R_4$ (see Figure 1), and set
$$\eta_i = \int_{\partial R_i} f(z)\, dz.$$

Figure 1: subdivision of the rectangle contour.

Now subdivide each of the four sub-rectangles, to get 16 congruent sub-sub-rectangles $R_{i_1 i_2}$, $i_1, i_2 = 1 \ldots 4$, and then continue ad infinitum to obtain a sequence of nested families of rectangles $R_{i_1 \ldots i_k}$, with $\eta_{i_1 \ldots i_k}$ the values of $f(z)$ integrated along the corresponding contour. Orienting the boundary of $R$ and all the sub-rectangles in the usual counter-clockwise fashion we have
$$\eta = \eta_1 + \eta_2 + \eta_3 + \eta_4,$$
and more generally
$$\eta_{i_1 \ldots i_k} = \eta_{i_1 \ldots i_k 1} + \eta_{i_1 \ldots i_k 2} + \eta_{i_1 \ldots i_k 3} + \eta_{i_1 \ldots i_k 4}.$$
In as much as the integrals along oppositely oriented line segments cancel, the contributions from the interior segments cancel, and that is why the right-hand side reduces to the integrals along the segments at the boundary of the composite rectangle.

Let $j_1 \in \{1, 2, 3, 4\}$ be such that $|\eta_{j_1}|$ is the maximum of $|\eta_i|$, $i = 1, \ldots, 4$. By the triangle inequality we have
$$|\eta_1| + |\eta_2| + |\eta_3| + |\eta_4| \ge |\eta|,$$
and hence $|\eta_{j_1}| \ge \frac{1}{4}|\eta|$.

Continuing inductively, let $j_{k+1}$ be such that $|\eta_{j_1 \ldots j_k j_{k+1}}|$ is the maximum of $|\eta_{j_1 \ldots j_k i}|$, $i = 1, \ldots, 4$. We then have
$$|\eta_{j_1 \ldots j_k j_{k+1}}| \ge 4^{-(k+1)} |\eta|. \qquad (420.21.1)$$
Now the sequence of nested rectangles $R_{j_1 \ldots j_k}$ converges to some point $z_0 \in R$; more formally
$$\{z_0\} = \bigcap_{k=1}^{\infty} R_{j_1 \ldots j_k}.$$
The derivative $f'(z_0)$ is assumed to exist, and hence for every $\epsilon > 0$ there exists a $k$ sufficiently large, so that for all $z \in R_{j_1 \ldots j_k}$ we have
$$|f(z) - f(z_0) - f'(z_0)(z - z_0)| \le \epsilon |z - z_0|.$$
Now we make use of the following.

Lemma 9. Let $Q \subset \mathbb{C}$ be a rectangle, let $a, b \in \mathbb{C}$, and let $f(z)$ be a continuous, complex valued function defined and bounded in a domain containing $Q$. Then,
$$\int_{\partial Q} (az + b)\, dz = 0, \qquad \left| \int_{\partial Q} f(z)\, dz \right| \le MP,$$
where $M$ is an upper bound for $|f(z)|$ and where $P$ is the length of $\partial Q$.

The first of these assertions follows by the fundamental theorem of calculus; after all the function $az + b$ has an anti-derivative. The second assertion follows from the fact that the absolute value of an integral is smaller than the integral of the absolute value of the integrand — a standard result in integration theory.

Using the lemma and the fact that the perimeter of a rectangle is greater than its diameter we infer that for every $\epsilon > 0$ there exists a $k$ sufficiently large that
$$|\eta_{j_1 \ldots j_k}| = \left| \int_{\partial R_{j_1 \ldots j_k}} f(z)\, dz \right| \le \epsilon\, |\partial R_{j_1 \ldots j_k}|^2 = \epsilon\, 4^{-k} |\partial R|^2,$$
where $|\partial R|$ denotes the length of the perimeter of the rectangle $R$. This contradicts the earlier estimate (420.21.1). Therefore $\eta = 0$. Version: 10 Owner: rmilson Author(s): rmilson

420.22

proof of Möbius circle transformation theorem

Case 1: $f(z) = az + b$.

Case 1a: The points on $|z - C| = R$ can be written as $z = C + Re^{i\theta}$. They are mapped to the points $w = aC + b + aRe^{i\theta}$, which all lie on the circle $|w - (aC + b)| = |a|R$.

Case 1b: The line $\mathrm{Re}(e^{i\theta} z) = k$ is mapped to the line $\mathrm{Re}\left(\frac{e^{i\theta}}{a} w\right) = k + \mathrm{Re}\left(\frac{e^{i\theta} b}{a}\right)$.

Case 2: $f(z) = \frac{1}{z}$.

Case 2a: Consider a circle passing through the origin. This can be written as $|z - C| = |C|$, i.e. $z\overline{z} = \overline{C}z + C\overline{z}$. Dividing by $z\overline{z}$ gives $C/z + \overline{C/z} = 1$, that is $Cw + \overline{Cw} = 1$, so the image satisfies $\mathrm{Re}(Cw) = \frac{1}{2}$, a line which does not pass through the origin.

Case 2b: Consider a line which does not pass through the origin. This can be written as $\mathrm{Re}(\overline{a}z) = 1$ for $a \neq 0$, i.e. $\overline{a}z + a\overline{z} = 2$, which is mapped to $\overline{a}/w + a/\overline{w} = 2$. This is simplified as $\overline{a}\overline{w} + aw = 2w\overline{w}$, which becomes $(w - a/2)(\overline{w} - \overline{a}/2) = a\overline{a}/4$, or $|w - \frac{a}{2}| = \frac{|a|}{2}$, which is a circle passing through the origin.

Case 2c: Consider a circle which does not pass through the origin. This can be written as $|z - C| = R$ with $|C| \neq R$. This circle is mapped to the circle
$$\left| w - \frac{\overline{C}}{|C|^2 - R^2} \right| = \frac{R}{\left| |C|^2 - R^2 \right|},$$
which is another circle not passing through the origin. To show this, use $(z - C)(\overline{z} - \overline{C}) = R^2$, i.e. $|z|^2 - \overline{C}z - C\overline{z} + |C|^2 = R^2$, to get
$$\frac{\overline{C}}{|C|^2 - R^2} - \frac{1}{z} = \frac{\overline{C}z - (|C|^2 - R^2)}{z(|C|^2 - R^2)} = \frac{|z|^2 - C\overline{z}}{z(|C|^2 - R^2)} = \frac{\overline{z}(z - C)}{z(|C|^2 - R^2)},$$
and since $|\overline{z}/z| = 1$ and $|z - C| = R$, the modulus of the right-hand side is $R/||C|^2 - R^2|$.

Case 2d: Consider a line passing through the origin. This can be written as $\mathrm{Re}(e^{i\theta} z) = 0$. It is mapped to the set $\mathrm{Re}(e^{i\theta}/w) = 0$, which can be rewritten as $\mathrm{Re}(e^{i\theta}\overline{w}) = 0$, or $\mathrm{Re}(e^{-i\theta}w) = 0$, which is another line passing through the origin.

Case 3: An arbitrary Möbius transformation can be written as $f(z) = \frac{az + b}{cz + d}$. If $c = 0$, this falls into Case 1, so we will assume that $c \neq 0$. Let
$$f_1(z) = cz + d, \qquad f_2(z) = \frac{1}{z}, \qquad f_3(z) = \frac{bc - ad}{c}\, z + \frac{a}{c}.$$
Then $f = f_3 \circ f_2 \circ f_1$. By Case 1, $f_1$ and $f_3$ map circles to circles, and by Case 2, $f_2$ maps circles to circles. Version: 2 Owner: brianbirgen Author(s): brianbirgen
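A numerical spot check (our own, with an arbitrary map and circle): push points of a circle through a Möbius map and verify the images are equidistant from a common center.

```python
import cmath
import math

# Arbitrary Möbius map with c != 0: f(z) = (z + 1)/(z - 3).
def f(z):
    return (z + 1) / (z - 3)

# Sample the circle |z - (1+1j)| = 0.5 (it avoids the pole z = 3).
C, R = 1 + 1j, 0.5
images = [f(C + R * cmath.exp(2j * math.pi * k / 12)) for k in range(12)]

# Recover the image circle's center m from three image points by
# solving the 2x2 linear system from |w1 - m| = |w2 - m| = |w3 - m|.
w1, w2, w3 = images[0], images[4], images[8]
ax, ay = (w2 - w1).real, (w2 - w1).imag
bx, by = (w3 - w1).real, (w3 - w1).imag
c1 = (abs(w2) ** 2 - abs(w1) ** 2) / 2
c2 = (abs(w3) ** 2 - abs(w1) ** 2) / 2
det = ax * by - ay * bx
m = complex((c1 * by - c2 * ay) / det, (ax * c2 - bx * c1) / det)

radii = [abs(w - m) for w in images]
spread = max(radii) - min(radii)  # ~0 iff all images lie on one circle
```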

420.23

proof of simultaneous converging or diverging of product and sum theorem

From the fact that $1 + x \le e^x$ for $x \ge 0$ we get
$$\sum_{n=1}^{m} a_n \le \prod_{n=1}^{m} (1 + a_n) \le e^{\sum_{n=1}^{m} a_n}.$$
Since $a_n \ge 0$, both the partial sums and the partial products are monotone increasing with the number of terms. This concludes the proof. Version: 2 Owner: Johan Author(s): Johan
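The sandwich inequality used above is easy to test numerically (our own sketch, with arbitrary nonnegative terms):

```python
import math

# Check sum(a) <= prod(1 + a) <= exp(sum(a)) for nonnegative terms.
a = [0.5, 0.1, 0.25, 0.0, 1.0]

s = sum(a)
prod = 1.0
for x in a:
    prod *= 1.0 + x

lower_ok = s <= prod
upper_ok = prod <= math.exp(s)
```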

420.24

proof of absolute convergence implies convergence for an infinite product

This comes at once from the link between infinite products and sums and the absolute convergence theorem for infinite sums. Version: 1 Owner: paolini Author(s): paolini

420.25

proof of closed curve theorem

Let
$$f(x + iy) = u(x, y) + iv(x, y).$$
Then we have
$$\int_C f(z)\, dz = \int_C \omega + i \int_C \eta,$$
where $\omega$ and $\eta$ are the differential forms
$$\omega = u(x, y)\, dx - v(x, y)\, dy, \qquad \eta = v(x, y)\, dx + u(x, y)\, dy.$$
Notice that by the Cauchy-Riemann equations $\omega$ and $\eta$ are closed differential forms. Hence by the lemma on closed differential forms on a simply connected domain we get
$$\int_{C_1} \omega = \int_{C_2} \omega, \qquad \int_{C_1} \eta = \int_{C_2} \eta,$$
and hence
$$\int_{C_1} f(z)\, dz = \int_{C_2} f(z)\, dz.$$
Version: 2 Owner: paolini Author(s): paolini

420.26

proof of conformal Möbius circle map theorem

Let $f$ be a conformal map from the unit disk $\Delta$ onto itself. Let $a = f(0)$, and let $g_a(z) = \frac{z - a}{1 - \overline{a}z}$. Then $g_a \circ f$ is a conformal map from $\Delta$ onto itself with $g_a \circ f(0) = 0$. Therefore, by Schwarz's lemma, for all $z \in \Delta$
$$|g_a \circ f(z)| \le |z|.$$
Because $f$ is a conformal map onto $\Delta$, $f^{-1}$ is also a conformal map of $\Delta$ onto itself. $(g_a \circ f)^{-1}(0) = 0$, so that by Schwarz's lemma $|(g_a \circ f)^{-1}(w)| \le |w|$ for all $w \in \Delta$. Writing $w = g_a \circ f(z)$ this becomes $|z| \le |g_a \circ f(z)|$. Therefore, for all $z \in \Delta$,
$$|g_a \circ f(z)| = |z|.$$
By Schwarz's lemma, $g_a \circ f$ is a rotation. Write $g_a \circ f(z) = e^{i\theta} z$, or $f(z) = g_a^{-1}(e^{i\theta} z)$. Therefore, $f$ is a Möbius transformation. Version: 2 Owner: brianbirgen Author(s): brianbirgen

420.27

simultaneous converging or diverging of product and sum theorem

Let $a_k \ge 0$. Then
$$\prod_{n=1}^{\infty} (1 + a_n) \qquad \text{and} \qquad \sum_{n=1}^{\infty} a_n$$
converge or diverge simultaneously. Version: 3 Owner: Johan Author(s): Johan

420.28

Cauchy-Riemann equations

The following system of partial differential equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$
where $u(x, y), v(x, y)$ are real-valued functions defined on some open subset of $\mathbb{R}^2$, was introduced by Riemann[1] as a definition of a holomorphic function. Indeed, if $f(z)$ satisfies the standard definition of a holomorphic function, i.e. if the complex derivative
$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$$
exists in the domain of definition, then the real and imaginary parts of $f(z)$ satisfy the Cauchy-Riemann equations. Conversely, if $u$ and $v$ satisfy the Cauchy-Riemann equations, and if their partial derivatives are continuous, then the complex valued function
$$f(z) = u(x, y) + iv(x, y), \qquad z = x + iy,$$
possesses a continuous complex derivative.

References

1. D. Laugwitz, Bernhard Riemann, 1826-1866: Turning points in the Conception of Mathematics, translated by Abe Shenitzer. Birkhauser, 1999.

Version: 2 Owner: rmilson Author(s): rmilson

420.29

Cauchy-Riemann equations (polar coordinates)

Suppose $A$ is an open set in $\mathbb{C}$ and $f(z) = f(re^{i\theta}) = u(r, \theta) + iv(r, \theta) : A \subset \mathbb{C} \to \mathbb{C}$ is a function. If the derivative of $f(z)$ exists at $z_0 = (r_0, \theta_0)$, then the functions $u, v$ satisfy at $z_0$
$$\frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial \theta}, \qquad \frac{\partial v}{\partial r} = -\frac{1}{r}\frac{\partial u}{\partial \theta},$$
which are called the Cauchy-Riemann equations in polar form. Version: 4 Owner: Daume Author(s): Daume

420.30

proof of the Cauchy-Riemann equations

Existence of complex derivative implies the Cauchy-Riemann equations. Suppose that the complex derivative
$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta} \qquad (420.30.1)$$
exists for some $z \in \mathbb{C}$. This means that for all $\epsilon > 0$, there exists a $\rho > 0$, such that for all complex $\zeta$ with $|\zeta| < \rho$, we have
$$\left| f'(z) - \frac{f(z + \zeta) - f(z)}{\zeta} \right| < \epsilon.$$
Henceforth, set
$$f = u + iv, \qquad z = x + iy.$$
If $\zeta$ is real, then the above limit reduces to a partial derivative in $x$, i.e.
$$f'(z) = \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.$$
Taking the limit with an imaginary $\zeta$ we deduce that
$$f'(z) = -i\frac{\partial f}{\partial y} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.$$
Therefore
$$\frac{\partial f}{\partial x} = -i\frac{\partial f}{\partial y},$$
and breaking this relation up into its real and imaginary parts gives the Cauchy-Riemann equations.

The Cauchy-Riemann equations imply the existence of a complex derivative. Suppose that the Cauchy-Riemann equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$
hold for a fixed $(x, y) \in \mathbb{R}^2$, and that all the partial derivatives are continuous at $(x, y)$ as well. The continuity implies that all directional derivatives exist as well. In other words, for $\xi, \eta \in \mathbb{R}$ and $\rho = \sqrt{\xi^2 + \eta^2}$ we have
$$\frac{\left| u(x + \xi, y + \eta) - u(x, y) - \left(\xi \frac{\partial u}{\partial x} + \eta \frac{\partial u}{\partial y}\right) \right|}{\rho} \to 0, \quad \text{as } \rho \to 0,$$
with a similar relation holding for $v(x, y)$. Combining the two scalar relations into a vector relation we obtain
$$\rho^{-1} \left\| \begin{pmatrix} u(x + \xi, y + \eta) \\ v(x + \xi, y + \eta) \end{pmatrix} - \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} - \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right\| \to 0, \quad \text{as } \rho \to 0.$$
Note that the Cauchy-Riemann equations imply that the matrix-vector product above is equivalent to the product of two complex numbers, namely
$$\left( \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \right)(\xi + i\eta).$$
Setting
$$f(z) = u(x, y) + iv(x, y), \qquad f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \qquad \zeta = \xi + i\eta,$$
we can therefore rewrite the above limit relation as
$$\frac{\left| f(z + \zeta) - f(z) - f'(z)\zeta \right|}{|\zeta|} \to 0, \quad \text{as } \rho \to 0,$$
which is the complex limit definition of $f'(z)$ shown in (420.30.1). Version: 2 Owner: rmilson Author(s): rmilson
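The equations can be checked numerically with finite differences (our own sketch, using f(z) = z³ as a sample holomorphic function):

```python
# Finite-difference check of the Cauchy-Riemann equations for
# f(z) = z**3, with u = Re f and v = Im f, at an arbitrary point.
def u(x, y):
    return ((x + 1j * y) ** 3).real

def v(x, y):
    return ((x + 1j * y) ** 3).imag

def partial(g, x, y, wrt, h=1e-6):
    # Central difference in the requested variable.
    if wrt == 'x':
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.7, -1.2
eq1 = abs(partial(u, x0, y0, 'x') - partial(v, x0, y0, 'y'))
eq2 = abs(partial(u, x0, y0, 'y') + partial(v, x0, y0, 'x'))
```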

420.31

removable singularity

Let $U \subset \mathbb{C}$ be an open neighbourhood of a point $a \in \mathbb{C}$. We say that a function $f : U \setminus \{a\} \to \mathbb{C}$ has a removable singularity at $a$, if the complex derivative $f'(z)$ exists for all $z \neq a$, and if $f(z)$ is bounded near $a$. Removable singularities can, as the name suggests, be removed.

Theorem 13. Suppose that $f : U \setminus \{a\} \to \mathbb{C}$ has a removable singularity at $a$. Then, $f(z)$ can be holomorphically extended to all of $U$, i.e. there exists a holomorphic $g : U \to \mathbb{C}$ such that $g(z) = f(z)$ for all $z \neq a$.

Proof. Let $C$ be a circle centered at $a$, oriented counterclockwise, and sufficiently small so that $C$ and its interior are contained in $U$. For $z$ in the interior of $C$, set
$$g(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta.$$
Since $C$ is a compact set, the defining limit for the derivative
$$\frac{d}{dz} \frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$
converges uniformly for $\zeta \in C$. Thanks to the uniform convergence, the order of the derivative and the integral operations can be interchanged. Hence, we may deduce that $g'(z)$ exists for all $z$ in the interior of $C$. Furthermore, by the Cauchy integral formula we have that $f(z) = g(z)$ for all $z \neq a$, and therefore $g(z)$ furnishes us with the desired extension. Version: 2 Owner: rmilson Author(s): rmilson


Chapter 421 30F40 – Kleinian groups
421.1 Klein 4-group

Any group $G$ of order 4 must be abelian. If $G$ is not isomorphic to the cyclic group of order 4, $C_4$, then it must be isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. This group is known as the Klein 4-group. The operation is the one induced by $\mathbb{Z}_2$, taken coordinate-wise. Version: 3 Owner: drini Author(s): drini, apmxi


Chapter 422 31A05 – Harmonic, subharmonic, superharmonic functions
422.1 a harmonic function on a graph which is bounded below and nonconstant

There exists no harmonic function on all of the $d$-dimensional grid $\mathbb{Z}^d$ which is bounded below and nonconstant. This characterises a particular property of the grid; below we see that other graphs can admit such harmonic functions.

Let $T_3 = (V_3, E_3)$ be a 3-regular tree. Assign "levels" to the vertices of $T_3$ as follows: fix a vertex $o \in V_3$, and let $\pi$ be a branch of $T_3$ (an infinite simple path) from $o$. For every vertex $v \in V_3$ of $T_3$ there exists a unique shortest path from $v$ to a vertex of $\pi$; let $\ell(v)$ be the length of this path. Now define $f(v) = 2^{-\ell(v)} > 0$. Without loss of generality, note that the three neighbours $u_1, u_2, u_3$ of $v$ satisfy $\ell(u_1) = \ell(v) - 1$ ("$u_1$ is the parent of $v$"), $\ell(u_2) = \ell(u_3) = \ell(v) + 1$ ("$u_2, u_3$ are the children of $v$"). And indeed,
$$\frac{1}{3}\left( 2^{-\ell(v)+1} + 2^{-\ell(v)-1} + 2^{-\ell(v)-1} \right) = 2^{-\ell(v)}.$$
So $f$ is a positive nonconstant harmonic function on $T_3$. Version: 2 Owner: drini Author(s): ariels
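The harmonicity identity at a vertex of level ℓ can be checked directly (our own one-line verification):

```python
# f(v) = 2**(-level) on the 3-regular tree: the average of f over the
# three neighbours (levels l-1, l+1, l+1) equals f at level l.
def f(level):
    return 2.0 ** (-level)

level = 5
neighbour_average = (f(level - 1) + f(level + 1) + f(level + 1)) / 3
holds = abs(neighbour_average - f(level)) < 1e-15
```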

422.2

example of harmonic functions on graphs

1. Let $G = (V, E)$ be a connected finite graph, and let $a, z \in V$ be two of its vertices. The function
$$f(v) = P\{\text{simple random walk from } v \text{ hits } a \text{ before } z\}$$
is a harmonic function except on $\{a, z\}$. Finiteness of $G$ is required only to ensure $f$ is well-defined. So we may replace "$G$ finite" with "simple random walk on $G$ is recurrent".

2. Let $G = (V, E)$ be a graph, and let $V' \subseteq V$. Let $\alpha : V' \to \mathbb{R}$ be some boundary condition. For $u \in V$, define a random variable $X_u$ to be the first vertex of $V'$ that simple random walk from $u$ hits. The function $f(v) = E[\alpha(X_v)]$ is a harmonic function except on $V'$. The first example is a special case of this one, taking $V' = \{a, z\}$ and $\alpha(a) = 1$, $\alpha(z) = 0$.

Version: 1 Owner: ariels Author(s): ariels

422.3

examples of harmonic functions on Rn

Some real functions in $\mathbb{R}^n$ (e.g. any linear function, or any affine function) are obviously harmonic functions. What are some more interesting harmonic functions?

• For $n \ge 3$, define (on the punctured space $U = \mathbb{R}^n \setminus \{0\}$) the function $f(x) = \|x\|^{2-n}$. Then
$$\frac{\partial f}{\partial x_i} = (2 - n)\frac{x_i}{\|x\|^n}, \qquad \frac{\partial^2 f}{\partial x_i^2} = n(n - 2)\frac{x_i^2}{\|x\|^{n+2}} - (n - 2)\frac{1}{\|x\|^n}.$$
Summing over $i = 1, \ldots, n$ shows $\Delta f \equiv 0$.

• For $n = 2$, define (on the punctured plane $U = \mathbb{R}^2 \setminus \{0\}$) the function $f(x, y) = \log(x^2 + y^2)$. Differentiation and summing yield $\Delta f \equiv 0$.

• For $n = 1$, the condition $(\Delta f)(x) = f''(x) \equiv 0$ forces $f$ to be an affine function on every segment; there are no "interesting" harmonic functions in one dimension.

Version: 2 Owner: ariels Author(s): ariels
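A finite-difference check that f(x) = ‖x‖^{2−n} is harmonic for n = 3 (our own sketch, at an arbitrary point away from the origin):

```python
import math

# f(x) = |x|^(2-n) with n = 3, i.e. f(x) = 1/|x|, away from the origin.
def f(p):
    return 1.0 / math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)

# Laplacian via second-order central differences in each coordinate.
def laplacian(g, p, h=1e-4):
    total = 0.0
    for i in range(3):
        plus = list(p)
        minus = list(p)
        plus[i] += h
        minus[i] -= h
        total += (g(plus) - 2.0 * g(p) + g(minus)) / h ** 2
    return total

value = laplacian(f, [1.0, 2.0, 2.0])  # a point with |x| = 3
```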


422.4

harmonic function

• A real or complex-valued function $f : U \to \mathbb{R}$ or $f : U \to \mathbb{C}$ in $C^2$ (i.e. $f$ is twice continuously differentiable), where $U \subseteq \mathbb{R}^n$ is some domain, is called harmonic if its Laplacian vanishes on $U$: $\Delta f \equiv 0$.

• A real or complex-valued function $f : V \to \mathbb{R}$ or $f : V \to \mathbb{C}$ defined on the vertices $V$ of a graph $G = (V, E)$ is called harmonic at $v \in V$ if its value at $v$ is its average value at the neighbours of $v$:
$$f(v) = \frac{1}{\deg(v)} \sum_{\{u,v\} \in E} f(u).$$
It is called harmonic except on $A$, for some $A \subseteq V$, if it is harmonic at each $v \in V \setminus A$, and harmonic if it is harmonic at each $v \in V$.

In the continuous (first) case, any harmonic $f : \mathbb{R}^n \to \mathbb{R}$ or $f : \mathbb{R}^n \to \mathbb{C}$ satisfies Liouville's theorem. Indeed, a holomorphic function is harmonic, and a real harmonic function $f : U \to \mathbb{R}$, where $U \subseteq \mathbb{R}^2$, is locally the real part of a holomorphic function. In fact, it is enough that a harmonic function $f$ be bounded below (or above) to conclude that it is constant.

In the discrete (second) case, any harmonic $f : \mathbb{Z}^n \to \mathbb{R}$, where $\mathbb{Z}^n$ is the $n$-dimensional grid, is constant if bounded below (or above). However, this is not necessarily true on other graphs. Version: 4 Owner: ariels Author(s): ariels


Chapter 423 31B05 – Harmonic, subharmonic, superharmonic functions
423.1 Laplacian

Let $(x_1, \ldots, x_n)$ be Cartesian coordinates for some open set $\Omega$ in $\mathbb{R}^n$. Then the Laplacian differential operator $\Delta$ is defined as
$$\Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}.$$
In other words, if $f$ is a twice differentiable function $f : \Omega \to \mathbb{C}$, then
$$\Delta f = \frac{\partial^2 f}{\partial x_1^2} + \cdots + \frac{\partial^2 f}{\partial x_n^2}.$$
A coordinate independent definition of the Laplacian is $\Delta = \nabla \cdot \nabla$, i.e., $\Delta$ is the composition of gradient and divergence. A harmonic function is one for which the Laplacian vanishes.

An older symbol for the Laplacian is $\nabla^2$ – conceptually the scalar product of $\nabla$ with itself. This form may be more favoured by physicists.

Version: 4 Owner: matte Author(s): matte, ariels


Chapter 424 32A05 – Power series, series of functions
424.1 exponential function

We begin by defining the exponential function $\exp : \mathbb{R} \to \mathbb{R}^+$ for all real values of $x$ by the power series
$$\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
It has a few elementary properties, which can be easily shown.

• The radius of convergence is infinite
• $\exp(0) = 1$
• It is infinitely differentiable, and the derivative is the exponential function itself
• $\exp(x) \ge 1 + x$, so it is positive and unbounded on the non-negative reals

Now consider the function $f : \mathbb{R} \to \mathbb{R}$ with $f(x) = \exp(x)\exp(y - x)$; by the product rule and property 3,
$$f'(x) = 0.$$
By the constant value theorem
$$\exp(x)\exp(y - x) = \exp(y) \qquad \forall\, y, x \in \mathbb{R}.$$
With a suitable change of variables, we have
$$\exp(x + y) = \exp(x)\exp(y), \qquad \exp(x)\exp(-x) = 1.$$
Consider just the non-negative reals. Since it is unbounded, by the intermediate value theorem it can take any value on the interval $[1, \infty)$. We have that the derivative is strictly positive, so by the mean-value theorem $\exp(x)$ is strictly increasing. This gives surjectivity and injectivity, i.e. it is a bijection from $[0, \infty)$ to $[1, \infty)$. Now $\exp(-x) = \frac{1}{\exp(x)}$, so it is also a bijection from $(-\infty, 0)$ to $(0, 1)$. Therefore we can say that:

$\exp(x)$ is a bijection onto $\mathbb{R}^+$.

We can now naturally define the logarithm function as the inverse of the exponential function. It is usually denoted by $\ln(x)$, and it maps $\mathbb{R}^+$ to $\mathbb{R}$. Similarly, the natural log base $e$ may be defined by $e = \exp(1)$. Since the exponential function obeys the rules normally associated with powers, it is often denoted by $e^x$. In fact it is now possible to define powers in terms of the exponential function by
$$a^x = e^{x \ln(a)}, \qquad a > 0.$$
Note the domain may be extended to the complex plane with all the same properties as before, except the bijectivity and ordering properties. Comparison with the power series expansions for sine and cosine yields the following identity, with the famous corollary attributed to Euler:
$$e^{ix} = \cos(x) + i\sin(x), \qquad e^{i\pi} = -1.$$
Version: 10 Owner: mathcam Author(s): mathcam, vitriol
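A quick check of the series definition against the library exponential (our own sketch):

```python
import math

# Partial sums of sum_{k>=0} x^k / k! converge to exp(x).
def exp_series(x, terms=30):
    total = 0.0
    term = 1.0  # x^0 / 0!
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # next term: x^(k+1)/(k+1)!
    return total

err = abs(exp_series(2.5) - math.exp(2.5))
```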


Chapter 425 32C15 – Complex spaces
425.1 Riemann sphere

The Riemann sphere, denoted $\hat{\mathbb{C}}$, is the one-point compactification of the complex plane $\mathbb{C}$, obtained by identifying the limits of all infinitely extending rays from the origin as one single "point at infinity." Heuristically, $\hat{\mathbb{C}}$ can be viewed as a 2-sphere with the top point corresponding to the point at infinity, and the bottom point corresponding to the origin. An atlas for the Riemann sphere is given by two charts:
$$\hat{\mathbb{C}} \setminus \{\infty\} \to \mathbb{C} : z \mapsto z$$
and
$$\hat{\mathbb{C}} \setminus \{0\} \to \mathbb{C} : z \mapsto \frac{1}{z}.$$
Any polynomial $p$ on $\mathbb{C}$ has a unique smooth extension to a map $\hat{p} : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$. Version: 2 Owner: mathcam Author(s): mathcam


Chapter 426 32F99 – Miscellaneous
426.1 star-shaped region

Definition. A subset $U$ of a real (or possibly complex) vector space is called star-shaped if there is a point $p \in U$ such that the line segment $\overline{pq}$ is contained in $U$ for all $q \in U$. We then say that $U$ is star-shaped with respect to $p$. (Here, $\overline{pq} = \{tp + (1 - t)q \mid t \in [0, 1]\}$.) A region $U$ is, in other words, star-shaped if there is a point $p \in U$ such that $U$ can be "collapsed" or "contracted" onto $p$.

Examples

1. In $\mathbb{R}^n$, any vector subspace is star-shaped. Also, the unit cube and unit ball are star-shaped, but the unit sphere is not.

2. A subset $U$ in a vector space is star-shaped with respect to all of its points if and only if $U$ is convex.

Version: 2 Owner: matte Author(s): matte


Chapter 427 32H02 – Holomorphic mappings, (holomorphic) embeddings and related questions
427.1 Bloch’s theorem

Let $f$ be a holomorphic function on a region containing the closure of the disk $D = \{z \in \mathbb{C} : |z| < 1\}$, such that $f(0) = 0$ and $f'(0) = 1$. Then there is a disk $S \subset D$ such that $f$ is injective on $S$ and $f(S)$ contains a disk of radius $\frac{1}{72}$. Version: 2 Owner: Koro Author(s): Koro

427.2

Hartogs' theorem

Let U ⊂ Cn (n > 1) be an open set containing the origin 0. Then any holomorphic function on U − {0} extends uniquely to a holomorphic function on U. Version: 1 Owner: bwebste Author(s): bwebste


Chapter 428 32H25 – Picard-type theorems and generalizations
428.1 Picard’s theorem

Let $f$ be a holomorphic function with an essential singularity at $w \in \mathbb{C}$. Then there is a number $z_0 \in \mathbb{C}$ such that the image of any neighborhood of $w$ under $f$ contains $\mathbb{C} - \{z_0\}$. In other words, $f$ assumes every complex value, with the possible exception of $z_0$, in any neighborhood of $w$. Remark. The little Picard theorem follows as a corollary: Given a nonconstant entire function $f$, if it is a polynomial, it assumes every value in $\mathbb{C}$ as a consequence of the fundamental theorem of algebra. If $f$ is not a polynomial, then $g(z) = f(1/z)$ has an essential singularity at 0; Picard's theorem implies that $g$ (and thus $f$) assumes every complex value, with one possible exception. Version: 4 Owner: Koro Author(s): Koro

428.2

little Picard theorem

The range of a nonconstant entire function is either the whole complex plane C, or the complex plane with a single point removed. In other words, if an entire function omits two or more values, then it is a constant function. Version: 2 Owner: Koro Author(s): Koro


Chapter 429 33-XX – Special functions
429.1 beta function

The beta function is defined as
$$B(p, q) = \int_0^1 x^{p-1}(1 - x)^{q-1}\, dx \qquad \text{for any } p, q > 0.$$
The beta function has the property
$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)},$$
where $\Gamma$ is the gamma function. Also,
$$B(p, q) = B(q, p) \qquad \text{and} \qquad B\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \pi.$$
The function was discovered by L. Euler (1730) and the name was given by J. Binet. Version: 8 Owner: vladm Author(s): vladm


Chapter 430 33B10 – Exponential and trigonometric functions
430.1 natural logarithm

The natural logarithm of a number is the logarithm in base $e$. It is defined formally as
$$\ln(x) = \int_1^x \frac{1}{t}\, dt.$$
The origin of the natural logarithm, the exponential function and Euler's number $e$ are very much intertwined. The integral above was found to have the properties of a logarithm. You can view these properties in the entry on logarithms. If indeed the integral represented a logarithmic function, its base would have to be $e$, the number where the value of the integral is 1. Thus was the natural logarithm defined. The natural logarithm can be represented by a power series for $-1 < x \le 1$:
$$\ln(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k} x^k.$$
Note that the above is only the definition of a logarithm for real numbers greater than zero. For complex and negative numbers, one has to look at the Euler relation. Version: 3 Owner: mathwizard Author(s): mathwizard, slider142
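The series and the library logarithm agree where the series converges (our own sketch):

```python
import math

# Partial sums of sum_{k>=1} (-1)^(k+1) x^k / k approximate ln(1+x)
# for -1 < x <= 1.
def ln1p_series(x, terms=100):
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k + 1) * x ** k / k
    return total

err = abs(ln1p_series(0.5) - math.log(1.5))
```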


Chapter 431 33B15 – Gamma, beta and polygamma functions
431.1 Bohr-Mollerup theorem

Let f : R+ → R+ be a function with the following properties: 1. log f (x) is a convex function; 2. f (x + 1) = xf (x) for all x > 0; 3. f (1) = 1. Then f (x) = Γ(x) for all x > 0. That is, the only function satisfying those properties is the gamma function (restricted to the positive reals.) Version: 1 Owner: Koro Author(s): Koro

431.2

gamma function

The gamma function is
$$\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\, dt.$$
For integer values of $x = n$,
$$\Gamma(n) = (n - 1)!$$
Hence the gamma function satisfies $\Gamma(x + 1) = x\Gamma(x)$ if $x > 0$.

[Figure: plot of the gamma function, generated by GNU Octave and gnuplot.]

Some values of the gamma function for small arguments are:

Γ(1/5) = 4.5908    Γ(1/4) = 3.6256
Γ(1/3) = 2.6789    Γ(2/5) = 2.2182
Γ(3/5) = 1.4892    Γ(2/3) = 1.3541
Γ(3/4) = 1.2254    Γ(4/5) = 1.1642

and the ever-useful $\Gamma(1/2) = \sqrt{\pi}$. These values allow a quick calculation of $\Gamma(n + f)$, where $n$ is a natural number and $f$ is any fractional value for which the gamma function's value is known. Since $\Gamma(x + 1) = x\Gamma(x)$, we have
$$\Gamma(n + f) = (n + f - 1)\Gamma(n + f - 1) = (n + f - 1)(n + f - 2)\Gamma(n + f - 2) = \cdots = (n + f - 1)(n + f - 2)\cdots(f)\,\Gamma(f),$$
which is easy to calculate if we know $\Gamma(f)$.

The gamma function has a meromorphic continuation to the entire complex plane with poles at the non-positive integers. It satisfies the product formula
$$\Gamma(z) = \frac{e^{-\gamma z}}{z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right)^{-1} e^{z/n},$$
where $\gamma$ is Euler's constant, and the functional equation
$$\Gamma(z)\Gamma(1 - z) = \frac{\pi}{\sin \pi z}.$$
Version: 8 Owner: akrowne Author(s): akrowne
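Python's standard library exposes the gamma function, so the basic identities above can be spot-checked (our own sketch):

```python
import math

# Γ(n) = (n-1)! for integers, Γ(1/2) = sqrt(pi), and the reflection
# formula Γ(z)Γ(1-z) = pi / sin(pi z).
fact_ok = all(math.isclose(math.gamma(n), math.factorial(n - 1))
              for n in range(1, 10))
half_ok = math.isclose(math.gamma(0.5), math.sqrt(math.pi))

z = 0.3
reflection_ok = math.isclose(math.gamma(z) * math.gamma(1 - z),
                             math.pi / math.sin(math.pi * z))
```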

431.3

proof of Bohr-Mollerup theorem

We prove this theorem in two stages: first, we establish that the gamma function satisfies the given conditions, and then we prove that these conditions uniquely determine a function on $(0, \infty)$.

By its definition, $\Gamma(x)$ is positive for positive $x$. Let $x, y > 0$ and $0 \le \lambda \le 1$.
$$\log \Gamma(\lambda x + (1 - \lambda)y) = \log \int_0^{\infty} e^{-t} t^{\lambda x + (1-\lambda)y - 1}\, dt = \log \int_0^{\infty} (e^{-t} t^{x-1})^{\lambda} (e^{-t} t^{y-1})^{1-\lambda}\, dt \le \log\left( \left(\int_0^{\infty} e^{-t} t^{x-1}\, dt\right)^{\lambda} \left(\int_0^{\infty} e^{-t} t^{y-1}\, dt\right)^{1-\lambda} \right) = \lambda \log \Gamma(x) + (1 - \lambda) \log \Gamma(y).$$
The inequality follows from Hölder's inequality, with $p = \frac{1}{\lambda}$ and $q = \frac{1}{1 - \lambda}$. This proves that $\Gamma$ is log-convex. Condition 2 follows from the definition by applying integration by parts. Condition 3 is a trivial verification from the definition.

Now we show that the 3 conditions uniquely determine a function. By condition 2, it suffices to show that the conditions uniquely determine a function on $(0, 1)$. Let $G$ be a function satisfying the 3 conditions, $0 \le x \le 1$, and $n \in \mathbb{N}$. Since $n + x = (1 - x)n + x(n + 1)$, log-convexity of $G$ gives
$$G(n + x) \le G(n)^{1-x} G(n + 1)^x = G(n)^{1-x} G(n)^x n^x = (n - 1)!\, n^x.$$
Similarly, $n + 1 = x(n + x) + (1 - x)(n + 1 + x)$ gives
$$n! \le G(n + x)^x G(n + 1 + x)^{1-x} = G(n + x)(n + x)^{1-x}.$$
Combining these two we get
$$n!\,(n + x)^{x-1} \le G(n + x) \le (n - 1)!\, n^x,$$
and by using condition 2 to express $G(n + x)$ in terms of $G(x)$ we find
$$a_n := \frac{n!\,(n + x)^{x-1}}{x(x + 1)\cdots(x + n - 1)} \le G(x) \le \frac{(n - 1)!\, n^x}{x(x + 1)\cdots(x + n - 1)} =: b_n.$$
Now these inequalities hold for every integer $n$, and the terms on the left and right side have a common limit ($\lim_{n \to \infty} \frac{a_n}{b_n} = 1$), so we find that this determines $G$.

As a corollary we find another expression for $\Gamma$. For $0 \le x \le 1$,
$$\Gamma(x) = \lim_{n \to \infty} \frac{n!\, n^x}{x(x + 1)\cdots(x + n)}.$$
In fact, this equation, called Gauß's product, goes for the whole complex plane minus the negative integers. Version: 1 Owner: lieven Author(s): lieven


Chapter 432 33B30 – Higher logarithm functions
432.1 Lambert W function

Lambert's W function is the inverse of the function $f : \mathbb{C} \to \mathbb{C}$ given by $f(x) := xe^x$. That is, $W(x)$ is the complex valued function that satisfies
$$W(x)e^{W(x)} = x, \qquad \text{for all } x \in \mathbb{C}.$$
In practice the definition of $W(x)$ requires a branch cut, which is usually taken along the negative real axis. Lambert's W function is sometimes also called the product log function. This function allows us to solve the functional equation
$$g(x)^{g(x)} = x,$$
since $g(x) = e^{W(\ln(x))}$.

432.1.1

References

A site with good information on Lambert’s W function is Corless’ page ”On the Lambert W Function” Version: 4 Owner: drini Author(s): drini
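On the principal branch for x > 0, W can be computed by Newton's method on w·e^w − x = 0 (our own sketch, not any particular published algorithm):

```python
import math

# Newton iteration for the principal branch of Lambert W (x > 0):
# solve w * exp(w) = x for w. d/dw (w e^w - x) = e^w (1 + w).
def lambert_w(x, tol=1e-12):
    w = math.log(1.0 + x)  # reasonable starting guess for x > 0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1))
        w -= step
        if abs(step) < tol:
            break
    return w

w = lambert_w(3.0)
residual = abs(w * math.exp(w) - 3.0)
```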


Chapter 433 33B99 – Miscellaneous
433.1 natural log base

The natural log base, or $e$, has value $2.718281828459045\ldots$

$e$ was extensively studied by Euler in the 1720's, but it was originally discovered by John Napier. $e$ is defined by
$$e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n.$$
It is more effectively calculated, however, by using a Taylor series to get the representation
$$e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots$$

Version: 3 Owner: akrowne Author(s): akrowne


Chapter 434 33D45 – Basic orthogonal polynomials and functions (Askey-Wilson polynomials, etc.)
434.1 orthogonal polynomials

Polynomials of order n are analytic functions that can be written in the form pn (x) = a0 + a1 x + a2 x2 + · · · + an xn They can be differentiated and integrated for any value of x, and are fully determined by the n + 1 coefficients a0 . . . an . For this simplicity they are frequently used to approximate more complicated or unknown functions. In approximations, the necessary order n of the polynomial is not normally defined by criteria other than the quality of the approximation. Using polynomials as defined above tends to lead into numerical difficulties when determining the ai , even for small values of n. It is therefore customary to stabilize results numerically by using orthogonal polynomials over an interval [a, b], defined with respect to a positive weight function W (x) > 0 by intb pn (x)pm (x)W (x)dx = 0 for n = m a Orthogonal polynomials are obtained in the following way: define the scalar product. (f, g) = intb f (x)g(x)W (x)dx a


between the functions f and g, where W(x) is a weight factor. Starting with the polynomials p_0(x) = 1, p_1(x) = x, p_2(x) = x^2, etc., from the Gram-Schmidt decomposition one obtains a sequence of orthogonal polynomials q_0(x), q_1(x), . . ., such that (q_m, q_n) = N_n δ_mn. The normalization factors N_n are arbitrary. When all N_i are equal to one, the polynomials are called orthonormal. Some important orthogonal polynomials are:

  a     b     W(x)              name
  -1    1     1                 Legendre polynomials
  -1    1     (1 − x^2)^{−1/2}  Chebyshev polynomials
  −∞    ∞     e^{−x^2}          Hermite polynomials
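The Legendre case of the table can be verified numerically, using Bonnet's three-term recurrence for P_n and Simpson's rule for the integrals; both are standard tools assumed here, not given in the entry:

```python
def legendre(n, x):
    """P_n(x) via the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def inner(n, m, steps=2000):
    """Simpson's rule for (P_n, P_m) = integral of P_n P_m over [-1, 1]."""
    h = 2.0 / steps
    s = 0.0
    for i in range(steps + 1):
        x = -1.0 + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 == 1 else 2)
        s += w * legendre(n, x) * legendre(m, x)
    return s * h / 3.0

# Distinct orders are orthogonal; the normalization (P_n, P_n) = 2/(2n+1)
# is a classical value quoted here as an assumption.
assert abs(inner(2, 3)) < 1e-9
assert abs(inner(3, 3) - 2.0 / 7.0) < 1e-9
```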

Orthogonal polynomials of successive orders can be expressed by a recurrence relation

p_n = (A_n + B_n x) p_{n−1} + C_n p_{n−2}

This relation can be used to compute a finite series a_0 p_0 + a_1 p_1 + ··· + a_n p_n with arbitrary coefficients a_i, without computing explicitly every polynomial p_j (Horner's rule). Chebyshev polynomials T_n(x) are also orthogonal with respect to discrete values x_i:

Σ_i T_n(x_i) T_m(x_i) = 0   for n < m ≤ M

where the x_i depend on M. For more information, see [Abramowitz74], [Press95].

References

• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)

[Abramowitz74] M. Abramowitz and I.A. Stegun (Eds.), Handbook of Mathematical Functions, National Bureau of Standards, Dover, New York, 1974.

[Press95] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, Second edition, Cambridge University Press, 1995. (The same book exists for the Fortran language.) There is also an Internet version which you can work from.

Version: 3 Owner: akrowne Author(s): akrowne

Chapter 435 33E05 – Elliptic functions and integrals
435.1 Weierstrass sigma function

Definition 37. Let Λ ⊂ C be a lattice. Let Λ* denote Λ − {0}.

1. The Weierstrass sigma function is defined as the product

σ(z; Λ) = z ∏_{w∈Λ*} (1 − z/w) e^{z/w + (1/2)(z/w)^2}

2. The Weierstrass zeta function is defined by the sum

ζ(z; Λ) = σ'(z; Λ)/σ(z; Λ) = 1/z + Σ_{w∈Λ*} [ 1/(z − w) + 1/w + z/w^2 ]

Note that the Weierstrass zeta function is basically the derivative of the logarithm of the sigma function. The zeta function can be rewritten as:

ζ(z; Λ) = 1/z − Σ_{k=1}^∞ G_{2k+2}(Λ) z^{2k+1}

where G_{2k+2} is the Eisenstein series of weight 2k + 2.

3. The Weierstrass eta function is defined to be η(w; Λ) = ζ(z + w; Λ) − ζ(z; Λ), for any z ∈ C. (It can be proved that this is well defined, i.e. ζ(z + w; Λ) − ζ(z; Λ) only depends on w.) The Weierstrass eta function must not be confused with the Dedekind eta function.

Version: 1 Owner: alozano Author(s): alozano

435.2 elliptic function

Let Λ ⊂ C be a lattice in the sense of number theory, i.e. a 2-dimensional free group over Z which generates C over R. An elliptic function φ, with respect to the lattice Λ, is a meromorphic function φ : C → C which is Λ-periodic:

φ(z + λ) = φ(z),   ∀z ∈ C, ∀λ ∈ Λ

Remark: An elliptic function which is holomorphic is constant. Indeed, such a function would induce a holomorphic function on C/Λ, which is compact (and it is a standard result from complex analysis that any holomorphic function with compact domain is constant; this follows from Liouville's theorem). Example: The Weierstrass ℘-function (see elliptic curve) is an elliptic function, probably the most important. In fact:

Theorem 12. The field of elliptic functions with respect to a lattice Λ is generated by ℘ and ℘' (the derivative of ℘).

See [2], chapter 1, theorem 4.

REFERENCES
1. James Milne, Modular Functions and Modular Forms, online course notes. http://www.jmilne.org/math/CourseNotes/math678.html 2. Serge Lang, Elliptic Functions. Springer-Verlag, New York. 3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.

Version: 4 Owner: alozano Author(s): alozano

435.3 elliptic integrals and Jacobi elliptic functions

Elliptic integrals. For 0 < k < 1, write

F(k, φ) = ∫_0^φ dθ / √(1 − k^2 sin^2 θ)    (435.3.1)

E(k, φ) = ∫_0^φ √(1 − k^2 sin^2 θ) dθ    (435.3.2)

Π(k, n, φ) = ∫_0^φ dθ / [(1 + n sin^2 θ) √(1 − k^2 sin^2 θ)]    (435.3.3)

The change of variable x = sin φ turns these into

F_1(k, x) = ∫_0^x dv / √((1 − v^2)(1 − k^2 v^2))    (435.3.4)

E_1(k, x) = ∫_0^x √(1 − k^2 v^2) / √(1 − v^2) dv    (435.3.5)

Π_1(k, n, x) = ∫_0^x dv / [(1 + n v^2) √((1 − v^2)(1 − k^2 v^2))]    (435.3.6)

The first three functions are known as Legendre's form of the incomplete elliptic integrals of the first, second, and third kinds respectively. Notice that (1) is the special case n = 0 of (3). The latter three are known as Jacobi's form of those integrals. If φ = π/2, or x = 1, they are called complete rather than incomplete integrals, and their names are abbreviated to F(k), E(k), etc.

One use for elliptic integrals is to systematize the evaluation of certain other integrals. In particular, let p be a third- or fourth-degree polynomial in one variable, and let y = √(p(x)). If q and r are any two polynomials in two variables, then the indefinite integral

∫ q(x, y)/r(x, y) dx

has a "closed form" in terms of the above incomplete elliptic integrals, together with elementary functions and their inverses.

Jacobi's elliptic functions. In (1) we may regard φ as a function of F, or vice versa. The notation used is

φ = am u,   u = arg φ

and φ and u are known as the amplitude and argument respectively. But x = sin φ = sin am u. The function u → sin am u = x is denoted by sn, and is one of four Jacobi (or Jacobian) elliptic functions. The four are:

sn u = x
cn u = √(1 − x^2)
tn u = sn u / cn u
dn u = √(1 − k^2 x^2)

When the Jacobian elliptic functions are extended to complex arguments, they are doubly periodic and have two poles in any parallelogram of periods; both poles are simple.

Version: 1 Owner: mathcam Author(s): Larry Hammick
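The complete integral F(k) = F(k, π/2) offers a convenient numerical cross-check, since it can also be computed from the arithmetic-geometric mean. The sketch below assumes the classical AGM identity K(k) = π/(2·agm(1, √(1 − k^2))), which is not derived in this entry:

```python
import math

def F_incomplete(k, phi, steps=2000):
    """Simpson's rule for F(k, phi) = integral of dθ/sqrt(1 - k² sin²θ)."""
    h = phi / steps
    s = 0.0
    for i in range(steps + 1):
        t = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 == 1 else 2)
        s += w / math.sqrt(1.0 - (k * math.sin(t)) ** 2)
    return s * h / 3.0

def K_agm(k):
    """Complete integral K(k) from the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15:
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return math.pi / (2.0 * a)

k = 0.8
assert abs(F_incomplete(k, math.pi / 2) - K_agm(k)) < 1e-9
```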


435.4 examples of elliptic functions

Examples of Elliptic Functions. Let Λ ⊂ C be a lattice generated by w_1, w_2. Let Λ* denote Λ − {0}.

1. The Weierstrass ℘-function is defined by the series

℘(z; Λ) = 1/z^2 + Σ_{w∈Λ*} [ 1/(z − w)^2 − 1/w^2 ]

2. The derivative of the Weierstrass ℘-function is also an elliptic function

℘'(z; Λ) = −2 Σ_{w∈Λ*} 1/(z − w)^3

3. The Eisenstein series of weight 2k for Λ is the series

G_{2k}(Λ) = Σ_{w∈Λ*} w^{−2k}

The Eisenstein series of weight 4 and 6 are of special relevance in the theory of elliptic curves. In particular, the quantities g_2 and g_3 are usually defined as follows:

g_2 = 60 · G_4(Λ),   g_3 = 140 · G_6(Λ)

Version: 3 Owner: alozano Author(s): alozano

435.5 modular discriminant

Definition 38. Let Λ ⊂ C be a lattice.

1. Let q_τ = e^{2πiτ}. The Dedekind eta function is defined to be

η(τ) = q_τ^{1/24} ∏_{n=1}^∞ (1 − q_τ^n)

The Dedekind eta function should not be confused with the Weierstrass eta function, η(w; Λ).

2. The j-invariant, as a function of lattices, is defined to be:

j(Λ) = g_2^3 / (g_2^3 − 27 g_3^2)

where g_2 and g_3 are certain multiples of the Eisenstein series of weight 4 and 6 (see this entry).

3. The ∆ function (delta function or modular discriminant) is defined to be

∆(Λ) = g_2^3 − 27 g_3^2

Let Λ_τ be the lattice generated by 1, τ. The ∆ function for Λ_τ has a product expansion

∆(τ) = ∆(Λ_τ) = (2πi)^{12} q_τ ∏_{n=1}^∞ (1 − q_τ^n)^{24} = (2πi)^{12} η(τ)^{24}
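Since q_τ is tiny when Im τ is large, the η product converges very fast. The sketch below checks a truncation of the product against the classical special value η(i) = Γ(1/4)/(2π^{3/4}), quoted here as an assumption, not derived in the entry:

```python
import cmath
import math

def dedekind_eta(tau, terms=50):
    """Truncation of eta(tau) = q^{1/24} * prod_{n>=1} (1 - q^n),
    with q = e^{2 pi i tau} (principal branch for the 1/24 power)."""
    q = cmath.exp(2j * math.pi * tau)
    prod = 1.0 + 0j
    for n in range(1, terms + 1):
        prod *= 1 - q ** n
    return cmath.exp(2j * math.pi * tau / 24) * prod

# Classical special value: eta(i) = Gamma(1/4) / (2 pi^{3/4})
approx = dedekind_eta(1j)
exact = math.gamma(0.25) / (2 * math.pi ** 0.75)
assert abs(approx - exact) < 1e-12
```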

Version: 2 Owner: alozano Author(s): alozano


Chapter 436 34-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
436.1 Liapunov function

Suppose we are given an autonomous system of first order differential equations

dx/dt = F(x, y),   dy/dt = G(x, y).

Let the origin be an isolated critical point of the above system. A function V(x, y) that is of class C^1 and satisfies V(0, 0) = 0 is called a Liapunov function if every open ball B_δ(0, 0) contains at least one point where V > 0. If there happens to exist δ* such that the function V̇, given by

V̇(x, y) = V_x(x, y) F(x, y) + V_y(x, y) G(x, y),

is positive definite in B_{δ*}(0, 0), then the origin is an unstable critical point of the system.

Version: 2 Owner: tensorking Author(s): tensorking


436.2 Lorenz equation

436.2.1 The history

The Lorenz equation was published in 1963 by a meteorologist and mathematician from MIT called Edward N. Lorenz. The paper containing the equation was titled "Deterministic non-periodic flows" and was published in the Journal of Atmospheric Science. What drove Lorenz to find the set of three dimensional ordinary differential equations was the search for an equation that would "model some of the unpredictable behavior which we normally associate with the weather" [PV]. The Lorenz equation represents the convective motion of a fluid cell which is warmed from below and cooled from above. [PV] The same system can also apply to dynamos and lasers. In addition, some of its popularity can be attributed to the beauty of its solutions. It is also important to state that the Lorenz equation has enough properties and interesting behavior that whole books are written analyzing its results.

436.2.2 The equation

The Lorenz equation is commonly defined as three coupled ordinary differential equations:

dx/dt = σ(y − x)
dy/dt = x(τ − z) − y
dz/dt = xy − βz

where the three parameters σ, τ, β are positive and are called the Prandtl number, the Rayleigh number, and a physical proportion, respectively. It is important to note that x, y, z are not spatial coordinates. The "x is proportional to the intensity of the convective motion, while y is proportional to the temperature difference between the ascending and descending currents, similar signs of x and y denoting that warm fluid is rising and cold fluid is descending. The variable z is proportional to the distortion of vertical temperature profile from linearity, a positive value indicating that the strongest gradients occur near the boundaries." [GSS]

436.2.3 Properties of the Lorenz equations

• Symmetry: The Lorenz equation has the following symmetry of ordinary differential equations:

(x, y, z) → (−x, −y, z)

This symmetry is present for all parameters of the Lorenz equation (see natural symmetry of the Lorenz equation).

• Invariance: The z-axis is invariant, meaning that a solution that starts on the z-axis (i.e. x = y = 0) will remain on the z-axis. In addition, the solution will tend toward the origin if the initial condition is on the z-axis.

• Critical points: To solve for the critical points we let

ẋ = f(x) = (σ(y − x), x(τ − z) − y, xy − βz)^T

and we solve f(x) = 0. It is clear that one of those critical points is x_0 = (0, 0, 0), and with some algebraic manipulation we determine that

x_{C1} = (√(β(τ − 1)), √(β(τ − 1)), τ − 1)   and   x_{C2} = (−√(β(τ − 1)), −√(β(τ − 1)), τ − 1)

are critical points, and they are real when τ > 1.
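The critical points can be checked numerically for the classical parameter values (a small sketch; the function name lorenz_f is illustrative):

```python
import math

def lorenz_f(state, sigma=10.0, tau=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equation."""
    x, y, z = state
    return (sigma * (y - x), x * (tau - z) - y, x * y - beta * z)

sigma, tau, beta = 10.0, 28.0, 8.0 / 3.0
r = math.sqrt(beta * (tau - 1.0))
# f vanishes at the origin and at both nontrivial critical points:
for c in [(0.0, 0.0, 0.0), (r, r, tau - 1.0), (-r, -r, tau - 1.0)]:
    assert all(abs(v) < 1e-9 for v in lorenz_f(c))
```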

436.2.4 An example
(Figure: the x solution with respect to time.)
(Figure: the y solution with respect to time.)
(Figure: the z solution with respect to time.)

The above is the solution of the Lorenz equation with parameters σ = 10, τ = 28 and β = 8/3 (which is the classical example). The initial condition of the system is (x_0, y_0, z_0) = (3, 15, 1).

436.2.5 Experimenting with octave

By changing the parameters and initial condition one can observe that some solutions will be drastically different. (This is in no way rigorous but can give an idea of the qualitative properties of the Lorenz equation.)

function y = lorenz (x, t)
  y = [10*(x(2) - x(1));
       x(1)*(28 - x(3)) - x(2);
       x(1)*x(2) - 8/3*x(3)];
endfunction

solution = lsode ("lorenz", [3; 15; 1], (0:0.01:50)');

gset parametric
gset xlabel "x"
gset ylabel "y"
gset zlabel "z"
gset nokey
gsplot solution
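For readers without octave, the same experiment can be sketched in plain Python with a classical fourth-order Runge-Kutta step; this is an illustrative reimplementation, not part of the original entry:

```python
def lorenz(state, sigma=10.0, tau=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (tau - z) - y, x * y - beta * z]

def rk4_step(f, state, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

state = [3.0, 15.0, 1.0]
trajectory = [state]
for _ in range(5000):            # integrate to t = 50 with dt = 0.01
    state = rk4_step(lorenz, state, 0.01)
    trajectory.append(state)

# The solution stays bounded on the attractor but never settles down.
assert all(abs(v) < 100 for p in trajectory for v in p)
```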

REFERENCES
[LNE] Lorenz, Edward N.: Deterministic non-periodic flows. Journal of Atmospheric Science, 1963.


[MM] Marsden, J. E., McCracken, M.: The Hopf Bifurcation and Its Applications. Springer-Verlag, New York, 1976.

[SC] Sparrow, Colin: The Lorenz Equations: Bifurcations, Chaos and Strange Attractors. Springer-Verlag, New York, 1982.

436.2.6 See also

• Paul Bourke, The Lorenz Attractor in 3D
• Tim Whitcomb, http://students.washington.edu/timw/ (If you click on the Lorenz equation phase portrait, you get to download a copy of the article [GSS].)

Version: 12 Owner: Daume Author(s): Daume

436.3 Wronskian determinant

If we have some functions f_1, f_2, . . . , f_n then the Wronskian determinant (or simply the Wronskian) W(f_1, f_2, f_3, . . . , f_n) is the determinant of the square matrix

W(f_1, . . . , f_n) = | f_1          f_2          f_3          ...  f_n          |
                      | f_1'         f_2'         f_3'         ...  f_n'         |
                      | f_1''        f_2''        f_3''        ...  f_n''        |
                      | ⋮            ⋮            ⋮            ⋱    ⋮            |
                      | f_1^(n−1)    f_2^(n−1)    f_3^(n−1)    ...  f_n^(n−1)    |

where f^(k) indicates the kth derivative of f (not exponentiation).

The Wronskian of a set of functions F is another function, which is zero over any interval where F is linearly dependent. Just as a set of vectors is said to be linearly dependent whenever one vector may be expressed as a linear combination of a finite subset of the others, a set of functions {f_1, f_2, f_3, . . . , f_n} is said to be dependent over an interval I if one of the functions can be expressed as a linear combination of a finite subset of the others, i.e.,

a_1 f_1(t) + a_2 f_2(t) + ··· + a_n f_n(t) = 0

for some a_1, a_2, . . . , a_n, not all zero, at any t ∈ I. Therefore the Wronskian can be used to determine if functions are independent. This is useful in many situations. For example, if we wish to determine if two solutions of a second-order differential equation are independent, we may use the Wronskian.

Examples. Consider the functions x^2, x, and 1. Take the Wronskian:

W = | x^2  x  1 |
    | 2x   1  0 |
    | 2    0  0 |
  = −2

Note that W is always non-zero, so these functions are independent everywhere. Consider, however, x^2 and x:

W = | x^2  x |
    | 2x   1 |
  = x^2 − 2x^2 = −x^2

Here W = 0 only when x = 0. Therefore x^2 and x are independent except at x = 0. Consider 2x^2 + 3, x^2, and 1:

W = | 2x^2 + 3  x^2  1 |
    | 4x        2x   0 |
    | 4         2    0 |
  = 8x − 8x = 0

Here W is always zero, so these functions are always dependent. This is intuitively obvious, of course, since 2x^2 + 3 = 2(x^2) + 3(1).

Version: 5 Owner: mathcam Author(s): mathcam, vampyr

436.4 dependence on initial conditions of solutions of ordinary differential equations

Let E ⊂ W where W is a normed vector space, and let f ∈ C^1(E) be a continuously differentiable map f : E → W. Furthermore consider the ordinary differential equation

ẋ = f(x)

with the initial condition x(0) = x_0. Let x(t) be the solution of the above initial value problem defined as

x : I → E

where I = [−a, a]. Then there exists δ > 0 such that every y_0 ∈ N_δ(x_0) (y_0 in the δ-neighborhood of x_0) has a unique solution y(t) to the initial value problem above, except with the initial value changed to y(0) = y_0. In addition, y(t) is a twice continuously differentiable function of t over the interval I.

Version: 1 Owner: Daume Author(s): Daume

436.5 differential equation

A differential equation is an equation involving an unknown function of one or more variables, its derivatives and the independent variables. This type of equation comes up often in many different branches of mathematics. They are also especially important in many problems in physics and engineering.

There are many types of differential equations. An ordinary differential equation (ODE) is a differential equation where the unknown function depends on a single variable. A general ODE has the form

F(x, f(x), f'(x), . . . , f^(n)(x)) = 0,    (436.5.1)

where the unknown f is usually understood to be a real or complex valued function of x, and x is usually understood to be either a real or complex variable. The order of a differential equation is the order of the highest derivative appearing in Eq. (436.5.1). In this case, assuming that F depends nontrivially on f^(n)(x), the equation is of nth order.

If a differential equation is satisfied by a function which identically vanishes (i.e. f(x) = 0 for each x in the domain of interest), then the equation is said to be homogeneous. Otherwise it is said to be nonhomogeneous (or inhomogeneous). Many differential equations can be expressed in the form L[f] = g(x), where L is a differential operator (with g(x) = 0 for the homogeneous case). If the operator L is linear in f, then the equation is said to be a linear ODE and otherwise nonlinear.

Other types of differential equations involve more complicated relations involving the unknown function. A partial differential equation (PDE) is a differential equation where the unknown function depends on more than one variable. In a delay differential equation (DDE), the unknown function depends on the state of the system at some instant in the past.

Solving differential equations is a difficult task. Three major types of approaches are possible:

• Exact methods are generally restricted to equations of low order and/or to linear systems.
• Qualitative methods do not give explicit formulas for the solutions, but provide information pertaining to the asymptotic behavior of the system.

• Finally, numerical methods allow one to construct approximate solutions.

Examples
A common example of an ODE is the equation for simple harmonic motion

d^2u/dx^2 + ku = 0.

This equation is of second order. It can be transformed into a system of two first order differential equations by introducing a variable v = du/dx. Indeed, we then have

dv/dx = −ku
du/dx = v.

A common example of a PDE is the wave equation in three dimensions

∂^2u/∂x^2 + ∂^2u/∂y^2 + ∂^2u/∂z^2 = c^2 ∂^2u/∂t^2

Version: 7 Owner: igor Author(s): jarino, igor

436.6 existence and uniqueness of solution of ordinary differential equations

Let E ⊂ W where W is a normed vector space, let f ∈ C^1(E) be a continuously differentiable map f : E → W, and let x_0 ∈ E. Then there exists an a > 0 such that the ordinary differential equation

ẋ = f(x)

with the initial condition x(0) = x_0 has a unique solution x : [−a, a] → E, which also satisfies the initial condition of the initial value problem.

Version: 3 Owner: Daume Author(s): Daume


436.7 maximal interval of existence of ordinary differential equations

Let E ⊂ W where W is a normed vector space, and let f ∈ C^1(E) be a continuously differentiable map f : E → W. Furthermore consider the ordinary differential equation

ẋ = f(x)

with the initial condition x(0) = x_0. For all x_0 ∈ E there exists a unique solution

x : I → E

where I = [−a, a], which also satisfies the initial condition of the initial value problem. Then there exists a maximal interval of existence J = (α, β) such that I ⊂ J and there exists a unique solution x : J → E.

Version: 3 Owner: Daume Author(s): Daume

436.8 method of undetermined coefficients

Given a (usually non-homogeneous) ordinary differential equation

F(x, f(x), f'(x), . . . , f^(n)(x)) = 0,

the method of undetermined coefficients is a way of finding an exact solution when a guess can be made as to the general form of the solution. In this method, the form of the solution is guessed with unknown coefficients left as variables. A typical guess might be of the form Ae^{2x} or Ax^2 + Bx + C. This can then be substituted into the differential equation and solved for the coefficients. Obviously the method requires knowing the approximate form of the solution, but for many problems this is a feasible requirement. This method is most commonly used when the formula is some combination of exponentials, polynomials, sin and cos.


Examples
Suppose we have f''(x) − 2f'(x) + f(x) − 2e^{2x} = 0. If we guess that the solution is of the form f(x) = Ae^{2x}, then we have

4Ae^{2x} − 4Ae^{2x} + Ae^{2x} − 2e^{2x} = 0

and therefore Ae^{2x} = 2e^{2x}, so A = 2, giving f(x) = 2e^{2x} as a solution.

Version: 4 Owner: Henry Author(s): Henry
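The substitution step can be mechanized as a quick check (a small sketch; the residual helper is my own naming):

```python
import math

# With f = A e^{2x} we have f' = 2A e^{2x} and f'' = 4A e^{2x}, so the
# left hand side of f'' - 2f' + f - 2e^{2x} = 0 can be evaluated directly.
def residual(A, x):
    e = math.exp(2 * x)
    f, fp, fpp = A * e, 2 * A * e, 4 * A * e
    return fpp - 2 * fp + f - 2 * e

# A = 2 solves the equation at every point:
assert all(abs(residual(2.0, x)) < 1e-9 for x in (0.0, 0.5, 1.0))
# Any other coefficient fails, confirming A = 2 is forced:
assert abs(residual(1.0, 0.0)) > 0.5
```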

436.9 natural symmetry of the Lorenz equation

The Lorenz equation has a natural symmetry defined by

(x, y, z) → (−x, −y, z).    (436.9.1)

To verify that (436.9.1) is a symmetry of an ordinary differential equation (the Lorenz equation) there must exist a 3 × 3 matrix which commutes with the differential equation. This can easily be verified by observing that the symmetry is associated with the matrix R defined as

R = | −1   0   0 |
    |  0  −1   0 |    (436.9.2)
    |  0   0   1 |

Let

ẋ = f(x) = (σ(y − x), x(τ − z) − y, xy − βz)^T    (436.9.3)

where f(x) is the Lorenz equation and x^T = (x, y, z). We proceed by showing that Rf(x) = f(Rx). Looking at the left hand side,

Rf(x) = R (σ(y − x), x(τ − z) − y, xy − βz)^T
      = (σ(x − y), x(z − τ) + y, xy − βz)^T

and now looking at the right hand side,

f(Rx) = f((−x, −y, z)^T)
      = (σ(x − y), x(z − τ) + y, xy − βz)^T.

Since the left hand side is equal to the right hand side, (436.9.1) is a symmetry of the Lorenz equation.

Version: 2 Owner: Daume Author(s): Daume

436.10 symmetry of a solution of an ordinary differential equation

Let γ be a symmetry of the ordinary differential equation and x_0 be a steady state solution of ẋ = f(x). If γx_0 = x_0 then γ is called a symmetry of the solution x_0.

Let γ be a symmetry of the ordinary differential equation and x_0(t) be a periodic solution of ẋ = f(x). If γx_0(t − t_0) = x_0(t) for a certain t_0, then (γ, t_0) is called a symmetry of the periodic solution x_0(t).

Lemma: If γ is a symmetry of the ordinary differential equation and x_0(t) is a solution (either steady state or periodic) of ẋ = f(x), then γx_0(t) is a solution of ẋ = f(x).

Proof: If x_0(t) is a solution of dx/dt = f(x), then dx_0(t)/dt = f(x_0(t)). Let's now verify that γx_0(t) is a solution, with a substitution into dx/dt = f(x). The left hand side of the equation becomes dγx_0(t)/dt = γ dx_0(t)/dt, and the right hand side of the equation becomes f(γx_0(t)) = γf(x_0(t)), since γ is a symmetry of the differential equation. Therefore the left hand side equals the right hand side, since dx_0(t)/dt = f(x_0(t)). QED

REFERENCES
[GSS] Golubsitsky, Matin. Stewart, Ian. Schaeffer, G. David: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 3 Owner: Daume Author(s): Daume

436.11 symmetry of an ordinary differential equation

Let f : R^n → R^n be a smooth function and let

ẋ = f(x)

be a system of ordinary differential equations; in addition let γ be an invertible matrix. Then γ is a symmetry of the ordinary differential equation if f(γx) = γf(x).

Example:

• natural symmetry of the Lorenz equation is a simple example of a symmetry of a differential equation.

REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 4 Owner: Daume Author(s): Daume


Chapter 437 34-01 – Instructional exposition (textbooks, tutorial papers, etc.)
437.1 second order linear differential equation with constant coefficients

Consider the second order homogeneous linear differential equation

x'' + bx' + cx = 0,    (437.1.1)

where b and c are real constants. The explicit solution is easily found using the characteristic equation method. This method, introduced by Euler, consists in seeking solutions of the form x(t) = e^{rt} for (437.1.1). Assuming a solution of this form, and substituting it into (437.1.1), gives

r^2 e^{rt} + b r e^{rt} + c e^{rt} = 0.

Thus

r^2 + br + c = 0    (437.1.2)

which is called the characteristic equation of (437.1.1). Depending on the nature of the roots r_1 and r_2 of (437.1.2), there are three cases.

• If the roots are real and distinct, then two linearly independent solutions of (437.1.1) are x_1(t) = e^{r_1 t}, x_2(t) = e^{r_2 t}.

• If the roots are real and equal, then two linearly independent solutions of (437.1.1) are x_1(t) = e^{r_1 t}, x_2(t) = t e^{r_1 t}.

• If the roots are complex conjugates of the form r_{1,2} = α ± iβ, then two linearly independent solutions of (437.1.1) are x_1(t) = e^{αt} cos βt, x_2(t) = e^{αt} sin βt.

The general solution to (437.1.1) is then constructed from these linearly independent solutions, as

φ(t) = C_1 x_1(t) + C_2 x_2(t).    (437.1.3)

Characterizing the behavior of (437.1.3) can be accomplished by studying the two dimensional linear system obtained from (437.1.1) by defining y = x':

x' = y
y' = −by − cx.    (437.1.4)

Remark that the roots of (437.1.2) are the eigenvalues of the Jacobian matrix of (437.1.4). This generalizes to the characteristic equation of a differential equation of order n and the n-dimensional system associated to it. Also note that the only equilibrium of (437.1.4) is the origin (0, 0). Suppose that c ≠ 0. Then (0, 0) is called a

1. source iff b < 0 and c > 0,
2. spiral source iff it is a source and b^2 − 4c < 0,
3. sink iff b > 0 and c > 0,
4. spiral sink iff it is a sink and b^2 − 4c < 0,
5. saddle iff c < 0,
6. center iff b = 0 and c > 0.

Version: 3 Owner: jarino Author(s): jarino

1668

Chapter 438 34A05 – Explicit solutions and reductions
438.1 separation of variables

Separation of variables is a valuable tool for solving differential equations of the form

dy/dx = f(x)g(y)

The above equation can be rearranged algebraically through Leibniz notation to separate the variables and be conveniently integrable:

dy/g(y) = f(x) dx

It follows then that

∫ dy/g(y) = F(x) + C

where F(x) is the antiderivative of f and C is a constant of integration. This gives a general form of the solution. An explicit form may be derived from an initial value.

Example: A population that is initially at 200 organisms increases at a rate of 15% each year. We then have a differential equation

dP/dt = 0.15P

The solution of this equation is relatively straightforward: we simply separate the variables algebraically and integrate:

∫ dP/P = ∫ 0.15 dt

This is just

ln P = 0.15t + C

or

P = Ce^{0.15t}

When we substitute P(0) = 200, we see that C = 200. This is where we get the general relation of exponential growth:

P(t) = P_0 e^{kt}

[more later]

Version: 2 Owner: slider142 Author(s): slider142
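The separated solution can be cross-checked against a crude Euler integration of dP/dt = 0.15P (an illustrative sketch, not part of the entry):

```python
import math

def euler_growth(t_end, dt=1e-4):
    """Euler steps for dP/dt = 0.15 P with P(0) = 200."""
    P, t = 200.0, 0.0
    while t < t_end - 1e-12:
        P += dt * 0.15 * P
        t += dt
    return P

# Compare with the separated solution P(t) = 200 e^{0.15 t} at t = 1:
exact = 200.0 * math.exp(0.15 * 1.0)
assert abs(euler_growth(1.0) - exact) / exact < 1e-3
```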

438.2 variation of parameters

The method of variation of parameters is a way of finding a particular solution to a nonhomogeneous linear differential equation. Suppose that we have an nth order linear differential operator

L[y] := y^(n) + p_1(t) y^(n−1) + ··· + p_n(t) y,    (438.2.1)

and a corresponding nonhomogeneous differential equation

L[y] = g(t).    (438.2.2)

Suppose that we know a fundamental set of solutions y_1, y_2, . . . , y_n of the corresponding homogeneous differential equation L[y_c] = 0. The general solution of the homogeneous equation is

y_c(t) = c_1 y_1(t) + c_2 y_2(t) + ··· + c_n y_n(t),    (438.2.3)

where c_1, c_2, . . . , c_n are constants. The general solution to the nonhomogeneous equation L[y] = g(t) is then

y(t) = y_c(t) + Y(t),    (438.2.4)

where Y(t) is a particular solution which satisfies L[Y] = g(t), and the constants c_1, c_2, . . . , c_n are chosen to satisfy the appropriate boundary conditions or initial conditions. The key step in using variation of parameters is to suppose that the particular solution is given by

Y(t) = u_1(t) y_1(t) + u_2(t) y_2(t) + ··· + u_n(t) y_n(t),    (438.2.5)

where u_1(t), u_2(t), . . . , u_n(t) are as yet to be determined functions (hence the name variation of parameters). To find these n functions we need a set of n independent equations. One obvious condition is that the proposed ansatz satisfies Eq. (438.2.2). Many possible additional conditions are possible; we choose the ones that make further calculations easier. Consider

the following set of n − 1 conditions:

y_1 u_1' + y_2 u_2' + ··· + y_n u_n' = 0
y_1' u_1' + y_2' u_2' + ··· + y_n' u_n' = 0
⋮
y_1^(n−2) u_1' + y_2^(n−2) u_2' + ··· + y_n^(n−2) u_n' = 0.    (438.2.6)

Now, substituting Eq. (438.2.5) into L[Y] = g(t) and using the above conditions, we can get another equation:

y_1^(n−1) u_1' + y_2^(n−1) u_2' + ··· + y_n^(n−1) u_n' = g.    (438.2.7)

So we have a system of n equations for u_1', u_2', . . . , u_n', which we can solve using Cramer's rule:

u_m'(t) = g(t) W_m(t) / W(t),   m = 1, 2, . . . , n.    (438.2.8)

Such a solution always exists since the Wronskian W = W(y_1, y_2, . . . , y_n) of the system is nowhere zero, because y_1, y_2, . . . , y_n form a fundamental set of solutions. Lastly, the term W_m is the Wronskian determinant with the mth column replaced by the column (0, 0, . . . , 0, 1). Finally, the particular solution can be written explicitly as

Y(t) = Σ_{m=1}^n y_m(t) ∫ g(t) W_m(t) / W(t) dt.    (438.2.9)
REFERENCES
1. W. E. Boyce—R. C. DiPrima. Elementary Differential Equations and Boundary Value Problems John Wiley & Sons, 6th edition, 1997.

Version: 3 Owner: igor Author(s): igor


Chapter 439 34A12 – Initial value problems, existence, uniqueness, continuous dependence and continuation of solutions
439.1 initial value problem

Consider the simple differential equation

dy/dx = x.

The solution goes by writing dy = x dx and then integrating both sides as ∫dy = ∫x dx. The solution then becomes y = x^2/2 + C, where C is any constant. Differentiating x^2/2 + 5, x^2/2 + 7 and some other examples shows that all these functions satisfy the condition given by the differential equation. So we have an infinite number of solutions.

An initial value problem is then a differential equation (ordinary or partial, or even a system) which, besides stating the relation among the derivatives, also specifies the value of the unknown solutions at certain points. This allows us to get a unique solution from the infinite number of potential ones. In our example we could add the condition y(4) = 3, turning it into an initial value problem. The general solution x^2/2 + C is now subject to the restriction

4^2/2 + C = 3

By solving for C we obtain C = −5, and so the unique solution of the system

dy/dx = x,   y(4) = 3

is y(x) = x^2/2 − 5.
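The computation above is short enough to mirror directly in code (a sketch):

```python
# Solve dy/dx = x, y(4) = 3: the general solution is y = x**2/2 + C,
# and the initial condition pins down C.
C = 3 - 4 ** 2 / 2          # = -5
y = lambda x: x ** 2 / 2 + C

assert C == -5
assert y(4) == 3
# Check dy/dx = x with a central difference at a few points:
h = 1e-6
assert all(abs((y(x + h) - y(x - h)) / (2 * h) - x) < 1e-6
           for x in (0.0, 1.0, 4.0))
```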

Version: 1 Owner: drini Author(s): drini


Chapter 440 34A30 – Linear equations and systems, general
440.1 Chebyshev equation

Chebyshev's equation is the second order linear differential equation

(1 − x^2) d^2y/dx^2 − x dy/dx + p^2 y = 0

where p is a real constant. There are two independent solutions which are given as series by:

y_1(x) = 1 − (p^2/2!) x^2 + ((p−2)p^2(p+2)/4!) x^4 − ((p−4)(p−2)p^2(p+2)(p+4)/6!) x^6 + ···

and

y_2(x) = x − ((p−1)(p+1)/3!) x^3 + ((p−3)(p−1)(p+1)(p+3)/5!) x^5 − ···

In each case, the coefficients are given by the recursion

a_{n+2} = [(n − p)(n + p) / ((n + 1)(n + 2))] a_n

with y_1 arising from the choice a_0 = 1, a_1 = 0, and y_2 arising from the choice a_0 = 0, a_1 = 1.

The series converge for |x| < 1; this is easy to see from the ratio test and the recursion formula above.

When p is a non-negative integer, one of these series will terminate, giving a polynomial solution. If p ≥ 0 is even, then the series for y_1 terminates at x^p. If p is odd, then the series for y_2 terminates at x^p.

These polynomials are, up to multiplication by a constant, the Chebyshev polynomials. These are the only polynomial solutions of the Chebyshev equation. (In fact, polynomial solutions are also obtained when p is a negative integer, but these are not new solutions, since the Chebyshev equation is invariant under the substitution of p by −p.) Version: 3 Owner: mclase Author(s): mclase
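The recursion and its termination can be checked with exact rational arithmetic (a sketch; the function name is illustrative):

```python
from fractions import Fraction

def chebyshev_series_coeffs(p, nmax=10):
    """Coefficients a_n from a_{n+2} = (n-p)(n+p)/((n+1)(n+2)) a_n,
    with a0 = 1, a1 = 0 (the y1 series)."""
    a = [Fraction(0)] * (nmax + 1)
    a[0] = Fraction(1)
    for n in range(nmax - 1):
        a[n + 2] = Fraction((n - p) * (n + p), (n + 1) * (n + 2)) * a[n]
    return a

# For p = 2 the series terminates: y1 = 1 - 2x^2, which is -T2(x)
# where T2(x) = 2x^2 - 1 is the degree-2 Chebyshev polynomial.
a = chebyshev_series_coeffs(2)
assert a[0] == 1 and a[2] == -2
assert all(a[n] == 0 for n in range(3, 11))
```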


Chapter 441 34A99 – Miscellaneous
441.1 autonomous system

A system of ordinary differential equations is autonomous when it does not depend on time (does not depend on the independent variable), i.e. ẋ = f(x). In contrast, it is nonautonomous when the system of ordinary differential equations does depend on time (does depend on the independent variable), i.e. ẋ = f(x, t).

It can be noted that every nonautonomous system can be converted to an autonomous system by adding a dimension: if ẋ = f(x, t) with x ∈ R^n, then it can be written as an autonomous system with x ∈ R^{n+1} by doing a substitution with x_{n+1} = t and ẋ_{n+1} = 1.

Version: 1 Owner: Daume Author(s): Daume


Chapter 442 34B24 – Sturm-Liouville theory
442.1 eigenfunction

Consider the Sturm-Liouville system given by

d/dx [ p(x) dy/dx ] + q(x) y + λ r(x) y = 0,   a ≤ x ≤ b    (442.1.1)

a_1 y(a) + a_2 y'(a) = 0,   b_1 y(b) + b_2 y'(b) = 0,    (442.1.2)

where a_i, b_i ∈ R with i ∈ {1, 2}, p(x), q(x), r(x) are differentiable functions, and λ ∈ R. A non-zero solution of the system defined by (442.1.1) and (442.1.2) exists in general for a specified λ. The functions corresponding to that specified λ are called eigenfunctions.

More generally, if D is some linear differential operator, λ ∈ R, and f is a function such that Df = λf, then we say f is an eigenfunction of D with eigenvalue λ.

Version: 5 Owner: tensorking Author(s): tensorking
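The general definition can be illustrated with D = d^2/dx^2: each f(x) = sin(nx) is an eigenfunction with eigenvalue −n^2. A quick finite-difference check (an illustrative sketch, not part of the entry):

```python
import math

def second_diff(f, x, h=1e-4):
    """Central second difference approximating f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

n = 3
f = lambda x: math.sin(n * x)
# D f = -n^2 f, so f''(x) + n^2 f(x) should vanish:
for x in (0.3, 1.0, 2.0):
    assert abs(second_diff(f, x) + n * n * f(x)) < 1e-4
```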


Chapter 443 34C05 – Location of integral curves, singular points, limit cycles
443.1 Hopf bifurcation theorem

Consider a planar system of ordinary differential equations, written in such a form as to make explicit the dependence on a parameter µ:

ẋ = f1(x, y, µ)
ẏ = f2(x, y, µ)

Assume that this system has the origin as an equilibrium for all µ. Suppose that the linearization Df at zero has the two purely imaginary eigenvalues λ1(µ) and λ2(µ) when µ = µc. If the real part of the eigenvalues satisfies

d/dµ (Re(λ1,2(µ)))|_{µ=µc} > 0

and the origin is asymptotically stable at µ = µc, then

1. µc is a bifurcation point;
2. for some µ1 ∈ R such that µ1 < µ < µc, the origin is a stable focus;
3. for some µ2 ∈ R such that µc < µ < µ2, the origin is unstable, surrounded by a stable limit cycle whose size increases with µ.

This is a simplified version of the theorem, corresponding to a supercritical Hopf bifurcation. Version: 1 Owner: jarino Author(s): jarino
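A standard illustration (our own example, not part of the entry) is the radial part of the supercritical normal form, ṙ = µr − r³: for µ > 0 trajectories started inside and outside both settle on the limit cycle of radius √µ, whose size indeed grows with µ.

```python
def radial_flow(mu, r0, t=50.0, steps=50000):
    """Forward-Euler integration of the radial equation r' = mu*r - r**3."""
    h = t / steps
    r = r0
    for _ in range(steps):
        r += h * (mu * r - r ** 3)
    return r

mu = 0.25
inner = radial_flow(mu, 0.1)   # starts inside the cycle
outer = radial_flow(mu, 1.0)   # starts outside the cycle
# Both settle on the stable limit cycle of radius sqrt(mu) = 0.5.
```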

443.2 Poincare-Bendixson theorem

Let M be an open subset of R², and f ∈ C¹(M, R²). Consider the planar differential equation ẋ = f(x), and consider a fixed x ∈ M. Suppose that the omega limit set ω(x) ≠ ∅ is compact, connected, and contains only finitely many equilibria. Then one of the following holds:

1. ω(x) is a fixed orbit (a periodic point with period zero, i.e., an equilibrium).
2. ω(x) is a regular periodic orbit.
3. ω(x) consists of (finitely many) equilibria {xj} and non-closed orbits γ(y) such that ω(y) ⊂ {xj} and α(y) ⊂ {xj} (where α(y) is the alpha limit set of y).

The same result holds when replacing omega limit sets by alpha limit sets. Since f was chosen such that existence and uniqueness hold, and since the system is planar, the Jordan curve theorem implies that it is not possible for orbits of the system satisfying the hypotheses to have complicated behaviors. A typical use of this theorem is to prove that an equilibrium is globally asymptotically stable (after using a Dulac type result to rule out periodic orbits). Version: 1 Owner: jarino Author(s): jarino

443.3 omega limit set

Let Φ(t, x) be the flow of the differential equation ẋ = f(x), where f ∈ C^k(M, Rn), with k ≥ 1 and M an open subset of Rn. Consider x ∈ M. The omega limit set of x, denoted ω(x), is the set of points y ∈ M such that there exists a sequence tn → ∞ with Φ(tn, x) → y. Similarly, the alpha limit set of x, denoted α(x), is the set of points y ∈ M such that there exists a sequence tn → −∞ with Φ(tn, x) → y. Note that the definition is the same for more general dynamical systems. Version: 1 Owner: jarino Author(s): jarino


Chapter 444 34C07 – Theory of limit cycles of polynomial and analytic vector fields (existence, uniqueness, bounds, Hilbert's 16th problem and ramifications)
444.1 Hilbert’s 16th problem for quadratic vector fields

Find the maximum number H(2) of limit cycles, and their relative positions, for a vector field

ẋ = p(x, y) = Σ_{0≤i+j≤2} aij x^i y^j
ẏ = q(x, y) = Σ_{0≤i+j≤2} bij x^i y^j

[DRR] As of now, neither part of the problem (i.e. the bound and the positions of the limit cycles) is solved, although R. Bamón showed in 1986 [BR] that a quadratic vector field has a finite number of limit cycles, and in 1980 Shi Songling gave [SS] an example of a quadratic vector field which has four limit cycles (i.e. H(2) ≥ 4).

REFERENCES
[DRR] Dumortier, F., Roussarie, R., Rousseau, C.: Hilbert's 16th Problem for Quadratic Vector Fields. Journal of Differential Equations 110, 86-133, 1994.
[BR] Bamón, R.: Quadratic vector fields in the plane have a finite number of limit cycles, Publ. I.H.E.S. 64 (1986), 111-142.


[SS] Shi Songling, A concrete example of the existence of four limit cycles for plane quadratic systems, Scientia Sinica 23 (1980), 154-158.

Version: 6 Owner: Daume Author(s): Daume


Chapter 445 34C23 – Bifurcation
445.1 equivariant branching lemma

Let Γ be a Lie group acting absolutely irreducibly on V, and let g ∈ E(Γ) (where E(Γ) is the space of Γ-equivariant germs, at the origin, of C∞ mappings of V into V) be a bifurcation problem with symmetry group Γ. Since V is absolutely irreducible, the Jacobian matrix is (dg)_{0,λ} = c(λ)I; we suppose that c′(0) ≠ 0. Let Σ be an isotropy subgroup satisfying dim Fix(Σ) = 1. Then there exists a unique smooth solution branch to g = 0 such that the isotropy subgroup of each solution is Σ. [GSS]

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 2 Owner: Daume Author(s): Daume


Chapter 446 34C25 – Periodic solutions
446.1 Bendixson's negative criterion

Let ẋ = f(x) be a planar dynamical system, where f = (X, Y)^t and x = (x, y)^t. Furthermore, f ∈ C¹(E), where E is a simply connected region of the plane. If ∂X/∂x + ∂Y/∂y (the divergence ∇·f of the vector field f) is always of the same sign but not identically zero, then there are no periodic solutions of the planar system in the region E. Version: 1 Owner: Daume Author(s): Daume

446.2 Dulac's criteria

Let ẋ = f(x) be a planar dynamical system, where f = (X, Y)^t and x = (x, y)^t. Furthermore, f ∈ C¹(E), where E is a simply connected region of the plane. If there exists a function p(x, y) ∈ C¹(E) such that ∂(p(x, y)X)/∂x + ∂(p(x, y)Y)/∂y (the divergence ∇·(p(x, y)f) of the vector field p(x, y)f) is always of the same sign but not identically zero, then there are no periodic solutions of the planar system in the region E. In addition, if A is an annular region contained in E on which the above condition is satisfied, then there exists at most one periodic solution in A. Version: 1 Owner: Daume Author(s): Daume
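As a minimal numerical check of Bendixson's criterion (our example), take the damped oscillator written as the planar system ẋ = y, ẏ = −x − y: the divergence is identically −1, so no periodic orbits can exist in the plane.

```python
def X(x, y):
    """First component of the damped-oscillator vector field."""
    return y

def Y(x, y):
    """Second component: -x - y (the -y term is the damping)."""
    return -x - y

def divergence(x, y, h=1e-6):
    """Central-difference approximation of dX/dx + dY/dy."""
    return ((X(x + h, y) - X(x - h, y)) + (Y(x, y + h) - Y(x, y - h))) / (2 * h)

# Sample the divergence on a grid covering a simply connected region:
samples = [divergence(0.5 * i, 0.5 * j)
           for i in range(-10, 11) for j in range(-10, 11)]
# Every sample equals -1: one sign, never zero, hence no periodic orbits.
```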

446.3 proof of Bendixson's negative criterion

Suppose that there exists a periodic solution Γ which has period T and lies in E. Let the interior of Γ be denoted by D. Then by Green's theorem we observe that

∬_D ∇·f dx dy = ∬_D (∂X/∂x + ∂Y/∂y) dx dy = ∮_Γ (X dy − Y dx) = ∫_0^T (X ẏ − Y ẋ) dt = ∫_0^T (XY − YX) dt = 0.

Since ∇·f is not identically zero by hypothesis and is of one sign, the double integral on the left must be nonzero and of that sign. This leads to a contradiction, since the right hand side is equal to zero. Therefore there does not exist a periodic solution in the simply connected region E. Version: 1 Owner: Daume Author(s): Daume


Chapter 447 34C99 – Miscellaneous
447.1 Hartman-Grobman theorem
ẋ = f(x)  (447.1.1)

Consider the differential equation (447.1.1), where f is a C¹ vector field. Assume that x0 is a hyperbolic equilibrium of f. Denote by Φt(x) the flow of (447.1.1) through x at time t. Then there exists a homeomorphism ϕ(x) = x + h(x), with h bounded, such that ϕ ◦ e^{tDf(x0)} = Φt ◦ ϕ in a sufficiently small neighborhood of x0. This fundamental theorem in the qualitative analysis of nonlinear differential equations states that, in a small neighborhood of x0, the flow of the nonlinear equation (447.1.1) is qualitatively similar to that of the linear system ẋ = Df(x0)x. Version: 1 Owner: jarino Author(s): jarino

447.2 equilibrium point

Consider an autonomous differential equation

ẋ = f(x).  (447.2.1)

An equilibrium (point) x0 of (447.2.1) is such that f(x0) = 0. If the linearization Df(x0) has no eigenvalue with zero real part, x0 is said to be a hyperbolic equilibrium, whereas if there exists an eigenvalue with zero real part, the equilibrium point is nonhyperbolic. Version: 5 Owner: Daume Author(s): Daume, jarino
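In the planar case, hyperbolicity is easy to test numerically from the trace and determinant of the Jacobian (a sketch of ours; helper names are hypothetical): the eigenvalues of a 2×2 matrix come from its characteristic polynomial, and the equilibrium is hyperbolic exactly when no eigenvalue has zero real part.

```python
import cmath

def jacobian_2x2(f, x0, h=1e-6):
    """Forward-difference Jacobian of a planar vector field f at x0."""
    fx0 = f(x0)
    cols = []
    for j in range(2):
        xp = list(x0)
        xp[j] += h
        fxp = f(xp)
        cols.append([(fxp[i] - fx0[i]) / h for i in range(2)])
    # transpose so that entry [i][j] = d f_i / d x_j
    return [[cols[j][i] for j in range(2)] for i in range(2)]

def is_hyperbolic(J, tol=1e-8):
    """True iff no eigenvalue of the 2x2 matrix J has zero real part."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    eigs = [(tr + disc) / 2, (tr - disc) / 2]
    return all(abs(l.real) > tol for l in eigs)

damped = lambda x: [x[1], -x[0] - 0.5 * x[1]]  # spiral sink: hyperbolic
center = lambda x: [x[1], -x[0]]               # eigenvalues +/- i: not
J1 = jacobian_2x2(damped, [0.0, 0.0])
J2 = jacobian_2x2(center, [0.0, 0.0])
```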

447.3 stable manifold theorem

Let E be an open subset of Rn containing the origin, let f ∈ C¹(E), and let φt be the flow of the nonlinear system ẋ = f(x). Suppose that f(x0) = 0 and that Df(x0) has k eigenvalues with negative real part and n − k eigenvalues with positive real part. Then there exists a k-dimensional differentiable manifold S, tangent at x0 to the stable subspace E^S of the linear system ẋ = Df(x0)x, such that for all t ≥ 0, φt(S) ⊂ S, and for all y ∈ S,

lim_{t→∞} φt(y) = x0;

and there exists an (n − k)-dimensional differentiable manifold U, tangent at x0 to the unstable subspace E^U of ẋ = Df(x0)x, such that for all t ≤ 0, φt(U) ⊂ U, and for all y ∈ U,

lim_{t→−∞} φt(y) = x0.

Version: 1 Owner: jarino Author(s): jarino


Chapter 448 34D20 – Lyapunov stability
448.1 Lyapunov stable

A fixed point is Lyapunov stable if trajectories of nearby points remain close for future time. More formally, the fixed point x* is Lyapunov stable if for any ε > 0 there is a δ > 0 such that for all t ≥ 0 and all x such that d(x, x*) < δ, we have d(x(t), x*) < ε. Version: 2 Owner: armbrusterb Author(s): yark, armbrusterb

448.2 neutrally stable fixed point

A fixed point is considered neutrally stable if it is Lyapunov stable but not attracting. A center is an example of such a fixed point. Version: 3 Owner: armbrusterb Author(s): Johan, armbrusterb

448.3 stable fixed point

Let X be a vector field on a manifold M. A fixed point of X is said to be stable if it is both attracting and Lyapunov stable. Version: 5 Owner: alinabi Author(s): alinabi, yark, armbrusterb


Chapter 449 34L05 – General spectral theory
449.1 Gelfand spectral radius theorem

For every self-consistent matrix norm || · || and every square matrix A we can write

ρ(A) = lim_{n→∞} ||A^n||^{1/n}.

Note: ρ(A) denotes the spectral radius of A. Version: 4 Owner: Johan Author(s): Johan
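The limit can be observed numerically. The sketch below (a 2×2 example of ours, using the maximum-row-sum norm, which is submultiplicative) compares ||A^n||^{1/n} at moderate n against the exact spectral radius obtained from the characteristic polynomial.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inf_norm(A):
    """Maximum absolute row sum: a submultiplicative matrix norm."""
    return max(abs(a) + abs(b) for a, b in A)

A = [[0.5, 0.4],
     [0.1, 0.3]]
# Exact spectral radius from the characteristic polynomial of A:
tr = 0.5 + 0.3
det = 0.5 * 0.3 - 0.4 * 0.1
rho = (tr + (tr * tr - 4 * det) ** 0.5) / 2

# Gelfand's formula: ||A^n||**(1/n) approaches rho(A); take n = 60.
P = A
for _ in range(59):
    P = mat_mul(P, A)
approx = inf_norm(P) ** (1.0 / 60)
```

Since ||A^n|| ≥ ρ(A)^n for any such norm, the approximation converges to ρ(A) from above.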


Chapter 450 34L15 – Estimation of eigenvalues, upper and lower bounds
450.1 Rayleigh quotient

The Rayleigh quotient R_A of the Hermitian matrix A is defined as

R_A(x) = (x^H A x)/(x^H x),  x ≠ 0.

Version: 1 Owner: Johan Author(s): Johan
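For a real symmetric matrix the quotient always lies between the extreme eigenvalues, with equality exactly at the corresponding eigenvectors. A small sketch (our example, real case so that x^H is just the transpose):

```python
def rayleigh(A, x):
    """Rayleigh quotient x^T A x / x^T x for a real symmetric 2x2 A."""
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    num = sum(x[i] * Ax[i] for i in range(2))
    den = sum(xi * xi for xi in x)
    return num / den

A = [[2.0, 1.0],
     [1.0, 2.0]]        # eigenvalues 1 and 3, eigenvectors (1,-1), (1,1)

r_min = rayleigh(A, [1.0, -1.0])   # equals the smallest eigenvalue, 1
r_max = rayleigh(A, [1.0, 1.0])    # equals the largest eigenvalue, 3
r_mid = rayleigh(A, [2.0, 1.0])    # any other x lands strictly between
```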


Chapter 451 34L40 – Particular operators (Dirac, one-dimensional Schr¨dinger, etc.) o
451.1 Dirac delta function

The Dirac delta "function" δ(x) is not a true function, since it cannot be defined completely by giving the function value for all values of the argument x. Similar to the Kronecker delta, the notation δ(x) stands for

δ(x) = 0 for x ≠ 0, and ∫_{−∞}^{∞} δ(x) dx = 1.

For any continuous function F:

∫_{−∞}^{∞} δ(x) F(x) dx = F(0),

or in n dimensions:

∫_{Rn} δ(x − s) f(s) d^n s = f(x).

δ(x) can also be defined as a normalized Gaussian function (normal distribution) in the limit of zero width.

References

• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)

Version: 2 Owner: akrowne Author(s): akrowne
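The Gaussian-limit description can be checked numerically: integrating a narrowing normal density against a continuous test function approaches F(0). The sketch below (trapezoid rule; helper names are ours) uses F = cos, for which the exact value is e^{−ε²/2} → 1 as the width ε shrinks.

```python
import math

def gaussian_delta(x, eps):
    """Normal density with standard deviation eps: a nascent delta."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def integrate(g, a, b, n=20000):
    """Plain trapezoid rule on [a, b]."""
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))
    return s * h

F = math.cos                      # a continuous test function, F(0) = 1
vals = []
for eps in (0.5, 0.1, 0.02):
    vals.append(integrate(lambda x, e=eps: gaussian_delta(x, e) * F(x),
                          -5.0, 5.0))
# vals increases toward F(0) = 1 as the width eps decreases.
```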

451.2 construction of Dirac delta function

The Dirac delta function is notorious in mathematical circles for having no actual realization as a function. However, a little known secret is that in the domain of nonstandard analysis, the Dirac delta function admits a completely legitimate construction as an actual function. We give this construction here.

Choose any positive infinitesimal ε and define the hyperreal valued function δ : *R → *R by

δ(x) := 1/ε if −ε/2 < x < ε/2, and δ(x) := 0 otherwise.

We verify that the above function satisfies the required properties of the Dirac delta function. By definition, δ(x) = 0 for all nonzero real numbers x. Moreover,

∫_{−∞}^{∞} δ(x) dx = ∫_{−ε/2}^{ε/2} (1/ε) dx = 1,

so the integral property is satisfied. Finally, for any continuous real function f : R → R, choose an infinitesimal z > 0 such that |f(x) − f(0)| < z for all |x| < ε/2; then

ε · (f(0) − z)/ε < ∫_{−∞}^{∞} δ(x) f(x) dx < ε · (f(0) + z)/ε,

which implies that ∫_{−∞}^{∞} δ(x) f(x) dx is within an infinitesimal of f(0), and thus has real part equal to f(0). Version: 2 Owner: djao Author(s): djao


Chapter 452 35-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
452.1 differential operator

Roughly speaking, a differential operator is a mapping, typically understood to be linear, that transforms a function into another function by means of partial derivatives and multiplication by other functions. On Rn, a differential operator is commonly understood to be a linear transformation of C∞(Rn) having the form

f → Σ_I a_I f_I,  f ∈ C∞(Rn),

where the sum is taken over a finite number of multi-indices I = (i1, . . . , in) ∈ Nn, where a_I ∈ C∞(Rn), and where f_I denotes a partial derivative of f taken i1 times with respect to the first variable, i2 times with respect to the second variable, etc. The order of the operator is the maximum number of derivatives taken in the above formula, i.e. the maximum of i1 + . . . + in taken over all the I involved in the above summation.

On a C∞ manifold M, a differential operator is commonly understood to be a linear transformation of C∞(M) having the above form relative to some system of coordinates. Alternatively, one can equip C∞(M) with the limit-order topology, and define a differential operator as a continuous transformation of C∞(M). The order of a differential operator is a more subtle notion on a manifold than on Rn. There are two complications. First, one would like a definition that is independent of any particular system of coordinates. Furthermore, the order of an operator is at best a local concept: it can change from point to point, and indeed be unbounded if the manifold is non-compact.

To address these issues, for a differential operator T and x ∈ M, we define ordx(T), the order of T at x, to be the smallest k ∈ N such that T[f^{k+1}](x) = 0 for all f ∈ C∞(M) such that f(x) = 0. For a fixed differential operator T, the function ord(T) : M → N defined by x → ordx(T) is lower semi-continuous, meaning that ordy(T) ≥ ordx(T) for all y ∈ M sufficiently close to x. The global order of T is defined to be the maximum of ordx(T) taken over all x ∈ M. This maximum may not exist if M is non-compact, in which case one says that the order of T is infinite.

Let us conclude by making two remarks. The notion of a differential operator can be generalized even further by allowing the operator to act on sections of a bundle. A differential operator T is a local operator, meaning that

T[f](x) = T[g](x),  f, g ∈ C∞(M), x ∈ M,

if f = g in some neighborhood of x. A theorem proved by Peetre states that the converse is also true, namely that every local operator is necessarily a differential operator.

References

1. Dieudonné, J.A., Foundations of modern analysis
2. Peetre, J., "Une caractérisation abstraite des opérateurs différentiels", Math. Scand., v. 7, 1959, p. 211

Version: 5 Owner: rmilson Author(s): rmilson


Chapter 453 35J05 – Laplace equation, reduced wave equation (Helmholtz), Poisson equation
453.1 Poisson’s equation

Poisson's equation is a second-order partial differential equation which arises in physical problems such as finding the electrical potential of a given charge distribution. Its general form in n dimensions is

∇²φ(r) = ρ(r),

where ∇² is the Laplacian and ρ : D → R, often called a source function, is a given function on some subset D of Rn. If ρ is identically zero, the Poisson equation reduces to the Laplace equation. The Poisson equation is linear, and therefore obeys the superposition principle: if ∇²φ1 = ρ1 and ∇²φ2 = ρ2, then ∇²(φ1 + φ2) = ρ1 + ρ2. This fact can be used to construct solutions to Poisson's equation from fundamental solutions, where the source distribution is a delta function. A very important case is the one in which n = 3, D is all of R³, and φ(r) → 0 as |r| → ∞. The general solution is then given by

φ(r) = −(1/4π) ∫_{R³} ρ(r′)/|r − r′| d³r′.

Version: 2 Owner: pbruin Author(s): pbruin
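In one dimension, φ″ = ρ with homogeneous Dirichlet data reduces under finite differences to a tridiagonal linear system. The sketch below (our own helper, solved with the Thomas algorithm) treats ρ ≡ 1 on (0, 1), whose exact solution is φ(x) = x(x − 1)/2.

```python
def solve_poisson_1d(rho, n=200):
    """Solve phi'' = rho on (0,1), phi(0)=phi(1)=0, by central
    finite differences (tridiagonal system, Thomas algorithm)."""
    h = 1.0 / n
    m = n - 1                      # interior unknowns phi_1 .. phi_{n-1}
    a, b, c = [1.0] * m, [-2.0] * m, [1.0] * m
    d = [rho((i + 1) * h) * h * h for i in range(m)]
    for i in range(1, m):          # forward elimination
        w = a[i] / b[i - 1]
        b[i] -= w * c[i - 1]
        d[i] -= w * d[i - 1]
    phi = [0.0] * m                # back substitution
    phi[-1] = d[-1] / b[-1]
    for i in range(m - 2, -1, -1):
        phi[i] = (d[i] - c[i] * phi[i + 1]) / b[i]
    return phi

phi = solve_poisson_1d(lambda x: 1.0)
mid = phi[len(phi) // 2]           # value at x = 1/2; exact: -0.125
```

Because the exact solution is quadratic, the central difference is exact here and the discrete solution matches it to rounding error.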


Chapter 454 35L05 – Wave equation
454.1 wave equation

The wave equation is a partial differential equation which describes all kinds of waves. It arises in various physical situations, such as vibrating strings, sound waves, and electromagnetic waves. The wave equation in one dimension is

∂²u/∂t² = c² ∂²u/∂x².

The general solution of the one-dimensional wave equation can be obtained by a change of variables (x, t) → (ξ, η), where ξ = x − ct and η = x + ct. This gives ∂²u/∂ξ∂η = 0, which we can integrate to get d'Alembert's solution:

u(x, t) = F(x − ct) + G(x + ct),

where F and G are twice differentiable functions. F and G represent waves travelling in the positive and negative x directions, respectively, with velocity c. These functions can be obtained if appropriate starting or boundary conditions are given. For example, if u(x, 0) = f(x) and ∂u/∂t (x, 0) = g(x) are given, the solution is

u(x, t) = (1/2)[f(x − ct) + f(x + ct)] + (1/2c) ∫_{x−ct}^{x+ct} g(s) ds.

In general, the wave equation in n dimensions is

∂²u/∂t² = c² ∇²u,

where u is a function of the location variables x1, x2, . . . , xn, and time t. Here, ∇² is the Laplacian with respect to the location variables, which in Cartesian coordinates is given by ∇² = ∂²/∂x1² + ∂²/∂x2² + · · · + ∂²/∂xn².


Version: 4 Owner: pbruin Author(s): pbruin
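That the d'Alembert combination F(x − ct) + G(x + ct) solves the equation for any twice differentiable F and G can be spot-checked numerically (our choice of profiles; central second differences):

```python
import math

c = 2.0
F = lambda s: math.exp(-s * s)      # right-moving profile
G = lambda s: math.sin(s)           # left-moving profile
u = lambda x, t: F(x - c * t) + G(x + c * t)

def second_diff(g, z, h=1e-4):
    """Central second-difference approximation of g''(z)."""
    return (g(z + h) - 2 * g(z) + g(z - h)) / (h * h)

# Check u_tt = c^2 u_xx at a few sample points:
residuals = []
for x, t in [(0.3, 0.1), (-1.2, 0.7), (2.0, 1.5)]:
    utt = second_diff(lambda s: u(x, s), t)
    uxx = second_diff(lambda s: u(s, t), x)
    residuals.append(abs(utt - c * c * uxx))
```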


Chapter 455 35Q53 – KdV-like equations (Korteweg-de Vries, Burgers, sine-Gordon, sinh-Gordon, etc.)
455.1 Korteweg - de Vries equation

The Korteweg - de Vries equation is

ut = u ux + uxxx,  (455.1.1)

where u = u(x, t) and the subscripts indicate partial derivatives. Version: 4 Owner: superhiggs Author(s): superhiggs


Chapter 456 35Q99 – Miscellaneous
456.1 heat equation

The heat equation in 1 dimension (for example, along a metal wire) is a partial differential equation of the following form:

∂u/∂t = c² · ∂²u/∂x²,

also written as ut = c² · uxx, where u : R² → R is the function giving the temperature at time t and position x, and c is a real valued constant. This can be easily extended to 2 or 3 dimensions as

ut = c² · (uxx + uyy)  and  ut = c² · (uxx + uyy + uzz).

Note that in the steady state, that is when ut = 0, we are left with the Laplacian of u: ∆u = 0. Version: 2 Owner: dublisk Author(s): dublisk
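A minimal explicit (FTCS) discretization of ut = c²uxx (our sketch; the ratio r = c²∆t/∆x² must be at most 1/2 for stability) shows the expected decay of a sine profile toward the steady state, close to the exact first-mode factor e^{−π²c²t}:

```python
import math

def heat_step(u, r):
    """One explicit (FTCS) update of u_t = c^2 u_xx with
    r = c^2 dt/dx^2, Dirichlet boundary values held at 0."""
    return [0.0] + [u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
                    for i in range(1, len(u) - 1)] + [0.0]

n, r = 50, 0.25                    # domain [0,1], dx = 1/50
u = [math.sin(math.pi * i / n) for i in range(n + 1)]
peak0 = max(u)                     # = 1.0
for _ in range(200):               # total c^2 * t = 200 * r * dx^2 = 0.02
    u = heat_step(u, r)
peak = max(u)
# Heat flows from hot to cold: the profile decays toward u = 0 and
# never exceeds its initial maximum (a discrete maximum principle).
```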


Chapter 457 37-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)


Chapter 458 37A30 – Ergodic theorems, spectral theory, Markov operators
458.1 ergodic

Let (X, B, µ) be a probability space, and T : X → X be a measure-preserving transformation. We call T ergodic if for A ∈ B,

T A = A ⇒ µ(A) = 0 or µ(A) = 1.  (458.1.1)

That is, T takes almost all sets all over the space. The only sets it doesn’t move are some sets of measure zero and the entire space. Version: 2 Owner: drummond Author(s): drummond
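A concrete ergodic transformation (our example, not from the entry) is an irrational rotation of the circle with Lebesgue measure. By the Birkhoff ergodic theorem, its time averages converge to space averages, which a short simulation makes visible:

```python
import math

alpha = math.sqrt(2) % 1.0        # irrational rotation number

def T(x):
    """Rotation of the circle [0,1): measure preserving, and ergodic
    with respect to Lebesgue measure because alpha is irrational."""
    return (x + alpha) % 1.0

observable = lambda x: x          # space average over [0,1) is 1/2
x, total, n = 0.3, 0.0, 100000
for _ in range(n):
    total += observable(x)
    x = T(x)
time_average = total / n          # approaches 1/2
```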

458.2 fundamental theorem of demography

Let At be a sequence of n × n nonnegative primitive matrices. Suppose that At → A∞, with A∞ also a nonnegative primitive matrix. Define the sequence x_{t+1} = At xt, with xt ∈ Rn. If x0 > 0, then

lim_{t→∞} xt/||xt|| = p,

where p is the normalized (||p|| = 1) eigenvector associated to the dominant eigenvalue of A∞ (also called the Perron-Frobenius eigenvector of A∞). Version: 3 Owner: jarino Author(s): jarino
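The convergence is easy to watch numerically. The sketch below (a 2×2 example of ours, with a perturbation that vanishes like 1/t) normalizes the iterates and checks that the limiting direction is fixed by A∞, i.e. that it approximates the Perron-Frobenius eigenvector:

```python
def mat_vec(A, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def normalize(x):
    """l1 normalization (entries here stay positive)."""
    s = sum(abs(xi) for xi in x)
    return [xi / s for xi in x]

# A_t -> A_inf: a primitive matrix plus a perturbation decaying like 1/t.
A_inf = [[0.5, 1.2],
         [0.8, 0.1]]

x = [1.0, 1.0]
for t in range(1, 400):
    A_t = [[A_inf[i][j] + 0.3 / t for j in range(2)] for i in range(2)]
    x = normalize(mat_vec(A_t, x))

# If x is (close to) the Perron direction of A_inf, applying A_inf and
# renormalizing should reproduce it:
residual = max(abs(a - b) for a, b in zip(normalize(mat_vec(A_inf, x)), x))
```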


458.3 proof of fundamental theorem of demography

• First we will prove that there exist m, M > 0 such that

m ≤ ||x_{k+1}||/||x_k|| ≤ M, ∀k,  (458.3.1)

with m and M independent of the sequence. In order to show this we use the primitivity of the matrices A_k and A∞. Primitivity of A∞ implies that there exists l ∈ N such that A∞^l > 0. By continuity, this implies that there exists k0 such that, for all k ≥ k0, we have A_{k+l} A_{k+l−1} · · · A_k > 0. Let us then write x_{k+l+1} as a function of x_k:

x_{k+l+1} = A_{k+l} · · · A_k x_k.

We thus have

||x_{k+l+1}|| ≤ C^{l+1} ||x_k||.  (458.3.2)

But since the matrices A_{k+l}, . . . , A_k are strictly positive for k ≥ k0, there exists an ε > 0 such that each component of these matrices is superior or equal to ε. From this we deduce that

∀k ≥ k0, ||x_{k+l+1}|| ≥ ε ||x_k||.

Applying relation (458.3.2), we then have that C^l ||x_{k+1}|| ≥ ε ||x_k||, which yields

||x_{k+1}|| ≥ (ε/C^l) ||x_k||,

and so we indeed have relation (458.3.1).

• Let us denote by e_k the (normalised) Perron eigenvector of A_k. Thus

A_k e_k = λ_k e_k,  ||e_k|| = 1.

Let us denote by π_k the projection on the supplementary space of {e_k} invariant by A_k. Choosing a proper norm, we can find ε > 0 such that

|A_k π_k| ≤ (λ_k − ε), ∀k ≥ k0.

• We shall now prove that

⟨e*_{k+1}, x_{k+1}⟩ / ⟨e*_k, x_k⟩ → λ∞ when k → ∞.

In order to do this, we compute the inner product of the sequence x_{k+1} = A_k x_k with the e_k's:

⟨e*_{k+1}, x_{k+1}⟩ = ⟨e*_{k+1} − e*_k, A_k x_k⟩ + λ_k ⟨e*_k, x_k⟩ = o(⟨e*_k, x_k⟩) + λ_k ⟨e*_k, x_k⟩.

Therefore we have

⟨e*_{k+1}, x_{k+1}⟩ = o(1) + λ_k ⟨e*_k, x_k⟩.

• Now consider

u_k = π_k x_k / ⟨e*_k, x_k⟩.

We will verify that u_k → 0 when k → ∞. We have

u_{k+1} = (π_{k+1} − π_k) A_k x_k/⟨e*_{k+1}, x_{k+1}⟩ + (A_k π_k x_k/⟨e*_k, x_k⟩) · ⟨e*_k, x_k⟩/⟨e*_{k+1}, x_{k+1}⟩,

and so

|u_{k+1}| ≤ ||π_{k+1} − π_k|| C + (λ_k − ε) |u_k| ⟨e*_k, x_k⟩/⟨e*_{k+1}, x_{k+1}⟩.

We deduce that there exists k1 ≥ k0 such that, for all k ≥ k1,

|u_{k+1}| ≤ δ_k + (λ∞ − ε/2) |u_k|,

where we have noted δ_k = ||π_{k+1} − π_k|| C. We have δ_k → 0 when k → ∞; we thus finally deduce that |u_k| → 0 when k → ∞. Remark that this also implies that

z_k = π_k x_k / ||x_k|| → 0 when k → ∞.

• We have z_k → 0 when k → ∞, and x_k/||x_k|| can be written

x_k/||x_k|| = α_k e_k + z_k.

Therefore, we have ||α_k e_k|| → 1 when k → ∞, which implies that α_k tends to 1, since we have chosen e_k to be normalised (i.e., ||e_k|| = 1). We then can conclude that

x_k/||x_k|| → e∞ when k → ∞,

and the proof is done. Version: 2 Owner: jarino Author(s): jarino

Chapter 459 37B05 – Transformations and group actions with special properties (minimality, distality, proximality, etc.)
459.1 discontinuous action

Let X be a topological space and G a group that acts on X by homeomorphisms. The action of G is said to be discontinuous at x ∈ X if there is a neighborhood U of x such that the set {g ∈ G | gU ∩ U ≠ ∅} is finite. The action is called discontinuous if it is discontinuous at every point. Remark 1. If G acts discontinuously, then the orbits of the action have no accumulation points, i.e. if {gn} is a sequence of distinct elements of G and x ∈ X, then the sequence {gn x} has no limit points. If X is locally compact, then an action that satisfies this condition is discontinuous. Remark 2. Assume that X is a locally compact Hausdorff space and let Aut(X) denote the group of self homeomorphisms of X endowed with the compact-open topology. If ρ : G → Aut(X) defines a discontinuous action, then the image ρ(G) is a discrete subset of Aut(X). Version: 2 Owner: Dr Absentius Author(s): Dr Absentius


Chapter 460 37B20 – Notions of recurrence
460.1 nonwandering set

Let X be a metric space, and f : X → X a continuous surjection. An element x of X is a wandering point if there is a neighborhood U of x and an integer N such that, for all n ≥ N, f^n(U) ∩ U = ∅. If x is not wandering, we call it a nonwandering point. Equivalently, x is a nonwandering point if for every neighborhood U of x there is n ≥ 1 such that f^n(U) ∩ U is nonempty. The set of all nonwandering points is called the nonwandering set of f, and is denoted by Ω(f). If X is compact, then Ω(f) is compact, nonempty, and forward invariant; if, additionally, f is a homeomorphism, then Ω(f) is invariant. Version: 1 Owner: Koro Author(s): Koro


Chapter 461 37B99 – Miscellaneous
461.1 ω-limit set

Let X be a metric space, and let f : X → X be a homeomorphism. The ω-limit set of x ∈ X, denoted by ω(x, f), is the set of cluster points of the forward orbit {f^n(x)}_{n∈N}. Hence, y ∈ ω(x, f) if and only if there is a strictly increasing sequence of natural numbers {nk}_{k∈N} such that f^{nk}(x) → y as k → ∞. Another way to express this is

ω(x, f) = ⋂_{n∈N} cl{f^k(x) : k > n}.

The α-limit set is defined in a similar fashion, but for the backward orbit; i.e. α(x, f) = ω(x, f^{−1}). Both sets are f-invariant, and if X is compact, they are compact and nonempty.

If ϕ : R × X → X is a continuous flow, the definition is similar: ω(x, ϕ) consists of those elements y of X for which there exists a strictly increasing sequence {tn} of real numbers such that tn → ∞ and ϕ(x, tn) → y as n → ∞. Similarly, α(x, ϕ) is the ω-limit set of the reversed flow (i.e. ψ(x, t) = ϕ(x, −t)). Again, these sets are invariant, and if X is compact they are compact and nonempty. Furthermore,

ω(x, ϕ) = ⋂_{n∈N} cl{ϕ(x, t) : t > n}.

Version: 2 Owner: Koro Author(s): Koro
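For a concrete cluster-point computation (our example, not from the entry): the logistic map at parameter 3.2 attracts generic orbits to a period-2 cycle, so ω(x, f) consists of exactly the two cycle points.

```python
def f(x):
    """Logistic map with parameter 3.2: generic orbits are attracted
    to a period-2 cycle, so the omega-limit set has two points."""
    return 3.2 * x * (1.0 - x)

x = 0.123
for _ in range(1000):          # discard the transient
    x = f(x)

tail = []                      # cluster points of the forward orbit
for _ in range(100):
    x = f(x)
    tail.append(round(x, 6))
omega = sorted(set(tail))      # the 2-cycle {a, f(a)}
```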


461.2 asymptotically stable

Let (X, d) be a metric space and f : X → X a continuous function. A point x ∈ X is said to be Lyapunov stable if for each ε > 0 there is δ > 0 such that for all n ∈ N and all y ∈ X such that d(x, y) < δ, we have d(f^n(x), f^n(y)) < ε. We say that x is asymptotically stable if it belongs to the interior of its stable set, i.e. if there is δ > 0 such that lim_{n→∞} d(f^n(x), f^n(y)) = 0 whenever d(x, y) < δ.

In a similar way, if ϕ : X × R → X is a flow, a point x ∈ X is said to be Lyapunov stable if for each ε > 0 there is δ > 0 such that, whenever d(x, y) < δ, we have d(ϕ(x, t), ϕ(y, t)) < ε for each t ≥ 0; and x is called asymptotically stable if there is a neighborhood U of x such that lim_{t→∞} d(ϕ(x, t), ϕ(y, t)) = 0 for each y ∈ U. Version: 6 Owner: Koro Author(s): Koro

461.3 expansive

If (X, d) is a metric space, a homeomorphism f : X → X is said to be expansive if there is a constant ε0 > 0, called the expansivity constant, such that for any two distinct points of X, their n-th iterates are at least ε0 apart for some integer n; i.e. if for any pair of points x ≠ y in X there is n ∈ Z such that d(f^n(x), f^n(y)) ≥ ε0.

The space X is often assumed to be compact, since under that assumption expansivity is a topological property; i.e. if d′ is any other metric generating the same topology as d, and if f is expansive in (X, d), then f is expansive in (X, d′) (possibly with a different expansivity constant).

If f : X → X is a continuous map, we say that X is positively expansive (or forward expansive) if there is ε0 such that, for any x ≠ y in X, there is n ∈ N such that d(f^n(x), f^n(y)) ≥ ε0.

Remarks. The latter condition is much stronger than expansivity. In fact, one can prove that if X is compact and f is a positively expansive homeomorphism, then X is finite (proof). Version: 9 Owner: Koro Author(s): Koro


461.4 the only compact metric spaces that admit a positively expansive homeomorphism are discrete spaces

Theorem. Let (X, d) be a compact metric space. If there exists a positively expansive homeomorphism f : X → X, then X consists only of isolated points, i.e. X is finite.

Lemma 1. If (X, d) is a compact metric space and there is an expansive homeomorphism f : X → X such that every point is Lyapunov stable, then every point is asymptotically stable.

Proof. Let 2c be the expansivity constant of f. Suppose some point x is not asymptotically stable, and let δ be such that d(x, y) < δ implies d(f^n(x), f^n(y)) < c for all n ∈ N. Then there exist ε > 0, a point y with d(x, y) < δ, and an increasing sequence {nk} such that d(f^{nk}(y), f^{nk}(x)) > ε for each k. By uniform expansivity, there is N > 0 such that for every u and v with d(u, v) > ε there is n ∈ Z with |n| < N such that d(f^n(u), f^n(v)) > c. Choose k so large that nk > N. Then there is n with |n| < N such that d(f^{n+nk}(x), f^{n+nk}(y)) = d(f^n(f^{nk}(x)), f^n(f^{nk}(y))) > c. But since n + nk > 0, this contradicts the choice of δ. Hence every point is asymptotically stable.

Lemma 2. If (X, d) is a compact metric space and f : X → X is a continuous surjection such that every point is asymptotically stable, then X is finite.

Proof. For each x ∈ X let Kx be a closed neighborhood of x such that for all y ∈ Kx we have lim_{n→∞} d(f^n(x), f^n(y)) = 0. We assert that lim_{n→∞} diam(f^n(Kx)) = 0. In fact, if that is not the case, then there is an increasing sequence of positive integers {nk}, some ε > 0 and a sequence {xk} of points of Kx such that d(f^{nk}(x), f^{nk}(xk)) > ε, and there is a subsequence {xki} converging to some point y ∈ Kx for which lim sup d(f^n(x), f^n(y)) ≥ ε, contradicting the choice of Kx. Now since X is compact, there are finitely many points x1, . . . , xm such that X = ∪_{i=1}^{m} Kxi, so that X = f^n(X) = ∪_{i=1}^{m} f^n(Kxi). To show that X = {x1, . . . , xm}, suppose there is y ∈ X such that r = min{d(y, xi) : 1 ≤ i ≤ m} > 0. Then there is n such that diam(f^n(Kxi)) < r for 1 ≤ i ≤ m; but since y ∈ f^n(Kxi) for some i, we have a contradiction.

Proof of the theorem. Consider the sets Kε = {(x, y) ∈ X × X : d(x, y) ≥ ε} for ε > 0 and U = {(x, y) ∈ X × X : d(x, y) > c}, where 2c is the expansivity constant of f, and let F : X × X → X × X be the mapping given by F(x, y) = (f(x), f(y)). It is clear that F is a homeomorphism. By uniform expansivity, we know that for each ε > 0 there is N such that for all (x, y) ∈ Kε, there is n ∈ {1, . . . , N} such that F^n(x, y) ∈ U. We will prove that for each ε > 0, there is δ > 0 such that F^n(Kε) ⊂ Kδ for all n ∈ N. This is equivalent to saying that every point of X is Lyapunov stable for f^{−1}, and by the previous lemmas the proof will be completed. Let K = ∪_{n=0}^{N} F^n(Kε), and let δ0 = min{d(x, y) : (x, y) ∈ K}. Since K is compact, the minimum distance δ0 is reached at some point of K; i.e. there exist (x, y) ∈ Kε and 0 ≤ n ≤ N such that d(f^n(x), f^n(y)) = δ0. Since f is injective, it follows that δ0 > 0, and letting δ = δ0/2 we have K ⊂ Kδ. Given α ∈ K − Kε, there is β ∈ Kε and some 0 < m ≤ N such that α = F^m(β), and F^k(β) ∈ K for 0 < k ≤ m. Also, there is n with 0 < m < n ≤ N such that F^n(β) ∈ U ⊂ Kε. Hence m < N, and F(α) = F^{m+1}(β) ∈ F^{m+1}(Kε) ⊂ K; on the other hand, F(Kε) ⊂ K. Therefore F(K) ⊂ K, and inductively F^n(K) ⊂ K for any n ∈ N. It follows that F^n(Kε) ⊂ F^n(K) ⊂ K ⊂ Kδ for each n ∈ N, as required. Version: 5 Owner: Koro Author(s): Koro

461.5 topological conjugation

Let X and Y be topological spaces, and let f : X → X and g : Y → Y be continuous functions. We say that f is topologically semiconjugate to g if there exists a continuous surjection h : Y → X such that f ◦ h = h ◦ g. If h is a homeomorphism, then we say that f and g are topologically conjugate, and we call h a topological conjugation between f and g.

Similarly, a flow ϕ on X is topologically semiconjugate to a flow ψ on Y if there is a continuous surjection h : Y → X such that ϕ(h(y), t) = h(ψ(y, t)) for each y ∈ Y, t ∈ R. If h is a homeomorphism, then ψ and ϕ are topologically conjugate.

461.5.1 Remarks

Topological conjugation defines an equivalence relation in the space of all continuous surjections of a topological space to itself, by declaring f and g to be related if they are topologically conjugate. This equivalence relation is very useful in the theory of dynamical systems, since each class contains all functions which share the same dynamics from the topological viewpoint. In fact, orbits of g are mapped to homeomorphic orbits of f through the conjugation. Writing g = h⁻¹ ◦ f ◦ h makes this fact evident: gⁿ = h⁻¹ ◦ fⁿ ◦ h. Speaking informally, topological conjugation is a "change of coordinates" in the topological sense.

However, the analogous definition for flows is somewhat restrictive. In fact, we are requiring the maps ϕ(·, t) and ψ(·, t) to be topologically conjugate for each t, which is requiring more than simply that orbits of ϕ be mapped to orbits of ψ homeomorphically. This motivates the definition of topological equivalence, which also partitions the set of all flows in X into classes of flows sharing the same dynamics, again from the topological viewpoint. We say that ψ and ϕ are topologically equivalent if there is a homeomorphism h : Y → X, mapping orbits of ψ to orbits of ϕ homeomorphically, and preserving orientation of the orbits. This means that:

1. h(O(y, ψ)) = {h(ψ(y, t)) : t ∈ R} = {ϕ(h(y), t) : t ∈ R} = O(h(y), ϕ) for each y ∈ Y;
2. for each y ∈ Y, there is δ > 0 such that, if 0 < |s| < t < δ and s is such that ϕ(h(y), s) = h(ψ(y, t)), then s > 0.

Version: 5 Owner: Koro Author(s): Koro
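A classical concrete conjugation (our example, not from the entry) is h(y) = sin²(πy/2), which conjugates the tent map to the logistic map with r = 4; the identity L ∘ h = h ∘ T can be verified pointwise:

```python
import math

T = lambda y: 2 * y if y < 0.5 else 2 * (1 - y)   # tent map on [0,1]
L = lambda x: 4 * x * (1 - x)                      # logistic map, r = 4
h = lambda y: math.sin(math.pi * y / 2) ** 2       # homeomorphism of [0,1]

# h is a topological conjugation: L(h(y)) = h(T(y)) for every y, so the
# two maps share the same dynamics up to this change of coordinates.
gaps = [abs(L(h(y)) - h(T(y))) for y in [0.1 * k for k in range(11)]]
```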

461.6 topologically transitive

A continuous surjection f of a topological space X to itself is topologically transitive if for every pair of nonempty open sets U and V in X there is an integer n > 0 such that f^n(U) ∩ V ≠ ∅, where f^n denotes the n-th iterate of f. If for every pair of nonempty open sets U and V there is an integer N such that f^n(U) ∩ V ≠ ∅ for each n > N, we say that f is topologically mixing.

If X is a compact metric space, then f is topologically transitive if and only if there exists a point x ∈ X with a dense orbit, i.e. such that O(x, f ) = {f n (x) : n ∈ N} is dense in X. Version: 2 Owner: Koro Author(s): Koro
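A simple map with a dense orbit (our example) is an irrational rotation of the circle, which is minimal and hence topologically transitive; a short orbit already meets every member of a finite cover by subintervals:

```python
import math

alpha = (math.sqrt(5) - 1) / 2       # irrational rotation number

def rot(x):
    """Irrational rotation of the circle [0,1): every orbit is dense,
    so the map is topologically transitive."""
    return (x + alpha) % 1.0

# Track which of 20 equal subintervals the orbit of 0 visits:
visited = set()
x = 0.0
for _ in range(500):
    x = rot(x)
    visited.add(int(x * 20))
# Density forces the orbit into every subinterval.
```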

461.7

uniform expansivity

Let (X, d) be a compact metric space and let f : X → X be an expansive homeomorphism.

Theorem (uniform expansivity). For every ε > 0 and δ > 0 there is N > 0 such that for each pair x, y of points of X with d(x, y) > ε there is n ∈ Z with |n| ≤ N such that d(f^n(x), f^n(y)) > c − δ, where c is the expansivity constant of f.

Proof. Let K = {(x, y) ∈ X × X : d(x, y) ≥ ε/2}. Then K is closed, and hence compact. For each pair (x, y) ∈ K, there is n_{(x,y)} ∈ Z such that d(f^{n_{(x,y)}}(x), f^{n_{(x,y)}}(y)) > c. Since the mapping F : X × X → X × X defined by F(x, y) = (f(x), f(y)) is continuous, F^{n_{(x,y)}} is also continuous, and there is a neighborhood U_{(x,y)} of each (x, y) ∈ K such that d(f^{n_{(x,y)}}(u), f^{n_{(x,y)}}(v)) > c − δ for each (u, v) ∈ U_{(x,y)}. Since K is compact and {U_{(x,y)} : (x, y) ∈ K} is an open cover of K, there is a finite subcover {U_{(x_i,y_i)} : 1 ≤ i ≤ m}. Let N = max{|n_{(x_i,y_i)}| : 1 ≤ i ≤ m}. If d(x, y) > ε, then (x, y) ∈ K, so that (x, y) ∈ U_{(x_i,y_i)} for some i ∈ {1, . . . , m}. Thus for n = n_{(x_i,y_i)} we have d(f^n(x), f^n(y)) > c − δ and |n| ≤ N, as required. Version: 2 Owner: Koro Author(s): Koro


Chapter 462 37C10 – Vector fields, flows, ordinary differential equations
462.1 flow

A flow on a set X is a group action of (R, +) on X. More explicitly, a flow is a function ϕ : X × R → X satisfying the following properties:

1. ϕ(x, 0) = x;
2. ϕ(ϕ(x, t), s) = ϕ(x, s + t)

for all s, t in R and x ∈ X. The set O(x, ϕ) = {ϕ(x, t) : t ∈ R} is called the orbit of x by ϕ. Flows are usually required to be continuous or even differentiable, when the space X has some additional structure (e.g. when X is a topological space or when X = R^n).

The most common examples of flows arise from describing the solutions of the autonomous ordinary differential equation

y′ = f(y),  y(0) = x  (462.1.1)

as a function of the initial condition x, when the equation has existence and uniqueness of solutions. That is, if (462.1.1) has a unique solution ψ_x : R → X for each x ∈ X, then ϕ(x, t) = ψ_x(t) defines a flow. Version: 3 Owner: Koro Author(s): Koro
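For instance, the linear equation y′ = −y has solutions ψ_x(t) = x e^{−t}, so ϕ(x, t) = x e^{−t} is a flow on R. A small Python check of the two flow axioms (an added sketch for this specific equation, not part of the original entry):

```python
import math

def phi(x, t):
    # Flow of the autonomous ODE y' = -y: phi(x, t) = x * exp(-t).
    return x * math.exp(-t)

x, s, t = 2.0, 0.3, 1.7
# Axiom 1: time zero is the identity.
assert phi(x, 0.0) == x
# Axiom 2: flowing for time t, then s, equals flowing for time s + t.
assert abs(phi(phi(x, t), s) - phi(x, s + t)) < 1e-12
print("flow axioms hold numerically")
```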


462.2

globally attracting fixed point

An attracting fixed point is considered globally attracting if its stable manifold is the entire space. Equivalently, the fixed point x∗ is globally attracting if for all x, x(t) → x∗ as t → ∞. Version: 4 Owner: mathcam Author(s): mathcam, yark, armbrusterb


Chapter 463 37C20 – Generic properties, structural stability
463.1 Kupka-Smale theorem

Let M be a compact smooth manifold. For every k ∈ N, the set of Kupka-Smale diffeomorphisms is residual in Diff^k(M) (the space of all C^k diffeomorphisms from M to itself, endowed with the uniform or strong C^k topology, also known as the Whitney C^k topology). Version: 2 Owner: Koro Author(s): Koro

463.2

Pugh’s general density theorem

Let M be a compact smooth manifold. There is a residual subset of Diff^1(M) in which every element f satisfies cl(Per(f)) = Ω(f), where cl denotes topological closure. In other words: generically, the set of periodic points of a C^1 diffeomorphism is dense in its nonwandering set. Here, Diff^1(M) denotes the set of all C^1 diffeomorphisms from M to itself, endowed with the (strong) C^1 topology.

REFERENCES
1. Pugh, C., An improved closing lemma and a general density theorem, Amer. J. Math. 89 (1967).

Version: 5 Owner: Koro Author(s): Koro


463.3

structural stability

Given a metric space (X, d) and a homeomorphism f : X → X, we say that f is structurally stable if there is a neighborhood V of f in Homeo(X) (the space of all homeomorphisms mapping X to itself, endowed with the compact-open topology) such that every element of V is topologically conjugate to f.

If M is a compact smooth manifold, a C^k diffeomorphism f is said to be C^k structurally stable if there is a neighborhood of f in Diff^k(M) (the space of all C^k diffeomorphisms from M to itself, endowed with the strong C^k topology) in which every element is topologically conjugate to f.

If X is a vector field on the smooth manifold M, we say that X is C^k structurally stable if there is a neighborhood of X in X^k(M) (the space of all C^k vector fields on M, endowed with the strong C^k topology) in which every element is topologically equivalent to X, i.e. such that every other field Y in that neighborhood generates a flow on M that is topologically equivalent to the flow generated by X.

Remark. The concept of structural stability may be generalized to other spaces of functions with other topologies; the general idea is that a function or flow is structurally stable if any other function or flow close enough to it has similar dynamics (from the topological viewpoint), which essentially means that the dynamics will not change under small perturbations. Version: 5 Owner: Koro Author(s): Koro


Chapter 464 37C25 – Fixed points, periodic points, fixed-point index theory
464.1 hyperbolic fixed point

Let M be a smooth manifold. A fixed point x of a diffeomorphism f : M → M is said to be a hyperbolic fixed point if Df(x) is a hyperbolic linear isomorphism. If x is a periodic point of least period n, it is called a hyperbolic periodic point if it is a hyperbolic fixed point of f^n (the n-th iterate of f). If the dimension of the stable manifold of a fixed point is zero, the point is called a source; if the dimension of its unstable manifold is zero, it is called a sink; and if both the stable and unstable manifolds have nonzero dimension, it is called a saddle. Version: 3 Owner: Koro Author(s): Koro


Chapter 465 37C29 – Homoclinic and heteroclinic orbits
465.1 heteroclinic

Let f be a homeomorphism mapping a topological space X to itself, or a flow on X. A heteroclinic point, or heteroclinic intersection, is a point that belongs to the intersection of the stable set of x with the unstable set of y, where x and y are two different fixed or periodic points of f; i.e. a point that belongs to W^s(f, x) ∩ W^u(f, y). Version: 1 Owner: Koro Author(s): Koro

465.2

homoclinic

If X is a topological space and f is a flow on X or a homeomorphism mapping X to itself, we say that x ∈ X is a homoclinic point (or homoclinic intersection) if it belongs to both the stable and unstable sets of some fixed or periodic point p; i.e. x ∈ W^s(f, p) ∩ W^u(f, p).

The orbit of a homoclinic point is called a homoclinic orbit. Version: 2 Owner: Koro Author(s): Koro


Chapter 466 37C75 – Stability theory
466.1 attracting fixed point

A fixed point is called attracting if its stable manifold contains a neighborhood of the point. Equivalently, the fixed point x∗ is attracting if there exists δ > 0 such that for all x, d(x, x∗) < δ implies x(t) → x∗ as t → ∞. The stability of a fixed point can also be classified as stable, unstable, neutrally stable, or Liapunov stable. Version: 2 Owner: alinabi Author(s): alinabi, armbrusterb

466.2

stable manifold

Let X be a topological space, and f : X → X a homeomorphism. If p is a fixed point for f, the stable and unstable sets of p are defined by

W^s(f, p) = {q ∈ X : f^n(q) → p as n → ∞},
W^u(f, p) = {q ∈ X : f^{−n}(q) → p as n → ∞},

respectively.

If p is a periodic point of least period k, then it is a fixed point of f^k, and the stable and unstable sets of p are

W^s(f, p) = W^s(f^k, p),  W^u(f, p) = W^u(f^k, p).

Given a neighborhood U of p, the local stable and unstable sets of p are defined by

W^s_loc(f, p, U) = {q ∈ U : f^n(q) ∈ U for each n ≥ 0},
W^u_loc(f, p, U) = W^s_loc(f^{−1}, p, U).

If X is metrizable, we can define the stable and unstable sets for any point by

W^s(f, p) = {q ∈ X : d(f^n(q), f^n(p)) → 0 as n → ∞},
W^u(f, p) = W^s(f^{−1}, p),

where d is a metric for X. This definition clearly coincides with the previous one when p is a periodic point.

Suppose now that X is a compact smooth manifold, and f is a C^k diffeomorphism, k ≥ 1. If p is a hyperbolic periodic point, the stable manifold theorem assures that for some neighborhood U of p, the local stable and unstable sets are C^k embedded disks, whose tangent spaces at p are E^s and E^u (the stable and unstable spaces of Df(p)), respectively; moreover, they vary continuously (in a certain sense) in a neighborhood of f in the C^k topology of Diff^k(X) (the space of all C^k diffeomorphisms from X to itself). Finally, the stable and unstable sets are C^k injectively immersed disks. This is why they are commonly called stable and unstable manifolds. This result is also valid for nonperiodic points, as long as they lie in some hyperbolic set (stable manifold theorem for hyperbolic sets). Version: 7 Owner: Koro Author(s): Koro


Chapter 467 37C80 – Symmetries, equivariant dynamical systems
467.1 Γ-equivariant

Let Γ be a compact Lie group acting linearly on V and let g : V → V be a mapping. Then g is Γ-equivariant if g(γv) = γg(v) for all γ ∈ Γ and all v ∈ V. In other words, g is Γ-equivariant precisely when g commutes with the action of Γ. [GSS]

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 2 Owner: Daume Author(s): Daume
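As a small added illustration (not from the original entry), take Γ = SO(2) acting on V = R² by rotations and g(v) = ‖v‖² v; since rotations preserve the norm, g commutes with every rotation. A Python sketch checking g(γv) = γg(v) numerically:

```python
import math

def rotate(theta, v):
    # Action of gamma in SO(2) on R^2: rotation by angle theta.
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def g(v):
    # g(v) = |v|^2 v, which commutes with rotations since |gamma v| = |v|.
    n2 = v[0] ** 2 + v[1] ** 2
    return (n2 * v[0], n2 * v[1])

v, theta = (1.2, -0.7), 0.9
lhs = g(rotate(theta, v))   # g(gamma v)
rhs = rotate(theta, g(v))   # gamma g(v)
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("g is SO(2)-equivariant at the tested point")
```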


Chapter 468 37D05 – Hyperbolic orbits and sets
468.1 hyperbolic isomorphism

Let X be a Banach space and T : X → X a continuous linear isomorphism. We say that T is a hyperbolic isomorphism if its spectrum is disjoint from the unit circle, i.e. σ(T) ∩ {z ∈ C : |z| = 1} = ∅.

If this is the case, then there is a splitting of X into two invariant subspaces, X = E^s ⊕ E^u (and therefore a corresponding splitting of T into two operators T^s : E^s → E^s and T^u : E^u → E^u, i.e. T = T^s ⊕ T^u), and an equivalent norm ‖·‖_1 on X such that T^s is a contraction and T^u is an expansion with respect to this norm. That is, there are constants λ_s and λ_u, with 0 < λ_s, λ_u < 1, such that for all x^s ∈ E^s and x^u ∈ E^u,

‖T^s x^s‖_1 < λ_s ‖x^s‖_1  and  ‖(T^u)^{−1} x^u‖_1 < λ_u ‖x^u‖_1.

Version: 4 Owner: Koro Author(s): Koro
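For a concrete finite-dimensional example (an added illustration, not part of the original entry), T(x, y) = (2x, y/2) on R² is hyperbolic: its spectrum {2, 1/2} misses the unit circle, E^u is the x-axis, E^s is the y-axis, and with the Euclidean norm both contraction rates are exactly 1/2, so any λ_s, λ_u strictly between 1/2 and 1 satisfies the inequalities. A Python sketch:

```python
def T(v):
    # Hyperbolic linear map on R^2 with spectrum {2, 1/2}.
    return (2.0 * v[0], 0.5 * v[1])

def T_inv(v):
    return (0.5 * v[0], 2.0 * v[1])

xs = (0.0, 1.0)   # a vector in the stable subspace E^s (the y-axis)
xu = (1.0, 0.0)   # a vector in the unstable subspace E^u (the x-axis)

# T contracts E^s and T^{-1} contracts E^u, each by the factor 1/2.
assert T(xs) == (0.0, 0.5)
assert T_inv(xu) == (0.5, 0.0)
print("contraction on E^s and expansion on E^u verified")
```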


Chapter 469 37D20 – Uniformly hyperbolic systems (expanding, Anosov, Axiom A, etc.)
469.1 Anosov diffeomorphism

If M is a compact smooth manifold, a diffeomorphism f : M → M (or a flow φ : R × M → M) such that the whole space M is a hyperbolic set for f (or φ) is called an Anosov diffeomorphism (or flow).

Anosov diffeomorphisms were introduced by D.V. Anosov, who proved that they are C^1 structurally stable and form an open subset of C^1(M, M) with the C^1 topology.

Not every manifold admits an Anosov diffeomorphism; for example, there are no such diffeomorphisms on the sphere S^n. The simplest examples of compact manifolds admitting them are the tori T^n: they admit the so-called linear Anosov diffeomorphisms, which are isomorphisms of T^n having no eigenvalue of modulus 1. It has been proved that any other Anosov diffeomorphism of T^n is topologically conjugate to one of this kind.

The problem of classifying manifolds that admit Anosov diffeomorphisms has proved to be very difficult, and it still has no answer. The only known examples are tori and infranilmanifolds, and it is conjectured that these are the only ones. Another famous problem that remains open is to determine whether or not the nonwandering set of an Anosov diffeomorphism must be the whole manifold M. This is known to be true for linear Anosov diffeomorphisms (and hence for any Anosov diffeomorphism on a torus). Version: 1 Owner: Koro Author(s): Koro
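The classical linear example on T² (often called Arnold's cat map) is induced by the integer matrix with rows (2, 1) and (1, 1). A quick Python check (an added illustration) confirms that its eigenvalues avoid the unit circle:

```python
import math

# The cat map on the 2-torus is induced by the matrix [[2, 1], [1, 1]],
# whose characteristic polynomial is x^2 - 3x + 1.
trace, det = 3.0, 1.0
disc = math.sqrt(trace ** 2 - 4.0 * det)
lam1 = (trace + disc) / 2.0   # (3 + sqrt(5)) / 2: expanding direction
lam2 = (trace - disc) / 2.0   # (3 - sqrt(5)) / 2: contracting direction

# No eigenvalue has modulus 1, so the induced torus map is Anosov.
assert lam1 > 1.0 and 0.0 < lam2 < 1.0
assert abs(lam1 * lam2 - det) < 1e-12   # determinant 1: area preserving
print(round(lam1, 3), round(lam2, 3))
```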


469.2

Axiom A

Let M be a smooth manifold. We say that a diffeomorphism f : M → M satisfies (Smale's) Axiom A (or that f is an Axiom A diffeomorphism) if

1. the nonwandering set Ω(f) has a hyperbolic structure;
2. the set of periodic points of f is dense in Ω(f), i.e. cl(Per(f)) = Ω(f).

Version: 3 Owner: Koro Author(s): Koro

469.3

hyperbolic set

Let M be a compact smooth manifold, and let f : M → M be a diffeomorphism. An f-invariant subset Λ of M is said to be hyperbolic (or to have a hyperbolic structure) if there is a splitting of the tangent bundle of M restricted to Λ into a (Whitney) sum of two Df-invariant subbundles, E^s and E^u, such that the restriction of Df to E^s is a contraction and its restriction to E^u is an expansion. This means that there are constants 0 < λ < 1 and c > 0 such that

1. T_Λ M = E^s ⊕ E^u;

2. Df(x)E^s_x = E^s_{f(x)} and Df(x)E^u_x = E^u_{f(x)} for each x ∈ Λ;

3. ‖Df^n v‖ < cλ^n ‖v‖ for each v ∈ E^s and n > 0;

4. ‖Df^{−n} v‖ < cλ^n ‖v‖ for each v ∈ E^u and n > 0,

using some Riemannian metric on M. If Λ is hyperbolic, then there exists an adapted Riemannian metric, i.e. one such that c = 1. Version: 1 Owner: Koro Author(s): Koro


Chapter 470 37D99 – Miscellaneous
470.1 Kupka-Smale

A diffeomorphism f mapping a smooth manifold M to itself is called a Kupka-Smale diffeomorphism if 1. every periodic point of f is hyperbolic; 2. for each pair of periodic points p,q of f , the intersection between the stable manifold of p and the unstable manifold of q is transversal. Version: 1 Owner: Koro Author(s): Koro


Chapter 471 37E05 – Maps of the interval (piecewise continuous, continuous, smooth)
471.1 Sharkovskii’s theorem

Every natural number can be written as 2^r p, where p is odd and r is the maximum exponent such that 2^r divides the given number. We define the Sharkovskii ordering ≺ of the natural numbers in this way: given two odd numbers p and q, and two nonnegative integers r and s, then 2^r p ≺ 2^s q if

1. r < s and p > 1; or
2. r = s and p < q; or
3. r > s and p = q = 1.

This defines a linear ordering of N, in which we first have 3, 5, 7, . . ., followed by 2 · 3, 2 · 5, . . ., followed by 2² · 3, 2² · 5, . . ., and so on, and finally 2^{n+1}, 2^n, . . . , 2, 1. So it looks like this:

3 ≺ 5 ≺ · · · ≺ 3 · 2 ≺ 5 · 2 ≺ · · · ≺ 3 · 2^n ≺ 5 · 2^n ≺ · · · ≺ 2² ≺ 2 ≺ 1.

Sharkovskii’s theorem. Let I ⊂ R be an interval, and let f : I → R be a continuous function. If f has a periodic point of least period n, then f has a periodic point of least period k for each k such that n ≺ k. Version: 3 Owner: Koro Author(s): Koro


Chapter 472 37G15 – Bifurcations of limit cycles and periodic orbits
472.1 Feigenbaum constant

The Feigenbaum delta constant has the value

δ = 4.669211660910299067185320382047 . . .

It governs the structure and behavior of many types of dynamical systems. It was discovered in the 1970s by Mitchell Feigenbaum, while studying the logistic map

y = µ · y(1 − y),

which produces the Feigenbaum tree. (Figure: the Feigenbaum tree, generated by GNU Octave and GNUPlot, with the first few bifurcations shown as dotted blue lines.)

If the bifurcations in this tree are at the parameter values b_1, b_2, b_3, . . ., then

lim_{n→∞} (b_n − b_{n−1}) / (b_{n+1} − b_n) = δ.

That is, the ratio of the intervals between successive bifurcation points approaches Feigenbaum's constant.

However, this is only the beginning. Feigenbaum discovered that this constant arises in any dynamical system that approaches chaotic behavior via period-doubling bifurcation and has a single quadratic maximum. So in some sense, Feigenbaum's constant is a universal constant of chaos theory. Feigenbaum's constant appears in problems of fluid-flow turbulence, electronic oscillators, chemical reactions, and even the Mandelbrot set (the ”budding” of the Mandelbrot set along the negative real axis occurs at intervals determined by Feigenbaum's constant).

References.

• “What is Feigenbaum’s constant?”: http://fractals.iuta.u-bordeaux.fr/sci-faq/feigenbaum.html
• “Bifurcations”: http://mcasco.com/bifurcat.html
• “Feigenbaum’s Constant”: http://home.imf.au.dk/abuch/feigenbaum.html
• “Bifurcation”: http://spanky.triumf.ca/www/fractint/bif type.html
• “Feigenbaum’s Universal Constant”: http://www.stud.ntnu.no/ berland/math/feigenbaum/feigconst

Version: 2 Owner: akrowne Author(s): akrowne

472.2

Feigenbaum fractal

A Feigenbaum fractal is any bifurcation fractal produced by a period-doubling cascade. The “canonical” Feigenbaum fractal is produced by the logistic map (a simple population model),

y = µ · y(1 − y),

where µ is varied smoothly along one dimension. The logistic iteration either terminates in a cycle (set of repeating values) or behaves chaotically. If one plots the points of this cycle versus the µ-value, a bifurcation diagram is produced. (Figure: bifurcation diagram of the logistic map; plot omitted.)

Note the distinct bifurcation (branching) points and the chaotic behavior as µ increases. Many other iterations will generate this same type of plot, for example the iteration

p = r · sin(π · p).

One of the most amazing things about this class of fractals is that the bifurcation intervals are always described by Feigenbaum's constant. Octave/Matlab code to generate the above image is available here.
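The period-doubling cascade itself is easy to observe numerically. The following Python sketch (added as an illustration; the original entry used Octave/Matlab for the plot) iterates the logistic map past its transient and detects the length of the attracting cycle at a few sample values of µ:

```python
def attracting_period(mu, x0=0.5, transient=10000, max_period=64, tol=1e-6):
    # Iterate x -> mu*x*(1-x) past the transient, then look for the
    # smallest p with f^p(x) close to x: the length of the attracting cycle.
    x = x0
    for _ in range(transient):
        x = mu * x * (1.0 - x)
    orbit = [x]
    for _ in range(max_period):
        x = mu * x * (1.0 - x)
        orbit.append(x)
    for p in range(1, max_period + 1):
        if abs(orbit[p] - orbit[0]) < tol:
            return p
    return None  # no short cycle found: likely chaotic

# Successive period doublings as mu increases: 1 -> 2 -> 4.
assert attracting_period(2.5) == 1
assert attracting_period(3.2) == 2
assert attracting_period(3.5) == 4
print("periods 1, 2, 4 detected")
```

The sample values of µ are chosen inside the period-1, period-2 and period-4 windows of the cascade.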

References.

• “Quadratic Iteration, bifurcation, and chaos”: http://mathforum.org/advanced/robertd/bifurcation.h • “Bifurcation”: http://spanky.triumf.ca/www/fractint/bif type.html • “Feigenbaum’s Constant”: http://fractals.iuta.u-bordeaux.fr/sci-faq/feigenbaum.html Version: 3 Owner: akrowne Author(s): akrowne

472.3

equivariant Hopf theorem

Consider the system of ordinary differential equations

ẋ + f(x, λ) = 0,

where f : R^n × R → R^n is smooth and commutes with a compact Lie group Γ (i.e. f is Γ-equivariant). In addition, assume that R^n is Γ-simple and choose a basis of coordinates such that

(df)_{0,0} =
  [ 0     −I_m ]
  [ I_m    0   ],

where m = n/2. Furthermore, let the eigenvalues of (df)_{0,0} be σ(λ) ± iρ(λ), with σ(0) = 0. Suppose that dim Fix(Σ) = 2, where Σ ⊂ Γ × S¹ is an isotropy subgroup acting on R^n. Then there exists a unique branch of small-amplitude periodic solutions to ẋ + f(x, λ) = 0 with period near 2π, having Σ as their group of symmetries. [GSS]


REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume


Chapter 473 37G40 – Symmetries, equivariant bifurcation theory
473.1 Poénaru (1976) theorem

Let Γ be a compact Lie group and let g_1, . . . , g_r generate the module P(Γ) (the space of Γ-equivariant polynomial mappings) over the ring of Γ-invariant polynomials. Then g_1, . . . , g_r generate the module E(Γ) (the space of Γ-equivariant germs at the origin of C^∞ mappings) over the ring of Γ-invariant germs. [GSS]

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[PV] Poénaru, V.: Singularités C^∞ en Présence de Symétrie. Lecture Notes in Mathematics 510, Springer-Verlag, Berlin, 1976.

Version: 1 Owner: Daume Author(s): Daume

473.2

bifurcation problem with symmetry group

Let Γ be a Lie group acting on a vector space V, and consider the system of ordinary differential equations

ẋ + g(x, λ) = 0,


where g : R^n × R → R^n is smooth. Then g is called a bifurcation problem with symmetry group Γ if g ∈ E_{x,λ}(Γ) (where E(Γ) is the space of Γ-equivariant germs, at the origin, of C^∞ mappings of V into V), satisfying g(0, 0) = 0 and (dg)_{0,0} = 0, where (dg)_{0,0} denotes the Jacobian matrix evaluated at (0, 0). [GSS]

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 1 Owner: Daume Author(s): Daume

473.3

trace formula

Let Γ be a compact Lie group acting on V and let Σ ⊂ Γ be a Lie subgroup. Then

dim Fix(Σ) = ∫_Σ trace(σ),

where ∫_Σ denotes the normalized Haar integral on Σ and Fix(Σ) is the fixed-point subspace of Σ.

REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.

Version: 2 Owner: Daume Author(s): Daume


Chapter 474 37G99 – Miscellaneous
474.1 chaotic dynamical system

As Strogatz says in reference [1], ”No definition of the term chaos is universally accepted yet, but almost everyone would agree on the three ingredients used in the following working definition”: chaos is aperiodic long-term behavior in a deterministic system that exhibits sensitive dependence on initial conditions.

Aperiodic long-term behavior means that there are trajectories which do not settle down to fixed points, periodic orbits, or quasiperiodic orbits as t → ∞. For the purposes of this definition, a trajectory which approaches a limit of ∞ as t → ∞ should be considered to have a fixed point at ∞.

Sensitive dependence on initial conditions means that nearby trajectories separate exponentially fast, i.e., the system has a positive Liapunov exponent.

Strogatz notes that he favors additional constraints on the aperiodic long-term behavior, but leaves open what form they may take. He suggests two alternatives to fulfill this:

1. Requiring that there exist an open set of initial conditions having aperiodic trajectories, or
2. If one picks a random initial condition x(0), then there must be a nonzero chance of the associated trajectory x(t) being aperiodic.

474.1.1

References

1. Steven H. Strogatz, ”Nonlinear Dynamics and Chaos”. Westview Press, 1994.

Version: 2 Owner: bshanks Author(s): bshanks


Chapter 475 37H20 – Bifurcation theory
475.1 bifurcation

Bifurcation refers to the splitting of attractors, as in dynamical systems. For example, the branching of the Feigenbaum tree is an instance of bifurcation. A cascade of bifurcations is a precursor to chaos.

REFERENCES
1. “Bifurcations”, http://mcasco.com/bifurcat.html 2. “Bifurcation”, http://spanky.triumf.ca/www/fractint/bif type.html 3. “Quadratic Iteration, bifurcation, and chaos”, http://mathforum.org/advanced/robertd/bifurcation.html

Version: 2 Owner: akrowne Author(s): akrowne


Chapter 476 39B05 – General
476.1 functional equation

A functional equation is an equation whose unknowns are functions; f(x + y) = f(x) + f(y) and f(x · y) = f(x) · f(y) are examples of such equations. The systematic study of these did not begin until the 1960s, although various mathematicians had studied them before, including Euler and Cauchy, to mention a few. Functional equations appear in many places; for example, the gamma function and Riemann's zeta function both satisfy functional equations. Version: 4 Owner: jgade Author(s): jgade


Chapter 477 39B62 – Functional inequalities, including subadditivity, convexity, etc.
477.1 Jensen’s inequality

If f is a convex function on the interval [a, b], then

f(∑_{k=1}^{n} λ_k x_k) ≤ ∑_{k=1}^{n} λ_k f(x_k),

where 0 ≤ λ_k ≤ 1, λ_1 + λ_2 + · · · + λ_n = 1 and each x_k ∈ [a, b]. If f is a concave function, the inequality is reversed.

Example: f(x) = x² is a convex function on [0, 10]. Then

(0.2 · 4 + 0.5 · 3 + 0.3 · 7)² ≤ 0.2(4²) + 0.5(3²) + 0.3(7²).

A very special case of this inequality is when λ_k = 1/n, because then

f((1/n) ∑_{k=1}^{n} x_k) ≤ (1/n) ∑_{k=1}^{n} f(x_k),

that is, the value of the function at the mean of the x_k is less than or equal to the mean of the values of the function at each x_k.

There is another formulation of Jensen's inequality used in probability: Let X be some random variable, and let f(x) be a convex function (defined at least on a segment containing the range of X). Then the expected value of f(X) is at least the value of f at the mean of X: E f(X) ≥ f(E X). With this approach, the weights of the first form can be seen as probabilities. Version: 2 Owner: drini Author(s): drini
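A quick numerical check of the worked example with f(x) = x² (an added illustration):

```python
# Jensen's inequality for the convex function f(x) = x^2 with
# weights (0.2, 0.5, 0.3) and points (4, 3, 7).
weights = [0.2, 0.5, 0.3]
points = [4.0, 3.0, 7.0]

mean = sum(w * x for w, x in zip(weights, points))        # weighted mean 4.4
lhs = mean ** 2                                           # f(weighted mean)
rhs = sum(w * x ** 2 for w, x in zip(weights, points))    # weighted mean of f
assert abs(lhs - 19.36) < 1e-9 and abs(rhs - 22.4) < 1e-9
assert lhs <= rhs   # Jensen: f(sum w_k x_k) <= sum w_k f(x_k)
print(lhs, "<=", rhs)
```

In the probabilistic reading, the weights are the probabilities of a random variable taking the values 4, 3 and 7.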

477.2

proof of Jensen’s inequality

We prove an equivalent, more convenient formulation: Let X be some random variable, and let f(x) be a convex function (defined at least on a segment containing the range of X). Then the expected value of f(X) is at least the value of f at the mean of X: E f(X) ≥ f(E X).

Indeed, let c = E X. Since f(x) is convex, there exists a supporting line for f(x) at c:

ϕ(x) = α(x − c) + f(c)

for some α, with ϕ(x) ≤ f(x). Then

E f(X) ≥ E ϕ(X) = E[α(X − c)] + f(c) = f(c),

as claimed. Version: 2 Owner: ariels Author(s): ariels

477.3

proof of arithmetic-geometric-harmonic means inequality

We can use the Jensen inequality for an easy proof of the arithmetic-geometric-harmonic means inequality. Let x_1, . . . , x_n > 0; we shall first prove that

(x_1 · . . . · x_n)^{1/n} ≤ (x_1 + . . . + x_n)/n.

Note that log is a concave function. Applying it to the arithmetic mean of x_1, . . . , x_n and using Jensen's inequality, we see that

log((x_1 + . . . + x_n)/n) ≥ (log(x_1) + . . . + log(x_n))/n = log(x_1 · . . . · x_n)/n = log((x_1 · . . . · x_n)^{1/n}).

Since log is also a monotone function, it follows that the arithmetic mean is at least as large as the geometric mean. The proof that the geometric mean is at least as large as the harmonic mean is the usual one (see “proof of arithmetic-geometric-harmonic means inequality”). Version: 4 Owner: mathcam Author(s): mathcam, ariels
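A numerical sanity check of the chain AM ≥ GM ≥ HM (an added illustration):

```python
import math

xs = [1.0, 4.0, 9.0]
n = len(xs)

am = math.fsum(xs) / n                         # arithmetic mean
gm = math.prod(xs) ** (1.0 / n)                # geometric mean
hm = n / math.fsum(1.0 / x for x in xs)        # harmonic mean

# For positive reals: AM >= GM >= HM, with equality iff all x_i coincide.
assert am >= gm >= hm
print(am, gm, hm)
```

Here am = 14/3 ≈ 4.667, gm = 36^{1/3} ≈ 3.302, hm ≈ 2.204, consistent with the inequality.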

477.4

subadditivity

A sequence {a_n}_{n=1}^{∞} is called subadditive if it satisfies the inequality

a_{n+m} ≤ a_n + a_m  for all n and m.  (477.4.1)

The major reason for the use of subadditive sequences is the following lemma due to Fekete.

Lemma 10 ([1]). For every subadditive sequence {a_n}_{n=1}^{∞} the limit lim a_n/n exists and equals inf a_n/n.

Similarly, a function f(x) is subadditive if f(x + y) ≤ f(x) + f(y) for all x and y.

The analogue of Fekete's lemma holds for subadditive functions as well. There are extensions of Fekete's lemma that do not require (477.4.1) to hold for all m and n. There are also results that allow one to deduce the rate of convergence to the limit whose existence is stated in Fekete's lemma if some kind of both super- and subadditivity is present. A good exposition of this topic may be found in [2].

REFERENCES
1. György Pólya and Gábor Szegő. Problems and theorems in analysis, volume 1. 1976. Zbl 0338.00001.
2. Michael J. Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.

Version: 6 Owner: bbukh Author(s): bbukh
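As an added illustration of Fekete's lemma, the sequence a_n = n + √n is subadditive (because √ is subadditive), and a_n/n = 1 + 1/√n decreases to its infimum 1, which is therefore the limit. A Python sketch over a finite range:

```python
import math

def a(n):
    # a_n = n + sqrt(n), subadditive: a_{n+m} <= a_n + a_m
    # since sqrt(n + m) <= sqrt(n) + sqrt(m).
    return n + math.sqrt(n)

# Check subadditivity on a finite range.
assert all(a(n + m) <= a(n) + a(m)
           for n in range(1, 60) for m in range(1, 60))

# a_n / n = 1 + 1/sqrt(n) decreases toward its infimum 1 (Fekete's limit).
ratios = [a(n) / n for n in (1, 10, 100, 10000)]
assert all(r1 > r2 for r1, r2 in zip(ratios, ratios[1:]))
assert abs(ratios[-1] - 1.0) < 0.02
print(ratios[-1])
```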

477.5

superadditivity

A sequence {a_n}_{n=1}^{∞} is called superadditive if it satisfies the inequality

a_{n+m} ≥ a_n + a_m  for all n and m.  (477.5.1)

The major reason for the use of superadditive sequences is the following lemma due to Fekete.

Lemma 11 ([1]). For every superadditive sequence {a_n}_{n=1}^{∞} the limit lim a_n/n exists and equals sup a_n/n.

Similarly, a function f(x) is superadditive if f(x + y) ≥ f(x) + f(y) for all x and y.

The analogue of Fekete's lemma holds for superadditive functions as well. There are extensions of Fekete's lemma that do not require (477.5.1) to hold for all m and n. There are also results that allow one to deduce the rate of convergence to the limit whose existence is stated in Fekete's lemma if some kind of both super- and subadditivity is present. A good exposition of this topic may be found in [2].

REFERENCES
1. György Pólya and Gábor Szegő. Problems and theorems in analysis, volume 1. 1976. Zbl 0338.00001.
2. Michael J. Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.

Version: 5 Owner: bbukh Author(s): bbukh


Chapter 478 40-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
478.1 Cauchy product

Let a_k and b_k be two sequences of real or complex numbers for k ∈ N_0 (N_0 is the set of natural numbers containing zero). The Cauchy product is defined by:

(a ∘ b)(k) = ∑_{l=0}^{k} a_l b_{k−l}.  (478.1.1)

This is basically the convolution for two sequences. Therefore the product of two series ∑_{k=0}^{∞} a_k and ∑_{k=0}^{∞} b_k is given by:

∑_{k=0}^{∞} c_k = (∑_{k=0}^{∞} a_k) · (∑_{k=0}^{∞} b_k) = ∑_{k=0}^{∞} ∑_{l=0}^{k} a_l b_{k−l}.  (478.1.2)

A sufficient condition for the resulting series ∑_{k=0}^{∞} c_k to be absolutely convergent is that ∑_{k=0}^{∞} a_k and ∑_{k=0}^{∞} b_k both converge absolutely.

Version: 4 Owner: msihl Author(s): msihl
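As an added illustration, taking a_k = b_k = x^k (a geometric series) gives Cauchy product coefficients c_k = (k + 1) x^k, matching the expansion of 1/(1 − x)². A Python sketch of the convolution formula (478.1.1):

```python
def cauchy_product(a, b):
    # c_k = sum_{l=0}^{k} a_l * b_{k-l}: the convolution of two
    # coefficient sequences, truncated to the shorter length.
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

x = 0.5
a = [x ** k for k in range(10)]          # geometric series coefficients
c = cauchy_product(a, a)

# (sum x^k)^2 = sum (k+1) x^k, term by term.
assert all(abs(c[k] - (k + 1) * x ** k) < 1e-12 for k in range(10))
print(c[:4])
```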


478.2

Cesàro mean

Definition. Let {a_n}_{n=0}^{∞} be a sequence of real (or possibly complex) numbers. The Cesàro mean of the sequence {a_n} is the sequence {b_n}_{n=0}^{∞} with

b_n = (1/(n + 1)) ∑_{i=0}^{n} a_i.  (478.2.1)

Properties.

1. A key property of the Cesàro mean is that it has the same limit as the original sequence. In other words, if {a_n} and {b_n} are as above, and a_n → a, then b_n → a. In particular, if {a_n} converges, then {b_n} converges too.

Version: 5 Owner: mathcam Author(s): matte, drummond
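An added numerical illustration: the Cesàro mean of the convergent sequence a_n = 1/(n + 1) tends to the same limit 0, and it even smooths out the divergent sequence (−1)^n:

```python
def cesaro_mean(a):
    # b_n = (1 / (n + 1)) * sum_{i=0}^{n} a_i for a finite prefix of {a_n}.
    partial, out = 0.0, []
    for n, x in enumerate(a):
        partial += x
        out.append(partial / (n + 1))
    return out

# Same limit as the original convergent sequence a_n = 1/(n+1) -> 0.
b = cesaro_mean([1.0 / (n + 1) for n in range(10000)])
assert b[-1] < 0.002

# For a_n = (-1)^n the sequence diverges, but the Cesaro means tend to 0.
c = cesaro_mean([(-1) ** n for n in range(10000)])
assert abs(c[-1]) < 0.001
print(b[-1], c[-1])
```

The second example shows that the converse of property 1 fails: the Cesàro mean can converge even when the original sequence does not.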

478.3

alternating series

An alternating series is of the form

∑_{i=0}^{∞} (−1)^i a_i  or  ∑_{i=0}^{∞} (−1)^{i+1} a_i,

where (a_n) is a non-negative sequence. Version: 2 Owner: vitriol Author(s): vitriol

478.4

alternating series test

The alternating series test, or Leibniz's theorem, states the following:

Theorem [1, 2]. Let (a_n)_{n=1}^{∞} be a non-negative, non-increasing sequence of real numbers such that lim_{n→∞} a_n = 0. Then the infinite sum ∑_{n=1}^{∞} (−1)^{n+1} a_n converges.

This test provides a sufficient (but not necessary) condition for the convergence of an alternating series, and is therefore often used as a simple first test for convergence of such series. The condition lim_{n→∞} a_n = 0 is necessary for convergence of an alternating series.

Example: The series ∑_{k=1}^{∞} 1/k does not converge, but the alternating series ∑_{k=1}^{∞} (−1)^{k+1} (1/k) converges to ln(2).

REFERENCES
1. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
2. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.

Version: 10 Owner: Koro Author(s): Larry Hammick, matte, saforres, vitriol
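The example above is easy to check numerically (an added illustration); the partial sums of the alternating harmonic series approach ln(2), with error at most the first omitted term:

```python
import math

def alternating_harmonic_partial(n):
    # S_n = sum_{k=1}^{n} (-1)^(k+1) / k
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

s = alternating_harmonic_partial(1000)
# Leibniz error bound: |S_n - ln 2| <= a_{n+1} = 1/(n+1).
assert abs(s - math.log(2)) <= 1.0 / 1001
print(s)
```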

478.5

monotonic

A sequence (s_n) is said to be monotonic if it is

• monotonically increasing,
• monotonically decreasing,
• monotonically nondecreasing, or
• monotonically nonincreasing.

Intuitively, this means that the sequence can be thought of as a “staircase” going either only up or only down, with the stairs of any height and any depth. Version: 1 Owner: akrowne Author(s): akrowne

478.6

monotonically decreasing

A sequence (s_n) is monotonically decreasing if

s_m < s_n for all m > n.

Compare this to monotonically nonincreasing. Version: 4 Owner: akrowne Author(s): akrowne

478.7

monotonically increasing

A sequence (s_n) is called monotonically increasing if

s_m > s_n for all m > n.

Compare this to monotonically nondecreasing. Version: 3 Owner: akrowne Author(s): akrowne

478.8

monotonically nondecreasing

A sequence (s_n) is called monotonically nondecreasing if

s_m ≥ s_n for all m > n.

Compare this to monotonically increasing. Version: 2 Owner: akrowne Author(s): akrowne

478.9

monotonically nonincreasing

A sequence (s_n) is monotonically nonincreasing if

s_m ≤ s_n for all m > n.

Compare this to monotonically decreasing. Examples.

• (s_n) = 1, 0, −1, −2, . . . is monotonically nonincreasing. It is also monotonically decreasing.

• (sn ) = 1, 1, 1, 1, . . . is nonincreasing but not monotonically decreasing.
• (s_n) = (1/(n + 1)) is nonincreasing (note that n is nonnegative).

• (sn ) = 1, 1, 2, 1, 1, . . . is not nonincreasing. It also happens to fail to be monotonically nondecreasing. • (sn ) = 1, 2, 3, 4, 5, . . . is not nonincreasing, rather it is nondecreasing (and monotonically increasing). Version: 5 Owner: akrowne Author(s): akrowne

478.10

sequence

Sequences Given any set X, a sequence in X is a function f : N −→ X from the set of natural numbers to X. Sequences are usually written with subscript notation: x0 , x1 , x2 . . . , instead of f (0), f (1), f (2) . . . .

Generalized sequences One can generalize the above definition to any arbitrary ordinal. For any set X, a generalized sequence in X is a function f : ω −→ X where ω is any ordinal number. If ω is a finite ordinal, then we say the sequence is a finite sequence. Version: 5 Owner: djao Author(s): djao

478.11

series

Given a sequence of real numbers {a_n} we can define a sequence of partial sums {S_N}, where S_N = ∑_{n=1}^{N} a_n. We define the series ∑_{n=1}^{∞} a_n to be the limit of these partial sums. More precisely,

∑_{n=1}^{∞} a_n = lim_{N→∞} S_N = lim_{N→∞} ∑_{n=1}^{N} a_n.

The elements of the sequence {a_n} are called the terms of the series. Traditionally, as above, series are infinite sums of real numbers. However, the formal constraints on the terms {a_n} are much less strict. We need only be able to add the terms and take the limit of partial sums. So in full generality the terms could be complex numbers or even elements of certain rings, fields, and vector spaces. Version: 2 Owner: igor Author(s): igor
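For instance (an added illustration), the geometric series ∑_{n=1}^{∞} 1/2^n has partial sums S_N = 1 − 1/2^N, which converge to 1:

```python
def partial_sum(N):
    # S_N = sum_{n=1}^{N} 1 / 2^n, the N-th partial sum of the series.
    return sum(1.0 / 2 ** n for n in range(1, N + 1))

# S_N = 1 - 1/2^N, so the partial sums converge to 1.
assert abs(partial_sum(10) - (1.0 - 2.0 ** -10)) < 1e-12
assert abs(partial_sum(50) - 1.0) < 1e-12
print(partial_sum(10))
```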


Chapter 479 40A05 – Convergence and divergence of series and sequences
479.1 Abel’s lemma

Theorem 1 Let $\{a_i\}_{i=0}^N$ and $\{b_i\}_{i=0}^N$ be sequences of real (or complex) numbers with $N \ge 0$. For $n = 0, \ldots, N$, let $A_n$ be the partial sum $A_n = \sum_{i=0}^n a_i$. Then

$$\sum_{i=0}^N a_i b_i = \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N.$$

In the trivial case, when $N = 0$, the sum on the right hand side should be interpreted as identically zero. In other words, if the upper limit is below the lower limit, there is no summation. An inductive proof can be found here. The result can be found in [1] (Exercise 3.3.5). If the sequences are indexed from $M$ to $N$, we have the following variant:

Corollary Let $\{a_i\}_{i=M}^N$ and $\{b_i\}_{i=M}^N$ be sequences of real (or complex) numbers with $0 \le M \le N$. For $n = M, \ldots, N$, let $A_n$ be the partial sum $A_n = \sum_{i=M}^n a_i$. Then

$$\sum_{i=M}^N a_i b_i = \sum_{i=M}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N.$$

Proof. By defining $a_0 = \ldots = a_{M-1} = b_0 = \ldots = b_{M-1} = 0$, we can apply Theorem 1 to the sequences $\{a_i\}_{i=0}^N$ and $\{b_i\}_{i=0}^N$. ∎


REFERENCES
1. R.B. Guenther, L.W. Lee, Partial Differential Equations of Mathematical Physics and Integral Equations, Dover Publications, 1988.

Version: 10 Owner: mathcam Author(s): matte, lieven

479.2

Abel’s test for convergence

Suppose $\sum a_n$ converges and that $(b_n)$ is a monotonic convergent sequence. Then the series $\sum a_n b_n$ converges. Version: 4 Owner: vypertd Author(s): vypertd

479.3 Baroni's Theorem

Let $(x_n)_{n \ge 0}$ be a sequence of real numbers such that $\lim_{n\to\infty}(x_{n+1} - x_n) = 0$. Let $A = \{x_n \mid n \in \mathbb{N}\}$ and $A'$ the set of limit points of $A$. Then $A'$ is a (possibly degenerate) interval of $\overline{\mathbb{R}}$, where $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$. Version: 2 Owner: slash Author(s): slash

479.4

Bolzano-Weierstrass theorem

Given any bounded real sequence $(a_n)$ there exists a convergent subsequence $(a_{n_j})$. More generally, any sequence $(a_n)$ in a compact set has a convergent subsequence. Version: 6 Owner: vitriol Author(s): vitriol

479.5 Cauchy criterion for convergence

A series $\sum_{i=0}^\infty a_i$ is convergent iff for every $\varepsilon > 0$ there is a number $N \in \mathbb{N}$ such that

$$|a_{n+1} + a_{n+2} + \ldots + a_{n+p}| < \varepsilon$$

holds for all $n > N$ and $p \ge 1$.

Proof: First define $s_n := \sum_{i=0}^n a_i$. Now by definition the series converges iff for every $\varepsilon > 0$ there is a number $N$ such that for all $n, m > N$ holds: $|s_m - s_n| < \varepsilon$. We can assume $m > n$ and thus set $m = n + p$. Then the series is convergent iff

$$|s_{n+p} - s_n| = |a_{n+1} + a_{n+2} + \ldots + a_{n+p}| < \varepsilon.$$

Version: 2 Owner: mathwizard Author(s): mathwizard

479.6 Cauchy's root test

If $\sum a_n$ is a series of positive real terms and $\sqrt[n]{a_n} < k < 1$ for all $n > N$, then $\sum a_n$ is convergent. If $\sqrt[n]{a_n} \ge 1$ for an infinite number of values of $n$, then $\sum a_n$ is divergent.

Limit form. Given a series $\sum a_n$ of complex terms, set

$$\rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|}.$$

The series $\sum a_n$ is absolutely convergent if $\rho < 1$ and is divergent if $\rho > 1$. If $\rho = 1$, then the test is inconclusive. Version: 4 Owner: vypertd Author(s): vypertd
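As a numerical sketch of the limit form (an illustration added here, not part of the original entry), for $a_n = n/2^n$ the roots $\sqrt[n]{|a_n|}$ tend to $\rho = 1/2 < 1$, so the series converges absolutely:

```python
# Root test, limit form: |a_n|^(1/n) for a_n = n / 2^n tends to 1/2 < 1.
def nth_root_of_term(n):
    a_n = n / 2.0 ** n
    return a_n ** (1.0 / n)

print([nth_root_of_term(n) for n in (10, 100, 1000)])  # decreasing toward 0.5
print(abs(nth_root_of_term(1000) - 0.5) < 0.01)        # True
```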

479.7 Dirichlet's convergence test

Theorem. Let $\{a_n\}$ and $\{b_n\}$ be sequences of real numbers such that $\{\sum_{i=0}^n a_i\}$ is bounded and $\{b_n\}$ decreases with 0 as limit. Then $\sum_{n=0}^\infty a_n b_n$ converges.

Proof. Let $A_n := \sum_{i=0}^n a_i$ and let $M$ be an upper bound for $\{|A_n|\}$. By Abel's lemma,

$$\sum_{i=m}^n a_i b_i = \sum_{i=0}^n a_i b_i - \sum_{i=0}^{m-1} a_i b_i$$
$$= \sum_{i=0}^{n-1} A_i (b_i - b_{i+1}) - \sum_{i=0}^{m-2} A_i (b_i - b_{i+1}) + A_n b_n - A_{m-1} b_{m-1}$$
$$= \sum_{i=m-1}^{n-1} A_i (b_i - b_{i+1}) + A_n b_n - A_{m-1} b_{m-1},$$

so that

$$\left|\sum_{i=m}^n a_i b_i\right| \le \sum_{i=m-1}^{n-1} |A_i (b_i - b_{i+1})| + |A_n b_n| + |A_{m-1} b_{m-1}| \le M \sum_{i=m-1}^{n-1} (b_i - b_{i+1}) + |A_n b_n| + |A_{m-1} b_{m-1}|.$$

Since $\{b_n\}$ converges to 0, there is an $N(\varepsilon)$ such that both $\sum_{i=m-1}^{n-1} (b_i - b_{i+1}) < \frac{\varepsilon}{3M}$ and $b_i < \frac{\varepsilon}{3M}$ for $m, n > N(\varepsilon)$. Then, for $m, n > N(\varepsilon)$, $\left|\sum_{i=m}^n a_i b_i\right| < \varepsilon$ and $\sum a_n b_n$ converges. Version: 1 Owner: lieven Author(s): lieven

479.8 Proof of Baroni's Theorem

Let $m = \inf A'$ and $M = \sup A'$. If $m = M$ we are done, since the sequence is convergent and $A'$ is the degenerate interval composed of the point $l \in \overline{\mathbb{R}}$, where $l = \lim_{n\to\infty} x_n$.

Now assume that $m < M$. For every $\lambda \in (m, M)$ we will construct inductively two subsequences $x_{k_n}$ and $x_{l_n}$ such that $\lim_{n\to\infty} x_{k_n} = \lim_{n\to\infty} x_{l_n} = \lambda$ and $x_{k_n} < \lambda < x_{l_n}$.

From the definition of $M$ there is an $N_1 \in \mathbb{N}$ such that $\lambda < x_{N_1} < M$. Consider the set of all such values $N_1$. It is bounded from below (because it consists only of natural numbers and has at least one element) and thus it has a smallest element. Let $n_1$ be the smallest such element; from its definition we have $x_{n_1 - 1} \le \lambda < x_{n_1}$. So choose $k_1 = n_1 - 1$, $l_1 = n_1$. Now there is an $N_2 > k_1$ such that $\lambda < x_{N_2} < M$.

Consider the set of all such values $N_2$. It is bounded from below and has a smallest element $n_2$. Choose $k_2 = n_2 - 1$ and $l_2 = n_2$. Now proceed by induction to construct the sequences $k_n$ and $l_n$ in the same fashion. Since $l_n - k_n = 1$ and $\lim_{n\to\infty}(x_{n+1} - x_n) = 0$, we have

$$\lim_{n\to\infty} x_{k_n} = \lim_{n\to\infty} x_{l_n}$$

and thus they are both equal to $\lambda$. Version: 1 Owner: slash Author(s): slash

479.9 Proof of Stolz-Cesàro theorem

From the definition of convergence, for every $\varepsilon > 0$ there is $N(\varepsilon) \in \mathbb{N}$ such that for all $n \ge N(\varepsilon)$ we have

$$l - \varepsilon < \frac{a_{n+1} - a_n}{b_{n+1} - b_n} < l + \varepsilon.$$

Because $b_n$ is strictly increasing we can multiply the last relation by $b_{n+1} - b_n$ to get

$$(l - \varepsilon)(b_{n+1} - b_n) < a_{n+1} - a_n < (l + \varepsilon)(b_{n+1} - b_n).$$

Let $k > N(\varepsilon)$ be a natural number. Summing the last relation from $i = N(\varepsilon)$ to $k$ we get

$$(l - \varepsilon)\sum_{i=N(\varepsilon)}^{k}(b_{i+1} - b_i) < \sum_{i=N(\varepsilon)}^{k}(a_{i+1} - a_i) < (l + \varepsilon)\sum_{i=N(\varepsilon)}^{k}(b_{i+1} - b_i),$$

that is,

$$(l - \varepsilon)(b_{k+1} - b_{N(\varepsilon)}) < a_{k+1} - a_{N(\varepsilon)} < (l + \varepsilon)(b_{k+1} - b_{N(\varepsilon)}).$$

Divide the last relation by $b_{k+1} > 0$ to get

$$(l - \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right) + \frac{a_{N(\varepsilon)}}{b_{k+1}} < \frac{a_{k+1}}{b_{k+1}} < (l + \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right) + \frac{a_{N(\varepsilon)}}{b_{k+1}}.$$

Since $b_{k+1} \to \infty$, the terms $\frac{b_{N(\varepsilon)}}{b_{k+1}}$ and $\frac{a_{N(\varepsilon)}}{b_{k+1}}$ converge to 0, so there is some $K$ such that for $k \ge K$ we have

$$l - 2\varepsilon < \frac{a_{k+1}}{b_{k+1}} < l + 2\varepsilon.$$

This obviously means that

$$\lim_{n\to\infty} \frac{a_n}{b_n} = l$$

and we are done. Version: 1 Owner: slash Author(s): slash

479.10 Stolz-Cesàro theorem

Let $(a_n)_{n \ge 1}$ and $(b_n)_{n \ge 1}$ be two sequences of real numbers. If $b_n$ is positive, strictly increasing and unbounded and the following limit exists:

$$\lim_{n\to\infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = l,$$

then the limit

$$\lim_{n\to\infty} \frac{a_n}{b_n}$$

also exists and it is equal to $l$. Version: 4 Owner: Daume Author(s): Daume, slash
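As a numerical check (an illustration added here, not part of the original entry), take $a_n = 1 + 2 + \cdots + n$ and $b_n = n^2$; then $\frac{a_{n+1}-a_n}{b_{n+1}-b_n} = \frac{n+1}{2n+1} \to \frac{1}{2}$, and indeed $\frac{a_n}{b_n} \to \frac{1}{2}$ as the theorem predicts:

```python
# Stolz-Cesaro sanity check with a_n = 1 + 2 + ... + n and b_n = n^2
# (b_n is positive, strictly increasing and unbounded).
def a(n):
    return n * (n + 1) // 2

def b(n):
    return n * n

n = 10**6
print((a(n + 1) - a(n)) / (b(n + 1) - b(n)))  # ~ 0.5
print(a(n) / b(n))                            # ~ 0.5 as well
```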

479.11

absolute convergence theorem

Every absolutely convergent series is convergent. Version: 1 Owner: paolini Author(s): paolini

479.12 comparison test

The series $\sum_{i=0}^\infty a_i$ with real $a_i$ is absolutely convergent if there is a sequence $(b_n)_{n\in\mathbb{N}}$ with positive real $b_n$ such that $\sum_{i=0}^\infty b_i$ is convergent and $|a_k| \le b_k$ holds for all sufficiently large $k$.

Also, the series $\sum a_i$ is divergent if there is a sequence $(b_n)$ with positive real $b_n$, so that $\sum b_i$ is divergent and $a_k \ge b_k$ for all sufficiently large $k$. Version: 1 Owner: mathwizard Author(s): mathwizard


479.13

convergent sequence

A sequence $x_0, x_1, x_2, \ldots$ in a metric space $(X, d)$ is a convergent sequence if there exists a point $x \in X$ such that, for every real number $\varepsilon > 0$, there exists a natural number $N$ such that $d(x, x_n) < \varepsilon$ for all $n > N$. The point $x$, if it exists, is unique, and is called the limit point of the sequence. One can also say that the sequence $x_0, x_1, x_2, \ldots$ converges to $x$. A sequence is said to be divergent if it does not converge. Version: 4 Owner: djao Author(s): djao

479.14

convergent series

A series $\Sigma a_n$ is convergent iff the sequence of partial sums $s_n = \sum_{i=1}^n a_i$ is convergent. A series $\Sigma a_n$ is said to be absolutely convergent if $\Sigma |a_n|$ is convergent. Equivalently, a series $\Sigma a_n$ is absolutely convergent if and only if all possible rearrangements are also convergent. A series $\Sigma a_n$ which converges, but which is not absolutely convergent, is called conditionally convergent. It can be shown that absolute convergence implies convergence. Let $\Sigma a_n$ be an absolutely convergent series, and $\Sigma b_n$ be a conditionally convergent series. Then any rearrangement of $\Sigma a_n$ is convergent to the same sum. It is a result due to Riemann that $\Sigma b_n$ can be rearranged to converge to any sum, or not converge at all. Version: 5 Owner: vitriol Author(s): vitriol

479.15

determining series convergence

Consider a series $\Sigma a_n$. To determine whether $\Sigma a_n$ converges or diverges, several tests are available. There is no precise rule indicating which type of test to use with a given series. The more obvious approaches are collected below.

• When the terms in $\Sigma a_n$ are positive, there are several possibilities:

– the comparison test,
– the root test (Cauchy's root test),
– the ratio test,
– the integral test.

• If the series is an alternating series, then the alternating series test may be used.

• Abel's test for convergence can be used when terms in $\Sigma a_n$ can be obtained as the product of terms of a convergent series with terms of a monotonic convergent sequence.

The root test and the ratio test are direct applications of the comparison test to the geometric series with ratio $|a_n|^{1/n}$ and $\left|\frac{a_{n+1}}{a_n}\right|$, respectively. Version: 2 Owner: jarino Author(s): jarino

479.16 example of integral test

Consider the series

$$\sum_{k=2}^\infty \frac{1}{k \log k}.$$

Since the integral

$$\int_2^\infty \frac{1}{x \log x}\, dx = \lim_{M\to\infty} \left[\log(\log(x))\right]_2^M$$

is divergent, the series considered is also divergent. Version: 2 Owner: paolini Author(s): paolini

479.17 geometric series

A geometric series is a series of the form

$$\sum_{i=1}^n a r^{i-1}$$

(with $a$ and $r$ real or complex numbers). The sum of a geometric series is

$$s_n = \frac{a(1 - r^n)}{1 - r} \qquad (479.17.1)$$

An infinite geometric series is a geometric series, as above, with $n \to \infty$. It is denoted

$$\sum_{i=1}^\infty a r^{i-1}$$

If $|r| \ge 1$, the infinite geometric series diverges. Otherwise it converges to

$$\sum_{i=1}^\infty a r^{i-1} = \frac{a}{1 - r} \qquad (479.17.2)$$

Taking the limit of $s_n$ as $n \to \infty$, we see that $s_n$ diverges if $|r| \ge 1$. However, if $|r| < 1$, $s_n$ approaches (479.17.2). One way to prove (479.17.1) is to take

$$s_n = a + ar + ar^2 + \cdots + ar^{n-1}$$

and multiply by $r$, to get

$$r s_n = ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n.$$

Subtracting the two removes most of the terms:

$$s_n - r s_n = a - a r^n.$$

Factoring and dividing gives us

$$s_n = \frac{a(1 - r^n)}{1 - r}.$$

Version: 6 Owner: akrowne Author(s): akrowne
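The closed form (479.17.1) is easy to check numerically (an illustration added here, not part of the original entry):

```python
# Compare the direct partial sum of a geometric series with a(1 - r^n)/(1 - r).
def geometric_sum_direct(a, r, n):
    return sum(a * r ** (i - 1) for i in range(1, n + 1))

def geometric_sum_closed(a, r, n):
    return a * (1 - r ** n) / (1 - r)

print(geometric_sum_direct(3.0, 0.5, 20))  # ~ 5.999994
print(geometric_sum_closed(3.0, 0.5, 20))  # same value
print(3.0 / (1 - 0.5))                     # 6.0, the infinite sum since |r| < 1
```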

479.18 harmonic number

The harmonic number of order $n$ of $\theta$ is defined as

$$H_\theta(n) = \sum_{i=1}^n \frac{1}{i^\theta}.$$

Note that $n$ may be equal to $\infty$, provided $\theta > 1$. If $\theta \le 1$, while $n = \infty$, the harmonic series does not converge and hence the harmonic number does not exist. If $\theta = 1$, we may just write $H_\theta(n)$ as $H_n$ (this is a common notation).

479.18.1 Properties

• If $\operatorname{Re}(\theta) > 1$ and $n = \infty$ then the sum is the Riemann zeta function.

• If $\theta = 1$, then we get what is known simply as "the harmonic number", and it has many important properties. For example, it has asymptotic expansion $H_n = \ln n + \gamma + \frac{1}{2n} + \ldots$ where $\gamma$ is Euler's constant.

• It is possible to define harmonic numbers for non-integral $n$. This is done by means of the series $H_x(z) = \sum_{n=1}^\infty \left(n^{-z} - (n + x)^{-z}\right)$. Version: 5 Owner: akrowne Author(s): akrowne

479.19 harmonic series

The harmonic series is

$$h = \sum_{n=1}^\infty \frac{1}{n}$$

The harmonic series is known to diverge. This can be proven via the integral test; compare $h$ with

$$\int_1^\infty \frac{1}{x}\, dx.$$

A harmonic series is any series of the form

$$h_p = \sum_{n=1}^\infty \frac{1}{n^p}$$

These are the so-called "p-series." When $p > 1$, these are known to converge (leading to the p-series test for series convergence). For complex-valued $p$, $h_p = \zeta(p)$, the Riemann zeta function.

A famous harmonic series is $h_2$ (or $\zeta(2)$), which converges to $\frac{\pi^2}{6}$. In general no p-harmonic series of odd $p$ has been solved analytically.

A harmonic series which is not summed to $\infty$, but instead is of the form

$$h_p(k) = \sum_{n=1}^k \frac{1}{n^p}$$

is called a harmonic series of order $k$ of $p$.

See "The Art of Computer Programming" vol. 2 by D. Knuth.

Version: 2 Owner: akrowne Author(s): akrowne
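Numerically (an illustration added here, not part of the original entry), partial sums of $h_2$ approach $\pi^2/6 \approx 1.6449$, while the harmonic series itself grows without bound, roughly like $\ln n$:

```python
import math

# Partial sums h_p(k) = sum_{n=1}^k 1/n^p of a p-series.
def h(p, k):
    return sum(1.0 / n ** p for n in range(1, k + 1))

print(h(2, 100000))                     # ~ 1.64492, near pi^2/6
print(math.pi ** 2 / 6)                 # 1.6449340668...
print(h(1, 100000) - math.log(100000))  # ~ 0.57722: the harmonic sum grows like ln n
```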

479.20 integral test

Consider a sequence $(a_n) = \{a_0, a_1, a_2, a_3, \ldots\}$ and, given $M \in \mathbb{R}$, consider any monotonically nonincreasing function $f : [M, +\infty) \to \mathbb{R}$ which extends the sequence, i.e.

$$f(n) = a_n \quad \text{for all } n \ge M.$$

An example is $a_n = 2n$ and $f(x) = 2x$ (the former being the sequence $\{0, 2, 4, 6, 8, \ldots\}$ and the latter the doubling function for any real number). We are interested in finding out when the summation $\sum_{n=0}^\infty a_n$ converges.

The integral test states the following. The series

$$\sum_{n=0}^\infty a_n$$

converges if and only if the integral

$$\int_M^\infty f(x)\, dx$$

is finite. Version: 16 Owner: drini Author(s): paolini, drini, vitriol

479.21 proof of Abel's lemma (by induction)

Proof. The proof is by induction. However, let us first recall that the sum on the right side is a piecewise-defined function of the upper limit $N - 1$. In other words, if the upper limit is below the lower limit 0, the sum is identically set to zero. Otherwise, it is an ordinary sum. We therefore need to manually check the first two cases. For the trivial case $N = 0$, both sides equal $a_0 b_0$. Also, for $N = 1$ (when the sum is a normal sum), it is easy to verify that both sides simplify to $a_0 b_0 + a_1 b_1$. Then, for the induction step, suppose that the claim holds for some $N \ge 1$. For $N + 1$, we then have

$$\sum_{i=0}^{N+1} a_i b_i = \sum_{i=0}^{N} a_i b_i + a_{N+1} b_{N+1} = \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N + a_{N+1} b_{N+1}$$
$$= \sum_{i=0}^{N} A_i (b_i - b_{i+1}) - A_N (b_N - b_{N+1}) + A_N b_N + a_{N+1} b_{N+1}.$$

Since $-A_N (b_N - b_{N+1}) + A_N b_N + a_{N+1} b_{N+1} = A_{N+1} b_{N+1}$, the claim follows. ∎
Version: 4 Owner: mathcam Author(s): matte

479.22 proof of Abel's test for convergence

Let $b$ be the limit of $\{b_n\}$ and let $d_n = b_n - b$ when $\{b_n\}$ is decreasing and $d_n = b - b_n$ when $\{b_n\}$ is increasing. By Dirichlet's convergence test, $\sum a_n d_n$ is convergent and so is $\sum a_n b_n = \sum a_n (b \pm d_n) = b \sum a_n \pm \sum a_n d_n$. Version: 1 Owner: lieven Author(s): lieven

479.23

proof of Bolzano-Weierstrass Theorem

To prove the Bolzano-Weierstrass theorem, we will first need two lemmas. Lemma 1.

All bounded monotone sequences converge.

proof. Let $(s_n)$ be a bounded, nondecreasing sequence. Let $S$ denote the set $\{s_n : n \in \mathbb{N}\}$. Then let $b = \sup S$ (the supremum of $S$). Choose some $\varepsilon > 0$. Then there is a corresponding $N$ such that $s_N > b - \varepsilon$. Since $(s_n)$ is nondecreasing, for all $n > N$, $s_n > b - \varepsilon$. But $(s_n)$ is bounded, so we have $b - \varepsilon < s_n \le b$. But this implies $|s_n - b| < \varepsilon$, so $\lim s_n = b$. (The proof for nonincreasing sequences is analogous.)

Lemma 2. Every sequence has a monotonic subsequence.

proof. First a definition: call the $n$th term of a sequence dominant if it is greater than every term following it. For the proof, note that a sequence $(s_n)$ may have finitely many or infinitely many dominant terms. First we suppose that $(s_n)$ has infinitely many dominant terms. Form a subsequence $(s_{n_k})$ solely of dominant terms of $(s_n)$. Then $s_{n_{k+1}} < s_{n_k}$ for all $k$ by definition of "dominant", hence $(s_{n_k})$ is a decreasing (monotone) subsequence of $(s_n)$. For the second case, assume that our sequence $(s_n)$ has only finitely many dominant terms. Select $n_1$ such that $n_1$ is beyond the last dominant term. But since $n_1$ is not dominant, there must be some $m > n_1$ such that $s_m > s_{n_1}$. Select this $m$ and call it $n_2$. However, $n_2$ is still not dominant, so there must be an $n_3 > n_2$ with $s_{n_3} > s_{n_2}$, and so on, inductively. The resulting subsequence $s_{n_1}, s_{n_2}, s_{n_3}, \ldots$ is monotonic (nondecreasing).

proof of Bolzano-Weierstrass. The proof of the Bolzano-Weierstrass theorem is now simple: let $(s_n)$ be a bounded sequence. By Lemma 2 it has a monotonic subsequence. By Lemma 1, the subsequence converges. Version: 2 Owner: akrowne Author(s): akrowne


479.24 proof of Cauchy's root test

If $\sqrt[n]{a_n} < k < 1$ for all $n > N$ then

$$a_n < k^n < 1.$$

Since $\sum_{i=N}^\infty k^i$ converges, so does $\sum_{i=N}^\infty a_i$ by the comparison test. If $\sqrt[n]{a_n} \ge 1$ for an infinite number of values of $n$, then $a_n \ge 1$ infinitely often and the series is divergent by comparison with $\sum_{i=N}^\infty 1$. Absolute convergence in the case of non-positive $a_n$ can be proven in exactly the same way using $\sqrt[n]{|a_n|}$. Version: 1 Owner: mathwizard Author(s): mathwizard

479.25 proof of Leibniz's theorem (using Dirichlet's convergence test)

Proof. Let us define the sequence $\alpha_n = (-1)^n$ for $n \in \mathbb{N} = \{0, 1, 2, \ldots\}$. Then

$$\sum_{i=0}^n \alpha_i = \begin{cases} 1 & \text{for even } n, \\ 0 & \text{for odd } n, \end{cases}$$

so the sequence $\sum_{i=0}^n \alpha_i$ is bounded. By assumption $\{a_n\}_{n=1}^\infty$ is a bounded decreasing sequence with limit 0. For $n \in \mathbb{N}$ we set $b_n := a_{n+1}$. Using Dirichlet's convergence test, it follows that the series $\sum_{i=0}^\infty \alpha_i b_i$ converges. Since

$$\sum_{i=0}^\infty \alpha_i b_i = \sum_{n=1}^\infty (-1)^{n+1} a_n,$$

the claim follows. ∎ Version: 4 Owner: mathcam Author(s): matte, Thomas Heye

479.26 proof of absolute convergence theorem

Suppose that $\sum a_n$ is absolutely convergent, i.e., that $\sum |a_n|$ is convergent. First of all, notice that

$$0 \le a_n + |a_n| \le 2|a_n|,$$

and since the series $\sum (a_n + |a_n|)$ has non-negative terms it can be compared with $\sum 2|a_n| = 2\sum |a_n|$ and hence converges.

On the other hand,

$$\sum_{n=1}^N a_n = \sum_{n=1}^N (a_n + |a_n|) - \sum_{n=1}^N |a_n|.$$

Since both the partial sums on the right hand side are convergent, the partial sum on the left hand side is also convergent. So, the series $\sum a_n$ is convergent. Version: 3 Owner: paolini Author(s): paolini

479.27 proof of alternating series test

If the first term $a_1$ is positive then the series has partial sums

$$S_{2n+2} = a_1 - a_2 + a_3 - \ldots - a_{2n} + a_{2n+1} - a_{2n+2},$$

where the $a_i$ are all non-negative and non-increasing. If the first term is negative, consider the series in the absence of the first term. From above, we have

$$S_{2n+1} = S_{2n} + a_{2n+1},$$
$$S_{2n+2} = S_{2n} + (a_{2n+1} - a_{2n+2}),$$
$$S_{2n+3} = S_{2n+1} - (a_{2n+2} - a_{2n+3}) = S_{2n+2} + a_{2n+3}.$$

Since $a_{2n+1} \ge a_{2n+2} \ge a_{2n+3}$, we have $S_{2n+2} \ge S_{2n}$ and $S_{2n+3} \le S_{2n+1}$. Moreover,

$$S_{2n+2} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n} - a_{2n+1}) - a_{2n+2}.$$

Because the $a_i$ are non-increasing, we have $S_n \ge 0$ for any $n$. Also, $S_{2n+2} \le S_{2n+2} + a_{2n+3} = S_{2n+3}$. Thus

$$a_1 \ge S_{2n+1} \ge S_{2n+3} \ge S_{2n+2} \ge S_{2n} \ge 0.$$

Hence the even partial sums $S_{2n}$ and the odd partial sums $S_{2n+1}$ are bounded. The $S_{2n}$ are monotonically nondecreasing, while the odd sums $S_{2n+1}$ are monotonically nonincreasing. Thus the even and odd sequences of partial sums both converge. We note that $S_{2n+1} - S_{2n} = a_{2n+1}$; therefore the sums converge to the same limit if and only if $(a_n) \to 0$. The theorem is then established. Version: 7 Owner: volator Author(s): volator

479.28 proof of comparison test

Assume $|a_k| \le b_k$ for all $k > n$. Then we define the partial sums

$$s_k := \sum_{i=n+1}^k |a_i| \quad \text{and} \quad t_k := \sum_{i=n+1}^k b_i.$$

Obviously $s_k \le t_k$ for all $k > n$. Since by assumption $(t_k)$ is convergent, $(t_k)$ is bounded and so is $(s_k)$. Also $(s_k)$ is monotonic and therefore convergent. Therefore $\sum_{i=0}^\infty a_i$ is absolutely convergent.

Now assume $b_k \le a_k$ for all $k > n$. If $\sum_{i=0}^\infty b_i$ is divergent then so is $\sum_{i=0}^\infty a_i$, because otherwise we could apply the test we just proved and show that $\sum_{i=0}^\infty b_i$ is convergent, which it is not by assumption. Version: 1 Owner: mathwizard Author(s): mathwizard

479.29 proof of integral test

Consider the function (see the definition of floor)

$$g(x) = a_{\lfloor x \rfloor}.$$

Clearly for $x \in [n, n+1)$, $f$ being non-increasing, we have

$$g(x+1) = a_{n+1} = f(n+1) \le f(x) \le f(n) = a_n = g(x),$$

hence

$$\int_M^{+\infty} g(x+1)\, dx = \int_{M+1}^{+\infty} g(x)\, dx \le \int_M^{+\infty} f(x)\, dx \le \int_M^{+\infty} g(x)\, dx.$$

Since the integral of $f$ and $g$ on $[M, M+1]$ is finite, we notice that $f$ is integrable on $[M, +\infty)$ if and only if $g$ is integrable on $[M, +\infty)$. On the other hand $g$ is locally constant, so

$$\int_n^{n+1} g(x)\, dx = \int_n^{n+1} a_n\, dx = a_n,$$

and hence for all $N \in \mathbb{Z}$

$$\int_N^{+\infty} g(x)\, dx = \sum_{n=N}^\infty a_n,$$

that is, $g$ is integrable on $[N, +\infty)$ if and only if $\sum_{n=N}^\infty a_n$ is convergent.

But, again, $\int_M^N g(x)\, dx$ is finite, hence $g$ is integrable on $[M, +\infty)$ if and only if $g$ is integrable on $[N, +\infty)$; and also $\sum_{n=0}^{N-1} a_n$ is finite, so $\sum_{n=0}^\infty a_n$ is convergent if and only if $\sum_{n=N}^\infty a_n$ is convergent. Version: 1 Owner: paolini Author(s): paolini


479.30 proof of ratio test

Assume $k < 1$. By definition there exists $N$ such that $n > N$ implies $\left| \left|\frac{a_{n+1}}{a_n}\right| - k \right| < \frac{1-k}{2}$, and hence $\left|\frac{a_{n+1}}{a_n}\right| < \frac{1+k}{2} < 1$; i.e. eventually the series $\sum |a_n|$ becomes less than a convergent geometric series, therefore a shifted tail of $\sum |a_n|$ converges by the comparison test. Note that a general series $\sum b_n$ converges iff a shifted tail of $\sum b_n$ converges. Therefore, by the absolute convergence theorem, the series $\sum a_n$ converges.

Similarly for $k > 1$ a shifted tail of $\sum |a_n|$ becomes greater than a geometric series tending to $\infty$, and so also tends to $\infty$. Therefore $\sum a_n$ diverges. Version: 3 Owner: vitriol Author(s): vitriol

479.31 ratio test

Let $(a_n)$ be a real sequence. If $\left|\frac{a_{n+1}}{a_n}\right| \to k$ then:

• $k < 1 \Rightarrow \sum a_n$ converges absolutely
• $k > 1 \Rightarrow \sum a_n$ diverges

Version: 4 Owner: vitriol Author(s): vitriol
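For example (a numerical sketch added here, not part of the original entry), for $a_n = 2^n/n!$ the ratios $\left|\frac{a_{n+1}}{a_n}\right| = \frac{2}{n+1} \to 0 < 1$, so the series converges absolutely (its sum is $e^2$):

```python
import math

# Ratio test for a_n = 2^n / n!: the ratio a_{n+1}/a_n = 2/(n+1) tends to 0 < 1.
def a(n):
    return 2.0 ** n / math.factorial(n)

print([a(n + 1) / a(n) for n in (1, 10, 100)])  # [1.0, 0.1818..., 0.0198...]
print(sum(a(n) for n in range(30)))             # ~ 7.389..., i.e. e^2
```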


Chapter 480 40A10 – Convergence and divergence of integrals
480.1 improper integral

Improper integrals are integrals in which the interval of integration is unbounded, or in which the integrand becomes infinite at or between the limits of integration. To evaluate these integrals, we use a limit process on the antiderivative. Thus we say that an improper integral converges or diverges according as this limit converges or diverges. [examples and more exposition later] Version: 1 Owner: slider142 Author(s): slider142


Chapter 481 40A25 – Approximation to limiting values (summation of series, etc.)
481.1 Euler’s constant

Euler’s constant γ is defined by 1 1 1 1 + + + · · · + − ln n 2 3 4 n

γ = lim 1 +
n→∞

or equivalently
n

γ = lim Euler’s constant has the value

n→∞

i=1

1 1 − ln 1 + i i

0.57721566490153286060651209008240243104 . . . It is related to the gamma function by γ = −Γ (1) It is not known whether γ is rational or irrational. References.


• Chris Caldwell - “Euler’s Constant”, http://primes.utm.edu/glossary/page.php/Gamma.html Version: 6 Owner: akrowne Author(s): akrowne
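The defining limit converges slowly; subtracting the $\frac{1}{2n}$ correction suggested by the asymptotic expansion of $H_n$ gives a much better estimate (a numerical sketch added here, not part of the original entry):

```python
import math

# gamma = lim (1 + 1/2 + ... + 1/n - ln n); subtracting 1/(2n) accelerates convergence.
def gamma_estimate(n):
    harmonic = sum(1.0 / i for i in range(1, n + 1))
    return harmonic - math.log(n) - 1.0 / (2 * n)

print(gamma_estimate(10000))                                    # 0.577215664...
print(abs(gamma_estimate(10000) - 0.57721566490153286) < 1e-8)  # True
```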


Chapter 482 40A30 – Convergence and divergence of series and sequences of functions
482.1 Abel’s limit theorem
Suppose that $\sum a_n x^n$ has a radius of convergence $r$ and that $\sum a_n r^n$ is convergent. Then

$$\lim_{x \to r^-} \sum a_n x^n = \sum a_n r^n.$$

Version: 2 Owner: vypertd Author(s): vypertd

482.2 Löwner partial ordering

Let $A$ and $B$ be two Hermitian matrices of the same size. If $A - B$ is positive semidefinite we write

$$A \ge B \quad \text{or} \quad B \le A.$$

Note: $\ge$ is a partial ordering, referred to as the Löwner partial ordering, on the set of Hermitian matrices. Version: 3 Owner: Johan Author(s): Johan

1763

482.3 Löwner's theorem

A real function $f$ on an interval $I$ is matrix monotone if and only if it is real analytic and has (complex) analytic continuations to the upper and lower half planes such that $\operatorname{Im}(f) > 0$ in the upper half plane. (Löwner 1934) Version: 4 Owner: mathcam Author(s): Larry Hammick, yark, Johan

482.4 matrix monotone

A real function $f$ on a real interval $I$ is said to be matrix monotone of order $n$ if

$$A \le B \Rightarrow f(A) \le f(B) \qquad (482.4.1)$$

for all Hermitian $n \times n$ matrices $A$, $B$ with spectra contained in $I$. Version: 5 Owner: Johan Author(s): Johan

482.5

operator monotone

A function is said to be operator monotone if it is matrix monotone of arbitrary order. Version: 2 Owner: Johan Author(s): Johan

482.6

pointwise convergence

Let X be any set, and let Y be a topological space. A sequence f1 , f2 , . . . of functions mapping X to Y is said to be pointwise convergent (or simply convergent) to another function f , if the sequence fn (x) converges to f (x) for each x in X. This is usually denoted by fn → f . Version: 1 Owner: Koro Author(s): Koro

1764

482.7

uniform convergence

Let $X$ be any set, and let $(Y, d)$ be a metric space. A sequence $f_1, f_2, \ldots$ of functions mapping $X$ to $Y$ is said to be uniformly convergent to another function $f$ if, for each $\varepsilon > 0$, there exists $N$ such that, for all $x$ and all $n > N$, we have $d(f_n(x), f(x)) < \varepsilon$. This is denoted by $f_n \xrightarrow{u} f$, or "$f_n \to f$ uniformly" or, less frequently, by $f_n \rightrightarrows f$. Version: 8 Owner: Koro Author(s): Koro


Chapter 483 40G05 – Cesàro, Euler, Nörlund and Hausdorff methods
483.1 Cesàro summability

Cesàro summability is a generalized convergence criterion for infinite series. We say that a series $\sum_{n=0}^\infty a_n$ is Cesàro summable if the Cesàro means of the partial sums converge to some limit $L$. To be more precise, letting

$$s_N = \sum_{n=0}^N a_n$$

denote the $N$th partial sum, we say that $\sum_{n=0}^\infty a_n$ Cesàro converges to a limit $L$ if

$$\frac{1}{N+1}(s_0 + \ldots + s_N) \to L \quad \text{as } N \to \infty.$$

Cesàro summability is a generalization of the usual definition of the limit of an infinite series.

Proposition 19. Suppose that

$$\sum_{n=0}^\infty a_n = L,$$

in the usual sense that $s_N \to L$ as $N \to \infty$. Then, the series in question Cesàro converges to the same limit.

The converse, however, is false. The standard example of a divergent series that is nonetheless Cesàro summable is

$$\sum_{n=0}^\infty (-1)^n.$$

The sequence of partial sums $1, 0, 1, 0, \ldots$ does not converge. The Cesàro means, namely

$$\frac{1}{1}, \frac{1}{2}, \frac{2}{3}, \frac{2}{4}, \frac{3}{5}, \frac{3}{6}, \ldots$$

do converge, with $1/2$ as the limit. Hence the series in question is Cesàro summable.

There is also a relation between Cesàro summability and Abel summability¹.

Theorem 14 (Frobenius). A series that is Cesàro summable is also Abel summable. To be more precise, suppose that

$$\frac{1}{N+1}(s_0 + \ldots + s_N) \to L \quad \text{as } N \to \infty.$$

Then,

$$f(r) = \sum_{n=0}^\infty a_n r^n \to L \quad \text{as } r \to 1^-$$

as well. Version: 3 Owner: rmilson Author(s): rmilson

¹ This and similar results are often called Abelian theorems.
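The Cesàro means in the example above can be computed directly (a numerical sketch added here, not part of the original entry):

```python
# Cesaro means (s_0 + ... + s_N)/(N + 1) for the series sum (-1)^n.
def cesaro_means(terms, N):
    means, running, partial = [], 0.0, 0.0
    for n in range(N + 1):
        partial += terms(n)   # s_n
        running += partial    # s_0 + ... + s_n
        means.append(running / (n + 1))
    return means

m = cesaro_means(lambda n: (-1) ** n, 10000)
print(m[:6])                    # [1.0, 0.5, 0.666..., 0.5, 0.6, 0.5]
print(abs(m[-1] - 0.5) < 1e-3)  # True: the means tend to 1/2
```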


Chapter 484 40G10 – Abel, Borel and power series methods
484.1 Abel summability

Abel summability is a generalized convergence criterion for power series. It extends the usual definition of the sum of a series, and gives a way of summing up certain divergent series. Let us start with a series $\sum_{n=0}^\infty a_n$, convergent or not, and use that series to define a power series

$$f(r) = \sum_{n=0}^\infty a_n r^n.$$

Note that for $|r| < 1$ the summability of $f(r)$ is easier to achieve than the summability of the original series. Starting with this observation we say that the series $\sum a_n$ is Abel summable if the defining series for $f(r)$ is convergent for all $|r| < 1$, and if $f(r)$ converges to some limit $L$ as $r \to 1^-$. If this is so, we shall say that $\sum a_n$ Abel converges to $L$. Of course it is important to ask whether an ordinary convergent series is also Abel summable, and whether it converges to the same limit. This is true, and the result is known as Abel's convergence theorem, or simply as Abel's theorem.

Theorem 15 (Abel). Let $\sum a_n$ be a series; let

$$s_N = a_0 + \ldots + a_N, \qquad N \in \mathbb{N},$$

denote the corresponding partial sums; and let $f(r)$ be the corresponding power series defined as above. If $\sum a_n$ is convergent, in the usual sense that the $s_N$ converge to some limit $L$ as $N \to \infty$, then the series is also Abel summable and $f(r) \to L$ as $r \to 1^-$.

The standard example of a divergent series that is nonetheless Abel summable is the alternating series

$$\sum_{n=0}^\infty (-1)^n.$$

The corresponding power series is

$$\frac{1}{1+r} = \sum_{n=0}^\infty (-1)^n r^n.$$

Since

$$\frac{1}{1+r} \to \frac{1}{2} \quad \text{as } r \to 1^-,$$

this otherwise divergent series Abel converges to $\frac{1}{2}$.

Abel's theorem is the prototype for a number of other theorems about convergence, which are collectively known in analysis as Abelian theorems. An important class of associated results are the so-called Tauberian theorems. These describe various convergence criteria, and sometimes provide partial converses for the various Abelian theorems. The general converse to Abel's theorem is false, as the example above illustrates¹. However, in the 1890's Tauber proved the following partial converse.

Theorem 16 (Tauber). Suppose that $\sum a_n$ is an Abel summable series and that $n a_n \to 0$ as $n \to \infty$. Then, $\sum_n a_n$ is convergent in the ordinary sense as well.

The proof of the above theorem is not hard, but the same cannot be said of the more general Tauberian theorems. The more famous of these are due to Hardy, Hardy-Littlewood, Wiener, and Ikehara. In all cases, the conclusion is that a certain series or a certain integral is convergent. However, the proofs are lengthy and require sophisticated techniques. Ikehara's theorem is especially noteworthy because it is used to prove the prime number theorem. Version: 1 Owner: rmilson Author(s): rmilson

484.2 proof of Abel's convergence theorem

Suppose that

$$\sum_{n=0}^\infty a_n = L$$

is a convergent series, and set

$$f(r) = \sum_{n=0}^\infty a_n r^n.$$

Convergence of the first series implies that $a_n \to 0$, and hence $f(r)$ converges for $|r| < 1$. We will show that $f(r) \to L$ as $r \to 1^-$.
1 We want the converse to be false; the whole idea is to describe a method of summing certain divergent series!


Let

$$s_N = a_0 + \ldots + a_N, \qquad N \in \mathbb{N},$$

denote the corresponding partial sums. Our proof relies on the following identity:

$$f(r) = \sum_n a_n r^n = (1 - r)\sum_n s_n r^n. \qquad (484.2.1)$$

The above identity obviously works at the level of formal power series. Indeed,

$$a_0 + (a_1 + a_0)r + (a_2 + a_1 + a_0)r^2 + \ldots - \big(a_0 r + (a_1 + a_0)r^2 + \ldots\big) = a_0 + a_1 r + a_2 r^2 + \ldots$$

Since the partial sums $s_n$ converge to $L$, they are bounded, and hence $\sum_n s_n r^n$ converges for $|r| < 1$. Hence for $|r| < 1$, identity (484.2.1) is also a genuine functional equality.

Let $\varepsilon > 0$ be given. Choose an $N$ sufficiently large so that all partial sums $s_n$ with $n > N$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. It follows that for all $r$ such that $0 < r < 1$ the series

$$(1 - r)\sum_{n=N+1}^\infty s_n r^n$$

is sandwiched between $r^{N+1}(L - \varepsilon)$ and $r^{N+1}(L + \varepsilon)$. Note that

$$f(r) = (1 - r)\sum_{n=0}^N s_n r^n + (1 - r)\sum_{n=N+1}^\infty s_n r^n.$$

As $r \to 1^-$, the first term goes to 0. Hence, $\limsup f(r)$ and $\liminf f(r)$ as $r \to 1^-$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, it follows that $f(r) \to L$ as $r \to 1^-$. QED Version: 1 Owner: rmilson Author(s): rmilson

484.3 proof of Tauber's convergence theorem

Let

$$f(z) = \sum_{n=0}^\infty a_n z^n$$

be a complex power series, convergent in the open disk $|z| < 1$. We suppose that

1. $n a_n \to 0$ as $n \to \infty$, and that
2. $f(r)$ converges to some finite $L$ as $r \to 1^-$;

and wish to show that $\sum_n a_n$ converges to the same $L$ as well.

Let $s_n = a_0 + \ldots + a_n$, where $n = 0, 1, \ldots$, denote the partial sums of the series in question. The enabling idea in Tauber's convergence result (as well as other Tauberian theorems) is the existence of a correspondence in the evolution of the $s_n$ as $n \to \infty$, and the evolution of $f(r)$ as $r \to 1^-$. Indeed we shall show that

$$s_n - f\left(1 - \frac{1}{n}\right) \to 0 \quad \text{as } n \to \infty. \qquad (484.3.1)$$

The desired result then follows in an obvious fashion. For every real $0 < r < 1$ we have

$$s_n = f(r) + \sum_{k=0}^n a_k (1 - r^k) - \sum_{k=n+1}^\infty a_k r^k.$$

Setting

$$\epsilon_n = \sup_{k > n} |k a_k|,$$

and noting that $1 - r^k = (1 - r)(1 + r + \ldots + r^{k-1}) < k(1 - r)$, we have that

$$|s_n - f(r)| \le (1 - r)\sum_{k=0}^n k|a_k| + \frac{\epsilon_n}{n}\sum_{k=n+1}^\infty r^k.$$

Setting $r = 1 - 1/n$ in the above inequality we get

$$\left|s_n - f(1 - 1/n)\right| \le \mu_n + \epsilon_n (1 - 1/n)^{n+1},$$

where

$$\mu_n = \frac{1}{n}\sum_{k=0}^n k|a_k|$$

are the Cesàro means of the sequence $k|a_k|$, $k = 0, 1, \ldots$. Since the latter sequence converges to zero, so do the means $\mu_n$, and the suprema $\epsilon_n$. Finally, Euler's formula for $e$ gives

$$\lim_{n\to\infty}(1 - 1/n)^n = e^{-1}.$$

The validity of (484.3.1) follows immediately. QED Version: 1 Owner: rmilson Author(s): rmilson


Chapter 485 41A05 – Interpolation
485.1 Lagrange Interpolation formula

Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be $n$ points in the plane ($x_i \ne x_j$ for $i \ne j$). Then there exists a unique polynomial $p(x)$ of degree at most $n - 1$ such that $y_i = p(x_i)$ for $i = 1, \ldots, n$. Such a polynomial can be found using Lagrange's Interpolation formula:

$$p(x) = \frac{f(x)}{(x - x_1) f'(x_1)}\, y_1 + \frac{f(x)}{(x - x_2) f'(x_2)}\, y_2 + \cdots + \frac{f(x)}{(x - x_n) f'(x_n)}\, y_n$$

where $f(x) = (x - x_1)(x - x_2)\cdots(x - x_n)$.

To see this, notice that the above formula is the same as

$$p(x) = y_1 \frac{(x - x_2)(x - x_3)\cdots(x - x_n)}{(x_1 - x_2)(x_1 - x_3)\cdots(x_1 - x_n)} + y_2 \frac{(x - x_1)(x - x_3)\cdots(x - x_n)}{(x_2 - x_1)(x_2 - x_3)\cdots(x_2 - x_n)} + \cdots + y_n \frac{(x - x_1)(x - x_2)\cdots(x - x_{n-1})}{(x_n - x_1)(x_n - x_2)\cdots(x_n - x_{n-1})}$$

and that every polynomial in the numerators vanishes at all $x_i$ except one, and for that one $x_i$ the denominator makes the fraction equal to 1, so each $p(x_i)$ equals $y_i$. Version: 4 Owner: drini Author(s): drini
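The formula translates directly into code (a sketch added here, not part of the original entry; the function name is mine). Interpolating the three points $(0,1), (1,3), (2,7)$, which lie on $p(x) = x^2 + x + 1$, recovers that quadratic:

```python
# Evaluate the Lagrange interpolation polynomial through points (x_i, y_i) at x.
def lagrange(points, x):
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)  # i-th Lagrange basis factor
        total += term
    return total

pts = [(0, 1), (1, 3), (2, 7)]                         # samples of x^2 + x + 1
print(lagrange(pts, 3))                                # 13.0 = 3^2 + 3 + 1
print(all(lagrange(pts, xi) == yi for xi, yi in pts))  # True: p(x_i) = y_i
```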

485.2

Simpson’s 3/8 rule

Simpson's 3/8 rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by

\int_{x_0}^{x_3} f(x)\,dx \approx \frac{3h}{8}\left[f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)\right]

where h = (x_3 - x_0)/3.

Simpson's 3/8 rule is the third Newton-Cotes quadrature formula. It has degree of precision 3. This means it is exact for polynomials of degree less than or equal to three. Simpson's 3/8 rule is an improvement on the traditional Simpson's rule. The extra function evaluation gives a slightly more accurate approximation. We can see this with an example. Using the fundamental theorem of calculus shows

\int_0^{\pi} \sin(x)\,dx = 2.

In this case Simpson's rule gives

\int_0^{\pi} \sin(x)\,dx \approx \frac{\pi}{6}\left[\sin(0) + 4\sin\frac{\pi}{2} + \sin(\pi)\right] = 2.094.

However, Simpson's 3/8 rule does slightly better:

\int_0^{\pi} \sin(x)\,dx \approx \frac{3}{8}\,\frac{\pi}{3}\left[\sin(0) + 3\sin\frac{\pi}{3} + 3\sin\frac{2\pi}{3} + \sin(\pi)\right] = 2.040.

Version: 4 Owner: tensorking Author(s): tensorking
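The comparison in the entry can be reproduced with a short Python sketch (function names are ours):

```python
import math

def simpson(f, a, b):
    """Ordinary Simpson's rule: (h/3)[f(a) + 4f(m) + f(b)] with h = (b-a)/2."""
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))

def simpson_38(f, a, b):
    """Simpson's 3/8 rule: (3h/8)[f(x0)+3f(x1)+3f(x2)+f(x3)] with h = (b-a)/3."""
    h = (b - a) / 3
    return 3 * h / 8 * (f(a) + 3 * f(a + h) + 3 * f(a + 2 * h) + f(b))

# The exact value of the integral of sin over [0, pi] is 2.
print(simpson(math.sin, 0, math.pi))     # ≈ 2.0944
print(simpson_38(math.sin, 0, math.pi))  # ≈ 2.0405, slightly closer to 2
```

Both rules are exact for cubics (degree of precision 3), so the improvement shows up only in the size of the error constant, as the example illustrates.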

485.3

trapezoidal rule

Definition 11. The trapezoidal rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by

\int_{x_0}^{x_1} f(x)\,dx \approx \frac{h}{2}\left[f(x_0) + f(x_1)\right]

where h = x_1 - x_0. The trapezoidal rule is the first Newton-Cotes quadrature formula. It has degree of precision 1. This means it is exact for polynomials of degree less than or equal to one. We can see this with a simple example.

Example 20. Using the fundamental theorem of calculus shows

\int_0^1 x\,dx = 1/2.

In this case the trapezoidal rule gives the exact value:

\int_0^1 x\,dx \approx \frac{1}{2}\left[f(0) + f(1)\right] = 1/2.

It is important to note that most calculus books give the wrong definition of the trapezoidal rule. Typically they define a composite trapezoidal rule, which uses the trapezoidal rule on a specified number of subintervals. Also note the trapezoidal rule can be derived by integrating a linear interpolation or using the method of undetermined coefficients. The latter is probably a bit easier. Version: 6 Owner: tensorking Author(s): tensorking
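A minimal Python sketch of the rule (names ours) shows the degree of precision directly:

```python
def trapezoid(f, x0, x1):
    """Trapezoidal rule: (h/2)[f(x0) + f(x1)] with h = x1 - x0."""
    h = x1 - x0
    return h / 2 * (f(x0) + f(x1))

# Exact for polynomials of degree <= 1 (degree of precision 1):
print(trapezoid(lambda x: x, 0, 1))      # 0.5, the exact integral
# Not exact for degree 2:
print(trapezoid(lambda x: x * x, 0, 1))  # 0.5, while the exact integral is 1/3
```

Applying this function on many subintervals and summing gives the composite trapezoidal rule mentioned above.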


Chapter 486 41A25 – Rate of convergence, degree of approximation
486.1 superconvergence

Let x_i = |a_{i+1} - a_i|, the difference between two successive entries of a sequence. The sequence a_0, a_1, ... superconverges if, when the x_i are written in base 2, each number x_i starts with 2^i - 1 \approx 2^i zeroes. The following sequence, with differences obeying x_{n+1} = x_n^2, is superconverging to 0:

x_0 = 1/2 = (.1)_2
x_1 = 1/4 = (.01)_2
x_2 = 1/16 = (.0001)_2
x_3 = 1/256 = (.00000001)_2
x_4 = 1/65536 = (.0000000000000001)_2

In this case it is easy to see that the number of binary places doubles with each successive x_n. Version: 8 Owner: slider142 Author(s): slider142
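The doubling of leading binary zeroes can be checked with exact rational arithmetic; the following sketch (ours) prints the table above:

```python
from fractions import Fraction

def binary_expansion(x, places):
    """Return the first `places` binary digits of x in (0, 1) as a string."""
    digits = []
    for _ in range(places):
        x *= 2
        if x >= 1:
            digits.append('1')
            x -= 1
        else:
            digits.append('0')
    return '.' + ''.join(digits)

# x_{n+1} = x_n^2 starting from 1/2: leading zeroes double at each step.
x = Fraction(1, 2)
for n in range(5):
    print(n, x, binary_expansion(x, 16))
    x = x * x
```

Squaring a number just below 2^{-k} produces a number just below 2^{-2k}, which is why the count of leading zeroes doubles.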


Chapter 487 41A58 – Series expansions (e.g. Taylor, Lidstone series, but not Fourier series)
487.1
487.1.1

Taylor series
Taylor Series

Let f be a function defined on any open interval containing 0. If f possesses derivatives of all orders at 0, then

T(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k

is called the Taylor series of f about 0. We use 0 for simplicity, but any function with an infinitely-differentiable point can be shifted so that this point becomes 0.

T_n(x), the "nth degree Taylor approximation" or a "Taylor series approximation to n terms"^1, is defined as

T_n(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!} x^k

^1 T_n is often defined as the sum from k = 0 to n rather than the sum from k = 0 to n - 1. This has the beneficial result of making the "nth degree Taylor approximation" a degree-n polynomial. However, the drawback is that T_n is no longer an approximation "to n terms". The different definitions also give rise to slightly different statements of Taylor's theorem. In sum, mind the context when dealing with Taylor series and Taylor's theorem.

The remainder, R_n(x), is defined as

R_n(x) = f(x) - T_n(x).

Also note that f(x) = T(x) if and only if

\lim_{n \to \infty} R_n(x) = 0.

For most functions one encounters in college calculus, f(x) = T(x) (for example, polynomials and ratios of polynomials), and thus lim_{n \to \infty} R_n(x) = 0. Taylor's theorem is typically invoked in order to show this (the theorem gives the specific form of the remainder). Taylor series approximations are extremely useful to linearize or otherwise reduce the analytical complexity of a function. They are most useful when the magnitude of the terms falls off rapidly.

487.1.2

Examples

Using the above definition of a Taylor series about 0, we have the following important series representations:

e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots

\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots

\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots
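Partial sums of these series can be computed directly; the following Python sketch (names ours) shows how quickly the terms fall off for sin near 0:

```python
import math

def taylor_sin(x, n_terms):
    """Partial sum of the Taylor series of sin about 0 (first n nonzero terms)."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total

# The magnitude of the terms falls off rapidly, so few terms suffice near 0:
for n in (1, 2, 3, 5):
    print(n, taylor_sin(1.0, n))  # approaches sin(1) = 0.8414709848...
```

Each additional term divides the error roughly by the growing factorial in the denominator, which is the "rapid fall-off" condition mentioned above.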

487.1.3

Generalizations

Taylor series can also be extended to functions of more than one variable. The two-variable Taylor series of f(x, y) about (0, 0) is

T(x, y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{f^{(i,j)}(0, 0)}{i!\,j!} x^i y^j

where f^{(i,j)} is the partial derivative of f taken i times with respect to x and j times with respect to y. We can generalize this to n variables, or functions f(x), x \in R^{n \times 1}. The Taylor series of this function of a vector is then

T(x) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \frac{f^{(i_1, i_2, \ldots, i_n)}(0)}{i_1!\, i_2! \cdots i_n!}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}

Version: 7 Owner: akrowne Author(s): akrowne

487.2
487.2.1

Taylor’s Theorem
Taylor’s Theorem

Let f be a function which is defined on the interval (a, b), with a < 0 < b, and suppose the nth derivative f^{(n)} exists on (a, b). Then for all nonzero x in (a, b),

R_n(x) = \frac{f^{(n)}(y)}{n!} x^n

with y strictly between 0 and x (y depends on the choice of x). R_n(x) is the nth remainder of the Taylor series for f(x). Version: 2 Owner: akrowne Author(s): akrowne


Chapter 488 41A60 – Asymptotic approximations, asymptotic expansions (steepest descent, etc.)
488.1 Stirling’s approximation

Stirling's formula gives an approximation for n!, the factorial function. It is

n! \approx \sqrt{2\pi n}\, n^n e^{-n}

We can derive this from the gamma function. Note that for large x,

\Gamma(x) = \sqrt{2\pi}\, x^{x - \frac{1}{2}} e^{-x + \mu(x)}   (488.1.1)

where

\mu(x) = \sum_{n=0}^{\infty} \left[ \left(x + n + \frac{1}{2}\right) \ln\left(1 + \frac{1}{x+n}\right) - 1 \right] = \frac{\theta}{12x}

with 0 < \theta < 1. Taking x = n and multiplying by n, we have

n! = \sqrt{2\pi}\, n^{n + \frac{1}{2}} e^{-n + \frac{\theta}{12n}}   (488.1.2)

Taking the approximation for large n gives us Stirling's formula.

There is also a big-O notation version of Stirling's approximation:

n! = \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \left(1 + O\left(\frac{1}{n}\right)\right)   (488.1.3)

We can prove this equality starting from (488.1.2). It is clear that the big-O portion of (488.1.3) must come from e^{\theta/(12n)}, so we must consider the asymptotic behavior of this exponential. First we observe that the Taylor series for e^x is

e^x = 1 + \frac{x}{1} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots

But in our case we have e raised to a vanishing exponent. Since the exponent varies as 1/n, we have as n \to \infty

e^{\theta/(12n)} = 1 + O\left(\frac{1}{n}\right).

We can then (almost) directly plug this in to (488.1.2) to get (488.1.3) (note that the factor of 12 gets absorbed by the big-O notation). Version: 16 Owner: drini Author(s): drini, akrowne
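The O(1/n) behavior of the relative error is easy to observe numerically; a small Python sketch (ours) compares it with 1/(12n), the leading term of the exponent above:

```python
import math

def stirling(n):
    """Stirling's approximation: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

# The relative error behaves like 1/(12n), consistent with the O(1/n) term:
for n in (5, 10, 20):
    exact = math.factorial(n)
    rel_err = (exact - stirling(n)) / exact
    print(n, rel_err, 1 / (12 * n))
```

The two printed columns track each other closely, and both are halved when n doubles.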


Chapter 489 42-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
489.1 countable basis

A countable basis \beta of a vector space V over a field F is a countable subset \beta \subset V with the property that every element v \in V can be written as an infinite series

v = \sum_{x \in \beta} a_x x

in exactly one way (where a_x \in F). We are implicitly assuming, without further comment, that the vector space V has been given a topological structure or normed structure in which the above infinite sum is absolutely convergent (so that it converges to v regardless of the order in which the terms are summed).

The archetypical example of a countable basis is the Fourier series of a function: every continuous real-valued periodic function f on the unit circle S^1 = R/2\pi Z can be written as a Fourier series

f(x) = \sum_{n=0}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx)

in exactly one way. Note: A countable basis is a countable set, but it is not usually a basis. Version: 4 Owner: djao Author(s): djao


489.2

discrete cosine transform

The discrete cosine transform is closely related to the fast Fourier transform; it plays a role in coding signals and images [Jain89], e.g. in the widely used standard JPEG compression. The one-dimensional transform is defined by

t(k) = c(k) \sum_{n=0}^{N-1} s(n) \cos\left(\frac{\pi (2n+1) k}{2N}\right)

where s is the array of N original values, t is the array of N transformed values, and the coefficients c are given by

c(0) = \sqrt{1/N}, \quad c(k) = \sqrt{2/N} \quad for 1 \leq k \leq N - 1.

The discrete cosine transform in two dimensions, for a square matrix, can be written as

t(i, j) = c(i, j) \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} s(m, n) \cos\left(\frac{\pi (2m+1) i}{2N}\right) \cos\left(\frac{\pi (2n+1) j}{2N}\right)

with an analogous notation for N, s, t, and the c(i, j) given by c(0, 0) = 1/N, c(0, j) = c(i, 0) = \sqrt{2}/N, and c(i, j) = 2/N for both i and j \neq 0. The DCT has an inverse, defined by

s(n) = \sum_{k=0}^{N-1} c(k)\, t(k) \cos\left(\frac{\pi (2n+1) k}{2N}\right)

for the one-dimensional case, and

s(m, n) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} c(i, j)\, t(i, j) \cos\left(\frac{\pi (2m+1) i}{2N}\right) \cos\left(\frac{\pi (2n+1) j}{2N}\right)

for two dimensions. The DCT is included in commercial image processing packages, e.g. in Matlab.

References

• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
• Jain89: A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1989.

Version: 4 Owner: akrowne Author(s): akrowne
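The one-dimensional formulas can be sketched by a direct O(N^2) evaluation in Python (ours; real implementations use a fast algorithm):

```python
import math

def dct(s):
    """One-dimensional DCT as defined above (orthonormal coefficients c(k))."""
    N = len(s)
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [c(k) * sum(s[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                       for n in range(N))
            for k in range(N)]

def idct(t):
    """Inverse transform: s(n) = sum_k c(k) t(k) cos(pi (2n+1) k / 2N)."""
    N = len(t)
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [sum(c(k) * t[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
            for n in range(N)]

data = [8.0, 16.0, 24.0, 32.0]
print([round(v, 6) for v in idct(dct(data))])  # [8.0, 16.0, 24.0, 32.0]
```

With these c(k), the transform is orthonormal, so idct really does invert dct up to floating-point error.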


Chapter 490 42-01 – Instructional exposition (textbooks, tutorial papers, etc.)
490.1 Laplace transform

Let f(t) be a function defined on the interval [0, \infty). The Laplace transform of f(t) is the function F(s) defined by

F(s) = \int_0^{\infty} e^{-st} f(t)\,dt,

provided that the improper integral converges. We will usually denote the Laplace transform of f by L{f(t)}. Some of the most common Laplace transforms are:

1. L{e^{at}} = \frac{1}{s-a}, s > a
2. L{\cos(bt)} = \frac{s}{s^2 + b^2}, s > 0
3. L{\sin(bt)} = \frac{b}{s^2 + b^2}, s > 0
4. L{t^n} = \frac{n!}{s^{n+1}}, s > 0.

Notice the Laplace transform is a linear transformation. Much like the Fourier transform, the Laplace transform has a convolution. The most popular usage of the Laplace transform is to solve initial value problems by taking the Laplace transform of both sides of an ordinary differential equation. Version: 4 Owner: tensorking Author(s): tensorking


Chapter 491 42A05 – Trigonometric polynomials, inequalities, extremal problems
491.1 Chebyshev polynomial

We can always express cos(kt) as a polynomial in cos(t). Examples:

cos(1t) = \cos(t)
cos(2t) = 2\cos^2(t) - 1
cos(3t) = 4\cos^3(t) - 3\cos(t)
...

This fact can be proved using the formula for the cosine of an angle-sum. If we write x = \cos t we obtain the Chebyshev polynomials of the first kind, that is, T_n(x) = \cos(nt) where x = \cos t. So we have

T_0(x) = 1
T_1(x) = x
T_2(x) = 2x^2 - 1
T_3(x) = 4x^3 - 3x
...

These polynomials satisfy the recurrence relation

T_{n+1}(x) = 2x\, T_n(x) - T_{n-1}(x)

for n = 1, 2, .... Version: 4 Owner: drini Author(s): drini
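The recurrence gives an efficient way to evaluate T_n; the Python sketch below (ours) also checks the defining property T_n(cos t) = cos(nt):

```python
import math

def chebyshev_T(n, x):
    """Evaluate T_n(x) via the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    if n == 0:
        return 1.0
    t_prev, t = 1.0, x
    for _ in range(n - 1):
        t_prev, t = t, 2 * x * t - t_prev
    return t

# Check the defining property T_n(cos t) = cos(nt):
t = 0.7
for n in range(6):
    assert abs(chebyshev_T(n, math.cos(t)) - math.cos(n * t)) < 1e-12

print(chebyshev_T(3, 0.5))  # 4*(0.5)**3 - 3*0.5 = -1.0
```

Evaluating via the recurrence takes n multiplications and is numerically more stable on [-1, 1] than expanding the polynomial coefficients.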


Chapter 492 42A16 – Fourier coefficients, Fourier series of functions with special properties, special Fourier series
492.1 Riemann-Lebesgue lemma

Proposition. Let f : [a, b] \to C be a measurable function. If f is L^1 integrable, that is to say if the Lebesgue integral of |f| is finite, then

\int_a^b f(x) e^{inx}\,dx \to 0, \quad as\ n \to \pm\infty.

The above result, commonly known as the Riemann-Lebesgue lemma, is of basic importance in harmonic analysis. It is equivalent to the assertion that the Fourier coefficients \hat{f}_n of a periodic, integrable function f(x) tend to 0 as n \to \pm\infty.

The proof can be organized into 3 steps.

Step 1. An elementary calculation shows that

\int_I e^{inx}\,dx \to 0, \quad as\ n \to \pm\infty

for every interval I \subset [a, b]. The proposition is therefore true for all step functions with support in [a, b].

Step 2. By the monotone convergence theorem, the proposition is true for all positive functions, integrable on [a, b].

Step 3. Let f be an arbitrary measurable function, integrable on [a, b]. The proposition is true for such a general f, because one can always write f = g - h, where g and h are positive functions, integrable on [a, b]. Version: 2 Owner: rmilson Author(s): rmilson

492.2

example of Fourier series

Here we present an example of Fourier series.

Example: Let f : R \to R be the "identity" function, defined by f(x) = x for all x \in R. We will compute the Fourier coefficients for this function. Notice that \cos(nx) is an even function, while f and \sin(nx) are odd functions.

a_0^f = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} x\,dx = 0

a_n^f = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(nx)\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos(nx)\,dx = 0

b_n^f = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(nx)\,dx = \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin(nx)\,dx = \frac{2}{\pi} \int_0^{\pi} x \sin(nx)\,dx = \frac{2}{\pi} \left[\frac{\sin(nx)}{n^2} - \frac{x \cos(nx)}{n}\right]_0^{\pi} = (-1)^{n+1} \frac{2}{n}

Notice that a_0^f and a_n^f are 0 because x and x \cos(nx) are odd functions. Hence the Fourier series for f(x) = x is:

f(x) = x = a_0^f + \sum_{n=1}^{\infty} \left(a_n^f \cos(nx) + b_n^f \sin(nx)\right) = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{2}{n} \sin(nx), \qquad \forall x \in (-\pi, \pi)

For an application of this Fourier series, see value of the Riemann zeta function at s = 2. Version: 4 Owner: alozano Author(s): alozano
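The convergence of the series to f(x) = x on (-\pi, \pi) can be watched numerically; a short Python sketch (ours):

```python
import math

def fourier_partial_sum(x, N):
    """Partial sum of sum_{n=1}^{N} (-1)**(n+1) * (2/n) * sin(n*x)."""
    return sum((-1) ** (n + 1) * 2 / n * math.sin(n * x) for n in range(1, N + 1))

# On (-pi, pi) the partial sums approach f(x) = x, slowly, since b_n ~ 1/n:
x = 1.0
for N in (10, 100, 1000):
    print(N, fourier_partial_sum(x, N))  # tends to 1.0
```

The slow, oscillating approach (error of order 1/N at interior points) reflects the jump of the periodic extension at x = \pm\pi.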


Chapter 493 42A20 – Convergence and absolute convergence of Fourier and trigonometric series
493.1 Dirichlet conditions

Let f be a piecewise regular real-valued function defined on some interval [a, b], such that f has only a finite number of discontinuities and extrema in [a, b]. Then the Fourier series of this function converges to f at every point where f is continuous, and to the arithmetic mean of the left-hand and right-hand limits of f at each point where it is discontinuous. Version: 3 Owner: mathwizard Author(s): mathwizard


Chapter 494 42A38 – Fourier and Fourier-Stieltjes transforms and other transforms of Fourier type
494.1 Fourier transform

The Fourier transform F(s) of a function f(t) is defined as follows:

F(s) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ist} f(t)\,dt.

The Fourier transform exists if f is Lebesgue integrable on the whole real axis. If f is Lebesgue integrable and can be divided into a finite number of continuous, monotone functions and at every point both one-sided limits exist, the Fourier transform can be inverted:

f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ist} F(s)\,ds.

Sometimes the Fourier transform is also defined without the factor \frac{1}{\sqrt{2\pi}} in one direction, but then the transform in the other direction carries a factor \frac{1}{2\pi}. So when looking a transform up in a table you should find out how it is defined in that table.

The Fourier transform has some important properties when solving differential equations. We denote the Fourier transform of f with respect to t in terms of s by F_t(f).

• F_t(af + bg) = a F_t(f) + b F_t(g), where a and b are real constants and f and g are real functions.
• F_t\left(\frac{\partial}{\partial t} f\right) = is\, F_t(f).
• F_t\left(\frac{\partial}{\partial x} f\right) = \frac{\partial}{\partial x} F_t(f).
• We define the bilateral convolution of two functions f_1 and f_2 as

(f_1 * f_2)(t) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f_1(\tau) f_2(t - \tau)\,d\tau.

Then the following equation holds: F_t((f_1 * f_2)(t)) = F_t(f_1) \cdot F_t(f_2).

If f(t) is some signal (maybe a sound wave) then the frequency domain of f is given as F_t(f). Rayleigh's theorem states that the energy E carried by the signal f, given by

E = \int_{-\infty}^{\infty} |f(t)|^2\,dt,

can also be expressed as

E = \int_{-\infty}^{\infty} |F_t(f)(s)|^2\,ds.

In general we have

\int_{-\infty}^{\infty} |f(t)|^2\,dt = \int_{-\infty}^{\infty} |F_t(f)(s)|^2\,ds.

Version: 9 Owner: mathwizard Author(s): mathwizard


Chapter 495 42A99 – Miscellaneous
495.1 Poisson summation formula

Let f : R \to R be a once-differentiable, square-integrable function. Let

f^{\vee}(y) = \int_R f(x) e^{2\pi i x y}\,dx

be its Fourier transform. Then

\sum_n f(n) = \sum_n f^{\vee}(n).

By convention, sums are over all integers.

Let g(x) = \sum_n f(x + n). This sum converges absolutely, since f is square integrable, so g is differentiable, and periodic. Thus, the Fourier series \sum_n f^{\vee}(n) e^{2\pi i n x} converges pointwise to g. Evaluating our two sums for g at x = 0, we find

\sum_n f(n) = g(0) = \sum_n f^{\vee}(n).

Version: 5 Owner: bwebste Author(s): bwebste
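The formula can be verified numerically for a Gaussian, whose transform under this convention is known in closed form; the following Python check is an illustration of ours (for f(x) = e^{-\pi a x^2} one has f^{\vee}(y) = a^{-1/2} e^{-\pi y^2 / a}):

```python
import math

def lhs(a, terms=50):
    """sum_n f(n) for f(x) = exp(-pi * a * x**2)."""
    return sum(math.exp(-math.pi * a * n * n) for n in range(-terms, terms + 1))

def rhs(a, terms=50):
    """sum_n f_hat(n), where the transform is a**(-1/2) * exp(-pi * n**2 / a)."""
    return sum(math.exp(-math.pi * n * n / a) / math.sqrt(a)
               for n in range(-terms, terms + 1))

a = 2.0
print(lhs(a), rhs(a))  # the two sums agree, as the formula predicts
```

Both tails decay so fast that 50 terms already give agreement to machine precision.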


Chapter 496 42B05 – Fourier series and coefficients
496.1 Parseval equality

Let f be a Riemann integrable function from [-\pi, \pi] to R. The equation

\frac{1}{\pi} \int_{-\pi}^{\pi} f^2(x)\,dx = 2 (a_0^f)^2 + \sum_{k=1}^{\infty} \left[(a_k^f)^2 + (b_k^f)^2\right],

where a_0^f, a_k^f, b_k^f are the Fourier coefficients of the function f, is usually known as Parseval's equality or Parseval's theorem. Version: 3 Owner: vladm Author(s): vladm
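The equality can be checked against the identity function f(x) = x from the example above, whose coefficients are a_0^f = a_k^f = 0 and b_k^f = (-1)^{k+1}\, 2/k; the Python check below (ours) compares the two sides:

```python
import math

# For f(x) = x on [-pi, pi]: a_0 = a_k = 0 and b_k = (-1)**(k+1) * 2 / k.
left = (1 / math.pi) * (2 * math.pi ** 3 / 3)    # (1/pi) * integral of x^2 over [-pi, pi]
right = sum((2 / k) ** 2 for k in range(1, 200000))
print(left, right)  # both approach 2*pi**2/3 = 6.5797...
```

Since the right-hand side is 4 \sum 1/k^2, this instance of Parseval's equality is precisely the evaluation \zeta(2) = \pi^2/6 mentioned at the end of the Fourier series example.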

496.2

Wirtinger’s inequality

Theorem: Let f : R \to R be a periodic function of period 2\pi, which is continuous and has a continuous derivative throughout R, and such that

\int_0^{2\pi} f(x)\,dx = 0.   (496.2.1)

Then

\int_0^{2\pi} f'^2(x)\,dx \geq \int_0^{2\pi} f^2(x)\,dx   (496.2.2)

with equality iff f(x) = a \cos x + b \sin x for some a and b (or equivalently f(x) = c \sin(x + d) for some c and d).

Proof: Since Dirichlet's conditions are met, we can write

f(x) = \frac{1}{2} a_0 + \sum_{n \geq 1} (a_n \sin nx + b_n \cos nx)

and moreover a_0 = 0 by (496.2.1). By Parseval's identity,

\int_0^{2\pi} f^2(x)\,dx = \pi \sum_{n=1}^{\infty} (a_n^2 + b_n^2)

and

\int_0^{2\pi} f'^2(x)\,dx = \pi \sum_{n=1}^{\infty} n^2 (a_n^2 + b_n^2),

and since the summands are all \geq 0, we get (496.2.2), with equality iff a_n = b_n = 0 for all n \geq 2.

Hurwitz used Wirtinger's inequality in his tidy 1904 proof of the isoperimetric inequality. Version: 2 Owner: matte Author(s): Larry Hammick


Chapter 497 43A07 – Means on groups, semigroups, etc.; amenable groups
497.1 amenable group

Let G be a locally compact group and L^\infty(G) be the Banach space of all essentially bounded functions G \to R with respect to the Haar measure.

Definition 12. A linear functional on L^\infty(G) is called a mean if it maps the constant function f(g) = 1 to 1 and non-negative functions to non-negative numbers.

Definition 13. Let L_g be the left action of g \in G on f \in L^\infty(G), i.e. (L_g f)(h) = f(gh). Then, a mean \mu is said to be left invariant if \mu(L_g f) = \mu(f) for all g \in G and f \in L^\infty(G). Similarly, it is right invariant if \mu(R_g f) = \mu(f), where R_g is the right action (R_g f)(h) = f(hg).

Definition 14. A locally compact group G is amenable if there is a left (or right) invariant mean on L^\infty(G).

Example 21 (Amenable groups). All finite groups and all abelian groups are amenable. Compact groups are amenable, as the Haar measure is a (unique) invariant mean.

Example 22 (Non-amenable groups). If a group contains a free (non-abelian) subgroup on two generators then it is not amenable. Version: 5 Owner: mhale Author(s): mhale


Chapter 498 44A35 – Convolution
498.1 convolution

Introduction The convolution of two functions f, g : R \to R is the function

(f * g)(u) = \int_{-\infty}^{\infty} f(x) g(u - x)\,dx.

In a sense, (f * g)(u) is the sum of all the terms f(x) g(y) where x + y = u. Such sums occur when investigating sums of independent random variables, and discrete versions appear in the coefficients of products of polynomials and power series. Convolution is an important tool in data processing, in particular in digital signal and image processing. We will first define the concept in various general settings, discuss its properties and then list several convolutions of probability distributions.

Definitions If G is a locally compact abelian topological group with Haar measure \mu and f and g are measurable functions on G, we define the convolution

(f * g)(u) := \int_G f(x) g(u - x)\,d\mu(x)

whenever the right hand side integral exists (this is for instance the case if f \in L^p(G, \mu), g \in L^q(G, \mu) and 1/p + 1/q = 1). The case G = R^n is the most important one, but G = Z is also useful, since it recovers the convolution of sequences which occurs when computing the coefficients of a product of polynomials or power series. The case G = Z_n yields the so-called cyclic convolution which is often discussed in connection with the discrete Fourier transform.
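The G = Z case mentioned above, convolution of finitely supported sequences, is exactly polynomial coefficient multiplication; a minimal Python sketch (ours):

```python
def convolve(f, g):
    """Discrete convolution of two finite sequences (G = Z, counting measure):
    (f*g)[u] = sum_x f[x] * g[u - x]."""
    result = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            result[i + j] += fi * gj
    return result

# Coefficients of (1 + x)**2 = 1 + 2x + x**2:
print(convolve([1, 1], [1, 1]))  # [1, 2, 1]
# Convolution is commutative:
print(convolve([1, 2, 3], [0, 1]) == convolve([0, 1], [1, 2, 3]))  # True
```

This direct evaluation is O(nm); the FFT-based approach discussed under Properties reduces it to O((n+m) log(n+m)).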


The (Dirichlet) convolution of multiplicative functions considered in number theory does not quite fit the above definition, since there the functions are defined on a commutative monoid (the natural numbers under multiplication) rather than on an abelian group.

If X and Y are independent random variables with probability densities f_X and f_Y respectively, and if X + Y has a probability density, then this density is given by the convolution f_X * f_Y. This motivates the following definition: for probability distributions P and Q on R^n, the convolution P * Q is the probability distribution on R^n given by

(P * Q)(A) := (P \times Q)(\{(x, y) \mid x + y \in A\})

for every Borel set A. The convolution of two distributions u and v on R^n is defined by

(u * v)(\varphi) = u(\psi)

for any test function \varphi for v, assuming that \psi(t) := v(\varphi(\cdot + t)) is a suitable test function for u.

Properties The convolution operation, when defined, is commutative, associative and distributive with respect to addition. For any f we have

f * \delta = f

where \delta is the Dirac delta distribution. The Fourier transform F translates between convolution and pointwise multiplication:

F(f * g) = F(f) \cdot F(g).

Because of the availability of the Fast Fourier Transform and its inverse, this latter relation is often used to quickly compute discrete convolutions, and in fact the fastest known algorithms for the multiplication of numbers and polynomials are based on this idea.

Some convolutions of probability distributions
• The convolution of two normal distributions with zero mean and variances \sigma_1^2 and \sigma_2^2 is a normal distribution with zero mean and variance \sigma^2 = \sigma_1^2 + \sigma_2^2.

• The convolution of two \chi^2 distributions with f_1 and f_2 degrees of freedom is a \chi^2 distribution with f_1 + f_2 degrees of freedom.

• The convolution of two Poisson distributions with parameters \lambda_1 and \lambda_2 is a Poisson distribution with parameter \lambda = \lambda_1 + \lambda_2.

• The convolution of an exponential and a normal distribution is approximated by another exponential distribution. If the original exponential distribution has density

f(x) = \frac{e^{-x/\tau}}{\tau} \ (x \geq 0), \qquad f(x) = 0 \ (x < 0),

and the normal distribution has zero mean and variance \sigma^2, then for u \gg \sigma the probability density of the sum is

f(u) \approx \frac{e^{-u/\tau + \sigma^2/(2\tau^2)}}{\tau}.

In a semi-logarithmic diagram where \log(f_X(x)) is plotted versus x and \log(f(u)) versus u, the latter lies by the amount \sigma^2/(2\tau^2) higher than the former, but both are represented by parallel straight lines, the slope of which is determined by the parameter \tau.

• The convolution of a uniform and a normal distribution results in a quasi-uniform distribution smeared out at its edges. If the original distribution is uniform in the region a \leq x < b and vanishes elsewhere, and the normal distribution has zero mean and variance \sigma^2, the probability density of the sum is

f(u) = \frac{\psi_0((u - a)/\sigma) - \psi_0((u - b)/\sigma)}{b - a}

where

\psi_0(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt

is the distribution function of the standard normal distribution. For \sigma \to 0, the function f(u) vanishes for u < a and u > b and is equal to 1/(b - a) in between. For finite \sigma the sharp steps at a and b are rounded off over a width of the order 2\sigma.

References

• Adapted with permission from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.htm Version: 12 Owner: akrowne Author(s): akrowne, AxelBoldt


Chapter 499 46-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
499.1 balanced set

Definition [3, 1, 2, 1] Let V be a vector space over R (or C), and let S be a subset of V. If \lambda S \subset S for all scalars \lambda such that |\lambda| \leq 1, then S is a balanced set in V. Here, \lambda S = \{\lambda s \mid s \in S\}, and |\cdot| is the absolute value (in R), or the modulus of a complex number (in C).

Examples and properties

1. Let V be a normed space with norm ||\cdot||. Then the unit ball \{v \in V \mid ||v|| \leq 1\} is a balanced set.
2. Any vector subspace is a balanced set. Thus, in R^3, lines and planes passing through the origin are balanced sets.
3. Any nonempty balanced set contains the zero vector [1].
4. The union and intersection of an arbitrary collection of balanced sets is again a balanced set [2].
5. Suppose f is a linear map between two vector spaces. Then both f and f^{-1} (the inverse image of f) map balanced sets into balanced sets [3, 2].


Definition Suppose S is a set in a vector space V. Then the balanced hull of S, denoted by eq(S), is the smallest balanced set containing S. The balanced core of S is defined as the largest balanced set contained in S.

Proposition Let S be a set in a vector space.

1. For eq(S) we have [1, 1] eq(S) = \{\lambda s \mid s \in S, |\lambda| \leq 1\}.
2. The balanced hull of S is the intersection of all balanced sets containing S [1, 2].
3. The balanced core of S is the union of all balanced sets contained in S [2].
4. The balanced core of S is nonempty if and only if the zero vector belongs to S [2].
5. If S is a closed set in a topological vector space, then the balanced core is also a closed set [2].

Notes A balanced set is also sometimes called circled [2]. The term balanced envelope is also used for the balanced hull [1]. Bourbaki uses the term équilibré [1], cf. eq(S) above. In [4], a balanced set is defined as above, but with the condition |\lambda| = 1 instead of |\lambda| \leq 1.

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
3. J. Horváth, Topological Vector Spaces and Distributions, Addison-Wesley Publishing Company, 1966.
4. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
5. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.

Version: 7 Owner: matte Author(s): matte

499.2

bounded function

Definition Suppose X is a nonempty set. Then a function f : X \to C is a bounded function if there exists a C < \infty such that |f(x)| < C for all x \in X. The set of all bounded functions on X is usually denoted by B(X) ([1], pp. 61).

Under standard point-wise addition and point-wise multiplication by a scalar, B(X) is a complex vector space. If f \in B(X), then the sup-norm, or uniform norm, of f is defined as

||f||_\infty = \sup_{x \in X} |f(x)|.

It is straightforward to check that ||\cdot||_\infty makes B(X) into a normed vector space, i.e., to check that ||\cdot||_\infty satisfies the assumptions for a norm.

Example Suppose X is a compact topological space. Further, let C(X) be the set of continuous complex-valued functions on X (with the same vector space structure as B(X)). Then C(X) is a vector subspace of B(X).

REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.

Version: 3 Owner: matte Author(s): matte

499.3

bounded set (in a topological vector space)

Definition [3, 1, 1] Suppose B is a subset of a topological vector space V. Then B is a bounded set if for every neighborhood U of the zero vector in V, there exists a scalar \lambda such that B \subset \lambda U.

Theorem If K is a compact set in a topological vector space, then K is bounded. ([3], pp. 12)

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 2. F.A. Valentine, Convex sets, McGraw-Hill Book company, 1964. 3. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.

Version: 2 Owner: matte Author(s): matte

499.4

cone

Definition [4, 2, 1] Suppose V is a real (or complex) vector space with a subset C.

1. If \lambda C \subset C for any real \lambda > 0, then C is a cone.
2. If the origin belongs to a cone, then the cone is pointed. Otherwise, the cone is blunt.
3. A pointed cone is salient if it contains no 1-dimensional vector subspace of V.
4. If C - x_0 is a cone for some x_0 in V, then C is a cone with vertex at x_0.

Examples

1. In R, the set x > 0 is a salient blunt cone.
2. Suppose x \in R^n. Then for any \varepsilon > 0, the set C = \bigcup \{\lambda B_x(\varepsilon) \mid \lambda > 0\} is an open cone. If |x| < \varepsilon, then C = R^n. Here, B_x(\varepsilon) is the open ball at x with radius \varepsilon.

Properties

1. The union and intersection of a collection of cones is a cone.
2. A set C in a real (or complex) vector space is a convex cone if and only if [2, 1] \lambda C \subset C for all \lambda > 0, and C + C \subset C.
3. For a convex pointed cone C, the set C \cap (-C) is the largest vector subspace contained in C [2, 1].
4. A pointed convex cone C is salient if and only if C \cap (-C) = \{0\} [1].

REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.
2. J. Horváth, Topological Vector Spaces and Distributions, Addison-Wesley Publishing Company, 1966.
3. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.

Version: 4 Owner: bwebste Author(s): matte

499.5

locally convex topological vector space

Definition Let V be a topological vector space. If the topology of V has a basis where each member is a convex set, then V is a locally convex topological vector space [1].

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.

Version: 2 Owner: matte Author(s): matte

499.6

sequential characterization of boundedness

Theorem [3, 1] A set B in a real (or possibly complex) topological vector space V is bounded if and only if the following condition holds: If \{z_i\}_{i=1}^{\infty} is a sequence in B, and \{\lambda_i\}_{i=1}^{\infty} is a sequence of scalars (in R or C), such that \lambda_i \to 0, then \lambda_i z_i \to 0 in V.

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 2. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.

Version: 4 Owner: bwebste Author(s): matte

499.7

symmetric set

Definition [1, 3] Suppose A is a set in a vector space. Then A is a symmetric set, if A = −A. Here, −A = {−a | a ∈ A}. In other words, A is symmetric if for any a ∈ A also −a ∈ A.


Examples

1. In R, examples of symmetric sets are intervals of the type (-k, k) with k > 0, and the sets Z and \{-1, 1\}.
2. Any vector subspace in a vector space is a symmetric set.
3. If A is any set in a vector space, then A \cup (-A) [1] and A \cap (-A) are symmetric sets.

REFERENCES
1. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.

Version: 1 Owner: matte Author(s): matte


Chapter 500 46A30 – Open mapping and closed graph theorems; completeness (including B-, Br -completeness)
500.1 closed graph theorem

A linear mapping between two Banach spaces X and Y is continuous if and only if its graph is a closed subset of X × Y (with the product topology). Version: 4 Owner: Koro Author(s): Koro

500.2

open mapping theorem

There are two important theorems having this name. In the context of functions of a complex variable: Theorem. Every non-constant analytic function on a region is an open mapping. In the context of functional analysis: Theorem. Every surjective continuous linear mapping between two Banach spaces is an open mapping. Version: 8 Owner: Koro Author(s): Koro


Chapter 501 46A99 – Miscellaneous
501.1 Heine-Cantor theorem

Let X, Y be uniform spaces, and f : X → Y a continuous function. If X is compact, then f is uniformly continuous. For instance, if f : [a, b] → R is a continuous function, then it is uniformly continuous. Version: 6 Owner: n3o Author(s): n3o

501.2

proof of Heine-Cantor theorem

We prove this theorem in the case when X and Y are metric spaces. Suppose f is not uniformly continuous. Then

\exists \varepsilon > 0\ \forall \delta > 0\ \exists x, y \in X : d(x, y) < \delta \ but\ d(f(x), f(y)) \geq \varepsilon.

In particular, by letting \delta = 1/k we can construct two sequences x_k and y_k such that d(x_k, y_k) < 1/k and d(f(x_k), f(y_k)) \geq \varepsilon. Since X is compact, the two sequences have convergent subsequences, i.e.

x_{k_j} \to \bar{x} \in X, \qquad y_{k_j} \to \bar{y} \in X.

Since d(x_k, y_k) \to 0 we have \bar{x} = \bar{y}. As f is continuous, we hence conclude d(f(x_{k_j}), f(y_{k_j})) \to 0, which is a contradiction, since d(f(x_k), f(y_k)) \geq \varepsilon. Version: 2 Owner: paolini Author(s): paolini

501.3

topological vector space

A topological vector space is a pair (V, T) where V is a vector space over a topological field K, and T is a Hausdorff topology on V such that under T, the vector space operations (\lambda, v) \to \lambda v is continuous from K \times V to V and (v, w) \to v + w is continuous from V \times V to V, where K \times V and V \times V are given the respective product topologies.

A finite dimensional vector space inherits a natural topology. For if V is a finite dimensional vector space, then V is isomorphic to K^n for some n; then let f : V \to K^n be such an isomorphism, and suppose K^n has the product topology. Give V the topology where a subset A of V is open in V if and only if f(A) is open in K^n. This topology is independent of the choice of isomorphism f. Version: 6 Owner: Evandar Author(s): Evandar


Chapter 502 46B20 – Geometry and structure of normed linear spaces
502.1 lim_{p→∞} ||x||_p = ||x||_∞

Suppose x = (x_1, . . . , x_n) is a point in R^n, and let ||x||_p be the usual p-norm and ||x||_∞ the ∞-norm; that is,

||x||_p = (|x_1|^p + · · · + |x_n|^p)^{1/p},    ||x||_∞ = max{|x_1|, . . . , |x_n|}.

Our claim is that

lim_{p→∞} ||x||_p = ||x||_∞.    (502.1.1)

In other words, for any fixed x ∈ R^n, the above limit holds. This, of course, justifies the notation for the ∞-norm.

Proof. Since both norms stay invariant if we exchange two components in x, we can arrange things such that ||x||_∞ = |x_1|. Then for any real p > 0, we have

||x||_∞ = |x_1| = (|x_1|^p)^{1/p} ≤ ||x||_p ≤ n^{1/p}|x_1| = n^{1/p}||x||_∞.

Taking the limit of the above inequalities (see this page) we obtain

||x||_∞ ≤ lim_{p→∞} ||x||_p ≤ lim_{p→∞} n^{1/p}||x||_∞ = ||x||_∞,

which combined yield the result. □ Version: 7 Owner: matte Author(s): matte
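The limit in (502.1.1) can be observed numerically; a minimal sketch in Python (the sample vector x and the chosen values of p are arbitrary):

```python
def p_norm(x, p):
    # (|x1|^p + ... + |xn|^p)^(1/p)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def max_norm(x):
    # max{|x1|, ..., |xn|}
    return max(abs(t) for t in x)

x = [3.0, -4.0, 1.0]
# As p grows, the p-norm decreases toward the max-norm,
# squeezed by ||x||_inf <= ||x||_p <= n^(1/p) ||x||_inf.
for p in (1, 2, 8, 32, 128):
    print(p, p_norm(x, p))
print("inf", max_norm(x))
```

The squeeze from the proof is visible in the output: the printed values decrease monotonically toward max_norm(x).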

502.2

Hahn-Banach theorem

The Hahn-Banach theorem is a foundational result in functional analysis. Roughly speaking, it asserts the existence of a great variety of bounded (and hence continuous) linear functionals on a normed vector space, even if that space happens to be infinite-dimensional. We first consider an abstract version of this theorem, and then give the more classical result as a corollary. Let V be a real or a complex vector space, with K denoting the corresponding field of scalars, and let p : V → R⁺ be a seminorm on V.

Theorem 17. Let f : U → K be a linear functional defined on a subspace U ⊂ V. If the restricted functional satisfies

|f(u)| ≤ p(u),  u ∈ U,

then it can be extended to all of V without violating the above property. To be more precise, there exists a linear functional F : V → K such that

F(u) = f(u),  u ∈ U,
|F(u)| ≤ p(u),  u ∈ V.

Definition 15. We say that a linear functional f : V → K is bounded if there exists a bound B ∈ R⁺ such that

|f(u)| ≤ B p(u),  u ∈ V.    (502.2.1)

If f is a bounded linear functional, we define ||f||, the norm of f, according to

||f|| = sup{|f(u)| : p(u) = 1}.

One can show that ||f|| is the infimum of all the possible B that satisfy (502.2.1).

Theorem 18 (Hahn-Banach). Let f : U → K be a bounded linear functional defined on a subspace U ⊂ V. Let ||f||_U denote the norm of f relative to the restricted seminorm on U. Then there exists a bounded extension F : V → K with the same norm, i.e.

||F||_V = ||f||_U.

Version: 7 Owner: rmilson Author(s): rmilson, Evandar

502.3

proof of Hahn-Banach theorem

Consider the family of all possible extensions of f, i.e. the set F of all pairs (F, H) where H is a vector subspace of V containing U and F is a linear map F : H → K such that F(u) = f(u) for all u ∈ U and |F(u)| ≤ p(u) for all u ∈ H. F is naturally endowed with a partial order relation: given (F₁, H₁), (F₂, H₂) ∈ F we say that (F₁, H₁) ≤ (F₂, H₂) iff F₂ is an extension of F₁, that is, H₁ ⊂ H₂ and F₂(u) = F₁(u) for all u ∈ H₁.

We want to apply Zorn's lemma to F, so we are going to prove that every chain in F has an upper bound. Let (F_i, H_i) be the elements of a chain in F. Define H = ∪_i H_i. Clearly H is a vector subspace of V and contains U. Define F : H → K by "merging" all the F_i's as follows. Given u ∈ H there exists i such that u ∈ H_i: define F(u) = F_i(u). This is a good definition since if both H_i and H_j contain u then F_i(u) = F_j(u); in fact either (F_i, H_i) ≤ (F_j, H_j) or (F_j, H_j) ≤ (F_i, H_i). Clearly the pair (F, H) so constructed is an upper bound for the chain (F_i, H_i), since F is an extension of every F_i.

Zorn's lemma then assures that there exists a maximal element (F, H) ∈ F. To complete the proof we only need to prove that H = V. Suppose by contradiction that there exists v ∈ V \ H. Then consider the vector space H′ = H + Kv = {u + tv : u ∈ H, t ∈ K} (H′ is the vector space generated by H and v). Choose

λ = sup_{x∈H} {F(x) − p(x − v)}.

We notice that given any x, y ∈ H it holds that

F(x) − F(y) = F(x − y) ≤ p(x − y) = p(x − v + v − y) ≤ p(x − v) + p(y − v),

i.e. F(x) − p(x − v) ≤ F(y) + p(y − v); in particular we find that λ < +∞ and for all y ∈ H it holds that

F(y) − p(y − v) ≤ λ ≤ F(y) + p(y − v).

Define F′ : H′ → K as follows: F′(u + tv) = F(u) + tλ. Clearly F′ is a linear functional. We have

|F′(u + tv)| = |F(u) + tλ| = |t| |F(u/t) + λ|,

and by letting y = −u/t in the previous estimates on λ we obtain

F(u/t) + λ ≤ F(u/t) + F(−u/t) + p(−u/t − v) = p(u/t + v)

and

F(u/t) + λ ≥ F(u/t) + F(−u/t) − p(−u/t − v) = −p(u/t + v),

which together give

|F(u/t) + λ| ≤ p(u/t + v)

and hence

|F′(u + tv)| ≤ |t| p(u/t + v) = p(u + tv).

So we have proved that (F′, H′) ∈ F and (F′, H′) > (F, H), which is a contradiction. Version: 4 Owner: paolini Author(s): paolini

502.4

seminorm

Let V be a real or a complex vector space, with K denoting the corresponding field of scalars. A seminorm is a function p : V → R⁺, from V to the set of non-negative real numbers, that satisfies the following two properties:

p(ku) = |k| p(u),  k ∈ K, u ∈ V  (homogeneity)
p(u + v) ≤ p(u) + p(v),  u, v ∈ V  (sublinearity)

A seminorm differs from a norm in that it is permitted that p(u) = 0 for some non-zero u ∈ V.

It is possible to characterize the seminorm properties geometrically. For k > 0, let

B_k = {u ∈ V : p(u) ≤ k}

denote the ball of radius k. The homogeneity property is equivalent to the assertion that B_k = kB₁, in the sense that u ∈ B₁ if and only if ku ∈ B_k. Thus, we see that a seminorm is fully determined by its unit ball. Indeed, given B ⊂ V we may define a function p_B : V → R⁺ by

p_B(u) = inf{λ ∈ R⁺ : λ⁻¹u ∈ B}.

The geometric nature of the unit ball is described by the following.

Proposition 20. The function p_B satisfies the homogeneity property if and only if for every u ∈ V, there exists a k ∈ R⁺ ∪ {∞} such that λu ∈ B if and only if λ ≤ k.

Proposition 21. Suppose that p is homogeneous. Then, it is sublinear if and only if its unit ball, B₁, is a convex subset of V.

Proof. First, let us suppose that the seminorm is both sublinear and homogeneous, and prove that B₁ is necessarily convex. Let u, v ∈ B₁, and let k be a real number between 0 and 1. We must show that the weighted average ku + (1 − k)v is in B₁ as well. By assumption,

p(ku + (1 − k)v) ≤ k p(u) + (1 − k) p(v).

The right side is a weighted average of two numbers between 0 and 1, and is therefore between 0 and 1 itself. Therefore ku + (1 − k)v ∈ B₁, as desired.

Conversely, suppose that the seminorm function is homogeneous, and that the unit ball is convex. Let u, v ∈ V be given, and let us show that

p(u + v) ≤ p(u) + p(v).

The essential complication here is that we do not exclude the possibility that p(u) = 0 even though u ≠ 0.

First, let us consider the case where p(u) = p(v) = 0. By homogeneity, for every k > 0 we have ku, kv ∈ B₁, and hence

(k/2)u + (k/2)v ∈ B₁

as well. By homogeneity, again,

p(u + v) ≤ 2/k.

Since the above is true for all positive k, we infer that p(u + v) = 0, as desired.

Next suppose that p(u) = 0, but that p(v) ≠ 0. We will show that in this case, necessarily, p(u + v) = p(v). Owing to the homogeneity assumption, we may without loss of generality assume that p(v) = 1.

For every k such that 0 ≤ k < 1 we have

ku + kv = (1 − k) (ku/(1 − k)) + kv.

The right-side expression is an element of B₁ because ku/(1 − k), v ∈ B₁. Hence k p(u + v) ≤ 1, and since this holds for k arbitrarily close to 1 we conclude that p(u + v) ≤ p(v). The same argument also shows that p(v) = p(−u + (u + v)) ≤ p(u + v), and hence p(u + v) = p(v), as desired.

Finally, suppose that neither p(u) nor p(v) is zero. Hence,

u/p(u),  v/p(v)

are both in B₁, and hence

(p(u)/(p(u) + p(v))) (u/p(u)) + (p(v)/(p(u) + p(v))) (v/p(v)) = (u + v)/(p(u) + p(v))

is in B₁ also. Using homogeneity, we conclude that p(u + v) ≤ p(u) + p(v), as desired. □ Version: 14 Owner: rmilson Author(s): rmilson, drummond
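As a concrete illustration (the seminorm p((x, y)) = |x| on R² is a standard textbook example, chosen here for illustration, not taken from the entry above): p is homogeneous and sublinear, yet vanishes on the nonzero vectors (0, y), so it is a seminorm but not a norm.

```python
def p(v):
    # seminorm on R^2: p((x, y)) = |x|; it "kills" the y-axis
    x, _y = v
    return abs(x)

u, w = (1.0, 5.0), (-0.5, 2.0)

# homogeneity: p(k*u) == |k| * p(u)
k = -3.0
assert p((k * u[0], k * u[1])) == abs(k) * p(u)

# sublinearity: p(u + w) <= p(u) + p(w)
s = (u[0] + w[0], u[1] + w[1])
assert p(s) <= p(u) + p(w)

# not a norm: p is zero on a nonzero vector
assert p((0.0, 7.0)) == 0.0
print("seminorm checks passed")
```

The unit ball of this p is the convex strip {(x, y) : |x| ≤ 1}, matching Proposition 21.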

502.5

vector norm

A vector norm on the real vector space V is a function f : V → R that satisfies the following properties:

f(x) = 0 ⇔ x = 0
f(x) ≥ 0,  x ∈ V
f(x + y) ≤ f(x) + f(y),  x, y ∈ V
f(αx) = |α| f(x),  α ∈ R, x ∈ V

Such a function is denoted as ||x||. Particular norms are distinguished by subscripts, such as ||x||_V, when referring to a norm in the space V. A unit vector with respect to the norm ||·|| is a vector x satisfying ||x|| = 1. A vector norm on a complex vector space is defined similarly.

A common (and useful) example of a real norm is the Euclidean norm given by ||x|| = (x_1² + x_2² + · · · + x_n²)^{1/2} defined on V = R^n. Note, however, that not every metric on a vector space is induced by a norm; when one is, the space is called a normed vector space. A necessary and sufficient condition for the metric d of a metric vector space to be induced by a norm is

d(x + a, y + a) = d(x, y)  ∀x, y, a ∈ V
d(αx, αy) = |α| d(x, y)  ∀x, y ∈ V, α ∈ R.

But given a norm, a metric can always be defined by the equation

d(x, y) = ||x − y||.

Version: 14 Owner: mike Author(s): mike, Manoj, Logan
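The two conditions above (translation invariance and homogeneity of the metric) can be spot-checked for the metric induced by the Euclidean norm; a small sketch with arbitrarily chosen vectors:

```python
import math

def norm(x):
    # Euclidean norm on R^n
    return math.sqrt(sum(t * t for t in x))

def d(x, y):
    # metric induced by the norm: d(x, y) = ||x - y||
    return norm([a - b for a, b in zip(x, y)])

x, y, a = [1.0, 2.0], [4.0, -2.0], [10.0, 3.0]
alpha = -2.5

# translation invariance: d(x + a, y + a) == d(x, y)
xa = [u + v for u, v in zip(x, a)]
ya = [u + v for u, v in zip(y, a)]
assert math.isclose(d(xa, ya), d(x, y))

# homogeneity: d(alpha*x, alpha*y) == |alpha| * d(x, y)
assert math.isclose(d([alpha * t for t in x], [alpha * t for t in y]),
                    abs(alpha) * d(x, y))
print("induced-metric checks passed")
```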

1814

Chapter 503 46B50 – Compactness in Banach (or normed) spaces
503.1 Schauder fixed point theorem

Let X be a Banach space, K ⊂ X compact, convex and non-empty, and let f : K → K be a continuous mapping. Then there exists x ∈ K such that f(x) = x. Notice that the closed unit ball of a finite dimensional normed space is always convex and compact, hence this theorem extends the Brouwer fixed point theorem. Version: 3 Owner: paolini Author(s): paolini

503.2

proof of Schauder fixed point theorem

The idea of the proof is to reduce ourselves to the finite dimensional case. Given ε > 0, notice that the family of open sets {B_ε(x) : x ∈ K} is an open cover of K. Since K is compact there exists a finite subcover, i.e. there exist N points p₁, . . . , p_N of K such that the balls B_ε(p_i) cover the whole set K. Let K_ε be the convex hull of p₁, . . . , p_N and let V_ε be the affine (N − 1)-dimensional space containing these points, so that K_ε ⊂ V_ε. Now consider a projection π_ε : X → V_ε such that ||π_ε(x) − π_ε(y)|| ≤ ||x − y||, and define

f_ε : K_ε → K_ε,  f_ε(x) = π_ε(f(x)).

This is a continuous function defined on a convex and compact set K_ε of a finite dimensional vector space V_ε. Hence by the Brouwer fixed point theorem it admits a fixed point x_ε:

f_ε(x_ε) = x_ε.

Since K is sequentially compact we can find a sequence ε_k → 0 such that x_k = x_{ε_k} converges to some point x̄ ∈ K.

We claim that f(x̄) = x̄.

Clearly f_{ε_k}(x_k) = x_k → x̄. To conclude the proof we only need to show that also f_{ε_k}(x_k) → f(x̄) or, which is the same, that ||f_{ε_k}(x_k) − f(x̄)|| → 0. In fact we have

||f_{ε_k}(x_k) − f(x̄)|| = ||π_{ε_k}(f(x_k)) − f(x̄)||
  ≤ ||π_{ε_k}(f(x_k)) − f(x_k)|| + ||f(x_k) − f(x̄)||
  ≤ ε_k + ||f(x_k) − f(x̄)|| → 0,

where we used the fact that ||π_ε(x) − x|| ≤ ε, since x ∈ K is contained in some ball B_ε centered on K_ε. Version: 1 Owner: paolini Author(s): paolini


Chapter 504 46B99 – Miscellaneous
504.1 ℓ^p

Let F be either R or C, and let p ∈ R with p ≥ 1. We define ℓ^p to be the vector space of all sequences (a_i)_{i≥0} in F such that

∑_{i=0}^∞ |a_i|^p

converges. ℓ^p is a normed vector space, under the norm

||(a_i)||_p = (∑_{i=0}^∞ |a_i|^p)^{1/p}.

ℓ^∞ is defined to be the vector space of all bounded sequences (a_i)_{i≥0} in F, with norm given by

||(a_i)||_∞ = sup{|a_i| : i ≥ 0}.

ℓ^∞ and ℓ^p for p ≥ 1 are complete under these norms, making them into Banach spaces. Moreover, ℓ² is a Hilbert space under the inner product

⟨(a_i), (b_i)⟩ = ∑_{i=0}^∞ a_i b̄_i.

For p > 1 the (continuous) dual space of ℓ^p is ℓ^q, where 1/p + 1/q = 1. The dual space of ℓ¹ is ℓ^∞; the dual space of ℓ^∞, however, is strictly larger than ℓ¹.

Version: 10 Owner: Evandar Author(s): Evandar
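A quick numerical illustration with truncated sequences (the geometric sequences used here are arbitrary choices): both lie in ℓ², and their ℓ² inner product is bounded by the product of their ℓ² norms (Cauchy-Schwarz).

```python
def lp_norm(a, p):
    # ell^p norm of a finite (truncated) sequence
    return sum(abs(t) ** p for t in a) ** (1.0 / p)

# truncations of a_i = 2^-i and b_i = 3^-i, both square-summable
N = 50
a = [2.0 ** -i for i in range(N)]
b = [3.0 ** -i for i in range(N)]

inner = sum(x * y for x, y in zip(a, b))     # ell^2 inner product (real case)
bound = lp_norm(a, 2) * lp_norm(b, 2)        # Cauchy-Schwarz bound

assert inner <= bound
# sup norm of the bounded sequence a (attained at i = 0)
assert max(abs(t) for t in a) == 1.0
print(inner, bound)
```

For these geometric sequences the inner product converges to 1/(1 − 1/6) = 1.2, safely below the Cauchy-Schwarz bound.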

504.2

Banach space

A Banach space (X, ||·||) is a normed vector space such that X is complete under the metric induced by the norm ||·||. Some authors use the term Banach space only in the case where X is infinite dimensional, although on PlanetMath finite dimensional spaces are also considered to be Banach spaces. If Y is a Banach space and X is any normed vector space, then the set of continuous linear maps f : X → Y forms a Banach space, with norm given by the operator norm. In particular, since R and C are complete, the space of all continuous linear functionals on a normed vector space is a Banach space. Version: 4 Owner: Evandar Author(s): Evandar

504.3

an inner product defines a norm

Let X be an inner product space over the field F (F = R or C) with inner product ⟨·,·⟩ : X × X → F. Then the function ||·|| : X → R defined by ||x|| = √⟨x, x⟩ is a norm on X. Version: 20 Owner: say 10 Author(s): say 10, apmxi

504.4

continuous linear mapping

If (V₁, ||·||₁) and (V₂, ||·||₂) are normed vector spaces, a linear mapping T : V₁ → V₂ is continuous if it is continuous in the metric induced by the norms.

If there is a nonnegative constant c such that ||T(x)||₂ ≤ c||x||₁ for each x ∈ V₁, we say that T is bounded. This should not be confused with the usual terminology referring to a bounded function as one that has bounded range; in fact, bounded linear mappings usually have unbounded ranges.

The expression bounded linear mapping is often used in functional analysis to refer to continuous linear mappings as well. This is because the two definitions are equivalent. If T is bounded, then ||T(x) − T(y)||₂ = ||T(x − y)||₂ ≤ c||x − y||₁, so T is a Lipschitz function. Now suppose T is continuous. Then there exists r > 0 such that ||T(x)||₂ ≤ 1 when ||x||₁ ≤ r. For any nonzero x ∈ V₁, we then have

|| r x/||x||₁ ||₁ = r,  hence  ||T(r x/||x||₁)||₂ ≤ 1,

and so

||T(x)||₂ = (||x||₁/r) ||T(r x/||x||₁)||₂ ≤ ||x||₁/r;

so T is bounded.

It can be shown that a linear mapping between two topological vector spaces is continuous if and only if it is continuous at 0 [3].

REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.

Version: 4 Owner: Koro Author(s): Koro
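For a linear map given by a matrix, boundedness (and hence the Lipschitz property discussed above) is easy to check numerically; a sketch with a hypothetical 2×2 matrix, using the sum of absolute entries as a crude bound c:

```python
import math

A = [[2.0, -1.0],
     [0.5, 3.0]]

def T(x):
    # linear map T(x) = A x on R^2
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def norm(x):
    return math.sqrt(sum(t * t for t in x))

# crude bound: ||Ax|| <= c ||x|| with c = sum of |entries|
# (valid since the Frobenius norm is at most this sum)
c = sum(abs(A[i][j]) for i in range(2) for j in range(2))

for x in ([1.0, 0.0], [3.0, -4.0], [-2.0, 2.0]):
    assert norm(T(x)) <= c * norm(x) + 1e-12

# boundedness gives the Lipschitz estimate ||T(x) - T(y)|| <= c ||x - y||
x, y = [1.0, 2.0], [-1.0, 0.5]
diff = [a - b for a, b in zip(T(x), T(y))]
assert norm(diff) <= c * norm([a - b for a, b in zip(x, y)])
print("boundedness checks passed")
```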

504.5

equivalent norms

Definition Let ||·|| and ||·||′ be two norms on a vector space V. These norms are equivalent norms if there exist positive real numbers c, d such that

c||x|| ≤ ||x||′ ≤ d||x||

for all x ∈ V. An equivalent condition is that there exists a number C > 0 such that

(1/C)||x|| ≤ ||x||′ ≤ C||x||

for all x ∈ V. To see the equivalence, set C = max{1/c, d}.

Some key results are as follows:

1. On a finite dimensional vector space all norms are equivalent. The same is not true for vector spaces of infinite dimension [2]. It follows that on a finite dimensional vector space, one can check the convergence of a sequence with respect to any norm. If a sequence converges in one norm, it converges in all norms.

2. If two norms are equivalent on a vector space V, they induce the same topology on V [2].

1819

REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons, 1978.

Version: 3 Owner: Koro Author(s): matte
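On R^n the 1-norm and the ∞-norm are equivalent with constants c = 1 and d = n (the standard estimate ||x||_∞ ≤ ||x||₁ ≤ n||x||_∞); a quick check on a few sample vectors:

```python
def norm1(x):
    return sum(abs(t) for t in x)

def norm_inf(x):
    return max(abs(t) for t in x)

n = 4
samples = [[1.0, -2.0, 3.0, 0.5],
           [0.0, 0.0, -7.0, 0.0],
           [1.0, 1.0, 1.0, 1.0]]

# c ||x||_inf <= ||x||_1 <= d ||x||_inf with c = 1, d = n
for x in samples:
    assert norm_inf(x) <= norm1(x) <= n * norm_inf(x)
print("equivalence constants verified on samples")
```

Note that both extremes are attained: the second sample hits the lower constant, the third hits the upper one.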

504.6

normed vector space

Let F be a field which is either R or C. A normed vector space over F is a pair (V, ||·||) where V is a vector space over F and ||·|| : V → R is a function such that

1. ||v|| ≥ 0 for all v ∈ V, and ||v|| = 0 if and only if v = 0 in V (positive definiteness)

2. ||λv|| = |λ| ||v|| for all v ∈ V and all λ ∈ F

3. ||v + w|| ≤ ||v|| + ||w|| for all v, w ∈ V (the triangle inequality)

The function ||·|| is called a norm on V.

Some properties of norms:

1. If W is a subspace of V then W can be made into a normed space by simply restricting the norm on V to W. This is called the induced norm on W.

2. Any normed vector space (V, ||·||) is a metric space under the metric d : V × V → R given by d(u, v) = ||u − v||. This is called the metric induced by the norm ||·||.

3. In this metric, the norm defines a continuous map from V to R - this is an easy consequence of the triangle inequality.

4. If (V, ⟨·,·⟩) is an inner product space, then there is a natural induced norm given by ||v|| = √⟨v, v⟩ for all v ∈ V.

Version: 5 Owner: Evandar Author(s): Evandar


Chapter 505 46Bxx – Normed linear spaces and Banach spaces; Banach lattices
505.1 vector p-norm

A class of vector norms, called a p-norm and denoted ||·||_p, is defined as

||x||_p = (|x₁|^p + · · · + |x_n|^p)^{1/p},  p ≥ 1, x ∈ R^n.

The most widely used are the 1-norm, 2-norm, and ∞-norm:

||x||₁ = |x₁| + · · · + |x_n|
||x||₂ = √(|x₁|² + · · · + |x_n|²) = √(xᵀx)
||x||_∞ = max_{1≤i≤n} |x_i|

The 2-norm is sometimes called the Euclidean vector norm, because ||x − y||₂ yields the Euclidean distance between any two vectors x, y ∈ R^n. The 1-norm is also called the taxicab metric (sometimes Manhattan metric) since the distance of two points can be viewed as the distance a taxi would travel on a city grid (horizontal and vertical movements). A useful fact is that for finite dimensional spaces (like R^n) the three mentioned norms are equivalent. Version: 5 Owner: drini Author(s): drini, Logan
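The three norms above, computed directly for one (arbitrarily chosen) vector; note the chain ||x||_∞ ≤ ||x||₂ ≤ ||x||₁ visible in the output:

```python
import math

x = [3.0, -4.0, 12.0]

one_norm = sum(abs(t) for t in x)              # |x1| + ... + |xn|
two_norm = math.sqrt(sum(t * t for t in x))    # sqrt(x^T x)
inf_norm = max(abs(t) for t in x)              # max |xi|

print(one_norm, two_norm, inf_norm)            # 19.0 13.0 12.0
assert inf_norm <= two_norm <= one_norm
```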


Chapter 506 46C05 – Hilbert and pre-Hilbert spaces: geometry and topology (including spaces with semidefinite inner product)
506.1 Bessel inequality

Let H be a Hilbert space, and suppose e₁, e₂, . . . ∈ H is an orthonormal sequence. Then for any x ∈ H,

∑_{k=1}^∞ |⟨x, e_k⟩|² ≤ ||x||².

Bessel's inequality immediately lets us define the sum

x′ = ∑_{k=1}^∞ ⟨x, e_k⟩ e_k.

The inequality means that the series converges. For a complete orthonormal series, we have Parseval's theorem, which replaces inequality with equality (and consequently x′ with x). Version: 2 Owner: ariels Author(s): ariels
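Bessel's inequality can be seen concretely in R^n with the dot product: projecting onto fewer orthonormal vectors than the dimension captures only part of ||x||². A sketch using the first two standard basis vectors of R³ (an arbitrary choice of incomplete orthonormal sequence):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

x = [1.0, 2.0, 3.0]
e = [[1.0, 0.0, 0.0],   # orthonormal sequence (incomplete: spans only a plane)
     [0.0, 1.0, 0.0]]

bessel_sum = sum(dot(x, ek) ** 2 for ek in e)   # sum |<x, e_k>|^2
norm_sq = dot(x, x)                             # ||x||^2

print(bessel_sum, norm_sq)                      # 5.0 14.0
assert bessel_sum <= norm_sq
```

Adding the third basis vector would make the sequence complete and turn the inequality into Parseval's equality.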

506.2

Hilbert module

Definition 16. A (right) pre-Hilbert module over a C*-algebra A is a right A-module E equipped with an A-valued inner product ⟨−, −⟩ : E × E → A, i.e. a sesquilinear pairing satisfying

⟨u, va⟩ = ⟨u, v⟩a,    (506.2.1)
⟨u, v⟩ = ⟨v, u⟩*,    (506.2.2)
⟨v, v⟩ ≥ 0, with ⟨v, v⟩ = 0 iff v = 0,    (506.2.3)

for all u, v ∈ E and a ∈ A. Note, positive definiteness is well-defined due to the notion of positivity for C*-algebras. The norm of an element v ∈ E is defined by ||v|| = ||⟨v, v⟩||^{1/2}.

Definition 17. A (right) Hilbert module over a C*-algebra A is a right pre-Hilbert module over A which is complete with respect to the norm.

Example 23 (Hilbert spaces). A complex Hilbert space is a Hilbert C-module.

Example 24 (C*-algebras). A C*-algebra A is a Hilbert A-module with inner product ⟨a, b⟩ = a*b.

Definition 18. A Hilbert A-B-bimodule is a (right) Hilbert module E over a C*-algebra B together with a *-homomorphism π from a C*-algebra A to End(E). Version: 4 Owner: mhale Author(s): mhale

506.3

Hilbert space

A Hilbert space is an inner product space (X, ⟨·,·⟩) which is complete under the induced metric. In particular, a Hilbert space is a Banach space in the norm induced by the inner product, since the norm and the inner product both induce the same metric. Some authors require X to be infinite dimensional for it to be called a Hilbert space. Version: 7 Owner: Evandar Author(s): Evandar

506.4 proof of Bessel inequality

Let

r_n = x − ∑_{k=1}^n ⟨x, e_k⟩ e_k.

Then for j = 1, . . . , n,

⟨r_n, e_j⟩ = ⟨x, e_j⟩ − ∑_{k=1}^n ⟨x, e_k⟩⟨e_k, e_j⟩ = ⟨x, e_j⟩ − ⟨x, e_j⟩⟨e_j, e_j⟩ = 0,    (506.4.1)

so e₁, . . . , e_n, r_n is an orthogonal set. Computing norms, we see that

||x||² = ||r_n + ∑_{k=1}^n ⟨x, e_k⟩e_k||² = ||r_n||² + ∑_{k=1}^n |⟨x, e_k⟩|² ≥ ∑_{k=1}^n |⟨x, e_k⟩|².    (506.4.2)

So the series

∑_{k=1}^∞ |⟨x, e_k⟩|²

converges and is bounded by ||x||², as required. Version: 1 Owner: ariels Author(s): ariels


Chapter 507 46C15 – Characterizations of Hilbert spaces
507.1 classification of separable Hilbert spaces

Let H1 and H2 be infinite dimensional, separable Hilbert spaces. Then there is an isomorphism f : H1 → H2 which is also an isometry. In other words, H1 and H2 are identical as Hilbert spaces. Version: 2 Owner: Evandar Author(s): Evandar


Chapter 508 46E15 – Banach spaces of continuous, differentiable or analytic functions
508.1 Ascoli-Arzela theorem

Theorem 19. Let Ω be a bounded subset of R^n and (f_k) a sequence of functions f_k : Ω → R^m. If {f_k} is equibounded and uniformly equicontinuous then there exists a uniformly convergent subsequence (f_{k_j}).

A more abstract (and more general) version is the following.

Theorem 20. Let X and Y be totally bounded metric spaces and let F ⊂ C(X, Y) be a uniformly equicontinuous family of continuous mappings from X to Y. Then F is totally bounded (with respect to the uniform convergence metric induced by C(X, Y)).

Notice that the first version is a consequence of the second. Recall, in fact, that a subset of a complete metric space is totally bounded if and only if its closure is compact (or sequentially compact). Hence Ω is totally bounded and all the functions f_k have images in a totally bounded set. Since F = {f_k} is totally bounded, its closure is sequentially compact, and hence (f_k) has a convergent subsequence. Version: 6 Owner: paolini Author(s): paolini, n3o

508.2

Stone-Weierstrass theorem

Let X be a compact metric space and let C⁰(X, R) be the algebra of continuous real functions defined over X. Let A be a subalgebra of C⁰(X, R) for which the following conditions hold:

1. ∀x, y ∈ X with x ≠ y, ∃f ∈ A : f(x) ≠ f(y) (A separates points)

2. 1 ∈ A

Then A is dense in C⁰(X, R). Version: 1 Owner: n3o Author(s): n3o

508.3

proof of Ascoli-Arzelà theorem

Given ε > 0 we aim at finding a 4ε-lattice in F (see the definition of total boundedness). Let δ > 0 be given with respect to ε in the definition of equicontinuity of F. Let X_δ be a δ-lattice in X and Y_ε an ε-lattice in Y. Let now Y_ε^{X_δ} be the set of functions from X_δ to Y_ε and define G_ε ⊂ Y_ε^{X_δ} by

G_ε = {g ∈ Y_ε^{X_δ} : ∃f ∈ F ∀x ∈ X_δ  d(f(x), g(x)) < ε}.

Since Y_ε^{X_δ} is a finite set, G_ε is finite too: say G_ε = {g₁, . . . , g_N}. Then define F_ε ⊂ F, F_ε = {f₁, . . . , f_N}, where f_k : X → Y is a function in F such that d(f_k(x), g_k(x)) < ε for all x ∈ X_δ (the existence of such a function is guaranteed by the definition of G_ε).

We now will prove that F_ε is a 4ε-lattice in F. Given f ∈ F choose g ∈ Y_ε^{X_δ} such that for all x ∈ X_δ it holds d(f(x), g(x)) < ε (this is possible as for all x ∈ X_δ there exists y ∈ Y_ε with d(f(x), y) < ε). We conclude that g ∈ G_ε and hence g = g_k for some k ∈ {1, . . . , N}. Notice also that for all x ∈ X_δ we have d(f(x), f_k(x)) ≤ d(f(x), g_k(x)) + d(g_k(x), f_k(x)) < 2ε.

Given any x ∈ X we know that there exists x_δ ∈ X_δ such that d(x, x_δ) < δ. So, by equicontinuity of F,

d(f(x), f_k(x)) ≤ d(f(x), f(x_δ)) + d(f_k(x), f_k(x_δ)) + d(f(x_δ), f_k(x_δ)) ≤ 4ε.

Version: 3 Owner: paolini Author(s): paolini

508.4

Hölder inequality

The Hölder inequality concerns vector p-norms: if 1/p + 1/q = 1 then

|xᵀy| ≤ ||x||_p ||y||_q.

An important instance of a Hölder inequality is the Cauchy-Schwarz inequality.

There is a version of this result for the Lᵖ spaces. If a function f is in Lᵖ(X), then the Lᵖ-norm of f is denoted ||f||_p. Let (X, B, µ) be a measure space. If f is in Lᵖ(X) and g is in L^q(X) (with 1/p + 1/q = 1), then the Hölder inequality becomes

||fg||₁ = ∫_X |fg| dµ ≤ (∫_X |f|ᵖ dµ)^{1/p} (∫_X |g|^q dµ)^{1/q} = ||f||_p ||g||_q.

Version: 10 Owner: drini Author(s): paolini, drini, Logan
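A numerical spot-check of the vector form with the conjugate pair p = 3, q = 3/2 (the vectors are arbitrary choices):

```python
def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [1.0, -2.0, 3.0]
y = [0.5, 4.0, -1.0]
p, q = 3.0, 1.5            # 1/3 + 2/3 = 1, so p and q are conjugate

lhs = abs(sum(a * b for a, b in zip(x, y)))   # |x^T y|
rhs = lp_norm(x, p) * lp_norm(y, q)           # ||x||_p ||y||_q

assert abs(1.0 / p + 1.0 / q - 1.0) < 1e-12
assert lhs <= rhs
print(lhs, rhs)
```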

508.5

Young Inequality

Let a, b > 0 and p, q ∈ ]1, ∞[ with 1/p + 1/q = 1. Then

ab ≤ aᵖ/p + b^q/q.

Version: 1 Owner: paolini Author(s): paolini
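A quick numerical check of the inequality over a small grid of values (the conjugate pairs and grid points are chosen arbitrarily):

```python
pairs_pq = [(2.0, 2.0), (3.0, 1.5), (4.0, 4.0 / 3.0)]
values = [0.1, 0.5, 1.0, 2.0, 5.0]

for p, q in pairs_pq:
    assert abs(1.0 / p + 1.0 / q - 1.0) < 1e-12   # conjugate indices
    for a in values:
        for b in values:
            # Young: ab <= a^p / p + b^q / q
            assert a * b <= a ** p / p + b ** q / q + 1e-12
print("Young inequality verified on grid")
```

Equality occurs exactly when aᵖ = b^q (e.g. a = b with p = q = 2), which the tolerance term accommodates.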

508.6

conjugate index

For p, q ∈ R with 1 < p, q < ∞, we say p and q are conjugate indices if

1/p + 1/q = 1.

Formally, we will also define q = ∞ as conjugate to p = 1 and vice versa.

Conjugate indices are used in the Hölder inequality and more generally to define conjugate spaces. Version: 4 Owner: bwebste Author(s): bwebste, drummond

508.7

proof of Hölder inequality

First we prove the more general form (in measure spaces). Let (X, µ) be a measure space and let f ∈ Lᵖ(X), g ∈ L^q(X), where p, q ∈ [1, +∞] and 1/p + 1/q = 1.

The case p = 1 and q = ∞ is obvious since |f(x)g(x)| ≤ ||g||_{L^∞} |f(x)|. Also if f = 0 or g = 0 the result is obvious. Otherwise notice that (applying the Young inequality) we have

||fg||_{L¹} / (||f||_{Lᵖ} ||g||_{L^q}) = ∫_X (|f| / ||f||_{Lᵖ}) (|g| / ||g||_{L^q}) dµ
  ≤ (1/p) ∫_X (|f|ᵖ / ||f||ᵖ_{Lᵖ}) dµ + (1/q) ∫_X (|g|^q / ||g||^q_{L^q}) dµ = 1/p + 1/q = 1,

hence the desired inequality holds:

∫_X |fg| = ||fg||_{L¹} ≤ ||f||_{Lᵖ} ||g||_{L^q} = (∫_X |f|ᵖ)^{1/p} (∫_X |g|^q)^{1/q}.

If x and y are vectors in R^n or vectors in ℓᵖ-spaces the proof is the same, only replace integrals with sums. If we define ||x||_p = (∑_k |x_k|ᵖ)^{1/p} we have

|∑_k x_k y_k| / (||x||_p ||y||_q) ≤ ∑_k |x_k||y_k| / (||x||_p ||y||_q)
  ≤ (1/p) ∑_k |x_k|ᵖ / ||x||ᵖ_p + (1/q) ∑_k |y_k|^q / ||y||^q_q = 1/p + 1/q = 1.

Version: 1 Owner: paolini Author(s): paolini

508.8

proof of Young Inequality
By the concavity of the log function we have

log(ab) = (1/p) log aᵖ + (1/q) log b^q ≤ log(aᵖ/p + b^q/q).

By exponentiation we obtain the desired result. Version: 1 Owner: paolini Author(s): paolini

508.9

vector field

A (smooth, differentiable) vector field on a (smooth, differentiable) manifold M is a (smooth, differentiable) function v : M → TM, where TM is the tangent bundle of M, which takes m to the tangent space T_m M, i.e., a section of the tangent bundle. Less formally, it can be thought of as a continuous choice of a tangent vector at each point of a manifold. Alternatively, vector fields on a manifold can be identified with derivations of the algebra of (smooth, differentiable) functions. Though less intuitive, this definition can be more formally useful. Version: 8 Owner: bwebste Author(s): bwebste, slider142

Chapter 509 46F05 – Topological linear spaces of test functions, distributions and ultradistributions
509.1 Tf is a distribution of zeroth order

To check that T_f is a distribution of zeroth order, we shall use condition (3) on this page. First, it is clear that T_f is a linear mapping. To see that T_f is continuous, suppose K is a compact set in U and u ∈ D_K, i.e., u is a smooth function with support in K. We then have

|T_f(u)| = |∫_K f(x)u(x) dx| ≤ ∫_K |f(x)| |u(x)| dx ≤ (∫_K |f(x)| dx) ||u||_∞.

Since f is locally integrable, it follows that C = ∫_K |f(x)| dx is finite, so

|T_f(u)| ≤ C||u||_∞.

Thus T_f is a distribution of zeroth order ([2], pp. 381). □

REFERENCES
1. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.

Version: 3 Owner: matte Author(s): matte


509.2

p.v.(1/x) is a distribution of first order

(Following [4, 2].) Let u ∈ D(R). Then supp u ⊂ [−k, k] for some k > 0. For any ε > 0, u(x)/x is Lebesgue integrable for |x| ∈ [ε, k]. Thus, by a change of variable, we have

p.v.(1/x)(u) = lim_{ε→0+} ∫_{[ε,k]} (u(x) − u(−x))/x dx.

Now it is clear that the integrand is continuous for all x ∈ R \ {0}. What is more, the integrand approaches 2u′(0) as x → 0, so the integrand has a removable discontinuity at x = 0. That is, by assigning the value 2u′(0) to the integrand at x = 0, the integrand becomes continuous on [0, k]. This means that the integrand is Lebesgue measurable on [0, k]. Then, by defining f_n(x) = χ_{[1/n,k]}(x) (u(x) − u(−x))/x (where χ is the characteristic function), and applying the Lebesgue dominated convergence theorem, we have

p.v.(1/x)(u) = ∫_{[0,k]} (u(x) − u(−x))/x dx.

It follows that p.v.(1/x)(u) is finite, i.e., p.v.(1/x) takes values in C. Since D(R) is a vector space, it follows easily from the above expression that p.v.(1/x) is linear.

To prove that p.v.(1/x) is continuous, we shall use condition (3) on this page. For this, suppose K is a compact subset of R and u ∈ D_K. Again, we can assume that K ⊂ [−k, k] for some k > 0. For x > 0, we have

|(u(x) − u(−x))/x| = |(1/x) ∫_{(−x,x)} u′(t) dt| ≤ 2||u′||_∞,

where ||·||_∞ is the supremum norm. In the first equality we have used the fundamental theorem of calculus (valid since u is absolutely continuous on [−k, k]). Thus

|p.v.(1/x)(u)| ≤ 2k||u′||_∞,

and p.v.(1/x) is a distribution of first order as claimed. □
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980. 2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.

Version: 4 Owner: matte Author(s): matte

509.3

Cauchy principal part integral

Definition [4, 2, 2] Let C₀^∞(R) be the set of smooth functions with compact support on R. Then the Cauchy principal part integral p.v.(1/x) is the mapping p.v.(1/x) : C₀^∞(R) → C defined as

p.v.(1/x)(u) = lim_{ε→0+} ∫_{|x|>ε} u(x)/x dx

for u ∈ C₀^∞(R).

Theorem The mapping p.v.(1/x) is a distribution of first order. That is, p.v.(1/x) ∈ D′¹(R). (proof.)

Properties

1. The distribution p.v.(1/x) is obtained as the limit ([2], pp. 250)

χ_{n|x|≥1}(x)/x → p.v.(1/x)

as n → ∞. Here, χ is the characteristic function, the locally integrable functions on the left hand side should be interpreted as distributions (see this page), and the limit should be taken in D′(R).

2. Let ln|t| be the distribution induced by the locally integrable function ln|t| : R → R. Then, for the distributional derivative D, we have ([2], pp. 149)

D(ln|t|) = p.v.(1/t).
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980. 2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998. 3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.

Version: 5 Owner: matte Author(s): matte


509.4

delta distribution

Let U be an open subset of R^n such that 0 ∈ U. Then the delta distribution is the mapping [2, 3, 4]

δ : D(U) → C,  u ↦ u(0).

Claim The delta distribution is a distribution of zeroth order, i.e., δ ∈ D′⁰(U).

Proof. With obvious notation, we have

δ(u + v) = (u + v)(0) = u(0) + v(0) = δ(u) + δ(v),
δ(αu) = (αu)(0) = αu(0) = αδ(u),

so δ is linear. To see that δ is continuous, we use condition (3) on this page. Indeed, if K is a compact set in U, and u ∈ D_K, then

|δ(u)| = |u(0)| ≤ ||u||_∞,

where ||·||_∞ is the supremum norm. □

REFERENCES
1. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 3. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.

Version: 1 Owner: matte Author(s): matte
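Although δ is not given by integration against a function, it is a limit of such distributions: ∫ φ_ε(x)u(x) dx → u(0) when φ_ε is a unit-mass bump narrowing around 0 (an approximation of the identity). A numerical sketch with Gaussian bumps and a hypothetical test function u; this standard approximation view is an addition for illustration, not part of the entry above:

```python
import math

def u(x):
    # a smooth, rapidly decaying test function (hypothetical choice), u(0) = 1
    return math.cos(x) * math.exp(-x * x)

def gaussian_action(eps, n=100001, half_width=10.0):
    # midpoint-rule approximation of the integral of phi_eps(x) u(x) dx, where
    # phi_eps(x) = exp(-(x/eps)^2) / (eps * sqrt(pi)) has unit mass
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        phi = math.exp(-(x / eps) ** 2) / (eps * math.sqrt(math.pi))
        total += phi * u(x) * h
    return total

# as eps -> 0, the smeared averages approach delta(u) = u(0) = 1
for eps in (1.0, 0.3, 0.1):
    print(eps, gaussian_action(eps))
```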

509.5

distribution

Definition [1] Suppose U is an open set in R^n, and suppose D(U) is the topological vector space of smooth functions with compact support. A distribution is a continuous linear functional on D(U), i.e., a continuous linear mapping D(U) → C. The set of all distributions on U is denoted by D′(U). Suppose T is a linear functional on D(U). Then T is continuous if and only if T is continuous at the origin (see this page). This condition can be rewritten in various ways, and the below theorem gives two convenient conditions that can be used to prove that a linear mapping is a distribution.

Theorem Let U be an open set in R^n, and let T be a linear functional on D(U). Then the following are equivalent:

1. T is a distribution.

2. If K is a compact set in U, and {u_i}_{i=1}^∞ is a sequence in D_K such that for any multi-index α, we have D^α u_i → 0 in the supremum norm as i → ∞, then T(u_i) → 0 in C.

3. For any compact set K in U, there are constants C > 0 and k ∈ {1, 2, . . .} such that for all u ∈ D_K, we have

|T(u)| ≤ C ∑_{|α|≤k} ||D^α u||_∞,    (509.5.1)

where α is a multi-index, and ||·||_∞ is the supremum norm.

Proof The equivalence of (2) and (3) can be found on this page, and the equivalence of (1) and (3) is shown in [3], pp. 141.

If T is a distribution on an open set U, and the same k can be used for any K in inequality (509.5.1), then T is a distribution of order k. The set of all such distributions is denoted by D′ᵏ(U). Further, the set of all distributions of finite order on U is defined as [4]

D′_F(U) = {T ∈ D′(U) | T ∈ D′ᵏ(U) for some k < ∞}.

A common notation for the action of a distribution T onto a test function u ∈ D(U) (i.e., T(u) with the above notation) is ⟨T, u⟩. The motivation for this comes from this example.

Topology for D′(U) The standard topology for D′(U) is the weak* topology. In this topology, a sequence {T_i}_{i=1}^∞ of distributions (in D′(U)) converges to a distribution T ∈ D′(U) if and only if T_i(u) → T(u) (in C) as i → ∞ for every u ∈ D(U) [3].

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.


3. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.

Version: 6 Owner: matte Author(s): matte

509.6

equivalence of conditions

Let us first show the equivalence of (2) and (3), following [4], pp. 35. First, the proof that (3) implies (2) is a direct calculation. Next, let us show that (2) implies (3). So suppose that T(u_i) → 0 in C whenever K is a compact set in U and {u_i}_{i=1}^∞ is a sequence in D_K such that for any multi-index α we have D^α u_i → 0 in the supremum norm ||·||_∞ as i → ∞.

For a contradiction, suppose there is a compact set K in U such that for all constants C > 0 and k ∈ {0, 1, 2, . . .} there exists a function u ∈ D_K such that

|T(u)| > C ∑_{|α|≤k} ||D^α u||_∞.

Then, for C = k = 1, 2, . . . we obtain functions u₁, u₂, . . . in D_K such that |T(u_i)| > i ∑_{|α|≤i} ||D^α u_i||_∞. Thus |T(u_i)| > 0 for all i, so for v_i = u_i/|T(u_i)|, we have

1 > i ∑_{|α|≤i} ||D^α v_i||_∞.

It follows that ||D^α v_i||_∞ < 1/i for any multi-index α with |α| ≤ i. Thus {v_i}_{i=1}^∞ satisfies our assumption, whence T(v_i) should tend to 0. However, for all i, we have |T(v_i)| = 1. This contradiction completes the proof. The equivalence of (1) and (3) is given in [3]. □

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.

Version: 3 Owner: matte Author(s): matte


509.7

every locally integrable function is a distribution

Suppose U is an open set in Rⁿ and f is a locally integrable function on U, i.e., f ∈ L¹_loc(U). Then the mapping
$$T_f : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto \int_U f(x)\, u(x)\, dx$$
is a zeroth order distribution [4, 2]. (Here, D(U) is the set of smooth functions with compact support on U.) (proof)

If f and g are both locally integrable functions on an open set U, and T_f = T_g, then it follows (see this page) that f = g almost everywhere. Thus, the mapping f ↦ T_f is a linear injection when L¹_loc is equipped with the usual equivalence relation for an Lᵖ-space. For this reason, one also writes f for the distribution T_f [2].
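A concrete instance (standard example, added here for illustration): the Heaviside step function H, with H(x) = 1 for x ≥ 0 and H(x) = 0 otherwise, is locally integrable on R though not continuous, and the induced zeroth order distribution acts by

```latex
T_H(u) = \int_{\mathbb{R}} H(x)\, u(x)\, dx
= \int_0^{\infty} u(x)\, dx,
\qquad u \in \mathcal{D}(\mathbb{R}).
```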

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.

Version: 2 Owner: matte Author(s): matte

509.8

localization for distributions

Definition [1, 3] Suppose U is an open set in Rⁿ and T ∈ D′(U) is a distribution. Then we say that T vanishes on an open set V ⊂ U if the restriction of T to V is the zero distribution on V. In other words, T vanishes on V if T(v) = 0 for all v ∈ C₀^∞(V). (Here C₀^∞(V) is the set of smooth functions with compact support in V.) Similarly, we say that two distributions S, T ∈ D′(U) are equal, or coincide, on V if S − T vanishes on V. We then write: S = T on V.

Theorem [1, 4] Suppose U is an open set in Rⁿ and {U_i}_{i∈I} is an open cover of U, i.e.,
$$U = \bigcup_{i \in I} U_i.$$
Here, I is an arbitrary index set. If S, T are distributions on U such that S = T on each U_i, then S = T (on U).

Proof. Suppose u ∈ D(U). Our aim is to show that S(u) = T(u). First, we have supp u ⊂ K for some compact K ⊂ U. It follows that there exists a finite collection of the U_i's from the open cover, say U₁, ..., U_N, such that K ⊂ ∪_{i=1}^N U_i. By a smooth partition of unity (see e.g. [2], pp. 137), there are smooth functions φ₁, ..., φ_N : U → R such that

1. supp φ_i ⊂ U_i for all i,

2. φ_i(x) ∈ [0, 1] for all x ∈ U and all i,

3. Σ_{i=1}^N φ_i(x) = 1 for all x ∈ K.

From the first property, and from a property of the support of a function, it follows that supp φ_i u ⊂ supp φ_i ∩ supp u ⊂ U_i. Therefore, for each i, S(φ_i u) = T(φ_i u), since S and T coincide on U_i. Then
$$S(u) = \sum_{i=1}^N S(\varphi_i u) = \sum_{i=1}^N T(\varphi_i u) = T(u),$$
and the theorem follows. ∎

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 3. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 4. S. Igari, Real Analysis - With an Introduction to Wavelet Theory, American Mathematical Society, 1998.

Version: 4 Owner: matte Author(s): matte

509.9

operations on distributions

Let us assume that U is an open set in Rⁿ. Then we can define the operations below for distributions in D′(U). To prove that these operations indeed give rise to other distributions, one can use condition (2) given on this page.


Vector space structure of D′(U) Suppose S, T are distributions in D′(U) and α is a complex number. It is then natural to define [4]
$$S + T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto S(u) + T(u),$$
and
$$\alpha T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto \alpha T(u).$$
It is readily shown that these are again distributions. Thus D′(U) is a complex vector space.

Restriction of a distribution Suppose T is a distribution in D′(U), and V is an open subset of U. Then the restriction of the distribution T to V is the distribution T|_V ∈ D′(V) defined as [4]
$$T|_V : \mathcal{D}(V) \to \mathbb{C}, \qquad v \mapsto T(v).$$

Again, using condition (2) on this page, one can check that T|_V is indeed a distribution.

Derivative of a distribution Suppose T is a distribution in D′(U), and α is a multi-index. Then the α-derivative of T is the distribution ∂^α T ∈ D′(U) defined as
$$\partial^\alpha T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto (-1)^{|\alpha|}\, T(\partial^\alpha u),$$
where the last ∂^α is the usual derivative defined here for smooth functions.

Suppose α is a multi-index and f : U → C is a locally integrable function, all of whose partial derivatives up to order |α| are continuous. Then, if T_f is the distribution induced by f, we have ([3], pp. 143)
$$\partial^\alpha T_f = T_{\partial^\alpha f}.$$
This means that the derivative of a distribution coincides with the usual definition of the derivative, provided that the distribution is induced by a sufficiently smooth function.

If α and β are multi-indices, then for any T ∈ D′(U) we have ∂^α ∂^β T = ∂^β ∂^α T. This follows since the corresponding relation holds in D(U) (see this page).
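A worked example of the definition (standard, though not carried out in the source): let H be the Heaviside step function on R (H(x) = 1 for x ≥ 0, else 0). For a test function u, the distributional derivative of T_H is the delta distribution, even though H has no classical derivative at 0:

```latex
\partial T_H(u) = (-1)^{1}\, T_H(u')
= -\int_0^{\infty} u'(x)\, dx
= u(0) = \delta(u).
```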

Multiplication of a distribution by a smooth function Suppose T is a distribution in D′(U), and f is a smooth function on U, i.e., f ∈ C^∞(U). Then fT is the distribution fT ∈ D′(U) defined as
$$fT : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto T(fu),$$
where fu is the smooth mapping fu : x ↦ f(x)u(x). The proof that fT is a distribution is an application of Leibniz' rule [3].

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.

Version: 4 Owner: matte Author(s): matte

509.10

smooth distribution

Definition 1 Suppose U is an open set in Rⁿ, suppose T is a distribution on U, i.e., T ∈ D′(U), and suppose V is an open set V ⊂ U. Then we say that T is smooth on V if there exists a smooth function f : V → C such that T|_V = T_f. In other words, T is smooth on V if the restriction of T to V coincides with the distribution induced by some smooth function f : V → C.

Definition 2 [1, 2] Suppose U is an open set in Rⁿ and T ∈ D′(U). Then the singular support of T (which is denoted by sing supp T) is the complement of the largest open set where T is smooth.

Examples

1. [2] On R, let f be the function with f(x) = 1 when x is irrational and f(x) = 0 when x is rational. Then the distribution induced by f, that is T_f, is smooth. Indeed, let 1 be the smooth function x ↦ 1. Since f = 1 almost everywhere, we have T_f = T₁ (see this page), so T_f is smooth.

2. For the delta distribution δ, we have sing supp δ = {0}.

3. For any distribution T ∈ D′(U), we have [1] sing supp T ⊂ supp T, where supp T is the support of T.

4. Let f be a smooth function f : U → C. Then sing supp T_f is empty [1].

REFERENCES
1. J. Barros-Neto, An Introduction to the Theory of Distributions, Marcel Dekker, Inc., 1973. 2. A. Grigis, J. Sjöstrand, Microlocal Analysis for Differential Operators, Cambridge University Press, 1994. 3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.

Version: 2 Owner: matte Author(s): matte

509.11

space of rapidly decreasing functions

Definition [4, 2] The space of rapidly decreasing functions is the function space
$$S(\mathbb{R}^n) = \{ f \in C^\infty(\mathbb{R}^n) \mid \|f\|_{\alpha,\beta} < \infty \text{ for all multi-indices } \alpha, \beta \},$$
where C^∞(Rⁿ) is the set of smooth functions from Rⁿ to C, and
$$\|f\|_{\alpha,\beta} = \|x^\alpha D^\beta f\|_\infty.$$
Here, ||·||_∞ is the supremum norm (over x ∈ Rⁿ), and we use multi-index notation. When the dimension n is clear, it is convenient to write S = S(Rⁿ). The space S is also called the Schwartz space, after Laurent Schwartz (1915-2002) [3].

The set S is closed under point-wise addition and under multiplication by a complex scalar. Thus S is a complex vector space.

Examples of functions in S

1. If k ∈ {0, 1, 2, ...} and a is a positive real number, then [2] xᵏ exp{−ax²} ∈ S.

2. Any smooth function f with compact support is in S. This is clear since any derivative of f is continuous, so x^α D^β f has a maximum in Rⁿ.

Properties

1. For any 1 ≤ p ≤ ∞, we have [2, 4] S(Rⁿ) ⊂ Lᵖ(Rⁿ), where Lᵖ(Rⁿ) is the space of p-integrable functions.

2. Using Leibniz' rule, it follows that S is also closed under point-wise multiplication: if f, g ∈ S, then fg : x ↦ f(x)g(x) is also in S.
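A numerical sanity check (the function, grid, and helper names are chosen here for illustration, not taken from the source): for the Gaussian f(x) = exp(−x²), a standard member of S(R), the seminorms ||x^a D^b f||_∞ stay bounded and the tails vanish.

```python
import math

def f(x):        # Gaussian, a standard member of S(R)
    return math.exp(-x * x)

def df(x):       # its first derivative, computed by hand
    return -2.0 * x * math.exp(-x * x)

def seminorm(a, g, lo=-10.0, hi=10.0, steps=2001):
    """Estimate sup_x |x^a g(x)| on a grid; the rapid decay of g
    makes the supremum occur at moderate |x|."""
    best = 0.0
    for k in range(steps):
        x = lo + (hi - lo) * k / (steps - 1)
        best = max(best, abs((x ** a) * g(x)))
    return best

# seminorms ||x^a D^b f||_inf for a = 0..4 and b = 0, 1
norms = {(a, b): seminorm(a, g) for a in range(5) for b, g in [(0, f), (1, df)]}

# far in the tail, x^4 f(x) has already underflowed to zero
tail = abs((50.0 ** 4) * f(50.0))
```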

REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. S. Igari, Real Analysis - With an Introduction to Wavelet Theory, American Mathematical Society, 1998. 3. The MacTutor History of Mathematics archive, Laurent Schwartz. 4. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.

Version: 2 Owner: matte Author(s): matte

509.12

support of distribution

Definition [1, 2, 3, 4] Let U be an open set in Rⁿ and let T ∈ D′(U) be a distribution. Then the support of T is the complement of the union of all open sets V ⊂ U on which T vanishes. This set is denoted by supp T. If we denote by T|_V the restriction of T to the set V, then we have the formula
$$\operatorname{supp} T = U \setminus \bigcup \{ V \subset U \mid V \text{ is open, and } T|_V = 0 \}.$$

Examples and properties [2, 1] Let U be an open set in Rⁿ.

1. For the delta distribution, supp δ = {0}, provided that 0 ∈ U.

2. For any distribution T, the support supp T is closed.

3. Suppose T_f is the distribution induced by a continuous function f : U → C. Then the above definition of the support of T_f is compatible with the usual definition of the support of the function f, i.e., supp T_f = supp f.

4. If T ∈ D′(U), then we have for any multi-index α, supp D^α T ⊂ supp T.

5. If T ∈ D′(U) and f ∈ D(U), then supp(fT) ⊂ supp f ∩ supp T.

Theorem [2, 3] Suppose U is an open set in Rⁿ. If T is a distribution with compact support in U, then T is a distribution of finite order. What is more, if supp T is a point, say supp T = {p}, then T is of the form
$$T = \sum_{|\alpha| \le N} C_\alpha D^\alpha \delta_p$$
for some N ≥ 0 and complex constants C_α. Here, δ_p is the delta distribution at p; δ_p(u) = u(p) for u ∈ D(U).

REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991. 3. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 4. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 5. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.

Version: 3 Owner: matte Author(s): matte


Chapter 510 46H05 – General theory of topological algebras
510.1 Banach algebra

Definition 19. A Banach algebra is a Banach space with a multiplication law compatible with the norm, i.e.,
$$\|ab\| \le \|a\|\,\|b\| \quad \text{(product inequality)}.$$

Definition 20. A Banach *-algebra is a Banach algebra with an involution satisfying the following properties: for all λ, µ ∈ C,
$$a^{**} = a, \qquad (510.1.1)$$
$$(ab)^* = b^* a^*, \qquad (510.1.2)$$
$$(\lambda a + \mu b)^* = \bar{\lambda}\, a^* + \bar{\mu}\, b^*, \qquad (510.1.3)$$
$$\|a^*\| = \|a\|. \qquad (510.1.4)$$

Example 25. The algebra of bounded operators on a Banach space is a Banach algebra for the operator norm.

Version: 4 Owner: mhale Author(s): mhale
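A finite-dimensional instance of the product inequality (matrices and seed chosen here for the example): for n×n matrices with the operator norm (largest singular value), ||AB|| ≤ ||A|| ||B|| can be checked numerically.

```python
import numpy as np

def op_norm(M):
    """Operator norm of a matrix = its largest singular value."""
    return np.linalg.svd(M, compute_uv=False)[0]

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(4, 4))

lhs = op_norm(A @ B)                 # ||AB||
rhs = op_norm(A) * op_norm(B)        # ||A|| ||B||; product inequality: lhs <= rhs
```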


Chapter 511 46L05 – General theory of C ∗-algebras
511.1

C*-algebra

Definition 43. A C*-algebra A is a Banach *-algebra such that ||a*a|| = ||a||² for all a ∈ A.

Version: 2 Owner: mhale Author(s): mhale

511.2

Gelfand-Naimark representation theorem

Every C ∗ -algebra is isomorphic to a C ∗ -subalgebra (norm closed *-subalgebra) of some B(H), the algebra of bounded operators on some Hilbert space H. In particular, every finite dimensional C ∗ -algebra is isomorphic to a direct sum of matrix algebras. Version: 2 Owner: mhale Author(s): mhale

511.3
state

Definition 44. A state Ψ on a C*-algebra A is a positive linear functional Ψ : A → C, Ψ(a*a) ≥ 0 for all a ∈ A, with unit norm. The norm of a positive linear functional is defined by
$$\|\Psi\| = \sup_{a \in A} \{ |\Psi(a)| : \|a\| \le 1 \}. \qquad (511.3.1)$$
For a unital C*-algebra, ||Ψ|| = Ψ(1).

The space of states is a convex set. Let Ψ₁ and Ψ₂ be states; then the convex combination
$$\lambda \Psi_1 + (1-\lambda)\Psi_2, \qquad \lambda \in [0, 1], \qquad (511.3.2)$$
is also a state.

Definition 45. A state is pure if it is not a convex combination of two other states. Pure states are the extreme points of the convex set of states. A pure state on a commutative C*-algebra is equivalent to a character.

When a C*-algebra is represented on a Hilbert space H, every unit vector ψ ∈ H determines a (not necessarily pure) state in the form of an expectation value,
$$\Psi(a) = \langle \psi, a\psi \rangle. \qquad (511.3.3)$$

In physics, it is common to refer to such states by their vector ψ rather than the linear functional Ψ. The converse is not always true; not every state need be given by an expectation value. For example, delta functions (which are distributions, not functions) give pure states on C₀(X), but they do not correspond to any vector in a Hilbert space (such a vector would not be square-integrable).
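A finite-dimensional sketch of (511.3.3), with the vector and matrix chosen here purely for illustration: on the C*-algebra of 2×2 complex matrices, a unit vector ψ gives the state Ψ(a) = ⟨ψ, aψ⟩, and positivity Ψ(a*a) = ||aψ||² ≥ 0 and Ψ(1) = 1 can be checked directly.

```python
import numpy as np

# a unit vector in C^2 (arbitrary choice for the illustration)
psi = np.array([3.0, 4.0j]) / 5.0

def state(a):
    """Expectation-value state Psi(a) = <psi, a psi> (vdot conjugates
    its first argument, giving the Hilbert-space inner product)."""
    return np.vdot(psi, a @ psi)

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

val = state(a.conj().T @ a)   # Psi(a* a) = ||a psi||^2, real and >= 0
unit = state(np.eye(2))       # Psi(1) = <psi, psi> = 1 for a unit vector
```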

REFERENCES
1. G. Murphy, C ∗ -Algebras and Operator Theory. Academic Press, 1990.

Version: 1 Owner: mhale Author(s): mhale


Chapter 512 46L85 – Noncommutative topology
512.1 Gelfand-Naimark theorem

Let Haus be the category of locally compact Hausdorff spaces with continuous proper maps as morphisms. And let C∗Alg be the category of commutative C*-algebras with proper *-homomorphisms (those sending approximate units to approximate units) as morphisms. There is a contravariant functor C : Haus → C∗Alg which sends each locally compact Hausdorff space X to the commutative C*-algebra C₀(X) (C(X) if X is compact). Conversely, there is a contravariant functor M : C∗Alg → Haus which sends each commutative C*-algebra A to the space of characters on A (with the Gelfand topology). The functors C and M form an equivalence of categories. Version: 1 Owner: mhale Author(s): mhale

512.2

Serre-Swan theorem

Let X be a compact Hausdorff space. Let Vec(X) be the category of complex vector bundles over X. And let ProjMod(C(X)) be the category of finitely generated projective modules over the C*-algebra C(X). There is a functor Γ : Vec(X) → ProjMod(C(X)) which sends each complex vector bundle E → X to the C(X)-module Γ(X, E) of continuous sections. The functor Γ is an equivalence of categories. Version: 1 Owner: mhale Author(s): mhale


Chapter 513 46T12 – Measure (Gaussian, cylindrical, etc.) and integrals (Feynman, path, Fresnel, etc.) on manifolds
513.1 path integral

The path integral is a generalization of the integral that is very useful in theoretical and applied physics. Consider a vector field F : Rⁿ → Rᵐ and a path P ⊂ Rⁿ. The path integral of F along the path P is defined as a definite integral. It can be construed to be the Riemann sum of the values of F along the curve P, i.e., the area under the curve S : P → F. Thus, it is defined in terms of the parametrization of P, mapped into the domain Rⁿ of F. Analytically,
$$\int_P F \cdot dx = \int_a^b F(P(t)) \cdot \frac{dx}{dt}\, dt,$$
where P(a), P(b) are elements of Rⁿ, and dx/dt = (dx₁/dt, ..., dxₙ/dt), where each xᵢ is parametrized into a function of t.

Proof and existence of the path integral: Assume we have a parametrized curve P(t) with t ∈ [a, b]. We want to construct a sum of F over this interval on the curve P. Split the interval [a, b] into n subintervals of size ∆t = (b − a)/n. This means that the path P has been divided into n segments of lesser change in tangent vector. Note that the arc lengths need not be of equal length, though the intervals are of equal size. Let tᵢ be an element of the i-th subinterval. The quantity |P′(tᵢ)| gives the average magnitude of the vector tangent to the curve at a point in the interval ∆t, so |P′(tᵢ)|∆t is the approximate arc length of the curve segment produced by the subinterval ∆t. Since we want to sum F over our curve P, we let the range of our curve equal the domain of F. We can then dot this vector with our tangent vector to get the approximation to F at the point P(tᵢ). Thus, to get the sum we want, we can take the limit as ∆t approaches 0:
$$\lim_{\Delta t \to 0} \sum_i F(P(t_i)) \cdot P'(t_i)\, \Delta t.$$
This is a Riemann sum, and thus we can write it in integral form. This integral is known as a path or line integral (the older name):
$$\int_P F \cdot dx = \int_a^b F(P(t)) \cdot P'(t)\, dt.$$
Note that the path integral only exists if the definite integral exists on the interval [a, b].

Properties: A path integral that begins and ends at the same point is called a closed path integral, and is denoted by the integral sign with a centered circle: ∮. These types of path integrals can also be evaluated using Green's theorem. Another property of path integrals is that the directed path integral on a path C in a vector field is equal to the negative of the path integral in the opposite direction along the same path. A directed path integral on a closed path is denoted by an integral sign with a circle and an arrow denoting direction.

Visualization aids: (figures omitted) an image of a path P superimposed on a vector field F, and a visualization of taking the integral under the curve S : P → F.

Version: 9 Owner: slider142 Author(s): slider142
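To make the Riemann-sum construction concrete, here is a small numerical sketch (the field and path are chosen for the illustration): for F(x, y) = (x, y) along the straight path P(t) = (t, t), t ∈ [0, 1], the integrand is F(P(t))·P′(t) = 2t, so the exact value is 1.

```python
def F(x, y):                 # example vector field F(x, y) = (x, y)
    return (x, y)

def P(t):                    # straight path from (0, 0) to (1, 1)
    return (t, t)

def dP(t):                   # its tangent vector P'(t)
    return (1.0, 1.0)

def path_integral(F, P, dP, a, b, n=10000):
    """Riemann-sum approximation of int_P F . dx with midpoint samples."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        fx, fy = F(*P(t))
        tx, ty = dP(t)
        total += (fx * tx + fy * ty) * dt
    return total

I = path_integral(F, P, dP, 0.0, 1.0)   # exact value is 1
```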


Chapter 514 47A05 – General (adjoints, conjugates, products, inverses, domains, ranges, etc.)
514.1 Baker-Campbell-Hausdorff formula(e)

Given a linear operator A, we define:
$$e^A := \sum_{k=0}^{\infty} \frac{1}{k!} A^k. \qquad (514.1.1)$$
It follows that
$$\frac{\partial}{\partial \tau} e^{\tau A} = A e^{\tau A} = e^{\tau A} A. \qquad (514.1.2)$$
Consider another linear operator B. Let B(τ) = e^{τA} B e^{−τA}. Then one can prove the following series representation for B(τ):
$$B(\tau) = \sum_{m=0}^{\infty} \frac{\tau^m}{m!} B_m, \qquad (514.1.3)$$
where B_m = [A, B]_m := [A, [A, ... [A, B]]] (m times) and B₀ := B. A very important special case of eq. (514.1.3) is known as the Baker-Campbell-Hausdorff (BCH) formula. Namely, for τ = 1 we get:
$$e^A B e^{-A} = \sum_{m=0}^{\infty} \frac{1}{m!} B_m. \qquad (514.1.4)$$
Alternatively, this expression may be rewritten as
$$[B, e^{-A}] = e^{-A} \left( [A, B] + \frac{1}{2}[A, [A, B]] + \dots \right), \qquad (514.1.5)$$
or
$$[e^A, B] = \left( [A, B] + \frac{1}{2}[A, [A, B]] + \dots \right) e^A. \qquad (514.1.6)$$
There is a descendant of the BCH formula, which is often also referred to as the BCH formula. It provides us with the multiplication law for two exponentials of linear operators: suppose [A, [A, B]] = [B, [B, A]] = 0. Then
$$e^A e^B = e^{A+B}\, e^{\frac{1}{2}[A, B]}. \qquad (514.1.7)$$
Thus, if we want to commute two exponentials, we get an extra factor
$$e^A e^B = e^B e^A\, e^{[A, B]}. \qquad (514.1.8)$$

Version: 5 Owner: msihl Author(s): msihl
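A quick numerical check of eqs. (514.1.7) and (514.1.8) (the particular 3×3 matrices are chosen here only because their commutator is central, so the hypothesis [A,[A,B]] = [B,[B,A]] = 0 holds):

```python
import numpy as np

def expm(M, terms=20):
    """Matrix exponential via its power series (adequate for small matrices)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Heisenberg-type matrices: [A, B] commutes with both A and B
A = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]], dtype=float)
B = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
comm = A @ B - B @ A          # central element, so the BCH hypothesis holds

lhs = expm(A) @ expm(B)
rhs = expm(A + B) @ expm(0.5 * comm)      # eq. (514.1.7)
rhs2 = expm(B) @ expm(A) @ expm(comm)     # eq. (514.1.8)
```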

514.2

adjoint

Let H be a Hilbert space and let A : D(A) ⊂ H → H be a densely defined linear operator. Suppose that for some y ∈ H, there exists z ∈ H such that (Ax, y) = (x, z) for all x ∈ D(A). Then such z is unique, for if z′ is another element of H satisfying that condition, we have (x, z − z′) = 0 for all x ∈ D(A), which implies z − z′ = 0 since D(A) is dense. Hence we may define a new operator A* : D(A*) ⊂ H → H by
$$D(A^*) = \{ y \in H : \text{there is } z \in H \text{ such that } (Ax, y) = (x, z) \}, \qquad A^*(y) = z.$$
It is easy to see that A* is linear, and it is called the adjoint of A.

Remark. The requirement for A to be densely defined is essential, for otherwise we cannot guarantee A* to be well defined.

514.3

closed operator

Let B be a Banach space. A linear operator A : D(A) ⊂ B → B is said to be closed if for every sequence {x_n}_{n∈N} in D(A) converging to x ∈ B such that Ax_n → y ∈ B as n → ∞, it holds that x ∈ D(A) and Ax = y. Equivalently, A is closed if its graph is closed in B ⊕ B.

Given an operator A, not necessarily closed, if the closure of its graph in B ⊕ B happens to be the graph of some operator, we call that operator the closure of A and denote it by A̅. It follows easily that A is the restriction of A̅ to D(A). The following properties are easily checked:

1. Any bounded linear operator defined on the whole space B is closed;

2. If A is closed, then A − λI is closed;

3. If A is closed and it has an inverse, then A⁻¹ is also closed;

4. An operator A admits a closure if and only if for every pair of sequences {x_n} and {y_n} in D(A) converging to the same limit, such that both {Ax_n} and {Ay_n} converge, it holds that lim_n Ax_n = lim_n Ay_n.

Version: 2 Owner: Koro Author(s): Koro

514.4

properties of the adjoint operator

Let A and B be linear operators in a Hilbert space, and let λ ∈ C. Assuming all the operators involved are densely defined, the following properties hold:

1. If A⁻¹ exists and is densely defined, then (A⁻¹)* = (A*)⁻¹;

2. (λA)* = λ̄A*;

3. A ⊂ B implies B* ⊂ A*;

4. A* + B* ⊂ (A + B)*;

5. B*A* ⊂ (AB)*;

6. (A + λI)* = A* + λ̄I;

7. A* is a closed operator.

Remark. The notation A ⊂ B for operators means that B is an extension of A, i.e. A is the restriction of B to a smaller domain. Also, we have the following

Proposition. If A admits a closure A̅, then A* is densely defined and (A*)* = A̅.

Version: 5 Owner: Koro Author(s): Koro


Chapter 515 47A35 – Ergodic theory
515.1 ergodic theorem

Let (X, B, µ) be a space with finite measure, f ∈ L¹(X), and T : X → X an ergodic transformation, not necessarily invertible. The ergodic theorem (often called the pointwise or strong ergodic theorem) states that
$$\frac{1}{k} \sum_{j=0}^{k-1} f(T^j x) \longrightarrow \int f \, d\mu$$
holds for almost all x as k → ∞. That is, for an ergodic transformation, the time average converges to the space average almost surely.

Version: 3 Owner: bbukh Author(s): bbukh, drummond
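A numerical illustration (the map, observable, and iteration count are chosen here for the example): the irrational rotation T(x) = x + α mod 1 is ergodic for Lebesgue measure on [0, 1), so the time average of f(x) = x should approach the space average ∫₀¹ x dx = 1/2.

```python
import math

alpha = (math.sqrt(5) - 1) / 2       # irrational rotation number

def T(x):                             # ergodic rotation of the circle
    return (x + alpha) % 1.0

def time_average(f, x, k):
    """(1/k) * sum_{j=0}^{k-1} f(T^j x)."""
    total = 0.0
    for _ in range(k):
        total += f(x)
        x = T(x)
    return total / k

avg = time_average(lambda x: x, x=0.1, k=200_000)
space_avg = 0.5                       # int_0^1 x dx
```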


Chapter 516 47A53 – (Semi-) Fredholm operators; index theories
516.1 Fredholm index

Definition 47. Let P be a Fredholm operator. The index of P is defined as
$$\operatorname{index}(P) = \dim \ker(P) - \dim \operatorname{coker}(P) = \dim \ker(P) - \dim \ker(P^*).$$

Note: this is well defined, as ker(P) and ker(P*) are finite-dimensional vector spaces for P Fredholm.

Properties

• index(P*) = − index(P).

• index(P + K) = index(P) for any compact operator K.

• If P₁ : H₁ → H₂ and P₂ : H₂ → H₃ are Fredholm operators, then index(P₂P₁) = index(P₁) + index(P₂).

Version: 2 Owner: mhale Author(s): mhale

516.2

Fredholm operator

A Fredholm operator is a bounded operator that has a finite dimensional kernel and cokernel. Equivalently, it is invertible modulo compact operators. That is, if F : X → Y is a Fredholm operator between two vector spaces X and Y, then there exists a bounded operator G : Y → X such that
$$GF - 1_X \in K(X), \qquad FG - 1_Y \in K(Y), \qquad (516.2.1)$$
where K(X) denotes the space of compact operators on X. If F is Fredholm, then so is its adjoint F*.

Version: 4 Owner: mhale Author(s): mhale


Chapter 517 47A56 – Functions whose values are linear operators (operator and matrix valued functions, etc., including analytic and meromorphic ones
517.1 Taylor’s formula for matrix functions

Let p be a polynomial and suppose A and B commute, i.e. AB = BA. Then
$$p(A + B) = \sum_{k=0}^{n} \frac{1}{k!}\, p^{(k)}(A)\, B^k,$$
where n = deg(p).

Version: 4 Owner: bwebste Author(s): bwebste, Johan
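A small numerical check of the formula (the matrices and polynomial are chosen here as an example; B = 2I commutes with everything): for p(x) = x³, p(A + B) should equal Σ_{k≤3} p^(k)(A)Bᵏ/k!.

```python
import numpy as np

A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = 2.0 * np.eye(2)                 # scalar matrix: commutes with A

def mpow(M, k):
    out = np.eye(M.shape[0])
    for _ in range(k):
        out = out @ M
    return out

def p(M):                            # p(x) = x^3 evaluated on a matrix
    return mpow(M, 3)

derivs = [p,
          lambda M: 3 * mpow(M, 2),  # p'
          lambda M: 6 * M,           # p''
          lambda M: 6 * np.eye(2)]   # p'''
fact = [1, 1, 2, 6]                  # k! for k = 0..3

lhs = p(A + B)
rhs = sum(derivs[k](A) @ mpow(B, k) / fact[k] for k in range(4))
```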


Chapter 518 47A60 – Functional calculus
518.1 Beltrami identity

Let q(t) be a function R → R, let q̇ = dq/dt, and let L = L(q, q̇, t). Begin with the time-relative Euler-Lagrange condition
$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} = 0. \qquad (518.1.1)$$
If ∂L/∂t = 0, then the Euler-Lagrange condition reduces to
$$L - \dot q\, \frac{\partial L}{\partial \dot q} = C, \qquad (518.1.2)$$
which is the Beltrami identity. In the calculus of variations, the ability to use the Beltrami identity can vastly simplify problems, and as it happens, many physical problems have ∂L/∂t = 0.

In space-relative terms, with q′ := dq/dx, we have
$$\frac{\partial L}{\partial q} - \frac{d}{dx}\frac{\partial L}{\partial q'} = 0. \qquad (518.1.3)$$
If ∂L/∂x = 0, then the Euler-Lagrange condition reduces to
$$L - q'\, \frac{\partial L}{\partial q'} = C. \qquad (518.1.4)$$

To derive the Beltrami identity, note that
$$\frac{d}{dt}\left( \dot q\, \frac{\partial L}{\partial \dot q} \right) = \ddot q\, \frac{\partial L}{\partial \dot q} + \dot q\, \frac{d}{dt}\frac{\partial L}{\partial \dot q}. \qquad (518.1.5)$$
Multiplying (518.1.1) by q̇, we have
$$\dot q\, \frac{\partial L}{\partial q} - \dot q\, \frac{d}{dt}\frac{\partial L}{\partial \dot q} = 0. \qquad (518.1.6)$$
Now, rearranging (518.1.5) and substituting in for the rightmost term of (518.1.6), we obtain
$$\dot q\, \frac{\partial L}{\partial q} + \ddot q\, \frac{\partial L}{\partial \dot q} - \frac{d}{dt}\left( \dot q\, \frac{\partial L}{\partial \dot q} \right) = 0. \qquad (518.1.7)$$
Now consider the total derivative
$$\frac{d}{dt} L(q, \dot q, t) = \dot q\, \frac{\partial L}{\partial q} + \ddot q\, \frac{\partial L}{\partial \dot q} + \frac{\partial L}{\partial t}. \qquad (518.1.8)$$
If ∂L/∂t = 0, then we can substitute the left-hand side of (518.1.8) for the leading portion of (518.1.7) to get
$$\frac{d}{dt} L - \frac{d}{dt}\left( \dot q\, \frac{\partial L}{\partial \dot q} \right) = 0. \qquad (518.1.9)$$
Integrating with respect to t, we arrive at
$$L - \dot q\, \frac{\partial L}{\partial \dot q} = C, \qquad (518.1.10)$$
which is the Beltrami identity.

Version: 4 Owner: drummond Author(s): drummond

518.2

Euler-Lagrange differential equation

Let q(t) be a function R → R, let q̇ = dq/dt, and let L = L(q, q̇, t). The Euler-Lagrange differential equation (or Euler-Lagrange condition) is
$$\frac{\partial L}{\partial q} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) = 0. \qquad (518.2.1)$$
This is the central equation of the calculus of variations. In some cases, specifically for ∂L/∂t = 0, it can be replaced by the Beltrami identity.

Version: 1 Owner: drummond Author(s): drummond
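As a standard illustration (not part of the source entry): for a particle of mass m with Lagrangian L = ½mq̇² − V(q), the Euler-Lagrange condition recovers Newton's second law.

```latex
L(q, \dot q) = \tfrac{1}{2} m \dot q^{\,2} - V(q), \qquad
\frac{\partial L}{\partial q} = -V'(q), \qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot q} = m \ddot q,
```

so (518.2.1) gives m·q̈ = −V′(q).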

518.3

calculus of variations

Imagine a bead of mass m on a wire whose endpoints are at a = (0, 0) and b = (x_f, y_f), with y_f lower than the starting position. If gravity acts on the bead with force F = mg, what path (arrangement of the wire) minimizes the bead's travel time from a to b, assuming no friction? This is the famed "brachistochrone problem," and its solution was one of the first accomplishments of the calculus of variations. Many minimum problems can be solved using the techniques introduced here.

In its general form, the calculus of variations concerns quantities
$$S[q, \dot q, t] = \int_a^b L(q(t), \dot q(t), t)\, dt \qquad (518.3.1)$$
for which we wish to find a minimum or a maximum.

To make this concrete, let's consider a much simpler problem than the brachistochrone: what's the shortest distance between two points p = (x₁, y₁) and q = (x₂, y₂)? Let the variable s represent distance along the path, so that ∫_p^q ds = S. We wish to find the path such that S is a minimum. Zooming in on a small portion of the path, we can see that
$$ds^2 = dx^2 + dy^2, \qquad (518.3.2)$$
$$ds = \sqrt{dx^2 + dy^2}. \qquad (518.3.3)$$
If we parameterize the path by t, then we have
$$ds = \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\, dt. \qquad (518.3.4)$$
Let's assume y = f(x), so that we may simplify (518.3.4) to
$$ds = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\, dx = \sqrt{1 + f'(x)^2}\, dx. \qquad (518.3.5)$$
Now we have
$$S = \int_p^q L\, dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2}\, dx. \qquad (518.3.6)$$
In this case, L is particularly simple. Converting to q's and t's to make the comparison easier, we have L = L[f′(x)] = L[q̇(t)], not the more general L[q(t), q̇(t), t] covered by the calculus of variations. We'll see later how to use our L's simplicity to our advantage. For now, let's talk more generally.

We wish to find the path described by L, passing through a point q(a) at t = a and through q(b) at t = b, for which the quantity S is a minimum, i.e., for which small perturbations in the path produce no first-order change in S; such a path is called a "stationary point." This is directly analogous to the idea that for a function f(t), the minimum can be found where small perturbations δt produce no first-order change in f(t). This is where f(t + δt) ≈ f(t); taking a Taylor series expansion of f(t) at t, we find
$$f(t + \delta t) = f(t) + \delta t\, f'(t) + O(\delta t^2) = f(t), \qquad (518.3.7)$$

with f′(t) := df/dt. Of course, since the whole point is to consider δt ≠ 0, once we neglect terms O(δt²) this is just the point where f′(t) = 0. This point, call it t = t₀, could be a minimum or a maximum, so in the usual calculus of a single variable we'd proceed by taking the second derivative, f″(t₀), and seeing if it's positive or negative to see whether the function has a minimum or a maximum at t₀, respectively.

In the calculus of variations, we're not considering small perturbations in t; we're considering small perturbations in the integral of the relatively complicated function L(q, q̇, t), where q̇ = dq/dt. S is called a functional, essentially a mapping from functions to real numbers, and we can think of the minimization problem as the discovery of a minimum in S-space as we jiggle the parameters q and q̇.

For the shortest-distance problem, it's clear the maximum doesn't exist, since for any finite path length S₀ we (intuitively) can always find a curve for which the path's length is greater than S₀. This is often true, and we'll assume for this discussion that finding a stationary point means we've found a minimum.

Formally, we write the condition that small parameter perturbations produce no change in S as δS = 0. To make this precise, we simply write
$$\delta S := S[q + \delta q, \dot q + \delta \dot q, t] - S[q, \dot q, t] = \int_a^b L(q + \delta q, \dot q + \delta \dot q)\, dt - S[q, \dot q, t].$$
How are we to simplify this mess? We are considering small perturbations to the path, which suggests a Taylor series expansion of L(q + δq, q̇ + δq̇) about (q, q̇):
$$L(q + \delta q, \dot q + \delta \dot q) = L(q, \dot q) + \delta q\, \frac{\partial L}{\partial q} + \delta \dot q\, \frac{\partial L}{\partial \dot q} + O(\delta q^2) + O(\delta \dot q^2),$$
and since we make little error by discarding the higher-order terms in δq and δq̇, we have
$$\int_a^b L(q + \delta q, \dot q + \delta \dot q)\, dt = S[q, \dot q, t] + \int_a^b \left( \delta q\, \frac{\partial L}{\partial q} + \delta \dot q\, \frac{\partial L}{\partial \dot q} \right) dt.$$
Keeping in mind that δq̇ = d(δq)/dt, and noting that
$$\frac{d}{dt}\left( \delta q\, \frac{\partial L}{\partial \dot q} \right) = \delta q\, \frac{d}{dt}\frac{\partial L}{\partial \dot q} + \delta \dot q\, \frac{\partial L}{\partial \dot q}$$
(a simple application of the product rule (fg)′ = f′g + fg′), which allows us to substitute
$$\delta \dot q\, \frac{\partial L}{\partial \dot q} = \frac{d}{dt}\left( \delta q\, \frac{\partial L}{\partial \dot q} \right) - \delta q\, \frac{d}{dt}\frac{\partial L}{\partial \dot q},$$
we can rewrite the integral, shortening L(q, q̇) to L for convenience, as
$$\int_a^b \left( \delta q\, \frac{\partial L}{\partial q} + \delta \dot q\, \frac{\partial L}{\partial \dot q} \right) dt = \int_a^b \delta q \left( \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} \right) dt + \left[ \delta q\, \frac{\partial L}{\partial \dot q} \right]_a^b.$$
Substituting all of this progressively back into our original expression for δS, we obtain
$$\delta S = \int_a^b L(q + \delta q, \dot q + \delta \dot q)\, dt - S[q, \dot q, t] = \int_a^b \delta q \left( \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} \right) dt + \left[ \delta q\, \frac{\partial L}{\partial \dot q} \right]_a^b = 0.$$
Two conditions come to our aid. First, we're only interested in the neighboring paths that still begin at a and end at b, which corresponds to the condition δq = 0 at a and b; this lets us cancel the final term. Second, between those two points we're interested in the paths which do vary, for which δq ≠ 0. This leads us to the condition
$$\int_a^b \delta q \left( \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot q} \right) dt = 0. \qquad (518.3.8)$$
The fundamental theorem of the calculus of variations is that for functions f(t), g(t) with g(t) ≠ 0 for all t ∈ (a, b),
$$\int_a^b f(t)\, g(t)\, dt = 0 \implies f(t) = 0 \quad \forall t \in (a, b). \qquad (518.3.9)$$
Using this theorem, we obtain
$$\frac{\partial L}{\partial q} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q} \right) = 0. \qquad (518.3.10)$$

This condition, one of the fundamental equations of the calculus of variations, is called the Euler-Lagrange condition. When presented with a problem in the calculus of variations, the first thing one usually does is to ask why one simply doesn’t plug the problem’s L into this equation and solve. Recall our shortest-path problem, where we had arrived at S = intb L dx = intx2 a x1 1 + f (x)2 dx. (518.3.11)

Here, x takes the place of t, f takes the place of q, and (8) becomes D ∂ ∂ L− L=0 ∂f Dx ∂f 1860 (518.3.12)

Even with

∂ L ∂f

= 0, this is still ugly. However, because L−q ∂ L = C. ∂q

∂ L ∂f

= 0, we can use the Beltrami identity, (518.3.13)

(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve ∂ 1 + f (x)2 − f (x) L = C (518.3.14) ∂q which looks just as daunting, but quickly reduces to 1+f (x)2 − f (x) =C 1 + f (x)2 1 + f (x)2 − f (x)2 =C 1 + f (x)2 1 =C 1 + f (x)2 f (x) = 1 − 1 = m. C2
1 2f 2

(x)

(518.3.15) (518.3.16) (518.3.17) (518.3.18)

That is, the slope of the curve representing the shortest path between two points is a constant, which means the curve must be a straight line. Through this lengthy process, we've proved that a straight line is the shortest distance between two points. To find the actual function $f(x)$ given endpoints $(x_1, y_1)$ and $(x_2, y_2)$, simply integrate with respect to $x$:

$$f(x) = \int f'(x) \, dx = \int m \, dx = mx + d \qquad (518.3.19)$$

and then apply the boundary conditions

$$f(x_1) = y_1 = mx_1 + d \qquad (518.3.20)$$
$$f(x_2) = y_2 = mx_2 + d \qquad (518.3.21)$$

Subtracting the first condition from the second, we get $m = \frac{y_2 - y_1}{x_2 - x_1}$, the standard equation for the slope of a line. Solving for $d = y_1 - mx_1$, we get

$$f(x) = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1) + y_1 \qquad (518.3.22)$$

which is the basic equation for a line passing through (x1 , y1 ) and (x2 , y2 ). The solution to the brachistochrone problem, while slightly more complicated, follows along exactly the same lines. Version: 6 Owner: drummond Author(s): drummond
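As a numerical sanity check on this conclusion, we can compare the discretized arclength of the straight line against that of a perturbed path with the same endpoints (a sketch; the endpoints, grid size and perturbation are arbitrary choices, not part of the original entry):

```python
import math

def arclength(f, x1, x2, n=2000):
    """Approximate the arclength of y = f(x) on [x1, x2] by a polygonal sum."""
    xs = [x1 + (x2 - x1) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i + 1] - xs[i], f(xs[i + 1]) - f(xs[i]))
               for i in range(n))

x1, y1, x2, y2 = 0.0, 0.0, 3.0, 4.0
m = (y2 - y1) / (x2 - x1)              # slope from (518.3.20)-(518.3.21)
line = lambda x: m * (x - x1) + y1     # the extremal found above

# any perturbation vanishing at the endpoints (delta q = 0 at a and b)
bump = lambda x: line(x) + 0.5 * math.sin(math.pi * (x - x1) / (x2 - x1))

print(arclength(line, x1, x2))  # close to the straight-line distance 5
print(arclength(bump, x1, x2))  # strictly longer
```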


Chapter 519 47B15 – Hermitian and normal operators (spectral measures, functional calculus, etc.)
519.1 self-adjoint operator

A linear operator A : D(A) ⊂ H → H in a Hilbert space H is an Hermitian operator if (Ax, y) = (x, Ay) for all x, y ∈ D(A). If A is densely defined, it is said to be a symmetric operator if it is the restriction of its adjoint A∗ to D(A), i.e. if A ⊂ A∗ ; and it is called a self-adjoint operator if A = A∗ . Version: 2 Owner: Koro Author(s): Koro
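In the finite-dimensional case $H = \mathbb{C}^n$ with the standard inner product, the condition $(Ax, y) = (x, Ay)$ holds exactly when the matrix of $A$ equals its conjugate transpose. A small sketch with plain Python complex arithmetic (the matrix and vectors are arbitrary examples):

```python
def inner(x, y):
    # standard inner product on C^n, conjugate-linear in the second slot
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

# A = A^*: real diagonal, conjugate-symmetric off-diagonal entries
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

x = [1 + 2j, -1 + 0j]
y = [0 + 1j, 2 - 1j]

lhs = inner(apply(A, x), y)   # (Ax, y)
rhs = inner(x, apply(A, y))   # (x, Ay)
print(abs(lhs - rhs))         # essentially 0, since A is Hermitian
```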


Chapter 520 47G30 – Pseudodifferential operators
520.1 Dini derivative

The upper Dini derivative of a continuous function $f$, denoted $D^+ f$ or $f'_+$, is defined as

$$D^+ f(t) = f'_+(t) = \limsup_{h \to 0^+} \frac{f(t + h) - f(t)}{h}.$$

The lower Dini derivative, $D^- f$ or $f'_-$, is defined as

$$D^- f(t) = f'_-(t) = \liminf_{h \to 0^+} \frac{f(t + h) - f(t)}{h}.$$

If $f$ is defined on a vector space, then the upper Dini derivative at $t$ in the direction $d$ is denoted

$$f'_+(t, d) = \limsup_{h \to 0^+} \frac{f(t + hd) - f(t)}{h}.$$

If $f$ is locally Lipschitz then $D^+ f$ is finite. If $f$ is differentiable at $t$ then the Dini derivative at $t$ is the derivative at $t$.

Version: 5 Owner: lha Author(s): lha
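For example, for $f(t) = t \sin(1/t)$ (with $f(0) = 0$), which is continuous but not differentiable at $0$, the upper and lower Dini derivatives at $t = 0$ are $+1$ and $-1$, since the difference quotient is $\sin(1/h)$. A rough numerical sketch (the sample grid is an arbitrary choice and only approximates the limsup/liminf):

```python
import math

def f(t):
    # f(t) = t sin(1/t), extended continuously by f(0) = 0
    return t * math.sin(1.0 / t) if t != 0 else 0.0

# difference quotients (f(0 + h) - f(0)) / h = sin(1/h) for small h > 0
quotients = [f(1.0 / n) * n for n in range(100, 20000)]

upper = max(quotients)   # approximates D^+ f(0) = limsup = +1
lower = min(quotients)   # approximates D^- f(0) = liminf = -1
print(upper, lower)
```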


Chapter 521 47H10 – Fixed-point theorems
521.1 Brouwer fixed point in one dimension

Theorem 1 [1] Suppose $f$ is a continuous function $f: [-1, 1] \to [-1, 1]$. Then $f$ has a fixed point, i.e., there is an $x$ such that $f(x) = x$.

Proof (Following [1]) We can assume that $f(-1) > -1$ and $f(+1) < 1$, since otherwise there is nothing to prove. Then, consider the function $g: [-1, 1] \to \mathbb{R}$ defined by $g(x) = f(x) - x$. It satisfies $g(-1) > 0$, $g(+1) < 0$, so by the intermediate value theorem, there is a point $x$ such that $g(x) = 0$, i.e., $f(x) = x$. □

Assuming slightly more of the function $f$ yields the Banach fixed point theorem. In one dimension it states the following:

Theorem 2 Suppose $f: [-1, 1] \to [-1, 1]$ is a function that satisfies the following condition: for some constant $C \in [0, 1)$, we have for each $a, b \in [-1, 1]$,
$$|f(b) - f(a)| \leq C|b - a|.$$
Then $f$ has a unique fixed point in $[-1, 1]$. In other words, there exists one and only one point $x \in [-1, 1]$ such that $f(x) = x$.

Remarks The fixed point in Theorem 2 can be found by iteration from any $s \in [-1, 1]$ as follows: first choose some $s \in [-1, 1]$. Then form $s_1 = f(s)$, then $s_2 = f(s_1)$, and generally $s_n = f(s_{n-1})$. As $n \to \infty$, $s_n$ approaches the fixed point for $f$. More details are given on the

entry for the Banach fixed point theorem. A function that satisfies the condition in Theorem 2 is called a contraction mapping. Such mappings also satisfy the Lipschitz condition.
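The iteration described in the remarks is easy to carry out. For instance, $f(x) = \cos x$ maps $[-1, 1]$ into itself and is a contraction there with $C = \sin 1 < 1$, so the iteration converges to its unique fixed point (a sketch; the starting point and iteration count are arbitrary):

```python
import math

def f(x):
    return math.cos(x)   # maps [-1,1] into itself; |f'(x)| <= sin(1) < 1

s = 1.0                  # any starting point in [-1, 1]
for _ in range(100):     # s_n = f(s_{n-1})
    s = f(s)

print(s)                 # the unique fixed point, where cos(s) = s
print(abs(f(s) - s))     # residual, essentially 0
```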

REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978.

Version: 5 Owner: mathcam Author(s): matte

521.2 Brouwer fixed point theorem

Theorem Let $B = \{x \in \mathbb{R}^n : \|x\| \leq 1\}$ be the closed unit ball in $\mathbb{R}^n$. Any continuous function $f: B \to B$ has a fixed point.

Notes
Shape is not important The theorem also applies to anything homeomorphic to a closed disk, of course. In particular, we can replace $B$ in the formulation with a square or a triangle.

Compactness counts (a) The theorem is not true if we drop a point from the interior of $B$. For example, the map $f(x) = \frac{1}{2}x$ has its single fixed point at $0$; dropping it from the domain yields a map with no fixed points.

Compactness counts (b) The theorem is not true for an open disk. For instance, the map $f(x) = \frac{1}{2}x + (\frac{1}{2}, 0, \ldots, 0)$ has its single fixed point on the boundary of $B$.

Version: 3 Owner: matte Author(s): matte, ariels

521.3 any topological space with the fixed point property is connected

Theorem Any topological space with the fixed point property is connected [3, 2]. Proof. Suppose X is a topological space with the fixed point property. We will show that X is connected by contradiction: suppose there are non-empty disjoint open sets A, B ⊂ X


such that $X = A \cup B$. Then there are elements $a \in A$ and $b \in B$, and we can define a function $f: X \to X$ by
$$f(x) = \begin{cases} a, & \text{when } x \in B, \\ b, & \text{when } x \in A. \end{cases}$$
Since $A \cap B = \emptyset$ and $A \cup B = X$, the function $f$ is well defined. Also, since $f(x)$ and $x$ always lie in different members of the disjoint pair $A$, $B$, $f$ can have no fixed point. To obtain a contradiction, we only need to show that $f$ is continuous. However, if $V$ is an open set in $X$, a short calculation shows that $f^{-1}(V)$ is either $\emptyset$, $A$, $B$ or $X$, which are all open sets. Thus $f$ is continuous, and $X$ must be connected. □

REFERENCES
1. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974. 2. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.

Version: 5 Owner: matte Author(s): matte

521.4 fixed point property

Definition [2, 3, 2] Suppose X is a topological space. If every continuous function f : X → X has a fixed point, then X has the fixed point property.

Example 1. Brouwer’s fixed point theorem states that in Rn , the closed unit ball with the subspace topology has the fixed point property.

Properties 1. The fixed point property is preserved under a homeomorphism. In other words, suppose f : X → Y is a homeomorphism between topological spaces X and Y . If X has the fixed point property, then Y has the fixed point property [2]. 2. any topological space with the fixed point property is connected [3, 2]. 3. Suppose X is a topological space with the fixed point property, and Y is a retract of X. Then Y has the fixed point property [3].


REFERENCES
1. G.L. Naber, Topological methods in Euclidean spaces, Cambridge University Press, 1980. 2. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974. 3. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.

Version: 5 Owner: matte Author(s): matte

521.5 proof of Brouwer fixed point theorem

Proof of the Brouwer fixed point theorem: Assume that there does exist a map $f: B^n \to B^n$ with no fixed point. Then let $g(x)$ be the following map: start at $f(x)$, draw the ray going through $x$, and let $g(x)$ be the first intersection of that ray with the sphere. This map is continuous and well defined only because $f$ fixes no point. Also, it is not hard to see that it must be the identity on the boundary sphere. Thus we have a map $g: B^n \to S^{n-1}$ which is the identity on $S^{n-1} = \partial B^n$, that is, a retraction. Now, if $i: S^{n-1} \to B^n$ is the inclusion map, $g \circ i = \mathrm{id}_{S^{n-1}}$. Applying the reduced homology functor, we find that $g_* \circ i_* = \mathrm{id}_{\tilde{H}_{n-1}(S^{n-1})}$, where $*$ indicates the induced map on homology.

But it is a well-known fact that $\tilde{H}_{n-1}(B^n) = 0$ (since $B^n$ is contractible), and that $\tilde{H}_{n-1}(S^{n-1}) = \mathbb{Z}$. Thus we have an isomorphism of a non-zero group onto itself factoring through a trivial group, which is clearly impossible. Thus we have a contradiction, and no such map $f$ exists.

Version: 3 Owner: bwebste Author(s): bwebste


Chapter 522 47L07 – Convex sets and cones of operators
522.1 convex hull of S is open if S is open

Theorem If S is an open set in a topological vector space, then the convex hull co(S) is open [1]. As the next example shows, the corresponding result does not hold for a closed set. Example ([1], pp. 14) If S = {(x, 1/|x|) ∈ R2 | x ∈ R \ {0}}, then S is closed, but co(S) is the open half-space {(x, y) | x ∈ R, y ∈ (0, ∞)}, which is open. P

REFERENCES
1. F.A. Valentine, Convex sets, McGraw-Hill book company, 1964.

Version: 3 Owner: drini Author(s): matte


Chapter 523 47L25 – Operator spaces (= matricially normed spaces)
523.1 operator norm

Let $A: V \to W$ be a linear map between normed vector spaces $V$ and $W$. We can define $\|\cdot\|_{op}$ by

$$\|A\|_{op} := \sup_{\substack{v \in V \\ v \neq 0}} \frac{\|Av\|}{\|v\|}.$$

Equivalently, the above definition can be written as

$$\|A\|_{op} := \sup_{\substack{v \in V \\ \|v\| = 1}} \|Av\| = \sup_{\substack{v \in V \\ 0 < \|v\| \leq 1}} \|Av\|.$$

It turns out that $\|\cdot\|_{op}$ satisfies all the properties of a norm and hence is called the operator norm (or the induced norm) of $A$. If $\|A\|_{op}$ exists and is finite, we say that $A$ is a bounded linear map. The space $L(V, W)$ of bounded linear maps from $V$ to $W$ also forms a vector space with $\|\cdot\|_{op}$ as the natural norm.

523.1.1 Example

Suppose that $V = (\mathbb{R}^n, \|\cdot\|_p)$ and $W = (\mathbb{R}^n, \|\cdot\|_p)$, where $\|\cdot\|_p$ is the vector p-norm. Then the operator norm $\|\cdot\|_{op} = \|\cdot\|_p$ is the matrix p-norm.

Version: 3 Owner: igor Author(s): igor
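For $p = 1$ the induced matrix norm has a closed form: the maximum absolute column sum. A sketch comparing that closed form with a brute-force sampling of $\|Av\|_1 / \|v\|_1$ (the matrix and grid are arbitrary choices):

```python
from itertools import product

A = [[1.0, -2.0],
     [3.0,  0.5]]

def matvec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def norm1(v):
    return sum(abs(x) for x in v)

# closed form: the induced 1-norm equals the maximum absolute column sum
op_norm_1 = max(sum(abs(row[j]) for row in A) for j in range(2))

# brute force: sup of ||Av||_1 / ||v||_1 over a grid of nonzero vectors v
best = max(norm1(matvec(A, v)) / norm1(v)
           for v in product([i / 10.0 for i in range(-10, 11)], repeat=2)
           if norm1(v) > 0)
print(op_norm_1, best)   # both print 4.0 for this A (up to rounding)
```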

Chapter 524 47S99 – Miscellaneous
524.1 Drazin inverse

A Drazin inverse of an operator $T$ is an operator, $S$, such that
$$TS = ST, \qquad S^2 T = S, \qquad T^{m+1} S = T^m$$
for all sufficiently large integers $m \geq 0$. For example, a projection operator $P$ is its own Drazin inverse: with $T = S = P$ we have $TS = ST = P$, $S^2 T = P^3 = P = S$, and $T^{m+1} S = P = T^m$ for any integer $m \geq 1$.

Version: 2 Owner: lha Author(s): lha
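The projection example can be checked with a concrete 2×2 projection matrix, verifying the defining identities with $T = S = P$ (a sketch):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# P projects onto the x-axis, so P^2 = P
P = [[1.0, 0.0],
     [0.0, 0.0]]

T = S = P
TS, ST = matmul(T, S), matmul(S, T)
S2T = matmul(matmul(S, S), T)
T2S = matmul(matmul(T, T), S)      # T^{m+1} S with m = 1

print(TS == ST, S2T == S, T2S == matmul(T, T))
```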


Chapter 525 49K10 – Free problems in two or more independent variables
525.1 Kantorovitch’s theorem

Let $a_0$ be a point in $\mathbb{R}^n$, $U$ an open neighborhood of $a_0$ in $\mathbb{R}^n$ and $f: U \to \mathbb{R}^n$ a differentiable mapping, with its derivative $[Df(a_0)]$ invertible. Define

$$h_0 = -[Df(a_0)]^{-1} f(a_0), \qquad a_1 = a_0 + h_0, \qquad U_0 = \{x \mid |x - a_1| \leq |h_0|\}.$$

If $U_0 \subset U$ and the derivative $[Df(x)]$ satisfies the Lipschitz condition

$$|[Df(u_1)] - [Df(u_2)]| \leq M|u_1 - u_2|$$

for all points $u_1, u_2 \in U_0$, and if the inequality

$$|f(a_0)| \left|[Df(a_0)]^{-1}\right|^2 M \leq \frac{1}{2}$$

is satisfied, the equation $f(x) = 0$ has a unique solution in $U_0$, and Newton's method with initial guess $a_0$ converges to it.

If we replace $\leq$ with $<$, then it can be shown that Newton's method superconverges! If you want an even stronger version, one can replace $|\cdot|$ with the norm $\|\cdot\|$.

Logic behind the theorem: Let's look at the useful part of the theorem:

$$|f(a_0)| \left|[Df(a_0)]^{-1}\right|^2 M \leq \frac{1}{2}$$

It is a product of three distinct properties of your function such that the product is less than or equal to a certain number, or bound. If we call the product $R$, then it says that $a_0$ must be within a ball of radius $R$. It also says that the solution $x$ is within this same ball. How was this ball defined?

The first term, $|f(a_0)|$, is a measure of how far the function is from the domain; in the cartesian plane, it would be how far the function is from the $x$-axis. Of course, if we're solving for $f(x) = 0$, we want this value to be small, because it means we're closer to the axis. However a function can be annoyingly close to the axis, and yet just happily curve away from the axis. Thus we need more.

The second term, $|[Df(a_0)]^{-1}|^2$, is a little more difficult. This is obviously a measure of how fast the function is changing with respect to the domain (the $x$-axis in the plane). The larger the derivative, the faster it's approaching wherever it's going (hopefully the axis). Thus, we take the inverse of it, since we want this product to be less than a number. The reason it is squared is to conserve units with the numerator, which is a product of two terms of like units.

Combined with the first term, this also seems to be enough, but what if the derivative changes sharply, but it changes the wrong way? The third term is the Lipschitz ratio $M$. This measures sharp changes in the first derivative, so we can be sure that if this is small, the function won't try to curve away from our goal too sharply.

By the way, the number $\frac{1}{2}$ is unitless, so all the units on the left side cancel. Checking units is essential in applications, such as physics and engineering, where Newton's method is used.

Version: 18 Owner: slider142 Author(s): slider142
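As a one-dimensional illustration (an example of ours, not part of the original entry): for $f(x) = x^2 - 2$ with $a_0 = 1.5$, the derivative $f'(x) = 2x$ is Lipschitz with $M = 2$, and the product in the theorem comes out well under $\frac{1}{2}$, so convergence of Newton's method from $a_0$ is guaranteed:

```python
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x       # Lipschitz with constant M = 2

a0 = 1.5
M = 2.0
product = abs(f(a0)) * abs(1.0 / df(a0)) ** 2 * M
print(product)               # about 0.0556 <= 1/2, so convergence is guaranteed

# and indeed Newton's method converges to sqrt(2)
a = a0
for _ in range(6):
    a = a - f(a) / df(a)
print(a)
```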


Chapter 526 49M15 – Methods of Newton-Raphson, Galerkin and Ritz types
526.1 Newton’s method

Let $f$ be a differentiable function from $\mathbb{R}^n$ to $\mathbb{R}^n$. Newton's method consists of starting at an $a_0$ for the equation $f(x) = 0$. Then the function is linearized at $a_0$ by replacing the increment $f(x) - f(a_0)$ by a linear function of the increment, $[Df(a_0)](x - a_0)$.

Now we can solve the linear equation $f(a_0) + [Df(a_0)](x - a_0) = 0$. Since this is a system of $n$ linear equations in $n$ unknowns, $[Df(a_0)](x - a_0) = -f(a_0)$ can be likened to the general linear system $Ax = b$.

Therefore, if $[Df(a_0)]$ is invertible, then $x = a_0 - [Df(a_0)]^{-1} f(a_0)$. By renaming $x$ to $a_1$, you can reiterate Newton's method to get an $a_2$. Thus, Newton's method states

$$a_{n+1} = a_n - [Df(a_n)]^{-1} f(a_n).$$

Thus we get a series of $a$'s that hopefully will converge to $x$ with $f(x) = 0$. When we solve an equation of the form $f(x) = 0$, we call the solution a root of the equation. Thus, Newton's method is used to find roots of nonlinear equations.

Unfortunately, Newton's method does not always converge. There are tests for neighborhoods of $a_0$'s where Newton's method will converge, however. One such test is Kantorovitch's theorem, which combines what is needed into a concise mathematical equation.

Corollary 1: Newton's Method in one dimension - The above equation is simplified in one dimension to the well-used

$$a_1 = a_0 - \frac{f(a_0)}{f'(a_0)}$$

This intuitively cute equation is pretty much the equation of first year calculus. :)

Corollary 2: Finding a square root - So now that you know the equation, you need to know how to use it, as it is an algorithm. The construction of the primary equation, of course, is the important part. Let's see how you do it if you want to find a square root of a number $b$. We want to find a number $x$ ($x$ for unknown), such that $x^2 = b$. You might think "why not find a number such that $x = \sqrt{b}$?" Well, the problem with that approach is that we don't have a value for $\sqrt{b}$, so we'd be right back where we started. However, squaring both sides of the equation to get $x^2 = b$ lets us work with the number we do know, $b$. Back to $x^2 = b$. With some manipulation, we see this means that $x^2 - b = 0$! Thus we have our $f(x) = 0$ scenario. We can see that $f'(x) = 2x$, thus $f'(a_0) = 2a_0$ and $f(a_0) = a_0^2 - b$. Now we have all we need to carry out Newton's method. By renaming $x$ to be $a_1$, we have

$$a_1 = a_0 - \frac{1}{2a_0}\left(a_0^2 - b\right) = \frac{1}{2}\left(a_0 + \frac{b}{a_0}\right).$$

The equation on the far right is also known as the divide and average method, for those who have not learned the full Newton's method and just want a fast way to find square roots. Let's see how this works out to find the square root of a number like 2. Let $x^2 = 2$, so that $x^2 - 2 = 0 = f(x)$. Thus, by Newton's method,

$$a_1 = a_0 - \frac{a_0^2 - 2}{2a_0}.$$

All we did was plug in the expressions $f(a_0)$ and $f'(a_0)$ where Newton's method asks for them. Now we have to pick an $a_0$. Hmm, since

$$\sqrt{1} < \sqrt{2} < \sqrt{4}, \qquad 1 < \sqrt{2} < 2,$$

let's pick a reasonable number between 1 and 2, like 1.5:

$$a_1 = 1.5 - \frac{1.5^2 - 2}{2(1.5)} = 1.41\overline{6}$$

Looks like our guess was too high. Let's see what the next iteration says:

$$a_2 = 1.41\overline{6} - \frac{1.41\overline{6}^2 - 2}{2(1.41\overline{6})} = 1.414215686\ldots$$

getting better =) You can use your calculator to find that $\sqrt{2} = 1.414213562\ldots$
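The iterations above can be reproduced in a few lines of code (a sketch of the divide and average form of the iteration):

```python
def newton_sqrt(b, a0, iterations):
    """Newton's method for f(x) = x^2 - b: a_{n+1} = (a_n + b/a_n) / 2."""
    a = a0
    for _ in range(iterations):
        a = 0.5 * (a + b / a)   # divide and average
    return a

print(newton_sqrt(2, 1.5, 1))   # 1.4166...
print(newton_sqrt(2, 1.5, 2))   # 1.4142156...
print(newton_sqrt(2, 1.5, 6))   # indistinguishable from sqrt(2) in floats
```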

Not bad for only two iterations! Of course, the more you iterate, the more decimal places your $a_n$ will be accurate to. By the way, this is also how your calculator/computer finds square roots!

Geometric interpretation: Consider an arbitrary function $f: \mathbb{R} \to \mathbb{R}$ such as $f(x) = x^2 - b$. Say you wanted to find a root of this function. You know that in the neighborhood of $x = a_0$, there is a root (maybe you used Kantorovitch's theorem, or tested and saw that the function changed signs in this neighborhood). We want to use our knowledge of $a_0$ to find an $a_1$ that is a better approximation to $x_0$ (in this case, closer to it on the $x$-axis). So we know that $x_0 \leq a_1 \leq a_0$, or in another case $a_0 \leq a_1 \leq x_0$. What is an efficient algorithm to bridge the gap between $a_0$ and $x_0$?

Let's look at a tangent line to the graph. Note that the line intercepts the $x$-axis between $a_0$ and $x_0$, which is exactly what we want. The slope of this tangent line is $f'(a_0)$ by definition of the derivative at $a_0$, and we know one point on the line is $(a_1, 0)$, since that is the $x$-intercept. That is all we need to find the formula of the line and solve for $a_1$.


$$\begin{aligned}
y - y_1 &= m(x - x_1) && \text{Substituting:} \\
f(a_0) - 0 &= f'(a_0)(a_0 - a_1) \\
\frac{f(a_0)}{f'(a_0)} &= a_0 - a_1 \\
-a_1 &= -a_0 + \frac{f(a_0)}{f'(a_0)} && \text{Aesthetic change.} \\
a_1 &= a_0 - \frac{f(a_0)}{f'(a_0)} && \text{Flipped the equation around. Newton's method!}
\end{aligned}$$

Version: 17 Owner: slider142 Author(s): slider142


Chapter 527 51-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
527.1 Apollonius theorem

Let $a, b, c$ be the sides of a triangle and $m$ the length of the median to the side with length $a$. Then $b^2 + c^2 = 2m^2 + \frac{a^2}{2}$.

Version: 2 Owner: drini Author(s): drini

527.2 Apollonius' circle

Apollonius' circle. The locus of a point moving so that the ratio of its distances from two fixed points is fixed is a circle. If two circles $C_1$ and $C_2$ are fixed with radii $r_1$ and $r_2$, then the circle of Apollonius of the two centers with ratio $r_1/r_2$ is the circle whose diameter is the segment that joins the two homothety centers of the circles. Version: 1 Owner: drini Author(s): drini


527.3 Brahmagupta's formula

If a cyclic quadrilateral has sides $p, q, r, s$ then its area is given by
$$\sqrt{(T - p)(T - q)(T - r)(T - s)}$$
where $T = \frac{p + q + r + s}{2}$.

Note that if s → 0, Heron’s formula is recovered.

Version: 3 Owner: drini Author(s): drini

527.4 Brianchon theorem

If a hexagon $ABCDEF$ (not necessarily convex) is circumscribed about a conic (in particular about a circle), then the three diagonals $AD$, $BE$, $CF$ are concurrent. This theorem is the dual of Pascal's line theorem. (C. Brianchon, 1806)

Version: 2 Owner: vladm Author(s): vladm

527.5 Brocard theorem

Theorem: Let $ABC$ be a triangle. Let $A', B', C'$ be three points such that $A' \in (BC)$, $B' \in (AC)$, $C' \in (AB)$. Then the circumscribed circles of the triangles $AB'C'$, $BC'A'$, $CA'B'$ have a point in common. This point is called the Brocard point.

Proof: Let $M$ be the point in which the circumscribed circles of the triangles $AB'C'$ and $BC'A'$ meet. Because the quadrilateral $AB'MC'$ is cyclic, the angles $\angle AB'M$ and $\angle MC'B$ are congruent. Analogously, because the quadrilateral $BA'MC'$ is cyclic, the angles $\angle MC'B$ and $\angle MA'C$ are congruent. So $\angle AB'M$ and $\angle MA'C$ are congruent, and $MA'CB'$ is cyclic.

Version: 2 Owner: vladm Author(s): vladm

527.6 Carnot circles

If $ABC$ is a triangle, and $H$ is the orthocenter, then we have three circles such that every circle contains two vertices of the triangle and the orthocenter. The three circles are called the Carnot circles.

Version: 2 Owner: vladm Author(s): vladm

527.7 Erdős-Anning Theorem

If an infinite number of points in a plane are all separated by integer distances, then all the points lie on a straight line. Version: 1 Owner: giri Author(s): giri

527.8 Euler Line

In any triangle, the orthocenter $H$, the centroid $G$ and the circumcenter $O$ are collinear, and $OG/GH = 1/2$. The line passing through these points is known as the Euler line of the triangle.

This line also passes through the center $N$ of the nine-point circle (or Feuerbach circle), and $N$ is the midpoint of $OH$.

Version: 9 Owner: drini Author(s): drini
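The statement is easy to verify numerically: compute the three centers from coordinates and check collinearity and the ratio (a sketch; the triangle is an arbitrary choice):

```python
import math

# an arbitrary scalene triangle
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)

G = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)  # centroid

def circumcenter(A, B, C):
    # intersection of the perpendicular bisectors
    ax, ay = A; bx, by = B; cx, cy = C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)

def orthocenter(A, B, C):
    # intersection of two altitudes: (P-A).(C-B) = 0 and (P-B).(C-A) = 0
    a1, b1 = C[0] - B[0], C[1] - B[1]
    c1 = a1 * A[0] + b1 * A[1]
    a2, b2 = C[0] - A[0], C[1] - A[1]
    c2 = a2 * B[0] + b2 * B[1]
    d = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / d, (a1 * c2 - a2 * c1) / d)

O, H = circumcenter(A, B, C), orthocenter(A, B, C)
cross = (G[0] - O[0]) * (H[1] - O[1]) - (G[1] - O[1]) * (H[0] - O[0])
OG = math.hypot(G[0] - O[0], G[1] - O[1])
GH = math.hypot(H[0] - G[0], H[1] - G[1])
print(cross, OG / GH)   # collinear (cross ~ 0) and OG/GH = 1/2
```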

527.9 Gergonne point

Let ABC be a triangle and D, E, F where the incircle touches the sides BC, CA, AB respectively. Then the lines AD, BE, CF are concurrent, and the common point is called the Gergonne point of the triangle. Version: 3 Owner: drini Author(s): drini 1879

527.10 Gergonne triangle

Let ABC be a triangle and D, E, F where the incircle touches the sides BC, CA, AB respectively. Then triangle DEF is called the Gergonne triangle or contact triangle of ABC. Version: 2 Owner: drini Author(s): drini

527.11 Heron's formula

The area of a triangle with side lengths $a, b, c$ is
$$A = \sqrt{s(s - a)(s - b)(s - c)}$$
where $s = \frac{a + b + c}{2}$ (the semiperimeter).

Version: 2 Owner: drini Author(s): drini
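For example, the 3-4-5 right triangle has area $\frac{1}{2} \cdot 3 \cdot 4 = 6$, which Heron's formula recovers (a sketch):

```python
import math

def heron(a, b, c):
    s = (a + b + c) / 2          # the semiperimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

print(heron(3, 4, 5))  # 6.0, matching (1/2) * base * height for legs 3 and 4
```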

527.12 Lemoine circle

If through the Lemoine point of a triangle parallels to the sides are drawn, the six points where these parallels intersect the sides all lie on one circle. This circle is called the Lemoine circle of the triangle. Version: 1 Owner: drini Author(s): drini

527.13 Lemoine point

The Lemoine point of a triangle is the intersection point of its three symmedians (that is, the isogonal conjugate of the centroid). It is related to the Gergonne point by the following result: on any triangle $ABC$, the Lemoine point of its Gergonne triangle is the Gergonne point of $ABC$. In the picture, the blue lines are the medians, intersecting at the centroid $G$. The green lines are angle bisectors intersecting at the incentre $I$, and the red lines are symmedians. The symmedians intersect at the Lemoine point $L$. Version: 5 Owner: drini Author(s): drini

527.14 Miquel point

Let $AECF$ be a complete quadrilateral. Then the four circles circumscribed to the four triangles $AED$, $AFB$, $BEC$, $CDF$ are concurrent in a point $M$. This point is called the Miquel point. The Miquel point is also the focus of the parabola inscribed in $AECF$.

Version: 2 Owner: vladm Author(s): vladm

527.15 Mollweide's equations

In a triangle having the sides $a$, $b$ and $c$ opposite to the angles $\alpha$, $\beta$ and $\gamma$ respectively, the following equations hold:
$$(a + b)\sin\frac{\gamma}{2} = c\cos\frac{\alpha - \beta}{2}$$
and
$$(a - b)\cos\frac{\gamma}{2} = c\sin\frac{\alpha - \beta}{2}.$$

Version: 2 Owner: mathwizard Author(s): mathwizard

527.16 Morley's theorem

Morley's theorem. The points of intersection of the adjacent angle trisectors in any triangle are the vertices of an equilateral triangle.

Version: 3 Owner: drini Author(s): drini


527.17 Newton's line

Let $ABCD$ be a circumscribed quadrilateral. The midpoints $M$, $N$ of the two diagonals and the center $I$ of the inscribed circle are collinear. This line is called Newton's line.

Version: 1 Owner: vladm Author(s): vladm

527.18 Newton-Gauss line

Let $AECF$ be a complete quadrilateral, and $AC$, $BD$, $EF$ its diagonals. Let $P$ be the midpoint of $AC$, $Q$ the midpoint of $BD$, and $R$ the midpoint of $EF$. Then $P$, $Q$, $R$ lie on a common line, called the Newton-Gauss line.

Version: 1 Owner: vladm Author(s): vladm

527.19 Pascal's mystic hexagram

If a hexagon $ADBFCE$ (not necessarily convex) is inscribed in a conic (in particular in a circle), then the points of intersection of opposite sides ($AD$ with $FC$, $DB$ with $CE$, and $BF$ with $EA$) are collinear. This line is called the Pascal line of the hexagon. A very special case happens when the conic degenerates into two lines; the theorem still holds, although this particular case is usually called Pappus' theorem.

Version: 5 Owner: drini Author(s): drini

527.20 Ptolemy's theorem

If ABCD is a cyclic quadrilateral, then the product of the two diagonals is equal to the sum of the products of opposite sides.

$$AC \cdot BD = AB \cdot CD + AD \cdot BC.$$

When the quadrilateral is not cyclic, we have instead the inequality

$$AB \cdot CD + BC \cdot AD > AC \cdot BD.$$

Version: 5 Owner: drini Author(s): drini
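The equality can be verified numerically for four points taken in order on a circle (a sketch; the angles are arbitrary choices):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# four points in order on the unit circle form a cyclic quadrilateral ABCD
angles = [0.3, 1.1, 2.5, 4.2]
A, B, C, D = [(math.cos(t), math.sin(t)) for t in angles]

lhs = dist(A, C) * dist(B, D)                            # product of diagonals
rhs = dist(A, B) * dist(C, D) + dist(A, D) * dist(B, C)  # opposite sides
print(abs(lhs - rhs))   # essentially 0: Ptolemy's equality holds
```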

527.21 Pythagorean theorem

Pythagorean theorem states: If $ABC$ is a right triangle, then the square of the length of the hypotenuse is equal to the sum of the squares of the two legs. In the following picture, the purple squares add up to the same area as the orange one.

$$AC^2 = AB^2 + BC^2.$$

The law of cosines is a generalization of the Pythagorean theorem to arbitrary triangles. Version: 12 Owner: drini Author(s): drini

527.22 Schooten theorem

Theorem: Let $ABC$ be an equilateral triangle. If $M$ is a point on the circumscribed circle then the equality $AM = BM + CM$ holds.

Proof: Let $B' \in (MA)$ be such that $MB' = MB$. Because $\angle BMA = \angle BCA = 60°$, the triangle $MBB'$ is equilateral, so $BB' = MB = MB'$. Because $AB = BC$, $BB' = BM$ and $\angle ABB' = \angle CBM$, the triangles $ABB'$ and $CBM$ are congruent. Since $MC = AB'$ we have that $AM = AB' + B'M = MC + MB$.


Version: 1 Owner: vladm Author(s): vladm

527.23 Simson's line

Let $ABC$ be a triangle and $P$ a point on its circumcircle (other than $A$, $B$, $C$). Then the feet of the perpendiculars drawn from $P$ to the sides $AB$, $BC$, $CA$ (or their prolongations) are collinear.

An interesting result from the realm of analytic geometry states that the envelope formed by Simson's lines as $P$ varies is a hypocycloid with three cusps (a deltoid). Version: 9 Owner: drini Author(s): drini

527.24 Stewart's theorem

Let a triangle $ABC$ be given with $AB = c$, $BC = a$, $CA = b$, and a point $X$ on $BC$ such that $BX = m$ and $XC = n$. Denote by $p$ the length of $AX$. Then $a(p^2 + mn) = b^2 m + c^2 n$. Version: 3 Owner: drini Author(s): drini
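A numerical check of Stewart's theorem on an arbitrary triangle and cevian (a sketch; the coordinates are arbitrary choices):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# arbitrary triangle, with X on BC
A, B, C = (0.0, 3.0), (-1.0, 0.0), (4.0, 0.0)
t = 0.3                                  # X divides BC with BX = t * BC
X = (B[0] + t * (C[0] - B[0]), B[1] + t * (C[1] - B[1]))

a, b, c = dist(B, C), dist(C, A), dist(A, B)
m, n, p = dist(B, X), dist(X, C), dist(A, X)

residual = abs(a * (p * p + m * n) - (b * b * m + c * c * n))
print(residual)   # essentially 0
```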

527.25 Thales' theorem

Let A and B be two points and C a point on the semicircle above them. Then the angle ACB is 90◦ .

Version: 3 Owner: mathwizard Author(s): mathwizard


527.26 alternate proof of parallelogram law

Proof of this is simple, given the cosine law:
$$c^2 = a^2 + b^2 - 2ab\cos\phi$$
where $a$, $b$, and $c$ are the lengths of the sides of the triangle, and angle $\phi$ is the corner angle opposite the side of length $c$. Let us define the largest interior angle as angle $\theta$. Applying this to the parallelogram, we find that
$$d_1^2 = u^2 + v^2 - 2uv\cos\theta$$
$$d_2^2 = u^2 + v^2 - 2uv\cos(\pi - \theta)$$
Knowing that
$$\cos(\pi - \theta) = -\cos\theta$$
we can add the two expressions together, and find ourselves with
$$d_1^2 + d_2^2 = 2u^2 + 2v^2 - 2uv\cos\theta + 2uv\cos\theta$$
$$d_1^2 + d_2^2 = 2u^2 + 2v^2$$
which is the theorem we set out to prove. Version: 2 Owner: drini Author(s): fiziko
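The identity $d_1^2 + d_2^2 = 2u^2 + 2v^2$ is also easy to confirm numerically for a parallelogram spanned by two vectors (a sketch; the vectors are arbitrary choices):

```python
import math

# parallelogram spanned by vectors U and V
U = (3.0, 1.0)
V = (1.0, 2.0)

u = math.hypot(*U)                           # side lengths
v = math.hypot(*V)
d1 = math.hypot(U[0] + V[0], U[1] + V[1])    # diagonals U + V and U - V
d2 = math.hypot(U[0] - V[0], U[1] - V[1])

print(d1**2 + d2**2, 2*u**2 + 2*v**2)        # equal, up to rounding
```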

527.27 alternative proof of the sines law

The goal is to prove the sine law:
$$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c} = \frac{1}{2R}$$
where the variables are defined by the triangle

[figure omitted: a triangle with sides $a$, $b$, $c$ opposite the corners $A$, $B$, $C$]

and where $R$ is the radius of the circumcircle that encloses our triangle. Let's add a couple of lines and define more variables.

[figure omitted: the same triangle with an altitude of length $y$ drawn from the corner between sides $a$ and $b$]

So, we now know that
$$\sin A = \frac{y}{b}$$

and, therefore, we need to prove
$$\frac{\sin B}{b} = \frac{y}{ba}$$
or
$$\sin B = \frac{y}{a}.$$
From geometry, we can see that
$$\sin(\pi - B) = \frac{y}{a}.$$
So the proof is reduced to proving that
$$\sin(\pi - B) = \sin B. \qquad (527.27.1)$$
This is easily seen as true after examining the top half of the unit circle. So, putting all of our results together, we get
$$\frac{\sin A}{a} = \frac{y}{ba} = \frac{\sin(\pi - B)}{b} = \frac{\sin B}{b}.$$
The same logic may be followed to show that each of these fractions is also equal to $\frac{\sin C}{c}$.

For the final step of the proof, we must show that
$$2R = \frac{a}{\sin A}.$$

We begin by defining our coordinate system. For this, it is convenient to find one side that is not shorter than the others and label it with length $b$. (The concept of a "longest" side is not well defined in equilateral and some isosceles triangles, but there is always at least one side that is not shorter than the others.) We then define our coordinate system such that the corners of the triangle that mark the ends of side $b$ are at the coordinates $(0, 0)$ and $(b, 0)$. Our third corner (with sides labelled alphabetically clockwise) is at the point $(c\cos A, c\sin A)$. Let the center of our circumcircle be at $(x_0, y_0)$. We now have
$$x_0^2 + y_0^2 = R^2 \qquad (527.27.2)$$
$$(b - x_0)^2 + y_0^2 = R^2 \qquad (527.27.3)$$
$$(c\cos A - x_0)^2 + (c\sin A - y_0)^2 = R^2 \qquad (527.27.4)$$

as each corner of our triangle is, by definition of the circumcircle, a distance $R$ from the circle's center.

Combining equations (3) and (2), we find
$$(b - x_0)^2 + y_0^2 = x_0^2 + y_0^2$$
$$b^2 - 2bx_0 = 0$$
$$x_0 = \frac{b}{2}$$
Substituting this into equation (2) we find that
$$y_0^2 = R^2 - \frac{b^2}{4} \qquad (527.27.5)$$

Combining equations (4) and (5) leaves us with
$$(c\cos A - x_0)^2 + (c\sin A - y_0)^2 = x_0^2 + y_0^2$$
$$c^2\cos^2 A - 2x_0 c\cos A + c^2\sin^2 A - 2y_0 c\sin A = 0$$
$$c - 2x_0\cos A - 2y_0\sin A = 0$$
$$y_0 = \frac{c - b\cos A}{2\sin A}$$
$$R^2 - \frac{b^2}{4} = \frac{(c - b\cos A)^2}{4\sin^2 A}$$
$$4R^2\sin^2 A = (c - b\cos A)^2 + b^2\sin^2 A$$
$$4R^2\sin^2 A = c^2 - 2bc\cos A + b^2$$
$$4R^2\sin^2 A = a^2$$
$$2R = \frac{a}{\sin A}$$

where we have applied the cosines law in the second to last step. Version: 3 Owner: drini Author(s): fiziko

527.28 angle bisector

For every angle, there exists a line that divides the angle into two equal parts. This line is called the angle bisector.

The interior bisector of an angle is the line or line segment that divides it into two equal angles on the same side as the angle. The exterior bisector of an angle is the line or line segment that divides it into two equal angles on the opposite side as the angle.

For a triangle, the point where the angle bisectors of the three angles meet is called the incenter. Version: 1 Owner: giri Author(s): giri

527.29 angle sum identity

It is desired to prove the identities
$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$$
and
$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi.$$
Consider the figure

where we have congruent right triangles
$$\triangle Aad \cong \triangle Bba \cong \triangle Ccb \cong \triangle Ddc$$
and $ad = dc = 1$. Also, everything is Euclidean, and in particular, the interior angles of any triangle sum to $\pi$. Call $\angle Aad = \theta$ and $\angle baB = \phi$. From the triangle sum rule, we have $\angle Ada = \frac{\pi}{2} - \theta$ and $\angle Ddc = \frac{\pi}{2} - \phi$, while the degenerate angle $\angle AdD = \pi$, so that
$$\angle adc = \theta + \phi.$$
We have, therefore, that the area of the pink parallelogram is $\sin(\theta + \phi)$. On the other hand, we can rearrange things thus:

[figure omitted: the same pink area rearranged into two smaller parallelograms]

In this figure we see an equal pink area, but it is composed of two pieces, of areas $\sin\phi\cos\theta$ and $\cos\phi\sin\theta$. Adding, we have
$$\sin(\theta + \phi) = \sin\phi\cos\theta + \cos\phi\sin\theta$$

which gives us the first. From definitions, it then also follows that $\sin(\theta + \pi/2) = \cos\theta$, and $\sin(\theta + \pi) = -\sin\theta$. Writing
$$\begin{aligned}
\cos(\theta + \phi) &= \sin(\theta + \phi + \pi/2) \\
&= \sin\theta\cos(\phi + \pi/2) + \cos\theta\sin(\phi + \pi/2) \\
&= \sin\theta\sin(\phi + \pi) + \cos\theta\cos\phi \\
&= \cos\theta\cos\phi - \sin\theta\sin\phi
\end{aligned}$$

Version: 7 Owner: quincynoodles Author(s): quincynoodles

527.30 annulus

An annulus is a two-dimensional shape which can be thought of as a disc with a smaller disc removed from its center. An annulus looks like:

Note that both the inner and outer radii may take on any values, so long as the outer radius is larger than the inner. Version: 9 Owner: akrowne Author(s): akrowne

527.31 butterfly theorem

Let M be the midpoint of a chord P Q of a circle, through which two other chords AB and CD are drawn. If AD intersects P Q at X and CB intersects P Q at Y ,then M is also the midpoint of XY.

The theorem gets its name from the shape of the figure, which resembles a butterfly. Version: 5 Owner: giri Author(s): giri

527.32 centroid

The centroid of a triangle (also called center of gravity of the triangle) is the point where the three medians intersect each other.

In the figure, $AA'$, $BB'$ and $CC'$ are medians and $G$ is the centroid of $ABC$. The centroid $G$ has the property that it divides the medians in the ratio $2:1$, that is,
$$AG = 2\,GA', \qquad BG = 2\,GB', \qquad CG = 2\,GC'.$$

Version: 5 Owner: drini Author(s): drini

527.33 chord

A chord is the line segment joining two points on a curve. Usually it is used to refer to a line segment whose end points lie on a circle. Version: 1 Owner: giri Author(s): giri

527.34 circle

Definition A circle in the plane is determined by a center and a radius. The center is a point in the plane, and the radius is a positive real number. The circle consists of all points whose distance from the center equals the radius. (In this entry, we only work with the standard Euclidean norm in the plane.)

A circle determines a closed curve in the plane, and this curve is called the perimeter or circumference of the circle. If the radius of a circle is $r$, then the length of the perimeter is $2\pi r$. Also, the area of the circle is $\pi r^2$. More precisely, the interior of the perimeter has area $\pi r^2$. The diameter of a circle is defined as $d = 2r$. The circle is a special case of an ellipse. Also, in three dimensions, the analogous geometric object to a circle is a sphere.

The circle in analytic geometry

Let us next derive an analytic equation for a circle in Cartesian coordinates $(x, y)$. If the circle has center $(a, b)$ and radius $r > 0$, we obtain the following condition for the points of the circle:
$$(x - a)^2 + (y - b)^2 = r^2. \qquad (527.34.1)$$
In other words, the circle is the set of all points $(x, y)$ that satisfy the above equation. In the special case that $a = b = 0$, the equation is simply $x^2 + y^2 = r^2$. The unit circle is the circle $x^2 + y^2 = 1$.


It is clear that equation (527.34.1) can always be reduced to the form

x^2 + y^2 + Dx + Ey + F = 0,    (527.34.2)

where D, E, F are real numbers. Conversely, suppose that we are given an equation of the above form where D, E, F are arbitrary real numbers. Next we derive conditions for these constants, so that equation (527.34.2) determines a circle [1]. Completing the squares yields

x^2 + Dx + D^2/4 + y^2 + Ey + E^2/4 = −F + D^2/4 + E^2/4,

whence

(x + D/2)^2 + (y + E/2)^2 = (D^2 − 4F + E^2)/4.

There are three cases:

1. If D^2 − 4F + E^2 > 0, then equation (527.34.2) determines a circle with center (−D/2, −E/2) and radius (1/2)√(D^2 − 4F + E^2).
2. If D^2 − 4F + E^2 = 0, then equation (527.34.2) determines the single point (−D/2, −E/2).
3. If D^2 − 4F + E^2 < 0, then equation (527.34.2) has no (real) solution in the (x, y)-plane.

The circle in polar coordinates

Using polar coordinates for the plane, we can parameterize the circle. Consider the circle with center (a, b) and radius r > 0 in the plane R^2. It is then natural to introduce polar coordinates (ρ, φ) for R^2 \ {(a, b)} by

x(ρ, φ) = a + ρ cos φ,    y(ρ, φ) = b + ρ sin φ,

with ρ > 0 and φ ∈ [0, 2π). Since we wish to parameterize the circle, the point (a, b) does not pose a problem; it is not part of the circle. Plugging these expressions for x, y into equation (527.34.1) yields the condition ρ = r. The given circle is thus parameterized by

φ ↦ (a + r cos φ, b + r sin φ),    φ ∈ [0, 2π).

It follows that a circle is a closed curve in the plane.

Three point formula for the circle

Suppose we are given three points on a circle, say (x_1, y_1), (x_2, y_2), (x_3, y_3). We next derive expressions for the parameters D, E, F in terms of these points. We also derive equation (527.34.3), which gives an equation for a circle in terms of a determinant. First, from equation (527.34.2), we have

x_1^2 + y_1^2 + Dx_1 + Ey_1 + F = 0,
x_2^2 + y_2^2 + Dx_2 + Ey_2 + F = 0,
x_3^2 + y_3^2 + Dx_3 + Ey_3 + F = 0.

These equations form a linear set of equations for D, E, F, i.e.,

\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} \cdot \begin{pmatrix} D \\ E \\ F \end{pmatrix} = - \begin{pmatrix} x_1^2 + y_1^2 \\ x_2^2 + y_2^2 \\ x_3^2 + y_3^2 \end{pmatrix}.

Let us denote the matrix on the left hand side by Λ. Also, let us assume that det Λ ≠ 0, i.e., that the three points are not collinear. Then, using Cramer's rule, we obtain

D = - \frac{1}{\det\Lambda} \det \begin{pmatrix} x_1^2 + y_1^2 & y_1 & 1 \\ x_2^2 + y_2^2 & y_2 & 1 \\ x_3^2 + y_3^2 & y_3 & 1 \end{pmatrix},

E = - \frac{1}{\det\Lambda} \det \begin{pmatrix} x_1 & x_1^2 + y_1^2 & 1 \\ x_2 & x_2^2 + y_2^2 & 1 \\ x_3 & x_3^2 + y_3^2 & 1 \end{pmatrix},

F = - \frac{1}{\det\Lambda} \det \begin{pmatrix} x_1 & y_1 & x_1^2 + y_1^2 \\ x_2 & y_2 & x_2^2 + y_2^2 \\ x_3 & y_3 & x_3^2 + y_3^2 \end{pmatrix}.

These equations give the parameters D, E, F as functions of the three given points. Substituting these equations into equation (527.34.2) yields

(x^2 + y^2) \det \begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} - x \det \begin{pmatrix} x_1^2 + y_1^2 & y_1 & 1 \\ x_2^2 + y_2^2 & y_2 & 1 \\ x_3^2 + y_3^2 & y_3 & 1 \end{pmatrix} - y \det \begin{pmatrix} x_1 & x_1^2 + y_1^2 & 1 \\ x_2 & x_2^2 + y_2^2 & 1 \\ x_3 & x_3^2 + y_3^2 & 1 \end{pmatrix} - \det \begin{pmatrix} x_1 & y_1 & x_1^2 + y_1^2 \\ x_2 & y_2 & x_2^2 + y_2^2 \\ x_3 & y_3 & x_3^2 + y_3^2 \end{pmatrix} = 0.

Using the cofactor expansion, we can now write the equation for the circle passing through (x_1, y_1), (x_2, y_2), (x_3, y_3) as [2, 3]

\det \begin{pmatrix} x^2 + y^2 & x & y & 1 \\ x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ x_2^2 + y_2^2 & x_2 & y_2 & 1 \\ x_3^2 + y_3^2 & x_3 & y_3 & 1 \end{pmatrix} = 0.    (527.34.3)

See also

• Wikipedia's entry on the circle.
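To make the three point formula concrete, here is a small Python sketch (illustrative, not part of the original entry; the sample points are assumed) that solves the linear system for D, E, F exactly and recovers the center and squared radius:

```python
# Solve the 3x3 system for D, E, F in x^2 + y^2 + Dx + Ey + F = 0 through
# three given points, using exact rational arithmetic.
from fractions import Fraction as Frac

def circle_through(p1, p2, p3):
    """Return (D, E, F) for the circle through three non-collinear points."""
    rows = [[Frac(x), Frac(y), Frac(1), -(Frac(x)**2 + Frac(y)**2)]
            for x, y in (p1, p2, p3)]
    # Gauss-Jordan elimination on the 3x4 augmented matrix.
    for i in range(3):
        piv = next(r for r in range(i, 3) if rows[r][i] != 0)
        rows[i], rows[piv] = rows[piv], rows[i]
        rows[i] = [v / rows[i][i] for v in rows[i]]
        for r in range(3):
            if r != i:
                rows[r] = [v - rows[r][i] * w for v, w in zip(rows[r], rows[i])]
    return rows[0][3], rows[1][3], rows[2][3]

# Three points on the circle of radius 5 centered at (1, 2).
D, E, F = circle_through((6, 2), (1, 7), (-4, 2))
center = (-D / 2, -E / 2)                    # (-D/2, -E/2)
radius_sq = (D**2 - 4*F + E**2) / 4          # (D^2 - 4F + E^2)/4
print(center, radius_sq)
```

Since the arithmetic is exact, the recovered center is (1, 2) and the squared radius is 25, matching case 1 of the classification above.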


REFERENCES
1. J. H. Kindle, Schaum's Outline Series: Theory and Problems of Plane and Solid Analytic Geometry, Schaum Publishing Co., 1950.
2. E. Weisstein, Eric W. Weisstein's World of Mathematics, entry on the circle.
3. L. Råde, B. Westergren, Mathematics Handbook for Science and Engineering, Studentlitteratur, 1995.

Version: 2 Owner: bbukh Author(s): bbukh, matte

527.35

collinear

A set of points is said to be collinear if they all lie on a straight line. In the following picture, A, P, B are collinear.

Version: 6 Owner: drini Author(s): drini

527.36

complete quadrilateral
Let ABCD be a quadrilateral. Let {F} = AB ∩ CD and {E} = BC ∩ AD. Then ABCDEF is a complete quadrilateral.

The complete quadrilateral has four sides: ABF, ADE, BCE, DCF, and six angles: A, B, C, D, E, F.

Version: 2 Owner: vladm Author(s): vladm

527.37

concurrent

A set of lines or curves is said to be concurrent if all of them pass through some point:

Version: 2 Owner: drini Author(s): drini

527.38

cosines law

Cosines Law. Let a, b, c be the sides of a triangle, and let A be the angle opposite to a. Then a^2 = b^2 + c^2 − 2bc cos A.
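As a quick numerical sanity check (a Python sketch, not part of the original entry; the side lengths and angle are assumed for the example), one can place the angle A at the origin and compare the directly measured third side with the formula:

```python
import math

def third_side(b, c, A):
    """Length of the side opposite angle A (in radians), by the cosines law."""
    return math.sqrt(b*b + c*c - 2*b*c*math.cos(A))

# Place the angle A at the origin: one adjacent side along the x-axis,
# the other at angle A.
b, c, A = 3.0, 5.0, math.radians(70)
P = (b*math.cos(A), b*math.sin(A))   # endpoint of the side of length b
Q = (c, 0.0)                          # endpoint of the side of length c
a_direct = math.dist(P, Q)            # the side a, measured directly
print(abs(a_direct - third_side(b, c, A)) < 1e-12)
```

For a right angle (A = 90°) the formula degenerates to the Pythagorean theorem, since cos 90° = 0.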

Version: 9 Owner: drini Author(s): drini

527.39

cyclic quadrilateral

Cyclic quadrilateral. A quadrilateral is cyclic when its four vertices lie on a circle.

A necessary and sufficient condition for a quadrilateral to be cyclic is that the sum of a pair of opposite angles be equal to 180°. One of the main results about these quadrilaterals is Ptolemy's theorem. Also, among all quadrilaterals with given sides p, q, r, s, the one that is cyclic has the greatest area. If the four sides of a cyclic quadrilateral are known, the area can be found using Brahmagupta's formula. Version: 4 Owner: drini Author(s): drini
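The opposite-angle condition can be checked numerically (a sketch with assumed sample points, not from the original entry): take four points in order on a circle and verify that two opposite interior angles sum to 180°.

```python
import math

def interior_angle(prev_pt, vertex, next_pt):
    """Interior angle at `vertex` between the sides toward the two neighbours, in degrees."""
    ax, ay = prev_pt[0] - vertex[0], prev_pt[1] - vertex[1]
    bx, by = next_pt[0] - vertex[0], next_pt[1] - vertex[1]
    dot = ax*bx + ay*by
    cross = ax*by - ay*bx
    return math.degrees(abs(math.atan2(cross, dot)))

# Four points on the unit circle, listed in order around it,
# hence the vertices of a cyclic quadrilateral ABCD.
ts = [0.3, 1.4, 2.9, 5.0]
A, B, C, D = [(math.cos(t), math.sin(t)) for t in ts]
angA = interior_angle(D, A, B)
angC = interior_angle(B, C, D)
print(angA + angC)  # opposite angles of a cyclic quadrilateral sum to 180
```

By the inscribed angle theorem the sum is exactly 180° no matter which four concyclic points are chosen.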

527.40

derivation of cosines law

The idea is to prove the cosines law

a^2 = b^2 + c^2 − 2bc cos θ,

where the variables are defined by the triangle: [figure: a triangle with sides a, b, c, where θ is the angle between sides b and c, opposite to a].

Let's add a couple of lines and two variables, to get [figure: the same triangle, with a perpendicular of length y dropped from the vertex joining a and b onto the extension of side c, meeting it at distance x beyond c]. This is all we need. We can use Pythagoras' theorem to show that

a^2 = x^2 + y^2    and    b^2 = y^2 + (c + x)^2.

So, combining these two we get

a^2 = x^2 + b^2 − (c + x)^2
    = x^2 + b^2 − c^2 − 2cx − x^2
    = b^2 − c^2 − 2cx.

So, all we need now is an expression for x. Well, we can use the definition of the cosine function to show that

c + x = b cos θ,    x = b cos θ − c.

With this result in hand, we find that

a^2 = b^2 − c^2 − 2cx
    = b^2 − c^2 − 2c(b cos θ − c)
    = b^2 − c^2 − 2bc cos θ + 2c^2
    = b^2 + c^2 − 2bc cos θ.    (527.40.1)

Version: 2 Owner: drini Author(s): fiziko

527.41

diameter

The diameter of a circle or a sphere is the length of the segment joining a point with the one symmetric to it with respect to the center; that is, the length of the longest segment joining a pair of points. Also, we call any of these segments themselves a diameter. So, in the next picture, AB is a diameter.

The diameter is equal to twice the radius. Version: 17 Owner: drini Author(s): drini

527.42

double angle identity

The double-angle identities are

sin(2a) = 2 cos(a) sin(a)    (527.42.1)
cos(2a) = 2 cos^2(a) − 1 = 1 − 2 sin^2(a)    (527.42.2)
tan(2a) = 2 tan(a) / (1 − tan^2(a))    (527.42.3)

These are all derived from their respective trig addition formulas. For example,

sin(2a) = sin(a + a) = cos(a) sin(a) + sin(a) cos(a) = 2 cos(a) sin(a).

The formula for cosine follows similarly, and tangent is derived by taking the ratio of sine to cosine, as always. The double-angle formulae can also be derived from the de Moivre identity.

Version: 5 Owner: akrowne Author(s): akrowne
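The three identities can be spot-checked numerically (a small Python sketch, not from the original entry; the sample angles are assumed and chosen away from the poles of the tangent):

```python
import math

# Verify the double-angle identities at a few sample angles, avoiding
# values where tan(a) or tan(2a) is undefined.
for a in [0.1, 0.5, 1.0, 2.0, -0.7]:
    assert math.isclose(math.sin(2*a), 2*math.cos(a)*math.sin(a))
    assert math.isclose(math.cos(2*a), 2*math.cos(a)**2 - 1)
    assert math.isclose(math.cos(2*a), 1 - 2*math.sin(a)**2)
    assert math.isclose(math.tan(2*a), 2*math.tan(a)/(1 - math.tan(a)**2))
print("all identities hold")
```

Note the denominator 1 − tan^2(a) in the tangent identity; it comes from dividing the sine formula by the cosine formula.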

527.43

equilateral triangle

A triangle with its three sides and its three angles equal.

Therefore, an equilateral triangle has three angles of 60°. By the side-side-side congruence criterion, an equilateral triangle is completely determined by specifying its side.

In an equilateral triangle, the bisector of any angle coincides with the height, the median and the perpendicular bisector of the opposite side. If r is the length of the side, then the height is equal to r√3/2. Version: 3 Owner: drini Author(s): drini

527.44

fundamental theorem on isogonal lines

Let ABC be a triangle and AX, BY, CZ three lines concurrent at P. If AX′, BY′, CZ′ are the respective isogonal conjugate lines of AX, BY, CZ, then AX′, BY′, CZ′ are also concurrent at some point P′. An application of this theorem proves the existence of the Lemoine point (for it is the intersection point of the symmedians). This theorem is a direct consequence of Ceva's theorem (trigonometric version). Version: 1 Owner: drini Author(s): drini

527.45

height

Let ABC be a given triangle. A height of ABC is a line drawn from a vertex to the opposite side (or its prolongation) and perpendicular to it, so we have three heights in any triangle. The three heights are always concurrent, and their common point is called the orthocenter. In the following figure, AD, BE and CF are heights of ABC.

Version: 2 Owner: drini Author(s): drini

527.46

hexagon

A hexagon is a 6-sided polygon.

Figure. A regular hexagon.

Version: 2 Owner: drini Author(s): drini

527.47

hypotenuse

Let ABC be a right triangle with the right angle at C. Then AB is called the hypotenuse.

The midpoint P of the hypotenuse coincides with the circumcenter of the triangle, so it is equidistant from the three vertices. When the triangle is inscribed in its circumcircle, the hypotenuse becomes a diameter, so the distance from P to the vertices is precisely the circumradius. The hypotenuse's length can be calculated by means of the Pythagorean theorem:

c = √(a^2 + b^2).

Sometimes the longest side of an arbitrary triangle is also called a hypotenuse, but this naming is seldom seen. Version: 5 Owner: drini Author(s): drini
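Both facts above are easy to check numerically (a sketch with an assumed 3-4-5 right triangle, not from the original entry):

```python
import math

# Right triangle with the right angle at C = (0, 0), legs along the axes.
A, B, C = (4.0, 0.0), (0.0, 3.0), (0.0, 0.0)
P = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)    # midpoint of the hypotenuse AB

c = math.dist(A, B)                            # hypotenuse length
assert math.isclose(c, math.hypot(4.0, 3.0))   # c = sqrt(a^2 + b^2)

# P is equidistant from all three vertices: it is the circumcenter,
# and that common distance is the circumradius c/2.
dists = [math.dist(P, V) for V in (A, B, C)]
assert all(math.isclose(d, c / 2) for d in dists)
print("circumradius:", c / 2)
```

For this triangle the hypotenuse is 5 and the circumradius is 2.5.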

527.48

isogonal conjugate

Let ABC be a triangle, AL the angle bisector of ∠BAC, and AX any line passing through A. The isogonal conjugate line to AX is the line AY obtained by reflecting the line AX on the angle bisector AL. In the picture, ∠YAL = ∠LAX. This is the reason why AX and AY are called isogonal conjugates: they form the same angle with AL (iso = equal, gonal = angle). Let P be a point on the plane. The lines AP, BP, CP are concurrent by construction. Consider now their isogonal conjugates (reflections on the inner angle bisectors). The isogonal conjugates will also concur, by the fundamental theorem on isogonal lines, and their intersection point Q is called the isogonal conjugate of P. If Q is the isogonal conjugate of P, then P is the isogonal conjugate of Q, so both are often referred to as an isogonal conjugate pair. An example of an isogonal conjugate pair is found by looking at the centroid of the triangle and the Lemoine point. Version: 4 Owner: drini Author(s): drini

527.49

isosceles triangle

A triangle with two equal sides. This definition implies that any equilateral triangle is isosceles too, but there are isosceles triangles that are not equilateral. In any isosceles triangle, the angles opposite to the equal sides are also equal. In an isosceles triangle, the height, the median and the bisector to the third side are the same line. Version: 5 Owner: drini Author(s): drini

527.50

legs

The legs of a right triangle are the two sides which are not the hypotenuse.
Above: Various triangles, with legs in red.

Note that a triangle which is not a right triangle has no legs, just as there is no notion of hypotenuse for such a triangle. Version: 3 Owner: akrowne Author(s): akrowne

527.51

medial triangle

The medial triangle of a triangle ABC is the triangle formed by joining the midpoints of the sides of the triangle ABC.

Here, A′B′C′ is the medial triangle. The incircle of the medial triangle is called the Spieker circle, and its incenter is called the Spieker center. The circumcircle of the medial triangle is called the medial circle. An important property of medial triangles is that the medial triangle DEF of the medial triangle A′B′C′ of ABC is similar to ABC.
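The similarity property is easy to see numerically (a sketch with an assumed sample triangle, not from the original entry): each medial triangle halves every side, so taking the medial triangle twice scales all sides by 1/4.

```python
import math

def medial(tri):
    """Medial triangle: midpoints of the sides, one opposite each vertex."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return [((bx+cx)/2, (by+cy)/2), ((cx+ax)/2, (cy+ay)/2), ((ax+bx)/2, (ay+by)/2)]

def sides(tri):
    a, b, c = tri
    return sorted([math.dist(b, c), math.dist(c, a), math.dist(a, b)])

ABC = [(0.0, 0.0), (7.0, 1.0), (3.0, 5.0)]
M1 = medial(ABC)    # the medial triangle A'B'C'
M2 = medial(M1)     # the medial triangle DEF of A'B'C'
ratios = [s2 / s for s, s2 in zip(sides(ABC), sides(M2))]
print(ratios)  # every side of M2 is 1/4 of the corresponding side of ABC
```

Since all three ratios are equal, M2 is similar to ABC (in fact homothetic to it).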

Version: 2 Owner: giri Author(s): giri

527.52

median

The median of a triangle is a line joining a vertex with the midpoint of the opposite side. In the next figure, AA′ is a median. That is, BA′ = A′C, or equivalently, A′ is the midpoint of BC.

Version: 7 Owner: drini Author(s): drini

527.53

midpoint

If AB is a segment, then its midpoint is the point P whose distances from A and B are equal. That is, AP = PB.

With the notation of directed segments, it is the point on the line that contains AB such that the ratio \overrightarrow{AP}/\overrightarrow{PB} = 1.

Version: 2 Owner: drini Author(s): drini

527.54

nine-point circle

The nine-point circle, also known as Euler's circle or the Feuerbach circle, is the circle that passes through the feet of the perpendiculars dropped from the vertices A, B and C of a triangle ABC to the opposite sides.

Some of the properties of this circle are:

Property 1: This circle also passes through the midpoints of the sides AB, BC and CA of ABC. This was shown by Euler.

Property 2: Feuerbach showed that this circle also passes through the midpoints of the line segments AH, BH and CH, which are drawn from the vertices of ABC to its orthocenter H.

These three triples of points make nine in all, giving the circle its name.

Property 3: The radius of the nine-point circle is R/2, where R is the circumradius (radius of the circumcircle).

Property 4: The center of the nine-point circle is the midpoint of the line segment joining the orthocenter and the circumcenter, and hence lies on the Euler line.

Property 5: All triangles inscribed in a given circle and having the same orthocenter have the same nine-point circle.

Version: 3 Owner: mathwizard Author(s): mathwizard, giri
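Properties 1-4 can be verified numerically for a sample triangle (a Python sketch with assumed coordinates, not part of the original entry):

```python
import math

A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 4.0)

def circumcenter(P, Q, R):
    (ax, ay), (bx, by), (cx, cy) = P, Q, R
    d = 2 * (ax*(by-cy) + bx*(cy-ay) + cx*(ay-by))
    ux = ((ax**2+ay**2)*(by-cy) + (bx**2+by**2)*(cy-ay) + (cx**2+cy**2)*(ay-by)) / d
    uy = ((ax**2+ay**2)*(cx-bx) + (bx**2+by**2)*(ax-cx) + (cx**2+cy**2)*(bx-ax)) / d
    return (ux, uy)

def foot(P, Q, R):
    """Foot of the perpendicular from P onto the line QR."""
    dx, dy = R[0]-Q[0], R[1]-Q[1]
    t = ((P[0]-Q[0])*dx + (P[1]-Q[1])*dy) / (dx*dx + dy*dy)
    return (Q[0] + t*dx, Q[1] + t*dy)

mid = lambda P, Q: ((P[0]+Q[0])/2, (P[1]+Q[1])/2)

O = circumcenter(A, B, C)
R_circ = math.dist(O, A)
H = (A[0]+B[0]+C[0]-2*O[0], A[1]+B[1]+C[1]-2*O[1])   # orthocenter (Euler line)
N = mid(O, H)                                         # nine-point center (Property 4)

nine = [foot(A, B, C), foot(B, C, A), foot(C, A, B),  # feet of the heights
        mid(A, B), mid(B, C), mid(C, A),              # side midpoints (Property 1)
        mid(A, H), mid(B, H), mid(C, H)]              # midpoints of AH, BH, CH (Property 2)
assert all(math.isclose(math.dist(N, P), R_circ/2) for P in nine)  # Property 3
print("nine-point radius:", R_circ/2)
```

The orthocenter formula H = A + B + C − 2O used here is the vector form of the Euler line relation H = 3G − 2O, where G is the centroid.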

527.55

orthic triangle

If ABC is a triangle and AD, BE, CF are its three heights, then the triangle DEF is called the orthic triangle of ABC. A remarkable property of orthic triangles says that the orthocenter of ABC is also the incenter of the orthic triangle DEF. That is, the heights of ABC are the angle bisectors of DEF.

Version: 2 Owner: drini Author(s): drini

527.56

orthocenter

The orthocenter of a triangle is the point of intersection of its three heights.

In the figure, H is the orthocenter of ABC. The orthocenter H lies inside, on a vertex of, or outside the triangle, depending on whether the triangle is acute, right or obtuse, respectively. The orthocenter is one of the most important triangle centers, and it is closely related to the orthic triangle (formed by the feet of the three heights). It lies on the Euler line, and the quadrilaterals FHDB, DHEC and AFHE are cyclic. Version: 3 Owner: drini Author(s): drini

527.57

parallelogram

A quadrilateral whose opposite sides are parallel. Some special parallelograms have their own names: squares, rectangles, rhombuses. A rectangle is a parallelogram whose four angles are equal, a rhombus is a parallelogram whose four sides are equal, and a square is a parallelogram that is a rectangle and a rhombus at the same time. All parallelograms have their opposite sides and opposite angles equal (moreover, if a quadrilateral has a pair of opposite sides equal and parallel, the quadrilateral must be a parallelogram). Also, adjacent angles always add up to 180°, and the diagonals bisect each other. There is also a neat relation between the lengths of the sides and the lengths of the diagonals, called the parallelogram law. Version: 2 Owner: drini Author(s): drini

527.58

parallelogram law

Let ABCD be a parallelogram with side lengths u, v and whose diagonals have lengths d_1 and d_2. Then

2u^2 + 2v^2 = d_1^2 + d_2^2.
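A quick numerical check of the law on an arbitrary parallelogram (a sketch with assumed spanning vectors, not from the original entry):

```python
import math

# Parallelogram ABCD spanned by two vectors u and v from the origin.
u, v = (4.0, 1.0), (1.5, 3.0)
A = (0.0, 0.0)
B = (u[0], u[1])
C = (u[0] + v[0], u[1] + v[1])
D = (v[0], v[1])

side_u, side_v = math.dist(A, B), math.dist(A, D)
d1, d2 = math.dist(A, C), math.dist(B, D)   # the two diagonals
lhs = 2*side_u**2 + 2*side_v**2
rhs = d1**2 + d2**2
assert math.isclose(lhs, rhs)
print(lhs, rhs)
```

In vector language this is just |u + v|^2 + |u − v|^2 = 2|u|^2 + 2|v|^2, which follows by expanding the squares.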

Version: 3 Owner: drini Author(s): drini

527.59

pedal triangle

The pedal triangle of a triangle ABC is the triangle whose vertices are the feet of the perpendiculars from A, B, and C to the opposite sides of the triangle. In the figure, DEF is the pedal triangle.

In general, for any point P inside a triangle, the pedal triangle of P is the triangle whose vertices are the feet of the perpendiculars from P to the sides of the triangle.

Version: 3 Owner: giri Author(s): giri

527.60

pentagon

A pentagon is a 5-sided polygon. Regular pentagons are of particular interest to geometers. In a regular pentagon, the inner angles are equal to 108°, and all the diagonals have the same length. If s is the length of a side and d is the length of a diagonal, then

d/s = (1 + √5)/2;

that is, the ratio between a diagonal and a side is the golden number. Version: 1 Owner: drini Author(s): drini
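The diagonal-to-side ratio can be checked numerically by placing the vertices on a circle (a sketch, not part of the original entry):

```python
import math

# Vertices of a regular pentagon inscribed in the unit circle.
n = 5
verts = [(math.cos(2*math.pi*k/n), math.sin(2*math.pi*k/n)) for k in range(n)]
side = math.dist(verts[0], verts[1])   # adjacent vertices
diag = math.dist(verts[0], verts[2])   # vertices two apart
golden = (1 + math.sqrt(5)) / 2
assert math.isclose(diag / side, golden)
print(diag / side)
```

Equivalently, diag/side = sin(2π/5)/sin(π/5) = 2 cos(π/5), which equals the golden number.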

527.61

polygon

A polygon is a plane region delimited by straight lines. Some polygons have special names:

Number of sides    Name of the polygon
3                  triangle
4                  quadrilateral
5                  pentagon
6                  hexagon
7                  heptagon
8                  octagon

In general, a polygon with n sides is called an n-gon. In an n-gon, there are n points where two sides meet; these are called the vertices of the n-gon. At each vertex, the two sides that meet determine two angles: the interior angle and the exterior angle. The former opens towards the interior of the polygon, and the latter towards the exterior. Below are some properties of polygons.

1. The sum of all its interior angles is (n − 2)·180°.
2. Any polygon divides the plane into two components, one bounded (the inside of the polygon) and one unbounded. This result is the Jordan curve theorem for polygons. A direct proof can be found in [1], pp. 16-18.
3. In complex analysis, the Schwarz-Christoffel transformation [2] gives a conformal map from any polygon to the upper half plane.
4. The area of a polygon can be calculated using Pick's theorem.
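Property 4 can be illustrated on a small lattice polygon (a Python sketch with an assumed example; Pick's theorem states A = I + B/2 − 1 for a simple polygon with lattice-point vertices, where I and B count interior and boundary lattice points):

```python
from math import gcd

poly = [(0, 0), (4, 0), (4, 3), (0, 3)]   # a 4x3 lattice rectangle

def shoelace_area(pts):
    """Area of a simple polygon via the shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0]*pts[(i+1) % n][1] - pts[(i+1) % n][0]*pts[i][1]
            for i in range(n))
    return abs(s) / 2

def boundary_points(pts):
    """Lattice points on the boundary: gcd(|dx|, |dy|) per edge."""
    n = len(pts)
    return sum(gcd(abs(pts[(i+1) % n][0] - pts[i][0]),
                   abs(pts[(i+1) % n][1] - pts[i][1])) for i in range(n))

A = shoelace_area(poly)     # 12.0 for the rectangle
B = boundary_points(poly)   # 14 boundary lattice points
I = A - B/2 + 1             # interior points, solving Pick's theorem for I
print(A, B, I)              # the rectangle indeed has 3*2 = 6 interior points
```

Running Pick's theorem "backwards" like this recovers the interior count; conversely, counting I and B directly yields the area without any coordinates arithmetic.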

REFERENCES
1. E.E. Moise, Geometric Topology in Dimensions 2 and 3, Springer-Verlag, 1977. 2. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.

Version: 5 Owner: matte Author(s): matte, drini

527.62

proof of Apollonius theorem

Let b = CA, a = BC, c = AB, and m = AM, where M is the midpoint of BC. Let ∠CMA = θ, so that ∠BMA = π − θ. By the law of cosines,

b^2 = m^2 + a^2/4 − am cos θ

and

c^2 = m^2 + a^2/4 − am cos(π − θ) = m^2 + a^2/4 + am cos θ,

and adding gives

b^2 + c^2 = 2m^2 + a^2/2.

QED Version: 1 Owner: quincynoodles Author(s): quincynoodles

527.63

proof of Apollonius theorem

Let m be a median of the triangle, as shown in the figure. By Stewart's theorem we have

a (m^2 + (a/2)^2) = b^2 (a/2) + c^2 (a/2),

and thus

m^2 + (a/2)^2 = (b^2 + c^2)/2.

Multiplying both sides by 2 gives

2m^2 + a^2/2 = b^2 + c^2.

QED Version: 2 Owner: drini Author(s): drini

527.64

proof of Brahmagupta’s formula

We shall prove that the area of a cyclic quadrilateral with sides p, q, r, s is given by

√((T − p)(T − q)(T − r)(T − s)),    where T = (p + q + r + s)/2.

The area of the cyclic quadrilateral is

Area = Area of triangle ADB + Area of triangle BDC = (1/2) pq sin A + (1/2) rs sin C.

But since ABCD is a cyclic quadrilateral, ∠DAB = 180° − ∠DCB. Hence sin A = sin C. Therefore the area now is

Area = (1/2) pq sin A + (1/2) rs sin A,
(Area)^2 = (1/4) sin^2 A (pq + rs)^2,
4(Area)^2 = (1 − cos^2 A)(pq + rs)^2 = (pq + rs)^2 − cos^2 A (pq + rs)^2.

Applying the cosines law to triangles ADB and BDC and equating the two expressions for the side DB, we have

p^2 + q^2 − 2pq cos A = r^2 + s^2 − 2rs cos C.

Substituting cos C = −cos A (since angles A and C are supplementary) and rearranging, we have

2 cos A (pq + rs) = p^2 + q^2 − r^2 − s^2.

Substituting this in the equation for the area,

4(Area)^2 = (pq + rs)^2 − (1/4)(p^2 + q^2 − r^2 − s^2)^2,

so

16(Area)^2 = 4(pq + rs)^2 − (p^2 + q^2 − r^2 − s^2)^2,

which is of the form a^2 − b^2 and hence can be written in the form (a + b)(a − b) as

16(Area)^2 = (2(pq + rs) + p^2 + q^2 − r^2 − s^2)(2(pq + rs) − p^2 − q^2 + r^2 + s^2)
           = ((p + q)^2 − (r − s)^2)((r + s)^2 − (p − q)^2)
           = (p + q + r − s)(p + q + s − r)(p + r + s − q)(q + r + s − p).

Introducing T = (p + q + r + s)/2,

16(Area)^2 = 16(T − p)(T − q)(T − r)(T − s).

Taking the square root, we get

Area = √((T − p)(T − q)(T − r)(T − s)).
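The result can be cross-checked numerically (a sketch with assumed sample points, not part of the proof): build a cyclic quadrilateral from four points on a circle and compare the shoelace area with Brahmagupta's formula.

```python
import math

ts = [0.2, 1.1, 2.6, 4.4]                            # angles in increasing order
pts = [(2*math.cos(t), 2*math.sin(t)) for t in ts]   # vertices on a circle of radius 2

sides = [math.dist(pts[i], pts[(i+1) % 4]) for i in range(4)]
p, q, r, s = sides
T = (p + q + r + s) / 2
brahmagupta = math.sqrt((T-p)*(T-q)*(T-r)*(T-s))

# Shoelace formula for the same quadrilateral.
shoelace = abs(sum(pts[i][0]*pts[(i+1) % 4][1] - pts[(i+1) % 4][0]*pts[i][1]
                   for i in range(4))) / 2
assert math.isclose(brahmagupta, shoelace)
print(brahmagupta)
```

Any choice of four distinct angles in increasing order gives a convex cyclic quadrilateral, so the two areas always agree.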

Version: 3 Owner: giri Author(s): giri

527.65

proof of Erdős-Anning Theorem

Let A, B and C be three non-collinear points. For an additional point P consider the triangle ABP. By using the triangle inequality for the sides PB and PA we find −|AB| ≤ |PB| − |PA| ≤ |AB|. Likewise, for triangle BCP we get −|BC| ≤ |PB| − |PC| ≤ |BC|. Geometrically, this means the point P lies on two hyperbolas with A and B, respectively B and C, as foci. Since all the lengths involved here are by assumption integers, there are only 2|AB| + 1 possibilities for |PB| − |PA| and 2|BC| + 1 possibilities for |PB| − |PC|. These hyperbolas are distinct, since they do not have the same major axis. So for each pair of hyperbolas we can have at most 4 points of intersection, and there can be no more than 4(2|AB| + 1)(2|BC| + 1) points satisfying the conditions. Version: 1 Owner: lieven Author(s): lieven

527.66

proof of Heron’s formula

Let α be the angle between the sides b and c. Then we get from the cosines law:

cos α = (b^2 + c^2 − a^2)/(2bc).

Using the equation sin α = √(1 − cos^2 α), we get:

sin α = √(−a^4 − b^4 − c^4 + 2b^2 c^2 + 2a^2 b^2 + 2a^2 c^2)/(2bc).

Now we know:

∆ = (1/2) bc sin α.

So we get:

∆ = (1/4) √(−a^4 − b^4 − c^4 + 2b^2 c^2 + 2a^2 b^2 + 2a^2 c^2)
  = (1/4) √((a + b + c)(b + c − a)(a + c − b)(a + b − c))
  = √(s(s − a)(s − b)(s − c)),

where s = (a + b + c)/2 is the semiperimeter. This is Heron's formula. Version: 2 Owner: mathwizard Author(s): mathwizard
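The two expressions for the area agree numerically (a sketch with an assumed sample triangle, not part of the proof):

```python
import math

a, b, c = 7.0, 5.0, 6.0
s = (a + b + c) / 2
heron = math.sqrt(s*(s-a)*(s-b)*(s-c))

# The same area via Delta = (1/2) b c sin(alpha), with alpha from the cosines law.
alpha = math.acos((b*b + c*c - a*a) / (2*b*c))
direct = 0.5 * b * c * math.sin(alpha)
assert math.isclose(heron, direct)
print(heron)
```

For the 5-6-7 triangle both give √216, roughly 14.697.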

527.67

proof of Mollweide’s equations

We transform the equation

(a + b) sin(γ/2) = c cos((α − β)/2),

using the fact that γ = π − α − β (so that sin(γ/2) = cos((α + β)/2)), to

a cos(α/2 + β/2) + b cos(α/2 + β/2) = c cos(α/2) cos(β/2) + c sin(α/2) sin(β/2).

The left hand side can be further expanded, so that we get:

a (cos(α/2) cos(β/2) − sin(α/2) sin(β/2)) + b (cos(α/2) cos(β/2) − sin(α/2) sin(β/2)) = c cos(α/2) cos(β/2) + c sin(α/2) sin(β/2).

Collecting terms we get:

(a + b − c) cos(α/2) cos(β/2) − (a + b + c) sin(α/2) sin(β/2) = 0.

Using s := (a + b + c)/2 and using the half-angle equations

sin(α/2) = √((s − b)(s − c)/(bc)),    cos(α/2) = √(s(s − a)/(bc)),

together with the analogous ones for β/2, we get:

2 (s(s − c)/c) √((s − a)(s − b)/(ab)) − 2 (s(s − c)/c) √((s − a)(s − b)/(ab)) = 0,

which is obviously true. So we can prove the first equation by going backwards. The second equation can be proved in quite the same way.

Version: 1 Owner: mathwizard Author(s): mathwizard

527.68

proof of Ptolemy’s inequality

Looking at the quadrilateral ABCD we construct a point E such that the triangles ACD and AEB are similar (∠ABE = ∠CDA and ∠BAE = ∠CAD).

This means that

AE/AC = AB/AD = BE/DC,

from which it follows that

BE = AB · DC / AD.

Also, because ∠EAC = ∠BAD and AE/AB = AC/AD, the triangles EAC and BAD are similar. So we get:

EC = AC · DB / AD.

Now if ABCD is cyclic we get ∠ABE + ∠CBA = ∠ADC + ∠CBA = 180°. This means that the points C, B and E are on one line, and thus

EC = EB + BC.

Now we can use the formulas we already found to get:

AC · DB / AD = AB · DC / AD + BC.

Multiplication with AD gives:

AC · DB = AB · DC + BC · AD.

Now we look at the case that ABCD is not cyclic. Then ∠ABE + ∠CBA = ∠ADC + ∠CBA ≠ 180°, so the points E, B and C form a triangle, and from the triangle inequality we know:

EC < EB + BC.

Again we use our formulas to get:

AC · DB / AD < AB · DC / AD + BC.

From this we get:

AC · DB < AB · DC + BC · AD.

Putting this together we get Ptolemy's inequality:

AC · DB ≤ AB · DC + BC · AD,

with equality iff ABCD is cyclic.

Version: 1 Owner: mathwizard Author(s): mathwizard
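Both cases of the inequality can be checked numerically (a sketch with assumed sample points, not part of the proof): equality on a circle, strict inequality once a vertex is moved off it.

```python
import math

def ptolemy_sides(A, B, C, D):
    """Return (AC*BD, AB*CD + BC*AD) for quadrilateral ABCD."""
    return (math.dist(A, C)*math.dist(B, D),
            math.dist(A, B)*math.dist(C, D) + math.dist(B, C)*math.dist(A, D))

# Cyclic case: four vertices in order on the unit circle.
cyc = [(math.cos(t), math.sin(t)) for t in (0.1, 1.0, 2.5, 4.0)]
lhs, rhs = ptolemy_sides(*cyc)
assert math.isclose(lhs, rhs)            # equality for a cyclic quadrilateral

# Non-cyclic case: push the third vertex radially off the circle.
ncyc = [cyc[0], cyc[1], (1.5*cyc[2][0], 1.5*cyc[2][1]), cyc[3]]
lhs2, rhs2 = ptolemy_sides(*ncyc)
assert lhs2 < rhs2                       # strict inequality otherwise
print("Ptolemy verified")
```

The inequality holds for any four points in the plane; equality occurs exactly when they are concyclic in the order A, B, C, D.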

527.69

proof of Ptolemy’s theorem

Let ABCD be a cyclic quadrilateral. We will prove that AC · BD = AB · CD + BC · DA.

Find a point E on BD such that ∠BCA = ∠ECD. Since ∠BAC