1) – volume 2
Chapter 242 16-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
242.1 direct product of modules
Let $\{X_i : i \in I\}$ be a collection of modules in some category of modules. Then the direct product $\prod_{i \in I} X_i$ of that collection is the module whose underlying set is the cartesian product of the $X_i$, with componentwise addition and scalar multiplication. For example, in a category of left modules: $(x_i) + (y_i) = (x_i + y_i)$, $r(x_i) = (r x_i)$.
For each $j \in I$ we have a projection $p_j : \prod_{i \in I} X_i \to X_j$ defined by $(x_i) \mapsto x_j$, and an injection $\lambda_j : X_j \to \prod_{i \in I} X_i$ where an element $x_j$ of $X_j$ maps to the element of $\prod_{i \in I} X_i$ whose $j$th term is $x_j$ and every other term is zero.
The direct product $\prod_{i \in I} X_i$ satisfies a certain universal property. Namely, if $Y$ is a module and there exist homomorphisms $f_i : Y \to X_i$ for all $i \in I$, then there exists a unique homomorphism $\phi : Y \to \prod_{i \in I} X_i$ satisfying $p_i \phi = f_i$ for all $i \in I$; explicitly, $\phi(y) = (f_i(y))_{i \in I}$.
The direct product is often referred to as the complete direct sum, or the strong direct sum, or simply the product.
Compare this to the direct sum of modules. Version: 3 Owner: antizeus Author(s): antizeus
242.2
direct sum
Let $\{X_i : i \in I\}$ be a collection of modules in some category of modules. Then the direct sum $\bigoplus_{i \in I} X_i$ of that collection is the submodule of the direct product of the $X_i$ consisting of all elements $(x_i)$ such that all but a finite number of the $x_i$ are zero.
For each $j \in I$ we have a projection $p_j : \bigoplus_{i \in I} X_i \to X_j$ defined by $(x_i) \mapsto x_j$, and an injection $\lambda_j : X_j \to \bigoplus_{i \in I} X_i$ where an element $x_j$ of $X_j$ maps to the element of $\bigoplus_{i \in I} X_i$ whose $j$th term is $x_j$ and every other term is zero.
The direct sum $\bigoplus_{i \in I} X_i$ satisfies a certain universal property. Namely, if $Y$ is a module and there exist homomorphisms $f_i : X_i \to Y$ for all $i \in I$, then there exists a unique homomorphism $\phi : \bigoplus_{i \in I} X_i \to Y$ satisfying $\phi \lambda_i = f_i$ for all $i \in I$; explicitly, $\phi((x_i)) = \sum_{i \in I} f_i(x_i)$, a sum with only finitely many nonzero terms.
The direct sum is often referred to as the weak direct sum or simply the sum. Compare this to the direct product of modules. Version: 3 Owner: antizeus Author(s): antizeus
242.3
exact sequence
If we have two homomorphisms $f : A \to B$ and $g : B \to C$ in some category of modules, then we say that $f$ and $g$ are exact at $B$ if the image of $f$ is equal to the kernel of $g$. A sequence of homomorphisms
$\cdots \to A_{n+1} \xrightarrow{f_{n+1}} A_n \xrightarrow{f_n} A_{n-1} \to \cdots$
is said to be exact if each pair of adjacent homomorphisms $(f_{n+1}, f_n)$ is exact, in other words if $\operatorname{im} f_{n+1} = \ker f_n$ for all $n$. Compare this to the notion of a chain complex. Version: 2 Owner: antizeus Author(s): antizeus
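As an illustration of exactness, the following minimal sketch checks the condition im f = ker g on the finite sequence 0 → Z/2 → Z/4 → Z/2 → 0. The helper names `image`, `kernel` and `is_exact_at`, and the choice of example maps, are assumptions made for this sketch and are not part of the entry.

```python
def image(f, domain):
    """Set of values f(x) for x in the (finite) domain."""
    return {f(x) for x in domain}

def kernel(g, domain):
    """Elements of the (finite) domain that g sends to 0."""
    return {x for x in domain if g(x) == 0}

def is_exact_at(f, g, A, B):
    """True if im f == ker g, where f: A -> B and g: B -> C."""
    return image(f, A) == kernel(g, B)

Z2, Z4 = range(2), range(4)
f = lambda x: (2 * x) % 4   # Z/2 -> Z/4, x |-> 2x
g = lambda y: y % 2         # Z/4 -> Z/2, reduction mod 2

# Exactness at Z/4: im f = {0, 2} = ker g
print(is_exact_at(f, g, Z2, Z4))
```

Here the elements of Z/n are represented by the integers 0, ..., n−1, so the check is a brute-force comparison of two finite sets.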
242.4
quotient ring
Definition. Let R be a ring and let I be a two-sided ideal of R. To define the quotient ring R/I, let us first define an equivalence relation in R. We say that the elements a, b ∈ R are equivalent, written as a ∼ b, if and only if a − b ∈ I. If a is an element of R, we denote the corresponding equivalence class by [a]. Thus [a] = [b] if and only if a − b ∈ I. The quotient ring of R modulo I is the set R/I = {[a] | a ∈ R}, with a ring structure defined as follows. If [a], [b] are equivalence classes in R/I, then
• [a] + [b] := [a + b],
• [a] · [b] := [a · b].
Here a and b are some elements in R that represent [a] and [b]. By construction, every element in R/I has such a representative in R. Moreover, since I is closed under addition and under multiplication by arbitrary elements of R on either side, one can verify that the ring structure in R/I is well defined.
Properties.
1. If R is commutative, then R/I is commutative.
Examples.
1. For any ring R, we have that R/R = {0} and R/{0} = R.
2. Let R = Z, and let I be the set of even numbers. Then R/I contains only two classes; one for even numbers, and one for odd numbers.
Version: 3 Owner: matte Author(s): matte, djao
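The well-definedness claim can be checked by brute force in the case R = Z, I = nZ. The sketch below is only an illustration: the function names are hypothetical, classes are represented by their remainders 0, ..., n−1, and the check covers a finite window of representatives rather than all of Z.

```python
def quotient_class(a, n):
    """Canonical representative of the class [a] in Z/nZ."""
    return a % n

def add_classes(a, b, n):
    """[a] + [b] := [a + b]."""
    return quotient_class(a + b, n)

def mul_classes(a, b, n):
    """[a] * [b] := [a * b]."""
    return quotient_class(a * b, n)

def is_well_defined(n, span=6):
    """Check on a finite window that shifting a, b by elements of nZ
    does not change [a] + [b] or [a] * [b]."""
    reps = range(-span, span + 1)
    shifts = [k * n for k in range(-2, 3)]
    for a in reps:
        for b in reps:
            for i in shifts:
                for j in shifts:
                    if add_classes(a + i, b + j, n) != add_classes(a, b, n):
                        return False
                    if mul_classes(a + i, b + j, n) != mul_classes(a, b, n):
                        return False
    return True

# Z/2Z has exactly two classes: even numbers and odd numbers.
print(sorted({quotient_class(a, 2) for a in range(-5, 6)}))
print(is_well_defined(2))
```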
Chapter 243 16D10 – General module theory
243.1 annihilator
Let R be a ring. Suppose that M is a left R-module. If X is a subset of M, then we define the left annihilator of X in R: $\operatorname{l.ann}(X) = \{r \in R \mid rx = 0 \text{ for all } x \in X\}$. If Z is a subset of R, then we define the right annihilator of Z in M: $\operatorname{r.ann}_M(Z) = \{m \in M \mid zm = 0 \text{ for all } z \in Z\}$.
Suppose that N is a right R-module. If Y is a subset of N, then we define the right annihilator of Y in R: $\operatorname{r.ann}(Y) = \{r \in R \mid yr = 0 \text{ for all } y \in Y\}$. If Z is a subset of R, then we define the left annihilator of Z in N: $\operatorname{l.ann}_N(Z) = \{n \in N \mid nz = 0 \text{ for all } z \in Z\}$. Version: 3 Owner: antizeus Author(s): antizeus
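For a concrete feel, the annihilator of a subset can be computed directly when R = Z/nZ acts on itself. This is a minimal sketch under that assumption; since Z/nZ is commutative, left and right annihilators coincide here, and the function name `annihilator` is hypothetical.

```python
def annihilator(X, n):
    """Elements r of R = Z/nZ with r * x = 0 for every x in the subset X of Z/nZ."""
    return {r for r in range(n) if all((r * x) % n == 0 for x in X)}

print(sorted(annihilator({4}, 12)))     # multiples of 3 in Z/12Z
print(sorted(annihilator({4, 6}, 12)))  # a larger X gives a smaller annihilator
```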
243.2
annihilator is an ideal
The right annihilator of a right R-module $M_R$ in R is an ideal.
By the distributive law for modules, it is easy to see that $\operatorname{r.ann}(M_R)$ is closed under addition and right multiplication. Now take $x \in \operatorname{r.ann}(M_R)$ and $r \in R$. Take any $m \in M_R$. Then $mr \in M_R$, so $(mr)x = 0$ since $x \in \operatorname{r.ann}(M_R)$. Hence $m(rx) = 0$ and $rx \in \operatorname{r.ann}(M_R)$. An equivalent result holds for left annihilators. Version: 2 Owner: saforres Author(s): saforres
243.3
artinian
A module M is artinian if it satisfies the following equivalent conditions:
• the descending chain condition holds for submodules of M;
• every nonempty family of submodules of M has a minimal element.
A ring R is left artinian if it is artinian as a left module over itself (i.e. if ${}_R R$ is an artinian module), and right artinian if it is artinian as a right module over itself (i.e. if $R_R$ is an artinian module), and simply artinian if both conditions hold. Version: 3 Owner: antizeus Author(s): antizeus
243.4
composition series
Let R be a ring and let M be a (right or left) R-module. A series of submodules $M = M_0 \supset M_1 \supset M_2 \supset \cdots \supset M_n = 0$ in which each quotient $M_i / M_{i+1}$ is simple is called a composition series for M. A module need not have a composition series. For example, the ring of integers, Z, considered as a module over itself, does not have a composition series. A necessary and sufficient condition for a module to have a composition series is that it is both noetherian and artinian. If a module does have a composition series, then all composition series are the same length. This length (the number n above) is called the composition length of the module. If R is a semisimple Artinian ring, then $R_R$ and ${}_R R$ always have composition series. Version: 1 Owner: mclase Author(s): mclase
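As a small worked case, the Z-module Z/nZ (unlike Z itself) does have a composition series, and one can be written down from the prime factorization of n. The sketch below assumes this standard fact about cyclic groups; the function names are hypothetical. Each list entry d stands for the submodule generated by d in Z/nZ, and consecutive quotients have prime order, hence are simple.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, in nondecreasing order."""
    factors, p = [], 2
    while n > 1:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    return factors

def composition_series(n):
    """Generators d_0 | d_1 | ... | d_k = n of a chain
    Z/nZ = <d_0> > <d_1> > ... > <d_k> = 0 with simple quotients."""
    gens = [1]
    for p in prime_factors(n):
        gens.append(gens[-1] * p)
    return gens

series = composition_series(12)
print(series)            # generators of the chain
print(len(series) - 1)   # composition length of Z/12Z
```

The composition length obtained this way is the number of prime factors of n counted with multiplicity.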
243.5
conjugate module
If M is a right module over a ring R, and α is an endomorphism of R, we define the conjugate module $M^\alpha$ to be the right R-module whose underlying set is $\{m^\alpha \mid m \in M\}$, with abelian group structure identical to that of M (i.e. $(m - n)^\alpha = m^\alpha - n^\alpha$), and scalar multiplication given by $m^\alpha \cdot r = (m \cdot \alpha(r))^\alpha$ for all m in M and r in R. In other words, if $\phi : R \to \operatorname{End}_{\mathbb{Z}}(M)$ is the ring homomorphism that describes the right module action of R upon M, then $\phi \circ \alpha$ describes the right module action of R upon $M^\alpha$. If N is a left R-module, we define ${}^\alpha N$ similarly, with $r \cdot {}^\alpha n = {}^\alpha(\alpha(r) \cdot n)$. Version: 4 Owner: antizeus Author(s): antizeus
243.6
modular law
Let ${}_R M$ be a left R-module with submodules A, B, C, and suppose $C \subseteq B$. Then $C + (B \cap A) = B \cap (C + A)$.
Version: 1 Owner: saforres Author(s): saforres
243.7
module
Let R be a ring, and let M be an abelian group. We say that M is a left R-module if there exists a ring homomorphism $\phi : R \to \operatorname{End}_{\mathbb{Z}}(M)$ from R to the ring of abelian group endomorphisms on M (in which multiplication of endomorphisms is composition, using left function notation). We typically denote this function using a multiplication notation: $[\phi(r)](m) = r \cdot m = rm$. This ring homomorphism defines what is called a left module action of R upon M. If R is a unital ring (i.e. a ring with identity), then we typically demand that the ring homomorphism map the unit 1 ∈ R to the identity endomorphism on M, so that 1 · m = m for all m ∈ M. In this case we may say that the module is unital. Typically the abelian group structure on M is expressed in additive terms, i.e. with operator +, identity element $0_M$ (or just 0), and inverses written in the form −m for m ∈ M.
Right module actions are defined similarly, only with the elements of R being written on the right sides of elements of M. In this case we either need to use an anti-homomorphism $R \to \operatorname{End}_{\mathbb{Z}}(M)$, or switch to right notation for writing functions. Version: 7 Owner: antizeus Author(s): antizeus
243.8
proof of modular law
First we show $C + (B \cap A) \subseteq B \cap (C + A)$: Note that $C \subseteq B$, $B \cap A \subseteq B$, and therefore $C + (B \cap A) \subseteq B$. Further, $C \subseteq C + A$, $B \cap A \subseteq A \subseteq C + A$, thus $C + (B \cap A) \subseteq C + A$.
Next we show $B \cap (C + A) \subseteq C + (B \cap A)$: Let $b \in B \cap (C + A)$. Then $b = c + a$ for some $c \in C$ and $a \in A$. Hence $a = b - c$, and so $a \in B$ since $b \in B$ and $c \in C \subseteq B$. Hence $a \in B \cap A$, so $b = c + a \in C + (B \cap A)$. Version: 5 Owner: saforres Author(s): saforres
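The modular law and the role of the hypothesis C ⊆ B can be checked exhaustively on a small example. The following sketch uses the Z-module Z/2 × Z/2, whose five subgroups are found by brute force; the module choice and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

ELEMENTS = list(product(range(2), repeat=2))  # the module Z/2 x Z/2

def add(u, v):
    """Componentwise addition mod 2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def is_subgroup(s):
    """A finite subset containing 0 and closed under addition is a subgroup."""
    return (0, 0) in s and all(add(u, v) in s for u in s for v in s)

def subgroups():
    """All subgroups of Z/2 x Z/2, found by testing every nonempty subset."""
    subs = []
    for k in range(1, len(ELEMENTS) + 1):
        for combo in combinations(ELEMENTS, k):
            s = frozenset(combo)
            if is_subgroup(s):
                subs.append(s)
    return subs

def sum_of(a, b):
    """The submodule sum A + B = {u + v : u in A, v in B}."""
    return frozenset(add(u, v) for u in a for v in b)

def modular_law_holds(a, b, c):
    """Test C + (B ∩ A) == B ∩ (C + A)."""
    return sum_of(c, b & a) == b & sum_of(c, a)

subs = subgroups()
# The identity holds for every triple with C contained in B.
print(all(modular_law_holds(a, b, c)
          for a, b, c in product(subs, repeat=3) if c <= b))
```

Dropping the hypothesis breaks the identity: with A generated by (1,1), B by (1,0) and C by (0,1), the left side is C while the right side is B.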
243.9
zero module
Let R be a ring. The abelian group which contains only an identity element (zero) gains a trivial R-module structure, which we call the zero module. Every R-module M has a zero element and thus a submodule consisting of that element. This is called the zero submodule of M. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 244 16D20 – Bimodules
244.1 bimodule
Suppose that R and S are rings. An (R, S)-bimodule is an abelian group M which has a left R-module action as well as a right S-module action, which satisfy the relation r(ms) = (rm)s for every choice of elements r of R, s of S, and m of M. An (R, S)-subbimodule of M is a subgroup which is also a left R-submodule and a right S-submodule. Version: 3 Owner: antizeus Author(s): antizeus
Chapter 245 16D25 – Ideals
245.1 associated prime
Let R be a ring, and let M be an R-module. A prime ideal P of R is an annihilator prime for M if P is equal to the annihilator of some nonzero submodule X of M. Note that if this is the case, then the module $\operatorname{ann}_M(P)$ contains X, has P as its annihilator, and is a faithful (R/P)-module. If, in addition, P is equal to the annihilator of a submodule of M that is a fully faithful (R/P)-module, then we call P an associated prime of M. Version: 2 Owner: antizeus Author(s): antizeus
245.2
nilpotent ideal
A left (right) ideal I of a ring R is a nilpotent ideal if $I^n = 0$ for some positive integer n. Here $I^n$ denotes a product of ideals, $I \cdot I \cdots I$ (n factors). Version: 2 Owner: antizeus Author(s): antizeus
245.3
primitive ideal
Let R be a ring, and let I be an ideal of R. We say that I is a left (right) primitive ideal if there exists a simple left (right) R-module X such that I is the annihilator of X in R. We say that R is a left (right) primitive ring if the zero ideal is a left (right) primitive ideal of R. Note that I is a left (right) primitive ideal if and only if R/I is a left (right) primitive ring. Version: 2 Owner: antizeus Author(s): antizeus
245.4
product of ideals
Let R be a ring, and let A and B be left (right) ideals of R. Then the product of the ideals A and B, which we denote AB, is the left (right) ideal generated by the products $\{ab \mid a \in A, b \in B\}$. Version: 2 Owner: antizeus Author(s): antizeus
245.5
proper ideal
Suppose R is a ring and I is an ideal of R. We say that I is a proper ideal if I is not equal to R. Version: 2 Owner: antizeus Author(s): antizeus
245.6
semiprime ideal
Let R be a ring. An ideal I of R is a semiprime ideal if it satisfies the following equivalent conditions:
(a) I can be expressed as an intersection of prime ideals of R;
(b) if $x \in R$, and $xRx \subset I$, then $x \in I$;
(c) if J is a two-sided ideal of R and $J^2 \subset I$, then $J \subset I$ as well;
(d) if J is a left ideal of R and $J^2 \subset I$, then $J \subset I$ as well;
(e) if J is a right ideal of R and $J^2 \subset I$, then $J \subset I$ as well.
Here $J^2$ is the product of ideals $J \cdot J$. The ring R itself satisfies all of these conditions (including being expressed as an intersection of an empty family of prime ideals) and is thus semiprime. A ring R is said to be a semiprime ring if its zero ideal is a semiprime ideal. Note that an ideal I of R is semiprime if and only if the quotient ring R/I is a semiprime ring. Version: 7 Owner: antizeus Author(s): antizeus
245.7
zero ideal
In any ring, the set consisting only of the zero element (i.e. the additive identity) is an ideal of the left, right, and two-sided varieties. It is the smallest ideal in any ring. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 246 16D40 – Free, projective, and ﬂat modules and ideals
246.1 ﬁnitely generated projective module
Let R be a unital ring. A finitely generated projective right R-module is of the form $eR^n$, $n \in \mathbb{N}$, where e is an idempotent in $\operatorname{End}_R(R^n)$. Let A be a unital C*-algebra and p be a projection in $\operatorname{End}_A(A^n)$, $n \in \mathbb{N}$. Then $E = pA^n$ is a finitely generated projective right A-module. Further, E is a pre-Hilbert A-module with (A-valued) inner product
$\langle u, v \rangle = \sum_{i=1}^{n} u_i^* v_i, \quad u, v \in E.$
Version: 3 Owner: mhale Author(s): mhale
246.2
ﬂat module
A right module M over a ring R is ﬂat if the tensor product functor M ⊗R (−) is an exact functor. Similarly, a left module N over R is ﬂat if the tensor product functor (−) ⊗R N is an exact functor. Version: 2 Owner: antizeus Author(s): antizeus
246.3
free module
Let R be a commutative ring. A free module over R is a direct sum of copies of R. In particular, as every abelian group is a Z-module, a free abelian group is a direct sum of copies of Z. This is equivalent to saying that the module has a free basis, i.e. a set of elements with the property that every element of the module can be uniquely expressed as a (finite) linear combination over R of elements of the free basis. In the case that a free module over R is a sum of finitely many copies of R, the number of copies is called the rank of the free module. An alternative definition of a free module is via its universal property: Given a set X, the free R-module F(X) on the set X is equipped with a function $i : X \to F(X)$ satisfying the property that for any other R-module A and any function $f : X \to A$, there exists a unique R-module map $h : F(X) \to A$ such that $h \circ i = f$. Version: 4 Owner: mathcam Author(s): mathcam, antizeus
246.4
free module
Let R be a ring. A free module over R is a direct sum of copies of R. Similarly, as an abelian group is simply a module over Z, a free abelian group is a direct sum of copies of Z. This is equivalent to saying that the module has a free basis, i.e. a set of elements with the property that every element of the module can be uniquely expressed as a (finite) linear combination over R of elements of the free basis. Version: 1 Owner: antizeus Author(s): antizeus
246.5
projective cover
Let X and P be modules. We say that P is a projective cover of X if P is a projective module and there exists an epimorphism $p : P \to X$ such that $\ker p$ is a superfluous submodule of P. Equivalently, P is a projective cover of X if P is projective, and there is an epimorphism $p : P \to X$, and if $g : P' \to X$ is an epimorphism from a projective module $P'$ to X, then there exists an epimorphism $h : P' \to P$ such that $ph = g$.
(Diagram: $h : P' \to P$, $g : P' \to X \to 0$ and $p : P \to X \to 0$, commuting as $ph = g$.)
Version: 2 Owner: antizeus Author(s): antizeus
246.6
projective module
A module P is projective if it satisfies the following equivalent conditions:
(a) Every short exact sequence of the form $0 \to A \to B \to P \to 0$ is split;
(b) The functor $\operatorname{Hom}(P, -)$ is exact;
(c) If $f : X \to Y$ is an epimorphism and there exists a homomorphism $g : P \to Y$, then there exists a homomorphism $h : P \to X$ such that $fh = g$. (Diagram: $h : P \to X$, $g : P \to Y$ and $f : X \to Y \to 0$, commuting as $fh = g$.)
(d) The module P is a direct summand of a free module.
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 247 16D50 – Injective modules, self-injective rings
247.1 injective hull
Let X and Q be modules. We say that Q is an injective hull or injective envelope of X if Q is both an injective module and an essential extension of X. Equivalently, Q is an injective hull of X if Q is injective, and X is a submodule of Q, and if $g : X \to Q'$ is a monomorphism from X to an injective module $Q'$, then there exists a monomorphism $h : Q \to Q'$ such that $h(x) = g(x)$ for all $x \in X$.
(Diagram: the inclusion $i : X \to Q$ and $g : X \to Q'$, both preceded by $0 \to X$, with $h : Q \to Q'$ commuting as $hi = g$.)
Version: 2 Owner: antizeus Author(s): antizeus
247.2
injective module
A module Q is injective if it satisfies the following equivalent conditions:
(a) Every short exact sequence of the form $0 \to Q \to B \to C \to 0$ is split;
(b) The functor $\operatorname{Hom}(-, Q)$ is exact;
(c) If $f : X \to Y$ is a monomorphism and there exists a homomorphism $g : X \to Q$, then there exists a homomorphism $h : Y \to Q$ such that $hf = g$. (Diagram: $0 \to X$, $f : X \to Y$, $g : X \to Q$ and $h : Y \to Q$, commuting as $hf = g$.)
Version: 3 Owner: antizeus Author(s): antizeus
Chapter 248 16D60 – Simple and semisimple modules, primitive rings and ideals
248.1 central simple algebra
Let K be a field. A central simple algebra A (over K) is an algebra A over K, which is finite dimensional as a vector space over K, such that
• A has an identity element, as a ring
• A is central: the center of A equals K (for all z ∈ A, we have z · a = a · z for all a ∈ A if and only if z ∈ K)
• A is simple: for any two-sided ideal I of A, either I = {0} or I = A
By a theorem of Brauer, for every central simple algebra A over K, there exists a unique (up to isomorphism) division ring D containing K and a unique natural number n such that A is isomorphic to the ring of n × n matrices with coefficients in D. Version: 2 Owner: djao Author(s): djao
248.2
completely reducible
A module M is called completely reducible (or semisimple) if it is a direct sum of irreducible (or simple) modules. Version: 1 Owner: bwebste Author(s): bwebste
248.3
simple ring
A nonzero ring R is said to be a simple ring if it has no (two-sided) ideal other than the zero ideal and R itself. This is equivalent to saying that the zero ideal is a maximal ideal. If R is a commutative ring with unit, then this is equivalent to being a field. Version: 4 Owner: antizeus Author(s): antizeus
Chapter 249 16D80 – Other classes of modules and ideals
249.1 essential submodule
Let X be a submodule of a module Y. We say that X is an essential submodule of Y, and that Y is an essential extension of X, if whenever A is a nonzero submodule of Y, then $A \cap X$ is also nonzero. A monomorphism $f : X \to Y$ is an essential monomorphism if the image $\operatorname{im} f$ is an essential submodule of Y. Version: 2 Owner: antizeus Author(s): antizeus
249.2
faithful module
Let R be a ring, and let M be an R-module. We say that M is a faithful R-module if its annihilator $\operatorname{ann}_R(M)$ is the zero ideal. We say that M is a fully faithful R-module if every nonzero R-submodule of M is faithful. Version: 3 Owner: antizeus Author(s): antizeus
249.3
minimal prime ideal
A prime ideal P of a ring R is called a minimal prime ideal if it does not properly contain any other prime ideal of R. If R is a prime ring, then the zero ideal is a prime ideal, and is thus the unique minimal prime ideal of R. Version: 2 Owner: antizeus Author(s): antizeus
249.4
module of ﬁnite rank
Let M be a module, and let E(M) be the injective hull of M. Then we say that M has finite rank if E(M) is a finite direct sum of indecomposable submodules. This turns out to be equivalent to the property that M contains no infinite direct sum of nonzero submodules. Version: 3 Owner: antizeus Author(s): antizeus
249.5
simple module
Let R be a ring, and let M be an R-module. We say that M is a simple or irreducible module if it contains no submodules other than itself and the zero module. Version: 2 Owner: antizeus Author(s): antizeus
249.6
superﬂuous submodule
Let X be a submodule of a module Y . We say that X is a superﬂuous submodule of Y if whenever A is a submodule of Y such that A + X = Y , then A = Y . Version: 2 Owner: antizeus Author(s): antizeus
249.7
uniform module
A module M is said to be uniform if any two nonzero submodules of M must have a nonzero intersection. This is equivalent to saying that any nonzero submodule is an essential submodule. Version: 3 Owner: antizeus Author(s): antizeus
Chapter 250 16E05 – Syzygies, resolutions, complexes
250.1 n-chain
An n-chain on a topological space X is a finite formal sum of n-simplices on X. The group of such chains is denoted $C_n(X)$. For a CW-complex Y, $C_n(Y) = H_n(Y^n, Y^{n-1})$, where $H_n$ denotes the nth homology group. The boundary of an n-chain is the (n − 1)-chain given by the formal sum of the boundaries of its constituent simplices. An n-chain is closed if its boundary is 0 and exact if it is the boundary of some (n + 1)-chain. Version: 3 Owner: mathcam Author(s): mathcam
250.2
chain complex
A sequence of modules and homomorphisms
$\cdots \to A_{n+1} \xrightarrow{d_{n+1}} A_n \xrightarrow{d_n} A_{n-1} \to \cdots$
is said to be a chain complex or complex if each pair of adjacent homomorphisms $(d_{n+1}, d_n)$ satisfies the relation $d_n d_{n+1} = 0$. This is equivalent to saying that $\operatorname{im} d_{n+1} \subset \ker d_n$. We often denote such a complex as (A, d) or simply A. Compare this to the notion of an exact sequence. Version: 4 Owner: antizeus Author(s): antizeus
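The difference between a complex and an exact sequence can be seen on a small example. The sketch below takes every module to be Z/8 with every differential multiplication by 4 (a choice made for this illustration; the helper names are hypothetical): the composite of two differentials is zero, yet the image is strictly smaller than the kernel.

```python
def mult_map(k, n):
    """The homomorphism Z/n -> Z/n given by x |-> k*x."""
    return lambda x: (k * x) % n

def is_complex(d_next, d, domain):
    """True if d(d_next(x)) = 0 for every x, i.e. im d_next is inside ker d."""
    return all(d(d_next(x)) == 0 for x in domain)

def is_exact(d_next, d, domain, middle):
    """True if im d_next equals ker d."""
    return {d_next(x) for x in domain} == {y for y in middle if d(y) == 0}

Z8 = range(8)
d = mult_map(4, 8)
print(is_complex(d, d, Z8))    # 4 * 4 = 16 = 0 in Z/8
print(is_exact(d, d, Z8, Z8))  # im = {0, 4}, ker = {0, 2, 4, 6}
```

Replacing Z/8 and 4 by Z/4 and 2 gives a sequence that is exact as well, since then the image and the kernel are both {0, 2}.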
250.3
ﬂat resolution
Let M be a module. A ﬂat resolution of M is an exact sequence of the form · · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0 where each Fn is a ﬂat module. Version: 2 Owner: antizeus Author(s): antizeus
250.4
free resolution
Let M be a module. A free resolution of M is an exact sequence of the form · · · → Fn → Fn−1 → · · · → F1 → F0 → M → 0 where each Fn is a free module. Version: 2 Owner: antizeus Author(s): antizeus
250.5
injective resolution
Let M be a module. An injective resolution of M is an exact sequence of the form 0 → M → Q0 → Q1 → · · · → Qn−1 → Qn → · · · where each Qn is an injective module. Version: 2 Owner: antizeus Author(s): antizeus
250.6
projective resolution
Let M be a module. A projective resolution of M is an exact sequence of the form · · · → Pn → Pn−1 → · · · → P1 → P0 → M → 0 where each Pn is a projective module. Version: 2 Owner: antizeus Author(s): antizeus
250.7
short exact sequence
A short exact sequence is an exact sequence of the form 0 → A → B → C → 0. Note that in this case, the homomorphism A → B must be a monomorphism, and the homomorphism B → C must be an epimorphism. Version: 2 Owner: antizeus Author(s): antizeus
250.8
split short exact sequence
In an abelian category, a short exact sequence $0 \to A \xrightarrow{f} B \xrightarrow{g} C \to 0$ is split if it satisfies the following equivalent conditions:
(a) there exists a homomorphism $h : C \to B$ such that $gh = 1_C$;
(b) there exists a homomorphism $j : B \to A$ such that $jf = 1_A$;
(c) B is isomorphic to the direct sum $A \oplus C$.
In this case, we say that h and j are backmaps or splitting backmaps. Version: 4 Owner: antizeus Author(s): antizeus
250.9
von Neumann regular
An element a of a ring R is said to be von Neumann regular if there exists b ∈ R such that aba = a. A ring R is said to be a von Neumann regular ring (or simply a regular ring, if the meaning is clear from context) if every element of R is von Neumann regular. Note that regular ring in the sense of von Neumann should not be confused with regular ring in the sense of commutative algebra. Version: 1 Owner: igor Author(s): igor
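The defining condition aba = a is easy to test exhaustively in a finite ring. The following minimal sketch does so for R = Z/nZ; the function names are hypothetical and the choice of Z/nZ is an assumption made for illustration.

```python
def is_von_neumann_regular(a, n):
    """True if some b in Z/nZ satisfies a*b*a = a."""
    return any((a * b * a) % n == a % n for b in range(n))

def is_regular_ring(n):
    """True if every element of Z/nZ is von Neumann regular."""
    return all(is_von_neumann_regular(a, n) for a in range(n))

print(is_regular_ring(6))               # every element of Z/6Z is regular
print(is_von_neumann_regular(2, 4))     # 2*b*2 = 4b = 0 in Z/4Z, never 2
```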
Chapter 251 16K20 – Finite-dimensional
251.1 quaternion algebra
A quaternion algebra over a field K is a central simple algebra over K which is four dimensional as a vector space over K. Examples:
• For any field K, the ring $M_{2 \times 2}(K)$ of 2 × 2 matrices with entries in K is a quaternion algebra over K. If K is algebraically closed, then all quaternion algebras over K are isomorphic to $M_{2 \times 2}(K)$.
• For K = R, the well known algebra H of Hamiltonian quaternions is a quaternion algebra over R. The two algebras H and $M_{2 \times 2}(\mathbb{R})$ are the only quaternion algebras over R, up to isomorphism.
• When K is a number field, there are infinitely many non-isomorphic quaternion algebras over K. In fact, there is one such quaternion algebra for every even sized finite collection of finite primes or real primes of K. The proof of this deep fact leads to many of the major results of class field theory.
Version: 1 Owner: djao Author(s): djao
Chapter 252 16K50 – Brauer groups
252.1 Brauer group
Let K be a field. The Brauer group Br(K) of K is the set of all equivalence classes of central simple algebras over K, where two central simple algebras A and B are equivalent if there exists a division ring D over K and natural numbers n, m such that A (resp. B) is isomorphic to the ring of n × n (resp. m × m) matrices with coefficients in D. The group operation in Br(K) is given by tensor product: for any two central simple algebras A, B over K, their product in Br(K) is the class of the central simple algebra $A \otimes_K B$. The identity element in Br(K) is the class of K itself, and the inverse of the class of a central simple algebra A is the class of the opposite algebra $A^{\mathrm{opp}}$ defined by reversing the order of the multiplication operation of A. Version: 5 Owner: djao Author(s): djao
Chapter 253 16K99 – Miscellaneous
253.1 division ring
A division ring is a ring D with identity such that
• 1 ≠ 0
• For all nonzero a ∈ D, there exists b ∈ D with a · b = b · a = 1
A field is equivalent to a commutative division ring. Version: 3 Owner: djao Author(s): djao
Chapter 254 16N20 – Jacobson radical, quasi-multiplication
254.1 Jacobson radical
The Jacobson radical J(R) of a ring R is the intersection of the annihilators of irreducible left R-modules. The following are alternate characterizations of the Jacobson radical J(R):
1. The intersection of all left primitive ideals.
2. The intersection of all maximal left ideals.
3. The set of all t ∈ R such that for all r ∈ R, 1 − rt is left invertible (i.e. there exists u such that u(1 − rt) = 1).
4. The largest ideal I such that for all v ∈ I, 1 − v is a unit in R.
5. (1)–(3) with “left” replaced by “right” and rt replaced by tr.
Note that if R is commutative and finitely generated, then $J(R) = \{x \in R \mid x^n = 0 \text{ for some } n \in \mathbb{N}\} = \operatorname{Nil}(R)$. Version: 13 Owner: saforres Author(s): saforres
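For a finite commutative example, the characterizations can be compared by direct computation. The sketch below works in R = Z/nZ, assuming the standard facts that its units are the residues coprime to n and its maximal ideals are generated by the primes dividing n; all function names are hypothetical.

```python
from math import gcd

def units(n):
    """Units of Z/nZ: residues coprime to n."""
    return {a for a in range(n) if gcd(a, n) == 1}

def jacobson_radical_by_units(n):
    """Characterization 3: all t such that 1 - r*t is invertible for every r."""
    u = units(n)
    return {t for t in range(n) if all((1 - r * t) % n in u for r in range(n))}

def jacobson_radical_by_maximal_ideals(n):
    """Characterization 2: intersection of the maximal ideals pZ/nZ, p prime, p | n."""
    primes = [p for p in range(2, n + 1)
              if n % p == 0 and all(p % q for q in range(2, p))]
    return {a for a in range(n) if all(a % p == 0 for p in primes)}

def nilradical(n):
    """Nilpotent elements of Z/nZ."""
    return {a for a in range(n) if any(pow(a, k, n) == 0 for k in range(1, n + 1))}

print(sorted(jacobson_radical_by_units(12)))
print(sorted(jacobson_radical_by_maximal_ideals(12)))
print(sorted(nilradical(12)))
```

All three sets agree for n = 12, which is consistent with the closing remark that J(R) = Nil(R) in the commutative, finitely generated case.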
254.2
a ring modulo its Jacobson radical is semiprimitive
Let R be a ring. Then J(R/J(R)) = (0).
Let [u] ∈ J(R/J(R)). Then by one of the alternate characterizations of the Jacobson radical, 1 − [r][u] is left invertible for all r ∈ R, so there exists v ∈ R such that [v](1 − [r][u]) = 1.
Then v(1 − ru) = 1 − a for some a ∈ J(R). So wv(1 − ru) = 1, since w(1 − a) = 1 for some w ∈ R. Since this holds for all r ∈ R, we have u ∈ J(R), and hence [u] = 0. Version: 3 Owner: saforres Author(s): saforres
254.3
examples of semiprimitive rings
Examples of semiprimitive rings:
The integers Z:
Since Z is commutative, any left ideal is two-sided. So the maximal left ideals of Z are the maximal ideals of Z, which are the ideals pZ for p prime. Note that $p\mathbb{Z} \cap q\mathbb{Z} = pq\mathbb{Z}$ for distinct primes p and q, so a nonzero integer can lie in only finitely many of the ideals pZ. Hence $J(\mathbb{Z}) = \bigcap_p p\mathbb{Z} = (0)$.
A matrix ring $M_n(D)$ over a division ring D:
The ring $M_n(D)$ is simple, so the only proper ideal is (0). Thus $J(M_n(D)) = (0)$.
A polynomial ring R[x] over a domain R:
Take a ∈ J(R[x]) with a ≠ 0. Then ax ∈ J(R[x]), since J(R[x]) is an ideal, and deg(ax) ≥ 1. By one of the alternate characterizations of the Jacobson radical, 1 − ax is a unit. But deg(1 − ax) = max{deg(1), deg(ax)} ≥ 1. So 1 − ax is not a unit, and by this contradiction we see that J(R[x]) = (0). Version: 5 Owner: saforres Author(s): saforres
254.4
proof of Characterizations of the Jacobson radical
First, note that by definition a left primitive ideal is the annihilator of an irreducible left R-module, so clearly characterization 1) is equivalent to the definition of the Jacobson radical. Next, we will prove cyclical containment. Observe that 5) follows after the equivalence of 1)–4) is established, since 4) is independent of the choice of left or right ideals.
1) ⊂ 2): We know that every left primitive ideal is the largest ideal contained in a maximal left ideal. So the intersection of all left primitive ideals will be contained in the intersection of all maximal left ideals.
2) ⊂ 3): Let S = {M : M a maximal left ideal of R} and take r ∈ R. Let $t \in \bigcap_{M \in S} M$. Then $rt \in \bigcap_{M \in S} M$. Assume 1 − rt is not left invertible; then R(1 − rt) is a proper left ideal, so there exists a maximal left ideal $M_0$ of R such that $R(1 - rt) \subseteq M_0$. Note then that $1 - rt \in M_0$. Also, by definition of t, we have $rt \in M_0$. Therefore $1 \in M_0$; this contradiction implies 1 − rt is left invertible.
3) ⊂ 4): We claim that 3) satisfies the condition of 4). Let K = {t ∈ R : 1 − rt is left invertible for all r ∈ R}. We shall first show that K is an ideal. Clearly if t ∈ K, then rt ∈ K. If $t_1, t_2 \in K$, then $1 - r(t_1 + t_2) = (1 - rt_1) - rt_2$. Now there exists $u_1$ such that $u_1(1 - rt_1) = 1$, hence $u_1((1 - rt_1) - rt_2) = 1 - u_1 r t_2$. Similarly, there exists $u_2$ such that $u_2(1 - u_1 r t_2) = 1$, therefore $u_2 u_1 (1 - r(t_1 + t_2)) = 1$. Hence $t_1 + t_2 \in K$.
Now if t ∈ K, r ∈ R, to show that tr ∈ K it suffices to show that 1 − tr is left invertible. Suppose u(1 − rt) = 1, hence u − urt = 1, then tur − turtr = tr. So (1 + tur)(1 − tr) = 1 + tur − tr − turtr = 1. Therefore K is an ideal.
Now let v ∈ K. Then there exists u such that u(1 − v) = 1, hence 1 − u = −uv ∈ K, so u = 1 − (1 − u) is left invertible. So there exists w such that wu = 1, hence wu(1 − v) = w, then 1 − v = w. Thus (1 − v)u = 1 and therefore 1 − v is a unit.
Let J be the largest ideal such that, for all v ∈ J, 1 − v is a unit. We claim that K ⊆ J. Suppose this were not true; in this case K + J strictly contains J. Consider rx + sy ∈ K + J with x ∈ K, y ∈ J and r, s ∈ R. Now 1 − (rx + sy) = (1 − rx) − sy, and since rx ∈ K, then 1 − rx = u for some unit u ∈ R. So $1 - (rx + sy) = u - sy = u(1 - u^{-1}sy)$, and clearly $u^{-1}sy \in J$ since y ∈ J. Hence $1 - u^{-1}sy$ is also a unit, and thus 1 − (rx + sy) is a unit. Thus 1 − v is a unit for all v ∈ K + J. But this contradicts the assumption that J is the largest such ideal. So we must have K ⊆ J.
4) ⊂ 1): We must show that if I is an ideal such that for all u ∈ I, 1 − u is a unit, then $I \subset \operatorname{ann}({}_R M)$ for every irreducible left R-module ${}_R M$. Suppose this is not the case, so there exists ${}_R M$ such that $I \not\subset \operatorname{ann}({}_R M)$. Now we know that $\operatorname{ann}({}_R M)$ is the largest ideal inside some maximal left ideal J of R. Thus we must also have $I \not\subset J$, or else this would contradict the maximality of $\operatorname{ann}({}_R M)$ inside J. But since $I \not\subset J$, then by maximality I + J = R, hence there exist u ∈ I and v ∈ J such that u + v = 1. Then v = 1 − u, so v is a unit and J = R. But since J is a proper left ideal, this is a contradiction. Version: 25 Owner: saforres Author(s): saforres
254.5
properties of the Jacobson radical
Theorem: Let R, T be rings and ϕ : R → T be a surjective homomorphism. Then ϕ(J(R)) ⊆ J(T).
We shall use the characterization of the Jacobson radical as the set of all a ∈ R such that for all r ∈ R, 1 − ra is left invertible.
Let a ∈ J(R), t ∈ T. We claim that 1 − tϕ(a) is left invertible: Since ϕ is surjective, t = ϕ(r) for some r ∈ R. Since a ∈ J(R), we know 1 − ra is left invertible, so there exists u ∈ R such that u(1 − ra) = 1. Then we have ϕ(u)(ϕ(1) − ϕ(r)ϕ(a)) = ϕ(u)ϕ(1 − ra) = ϕ(1) = 1. So ϕ(a) ∈ J(T) as required.
Theorem: Let R, T be rings. Then J(R × T) ⊆ J(R) × J(T).
Let $\pi_1 : R \times T \to R$ be a (surjective) projection. By the previous theorem, $\pi_1(J(R \times T)) \subseteq J(R)$.
Similarly let $\pi_2 : R \times T \to T$ be a (surjective) projection. We see that $\pi_2(J(R \times T)) \subseteq J(T)$. Now take (a, b) ∈ J(R × T). Note that $a = \pi_1(a, b) \in J(R)$ and $b = \pi_2(a, b) \in J(T)$. Hence (a, b) ∈ J(R) × J(T) as required. Version: 8 Owner: saforres Author(s): saforres
254.6
quasiregularity
An element x of a ring is called right quasiregular [resp. left quasiregular] if there is an element y in the ring such that x + y + xy = 0 [resp. x + y + yx = 0]. For calculations with quasiregularity, it is useful to introduce the operation ∗ defined by x ∗ y = x + y + xy. Thus x is right quasiregular if there is an element y such that x ∗ y = 0. The operation ∗ is easily demonstrated to be associative, and x ∗ 0 = 0 ∗ x = x for all x.
An element x is called quasiregular if it is both left and right quasiregular. In this case, there are elements y and z such that x + y + xy = 0 = x + z + zx (equivalently, x ∗ y = z ∗ x = 0). A calculation shows that y = 0 ∗ y = (z ∗ x) ∗ y = z ∗ (x ∗ y) = z ∗ 0 = z. So y = z is a unique element, depending on x, called the quasi-inverse of x.
An ideal (one- or two-sided) of a ring is called quasiregular if each of its elements is quasiregular. Similarly, a ring is called quasiregular if each of its elements is quasiregular (such rings cannot have an identity element).
Lemma 1. Let A be an ideal (one- or two-sided) in a ring R. If each element of A is right quasiregular, then A is a quasiregular ideal.
This lemma means that there is no extra generality gained in defining terms such as right quasiregular left ideal, etc. Quasiregularity is important because it provides elementary characterizations of the Jacobson radical for rings without an identity element:
• The Jacobson radical of a ring is the sum of all quasiregular left (or right) ideals.
• The Jacobson radical of a ring is the largest quasiregular ideal of the ring.
For rings with an identity element, note that x is [right, left] quasiregular if and only if 1 + x is [right, left] invertible in the ring. Version: 1 Owner: mclase Author(s): mclase
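The operation x ∗ y = x + y + xy and the link between quasiregularity and invertibility of 1 + x can be checked by brute force in a small ring. The following is only a sketch: the ring Z/8Z and the helper names `star` and `quasi_inverse` are choices made here for illustration, not part of the entry.

```python
from math import gcd

def star(x, y, n):
    """The operation x * y = x + y + xy, computed in the ring Z/nZ."""
    return (x + y + x * y) % n

def quasi_inverse(x, n):
    """Return y with x * y = 0 in Z/nZ, or None if x is not quasiregular."""
    for y in range(n):
        if star(x, y, n) == 0:
            return y
    return None

n = 8  # illustrative modulus
for x in range(n):
    # 0 is a two-sided identity for the operation.
    assert star(x, 0, n) == x and star(0, x, n) == x
    # In a ring with identity, x is quasiregular exactly when 1 + x is a unit.
    assert (quasi_inverse(x, n) is not None) == (gcd(1 + x, n) == 1)

print(quasi_inverse(2, 8))  # 2, since 2 + 2 + 2*2 = 8 is 0 modulo 8
```

Since Z/8Z is commutative, left and right quasiregularity coincide here; a noncommutative test ring would be needed to see the two notions separately.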
254.7
semiprimitive ring
A ring R is said to be semiprimitive (sometimes semisimple) if its Jacobson radical is the zero ideal. Any simple ring is automatically semiprimitive. A finite direct product of matrix rings over division rings can be shown to be semiprimitive and both left and right Artinian. The Artin-Wedderburn Theorem states that any semiprimitive ring which is left or right Artinian is isomorphic to a finite direct product of matrix rings over division rings. Version: 11 Owner: saforres Author(s): saforres
Chapter 255 16N40 – Nil and nilpotent radicals, sets, ideals, rings
255.1 Koethe conjecture
The Koethe Conjecture is the statement that for any pair of nil right ideals A and B in any ring R, the sum A + B is also nil. If either of A or B is a two-sided ideal, it is easy to see that A + B is nil. Suppose A is a two-sided ideal, and let x ∈ A + B. The quotient (A + B)/A is nil since it is a homomorphic image of B. So there is an n > 0 with x^n ∈ A. Then there is an m > 0 such that (x^n)^m = x^{nm} = 0, because A is nil. In particular, this means that the Koethe conjecture is true for commutative rings. It has been shown to be true for many classes of rings, but the general statement is still unproven, and no counterexample has been found. Version: 1 Owner: mclase Author(s): mclase
255.2
nil and nilpotent ideals
An element x of a ring is nilpotent if x^n = 0 for some positive integer n. A ring R is nil if every element in R is nilpotent. Similarly, a one- or two-sided ideal is called nil if each of its elements is nilpotent. A ring R [resp. a one- or two-sided ideal A] is nilpotent if R^n = 0 [resp. A^n = 0] for some positive integer n.
A ring or an ideal is locally nilpotent if every finitely generated subring is nilpotent. The following implications hold for rings (or ideals): nilpotent ⇒ locally nilpotent ⇒ nil
Version: 3 Owner: mclase Author(s): mclase
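As a small illustration of nilpotent elements, the nilpotent elements of Z/nZ can be listed by direct computation. This is a sketch only; the function name `nilpotency_index` and the moduli 12 and 8 are assumptions made for the example.

```python
def nilpotency_index(x, n):
    """Smallest k >= 1 with x**k = 0 in Z/nZ, or None if x is not nilpotent."""
    power = x % n
    for k in range(1, n + 1):
        if power == 0:
            return k
        power = (power * x) % n
    return None

# Nilpotent elements of Z/12Z: exactly the multiples of 6.
print([x for x in range(12) if nilpotency_index(x, 12) is not None])  # [0, 6]

# In Z/8Z the even residues form a nil ideal; the largest index that occurs is 3.
print({x: nilpotency_index(x, 8) for x in (0, 2, 4, 6)})  # {0: 1, 2: 3, 4: 2, 6: 3}
```

In the second example the ideal A = {0, 2, 4, 6} is even nilpotent, since any product of three even residues is divisible by 8, so A^3 = 0.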
Chapter 256 16N60 – Prime and semiprime rings
256.1 prime ring
A ring R is said to be a prime ring if the zero ideal is a prime ideal. If R is commutative, this is equivalent to being an integral domain. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 257 16N80 – General radicals and rings
257.1 prime radical
The prime radical of a ring R is the intersection of all the prime ideals of R. Note that the prime radical is the smallest semiprime ideal of R, and that R is a semiprime ring if and only if its prime radical is the zero ideal. Version: 2 Owner: antizeus Author(s): antizeus
257.2
radical theory
Let x◦ be a property which defines a class of rings, which we will call the x◦-rings. Then x◦ is a radical property if it satisfies:
1. The class of x◦-rings is closed under homomorphic images.
2. Every ring R has a largest ideal in the class of x◦-rings; this ideal is written x◦(R).
3. x◦(R/x◦(R)) = 0.
Note: it is extremely important when interpreting the above definition that your definition of a ring does not require an identity element. The ideal x◦(R) is called the x◦-radical of R. A ring is called x◦-radical if x◦(R) = R, and is called x◦-semisimple if x◦(R) = 0. If x◦ is a radical property, then the class of x◦-rings is also called the class of x◦-radical rings.
The class of x◦-radical rings is closed under ideal extensions. That is, if A is an ideal of R, and A and R/A are x◦-radical, then so is R. Radical theory is the study of radical properties and their interrelations. There are several well-known radicals which are of independent interest in ring theory (See examples – to follow). The class of all radicals is however very large. Indeed, it is possible to show that any partition of the class of simple rings into two classes, R and S, gives rise to a radical x◦ with the property that all rings in R are x◦-radical and all rings in S are x◦-semisimple. A radical x◦ is hereditary if every ideal of an x◦-radical ring is also x◦-radical. A radical x◦ is supernilpotent if the class of x◦-rings contains all nilpotent rings. Version: 2 Owner: mclase Author(s): mclase
Chapter 258 16P40 – Noetherian rings and modules
258.1 Noetherian ring
A ring R is right noetherian (or left noetherian ) if R is noetherian as a right module (or left module ), i.e., if the three equivalent conditions hold: 1. right ideals (or left ideals) are ﬁnitely generated 2. the ascending chain condition holds on right ideals (or left ideals) 3. every nonempty family of right ideals (or left ideals) has a maximal element. We say that R is noetherian if it is both left noetherian and right noetherian. Examples of Noetherian rings include any ﬁeld (as the only ideals are 0 and the whole ring) and the ring Z of integers (each ideal is generated by a single integer, the greatest common divisor of the elements of the ideal). The Hilbert basis theorem says that a ring R is noetherian iﬀ the polynomial ring R[x] is. Version: 10 Owner: KimJ Author(s): KimJ
258.2
noetherian
A module M is noetherian if it satisfies the following equivalent conditions:
• the ascending chain condition holds for submodules of M;
• every nonempty family of submodules of M has a maximal element;
• every submodule of M is finitely generated.
A ring R is left noetherian if it is noetherian as a left module over itself (i.e. if _R R is a noetherian module), and right noetherian if it is noetherian as a right module over itself (i.e. if R_R is a noetherian module), and simply noetherian if both conditions hold. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 259 16P60 – Chain conditions on annihilators and summands: Goldie-type conditions, Krull dimension
259.1 Goldie ring
Let R be a ring. If the set of annihilators {r.ann(x) | x ∈ R} satisfies the ascending chain condition, then R is said to satisfy the ascending chain condition on right annihilators. A ring R is called a right Goldie ring if it satisfies the ascending chain condition on right annihilators and R_R is a module of finite rank. Left Goldie ring is defined similarly. If the context makes it clear on which side the ring operates, then such a ring is simply called a Goldie ring. A right noetherian ring is right Goldie. Version: 3 Owner: mclase Author(s): mclase
259.2
uniform dimension
Let M be a module over a ring R, and suppose that M contains no infinite direct sums of nonzero submodules. (This is the same as saying that M is a module of finite rank.) Then there exists an integer n such that M contains an essential submodule N where N = U1 ⊕ U2 ⊕ · · · ⊕ Un is a direct sum of n uniform submodules. This number n does not depend on the choice of N or the decomposition into uniform submodules. We call n the uniform dimension of M. Sometimes this is written u-dim M = n. If R is a field K, and M is a finite-dimensional vector space over K, then u-dim M = dim_K M. Moreover, u-dim M = 0 if and only if M = 0. Version: 3 Owner: mclase Author(s): mclase
Chapter 260 16S10 – Rings determined by universal properties (free algebras, coproducts, adjunction of inverses, etc.)
260.1 Ore domain
Let R be a domain. We say that R is a right Ore domain if any two nonzero elements of R have a nonzero common right multiple, i.e. for every pair of nonzero x and y, there exists a pair of elements r and s of R such that xr = ys ≠ 0. This condition turns out to be equivalent to the following conditions on R when viewed as a right R-module: (a) R_R is a uniform module. (b) R_R is a module of finite rank. The definition of a left Ore domain is similar. If R is a commutative domain, then it is a right (and left) Ore domain. Version: 6 Owner: antizeus Author(s): antizeus
Chapter 261 16S34 – Group rings, Laurent polynomial rings
261.1 support
Let R[G] be the group ring of a group G over a ring R. Let x = Σ_g x_g g be an element of R[G]. The support of x, often written supp(x), is the set of elements of G which occur with nonzero coefficient in the expansion of x. Thus: supp(x) = {g ∈ G | x_g ≠ 0}. Version: 2 Owner: mclase Author(s): mclase
Chapter 262 16S36 – Ordinary and skew polynomial rings and semigroup rings
262.1 Gaussian polynomials
For an indeterminate u and integers n ≥ m ≥ 0 we define the following:
(a) $(m)_u = u^{m-1} + u^{m-2} + \cdots + 1$ for m > 0,
(b) $(m!)_u = (m)_u (m-1)_u \cdots (1)_u$ for m > 0, and $(0!)_u = 1$,
(c) $\binom{n}{m}_u = \dfrac{(n!)_u}{(m!)_u \, ((n-m)!)_u}$.
If m > n then we define $\binom{n}{m}_u = 0$.
The expressions $\binom{n}{m}_u$ are called u-binomial coefficients or Gaussian polynomials.
Note: if we replace u with 1, then we obtain the familiar integers, factorials, and binomial coefficients. Specifically,
(a) $(m)_1 = m$,
(b) $(m!)_1 = m!$,
(c) $\binom{n}{m}_1 = \binom{n}{m}$.
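The definitions (a) to (c) can be followed literally with polynomial arithmetic on coefficient lists. The sketch below is one possible way to do this; the function names and the list representation (lowest degree first) are assumptions of the example, and the exact division relies on the denominators being monic.

```python
from math import comb

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_divexact(p, d):
    """Divide p by a monic polynomial d, assuming the division is exact."""
    p = p[:]
    quot = [0] * (len(p) - len(d) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = p[k + len(d) - 1]          # d is monic, so no division is needed
        for j, b in enumerate(d):
            p[k + j] -= quot[k] * b
    assert not any(p), "division was not exact"
    return quot

def u_factorial(m):
    """(m!)_u as a coefficient list, using (k)_u = 1 + u + ... + u^(k-1)."""
    result = [1]
    for k in range(1, m + 1):
        result = poly_mul(result, [1] * k)
    return result

def gaussian_binomial(n, m):
    """The u-binomial coefficient of n over m, as a coefficient list in u."""
    if m > n:
        return [0]
    denom = poly_mul(u_factorial(m), u_factorial(n - m))
    return poly_divexact(u_factorial(n), denom)

print(gaussian_binomial(4, 2))  # [1, 1, 2, 1, 1], i.e. 1 + u + 2u^2 + u^3 + u^4

# Substituting u = 1 (summing the coefficients) recovers the ordinary binomial coefficient.
assert all(sum(gaussian_binomial(n, m)) == comb(n, m)
           for n in range(7) for m in range(n + 1))
```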
Version: 3 Owner: antizeus Author(s): antizeus
262.2
q skew derivation
Let (σ, δ) be a skew derivation on a ring R. Let q be a central (σ, δ)-constant. Suppose further that δσ = q · σδ. Then we say that (σ, δ) is a q-skew derivation. Version: 5 Owner: antizeus Author(s): antizeus
262.3
q skew polynomial ring
If (σ, δ) is a q-skew derivation on R, then we say that the skew polynomial ring R[θ; σ, δ] is a q-skew polynomial ring. Version: 3 Owner: antizeus Author(s): antizeus
262.4
sigma derivation
If σ is a ring endomorphism on a ring R, then a (left) σ-derivation is an additive map δ on R such that δ(x · y) = σ(x) · δ(y) + δ(x) · y for all x, y in R. Version: 7 Owner: antizeus Author(s): antizeus
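A concrete instance may help. On the polynomial ring over the rationals, take σ(f)(x) = f(qx) and δ(f)(x) = (f(qx) − f(x))/(qx − x) for a fixed rational q different from 0 and 1. This example is not taken from the entry; it is a standard construction, and the sketch below (with q = 2 and hypothetical helper names) checks numerically that δ is a σ-derivation and that δσ = q · σδ, so that (σ, δ) is a q-skew derivation in the sense of 262.2.

```python
from fractions import Fraction

Q = Fraction(2)  # illustrative choice of q; any rational other than 0 and 1 works

def trim(p):
    """Drop trailing zero coefficients (coefficient lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, r):
    n = max(len(p), len(r))
    p = list(p) + [0] * (n - len(p))
    r = list(r) + [0] * (n - len(r))
    return trim(a + b for a, b in zip(p, r))

def mul(p, r):
    out = [Fraction(0)] * max(len(p) + len(r) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(r):
            out[i + j] += a * b
    return trim(out)

def sigma(p):
    """sigma(f)(x) = f(Q x): the coefficient of x^n is scaled by Q^n."""
    return trim(c * Q**n for n, c in enumerate(p))

def delta(p):
    """delta(f)(x) = (f(Q x) - f(x)) / (Q x - x): x^n goes to (1 + Q + ... + Q^(n-1)) x^(n-1)."""
    return trim(c * (Q**n - 1) / (Q - 1) for n, c in enumerate(p) if n > 0)

f = [1, 2, 0, 3]   # 1 + 2x + 3x^3
g = [0, 1, 5]      # x + 5x^2

# The sigma-derivation rule: delta(fg) = sigma(f) delta(g) + delta(f) g.
assert delta(mul(f, g)) == add(mul(sigma(f), delta(g)), mul(delta(f), g))

# The q-skew relation: delta(sigma(f)) = q * sigma(delta(f)).
assert delta(sigma(f)) == [Q * c for c in sigma(delta(f))]
```

Note that δ sends x^n to (n)_q x^(n−1), where (n)_q is the expression from the entry on Gaussian polynomials with u replaced by q.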
262.5
sigma, delta constant
If (σ, δ) is a skew derivation on a ring R, then a (σ, δ)-constant is an element q of R such that σ(q) = q and δ(q) = 0. Note: If q is a (σ, δ)-constant, then it follows that σ(q · x) = q · σ(x) and δ(q · x) = q · δ(x) for all x in R. Version: 3 Owner: antizeus Author(s): antizeus
262.6
skew derivation
A (left) skew derivation on a ring R is a pair (σ, δ), where σ is a ring endomorphism of R, and δ is a left σ-derivation on R. Version: 4 Owner: antizeus Author(s): antizeus
262.7
skew polynomial ring
If (σ, δ) is a left skew derivation on R, then we can construct the (left) skew polynomial ring R[θ; σ, δ], which is made up of polynomials in an indeterminate θ and left-hand coefficients from R, with multiplication satisfying the relation θ · r = σ(r) · θ + δ(r) for all r in R. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 263 16S99 – Miscellaneous
263.1 algebra
Let A be a ring with identity. An algebra over A is a ring B with identity together with a ring homomorphism f : A −→ Z(B), where Z(B) denotes the center of B. Equivalently, an algebra over A is an A–module B which is a ring and satisﬁes the property a · (x ∗ y) = (a · x) ∗ y = x ∗ (a · y) for all a ∈ A and all x, y ∈ B. Here · denotes A–module multiplication and ∗ denotes ring multiplication in B. One passes between the two deﬁnitions as follows: given any ring homomorphism f : A −→ Z(B), the scalar multiplication rule a · b := f (a) ∗ b makes B into an A–module in the sense of the second deﬁnition. Version: 5 Owner: djao Author(s): djao
263.2
algebra (module)
Given a commutative ring R, an algebra over R is a module M over R, endowed with a law of composition f : M × M → M which is R-bilinear. Most of the important algebras in mathematics belong to one or the other of two classes: the unital associative algebras, and the Lie algebras.
263.2.1
Unital associative algebras
In these cases, the "product" (as it is called) of two elements v and w of the module is denoted simply by vw or v · w or the like. Any unital associative algebra is an algebra in the sense of djao (a sense which is also used by Lang in his book Algebra (Springer-Verlag)). Examples of unital associative algebras:
– tensor algebras and quotients of them
– Cayley algebras, such as the ring of quaternions
– polynomial rings
– the ring of endomorphisms of a vector space, in which the bilinear product of two mappings is simply the composite mapping.
263.2.2
Lie algebras
In these cases the bilinear product is denoted by [v, w], and satisfies
[v, v] = 0 for all v ∈ M
[v, [w, x]] + [w, [x, v]] + [x, [v, w]] = 0 for all v, w, x ∈ M
The second of these formulas is called the Jacobi identity. One proves easily
[v, w] + [w, v] = 0 for all v, w ∈ M
for any Lie algebra M. Lie algebras arise naturally from Lie groups, q.v. Version: 1 Owner: karthik Author(s): Larry Hammick
Chapter 264 16U10 – Integral domains
264.1 Prüfer domain
An integral domain R is a Prüfer domain if every nonzero finitely generated ideal I of R is invertible. Let R_I denote the localization of R at I. Then the following statements are equivalent:
• i) R is a Prüfer domain.
• ii) For every prime ideal P in R, R_P is a valuation domain.
• iii) For every maximal ideal M in R, R_M is a valuation domain.
A Prüfer domain is a Dedekind domain if and only if it is noetherian. If R is a Prüfer domain with quotient field K, then any domain S such that R ⊂ S ⊂ K is Prüfer.
REFERENCES
1. Thomas W. Hungerford. Algebra. Springer-Verlag, 1974. New York, NY.
Version: 2 Owner: mathcam Author(s): mathcam
264.2
valuation domain
An integral domain R is a valuation domain if for all a, b ∈ R, either a | b or b | a.
Version: 3 Owner: mathcam Author(s): mathcam
Chapter 265 16U20 – Ore rings, multiplicative sets, Ore localization
265.1 Goldie’s Theorem
Let R be a ring with an identity. Then R has a right classical ring of quotients Q which is semisimple Artinian if and only if R is a semiprime right Goldie ring. If this is the case, then the composition length of Q is equal to the uniform dimension of R. An immediate corollary of this is that a semiprime right noetherian ring always has a right classical ring of quotients. This result was discovered by Alfred Goldie in the late 1950’s. Version: 3 Owner: mclase Author(s): mclase
265.2
Ore condition
A ring R satisfies the left Ore condition (resp. right Ore condition) if and only if for all elements x and y with x regular, there exist elements u and v with v regular such that ux = vy (resp. xu = yv).
A ring which satisﬁes the (left, right) Ore condition is called a (left, right) Ore ring. Version: 3 Owner: mclase Author(s): mclase
265.3
Ore’s theorem
A ring has a (left, right) classical ring of quotients if and only if it satisﬁes the (left, right) Ore condition. Version: 3 Owner: mclase Author(s): mclase
265.4
classical ring of quotients
Let R be a ring. An element of R is called regular if it is not a right zero divisor or a left zero divisor in R. A ring Q ⊃ R is a left classical ring of quotients for R (resp. right classical ring of quotients for R) if it satisfies:
• every regular element of R is invertible in Q
• every element of Q can be written in the form x⁻¹y (resp. yx⁻¹) with x, y ∈ R and x regular.
If a ring R has a left or right classical ring of quotients, then it is unique up to isomorphism. If R is a commutative integral domain, then the left and right classical rings of quotients always exist – they are the field of fractions of R. For noncommutative rings, necessary and sufficient conditions are given by Ore's theorem.
Note that the goal here is to construct a ring which is not too different from R, but in which more elements are invertible. The first condition says which elements we want to be invertible. The second condition says that Q should contain just enough extra elements to make the regular elements invertible. Such rings are called classical rings of quotients, because there are other rings of quotients. These all attempt to enlarge R somehow to make more elements invertible (or sometimes to make ideals invertible). Finally, note that a ring of quotients is not the same as a quotient ring. Version: 2 Owner: mclase Author(s): mclase
265.5
saturated
Let S be a multiplicative subset of A. We say that S is saturated if ab ∈ S ⇒ a, b ∈ S. When A is an integral domain, S is saturated if and only if its complement A\S is a union of prime ideals. Version: 1 Owner: drini Author(s): drini
Chapter 266 16U70 – Center, normalizer (invariant elements)
266.1 center (rings)
If A is a ring, the center of A, sometimes denoted Z(A), is the set of all elements in A that commute with all other elements of A. That is, Z(A) = {a ∈ A | ax = xa ∀x ∈ A}. Note that 0 ∈ Z(A), so the center is nonempty. If we assume that A is a ring with a multiplicative unity 1, then 1 is in the center as well. The center of A is also a subring of A. Version: 3 Owner: dublisk Author(s): dublisk
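For a finite ring the center can be found by exhaustive search. The sketch below does this for the ring of 2 × 2 matrices over Z/2Z; the choice of ring and the tuple encoding of matrices are assumptions made for the example.

```python
from itertools import product

def matmul(a, b, p=2):
    """Multiply 2x2 matrices over Z/pZ, stored as tuples (a11, a12, a21, a22)."""
    return (
        (a[0] * b[0] + a[1] * b[2]) % p,
        (a[0] * b[1] + a[1] * b[3]) % p,
        (a[2] * b[0] + a[3] * b[2]) % p,
        (a[2] * b[1] + a[3] * b[3]) % p,
    )

ring = list(product(range(2), repeat=4))   # all 16 matrices in M_2(Z/2Z)
center = [a for a in ring if all(matmul(a, x) == matmul(x, a) for x in ring)]
print(center)  # [(0, 0, 0, 0), (1, 0, 0, 1)]
```

The result consists of the zero matrix and the identity matrix, i.e. the scalar matrices, which indeed form a subring.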
Chapter 267 16U99 – Miscellaneous
267.1 anti-idempotent
An element x of a ring is called an anti-idempotent element, or simply an anti-idempotent, if x² = −x. The term is most often used in linear algebra. Every anti-idempotent matrix over a field is diagonalizable. Two anti-idempotent matrices are similar if and only if they have the same rank. Version: 1 Owner: mathcam Author(s): mathcam
Chapter 268 16W20 – Automorphisms and endomorphisms
268.1 ring of endomorphisms
Let R be a ring and let M be a right R-module. An endomorphism of M is an R-module homomorphism from M to itself. We shall write endomorphisms on the left, so that f : M → M maps x ↦ f(x). If f, g : M → M are two endomorphisms, we can add them: f + g : x ↦ f(x) + g(x) and multiply them: fg : x ↦ f(g(x)). With these operations, the set of endomorphisms of M becomes a ring, which we call the ring of endomorphisms of M, written End_R(M).
Instead of writing endomorphisms as functions, it is often convenient to write them multiplicatively: we simply write the application of the endomorphism f as x ↦ fx. Then the fact that each f is an R-module homomorphism can be expressed as: f(xr) = (fx)r for all x ∈ M and r ∈ R and f ∈ End_R(M). With this notation, it is clear that M becomes an End_R(M)-R-bimodule.
Now, let N be a left R-module. We can construct the ring End_R(N) in the same way. There is a complication, however, if we still think of endomorphisms as functions written on the left. In order to make N into a bimodule, we need to define an action of End_R(N) on the right of N: say x · f = f(x).
But then we have a problem with the multiplication: x · fg = fg(x) = f(g(x)) but (x · f) · g = f(x) · g = g(f(x))! In order to make this work, we need to reverse the order of composition when we define multiplication in the ring End_R(N) when it acts on the right. There are essentially two different ways to go from here. One is to define the multiplication in End_R(N) the other way, which is most natural if we write the endomorphisms as functions on the right. This is the approach taken in many older books. The other is to leave the multiplication in End_R(N) the way it is, but to use the opposite ring to define the bimodule. This is the approach that is generally taken in more recent works. Using this approach, we conclude that N is an R-End_R(N)^op-bimodule. We will adopt this convention for the lemma below.
Considering R as a right and a left module over itself, we can construct the two endomorphism rings End_R(R_R) and End_R(_R R).
Lemma 2. Let R be a ring with an identity element. Then R ≅ End_R(R_R) and R ≅ End_R(_R R)^op.
Define ρ_r ∈ End_R(_R R) by x ↦ xr. A calculation shows that ρ_{rs} = ρ_s ρ_r (functions written on the left), from which it is easily seen that the map θ : r ↦ ρ_r is a ring homomorphism from R to End_R(_R R)^op. We must show that this is an isomorphism. If ρ_r = 0, then r = 1r = ρ_r(1) = 0. So θ is injective. Let f be an arbitrary element of End_R(_R R), and let r = f(1). Then for any x ∈ R, f(x) = f(x1) = xf(1) = xr = ρ_r(x), so f = ρ_r = θ(r). The proof of the other isomorphism is similar. Version: 4 Owner: mclase Author(s): mclase
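The order reversal in the lemma can be seen concretely in a noncommutative ring. The sketch below takes R to be 2 × 2 integer matrices (an assumption of the example) and checks that right multiplication x ↦ xr satisfies ρ_{rs} = ρ_s ∘ ρ_r, which is why the opposite ring appears.

```python
def mul(a, b):
    """Product of 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def rho(r):
    """Right multiplication x -> x r, an endomorphism of R as a left R-module."""
    return lambda x: mul(x, r)

r = ((1, 2), (0, 1))
s = ((0, 1), (1, 1))
x = ((3, 1), (4, 1))
y = ((2, 0), (5, 7))

# rho_r respects the left action of R: rho_r(y x) = y rho_r(x).
assert rho(r)(mul(y, x)) == mul(y, rho(r)(x))

# rho_{rs} = rho_s o rho_r (functions written on the left) ...
assert rho(mul(r, s))(x) == rho(s)(rho(r)(x))

# ... whereas rho_r o rho_s equals rho_{sr}, which differs here because rs != sr.
assert rho(r)(rho(s)(x)) == rho(mul(s, r))(x)
assert rho(mul(r, s))(x) != rho(r)(rho(s)(x))
```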
Chapter 269 16W30 – Coalgebras, bialgebras, Hopf algebras ; rings, modules, etc. on which these act
269.1 Hopf algebra
A Hopf algebra is a bialgebra A over a field K with a K-linear map S : A → A, called the antipode, such that
m ∘ (S ⊗ id) ∘ ∆ = η ∘ ε = m ∘ (id ⊗ S) ∘ ∆, (269.1.1)
where m : A ⊗ A → A is the multiplication map m(a ⊗ b) = ab and η : K → A is the unit map η(k) = k·1_A. In terms of a commutative diagram:
[Commutative diagram: the three composites from A to A agree, namely A → A ⊗ A → A ⊗ A → A via ∆, S ⊗ id, m; A → K → A via ε, η; and A → A ⊗ A → A ⊗ A → A via ∆, id ⊗ S, m.]
Example 1 (Algebra of functions on a finite group). Let A = C(G) be the algebra of complex-valued functions on a finite group G and identify C(G × G) with A ⊗ A. Then, A is a Hopf algebra with comultiplication (∆(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode (S(f))(x) = f(x⁻¹).
Example 2 (Group algebra of a finite group). Let A = CG be the complex group algebra of a finite group G. Then, A is a Hopf algebra with comultiplication ∆(g) = g ⊗ g, counit ε(g) = 1, and antipode S(g) = g⁻¹.
The above two examples are dual to one another. Define a bilinear form C(G) ⊗ CG → C by ⟨f, x⟩ = f(x). Then,
⟨fg, x⟩ = ⟨f ⊗ g, ∆(x)⟩,
⟨1, x⟩ = ε(x),
⟨∆(f), x ⊗ y⟩ = ⟨f, xy⟩,
ε(f) = ⟨f, e⟩,
⟨S(f), x⟩ = ⟨f, S(x)⟩.
Example 3 (Polynomial functions on a Lie group). Let A = Poly(G) be the algebra of complex-valued polynomial functions on a complex Lie group G and identify Poly(G × G) with A ⊗ A. Then, A is a Hopf algebra with comultiplication (∆(f))(x, y) = f(xy), counit ε(f) = f(e), and antipode (S(f))(x) = f(x⁻¹).
Example 4 (Universal enveloping algebra of a Lie algebra). Let A = U(g) be the universal enveloping algebra of a complex Lie algebra g. Then, A is a Hopf algebra with comultiplication ∆(X) = X ⊗ 1 + 1 ⊗ X, counit ε(X) = 0, and antipode S(X) = −X.
The above two examples are dual to one another (if g is the Lie algebra of G). Define a bilinear form Poly(G) ⊗ U(g) → C by $\langle f, X \rangle = \frac{d}{dt}\big|_{t=0} f(\exp(tX))$. Version: 6 Owner: mhale Author(s): mhale
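The antipode axiom can be verified mechanically for the group algebra of Example 2. The following sketch uses the symmetric group S_3 and integer coefficients in place of complex ones; the dictionary encoding of elements and all function names are assumptions made for illustration.

```python
from itertools import permutations

G = list(permutations(range(3)))   # the symmetric group S_3
E = (0, 1, 2)                      # identity permutation

def compose(p, q):
    """Group product: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

# Elements of the group algebra are dicts {g: coefficient};
# elements of the tensor square are dicts {(g, h): coefficient}.
def comult(x):                     # Delta(g) = g (x) g
    return {(g, g): c for g, c in x.items()}

def counit(x):                     # eps(g) = 1
    return sum(x.values())

def unit(k):                       # eta(k) = k e
    return {E: k}

def antipode_left(t):              # S (x) id, with S(g) = g^{-1}
    return {(inverse(g), h): c for (g, h), c in t.items()}

def antipode_right(t):             # id (x) S
    return {(g, inverse(h)): c for (g, h), c in t.items()}

def mult(t):                       # m(g (x) h) = gh
    out = {}
    for (g, h), c in t.items():
        gh = compose(g, h)
        out[gh] = out.get(gh, 0) + c
    return out

x = {G[1]: 2, G[3]: -1, G[4]: 5}
# m o (S (x) id) o Delta = eta o eps = m o (id (x) S) o Delta, evaluated on x.
assert mult(antipode_left(comult(x))) == unit(counit(x)) == mult(antipode_right(comult(x)))
```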
269.2
almost cocommutative bialgebra
A bialgebra A is called almost cocommutative if there is a unit R ∈ A ⊗ A such that
R∆(a) = ∆^op(a)R
where ∆^op is the opposite comultiplication (the usual comultiplication, composed with the flip map of the tensor product A ⊗ A). The element R is often called the R-matrix of A. The significance of the almost cocommutative condition is that σ_{V,W} = σ ∘ R : V ⊗ W → W ⊗ V gives a natural isomorphism of bialgebra representations, where V and W are A-modules, making the category of A-modules into a quasitensor or braided monoidal category. Note that σ_{W,V} ∘ σ_{V,W} is not necessarily the identity (this is the braiding of the category). Version: 2 Owner: bwebste Author(s): bwebste
269.3
bialgebra
A bialgebra is a vector space that is both a unital algebra and a coalgebra, such that the comultiplication and counit are unital algebra homomorphisms. Version: 2 Owner: mhale Author(s): mhale
269.4
coalgebra
A coalgebra is a vector space A over a field K with a K-linear map ∆ : A → A ⊗ A, called the comultiplication, and a (nonzero) K-linear map ε : A → K, called the counit, such that
(∆ ⊗ id) ∘ ∆ = (id ⊗ ∆) ∘ ∆ (coassociativity), (269.4.1)
(ε ⊗ id) ∘ ∆ = id = (id ⊗ ε) ∘ ∆. (269.4.2)
In terms of commutative diagrams:
[Commutative diagrams: (1) the two composites A → A ⊗ A → A ⊗ A ⊗ A, via ∆ followed by ∆ ⊗ id or by id ⊗ ∆, agree; (2) the two composites A → A ⊗ A → A, via ∆ followed by ε ⊗ id or by id ⊗ ε, both equal id.]
Let σ : A ⊗ A → A ⊗ A be the flip map σ(a ⊗ b) = b ⊗ a. A coalgebra is said to be cocommutative if σ ∘ ∆ = ∆. Version: 4 Owner: mhale Author(s): mhale
269.5
coinvariant
Let V be a comodule with a right coaction t : V → V ⊗ A of a coalgebra A. An element v ∈ V is right coinvariant if t(v) = v ⊗ 1_A. (269.5.1) Version: 1 Owner: mhale Author(s): mhale
269.6
comodule
Let (A, ∆, ε) be a coalgebra. A right A-comodule is a vector space V with a linear map t : V → V ⊗ A, called the right coaction, satisfying (t ⊗ id) ∘ t = (id ⊗ ∆) ∘ t, (id ⊗ ε) ∘ t = id. (269.6.1)
An A-comodule is also referred to as a corepresentation of A. Let V and W be two right A-comodules. Then V ⊕ W is also a right A-comodule. If A is a bialgebra then V ⊗ W is a right A-comodule as well (make use of the multiplication map A ⊗ A → A). Version: 2 Owner: mhale Author(s): mhale
269.7
comodule algebra
Let H be a bialgebra. A right H-comodule algebra is a unital algebra A which is a right H-comodule satisfying
t(ab) = t(a)t(b) = a_{(1)} b_{(1)} ⊗ a_{(2)} b_{(2)}, t(1_A) = 1_A ⊗ 1_H, (269.7.1)
for all h ∈ H and a, b ∈ A. There is a dual notion of an H-module coalgebra. Example 5. Let H be a bialgebra. Then H is itself an H-comodule algebra for the right regular coaction t(h) = ∆(h). Version: 5 Owner: mhale Author(s): mhale
269.8
comodule coalgebra
Let H be a bialgebra. A right H-comodule coalgebra is a coalgebra A which is a right H-comodule satisfying
(∆ ⊗ id)t(a) = a_{(1)(1)} ⊗ a_{(2)(1)} ⊗ a_{(1)(2)} a_{(2)(2)}, (ε ⊗ id)t(a) = ε(a)1_H, (269.8.1)
for all h ∈ H and a ∈ A. There is a dual notion of an H-module algebra. Example 6. Let H be a Hopf algebra. Then H is itself an H-comodule coalgebra for the adjoint coaction t(h) = h_{(2)} ⊗ S(h_{(1)})h_{(3)}. Version: 4 Owner: mhale Author(s): mhale
269.9
module algebra
Let H be a bialgebra. A left H-module algebra is a unital algebra A which is a left H-module satisfying
h ▷ (ab) = (h_{(1)} ▷ a)(h_{(2)} ▷ b), h ▷ 1_A = ε(h)1_A, (269.9.1)
for all h ∈ H and a, b ∈ A. There is a dual notion of an H-comodule coalgebra. Example 7. Let H be a Hopf algebra. Then H is itself an H-module algebra for the adjoint action g ▷ h = g_{(1)} h S(g_{(2)}). Version: 4 Owner: mhale Author(s): mhale
269.10
module coalgebra
Let H be a bialgebra. A left H-module coalgebra is a coalgebra A which is a left H-module satisfying
∆(h ▷ a) = (h_{(1)} ▷ a_{(1)}) ⊗ (h_{(2)} ▷ a_{(2)}), ε(h ▷ a) = ε(h)ε(a), (269.10.1)
for all h ∈ H and a ∈ A. There is a dual notion of an H-comodule algebra. Example 8. Let H be a bialgebra. Then H is itself an H-module coalgebra for the left regular action g ▷ h = gh. Version: 5 Owner: mhale Author(s): mhale
Chapter 270 16W50 – Graded rings and modules
270.1 graded algebra
An algebra A is graded if it is a graded module and satisfies A_p · A_q ⊆ A_{p+q}. Examples of graded algebras include the polynomial ring k[X] being an N-graded k-algebra, and the exterior algebra. Version: 1 Owner: dublisk Author(s): dublisk
270.2
graded module
If R = R_0 ⊕ R_1 ⊕ · · · is a graded ring, then a graded module over R is a module M of the form $M = \bigoplus_{i=-\infty}^{\infty} M_i$ which satisfies R_i M_j ⊆ M_{i+j} for all i, j. Version: 4 Owner: KimJ Author(s): KimJ
270.3
supercommutative
Let R be a Z_2-graded ring. Then R is supercommutative if for any homogeneous elements a and b ∈ R: ab = (−1)^{deg a · deg b} ba.
That is, even homogeneous elements are in the center of the ring, and odd homogeneous elements anticommute. Common examples of supercommutative rings are the exterior algebra of a module over a commutative ring (in particular, a vector space) and the cohomology ring of a topological space (both with the standard grading by degree reduced mod 2). Version: 1 Owner: bwebste Author(s): bwebste
Chapter 271 16W55 – “Super” (or “skew”) structure
271.1 super tensor product
If A and B are Z-graded algebras, we define the super tensor product A ⊗_su B to be the ordinary tensor product as graded modules, but with multiplication, called the super product, defined by
(a ⊗ b)(a′ ⊗ b′) = (−1)^{(deg b)(deg a′)} aa′ ⊗ bb′
where a, a′, b, b′ are homogeneous. The super tensor product of A and B is itself a graded algebra, as we grade the super tensor product of A and B as follows:
$(A \otimes_{su} B)_n = \bigoplus_{p,q \,:\, p+q=n} A_p \otimes B_q$
Version: 4 Owner: dublisk Author(s): dublisk
271.2
superalgebra
A graded algebra A is said to be a super algebra if it has a Z/2Z grading. Version: 2 Owner: dublisk Author(s): dublisk
271.3
supernumber
Let Λ_N be the Grassmann algebra generated by θ_i, i = 1 … N, such that θ_i θ_j = −θ_j θ_i and (θ_i)² = 0. Denote by Λ_∞ the case of an infinite number of generators θ_i. A supernumber is an element of Λ_N or Λ_∞. Any supernumber z can be expressed uniquely in the form
$z = z_0 + z_i \theta_i + \frac{1}{2} z_{ij} \theta_i \theta_j + \ldots + \frac{1}{n!} z_{i_1 \ldots i_n} \theta_{i_1} \cdots \theta_{i_n} + \ldots,$
where the coefficients $z_{i_1 \ldots i_n} \in \mathbb{C}$ are antisymmetric in their indices. The body of z is defined as z_B = z_0, and its soul is defined as z_S = z − z_B. If z_B ≠ 0 then z has an inverse given by
$z^{-1} = \frac{1}{z_B} \sum_{k=0}^{\infty} \left( -\frac{z_S}{z_B} \right)^k.$
A supernumber can be decomposed into the even and odd parts
$z_{\mathrm{even}} = z_0 + \frac{1}{2} z_{ij} \theta_i \theta_j + \ldots + \frac{1}{(2n)!} z_{i_1 \ldots i_{2n}} \theta_{i_1} \cdots \theta_{i_{2n}} + \ldots,$
$z_{\mathrm{odd}} = z_i \theta_i + \frac{1}{6} z_{ijk} \theta_i \theta_j \theta_k + \ldots + \frac{1}{(2n+1)!} z_{i_1 \ldots i_{2n+1}} \theta_{i_1} \cdots \theta_{i_{2n+1}} + \ldots.$
Purely even supernumbers are called c-numbers, and odd supernumbers are called a-numbers. The superalgebra Λ_N thus has a decomposition Λ_N = C_c ⊕ C_a, where C_c is the space of c-numbers, and C_a is the space of a-numbers. Supernumbers are the generalisation of complex numbers to a commutative superalgebra of commuting and anticommuting "numbers". They are primarily used in the description of fermionic fields in quantum field theory. Version: 5 Owner: mhale Author(s): mhale
Chapter 272 16W99 – Miscellaneous
272.1 Hamiltonian quaternions
Definition of Q
We define a unital associative algebra Q over R, of dimension 4, by the basis {1, i, j, k} and the multiplication table
1    i    j    k
i   −1    k   −j
j   −k   −1    i
k    j   −i   −1
(where the element in row x and column y is xy, not yx). Thus an arbitrary element of Q is of the form a1 + bi + cj + dk, a, b, c, d ∈ R (sometimes denoted by ⟨a, b, c, d⟩ or by a + ⟨b, c, d⟩) and the product of two elements ⟨a, b, c, d⟩ and ⟨α, β, γ, δ⟩ is ⟨w, x, y, z⟩ where
w = aα − bβ − cγ − dδ
x = aβ + bα + cδ − dγ
y = aγ − bδ + cα + dβ
z = aδ + bγ − cβ + dα
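The product formula translates directly into code. The sketch below stores a quaternion a1 + bi + cj + dk as the tuple (a, b, c, d) and checks a few entries of the multiplication table; the function name `qmul` and the tuple representation are choices made for this example.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d) = a + bi + cj + dk."""
    a, b, c, d = p
    al, be, ga, de = q
    return (
        a * al - b * be - c * ga - d * de,   # w
        a * be + b * al + c * de - d * ga,   # x
        a * ga - b * de + c * al + d * be,   # y
        a * de + b * ga - c * be + d * al,   # z
    )

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

assert qmul(I, J) == K and qmul(J, I) == (0, 0, 0, -1)   # ij = k, ji = -k
assert qmul(J, K) == I and qmul(K, I) == J               # jk = i, ki = j
assert qmul(I, I) == qmul(J, J) == qmul(K, K) == (-1, 0, 0, 0)
```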
The elements of Q are known as Hamiltonian quaternions. Clearly the subspaces of Q generated by {1} and by {1, i} are subalgebras isomorphic to R and C respectively. R is customarily identified with the corresponding subalgebra of Q. (We shall see in a moment that there are other and less obvious embeddings of C in Q.) The real numbers commute with all the elements of Q, and we have λ · ⟨a, b, c, d⟩ = ⟨λa, λb, λc, λd⟩ for λ ∈ R and ⟨a, b, c, d⟩ ∈ Q.
norm, conjugate, and inverse of a quaternion
Like the complex numbers (C), the quaternions have a natural involution called the quaternion conjugate. If q = a1 + bi + cj + dk, then the quaternion conjugate of q, denoted q̄, is simply q̄ = a1 − bi − cj − dk. One can readily verify that if q = a1 + bi + cj + dk, then q q̄ = (a² + b² + c² + d²)1. (See Euler four-square identity.) This product is used to form a norm ‖·‖ on the algebra (or the ring) Q: We define ‖q‖ = √s where q q̄ = s1. If v, w ∈ Q and λ ∈ R, then
1. ‖v‖ ≥ 0 with equality only if v = ⟨0, 0, 0, 0⟩ = 0
2. ‖λv‖ = |λ| ‖v‖
3. ‖v + w‖ ≤ ‖v‖ + ‖w‖
4. ‖v · w‖ = ‖v‖ · ‖w‖
which means that Q qualifies as a normed algebra when we give it the norm ‖·‖. Because the norm of any nonzero quaternion q is real and nonzero, we have
q q̄ / ‖q‖² = q̄ q / ‖q‖² = ⟨1, 0, 0, 0⟩
which shows that any nonzero quaternion has an inverse: q⁻¹ = q̄ / ‖q‖².
Other embeddings of C into Q
One can use any nonzero q to define an embedding of C into Q. If n(z) is a natural embedding of z ∈ C into Q, then the map z ↦ q n(z) q⁻¹ is also an embedding into Q. Because Q is an associative algebra, it is obvious that (q n(a) q⁻¹)(q n(b) q⁻¹) = q (n(a) n(b)) q⁻¹, and with the distributive laws, it is easy to check that (q n(a) q⁻¹) + (q n(b) q⁻¹) = q (n(a) + n(b)) q⁻¹.
Rotations in 3-space
Let us write U = {q ∈ Q : ‖q‖ = 1}. With multiplication, U is a group. Let us briefly sketch the relation between U and the group SO(3) of rotations (about the origin) in 3-space.
An arbitrary element q of U can be expressed as cos(θ/2) + sin(θ/2)(ai + bj + ck), for some real numbers θ, a, b, c such that a² + b² + c² = 1. Identify a point (x, y, z) of the real unit sphere with the quaternion v = xi + yj + zk. The map v ↦ qvq⁻¹ then gives rise to a permutation of the real sphere. It turns out that that permutation is a rotation. Its axis is the line through (0, 0, 0) and (a, b, c), and the angle through which it rotates the sphere is θ. If rotations F and G correspond to quaternions q and r respectively, then clearly the permutation v ↦ (qr)v(qr)⁻¹ corresponds to the composite rotation F ◦ G. Thus this mapping of U onto SO(3) is a group homomorphism. Its kernel is the subset {1, −1} of U, and thus it comprises a double cover of SO(3). The kernel has a geometric interpretation as well: two unit vectors in opposite directions determine the same axis of rotation.
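The correspondence between unit quaternions and rotations can be tried out numerically. The sketch below assumes the usual convention that a unit quaternion q acts on a vector v, viewed as the quaternion xi + yj + zk, by v ↦ q v q⁻¹; the function names are hypothetical and the tuple layout (a, b, c, d) stands for a + bi + cj + dk.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d)."""
    a, b, c, d = p
    al, be, ga, de = q
    return (
        a * al - b * be - c * ga - d * de,
        a * be + b * al + c * de - d * ga,
        a * ga - b * de + c * al + d * be,
        a * de + b * ga - c * be + d * al,
    )

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(q, v):
    """Apply the unit quaternion q to the vector v in R^3 via v -> q v q^{-1}."""
    x = qmul(qmul(q, (0.0, *v)), conj(q))   # for a unit quaternion, q^{-1} is its conjugate
    return x[1:]

theta = math.pi / 2
q = (math.cos(theta / 2), 0.0, 0.0, math.sin(theta / 2))   # axis (0, 0, 1), angle 90 degrees

print(rotate(q, (1.0, 0.0, 0.0)))                  # approximately (0, 1, 0)
print(rotate(tuple(-c for c in q), (1.0, 0.0, 0.0)))  # -q gives the same rotation
```

The second call illustrates the kernel {1, −1}: q and −q determine the same rotation.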
Version: 3 Owner: mathcam Author(s): Larry Hammick, patrickwonders
Chapter 273 16Y30 – Near-rings
273.1 near-ring
A near-ring is a set N together with two binary operations, denoted + : N × N → N and · : N × N → N, such that
1. (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c) for all a, b, c ∈ N (associativity of both operations)
2. There exists an element 0 ∈ N such that a + 0 = 0 + a = a for all a ∈ N (additive identity)
3. For all a ∈ N, there exists b ∈ N such that a + b = b + a = 0 (additive inverse)
4. (a + b) · c = (a · c) + (b · c) for all a, b, c ∈ N (right distributive law)
Note that the axioms of a near-ring differ from those of a ring in that they do not require addition to be commutative, and only require distributivity on one side. Every element a in a near-ring has a unique additive inverse, denoted −a. We say N has an identity element if there exists an element 1 ∈ N such that a · 1 = 1 · a = a for all a ∈ N. We say N is distributive if a · (b + c) = (a · b) + (a · c) holds for all a, b, c ∈ N. We say N is commutative if a · b = b · a for all a, b ∈ N.
A natural example of a near-ring is the following. Let (G, +) be a group (not necessarily abelian), and let M be the set of all functions from G to G. For two functions f and g in M define f + g ∈ M by (f + g)(x) = f(x) + g(x) for all x ∈ G. Then (M, +, ◦) is a near-ring with identity, where ◦ denotes composition of functions. Version: 13 Owner: yark Author(s): yark, juergen
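The function near-ring described above is small enough to check by brute force. The sketch below assumes, purely for illustration, that G is the cyclic group Z/3Z (an abelian choice made only to keep the search small); the names M, add and compose are hypothetical. It confirms that the right distributive law holds for all 27 functions while the left distributive law fails.

```python
from itertools import product

N = 3            # illustrative choice: G = Z/3Z under addition mod 3
G = range(N)

# A function G -> G is stored as the tuple of its values (f(0), f(1), f(2)).
M = list(product(G, repeat=N))

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return tuple((f[x] + g[x]) % N for x in G)

def compose(f, g):
    """Composition: (f o g)(x) = f(g(x))."""
    return tuple(f[g[x]] for x in G)

def right_distributive():
    """Check (f + g) o h == (f o h) + (g o h) for all f, g, h in M."""
    return all(
        compose(add(f, g), h) == add(compose(f, h), compose(g, h))
        for f in M for g in M for h in M
    )

def left_distributive():
    """Check h o (f + g) == (h o f) + (h o g) for all f, g, h in M."""
    return all(
        compose(h, add(f, g)) == add(compose(h, f), compose(h, g))
        for f in M for g in M for h in M
    )
```

A constant function h already breaks left distributivity, which is why M is a near-ring rather than a ring.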
Chapter 274 17A01 – General theory
274.1 commutator bracket
Let A be an associative algebra over a field K. For a, b ∈ A, the element of A defined by [a, b] = ab − ba is called the commutator of a and b. The corresponding bilinear operation [−, −] : A × A → A is called the commutator bracket.
The commutator bracket is bilinear, skew-symmetric, and also satisfies the Jacobi identity. To wit, for a, b, c ∈ A we have [a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0. The proof of this assertion is straightforward. Each of the brackets in the left-hand side expands to 4 terms, and then everything cancels.
In categorical terms, what we have here is a functor from the category of associative algebras to the category of Lie algebras over a fixed field. The action of this functor is to turn an associative algebra A into a Lie algebra that has the same underlying vector space as A, but whose multiplication operation is given by the commutator bracket. It must be noted that this functor is right-adjoint to the universal enveloping algebra functor.
Examples
• Let V be a vector space. Composition endows the vector space of endomorphisms End V with the structure of an associative algebra. However, we could also regard End V as a Lie algebra relative to the commutator bracket: [X, Y] = XY − YX, X, Y ∈ End V.
• The algebra of differential operators has some interesting properties when viewed as a Lie algebra. Even though the composition of differential operators is a non-commutative operation, it is commutative when restricted to the highest order terms of the involved operators. Thus, if X, Y are differential operators of order p and q, respectively, the compositions XY and YX have order p + q. Their highest order terms coincide, and hence the commutator [X, Y] has order at most p + q − 1.
• In light of the preceding comments, it is evident that the vector space of first-order differential operators is closed with respect to the commutator bracket. Specializing even further, we remark that a vector field is just a homogeneous first-order differential operator, and that the commutator bracket for vector fields, when viewed as first-order operators, coincides with the usual, geometrically motivated vector field bracket. Version: 4 Owner: rmilson Author(s): rmilson
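The Jacobi identity for the commutator bracket on End V can be spot-checked on matrices. This is a minimal sketch, assuming square integer matrices stored as nested lists; the helper names matmul, matadd, matsub, bracket and jacobi_sum are illustrative. Exact integer arithmetic is used so that the cancellation is exact rather than approximate.

```python
import random

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matsub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def bracket(A, B):
    """Commutator bracket [A, B] = AB - BA."""
    return matsub(matmul(A, B), matmul(B, A))

def jacobi_sum(a, b, c):
    """Left-hand side of the Jacobi identity; should be the zero matrix."""
    return matadd(
        matadd(bracket(a, bracket(b, c)), bracket(b, bracket(c, a))),
        bracket(c, bracket(a, b)),
    )

def random_matrix(n, rng):
    """Random n x n integer matrix with entries in [-5, 5]."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]
```

A random test of this kind does not replace the 12-term cancellation argument, but it is a quick sanity check of the signs.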
Chapter 275 17B05 – Structure theory
275.1 Killing form
Let g be a ﬁnite dimensional Lie algebra over a ﬁeld k, and adX : g → g be the adjoint action, adX Y = [X, Y ]. Then the Killing form on g is a bilinear map Bg : g × g → k given by Bg(X, Y ) = tr(adX ◦ adY ). The Killing form is invariant and symmetric (since trace is symmetric). Version: 4 Owner: bwebste Author(s): bwebste
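As a worked instance of the definition, the Killing form of sl2 can be computed directly from the trace formula. The sketch below assumes the usual basis E, H, F of traceless 2 × 2 matrices (a choice made here for illustration, not taken from the text), and the helper names coords, ad, killing and gram are hypothetical.

```python
# Illustrative basis of sl2 (traceless 2 x 2 matrices).
E = [[0, 1], [0, 0]]
H = [[1, 0], [0, -1]]
F = [[0, 0], [1, 0]]
BASIS = [E, H, F]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def coords(M):
    """Coordinates of the traceless matrix [[a, b], [c, -a]] in (E, H, F)."""
    return [M[0][1], M[0][0], M[1][0]]

def ad(X):
    """Matrix of ad_X in the basis (E, H, F); column j is [X, basis_j]."""
    cols = [coords(bracket(X, Bj)) for Bj in BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(X, Y):
    """Killing form B(X, Y) = tr(ad_X o ad_Y)."""
    A, B = ad(X), ad(Y)
    return sum(A[i][k] * B[k][i] for i in range(3) for k in range(3))

def gram():
    """Gram matrix of the Killing form in the basis (E, H, F)."""
    return [[killing(X, Y) for Y in BASIS] for X in BASIS]
```

The resulting Gram matrix is symmetric with nonzero determinant, in line with the symmetry stated above.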
275.2
Levi’s theorem
Let g be a complex Lie algebra, r its radical. Then the extension 0 → r → g → g/r → 0 is split, i.e., there exists a subalgebra h of g mapping isomorphically to g/r under the natural projection. Version: 2 Owner: bwebste Author(s): bwebste
275.3
nilradical
Let g be a Lie algebra. Then the nilradical n of g is defined to be the intersection of the kernels of all the irreducible representations of g. Equivalently, n = [g, g] ∩ rad g, the intersection of the derived ideal and radical of g. Version: 1 Owner: bwebste Author(s): bwebste
275.4
radical
Let g be a finite dimensional Lie algebra. Since the sum of any two solvable ideals of g is in turn solvable, there is a unique maximal solvable ideal of g. This ideal is called the radical of g. Note that g/rad g has no nonzero solvable ideals, and is thus semisimple. Thus, every Lie algebra is an extension of a semisimple algebra by a solvable one. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 276 17B10 – Representations, algebraic theory (weights)
276.1 Ado’s theorem
Every ﬁnite dimensional Lie algebra has a faithful ﬁnite dimensional representation. In other words, every ﬁnite dimensional Lie algebra is a matrix algebra. This result is not true for Lie groups. Version: 2 Owner: bwebste Author(s): bwebste
276.2
Lie algebra representation
A representation of a Lie algebra g is a Lie algebra homomorphism ρ : g → End V, where End V is the commutator Lie algebra of some vector space V. In other words, ρ is a linear mapping that satisfies ρ([a, b]) = ρ(a)ρ(b) − ρ(b)ρ(a), a, b ∈ g. We call the representation faithful if ρ is injective. An invariant subspace or submodule W ⊂ V is a subspace of V satisfying ρ(a)(W) ⊂ W for all a ∈ g. A representation is called irreducible or simple if its only invariant subspaces are {0} and the whole representation.
Alternatively, one calls V a g-module, and calls ρ(a), a ∈ g, the action of a on V.
The dimension of V is called the dimension of the representation. If V is infinite-dimensional, then one speaks of an infinite-dimensional representation. Given a representation or pair of representations, there are a couple of operations which will produce other representations: First there is the direct sum. If ρ : g → End(V) and σ : g → End(W) are representations, then V ⊕ W has the obvious Lie algebra action, by the embedding End(V) × End(W) → End(V ⊕ W). Version: 9 Owner: bwebste Author(s): bwebste, rmilson
276.3
adjoint representation
Let g be a Lie algebra. For every a ∈ g we define the adjoint endomorphism, a.k.a. the adjoint action, ad(a) : g → g to be the linear transformation with action ad(a) : b → [a, b], b ∈ g. The linear mapping ad : g → End(g) with action a → ad(a), a ∈ g, is called the adjoint representation of g. The fact that ad defines a representation is a straightforward consequence of the Jacobi identity axiom. Indeed, let a, b ∈ g be given. We wish to show that ad([a, b]) = [ad(a), ad(b)], where the bracket on the left is the g multiplication structure, and the bracket on the right is the commutator bracket. For all c ∈ g the left hand side maps c to [[a, b], c], while the right hand side maps c to [a, [b, c]] − [b, [a, c]]. Taking skew-symmetry of the bracket as a given, the equality of these two expressions is logically equivalent to the Jacobi identity: [a, [b, c]] + [b, [c, a]] + [c, [a, b]] = 0. Version: 2 Owner: rmilson Author(s): rmilson
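The identity ad([a, b]) = [ad(a), ad(b)] can be checked concretely. The sketch below assumes, as an illustrative example not taken from the text, the Lie algebra R³ with the cross product as bracket; the names cross, ad, commutator and is_representation are hypothetical.

```python
def cross(a, b):
    """Cross product on R^3, used here as the Lie bracket."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def ad(a):
    """Matrix of ad(a): b -> [a, b] in the standard basis (column j is [a, e_j])."""
    basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    cols = [cross(a, e) for e in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(A, B):
    """Commutator bracket of matrices, AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def is_representation(a, b):
    """Check ad([a, b]) == [ad(a), ad(b)] for the given pair."""
    return ad(cross(a, b)) == commutator(ad(a), ad(b))
```

Each ad(a) here is a skew-symmetric 3 × 3 matrix, so the adjoint representation realizes this algebra inside the skew-symmetric matrices.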
276.4
examples of nonmatrix Lie groups
While most well-known Lie groups are matrix groups, there do in fact exist Lie groups which are not matrix groups. That is, they have no faithful finite dimensional representations. For example, let H be the real Heisenberg group
$$H = \left\{ \begin{pmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\},$$
and Γ the discrete subgroup
$$\Gamma = \left\{ \begin{pmatrix} 1 & 0 & n \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \,\middle|\, n \in \mathbb{Z} \right\}.$$
The subgroup Γ is central, and thus normal. The Lie group H/Γ has no faithful finite dimensional representations over R or C. Another example is the universal cover of SL2 R. SL2 R is homotopy equivalent to a circle, and thus π1(SL2 R) ≅ Z, and thus has an infinite-sheeted cover. Any real or complex finite dimensional representation of this group factors through the projection map to SL2 R. Version: 3 Owner: bwebste Author(s): bwebste
276.5
isotropy representation
Let g be a Lie algebra, and h ⊂ g a subalgebra. The isotropy representation of h relative to g is the naturally defined action of h on the quotient vector space g/h. Here is a synopsis of the technical details. As is customary, we will use b + h, b ∈ g to denote the coset elements of g/h. Let a ∈ h be given. Since h is invariant with respect to ad_g(a), the adjoint action factors through the quotient to give a well defined endomorphism of g/h. The action is given by b + h → [a, b] + h, b ∈ g. This is the action alluded to in the first paragraph. Version: 3 Owner: rmilson Author(s): rmilson
Chapter 277 17B15 – Representations, analytic theory
277.1 invariant form (Lie algebras)
Let V be a representation of a Lie algebra g over a field k. Then a bilinear form B : V × V → k is invariant if B(Xv, w) + B(v, Xw) = 0 for all X ∈ g, v, w ∈ V. This criterion seems a little odd, but in the context of Lie algebras, it makes sense. For example, the map B̃ : V → V* given by v → B(·, v) is equivariant if and only if B is an invariant form. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 278 17B20 – Simple, semisimple, reductive (super)algebras (roots)
278.1 Borel subalgebra
Let g be a semisimple Lie algebra, h a Cartan subalgebra, R the associated root system and R+ ⊂ R a set of positive roots. We have a root decomposition into the Cartan subalgebra and the root spaces g_α:
g = h ⊕ ⨁_{α∈R} g_α.
Now let b be the direct sum of the Cartan subalgebra and the positive root spaces:
b = h ⊕ ⨁_{β∈R+} g_β.
This is called a Borel subalgebra.
Version: 2 Owner: bwebste Author(s): bwebste
278.2
Borel subgroup
Let G be a complex semisimple Lie group. Then any maximal solvable subgroup B ⊂ G is called a Borel subgroup. All Borel subgroups of a given group are conjugate. Any Borel subgroup is connected and equal to its own normalizer, and contains a unique Cartan subgroup. The intersection of B with a maximal compact subgroup K of G is a maximal torus of K. If G = SLn C, then the standard Borel subgroup is the set of upper triangular matrices in G.
Version: 2 Owner: bwebste Author(s): bwebste
278.3
Cartan matrix
Let R ⊂ E be a reduced root system, with E a euclidean vector space, with inner product (·, ·), and let Π = {α1, · · · , αn} be a base of this root system. Then the Cartan matrix of the root system is the matrix C_{i,j} = 2(αi, αj)/(αi, αi). The Cartan matrix uniquely determines the root system, and is unique up to simultaneous permutation of the rows and columns. It is also the basis change matrix from the basis of fundamental weights to the basis of simple roots in E. Version: 1 Owner: bwebste Author(s): bwebste
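The formula for the Cartan matrix is easy to evaluate from a list of simple roots. The sketch below uses exact rational arithmetic; the function names inner and cartan_matrix are illustrative, and the sample bases for A2 and B2 are standard coordinate choices assumed for this example rather than taken from the text.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product, computed exactly."""
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))

def cartan_matrix(simple_roots):
    """Entries C[i][j] = 2 (a_i, a_j) / (a_i, a_i) for the given base."""
    return [
        [2 * inner(ai, aj) / inner(ai, ai) for aj in simple_roots]
        for ai in simple_roots
    ]

# Assumed sample bases: A2 inside R^3 and B2 inside R^2.
A2_BASE = [(1, -1, 0), (0, 1, -1)]
B2_BASE = [(1, -1), (0, 1)]
```

With the index convention above the matrix need not be symmetric: for B2 the two off-diagonal entries differ because the simple roots have different lengths.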
278.4
Cartan subalgebra
Let g be a Lie algebra. Then a Cartan subalgebra is a nilpotent subalgebra h of g which is self-normalizing, that is, if [g, h] ∈ h for all h ∈ h, then g ∈ h as well. If g is semisimple, any Cartan subalgebra is abelian. All Cartan subalgebras of a Lie algebra are conjugate by the adjoint action of any Lie group with algebra g. Version: 3 Owner: bwebste Author(s): bwebste
278.5
Cartan’s criterion
A Lie algebra g is semisimple if and only if its Killing form Bg is nondegenerate. Version: 2 Owner: bwebste Author(s): bwebste
278.6
Casimir operator
Let g be a semisimple Lie algebra, and let (·, ·) denote the Killing form. If {g_i} is a basis of g, then there is a dual basis {g^i} with respect to the Killing form, i.e., (g_i, g^j) = δ_ij. Consider the element Ω = Σ_i g_i g^i of the universal enveloping algebra of g. This element, called the Casimir operator, is central in the enveloping algebra, and thus commutes with the g action on any representation.
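To see the Casimir operator act on a representation, one can evaluate Ω on the 2-dimensional representation of sl2. The sketch below makes two assumptions that are not in the text: the basis E, H, F of sl2, and the Killing form values (E, F) = 4 and (H, H) = 8 with all other pairings of basis elements zero, so that the dual basis is F/4, H/8, E/4. The helper names are hypothetical.

```python
from fractions import Fraction

# Assumed basis of sl2 acting on the 2-dimensional representation.
E = [[0, 1], [0, 0]]
H = [[1, 0], [0, -1]]
F = [[0, 0], [1, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def casimir():
    """Omega = E*(F/4) + H*(H/8) + F*(E/4), using the assumed dual basis."""
    terms = [
        scale(Fraction(1, 4), matmul(E, F)),
        scale(Fraction(1, 8), matmul(H, H)),
        scale(Fraction(1, 4), matmul(F, E)),
    ]
    total = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]
    for t in terms:
        total = matadd(total, t)
    return total

def commutes_with_action(omega):
    """Check that omega commutes with the action of E, H and F."""
    return all(matmul(omega, X) == matmul(X, omega) for X in (E, H, F))
```

On this representation Ω comes out as a scalar multiple of the identity, which is one way to see that it commutes with the action.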
Version: 2 Owner: bwebste Author(s): bwebste
278.7
Dynkin diagram
Dynkin diagrams are a combinatorial way of representing the information in a root system. Their primary advantage is that they are easier to write down, remember, and analyze than explicit representations of a root system. They are an important tool in the classification of simple Lie algebras. Given a reduced root system R ⊂ E, with E an inner-product space, choose a base of simple roots Π (or equivalently, a set of positive roots R+). The Dynkin diagram associated to R is a graph whose vertices are Π. If πi and πj are distinct elements of Π, we add m_ij = 4(πi, πj)² / ((πi, πi)(πj, πj)) lines between them. This number is obviously non-negative, and an integer since it is the product of 2 quantities that the axioms of a root system require to be integers. By the Cauchy-Schwarz inequality, and the fact that simple roots are never anti-parallel (they are all strictly contained in some half space), m_ij ∈ {0, 1, 2, 3}. Thus Dynkin diagrams are finite graphs, with single, double or triple edges. In fact, the criteria are much stronger than this: if the multiple edges are counted as single edges, all connected Dynkin diagrams are trees, and have at most one multiple edge. Indeed, all connected Dynkin diagrams fall into 4 infinite families, and 5 exceptional cases, in exact parallel to the classification of simple Lie algebras. Version: 1 Owner: bwebste Author(s): bwebste
278.8
Verma module
Let g be a semisimple Lie algebra, h a Cartan subalgebra, and b a Borel subalgebra. Let Fλ for a weight λ ∈ h* be the 1-dimensional b-module on which h acts by multiplication by λ, and the positive root spaces act trivially. Now, the Verma module Mλ of the weight λ is the g-module Mλ = Fλ ⊗_{U(b)} U(g). This is an infinite dimensional representation, and it has a very important property: If V is any representation with highest weight λ, there is a surjective homomorphism Mλ → V. That is, all representations with highest weight λ are quotients of Mλ. Also, Mλ has a unique maximal submodule, so there is a unique irreducible representation with highest weight λ. Version: 1 Owner: bwebste Author(s): bwebste
278.9
Weyl chamber
If R ⊂ E is a root system, with E a euclidean vector space, and R+ is a set of positive roots, then the positive Weyl chamber is the set C = {e ∈ E | (e, α) ≥ 0 ∀α ∈ R+}.
The interior of C is a fundamental domain for the action of the Weyl group on E. The image w(C) of C under any element w of the Weyl group is called a Weyl chamber. The Weyl group W acts simply transitively on the set of Weyl chambers. A weight which lies inside the positive Weyl chamber is called dominant. Version: 2 Owner: bwebste Author(s): bwebste
278.10
Weyl group
The Weyl group W_R of a root system R ⊂ E, where E is a euclidean vector space, is the subgroup of GL(E) generated by reflection in the hyperplanes perpendicular to the roots. The map of reflection in a root α is given by r_α(v) = v − 2 ((v, α)/(α, α)) α. The Weyl group is generated by reflections in the simple roots for any choice of a set of positive roots. There is a well-defined length function ℓ : W_R → Z, where ℓ(w) is the minimal number of reflections in simple roots whose product is w. This is also the number of positive roots that w takes to negative roots. Version: 1 Owner: bwebste Author(s): bwebste
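The reflection formula and the statement that simple reflections generate the Weyl group can be explored by a small closure computation. The sketch below relies on two standard facts that are assumptions of this example rather than statements of the entry above: every root is the image of a simple root under some Weyl group element, and a Weyl group element is determined by where it sends the base. The function names and the sample bases for A2 and B2 are illustrative.

```python
from fractions import Fraction

def inner(u, v):
    return sum(Fraction(x) * Fraction(y) for x, y in zip(u, v))

def reflect(alpha, v):
    """Reflection of v in the hyperplane perpendicular to alpha."""
    c = 2 * inner(v, alpha) / inner(alpha, alpha)
    return tuple(Fraction(x) - c * a for x, a in zip(v, alpha))

def generate_roots(simple_roots):
    """Close the set of simple roots under the simple reflections."""
    roots = {tuple(Fraction(x) for x in r) for r in simple_roots}
    frontier = list(roots)
    while frontier:
        v = frontier.pop()
        for a in simple_roots:
            w = reflect(a, v)
            if w not in roots:
                roots.add(w)
                frontier.append(w)
    return roots

def weyl_group_order(simple_roots):
    """Count group elements by the distinct images of the ordered base."""
    start = tuple(tuple(Fraction(x) for x in r) for r in simple_roots)
    seen = {start}
    frontier = [start]
    while frontier:
        base = frontier.pop()
        for a in simple_roots:
            image = tuple(reflect(a, v) for v in base)
            if image not in seen:
                seen.add(image)
                frontier.append(image)
    return len(seen)

# Assumed sample bases: A2 inside R^3 and B2 inside R^2.
A2_BASE = [(1, -1, 0), (0, 1, -1)]
B2_BASE = [(1, -1), (0, 1)]
```

The search terminates because the root system is finite; for larger systems the same loops work but grow with the order of the group.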
278.11
Weyl’s theorem
Let g be a ﬁnite dimensional semisimple Lie algebra. Then any ﬁnite dimensional representation of g is completely reducible. Version: 1 Owner: bwebste Author(s): bwebste
278.12
classiﬁcation of ﬁnitedimensional representations of semisimple Lie algebras
If g is a semisimple Lie algebra, then we say that an irreducible representation V has highest weight λ, if there is a vector v ∈ Vλ , the weight space of λ, such that Xv = 0 for X in any positive root space, and v is called a highest vector, or vector of highest weight. There is a unique (up to isomorphism) irreducible ﬁnite dimensional representation of g with highest weight λ for any dominant weight λ ∈ ΛW , where ΛW is the weight lattice of g, and every irreducible representation of g is of this type. Version: 1 Owner: bwebste Author(s): bwebste
278.13
cohomology of semisimple Lie algebras
There are some important facts that make the cohomology of semisimple Lie algebras easier to deal with than general Lie algebra cohomology. In particular, there are a number of vanishing theorems. First of all, let g be a finite-dimensional, semisimple Lie algebra over C.
Theorem. Let M be a nontrivial irreducible finite-dimensional representation of g. Then H^n(g, M) = 0 for all n.
Whitehead's lemmata. Let M be any finite-dimensional representation of g; then H^1(g, M) = H^2(g, M) = 0.
Whitehead's lemmata lead to two very important results. From the vanishing of H^1, we can derive Weyl's theorem, the fact that representations of semisimple Lie algebras are completely reducible, since extensions of M by N are classified by H^1(g, Hom(M, N)). And from the vanishing of H^2, we obtain Levi's theorem, which states that every Lie algebra is a split extension of a semisimple algebra by a solvable algebra, since H^2(g, M) classifies extensions of g by M with a specified action of g on M. Version: 2 Owner: bwebste Author(s): bwebste
278.14
nilpotent cone
Let g be a finite dimensional semisimple Lie algebra. Then the nilpotent cone N of g is the set of elements which act nilpotently on all representations of g. This is an irreducible subvariety of g (considered as a k-vector space), which is invariant under the adjoint action of G on g (here G is the adjoint group associated to g).
Version: 3 Owner: bwebste Author(s): bwebste
278.15
parabolic subgroup
Let G be a complex semisimple Lie group. Then any subgroup P of G containing a Borel subgroup B is called parabolic. Parabolics are classified in the following manner. Let g be the Lie algebra of G, h the unique Cartan subalgebra contained in b, the algebra of B, R the set of roots corresponding to this choice of Cartan subalgebra, and R+ the set of positive roots whose root spaces are contained in b, and let p be the Lie algebra of P. Then there exists a unique subset ΠP of Π, the base of simple roots associated to this choice of positive roots, such that {b, g−α}α∈ΠP generates p. In other words, parabolics containing a single Borel subgroup are classified by subsets of the Dynkin diagram, with the empty set corresponding to the Borel, and the whole graph corresponding to the group G. Version: 1 Owner: bwebste Author(s): bwebste
278.16
pictures of Dynkin diagrams
Here is a complete list of connected Dynkin diagrams. In general if the name of a diagram has n as a subscript then there are n dots in the diagram. There are four infinite series that correspond to classical complex (that is over C) simple Lie algebras. No pun intended.
• An, for n ≥ 1, represents the simple complex Lie algebra sl_{n+1}:
[Figure: Dynkin diagrams A1, A2, A3, An]
• Bn, for n ≥ 1, represents the simple complex Lie algebra so_{2n+1}:
[Figure: Dynkin diagrams B1, B2, B3, Bn]
• Cn, for n ≥ 1, represents the simple complex Lie algebra sp_{2n}:
[Figure: Dynkin diagrams C1, C2, C3, Cn]
• Dn, for n ≥ 3, represents the simple complex Lie algebra so_{2n}:
[Figure: Dynkin diagrams D3, D4, D5, Dn]
And then there are the exceptional cases that come in ﬁnite families. The corresponding Lie algebras are usually called by the name of the diagram.
• There is the E series that has three members: E6 which represents a 78-dimensional Lie algebra, E7 which represents a 133-dimensional Lie algebra, and E8 which represents a 248-dimensional Lie algebra.
[Figure: Dynkin diagrams E6, E7, E8]
• There is the F4 diagram which represents a 52-dimensional complex simple Lie algebra:
[Figure: Dynkin diagram F4]
• And finally there is G2 that represents a 14-dimensional Lie algebra.
[Figure: Dynkin diagram G2]
Notice the low dimensional coincidences: A1 = B1 = C1, which reflects the exceptional isomorphisms sl2 ≅ so3 ≅ sp2. Also B2 ≅ C2, reflecting the isomorphism so5 ≅ sp4. And A3 ≅ D3, reflecting sl4 ≅ so6.
Remark 1. Often in the literature the listing of Dynkin diagrams is arranged so that there are no “intersections” between different families. However by allowing intersections one gets a graphical representation of the low degree isomorphisms. In the same vein there is a graphical representation of the isomorphism so4 ≅ sl2 × sl2. Namely, if not for the requirement that the families consist of connected diagrams, one could start the D family with D2, which consists of two disjoint copies of A1.
[Figure: Dynkin diagram D2]
Version: 9 Owner: Dr Absentius Author(s): Dr Absentius
278.17
positive root
If R ⊂ E is a root system, with E a euclidean vector space, then a subset R+ ⊂ R is called a set of positive roots if there is a vector v ∈ E such that (α, v) > 0 if α ∈ R+, and (α, v) < 0 if α ∈ R\R+. Roots which are not positive are called negative. Since −α is negative exactly when α is positive, exactly half the roots must be positive. Version: 2 Owner: bwebste Author(s): bwebste
278.18
rank
Let g be a finite dimensional Lie algebra. One can show that all Cartan subalgebras h ⊂ g have the same dimension. The rank of g is defined to be this dimension. Version: 5 Owner: rmilson Author(s): rmilson
278.19
root lattice
If R ⊂ E is a root system, with E a euclidean vector space, then the root lattice ΛR of R is the subset of E generated by R as an abelian group. In fact, this group is free on the simple roots, and is thus a full sublattice of E.
Version: 1 Owner: bwebste Author(s): bwebste
278.20
root system
Root systems are sets of vectors in a Euclidean space which are used to classify simple Lie algebras, and to understand their representation theory, and also in the theory of reflection groups. Axiomatically, an (abstract) root system R is a set of vectors in a euclidean vector space E with inner product (·, ·), such that:
1. R spans the vector space E.
2. if α ∈ R, then reflection in the hyperplane orthogonal to α preserves R.
3. if α, β ∈ R, then 2(α, β)/(α, α) is an integer.
Axiom 3 is sometimes dropped when dealing with reflection groups, but it is necessary for the root systems which arise in connection with Lie algebras. Additionally, a root system is called reduced if for all α ∈ R, if kα ∈ R, then k = ±1. We call a root system indecomposable if there is no proper nonempty subset R′ ⊂ R such that every vector in R′ is orthogonal to every vector in R \ R′.
Root systems arise in the classification of semisimple Lie algebras in the following manner: If g is a semisimple complex Lie algebra, then one can choose a self-normalizing nilpotent subalgebra of g (alternatively, this is the commutant of an element with commutant of minimal dimension), called a Cartan subalgebra, traditionally denoted h. The elements of h act on g by the adjoint action by diagonalizable linear maps. Since these maps all commute, they are all simultaneously diagonalizable. The simultaneous eigenspaces of this action are called root spaces, and the decomposition of g into h and the root spaces is called a root decomposition of g. It turns out that all root spaces are one dimensional. Now, for each eigenspace, we have a map λ : h → C, given by Hv = λ(H)v for v an element of that eigenspace. The set R ⊂ h* of these λ is called the root system of the algebra g. The Cartan subalgebra h has a natural inner product (the Killing form), which in turn induces an inner product on h*. With respect to this inner product, the root system R is an abstract root system, in the sense defined above.
Conversely, given any abstract root system R, there is a unique semisimple complex Lie algebra g such that R is its root system. Thus to classify complex semisimple Lie algebras, we need only classify root systems, a somewhat easier task. Really, we only need to classify indecomposable root systems, since all other root systems are built out of these. The Lie algebra corresponding to a root system is simple if and only if the associated root system is indecomposable.
By convention e1, . . . , en are orthonormal vectors, and the subscript on the name of the root system is the dimension of the space it is contained in, also called the rank of the system, and the indices i and j will run from 1 to n. There are four infinite series of indecomposable root systems:
• An = {ei − ej, δ + ei}_{i≠j}, where δ = Σ_{k=1}^{n} ek. This system corresponds to sl_{n+1} C.
• Bn = {±ei ± ej}_{i<j} ∪ {±ei}. This system corresponds to so_{2n+1} C.
• Cn = {±ei ± ej}_{i<j} ∪ {±2ei}. This system corresponds to sp_{2n} C.
• Dn = {±ei ± ej}_{i<j}. This system corresponds to so_{2n} C.
and there are five exceptional root systems G2, F4, E6, E7, E8, with five corresponding exceptional algebras, generally denoted by the same letter in lowercase Fraktur (g2, etc.). Version: 3 Owner: bwebste Author(s): bwebste
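The coordinate descriptions of the B, C and D series can be checked against axioms 2 and 3 of an abstract root system by enumeration. The following is a minimal sketch, assuming the roots are stored as integer tuples in the orthonormal basis e1, . . . , en; the function names root_system and is_root_system are hypothetical, and the spanning axiom is not tested here.

```python
from fractions import Fraction
from itertools import combinations, product

def root_system(kind, n):
    """Roots of B_n, C_n or D_n as integer tuples of length n."""
    if kind not in ("B", "C", "D"):
        raise ValueError("kind must be 'B', 'C' or 'D'")
    roots = set()
    # The vectors +-e_i +- e_j with i < j are common to all three series.
    for i, j in combinations(range(n), 2):
        for si, sj in product((1, -1), repeat=2):
            v = [0] * n
            v[i], v[j] = si, sj
            roots.add(tuple(v))
    # B_n adds +-e_i, C_n adds +-2e_i, D_n adds nothing.
    extra = {"B": 1, "C": 2, "D": 0}[kind]
    if extra:
        for i in range(n):
            for s in (1, -1):
                v = [0] * n
                v[i] = s * extra
                roots.add(tuple(v))
    return roots

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def is_root_system(roots):
    """Check integrality (axiom 3) and closure under reflections (axiom 2)."""
    for a in roots:
        for b in roots:
            c = Fraction(2 * inner(b, a), inner(a, a))
            if c.denominator != 1:
                return False
            image = tuple(x - c * y for x, y in zip(b, a))
            if image not in roots:
                return False
    return True
```

Adding a stray vector such as (3, 0) to B2 makes the check fail, which illustrates how restrictive the two axioms are.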
278.21
simple and semisimple Lie algebras
A Lie algebra is called simple if it has no proper nonzero ideals and is not abelian. A Lie algebra is called semisimple if it has no nonzero solvable ideals and is not abelian. Let k = R or C. Examples of simple algebras are sln k, the Lie algebra of the special linear group (traceless matrices), son k, the Lie algebra of the special orthogonal group (skew-symmetric matrices), and sp2n k, the Lie algebra of the symplectic group. Over R, there are other simple Lie algebras, such as sun, the Lie algebra of the special unitary group (skew-Hermitian matrices). Any semisimple Lie algebra is a direct product of simple Lie algebras. Simple and semisimple Lie algebras are one of the most widely studied classes of algebras for a number of reasons. First of all, many of the most interesting Lie groups have semisimple Lie algebras. Secondly, their representation theory is very well understood. Finally, there is a beautiful classification of simple Lie algebras. Over C, there are 3 infinite series of simple Lie algebras: sln, son and sp2n, and 5 exceptional simple Lie algebras g2, f4, e6, e7, and e8. Over R the picture is more complicated, as several different Lie algebras can have the same complexification (for example, sun and sln R both have complexification sln C). Version: 3 Owner: bwebste Author(s): bwebste
278.22
simple root
Let R ⊂ E be a root system, with E a euclidean vector space. If R+ is a set of positive roots, then a root is called simple if it is positive, and not the sum of any two positive roots. The simple roots form a basis of the vector space E, and any positive root is a non-negative integer linear combination of simple roots. A set of roots which is simple with respect to some choice of a set of positive roots is called a base. The Weyl group of the root system acts simply transitively on the set of bases. Version: 1 Owner: bwebste Author(s): bwebste
278.23
weight (Lie algebras)
Let g be a semisimple Lie algebra. Choose a Cartan subalgebra h. Then a weight is simply an element of the dual h*. Weights arise in the representation theory of semisimple Lie algebras in the following manner: Let V be a finite dimensional representation of g. The elements of h must act on V by diagonalizable (also called semisimple) linear transformations. Since h is abelian, these must be simultaneously diagonalizable. Thus, V decomposes as the direct sum of simultaneous eigenspaces for h. Let W ⊂ V be such an eigenspace. Then the map λ defined by λ(H)v = Hv for v ∈ W is a linear functional on h, and thus a weight, as defined above. The maximal eigenspace Vλ with weight λ is called the weight space of λ. The dimension of Vλ is called the multiplicity of λ. A finite dimensional representation of a semisimple algebra is determined by the multiplicities of its weights. Version: 3 Owner: bwebste Author(s): bwebste
278.24
weight lattice
The weight lattice ΛW of a root system R ⊂ E is the dual lattice to ΛR, the root lattice of R. That is, ΛW = {e ∈ E | (e, r) ∈ Z for all r ∈ ΛR}. Weights which lie in the weight lattice are called integral. Since the simple roots are free generators of the root lattice, one need only check that (e, π) ∈ Z for all simple roots π. If R ⊂ h* is the root system of a semisimple Lie algebra g with Cartan subalgebra h, then ΛW is exactly the set of weights appearing in finite dimensional representations of g. Version: 4 Owner: bwebste Author(s): bwebste
Chapter 279 17B30 – Solvable, nilpotent (super)algebras
279.1 Engel’s theorem
Before proceeding, it will be useful to recall the definition of a nilpotent Lie algebra. Let g be a Lie algebra. The lower central series of g is defined to be the filtration of ideals D_0 g ⊃ D_1 g ⊃ D_2 g ⊃ . . . , where D_0 g = g, D_{k+1} g = [g, D_k g], k ∈ N. To say that g is nilpotent is to say that the lower central series has a trivial termination, i.e. that there exists a k such that D_k g = 0, or equivalently, that k nested bracket operations always vanish.
Theorem 1 (Engel). Let g ⊂ End V be a Lie algebra of endomorphisms of a finite-dimensional vector space V. Suppose that all elements of g are nilpotent transformations. Then, g is a nilpotent Lie algebra.
Lemma 1. Let X : V → V be a nilpotent endomorphism of a vector space V. Then, the adjoint action ad(X) : End V → End V is also a nilpotent endomorphism.
Proof. Suppose that X^k = 0 for some k ∈ N. We will show that ad(X)^{2k−1} = 0. Note that ad(X) = l(X) − r(X), where l(X), r(X) : End V → End V are the endomorphisms corresponding, respectively, to left and right multiplication by X. These two endomorphisms commute, and hence we can use the binomial formula to write
$$\mathrm{ad}(X)^{2k-1} = \sum_{i=0}^{2k-1} (-1)^i \binom{2k-1}{i}\, l(X)^{2k-1-i}\, r(X)^i .$$
Each of the terms in the above sum vanishes because l(X)^k = r(X)^k = 0 and, for every i, either 2k − 1 − i ≥ k or i ≥ k. QED
Lemma 2. Let g be as in the theorem, and suppose, in addition, that g is a nilpotent Lie algebra. Then the joint kernel, ker g = ⋂_{a∈g} ker a, is nontrivial.
Proof. We proceed by induction on the dimension of g. The claim is true for dimension 1, because then g is generated by a single nilpotent transformation, and all nilpotent transformations are singular. Suppose then that the claim is true for all Lie algebras of dimension less than n = dim g. We note that D_1 g fits the hypotheses of the lemma, and has dimension less than n, because g is nilpotent. Hence, by the induction hypothesis V0 = ker D_1 g is nontrivial. Now, if we restrict all actions to V0, we obtain a representation of g by commuting transformations. This is because for all a, b ∈ g and v ∈ V0 we have abv − bav = [a, b]v = 0. Now a finite number of mutually commuting linear endomorphisms admits a mutual eigenspace decomposition. In particular, if all of the commuting endomorphisms are singular, their joint kernel will be nontrivial. We apply this result to a basis of g/D_1 g acting on V0, and the desired conclusion follows. QED
Proof of the theorem. We proceed by induction on the dimension of g. The theorem is true in dimension 1, because in that circumstance D_1 g is trivial. Next, suppose that the theorem holds for all Lie algebras of dimension less than n = dim g. Let h ⊂ g be a properly contained subalgebra of minimum codimension. We claim that there exists an a ∈ g but not in h such that [a, h] ⊂ h. By the induction hypothesis, h is nilpotent. To prove the claim consider the isotropy representation of h on g/h. By Lemma 1, the action of each a ∈ h on g/h is a nilpotent endomorphism. Hence, we can apply Lemma 2 to deduce that the joint kernel of all these actions is nontrivial, i.e. there exists an a ∈ g but not in h such that [b, a] ≡ 0 (mod h) for all b ∈ h. Equivalently, [h, a] ⊂ h and the claim is proved.
Evidently then, the span of a and h is a subalgebra of g. Since h has minimum codimension, we infer that h and a span all of g, and that
D_1 g ⊂ h. (279.1.1)
Next, we claim that all the D_k h are ideals of g. It is enough to show that [a, D_k h] ⊂ D_k h. We argue by induction on k. Suppose the claim is true for some k. Let b ∈ h, c ∈ D_k h be given. By the Jacobi identity [a, [b, c]] = [[a, b], c] + [b, [a, c]]. The first term on the right hand side is in D_{k+1} h because [a, b] ∈ h. The second term is in D_{k+1} h by the induction hypothesis. In this way the claim is established.
Now a is nilpotent, and hence by Lemma 1,
ad(a)^n = 0 (279.1.2)
for some n ∈ N. We now claim that D_{n+1} g ⊂ D_1 h. By (279.1.1) it suffices to show that
[g, [. . . [g, h] . . .]] ⊂ D_1 h (with g appearing n times).
Putting g1 = g/D_1 h, h1 = h/D_1 h, this is equivalent to
[g1, [. . . [g1, h1] . . .]] = 0 (with g1 appearing n times).
However, h1 is abelian, and hence, the above follows directly from (279.1.2). Adapting this argument in the obvious fashion we can show that D_{kn+1} g ⊂ D_k h. Since h is nilpotent, g must be nilpotent as well. QED
Historical remark. In the traditional formulation of Engel's theorem, the hypotheses are the same, but the conclusion is that there exists a basis B of V such that all elements of g are represented by strictly upper triangular (in particular nilpotent) matrices relative to B. Let us put this another way. The vector space Nil of strictly upper triangular matrices is a nilpotent Lie algebra, and indeed all subalgebras of Nil are nilpotent Lie algebras. Engel's theorem asserts that the converse holds, i.e. if all elements of a Lie algebra g are nilpotent transformations, then g is isomorphic to a subalgebra of Nil. The classical result follows straightforwardly from our version of the Theorem and from Lemma 2. Indeed, let $V_1$ be the joint kernel of g. We then let $U_2$ be the joint kernel of g acting on $V/V_1$, and let $V_2 \subset V$ be the subspace obtained by pulling $U_2$ back to V. We do this a finite number of times and obtain a flag of subspaces
$0 = V_0 \subset V_1 \subset V_2 \subset \ldots \subset V_n = V$,
such that $g V_{k+1} \subset V_k$ for all k. Then choose an adapted basis relative to this flag, and we're done.
Version: 2 Owner: rmilson Author(s): rmilson
279.2
Lie’s theorem
Let g be a finite dimensional complex solvable Lie algebra, and V a nonzero finite dimensional representation of g. Then there exists an element of V which is a simultaneous eigenvector for all elements of g. Applying this result inductively, we find that there is a basis of V with respect to which all elements of g are upper triangular. Version: 3 Owner: bwebste Author(s): bwebste
279.3
solvable Lie algebra
Let g be a Lie algebra. The lower central series of g is the filtration of subalgebras
$D_1 g \supset D_2 g \supset D_3 g \supset \cdots \supset D_k g \supset \cdots$
of g, inductively defined for every natural number k as follows:
$D_1 g := [g, g]$, $D_k g := [g, D_{k-1} g]$.
The upper central series of g (more commonly called the derived series) is the filtration
$D^1 g \supset D^2 g \supset D^3 g \supset \cdots \supset D^k g \supset \cdots$
defined inductively by
$D^1 g := [g, g]$, $D^k g := [D^{k-1} g, D^{k-1} g]$.
In fact both $D_k g$ and $D^k g$ are ideals of g, and $D^k g \subset D_k g$ for all k. The Lie algebra g is defined to be nilpotent if $D_k g = 0$ for some k ∈ N, and solvable if $D^k g = 0$ for some k ∈ N. A subalgebra h of g is said to be nilpotent or solvable if h is nilpotent or solvable when considered as a Lie algebra in its own right. The terms may also be applied to ideals of g, since every ideal of g is also a subalgebra. Version: 1 Owner: djao Author(s): djao
Chapter 280 17B35 – Universal enveloping (super)algebras
280.1 Poincaré-Birkhoff-Witt theorem
Let g be a Lie algebra over a field k, and let B be a k-basis of g equipped with a linear order ≤. The Poincaré-Birkhoff-Witt theorem (often abbreviated to PBW theorem) states that the monomials $x_1 x_2 \cdots x_n$ with $x_1 \le x_2 \le \ldots \le x_n$ elements of B constitute a k-basis of the universal enveloping algebra U(g) of g. Such monomials are often called ordered monomials or PBW monomials.
It is easy to see that they span U(g): for all n ∈ N, let $M_n$ denote the set
$M_n = \{(x_1, \ldots, x_n) \mid x_1 \le \ldots \le x_n\} \subset B^n$,
and denote by $\pi : \bigcup_{n=0}^{\infty} B^n \to U(g)$ the multiplication map. Clearly it suffices to prove that
$\pi(B^n) \subseteq \sum_{i=0}^{n} \pi(M_i)$
for all n ∈ N (the sum denoting the k-linear span of the sets $\pi(M_i)$); to this end, we proceed by induction. For n = 0 the statement is clear. Assume that it holds for n − 1 ≥ 0, and consider a list $(x_1, \ldots, x_n) \in B^n$. If it is an element of $M_n$, then we are done. Otherwise, there exists an index i such that $x_i > x_{i+1}$. Now we have
$\pi(x_1, \ldots, x_n) = \pi(x_1, \ldots, x_{i-1}, x_{i+1}, x_i, x_{i+2}, \ldots, x_n) + x_1 \cdots x_{i-1} [x_i, x_{i+1}] x_{i+2} \cdots x_n$.
As B is a basis of g, $[x_i, x_{i+1}]$ is a linear combination of elements of B. Using this to expand the second term above, we find that it is in $\sum_{i=0}^{n-1} \pi(M_i)$ by the induction hypothesis. The argument of π in the first term, on the other hand, is lexicographically smaller than $(x_1, \ldots, x_n)$, but contains the same entries. Clearly this rewriting process must end, and this concludes the induction step. The proof of linear independence of the PBW monomials is slightly more difficult.
Version: 1 Owner: draisma Author(s): draisma
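The rewriting step used in the spanning argument can be sketched in code. This is a minimal illustration and not part of the original entry: basis elements are encoded as integers ordered by ≤, an element of U(g) as a dictionary from words to coefficients, and the names `straighten` and `bracket` are hypothetical. The example structure constants are those of the three-dimensional Lie algebra with [p, q] = e and e central.

```python
from collections import defaultdict

def straighten(element, bracket):
    """Rewrite an element of U(g) as a combination of ordered monomials.

    `element` maps words (tuples of basis indices) to coefficients.
    `bracket[(a, b)]` maps basis indices to the coefficients of [a, b];
    it is only consulted for a > b, and missing entries mean [a, b] = 0.
    """
    result = defaultdict(int)
    todo = [(tuple(word), coeff) for word, coeff in element.items()]
    while todo:
        word, coeff = todo.pop()
        # Find the first descent x_i > x_{i+1}, if there is one.
        i = next((j for j in range(len(word) - 1) if word[j] > word[j + 1]), None)
        if i is None:
            result[word] += coeff  # already an ordered monomial
            continue
        a, b = word[i], word[i + 1]
        # x_i x_{i+1} = x_{i+1} x_i + [x_i, x_{i+1}]
        todo.append((word[:i] + (b, a) + word[i + 2:], coeff))
        for k, c in bracket.get((a, b), {}).items():
            todo.append((word[:i] + (k,) + word[i + 2:], coeff * c))
    return {word: coeff for word, coeff in result.items() if coeff != 0}

P, Q, E = 0, 1, 2                        # basis ordered p < q < e
heisenberg = {(Q, P): {E: -1}}           # [q, p] = -e, all other descending brackets vanish
print(straighten({(Q, P): 1}, heisenberg))       # q p = p q - e
print(straighten({(Q, Q, P): 1}, heisenberg))    # q q p = p q q - 2 q e
```

Termination of the loop is exactly the statement, made in the proof, that each swap produces a lexicographically smaller word of the same length plus strictly shorter words.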
280.2
universal enveloping algebra
A universal enveloping algebra of a Lie algebra g over a field k is an associative algebra U (with unity) over k, together with a Lie algebra homomorphism ι : g → U (where the Lie algebra structure on U is given by the commutator), such that if A is another associative algebra over k and φ : g → A is another Lie algebra homomorphism, then there exists a unique homomorphism ψ : U → A of associative algebras such that the diagram
$$\begin{array}{ccc} g & \xrightarrow{\iota} & U \\ & {\scriptstyle \varphi}\searrow & \downarrow{\scriptstyle \psi} \\ & & A \end{array}$$
commutes, i.e. ψ ∘ ι = φ. Any g has a universal enveloping algebra: let T be the associative tensor algebra generated by the vector space g, and let I be the two-sided ideal of T generated by elements of the form xy − yx − [x, y] for x, y ∈ g; then U = T/I is a universal enveloping algebra of g. Moreover, the universal property above ensures that all universal enveloping algebras of g are canonically isomorphic; this justifies the standard notation U(g). Some remarks:
1. By the Poincaré-Birkhoff-Witt theorem, the map ι is injective; usually g is identified with ι(g). From the construction above it is clear that this space generates U(g) as an associative algebra with unity.
2. By definition, the (left) representation theory of U(g) is identical to that of g. In particular, any irreducible g-module corresponds to a maximal left ideal of U(g).
Example: let g be the Lie algebra generated by the elements p, q, and e with Lie bracket determined by [p, q] = e and [p, e] = [q, e] = 0. Then U(g)/(e − 1) (where (e − 1) denotes the two-sided ideal generated by e − 1) is isomorphic to the skew polynomial algebra $k[x, \frac{\partial}{\partial x}]$, the isomorphism being determined by
$p + (e - 1) \mapsto \frac{\partial}{\partial x}$ and $q + (e - 1) \mapsto x$.
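As a small sanity check of the example, one can let p act as differentiation and q as multiplication by x on polynomials and confirm that pq − qp acts as the identity, which is what the relation [p, q] = e becomes once e is set to 1. The sketch below is only an illustration under that assumption; polynomials are stored as coefficient lists, and the helper names `p`, `q`, `sub` and `commutator_pq` are hypothetical. It checks the relation in one representation and does not prove the isomorphism.

```python
def p(f):
    """Act as d/dx on a polynomial given by its coefficient list [c0, c1, ...]."""
    return [k * c for k, c in enumerate(f)][1:]

def q(f):
    """Act as multiplication by x."""
    return [0] + list(f)

def sub(f, g):
    """Difference f - g of two coefficient lists, padded to equal length."""
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def commutator_pq(f):
    """Apply pq - qp to the polynomial f."""
    return sub(p(q(f)), q(p(f)))

f = [5, 0, -3, 2]            # 5 - 3x^2 + 2x^3
print(commutator_pq(f))      # the same coefficient list as f
```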
Version: 1 Owner: draisma Author(s): draisma
Chapter 281 17B56 – Cohomology of Lie (super)algebras
281.1 Lie algebra cohomology
Let g be a Lie algebra, and M a representation of g. Let $M^g = \{m \in M : Xm = 0 \ \forall X \in g\}$. The assignment $M \mapsto M^g$ is clearly a covariant functor. Call its derived functor $R^i(-^g) = H^i(g, -)$ the Lie algebra cohomology of g with coefficients in M. These cohomology groups have certain interpretations. For any Lie algebra, $H^1(g, k) \cong (g/[g, g])^*$, the dual of the abelianization of g, and $H^2(g, M)$ is in natural bijection with equivalence classes of Lie algebra extensions (thinking of M as an abelian Lie algebra) 0 → M → f → g → 0 such that the action of g on M induced by that of f coincides with that already specified. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 282 17B67 – Kac-Moody (super)algebras (structure and representation theory)
282.1 Kac-Moody algebra
Let A be an n × n generalized Cartan matrix. If n − r is the rank of A, then let h be an (n + r)-dimensional complex vector space. Choose n linearly independent elements $\alpha_1, \ldots, \alpha_n \in h^*$ (called roots), and $\check\alpha_1, \ldots, \check\alpha_n \in h$ (called coroots) such that $\langle \alpha_i, \check\alpha_j \rangle = a_{ij}$, where $\langle \cdot, \cdot \rangle$ is the natural pairing of $h^*$ and h. This choice is unique up to automorphisms of h. Then the Kac-Moody algebra g(A) associated to A is the Lie algebra generated by elements $X_1, \ldots, X_n, Y_1, \ldots, Y_n$ and h, with the relations
$[X_i, Y_i] = \check\alpha_i$, $[X_i, Y_j] = 0$ for i ≠ j,
$[X_i, h] = \alpha_i(h) X_i$, $[Y_i, h] = -\alpha_i(h) Y_i$,
$\underbrace{[X_i, [X_i, \cdots, [X_i}_{1 - a_{ij}\text{ times}}, X_j] \cdots ]] = 0$, $\underbrace{[Y_i, [Y_i, \cdots, [Y_i}_{1 - a_{ij}\text{ times}}, Y_j] \cdots ]] = 0$.
If the matrix A is positive-definite, we obtain a finite dimensional semisimple Lie algebra, and A is the Cartan matrix associated to a Dynkin diagram. Otherwise, the algebra we obtain is infinite dimensional and has an r-dimensional center. Version: 2 Owner: bwebste Author(s): bwebste
282.2
generalized Cartan matrix
A generalized Cartan matrix is a matrix A whose diagonal entries are all 2, and whose off-diagonal entries are nonpositive integers, such that $a_{ij} = 0$ if and only if $a_{ji} = 0$. Such a matrix is called symmetrizable if there is an invertible diagonal matrix B such that AB is symmetric. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 283 17B99 – Miscellaneous
283.1 Jacobi identity interpretations
The Jacobi identity in a Lie algebra g has various interpretations that are more transparent, whence easier to remember, than the usual form [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0. One is the fact that the adjoint representation ad : g → End(g) really is a representation. Yet another way to formulate the identity is ad(x)[y, z] = [ad(x)y, z] + [y, ad(x)z], i.e., ad(x) is a derivation on g for all x ∈ g. Version: 2 Owner: draisma Author(s): draisma
283.2
Lie algebra
A Lie algebra over a ﬁeld k is a vector space g with a bilinear map [ , ] : g × g → g, called the Lie bracket and denoted (x, y) → [x, y]. It is required to satisfy: 1. [x, x] = 0 for all x ∈ g. 2. The Jacobi identity: [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 for all x, y, z ∈ g.
283.2.1
Subalgebras & Ideals
A vector subspace h of the Lie algebra g is a subalgebra if h is closed under the Lie bracket operation, or, equivalently, if h itself is a Lie algebra under the same bracket operation as g. An ideal of g is a subspace h for which [x, y] ∈ h whenever either x ∈ h or y ∈ h. Note that every ideal is also a subalgebra. Some general examples of subalgebras:
• The center of g, defined by Z(g) := {x ∈ g | [x, y] = 0 for all y ∈ g}. It is an ideal of g.
• The normalizer of a subalgebra h is the set N(h) := {x ∈ g | [x, h] ⊂ h}. The Jacobi identity guarantees that N(h) is always a subalgebra of g.
• The centralizer of a subset X ⊂ g is the set C(X) := {x ∈ g | [x, X] = 0}. Again, the Jacobi identity implies that C(X) is a subalgebra of g.
283.2.2
Homomorphisms
Given two Lie algebras g and g′ over the field k, a homomorphism from g to g′ is a linear transformation φ : g → g′ such that φ([x, y]) = [φ(x), φ(y)] for all x, y ∈ g. An injective homomorphism is called a monomorphism, and a surjective homomorphism is called an epimorphism. The kernel of a homomorphism φ : g → g′ (considered as a linear transformation) is denoted ker(φ). It is always an ideal in g.
283.2.3
Examples
• Any vector space can be made into a Lie algebra simply by setting [x, y] = 0 for all x and y. The resulting Lie algebra is called an abelian Lie algebra.
• If G is a Lie group, then the tangent space at the identity forms a Lie algebra over the real numbers.
• R^3 with the cross product operation is a nonabelian three dimensional Lie algebra over R.
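The cross product example can be checked numerically. The following sketch, which is an illustration rather than part of the original entry, verifies [x, x] = 0 and the Jacobi identity on a handful of integer vectors; the function names are hypothetical. Since the bracket is bilinear, a check on the standard basis vectors already implies the identities in general, and the extra sample vectors are included only for illustration.

```python
from itertools import product

def cross(u, v):
    """Cross product on R^3, playing the role of the Lie bracket [u, v]."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(*vectors):
    """Componentwise sum of any number of 3-vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def satisfies_lie_axioms(vectors):
    """Check [x, x] = 0 and the Jacobi identity on every triple from `vectors`."""
    zero = (0, 0, 0)
    if any(cross(x, x) != zero for x in vectors):
        return False
    for x, y, z in product(vectors, repeat=3):
        jacobi = add(cross(x, cross(y, z)),
                     cross(y, cross(z, x)),
                     cross(z, cross(x, y)))
        if jacobi != zero:
            return False
    return True

# Integer sample vectors keep the arithmetic exact.
samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3), (-2, 5, 7)]
print(satisfies_lie_axioms(samples))
# The bracket is not identically zero, so the algebra is nonabelian.
print(cross((1, 0, 0), (0, 1, 0)))
```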
283.2.4
Historical Note
Lie algebras are so named in honour of Sophus Lie, a Norwegian mathematician who pioneered the study of these mathematical objects. Lie's discovery was tied to his investigation of continuous transformation groups and symmetries. One joint project with Felix Klein called for the classification of all finite-dimensional groups acting on the plane. The task seemed hopeless owing to the generally nonlinear nature of such group actions. However, Lie was able to solve the problem by remarking that a transformation group can be locally reconstructed from its corresponding “infinitesimal generators”, that is to say vector fields corresponding to various 1-parameter subgroups. In terms of this geometric correspondence, the group composition operation manifests itself as the bracket of vector fields, and this is very much a linear operation. Thus the task of classifying group actions in the plane became the task of classifying all finite-dimensional Lie algebras of planar vector fields, a project that Lie brought to a successful conclusion. This “linearization trick” proved to be incredibly fruitful and led to great advances in geometry and differential equations. Such advances are based, however, on various results from the theory of Lie algebras. Lie was the first to make significant contributions to this purely algebraic theory, but he was surely not the last. Version: 10 Owner: djao Author(s): djao, rmilson, nerdy2
283.3
real form
Let G be a complex Lie group. A real Lie group K is called a real form of G if $g \cong C \otimes_R k$, where g and k are the Lie algebras of G and K, respectively. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 284 18-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
284.1 Grothendieck spectral sequence
If F : C → D and G : D → E are two covariant left exact functors between abelian categories, and if F takes injective objects of C to G-acyclic objects of D, then there is a spectral sequence for each object A of C:
$E_2^{pq} = (R^p G \circ R^q F)(A) \Rightarrow R^{p+q}(G \circ F)(A)$
If X and Y are topological spaces and C = Ab(X) is the category of sheaves of abelian groups on X and D = Ab(Y) and E = Ab is the category of abelian groups, then for a continuous map f : X → Y we have a functor $f_*$ : Ab(X) → Ab(Y), the direct image functor. We also have the global section functors $\Gamma_X$ : Ab(X) → Ab, and $\Gamma_Y$ : Ab(Y) → Ab. Then since $\Gamma_Y \circ f_* = \Gamma_X$ and we can verify the hypothesis (injectives are flasque, direct images of flasque sheaves are flasque, and flasque sheaves are acyclic for the global section functor), the sequence in this case becomes:
$H^p(Y, R^q f_* \mathcal{F}) \Rightarrow H^{p+q}(X, \mathcal{F})$
for a sheaf $\mathcal{F}$ of abelian groups on X, exactly the Leray spectral sequence. I can recommend no better book than Weibel's book on homological algebra. Sheaf theory can be found in Hartshorne or in Godement's book. Version: 5 Owner: bwebste Author(s): Manoj, ceps, nerdy2
284.2
category of sets
The category of sets has as its objects all sets and as its morphisms functions between sets. (This works if a category’s objects are only required to be part of a class, as the class of all sets exists.) Alternately one can specify a universe, containing all sets of interest in the situation, and take the category to contain only sets in that universe and functions between those sets. Version: 1 Owner: nerdy2 Author(s): nerdy2
284.3
functor
Given two categories C and D, a covariant functor T : C → D consists of an assignment for each object X of C an object T (X) of D (i.e. a “function” T : Ob(C) → Ob(D)) together with an assignment for every morphism f ∈ HomC(A, B), to a morphism T (f ) ∈ HomD(T (A), T (B)), such that: • T (1A ) = 1T (A) where 1X denotes the identity morphism on the object X (in the respective category). • T (g ◦ f ) = T (g) ◦ T (f ), whenever the composition g ◦ f is deﬁned. A contravariant functor T : C → D is just a covariant functor T : Cop → D from the opposite category. In other words, the assignment reverses the direction of maps. If f ∈ HomC(A, B), then T (f ) ∈ HomD(T (B), T (A)) and T (g ◦ f ) = T (f ) ◦ T (g) whenever the composition is deﬁned (the domain of g is the same as the codomain of f ). Given a category C and an object X we always have the functor T : C → Sets to the category of sets deﬁned on objects by T (A) = Hom(X, A). If f : A → B is a morphism of C, then we deﬁne T (f ) : Hom(X, A) → Hom(X, B) by g → f ◦ g. This is a covariant functor, denoted by Hom(X, −). Similarly, one can deﬁne a contravariant functor Hom(−, X) : C → Sets. Version: 3 Owner: nerdy2 Author(s): nerdy2
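The covariant functor Hom(X, −) can be illustrated on finite sets. The sketch below is only an illustration under stated assumptions: functions are stored as Python dictionaries, elements of Hom(X, A) as sorted tuples of pairs so that they can be compared, and the names `hom`, `compose` and `hom_functor` are hypothetical. It checks the two functor laws on one small example.

```python
from itertools import product

def hom(X, A):
    """All functions X -> A, each encoded as a sorted tuple of (x, value) pairs."""
    X = sorted(X)
    return [tuple(zip(X, values)) for values in product(sorted(A), repeat=len(X))]

def compose(f, g):
    """Return the composite f o g of two functions stored as dicts."""
    return {x: f[g[x]] for x in g}

def hom_functor(f):
    """Action of Hom(X, -) on a morphism f : A -> B, i.e. g |-> f o g."""
    return lambda g: tuple(sorted(compose(f, dict(g)).items()))

X = {0, 1}
A, B, C = {"a", "b"}, {"u", "v", "w"}, {"s", "t"}
f = {"a": "u", "b": "w"}                 # f : A -> B
h = {"u": "s", "v": "s", "w": "t"}       # h : B -> C
identity_A = {a: a for a in A}

# T(1_A) = 1_{T(A)}
preserves_identity = all(hom_functor(identity_A)(g) == g for g in hom(X, A))
# T(h o f) = T(h) o T(f)
preserves_composition = all(
    hom_functor(compose(h, f))(g) == hom_functor(h)(hom_functor(f)(g))
    for g in hom(X, A)
)
print(preserves_identity, preserves_composition)
```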
284.4
monic
A morphism f : A → B in a category is called monic if for any object C and any morphisms g1, g2 : C → A, if f ◦ g1 = f ◦ g2 then g1 = g2.
A monic in the category of sets is simply a one-to-one function. Version: 1 Owner: nerdy2 Author(s): nerdy2
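For finite sets, the statement that monics are exactly the one-to-one functions can be tested by brute force. The following is a sketch under the assumption that it suffices to test cancellation against a few small sets C (for sets, a one-point C is already enough); the function names are hypothetical.

```python
from itertools import product

def functions(C, A):
    """All functions C -> A, stored as dicts."""
    C = list(C)
    return [dict(zip(C, values)) for values in product(list(A), repeat=len(C))]

def is_injective(f):
    """A function stored as a dict is one-to-one iff its values are distinct."""
    return len(set(f.values())) == len(f)

def is_monic(f, A, test_objects):
    """Left-cancellability of f : A -> B against all pairs g1, g2 : C -> A."""
    for C in test_objects:
        for g1, g2 in product(functions(C, A), repeat=2):
            same_after_f = all(f[g1[c]] == f[g2[c]] for c in C)
            if same_after_f and g1 != g2:
                return False
    return True

A = [0, 1, 2]
B = ["a", "b", "c"]
test_objects = [[0], [0, 1]]
agree = all(is_monic(f, A, test_objects) == is_injective(f)
            for f in functions(A, B))
print(agree)
```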
284.5
natural equivalence
A natural transformation between functors τ : F → G is called a natural equivalence (or a natural isomorphism) if there is a natural transformation σ : G → F such that τ ◦ σ = idG and σ ◦ τ = idF where idF is the identity natural transformation on F (which for each object A gives the identity map F (A) → F (A)), and composition is deﬁned in the obvious way (for each object compose the morphisms and it’s easy to see that this results in a natural transformation). Version: 2 Owner: mathcam Author(s): mathcam, nerdy2
284.6
representable functor
A contravariant functor T : C → Sets between a category and the category of sets is representable if there is an object X of C such that T is naturally isomorphic to the functor $X^\bullet$ = Hom(−, X). Similarly, a covariant functor T is called representable if it is naturally isomorphic to $X_\bullet$ = Hom(X, −). We say that the object X represents T. X is unique up to canonical isomorphism. A vast number of important objects in mathematics are defined as representing functors. For example, if F : C → D is any functor, then the adjoint G : D → C (if it exists) can be defined as follows. For Y in D, G(Y) is the object of C representing the functor X → Hom(F(X), Y) if G is right adjoint to F or X → Hom(Y, F(X)) if G is left adjoint. Thus, for example, if R is a commutative ring, then N ⊗ M represents the functor L → Hom_R(N, Hom_R(M, L)). Version: 3 Owner: bwebste Author(s): bwebste, nerdy2
284.7
supplemental axioms for an Abelian category
These are axioms introduced by Alexander Grothendieck for an abelian category. The first two are satisfied by definition in an abelian category, and the others may or may not be.
(Ab1) Every morphism has a kernel and a cokernel.
(Ab2) Every monic is the kernel of its cokernel.
(Ab3) Coproducts exist. (Coproducts are also called direct sums.) If this axiom is satisfied the category is often just called cocomplete.
(Ab3*) Products exist. If this axiom is satisfied the category is often just called complete.
(Ab4) Coproducts exist and the coproduct of monics is a monic.
(Ab4*) Products exist and the product of epics is an epic.
(Ab5) Coproducts exist and filtered colimits of exact sequences are exact.
(Ab5*) Products exist and filtered inverse limits of exact sequences are exact.
Grothendieck introduced these in his homological algebra paper in the Tôhoku Mathematical Journal. They can also be found in Weibel's excellent homological algebra book. Version: 5 Owner: nerdy2 Author(s): nerdy2
Chapter 285 18A05 – Deﬁnitions, generalizations
285.1 autofunctor
Let F : C → C be an endofunctor on a category C. If F is a bijection on objects, Ob(C), and morphisms, Mor(C), then it is an autofunctor. In short, an autofunctor is a full and faithful endofunctor F : C → C such that the mapping Ob(C) → Ob(C) which is induced by F is a bijection. An autofunctor F : C → C is an isomorphism of categories from C to itself; it need not be naturally isomorphic to the identity functor id_C. Version: 10 Owner: mathcam Author(s): mathcam, mhale, yark, gorun manolescu
285.2
automorphism
Roughly, an automorphism is a map from a mathematical object onto itself such that:
1. there exists an "inverse" map such that the composition of the two is the identity map of the object, and
2. any relevant structure related to the object in question is preserved.
In category theory, an automorphism of an object A in a category C is a morphism ψ ∈ Mor(A, A) such that there exists another morphism φ ∈ Mor(A, A) with ψ ◦ φ = φ ◦ ψ = id_A. For example, in the category of groups an automorphism is just a bijective (inverse exists and composition gives the identity) group homomorphism (group structure is preserved). Concretely, the map x → −x is an automorphism of the additive group of real numbers. In the category of topological spaces an automorphism would be a bijective, continuous map such that its inverse map is also continuous (not guaranteed as in the group case). Concretely, the map ψ : S^1 → S^1 where ψ(α) = α + θ for some fixed angle θ is an automorphism of the topological space that is the circle.
Version: 4 Owner: benjaminfjones Author(s): benjaminfjones
285.3
category
A category C consists of the following data:
1. a collection ob(C) of objects (of C)
2. for each ordered pair (A, B) of objects of C, a collection (we will assume it is a set) Hom(A, B) of morphisms from the domain A to the codomain B
3. a function ◦ : Hom(A, B) × Hom(B, C) → Hom(A, C) called composition.
We normally denote ◦(f, g) by g ◦ f for morphisms f, g. The above data must satisfy the following axioms: for objects A, B, C, D,
A1: Hom(A, B) ∩ Hom(C, D) = ∅ whenever A ≠ C or B ≠ D
A2: (associativity) if f ∈ Hom(A, B), g ∈ Hom(B, C) and h ∈ Hom(C, D), then h ◦ (g ◦ f) = (h ◦ g) ◦ f
A3: (existence of an identity morphism) for each object A there exists an identity morphism id_A ∈ Hom(A, A) such that for every f ∈ Hom(A, B), f ◦ id_A = f, and id_A ◦ g = g for every g ∈ Hom(B, A).
Some examples of categories:
• 0 is the empty category with no objects or morphisms, 1 is the category with one object and one (identity) morphism.
• If we assume we have a universe U which contains all sets encountered in “everyday” mathematics, Set is the category of all such small sets with morphisms being set functions.
• Top is the category of all small topological spaces with morphisms continuous functions.
• Grp is the category of all small groups whose morphisms are group homomorphisms.
Version: 9 Owner: mathcam Author(s): mathcam, RevBobo
285.4
category example (arrow category)
Let C be a category, and let D be the category whose objects are the arrows of C. A morphism between two morphisms f : A → B and g : A′ → B′ is defined to be a couple of morphisms (h, k), where h ∈ Hom(A, A′) and k ∈ Hom(B, B′), such that the following diagram
$$\begin{array}{ccc} A & \xrightarrow{f} & B \\ \downarrow{\scriptstyle h} & & \downarrow{\scriptstyle k} \\ A' & \xrightarrow{g} & B' \end{array}$$
commutes, that is, k ◦ f = g ◦ h. The resulting category D is called the arrow category of C. Version: 6 Owner: n3o Author(s): n3o
285.5
commutative diagram
Definition 15. Let C be a category. A diagram in C is a directed graph Γ with vertex set V and edge set E (“loops” and “parallel edges” are allowed), together with two maps o : V → Obj(C), m : E → Morph(C) such that if e ∈ E has source s(e) ∈ V and target t(e) ∈ V then m(e) ∈ Hom_C(o(s(e)), o(t(e))). Usually diagrams are denoted by drawing the corresponding graph and labeling its vertices (respectively edges) with their images under o (respectively m); for example, if f : A → B is a morphism, then
$$A \xrightarrow{f} B$$
is a diagram. Often (as in the previous example) the vertices themselves are not drawn since their position can be deduced from the position of their labels.
Definition 16. Let D = (Γ, o, m) be a diagram in the category C and γ = (e_1, . . . , e_n) be a path in Γ. Then the composition along γ is the following morphism of C:
$\circ(\gamma) := m(e_n) \circ \cdots \circ m(e_1)$.
We say that D is commutative or that it commutes if for any two objects in the image of o, say A = o(v_1) and B = o(v_2), and any two paths γ_1 and γ_2 that connect v_1 to v_2 we have
$\circ(\gamma_1) = \circ(\gamma_2)$.
For example the commutativity of the triangle
$$\begin{array}{ccc} A & \xrightarrow{f} & B \\ & {\scriptstyle h}\searrow & \downarrow{\scriptstyle g} \\ & & C \end{array}$$
translates to h = g ◦ f, while the commutativity of the square
$$\begin{array}{ccc} A & \xrightarrow{f} & B \\ \downarrow{\scriptstyle k} & & \downarrow{\scriptstyle g} \\ C & \xrightarrow{h} & D \end{array}$$
translates to g ◦ f = h ◦ k.
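Commutativity of a finite diagram of functions between finite sets can be checked mechanically by composing along every path. The sketch below assumes an acyclic graph (so that the set of paths is finite) and stores each function as a dictionary; the names `compose_along`, `paths` and `commutes` are hypothetical, and the example reuses the square with g ◦ f = h ◦ k.

```python
def compose_along(path, m):
    """Composite m(e_n) o ... o m(e_1) along a path (e_1, ..., e_n) of edge names."""
    def composite(x):
        for e in path:
            x = m[e][x]
        return x
    return composite

def paths(edges, source, target):
    """All paths from source to target in an acyclic directed graph.

    `edges` maps an edge name to its (source, target) pair of vertices.
    """
    if source == target:
        yield ()
    for e, (s, t) in edges.items():
        if s == source:
            for rest in paths(edges, t, target):
                yield (e,) + rest

def commutes(vertices, edges, o, m):
    """Check that any two paths with the same endpoints compose to the same map."""
    for v1 in vertices:
        for v2 in vertices:
            maps = [tuple(compose_along(g, m)(x) for x in o[v1])
                    for g in paths(edges, v1, v2)]
            if len(set(maps)) > 1:
                return False
    return True

vertices = ["A", "B", "C", "D"]
edges = {"f": ("A", "B"), "g": ("B", "D"), "k": ("A", "C"), "h": ("C", "D")}
o = {"A": [0, 1, 2, 3], "B": [0, 1], "C": [0, 1, 2, 3], "D": [0, 1]}
m = {
    "f": {x: x % 2 for x in o["A"]},
    "g": {x: x for x in o["B"]},
    "k": {x: (x + 2) % 4 for x in o["A"]},
    "h": {x: x % 2 for x in o["C"]},
}
print(commutes(vertices, edges, o, m))          # the square commutes
broken = dict(m, h={x: 0 for x in o["C"]})      # replace h by a constant map
print(commutes(vertices, edges, o, broken))     # the square no longer commutes
```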
Version: 3 Owner: Dr Absentius Author(s): Dr Absentius
285.6
double dual embedding
Let V be a vector space over a field K. Recall that $V^*$, the dual space, is defined to be the vector space of all linear forms on V. There is a natural embedding of V into $V^{**}$, the dual of its dual space. In the language of categories, this embedding is a natural transformation between the identity functor and the double dual functor, both endofunctors operating on $\mathcal{V}_K$, the category of vector spaces over K. Turning to the details, let
$I, D : \mathcal{V}_K \to \mathcal{V}_K$
denote the identity and the dual functors, respectively. Recall that for a linear mapping L : U → V (a morphism in $\mathcal{V}_K$), the dual homomorphism $D[L] : V^* \to U^*$ is defined by
$D[L](\alpha) : u \mapsto \alpha(Lu), \quad u \in U, \ \alpha \in V^*$.
The double dual embedding is a natural transformation $\delta : I \to D^2$ that associates to every $V \in \mathcal{V}_K$ a linear homomorphism $\delta_V \in \mathrm{Hom}(V, V^{**})$ described by
$\delta_V(v) : \alpha \mapsto \alpha(v), \quad v \in V, \ \alpha \in V^*$.
To show that this transformation is natural, let L : U → V be a linear mapping. We must show that the following diagram commutes:
$$\begin{array}{ccc} U & \xrightarrow{\delta_U} & U^{**} \\ \downarrow{\scriptstyle L} & & \downarrow{\scriptstyle D^2[L]} \\ V & \xrightarrow{\delta_V} & V^{**} \end{array}$$
Let u ∈ U and $\alpha \in V^*$ be given. Following the arrows down and right we have that
$(\delta_V \circ L)(u) : \alpha \mapsto \alpha(Lu)$.
Following the arrows right, then down we have that
$(D[D[L]] \circ \delta_U)(u) : \alpha \mapsto (\delta_U u)(D[L]\alpha) = (D[L]\alpha)(u) = \alpha(Lu)$,
as desired. Let us also note that for every nonzero v ∈ V, there exists an $\alpha \in V^*$ such that $\alpha(v) \neq 0$. Hence $\delta_V(v) \neq 0$, and hence $\delta_V$ is an embedding, i.e. it is one-to-one. If V is finite dimensional, then $V^*$ has the same dimension as V. Consequently, for finite-dimensional V, the natural embedding $\delta_V$ is, in fact, an isomorphism. Version: 1 Owner: rmilson Author(s): rmilson
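The naturality computation can be mirrored with closures. In the sketch below, which is an illustration and not part of the original entry, vectors are integer tuples, linear forms are Python functions, and the names `linear_map`, `dual`, `delta` and `form` are hypothetical. It evaluates both routes around the naturality square on a few sample vectors and forms.

```python
def linear_map(matrix):
    """The linear map u |-> matrix * u, with the matrix given as a list of rows."""
    return lambda u: tuple(sum(a * x for a, x in zip(row, u)) for row in matrix)

def dual(L):
    """D[L], sending a form alpha to the form u |-> alpha(L(u))."""
    return lambda alpha: (lambda u: alpha(L(u)))

def delta(v):
    """delta_V(v), the evaluation functional alpha |-> alpha(v)."""
    return lambda alpha: alpha(v)

def form(coeffs):
    """The linear form x |-> sum_i coeffs[i] * x[i]."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))

M = [[1, 2, 0],
     [0, -1, 3]]                     # L : K^3 -> K^2
L = linear_map(M)
vectors = [(1, 0, 0), (2, -1, 5), (0, 4, 7)]
forms = [form((1, 0)), form((3, -2)), form((0, 1))]

# Compare (D[D[L]] o delta_U)(u) with (delta_V o L)(u) on every sample form.
natural = all(
    dual(dual(L))(delta(u))(alpha) == delta(L(u))(alpha)
    for u in vectors for alpha in forms
)
print(natural)
```

Both sides unwind to alpha(L(u)), which is the content of the computation in the text; the code only makes the bookkeeping of which map acts on which space explicit.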
285.7
dual category
Let C be a category. The dual category C∗ of C is the category which has the same objects as C, but in which all morphisms are ”reversed”. That is to say if A, B are objects of C and we have a morphism f : A → B, then f ∗ : B → A is a morphism in C∗ . The dual category is sometimes called the opposite category and is denoted Cop . Version: 3 Owner: RevBobo Author(s): RevBobo
285.8
duality principle
Let Σ be any statement of the elementary theory of an abstract category. We form the dual of Σ as follows: 1. Replace each occurrence of ”domain” in Σ with ”codomain” and conversely. 2. Replace each occurrence of g ◦ f = h with f ◦ g = h Informally, these conditions state that the dual of a statement is formed by reversing arrows and compositions. For example, consider the following statements about a category C: • f :A→B • f is monic, i.e. for all morphisms g, h for which composition makes sense, f ◦ g = f ◦ h implies g = h.
The respective dual statements are
• f : B → A
• f is epi, i.e. for all morphisms g, h for which composition makes sense, g ◦ f = h ◦ f implies g = h.
The duality principle asserts that if a statement is a theorem, then the dual statement is also a theorem. We take "theorem" here to mean provable from the axioms of the elementary theory of an abstract category. In practice, for a valid statement about a particular category C, the dual statement is valid in the dual category C∗ (Cop). Version: 3 Owner: RevBobo Author(s): RevBobo
285.9
endofunctor
Given a category C, an endofunctor is a functor T : C → C. Version: 2 Owner: rmilson Author(s): NeuRet, Logan
285.10
examples of initial objects, terminal objects and zero objects
Examples of initial objects, terminal objects and zero objects of categories include:
• The empty set is the unique initial object in the category of sets; every one-element set is a terminal object in this category; there are no zero objects. Similarly, the empty space is the unique initial object in the category of topological spaces; every one-point space is a terminal object in this category.
• In the category of nonempty sets, there are no initial objects. The singletons are not initial: while every nonempty set admits a function from a singleton, this function is in general not unique.
• In the category of pointed sets (whose objects are nonempty sets together with a distinguished point; a morphism from (A, a) to (B, b) is a function f : A → B with f(a) = b) every singleton serves as a zero object. Similarly, in the category of pointed topological spaces, every singleton is a zero object.
• In the category of groups, any trivial group (consisting only of its identity element) is a zero object. The same is true for the category of abelian groups as well as for the category of modules over a fixed ring. This is the origin of the term "zero object".
• In the category of rings with identity, the ring of integers (and any ring isomorphic to it) serves as an initial object. The trivial ring consisting only of a single element 0 = 1 is a terminal object.
• In the category of schemes, the prime spectrum of the integers spec(Z) is a terminal object. The empty scheme (which is the prime spectrum of the trivial ring) is an initial object.
• In the category of fields, there are no initial or terminal objects.
• Any partially ordered set (P, ≤) can be interpreted as a category: the objects are the elements of P, and there is a single morphism from x to y if and only if x ≤ y. This category has an initial object if and only if P has a smallest element; it has a terminal object if and only if P has a largest element. This explains the terminology.
• In the category of graphs, the null graph is an initial object. There are no terminal objects, unless we allow our graphs to have loops (edges starting and ending at the same vertex), in which case the one-point-one-loop graph is terminal.
• Similarly, the category of all small categories with functors as morphisms has the empty category as initial object and the one-object-one-morphism category as terminal object.
• Any topological space X can be viewed as a category $\hat{X}$ by taking the open sets as objects, and a single morphism between two open sets U and V if and only if U ⊂ V. The empty set is the initial object of this category, and X is the terminal object.
• If X is a topological space and C is some small category, we can form the category of all contravariant functors from $\hat{X}$ to C, using natural transformations as morphisms. This category is called the category of presheaves on X with values in C. If C has an initial object c, then the constant functor which sends every open set to c is an initial object in the category of presheaves. Similarly, if C has a terminal object, then the corresponding constant functor serves as a terminal presheaf.
• If we fix a homomorphism f : A → B of abelian groups, we can consider the category C consisting of all pairs (X, φ) where X is an abelian group and φ : X → A is a group homomorphism with fφ = 0. A morphism from the pair (X, φ) to the pair (Y, ψ) is defined to be a group homomorphism r : X → Y with the property ψr = φ:
$$\begin{array}{ccccc} X & \xrightarrow{\varphi} & A & \xrightarrow{f} & B \\ \downarrow{\scriptstyle r} & \nearrow{\scriptstyle \psi} & & & \\ Y & & & & \end{array}$$
The kernel of f is a terminal object in this category; this expresses the universal property of kernels. With an analogous construction, cokernels can be retrieved as initial objects of a suitable category.
• The previous example can be generalized to arbitrary limits of functors: if F : I → C is a functor, we define a new category $\hat{F}$ as follows: its objects are pairs (X, (φ_i)) where X is an object of C and for every object i of I, φ_i : X → F(i) is a morphism in C such that for every morphism ρ : i → j in I, we have F(ρ)φ_i = φ_j. A morphism between pairs (X, (φ_i)) and (Y, (ψ_i)) is defined to be a morphism r : X → Y such that ψ_i r = φ_i for all objects i of I. The universal property of the limit can then be expressed as saying: any terminal object of $\hat{F}$ is a limit of F and vice versa (note that $\hat{F}$ need not contain a terminal object, just like F need not have a limit).
Version: 11 Owner: AxelBoldt Author(s): AxelBoldt
285.11
forgetful functor
Let C and D be categories such that each object c of C can be regarded as an object of D by suitably ignoring structures c may have as a C-object but not as a D-object. A functor U : C → D which operates on objects of C by “forgetting” any imposed mathematical structure is called a forgetful functor. The following are examples of forgetful functors:
1. U : Grp → Set takes groups into their underlying sets and group homomorphisms to set maps.
2. U : Top → Set takes topological spaces into their underlying sets and continuous maps to set maps.
3. U : Ab → Grp takes abelian groups to groups and acts as identity on arrows.
Forgetful functors are often instrumental in studying adjoint functors. Version: 1 Owner: RevBobo Author(s): RevBobo
285.12
isomorphism
A morphism f : A −→ B in a category is an isomorphism if there exists a morphism f −1 : B −→ A which is its inverse. The objects A and B are isomorphic if there is an isomorphism between them. Examples: • In the category of sets and functions, a function f : A −→ B is an isomorphism if and only if it is bijective. • In the category of groups and group homomorphisms (or rings and ring homomorphisms), a homomorphism φ : G −→ H is an isomorphism if it has an inverse map φ−1 : H −→ G which is also a homomorphism. • In the category of vector spaces and linear transformations, a linear transformation is an isomorphism if and only if it is an invertible linear transformation. • In the category of topological spaces and continuous maps, a continuous map is an isomorphism if and only if it is a homeomorphism. Version: 2 Owner: djao Author(s): djao
285.13
natural transformation
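Let $\mathcal{A}$, $\mathcal{B}$ be categories and T, S : $\mathcal{A}$ → $\mathcal{B}$ functors. A natural transformation τ : S → T is a family of morphisms τ = {τ_A : S(A) → T(A)} such that for each object A of $\mathcal{A}$, τ_A : S(A) → T(A) is a morphism of $\mathcal{B}$, and for each morphism f : A → A′ in $\mathcal{A}$ the following diagram commutes:
$$\begin{array}{ccc} S(A) & \xrightarrow{\tau_A} & T(A) \\ \downarrow{\scriptstyle Sf} & & \downarrow{\scriptstyle Tf} \\ S(A') & \xrightarrow{\tau_{A'}} & T(A') \end{array}$$
that is, Tf ◦ τ_A = τ_{A′} ◦ Sf.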
Version: 6 Owner: RevBobo Author(s): RevBobo
285.14
types of homomorphisms
Often in a category of algebraic structures, those structures are generated by certain elements, and subject to certain relations. One often refers to functions between structures which are said to preserve those relations. These functions are typically called homomorphisms. An example is the category of groups. Suppose that f : A → B is a function between two groups. We say that f is a group homomorphism if:
(a) the binary operator is preserved: f(a1 · a2) = f(a1) · f(a2) for all a1, a2 ∈ A;
(b) the identity element is preserved: f(e_A) = e_B;
(c) inverses of elements are preserved: f(a⁻¹) = [f(a)]⁻¹ for all a ∈ A.
One can define similar natural concepts of homomorphisms for other algebraic structures, giving us ring homomorphisms, module homomorphisms, and a host of others. We give special names to homomorphisms when their functions have interesting properties. If a homomorphism is an injective function (i.e. one-to-one), then we say that it is a monomorphism. These are typically monic in their category. If a homomorphism is a surjective function (i.e. onto), then we say that it is an epimorphism. These are typically epic in their category. If a homomorphism is a bijective function (i.e. both one-to-one and onto), then we say that it is an isomorphism. If the domain of a homomorphism is the same as its codomain (e.g. a homomorphism f : A → A), then we say that it is an endomorphism. We often denote the collection of endomorphisms on A as End(A). If a homomorphism is both an endomorphism and an isomorphism, then we say that it is an automorphism. We often denote the collection of automorphisms on A as Aut(A). Version: 4 Owner: antizeus Author(s): antizeus
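As an illustration of conditions (a) to (c) and of the names above, here is a minimal sketch for finite cyclic groups Z_n under addition. The helper names `is_homomorphism` and `classify`, and the choice of cyclic groups, are assumptions made for this example only and do not come from the entry.

```python
def is_homomorphism(f, n, m):
    """Check conditions (a)-(c) for a function f : Z_n -> Z_m (additive groups)."""
    elems = range(n)
    # (a) the binary operation is preserved
    preserves_op = all(
        f((a + b) % n) == (f(a) + f(b)) % m for a in elems for b in elems
    )
    # (b) the identity element is preserved
    preserves_id = f(0) == 0
    # (c) inverses are preserved
    preserves_inv = all(f((-a) % n) == (-f(a)) % m for a in elems)
    return preserves_op and preserves_id and preserves_inv


def classify(f, n, m):
    """Name a homomorphism f : Z_n -> Z_m by its injectivity and surjectivity."""
    image = {f(a) for a in range(n)}
    injective = len(image) == n
    surjective = image == set(range(m))
    if injective and surjective:
        return "isomorphism"
    if injective:
        return "monomorphism"
    if surjective:
        return "epimorphism"
    return "homomorphism"


reduce_mod_6 = lambda x: x % 6          # Z_12 -> Z_6
double = lambda x: (2 * x) % 12         # Z_6 -> Z_12
naive_inclusion = lambda x: x           # Z_6 -> Z_12, not a homomorphism
```

Under these assumptions, `reduce_mod_6` is onto but not one-to-one, `double` is one-to-one but not onto, and `naive_inclusion` fails condition (a) because 5 + 5 is 4 in Z_6 but 10 in Z_12.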
285.15
zero object
An initial object in a category C is an object A in C such that, for every object X in C, there is exactly one morphism A −→ X. A terminal object in a category C is an object B in C such that, for every object X in C, there is exactly one morphism X −→ B. A zero object in a category C is an object 0 that is both an initial object and a terminal object.
All initial objects (respectively, terminal objects, and zero objects), if they exist, are isomorphic in C. Version: 2 Owner: djao Author(s): djao
Chapter 286 18A22 – Special properties of functors (faithful, full, etc.)
286.1 exact functor
A covariant functor F is said to be left exact if whenever $0 \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C$ is an exact sequence, then $0 \to FA \xrightarrow{F\alpha} FB \xrightarrow{F\beta} FC$ is also an exact sequence. A covariant functor F is said to be right exact if whenever $A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to 0$ is an exact sequence, then $FA \xrightarrow{F\alpha} FB \xrightarrow{F\beta} FC \to 0$ is also an exact sequence. A contravariant functor F is said to be left exact if whenever $A \xrightarrow{\alpha} B \xrightarrow{\beta} C \to 0$ is an exact sequence, then $0 \to FC \xrightarrow{F\beta} FB \xrightarrow{F\alpha} FA$ is also an exact sequence.
A contravariant functor F is said to be right exact if whenever $0 \to A \xrightarrow{\alpha} B \xrightarrow{\beta} C$ is an exact sequence, then $FC \xrightarrow{F\beta} FB \xrightarrow{F\alpha} FA \to 0$ is also an exact sequence. A (covariant or contravariant) functor is said to be exact if it is both left exact and right exact. Version: 3 Owner: antizeus Author(s): antizeus
Chapter 287 18A25 – Functor categories, comma categories
287.1 Yoneda embedding
If C is a category, write $\hat{C}$ for the category of contravariant functors from C to Sets, the category of sets. The morphisms in $\hat{C}$ are natural transformations of functors. (To avoid set theoretical concerns, one can take a universe U and take all categories to be U-small.) For any object X of C, there is the functor h_X = Hom(−, X). Then X ↦ h_X is a covariant functor C → $\hat{C}$, which embeds C faithfully as a full subcategory of $\hat{C}$. Version: 4 Owner: nerdy2 Author(s): nerdy2
Chapter 288 18A30 – Limits and colimits (products, sums, directed limits, pushouts, ﬁber products, equalizers, kernels, ends and coends, etc.)
288.1 categorical direct product
Let {C_i}_{i∈I} be a set of objects in a category C. A direct product of the collection {C_i}_{i∈I} is an object $\prod_{i \in I} C_i$ of C, with morphisms $\pi_i : \prod_{j \in I} C_j \to C_i$ for each i ∈ I, such that: For every object A in C, and any collection of morphisms f_i : A → C_i for every i ∈ I, there exists a unique morphism $f : A \to \prod_{i \in I} C_i$ making the following diagram commute for all i ∈ I:
\[ A \xrightarrow{\;f\;} \prod_{j \in I} C_j \xrightarrow{\;\pi_i\;} C_i, \qquad \pi_i \circ f = f_i. \]
Version: 4 Owner: djao Author(s): djao
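In the category of sets, the cartesian product with its coordinate projections is a direct product in this sense. The following minimal sketch illustrates the universal property for two factors; the names `pairing`, `pi1`, `pi2` and the sample maps `f1`, `f2` are assumptions chosen for illustration.

```python
def pi1(pair):
    """Projection onto the first factor."""
    return pair[0]


def pi2(pair):
    """Projection onto the second factor."""
    return pair[1]


def pairing(f1, f2):
    """The map f : A -> C1 x C2 determined by pi1 . f = f1 and pi2 . f = f2."""
    return lambda a: (f1(a), f2(a))


# Sample data: A = {0, ..., 4}, f1 : A -> {0, 1}, f2 : A -> integers.
A = range(5)
f1 = lambda a: a % 2
f2 = lambda a: a * a
f = pairing(f1, f2)

# The diagram commutes for both projections.
commutes = all(pi1(f(a)) == f1(a) and pi2(f(a)) == f2(a) for a in A)
```

Uniqueness holds because a pair is determined by its two coordinates, so any map satisfying both equations must agree with `pairing(f1, f2)` pointwise.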
288.2
categorical direct sum
Let {C_i}_{i∈I} be a set of objects in a category C. A direct sum of the collection {C_i}_{i∈I} is an object $\coprod_{i \in I} C_i$ of C, with morphisms $\iota_i : C_i \to \coprod_{j \in I} C_j$ for each i ∈ I, such that: For every object A in C, and any collection of morphisms f_i : C_i → A for every i ∈ I, there exists a unique morphism $f : \coprod_{i \in I} C_i \to A$ making the following diagram commute for all i ∈ I:
\[ C_i \xrightarrow{\;\iota_i\;} \coprod_{j \in I} C_j \xrightarrow{\;f\;} A, \qquad f \circ \iota_i = f_i. \]
Version: 4 Owner: djao Author(s): djao
288.3
kernel
Let f : X → Y be a function and let Y have some sort of zero, neutral or null element, which we denote by e. (Examples are groups, vector spaces, modules, etc.) The kernel of f is the set ker f = {x ∈ X : f(x) = e}, that is, the set of elements of X whose image is e. This set can also be denoted by f⁻¹(e) (this does not mean that f has an inverse function; it is just notation), which is read as “the kernel is the preimage of the neutral element”. Let us see an example. Let X = Z and Y = Z_6, and let f be the function that sends each integer n to its residue class modulo 6. So f(4) = 4, f(20) = 2, f(−5) = 1. The kernel of f consists precisely of the multiples of 6 (since they have residue 0, we have f(6k) = 0). This is also an example of the kernel of a group homomorphism, and since the sets are also rings, the function f is also a homomorphism between rings and the kernel is also the kernel of a ring homomorphism. Usually we are interested in sets with a certain algebraic structure. In particular, the following theorem holds for homomorphisms between pairs of vector spaces, groups, rings and fields (and some other algebraic structures): A map f : X → Y is injective if and only if ker f = {0} (the zero of X). Version: 4 Owner: drini Author(s): drini
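The reduction-modulo-6 example can be checked numerically. This is only a sketch: the function name `f` follows the entry, while the finite window of integers is an assumption, since the full kernel is infinite.

```python
def f(n):
    """Send an integer to its residue class modulo 6 (representative in 0..5)."""
    return n % 6  # Python's % returns a non-negative residue for a positive modulus


# The part of ker f that lies in the window -12..12.
kernel_window = [n for n in range(-12, 13) if f(n) == 0]

# f is additive on the window, as expected of a group homomorphism Z -> Z_6.
additive = all(
    f(a + b) == (f(a) + f(b)) % 6
    for a in range(-12, 13)
    for b in range(-12, 13)
)
```

Since the kernel contains nonzero elements such as 6, the map is not injective, in agreement with the theorem stated above.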
Chapter 289 18A40 – Adjoint functors (universal constructions, reﬂective subcategories, Kan extensions, etc.)
289.1 adjoint functor
Let C, D be categories and T : C → D, S : D → C be covariant functors. T is said to be a left adjoint functor to S (equivalently, S is a right adjoint functor to T) if there exists ν = ν_{C,D} such that
\[ \nu : \mathrm{Hom}_D(T(C), D) \cong \mathrm{Hom}_C(C, S(D)) \]
is a natural bijection of hom-sets for all objects C of C and D of D. An adjoint to any functor is unique up to natural isomorphism. Examples:
1. Let U : Top → Set be the forgetful functor (i.e. U takes topological spaces to their underlying sets, and continuous maps to set functions). Then U is right adjoint to the functor F : Set → Top which gives each set the discrete topology.
2. If U : Grp → Set is again the forgetful functor, this time on the category of groups, the functor F : Set → Grp which takes a set A to the free group generated by A is left adjoint to U.
3. If U_N : R-mod → R-mod is the functor M ↦ N ⊗ M for an R-module N, then U_N is left adjoint to the functor F_N : R-mod → R-mod given by L ↦ Hom_R(N, L).
Version: 8 Owner: bwebste Author(s): bwebste, RevBobo
289.2
equivalence of categories
Let C and D be two categories with functors F : C → D and G : D → C. The functors F and G are an equivalence of categories if there are natural isomorphisms FG ≅ id_D and GF ≅ id_C. Note, F is left adjoint to G, and G is right adjoint to F, as
\[ \mathrm{Hom}_D(F(c), d) \xrightarrow{\;G\;} \mathrm{Hom}_C(GF(c), G(d)) \longleftrightarrow \mathrm{Hom}_C(c, G(d)). \]
And, F is right adjoint to G, and G is left adjoint to F, as
\[ \mathrm{Hom}_C(G(d), c) \xrightarrow{\;F\;} \mathrm{Hom}_D(FG(d), F(c)) \longleftrightarrow \mathrm{Hom}_D(d, F(c)). \]
In practical terms, two categories are equivalent if there is a fully faithful functor F : C → D, such that every object d ∈ D is isomorphic to an object F (c), for some c ∈ C. Version: 2 Owner: mhale Author(s): mhale
Chapter 290 18B40 – Groupoids, semigroupoids, semigroups, groups (viewed as categories)
290.1 groupoid (category theoretic)
A groupoid, also known as a virtual group, is a small category where every morphism is invertible. There is also a group-theoretic concept with the same name. Version: 6 Owner: akrowne Author(s): akrowne
Chapter 291 18E10 – Exact categories, abelian categories
291.1 abelian category
An abelian category is a category A satisfying the following axioms. Because the later axioms rely on terms whose definitions involve the earlier axioms, we will intersperse the statements of the axioms with such auxiliary definitions as needed.
Axiom 1. For any two objects A, B in A, the set of morphisms Hom(A, B) is an abelian group. The identity element in the group Hom(·, ·) will be denoted by 0, and the group operation by +.
Axiom 2. Composition of morphisms distributes over addition in Hom(·, ·). That is, given any diagram of morphisms
\[ A \xrightarrow{\;f\;} B \underset{g_2}{\overset{g_1}{\rightrightarrows}} C \xrightarrow{\;h\;} D \]
we have (g1 + g2)f = g1 f + g2 f and h(g1 + g2) = hg1 + hg2.
Axiom 3. A has a zero object.
Axiom 4. For any two objects A, B in A, the categorical direct product A × B exists in A.
Given a morphism f : A → B in A, a kernel of f is a morphism i : X → A such that:
• f i = 0.
• For any other morphism j : X′ → A such that f j = 0, there exists a unique morphism j′ : X′ → X such that the diagram
\[ X' \xrightarrow{\;j'\;} X \xrightarrow{\;i\;} A \xrightarrow{\;f\;} B, \qquad i \circ j' = j, \]
commutes.
Likewise, a cokernel of f is a morphism p : B → Y such that:
• pf = 0.
• For any other morphism j : B → Y′ such that jf = 0, there exists a unique morphism j′ : Y → Y′ such that the diagram
\[ A \xrightarrow{\;f\;} B \xrightarrow{\;p\;} Y \xrightarrow{\;j'\;} Y', \qquad j' \circ p = j, \]
commutes.
Axiom 5. Every morphism in A has a kernel and a cokernel. The kernel and cokernel of a morphism f in A will be denoted ker(f) and cok(f), respectively.
A morphism f : A → B in A is called a monomorphism if, for every morphism g : X → A such that f g = 0, we have g = 0. Similarly, the morphism f is called an epimorphism if, for every morphism h : B → Y such that hf = 0, we have h = 0.
Axiom 6. ker(cok(f)) = f for every monomorphism f in A.
Axiom 7. cok(ker(f)) = f for every epimorphism f in A. Version: 6 Owner: djao Author(s): djao
291.2
exact sequence
Let A be an abelian category. We begin with a preliminary definition.
Definition 1. For any morphism f : A → B in A, let m : X → B be the morphism equal to ker(cok(f)). Then the object X is called the image of f, and denoted Im(f). The morphism m is called the image morphism of f, and denoted i(f).
Note that Im(f) is not the same as i(f): the former is an object of A, while the latter is a morphism of A. We note that f factors through i(f):
\[ A \xrightarrow{\;e\;} \mathrm{Im}(f) \xrightarrow{\;i(f)\;} B, \qquad f = i(f) \circ e. \]
The proof is as follows: by definition of cokernel, cok(f)f = 0; therefore by definition of kernel, the morphism f factors through ker(cok(f)) = i(f), and this factor is the morphism e above. Furthermore m is a monomorphism and e is an epimorphism, although we do not prove these facts.
Definition 2. A sequence
\[ \cdots \to A \xrightarrow{\;f\;} B \xrightarrow{\;g\;} C \to \cdots \]
of morphisms in A is exact at B if ker(g) = i(f). Version: 3 Owner: djao Author(s): djao
291.3
derived category
Let A be an abelian category, and let K(A) be the category of chain complexes in A, with morphisms chain homotopy classes of maps. Call a morphism of chain complexes a quasi-isomorphism if it induces an isomorphism on homology groups of the complexes. For example, any chain homotopy equivalence is a quasi-isomorphism, but not conversely. Now let the derived category D(A) be the category obtained from K(A) by adding a formal inverse to every quasi-isomorphism (technically this is called a localization of the category). Derived categories seem somewhat obscure, but in fact, many mathematicians believe they are the appropriate place to do homological algebra. One of their great advantages is that the important functors of homological algebra which are left or right exact (Hom, N ⊗_k −, where N is a fixed k-module, the global section functor Γ, etc.) become exact on the level of derived categories (with an appropriately modified definition of exact). See Methods of Homological Algebra, by Gelfand and Manin, for more details. Version: 2 Owner: bwebste Author(s): bwebste
291.4
enough injectives
An abelian category is said to have enough injectives if for every object X, there is a monomorphism 0 → X → I where I is an injective object. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 292 18F20 – Presheaves and sheaves
292.1 locally ringed space
292.1.1 Definitions
A locally ringed space is a topological space X together with a sheaf of rings O_X with the property that, for every point p ∈ X, the stalk (O_X)_p is a local ring (see footnote 1). A morphism of locally ringed spaces from (X, O_X) to (Y, O_Y) is a continuous map f : X → Y together with a morphism of sheaves φ : O_Y → O_X with respect to f such that, for every point p ∈ X, the induced ring homomorphism on stalks φ_p : (O_Y)_{f(p)} → (O_X)_p is a local homomorphism. That is, φ_p(y) ∈ m_p for every y ∈ m_{f(p)}, where m_p (respectively, m_{f(p)}) is the maximal ideal of the ring (O_X)_p (respectively, (O_Y)_{f(p)}).
292.1.2
Applications
Locally ringed spaces are encountered in many natural contexts. Basically, every sheaf on the topological space X consisting of continuous functions with values in some field is a locally ringed space. Indeed, any such function which is not zero at a point p ∈ X is nonzero and thus invertible in some neighborhood of p, which implies that the only maximal ideal of the stalk at p is the set of germs of functions which vanish at p.
The utility of this definition lies in the fact that one can then form constructions in familiar instances of locally ringed spaces which readily generalize in ways that would not necessarily be obvious without this framework. For example, given a manifold X and its locally ringed space D_X of real-valued differentiable functions, one can show that the space of all tangent vectors to X at p is naturally isomorphic to the real vector space (m_p/m_p²)*, where the * indicates the dual vector space. We then see that, in general, for any locally ringed space X, the space of tangent vectors at p should be defined as the k-vector space (m_p/m_p²)*, where k is the residue field (O_X)_p/m_p and * denotes dual with respect to k as before. It turns out that this definition is the correct definition even in esoteric contexts like algebraic geometry over finite fields, which at first sight lack the differential structure needed for constructions such as tangent vectors.
Another useful application of locally ringed spaces is in the construction of schemes. The forgetful functor assigning to each locally ringed space (X, O_X) the ring O_X(X) is adjoint to the “prime spectrum” functor taking each ring R to its prime spectrum Spec(R), and this correspondence is essentially why the category of locally ringed spaces is the proper building block to use in the formulation of the notion of scheme.
Footnote 1. All rings mentioned in this article are required to be commutative.
Version: 9 Owner: djao Author(s): djao
292.2
presheaf
For a topological space X, a presheaf F with values in a category C associates to each open set U ⊂ X an object F(U) of C, and to each inclusion U ⊂ V a morphism of C, ρ_{UV} : F(V) → F(U), the restriction morphism. It is required that ρ_{UU} = 1_{F(U)} and ρ_{UW} = ρ_{UV} ∘ ρ_{VW} for any U ⊂ V ⊂ W. A presheaf with values in the category of sets (or abelian groups) is called a presheaf of sets (or abelian groups). If no target category is specified, either the category of sets or abelian groups is most likely understood. A more categorical way to state it is as follows. For X form the category Top(X) whose objects are open sets of X and whose morphisms are the inclusions. Then a presheaf is merely a contravariant functor Top(X) → C. Version: 2 Owner: nerdy2 Author(s): nerdy2
292.3 sheaf
292.3.1 Presheaves
Let X be a topological space and let A be a category. A presheaf on X with values in A is a contravariant functor F from the category of open sets in X and inclusion morphisms to the category A. As this definition may be less than helpful to many readers, we offer the following equivalent (but longer) definition. A presheaf F on X consists of the following data:
1. An object F(U) in A, for each open set U ⊂ X
2. A morphism res_{V,U} : F(V) → F(U) for each pair of open sets U ⊂ V in X (called the restriction morphism), such that:
(a) For every open set U ⊂ X, the morphism res_{U,U} is the identity morphism.
(b) For any open sets U ⊂ V ⊂ W in X, the diagram
\[ F(W) \xrightarrow{\;\mathrm{res}_{W,V}\;} F(V) \xrightarrow{\;\mathrm{res}_{V,U}\;} F(U), \qquad \mathrm{res}_{V,U} \circ \mathrm{res}_{W,V} = \mathrm{res}_{W,U}, \]
commutes.
If the object F(U) of A is a set, its elements are called sections of F over U.
292.3.2
Morphisms of Presheaves
Let f : X → Y be a continuous map of topological spaces. Suppose F_X is a presheaf on X, and G_Y is a presheaf on Y (with F_X and G_Y both having values in A). We define a morphism of presheaves φ from G_Y to F_X, relative to f, to be a collection of morphisms φ_U : G_Y(U) → F_X(f⁻¹(U)) in A, one for every open set U ⊂ Y, such that the square with horizontal arrows φ_V, φ_U and vertical restriction arrows
\[ \varphi_U \circ \mathrm{res}_{V,U} = \mathrm{res}_{f^{-1}(V),\, f^{-1}(U)} \circ \varphi_V \quad \text{as morphisms } G_Y(V) \to F_X(f^{-1}(U)) \]
commutes, for each pair of open sets U ⊂ V in Y. In the special case that f is the identity map id : X → X, we omit mention of the map f, and speak of φ as simply a morphism of presheaves on X. Form the category whose objects are presheaves on X and whose morphisms are morphisms of presheaves on X. Then an isomorphism of presheaves φ on X is a morphism of presheaves on X which is an isomorphism in this category; that is, there exists a morphism φ⁻¹ whose composition with φ both ways is the identity morphism. More generally, if f : X → Y is any homeomorphism of topological spaces, a morphism of presheaves φ relative to f is an isomorphism if it admits a two-sided inverse morphism of presheaves φ⁻¹ relative to f⁻¹.
292.3.3
Sheaves
We now assume that the category A is a concrete category. A sheaf is a presheaf F on X, with values in A, such that for every open set U ⊂ X, and every open cover {U_i} of U, the following two conditions hold:
1. Any two elements f1, f2 ∈ F(U) which have identical restrictions to each U_i are equal. That is, if res_{U,U_i} f1 = res_{U,U_i} f2 for every i, then f1 = f2.
2. Any collection of elements f_i ∈ F(U_i) that have common restrictions can be realized as the collective restrictions of a single element of F(U). That is, if res_{U_i, U_i ∩ U_j} f_i = res_{U_j, U_i ∩ U_j} f_j for every i and j, then there exists an element f ∈ F(U) such that res_{U,U_i} f = f_i for all i.
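The two conditions can be made concrete for the presheaf of arbitrary functions on a finite set, where sections are Python dictionaries and restriction is restriction of the domain. This is a minimal sketch under that assumption; the helper names `restrict` and `glue` and the sample cover are hypothetical and serve only to illustrate condition 2.

```python
def restrict(section, subset):
    """Restriction morphism: keep only the values at points of `subset`."""
    return {x: section[x] for x in subset}


def glue(cover, sections):
    """Glue sections f_i on the sets U_i of a cover.

    Returns the glued section on the union of the cover, or None when two
    sections disagree on an overlap U_i & U_j.
    """
    for Ui, fi in zip(cover, sections):
        for Uj, fj in zip(cover, sections):
            overlap = Ui & Uj
            if restrict(fi, overlap) != restrict(fj, overlap):
                return None
    glued = {}
    for fi in sections:
        glued.update(fi)  # safe: the sections agree wherever they overlap
    return glued


cover = [frozenset({1, 2}), frozenset({2, 3})]
compatible = [{1: "a", 2: "b"}, {2: "b", 3: "c"}]
incompatible = [{1: "a", 2: "b"}, {2: "x", 3: "c"}]

glued = glue(cover, compatible)
failed = glue(cover, incompatible)
```

Condition 1 holds automatically in this setting, because a function on the union is determined by its values on the pieces of the cover.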
292.3.4
Sheaves in abelian categories
If A is a concrete abelian category, then a presheaf F is a sheaf if and only if for every open subset U of X, the sequence
\[ 0 \to F(U) \xrightarrow{\;\mathrm{incl}\;} \prod_i F(U_i) \xrightarrow{\;\mathrm{diff}\;} \prod_{i,j} F(U_i \cap U_j) \tag{292.3.1} \]
is an exact sequence of morphisms in A for every open cover {U_i} of U in X. This diagram requires some explanation, because we owe the reader a definition of the morphisms incl and diff. We start with incl (short for “inclusion”). The restriction morphisms F(U) → F(U_i) induce a morphism
\[ F(U) \to \prod_i F(U_i) \]
to the categorical direct product $\prod_i F(U_i)$, which we define to be incl. The map diff (called “difference”) is defined as follows. For each U_i, form the morphism
\[ \alpha_i : F(U_i) \to \prod_j F(U_i \cap U_j). \]
By the universal properties of categorical direct product, there exists a unique morphism
\[ \alpha : \prod_i F(U_i) \to \prod_i \prod_j F(U_i \cap U_j) \]
such that π_i α = α_i π_i for all i, where π_i is projection onto the ith factor. In a similar manner, form the morphism
\[ \beta : \prod_j F(U_j) \to \prod_j \prod_i F(U_i \cap U_j). \]
Then α and β are both elements of the set
\[ \mathrm{Hom}\Big( \prod_i F(U_i),\; \prod_{i,j} F(U_i \cap U_j) \Big), \]
which is an abelian group since A is an abelian category. Take the difference α − β in this group, and define this morphism to be diff. Note that exactness of the sequence (292.3.1) is an element-free condition, and therefore makes sense for any abelian category A, even if A is not concrete. Accordingly, for any abelian category A, we define a sheaf to be a presheaf F for which the sequence (292.3.1) is always exact.
292.3.5
Examples
We now give some examples of sheaves and presheaves, beginning with some of the standard ones.
Example 9. If F is a presheaf on X, and U ⊂ X is an open subset, then one can define a presheaf F|_U on U by restricting the functor F to the subcategory of open sets of X in U and inclusion morphisms. In other words, for open subsets of U, define F|_U to be exactly what F was, and ignore open subsets of X that are not open subsets of U. The resulting presheaf is called, for obvious reasons, the restriction presheaf of F to U, or the restriction sheaf if F was a sheaf to begin with.
Example 10. For any topological space X, let c_X be the presheaf on X, with values in the category of rings, given by
• c_X(U) := the ring of continuous real-valued functions U → R,
• res_{V,U} f := the restriction of f to U, for every element f : V → R of c_X(V) and every subset U of V.
Then c_X is actually a sheaf of rings, because continuous functions are uniquely specified by their values on an open cover. The sheaf c_X is called the sheaf of continuous real-valued functions on X.
Example 11. Let X be a smooth differentiable manifold. Let D_X be the presheaf on X, with values in the category of real vector spaces, defined by setting D_X(U) to be the space of smooth real-valued functions on U, for each open set U, and with the restriction morphism given by restriction of functions as before. Then D_X is a sheaf as well, called the sheaf of smooth real-valued functions on X. Much more surprising is that the construct D_X can actually be used to define the concept of smooth manifold! That is, one can define a smooth manifold to be a locally Euclidean n-dimensional second countable topological space X, together with a sheaf F, such that there exists an open cover {U_i} of X where:
For every i, there exists a homeomorphism f_i : U_i → R^n and an isomorphism of sheaves φ_i : D_{R^n} → F|_{U_i} relative to f_i.
The idea here is that not only does every smooth manifold X have a sheaf D_X of smooth functions, but specifying this sheaf of smooth functions is sufficient to fully describe the smooth manifold structure on X. While this phenomenon may seem little more than a toy curiosity for differential geometry, it arises in full force in the field of algebraic geometry where the coordinate functions are often unwieldy and algebraic structures in many cases can only be satisfactorily described by way of sheaves and schemes.
Example 12. Similarly, for a complex analytic manifold X, one can form the sheaf H_X of holomorphic functions by setting H_X(U) equal to the complex vector space of C-valued holomorphic functions on U, with the restriction morphism being restriction of functions as before.
Example 13. The algebraic geometry analogue of the sheaf D_X of differential geometry is the prime spectrum Spec(R) of a commutative ring R. However, the construction of the sheaf Spec(R) is beyond the scope of this discussion and merits a separate article.
Example 14. For an example of a presheaf that is not a sheaf, consider the presheaf F on X, with values in the category of real vector spaces, whose sections on U are locally constant real-valued functions on U modulo constant functions on U. Then every section f ∈ F(U) is locally zero in some fine enough open cover {U_i} (it is enough to take a cover where each U_i is connected), whereas f may be nonzero if U is not connected.
We conclude with some interesting examples of morphisms of sheaves, chosen to illustrate the unifying power of the language of schemes across various diverse branches of mathematics.
1. For any continuous function f : X → Y, the map φ_U : c_Y(U) → c_X(f⁻¹(U)) given by φ_U(g) := gf defines a morphism of sheaves from c_Y to c_X with respect to f.
2. For any continuous function f : X → Y of smooth differentiable manifolds, the map given by φ_U(g) := gf has the property
g ∈ D_Y(U) ⟹ φ_U(g) ∈ D_X(f⁻¹(U))
if and only if f is a smooth function.
3. For any continuous function f : X → Y of complex analytic manifolds, the map given by φ_U(g) := gf has the property
g ∈ H_Y(U) ⟹ φ_U(g) ∈ H_X(f⁻¹(U))
if and only if f is a holomorphic function.
4. For any Zariski continuous function f : X → Y of algebraic varieties over a field k, the map given by φ_U(g) := gf has the property
g ∈ O_Y(U) ⟹ φ_U(g) ∈ O_X(f⁻¹(U))
if and only if f is a regular function. Here O_X denotes the sheaf of k-valued regular functions on the algebraic variety X.
REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer-Verlag, 1999 (LNM 1358).
2. Charles Weibel, An Introduction to Homological Algebra, Cambridge University Press, 1994.
Version: 9 Owner: djao Author(s): djao
292.4
sheaﬁﬁcation
Let F be a presheaf over a topological space X with values in a category A for which sheaves are defined. The sheafification of F, if it exists, is a sheaf F′ over X together with a morphism θ : F → F′ satisfying the following universal property: For any sheaf G over X and any morphism of presheaves φ : F → G over X, there exists a unique morphism of sheaves ψ : F′ → G such that the diagram
\[ F \xrightarrow{\;\theta\;} F' \xrightarrow{\;\psi\;} G, \qquad \psi \circ \theta = \varphi, \]
commutes. In light of the universal property, the sheafification of F is uniquely defined up to canonical isomorphism whenever it exists. In the case where A is a concrete category (one consisting of sets and set functions), the sheafification of any presheaf F can be constructed by taking F′(U), for all open sets U ⊂ X, to be the set of all functions $s : U \to \bigcup_{p \in U} F_p$ such that
1. s(p) ∈ F_p for all p ∈ U
2. For all p ∈ U, there is a neighborhood V ⊂ U of p and a section t ∈ F(V) such that, for all q ∈ V, the induced element t_q ∈ F_q equals s(q).
Here F_p denotes the stalk of the presheaf F at the point p. The following quote, taken from [1], is perhaps the best explanation of sheafification to be found anywhere: F′ is “the best possible sheaf you can get from F”. It is easy to imagine how to get it: first identify things which have the same restrictions, and then add in all the things which can be patched together.
REFERENCES
1. David Mumford, The Red Book of Varieties and Schemes, Second Expanded Edition, Springer-Verlag, 1999 (LNM 1358).
Version: 4 Owner: djao Author(s): djao
292.5
stalk
Let F be a presheaf over a topological space X with values in an abelian category A, and suppose direct limits exist in A. For any point p ∈ X, the stalk F_p of F at p is defined to be the object in A which is the direct limit of the objects F(U) over the directed set of all open sets U ⊂ X containing p, with respect to the restriction morphisms of F. In other words,
\[ F_p := \varinjlim_{U \ni p} F(U). \]
If A is a category consisting of sets, the stalk F_p can be viewed as the set of all germs of sections of F at the point p. That is, the set F_p consists of all the equivalence classes of ordered pairs (U, s) where p ∈ U and s ∈ F(U), under the equivalence relation (U, s) ∼ (V, t) if there exists a neighborhood W ⊂ U ∩ V of p such that res_{U,W} s = res_{V,W} t. By universal properties of direct limit, a morphism φ : F → G of presheaves over X induces a morphism φ_p : F_p → G_p on each stalk F_p of F. Stalks are most useful in the context of sheaves, since they encapsulate all of the local data of the sheaf at the point p (recall that sheaves are basically defined as presheaves which have the property of being completely characterized by their local behavior). Indeed, in many of the standard examples of sheaves that take values in rings (such as the sheaf D_X of smooth functions, or the sheaf O_X of regular functions), the ring F_p is a local ring, and much of geometry is devoted to the study of sheaves whose stalks are local rings (so-called “locally ringed spaces”). We mention here a few illustrations of how stalks accurately reflect the local behavior of a sheaf; all of these are drawn from [1].
• A morphism of sheaves φ : F → G over X is an isomorphism if and only if the induced morphism φ_p is an isomorphism on each stalk.
• A sequence F → G → H of morphisms of sheaves over X is an exact sequence at G if and only if the induced sequence F_p → G_p → H_p is exact at each stalk G_p.
• The sheafification F′ of a presheaf F has stalk equal to F_p at every point p.
REFERENCES
1. Robin Hartshorne, Algebraic Geometry, Springer–Verlag New York Inc., 1977 (GTM 52).
Version: 4 Owner: djao Author(s): djao
Chapter 293 18F30 – Grothendieck groups
293.1 Grothendieck group
Let S be an abelian semigroup. The Grothendieck group of S is K(S) = (S × S)/∼, where ∼ is the equivalence relation: (s, t) ∼ (u, v) if there exists r ∈ S such that s + v + r = t + u + r. This is indeed an abelian group with zero element (s, s) (any s ∈ S) and inverse −(s, t) = (t, s). The Grothendieck group construction is a functor from the category of abelian semigroups to the category of abelian groups. A morphism f : S → T induces a morphism K(f) : K(S) → K(T).
Example 15. K(N) = Z.
Example 16. Let G be an abelian group, then K(G) ≅ G via (g, h) ↔ g − h.
Let C be a symmetric monoidal category. Its Grothendieck group is K([C]), i.e. the Grothendieck group of the isomorphism classes of objects of C. Version: 2 Owner: mhale Author(s): mhale
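Example 15 can be traced through by hand for S = N. The sketch below represents a class of K(N) by any pair of natural numbers; the function names are assumptions for illustration, and the simplification of the equivalence test to r = 0 relies on the fact that N is cancellative.

```python
def equivalent(p, q):
    """(s, t) ~ (u, v) iff s + v + r == t + u + r for some r in N.

    In N we may cancel r, so it is enough to test r = 0.
    """
    (s, t), (u, v) = p, q
    return s + v == t + u


def add(p, q):
    """Componentwise addition of representatives."""
    return (p[0] + q[0], p[1] + q[1])


def neg(p):
    """Inverse of the class of (s, t) is the class of (t, s)."""
    return (p[1], p[0])


def to_int(p):
    """The isomorphism K(N) -> Z sending the class of (s, t) to s - t."""
    return p[0] - p[1]


p, q = (3, 5), (4, 1)
```

Here `to_int` is well defined on classes because equivalent pairs have the same difference, and it turns `add` into ordinary integer addition.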
Chapter 294 18G10 – Resolutions; derived functors
294.1 derived functor
There are two objects called derived functors. First, there are classical derived functors. Let A, B be abelian categories, and F : A → B be a covariant left-exact functor. Note that a completely analogous construction can be done for right-exact and contravariant functors, but it is traditional to only describe one case, as doing the other mostly consists of reversing arrows. Given an object A ∈ A, we can construct an injective resolution
\[ A \to A^\bullet : \quad 0 \to A \to I^1 \to I^2 \to \cdots \]
which is unique up to chain homotopy equivalence. Then we apply the functor F to the injectives in the resolution to get a complex
\[ F(A^\bullet) : \quad 0 \to F(I^1) \to F(I^2) \to \cdots \]
(notice that the term involving A has been left out. This is not an accident; in fact, it is crucial). This complex is also independent of the choice of the I's (up to chain homotopy equivalence). Now, we define the classical right derived functors R^i F(A) to be the cohomology groups H^i(F(A^•)). These only depend on A. Important properties of the classical derived functors are these: If the sequence 0 → A → A′ → A″ → 0 is exact, then there is a long exact sequence
\[ 0 \to F(A) \to F(A') \to F(A'') \to R^1F(A) \to \cdots \]
which is natural (a morphism of short exact sequences induces a morphism of long exact sequences). This, along with a couple of other properties, determines the derived functors completely, giving an axiomatic definition, though the construction used above is usually necessary to show existence. From the definition, one can see immediately that the following are equivalent:
1. F is exact.
2. R^n F(A) = 0 for n ≥ 1 and all A ∈ A.
3. R^1 F(A) = 0 for all A ∈ A.
However, R^1 F(A) = 0 for a particular A does not imply that R^n F(A) = 0 for all n ≥ 1.
Important examples are Ext^n, the derived functor of Hom, Tor_n, the derived functor of tensor product, and sheaf cohomology, the derived functor of the global section functor on sheaves. (Coming soon: the derived categories definition) Version: 4 Owner: bwebste Author(s): bwebste
Chapter 295 18G15 – Ext and Tor, generalizations, Künneth formula
295.1 Ext
For a ring R and an R-module A, we have a covariant functor Hom_R(A, −). The functors Ext^n_R(A, −) are defined to be the right derived functors of Hom_R(A, −) (that is, Ext^n_R(A, −) = R^n Hom_R(A, −)). Ext gets its name from the following fact: There is a natural bijection between elements of Ext^1_R(A, B) and extensions of B by A up to isomorphism of short exact sequences, where an extension of B by A is an exact sequence 0 → B → C → A → 0. For example, Ext^1_Z(Z/nZ, Z) ≅ Z/nZ, with 0 corresponding to the trivial extension 0 → Z → Z ⊕ Z/nZ → Z/nZ → 0, and m ≠ 0 corresponding to
\[ 0 \to \mathbb{Z} \xrightarrow{\;n\;} \mathbb{Z} \xrightarrow{\;m\;} \mathbb{Z}/n\mathbb{Z} \to 0. \]
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 296 18G30 – Simplicial sets, simplicial objects (in a category)
296.1 nerve
The nerve of a category C is the simplicial set Hom(i(−), C), where i : Δ → Cat is the fully faithful functor that takes each ordered set [n] in the simplicial category Δ to the preorder n + 1. The nerve is a functor $\mathrm{Cat} \to \mathrm{Set}^{\Delta^{op}}$. Version: 1 Owner: mhale Author(s): mhale
296.2
simplicial category
The simplicial category $\Delta$ is defined as the small category whose objects are the totally ordered finite sets
$$[n] = \{0 < 1 < 2 < \ldots < n\}, \quad n \geq 0, \qquad (296.2.1)$$
and whose morphisms are monotonic nondecreasing (order-preserving) maps. It is generated by two families of morphisms:
$\delta_i^n : [n-1] \to [n]$ is the injection missing $i \in [n]$,
$\sigma_i^n : [n+1] \to [n]$ is the surjection such that $\sigma_i^n(i) = \sigma_i^n(i+1) = i \in [n]$.
Definition 19. The $\delta_i^n$ morphisms are called face maps.
Definition 20. The $\sigma_i^n$ morphisms are called degeneracy maps.
They satisfy the following relations:
$$\delta_j^{n+1}\,\delta_i^n = \delta_i^{n+1}\,\delta_{j-1}^n \quad \text{for } i < j, \qquad (296.2.2)$$
$$\sigma_j^{n-1}\,\sigma_i^n = \sigma_i^{n-1}\,\sigma_{j+1}^n \quad \text{for } i \leq j, \qquad (296.2.3)$$
$$\sigma_j^n\,\delta_i^{n+1} = \begin{cases} \delta_i^n\,\sigma_{j-1}^{n-1} & \text{if } i < j, \\ \mathrm{id}_n & \text{if } i = j \text{ or } i = j+1, \\ \delta_{i-1}^n\,\sigma_j^{n-1} & \text{if } i > j+1. \end{cases} \qquad (296.2.4)$$
All morphisms $[n] \to [0]$ factor through $\sigma_0^0$, so $[0]$ is terminal.
There is a bifunctor $+ : \Delta \times \Delta \to \Delta$ defined by
$$[m] + [n] = [m+n+1], \qquad (296.2.5)$$
$$(f+g)(i) = \begin{cases} f(i) & \text{if } 0 \leq i \leq m, \\ g(i-m-1) + m' + 1 & \text{if } m < i \leq m+n+1, \end{cases} \qquad (296.2.6)$$
where $f : [m] \to [m']$ and $g : [n] \to [n']$.
Sometimes, the simplicial category is defined to include the empty set $[-1] = \emptyset$, which provides an initial object for the category. This makes $\Delta$ a strict monoidal category, as $\emptyset$ is a unit for the bifunctor: $\emptyset + [n] = [n] = [n] + \emptyset$ and $\mathrm{id}_\emptyset + f = f = f + \mathrm{id}_\emptyset$. Further, $\Delta$ is then the free monoidal category on a monoid object (the monoid object being $[0]$, with product $\sigma_0^0 : [0] + [0] \to [0]$).
There is a fully faithful functor from $\Delta$ to Top, which sends each object $[n]$ to an oriented $n$-simplex. The face maps then embed an $(n-1)$-simplex in an $n$-simplex, and the degeneracy maps collapse an $(n+1)$-simplex to an $n$-simplex. The bifunctor forms a simplex from the disjoint union of two simplices by joining their vertices together in a way compatible with their orientations.
There is also a fully faithful functor from $\Delta$ to Cat, which sends each object $[n]$ to a preorder $n + 1$. The preorder $n$ is the category consisting of $n$ partially ordered objects, with one morphism $a \to b$ iff $a \leq b$. Version: 4 Owner: mhale Author(s): mhale
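The relations (296.2.2) to (296.2.4) between face and degeneracy maps can be checked mechanically for small n. The following is a minimal sketch, not part of the original entry: it stores a monotone map as the tuple of its values, and the function names `delta`, `sigma`, `compose` and `check_relations` are illustrative choices.

```python
def delta(n, i):
    """Face map delta_i^n : [n-1] -> [n], the injection missing i.

    A map is stored as the tuple of its values on 0, 1, 2, ...,
    so delta(n, i) has n entries.
    """
    return tuple(k if k < i else k + 1 for k in range(n))


def sigma(n, i):
    """Degeneracy map sigma_i^n : [n+1] -> [n] with sigma(i) = sigma(i+1) = i."""
    return tuple(k if k <= i else k - 1 for k in range(n + 2))


def compose(g, f):
    """Return the composite 'g after f' of maps stored as value tuples."""
    return tuple(g[x] for x in f)


def check_relations(max_n):
    """Return True if (296.2.2)-(296.2.4) hold for all 1 <= n <= max_n."""
    for n in range(1, max_n + 1):
        identity = tuple(range(n + 1))
        # (296.2.2): delta_j^{n+1} delta_i^n = delta_i^{n+1} delta_{j-1}^n for i < j
        for j in range(n + 2):
            for i in range(j):
                if compose(delta(n + 1, j), delta(n, i)) != \
                        compose(delta(n + 1, i), delta(n, j - 1)):
                    return False
        # (296.2.3): sigma_j^{n-1} sigma_i^n = sigma_i^{n-1} sigma_{j+1}^n for i <= j
        for j in range(n):
            for i in range(j + 1):
                if compose(sigma(n - 1, j), sigma(n, i)) != \
                        compose(sigma(n - 1, i), sigma(n, j + 1)):
                    return False
        # (296.2.4): the three cases for sigma_j^n delta_i^{n+1}
        for j in range(n + 1):
            for i in range(n + 2):
                lhs = compose(sigma(n, j), delta(n + 1, i))
                if i < j:
                    rhs = compose(delta(n, i), sigma(n - 1, j - 1))
                elif i in (j, j + 1):
                    rhs = identity
                else:
                    rhs = compose(delta(n, i - 1), sigma(n - 1, j))
                if lhs != rhs:
                    return False
    return True
```

Calling `check_relations(5)` runs through every admissible pair (i, j) for n up to 5. A finite check of this kind is only a sanity test of the index conventions, not a proof of the relations.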
296.3 simplicial object
Definition 21. A simplicial object in a category $C$ is a contravariant functor from the simplicial category $\Delta$ to $C$. Such a functor $X$ is uniquely specified by the morphisms $X(\delta_i^n) : [n] \to [n-1]$ and $X(\sigma_i^n) : [n] \to [n+1]$, which satisfy
$$X(\delta_i^{n-1})\,X(\delta_j^n) = X(\delta_{j-1}^{n-1})\,X(\delta_i^n) \quad \text{for } i < j, \qquad (296.3.1)$$
$$X(\sigma_i^{n+1})\,X(\sigma_j^n) = X(\sigma_{j+1}^{n+1})\,X(\sigma_i^n) \quad \text{for } i \leq j, \qquad (296.3.2)$$
$$X(\delta_i^{n+1})\,X(\sigma_j^n) = \begin{cases} X(\sigma_{j-1}^{n-1})\,X(\delta_i^n) & \text{if } i < j, \\ \mathrm{id}_n & \text{if } i = j \text{ or } i = j+1, \\ X(\sigma_j^{n-1})\,X(\delta_{i-1}^n) & \text{if } i > j+1. \end{cases} \qquad (296.3.3)$$
Definition 22. In particular, a simplicial set is a simplicial object in Set. Equivalently, one could say that a simplicial set is a presheaf on $\Delta$. The object $X([n])$ of a simplicial set is a set of $n$-simplices, and is called the $n$-skeleton. Version: 2 Owner: mhale Author(s): mhale
Chapter 297 18G35 – Chain complexes
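297.1 5-lemma
If $A_i$, $B_i$ for $i = 1, \ldots, 5$ are objects in an abelian category (for example, modules over a ring $R$) such that there is a commutative diagram
$$\begin{array}{ccccccccc}
A_1 & \to & A_2 & \to & A_3 & \to & A_4 & \to & A_5 \\
\downarrow{\scriptstyle\gamma_1} & & \downarrow{\scriptstyle\gamma_2} & & \downarrow{\scriptstyle\gamma_3} & & \downarrow{\scriptstyle\gamma_4} & & \downarrow{\scriptstyle\gamma_5} \\
B_1 & \to & B_2 & \to & B_3 & \to & B_4 & \to & B_5
\end{array}$$
with the rows exact, and $\gamma_1$ is surjective, $\gamma_5$ is injective, and $\gamma_2$ and $\gamma_4$ are isomorphisms, then $\gamma_3$ is an isomorphism as well. Version: 2 Owner: bwebste Author(s): bwebste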
297.2
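9-lemma
If $A_i$, $B_i$, $C_i$, for $i = 1, 2, 3$, are objects of an abelian category such that there is a commutative diagram
$$\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & A_1 & \to & A_2 & \to & A_3 & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & B_1 & \to & B_2 & \to & B_3 & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \to & C_1 & \to & C_2 & \to & C_3 & \to & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}$$
with the columns and bottom two rows exact, then the top row is exact as well. Version: 2 Owner: bwebste Author(s): bwebste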
297.3
Snake lemma
There are two versions of the snake lemma:
(1) Given a commutative diagram as below, with exact rows
$$\begin{array}{ccccccccc}
0 & \to & A_1 & \to & B_1 & \to & C_1 & \to & 0 \\
 & & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\gamma} & & \\
0 & \to & A_2 & \to & B_2 & \to & C_2 & \to & 0
\end{array}$$
there is an exact sequence
$$0 \to \ker\alpha \to \ker\beta \to \ker\gamma \to \operatorname{coker}\alpha \to \operatorname{coker}\beta \to \operatorname{coker}\gamma \to 0$$
where ker denotes the kernel of a map and coker its cokernel.
(2) Applying this result inductively to a short exact sequence of chain complexes, we obtain the following: let $A$, $B$, $C$ be chain complexes, and let $0 \to A \to B \to C \to 0$ be a short exact sequence. Then there is a long exact sequence of homology groups
$$\cdots \to H_n(A) \to H_n(B) \to H_n(C) \to H_{n-1}(A) \to \cdots$$
Version: 5 Owner: bwebste Author(s): bwebste
297.4
chain homotopy
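Let $(A, d)$ and $(A', d')$ be chain complexes and $f : A \to A'$, $g : A \to A'$ be chain maps. A chain homotopy $D$ between $f$ and $g$ is a sequence of homomorphisms $\{D_n : A_n \to A'_{n+1}\}$ so that $d'_{n+1} \circ D_n + D_{n-1} \circ d_n = f_n - g_n$ for each $n$. Thus, we have the following diagram:
[Diagram: top row $A_{n+1} \xrightarrow{d_{n+1}} A_n \xrightarrow{d_n} A_{n-1}$; bottom row $A'_{n+1} \xrightarrow{d'_{n+1}} A'_n \xrightarrow{d'_n} A'_{n-1}$; vertical maps $f_{n+1} - g_{n+1}$, $f_n - g_n$, $f_{n-1} - g_{n-1}$; diagonal maps $D_n : A_n \to A'_{n+1}$ and $D_{n-1} : A_{n-1} \to A'_n$.]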
Version: 4 Owner: RevBobo Author(s): RevBobo
297.5
chain map
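Let $(A, d)$ and $(A', d')$ be chain complexes. A chain map $f : A \to A'$ is a sequence of homomorphisms $\{f_n\}$ such that $d'_n \circ f_n = f_{n-1} \circ d_n$ for each $n$. Diagrammatically, this says that the following diagram commutes:
$$\begin{array}{ccc}
A_n & \xrightarrow{\;d_n\;} & A_{n-1} \\
\downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_{n-1}} \\
A'_n & \xrightarrow{\;d'_n\;} & A'_{n-1}
\end{array}$$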
Version: 3 Owner: RevBobo Author(s): RevBobo
297.6
homology (chain complex)
If $(A, d)$ is a chain complex
$$\cdots \xleftarrow{\;d_{n-1}\;} A_{n-1} \xleftarrow{\;d_n\;} A_n \xleftarrow{\;d_{n+1}\;} A_{n+1} \xleftarrow{\;d_{n+2}\;} \cdots$$
then the $n$th homology group $H_n(A, d)$ (or module) of the chain complex $A$ is the quotient
$$H_n(A, d) = \frac{\ker d_n}{\operatorname{im} d_{n+1}}.$$
Version: 2 Owner: bwebste Author(s): bwebste
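For a chain complex of finite-dimensional rational vector spaces, the quotient ker d_n / im d_{n+1} has dimension dim A_n − rank d_n − rank d_{n+1}. The sketch below is an illustration under that assumption (vector spaces over Q, so torsion is invisible); the helper names `rank`, `betti_numbers` and the example matrix `TRIANGLE_D1` are hypothetical and not from the entry.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (given as a list of rows) over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def betti_numbers(dims, boundaries):
    """Dimensions of H_n for a finite complex of rational vector spaces.

    dims[n] is dim A_n; boundaries[n] is the matrix of d_n : A_n -> A_{n-1}
    (dims[n-1] rows, dims[n] columns). Missing boundaries are taken to be zero.
    """
    ranks = {n: rank(m) for n, m in boundaries.items()}
    return [dim - ranks.get(n, 0) - ranks.get(n + 1, 0)
            for n, dim in enumerate(dims)]


# Hollow triangle: vertices v0, v1, v2 and edges e01, e02, e12,
# with d_1(e_ab) = v_b - v_a. Columns are edges, rows are vertices.
TRIANGLE_D1 = [
    [-1, -1, 0],
    [1, 0, -1],
    [0, 1, 1],
]

print(betti_numbers([3, 3], {1: TRIANGLE_D1}))
```

For the hollow triangle the boundary matrix has rank 2, so both H_0 and H_1 are one-dimensional, as expected for a circle.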
Chapter 298 18G40 – Spectral sequences, hypercohomology
298.1 spectral sequence
A spectral sequence is a collection of $R$-modules (or more generally, objects of an abelian category) $\{E^r_{p,q}\}$ for all $r \in \mathbb{N}$, $p, q \in \mathbb{Z}$, equipped with maps $d^r_{p,q} : E^r_{p,q} \to E^r_{p-r,q+r-1}$ such that $(E^r, d^r)$ is a chain complex, and the $E^{r+1}$'s are its homology, that is,
$$E^{r+1}_{p,q} \cong \ker(d^r_{p,q}) / \operatorname{im}(d^r_{p+r,q-r+1}).$$
(Note: what I have defined above is a homology spectral sequence. Cohomology spectral sequences are identical, except that all the arrows go in the other direction.)
Most interesting spectral sequences are upper right quadrant, meaning that $E^r_{p,q} = 0$ if $p$ or $q < 0$. If this is the case then for any $p, q$, both $d^r_{p,q}$ and $d^r_{p+r,q-r+1}$ are 0 for sufficiently large $r$, since the target or source is out of the upper right quadrant, so that for all $r > r_0$, $E^r_{p,q} = E^{r+1}_{p,q} = \cdots$. This group is called $E^\infty_{p,q}$.
An upper right quadrant spectral sequence $\{E^r_{p,q}\}$ is said to converge to a sequence $F_n$ of $R$-modules if there is an exhaustive filtration $0 = F_{n,0} \subset F_{n,1} \subset \cdots$ of each $F_n$ such that
$$F_{p+q,q+1} / F_{p+q,q} \cong E^\infty_{p,q}.$$
This is typically written $E^r_{p,q} \Rightarrow F_{p+q}$.
Typically spectral sequences are used in the following manner: we find an interpretation of $E^r$ for a small value of $r$, typically 1, and of $E^\infty$, and then in cases where enough groups and differentials are 0, we can obtain information about one from the other. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 299 1900 – General reference works (handbooks, dictionaries, bibliographies, etc.)
299.1 Algebraic K-theory
Algebraic K-theory is a series of functors on the category of rings. It classifies ring invariants, i.e. ring properties that are Morita invariant.
The functor $K_0$
Let $R$ be a ring and denote by $M_\infty(R)$ the algebraic direct limit of matrix algebras $M_n(R)$ under the embeddings $M_n(R) \to M_{n+1}(R) : a \mapsto \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}$. The zeroth K-group of $R$, $K_0(R)$, is the Grothendieck group (abelian group of formal differences) of the unitary equivalence classes of projections in $M_\infty(R)$. The addition of two equivalence classes $[p]$ and $[q]$ is given by the direct summation of the projections $p$ and $q$: $[p] + [q] = [p \oplus q]$.
The functor $K_1$
[To Do: coauthor?]
The functor $K_2$
[To Do: coauthor?]
Higher K-functors
Higher K-groups are defined using the Quillen plus construction,
$$K_n^{\mathrm{alg}}(R) = \pi_n(BGL_\infty(R)^+), \qquad (299.1.1)$$
where $GL_\infty(R)$ is the infinite general linear group over $R$ (defined in a similar way to $M_\infty(R)$), and $BGL_\infty(R)$ is its classifying space. Algebraic K-theory has a product structure,
$$K_i(R) \otimes K_j(S) \to K_{i+j}(R \otimes S). \qquad (299.1.2)$$
Version: 2 Owner: mhale Author(s): mhale
299.2
K-theory
Topological K-theory is a generalised cohomology theory on the category of compact Hausdorff spaces. It classifies the vector bundles over a space $X$ up to stable equivalences. Equivalently, via the Serre-Swan theorem, it classifies the finitely generated projective modules over the $C^*$-algebra $C(X)$.
Let $A$ be a unital $C^*$-algebra over $\mathbb{C}$ and denote by $M_\infty(A)$ the algebraic direct limit of matrix algebras $M_n(A)$ under the embeddings $M_n(A) \to M_{n+1}(A) : a \mapsto \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}$. The $K_0(A)$ group is the Grothendieck group (abelian group of formal differences) of the homotopy classes of the projections in $M_\infty(A)$. Two projections $p$ and $q$ are homotopic if $p = uqu^{-1}$ for some unitary $u \in M_\infty(A)$. Addition of homotopy classes is given by the direct summation of projections: $[p] + [q] = [p \oplus q]$.
Denote by $U_\infty(A)$ the direct limit of unitary groups $U_n(A)$ under the embeddings $U_n(A) \to U_{n+1}(A) : u \mapsto \begin{pmatrix} u & 0 \\ 0 & 1 \end{pmatrix}$. Give $U_\infty(A)$ the direct limit topology, i.e. a subset $U$ of $U_\infty(A)$ is open if and only if $U \cap U_n(A)$ is an open subset of $U_n(A)$, for all $n$. The $K_1(A)$ group is the Grothendieck group (abelian group of formal differences) of the homotopy classes of the unitaries in $U_\infty(A)$. Two unitaries $u$ and $v$ are homotopic if $uv^{-1}$ lies in the identity component of $U_\infty(A)$. Addition of homotopy classes is given by the direct summation of unitaries: $[u] + [v] = [u \oplus v]$. Equivalently, one can work with invertibles in $GL_\infty(A)$ (an invertible $g$ is connected to the unitary $u = g|g|^{-1}$ via the homotopy $t \mapsto g|g|^{-t}$).
Higher K-groups can be defined through repeated suspensions,
$$K_n(A) = K_0(S^n A). \qquad (299.2.1)$$
But the Bott periodicity theorem means that
$$K_1(SA) \cong K_0(A). \qquad (299.2.2)$$
The main properties of $K_i$ are:
$$K_i(A \oplus B) = K_i(A) \oplus K_i(B), \qquad (299.2.3)$$
$$K_i(M_n(A)) = K_i(A) \quad \text{(Morita invariance)}, \qquad (299.2.4)$$
$$K_i(A \otimes \mathcal{K}) = K_i(A) \quad \text{(stability)}, \qquad (299.2.5)$$
$$K_{i+2}(A) = K_i(A) \quad \text{(Bott periodicity)}. \qquad (299.2.6)$$
There are three flavours of topological K-theory to handle the cases of $A$ being complex (over $\mathbb{C}$), real (over $\mathbb{R}$) or Real (with a given real structure):
$$K_i(C(X, \mathbb{C})) = KU^{-i}(X) \quad \text{(complex/unitary)}, \qquad (299.2.7)$$
$$K_i(C(X, \mathbb{R})) = KO^{-i}(X) \quad \text{(real/orthogonal)}, \qquad (299.2.8)$$
$$KR_i(C(X), J) = KR^{-i}(X, J) \quad \text{(Real)}. \qquad (299.2.9)$$
Real K-theory has a Bott period of 8, rather than 2.
REFERENCES
1. N. E. Wegge-Olsen, K-theory and C*-algebras. Oxford Science Publications. Oxford University Press, 1993.
2. B. Blackadar, K-Theory for Operator Algebras. Cambridge University Press, 2nd ed., 1998.
Version: 12 Owner: mhale Author(s): mhale
299.3
examples of algebraic K-theory groups
Ring R     ℤ       ℝ       ℂ
K_0(R)     ℤ       ℤ       ℤ
K_1(R)     ℤ/2     ℝ^×     ℂ^×
K_2(R)     ℤ/2
K_3(R)     ℤ/48
K_4(R)     0
Algebraic K-theory of some common rings (blank entries are not given in the table). Version: 2 Owner: mhale Author(s): mhale
Chapter 300 19K33 – EXT and K-homology
300.1 Fredholm module
Fredholm modules represent abstract elliptic pseudo-differential operators.
Definition 3. An odd Fredholm module $(H, F)$ over a $C^*$-algebra $A$ is given by an involutive representation $\pi$ of $A$ on a Hilbert space $H$, together with an operator $F$ on $H$ such that $F = F^*$, $F^2 = 1$ and $[F, \pi(a)] \in \mathcal{K}(H)$ for all $a \in A$.
Definition 4. An even Fredholm module $(H, F, \Gamma)$ is given by an odd Fredholm module $(H, F)$ together with a $\mathbb{Z}_2$-grading $\Gamma$ on $H$, $\Gamma = \Gamma^*$, $\Gamma^2 = 1$, such that $\Gamma\pi(a) = \pi(a)\Gamma$ and $\Gamma F = -F\Gamma$.
Definition 5. A Fredholm module is called degenerate if $[F, \pi(a)] = 0$ for all $a \in A$. Degenerate Fredholm modules are homotopic to the 0-module.
Example 17 (Fredholm modules over $\mathbb{C}$). An even Fredholm module $(H, F, \Gamma)$ over $\mathbb{C}$ is given by
$$H = \mathbb{C}^k \oplus \mathbb{C}^k \quad \text{with} \quad \pi(a) = \begin{pmatrix} a 1_k & 0 \\ 0 & 0 \end{pmatrix}, \quad F = \begin{pmatrix} 0 & 1_k \\ 1_k & 0 \end{pmatrix}, \quad \Gamma = \begin{pmatrix} 1_k & 0 \\ 0 & -1_k \end{pmatrix}.$$
Version: 3 Owner: mhale Author(s): mhale
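The conditions in the definitions can be verified directly for the finite-dimensional example over C. The sketch below assumes the block matrices π(a) = diag(a·1_k, 0), F with off-diagonal identity blocks, and Γ = diag(1_k, −1_k), and uses a real scalar a so that the adjoint is the transpose; all function names are illustrative. In finite dimensions every operator is compact, so the compactness of [F, π(a)] needs no check.

```python
def block_matrix(k, tl, tr, bl, br):
    """2k x 2k matrix whose four k x k blocks are tl, tr, bl, br times 1_k."""
    coeff = ((tl, tr), (bl, br))
    size = 2 * k
    return [[coeff[i // k][j // k] if i % k == j % k else 0
             for j in range(size)] for i in range(size)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][m] * b[m][j] for m in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def negate(a):
    return [[-x for x in row] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def check_even_fredholm_module(k, a):
    """Evaluate each defining condition for the example, for a real scalar a."""
    identity = block_matrix(k, 1, 0, 0, 1)
    zero = block_matrix(k, 0, 0, 0, 0)
    F = block_matrix(k, 0, 1, 1, 0)
    Gamma = block_matrix(k, 1, 0, 0, -1)
    pi_a = block_matrix(k, a, 0, 0, 0)
    F_pi, pi_F = matmul(F, pi_a), matmul(pi_a, F)
    commutator = [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(F_pi, pi_F)]
    return {
        "F self-adjoint": transpose(F) == F,
        "F squared is 1": matmul(F, F) == identity,
        "Gamma self-adjoint": transpose(Gamma) == Gamma,
        "Gamma squared is 1": matmul(Gamma, Gamma) == identity,
        "Gamma commutes with pi(a)": matmul(Gamma, pi_a) == matmul(pi_a, Gamma),
        "Gamma anticommutes with F": matmul(Gamma, F) == negate(matmul(F, Gamma)),
        "commutator [F, pi(a)] vanishes": commutator == zero,
    }
```

For a nonzero scalar the last entry is False, which shows that this particular module is not degenerate.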
300.2
K-homology
K-homology is a homology theory on the category of compact Hausdorff spaces. It classifies the elliptic pseudo-differential operators acting on the vector bundles over a space. In terms of $C^*$-algebras, it classifies the Fredholm modules over an algebra. The $K^0(A)$ group is the abelian group of homotopy classes of even Fredholm modules over $A$. The $K^1(A)$ group is the abelian group of homotopy classes of odd Fredholm modules over $A$. Addition is given by direct summation of Fredholm modules, and the inverse of $(H, F, \Gamma)$ is $(H, -F, -\Gamma)$. Version: 1 Owner: mhale Author(s): mhale
Chapter 301 19K99 – Miscellaneous
301.1 examples of K-theory groups
A                  K_0(A)          K_1(A)
C                  Z               0
M_n(C)             Z               0
H                  Z               0
K                  Z               0
B                  0               0
B/K                0               Z
C_0((0,1))         0               Z
C_0(R^{2n})        Z               0
C_0(R^{2n+1})      0               Z
C([0,1])           Z               0
C(T^n)             Z^{2^{n-1}}     Z^{2^{n-1}}
C(S^{2n})          Z^2             0
C(S^{2n+1})        Z               Z
C(CP^n)            Z^{n+1}         0
O_n                Z/(n-1)         0
A_θ                Z^2             Z^2
C*(H_3)            Z^3             Z^3
Topological K-theory of some common C*-algebras. Version: 5 Owner: mhale Author(s): mhale
Chapter 302 2000 – General reference works (handbooks, dictionaries, bibliographies, etc.)
302.1 alternating group is a normal subgroup of the symmetric group
Theorem 2. The alternating group $A_n$ is a normal subgroup of the symmetric group $S_n$.
Define the epimorphism $f : S_n \to \mathbb{Z}_2$ (for $n \geq 2$) by $\sigma \mapsto 0$ if $\sigma$ is an even permutation and $\sigma \mapsto 1$ if $\sigma$ is an odd permutation. Then $A_n$ is the kernel of $f$ and so it is a normal subgroup of the domain $S_n$. Furthermore $S_n/A_n \cong \mathbb{Z}_2$ by the first isomorphism theorem. So by Lagrange's theorem $|S_n| = |A_n|\,|S_n/A_n|$. Therefore, $|A_n| = n!/2$. That is, there are $n!/2$ elements in $A_n$. Version: 1 Owner: tensorking Author(s): tensorking
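The argument can be checked by brute force for small n. The sketch below, which is only an illustration and not part of the proof, represents a permutation of {0, ..., n−1} as a tuple, computes the parity map from the inversion count, and tests that its kernel is closed under conjugation and has n!/2 elements. The function names are illustrative.

```python
from itertools import permutations
from math import factorial


def sign_bit(perm):
    """Return 0 for an even permutation and 1 for an odd one (inversion parity)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return inversions % 2


def compose(p, q):
    """Composite permutation 'p after q'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)


def alternating_group(n):
    """Kernel of the parity map on S_n."""
    return {p for p in permutations(range(n)) if sign_bit(p) == 0}


def is_homomorphism(n):
    """Check that parity is additive modulo 2 on all of S_n."""
    group = list(permutations(range(n)))
    return all(sign_bit(compose(p, q)) == (sign_bit(p) + sign_bit(q)) % 2
               for p in group for q in group)


def is_normal(subgroup, group):
    """Check that g h g^{-1} stays in the subgroup for all g and h."""
    return all(compose(compose(g, h), inverse(g)) in subgroup
               for g in group for h in subgroup)
```

For n = 4 this gives 12 even permutations out of 24, matching n!/2.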
302.2
associative
Let $(S, \phi)$ be a set with binary operation $\phi$. $\phi$ is said to be associative over $S$ if
$$\phi(a, \phi(b, c)) = \phi(\phi(a, b), c)$$
for all $a, b, c \in S$. Examples of associative operations are addition and multiplication over the integers (or reals), or addition or multiplication over $n \times n$ matrices. We can construct an operation which is not associative. Let $S$ be the integers, and define $\nu(a, b) = a^2 + b$. Then $\nu(\nu(a, b), c) = \nu(a^2 + b, c) = a^4 + 2ba^2 + b^2 + c$. But $\nu(a, \nu(b, c)) = \nu(a, b^2 + c) = a^2 + b^2 + c$, hence $\nu(\nu(a, b), c) \neq \nu(a, \nu(b, c))$ in general. Note, however, that if we were to take $S = \{0\}$, $\nu$ would be associative over $S$. This illustrates the fact that the set the operation is taken with respect to is very important. Example. We show that the division operation over nonzero reals is nonassociative. All we need is a counterexample: so let us compare $1/(1/2)$ and $(1/1)/2$. The first expression is equal to 2, the second to 1/2, hence division over the nonzero reals is not associative. Version: 6 Owner: akrowne Author(s): akrowne
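Both counterexamples are easy to reproduce numerically. The following sketch tests associativity on a finite sample of elements; the helper `is_associative` is an illustrative name. A False result exhibits a genuine counterexample, while a True result only says that no counterexample exists within the sample.

```python
from fractions import Fraction
from itertools import product
from operator import add, truediv


def is_associative(op, elements):
    """Check op(op(a, b), c) == op(a, op(b, c)) for all triples from elements."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elements, repeat=3))


def nu(a, b):
    """The operation nu(a, b) = a^2 + b from the text."""
    return a * a + b


sample = list(range(-2, 3))
print(is_associative(add, sample))   # addition on a sample of integers
print(is_associative(nu, sample))    # nu on the same sample
print(is_associative(nu, [0]))       # nu restricted to S = {0}
print(is_associative(truediv, [Fraction(1), Fraction(2)]))  # division
```

Using `Fraction` keeps the division exact, so the comparison of 1/(1/2) with (1/1)/2 does not depend on floating-point rounding.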
302.3
canonical projection
Given a group $G$ and a normal subgroup $N \trianglelefteq G$ there is an epimorphism $\pi : G \to G/N$ defined by sending an element $g \in G$ to its coset $gN$. The epimorphism $\pi$ is referred to as the canonical projection. Version: 4 Owner: Dr Absentius Author(s): Dr Absentius
302.4
centralizer
For a given group $G$, the centralizer of an element $a \in G$ is defined to be the set
$$C(a) = \{x \in G \mid xa = ax\}.$$
We note that if $x, y \in C(a)$ then $xy^{-1}a = xay^{-1} = axy^{-1}$, so that $xy^{-1} \in C(a)$. Thus $C(a)$ is a subgroup of $G$ containing at least $\{e, a\}$ (hence nontrivial when $a \neq e$). To illustrate an application of this concept we prove the following lemma.
Lemma. There exists a bijection between the right cosets of $C(a)$ and the conjugates of $a$.
Proof. If $x, y \in G$ are in the same right coset, then $y = cx$ for some $c \in C(a)$. Thus $y^{-1}ay = x^{-1}c^{-1}acx = x^{-1}c^{-1}cax = x^{-1}ax$. Conversely, if $y^{-1}ay = x^{-1}ax$ then $xy^{-1}a = axy^{-1}$ and $xy^{-1} \in C(a)$, giving that $x, y$ are in the same right coset.
Let $[a]$ denote the conjugacy class of $a$. It follows that $|[a]| = [G : C(a)]$ and $|[a]|$ divides $|G|$. We remark that $a \in Z(G) \Leftrightarrow C(a) = G \Leftrightarrow |[a]| = 1$, where $Z(G)$ denotes the center of $G$.
Now let $G$ be a $p$-group, i.e. a finite group of order $p^n$, where $p$ is a prime and $n > 0$. Let $z = |Z(G)|$. Summing over elements in distinct conjugacy classes, we have
$$p^n = \sum |[a]| = z + \sum_{a \notin Z(G)} |[a]|,$$
since the center consists precisely of the conjugacy classes of cardinality 1. But for $a \notin Z(G)$, $|[a]|$ divides $p^n$ and is greater than 1, so $p$ divides $|[a]|$, and therefore $p \mid z$. However, $Z(G)$ is certainly nonempty, so we conclude that every $p$-group has a nontrivial center.
The groups $C(gag^{-1})$ and $C(a)$, for any $g$, are isomorphic. Version: 5 Owner: mathcam Author(s): Larry Hammick, vitriol
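The counting argument can be illustrated on a small p-group. The sketch below builds the dihedral group of order 8 (a 2-group) as permutations of the four vertices of a square, then computes centralizers, conjugacy classes and the center. The choice of group and all function names are assumptions made for illustration.

```python
def compose(p, q):
    """Composite permutation 'p after q'."""
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, x in enumerate(p):
        inv[x] = i
    return tuple(inv)


def generate(generators):
    """Closure of the generators under multiplication (finite groups only)."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def centralizer(group, a):
    return {x for x in group if compose(x, a) == compose(a, x)}


def conjugacy_class(group, a):
    """All conjugates x^{-1} a x of a."""
    return frozenset(compose(compose(inverse(x), a), x) for x in group)


ROTATION = (1, 2, 3, 0)     # i -> i + 1 mod 4
REFLECTION = (0, 3, 2, 1)   # i -> -i mod 4
D4 = generate([ROTATION, REFLECTION])

classes = {conjugacy_class(D4, a) for a in D4}
center = {a for a in D4 if centralizer(D4, a) == D4}
class_sizes = sorted(len(c) for c in classes)

# |[a]| = [G : C(a)] for every element a
orbit_stabilizer_ok = all(
    len(conjugacy_class(D4, a)) * len(centralizer(D4, a)) == len(D4) for a in D4
)
print(len(D4), len(center), class_sizes, orbit_stabilizer_ok)
```

Here the class equation reads 8 = 2 + 2 + 2 + 2, with the first 2 coming from the center, which is divisible by p = 2 as the argument predicts.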
302.5
commutative
Let (S, φ) be a set with binary operation φ. φ is said to be commutative if φ(a, b) = φ(b, a) for all a, b ∈ S. Some operations which are commutative are addition over the integers, multiplication over the integers, addition over n × n matrices, and multiplication over the reals. An example of a noncommutative operation is multiplication over n × n matrices. Version: 3 Owner: akrowne Author(s): akrowne
302.6
examples of groups
Groups are ubiquitous throughout mathematics. Many "naturally occurring" groups are either groups of numbers (typically abelian) or groups of symmetries (typically non-abelian).
Groups of numbers
• The most important group is the group of integers $\mathbb{Z}$ with addition as operation.
• The integers modulo $n$, often denoted by $\mathbb{Z}_n$, form a group under addition. Like $\mathbb{Z}$ itself, this is a cyclic group; any cyclic group is isomorphic to $\mathbb{Z}$ or to one of these.
• The rational (or real, or complex) numbers form a group under addition.
• The positive rationals form a group under multiplication, and so do the nonzero rationals. The same is true for the reals.
• The nonzero complex numbers form a group under multiplication. So do the nonzero quaternions. The latter is our first example of a non-abelian group.
• More generally, any (skew) field gives rise to two groups: the additive group of all field elements, and the multiplicative group of all nonzero field elements.
• The complex numbers of absolute value 1 form a group under multiplication, best thought of as the unit circle. The quaternions of absolute value 1 form a group under multiplication, best thought of as the three-dimensional unit sphere $S^3$. The two-dimensional sphere $S^2$ however is not a group in any natural way.
Most groups of numbers carry natural topologies turning them into topological groups.
Symmetry groups
• The symmetric group of degree $n$, denoted by $S_n$, consists of all permutations of $n$ items and has $n!$ elements. Every finite group is isomorphic to a subgroup of some $S_n$.
• An important subgroup of the symmetric group of degree $n$ is the alternating group, denoted $A_n$. This consists of all even permutations on $n$ items. A permutation is said to be even if it can be written as the product of an even number of transpositions. The alternating group is normal in $S_n$, of index 2, and it is an interesting fact that $A_n$ is simple for $n \geq 5$. See the proof on the simplicity of the alternating groups. By the Jordan-Hölder theorem, this means that for $n \geq 5$ it is the only proper nontrivial normal subgroup of $S_n$.
• If any geometrical object is given, one can consider its symmetry group consisting of all rotations and reflections which leave the object unchanged. For example, the symmetry group of a cone is isomorphic to $S^1$.
• The set of all automorphisms of a given group (or field, or graph, or topological space, or object in any category) forms a group with operation given by the composition of homomorphisms. These are called automorphism groups; they capture the internal symmetries of the given objects.
• In Galois theory, the symmetry groups of field extensions (or equivalently: the symmetry groups of solutions to polynomial equations) are the central object of study; they are called Galois groups.
• Several matrix groups describe various aspects of the symmetry of $n$-space:
– The general linear group $GL(n, \mathbb{R})$ of all real invertible $n \times n$ matrices (with matrix multiplication as operation) contains rotations, reflections, dilations, shear transformations, and their combinations.
– The orthogonal group $O(n, \mathbb{R})$ of all real orthogonal $n \times n$ matrices contains the rotations and reflections of $n$-space.
– The special orthogonal group $SO(n, \mathbb{R})$ of all real orthogonal $n \times n$ matrices with determinant 1 contains the rotations of $n$-space.
All these matrix groups are Lie groups: groups which are differentiable manifolds such that the group operations are smooth maps.
Other groups
• The trivial group consists only of its identity element.
• If $X$ is a topological space and $x$ is a point of $X$, we can define the fundamental group of $X$ at $x$. It consists of (equivalence classes of) continuous paths starting and ending at $x$ and describes the structure of the "holes" in $X$ accessible from $x$.
• The free groups are important in algebraic topology. In a sense, they are the most general groups, having only those relations among their elements that are absolutely required by the group axioms.
• If $A$ and $B$ are two abelian groups (or modules over the same ring), then the set $\mathrm{Hom}(A, B)$ of all homomorphisms from $A$ to $B$ is an abelian group (since the sum and difference of two homomorphisms is again a homomorphism). Note that the commutativity of $B$ is crucial here: without it, one couldn't prove that the sum of two homomorphisms is again a homomorphism.
• The set of all invertible $n \times n$ matrices over some ring $R$ forms a group denoted by $GL(n, R)$.
• The positive integers less than $n$ which are coprime to $n$ form a group if the operation is defined as multiplication modulo $n$. This is an abelian group whose order is given by the Euler phi-function $\phi(n)$; it is cyclic for some $n$ (for example, when $n$ is prime) but not for all $n$ (for example, not for $n = 8$).
• Generalizing the last two examples, every ring (and every monoid) contains a group, its group of units (invertible elements), where the group operation is ring (monoid) multiplication.
• If $K$ is a number field, then multiplication of (equivalence classes of) nonzero ideals in the ring of algebraic integers $O_K$ gives rise to the ideal class group of $K$.
• The set of arithmetic functions that take a value other than 0 at 1 forms an abelian group under Dirichlet convolution. They include as a subgroup the set of multiplicative functions.
• Consider the curve $C = \{(x, y) \in \mathbb{R}^2 \mid y^2 = x^3 - x\}$. Every straight line intersects this set in three points (counting a point twice if the line is tangent, and allowing for a point at infinity). If we require that those three points add up to zero for any straight line, then we have defined an abelian group structure on $C$. Groups like these are called abelian varieties; the most prominent examples are elliptic curves, of which $C$ is the simplest one.
• In the classification of all finite simple groups, several "sporadic" groups occur which don't follow any discernible pattern. The largest of these is the monster group with some $8 \cdot 10^{53}$ elements.
Version: 14 Owner: AxelBoldt Author(s): AxelBoldt, NeuRet
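One of the examples in the list, the integers coprime to n under multiplication modulo n, is small enough to explore by direct enumeration. The sketch below is illustrative only (the names `units`, `order` and `is_cyclic` are not from the entry). It checks closure and looks for an element whose order equals the size of the group, which is what it means for the group to be cyclic.

```python
from math import gcd


def units(n):
    """Positive integers less than n that are coprime to n (n >= 2)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def is_closed(n):
    """Check that the product modulo n of two units is again a unit."""
    u = set(units(n))
    return all((a * b) % n in u for a in u for b in u)


def order(a, n):
    """Multiplicative order of the unit a modulo n."""
    k, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        k += 1
    return k


def is_cyclic(n):
    """True if some unit generates the whole group of units modulo n."""
    u = units(n)
    return any(order(a, n) == len(u) for a in u)


for n in (7, 8, 9, 12):
    print(n, len(units(n)), is_cyclic(n))
```

The number of units is the value of the Euler phi-function at n. The loop shows that the group is cyclic for n = 7 and n = 9 but not for n = 8 or n = 12, where every element squares to 1.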
302.7
group
Group. A group is a pair $(G, *)$ where $G$ is a nonempty set and $*$ is a binary operation on $G$ that satisfies the following conditions.
• For any $a, b, c \in G$, $(a * b) * c = a * (b * c)$. (Associativity of the operation.)
• For any $a, b$ in $G$, $a * b$ belongs to $G$. (The operation $*$ is closed.)
• There is an element $e \in G$ such that $g * e = e * g = g$ for any $g \in G$. (Existence of identity element.)
• For any $g \in G$ there exists an element $h$ such that $g * h = h * g = e$. (Existence of inverses.)
Usually the symbol $*$ is omitted and we write $ab$ for $a * b$. Sometimes, the symbol $+$ is used to represent the operation, especially when the group is abelian. It can be proved that there is only one identity element, and that for every element there is only one inverse. Because of this we usually denote the inverse of $a$ as $a^{-1}$, or $-a$ when we are using additive notation. The identity element is also called the neutral element due to its behavior with respect to the operation. Version: 10 Owner: drini Author(s): drini
302.8
quotient group
Let $(G, *)$ be a group and $H$ a normal subgroup. The relation $\sim$ given by $a \sim b$ when $ab^{-1} \in H$ is an equivalence relation. The equivalence classes are called cosets. The equivalence class of $a$ is denoted as $aH$ (or $a + H$ if additive notation is being used).
We can induce a group structure on the cosets with the following operation:
$$(aH) \star (bH) = (a * b)H.$$
The collection of cosets is denoted as $G/H$ and together with the $\star$ operation forms the quotient group or factor group of $G$ with $H$.
Example. Consider the group $\mathbb{Z}$ and the subgroup $3\mathbb{Z} = \{n \in \mathbb{Z} : n = 3k,\ k \in \mathbb{Z}\}$. Since $\mathbb{Z}$ is abelian, $3\mathbb{Z}$ is a normal subgroup. Using additive notation, the equivalence relation becomes $n \sim m$ when $(n - m) \in 3\mathbb{Z}$, that is, 3 divides $n - m$. So the relation is actually congruence modulo 3. Therefore the equivalence classes (the cosets) are:
$3\mathbb{Z} = \{\ldots, -9, -6, -3, 0, 3, 6, 9, \ldots\}$
$1 + 3\mathbb{Z} = \{\ldots, -8, -5, -2, 1, 4, 7, 10, \ldots\}$
$2 + 3\mathbb{Z} = \{\ldots, -7, -4, -1, 2, 5, 8, 11, \ldots\}$
which we'll represent as $\bar{0}$, $\bar{1}$ and $\bar{2}$. Then we can check that $\mathbb{Z}/3\mathbb{Z}$ is actually the integers modulo 3 (that is, $\mathbb{Z}/3\mathbb{Z} \cong \mathbb{Z}_3$). Version: 6 Owner: drini Author(s): drini
Chapter 303 2002 – Research exposition (monographs, survey articles)
303.1 length function
Let $G$ be a group. A length function on $G$ is a function $L : G \to \mathbb{R}^+$ satisfying:
$L(e) = 0$,
$L(g) = L(g^{-1})$, $\forall g \in G$,
$L(g_1 g_2) \leq L(g_1) + L(g_2)$, $\forall g_1, g_2 \in G$.
Version: 2 Owner: mhale Author(s): mhale
Chapter 304 20XX – Group theory and generalizations
304.1 free product with amalgamated subgroup
Definition 26. Let $G_k$, $k = 0, 1, 2$ be groups and $i_k : G_0 \to G_k$, $k = 1, 2$ be monomorphisms. The free product of $G_1$ and $G_2$ with amalgamated subgroup $G_0$ is defined to be a group $G$ that has the following two properties:
1. There are homomorphisms $j_k : G_k \to G$, $k = 1, 2$, that make the following diagram commute (that is, $j_1 \circ i_1 = j_2 \circ i_2$):
[Diagram: commutative square with $i_1 : G_0 \to G_1$, $i_2 : G_0 \to G_2$, $j_1 : G_1 \to G$, $j_2 : G_2 \to G$.]
2. $G$ is universal with respect to the previous property, that is, for any other group $G'$ and homomorphisms $j'_k : G_k \to G'$, $k = 1, 2$, that fit in such a commutative diagram there is a unique homomorphism $G \to G'$ so that the following diagram commutes:
[Diagram: the square above together with $j'_1 : G_1 \to G'$, $j'_2 : G_2 \to G'$ and the unique homomorphism $! : G \to G'$.]
It follows by "general nonsense" that the free product of $G_1$ and $G_2$ with amalgamated subgroup $G_0$, if it exists, is "unique up to unique isomorphism." The free product of $G_1$ and $G_2$ with amalgamated subgroup $G_0$ is denoted by $G_1 *_{G_0} G_2$. The following theorem asserts its existence.
Theorem 1. $G_1 *_{G_0} G_2$ exists for any groups $G_k$, $k = 0, 1, 2$ and monomorphisms $i_k : G_0 \to G_k$, $k = 1, 2$.
[Sketch of proof] Without loss of generality assume that $G_0$ is a subgroup of $G_k$ and that $i_k$ is the inclusion for $k = 1, 2$. Let
$$G_k = \langle (x_{k;s})_{s \in S} \mid (r_{k;t})_{t \in T} \rangle$$
be a presentation of $G_k$ for $k = 1, 2$. Each $g \in G_0$ can be expressed as a word in the generators of $G_k$; denote that word by $w_k(g)$ and let $N$ be the normal closure of $\{w_1(g)w_2(g)^{-1} \mid g \in G_0\}$ in the free product $G_1 * G_2$. Define
$$G_1 *_{G_0} G_2 := (G_1 * G_2)/N$$
and for $k = 1, 2$ define $j_k$ to be the inclusion into the free product followed by the canonical projection. Clearly (1) is satisfied, while (2) follows from the universal properties of the free product and the quotient group.
Notice that in the above proof it would be sufficient to divide by the relations $w_1(g)w_2(g)^{-1}$ for $g$ in a generating set of $G_0$. This is useful in practice when one is interested in obtaining a presentation of $G_1 *_{G_0} G_2$. In case the $i_k$'s are not injective the above still goes through verbatim. The group thus obtained is called a "pushout". Examples of free products with amalgamated subgroups are provided by Van Kampen's theorem. Version: 1 Owner: Dr Absentius Author(s): Dr Absentius
304.2
nonabelian group
Let $(G, *)$ be a group. If $a * b \neq b * a$ for some $a, b \in G$, we say that the group is non-abelian or non-commutative. Proposition. There is a non-abelian group for which $x \mapsto x^3$ is a homomorphism. Version: 2 Owner: drini Author(s): drini, apmxi
Chapter 305 20A05 – Axiomatics and elementary properties
305.1 Feit-Thompson theorem
An important result in the classification of all finite simple groups, the Feit-Thompson theorem states that every finite non-abelian simple group must have even order. The proof requires 255 pages. Version: 1 Owner: mathcam Author(s): mathcam
305.2
Proof: The orbit of any element of a group is a subgroup
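Following is a proof that, if $G$ is a group and $g \in G$, then $\langle g \rangle \leq G$. Here $\langle g \rangle$ is the orbit of $g$ and is defined as
$$\langle g \rangle = \{g^n : n \in \mathbb{Z}\}.$$
Since $g \in \langle g \rangle$, $\langle g \rangle$ is nonempty. Let $a, b \in \langle g \rangle$. Then there exist $x, y \in \mathbb{Z}$ such that $a = g^x$ and $b = g^y$. Since $ab^{-1} = g^x (g^y)^{-1} = g^x g^{-y} = g^{x-y} \in \langle g \rangle$, it follows that $\langle g \rangle \leq G$. Version: 3 Owner: drini Author(s): drini, Wkbj79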
305.3
center
The center of a group $G$ is the subgroup of elements which commute with every other element. Formally
$$Z(G) = \{x \in G \mid xg = gx,\ \forall g \in G\}.$$
It can be shown that the center has the following properties:
• It is nonempty since it contains at least the identity element.
• It consists of those conjugacy classes containing just one element.
• The center of an abelian group is the entire group.
• It is normal in $G$.
• Every $p$-group has a nontrivial center.
Version: 5 Owner: vitriol Author(s): vitriol
305.4
characteristic subgroup
If $(G, *)$ is a group, then $H$ is a characteristic subgroup of $G$ ($H \operatorname{char} G$) if every automorphism of $G$ maps $H$ to itself. That is:
$$\forall f \in \mathrm{Aut}(G)\ \forall h \in H:\ f(h) \in H$$
or, equivalently:
$$\forall f \in \mathrm{Aut}(G):\ f[H] = H.$$
A few properties of characteristic subgroups:
(a) If $H \operatorname{char} G$ then $H$ is a normal subgroup of $G$.
(b) If $G$ has only one subgroup of a given size then that subgroup is characteristic.
(c) If $K \operatorname{char} H$ and $H \trianglelefteq G$ then $K \trianglelefteq G$ (in contrast, normality of subgroups is not transitive).
(d) If $K \operatorname{char} H$ and $H \operatorname{char} G$ then $K \operatorname{char} G$.
Proofs of these properties:
(a) Consider $H \operatorname{char} G$ under the inner automorphisms of $G$. Since every automorphism preserves $H$, in particular every inner automorphism preserves $H$, and therefore $g * h * g^{-1} \in H$ for any $g \in G$ and $h \in H$. This is precisely the definition of a normal subgroup.
(b) Suppose $H$ is the only subgroup of $G$ of order $n$. In general, homomorphisms take subgroups to subgroups, and of course isomorphisms take subgroups to subgroups of the same order. But since there is only one subgroup of $G$ of order $n$, any automorphism must take $H$ to $H$, and so $H \operatorname{char} G$.
(c) Take $K \operatorname{char} H$ and $H \trianglelefteq G$, and consider the inner automorphisms of $G$ (automorphisms of the form $h \mapsto g * h * g^{-1}$ for some $g \in G$). These all preserve $H$, and so restrict to automorphisms of $H$. But any automorphism of $H$ preserves $K$, so for any $g \in G$ and $k \in K$, $g * k * g^{-1} \in K$.
(d) Let $K \operatorname{char} H$ and $H \operatorname{char} G$, and let $\phi$ be an automorphism of $G$. Since $H \operatorname{char} G$, $\phi[H] = H$, so $\phi|_H$, the restriction of $\phi$ to $H$, is an automorphism of $H$. Since $K \operatorname{char} H$, $\phi|_H[K] = K$. But $\phi|_H$ is just a restriction of $\phi$, so $\phi[K] = K$. Hence $K \operatorname{char} G$.
Version: 1 Owner: Henry Author(s): Henry
305.5
class function
Given a ﬁeld K, a K–valued class function on a group G is a function f : G −→ K such that f (g) = f (h) whenever g and h are elements of the same conjugacy class of G. An important example of a class function is the character of a group representation. Over the complex numbers, the set of characters of the irreducible representations of G form a basis for the vector space of all C–valued class functions, when G is a compact Lie group. Relation to the convolution algebra Class functions are also known as central functions, because they correspond to functions f in the convolution algebra C ∗ (G) that have the property f ∗ g = g ∗ f for all g ∈ C ∗ (G) (i.e., they commute with everything under the convolution operation). More precisely, the set of measurable complex valued class functions f is equal to the set of central elements of the convolution algebra C ∗ (G), for G a locally compact group admitting a Haar measure. Version: 5 Owner: djao Author(s): djao
305.6
conjugacy class
Two elements $g$ and $g'$ of a group $G$ are said to be conjugate if there exists $h \in G$ such that $g' = hgh^{-1}$. Conjugacy of elements is an equivalence relation, and the equivalence classes of $G$ are called conjugacy classes. Two subsets $S$ and $T$ of $G$ are said to be conjugate if there exists $g \in G$ such that
$$T = \{gsg^{-1} \mid s \in S\} \subset G.$$
In this situation, it is common to write $gSg^{-1}$ for $T$ to denote the fact that everything in $T$ has the form $gsg^{-1}$ for some $s \in S$. We say that two subgroups of $G$ are conjugate if they are conjugate as subsets. Version: 2 Owner: djao Author(s): djao
305.7
conjugacy class formula
The conjugacy classes of a group form a partition of its elements. In a finite group, this means that the order of the group is the sum of the numbers of elements of the distinct conjugacy classes. For an element $g$ of a group $G$, we denote the conjugacy class of $g$ by $C_g$ and the normalizer in $G$ of $g$ by $N_G(g)$. The number of elements in $C_g$ equals $[G : N_G(g)]$, the index of the normalizer of $g$ in $G$. For an element $g$ of the center $Z(G)$ of $G$, the conjugacy class of $g$ is the singleton $\{g\}$. Putting this together gives us the conjugacy class formula
$$|G| = |Z(G)| + \sum_{i=1}^{m} [G : N_G(x_i)]$$
where the xi are elements of the distinct conjugacy classes contained in G − Z(G). Version: 3 Owner: lieven Author(s): lieven
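The formula can be checked numerically on a small group. The following is a minimal sketch, assuming $S_3$ is represented as tuples of images and that the normalizer of a single element is computed as its centralizer (the two coincide for a one-element set); the helper names `compose`, `inverse`, `conjugacy_classes` and `centralizer` are illustrative and not part of the entry.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite permutation group into its conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        g = next(iter(remaining))
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        classes.append(cls)
        remaining -= cls
    return classes

def centralizer(group, g):
    """Elements commuting with g, i.e. the normalizer of the one-element set {g}."""
    return [h for h in group if compose(h, g) == compose(g, h)]

G = list(permutations(range(3)))  # the symmetric group S3
classes = conjugacy_classes(G)
center = [g for g in G if all(compose(g, h) == compose(h, g) for h in G)]
class_sizes = sorted(len(c) for c in classes)

# Right-hand side of the formula: |Z(G)| plus one index per non-central class.
noncentral = [c for c in classes if len(c) > 1]
rhs = len(center) + sum(
    len(G) // len(centralizer(G, next(iter(c)))) for c in noncentral
)
```

For $S_3$ the center is trivial and the two non-central classes contribute the indices 2 and 3, so both sides of the formula equal 6.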
305.8
conjugate stabilizer subgroups
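Let $\cdot$ be a right group action of $G$ on a set $M$. Then $G_{\alpha \cdot g} = g^{-1} G_\alpha g$ for any $\alpha \in M$ and $g \in G$, where $G_\alpha$ denotes the stabilizer subgroup of $\alpha \in M$.
Proof: $x \in G_{\alpha \cdot g} \leftrightarrow \alpha \cdot (gx) = \alpha \cdot g \leftrightarrow \alpha \cdot (gxg^{-1}) = \alpha \leftrightarrow gxg^{-1} \in G_\alpha \leftrightarrow x \in g^{-1} G_\alpha g$, and therefore $G_{\alpha \cdot g} = g^{-1} G_\alpha g$. Thus all stabilizer subgroups for elements of the orbit $G(\alpha)$ of $\alpha$ are conjugate to $G_\alpha$. Version: 4 Owner: Thomas Heye Author(s): Thomas Heye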
305.9
coset
Let $H$ be a subgroup of a group $G$, and let $a \in G$. The left coset of $a$ with respect to $H$ in $G$ is defined to be the set $aH := \{ah \mid h \in H\}$. The right coset of $a$ with respect to $H$ in $G$ is defined to be the set $Ha := \{ha \mid h \in H\}$. Two left cosets $aH$ and $bH$ of $H$ in $G$ are either identical or disjoint. Indeed, if $c \in aH \cap bH$, then $c = ah_1$ and $c = bh_2$ for some $h_1, h_2 \in H$, whence $b^{-1}a = h_2 h_1^{-1} \in H$. But then, given any $ah \in aH$, we have $ah = (bb^{-1})ah = b(b^{-1}a)h \in bH$, so $aH \subset bH$, and similarly $bH \subset aH$. Therefore $aH = bH$. Similarly, any two right cosets $Ha$ and $Hb$ of $H$ in $G$ are either identical or disjoint. Accordingly, the collection of left cosets (or right cosets) partitions the group $G$; the corresponding equivalence relation for left cosets can be described succinctly by the relation $a \sim b$ if $a^{-1}b \in H$, and for right cosets by $a \sim b$ if $ab^{-1} \in H$. The index of $H$ in $G$, denoted $[G : H]$, is the cardinality of the set $G/H$ of left cosets of $H$ in $G$. Version: 5 Owner: djao Author(s): rmilson, djao
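The partition property is easy to see on a small example. The sketch below assumes $G = S_3$ written as tuples of images and takes $H$ to be the two-element subgroup generated by one transposition; the choice of group, the subgroup and the helper `compose` are illustrative assumptions.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

G = list(permutations(range(3)))   # the symmetric group S3
H = [(0, 1, 2), (1, 0, 2)]         # identity and the transposition swapping 0 and 1

# Each coset is stored as a frozenset so that identical cosets collapse.
left_cosets = {frozenset(compose(a, h) for h in H) for a in G}
right_cosets = {frozenset(compose(h, a) for h in H) for a in G}

index = len(left_cosets)           # [G : H]
covered = set().union(*left_cosets)
```

Here the three left cosets have two elements each and cover all six elements, so they are pairwise disjoint. The left and right cosets differ as families of sets because this $H$ is not normal in $S_3$.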
305.10
cyclic group
A group $G$ is said to be cyclic if it is generated entirely by some $x \in G$. That is, if $G$ has infinite order then every $g \in G$ can be expressed as $x^k$ with $k \in \mathbb{Z}$. If $G$ has finite order then every $g \in G$ can be expressed as $x^k$ with $k \in \mathbb{N}_0$, and $G$ has exactly $\varphi(|G|)$ generators, where $\varphi$ is the Euler totient function. It is a corollary of Lagrange's theorem that every group of prime order is cyclic. All cyclic groups of the same order are isomorphic to each other. Consequently cyclic groups of order $n$ are often denoted by $C_n$. Every cyclic group is abelian.
Examples of cyclic groups are $(\mathbb{Z}_m, +_m)$ and $(\mathbb{Z}_p^*, \times_p)$, where $p$ is prime. The group $(R_m, \times_m)$ with $R_m = \{n \in \mathbb{N} : (n, m) = 1,\ n \le m\}$ is cyclic for some $m$ (for instance whenever $m$ is prime) but not for all $m$; for example $R_8$ is not cyclic. Version: 10 Owner: yark Author(s): yark, Larry Hammick, vitriol
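Whether the multiplicative group $R_m$ of residues coprime to $m$ is cyclic depends on $m$, and a brute-force check makes this visible. The following is a sketch only; the function names `units`, `element_order` and `is_cyclic_unit_group` are hypothetical helpers, not notation from the entry.

```python
from math import gcd

def units(m):
    """The set R_m of integers 1 <= a <= m that are coprime to m."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]

def element_order(a, m):
    """Multiplicative order of a modulo m (a is assumed coprime to m)."""
    k, power = 1, a % m
    while power != 1 % m:
        power = (power * a) % m
        k += 1
    return k

def is_cyclic_unit_group(m):
    """R_m is cyclic exactly when some element has order |R_m|."""
    group = units(m)
    return any(element_order(a, m) == len(group) for a in group)
```

With these definitions `is_cyclic_unit_group(7)` and `is_cyclic_unit_group(9)` hold, while $m = 8$ and $m = 12$ give non-cyclic groups in which every element has order at most 2.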
305.11
derived subgroup
Let $G$ be a group and $a, b \in G$. The group element $aba^{-1}b^{-1}$ is called the commutator of $a$ and $b$. An element of $G$ is called a commutator if it is the commutator of some $a, b \in G$. The subgroup of $G$ generated by all the commutators in $G$ is called the derived subgroup of $G$, and also the commutator subgroup. It is commonly denoted by $G'$ and also by $G^{(1)}$. Alternatively, one may define $G'$ as the smallest subgroup that contains all the commutators. Note that the commutator of $a, b \in G$ is trivial, i.e. $aba^{-1}b^{-1} = 1$, if and only if $a$ and $b$ commute. Thus, in a fashion, the derived subgroup measures the degree to which a group fails to be abelian.
Proposition 1. The derived subgroup $G'$ is normal in $G$, and the factor group $G/G'$ is abelian. Indeed, $G$ is abelian if and only if $G'$ is the trivial subgroup.
One can of course form the derived subgroup of the derived subgroup; this is called the second derived subgroup, and denoted by $G''$ or by $G^{(2)}$. Proceeding inductively one defines the $n$th derived subgroup $G^{(n)}$ as the derived subgroup of $G^{(n-1)}$. In this fashion one obtains a sequence of subgroups, called the derived series of $G$:
$$G = G^{(0)} \supseteq G^{(1)} \supseteq G^{(2)} \supseteq \dots$$
Proposition 2. The group $G$ is solvable if and only if the derived series terminates in the trivial group $\{1\}$ after a finite number of steps. In this case, one can refine the derived series to obtain a composition series (a.k.a. a Jordan-Hölder decomposition) of $G$. Version: 4 Owner: rmilson Author(s): rmilson
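The derived series can be computed by brute force for small permutation groups. The sketch below assumes groups are given as sets of permutation tuples and stops as soon as the series stabilizes; `generated_subgroup`, `derived_subgroup` and `derived_series` are illustrative names rather than anything defined in the entry.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, identity):
    """Closure under products; in a finite group this is the generated subgroup."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

def derived_subgroup(group):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    identity = tuple(range(len(next(iter(group)))))
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, identity)

def derived_series(group):
    """List G, G', G'', ... until the series stops changing."""
    series = [set(group)]
    while True:
        nxt = derived_subgroup(series[-1])
        if nxt == series[-1]:
            break
        series.append(nxt)
    return series

S4 = set(permutations(range(4)))
orders = [len(H) for H in derived_series(S4)]
```

For $S_4$ the orders are 24, 12, 4, 1, so the series reaches the trivial group and $S_4$ is solvable. For $S_5$ the same computation stops at a subgroup of order 60 that equals its own derived subgroup, so the series never reaches $\{1\}$.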
305.12
equivariant
Let G be a group, and X and Y left (resp. right) homogeneous spaces of G. Then a map f : X → Y is called equivariant if g(f (x)) = f (gx) (resp. (f (x))g = f (xg)) for all g ∈ G. Version: 1 Owner: bwebste Author(s): bwebste
305.13
examples of ﬁnite simple groups
This entry under construction. If I take too long to ﬁnish it, nag me about it, or ﬁll in the rest yourself. All groups considered here are ﬁnite. It is now widely believed that the classiﬁcation of all ﬁnite simple groups up to isomorphism is ﬁnished. The proof runs for at least 10,000 printed pages, and as of the writing of this entry, has not yet been published in its entirety.
Abelian groups
• The first, trivial examples of simple groups are the cyclic groups of prime order. It is not difficult to see (say, by Cauchy's theorem) that these are the only abelian simple groups.

Alternating groups
• The alternating group on $n$ symbols is the set of all even permutations in $S_n$, the symmetric group on $n$ symbols. It is usually denoted by $A_n$, or sometimes by $\mathrm{Alt}(n)$. This is a normal subgroup of $S_n$, namely the kernel of the homomorphism that sends every even permutation to $1$ and every odd permutation to $-1$. Because every permutation is either even or odd, and there is a bijection between the two kinds (multiply every even permutation by a fixed transposition), the index of $A_n$ in $S_n$ is 2. $A_3$ is simple because it has only three elements, and the simplicity of $A_n$ for $n \ge 5$ can be proved by an elementary argument. The simplicity of the alternating groups is an important fact that Évariste Galois required in order to prove the insolubility by radicals of the general polynomial of degree higher than four.

Groups of Lie type
• Projective special linear groups
• Other groups of Lie type

Sporadic groups
There are twenty-six sporadic groups (no more, no less!) that do not fit into any of the infinite sequences of simple groups considered above. These often arise as the group of automorphisms of strongly regular graphs.
• Mathieu groups
• Janko groups
• The baby monster
• The monster
Version: 8 Owner: drini Author(s): bbukh, yark, NeuRet
305.14
ﬁnitely generated group
A group G is ﬁnitely generated if there is a ﬁnite subset X ⊆ G such that X generates G. That is, every element of G is a product of elements of X and inverses of elements of X. Or, equivalently, no proper subgroup of G contains X. Every ﬁnite group is ﬁnitely generated, as we can take X = G. Every ﬁnitely generated group is countable. Version: 6 Owner: yark Author(s): yark, nerdy2
305.15
ﬁrst isomorphism theorem
If $f : G \to H$ is a homomorphism of groups (or rings, or modules), then it induces an isomorphism $G/\ker f \approx \operatorname{im} f$. Version: 2 Owner: nerdy2 Author(s): nerdy2
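A small numerical instance may help fix ideas. The sketch below assumes the additive group $\mathbb{Z}_{12}$ and the homomorphism $x \mapsto 3x \bmod 12$, both chosen purely for illustration; it builds the cosets of the kernel and checks that sending a coset to the common image of its elements is a well-defined bijection onto the image.

```python
n = 12
G = range(n)

def f(x):
    """The homomorphism Z_12 -> Z_12 given by x -> 3x (mod 12)."""
    return (3 * x) % n

kernel = {x for x in G if f(x) == 0}
image = {f(x) for x in G}

# Cosets x + ker f, each stored as a frozenset so equal cosets coincide.
quotient = {frozenset((x + k) % n for k in kernel) for x in G}

# theta maps a coset to the set of values f takes on it;
# theta is well defined exactly when each of these sets has one element.
theta = {coset: {f(x) for x in coset} for coset in quotient}
theta_values = {next(iter(values)) for values in theta.values()}
```

In this instance the kernel is $\{0, 4, 8\}$, the quotient has four cosets, and the four values of `theta` are exactly the four elements of the image.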
305.16
fourth isomorphism theorem
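1: $X$ group
2: $N \trianglelefteq X$
3: $A$ set of subgroups of $X$ that contain $N$
4: $B$ set of subgroups of $X/N$
5: $\exists \varphi : A \to B$ bijection, $\varphi(Y) = Y/N$ : $\forall\, Y, Z \le X$ : $N \le Y$ & $N \le Z$ $\Rightarrow$
   ($Y \le Z \Leftrightarrow Y/N \le Z/N$)
   & ($Z \le Y \Rightarrow [Y : Z] = [Y/N : Z/N]$)
   & $\langle Y, Z \rangle / N = \langle Y/N, Z/N \rangle$
   & $(Y \cap Z)/N = Y/N \cap Z/N$
   & ($Y \trianglelefteq X \Leftrightarrow Y/N \trianglelefteq X/N$)
Note: This is a "seed" entry written using a shorthand format described in this FAQ. Version: 2 Owner: bwebste Author(s): yark, apmxi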
305.17
generator
If $G$ is a cyclic group and $g \in G$, then $g$ is a generator of $G$ if $\langle g \rangle = G$.
All infinite cyclic groups have exactly 2 generators. Let $G$ be an infinite cyclic group and $g$ be a generator of $G$. Let $z \in \mathbb{Z}$ such that $g^z$ is a generator of $G$. Then $\langle g^z \rangle = G$. Since $g \in G$, then $g \in \langle g^z \rangle$. Thus, there exists $n \in \mathbb{Z}$ with $g = (g^z)^n = g^{nz}$. Thus, $g^{nz-1} = e_G$. Since $G$ is infinite and $|g| = |\langle g \rangle| = |G|$ must be infinity, then $nz - 1 = 0$. Since $nz = 1$ and $n$ and $z$ are integers, then $n = z = 1$ or $n = z = -1$. It follows that the only generators of $G$ are $g$ and $g^{-1}$.
A finite cyclic group of order $n$ has exactly $\varphi(n)$ generators, where $\varphi$ is the Euler totient function. Let $G$ be a finite cyclic group of order $n$ and $g$ be a generator of $G$. Then $|g| = |\langle g \rangle| = |G| = n$. Let $z \in \mathbb{Z}$ such that $g^z$ is a generator of $G$. By the division algorithm, there exist $q, r \in \mathbb{Z}$ with $0 \le r < n$ such that $z = qn + r$. Thus, $g^z = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r = (e_G)^q g^r = e_G g^r = g^r$. Since $g^r$ is a generator of $G$, then $\langle g^r \rangle = G$. Thus,
$$n = |G| = |\langle g^r \rangle| = |g^r| = \frac{|g|}{\gcd(r, |g|)} = \frac{n}{\gcd(r, n)}.$$
Therefore, $\gcd(r, n) = 1$, and the result follows. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
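The count of generators in the finite case can be confirmed directly. This is a small sketch assuming the additive model $\mathbb{Z}_n$ of a cyclic group of order $n$, in which the "power" $g^k$ becomes the multiple $k \cdot g \bmod n$; `generators` and `phi` are illustrative helper names.

```python
from math import gcd

def generators(n):
    """Elements r of Z_n whose multiples exhaust the whole group."""
    return [r for r in range(n) if len({(r * k) % n for k in range(n)}) == n]

def phi(n):
    """Euler totient: the number of 1 <= r <= n with gcd(r, n) = 1."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)
```

For example `generators(12)` lists the residues 1, 5, 7 and 11, which are exactly the residues coprime to 12.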
305.18
group actions and homomorphisms
Notes on group actions and homomorphisms
Let $G$ be a group, $X$ a nonempty set and $S_X$ the symmetric group of $X$, i.e. the group of all bijective maps on $X$. Let $\cdot$ denote a left group action of $G$ on $X$.
1. For each $g \in G$ we define $f_g : X \to X$, $f_g(x) = g \cdot x$ for all $x \in X$. Then $f_{g^{-1}}(f_g(x)) = g^{-1} \cdot (g \cdot x) = x$ for all $x \in X$, so $f_{g^{-1}}$ is the inverse of $f_g$; hence $f_g$ is bijective and thus an element of $S_X$. We define $F : G \to S_X$, $F(g) = f_g$ for all $g \in G$. This mapping is a group homomorphism: let $g, h \in G$ and $x \in X$. Then
$$F(gh)(x) = f_{gh}(x) = (gh) \cdot x = g \cdot (h \cdot x) = (f_g \circ f_h)(x) = (F(g) \circ F(h))(x)$$
implies $F(gh) = F(g) \circ F(h)$. The same is obviously true for a right group action.
2. Now let $F : G \to S_X$ be a group homomorphism. Then $f : G \times X \to X$, $(g, x) \mapsto F(g)(x)$ satisfies
(a) $f(1_G, x) = F(1_G)(x) = x$ for all $x \in X$ and
(b) $f(gh, x) = F(gh)(x) = (F(g) \circ F(h))(x) = F(g)(F(h)(x)) = f(g, f(h, x))$,
so $f$ is a group action induced by $F$.
Characterization of group actions
Let $G$ be a group acting on a set $X$. Using the same notation as above, we have for each $g \in \ker(F)$
$$F(g) = \mathrm{id}_X = f_g \leftrightarrow g \cdot x = x \ \ \forall x \in X \leftrightarrow g \in \bigcap_{x \in X} G_x \qquad (305.18.1)$$
and it follows
$$\ker(F) = \bigcap_{x \in X} G_x.$$
Let $G$ act transitively on $X$. Then for any $x \in X$, $X$ is the orbit $G(x)$ of $x$. As shown in "conjugate stabilizer subgroups", all stabilizer subgroups of elements $y \in G(x)$ are conjugate subgroups to $G_x$ in $G$. From the above it follows that
$$\ker(F) = \bigcap_{g \in G} g G_x g^{-1}.$$
For a faithful operation of $G$ the condition $g \cdot x = x \ \forall x \in X \rightarrow g = 1_G$ is equivalent to $\ker(F) = \{1_G\}$, and therefore $F : G \to S_X$ is a monomorphism. For the trivial operation of $G$ on $X$ given by $g \cdot x = x$ for all $g \in G$, the stabilizer subgroup $G_x$ is $G$ for all $x \in X$, and thus $\ker(F) = G$. The corresponding homomorphism is $g \mapsto \mathrm{id}_X$ for all $g \in G$. If the operation of $G$ on $X$ is free, then $G_x = \{1_G\}$ for all $x \in X$, thus the kernel of $F$ is $\{1_G\}$, as for a faithful operation. The converse fails: let $X = \{1, \dots, n\}$ with $n \ge 3$ and $G = S_n$. Then the operation of $G$ on $X$ given by $\pi \cdot i := \pi(i)$ for all $i \in X$, $\pi \in S_n$ is faithful but not free. Version: 5 Owner: Thomas Heye Author(s): Thomas Heye
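The correspondence between actions and homomorphisms, and the description of the kernel as an intersection of stabilizers, can be traced on a toy example. The sketch below assumes the additive group $\mathbb{Z}_6$ acting on $X = \{0, 1, 2\}$ by $g \cdot x = (x + g) \bmod 3$; this action and the names `act`, `F` and `compose` are illustrative choices, not taken from the entry.

```python
n, X = 6, (0, 1, 2)
G = range(n)

def act(g, x):
    """Left action of Z_6 on X = {0, 1, 2}: g . x = (x + g) mod 3."""
    return (x + g) % 3

def F(g):
    """The permutation f_g of X, stored as the tuple (f_g(0), f_g(1), f_g(2))."""
    return tuple(act(g, x) for x in X)

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

# F(g + h) must equal F(g)∘F(h) for F to be a homomorphism into S_X.
is_homomorphism = all(
    F((g + h) % n) == compose(F(g), F(h)) for g in G for h in G
)

kernel = {g for g in G if F(g) == X}           # elements acting as the identity
stabilizers = {x: {g for g in G if act(g, x) == x} for x in X}
intersection = set.intersection(*stabilizers.values())
```

This action is transitive but not faithful: every stabilizer is $\{0, 3\}$, and that is also the kernel of $F$.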
305.19
group homomorphism
Let $(G, *_g)$ and $(K, *_k)$ be two groups. A group homomorphism is a function $\phi : G \to K$ such that $\phi(s *_g t) = \phi(s) *_k \phi(t)$ for all $s, t \in G$. The composition of group homomorphisms is again a homomorphism. Let $\phi : G \to K$ be a group homomorphism. Then
• $\phi(e_g) = e_k$, where $e_g$ and $e_k$ are the respective identity elements for $G$ and $K$.
• $\phi(g)^{-1} = \phi(g^{-1})$ for all $g \in G$.
• $\phi(g)^z = \phi(g^z)$ for all $g \in G$ and for all $z \in \mathbb{Z}$.
The kernel of $\phi$ is a subgroup of $G$ and its image is a subgroup of $K$. Some special homomorphisms have special names. If $\phi : G \to K$ is injective, we say that $\phi$ is a monomorphism, and if $\phi$ is onto we call it an epimorphism. When $\phi$ is both injective and surjective (that is, bijective) we call it an isomorphism. In the latter case we also say that $G$ and $K$ are isomorphic, meaning they are basically the same group (have the same structure). A homomorphism from $G$ to itself is called an endomorphism, and if it is bijective, then it is called an automorphism. Version: 15 Owner: drini Author(s): saforres, drini
305.20
homogeneous space
Overview and definition. Let $G$ be a group acting transitively on a set $X$. In other words, we consider a homomorphism $\phi : G \to \mathrm{Perm}(X)$, where the latter denotes the group of all bijections of $X$. If we consider $G$ as being, in some sense, the automorphisms of $X$, the transitivity assumption means that it is impossible to distinguish a particular element of $X$ from any other element. Since the elements of $X$ are indistinguishable, we call $X$ a homogeneous space. Indeed, the concept of a homogeneous space is logically equivalent to the concept of a transitive group action.
Action on cosets. Let $G$ be a group, $H < G$ a subgroup, and let $G/H$ denote the set of left cosets, as above. For every $g \in G$ we consider the mapping $\psi_H(g) : G/H \to G/H$ with action $aH \mapsto gaH$, $a \in G$.
Proposition 3. The mapping $\psi_H(g)$ is a bijection. The corresponding mapping $\psi_H : G \to \mathrm{Perm}(G/H)$ is a group homomorphism, specifying a transitive group action of $G$ on $G/H$.
Thus, $G/H$ has the natural structure of a homogeneous space. Indeed, we shall see that every homogeneous space $X$ is isomorphic to $G/H$, for some subgroup $H$. N.B. In geometric applications, one wants the homogeneous space $X$ to have some extra structure, like a topology or a differential structure. Correspondingly, the group of automorphisms is either a continuous group or a Lie group. In order for the quotient space $X$ to have a Hausdorff topology, we need to assume that the subgroup $H$ is closed in $G$.
The isotropy subgroup and the basepoint identification. Let $X$ be a homogeneous space. For $x \in X$, the subgroup $H_x = \{h \in G : hx = x\}$, consisting of all $G$-actions that fix $x$, is called the isotropy subgroup at the basepoint $x$. We identify the space of cosets $G/H_x$ with the homogeneous space by means of the mapping $\tau_x : G/H_x \to X$, defined by $\tau_x(aH_x) = ax$, $a \in G$.
Proposition 4. The above mapping is a well-defined bijection. To show that $\tau_x$ is well defined, let $a, b \in G$ be members of the same left coset, i.e. there exists an $h \in H_x$ such that $b = ah$. Consequently $bx = a(hx) = ax$, as desired. The mapping $\tau_x$ is onto because the action of $G$ on $X$ is assumed to be transitive. To show that $\tau_x$ is one-to-one, consider two cosets $aH_x$, $bH_x$, $a, b \in G$, such that $ax = bx$. It follows that $b^{-1}a$ fixes $x$, and hence is an element of $H_x$. Therefore $aH_x$ and $bH_x$ are the same coset.
The homogeneous space as a quotient. Next, let us show that $\tau_x$ is equivariant relative to the action of $G$ on $X$ and the action of $G$ on the quotient $G/H_x$.
Proposition 5. We have that $\phi(g) \circ \tau_x = \tau_x \circ \psi_{H_x}(g)$ for all $g \in G$. To prove this, let $g, a \in G$ be given, and note that $\psi_{H_x}(g)(aH_x) = gaH_x$. The latter coset corresponds under $\tau_x$ to the point $gax$, as desired. Finally, let us note that $\tau_x$ identifies the point $x \in X$ with the coset of the identity element $eH_x$, that is to say, with the subgroup $H_x$ itself. For this reason, the point $x$ is often called the basepoint of the identification $\tau_x : G/H_x \to X$.
The choice of basepoint. Next, we consider the effect of the choice of basepoint on the quotient structure of a homogeneous space. Let $X$ be a homogeneous space.
Proposition 6. The set of all isotropy subgroups $\{H_x : x \in X\}$ forms a single conjugacy class of subgroups in $G$. To show this, let $x_0, x_1 \in X$ be given. By the transitivity of the action we may choose a $\hat{g} \in G$ such that $x_1 = \hat{g} x_0$. Hence, for all $h \in G$ satisfying $h x_0 = x_0$, we have
$$(\hat{g} h \hat{g}^{-1}) x_1 = \hat{g}(h(\hat{g}^{-1} x_1)) = \hat{g} x_0 = x_1.$$
Similarly, for all $h \in H_{x_1}$ we have that $\hat{g}^{-1} h \hat{g}$ fixes $x_0$. Therefore,
$$\hat{g} (H_{x_0}) \hat{g}^{-1} = H_{x_1};$$
or what is equivalent, for all $x \in X$ and $g \in G$ we have
$$g H_x g^{-1} = H_{gx}.$$
Equivariance. Since we can identify a homogeneous space $X$ with $G/H_x$ for every possible $x \in X$, it stands to reason that there exist equivariant bijections between the different $G/H_x$. To describe these, let $H_0, H_1 < G$ be conjugate subgroups with
$$H_1 = \hat{g} H_0 \hat{g}^{-1}$$
for some fixed $\hat{g} \in G$. Let us set
$$X = G/H_0,$$
and let $x_0$ denote the identity coset $H_0$, and $x_1$ the coset $\hat{g} H_0$. What is the subgroup of $G$ that fixes $x_1$? In other words, what are all the $h \in G$ such that
$$h \hat{g} H_0 = \hat{g} H_0,$$
or what is equivalent, all $h \in G$ such that
$$\hat{g}^{-1} h \hat{g} \in H_0.$$
The collection of all such $h$ is precisely the subgroup $H_1$. Hence, $\tau_{x_1} : G/H_1 \to G/H_0$ is the desired equivariant bijection. This is a well defined mapping from the set of $H_1$-cosets to the set of $H_0$-cosets, with action given by
$$\tau_{x_1}(aH_1) = a \hat{g} H_0, \quad a \in G.$$
Let $\psi_0 : G \to \mathrm{Perm}(G/H_0)$ and $\psi_1 : G \to \mathrm{Perm}(G/H_1)$ denote the corresponding coset $G$-actions.
Proposition 7. For all $g \in G$ we have that $\tau_{x_1} \circ \psi_1(g) = \psi_0(g) \circ \tau_{x_1}$. Version: 3 Owner: rmilson Author(s): rmilson
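The basepoint identification and the conjugacy of isotropy subgroups can both be observed on a small homogeneous space. The following sketch assumes $G = S_3$ acting on $X = \{0, 1, 2\}$ by evaluation, with permutations stored as tuples of images; `isotropy` and `tau` are hypothetical helper names mirroring $H_x$ and $\tau_x$.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))   # S3 acting on X by p . x = p[x]
X = range(3)

def isotropy(x):
    """The isotropy subgroup H_x of all group elements fixing x."""
    return frozenset(h for h in G if h[x] == x)

def tau(x):
    """Map each left coset a H_x to the set of points a . x attained on it."""
    H = isotropy(x)
    cosets = {frozenset(compose(a, h) for h in H) for a in G}
    return {coset: {a[x] for a in coset} for coset in cosets}

# Well defined: every coset is sent to a single point.
well_defined = all(len(pts) == 1 for x in X for pts in tau(x).values())
# Bijective: the points hit are exactly the elements of X, each once.
bijective = all(
    sorted(next(iter(pts)) for pts in tau(x).values()) == list(X) for x in X
)
# Conjugacy of isotropy subgroups: g H_x g^-1 = H_{gx}.
conjugate = all(
    frozenset(compose(compose(g, h), inverse(g)) for h in isotropy(x))
    == isotropy(g[x])
    for g in G for x in X
)
```

All three checks succeed here: each $H_x$ has two elements, $G/H_x$ has three cosets, and they correspond one-to-one to the three points of $X$.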
305.21
identity element
Let $G$ be a groupoid, that is, a set with a binary operation $G \times G \to G$, written multiplicatively so that $(x, y) \mapsto xy$. An identity element for $G$ is an element $e$ such that $ge = eg = g$ for all $g \in G$. The symbol $e$ is most commonly used for identity elements. Another common symbol for an identity element is 1, particularly in semigroup theory (and ring theory, considering the multiplicative structure as a semigroup). Groups, monoids, and loops are classes of groupoids that, by definition, always have an identity element. Version: 6 Owner: mclase Author(s): mclase, vypertd, imran
305.22
inner automorphism
Let $G$ be a group. For every $x \in G$, we define a mapping
$$\phi_x : G \to G, \quad y \mapsto xyx^{-1}, \quad y \in G,$$
called conjugation by $x$. It is easy to show that the conjugation map is in fact a group automorphism. An automorphism of $G$ that corresponds to the conjugation by some $x \in G$ is called inner. An automorphism that isn't inner is called an outer automorphism. The composition operation gives the set of all automorphisms of $G$ the structure of a group, $\mathrm{Aut}(G)$. The inner automorphisms also form a group, $\mathrm{Inn}(G)$, which is a normal subgroup of $\mathrm{Aut}(G)$. Indeed, if $\phi_x$, $x \in G$, is an inner automorphism and $\pi : G \to G$ an arbitrary automorphism, then
$$\pi \circ \phi_x \circ \pi^{-1} = \phi_{\pi(x)}.$$
Let us also note that the mapping
$$x \mapsto \phi_x, \quad x \in G,$$
is a surjective group homomorphism from $G$ onto $\mathrm{Inn}(G)$ with kernel $Z(G)$, the centre subgroup. Consequently, $\mathrm{Inn}(G)$ is naturally isomorphic to the quotient $G/Z(G)$. Version: 7 Owner: rmilson Author(s): rmilson, tensorking
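One consequence is that a finite group has exactly $|G| / |Z(G)|$ distinct inner automorphisms, which a brute-force count confirms. The sketch below assumes permutation groups stored as lists of tuples; $D_4$ is built here as the adjacency-preserving permutations of the four vertices of a square, and `conjugation`, `inner_automorphisms` and `center` are illustrative names.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugation(x, group):
    """The map y -> x y x^-1, recorded as the tuple of images of the listed elements."""
    x_inv = inverse(x)
    return tuple(compose(compose(x, y), x_inv) for y in group)

def inner_automorphisms(group):
    """The set of distinct conjugation maps of the group."""
    return {conjugation(x, group) for x in group}

def center(group):
    """Elements commuting with every element of the group."""
    return [z for z in group if all(compose(z, g) == compose(g, z) for g in group)]

S3 = list(permutations(range(3)))
# Symmetries of the square: permutations of the vertices 0..3 keeping neighbours adjacent.
D4 = [
    p for p in permutations(range(4))
    if all((p[(i + 1) % 4] - p[i]) % 4 in (1, 3) for i in range(4))
]
```

For $S_3$ the centre is trivial and all six conjugations are distinct, while $D_4$ has a centre of order 2 and therefore only four inner automorphisms.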
305.23
kernel
Let $\rho : G \to K$ be a group homomorphism. The preimage of the codomain identity element $e_K \in K$ forms a subgroup of the domain $G$, called the kernel of the homomorphism:
$$\ker(\rho) = \{s \in G \mid \rho(s) = e_K\}.$$
The kernel is a normal subgroup. It is the trivial subgroup if and only if $\rho$ is a monomorphism. Version: 9 Owner: rmilson Author(s): rmilson, Daume
305.24
maximal
Let $G$ be a group. A subgroup $H$ of $G$ is said to be maximal if $H \neq G$ and whenever $K$ is a subgroup of $G$ with $H \subseteq K \subseteq G$ then $K = H$ or $K = G$. Version: 1 Owner: Evandar Author(s): Evandar
305.25
normal subgroup
A subgroup $H$ of a group $G$ is normal if $aH = Ha$ for all $a \in G$. Equivalently, $H \subset G$ is normal if and only if $aHa^{-1} = H$ for all $a \in G$, i.e., if and only if each conjugacy class of $G$ is either entirely inside $H$ or entirely outside $H$. The notation $H \trianglelefteq G$ or $H \triangleleft G$ is often used to denote that $H$ is a normal subgroup of $G$.
The kernel $\ker(f)$ of any group homomorphism $f : G \to G'$ is a normal subgroup of $G$. More surprisingly, the converse is also true: any normal subgroup $H \subset G$ is the kernel of some homomorphism (one of these being the projection map $\rho : G \to G/H$, where $G/H$ is the quotient group). Version: 6 Owner: djao Author(s): djao
305.26
normality of subgroups is not transitive
Let $G$ be a group. Obviously, a subgroup $K \le H$ of a subgroup $H \le G$ of $G$ is a subgroup $K \le G$ of $G$. It seems plausible that a similar situation would also hold for normal subgroups. This is not true. Even when $K \trianglelefteq H$ and $H \trianglelefteq G$, it is possible that $K \ntrianglelefteq G$. Here are two examples:
1. Let $G$ be the group of orientation-preserving isometries of the plane $\mathbb{R}^2$ ($G$ is just all rotations and translations), let $H$ be the subgroup of $G$ of translations, and let $K$ be the subgroup of $H$ of integer translations $\tau_{i,j}(x, y) = (x + i, y + j)$, where $i, j \in \mathbb{Z}$.
Any element $g \in G$ may be represented as $g = r_1 \circ t_1 = t_2 \circ r_2$, where $r_{1,2}$ are rotations and $t_{1,2}$ are translations. So for any translation $t \in H$ we may write $g^{-1} \circ t \circ g = r^{-1} \circ t' \circ r$, where $t' \in H$ is some other translation and $r$ is some rotation. But this is an orientation-preserving isometry of the plane that does not rotate, so it too must be a translation. Thus $g^{-1} H g = H$ for every $g \in G$, and $H \trianglelefteq G$. $H$ is an abelian group, so all its subgroups, $K$ included, are normal in $H$. We claim that $K \ntrianglelefteq G$. Indeed, if $\rho \in G$ is rotation by $45^\circ$ about the origin, then $\rho^{-1} \circ \tau_{1,0} \circ \rho$ is not an integer translation.
2. A related example uses finite groups. Let $G = D_4$ be the dihedral group of the square, which has eight elements (the group of automorphisms of the graph of the square $C_4$). Then
$$D_4 = \langle r, f \mid f^2 = 1,\ r^4 = 1,\ fr = r^{-1}f \rangle$$
is generated by $r$, rotation, and $f$, flipping.
The subgroup
$$H = \langle rf, fr \rangle = \{1, rf, r^2, fr\} \cong C_2 \times C_2$$
is isomorphic to the Klein 4-group (an identity and 3 elements of order 2). $H \trianglelefteq G$ since $[G : H] = 2$. Finally, take
$$K = \langle rf \rangle = \{1, rf\} \trianglelefteq H.$$
We claim that $K \ntrianglelefteq G$. And indeed, $f \circ rf \circ f = fr \notin K$. Version: 4 Owner: ariels Author(s): ariels
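The finite example can be verified mechanically. The sketch below assumes a concrete model of $D_4$ as permutations of the square's vertices $0, 1, 2, 3$, with `r` a quarter turn and `f` the flip fixing vertices 0 and 2; these particular permutations and the helpers `generated_subgroup` and `is_normal` are illustrative assumptions.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators):
    """Closure under products; in a finite group this is the generated subgroup."""
    identity = tuple(range(len(generators[0])))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

def is_normal(sub, group):
    """Check that g k g^-1 stays in sub for all g in group and k in sub."""
    return all(
        compose(compose(g, k), inverse(g)) in sub for g in group for k in sub
    )

r = (1, 2, 3, 0)   # rotation: vertex i goes to i + 1 (mod 4)
f = (0, 3, 2, 1)   # flip across the diagonal through vertices 0 and 2
G = generated_subgroup([r, f])
rf, fr = compose(r, f), compose(f, r)
H = generated_subgroup([rf, fr])
K = generated_subgroup([rf])
```

In this model $G$, $H$ and $K$ have 8, 4 and 2 elements, `is_normal(H, G)` and `is_normal(K, H)` hold, and `is_normal(K, G)` fails because conjugating `rf` by `f` gives `fr`.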
305.27
normalizer
Let $G$ be a group, and let $H \subseteq G$. The normalizer of $H$ in $G$, written $N_G(H)$, is the set
$$\{g \in G \mid gHg^{-1} = H\}.$$
If H is a subgroup of G, then NG (H) is a subgroup of G containing H. Note that H is a normal subgroup of NG (H); in fact, NG (H) is the largest subgroup of G of which H is a normal subgroup. In particular, if H is a normal subgroup of G, then NG (H) = G. Version: 6 Owner: saforres Author(s): saforres
305.28
order (of a group)
The order of a group $G$ is the number of elements of $G$, denoted $|G|$; if $|G|$ is finite, then $G$ is said to be a finite group. The order of an element $g \in G$ is the smallest positive integer $n$ such that $g^n = e$, where $e$ is the identity element; if there is no such $n$, then $g$ is said to be of infinite order. Version: 5 Owner: saforres Author(s): saforres
305.29
presentation of a group
A presentation of a group $G$ is a description of $G$ in terms of generators and relations. We say that the group is finitely presented if it can be described in terms of a finite number of generators and a finite number of defining relations. A collection of group elements $g_i \in G$, $i \in I$, is said to generate $G$ if every element of $G$ can be specified as a product of the $g_i$ and of their inverses. A relation is a word over the alphabet consisting of the generators $g_i$ and their inverses, with the property that it multiplies out to the identity in $G$. A set of relations $r_j$, $j \in J$, is said to be defining if all relations in $G$ can be given as a product of the $r_j$, their inverses, and the $G$-conjugates of these. The standard notation for the presentation of a group is
$$G = \langle g_i \mid r_j \rangle,$$
meaning that $G$ is generated by generators $g_i$, subject to relations $r_j$. Equivalently, one has a short exact sequence of groups
$$1 \to N \to F[I] \to G \to 1,$$
where $F[I]$ denotes the free group generated by the $g_i$, and where $N$ is the smallest normal subgroup containing all the $r_j$. By the Nielsen-Schreier theorem, the kernel $N$ is itself a free group, and hence we assume without loss of generality that there are no relations among the relations.
Example. The symmetric group on $n$ elements $1, \dots, n$ admits the following finite presentation (Note: this presentation is not canonical. Other presentations are known.) As generators take
$$g_i = (i, i+1), \quad i = 1, \dots, n-1,$$
the transpositions of adjacent elements. As defining relations take
$$(g_i g_j)^{n_{i,j}} = \mathrm{id}, \quad i, j = 1, \dots, n-1,$$
where
$$n_{i,i} = 1, \qquad n_{i,i+1} = 3, \qquad n_{i,j} = 2 \ \text{ for } i + 1 < j.$$
This means that a finite symmetric group is a Coxeter group. Version: 11 Owner: rmilson Author(s): rmilson
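That the adjacent transpositions satisfy these relations and generate the whole symmetric group can be checked by direct computation. The sketch below takes $n = 5$ as an arbitrary choice; it does not show that the relations are defining, only that they hold. The helper names are hypothetical.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[i] for i in q)

def adjacent_transposition(i, n):
    """The transposition (i, i+1) of {1, ..., n}, stored 0-based as a tuple of images."""
    p = list(range(n))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def order(p):
    """Smallest k >= 1 such that p^k is the identity."""
    identity = tuple(range(len(p)))
    k, power = 1, p
    while power != identity:
        power = compose(power, p)
        k += 1
    return k

def coxeter_exponent(i, j):
    """The exponent n_{i,j}: 1 on the diagonal, 3 for neighbours, 2 otherwise."""
    if i == j:
        return 1
    return 3 if abs(i - j) == 1 else 2

def generated_subgroup(generators, n):
    """Closure under products; in a finite group this is the generated subgroup."""
    identity = tuple(range(n))
    elements, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

n = 5
gens = {i: adjacent_transposition(i, n) for i in range(1, n)}
relations_hold = all(
    order(compose(gens[i], gens[j])) == coxeter_exponent(i, j)
    for i in gens for j in gens
)
group_order = len(generated_subgroup(list(gens.values()), n))
```

For $n = 5$ every product $g_i g_j$ has order exactly $n_{i,j}$, and the four generators produce all $5! = 120$ permutations.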
305.30
proof of ﬁrst isomorphism theorem
Let $K$ denote $\ker f$. $K$ is a normal subgroup of $G$ because, by the following calculation, $gkg^{-1} \in K$ for all $g \in G$ and $k \in K$ (the rules of homomorphism imply the first equality, the definition of $K$ the second):
$$f(gkg^{-1}) = f(g)f(k)f(g)^{-1} = f(g)\,1_H\,f(g)^{-1} = 1_H.$$
Therefore, $G/K$ is well defined. Define a map $\theta : G/K \to \operatorname{im} f$ by
$$\theta(gK) = f(g).$$
We argue that $\theta$ is an isomorphism.
First, $\theta$ is well defined. Take two representatives, $g_1$ and $g_2$, of the same coset. By definition, $g_1 g_2^{-1}$ is in $K$. Hence, $f$ sends $g_1 g_2^{-1}$ to 1 (all elements of $K$ are sent by $f$ to 1). Consequently, the next calculation is valid:
$$f(g_1)f(g_2)^{-1} = f(g_1 g_2^{-1}) = 1,$$
but this is the same as saying that $f(g_1) = f(g_2)$. And we are done, because the last equality indicates that $\theta(g_1 K)$ is equal to $\theta(g_2 K)$.
Going backward through the last argument, we get that $\theta$ is also an injection: if $\theta(g_1 K)$ is equal to $\theta(g_2 K)$ then $f(g_1) = f(g_2)$ and hence $g_1 g_2^{-1} \in K$ (exactly as in the previous part), which implies an equality between $g_1 K$ and $g_2 K$.
Now, $\theta$ is a homomorphism. We need to show that $\theta(g_1 K \cdot g_2 K) = \theta(g_1 K)\theta(g_2 K)$ and that $\theta((gK)^{-1}) = (\theta(gK))^{-1}$. And indeed:
$$\theta(g_1 K \cdot g_2 K) = \theta(g_1 g_2 K) = f(g_1 g_2) = f(g_1) f(g_2) = \theta(g_1 K)\theta(g_2 K),$$
$$\theta((gK)^{-1}) = \theta(g^{-1} K) = f(g^{-1}) = (f(g))^{-1} = (\theta(gK))^{-1}.$$
To conclude, $\theta$ is surjective. Take $h$ to be an element of $\operatorname{im} f$ and $g$ a preimage of $h$ under $f$. Since $h = f(g)$, we have that $h$ is also the image of $gK$ under $\theta$. Version: 3 Owner: uriw Author(s): uriw
305.31
proof of second isomorphism theorem
First, we shall prove that $HK$ is a subgroup of $G$. Since $e \in H$ and $e \in K$, clearly $e = e^2 \in HK$. Take $h_1, h_2 \in H$ and $k_1, k_2 \in K$. Clearly $h_1 k_1, h_2 k_2 \in HK$. Further,
$$h_1 k_1 h_2 k_2 = h_1 (h_2 h_2^{-1}) k_1 h_2 k_2 = h_1 h_2 (h_2^{-1} k_1 h_2) k_2.$$
Since $K$ is a normal subgroup of $G$ and $h_2 \in G$, then $h_2^{-1} k_1 h_2 \in K$. Therefore $h_1 h_2 (h_2^{-1} k_1 h_2) k_2 \in HK$, so $HK$ is closed under multiplication. Also, $(hk)^{-1} \in HK$ for $h \in H$, $k \in K$, since
$$(hk)^{-1} = k^{-1} h^{-1} = h^{-1} (h k^{-1} h^{-1})$$
and $h k^{-1} h^{-1} \in K$ since $K$ is a normal subgroup of $G$. So $HK$ is closed under inverses, and is thus a subgroup of $G$. Since $HK$ is a subgroup of $G$, the normality of $K$ in $HK$ follows immediately from the normality of $K$ in $G$. Clearly $H \cap K$ is a subgroup of $G$, since it is the intersection of two subgroups of $G$.
Finally, define $\phi : H \to HK/K$ by $\phi(h) = hK$. We claim that $\phi$ is a surjective homomorphism from $H$ to $HK/K$. It is a homomorphism because $\phi(h_1 h_2) = h_1 h_2 K = (h_1 K)(h_2 K)$. Let $h_0 k_0 K$ be some element of $HK/K$; since $k_0 \in K$, then $h_0 k_0 K = h_0 K$, and $\phi(h_0) = h_0 K$. Now
$$\ker(\phi) = \{h \in H \mid \phi(h) = K\} = \{h \in H \mid hK = K\}$$
and if $hK = K$, then we must have $h \in K$. So
$$\ker(\phi) = \{h \in H \mid h \in K\} = H \cap K.$$
Thus, since $\phi(H) = HK/K$ and $\ker \phi = H \cap K$, by the first isomorphism theorem we see that $H \cap K$ is normal in $H$ and that there is a natural isomorphism between $H/(H \cap K)$ and $HK/K$. Version: 8 Owner: saforres Author(s): saforres
305.32
proof that all cyclic groups are abelian
Following is a proof that all cyclic groups are abelian. Let $G$ be a cyclic group and $g$ be a generator of $G$. Let $a, b \in G$. Then there exist $x, y \in \mathbb{Z}$ such that $a = g^x$ and $b = g^y$. Since $ab = g^x g^y = g^{x+y} = g^{y+x} = g^y g^x = ba$, it follows that $G$ is abelian. Version: 2 Owner: Wkbj79 Author(s): Wkbj79
305.33
proof that all cyclic groups of the same order are isomorphic to each other
The following is a proof that all cyclic groups of the same order are isomorphic to each other. Let $G$ be a cyclic group and $g$ be a generator of $G$. Define $\varphi : \mathbb{Z} \to G$ by $\varphi(c) = g^c$. Since $\varphi(a + b) = g^{a+b} = g^a g^b = \varphi(a)\varphi(b)$, then $\varphi$ is a group homomorphism. If $h \in G$, then there exists $x \in \mathbb{Z}$ such that $h = g^x$. Since $\varphi(x) = g^x = h$, then $\varphi$ is surjective. Its kernel is
$$\ker \varphi = \{c \in \mathbb{Z} \mid \varphi(c) = e_G\} = \{c \in \mathbb{Z} \mid g^c = e_G\}.$$
If $G$ is infinite, then $\ker \varphi = \{0\}$, and $\varphi$ is injective. Hence, $\varphi$ is a group isomorphism, and $G \cong \mathbb{Z}$. If $G$ is finite, then let $|G| = n$. Thus, $|g| = |\langle g \rangle| = |G| = n$, and $g^c = e_G$ if and only if $n$ divides $c$. Therefore, $\ker \varphi = n\mathbb{Z}$. By the first isomorphism theorem, $G \cong \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}_n$.
Let $H$ and $K$ be cyclic groups of the same order. If $H$ and $K$ are infinite, then, by the above argument, $H \cong \mathbb{Z}$ and $K \cong \mathbb{Z}$. If $H$ and $K$ are finite of order $n$, then, by the above argument, $H \cong \mathbb{Z}_n$ and $K \cong \mathbb{Z}_n$. In any case, it follows that $H \cong K$. Version: 1 Owner: Wkbj79 Author(s): Wkbj79
305.34
proof that all subgroups of a cyclic group are cyclic
Following is a proof that all subgroups of a cyclic group are cyclic. Let $G$ be a cyclic group and $H \le G$. If $G$ is trivial, then $H = G$, and $H$ is cyclic. If $H$ is the trivial subgroup, then $H = \{e_G\} = \langle e_G \rangle$, and $H$ is cyclic. Thus, for the remainder of the proof, it will be assumed that both $G$ and $H$ are nontrivial.
Let $g$ be a generator of $G$. Let $n$ be the smallest positive integer such that $g^n \in H$ (such an $n$ exists because $H$ is nontrivial and closed under inverses). Claim: $H = \langle g^n \rangle$.
Let $a \in \langle g^n \rangle$. Then there exists $z \in \mathbb{Z}$ with $a = (g^n)^z$. Since $g^n \in H$, then $(g^n)^z \in H$. Thus, $a \in H$. Hence, $\langle g^n \rangle \subseteq H$.
Let $h \in H$. Then $h \in G$. Let $x \in \mathbb{Z}$ with $h = g^x$. By the division algorithm, there exist $q, r \in \mathbb{Z}$ with $0 \le r < n$ such that $x = qn + r$. Since $h = g^x = g^{qn+r} = g^{qn} g^r = (g^n)^q g^r$, then $g^r = h (g^n)^{-q}$. Since $h, g^n \in H$, then $g^r \in H$. By choice of $n$, $r$ cannot be positive. Thus, $r = 0$. Therefore, $h = (g^n)^q g^0 = (g^n)^q e_G = (g^n)^q \in \langle g^n \rangle$. Hence, $H \subseteq \langle g^n \rangle$.
Since $\langle g^n \rangle \subseteq H$ and $H \subseteq \langle g^n \rangle$, then $H = \langle g^n \rangle$. It follows that every subgroup of $G$ is cyclic. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
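The argument can be mirrored by an exhaustive search in a small cyclic group. This sketch assumes the additive group $\mathbb{Z}_{12}$, finds every subgroup by brute force, and checks that each one is generated by its smallest positive element, which plays the role of $g^n$ in the proof; all function names are illustrative.

```python
from itertools import combinations

def is_subgroup(n, subset):
    """A finite nonempty subset of Z_n closed under addition is a subgroup."""
    return all((a + b) % n in subset for a in subset for b in subset)

def cyclic_subgroup(n, g):
    """The subgroup of Z_n generated by g."""
    return frozenset((g * k) % n for k in range(n))

def subgroups(n):
    """All subgroups of Z_n, found by testing every subset that contains 0."""
    found = []
    for size in range(n):
        for rest in combinations(range(1, n), size):
            candidate = frozenset((0,) + rest)
            if is_subgroup(n, candidate):
                found.append(candidate)
    return found

def smallest_positive_element(H):
    """Smallest nonzero element of H, or 0 for the trivial subgroup."""
    positives = [h for h in H if h != 0]
    return min(positives) if positives else 0

n = 12
subs = subgroups(n)
all_cyclic = all(
    H == cyclic_subgroup(n, smallest_positive_element(H)) for H in subs
)
```

The search finds six subgroups of $\mathbb{Z}_{12}$, of orders 1, 2, 3, 4, 6 and 12, and each passes the check.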
305.35
regular group action
Let $G$ be a group acting on a set $X$. The action is called regular if for any pair $\alpha, \beta \in X$ there exists exactly one $g \in G$ such that $g \cdot \alpha = \beta$. (For a right group action it is defined correspondingly.) Version: 3 Owner: Thomas Heye Author(s): Thomas Heye
305.36
second isomorphism theorem
Let $(G, *)$ be a group. Let $H$ be a subgroup of $G$ and let $K$ be a normal subgroup of $G$. Then
• $HK := \{h * k \mid h \in H,\ k \in K\}$ is a subgroup of $G$,
• $K$ is a normal subgroup of $HK$,
• $H \cap K$ is a normal subgroup of $H$,
• There is a natural group isomorphism $H/(H \cap K) \cong HK/K$.
The same statement also holds in the category of modules over a ﬁxed ring (where normality is neither needed nor relevant), and indeed can be formulated so as to hold in any abelian category. Version: 4 Owner: djao Author(s): djao
305.37
simple group
Let G be a group. G is said to be simple if the only normal subgroups of G are {1} and G itself. Version: 3 Owner: Evandar Author(s): Evandar
305.38
solvable group
A group G is solvable if it has a composition series G = G0 ⊃ G1 ⊃ · · · ⊃ Gn = {1} where all the quotient groups Gi /Gi+1 are abelian. Version: 4 Owner: djao Author(s): djao
305.39
subgroup
Definition: Let $(G, *)$ be a group and let $K$ be a subset of $G$. Then $K$ is a subgroup of $G$ defined under the same operation if $K$ is a group by itself (with respect to $*$), that is:
• $K$ is closed under the $*$ operation.
• There exists an identity element $e \in K$ such that for all $k \in K$, $k * e = k = e * k$.
• For every $k \in K$ there exists an inverse $k^{-1} \in K$ such that $k^{-1} * k = e = k * k^{-1}$.
The subgroup is denoted likewise $(K, *)$. We denote $K$ being a subgroup of $G$ by writing $K \le G$.
Properties:
• The set $\{e\}$ whose only element is the identity is a subgroup of any group. It is called the trivial subgroup.
• Every group is a subgroup of itself.
• The null set $\{\}$ is never a subgroup (since the definition of group states that the set must be nonempty).
There is a very useful theorem that allows proving a given subset is a subgroup.
Theorem: Let $K$ be a nonempty subset of the group $G$. Then $K$ is a subgroup of $G$ if and only if $s, t \in K$ implies that $st^{-1} \in K$.
Proof: First we need to show that if $K$ is a subgroup of $G$ then $s, t \in K$ implies $st^{-1} \in K$. This holds because $K$ is a group by itself. Now, suppose that for any $s, t \in K \subseteq G$ we have $st^{-1} \in K$. We want to show that $K$ is a subgroup, which we will accomplish by proving that it satisfies the group axioms (associativity is inherited from $G$). Since $K$ is nonempty we may pick some $t \in K$, and $tt^{-1} \in K$ by hypothesis, so we conclude that the identity element is in $K$: $e \in K$. (Existence of identity.) Now that we know $e \in K$, for all $t$ in $K$ we have that $et^{-1} = t^{-1} \in K$, so the inverses of elements in $K$ are also in $K$. (Existence of inverses.) Let $s, t \in K$. Then we know that $t^{-1} \in K$ by the last step. Applying the hypothesis shows that $s(t^{-1})^{-1} = st \in K$, so $K$ is closed under the operation. QED
Example:
• Consider the group $(\mathbb{Z}, +)$. Show that $(2\mathbb{Z}, +)$ is a subgroup.
The subset is closed under addition since the sum of even integers is even.
The identity 0 of $\mathbb{Z}$ is also in $2\mathbb{Z}$ since 2 divides 0. For every $k \in 2\mathbb{Z}$ there is a $-k \in 2\mathbb{Z}$ which is the inverse under addition and satisfies $-k + k = 0 = k + (-k)$. Therefore $(2\mathbb{Z}, +)$ is a subgroup of $(\mathbb{Z}, +)$. Another way to show that $(2\mathbb{Z}, +)$ is a subgroup is by using the theorem stated above. If $s, t \in 2\mathbb{Z}$ then $s, t$ are even numbers and $s - t \in 2\mathbb{Z}$, since the difference of even numbers is always an even number.
See also: • Wikipedia, subgroup
Version: 7 Owner: Daume Author(s): Daume
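The one-step test in the theorem translates directly into a short check. The sketch below assumes the finite group $\mathbb{Z}_{12}$ as a stand-in for $(\mathbb{Z}, +)$, with the even residues playing the role of $2\mathbb{Z}$; the function `passes_subgroup_test` and the lambdas `add` and `neg` are illustrative names.

```python
def passes_subgroup_test(subset, op, inv):
    """One-step test: K is nonempty and s * t^-1 lies in K for all s, t in K."""
    return bool(subset) and all(
        op(s, inv(t)) in subset for s in subset for t in subset
    )

n = 12
add = lambda a, b: (a + b) % n   # the group operation of Z_12
neg = lambda a: (-a) % n         # the inverse in Z_12

evens = {0, 2, 4, 6, 8, 10}
odds = {1, 3, 5, 7, 9, 11}
```

The even residues pass the test, since a difference of even residues modulo 12 is even. The odd residues fail because the difference of two odd residues is even, and the empty set fails by the nonemptiness requirement.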
305.40
third isomorphism theorem
If $G$ is a group (or ring, or module) and $H \subset K$ are normal subgroups (or ideals, or submodules), with $H$ normal (or an ideal, or a submodule) in $K$, then there is a natural isomorphism
$$(G/H)/(K/H) \approx G/K.$$
It is not uncommon to see the second and third isomorphism theorems numbered in the opposite order. Version: 2 Owner: nerdy2 Author(s): nerdy2
Chapter 306 20A99 – Miscellaneous
306.1 Cayley table
A Cayley table for a group is essentially the "multiplication table" of the group (see footnote 1). The columns and rows of the table (or matrix) are labeled with the elements of the group, and each cell holds the result of applying the group operation to the element labeling its row and the element labeling its column. Formally, let G be our group, with group operation ◦. Let C be the Cayley table for the group, with C(i, j) denoting the element at row i and column j. Then C(i, j) = ei ◦ ej where ei is the ith element of the group, and ej is the jth element. Note that for an abelian group, we have ei ◦ ej = ej ◦ ei, hence the Cayley table is a symmetric matrix. Cayley tables of isomorphic groups are the same up to relabeling and reordering of the group elements.
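The definition C(i, j) = ei ◦ ej translates directly into a nested list. The sketch below is only an illustration: the helper names cayley_table and is_symmetric are hypothetical, and Z4 under addition modulo 4 is used as the sample group.

```python
def cayley_table(elements, op):
    """Row i, column j holds elements[i] op elements[j]."""
    return [[op(a, b) for b in elements] for a in elements]

def is_symmetric(table):
    """A Cayley table is symmetric exactly when the listed elements commute pairwise."""
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))

z4 = [0, 1, 2, 3]
table = cayley_table(z4, lambda a, b: (a + b) % 4)
for row in table:
    print(row)
print(is_symmetric(table))  # True, because Z4 is abelian
```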
306.1.1
Examples.
• The Cayley table for Z4, the group of integers modulo 4 (under addition), would be

       [0]  [1]  [2]  [3]
  [0]  [0]  [1]  [2]  [3]
  [1]  [1]  [2]  [3]  [0]
  [2]  [2]  [3]  [0]  [1]
  [3]  [3]  [0]  [1]  [2]

• The Cayley table for S3, the symmetric group on 3 letters, is

         (1)    (123)  (132)  (12)   (13)   (23)
  (1)    (1)    (123)  (132)  (12)   (13)   (23)
  (123)  (123)  (132)  (1)    (13)   (23)   (12)
  (132)  (132)  (1)    (123)  (23)   (12)   (13)
  (12)   (12)   (23)   (13)   (1)    (132)  (123)
  (13)   (13)   (12)   (23)   (123)  (1)    (132)
  (23)   (23)   (13)   (12)   (132)  (123)  (1)

Footnote 1: A caveat to novices in group theory: multiplication is usually used notationally to represent the group operation, but the operation needn't resemble multiplication in the reals. Hence, you should take "multiplication table" with a grain or two of salt.
Version: 6 Owner: akrowne Author(s): akrowne
306.2
proper subgroup
A group H is a proper subgroup of a group G if and only if H is a subgroup of G and H ≠ G. (306.2.1)
Similarly a group H is an improper subgroup of a group G if and only if H is a subgroup of G and H = G. (306.2.2) Version: 2 Owner: imran Author(s): imran
306.3
quaternion group
The quaternion group, or quaternionic group, is a noncommutative group with eight elements. It is traditionally denoted by Q (not to be confused with ℚ, the rational numbers) or by Q8. This group is defined by the presentation ⟨i, j; i^4, i^2 j^2, i j i^{-1} j⟩ or, equivalently, defined by the multiplication table
   ·    1   i   j   k  −i  −j  −k  −1
   1    1   i   j   k  −i  −j  −k  −1
   i    i  −1  −k   j   1   k  −j  −i
   j    j   k  −1  −i  −k   1   i  −j
   k    k  −j   i  −1   j  −i   1  −k
  −i   −i   1   k  −j  −1  −k   j   i
  −j   −j  −k   1   i   k  −1  −i   j
  −k   −k   j  −i   1  −j   i  −1   k
  −1   −1  −i  −j  −k   i   j   k   1
where we have put each product xy into row x and column y. The minus signs are justified by the fact that {1, −1} is a subgroup contained in the center of Q. Every subgroup of Q is normal and, except for the trivial subgroup {1}, contains {1, −1}. The dihedral group D4 (the group of symmetries of a square) is the only other noncommutative group of order 8. Since i^2 = j^2 = k^2 = −1, the elements i, j, and k are known as the imaginary units, by analogy with i ∈ C. Any pair of the imaginary units generates the group. Better, given distinct x, y ∈ {i, j, k}, any element of Q is expressible in the form x^m y^n. Q is identified with the group of units (invertible elements) of the ring of quaternions over Z. That ring is not identical to the group ring Z[Q], which has dimension 8 (not 4) over Z. Likewise the usual quaternion algebra is not quite the same thing as the group algebra R[Q]. Quaternions were known to Gauss in 1819 or 1820, but he did not publicize this discovery, and quaternions weren't rediscovered until 1843, with Hamilton. For an excellent account of this famous story, see http://math.ucr.edu/home/baez/Octonions/node1.html. Version: 6 Owner: vernondalhart Author(s): vernondalhart, Larry Hammick, patrickwonders
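Several of the claims in the quaternion group entry (eight elements, noncommutativity, center {1, −1}, every element of the form x^m y^n) can be checked by brute force. The sketch below is an illustration only. It stores a + bi + cj + dk as the integer tuple (a, b, c, d) and assumes Hamilton's sign rule ij = k; the opposite sign rule gives an isomorphic group, so the checked properties do not depend on that choice. The names qmul, power and neg are hypothetical helpers.

```python
def qmul(x, y):
    """Product of two quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def neg(q):
    return tuple(-t for t in q)

def power(x, n):
    """x multiplied by itself n times (n >= 0)."""
    result = ONE
    for _ in range(n):
        result = qmul(result, x)
    return result

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
Q8 = [ONE, I, J, K, neg(ONE), neg(I), neg(J), neg(K)]

closed = all(qmul(x, y) in Q8 for x in Q8 for y in Q8)
center = {z for z in Q8 if all(qmul(z, x) == qmul(x, z) for x in Q8)}
products = {qmul(power(I, m), power(J, n)) for m in range(4) for n in range(4)}

print(closed)                        # True
print(center == {ONE, neg(ONE)})     # True
print(products == set(Q8))           # True: every element is i^m j^n
print(qmul(I, J) == qmul(J, I))      # False: the group is noncommutative
```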
Chapter 307 20B05 – General theory for ﬁnite groups
307.1 cycle notation
The cycle notation is a useful convention for writing down a permutation in terms of its constituent cycles. Let S be a finite set, and a_1, . . . , a_k, k ≥ 2, distinct elements of S. The expression (a_1, . . . , a_k) denotes the cycle whose action is a_1 → a_2 → a_3 → . . . → a_k → a_1. Note there are k different expressions for the same cycle; the following all represent the same cycle: (a_1, a_2, a_3, . . . , a_k) = (a_2, a_3, . . . , a_k, a_1) = . . . = (a_k, a_1, a_2, . . . , a_{k−1}). Also note that a 1-element cycle is the same thing as the identity permutation, and thus there is not much point in writing down such things. Rather, it is customary to express the identity permutation simply as (). Let π be a permutation of S, and let S_1, . . . , S_k ⊂ S, k ∈ N, be the orbits of π with more than 1 element. For each j = 1, . . . , k let n_j denote the cardinality of S_j. Also, choose an a_{1,j} ∈ S_j, and define a_{i+1,j} = π(a_{i,j}), i ∈ N. We can now express π as a product of disjoint cycles, namely
π = (a_{1,1}, . . . , a_{n_1,1})(a_{1,2}, . . . , a_{n_2,2}) . . . (a_{1,k}, . . . , a_{n_k,k}).
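The construction above (pick a starting point in each orbit and follow π until the orbit closes) can be sketched as follows. This is an illustrative sketch: the function name disjoint_cycles and the sample permutation are assumptions, and the permutation is stored as a dictionary sending each element to its image.

```python
def disjoint_cycles(perm):
    """Return the cycles of length >= 2 of perm, a dict mapping each element to its image."""
    seen = set()
    cycles = []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = perm[start]
        while nxt != start:          # follow a_{i+1} = pi(a_i) until the orbit closes
            cycle.append(nxt)
            seen.add(nxt)
            nxt = perm[nxt]
        if len(cycle) >= 2:          # 1-element cycles are omitted, as in the text
            cycles.append(tuple(cycle))
    return cycles

pi = {1: 3, 2: 4, 3: 5, 4: 2, 5: 1, 6: 6}
print(disjoint_cycles(pi))                    # [(1, 3, 5), (2, 4)]
print(disjoint_cycles({1: 1, 2: 2, 3: 3}))    # [] , i.e. the identity ()
```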
By way of illustration, here are the 24 elements of the symmetric group on {1, 2, 3, 4} expressed using the cycle notation, and grouped according to their conjugacy classes:
()
(12), (13), (14), (23), (24), (34)
(123), (213), (124), (214), (134), (143), (234), (243)
(12)(34), (13)(24), (14)(23)
(1234), (1243), (1324), (1342), (1423), (1432)
Version: 1 Owner: rmilson Author(s): rmilson
307.2
permutation group
A permutation group is a pair (G, X) where G is an abstract group, and X is a set on which G acts faithfully. Alternatively, this can be thought of as a group G equipped with an injective homomorphism into Sym(X), the symmetric group on X. Version: 2 Owner: bwebste Author(s): bwebste
Chapter 308 20B15 – Primitive groups
308.1 primitive transitive permutation group
1: A finite set 2: G transitive permutation group on A 3: ∀B ⊂ A block: B = A or |B| = 1
example
1: S4 is a primitive transitive permutation group on {1, 2, 3, 4}
counterexample
1: D8 is not a primitive transitive permutation group on the vertices of a square
stabilizer maximal necessary and suﬃcient for primitivity
1: A finite set 2: G transitive permutation group on A 3: G primitive ⇔ ∀a ∈ A : H ≤ G & H ⊃ Stab_G(a) ⇒ H = G or H = Stab_G(a)
Note: This was a “seed” entry written using a shorthand format described in this FAQ. Version: 4 Owner: Thomas Heye Author(s): yark, apmxi
Chapter 309 20B20 – Multiply transitive ﬁnite groups
309.1 Jordan’s theorem (multiply transitive groups)
Let G be a sharply n-transitive permutation group, with n ≥ 4. Then
1. G is similar to S_n, S_{n+1} or A_{n+2} with the standard action, or
2. n = 4 and G is similar to M11, the Mathieu group of degree 11, or
3. n = 5 and G is similar to M12, the Mathieu group of degree 12.
Version: 1 Owner: bwebste Author(s): bwebste
309.2
multiply transitive
Let G be a group, X a set on which it acts. Let X^(n) be the set of ordered n-tuples of distinct elements of X. This is a G-set by the diagonal action: g · (x_1, . . . , x_n) = (g · x_1, . . . , g · x_n). The action of G on X is said to be n-transitive if it acts transitively on X^(n). For example, the standard action of S_n, the symmetric group, is n-transitive, and the standard action of A_n, the alternating group, is (n − 2)-transitive. Version: 2 Owner: bwebste Author(s): bwebste
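For a small group the definition of n-transitivity can be tested directly: the action on X^(n) is transitive exactly when the orbit of a single n-tuple is all of X^(n). The sketch below is illustrative only; the helper names is_even and is_n_transitive are hypothetical, permutations of {0, 1, 2, 3} are stored as tuples of images, and S4 and A4 are used as the sample groups.

```python
from itertools import permutations

def is_even(p):
    """True if the permutation p (tuple of images of 0..n-1) has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def is_n_transitive(group, points, n):
    """Check whether the orbit of one ordered n-tuple of distinct points is all of X^(n)."""
    all_tuples = set(permutations(points, n))
    base = tuple(points[:n])
    orbit = {tuple(g[x] for x in base) for g in group}   # diagonal action on the base tuple
    return orbit == all_tuples

points = list(range(4))
S4 = list(permutations(points))
A4 = [g for g in S4 if is_even(g)]

print(is_n_transitive(S4, points, 4))  # True:  S_n is n-transitive
print(is_n_transitive(A4, points, 2))  # True:  A_n is (n - 2)-transitive
print(is_n_transitive(A4, points, 3))  # False: 24 triples but only 12 group elements
```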
309.3
sharply multiply transitive
Let G be a group, and X a set that G acts on, and let X^(n) be the set of ordered n-tuples of distinct elements of X. Then the action of G on X is sharply n-transitive if G acts regularly on X^(n). Version: 1 Owner: bwebste Author(s): bwebste
Chapter 310 20B25 – Finite automorphism groups of algebraic, geometric, or combinatorial structures
310.1 diamond theory
Diamond theory is the theory of affine groups over GF(2) acting on small square and cubic arrays. In the simplest case, the symmetric group of degree 4 acts on a two-colored Diamond figure like that in Plato's Meno dialogue, yielding 24 distinct patterns, each of which has some ordinary or color-interchange symmetry. This can be generalized to (at least) a group of order approximately 1.3 trillion acting on a 4x4x4 array of cubes, with each of the resulting patterns still having nontrivial symmetry. The theory has applications to finite geometry and to the construction of the large Witt design underlying the Mathieu group of degree 24.
Further Reading • "Diamond Theory," http://m759.freeservers.com/ Version: 4 Owner: m759 Author(s): akrowne, m759
Chapter 311 20B30 – Symmetric groups
311.1 symmetric group
Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions on X). Then the act of taking the composition of two permutations induces a group structure on S(X). We call this group the symmetric group and it is often denoted Sym(X). Version: 5 Owner: bwebste Author(s): bwebste, antizeus
311.2
symmetric group
Let X be a set. Let S(X) be the set of permutations of X (i.e. the set of bijective functions on X). Then the act of taking the composition of two permutations induces a group structure on S(X). We call this group the symmetric group and it is often denoted Sym(X). When X has a ﬁnite number n of elements, we often refer to the symmetric group as Sn , and describe the elements by using cycle notation. Version: 2 Owner: antizeus Author(s): antizeus
Chapter 312 20B35 – Subgroups of symmetric groups
312.1 Cayley’s theorem
Let G be a group. Then G is isomorphic to a subgroup of the permutation group S_G. If G is finite and of order n, then G is isomorphic to a subgroup of the permutation group S_n. Furthermore, suppose H is a proper subgroup of G. Let X = {Hg | g ∈ G} be the set of right cosets of H in G. The map θ : G → S_X given by θ(x)(Hg) = Hgx is a homomorphism. The kernel is the largest normal subgroup of G contained in H. We note that |S_X| = [G : H]!. Consequently, if |G| doesn't divide [G : H]! then θ is not injective, so H contains a nontrivial normal subgroup of G, namely the kernel of θ. Version: 4 Owner: vitriol Author(s): vitriol
Chapter 313 20B99 – Miscellaneous
313.1 (p, q) shuﬄe
Deﬁnition. Let p and q be positive natural numbers. Further, let S(k) be the set of permutations of the numbers {1, . . . , k}. A permutation τ ∈ S(p + q) is a (p, q) shuﬄe if τ (1) < · · · < τ (p), τ (p + 1) < · · · < τ (p + q).
The set of all (p, q) shuﬄes is denoted by S(p, q).
It is clear that S(p, q) ⊂ S(p + q). Since a (p, q) shuffle is completely determined by how the first p elements are mapped, the cardinality of S(p, q) is the binomial coefficient \binom{p+q}{p}. The wedge product of a p-form and a q-form can be defined as a sum over (p, q) shuffles. Version: 3 Owner: matte Author(s): matte
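The count of (p, q) shuffles can be confirmed by brute-force enumeration for small p and q. The sketch below is an illustration: the function name shuffles is hypothetical, and a permutation τ is stored as the tuple (τ(1), . . . , τ(p + q)).

```python
from itertools import permutations
from math import comb

def shuffles(p, q):
    """All tau in S(p + q) with tau(1) < ... < tau(p) and tau(p+1) < ... < tau(p+q)."""
    result = []
    for tau in permutations(range(1, p + q + 1)):
        first, second = tau[:p], tau[p:]
        if list(first) == sorted(first) and list(second) == sorted(second):
            result.append(tau)
    return result

print(len(shuffles(2, 2)), comb(4, 2))   # 6 6
print(len(shuffles(2, 3)), comb(5, 2))   # 10 10
print(shuffles(1, 2))                    # [(1, 2, 3), (2, 1, 3), (3, 1, 2)]
```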
313.2
Frobenius group
A permutation group G on a set X is Frobenius if no nontrivial element of G fixes more than one element of X. Generally, one also makes the restriction that at least one nontrivial element fix a point. In this case the Frobenius group is called nonregular. The stabilizer of any point in X is called a Frobenius complement, and has the remarkable property that it intersects trivially with any conjugate of itself by an element not in the subgroup. Conversely, if any finite group G has such a subgroup, then the action on cosets of that subgroup makes G into a Frobenius group. Version: 2 Owner: bwebste Author(s): bwebste
313.3
permutation
A permutation of a set {a_1, a_2, . . . , a_n} is an arrangement of its elements. For example, if S = {A, B, C} then ABC, CAB, CBA are three different permutations of S. The number of permutations of a set with n elements is n!. A permutation can also be seen as a bijective function of a set into itself. For example, the permutation CAB could be seen as a function that assigns: f(A) = C, f(B) = A, f(C) = B.
In fact, every bijection of a set into itself gives a permutation, and any permutation gives rise to a bijective function. Therefore, we can say that there are n! bijective functions from a set with n elements into itself. Using the function approach, it can be proved that any permutation can be expressed as a composition of disjoint cycles and also as a composition of (not necessarily disjoint) transpositions. Moreover, if σ = τ_1 τ_2 · · · τ_m = ρ_1 ρ_2 · · · ρ_n are two factorizations of a permutation σ into transpositions, then m and n must be both even or both odd. So we can label permutations as even or odd depending on the number of transpositions in any decomposition. Permutations (as functions) of a set with n elements form a group with function composition as binary operation, called the symmetric group on n letters; it is nonabelian for n ≥ 3. The subset of even permutations is a subgroup called the alternating group on n letters. Version: 3 Owner: drini Author(s): drini
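The even/odd labelling can be computed from the cycle decomposition, since a cycle of length L is a product of L − 1 transpositions. The sketch below is illustrative only: the names parity and compose are hypothetical, and a permutation of {0, . . . , n − 1} is stored as the tuple of its images.

```python
from itertools import permutations

def parity(perm):
    """Return 0 for an even permutation and 1 for an odd one."""
    seen = [False] * len(perm)
    transpositions = 0
    for start in range(len(perm)):
        length = 0
        x = start
        while not seen[x]:           # walk the cycle containing start
            seen[x] = True
            x = perm[x]
            length += 1
        if length > 0:
            transpositions += length - 1
    return transpositions % 2

def compose(p, q):
    """(p o q)(x) = p(q(x))."""
    return tuple(p[q[x]] for x in range(len(q)))

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if parity(p) == 0]
print(len(S4), len(A4))   # 24 12

# Parity is additive under composition, which is why the even permutations form a subgroup.
print(all(parity(compose(p, q)) == (parity(p) + parity(q)) % 2 for p in S4 for q in S4))  # True
```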
313.4
proof of Cayley’s theorem
Let G be a group, and let S_G be the permutation group of the underlying set G. For each g ∈ G, define ρ_g : G → G by ρ_g(h) = gh. Then ρ_g is invertible with inverse ρ_{g^{-1}}, and so is a permutation of the set G. Define Φ : G → S_G by Φ(g) = ρ_g. Then Φ is a homomorphism, since (Φ(gh))(x) = ρ_{gh}(x) = ghx = ρ_g(hx) = (ρ_g ◦ ρ_h)(x) = ((Φ(g))(Φ(h)))(x). And Φ is injective, since if Φ(g) = Φ(h) then ρ_g = ρ_h, so gx = hx for all x ∈ G, and so g = h as required.
So Φ is an embedding of G into its own permutation group. If G is finite of order n, then simply numbering the elements of G gives an embedding from G to S_n. Version: 2 Owner: Evandar Author(s): Evandar
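The embedding Φ(g) = ρ_g can be made concrete for a small group by numbering its elements. The sketch below is an illustration under stated assumptions: it takes G = S3 (stored as tuples of images of 0, 1, 2), numbers its six elements 0 to 5, and records each left translation ρ_g as a permutation of those indices. The names compose, index, left_translation and Phi are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(x) = p(q(x)) for permutations stored as tuples of images."""
    return tuple(p[q[x]] for x in range(len(q)))

G = list(permutations(range(3)))            # the six elements of S3
index = {g: i for i, g in enumerate(G)}     # numbering of the elements of G

def left_translation(g):
    """rho_g as a permutation of the indices 0..5: i -> index of g * G[i]."""
    return tuple(index[compose(g, h)] for h in G)

Phi = {g: left_translation(g) for g in G}

is_permutation = all(sorted(Phi[g]) == list(range(len(G))) for g in G)
is_homomorphism = all(Phi[compose(g, h)] == compose(Phi[g], Phi[h]) for g in G for h in G)
is_injective = len(set(Phi.values())) == len(G)
print(is_permutation, is_homomorphism, is_injective)   # True True True
```

Here S3, a group of order 6, is realized as a subgroup of S_6, as the last sentence of the proof describes.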
Chapter 314 20C05 – Group rings of ﬁnite groups and their modules
314.1 group ring
For any group G, the group ring Z[G] is defined to be the ring whose additive group is the abelian group of formal integer linear combinations of elements of G, and whose multiplication operation is defined by multiplication in G, extended Z–linearly to Z[G]. More generally, for any ring R, the group ring of G over R is the ring R[G] whose additive group is the abelian group of formal R–linear combinations of elements of G, i.e.:
\[ R[G] := \left\{ \sum_{i=1}^{n} r_i g_i \;\middle|\; r_i \in R,\ g_i \in G \right\}, \]
and whose multiplication operation is defined by R–linearly extending the group multiplication operation of G. In the case where K is a field, the group ring K[G] is usually called a group algebra. Version: 4 Owner: djao Author(s): djao
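Linearly extending the group multiplication amounts to a convolution of coefficients. The sketch below is only an illustration: an element of Z[G] is stored as a dictionary from group elements to nonzero integer coefficients, the name ring_mul is hypothetical, and the sample group is the cyclic group C3 written additively as {0, 1, 2}, so that the generator x corresponds to 1.

```python
from collections import defaultdict

def ring_mul(a, b, op):
    """Product in Z[G]: (sum r_g g)(sum s_h h) = sum r_g s_h (g op h)."""
    out = defaultdict(int)
    for g, r in a.items():
        for h, s in b.items():
            out[op(g, h)] += r * s
    return {g: c for g, c in out.items() if c != 0}   # drop zero coefficients

c3 = lambda g, h: (g + h) % 3            # group law of C3

one_minus_x = {0: 1, 1: -1}              # 1 - x
norm = {0: 1, 1: 1, 2: 1}                # 1 + x + x^2

print(ring_mul(one_minus_x, norm, c3))         # {} : the product is 0, so Z[C3] has zero divisors
print(ring_mul(one_minus_x, one_minus_x, c3))  # {0: 1, 1: -2, 2: 1}, i.e. 1 - 2x + x^2
```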
Chapter 315 20C15 – Ordinary representations and characters
315.1 Maschke’s theorem
Let G be a finite group, and k a field of characteristic not dividing |G|. Then any representation V of G over k is completely reducible. We need only show that any subrepresentation has a complement, and the result follows by induction.
Let V be a representation of G and W a subrepresentation. Let π : V → W be an arbitrary projection, and let
\[ \pi'(v) = \frac{1}{|G|} \sum_{g \in G} g^{-1} \pi(g v). \]
This map is obviously G-equivariant, and is the identity on W, and its image is contained in W, since W is invariant under G. Thus it is an equivariant projection to W, and its kernel is a complement to W. Version: 5 Owner: bwebste Author(s): bwebste
315.2
a representation which is not completely reducible
If G is a finite group, and k is a field whose characteristic does divide the order of the group, then Maschke's theorem fails. For example let V be the regular representation of G, which can be thought of as functions from G to k, with the G action (g · ϕ)(g′) = ϕ(g^{-1} g′). Then this representation is not completely reducible.
There is an obvious trivial subrepresentation W of V, consisting of the constant functions. I claim that there is no complementary invariant subspace to this one. If W′ is such a subspace, then there is a homomorphism ϕ : V → V/W′ ≅ k. Now consider the characteristic function of the identity e ∈ G,
\[ \delta_e(g) = \begin{cases} 1 & g = e \\ 0 & g \neq e \end{cases} \]
and ℓ = ϕ(δ_e) in V/W′. This is not zero since δ_e generates the representation V. By G-equivariance, ϕ(δ_g) = ℓ for all g ∈ G. Since
\[ \eta = \sum_{g \in G} \eta(g)\, \delta_g \]
for all η ∈ V,
\[ \varphi(\eta) = \ell \sum_{g \in G} \eta(g). \]
Thus,
\[ W' = \ker \varphi = \Big\{ \eta \in V \;\Big|\; \sum_{g \in G} \eta(g) = 0 \Big\}. \]
But since the characteristic of the field k divides the order of G, W ≤ W′, and thus W′ could not possibly be complementary to it.
For example, if G = C2 = {e, f} then the invariant subspace of V is spanned by e + f. For characteristics other than 2, e − f spans a complementary subspace, but over characteristic 2, these elements are the same. Version: 1 Owner: bwebste Author(s): bwebste
315.3
orthogonality relations
First orthogonality relations: Let χ_1, χ_2 be characters of representations V_1, V_2 of a finite group G over a field k of characteristic 0. Then
\[ (\chi_1, \chi_2) = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_1(g)}\, \chi_2(g) = \dim(\mathrm{Hom}_G(V_1, V_2)). \]
First of all, consider the special case where V_1 = k with the trivial action of the group. Then Hom_G(k, V_2) ≅ V_2^G, the fixed points. On the other hand, consider the map
\[ \varphi = \frac{1}{|G|} \sum_{g \in G} g : V_2 \to V_2 \]
(with the sum in End(V_2)). Clearly, the image of this map is contained in V_2^G, and it is the identity restricted to V_2^G. Thus, it is a projection with image V_2^G. Now, the rank of a projection (over a field of characteristic 0) is its trace. Thus,
\[ \dim_k \mathrm{Hom}_G(k, V_2) = \dim V_2^G = \mathrm{tr}(\varphi) = \frac{1}{|G|} \sum_{g \in G} \chi_2(g), \]
which is exactly the orthogonality formula for V_1 = k. Now, in general, Hom(V_1, V_2) ≅ V_1^* ⊗ V_2 is a representation, and Hom_G(V_1, V_2) = (Hom(V_1, V_2))^G. Since χ_{V_1^* ⊗ V_2} = \overline{χ_1} χ_2,
\[ \dim_k \mathrm{Hom}_G(V_1, V_2) = \dim_k (\mathrm{Hom}(V_1, V_2))^G = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_1(g)}\, \chi_2(g), \]
which is exactly the relation we desired. In particular, if V_1, V_2 are irreducible, by Schur's lemma
\[ \mathrm{Hom}_G(V_1, V_2) = \begin{cases} D & V_1 \cong V_2 \\ 0 & V_1 \not\cong V_2 \end{cases} \]
where D is a division algebra. In particular, nonisomorphic irreducible representations have orthogonal characters. Thus, for any representation V, the multiplicities n_i in the unique decomposition of V into the direct sum of irreducibles
\[ V \cong V_1^{\oplus n_1} \oplus \cdots \oplus V_m^{\oplus n_m}, \]
where V_i ranges over irreducible representations of G over k, can be determined in terms of the character inner product:
\[ n_i = \frac{(\psi, \chi_i)}{(\chi_i, \chi_i)} \]
where ψ is the character of V and χ_i the character of V_i. In particular, representations over a field of characteristic zero are determined by their character. Note: This is not true over fields of positive characteristic. If the field k is algebraically closed, the only finite-dimensional division algebra over k is k itself, so the characters of irreducible representations form an orthonormal basis for the vector space of class functions with respect to this inner product. Since (χ_i, χ_i) = 1 for all irreducibles, the multiplicity formula above reduces to n_i = (ψ, χ_i).
Second orthogonality relations: We assume now that k is algebraically closed. Let g, g′ be elements of a finite group G. Then
\[ \sum_{\chi} \overline{\chi(g)}\, \chi(g') = \begin{cases} |C_G(g)| & g \sim g' \\ 0 & g \not\sim g' \end{cases} \]
where the sum is over the characters of irreducible representations, and C_G(g) is the centralizer of g.
Let χ_1, . . . , χ_n be the characters of the irreducible representations, and let g_1, . . . , g_n be representatives of the conjugacy classes.
Let A be the matrix whose ijth entry is \sqrt{|G : C_G(g_j)|}\, χ_i(g_j). By first orthogonality, AA^* = |G| I (here ∗ denotes conjugate transpose), where I is the identity matrix. Since left inverses are right inverses, A^*A = |G| I. Thus,
\[ \sqrt{|G : C_G(g_i)|\,|G : C_G(g_k)|}\; \sum_{j=1}^{n} \overline{\chi_j(g_i)}\, \chi_j(g_k) = |G|\, \delta_{ik}. \]
Replacing g_i or g_k with any conjugate will not change the expression above. Thus, if our two elements are not conjugate, we obtain that \sum_{\chi} \overline{\chi(g)} \chi(g') = 0. On the other hand, if g ∼ g′, then i = k in the sum above, which reduces to the expression we desired. A special case of this result, applied to g = g′ = 1, is that |G| = \sum_{\chi} \chi(1)^2, that is, the sum of the squares of the dimensions of the irreducible representations of any finite group is the order of the group. Version: 8 Owner: bwebste Author(s): bwebste
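Both orthogonality relations can be checked numerically on a small character table. The sketch below is an illustration, not part of the original entry: it uses the well-known character table of S3 over the complex numbers (trivial, sign and 2-dimensional standard characters on the classes of the identity, the transpositions and the 3-cycles), and because all values are real the complex conjugation is omitted. The names inner and column_sum are hypothetical.

```python
from fractions import Fraction

class_sizes = [1, 3, 2]            # identity, transpositions, 3-cycles
order = sum(class_sizes)           # |S3| = 6
characters = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def inner(chi1, chi2):
    """(chi1, chi2) = (1/|G|) * sum over g of chi1(g) chi2(g), grouped by conjugacy class."""
    return Fraction(sum(s * a * b for s, a, b in zip(class_sizes, chi1, chi2)), order)

def column_sum(c1, c2):
    """Sum over irreducible characters of chi(g) chi(g') for g in class c1, g' in class c2."""
    return sum(chi[c1] * chi[c2] for chi in characters.values())

# First orthogonality: the irreducible characters are orthonormal.
for name1, chi1 in characters.items():
    print(name1, [int(inner(chi1, chi2)) for chi2 in characters.values()])

# Second orthogonality: column sums give |C_G(g)| = |G| / (class size) on the diagonal.
print([[column_sum(c1, c2) for c2 in range(3)] for c1 in range(3)])
# [[6, 0, 0], [0, 2, 0], [0, 0, 3]]
```

The top-left entry 6 = 1^2 + 1^2 + 2^2 is the special case that the squares of the dimensions sum to the group order.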
Chapter 316 20C30 – Representations of ﬁnite symmetric groups
316.1 example of immanent
If χ = 1 we obtain the permanent. If χ = sgn we obtain the determinant. Version: 1 Owner: gholmes74 Author(s): gholmes74
316.2
immanent
Let χ : S_n → C be a complex character. For any n × n matrix A define
\[ \mathrm{Imm}_\chi(A) = \sum_{\sigma \in S_n} \chi(\sigma) \prod_{j=1}^{n} A(j, \sigma j). \]
Functions obtained in this way are called immanents. Version: 4 Owner: gholmes74 Author(s): gholmes74
316.3
permanent
The permanent of an n × n matrix A over C is the number
\[ \mathrm{per}(A) = \sum_{\sigma \in S_n} \prod_{j=1}^{n} A(j, \sigma j). \]
Version: 2 Owner: gholmes74 Author(s): gholmes74
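The immanent formula, with the permanent (χ = 1) and the determinant (χ = sgn) as special cases, can be evaluated directly for small matrices by summing over all permutations. The sketch below is illustrative only; the names sign and immanent are hypothetical, indices run from 0 rather than 1, and the sum has n! terms, so it is only practical for small n.

```python
from itertools import permutations
from math import prod

def sign(sigma):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return -1 if inversions % 2 else 1

def immanent(A, chi):
    """Sum over sigma in S_n of chi(sigma) * prod_j A[j][sigma(j)]."""
    n = len(A)
    return sum(chi(sigma) * prod(A[j][sigma[j]] for j in range(n))
               for sigma in permutations(range(n)))

A = [[1, 2],
     [3, 4]]
print(immanent(A, lambda sigma: 1))   # 10 : the permanent, 1*4 + 2*3
print(immanent(A, sign))              # -2 : the determinant, 1*4 - 2*3
```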
Chapter 317 20C99 – Miscellaneous
317.1 Frobenius reciprocity
Let V be a finite-dimensional representation of a finite group G, and let W be a representation of a subgroup H ⊂ G. Then the characters of V and W satisfy the inner product relation (χ_{Ind(W)}, χ_V) = (χ_W, χ_{Res(V)}) where Ind and Res denote the induced representation Ind_H^G and the restriction representation Res_H^G. The Frobenius reciprocity theorem is often given in the stronger form which states that Res and Ind are adjoint functors between the category of G–modules and the category of H–modules: Hom_H(W, Res(V)) = Hom_G(Ind(W), V), or, equivalently V ⊗ Ind(W) = Ind(Res(V) ⊗ W). Version: 4 Owner: djao Author(s): rmilson, djao
317.2
Schur’s lemma
Schur's lemma in representation theory is an almost trivial observation for irreducible modules, but deserves respect because of its profound applications and implications. Lemma 5 (Schur's lemma). Let G be a finite group represented on irreducible G-modules V and W. Any G-module homomorphism f : V → W is either invertible or the zero map.
The only insight here is that both ker f and im f are G-submodules of V and W respectively. This is routine. However, because V is irreducible, ker f is either trivial or all of V. In the former case, im f is nonzero and hence all of W, because W is irreducible, so f is invertible. In the latter case, f is the zero map.
The following corollary is a very useful form of Schur’s lemma, in case that our representations are over an algebraically closed ﬁeld. Corollary 1. If G is represented over an algebraically closed ﬁeld F on irreducible Gmodules V and W , then any Gmodule homomorphism f : V → W is a scalar.
The insight in this case is to consider the modules V and W as vector spaces over F. Notice then that the homomorphism f is a linear transformation and therefore has an eigenvalue λ in our algebraically closed F. Hence, f − λ1 is not invertible. By Schur's lemma, f − λ1 = 0. In other words, f = λ, a scalar.
Version: 14 Owner: rmilson Author(s): rmilson, NeuRet
317.3
character
Let ρ : G −→ GL(V) be a finite-dimensional representation of a group G (i.e., V is a finite-dimensional vector space over its scalar field K). The character of ρ is the function χ_V : G −→ K defined by χ_V(g) := Tr(ρ(g)) where Tr is the trace function. Properties:
• χ_V(g) = χ_V(h) if g is conjugate to h in G. (Equivalently, a character is a class function on G.)
• If G is finite, the characters of the irreducible representations of G over the complex numbers form a basis of the vector space of all class functions on G (with pointwise addition and scalar multiplication).
• Over the complex numbers, the characters of the irreducible representations of G are orthonormal under the inner product
\[ (\chi_1, \chi_2) := \frac{1}{|G|} \sum_{g \in G} \overline{\chi_1(g)}\, \chi_2(g). \]
Version: 4 Owner: djao Author(s): djao
317.4
group representation
Let G be a group, and let V be a vector space. A representation of G in V is a group homomorphism ρ : G −→ GL(V ) from G to the general linear group GL(V ) of invertible linear transformations of V . Equivalently, a representation of G is a vector space V which is a (left) module over the group ring Z[G]. The equivalence is achieved by assigning to each homomorphism ρ : G −→ GL(V ) the module structure whose scalar multiplication is deﬁned by g · v := (ρ(g))(v), and extending linearly.
Special kinds of representations (preserving all notation from above). A representation is faithful if either of the following equivalent conditions is satisfied:
• ρ : G −→ GL(V) is injective
• V is a faithful left Z[G]–module
A subrepresentation of V is a subspace W of V which is a left Z[G]–submodule of V; or, equivalently, a subspace W of V with the property that (ρ(g))(w) ∈ W for all g ∈ G and all w ∈ W. A representation V is called irreducible if it has no subrepresentations other than itself and the zero module. Version: 2 Owner: djao Author(s): djao
317.5
induced representation
Let G be a group, H ⊂ G a subgroup, and V a representation of H, considered as a Z[H]–module. The induced representation of V on G, denoted Ind_H^G(V), is the Z[G]–module whose underlying vector space is the direct sum
\[ \bigoplus_{\sigma \in G/H} \sigma V \]
of formal translates of V by left cosets σ in G/H, and whose multiplication operation is defined by choosing a set {g_σ}_{σ∈G/H} of coset representatives and setting g(σv) := τ(hv)
where τ is the unique left coset of G/H containing g · g_σ (i.e., such that g · g_σ = g_τ · h for some h ∈ H). One easily verifies that the representation Ind_H^G(V) is independent of the choice of coset representatives {g_σ}. Version: 1 Owner: djao Author(s): djao
317.6
regular representation
Given a group G, the regular representation of G over a field K is the representation ρ : G −→ GL(K^G) whose underlying vector space K^G is the K–vector space of formal linear combinations of elements of G, defined by
\[ \rho(g)\left( \sum_{i=1}^{n} k_i g_i \right) := \sum_{i=1}^{n} k_i (g g_i) \]
for k_i ∈ K, g, g_i ∈ G. Equivalently, the regular representation is the induced representation on G of the trivial representation on the subgroup {1} of G. Version: 2 Owner: djao Author(s): djao
317.7
restriction representation
Let ρ : G −→ GL(V) be a representation of a group G. The restriction representation of ρ to a subgroup H of G, denoted Res_H^G(V), is the representation ρ|_H : H −→ GL(V) obtained by restricting the function ρ to the subset H ⊂ G. Version: 1 Owner: djao Author(s): djao
Chapter 318 20D05 – Classiﬁcation of simple and nonsolvable groups
318.1 Burnside p − q theorem
If a finite group G is not solvable, the order of G is divisible by at least 3 distinct primes. Alternatively, any group whose order is divisible by only two distinct primes is solvable (these two distinct primes are the p and q of the title). Version: 2 Owner: bwebste Author(s): bwebste
318.2
classiﬁcation of semisimple groups
For every semisimple group G there is a normal subgroup H of G (called the centerless completely reducible radical) which is isomorphic to a direct product of nonabelian simple groups such that conjugation on H gives an injection into Aut(H). Thus G is isomorphic to a subgroup of Aut(H) containing the inner automorphisms, and for every group H isomorphic to a direct product of nonabelian simple groups, every such subgroup is semisimple. Version: 1 Owner: bwebste Author(s): bwebste
318.3
semisimple group
A group G is called semisimple if it has no nontrivial normal solvable subgroups. Every group is an extension of a semisimple group by a solvable one.
Version: 1 Owner: bwebste Author(s): bwebste
Chapter 319 20D08 – Simple groups: sporadic groups
319.1 Janko groups
The Janko groups denoted by J1, J2, J3, and J4 are four of the 26 sporadic groups. They were discovered by Z. Janko in 1966 and published in the article "A new finite simple group with abelian Sylow subgroups and its characterization." (Journal of Algebra, 1966, 32: 147-186). Each of these groups has very intricate matrix representations as maps into large general linear groups. For example, the matrix K corresponding to J4 gives a representation of J4 in GL_112(2). Version: 7 Owner: mathcam Author(s): mathcam, Thomas Heye
Chapter 320 20D10 – Solvable groups, theory of formations, Schunck classes, Fitting classes, πlength, ranks
320.1 Čuhinin's Theorem
Let G be a finite, π-separable group, for some set π of primes. Then if H is a maximal π-subgroup of G, the index of H in G, |G : H|, is coprime to all elements of π and all such subgroups are conjugate. Such a subgroup is called a Hall π-subgroup. For π = {p}, this essentially reduces to the Sylow theorems (with unnecessary hypotheses). If G is solvable, it is π-separable for all π, so such subgroups exist for all π. This result is often called Hall's theorem. Version: 4 Owner: bwebste Author(s): bwebste
320.2
separable
Let π be a set of primes. A finite group G is called π-separable if there exists a composition series {1} = G_0 ◁ · · · ◁ G_n = G such that each G_{i+1}/G_i is a π-group or a π′-group. π-separability can be thought of as a generalization of solvability; a group is π-separable for all sets of primes if and only if it is solvable. Version: 3 Owner: bwebste Author(s): bwebste
320.3
supersolvable group
A group G is supersolvable if it has a finite normal series G = G_0 ⊵ G_1 ⊵ · · · ⊵ G_n = 1 with the property that each factor group G_{i−1}/G_i is cyclic. A supersolvable group is solvable. Finitely generated nilpotent groups are supersolvable. Version: 1 Owner: mclase Author(s): mclase
Chapter 321 20D15 – Nilpotent groups, pgroups
321.1 Burnside basis theorem
If G is a p-group, then Frat G = G′G^p, where Frat G is the Frattini subgroup, G′ the commutator subgroup, and G^p is the subgroup generated by pth powers. Version: 1 Owner: bwebste Author(s): bwebste
Chapter 322 20D20 – Sylow subgroups, Sylow properties, πgroups, πstructure
322.1 πgroups and π groups
Let π be a set of primes. A finite group G is called a π-group if all the primes dividing |G| are elements of π, and a π′-group if none of them are. Typically, if π is a singleton π = {p}, we write p-group and p′-group for these. Version: 2 Owner: bwebste Author(s): bwebste
322.2
psubgroup
Let G be a finite group with order n, and let p be a prime integer. We can write n = p^k m for some integers k, m such that p and m are coprime (that is, p^k is the highest power of p that divides n). Any subgroup of G whose order is p^k is called a Sylow p-subgroup or simply p-subgroup (some authors use the term p-subgroup for any subgroup whose order is a power of p). While there is no a priori reason for such subgroups to exist in every finite group, the fact is that every finite group has one for every prime p that divides |G|. This statement is the first Sylow theorem. When |G| = p^k we simply say that G is a p-group. Version: 2 Owner: drini Author(s): drini, apmxi
322.3
Burnside normal complement theorem
Let G be a finite group, and S a Sylow subgroup such that C_G(S) = N_G(S). Then S has a normal complement. That is, there exists a normal subgroup N ◁ G such that S ∩ N = {1} and SN = G. Version: 1 Owner: bwebste Author(s): bwebste
322.4
Frattini argument
If H is a normal subgroup of a finite group G, and S is a Sylow subgroup of H, then G = H·N_G(S), where N_G(S) is the normalizer of S in G. Version: 1 Owner: bwebste Author(s): bwebste
322.5
Sylow psubgroup
If (G, ∗) is a group then any subgroup of order p^a for any integer a is called a p-subgroup. If |G| = p^a m, where p ∤ m, then any subgroup S of G with |S| = p^a is a Sylow p-subgroup. We use Syl_p(G) for the set of Sylow p-subgroups of G. Version: 3 Owner: Henry Author(s): Henry
322.6
Sylow theorems
Let G be a finite group whose order is divisible by the prime p. Suppose p^m is the highest power of p which is a factor of |G| and set k = |G|/p^m.
• The group G contains at least one subgroup of order p^m.
• Any two subgroups of G of order p^m are conjugate.
• The number of subgroups of G of order p^m is congruent to 1 modulo p and is a factor of k.
Version: 1 Owner: vitriol Author(s): vitriol
322.7
Sylow’s ﬁrst theorem
existence of subgroups of primepower order
1: G finite group 2: p prime 3: p^k divides |G| 4: ∃H ≤ G : |H| = p^k
Note: This is a “seed” entry written using a shorthand format described in this FAQ. Version: 2 Owner: bwebste Author(s): yark, apmxi
322.8
Sylow’s third theorem
Let G be a finite group, and let n be the number of Sylow p-subgroups of G. Then n ≡ 1 (mod p), and any two Sylow p-subgroups of G are conjugate to one another. Version: 8 Owner: bwebste Author(s): yark, apmxi
322.9
application of Sylow’s theorems to groups of order pq
We can use Sylow's theorems to examine a group G of order pq, where p and q are primes and p < q. Let n_q denote the number of Sylow q-subgroups of G. Then Sylow's theorems tell us that n_q is of the form 1 + kq for some integer k and n_q divides pq. But p and q are prime and p < q, so this implies that n_q = 1. So there is exactly one Sylow q-subgroup, which is therefore normal (indeed, characteristic) in G. Denoting the Sylow q-subgroup by Q, and letting P be a Sylow p-subgroup, then Q ∩ P = {1} and QP = G, so G is a semidirect product of Q and P. In particular, if there is only one Sylow p-subgroup, then G is a direct product of Q and P, and is therefore cyclic. Version: 9 Owner: yark Author(s): yark, Manoj, Henry
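The counting argument above can be automated: the number of Sylow p-subgroups must be congruent to 1 modulo p and must divide the part of the group order that is prime to p. The sketch below is an illustration; the function name sylow_count_candidates is hypothetical, and it only lists the values permitted by these two arithmetic conditions, not the values that actually occur for a particular group.

```python
def sylow_count_candidates(order, p):
    """Divisors d of order / p^a (the p'-part of order) with d congruent to 1 modulo p."""
    m = order
    while m % p == 0:      # strip the full power of p from the order
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

print(sylow_count_candidates(15, 5))   # [1]
print(sylow_count_candidates(15, 3))   # [1]      so every group of order 15 is cyclic
print(sylow_count_candidates(6, 3))    # [1]
print(sylow_count_candidates(6, 2))    # [1, 3]   S3 realizes n_2 = 3
print(sylow_count_candidates(21, 3))   # [1, 7]
```

For order pq with p < q the list for the larger prime q is always [1], matching the conclusion that the Sylow q-subgroup is normal.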
322.10
pprimary component
Definition 27. Let G be a finite abelian group and let p ∈ N be a prime. The p-primary component of G, Π_p, is the subgroup of all elements whose order is a power of p. Note: The p-primary component of an abelian group G coincides with the unique Sylow p-subgroup of G. Version: 2 Owner: alozano Author(s): alozano
322.11
proof of Frattini argument
Let $g \in G$ be any element. Since $H$ is normal, $gSg^{-1} \subset H$. Since $S$ is a Sylow subgroup of $H$, $gSg^{-1} = hSh^{-1}$ for some $h \in H$, by Sylow's theorems. Thus $n = h^{-1}g$ normalizes $S$, and so $g = hn$ for $h \in H$ and $n \in N_G(S)$. Version: 1 Owner: bwebste Author(s): bwebste
322.12
proof of Sylow theorems
We let $G$ be a group of order $p^m k$ where $p \nmid k$ and prove Sylow's theorems. First, a fact which will be used several times in the proof:
Proposition 8. If $p$ divides $|G|$ and $p$ divides the size of every conjugacy class outside the center, then $p$ divides the order of the center.
Proof: This follows from the class equation:
$|G| = |Z(G)| + \sum_{[a],\ a \notin Z(G)} |[a]|$
where the sum runs over the distinct conjugacy classes outside the center. If $p$ divides the left hand side, and divides all but one entry on the right hand side, it must divide every entry on the right side of the equation, so $p \mid |Z(G)|$.
Proposition 9. $G$ has a Sylow $p$-subgroup.
Proof: By induction on $|G|$. If $|G| = 1$ then there is no $p$ which divides its order, so the condition is trivial. Suppose $|G| = p^m k$, $p \nmid k$, and the proposition holds for all groups of smaller order. Then we can consider whether $p$ divides the order of the center, $Z(G)$.
If it does then, by Cauchy's theorem, there is an element $x$ of $Z(G)$ of order $p$, and therefore a cyclic subgroup generated by $x$, $\langle x \rangle$, also of order $p$. Since this is a subgroup of the center, it is normal, so $G/\langle x \rangle$ is well-defined and of order $p^{m-1}k$. By the inductive hypothesis, this group has a subgroup $P/\langle x \rangle$ of order $p^{m-1}$. Then there is a corresponding subgroup $P$ of $G$ which has $|P| = |P/\langle x \rangle| \cdot |\langle x \rangle| = p^m$.
On the other hand, if $p \nmid |Z(G)|$ then consider the conjugacy classes not in the center. By the proposition above, since $|Z(G)|$ is not divisible by $p$, at least one conjugacy class can't be. If $a$ is a representative of this class then we have $p \nmid |[a]| = [G : C(a)]$, and since $|C(a)| \cdot [G : C(a)] = |G|$, $p^m \mid |C(a)|$. But $C(a) \ne G$, since $a \notin Z(G)$, so by the inductive hypothesis $C(a)$ has a subgroup of order $p^m$, and this is also a subgroup of $G$.
Proposition 10. The intersection of a Sylow $p$-subgroup with the normalizer of a Sylow $p$-subgroup is the intersection of the subgroups. That is, $Q \cap N_G(P) = Q \cap P$.
Proof: If $P$ and $Q$ are Sylow $p$-subgroups, consider $R = Q \cap N_G(P)$. Obviously $Q \cap P \subseteq R$. In addition, since $R \subseteq N_G(P)$, the second isomorphism theorem tells us that $RP$ is a group, and $|RP| = \frac{|R| \cdot |P|}{|R \cap P|}$. $P$ is a subgroup of $RP$, so $p^m \mid |RP|$. But $R$ is a subgroup of $Q$ and $P$ is a Sylow $p$-subgroup, so $|R| \cdot |P|$ is a power of $p$, and hence so is $|RP|$. Since $|RP|$ divides $|G|$, it must be that $|RP| = p^m$, and therefore $P = RP$, and so $R \subseteq P$. Obviously $R \subseteq Q$, so $R \subseteq Q \cap P$.
The following construction will be used in the remainder of the proof: Given any Sylow $p$-subgroup $P$, consider the set of its conjugates $C$. Then $X \in C \leftrightarrow X = xPx^{-1} = \{xyx^{-1} : y \in P\}$ for some $x \in G$. Observe that every $X \in C$ is a Sylow $p$-subgroup (and we will show that the converse holds as well). We define a group action of a subset of $G$ on $C$ by:
$g \cdot X = g \cdot xPx^{-1} = gxPx^{-1}g^{-1} = (gx)P(gx)^{-1}$
This is clearly a group action, so we can consider the orbits of $P$ under it. Of course, if all of $G$ is used then there is only one orbit, so we restrict the action to a Sylow $p$-subgroup $Q$. Name the orbits $O_1, \dots, O_s$, and let $P_1, \dots, P_s$ be representatives of the corresponding orbits. By the orbit-stabilizer theorem, the size of an orbit is the index of the stabilizer, and under this action the stabilizer of any $P_i$ is just $N_Q(P_i) = Q \cap N_G(P_i) = Q \cap P_i$, so $|O_i| = [Q : Q \cap P_i]$.
There are two easy results on this construction. If $Q = P_i$ then $|O_i| = [P_i : P_i \cap P_i] = 1$. If $Q \ne P_i$ then $[Q : Q \cap P_i] > 1$, and since the index of any subgroup of $Q$ divides $|Q|$, $p \mid |O_i|$.
Proposition 11. The number of conjugates of any Sylow $p$-subgroup of $G$ is congruent to 1 modulo $p$.
Proof: In the construction above, let $Q = P_1$. Then $|O_1| = 1$ and $p \mid |O_i|$ for $i \ne 1$. Since the number of conjugates of $P$ is the sum of the number in each orbit, the number of conjugates is of the form $1 + k_2 p + k_3 p + \cdots + k_s p$, which is obviously congruent to 1 modulo $p$.
Proposition 12. Any two Sylow $p$-subgroups are conjugate.
Proof: Given a Sylow $p$-subgroup $P$ and any other Sylow $p$-subgroup $Q$, consider again the construction given above. If $Q$ is not conjugate to $P$ then $Q \ne P_i$ for every $i$, and therefore $p \mid |O_i|$ for every orbit. But then the number of conjugates of $P$ is divisible by $p$, contradicting the previous result. Therefore $Q$ must be conjugate to $P$.
Proposition 13. The number of subgroups of $G$ of order $p^m$ is congruent to 1 modulo $p$ and is a factor of $k$.
Proof: Since conjugates of a Sylow $p$-subgroup are precisely the Sylow $p$-subgroups, and since a Sylow $p$-subgroup has 1 modulo $p$ conjugates, there are 1 modulo $p$ Sylow $p$-subgroups. Since the number of conjugates is the index of the normalizer, it must be $[G : N_G(P)]$. Since $P$ is a subgroup of its normalizer, $p^m \mid |N_G(P)|$, and therefore $[G : N_G(P)] \mid k$. Version: 3 Owner: Henry Author(s): Henry
322.13
subgroups containing the normalizers of Sylow subgroups normalize themselves
Let $G$ be a finite group, and $S$ a Sylow subgroup. Let $M$ be a subgroup such that $N_G(S) \subset M$. Then $M = N_G(M)$. By order considerations, $S$ is a Sylow subgroup of $M$. Since $M$ is normal in $N_G(M)$, by the Frattini argument, $N_G(M) = N_G(S)M = M$.
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 323 20D25 – Special subgroups (Frattini, Fitting, etc.)
323.1 Fitting’s theorem
If G is a ﬁnite group and M and N are normal nilpotent subgroups, then MN is also a normal nilpotent subgroup. Thus, any ﬁnite group has a maximal normal nilpotent subgroup, called its Fitting subgroup. Version: 1 Owner: bwebste Author(s): bwebste
323.2
characteristically simple group
A group G is called characteristically simple if its only characteristic subgroups are {1} and G. Any finite characteristically simple group is the direct product of several copies of isomorphic simple groups. Version: 3 Owner: bwebste Author(s): bwebste
323.3
the Frattini subgroup is nilpotent
The Frattini subgroup Frat $G$ of any finite group $G$ is nilpotent. Let $S$ be a Sylow $p$-subgroup of Frat $G$. Then by the Frattini argument, $(\mathrm{Frat}\ G) N_G(S) = G$. Since the Frattini subgroup is formed of nongenerators, $N_G(S) = G$. Thus $S$ is normal in $G$, and thus in Frat $G$. Any finite group whose Sylow subgroups are all normal is nilpotent. Version: 4 Owner: bwebste Author(s): bwebste
Chapter 324 20D30 – Series and lattices of subgroups
324.1 maximal condition
A group is said to satisfy the maximal condition if every strictly ascending chain of subgroups $G_1 \subset G_2 \subset G_3 \subset \cdots$ is finite. This is also called the ascending chain condition. A group satisfies the maximal condition if and only if the group and all its subgroups are finitely generated. Similar properties are useful in other classes of algebraic structures: see for example the noetherian condition for rings and modules. Version: 2 Owner: mclase Author(s): mclase
324.2
minimal condition
A group is said to satisfy the minimal condition if every strictly descending chain of subgroups $G_1 \supset G_2 \supset G_3 \supset \cdots$ is finite. This is also called the descending chain condition.
A group which satisfies the minimal condition is necessarily periodic. For if it contained an element $x$ of infinite order, then
$\langle x \rangle \supset \langle x^2 \rangle \supset \langle x^4 \rangle \supset \cdots \supset \langle x^{2^n} \rangle \supset \cdots$
is an infinite descending chain of subgroups. Similar properties are useful in other classes of algebraic structures: see for example the artinian condition for rings and modules. Version: 1 Owner: mclase Author(s): mclase
324.3
subnormal series
Let $G$ be a group with a subgroup $H$, and let
$G = G_0 \trianglerighteq G_1 \trianglerighteq \cdots \trianglerighteq G_n = H$ (324.3.1)
be a series of subgroups with each $G_i$ a normal subgroup of $G_{i-1}$. Such a series is called a subnormal series or a subinvariant series. If in addition, each $G_i$ is a normal subgroup of $G$, then the series is called a normal series. A subnormal series in which each $G_i$ is a maximal normal subgroup of $G_{i-1}$ is called a composition series. A normal series in which $G_i$ is a maximal normal subgroup of $G$ contained in $G_{i-1}$ is called a principal series or a chief series. Note that a composition series need not end in the trivial group 1. One speaks of a composition series (324.3.1) as a composition series from $G$ to $H$. But the term composition series for $G$ generally means a composition series from $G$ to 1. Similar remarks apply to principal series. Version: 1 Owner: mclase Author(s): mclase
Chapter 325 20D35 – Subnormal subgroups
325.1 subnormal subgroup
Let $G$ be a group, and $H$ a subgroup of $G$. Then $H$ is subnormal if there exists a finite series $H = H_0 \trianglelefteq H_1 \trianglelefteq \cdots \trianglelefteq H_n = G$ with $H_i$ a normal subgroup of $H_{i+1}$. Version: 1 Owner: bwebste Author(s): bwebste
Chapter 326 20D99 – Miscellaneous
326.1 Cauchy’s theorem
Let $G$ be a finite group and let $p$ be a prime dividing $|G|$. Then there is an element of $G$ of order $p$. Version: 1 Owner: Evandar Author(s): Evandar
326.2
Lagrange’s theorem
Let G be a ﬁnite group and let H be a subgroup of G. Then the order of H divides the order of G. Version: 2 Owner: Evandar Author(s): Evandar
326.3
exponent
If $G$ is a finite group, then the exponent of $G$, denoted $\exp G$, is the smallest positive integer $n$ such that, for every $g \in G$, $g^n = e_G$. Thus, for every finite group $G$, $\exp G$ divides $|G|$, and, for every $g \in G$, $|g|$ divides $\exp G$. The concept of exponent for finite groups is similar to that of characteristic for rings.
If $G$ is a finite abelian group, then there exists $g \in G$ with $|g| = \exp G$. As a result of the fundamental theorem of finite abelian groups, there exist $a_1, \dots, a_n$ with $a_i$ dividing $a_{i+1}$ for every integer $i$ with $1 \le i < n$ such that $G \cong \mathbb{Z}_{a_1} \oplus \cdots \oplus \mathbb{Z}_{a_n}$. Since, for every $c \in G$, $c^{a_n} = e_G$, then $\exp G \le a_n$. Since $|(0, \dots, 0, 1)| = a_n$, then $\exp G = a_n$, and the result follows.
Following are some examples of exponents of nonabelian groups. Since $|(12)| = 2$, $|(123)| = 3$, and $|S_3| = 6$, then $\exp S_3 = 6$. In $Q_8 = \{1, -1, i, -i, j, -j, k, -k\}$, the quaternion group of order eight, since $|i| = |-i| = |j| = |-j| = |k| = |-k| = 4$ and $1^4 = (-1)^4 = 1$, then $\exp Q_8 = 4$. Since the order of a product of two disjoint transpositions is 2, the order of a three cycle is 3, and the only nonidentity elements of $A_4$ are products of two disjoint transpositions and three cycles, then $\exp A_4 = 6$. Since $|(123)| = 3$ and $|(1234)| = 4$, then $\exp S_4 \ge 12$. Every element of $S_4$ is the identity, a transposition, a product of two disjoint transpositions, a three cycle, or a four cycle, so every element has order dividing 12. It follows that $\exp S_4 = 12$. Version: 5 Owner: Wkbj79 Author(s): Wkbj79
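The permutation-group examples above can be confirmed by brute force. The sketch below is an illustration only, under the assumption that a permutation of {0, ..., n-1} is stored as a tuple of images; the helper names (`compose`, `order`, `is_even`, `exponent`) are hypothetical. It computes exp G as the least common multiple of the element orders.

```python
from itertools import permutations
from math import gcd


def compose(s, t):
    """Return the permutation 'apply t, then s' on tuples of images."""
    return tuple(s[t[i]] for i in range(len(t)))


def order(perm):
    """Smallest n >= 1 with perm^n equal to the identity."""
    identity = tuple(range(len(perm)))
    power, n = perm, 1
    while power != identity:
        power = compose(perm, power)
        n += 1
    return n


def is_even(perm):
    """A permutation is even when its number of inversions is even."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def exponent(group):
    """exp G computed as the least common multiple of all element orders."""
    result = 1
    for g in group:
        n = order(g)
        result = result * n // gcd(result, n)
    return result


S3 = list(permutations(range(3)))
S4 = list(permutations(range(4)))
A4 = [g for g in S4 if is_even(g)]

print(exponent(S3), exponent(A4), exponent(S4))  # prints: 6 6 12
```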
326.4
fully invariant subgroup
A subgroup $H$ of a group $G$ is fully invariant if $f(H) \subseteq H$ for all endomorphisms $f : G \to G$. This is a stronger condition than being a characteristic subgroup. The derived subgroup is fully invariant. Version: 1 Owner: mclase Author(s): mclase
326.5
proof of Cauchy’s theorem
Let $G$ be a finite group and $p$ be a prime divisor of $|G|$. Consider the set $X$ of all ordered strings $(x_1, x_2, \dots, x_p)$ for which $x_1 x_2 \cdots x_p = e$. Note $|X| = |G|^{p-1}$ (the first $p - 1$ entries may be chosen freely and then determine the last), i.e. a multiple of $p$. There is a natural group action of $\mathbb{Z}_p$ on $X$: $m \in \mathbb{Z}_p$ sends the string $(x_1, x_2, \dots, x_p)$ to $(x_{m+1}, \dots, x_p, x_1, \dots, x_m)$. By the orbit-stabilizer theorem each orbit contains exactly 1 or $p$ strings. Since $(e, e, \dots, e)$ has an orbit of cardinality 1, and the orbits partition $X$, the cardinality of which is divisible by $p$, there must exist at least one other string $(x_1, x_2, \dots, x_p)$ which is left fixed by every element of $\mathbb{Z}_p$, i.e. $x_1 = x_2 = \dots = x_p$, and so there exists an element of order $p$ as required. Version: 1 Owner: vitriol Author(s): vitriol
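To make the counting argument concrete, the following sketch enumerates the set X for the symmetric group on three letters and counts the strings fixed by the cyclic shift. It is an illustration under stated assumptions, not part of the original proof: permutations are stored as tuples of images, and `compose` and `cauchy_count` are hypothetical helper names.

```python
from itertools import permutations, product


def compose(s, t):
    """Return the permutation 'apply t, then s' on tuples of images."""
    return tuple(s[t[i]] for i in range(len(t)))


def cauchy_count(group, identity, p):
    """Return (|X|, number of constant strings in X) for the set X of the proof."""
    size_x = 0
    constant = 0
    for string in product(group, repeat=p):
        total = identity
        for x in string:
            total = compose(total, x)
        if total == identity:
            size_x += 1
            # A string is fixed by every cyclic shift exactly when it is constant.
            if len(set(string)) == 1:
                constant += 1
    return size_x, constant


S3 = list(permutations(range(3)))
e = tuple(range(3))
for p in (2, 3):
    size_x, constant = cauchy_count(S3, e, p)
    # |X| = |G|^(p-1), and the number of fixed strings is divisible by p.
    print(p, size_x, constant, size_x == len(S3) ** (p - 1), constant % p == 0)
```

The constant strings other than (e, ..., e) are exactly the elements of order p, so a count larger than 1 exhibits such an element.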
326.6
proof of Lagrange’s theorem
We know that the cosets $Hg$ form a partition of $G$ (see the coset entry for proof of this). Since $G$ is finite, we know it can be completely decomposed into a finite number of cosets. Call this number $n$. We denote the $i$th coset by $Ha_i$ and write $G$ as
$G = Ha_1 \cup Ha_2 \cup \cdots \cup Ha_n$.
Since each coset has $|H|$ elements, we have $|G| = |H| \cdot n$ and so $|H|$ divides $|G|$, which proves Lagrange's theorem. Version: 2 Owner: akrowne Author(s): akrowne
326.7
proof of the converse of Lagrange’s theorem for ﬁnite cyclic groups
Following is a proof that, if $G$ is a finite cyclic group and $n \in \mathbb{Z}^+$ is a divisor of $|G|$, then $G$ has a subgroup of order $n$. Let $g$ be a generator of $G$. Then $|g| = |\langle g \rangle| = |G|$. Let $z \in \mathbb{Z}$ such that $nz = |G| = |g|$. Consider $\langle g^z \rangle$. Since $g \in G$, then $g^z \in G$. Thus, $\langle g^z \rangle \le G$. Since
$|\langle g^z \rangle| = |g^z| = \frac{|g|}{\gcd(z, |g|)} = \frac{nz}{\gcd(z, nz)} = \frac{nz}{z} = n$,
it follows that $\langle g^z \rangle$ is a subgroup of $G$ of order $n$. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
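The construction in this proof is easy to try out. The sketch below assumes the cyclic group is modelled additively as Z/NZ with generator 1, so that the element written $g^z$ above becomes the residue z; `generated_subgroup` is a hypothetical helper name.

```python
from math import gcd


def generated_subgroup(z, N):
    """Subgroup of the additive cyclic group Z/NZ generated by z."""
    return sorted({(k * z) % N for k in range(N)})


N = 12
for n in (d for d in range(1, N + 1) if N % d == 0):
    z = N // n  # the integer z with n * z = |G|
    H = generated_subgroup(z, N)
    # The order of the subgroup agrees with |g| / gcd(z, |g|) = n.
    assert len(H) == n == N // gcd(z, N)
    print(n, H)
```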
326.8
proof that exp G divides |G|
Following is a proof that $\exp G$ divides $|G|$ for every finite group $G$. By the division algorithm, there exist $q, r \in \mathbb{Z}$ with $0 \le r < \exp G$ such that $|G| = q(\exp G) + r$. Let $g \in G$. Then
$e_G = g^{|G|} = g^{q(\exp G) + r} = g^{q(\exp G)} g^r = (g^{\exp G})^q g^r = (e_G)^q g^r = e_G g^r = g^r$.
Thus, for every $g \in G$, $g^r = e_G$. By the definition of exponent, $r$ cannot be positive. Thus, $r = 0$. It follows that $\exp G$ divides $|G|$. Version: 4 Owner: Wkbj79 Author(s): Wkbj79
326.9
proof that |g| divides exp G
Following is a proof that, for every finite group $G$ and for every $g \in G$, $|g|$ divides $\exp G$. By the division algorithm, there exist $q, r \in \mathbb{Z}$ with $0 \le r < |g|$ such that $\exp G = q|g| + r$. Since
$e_G = g^{\exp G} = g^{q|g| + r} = (g^{|g|})^q g^r = (e_G)^q g^r = e_G g^r = g^r$,
then, by definition of the order of an element, $r$ cannot be positive. Thus, $r = 0$. It follows that $|g|$ divides $\exp G$. Version: 2 Owner: Wkbj79 Author(s): Wkbj79
326.10
proof that every group of prime order is cyclic
Following is a proof that every group of prime order is cyclic. Let $p$ be a prime and $G$ be a group such that $|G| = p$. Then $G$ contains more than one element. Let $g \in G$ such that $g \ne e_G$. Then $\langle g \rangle$ contains more than one element. Since $\langle g \rangle \le G$, then by Lagrange's theorem, $|\langle g \rangle|$ divides $p$. Since $|\langle g \rangle| > 1$ and $|\langle g \rangle|$ divides a prime, then $|\langle g \rangle| = p = |G|$. Hence, $\langle g \rangle = G$. It follows that $G$ is cyclic. Version: 3 Owner: Wkbj79 Author(s): Wkbj79
Chapter 327 20E05 – Free nonabelian groups
327.1 Nielsen-Schreier theorem
Let G be a free group and H a subgroup of G. Then H is free. Version: 1 Owner: Evandar Author(s): Evandar
327.2
Schreier index formula
Let $G$ be a free group and $H$ a subgroup of finite index $[G : H] = n$. By the Nielsen-Schreier theorem, $H$ is free. The Schreier index formula states that rank($H$) = $n$(rank($G$) − 1) + 1. This implies more generally that, if $G$ is any group generated by $m$ elements, then any subgroup of index $n$ can be generated by $nm - n + 1$ elements. Version: 1 Owner: bwebste Author(s): bwebste
327.3
free group
Let $A$ be a set with elements $a_i$ for $i$ in some index set $I$. We refer to $A$ as an alphabet and the elements of $A$ as letters. A syllable is a symbol of the form $a_i^n$ for $n \in \mathbb{Z}$. It is customary to write $a$ for $a^1$. Define a word to be a finite ordered string, or sequence, of syllables made up of elements of $A$. For example,
$a_2^3 a_1 a_4^{-1} a_3^2 a_2^{-3}$
is a five-syllable word. Notice that there exists a unique empty word, i.e. the word with no syllables, usually written simply as 1. Denote the set of all words formed from elements of $A$ by $W[A]$. Define a binary operation, called the product, on $W[A]$ by concatenation of words. To illustrate, if $a_3^3 a_1$ and $a_1^{-1} a_2^4$ are elements of $W[A]$ then their product is simply $a_3^3 a_1 a_1^{-1} a_2^4$. This gives $W[A]$ the structure of a semigroup with identity. The empty word 1 acts as a right and left identity in $W[A]$, and is the only element which has an inverse. In order to give $W[A]$ the structure of a group, two more ideas are needed.
If $v = u_1 a_i^0 u_2$ is a word where $u_1, u_2$ are also words and $a_i$ is some element of $A$, an elementary contraction of type I replaces the occurrence of $a_i^0$ by 1. Thus, after this type of contraction we get another word $w = u_1 u_2$. If $v = u_1 a_i^p a_i^q u_2$ is a word, an elementary contraction of type II replaces the occurrence of $a_i^p a_i^q$ by $a_i^{p+q}$, which results in $w = u_1 a_i^{p+q} u_2$. In either of these cases, we also say that $w$ is obtained from $v$ by an elementary contraction, or that $v$ is obtained from $w$ by an elementary expansion. Call two words $u, v$ equivalent (denoted $u \sim v$) if one can be obtained from the other by a finite sequence of elementary contractions or expansions. This is an equivalence relation on $W[A]$. Let $F[A]$ be the set of equivalence classes of words in $W[A]$. Then $F[A]$ is a group under the operation $[u][v] = [uv]$ where $[u], [v] \in F[A]$. The inverse $[u]^{-1}$ of an element $[u]$ is obtained by reversing the order of the syllables of $[u]$ and changing the sign of each syllable. For example, if $[u] = [a_1 a_3^2]$, then $[u]^{-1} = [a_3^{-2} a_1^{-1}]$.
We call $F[A]$ the free group on the alphabet $A$ or the free group generated by $A$. A given group $G$ is free if $G$ is isomorphic to $F[A]$ for some $A$. This seemingly ad hoc construction gives an important result: Every group is the homomorphic image of some free group. Version: 4 Owner: jihemme Author(s): jihemme, rmilson, djao
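The contractions above translate directly into a small program. The following is a minimal sketch, assuming a word is stored as a list of (index, exponent) pairs so that $a_3^2$ becomes `(3, 2)`; the function names are hypothetical. It relies on the standard fact that every equivalence class contains exactly one fully contracted word, so comparing contracted words compares classes.

```python
def reduce_word(word):
    """Fully contract a word given as a list of (letter index, exponent) syllables."""
    reduced = []
    for letter, exponent in word:
        if exponent == 0:
            continue  # type I contraction: drop a syllable with exponent 0
        if reduced and reduced[-1][0] == letter:
            # type II contraction: merge adjacent syllables on the same letter
            merged = reduced[-1][1] + exponent
            reduced.pop()
            if merged != 0:
                reduced.append((letter, merged))
        else:
            reduced.append((letter, exponent))
    return reduced


def multiply(u, v):
    """Product [u][v] = [uv], returned as a fully contracted word."""
    return reduce_word(list(u) + list(v))


def inverse(u):
    """Reverse the syllables and change the sign of each exponent."""
    return [(letter, -exponent) for letter, exponent in reversed(u)]


u = [(1, 1), (3, 2)]                      # the word a_1 a_3^2
print(inverse(u))                         # the word a_3^-2 a_1^-1
print(multiply(u, inverse(u)))            # the empty word, i.e. the identity
print(multiply([(3, 3), (1, 1)], [(1, -1), (2, 4)]))
```

The stack keeps the invariant that neighbouring syllables use different letters and have nonzero exponents, which is why a single left-to-right pass is enough.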
327.4
proof of Nielsen-Schreier theorem and Schreier index formula
While there are purely algebraic proofs of this fact, a much easier proof is available through geometric group theory. Let $G$ be a group which is free on a set $X$. Any group acts freely on its Cayley graph, and the Cayley graph of $G$ is a $2|X|$-regular tree, which we will call $T$.
If $H$ is any subgroup of $G$, then $H$ also acts freely on $T$ by restriction. Since groups that act freely on trees are free, $H$ is free.
Moreover, we can obtain the rank of $H$ (the size of the set on which it is free). If $\Gamma$ is a finite connected graph, then $\pi_1(\Gamma)$ is free of rank $1 - \chi(\Gamma)$, where $\chi(\Gamma)$ denotes the Euler characteristic of $\Gamma$. Since $H \cong \pi_1(H \backslash T)$, the rank of $H$ is $1 - \chi(H \backslash T)$. If $H$ is of finite index $n$ in $G$, then $H \backslash T$ is finite, and $\chi(H \backslash T) = n\chi(G \backslash T)$. Of course $1 - \chi(G \backslash T)$ is the rank of $G$. Substituting, we find that rank($H$) = $1 - n(1 - \mathrm{rank}(G))$ = $n$(rank($G$) − 1) + 1. Version: 2 Owner: bwebste Author(s): bwebste
327.5
Jordan–Hölder decomposition
A Jordan–Hölder decomposition of a group $G$ is a filtration $G = G_1 \supset G_2 \supset \cdots \supset G_n = \{1\}$ such that $G_{i+1}$ is a normal subgroup of $G_i$ and the quotient $G_i/G_{i+1}$ is a simple group for each $i$. Version: 4 Owner: djao Author(s): djao
327.6
proﬁnite group
A topological group $G$ is profinite if it is isomorphic to the inverse limit of some projective system of finite groups. In other words, $G$ is profinite if there exists a directed set $I$, a collection of finite groups $\{H_i\}_{i \in I}$, and homomorphisms $\alpha_{ij} : H_j \to H_i$ for each pair $i, j \in I$ with $i \le j$, satisfying
1. $\alpha_{ii} = 1$ for all $i \in I$,
2. $\alpha_{ij} \circ \alpha_{jk} = \alpha_{ik}$ for all $i, j, k \in I$ with $i \le j \le k$,
with the property that:
• $G$ is isomorphic as a group to the projective limit
$\varprojlim H_i := \{ (h_i) \in \prod_{i \in I} H_i \mid \alpha_{ij}(h_j) = h_i \text{ for all } i \le j \}$
under componentwise multiplication.
• The isomorphism from $G$ to $\varprojlim H_i$ (considered as a subspace of $\prod H_i$) is a homeomorphism of topological spaces, where each $H_i$ is given the discrete topology and $\prod H_i$ is given the product topology.
The topology on a proﬁnite group is called the proﬁnite topology. Version: 3 Owner: djao Author(s): djao
327.7
extension
A short exact sequence 0 → A → B → C → 0 is sometimes called an extension of C by A. This term is also applied to an object B which ﬁts into such an exact sequence. Version: 1 Owner: bwebste Author(s): bwebste
327.8
holomorph
Let $K$ be a group, and let $\theta : \mathrm{Aut}(K) \to \mathrm{Aut}(K)$ be the identity map. The holomorph of $K$, denoted $\mathrm{Hol}(K)$, is the semidirect product $K \rtimes_\theta \mathrm{Aut}(K)$. Then $K$ is a normal subgroup of $\mathrm{Hol}(K)$, and any automorphism of $K$ is the restriction of an inner automorphism of $\mathrm{Hol}(K)$. For if $\phi \in \mathrm{Aut}(K)$, then
$(1, \phi) \cdot (k, 1) \cdot (1, \phi^{-1}) = (1 \cdot k^{\theta(\phi)}, \phi) \cdot (1, \phi^{-1}) = (k^{\theta(\phi)} \cdot 1^{\theta(\phi)}, \phi\phi^{-1}) = (\phi(k), 1)$.
Version: 2 Owner: dublisk Author(s): dublisk
327.9
proof of the Jordan–Hölder decomposition theorem
Let $|G| = N$. We first prove existence, using induction on $N$. If $N = 1$ (or, more generally, if $G$ is simple) the result is clear. Now suppose $G$ is not simple. Choose a maximal proper normal subgroup $G_1$ of $G$. Then $G_1$ has a Jordan–Hölder decomposition by induction, which produces a Jordan–Hölder decomposition for $G$.
To prove uniqueness, we use induction on the length $n$ of the decomposition series. If $n = 1$ then $G$ is simple and we are done. For $n > 1$, suppose that
$G \supset G_1 \supset G_2 \supset \cdots \supset G_n = \{1\}$ and $G \supset G'_1 \supset G'_2 \supset \cdots \supset G'_m = \{1\}$
are two decompositions of $G$. If $G_1 = G'_1$ then we're done (apply the induction hypothesis to $G_1$), so assume $G_1 \ne G'_1$. Set $H := G_1 \cap G'_1$ and choose a decomposition series $H \supset H_1 \supset \cdots \supset H_k = \{1\}$ for $H$. By the second isomorphism theorem, $G_1/H = G_1 G'_1/G'_1 = G/G'_1$ (the last equality is because $G_1 G'_1$ is a normal subgroup of $G$ properly containing $G_1$). In particular, $H$ is a normal subgroup of $G_1$ with simple quotient. But then
$G_1 \supset G_2 \supset \cdots \supset G_n$ and $G_1 \supset H \supset \cdots \supset H_k$
are two decomposition series for $G_1$, and hence have the same simple quotients by the induction hypothesis; likewise for the $G'_1$ series. Therefore $n = m$. Moreover, since $G/G_1 = G'_1/H$ and $G/G'_1 = G_1/H$ (by the second isomorphism theorem), we have now accounted for all of the simple quotients, and shown that they are the same. Version: 4 Owner: djao Author(s): djao
327.10
semidirect product of groups
The goal of this exposition is to carefully explain the correspondence between the notions of external and internal semi–direct products of groups, as well as the connection between semi–direct products and short exact sequences. Naturally, we start with the construction of semi–direct products.
Definition 6. Let $H$ and $Q$ be groups and let $\theta : Q \to \mathrm{Aut}(H)$ be a group homomorphism. The semi–direct product $H \rtimes_\theta Q$ is defined to be the group with underlying set $\{(h, q) \text{ such that } h \in H,\ q \in Q\}$ and group operation $(h, q)(h', q') := (h\,\theta(q)(h'),\ qq')$.
We leave it to the reader to check that $H \rtimes_\theta Q$ is really a group. It helps to know that the inverse of $(h, q)$ is $(\theta(q^{-1})(h^{-1}), q^{-1})$.
For the remainder of this article, we omit $\theta$ from the notation whenever this map is clear from the context.
Set $G := H \rtimes Q$. There exist canonical monomorphisms $H \to G$ and $Q \to G$, given by
$h \mapsto (h, 1_Q)$, $h \in H$
$q \mapsto (1_H, q)$, $q \in Q$
where $1_H$ (resp. $1_Q$) is the identity element of $H$ (resp. $Q$). These monomorphisms are so natural that we will treat $H$ and $Q$ as subgroups of $G$ under these inclusions.
Theorem 3. Let $G := H \rtimes Q$ as above. Then:
• $H$ is a normal subgroup of $G$.
• $HQ = G$.
• $H \cap Q = \{1_G\}$.
Let $p : G \to Q$ be the projection map defined by $p(h, q) = q$. Then $p$ is a homomorphism with kernel $H$. Therefore $H$ is a normal subgroup of $G$.
Every $(h, q) \in G$ can be written as $(h, 1_Q)(1_H, q)$. Therefore $HQ = G$. Finally, it is evident that $(1_H, 1_Q)$ is the only element of $G$ that is of the form $(h, 1_Q)$ for $h \in H$ and $(1_H, q)$ for $q \in Q$.
This result motivates the definition of internal semi–direct products.
Definition 7. Let $G$ be a group with subgroups $H$ and $Q$. We say $G$ is the internal semi–direct product of $H$ and $Q$ if:
• $H$ is a normal subgroup of $G$.
• $HQ = G$.
• $H \cap Q = \{1_G\}$.
We know an external semi–direct product is an internal semi–direct product (Theorem 3). Now we prove a converse (Theorem 4), namely, that an internal semi–direct product is an external semi–direct product.
Lemma 6. Let $G$ be a group with subgroups $H$ and $Q$. Suppose $G = HQ$ and $H \cap Q = \{1_G\}$. Then every element $g$ of $G$ can be written uniquely in the form $hq$, for $h \in H$ and $q \in Q$.
Since $G = HQ$, we know that $g$ can be written as $hq$. Suppose it can also be written as $h'q'$. Then $hq = h'q'$ so $h'^{-1}h = q'q^{-1} \in H \cap Q = \{1_G\}$. Therefore $h = h'$ and $q = q'$.
Theorem 4. Suppose $G$ is a group with subgroups $H$ and $Q$, and $G$ is the internal semi–direct product of $H$ and $Q$. Then $G \cong H \rtimes_\theta Q$ where $\theta : Q \to \mathrm{Aut}(H)$ is given by $\theta(q)(h) := qhq^{-1}$, $q \in Q$, $h \in H$.
By Lemma 6, every element $g$ of $G$ can be written uniquely in the form $hq$, with $h \in H$ and $q \in Q$. Therefore, the map $\phi : H \rtimes Q \to G$ given by $\phi(h, q) = hq$ is a bijection from $H \rtimes Q$ to $G$. It only remains to show that this bijection is a homomorphism.
Given elements $(h, q)$ and $(h', q')$ in $H \rtimes Q$, we have
$\phi((h, q)(h', q')) = \phi((h\,\theta(q)(h'), qq')) = \phi(hqh'q^{-1}, qq') = hqh'q' = \phi(h, q)\phi(h', q')$.
Therefore $\phi$ is an isomorphism.
Consider the external semi–direct product $G := H \rtimes_\theta Q$ with subgroups $H$ and $Q$. We know from Theorem 4 that $G$ is isomorphic to the external semi–direct product $H \rtimes_{\theta'} Q$, where we are temporarily writing $\theta'$ for the conjugation map $\theta'(q)(h) := qhq^{-1}$ of Theorem 4. But in fact the two maps $\theta$ and $\theta'$ are the same:
$\theta'(q)(h) = (1_H, q)(h, 1_Q)(1_H, q^{-1}) = (\theta(q)(h), 1_Q) = \theta(q)(h)$.
In summary, one may use Theorems 3 and 4 to pass freely between the notions of internal semi–direct product and external semi–direct product.
Finally, we discuss the correspondence between semi–direct products and split exact sequences of groups.
Definition 8. An exact sequence of groups
$1 \to H \xrightarrow{i} G \xrightarrow{j} Q \to 1$
is split if there exists a homomorphism $k : Q \to G$ such that $j \circ k$ is the identity map on $Q$.
Theorem 5. Let $G$, $H$, and $Q$ be groups. Then $G$ is isomorphic to a semi–direct product $H \rtimes Q$ if and only if there exists a split exact sequence
$1 \to H \xrightarrow{i} G \xrightarrow{j} Q \to 1$.
First suppose $G \cong H \rtimes Q$. Let $i : H \to G$ be the inclusion map $i(h) = (h, 1_Q)$ and let $j : G \to Q$ be the projection map $j(h, q) = q$. Let the splitting map $k : Q \to G$ be the inclusion map $k(q) = (1_H, q)$. Then the sequence above is clearly split exact.
Now suppose we have the split exact sequence above. Let $k : Q \to G$ be the splitting map. Then:
• $i(H) = \ker j$, so $i(H)$ is normal in $G$.
• For any $g \in G$, set $q := k(j(g))$. Then $j(gq^{-1}) = j(g)j(k(j(g)))^{-1} = 1_Q$, so $gq^{-1} \in \mathrm{Im}\ i$. Set $h := gq^{-1}$. Then $g = hq$. Therefore $G = i(H)k(Q)$.
• Suppose $g \in G$ is in both $i(H)$ and $k(Q)$. Write $g = k(q)$. Then $k(q) \in \mathrm{Im}\ i = \ker j$, so $q = j(k(q)) = 1_Q$. Therefore $g = k(q) = k(1_Q) = 1_G$, so $i(H) \cap k(Q) = \{1_G\}$.
This proves that $G$ is the internal semi–direct product of $i(H)$ and $k(Q)$. These are isomorphic to $H$ and $Q$, respectively. Therefore $G$ is isomorphic to a semi–direct product $H \rtimes Q$.
Thus, not all normal subgroups $H \subset G$ give rise to an (internal) semi–direct product $G = H \rtimes G/H$. More specifically, if $H$ is a normal subgroup of $G$, we have the canonical exact sequence
$1 \to H \to G \to G/H \to 1$.
We see that $G$ can be decomposed into $H \rtimes G/H$ as an internal semi–direct product if and only if the canonical exact sequence splits. Version: 5 Owner: djao Author(s): djao
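As a small sanity check of the external construction, the sketch below builds the semi–direct product of H = Z/3Z by Q = Z/2Z, where the nontrivial element of Q acts by inversion. This choice of groups and action is an assumption made for illustration (the result is a nonabelian group of order 6); both groups are written additively, and the names `theta`, `mul` and `inv` are hypothetical.

```python
from itertools import product

H = range(3)  # Z/3Z, written additively
Q = range(2)  # Z/2Z, written additively


def theta(q, h):
    """theta(q) acting on H: the identity for q = 0, inversion h -> -h for q = 1."""
    return h % 3 if q == 0 else (-h) % 3


def mul(x, y):
    """Group operation (h, q)(h', q') := (h theta(q)(h'), q q'), in additive notation."""
    (h1, q1), (h2, q2) = x, y
    return ((h1 + theta(q1, h2)) % 3, (q1 + q2) % 2)


def inv(x):
    """Inverse (theta(q^-1)(h^-1), q^-1), as stated after the definition."""
    h, q = x
    q_inv = (-q) % 2
    return (theta(q_inv, (-h) % 3), q_inv)


G = list(product(H, Q))
identity = (0, 0)

associative = all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in G for y in G for z in G)
inverses_ok = all(mul(x, inv(x)) == identity == mul(inv(x), x) for x in G)
abelian = all(mul(x, y) == mul(y, x) for x in G for y in G)
# Conjugating (h, 1_Q) by (1_H, q) should reproduce theta(q)(h).
conjugation_ok = all(
    mul(mul((0, q), (h, 0)), inv((0, q))) == (theta(q, h), 0) for h in H for q in Q
)
print(associative, inverses_ok, abelian, conjugation_ok)  # prints: True True False True
```

The last check is the identity showing that the conjugation map recovered from the internal description agrees with the map $\theta$ used to build the external product.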
327.11
wreath product
Let $A$ and $B$ be groups, and let $B$ act on the set $\Gamma$. Let $A^\Gamma$ be the set of all functions from $\Gamma$ to $A$. Endow $A^\Gamma$ with a group operation by pointwise multiplication. In other words, for any $f_1, f_2 \in A^\Gamma$,
$(f_1 f_2)(\gamma) = f_1(\gamma) f_2(\gamma) \quad \forall \gamma \in \Gamma$
where the operation on the right hand side above takes place in $A$, of course. Define the action of $B$ on $A^\Gamma$ by
$bf(\gamma) := f(b\gamma)$,
for any $f : \Gamma \to A$ and all $\gamma \in \Gamma$. The wreath product of $A$ and $B$ according to the action of $B$ on $\Gamma$, sometimes denoted $A \wr_\Gamma B$, is the following semidirect product of groups: $A^\Gamma \rtimes B$.
Before going into further constructions, let us pause for a moment to unwind this definition. Let $W := A \wr_\Gamma B$. The elements of $W$ are ordered pairs $(f, b)$, for some function $f : \Gamma \to A$ and some $b \in B$. The group operation in the semidirect product, for any $(f_1, b_1), (f_2, b_2) \in W$, is
$(f_1(\gamma), b_1)(f_2(\gamma), b_2) = (f_1(\gamma) f_2(b_1\gamma), b_1 b_2) \quad \forall \gamma \in \Gamma$
The set $A^\Gamma$ can be interpreted as the cartesian product of $A$ with itself, with $|\Gamma|$ factors. That is to say, $\Gamma$ here plays the role of an index set for the Cartesian product. If $\Gamma$ is finite, for instance, say $\Gamma = \{1, 2, \dots, n\}$, then any $f \in A^\Gamma$ is an $n$-tuple, and we can think of any $(f, b) \in W$ as the following ordered pair:
$((a_1, a_2, \dots, a_n), b)$ where $a_1, a_2, \dots, a_n \in A$
The action of $B$ on $\Gamma$ in the semidirect product has the effect of permuting the elements of the $n$-tuple $f$, and the group operation defined on $A^\Gamma$ gives pointwise multiplication. To be explicit, suppose $(f, a), (g, b) \in W$, and for $j \in \Gamma$, $f(j) = r_j \in A$ and $g(j) = s_j \in A$. Then,
$(f, a)(g, b) = ((r_1, r_2, \dots, r_n), a)((s_1, s_2, \dots, s_n), b)$
$= ((r_1, r_2, \dots, r_n)(s_{a1}, s_{a2}, \dots, s_{an}), ab)$ (Notice the permutation of the indices!)
$= ((r_1 s_{a1}, r_2 s_{a2}, \dots, r_n s_{an}), ab)$.
A moment's thought to understand this slightly messy notation will be illuminating (and might also shed some light on the choice of terminology, "wreath" product). Version: 11 Owner: bwebste Author(s): NeuRet
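The explicit multiplication rule can be tried on a small case. In the sketch below, which is an illustration under stated assumptions rather than part of the entry, A = Z/2Z and B = Z/3Z acts on Γ = {0, 1, 2} by rotation; B is taken to be cyclic (hence abelian), so the formula bf(γ) = f(bγ) defines an action regardless of left or right conventions. A function f is stored as the tuple (f(0), f(1), f(2)), and the names `act` and `mul` are hypothetical.

```python
from itertools import product

n = 3
A_ORDER = 2   # A = Z/2Z, written additively
B = range(n)  # B = Z/3Z acting on Gamma = {0, 1, 2} by b.gamma = gamma + b mod 3
FUNCS = list(product(range(A_ORDER), repeat=n))  # all functions Gamma -> A


def act(b, f):
    """(b f)(gamma) := f(b gamma)."""
    return tuple(f[(gamma + b) % n] for gamma in range(n))


def mul(x, y):
    """(f1, b1)(f2, b2) = (f1 . (b1 f2), b1 b2), with the pointwise product in A."""
    (f1, b1), (f2, b2) = x, y
    shifted = act(b1, f2)
    return (tuple((u + v) % A_ORDER for u, v in zip(f1, shifted)), (b1 + b2) % n)


W = [(f, b) for f in FUNCS for b in B]
associative = all(mul(mul(x, y), z) == mul(x, mul(y, z)) for x in W for y in W for z in W)
print(len(W), associative)  # prints: 24 True

# The indices of the second tuple are permuted before multiplying pointwise.
print(mul(((0, 0, 0), 1), ((1, 0, 0), 0)))
```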
327.12
Jordan–Hölder decomposition theorem
Every finite group $G$ has a filtration $G = G_0 \supset G_1 \supset \cdots \supset G_n = \{1\}$, where each $G_{i+1}$ is normal in $G_i$ and each quotient group $G_i/G_{i+1}$ is a simple group. Any two such decompositions of $G$ have the same multiset of simple groups $G_i/G_{i+1}$ up to ordering. A filtration of $G$ satisfying the properties above is called a Jordan–Hölder decomposition of $G$. Version: 4 Owner: djao Author(s): djao
327.13
simplicity of the alternating groups
This is an elementary proof that for $n \ge 5$ the alternating group on $n$ symbols, $A_n$, is simple.
Throughout this discussion, fix $n \ge 5$. We will extensively employ cycle notation, with composition on the left, as is usual. The following observation will also be useful. Let $\pi$ be a permutation written as disjoint cycles
$\pi = (a_1, a_2, \dots, a_k)(b_1, b_2, \dots, b_l)(\dots)\dots$
It is easy to check that for any other permutation $\sigma \in S_n$
$\sigma\pi\sigma^{-1} = (\sigma(a_1), \sigma(a_2), \dots, \sigma(a_k))(\sigma(b_1), \sigma(b_2), \dots, \sigma(b_l))(\dots)\dots$
In particular, two permutations of $S_n$ are conjugate exactly when they have the same cycle type. Two preliminary results are necessary.
Lemma 7. $A_n$ is generated by all cycles of length 3.
A product of 3-cycles is an even permutation, so the subgroup generated by all 3-cycles is contained in $A_n$. For the reverse inclusion, by definition every even permutation is the product of an even number of transpositions. Thus, it suffices to show that the product of two transpositions can be written as a product of 3-cycles. There are two possibilities. Either the two transpositions move an element in common, say $(a, b)$ and $(a, c)$, or the two transpositions are disjoint, say $(a, b)$ and $(c, d)$. In the former case,
$(a, b)(a, c) = (a, c, b)$,
and in the latter,
$(a, b)(c, d) = (a, b, d)(c, b, d)$.
This establishes the first lemma.
Lemma 8. If a normal subgroup $N \trianglelefteq A_n$ contains a 3-cycle, then $N = A_n$.
We will show that if $(a, b, c) \in N$, then the assumption of normality implies that any other $(a', b', c') \in N$. This is easy to show, because there is some permutation $\sigma \in S_n$ that under conjugation takes $(a, b, c)$ to $(a', b', c')$, that is
$\sigma(a, b, c)\sigma^{-1} = (\sigma(a), \sigma(b), \sigma(c)) = (a', b', c')$.
In case $\sigma$ is odd, then (because $n \ge 5$) we can choose some transposition $(d, e) \in S_n$ disjoint from $(a', b', c')$ so that
$\sigma(a, b, c)\sigma^{-1} = (d, e)(a', b', c')(d, e)$,
that is,
$\sigma'(a, b, c)\sigma'^{-1} = (d, e)\sigma(a, b, c)\sigma^{-1}(d, e) = (a', b', c')$
where $\sigma' = (d, e)\sigma$ is even. This means that $N$ contains all 3-cycles, as $N \trianglelefteq A_n$. Hence, by the previous lemma, $N = A_n$ as required.
The rest of the proof proceeds by an exhaustive verification of all the possible cases. Suppose there is some nontrivial $N \trianglelefteq A_n$. We will show that $N = A_n$. In each case we will suppose $N$ contains a particular kind of element, and the normality will imply that $N$ also contains a certain conjugate of the element in $A_n$, thereby reducing the situation to a previously solved case.
Case 1 Suppose $N$ contains a permutation $\pi$ that when written as disjoint cycles has a cycle of length at least 4, say
$\pi = (a_1, a_2, a_3, a_4, \dots)\dots$
Upon conjugation by $(a_1, a_2, a_3) \in A_n$, we obtain
$\pi' = (a_1, a_2, a_3)\pi(a_3, a_2, a_1) = (a_2, a_3, a_1, a_4, \dots)\dots$
so that $\pi' \in N$, and also $\pi'\pi^{-1} = (a_1, a_2, a_4) \in N$. Notice that the rest of the cycles cancel. By Lemma 8, $N = A_n$.
Case 2 The cyclic decompositions of elements of $N$ only involve cycles of length 2 and at least two cycles of length 3. Consider then $\pi = (a, b, c)(d, e, f)\dots$ Conjugation by $(c, d, e)$ implies that $N$ also contains
$\pi' = (c, d, e)\pi(e, d, c) = (a, b, d)(e, c, f)\dots$,
and hence $N$ also contains $\pi'\pi = (a, d, c, b, f)\dots$, which reduces to Case 1.
Case 3 There is an element of $N$ whose cyclic decomposition only involves transpositions and exactly one 3-cycle. Upon squaring, this element becomes a 3-cycle and Lemma 8 applies.
Case 4 There is an element of $N$ of the form $\pi = (a, b)(c, d)$. Conjugating by $(a, e, b)$ with $e$ distinct from $a, b, c, d$ (again, there is at least one such $e$, as $n \ge 5$) yields
$\pi' = (a, e, b)\pi(b, e, a) = (a, e)(c, d) \in N$.
Hence $\pi'\pi = (a, b, e) \in N$. Lemma 8 applies and $N = A_n$.
Case 5 Every element of $N$ is the product of at least four transpositions. Suppose $N$ contains $\pi = (a_1, b_1)(a_2, b_2)(a_3, b_3)(a_4, b_4)\dots$, the number of transpositions being even, of course. This time we conjugate by $(a_2, b_1)(a_3, b_2)$.
$\pi' = (a_2, b_1)(a_3, b_2)\pi(a_3, b_2)(a_2, b_1) = (a_1, a_2)(a_3, b_1)(b_2, b_3)(a_4, b_4)\dots$,
and
$\pi'\pi = (a_1, a_3, b_2)(a_2, b_3, b_1) \in N$
which is Case 2.
Since this covers all possible cases, $N = A_n$ and the alternating group contains no proper nontrivial normal subgroups. QED. Version: 8 Owner: rmilson Author(s): NeuRet
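For the smallest case n = 5 the conclusion can be cross-checked by computer. The sketch below does not follow the case analysis above; it swaps in a different, counting argument: a normal subgroup is a union of conjugacy classes that includes the identity, and its order must divide the group order. Permutations are stored as tuples of images and all helper names are hypothetical.

```python
from itertools import combinations, permutations


def compose(s, t):
    """Return the permutation 'apply t, then s' on tuples of images."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(s):
    """Return the inverse permutation."""
    result = [0] * len(s)
    for i, image in enumerate(s):
        result[image] = i
    return tuple(result)


def is_even(perm):
    """A permutation is even when its number of inversions is even."""
    inversions = sum(
        1 for i in range(len(perm)) for j in range(i + 1, len(perm)) if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def conjugacy_class_sizes(group):
    """Sorted sizes of the conjugacy classes of a group given as a list of permutations."""
    seen, sizes = set(), []
    for x in group:
        if x in seen:
            continue
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        seen |= cls
        sizes.append(len(cls))
    return sorted(sizes)


A5 = [g for g in permutations(range(5)) if is_even(g)]
sizes = conjugacy_class_sizes(A5)

# Candidate orders: 1 (for the identity) plus any selection of the other class sizes,
# kept only when the total divides |A5| = 60.
others = sizes[1:]  # drop the class of the identity, which has size 1
possible_orders = {
    1 + sum(choice)
    for r in range(len(others) + 1)
    for choice in combinations(others, r)
    if len(A5) % (1 + sum(choice)) == 0
}
print(sizes, sorted(possible_orders))  # prints: [1, 12, 12, 15, 20] [1, 60]
```

Only the orders 1 and 60 survive, so the only normal subgroups of the alternating group on five symbols are the trivial subgroup and the whole group.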
327.14
abelian groups of order 120
Here we present an application of the fundamental theorem of finitely generated abelian groups.
Example (abelian groups of order 120): Let $G$ be an abelian group of order $n = 120$. Since the group is finite it is obviously finitely generated, so we can apply the theorem. There exist $n_1, n_2, \ldots, n_s$ with
$$G \cong \mathbb{Z}/n_1\mathbb{Z} \oplus \mathbb{Z}/n_2\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/n_s\mathbb{Z}, \qquad \forall i,\ n_i \geq 2; \quad n_{i+1} \mid n_i \ \text{for}\ 1 \leq i \leq s-1.$$
Notice that in the case of a finite group, $r$, as in the statement of the theorem, must be equal to 0. We have
$$n = 120 = 2^3 \cdot 3 \cdot 5 = \prod_{i=1}^{s} n_i = n_1 \cdot n_2 \cdots n_s$$
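and by the divisibility properties of the $n_i$, every prime divisor of $n$ must divide $n_1$. Thus the possibilities for $n_1$ are the following:
$$2 \cdot 3 \cdot 5, \qquad 2^2 \cdot 3 \cdot 5, \qquad 2^3 \cdot 3 \cdot 5.$$
If $n_1 = 2^3 \cdot 3 \cdot 5 = 120$ then $s = 1$. If $n_1 = 2^2 \cdot 3 \cdot 5 = 60$ then $n_2 = 2$ and $s = 2$. It remains to analyze the case $n_1 = 2 \cdot 3 \cdot 5 = 30$. Here $n_2 n_3 \cdots n_s = 4$ and $n_2$ must divide $n_1 = 30$, so $n_2 = 4$ is impossible and the only possibility is $n_2 = n_3 = 2$ with $s = 3$. Hence if $G$ is an abelian group of order 120 it must be (up to isomorphism) one of the following:
$$\mathbb{Z}/120\mathbb{Z}, \qquad \mathbb{Z}/60\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}, \qquad \mathbb{Z}/30\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}.$$
Also notice that they are all nonisomorphic, by the uniqueness part of the theorem. The group $\mathbb{Z}/30\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z}$ does not give a further isomorphism class. This is because
$$\mathbb{Z}/(n \cdot m)\mathbb{Z} \cong \mathbb{Z}/n\mathbb{Z} \oplus \mathbb{Z}/m\mathbb{Z} \iff \gcd(n, m) = 1,$$
which is due to the Chinese remainder theorem, so that $\mathbb{Z}/30\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z} \cong \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/15\mathbb{Z} \oplus \mathbb{Z}/4\mathbb{Z} \cong \mathbb{Z}/60\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$.
Version: 1 Owner: alozano Author(s): alozano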
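The case analysis above can be checked mechanically. The following Python sketch enumerates all chains $(n_1, \ldots, n_s)$ with product $n$, each $n_i \geq 2$, and $n_{i+1} \mid n_i$; the function name `invariant_factor_chains` is an illustrative choice and not part of the original entry.

```python
def invariant_factor_chains(n, bound=None):
    """Yield tuples (n1, ..., ns) with n1*...*ns == n, every ni >= 2,
    and n_{i+1} dividing n_i. `bound` is the previous factor (None at the top)."""
    if n == 1:
        yield ()
        return
    for d in range(2, n + 1):
        if n % d != 0:
            continue  # d must divide the remaining product
        if bound is not None and bound % d != 0:
            continue  # d must divide the previous factor
        for rest in invariant_factor_chains(n // d, d):
            yield (d,) + rest

chains = sorted(invariant_factor_chains(120))
for chain in chains:
    print(" + ".join(f"Z/{m}Z" for m in chain))
```

Each tuple printed corresponds to one isomorphism class of abelian groups of order 120.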
327.15
fundamental theorem of ﬁnitely generated abelian groups
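Theorem 2 (Fundamental Theorem of finitely generated abelian groups). Let $G$ be a finitely generated abelian group. Then there is a unique expression of the form:
$$G \cong \mathbb{Z}^r \oplus \mathbb{Z}/n_1\mathbb{Z} \oplus \mathbb{Z}/n_2\mathbb{Z} \oplus \cdots \oplus \mathbb{Z}/n_s\mathbb{Z}$$
for some integers $r, n_i$ satisfying:
$$r \geq 0; \qquad \forall i,\ n_i \geq 2; \qquad n_{i+1} \mid n_i \ \text{for}\ 1 \leq i \leq s-1.$$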
Version: 1 Owner: bwebste Author(s): alozano
327.16
conjugacy class
Let $G$ be a group, and consider its operation (action) on itself given by conjugation, that is, the mapping $(g, x) \mapsto gxg^{-1}$. Since conjugacy is an equivalence relation, we obtain a partition of $G$ into equivalence classes, called conjugacy classes. So, the conjugacy class of $x \in G$ (denoted $C_x$ or $C(x)$) is given by
$$C_x = \{y \in G : y = gxg^{-1} \text{ for some } g \in G\}.$$
Version: 2 Owner: drini Author(s): drini, apmxi
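As a small illustration of the definition, the following Python sketch computes the conjugacy classes of the symmetric group $S_3$. Permutations are stored as tuples of images of 0, 1, 2; the helper names `compose`, `inverse` and `conjugacy_classes` are assumptions made for this example.

```python
from itertools import permutations

def compose(p, q):
    """Return the composite p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group (list of permutation tuples) into conjugacy classes."""
    remaining = set(group)
    classes = []
    while remaining:
        x = min(remaining)
        # C_x = { g x g^{-1} : g in G }
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

S3 = list(permutations(range(3)))
class_sizes = sorted(len(c) for c in conjugacy_classes(S3))
print(class_sizes)
```

The classes found are the identity, the two 3-cycles, and the three transpositions.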
327.17
Frattini subgroup
Let G be a group. The Frattini subgroup Φ(G) of G is the intersection of all maximal subgroups of G. Equivalently, Φ(G) is the subgroup of nongenerators of G. Version: 1 Owner: Evandar Author(s): Evandar
327.18
nongenerator
Let $G$ be a group. An element $g \in G$ is said to be a nongenerator if whenever $X$ is a generating set for $G$ then $X \setminus \{g\}$ is also a generating set for $G$. Version: 1 Owner: Evandar Author(s): Evandar
Chapter 328 20Exx – Structure and classiﬁcation of inﬁnite or ﬁnite groups
328.1 faithful group action
Let $A$ be a $G$-set, that is, a set on which a group $G$ acts (or operates). The map $m_g : A \to A$ defined as $m_g(x) = \psi(g, x)$,
where $g \in G$ and $\psi$ is the action, is a permutation of $A$ (in other words, a bijective function from $A$ to itself) and so an element of $S_A$. We even get a homomorphism from $G$ to $S_A$ by the rule $g \mapsto m_g$. If for any pair $g, h \in G$ with $g \neq h$ we have $m_g \neq m_h$, in other words, if the homomorphism $g \mapsto m_g$ is injective, we say that the action is faithful. Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 329 20F18 – Nilpotent groups
329.1 classiﬁcation of ﬁnite nilpotent groups
Let $G$ be a finite group. The following are equivalent:
1. $G$ is nilpotent.
2. Every subgroup of $G$ is subnormal.
3. Every proper subgroup $H < G$ is properly contained in its normalizer.
4. Every maximal subgroup is normal.
5. Every Sylow subgroup is normal.
6. $G$ is a direct product of $p$-groups.
Version: 1 Owner: bwebste Author(s): bwebste
329.2
nilpotent group
We define the lower central series of a group $G$ to be the filtration of subgroups $G = G_1 \supset G_2 \supset \cdots$ defined inductively by:
$$G_1 := G, \qquad G_i := [G_{i-1}, G], \quad i > 1,$$
where $[G_{i-1}, G]$ denotes the subgroup of $G$ generated by all commutators of the form $hkh^{-1}k^{-1}$ where $h \in G_{i-1}$ and $k \in G$. The group $G$ is said to be nilpotent if $G_i = 1$ for some $i$.
Nilpotent groups can also be equivalently defined by means of upper central series. For a group $G$, the upper central series of $G$ is the filtration of subgroups $C_1 \subset C_2 \subset \cdots$ defined by setting $C_1$ to be the center of $G$, and inductively taking $C_i$ to be the unique subgroup of $G$ such that $C_i/C_{i-1}$ is the center of $G/C_{i-1}$, for each $i > 1$. The group $G$ is nilpotent if and only if $G = C_i$ for some $i$.
Nilpotent groups are related to nilpotent Lie algebras in that a Lie group is nilpotent as a group if and only if its corresponding Lie algebra is nilpotent. The analogy extends to solvable groups as well: every nilpotent group is solvable, because the upper central series is a filtration with abelian quotients.
Version: 3 Owner: djao Author(s): djao
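For small finite groups the lower central series can be computed by brute force. The sketch below does this for permutation groups given as sets of tuples. All function names are illustrative assumptions; the two test groups are the dihedral group of order 8 (realized inside $S_4$) and $S_3$.

```python
from itertools import permutations

def compose(p, q):
    """Composite p∘q of permutations stored as tuples (apply q first)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, n):
    """Subgroup of the symmetric group on n points generated by gens.
    For a finite group, closing under products of generators suffices."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

def commutator_subgroup(H, G, n):
    """[H, G]: generated by all h k h^{-1} k^{-1} with h in H, k in G."""
    comms = {compose(compose(h, k), compose(inverse(h), inverse(k)))
             for h in H for k in G}
    return generated_subgroup(comms, n)

def lower_central_series(G, n):
    """Return [G_1, G_2, ...] up to the point where the series stabilizes."""
    series = [set(G)]
    while True:
        nxt = commutator_subgroup(series[-1], G, n)
        if nxt == series[-1]:
            return series
        series.append(nxt)

def is_nilpotent(G, n):
    return len(lower_central_series(G, n)[-1]) == 1

# Dihedral group of order 8: rotation i -> i+1 and reflection i -> -i (mod 4).
D4 = generated_subgroup({(1, 2, 3, 0), (0, 3, 2, 1)}, 4)
S3 = set(permutations(range(3)))
print([len(H) for H in lower_central_series(D4, 4)])
print([len(H) for H in lower_central_series(S3, 3)])
```

The series for the dihedral group reaches the trivial subgroup, so that group is nilpotent, while the series for $S_3$ stabilizes at $A_3$, so $S_3$ is not.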
Chapter 330 20F22 – Other classes of groups deﬁned by subgroup chains
330.1 inverse limit
Let $\{G_i\}_{i=0}^{\infty}$ be a sequence of groups which are related by a chain of surjective homomorphisms $f_i : G_i \to G_{i-1}$ such that
$$G_0 \xleftarrow{\ f_1\ } G_1 \xleftarrow{\ f_2\ } G_2 \xleftarrow{\ f_3\ } G_3 \xleftarrow{\ f_4\ } \cdots$$
Definition 28. The inverse limit of $(G_i, f_i)$, denoted by $\varprojlim (G_i, f_i)$, or $\varprojlim G_i$, is the subset of $\prod_{i=0}^{\infty} G_i$ formed by elements satisfying
$$(g_0, g_1, g_2, g_3, \ldots), \quad \text{with } g_i \in G_i, \quad f_i(g_i) = g_{i-1}.$$
Note: The inverse limit of $G_i$ can be checked to be a subgroup of the product $\prod_{i=0}^{\infty} G_i$. See below for a more general definition.
Examples:
1. Let $p \in \mathbb{N}$ be a prime. Let $G_0 = \{0\}$ and $G_i = \mathbb{Z}/p^i\mathbb{Z}$. Define the connecting homomorphisms $f_i$, for $i \geq 2$, to be "reduction modulo $p^{i-1}$", i.e.
$$f_i : \mathbb{Z}/p^i\mathbb{Z} \to \mathbb{Z}/p^{i-1}\mathbb{Z}, \qquad f_i(x \bmod p^i) = x \bmod p^{i-1},$$
which are obviously surjective homomorphisms. The inverse limit of $(\mathbb{Z}/p^i\mathbb{Z}, f_i)$ is called the $p$-adic integers and denoted by
$$\mathbb{Z}_p = \varprojlim \mathbb{Z}/p^i\mathbb{Z}.$$
2. Let $E$ be an elliptic curve defined over $\mathbb{C}$. Let $p$ be a prime and for any natural number $n$ write $E[n]$ for the $n$-torsion group, i.e.
$$E[n] = \{Q \in E \mid n \cdot Q = O\}.$$
In this case we define $G_i = E[p^i]$, and
$$f_i : E[p^i] \to E[p^{i-1}], \qquad f_i(Q) = p \cdot Q.$$
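The inverse limit of $(E[p^i], f_i)$ is called the Tate module of $E$ and denoted
$$T_p(E) = \varprojlim E[p^i].$$
The concept of inverse limit can be defined in far more generality. Let $(S, \leq)$ be a directed set and let $\mathcal{C}$ be a category. Let $\{G_\alpha\}_{\alpha \in S}$ be a collection of objects in the category $\mathcal{C}$ and let
$$\{f_{\alpha,\beta} : G_\beta \to G_\alpha \mid \alpha, \beta \in S,\ \alpha \leq \beta\}$$
be a collection of morphisms satisfying:
1. For all $\alpha \in S$, $f_{\alpha,\alpha} = \mathrm{Id}_{G_\alpha}$, the identity morphism.
2. For all $\alpha, \beta, \gamma \in S$ such that $\alpha \leq \beta \leq \gamma$, we have $f_{\alpha,\gamma} = f_{\alpha,\beta} \circ f_{\beta,\gamma}$ (composition of morphisms).
Definition 29. The inverse limit of $(\{G_\alpha\}_{\alpha \in S}, \{f_{\alpha,\beta}\})$, denoted by
$$\varprojlim_{\alpha \in S} (G_\alpha, f_{\alpha,\beta}) \quad \text{or} \quad \varprojlim G_\alpha,$$
is defined to be the set of all $(g_\alpha) \in \prod_{\alpha \in S} G_\alpha$ such that for all $\alpha, \beta \in S$
$$\alpha \leq \beta \ \Rightarrow\ f_{\alpha,\beta}(g_\beta) = g_\alpha.$$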
For a good example of this more general construction, see inﬁnite Galois theory. Version: 6 Owner: alozano Author(s): alozano
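The compatibility condition $f_i(g_i) = g_{i-1}$ from the first example ($p$-adic integers) can be made concrete with truncated sequences. The following Python sketch is only an illustration; the names `truncated_padic` and `is_compatible` are hypothetical.

```python
def truncated_padic(x, p, depth):
    """Coordinates (x mod p^0, x mod p^1, ..., x mod p^depth) of an integer x,
    i.e. the first depth+1 components of its image in the inverse limit."""
    return [x % p**i for i in range(depth + 1)]

def is_compatible(seq, p):
    """Check f_i(g_i) = g_{i-1}, where f_i is reduction modulo p^(i-1)."""
    return all(seq[i] % p**(i - 1) == seq[i - 1] for i in range(1, len(seq)))

minus_one = truncated_padic(-1, 5, 4)
print(minus_one, is_compatible(minus_one, 5))

# A sequence that violates compatibility at the third coordinate:
print(is_compatible([0, 4, 23, 124], 5))
```

Every ordinary integer gives a compatible sequence in this way; the $p$-adic integers also contain compatible sequences that do not come from any single integer.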
Chapter 331 20F28 – Automorphism groups of groups
331.1 outer automorphism group
The outer automorphism group of a group is the quotient of its automorphism group by its inner automorphism group: Out(G) = Aut(G)/Inn(G). Version: 7 Owner: Thomas Heye Author(s): yark, apmxi
Chapter 332 20F36 – Braid groups; Artin groups
332.1 braid group
Consider two sets of $n$ points in $\mathbb{C}^2$, of the form $(1, 0), \ldots, (n, 0)$, and of the form $(1, 1), \ldots, (n, 1)$. We connect these two sets of points via a series of paths $f_i : I \to \mathbb{C}^2$, such that $f_i(t) \neq f_j(t)$ for $i \neq j$ and any $t \in [0, 1]$. Also, each $f_i$ may only intersect the planes $(z, 0)$ and $(z, 1)$ for $t = 0$ and $1$ respectively. Thus, the picture looks like a bunch of strings connecting the two sets of points, but possibly tangled. The path $f = (f_1, \ldots, f_n)$ determines a homotopy class $\mathbf{f}$, where we require homotopies to satisfy the same conditions on the $f_i$. Such a homotopy class $\mathbf{f}$ is called a braid on $n$ strands.
We can obtain a group structure on the set of braids on $n$ strands as follows. Multiplication of two braids $\mathbf{f}, \mathbf{g}$ is done by simply following $f$ first, then $g$, but doing each twice as fast. That is, $\mathbf{f} \cdot \mathbf{g}$ is the homotopy class of the path
$$fg(t) = \begin{cases} f(2t) & \text{if } 0 \leq t \leq 1/2 \\ g(2t - 1) & \text{if } 1/2 \leq t \leq 1 \end{cases}$$
where $f$ and $g$ are representatives for $\mathbf{f}$ and $\mathbf{g}$ respectively. Inverses are done by following the same braid backwards, and the identity element is the braid represented by straight lines down. The result is known as the braid group on $n$ strands; it is denoted by $B_n$.
The braid group determines a homomorphism $\phi : B_n \to S_n$, where $S_n$ is the symmetric group on $n$ letters. For $\mathbf{f} \in B_n$, we get an element of $S_n$ from the map sending $i \mapsto p_1(f_i(1))$, where $f$ is a representative of the homotopy class $\mathbf{f}$, and $p_1$ is the projection onto the first factor. This works because of our requirement on the points where the braids start and end, and since our homotopies fix basepoints. The kernel of $\phi$ consists of the braids that bring each strand back to its original position. This kernel gives us the pure braid group on $n$ strands, and is denoted by $P_n$. Hence, we have a short exact sequence
$$1 \to P_n \to B_n \to S_n \to 1.$$
We can also describe braid groups as certain fundamental groups, and in more generality. Let $M$ be a manifold. The configuration space of $n$ ordered points on $M$ is defined to be $F_n(M) = \{(a_1, \ldots, a_n) \in M^n \mid a_i \neq a_j \text{ for } i \neq j\}$. The group $S_n$ acts on $F_n(M)$ by permuting coordinates, and the corresponding quotient space $C_n(M) = F_n(M)/S_n$ is called the configuration space of $n$ unordered points on $M$. In the case that $M = \mathbb{C}$, we obtain the regular and pure braid groups as $\pi_1(C_n(M))$ and $\pi_1(F_n(M))$ respectively.
The group $B_n$ can be given the following presentation. The presentation was given in Artin's first paper [1] on the braid group. Label the strands 1 through $n$ as before. Let $\sigma_i$ be the braid that twists strands $i$ and $i + 1$, with $i$ passing beneath $i + 1$. Then the $\sigma_i$ generate $B_n$, and the only relations needed are
$$\sigma_i \sigma_j = \sigma_j \sigma_i \quad \text{for } |i - j| \geq 2,\ 1 \leq i, j \leq n - 1,$$
$$\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1} \quad \text{for } 1 \leq i \leq n - 2.$$
The pure braid group has a presentation with generators
$$a_{ij} = \sigma_{j-1} \sigma_{j-2} \cdots \sigma_{i+1}\, \sigma_i^2\, \sigma_{i+1}^{-1} \cdots \sigma_{j-2}^{-1} \sigma_{j-1}^{-1} \quad \text{for } 1 \leq i < j \leq n$$
and defining relations
$$a_{rs}^{-1} a_{ij} a_{rs} = \begin{cases} a_{ij} & \text{if } i < r < s < j \text{ or } r < s < i < j \\ a_{rj} a_{ij} a_{rj}^{-1} & \text{if } r < i = s < j \\ a_{rj} a_{sj} a_{ij} a_{sj}^{-1} a_{rj}^{-1} & \text{if } i = r < s < j \\ a_{rj} a_{sj} a_{rj}^{-1} a_{sj}^{-1} a_{ij} a_{sj} a_{rj} a_{sj}^{-1} a_{rj}^{-1} & \text{if } r < i < s < j \end{cases}$$
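The homomorphism $\phi : B_n \to S_n$ sends each generator $\sigma_i$ to the adjacent transposition exchanging $i$ and $i + 1$, so these transpositions must satisfy the braid relations. The Python sketch below checks this in $S_n$; it is a consistency check on the images only, not a computation in $B_n$ itself, and the helper names are assumptions made for illustration.

```python
def compose(p, q):
    """Composite p∘q of permutations stored as tuples (apply q first)."""
    return tuple(p[q[i]] for i in range(len(q)))

def adjacent_transposition(i, n):
    """Image of sigma_i in S_n: swaps positions i and i+1 (1-based labels),
    stored as a tuple acting on 0, ..., n-1."""
    perm = list(range(n))
    perm[i - 1], perm[i] = perm[i], perm[i - 1]
    return tuple(perm)

def satisfies_braid_relations(n):
    s = {i: adjacent_transposition(i, n) for i in range(1, n)}
    # sigma_i sigma_j = sigma_j sigma_i for |i - j| >= 2
    far = all(compose(s[i], s[j]) == compose(s[j], s[i])
              for i in s for j in s if abs(i - j) >= 2)
    # sigma_i sigma_{i+1} sigma_i = sigma_{i+1} sigma_i sigma_{i+1}
    near = all(compose(compose(s[i], s[i + 1]), s[i])
               == compose(compose(s[i + 1], s[i]), s[i + 1])
               for i in range(1, n - 1))
    return far and near

n = 5
identity = tuple(range(n))
s1 = adjacent_transposition(1, n)
print(satisfies_braid_relations(n))
print(compose(s1, s1) == identity)
```

The second check reflects the fact that $\sigma_i^2$ maps to the identity of $S_n$, so $\sigma_i^2$ lies in the pure braid group $P_n$; this is the element $a_{i,i+1}$ in the generators above.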
REFERENCES
1. E. Artin, Theorie der Zöpfe. Abh. Math. Sem. Univ. Hamburg 4 (1925), 42–72.
2. V.L. Hansen, Braids and Coverings. London Mathematical Society Student Texts 18. Cambridge University Press, 1989.
Version: 7 Owner: dublisk Author(s): dublisk
Chapter 333 20F55 – Reﬂection and Coxeter groups
333.1 cycle
Let $S$ be a set. A cycle is a permutation $f$ (bijective function of a set onto itself) such that there exist distinct elements $a_1, a_2, \ldots, a_k$ of $S$ such that $f(a_i) = a_{i+1}$ for $1 \leq i < k$ and $f(a_k) = a_1$, that is,
$$f(a_1) = a_2, \quad f(a_2) = a_3, \quad \ldots, \quad f(a_k) = a_1,$$
and $f(x) = x$ for any other element of $S$. This can also be pictured as
$$a_1 \mapsto a_2 \mapsto a_3 \mapsto \cdots \mapsto a_k \mapsto a_1$$
and $x \mapsto x$ for any other element $x \in S$, where $\mapsto$ represents the action of $f$. One of the basic results on symmetric groups says that any permutation of a finite set can be expressed as a product of disjoint cycles. Version: 6 Owner: drini Author(s): drini
333.2
dihedral group
The $n$th dihedral group, $D_n$, is the symmetry group of the regular $n$-sided polygon. The group consists of $n$ reflections, $n - 1$ rotations, and the identity transformation. Letting $\omega = \exp(2\pi i/n)$ denote a primitive $n$th root of unity, and assuming the polygon is centered at the origin, the rotations $R_k$, $k = 0, \ldots, n - 1$ (note: $R_0$ denotes the identity) are given by
$$R_k : z \mapsto \omega^k z, \quad z \in \mathbb{C},$$
and the reflections $M_k$, $k = 0, \ldots, n - 1$, by
$$M_k : z \mapsto \omega^k \bar{z}, \quad z \in \mathbb{C}.$$
The abstract group structure is given by
$$R_k R_l = R_{k+l}, \quad R_k M_l = M_{k+l}, \quad M_k M_l = R_{k-l}, \quad M_k R_l = M_{k-l},$$
where the addition and subtraction is carried out modulo $n$. The group can also be described in terms of generators and relations as
$$(M_0)^2 = (M_1)^2 = (M_1 M_0)^n = \mathrm{id}.$$
This means that $D_n$ is a rank-2 Coxeter group.
Since the group acts by linear transformations
$$(x, y) \mapsto (\hat{x}, \hat{y}), \quad (x, y) \in \mathbb{R}^2,$$
there is a corresponding action on polynomials $p \mapsto \hat{p}$, defined by
$$\hat{p}(\hat{x}, \hat{y}) = p(x, y), \quad p \in \mathbb{R}[x, y].$$
The polynomials left invariant by all the group transformations form an algebra. This algebra is freely generated by the following two basic invariants:
$$x^2 + y^2, \qquad x^n - \binom{n}{2} x^{n-2} y^2 + \cdots,$$
the latter polynomial being the real part of $(x + iy)^n$. It is easy to check that these two polynomials are invariant. The first polynomial is the squared distance of a point from the origin, and this is unaltered by rotations about the origin and by reflections in lines through the origin. The second polynomial is unaltered by a rotation through $2\pi/n$ radians, and is also invariant with respect to complex conjugation. These two transformations generate the $n$th dihedral group. Showing that these two invariants polynomially generate the full algebra of invariants is somewhat trickier, and is best done as an application of Chevalley's theorem regarding the invariants of a finite reflection group.
Version: 8 Owner: rmilson Author(s): rmilson
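The four multiplication rules for $R_k$ and $M_k$ can be verified numerically from the formulas $R_k(z) = \omega^k z$ and $M_k(z) = \omega^k \bar{z}$. The Python sketch below compares the composite maps on a few sample points (each map is real-linear, so agreement on two independent points already determines it); the function names and the tolerance are assumptions of this example.

```python
import cmath

def rotation(k, n):
    """R_k : z -> omega^k z with omega = exp(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return lambda z: w * z

def reflection(k, n):
    """M_k : z -> omega^k * conjugate(z)."""
    w = cmath.exp(2j * cmath.pi * k / n)
    return lambda z: w * z.conjugate()

def same_map(f, g, samples=(1 + 0j, 0.3 - 1.7j, -2 + 0.5j)):
    """Compare two maps on sample points, up to floating-point error."""
    return all(abs(f(z) - g(z)) < 1e-9 for z in samples)

def check_dihedral_relations(n):
    for k in range(n):
        for l in range(n):
            R_k, R_l = rotation(k, n), rotation(l, n)
            M_k, M_l = reflection(k, n), reflection(l, n)
            ok = (same_map(lambda z: R_k(R_l(z)), rotation((k + l) % n, n))
                  and same_map(lambda z: M_k(M_l(z)), rotation((k - l) % n, n))
                  and same_map(lambda z: R_k(M_l(z)), reflection((k + l) % n, n))
                  and same_map(lambda z: M_k(R_l(z)), reflection((k - l) % n, n)))
            if not ok:
                return False
    return True

print(check_dihedral_relations(5))
```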
Chapter 334 20F65 – Geometric group theory
334.1 groups that act freely on trees are free
Let $X$ be a tree, and $\Gamma$ a group acting freely and faithfully by graph automorphisms on $X$. Then $\Gamma$ is a free group.
Since $\Gamma$ acts freely on $X$, the quotient graph $X/\Gamma$ is well-defined, and $X$ is the universal cover of $X/\Gamma$ since $X$ is contractible. Thus $\Gamma \cong \pi_1(X/\Gamma)$. Since any connected graph is homotopy equivalent to a wedge of circles, and the fundamental group of such a space is free by Van Kampen's theorem, $\Gamma$ is free.
Version: 3 Owner: bwebste Author(s): bwebste
Chapter 335 20F99 – Miscellaneous
335.1 perfect group
A group G is called perfect if G = [G, G], where [G, G] is the derived subgroup of G, or equivalently, if the abelianization of G is trivial. Version: 1 Owner: bwebste Author(s): bwebste
Chapter 336 20G15 – Linear algebraic groups over arbitrary ﬁelds
336.1 Nagao’s theorem
For any integral domain $k$, the group of $n \times n$ invertible matrices with coefficients in $k[t]$ is the amalgamated free product of invertible matrices over $k$ and invertible upper triangular matrices over $k[t]$, amalgamated over the upper triangular matrices of $k$. More compactly,
$$GL_n(k[t]) \cong GL_n(k) *_{B(k)} B(k[t]).$$
Version: 3 Owner: bwebste Author(s): bwebste
336.2
computation of the order of GL(n, Fq )
$GL(n, \mathbb{F}_q)$ is the group of $n \times n$ matrices over a finite field $\mathbb{F}_q$ with non-zero determinant. Here is a proof that $|GL(n, \mathbb{F}_q)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$. Each element $A \in GL(n, \mathbb{F}_q)$ is given by a collection of $n$ $\mathbb{F}_q$-linearly independent vectors, namely its columns. If one chooses the first column vector of $A$ from $(\mathbb{F}_q)^n$ there are $q^n$ choices, but one can't choose the zero vector since this would make the determinant of $A$ zero. So there are really only $(q^n - 1)$ choices. To choose an $i$th vector from $(\mathbb{F}_q)^n$ which is linearly independent from the $(i - 1)$ already chosen linearly independent vectors $\{V_1, \ldots, V_{i-1}\}$ one must choose a vector not in the span of $\{V_1, \ldots, V_{i-1}\}$. There are $q^{i-1}$ vectors in this span, so the number of choices is $(q^n - q^{i-1})$. Thus the number of linearly independent collections of $n$ vectors in $(\mathbb{F}_q)^n$ is: $(q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$. Version: 5 Owner: benjaminfjones Author(s): benjaminfjones
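The counting argument can be compared against a brute-force count in the smallest cases. The sketch below evaluates the product formula and, for $n = 2$ and prime $q = p$ only (so that $\mathbb{F}_p$ is arithmetic modulo $p$), counts the $2 \times 2$ matrices with non-zero determinant directly. The function names are illustrative.

```python
from itertools import product

def gl_order_formula(n, q):
    """(q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    order = 1
    for i in range(n):
        order *= q**n - q**i
    return order

def count_gl2_brute_force(p):
    """Count 2x2 matrices [[a, b], [c, d]] over Z/pZ (p prime) with ad - bc != 0."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)

for p in (2, 3, 5):
    print(p, gl_order_formula(2, p), count_gl2_brute_force(p))
```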
336.3
general linear group
Given a vector space $V$, the general linear group $GL(V)$ is defined to be the group of invertible linear transformations from $V$ to $V$. The group operation is defined by composition: given $T : V \to V$ and $T' : V \to V$ in $GL(V)$, the product $TT'$ is just the composition of the maps $T$ and $T'$. If $V = \mathbb{F}^n$ for some field $\mathbb{F}$, then the group $GL(V)$ is often denoted $GL(n, \mathbb{F})$ or $GL_n(\mathbb{F})$. In this case, if one identifies each linear transformation $T : V \to V$ with its matrix with respect to the standard basis, the group $GL(n, \mathbb{F})$ becomes the group of invertible $n \times n$ matrices with entries in $\mathbb{F}$, under the group operation of matrix multiplication. Version: 3 Owner: djao Author(s): djao
336.4
order of the general linear group over a ﬁnite ﬁeld
$GL(n, \mathbb{F}_q)$ is a finite group when $\mathbb{F}_q$ is a finite field with $q$ elements. Furthermore, $|GL(n, \mathbb{F}_q)| = (q^n - 1)(q^n - q) \cdots (q^n - q^{n-1})$. Version: 16 Owner: benjaminfjones Author(s): benjaminfjones
336.5
special linear group
Given a vector space $V$, the special linear group $SL(V)$ is defined to be the subgroup of the general linear group $GL(V)$ consisting of all invertible linear transformations $T : V \to V$ in $GL(V)$ that have determinant 1. If $V = \mathbb{F}^n$ for some field $\mathbb{F}$, then the group $SL(V)$ is often denoted $SL(n, \mathbb{F})$ or $SL_n(\mathbb{F})$, and if one identifies each linear transformation with its matrix with respect to the standard basis, then $SL(n, \mathbb{F})$ consists of all $n \times n$ matrices with entries in $\mathbb{F}$ that have determinant 1. Version: 2 Owner: djao Author(s): djao
Chapter 337 20G20 – Linear algebraic groups over the reals, the complexes, the quaternions
337.1 orthogonal group
Let $Q$ be a nondegenerate symmetric bilinear form over the real vector space $V = \mathbb{R}^n$. A linear transformation $T : V \to V$ is said to preserve $Q$ if $Q(Tx, Ty) = Q(x, y)$ for all vectors $x, y \in V$. The subgroup of the general linear group $GL(V)$ consisting of all linear transformations that preserve $Q$ is called the orthogonal group with respect to $Q$, and denoted $O(n, Q)$. If $Q$ is also positive definite (i.e., $Q$ is an inner product), then $O(n, Q)$ is isomorphic to the group of invertible linear transformations that preserve the standard inner product on $\mathbb{R}^n$, and in this case it is usually denoted $O(n)$. One can show that a transformation $T$ is in $O(n)$ if and only if $T^{-1} = T^{\mathsf{T}}$ (the inverse of $T$ equals the transpose of $T$). Version: 2 Owner: djao Author(s): djao
Chapter 338 20G25 – Linear algebraic groups over local ﬁelds and their integers
338.1 Ihara’s theorem
Let $\Gamma$ be a discrete, torsion-free subgroup of $SL_2(\mathbb{Q}_p)$ (where $\mathbb{Q}_p$ is the field of $p$-adic numbers). Then $\Gamma$ is free.
[Proof, or a sketch thereof] There exists a $(p + 1)$-regular tree $X$ on which $SL_2(\mathbb{Q}_p)$ acts, with stabilizer $SL_2(\mathbb{Z}_p)$ (here, $\mathbb{Z}_p$ denotes the ring of $p$-adic integers). Since $\mathbb{Z}_p$ is compact in its profinite topology, so is $SL_2(\mathbb{Z}_p)$. Thus, $SL_2(\mathbb{Z}_p) \cap \Gamma$ must be compact, discrete and torsion-free. Since compact and discrete implies finite, the only such group is trivial. Thus, $\Gamma$ acts freely on $X$. Since groups acting freely on trees are free, $\Gamma$ is free.
Version: 6 Owner: bwebste Author(s): bwebste
Chapter 339 20G40 – Linear algebraic groups over ﬁnite ﬁelds
339.1 SL2(F3)
The special linear group over the ﬁnite ﬁeld F3 is represented by SL2 (F3 ) and consists of the 2 × 2 invertible matrices with determinant equal to 1 and whose entries belong to F3 . Version: 6 Owner: drini Author(s): drini, apmxi
Chapter 340 20J06 – Cohomology of groups
340.1 group cohomology
Let $G$ be a group and let $M$ be a (left) $G$-module. The 0th cohomology group of the $G$-module $M$ is
$$H^0(G, M) = \{m \in M : \forall \sigma \in G,\ \sigma m = m\}$$
which is the set of elements of $M$ which are $G$-invariant, also denoted by $M^G$.
A map $\phi : G \to M$ is said to be a crossed homomorphism (or 1-cocycle) if
$$\phi(\alpha\beta) = \phi(\alpha) + \alpha\phi(\beta)$$
for all $\alpha, \beta \in G$. If we fix $m \in M$, the map $\rho : G \to M$ defined by
$$\rho(\alpha) = \alpha m - m$$
is clearly a crossed homomorphism, said to be principal (or 1-coboundary). We define the following groups:
$$Z^1(G, M) = \{\phi : G \to M : \phi \text{ is a 1-cocycle}\}$$
$$B^1(G, M) = \{\rho : G \to M : \rho \text{ is a 1-coboundary}\}$$
Finally, the 1st cohomology group of the $G$-module $M$ is defined to be the quotient group:
$$H^1(G, M) = Z^1(G, M)/B^1(G, M)$$
The following proposition is very useful when trying to compute cohomology groups:
Proposition 1. Let $G$ be a group and let $A, B, C$ be $G$-modules related by an exact sequence:
$$0 \to A \to B \to C \to 0$$
Then there is a long exact sequence in cohomology:
$$0 \to H^0(G, A) \to H^0(G, B) \to H^0(G, C) \to H^1(G, A) \to H^1(G, B) \to H^1(G, C)$$
In general, the cohomology groups $H^n(G, M)$ can be defined as follows:
Definition 30. Define $C^0(G, M) = M$ and for $n \geq 1$ define the additive group:
$$C^n(G, M) = \{\phi : G^n \to M\}$$
The elements of $C^n(G, M)$ are called $n$-cochains. Also, for $n \geq 0$ define the $n$th coboundary homomorphism $d_n : C^n(G, M) \to C^{n+1}(G, M)$:
$$d_n(f)(g_1, \ldots, g_{n+1}) = g_1 \cdot f(g_2, \ldots, g_{n+1}) + \sum_{i=1}^{n} (-1)^i f(g_1, \ldots, g_{i-1}, g_i g_{i+1}, g_{i+2}, \ldots, g_{n+1}) + (-1)^{n+1} f(g_1, \ldots, g_n)$$
Let $Z^n(G, M) = \ker d_n$ for $n \geq 0$, the set of $n$-cocycles. Also, let $B^0(G, M) = 0$ (the trivial group) and for $n \geq 1$ let $B^n(G, M) = \operatorname{image} d_{n-1}$, the set of $n$-coboundaries. Finally we define the $n$th cohomology group of $G$ with coefficients in $M$ to be
$$H^n(G, M) = Z^n(G, M)/B^n(G, M)$$
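To see that the general definition agrees with the explicit descriptions of $H^0$ and $H^1$ given earlier, it may help to write out the coboundary formula for $n = 0$ and $n = 1$. The following is a direct substitution into the formula for $d_n$, with $m \in M = C^0(G, M)$ and $\varphi \in C^1(G, M)$:

```latex
\begin{align*}
d_0(m)(g) &= g \cdot m - m, \\
d_1(\varphi)(g_1, g_2) &= g_1 \cdot \varphi(g_2) - \varphi(g_1 g_2) + \varphi(g_1).
\end{align*}
```

Thus $\ker d_0$ is the set of $G$-invariant elements $M^G$, the image of $d_0$ consists of the principal crossed homomorphisms, and $d_1(\varphi) = 0$ is exactly the crossed homomorphism condition $\varphi(g_1 g_2) = \varphi(g_1) + g_1 \varphi(g_2)$.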
REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. James Milne, Elliptic Curves, online course notes.
3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.
Version: 4 Owner: alozano Author(s): alozano
340.2
stronger Hilbert theorem 90
Let $K$ be a field and let $\bar{K}$ be an algebraic closure of $K$. By $\bar{K}^+$ we denote the abelian group $(\bar{K}, +)$ and similarly $\bar{K}^* = (\bar{K} \setminus \{0\}, *)$ (here the operation is multiplication). Also we let
$$G_{\bar{K}/K} = \mathrm{Gal}(\bar{K}/K)$$
be the absolute Galois group of $K$.
Theorem 3 (Hilbert 90). Let $K$ be a field.
1. $H^1(G_{\bar{K}/K}, \bar{K}^+) = 0$
2. $H^1(G_{\bar{K}/K}, \bar{K}^*) = 0$
3. If $\mathrm{char}(K)$, the characteristic of $K$, does not divide $m$ (or $\mathrm{char}(K) = 0$) then
$$H^1(G_{\bar{K}/K}, \mu_m) \cong K^*/K^{*m}$$
where $\mu_m$ denotes the set of all $m$th roots of unity.
REFERENCES
1. J.P. Serre, Galois Cohomology, Springer-Verlag, New York.
2. J.P. Serre, Local Fields, Springer-Verlag, New York.
Version: 2 Owner: alozano Author(s): alozano
Chapter 341 20J15 – Category of groups
341.1 variety of groups
A variety of groups is the class of groups $G$ such that all elements $x_1, \ldots, x_n \in G$ satisfy a set of equationally defined relations
$$r_i(x_1, \ldots, x_n) = 1 \quad \forall i \in I,$$
where $I$ is an index set. For example, abelian groups are a variety defined by the equations $\{[x_1, x_2] = 1\}$, where $[x, y] = xyx^{-1}y^{-1}$. Nilpotent groups of class $< c$ are a variety defined by $\{[[\cdots[[x_1, x_2], x_3] \cdots], x_c] = 1\}$. Analogously, solvable groups of length $< c$ are a variety. Abelian groups are a special case of both of these. Groups of exponent $n$ are a variety, defined by $\{x_1^n = 1\}$.
A variety of groups is a full subcategory of the category of groups, and there is a free group on any set of elements in the variety, which is the usual free group modulo the relations of the variety applied to all elements. This satisfies the usual universal property of the free group on groups in the variety, and is thus left adjoint to the forgetful functor to the category of sets. In the variety of abelian groups, we get back the usual free abelian groups. In the variety of groups of exponent $n$, we get the Burnside groups. Version: 1 Owner: bwebste Author(s): bwebste
Chapter 342 20K01 – Finite abelian groups
342.1 Schinzel’s theorem
Let $a \in \mathbb{Q}$, not zero or 1 or $-1$. For any prime $p$ which does not divide the numerator or denominator of $a$ in reduced form, $a$ can be viewed as an element of the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^\times$. Let $n_p$ be the order of this element in the multiplicative group. Then the set of $n_p$ over all such primes has finite complement in the set of positive integers. One can generalize this as follows: if $K$ is a number field, choose $a \in K$ not zero or a root of unity. Then for any finite place (discrete valuation) $p$ with $v_p(a) = 0$, we can view $a$ as an element of the residue field at $p$, and take the order $n_p$ of this element in the multiplicative group. Then the set of $n_p$ over all such places has finite complement in the set of positive integers. Silverman also generalized this to elliptic curves over number fields. References to come soon. Version: 4 Owner: mathcam Author(s): Manoj, nerdy2
Chapter 343 20K10 – Torsion groups, primary groups and generalized primary groups
343.1
torsion
Definition 31. The torsion of a group $G$ is the set $\mathrm{Tor}(G) = \{g \in G : g^n = e \text{ for some positive integer } n\}$.
Definition 32. A group is said to be torsion-free if $\mathrm{Tor}(G) = \{e\}$, i.e. the torsion consists only of the identity element.
Definition 33. If $G$ is abelian then $\mathrm{Tor}(G)$ is a subgroup (the torsion group) of $G$.
Example 18 (Torsion of a cyclic group). For a finite cyclic group $\mathbb{Z}_p$, $\mathrm{Tor}(\mathbb{Z}_p) = \mathbb{Z}_p$. In general, if $G$ is a finite group then $\mathrm{Tor}(G) = G$. Version: 2 Owner: mhale Author(s): mhale
Chapter 344 20K25 – Direct sums, direct products, etc.
344.1 direct product of groups
The external direct product $G \times H$ of two groups $G$ and $H$ is defined to be the set of ordered pairs $(g, h)$, with $g \in G$ and $h \in H$. The group operation is defined by
$$(g, h)(g', h') = (gg', hh').$$
It can be shown that $G \times H$ obeys the group axioms. More generally, we can define the external direct product of $n$ groups, in the obvious way. Let $G = G_1 \times \cdots \times G_n$ be the set of all ordered $n$-tuples $\{(g_1, g_2, \ldots, g_n) \mid g_i \in G_i\}$ and define the group operation by componentwise multiplication as before. Version: 4 Owner: vitriol Author(s): vitriol
Chapter 345 20K99 – Miscellaneous
345.1 Klein 4group
The Klein 4-group is the subgroup $V$ (Vierergruppe) of $S_4$ (see symmetric group) consisting of the following 4 permutations: $(), (12)(34), (13)(24), (14)(23)$ (see cycle notation). This is an abelian group, isomorphic to the product $\mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. The group is named after Felix Klein, a pioneering figure in the field of geometric group theory. The Klein 4-group enjoys a number of interesting properties, some of which are listed below.
1. It is the group of automorphisms of the graph consisting of two disjoint edges that map each edge to itself.
2. It is the unique 4-element group with the property that every element is its own inverse.
3. It is the symmetry group of a planar ellipse.
4. Consider the action of $S_4$, the permutation group of 4 elements, on the set of partitions of $\{1, 2, 3, 4\}$ into two sets of two elements. There are 3 such partitions, which we denote by $(12, 34)$, $(13, 24)$, $(14, 23)$. Thus, the action of $S_4$ on these partitions induces a homomorphism from $S_4$ to $S_3$; the kernel is the Klein 4-group. This homomorphism is quite exceptional, and corresponds to the fact that $A_4$ (the alternating group) is not a simple group (notice that $V$ is actually a subgroup of $A_4$). All other alternating groups are simple.
5. A more geometric way to see the above is the following: $S_4$ is the group of symmetries of a tetrahedron. There is an induced action of $S_4$ on the six edges of the tetrahedron. Observing that this action preserves incidence relations one gets an action of $S_4$ on the three pairs of opposite edges (see figure).
6. It is the symmetry group of the Riemannian curvature tensor.
[Figure: tetrahedron with vertices labeled 1, 2, 3, 4]
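The homomorphism from $S_4$ to $S_3$ described in property 4 can be computed explicitly. In the Python sketch below the four points are labeled 0 to 3 rather than 1 to 4, and the names `PARTITIONS`, `act` and `induced_permutation` are assumptions made for this illustration.

```python
from itertools import permutations

# The three partitions of {0, 1, 2, 3} into two blocks of size two.
PARTITIONS = [
    frozenset({frozenset({0, 1}), frozenset({2, 3})}),
    frozenset({frozenset({0, 2}), frozenset({1, 3})}),
    frozenset({frozenset({0, 3}), frozenset({1, 2})}),
]

def act(perm, partition):
    """Apply a permutation (tuple of images) to every block of a partition."""
    return frozenset(frozenset(perm[x] for x in block) for block in partition)

def induced_permutation(perm):
    """Image of perm in S_3: how it permutes the three partitions."""
    return tuple(PARTITIONS.index(act(perm, P)) for P in PARTITIONS)

S4 = list(permutations(range(4)))
kernel = sorted(g for g in S4 if induced_permutation(g) == (0, 1, 2))
image = {induced_permutation(g) for g in S4}
print(kernel)
print(len(image))
```

The kernel consists of the identity and the three products of two disjoint transpositions, and the image is all of $S_3$.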
Version: 7 Owner: rmilson Author(s): Dr Absentius, rmilson, imran
345.2
divisible group
An abelian group $D$ is said to be divisible if for any $x \in D$, $n \in \mathbb{Z}^+$, there exists an element $x' \in D$ such that $nx' = x$. Some noteworthy facts:
• An abelian group is injective (as a $\mathbb{Z}$-module) if and only if it is divisible.
• Every abelian group is isomorphic to a subgroup of a divisible group.
• Any divisible abelian group is isomorphic to the direct sum of its torsion subgroup and $n$ copies of the group of rationals (for some cardinal number $n$).
Version: 4 Owner: mathcam Author(s): mathcam
345.3
example of divisible group
Let $G$ denote the group of rational numbers taking the operation to be addition. Then for any $\frac{p}{q} \in G$ and $n \in \mathbb{Z}^+$, we have $\frac{p}{nq} \in G$ satisfying $n \cdot \frac{p}{nq} = \frac{p}{q}$, so the group is divisible. Version: 1 Owner: mathcam Author(s): mathcam
345.4
locally cyclic group
A locally cyclic (or generalized cyclic) group is a group in which any pair of elements generates a cyclic subgroup.
Every locally cyclic group is abelian. If $G$ is a locally cyclic group, then every finite subset of $G$ generates a cyclic subgroup. Therefore, the only finitely generated locally cyclic groups are the cyclic groups themselves. The group $(\mathbb{Q}, +)$ is an example of a locally cyclic group that is not cyclic. Subgroups and quotients of locally cyclic groups are also locally cyclic. A group is locally cyclic if and only if its lattice of subgroups is distributive. Version: 10 Owner: yark Author(s): yark
Chapter 346 20Kxx – Abelian groups
346.1 abelian group
Let $(G, *)$ be a group. If for any $a, b \in G$ we have $a * b = b * a$, we say that the group is abelian. Sometimes the expression commutative group is used, but this is less frequent. Abelian groups have several interesting properties.
Theorem 4. If $\varphi : G \to G$ defined by $\varphi(x) = x^2$ is a homomorphism, then $G$ is abelian.
Proof. If $\varphi$ is a homomorphism, then $(xy)^2 = \varphi(xy) = \varphi(x)\varphi(y) = x^2 y^2$, that is, $xyxy = xxyy$. Left-multiplying by $x^{-1}$ and right-multiplying by $y^{-1}$ we are led to $yx = xy$ and thus the group is abelian. QED
Theorem 5. Any subgroup of an abelian group is normal.
Proof. Let $H$ be a subgroup of the abelian group $G$. Since $ah = ha$ for any $a \in G$ and any $h \in H$ we get $aH = Ha$. That is, $H$ is normal in $G$. QED
Theorem 6. Quotient groups of abelian groups are also abelian.
Proof. Let $H$ be a subgroup of $G$. Since $G$ is abelian, $H$ is normal and we can form the quotient group $G/H$ whose elements are the equivalence classes for $a \sim b$ if $ab^{-1} \in H$. The operation on the quotient group is given by $aH \cdot bH = (ab)H$. But $bH \cdot aH = (ba)H = (ab)H$, therefore the quotient group is also commutative. QED
Version: 12 Owner: drini Author(s): drini, yark, akrowne, apmxi
Chapter 347 20M10 – General structure theory
347.1 existence of maximal semilattice decomposition
Let $S$ be a semigroup. A maximal semilattice decomposition for $S$ is a surjective homomorphism $\phi : S \to \Gamma$ onto a semilattice $\Gamma$ with the property that any other semilattice decomposition factors through $\phi$. So if $\phi' : S \to \Gamma'$ is any other semilattice decomposition of $S$, then there is a homomorphism $\Gamma \to \Gamma'$ such that the following diagram commutes:
[Diagram: commutative triangle with $\phi : S \to \Gamma$, $\phi' : S \to \Gamma'$, and the induced homomorphism $\Gamma \to \Gamma'$]
Proposition 14. Every semigroup has a maximal semilattice decomposition.
Recall that each semilattice decomposition determines a semilattice congruence. If $\{\rho_i \mid i \in I\}$ is the family of all semilattice congruences on $S$, then define $\rho = \bigcap_{i \in I} \rho_i$. (Here, we consider the congruences as subsets of $S \times S$, and take their intersection as sets.)
It is easy to see that $\rho$ is also a semilattice congruence, which is contained in all other semilattice congruences. Therefore each of the homomorphisms $S \to S/\rho_i$ factors through $S \to S/\rho$. Version: 2 Owner: mclase Author(s): mclase
347.2
semilattice decomposition of a semigroup
A semigroup $S$ has a semilattice decomposition if we can write $S = \bigcup_{\gamma \in \Gamma} S_\gamma$ as a disjoint union of subsemigroups, indexed by elements of a semilattice $\Gamma$, with the additional condition that $x \in S_\alpha$ and $y \in S_\beta$ implies $xy \in S_{\alpha\beta}$.
Semilattice decompositions arise from homomorphisms of semigroups onto semilattices. If $\phi : S \to \Gamma$ is a surjective homomorphism, then it is easy to see that we get a semilattice decomposition by putting $S_\gamma = \phi^{-1}(\gamma)$ for each $\gamma \in \Gamma$. Conversely, every semilattice decomposition defines a map from $S$ to the indexing set $\Gamma$ which is easily seen to be a homomorphism.
A third way to look at semilattice decompositions is to consider the congruence $\rho$ defined by the homomorphism $\phi : S \to \Gamma$. Because $\Gamma$ is a semilattice, $\phi(x^2) = \phi(x)$ for all $x$, and so $\rho$ satisfies the constraint that $x \mathrel{\rho} x^2$ for all $x \in S$. Also, $\phi(xy) = \phi(yx)$ so that $xy \mathrel{\rho} yx$ for all $x, y \in S$. A congruence $\rho$ which satisfies these two conditions is called a semilattice congruence. Conversely, a semilattice congruence $\rho$ on $S$ gives rise to a homomorphism from $S$ to a semilattice $S/\rho$. The $\rho$-classes are the components of the decomposition. Version: 3 Owner: mclase Author(s): mclase
347.3
simple semigroup
Let $S$ be a semigroup. If $S$ has no ideals other than itself, then $S$ is said to be simple. If $S$ has no left ideals [resp. right ideals] other than itself, then $S$ is said to be left simple [resp. right simple]. Right simple and left simple are stronger conditions than simple. A semigroup $S$ is left simple if and only if $Sa = S$ for all $a \in S$. A semigroup is both left and right simple if and only if it is a group.
If $S$ has a zero element $\theta$, then $0 = \{\theta\}$ is always an ideal of $S$, so $S$ is not simple (unless it has only one element). So in studying semigroups with a zero, a slightly weaker definition is required. Let $S$ be a semigroup with a zero. Then $S$ is zero simple, or 0-simple, if the following conditions hold:
• $S^2 \neq 0$
• $S$ has no ideals except $0$ and $S$ itself
The condition $S^2 \neq 0$ really only eliminates one semigroup: the 2-element null semigroup. Excluding this semigroup makes parts of the structure theory of semigroups cleaner. Version: 1 Owner: mclase Author(s): mclase
Chapter 348 20M12 – Ideal theory
348.1 Rees factor
Let I be an ideal of a semigroup S. Define a congruence ∼ by x ∼ y iff x = y or x, y ∈ I. Then the Rees factor of S by I is the quotient S/∼. As a matter of notation, the congruence ∼ is normally suppressed, and the quotient is simply written S/I. Note that a Rees factor always has a zero element. Intuitively, the quotient identifies all elements of I with one another, and the resulting element is a zero element. Version: 1 Owner: mclase Author(s): mclase
348.2
ideal
Let S be a semigroup. An ideal of S is a nonempty subset of S which is closed under multiplication on either side by elements of S. Formally, I is an ideal of S if I is nonempty, and for all x ∈ I and s ∈ S, we have sx ∈ I and xs ∈ I.
One-sided ideals are defined similarly. A nonempty subset A of S is a left ideal (resp. right ideal) of S if for all a ∈ A and s ∈ S, we have sa ∈ A (resp. as ∈ A).
A principal left ideal of S is a left ideal generated by a single element. If a ∈ S, then the principal left ideal of S generated by a is $S^1 a = Sa \cup \{a\}$. (The notation $S^1$ is explained in the entry on adjoining an identity to a semigroup.) Similarly, the principal right ideal generated by a is $aS^1 = aS \cup \{a\}$. The notations L(a) and R(a) are also common for the principal left and right ideals generated by a, respectively.
A principal ideal of S is an ideal generated by a single element. The ideal generated by a is $S^1 a S^1 = SaS \cup Sa \cup aS \cup \{a\}$. The notation $J(a) = S^1 a S^1$ is also common.
Version: 5 Owner: mclase Author(s): mclase
Chapter 349 20M14 – Commutative semigroups
349.1 Archimedean semigroup
Let S be a commutative semigroup. We say an element x divides an element y, written $x \mid y$, if there is an element z such that xz = y. An Archimedean semigroup S is a commutative semigroup with the property that for all x, y ∈ S there is a natural number n such that $x \mid y^n$. This is related to the Archimedean property of positive real numbers $\mathbb{R}^+$: if x, y > 0 then there is a natural number n such that x < ny. Except that the notation is additive rather than multiplicative, this is the same as saying that $(\mathbb{R}^+, +)$ is an Archimedean semigroup. Version: 1 Owner: mclase Author(s): mclase
349.2
commutative semigroup
A semigroup S is commutative if the deﬁning binary operation is commutative. That is, for all x, y ∈ S, the identity xy = yx holds. Although the term Abelian semigroup is sometimes used, it is more common simply to refer to such semigroups as commutative semigroups. A monoid which is also a commutative semigroup is called a commutative monoid. Version: 1 Owner: mclase Author(s): mclase
Chapter 350 20M20 – Semigroups of transformations, etc.
350.1 semigroup of transformations
Let X be a set. A transformation of X is a function from X to X. If α and β are transformations on X, then their product αβ is defined (writing functions on the right) by (x)(αβ) = ((x)α)β. With this definition, the set of all transformations on X becomes a semigroup, the full semigroup of transformations on X, denoted $T_X$. More generally, a semigroup of transformations is any subsemigroup of a full semigroup of transformations.
When X is finite, say $X = \{x_1, x_2, \ldots, x_n\}$, then the transformation α which maps $x_i$ to $y_i$ (with $y_i \in X$, of course) is often written:
\[ \alpha = \begin{pmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \end{pmatrix} \]
With this notation it is quite easy to calculate products. For example, if X = {1, 2, 3, 4}, then
\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 3 & 2 & 3 \end{pmatrix} \]
When X is infinite, say X = {1, 2, 3, . . . }, then this notation is still useful for illustration in cases where the transformation pattern is apparent. For example, if $\alpha \in T_X$ is given by $\alpha : n \mapsto n + 1$, we can write
\[ \alpha = \begin{pmatrix} 1 & 2 & 3 & 4 & \cdots \\ 2 & 3 & 4 & 5 & \cdots \end{pmatrix} \]
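The product rule for transformations written on the right can be checked mechanically. The following is a minimal sketch, assuming a transformation of a finite set is stored as a Python dict from points to images; the helper name `compose` is illustrative and not part of the entry.

```python
def compose(alpha, beta):
    """Product alpha*beta with functions written on the right:
    (x)(alpha beta) = ((x)alpha)beta, i.e. apply alpha first, then beta."""
    return {x: beta[alpha[x]] for x in alpha}

# The two transformations of X = {1, 2, 3, 4} from the worked example.
alpha = {1: 3, 2: 2, 3: 1, 4: 2}
beta = {1: 2, 2: 3, 3: 3, 4: 4}

product = compose(alpha, beta)
print(product)  # bottom row of the product, read off in the order 1, 2, 3, 4
```

Reading the values of `product` in the order 1, 2, 3, 4 gives the bottom row of the two-row notation for αβ.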
Version: 3 Owner: mclase Author(s): mclase
Chapter 351 20M30 – Representation of semigroups; actions of semigroups on sets
351.1 counting theorem
Given a group action of a finite group G on a set X, the following expression gives the number of distinct orbits:
\[ \frac{1}{|G|} \sum_{g \in G} \mathrm{stab}_g(X) \]
where $\mathrm{stab}_g(X)$ is the number of elements fixed by the action of g.
Version: 8 Owner: mathcam Author(s): Larry Hammick, vitriol
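As an illustration of the counting theorem, the sketch below counts orbits in two ways for a small example chosen here (it is not taken from the entry): the cyclic group of rotations of order 4 acting on the 16 black/white colorings of 4 beads. The names `rotate`, `fixed_counts`, `orbits_by_formula` and `orbits_direct` are hypothetical helpers.

```python
from itertools import product

def rotate(coloring, k):
    """Rotate a tuple of beads by k positions (the action of the rotation k)."""
    return coloring[k:] + coloring[:k]

n = 4
group = range(n)                      # rotation by k positions, k = 0, ..., n-1
X = list(product("BW", repeat=n))     # all colorings of n beads with 2 colors

# stab_g(X): the number of colorings fixed by each rotation g.
fixed_counts = [sum(1 for x in X if rotate(x, g) == x) for g in group]

# Number of orbits according to the counting theorem.
orbits_by_formula = sum(fixed_counts) // len(group)

# Number of orbits found directly, by collecting each orbit as a set.
orbits_direct = len({frozenset(rotate(x, g) for g in group) for x in X})

print(fixed_counts, orbits_by_formula, orbits_direct)
```

Both counts agree, as the theorem predicts; the direct count is only practical for small X, which is the point of the formula.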
351.2
example of group action
Let a, b, c be integers and let [a, b, c] denote the mapping
\[ [a, b, c] : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}, \quad (x, y) \mapsto ax^2 + bxy + cy^2. \]
Let G be the group of 2 × 2 integer matrices $A = (a_{ij})$ with det A = ±1, and let A ∈ G. The substitution ${}^t(x\ y) \mapsto A \cdot {}^t(x\ y)$ leads to
\[ [a, b, c](a_{11}x + a_{12}y,\ a_{21}x + a_{22}y) = a'x^2 + b'xy + c'y^2, \]
where
\begin{align*}
a' &= a \cdot a_{11}^2 + b \cdot a_{11} a_{21} + c \cdot a_{21}^2 \\
b' &= 2a \cdot a_{11} a_{12} + 2c \cdot a_{21} a_{22} + b\,(a_{11}a_{22} + a_{12}a_{21}) \tag{351.2.1} \\
c' &= a \cdot a_{12}^2 + b \cdot a_{12} a_{22} + c \cdot a_{22}^2
\end{align*}
So we define $[a, b, c] * A := [a', b', c']$ to be the binary quadratic form with coefficients a', b', c' of $x^2$, $xy$, $y^2$, respectively, as in (351.2.1). Putting in $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ we have [a, b, c] ∗ A = [a, b, c] for any binary quadratic form [a, b, c].
Now let B be another matrix in G. We must show that [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B. Set $[a, b, c] * (AB) := [a'', b'', c'']$. So we have
\begin{align*}
a'' &= a \cdot (a_{11}b_{11} + a_{12}b_{21})^2 + c \cdot (a_{21}b_{11} + a_{22}b_{21})^2 + b \cdot (a_{11}b_{11} + a_{12}b_{21})(a_{21}b_{11} + a_{22}b_{21}) \tag{351.2.2} \\
&= a' \cdot b_{11}^2 + c' \cdot b_{21}^2 + \bigl(2a \cdot a_{11}a_{12} + 2c \cdot a_{21}a_{22} + b\,(a_{11}a_{22} + a_{12}a_{21})\bigr)\, b_{11}b_{21} \\
c'' &= a \cdot (a_{11}b_{12} + a_{12}b_{22})^2 + c \cdot (a_{21}b_{12} + a_{22}b_{22})^2 + b \cdot (a_{11}b_{12} + a_{12}b_{22})(a_{21}b_{12} + a_{22}b_{22}) \tag{351.2.3} \\
&= a' \cdot b_{12}^2 + c' \cdot b_{22}^2 + \bigl(2a \cdot a_{11}a_{12} + 2c \cdot a_{21}a_{22} + b\,(a_{11}a_{22} + a_{12}a_{21})\bigr)\, b_{12}b_{22}
\end{align*}
as desired. For the coefficient b'' we get
\begin{align*}
b'' &= 2a \cdot (a_{11}b_{11} + a_{12}b_{21})(a_{11}b_{12} + a_{12}b_{22}) + 2c \cdot (a_{21}b_{11} + a_{22}b_{21})(a_{21}b_{12} + a_{22}b_{22}) \\
&\quad + b \cdot \bigl((a_{11}b_{11} + a_{12}b_{21})(a_{21}b_{12} + a_{22}b_{22}) + (a_{11}b_{12} + a_{12}b_{22})(a_{21}b_{11} + a_{22}b_{21})\bigr)
\end{align*}
and by evaluating the factors of $b_{11}b_{12}$, $b_{21}b_{22}$, and $b_{11}b_{22} + b_{21}b_{12}$, it can be checked that
\[ b'' = 2a'\, b_{11}b_{12} + 2c'\, b_{21}b_{22} + (b_{11}b_{22} + b_{21}b_{12})\bigl(2a \cdot a_{11}a_{12} + 2c \cdot a_{21}a_{22} + b\,(a_{11}a_{22} + a_{12}a_{21})\bigr). \]
This shows that
\[ [a'', b'', c''] = [a', b', c'] * B \tag{351.2.4} \]
and therefore [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B. Thus, (351.2.1) defines an action of G on the set of (integer) binary quadratic forms. Furthermore, the discriminant of each quadratic form in the orbit of [a, b, c] under G is $b^2 - 4ac$.
Version: 5 Owner: Thomas Heye Author(s): Thomas Heye
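The compatibility [a, b, c] ∗ (AB) = ([a, b, c] ∗ A) ∗ B and the invariance of the discriminant can be spot-checked numerically. The following is a minimal sketch under stated assumptions: a form is stored as a tuple `(a, b, c)`, a matrix as a pair of rows, and the helper names `act`, `matmul` and `disc` are hypothetical. The sample form and matrices are chosen here for illustration only.

```python
def act(form, A):
    """Return [a, b, c] * A using the coefficient formulas for a', b', c'."""
    a, b, c = form
    (a11, a12), (a21, a22) = A
    return (
        a * a11**2 + b * a11 * a21 + c * a21**2,
        2 * a * a11 * a12 + 2 * c * a21 * a22 + b * (a11 * a22 + a12 * a21),
        a * a12**2 + b * a12 * a22 + c * a22**2,
    )

def matmul(A, B):
    """Product of two 2 x 2 matrices given as pairs of rows."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return ((a11 * b11 + a12 * b21, a11 * b12 + a12 * b22),
            (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22))

def disc(form):
    """Discriminant b^2 - 4ac of the form [a, b, c]."""
    a, b, c = form
    return b * b - 4 * a * c

form = (1, 0, 1)            # x^2 + y^2
A = ((2, 1), (1, 1))        # det A = 1
B = ((0, 1), (1, 0))        # det B = -1
I = ((1, 0), (0, 1))

left = act(form, matmul(A, B))
right = act(act(form, A), B)
print(left, right, disc(form), disc(left))
```

A numerical check of this kind does not replace the algebraic verification; it only guards against transcription slips in the coefficient formulas.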
351.3
group action
Let G be a group and let X be a set. A left group action is a function · : G × X −→ X such that:
1. $1_G \cdot x = x$ for all x ∈ X
2. $(g_1 g_2) \cdot x = g_1 \cdot (g_2 \cdot x)$ for all $g_1, g_2 \in G$ and x ∈ X
A right group action is a function · : X × G −→ X such that:
1. $x \cdot 1_G = x$ for all x ∈ X
2. $x \cdot (g_1 g_2) = (x \cdot g_1) \cdot g_2$ for all $g_1, g_2 \in G$ and x ∈ X
There is a correspondence between left actions and right actions, given by associating the right action x · g with the left action $g \cdot x := x \cdot g^{-1}$. In many (but not all) contexts, it is useful to identify right actions with their corresponding left actions, and speak only of left actions.
Special types of group actions
A left action is said to be effective, or faithful, if the function $x \mapsto g \cdot x$ is the identity function on X only when $g = 1_G$. A left action is said to be transitive if, for every $x_1, x_2 \in X$, there exists a group element g ∈ G such that $g \cdot x_1 = x_2$. A left action is free if, for every x ∈ X, the only element of G that stabilizes x is the identity; that is, g · x = x implies $g = 1_G$. Faithful, transitive, and free right actions are defined similarly.
Version: 3 Owner: djao Author(s): djao
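For a finite group acting on a finite set, the three special properties can be tested by brute force. The sketch below is an illustration only: it assumes a group is given as a list of permutations of {0, 1, 2} stored as tuples (with `g[x]` the image of x), and the function names are hypothetical.

```python
from itertools import permutations

def act(g, x):
    """Left action of a permutation g (a tuple of images) on a point x."""
    return g[x]

def is_faithful(G, X, identity):
    # Only the identity may act as the identity function on X.
    return all(g == identity or any(act(g, x) != x for x in X) for g in G)

def is_transitive(G, X):
    # Every point can be moved to every other point.
    return all(any(act(g, x1) == x2 for g in G) for x1 in X for x2 in X)

def is_free(G, X, identity):
    # g . x = x for some x forces g to be the identity.
    return all(g == identity for g in G for x in X if act(g, x) == x)

X = range(3)
e = (0, 1, 2)
S3 = list(permutations(X))                 # the full symmetric group on X
C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]     # the subgroup of rotations

print(is_faithful(S3, X, e), is_transitive(S3, X), is_free(S3, X, e))
print(is_faithful(C3, X, e), is_transitive(C3, X), is_free(C3, X, e))
```

In this example the symmetric group acts faithfully and transitively but not freely (a transposition fixes a point), while the rotation subgroup has all three properties.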
351.4
orbit
Let G be a group, X a set, and · : G × X −→ X a group action. For any x ∈ X, the orbit of x under the group action is the set $\{g \cdot x \mid g \in G\} \subset X$. Version: 2 Owner: djao Author(s): djao
351.5
proof of counting theorem
Let N be the cardinality of the set of all the couples (g, x) such that g · x = x. For each g ∈ G, there exist $\mathrm{stab}_g(X)$ couples with g as the first element, while for each x, there are $|G_x|$ couples with x as the second element. Hence the following equality holds:
\[ N = \sum_{g \in G} \mathrm{stab}_g(X) = \sum_{x \in X} |G_x|. \]
From the orbit-stabilizer theorem it follows that:
\[ N = |G| \sum_{x \in X} \frac{1}{|G(x)|}. \]
Since all the x belonging to the same orbit G(x) contribute with
\[ |G(x)| \cdot \frac{1}{|G(x)|} = 1 \]
in the sum, $\sum_{x \in X} 1/|G(x)|$ precisely equals the number of distinct orbits s. We have therefore
\[ \sum_{g \in G} \mathrm{stab}_g(X) = |G|\, s, \]
which proves the theorem.
Version: 2 Owner: n3o Author(s): n3o
351.6
stabilizer
Let G be a group, X a set, and · : G × X −→ X a group action. For any subset S of X, the stabilizer of S, denoted Stab(S), is the subgroup $\mathrm{Stab}(S) := \{g \in G \mid g \cdot s \in S \text{ for all } s \in S\}$. The stabilizer of a single point x in X is often denoted $G_x$. Version: 3 Owner: djao Author(s): djao
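Orbits and stabilizers of a finite action can be computed directly from the definitions. The following is a small sketch, assuming the symmetric group on {0, 1, 2} acting naturally on points; the names `orbit` and `stabilizer` are illustrative. It also checks the orbit-stabilizer relation |G| = |G(x)| · |G_x| for one point.

```python
from itertools import permutations

G = list(permutations(range(3)))   # each g is a tuple; g[x] is the image of x

def act(g, x):
    return g[x]

def orbit(x):
    """The set {g . x | g in G}."""
    return {act(g, x) for g in G}

def stabilizer(S):
    """All g in G with g . s in S for every s in S."""
    return [g for g in G if all(act(g, s) in S for s in S)]

print(orbit(0))
print(stabilizer({0}))
print(len(G) == len(orbit(0)) * len(stabilizer({0})))
```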
Chapter 352 20M99 – Miscellaneous
352.1 a semilattice is a commutative band
This note explains how a semilattice is the same as a commutative band. Let S be a semilattice, with partial order ≤ and each pair of elements x and y having a greatest lower bound x ∧ y. Then it is easy to see that the operation ∧ defines a binary operation on S which makes it a commutative semigroup, and that every element is idempotent since x ∧ x = x. Conversely, if S is such a semigroup, define x ≤ y iff x = xy. Again, it is easy to see that this defines a partial order on S, that greatest lower bounds exist with respect to this partial order, and that in fact x ∧ y = xy. Version: 3 Owner: mclase Author(s): mclase
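The converse direction can be checked exhaustively on a small commutative band. The sketch below uses an example chosen here, namely the subsets of {0, 1, 2} under intersection, and verifies that x ≤ y iff x = xy is a partial order in which the product xy is the greatest lower bound. All names are illustrative.

```python
from itertools import combinations

base = (0, 1, 2)
S = [frozenset(c) for r in range(len(base) + 1) for c in combinations(base, r)]

def mul(x, y):
    """The band operation: intersection of subsets."""
    return x & y

def leq(x, y):
    """x <= y iff x = xy."""
    return x == mul(x, y)

def is_glb(m, x, y):
    """True if m is the greatest lower bound of x and y with respect to leq."""
    lower = [w for w in S if leq(w, x) and leq(w, y)]
    return m in lower and all(leq(w, m) for w in lower)

is_band = all(mul(x, x) == x for x in S)
is_commutative = all(mul(x, y) == mul(y, x) for x in S for y in S)
reflexive = all(leq(x, x) for x in S)
antisymmetric = all(x == y for x in S for y in S if leq(x, y) and leq(y, x))
transitive = all(leq(x, z) for x in S for y in S for z in S
                 if leq(x, y) and leq(y, z))
meet_is_product = all(is_glb(mul(x, y), x, y) for x in S for y in S)

print(is_band, is_commutative, reflexive, antisymmetric, transitive, meet_is_product)
```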
352.2
adjoining an identity to a semigroup
It is possible to formally adjoin an identity element to any semigroup to make it into a monoid. Suppose S is a semigroup without an identity, and consider the set S ∪ {1} where 1 is a symbol not in S. Extend the semigroup operation from S to S ∪ {1} by additionally defining:
s · 1 = s = 1 · s, for all s ∈ S ∪ {1}.
It is easy to verify that this defines a semigroup (associativity is the only thing that needs to be checked).
As a matter of notation, it is customary to write $S^1$ for the semigroup S with an identity adjoined in this manner, if S does not already have one, and to agree that $S^1 = S$, if S does already have an identity. Despite the simplicity of this construction, however, it rarely allows one to simplify a problem by considering monoids instead of semigroups. As soon as one starts to look at the structure of the semigroup, it is almost invariably the case that one needs to consider subsemigroups and ideals of the semigroup which do not contain the identity. Version: 2 Owner: mclase Author(s): mclase
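The construction is short enough to write out. The following is a minimal sketch, assuming the semigroup operation is given as a Python function; the sentinel `ONE` plays the role of the new symbol 1, and the two-element left zero semigroup used as input is an example chosen here, not one from the entry.

```python
ONE = object()   # a fresh symbol guaranteed not to belong to S

def adjoin_identity(mul):
    """Extend the operation mul on S to S with ONE adjoined as identity."""
    def mul1(a, b):
        if a is ONE:
            return b
        if b is ONE:
            return a
        return mul(a, b)
    return mul1

def left_zero(x, y):
    """Left zero semigroup: xy = x. It has no identity if |S| > 1."""
    return x

S = ["a", "b"]
S1 = S + [ONE]
mul1 = adjoin_identity(left_zero)

associative = all(mul1(mul1(x, y), z) == mul1(x, mul1(y, z))
                  for x in S1 for y in S1 for z in S1)
print(associative)
```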
352.3
band
A band is a semigroup in which every element is idempotent. A commutative band is called a semilattice. Version: 1 Owner: mclase Author(s): mclase
352.4
bicyclic semigroup
The bicyclic semigroup C(p, q) is the monoid generated by {p, q} with the single relation pq = 1. The elements of C(p, q) are all words of the form $q^n p^m$ for m, n ≥ 0 (with the understanding $p^0 = q^0 = 1$). These words are multiplied as follows:
\[ q^n p^m \, q^k p^l = \begin{cases} q^{n+k-m} p^l & \text{if } m \le k, \\ q^n p^{l+m-k} & \text{if } m \ge k. \end{cases} \]
It is apparent that C(p, q) is simple, for if $q^n p^m$ is an element of C(p, q), then $1 = p^n (q^n p^m) q^m$ and so $S^1 q^n p^m S^1 = S$.
It is useful to picture some further properties of C(p, q) by arranging the elements in a table:
\[ \begin{array}{cccccc}
1 & p & p^2 & p^3 & p^4 & \cdots \\
q & qp & qp^2 & qp^3 & qp^4 & \cdots \\
q^2 & q^2p & q^2p^2 & q^2p^3 & q^2p^4 & \cdots \\
q^3 & q^3p & q^3p^2 & q^3p^3 & q^3p^4 & \cdots \\
q^4 & q^4p & q^4p^2 & q^4p^3 & q^4p^4 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \]
Then the elements below any horizontal line drawn through this table form a right ideal and the elements to the right of any vertical line form a left ideal. Further, the elements on the diagonal are all idempotents and their standard ordering is $1 > qp > q^2p^2 > q^3p^3 > \cdots$.
Version: 3 Owner: mclase Author(s): mclase
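The multiplication rule is easy to experiment with if the word q^n p^m is encoded as the pair (n, m). This encoding and the helper name `mul` are assumptions made for this sketch only.

```python
def mul(x, y):
    """Multiply q^n p^m by q^k p^l, with words encoded as pairs (n, m), (k, l)."""
    n, m = x
    k, l = y
    if m <= k:
        return (n + k - m, l)
    return (n, l + m - k)

ONE, P, Q = (0, 0), (0, 1), (1, 0)

print(mul(P, Q))   # pq, which the defining relation says is 1
print(mul(Q, P))   # qp, an idempotent different from 1

# Associativity on a finite window of the (infinite) semigroup.
window = [(n, m) for n in range(4) for m in range(4)]
associative = all(mul(mul(x, y), z) == mul(x, mul(y, z))
                  for x in window for y in window for z in window)
print(associative)
```

With this encoding the diagonal elements (n, n) are exactly the idempotents listed in the table.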
352.5
congruence
Let S be a semigroup. An equivalence relation ∼ defined on S is called a congruence if it is preserved under the semigroup operation. That is, for all x, y, z ∈ S, if x ∼ y then xz ∼ yz and zx ∼ zy. If ∼ satisfies only x ∼ y implies xz ∼ yz (resp. zx ∼ zy) then ∼ is called a right congruence (resp. left congruence).
Example 19. Suppose f : S → T is a semigroup homomorphism. Define ∼ by x ∼ y iff f(x) = f(y). Then it is easy to see that ∼ is a congruence.
If ∼ is a congruence, defined on a semigroup S, write [x] for the equivalence class of x under ∼. Then it is easy to see that [x] · [y] = [xy] is a well-defined operation on the set of equivalence classes, and that in fact this set becomes a semigroup with this operation. This semigroup is called the quotient of S by ∼ and is written S/∼. Thus semigroup congruences are related to homomorphic images of semigroups in the same way that normal subgroups are related to homomorphic images of groups. More precisely, in the group case, the congruence is the coset relation, rather than the normal subgroup itself.
Version: 3 Owner: mclase Author(s): mclase
352.6
cyclic semigroup
A semigroup which is generated by a single element is called a cyclic semigroup. Let $S = \langle x \rangle$ be a cyclic semigroup. Then as a set, $S = \{x^n \mid n > 0\}$. If all powers of x are distinct, then $S = \{x, x^2, x^3, \ldots\}$ is (countably) infinite. Otherwise, there is a least integer n > 0 such that $x^n = x^m$ for some m < n. It is clear then that the elements $x, x^2, \ldots, x^{n-1}$ are distinct, but that for any j ≥ n, we must have $x^j = x^i$ for some i, m ≤ i ≤ n − 1. So S has n − 1 elements.
Unlike in the group case, however, there are in general multiple non-isomorphic cyclic semigroups with the same number of elements. In fact, there are t non-isomorphic cyclic semigroups with t elements: these correspond to the different choices of m in the above (with n = t + 1). The integer m is called the index of S, and n − m is called the period of S. The elements $K = \{x^m, x^{m+1}, \ldots, x^{n-1}\}$ are a subsemigroup of S. In fact, K is a cyclic group.
A concrete representation of the semigroup with index m and period r as a semigroup of transformations can be obtained as follows. Let X = {1, 2, 3, . . . , m + r}. Let
\[ \varphi = \begin{pmatrix} 1 & 2 & 3 & \cdots & m+r-1 & m+r \\ 2 & 3 & 4 & \cdots & m+r & m+1 \end{pmatrix}. \]
Then φ generates a subsemigroup S of the full semigroup of transformations $T_X$, and S is cyclic with index m and period r.
Version: 3 Owner: mclase Author(s): mclase
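The index and period of the generated semigroup can be computed by listing powers until one repeats. The sketch below is an illustration under stated assumptions: a transformation of {1, ..., m + r} is stored as a tuple of images, the generator sends i to i + 1 for i < m + r and sends the last point m + r to m + 1 (the choice that yields index m and period r), and the helper names are hypothetical.

```python
def generator(m, r):
    """Transformation of {1, ..., m+r}: i -> i+1, and m+r -> m+1."""
    size = m + r
    return tuple(i + 1 if i < size else m + 1 for i in range(1, size + 1))

def compose(f, g):
    """Product fg with functions on the right: apply f first, then g.
    f[i] holds the image of the point i+1."""
    return tuple(g[f[i] - 1] for i in range(len(f)))

def index_and_period(phi):
    """Find the least n with phi^n = phi^m for some m < n; return (m, n - m)."""
    powers = [phi]                      # powers[k-1] is phi^k
    while True:
        nxt = compose(powers[-1], phi)
        if nxt in powers:
            m = powers.index(nxt) + 1
            n = len(powers) + 1
            return m, n - m
        powers.append(nxt)

print(generator(3, 2))
print(index_and_period(generator(3, 2)))
```

The number of distinct powers found before the first repetition is n − 1, the order of the cyclic semigroup.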
352.7
idempotent
An element x of a ring is called an idempotent element, or simply an idempotent, if $x^2 = x$. The set of idempotents of a ring can be partially ordered by putting e ≤ f iff e = ef = fe. The element 0 is a minimum element in this partial order. If the ring has an identity element, 1, then 1 is a maximum element in this partial order. Since these definitions refer only to the multiplicative structure of the ring, they also hold for semigroups (with the proviso, of course, that a semigroup may not have a zero element). In the special case of a semilattice, this partial order is the same as the one described in the entry for semilattice. If a ring has an identity, then 1 − e is always an idempotent whenever e is an idempotent, and e(1 − e) = (1 − e)e = 0. In a ring with an identity, two idempotents e and f are called a pair of orthogonal idempotents if e + f = 1, and ef = fe = 0. Obviously, this is just a fancy way of saying that f = 1 − e. More generally, a set $\{e_1, e_2, \ldots, e_n\}$ of idempotents is called a complete set of orthogonal idempotents if $e_i e_j = e_j e_i = 0$ whenever $i \neq j$ and if $1 = e_1 + e_2 + \cdots + e_n$.
Version: 3 Owner: mclase Author(s): mclase
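A small ring makes these notions concrete. The sketch below uses the ring of integers modulo 6, an example chosen here rather than one from the entry; the names are illustrative.

```python
n = 6   # work in the ring Z/6Z

idempotents = [x for x in range(n) if (x * x) % n == x]

def leq(e, f):
    """Partial order on idempotents: e <= f iff e = ef = fe."""
    return e == (e * f) % n == (f * e) % n

print(idempotents)
print((3 + 4) % n, (3 * 4) % n)   # sum and product of the pair 3, 4
print([(e, (1 - e) % n) for e in idempotents])
```

Here 3 and 4 form a pair of orthogonal idempotents, and each idempotent e is matched with its complement 1 − e.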
352.8
null semigroup
A left zero semigroup is a semigroup in which every element is a left zero element. In other words, it is a set S with a product deﬁned as xy = x for all x, y ∈ S. A right zero semigroup is deﬁned similarly. Let S be a semigroup. Then S is a null semigroup if it has a zero element and if the product of any two elements is zero. In other words, there is an element θ ∈ S such that xy = θ for all x, y ∈ S. Version: 1 Owner: mclase Author(s): mclase
352.9
semigroup
A semigroup G is a set together with a binary operation · : G × G −→ G which satisﬁes the associative property: (a · b) · c = a · (b · c) for all a, b, c ∈ G. Version: 2 Owner: djao Author(s): djao
352.10
semilattice
A lower semilattice is a partially ordered set S in which each pair of elements has a greatest lower bound. An upper semilattice is a partially ordered set S in which each pair of elements has a least upper bound. Note that it is not normally necessary to distinguish lower from upper semilattices, because one may be converted to the other by reversing the partial order. It is normal practice to refer to either structure as a semilattice, and it should be clear from the context whether greatest lower bounds or least upper bounds exist. Alternatively, a semilattice can be considered to be a commutative band, that is, a semigroup which is commutative and in which every element is idempotent. In this context, semilattices are important elements of semigroup theory and play a key role in the structure theory of commutative semigroups. A partially ordered set which is both a lower semilattice and an upper semilattice is a lattice.
Version: 3 Owner: mclase Author(s): mclase
352.11
subsemigroup, submonoid, and subgroup
Let S be a semigroup, and let T be a subset of S. T is a subsemigroup of S if T is closed under the operation of S; that is, if xy ∈ T for all x, y ∈ T. T is a submonoid of S if T is a subsemigroup, and T has an identity element. T is a subgroup of S if T is a submonoid which is a group. Note that submonoids and subgroups do not have to have the same identity element as S itself (indeed, S may not have an identity element). The identity element may be any idempotent element of S.
Let e ∈ S be an idempotent element. Then there is a maximal subsemigroup of S for which e is the identity:
\[ eSe = \{exe \mid x \in S\}. \]
In addition, there is a maximal subgroup for which e is the identity:
\[ U(eSe) = \{x \in eSe \mid \exists y \in eSe \text{ such that } xy = yx = e\}. \]
Subgroups with different identity elements are disjoint. To see this, suppose that G and H are subgroups of a semigroup S with identity elements e and f respectively, and suppose x ∈ G ∩ H. Then x has an inverse y ∈ G, and an inverse z ∈ H. We have:
e = xy = fxy = fe = zxe = zx = f.
Thus intersecting subgroups have the same identity element.
Version: 2 Owner: mclase Author(s): mclase
352.12
zero elements
Let S be a semigroup. An element z is called a right zero [resp. left zero] if xz = z [resp. zx = z] for all x ∈ S. An element which is both a left and a right zero is called a zero element. A semigroup may have many left zeros or right zeros, but if it has at least one of each, then they are necessarily equal, giving a unique (two-sided) zero element.
It is customary to use the symbol θ for the zero element of a semigroup. Version: 1 Owner: mclase Author(s): mclase
Chapter 353 20N02 – Sets with a single binary operation (groupoids)
353.1 groupoid
A groupoid G is a set together with a binary operation · : G × G −→ G. The groupoid (or “magma”) is closed under the operation. There is also a separate, category-theoretic definition of “groupoid.” Version: 7 Owner: akrowne Author(s): akrowne
353.2
idempotency
If (S, ∗) is a magma, then an element x ∈ S is said to be idempotent if x ∗ x = x. If every element of S is idempotent, then the binary operation ∗ (or the magma itself) is said to be idempotent. For example, the ∧ and ∨ operations in a lattice are idempotent, because x ∧ x = x and x ∨ x = x for all x in the lattice. A function f : D → D is said to be idempotent if f ◦ f = f. (This is just a special case of the above definition, the magma in question being $(D^D, \circ)$, the monoid of all functions from D to D, with the operation of function composition.) In other words, f is idempotent iff repeated application of f has the same effect as a single application: f(f(x)) = f(x) for all x ∈ D. An idempotent linear transformation from a vector space to itself is called a projection. Version: 12 Owner: yark Author(s): yark, Logan
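Idempotency of a function on a finite domain can be tested point by point. The following is a minimal sketch; the domain and the sample functions are chosen here for illustration, and `is_idempotent` is a hypothetical helper.

```python
def is_idempotent(f, domain):
    """True if f(f(x)) == f(x) for every x in the domain."""
    return all(f(f(x)) == f(x) for x in domain)

def negate(x):
    return -x

D = range(-5, 6)   # both abs and negate map this set into itself

print(is_idempotent(abs, D))      # absolute value: applying twice changes nothing
print(is_idempotent(negate, D))   # negation: applying twice undoes the first step
```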
353.3
left identity and right identity
Let G be a groupoid. An element e ∈ G is called a left identity element if ex = x for all x ∈ G. Similarly, e is a right identity element if xe = x for all x ∈ G. An element which is both a left and a right identity is an identity element. A groupoid may have more than one left identity element: in fact the operation defined by xy = y for all x, y ∈ G defines a groupoid (in fact, a semigroup) on any set G, and every element is a left identity. But as soon as a groupoid has both a left and a right identity, they are necessarily unique and equal. For if e is a left identity and f is a right identity, then f = ef = e. Version: 2 Owner: mclase Author(s): mclase
Chapter 354 20N05 – Loops, quasigroups
354.1 Moufang loop
Proposition: Let Q be a nonempty quasigroup.
I) The following conditions are equivalent.
\begin{align*}
(x(yz))x &= (xy)(zx) & \forall x, y, z \in Q \tag{354.1.1} \\
((xy)z)x &= x(y(zx)) & \forall x, y, z \in Q \tag{354.1.2} \\
(yx)(zy) &= (y(xz))y & \forall x, y, z \in Q \tag{354.1.3} \\
y(x(yz)) &= ((yx)y)z & \forall x, y, z \in Q \tag{354.1.4}
\end{align*}
II) If Q satisfies those conditions, then Q has an identity element (i.e. Q is a loop).
For a proof, we refer the reader to the two references. Kunen in [1] shows that any of the four conditions implies the existence of an identity element. And Bol and Bruck [2] show that the four conditions are equivalent for loops.
Definition: A nonempty quasigroup satisfying the conditions (354.1.1)-(354.1.4) is called a Moufang quasigroup or, equivalently, a Moufang loop (after Ruth Moufang, 1905-1977).
The 16-element set of unit octonions over Z is an example of a nonassociative Moufang loop. Other examples appear in projective geometry, coding theory, and elsewhere.
References
[1] K. Kunen, Moufang Quasigroups, J. Algebra 83 (1996) 231-234.
[2] R. H. Bruck, A Survey of Binary Systems, Springer-Verlag, 1958.
Version: 3 Owner: yark Author(s): Larry Hammick
354.2
loop and quasigroup
A quasigroup is a groupoid G with the property that for every x, y ∈ G, there are unique elements w, z ∈ G such that xw = y and zx = y. A loop is a quasigroup which has an identity element. What distinguishes a loop from a group is that the former need not satisfy the associative law. Version: 1 Owner: mclase Author(s): mclase
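For a finite groupoid, the quasigroup condition says that the multiplication table is a Latin square, which can be checked by brute force. The sketch below uses subtraction modulo 5 as an example chosen here: it is a quasigroup that has no identity element (so it is not a loop) and is not associative. The function names are illustrative.

```python
def is_quasigroup(G, op):
    """For all x, y there are unique w, z with op(x, w) == y and op(z, x) == y."""
    G = list(G)
    return all(
        sum(1 for w in G if op(x, w) == y) == 1
        and sum(1 for z in G if op(z, x) == y) == 1
        for x in G for y in G
    )

def has_identity(G, op):
    G = list(G)
    return any(all(op(e, x) == x and op(x, e) == x for x in G) for e in G)

def is_associative(G, op):
    G = list(G)
    return all(op(op(x, y), z) == op(x, op(y, z)) for x in G for y in G for z in G)

n = 5
G = range(n)

def sub(x, y):
    return (x - y) % n

def add(x, y):
    return (x + y) % n

print(is_quasigroup(G, sub), has_identity(G, sub), is_associative(G, sub))
print(is_quasigroup(G, add), has_identity(G, add), is_associative(G, add))
```

Addition modulo 5 passes all three checks, as expected for a group.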
Chapter 355 2200 – General reference works (handbooks, dictionaries, bibliographies, etc.)
355.1 fixed-point subspace
Let Σ ⊂ Γ be a subgroup where Γ is a compact Lie group acting on a vector space V. The fixed-point subspace of Σ is
\[ \mathrm{Fix}(\Sigma) = \{x \in V \mid \sigma x = x,\ \forall \sigma \in \Sigma\}. \]
Fix(Σ) is a linear subspace of V since
\[ \mathrm{Fix}(\Sigma) = \bigcap_{\sigma \in \Sigma} \ker(\sigma - I) \]
where I is the identity. If it is important to specify the space V we use the notation $\mathrm{Fix}_V(\Sigma)$.
REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 1 Owner: Daume Author(s): Daume
Chapter 356 22XX – Topological groups, Lie groups
356.1 Cantor space
Cantor space denoted C is the set of all inﬁnite binary sequences with the product topology. It is a perfect Polish space. It is a compact subspace of Baire space, which is the set of all inﬁnite sequences of integers with the natural product topology.
REFERENCES
1. Moschovakis, Yiannis N. Descriptive set theory, 1980, Amsterdam ; New York : North-Holland Pub. Co.
Version: 8 Owner: xiaoyanggu Author(s): xiaoyanggu
Chapter 357 22A05 – Structure of general topological groups
357.1 topological group
A topological group is a triple (G, ·, T) where (G, ·) is a group and T is a topology on G such that under T, the group operation (x, y) → x · y is continuous with respect to the product topology on G × G and the inverse map x → x−1 is continuous on G. Version: 3 Owner: Evandar Author(s): Evandar
Chapter 358 22C05 – Compact groups
358.1 n-torus
The n-torus, denoted $T^n$, is a smooth orientable n-dimensional manifold which is the product of n 1-spheres, i.e.
\[ T^n = \underbrace{S^1 \times \cdots \times S^1}_{n}. \]
Equivalently, the n-torus can be considered to be $\mathbb{R}^n$ modulo the action (vector addition) of the integer lattice $\mathbb{Z}^n$. The n-torus is in addition a topological group. If we think of $S^1$ as the unit circle in $\mathbb{C}$ and $T^n = \underbrace{S^1 \times \cdots \times S^1}_{n}$, then $S^1$ is a topological group and so is $T^n$ by coordinate-wise multiplication. That is,
\[ (z_1, z_2, \ldots, z_n) \cdot (w_1, w_2, \ldots, w_n) = (z_1 w_1, z_2 w_2, \ldots, z_n w_n). \]
Version: 2 Owner: ack Author(s): ack, apmxi
358.2
reductive
Let G be a Lie group or algebraic group. G is called reductive over a field k if every representation of G over k is completely reducible. For example, a finite group is reductive over a field k if and only if its order is not divisible by the characteristic of k (by Maschke’s theorem). A complex Lie group is reductive if and only if it is a direct product of a semisimple group and an algebraic torus. Version: 3 Owner: bwebste Author(s): bwebste
Chapter 359 22D05 – General properties and structure of locally compact groups
359.1 Γ-simple
A representation V of Γ is Γ-simple if either
• $V \cong W_1 \oplus W_2$ where $W_1$, $W_2$ are absolutely irreducible for Γ and are Γ-isomorphic, or
• V is non-absolutely irreducible for Γ. [GSS]
REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 1 Owner: Daume Author(s): Daume
Chapter 360 22D15 – Group algebras of locally compact groups
360.1 group C*-algebra
Let $\mathbb{C}[G]$ be the group ring of a discrete group G. It has two completions to a C*-algebra:
Reduced group C*-algebra. The reduced group C*-algebra, $C_r^*(G)$, is obtained by completing $\mathbb{C}[G]$ in the operator norm for its regular representation on $l^2(G)$.
Maximal group C*-algebra. The maximal group C*-algebra, $C^*_{\max}(G)$ or just $C^*(G)$, is defined by the following universal property: any *-homomorphism from $\mathbb{C}[G]$ to some B(H) (the C*-algebra of bounded operators on some Hilbert space H) factors through the inclusion $\mathbb{C}[G] \hookrightarrow C^*_{\max}(G)$.
If G is amenable then $C_r^*(G) \cong C^*_{\max}(G)$.
Version: 3 Owner: mhale Author(s): mhale
Chapter 361 22E10 – General properties and structure of complex Lie groups
361.1 existence and uniqueness of compact real form
Let G be a semisimple complex Lie group. Then there exists a unique (up to isomorphism) real Lie group K such that K is compact and a real form of G. Conversely, if K is compact, semisimple and real, it is the real form of a unique semisimple complex Lie group G. The group K can be realized as the set of fixed points of a special involution of G, called the Cartan involution.
For example, the compact real form of $SL_n\mathbb{C}$, the complex special linear group, is SU(n), the special unitary group. Note that $SL_n\mathbb{R}$ is also a real form of $SL_n\mathbb{C}$, but is not compact. The compact real form of $SO_n\mathbb{C}$, the complex special orthogonal group, is $SO_n\mathbb{R}$, the real special orthogonal group. $SO_n\mathbb{C}$ also has other, non-compact real forms, called the pseudo-orthogonal groups. The compact real form of $Sp_{2n}\mathbb{C}$, the complex symplectic group, is less well-known. It is (unfortunately) also usually denoted Sp(2n), and consists of n × n “unitary” quaternion matrices, that is,
\[ Sp(2n) = \{M \in GL_n\mathbb{H} \mid MM^* = I\} \]
where $M^*$ denotes the conjugate transpose of M. This is different from the real symplectic group $Sp_{2n}\mathbb{R}$.
Version: 2 Owner: bwebste Author(s): bwebste
361.2
maximal torus
Let K be a compact group, and let t ∈ K be an element whose centralizer has minimal dimension (such elements are dense in K). Let T be the centralizer of t. This subgroup is closed since $T = \varphi^{-1}(t)$ where φ : K → K is the map $k \mapsto ktk^{-1}$, and abelian since it is the intersection of K with the Cartan subgroup of its complexification, and hence a torus, since K (and thus T) is compact. We call T a maximal torus of K. This term is also applied to the corresponding maximal abelian subgroup of a complex semisimple group, which is an algebraic torus. Version: 2 Owner: bwebste Author(s): bwebste
361.3
Lie group
A Lie group is a group endowed with a compatible analytic structure. To be more precise, Lie group structure consists of two kinds of data
• a finite-dimensional, real-analytic manifold G
• and two analytic maps, one for multiplication G × G → G and one for inversion G → G, which obey the appropriate group axioms.
Thus, a homomorphism in the category of Lie groups is a group homomorphism that is simultaneously an analytic mapping between two real-analytic manifolds.
Next, we describe a natural construction that associates a certain Lie algebra $\mathfrak{g}$ to every Lie group G. Let e ∈ G denote the identity element of G. For g ∈ G let $\lambda_g : G \to G$ denote the diffeomorphism corresponding to left multiplication by g.
Definition 9. A vector-field V on G is called left-invariant if V is invariant with respect to all left multiplications. To be more precise, V is left-invariant if and only if
\[ (\lambda_g)_*(V) = V \]
(see push-forward of a vector-field) for all g ∈ G.
Proposition 15. The vector-field bracket of two left-invariant vector fields is again a left-invariant vector field.
Proof. Let $V_1$, $V_2$ be left-invariant vector fields, and let g ∈ G. The bracket operation is covariant with respect to diffeomorphism, and in particular
\[ (\lambda_g)_*[V_1, V_2] = [(\lambda_g)_* V_1, (\lambda_g)_* V_2] = [V_1, V_2]. \]
Q.E.D.
Definition 10. The Lie algebra of G, denoted hereafter by $\mathfrak{g}$, is the vector space of all left-invariant vector fields equipped with the vector-field bracket.
Now a right multiplication is invariant with respect to all left multiplications, and it turns out that we can characterize a left-invariant vector field as being an infinitesimal right multiplication.
Proposition 16. Let $a \in T_eG$ and let V be a left-invariant vector-field such that $V_e = a$. Then for all g ∈ G we have
\[ V_g = (\lambda_g)_*(a). \]
The intuition here is that a gives an infinitesimal displacement from the identity element and that $V_g$ gives a corresponding infinitesimal right displacement away from g. Indeed, consider a curve $\gamma : (-\epsilon, \epsilon) \to G$ passing through the identity element with velocity a; i.e.
\[ \gamma(0) = e, \quad \gamma'(0) = a. \]
The above proposition is then saying that the curve
\[ t \mapsto g\gamma(t), \quad t \in (-\epsilon, \epsilon) \]
passes through g at t = 0 with velocity $V_g$.
Thus we see that a left-invariant vector-field is completely determined by the value it takes at e, and that therefore $\mathfrak{g}$ is isomorphic, as a vector space, to $T_eG$.
Of course, we can also consider the Lie algebra of right-invariant vector fields. The resulting Lie algebra is anti-isomorphic (the order in the bracket is reversed) to the Lie algebra of left-invariant vector fields. Now it is a general principle that the group inverse operation gives an anti-isomorphism between left and right group actions. So, as one may well expect, the anti-isomorphism between the Lie algebras of left- and right-invariant vector fields can be realized by considering the linear action of the inverse operation on $T_eG$.
Finally, let us remark that one can induce the Lie algebra structure directly on $T_eG$ by considering the adjoint action of G on $T_eG$.
Examples. [Coming soon.]
Notes.
1. No generality is lost in assuming that a Lie group has analytic, rather than $C^\infty$ or even $C^k$, k = 1, 2, . . ., structure. Indeed, given a $C^1$ differential manifold with a $C^1$ multiplication rule, one can show that the exponential mapping endows this manifold with a compatible real-analytic structure. Indeed, one can go even further and show that even $C^0$ suffices. In other words, a topological group that is also a finite-dimensional topological manifold possesses a compatible analytic structure. This result was formulated by Hilbert as his fifth problem, and proved in the 1950s by Montgomery and Zippin.
2. One can also speak of a complex Lie group, in which case G and the multiplication mapping are both complex-analytic. The theory of complex Lie groups requires the notion of a holomorphic vector-field. Notwithstanding this complication, most of the essential features of the real theory carry over to the complex case.
3. The name “Lie group” honours the Norwegian mathematician Sophus Lie who pioneered and developed the theory of continuous transformation groups and the corresponding theory of Lie algebras of vector fields (the group’s infinitesimal generators, as Lie termed them). Lie’s original impetus was the study of continuous symmetry of geometric objects and differential equations. The scope of the theory has grown enormously in the 100+ years of its existence. The contributions of Elie Cartan and Claude Chevalley figure prominently in this evolution. Cartan is responsible for the celebrated ADE classification of simple Lie algebras, as well as for charting the essential role played by Lie groups in differential geometry and mathematical physics. Chevalley made key foundational contributions to the analytic theory, and did much to pioneer the related theory of algebraic groups. Armand Borel’s book “Essays in the History of Lie groups and algebraic groups” is the definitive source on the evolution of the Lie group concept. Sophus Lie’s contributions are the subject of a number of excellent articles by T. Hawkins.
Version: 6 Owner: rmilson Author(s): rmilson
361.4
complexiﬁcation
Let $G$ be a real Lie group. Then the complexification $G_{\mathbb{C}}$ of $G$ is the unique complex Lie group equipped with a map $\varphi : G \to G_{\mathbb{C}}$ such that any map $G \to H$, where $H$ is a complex Lie group, extends to a holomorphic map $G_{\mathbb{C}} \to H$. If $\mathfrak{g}$ and $\mathfrak{g}_{\mathbb{C}}$ are the respective Lie algebras, then $\mathfrak{g}_{\mathbb{C}} \cong \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C}$.
For simply connected groups, the construction is obvious: we simply take the simply connected complex group with Lie algebra $\mathfrak{g}_{\mathbb{C}}$, and $\varphi$ to be the map induced by the inclusion $\mathfrak{g} \to \mathfrak{g}_{\mathbb{C}}$.
If $\gamma \in G$ is central, then its image is central in $G_{\mathbb{C}}$, since $g \mapsto \gamma g \gamma^{-1}$ is a map extending $\varphi$, and thus must be the identity by the uniqueness half of the universal property. Thus, if $\Gamma \subset G$ is a discrete central subgroup, then we get a map $G/\Gamma \to G_{\mathbb{C}}/\varphi(\Gamma)$, which gives a complexification for $G/\Gamma$. Since every connected Lie group is of this form, this shows existence.
Some easy examples: the complexification both of $SL_n\mathbb{R}$ and $SU(n)$ is $SL_n\mathbb{C}$. The complexification of $\mathbb{R}$ is $\mathbb{C}$ and of $S^1$ is $\mathbb{C}^*$.
The map $\varphi : G \to G_{\mathbb{C}}$ is not always injective. For example, if $G$ is the universal cover of $SL_n\mathbb{R}$ (whose fundamental group is $\mathbb{Z}$ for $n = 2$ and $\mathbb{Z}/2$ for $n \geq 3$), then $G_{\mathbb{C}} \cong SL_n\mathbb{C}$, and $\varphi$ factors through the covering $G \to SL_n\mathbb{R}$.
Version: 3 Owner: bwebste Author(s): bwebste
361.5
Hilbert-Weyl theorem
theorem: Let $\Gamma$ be a compact Lie group acting on $V$. Then there exists a finite Hilbert basis for the ring $P(\Gamma)$ (the set of invariant polynomials). [GSS]
proof: In [GSS] on page 54.
theorem: (as stated by Hermann Weyl) The (absolute) invariants $J(x, y, \ldots)$ corresponding to a given set of representations of a finite or a compact Lie group have a finite integrity basis. [PV]
proof: In [PV] on page 274.
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[HW] Weyl, Hermann: The Classical Groups: Their Invariants and Representations. Princeton University Press, New Jersey, 1946.
Version: 3 Owner: Daume Author(s): Daume
361.6
the connection between Lie groups and Lie algebras
Given a finite-dimensional Lie group $G$, it has an associated Lie algebra $\mathfrak{g} = \mathrm{Lie}(G)$. The Lie algebra encodes a great deal of information about the Lie group. I've collected a few results on this topic:
Theorem 7. (Existence) Let $\mathfrak{g}$ be a finite-dimensional Lie algebra over $\mathbb{R}$ or $\mathbb{C}$. Then there exists a finite-dimensional real or complex Lie group $G$ with $\mathrm{Lie}(G) = \mathfrak{g}$.
Theorem 8. (Uniqueness) There is a unique connected simply-connected Lie group $G$ with any given finite-dimensional Lie algebra. Every connected Lie group with this Lie algebra is a quotient $G/\Gamma$ by a discrete central subgroup $\Gamma$.
Even more important is the fact that the correspondence $G \mapsto \mathfrak{g}$ is functorial: given a homomorphism $\varphi : G \to H$ of Lie groups, there is a natural homomorphism defined on Lie algebras, $\varphi_* : \mathfrak{g} \to \mathfrak{h}$, which is just the derivative of the map $\varphi$ at the identity (since the Lie algebra is canonically identified with the tangent space at the identity). There are analogous existence and uniqueness theorems for maps:
Theorem 9. (Existence) Let $\psi : \mathfrak{g} \to \mathfrak{h}$ be a homomorphism of Lie algebras. Then if $G$ is the unique connected, simply-connected group with Lie algebra $\mathfrak{g}$, and $H$ is any Lie group with Lie algebra $\mathfrak{h}$, there exists a homomorphism of Lie groups $\varphi : G \to H$ with $\varphi_* = \psi$.
Theorem 10. (Uniqueness) Let $G$ be a connected Lie group and $H$ an arbitrary Lie group. Then if two maps $\varphi, \varphi' : G \to H$ induce the same maps on Lie algebras, they are equal.
Essentially, what these theorems tell us is that the correspondence $\mathfrak{g} \mapsto G$ from Lie algebras to simply-connected Lie groups is functorial, and left adjoint to the functor $H \mapsto \mathrm{Lie}(H)$ from Lie groups to Lie algebras, since homomorphisms $G \to H$ correspond exactly to homomorphisms $\mathfrak{g} \to \mathrm{Lie}(H)$.
Version: 6 Owner: bwebste Author(s): bwebste
Chapter 362 2600 – General reference works (handbooks, dictionaries, bibliographies, etc.)
362.1 derivative notation
This is the list of known standard representations and their nuances.

$\frac{du}{dv}, \frac{df}{dx}, \frac{dy}{dx}$ : The most common notation, this is read as the derivative of $u$ with respect to $v$. Exponents relate which derivative, for example, $\frac{d^2 y}{dx^2}$ is the second derivative of $y$ with respect to $x$.

$f'(x), f''(x), y'$ : This is read as $f$ prime of $x$. The number of primes tells the derivative, i.e. $f'''(x)$ is the third derivative of $f(x)$ with respect to $x$. Note that in higher dimensions, this may be a tensor of a rank equal to the derivative.

$D_x f(x), F_y(x), f_{xy}(x)$ : These notations are rather arcane, and should not be used generally, as they have other meanings. For example, $F_y$ can easily be the $y$ component of a vector-valued function. The subscript in this case means "with respect to", so $F_{yy}$ would be the second derivative of $F$ with respect to $y$.

$D_1 f(x), F_2(x), f_{12}(x)$ : The subscripts in these cases refer to the derivative with respect to the $n$th variable. For example, $F_2(x, y, z)$ would be the derivative of $F$ with respect to $y$. They can easily represent higher derivatives, i.e. $D_{21} f(x)$ is the derivative with respect to the first variable of the derivative with respect to the second variable.

$\frac{\partial u}{\partial v}, \frac{\partial f}{\partial x}$ : The partial derivative of $u$ with respect to $v$. This symbol can be manipulated as in $\frac{du}{dv}$ for higher partials.

$\frac{d}{dv}, \frac{\partial}{\partial v}$ : This is the operator version of the derivative. Usually you will see it acting on something, such as $\frac{d}{dv}(v^2 + 3u) = 2v$.

$[Jf(x)], [Df(x)]$ : The first of these represents the Jacobian of $f$, which is a matrix of partial derivatives such that
$$[Jf(x)] = \begin{pmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{pmatrix}$$
where $f_n$ represents the $n$th function of a vector-valued function. The second of these notations represents the derivative matrix, which in most cases is the Jacobian, but in some cases does not exist, even though the Jacobian exists. Note that the directional derivative in the direction $v$ is simply $[Jf(x)]v$.

Version: 7 Owner: slider142 Author(s): slider142
362.2
fundamental theorems of calculus
The Fundamental Theorems of Calculus serve to demonstrate that integration and differentiation are inverse processes.
First Fundamental Theorem: Suppose that $F$ is a differentiable function on the interval $[a, b]$ whose derivative $F'$ is Riemann integrable on $[a, b]$. Then
$$\int_a^b F'(x)\,dx = F(b) - F(a).$$
Second Fundamental Theorem: Let $f$ be a continuous function on the interval $[a, b]$, let $c$ be an arbitrary point in this interval and assume $f$ is integrable on the intervals of the form $[c, x]$ for all $x \in [a, b]$. Let $F$ be defined as
$$F(x) = \int_c^x f(t)\,dt$$
for every $x$ in $(a, b)$. Then $F$ is differentiable and $F'(x) = f(x)$.
This result is about Riemann integrals. When dealing with Lebesgue integrals we get a generalization with Lebesgue's differentiation theorem.
Version: 9 Owner: mathcam Author(s): drini, greg
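Both theorems can be checked numerically on a concrete function. The following is only a sketch under stated assumptions: the helper names `midpoint_integral` and `derivative` are hypothetical, the integral is approximated by a midpoint Riemann sum, and the derivative by a central difference, so agreement holds only up to discretisation error.

```python
import math

def midpoint_integral(g, a, b, n=10_000):
    """Midpoint Riemann sum of g over [a, b] using n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def derivative(G, x, h=1e-5):
    """Central difference approximation of G'(x)."""
    return (G(x + h) - G(x - h)) / (2 * h)

# First theorem, with F = sin and F' = cos on [0, 2]:
# the integral of F' should match F(2) - F(0).
first_lhs = midpoint_integral(math.cos, 0.0, 2.0)
first_rhs = math.sin(2.0) - math.sin(0.0)

# Second theorem, with f = cos and c = 0:
# F(x) is the integral of f from c to x, and F'(1) should match f(1).
def F(x):
    return midpoint_integral(math.cos, 0.0, x)

second_lhs = derivative(F, 1.0)
second_rhs = math.cos(1.0)

print(first_lhs, first_rhs)
print(second_lhs, second_rhs)
```

The two printed pairs should agree to several decimal places; tightening `n` and `h` trades run time against accuracy.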
362.3
logarithm
Definition. Three real numbers $x, y, p$, with $x, y > 0$ and $x \neq 1$, are said to obey the logarithmic relation
$$\log_x(y) = p$$
if they obey the corresponding exponential relation:
$$x^p = y.$$
Note that by the monotonicity and continuity property of the exponential operation, for given $x$ and $y$ there exists a unique $p$ satisfying the above relation. We are therefore able to say that $p$ is the logarithm of $y$ relative to the base $x$.

Properties. There are a number of basic algebraic identities involving logarithms.
$$\log_x(yz) = \log_x(y) + \log_x(z)$$
$$\log_x(y/z) = \log_x(y) - \log_x(z)$$
$$\log_x(y^z) = z \log_x(y)$$
$$\log_x(1) = 0$$
$$\log_x(x) = 1$$
$$\log_x(y) \log_y(x) = 1$$
$$\log_y(z) = \frac{\log_x(z)}{\log_x(y)}$$

Notes. In essence, logarithms convert multiplication to addition, and exponentiation to multiplication. Historically, these properties of the logarithm made it a useful tool for doing numerical calculations. Before the advent of electronic calculators and computers, tables of logarithms and the logarithmic slide rule were essential computational aids.
Scientific applications predominantly make use of logarithms whose base is the Eulerian number $e = 2.71828\ldots$. Such logarithms are called natural logarithms and are commonly denoted by the symbol $\ln$, e.g. $\ln(e) = 1$. Natural logarithms naturally give rise to the natural logarithm function.
A frequent convention, seen in elementary mathematics texts and on calculators, is that logarithms that do not give a base explicitly are assumed to be base 10, e.g. $\log(100) = 2$. This is far from universal. In Rudin's "Real and Complex Analysis", for example, we see a baseless log used to refer to the natural logarithm. By contrast, computer science and information theory texts often assume 2 as the default logarithm base. This is motivated by the fact that $\log_2(N)$ is the approximate number of bits required to encode $N$ different messages.
The invention of logarithms is commonly credited to John Napier.
Version: 13 Owner: rmilson Author(s): rmilson
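The identities listed under Properties can be spot-checked in floating point. This is a minimal sketch: `log_base` is a hypothetical helper built on the standard library's natural logarithm via the change-of-base identity, and equality is tested only up to rounding error.

```python
import math

def log_base(x, y):
    """Return p such that x**p == y, for a base x > 0, x != 1, and y > 0."""
    if x <= 0 or x == 1 or y <= 0:
        raise ValueError("need base x > 0, x != 1, and argument y > 0")
    return math.log(y) / math.log(x)

x, y, z = 3.0, 5.0, 7.0

# Each pair below should agree up to floating-point rounding.
checks = [
    (log_base(x, y * z), log_base(x, y) + log_base(x, z)),
    (log_base(x, y / z), log_base(x, y) - log_base(x, z)),
    (log_base(x, y ** z), z * log_base(x, y)),
    (log_base(x, 1.0), 0.0),
    (log_base(x, x), 1.0),
    (log_base(x, y) * log_base(y, x), 1.0),
    (log_base(y, z), log_base(x, z) / log_base(x, y)),
]
all_hold = all(math.isclose(lhs, rhs, abs_tol=1e-12) for lhs, rhs in checks)
print(all_hold)
```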
362.4
proof of the ﬁrst fundamental theorem of calculus
Let us make a subdivision of the interval $[a, b]$,
$$\Delta : \{a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b\}.$$
From this, we can say
$$F(b) - F(a) = \sum_{i=1}^{n} [F(x_i) - F(x_{i-1})].$$
From the mean-value theorem, we have that for any two points $\bar{x} < \bar{\bar{x}}$ there exists $\xi \in (\bar{x}, \bar{\bar{x}})$ such that
$$F(\bar{\bar{x}}) - F(\bar{x}) = F'(\xi)(\bar{\bar{x}} - \bar{x}).$$
If we use $x_i$ as $\bar{\bar{x}}$ and $x_{i-1}$ as $\bar{x}$, calling our intermediate point $\xi_i$, we get
$$F(x_i) - F(x_{i-1}) = F'(\xi_i)(x_i - x_{i-1}).$$
Combining these, and using the abbreviation $\Delta_i x = x_i - x_{i-1}$, we have
$$F(b) - F(a) = \sum_{i=1}^{n} F'(\xi_i)\,\Delta_i x.$$
From the definition of an integral, for every $\epsilon > 0$ there exists $\delta > 0$ such that
$$\left| \sum_{i=1}^{n} F'(\xi_i)\,\Delta_i x - \int_a^b F'(x)\,dx \right| < \epsilon$$
when $\|\Delta\| < \delta$. Thus, for every $\epsilon > 0$,
$$\left| F(b) - F(a) - \int_a^b F'(x)\,dx \right| < \epsilon.$$
But $F(b) - F(a) - \int_a^b F'(x)\,dx$ is constant with respect to $\epsilon$, which can only mean that $F(b) - F(a) - \int_a^b F'(x)\,dx = 0$, and so we have the first fundamental theorem of calculus
$$F(b) - F(a) = \int_a^b F'(x)\,dx.$$
Version: 4 Owner: greg Author(s): greg
362.5
proof of the second fundamental theorem of calculus
Recall that a continuous function is Riemann integrable, so the integral
$$F(x) = \int_c^x f(t)\,dt$$
is well defined. Consider the increment of $F$:
$$F(x + h) - F(x) = \int_c^{x+h} f(t)\,dt - \int_c^x f(t)\,dt = \int_x^{x+h} f(t)\,dt$$
(we have used the linearity of the integral with respect to the function and the additivity with respect to the domain). Take $h > 0$; the case $h < 0$ is analogous. Now let $M$ be the maximum of $f$ on $[x, x + h]$ and $m$ be the minimum. Clearly we have
$$mh \leq \int_x^{x+h} f(t)\,dt \leq Mh$$
(this is due to the monotonicity of the integral with respect to the integrand), which can be written as
$$\frac{F(x + h) - F(x)}{h} = \frac{\int_x^{x+h} f(t)\,dt}{h} \in [m, M].$$
Since $f$ is continuous, by the intermediate value theorem there exists $\xi_h \in [x, x + h]$ such that
$$f(\xi_h) = \frac{F(x + h) - F(x)}{h},$$
so that
$$F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \lim_{h \to 0} f(\xi_h) = f(x),$$
since $\xi_h \to x$ as $h \to 0$.
Version: 1 Owner: paolini Author(s): paolini
362.6
rootmeansquare
If $x_1, x_2, \ldots, x_n$ are real numbers, we define their root-mean-square or quadratic mean as
$$R(x_1, x_2, \ldots, x_n) = \sqrt{\frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n}}.$$
The root-mean-square of a random variable $X$ is defined as the square root of the expectation of $X^2$:
$$R(X) = \sqrt{E(X^2)}.$$
If $X_1, X_2, \ldots, X_n$ are independent random variables with standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_n$, then the standard deviation of their arithmetic mean, $\frac{X_1 + X_2 + \cdots + X_n}{n}$, is $\frac{1}{\sqrt{n}}$ times the root-mean-square of $\sigma_1, \sigma_2, \ldots, \sigma_n$.
Version: 1 Owner: pbruin Author(s): pbruin
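As an illustration of the definition for finitely many real numbers, here is a short sketch; the function name `root_mean_square` is hypothetical and not part of the entry.

```python
import math

def root_mean_square(values):
    """Quadratic mean: square root of the arithmetic mean of the squares."""
    values = list(values)
    if not values:
        raise ValueError("root-mean-square of an empty collection is undefined")
    return math.sqrt(sum(v * v for v in values) / len(values))

print(root_mean_square([3.0, 4.0]))   # sqrt((9 + 16) / 2)
print(root_mean_square([-2.0, 2.0]))  # signs do not matter, only magnitudes
```

Because each value is squared first, the result is never smaller than the absolute value of the arithmetic mean.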
362.7
square
The square of a number $x$ is the number obtained by multiplying $x$ by itself. It is denoted as $x^2$.
Some examples:
$$5^2 = 25$$
$$\left(\frac{1}{3}\right)^2 = \frac{1}{9}$$
$$0^2 = 0$$
$$.5^2 = .25$$
Version: 2 Owner: drini Author(s): drini
Chapter 363 26XX – Real functions
363.1 abelian function
An abelian or hyperelliptic function is a generalisation of an elliptic function. It is a function of two variables with four periods. In a similar way to an elliptic function, it can also be regarded as the inverse function to certain integrals (called abelian or hyperelliptic integrals) of the form
$$\int \frac{dz}{\sqrt{R(z)}}$$
where $R$ is a polynomial of degree greater than 4.
Version: 2 Owner: vladm Author(s): vladm
363.2
fullwidth at half maximum
The full-width at half maximum (FWHM) is a parameter used to describe the width of a bump on a function (or curve). The FWHM is given by the distance between the points where the function reaches half of its maximum value. For example, consider the function
$$f(x) = \frac{10}{x^2 + 1}.$$
$f$ reaches its maximum for $x = 0$ ($f(0) = 10$), so $f$ reaches half of its maximum value for $x = 1$ and $x = -1$ ($f(1) = f(-1) = 5$). So the FWHM for $f$, in this case, is 2, because the distance between $A(1, 5)$ and $B(-1, 5)$ is 2.
The function $f(x) = \frac{10}{x^2 + 1}$ is called "the Agnesi curve", after Maria Gaetana Agnesi (1718-1799).
Version: 2 Owner: vladm Author(s): vladm
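The worked example for $f(x) = 10/(x^2 + 1)$ can be reproduced numerically. The sketch below assumes a single bump whose peak location is known and which falls below half of its maximum at both bracket ends; `bisect_root` and `fwhm` are hypothetical helper names.

```python
def agnesi(x):
    """The example curve f(x) = 10 / (x^2 + 1)."""
    return 10.0 / (x * x + 1.0)

def bisect_root(g, lo, hi, tol=1e-12):
    """Locate a sign change of g in [lo, hi]; g(lo) and g(hi) must differ in sign."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # sign change lies in the right half
        else:
            hi = mid                # sign change lies in the left half
    return (lo + hi) / 2

def fwhm(f, peak, left, right):
    """Distance between the two points where f equals half of f(peak)."""
    half = f(peak) / 2

    def g(x):
        return f(x) - half

    x_left = bisect_root(g, left, peak)
    x_right = bisect_root(g, peak, right)
    return x_right - x_left

width = fwhm(agnesi, peak=0.0, left=-10.0, right=10.0)
print(width)  # close to 2, matching the half-maximum points x = -1 and x = 1
```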
Chapter 364 26A03 – Foundations: limits and generalizations, elementary topology of the line
364.1 Cauchy sequence
A sequence $x_0, x_1, x_2, \ldots$ in a metric space $(X, d)$ is a Cauchy sequence if, for every real number $\epsilon > 0$, there exists a natural number $N$ such that $d(x_n, x_m) < \epsilon$ whenever $n, m > N$.
Version: 4 Owner: djao Author(s): djao, rmilson
364.2
Dedekind cuts
The purpose of Dedekind cuts is to provide a sound logical foundation for the real number system. Dedekind's motivation behind this project is to notice that a real number α, intuitively, is completely determined by the rationals strictly smaller than α and those strictly larger than α. Concerning the completeness or continuity of the real line, Dedekind notes in [2] that
"If all points of the straight line fall into two classes such that every point of the first class lies to the left of every point of the second class, then there exists one and only one point which produces this division of all points into two classes, this severing of the straight line into two portions."
Dedekind defines a point to produce the division of the real line if this point is either the least or greatest element of either one of the classes mentioned above. He further notes that
the completeness property, as he just phrased it, is deficient in the rationals, which motivates the definition of reals as cuts of rationals. Because all rationals greater than α are really just excess baggage, we prefer to sway somewhat from Dedekind's original definition. Instead, we adopt the following definition.
Definition 34. A Dedekind cut is a subset α of the rational numbers Q that satisfies these properties:
1. α is not empty.
2. Q \ α is not empty.
3. α contains no greatest element.
4. For x, y ∈ Q, if x ∈ α and y < x, then y ∈ α as well.
Dedekind cuts are particularly appealing for two reasons. First, they make it very easy to prove the completeness, or continuity, of the real line. Also, they make it quite plain to distinguish the rationals from the irrationals on the real line, and put the latter on a firm logical foundation. In the construction of the real numbers from Dedekind cuts, we make the following definition:
Definition 35. A real number is a Dedekind cut. We denote the set of all real numbers by R and we order them by set-theoretic inclusion, that is to say, for any α, β ∈ R, α < β if and only if α ⊂ β, where the inclusion is strict. We further define α = β as real numbers if α and β are equal as sets. As usual, we write α ≤ β if α < β or α = β. Moreover, a real number α is said to be irrational if Q \ α contains no least element.
The Dedekind completeness property of real numbers, expressed as the supremum property, now becomes straightforward to prove. In what follows, we will reserve Greek variables for real numbers, and Roman variables for rationals.
Theorem 11. Every nonempty subset of real numbers that is bounded above has a least upper bound.
Let A be a nonempty set of real numbers, such that for every α ∈ A we have that α ≤ γ for some real number γ. Now define the set
$$\sup A = \bigcup_{\alpha \in A} \alpha.$$
We must show that this set is a real number. This amounts to checking the four conditions of a Dedekind cut.
1. sup A is clearly not empty, for it is the nonempty union of nonempty sets.
2. Because γ is a real number, there is some rational x that is not in γ. Since every α ∈ A is a subset of γ, x is not in any α, so x ∉ sup A either. Thus, Q \ sup A is nonempty.
3. If sup A had a greatest element g, then g ∈ α for some α ∈ A. Then g would be a greatest element of α, but α is a real number, so by contrapositive, sup A has no greatest element.
4. Lastly, if x ∈ sup A, then x ∈ α for some α, so given any y < x, because α is a real number, y ∈ α, whence y ∈ sup A.
Thus, sup A is a real number. Trivially, sup A is an upper bound of A, for every α ⊆ sup A. It now suffices to prove that sup A ≤ γ, because γ was an arbitrary upper bound. But this is easy, because every x ∈ sup A is an element of α for some α ∈ A, so because α ⊆ γ, x ∈ γ. Thus, sup A is the least upper bound of A. We call this real number the supremum of A.
To finish the construction of the real numbers, we must endow them with algebraic operations, define the additive and multiplicative identity elements, prove that these definitions give a field, and prove further results about the order of the reals (such as the totality of this order); in short, build a complete ordered field. This task is somewhat laborious, but we include here the appropriate definitions. Verifying their correctness can be an instructive, albeit tiresome, exercise. We use the same symbols for the operations on the reals as for the rational numbers; this should cause no confusion in context.
Definition 36. Given two real numbers α and β, we define
• The additive identity, denoted 0, is
$$0 := \{x \in \mathbb{Q} : x < 0\}$$
• The multiplicative identity, denoted 1, is
$$1 := \{x \in \mathbb{Q} : x < 1\}$$
• Addition of α and β, denoted α + β, is
$$\alpha + \beta := \{x + y : x \in \alpha,\ y \in \beta\}$$
• The opposite of α, denoted −α, is
$$-\alpha := \{x \in \mathbb{Q} : -x \notin \alpha, \text{ but } -x \text{ is not the least element of } \mathbb{Q} \setminus \alpha\}$$
• The absolute value of α, denoted |α|, is
$$|\alpha| := \begin{cases} \alpha, & \text{if } \alpha \geq 0 \\ -\alpha, & \text{if } \alpha \leq 0 \end{cases}$$
• If α, β > 0, then multiplication of α and β, denoted α · β, is
$$\alpha \cdot \beta := \{z \in \mathbb{Q} : z \leq 0 \text{ or } z = xy \text{ for some } x \in \alpha,\ y \in \beta \text{ with } x, y > 0\}$$
In general,
$$\alpha \cdot \beta := \begin{cases} 0, & \text{if } \alpha = 0 \text{ or } \beta = 0 \\ |\alpha| \cdot |\beta|, & \text{if } \alpha > 0, \beta > 0 \text{ or } \alpha < 0, \beta < 0 \\ -(|\alpha| \cdot |\beta|), & \text{if } \alpha > 0, \beta < 0 \text{ or } \alpha < 0, \beta > 0 \end{cases}$$
• The inverse of α > 0, denoted $\alpha^{-1}$, is
$$\alpha^{-1} := \{x \in \mathbb{Q} : x \leq 0 \text{ or } x > 0 \text{ and } (1/x) \notin \alpha, \text{ but } 1/x \text{ is not the least element of } \mathbb{Q} \setminus \alpha\}$$
If α < 0,
$$\alpha^{-1} := -(|\alpha|)^{-1}$$
All that remains (!) is to check that the above deﬁnitions do indeed deﬁne a complete ordered ﬁeld, and that all the sets implied to be real numbers are indeed so. The properties of R as an ordered ﬁeld follow from these deﬁnitions and the properties of Q as an ordered ﬁeld. It is important to point out that in two steps, in showing that inverses and opposites are properly deﬁned, we require an extra property of Q, not merely in its capacity as an ordered ﬁeld. This requirement is the Archimedean property. Moreover, because R is a ﬁeld of characteristic 0, it contains an isomorphic copy of Q. The rationals correspond to the Dedekind cuts α for which Q \ α contains a least member.
REFERENCES
1. Courant, Richard and Robbins, Herbert. What is Mathematics? pp. 68-72. Oxford University Press, Oxford, 1969.
2. Dedekind, Richard. Essays on the Theory of Numbers. Dover Publications Inc, New York, 1963.
3. Rudin, Walter. Principles of Mathematical Analysis. pp. 17-21. McGraw-Hill Inc, New York, 1976.
4. Spivak, Michael. Calculus. pp. 569-596. Publish or Perish, Inc., Houston, 1994.
Version: 20 Owner: rmilson Author(s): rmilson, NeuRet
364.3
binomial proof of positive integer power rule
We will use the difference quotient in this proof of the power rule for positive integers. Let $f(x) = x^n$ for some integer $n \geq 0$. Then we have
$$f'(x) = \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}.$$
We can use the binomial theorem to expand the numerator:
$$f'(x) = \lim_{h \to 0} \frac{C_0^n x^0 h^n + C_1^n x^1 h^{n-1} + \cdots + C_{n-1}^n x^{n-1} h^1 + C_n^n x^n h^0 - x^n}{h}$$
where $C_k^n = \frac{n!}{k!(n-k)!}$. We can now simplify the above:
$$f'(x) = \lim_{h \to 0} \frac{h^n + n x h^{n-1} + \cdots + n x^{n-1} h + x^n - x^n}{h} = \lim_{h \to 0} \left(h^{n-1} + n x h^{n-2} + \cdots + n x^{n-1}\right) = n x^{n-1}.$$
Version: 4 Owner: mathcam Author(s): mathcam, slider142
364.4
exponential
Preamble. We use $\mathbb{R}^+ \subset \mathbb{R}$ to denote the set of positive real numbers. Our aim is to define the exponential, or the generalized power operation,
$$x^p, \quad x \in \mathbb{R}^+,\ p \in \mathbb{R}.$$
The power $p$ in the above expression is called the exponent. We take it as proven that $\mathbb{R}$ is a complete, ordered field. No other properties of the real numbers are invoked.
Definition. For $x \in \mathbb{R}^+$ and $n \in \mathbb{Z}$ we define $x^n$ in terms of repeated multiplication. To be more precise, we inductively characterize natural number powers as follows:
$$x^0 = 1, \quad x^{n+1} = x \cdot x^n, \quad n \in \mathbb{N}.$$
The existence of the reciprocal is guaranteed by the assumption that $\mathbb{R}$ is a field. Thus, for negative exponents, we can define
$$x^{-n} = (x^{-1})^n, \quad n \in \mathbb{N},$$
where $x^{-1}$ is the reciprocal of $x$.
The case of arbitrary exponents is somewhat more complicated. A possible strategy is to define roots, then rational powers, and then extend by continuity. Our approach is different. For $x \in \mathbb{R}^+$ and $p \in \mathbb{R}$, we define the set of all reals that one would want to be smaller than $x^p$, and then define the latter as the least upper bound of this set. To be more precise, let $x > 1$ and define
$$L(x, p) = \{z \in \mathbb{R}^+ : z^n < x^m \text{ for all } m \in \mathbb{Z},\ n \in \mathbb{N} \text{ such that } m < pn\}.$$
We then define $x^p$ to be the least upper bound of $L(x, p)$. For $x < 1$ we define $x^p = (x^{-1})^{-p}$. The exponential operation possesses a number of important properties, some of which characterize it up to uniqueness.
Note. It is also possible to define the exponential operation in terms of the exponential function and the natural logarithm. Since these concepts require the context of differential theory, it seems preferable to give a basic definition that relies only on the foundational property of the reals.
Version: 11 Owner: rmilson Author(s): rmilson
364.5
interleave sequence
Let $S$ be a set, and let $\{x_i\}$, $i = 0, 1, 2, \ldots$ and $\{y_i\}$, $i = 0, 1, 2, \ldots$ be two sequences in $S$. The interleave sequence is defined to be the sequence $x_0, y_0, x_1, y_1, \ldots$. Formally, it is the sequence $\{z_i\}$, $i = 0, 1, 2, \ldots$ given by
$$z_i := \begin{cases} x_k & \text{if } i = 2k \text{ is even,} \\ y_k & \text{if } i = 2k + 1 \text{ is odd.} \end{cases}$$
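The definition translates directly into code. In this sketch the two sequences are modelled as functions of the index; `interleave` and `interleave_term` are hypothetical names, the first producing the sequence lazily and the second evaluating a single term $z_i$ from the case formula.

```python
from itertools import count, islice

def interleave(x, y):
    """Yield x(0), y(0), x(1), y(1), ... for index functions x and y."""
    for k in count():
        yield x(k)
        yield y(k)

def interleave_term(i, x, y):
    """Return z_i: x(k) when i = 2k is even, y(k) when i = 2k + 1 is odd."""
    k, remainder = divmod(i, 2)
    return x(k) if remainder == 0 else y(k)

def naturals(k):
    return k

def negatives(k):
    return -k

first_six = list(islice(interleave(naturals, negatives), 6))
print(first_six)
```

Both functions describe the same sequence, so `interleave_term(i, x, y)` agrees with the `i`th value produced by `interleave(x, y)`.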
Version: 2 Owner: djao Author(s): djao
364.6
limit inferior
Let $S \subset \mathbb{R}$ be a set of real numbers. Recall that a limit point of $S$ is a real number $x \in \mathbb{R}$ such that for all $\epsilon > 0$ there exist infinitely many $y \in S$ such that
$$|x - y| < \epsilon.$$
We define $\liminf S$, pronounced the limit inferior of $S$, to be the infimum of all the limit points of $S$. If there are no limit points, we define the limit inferior to be $+\infty$. The two most common notations for the limit inferior are
$$\liminf S \quad \text{and} \quad \underline{\lim}\, S.$$
An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers $x_0, x_1, x_2, \ldots$. For each $k \in \mathbb{N}$, let $y_k$ be the infimum of the $k$th tail,
$$y_k = \inf_{j \geq k} x_j.$$
This construction produces a non-decreasing sequence
$$y_0 \leq y_1 \leq y_2 \leq \ldots,$$
which either converges to its supremum, or diverges to $+\infty$. We define the limit inferior of the original sequence to be this limit:
$$\liminf_k x_k = \lim_k y_k.$$
Version: 7 Owner: rmilson Author(s): rmilson
364.7
limit superior
Let $S \subset \mathbb{R}$ be a set of real numbers. Recall that a limit point of $S$ is a real number $x \in \mathbb{R}$ such that for all $\epsilon > 0$ there exist infinitely many $y \in S$ such that
$$|x - y| < \epsilon.$$
We define $\limsup S$, pronounced the limit superior of $S$, to be the supremum of all the limit points of $S$. If there are no limit points, we define the limit superior to be $-\infty$. The two most common notations for the limit superior are
$$\limsup S \quad \text{and} \quad \overline{\lim}\, S.$$
An alternative, but equivalent, definition is available in the case of an infinite sequence of real numbers $x_0, x_1, x_2, \ldots$. For each $k \in \mathbb{N}$, let $y_k$ be the supremum of the $k$th tail,
$$y_k = \sup_{j \geq k} x_j.$$
This construction produces a non-increasing sequence
$$y_0 \geq y_1 \geq y_2 \geq \ldots,$$
which either converges to its infimum, or diverges to $-\infty$. We define the limit superior of the original sequence to be this limit:
$$\limsup_k x_k = \lim_k y_k.$$
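The tail construction used for the limit inferior and the limit superior can be illustrated on a finite prefix of a sequence. This is only an approximation under stated assumptions: a true tail is infinite, so the values below are tail extrema of the first 2000 terms, and the function names are hypothetical. The example sequence $x_k = (-1)^k (1 + 1/(k+1))$ has limit superior 1 and limit inferior $-1$.

```python
def tail_infima(xs):
    """y_k = min(xs[k:]) for every k, computed in one pass from the right."""
    out = [0.0] * len(xs)
    running = float("inf")
    for k in range(len(xs) - 1, -1, -1):
        running = min(running, xs[k])
        out[k] = running
    return out

def tail_suprema(xs):
    """y_k = max(xs[k:]) for every k, computed in one pass from the right."""
    out = [0.0] * len(xs)
    running = float("-inf")
    for k in range(len(xs) - 1, -1, -1):
        running = max(running, xs[k])
        out[k] = running
    return out

xs = [(-1) ** k * (1 + 1 / (k + 1)) for k in range(2000)]
infs = tail_infima(xs)   # non-decreasing, approaching -1 from below
sups = tail_suprema(xs)  # non-increasing, approaching 1 from above
print(infs[1000], sups[1000])
```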
Version: 7 Owner: rmilson Author(s): rmilson
364.8
power rule
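The power rule states that
$$\frac{D}{Dx} x^p = p x^{p-1}, \quad p \in \mathbb{R}.$$
This rule, when combined with the chain rule, product rule, and sum rule, makes calculating many derivatives far more tractable. For positive integer $p$, this rule can be derived by repeated application of the product rule. See the proof of the power rule. Repeated use of the above formula gives
$$\frac{d^i}{dx^i} x^k = \begin{cases} 0 & i > k \\ \frac{k!}{(k-i)!} x^{k-i} & i \leq k, \end{cases}$$
for integers $i, k \geq 0$.

Examples
$$\frac{D}{Dx} x^0 = \frac{0}{x} = 0 = \frac{D}{Dx} 1$$
$$\frac{D}{Dx} x^1 = 1 x^0 = 1 = \frac{D}{Dx} x$$
$$\frac{D}{Dx} x^2 = 2x$$
$$\frac{D}{Dx} x^3 = 3x^2$$
$$\frac{D}{Dx} \sqrt{x} = \frac{D}{Dx} x^{1/2} = \frac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}}$$
$$\frac{D}{Dx} 2x^e = 2e x^{e-1}$$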
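The examples above can be compared against a numerical derivative. This is a rough sketch rather than a proof: `central_difference` is a hypothetical helper, the comparison is made at the single point $x = 2$, and agreement holds only up to discretisation and rounding error.

```python
import math

def power_rule(p, x):
    """Derivative of x**p at x > 0 according to the power rule."""
    return p * x ** (p - 1)

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def iterated_derivative(k, i, x):
    """i-th derivative of x**k for integers i, k >= 0."""
    if i > k:
        return 0.0
    return math.factorial(k) / math.factorial(k - i) * x ** (k - i)

x0 = 2.0
exponents = (0, 1, 2, 3, 0.5, math.e)
# Pairs of (numerical estimate, power-rule value) for each exponent.
comparisons = [
    (central_difference(lambda t, p=p: t ** p, x0), power_rule(p, x0))
    for p in exponents
]
for numeric, exact in comparisons:
    print(numeric, exact)
```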
Version: 4 Owner: mathcam Author(s): mathcam, Logan
364.9
properties of the exponential
The exponential operation possesses the following properties.
• Homogeneity. For $x, y \in \mathbb{R}^+$, $p \in \mathbb{R}$ we have
$$(xy)^p = x^p y^p.$$
• Exponent additivity. For $x \in \mathbb{R}^+$ we have
$$x^0 = 1, \quad x^1 = x.$$
Furthermore
$$x^{p+q} = x^p x^q, \quad p, q \in \mathbb{R}.$$
• Monotonicity. For $x, y \in \mathbb{R}^+$ with $x < y$ and $p \in \mathbb{R}^+$ we have
$$x^p < y^p, \quad x^{-p} > y^{-p}.$$
• Continuity. The exponential operation is continuous with respect to its arguments. To be more precise, the following function is continuous:
$$P : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}, \quad P(x, y) = x^y.$$
Let us also note that the exponential operation is characterized (in the sense of existence and uniqueness) by the additivity and continuity properties. [Author’s note: One can probably get away with substantially less, but I haven’t given this enough thought.] Version: 10 Owner: rmilson Author(s): rmilson
364.10
squeeze rule
Squeeze rule for sequences
Let $f, g, h : \mathbb{N} \to \mathbb{R}$ be three sequences of real numbers such that
$$f(n) \leq g(n) \leq h(n)$$
for all $n$. If $\lim_{n \to \infty} f(n)$ and $\lim_{n \to \infty} h(n)$ exist and are equal, say to $a$, then $\lim_{n \to \infty} g(n)$ also exists and equals $a$.
The proof is fairly straightforward. Let $e$ be any real number $> 0$. By hypothesis there exist $M, N \in \mathbb{N}$ such that
$|a - f(n)| < e$ for all $n \geq M$
$|a - h(n)| < e$ for all $n \geq N$
Write $L = \max(M, N)$. For $n \geq L$ we have
• if $g(n) \geq a$: $|g(n) - a| = g(n) - a \leq h(n) - a < e$
• else $g(n) < a$ and: $|g(n) - a| = a - g(n) \leq a - f(n) < e$
So, for all $n \geq L$, we have $|g(n) - a| < e$, which is the desired conclusion.
Squeeze rule for functions
Let $f, g, h : S \to \mathbb{R}$ be three real-valued functions on a neighbourhood $S$ of a real number $b$, such that
$$f(x) \leq g(x) \leq h(x)$$
for all $x \in S - \{b\}$. If $\lim_{x \to b} f(x)$ and $\lim_{x \to b} h(x)$ exist and are equal, say to $a$, then $\lim_{x \to b} g(x)$ also exists and equals $a$.
Again let $e$ be an arbitrary positive real number. Find positive reals $\alpha$ and $\beta$ such that
$|a - f(x)| < e$ whenever $0 < |b - x| < \alpha$
$|a - h(x)| < e$ whenever $0 < |b - x| < \beta$
Write $\delta = \min(\alpha, \beta)$. Now, for any $x$ such that $0 < |b - x| < \delta$, we have
• if $g(x) \geq a$: $|g(x) - a| = g(x) - a \leq h(x) - a < e$
• else $g(x) < a$ and: $|g(x) - a| = a - g(x) \leq a - f(x) < e$
and we are done.
Version: 1 Owner: Daume Author(s): Larry Hammick
Chapter 365 26A06 – Onevariable calculus
365.1 Darboux’s theorem (analysis)
Let $f : [a, b] \to \mathbb{R}$ be a real-valued continuous function on $[a, b]$, which is differentiable on $(a, b)$, differentiable from the right at $a$, and differentiable from the left at $b$. Then $f'$ satisfies the conclusion of the intermediate value theorem: for every $t$ between $f'_+(a)$ and $f'_-(b)$, there is some $x \in [a, b]$ such that $f'(x) = t$. Note that when $f$ is continuously differentiable ($f \in C^1([a, b])$), this is trivially true by the intermediate value theorem. But even when $f'$ is not continuous, Darboux's theorem places a severe restriction on what it can be.
Version: 3 Owner: mathwizard Author(s): mathwizard, ariels
365.2
Fermat’s Theorem (stationary points)
Let $f : (a, b) \to \mathbb{R}$ be a continuous function and suppose that $x_0 \in (a, b)$ is a local extremum of $f$. If $f$ is differentiable at $x_0$ then $f'(x_0) = 0$.
Version: 2 Owner: paolini Author(s): paolini
365.3
Heaviside step function
The Heaviside step function is the function $H : \mathbb{R} \to \mathbb{R}$ defined as
$$H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
Here, there are many conventions for the value at $x = 0$. The motivation for setting $H(0) = 1/2$ is that we can then write $H$ as a function of the signum function, $H(x) = \frac{1}{2}(1 + \mathrm{sgn}(x))$. In applications, such as the Laplace transform, where the Heaviside function is used extensively, the value of $H(0)$ is irrelevant. The function is named after Oliver Heaviside (1850-1925) [1]. However, the function was already used by Cauchy [2], who defined the function as
$$u(t) = \frac{1}{2}\left(1 + t/\sqrt{t^2}\right)$$
and called it a coefficient limitateur [1].
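A small sketch of the definition with the $H(0) = 1/2$ convention, together with the relation to the signum function that motivates it and Cauchy's formula for $t \neq 0$. The function names are hypothetical.

```python
import math

def sign(x):
    """Signum function: -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def heaviside(x):
    """Heaviside step function with the convention H(0) = 1/2."""
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return 0.5

def cauchy_u(t):
    """Cauchy's form (1/2)(1 + t / sqrt(t^2)); undefined at t = 0."""
    if t == 0:
        raise ValueError("t / sqrt(t^2) is undefined at t = 0")
    return 0.5 * (1 + t / math.sqrt(t * t))

samples = (-2.5, -1.0, 0.0, 1.0, 3.0)
agrees_with_sign = all(heaviside(x) == (1 + sign(x)) / 2 for x in samples)
print(agrees_with_sign)
```

With a different convention for $H(0)$, for example 0 or 1, the identity with the signum function would fail at exactly one point, which is why the choice rarely matters in integral transforms.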
REFERENCES
1. The MacTutor History of Mathematics archive, Oliver Heaviside. 2. The MacTutor History of Mathematics archive, Augustin Louis Cauchy. 3. R.F. Hoskins, Generalised functions, Ellis Horwood Series: Mathematics and its applications, John Wiley & Sons, 1979.
Version: 1 Owner: Koro Author(s): matte
365.4
Leibniz’ rule
Theorem [Leibniz' rule] ([1] page 592) Let $f$ and $g$ be real (or complex) valued functions that are defined on an open interval of $\mathbb{R}$. If $f$ and $g$ are $k$ times differentiable, then
$$(fg)^{(k)} = \sum_{r=0}^{k} \binom{k}{r} f^{(k-r)} g^{(r)}.$$
For multi-indices, Leibniz' rule has the following generalization:
Theorem [2] If $f, g : \mathbb{R}^n \to \mathbb{C}$ are smooth functions, and $j$ is a multi-index, then
$$\partial^j (fg) = \sum_{i \leq j} \binom{j}{i} \partial^i(f)\, \partial^{j-i}(g),$$
where $i$ is a multi-index.
REFERENCES
1. R. Adams, Calculus, a complete course, Addison-Wesley Publishers Ltd, 3rd ed.
2. http://www.math.umn.edu/ jodeit/course/TmprDist1.pdf
Version: 3 Owner: matte Author(s): matte
365.5
Rolle’s theorem
Rolle's theorem. If $f$ is a continuous function on $[a, b]$ that is differentiable on $(a, b)$ and satisfies $f(a) = f(b) = 0$, then there exists a point $c \in (a, b)$ such that $f'(c) = 0$.
Version: 8 Owner: drini Author(s): drini
365.6
binomial formula
The binomial formula gives the power series expansion of the $p$th power function for every real power $p$. To wit,
$$(1 + x)^p = \sum_{n=0}^{\infty} p^{\underline{n}} \frac{x^n}{n!}, \quad x \in \mathbb{R},\ |x| < 1,$$
where
$$p^{\underline{n}} = p(p - 1) \cdots (p - n + 1)$$
denotes the $n$th falling factorial of $p$.
Note that for p ∈ N the power series reduces to a polynomial. The above formula is therefore a generalization of the binomial theorem. Version: 4 Owner: rmilson Author(s): rmilson
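The series can be summed numerically to see both claims: convergence to $(1 + x)^p$ for $|x| < 1$, and termination after finitely many terms when $p$ is a natural number. This is a sketch with a hypothetical function name and a fixed, arbitrary number of terms; each term is obtained from the previous one through the ratio $(p - n)x/(n + 1)$.

```python
import math

def binomial_series(p, x, terms=200):
    """Partial sum of the series sum_n p(p-1)...(p-n+1) x^n / n! for |x| < 1."""
    if abs(x) >= 1:
        raise ValueError("the expansion is stated for |x| < 1")
    total = 0.0
    term = 1.0  # the n = 0 term
    for n in range(terms):
        total += term
        term *= (p - n) * x / (n + 1)  # turns term n into term n + 1
    return total

print(binomial_series(0.5, 0.3), 1.3 ** 0.5)   # real exponent
print(binomial_series(3, 0.5), 1.5 ** 3)       # p in N: only four nonzero terms
print(binomial_series(-1, 0.5), 1 / 1.5)       # geometric series
```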
365.7
chain rule
Let $f(x)$, $g(x)$ be differentiable, real-valued functions. The derivative of the composition $(f \circ g)(x)$ can be found using the chain rule, which asserts that:
$$(f \circ g)'(x) = f'(g(x))\, g'(x).$$
The chain rule has a particularly suggestive appearance in terms of the Leibniz formalism. Suppose that $z$ depends differentiably on $y$, and that $y$ in turn depends differentiably on $x$. Then,
$$\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}.$$
The apparent cancellation of the dy term is at best a formal mnemonic, and does not constitute a rigorous proof of this result. Rather, the Leibniz format is well suited to the interpretation of the chain rule in terms of related rates. To wit: The instantaneous rate of change of z relative to x is equal to the rate of change of z relative to y times the rate of change of y relative to x. Version: 5 Owner: rmilson Author(s): rmilson
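The related-rates reading can be checked numerically on a concrete pair of functions. In this sketch, $z = \sin(y)$ and $y = x^2$ are arbitrary illustrative choices, and `central_difference` is a hypothetical helper; the comparison is approximate.

```python
import math

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def g(x):
    """Inner function y = g(x) = x^2."""
    return x * x

def f(y):
    """Outer function z = f(y) = sin(y)."""
    return math.sin(y)

def composite(x):
    """z as a function of x: (f o g)(x)."""
    return f(g(x))

x0 = 1.3
dz_dx = central_difference(composite, x0)   # rate of change of z relative to x
dz_dy = central_difference(f, g(x0))        # rate of change of z relative to y
dy_dx = central_difference(g, x0)           # rate of change of y relative to x
print(dz_dx, dz_dy * dy_dx, math.cos(x0 ** 2) * 2 * x0)
```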
365.8
complex Rolle’s theorem
Theorem [1] Suppose $\Omega$ is an open convex set in $\mathbb{C}$, suppose $f$ is a holomorphic function $f : \Omega \to \mathbb{C}$, and suppose $f(a) = f(b) = 0$ for distinct points $a, b$ in $\Omega$. Then there exist points $u, v$ on $L_{ab}$ (the straight line connecting $a$ and $b$, not containing the endpoints), such that
$$\mathrm{Re}\{f'(u)\} = 0 \quad \text{and} \quad \mathrm{Im}\{f'(v)\} = 0.$$
REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle's Theorem, American Mathematical Monthly, Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.
Version: 4 Owner: matte Author(s): matte
365.9
complex meanvalue theorem
Theorem [1] Suppose $\Omega$ is an open convex set in $\mathbb{C}$, suppose $f$ is a holomorphic function $f : \Omega \to \mathbb{C}$, and suppose $a, b$ are distinct points in $\Omega$. Then there exist points $u, v$ on $L_{ab}$ (the straight line connecting $a$ and $b$, not containing the endpoints), such that
$$\mathrm{Re}\left\{\frac{f(b) - f(a)}{b - a}\right\} = \mathrm{Re}\{f'(u)\},$$
$$\mathrm{Im}\left\{\frac{f(b) - f(a)}{b - a}\right\} = \mathrm{Im}\{f'(v)\}.$$
REFERENCES
1. J.-Cl. Evard, F. Jafari, A Complex Rolle's Theorem, American Mathematical Monthly, Vol. 99, Issue 9, (Nov. 1992), pp. 858-861.
Version: 2 Owner: matte Author(s): matte
365.10
deﬁnite integral
The definite integral with respect to $x$ of a function $f(x)$ over the closed interval $[a, b]$ is defined to be the "area under the graph of $f(x)$ with respect to $x$" (where $f(x)$ is negative, the area counts as negative). It is written as
$$\int_a^b f(x)\,dx.$$
One way to find the value of the integral is to take a limit of an approximation technique as the precision increases to infinity. For example, use a Riemann sum, which approximates the area by dividing $[a, b]$ into $n$ intervals of equal width $\Delta x = (b - a)/n$ and then adding up the areas of rectangles whose width is that of the interval and whose height is a value of the function in the interval. Let $R_n$ be this approximation, which can be written as
$$R_n = \sum_{i=1}^{n} f(x_i^*)\,\Delta x,$$
where $x_i^*$ is some $x$ inside the $i$th interval. Then the integral is
$$\int_a^b f(x)\,dx = \lim_{n\to\infty} R_n = \lim_{n\to\infty} \sum_{i=1}^{n} f(x_i^*)\,\Delta x.$$
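The Riemann sum above can be tried numerically. The following is a minimal sketch, assuming the sample point $x_i^*$ is taken to be the midpoint of the $i$th interval (the definition allows any point); the function name `riemann_sum` is illustrative only.

```python
import math

def riemann_sum(f, a, b, n):
    """Riemann sum R_n of f over [a, b] with n intervals of equal width."""
    dx = (b - a) / n
    # Assumption: x_i^* is the midpoint of the i-th interval.
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

# R_n for sin on [0, pi]; the values approach the integral as n grows.
for n in (10, 100, 1000):
    print(n, riemann_sum(math.sin, 0.0, math.pi, n))
```

For $\sin$ on $[0, \pi]$ the printed values should approach 2 as $n$ increases.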
We can use this definition to arrive at some important properties of definite integrals ($a$, $b$, $c$ are constant with respect to $x$):
$$\int_a^b \bigl(f(x) + g(x)\bigr)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$$
$$\int_a^b \bigl(f(x) - g(x)\bigr)\,dx = \int_a^b f(x)\,dx - \int_a^b g(x)\,dx$$
$$\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$$
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$$
$$\int_a^b c\,f(x)\,dx = c\int_a^b f(x)\,dx$$
There are other generalisations about integrals, but many require the fundamental theorem of calculus. Version: 4 Owner: xriso Author(s): xriso
365.11
derivative of even/odd function (proof)
Suppose $f(x) = \pm f(-x)$. We need to show that $f'(x) = \mp f'(-x)$. To do this, let us define the auxiliary function $m : \mathbb{R} \to \mathbb{R}$, $m(x) = -x$. The condition on $f$ is then $f(x) = \pm (f \circ m)(x)$. Using the chain rule, we have that
$$f'(x) = \pm (f \circ m)'(x) = \pm f'(m(x))\,m'(x) = \mp f'(-x),$$
and the claim follows. $\Box$ Version: 2 Owner: mathcam Author(s): matte
365.12
direct sum of even/odd functions (example)
Example. direct sum of even and odd functions. Let us define the sets
$$F = \{f \mid f \text{ is a function from } \mathbb{R} \text{ to } \mathbb{R}\},$$
$$F_+ = \{f \in F \mid f(x) = f(-x) \text{ for all } x \in \mathbb{R}\},$$
$$F_- = \{f \in F \mid f(x) = -f(-x) \text{ for all } x \in \mathbb{R}\}.$$
In other words, $F$ contains all functions from $\mathbb{R}$ to $\mathbb{R}$, $F_+ \subset F$ contains all even functions, and $F_- \subset F$ contains all odd functions. All of these spaces have a natural vector space structure: for functions $f$ and $g$ we define $f + g$ as the function $x \mapsto f(x) + g(x)$. Similarly, if $c$ is a real constant, then $cf$ is the function $x \mapsto c f(x)$. With these operations, the zero vector is the mapping $x \mapsto 0$. We claim that $F$ is the direct sum of $F_+$ and $F_-$, i.e., that
$$F = F_+ \oplus F_-. \tag{365.12.1}$$
To prove this claim, let us first note that $F_\pm$ are vector subspaces of $F$. Second, given an arbitrary function $f$ in $F$, we can define
$$f_+(x) = \tfrac{1}{2}\bigl(f(x) + f(-x)\bigr),$$
$$f_-(x) = \tfrac{1}{2}\bigl(f(x) - f(-x)\bigr).$$
Now $f_+$ and $f_-$ are even and odd functions, respectively, and $f = f_+ + f_-$. Thus any function in $F$ can be split into two components $f_+$ and $f_-$, such that $f_+ \in F_+$ and $f_- \in F_-$. To show that the sum is direct, suppose $f$ is an element in $F_+ \cap F_-$. Then we have that $f(x) = -f(-x) = -f(x)$, so $f(x) = 0$ for all $x$, i.e., $f$ is the zero vector in $F$. We have established equation 365.12.1. Version: 2 Owner: mathcam Author(s): matte
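The splitting of a function into its even and odd components can be illustrated in a few lines. This is only a sketch: the helper names `even_part` and `odd_part` are hypothetical, and the exponential function is chosen as the test case because its even and odd parts are the familiar $\cosh$ and $\sinh$.

```python
import math

def even_part(f):
    """Return f_+ with f_+(x) = (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return f_- with f_-(x) = (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

f = math.exp
f_plus, f_minus = even_part(f), odd_part(f)

x = 1.3
print(f_plus(x), math.cosh(x))    # even part of exp agrees with cosh
print(f_minus(x), math.sinh(x))   # odd part of exp agrees with sinh
print(f_plus(x) + f_minus(x), f(x))
```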
365.13
even/odd function
Definition. Let $f$ be a function from $\mathbb{R}$ to $\mathbb{R}$. If $f(x) = f(-x)$ for all $x \in \mathbb{R}$, then $f$ is an even function. Similarly, if $f(x) = -f(-x)$ for all $x \in \mathbb{R}$, then $f$ is an odd function.
Example.
1. The trigonometric functions $\sin$ and $\cos$ are odd and even, respectively.
Properties.
1. The vector space of real functions can be written as the direct sum of even and odd functions. (See this page.)
2. Let $f : \mathbb{R} \to \mathbb{R}$ be a differentiable function. (a) If $f$ is an even function, then the derivative $f'$ is an odd function. (b) If $f$ is an odd function, then the derivative $f'$ is an even function. (proof)
3. Let $f : \mathbb{R} \to \mathbb{R}$ be a smooth function. Then there exist smooth functions $g, h : \mathbb{R} \to \mathbb{R}$ such that $f(x) = g(x^2) + x\,h(x^2)$ for all $x \in \mathbb{R}$. Thus, if $f$ is even, we have $f(x) = g(x^2)$, and if $f$ is odd, we have $f(x) = x\,h(x^2)$ ([1], Exercise 1.2).
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed., Springer-Verlag, 1990.
Version: 4 Owner: mathcam Author(s): matte
365.14
example of chain rule
Suppose we wanted to differentiate
$$h(x) = \sqrt{\sin(x)}.$$
Here, $h(x)$ is given by the composition $h(x) = f(g(x))$, where $f(x) = \sqrt{x}$ and $g(x) = \sin(x)$. The chain rule says that $h'(x) = f'(g(x))\,g'(x)$. Since
$$f'(x) = \frac{1}{2\sqrt{x}} \quad\text{and}\quad g'(x) = \cos(x),$$
we have by the chain rule
$$h'(x) = \frac{1}{2\sqrt{\sin x}}\,\cos x = \frac{\cos x}{2\sqrt{\sin x}}.$$
Using the Leibniz formalism, the above calculation would have the following appearance. First we describe the functional relation as
$$z = \sqrt{\sin(x)}.$$
Next, we introduce an auxiliary variable $y$, and write
$$z = \sqrt{y}, \qquad y = \sin(x).$$
We then have
$$\frac{dz}{dy} = \frac{1}{2\sqrt{y}}, \qquad \frac{dy}{dx} = \cos(x),$$
and hence the chain rule gives
$$\frac{dz}{dx} = \frac{1}{2\sqrt{y}}\,\cos(x) = \frac{1}{2}\,\frac{\cos(x)}{\sqrt{\sin(x)}}.$$
Version: 1 Owner: rmilson Author(s): rmilson
365.15
example of increasing/decreasing/monotone function
The function $f(x) = e^x$ is strictly increasing and hence strictly monotone. Similarly $g(x) = e^{-x}$ is strictly decreasing and hence strictly monotone. Consider the function $h : [1, 10] \to [1, 5]$ where
$$h(x) = \sqrt{x - 4\sqrt{x-1} + 3} + \sqrt{x - 6\sqrt{x-1} + 8}.$$
Writing $t = \sqrt{x-1}$, the two radicands are $(t-2)^2$ and $(t-3)^2$, so $h(x) = 5 - 2t$ for $1 \le x \le 5$ and $h(x) = 1$ for $5 \le x \le 10$. Thus $h$ is not strictly monotone, since it is constant on an interval; however, it is decreasing and hence monotone. Version: 1 Owner: Johan Author(s): Johan
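A quick numerical check of the last example can be sketched as follows. The grid of 91 sample points and the tolerance are arbitrary choices made for illustration; the `max(..., 0.0)` guard is an implementation detail that protects the square root from tiny negative rounding errors.

```python
import math

def h(x):
    """h(x) = sqrt(x - 4 sqrt(x-1) + 3) + sqrt(x - 6 sqrt(x-1) + 8) on [1, 10]."""
    t = math.sqrt(x - 1)
    # Guard against radicands that round to a tiny negative number.
    return (math.sqrt(max(x - 4 * t + 3, 0.0))
            + math.sqrt(max(x - 6 * t + 8, 0.0)))

xs = [1 + 9 * i / 90 for i in range(91)]   # sample points in [1, 10]
values = [h(x) for x in xs]

# h never increases (up to rounding) and is flat on [5, 10].
print(all(values[i + 1] <= values[i] + 1e-6 for i in range(90)))
print(h(1), h(5), h(7.5), h(10))
```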
365.16
extended mean-value theorem
Let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists some number $\xi \in (a, b)$ satisfying
$$(f(b) - f(a))\,g'(\xi) = (g(b) - g(a))\,f'(\xi).$$
If $g$ is linear this becomes the usual mean-value theorem. Version: 6 Owner: mathwizard Author(s): mathwizard
365.17
increasing/decreasing/monotone function
Definition. Let $A$ be a subset of $\mathbb{R}$, and let $f$ be a function $f : A \to \mathbb{R}$. Then
1. $f$ is increasing, if $x \le y$ implies that $f(x) \le f(y)$ (for all $x$ and $y$ in $A$).
2. $f$ is strictly increasing, if $x < y$ implies that $f(x) < f(y)$.
3. $f$ is decreasing, if $x \ge y$ implies that $f(x) \ge f(y)$.
4. $f$ is strictly decreasing, if $x > y$ implies that $f(x) > f(y)$.
5. $f$ is monotone, if $f$ is either increasing or decreasing.
6. $f$ is strictly monotone, if $f$ is either strictly increasing or strictly decreasing.
Theorem. Let $X$ be a bounded or unbounded open interval of $\mathbb{R}$. In other words, let $X$ be an interval of the form $X = (a, b)$, where $a, b \in \mathbb{R} \cup \{-\infty, \infty\}$. Further, let $f : X \to \mathbb{R}$ be a monotone function.
1. The set of points where $f$ is discontinuous is at most countable [1, 1].
2. (Lebesgue) $f$ is differentiable almost everywhere ([1], pp. 514).
REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.
2. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993.
Version: 3 Owner: matte Author(s): matte
365.18
intermediate value theorem
Let $f$ be a continuous function on the interval $[a, b]$. Let $x_1$ and $x_2$ be points with $a \le x_1 < x_2 \le b$ such that $f(x_1) \ne f(x_2)$. Then for each value $y$ between $f(x_1)$ and $f(x_2)$, there is a $c \in (x_1, x_2)$ such that $f(c) = y$. Bolzano's theorem is a special case of this one. Version: 2 Owner: drini Author(s): drini
365.19
limit
Let $f : X \setminus \{a\} \to Y$ be a function between two metric spaces $X$ and $Y$, defined everywhere except at some $a \in X$. For $L \in Y$, we say the limit of $f(x)$ as $x$ approaches $a$ is equal to $L$, or
$$\lim_{x \to a} f(x) = L,$$
if, for every real number $\varepsilon > 0$, there exists a real number $\delta > 0$ such that, whenever $x \in X$ with $0 < d_X(x, a) < \delta$, then $d_Y(f(x), L) < \varepsilon$. The formal definition of limit as given above has a well-deserved reputation for being notoriously hard for inexperienced students to master. There is no easy fix for this problem, since the concept of a limit is inherently difficult to state precisely (and indeed this was not accomplished historically until the 1800's by Cauchy, well after the invention of calculus in the 1600's by Newton and Leibniz). However, there are a number of related definitions, which, taken together, may shed some light on the nature of the concept.
• The notion of a limit can be generalized to mappings between arbitrary topological spaces. In this context we say that $\lim_{x \to a} f(x) = L$ if and only if, for every neighborhood $V$ of $L$ (in $Y$), there is a deleted neighborhood $U$ of $a$ (in $X$) which is mapped into $V$ by $f$.
• Let $a_n$, $n \in \mathbb{N}$, be a sequence of elements in a metric space $X$. We say that $L \in X$ is the limit of the sequence, if for every $\varepsilon > 0$ there exists a natural number $N$ such that $d(a_n, L) < \varepsilon$ for all natural numbers $n > N$.
• The definition of the limit of a mapping can be based on the limit of a sequence. To wit, $\lim_{x \to a} f(x) = L$ if and only if, for every sequence of points $x_n$ in $X$ converging to $a$ (that is, $x_n \to a$, $x_n \ne a$), the sequence of points $f(x_n)$ in $Y$ converges to $L$.
In calculus, $X$ and $Y$ are frequently taken to be Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, in which case the distance functions $d_X$ and $d_Y$ cited above are just Euclidean distance. Version: 5 Owner: djao Author(s): rmilson, djao
365.20
mean value theorem
Mean value theorem. Let $f : [a, b] \to \mathbb{R}$ be a continuous function differentiable on $(a, b)$. Then there is some real number $x_0 \in (a, b)$ such that
$$f'(x_0) = \frac{f(b) - f(a)}{b - a}.$$
Version: 3 Owner: drini Author(s): drini, apmxi
365.21
mean-value theorem
Let $f : \mathbb{R} \to \mathbb{R}$ be a function which is continuous on the interval $[a, b]$ and differentiable on $(a, b)$. Then there exists a number $c$ with $a < c < b$ such that
$$f'(c) = \frac{f(b) - f(a)}{b - a}. \tag{365.21.1}$$
The geometrical meaning of this theorem is illustrated in the picture (figure not reproduced).
This is often used in the integral context: there exists $c \in [a, b]$ such that
$$(b - a)\,f(c) = \int_a^b f(x)\,dx. \tag{365.21.2}$$
Version: 4 Owner: mathwizard Author(s): mathwizard, drummond
365.22
monotonicity criterion
Suppose that $f : [a, b] \to \mathbb{R}$ is a function which is continuous on $[a, b]$ and differentiable on $(a, b)$. Then the following relations hold.
1. $f'(x) \ge 0$ for all $x \in (a, b)$ $\Leftrightarrow$ $f$ is an increasing function on $[a, b]$;
2. $f'(x) \le 0$ for all $x \in (a, b)$ $\Leftrightarrow$ $f$ is a decreasing function on $[a, b]$;
3. $f'(x) > 0$ for all $x \in (a, b)$ $\Rightarrow$ $f$ is a strictly increasing function on $[a, b]$;
4. $f'(x) < 0$ for all $x \in (a, b)$ $\Rightarrow$ $f$ is a strictly decreasing function on $[a, b]$.
Notice that the third and fourth statements cannot be reversed. As an example consider the function $f : [-1, 1] \to \mathbb{R}$, $f(x) = x^3$. This is a strictly increasing function, but $f'(0) = 0$. Version: 4 Owner: paolini Author(s): paolini
365.23
nabla
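Let $f : \mathbb{R}^n \to \mathbb{R}$ be a $C^1(\mathbb{R}^n)$ function, that is, a function whose partial derivatives with respect to all its coordinates exist and are continuous. The symbol $\nabla$, named nabla, represents the gradient operator, whose action on $f(x_1, x_2, \ldots, x_n)$ is given by
$$\nabla f = (f_{x_1}, f_{x_2}, \ldots, f_{x_n}) = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right).$$
Version: 2 Owner: drini Author(s): drini, apmxi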
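The action of the gradient operator can be approximated numerically. The sketch below uses central differences with a step size `h` chosen for illustration; the function `gradient` and the sample function `f` are hypothetical names introduced here, not part of the entry.

```python
def gradient(f, x, h=1e-6):
    """Approximate (df/dx_1, ..., df/dx_n) at the point x by central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

def f(x):
    # Sample function f(x1, x2) = x1^2 * x2 + x2^3,
    # whose exact gradient is (2 * x1 * x2, x1^2 + 3 * x2^2).
    return x[0] ** 2 * x[1] + x[1] ** 3

print(gradient(f, [1.0, 2.0]))   # close to [4.0, 13.0]
```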
365.24
one-sided limit
Let $f$ be a real-valued function defined on $S \subseteq \mathbb{R}$. The left-hand one-sided limit at $a$ is defined to be the real number $L^-$ such that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|f(x) - L^-| < \varepsilon$ whenever $0 < a - x < \delta$. Analogously, the right-hand one-sided limit at $a$ is the real number $L^+$ such that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that $|f(x) - L^+| < \varepsilon$ whenever $0 < x - a < \delta$. Common notations for the one-sided limits are
$$L^+ = f(a+) = \lim_{x \to a^+} f(x) = \lim_{x \searrow a} f(x),$$
$$L^- = f(a-) = \lim_{x \to a^-} f(x) = \lim_{x \nearrow a} f(x).$$
Sometimes, left-handed limits are referred to as limits from below while right-handed limits are from above. Theorem. The ordinary limit of a function exists at a point if and only if both one-sided limits exist at this point and are equal (to the ordinary limit). For example, the Heaviside unit step function, sometimes colloquially referred to as the diving board function, defined by
$$H(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \ge 0, \end{cases}$$
has the simplest kind of discontinuity at $x = 0$, a jump discontinuity. Its ordinary limit does not exist at this point, but the one-sided limits do exist, and are
$$\lim_{x \to 0^-} H(x) = 0 \quad\text{and}\quad \lim_{x \to 0^+} H(x) = 1.$$
Version: 5 Owner: matte Author(s): matte, NeuRet
365.25
product rule
The product rule states that if $f : \mathbb{R} \to \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ are functions in one variable, both differentiable at a point $x_0$, then the derivative of the product of the two functions, denoted $f \cdot g$, at $x_0$ is given by
$$\frac{D}{Dx}(f \cdot g)(x_0) = f(x_0)\,g'(x_0) + f'(x_0)\,g(x_0).$$
Proof See the proof of the product rule.
365.25.1
Generalized Product Rule
More generally, for functions $f_1, f_2, \ldots, f_n$ in one variable, all differentiable at $x_0$, we have
$$D(f_1 \cdots f_n)(x_0) = \sum_{i=1}^{n} \bigl(f_1(x_0) \cdots f_{i-1}(x_0) \cdot Df_i(x_0) \cdot f_{i+1}(x_0) \cdots f_n(x_0)\bigr).$$
Also see Leibniz' rule.
Example. The derivative of $x \ln x$ can be found by application of this rule. Let $f(x) = x$, $g(x) = \ln x$, so that $f(x)g(x) = x \ln x$. Then $f'(x) = 1$ and $g'(x) = \frac{1}{x}$. Therefore, by the product rule,
$$\frac{D}{Dx}(x \ln x) = f(x)\,g'(x) + f'(x)\,g(x) = \frac{x}{x} + 1 \cdot \ln x = \ln x + 1.$$
Version: 8 Owner: mathcam Author(s): mathcam, Logan
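The worked example can be compared against a numerical derivative. This is a rough sketch only: the central-difference helper `derivative`, its step size, and the evaluation point `x0 = 2.0` are all assumptions made for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def product(x):
    return x * math.log(x)       # f(x) g(x) with f(x) = x, g(x) = ln x

x0 = 2.0
numeric = derivative(product, x0)
by_rule = math.log(x0) + 1       # ln x + 1, as obtained from the product rule
print(numeric, by_rule)
```

The two printed numbers should agree to several decimal places.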
365.26
proof of Darboux’s theorem
WLOG, assume $f'_+(a) > t > f'_-(b)$. Let $g(x) = f(x) - tx$. Then $g'(x) = f'(x) - t$, $g'_+(a) > 0 > g'_-(b)$, and we wish to find a zero of $g'$. $g$ is a continuous function on $[a, b]$, so it attains a maximum on $[a, b]$. This maximum cannot be at $a$, since $g'_+(a) > 0$ so $g$ is locally increasing at $a$. Similarly, $g'_-(b) < 0$, so $g$ is locally decreasing at $b$ and cannot have a maximum at $b$. So the maximum is attained at some $c \in (a, b)$. But then $g'(c) = 0$ by Fermat's theorem. Version: 2 Owner: paolini Author(s): paolini, ariels
365.27
proof of Fermat’s Theorem (stationary points)
Suppose that $x_0$ is a local maximum (a similar proof applies if $x_0$ is a local minimum). Then there exists $\delta > 0$ such that $(x_0 - \delta, x_0 + \delta) \subset (a, b)$ and such that $f(x_0) \ge f(x)$ for all $x$ with $|x - x_0| < \delta$. Hence for $h \in (0, \delta)$ we have
$$\frac{f(x_0 + h) - f(x_0)}{h} \le 0.$$
Since the limit of this ratio as $h \to 0^+$ exists and is equal to $f'(x_0)$, we conclude that $f'(x_0) \le 0$. On the other hand, for $h \in (-\delta, 0)$ we notice that
$$\frac{f(x_0 + h) - f(x_0)}{h} \ge 0,$$
but again the limit as $h \to 0^-$ exists and is equal to $f'(x_0)$, so we also have $f'(x_0) \ge 0$. Hence we conclude that $f'(x_0) = 0$. Version: 1 Owner: paolini Author(s): paolini
365.28
proof of Rolle’s theorem
Because $f$ is continuous on a compact (closed and bounded) interval $I = [a, b]$, it attains its maximum and minimum values. In case $f(a) = f(b)$ is both the maximum and the minimum, then there is nothing more to say, for then $f$ is a constant function and $f' \equiv 0$ on the whole interval $I$. So suppose otherwise, so that $f$ attains an extremum in the open interval $(a, b)$, and without loss of generality, let this extremum be a maximum, considering $-f$ in lieu of $f$ as necessary. We claim that at this extremum $f(c)$ we have $f'(c) = 0$, with $a < c < b$. To show this, note that $f(x) - f(c) \le 0$ for all $x \in I$, because $f(c)$ is the maximum. By definition of the derivative, we have that
$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}.$$
Looking at the one-sided limits, we note that
$$R = \lim_{x \to c^+} \frac{f(x) - f(c)}{x - c} \le 0$$
because the numerator in the limit is nonpositive in the interval $I$, yet $x - c > 0$, as $x$ approaches $c$ from the right. Similarly,
$$L = \lim_{x \to c^-} \frac{f(x) - f(c)}{x - c} \ge 0.$$
Since $f$ is differentiable at $c$, the left and right limits must coincide, so $0 \le L = R \le 0$, that is to say, $f'(c) = 0$. Version: 1 Owner: rmilson Author(s): NeuRet
365.29
proof of Taylor’s Theorem
Let $n$ be a natural number and $I$ be the closed interval $[a, b]$. We have that $f : I \to \mathbb{R}$ has $n$ continuous derivatives and its $(n+1)$st derivative exists. Suppose that $c \in I$, and $x \in I$ is arbitrary. Let $J$ be the closed interval with endpoints $c$ and $x$. Define $F : J \to \mathbb{R}$ by
$$F(t) := f(x) - \sum_{k=0}^{n} \frac{(x - t)^k}{k!} f^{(k)}(t) \tag{365.29.1}$$
so that
$$F'(t) = -f'(t) - \sum_{k=1}^{n} \left( \frac{(x - t)^k}{k!} f^{(k+1)}(t) - \frac{(x - t)^{k-1}}{(k-1)!} f^{(k)}(t) \right) = -\frac{(x - t)^n}{n!} f^{(n+1)}(t),$$
since the sum telescopes. Now, define $G$ on $J$ by
$$G(t) := F(t) - \left( \frac{x - t}{x - c} \right)^{n+1} F(c)$$
and notice that $G(c) = G(x) = 0$. Hence, Rolle's theorem gives us a $\zeta$ strictly between $x$ and $c$ such that
$$0 = G'(\zeta) = F'(\zeta) + (n+1) \frac{(x - \zeta)^n}{(x - c)^{n+1}} F(c),$$
which yields
$$F(c) = -\frac{1}{n+1} \frac{(x - c)^{n+1}}{(x - \zeta)^n} F'(\zeta) = \frac{1}{n+1} \frac{(x - c)^{n+1}}{(x - \zeta)^n} \frac{(x - \zeta)^n}{n!} f^{(n+1)}(\zeta) = \frac{f^{(n+1)}(\zeta)}{(n+1)!} (x - c)^{n+1},$$
from which we conclude, recalling (365.29.1),
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k + \frac{f^{(n+1)}(\zeta)}{(n+1)!} (x - c)^{n+1}.$$
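The final formula can be checked on a concrete case. The sketch below assumes $f = \exp$, $c = 0$, $x = 1$ and $n = 4$ (choices made purely for illustration). Since every derivative of $\exp$ is $\exp$, the remainder equals $e^{\zeta} (x - c)^{n+1}/(n+1)!$, so $\zeta$ can be solved for and should lie strictly between $c$ and $x$.

```python
import math

def taylor_poly_exp(x, n, c=0.0):
    """Degree-n Taylor polynomial of exp about c, evaluated at x."""
    # Every derivative of exp at c equals exp(c).
    return sum(math.exp(c) * (x - c) ** k / math.factorial(k)
               for k in range(n + 1))

x, n, c = 1.0, 4, 0.0
remainder = math.exp(x) - taylor_poly_exp(x, n, c)

# Solve remainder = exp(zeta) * (x - c)^(n+1) / (n+1)! for zeta.
zeta = math.log(remainder * math.factorial(n + 1) / (x - c) ** (n + 1))
print(remainder, zeta)   # zeta should lie strictly between c and x
```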
Version: 3 Owner: rmilson Author(s): NeuRet
365.30
proof of binomial formula
Let $p \in \mathbb{R}$ and $x \in \mathbb{R}$, $|x| < 1$, be given. We wish to show that
$$(1 + x)^p = \sum_{n=0}^{\infty} p^{\underline{n}}\, \frac{x^n}{n!},$$
where $p^{\underline{n}}$ denotes the $n$th falling factorial of $p$.
The convergence of the series in the right-hand side of the above equation is a straightforward consequence of the ratio test. Set $f(x) = (1 + x)^p$ and note that $f^{(n)}(x) = p^{\underline{n}} (1 + x)^{p - n}$. The desired equality now follows from Taylor's Theorem. Q.E.D. Version: 2 Owner: rmilson Author(s): rmilson
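The binomial series can be summed numerically and compared with $(1 + x)^p$. This is a minimal sketch: the names `falling_factorial` and `binomial_series` and the cutoff of 60 terms are assumptions made for illustration, and the truncation is only reasonable when $|x|$ is well below 1.

```python
import math

def falling_factorial(p, n):
    """n-th falling factorial of p: p (p - 1) ... (p - n + 1)."""
    result = 1.0
    for k in range(n):
        result *= p - k
    return result

def binomial_series(p, x, terms=60):
    """Partial sum of the series sum_n p^(falling n) x^n / n!."""
    return sum(falling_factorial(p, n) * x ** n / math.factorial(n)
               for n in range(terms))

print(binomial_series(0.5, 0.3), 1.3 ** 0.5)
print(binomial_series(-1.5, -0.4), 0.6 ** -1.5)
```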
365.31
proof of chain rule
Let's say that $g$ is differentiable at $x_0$ and $f$ is differentiable at $y_0 = g(x_0)$. We define
$$\varphi(y) = \begin{cases} \dfrac{f(y) - f(y_0)}{y - y_0} & \text{if } y \ne y_0, \\ f'(y_0) & \text{if } y = y_0. \end{cases}$$
Since $f$ is differentiable at $y_0$, $\varphi$ is continuous at $y_0$. We observe that, for $x \ne x_0$,
$$\frac{f(g(x)) - f(g(x_0))}{x - x_0} = \varphi(g(x))\, \frac{g(x) - g(x_0)}{x - x_0};$$
in fact, if $g(x) \ne g(x_0)$, it follows at once from the definition of $\varphi$, while if $g(x) = g(x_0)$, both members of the equation are 0. Since $g$ is continuous at $x_0$, and $\varphi$ is continuous at $y_0$,
$$\lim_{x \to x_0} \varphi(g(x)) = \varphi(g(x_0)) = f'(g(x_0)),$$
hence
$$(f \circ g)'(x_0) = \lim_{x \to x_0} \frac{f(g(x)) - f(g(x_0))}{x - x_0} = \lim_{x \to x_0} \varphi(g(x))\, \frac{g(x) - g(x_0)}{x - x_0} = f'(g(x_0))\, g'(x_0).$$
Version: 3 Owner: n3o Author(s): n3o
365.32
proof of extended mean-value theorem
Let $f : [a, b] \to \mathbb{R}$ and $g : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Define the function
$$h(x) = f(x)\,(g(b) - g(a)) - g(x)\,(f(b) - f(a)) - f(a)g(b) + f(b)g(a).$$
Because $f$ and $g$ are continuous on $[a, b]$ and differentiable on $(a, b)$, so is $h$. Furthermore, $h(a) = h(b) = 0$, so by Rolle's theorem there exists a $\xi \in (a, b)$ such that $h'(\xi) = 0$. This implies that
$$f'(\xi)\,(g(b) - g(a)) - g'(\xi)\,(f(b) - f(a)) = 0$$
and, if $g(b) \ne g(a)$ and $g'(\xi) \ne 0$,
$$\frac{f'(\xi)}{g'(\xi)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
Version: 3 Owner: pbruin Author(s): pbruin
365.33
proof of intermediate value theorem
We first prove the following lemma. If $f : [a, b] \to \mathbb{R}$ is a continuous function with $f(a) \le 0 \le f(b)$, then there exists $c \in [a, b]$ such that $f(c) = 0$. Define the sequences $(a_n)$ and $(b_n)$ inductively, as follows:
$$a_0 = a, \qquad b_0 = b, \qquad c_n = \frac{a_n + b_n}{2},$$
$$(a_n, b_n) = \begin{cases} (a_{n-1}, c_{n-1}) & \text{if } f(c_{n-1}) \ge 0, \\ (c_{n-1}, b_{n-1}) & \text{if } f(c_{n-1}) < 0. \end{cases}$$
We note that
$$a_0 \le a_1 \le \ldots \le a_n \le b_n \le \ldots \le b_1 \le b_0,$$
$$b_n - a_n = 2^{-n}(b_0 - a_0),$$
$$f(a_n) \le 0 \le f(b_n).$$
By the fundamental axiom of analysis, $(a_n) \to \alpha$ and $(b_n) \to \beta$. But $(b_n - a_n) \to 0$, so $\alpha = \beta$. By continuity of $f$, $f(a_n) \to f(\alpha)$ and $f(b_n) \to f(\alpha)$. But we have $f(\alpha) \le 0$ and $f(\alpha) \ge 0$, so that $f(\alpha) = 0$. Furthermore we have $a \le \alpha \le b$, proving the assertion.
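The inductive construction in this proof is the bisection method, and it can be run directly. The following is a sketch under the lemma's hypothesis $f(a) \le 0 \le f(b)$; the function name `bisect_zero` and the stopping tolerance are illustrative assumptions, since the proof itself passes to the limit rather than stopping.

```python
def bisect_zero(f, a, b, tol=1e-12):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) <= 0 <= f(b)."""
    if not (f(a) <= 0 <= f(b)):
        raise ValueError("need f(a) <= 0 <= f(b)")
    while b - a > tol:
        c = (a + b) / 2
        if f(c) >= 0:
            b = c          # next interval is (a, c), as in the proof
        else:
            a = c          # next interval is (c, b)
    return (a + b) / 2

# Zero of x^2 - 2 on [0, 2], i.e. an approximation of sqrt(2).
root = bisect_zero(lambda x: x * x - 2, 0.0, 2.0)
print(root)
```

To solve $f(c) = k$ for a value $k$ with $f(a) \le k \le f(b)$, the same routine can be applied to $x \mapsto f(x) - k$.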
Now set $g(x) = f(x) - k$, where $f(a) \le k \le f(b)$. Then $g$ satisfies the same conditions as before, so there exists $c$ such that $g(c) = 0$, that is, $f(c) = k$. This proves the more general result. Version: 2 Owner: vitriol Author(s): vitriol
365.34
proof of mean value theorem
Define $h(x)$ on $[a, b]$ by
$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}\,(x - a).$$
Clearly, $h$ is continuous on $[a, b]$, differentiable on $(a, b)$, and
$$h(a) = f(a) - f(a) = 0,$$
$$h(b) = f(b) - f(a) - \frac{f(b) - f(a)}{b - a}\,(b - a) = 0.$$
Notice that $h$ satisfies the conditions of Rolle's theorem. Therefore, by Rolle's Theorem there exists $c \in (a, b)$ such that $h'(c) = 0$. However, from the definition of $h$ we obtain by differentiation that
$$h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}.$$
Since $h'(c) = 0$, we therefore have
$$f'(c) = \frac{f(b) - f(a)}{b - a}$$
as required.
REFERENCES
1. Michael Spivak, Calculus, 3rd ed., Publish or Perish Inc., 1994.
Version: 2 Owner: saforres Author(s): saforres
365.35
proof of monotonicity criterion
Let us start from the implications "$\Rightarrow$". Suppose that $f'(x) \ge 0$ for all $x \in (a, b)$. We want to prove that $f$ is therefore increasing. So take $x_1, x_2 \in [a, b]$ with $x_1 < x_2$. Applying the mean-value theorem on the interval $[x_1, x_2]$, we know that there exists a point $x \in (x_1, x_2)$ such that
$$f(x_2) - f(x_1) = f'(x)\,(x_2 - x_1),$$
and since $f'(x) \ge 0$ we conclude that $f(x_2) \ge f(x_1)$. This proves the first claim. The other three cases can be obtained with minor modifications: replace all "$\ge$" respectively with $\le$, $>$ and $<$. Let us now prove the implication "$\Leftarrow$" for the first and second statements. Given $x \in (a, b)$, consider the ratio
$$\frac{f(x + h) - f(x)}{h}.$$
If $f$ is increasing, the numerator of this ratio is $\ge 0$ when $h > 0$ and is $\le 0$ when $h < 0$. In either case the ratio is $\ge 0$, since the denominator has the same sign as the numerator. Since we know by hypothesis that the function $f$ is differentiable at $x$, we can pass to the limit to conclude that
$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \ge 0.$$
If $f$ is decreasing, the ratio considered turns out to be $\le 0$, hence the conclusion $f'(x) \le 0$. Notice that if we suppose that $f$ is strictly increasing we obtain that this ratio is $> 0$, but passing to the limit as $h \to 0$ we cannot conclude that $f'(x) > 0$ but only (again) $f'(x) \ge 0$. Version: 2 Owner: paolini Author(s): paolini
365.36
proof of quotient rule
Let $F(x) = f(x)/g(x)$. Then
$$F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x)}{h} = \lim_{h \to 0} \frac{\dfrac{f(x+h)}{g(x+h)} - \dfrac{f(x)}{g(x)}}{h} = \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x + h)}{h\,g(x + h)\,g(x)}.$$
Like the product rule, the key to this proof is subtracting and adding the same quantity. We separate $f$ and $g$ in the above expression by subtracting and adding the term $f(x)g(x)$ in the numerator.
$$F'(x) = \lim_{h \to 0} \frac{f(x + h)g(x) - f(x)g(x) + f(x)g(x) - f(x)g(x + h)}{h\,g(x + h)\,g(x)}$$
$$= \lim_{h \to 0} \frac{g(x)\,\dfrac{f(x+h) - f(x)}{h} - f(x)\,\dfrac{g(x+h) - g(x)}{h}}{g(x + h)\,g(x)}$$
$$= \frac{\lim_{h \to 0} g(x) \cdot \lim_{h \to 0} \dfrac{f(x+h) - f(x)}{h} - \lim_{h \to 0} f(x) \cdot \lim_{h \to 0} \dfrac{g(x+h) - g(x)}{h}}{\lim_{h \to 0} g(x + h) \cdot \lim_{h \to 0} g(x)}$$
$$= \frac{g(x)f'(x) - f(x)g'(x)}{[g(x)]^2}.$$
Version: 1 Owner: Luci Author(s): Luci
365.37
quotient rule
The quotient rule says that the derivative of the quotient $f/g$ of two differentiable functions $f$ and $g$ exists at all values of $x$ for which $g(x) \ne 0$ and is given by the formula
$$\frac{d}{dx}\left[\frac{f(x)}{g(x)}\right] = \frac{g(x)f'(x) - f(x)g'(x)}{[g(x)]^2}.$$
The Quotient Rule and the other diﬀerentiation formulas allow us to compute the derivative of any rational function. Version: 10 Owner: Luci Author(s): Luci
365.38
signum function
The signum function is the function $\operatorname{sign} : \mathbb{R} \to \mathbb{R}$,
$$\operatorname{sign}(x) = \begin{cases} -1 & \text{when } x < 0, \\ 0 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
The following properties hold:
1. For all $x \in \mathbb{R}$, $\operatorname{sign}(-x) = -\operatorname{sign}(x)$.
2. For all $x \in \mathbb{R}$, $|x| = \operatorname{sign}(x)\,x$.
3. For all $x \ne 0$, $\frac{d}{dx}|x| = \operatorname{sign}(x)$.
Here, we should point out that the signum function is often defined simply as 1 for $x > 0$ and $-1$ for $x < 0$. Thus, at $x = 0$, it is left undefined. See e.g. [2]. In applications, such as the Laplace transform, this definition is adequate since the value of a function at a single point does not change the analysis. One could then, in fact, set $\operatorname{sign}(0)$ to any value. However, setting $\operatorname{sign}(0) = 0$ is motivated by the above relations. A related function is the Heaviside step function, defined as
$$H(x) = \begin{cases} 0 & \text{when } x < 0, \\ 1/2 & \text{when } x = 0, \\ 1 & \text{when } x > 0. \end{cases}$$
Again, this function is sometimes left undefined at $x = 0$. The motivation for setting $H(0) = 1/2$ is that for all $x \in \mathbb{R}$, we then have the relations
$$H(x) = \tfrac{1}{2}(\operatorname{sign}(x) + 1),$$
$$H(-x) = 1 - H(x).$$
The first relation is clear. For the second, we have
$$1 - H(x) = 1 - \tfrac{1}{2}(\operatorname{sign}(x) + 1) = \tfrac{1}{2}(1 - \operatorname{sign}(x)) = \tfrac{1}{2}(1 + \operatorname{sign}(-x)) = H(-x).$$
Example. Let $a < b$ be real numbers, and let $f : \mathbb{R} \to \mathbb{R}$ be the piecewise defined function
$$f(x) = \begin{cases} 4 & \text{when } x \in (a, b), \\ 0 & \text{otherwise.} \end{cases}$$
Using the Heaviside step function, we can write
$$f(x) = 4\bigl(H(x - a) - H(x - b)\bigr) \tag{365.38.1}$$
almost everywhere. Indeed, if we calculate $f$ using equation 365.38.1 we obtain $f(x) = 4$ for $x \in (a, b)$, $f(x) = 0$ for $x \notin [a, b]$, and $f(a) = f(b) = 2$. Therefore, equation 365.38.1 holds at all points except $a$ and $b$. $\Box$
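The relations between the signum and Heaviside functions, and the example above, can be checked with a short sketch. The function names `sign`, `heaviside` and `box` are chosen here for illustration, and the interval $(a, b) = (0, 1)$ is an arbitrary test case.

```python
def sign(x):
    """Signum with sign(0) = 0."""
    return (x > 0) - (x < 0)

def heaviside(x):
    """Heaviside step with H(0) = 1/2, via H(x) = (sign(x) + 1) / 2."""
    return 0.5 * (sign(x) + 1)

def box(x, a, b):
    """Right-hand side of the example: 4 (H(x - a) - H(x - b))."""
    return 4 * (heaviside(x - a) - heaviside(x - b))

a, b = 0.0, 1.0
print([box(x, a, b) for x in (-1.0, 0.0, 0.5, 1.0, 2.0)])
# Values: 0 outside [a, b], 4 inside (a, b), and 2 at the endpoints a and b.
```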
365.38.1
Signum function for complex arguments
For a complex number $z$, the signum function is defined as [1]
$$\operatorname{sign}(z) = \begin{cases} 0 & \text{when } z = 0, \\ z/|z| & \text{when } z \ne 0. \end{cases}$$
In other words, if $z$ is nonzero, then $\operatorname{sign} z$ is the projection of $z$ onto the unit circle $\{z \in \mathbb{C} \mid |z| = 1\}$. Clearly, the complex signum function reduces to the real signum function for real arguments. For all $z \in \mathbb{C}$, we have
$$\bar{z}\, \operatorname{sign} z = |z|,$$
where $\bar{z}$ is the complex conjugate of $z$.
REFERENCES
1. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed. 2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
Version: 4 Owner: mathcam Author(s): matte
Chapter 366 26A09 – Elementary functions
366.1 deﬁnitions in trigonometry
Informal definitions. Given a triangle $ABC$ with a signed angle $x$ at $A$ and a right angle at $B$, the ratios
$$\frac{BC}{AC}, \qquad \frac{AB}{AC}, \qquad \frac{BC}{AB}$$
are dependent only on the angle $x$, and therefore define functions, denoted by
$$\sin x, \qquad \cos x, \qquad \tan x$$
respectively, where the names are short for sine, cosine and tangent. Their reciprocals are rather less important, but also have names:
$$\cot x = AB/BC = \frac{1}{\tan x} \quad \text{(cotangent)}$$
$$\csc x = AC/BC = \frac{1}{\sin x} \quad \text{(cosecant)}$$
$$\sec x = AC/AB = \frac{1}{\cos x} \quad \text{(secant)}$$
From Pythagoras's theorem we have $\cos^2 x + \sin^2 x = 1$ for all (real) $x$. Also it is "clear" from the diagram that the functions $\cos$ and $\sin$ are periodic with period $2\pi$. However:
Formal definitions. The above definitions are not fully rigorous, because we have not defined the word angle. We will sketch a more rigorous approach. The power series
$$\sum_{n=0}^{\infty} \frac{x^n}{n!}$$
converges uniformly on compact subsets of $\mathbb{C}$ and its sum, denoted by $\exp(x)$ or by $e^x$, is therefore an entire function of $x$, called the exponential function. $f(x) = \exp(x)$ is the unique solution of the initial value problem
$$f(0) = 1, \qquad f'(x) = f(x)$$
on $\mathbb{R}$. The sine and cosine functions, for real arguments, are defined in terms of $\exp$, simply by
$$\exp(ix) = \cos x + i \sin x.$$
Thus
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots$$
$$\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots$$
Although it is not self-evident, $\cos$ and $\sin$ are periodic functions on the real line, and have the same period. That period is denoted by $2\pi$.
Version: 3 Owner: Daume Author(s): Larry Hammick
366.2
hyperbolic functions
The hyperbolic functions $\sinh x$ and $\cosh x$ are defined as follows:
$$\sinh x := \frac{e^x - e^{-x}}{2}, \qquad \cosh x := \frac{e^x + e^{-x}}{2}.$$
One can then also define the functions $\tanh x$ and $\coth x$ in analogy to the definitions of $\tan x$ and $\cot x$:
$$\tanh x := \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \qquad \coth x := \frac{\cosh x}{\sinh x} = \frac{e^x + e^{-x}}{e^x - e^{-x}}.$$
The hyperbolic functions are named in that way because the hyperbola
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
can be written in parametric form with the equations
$$x = a \cosh t, \qquad y = b \sinh t.$$
This is because of the equation
$$\cosh^2 x - \sinh^2 x = 1.$$
There are also addition formulas which are like the ones for trigonometric functions:
$$\sinh(x \pm y) = \sinh x \cosh y \pm \cosh x \sinh y$$
$$\cosh(x \pm y) = \cosh x \cosh y \pm \sinh x \sinh y.$$
The Taylor series for the hyperbolic functions are:
$$\sinh x = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}, \qquad \cosh x = \sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!}.$$
Using complex numbers we can use the hyperbolic functions to express the trigonometric functions:
$$\sin x = \frac{\sinh(ix)}{i}, \qquad \cos x = \cosh(ix).$$
Version: 2 Owner: mathwizard Author(s): mathwizard
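The identities in this entry can be spot-checked numerically. The sketch below defines `sinh` and `cosh` directly from the exponential definitions and compares against Python's standard `math` and `cmath` modules; the sample values of `x` and `y` are arbitrary.

```python
import cmath
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2

x, y = 0.7, -1.2
print(cosh(x) ** 2 - sinh(x) ** 2)                            # close to 1
print(sinh(x + y), sinh(x) * cosh(y) + cosh(x) * sinh(y))     # addition formula
print(cmath.sinh(1j * x) / 1j, math.sin(x))                   # sin x = sinh(ix)/i
print(cmath.cosh(1j * x), math.cos(x))                        # cos x = cosh(ix)
```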
Chapter 367 26A12 – Rate of growth of functions, orders of inﬁnity, slowly varying functions
367.1 Landau notation
Given two functions $f$ and $g$ from $\mathbb{R}^+$ to $\mathbb{R}^+$, the notation
$$f = O(g)$$
means that the ratio $\frac{f(x)}{g(x)}$ stays bounded as $x \to \infty$. If moreover that ratio approaches zero, we write
$$f = o(g).$$
It is legitimate to write, say, $2x = O(x) = O(x^2)$, with the understanding that we are using the equality sign in an unsymmetric (and informal) way, in that we do not have, for example, $O(x^2) = O(x)$. The notation
$$f = \Omega(g)$$
means that the ratio $\frac{f(x)}{g(x)}$ is bounded away from zero as $x \to \infty$, or equivalently $g = O(f)$.
If both $f = O(g)$ and $f = \Omega(g)$, we write $f = \Theta(g)$. One more notational convention in this group is
$$f(x) \sim g(x),$$
meaning $\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$.
In analysis, such notation is useful in describing error estimates. For example, the Riemann hypothesis is equivalent to the conjecture
$$\pi(x) = \operatorname{li}(x) + O(\sqrt{x}\, \log x),$$
where $\operatorname{li}(x) = \int_2^x \frac{dt}{\log t}$ is the logarithmic integral, which satisfies $\operatorname{li}(x) \sim \frac{x}{\log x}$.
Landau notation is also handy in applied mathematics, e.g. in describing the efficiency of an algorithm. It is common to say that an algorithm requires $O(x^3)$ steps, for example, without needing to specify exactly what is a step; for if $f = O(x^3)$, then $f = O(Ax^3)$ for any positive constant $A$. Version: 8 Owner: mathcam Author(s): Larry Hammick, Logan
Chapter 368 26A15 – Continuity and related questions (modulus of continuity, semicontinuity, discontinuities, etc.)
368.1 Dirichlet’s function
Dirichlet's function $f : \mathbb{R} \to \mathbb{R}$ is defined as
$$f(x) = \begin{cases} \frac{1}{q} & \text{if } x = \frac{p}{q} \text{ is a rational number in lowest terms (with } q > 0\text{),} \\ 0 & \text{if } x \text{ is an irrational number.} \end{cases}$$
This function has the property that it is continuous at every irrational number and discontinuous at every rational one. Version: 3 Owner: urz Author(s): urz
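The rational branch of this function can be evaluated exactly with Python's `fractions` module. The sketch below is limited to rational inputs (irrational numbers cannot be represented exactly, so the value 0 never appears), and the name `dirichlet` is chosen here for illustration. It evaluates the function along the rational approximations $1, 3/2, 7/5, 17/12, \ldots$ of $\sqrt{2}$, where the values shrink toward 0, in line with continuity at irrational points.

```python
from fractions import Fraction

def dirichlet(x):
    """Value 1/q at a rational x = p/q, given as a Fraction or an int."""
    # Fraction stores p/q in lowest terms with q > 0.
    return Fraction(1, Fraction(x).denominator)

# Rational approximations of sqrt(2): 1, 3/2, 7/5, 17/12, 41/29, 99/70.
p, q = 1, 1
values = []
for _ in range(6):
    values.append(dirichlet(Fraction(p, q)))
    p, q = p + 2 * q, p + q

print(values)
```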
368.2
semicontinuous
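A real function $f : A \to \mathbb{R}$, where $A \subseteq \mathbb{R}$, is said to be lower semicontinuous in $x_0$ if
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x \in A:\ |x - x_0| < \delta \Rightarrow f(x) > f(x_0) - \varepsilon,$$
and $f$ is said to be upper semicontinuous in $x_0$ if
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x \in A:\ |x - x_0| < \delta \Rightarrow f(x) < f(x_0) + \varepsilon.$$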
Remark A real function is continuous in x0 if and only if it is both upper and lower semicontinuous in x0 . We can generalize the deﬁnition to arbitrary topological spaces as follows. Let A be a topological space. f : A → R is lower semicontinuous at x0 if, for each ε > 0 there is a neighborhood U of x0 such that x ∈ U implies f (x) > f (x0 ) − ε. Theorem Let f : [a, b] → R be a lower (upper) semicontinuous function. Then f has a minimum (maximum) in [a, b]. Version: 3 Owner: drini Author(s): drini, n3o
368.3
semicontinuous
Definition [1] Suppose $X$ is a topological space, and $f$ is a function from $X$ into the extended real numbers $\overline{\mathbb{R}}$; $f : X \to \overline{\mathbb{R}}$. Then:
1. If $\{x \in X \mid f(x) > \alpha\}$ is an open set in $X$ for all $\alpha \in \mathbb{R}$, then $f$ is said to be lower semicontinuous.
2. If $\{x \in X \mid f(x) < \alpha\}$ is an open set in $X$ for all $\alpha \in \mathbb{R}$, then $f$ is said to be upper semicontinuous.
Properties 1. If X is a topological space and f is a function f : X → R, then f is continuous if and only if f is upper and lower semicontinuous [1, 3]. 2. The characteristic function of an open set is lower semicontinuous [1, 3]. 3. The characteristic function of a closed set is upper semicontinuous [1, 3]. 4. If f and g are lower semicontinuous, then f + g is also lower semicontinuous [3].
REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. D.L. Cohn, Measure Theory, Birkhäuser, 1980.
Version: 2 Owner: bwebste Author(s): matte, apmxi
368.4
uniformly continuous
Let $f : A \to \mathbb{R}$ be a real function defined on a subset $A$ of the real line. We say that $f$ is uniformly continuous if, given an arbitrarily small positive $\varepsilon$, there exists a positive $\delta$ such that whenever two points in $A$ differ by less than $\delta$, they are mapped by $f$ into points which differ by less than $\varepsilon$. In symbols:
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x, y \in A:\ |x - y| < \delta \Rightarrow |f(x) - f(y)| < \varepsilon.$$
Every uniformly continuous function is also continuous, while the converse does not always hold. For instance, the function $f : \,]0, +\infty[\, \to \mathbb{R}$ defined by $f(x) = 1/x$ is continuous in its domain, but not uniformly. A more general definition of uniform continuity applies to functions between metric spaces (there are even more general environments for uniformly continuous functions, i.e. uniform spaces). Given a function $f : X \to Y$, where $X$ and $Y$ are metric spaces with distances $d_X$ and $d_Y$, we say that $f$ is uniformly continuous if
$$\forall \varepsilon > 0\ \exists \delta > 0\ \forall x, y \in X:\ d_X(x, y) < \delta \Rightarrow d_Y(f(x), f(y)) < \varepsilon.$$
Uniformly continuous functions have the property that they map Cauchy sequences to Cauchy sequences and that they preserve uniform convergence of sequences of functions. Any continuous function defined on a compact space is uniformly continuous (see Heine-Cantor theorem). Version: 10 Owner: n3o Author(s): n3o
Chapter 369 26A16 – Lipschitz (Hölder) classes
369.1 Lipschitz condition
A mapping $f : X \to Y$ between metric spaces is said to satisfy the Lipschitz condition if there exists a real constant $\alpha \ge 0$ such that
$$d_Y(f(p), f(q)) \le \alpha\, d_X(p, q), \quad \text{for all } p, q \in X.$$
Proposition 17. A Lipschitz mapping $f : X \to Y$ is uniformly continuous. Proof. Let $f$ be a Lipschitz mapping and $\alpha \ge 0$ a corresponding Lipschitz constant. For every given $\varepsilon > 0$, choose $\delta > 0$ such that
$$\delta \alpha < \varepsilon.$$
Let $p, q \in X$ such that
$$d_X(p, q) < \delta$$
be given. By assumption,
$$d_Y(f(p), f(q)) \le \alpha \delta < \varepsilon,$$
as desired. QED
Notes. More generally, one says that a mapping satisfies a Lipschitz condition of order $\beta > 0$ if there exists a real constant $\alpha \ge 0$ such that
$$d_Y(f(p), f(q)) \le \alpha\, d_X(p, q)^\beta, \quad \text{for all } p, q \in X.$$
Version: 17 Owner: rmilson Author(s): rmilson, slider142
369.2
Lipschitz condition and diﬀerentiability
If X and Y are Banach spaces, e.g. Rn , one can inquire about the relation between diﬀerentiability and the Lipschitz condition. The latter is the weaker condition. If f is Lipschitz, the ratio f (q) − f (p) , p, q ∈ X q−p
is bounded but is not assumed to converge to a limit. Indeed, diﬀerentiability is the stronger condition.
Proposition 18. Let f : X → Y be a continuously differentiable mapping between Banach spaces. If K ⊂ X is a compact subset, then the restriction f : K → Y satisfies the Lipschitz condition.
Proof. Let Lin(X, Y) denote the Banach space of bounded linear maps from X to Y. Recall that the norm $\|T\|$ of a linear mapping T ∈ Lin(X, Y) is defined by
$$\|T\| = \sup\left\{ \frac{\|Tu\|}{\|u\|} : u \neq 0 \right\}.$$
Let Df : X → Lin(X, Y) denote the derivative of f. By definition Df is continuous, so in particular $\|Df\| : X \to \mathbb{R}$ is a continuous function. Since K ⊂ X is compact, there exists a finite upper bound $B_1 > 0$ for $\|Df\|$ restricted to K. In particular, this means that
$$\|Df(p)u\| \le \|Df(p)\|\,\|u\| \le B_1 \|u\|,$$
for all p ∈ K, u ∈ X. Next, consider the secant mapping s : X × X → R defined by
$$s(p, q) = \begin{cases} \dfrac{\|f(q) - f(p) - Df(p)(q - p)\|}{\|q - p\|} & q \neq p, \\ 0 & p = q. \end{cases}$$
This mapping is continuous, because f is assumed to be continuously differentiable. Hence, there is a finite upper bound $B_2 > 0$ for s restricted to the compact K × K. It follows that for all p, q ∈ K we have
$$\|f(q) - f(p)\| \le \|f(q) - f(p) - Df(p)(q - p)\| + \|Df(p)(q - p)\| \le B_2 \|q - p\| + B_1 \|q - p\| = (B_1 + B_2)\|q - p\|.$$
Therefore $B_1 + B_2$ is the desired Lipschitz constant. QED
Version: 22 Owner: rmilson Author(s): rmilson, slider142
369.3 Lipschitz condition and differentiability result
About Lipschitz continuity of differentiable functions the following holds.
Theorem 6. Let X, Y be Banach spaces and let A be a convex (see convex set), open subset of X. Let f : A → Y be a function which is continuous in A and differentiable in A. Then f is Lipschitz continuous on A if and only if the derivative Df is bounded on A, i.e.
$$\sup_{x \in A} \|Df(x)\| < +\infty.$$
Proof. Suppose that f is Lipschitz continuous:
$$\|f(x) - f(y)\| \le L \|x - y\|.$$
Then given any x ∈ A and any v ∈ X, for all small h ∈ R, h ≠ 0, we have
$$\frac{\|f(x + hv) - f(x)\|}{|h|} \le L \|v\|.$$
Hence, passing to the limit h → 0, we get $\|Df(x)v\| \le L\|v\|$ for every v, that is,
$$\|Df(x)\| \le L.$$
On the other hand suppose that Df is bounded on A:
$$\|Df(x)\| \le L, \quad \forall x \in A.$$
Given any two points x, y ∈ A and given any $\alpha \in Y^*$, consider the function G : [0, 1] → R,
$$G(t) = \langle \alpha, f((1 - t)x + ty) \rangle,$$
which is well defined because A is convex. For t ∈ (0, 1) it holds
$$G'(t) = \langle \alpha, Df((1 - t)x + ty)[y - x] \rangle$$
and hence
$$|G'(t)| \le L \|\alpha\| \|y - x\|.$$
Applying the Lagrange mean-value theorem to G we know that there exists ξ ∈ (0, 1) such that
$$|\langle \alpha, f(y) - f(x) \rangle| = |G(1) - G(0)| = |G'(\xi)| \le \|\alpha\| L \|y - x\|,$$
and since this is true for all $\alpha \in Y^*$ we get
$$\|f(y) - f(x)\| \le L \|y - x\|,$$
which is the desired claim.
Version: 1 Owner: paolini Author(s): paolini
Chapter 370 26A18 – Iteration
370.1 iteration
Let f : X → X be a function, X being any set. The nth iteration of f is the function obtained by applying f n times, and is denoted by $f^n$. More formally we define
$$f^0(x) = x \quad \text{and} \quad f^{n+1}(x) = f(f^n(x))$$
for nonnegative integers n. If f is invertible, then by going backwards we can define the iterate also for negative n.
Version: 6 Owner: mathwizard Author(s): mathwizard
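The recursive definition of the iterate can be sketched in a few lines of Python. This is only an illustration: the helper name `iterate` and the sample map `double` are made up for the example, and negative n is rejected because no inverse is supplied.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def iterate(f: Callable[[T], T], n: int) -> Callable[[T], T]:
    """Return the n-th iterate f^n, where f^0 is the identity map."""
    if n < 0:
        raise ValueError("negative n would require an inverse of f")

    def f_n(x: T) -> T:
        # Apply f exactly n times: f^{k+1}(x) = f(f^k(x)).
        for _ in range(n):
            x = f(x)
        return x

    return f_n


def double(x: int) -> int:
    """Hypothetical sample map x -> 2x."""
    return 2 * x


print(iterate(double, 0)(3))  # f^0(3) = 3
print(iterate(double, 5)(3))  # f^5(3) = 3 * 2**5 = 96
```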
370.2 periodic point
Let f : X → X be a function and $f^n$ its nth iteration. A point x is called a periodic point of period n of f if it is a fixed point of $f^n$. The least n for which x is a fixed point of $f^n$ is called the prime period or least period. If f is a differentiable function mapping R to R or C to C, then a periodic point x of prime period n is called hyperbolic if $|(f^n)'(x)| \neq 1$, attractive if $|(f^n)'(x)| < 1$ and repelling if $|(f^n)'(x)| > 1$.
Version: 11 Owner: mathwizard Author(s): mathwizard
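As a small illustration of these definitions, the sketch below finds the prime period of a point and classifies it by the modulus of $(f^n)'(x)$, computed with the chain rule as a product of values of $f'$ along the orbit. The map $f(x) = x^2 - 1$, the helper names and the tolerance are assumptions chosen for the example, not part of the entry.

```python
import math


def orbit_point(f, n, x):
    """Return f^n(x) by applying f n times."""
    for _ in range(n):
        x = f(x)
    return x


def prime_period(f, x, max_n=50, tol=1e-12):
    """Least n <= max_n with f^n(x) = x up to tol, or None if none is found."""
    for n in range(1, max_n + 1):
        if abs(orbit_point(f, n, x) - x) <= tol:
            return n
    return None


def multiplier(f, df, n, x):
    """(f^n)'(x) via the chain rule: product of f'(f^k(x)) for k = 0..n-1."""
    m = 1.0
    for _ in range(n):
        m *= df(x)
        x = f(x)
    return m


def classify(m):
    if abs(m) < 1:
        return "attractive"
    if abs(m) > 1:
        return "repelling"
    return "non-hyperbolic"


# Hypothetical sample map: f(x) = x^2 - 1 has the 2-cycle 0 -> -1 -> 0.
f = lambda x: x * x - 1
df = lambda x: 2 * x
golden = (1 + math.sqrt(5)) / 2  # fixed point, since golden^2 = golden + 1

print(prime_period(f, 0.0), classify(multiplier(f, df, 2, 0.0)))
print(prime_period(f, golden), classify(multiplier(f, df, 1, golden)))
```

Here the 2-cycle has multiplier 0 and the fixed point has multiplier $1 + \sqrt{5}$, so the first is attractive and the second repelling.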
Chapter 371 26A24 – Differentiation (functions of one variable): general theory, generalized derivatives, mean-value theorems
371.1 Leibniz notation
Leibniz notation centers around the concept of a differential element. The differential element of x is represented by dx. You might think of dx as being an infinitesimal change in x. It is important to note that d is an operator, not a variable. So, when you see $\frac{dy}{dx}$, you can't automatically write $\frac{y}{x}$ as a replacement. We use
$$\frac{df(x)}{dx} \quad \text{or} \quad \frac{d}{dx} f(x)$$
to represent the derivative of a function f(x) with respect to x:
$$\frac{df(x)}{dx} = \lim_{Dx \to 0} \frac{f(x + Dx) - f(x)}{Dx}.$$
We are dividing two numbers infinitely close to 0, and arriving at a finite answer. D is another operator that can be thought of as just a change in x. When we take the limit of Dx as Dx approaches 0, we get an infinitesimal change dx.
Leibniz notation shows a wonderful use in the following example:
$$\frac{dy}{dx} = \frac{dy}{dx}\frac{du}{du} = \frac{dy}{du}\frac{du}{dx}.$$
The two du's can be cancelled out to arrive at the original derivative. This is the Leibniz notation for the chain rule.
Leibniz notation shows up in the most common way of representing an integral,
$$F(x) = \int f(x)\,dx.$$
The dx is in fact a differential element. Let's start with a derivative that we know (since F(x) is an antiderivative of f(x)):
$$\frac{dF(x)}{dx} = f(x)$$
$$dF(x) = f(x)\,dx$$
$$\int dF(x) = \int f(x)\,dx$$
$$F(x) = \int f(x)\,dx$$
We can think of dF(x) as the differential element of area. Since dF(x) = f(x)dx, the element of area is a rectangle, with f(x) × dx as its dimensions. Integration is the sum of all these infinitely thin elements of area along a certain interval. The result: a finite number. (a diagram is deserved here)
One clear advantage of this notation is seen when finding the length s of a curve. The formula is often seen as the following:
$$s = \int ds.$$
The length is the sum of all the elements, ds, of length. If we have a function f(x), the length element is usually written as
$$ds = \sqrt{1 + \left[\frac{df(x)}{dx}\right]^2}\,dx.$$
If we modify this a bit, we get
$$ds = \sqrt{[dx]^2 + [df(x)]^2}.$$
Graphically, we could say that the length element is the hypotenuse of a right triangle with one leg being the x element, and the other leg being the f(x) element. (another diagram would be nice!)
There are a few caveats, such as if you want to take the value of a derivative. Compare to the prime notation:
$$f'(a) = \left.\frac{df(x)}{dx}\right|_{x=a}.$$
A second derivative is represented as follows:
$$\frac{d}{dx}\frac{dy}{dx} = \frac{d^2y}{dx^2}.$$
The other derivatives follow as can be expected: $\frac{d^3y}{dx^3}$, etc. You might think this is a little sneaky, but it is the notation. Properly using these terms can be interesting. For example, what is $\int \frac{d^2y}{dx}$? We could turn it into $\int \frac{d^2y}{dx^2}\,dx$ or $\int d\,\frac{dy}{dx}$. Either way, we get $\frac{dy}{dx}$.
Version: 2 Owner: xriso Author(s): xriso
371.2 derivative
Qualitatively the derivative is a measure of the change of a function in a small region around a specified point.
Motivation
The idea behind the derivative comes from the straight line. What characterizes a straight line is the fact that it has constant “slope”.
Figure 371.1: The straight line y = mx + b
In other words, for a line given by the equation y = mx + b, as in Fig. 371.1, the ratio of ∆y over ∆x is always constant and has the value $\frac{\Delta y}{\Delta x} = m$.
Figure 371.2: The parabola $y = x^2$ and its tangent at $(x_0, y_0)$
For other curves we cannot define a “slope” as for the straight line, since such a quantity would not be constant. However, for sufficiently smooth curves, each point on a curve has a tangent line. For example consider the curve $y = x^2$, as in Fig. 371.2. At the point $(x_0, y_0)$ on the curve, we can draw a tangent of slope m given by the equation $y - y_0 = m(x - x_0)$. Suppose we have a curve of the form y = f(x), and at the point $(x_0, f(x_0))$ we have a tangent given by $y - y_0 = m(x - x_0)$. Note that for values of x sufficiently close to $x_0$ we can make the approximation $f(x) \approx m(x - x_0) + y_0$. So the slope m of the tangent describes how much f(x) changes in the vicinity of $x_0$. It is the slope of the tangent that will be associated with the derivative of the function f(x).
Formal deﬁnition
More formally, for any real function f : R → R, we define the derivative of f at the point x as the following limit (if it exists):
$$f'(x) := \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
This definition turns out to be consistent with the motivation introduced above. The derivatives of some elementary functions are (cf. Derivative notation):
1. $\frac{d}{dx} c = 0$, where c is constant;
2. $\frac{d}{dx} x^n = n x^{n-1}$;
3. $\frac{d}{dx} \sin x = \cos x$;
4. $\frac{d}{dx} \cos x = -\sin x$;
5. $\frac{d}{dx} e^x = e^x$;
6. $\frac{d}{dx} \ln x = \frac{1}{x}$.
Derivatives of more complicated expressions can be calculated algorithmically using the following rules:
Linearity $\frac{d}{dx}(a f(x) + b g(x)) = a f'(x) + b g'(x)$;
Product rule $\frac{d}{dx}(f(x) g(x)) = f'(x) g(x) + f(x) g'(x)$;
Chain rule $\frac{d}{dx} g(f(x)) = g'(f(x)) f'(x)$;
Quotient rule $\frac{d}{dx} \frac{f(x)}{g(x)} = \frac{f'(x) g(x) - f(x) g'(x)}{g(x)^2}$.
Note that the quotient rule, although given as much importance as the other rules in elementary calculus, can be derived by successively applying the product rule and the chain rule to $\frac{f(x)}{g(x)} = f(x) \frac{1}{g(x)}$. Also the quotient rule does not generalize as well as the other ones.
Since the derivative f'(x) of f(x) is also a function of x, higher derivatives can be obtained by applying the same procedure to f'(x) and so on.
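The limit definition and the differentiation rules can be checked numerically. The following is a rough sketch, not a rigorous test: it approximates the limit by a symmetric difference quotient (a swapped-in numerical technique with a hypothetical step size h) and compares the result with the product and chain rules for the sample functions sin and exp.

```python
import math


def derivative(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.7
f, g = math.sin, math.exp

# Product rule: (f g)' = f' g + f g'
product_numeric = derivative(lambda x: f(x) * g(x), x0)
product_rule = math.cos(x0) * math.exp(x0) + math.sin(x0) * math.exp(x0)

# Chain rule: d/dx g(f(x)) = g'(f(x)) f'(x)
chain_numeric = derivative(lambda x: g(f(x)), x0)
chain_rule = math.exp(math.sin(x0)) * math.cos(x0)

print(abs(product_numeric - product_rule))  # small discretization error
print(abs(chain_numeric - chain_rule))      # small discretization error
```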
Generalization
Banach Spaces
Unfortunately the notion of the “slope of the tangent” does not directly generalize to more abstract situations. What we can do is keep in mind the facts that the tangent is a linear function and that it approximates the function near the point of tangency, as well as the formal definition above. Very general conditions under which we can define a derivative in a manner much similar to the above are as follows. Let f : V → W, where V and W are Banach spaces. Suppose that h ∈ V and h ≠ 0; then we define the directional derivative $(D_h f)(x)$ at x as the following limit
$$(D_h f)(x) := \lim_{\epsilon \to 0} \frac{f(x + \epsilon h) - f(x)}{\epsilon},$$
where ε is a scalar. Note that $f(x + \epsilon h) \approx f(x) + \epsilon (D_h f)(x)$, which is consistent with our original motivation. This directional derivative is also called the Gâteaux derivative.
Finally we define the derivative at x as the bounded linear map (Df)(x) : V → W such that for any nonzero h ∈ V
$$\lim_{\|h\| \to 0} \frac{\|(f(x + h) - f(x)) - (Df)(x) \cdot h\|}{\|h\|} = 0.$$
Once again we have $f(x + h) \approx f(x) + (Df)(x) \cdot h$. In fact, if the derivative (Df)(x) exists, the directional derivatives can be obtained as $(D_h f)(x) = (Df)(x) \cdot h$.¹ However, the existence of $(D_h f)(x)$ for each nonzero h ∈ V does not guarantee the existence of (Df)(x). This derivative is also called the Fréchet derivative. In the more familiar case $f : \mathbb{R}^n \to \mathbb{R}^m$, the derivative Df is simply the Jacobian of f.
Under these general conditions the following properties of the derivative remain:
1. Dh = 0, where h is a constant;
2. D(A · x) = A, where A is linear.
Linearity $D(a f(x) + b g(x)) \cdot h = a (Df)(x) \cdot h + b (Dg)(x) \cdot h$;
“Product” rule $D(B(f(x), g(x))) \cdot h = B((Df)(x) \cdot h, g(x)) + B(f(x), (Dg)(x) \cdot h)$, where B is bilinear;
Chain rule $D(g(f(x))) \cdot h = (Dg)(f(x)) \cdot ((Df)(x) \cdot h)$.
Note that the derivative of f can be seen as a function Df : V → L(V, W) given by Df : x → (Df)(x), where L(V, W) is the space of bounded linear maps from V to W. Since L(V, W) can be considered a Banach space itself with the norm taken as the operator norm, higher derivatives can be obtained by applying the same procedure to Df and so on.
Manifolds
A manifold is a topological space that is locally homeomorphic to a Banach space V (for ﬁnite dimensional manifolds V = Rn ) and is endowed with enough structure to deﬁne derivatives. Since the notion of a manifold was constructed speciﬁcally to generalize the notion of a derivative, this seems like the end of the road for this entry. The following discussion is rather technical, a more intuitive explanation of the same concept can be found in the entry on related rates. Consider manifolds V and W modeled on Banach spaces V and W, respectively. Say we have y = f (x) for some x ∈ V and y ∈ W , then, by deﬁnition of a manifold, we can ﬁnd
1 The notation A · h is used when h is a vector and A a linear operator. This notation can be considered advantageous to the usual notation A(h), since the latter is rather bulky and the former incorporates the intuitive distributive properties of linear operators also associated with usual multiplication.
charts (X, x) and (Y, y), where X and Y are neighborhoods of x and y, respectively. These charts provide us with canonical isomorphisms between the Banach spaces V and W, and the respective tangent spaces $T_x V$ and $T_y W$:
$$dx_x : T_x V \to V, \qquad dy_y : T_y W \to W.$$
Now consider a map f : V → W between the manifolds. By composing it with the chart maps we construct the map
$$g_{(X,x)}^{(Y,y)} = y \circ f \circ x^{-1} : V \to W,$$
defined on an appropriately restricted domain. Since we now have a map between Banach spaces, we can define its derivative at x(x) in the sense defined above, namely $Dg_{(X,x)}^{(Y,y)}(x(x))$. If this derivative exists for every choice of admissible charts (X, x) and (Y, y), we can say that the derivative Df(x) of f at x is defined and given by
$$Df(x) = dy_y^{-1} \circ Dg_{(X,x)}^{(Y,y)}(x(x)) \circ dx_x$$
(it can be shown that this is well defined and independent of the choice of charts). Note that the derivative is now a map between the tangent spaces of the two manifolds, $Df(x) : T_x V \to T_y W$. Because of this a common notation for the derivative of f at x is $T_x f$. Another alternative notation for the derivative is $f_{*,x}$ because of its connection to the category-theoretical pushforward.
Version: 15 Owner: igor Author(s): igor
371.3 l'Hôpital's rule
L'Hôpital's rule states that given an unresolvable limit of the form $\frac{0}{0}$ or $\frac{\infty}{\infty}$, the ratio of functions $\frac{f(x)}{g(x)}$ will have the same limit at c as the ratio $\frac{f'(x)}{g'(x)}$. In short, if the limit of a ratio of functions approaches an indeterminate form, then
$$\lim_{x \to c} \frac{f(x)}{g(x)} = \lim_{x \to c} \frac{f'(x)}{g'(x)}$$
provided this last limit exists. L'Hôpital's rule may be applied repeatedly as long as the conditions still hold. However it is important to note that the nonexistence of $\lim \frac{f'(x)}{g'(x)}$ does not prove the nonexistence of $\lim \frac{f(x)}{g(x)}$.
Example: We try to determine the value of
$$\lim_{x \to \infty} \frac{x^2}{e^x}.$$
As x approaches ∞ the expression becomes an indeterminate form $\frac{\infty}{\infty}$. By applying L'Hôpital's rule we get
$$\lim_{x \to \infty} \frac{x^2}{e^x} = \lim_{x \to \infty} \frac{2x}{e^x} = \lim_{x \to \infty} \frac{2}{e^x} = 0.$$
Version: 8 Owner: mathwizard Author(s): mathwizard, slider142
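The worked example for l'Hôpital's rule can be sanity-checked numerically. This sketch simply tabulates $x^2/e^x$ together with the two ratios obtained by differentiating numerator and denominator; the sample points are arbitrary choices, and a table of values is of course evidence rather than a proof.

```python
import math


def ratio(x):
    """The original ratio x^2 / e^x."""
    return x**2 / math.exp(x)


def ratio_first(x):
    """Ratio after one application of the rule: 2x / e^x."""
    return 2 * x / math.exp(x)


def ratio_second(x):
    """Ratio after two applications: 2 / e^x."""
    return 2 / math.exp(x)


for x in (10.0, 50.0, 100.0):
    # All three columns shrink toward 0 as x grows.
    print(x, ratio(x), ratio_first(x), ratio_second(x))
```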
371.4 proof of De l'Hôpital's rule
Let $x_0 \in \mathbb{R}$, let I be an interval containing $x_0$ and let f and g be two differentiable functions defined on $I \setminus \{x_0\}$ with $g'(x) \neq 0$ for all $x \in I \setminus \{x_0\}$. Suppose that
$$\lim_{x \to x_0} f(x) = 0, \qquad \lim_{x \to x_0} g(x) = 0$$
and that
$$\lim_{x \to x_0} \frac{f'(x)}{g'(x)} = m.$$
We want to prove that $g(x) \neq 0$ for all $x \in I \setminus \{x_0\}$ and
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = m.$$
First of all (with a little abuse of notation) we suppose that f and g are defined also at the point $x_0$ by $f(x_0) = 0$ and $g(x_0) = 0$. The resulting functions are continuous at $x_0$ and hence on the whole interval I.
Let us first prove that $g(x) \neq 0$ for all $x \in I \setminus \{x_0\}$. If by contradiction $g(\bar{x}) = 0$, since we also have $g(x_0) = 0$, by Rolle's theorem we get that $g'(\xi) = 0$ for some $\xi \in (x_0, \bar{x})$, which is against our hypotheses.
Consider now any sequence $x_n \to x_0$ with $x_n \in I \setminus \{x_0\}$. By Cauchy's mean value theorem there exists a sequence $x'_n$ such that
$$\frac{f(x_n)}{g(x_n)} = \frac{f(x_n) - f(x_0)}{g(x_n) - g(x_0)} = \frac{f'(x'_n)}{g'(x'_n)}.$$
But as $x_n \to x_0$ and since $x'_n \in (x_0, x_n)$ we get that $x'_n \to x_0$ and hence
$$\lim_{n \to \infty} \frac{f(x_n)}{g(x_n)} = \lim_{n \to \infty} \frac{f'(x'_n)}{g'(x'_n)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)} = m.$$
Since this is true for any given sequence $x_n \to x_0$ we conclude that
$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = m.$$
Version: 5 Owner: paolini Author(s): paolini
371.5 related rates
The notion of a derivative has numerous interpretations and applications. A well-known geometric interpretation is that of a slope, or more generally that of a linear approximation to a mapping between linear spaces (see here). Another useful interpretation comes from physics and is based on the idea of related rates. This second point of view is quite general, and sheds light on the definition of the derivative of a manifold mapping (the latter is described in the pushforward entry).
Consider two physical quantities x and y that are somehow coupled. For example:
• the quantities x and y could be the coordinates of a point as it moves along the unit circle;
• the quantity x could be the radius of a sphere and y the sphere's surface area;
• the quantity x could be the horizontal position of a point on a given curve and y the distance traversed by that point as it moves from some fixed starting position;
• the quantity x could be the depth of water in a conical tank and y the rate at which the water flows out the bottom.
Regardless of the application, the situation is such that a change in the value of one quantity is accompanied by a change in the value of the other quantity. So let's imagine that we take control of one of the quantities, say x, and change it in any way we like. As we do so, quantity y follows suit and changes along with x. Now the analytical relation between the values of x and y could be quite complicated and nonlinear, but the relation between the instantaneous rates of change of x and y is linear. It does not matter how we vary the two quantities, the ratio of the rates of change depends only on the values of x and y. This ratio is, of course, the derivative of the function that maps the values of x to the values of y. Letting $\dot{x}, \dot{y}$ denote the rates of change of the two quantities, we describe this conception of the derivative as
$$\frac{dy}{dx} = \frac{\dot{y}}{\dot{x}},$$
or equivalently as
$$\dot{y} = \frac{dy}{dx}\,\dot{x}. \qquad (371.5.1)$$
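As a worked instance of the relation between rates, take the sphere example from the list above: x is the radius and y the surface area, so $y = 4\pi x^2$. The numerical values below (radius 2, growth rate 0.5) are hypothetical and chosen only to make the arithmetic concrete.

```latex
% Sphere: radius x, surface area y = 4 \pi x^2.
\[
  \frac{dy}{dx} = 8\pi x
  \qquad\Longrightarrow\qquad
  \dot{y} = \frac{dy}{dx}\,\dot{x} = 8\pi x\,\dot{x}.
\]
% Hypothetical values: x = 2 and \dot{x} = 0.5 give
\[
  \dot{y} = 8\pi \cdot 2 \cdot 0.5 = 8\pi .
\]
```

The relation between y and x is quadratic, yet at a fixed radius the rate $\dot{y}$ depends linearly on $\dot{x}$, which is the point of the discussion.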
Next, let us generalize the discussion and suppose that the two quantities x and y represent physical states with multiple degrees of freedom. For example, x could be a point on the earth's surface, and y the position of a point 1 kilometer to the north of x. Again, the dependence of y and x is, in general, nonlinear, but the rate of change of y does have a linear dependence on the rate of change of x. We would like to say that the derivative is precisely this linear relation, but we must first contend with the following complication. The rates of change are no longer scalars, but rather velocity vectors, and therefore the derivative must be regarded as a linear transformation that changes one vector into another.
In order to formalize this generalized notion of the derivative we must consider x and y to be points on manifolds X and Y, and the relation between them a manifold mapping φ : X → Y. A varying x is formally described by a trajectory
$$\gamma : I \to X, \quad I \subset \mathbb{R}.$$
The corresponding velocities take their value in the tangent spaces of X:
$$\gamma'(t) \in T_{\gamma(t)} X.$$
The “coupling” of the two quantities is described by the composition
$$\phi \circ \gamma : I \to Y.$$
The derivative of φ at any given x ∈ X is a linear mapping
$$\phi_*(x) : T_x X \to T_{\phi(x)} Y,$$
called the pushforward of φ at x, with the property that for every trajectory γ passing through x at time t, we have
$$(\phi \circ \gamma)'(t) = \phi_*(x)\,\gamma'(t).$$
The above is the multidimensional and coordinate-free generalization of the related rates relation (371.5.1).
All of the above has a perfectly rigorous presentation in terms of manifold theory. The approach of the present entry is more informal; our ambition was merely to motivate the notion of a derivative by describing it as a linear transformation between velocity vectors.
Version: 2 Owner: rmilson Author(s): rmilson
Chapter 372 26A27 – Nondiﬀerentiability (nondiﬀerentiable functions, points of nondiﬀerentiability), discontinuous derivatives
372.1 Weierstrass function
The Weierstrass function is a continuous function that is nowhere differentiable, and hence is not an analytic function. The formula for the Weierstrass function is
$$f(x) = \sum_{n=1}^{\infty} b^n \cos(a^n \pi x)$$
with a odd, 0 < b < 1, and $ab > 1 + \frac{3}{2}\pi$.
Another example of an everywhere continuous but nowhere differentiable curve is the fractal Koch curve. [insert plot of Weierstrass function]
Version: 5 Owner: akrowne Author(s): akrowne
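To get a feel for the series, one can evaluate a partial sum. The sketch below is an illustration under assumed parameters: a = 13 and b = 0.5 are one hypothetical choice satisfying a odd, 0 < b < 1 and ab = 6.5 > 1 + 3π/2, and the truncation at 12 terms is arbitrary. A finite partial sum is smooth, so this only hints at the roughness of the limit.

```python
import math


def weierstrass_partial(x, a=13, b=0.5, terms=12):
    """Partial sum of sum_{n>=1} b^n cos(a^n * pi * x) with the given number of terms."""
    return sum(b**n * math.cos(a**n * math.pi * x) for n in range(1, terms + 1))


# The parameter constraint from the definition holds for this choice.
print(13 * 0.5 > 1 + 1.5 * math.pi)

# At x = 0 every cosine equals 1, so the partial sum is the geometric sum 1 - b^terms.
print(weierstrass_partial(0.0))

# Sample a few nearby points; the values stay within the bound sum b^n < 1.
for x in (0.1, 0.1001, 0.1002):
    print(x, weierstrass_partial(x))
```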
Chapter 373 26A36 – Antidiﬀerentiation
373.1 antiderivative
The function F(x) is called an antiderivative of a function f(x) if (and only if) the derivative of F is equal to f:
$$F'(x) = f(x).$$
Note that a function f(x) which has one antiderivative has infinitely many, since any constant can be added to or subtracted from a valid antiderivative to yield another equally valid antiderivative. To account for this, we express the general antiderivative, or indefinite integral, as follows:
$$\int f(x)\,dx = F(x) + C,$$
where C is an arbitrary constant called the constant of integration. The dx portion means "with respect to x", because after all, our functions F and f are functions of x.
Version: 4 Owner: xriso Author(s): xriso
373.2 integration by parts
When one has an integral of a product of two functions, it is sometimes preferable to simplify the integrand by integrating one of the functions and differentiating the other. This process is called integrating by parts, and is given by the following formula, where u and v are functions of x:
$$\int u \cdot v'\,dx = u \cdot v - \int v \cdot u'\,dx.$$
This process may be repeated indefinitely, and in some cases it may be used to solve for the original integral algebraically. For definite integrals, the rule appears as
$$\int_a^b u(x) \cdot v'(x)\,dx = (u(b) \cdot v(b) - u(a) \cdot v(a)) - \int_a^b v(x) \cdot u'(x)\,dx.$$
Proof: Integration by parts is simply the antiderivative of a product rule. Let G(x) = u(x) · v(x). Then,
$$G'(x) = u'(x) v(x) + u(x) v'(x).$$
Therefore,
$$G'(x) - v(x) u'(x) = u(x) v'(x).$$
We can now integrate both sides with respect to x to get
$$G(x) - \int v(x) u'(x)\,dx = \int u(x) v'(x)\,dx,$$
which is just integration by parts rearranged.
Example: We integrate the function f(x) = x sin x. We define u(x) := x and v'(x) := sin x, so that u'(x) = 1 and v(x) = −cos x. Integration by parts then yields
$$\int x \sin x\,dx = -x \cos x + \int \cos x\,dx = -x \cos x + \sin x.$$
Version: 5 Owner: mathwizard Author(s): mathwizard, slider142
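The definite form of the rule can be checked numerically on the example u(x) = x, v'(x) = sin x. The sketch below uses a plain midpoint rule (an assumed stand-in for exact integration, with a hypothetical helper name) on the interval [0, π], where the antiderivative −x cos x + sin x gives the exact value π.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule on n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.0, math.pi
u = lambda x: x
du = lambda x: 1.0
v = lambda x: -math.cos(x)  # an antiderivative of sin
dv = math.sin

# Left-hand side: integral of u v' over [a, b].
lhs = midpoint_integral(lambda x: u(x) * dv(x), a, b)

# Right-hand side: boundary term minus the integral of v u'.
rhs = (u(b) * v(b) - u(a) * v(a)) - midpoint_integral(lambda x: v(x) * du(x), a, b)

print(lhs, rhs, math.pi)  # all three agree up to discretization error
```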
373.3 integration by parts for the Lebesgue integral
Theorem [1, 2] Suppose f, g are complex valued functions on a bounded interval [a, b]. If f and g are absolutely continuous, then
$$\int_{[a,b]} f' g = -\int_{[a,b]} f g' + f(b) g(b) - f(a) g(a),$$
where both integrals are Lebesgue integrals.
Remark Any absolutely continuous function can be differentiated almost everywhere. Thus, in the above, the functions f' and g' make sense.
Proof. Since f, g and fg are almost everywhere differentiable with Lebesgue integrable derivatives (see this page), we have
$$(fg)' = f'g + fg'$$
almost everywhere, and
$$\int_{[a,b]} (fg)' = \int_{[a,b]} (f'g + fg') = \int_{[a,b]} f'g + \int_{[a,b]} fg'.$$
The last equality is justified since f'g and fg' are integrable. For instance, we have
$$\int_{[a,b]} |f'g| \le \max_{x \in [a,b]} |g(x)| \int_{[a,b]} |f'|,$$
which is finite since g is continuous and f' is Lebesgue integrable. Now the claim follows from the Fundamental theorem of calculus for the Lebesgue integral. □
REFERENCES
1. Jones, F., Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993.
2. Ng, Tze Beng, Integration by Parts, online.
Version: 4 Owner: matte Author(s): matte
Chapter 374 26A42 – Integrals of Riemann, Stieltjes and Lebesgue type
374.1 Riemann sum
Suppose there is a function f : I → R where I = [a, b] is a closed interval, and f is bounded on I. If we have a finite set of points $\{x_0, x_1, x_2, \ldots, x_n\}$ such that $a = x_0 < x_1 < x_2 < \cdots < x_n = b$, then this set creates a partition $P = \{[x_0, x_1), [x_1, x_2), \ldots, [x_{n-1}, x_n]\}$ of I. If P is a partition with n ∈ N elements of I, then the Riemann sum of f over I with the partition P is defined as
$$S = \sum_{i=1}^{n} f(y_i)(x_i - x_{i-1})$$
where $x_{i-1} \le y_i \le x_i$. The choice of $y_i$ is arbitrary. If $y_i = x_{i-1}$ for all i, then S is called a left Riemann sum. If $y_i = x_i$ for all i, then S is called a right Riemann sum. Suppose we have
$$S = \sum_{i=1}^{n} b_i (x_i - x_{i-1})$$
where $b_i$ is the supremum of f over $[x_{i-1}, x_i]$; then S is defined to be an upper Riemann sum. Similarly, if $b_i$ is the infimum of f over $[x_{i-1}, x_i]$, then S is a lower Riemann sum.
Version: 3 Owner: mathcam Author(s): mathcam, vampyr
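The definitions of left and right Riemann sums translate directly into code. The following sketch assumes a uniform partition and the sample function $f(x) = x^2$ on [0, 1] (both hypothetical choices); because this f is increasing there, the left sum is also the lower sum and the right sum the upper sum.

```python
def uniform_partition(a, b, n):
    """Partition points a = x_0 < x_1 < ... < x_n = b with equal spacing."""
    return [a + (b - a) * i / n for i in range(n + 1)]


def riemann_sum(f, points, tags):
    """Sum of f(y_i) * (x_i - x_{i-1}) for tags y_i in [x_{i-1}, x_i]."""
    return sum(f(y) * (x1 - x0) for x0, x1, y in zip(points, points[1:], tags))


f = lambda x: x * x
pts = uniform_partition(0.0, 1.0, 1000)

left = riemann_sum(f, pts, pts[:-1])   # tags y_i = x_{i-1}
right = riemann_sum(f, pts, pts[1:])   # tags y_i = x_i

# Both sums bracket the exact integral 1/3.
print(left, right)
```

For an increasing function on a uniform partition the gap between the two sums is (f(b) − f(a))(b − a)/n, which here is 1/1000.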
374.2 Riemann-Stieltjes integral
Let f and α be bounded, real-valued functions defined upon a closed finite interval I = [a, b] of R (a ≠ b), $P = \{x_0, \ldots, x_n\}$ a partition of I, and $t_i$ a point of the subinterval $[x_{i-1}, x_i]$. A sum of the form
$$S(P, f, \alpha) = \sum_{i=1}^{n} f(t_i)(\alpha(x_i) - \alpha(x_{i-1}))$$
is called a Riemann-Stieltjes sum of f with respect to α. f is said to be Riemann integrable with respect to α on I if there exists A ∈ R such that given any ε > 0 there exists a partition $P_\varepsilon$ of I for which, for all P finer than $P_\varepsilon$ and for every choice of points $t_i$, we have
$$|S(P, f, \alpha) - A| < \varepsilon.$$
If such an A exists, then it is unique and is known as the Riemann-Stieltjes integral of f with respect to α. f is known as the integrand and α the integrator. The integral is denoted by
$$\int_a^b f\,d\alpha \quad \text{or} \quad \int_a^b f(x)\,d\alpha(x).$$
Version: 3 Owner: vypertd Author(s): vypertd
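A Riemann-Stieltjes sum is straightforward to compute once a partition and tags are fixed. The sketch below is illustrative only: it assumes the integrand f(x) = x, the integrator $\alpha(x) = x^2$ on [0, 1], a uniform partition and midpoint tags. Since this α is differentiable, the integral equals $\int_0^1 x \cdot 2x\,dx = 2/3$, which the sum should approach.

```python
def rs_sum(f, alpha, points, tags):
    """Riemann-Stieltjes sum: sum of f(t_i) * (alpha(x_i) - alpha(x_{i-1}))."""
    return sum(
        f(t) * (alpha(x1) - alpha(x0))
        for x0, x1, t in zip(points, points[1:], tags)
    )


n = 1000
points = [i / n for i in range(n + 1)]                       # uniform partition of [0, 1]
tags = [(x0 + x1) / 2 for x0, x1 in zip(points, points[1:])]  # midpoints as t_i

f = lambda x: x
alpha = lambda x: x * x

S = rs_sum(f, alpha, points, tags)
print(S)  # close to 2/3
```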
374.3 continuous functions are Riemann integrable
Let f : [a, b] → R be a continuous function. Then f is Riemann integrable. Version: 2 Owner: paolini Author(s): paolini
374.4 generalized Riemann integral
A function f : [a, b] → R is said to be generalized Riemann integrable on [a, b] if there exists a number L ∈ R such that for every ε > 0 there exists a gauge $\delta_\varepsilon$ on [a, b] such that if $\dot{P}$ is any $\delta_\varepsilon$-fine partition of [a, b], then
$$|S(f; \dot{P}) - L| < \varepsilon,$$
where $S(f; \dot{P})$ is the Riemann sum for f using the partition $\dot{P}$. The collection of all generalized Riemann integrable functions is usually denoted by $R^*[a, b]$. If $f \in R^*[a, b]$ then the number L is uniquely determined, and is called the generalized Riemann integral of f over [a, b].
Version: 3 Owner: vypertd Author(s): vypertd
374.5 proof of Continuous functions are Riemann integrable
Recall the definition of Riemann integral. To prove that f is integrable we have to prove that $\lim_{\delta \to 0^+} S^*(\delta) - S_*(\delta) = 0$. Since $S^*(\delta)$ is decreasing and $S_*(\delta)$ is increasing it is enough to show that given ε > 0 there exists δ > 0 such that $S^*(\delta) - S_*(\delta) < \varepsilon$. So let ε > 0 be fixed.
By the Heine-Cantor theorem f is uniformly continuous, i.e.
$$\exists \delta > 0: \quad |x - y| < \delta \Rightarrow |f(x) - f(y)| < \frac{\varepsilon}{b - a}.$$
Let now P be any partition of [a, b] in C(δ), i.e. a partition $\{x_0 = a, x_1, \ldots, x_N = b\}$ such that $x_{i+1} - x_i < \delta$. In any small interval $[x_i, x_{i+1}]$ the function f (being continuous) has a maximum $M_i$ and minimum $m_i$. Since f is uniformly continuous and $x_{i+1} - x_i < \delta$, we have $M_i - m_i < \varepsilon/(b - a)$. So the difference between upper and lower Riemann sums is
$$\sum_i M_i (x_{i+1} - x_i) - \sum_i m_i (x_{i+1} - x_i) \le \frac{\varepsilon}{b - a} \sum_i (x_{i+1} - x_i) = \varepsilon.$$
Since this is true for every partition P in C(δ) we conclude that $S^*(\delta) - S_*(\delta) < \varepsilon$.
Version: 1 Owner: paolini Author(s): paolini
Chapter 375 26A51 – Convexity, generalizations
375.1 concave function
Let f(x) be a continuous function defined on an interval [a, b]. Then we say that f is a concave function on [a, b] if, for any $x_1, x_2$ in [a, b] and any λ ∈ [0, 1] we have
$$f(\lambda x_1 + (1 - \lambda) x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2).$$
The definition is equivalent to the statements:
• For all $x_1, x_2$ in [a, b], $f\left(\frac{x_1 + x_2}{2}\right) \ge \frac{f(x_1) + f(x_2)}{2}$.
• The second derivative of f is nonpositive on [a, b].
• f has a derivative which is monotone decreasing.
Obviously, the last two items apply provided f has the required derivatives. An example of a concave function is $f(x) = -x^2$ on the interval [−5, 5].
Version: 5 Owner: drini Author(s): drini
Chapter 376 26Axx – Functions of one variable
376.1 function centroid
Let f : D ⊂ R → R be an arbitrary function. By analogy with the geometric centroid, the centroid of a function f is defined as:
$$\langle x \rangle = \frac{\int x f(x)\,dx}{\int f(x)\,dx},$$
where the integrals are taken over the domain D. Version: 1 Owner: vladm Author(s): vladm
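A numerical sketch of the centroid formula follows. It assumes the sample function f(x) = x on the domain D = [0, 1] and uses a midpoint rule for both integrals (the helper names are hypothetical); the exact value for this choice is (1/3)/(1/2) = 2/3.

```python
def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def centroid(f, a, b):
    """Ratio of the integral of x f(x) to the integral of f(x) over [a, b]."""
    numerator = midpoint_integral(lambda x: x * f(x), a, b)
    denominator = midpoint_integral(f, a, b)
    return numerator / denominator


c = centroid(lambda x: x, 0.0, 1.0)
print(c)  # close to 2/3
```

The ratio is only meaningful when the integral of f over D is nonzero, which the code does not check.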
Chapter 377 26B05 – Continuity and diﬀerentiation questions
377.1 $C_0^\infty(U)$ is not empty
Theorem If U is a nonempty open set in $\mathbb{R}^n$, then the set of smooth functions with compact support $C_0^\infty(U)$ is not empty.
The proof is divided into three subclaims:
Claim 1 Let a < b be real numbers. Then there exists a smooth nonnegative function f : R → R, whose support is the compact set [a, b].
To prove Claim 1, we need the following lemma:
Lemma ([4], pp. 14) If
$$\phi(x) = \begin{cases} 0 & \text{for } x \le 0, \\ e^{-1/x} & \text{for } x > 0, \end{cases}$$
then φ : R → R is a nonnegative smooth function. (A proof of the Lemma can be found in [4].)
Proof of Claim 1. Using the lemma, let us define
$$f(x) = \phi(x - a)\phi(b - x).$$
Since φ is smooth, it follows that f is smooth. Also, from the definition of φ, we see that φ(x − a) = 0 precisely when x ≤ a, and φ(b − x) = 0 precisely when x ≥ b. Thus the support of f is indeed [a, b]. □
Claim 2 Let $a_i, b_i$ be real numbers with $a_i < b_i$ for all i = 1, ..., n. Then there exists a smooth nonnegative function $f : \mathbb{R}^n \to \mathbb{R}$ whose support is the compact set $[a_1, b_1] \times \cdots \times [a_n, b_n]$.
Proof of Claim 2. Using Claim 1, we can for each i = 1, ..., n construct a function $f_i$ with support $[a_i, b_i]$. Then
$$f(x_1, \ldots, x_n) = f_1(x_1) f_2(x_2) \cdots f_n(x_n)$$
gives a smooth function with the sought properties. □
Claim 3 If U is a nonempty open set in $\mathbb{R}^n$, then there are real numbers $a_i < b_i$ for i = 1, ..., n such that $[a_1, b_1] \times \cdots \times [a_n, b_n]$ is a subset of U.
Proof of Claim 3. Here, of course, we assume that $\mathbb{R}^n$ is equipped with the usual topology induced by the open balls of the Euclidean metric. Since U is nonempty, there exists some point x in U. Further, since U is open and the topology has a basis consisting of open balls, there exist y ∈ U and ε > 0 such that x is contained in the open ball B(y, ε) ⊂ U. Let us now set $a_i = y_i - \frac{\varepsilon}{2\sqrt{n}}$ and $b_i = y_i + \frac{\varepsilon}{2\sqrt{n}}$ for all i = 1, ..., n. Then $D = [a_1, b_1] \times \cdots \times [a_n, b_n]$ can be parametrized as
$$D = \left\{ y + (\lambda_1, \ldots, \lambda_n) \frac{\varepsilon}{2\sqrt{n}} \;\middle|\; \lambda_i \in [-1, 1] \text{ for all } i = 1, \ldots, n \right\}.$$
For an arbitrary point in D, we have
$$\left\| y + (\lambda_1, \ldots, \lambda_n) \frac{\varepsilon}{2\sqrt{n}} - y \right\| = \|(\lambda_1, \ldots, \lambda_n)\| \frac{\varepsilon}{2\sqrt{n}} = \frac{\varepsilon}{2\sqrt{n}} \sqrt{\lambda_1^2 + \cdots + \lambda_n^2} \le \frac{\varepsilon}{2} < \varepsilon,$$
so D ⊂ B(y, ε) ⊂ U, and Claim 3 follows. □
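The constructions in Claims 1 and 2 can be written down directly. The sketch below is an illustration with hypothetical function names: `phi` is the function from the Lemma, `bump` is the one-dimensional function φ(x − a)φ(b − x), and `bump_nd` is the product over coordinates used in Claim 2. Floating-point evaluation can only show where the functions vanish, not their smoothness.

```python
import math


def phi(x):
    """The function from the Lemma: 0 for x <= 0 and exp(-1/x) for x > 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def bump(x, a=0.0, b=1.0):
    """Claim 1: phi(x - a) * phi(b - x), positive exactly on the open interval (a, b)."""
    return phi(x - a) * phi(b - x)


def bump_nd(point, box):
    """Claim 2: product of one-dimensional bumps over the box [(a_1, b_1), ..., (a_n, b_n)]."""
    value = 1.0
    for x_i, (a_i, b_i) in zip(point, box):
        value *= bump(x_i, a_i, b_i)
    return value


print(bump(-0.5), bump(0.0), bump(0.5), bump(1.0), bump(2.0))
box = [(0.0, 1.0), (-1.0, 1.0)]
print(bump_nd((0.5, 0.0), box), bump_nd((0.5, 1.5), box))
```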
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
Version: 3 Owner: matte Author(s): matte
377.2 Rademacher's Theorem
Let $f : \mathbb{R}^n \to \mathbb{R}$ be any Lipschitz continuous function. Then f is differentiable at almost every $x \in \mathbb{R}^n$.
Version: 1 Owner: paolini Author(s): paolini
377.3 smooth functions with compact support
Definition [3] Let U be an open set in $\mathbb{R}^n$. Then the set of smooth functions with compact support (in U) is the set of functions $f : \mathbb{R}^n \to \mathbb{C}$ which are smooth (i.e., $\partial^\alpha f : \mathbb{R}^n \to \mathbb{C}$ is a continuous function for all multi-indices α) and for which supp f is compact and contained in U. This function space is denoted by $C_0^\infty(U)$.
Remarks
1. A proof that $C_0^\infty(U)$ is not empty can be found here.
2. With the usual pointwise addition and pointwise multiplication by a scalar, $C_0^\infty(U)$ is a vector space over the field C.
3. Suppose U and V are open subsets in $\mathbb{R}^n$ and U ⊂ V. Then $C_0^\infty(U)$ is a vector subspace of $C_0^\infty(V)$. In particular, $C_0^\infty(U) \subset C_0^\infty(V)$.
It is possible to equip $C_0^\infty(U)$ with a topology, which makes $C_0^\infty(U)$ into a locally convex topological vector space. The definition of this topology, however, is rather involved (see e.g. [3]). The next theorem shows when a sequence converges in this topology.
Theorem 1 Suppose that U is an open set in $\mathbb{R}^n$, and that $\{\phi_i\}_{i=1}^\infty$ is a sequence of functions in $C_0^\infty(U)$. Then $\{\phi_i\}$ converges (in the aforementioned topology) to a function $\phi \in C_0^\infty(U)$ if and only if the following conditions hold:
1. There is a compact set K ⊂ U such that supp $\phi_i$ ⊂ K for all i = 1, 2, ....
2. For every multi-index α, $\partial^\alpha \phi_i \to \partial^\alpha \phi$ in the sup-norm.
Theorem 2 Suppose that U is an open set in $\mathbb{R}^n$, that Γ is a locally convex topological vector space, and that $L : C_0^\infty(U) \to \Gamma$ is a linear map. Then L is a continuous map if and only if the following condition holds:
If K is a compact subset of U, and $\{\phi_i\}_{i=1}^\infty$ is a sequence of functions in $C_0^\infty(U)$ such that supp $\phi_i$ ⊂ K for all i, and $\phi_i \to \phi$ (in $C_0^\infty(U)$) for some $\phi \in C_0^\infty(U)$, then $L\phi_i \to L\phi$ (in Γ).
The above theorems are stated without proof in [1].
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.
Version: 3 Owner: matte Author(s): matte
Chapter 378 26B10 – Implicit function theorems, Jacobians, transformations with several variables
378.1 Jacobian matrix
The Jacobian $[Jf(x)]$ of a function $f : \mathbb{R}^n \to \mathbb{R}^m$ is the matrix of partial derivatives
$$[Jf(x)] = \begin{bmatrix} D_1 f_1(x) & \cdots & D_n f_1(x) \\ \vdots & \ddots & \vdots \\ D_1 f_m(x) & \cdots & D_n f_m(x) \end{bmatrix}.$$
A more concise way of writing it is
$$[Jf(x)] = [\overrightarrow{D_1 f}, \cdots, \overrightarrow{D_n f}] = \begin{bmatrix} \nabla f_1 \\ \vdots \\ \nabla f_m \end{bmatrix},$$
where $\overrightarrow{D_n f}$ is the partial derivative with respect to the nth variable and $\nabla f_m$ is the gradient of the mth component of $f$. The Jacobian matrix represents the full derivative matrix $[Df(x)]$ of $f$ at $x$ iff $f$ is differentiable at $x$. Also, if $f$ is differentiable at $x$, then $[Jf(x)] = [Df(x)]$ and the directional derivative in the direction $v$ is $[Df(x)]v$.
Version: 9 Owner: slider142 Author(s): slider142
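As a rough numerical illustration of the definition above, the following sketch approximates the Jacobian entry by entry with central differences. The helper name `jacobian`, the step size `h`, and the sample map `f` are assumptions made for this example only; they do not come from the entry.

```python
def jacobian(f, x, h=1e-6):
    """Approximate the m-by-n Jacobian of f: R^n -> R^m at x by central differences."""
    n = len(x)
    m = len(f(x))
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            # Row i, column j approximates D_j f_i(x).
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def f(p):
    # Hypothetical sample map R^2 -> R^2, chosen only for illustration.
    x, y = p
    return [x * x * y, 5 * x + y ** 3]


J = jacobian(f, [1.0, 2.0])
# Computed by hand, the Jacobian is [[2xy, x^2], [5, 3y^2]],
# which at (1, 2) equals [[4, 1], [5, 12]].
print(J)
```

Each row of `J` is the (approximate) gradient of one component of `f`, matching the row-of-gradients description of the matrix.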
378.2 directional derivative
Partial derivatives measure the rate at which a multivariable function $f$ varies as the variable moves in the direction of the standard basis vectors. Directional derivatives measure the rate at which $f$ varies when the variable moves in the direction $v$. Thus the directional derivative of $f$ at $a$ in the direction $v$ is represented as
$$D_v f(a) = \frac{\partial f(a)}{\partial v} = \lim_{h \to 0} \frac{f(a + hv) - f(a)}{h}.$$
For example, if $f(x, y, z) = x^2 + 3y^2 z$, and we wanted to find the derivative at the point $a = (1, 2, 3)^T$ in the direction $v = (1, 1, 1)^T$, our equation would be
$$\lim_{h \to 0} \frac{1}{h}\left((1 + h)^2 + 3(2 + h)^2 (3 + h) - 37\right) = \lim_{h \to 0} \frac{1}{h}\left(3h^3 + 22h^2 + 50h\right) = \lim_{h \to 0} \left(3h^2 + 22h + 50\right) = 50.$$
One may also use the Jacobian matrix, if the function is differentiable, to find the derivative in the direction $v$ as $[Jf(x)]v$.
Version: 6 Owner: slider142 Author(s): slider142
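The worked example can be checked numerically. The sketch below evaluates the difference quotient for shrinking h and compares it with the Jacobian-times-v route; the function and variable names are chosen here for illustration, and the gradient is written out by hand rather than derived by the code.

```python
def f(x, y, z):
    return x ** 2 + 3 * y ** 2 * z


a = (1.0, 2.0, 3.0)
v = (1.0, 1.0, 1.0)


def difference_quotient(h):
    """(f(a + h v) - f(a)) / h for the example point a and direction v."""
    shifted = [ai + h * vi for ai, vi in zip(a, v)]
    return (f(*shifted) - f(*a)) / h


# Gradient of f computed by hand: (2x, 6yz, 3y^2).
grad = (2 * a[0], 6 * a[1] * a[2], 3 * a[1] ** 2)
via_jacobian = sum(g * vi for g, vi in zip(grad, v))

for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(h))   # values approach the limit 50
print("Jacobian times v:", via_jacobian)
```

Both routes agree on the value 50 of the directional derivative at a in the direction v.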
378.3 gradient
Summary. The gradient is a first-order differential operator that maps functions to vector fields. It is a generalization of the ordinary derivative, and as such conveys information about the rate of change of a function relative to small variations in the independent variables. The gradient of a function $f$ is customarily denoted by $\nabla f$ or by $\operatorname{grad} f$.
Definition: Euclidean space. Consider n-dimensional Euclidean space with orthogonal coordinates $x_1, \ldots, x_n$, and corresponding unit vectors $e_1, \ldots, e_n$. In this setting, the gradient of a function $f(x_1, \ldots, x_n)$ is defined to be the vector field given by
$$\nabla f = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, e_i.$$
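It is also useful to represent the gradient operator as the vector-valued differential operator
$$\nabla = \sum_{i=1}^{n} e_i \frac{\partial}{\partial x_i},$$
or, in the context of Euclidean 3-space, as
$$\nabla = \mathbf{i} \frac{\partial}{\partial x} + \mathbf{j} \frac{\partial}{\partial y} + \mathbf{k} \frac{\partial}{\partial z},$$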
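where $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the unit vectors lying along the positive direction of the x, y, z axes, respectively. Using this formalism, the symbol $\nabla$ can be used to express the divergence operator as $\nabla\cdot$, the curl operator as $\nabla\times$, and the Laplacian operator as $\nabla^2$. To wit, for a given vector field $A = A_x \mathbf{i} + A_y \mathbf{j} + A_z \mathbf{k}$, and a given function $f$ we have
$$\nabla \cdot A = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z},$$
$$\nabla \times A = \left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\mathbf{k},$$
$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.$$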
Definition: Riemannian geometry. More generally still, consider a Riemannian manifold with metric tensor $g_{ij}$ and inverse $g^{ij}$. In this setting the gradient $X = \operatorname{grad} f$ of a function $f$, relative to a general coordinate system, is given by
$$X^j = g^{ij} f_{,i}. \qquad (378.3.1)$$
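Note that the Einstein summation convention is in force above. Also note that $f_{,i}$ denotes the partial derivative of $f$ with respect to the ith coordinate.
Definition (378.3.1) is useful even in the Euclidean setting, because it can be used to derive the formula for the gradient in various generalized coordinate systems. For example, in the cylindrical system of coordinates $(r, \theta, z)$ we have
$$g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
while for the system of spherical coordinates $(\rho, \phi, \theta)$ we have
$$g_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \rho^2 & 0 \\ 0 & 0 & \rho^2 \sin^2\phi \end{pmatrix}.$$
Hence, for a given function $f$ we have
$$\nabla f = \frac{\partial f}{\partial r}\, e_r + \frac{1}{r}\frac{\partial f}{\partial \theta}\, e_\theta + \frac{\partial f}{\partial z}\, \mathbf{k} \qquad \text{(cylindrical)},$$
$$\nabla f = \frac{\partial f}{\partial \rho}\, e_\rho + \frac{1}{\rho}\frac{\partial f}{\partial \phi}\, e_\phi + \frac{1}{\rho \sin\phi}\frac{\partial f}{\partial \theta}\, e_\theta \qquad \text{(spherical)},$$
where for the cylindrical system
$$e_r = \frac{\partial}{\partial r} = \frac{x}{r}\mathbf{i} + \frac{y}{r}\mathbf{j}, \qquad e_\theta = \frac{1}{r}\frac{\partial}{\partial \theta} = -\frac{y}{r}\mathbf{i} + \frac{x}{r}\mathbf{j}$$
are the unit vectors in the direction of increase of $r$ and $\theta$, respectively, and for the spherical system
$$e_\rho = \frac{\partial}{\partial \rho} = \frac{x}{\rho}\mathbf{i} + \frac{y}{\rho}\mathbf{j} + \frac{z}{\rho}\mathbf{k}, \qquad e_\phi = \frac{1}{\rho}\frac{\partial}{\partial \phi} = \frac{zx}{r\rho}\mathbf{i} + \frac{zy}{r\rho}\mathbf{j} - \frac{r}{\rho}\mathbf{k}, \qquad e_\theta = \frac{1}{\rho \sin\phi}\frac{\partial}{\partial \theta} = -\frac{y}{r}\mathbf{i} + \frac{x}{r}\mathbf{j}$$
are the unit vectors in the direction of increase of $\rho$, $\phi$, $\theta$, respectively (here $r = \sqrt{x^2 + y^2} = \rho \sin\phi$).
Physical Interpretation. In the simplest case, we consider the Euclidean plane with Cartesian coordinates $x, y$. The gradient of a function $f(x, y)$ is given by
$$\nabla f = \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j},$$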
where $\mathbf{i}, \mathbf{j}$ denote, respectively, the standard unit horizontal and vertical vectors. The gradient vectors have the following geometric interpretation. Consider the graph $z = f(x, y)$ as a surface in 3-space. The direction of the gradient vector $\nabla f$ is the direction of steepest ascent, while the magnitude is the slope in that direction. Thus,
$$\|\nabla f\| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}$$
describes the steepness of the hill $z = f(x, y)$ at a point on the hill located at $(x, y, f(x, y))$.
A more general conception of the gradient is based on the interpretation of a function $f$ as a potential corresponding to some conservative physical force. The negation of the gradient, $-\nabla f$, is then interpreted as the corresponding force field.
Differential identities. Several properties of the one-dimensional derivative generalize to a multidimensional setting:
$\nabla(af + bg) = a\nabla f + b\nabla g$ (Linearity)
$\nabla(fg) = f\nabla g + g\nabla f$ (Product rule)
$\nabla(\phi \circ f) = (\phi' \circ f)\nabla f$ (Chain rule)
Version: 9 Owner: rmilson Author(s): rmilson, slider142
378.4 implicit differentiation
Implicit differentiation is a tool used to analyze functions that cannot be conveniently put into a form $y = f(x)$ where $x = (x_1, x_2, \ldots, x_n)$. To use implicit differentiation meaningfully, you must be certain that your function is of the form $f(x) = 0$ (it can be written as a level set) and that it satisfies the hypotheses of the implicit function theorem ($f$ must be continuous, its first partial derivatives must be continuous, and the derivative with respect to the implicit function must be nonzero). To actually differentiate implicitly, we use the chain rule to differentiate the entire equation.
Example: The first step is to identify the implicit function. For simplicity in the example, we will assume $f(x, y) = 0$ and $y$ is an implicit function of $x$. Let $f(x, y) = x^2 + y^2 + xy = 0$. (Since this is a two dimensional equation, all one has to check is that the graph of $y$ may be an implicit function of $x$ in local neighborhoods.) Then, to differentiate implicitly, we differentiate both sides of the equation with respect to $x$. We will get
$$2x + 2y \cdot \frac{dy}{dx} + x \cdot 1 \cdot \frac{dy}{dx} + y = 0.$$
Do you see how we used the chain rule in the above equation? Next, we simply solve for our implicit derivative
$$\frac{dy}{dx} = -\frac{2x + y}{2y + x}.$$
Note that the derivative depends on both the variable $x$ and the implicit function $y$. Most of your derivatives will be functions of one or all the variables, including the implicit function itself.
[better example and ?multidimensional? coming]
Version: 2 Owner: slider142 Author(s): slider142
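A numerical sanity check of the implicit derivative is sketched below. One caveat: over the reals the equation x^2 + y^2 + xy = 0 is satisfied only at the origin, so this sketch assumes the shifted level set x^2 + y^2 + xy = 3 instead, which passes through (1, 1). The constant differentiates to zero, so the formula for dy/dx is the same. The names `y_branch` and `implicit_slope` are hypothetical helpers.

```python
import math


def y_branch(x):
    """Upper solution y(x) of x^2 + y^2 + x*y = 3 (quadratic formula in y)."""
    return (-x + math.sqrt(12 - 3 * x * x)) / 2


def implicit_slope(x, y):
    """dy/dx obtained by implicit differentiation: -(2x + y) / (2y + x)."""
    return -(2 * x + y) / (2 * y + x)


x0 = 1.0
y0 = y_branch(x0)   # the point (1, 1) lies on the assumed curve
h = 1e-6
# Slope of the explicit branch, estimated by a central difference.
numeric = (y_branch(x0 + h) - y_branch(x0 - h)) / (2 * h)
formula = implicit_slope(x0, y0)
print(numeric, formula)
```

The two slopes agree up to discretization error, without the explicit branch ever being differentiated symbolically.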
378.5 implicit function theorem
Let $f = (f_1, \ldots, f_n)$ be a continuously differentiable, vector-valued function mapping an open set $E \subset \mathbb{R}^{n+m}$ into $\mathbb{R}^n$. Let $(a, b) = (a_1, \ldots, a_n, b_1, \ldots, b_m)$ be a point in $E$ for which $f(a, b) = 0$ and such that the $n \times n$ determinant $|D_j f_i(a, b)| \neq 0$ for $i, j = 1, \ldots, n$. Then there exists an m-dimensional neighbourhood $W$ of $b$ and a unique continuously differentiable function $g : W \to \mathbb{R}^n$ such that $g(b) = a$ and $f(g(t), t) = 0$ for all $t \in W$.
Simplest case. When $n = m = 1$, the theorem reduces to: Let $F$ be a continuously differentiable, real-valued function defined on an open set $E \subset \mathbb{R}^2$ and let $(x_0, y_0)$ be a point of $E$ for which $F(x_0, y_0) = 0$ and such that
$$\left.\frac{\partial F}{\partial x}\right|_{(x_0, y_0)} \neq 0.$$
Then there exists an open interval $I$ containing $y_0$, and a unique function $f : I \to \mathbb{R}$ which is continuously differentiable and such that $f(y_0) = x_0$ and $F(f(y), y) = 0$ for all $y \in I$.
Note. The inverse function theorem is a special case of the implicit function theorem where the dimension of each variable is the same.
Version: 7 Owner: vypertd Author(s): vypertd
378.6 proof of implicit function theorem
Consider the function $F : E \to \mathbb{R}^n \times \mathbb{R}^m$ defined by $F(x, y) = (f(x, y), y)$. Setting
$$A_{jk} = \frac{\partial f_j}{\partial x_k}(a, b) \quad \text{and} \quad M_{ji} = \frac{\partial f_j}{\partial y_i}(a, b),$$
$A$ is an $n \times n$ matrix and $M$ is $n \times m$. It holds that
$$Df(a, b) = (A \mid M)$$
and hence
$$DF(a, b) = \begin{pmatrix} A & M \\ 0 & I_m \end{pmatrix}.$$
Since $\det A \neq 0$ by hypothesis, $A$ is invertible and hence $DF(a, b)$ is invertible too. Note that $F(a, b) = (0, b)$. Applying the inverse function theorem to $F$ we find that there exist a neighbourhood $V$ of $0$ in $\mathbb{R}^n$ and $W$ of $b$ and a function $G \in C^1(V \times W, \mathbb{R}^{n+m})$ such that $F(G(x, y)) = (x, y)$ for all $(x, y) \in V \times W$. Letting $G(x, y) = (G_1(x, y), G_2(x, y))$ (so that $G_1 : V \times W \to \mathbb{R}^n$, $G_2 : V \times W \to \mathbb{R}^m$) we hence have
$$(x, y) = F(G_1(x, y), G_2(x, y)) = (f(G_1(x, y), G_2(x, y)), G_2(x, y))$$
and hence $y = G_2(x, y)$ and $x = f(G_1(x, y), G_2(x, y)) = f(G_1(x, y), y)$. So we only have to set $g(y) = G_1(0, y)$ to obtain
$$f(g(y), y) = 0, \quad \forall y \in W.$$
Version: 1 Owner: paolini Author(s): paolini
Chapter 379 26B12 – Calculus of vector functions
379.1 Clairaut’s theorem
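Theorem. (Clairaut's Theorem) If $f : \mathbb{R}^n \to \mathbb{R}^m$ is a function whose second partial derivatives exist and are continuous on a set $S \subseteq \mathbb{R}^n$, then
$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$$
on $S$ (where $1 \le i, j \le n$).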
This theorem is commonly referred to as simply 'the equality of mixed partials'. It is usually first presented in a vector calculus course, and is useful in this context for proving basic properties of the interrelations of gradient, divergence, and curl. For example, if $F : \mathbb{R}^3 \to \mathbb{R}^3$ is a function satisfying the hypothesis, then $\nabla \cdot (\nabla \times F) = 0$. Or, if $f : \mathbb{R}^3 \to \mathbb{R}$ is a function satisfying the hypothesis, then $\nabla \times \nabla f = 0$.
Version: 10 Owner: flynnheiss Author(s): flynnheiss
379.2 Fubini's Theorem
Fubini's Theorem. Let $I \subset \mathbb{R}^N$ and $J \subset \mathbb{R}^M$ be compact intervals, and let $f : I \times J \to \mathbb{R}^K$ be a Riemann integrable function such that, for each $x \in I$, the integral
$$F(x) := \int_J f(x, y)\, d\mu_J(y)$$
exists. Then $F : I \to \mathbb{R}^K$ is Riemann integrable, and
$$\int_I F = \int_{I \times J} f.$$
This theorem effectively states that, given a function of N variables, you may integrate it one variable at a time, and that the order of integration does not affect the result.
Example. Let $I := [0, \pi/2] \times [0, \pi/2]$, and let $f : I \to \mathbb{R}$, $(x, y) \mapsto \sin(x)\cos(y)$ be a function. Then
$$\int_I f = \iint_{[0,\pi/2]\times[0,\pi/2]} \sin(x)\cos(y) = \int_0^{\pi/2}\left(\int_0^{\pi/2} \sin(x)\cos(y)\, dy\right) dx = \int_0^{\pi/2} \sin(x)\,(1 - 0)\, dx = (0 - (-1)) = 1.$$
Note that it is often simpler (and no less correct) to write $\int \cdots \int_I f$ as $\int_I f$.
Version: 3 Owner: vernondalhart Author(s): vernondalhart
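The example can be reproduced numerically by nesting one-dimensional quadrature in either order. The sketch below uses a plain midpoint rule; the helper `midpoint` and the subdivision count are assumptions of this illustration, not part of the theorem.

```python
import math


def midpoint(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def f(x, y):
    return math.sin(x) * math.cos(y)


top = math.pi / 2

# Integrate in y first, then in x ...
dy_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, top), 0.0, top)
# ... and in x first, then in y.
dx_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), 0.0, top), 0.0, top)
print(dy_dx, dx_dy)   # both are close to the exact value 1
```

Both orders of integration give the same value up to rounding, and that value is close to 1.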
379.3 Generalised N-dimensional Riemann Sum
Let $I = [a_1, b_1] \times \cdots \times [a_N, b_N]$ be an N-cell in $\mathbb{R}^N$. For each $j = 1, \ldots, N$, let $a_j = t_{j,0} < \ldots < t_{j,N} = b_j$ be a partition $P_j$ of $[a_j, b_j]$. We define a partition $P$ of $I$ as
$$P := P_1 \times \cdots \times P_N.$$
Each partition $P$ of $I$ generates a subdivision of $I$ (denoted by $(I_\nu)_\nu$) of the form
$$I_\nu = [t_{1,j}, t_{1,j+1}] \times \cdots \times [t_{N,k}, t_{N,k+1}].$$
Let $f : U \to \mathbb{R}^M$ be such that $I \subset U$, and let $(I_\nu)_\nu$ be the corresponding subdivision of a partition $P$ of $I$. For each $\nu$, choose $x_\nu \in I_\nu$. Define
$$S(f, P) := \sum_\nu f(x_\nu)\,\mu(I_\nu)$$
as the Riemann sum of $f$ corresponding to the partition $P$. A partition $Q$ of $I$ is called a refinement of $P$ if $P \subset Q$.
Version: 1 Owner: vernondalhart Author(s): vernondalhart
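A minimal sketch of such a Riemann sum follows, under simplifying assumptions that are not part of the definition: the integrand is scalar-valued (M = 1), every axis is split into the same number `n` of equal subintervals, and the tag chosen in each subcell is its midpoint. The function name `riemann_sum` is hypothetical.

```python
from itertools import product


def riemann_sum(f, bounds, n):
    """Riemann sum of a scalar f over the cell prod [a_j, b_j], using n equal
    subintervals per axis and the midpoint of each subcell as the tag."""
    steps = [(b - a) / n for a, b in bounds]
    cell_content = 1.0
    for s in steps:
        cell_content *= s   # content of a subcell, identical for every subcell here
    total = 0.0
    # Each index tuple selects one subcell of the subdivision.
    for idx in product(range(n), repeat=len(bounds)):
        x = [a + (k + 0.5) * s for (a, _), k, s in zip(bounds, idx, steps)]
        total += f(x) * cell_content
    return total


# Hypothetical integrand on the 3-cell [0,1] x [0,2] x [0,3].
approx = riemann_sum(lambda x: x[0] + x[1] + x[2], [(0, 1), (0, 2), (0, 3)], 10)
print(approx)   # the exact integral of x + y + z over this cell is 18
```

Because the midpoint tag integrates linear functions exactly on each subcell, this particular sum already matches the integral up to rounding.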
379.4 Generalized N-dimensional Riemann Integral
Let $I = [a_1, b_1] \times \cdots \times [a_N, b_N] \subset \mathbb{R}^N$ be a compact interval, and let $f : I \to \mathbb{R}^M$ be a function. Suppose there exists a $y \in \mathbb{R}^M$ such that for every $\epsilon > 0$ there is a partition $P_\epsilon$ of $I$ with the property that, for each refinement $P$ of $P_\epsilon$ (and corresponding Riemann sum $S(f, P)$),
$$\|S(f, P) - y\| < \epsilon.$$
Then we say that $f$ is Riemann integrable over $I$, that $y$ is the Riemann integral of $f$ over $I$, and we write
$$\int_I f := \int_I f\, d\mu := y.$$
Note also that it is possible to extend this definition to more arbitrary sets; for any bounded set $D$, one can find a compact interval $I$ such that $D \subset I$, and define a function
$$\tilde{f} : I \to \mathbb{R}^M, \qquad x \mapsto \begin{cases} f(x), & x \in D \\ 0, & x \notin D \end{cases}$$
in which case we define
$$\int_D f := \int_I \tilde{f}.$$
Version: 3 Owner: vernondalhart Author(s): vernondalhart
379.5 Helmholtz equation
It is a partial differential equation which, in scalar form, is
$$\nabla^2 f + k^2 f = 0,$$
or in vector form is
$$\nabla^2 A + k^2 A = 0,$$
where $\nabla^2$ is the Laplacian. The solutions of this equation describe the spatial part of time-harmonic solutions of the wave equation, which is of great interest in physics.
Consider a wave equation
$$\frac{\partial^2 \psi}{\partial t^2} = c^2 \nabla^2 \psi$$
with wave speed $c$. If we look for time-harmonic standing waves of frequency $\omega$,
$$\psi(x, t) = e^{-j\omega t} \phi(x),$$
we find that $\phi(x)$ satisfies the Helmholtz equation
$$(\nabla^2 + k^2)\phi = 0,$$
where $k = \omega/c$ is the wave number.
Usually the Helmholtz equation is solved by the method of separation of variables, in Cartesian, spherical or cylindrical coordinates.
Version: 3 Owner: giri Author(s): giri
379.6 Hessian matrix
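The Hessian of a scalar function of a vector is the matrix of second partial derivatives. So the Hessian matrix of a function $f : \mathbb{R}^n \to \mathbb{R}$ is:
$$\begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \qquad (379.6.1)$$
Note that the Hessian is symmetric because of the equality of mixed partials (which holds when the second partial derivatives are continuous).
Version: 2 Owner: bshanks Author(s): akrowne, bshanks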
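To make the layout of the matrix concrete, here is a small finite-difference sketch that fills in each entry with a central second difference. The helper name `hessian`, the step size, and the sample function are assumptions made for this illustration.

```python
def hessian(f, x, h=1e-4):
    """Approximate the n-by-n Hessian of f: R^n -> R at x by central differences."""
    n = len(x)

    def shifted(i, di, j, dj):
        # Evaluate f after moving coordinate i by di and coordinate j by dj.
        y = list(x)
        y[i] += di
        y[j] += dj
        return f(y)

    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = (shifted(i, h, j, h) - shifted(i, h, j, -h)
                       - shifted(i, -h, j, h) + shifted(i, -h, j, -h)) / (4 * h * h)
    return H


def f(p):
    # Hypothetical sample function, chosen only for illustration.
    return p[0] ** 2 * p[1] + p[1] ** 3


H = hessian(f, [1.0, 2.0])
# By hand: [[2y, 2x], [2x, 6y]], which at (1, 2) equals [[4, 2], [2, 12]].
print(H)
```

The off-diagonal entries come out equal (up to rounding), which is the symmetry noted above.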
379.7 Jordan Content of an N-cell
Let $I = [a_1, b_1] \times \cdots \times [a_N, b_N]$ be an N-cell in $\mathbb{R}^N$. Then the Jordan content (denoted $\mu(I)$) of $I$ is defined as
$$\mu(I) := \prod_{j=1}^{N} (b_j - a_j).$$
Version: 1 Owner: vernondalhart Author(s): vernondalhart
379.8 Laplace equation
The scalar form of Laplace's equation is the partial differential equation
$$\nabla^2 f = 0$$
and the vector form is
$$\nabla^2 A = 0,$$
where $\nabla^2$ is the Laplacian. It is a special case of the Helmholtz differential equation with $k = 0$.
A function $f$ which satisfies Laplace's equation is said to be harmonic. Since Laplace's equation is linear, the superposition of any two solutions is also a solution.
Version: 3 Owner: giri Author(s): giri
379.9 chain rule (several variables)
The chain rule is a theorem of analysis that governs derivatives of composed functions. The basic theorem is the chain rule for functions of one variable (see here). This entry is devoted to the more general version involving functions of several variables and partial derivatives. Note: the symbol $D_k$ will be used to denote the partial derivative with respect to the kth variable.
Let $F(x_1, \ldots, x_n)$ and $G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m)$ be differentiable functions of several variables, and let
$$H(x_1, \ldots, x_m) = F(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m))$$
be the function determined by the composition of $F$ with $G_1, \ldots, G_n$. The partial derivatives of $H$ are given by
$$(D_k H)(x_1, \ldots, x_m) = \sum_{i=1}^{n} (D_i F)(G_1(x_1, \ldots, x_m), \ldots, G_n(x_1, \ldots, x_m))\,(D_k G_i)(x_1, \ldots, x_m).$$
The chain rule can be more compactly (albeit less precisely) expressed in terms of the Jacobi-Legendre partial derivative symbols (historical note). Just as in the Leibniz system, the basic idea is that of one quantity (i.e. variable) depending on one or more other quantities. Thus we would speak about a variable $z$ that depends differentiably on $y_1, \ldots, y_n$, which in turn depend differentiably on variables $x_1, \ldots, x_m$. We would then write the chain rule as
$$\frac{\partial z}{\partial x_j} = \sum_{i=1}^{n} \frac{\partial z}{\partial y_i} \frac{\partial y_i}{\partial x_j}, \qquad j = 1, \ldots, m.$$
The most general, and conceptually clear, approach to the multivariable chain rule is based on the notion of a differentiable mapping, with the Jacobian matrix of partial derivatives playing the role of generalized derivative. Let $X \subset \mathbb{R}^m$ and $Y \subset \mathbb{R}^n$ be open domains and let
$$F : Y \to \mathbb{R}^l, \qquad G : X \to Y$$
be differentiable mappings. In essence, the symbol $F$ represents $l$ functions of $n$ variables each:
$$F = (F_1, \ldots, F_l), \qquad F_i = F_i(x_1, \ldots, x_n),$$
whereas $G = (G_1, \ldots, G_n)$ represents $n$ functions of $m$ variables each. The derivative of such mappings is no longer a function, but rather a matrix of partial derivatives, customarily called the Jacobian matrix. Thus
$$DF = \begin{pmatrix} D_1 F_1 & \cdots & D_n F_1 \\ \vdots & \ddots & \vdots \\ D_1 F_l & \cdots & D_n F_l \end{pmatrix}, \qquad DG = \begin{pmatrix} D_1 G_1 & \cdots & D_m G_1 \\ \vdots & \ddots & \vdots \\ D_1 G_n & \cdots & D_m G_n \end{pmatrix}.$$
The chain rule now takes the same form as it did for functions of one variable:
$$D(F \circ G) = ((DF) \circ G)\,(DG),$$
albeit with matrix multiplication taking the place of ordinary multiplication.
This form of the chain rule also generalizes quite nicely to the even more general setting where one is interested in describing the derivative of a composition of mappings between manifolds.
Version: 7 Owner: rmilson Author(s): rmilson
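The matrix form of the chain rule can be checked on a small example. In the sketch below the maps `F` and `G`, their hand-computed Jacobians `DF` and `DG`, and the helper `matmul` are all hypothetical choices for illustration. The product ((DF) o G)(DG) is compared with a finite-difference Jacobian of the composition.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def G(s, t):
    return (s * t, s + t)


def F(u, v):
    return (u * u * v, u + v)


def DG(s, t):
    return [[t, s], [1.0, 1.0]]


def DF(u, v):
    return [[2 * u * v, u * u], [1.0, 1.0]]


s, t = 1.0, 2.0
chain = matmul(DF(*G(s, t)), DG(s, t))   # ((DF) o G)(DG) evaluated at (s, t)


def H(s, t):
    return F(*G(s, t))


h = 1e-6
# Finite-difference Jacobian of the composition H = F o G, row by row.
numeric = [[(H(s + h, t)[i] - H(s - h, t)[i]) / (2 * h),
            (H(s, t + h)[i] - H(s, t - h)[i]) / (2 * h)] for i in range(2)]
print(chain)
print(numeric)
```

Note that `DF` is evaluated at `G(s, t)`, not at `(s, t)`; that is the role of the composition `(DF) o G` in the formula.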
379.10 divergence
Basic Definition. Let $x, y, z$ be a system of Cartesian coordinates on 3-dimensional Euclidean space, and let $\mathbf{i}, \mathbf{j}, \mathbf{k}$ be the corresponding basis of unit vectors. The divergence of a continuously differentiable vector field
$$F = F^1 \mathbf{i} + F^2 \mathbf{j} + F^3 \mathbf{k}$$
is defined to be the function
$$\operatorname{div} F = \frac{\partial F^1}{\partial x} + \frac{\partial F^2}{\partial y} + \frac{\partial F^3}{\partial z}.$$
Another common notation for the divergence is $\nabla \cdot F$ (see gradient), a convenient mnemonic.
Physical interpretation. In physical terms, the divergence of a vector field is the extent to which the vector field flow behaves like a source or a sink at a given point. Indeed, an alternative, but logically equivalent definition, gives the divergence as the limiting ratio of the net flow of the vector field across the surface of a small sphere to the volume of the sphere. To wit,
$$(\operatorname{div} F)(p) = \lim_{r \to 0} \frac{\int_S (F \cdot N)\, dS}{\tfrac{4}{3}\pi r^3},$$
where $S$ denotes the sphere of radius $r$ about a point $p \in \mathbb{R}^3$, and the integral is a surface integral taken with respect to $N$, the outward normal to that sphere.
The non-infinitesimal interpretation of divergence is given by Gauss's Theorem. This theorem is a conservation law, stating that the volume total of all sinks and sources, i.e. the volume integral of the divergence, is equal to the net flow across the volume's boundary. In symbols,
$$\int_V \operatorname{div} F\, dV = \int_S (F \cdot N)\, dS,$$
where $V \subset \mathbb{R}^3$ is a compact region with a smooth boundary, and $S = \partial V$ is that boundary oriented by outward-pointing normals. We note that Gauss's theorem follows from the more general Stokes' theorem, which itself generalizes the fundamental theorem of calculus.
In light of the physical interpretation, a vector field with constant zero divergence is called incompressible; in this case, no net flow can occur across any closed surface.
General definition. The notion of divergence has meaning in the more general setting of Riemannian geometry. To that end, let $V$ be a vector field on a Riemannian manifold. The covariant derivative of $V$ is a type (1, 1) tensor field. We define the divergence of $V$ to be the trace of that field. In terms of coordinates (see tensor and Einstein summation convention), we have
$$\operatorname{div} V = V^i{}_{;i}.$$
Version: 6 Owner: rmilson Author(s): rmilson, jaswenso
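For the Cartesian definition, a short finite-difference sketch shows the sum of the three diagonal partial derivatives in action. The helper `divergence` and both sample fields are assumptions made for this example.

```python
def divergence(F, p, h=1e-5):
    """Central-difference approximation of dF1/dx + dF2/dy + dF3/dz at the point p."""
    total = 0.0
    for i in range(3):
        plus = list(p)
        plus[i] += h
        minus = list(p)
        minus[i] -= h
        # Only the i-th component is differentiated along the i-th axis.
        total += (F(plus)[i] - F(minus)[i]) / (2 * h)
    return total


def sample_field(p):
    # Hypothetical field (x^2, xy, z); by hand its divergence is 3x + 1.
    return (p[0] ** 2, p[0] * p[1], p[2])


def rotation(p):
    # Rigid rotation about the z-axis, a standard divergence-free field.
    return (-p[1], p[0], 0.0)


d1 = divergence(sample_field, (1.0, 2.0, 3.0))
d2 = divergence(rotation, (1.0, 2.0, 3.0))
print(d1, d2)
```

The first field acts as a source at the chosen point (positive divergence), while the rotation field has zero divergence everywhere, consistent with the incompressible case described above.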
379.11 extremum
Extrema are minima and maxima. The singular forms of these words are extremum, minimum, and maximum.
Extrema may be "global" or "local". A global minimum of a function $f$ is the lowest value that $f$ ever achieves. If you imagine the function as a surface, then a global minimum is the lowest point on that surface. Formally, it is said that $f : U \to V$ has a global minimum at $x$ if $\forall u \in U,\ f(x) \le f(u)$.
A local minimum of a function $f$ is a point $x$ at which $f$ takes a value no greater than at all points "next to" it. If you imagine the function as a surface, then a local minimum is the bottom of a "valley" or "bowl" in the surface somewhere. Formally, it is said that $f : U \to V$ has a local minimum at $x$ if $\exists$ a neighborhood $N$ of $x$ such that $\forall y \in N,\ f(x) \le f(y)$.
If you flip the $\le$ signs above to $\ge$, you get the definitions of global and local maxima.
A "strict local minimum" or "strict local maximum" means that the values at nearby points are strictly greater than or strictly less than the value at the point in question, respectively, rather than $\ge$ or $\le$. For instance, a strict local minimum at $x$ has a neighborhood $N$ such that $\forall y \in N,\ (f(x) < f(y) \text{ or } y = x)$.
Related concepts are plateau and saddle point.
Finding minima or maxima is an important task which is part of the field of optimization.
Version: 9 Owner: bshanks Author(s): bshanks, bbukh
379.12 irrotational field
Suppose $\Omega$ is an open set in $\mathbb{R}^3$, and $V$ is a vector field with differentiable real (or possibly complex) valued component functions. If $\nabla \times V = 0$, then $V$ is called an irrotational vector field, or curl free field. If $U$ and $V$ are irrotational, then $U \times V$ is solenoidal.
Version: 6 Owner: matte Author(s): matte, giri
379.13 partial derivative
The partial derivative of a multivariable function $f$ is simply its derivative with respect to only one variable, keeping all other variables (which are not functions of the variable in question) constant. The formal definition is
$$D_i f(a) = \frac{\partial f}{\partial a_i} = \lim_{h \to 0} \frac{1}{h}\left( f\begin{pmatrix} a_1 \\ \vdots \\ a_i + h \\ \vdots \\ a_n \end{pmatrix} - f(a) \right) = \lim_{h \to 0} \frac{f(a + h e_i) - f(a)}{h},$$
where $e_i$ is the standard basis vector of the ith variable. Since this only affects the ith variable, one can differentiate the function using common rules and tables, treating all other variables (which are not functions of $a_i$) as constants. For example, if $f(x, y, z) = x^2 + 2xy + y^2 + y^3 z$, then
(1) $\dfrac{\partial f}{\partial x} = 2x + 2y$
(2) $\dfrac{\partial f}{\partial y} = 2x + 2y + 3y^2 z$
(3) $\dfrac{\partial f}{\partial z} = y^3$
Note that in equation (1), we treated $y$ as a constant, since we were differentiating with respect to $x$ (recall that $\frac{d(c \cdot x)}{dx} = c$). The partial derivative of a vector-valued function $f(x)$ with respect to variable $a_i$ is a vector $\overrightarrow{D_i f} = \frac{\partial f}{\partial a_i}$.
Multiple Partials: Multiple partial derivatives can be treated just like multiple derivatives. There is an additional degree of freedom though, as you can compound derivatives with respect to different variables. For example, using the above function,
(4) $\dfrac{\partial^2 f}{\partial x^2} = \dfrac{\partial}{\partial x}(2x + 2y) = 2$
(5) $\dfrac{\partial^2 f}{\partial z \partial y} = \dfrac{\partial}{\partial z}(2x + 2y + 3y^2 z) = 3y^2$
(6) $\dfrac{\partial^2 f}{\partial y \partial z} = \dfrac{\partial}{\partial y}(y^3) = 3y^2$
$D_{12}$ is another way of writing $\frac{\partial^2}{\partial x_1 \partial x_2}$. If the second partial derivatives of $f$ are continuous in a neighborhood of $x$, it can be shown that $D_{ij} f(x) = D_{ji} f(x)$ where $i, j$ are the ith and jth variables. In fact, as long as an equal number of partials are taken with respect to each variable, changing the order of differentiation will produce the same results under the above condition.
Another form of notation is $f^{(a,b,c,\ldots)}(x)$ where $a$ is the number of times the partial derivative is taken with respect to the first variable, $b$ is the number of times it is taken with respect to the second variable, etc.
Version: 17 Owner: slider142 Author(s): slider142
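The first-order partial derivatives of the example function can be confirmed with a finite-difference sketch. The helper `partial`, its step size, and the evaluation point are assumptions made for this illustration.

```python
def f(x, y, z):
    return x ** 2 + 2 * x * y + y ** 2 + y ** 3 * z


def partial(g, i, point, h=1e-6):
    """Central-difference approximation of D_i g at point (i is 0-based)."""
    plus = list(point)
    plus[i] += h
    minus = list(point)
    minus[i] -= h
    # All coordinates other than the i-th are held constant.
    return (g(*plus) - g(*minus)) / (2 * h)


x, y, z = 1.0, 2.0, 3.0
numeric = [partial(f, i, (x, y, z)) for i in range(3)]
# Closed forms from the worked example: df/dx, df/dy, df/dz.
by_hand = [2 * x + 2 * y, 2 * x + 2 * y + 3 * y ** 2 * z, y ** 3]
print(numeric)
print(by_hand)
```

Holding the other coordinates fixed while perturbing one of them is exactly the "treat the other variables as constants" rule used in the hand computation.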
379.14 plateau
A plateau of a function is a region where a function has constant value.
More formally, let $U$ and $V$ be topological spaces. A plateau for a scalar function $f : U \to V$ is a path-connected set of points $P \subseteq U$ such that for some $y$ we have
$$\forall p \in P,\ f(p) = y. \qquad (379.14.1)$$
Please take note that this entry is not authoritative. If you know of a more standard deﬁnition of ”plateau”, please contribute it, thank you. Version: 4 Owner: bshanks Author(s): bshanks
379.15 proof of Green's theorem
Consider the region $R$ bounded by the closed curve $P$ in a well-connected space, and let $F(x, y) = (f(x, y), g(x, y))$ be the vector valued function to be integrated. The double integral over $R$ can be split as
$$\iint_R \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) dA = \iint_R \frac{\partial g}{\partial x}\, dA - \iint_R \frac{\partial f}{\partial y}\, dA.$$
The double integrals above can be evaluated separately. Let's look at
$$\iint_R \frac{\partial g}{\partial x}\, dA = \int_a^b \int_{A(y)}^{B(y)} \frac{\partial g}{\partial x}\, dx\, dy.$$
Evaluating the inner integral, we get
$$\int_a^b \left(g(B(y), y) - g(A(y), y)\right) dy = \int_a^b g(B(y), y)\, dy - \int_a^b g(A(y), y)\, dy.$$
By the definition of the line integral, the above expression is the line integral of the function $F_1(x, y) = (0, g(x, y))$ over the path $P = P_1 + P_2$, where $P_2 = (B(y), y)$ is traversed with $y$ increasing from $a$ to $b$ and $P_1 = (A(y), y)$ is traversed with $y$ decreasing from $b$ to $a$ (the counterclockwise orientation):
$$\int_a^b g(B(y), y)\, dy - \int_a^b g(A(y), y)\, dy = \int_{P_2} F_1 \cdot dt + \int_{P_1} F_1 \cdot dt = \oint_P F_1 \cdot dt.$$
Thus we have
$$\iint_R \frac{\partial g}{\partial x}\, dA = \oint_P F_1 \cdot dt.$$
By a similar argument, we can show that
$$\iint_R \frac{\partial f}{\partial y}\, dA = -\oint_P F_2 \cdot dt,$$
where $F_2 = (f(x, y), 0)$. Putting all of the above together, we can see that
$$\iint_R \left(\frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}\right) dA = \oint_P F_1 \cdot dt + \oint_P F_2 \cdot dt = \oint_P (F_1 + F_2) \cdot dt = \oint_P (f(x, y), g(x, y)) \cdot dt,$$
which is Green's theorem.
Version: 7 Owner: slider142 Author(s): slider142
379.16 relations between Hessian matrix and local extrema
Let $x$ be a vector, and let $H(x)$ be the Hessian for $f$ at a point $x$. Let a neighborhood of $x$ be in the domain of $f$, and let $f$ have continuous partial derivatives of first and second order. Let $\nabla f(x) = 0$.
If $H(x)$ is positive definite, then $x$ is a strict local minimum for $f$. If $x$ is a local minimum for $f$, then $H(x)$ is positive semidefinite.
If $H(x)$ is negative definite, then $x$ is a strict local maximum for $f$. If $x$ is a local maximum for $f$, then $H(x)$ is negative semidefinite.
If $H(x)$ is indefinite, $x$ is a nondegenerate saddle point.
In the case when the dimension of $x$ is 1 (i.e. $f : \mathbb{R} \to \mathbb{R}$), this reduces to the Second Derivative Test, which is as follows: Let a neighborhood of $x$ be in the domain of $f$, and let $f$ have continuous derivatives of first and second order. Let $f'(x) = 0$. If $f''(x) > 0$, then $x$ is a strict local minimum. If $f''(x) < 0$, then $x$ is a strict local maximum.
Version: 6 Owner: bshanks Author(s): bshanks
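For two variables the definiteness conditions can be read off from the determinant and the top-left entry of the Hessian (Sylvester's criterion). The sketch below is restricted to symmetric 2x2 matrices and uses a hypothetical helper `classify`; it is an illustration of the test, not a general-purpose routine.

```python
def classify(H):
    """Classify a critical point from a symmetric 2x2 Hessian [[a, b], [b, c]]."""
    a, b, c = H[0][0], H[0][1], H[1][1]
    det = a * c - b * b          # product of the two eigenvalues
    if det > 0:
        # Both eigenvalues share the sign of a: definite in one direction or the other.
        return "strict local minimum" if a > 0 else "strict local maximum"
    if det < 0:
        return "saddle point"    # eigenvalues of opposite sign: indefinite
    return "inconclusive"        # a zero eigenvalue: at best semidefinite


print(classify([[2, 0], [0, 2]]))     # Hessian of x^2 + y^2 at the origin
print(classify([[-2, 0], [0, -2]]))   # Hessian of -x^2 - y^2 at the origin
print(classify([[2, 0], [0, -2]]))    # Hessian of x^2 - y^2 at the origin
print(classify([[0, 0], [0, 2]]))     # Hessian of x^4 + y^2 at the origin
```

The last case shows why the statements for semidefinite Hessians only run in one direction: x^4 + y^2 does have a minimum at the origin, but the Hessian alone cannot detect it.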
379.17 solenoidal field
A solenoidal vector field is one that satisfies
$$\nabla \cdot B = 0$$
at every point where the vector field $B$ is defined. Here $\nabla \cdot B$ is the divergence.
This condition actually implies that there exists a vector $A$, known as the vector potential, such that
$$B = \nabla \times A.$$
For a function $f$ satisfying Laplace's equation
$$\nabla^2 f = 0,$$
it follows that $\nabla f$ is solenoidal.
Version: 4 Owner: giri Author(s): giri
Chapter 380 26B15 – Integration: length, area, volume
380.1 arc length
Arc length is the length of a section of a differentiable curve. Finding arc length is useful in many applications, for the length of a curve can be attributed to distance traveled, work, etc. It is commonly represented as $S$, or by the differential $ds$ if one is differentiating or integrating with respect to change in arc length.
If one knows the vector function or parametric equations of a curve, finding the arc length is simple, as it can be given by the integral of the lengths of the tangent vectors to the curve, or
$$\int_a^b \|F'(t)\|\, dt = S.$$
Note that $t$ is an independent parameter. In Cartesian coordinates, arc length can be calculated by the formula
$$S = \int_a^b \sqrt{1 + (f'(x))^2}\, dx.$$
This formula is derived by viewing arc length as the Riemann sum
$$\lim_{\Delta x \to 0} \sum_{i=1}^{n} \sqrt{1 + (f'(x_i))^2}\, \Delta x.$$
The term being summed is the length of an approximating secant to the curve over the distance $\Delta x$. As $\Delta x$ vanishes, the sum approaches the arc length, hence the formula. Arc length can also be derived for polar coordinates from the general formula for vector functions given above. The result is
$$L = \int_a^b \sqrt{r(\theta)^2 + (r'(\theta))^2}\, d\theta.$$
Version: 5 Owner: slider142 Author(s): slider142
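The secant-sum picture behind the Cartesian formula can be tried directly. The sketch below sums the lengths of short secant segments of a graph; the helper name `arc_length`, the number of segments, and the choice of cosh as test curve are assumptions of this example.

```python
import math


def arc_length(f, a, b, n=10000):
    """Sum the lengths of n secant segments of the graph y = f(x) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x0, x1 = a + i * dx, a + (i + 1) * dx
        # Length of the secant joining (x0, f(x0)) and (x1, f(x1)).
        total += math.hypot(dx, f(x1) - f(x0))
    return total


# For f = cosh we have sqrt(1 + f'(x)^2) = cosh(x), so the integral formula
# gives the exact length sinh(1) on [0, 1].
approx = arc_length(math.cosh, 0.0, 1.0)
print(approx, math.sinh(1.0))
```

As the segments shrink, the secant sum approaches the value of the integral formula.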
Chapter 381 26B20 – Integral formulas (Stokes, Gauss, Green, etc.)
381.1 Green’s theorem
Green's theorem provides a connection between path integrals around the boundary of a well-connected region in the plane and double integrals over that region. Given a closed path $P$ bounding a region $R$ with area $A$, and a vector-valued function $F = (f(x, y), g(x, y))$ over the plane,
$$\oint_P F \cdot dx = \iint_R [g_1(x, y) - f_2(x, y)]\, dA,$$
where $a_n$ is the derivative of $a$ with respect to the nth variable.
Corollary: The closed path integral over a gradient of a function with continuous partial derivatives is always zero. Thus, gradients are conservative vector ﬁelds. The smooth function is called the potential of the vector ﬁeld.
Proof: The corollary states that
$$\oint_P \nabla h \cdot dx = 0.$$
We can easily prove this using Green's theorem:
$$\oint_P \nabla h \cdot dx = \iint_R [g_1(x, y) - f_2(x, y)]\, dA.$$
But since this is a gradient, $(f, g) = (h_1, h_2)$, and so
$$\iint_R [g_1(x, y) - f_2(x, y)]\, dA = \iint_R [h_{21}(x, y) - h_{12}(x, y)]\, dA.$$
Since $h_{12} = h_{21}$ for any function with continuous partials, the corollary is proven.
Version: 4 Owner: slider142 Author(s): slider142
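Both the theorem and the corollary can be spot-checked on the unit circle. The sketch below approximates closed line integrals with a midpoint rule; the helper `line_integral`, the parametrization, and the two sample fields are assumptions made for this illustration.

```python
import math


def line_integral(F, path, dpath, n=2000):
    """Midpoint-rule approximation of the closed line integral of F . dx,
    where path(t) and its derivative dpath(t) are given for t in [0, 2*pi]."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = path(t)
        dx, dy = dpath(t)
        fx, fy = F(x, y)
        total += (fx * dx + fy * dy) * dt
    return total


def circle(t):
    return (math.cos(t), math.sin(t))


def dcircle(t):
    return (-math.sin(t), math.cos(t))


# F = (f, g) = (-y, x): g_1 - f_2 = 2, so the double integral over the
# unit disc is 2 * area = 2 * pi.
lhs = line_integral(lambda x, y: (-y, x), circle, dcircle)
# Gradient field of h = x^2 * y: the closed line integral should vanish.
grad = line_integral(lambda x, y: (2 * x * y, x * x), circle, dcircle)
print(lhs, 2 * math.pi, grad)
```

The first integral reproduces twice the enclosed area, and the second is zero up to rounding, as the corollary predicts for a gradient field.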
Chapter 382 26B25 – Convexity, generalizations
382.1 convex function
Definition. Suppose $\Omega$ is a convex set in a vector space over $\mathbb{R}$ (or $\mathbb{C}$), and suppose $f$ is a function $f : \Omega \to \mathbb{R}$. If for any $a, b \in \Omega$ and any $\lambda \in (0, 1)$, we have
$$f(\lambda a + (1 - \lambda) b) \le \lambda f(a) + (1 - \lambda) f(b),$$
we say that $f$ is a convex function. If for any $a, b \in \Omega$ and any $\lambda \in (0, 1)$, we have
$$f(\lambda a + (1 - \lambda) b) \ge \lambda f(a) + (1 - \lambda) f(b),$$
we say that $f$ is a concave function. If the corresponding inequality is strict whenever $a \neq b$, then we say that $f$ is a strictly convex function, or a strictly concave function, respectively.
Properties
• A function $f$ is a (strictly) convex function if and only if $-f$ is a (strictly) concave function.
• On $\mathbb{R}$, a continuous function is convex if and only if for all $x, y \in \mathbb{R}$, we have
$$f\left(\frac{x + y}{2}\right) \le \frac{f(x) + f(y)}{2}.$$
• A twice continuously differentiable function on $\mathbb{R}$ is convex if and only if $f''(x) \ge 0$ for all $x \in \mathbb{R}$.
• A local minimum of a convex function is a global minimum. See this page.
Examples
• $e^x$, $e^{-x}$, and $x^2$ are convex functions on $\mathbb{R}$.
• A norm is a convex function.
• On $\mathbb{R}^2$, the 1-norm and the ∞-norm (i.e., $\|(x, y)\|_1 = |x| + |y|$ and $\|(x, y)\|_\infty = \max\{|x|, |y|\}$) are not strictly convex ([2], pp. 334-335).
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons, 1978.
Version: 11 Owner: matte Author(s): matte, drini
382.2 extremal value of convex/concave functions
Theorem. Let $U$ be a convex set in a normed (real or complex) vector space. If $f : U \to \mathbb{R}$ is a convex function on $U$, then a local minimum of $f$ is a global minimum.
Proof. Suppose $x$ is a local minimum for $f$, i.e., there is an open ball $B \subset U$ with radius $\epsilon$ and center $x$ such that $f(x) \le f(\xi)$ for all $\xi \in B$. Let us fix some $y \notin B$. Our aim is to prove that $f(x) \le f(y)$. We define $\lambda = \frac{\epsilon}{2\|x - y\|}$, where $\|\cdot\|$ is the norm on $U$. Then
$$\|\lambda y + (1 - \lambda)x - x\| = \|\lambda y - \lambda x\| = \lambda\|x - y\| = \frac{\epsilon}{2},$$
so $\lambda y + (1 - \lambda)x \in B$. It follows that $f(x) \le f(\lambda y + (1 - \lambda)x)$. Since $f$ is convex, we then get
$$f(x) \le f(\lambda y + (1 - \lambda)x) \le \lambda f(y) + (1 - \lambda) f(x),$$
and $f(x) \le f(y)$ as claimed. □
The analogous theorem for concave functions is as follows.
Theorem. Let $U$ be a convex set in a normed (real or complex) vector space. If $f : U \to \mathbb{R}$ is a concave function on $U$, then a local maximum of $f$ is a global maximum.
Proof. Consider the convex function $-f$. If $x$ is a local maximum of $f$, then it is a local minimum of $-f$. By the previous theorem, $x$ is then a global minimum of $-f$. Hence $x$ is a global maximum of $f$. □
Version: 1 Owner: matte Author(s): matte
Chapter 383 26B30 – Absolutely continuous functions, functions of bounded variation
383.1 absolutely continuous function
Definition [1, 1] Let $[a, b]$ be a closed bounded interval of $\mathbb{R}$. Then a function $f : [a, b] \to \mathbb{C}$ is absolutely continuous on $[a, b]$, if for any $\varepsilon > 0$, there is a $\delta > 0$ such that the following condition holds:
(∗) If $(a_1, b_1), \ldots, (a_n, b_n)$ is a finite collection of disjoint open intervals in $[a, b]$ such that
$$\sum_{i=1}^{n} (b_i - a_i) < \delta,$$
then
$$\sum_{i=1}^{n} |f(b_i) - f(a_i)| < \varepsilon.$$
Basic results for absolutely continuous functions are as follows.
Theorem
1. A function $f : [a, b] \to \mathbb{C}$ is absolutely continuous if and only if $\operatorname{Re}\{f\}$ and $\operatorname{Im}\{f\}$ are absolutely continuous real functions.
2. If $f : [a, b] \to \mathbb{C}$ is a function which is everywhere differentiable and $f'$ is bounded, then $f$ is absolutely continuous [1].
3. Any absolutely continuous function $f : [a, b] \to \mathbb{C}$ is continuous on $[a, b]$ and is of bounded variation [1, 1].
4. If $f, g$ are absolutely continuous functions, then so are $fg$, $f + g$, $|f|^\gamma$ (if $\gamma \ge 1$), and $f/g$ (if $g$ is never zero) [1].
5. If $f, g$ are real valued absolutely continuous functions, then so are $\max\{f, g\}$ and $\min\{f, g\}$. If $f(x) > 0$ for all $x$ and $\gamma \in \mathbb{R}$, then $f^\gamma$ is absolutely continuous [1].
Property (2), which is readily proven using the mean value theorem, implies that any smooth function with compact support on $\mathbb{R}$ is absolutely continuous. By property (3), any absolutely continuous function is of bounded variation. Hence, from properties of functions of bounded variation, the following theorem follows:
Theorem ([1], pp. 536) Let $f : [a, b] \to \mathbb{C}$ be an absolutely continuous function. Then $f$ is differentiable almost everywhere, and $|f'|$ is Lebesgue integrable.
We have the following characterization of absolutely continuous functions.
Theorem [Fundamental theorem of calculus for the Lebesgue integral] ([1], pp. 550, [1]) Let $f : [a, b] \to \mathbb{C}$ be a function. Then $f$ is absolutely continuous if and only if there is a function $g \in L^1(a, b)$ (i.e. a $g : (a, b) \to \mathbb{C}$ with $\int_{(a,b)} |g| < \infty$), such that
$$f(x) = f(a) + \int_a^x g(t)\, dt$$
for all $x \in [a, b]$. What is more, if $f$ and $g$ are as above, then $f' = g$ almost everywhere. (Above, both integrals are Lebesgue integrals.)
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed., John Wiley & Sons, Inc., 1999.
2. W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill Inc., 1987.
3. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993.
4. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.
Version: 5 Owner: matte Author(s): matte
383.2
total variation
Let $\gamma : [a, b] \to X$ be a function mapping an interval $[a, b]$ to a metric space $(X, d)$. We say that $\gamma$ is of bounded variation if there is a constant $M$ such that, for each partition $P = \{a = t_0 < t_1 < \cdots < t_n = b\}$ of $[a, b]$,
$$v(\gamma, P) = \sum_{k=1}^{n} d(\gamma(t_k), \gamma(t_{k-1})) \le M.$$
The total variation $V_\gamma$ of $\gamma$ is defined by
$$V_\gamma = \sup\{v(\gamma, P) : P \text{ is a partition of } [a, b]\}.$$
It can be shown that, if $X$ is either $\mathbb{R}$ or $\mathbb{C}$, every smooth (or piecewise smooth) function $\gamma : [a, b] \to X$ is of bounded variation, and
$$V_\gamma = \int_a^b |\gamma'(t)|\,dt.$$
Also, if $\gamma$ is of bounded variation and $f : [a, b] \to X$ is continuous, then the Riemann-Stieltjes integral $\int_a^b f\,d\gamma$ is finite. If $\gamma$ is also continuous, it is said to be a rectifiable path, and $V_\gamma$ is the length of its trace. If $X = \mathbb{R}$, it can be shown that $\gamma$ is of bounded variation if and only if it is the difference of two monotonic functions.
Version: 3 Owner: Koro Author(s): Koro
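As a rough numerical illustration of the definition of total variation, the sketch below evaluates $v(\gamma, P)$ for $\gamma = \sin$ on $[0, 2\pi]$ over uniform partitions. The function names and the choice of $\gamma$ are assumptions made for this example only; for this smooth $\gamma$ the formula $V_\gamma = \int_a^b |\gamma'(t)|\,dt$ gives 4.

```python
import math

def variation(gamma, partition):
    """Sum of |gamma(t_k) - gamma(t_{k-1})| over consecutive partition points."""
    return sum(abs(gamma(t1) - gamma(t0))
               for t0, t1 in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """n equal subintervals of [a, b], returned as n + 1 points."""
    return [a + (b - a) * k / n for k in range(n + 1)]

# Illustrative choice (not from the text): gamma = sin on [0, 2*pi].
coarse = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 3))
fine = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 1000))
```

Every partition gives a lower bound for the supremum, so `coarse` and `fine` both stay below the total variation, and the finer partition comes close to it.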
Chapter 384 26B99 – Miscellaneous
384.1 derivation of zeroth weighted power mean
Let $x_1, x_2, \ldots, x_n$ be positive real numbers, and let $w_1, w_2, \ldots, w_n$ be positive real numbers such that $w_1 + w_2 + \cdots + w_n = 1$. For $r \ne 0$, the $r$th weighted power mean of $x_1, x_2, \ldots, x_n$ is
$$M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r}.$$
Using the Taylor series expansion $e^t = 1 + t + O(t^2)$, where $O(t^2)$ is Landau notation for terms of order $t^2$ and higher, we can write $x_i^r$ as
$$x_i^r = e^{r \log x_i} = 1 + r \log x_i + O(r^2).$$
By substituting this into the definition of $M_w^r$, we get
$$\begin{aligned}
M_w^r(x_1, x_2, \ldots, x_n) &= \left[ w_1(1 + r \log x_1) + \cdots + w_n(1 + r \log x_n) + O(r^2) \right]^{1/r} \\
&= \left[ 1 + r(w_1 \log x_1 + \cdots + w_n \log x_n) + O(r^2) \right]^{1/r} \\
&= \left[ 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right]^{1/r} \\
&= \exp\left\{ \frac{1}{r} \log\left[ 1 + r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right] \right\}.
\end{aligned}$$
Again using a Taylor series, this time $\log(1 + t) = t + O(t^2)$, we get
$$\begin{aligned}
M_w^r(x_1, x_2, \ldots, x_n) &= \exp\left\{ \frac{1}{r} \left[ r \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r^2) \right] \right\} \\
&= \exp\left[ \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) + O(r) \right].
\end{aligned}$$
Taking the limit $r \to 0$, we find
$$M_w^0(x_1, x_2, \ldots, x_n) = \exp\left[ \log(x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}) \right] = x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}.$$
In particular, if we choose all the weights to be $\frac{1}{n}$,
$$M^0(x_1, x_2, \ldots, x_n) = \sqrt[n]{x_1 x_2 \cdots x_n},$$
the geometric mean of $x_1, x_2, \ldots, x_n$.
Version: 3 Owner: pbruin Author(s): pbruin
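The limit derived above can be checked numerically. The following is a minimal sketch, assuming sample data and a helper name `weighted_power_mean` that are not part of the original entry; it compares $M_w^r$ for small $r$ with the weighted geometric mean.

```python
import math

def weighted_power_mean(xs, ws, r):
    """r-th weighted power mean; r == 0 returns the weighted geometric mean."""
    if r == 0:
        return math.exp(sum(w * math.log(x) for x, w in zip(xs, ws)))
    return sum(w * x ** r for x, w in zip(xs, ws)) ** (1.0 / r)

xs = [1.0, 4.0, 9.0]      # sample data, chosen only for illustration
ws = [0.2, 0.3, 0.5]      # positive weights summing to 1
limit = weighted_power_mean(xs, ws, 0)
gaps = [abs(weighted_power_mean(xs, ws, r) - limit) for r in (1e-1, 1e-2, 1e-3)]
```

The entries of `gaps` shrink roughly in proportion to $r$, which is what the $O(r)$ term in the exponent suggests.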
384.2
weighted power mean
If $w_1, w_2, \ldots, w_n$ are positive real numbers such that $w_1 + w_2 + \cdots + w_n = 1$, we define the $r$th weighted power mean of the $x_i$ as:
$$M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r}.$$
When all the $w_i = \frac{1}{n}$ we get the standard power mean. The weighted power mean is a continuous function of $r$, and taking the limit when $r \to 0$ gives us
$$M_w^0 = x_1^{w_1} x_2^{w_2} \cdots x_n^{w_n}.$$
We can use weighted power means to generalize the power means inequality: if $w$ is a set of weights, and if $r < s$, then
$$M_w^r \le M_w^s,$$
with equality only when all the $x_i$ are equal.
Version: 6 Owner: drini Author(s): drini
Chapter 385 26C15 – Rational functions
385.1 rational function
A real function $R(x)$ of a single variable $x$ is called rational if it can be written as a quotient
$$R(x) = \frac{P(x)}{Q(x)},$$
where $P(x)$ and $Q(x)$ are polynomials in $x$ with real coefficients. In general, a rational function $R(x_1, \ldots, x_n)$ has the form
$$R(x_1, \ldots, x_n) = \frac{P(x_1, \ldots, x_n)}{Q(x_1, \ldots, x_n)},$$
where $P(x_1, \ldots, x_n)$ and $Q(x_1, \ldots, x_n)$ are polynomials in the variables $(x_1, \ldots, x_n)$ with coefficients in some field or ring $S$. In this sense, $R(x_1, \ldots, x_n)$ can be regarded as an element of the fraction field $S(x_1, \ldots, x_n)$ of the polynomial ring $S[x_1, \ldots, x_n]$.
Version: 1 Owner: igor Author(s): igor
Chapter 386 26C99 – Miscellaneous
386.1 Laguerre Polynomial
A Laguerre polynomial is a polynomial of the form:
$$L_n(x) = \frac{e^x}{n!} \frac{d^n}{dx^n}\left( e^{-x} x^n \right).$$
Associated to this is the Laguerre differential equation, the solutions of which are called associated Laguerre polynomials:
$$L_n^k(x) = \frac{e^x x^{-k}}{n!} \frac{d^n}{dx^n}\left( e^{-x} x^{n+k} \right).$$
Of course $L_n^0(x) = L_n(x)$. The associated Laguerre polynomials are orthogonal over $[0, \infty)$ with respect to the weighting function $x^k e^{-x}$:
$$\int_0^\infty e^{-x} x^k L_n^k(x) L_m^k(x)\,dx = \frac{(n + k)!}{n!}\,\delta_{nm}.$$
Version: 2 Owner: mathwizard Author(s): mathwizard
Chapter 387 26D05 – Inequalities for trigonometric functions and polynomials
387.1 Weierstrass product inequality
For any finite family $(a_i)_{i \in I}$ of real numbers in the interval $[0, 1]$, we have
$$\prod_i (1 - a_i) \ge 1 - \sum_i a_i.$$
Proof: Write
$$f = \prod_i (1 - a_i) + \sum_i a_i.$$
For any $k \in I$, and any fixed values of the $a_i$ for $i \ne k$, $f$ is a polynomial of the first degree in $a_k$. Consequently $f$ is minimal either at $a_k = 0$ or $a_k = 1$. That brings us down to two cases: all the $a_i$ are zero, or at least one of them is 1. But in both cases it is clear that $f \ge 1$, QED.
Version: 2 Owner: Daume Author(s): Larry Hammick
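A quick randomized check of the Weierstrass product inequality can make the statement concrete. The sketch below is only an illustration: the helper name `weierstrass_gap`, the seed, and the sample sizes are assumptions, and passing such a test is of course no substitute for the proof.

```python
import math
import random

def weierstrass_gap(a):
    """prod(1 - a_i) - (1 - sum(a_i)); the inequality says this is >= 0."""
    return math.prod(1 - x for x in a) - (1 - sum(a))

rng = random.Random(0)   # fixed seed so the run is reproducible
samples = [[rng.random() for _ in range(rng.randint(1, 6))] for _ in range(1000)]
worst = min(weierstrass_gap(a) for a in samples)
```

Here `worst` is the smallest gap seen over the samples; it should be nonnegative up to floating-point rounding, and it is zero for families with a single element.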
387.2
proof of Jordan’s Inequality
To prove that
$$\frac{2}{\pi} x \le \sin(x) \le x \quad \forall\, x \in \left[0, \frac{\pi}{2}\right] \qquad (387.2.1)$$
consider a unit circle (circle with radius = 1 unit). Take any point $P$ on the circumference of the circle.
Drop the perpendicular from $P$ to the horizontal line, $M$ being the foot of the perpendicular and $Q$ the reflection of $P$ at $M$ (refer to figure). Let $x = \angle POM$. For $x$ to be in $[0, \frac{\pi}{2}]$, the point $P$ lies in the first quadrant, as shown.
The length of line segment $PM$ is $\sin(x)$. Construct a circle of radius $MP$, with $M$ as the center.
Length of line segment $PQ$ is $2\sin(x)$. Length of arc $PAQ$ is $2x$. Length of arc $PBQ$ is $\pi \sin(x)$.
Since $PQ \le$ length of arc $PAQ$ (equality holds when $x = 0$) we have $2\sin(x) \le 2x$. This implies
$$\sin(x) \le x.$$
Since length of arc $PAQ \le$ length of arc $PBQ$ (equality holds true when $x = 0$ or $x = \frac{\pi}{2}$), we have $2x \le \pi \sin(x)$. This implies
$$\frac{2}{\pi} x \le \sin(x).$$
Thus we have
$$\frac{2}{\pi} x \le \sin(x) \le x \quad \forall\, x \in \left[0, \frac{\pi}{2}\right] \qquad (387.2.2)$$
Version: 12 Owner: giri Author(s): giri
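The two bounds in Jordan’s inequality can also be checked on a grid of sample points. This is a small sketch with an assumed helper name `jordan_holds` and an assumed tolerance for rounding; it also shows that the lower bound fails outside $[0, \frac{\pi}{2}]$.

```python
import math

def jordan_holds(x, tol=1e-12):
    """Check 2x/pi <= sin(x) <= x for a single x, up to a rounding tolerance."""
    return 2 * x / math.pi <= math.sin(x) + tol and math.sin(x) <= x + tol

grid = [k * (math.pi / 2) / 1000 for k in range(1001)]   # 1001 points in [0, pi/2]
all_ok = all(jordan_holds(x) for x in grid)
```

For instance, `jordan_holds(2.0)` is false, because $2 > \frac{\pi}{2}$ and the chord-versus-arc comparison in the proof no longer applies.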
Chapter 388 26D10 – Inequalities involving derivatives and diﬀerential and integral operators
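388.1
Gronwall’s lemma
If, for $t_0 \le t \le t_1$, $\varphi(t) \ge 0$ and $\psi(t) \ge 0$ are continuous functions such that the inequality
$$\varphi(t) \le K + L \int_{t_0}^{t} \psi(s)\varphi(s)\,ds$$
holds on $t_0 \le t \le t_1$, with $K$ and $L$ positive constants, then
$$\varphi(t) \le K \exp\left( L \int_{t_0}^{t} \psi(s)\,ds \right)$$
on $t_0 \le t \le t_1$.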
Version: 1 Owner: jarino Author(s): jarino
388.2
proof of Gronwall’s lemma
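The inequality
$$\varphi(t) \le K + L \int_{t_0}^{t} \psi(s)\varphi(s)\,ds \qquad (388.2.1)$$
is equivalent to
$$\frac{\varphi(t)}{K + L \int_{t_0}^{t} \psi(s)\varphi(s)\,ds} \le 1.$$
Multiply by $L\psi(t)$ and integrate, giving
$$\int_{t_0}^{t} \frac{L\psi(s)\varphi(s)\,ds}{K + L \int_{t_0}^{s} \psi(\tau)\varphi(\tau)\,d\tau} \le L \int_{t_0}^{t} \psi(s)\,ds.$$
Thus
$$\ln\left( K + L \int_{t_0}^{t} \psi(s)\varphi(s)\,ds \right) - \ln K \le L \int_{t_0}^{t} \psi(s)\,ds$$
and finally
$$K + L \int_{t_0}^{t} \psi(s)\varphi(s)\,ds \le K \exp\left( L \int_{t_0}^{t} \psi(s)\,ds \right).$$
Using (388.2.1) in the left hand side of this inequality gives the result.
Version: 2 Owner: jarino Author(s): jarino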
Chapter 389 26D15 – Inequalities for sums, series and integrals
389.1 Carleman’s inequality
Theorem ([4], pp. 24) For positive real numbers $\{a_n\}_{n=1}^{\infty}$, Carleman’s inequality states that
$$\sum_{n=1}^{\infty} (a_1 a_2 \cdots a_n)^{1/n} \le e \sum_{n=1}^{\infty} a_n.$$
Although the constant e (the natural log base) is optimal, it is possible to reﬁne Carleman’s inequality by decreasing the weight coeﬃcients on the right hand side [2].
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution theory and Fourier Analysis), 2nd ed., Springer-Verlag, 1990.
2. B.Q. Yuan, Refinements of Carleman’s inequality, Journal of Inequalities in Pure and Applied Mathematics, Vol. 2, Issue 2, 2001, Article 21.
Version: 2 Owner: matte Author(s): matte
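To get a feeling for Carleman’s inequality, one can compare both sides for a truncated sequence. The sketch below is an illustration under assumptions not made in the entry: it truncates to finitely many terms, uses the sample sequence $a_n = 1/n^2$, and accumulates logarithms so that the product $a_1 a_2 \cdots a_n$ does not underflow.

```python
import math

def carleman_sides(a):
    """Return (sum of geometric means of the prefixes of a, e * sum(a)) for a finite positive list a."""
    log_prod = 0.0
    lhs = 0.0
    for n, a_n in enumerate(a, start=1):
        log_prod += math.log(a_n)          # log(a_1 * ... * a_n)
        lhs += math.exp(log_prod / n)      # (a_1 * ... * a_n) ** (1/n)
    return lhs, math.e * sum(a)

lhs, rhs = carleman_sides([1.0 / n ** 2 for n in range(1, 2001)])
```

For this sample sequence `lhs` stays well below `rhs`.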
389.2
Chebyshev’s inequality
If $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ are two sequences (at least one of them consisting of positive numbers):
• if $x_1 < x_2 < \cdots < x_n$ and $y_1 < y_2 < \cdots < y_n$ then
$$\left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \le \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}.$$
• if $x_1 < x_2 < \cdots < x_n$ and $y_1 > y_2 > \cdots > y_n$ then
$$\left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \ge \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}.$$
Version: 1 Owner: drini Author(s): drini
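Both cases of Chebyshev’s inequality can be seen on a small example. In the sketch below the data and the helper name `chebyshev_sides` are illustrative assumptions; the same increasing $x$ sequence is paired once with an increasing and once with a decreasing $y$ sequence.

```python
def chebyshev_sides(xs, ys):
    """Return (product of the two averages, average of the products)."""
    n = len(xs)
    return (sum(xs) / n) * (sum(ys) / n), sum(x * y for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 5.0, 7.0]                                          # increasing, illustrative values
same_lhs, same_rhs = chebyshev_sides(xs, [2.0, 3.0, 4.0, 10.0])    # ys increasing
opp_lhs, opp_rhs = chebyshev_sides(xs, [10.0, 4.0, 3.0, 2.0])      # ys decreasing
```

With these numbers the product of averages is 17.8125 in both cases, while the average of products is 24.5 for the similarly ordered pair and 11.75 for the oppositely ordered pair.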
389.3
MacLaurin’s Inequality
Let $a_1, a_2, \ldots, a_n$ be positive real numbers, and define the sums $S_k$ as follows:
$$S_k = \frac{\displaystyle\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} a_{i_1} a_{i_2} \cdots a_{i_k}}{\dbinom{n}{k}}.$$
Then the following chain of inequalities is true:
$$S_1 \ge \sqrt{S_2} \ge \sqrt[3]{S_3} \ge \cdots \ge \sqrt[n]{S_n}.$$
Note: the $S_k$ are called the averages of the elementary symmetric sums. This inequality is important because it shows that the Arithmetic-Geometric Mean inequality ($S_1 \ge \sqrt[n]{S_n}$) is a consequence of a chain of stronger inequalities.
Version: 2 Owner: drini Author(s): drini, slash
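The chain in MacLaurin’s inequality is easy to compute directly for a short list. The following sketch assumes a helper name `maclaurin_chain` and sample data that are not from the entry; it builds each $S_k$ from all $k$-element products and returns the $k$th roots.

```python
import math
from itertools import combinations

def maclaurin_chain(a):
    """Return [S_1, S_2**(1/2), ..., S_n**(1/n)] for positive numbers a."""
    n = len(a)
    chain = []
    for k in range(1, n + 1):
        e_k = sum(math.prod(c) for c in combinations(a, k))   # k-th elementary symmetric sum
        chain.append((e_k / math.comb(n, k)) ** (1.0 / k))
    return chain

chain = maclaurin_chain([1.0, 2.0, 3.0, 6.0])
```

The first entry of `chain` is the arithmetic mean and the last is the geometric mean, so a nonincreasing chain contains the arithmetic-geometric mean inequality as its two endpoints.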
389.4
Minkowski inequality
If $p \ge 1$ and $a_k, b_k$ are real numbers for $k = 1, \ldots, n$, then
$$\left( \sum_{k=1}^{n} |a_k + b_k|^p \right)^{1/p} \le \left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p} + \left( \sum_{k=1}^{n} |b_k|^p \right)^{1/p}.$$
The Minkowski inequality is in fact valid for all $L^p$ norms with $p \ge 1$ on arbitrary measure spaces. This covers the case of $\mathbb{R}^n$ listed here as well as spaces of sequences and spaces of functions, and also complex $L^p$ spaces.
Version: 8 Owner: drini Author(s): drini, saforres
389.5
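Muirhead’s theorem
Let $0 \le s_1 \le \cdots \le s_n$ and $0 \le t_1 \le \cdots \le t_n$ be real numbers such that
$$\sum_{i=1}^{n} s_i = \sum_{i=1}^{n} t_i \quad \text{and} \quad \sum_{i=1}^{k} s_i \le \sum_{i=1}^{k} t_i \quad (k = 1, \ldots, n - 1).$$
Then for any nonnegative numbers $x_1, \ldots, x_n$,
$$\sum_{\sigma} x_1^{s_{\sigma(1)}} \cdots x_n^{s_{\sigma(n)}} \ge \sum_{\sigma} x_1^{t_{\sigma(1)}} \cdots x_n^{t_{\sigma(n)}},$$
where the sums run over all permutations $\sigma$ of $\{1, 2, \ldots, n\}$.
Version: 3 Owner: Koro Author(s): Koro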
389.6
Schur’s inequality
If $a$, $b$, and $c$ are positive real numbers and $k \ge 1$ a fixed real constant, then the following inequality holds:
$$a^k (a - b)(a - c) + b^k (b - c)(b - a) + c^k (c - a)(c - b) \ge 0.$$
Taking $k = 1$, we get the well-known
$$a^3 + b^3 + c^3 + 3abc \ge ab(a + b) + ac(a + c) + bc(b + c).$$
We can assume without loss of generality that $c \le b \le a$ via a permutation of the variables (as both sides are symmetric in those variables). Then collecting terms, the lemma states that
$$(a - b)\left( a^k (a - c) - b^k (b - c) \right) + c^k (a - c)(b - c) \ge 0,$$
which is clearly true as every term on the left is nonnegative.
Version: 3 Owner: mathcam Author(s): mathcam, slash
389.7
Young’s inequality
Let $\varphi : \mathbb{R} \to \mathbb{R}$ be a continuous, strictly increasing function such that $\varphi(0) = 0$. Then for all $a, b \ge 0$ the following inequality holds:
$$ab \le \int_0^a \varphi(x)\,dx + \int_0^b \varphi^{-1}(y)\,dy.$$
The inequality is easy to see by drawing the graph of $\varphi(x)$ and observing that the sum of the two areas represented by the integrals above is at least the area of a rectangle of sides $a$ and $b$.
Version: 2 Owner: slash Author(s): slash
389.8
arithmeticgeometricharmonic means inequality
Let $x_1, x_2, \ldots, x_n$ be positive numbers. Then
$$\max\{x_1, x_2, \ldots, x_n\} \ge \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}} \ge \min\{x_1, x_2, \ldots, x_n\}.$$
There are several generalizations to this inequality using power means and weighted power means.
Version: 4 Owner: drini Author(s): drini
389.9
general means inequality
The power means inequality is a generalization of the arithmetic-geometric means inequality. If $0 \ne r \in \mathbb{R}$, the $r$-mean (or $r$th power mean) of the nonnegative numbers $a_1, \ldots, a_n$ is defined as
$$M^r(a_1, a_2, \ldots, a_n) = \left( \frac{1}{n} \sum_{k=1}^{n} a_k^r \right)^{1/r}.$$
Given real numbers $x, y$ such that $xy \ne 0$ and $x < y$, we have
$$M^x \le M^y$$
and the equality holds if and only if $a_1 = \cdots = a_n$. Additionally, if we define $M^0$ to be the geometric mean $(a_1 a_2 \cdots a_n)^{1/n}$, we have that the inequality above holds for arbitrary real numbers $x < y$. The arithmetic-geometric-harmonic means inequality is a special case of this one, since $M^1$ is the arithmetic mean, $M^0$ is the geometric mean and $M^{-1}$ is the harmonic mean.
This inequality can be further generalized using weighted power means.
Version: 3 Owner: drini Author(s): drini
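The monotonicity of the power mean in its exponent can be observed numerically. The sketch below is a minimal illustration; the helper name `power_mean`, the data, and the list of exponents are assumptions made for this example, and positive data is used so that negative exponents are defined.

```python
import math

def power_mean(a, r):
    """r-th power mean of positive numbers; r == 0 is taken to be the geometric mean."""
    n = len(a)
    if r == 0:
        return math.exp(sum(math.log(x) for x in a) / n)
    return (sum(x ** r for x in a) / n) ** (1.0 / r)

a = [1.0, 2.0, 4.0, 8.0]                     # illustrative data, not all equal
exponents = [-3, -1, 0, 0.5, 1, 2, 5]
means = [power_mean(a, r) for r in exponents]
```

The list `means` is strictly increasing here because the data are not all equal; the entries for exponents $-1$, $0$ and $1$ are the harmonic, geometric and arithmetic means.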
389.10
power mean
The $r$th power mean of the numbers $x_1, x_2, \ldots, x_n$ is defined as:
$$M^r(x_1, x_2, \ldots, x_n) = \left( \frac{x_1^r + x_2^r + \cdots + x_n^r}{n} \right)^{1/r}.$$
The arithmetic mean is a special case when $r = 1$. The power mean is a continuous function of $r$, and taking the limit when $r \to 0$ gives us the geometric mean:
$$M^0(x_1, x_2, \ldots, x_n) = \sqrt[n]{x_1 x_2 \cdots x_n}.$$
Also, when $r = -1$ we get
$$M^{-1}(x_1, x_2, \ldots, x_n) = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}},$$
the harmonic mean.
A generalization of power means are weighted power means.
Version: 8 Owner: drini Author(s): drini
389.11
proof of Chebyshev’s inequality
Let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be real numbers such that $x_1 \le x_2 \le \cdots \le x_n$. Write the product $(x_1 + x_2 + \cdots + x_n)(y_1 + y_2 + \cdots + y_n)$ as
$$\begin{aligned}
& (x_1 y_1 + x_2 y_2 + \cdots + x_n y_n) \\
{}+{} & (x_1 y_2 + x_2 y_3 + \cdots + x_{n-1} y_n + x_n y_1) \\
{}+{} & (x_1 y_3 + x_2 y_4 + \cdots + x_{n-2} y_n + x_{n-1} y_1 + x_n y_2) \\
{}+{} & \cdots \\
{}+{} & (x_1 y_n + x_2 y_1 + x_3 y_2 + \cdots + x_n y_{n-1}). \qquad (389.11.1)
\end{aligned}$$
• If $y_1 \le y_2 \le \cdots \le y_n$, each of the $n$ terms in parentheses is less than or equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$, according to the rearrangement inequality. From this, it follows that
$$(x_1 + x_2 + \cdots + x_n)(y_1 + y_2 + \cdots + y_n) \le n(x_1 y_1 + x_2 y_2 + \cdots + x_n y_n)$$
or (dividing by $n^2$)
$$\left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \le \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}.$$
• If $y_1 \ge y_2 \ge \cdots \ge y_n$, the same reasoning gives
$$\left( \frac{x_1 + x_2 + \cdots + x_n}{n} \right) \left( \frac{y_1 + y_2 + \cdots + y_n}{n} \right) \ge \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}.$$
It is clear that equality holds if $x_1 = x_2 = \cdots = x_n$ or $y_1 = y_2 = \cdots = y_n$. To see that this condition is also necessary, suppose that not all $y_i$’s are equal, so that $y_1 \ne y_n$. Then the second term in parentheses of (389.11.1) can only be equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ if $x_{n-1} = x_n$, the third term only if $x_{n-2} = x_{n-1}$, and so on, until the last term which can only be equal to $x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ if $x_1 = x_2$. This implies that $x_1 = x_2 = \cdots = x_n$. Therefore, Chebyshev’s inequality is an equality if and only if $x_1 = x_2 = \cdots = x_n$ or $y_1 = y_2 = \cdots = y_n$.
Version: 1 Owner: pbruin Author(s): pbruin
389.12
proof of Minkowski inequality
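For $p = 1$ the result follows immediately from the triangle inequality, so we may assume $p > 1$. We have
$$|a_k + b_k|^p = |a_k + b_k|\,|a_k + b_k|^{p-1} \le (|a_k| + |b_k|)\,|a_k + b_k|^{p-1}$$
by the triangle inequality. Therefore we have
$$|a_k + b_k|^p \le |a_k|\,|a_k + b_k|^{p-1} + |b_k|\,|a_k + b_k|^{p-1}.$$
Set $q = \frac{p}{p-1}$. Then $\frac{1}{p} + \frac{1}{q} = 1$, so by the Hölder inequality we have
$$\sum_{k=0}^{n} |a_k|\,|a_k + b_k|^{p-1} \le \left( \sum_{k=0}^{n} |a_k|^p \right)^{\frac{1}{p}} \left( \sum_{k=0}^{n} |a_k + b_k|^{(p-1)q} \right)^{\frac{1}{q}},$$
$$\sum_{k=0}^{n} |b_k|\,|a_k + b_k|^{p-1} \le \left( \sum_{k=0}^{n} |b_k|^p \right)^{\frac{1}{p}} \left( \sum_{k=0}^{n} |a_k + b_k|^{(p-1)q} \right)^{\frac{1}{q}}.$$
Adding these two inequalities, dividing by the factor common to the right sides of both (if this factor is zero the result is trivial), and observing that $(p - 1)q = p$ by definition, we have
$$\left( \sum_{k=0}^{n} |a_k + b_k|^p \right)^{1 - \frac{1}{q}} \le \frac{\displaystyle\sum_{k=0}^{n} (|a_k| + |b_k|)\,|a_k + b_k|^{p-1}}{\left( \displaystyle\sum_{k=0}^{n} |a_k + b_k|^p \right)^{\frac{1}{q}}} \le \left( \sum_{k=0}^{n} |a_k|^p \right)^{\frac{1}{p}} + \left( \sum_{k=0}^{n} |b_k|^p \right)^{\frac{1}{p}}.$$
Finally, observe that $1 - \frac{1}{q} = \frac{1}{p}$, and the result follows as required. The proof for the integral version is analogous.
Version: 4 Owner: saforres Author(s): saforres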
389.13
proof of arithmeticgeometricharmonic means inequality
Let $M$ be $\max\{x_1, x_2, x_3, \ldots, x_n\}$ and let $m$ be $\min\{x_1, x_2, x_3, \ldots, x_n\}$. Then
$$M = \frac{M + M + M + \cdots + M}{n} \ge \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n},$$
$$m = \frac{n}{\frac{1}{m} + \frac{1}{m} + \frac{1}{m} + \cdots + \frac{1}{m}} \le \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}},$$
where all the summations have $n$ terms. So we have proved in this way the two inequalities at the extremes.

Now we shall prove the inequality between arithmetic mean and geometric mean. We do first the case $n = 2$:
$$\begin{aligned}
(\sqrt{x_1} - \sqrt{x_2})^2 &\ge 0 \\
x_1 - 2\sqrt{x_1 x_2} + x_2 &\ge 0 \\
x_1 + x_2 &\ge 2\sqrt{x_1 x_2} \\
\frac{x_1 + x_2}{2} &\ge \sqrt{x_1 x_2}.
\end{aligned}$$
Now we prove the inequality for any power of 2 (that is, $n = 2^k$ for some integer $k$) by using mathematical induction. We have
$$\frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} = \frac{\dfrac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} + \dfrac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k}}{2}$$
and using the case $n = 2$ on the last expression we can state the following inequality
$$\begin{aligned}
\frac{x_1 + x_2 + \cdots + x_{2^k} + x_{2^k+1} + \cdots + x_{2^{k+1}}}{2^{k+1}} &\ge \sqrt{ \frac{x_1 + x_2 + \cdots + x_{2^k}}{2^k} \cdot \frac{x_{2^k+1} + x_{2^k+2} + \cdots + x_{2^{k+1}}}{2^k} } \\
&\ge \sqrt{ \sqrt[2^k]{x_1 x_2 \cdots x_{2^k}} \; \sqrt[2^k]{x_{2^k+1} x_{2^k+2} \cdots x_{2^{k+1}}} },
\end{aligned}$$
where the last inequality was obtained by applying the induction hypothesis with $n = 2^k$. Finally, we see that the last expression is equal to $\sqrt[2^{k+1}]{x_1 x_2 x_3 \cdots x_{2^{k+1}}}$ and so we have proved the truth of the inequality when the number of terms is a power of two.

Finally, we prove that if the inequality holds for any $n$, it must also hold for $n - 1$, and this proposition, combined with the preceding proof for powers of 2, is enough to prove the inequality for any positive integer. Suppose that
$$\frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n}$$
is known for a given value of $n$ (we just proved that it is true for powers of two, as an example). Then we can replace $x_n$ with the average of the first $n - 1$ numbers. So
$$\begin{aligned}
\frac{x_1 + x_2 + \cdots + x_{n-1} + \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1}}{n} &= \frac{(n-1)x_1 + (n-1)x_2 + \cdots + (n-1)x_{n-1} + x_1 + x_2 + \cdots + x_{n-1}}{n(n-1)} \\
&= \frac{n x_1 + n x_2 + \cdots + n x_{n-1}}{n(n-1)} \\
&= \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1}.
\end{aligned}$$
On the other hand
$$\sqrt[n]{ x_1 x_2 \cdots x_{n-1} \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right) } = \sqrt[n]{x_1 x_2 \cdots x_{n-1}} \; \sqrt[n]{ \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} },$$
which, by the inequality stated for $n$ and the observations made above, leads to:
$$\left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)^n \ge (x_1 x_2 \cdots x_{n-1}) \left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)$$
and so
$$\left( \frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \right)^{n-1} \ge x_1 x_2 \cdots x_{n-1},$$
from where we get that
$$\frac{x_1 + x_2 + \cdots + x_{n-1}}{n-1} \ge \sqrt[n-1]{x_1 x_2 \cdots x_{n-1}}.$$
So far we have proved the inequality between the arithmetic mean and the geometric mean. The geometric-harmonic inequality is easier. Let $t_i$ be $1/x_i$. From
$$\frac{t_1 + t_2 + \cdots + t_n}{n} \ge \sqrt[n]{t_1 t_2 t_3 \cdots t_n}$$
we obtain
$$\frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}{n} \ge \sqrt[n]{ \frac{1}{x_1} \frac{1}{x_2} \frac{1}{x_3} \cdots \frac{1}{x_n} }$$
and therefore
$$\sqrt[n]{x_1 x_2 x_3 \cdots x_n} \ge \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}},$$
and so, our proof is completed.
Version: 2 Owner: drini Author(s): drini
389.14
proof of general means inequality
Let $r < s$ be real numbers, and let $w_1, w_2, \ldots, w_n$ be positive real numbers such that $w_1 + w_2 + \cdots + w_n = 1$. We will prove the weighted power means inequality, which states that for positive real numbers $x_1, x_2, \ldots, x_n$,
$$M_w^r(x_1, x_2, \ldots, x_n) \le M_w^s(x_1, x_2, \ldots, x_n).$$
First, suppose that $r$ and $s$ are nonzero. Then the $r$th weighted power mean of $x_1, x_2, \ldots, x_n$ is
$$M_w^r(x_1, x_2, \ldots, x_n) = (w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r}$$
and $M_w^s$ is defined similarly.
Let $t = \frac{s}{r}$, and let $y_i = x_i^r$ for $1 \le i \le n$; this implies $y_i^t = x_i^s$. Define the function $f$ on $(0, \infty)$ by $f(x) = x^t$. The second derivative of $f$ is $f''(x) = t(t-1)x^{t-2}$. There are three cases for the signs of $r$ and $s$: $r < s < 0$, $r < 0 < s$, and $0 < r < s$. We will prove the inequality for the case $0 < r < s$; the other cases are almost identical.
In the case that $r$ and $s$ are both positive, $t > 1$. Since $f''(x) = t(t-1)x^{t-2} > 0$ for all $x > 0$, $f$ is a strictly convex function. Therefore, according to Jensen’s inequality,
$$\begin{aligned}
(w_1 y_1 + w_2 y_2 + \cdots + w_n y_n)^t &= f(w_1 y_1 + w_2 y_2 + \cdots + w_n y_n) \\
&\le w_1 f(y_1) + w_2 f(y_2) + \cdots + w_n f(y_n) \\
&= w_1 y_1^t + w_2 y_2^t + \cdots + w_n y_n^t,
\end{aligned}$$
with equality if and only if $y_1 = y_2 = \cdots = y_n$. By substituting $t = \frac{s}{r}$ and $y_i = x_i^r$ back into this inequality, we get
$$(w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{s/r} \le w_1 x_1^s + w_2 x_2^s + \cdots + w_n x_n^s$$
with equality if and only if $x_1 = x_2 = \cdots = x_n$. Since $s$ is positive, the function $x \mapsto x^{1/s}$ is strictly increasing, so raising both sides to the power $1/s$ preserves the inequality:
$$(w_1 x_1^r + w_2 x_2^r + \cdots + w_n x_n^r)^{1/r} \le (w_1 x_1^s + w_2 x_2^s + \cdots + w_n x_n^s)^{1/s},$$
which is the inequality we had to prove. Equality holds if and only if all the $x_i$ are equal.
If $r = 0$, the inequality is still correct: $M_w^0$ is defined as $\lim_{r \to 0} M_w^r$, and since $M_w^r \le M_w^s$ for all $r < s$ with $r \ne 0$, the same holds for the limit $r \to 0$. We can show by an identical argument that $M_w^r \le M_w^0$ for all $r < 0$. Therefore, for all real numbers $r$ and $s$ such that $r < s$,
$$M_w^r(x_1, x_2, \ldots, x_n) \le M_w^s(x_1, x_2, \ldots, x_n).$$
Version: 1 Owner: pbruin Author(s): pbruin
389.15
proof of rearrangement inequality
We first prove the rearrangement inequality for the case $n = 2$. Let $x_1, x_2, y_1, y_2$ be real numbers such that $x_1 \le x_2$ and $y_1 \le y_2$. Then $(x_2 - x_1)(y_2 - y_1) \ge 0$, and therefore
$$x_1 y_1 + x_2 y_2 \ge x_1 y_2 + x_2 y_1.$$
Equality holds iff $x_1 = x_2$ or $y_1 = y_2$.
For the general case, let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be real numbers such that $x_1 \le x_2 \le \cdots \le x_n$. Suppose that $(z_1, z_2, \ldots, z_n)$ is a permutation (rearrangement) of $\{y_1, y_2, \ldots, y_n\}$ such that the sum $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is maximized. If there exists a pair $i < j$ with $z_i > z_j$, then $x_i z_j + x_j z_i \ge x_i z_i + x_j z_j$ (the $n = 2$ case); equality holds iff $x_i = x_j$. Therefore, $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is not maximal unless $z_1 \le z_2 \le \cdots \le z_n$ or $x_i = x_j$ for all pairs $i < j$ such that $z_i > z_j$. In the latter case, we can consecutively interchange these pairs until $z_1 \le z_2 \le \cdots \le z_n$ (this is possible because the number of pairs $i < j$ with $z_i > z_j$ decreases with each step). So $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is maximized if $z_1 \le z_2 \le \cdots \le z_n$.
To show that $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is minimal for a permutation $(z_1, z_2, \ldots, z_n)$ of $\{y_1, y_2, \ldots, y_n\}$ if $z_1 \ge z_2 \ge \cdots \ge z_n$, observe that $-(x_1 z_1 + x_2 z_2 + \cdots + x_n z_n) = x_1(-z_1) + x_2(-z_2) + \cdots + x_n(-z_n)$ is maximized if $-z_1 \le -z_2 \le \cdots \le -z_n$. This implies that $x_1 z_1 + x_2 z_2 + \cdots + x_n z_n$ is minimized if $z_1 \ge z_2 \ge \cdots \ge z_n$.
Version: 1 Owner: pbruin Author(s): pbruin
389.16
rearrangement inequality
Let $x_1, x_2, \ldots, x_n$ and $y_1, y_2, \ldots, y_n$ be two sequences of positive real numbers. Then the sum
$$x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
is maximized when the two sequences are ordered in the same way (i.e. $x_1 \le x_2 \le \cdots \le x_n$ and $y_1 \le y_2 \le \cdots \le y_n$) and is minimized when the two sequences are ordered in the opposite way (i.e. $x_1 \le x_2 \le \cdots \le x_n$ and $y_1 \ge y_2 \ge \cdots \ge y_n$).
This can be seen intuitively as follows: if $x_1, x_2, \ldots, x_n$ are the prices of $n$ kinds of items, and $y_1, y_2, \ldots, y_n$ the number of units sold of each, then the highest profit is when you sell more items with high prices and fewer items with low prices (same ordering), and the lowest profit happens when you sell more items with low prices and fewer items with high prices (opposite orders).
Version: 4 Owner: drini Author(s): drini
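For short sequences the rearrangement inequality can be verified by brute force over all pairings. The sketch below uses illustrative data and an assumed helper name `pairing_sum`; it enumerates every permutation of the second sequence and compares the extreme sums with the sorted and reverse-sorted pairings.

```python
from itertools import permutations

def pairing_sum(xs, zs):
    """x_1 z_1 + x_2 z_2 + ... + x_n z_n for one particular pairing."""
    return sum(x * z for x, z in zip(xs, zs))

xs = [1, 3, 4, 7]          # already sorted increasingly; illustrative values
ys = [2, 5, 6, 9]
sums = [pairing_sum(xs, p) for p in permutations(ys)]
same_order = pairing_sum(xs, sorted(ys))
opposite_order = pairing_sum(xs, sorted(ys, reverse=True))
```

In the price and quantity reading of the entry, `same_order` is the highest achievable profit and `opposite_order` the lowest among all 24 ways of matching quantities to prices.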
Chapter 390 26D99 – Miscellaneous
390.1 Bernoulli’s inequality
Let x and r be real numbers. If r > 1 and x > −1 then (1 + x)r ≥ 1 + xr.
The inequality also holds when r is an even integer. Version: 3 Owner: drini Author(s): drini
390.2
proof of Bernoulli’s inequality
Let $I$ be the interval $(-1, \infty)$ and $f : I \to \mathbb{R}$ the function defined as:
$$f(x) = (1 + x)^\alpha - 1 - \alpha x$$
with $\alpha \in \mathbb{R} \setminus \{0, 1\}$ fixed. Then $f$ is differentiable and its derivative is
$$f'(x) = \alpha(1 + x)^{\alpha - 1} - \alpha, \quad \text{for all } x \in I,$$
from which it follows that $f'(x) = 0 \Leftrightarrow x = 0$.
1. If $0 < \alpha < 1$ then $f'(x) < 0$ for all $x \in (0, \infty)$ and $f'(x) > 0$ for all $x \in (-1, 0)$, which means that 0 is a global maximum point for $f$. Therefore $f(x) < f(0)$ for all $x \in I \setminus \{0\}$, which means that $(1 + x)^\alpha < 1 + \alpha x$ for all $x \in I \setminus \{0\}$.
2. If $\alpha \notin [0, 1]$ then $f'(x) > 0$ for all $x \in (0, \infty)$ and $f'(x) < 0$ for all $x \in (-1, 0)$, meaning that 0 is a global minimum point for $f$. This implies that $f(x) > f(0)$ for all $x \in I \setminus \{0\}$, which means that $(1 + x)^\alpha > 1 + \alpha x$ for all $x \in I \setminus \{0\}$.
Checking that the equality is satisfied for $x = 0$ or for $\alpha \in \{0, 1\}$ ends the proof.
Version: 3 Owner: danielm Author(s): danielm
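The sign pattern of $f(x) = (1 + x)^\alpha - 1 - \alpha x$ established in the two cases of the proof can be sampled numerically. In this sketch the helper name `bernoulli_gap`, the sample points, and the sample exponents are assumptions chosen for illustration.

```python
def bernoulli_gap(x, alpha):
    """f(x) = (1 + x)**alpha - 1 - alpha*x from the proof, defined for x > -1."""
    return (1 + x) ** alpha - 1 - alpha * x

xs = [-0.9, -0.5, -0.1, 0.1, 1.0, 5.0]                             # sample points in (-1, inf), x != 0
inside = [bernoulli_gap(x, 0.5) for x in xs]                       # 0 < alpha < 1
outside = [bernoulli_gap(x, a) for x in xs for a in (-1.0, 2.5)]   # alpha outside [0, 1]
```

All entries of `inside` are negative (the reversed inequality for $0 < \alpha < 1$) and all entries of `outside` are positive, while the gap vanishes at $x = 0$.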
Chapter 391 26E35 – Nonstandard analysis
391.1 hyperreal
An ultrafilter $F$ on a set $I$ is called nonprincipal if no finite subsets of $I$ are in $F$. Fix once and for all a nonprincipal ultrafilter $F$ on the set $\mathbb{N}$ of natural numbers. Let $\sim$ be the equivalence relation on the set $\mathbb{R}^{\mathbb{N}}$ of sequences of real numbers given by
$$\{a_n\} \sim \{b_n\} \Leftrightarrow \{n \in \mathbb{N} \mid a_n = b_n\} \in F.$$
Let ${}^*\mathbb{R}$ be the set of equivalence classes of $\mathbb{R}^{\mathbb{N}}$ under the equivalence relation $\sim$. The set ${}^*\mathbb{R}$ is called the set of hyperreals. It is a field under coordinatewise addition and multiplication:
$$\{a_n\} + \{b_n\} = \{a_n + b_n\}, \qquad \{a_n\} \cdot \{b_n\} = \{a_n \cdot b_n\}.$$
The field ${}^*\mathbb{R}$ is an ordered field under the ordering relation
$$\{a_n\} \le \{b_n\} \Leftrightarrow \{n \in \mathbb{N} \mid a_n \le b_n\} \in F.$$
The real numbers embed into ${}^*\mathbb{R}$ by the map sending the real number $x \in \mathbb{R}$ to the equivalence class of the constant sequence given by $x_n := x$ for all $n$. In what follows, we adopt the convention of treating $\mathbb{R}$ as a subset of ${}^*\mathbb{R}$ under this embedding.
A hyperreal $x \in {}^*\mathbb{R}$ is:
• limited if $a < x < b$ for some real numbers $a, b \in \mathbb{R}$
• positive unlimited if $x > a$ for all real numbers $a \in \mathbb{R}$
• negative unlimited if $x < a$ for all real numbers $a \in \mathbb{R}$
• unlimited if it is either positive unlimited or negative unlimited
• positive infinitesimal if $0 < x < a$ for all positive real numbers $a \in \mathbb{R}^+$
• negative infinitesimal if $a < x < 0$ for all negative real numbers $a \in \mathbb{R}^-$
• infinitesimal if it is either positive infinitesimal or negative infinitesimal
For any subset $A$ of $\mathbb{R}$, the set ${}^*A$ is defined to be the subset of ${}^*\mathbb{R}$ consisting of equivalence classes of sequences $\{a_n\}$ such that
$$\{n \in \mathbb{N} \mid a_n \in A\} \in F.$$
The sets ${}^*\mathbb{N}$, ${}^*\mathbb{Z}$, and ${}^*\mathbb{Q}$ are called hypernaturals, hyperintegers, and hyperrationals, respectively. An element of ${}^*\mathbb{N}$ is also sometimes called hyperfinite.
Version: 1 Owner: djao Author(s): djao
391.2
e is not a quadratic irrational
Looking at the Taylor series for $e^x$, we see that
$$e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
This converges for every $x \in \mathbb{R}$, so $e = \sum_{k=0}^{\infty} \frac{1}{k!}$ and $e^{-1} = \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!}$. Arguing by contradiction, assume $ae^2 + be + c = 0$ for integers $a$, $b$ and $c$. That is the same as $ae + b + ce^{-1} = 0$.
Fix $n > |a| + |c|$, then $a, c \mid n!$ and $\forall k \le n$, $k! \mid n!$. Consider
$$\begin{aligned}
0 = n!(ae + b + ce^{-1}) &= a\,n! \sum_{k=0}^{\infty} \frac{1}{k!} + b\,n! + c\,n! \sum_{k=0}^{\infty} (-1)^k \frac{1}{k!} \\
&= b\,n! + \sum_{k=0}^{n} (a + c(-1)^k) \frac{n!}{k!} + \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!}.
\end{aligned}$$
Since $k! \mid n!$ for $k \le n$, the first two terms are integers. So the third term should be an integer. However,
$$\begin{aligned}
\left| \sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{n!}{k!} \right| &\le (|a| + |c|) \sum_{k=n+1}^{\infty} \frac{n!}{k!} \\
&= (|a| + |c|) \sum_{k=n+1}^{\infty} \frac{1}{(n+1)(n+2) \cdots k} \\
&\le (|a| + |c|) \sum_{k=n+1}^{\infty} (n+1)^{n-k} \\
&= (|a| + |c|) \sum_{t=1}^{\infty} (n+1)^{-t} \\
&= (|a| + |c|) \frac{1}{n}
\end{aligned}$$
is less than 1 by our assumption that $n > |a| + |c|$. Since there is only one integer which is less than 1 in absolute value, this means that $\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{1}{k!} = 0$ for every sufficiently large $n$, which is not the case because
$$\sum_{k=n+1}^{\infty} (a + c(-1)^k) \frac{1}{k!} - \sum_{k=n+2}^{\infty} (a + c(-1)^k) \frac{1}{k!} = (a + c(-1)^{n+1}) \frac{1}{(n+1)!}$$
is not identically zero. The contradiction completes the proof.
Version: 6 Owner: thedagit Author(s): bbukh, thedagit
391.3
zero of a function
Definition Suppose $X$ is a set, and suppose $f$ is a complex-valued function $f : X \to \mathbb{C}$. Then a zero of $f$ is an element $x \in X$ such that $f(x) = 0$. The zero set of $f$ is the set
$$Z(f) = \{x \in X \mid f(x) = 0\}.$$
Remark When $X$ is a “simple” space, such as $\mathbb{R}$ or $\mathbb{C}$, a zero is also called a root. However, in pure mathematics and especially if $Z(f)$ is infinite, it seems to be customary to talk of zeroes and the zero set instead of roots.
Examples
• Suppose $p$ is a polynomial $p : \mathbb{C} \to \mathbb{C}$ of degree $n \ge 1$. Then $p$ has at most $n$ zeroes. That is, $|Z(p)| \le n$.
• If $f$ and $g$ are functions $f : X \to \mathbb{C}$ and $g : X \to \mathbb{C}$, then
$$Z(fg) = Z(f) \cup Z(g), \qquad Z(fg) \supset Z(f),$$
where $fg$ is the function $x \mapsto f(x)g(x)$.
• If $X$ is a topological space and $f : X \to \mathbb{C}$ is a function, then
$$\operatorname{supp} f = \overline{X \setminus Z(f)}.$$
Further, if $f$ is continuous, then $Z(f)$ is closed in $X$ (assuming that $\mathbb{C}$ is given the usual topology of the complex plane where $\{0\}$ is a closed set).
Version: 21 Owner: mathcam Author(s): matte, yark, say 10, apmxi
Chapter 392 2800 – General reference works (handbooks, dictionaries, bibliographies, etc.)
392.1 extended real numbers
The extended real numbers are the real numbers together with $+\infty$ (or simply $\infty$) and $-\infty$. This set is usually denoted by $\overline{\mathbb{R}}$ or $[-\infty, \infty]$ [3], and the elements $+\infty$ and $-\infty$ are called plus infinity and minus infinity, respectively. Following [3], let us next extend the order operation $<$, the addition and multiplication operations, and the absolute value from $\mathbb{R}$ to $\overline{\mathbb{R}}$. In other words, let us define how these operations should behave when some of their arguments are $\infty$ or $-\infty$.

Order on $\overline{\mathbb{R}}$
The order relation on $\mathbb{R}$ extends to $\overline{\mathbb{R}}$ by defining that for any $x \in \mathbb{R}$, we have $-\infty < x$, $x < \infty$, and that $-\infty < \infty$.

Addition
For any real number $x$, we define
$$x + (\pm\infty) = (\pm\infty) + x = \pm\infty,$$
and for $+\infty$ and $-\infty$, we define
$$(\pm\infty) + (\pm\infty) = \pm\infty.$$
It should be pointed out that sums like $(+\infty) + (-\infty)$ are left undefined.

Multiplication
If $x$ is a positive real number, then
$$x \cdot (\pm\infty) = (\pm\infty) \cdot x = \pm\infty.$$
Similarly, if $x$ is a negative real number, then
$$x \cdot (\pm\infty) = (\pm\infty) \cdot x = \mp\infty.$$
Furthermore, for $\infty$ and $-\infty$, we define
$$(+\infty) \cdot (+\infty) = (-\infty) \cdot (-\infty) = +\infty, \qquad (+\infty) \cdot (-\infty) = (-\infty) \cdot (+\infty) = -\infty.$$
In many areas of mathematics, products like $0 \cdot \infty$ are left undefined. However, a special case is measure theory, where it is convenient to define [3]
$$0 \cdot (\pm\infty) = (\pm\infty) \cdot 0 = 0.$$

Absolute value
For $\infty$ and $-\infty$, the absolute value is defined as
$$|\pm\infty| = +\infty.$$

Examples
1. By taking $x = -1$ in the product rule, we obtain the relations
$$(-1) \cdot (\pm\infty) = \mp\infty.$$
REFERENCES
1. D.L. Cohn, Measure Theory, Birkhäuser, 1980.
Version: 1 Owner: matte Author(s): matte
Chapter 393 28XX – Measure and integration
393.1 Riemann integral
Suppose there is a function $f : D \to R$ where $D, R \subseteq \mathbb{R}$ and that there is a closed interval $I = [a, b]$ such that $I \subseteq D$. For any finite set of points $\{x_0, x_1, x_2, \ldots, x_n\}$ such that $a = x_0 < x_1 < x_2 < \cdots < x_n = b$, there is a corresponding partition $P = \{[x_0, x_1), [x_1, x_2), \ldots, [x_{n-1}, x_n]\}$ of $I$.
Let $C(\epsilon)$ be the set of all partitions of $I$ with $\max(x_{i+1} - x_i) < \epsilon$. Then let $S^*(\epsilon)$ be the infimum of the set of upper Riemann sums with each partition in $C(\epsilon)$, and let $S_*(\epsilon)$ be the supremum of the set of lower Riemann sums with each partition in $C(\epsilon)$. If $\epsilon_1 < \epsilon_2$, then $C(\epsilon_1) \subset C(\epsilon_2)$, so $S^* = \lim_{\epsilon \to 0} S^*(\epsilon)$ and $S_* = \lim_{\epsilon \to 0} S_*(\epsilon)$ exist. If $S^* = S_*$, then $f$ is Riemann-integrable over $I$, and the Riemann integral of $f$ over $I$ is defined by
$$\int_a^b f(x)\,dx = S^* = S_*.$$
Version: 4 Owner: bbukh Author(s): bbukh, vampyr
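Upper and lower Riemann sums are easy to compute when the integrand is monotone, because the supremum and infimum on each piece are attained at its endpoints. The sketch below is a simplified illustration under assumptions not made in the entry: it only uses uniform partitions and a nondecreasing integrand, here $f(x) = x^2$ on $[0, 1]$, whose integral is $\frac{1}{3}$.

```python
def riemann_sums(f, a, b, n):
    """Lower and upper sums of a nondecreasing f on n equal subintervals of [a, b]."""
    h = (b - a) / n
    points = [a + k * h for k in range(n + 1)]
    lower = sum(f(x) * h for x in points[:-1])   # inf on each piece is at its left end
    upper = sum(f(x) * h for x in points[1:])    # sup on each piece is at its right end
    return lower, upper

square = lambda x: x * x       # illustrative integrand, nondecreasing on [0, 1]
lo10, up10 = riemann_sums(square, 0.0, 1.0, 10)
lo1000, up1000 = riemann_sums(square, 0.0, 1.0, 1000)
```

As the mesh shrinks the lower sums increase and the upper sums decrease, squeezing the common value; for a monotone integrand the difference between them on a uniform partition is $(f(b) - f(a))(b - a)/n$.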
393.2
martingale
Let $\nu$ be a probability measure on Cantor space $C$, and let $s \in [0, \infty)$.
1. A $\nu$-$s$-supergale is a function $d : \{0, 1\}^* \to [0, \infty)$ that satisfies the condition
$$d(w)\nu(w)^s \ge d(w0)\nu(w0)^s + d(w1)\nu(w1)^s \qquad (393.2.1)$$
for all $w \in \{0, 1\}^*$.
2. A $\nu$-$s$-gale is a $\nu$-$s$-supergale that satisfies the condition with equality for all $w \in \{0, 1\}^*$.
3. A $\nu$-supermartingale is a $\nu$-1-supergale.
4. A $\nu$-martingale is a $\nu$-1-gale.
5. An $s$-supergale is a $\mu$-$s$-supergale, where $\mu$ is the uniform probability measure.
6. An $s$-gale is a $\mu$-$s$-gale.
7. A supermartingale is a 1-supergale.
8. A martingale is a 1-gale.
Put in another way, a martingale is a function $d : \{0, 1\}^* \to [0, \infty)$ such that, for all $w \in \{0, 1\}^*$, $d(w) = (d(w0) + d(w1))/2$.
Let $d$ be a $\nu$-$s$-supergale, where $\nu$ is a probability measure on $C$ and $s \in [0, \infty)$. We say that $d$ succeeds on a sequence $S \in C$ if
$$\limsup_{n \to \infty} d(S[0..n-1]) = \infty.$$
The success set of $d$ is $S^\infty[d] = \{S \in C \mid d \text{ succeeds on } S\}$. $d$ succeeds on a language $A \subseteq \{0, 1\}^*$ if $d$ succeeds on the characteristic sequence $\chi_A$ of $A$. We say that $d$ succeeds strongly on a sequence $S \in C$ if
$$\liminf_{n \to \infty} d(S[0..n-1]) = \infty.$$
The strong success set of $d$ is $S^\infty_{\mathrm{str}}[d] = \{S \in C \mid d \text{ succeeds strongly on } S\}$.
Intuitively, a supergale $d$ is a betting strategy that bets on the next bit of a sequence when the previous bits are known. $s$ is the parameter that tunes the fairness of the betting. The smaller $s$ is, the less fair the betting is. If $d$ succeeds on a sequence, then the bonus we can get from applying $d$ as the betting strategy on the sequence is unbounded. If $d$ succeeds strongly on a sequence, then the bonus goes to infinity.
Version: 10 Owner: xiaoyanggu Author(s): xiaoyanggu
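The betting-strategy reading of a martingale can be made concrete with a very simple strategy. The sketch below is an illustration only: the function name `capital`, the fixed stake of $\frac{3}{4}$ on the next bit being 1, and the starting capital 1 are assumptions, and exact rational arithmetic is used so that the averaging condition $d(w) = (d(w0) + d(w1))/2$ can be tested with equality.

```python
from fractions import Fraction
from itertools import product

def capital(w, bet_on_one=Fraction(3, 4)):
    """Capital d(w) after betting along the bit string w with a fixed stake on '1'.

    Starts from d(empty string) = 1; under the uniform measure the share of
    capital placed on the bit that actually occurs is doubled.
    """
    d = Fraction(1)
    for bit in w:
        d *= 2 * (bet_on_one if bit == "1" else 1 - bet_on_one)
    return d

# All binary strings of length 0 through 5.
strings = ["".join(bits) for n in range(6) for bits in product("01", repeat=n)]
is_martingale = all(capital(w) == (capital(w + "0") + capital(w + "1")) / 2 for w in strings)
```

Along the all-ones sequence the capital is $(3/2)^n$ after $n$ bits, so this particular strategy succeeds (indeed succeeds strongly) on that sequence, while along the all-zeros sequence it halves at every step.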
Chapter 394 28A05 – Classes of sets (Borel ﬁelds, σrings, etc.), measurable sets, Suslin sets, analytic sets
394.1 Borel σalgebra
For any topological space X, the Borel sigma algebra of X is the σ–algebra B generated by the open sets of X. An element of B is called a Borel subset of X, or a Borel set. Version: 5 Owner: djao Author(s): djao, rmilson
Chapter 395 28A10 – Real or complexvalued set functions
395.1 σﬁnite
A measure space $(\Omega, \mathcal{B}, \mu)$ is σ-finite if the total space is the union of a finite or countable family of sets of finite measure; i.e. if there exists a finite or countable set $F \subset \mathcal{B}$ such that $\mu(A) < \infty$ for each $A \in F$, and $\Omega = \bigcup_{A \in F} A$. In this case we also say that $\mu$ is a σ-finite measure. If $\mu$ is not σ-finite, we say that it is σ-infinite.
Examples. Any finite measure space is σ-finite. A more interesting example is the Lebesgue measure $\mu$ in $\mathbb{R}^n$: it is σ-finite but not finite. In fact
$$\mathbb{R}^n = \bigcup_{k \in \mathbb{N}} [-k, k]^n$$
($[-k, k]^n$ is a cube with center at 0 and side length $2k$, and its measure is $(2k)^n$), but $\mu(\mathbb{R}^n) = \infty$.
Version: 6 Owner: Koro Author(s): Koro, drummond
395.2
Argand diagram
An Argand diagram is the graphical representation of complex numbers written in polar coordinates. Argand is the name of Jean-Robert Argand, the Frenchman who is credited with the geometric interpretation of the complex numbers.
Version: 3 Owner: drini Author(s): drini
395.3
Hahn-Kolmogorov theorem
Let A0 be an algebra of subsets of a set X. If a finitely additive measure $\mu_0 : A_0 \to \mathbb{R} \cup \{\infty\}$ satisfies
\[ \mu_0\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu_0(A_n) \]
for any disjoint family {An : n ∈ N} of elements of A0 such that $\bigcup_{n=1}^{\infty} A_n \in A_0$, then µ0 extends to a measure defined on the σ-algebra A generated by A0; i.e. there exists a measure $\mu : A \to \mathbb{R} \cup \{\infty\}$ such that its restriction to A0 coincides with µ0. The extension is unique when µ0 is σ-finite. Version: 3 Owner: Koro Author(s): Koro
395.4
measure
Let (E, B(E)) be a measurable space. A measure on (E, B(E)) is a function $\mu : B(E) \to \mathbb{R} \cup \{\infty\}$ with values in the extended real numbers such that:
1. µ(A) ≥ 0 for A ∈ B(E), with equality if A = ∅;
2. $\mu\big(\bigcup_{i=0}^{\infty} A_i\big) = \sum_{i=0}^{\infty} \mu(A_i)$ for any sequence of disjoint sets Ai ∈ B(E).
The second property is called countable additivity. A finitely additive measure µ has the same definition except that B(E) is only required to be an algebra and the second property above is only required to hold for finite unions. Note the slight abuse of terminology: a finitely additive measure is not necessarily a measure. The triple (E, B, µ) is called a measure space. If µ(E) = 1, then it is called a probability space, and the measure µ is called a probability measure. Lebesgue measure on $\mathbb{R}^n$ is one important example of a measure. Version: 8 Owner: djao Author(s): djao
395.5
outer measure
Definition [1, 2, 1] Let X be a set, and let P(X) be the power set of X. An outer measure on X is a function µ∗ : P(X) → [0, ∞] satisfying the properties
1. µ∗(∅) = 0.
2. If A ⊂ B are subsets in X, then µ∗(A) ≤ µ∗(B).
3. If {Ai} is a countable collection of subsets of X, then $\mu^*\big(\bigcup_i A_i\big) \le \sum_i \mu^*(A_i)$.
Here, we can make two remarks. First, from (1) and (2), it follows that µ∗ is a nonnegative function on P(X). Second, property (3) also holds for any finite collection of subsets since we can always append an infinite sequence of empty sets to such a collection.
Examples
• [1, 2] On a set X, let us define µ∗ : P(X) → [0, ∞] as µ∗(E) = 1 when E ≠ ∅, and µ∗(E) = 0 when E = ∅. Then µ∗ is an outer measure.
• [1] On an uncountable set X, let us define µ∗ : P(X) → [0, ∞] as µ∗(E) = 1 when E is uncountable, and µ∗(E) = 0 when E is countable. Then µ∗ is an outer measure.
Theorem [1, 2, 1] Let X be a set, and let F be a collection of subsets of X such that ∅ ∈ F and X ∈ F. Further, let ρ : F → [0, ∞] be a mapping such that ρ(∅) = 0. If A ⊂ X, let
\[ \mu^*(A) = \inf \sum_{i=1}^{\infty} \rho(F_i), \]
where the infimum is taken over all collections $\{F_i\}_{i=1}^{\infty} \subset F$ such that $A \subset \bigcup_{i=1}^{\infty} F_i$. Then µ∗ : P(X) → [0, ∞] is an outer measure.
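The three defining properties can be checked by brute force on a small finite set. The sketch below does this for the first example (µ∗ equal to 1 on nonempty sets and 0 on the empty set). The three-point set X and the helper names are assumptions made for illustration, and only pairwise subadditivity is tested, which gives finite subadditivity by induction.

```python
from itertools import chain, combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def mu_star(E):
    """First example: 1 on nonempty sets, 0 on the empty set."""
    return 1 if E else 0

X = frozenset({1, 2, 3})          # illustrative finite universe
subsets = powerset(X)

# Property 1: the empty set has outer measure zero.
null_empty = mu_star(frozenset()) == 0
# Property 2: monotonicity under inclusion.
monotone = all(mu_star(A) <= mu_star(B)
               for A in subsets for B in subsets if A <= B)
# Property 3, checked for pairs of sets.
subadditive = all(mu_star(A | B) <= mu_star(A) + mu_star(B)
                  for A in subsets for B in subsets)
```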
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978. 2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982. 3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.
Version: 1 Owner: mathcam Author(s): matte
395.6
properties for measure
Theorem [1, 1, 3, 2] Let (E, B, µ) be a measure space, i.e., let E be a set, let B be a σ-algebra of sets in E, and let µ be a measure on B. Then the following properties hold:
1. Monotonicity: If A, B ∈ B, and A ⊂ B, then µ(A) ≤ µ(B).
2. If A, B in B, A ⊂ B, and µ(A) < ∞, then µ(B \ A) = µ(B) − µ(A).
3. For any A, B in B, we have µ(A ∪ B) + µ(A ∩ B) = µ(A) + µ(B).
4. Subadditivity: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B, then
\[ \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) \le \sum_{i=1}^{\infty} \mu(A_i). \]
5. Continuity from below: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B such that $A_i \subset A_{i+1}$ for all i, then
\[ \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \lim_{i\to\infty} \mu(A_i). \]
6. Continuity from above: If $\{A_i\}_{i=1}^{\infty}$ is a collection of sets from B such that µ(A1) < ∞, and $A_i \supset A_{i+1}$ for all i, then
\[ \mu\Big(\bigcap_{i=1}^{\infty} A_i\Big) = \lim_{i\to\infty} \mu(A_i). \]
Remarks In (2), the assumption µ(A) < ∞ assures that the right hand side is always well defined, i.e., not of the form ∞ − ∞. Without the assumption we can prove that µ(B) = µ(A) + µ(B \ A) (see below). In (3), it is tempting to move the term µ(A ∩ B) to the other side for aesthetic reasons. However, this is only possible if the term is finite.
Proof. For (1), suppose A ⊂ B. We can then write B as the disjoint union B = A ∪ (B \ A), whence
\[ \mu(B) = \mu(A \cup (B \setminus A)) = \mu(A) + \mu(B \setminus A). \]
Since µ(B \ A) ≥ 0, the claim follows. Property (2) follows from the above equation; since µ(A) < ∞, we can subtract this quantity from both sides. For property (3), we can write A ∪ B = A ∪ (B \ A), whence
\[ \mu(A \cup B) = \mu(A) + \mu(B \setminus A) \le \mu(A) + \mu(B). \]
If µ(A ∪ B) is infinite, the last inequality must be equality, and either of µ(A) or µ(B) must be infinite. Together with (1), we obtain that if any of the quantities µ(A), µ(B), µ(A ∪ B) or µ(A ∩ B) is infinite, then both sides of the identity in (3) are infinite, whence the claim clearly holds. We can therefore without loss of generality assume that all quantities are finite. From A ∪ B = B ∪ (A \ B), we have µ(A ∪ B) = µ(B) + µ(A \ B) and thus
\[ 2\mu(A \cup B) = \mu(A) + \mu(B) + \mu(A \setminus B) + \mu(B \setminus A). \]
For the last two terms we have
\[ \mu(A \setminus B) + \mu(B \setminus A) = \mu((A \setminus B) \cup (B \setminus A)) = \mu((A \cup B) \setminus (A \cap B)) = \mu(A \cup B) - \mu(A \cap B), \]
where, in the second equality we have used properties for the symmetric set difference, and the last equality follows from property (2). This completes the proof of property (3). For property (4), let us define the sequence $\{D_i\}_{i=1}^{\infty}$ as
\[ D_1 = A_1, \qquad D_i = A_i \setminus \bigcup_{k=1}^{i-1} A_k. \]
Now $D_i \cap D_j = \emptyset$ for i < j, so {Di} is a sequence of disjoint sets. Since $\bigcup_{i=1}^{\infty} D_i = \bigcup_{i=1}^{\infty} A_i$, and since $D_i \subset A_i$, we have
\[ \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \mu\Big(\bigcup_{i=1}^{\infty} D_i\Big) = \sum_{i=1}^{\infty} \mu(D_i) \le \sum_{i=1}^{\infty} \mu(A_i), \]
and property (4) follows. TODO: proofs for (5)(6).
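Property (3) and the disjointification used in the proof of property (4) can be tried out on finite sets. The sketch below uses counting measure on small sets of integers; the choice of measure and the sample sets are assumptions made only for this illustration.

```python
def mu(A):
    """Counting measure of a finite set (an assumption for this sketch)."""
    return len(A)

A = {1, 2, 3, 4}
B = {3, 4, 5}

# Property (3): mu(A ∪ B) + mu(A ∩ B) = mu(A) + mu(B)
inclusion_exclusion = mu(A | B) + mu(A & B) == mu(A) + mu(B)

# Disjointification from the proof of property (4):
# D_i = A_i minus the union of A_1, ..., A_{i-1}.
sets = [{1, 2}, {2, 3}, {3, 4, 5}, {1, 5}]
D = []
seen = set()
for A_i in sets:
    D.append(A_i - seen)
    seen |= A_i

union = set().union(*sets)
```

The sets in `D` are pairwise disjoint with the same union as the original sets, so their measures add up to the measure of the union, which is at most the sum of the original measures.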
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978. 3. D.L. Cohn, Measure Theory, Birkhäuser, 1980. 4. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982.
Version: 2 Owner: matte Author(s): matte
Chapter 396 28A12 – Contents, measures, outer measures, capacities
396.1 Hahn decomposition theorem
Let µ be a signed measure in the measurable space (Ω, S). There are two measurable sets A and B such that:
1. A ∪ B = Ω and A ∩ B = ∅;
2. µ(E) ≥ 0 for each E ∈ S such that E ⊂ A;
3. µ(E) ≤ 0 for each E ∈ S such that E ⊂ B.
The pair (A, B) is called a Hahn decomposition for µ. This decomposition is not unique, but any other such decomposition (A′, B′) satisfies µ(A′ △ A) = µ(B △ B′) = 0 (where △ denotes the symmetric difference), so the two decompositions differ in a set of measure 0. Version: 6 Owner: Koro Author(s): Koro
396.2
Jordan decomposition
Let (Ω, S, µ) be a signed measure space, and let (A, B) be a Hahn decomposition for µ. We define µ+ and µ− by
\[ \mu^+(E) = \mu(A \cap E) \quad \text{and} \quad \mu^-(E) = -\mu(B \cap E). \]
This definition is easily shown to be independent of the chosen Hahn decomposition.
It is clear that µ+ is a positive measure, and it is called the positive variation of µ. On the other hand, µ− is a positive finite measure, called the negative variation of µ. The measure |µ| = µ+ + µ− is called the total variation of µ. Notice that µ = µ+ − µ−. This decomposition of µ into its positive and negative parts is called the Jordan decomposition of µ. Version: 6 Owner: Koro Author(s): Koro
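On a finite set a signed measure is just a table of signed point masses, and the Hahn and Jordan decompositions can be written out directly. The following sketch assumes such a setting; the weights and all names are hypothetical and serve only to illustrate the definitions above.

```python
# Signed point masses on a five-point set (illustrative values).
weights = {"a": 2.0, "b": -1.5, "c": 0.0, "d": 3.0, "e": -0.5}

def mu(E):
    """Signed measure of E: the sum of the point masses in E."""
    return sum(weights[x] for x in E)

omega = set(weights)
A = {x for x in omega if weights[x] >= 0}   # positive set of a Hahn decomposition
B = omega - A                               # negative set

def mu_plus(E):
    """Positive variation: mu(A ∩ E)."""
    return mu(A & set(E))

def mu_minus(E):
    """Negative variation: -mu(B ∩ E)."""
    return -mu(B & set(E))

def total_variation(E):
    """|mu|(E) = mu+(E) + mu-(E)."""
    return mu_plus(E) + mu_minus(E)

E = {"a", "b", "e"}
```

Moving the null point "c" from A to B gives another Hahn decomposition and leaves both variations unchanged.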
396.3
Lebesgue decomposition theorem
Let µ and ν be two σ-finite signed measures in the measurable space (Ω, S). There exist two σ-finite signed measures ν0 and ν1 such that:
1. ν = ν0 + ν1;
2. ν0 ≪ µ (i.e. ν0 is absolutely continuous with respect to µ);
3. ν1 ⊥ µ (i.e. ν1 and µ are singular).
These two measures are uniquely determined. Version: 5 Owner: Koro Author(s): Koro
396.4
Lebesgue outer measure
Let S be some arbitrary subset of R. Let L(I) be the traditional definition of the length of an interval I ⊆ R: if I = (a, b), then L(I) = b − a. Let M be the set containing
\[ \sum_{A \in C} L(A) \]
for any countable collection of open intervals C that covers S (that is, $S \subseteq \bigcup C$). Then the Lebesgue outer measure of S is defined by:
\[ m^*(S) = \inf(M) \]
Note that (R, P(R), m∗) is “almost” a measure space. In particular:
• Lebesgue outer measure is defined for any subset of R (and P(R) is a σ-algebra).
• m∗(A) ≥ 0 for any A ⊆ R, and m∗(∅) = 0.
• If A and B are disjoint sets, then m∗(A ∪ B) ≤ m∗(A) + m∗(B). More generally, if Ai is a countable sequence of disjoint sets, then $m^*\big(\bigcup_i A_i\big) \le \sum_i m^*(A_i)$. This property is known as countable subadditivity and is weaker than countable additivity. In fact, m∗ is not countably additive.
Lebesgue outer measure has other nice properties:
• The outer measure of an interval is its length: m∗((a, b)) = b − a.
• m∗ is translation invariant. That is, if we define A + y to be the set {x + y : x ∈ A}, we have m∗(A) = m∗(A + y) for any y ∈ R.
Version: 4 Owner: vampyr Author(s): vampyr
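The definition by covers can be made concrete: every cover yields an element of M and hence an upper bound for m∗(S). The sketch below covers a finite list of points by open intervals whose total length stays below a prescribed ε; the same recipe applied to an enumeration shows that countable sets have outer measure zero. The function names and the sample points are assumptions for illustration.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the n-th point (n = 0, 1, ...) by an open interval of length eps / 2**(n+1)."""
    intervals = []
    for n, x in enumerate(points):
        half = Fraction(eps) / 2 ** (n + 2)     # half of the interval length
        intervals.append((x - half, x + half))
    return intervals

def total_length(intervals):
    """Sum of L(I) over the intervals of the cover."""
    return sum(b - a for a, b in intervals)

# Some rationals in [0, 1] (repetitions are harmless for the estimate).
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
eps = Fraction(1, 100)
C = cover(points, eps)
```

Since ε is arbitrary, the infimum over all such covers is 0.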
396.5
absolutely continuous
Given two signed measures µ and ν on the same measurable space (Ω, S), we say that ν is absolutely continuous with respect to µ if, for each A ∈ S such that |µ|(A) = 0, it holds ν(A) = 0. This is usually denoted by ν ≪ µ. Remarks. If (ν+, ν−) is the Jordan decomposition of ν, the following propositions are equivalent:
1. ν ≪ µ;
2. ν+ ≪ µ and ν− ≪ µ;
3. |ν| ≪ µ.
If ν is a finite signed measure and ν ≪ µ, the following useful property holds: for each ε > 0, there is a δ > 0 such that |ν|(E) < ε whenever |µ|(E) < δ. Version: 5 Owner: Koro Author(s): Koro
396.6
counting measure
Let (X, B) be a measurable space. We call a measure µ counting measure on X if, for every A ∈ B, µ(A) = n if A has exactly n elements, and µ(A) = ∞ otherwise.
Generally, counting measure is applied on N or Z. Version: 2 Owner: mathwizard Author(s): mathwizard, drummond
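A minimal sketch of counting measure in code follows. Representing an infinite set by `None` is purely a convention of this example, not part of the definition.

```python
import math

def counting_measure(A=None):
    """Number of elements of a finite set A.

    Passing None stands for an infinite set (a convention of this sketch),
    for which the measure is infinite.
    """
    return math.inf if A is None else len(A)

A = {1, 2, 3}
B = {10, 20}          # disjoint from A
```

For disjoint finite sets the measure of the union is the sum of the measures, as additivity requires.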
396.7
measurable set
Let (X, F, µ) be a measure space with a sigma algebra F. A measurable set with respect to µ in X is an element of F. These are also sometimes called µ-measurable sets. Any subset Y ⊂ X with Y ∉ F is said to be nonmeasurable with respect to µ, or non-µ-measurable. Version: 2 Owner: mathcam Author(s): mathcam, drummond
396.8
outer regular
Let X be a locally compact Hausdorff topological space with Borel σ-algebra B, and suppose µ is a measure on (X, B). For any Borel set B ∈ B, the measure µ is said to be outer regular on B if µ(B) = inf {µ(U) : U ⊃ B, U open}. We say µ is inner regular on B if µ(B) = sup {µ(K) : K ⊂ B, K compact}. Version: 1 Owner: djao Author(s): djao
396.9
signed measure
A signed measure on a measurable space (Ω, S) is a function $\mu : S \to \mathbb{R} \cup \{+\infty\}$ which is σ-additive and such that µ(∅) = 0. Remarks.
1. The usual (positive) measure is a particular case of signed measure, in which |µ| = µ (see Jordan decomposition).
2. Notice that the value −∞ is not allowed.
3. An important example of signed measures arises from the usual measures in the following way: Let (Ω, S, µ) be a measure space, and let f be a (real valued) measurable function such that $\int_{\{x \in \Omega : f(x) < 0\}} |f| \, d\mu < \infty$. Then a signed measure is defined by $A \mapsto \int_A f \, d\mu$.
Version: 4 Owner: Koro Author(s): Koro
396.10
singular measure
Two measures µ and ν in a measurable space (Ω, A) are called singular if there exist two disjoint sets A and B in A such that A ∪ B = Ω and µ(B) = ν(A) = 0. This is denoted by µ ⊥ ν. Version: 4 Owner: Koro Author(s): Koro
Chapter 397 28A15 – Abstract diﬀerentiation theory, diﬀerentiation of set functions
397.1 Hardy-Littlewood maximal theorem
There is a constant K > 0 such that for each Lebesgue integrable function $f \in L^1(\mathbb{R}^n)$, and each t > 0,
\[ m(\{x : Mf(x) > t\}) \le \frac{K}{t} \|f\|_1 = \frac{K}{t} \int_{\mathbb{R}^n} |f(x)| \, dx, \]
where Mf is the Hardy-Littlewood maximal function of f. Remark. The theorem holds for the constant $K = 3^n$. Version: 1 Owner: Koro Author(s): Koro
397.2
Lebesgue diﬀerentiation theorem
Let f be a locally integrable function on $\mathbb{R}^n$ with Lebesgue measure m, i.e. $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Lebesgue’s differentiation theorem basically says that for almost every x, the averages
\[ \frac{1}{m(Q)} \int_Q |f(y) - f(x)| \, dy \]
converge to 0 when Q is a cube containing x and m(Q) → 0. Formally, this means that there is a set $N \subset \mathbb{R}^n$ with m(N) = 0, such that for every x ∉ N and ε > 0, there exists δ > 0 such that, for each cube Q with x ∈ Q and m(Q) < δ, we have
\[ \frac{1}{m(Q)} \int_Q |f(y) - f(x)| \, dy < \varepsilon. \]
For n = 1, this can be restated as an analogue of the fundamental theorem of calculus for Lebesgue integrals. Given an $x_0 \in \mathbb{R}$,
\[ \frac{d}{dx} \int_{x_0}^{x} f(t) \, dt = f(x) \]
for almost every x. Version: 6 Owner: Koro Author(s): Koro
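A rough numerical sketch of the averages in the theorem, for n = 1 and intervals centered at x, is given below. The step function, the midpoint rule, and the sample radii are all assumptions chosen for illustration; the point x = 0, where the function jumps, belongs to the exceptional null set N.

```python
def average_deviation(f, x, h, samples=10_000):
    """Midpoint-rule estimate of (1/2h) * integral over [x-h, x+h] of |f(y) - f(x)| dy."""
    step = 2 * h / samples
    total = 0.0
    for k in range(samples):
        y = x - h + (k + 0.5) * step
        total += abs(f(y) - f(x))
    return total * step / (2 * h)

def f(y):
    """Step function, discontinuous only at 0."""
    return 1.0 if y >= 0 else 0.0

# Averages shrink to 0 at a point of continuity ...
at_continuity_point = [average_deviation(f, 0.5, h) for h in (0.4, 0.1, 0.01)]
# ... but stay near 1/2 at the jump, however small the interval.
at_jump = [average_deviation(f, 0.0, h) for h in (0.4, 0.1, 0.01)]
```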
397.3
Radon-Nikodym theorem
Let µ and ν be two σ-finite measures on the same measurable space (Ω, S), such that ν ≪ µ (i.e. ν is absolutely continuous with respect to µ). Then there exists a measurable function f, which is nonnegative and finite, such that for each A ∈ S,
\[ \nu(A) = \int_A f \, d\mu. \]
This function is unique (any other function satisfying these conditions is equal to f µ-almost everywhere), and it is called the Radon-Nikodym derivative of ν with respect to µ, denoted by $f = \frac{d\nu}{d\mu}$. Remark. The theorem also holds if ν is a signed measure. Even if ν is not σ-finite the theorem holds, with the exception that f is not necessarily finite.
Some properties of the Radon-Nikodym derivative Let ν, µ, and λ be σ-finite measures in (Ω, S).
1. If ν ≪ λ and µ ≪ λ, then
\[ \frac{d(\nu + \mu)}{d\lambda} = \frac{d\nu}{d\lambda} + \frac{d\mu}{d\lambda} \quad \mu\text{-almost everywhere;} \]
2. If ν ≪ µ ≪ λ, then
\[ \frac{d\nu}{d\lambda} = \frac{d\nu}{d\mu} \frac{d\mu}{d\lambda} \quad \mu\text{-almost everywhere;} \]
3. If µ ≪ λ and g is a µ-integrable function, then
\[ \int_\Omega g \, d\mu = \int_\Omega g \frac{d\mu}{d\lambda} \, d\lambda; \]
4. If µ ≪ ν and ν ≪ µ, then
\[ \frac{d\mu}{d\nu} = \left( \frac{d\nu}{d\mu} \right)^{-1}. \]
Version: 5 Owner: Koro Author(s): Koro
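On a finite set with point masses, the Radon-Nikodym derivative is simply a ratio of masses, which makes the defining identity and the chain rule (property 2) easy to check. The measures below and the helper names are hypothetical; setting the density to 0 where the reference measure vanishes is a convention of this sketch (any value would do on a null set).

```python
from fractions import Fraction as F

# Point masses of three measures on Omega = {0, 1, 2}; lam charges every point.
lam = {0: F(1), 1: F(2), 2: F(4)}
mu = {0: F(1, 2), 1: F(0), 2: F(2)}    # mu << lam
nu = {0: F(3), 1: F(0), 2: F(1)}       # nu << mu (nu vanishes where mu does)

def derivative(top, bottom):
    """Pointwise density d(top)/d(bottom); 0 where bottom has no mass."""
    return {x: (top[x] / bottom[x] if bottom[x] != 0 else F(0)) for x in bottom}

def integrate(g, measure, A):
    """Integral of g over A with respect to a point-mass measure."""
    return sum(g[x] * measure[x] for x in A)

dnu_dmu = derivative(nu, mu)
dmu_dlam = derivative(mu, lam)
dnu_dlam = derivative(nu, lam)
```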
397.4
integral depending on a parameter
Suppose (E, B, µ) is a measure space, suppose I is an open interval in R, and suppose we are given a function
\[ f : E \times I \to \overline{\mathbb{R}}, \qquad (x, t) \mapsto f(x, t), \]
where $\overline{\mathbb{R}}$ is the extended real numbers. Further, suppose that for each t ∈ I, the mapping x ↦ f(x, t) is in L1(E). (Here, L1(E) is the set of measurable functions $f : E \to \overline{\mathbb{R}}$ with finite Lebesgue integral; $\int_E |f(x)| \, d\mu < \infty$.) Then we can define a function F : I → R by
\[ F(t) = \int_E f(x, t) \, d\mu. \]
Continuity of F Let t0 ∈ I. In addition to the above, suppose:
1. For almost all x ∈ E, the mapping t ↦ f(x, t) is continuous at t = t0.
2. There is a function g ∈ L1(E) such that for almost all x ∈ E, |f(x, t)| ≤ g(x) for all t ∈ I.
Then F is continuous at t0.
Differentiation under the integral sign Suppose that the assumptions given in the introduction hold, and suppose:
1. For almost all x ∈ E, the mapping t ↦ f(x, t) is differentiable for all t ∈ I.
2. There is a function g ∈ L1(E) such that for almost all x ∈ E, $\left| \frac{d}{dt} f(x, t) \right| \le g(x)$ for all t ∈ I.
Then F is differentiable on I, $\frac{d}{dt} f(x, t)$ is in L1(E), and for all t ∈ I,
\[ \frac{d}{dt} F(t) = \int_E \frac{d}{dt} f(x, t) \, d\mu. \qquad (397.4.1) \]
The above results can be found in [1, 1].
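Formula (397.4.1) can be sanity-checked numerically. The sketch below takes E = [0, 1] with Lebesgue measure and f(x, t) = exp(−t x²), for which |d/dt f| ≤ x² ≤ 1 supplies the dominating function g. The integrand, the midpoint rule standing in for the Lebesgue integral, and the step size are all assumptions chosen for illustration.

```python
import math

def integrate(h, a, b, n=2000):
    """Midpoint rule on [a, b] (a stand-in for the Lebesgue integral)."""
    step = (b - a) / n
    return sum(h(a + (k + 0.5) * step) for k in range(n)) * step

def f(x, t):
    return math.exp(-t * x * x)

def df_dt(x, t):
    """Partial derivative of f with respect to t; bounded by g(x) = 1 on [0, 1]."""
    return -x * x * math.exp(-t * x * x)

def F(t):
    return integrate(lambda x: f(x, t), 0.0, 1.0)

t0, h = 1.0, 1e-5
# Left side of (397.4.1): derivative of F, by a central difference.
finite_difference = (F(t0 + h) - F(t0 - h)) / (2 * h)
# Right side of (397.4.1): integral of the derivative.
integral_of_derivative = integrate(lambda x: df_dt(x, t0), 0.0, 1.0)
```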
REFERENCES
1. F. Jones, Lebesgue Integration on Euclidean Spaces, Jones and Bartlett Publishers, 1993. 2. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.
Version: 1 Owner: matte Author(s): matte
Chapter 398 28A20 – Measurable and nonmeasurable functions, sequences of measurable functions, modes of convergence
398.1 Egorov’s theorem
Let (X, S, µ) be a measure space, and let E be a subset of X of ﬁnite measure. If fn is a sequence of measurable functions converging to f almost everywhere, then for each δ > 0 there exists a set Eδ such that µ(Eδ ) < δ and fn → f uniformly on E − Eδ . Version: 2 Owner: Koro Author(s): Koro
398.2
Fatou’s lemma
If f1, f2, . . . is a sequence of nonnegative measurable functions in a measure space X, then
\[ \int_X \liminf_{n\to\infty} f_n \le \liminf_{n\to\infty} \int_X f_n. \]
Version: 3 Owner: Koro Author(s): Koro
398.3
Fatou-Lebesgue theorem
Let X be a measure space. If Φ is a measurable function with $\int_X \Phi < \infty$, and if f1, f2, . . . is a sequence of measurable functions such that |fn| ≤ Φ for each n, then
\[ g = \liminf_{n\to\infty} f_n \quad \text{and} \quad h = \limsup_{n\to\infty} f_n \]
are both integrable, and
\[ -\infty < \int_X g \le \liminf_{n\to\infty} \int_X f_n \le \limsup_{n\to\infty} \int_X f_n \le \int_X h < \infty. \]
Version: 3 Owner: Koro Author(s): Koro
398.4
dominated convergence theorem
Let X be a measure space, and let Φ, f1, f2, . . . be measurable functions such that $\int_X \Phi < \infty$ and |fn| ≤ Φ for each n. If fn → f almost everywhere, then f is integrable and
\[ \lim_{n\to\infty} \int_X f_n = \int_X f. \]
This theorem is a corollary of the Fatou-Lebesgue theorem. A possible generalization is that if {fr : r ∈ R} is a family of measurable functions such that |fr| ≤ Φ for each r ∈ R and fr → f as r → 0, then f is integrable and
\[ \lim_{r\to 0} \int_X f_r = \int_X f. \]
Version: 8 Owner: Koro Author(s): Koro
398.5
measurable function
Let $f : X \to \overline{\mathbb{R}}$ be a function defined on a measure space X. We say that f is measurable if {x ∈ X : f(x) > a} is a measurable set for all a ∈ R. Version: 5 Owner: vypertd Author(s): vypertd
398.6
monotone convergence theorem
Let X be a measure space, and let 0 ≤ f1 ≤ f2 ≤ · · · be a monotone increasing sequence of nonnegative measurable functions. Let f be the function defined almost everywhere by $f(x) = \lim_{n\to\infty} f_n(x)$. Then f is measurable, and
\[ \lim_{n\to\infty} \int_X f_n = \int_X f. \]
Remark. This theorem is the ﬁrst of several theorems which allow us to “exchange integration and limits”. It requires the use of the Lebesgue integral: with the Riemann integral, we cannot even formulate the theorem, lacking, as we do, the concept of “almost everywhere”. For instance, the characteristic function of the rational numbers in [0, 1] is not Riemann integrable, despite being the limit of an increasing sequence of Riemann integrable functions. Version: 5 Owner: Koro Author(s): Koro, ariels
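A small exact example of the theorem, under assumptions chosen for illustration: take X = {1, 2, 3, ...} with counting measure and let f_n(k) = 2^(−k) for k ≤ n and 0 otherwise. The f_n increase pointwise to f(k) = 2^(−k), whose integral (the sum of the geometric series) is 1, and the integrals of the f_n increase to that value.

```python
from fractions import Fraction

def f(n, k):
    """f_n(k) = 2**-k for 1 <= k <= n, else 0; increases pointwise in n."""
    return Fraction(1, 2 ** k) if 1 <= k <= n else Fraction(0)

def integral_f(n):
    """Integral of f_n against counting measure on {1, 2, 3, ...}."""
    return sum(f(n, k) for k in range(1, n + 1))

# Integrals 1/2, 3/4, 7/8, ... approach the integral of the limit, which is 1.
integrals = [integral_f(n) for n in range(1, 11)]
```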
398.7
proof of Egorov’s theorem
Let $E_{i,j} = \{x \in E : |f_j(x) - f(x)| < 1/i\}$. Since fn → f almost everywhere, there is a set S with µ(S) = 0 such that, given i ∈ N and x ∈ E − S, there is m ∈ N such that j > m implies $|f_j(x) - f(x)| < 1/i$. This can be expressed by
\[ E - S \subset \bigcup_{m \in \mathbb{N}} \bigcap_{j > m} E_{i,j}, \]
or, in other words,
\[ \bigcap_{m \in \mathbb{N}} \bigcup_{j > m} (E - E_{i,j}) \subset S. \]
Since $\{\bigcup_{j>m} (E - E_{i,j})\}_{m \in \mathbb{N}}$ is a decreasing nested sequence of sets, each of which has finite measure, and such that its intersection has measure 0, by continuity from above we know that
\[ \mu\Big(\bigcup_{j > m} (E - E_{i,j})\Big) \longrightarrow 0 \quad \text{as } m \to \infty. \]
Therefore, for each i ∈ N, we can choose $m_i$ such that
\[ \mu\Big(\bigcup_{j > m_i} (E - E_{i,j})\Big) < \frac{\delta}{2^i}. \]
Let
\[ E_\delta = \bigcup_{i \in \mathbb{N}} \bigcup_{j > m_i} (E - E_{i,j}). \]
Then
\[ \mu(E_\delta) \le \sum_{i=1}^{\infty} \mu\Big(\bigcup_{j > m_i} (E - E_{i,j})\Big) < \sum_{i=1}^{\infty} \frac{\delta}{2^i} = \delta. \]
We claim that fn → f uniformly on E − Eδ. In fact, given ε > 0, choose n such that 1/n < ε. If x ∈ E − Eδ, we have
\[ x \in \bigcap_{i \in \mathbb{N}} \bigcap_{j > m_i} E_{i,j}, \]
which in particular implies that, if $j > m_n$, $x \in E_{n,j}$; that is, $|f_j(x) - f(x)| < 1/n < \varepsilon$. Hence, for each ε > 0 there is N (which is given by $m_n$ above) such that j > N implies $|f_j(x) - f(x)| < \varepsilon$ for each x ∈ E − Eδ, as required. This completes the proof. Version: 3 Owner: Koro Author(s): Koro
398.8
proof of Fatou’s lemma
Let $f(x) = \liminf_{n\to\infty} f_n(x)$ and let $g_n(x) = \inf_{k \ge n} f_k(x)$ so that we have
\[ f(x) = \sup_n g_n(x). \]
As gn is an increasing sequence of measurable nonnegative functions we can apply the monotone convergence theorem to obtain
\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X g_n \, d\mu. \]
On the other hand, since gn ≤ fn, we conclude by observing
\[ \lim_{n\to\infty} \int_X g_n \, d\mu = \liminf_{n\to\infty} \int_X g_n \, d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu. \]
Version: 1 Owner: paolini Author(s): paolini
398.9
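proof of Fatou-Lebesgue theorem
By Fatou’s lemma, applied to the nonnegative functions fn + Φ, we have
\[ \int_X g \le \liminf_{n\to\infty} \int_X f_n \]
and, applying it to the nonnegative functions Φ − fn (recall that lim sup f = − lim inf −f),
\[ \limsup_{n\to\infty} \int_X f_n \le \int_X h. \]
On the other hand by the properties of lim inf and lim sup we have g ≥ −Φ and h ≤ Φ, and hence
\[ \int_X g \ge \int_X -\Phi > -\infty, \qquad \int_X h \le \int_X \Phi < +\infty. \]
Version: 1 Owner: paolini Author(s): paolini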
398.10
proof of dominated convergence theorem
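It is not difficult to prove that f is measurable. In fact we can write
\[ f(x) = \sup_n \inf_{k \ge n} f_k(x) \]
and we know that measurable functions are closed under the sup and inf operation. Consider the sequence $g_n(x) = 2\Phi(x) - |f(x) - f_n(x)|$. Clearly gn are nonnegative functions since |f − fn| ≤ 2Φ. So, applying Fatou’s lemma, we obtain
\[ \limsup_{n\to\infty} \int_X |f - f_n| \, d\mu = -\liminf_{n\to\infty} \int_X -|f - f_n| \, d\mu \]
\[ = \int_X 2\Phi \, d\mu - \liminf_{n\to\infty} \int_X \big(2\Phi - |f - f_n|\big) \, d\mu \]
\[ \le \int_X 2\Phi \, d\mu - \int_X \big(2\Phi - \limsup_{n\to\infty} |f - f_n|\big) \, d\mu \]
\[ = \int_X 2\Phi \, d\mu - \int_X 2\Phi \, d\mu = 0. \]
Hence $\int_X |f - f_n| \, d\mu \to 0$, and since $\left| \int_X f_n \, d\mu - \int_X f \, d\mu \right| \le \int_X |f - f_n| \, d\mu$, the integrals of the fn converge to the integral of f. Version: 1 Owner: paolini Author(s): paolini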
398.11
proof of monotone convergence theorem
It is enough to prove the following
Theorem 7. Let (X, µ) be a measure space and let $f_k : X \to \mathbb{R} \cup \{+\infty\}$ be a monotone increasing sequence of positive measurable functions (i.e. 0 ≤ f1 ≤ f2 ≤ . . .). Then $f(x) = \lim_{k\to\infty} f_k(x)$ is measurable and
\[ \lim_{k\to\infty} \int_X f_k \, d\mu = \int_X f \, d\mu. \]
First of all by the monotonicity of the sequence we have
\[ f(x) = \sup_k f_k(x) \]
hence we know that f is measurable. Moreover, since fk ≤ f for all k, by the monotonicity of the integral, we immediately get
\[ \sup_k \int_X f_k \, d\mu \le \int_X f \, d\mu. \]
So take any simple measurable function s such that 0 ≤ s ≤ f. Given also 0 < α < 1 define
\[ E_k = \{x \in X : f_k(x) \ge \alpha s(x)\}. \]
The sequence Ek is an increasing sequence of measurable sets. Moreover the union of all Ek is the whole space X: if s(x) = 0 then x belongs to every Ek, and otherwise $\lim_{k\to\infty} f_k(x) = f(x) \ge s(x) > \alpha s(x)$. Moreover it holds
\[ \int_X f_k \, d\mu \ge \int_{E_k} f_k \, d\mu \ge \alpha \int_{E_k} s \, d\mu. \]
Since s is a simple measurable function, it is easy to check that $E \mapsto \int_E s \, d\mu$ is a measure, and hence, by continuity from below,
\[ \sup_k \int_X f_k \, d\mu \ge \alpha \int_X s \, d\mu. \]
But this last inequality holds for every α < 1 and for all simple measurable functions s with s ≤ f. Hence by the definition of Lebesgue integral
\[ \sup_k \int_X f_k \, d\mu \ge \int_X f \, d\mu \]
which completes the proof. Version: 1 Owner: paolini Author(s): paolini
Chapter 399 28A25 – Integration with respect to measures and other set functions
399.1 L∞(X, dµ)
The L∞ space, L∞(X, dµ), is a vector space consisting of equivalence classes of functions f : X → C with norm given by
\[ \|f\|_\infty = \operatorname{ess\,sup} |f(t)|, \]
the essential supremum of |f|. Additionally, we require that $\|f\|_\infty < \infty$.
The equivalence classes of L∞ (X, dµ) are given by saying that f, g : X → C are equivalent iﬀ f and g diﬀer on a set of µ measure zero. Version: 3 Owner: ack Author(s): bbukh, ack, apmxi
399.2
Hardy-Littlewood maximal operator
The Hardy-Littlewood maximal operator in $\mathbb{R}^n$ is an operator defined on $L^1_{\mathrm{loc}}(\mathbb{R}^n)$ (the space of locally integrable functions in $\mathbb{R}^n$ with the Lebesgue measure) which maps each locally integrable function f to another function Mf, defined for each $x \in \mathbb{R}^n$ by
\[ Mf(x) = \sup_Q \frac{1}{m(Q)} \int_Q |f(y)| \, dy, \]
where the supremum is taken over all cubes Q containing x. This function is lower semicontinuous (and hence measurable), and it is called the Hardy-Littlewood maximal function of f.
The operator M is sublinear, which means that
\[ M(af + bg) \le |a| Mf + |b| Mg \]
for each pair of locally integrable functions f, g and scalars a, b. Version: 3 Owner: Koro Author(s): Koro
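To get a feel for the operator, here is a discrete one-dimensional analogue, offered only as a sketch: a function is replaced by a finite list of samples, cubes by windows of consecutive indices, and the integral average by an arithmetic mean. The name `maximal` and the sample data are hypothetical.

```python
def maximal(values):
    """Discrete analogue of Mf: for each index i, the largest average of
    |values| over index windows [a, b] that contain i."""
    n = len(values)
    prefix = [0.0]                       # prefix sums of absolute values
    for v in values:
        prefix.append(prefix[-1] + abs(v))
    result = []
    for i in range(n):
        best = 0.0
        for a in range(i + 1):
            for b in range(i, n):
                best = max(best, (prefix[b + 1] - prefix[a]) / (b - a + 1))
        result.append(best)
    return result

f = [0.0, 4.0, 0.0, 0.0]
g = [1.0, -1.0, 2.0, 0.0]
Mf, Mg = maximal(f), maximal(g)
# Maximal function of the combination 2f + 3g, for the sublinearity check.
M_sum = maximal([2 * x + 3 * y for x, y in zip(f, g)])
```

The brute-force search is cubic in the number of samples, which is fine for a demonstration but not for real data.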
399.3
Lebesgue integral
The integral of a measurable function $f : X \to \mathbb{R} \cup \{\pm\infty\}$ on a measure space (X, B, µ) is written
\[ \int_X f \, d\mu \quad \text{or just} \quad \int f. \qquad (399.3.1) \]
It is defined via the following steps:
• If $f = \chi_A$ is the characteristic function of a set A ∈ B, then set
\[ \int_X \chi_A \, d\mu := \mu(A). \qquad (399.3.2) \]
• If f is a simple function (i.e. if f can be written as
\[ f = \sum_{k=1}^{n} c_k \chi_{A_k}, \qquad c_k \in \mathbb{R} \qquad (399.3.3) \]
for some finite collection Ak ∈ B), then define
\[ \int_X f \, d\mu := \sum_{k=1}^{n} c_k \int_X \chi_{A_k} \, d\mu = \sum_{k=1}^{n} c_k \mu(A_k). \qquad (399.3.4) \]
• If f is a nonnegative measurable function (possibly attaining the value ∞ at some points), then we define
\[ \int_X f \, d\mu := \sup \left\{ \int_X h \, d\mu : h \text{ is simple and } h(x) \le f(x) \text{ for all } x \in X \right\}. \qquad (399.3.5) \]
• For any measurable function f (possibly attaining the values ∞ or −∞ at some points), write $f = f^+ - f^-$ where
\[ f^+ := \max(f, 0) \quad \text{and} \quad f^- := \max(-f, 0), \qquad (399.3.6) \]
and define the integral of f as
\[ \int_X f \, d\mu := \int_X f^+ \, d\mu - \int_X f^- \, d\mu, \qquad (399.3.7) \]
provided that $\int_X f^+ \, d\mu$ and $\int_X f^- \, d\mu$ are not both ∞.
If µ is Lebesgue measure and X is any interval in Rn then the integral is called the Lebesgue integral. If the Lebesgue integral of a function f on a set A exists, f is said to be Lebesgue integrable. The Lebesgue integral equals the Riemann integral everywhere the latter is deﬁned; the advantage to the Lebesgue integral is that many Lebesgueintegrable functions are not Riemannintegrable. For example, the Riemann integral of the characteristic function of the rationals in [0, 1] is undeﬁned, while the Lebesgue integral of this function is simply the measure of the rationals in [0, 1], which is 0. Version: 12 Owner: djao Author(s): djao, drummond
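The construction of the integral can be followed step by step on a finite measure space, where every function is simple. The sketch below assumes a six-point space with equal point masses (a fair die); all names are hypothetical.

```python
from fractions import Fraction

# A measure on X = {1, ..., 6} given by point masses (a fair die).
mass = {x: Fraction(1, 6) for x in range(1, 7)}

def mu(A):
    """Measure of a subset A of X."""
    return sum(mass[x] for x in A)

def integrate_simple(terms):
    """Integral of f = sum_k c_k * chi_{A_k}, given as (c_k, A_k) pairs."""
    return sum(c * mu(A) for c, A in terms)

def integrate(f):
    """Integral of a general f via f = f+ - f- (both parts are simple here)."""
    plus = [(max(f(x), 0), {x}) for x in mass]
    minus = [(max(-f(x), 0), {x}) for x in mass]
    return integrate_simple(plus) - integrate_simple(minus)

# f = 3 on the even faces and -1 on the odd faces.
even_odd = [(3, {2, 4, 6}), (-1, {1, 3, 5})]
```

Both routes give the same value for the same function, and integrating the identity function returns the expected value of a die roll.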
Chapter 400 28A60 – Measures on Boolean rings, measure algebras
400.1 σ-algebra
Let X be a set. A σ-algebra is a collection M of subsets of X such that
• X ∈ M.
• If A ∈ M then X − A ∈ M.
• If A1, A2, A3, . . . is a countable subcollection of M, that is, Aj ∈ M for j = 1, 2, 3, . . . (the subcollection can be finite) then the union of all of them is also in M:
\[ \bigcup_{j=1}^{\infty} A_j \in M. \]
Version: 3 Owner: drini Author(s): drini, apmxi
400.2
σ-algebra
Given a set E, a sigma algebra (or σ-algebra) in E is a collection B(E) of subsets of E such that:
• ∅ ∈ B(E)
• Any countable union of elements of B(E) is in B(E)
• The complement of any element of B(E) in E is in B(E)
Given any collection C of subsets of E, the σ-algebra generated by C is defined to be the smallest σ-algebra in E containing C. Version: 5 Owner: djao Author(s): djao
400.3
algebra
Given a set E, an algebra in E is a collection B(E) of subsets of E such that:
• ∅ ∈ B(E)
• Any finite union of elements of B(E) is in B(E)
• The complement of any element of B(E) in E is in B(E)
Given any collection C of subsets of E, the algebra generated by C is defined to be the smallest algebra in E containing C. Version: 2 Owner: djao Author(s): djao
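For a finite set E the generated algebra can be computed by repeatedly closing under complements and pairwise unions until nothing new appears; on a finite set this is also the generated σ-algebra. The function name and the sample collection below are assumptions for illustration.

```python
def generated_algebra(E, C):
    """Smallest family containing C and the empty set that is closed under
    complement in E and finite unions (E is assumed finite)."""
    E = frozenset(E)
    family = {frozenset()} | {frozenset(A) for A in C}
    while True:
        # One closure round: all complements and all pairwise unions.
        new = {E - A for A in family} | {A | B for A in family for B in family}
        if new <= family:
            return family
        family |= new

# The atoms turn out to be {1}, {2} and {3, 4}, so there are 2**3 sets.
algebra = generated_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

Closure under intersections comes for free, since an intersection is the complement of a union of complements.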
400.4
measurable set (for outer measure)
Definition [1, 2, 1] Let µ∗ be an outer measure on a set X. A set E ⊂ X is said to be measurable, or µ∗-measurable, if for all A ⊂ X, we have
\[ \mu^*(A) = \mu^*(A \cap E) + \mu^*(A \cap E^c). \qquad (400.4.1) \]
Remark If A, E ⊂ X, we have, from the properties of the outer measure,
\[ \mu^*(A) = \mu^*\big(A \cap (E \cup E^c)\big) = \mu^*\big((A \cap E) \cup (A \cap E^c)\big) \le \mu^*(A \cap E) + \mu^*(A \cap E^c). \]
Hence equation (400.4.1) is equivalent to the inequality [1, 2, 1]
\[ \mu^*(A) \ge \mu^*(A \cap E) + \mu^*(A \cap E^c). \]
Of course, this inequality is trivially satisfied if µ∗(A) = ∞. Thus a set E ⊂ X is µ∗-measurable in X if and only if the above inequality holds for all A ⊂ X for which µ∗(A) < ∞ [1].
Theorem [Carathéodory’s theorem] [1, 2, 1] Suppose µ∗ is an outer measure on a set X, and suppose M is the set of all µ∗-measurable sets in X. Then M is a σ-algebra, and µ∗ restricted to M is a measure (on M).
Example Let µ∗ be an outer measure on a set X.
1. Any null set (a set E with µ∗(E) = 0) is measurable. Indeed, suppose µ∗(E) = 0, and A ⊂ X. Then, since A ∩ E ⊂ E, we have µ∗(A ∩ E) = 0, and since $A \cap E^c \subset A$, we have $\mu^*(A) \ge \mu^*(A \cap E^c)$, so
\[ \mu^*(A) \ge \mu^*(A \cap E^c) = \mu^*(A \cap E) + \mu^*(A \cap E^c). \]
Thus E is measurable.
2. If $\{B_i\}_{i=1}^{\infty}$ is a countable collection of null sets, then $\bigcup_{i=1}^{\infty} B_i$ is a null set. This follows directly from the last property of the outer measure.
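The measurability condition (400.4.1) can be tested exhaustively on a small finite set. The sketch below does so for two outer measures that are assumptions of this example: the one equal to 1 on every nonempty set, and counting measure. For the first, only ∅ and X pass the test; for the second, every subset does.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of a finite set, as frozensets."""
    items = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def measurable_sets(X, mu_star):
    """All E with mu*(A) = mu*(A ∩ E) + mu*(A - E) for every subset A of X."""
    subsets = powerset(frozenset(X))
    return {E for E in subsets
            if all(mu_star(A) == mu_star(A & E) + mu_star(A - E) for A in subsets)}

X = {1, 2, 3}
trivial = measurable_sets(X, lambda E: 1 if E else 0)   # 1 on nonempty sets
counting = measurable_sets(X, len)                      # counting measure
```

In both cases the resulting family is a σ-algebra, in line with Carathéodory’s theorem.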
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional analysis, Plenum press, 1978. 2. A. Friedman, Foundations of Modern Analysis, Dover publications, 1982. 3. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.
Version: 1 Owner: matte Author(s): matte
Chapter 401 28A75 – Length, area, volume, other geometric measure theory
401.1 Lebesgue density theorem
Let µ be the Lebesgue measure on R. If µ(Y) > 0, then there exists X ⊂ Y such that µ(Y − X) = 0 and for all x ∈ X
\[ \lim_{\epsilon \to +0} \frac{\mu(X \cap [x - \epsilon, x + \epsilon])}{2\epsilon} = 1. \]
Version: 2 Owner: bbukh Author(s): bbukh
Chapter 402 28A80 – Fractals
402.1 Cantor set
The Cantor set C is the canonical example of an uncountable set of measure zero. We construct C as follows.
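Begin with the unit interval C0 = [0, 1], and remove the open segment $R_1 := (\frac{1}{3}, \frac{2}{3})$ from the middle. We define C1 as the two remaining pieces
\[ C_1 := C_0 \setminus R_1 = \left[0, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, 1\right] \qquad (402.1.1) \]
Now repeat the process on each remaining segment, removing the open set
\[ R_2 := \left(\tfrac{1}{9}, \tfrac{2}{9}\right) \cup \left(\tfrac{7}{9}, \tfrac{8}{9}\right) \qquad (402.1.2) \]
to form the four-piece set
\[ C_2 := C_1 \setminus R_2 = \left[0, \tfrac{1}{9}\right] \cup \left[\tfrac{2}{9}, \tfrac{1}{3}\right] \cup \left[\tfrac{2}{3}, \tfrac{7}{9}\right] \cup \left[\tfrac{8}{9}, 1\right] \qquad (402.1.3) \]
Continue the process, forming C3, C4, . . . Note that Ck has $2^k$ pieces.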
Figure 402.1: The sets C0 through C5 in the construction of the Cantor set
Also note that at each step, the endpoints of each closed segment will stay in the set forever; e.g., the point 2/3 isn’t touched as we remove sets.
The Cantor set is defined as
\[ C := \bigcap_{k=1}^{\infty} C_k = C_0 \setminus \bigcup_{n=1}^{\infty} R_n \qquad (402.1.4) \]
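The stages of the construction are easy to generate exactly with rational arithmetic. The following sketch (function names are hypothetical) builds the closed intervals making up each C_k by removing open middle thirds.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third of every closed interval [a, b]."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))      # left closed third
        result.append((b - third, b))      # right closed third
    return result

def cantor_stage(k):
    """The closed intervals making up C_k, starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = cantor_step(intervals)
    return intervals

C2 = cantor_stage(2)
# Total length of C_k for k = 0, ..., 5.
lengths = [sum(b - a for a, b in cantor_stage(k)) for k in range(6)]
```

Each stage doubles the number of pieces and multiplies the total length by 2/3, so the lengths tend to zero.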
Cardinality of the Cantor set To establish cardinality, we want a bijection between some set whose cardinality we know (e.g. Z, R) and the points in the Cantor set. We’ll be aggressive and try the reals. Start at C1, which has two pieces. Mark the left-hand segment “0” and the right-hand segment “1”. Then continue to C2, and consider only the leftmost pair. Again, mark the segments “0” and “1”, and do the same for the rightmost pair. Keep doing this all the way down the Ck, starting at the left side and marking the segments 0, 1, 0, 1, 0, 1 as you encounter them, until you’ve labeled the entire Cantor set. Now, pick a path through the tree starting at C0 and going left-left-right-left. . . and so on. Mark a decimal point for C0, and record the zeros and ones as you proceed. Each path has a unique number based on your decision at each step. For example, the figure represents your choice of left-left-right-left-right at the first five steps, representing the number beginning 0.00101...
Figure 402.2: One possible path through C5: 0.00101
Every point in the Cantor set will have a unique address dependent solely on the pattern of lefts and rights, 0’s and 1’s, required to reach it. Each point thus has a unique number, the real number whose binary expansion is that sequence of zeros and ones. Every infinite stream of binary digits can be found among these paths, and in fact the binary expansion of every real number in [0, 1] is a path to a unique point in the Cantor set. Some caution is justified, as two binary expansions may refer to the same real number; for example, 0.011111 . . . = 0.100000 . . . = 1/2. However, each one of these duplicates must correspond to a rational number. To see this, suppose we have a number x in [0, 1] whose binary expansion becomes all zeros or all ones at digit k (both are the same number, remember). Then we can multiply that number by $2^k$ and get an integer, so it must be a (binary) rational number. There are only countably many rationals, and not even all of those are the double-covered numbers we’re worried about (see, e.g., 1/3 = 0.0101010 . . .), so we have at most countably many duplicated reals. Thus, the cardinality of the Cantor set is equal to that of the reals. (If we want to be really picky, map (0, 1) to the reals with, say, f(x) = 1/x + 1/(x − 1), and the end points really don’t matter much.)
Return, for a moment, to the earlier observation that numbers such as 1/3, 2/9, the endpoints of deleted intervals, are themselves never deleted. In particular, consider the first deleted interval: the ternary expansions of its constituent numbers are precisely those that begin 0.1, and proceed thence with at least one nonzero “ternary” digit (just digit for us) further along. Note also that the point 1/3, with ternary expansion 0.1, may also be written $0.0\dot{2}$ (or $0.0\bar{2}$), which has no digits 1. Similar descriptions apply to further deleted intervals. The result is that the Cantor set is precisely those numbers in the set [0, 1] that have a ternary expansion containing no digits 1.
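The ternary description gives a simple membership test for rational numbers, sketched below under stated assumptions: the test follows x through the first `depth` stages of the construction, so strictly speaking it decides membership in C_depth; for rationals with small denominators, whose digits repeat quickly, that already settles membership in C. The function name and the default depth are hypothetical.

```python
from fractions import Fraction

def in_cantor_set(x, depth=60):
    """Follow x through the construction: at each stage keep the left or
    right third and rescale to [0, 1]; fail if x lands in a removed middle."""
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    for _ in range(depth):
        if x <= Fraction(1, 3):
            x = 3 * x            # ternary digit 0
        elif x >= Fraction(2, 3):
            x = 3 * x - 2        # ternary digit 2
        else:
            return False         # every expansion of x has a digit 1 here
    return True
```

For instance, 1/4 = 0.020202... in ternary passes the test even though it is not an endpoint of any deleted interval, while 1/2 = 0.111... fails at the first stage.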
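Measure of the Cantor set Let µ be Lebesgue measure. The measures of the sets Rk that we remove during the construction of the Cantor set are
\[ \mu(R_1) = \frac{2}{3} - \frac{1}{3} = \frac{1}{3} \qquad (402.1.5) \]
\[ \mu(R_2) = \left(\frac{2}{9} - \frac{1}{9}\right) + \left(\frac{8}{9} - \frac{7}{9}\right) = \frac{2}{9} \qquad (402.1.6) \]
\[ \vdots \qquad (402.1.7) \]
\[ \mu(R_1 \cup \dots \cup R_k) = \sum_{n=1}^{k} \mu(R_n) = \sum_{n=1}^{k} \frac{2^{n-1}}{3^n} \qquad (402.1.8) \]
Note that the R’s are disjoint, which will allow us to sum their measures without worry. In the limit k → ∞, this gives us
\[ \mu\Big(\bigcup_{n=1}^{\infty} R_n\Big) = \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1. \qquad (402.1.9) \]
But we have µ(C0) = 1 as well, so this means
\[ \mu(C) = \mu\Big(C_0 \setminus \bigcup_{n=1}^{\infty} R_n\Big) = \mu(C_0) - \sum_{n=1}^{\infty} \frac{2^{n-1}}{3^n} = 1 - 1 = 0. \qquad (402.1.10) \]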
Thus we have seen that the measure of C is zero (though see below for more on this topic). How many points are there in C? Lots, as we shall see. So we have a set of measure zero (very tiny) with uncountably many points (very big). This nonintuitive result is what makes Cantor sets so interesting.
Cantor sets with positive measure
Clearly, Cantor sets can be constructed for all sorts of "removals": we can remove middle halves, or thirds, or any fraction 1/r, r > 1, we like. All of these Cantor sets have measure zero, since at each step n we end up with

$L_n = \left(1 - \frac{1}{r}\right)^n$ (402.1.11)
of what we started with, and $\lim_{n\to\infty} L_n = 0$ for any r > 1. With apologies, the figure above is drawn for the case r = 2, rather than the r = 3 which seems to be the publicly favored example. However, it is possible to construct Cantor sets with positive measure as well; the key is to remove less and less as we proceed. These Cantor sets have the same "shape" (topology) as the Cantor set we first constructed, and the same cardinality, but a different "size." Again, start with the unit interval for $C_0$, and choose a number 0 < p < 1. Let

$R_1 := \left(\frac{2-p}{4}, \frac{2+p}{4}\right)$ (402.1.12)
which has length (measure) p/2. Again, define $C_1 := C_0 \setminus R_1$. Now define

$R_2 := \left(\frac{2-p}{16}, \frac{2+p}{16}\right) \cup \left(\frac{14-p}{16}, \frac{14+p}{16}\right)$ (402.1.13)

which has measure p/4. Continue as before, such that each $R_k$ has measure $p/2^k$; note again that all the $R_k$ are disjoint. The resulting Cantor set has measure

$\mu\left(C_0 \setminus \bigcup_{n=1}^{\infty} R_n\right) = 1 - \sum_{n=1}^{\infty} \mu(R_n) = 1 - \sum_{n=1}^{\infty} p\,2^{-n} = 1 - p > 0.$
Thus we have a whole family of Cantor sets of positive measure to accompany their vanishing brethren. Version: 19 Owner: drini Author(s): drini, quincynoodles, drummond
402.2
Hausdorﬀ dimension
Let Θ be a bounded subset of $\mathbb{R}^n$, and let $N_\Theta(\epsilon)$ be the minimum number of balls of radius ε required to cover Θ. Then define the Hausdorff dimension $d_H$ of Θ to be

$d_H(\Theta) := -\lim_{\epsilon \to 0} \frac{\log N_\Theta(\epsilon)}{\log \epsilon}.$

Hausdorff dimension is easy to calculate for simple objects like the Sierpinski gasket or a Koch curve. Each of these may be covered with a collection of scaled-down copies of itself. In fact, in the case of the Sierpinski gasket, one can take the individual triangles in each approximation as balls in the covering. At stage n, there are $3^n$ triangles of radius $1/2^n$, and so the Hausdorff dimension of the Sierpinski triangle is $-\frac{n \log 3}{n \log (1/2)} = \frac{\log 3}{\log 2}$.
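As a quick numerical illustration of the covering formula, the sketch below evaluates the ratio $-\log N / \log \epsilon$ for the Sierpinski gasket counts quoted above ($3^n$ pieces of radius $2^{-n}$). The function name is hypothetical, and the code only evaluates the ratio at finite stages; it does not compute a limit.

```python
import math


def covering_ratio(count: int, radius: float) -> float:
    """Return -log(count) / log(radius), the quantity whose limit defines d_H."""
    return -math.log(count) / math.log(radius)


for n in (1, 5, 20):
    # For the gasket the ratio is the same at every stage: log 3 / log 2.
    print(n, covering_ratio(3 ** n, 0.5 ** n))

print(math.log(3) / math.log(2))
```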
From some notes from Koro
This definition can be extended to a general metric space X with distance function d. Define the diameter $|C|$ of a bounded subset C of X to be $\sup_{x,y \in C} d(x, y)$, and define a countable r-cover of X to be a collection of subsets $C_i$ of X, each of diameter less than r, indexed by some countable set I, such that $X = \bigcup_{i \in I} C_i$. We also define the handy function

$H_r^D(X) = \inf \sum_{i \in I} |C_i|^D$

where the infimum is over all countable r-covers of X. The Hausdorff dimension of X may then be defined as

$d_H(X) = \inf\{D \mid \lim_{r \to 0} H_r^D(X) = 0\}.$

When X is a subset of $\mathbb{R}^n$ with any restricted norm-induced metric, then this definition reduces to that given above. Version: 8 Owner: drini Author(s): drini, quincynoodles
402.3
Koch curve
A Koch curve is a fractal generated by a replacement rule. This rule is, at each step, to replace the middle 1/3 of each line segment with two sides of an equilateral triangle having sides of length equal to the replaced segment. Two applications of this rule on a single line segment give us:
To generate the Koch curve, the rule is applied indefinitely, starting from a single line segment. Note that, if the length of the initial line segment is l, the length $L_K$ of the Koch curve at the nth step will be

$L_K = \left(\frac{4}{3}\right)^n l$
This quantity increases without bound; hence the Koch curve has infinite length. However, the curve still bounds a finite area. We can prove this by noting that in each step, we add an amount of area equal to the area of all the equilateral triangles we have just created. We can bound the area of each triangle of side length s by $s^2$ (the area of a square containing the triangle). At step i there are $4^{i-1}$ new triangles, one on each segment of the previous stage, each of side length $3^{-i}$. Hence, at step n, the area $A_K$ "under" the Koch curve (assuming l = 1) satisfies

$A_K < \left(\frac{1}{3}\right)^2 + 4\left(\frac{1}{9}\right)^2 + 16\left(\frac{1}{27}\right)^2 + \cdots + 4^{n-1}\left(\frac{1}{3^n}\right)^2 = \sum_{i=1}^{n} \frac{4^{i-1}}{9^i}$

but this is a geometric series of ratio less than one, so it converges. Hence a Koch curve has infinite length and bounds a finite area. A Koch snowflake is the figure generated by applying the Koch replacement rule to an equilateral triangle indefinitely. Version: 3 Owner: akrowne Author(s): akrowne
[Figure 402.3: Sierpinski gasket stage 0, a single triangle. Figure 402.4: Stage 1, three triangles.]
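The two Koch curve estimates (unbounded length, bounded area) can be tabulated exactly. This is a minimal sketch with hypothetical function names; it assumes l = 1 and counts $4^{i-1}$ new triangles of side $3^{-i}$ at step i, each bounded in area by the square of its side.

```python
from fractions import Fraction


def koch_length(n: int, l: Fraction = Fraction(1)) -> Fraction:
    """Length of the curve after n applications of the rule: (4/3)^n * l."""
    return Fraction(4, 3) ** n * l


def koch_area_bound(n: int) -> Fraction:
    """Upper bound for the area after n steps: sum of 4^(i-1) * (3^-i)^2, i = 1..n."""
    return sum((Fraction(4 ** (i - 1), 9 ** i) for i in range(1, n + 1)), Fraction(0))


for n in (1, 2, 10, 50):
    # The length grows without bound while the area bound stays below 1/5.
    print(n, float(koch_length(n)), float(koch_area_bound(n)))
```

The bound is a geometric series with first term 1/9 and ratio 4/9, so it never exceeds (1/9)/(1 − 4/9) = 1/5.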
402.4
Sierpinski gasket
Let $S_0$ be a triangular area, and define $S_{n+1}$ to be obtained from $S_n$ by replacing each triangular area in $S_n$ with three similar and similarly oriented triangular areas, each intersecting each of the other two at exactly one vertex, and each one half the linear scale of the original in size. The limiting set as n → ∞ (alternatively, the intersection of all these sets) is a Sierpinski gasket, also known as a Sierpinski triangle. Version: 3 Owner: quincynoodles Author(s): quincynoodles
402.5
fractal
Option 1: Some equivalence class of subsets of $\mathbb{R}^n$. A usual equivalence is postulated when some generalised "distance" is zero. For example, let $F, G \subset \mathbb{R}^n$, and let d(x, y) be the usual distance ($x, y \in \mathbb{R}^n$). Define the distance D between F and G as

$D(F, G) := \sup_{f \in F} \inf_{g \in G} d(f, g) + \sup_{g \in G} \inf_{f \in F} d(f, g)$

Then in this case we have, as fractals, that Q and R are equivalent, since every real number is a limit of rationals.

Option 2: A subset of $\mathbb{R}^n$ with nonintegral Hausdorff dimension. Examples: (we think) the coast of Britain, a Koch snowflake.

Option 3: A "self-similar object". That is, one which can be covered by copies of itself using a set of (usually two or more) transformation mappings. Another way to say this would be "an object with a discrete approximate scaling symmetry." Example: A square region, a Koch curve, a fern frond. This isn't much different from Option 1 because of the collage theorem.

A cursory description of some relationships between options 2 and 3 is given towards the end of the entry on Hausdorff dimension. The use of option 1 is that it permits one to talk about how "close" two fractals are to one another. This becomes quite handy when one wants to talk about approximating fractals, especially approximating option 3 type fractals with pictures that can be drawn in finite time. A simple example: one can talk about how close one of the line drawings in the Koch curve entry is to an actual Koch curve. Version: 7 Owner: quincynoodles Author(s): quincynoodles
[Figure 402.5: Stage 2, nine triangles. Figure 402.6: Stage n, $3^n$ triangles.]
Chapter 403 28Axx – Classical measure theory
403.1 Vitali’s Theorem
There exists a set V ⊂ [0, 1] which is not Lebesgue measurable. Version: 1 Owner: paolini Author(s): paolini
403.2
proof of Vitali’s Theorem
Consider the equivalence relation in [0, 1) given by

$x \sim y \iff x - y \in \mathbb{Q}$

and let F be the family of all equivalence classes of ∼. Let V be a section of F, i.e. put in V one element for each equivalence class of ∼ (notice that we are using the axiom of choice). Given $q \in \mathbb{Q} \cap [0, 1)$ define

$V_q = ((V + q) \cap [0, 1)) \cup ((V + q - 1) \cap [0, 1))$

that is, $V_q$ is obtained by translating V by a quantity q to the right and then cutting off the piece which goes beyond the point 1 and putting it on the left, starting from 0. Now notice that given x ∈ [0, 1) there exists y ∈ V such that x ∼ y (because V is a section of ∼) and hence there exists $q \in \mathbb{Q} \cap [0, 1)$ such that $x \in V_q$. So

$\bigcup_{q \in \mathbb{Q} \cap [0,1)} V_q = [0, 1).$

Moreover all the $V_q$ are disjoint. In fact if $x \in V_q \cap V_p$ with p ≠ q, then x − q and x − p (both taken modulo 1) are both in V, which is not possible since they differ by a nonzero rational quantity q − p (or q − p ± 1).

Now if V is Lebesgue measurable, clearly the $V_q$ are also measurable and $\mu(V_q) = \mu(V)$. Moreover by the countable additivity of µ we have

$\mu([0, 1)) = \sum_{q \in \mathbb{Q} \cap [0,1)} \mu(V_q) = \sum_{q} \mu(V).$

So if µ(V) = 0 we would have µ([0, 1)) = 0, and if µ(V) > 0 we would have µ([0, 1)) = +∞. So the only possibility is that V is not Lebesgue measurable. Version: 1 Owner: paolini Author(s): paolini
Chapter 404 28B15 – Set functions, measures and integrals with values in ordered spaces
404.1 $L^p$-space
Definition Let (X, B, µ) be a measure space. The $L^p$ norm of a function f : X → R is defined as

$\|f\|_p := \left(\int_X |f|^p \, d\mu\right)^{1/p}$ (404.1.1)

when the integral exists. The set of functions with finite $L^p$ norm forms a vector space V with the usual pointwise addition and scalar multiplication of functions. In particular, the set of functions with zero $L^p$ norm forms a linear subspace of V, which for this article will be called K. We are then interested in the quotient space V/K, which consists of real functions on X with finite $L^p$ norm, identified up to equivalence almost everywhere. This quotient space is the real $L^p$ space on X.

Theorem The vector space V/K is complete with respect to the $L^p$ norm.
The space $L^\infty$. The space $L^\infty$ is somewhat special, and may be defined without explicit reference to an integral. First, the $L^\infty$ norm of f is defined to be the essential supremum of |f|:

$\|f\|_\infty := \operatorname{ess\,sup} |f| = \inf\{a \in \mathbb{R} : \mu(\{x : |f(x)| > a\}) = 0\}$ (404.1.2)

The definitions of V, K, and $L^\infty$ then proceed as above. Functions in $L^\infty$ are also called essentially bounded.

Example Let X = [0, 1] and $f(x) = \frac{1}{\sqrt{x}}$. Then $f \in L^1(X)$ but $f \notin L^2(X)$.

Version: 18 Owner: mathcam Author(s): Manoj, quincynoodles, drummond
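The example with $f(x) = 1/\sqrt{x}$ on [0, 1] can be verified by a direct computation. The following is a sketch that treats both integrals as improper integrals with respect to Lebesgue measure, cutting off at a small ε > 0 (the cutoff ε is introduced here only for the computation):

```latex
\int_0^1 \left|\frac{1}{\sqrt{x}}\right| dx
  = \lim_{\varepsilon \to 0^+} \Bigl[\, 2\sqrt{x} \,\Bigr]_{\varepsilon}^{1}
  = \lim_{\varepsilon \to 0^+} \bigl(2 - 2\sqrt{\varepsilon}\bigr) = 2 < \infty,
\qquad
\int_0^1 \left|\frac{1}{\sqrt{x}}\right|^2 dx
  = \lim_{\varepsilon \to 0^+} \Bigl[\, \ln x \,\Bigr]_{\varepsilon}^{1}
  = \lim_{\varepsilon \to 0^+} \bigl(-\ln \varepsilon\bigr) = +\infty .
```

So the $L^1$ norm of f equals 2, while the $L^2$ norm is not finite.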
404.2
locally integrable function
Definition [4, 1, 2] Suppose that U is an open set in $\mathbb{R}^n$, and f : U → C is a Lebesgue measurable function. If the Lebesgue integral $\int_K |f| \, dx$ is finite for all compact subsets K in U, then f is locally integrable. The set of all such functions is denoted by $L^1_{loc}(U)$.

Example 1. $L^1(U) \subset L^1_{loc}(U)$, where $L^1(U)$ is the set of integrable functions.

Theorem Suppose f and g are locally integrable functions on an open subset $U \subset \mathbb{R}^n$, and suppose that

$\int_U f \phi \, dx = \int_U g \phi \, dx$

for all smooth functions with compact support $\phi \in C_0^\infty(U)$. Then f = g almost everywhere.

A proof based on the Lebesgue differentiation theorem is given in [4] pp. 15. Another proof is given in [2] pp. 276.
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I (Distribution Theory and Fourier Analysis), 2nd ed., Springer-Verlag, 1990.
2. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed., John Wiley & Sons, Inc., 1999.
3. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
Version: 3 Owner: matte Author(s): matte
Chapter 405 28C05 – Integration theory via linear functionals (Radon measures, Daniell integrals, etc.), representing set functions and measures
405.1 Haar integral
Let Γ be a locally compact topological group and C be the algebra of all continuous real-valued functions on Γ with compact support. In addition we define $C^+$ to be the set of nonnegative functions that belong to C. A Haar integral for Γ is a real linear map I of C into the field of real numbers that satisfies:
• I is not the zero map
• I only takes nonnegative values on $C^+$
• I has the following property: I(γ · f) = I(f) for all elements f of C and all elements γ of Γ.

The Haar integral may be denoted in the following ways (there are also other ways):

$\int_{\gamma \in \Gamma} f(\gamma)$ or $\int_\Gamma f$ or $\int_\Gamma f \, d\gamma$ or I(f)

In order for the Haar integral to exist and to be unique, the following conditions are necessary and sufficient: that there exists a real-valued function $I^+$ on $C^+$ satisfying the following conditions:
1. (Linearity). $I^+(\lambda f + \mu g) = \lambda I^+(f) + \mu I^+(g)$ where $f, g \in C^+$ and $\lambda, \mu \in \mathbb{R}^+$.
2. (Positivity). If f(γ) ≥ 0 for all γ ∈ Γ then $I^+(f(\gamma)) \geq 0$.
3. (Translation-Invariance). $I^+(f(\delta\gamma)) = I^+(f(\gamma))$ for any fixed δ ∈ Γ and every f in $C^+$.

An additional property is that if Γ is a compact group then the Haar integral has right translation-invariance: $\int_{\gamma \in \Gamma} f(\gamma\delta) = \int_{\gamma \in \Gamma} f(\gamma)$ for any fixed δ ∈ Γ. In addition we can define the normalized Haar integral by requiring $\int_\Gamma 1 = 1$; since Γ is compact, $\int_\Gamma 1$ is finite. (The proof of existence and uniqueness of the Haar integral is presented in [PV] on page 9.)
(The information in this entry is in part quoted and paraphrased from [GSS].)
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[HG] Hochschild, G.: The Structure of Lie Groups. Holden-Day, San Francisco, 1965.
Version: 4 Owner: Daume Author(s): Daume
Chapter 406 28C10 – Set functions and measures on topological groups, Haar measures, invariant measures
406.1
Haar measure
406.1.1
Definition of Haar measures
Let G be a locally compact topological group. A left Haar measure on G is a measure µ on the Borel sigma algebra B of G which is:
1. outer regular on all Borel sets B ∈ B
2. inner regular on all open sets U ⊂ G
3. finite on all compact sets K ⊂ G
4. invariant under left translation: µ(gB) = µ(B) for all g ∈ G and all Borel sets B ∈ B

A right Haar measure on G is defined similarly, except with left translation invariance replaced by right translation invariance (µ(Bg) = µ(B) for all g ∈ G and all Borel sets B ∈ B). A bi-invariant Haar measure is a Haar measure that is both left invariant and right invariant.
406.1.2
Existence of Haar measures
For any finite group G, the counting measure on G is a bi-invariant Haar measure. More generally, every locally compact topological group G has a left Haar measure µ (see footnote 1), which is unique up to scalar multiples. The Haar measure plays an important role in the development of Fourier analysis and representation theory on locally compact groups such as Lie groups and profinite groups. Version: 1 Owner: djao Author(s): djao
Footnote 1: G also has a right Haar measure, although the right and left Haar measures on G are not necessarily equal unless G is abelian.
Chapter 407 28C20 – Set functions and measures and integrals in inﬁnitedimensional spaces (Wiener measure, Gaussian measure, etc.)
407.1 essential supremum
Let (X, B, µ) be a measure space and let f : X → R be a function. The essential supremum of f is the smallest number a ∈ R for which f only exceeds a on a set of measure zero. This allows us to generalize the maximum of a function in a useful way. More formally, we define ess sup f as follows. Let a ∈ R, and define

$M_a = \{x : f(x) > a\}$, (407.1.1)

the subset of X where f(x) is greater than a. Then let

$A_0 = \{a \in \mathbb{R} : \mu(M_a) = 0\}$, (407.1.2)

the set of real numbers for which $M_a$ has measure zero. If $A_0 = \emptyset$, then the essential supremum is defined to be ∞. Otherwise, the essential supremum of f is

$\operatorname{ess\,sup} f := \inf A_0$. (407.1.3)

Version: 1 Owner: drummond Author(s): drummond
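On a finite measure space the definition can be evaluated directly, which makes the difference between the supremum and the essential supremum visible. The sketch below is illustrative only: the function name and the representation of the measure as a list of point weights are assumptions, not part of the entry.

```python
def essential_supremum(values, weights):
    """ess sup of f on a finite measure space.

    values[i] is f(x_i) and weights[i] = mu({x_i}) >= 0.
    mu(M_a) = 0 exactly when a is at least the largest value taken on a point
    of positive weight, so inf A_0 is that largest value.
    """
    charged = [v for v, w in zip(values, weights) if w > 0]
    # If no point carries mass, every real a lies in A_0 and inf A_0 is -infinity.
    return max(charged) if charged else float("-inf")


values = [1.0, 5.0, 100.0]
weights = [0.5, 0.5, 0.0]   # the point where f = 100 has measure zero
print(max(values), essential_supremum(values, weights))
```

Here the ordinary supremum is 100, but f exceeds 5 only on a set of measure zero, so the essential supremum is 5.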
Chapter 408 28D05 – Measure-preserving transformations
408.1 measure-preserving
Let (X, B, µ) be a measure space, and T : X → X be a (possibly non-invertible) measurable transformation. We call T measure-preserving if for all A ∈ B, $\mu(T^{-1}(A)) = \mu(A)$, where $T^{-1}(A)$ is defined to be the set of points x ∈ X such that T(x) ∈ A. A measure-preserving transformation is also called an endomorphism of the measure space. Version: 5 Owner: mathcam Author(s): mathcam, drummond
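A standard non-invertible example is the doubling map T(x) = 2x mod 1 on [0, 1) with Lebesgue measure. The sketch below checks the defining identity only on half-open intervals [a, b), using exact rationals; the function names are hypothetical, and checking intervals is an illustration rather than a proof for all measurable sets.

```python
from fractions import Fraction


def preimage_under_doubling(a, b):
    """Preimage of [a, b) (with 0 <= a <= b <= 1) under T(x) = 2x mod 1.

    It consists of two disjoint intervals, each half as long as [a, b).
    """
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]


def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals (lo, hi)."""
    return sum(hi - lo for lo, hi in intervals)


a, b = Fraction(1, 5), Fraction(3, 4)
print(total_length(preimage_under_doubling(a, b)), b - a)
```

Note that the forward image does not preserve length: T maps [0, 1/2) onto all of [0, 1). This is why the definition is stated in terms of preimages.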
Chapter 409 3000 – General reference works (handbooks, dictionaries, bibliographies, etc.)
409.1 domain
A nonempty open set in C is called a domain. The topology considered is the Euclidean one (viewing C as $\mathbb{R}^2$). For a domain D, being connected is therefore equivalent to being path-connected. Since every component of a domain D is a region, and each component contains a point with rational coordinates, every domain has at most countably many components. Version: 4 Owner: drini Author(s): drini
409.2
region
A region is a connected domain. Since every domain of C can be seen as the union of countably many components and each component is a region, we have that regions play a major role in complex analysis. Version: 2 Owner: drini Author(s): drini
409.3
regular region
Let E be an n-dimensional Euclidean space with the topology induced by the Euclidean metric. Then a set in E is a regular region if it can be written as the closure of a nonempty region with a piecewise smooth boundary. Version: 10 Owner: ottocolori Author(s): ottocolori
409.4
topology of the complex plane
The usual topology for the complex plane C is the topology induced by the metric d(x, y) = |x − y| for x, y ∈ C. Here, | · | is the complex modulus. If we identify $\mathbb{R}^2$ and C, it is clear that the above topology coincides with the topology induced by the Euclidean metric on $\mathbb{R}^2$. Version: 1 Owner: matte Author(s): matte
Chapter 410 30XX – Functions of a complex variable
410.1 z0 is a pole of f
Let f be an analytic function on a punctured neighborhood of $z_0 \in \mathbb{C}$, that is, f is analytic on $\{z \in \mathbb{C} : 0 < |z - z_0| < \varepsilon\}$ for some ε > 0, and suppose that

$\lim_{z \to z_0} f(z) = \infty.$

We then say that $z_0$ is a pole of f. Version: 2 Owner: drini Author(s): drini, apmxi
Chapter 411 30A99 – Miscellaneous
411.1 Riemann mapping theorem
Let U be a simply connected open proper subset of C, and let a ∈ U. There is a unique analytic function f : U → C such that
1. f(a) = 0, and f'(a) is real and positive;
2. f is injective;
3. f(U) = {z ∈ C : |z| < 1}.

Remark. As a consequence of this theorem, any two simply connected regions, neither of which is the whole plane, are conformally equivalent. Version: 2 Owner: Koro Author(s): Koro
411.2
Runge’s theorem
Let K be a compact subset of C, and let E be a subset of $\mathbb{C}_\infty = \mathbb{C} \cup \{\infty\}$ (the extended complex plane) which intersects every connected component of $\mathbb{C}_\infty - K$. If f is an analytic function in an open set containing K, then given ε > 0 there is a rational function R(z) whose only poles are in E, such that |f(z) − R(z)| < ε for all z ∈ K. Version: 2 Owner: Koro Author(s): Koro
411.3
Weierstrass M-test
Let X be a topological space, $\{f_n\}_{n \in \mathbb{N}}$ a sequence of real or complex valued functions on X and $\{M_n\}_{n \in \mathbb{N}}$ a sequence of nonnegative real numbers. Suppose that, for each n ∈ N and x ∈ X, we have $|f_n(x)| \leq M_n$. Then $f = \sum_{n=1}^{\infty} f_n$ converges uniformly if $\sum_{n=1}^{\infty} M_n$ converges. Version: 8 Owner: vypertd Author(s): vypertd, igor
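The uniform bound in the M-test can be observed numerically. The sketch below uses the hypothetical example $f_n(x) = \sin(nx)/n^2$ with $M_n = 1/n^2$ (chosen here for illustration, not taken from the entry) and compares the difference of two partial sums, sampled on a grid of x values, with the tail of $\sum M_n$.

```python
import math


def partial_sum(x: float, n_terms: int) -> float:
    """s_n(x) for f_n(x) = sin(n x) / n^2."""
    return sum(math.sin(n * x) / n**2 for n in range(1, n_terms + 1))


def tail_bound(p: int) -> float:
    """Upper bound for sum_{n > p} 1/n^2, by comparison with the integral of 1/t^2 over [p, oo)."""
    return 1.0 / p


p, q = 50, 5000
xs = [k * 0.1 for k in range(-100, 101)]
# |s_q(x) - s_p(x)| <= sum_{n=p+1}^{q} M_n < 1/p, independently of x.
worst = max(abs(partial_sum(x, q) - partial_sum(x, p)) for x in xs)
print(worst, tail_bound(p))
```

The sampled maximum stays below the tail bound, which does not depend on x; that independence is exactly what uniform convergence requires.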
411.4
annulus
Briefly, an annulus is the region bounded between two (usually concentric) circles. An open annulus, or just annulus for short, is a domain in the complex plane of the form

$A = A_w(r, R) = \{z \in \mathbb{C} : r < |z - w| < R\}$,

where w is an arbitrary complex number, and r and R are real numbers with 0 < r < R. Such a set is often called an annular region. More generally, one can allow r = 0 or R = ∞. (This makes sense for the purposes of the bound on |z − w| above.) This would make an annulus include the cases of a punctured disc, and some unbounded domains. Analogously, one can define a closed annulus to be a set of the form

$\overline{A} = \overline{A}_w(r, R) = \{z \in \mathbb{C} : r \leq |z - w| \leq R\}$,

where w ∈ C, and r and R are real numbers with 0 < r < R. One can show that two annuli $A_w(r, R)$ and $A_{w'}(r', R')$ are conformally equivalent if and only if R/r = R'/r'. More generally, the complement of any closed disk in an open disk is conformally equivalent to precisely one annulus of the form $A_0(r, 1)$. Version: 1 Owner: jay Author(s): jay
411.5
conformally equivalent
A region G is conformally equivalent to a set S if there is an analytic bijective function mapping G to S. Conformal equivalence is an equivalence relation. Version: 1 Owner: Koro Author(s): Koro
411.6
contour integral
Let f be a complex-valued function defined on the image of a curve α: [a, b] → C, and let $P = \{a_0, \dots, a_n\}$ be a partition of [a, b]. If the sum

$\sum_{i=1}^{n} f(z_i)\,(\alpha(a_i) - \alpha(a_{i-1}))$

where $z_i$ is some point $\alpha(t_i)$ such that $a_{i-1} \leq t_i \leq a_i$, tends to a unique limit l as n tends to infinity and the greatest of the numbers $a_i - a_{i-1}$ tends to zero, then we say that the contour integral of f along α exists and has value l. The contour integral is denoted by

$\int_\alpha f(z)\,dz$

Note
(i) If Im(α) is a segment of the real axis, then this definition reduces to that of the Riemann integral of f(x) between α(a) and α(b).
(ii) An alternative definition, making use of the Riemann-Stieltjes integral, is based on the fact that the definition of this can be extended without any other changes in the wording to cover the cases where f and α are complex-valued functions. Now let α be any curve [a, b] → $\mathbb{R}^2$. Then α can be expressed in terms of the components $(\alpha_1, \alpha_2)$ and can be associated with the complex-valued function $z(t) = \alpha_1(t) + i\alpha_2(t)$. Given any complex-valued function of a complex variable, f say, defined on Im(α), we define the contour integral of f along α, denoted by $\int_\alpha f(z)\,dz$, by

$\int_\alpha f(z)\,dz = \int_a^b f(z(t))\,dz(t)$

whenever the complex Riemann-Stieltjes integral on the right exists.
(iii) Reversing the direction of the curve changes the sign of the integral.
(iv) The contour integral always exists if α is rectifiable and f is continuous.
(v) If α is piecewise smooth and the contour integral of f along α exists, then

$\int_\alpha f\,dz = \int_a^b f(z(t))\,z'(t)\,dt$

Version: 4 Owner: vypertd Author(s): vypertd
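The defining sum can be evaluated numerically for a concrete curve. The sketch below is an illustration under stated assumptions: it uses a uniform partition, takes $t_i$ to be the midpoint of each subinterval, and integrates f(z) = 1/z around the unit circle; the function and variable names are hypothetical.

```python
import cmath
import math


def contour_sum(f, alpha, a: float, b: float, n: int) -> complex:
    """Sum of f(z_i) * (alpha(a_i) - alpha(a_{i-1})) over a uniform partition of [a, b],
    with z_i = alpha(t_i) and t_i the midpoint of [a_{i-1}, a_i]."""
    h = (b - a) / n
    total = 0j
    for i in range(1, n + 1):
        lo, hi = a + (i - 1) * h, a + i * h
        total += f(alpha((lo + hi) / 2)) * (alpha(hi) - alpha(lo))
    return total


def unit_circle(t: float) -> complex:
    """Counterclockwise parametrization of the unit circle."""
    return cmath.exp(1j * t)


def reversed_circle(t: float) -> complex:
    """The same circle traversed clockwise."""
    return cmath.exp(-1j * t)


forward = contour_sum(lambda z: 1 / z, unit_circle, 0.0, 2 * math.pi, 2000)
backward = contour_sum(lambda z: 1 / z, reversed_circle, 0.0, 2 * math.pi, 2000)
print(forward, backward, 2j * math.pi)
```

The forward sum is close to 2πi, and reversing the direction of the curve flips the sign, in line with note (iii).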
411.7
orientation
Let α be a rectifiable Jordan curve in $\mathbb{R}^2$, let $z_0$ be a point in $\mathbb{R}^2 - \mathrm{Im}(\alpha)$, and let $W[\alpha : z_0]$ be the winding number of α about $z_0$. For every point $z_0$ inside α we have $W[\alpha : z_0] = \pm 1$, and all points inside α have the same index. We define the orientation of a Jordan curve α by saying that α is positively oriented if the index of every point inside α is +1 and negatively oriented if it is −1. Version: 3 Owner: vypertd Author(s): vypertd
411.8
proof of Weierstrass M-test
Consider the sequence of partial sums $s_n = \sum_{m=1}^{n} f_m$. Since the sums are finite, each $s_n$ is continuous whenever the $f_m$ are. Take any p, q ∈ N such that p ≤ q; then, for every x ∈ X, we have

$|s_q(x) - s_p(x)| = \left|\sum_{m=p+1}^{q} f_m(x)\right| \leq \sum_{m=p+1}^{q} |f_m(x)| \leq \sum_{m=p+1}^{q} M_m$

But since $\sum_{n=1}^{\infty} M_n$ converges, for any ε > 0 we can find an N ∈ N such that, for any p, q > N and x ∈ X, we have $|s_q(x) - s_p(x)| \leq \sum_{m=p+1}^{q} M_m < \varepsilon$. Hence the sequence $s_n$ is uniformly Cauchy and so converges uniformly to $\sum_{n=1}^{\infty} f_n$, and, if the $f_n$ are continuous, the function $f = \sum_{n=1}^{\infty} f_n$ is continuous.
Version: 1 Owner: igor Author(s): igor
411.9
unit disk
The unit disk in the complex plane, denoted ∆, is defined as {z ∈ C : |z| < 1}. The unit circle, denoted ∂∆ or $S^1$, is the boundary {z ∈ C : |z| = 1} of the unit disk ∆. Every element z ∈ ∂∆ can be written as $z = e^{i\theta}$ for some real value of θ. Version: 5 Owner: brianbirgen Author(s): brianbirgen
411.10
upper half plane
The upper half plane in the complex plane, abbreviated UHP, is deﬁned as {z ∈ C : Im(z) > 0}. Version: 4 Owner: brianbirgen Author(s): brianbirgen
411.11
winding number and fundamental group
The winding number is an analytic way to define an explicit isomorphism

$W[\bullet : z_0] : \pi_1(\mathbb{C} \setminus \{z_0\}) \to \mathbb{Z}$

from the fundamental group of the punctured (at $z_0$) complex plane to the group of integers. Version: 1 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 412 30B10 – Power series (including lacunary series)
412.1 Euler relation
Euler's relation (also known as Euler's formula) is considered the first bridge between the fields of algebra and geometry, as it relates the exponential function to the trigonometric sine and cosine functions. The goal is to prove

$e^{ix} = \cos(x) + i \sin(x)$

It's easy to show that

$i^{4n} = 1, \quad i^{4n+1} = i, \quad i^{4n+2} = -1, \quad i^{4n+3} = -i$

Now, using the Taylor series expansions of sin x, cos x and $e^x$, we can show that

$e^{ix} = \sum_{n=0}^{\infty} \frac{i^n x^n}{n!} = \sum_{n=0}^{\infty} \left( \frac{x^{4n}}{(4n)!} + \frac{i x^{4n+1}}{(4n+1)!} - \frac{x^{4n+2}}{(4n+2)!} - \frac{i x^{4n+3}}{(4n+3)!} \right)$

Because the series expansion above is absolutely convergent for all x, we can rearrange the terms of the series as follows

$e^{ix} = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = \cos(x) + i \sin(x)$
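The identity can be checked numerically by summing the series for $e^{ix}$ term by term. This is a minimal sketch; the function name, the sample value x = 1.3, and the number of terms are arbitrary choices made for illustration.

```python
import math


def exp_i_partial(x: float, terms: int) -> complex:
    """Partial sum of sum_{n >= 0} (i x)^n / n!, using the first `terms` terms."""
    total = 0j
    term = 1 + 0j                     # the n = 0 term
    for n in range(terms):
        total += term
        term *= 1j * x / (n + 1)      # next term: multiply by (i x) / (n + 1)
    return total


x = 1.3
approx = exp_i_partial(x, 40)
exact = complex(math.cos(x), math.sin(x))
print(approx, exact)
```

The real part of the partial sum tracks cos x and the imaginary part tracks sin x, as the rearranged series predicts.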
Version: 8 Owner: drini Author(s): drini, ﬁziko, igor
412.2
analytic
Let U be a domain in the complex numbers (resp., real numbers). A function f : U −→ C (resp., f : U −→ R) is analytic (resp., real analytic) if f has a Taylor series about each point x ∈ U that converges to the function f in an open neighborhood of x.
412.2.1
On Analyticity and Holomorphicity
A complex function is analytic if and only if it is holomorphic. Because of this equivalence, an analytic function in the complex case is often deﬁned to be one that is holomorphic, instead of one having a Taylor series as above. Although the two deﬁnitions are equivalent, it is not an easy matter to prove their equivalence, and a reader who does not yet have this result available will have to pay attention as to which deﬁnition of analytic is being used. Version: 4 Owner: djao Author(s): djao
412.3
existence of power series
In this entry we shall demonstrate the logical equivalence of the holomorphic and analytic concepts. As is the case with so many basic results in complex analysis, the proof of these facts hinges on the Cauchy integral theorem, and the Cauchy integral formula.
Holomorphic implies analytic.
Theorem 8. Let U ⊂ C be an open domain that contains the origin, and let f : U → C be a function such that the complex derivative

$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$

exists for all z ∈ U. Then, there exists a power series representation

$f(z) = \sum_{k=0}^{\infty} a_k z^k, \quad |z| < R, \quad a_k \in \mathbb{C}$

for a sufficiently small radius of convergence R > 0.

Note: it is just as easy to show the existence of a power series representation around every basepoint $z_0 \in U$; one need only consider the holomorphic function $f(z + z_0)$.

Proof. Choose an R > 0 sufficiently small so that the disk |z| ≤ R is contained in U. By the Cauchy integral formula we have that

$f(z) = \frac{1}{2\pi i} \oint_{|\zeta| = R} \frac{f(\zeta)}{\zeta - z}\, d\zeta, \quad |z| < R,$

where, as usual, the integration contour is oriented counterclockwise. For every ζ of modulus R, we can expand the integrand as a geometric power series in z, namely

$\frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)/\zeta}{1 - z/\zeta} = \sum_{k=0}^{\infty} \frac{f(\zeta)}{\zeta^{k+1}} z^k, \quad |z| < R.$

The circle of radius R is a compact set; hence f(ζ) is bounded on it; and hence, the power series above converges uniformly with respect to ζ. Consequently, the order of the infinite summation and the integration operations can be interchanged. Hence,

$f(z) = \sum_{k=0}^{\infty} a_k z^k, \quad |z| < R,$

where

$a_k = \frac{1}{2\pi i} \oint_{|\zeta| = R} \frac{f(\zeta)}{\zeta^{k+1}}\, d\zeta,$

as desired. QED

Analytic implies holomorphic.
Theorem 9. Let

$f(z) = \sum_{n=0}^{\infty} a_n z^n, \quad a_n \in \mathbb{C}, \quad |z| < \epsilon$

be a power series, converging in $D = D_\epsilon(0)$, the open disk of radius ε > 0 about the origin. Then the complex derivative

$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$

exists for all z ∈ D, i.e. the function f : D → C is holomorphic.

Note: this theorem generalizes immediately to shifted power series in $z - z_0$, $z_0 \in \mathbb{C}$.

Proof. For every $z_0 \in D$, the function f(z) can be recast as a power series centered at $z_0$. Hence, without loss of generality it suffices to prove the theorem for z = 0. The power series

$\sum_{n=0}^{\infty} a_{n+1} \zeta^n, \quad \zeta \in D$

converges, and equals (f(ζ) − f(0))/ζ for ζ ≠ 0. Consequently, the complex derivative f'(0) exists; indeed it is equal to $a_1$. QED

Version: 2 Owner: rmilson Author(s): rmilson
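The coefficient formula from the proof of Theorem 8 lends itself to a numerical check. The sketch below approximates $a_k = \frac{1}{2\pi i} \oint_{|\zeta|=R} f(\zeta)/\zeta^{k+1}\, d\zeta$ by an equally spaced sum over the circle and compares the result with the known Taylor coefficients 1/k! of the exponential function. The function name and the default values of R and n are assumptions made for illustration.

```python
import cmath
import math


def taylor_coefficient(f, k: int, R: float = 1.0, n: int = 512) -> complex:
    """Approximate a_k by sampling the circle |zeta| = R at n equally spaced points.

    With zeta = R e^{i theta} we have d zeta = i zeta d theta, so the contour integral
    becomes (1 / (2 pi)) * integral of f(zeta) / zeta^k d theta over [0, 2 pi].
    """
    total = 0j
    for j in range(n):
        zeta = R * cmath.exp(2j * math.pi * j / n)
        total += f(zeta) / zeta**k
    return total / n


for k in range(6):
    print(k, taylor_coefficient(cmath.exp, k), 1 / math.factorial(k))
```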
412.4
inﬁnitelydiﬀerentiable function that is not analytic
If $f \in C^\infty$, then we can certainly write a Taylor series for f. However, analyticity requires that this Taylor series actually converge (at least across some radius of convergence) to f. It is not necessary that the power series for f converge to f, as the following example shows. Let

$f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases}$

Then $f \in C^\infty$, and for any n ≥ 0, $f^{(n)}(0) = 0$ (see below). So the Taylor series for f around 0 is 0; since f(x) > 0 for all x ≠ 0, clearly it does not converge to f.
Proof that $f^{(n)}(0) = 0$
Let p(x), q(x) ∈ R[x] be polynomials, and define

$g(x) = \frac{p(x)}{q(x)} \cdot f(x).$

Then, for x ≠ 0,

$g'(x) = \frac{\left(p'(x) + p(x)\frac{2}{x^3}\right) q(x) - q'(x) p(x)}{q^2(x)} \cdot e^{-1/x^2}.$

Computing (e.g. by applying L'Hôpital's rule), we see that $g'(0) = \lim_{x \to 0} g'(x) = 0$. Define $p_0(x) = q_0(x) = 1$. Applying the above inductively, we see that we may write $f^{(n)}(x) = \frac{p_n(x)}{q_n(x)} f(x)$. So $f^{(n)}(0) = 0$, as required. Version: 2 Owner: ariels Author(s): ariels
412.5
power series
A power series is a series of the form

$\sum_{k=0}^{\infty} a_k (x - x_0)^k,$

with $a_k, x_0 \in \mathbb{R}$ or $\in \mathbb{C}$. The $a_k$ are called the coefficients and $x_0$ the center of the power series. Where it converges it defines a function, which can thus be represented by a power series. This is what power series are usually used for. Every power series is convergent at least at $x = x_0$, where it converges to $a_0$. In addition it is absolutely convergent in the region $\{x : |x - x_0| < r\}$, with

$r = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}}$

It is divergent for every x with $|x - x_0| > r$. For $|x - x_0| = r$ no general predictions can be made. If r = ∞, the power series converges absolutely for every real or complex x. The number r ∈ [0, ∞] is called the radius of convergence of the power series.
Examples of power series are:
• Taylor series, for example: $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$.
• The geometric series: $\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k$, with |x| < 1.

Power series have some important properties:
• If a power series converges for a $z_0 \in \mathbb{C}$ then it also converges for all z ∈ C with $|z - x_0| < |z_0 - x_0|$.
• Also, if a power series diverges for some $z_0 \in \mathbb{C}$ then it diverges for all z ∈ C with $|z - x_0| > |z_0 - x_0|$.
• For $|x - x_0| < r$ power series can be added by adding coefficients and multiplied in the obvious way: $\sum_{k=0}^{\infty} a_k (x - x_0)^k \cdot \sum_{j=0}^{\infty} b_j (x - x_0)^j = a_0 b_0 + (a_0 b_1 + a_1 b_0)(x - x_0) + (a_0 b_2 + a_1 b_1 + a_2 b_0)(x - x_0)^2 + \cdots$.
• (Uniqueness) If two power series are equal and their centers are the same, then their coefficients must be equal.
• Power series can be termwise differentiated and integrated. These operations keep the radius of convergence.

Version: 13 Owner: mathwizard Author(s): mathwizard, AxelBoldt
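The addition and multiplication rules in the list above act on coefficient lists. The sketch below implements both for truncated series with a common center, using exact rationals; the function names are hypothetical, and the truncation length N is an arbitrary choice.

```python
from fractions import Fraction
from math import factorial


def add_series(a, b):
    """Coefficientwise sum of two truncated power series with the same center."""
    n = min(len(a), len(b))
    return [a[k] + b[k] for k in range(n)]


def multiply_series(a, b):
    """Cauchy product: c_m = sum_{k=0}^{m} a_k * b_{m-k}, truncated to the common length."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]


N = 8
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N)]   # e^x
geom_coeffs = [Fraction(1)] * N                              # 1 / (1 - x)

print(multiply_series(exp_coeffs, exp_coeffs))     # coefficients of e^{2x}: 2^k / k!
print(multiply_series(geom_coeffs, geom_coeffs))   # coefficients of 1 / (1 - x)^2: k + 1
```

The first few coefficients of the product match the pattern $a_0 b_0$, $a_0 b_1 + a_1 b_0$, $a_0 b_2 + a_1 b_1 + a_2 b_0$ given in the list.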
412.6
proof of radius of convergence
According to Cauchy's root test a power series is absolutely convergent if

$\limsup_{k \to \infty} \sqrt[k]{|a_k (x - x_0)^k|} = |x - x_0| \limsup_{k \to \infty} \sqrt[k]{|a_k|} < 1.$

This is obviously true if

$|x - x_0| < \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|a_k|}} = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}}.$

In the same way we see that the series is divergent if

$|x - x_0| > \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}},$

which means that the right hand side is the radius of convergence of the power series. Now from the ratio test we see that the power series is absolutely convergent if

$\lim_{k \to \infty} \left| \frac{a_{k+1} (x - x_0)^{k+1}}{a_k (x - x_0)^k} \right| = |x - x_0| \lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right| < 1.$

Again this is true if

$|x - x_0| < \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|.$

The series is divergent if

$|x - x_0| > \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|,$

as follows from the ratio test in the same way. So we see that in this way too we can calculate the radius of convergence. Version: 1 Owner: mathwizard Author(s): mathwizard
412.7
radius of convergence
To the power series

$\sum_{k=0}^{\infty} a_k (x - x_0)^k$ (412.7.1)

there exists a number r ∈ [0, ∞], its radius of convergence, such that the series converges absolutely for all (real or complex) numbers x with $|x - x_0| < r$ and diverges whenever $|x - x_0| > r$. (For $|x - x_0| = r$ no general statements about convergence can be made; what is always true, for finite r > 0, is that the function defined by the series has at least one singular point on the circle $|x - x_0| = r$.) The radius of convergence is given by:

$r = \liminf_{k \to \infty} \frac{1}{\sqrt[k]{|a_k|}}$ (412.7.2)

and can also be computed as

$r = \lim_{k \to \infty} \left| \frac{a_k}{a_{k+1}} \right|,$ (412.7.3)

if this limit exists. Version: 6 Owner: mathwizard Author(s): mathwizard, AxelBoldt
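Both formulas can be probed numerically by evaluating them at a single large index k. The sketch below does this for two hypothetical coefficient sequences; the function names are illustrative, and a value at one finite k is only a heuristic for the limit or lim inf, not a computation of it.

```python
def ratio_estimate(a, k: int) -> float:
    """|a_k / a_{k+1}| at a single index k (heuristic for formula 412.7.3)."""
    return abs(a(k) / a(k + 1))


def root_estimate(a, k: int) -> float:
    """1 / |a_k|^(1/k) at a single index k >= 1 (heuristic for formula 412.7.2)."""
    return 1.0 / abs(a(k)) ** (1.0 / k)


def geometric(k: int) -> float:
    """Coefficients of sum 2^k x^k, whose radius of convergence is 1/2."""
    return 2.0 ** k


def log_like(k: int) -> float:
    """Coefficients of sum x^k / (k + 1), whose radius of convergence is 1."""
    return 1.0 / (k + 1)


print(ratio_estimate(geometric, 50), root_estimate(geometric, 50))
print(ratio_estimate(log_like, 1000), root_estimate(log_like, 1000))
```

For the second sequence the root estimate approaches 1 more slowly than the ratio estimate, which is typical when the coefficients decay only polynomially.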
Chapter 413 30B50 – Dirichlet series and other series expansions, exponential series
413.1 Dirichlet series
Let $(\lambda_n)_{n \geq 1}$ be an increasing sequence of positive real numbers tending to ∞. A Dirichlet series with exponents $(\lambda_n)$ is a series of the form

$\sum_n a_n e^{-\lambda_n z}$

where z and all the $a_n$ are complex numbers. An ordinary Dirichlet series is one having $\lambda_n = \log n$ for all n. It is written

$\sum_n \frac{a_n}{n^z}.$

The best-known examples are the Riemann zeta function (in which $a_n$ is the constant 1) and the more general Dirichlet L-series (in which the mapping $n \mapsto a_n$ is multiplicative and periodic). When $\lambda_n = n$, the Dirichlet series is just a power series in the variable $e^{-z}$.

The following are the basic convergence properties of Dirichlet series. There is nothing profound about their proofs, which can be found in [1] and in various other works on complex analysis and analytic number theory. Let $f(z) = \sum_n a_n e^{-\lambda_n z}$ be a Dirichlet series.
1. If f converges at $z = z_0$, then f converges uniformly in the region $\mathrm{Re}(z - z_0) \geq 0$, $-\alpha \leq \arg(z - z_0) \leq \alpha$, where α is any real number such that 0 < α < π/2. (Such a region is known as a "Stolz angle".)
2. Therefore, if f converges at $z_0$, its sum defines a holomorphic function on the region $\mathrm{Re}(z) > \mathrm{Re}(z_0)$, and moreover $f(z) \to f(z_0)$ as $z \to z_0$ within any Stolz angle.
3. f = 0 identically iff all the $a_n$ are zero.

So, if f converges somewhere but not everywhere in C, then the domain of its convergence is the region Re(z) > ρ for some real number ρ, which is called the abscissa of convergence of the Dirichlet series. The abscissa of convergence of the series $\sum_n |a_n| e^{-\lambda_n z}$, if it exists, is called the abscissa of absolute convergence of f.

Now suppose that the coefficients $a_n$ are all real and ≥ 0. If the series f converges for Re(z) > ρ, and the resulting function admits an analytic extension to a neighbourhood of ρ, then the series f converges in a neighbourhood of ρ. Consequently, the domain of convergence of f (unless it is the whole of C) is bounded by a singularity at a point on the real axis.

Finally, return to the general case of any complex numbers $(a_n)$, but suppose $\lambda_n = \log n$, so f is an ordinary Dirichlet series $\sum_n \frac{a_n}{n^z}$.
1. If the sequence $(a_n)$ is bounded, then f converges absolutely in the region Re(z) > 1.
2. If the partial sums $\sum_{n=k}^{l} a_n$ are bounded, then f converges (not necessarily absolutely) in the region Re(z) > 0.

Reference: [1] Serre, J.-P., A Course in Arithmetic, Chapter VI, Springer-Verlag, 1973. Version: 2 Owner: bbukh Author(s): Larry Hammick
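The last property can be illustrated with a concrete ordinary Dirichlet series. The sketch below uses the hypothetical coefficients $a_n = (-1)^{n+1}$, whose partial sums are bounded (they are always 0 or 1), and evaluates partial sums of $\sum a_n / n^z$ at z = 1, where the series is the alternating harmonic series with sum log 2. The function names are illustrative only.

```python
import math


def dirichlet_partial_sum(a, z, n_terms: int):
    """Partial sum of the ordinary Dirichlet series sum_{n >= 1} a(n) / n^z."""
    return sum(a(n) / n**z for n in range(1, n_terms + 1))


def alternating(n: int) -> int:
    """a_n = (-1)^(n+1); the partial sums of a_n stay in {0, 1}."""
    return 1 if n % 2 else -1


# Converges, but not absolutely, at z = 1, which lies in the region Re(z) > 0.
print(dirichlet_partial_sum(alternating, 1, 100000), math.log(2))
```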
Chapter 414 30C15 – Zeros of polynomials, rational functions, and other analytic functions (e.g. zeros of functions with bounded Dirichlet integral)
414.1 Mason-Stothers theorem
Mason's theorem is often described as the polynomial case of the (currently unproven) ABC conjecture. Theorem 1 (Mason-Stothers). Let $f(z), g(z), h(z) \in \mathbb{C}[z]$ be polynomials, not all constant, such that $f(z) + g(z) = h(z)$ for all $z$, and such that $f$, $g$, and $h$ are pairwise relatively prime. Denote the number of distinct roots of the product $fgh(z)$ by $N$. Then $\max \deg\{f, g, h\} + 1 \le N$. Version: 1 Owner: mathcam Author(s): mathcam
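The inequality can be checked by hand on a small example. The sketch below uses a hypothetical triple chosen for illustration (it does not come from the entry): $f = z^2$, $g = 2z + 1$, $h = (z + 1)^2$, with the roots written down manually rather than computed.

```python
# Hypothetical example triple (an assumption for illustration):
# f = z^2, g = 2z + 1, h = (z + 1)^2, so that f + g = h.
# Polynomials are coefficient lists, lowest degree first.
f = [0, 0, 1]   # z^2
g = [1, 2]      # 2z + 1
h = [1, 2, 1]   # z^2 + 2z + 1


def add(p, q):
    """Add two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def degree(p):
    """Degree of a nonzero polynomial."""
    return max(i for i, c in enumerate(p) if c != 0)


# Roots written down by hand; the three sets are disjoint,
# so f, g, h are pairwise relatively prime.
roots = {"f": {0.0}, "g": {-0.5}, "h": {-1.0}}
N = len(set().union(*roots.values()))       # distinct roots of f*g*h
max_deg = max(degree(f), degree(g), degree(h))
```

Here the maximal degree is 2 and the product has 3 distinct roots, so the bound holds with equality.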
414.2
zeroes of analytic functions are isolated
The zeroes of a nonconstant analytic function on $\mathbb{C}$ are isolated. Let $f$ be an analytic function defined in some domain $D \subset \mathbb{C}$ and let $f(z_0) = 0$ for some $z_0 \in D$. Because $f$ is analytic, there is a Taylor series expansion for $f$ around $z_0$ which converges on an open disk $|z - z_0| < R$. Write it as $f(z) = \sum_{n=k}^{\infty} a_n (z - z_0)^n$, with $a_k \ne 0$ and $k > 0$ ($a_k$ is the first nonzero term). One can factor the series so that $f(z) = (z - z_0)^k \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n$ and define $g(z) = \sum_{n=0}^{\infty} a_{n+k} (z - z_0)^n$ so that $f(z) = (z - z_0)^k g(z)$. Observe that $g(z)$ is analytic on $|z - z_0| < R$. To show that $z_0$ is an isolated zero of $f$, we must find $\epsilon > 0$ so that $f$ is nonzero on $0 < |z - z_0| < \epsilon$. By the relation $f(z) = (z - z_0)^k g(z)$, it is enough to find $\epsilon > 0$ so that $g$ is nonzero on $|z - z_0| < \epsilon$. Because $g(z)$ is analytic, it is continuous at $z_0$. Notice that $g(z_0) = a_k \ne 0$, so there exists an $\epsilon > 0$ so that for all $z$ with $|z - z_0| < \epsilon$ it follows that $|g(z) - a_k| < \frac{|a_k|}{2}$. This implies that $g(z)$ is nonzero in this set. Version: 5 Owner: brianbirgen Author(s): brianbirgen
Chapter 415 30C20 – Conformal mappings of special domains
415.1 automorphisms of unit disk
All automorphisms of the complex unit disk $\Delta = \{z \in \mathbb{C} : |z| < 1\}$ can be written in the form $f_a(z) = e^{i\theta} \frac{z - a}{1 - \bar{a} z}$, where $a \in \Delta$ and $e^{i\theta} \in S^1$. This map sends $a$ to 0, $1/\bar{a}$ to $\infty$, and the unit circle to the unit circle. Version: 3 Owner: brianbirgen Author(s): brianbirgen
415.2
unit disk upper half plane conformal equivalence theorem
Theorem: There is a conformal map from $\Delta$, the unit disk, to $UHP$, the upper half plane.
Proof: Define $f : \mathbb{C} \to \mathbb{C}$, $f(z) = \frac{z - i}{z + i}$. Notice that $f^{-1}(w) = i \frac{1 + w}{1 - w}$ and that $f$ (and therefore $f^{-1}$) is a Möbius transformation.
Notice that $f(0) = -1$, $f(1) = \frac{1 - i}{1 + i} = -i$ and $f(-1) = \frac{-1 - i}{-1 + i} = i$. By the Möbius circle transformation theorem, $f$ takes the real axis to the unit circle. Since $f(i) = 0$, $f$ maps $UHP$ to $\Delta$ and $f^{-1} : \Delta \to UHP$.
Version: 3 Owner: brianbirgen Author(s): brianbirgen
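A quick numerical sanity check of the map used in the proof can be sketched as follows. The sample points are arbitrary choices made for illustration; the functions `f` and `f_inv` simply transcribe the formulas above.

```python
def f(z):
    """The map from the proof: f(z) = (z - i)/(z + i)."""
    return (z - 1j) / (z + 1j)


def f_inv(w):
    """Its inverse: f^{-1}(w) = i(1 + w)/(1 - w)."""
    return 1j * (1 + w) / (1 - w)


# Arbitrary sample points (assumptions, not from the text).
real_samples = [-10.0, -1.0, 0.0, 0.5, 3.0]          # on the real axis
upper_samples = [1j, 2 + 0.1j, -3 + 5j]              # in the upper half plane

on_circle = [abs(f(x)) for x in real_samples]         # should all equal 1
inside_disk = [abs(f(z)) for z in upper_samples]      # should all be < 1
round_trip = [abs(f_inv(f(z)) - z) for z in upper_samples]
```

Real points land on the unit circle, points with positive imaginary part land inside the disk, and composing with `f_inv` recovers the input up to rounding.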
Chapter 416 30C35 – General theory of conformal mappings
416.1 proof of conformal mapping theorem
Let $D \subset \mathbb{C}$ be a domain, and let $f : D \to \mathbb{C}$ be an analytic function. By identifying the complex plane $\mathbb{C}$ with $\mathbb{R}^2$, we can view $f$ as a function $\tilde{f}$ from $\mathbb{R}^2$ to itself:
$$\tilde{f}(x, y) := (\mathrm{Re}\, f(x + iy), \mathrm{Im}\, f(x + iy)) = (u(x, y), v(x, y))$$
with $u$ and $v$ real functions. The Jacobian matrix of $\tilde{f}$ is
$$J(x, y) = \frac{\partial(u, v)}{\partial(x, y)} = \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix}.$$
As an analytic function, $f$ satisfies the Cauchy-Riemann equations, so that $u_x = v_y$ and $u_y = -v_x$. At a fixed point $z = x + iy \in D$, we can therefore define $a = u_x(x, y) = v_y(x, y)$ and $b = u_y(x, y) = -v_x(x, y)$. We write $(a, b)$ in polar coordinates as $(r \cos\theta, r \sin\theta)$ and get
$$J(x, y) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} = r \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}.$$
Now we consider two smooth curves through $(x, y)$, which we parametrize by $\gamma_1(t) = (u_1(t), v_1(t))$ and $\gamma_2(t) = (u_2(t), v_2(t))$. We can choose the parametrization such that $\gamma_1(0) = \gamma_2(0) = z$. The images of these curves under $\tilde{f}$ are $\tilde{f} \circ \gamma_1$ and $\tilde{f} \circ \gamma_2$, respectively, and their derivatives at $t = 0$ are
$$(\tilde{f} \circ \gamma_1)'(0) = \frac{\partial(u, v)}{\partial(x, y)}(\gamma_1(0)) \cdot \frac{d\gamma_1}{dt}(0) = J(x, y) \begin{pmatrix} \frac{du_1}{dt} \\ \frac{dv_1}{dt} \end{pmatrix}$$
and, similarly,
$$(\tilde{f} \circ \gamma_2)'(0) = J(x, y) \begin{pmatrix} \frac{du_2}{dt} \\ \frac{dv_2}{dt} \end{pmatrix}$$
by the chain rule. Since $f'(z) = u_x + i v_x = a - ib$, we have $r = |f'(z)|$. We see that if $f'(z) \ne 0$, $f$ transforms the tangent vectors to $\gamma_1$ and $\gamma_2$ at $t = 0$ (and therefore in $z$) by the orthogonal matrix
$$J/r = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
and scales them by a factor of $r$. In particular, the transformation by an orthogonal matrix implies that the angle between the tangent vectors is preserved. Since the determinant of $J/r$ is 1, the transformation also preserves orientation (the direction of the angle between the tangent vectors). We conclude that $f$ is a conformal mapping. Version: 3 Owner: pbruin Author(s): pbruin
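The angle-preservation argument can be illustrated numerically. The following is a minimal sketch under stated assumptions: the sample function $f(z) = z^2$, the base point and the two tangent directions are arbitrary choices, and the image tangents are approximated by central differences rather than by the Jacobian.

```python
import cmath


def f(z):
    """Sample analytic function (an assumption for illustration); f'(z) = 2z."""
    return z * z


def image_tangent(func, z0, v, h=1e-6):
    """Central-difference approximation of the tangent of func along direction v at z0."""
    return (func(z0 + h * v) - func(z0 - h * v)) / (2 * h)


def oriented_angle(v1, v2):
    """Signed angle from v1 to v2, with vectors represented as complex numbers."""
    return cmath.phase(v2 / v1)


z0 = 1 + 1j                                   # here f'(z0) = 2 + 2i, which is nonzero
v1, v2 = 1 + 0j, cmath.exp(0.7j)              # two unit tangent directions at z0

angle_before = oriented_angle(v1, v2)
angle_after = oriented_angle(image_tangent(f, z0, v1), image_tangent(f, z0, v2))
scale = abs(image_tangent(f, z0, v1)) / abs(v1)   # should be close to |f'(z0)|
```

Both the size and the sign of the angle agree before and after mapping, and the common stretch factor matches $r = |f'(z_0)|$.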
Chapter 417 30C80 – Maximum principle; Schwarz’s lemma, Lindelöf principle, analogues and generalizations; subordination
417.1 Schwarz lemma
Let $\Delta = \{z : |z| < 1\}$ be the open unit disk in the complex plane $\mathbb{C}$. Let $f : \Delta \to \Delta$ be a holomorphic function with $f(0) = 0$. Then $|f(z)| \le |z|$ for all $z \in \Delta$, and $|f'(0)| \le 1$. If equality $|f(z)| = |z|$ holds for any $z \ne 0$, or if $|f'(0)| = 1$, then $f$ is a rotation: $f(z) = az$ with $|a| = 1$. This lemma is less celebrated than the bigger guns (such as the Riemann mapping theorem, which it helps prove); however, it is one of the simplest results capturing the “rigidity” of holomorphic functions. No similar result exists for real functions, of course. Version: 2 Owner: ariels Author(s): ariels
417.2
maximum principle
Maximum principle. Let $f : U \to \mathbb{R}$ (where $U \subseteq \mathbb{R}^d$) be a harmonic function. Then $f$ attains its extremal values on any compact $K \subseteq U$ on the boundary $\partial K$ of $K$. If $f$ attains an extremal value anywhere inside $\mathrm{int}\, K$, then it is constant.
Maximal modulus principle. Let $f : U \to \mathbb{C}$ (where $U \subseteq \mathbb{C}$) be a holomorphic function. Then $|f|$ attains its maximal value on any compact $K \subseteq U$ on the boundary $\partial K$ of $K$. If $|f|$ attains its maximal value anywhere inside $\mathrm{int}\, K$, then it is constant.
Version: 1 Owner: ariels Author(s): ariels
417.3
proof of Schwarz lemma
Define $g(z) = f(z)/z$. Then $g : \Delta \to \mathbb{C}$ is a holomorphic function (the singularity at 0 is removable because $f(0) = 0$, and $g(0) = f'(0)$). The Schwarz lemma is just an application of the maximal modulus principle to $g$. For any $1 > \epsilon > 0$, by the maximal modulus principle $|g|$ must attain its maximum on the closed disk $\{z : |z| \le 1 - \epsilon\}$ at its boundary $\{z : |z| = 1 - \epsilon\}$, say at some point $z_\epsilon$. But then $|g(z)| \le |g(z_\epsilon)| \le \frac{1}{1 - \epsilon}$ for any $|z| \le 1 - \epsilon$. Letting $\epsilon \to 0$, we see that the values of $g$ are bounded: $|g(z)| \le 1$. Thus $|f(z)| \le |z|$. Additionally, $f'(0) = g(0)$, so we see that $|f'(0)| = |g(0)| \le 1$. This is the first part of the lemma. Now suppose, as per the premise of the second part of the lemma, that $|g(w)| = 1$ for some $w \in \Delta$. For any $r > |w|$, it must be that $|g|$ attains its maximal modulus (1) inside the disk $\{z : |z| \le r\}$, and it follows that $g$ must be constant inside the entire open disk $\Delta$. So $g(z) \equiv a$ for $a = g(w)$ of size 1, and $f(z) = az$, as required. Version: 2 Owner: ariels Author(s): ariels
Chapter 418 30D20 – Entire functions, general theory
418.1 Liouville’s theorem
A bounded entire function is constant. That is, a bounded complex function $f : \mathbb{C} \to \mathbb{C}$ which is holomorphic on the entire complex plane is always a constant function. More generally, any holomorphic function $f : \mathbb{C} \to \mathbb{C}$ which satisfies a polynomial bound condition of the form
$$|f(z)| < c \cdot |z|^n$$
for some $c \in \mathbb{R}$, $n \in \mathbb{Z}$, and all $z \in \mathbb{C}$ with $|z|$ sufficiently large is necessarily equal to a polynomial function.
Liouville’s theorem is a vivid example of how stringent the holomorphicity condition on a complex function really is. One has only to compare the theorem to the corresponding statement for real functions (namely, that a bounded diﬀerentiable real function is constant, a patently false statement) to see how much stronger the complex diﬀerentiability condition is compared to real diﬀerentiability. Applications of Liouville’s theorem include proofs of the fundamental theorem of algebra and of the partial fraction decomposition theorem for rational functions. Version: 4 Owner: djao Author(s): djao
418.2
Morera’s theorem
Morera’s theorem provides the converse of Cauchy’s integral theorem.
Theorem [1] Suppose $G$ is a region in $\mathbb{C}$, and $f : G \to \mathbb{C}$ is a continuous function. If for every closed triangle $\Delta$ in $G$, we have $\int_{\partial\Delta} f\, dz = 0$, then $f$ is analytic on $G$. (Here, $\partial\Delta$ is the piecewise linear boundary of $\Delta$.) In particular, if for every rectifiable closed curve $\Gamma$ in $G$, we have $\int_\Gamma f\, dz = 0$, then $f$ is analytic on $G$. Proofs of this can be found in [2, 2].
REFERENCES
1. W. Rudin, Real and complex analysis, 3rd ed., McGraw-Hill Inc., 1987.
2. E. Kreyszig, Advanced Engineering Mathematics, 7th ed., John Wiley & Sons, 1993.
3. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.
Version: 7 Owner: matte Author(s): matte, drini, nerdy2
418.3
entire
A function f : C −→ C is entire if it is holomorphic. Version: 2 Owner: djao Author(s): djao
418.4
holomorphic
Let $U \subset \mathbb{C}$ be a domain in the complex numbers. A function $f : U \to \mathbb{C}$ is holomorphic if $f$ has a complex derivative at every point of $U$, i.e. if
$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$$
exists for all $z_0 \in U$. Version: 5 Owner: djao Author(s): djao, rmilson
418.5
proof of Liouville’s theorem
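Let $f : \mathbb{C} \to \mathbb{C}$ be a bounded, entire function. Then by Taylor's Theorem,
$$f(z) = \sum_{n=0}^{\infty} c_n z^n \quad \text{where} \quad c_n = \frac{1}{2\pi i} \int_{\Gamma_r} \frac{f(w)}{w^{n+1}}\, dw,$$
where $\Gamma_r$ is the circle of radius $r$ about 0, for $r > 0$. Then $c_n$ can be estimated as
$$|c_n| \le \frac{1}{2\pi}\, \mathrm{length}(\Gamma_r)\, \sup\left\{ \left| \frac{f(w)}{w^{n+1}} \right| : w \in \Gamma_r \right\} = \frac{1}{2\pi}\, 2\pi r\, \frac{M_r}{r^{n+1}} = \frac{M_r}{r^n},$$
where $M_r = \sup\{|f(w)| : w \in \Gamma_r\}$.
But $f$ is bounded, so there is $M$ such that $M_r \le M$ for all $r$. Then $|c_n| \le \frac{M}{r^n}$ for all $n$ and all $r > 0$. But since $r$ is arbitrary, this gives $c_n = 0$ whenever $n > 0$. So $f(z) = c_0$ for all $z$, so $f$ is constant.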
Version: 2 Owner: Evandar Author(s): Evandar
Chapter 419 30D30 – Meromorphic functions, general theory
419.1 Casorati-Weierstrass theorem
Let U ⊂ C be a domain, a ∈ U, and let f : U \ {a} → C be holomorphic. Then a is an essential singularity of f if and only if the image of any punctured neighborhood of a under f is dense in C. Version: 2 Owner: pbruin Author(s): pbruin
419.2
Mittag-Leffler’s theorem
Let $G$ be an open subset of $\mathbb{C}$, and let $\{a_k\}$ be a sequence of distinct points in $G$ which has no limit point in $G$. For each $k$, let $A_{1k}, \ldots, A_{m_k k}$ be arbitrary complex coefficients, and define
$$S_k(z) = \sum_{j=1}^{m_k} \frac{A_{jk}}{(z - a_k)^j}.$$
Then there exists a meromorphic function f on G whose poles are exactly the points {ak } and such that the singular part of f at ak is Sk (z), for each k. Version: 1 Owner: Koro Author(s): Koro
419.3
Riemann’s removable singularity theorem
Let $U \subset \mathbb{C}$ be a domain, $a \in U$, and let $f : U \setminus \{a\} \to \mathbb{C}$ be holomorphic. Then $a$ is a removable singularity of $f$ if and only if
$$\lim_{z \to a} (z - a) f(z) = 0.$$
In particular, $a$ is a removable singularity of $f$ if $f$ is bounded near $a$, i.e. if there is a punctured neighborhood $V$ of $a$ and a real number $M > 0$ such that $|f(z)| < M$ for all $z \in V$. Version: 1 Owner: pbruin Author(s): pbruin
419.4
essential singularity
Let $U \subset \mathbb{C}$ be a domain, $a \in U$, and let $f : U \setminus \{a\} \to \mathbb{C}$ be holomorphic. If the Laurent series expansion of $f(z)$ around $a$ contains infinitely many terms with negative powers of $z - a$, then $a$ is said to be an essential singularity of $f$. Any singularity of $f$ is a removable singularity, a pole or an essential singularity. If $a$ is an essential singularity of $f$, then the image of any punctured neighborhood of $a$ under $f$ is dense in $\mathbb{C}$ (the Casorati-Weierstrass theorem). In fact, an even stronger statement is true: according to Picard’s theorem, the image of any punctured neighborhood of $a$ is $\mathbb{C}$, with the possible exception of a single point. Version: 4 Owner: pbruin Author(s): pbruin
419.5
meromorphic
Let U ⊂ C be a domain. A function f : U −→ C is meromorphic if f is holomorphic except at an isolated set of poles. It can be proven that if f is meromorphic then its set of poles does not have an accumulation point. Version: 2 Owner: djao Author(s): djao
419.6
pole
Let $U \subset \mathbb{C}$ be a domain and let $a \in \mathbb{C}$. A function $f : U \to \mathbb{C}$ has a pole at $a$ if it can be represented by a Laurent series centered about $a$ with only finitely many negative terms; that is,
$$f(z) = \sum_{k=-n}^{\infty} c_k (z - a)^k$$
in some nonempty deleted neighborhood of $a$, for some $n \in \mathbb{N}$. Version: 2 Owner: djao Author(s): djao
419.7
proof of Casorati-Weierstrass theorem
Assume that $a$ is an essential singularity of $f$. Let $V \subset U$ be a punctured neighborhood of $a$, and let $\lambda \in \mathbb{C}$. We have to show that $\lambda$ is a limit point of $f(V)$. Suppose it is not. Then there is an $\epsilon > 0$ such that $|f(z) - \lambda| > \epsilon$ for all $z \in V$, and the function
$$g : V \to \mathbb{C}, \quad z \mapsto \frac{1}{f(z) - \lambda}$$
is bounded, since $|g(z)| = \frac{1}{|f(z) - \lambda|} < \epsilon^{-1}$ for all $z \in V$. According to Riemann’s removable singularity theorem this implies that $a$ is a removable singularity of $g$, so that $g$ can be extended to a holomorphic function $\bar{g} : V \cup \{a\} \to \mathbb{C}$. Now
$$f(z) = \frac{1}{\bar{g}(z)} + \lambda$$
for $z \ne a$, and $a$ is either a removable singularity of $f$ (if $\bar{g}(a) \ne 0$) or a pole of order $n$ (if $\bar{g}$ has a zero of order $n$ at $a$). This contradicts our assumption that $a$ is an essential singularity, which means that $\lambda$ must be a limit point of $f(V)$. The argument holds for all $\lambda \in \mathbb{C}$, so $f(V)$ is dense in $\mathbb{C}$ for any punctured neighborhood $V$ of $a$. To prove the converse, assume that $f(V)$ is dense in $\mathbb{C}$ for any punctured neighborhood $V$ of $a$. If $a$ is a removable singularity, then $f$ is bounded near $a$, and if $a$ is a pole, $|f(z)| \to \infty$ as $z \to a$. Either of these possibilities contradicts the assumption that the image of any punctured neighborhood of $a$ under $f$ is dense in $\mathbb{C}$, so $a$ must be an essential singularity of $f$. Version: 1 Owner: pbruin Author(s): pbruin
419.8
proof of Riemann’s removable singularity theorem
Suppose that $f$ is holomorphic on $U \setminus \{a\}$ and $\lim_{z \to a} (z - a) f(z) = 0$. Let
$$f(z) = \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$
be the Laurent series of $f$ centered at $a$. We will show that $c_k = 0$ for $k < 0$, so that $f$ can be holomorphically extended to all of $U$ by defining $f(a) = c_0$. For $n \in \mathbb{N}_0$, the residue of $(z - a)^n f(z)$ at $a$ is
$$\mathrm{Res}((z - a)^n f(z), a) = \frac{1}{2\pi i} \lim_{\delta \to 0^+} \oint_{|z - a| = \delta} (z - a)^n f(z)\, dz.$$
This is equal to zero, because
$$\left| \oint_{|z - a| = \delta} (z - a)^n f(z)\, dz \right| \le 2\pi\delta \max_{|z - a| = \delta} |(z - a)^n f(z)| = 2\pi\delta^n \max_{|z - a| = \delta} |(z - a) f(z)|,$$
which, by our assumption, goes to zero as $\delta \to 0$. Since the residue of $(z - a)^n f(z)$ at $a$ is also equal to $c_{-n-1}$, the coefficients of all negative powers of $z - a$ in the Laurent series vanish. Conversely, if $a$ is a removable singularity of $f$, then $f$ can be expanded in a power series centered at $a$, so that
$$\lim_{z \to a} (z - a) f(z) = 0$$
because the constant term in the power series of $(z - a) f(z)$ is zero. A corollary of this theorem is the following: if $f$ is bounded near $a$, then $|(z - a) f(z)| \le |z - a| M$ for some $M > 0$. This implies that $(z - a) f(z) \to 0$ as $z \to a$, so $a$ is a removable singularity of $f$. Version: 1 Owner: pbruin Author(s): pbruin
419.9
residue
Let $U \subset \mathbb{C}$ be a domain and let $f : U \to \mathbb{C}$ be a function represented by a Laurent series
$$f(z) := \sum_{k=-\infty}^{\infty} c_k (z - a)^k$$
centered about $a$. The coefficient $c_{-1}$ of the above Laurent series is called the residue of $f$ at $a$, and denoted $\mathrm{Res}(f; a)$. Version: 2 Owner: djao Author(s): djao
419.10
simple pole
A simple pole is a pole of order 1. That is, a meromorphic function $f$ has a simple pole at $x_0 \in \mathbb{C}$ if
$$f(z) = \frac{a}{z - x_0} + g(z)$$
where $a \in \mathbb{C}$, $a \ne 0$, and $g$ is holomorphic at $x_0$. Version: 3 Owner: bwebste Author(s): bwebste
Chapter 420 30E20 – Integration, integrals of Cauchy type, integral representations of analytic functions
420.1 Cauchy integral formula
The formulas. Let $D = \{z \in \mathbb{C} : |z - z_0| < R\}$ be an open disk in the complex plane, and let $f(z)$ be a holomorphic$^1$ function defined on some open domain that contains $D$ and its boundary. Then, for every $z \in D$ we have
$$f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta$$
$$f'(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta$$
$$\vdots$$
$$f^{(n)}(z) = \frac{n!}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\, d\zeta$$
Here $C = \partial D$ is the corresponding circular boundary contour, oriented counterclockwise, with the most obvious parameterization given by $\zeta = z_0 + R e^{it}$, $0 \le t \le 2\pi$.
Discussion. The first of the above formulas underscores the “rigidity” of holomorphic functions. Indeed, the values of the holomorphic function inside a disk $D$ are completely specified by its values on the boundary of the disk. The second formula is useful, because it gives the derivative in terms of an integral, rather than as the outcome of a limit process.
Generalization. The following technical generalization of the formula is needed for the treatment of removable singularities. Let $S$ be a finite subset of $D$, and suppose that $f(z)$ is holomorphic for all $z \notin S$, but also that $f(z)$ is bounded near all $z \in S$. Then, the above formulas are valid for all $z \in D \setminus S$. Using the Cauchy residue theorem, one can further generalize the integral formula to the situation where $D$ is any domain and $C$ is any closed rectifiable curve in $D$; in this case, the formula becomes
$$\eta(C, z) f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta$$
where $\eta(C, z)$ denotes the winding number of $C$ about $z$. It is valid for all points $z \in D \setminus S$ which are not on the curve $C$.
Footnote 1. It is necessary to draw a distinction between holomorphic functions (those having a complex derivative) and analytic functions (those representable by power series). The two concepts are, in fact, equivalent, but the standard proof of this fact uses the Cauchy Integral Formula with the (apparently) weaker holomorphicity hypothesis.
Version: 19 Owner: djao Author(s): djao, rmilson
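The integral formulas lend themselves to a direct numerical check. The sketch below is an illustration under stated assumptions: the helper names `circle_integral` and `cauchy_value` are hypothetical, the contour integral is approximated by the trapezoid rule in the parameter $t$ of $\zeta = z_0 + Re^{it}$, and the test function $e^z$ and the evaluation point are arbitrary choices.

```python
import cmath
import math


def circle_integral(g, center, radius, n=2000):
    """Trapezoid-rule approximation of the counterclockwise integral of g
    over the circle |zeta - center| = radius."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        zeta = center + radius * e
        dzeta_dt = 1j * radius * e          # derivative of zeta with respect to t
        total += g(zeta) * dzeta_dt
    return total * (2 * math.pi / n)


def cauchy_value(f, z, center, radius, order=0):
    """Approximate f^(order)(z) via n!/(2 pi i) * integral of f(zeta)/(zeta - z)^(order+1)."""
    integral = circle_integral(
        lambda zeta: f(zeta) / (zeta - z) ** (order + 1), center, radius
    )
    return math.factorial(order) * integral / (2j * math.pi)


z = 0.3 + 0.2j                                        # a point inside the unit disk
approx_value = cauchy_value(cmath.exp, z, 0, 1.0)     # approximates exp(z)
approx_second = cauchy_value(cmath.exp, z, 0, 1.0, order=2)  # approximates exp''(z) = exp(z)
```

For a smooth periodic integrand such as this one the trapezoid rule converges very quickly, so both values agree with $e^z$ to near machine precision.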
420.2
Cauchy integral theorem
Theorem 10. Let $U \subset \mathbb{C}$ be an open, simply connected domain, and let $f : U \to \mathbb{C}$ be a function whose complex derivative, that is
$$\lim_{w \to z} \frac{f(w) - f(z)}{w - z},$$
exists for all $z \in U$. Then, the integral around every closed contour $\gamma \subset U$ vanishes; in symbols
$$\oint_\gamma f(z)\, dz = 0.$$
We also have the following, technically important generalization involving removable singularities. Theorem 11. Let $U \subset \mathbb{C}$ be an open, simply connected domain, and $S \subset U$ a finite subset. Let $f : U \setminus S \to \mathbb{C}$ be a function whose complex derivative exists for all $z \in U \setminus S$, and that is bounded near all $z \in S$. Then, the integral around every closed contour $\gamma \subset U \setminus S$ that avoids the exceptional points vanishes.
Cauchy’s theorem is an essential stepping stone in the theory of complex analysis. It is required for the proof of the Cauchy integral formula, which in turn is required for the proof that the existence of a complex derivative implies a power series representation.
The original version of the theorem, as stated by Cauchy in the early 1800s, requires that the derivative $f'(z)$ exist and be continuous. The existence of $f'(z)$ implies the Cauchy-Riemann equations, which in turn can be restated as the fact that the complex-valued differential $f(z)\, dz$ is closed. The original proof makes use of this fact, and calls on Green’s theorem to conclude that the contour integral vanishes. The proof of Green’s theorem, however, involves an interchange of order in a double integral, and this can only be justified if the integrand, which involves the real and imaginary parts of $f'(z)$, is assumed to be continuous. To this date, many authors prove the theorem this way, but erroneously fail to mention the continuity assumption.
In the latter part of the 19th century E. Goursat found a proof of the integral theorem that merely required that $f'(z)$ exist. Continuity of the derivative, as well as the existence of all higher derivatives, then follows as a consequence of the Cauchy integral formula. Not only is Goursat’s version a sharper result, but it is also more elementary and self-contained, in the sense that it does not require Green’s theorem. Goursat’s argument makes use of a rectangular contour (many authors use triangles though), but the extension to an arbitrary simply connected domain is relatively straightforward.
Theorem 12 (Goursat). Let $U$ be an open domain containing a rectangle
$$R = \{x + iy \in \mathbb{C} : a \le x \le b,\ c \le y \le d\}.$$
If the complex derivative of a function $f : U \to \mathbb{C}$ exists at all points of $U$, then the contour integral of $f$ around the boundary of $R$ vanishes; in symbols
$$\oint_{\partial R} f(z)\, dz = 0.$$
Bibliography. • L. Ahlfors, “Complex Analysis”. Version: 7 Owner: rmilson Author(s): rmilson
420.3
Cauchy residue theorem
Let $U \subset \mathbb{C}$ be a simply connected domain, and suppose $f$ is a complex valued function which is defined and analytic on all but finitely many points $a_1, \ldots, a_m$ of $U$. Let $C$ be a closed curve in $U$ which does not intersect any of the $a_i$. Then
$$\int_C f(z)\, dz = 2\pi i \sum_{i=1}^{m} \eta(C, a_i)\, \mathrm{Res}(f; a_i),$$
where
$$\eta(C, a_i) := \frac{1}{2\pi i} \int_C \frac{dz}{z - a_i}$$
is the winding number of $C$ about $a_i$, and $\mathrm{Res}(f; a_i)$ denotes the residue of $f$ at $a_i$. Version: 4 Owner: djao Author(s): djao, rmilson
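As an illustration of the residue theorem, the sketch below integrates a sample function numerically around two circles. Everything specific here is an assumption chosen for the example: the function $e^z/z^2 + 3/(z - 0.5)$ (residue 1 at 0 and residue 3 at 0.5), the radii, and the hypothetical helper `circle_integral`, which uses the trapezoid rule.

```python
import cmath
import math


def circle_integral(g, center, radius, n=2000):
    """Trapezoid-rule approximation of the counterclockwise integral of g
    over the circle |z - center| = radius (winding number 1 about interior points)."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = center + radius * e
        total += g(z) * (1j * radius * e)   # g(z) * dz/dt
    return total * (2 * math.pi / n)


def f(z):
    """Sample function with a double pole at 0 (residue 1) and a simple pole at 0.5 (residue 3)."""
    return cmath.exp(z) / z**2 + 3 / (z - 0.5)


big = circle_integral(f, 0, 1.0)      # encloses both poles
small = circle_integral(f, 0, 0.25)   # encloses only the pole at 0

expected_big = 2j * math.pi * (1 + 3)
expected_small = 2j * math.pi * 1
```

The larger circle picks up both residues, while the smaller one sees only the residue at the origin, in line with the winding numbers being 1 or 0.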
420.4
Gauss’ mean value theorem
Let $\Omega$ be a domain in $\mathbb{C}$ and suppose $f$ is an analytic function on $\Omega$. Furthermore, let $C$ be a circle inside $\Omega$ with center $z_0$ and radius $r$. Then $f(z_0)$ is the mean value of $f$ along $C$, that is,
$$f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + r e^{i\theta})\, d\theta.$$
Version: 7 Owner: Johan Author(s): Johan
420.5
Möbius circle transformation theorem
Möbius transformations always transform circles into circles (where straight lines are counted as circles). Version: 1 Owner: Johan Author(s): Johan
420.6
Möbius transformation cross-ratio preservation theorem
A Möbius transformation $f : z \mapsto w$ preserves the cross-ratios, i.e.
$$\frac{(w_1 - w_2)(w_3 - w_4)}{(w_1 - w_4)(w_3 - w_2)} = \frac{(z_1 - z_2)(z_3 - z_4)}{(z_1 - z_4)(z_3 - z_2)}.$$
Version: 3 Owner: Johan Author(s): Johan
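The invariance is easy to check numerically. In the sketch below the coefficients of the transformation and the four points are arbitrary sample values (assumptions for illustration), and `mobius` and `cross_ratio` are hypothetical helper names.

```python
def mobius(a, b, c, d):
    """Return the Möbius transformation z -> (a z + b)/(c z + d), requiring ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    return lambda z: (a * z + b) / (c * z + d)


def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio in the convention used above."""
    return (z1 - z2) * (z3 - z4) / ((z1 - z4) * (z3 - z2))


f = mobius(2 + 1j, -1, 1j, 3)                 # sample coefficients; pole at z = 3i
zs = [0.5 + 0.5j, -1 + 2j, 3 - 1j, 1j]        # four distinct points away from the pole
ws = [f(z) for z in zs]

before = cross_ratio(*zs)
after = cross_ratio(*ws)
```

The two values agree up to floating-point rounding.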
420.7
Rouché’s theorem
Let $f, g$ be analytic on and inside a simple closed curve $C$. Suppose $|f(z)| > |g(z)|$ on $C$. Then $f$ and $f + g$ have the same number of zeros inside $C$. Version: 2 Owner: Johan Author(s): Johan
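A worked instance may help. Take the sample polynomial $p(z) = z^5 + 3z + 1$ (an assumption chosen for illustration). On $|z| = 1$, $|3z| = 3 > 2 \ge |z^5 + 1|$, so $p$ has as many zeros in the unit disk as $3z$, namely one; on $|z| = 2$, $|z^5| = 32 > 7 \ge |3z + 1|$, so all five zeros lie in $|z| < 2$. The sketch below confirms these counts independently with the argument principle, $\frac{1}{2\pi i}\oint p'/p\, dz$, approximated by the trapezoid rule; `count_zeros` is a hypothetical helper name.

```python
import cmath
import math


def count_zeros(p, dp, radius, n=4000):
    """Count zeros of p inside |z| = radius via the argument principle,
    assuming p has no zeros on the circle itself."""
    total = 0j
    for k in range(n):
        z = radius * cmath.exp(2j * math.pi * k / n)
        total += dp(z) / p(z) * (1j * z)    # (p'/p) * dz/dt, with dz/dt = i z
    integral = total * (2 * math.pi / n)
    return round((integral / (2j * math.pi)).real)


def p(z):
    return z**5 + 3 * z + 1


def dp(z):
    return 5 * z**4 + 3


zeros_in_unit_disk = count_zeros(p, dp, 1.0)
zeros_in_radius_2 = count_zeros(p, dp, 2.0)
```

The numerical counts match the ones predicted by comparing with $3z$ and with $z^5$.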
420.8
absolute convergence implies convergence for an inﬁnite product
If an inﬁnite product is absolutely convergent then it is convergent. Version: 2 Owner: Johan Author(s): Johan
420.9
absolute convergence of inﬁnite product
An infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ is said to be absolutely convergent if $\prod_{n=1}^{\infty} (1 + |a_n|)$ converges.
Version: 4 Owner: mathcam Author(s): mathcam, Johan
420.10
closed curve theorem
Let $U \subset \mathbb{C}$ be a simply connected domain, and suppose $f : U \to \mathbb{C}$ is holomorphic. Then
$$\int_C f(z)\, dz = 0$$
for any smooth closed curve $C$ in $U$. More generally, if $U$ is any domain, and $C_1$ and $C_2$ are two homotopic smooth closed curves in $U$, then
$$\int_{C_1} f(z)\, dz = \int_{C_2} f(z)\, dz$$
for any holomorphic function $f : U \to \mathbb{C}$. Version: 3 Owner: djao Author(s): djao
420.11
conformal Möbius circle map theorem
Any conformal map that maps the interior of the unit disc onto itself is a Möbius transformation. Version: 4 Owner: Johan Author(s): Johan
420.12
conformal mapping
A mapping $f : \mathbb{C} \to \mathbb{C}$ which preserves the size and orientation of the angle between any two curves that intersect at a given point $z_0$ is said to be conformal at $z_0$. A mapping that is conformal at every point in a domain $D$ is said to be conformal in $D$. Version: 4 Owner: Johan Author(s): Johan
420.13
conformal mapping theorem
Let $f(z)$ be analytic in a domain $D$. Then it is conformal at any point $z \in D$ where $f'(z) \ne 0$. Version: 2 Owner: Johan Author(s): Johan
420.14
convergence/divergence for an inﬁnite product
Consider $\prod_{n=1}^{\infty} p_n$. We say that this infinite product converges iff the finite products $P_m = \prod_{n=1}^{m} p_n$ converge to a limit $P \ne 0$, or if at most a finite number of factors $p_{n_k} = 0$, $k = 1, \ldots, K$, vanish and the partial products of the remaining factors converge to a nonzero limit. Otherwise the infinite product is called divergent. Note: The infinite product vanishes only if a factor is zero. Version: 6 Owner: Johan Author(s): Johan
420.15
example of conformal mapping
Consider the four curves A = {t}, B = {t + it}, C = {it} and D = {−t + it}, t ∈ [−10, 10]. Suppose there is a mapping f : C → C which maps A to D and B to C. Is f conformal at z0 = 0? The size of the angles between A and B at the point of intersection z0 = 0 is preserved, however the orientation is not. Therefore f is not conformal at z0 = 0. Now suppose there is a function g : C → C which maps A to C and B to D. In this case we see not only that the size of the angles is preserved, but also the orientation. Therefore g is conformal at z0 = 0. Version: 3 Owner: Johan Author(s): Johan
420.16
examples of inﬁnite products
A classic example is the Riemann zeta function. For $\mathrm{Re}(z) > 1$ we have
$$\zeta(z) = \sum_{n=1}^{\infty} \frac{1}{n^z} = \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-z}}.$$
With the help of a Fourier series, or in other ways, one can prove this infinite product expansion of the sine function:
$$\sin z = z \prod_{n=1}^{\infty} \left( 1 - \frac{z^2}{n^2 \pi^2} \right) \qquad (420.16.1)$$
where $z$ is an arbitrary complex number. Taking the logarithmic derivative (a frequent move in connection with infinite products) we get a decomposition of the cotangent into partial fractions:
$$\pi \cot \pi z = \frac{1}{z} + \sum_{n=1}^{\infty} \left( \frac{1}{z + n} + \frac{1}{z - n} \right). \qquad (420.16.2)$$
The equation (420.16.2), in turn, has some interesting uses, e.g. to get the Taylor expansion of an Eisenstein series, or to evaluate $\zeta(2n)$ for positive integers $n$. Version: 1 Owner: mathcam Author(s): Larry Hammick
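The product expansion of the sine can be explored numerically by truncating it. The following sketch compares a partial product with `cmath.sin`; the sample point and the number of factors are arbitrary choices, and `sine_product` is a hypothetical helper name. Convergence is slow: the relative error after $N$ factors is roughly $|z|^2 / (\pi^2 N)$.

```python
import cmath
import math


def sine_product(z, terms):
    """Partial product z * prod_{n=1}^{terms} (1 - z^2 / (n^2 pi^2))."""
    product = z
    for n in range(1, terms + 1):
        product *= 1 - z * z / (n * n * math.pi**2)
    return product


z = 1 + 0.5j                          # arbitrary sample point
approx = sine_product(z, 100_000)     # truncated product
exact = cmath.sin(z)
error = abs(approx - exact)
```

With 100,000 factors the truncated product agrees with the sine to about five or six digits.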
420.17
link between infinite products and sums
Let $\prod_{k=1}^{\infty} p_k$ be an infinite product such that $p_k > 0$ for all $k$. Then the infinite product converges if and only if the infinite sum
$$\sum_{k=1}^{\infty} \log p_k$$
converges. Moreover
$$\prod_{k=1}^{\infty} p_k = \exp \sum_{k=1}^{\infty} \log p_k.$$
Proof. Simply notice that
$$\prod_{k=1}^{N} p_k = \exp \sum_{k=1}^{N} \log p_k.$$
If the infinite sum converges then
$$\lim_{N \to \infty} \prod_{k=1}^{N} p_k = \lim_{N \to \infty} \exp \sum_{k=1}^{N} \log p_k = \exp \sum_{k=1}^{\infty} \log p_k$$
and also the infinite product converges. Version: 1 Owner: paolini Author(s): paolini
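The identity between partial products and exponentials of partial log-sums can be checked directly. The sketch below uses the sample factors $p_k = 1 + 1/k^2$ (an assumption for illustration); the closed-form limit $\sinh(\pi)/\pi$ for this particular product is a standard fact that follows from the sine product, not something stated in this entry.

```python
import math


def partial_product(p, N):
    """Direct partial product p(1) * ... * p(N)."""
    return math.prod(p(k) for k in range(1, N + 1))


def exp_of_log_sum(p, N):
    """exp of the partial sum log p(1) + ... + log p(N)."""
    return math.exp(math.fsum(math.log(p(k)) for k in range(1, N + 1)))


def p(k):
    """Sample positive factors p_k = 1 + 1/k^2."""
    return 1 + 1 / (k * k)


N = 100_000
direct = partial_product(p, N)
via_logs = exp_of_log_sum(p, N)
limit = math.sinh(math.pi) / math.pi   # known value of the infinite product
```

The two computations of the partial product agree to rounding error, and both are already close to the limiting value.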
420.18
proof of Cauchy integral formula
Let $D = \{z \in \mathbb{C} : |z - z_0| < R\}$ be a disk in the complex plane, $S \subset D$ a finite subset, and $U \subset \mathbb{C}$ an open domain that contains the closed disk $\bar{D}$. Suppose that
• $f : U \setminus S \to \mathbb{C}$ is holomorphic, and that
• $f(z)$ is bounded on $D \setminus S$.
Let $z \in D \setminus S$ be given, and set
$$g(\zeta) = \frac{f(\zeta) - f(z)}{\zeta - z}, \quad \zeta \in D \setminus S',$$
where $S' = S \cup \{z\}$. Note that $g(\zeta)$ is holomorphic and bounded on $D \setminus S'$. The second assertion is true, because $g(\zeta) \to f'(z)$ as $\zeta \to z$. Therefore, by the Cauchy integral theorem
$$\oint_C g(\zeta)\, d\zeta = 0,$$
where $C$ is the counterclockwise circular contour parameterized by $\zeta = z_0 + R e^{it}$, $0 \le t \le 2\pi$. Hence,
$$\oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \oint_C \frac{f(z)}{\zeta - z}\, d\zeta. \qquad (420.18.1)$$
Lemma. If $z \in \mathbb{C}$ is such that $|z| \ne 1$, then
$$\oint_{|\zeta| = 1} \frac{d\zeta}{\zeta - z} = \begin{cases} 0 & \text{if } |z| > 1 \\ 2\pi i & \text{if } |z| < 1 \end{cases}$$
The proof is a fun exercise in elementary integral calculus, an application of the half-angle trigonometric substitutions. Thanks to the Lemma, the right hand side of (420.18.1) evaluates to $2\pi i f(z)$. Dividing through by $2\pi i$, we obtain
$$f(z) = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$
as desired. Since a circle is a compact set, the defining limit for the derivative
$$\frac{d}{dz} \frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$
converges uniformly for $\zeta \in \partial D$. Thanks to the uniform convergence, the order of the derivative and the integral operations can be interchanged. In this way we obtain the second formula:
$$f'(z) = \frac{1}{2\pi i} \frac{d}{dz} \oint_C \frac{f(\zeta)}{\zeta - z}\, d\zeta = \frac{1}{2\pi i} \oint_C \frac{f(\zeta)}{(\zeta - z)^2}\, d\zeta.$$
Version: 9 Owner: rmilson Author(s): rmilson, stawn
420.19
proof of Cauchy residue theorem
Since $f$ is holomorphic, by the Cauchy-Riemann equations the differential form $f(z)\, dz$ is closed. So by the lemma about closed differential forms on a simply connected domain we know that the integral $\int_C f(z)\, dz$ is equal to $\int_{C'} f(z)\, dz$ if $C'$ is any curve which is homotopic to $C$. In particular we can consider a curve $C'$ which turns around the points $a_j$ along small circles and joins these small circles with segments. Since the curve $C'$ follows each segment two times with opposite orientation, it is enough to sum the integrals of $f$ around the small circles. So letting $z = a_j + \rho e^{i\theta}$ be a parameterization of the curve around the point $a_j$, we have $dz = \rho i e^{i\theta}\, d\theta$ and hence
$$\int_C f(z)\, dz = \int_{C'} f(z)\, dz = \sum_j \eta(C, a_j) \int_{\partial B_\rho(a_j)} f(z)\, dz = \sum_j \eta(C, a_j) \int_0^{2\pi} f(a_j + \rho e^{i\theta}) \rho i e^{i\theta}\, d\theta$$
where $\rho > 0$ is chosen so small that the balls $B_\rho(a_j)$ are all disjoint and all contained in the domain $U$. So by linearity, it is enough to prove that for all $j$
$$i \int_0^{2\pi} f(a_j + \rho e^{i\theta}) \rho e^{i\theta}\, d\theta = 2\pi i\, \mathrm{Res}(f, a_j).$$
Let now $j$ be fixed and consider the Laurent series for $f$ in $a_j$:
$$f(z) = \sum_{k \in \mathbb{Z}} c_k (z - a_j)^k$$
so that $\mathrm{Res}(f, a_j) = c_{-1}$. We have
$$\int_0^{2\pi} f(a_j + \rho e^{i\theta}) \rho e^{i\theta}\, d\theta = \sum_k \int_0^{2\pi} c_k (\rho e^{i\theta})^k \rho e^{i\theta}\, d\theta = \sum_k \rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta.$$
Notice now that if $k = -1$ we have
$$\rho^{k+1} c_k \int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = c_{-1} \int_0^{2\pi} d\theta = 2\pi c_{-1} = 2\pi\, \mathrm{Res}(f, a_j)$$
while for $k \ne -1$ we have
$$\int_0^{2\pi} e^{i(k+1)\theta}\, d\theta = \left[ \frac{e^{i(k+1)\theta}}{i(k+1)} \right]_0^{2\pi} = 0.$$
Hence the result follows. Version: 2 Owner: paolini Author(s): paolini
420.20
proof of Gauss’ mean value theorem
We can parametrize the circle by letting $z = z_0 + r e^{i\phi}$. Then $dz = i r e^{i\phi}\, d\phi$. Using the Cauchy integral formula we can express $f(z_0)$ in the following way:
$$f(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\, dz = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + r e^{i\phi})}{r e^{i\phi}}\, i r e^{i\phi}\, d\phi = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + r e^{i\phi})\, d\phi.$$
Version: 12 Owner: Johan Author(s): Johan
420.21
proof of Goursat’s theorem
We argue by contradiction. Set
$$\eta = \oint_{\partial R} f(z)\, dz,$$
and suppose that $\eta \ne 0$. Divide $R$ into four congruent rectangles $R_1, R_2, R_3, R_4$ (see Figure 1), and set
$$\eta_i = \oint_{\partial R_i} f(z)\, dz.$$
Figure 1: subdivision of the rectangle contour.
Now subdivide each of the four sub-rectangles, to get 16 congruent sub-sub-rectangles $R_{i_1 i_2}$, $i_1, i_2 = 1, \ldots, 4$, and then continue ad infinitum to obtain a sequence of nested families of rectangles $R_{i_1 \ldots i_k}$, with $\eta_{i_1 \ldots i_k}$ the values of $f(z)$ integrated along the corresponding contour. Orienting the boundary of $R$ and all the sub-rectangles in the usual counterclockwise fashion we have
$$\eta = \eta_1 + \eta_2 + \eta_3 + \eta_4,$$
and more generally
$$\eta_{i_1 \ldots i_k} = \eta_{i_1 \ldots i_k 1} + \eta_{i_1 \ldots i_k 2} + \eta_{i_1 \ldots i_k 3} + \eta_{i_1 \ldots i_k 4}.$$
In as much as the integrals along oppositely oriented line segments cancel, the contributions from the interior segments cancel, and that is why the right-hand side reduces to the integrals along the segments at the boundary of the composite rectangle. Let $j_1 \in \{1, 2, 3, 4\}$ be such that $|\eta_{j_1}|$ is the maximum of $|\eta_i|$, $i = 1, \ldots, 4$. By the triangle inequality we have
$$|\eta_1| + |\eta_2| + |\eta_3| + |\eta_4| \ge |\eta|,$$
and hence $|\eta_{j_1}| \ge \frac{1}{4} |\eta|$.
Continuing inductively, let $j_{k+1}$ be such that $|\eta_{j_1 \ldots j_k j_{k+1}}|$ is the maximum of $|\eta_{j_1 \ldots j_k i}|$, $i = 1, \ldots, 4$. We then have
$$|\eta_{j_1 \ldots j_k j_{k+1}}| \ge 4^{-(k+1)} |\eta|. \qquad (420.21.1)$$
Now the sequence of nested rectangles $R_{j_1 \ldots j_k}$ converges to some point $z_0 \in R$; more formally
$$\{z_0\} = \bigcap_{k=1}^{\infty} R_{j_1 \ldots j_k}.$$
The derivative $f'(z_0)$ is assumed to exist, and hence for every $\epsilon > 0$ there exists a $k$ sufficiently large, so that for all $z \in R_{j_1 \ldots j_k}$ we have
$$|f(z) - f(z_0) - f'(z_0)(z - z_0)| \le \epsilon |z - z_0|.$$
Now we make use of the following. Lemma 9. Let $Q \subset \mathbb{C}$ be a rectangle, let $a, b \in \mathbb{C}$, and let $f(z)$ be a continuous, complex valued function defined and bounded in a domain containing $Q$. Then,
$$\oint_{\partial Q} (az + b)\, dz = 0, \qquad \left| \oint_{\partial Q} f(z)\, dz \right| \le MP,$$
where $M$ is an upper bound for $|f(z)|$ and where $P$ is the length of $\partial Q$.
The first of these assertions follows by the fundamental theorem of calculus; after all the function $az + b$ has an antiderivative. The second assertion follows from the fact that the absolute value of an integral is smaller than the integral of the absolute value of the integrand, a standard result in integration theory. Using the lemma and the fact that the perimeter of a rectangle is greater than its diameter we infer that for every $\epsilon > 0$ there exists a $k$ sufficiently large that
$$|\eta_{j_1 \ldots j_k}| = \left| \oint_{\partial R_{j_1 \ldots j_k}} f(z)\, dz \right| \le \epsilon |\partial R_{j_1 \ldots j_k}|^2 = 4^{-k} \epsilon |\partial R|^2,$$
where $|\partial R|$ denotes the length of the perimeter of the rectangle $R$. Choosing $\epsilon < |\eta| / |\partial R|^2$, this contradicts the earlier estimate (420.21.1). Therefore $\eta = 0$. Version: 10 Owner: rmilson Author(s): rmilson
420.22
proof of M¨bius circle transformation theorem o
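Case 1: $f(z) = az + b$.
Case 1a: The points on $|z - C| = R$ can be written as $z = C + Re^{i\theta}$. They are mapped to the points $w = aC + b + aRe^{i\theta}$, which all lie on the circle $|w - (aC + b)| = |a|R$.
Case 1b: The line $\mathrm{Re}(e^{i\theta} z) = k$ is mapped to the line
$$\mathrm{Re}\left(\frac{e^{i\theta} w}{a}\right) = k + \mathrm{Re}\left(\frac{e^{i\theta} b}{a}\right).$$
Case 2: $f(z) = 1/z$.
Case 2a: Consider a circle passing through the origin. This can be written as $|z - C| = |C|$. This circle is mapped to the line $\mathrm{Re}(Cw) = \frac{1}{2}$, which does not pass through the origin. To show this, write $z = C + |C|e^{i\theta}$, so that $w = \frac{1}{z} = \frac{1}{C + |C|e^{i\theta}}$. Then
$$\mathrm{Re}(Cw) = \frac{1}{2}\left(Cw + \bar{C}\bar{w}\right) = \frac{1}{2}\left(\frac{C}{C + |C|e^{i\theta}} + \frac{\bar{C}}{\bar{C} + |C|e^{-i\theta}} \cdot \frac{e^{i\theta} C/|C|}{e^{i\theta} C/|C|}\right) = \frac{1}{2}\left(\frac{C}{C + |C|e^{i\theta}} + \frac{|C|e^{i\theta}}{|C|e^{i\theta} + C}\right) = \frac{1}{2}.$$
Case 2b: Consider a line which does not pass through the origin. This can be written as $\mathrm{Re}(az) = 1$ for $a \neq 0$. Then $az + \bar{a}\bar{z} = 2$, which is mapped to $\frac{a}{w} + \frac{\bar{a}}{\bar{w}} = 2$. This simplifies to $a\bar{w} + \bar{a}w = 2w\bar{w}$, which becomes $(w - a/2)(\bar{w} - \bar{a}/2) = a\bar{a}/4$, or $\left|w - \frac{a}{2}\right| = \frac{|a|}{2}$, which is a circle passing through the origin.
Case 2c: Consider a circle which does not pass through the origin. This can be written as $|z - C| = R$ with $|C| \neq R$. This circle is mapped to the circle
$$\left|w - \frac{\bar{C}}{|C|^2 - R^2}\right| = \frac{R}{\left||C|^2 - R^2\right|},$$
which is another circle not passing through the origin. To show this, we will demonstrate that
$$\frac{\bar{C}}{|C|^2 - R^2} + \frac{R}{|C|^2 - R^2} \cdot \frac{C - z}{R} \cdot \frac{\bar{z}}{z} = \frac{1}{z}.$$
Note: $\left|\frac{C - z}{R} \cdot \frac{\bar{z}}{z}\right| = 1$. Indeed,
$$\frac{\bar{C}}{|C|^2 - R^2} + \frac{(C - z)\bar{z}}{z(|C|^2 - R^2)} = \frac{z\bar{C} - z\bar{z} + \bar{z}C}{z(|C|^2 - R^2)} = \frac{C\bar{C} - (z - C)(\bar{z} - \bar{C})}{z(|C|^2 - R^2)} = \frac{|C|^2 - R^2}{z(|C|^2 - R^2)} = \frac{1}{z}.$$
Case 2d: Consider a line passing through the origin. This can be written as $\mathrm{Re}(e^{i\theta} z) = 0$. This is mapped to the set $\mathrm{Re}\left(\frac{e^{i\theta}}{w}\right) = 0$, which can be rewritten as $\mathrm{Re}(e^{i\theta}\bar{w}) = 0$ or $\mathrm{Re}(we^{-i\theta}) = 0$, which is another line passing through the origin.
Case 3: An arbitrary Möbius transformation can be written as $f(z) = \frac{az + b}{cz + d}$. If $c = 0$, this falls into Case 1, so we will assume that $c \neq 0$. Let
$$f_1(z) = cz + d, \qquad f_2(z) = \frac{1}{z}, \qquad f_3(z) = \frac{bc - ad}{c}\,z + \frac{a}{c}.$$
Then $f = f_3 \circ f_2 \circ f_1$. By Case 1, $f_1$ and $f_3$ map circles to circles, and by Case 2, $f_2$ maps circles to circles.
Version: 2 Owner: brianbirgen Author(s): brianbirgen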
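The decomposition in Case 3 can be checked numerically. The following is a minimal sketch, not part of the original proof: the coefficients, the sample circle, and the helper names (`mobius`, `decomposed`, `circumcircle`, `image_circle_residual`) are all illustrative assumptions. It compares f with f3 ∘ f2 ∘ f1 and tests whether the images of points on a circle are again concyclic.

```python
import cmath

def mobius(a, b, c, d):
    """Return the map z -> (a z + b) / (c z + d)."""
    return lambda z: (a * z + b) / (c * z + d)

def decomposed(a, b, c, d):
    """Return f3 o f2 o f1 as in Case 3 (requires c != 0)."""
    f1 = lambda z: c * z + d
    f2 = lambda z: 1 / z
    f3 = lambda z: ((b * c - a * d) / c) * z + a / c
    return lambda z: f3(f2(f1(z)))

def circumcircle(z1, z2, z3):
    """Centre and radius of the circle through three non-collinear points."""
    u, v = z2 - z1, z3 - z1
    centre = z1 + u * v * (u.conjugate() - v.conjugate()) / (
        u.conjugate() * v - u * v.conjugate())
    return centre, abs(centre - z1)

def image_circle_residual(f, C, R, samples=12):
    """Largest deviation of the image points from the circle through three of them."""
    pts = [f(C + R * cmath.exp(2j * cmath.pi * k / samples)) for k in range(samples)]
    centre, radius = circumcircle(pts[0], pts[1], pts[2])
    return max(abs(abs(w - centre) - radius) for w in pts)

# Hypothetical coefficients with c != 0 and ad - bc != 0.
a, b, c, d = 2 + 1j, -1, 1j, 3
f, g = mobius(a, b, c, d), decomposed(a, b, c, d)
print(max(abs(f(z) - g(z)) for z in (0, 1, 1j, -2 + 0.5j)))
print(image_circle_residual(f, 1 + 1j, 0.5))
```

Both printed numbers should be at the level of floating-point rounding; the sample circle is chosen so that it avoids the pole z = -d/c.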
420.23
proof of Simultaneous converging or diverging of product and sum theorem
From the fact that $1 + x \le e^x$ for $x \ge 0$ we get
$$\sum_{n=1}^{m} a_n \le \prod_{n=1}^{m} (1 + a_n) \le e^{\sum_{n=1}^{m} a_n}.$$
Since $a_n \ge 0$, both the partial sums and the partial products are monotone increasing with the number of terms. This concludes the proof.
Version: 2 Owner: Johan Author(s): Johan
420.24
proof of absolute convergence implies convergence for an inﬁnite product
This comes at once from the link between inﬁnite products and sums and the absolute convergence theorem for inﬁnite sums. Version: 1 Owner: paolini Author(s): paolini
420.25
proof of closed curve theorem
Let $f(x + iy) = u(x, y) + iv(x, y)$. Hence we have
$$\int_C f(z)\,dz = \int_C \omega + i\int_C \eta,$$
where ω and η are the differential forms
$$\omega = u(x, y)\,dx - v(x, y)\,dy, \qquad \eta = v(x, y)\,dx + u(x, y)\,dy.$$
Notice that by the Cauchy-Riemann equations ω and η are closed differential forms. Hence by the lemma on closed differential forms on a simply connected domain we get
$$\int_{C_1} \omega = \int_{C_2} \omega, \qquad \int_{C_1} \eta = \int_{C_2} \eta,$$
and hence
$$\int_{C_1} f(z)\,dz = \int_{C_2} f(z)\,dz.$$
Version: 2 Owner: paolini Author(s): paolini
420.26
proof of conformal Möbius circle map theorem
Let f be a conformal map from the unit disk ∆ onto itself. Let $a = f(0)$. Let $g_a(z) = \frac{z - a}{1 - \bar{a}z}$. Then $g_a \circ f$ is a conformal map from ∆ onto itself, with $g_a \circ f(0) = 0$. Therefore, by Schwarz's lemma, for all z ∈ ∆,
$$|g_a \circ f(z)| \le |z|.$$
Because f is a conformal map onto ∆, $f^{-1}$ is also a conformal map of ∆ onto itself. Moreover $(g_a \circ f)^{-1}(0) = 0$, so that by Schwarz's lemma $|(g_a \circ f)^{-1}(w)| \le |w|$ for all w ∈ ∆. Writing $w = g_a \circ f(z)$, this becomes $|z| \le |g_a \circ f(z)|$. Therefore, for all z ∈ ∆, $|g_a \circ f(z)| = |z|$. By Schwarz's lemma, $g_a \circ f$ is a rotation. Write $g_a \circ f(z) = e^{i\theta}z$, or $f(z) = g_a^{-1}(e^{i\theta}z)$.
Therefore, f is a Möbius transformation.
Version: 2 Owner: brianbirgen Author(s): brianbirgen
420.27
simultaneous converging or diverging of product and sum theorem
Let $a_k \ge 0$. Then
$$\prod_{n=1}^{\infty} (1 + a_n) \quad \text{and} \quad \sum_{n=1}^{\infty} a_n$$
converge or diverge simultaneously.
Version: 3 Owner: Johan Author(s): Johan
420.28
Cauchy-Riemann equations
The following system of partial differential equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x},$$
where u(x, y), v(x, y) are real-valued functions defined on some open subset of $\mathbb{R}^2$, was introduced by Riemann [1] as a definition of a holomorphic function. Indeed, if f(z) satisfies the standard definition of a holomorphic function, i.e. if the complex derivative
$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta}$$
exists in the domain of definition, then the real and imaginary parts of f(z) satisfy the Cauchy-Riemann equations. Conversely, if u and v satisfy the Cauchy-Riemann equations, and if their partial derivatives are continuous, then the complex-valued function
$$f(z) = u(x, y) + iv(x, y), \qquad z = x + iy,$$
possesses a continuous complex derivative.
References
1. D. Laugwitz, Bernhard Riemann, 1826-1866: Turning Points in the Conception of Mathematics, translated by Abe Shenitzer. Birkhäuser, 1999.
Version: 2 Owner: rmilson Author(s): rmilson
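As an illustration of the equations, the sketch below estimates the two Cauchy-Riemann residuals u_x - v_y and u_y + v_x by central differences. It is only a numerical check under assumed choices: the step size h, the sample point, and the function name `cauchy_riemann_residuals` are not from the entry. A holomorphic function (the complex exponential) gives residuals near zero, while complex conjugation does not.

```python
import cmath

def cauchy_riemann_residuals(f, x, y, h=1e-5):
    """Central-difference estimates of (u_x - v_y, u_y + v_x) at the point x + iy."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)  # u_x + i v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)  # u_y + i v_y
    u_x, v_x = fx.real, fx.imag
    u_y, v_y = fy.real, fy.imag
    return u_x - v_y, u_y + v_x

def conjugate(z):
    """z -> conj(z), a standard example of a non-holomorphic map."""
    return z.conjugate()

print(cauchy_riemann_residuals(cmath.exp, 0.3, -0.7))  # both entries close to 0
print(cauchy_riemann_residuals(conjugate, 0.3, -0.7))  # first entry close to 2
```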
420.29
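Cauchy-Riemann equations (polar coordinates)
Suppose A is an open set in $\mathbb{C}$ and $f(z) = f(re^{i\theta}) = u(r, \theta) + iv(r, \theta) : A \subset \mathbb{C} \to \mathbb{C}$ is a function. If the derivative of f(z) exists at $z_0 = (r_0, \theta_0)$, then the functions u, v at $z_0$ satisfy
$$\frac{\partial u}{\partial r} = \frac{1}{r}\frac{\partial v}{\partial \theta}, \qquad \frac{\partial v}{\partial r} = -\frac{1}{r}\frac{\partial u}{\partial \theta},$$
which are called the Cauchy-Riemann equations in polar form.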
Version: 4 Owner: Daume Author(s): Daume
420.30
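proof of the Cauchy-Riemann equations
Existence of complex derivative implies the Cauchy-Riemann equations. Suppose that the complex derivative
$$f'(z) = \lim_{\zeta \to 0} \frac{f(z + \zeta) - f(z)}{\zeta} \qquad (420.30.1)$$
exists for some $z \in \mathbb{C}$. This means that for all $\epsilon > 0$ there exists a $\rho > 0$ such that for all complex ζ with $|\zeta| < \rho$ we have
$$\left| f'(z) - \frac{f(z + \zeta) - f(z)}{\zeta} \right| < \epsilon.$$
Henceforth, set
$$f = u + iv, \qquad z = x + iy.$$
If ζ is real, then the above limit reduces to a partial derivative in x, i.e.
$$f'(z) = \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.$$
Taking the limit with an imaginary ζ we deduce that
$$f'(z) = -i\frac{\partial f}{\partial y} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.$$
Therefore
$$\frac{\partial f}{\partial x} = -i\frac{\partial f}{\partial y},$$
and breaking this relation up into its real and imaginary parts gives the Cauchy-Riemann equations.
The Cauchy-Riemann equations imply the existence of a complex derivative. Suppose that the Cauchy-Riemann equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$
hold for a fixed $(x, y) \in \mathbb{R}^2$, and that all the partial derivatives are continuous at (x, y) as well. The continuity implies that all directional derivatives exist as well. In other words, for $\xi, \eta \in \mathbb{R}$ and $\rho = \sqrt{\xi^2 + \eta^2}$ we have
$$\frac{u(x + \xi, y + \eta) - u(x, y) - \left(\xi\frac{\partial u}{\partial x} + \eta\frac{\partial u}{\partial y}\right)}{\rho} \to 0, \quad \text{as } \rho \to 0,$$
with a similar relation holding for v(x, y). Combining the two scalar relations into a vector relation we obtain
$$\rho^{-1}\left[ \begin{pmatrix} u(x + \xi, y + \eta) \\ v(x + \xi, y + \eta) \end{pmatrix} - \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix} - \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \begin{pmatrix} \xi \\ \eta \end{pmatrix} \right] \to 0, \quad \text{as } \rho \to 0.$$
Note that the Cauchy-Riemann equations imply that the matrix-vector product above is equivalent to the product of two complex numbers, namely
$$\left(\frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}\right)(\xi + i\eta).$$
Setting
$$f(z) = u(x, y) + iv(x, y), \qquad f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \qquad \zeta = \xi + i\eta,$$
we can therefore rewrite the above limit relation as
$$\frac{f(z + \zeta) - f(z) - f'(z)\zeta}{\zeta} \to 0, \quad \text{as } \rho \to 0,$$
which is the complex limit definition of $f'(z)$ shown in (420.30.1).
Version: 2 Owner: rmilson Author(s): rmilson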
420.31
removable singularity
Let $U \subset \mathbb{C}$ be an open neighbourhood of a point $a \in \mathbb{C}$. We say that a function $f : U \setminus \{a\} \to \mathbb{C}$ has a removable singularity at a if the complex derivative $f'(z)$ exists for all $z \neq a$, and if f(z) is bounded near a. Removable singularities can, as the name suggests, be removed.
Theorem 13. Suppose that $f : U \setminus \{a\} \to \mathbb{C}$ has a removable singularity at a. Then f(z) can be holomorphically extended to all of U, i.e. there exists a holomorphic $g : U \to \mathbb{C}$ such that $g(z) = f(z)$ for all $z \neq a$.
Proof. Let C be a circle centered at a, oriented counterclockwise, and sufficiently small so that C and its interior are contained in U. For z in the interior of C, set
$$g(z) = \frac{1}{2\pi i}\oint_C \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$
Since C is a compact set, the defining limit for the derivative
$$\frac{d}{dz}\,\frac{f(\zeta)}{\zeta - z} = \frac{f(\zeta)}{(\zeta - z)^2}$$
converges uniformly for ζ ∈ C. Thanks to the uniform convergence, the order of the derivative and the integral operations can be interchanged. Hence, we may deduce that $g'(z)$ exists for all z in the interior of C. Furthermore, by the Cauchy integral formula we have that $f(z) = g(z)$ for all $z \neq a$, and therefore g(z) furnishes us with the desired extension.
Version: 2 Owner: rmilson Author(s): rmilson
Chapter 421 30F40 – Kleinian groups
421.1 Klein 4-group
Any group G of order 4 must be abelian. If G is not isomorphic to the cyclic group $C_4$ of order 4, then it must be isomorphic to $\mathbb{Z}_2 \oplus \mathbb{Z}_2$. This group is known as the Klein 4-group. The operation is the one induced by $\mathbb{Z}_2$, taken coordinatewise.
Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 422 31A05 – Harmonic, subharmonic, superharmonic functions
422.1 a harmonic function on a graph which is bounded below and nonconstant
There exists no harmonic function on all of the d-dimensional grid $\mathbb{Z}^d$ which is bounded below and nonconstant. This categorises a particular property of the grid; below we see that other graphs can admit such harmonic functions. Let $T_3 = (V_3, E_3)$ be a 3-regular tree. Assign “levels” to the vertices of $T_3$ as follows: fix a vertex $o \in V_3$, and let π be a branch of $T_3$ (an infinite simple path) from o. For every vertex $v \in V_3$ there exists a unique shortest path from v to a vertex of π; let p(v) be the vertex of π at which this path ends, and let $\ell(v) = d(v, p(v)) - d(o, p(v))$ be the length of this path minus the distance from o to p(v). Now define $f(v) = 2^{-\ell(v)} > 0$. The three neighbours $u_1, u_2, u_3$ of v can be labelled so that $\ell(u_1) = \ell(v) - 1$ (“$u_1$ is the parent of v”) and $\ell(u_2) = \ell(u_3) = \ell(v) + 1$ (“$u_2, u_3$ are the children of v”). And indeed,
$$\frac{1}{3}\left(2^{-(\ell(v) - 1)} + 2^{-(\ell(v) + 1)} + 2^{-(\ell(v) + 1)}\right) = 2^{-\ell(v)}.$$
So f is a positive nonconstant harmonic function on $T_3$.
Version: 2 Owner: drini Author(s): ariels
422.2
example of harmonic functions on graphs
1. Let G = (V, E) be a connected finite graph, and let a, z ∈ V be two of its vertices. The function
f(v) = P{simple random walk from v hits a before z}
is a harmonic function except on {a, z}. Finiteness of G is required only to ensure f is well-defined. So we may replace “G finite” with “simple random walk on G is recurrent”.
2. Let G = (V, E) be a graph, and let V′ ⊆ V. Let α : V′ → R be some boundary condition. For u ∈ V, define a random variable $X_u$ to be the first vertex of V′ that simple random walk from u hits. The function $f(v) = E\,\alpha(X_v)$ is a harmonic function except on V′. The first example is a special case of this one, taking V′ = {a, z} and α(a) = 1, α(z) = 0.
Version: 1 Owner: ariels Author(s): ariels
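The first example can be made concrete on a small graph. The sketch below is an illustration under assumptions made here: it uses a path graph with vertices 0, ..., n, takes z = 0 and a = n, and finds f by repeatedly replacing each interior value with the average of its neighbours (a Gauss-Seidel relaxation, chosen for brevity rather than taken from the entry). On this graph the hitting probability is known to be f(v) = v/n.

```python
def hitting_probability(adj, a, z, sweeps=5000):
    """Approximate P{simple random walk from v hits a before z} for every vertex v.

    adj maps each vertex to the list of its neighbours. The values f(a) = 1 and
    f(z) = 0 are held fixed; every other value is relaxed towards the average
    of its neighbours, which is the harmonicity condition away from {a, z}.
    """
    f = {v: 0.0 for v in adj}
    f[a] = 1.0
    for _ in range(sweeps):
        for v in adj:
            if v != a and v != z:
                f[v] = sum(f[u] for u in adj[v]) / len(adj[v])
    return f

# Path graph 0 - 1 - ... - n (an assumed example graph).
n = 6
path = {v: [u for u in (v - 1, v + 1) if 0 <= u <= n] for v in range(n + 1)}
f = hitting_probability(path, a=n, z=0)
print([round(f[v], 6) for v in range(n + 1)])
```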
422.3
examples of harmonic functions on Rn
Some real functions in $\mathbb{R}^n$ (e.g. any linear function, or any affine function) are obviously harmonic functions. What are some more interesting harmonic functions?
• For n ≥ 3, define (on the punctured space $U = \mathbb{R}^n \setminus \{0\}$) the function $f(x) = \|x\|^{2-n}$. Then
$$\frac{\partial f}{\partial x_i} = (2 - n)\frac{x_i}{\|x\|^n}$$
and
$$\frac{\partial^2 f}{\partial x_i^2} = n(n - 2)\frac{x_i^2}{\|x\|^{n+2}} - (n - 2)\frac{1}{\|x\|^n}.$$
Summing over i = 1, ..., n shows $\Delta f \equiv 0$.
• For n = 2, define (on the punctured plane $U = \mathbb{R}^2 \setminus \{0\}$) the function $f(x, y) = \log(x^2 + y^2)$. Differentiating and summing yields $\Delta f \equiv 0$.
• For n = 1, the condition $(\Delta f)(x) = f''(x) \equiv 0$ forces f to be an affine function on every segment; there are no “interesting” harmonic functions in one dimension.
Version: 2 Owner: ariels Author(s): ariels
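A quick numerical sanity check of the first two examples: the sketch below approximates the Laplacian by second-order central differences. The step size, sample points, and function names are assumptions made for illustration. The harmonic examples should give values near zero, while a non-harmonic comparison function, the squared norm, should give 2n.

```python
import math

def laplacian(f, x, h=1e-3):
    """Second-order central-difference estimate of the Laplacian of f at the point x."""
    total = 0.0
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        total += (f(up) - 2.0 * f(x) + f(down)) / h ** 2
    return total

def norm_power(x):
    """f(x) = ||x||^(2 - n), harmonic away from the origin for n >= 3."""
    return math.sqrt(sum(t * t for t in x)) ** (2 - len(x))

def log_potential(p):
    """f(x, y) = log(x^2 + y^2), harmonic on the punctured plane."""
    return math.log(p[0] ** 2 + p[1] ** 2)

def squared_norm(x):
    """Not harmonic: its Laplacian is the constant 2n."""
    return sum(t * t for t in x)

print(laplacian(norm_power, [1.0, 2.0, -0.5]))
print(laplacian(log_potential, [0.8, -1.3]))
print(laplacian(squared_norm, [1.0, 2.0, -0.5]))
```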
422.4
harmonic function
• A real or complex-valued function $f : U \to \mathbb{R}$ or $f : U \to \mathbb{C}$ in $C^2$ (i.e. f is twice continuously differentiable), where $U \subseteq \mathbb{R}^n$ is some domain, is called harmonic if its Laplacian vanishes on U: $\Delta f \equiv 0$.
• A real or complex-valued function $f : V \to \mathbb{R}$ or $f : V \to \mathbb{C}$ defined on the vertices V of a graph G = (V, E) is called harmonic at v ∈ V if its value at v is its average value at the neighbours of v:
$$f(v) = \frac{1}{\deg(v)} \sum_{\{u, v\} \in E} f(u).$$
It is called harmonic except on A, for some A ⊆ V, if it is harmonic at each v ∈ V \ A, and harmonic if it is harmonic at each v ∈ V.
In the continuous (first) case, any harmonic $f : \mathbb{R}^n \to \mathbb{R}$ or $f : \mathbb{R}^n \to \mathbb{C}$ satisfies Liouville's theorem. Indeed, a holomorphic function is harmonic, and a real harmonic function $f : U \to \mathbb{R}$, where $U \subseteq \mathbb{R}^2$, is locally the real part of a holomorphic function. In fact, it is enough that a harmonic function f be bounded below (or above) to conclude that it is constant.
In the discrete (second) case, any harmonic $f : \mathbb{Z}^n \to \mathbb{R}$, where $\mathbb{Z}^n$ is the n-dimensional grid, is constant if bounded below (or above). However, this is not necessarily true on other graphs.
Version: 4 Owner: ariels Author(s): ariels
Chapter 423 31B05 – Harmonic, subharmonic, superharmonic functions
423.1 Laplacian
Let $(x_1, \ldots, x_n)$ be Cartesian coordinates for some open set Ω in $\mathbb{R}^n$. Then the Laplacian differential operator ∆ is defined as
$$\Delta = \frac{\partial^2}{\partial x_1^2} + \ldots + \frac{\partial^2}{\partial x_n^2}.$$
In other words, if f is a twice differentiable function $f : \Omega \to \mathbb{C}$, then
$$\Delta f = \frac{\partial^2 f}{\partial x_1^2} + \ldots + \frac{\partial^2 f}{\partial x_n^2}.$$
A coordinate independent definition of the Laplacian is $\Delta = \nabla \cdot \nabla$, i.e., ∆ is the composition of gradient and divergence. A harmonic function is one for which the Laplacian vanishes.
An older symbol for the Laplacian is $\nabla^2$, conceptually the scalar product of ∇ with itself. This form may be more favoured by physicists.
Version: 4 Owner: matte Author(s): matte, ariels
Chapter 424 32A05 – Power series, series of functions
424.1 exponential function
We begin by defining the exponential function $\exp : \mathbb{R} \to \mathbb{R}^+$ for all real values of x by the power series
$$\exp(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}.$$
It has a few elementary properties, which can be easily shown.
• The radius of convergence is infinite
• exp(0) = 1
• It is infinitely differentiable, and the derivative is the exponential function itself
• exp(x) ≥ 1 + x so it is positive and unbounded on the nonnegative reals
Now consider the function $f : \mathbb{R} \to \mathbb{R}$ with
$$f(x) = \exp(x)\exp(y - x),$$
so, by the product rule and property 3,
$$f'(x) = 0.$$
By the constant value theorem,
$$\exp(x)\exp(y - x) = \exp(y) \quad \forall y, x \in \mathbb{R}.$$
With a suitable change of variables, we have
$$\exp(x + y) = \exp(x)\exp(y), \qquad \exp(x)\exp(-x) = 1.$$
Consider just the nonnegative reals. Since exp is unbounded there, by the intermediate value theorem it takes every value in the interval [1, ∞). The derivative is strictly positive, so by the mean-value theorem exp(x) is strictly increasing. This gives surjectivity and injectivity, i.e. exp is a bijection from [0, ∞) → [1, ∞). Now $\exp(-x) = \frac{1}{\exp(x)}$, so it is also a bijection from (−∞, 0) → (0, 1). Therefore we can say that exp(x) is a bijection onto $\mathbb{R}^+$.
We can now naturally define the logarithm function as the inverse of the exponential function. It is usually denoted by ln(x), and it maps $\mathbb{R}^+$ to $\mathbb{R}$. Similarly, the natural log base e may be defined by e = exp(1). Since the exponential function obeys the rules normally associated with powers, it is often denoted by $e^x$. In fact it is now possible to define powers in terms of the exponential function by
$$a^x = e^{x\ln(a)}, \qquad a > 0.$$
Note the domain may be extended to the complex plane with all the same properties as before, except the bijectivity and ordering properties. Comparison with the power series expansions for sine and cosine yields the following identity, with the famous corollary attributed to Euler:
$$e^{ix} = \cos(x) + i\sin(x), \qquad e^{i\pi} = -1.$$
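The power series definition lends itself to a direct check. The following sketch sums a fixed number of terms (60, an assumed cutoff that is ample for the small arguments used here) and compares the result with the properties derived in this entry; the function name `exp_series` is illustrative.

```python
import math

def exp_series(x, terms=60):
    """Partial sum of sum_{k>=0} x^k / k!, for real or complex x."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turns x^k / k! into x^(k+1) / (k+1)!
    return total

print(exp_series(1.0), math.e)                                  # e = exp(1)
print(exp_series(2.5) * exp_series(-1.0), exp_series(1.5))      # exp(x) exp(y) = exp(x + y)
print(exp_series(1j * math.pi))                                 # Euler: close to -1
```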
Version: 10 Owner: mathcam Author(s): mathcam, vitriol
Chapter 425 32C15 – Complex spaces
425.1 Riemann sphere
The Riemann sphere, denoted $\hat{\mathbb{C}}$, is the one-point compactification of the complex plane $\mathbb{C}$, obtained by identifying the limits of all infinitely extending rays from the origin as one single “point at infinity.” Heuristically, $\hat{\mathbb{C}}$ can be viewed as a 2-sphere with the top point corresponding to the point at infinity, and the bottom point corresponding to the origin. An atlas for the Riemann sphere is given by two charts:
$$\hat{\mathbb{C}} \setminus \{\infty\} \to \mathbb{C} : z \mapsto z$$
and
$$\hat{\mathbb{C}} \setminus \{0\} \to \mathbb{C} : z \mapsto \frac{1}{z}.$$
Any polynomial p on $\mathbb{C}$ has a unique smooth extension to a map $\hat{p} : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$.
Version: 2 Owner: mathcam Author(s): mathcam
Chapter 426 32F99 – Miscellaneous
426.1 star-shaped region
Definition. A subset U of a real (or possibly complex) vector space is called star-shaped if there is a point p ∈ U such that the line segment $\overline{pq}$ is contained in U for all q ∈ U. We then say that U is star-shaped with respect to p. (Here, $\overline{pq} = \{tp + (1 - t)q \mid t \in [0, 1]\}$.) A region U is, in other words, star-shaped if there is a point p ∈ U such that U can be “collapsed” or “contracted” onto p.
Examples
1. In $\mathbb{R}^n$, any vector subspace is star-shaped. Also, the unit cube and unit ball are star-shaped, but the unit sphere is not.
2. A subset U of a vector space is star-shaped with respect to all its points if and only if U is convex.
Version: 2 Owner: matte Author(s): matte
Chapter 427 32H02 – Holomorphic mappings, (holomorphic) embeddings and related questions
427.1 Bloch’s theorem
Let f be a holomorphic function on a region containing the closure of the disk $D = \{z \in \mathbb{C} : |z| < 1\}$, such that $f(0) = 0$ and $f'(0) = 1$. Then there is a disk $S \subset D$ such that f is injective on S and f(S) contains a disk of radius $\frac{1}{72}$. Version: 2 Owner: Koro Author(s): Koro
427.2
Hartogs' theorem
Let U ⊂ Cn (n > 1) be an open set containing the origin 0. Then any holomorphic function on U − {0} extends uniquely to a holomorphic function on U. Version: 1 Owner: bwebste Author(s): bwebste
Chapter 428 32H25 – Picard-type theorems and generalizations
428.1 Picard’s theorem
Let f be a holomorphic function with an essential singularity at $w \in \mathbb{C}$. Then there is a number $z_0 \in \mathbb{C}$ such that the image of any neighborhood of w by f contains $\mathbb{C} - \{z_0\}$. In other words, f assumes every complex value, with the possible exception of $z_0$, in any neighborhood of w. Remark. The little Picard theorem follows as a corollary: given a nonconstant entire function f, if it is a polynomial, it assumes every value in $\mathbb{C}$ as a consequence of the fundamental theorem of algebra. If f is not a polynomial, then g(z) = f(1/z) has an essential singularity at 0; Picard's theorem implies that g (and thus f) assumes every complex value, with one possible exception. Version: 4 Owner: Koro Author(s): Koro
428.2
little Picard theorem
The range of a nonconstant entire function is either the whole complex plane C, or the complex plane with a single point removed. In other words, if an entire function omits two or more values, then it is a constant function. Version: 2 Owner: Koro Author(s): Koro
Chapter 429 33-XX – Special functions
429.1 beta function
The beta function is defined as
$$B(p, q) = \int_0^1 x^{p-1}(1 - x)^{q-1}\,dx$$
for any p, q > 0. The beta function has the property
$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)},$$
where Γ is the gamma function. Also,
$$B(p, q) = B(q, p)$$
and
$$B\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \pi.$$
The function was discovered by L. Euler (1730) and the name was given by J. Binet.
Version: 8 Owner: vladm Author(s): vladm
Chapter 430 33B10 – Exponential and trigonometric functions
430.1 natural logarithm
The natural logarithm of a number is the logarithm in base e. It is defined formally as
$$\ln(x) = \int_1^x \frac{1}{t}\,dt.$$
The origins of the natural logarithm, the exponential function and Euler's number e are very much intertwined. The integral above was found to have the properties of a logarithm. You can view these properties in the entry on logarithms. If indeed the integral represented a logarithmic function, its base would have to be e, where the value of the integral is 1. Thus was the natural logarithm defined. The natural logarithm can be represented by a power series for $-1 < x \le 1$:
$$\ln(1 + x) = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k}\,x^k.$$
Note that the above is only the deﬁnition of a logarithm for real numbers greater than zero. For complex and negative numbers, one has to look at the Euler relation. Version: 3 Owner: mathwizard Author(s): mathwizard, slider142
Chapter 431 33B15 – Gamma, beta and polygamma functions
431.1 Bohr-Mollerup theorem
Let f : R+ → R+ be a function with the following properties: 1. log f (x) is a convex function; 2. f (x + 1) = xf (x) for all x > 0; 3. f (1) = 1. Then f (x) = Γ(x) for all x > 0. That is, the only function satisfying those properties is the gamma function (restricted to the positive reals.) Version: 1 Owner: Koro Author(s): Koro
431.2
gamma function
The gamma function is
$$\Gamma(x) = \int_0^\infty e^{-t}t^{x-1}\,dt.$$
For integer values of x = n,
$$\Gamma(n) = (n - 1)!$$
Hence the gamma function satisfies
$$\Gamma(x + 1) = x\Gamma(x)$$
if x > 0.
[Figure: plot of the gamma function, generated by GNU Octave and gnuplot.]
Some values of the gamma function for small arguments are:
Γ(1/5) = 4.5908    Γ(1/4) = 3.6256
Γ(1/3) = 2.6789    Γ(2/5) = 2.2182
Γ(3/5) = 1.4892    Γ(2/3) = 1.3541
Γ(3/4) = 1.2254    Γ(4/5) = 1.1642
and the ever-useful $\Gamma(1/2) = \sqrt{\pi}$. These values allow a quick calculation of Γ(n + f), where n is a natural number and f is any fractional value for which the gamma function's value is known. Since Γ(x + 1) = xΓ(x), we have
$$\Gamma(n + f) = (n + f - 1)\Gamma(n + f - 1) = (n + f - 1)(n + f - 2)\Gamma(n + f - 2) = \cdots = (n + f - 1)(n + f - 2)\cdots(f)\,\Gamma(f),$$
which is easy to calculate if we know Γ(f).
The gamma function has a meromorphic continuation to the entire complex plane with poles at the nonpositive integers. It satisfies the product formula
$$\Gamma(z) = \frac{e^{-\gamma z}}{z}\prod_{n=1}^{\infty}\left(1 + \frac{z}{n}\right)^{-1}e^{z/n},$$
where γ is Euler's constant, and the functional equation
$$\Gamma(z)\Gamma(1 - z) = \frac{\pi}{\sin \pi z}.$$
Version: 8 Owner: akrowne Author(s): akrowne
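The "quick calculation" of Γ(n + f) from a known Γ(f) can be sketched as follows. This is an illustration only; the function name `gamma_shifted` is an assumption, and the standard-library `math.gamma` is used purely as a reference value.

```python
import math

def gamma_shifted(n, f, gamma_f):
    """Gamma(n + f) from a known Gamma(f), using Gamma(x + 1) = x Gamma(x) n times."""
    result = gamma_f
    for k in range(n):
        result *= f + k  # accumulates f (f + 1) ... (f + n - 1)
    return result

# Gamma(3.5) from Gamma(1/2) = sqrt(pi).
print(gamma_shifted(3, 0.5, math.sqrt(math.pi)), math.gamma(3.5))
# Gamma(7/3) from the tabulated four-decimal value Gamma(1/3) = 2.6789.
print(gamma_shifted(2, 1 / 3, 2.6789), math.gamma(7 / 3))
```

The second result is only as accurate as the tabulated input, so it agrees with the reference to about four significant digits.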
431.3
proof of Bohr-Mollerup theorem
We prove this theorem in two stages: first, we establish that the gamma function satisfies the given conditions, and then we prove that these conditions uniquely determine a function on (0, ∞). By its definition, Γ(x) is positive for positive x. Let x, y > 0 and $0 \le \lambda \le 1$. Then
$$\log \Gamma(\lambda x + (1 - \lambda)y) = \log \int_0^\infty e^{-t}t^{\lambda x + (1 - \lambda)y - 1}\,dt$$
$$= \log \int_0^\infty (e^{-t}t^{x-1})^{\lambda}(e^{-t}t^{y-1})^{1-\lambda}\,dt$$
$$\le \log\left(\left(\int_0^\infty e^{-t}t^{x-1}\,dt\right)^{\lambda}\left(\int_0^\infty e^{-t}t^{y-1}\,dt\right)^{1-\lambda}\right)$$
$$= \lambda \log \Gamma(x) + (1 - \lambda)\log \Gamma(y).$$
The inequality follows from Hölder's inequality, where $p = \frac{1}{\lambda}$ and $q = \frac{1}{1 - \lambda}$.
This proves that Γ is log-convex. Condition 2 follows from the definition by applying integration by parts. Condition 3 is a trivial verification from the definition. Now we show that the 3 conditions uniquely determine a function. By condition 2, it suffices to show that the conditions uniquely determine a function on (0, 1). Let G be a function satisfying the 3 conditions, 0 < x < 1 and $n \in \mathbb{N}$. Then $n + x = (1 - x)n + x(n + 1)$ and, by log-convexity of G,
$$G(n + x) \le G(n)^{1-x}G(n + 1)^{x} = G(n)^{1-x}G(n)^{x}n^{x} = (n - 1)!\,n^{x}.$$
Similarly, $n + 1 = x(n + x) + (1 - x)(n + 1 + x)$ gives
$$n! \le G(n + x)(n + x)^{1-x}.$$
Combining these two we get
$$n!\,(n + x)^{x-1} \le G(n + x) \le (n - 1)!\,n^{x},$$
and by using condition 2 to express G(n + x) in terms of G(x) we find
$$a_n := \frac{n!\,(n + x)^{x-1}}{x(x + 1)\cdots(x + n - 1)} \le G(x) \le \frac{(n - 1)!\,n^{x}}{x(x + 1)\cdots(x + n - 1)} =: b_n.$$
Now these inequalities hold for every integer n, and the terms on the left and right side have a common limit ($\lim_{n\to\infty} \frac{a_n}{b_n} = 1$), so we find this determines G. As a corollary we find another expression for Γ. For 0 < x < 1,
$$\Gamma(x) = \lim_{n\to\infty} \frac{n!\,n^{x}}{x(x + 1)\cdots(x + n)}.$$
In fact, this equation, called Gauß's product, holds on the whole complex plane minus the nonpositive integers.
Version: 1 Owner: lieven Author(s): lieven
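The bounds a_n ≤ G(x) ≤ b_n and Gauß's product can be explored numerically. The sketch below is illustrative: it evaluates the expressions through logarithms (using the standard-library `math.lgamma` for log n!) to avoid overflow, and compares against `math.gamma`. The function names and the choice of n are assumptions; convergence of the product is slow, roughly of order 1/n.

```python
import math

def bohr_mollerup_bounds(x, n):
    """The pair (a_n, b_n) from the proof, for 0 < x < 1 and an integer n >= 1."""
    log_den = math.fsum(math.log(x + k) for k in range(n))  # log of x (x+1) ... (x+n-1)
    a_n = math.exp(math.lgamma(n + 1) + (x - 1) * math.log(n + x) - log_den)
    b_n = math.exp(math.lgamma(n) + x * math.log(n) - log_den)
    return a_n, b_n

def gauss_product(x, n):
    """n! n^x / (x (x+1) ... (x+n)), the n-th term of Gauss's product."""
    log_den = math.fsum(math.log(x + k) for k in range(n + 1))
    return math.exp(math.lgamma(n + 1) + x * math.log(n) - log_den)

x = 0.5
print(bohr_mollerup_bounds(x, 1), bohr_mollerup_bounds(x, 1000), math.gamma(x))
print(gauss_product(x, 200000), math.gamma(x))
```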
Chapter 432 33B30 – Higher logarithm functions
432.1 Lambert W function
Lambert's W function is the inverse of the function $f : \mathbb{C} \to \mathbb{C}$ given by $f(x) := xe^x$. That is, W(x) is the complex-valued function that satisfies
$$W(x)e^{W(x)} = x$$
for all $x \in \mathbb{C}$. In practice the definition of W(x) requires a branch cut, which is usually taken along the negative real axis. Lambert's W function is sometimes also called the product log function. This function allows us to solve the functional equation
$$g(x)^{g(x)} = x,$$
since $g(x) = e^{W(\ln(x))}$.
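For real x ≥ 0 the defining equation can be solved numerically. The sketch below is a minimal illustration, assuming Newton's method on w e^w − x = 0 with starting guess ln(1 + x) (a choice made here, not taken from the entry); it covers only the real principal branch on x ≥ 0, and the names `lambert_w` and `self_power_root` are hypothetical. It then uses g(x) = e^{W(ln x)} to solve g^g = x.

```python
import math

def lambert_w(x, tol=1e-14, max_iter=100):
    """Real solution w >= 0 of w * exp(w) = x for x >= 0, by Newton's method."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting guess at or to the right of the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def self_power_root(x):
    """Solve g ** g = x for x >= 1 via g = exp(W(ln x))."""
    return math.exp(lambert_w(math.log(x)))

w = lambert_w(1.0)
print(w, w * math.exp(w))     # second value close to 1
print(self_power_root(27.0))  # close to 3, since 3 ** 3 = 27
```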
432.1.1
References
A site with good information on Lambert’s W function is Corless’ page ”On the Lambert W Function” Version: 4 Owner: drini Author(s): drini
Chapter 433 33B99 – Miscellaneous
433.1 natural log base
The natural log base, or e, has value 2.718281828459045.... e was extensively studied by Euler in the 1720's, but it was originally discovered by John Napier. e is defined by
$$e = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n.$$
It is more effectively calculated, however, by using a Taylor series to get the representation
$$e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots$$
Version: 3 Owner: akrowne Author(s): akrowne
Chapter 434 33D45 – Basic orthogonal polynomials and functions (Askey-Wilson polynomials, etc.)
434.1 orthogonal polynomials
Polynomials of order n are analytic functions that can be written in the form
$$p_n(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n.$$
They can be differentiated and integrated for any value of x, and are fully determined by the n + 1 coefficients $a_0, \ldots, a_n$. For this simplicity they are frequently used to approximate more complicated or unknown functions. In approximations, the necessary order n of the polynomial is not normally defined by criteria other than the quality of the approximation. Using polynomials as defined above tends to lead into numerical difficulties when determining the $a_i$, even for small values of n. It is therefore customary to stabilize results numerically by using orthogonal polynomials over an interval [a, b], defined with respect to a positive weight function W(x) > 0 by
$$\int_a^b p_n(x)p_m(x)W(x)\,dx = 0 \quad \text{for } n \neq m.$$
Orthogonal polynomials are obtained in the following way: define the scalar product
$$(f, g) = \int_a^b f(x)g(x)W(x)\,dx$$
between the functions f and g, where W(x) is a weight factor. Starting with the polynomials $p_0(x) = 1$, $p_1(x) = x$, $p_2(x) = x^2$, etc., from the Gram-Schmidt decomposition one obtains a sequence of orthogonal polynomials $q_0(x), q_1(x), \ldots$, such that $(q_m, q_n) = N_n\delta_{mn}$. The normalization factors $N_n$ are arbitrary. When all $N_i$ are equal to one, the polynomials are called orthonormal. Some important orthogonal polynomials are:
  a      b      W(x)                name
  −1     1      1                   Legendre polynomials
  −1     1      (1 − x^2)^(−1/2)    Chebyshev polynomials
  −∞     ∞      e^(−x^2)            Hermite polynomials
Orthogonal polynomials of successive orders can be expressed by a recurrence relation
$$p_n = (A_n + B_nx)\,p_{n-1} + C_n\,p_{n-2}.$$
This relation can be used to compute a finite series
$$a_0p_0 + a_1p_1 + \cdots + a_np_n$$
with arbitrary coefficients $a_i$, without computing explicitly every polynomial $p_j$ (Horner's rule). Chebyshev polynomials $T_n(x)$ are also orthogonal with respect to discrete values $x_i$:
$$\sum_i T_n(x_i)T_m(x_i) = 0 \quad \text{for } n < m \le M,$$
where the $x_i$ depend on M. For more information, see [Abramowitz74], [Press95].
References
• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Abramowitz74 M. Abramowitz and I.A. Stegun (Eds.), Handbook of Mathematical Functions, National Bureau of Standards, Dover, New York, 1974.
Press95 W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery, Numerical Recipes in C, Second edition, Cambridge University Press, 1995. (The same book exists for the Fortran language.) There is also an Internet version which you can work from.
Version: 3 Owner: akrowne Author(s): akrowne
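The Gram-Schmidt construction can be carried out exactly for the weight W(x) = 1 on [−1, 1]. The sketch below is an illustration under assumptions made here: polynomials are stored as coefficient lists (lowest degree first), arithmetic uses exact fractions, and no normalization is applied, so the output consists of the monic multiples of the Legendre polynomials. The function names `inner` and `gram_schmidt` are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    """(p, q) = integral of p(x) q(x) over [-1, 1], with weight W(x) = 1."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total

def gram_schmidt(max_degree):
    """Orthogonalize 1, x, x^2, ... up to the given degree."""
    basis = []
    for n in range(max_degree + 1):
        p = [Fraction(0)] * n + [Fraction(1)]  # the monomial x^n
        for q in basis:
            coeff = inner(p, q) / inner(q, q)
            for k, c in enumerate(q):
                p[k] -= coeff * c  # remove the component of p along q
        basis.append(p)
    return basis

basis = gram_schmidt(3)
for q in basis:
    print([str(c) for c in q])
```

For degree 2 this yields x^2 − 1/3 and for degree 3 it yields x^3 − (3/5)x, which are proportional to the Legendre polynomials P2 and P3.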
Chapter 435 33E05 – Elliptic functions and integrals
435.1 Weierstrass sigma function
Definition 37. Let $\Lambda \subset \mathbb{C}$ be a lattice. Let $\Lambda^*$ denote $\Lambda - \{0\}$.
1. The Weierstrass sigma function is defined as the product
$$\sigma(z; \Lambda) = z\prod_{w \in \Lambda^*}\left(1 - \frac{z}{w}\right)e^{z/w + \frac{1}{2}(z/w)^2}.$$
2. The Weierstrass zeta function is defined by the sum
$$\zeta(z; \Lambda) = \frac{\sigma'(z; \Lambda)}{\sigma(z; \Lambda)} = \frac{1}{z} + \sum_{w \in \Lambda^*}\left(\frac{1}{z - w} + \frac{1}{w} + \frac{z}{w^2}\right).$$
Note that the Weierstrass zeta function is the derivative of the logarithm of the sigma function. The zeta function can be rewritten as
$$\zeta(z; \Lambda) = \frac{1}{z} - \sum_{k=1}^{\infty} G_{2k+2}(\Lambda)\,z^{2k+1},$$
where $G_{2k+2}$ is the Eisenstein series of weight 2k + 2.
3. The Weierstrass eta function is defined to be
$$\eta(w; \Lambda) = \zeta(z + w; \Lambda) - \zeta(z; \Lambda), \quad \text{for any } z \in \mathbb{C}.$$
(It can be proved that this is well defined, i.e. $\zeta(z + w; \Lambda) - \zeta(z; \Lambda)$ only depends on w.) The Weierstrass eta function must not be confused with the Dedekind eta function.
Version: 1 Owner: alozano Author(s): alozano
435.2
elliptic function
Let $\Lambda \subset \mathbb{C}$ be a lattice in the sense of number theory, i.e. a 2-dimensional free group over $\mathbb{Z}$ which generates $\mathbb{C}$ over $\mathbb{R}$. An elliptic function φ, with respect to the lattice Λ, is a meromorphic function $\phi : \mathbb{C} \to \mathbb{C}$ which is Λ-periodic:
$$\phi(z + \lambda) = \phi(z), \quad \forall z \in \mathbb{C},\ \forall \lambda \in \Lambda.$$
Remark: An elliptic function which is holomorphic is constant. Indeed, such a function would induce a holomorphic function on $\mathbb{C}/\Lambda$, which is compact (and it is a standard result from complex analysis that any holomorphic function with compact domain is constant; this follows from Liouville's theorem). Example: The Weierstrass ℘-function (see elliptic curve) is an elliptic function, probably the most important. In fact:
Theorem 12. The field of elliptic functions with respect to a lattice Λ is generated by ℘ and ℘′ (the derivative of ℘).
See [2], chapter 1, theorem 4.
REFERENCES
1. James Milne, Modular Functions and Modular Forms, online course notes. http://www.jmilne.org/math/CourseNotes/math678.html
2. Serge Lang, Elliptic Functions. Springer-Verlag, New York.
3. Joseph H. Silverman, The Arithmetic of Elliptic Curves. Springer-Verlag, New York, 1986.
Version: 4 Owner: alozano Author(s): alozano
435.3
elliptic integrals and Jacobi elliptic functions
Elliptic integrals
For 0 < k < 1, write
$$F(k, \phi) = \int_0^{\phi} \frac{d\theta}{\sqrt{1 - k^2\sin^2\theta}} \qquad (435.3.1)$$
$$E(k, \phi) = \int_0^{\phi} \sqrt{1 - k^2\sin^2\theta}\,d\theta \qquad (435.3.2)$$
$$\Pi(k, n, \phi) = \int_0^{\phi} \frac{d\theta}{(1 + n\sin^2\theta)\sqrt{1 - k^2\sin^2\theta}} \qquad (435.3.3)$$
The change of variable x = sin φ turns these into
$$F_1(k, x) = \int_0^{x} \frac{dv}{\sqrt{(1 - v^2)(1 - k^2v^2)}} \qquad (435.3.4)$$
$$E_1(k, x) = \int_0^{x} \sqrt{\frac{1 - k^2v^2}{1 - v^2}}\,dv \qquad (435.3.5)$$
$$\Pi_1(k, n, x) = \int_0^{x} \frac{dv}{(1 + nv^2)\sqrt{(1 - v^2)(1 - k^2v^2)}} \qquad (435.3.6)$$
The first three functions are known as Legendre's form of the incomplete elliptic integrals of the first, second, and third kinds respectively. Notice that (1) is the special case n = 0 of (3). The latter three are known as Jacobi's form of those integrals. If φ = π/2, or x = 1, they are called complete rather than incomplete integrals, and their names are abbreviated to F(k), E(k), etc.
One use for elliptic integrals is to systematize the evaluation of certain other integrals. In particular, let p be a third- or fourth-degree polynomial in one variable, and let $y = \sqrt{p(x)}$. If q and r are any two polynomials in two variables, then the indefinite integral
$$\int \frac{q(x, y)}{r(x, y)}\,dx$$
has a “closed form” in terms of the above incomplete elliptic integrals, together with elementary functions and their inverses.
Jacobi's elliptic functions
In (1) we may regard φ as a function of F, or vice versa. The notation used is
$$\phi = \mathrm{am}\,u, \qquad u = \arg\phi,$$
and φ and u are known as the amplitude and argument respectively. But x = sin φ = sin am u. The function u ↦ sin am u = x is denoted by sn and is one of four Jacobi (or Jacobian) elliptic functions. The four are:
$$\mathrm{sn}\,u = x, \qquad \mathrm{cn}\,u = \sqrt{1 - x^2}, \qquad \mathrm{tn}\,u = \frac{\mathrm{sn}\,u}{\mathrm{cn}\,u}, \qquad \mathrm{dn}\,u = \sqrt{1 - k^2x^2}.$$
When the Jacobian elliptic functions are extended to complex arguments, they are doubly periodic and have two poles in any parallelogram of periods; both poles are simple.
Version: 1 Owner: mathcam Author(s): Larry Hammick
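The equivalence of Legendre's and Jacobi's forms under x = sin φ can be checked by numerical quadrature. The sketch below is illustrative only: it uses a composite Simpson rule (a choice made here), the sample values k = 0.5 and φ = π/4 are assumptions, and the upper limit is kept away from x = 1, where the Jacobi integrand is singular.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def F_legendre(k, phi):
    """Incomplete elliptic integral of the first kind, Legendre's form (435.3.1)."""
    return simpson(lambda t: 1 / math.sqrt(1 - (k * math.sin(t)) ** 2), 0.0, phi)

def F_jacobi(k, x):
    """The same integral in Jacobi's form (435.3.4), for 0 <= x < 1."""
    return simpson(lambda v: 1 / math.sqrt((1 - v * v) * (1 - (k * v) ** 2)), 0.0, x)

k, phi = 0.5, math.pi / 4
print(F_legendre(k, phi), F_jacobi(k, math.sin(phi)))  # the two forms should agree
print(F_legendre(0.0, 1.0))                            # k = 0 gives F = phi
```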
435.4
examples of elliptic functions
Examples of Elliptic Functions
Let $\Lambda \subset \mathbb{C}$ be a lattice generated by $w_1, w_2$. Let $\Lambda^*$ denote $\Lambda - \{0\}$.
1. The Weierstrass ℘-function is defined by the series
$$\wp(z; \Lambda) = \frac{1}{z^2} + \sum_{w \in \Lambda^*}\left(\frac{1}{(z - w)^2} - \frac{1}{w^2}\right).$$
2. The derivative of the Weierstrass ℘-function is also an elliptic function:
$$\wp'(z; \Lambda) = -2\sum_{w \in \Lambda}\frac{1}{(z - w)^3}.$$
3. The Eisenstein series of weight 2k for Λ is the series
$$G_{2k}(\Lambda) = \sum_{w \in \Lambda^*} w^{-2k}.$$
The Eisenstein series of weight 4 and 6 are of special relevance in the theory of elliptic curves. In particular, the quantities $g_2$ and $g_3$ are usually defined as follows:
$$g_2 = 60 \cdot G_4(\Lambda), \qquad g_3 = 140 \cdot G_6(\Lambda).$$
Version: 3 Owner: alozano Author(s): alozano
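The Eisenstein series can be approximated by truncating the lattice sum. The sketch below is an illustration under assumptions made here: it sums over the square |m|, |n| ≤ cutoff, which is only an approximation of the full series, and it uses the square lattice generated by 1 and i as an example. For that lattice, multiplication by i maps the lattice to itself, which forces G6 = 0; G4 is real and approximately 3.15.

```python
def eisenstein(weight, w1, w2, cutoff=60):
    """Truncated sum of w^(-weight) over nonzero w = m*w1 + n*w2 with |m|, |n| <= cutoff."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        for n in range(-cutoff, cutoff + 1):
            if m or n:  # skip the origin
                total += (m * w1 + n * w2) ** (-weight)
    return total

# Square lattice generated by 1 and i (an assumed example).
G4 = eisenstein(4, 1, 1j)
G6 = eisenstein(6, 1, 1j)
g2, g3 = 60 * G4, 140 * G6
print(G4, G6)
print(g2 ** 3 - 27 * g3 ** 2)  # nonzero, so the lattice is non-degenerate
```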
435.5
modular discriminant
Definition 38. Let $\Lambda \subset \mathbb{C}$ be a lattice.
1. Let $q_\tau = e^{2\pi i\tau}$. The Dedekind eta function is defined to be
$$\eta(\tau) = q_\tau^{1/24}\prod_{n=1}^{\infty}(1 - q_\tau^n).$$
The Dedekind eta function should not be confused with the Weierstrass eta function, η(w; Λ).
2. The j-invariant, as a function of lattices, is defined to be
$$j(\Lambda) = \frac{g_2^3}{g_2^3 - 27g_3^2},$$
where $g_2$ and $g_3$ are certain multiples of the Eisenstein series of weight 4 and 6 (see this entry).
3. The ∆ function (delta function or modular discriminant) is defined to be
$$\Delta(\Lambda) = g_2^3 - 27g_3^2.$$
Let $\Lambda_\tau$ be the lattice generated by 1, τ. The ∆ function for $\Lambda_\tau$ has a product expansion
$$\Delta(\tau) = \Delta(\Lambda_\tau) = (2\pi i)^{12}q_\tau\prod_{n=1}^{\infty}(1 - q_\tau^n)^{24} = (2\pi i)^{12}\eta(\tau)^{24}.$$
Version: 2 Owner: alozano Author(s): alozano
Chapter 436 34-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
436.1 Liapunov function
Suppose we are given an autonomous system of first order differential equations
$$\frac{dx}{dt} = F(x, y), \qquad \frac{dy}{dt} = G(x, y).$$
Let the origin be an isolated critical point of the above system. A function V(x, y) that is of class $C^1$ and satisfies V(0, 0) = 0 is called a Liapunov function if every open ball $B_\delta(0, 0)$ contains at least one point where V > 0. If there happens to exist $\delta^*$ such that the function $\dot{V}$, given by
$$\dot{V}(x, y) = V_x(x, y)F(x, y) + V_y(x, y)G(x, y),$$
is positive definite in $B_{\delta^*}(0, 0)$, then the origin is an unstable critical point of the system.
Version: 2 Owner: tensorking Author(s): tensorking
436.2
Lorenz equation
436.2.1
The history
The Lorenz equation was published in 1963 by Edward N. Lorenz, a meteorologist and mathematician from MIT. The paper containing the equation was titled “Deterministic Nonperiodic Flow” and was published in the Journal of the Atmospheric Sciences. What drove Lorenz to find this set of three-dimensional ordinary differential equations was the search for an equation that would “model some of the unpredictable behavior which we normally associate with the weather” [PV]. The Lorenz equation represents the convective motion of a fluid cell which is warmed from below and cooled from above [PV]. The same system can also apply to dynamos and lasers. In addition, some of its popularity can be attributed to the beauty of its solution. It is also important to state that the Lorenz equation has enough properties and interesting behavior that whole books have been written analyzing the results.
436.2.2 The equation
The Lorenz equation is commonly defined as three coupled ordinary differential equations
\[ \frac{dx}{dt} = \sigma(y - x), \qquad \frac{dy}{dt} = x(\tau - z) - y, \qquad \frac{dz}{dt} = xy - \beta z, \]
where the three parameters σ, τ, β are positive and are called the Prandtl number, the Rayleigh number, and a physical proportion, respectively. It is important to note that x, y, z are not spatial coordinates. The “x is proportional to the intensity of the convective motion, while y is proportional to the temperature difference between the ascending and descending currents, similar signs of x and y denoting that warm fluid is rising and cold fluid is descending. The variable z is proportional to the distortion of vertical temperature profile from linearity, a positive value indicating that the strongest gradients occur near the boundaries.” [GSS]
436.2.3 Properties of the Lorenz equations
• Symmetry. The Lorenz equation has the following symmetry of an ordinary differential equation:
\[ (x, y, z) \to (-x, -y, z). \]
This symmetry is present for all parameters of the Lorenz equation (see natural symmetry of the Lorenz equation).
• Invariance. The z-axis is invariant, meaning that a solution that starts on the z-axis (i.e. x = y = 0) will remain on the z-axis. In addition, the solution will tend toward the origin if the initial condition is on the z-axis.
• Critical points. To solve for the critical points we let
\[ \dot{\mathbf{x}} = f(\mathbf{x}) = \begin{pmatrix} \sigma(y - x) \\ x(\tau - z) - y \\ xy - \beta z \end{pmatrix} \]
and we solve $f(\mathbf{x}) = 0$. It is clear that one of the critical points is $\mathbf{x}_0 = (0, 0, 0)$. With some algebraic manipulation (the first component gives $x = y$, the second then gives $z = \tau - 1$, and the third gives $x^2 = \beta(\tau - 1)$) we determine that
\[ \mathbf{x}_{C_1} = \left(\sqrt{\beta(\tau - 1)},\ \sqrt{\beta(\tau - 1)},\ \tau - 1\right) \quad\text{and}\quad \mathbf{x}_{C_2} = \left(-\sqrt{\beta(\tau - 1)},\ -\sqrt{\beta(\tau - 1)},\ \tau - 1\right) \]
are critical points, and that they are real when $\tau > 1$.
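The critical points listed above can be checked numerically. The following is a minimal sketch, assuming the parameter names sigma, tau, beta used in this entry; the helper names lorenz_rhs and critical_points are hypothetical and chosen only for this illustration.

```python
import math


def lorenz_rhs(state, sigma, tau, beta):
    """Right-hand side f(x) of the Lorenz system (tau plays the role of the Rayleigh number)."""
    x, y, z = state
    return (sigma * (y - x), x * (tau - z) - y, x * y - beta * z)


def critical_points(tau, beta):
    """Return the real critical points: the origin, plus C1 and C2 when tau > 1."""
    points = [(0.0, 0.0, 0.0)]
    if tau > 1:
        r = math.sqrt(beta * (tau - 1))
        points.append((r, r, tau - 1))
        points.append((-r, -r, tau - 1))
    return points


if __name__ == "__main__":
    # Classical parameter values; each residual should be (numerically) zero.
    for p in critical_points(28.0, 8.0 / 3.0):
        print(p, lorenz_rhs(p, 10.0, 28.0, 8.0 / 3.0))
```

For tau <= 1 the function returns only the origin, which matches the statement that the two nonzero critical points are real only when tau > 1.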
436.2.4 An example
[Figures: the x, y, and z solutions, each plotted with respect to time.]
The figures above show the solution of the Lorenz equation with parameters σ = 10, τ = 28 and β = 8/3 (which is the classical example). The initial condition of the system is (x0, y0, z0) = (3, 15, 1).
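436.2.5 Experimenting with Octave
By changing the parameters and the initial condition one can observe that some solutions will be drastically different. (This is in no way rigorous, but it can give an idea of the qualitative properties of the Lorenz equation.)

    % Right-hand side of the Lorenz system with sigma = 10, tau = 28, beta = 8/3.
    function y = lorenz (x, t)
      y = [10*(x(2) - x(1));
           x(1)*(28 - x(3)) - x(2);
           x(1)*x(2) - 8/3*x(3)];
    endfunction

    % Integrate from the initial condition (3, 15, 1) over 0 <= t <= 50.
    solution = lsode ("lorenz", [3; 15; 1], (0:0.01:50)');

    % Plot the trajectory in phase space. The original listing used the older
    % gset/gsplot commands; plot3 is used here as a present-day substitute.
    plot3 (solution(:,1), solution(:,2), solution(:,3));
    xlabel ("x"); ylabel ("y"); zlabel ("z");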
REFERENCES
[LNE] Lorenz, Edward N.: Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 1963.
[MM] Marsden, J. E., McCracken, M.: The Hopf Bifurcation and Its Applications. Springer-Verlag, New York, 1976.
[SC] Sparrow, Colin: The Lorenz Equations: Bifurcations, Chaos, and Strange Attractors. Springer-Verlag, New York, 1982.
436.2.6 See also
• Paul Bourke, The Lorenz Attractor in 3D
• Tim Whitcomb, http://students.washington.edu/timw/ (If you click on the Lorenz equation phase portrait, you get to download a copy of the article [GSS].)
Version: 12 Owner: Daume Author(s): Daume
436.3 Wronskian determinant
If we have some functions $f_1, f_2, \ldots, f_n$ then the Wronskian determinant (or simply the Wronskian) $W(f_1, f_2, f_3, \ldots, f_n)$ is the determinant of the square matrix
\[ W(f_1, f_2, f_3, \ldots, f_n) = \begin{vmatrix} f_1 & f_2 & f_3 & \cdots & f_n \\ f_1' & f_2' & f_3' & \cdots & f_n' \\ f_1'' & f_2'' & f_3'' & \cdots & f_n'' \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ f_1^{(n-1)} & f_2^{(n-1)} & f_3^{(n-1)} & \cdots & f_n^{(n-1)} \end{vmatrix} \]
where $f^{(k)}$ indicates the kth derivative of f (not exponentiation). The Wronskian of a set of functions F is another function, which is zero over any interval where F is linearly dependent. Just as a set of vectors is said to be linearly dependent whenever one vector may be expressed as a linear combination of a finite subset of the others, a set of functions $\{f_1, f_2, f_3, \ldots, f_n\}$ is said to be dependent over an interval I if one of the functions can be expressed as a linear combination of a finite subset of the others, i.e., $a_1 f_1(t) + a_2 f_2(t) + \cdots + a_n f_n(t) = 0$ for some $a_1, a_2, \ldots, a_n$, not all zero, and for all $t \in I$. Therefore the Wronskian can be used to test for independence: if W is nonzero at some point of I, the functions are linearly independent on I. This is useful in many situations. For example, if we wish to determine if two solutions of a second-order differential equation are independent, we may use the Wronskian.
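Examples
Consider the functions $x^2$, $x$, and $1$. Take the Wronskian:
\[ W = \begin{vmatrix} x^2 & x & 1 \\ 2x & 1 & 0 \\ 2 & 0 & 0 \end{vmatrix} = -2. \]
Note that W is always nonzero, so these functions are independent everywhere. Consider, however, $x^2$ and $x$:
\[ W = \begin{vmatrix} x^2 & x \\ 2x & 1 \end{vmatrix} = x^2 - 2x^2 = -x^2. \]
Here W = 0 only when x = 0. Since W is nonzero at every other point, $x^2$ and $x$ are linearly independent on every interval; the Wronskian merely vanishes at the single point x = 0. Consider $2x^2 + 3$, $x^2$, and $1$:
\[ W = \begin{vmatrix} 2x^2 + 3 & x^2 & 1 \\ 4x & 2x & 0 \\ 4 & 2 & 0 \end{vmatrix} = 8x - 8x = 0. \]
Here W is always zero, which is consistent with these functions being dependent. The dependence is intuitively obvious, of course, since $2x^2 + 3 = 2(x^2) + 3(1)$.
Version: 5 Owner: mathcam Author(s): mathcam, vampyr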
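The three Wronskian examples in the entry above can be reproduced mechanically for polynomials. The sketch below is only an illustration under stated assumptions: polynomials are represented as coefficient lists with the lowest degree first, and the helper names (p_add, p_mul, p_deriv, det, wronskian) are hypothetical, not part of any library.

```python
from fractions import Fraction


def p_add(a, b):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]


def p_mul(a, b):
    """Multiply two polynomials."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def p_deriv(a):
    """Differentiate a polynomial; the derivative of a constant is [0]."""
    return [i * a[i] for i in range(1, len(a))] or [Fraction(0)]


def det(m):
    """Determinant of a square matrix of polynomials, by cofactor expansion along row 0."""
    if len(m) == 1:
        return m[0][0]
    total = [Fraction(0)]
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        sign = [Fraction(-1) ** j]
        total = p_add(total, p_mul(sign, p_mul(entry, det(minor))))
    return total


def wronskian(polys):
    """Wronskian of the given polynomials, returned as a trimmed coefficient list."""
    rows = [[[Fraction(c) for c in p] for p in polys]]
    for _ in range(len(polys) - 1):
        rows.append([p_deriv(p) for p in rows[-1]])  # next row: one more derivative
    w = det(rows)
    while len(w) > 1 and w[-1] == 0:
        w.pop()
    return w


if __name__ == "__main__":
    print(wronskian([[0, 0, 1], [0, 1], [1]]))     # x^2, x, 1
    print(wronskian([[0, 0, 1], [0, 1]]))          # x^2, x
    print(wronskian([[3, 0, 2], [0, 0, 1], [1]]))  # 2x^2 + 3, x^2, 1
```

Cofactor expansion is exponential in the number of functions, which is acceptable for the tiny examples here but would be a poor choice for larger sets.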
436.4 dependence on initial conditions of solutions of ordinary differential equations
Let $E \subset W$ where W is a normed vector space, and let $f \in C^1(E)$ be a continuously differentiable map $f : E \to W$. Furthermore consider the ordinary differential equation
\[ \dot x = f(x) \]
with the initial condition $x(0) = x_0$. Let $x(t)$ be the solution of the above initial value problem, defined as $x : I \to E$ where $I = [-a, a]$. Then there exists $\delta > 0$ such that for every $y_0 \in N_\delta(x_0)$ (that is, $y_0$ in the δ-neighborhood of $x_0$) there is a unique solution $y(t)$ to the initial value problem above with the initial condition changed to $x(0) = y_0$. In addition, $y(t)$ is a twice continuously differentiable function of t over the interval I.
Version: 1 Owner: Daume Author(s): Daume
436.5 differential equation
A differential equation is an equation involving an unknown function of one or more variables, its derivatives and the independent variables. Equations of this type come up often in many different branches of mathematics. They are also especially important in many problems in physics and engineering.
There are many types of differential equations. An ordinary differential equation (ODE) is a differential equation where the unknown function depends on a single variable. A general ODE has the form
\[ F(x, f(x), f'(x), \ldots, f^{(n)}(x)) = 0, \tag{436.5.1} \]
where the unknown f is usually understood to be a real or complex valued function of x, and x is usually understood to be either a real or complex variable. The order of a differential equation is the order of the highest derivative appearing in Eq. (436.5.1). In this case, assuming that F depends nontrivially on $f^{(n)}(x)$, the equation is of nth order.
If a differential equation is satisfied by a function which identically vanishes (i.e. f(x) = 0 for each x in the domain of interest), then the equation is said to be homogeneous. Otherwise it is said to be nonhomogeneous (or inhomogeneous). Many differential equations can be expressed in the form $L[f] = g(x)$, where L is a differential operator (with g(x) = 0 for the homogeneous case). If the operator L is linear in f, then the equation is said to be a linear ODE and otherwise nonlinear.
Other types of differential equations involve more complicated relations involving the unknown function. A partial differential equation (PDE) is a differential equation where the unknown function depends on more than one variable. In a delay differential equation (DDE), the unknown function depends on the state of the system at some instant in the past.
Solving differential equations is a difficult task. Three major types of approaches are possible:
• Exact methods are generally restricted to equations of low order and/or to linear systems.
• Qualitative methods do not give explicit formulas for the solutions, but provide information pertaining to the asymptotic behavior of the system.
• Finally, numerical methods allow one to construct approximate solutions.
Examples
A common example of an ODE is the equation for simple harmonic motion
\[ \frac{d^2 u}{dx^2} + ku = 0. \]
This equation is of second order. It can be transformed into a system of two first order differential equations by introducing a variable $v = du/dx$. Indeed, we then have
\[ \frac{dv}{dx} = -ku, \qquad \frac{du}{dx} = v. \]
A common example of a PDE is the wave equation in three dimensions
\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = c^2 \frac{\partial^2 u}{\partial t^2}. \]
Version: 7 Owner: igor Author(s): jarino, igor
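The reduction of the simple harmonic motion equation to a first order system, described in the differential equation entry above, is also what makes standard numerical methods applicable. As a rough sketch (assuming k > 0, and using the classical fourth-order Runge-Kutta scheme, which the entry itself does not name), the system du/dx = v, dv/dx = -ku can be integrated and compared with the exact solution u = cos(sqrt(k) x) for u(0) = 1, v(0) = 0. The function names are hypothetical.

```python
import math


def shm_rhs(u, v, k):
    """First order form of u'' + k u = 0: returns (du/dx, dv/dx) = (v, -k u)."""
    return v, -k * u


def rk4(u, v, k, h, steps):
    """Advance (u, v) by `steps` classical Runge-Kutta steps of size h."""
    for _ in range(steps):
        k1u, k1v = shm_rhs(u, v, k)
        k2u, k2v = shm_rhs(u + 0.5 * h * k1u, v + 0.5 * h * k1v, k)
        k3u, k3v = shm_rhs(u + 0.5 * h * k2u, v + 0.5 * h * k2v, k)
        k4u, k4v = shm_rhs(u + h * k3u, v + h * k3v, k)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u, v


if __name__ == "__main__":
    k = 4.0
    u, v = rk4(1.0, 0.0, k, 0.001, 1000)  # integrate from x = 0 to x = 1
    print(u, math.cos(math.sqrt(k)))      # numerical vs exact u(1)
```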
436.6 existence and uniqueness of solution of ordinary differential equations
Let $E \subset W$ where W is a normed vector space, let $f \in C^1(E)$ be a continuously differentiable map $f : E \to W$, and let $x_0 \in E$. Then there exists an $a > 0$ such that the ordinary differential equation
\[ \dot x = f(x) \]
with the initial condition $x(0) = x_0$ has a unique solution $x : [-a, a] \to E$.
Version: 3 Owner: Daume Author(s): Daume
436.7 maximal interval of existence of ordinary differential equations
Let $E \subset W$ where W is a normed vector space, and let $f \in C^1(E)$ be a continuously differentiable map $f : E \to W$. Furthermore consider the ordinary differential equation
\[ \dot x = f(x) \]
with the initial condition $x(0) = x_0$. For all $x_0 \in E$ there exists a unique solution $x : I \to E$, where $I = [-a, a]$, which satisfies the initial condition of the initial value problem. Then there exists a maximal interval of existence $J = (\alpha, \beta)$ such that $I \subset J$ and there exists a unique solution $x : J \to E$.
Version: 3 Owner: Daume Author(s): Daume
436.8 method of undetermined coefficients
Given a (usually nonhomogeneous) ordinary differential equation
\[ F(x, f(x), f'(x), \ldots, f^{(n)}(x)) = 0, \]
the method of undetermined coefficients is a way of finding an exact solution when a guess can be made as to the general form of the solution. In this method, the form of the solution is guessed with unknown coefficients left as variables. A typical guess might be of the form $Ae^{2x}$ or $Ax^2 + Bx + C$. This can then be substituted into the differential equation and solved for the coefficients. Obviously the method requires knowing the approximate form of the solution, but for many problems this is a feasible requirement. This method is most commonly used when the formula is some combination of exponentials, polynomials, sin and cos.
Examples
Suppose we have $f''(x) - 2f'(x) + f(x) - 2e^{2x} = 0$. If we guess that the solution is of the form $f(x) = Ae^{2x}$ then we have $4Ae^{2x} - 4Ae^{2x} + Ae^{2x} - 2e^{2x} = 0$ and therefore $Ae^{2x} = 2e^{2x}$, so $A = 2$, giving $f(x) = 2e^{2x}$ as a solution.
Version: 4 Owner: Henry Author(s): Henry
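The example in the entry above generalizes directly: substituting the guess A e^{rx} into a f'' + b f' + c f = d e^{rx} gives A (a r^2 + b r + c) = d. The following sketch computes A under that assumption; the function name is hypothetical, and the case where r is a root of the characteristic polynomial (where this particular guess fails) is reported as an error rather than handled.

```python
def exponential_coefficient(a, b, c, d, r):
    """Coefficient A such that A*exp(r*x) solves a f'' + b f' + c f = d*exp(r*x)."""
    denom = a * r * r + b * r + c
    if denom == 0:
        # The guess A*exp(r*x) cannot work when r is a characteristic root.
        raise ValueError("r is a root of the characteristic polynomial")
    return d / denom


if __name__ == "__main__":
    # The entry's example: f'' - 2 f' + f = 2 e^{2x}.
    print(exponential_coefficient(1, -2, 1, 2, 2))
```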
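436.9 natural symmetry of the Lorenz equation
The Lorenz equation has a natural symmetry defined by
\[ (x, y, z) \to (-x, -y, z). \tag{436.9.1} \]
To verify that (436.9.1) is a symmetry of an ordinary differential equation (the Lorenz equation) there must exist a 3 × 3 matrix which commutes with the differential equation. This can be easily verified by observing that the symmetry is associated with the matrix R defined as
\[ R = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{436.9.2} \]
Let
\[ \dot{\mathbf{x}} = f(\mathbf{x}) = \begin{pmatrix} \sigma(y - x) \\ x(\tau - z) - y \\ xy - \beta z \end{pmatrix} \tag{436.9.3} \]
where $f(\mathbf{x})$ is the Lorenz equation and $\mathbf{x}^T = (x, y, z)$. We proceed by showing that $Rf(\mathbf{x}) = f(R\mathbf{x})$. Looking at the left hand side,
\[ Rf(\mathbf{x}) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \sigma(y - x) \\ x(\tau - z) - y \\ xy - \beta z \end{pmatrix} = \begin{pmatrix} \sigma(x - y) \\ x(z - \tau) + y \\ xy - \beta z \end{pmatrix}, \]
and now looking at the right hand side,
\[ f(R\mathbf{x}) = f\!\left( \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \right) = f\!\left( \begin{pmatrix} -x \\ -y \\ z \end{pmatrix} \right) = \begin{pmatrix} \sigma(x - y) \\ x(z - \tau) + y \\ xy - \beta z \end{pmatrix}. \]
Since the left hand side is equal to the right hand side, (436.9.1) is a symmetry of the Lorenz equation.
Version: 2 Owner: Daume Author(s): Daume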
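The identity Rf(x) = f(Rx) verified by hand in the entry above can also be spot-checked numerically at random points. This is only a sanity check under assumed parameter values (the classical σ = 10, τ = 28, β = 8/3), not a proof; the helper names are hypothetical.

```python
import math
import random


def lorenz_rhs(state, sigma=10.0, tau=28.0, beta=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz system."""
    x, y, z = state
    return (sigma * (y - x), x * (tau - z) - y, x * y - beta * z)


def reflect(state):
    """Apply R = diag(-1, -1, 1), i.e. (x, y, z) -> (-x, -y, z)."""
    x, y, z = state
    return (-x, -y, z)


def is_equivariant(samples=100, seed=0):
    """Check R f(x) == f(R x) at `samples` pseudo-random points."""
    rng = random.Random(seed)
    for _ in range(samples):
        p = tuple(rng.uniform(-50.0, 50.0) for _ in range(3))
        lhs = reflect(lorenz_rhs(p))
        rhs = lorenz_rhs(reflect(p))
        if not all(math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-9) for a, b in zip(lhs, rhs)):
            return False
    return True


if __name__ == "__main__":
    print(is_equivariant())
```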
436.10 symmetry of a solution of an ordinary differential equation
Let γ be a symmetry of the ordinary differential equation and $x_0$ be a steady state solution of $\dot x = f(x)$. If
\[ \gamma x_0 = x_0 \]
then γ is called a symmetry of the solution $x_0$.
Let γ be a symmetry of the ordinary differential equation and $x_0(t)$ be a periodic solution of $\dot x = f(x)$. If
\[ \gamma x_0(t - t_0) = x_0(t) \]
for a certain $t_0$ then $(\gamma, t_0)$ is called a symmetry of the periodic solution $x_0(t)$.
lemma: If γ is a symmetry of the ordinary differential equation and $x_0(t)$ is a solution (either steady state or periodic) of $\dot x = f(x)$, then $\gamma x_0(t)$ is a solution of $\dot x = f(x)$.
proof: If $x_0(t)$ is a solution of $\frac{dx}{dt} = f(x)$, then $\frac{dx_0(t)}{dt} = f(x_0(t))$. Let us now verify that $\gamma x_0(t)$ is a solution by substituting it into $\frac{dx}{dt} = f(x)$. The left hand side of the equation becomes $\frac{d\,\gamma x_0(t)}{dt} = \gamma \frac{dx_0(t)}{dt}$ and the right hand side of the equation becomes $f(\gamma x_0(t)) = \gamma f(x_0(t))$, since γ is a symmetry of the differential equation. Therefore the left hand side equals the right hand side, since $\frac{dx_0(t)}{dt} = f(x_0(t))$. QED
REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 3 Owner: Daume Author(s): Daume
436.11 symmetry of an ordinary differential equation
Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a smooth function and let
\[ \dot x = f(x) \]
be a system of ordinary differential equations; in addition let γ be an invertible matrix. Then γ is a symmetry of the ordinary differential equation if
\[ f(\gamma x) = \gamma f(x). \]
Example:
• natural symmetry of the Lorenz equation is a simple example of a symmetry of a differential equation.
REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 4 Owner: Daume Author(s): Daume
Chapter 437 3401 – Instructional exposition (textbooks, tutorial papers, etc.)
437.1 second order linear diﬀerential equation with constant coeﬃcients
Consider the second order homogeneous linear differential equation
\[ x'' + bx' + cx = 0, \tag{437.1.1} \]
where b and c are real constants. The explicit solution is easily found using the characteristic equation method. This method, introduced by Euler, consists in seeking solutions of the form $x(t) = e^{rt}$ for (437.1.1). Assuming a solution of this form, and substituting it into (437.1.1), gives
\[ r^2 e^{rt} + b r e^{rt} + c e^{rt} = 0. \]
Thus
\[ r^2 + br + c = 0, \tag{437.1.2} \]
which is called the characteristic equation of (437.1.1). Depending on the nature of the roots $r_1$ and $r_2$ of (437.1.2), there are three cases.
• If the roots are real and distinct, then two linearly independent solutions of (437.1.1) are $x_1(t) = e^{r_1 t}$, $x_2(t) = e^{r_2 t}$.
• If the roots are real and equal, then two linearly independent solutions of (437.1.1) are $x_1(t) = e^{r_1 t}$, $x_2(t) = t e^{r_1 t}$.
• If the roots are complex conjugates of the form $r_{1,2} = \alpha \pm i\beta$, then two linearly independent solutions of (437.1.1) are $x_1(t) = e^{\alpha t} \cos \beta t$, $x_2(t) = e^{\alpha t} \sin \beta t$.
The general solution to (437.1.1) is then constructed from these linearly independent solutions, as
\[ \varphi(t) = C_1 x_1(t) + C_2 x_2(t). \tag{437.1.3} \]
Characterizing the behavior of (437.1.3) can be accomplished by studying the two-dimensional linear system obtained from (437.1.1) by defining $y = x'$:
\[ x' = y, \qquad y' = -by - cx. \tag{437.1.4} \]
Remark that the roots of (437.1.2) are the eigenvalues of the Jacobian matrix of (437.1.4). This generalizes to the characteristic equation of a differential equation of order n and the n-dimensional system associated to it. Also note that the only equilibrium of (437.1.4) is the origin (0, 0). Suppose that $c \neq 0$. Then (0, 0) is called a
1. source iff b < 0 and c > 0,
2. spiral source iff it is a source and $b^2 - 4c < 0$,
3. sink iff b > 0 and c > 0,
4. spiral sink iff it is a sink and $b^2 - 4c < 0$,
5. saddle iff c < 0,
6. center iff b = 0 and c > 0.
Version: 3 Owner: jarino Author(s): jarino
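The classification of the origin listed in the entry above depends only on the signs of b, c and b^2 - 4c, so it is easy to encode. The sketch below is an illustration with hypothetical function names; it returns the most specific label from the list (for example "spiral sink" rather than "sink") and also exposes the roots of the characteristic equation.

```python
import cmath


def characteristic_roots(b, c):
    """Roots r1, r2 of r^2 + b r + c = 0, returned as complex numbers."""
    disc = cmath.sqrt(b * b - 4 * c)
    return (-b + disc) / 2, (-b - disc) / 2


def classify_origin(b, c):
    """Classify the equilibrium (0, 0) of x' = y, y' = -b y - c x, assuming c != 0."""
    if c == 0:
        raise ValueError("the classification assumes c != 0")
    if c < 0:
        return "saddle"
    if b == 0:
        return "center"
    kind = "source" if b < 0 else "sink"
    if b * b - 4 * c < 0:
        kind = "spiral " + kind
    return kind


if __name__ == "__main__":
    for b, c in [(1, 1), (-3, 2), (0, 1), (1, -1)]:
        print(b, c, classify_origin(b, c), characteristic_roots(b, c))
```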
Chapter 438 34A05 – Explicit solutions and reductions
438.1 separation of variables
Separation of variables is a valuable tool for solving differential equations of the form
\[ \frac{dy}{dx} = f(x) g(y). \]
The above equation can be rearranged algebraically through Leibniz notation to separate the variables and be conveniently integrable:
\[ \frac{dy}{g(y)} = f(x)\,dx. \]
It follows then that
\[ \int \frac{dy}{g(y)} = F(x) + C, \]
where F(x) is an antiderivative of f and C is a constant of integration. This gives a general form of the solution. An explicit form may be derived from an initial value.
Example: A population that is initially at 200 organisms increases at a rate of 15% each year. We then have the differential equation
\[ \frac{dP}{dt} = 0.15 P. \]
The solution of this equation is relatively straightforward: we simply separate the variables algebraically and integrate,
\[ \int \frac{dP}{P} = \int 0.15\,dt. \]
This is just $\ln P = 0.15t + C$, or $P = C' e^{0.15t}$ with $C' = e^{C}$. When we substitute $P(0) = 200$, we see that $C' = 200$. This is where we get the general relation of exponential growth
\[ P(t) = P_0 e^{kt}. \]
Version: 2 Owner: slider142 Author(s): slider142
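As a quick plausibility check of the population example in the entry above, the closed form P(t) = 200 e^{0.15 t} can be differentiated numerically to confirm that P'(t)/P(t) stays close to 0.15. This is a minimal sketch; the function names and the central-difference step h are assumptions made for illustration.

```python
import math


def population(t, p0=200.0, k=0.15):
    """Closed-form solution P(t) = p0 * exp(k t) of dP/dt = k P with P(0) = p0."""
    return p0 * math.exp(k * t)


def relative_growth_rate(t, h=1e-6):
    """Central-difference estimate of P'(t) / P(t); it should be close to k = 0.15."""
    return (population(t + h) - population(t - h)) / (2 * h * population(t))


if __name__ == "__main__":
    print(population(0.0), relative_growth_rate(3.0))
```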
438.2 variation of parameters
The method of variation of parameters is a way of finding a particular solution to a nonhomogeneous linear differential equation. Suppose that we have an nth order linear differential operator
\[ L[y] := y^{(n)} + p_1(t) y^{(n-1)} + \cdots + p_n(t) y, \tag{438.2.1} \]
and a corresponding nonhomogeneous differential equation
\[ L[y] = g(t). \tag{438.2.2} \]
Suppose that we know a fundamental set of solutions $y_1, y_2, \ldots, y_n$ of the corresponding homogeneous differential equation $L[y_c] = 0$. The general solution of the homogeneous equation is
\[ y_c(t) = c_1 y_1(t) + c_2 y_2(t) + \cdots + c_n y_n(t), \tag{438.2.3} \]
where $c_1, c_2, \ldots, c_n$ are constants. The general solution to the nonhomogeneous equation $L[y] = g(t)$ is then
\[ y(t) = y_c(t) + Y(t), \tag{438.2.4} \]
where $Y(t)$ is a particular solution which satisfies $L[Y] = g(t)$, and the constants $c_1, c_2, \ldots, c_n$ are chosen to satisfy the appropriate boundary conditions or initial conditions. The key step in using variation of parameters is to suppose that the particular solution is given by
\[ Y(t) = u_1(t) y_1(t) + u_2(t) y_2(t) + \cdots + u_n(t) y_n(t), \tag{438.2.5} \]
where $u_1(t), u_2(t), \ldots, u_n(t)$ are as yet undetermined functions (hence the name variation of parameters). To find these n functions we need a set of n independent equations. One obvious condition is that the proposed ansatz satisfies Eq. (438.2.2). Many additional conditions are possible; we choose the ones that make further calculations easier. Consider the following set of n − 1 conditions:
\[ \begin{aligned} y_1 u_1' + y_2 u_2' + \cdots + y_n u_n' &= 0, \\ y_1' u_1' + y_2' u_2' + \cdots + y_n' u_n' &= 0, \\ &\;\;\vdots \\ y_1^{(n-2)} u_1' + y_2^{(n-2)} u_2' + \cdots + y_n^{(n-2)} u_n' &= 0. \end{aligned} \tag{438.2.6} \]
Now, substituting Eq. (438.2.5) into $L[Y] = g(t)$ and using the above conditions, we get another equation
\[ y_1^{(n-1)} u_1' + y_2^{(n-1)} u_2' + \cdots + y_n^{(n-1)} u_n' = g. \tag{438.2.7} \]
So we have a system of n equations for $u_1', u_2', \ldots, u_n'$ which we can solve using Cramer's rule:
\[ u_m'(t) = \frac{g(t) W_m(t)}{W(t)}, \qquad m = 1, 2, \ldots, n. \tag{438.2.8} \]
Such a solution always exists since the Wronskian $W = W(y_1, y_2, \ldots, y_n)$ of the system is nowhere zero, because $y_1, y_2, \ldots, y_n$ form a fundamental set of solutions. Here $W_m$ is the Wronskian determinant with the mth column replaced by the column $(0, 0, \ldots, 0, 1)$. Finally the particular solution can be written explicitly as
\[ Y(t) = \sum_{m=1}^{n} y_m(t) \int \frac{g(t) W_m(t)}{W(t)}\,dt. \tag{438.2.9} \]
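A short worked case may make formula (438.2.9) more concrete. The equation below is an example chosen for this illustration (it does not appear in the entry): take n = 2, the equation y'' + y = sec t on −π/2 < t < π/2, and the fundamental set y_1 = cos t, y_2 = sin t.

```latex
\[
W = \begin{vmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{vmatrix} = 1, \qquad
W_1 = \begin{vmatrix} 0 & \sin t \\ 1 & \cos t \end{vmatrix} = -\sin t, \qquad
W_2 = \begin{vmatrix} \cos t & 0 \\ -\sin t & 1 \end{vmatrix} = \cos t .
\]
\[
u_1' = \frac{g W_1}{W} = -\sec t \,\sin t = -\tan t
\;\Longrightarrow\; u_1 = \ln(\cos t), \qquad
u_2' = \frac{g W_2}{W} = \sec t \,\cos t = 1
\;\Longrightarrow\; u_2 = t .
\]
\[
Y(t) = \cos t \,\ln(\cos t) + t \sin t ,
\qquad
Y'' + Y = \frac{\sin^2 t}{\cos t} + \cos t = \sec t .
\]
```

The constants of integration in u_1 and u_2 are taken to be zero, since any other choice only adds a multiple of y_1 or y_2, which is already accounted for by the homogeneous solution.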
REFERENCES
1. W. E. Boyce and R. C. DiPrima. Elementary Differential Equations and Boundary Value Problems. John Wiley & Sons, 6th edition, 1997.
Version: 3 Owner: igor Author(s): igor
Chapter 439 34A12 – Initial value problems, existence, uniqueness, continuous dependence and continuation of solutions
439.1 initial value problem
Consider the simple differential equation
\[ \frac{dy}{dx} = x. \]
The solution is obtained by writing $dy = x\,dx$ and then integrating both sides as $\int dy = \int x\,dx$. The solution then becomes $y = \frac{x^2}{2} + C$ where C is any constant. Differentiating $\frac{x^2}{2} + 5$, $\frac{x^2}{2} + 7$ and some other examples shows that all these functions satisfy the condition given by the differential equation. So we have an infinite number of solutions.
An initial value problem is then a differential equation (ordinary or partial, or even a system) which, besides stating the relation among the derivatives, also specifies the value of the unknown solution at certain points. This allows one to get a unique solution from the infinite number of potential ones.
In our example we could add the condition $y(4) = 3$, turning it into an initial value problem. The general solution $\frac{x^2}{2} + C$ is now subject to the restriction
\[ \frac{4^2}{2} + C = 3; \]
by solving for C we obtain $C = -5$, and so the unique solution for the system
\[ \frac{dy}{dx} = x, \qquad y(4) = 3 \]
is $y(x) = \frac{x^2}{2} - 5$.
Version: 1 Owner: drini Author(s): drini
Chapter 440 34A30 – Linear equations and systems, general
440.1 Chebyshev equation
Chebyshev's equation is the second order linear differential equation
\[ (1 - x^2) \frac{d^2 y}{dx^2} - x \frac{dy}{dx} + p^2 y = 0, \]
where p is a real constant. There are two independent solutions which are given as series by
\[ y_1(x) = 1 - \frac{p^2}{2!} x^2 + \frac{(p-2)p^2(p+2)}{4!} x^4 - \frac{(p-4)(p-2)p^2(p+2)(p+4)}{6!} x^6 + \cdots \]
and
\[ y_2(x) = x - \frac{(p-1)(p+1)}{3!} x^3 + \frac{(p-3)(p-1)(p+1)(p+3)}{5!} x^5 - \cdots \]
In each case, the coefficients are given by the recursion
\[ a_{n+2} = \frac{(n - p)(n + p)}{(n + 1)(n + 2)}\, a_n, \]
with $y_1$ arising from the choice $a_0 = 1$, $a_1 = 0$, and $y_2$ arising from the choice $a_0 = 0$, $a_1 = 1$. The series converge for $|x| < 1$; this is easy to see from the ratio test and the recursion formula above. When p is a nonnegative integer, one of these series will terminate, giving a polynomial solution. If $p \geq 0$ is even, then the series for $y_1$ terminates at $x^p$. If p is odd, then the series for $y_2$ terminates at $x^p$.
These polynomials are, up to multiplication by a constant, the Chebyshev polynomials. These are the only polynomial solutions of the Chebyshev equation. (In fact, polynomial solutions are also obtained when p is a negative integer, but these are not new solutions, since the Chebyshev equation is invariant under the substitution of p by −p.)
Version: 3 Owner: mclase Author(s): mclase
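The recursion in the Chebyshev equation entry above can be run directly to produce the terminating series for a nonnegative integer p. The sketch below uses exact rational arithmetic; the function name is hypothetical. For example, p = 2 gives 1 − 2x^2, which is −T_2, and p = 4 gives 1 − 8x^2 + 8x^4, which is T_4, in line with the statement that the polynomial solutions are constant multiples of the Chebyshev polynomials.

```python
from fractions import Fraction


def chebyshev_series_polynomial(p):
    """Coefficients (lowest degree first) of the terminating series for integer p >= 0.

    Even p uses the y1 series (a0 = 1, a1 = 0); odd p uses the y2 series (a0 = 0, a1 = 1).
    """
    coeffs = [Fraction(0)] * (p + 1)
    start = p % 2
    coeffs[start] = Fraction(1)
    for n in range(start, p - 1, 2):
        # a_{n+2} = (n - p)(n + p) / ((n + 1)(n + 2)) * a_n
        coeffs[n + 2] = Fraction((n - p) * (n + p), (n + 1) * (n + 2)) * coeffs[n]
    return coeffs


if __name__ == "__main__":
    for p in range(5):
        print(p, chebyshev_series_polynomial(p))
```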
Chapter 441 34A99 – Miscellaneous
441.1 autonomous system
A system of ordinary differential equations is autonomous when it does not depend on time (that is, on the independent variable), i.e. $\dot x = f(x)$. In contrast, a system of ordinary differential equations is nonautonomous when it does depend on time (on the independent variable), i.e. $\dot x = f(x, t)$.
It can be noted that every nonautonomous system can be converted to an autonomous system by adding a dimension: if $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, t)$ with $\mathbf{x} \in \mathbb{R}^n$, then it can be written as an autonomous system with $\mathbf{x} \in \mathbb{R}^{n+1}$ by the substitution $x_{n+1} = t$, so that $\dot x_{n+1} = 1$.
Version: 1 Owner: Daume Author(s): Daume
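The conversion described in the autonomous system entry above amounts to appending t to the state vector and appending 1 to the right-hand side. A minimal sketch follows, assuming the nonautonomous right-hand side is given as a Python callable f(x, t) that returns a list; the name make_autonomous is hypothetical.

```python
def make_autonomous(f):
    """Wrap f(x, t) as g(X), where X = x + [t] and the last component of g is always 1."""
    def g(X):
        *x, t = X                       # split the augmented state into x and x_{n+1} = t
        return list(f(x, t)) + [1.0]    # dx/ds = f(x, t), dx_{n+1}/ds = 1
    return g


if __name__ == "__main__":
    f = lambda x, t: [t * x[0]]         # example: dx/dt = t x
    g = make_autonomous(f)
    print(g([2.0, 3.0]))
```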
Chapter 442 34B24 – Sturm-Liouville theory
442.1 eigenfunction
Consider the Sturm-Liouville system given by
\[ \frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) + q(x) y + \lambda r(x) y = 0, \qquad a \leq x \leq b, \tag{442.1.1} \]
\[ a_1 y(a) + a_2 y'(a) = 0, \qquad b_1 y(b) + b_2 y'(b) = 0, \tag{442.1.2} \]
where $a_i, b_i \in \mathbb{R}$ with $i \in \{1, 2\}$, $p(x)$, $q(x)$, $r(x)$ are differentiable functions, and $\lambda \in \mathbb{R}$. A nonzero solution of the system defined by (442.1.1) and (442.1.2) exists, in general, only for particular values of λ. The nonzero solutions corresponding to such a λ are called eigenfunctions. More generally, if D is some linear differential operator, $\lambda \in \mathbb{R}$, and f is a function such that $Df = \lambda f$, then we say f is an eigenfunction of D with eigenvalue λ.
Version: 5 Owner: tensorking Author(s): tensorking
Chapter 443 34C05 – Location of integral curves, singular points, limit cycles
443.1 Hopf bifurcation theorem
Consider a planar system of ordinary differential equations, written in such a form as to make explicit the dependence on a parameter µ:
\[ x' = f_1(x, y, \mu), \qquad y' = f_2(x, y, \mu). \]
Assume that this system has the origin as an equilibrium for all µ. Suppose that the linearization Df at zero has two purely imaginary eigenvalues $\lambda_1(\mu)$ and $\lambda_2(\mu)$ when $\mu = \mu_c$. If the real parts of the eigenvalues satisfy
\[ \frac{d}{d\mu} \big( \operatorname{Re} \lambda_{1,2}(\mu) \big) \Big|_{\mu = \mu_c} > 0 \]
and the origin is asymptotically stable at $\mu = \mu_c$, then
1. $\mu_c$ is a bifurcation point;
2. there is some $\mu_1 \in \mathbb{R}$ such that for $\mu_1 < \mu < \mu_c$ the origin is a stable focus;
3. there is some $\mu_2 \in \mathbb{R}$ such that for $\mu_c < \mu < \mu_2$ the origin is unstable, surrounded by a stable limit cycle whose size increases with µ.
This is a simplified version of the theorem, corresponding to a supercritical Hopf bifurcation.
Version: 1 Owner: jarino Author(s): jarino
443.2 Poincaré-Bendixson theorem
Let M be an open subset of $\mathbb{R}^2$, and $f \in C^1(M, \mathbb{R}^2)$. Consider the planar differential equation
\[ x' = f(x). \]
Consider a fixed $x \in M$. Suppose that the omega limit set $\omega(x) \neq \emptyset$ is compact, connected, and contains only finitely many equilibria. Then one of the following holds:
1. ω(x) is a fixed orbit (a periodic point with period zero, i.e., an equilibrium).
2. ω(x) is a regular periodic orbit.
3. ω(x) consists of (finitely many) equilibria $\{x_j\}$ and non-closed orbits γ(y) such that $\omega(y) \in \{x_j\}$ and $\alpha(y) \in \{x_j\}$ (where α(y) is the alpha limit set of y).
The same result holds when replacing omega limit sets by alpha limit sets. Since f was chosen such that existence and uniqueness hold, and since the system is planar, the Jordan curve theorem implies that it is not possible for orbits of the system satisfying the hypotheses to have complicated behaviors. A typical use of this theorem is to prove that an equilibrium is globally asymptotically stable (after using a Dulac type result to rule out periodic orbits).
Version: 1 Owner: jarino Author(s): jarino
443.3 omega limit set
Let $\Phi(t, x)$ be the flow of the differential equation $x' = f(x)$, where $f \in C^k(M, \mathbb{R}^n)$, with $k \geq 1$ and M an open subset of $\mathbb{R}^n$. Consider $x \in M$. The omega limit set of x, denoted ω(x), is the set of points $y \in M$ such that there exists a sequence $t_n \to \infty$ with $\Phi(t_n, x) \to y$. Similarly, the alpha limit set of x, denoted α(x), is the set of points $y \in M$ such that there exists a sequence $t_n \to -\infty$ with $\Phi(t_n, x) \to y$. Note that the definition is the same for more general dynamical systems.
Version: 1 Owner: jarino Author(s): jarino
Chapter 444 34C07 – Theory of limit cycles of polynomial and analytic vector ﬁelds (existence, uniqueness, bounds, Hilbert’s 16th problem and ramif
444.1 Hilbert’s 16th problem for quadratic vector ﬁelds
Find the maximum number H(2) of limit cycles, and their relative positions, for a quadratic vector field
\[ \dot x = p(x, y) = \sum_{i+j=0}^{2} a_{ij} x^i y^j, \qquad \dot y = q(x, y) = \sum_{i+j=0}^{2} b_{ij} x^i y^j \]
[GSS]. As of now neither part of the problem (i.e. the bound and the positions of the limit cycles) is solved, although R. Bamón showed in 1986 [PV] that a quadratic vector field has a finite number of limit cycles. In 1980 Shi Songling gave [SS] an example of a quadratic vector field which has four limit cycles (i.e. $H(2) \geq 4$).
REFERENCES
[DRR] Dumortier, F., Roussarie, R., Rousseau, C.: Hilbert's 16th Problem for Quadratic Vector Fields. Journal of Differential Equations 110, 86–133, 1994.
[BR] R. Bamón: Quadratic vector fields in the plane have a finite number of limit cycles, Publ. I.H.E.S. 64 (1986), 111–142.
[SS] Shi Songling, A concrete example of the existence of four limit cycles for plane quadratic systems, Scientia Sinica 23 (1980), 154–158.
Version: 6 Owner: Daume Author(s): Daume
Chapter 445 34C23 – Bifurcation
445.1 equivariant branching lemma
Let Γ be a Lie group acting absolutely irreducibly on V and let $g \in E_{x,\lambda}$ (where $E(\Gamma)$ is the space of Γ-equivariant germs, at the origin, of $C^\infty$ mappings of V into V) be a bifurcation problem with symmetry group Γ. Since V is absolutely irreducible, the Jacobian matrix is $(dg)_{0,\lambda} = c(\lambda) I$; we then suppose that $c'(0) \neq 0$. Let Σ be an isotropy subgroup satisfying $\dim \operatorname{Fix}(\Sigma) = 1$. Then there exists a unique smooth solution branch to g = 0 such that the isotropy subgroup of each solution is Σ. [GSS]
REFERENCES
[GSS] Golubitsky, Martin. Stewart, Ian. Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 2 Owner: Daume Author(s): Daume
Chapter 446 34C25 – Periodic solutions
446.1 Bendixson's negative criterion
Let
\[ \dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) \]
be a planar dynamical system where $\mathbf{f} = (X, Y)^t$ and $\mathbf{x} = (x, y)^t$. Furthermore $\mathbf{f} \in C^1(E)$ where E is a simply connected region of the plane. If $\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}$ (the divergence of the vector field $\mathbf{f}$, $\nabla \cdot \mathbf{f}$) is always of the same sign but not identically zero then there are no periodic solutions in the region E of the planar system.
Version: 1 Owner: Daume Author(s): Daume
446.2 Dulac's criteria
Let
\[ \dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) \]
be a planar dynamical system where $\mathbf{f} = (X, Y)^t$ and $\mathbf{x} = (x, y)^t$. Furthermore $\mathbf{f} \in C^1(E)$ where E is a simply connected region of the plane. If there exists a function $p(x, y) \in C^1(E)$ such that $\frac{\partial (p(x,y) X)}{\partial x} + \frac{\partial (p(x,y) Y)}{\partial y}$ (the divergence of the vector field $p(x, y)\mathbf{f}$, $\nabla \cdot p(x, y)\mathbf{f}$) is always of the same sign but not identically zero then there are no periodic solutions in the region E of the planar system. In addition, if A is an annular region contained in E on which the above condition is satisfied then there exists at most one periodic solution in A.
Version: 1 Owner: Daume Author(s): Daume
446.3 proof of Bendixson's negative criterion
Suppose that there exists a periodic solution, called Γ, which has a period of T and lies in E. Let the interior of Γ be denoted by D. Then by Green's theorem we can observe that
\[ \begin{aligned} \iint_D \nabla \cdot \mathbf{f}\; dx\, dy &= \iint_D \left( \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} \right) dx\, dy \\ &= \oint_\Gamma (X\, dy - Y\, dx) \\ &= \int_0^T (X \dot y - Y \dot x)\, dt \\ &= \int_0^T (XY - YX)\, dt = 0. \end{aligned} \]
Since $\nabla \cdot \mathbf{f}$ is not identically zero by hypothesis and is of one sign, the double integral on the left must be nonzero and of that sign. This leads to a contradiction since the right hand side is equal to zero. Therefore there does not exist a periodic solution in the simply connected region E.
Version: 1 Owner: Daume Author(s): Daume
Chapter 447 34C99 – Miscellaneous
447.1 Hartman-Grobman theorem
Consider the differential equation
\[ x' = f(x) \tag{447.1.1} \]
where f is a $C^1$ vector field. Assume that $x_0$ is a hyperbolic equilibrium of f. Denote by $\Phi_t(x)$ the flow of (447.1.1) through x at time t. Then there exists a homeomorphism $\varphi(x) = x + h(x)$ with h bounded, such that
\[ \varphi \circ e^{t Df(x_0)} = \Phi_t \circ \varphi \]
in a sufficiently small neighborhood of $x_0$. This fundamental theorem in the qualitative analysis of nonlinear differential equations states that, in a small neighborhood of $x_0$, the flow of the nonlinear equation (447.1.1) is qualitatively similar to that of the linear system $x' = Df(x_0)x$.
Version: 1 Owner: jarino Author(s): jarino
447.2 equilibrium point
Consider an autonomous differential equation
\[ \dot x = f(x). \tag{447.2.1} \]
An equilibrium (point) $x_0$ of (447.2.1) is such that $f(x_0) = 0$. If the linearization $Df(x_0)$ has no eigenvalue with zero real part, $x_0$ is said to be a hyperbolic equilibrium, whereas if there exists an eigenvalue with zero real part, the equilibrium point is nonhyperbolic.
Version: 5 Owner: Daume Author(s): Daume, jarino
447.3 stable manifold theorem
Let E be an open subset of $\mathbb{R}^n$ containing $x_0$, let $f \in C^1(E)$, and let $\varphi_t$ be the flow of the nonlinear system $x' = f(x)$. Suppose that $f(x_0) = 0$ and that $Df(x_0)$ has k eigenvalues with negative real part and n − k eigenvalues with positive real part. Then there exists a k-dimensional differentiable manifold S tangent to the stable subspace $E^S$ of the linear system $x' = Df(x_0)x$ at $x_0$ such that for all $t \geq 0$, $\varphi_t(S) \subset S$ and for all $y \in S$,
\[ \lim_{t \to \infty} \varphi_t(y) = x_0, \]
and there exists an (n − k)-dimensional differentiable manifold U tangent to the unstable subspace $E^U$ of $x' = Df(x_0)x$ at $x_0$ such that for all $t \leq 0$, $\varphi_t(U) \subset U$ and for all $y \in U$,
\[ \lim_{t \to -\infty} \varphi_t(y) = x_0. \]
Chapter 448 34D20 – Lyapunov stability
448.1 Lyapunov stable
A fixed point is Lyapunov stable if trajectories of nearby points remain close for all future time. More formally, the fixed point $x^*$ is Lyapunov stable if for any $\epsilon > 0$ there is a $\delta > 0$ such that for all $t \geq 0$ and for all x such that $d(x, x^*) < \delta$, we have $d(x(t), x^*) < \epsilon$. Version: 2 Owner: armbrusterb Author(s): yark, armbrusterb
448.2 neutrally stable fixed point
A fixed point is considered neutrally stable if it is Lyapunov stable but not attracting. A center is an example of such a fixed point. Version: 3 Owner: armbrusterb Author(s): Johan, armbrusterb
448.3
stable ﬁxed point
Let X be a vector ﬁeld on a manifold M. A ﬁxed point of X is said to be stable if it is both attracting and Lyapunov stable. Version: 5 Owner: alinabi Author(s): alinabi, yark, armbrusterb
Chapter 449 34L05 – General spectral theory
449.1 Gelfand spectral radius theorem
For every self-consistent matrix norm $\|\cdot\|$ and every square matrix $A$ we can write
$$\rho(A) = \lim_{n\to\infty} \|A^n\|^{1/n}.$$
Note: ρ(A) denotes the spectral radius of A. Version: 4 Owner: Johan Author(s): Johan
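The limit can be observed numerically. The sketch below is an illustration under stated assumptions: it uses the maximum-row-sum norm (an induced norm, hence self-consistent) and a hypothetical test matrix with integer entries so that the powers are computed exactly.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inf_norm(a):
    """Maximum absolute row sum, the norm induced by the max-norm on vectors."""
    return max(sum(abs(v) for v in row) for row in a)


def gelfand_estimate(a, n):
    """Return ||A^n||^(1/n) for an integer n >= 1."""
    power = a
    for _ in range(n - 1):
        power = mat_mul(power, a)
    return inf_norm(power) ** (1.0 / n)


A = [[1, 2], [3, 4]]
RHO = (5 + 33 ** 0.5) / 2  # larger root of t^2 - 5t - 2, the spectral radius of A

for n in (1, 10, 100):
    print(n, gelfand_estimate(A, n), RHO)
```

Convergence is slow (the error behaves roughly like a constant divided by n), which is typical for this formula.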
Chapter 450 34L15 – Estimation of eigenvalues, upper and lower bounds
450.1 Rayleigh quotient
The Rayleigh quotient $R_A$ of the Hermitian matrix $A$ is defined as
$$R_A(x) = \frac{x^H A x}{x^H x}, \qquad x \neq 0.$$
Version: 1 Owner: Johan Author(s): Johan
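A direct evaluation of the quotient is sketched below, assuming the matrix is passed as nested lists and is Hermitian (so the result is real and the imaginary part can be dropped). The example matrix is an illustrative choice; its eigenvectors return the corresponding eigenvalues.

```python
def rayleigh_quotient(a, x):
    """Compute (x^H A x) / (x^H x) for a Hermitian matrix a and nonzero x."""
    n = len(x)
    numerator = sum(complex(x[i]).conjugate() * a[i][j] * x[j]
                    for i in range(n) for j in range(n))
    denominator = sum(abs(xi) ** 2 for xi in x)
    if denominator == 0:
        raise ValueError("x must be nonzero")
    # For Hermitian a the numerator is real up to rounding.
    return (numerator / denominator).real


A = [[2, 1], [1, 2]]  # symmetric, eigenvalues 1 and 3

print(rayleigh_quotient(A, [1, 1]))   # eigenvector for eigenvalue 3
print(rayleigh_quotient(A, [1, -1]))  # eigenvector for eigenvalue 1
print(rayleigh_quotient(A, [1, 0]))   # a value between the two eigenvalues
```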
Chapter 451 34L40 – Particular operators (Dirac, one-dimensional Schrödinger, etc.)
451.1 Dirac delta function
The Dirac delta “function” $\delta(x)$ is not a true function, since it cannot be defined completely by giving the function value for all values of the argument $x$. Similar to the Kronecker delta, the notation $\delta(x)$ stands for
$$\delta(x) = 0 \ \text{ for } x \neq 0, \qquad \int_{-\infty}^{\infty} \delta(x)\,dx = 1.$$
For any continuous function $F$:
$$\int_{-\infty}^{\infty} \delta(x) F(x)\,dx = F(0),$$
or in $n$ dimensions:
$$\int_{\mathbb{R}^n} \delta(x - s) f(s)\,d^n s = f(x).$$
$\delta(x)$ can also be defined as a normalized Gaussian function (normal distribution) in the limit of zero width.
References
• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 2 Owner: akrowne Author(s): akrowne
451.2
construction of Dirac delta function
The Dirac delta function is notorious in mathematical circles for having no actual realization as a function. However, a little known secret is that in the domain of nonstandard analysis, the Dirac delta function admits a completely legitimate construction as an actual function. We give this construction here.
Choose any positive infinitesimal $\varepsilon$ and define the hyperreal-valued function $\delta : {}^*\mathbb{R} \to {}^*\mathbb{R}$ by
$$\delta(x) := \begin{cases} 1/\varepsilon & -\varepsilon/2 < x < \varepsilon/2, \\ 0 & \text{otherwise.} \end{cases}$$
We verify that the above function satisfies the required properties of the Dirac delta function. By definition, $\delta(x) = 0$ for all nonzero real numbers $x$. Moreover,
$$\int_{-\infty}^{\infty} \delta(x)\,dx = \int_{-\varepsilon/2}^{\varepsilon/2} \frac{1}{\varepsilon}\,dx = 1,$$
so the integral property is satisfied. Finally, for any continuous real function $f : \mathbb{R} \to \mathbb{R}$, choose an infinitesimal $z > 0$ such that $|f(x) - f(0)| < z$ for all $|x| < \varepsilon/2$; then
$$\varepsilon \cdot \frac{f(0) - z}{\varepsilon} < \int_{-\infty}^{\infty} \delta(x) f(x)\,dx < \varepsilon \cdot \frac{f(0) + z}{\varepsilon},$$
which implies that $\int_{-\infty}^{\infty} \delta(x) f(x)\,dx$ is within an infinitesimal of $f(0)$, and thus has real part equal to $f(0)$. Version: 2 Owner: djao Author(s): djao
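A floating-point analogue of this construction can be sketched as follows. It is only an illustration: an ordinary small real number `eps` stands in for the infinitesimal, the integral is approximated by a midpoint rule, and the function name and sample count are assumptions made for this sketch.

```python
import math


def box_delta_integral(f, eps, samples=1000):
    """Midpoint-rule approximation of the integral of delta_eps(x) * f(x),
    where delta_eps equals 1/eps on (-eps/2, eps/2) and 0 elsewhere."""
    h = eps / samples
    total = 0.0
    for i in range(samples):
        x = -eps / 2 + (i + 0.5) * h
        total += (1.0 / eps) * f(x) * h
    return total


# As eps shrinks, the value approaches f(0) = cos(0) = 1.
for eps in (1.0, 0.1, 0.001):
    print(eps, box_delta_integral(math.cos, eps))
```

With a real `eps` the result differs from $f(0)$ by a small real amount; in the nonstandard construction the difference is infinitesimal.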
Chapter 452 35-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
452.1 diﬀerential operator
Roughly speaking, a differential operator is a mapping, typically understood to be linear, that transforms a function into another function by means of partial derivatives and multiplication by other functions.
On $\mathbb{R}^n$, a differential operator is commonly understood to be a linear transformation of $C^\infty(\mathbb{R}^n)$ having the form
$$f \mapsto \sum_I a_I f_I, \qquad f \in C^\infty(\mathbb{R}^n),$$
where the sum is taken over a finite number of multi-indices $I = (i_1, \ldots, i_n) \in \mathbb{N}^n$, where $a_I \in C^\infty(\mathbb{R}^n)$, and where $f_I$ denotes a partial derivative of $f$ taken $i_1$ times with respect to the first variable, $i_2$ times with respect to the second variable, etc. The order of the operator is the maximum number of derivatives taken in the above formula, i.e. the maximum of $i_1 + \ldots + i_n$ taken over all the $I$ involved in the above summation.
On a $C^\infty$ manifold $M$, a differential operator is commonly understood to be a linear transformation of $C^\infty(M)$ having the above form relative to some system of coordinates. Alternatively, one can equip $C^\infty(M)$ with the limit-order topology, and define a differential operator as a continuous transformation of $C^\infty(M)$.
The order of a differential operator is a more subtle notion on a manifold than on $\mathbb{R}^n$. There are two complications. First, one would like a definition that is independent of any particular system of coordinates. Furthermore, the order of an operator is at best a local concept: it can change from point to point, and indeed be unbounded if the manifold is non-compact.
To address these issues, for a differential operator $T$ and $x \in M$, we define $\mathrm{ord}_x(T)$, the order of $T$ at $x$, to be the smallest $k \in \mathbb{N}$ such that $T[f^{k+1}](x) = 0$ for all $f \in C^\infty(M)$ such that $f(x) = 0$. For a fixed differential operator $T$, the function $\mathrm{ord}(T) : M \to \mathbb{N}$ defined by $x \mapsto \mathrm{ord}_x(T)$ is lower semicontinuous, meaning that $\mathrm{ord}_y(T) \ge \mathrm{ord}_x(T)$ for all $y \in M$ sufficiently close to $x$. The global order of $T$ is defined to be the maximum of $\mathrm{ord}_x(T)$ taken over all $x \in M$. This maximum may not exist if $M$ is non-compact, in which case one says that the order of $T$ is infinite.
Let us conclude by making two remarks. The notion of a differential operator can be generalized even further by allowing the operator to act on sections of a bundle. A differential operator $T$ is a local operator, meaning that
$$T[f](x) = T[g](x), \qquad f, g \in C^\infty(M),\ x \in M,$$
if $f \equiv g$ in some neighborhood of $x$. A theorem proved by Peetre states that the converse is also true, namely that every local operator is necessarily a differential operator.
References
1. Dieudonné, J.A., Foundations of modern analysis
2. Peetre, J., “Une caractérisation abstraite des opérateurs différentiels”, Math. Scand., v. 7, 1959, p. 211
Version: 5 Owner: rmilson Author(s): rmilson
Chapter 453 35J05 – Laplace equation, reduced wave equation (Helmholtz), Poisson equation
453.1 Poisson’s equation
Poisson’s equation is a second-order partial differential equation which arises in physical problems such as finding the electrical potential of a given charge distribution. Its general form in $n$ dimensions is
$$\nabla^2 \phi(r) = \rho(r),$$
where $\nabla^2$ is the Laplacian and $\rho : D \to \mathbb{R}$, often called a source function, is a given function on some subset $D$ of $\mathbb{R}^n$. If $\rho$ is identically zero, the Poisson equation reduces to the Laplace equation.
The Poisson equation is linear, and therefore obeys the superposition principle: if $\nabla^2 \phi_1 = \rho_1$ and $\nabla^2 \phi_2 = \rho_2$, then $\nabla^2(\phi_1 + \phi_2) = \rho_1 + \rho_2$. This fact can be used to construct solutions to Poisson’s equation from fundamental solutions, where the source distribution is a delta function.
A very important case is the one in which $n = 3$, $D$ is all of $\mathbb{R}^3$, and $\phi(r) \to 0$ as $|r| \to \infty$. The general solution is then given by
$$\phi(r) = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\rho(r')}{|r - r'|}\,d^3 r'.$$
Version: 2 Owner: pbruin Author(s): pbruin
Chapter 454 35L05 – Wave equation
454.1 wave equation
The wave equation is a partial differential equation which describes many kinds of waves. It arises in various physical situations, such as vibrating strings, sound waves, and electromagnetic waves. The wave equation in one dimension is
$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}.$$
The general solution of the one-dimensional wave equation can be obtained by a change of variables $(x, t) \mapsto (\xi, \eta)$, where $\xi = x - ct$ and $\eta = x + ct$. This gives $\frac{\partial^2 u}{\partial \xi\,\partial \eta} = 0$, which we can integrate to get d’Alembert’s solution:
$$u(x, t) = F(x - ct) + G(x + ct),$$
where $F$ and $G$ are twice differentiable functions. $F$ and $G$ represent waves travelling in the positive and negative $x$ directions, respectively, with velocity $c$. These functions can be obtained if appropriate starting or boundary conditions are given. For example, if $u(x, 0) = f(x)$ and $\frac{\partial u}{\partial t}(x, 0) = g(x)$ are given, the solution is
$$u(x, t) = \frac{1}{2}\,[f(x - ct) + f(x + ct)] + \frac{1}{2c} \int_{x-ct}^{x+ct} g(s)\,ds.$$
In general, the wave equation in $n$ dimensions is
$$\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u,$$
where $u$ is a function of the location variables $x_1, x_2, \ldots, x_n$, and time $t$. Here, $\nabla^2$ is the Laplacian with respect to the location variables, which in Cartesian coordinates is given by
$$\nabla^2 = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \cdots + \frac{\partial^2}{\partial x_n^2}.$$
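The solution formula for given initial data $f$ and $g$ can be evaluated directly. The following is a minimal sketch, assuming a midpoint rule for the integral of $g$; the function name, the sample count and the test data are illustrative assumptions.

```python
import math


def dalembert(f, g, c, x, t, samples=2000):
    """Evaluate u(x, t) for initial displacement f and initial velocity g
    using d'Alembert's formula, with a midpoint rule for the integral."""
    h = 2 * c * t / samples
    integral = sum(g(x - c * t + (i + 0.5) * h) for i in range(samples)) * h
    return 0.5 * (f(x - c * t) + f(x + c * t)) + integral / (2 * c)


c, x, t = 2.0, 0.3, 0.7

# f = sin, g = 0 gives the standing wave sin(x) cos(ct).
print(dalembert(math.sin, lambda s: 0.0, c, x, t), math.sin(x) * math.cos(c * t))

# f = 0, g = cos gives cos(x) sin(ct) / c.
print(dalembert(lambda s: 0.0, math.cos, c, x, t), math.cos(x) * math.sin(c * t) / c)
```

Both closed forms follow from the sum-to-product identities for sine applied to the formula.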
Version: 4 Owner: pbruin Author(s): pbruin
Chapter 455 35Q53 – KdV-like equations (Korteweg-de Vries, Burgers, sine-Gordon, sinh-Gordon, etc.)
455.1 Korteweg-de Vries equation
The Korteweg-de Vries equation is
$$u_t = u u_x + u_{xxx}, \qquad (455.1.1)$$
where $u = u(x, t)$ and the subscripts indicate derivatives. Version: 4 Owner: superhiggs Author(s): superhiggs
Chapter 456 35Q99 – Miscellaneous
456.1 heat equation
The heat equation in one dimension (for example, along a metal wire) is a partial differential equation of the following form:
$$\frac{\partial u}{\partial t} = c^2 \cdot \frac{\partial^2 u}{\partial x^2},$$
also written as
$$u_t = c^2 \cdot u_{xx},$$
where $u : \mathbb{R}^2 \to \mathbb{R}$ is the function giving the temperature at time $t$ and position $x$, and $c$ is a real-valued constant. This can be easily extended to 2 or 3 dimensions as
$$u_t = c^2 \cdot (u_{xx} + u_{yy})$$
and
$$u_t = c^2 \cdot (u_{xx} + u_{yy} + u_{zz}).$$
Note that in the steady state, that is when $u_t = 0$, we are left with the Laplace equation for $u$:
$$\Delta u = 0.$$
Version: 2 Owner: dublisk Author(s): dublisk
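A simple way to see the equation at work is an explicit finite-difference scheme. The sketch below is an illustration, not part of the entry: it assumes a rod on $[0, 1]$ whose ends are held at temperature zero, the initial profile $\sin(\pi x)$, and the forward-time centered-space scheme, which is stable when $c^2\,\Delta t / \Delta x^2 \le 1/2$. For this initial profile the exact solution is $e^{-c^2 \pi^2 t} \sin(\pi x)$.

```python
import math


def heat_ftcs(u0, c, dx, dt, steps):
    """Advance u_t = c^2 u_xx with the explicit FTCS scheme.
    The two end values of u0 are kept fixed (Dirichlet boundary)."""
    r = c * c * dt / (dx * dx)
    if r > 0.5:
        raise ValueError("explicit scheme needs c^2 dt / dx^2 <= 1/2")
    u = list(u0)
    for _ in range(steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
        u = new
    return u


nx = 21
dx = 1.0 / (nx - 1)
initial = [math.sin(math.pi * i * dx) for i in range(nx)]
final = heat_ftcs(initial, c=1.0, dx=dx, dt=0.001, steps=100)  # time t = 0.1

exact_mid = math.exp(-math.pi ** 2 * 0.1)  # exact value at x = 0.5, t = 0.1
print(final[nx // 2], exact_mid)
```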
Chapter 457 37-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
Chapter 458 37A30 – Ergodic theorems, spectral theory, Markov operators
458.1 ergodic
Let (X, B, µ) be a probability space, and T : X → X be a measurepreserving transformation. We call T ergodic if for A ∈ B, T A = A ⇒ µ(A) = 0 or µ(A) = 1. (458.1.1)
That is, $T$ takes almost all sets all over the space. The only sets it leaves invariant are sets of measure zero and sets of full measure. Version: 2 Owner: drummond Author(s): drummond
458.2
fundamental theorem of demography
Let $A_t$ be a sequence of $n \times n$ nonnegative primitive matrices. Suppose that $A_t \to A_\infty$, with $A_\infty$ also a nonnegative primitive matrix. Define the sequence $x_{t+1} = A_t x_t$, with $x_t \in \mathbb{R}^n$. If $x_0 \ge 0$, then
$$\lim_{t\to\infty} \frac{x_t}{\|x_t\|} = p,$$
where $p$ is the normalized ($\|p\| = 1$) eigenvector associated to the dominant eigenvalue of $A_\infty$ (also called the Perron-Frobenius eigenvector of $A_\infty$). Version: 3 Owner: jarino Author(s): jarino
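The statement can be illustrated with a small simulation. Everything specific in this sketch is an assumption made for illustration: the 2x2 matrices `a_t(t)`, which are nonnegative, primitive, and converge to $A_\infty = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, and the choice of the 1-norm for normalization. Renormalizing at every step does not change the direction of $x_t$, because the iteration is linear.

```python
def apply(matrix, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def normalise(x):
    """Scale x to have 1-norm equal to 1."""
    s = sum(abs(v) for v in x)
    return [v / s for v in x]


def a_t(t):
    """Hypothetical A_t: nonnegative, primitive, converging to [[1, 1], [1, 0]]."""
    return [[1.0, 1.0], [1.0, 0.5 ** t]]


def limiting_direction(x0, steps):
    """Direction x_t / ||x_t|| after the given number of steps."""
    x = normalise(x0)
    for t in range(steps):
        x = normalise(apply(a_t(t), x))
    return x


golden = (1 + 5 ** 0.5) / 2
# Perron-Frobenius eigenvector of the limit matrix, normalized in the 1-norm.
perron = normalise([golden, 1.0])

print(limiting_direction([1.0, 0.0], 200), perron)
```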
458.3
proof of fundamental theorem of demography
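• First we will prove that there exist $m, M > 0$ such that
$$m \le \frac{\|x_{k+1}\|}{\|x_k\|} \le M, \quad \forall k \qquad (458.3.1)$$
with $m$ and $M$ independent of the sequence. In order to show this we use the primitivity of the matrices $A_k$ and $A_\infty$. Primitivity of $A_\infty$ implies that there exists $l \in \mathbb{N}$ such that $A_\infty^l \gg 0$ (all entries strictly positive). By continuity, this implies that there exists $k_0$ such that, for all $k \ge k_0$, we have
$$A_{k+l} A_{k+l-1} \cdots A_k \gg 0.$$
Let us then write $x_{k+l+1}$ as a function of $x_k$:
$$x_{k+l+1} = A_{k+l} \cdots A_k x_k.$$
We thus have
$$\|x_{k+l+1}\| \le C^{l+1} \|x_k\|, \qquad (458.3.2)$$
where $C$ denotes a uniform bound on the norms $\|A_k\|$. But since the matrices $A_{k+l}, \ldots, A_k$ are strictly positive for $k \ge k_0$, there exists an $\varepsilon > 0$ such that each component of these matrices is greater than or equal to $\varepsilon$. From this we deduce that
$$\|x_{k+l+1}\| \ge \varepsilon \|x_k\|, \quad \forall k \ge k_0.$$
Applying relation (458.3.2), we then have that
$$C^l \|x_{k+1}\| \ge \varepsilon \|x_k\|, \quad \forall k \ge k_0,$$
which yields
$$\|x_{k+1}\| \ge \frac{\varepsilon}{C^l} \|x_k\|, \quad \forall k \ge k_0,$$
and so we indeed have relation (458.3.1).
• Let us denote by $e_k$ the (normalised) Perron eigenvector of $A_k$. Thus
$$A_k e_k = \lambda_k e_k, \qquad \|e_k\| = 1.$$
Let us denote by $\pi_k$ the projection on the supplementary space of $\{e_k\}$ invariant by $A_k$. Choosing a proper norm, we can find $\varepsilon > 0$ such that
$$|A_k \pi_k| \le (\lambda_k - \varepsilon), \quad \forall k.$$
• We shall now prove that
$$\frac{\langle e^*_{k+1}, x_{k+1} \rangle}{\langle e^*_k, x_k \rangle} \to \lambda_\infty \quad \text{when } k \to \infty.$$
In order to do this, we compute the inner product of the sequence $x_{k+1} = A_k x_k$ with the $e_k$’s:
$$\langle e^*_{k+1}, x_{k+1} \rangle = \langle e^*_{k+1} - e^*_k, A_k x_k \rangle + \lambda_k \langle e^*_k, x_k \rangle = o(\langle e^*_k, x_k \rangle) + \lambda_k \langle e^*_k, x_k \rangle.$$
Therefore we have
$$\frac{\langle e^*_{k+1}, x_{k+1} \rangle}{\langle e^*_k, x_k \rangle} = o(1) + \lambda_k.$$
• Now assume
$$u_k = \frac{\pi_k x_k}{\langle e^*_k, x_k \rangle}.$$
We will verify that $u_k \to 0$ when $k \to \infty$. We have
$$u_{k+1} = (\pi_{k+1} - \pi_k) A_k \frac{x_k}{\langle e^*_{k+1}, x_{k+1} \rangle} + \frac{\langle e^*_k, x_k \rangle}{\langle e^*_{k+1}, x_{k+1} \rangle}\, A_k \pi_k \frac{x_k}{\langle e^*_k, x_k \rangle},$$
and so
$$|u_{k+1}| \le |\pi_{k+1} - \pi_k|\, C + \frac{\langle e^*_k, x_k \rangle}{\langle e^*_{k+1}, x_{k+1} \rangle} (\lambda_k - \varepsilon) |u_k|.$$
We deduce that there exists $k_1 \ge k_0$ such that, for all $k \ge k_1$,
$$|u_{k+1}| \le \delta_k + \left(\lambda_\infty - \frac{\varepsilon}{2}\right) |u_k|,$$
where we have noted
$$\delta_k = |\pi_{k+1} - \pi_k|\, C.$$
We have $\delta_k \to 0$ when $k \to \infty$; we thus finally deduce that
$$|u_k| \to 0 \quad \text{when } k \to \infty.$$
Remark that this also implies that
$$z_k = \frac{\pi_k x_k}{\|x_k\|} \to 0 \quad \text{when } k \to \infty.$$
• We have $z_k \to 0$ when $k \to \infty$, and $x_k / \|x_k\|$ can be written
$$\frac{x_k}{\|x_k\|} = \alpha_k e_k + z_k.$$
Therefore, we have $\|\alpha_k e_k\| \to 1$ when $k \to \infty$, which implies that $\alpha_k$ tends to 1, since we have chosen $e_k$ to be normalised (i.e., $\|e_k\| = 1$). We then can conclude that
$$\frac{x_k}{\|x_k\|} \to e_\infty \quad \text{when } k \to \infty,$$
and the proof is done. Version: 2 Owner: jarino Author(s): jarino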
Chapter 459 37B05 – Transformations and group actions with special properties (minimality, distality, proximality, etc.)
459.1 discontinuous action
Let $X$ be a topological space and $G$ a group that acts on $X$ by homeomorphisms. The action of $G$ is said to be discontinuous at $x \in X$ if there is a neighborhood $U$ of $x$ such that the set
$$\{g \in G \mid gU \cap U \neq \emptyset\}$$
is finite. The action is called discontinuous if it is discontinuous at every point.
Remark 1. If $G$ acts discontinuously then the orbits of the action have no accumulation points, i.e. if $\{g_n\}$ is a sequence of distinct elements of $G$ and $x \in X$ then the sequence $\{g_n x\}$ has no limit points. If $X$ is locally compact then an action that satisfies this condition is discontinuous.
Remark 2. Assume that $X$ is a locally compact Hausdorff space and let $\mathrm{Aut}(X)$ denote the group of self-homeomorphisms of $X$ endowed with the compact-open topology. If $\rho : G \to \mathrm{Aut}(X)$ defines a discontinuous action then the image $\rho(G)$ is a discrete subset of $\mathrm{Aut}(X)$. Version: 2 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 460 37B20 – Notions of recurrence
460.1 nonwandering set
Let $X$ be a metric space, and $f : X \to X$ a continuous surjection. An element $x$ of $X$ is a wandering point if there is a neighborhood $U$ of $x$ and an integer $N$ such that, for all $n \ge N$, $f^n(U) \cap U = \emptyset$. If $x$ is not wandering, we call it a nonwandering point. Equivalently, $x$ is a nonwandering point if for every neighborhood $U$ of $x$ there is $n \ge 1$ such that $f^n(U) \cap U$ is nonempty. The set of all nonwandering points is called the nonwandering set of $f$, and is denoted by $\Omega(f)$. If $X$ is compact, then $\Omega(f)$ is compact, nonempty, and forward invariant; if, additionally, $f$ is a homeomorphism, then $\Omega(f)$ is invariant. Version: 1 Owner: Koro Author(s): Koro
Chapter 461 37B99 – Miscellaneous
461.1 ω-limit set
Let $X$ be a metric space, and let $f : X \to X$ be a homeomorphism. The ω-limit set of $x \in X$, denoted by $\omega(x, f)$, is the set of cluster points of the forward orbit $\{f^n(x)\}_{n \in \mathbb{N}}$. Hence, $y \in \omega(x, f)$ if and only if there is a strictly increasing sequence of natural numbers $\{n_k\}_{k \in \mathbb{N}}$ such that $f^{n_k}(x) \to y$ as $k \to \infty$. Another way to express this is
$$\omega(x, f) = \bigcap_{n \in \mathbb{N}} \overline{\{f^k(x) : k > n\}}.$$
The α-limit set is defined in a similar fashion, but for the backward orbit; i.e. $\alpha(x, f) = \omega(x, f^{-1})$. Both sets are $f$-invariant, and if $X$ is compact, they are compact and nonempty.
If $\varphi : X \times \mathbb{R} \to X$ is a continuous flow, the definition is similar: $\omega(x, \varphi)$ consists of those elements $y$ of $X$ for which there exists a strictly increasing sequence $\{t_n\}$ of real numbers such that $t_n \to \infty$ and $\varphi(x, t_n) \to y$ as $n \to \infty$. Similarly, $\alpha(x, \varphi)$ is the ω-limit set of the reversed flow (i.e. $\psi(x, t) = \varphi(x, -t)$). Again, these sets are invariant and if $X$ is compact they are compact and nonempty. Furthermore,
$$\omega(x, \varphi) = \bigcap_{n \in \mathbb{N}} \overline{\{\varphi(x, t) : t > n\}}.$$
Version: 2 Owner: Koro Author(s): Koro
461.2
asymptotically stable
Let $(X, d)$ be a metric space and $f : X \to X$ a continuous function. A point $x \in X$ is said to be Lyapunov stable if for each $\epsilon > 0$ there is $\delta > 0$ such that for all $n \in \mathbb{N}$ and all $y \in X$ such that $d(x, y) < \delta$, we have $d(f^n(x), f^n(y)) < \epsilon$. We say that $x$ is asymptotically stable if it belongs to the interior of its stable set, i.e. if there is $\delta > 0$ such that $\lim_{n\to\infty} d(f^n(x), f^n(y)) = 0$ whenever $d(x, y) < \delta$.
In a similar way, if $\varphi : X \times \mathbb{R} \to X$ is a flow, a point $x \in X$ is said to be Lyapunov stable if for each $\epsilon > 0$ there is $\delta > 0$ such that, whenever $d(x, y) < \delta$, we have $d(\varphi(x, t), \varphi(y, t)) < \epsilon$ for each $t \ge 0$; and $x$ is called asymptotically stable if there is a neighborhood $U$ of $x$ such that $\lim_{t\to\infty} d(\varphi(x, t), \varphi(y, t)) = 0$ for each $y \in U$. Version: 6 Owner: Koro Author(s): Koro
461.3
expansive
If $(X, d)$ is a metric space, a homeomorphism $f : X \to X$ is said to be expansive if there is a constant $\varepsilon_0 > 0$, called the expansivity constant, such that for any two distinct points of $X$, their $n$-th iterates are at least $\varepsilon_0$ apart for some integer $n$; i.e. if for any pair of points $x \neq y$ in $X$ there is $n \in \mathbb{Z}$ such that $d(f^n(x), f^n(y)) \ge \varepsilon_0$.
The space $X$ is often assumed to be compact, since under that assumption expansivity is a topological property; i.e. if $d'$ is any other metric generating the same topology as $d$, and if $f$ is expansive in $(X, d)$, then $f$ is expansive in $(X, d')$ (possibly with a different expansivity constant).
If $f : X \to X$ is a continuous map, we say that $f$ is positively expansive (or forward expansive) if there is $\varepsilon_0 > 0$ such that, for any $x \neq y$ in $X$, there is $n \in \mathbb{N}$ such that $d(f^n(x), f^n(y)) \ge \varepsilon_0$.
Remarks. The latter condition is much stronger than expansivity. In fact, one can prove that if $X$ is compact and $f$ is a positively expansive homeomorphism, then $X$ is finite. Version: 9 Owner: Koro Author(s): Koro
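Positive expansivity for a continuous map can be illustrated with the doubling map $x \mapsto 2x \bmod 1$ on the circle $\mathbb{R}/\mathbb{Z}$. This example is an assumption of this note, not part of the entry. The map is continuous but not a homeomorphism, and it doubles any circle distance smaller than $1/4$, so $\varepsilon_0 = 1/4$ works. Exact rational arithmetic is used because repeated doubling destroys floating-point precision.

```python
from fractions import Fraction


def doubling(x):
    """The doubling map x -> 2x mod 1 on the circle R/Z."""
    return (2 * x) % 1


def circle_distance(x, y):
    """Distance on R/Z between the classes of x and y."""
    d = abs(x - y) % 1
    return min(d, 1 - d)


def first_separation_time(x, y, eps0, max_iter=10_000):
    """Smallest n >= 0 with d(f^n(x), f^n(y)) >= eps0, or None if not found."""
    for n in range(max_iter):
        if circle_distance(x, y) >= eps0:
            return n
        x, y = doubling(x), doubling(y)
    return None


x = Fraction(1, 3)
y = x + Fraction(1, 10 ** 6)
print(first_separation_time(x, y, Fraction(1, 4)))
```

Two points that start one millionth apart are separated by at least $1/4$ after 18 iterations, since $2^{18} \cdot 10^{-6} \approx 0.262$ while $2^{17} \cdot 10^{-6} \approx 0.131$.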
461.4
the only compact metric spaces that admit a positively expansive homeomorphism are discrete spaces
Theorem. Let $(X, d)$ be a compact metric space. If there exists a positively expansive homeomorphism $f : X \to X$, then $X$ consists only of isolated points, i.e. $X$ is finite.
Lemma 1. If $(X, d)$ is a compact metric space and there is an expansive homeomorphism $f : X \to X$ such that every point is Lyapunov stable, then every point is asymptotically stable.
Proof. Let $2c$ be the expansivity constant of $f$. Suppose some point $x$ is not asymptotically stable, and let $\delta$ be such that $d(x, y) < \delta$ implies $d(f^n(x), f^n(y)) < c$ for all $n \in \mathbb{N}$. Then there exist $\epsilon > 0$, a point $y$ with $d(x, y) < \delta$, and an increasing sequence $\{n_k\}$ such that $d(f^{n_k}(y), f^{n_k}(x)) > \epsilon$ for each $k$. By uniform expansivity, there is $N > 0$ such that for every $u$ and $v$ such that $d(u, v) > \epsilon$ there is $n \in \mathbb{Z}$ with $|n| < N$ such that $d(f^n(u), f^n(v)) > c$. Choose $k$ so large that $n_k > N$. Then there is $n$ with $|n| < N$ such that $d(f^{n+n_k}(x), f^{n+n_k}(y)) = d(f^n(f^{n_k}(x)), f^n(f^{n_k}(y))) > c$. But since $n + n_k > 0$, this contradicts the choice of $\delta$. Hence every point is asymptotically stable.
Lemma 2. If $(X, d)$ is a compact metric space and $f : X \to X$ is a continuous surjection such that every point is asymptotically stable, then $X$ is finite.
Proof. For each $x \in X$ let $K_x$ be a closed neighborhood of $x$ such that for all $y \in K_x$ we have $\lim_{n\to\infty} d(f^n(x), f^n(y)) = 0$. We assert that $\lim_{n\to\infty} \mathrm{diam}(f^n(K_x)) = 0$. In fact, if that is not the case, then there is an increasing sequence of positive integers $\{n_k\}$, some $\epsilon > 0$ and a sequence $\{x_k\}$ of points of $K_x$ such that $d(f^{n_k}(x), f^{n_k}(x_k)) > \epsilon$, and there is a subsequence $\{x_{k_i}\}$ converging to some point $y \in K_x$ for which $\limsup d(f^n(x), f^n(y)) \ge \epsilon$, contradicting the choice of $K_x$.
Now since $X$ is compact, there are finitely many points $x_1, \ldots, x_m$ such that $X = \bigcup_{i=1}^m K_{x_i}$, so that $X = f^n(X) = \bigcup_{i=1}^m f^n(K_{x_i})$. To show that $X = \{x_1, \ldots, x_m\}$, suppose there is $y \in X$ such that $r = \min\{d(y, x_i) : 1 \le i \le m\} > 0$. Then there is $n$ such that $\mathrm{diam}(f^n(K_{x_i})) < r$ for $1 \le i \le m$, but since $y \in f^n(K_{x_i})$ for some $i$, we have a contradiction.
Proof of the theorem. Consider the sets $K_\epsilon = \{(x, y) \in X \times X : d(x, y) \ge \epsilon\}$ for $\epsilon > 0$ and $U = \{(x, y) \in X \times X : d(x, y) > c\}$, where $2c$ is the expansivity constant of $f$, and let $F : X \times X \to X \times X$ be the mapping given by $F(x, y) = (f(x), f(y))$. It is clear that $F$ is a homeomorphism. By uniform expansivity, we know that for each $\epsilon > 0$ there is $N_\epsilon$ such that for all $(x, y) \in K_\epsilon$, there is $n \in \{1, \ldots, N_\epsilon\}$ such that $F^n(x, y) \in U$.
We will prove that for each $\epsilon > 0$, there is $\delta > 0$ such that $F^n(K_\epsilon) \subset K_\delta$ for all $n \in \mathbb{N}$. This is equivalent to saying that every point of $X$ is Lyapunov stable for $f^{-1}$, and by the previous lemmas the proof will be completed.
Let $K = \bigcup_{n=0}^{N_\epsilon} F^n(K_\epsilon)$, and let $\delta_0 = \min\{d(x, y) : (x, y) \in K\}$. Since $K$ is compact, the minimum distance $\delta_0$ is reached at some point of $K$; i.e. there exist $(x, y) \in K_\epsilon$ and $0 \le n \le N_\epsilon$ such that $d(f^n(x), f^n(y)) = \delta_0$. Since $f$ is injective, it follows that $\delta_0 > 0$ and letting $\delta = \delta_0 / 2$ we have $K \subset K_\delta$.
Given $\alpha \in K - K_\epsilon$, there is $\beta \in K_\epsilon$ and some $0 < m \le N_\epsilon$ such that $\alpha = F^m(\beta)$, and $F^k(\beta) \notin K_\epsilon$ for $0 < k \le m$. Also, there is $n$ with $0 < m < n \le N_\epsilon$ such that $F^n(\beta) \in U \subset K_\epsilon$. Hence $m < N_\epsilon$, and $F(\alpha) = F^{m+1}(\beta) \in F^{m+1}(K_\epsilon) \subset K$. On the other hand, $F(K_\epsilon) \subset K$. Therefore $F(K) \subset K$, and inductively $F^n(K) \subset K$ for any $n \in \mathbb{N}$. It follows that $F^n(K_\epsilon) \subset F^n(K) \subset K \subset K_\delta$ for each $n \in \mathbb{N}$ as required. Version: 5 Owner: Koro Author(s): Koro
461.5
topological conjugation
Let $X$ and $Y$ be topological spaces, and let $f : X \to X$ and $g : Y \to Y$ be continuous functions. We say that $f$ is topologically semiconjugate to $g$ if there exists a continuous surjection $h : Y \to X$ such that $fh = hg$. If $h$ is a homeomorphism, then we say that $f$ and $g$ are topologically conjugate, and we call $h$ a topological conjugation between $f$ and $g$. Similarly, a flow $\varphi$ on $X$ is topologically semiconjugate to a flow $\psi$ on $Y$ if there is a continuous surjection $h : Y \to X$ such that $\varphi(h(y), t) = h\psi(y, t)$ for each $y \in Y$, $t \in \mathbb{R}$. If $h$ is a homeomorphism then $\psi$ and $\varphi$ are topologically conjugate.
461.5.1
Remarks
Topological conjugation defines an equivalence relation in the space of all continuous surjections of a topological space to itself, by declaring $f$ and $g$ to be related if they are topologically conjugate. This equivalence relation is very useful in the theory of dynamical systems, since each class contains all functions which share the same dynamics from the topological viewpoint. In fact, orbits of $g$ are mapped to homeomorphic orbits of $f$ through the conjugation. Writing $g = h^{-1} f h$ makes this fact evident: $g^n = h^{-1} f^n h$. Speaking informally, topological conjugation is a “change of coordinates” in the topological sense.
However, the analogous definition for flows is somewhat restrictive. In fact, we are requiring the maps $\varphi(\cdot, t)$ and $\psi(\cdot, t)$ to be topologically conjugate for each $t$, which is requiring more than simply that orbits of $\varphi$ be mapped to orbits of $\psi$ homeomorphically. This motivates the definition of topological equivalence, which also partitions the set of all flows in $X$ into classes of flows sharing the same dynamics, again from the topological viewpoint.
We say that $\psi$ and $\varphi$ are topologically equivalent if there is a homeomorphism $h : Y \to X$, mapping orbits of $\psi$ to orbits of $\varphi$ homeomorphically, and preserving orientation of the orbits. This means that:
1. $h(O(y, \psi)) = \{h(\psi(y, t)) : t \in \mathbb{R}\} = \{\varphi(h(y), t) : t \in \mathbb{R}\} = O(h(y), \varphi)$ for each $y \in Y$;
2. for each $y \in Y$, there is $\delta > 0$ such that, if $0 < |s| < t < \delta$, and if $s$ is such that $\varphi(h(y), s) = h(\psi(y, t))$, then $s > 0$.
Version: 5 Owner: Koro Author(s): Koro
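A classical instance of topological conjugation, added here as an illustration (it is not part of the entry), is the pair formed by the logistic map $f(x) = 4x(1-x)$ and the tent map $g$ on $[0, 1]$, with $h(y) = \sin^2(\pi y / 2)$, an increasing homeomorphism of $[0, 1]$. The sketch below checks the relation $fh = hg$ on a grid of sample points.

```python
import math


def logistic(x):
    """f(x) = 4x(1 - x) on [0, 1]."""
    return 4 * x * (1 - x)


def tent(y):
    """g(y) = 2y for y <= 1/2 and 2 - 2y otherwise, on [0, 1]."""
    return 2 * y if y <= 0.5 else 2 - 2 * y


def h(y):
    """Conjugating homeomorphism h(y) = sin^2(pi y / 2) of [0, 1]."""
    return math.sin(math.pi * y / 2) ** 2


def max_conjugacy_defect(samples=1001):
    """Largest value of |f(h(y)) - h(g(y))| over a uniform grid in [0, 1]."""
    grid = (i / (samples - 1) for i in range(samples))
    return max(abs(logistic(h(y)) - h(tent(y))) for y in grid)


print(max_conjugacy_defect())
```

Both sides equal $\sin^2(\pi y)$ by the double-angle formula, so the defect is zero up to rounding.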
461.6
topologically transitive
A continuous surjection $f$ on a topological space $X$ to itself is topologically transitive if for every pair of open sets $U$ and $V$ in $X$ there is an integer $n > 0$ such that $f^n(U) \cap V \neq \emptyset$, where $f^n$ denotes the $n$-th iterate of $f$.
If for every pair of open sets $U$ and $V$ there is an integer $N$ such that $f^n(U) \cap V \neq \emptyset$ for each $n > N$, we say that $f$ is topologically mixing.
If X is a compact metric space, then f is topologically transitive if and only if there exists a point x ∈ X with a dense orbit, i.e. such that O(x, f ) = {f n (x) : n ∈ N} is dense in X. Version: 2 Owner: Koro Author(s): Koro
461.7
uniform expansivity
Let $(X, d)$ be a compact metric space and let $f : X \to X$ be an expansive homeomorphism.
Theorem (uniform expansivity). For every $\epsilon > 0$ and $\delta > 0$ there is $N > 0$ such that for each pair $x, y$ of points of $X$ such that $d(x, y) > \epsilon$ there is $n \in \mathbb{Z}$ with $|n| \le N$ such that $d(f^n(x), f^n(y)) > c - \delta$, where $c$ is the expansivity constant of $f$.
Proof. Let $K = \{(x, y) \in X \times X : d(x, y) \ge \epsilon/2\}$. Then $K$ is closed, and hence compact. For each pair $(x, y) \in K$, there is $n_{(x,y)} \in \mathbb{Z}$ such that $d(f^{n_{(x,y)}}(x), f^{n_{(x,y)}}(y)) \ge c$. Since the mapping $F : X \times X \to X \times X$ defined by $F(x, y) = (f(x), f(y))$ is continuous, $F^{n_{(x,y)}}$ is also continuous and there is a neighborhood $U_{(x,y)}$ of each $(x, y) \in K$ such that $d(f^{n_{(x,y)}}(u), f^{n_{(x,y)}}(v)) > c - \delta$ for each $(u, v) \in U_{(x,y)}$. Since $K$ is compact and $\{U_{(x,y)} : (x, y) \in K\}$ is an open cover of $K$, there is a finite subcover $\{U_{(x_i,y_i)} : 1 \le i \le m\}$. Let $N = \max\{|n_{(x_i,y_i)}| : 1 \le i \le m\}$. If $d(x, y) > \epsilon$, then $(x, y) \in K$, so that $(x, y) \in U_{(x_i,y_i)}$ for some $i \in \{1, \ldots, m\}$. Thus for $n = n_{(x_i,y_i)}$ we have $d(f^n(x), f^n(y)) > c - \delta$ and $|n| \le N$ as required. Version: 2 Owner: Koro Author(s): Koro
Chapter 462 37C10 – Vector ﬁelds, ﬂows, ordinary diﬀerential equations
462.1 ﬂow
A flow on a set $X$ is a group action of $(\mathbb{R}, +)$ on $X$. More explicitly, a flow is a function $\varphi : X \times \mathbb{R} \to X$ satisfying the following properties:
1. $\varphi(x, 0) = x$
2. $\varphi(\varphi(x, t), s) = \varphi(x, s + t)$
for all $s, t$ in $\mathbb{R}$ and $x \in X$. The set $O(x, \varphi) = \{\varphi(x, t) : t \in \mathbb{R}\}$ is called the orbit of $x$ by $\varphi$.
Flows are usually required to be continuous or even differentiable, when the space $X$ has some additional structure (e.g. when $X$ is a topological space or when $X = \mathbb{R}^n$).
The most common examples of flows arise from describing the solutions of the autonomous ordinary differential equation
$$y' = f(y), \qquad y(0) = x \qquad (462.1.1)$$
as a function of the initial condition $x$, when the equation has existence and uniqueness of solutions. That is, if (462.1.1) has a unique solution $\psi_x : \mathbb{R} \to X$ for each $x \in X$, then $\varphi(x, t) = \psi_x(t)$ defines a flow. Version: 3 Owner: Koro Author(s): Koro
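The two defining properties can be checked on the simplest example, the flow of the linear equation $y' = ay$ on $X = \mathbb{R}$, whose solution with $y(0) = x$ is $x e^{at}$. The choice of equation and the default value of `a` are assumptions made for this sketch.

```python
import math


def flow(x, t, a=-0.5):
    """Flow of y' = a*y: the solution at time t starting from y(0) = x."""
    return x * math.exp(a * t)


x, s, t = 2.0, 1.1, 0.3

# Property 1: time zero is the identity.
print(flow(x, 0.0) == x)

# Property 2: flowing for time t and then time s equals flowing for s + t.
print(abs(flow(flow(x, t), s) - flow(x, s + t)) < 1e-12)
```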
462.2
globally attracting ﬁxed point
An attracting ﬁxed point is considered globally attracting if its stable manifold is the entire space. Equivalently, the ﬁxed point x∗ is globally attracting if for all x, x(t) → x∗ as t → ∞. Version: 4 Owner: mathcam Author(s): mathcam, yark, armbrusterb
Chapter 463 37C20 – Generic properties, structural stability
463.1 Kupka-Smale theorem
Let $M$ be a compact smooth manifold. For every $k \in \mathbb{N}$, the set of Kupka-Smale diffeomorphisms is residual in $\mathrm{Diff}^k(M)$ (the space of all $C^k$ diffeomorphisms from $M$ to itself endowed with the uniform or strong $C^k$ topology, also known as the Whitney $C^k$ topology). Version: 2 Owner: Koro Author(s): Koro
463.2
Pugh’s general density theorem
Let $M$ be a compact smooth manifold. There is a residual subset of $\mathrm{Diff}^1(M)$ in which every element $f$ satisfies $\overline{\mathrm{Per}(f)} = \Omega(f)$. In other words: generically, the set of periodic points of a $C^1$ diffeomorphism is dense in its nonwandering set. Here, $\mathrm{Diff}^1(M)$ denotes the set of all $C^1$ diffeomorphisms from $M$ to itself, endowed with the (strong) $C^1$ topology.
REFERENCES
1. Pugh, C., An improved closing lemma and a general density theorem, Amer. J. Math. 89 (1967).
Version: 5 Owner: Koro Author(s): Koro
463.3
structural stability
Given a metric space $(X, d)$ and a homeomorphism $f : X \to X$, we say that $f$ is structurally stable if there is a neighborhood $V$ of $f$ in $\mathrm{Homeo}(X)$ (the space of all homeomorphisms mapping $X$ to itself endowed with the compact-open topology) such that every element of $V$ is topologically conjugate to $f$.
If $M$ is a compact smooth manifold, a $C^k$ diffeomorphism $f$ is said to be $C^k$ structurally stable if there is a neighborhood of $f$ in $\mathrm{Diff}^k(M)$ (the space of all $C^k$ diffeomorphisms from $M$ to itself endowed with the strong $C^k$ topology) in which every element is topologically conjugate to $f$.
If $X$ is a vector field in the smooth manifold $M$, we say that $X$ is $C^k$-structurally stable if there is a neighborhood of $X$ in $\mathcal{X}^k(M)$ (the space of all $C^k$ vector fields on $M$ endowed with the strong $C^k$ topology) in which every element is topologically equivalent to $X$, i.e. such that every other field $Y$ in that neighborhood generates a flow on $M$ that is topologically equivalent to the flow generated by $X$.
Remark. The concept of structural stability may be generalized to other spaces of functions with other topologies; the general idea is that a function or flow is structurally stable if any other function or flow close enough to it has similar dynamics (from the topological viewpoint), which essentially means that the dynamics will not change under small perturbations. Version: 5 Owner: Koro Author(s): Koro
Chapter 464 37C25 – Fixed points, periodic points, fixed-point index theory
464.1 hyperbolic ﬁxed point
Let M be a smooth manifold. A ﬁxed point x of a diﬀeomorphism f : M → M is said to be a hyperbolic ﬁxed point if Df (x) is a linear hyperbolic isomorphism. If x is a periodic point of least period n, it is called a hyperbolic periodic point if it is a hyperbolic ﬁxed point of f n (the nth iterate of f ). If the dimension of the stable manifold of a ﬁxed point is zero, the point is called a source; if the dimension of its unstable manifold is zero, it is called a sink; and if both the stable and unstable manifold have nonzero dimension, it is called a saddle. Version: 3 Owner: Koro Author(s): Koro
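For a planar diffeomorphism the classification can be read off from the eigenvalues of $Df(x)$. The sketch below is an illustration under stated assumptions: the derivative is a 2x2 matrix given as nested lists, and the dimension of the stable manifold of a hyperbolic fixed point is taken to be the number of eigenvalues of modulus less than 1 (which is what the stable manifold theorem gives). The function name and tolerance are illustrative.

```python
import cmath


def classify_fixed_point(jacobian, tol=1e-12):
    """Classify a fixed point of a planar map from the 2x2 derivative Df(x)."""
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4 * det)
    moduli = [abs((trace + disc) / 2), abs((trace - disc) / 2)]
    # Hyperbolic means no eigenvalue lies on the unit circle.
    if any(abs(m - 1) <= tol for m in moduli):
        return "not hyperbolic"
    stable_dim = sum(1 for m in moduli if m < 1)
    return {0: "source", 1: "saddle", 2: "sink"}[stable_dim]


print(classify_fixed_point([[0.5, 0.0], [0.0, 0.25]]))
print(classify_fixed_point([[2.0, 0.0], [0.0, 3.0]]))
print(classify_fixed_point([[2.0, 0.0], [0.0, 0.5]]))
print(classify_fixed_point([[0.0, -1.0], [1.0, 0.0]]))
```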
Chapter 465 37C29 – Homoclinic and heteroclinic orbits
465.1 heteroclinic
Let $f$ be a homeomorphism mapping a topological space $X$ to itself or a flow on $X$. A heteroclinic point, or heteroclinic intersection, is a point that belongs to the intersection of the stable set of $x$ with the unstable set of $y$, where $x$ and $y$ are two different fixed or periodic points of $f$, i.e. a point that belongs to $W^s(f, x) \cap W^u(f, y)$. Version: 1 Owner: Koro Author(s): Koro
465.2
homoclinic
If $X$ is a topological space and $f$ is a flow on $X$ or a homeomorphism mapping $X$ to itself, we say that $x \in X$ is a homoclinic point (or homoclinic intersection) if it belongs to both the stable and unstable sets of some fixed or periodic point $p$; i.e. $x \in W^s(f, p) \cap W^u(f, p)$.
The orbit of a homoclinic point is called a homoclinic orbit. Version: 2 Owner: Koro Author(s): Koro
Chapter 466 37C75 – Stability theory
466.1 attracting ﬁxed point
A fixed point is considered attracting if its stable manifold contains a neighborhood of the point. Equivalently, the fixed point $x^*$ is attracting if there exists a $\delta > 0$ such that for all $x$, $d(x, x^*) < \delta$ implies $x(t) \to x^*$ as $t \to \infty$. The stability of a fixed point can also be classified as stable, unstable, neutrally stable, and Lyapunov stable. Version: 2 Owner: alinabi Author(s): alinabi, armbrusterb
466.2
stable manifold
Let $X$ be a topological space, and $f : X \to X$ a homeomorphism. If $p$ is a fixed point for $f$, the stable and unstable sets of $p$ are defined by
$$W^s(f, p) = \{q \in X : f^n(q) \to p \text{ as } n \to \infty\},$$
$$W^u(f, p) = \{q \in X : f^{-n}(q) \to p \text{ as } n \to \infty\},$$
respectively.
If $p$ is a periodic point of least period $k$, then it is a fixed point of $f^k$, and the stable and unstable sets of $p$ are
$$W^s(f, p) = W^s(f^k, p), \qquad W^u(f, p) = W^u(f^k, p).$$
Given a neighborhood $U$ of $p$, the local stable and unstable sets of $p$ are defined by
$$W^s_{loc}(f, p, U) = \{q \in U : f^n(q) \in U \text{ for each } n \ge 0\},$$
$$W^u_{loc}(f, p, U) = W^s_{loc}(f^{-1}, p, U).$$
If $X$ is metrizable, we can define the stable and unstable sets for any point by
$$W^s(f, p) = \{q \in X : d(f^n(q), f^n(p)) \to 0 \text{ as } n \to \infty\},$$
$$W^u(f, p) = W^s(f^{-1}, p),$$
where $d$ is a metric for $X$. This definition clearly coincides with the previous one when $p$ is a periodic point.
Suppose now that $X$ is a compact smooth manifold, and $f$ is a $C^k$ diffeomorphism, $k \ge 1$. If $p$ is a hyperbolic periodic point, the stable manifold theorem ensures that for some neighborhood $U$ of $p$, the local stable and unstable sets are $C^k$ embedded disks, whose tangent spaces at $p$ are $E^s$ and $E^u$ (the stable and unstable spaces of $Df(p)$), respectively; moreover, they vary continuously (in a certain sense) in a neighborhood of $f$ in the $C^k$ topology of $\mathrm{Diff}^k(X)$ (the space of all $C^k$ diffeomorphisms from $X$ to itself). Finally, the stable and unstable sets are $C^k$ injectively immersed disks. This is why they are commonly called stable and unstable manifolds. This result is also valid for nonperiodic points, as long as they lie in some hyperbolic set (stable manifold theorem for hyperbolic sets). Version: 7 Owner: Koro Author(s): Koro
Chapter 467 37C80 – Symmetries, equivariant dynamical systems
467.1 Γ-equivariant
Let $\Gamma$ be a compact Lie group acting linearly on $V$ and let $g$ be a mapping $g : V \to V$. Then $g$ is Γ-equivariant if $g(\gamma v) = \gamma g(v)$ for all $\gamma \in \Gamma$ and all $v \in V$. In other words, $g$ is Γ-equivariant exactly when it commutes with the action of $\Gamma$. [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 2 Owner: Daume Author(s): Daume
Chapter 468 37D05 – Hyperbolic orbits and sets
468.1 hyperbolic isomorphism
Let $X$ be a Banach space and $T : X \to X$ a continuous linear isomorphism. We say that $T$ is a hyperbolic isomorphism if its spectrum is disjoint from the unit circle, i.e. $\sigma(T) \cap \{z \in \mathbb{C} : |z| = 1\} = \emptyset$.
If this is the case, then there is a splitting of $X$ into two invariant subspaces, $X = E^s \oplus E^u$ (and therefore, a corresponding splitting of $T$ into two operators $T^s : E^s \to E^s$ and $T^u : E^u \to E^u$, i.e. $T = T^s \oplus T^u$), and an equivalent norm $\|\cdot\|_1$ in $X$ such that $T^s$ is a contraction and $T^u$ is an expansion with respect to this norm. That is, there are constants $\lambda_s$ and $\lambda_u$, with $0 < \lambda_s, \lambda_u < 1$, such that for all $x_s \in E^s$ and $x_u \in E^u$,
$$\|T^s x_s\|_1 < \lambda_s \|x_s\|_1 \quad \text{and} \quad \|(T^u)^{-1} x_u\|_1 < \lambda_u \|x_u\|_1.$$
Version: 4 Owner: Koro Author(s): Koro
Chapter 469 37D20 – Uniformly hyperbolic systems (expanding, Anosov, Axiom A, etc.)
469.1 Anosov diﬀeomorphism
If M is a compact smooth manifold, a diffeomorphism f : M → M (or a flow φ : R × M → M) such that the whole space M is a hyperbolic set for f (or φ) is called an Anosov diffeomorphism (or flow). Anosov diffeomorphisms were introduced by D.V. Anosov, who proved that they are $C^1$ structurally stable and form an open subset of $C^1(M, M)$ with the $C^1$ topology. Not every manifold admits an Anosov diffeomorphism; for example, there are no such diffeomorphisms on the sphere $S^n$. The simplest examples of compact manifolds admitting them are the tori $T^n$: they admit the so-called linear Anosov diffeomorphisms, which are isomorphisms of $T^n$ having no eigenvalue of modulus 1. It was proved that any other Anosov diffeomorphism of $T^n$ is topologically conjugate to one of this kind. The problem of classifying manifolds that admit Anosov diffeomorphisms has proved to be very difficult and still has no answer. The only known examples of these are the tori and infranilmanifolds, and it is conjectured that they are the only ones. Another famous problem that still remains open is to determine whether or not the nonwandering set of an Anosov diffeomorphism must be the whole manifold M. This is known to be true for linear Anosov diffeomorphisms (and hence for any Anosov diffeomorphism of a torus).
Version: 1 Owner: Koro Author(s): Koro
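The condition "no eigenvalue of modulus 1" for a linear Anosov diffeomorphism of the torus can be checked numerically. The following is only a minimal sketch for the 2-torus; the function name and the choice of the matrix ((2, 1), (1, 1)) (often called Arnold's cat map) are illustrative assumptions, not part of the entry above.

```python
import math

def is_hyperbolic_toral_automorphism(a, b, c, d, tol=1e-12):
    """Check whether the integer matrix [[a, b], [c, d]] induces a
    hyperbolic automorphism of the 2-torus (hypothetical helper).

    Requirements: determinant +1 or -1 (so the map is invertible on the
    torus) and no eigenvalue on the unit circle.
    """
    det = a * d - b * c
    if abs(det) != 1:
        return False
    trace = a + d
    disc = trace * trace - 4 * det
    if disc < 0:
        # Complex conjugate pair with product det = 1: both have modulus 1.
        return False
    root = math.sqrt(disc)
    eigenvalues = ((trace - root) / 2, (trace + root) / 2)
    return all(abs(abs(ev) - 1.0) > tol for ev in eigenvalues)

if __name__ == "__main__":
    print(is_hyperbolic_toral_automorphism(2, 1, 1, 1))   # cat map
    print(is_hyperbolic_toral_automorphism(1, 0, 0, 1))   # identity
    print(is_hyperbolic_toral_automorphism(0, -1, 1, 0))  # rotation by 90 degrees
```

The identity and the rotation fail the test because all their eigenvalues lie on the unit circle.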
469.2
Axiom A
Let M be a smooth manifold. We say that a diffeomorphism f : M → M satisfies (Smale's) Axiom A (or that f is an Axiom A diffeomorphism) if
1. the nonwandering set Ω(f) has a hyperbolic structure;
2. the set of periodic points of f is dense in Ω(f): $\overline{\mathrm{Per}(f)} = \Omega(f)$.
Version: 3 Owner: Koro Author(s): Koro
469.3
hyperbolic set
Let M be a compact smooth manifold, and let f : M → M be a diffeomorphism. An f-invariant subset Λ of M is said to be hyperbolic (or to have a hyperbolic structure) if there is a splitting of the tangent bundle of M restricted to Λ into a (Whitney) sum of two Df-invariant subbundles, $E^s$ and $E^u$, such that the restriction $Df|_{E^s}$ is a contraction and $Df|_{E^u}$ is an expansion. This means that there are constants 0 < λ < 1 and c > 0 such that
1. $T_\Lambda M = E^s \oplus E^u$;
2. $Df(x)E^s_x = E^s_{f(x)}$ and $Df(x)E^u_x = E^u_{f(x)}$ for each x ∈ Λ;
3. $\|Df^n v\| < c\lambda^n \|v\|$ for each $v \in E^s$ and n > 0;
4. $\|Df^{-n} v\| < c\lambda^n \|v\|$ for each $v \in E^u$ and n > 0,
using some Riemannian metric on M. If Λ is hyperbolic, then there exists an adapted Riemannian metric, i.e. one such that c = 1.
Version: 1 Owner: Koro Author(s): Koro
Chapter 470 37D99 – Miscellaneous
470.1 Kupka-Smale
A diffeomorphism f mapping a smooth manifold M to itself is called a Kupka-Smale diffeomorphism if
1. every periodic point of f is hyperbolic;
2. for each pair of periodic points p, q of f, the intersection between the stable manifold of p and the unstable manifold of q is transversal.
Version: 1 Owner: Koro Author(s): Koro
Chapter 471 37E05 – Maps of the interval (piecewise continuous, continuous, smooth)
471.1 Sharkovskii’s theorem
Every natural number can be written as $2^r p$, where p is odd, and r is the maximum exponent such that $2^r$ divides the given number. We define the Sharkovskii ordering of the natural numbers in this way: given two odd numbers p and q, and two nonnegative integers r and s, then $2^r p \succ 2^s q$ if
1. p > 1 and q = 1;
2. p, q > 1 and r < s;
3. p, q > 1, r = s and p < q;
4. p = q = 1 and r > s.
This defines a linear ordering of N, in which we first have 3, 5, 7, . . . , followed by 2·3, 2·5, . . . , followed by $2^2 \cdot 3$, $2^2 \cdot 5$, . . . , and so on, and finally the powers of two in decreasing order, . . . , $2^{n+1}$, $2^n$, . . . , 2, 1. So it looks like this:
$$3 \succ 5 \succ \cdots \succ 3\cdot 2 \succ 5\cdot 2 \succ \cdots \succ 3\cdot 2^n \succ 5\cdot 2^n \succ \cdots \succ 2^2 \succ 2 \succ 1.$$
Sharkovskii's theorem. Let I ⊂ R be an interval, and let f : I → R be a continuous function. If f has a periodic point of least period n, then f has a periodic point of least period k, for each k such that $n \succ k$.
Version: 3 Owner: Koro Author(s): Koro
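The ordering can be made concrete with a sort key. The sketch below is illustrative only (the function names are hypothetical): odd multiples are ranked first by the exponent of 2 and then by the odd part, and pure powers of two come last, in decreasing order.

```python
def split_power_of_two(n):
    """Write n = 2**r * p with p odd and return (r, p)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    r = 0
    while n % 2 == 0:
        n //= 2
        r += 1
    return r, n

def sharkovskii_key(n):
    """Sort key: smaller key means earlier in the Sharkovskii ordering."""
    r, p = split_power_of_two(n)
    if p > 1:
        return (0, r, p)   # odd part > 1: order by exponent, then odd part
    return (1, -r, 0)      # pure powers of two: largest exponent first

def precedes(n, k):
    """True if n comes strictly before k in the Sharkovskii ordering."""
    return sharkovskii_key(n) < sharkovskii_key(k)

if __name__ == "__main__":
    print(sorted(range(1, 13), key=sharkovskii_key))
```

With this key, `precedes(3, k)` holds for every k other than 3, which reflects the familiar consequence that a periodic point of least period 3 forces periodic points of every period.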
Chapter 472 37G15 – Bifurcations of limit cycles and periodic orbits
472.1 Feigenbaum constant
The Feigenbaum delta constant has the value δ = 4.669201609102990671853203820466 . . . It governs the structure and behavior of many types of dynamical systems. It was discovered in the 1970s by Mitchell Feigenbaum, while studying the logistic map y = µ · y(1 − y), which produces the Feigenbaum tree:
[Figure: the Feigenbaum tree. Generated by GNU Octave and GNUPlot.]
If the bifurcations in this tree (first few shown as dotted blue lines) are at points $b_1, b_2, b_3, \ldots$, then
$$\lim_{n\to\infty} \frac{b_n - b_{n-1}}{b_{n+1} - b_n} = \delta .$$
That is, the ratio of the intervals between the bifurcation points approaches Feigenbaum's constant.
However, this is only the beginning. Feigenbaum discovered that this constant arose in any dynamical system that approaches chaotic behavior via period-doubling bifurcation, and has a single quadratic maximum. So in some sense, Feigenbaum's constant is a universal constant of chaos theory. Feigenbaum's constant appears in problems of fluid-flow turbulence, electronic oscillators, chemical reactions, and even the Mandelbrot set (the "budding" of the Mandelbrot set along the negative real axis occurs at intervals determined by Feigenbaum's constant).
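The ratio of successive gaps between bifurcation points can be illustrated numerically. The sketch below assumes commonly quoted approximate values for the first five period-doubling points of the logistic map; these numbers are not taken from the entry above (only the second one, 1 + √6, is exact), so the output should be read as a rough illustration rather than a computation of δ.

```python
import math

# Assumed approximate period-doubling points b1..b5 of the logistic map
# (rounded literature values; b2 = 1 + sqrt(6) is exact).
BIFURCATION_POINTS = [3.0, 1.0 + math.sqrt(6.0), 3.54409, 3.56441, 3.56876]

def successive_ratios(points):
    """Return (b_n - b_{n-1}) / (b_{n+1} - b_n) for consecutive triples."""
    gaps = [q - p for p, q in zip(points, points[1:])]
    return [g0 / g1 for g0, g1 in zip(gaps, gaps[1:])]

if __name__ == "__main__":
    for ratio in successive_ratios(BIFURCATION_POINTS):
        print(round(ratio, 3))
```

Even with only five rounded points, the ratios already stay within about 0.1 of δ; more accurate bifurcation values would be needed to see further digits.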
References.
• "What is Feigenbaum's constant?": http://fractals.iuta.ubordeaux.fr/scifaq/feigenbaum.html
• "Bifurcations": http://mcasco.com/bifurcat.html
• "Feigenbaum's Constant": http://home.imf.au.dk/abuch/feigenbaum.html
• "Bifurcation": http://spanky.triumf.ca/www/fractint/bif type.html
• "Feigenbaum's Universal Constant": http://www.stud.ntnu.no/ berland/math/feigenbaum/feigconst
Version: 2 Owner: akrowne Author(s): akrowne
472.2
Feigenbaum fractal
A Feigenbaum fractal is any bifurcation fractal produced by a period-doubling cascade. The "canonical" Feigenbaum fractal is produced by the logistic map (a simple population model), y = µ · y(1 − y), where µ is varied smoothly along one dimension. The logistic iteration either terminates in a cycle (set of repeating values) or behaves chaotically. If one plots the points of this cycle versus the µ-value, a graph like the following is produced:
[Figure: bifurcation diagram of the logistic map.]
Note the distinct bifurcation (branching) points and the chaotic behavior as µ increases. Many other iterations will generate this same type of plot, for example the iteration p = r · sin(π · p). One of the most amazing things about this class of fractals is that the bifurcation intervals are always described by Feigenbaum's constant. Octave/Matlab code to generate the above image is available here.
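The cycle that the logistic iteration settles into can be found by discarding a transient and collecting the distinct values that follow. This is a minimal Python sketch, not the Octave/Matlab code mentioned above; the function name, the starting value 0.5, the transient length and the rounding precision are all illustrative assumptions.

```python
def logistic_attractor(mu, transient=2000, samples=64, digits=6):
    """Return the sorted distinct values visited by y -> mu*y*(1-y)
    after a transient, rounded to `digits` decimal places."""
    y = 0.5
    for _ in range(transient):      # let the orbit settle
        y = mu * y * (1.0 - y)
    seen = set()
    for _ in range(samples):        # record the long-term behavior
        y = mu * y * (1.0 - y)
        seen.add(round(y, digits))
    return sorted(seen)

if __name__ == "__main__":
    for mu in (2.5, 3.2, 3.5):
        print(mu, logistic_attractor(mu))
```

Plotting the returned values against µ for a fine grid of µ-values would reproduce a diagram of the kind described above: one branch for µ = 2.5, two for µ = 3.2, four for µ = 3.5.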
References.
• “Quadratic Iteration, bifurcation, and chaos”: http://mathforum.org/advanced/robertd/bifurcation.h • “Bifurcation”: http://spanky.triumf.ca/www/fractint/bif type.html • “Feigenbaum’s Constant”: http://fractals.iuta.ubordeaux.fr/scifaq/feigenbaum.html Version: 3 Owner: akrowne Author(s): akrowne
472.3
equivariant Hopf theorem
Consider the system of ordinary differential equations
$$\dot{x} + f(x, \lambda) = 0$$
where $f : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ is smooth and commutes with a compact Lie group Γ (i.e. f is Γ-equivariant). In addition we assume that $\mathbb{R}^n$ is Γ-simple and we choose a basis of coordinates such that
$$(df)_{0,0} = \begin{pmatrix} 0 & -I_m \\ I_m & 0 \end{pmatrix}$$
where m = n/2. Furthermore let the eigenvalues of $(df)_{0,\lambda}$ be σ(λ) ± iρ(λ), with σ(0) = 0. Suppose that dim Fix(Σ) = 2, where Σ is an isotropy subgroup $\Sigma \subset \Gamma \times S^1$ acting on $\mathbb{R}^n$. Then there exists a unique branch of small-amplitude periodic solutions to $\dot{x} + f(x, \lambda) = 0$ with period near 2π, having Σ as their group of symmetries. [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 1 Owner: Daume Author(s): Daume
Chapter 473 37G40 – Symmetries, equivariant bifurcation theory
473.1 Poénaru (1976) theorem
Let Γ be a compact Lie group and let $g_1, \ldots, g_r$ generate the module $\vec{\mathcal{P}}(\Gamma)$ (the space of Γ-equivariant polynomial mappings) over the ring $\mathcal{P}(\Gamma)$ (the ring of Γ-invariant polynomials). Then $g_1, \ldots, g_r$ generate the module $\vec{\mathcal{E}}(\Gamma)$ (the space of Γ-equivariant germs at the origin of $C^\infty$ mappings) over the ring $\mathcal{E}(\Gamma)$ (the ring of Γ-invariant germs). [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
[PV] Poénaru, V.: Singularités $C^\infty$ en Présence de Symétrie. Lecture Notes in Mathematics 510, Springer-Verlag, Berlin, 1976.
Version: 1 Owner: Daume Author(s): Daume
473.2
bifurcation problem with symmetry group
Let Γ be a Lie group acting on a vector space V and consider the system of ordinary differential equations
$$\dot{x} + g(x, \lambda) = 0$$
where $g : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}^n$ is smooth. Then g is called a bifurcation problem with symmetry group Γ if $g \in \vec{\mathcal{E}}_{x,\lambda}(\Gamma)$ (where $\vec{\mathcal{E}}(\Gamma)$ is the space of Γ-equivariant germs, at the origin, of $C^\infty$ mappings of V into V) satisfying g(0, 0) = 0 and $(dg)_{0,0} = 0$, where $(dg)_{0,0}$ denotes the Jacobian matrix evaluated at (0, 0). [GSS]
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 1 Owner: Daume Author(s): Daume
473.3
trace formula
Let Γ be a compact Lie group acting on V and let Σ ⊂ Γ be a Lie subgroup. Then
$$\dim \mathrm{Fix}(\Sigma) = \int_\Sigma \mathrm{trace}(\sigma)$$
where ∫ denotes the normalized Haar integral on Σ and Fix(Σ) is the fixed-point subspace of Σ.
REFERENCES
[GSS] Golubitsky, Martin; Stewart, Ian; Schaeffer, David G.: Singularities and Groups in Bifurcation Theory (Volume II). Springer-Verlag, New York, 1988.
Version: 2 Owner: Daume Author(s): Daume
Chapter 474 37G99 – Miscellaneous
474.1 chaotic dynamical system
As Strogatz says in reference [1], "No definition of the term chaos is universally accepted yet, but almost everyone would agree on the three ingredients used in the following working definition". Chaos is aperiodic long-term behavior in a deterministic system that exhibits sensitive dependence on initial conditions. Aperiodic long-term behavior means that there are trajectories which do not settle down to fixed points, periodic orbits, or quasiperiodic orbits as t → ∞. For the purposes of this definition, a trajectory which approaches a limit of ∞ as t → ∞ should be considered to have a fixed point at ∞. Sensitive dependence on initial conditions means that nearby trajectories separate exponentially fast, i.e., the system has a positive Liapunov exponent. Strogatz notes that he favors additional constraints on the aperiodic long-term behavior, but leaves open what form they may take. He suggests two alternatives to fulfill this:
1. Requiring that there exists an open set of initial conditions having aperiodic trajectories, or
2. If one picks a random initial condition x(0) then there must be a nonzero chance of the associated trajectory x(t) being aperiodic.
474.1.1
References
1. Steven H. Strogatz, "Nonlinear Dynamics and Chaos". Westview Press, 1994.
Version: 2 Owner: bshanks Author(s): bshanks
Chapter 475 37H20 – Bifurcation theory
475.1 bifurcation
Bifurcation refers to the splitting of attractors, as in dynamical systems. For example, the branching of the Feigenbaum tree is an instance of bifurcation. A cascade of bifurcations is a precursor to chaos.
REFERENCES
1. “Bifurcations”, http://mcasco.com/bifurcat.html 2. “Bifurcation”, http://spanky.triumf.ca/www/fractint/bif type.html 3. “Quadratic Iteration, bifurcation, and chaos”, http://mathforum.org/advanced/robertd/bifurcation.html
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 476 39B05 – General
476.1 functional equation
A functional equation is an equation whose unknowns are functions. f(x + y) = f(x) + f(y) and f(x · y) = f(x) · f(y) are examples of such equations. The systematic study of these did not begin until the 1960s, although various mathematicians had studied them earlier, including Euler and Cauchy, to mention just a few. Functional equations appear in many places; for example, the gamma function and Riemann's zeta function both satisfy functional equations.
Version: 4 Owner: jgade Author(s): jgade
Chapter 477 39B62 – Functional inequalities, including subadditivity, convexity, etc.
477.1 Jensen’s inequality
If f is a convex function on the interval [a, b], then
$$f\left(\sum_{k=1}^{n} \lambda_k x_k\right) \le \sum_{k=1}^{n} \lambda_k f(x_k)$$
where $0 \le \lambda_k \le 1$, $\lambda_1 + \lambda_2 + \cdots + \lambda_n = 1$ and each $x_k \in [a, b]$. If f is a concave function, the inequality is reversed.
Example: $f(x) = x^2$ is a convex function on [0, 10]. Then
$$(0.2 \cdot 4 + 0.5 \cdot 3 + 0.3 \cdot 7)^2 \le 0.2(4^2) + 0.5(3^2) + 0.3(7^2).$$
A very special case of this inequality is when $\lambda_k = \frac{1}{n}$, because then
$$f\left(\frac{1}{n}\sum_{k=1}^{n} x_k\right) \le \frac{1}{n}\sum_{k=1}^{n} f(x_k),$$
that is, the value of the function at the mean of the $x_k$ is less than or equal to the mean of the values of the function at each $x_k$.
There is another formulation of Jensen's inequality used in probability: Let X be some random variable, and let f(x) be a convex function (defined at least on a segment containing the range of X). Then the expected value of f(X) is at least the value of f at the mean of X: E f(X) ≥ f(E X). With this approach, the weights of the first form can be seen as probabilities.
Version: 2 Owner: drini Author(s): drini
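The numerical example for f(x) = x² in the entry on Jensen's inequality can be checked directly. The helper below is a small sketch with a hypothetical name; it returns both sides of the weighted inequality so that they can be compared.

```python
def jensen_sides(f, weights, points):
    """Return (f(sum w_k x_k), sum w_k f(x_k)) for nonnegative weights
    that sum to 1 (up to floating-point rounding)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be nonnegative and sum to 1")
    mean_point = sum(w * x for w, x in zip(weights, points))
    mean_value = sum(w * f(x) for w, x in zip(weights, points))
    return f(mean_point), mean_value

if __name__ == "__main__":
    lhs, rhs = jensen_sides(lambda x: x * x, [0.2, 0.5, 0.3], [4, 3, 7])
    print(lhs, rhs, lhs <= rhs)
```

For the weights 0.2, 0.5, 0.3 and points 4, 3, 7 the left side is 4.4² = 19.36 and the right side is 3.2 + 4.5 + 14.7 = 22.4, in agreement with the inequality for a convex f.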
477.2
proof of Jensen’s inequality
We prove an equivalent, more convenient formulation: Let X be some random variable, and let f(x) be a convex function (defined at least on a segment containing the range of X). Then the expected value of f(X) is at least the value of f at the mean of X: E f(X) ≥ f(E X). Indeed, let c = E X. Since f(x) is convex, there exists a supporting line for f(x) at c:
$$\varphi(x) = \alpha(x - c) + f(c)$$
for some α, and ϕ(x) ≤ f(x). Then
$$E f(X) \ge E \varphi(X) = E\,\alpha(X - c) + f(c) = f(c),$$
as claimed.
Version: 2 Owner: ariels Author(s): ariels
477.3
proof of arithmetic-geometric-harmonic means inequality
We can use the Jensen inequality for an easy proof of the arithmetic-geometric-harmonic means inequality. Let $x_1, \ldots, x_n > 0$; we shall first prove that
$$\sqrt[n]{x_1 \cdot \ldots \cdot x_n} \le \frac{x_1 + \ldots + x_n}{n}.$$
Note that log is a concave function. Applying it to the arithmetic mean of $x_1, \ldots, x_n$ and using Jensen's inequality, we see that
$$\log\left(\frac{x_1 + \ldots + x_n}{n}\right) \ge \frac{\log(x_1) + \ldots + \log(x_n)}{n} = \frac{\log(x_1 \cdot \ldots \cdot x_n)}{n} = \log \sqrt[n]{x_1 \cdot \ldots \cdot x_n}.$$
Since log is also a monotone function, it follows that the arithmetic mean is at least as large as the geometric mean. The proof that the geometric mean is at least as large as the harmonic mean is the usual one (see "proof of arithmetic-geometric-harmonic means inequality").
Version: 4 Owner: mathcam Author(s): mathcam, ariels
477.4
subadditivity
A sequence $\{a_n\}_{n=1}^{\infty}$ is called subadditive if it satisfies the inequality
$$a_{n+m} \le a_n + a_m \quad \text{for all } n \text{ and } m. \qquad (477.4.1)$$
The major reason for use of subadditive sequences is the following lemma due to Fekete.
Lemma 10 ([1]). For every subadditive sequence $\{a_n\}_{n=1}^{\infty}$ the limit $\lim a_n/n$ exists and is equal to $\inf a_n/n$.
Similarly, a function f(x) is subadditive if
$$f(x + y) \le f(x) + f(y) \quad \text{for all } x \text{ and } y.$$
The analogue of Fekete's lemma holds for subadditive functions as well. There are extensions of Fekete's lemma that do not require (477.4.1) to hold for all m and n. There are also results that allow one to deduce the rate of convergence to the limit whose existence is stated in Fekete's lemma if some kind of both super- and subadditivity is present. A good exposition of this topic may be found in [2].
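Fekete's lemma can be observed on a concrete sequence. The sketch below uses the sequence a_n = ⌈n/3⌉, which is an illustrative choice and not taken from the entry; it is subadditive, and a_n/n has infimum 1/3, which is also its limit. The function names are hypothetical.

```python
from fractions import Fraction

def a(n):
    """Example sequence a_n = ceil(n / 3), computed with integers only."""
    return -(-n // 3)

def is_subadditive(seq, limit):
    """Check seq(n + m) <= seq(n) + seq(m) for 1 <= n, m <= limit."""
    return all(
        seq(n + m) <= seq(n) + seq(m)
        for n in range(1, limit + 1)
        for m in range(1, limit + 1)
    )

def ratios(seq, limit):
    """Exact values of seq(n) / n for n = 1, ..., limit."""
    return [Fraction(seq(n), n) for n in range(1, limit + 1)]

if __name__ == "__main__":
    print(is_subadditive(a, 60))
    print(min(ratios(a, 300)))
    print(float(Fraction(a(10**6), 10**6)))
```

The finite check is of course no proof of subadditivity; it only illustrates the statement of the lemma for this particular sequence.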
REFERENCES
1. György Pólya and Gábor Szegő. Problems and theorems in analysis, volume 1. 1976. Zbl 0338.00001.
2. J. Michael Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.
Version: 6 Owner: bbukh Author(s): bbukh
477.5
superadditivity
A sequence $\{a_n\}_{n=1}^{\infty}$ is called superadditive if it satisfies the inequality
$$a_{n+m} \ge a_n + a_m \quad \text{for all } n \text{ and } m. \qquad (477.5.1)$$
The major reason for use of superadditive sequences is the following lemma due to Fekete.
Lemma 11 ([1]). For every superadditive sequence $\{a_n\}_{n=1}^{\infty}$ the limit $\lim a_n/n$ exists and is equal to $\sup a_n/n$.
Similarly, a function f(x) is superadditive if
$$f(x + y) \ge f(x) + f(y) \quad \text{for all } x \text{ and } y.$$
The analogue of Fekete's lemma holds for superadditive functions as well. There are extensions of Fekete's lemma that do not require (477.5.1) to hold for all m and n. There are also results that allow one to deduce the rate of convergence to the limit whose existence is stated in Fekete's lemma if some kind of both super- and subadditivity is present. A good exposition of this topic may be found in [2].
REFERENCES
1. György Pólya and Gábor Szegő. Problems and theorems in analysis, volume 1. 1976. Zbl 0338.00001.
2. J. Michael Steele. Probability theory and combinatorial optimization, volume 69 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, 1997. Zbl 0916.90233.
Version: 5 Owner: bbukh Author(s): bbukh
Chapter 478 4000 – General reference works (handbooks, dictionaries, bibliographies, etc.)
478.1 Cauchy product
Let $a_k$ and $b_k$ be two sequences of real or complex numbers for $k \in \mathbb{N}_0$ ($\mathbb{N}_0$ is the set of natural numbers containing zero). The Cauchy product is defined by:
$$(a \circ b)(k) = \sum_{l=0}^{k} a_l b_{k-l}. \qquad (478.1.1)$$
This is basically the convolution of two sequences. Therefore the product of two series $\sum_{k=0}^{\infty} a_k$, $\sum_{k=0}^{\infty} b_k$ is given by:
$$\sum_{k=0}^{\infty} c_k = \left(\sum_{k=0}^{\infty} a_k\right) \cdot \left(\sum_{k=0}^{\infty} b_k\right) = \sum_{k=0}^{\infty} \sum_{l=0}^{k} a_l b_{k-l}. \qquad (478.1.2)$$
A sufficient condition for the resulting series $\sum_{k=0}^{\infty} c_k$ to be absolutely convergent is that $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ both converge absolutely.
Version: 4 Owner: msihl Author(s): msihl
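The Cauchy product of two sequences is a finite convolution and is easy to compute term by term. The sketch below (the function name is hypothetical) works on finite prefixes of the two sequences and uses exact rational arithmetic; the geometric sequence $(1/2)^k$ is an illustrative choice.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Return c_k = sum_{l=0}^{k} a_l * b_{k-l} for every k for which
    both prefixes a and b supply all the needed terms."""
    n = min(len(a), len(b))
    return [sum(a[l] * b[k - l] for l in range(k + 1)) for k in range(n)]

if __name__ == "__main__":
    geometric = [Fraction(1, 2**k) for k in range(20)]
    c = cauchy_product(geometric, geometric)
    print(c[:4])
    print(float(sum(c)))
```

Here each series sums to 2, the product terms are $c_k = (k+1)/2^k$, and the partial sums of the product approach 2 · 2 = 4, consistent with (478.1.2).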
478.2
Cesàro mean
Definition Let $\{a_n\}_{n=0}^{\infty}$ be a sequence of real (or possibly complex) numbers. The Cesàro mean of the sequence $\{a_n\}$ is the sequence $\{b_n\}_{n=0}^{\infty}$ with
$$b_n = \frac{1}{n+1} \sum_{i=0}^{n} a_i. \qquad (478.2.1)$$
Properties
1. A key property of the Cesàro mean is that it has the same limit as the original sequence. In other words, if $\{a_n\}$ and $\{b_n\}$ are as above, and $a_n \to a$, then $b_n \to a$. In particular, if $\{a_n\}$ converges, then $\{b_n\}$ converges too.
Version: 5 Owner: mathcam Author(s): matte, drummond
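Definition (478.2.1) translates directly into a running average. The following sketch (hypothetical function name, exact arithmetic via `fractions`) also shows that the converse of the property above fails: the sequence $(-1)^n$ diverges, yet its Cesàro means tend to 0.

```python
from fractions import Fraction
from itertools import accumulate

def cesaro_means(a):
    """Return b_n = (a_0 + ... + a_n) / (n + 1) for a finite list a of
    integers or Fractions."""
    return [Fraction(total, n + 1) for n, total in enumerate(accumulate(a))]

if __name__ == "__main__":
    print(cesaro_means([3, 3, 3, 3, 3]))        # constant sequence
    alternating = [(-1) ** i for i in range(1001)]
    print(cesaro_means(alternating)[-1])        # mean of a divergent sequence
```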
478.3
alternating series
An alternating series is of the form
$$\sum_{i=0}^{\infty} (-1)^i a_i \quad \text{or} \quad \sum_{i=0}^{\infty} (-1)^{i+1} a_i$$
where $(a_n)$ is a nonnegative sequence.
Version: 2 Owner: vitriol Author(s): vitriol
478.4
alternating series test
The alternating series test, or Leibniz's Theorem, states the following:
Theorem [1, 2] Let $(a_n)_{n=1}^{\infty}$ be a nonnegative, nonincreasing sequence of real numbers such that $\lim_{n\to\infty} a_n = 0$. Then the infinite sum $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ converges.
This test provides a sufficient (but not necessary) condition for the convergence of an alternating series, and is therefore often used as a simple first test for convergence of such series. The condition $\lim_{n\to\infty} a_n = 0$ is necessary for convergence of an alternating series.
Example: The series $\sum_{k=1}^{\infty} \frac{1}{k}$ does not converge, but the alternating series $\sum_{k=1}^{\infty} (-1)^{k+1} \frac{1}{k}$ converges to ln(2).
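The example can be explored numerically. The sketch below (hypothetical function name) computes partial sums of the alternating harmonic series and compares them with ln(2); it also uses the standard error estimate for series satisfying the hypotheses of the test, namely that the error after N terms is at most the next term $a_{N+1}$, a fact not stated in the entry itself.

```python
import math

def alternating_harmonic_partial_sum(n):
    """Return sum_{k=1}^{n} (-1)**(k+1) / k."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        s = alternating_harmonic_partial_sum(n)
        # error and the bound 1/(n+1) given by the first omitted term
        print(n, s, abs(s - math.log(2)), 1 / (n + 1))
```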
REFERENCES
1. W. Rudin, Principles of Mathematical Analysis, McGraw-Hill Inc., 1976.
2. E. Kreyszig, Advanced Engineering Mathematics, John Wiley & Sons, 1993, 7th ed.
Version: 10 Owner: Koro Author(s): Larry Hammick, matte, saforres, vitriol
478.5
monotonic
A sequence $(s_n)$ is said to be monotonic if it is one of the following:
• monotonically increasing
• monotonically decreasing
• monotonically nondecreasing
• monotonically nonincreasing
Intuitively, this means that the sequence can be thought of as a "staircase" going either only up, or only down, with the stairs any height and any depth.
Version: 1 Owner: akrowne Author(s): akrowne
478.6
monotonically decreasing
A sequence $(s_n)$ is monotonically decreasing if $s_m < s_n$ for all m > n.
Compare this to monotonically nonincreasing. Version: 4 Owner: akrowne Author(s): akrowne
478.7
monotonically increasing
A sequence $(s_n)$ is called monotonically increasing if $s_m > s_n$ for all m > n. Compare this to monotonically nondecreasing.
Version: 3 Owner: akrowne Author(s): akrowne
478.8
monotonically nondecreasing
A sequence (sn ) is called monotonically nondecreasing if sm ≥ sn ∀ m > n Compare this to monotonically increasing. Version: 2 Owner: akrowne Author(s): akrowne
478.9
monotonically nonincreasing
A sequence $(s_n)$ is monotonically nonincreasing if $s_m \le s_n$ for all m > n. Compare this to monotonically decreasing.
Examples.
• $(s_n) = 1, 0, -1, -2, \ldots$ is monotonically nonincreasing. It is also monotonically decreasing.
• $(s_n) = 1, 1, 1, 1, \ldots$ is nonincreasing but not monotonically decreasing.
• $(s_n) = \left(\frac{1}{n+1}\right)$ is nonincreasing (note that n is nonnegative).
• $(s_n) = 1, 1, 2, 1, 1, \ldots$ is not nonincreasing. It also happens to fail to be monotonically nondecreasing.
• $(s_n) = 1, 2, 3, 4, 5, \ldots$ is not nonincreasing, rather it is nondecreasing (and monotonically increasing).
Version: 5 Owner: akrowne Author(s): akrowne
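The four notions of monotonicity can be tested on a finite prefix of a sequence. The sketch below (hypothetical function name) compares consecutive terms only, which suffices because each of the four order relations is transitive; a finite check can refute monotonicity of an infinite sequence but cannot establish it.

```python
def monotonicity(seq):
    """Classify a finite sequence by comparing consecutive terms."""
    pairs = list(zip(seq, seq[1:]))
    return {
        "increasing": all(x < y for x, y in pairs),
        "decreasing": all(x > y for x, y in pairs),
        "nondecreasing": all(x <= y for x, y in pairs),
        "nonincreasing": all(x >= y for x, y in pairs),
    }

if __name__ == "__main__":
    for prefix in ([1, 0, -1, -2], [1, 1, 1, 1], [1, 1, 2, 1, 1], [1, 2, 3, 4, 5]):
        print(prefix, monotonicity(prefix))
```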
478.10
sequence
Sequences Given any set X, a sequence in X is a function f : N −→ X from the set of natural numbers to X. Sequences are usually written with subscript notation: x0 , x1 , x2 . . . , instead of f (0), f (1), f (2) . . . .
Generalized sequences One can generalize the above deﬁnition to any arbitrary ordinal. For any set X, a generalized sequence in X is a function f : ω −→ X where ω is any ordinal number. If ω is a ﬁnite ordinal, then we say the sequence is a ﬁnite sequence. Version: 5 Owner: djao Author(s): djao
478.11
series
Given a sequence of real numbers $\{a_n\}$ we can define a sequence of partial sums $\{S_N\}$, where $S_N = \sum_{n=1}^{N} a_n$. We define the series $\sum_{n=1}^{\infty} a_n$ to be the limit of these partial sums. More precisely
$$\sum_{n=1}^{\infty} a_n = \lim_{N\to\infty} S_N = \lim_{N\to\infty} \sum_{n=1}^{N} a_n.$$
The elements of the sequence {an } are called the terms of the series. Traditionally, as above, series are inﬁnite sums of real numbers. However, the formal constraints on the terms {an } are much less strict. We need only be able to add the terms and take the limit of partial sums. So in full generality the terms could be complex numbers or even elements of certain rings, ﬁelds, and vector spaces. Version: 2 Owner: igor Author(s): igor
Chapter 479 40A05 – Convergence and divergence of series and sequences
479.1 Abel’s lemma
Theorem 1 Let $\{a_i\}_{i=0}^{N}$ and $\{b_i\}_{i=0}^{N}$ be sequences of real (or complex) numbers with N ≥ 0. For n = 0, . . . , N, let $A_n$ be the partial sum $A_n = \sum_{i=0}^{n} a_i$. Then
$$\sum_{i=0}^{N} a_i b_i = \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N.$$
In the trivial case, when N = 0, the sum on the right hand side should be interpreted as identically zero. In other words, if the upper limit is below the lower limit, there is no summation. An inductive proof is given in the entry "proof of Abel's lemma (by induction)". The result can be found in [1] (Exercise 3.3.5).
If the sequences are indexed from M to N, we have the following variant:
Corollary Let $\{a_i\}_{i=M}^{N}$ and $\{b_i\}_{i=M}^{N}$ be sequences of real (or complex) numbers with 0 ≤ M ≤ N. For n = M, . . . , N, let $A_n$ be the partial sum $A_n = \sum_{i=M}^{n} a_i$. Then
$$\sum_{i=M}^{N} a_i b_i = \sum_{i=M}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N.$$
Proof. By defining $a_0 = \ldots = a_{M-1} = b_0 = \ldots = b_{M-1} = 0$, we can apply Theorem 1 to the sequences $\{a_i\}_{i=0}^{N}$ and $\{b_i\}_{i=0}^{N}$. □
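The identity of Theorem 1 (summation by parts) can be verified on concrete data. The sketch below uses hypothetical function names and small integer lists chosen only for illustration; it evaluates both sides of the identity, including the trivial case N = 0.

```python
from itertools import accumulate

def abel_left_side(a, b):
    """sum_{i=0}^{N} a_i * b_i"""
    return sum(x * y for x, y in zip(a, b))

def abel_right_side(a, b):
    """sum_{i=0}^{N-1} A_i * (b_i - b_{i+1}) + A_N * b_N,
    where A_n = a_0 + ... + a_n."""
    big_n = len(a) - 1
    partial = list(accumulate(a))          # partial[n] == A_n
    body = sum(partial[i] * (b[i] - b[i + 1]) for i in range(big_n))
    return body + partial[big_n] * b[big_n]

if __name__ == "__main__":
    a = [3, -1, 4, 1, -5]
    b = [2, 7, -1, 8, 2]
    print(abel_left_side(a, b), abel_right_side(a, b))
```

For N = 0 the `range(0)` loop is empty, matching the convention that the sum on the right hand side is then zero.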
REFERENCES
1. R.B. Guenther, L.W. Lee, Partial Diﬀerential Equations of Mathematical Physics and Integral Equations, Dover Publications, 1988.
Version: 10 Owner: mathcam Author(s): matte, lieven
479.2
Abel’s test for convergence
Suppose $\sum a_n$ converges and that $(b_n)$ is a monotonic convergent sequence. Then the series $\sum a_n b_n$ converges.
Version: 4 Owner: vypertd Author(s): vypertd
479.3
Baroni's Theorem
Let $(x_n)_{n \ge 0}$ be a sequence of real numbers such that $\lim_{n\to\infty} (x_{n+1} - x_n) = 0$. Let $A = \{x_n \mid n \in \mathbb{N}\}$ and A′ the set of limit points of A. Then A′ is a (possibly degenerate) interval of $\overline{\mathbb{R}}$, where $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$.
Version: 2 Owner: slash Author(s): slash
479.4
Bolzano-Weierstrass theorem
Given any bounded, real sequence $(a_n)$ there exists a convergent subsequence $(a_{n_j})$. More generally, any sequence $(a_n)$ in a compact set has a convergent subsequence.
Version: 6 Owner: vitriol Author(s): vitriol
479.5
Cauchy criterion for convergence
A series $\sum_{i=0}^{\infty} a_i$ is convergent iff for every ε > 0 there is a number N ∈ N such that
$$|a_{n+1} + a_{n+2} + \ldots + a_{n+p}| < \varepsilon$$
holds for all n > N and p ≥ 1.
Proof: First define
$$s_n := \sum_{i=0}^{n} a_i.$$
The series converges iff the sequence $(s_n)$ of partial sums converges, and, since the real (or complex) numbers are complete, this happens iff $(s_n)$ is a Cauchy sequence, i.e. iff for every ε > 0 there is a number N such that for all n, m > N we have $|s_m - s_n| < \varepsilon$. We can assume m > n and thus set m = n + p. Then the series is convergent iff
$$|s_{n+p} - s_n| = |a_{n+1} + a_{n+2} + \ldots + a_{n+p}| < \varepsilon.$$
Version: 2 Owner: mathwizard Author(s): mathwizard
479.6
Cauchy's root test
If $\sum a_n$ is a series of positive real terms and
$$\sqrt[n]{a_n} < k < 1$$
for all n > N, then $\sum a_n$ is convergent. If $\sqrt[n]{a_n} \ge 1$ for an infinite number of values of n, then $\sum a_n$ is divergent.
Limit form Given a series $\sum a_n$ of complex terms, set
$$\rho = \limsup_{n\to\infty} \sqrt[n]{|a_n|}.$$
The series $\sum a_n$ is absolutely convergent if ρ < 1 and is divergent if ρ > 1. If ρ = 1, then the test is inconclusive.
Version: 4 Owner: vypertd Author(s): vypertd
479.7
Dirichlet’s convergence test
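Theorem. Let $\{a_n\}$ and $\{b_n\}$ be sequences of real numbers such that $\{\sum_{i=0}^{n} a_i\}$ is bounded and $\{b_n\}$ decreases with 0 as limit. Then $\sum_{n=0}^{\infty} a_n b_n$ converges.
Proof. Let $A_n := \sum_{i=0}^{n} a_i$ and let M > 0 be an upper bound for $\{|A_n|\}$. By Abel's lemma, for 1 ≤ m ≤ n,
\begin{align*}
\sum_{i=m}^{n} a_i b_i &= \sum_{i=0}^{n} a_i b_i - \sum_{i=0}^{m-1} a_i b_i \\
&= \sum_{i=0}^{n-1} A_i (b_i - b_{i+1}) - \sum_{i=0}^{m-2} A_i (b_i - b_{i+1}) + A_n b_n - A_{m-1} b_{m-1} \\
&= \sum_{i=m-1}^{n-1} A_i (b_i - b_{i+1}) + A_n b_n - A_{m-1} b_{m-1},
\end{align*}
so that
\begin{align*}
\left|\sum_{i=m}^{n} a_i b_i\right| &\le \sum_{i=m-1}^{n-1} |A_i (b_i - b_{i+1})| + |A_n b_n| + |A_{m-1} b_{m-1}| \\
&\le M \sum_{i=m-1}^{n-1} (b_i - b_{i+1}) + |A_n b_n| + |A_{m-1} b_{m-1}|.
\end{align*}
Since $\{b_n\}$ converges to 0, there is an N(ε) such that both $\sum_{i=m-1}^{n-1} (b_i - b_{i+1}) = b_{m-1} - b_n < \frac{\varepsilon}{3M}$ and $b_i < \frac{\varepsilon}{3M}$ for all i ≥ N(ε), whenever n ≥ m > N(ε). Then, for n ≥ m > N(ε), $\left|\sum_{i=m}^{n} a_i b_i\right| < \varepsilon$, and $\sum a_n b_n$ converges by the Cauchy criterion.
Version: 1 Owner: lieven Author(s): lieven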
479.8
Proof of Baroni’s Theorem
Let m = inf A′ and M = sup A′. If m = M we are done, since the sequence is convergent and A′ is the degenerate interval composed of the point $l \in \overline{\mathbb{R}}$, where $l = \lim_{n\to\infty} x_n$.
Now, assume that m < M. For every λ ∈ (m, M), we will construct inductively two subsequences $x_{k_n}$ and $x_{l_n}$ such that
$$\lim_{n\to\infty} x_{k_n} = \lim_{n\to\infty} x_{l_n} = \lambda \quad \text{and} \quad x_{k_n} \le \lambda < x_{l_n}.$$
From the definition of M there is an $N_1 \in \mathbb{N}$ such that $\lambda < x_{N_1} < M$. Consider the set of all such values $N_1$. It is bounded from below (because it consists only of natural numbers and has at least one element) and thus it has a smallest element. Let $n_1$ be the smallest such element; from its definition we have $x_{n_1 - 1} \le \lambda < x_{n_1}$. So, choose $k_1 = n_1 - 1$, $l_1 = n_1$. Now, there is an $N_2 > k_1$ such that $\lambda < x_{N_2} < M$.
Consider the set of all such values $N_2$. It is bounded from below and it has a smallest element $n_2$. Choose $k_2 = n_2 - 1$ and $l_2 = n_2$. Now, proceed by induction to construct the sequences $k_n$ and $l_n$ in the same fashion. Since $l_n - k_n = 1$ we have
$$\lim_{n\to\infty} x_{k_n} = \lim_{n\to\infty} x_{l_n}$$
and thus they are both equal to λ.
Version: 1 Owner: slash Author(s): slash
479.9
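Proof of Stolz-Cesàro theorem
From the definition of convergence, for every ε > 0 there is N(ε) ∈ N such that for all n ≥ N(ε) we have
$$l - \varepsilon < \frac{a_{n+1} - a_n}{b_{n+1} - b_n} < l + \varepsilon.$$
Because $b_n$ is strictly increasing we can multiply the last relation by $b_{n+1} - b_n$ to get
$$(l - \varepsilon)(b_{n+1} - b_n) < a_{n+1} - a_n < (l + \varepsilon)(b_{n+1} - b_n).$$
Let k > N(ε) be a natural number. Summing the last relation we get
$$(l - \varepsilon) \sum_{i=N(\varepsilon)}^{k} (b_{i+1} - b_i) < \sum_{i=N(\varepsilon)}^{k} (a_{i+1} - a_i) < (l + \varepsilon) \sum_{i=N(\varepsilon)}^{k} (b_{i+1} - b_i),$$
that is,
$$(l - \varepsilon)(b_{k+1} - b_{N(\varepsilon)}) < a_{k+1} - a_{N(\varepsilon)} < (l + \varepsilon)(b_{k+1} - b_{N(\varepsilon)}).$$
Divide the last relation by $b_{k+1} > 0$ to get
$$(l - \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right) < \frac{a_{k+1}}{b_{k+1}} - \frac{a_{N(\varepsilon)}}{b_{k+1}} < (l + \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right),$$
or equivalently
$$(l - \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right) + \frac{a_{N(\varepsilon)}}{b_{k+1}} < \frac{a_{k+1}}{b_{k+1}} < (l + \varepsilon)\left(1 - \frac{b_{N(\varepsilon)}}{b_{k+1}}\right) + \frac{a_{N(\varepsilon)}}{b_{k+1}}.$$
Since $b_k$ is unbounded, the terms $b_{N(\varepsilon)}/b_{k+1}$ and $a_{N(\varepsilon)}/b_{k+1}$ converge to 0 as k → ∞, so the outer bounds converge to l − ε and l + ε. This means that there is some K such that for k ≥ K we have
$$l - 2\varepsilon < \frac{a_{k+1}}{b_{k+1}} < l + 2\varepsilon.$$
As ε > 0 was arbitrary, this means that
$$\lim_{n\to\infty} \frac{a_n}{b_n} = l$$
and we are done.
Version: 1 Owner: slash Author(s): slash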
479.10
Stolz-Cesàro theorem
Let $(a_n)_{n \ge 1}$ and $(b_n)_{n \ge 1}$ be two sequences of real numbers. If $b_n$ is positive, strictly increasing and unbounded and the following limit exists:
$$\lim_{n\to\infty} \frac{a_{n+1} - a_n}{b_{n+1} - b_n} = l,$$
then the limit
$$\lim_{n\to\infty} \frac{a_n}{b_n}$$
also exists and is equal to l.
Version: 4 Owner: Daume Author(s): Daume, slash
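As a quick illustration of the Stolz-Cesàro theorem (a worked example that is not part of the original entry), take $a_n = 1 + 2 + \cdots + n$ and $b_n = n^2$, so that $b_n$ is positive, strictly increasing and unbounded. The limit of the quotient of differences is easy to compute, and it agrees with the limit of $a_n / b_n$ obtained from the closed form $a_n = n(n+1)/2$:

```latex
\[
  \frac{a_{n+1} - a_n}{b_{n+1} - b_n}
  = \frac{n+1}{(n+1)^2 - n^2}
  = \frac{n+1}{2n+1}
  \longrightarrow \frac{1}{2},
  \qquad\text{and indeed}\qquad
  \frac{a_n}{b_n} = \frac{n(n+1)}{2n^2} \longrightarrow \frac{1}{2}.
\]
```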
479.11
absolute convergence theorem
Every absolutely convergent series is convergent. Version: 1 Owner: paolini Author(s): paolini
479.12
comparison test
The series
$$\sum_{i=0}^{\infty} a_i$$
with real $a_i$ is absolutely convergent if there is a sequence $(b_n)_{n \in \mathbb{N}}$ with positive real $b_n$ such that
$$\sum_{i=0}^{\infty} b_i$$
is convergent and $|a_k| \le b_k$ holds for all sufficiently large k.
Also, the series $\sum a_i$ is divergent if there is a sequence $(b_n)$ with positive real $b_n$, so that $\sum b_i$ is divergent and $a_k \ge b_k$ for all sufficiently large k.
Version: 1 Owner: mathwizard Author(s): mathwizard
479.13
convergent sequence
A sequence $x_0, x_1, x_2, \ldots$ in a metric space (X, d) is a convergent sequence if there exists a point x ∈ X such that, for every real number ε > 0, there exists a natural number N such that $d(x, x_n) < \varepsilon$ for all n > N. The point x, if it exists, is unique, and is called the limit point of the sequence. One can also say that the sequence $x_0, x_1, x_2, \ldots$ converges to x. A sequence is said to be divergent if it does not converge.
Version: 4 Owner: djao Author(s): djao
479.14
convergent series
A series $\sum a_n$ is convergent iff the sequence of partial sums $\sum_{i=1}^{n} a_i$ is convergent. A series $\sum a_n$ is said to be absolutely convergent if $\sum |a_n|$ is convergent. Equivalently, a series $\sum a_n$ is absolutely convergent if and only if all possible rearrangements are also convergent. A series $\sum a_n$ which converges, but which is not absolutely convergent, is called conditionally convergent. It can be shown that absolute convergence implies convergence. Let $\sum a_n$ be an absolutely convergent series, and $\sum b_n$ be a conditionally convergent series. Then any rearrangement of $\sum a_n$ is convergent to the same sum. It is a result due to Riemann that $\sum b_n$ can be rearranged to converge to any sum, or not converge at all.
Version: 5 Owner: vitriol Author(s): vitriol
479.15
determining series convergence
Consider a series $\sum a_n$. To determine whether $\sum a_n$ converges or diverges, several tests are available. There is no precise rule indicating which type of test to use with a given series. The more obvious approaches are collected below.
• When the terms in $\sum a_n$ are positive, there are several possibilities:
  – the comparison test,
  – the root test (Cauchy's root test),
  – the ratio test,
  – the integral test.
• If the series is an alternating series, then the alternating series test may be used.
• Abel's test for convergence can be used when terms in $\sum a_n$ can be obtained as the product of terms of a convergent series with terms of a monotonic convergent sequence.
The root test and the ratio test are direct applications of the comparison test to the geometric series, using the quantities $(a_n)^{1/n}$ and $\frac{a_{n+1}}{a_n}$, respectively.
Version: 2 Owner: jarino Author(s): jarino
479.16
example of integral test
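Consider the series
$$\sum_{k=2}^{\infty} \frac{1}{k \log k}$$
(the sum starts at k = 2 because log 1 = 0). Since the integral
$$\int_2^{\infty} \frac{1}{x \log x}\, dx = \lim_{M\to\infty} \left[\log(\log(x))\right]_2^M$$
is divergent, the series considered is divergent as well.
Version: 2 Owner: paolini Author(s): paolini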
479.17
geometric series
A geometric series is a series of the form
$$\sum_{i=1}^{n} a r^{i-1}$$
(with a and r real or complex numbers). For r ≠ 1, the sum of a geometric series is
$$s_n = \frac{a(1 - r^n)}{1 - r}. \qquad (479.17.1)$$
An infinite geometric series is a geometric series, as above, with n → ∞. It is denoted
$$\sum_{i=1}^{\infty} a r^{i-1}.$$
If |r| ≥ 1, the infinite geometric series diverges. Otherwise it converges to
$$\sum_{i=1}^{\infty} a r^{i-1} = \frac{a}{1 - r}. \qquad (479.17.2)$$
Taking the limit of $s_n$ as n → ∞, we see that $s_n$ diverges if |r| ≥ 1. However, if |r| < 1, $s_n$ approaches (479.17.2).
One way to prove (479.17.1) is to take
$$s_n = a + ar + ar^2 + \cdots + ar^{n-1}$$
and multiply by r, to get
$$r s_n = ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n.$$
Subtracting the two removes most of the terms:
$$s_n - r s_n = a - ar^n.$$
Factoring and dividing gives us
$$s_n = \frac{a(1 - r^n)}{1 - r}.$$
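The closed form for the finite geometric sum can be compared with a term-by-term summation. The sketch below uses exact rational arithmetic; the function names and the sample values a = 3, r = 1/2, n = 10 are illustrative assumptions.

```python
from fractions import Fraction

def geometric_partial_sum(a, r, n):
    """Sum a * r**(i-1) for i = 1, ..., n, term by term."""
    return sum(a * r ** (i - 1) for i in range(1, n + 1))

def geometric_closed_form(a, r, n):
    """Closed form a * (1 - r**n) / (1 - r), valid for r != 1."""
    if r == 1:
        raise ValueError("closed form requires r != 1")
    return a * (1 - r**n) / (1 - r)

if __name__ == "__main__":
    a, r, n = Fraction(3), Fraction(1, 2), 10
    print(geometric_partial_sum(a, r, n), geometric_closed_form(a, r, n))
    print(a / (1 - r) - geometric_partial_sum(a, r, n))  # distance to the infinite sum
```

The gap between the partial sum and the infinite sum a/(1 − r) is exactly $a r^n/(1 - r)$, which shrinks geometrically when |r| < 1.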
Version: 6 Owner: akrowne Author(s): akrowne
479.18
harmonic number
The harmonic number of order n of θ is deﬁned as
n
Hθ (n) =
i=1
1 iθ
1751
Note that n may be equal to ∞, provided θ > 1. If θ ≤ 1, while n = ∞, the harmonic series does not converge and hence the harmonic number does not exist. If θ = 1, we may just write Hθ (n) as Hn (this is a common notation).
479.18.1
Properties
• If $\operatorname{Re}(\theta) > 1$ and $n = \infty$ then the sum is the Riemann zeta function.
• If $\theta = 1$, then we get what is known simply as "the harmonic number", and it has many important properties. For example, it has asymptotic expansion $H_n = \ln n + \gamma + \frac{1}{2n} + \dots$ where $\gamma$ is Euler's constant.
• It is possible to define harmonic numbers for nonintegral $n$. This is done by means of the series $H_x(z) = \sum_{n \geq 1} \left(n^{-z} - (n + x)^{-z}\right)$, where $x$ plays the role of the nonintegral order (see footnote 1).
Version: 5 Owner: akrowne Author(s): akrowne
479.19
harmonic series
The harmonic series is
$h = \sum_{n=1}^{\infty} \frac{1}{n}$
The harmonic series is known to diverge. This can be proven via the integral test; compare $h$ with
$\int_1^{\infty} \frac{1}{x}\,dx.$
A harmonic series is any series of the form
$h_p = \sum_{n=1}^{\infty} \frac{1}{n^p}$
See “The Art of computer programming” vol. 2 by D. Knuth
These are the so-called "p-series." When $p > 1$, these are known to converge (leading to the p-series test for series convergence). For complex-valued $p$ with $\operatorname{Re}(p) > 1$, $h_p = \zeta(p)$, the Riemann zeta function. A famous harmonic series is $h_2$ (or $\zeta(2)$), which converges to $\frac{\pi^2}{6}$. In general, no closed form is known for a p-harmonic series of odd $p$.
A harmonic series which is not summed to $\infty$, but instead is of the form
$h_p(k) = \sum_{n=1}^{k} \frac{1}{n^p}$
is called a harmonic series of order $k$ of $p$.
Version: 2 Owner: akrowne Author(s): akrowne
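The behaviour of the finite sums $h_p(k)$ can be observed directly. This is a small illustrative sketch (the function name is an assumption, not notation from the entry): for p = 2 the partial sums approach $\pi^2/6$, while for p = 1 they keep growing, roughly like ln k.

```python
import math


def harmonic_partial_sum(p, k):
    """h_p(k) = sum_{n=1}^{k} 1 / n^p, computed by direct summation."""
    return sum(1.0 / n ** p for n in range(1, k + 1))


# p = 2: the partial sums approach pi^2 / 6.
print(harmonic_partial_sum(2, 100_000), math.pi ** 2 / 6)

# p = 1: the partial sums are unbounded and stay above ln k.
for k in (10, 1_000, 100_000):
    print(k, harmonic_partial_sum(1, k), math.log(k))
```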
479.20
integral test
Consider a sequence $(a_n) = \{a_0, a_1, a_2, a_3, \dots\}$ and, given $M \in \mathbb{R}$, consider any monotonically nonincreasing function $f : [M, +\infty) \to \mathbb{R}$ which extends the sequence, i.e.
$f(n) = a_n \quad \forall n \geq M.$
An example of such an extension is $a_n = 2n \to f(x) = 2x$ (the former being the sequence $\{0, 2, 4, 6, 8, \dots\}$ and the latter the doubling function for any real number); it illustrates only the notion of extension, since this $f$ is increasing rather than nonincreasing. We are interested in finding out when the summation
$\sum_{n=0}^{\infty} a_n$
converges. The integral test states the following. The series
$\sum_{n=0}^{\infty} a_n$
converges if and only if the integral
$\int_M^{\infty} f(x)\,dx$
is finite.
Version: 16 Owner: drini Author(s): paolini, drini, vitriol
479.21
proof of Abel’s lemma (by induction)
Proof. The proof is by induction. However, let us first recall that the sum on the right side is a piecewise defined function of the upper limit $N - 1$. In other words, if the upper limit is below the lower limit 0, the sum is identically set to zero. Otherwise, it is an ordinary sum. We therefore need to manually check the first two cases. For the trivial case $N = 0$, both sides equal $a_0 b_0$. Also, for $N = 1$ (when the sum is a normal sum), it is easy to verify that both sides simplify to $a_0 b_0 + a_1 b_1$. Then, for the induction step, suppose that the claim holds for some $N \geq 1$. For $N + 1$, we then have
$\sum_{i=0}^{N+1} a_i b_i = \sum_{i=0}^{N} a_i b_i + a_{N+1} b_{N+1}$
$= \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N + a_{N+1} b_{N+1}$
$= \sum_{i=0}^{N} A_i (b_i - b_{i+1}) - A_N (b_N - b_{N+1}) + A_N b_N + a_{N+1} b_{N+1}.$
Since $-A_N (b_N - b_{N+1}) + A_N b_N + a_{N+1} b_{N+1} = A_N b_{N+1} + a_{N+1} b_{N+1} = A_{N+1} b_{N+1}$, the claim follows. □
Version: 4 Owner: mathcam Author(s): matte
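The identity being proved is not restated in this entry. Reading it off the proof, it appears to be $\sum_{i=0}^{N} a_i b_i = \sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N$ with $A_i = a_0 + \dots + a_i$. Under that assumption, the sketch below checks both sides exactly with rational arithmetic; the function names and sample data are illustrative.

```python
from fractions import Fraction
from itertools import accumulate


def abel_lhs(a, b):
    """Left side: sum of a_i * b_i for i = 0..N."""
    return sum(x * y for x, y in zip(a, b))


def abel_rhs(a, b):
    """Right side: sum_{i=0}^{N-1} A_i (b_i - b_{i+1}) + A_N b_N, A_i = a_0 + ... + a_i."""
    A = list(accumulate(a))  # partial sums A_0, ..., A_N
    N = len(a) - 1
    # For N = 0 the range is empty, so the sum is zero, as in the proof.
    return sum(A[i] * (b[i] - b[i + 1]) for i in range(N)) + A[N] * b[N]


a = [Fraction(1), Fraction(-2, 3), Fraction(5), Fraction(7, 2)]
b = [Fraction(4), Fraction(1, 5), Fraction(-3), Fraction(2)]
print(abel_lhs(a, b), abel_rhs(a, b))
```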
479.22
proof of Abel’s test for convergence
Let $b$ be the limit of $\{b_n\}$ and let $d_n = b_n - b$ when $\{b_n\}$ is decreasing and $d_n = b - b_n$ when $\{b_n\}$ is increasing. By Dirichlet's convergence test, $\sum a_n d_n$ is convergent and so is $\sum a_n b_n = \sum a_n (b \pm d_n) = b \sum a_n \pm \sum a_n d_n$.
Version: 1 Owner: lieven Author(s): lieven
479.23
proof of Bolzano-Weierstrass Theorem
To prove the Bolzano-Weierstrass theorem, we will first need two lemmas.
Lemma 1. All bounded monotone sequences converge.
proof. Let $(s_n)$ be a bounded, nondecreasing sequence. Let $S$ denote the set $\{s_n : n \in \mathbb{N}\}$. Then let $b = \sup S$ (the supremum of $S$). Choose some $\varepsilon > 0$. Then there is a corresponding $N$ such that $s_N > b - \varepsilon$. Since $(s_n)$ is nondecreasing, for all $n > N$, $s_n > b - \varepsilon$. But $(s_n)$ is bounded above by $b$, so we have $b - \varepsilon < s_n \leq b$. But this implies $|s_n - b| < \varepsilon$, so $\lim s_n = b$. (The proof for nonincreasing sequences is analogous.)
Lemma 2. Every sequence has a monotonic subsequence.
proof. First a definition: call the nth term of a sequence dominant if it is greater than every term following it. For the proof, note that a sequence $(s_n)$ may have finitely many or infinitely many dominant terms.
First we suppose that $(s_n)$ has infinitely many dominant terms. Form a subsequence $(s_{n_k})$ solely of dominant terms of $(s_n)$. Then $s_{n_{k+1}} < s_{n_k}$ for all $k$ by definition of "dominant", hence $(s_{n_k})$ is a decreasing (monotone) subsequence of $(s_n)$.
For the second case, assume that our sequence $(s_n)$ has only finitely many dominant terms. Select $n_1$ such that $n_1$ is beyond the last dominant term. But since $s_{n_1}$ is not dominant, there must be some $m > n_1$ such that $s_m \geq s_{n_1}$. Select this $m$ and call it $n_2$. However, $s_{n_2}$ is still not dominant, so there must be an $n_3 > n_2$ with $s_{n_3} \geq s_{n_2}$, and so on, inductively. The resulting sequence $s_{n_1}, s_{n_2}, s_{n_3}, \dots$ is monotonic (nondecreasing).
proof of Bolzano-Weierstrass. The proof of the Bolzano-Weierstrass theorem is now simple: let $(s_n)$ be a bounded sequence. By Lemma 2 it has a monotonic subsequence. By Lemma 1, the subsequence converges.
Version: 2 Owner: akrowne Author(s): akrowne
479.24
proof of Cauchy's root test
If for all $n \geq N$
$\sqrt[n]{a_n} < k < 1$
then
$a_n < k^n < 1.$
Since $\sum_{i=N}^{\infty} k^i$ converges, so does $\sum_{i=N}^{\infty} a_i$ by the comparison test. If $\sqrt[n]{a_n} > 1$, then by comparison with $\sum_{i=N}^{\infty} 1$ the series is divergent. Absolute convergence in case of nonpositive $a_n$ can be proven in exactly the same way using $\sqrt[n]{|a_n|}$.
Version: 1 Owner: mathwizard Author(s): mathwizard
479.25
proof of Leibniz’s theorem (using Dirichlet’s convergence test)
Proof. Let us define the sequence $\alpha_n = (-1)^n$ for $n \in \mathbb{N} = \{0, 1, 2, \dots\}$. Then
$\sum_{i=0}^{n} \alpha_i = 1$ for even $n$, and $0$ for odd $n$,
so the sequence $\sum_{i=0}^{n} \alpha_i$ is bounded. By assumption $\{a_n\}_{n=1}^{\infty}$ is a bounded decreasing sequence with limit 0. For $n \in \mathbb{N}$ we set $b_n := a_{n+1}$. Using Dirichlet's convergence test, it follows that the series $\sum_{i=0}^{\infty} \alpha_i b_i$ converges. Since
$\sum_{i=0}^{\infty} \alpha_i b_i = \sum_{n=1}^{\infty} (-1)^{n+1} a_n,$
the claim follows. □
Version: 4 Owner: mathcam Author(s): matte, Thomas Heye
479.26
proof of absolute convergence theorem
Suppose that $\sum a_n$ is absolutely convergent, i.e., that $\sum |a_n|$ is convergent. First of all, notice that
$0 \leq a_n + |a_n| \leq 2|a_n|,$
and since the series $\sum (a_n + |a_n|)$ has nonnegative terms it can be compared with $\sum 2|a_n| = 2 \sum |a_n|$ and hence converges.
On the other hand
$\sum_{n=1}^{N} a_n = \sum_{n=1}^{N} (a_n + |a_n|) - \sum_{n=1}^{N} |a_n|.$
Since both the partial sums on the right hand side are convergent, the partial sum on the left hand side is also convergent. So, the series $\sum a_n$ is convergent.
Version: 3 Owner: paolini Author(s): paolini
479.27
proof of alternating series test
If the first term $a_1$ is positive then the series has partial sum
$S_{2n+2} = a_1 - a_2 + a_3 - \dots - a_{2n} + a_{2n+1} - a_{2n+2}$
where the $a_i$ are all nonnegative and nonincreasing. If the first term is negative, consider the series in the absence of the first term. From above, we have
$S_{2n+1} = S_{2n} + a_{2n+1}$
$S_{2n+2} = S_{2n} + (a_{2n+1} - a_{2n+2})$
$S_{2n+3} = S_{2n+1} - (a_{2n+2} - a_{2n+3}) = S_{2n+2} + a_{2n+3}.$
Since $a_{2n+1} \geq a_{2n+2} \geq a_{2n+3}$ we have $S_{2n+1} \geq S_{2n+3} \geq S_{2n+2} \geq S_{2n}$. Moreover,
$S_{2n+2} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \dots - (a_{2n} - a_{2n+1}) - a_{2n+2}.$
Because the $a_i$ are nonincreasing, we have $S_n \geq 0$ for any $n$. Also, $S_{2n+2} \leq S_{2n+1} \leq a_1$. Thus
$a_1 \geq S_{2n+1} \geq S_{2n+3} \geq S_{2n+2} \geq S_{2n} \geq 0.$
Hence the even partial sums $S_{2n}$ and the odd partial sums $S_{2n+1}$ are bounded. The $S_{2n}$ are monotonically nondecreasing, while the odd sums $S_{2n+1}$ are monotonically nonincreasing. Thus the even and odd partial sums both converge. We note that $S_{2n+1} - S_{2n} = a_{2n+1}$, therefore the sums converge to the same limit if and only if $(a_n) \to 0$. The theorem is then established.
Version: 7 Owner: volator Author(s): volator
479.28
proof of comparison test
Assume $|a_k| \leq b_k$ for all $k > n$. Then we define the partial sums
$s_k := \sum_{i=n+1}^{k} |a_i|$
and
$t_k := \sum_{i=n+1}^{k} b_i.$
Obviously $s_k \leq t_k$ for all $k > n$. Since by assumption $(t_k)$ is convergent, $(t_k)$ is bounded and so is $(s_k)$. Also $(s_k)$ is monotonic and therefore convergent. Therefore $\sum_{i=0}^{\infty} a_i$ is absolutely convergent.
Now assume $0 \leq b_k \leq a_k$ for all $k > n$. If $\sum_{i=k}^{\infty} b_i$ is divergent then so is $\sum_{i=k}^{\infty} a_i$, because otherwise we could apply the test we just proved and show that $\sum_{i=0}^{\infty} b_i$ is convergent, which it is not by assumption.
Version: 1 Owner: mathwizard Author(s): mathwizard
479.29
proof of integral test
Consider the function (see the definition of floor)
$g(x) = a_{\lfloor x \rfloor}.$
Clearly for $x \in [n, n+1)$, $f$ being nonincreasing, we have
$g(x+1) = a_{n+1} = f(n+1) \leq f(x) \leq f(n) = a_n = g(x)$
hence
$\int_M^{+\infty} g(x+1)\,dx = \int_{M+1}^{+\infty} g(x)\,dx \leq \int_M^{+\infty} f(x)\,dx \leq \int_M^{+\infty} g(x)\,dx.$
Since the integral of $f$ and $g$ on $[M, M+1]$ is finite we notice that $f$ is integrable on $[M, +\infty)$ if and only if $g$ is integrable on $[M, +\infty)$. On the other hand $g$ is locally constant so
$\int_n^{n+1} g(x)\,dx = \int_n^{n+1} a_n\,dx = a_n$
and hence for all $N \in \mathbb{Z}$
$\int_N^{+\infty} g(x)\,dx = \sum_{n=N}^{\infty} a_n,$
that is, $g$ is integrable on $[N, +\infty)$ if and only if $\sum_{n=N}^{\infty} a_n$ is convergent.
But, again, $\int_M^N g(x)\,dx$ is finite, hence $g$ is integrable on $[M, +\infty)$ if and only if $g$ is integrable on $[N, +\infty)$; and also $\sum_{n=0}^{N} a_n$ is finite, so $\sum_{n=0}^{\infty} a_n$ is convergent if and only if $\sum_{n=N}^{\infty} a_n$ is convergent.
Version: 1 Owner: paolini Author(s): paolini
479.30
proof of ratio test
Assume $k < 1$. By definition $\exists N$ such that
$n > N \Rightarrow \left| \left|\frac{a_{n+1}}{a_n}\right| - k \right| < \frac{1-k}{2} \Rightarrow \left|\frac{a_{n+1}}{a_n}\right| < \frac{1+k}{2} < 1,$
i.e. eventually the series $\sum |a_n|$ becomes less than a convergent geometric series, therefore a shifted subsequence of $\sum |a_n|$ converges by the comparison test. Note that a general sequence $b_n$ converges iff a shifted subsequence of $b_n$ converges. Therefore, by the absolute convergence theorem, the series $\sum a_n$ converges.
Similarly for $k > 1$ a shifted subsequence of $|a_n|$ becomes greater than a geometric series tending to $\infty$, and so also tends to $\infty$. Therefore $\sum a_n$ diverges.
Version: 3 Owner: vitriol Author(s): vitriol
479.31
ratio test
Let $(a_n)$ be a real sequence. If $\left|\frac{a_{n+1}}{a_n}\right| \to k$ then:
• $k < 1 \Rightarrow \sum a_n$ converges absolutely
• $k > 1 \Rightarrow \sum a_n$ diverges
Version: 4 Owner: vitriol Author(s): vitriol
Chapter 480 40A10 – Convergence and divergence of integrals
480.1 improper integral
Improper integrals are integrals of functions which become unbounded at one of the limits of integration or at a point between them, or integrals for which at least one limit of integration is infinite. To evaluate these integrals, we take a limit of the antiderivative. Thus we say that an improper integral converges or diverges according to whether this limit converges or diverges. [examples and more exposition later] Version: 1 Owner: slider142 Author(s): slider142
Chapter 481 40A25 – Approximation to limiting values (summation of series, etc.)
481.1 Euler’s constant
Euler's constant $\gamma$ is defined by
$\gamma = \lim_{n \to \infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} - \ln n\right)$
or equivalently
$\gamma = \lim_{n \to \infty} \sum_{i=1}^{n} \left[\frac{1}{i} - \ln\left(1 + \frac{1}{i}\right)\right]$
Euler's constant has the value
0.57721566490153286060651209008240243104 . . .
It is related to the gamma function by
$\gamma = -\Gamma'(1)$
It is not known whether $\gamma$ is rational or irrational.
References.
• Chris Caldwell, "Euler's Constant", http://primes.utm.edu/glossary/page.php/Gamma.html
Version: 6 Owner: akrowne Author(s): akrowne
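The two limit definitions of $\gamma$ can be compared numerically. The sketch below is illustrative only (function names are assumptions); it evaluates both expressions at a finite n, where the first approaches $\gamma$ from above and the second from below.

```python
import math


def gamma_from_harmonic(n):
    """1 + 1/2 + ... + 1/n - ln n (first definition, truncated at n)."""
    return sum(1.0 / i for i in range(1, n + 1)) - math.log(n)


def gamma_from_series(n):
    """sum_{i=1}^{n} [1/i - ln(1 + 1/i)] (second definition, truncated at n)."""
    return sum(1.0 / i - math.log1p(1.0 / i) for i in range(1, n + 1))


for n in (10, 1_000, 100_000):
    print(n, gamma_from_series(n), gamma_from_harmonic(n))
```

The gap between the two columns is ln(1 + 1/n), so both converge only at a rate of about 1/n; faster methods exist but are outside the scope of this sketch.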
Chapter 482 40A30 – Convergence and divergence of series and sequences of functions
482.1 Abel’s limit theorem
Suppose that $\sum a_n x^n$ has a radius of convergence $r$ and that $\sum a_n r^n$ is convergent. Then
$\lim_{x \to r^-} \left(\sum a_n x^n\right) = \sum a_n r^n = \sum \left(\lim_{x \to r^-} a_n x^n\right)$
Version: 2 Owner: vypertd Author(s): vypertd
482.2
Löwner partial ordering
Let $A$ and $B$ be two Hermitian matrices of the same size. If $A - B$ is positive semidefinite we write
$A \geq B$ or $B \leq A$
Note: $\geq$ is a partial ordering, referred to as Löwner partial ordering, on the set of Hermitian matrices.
Version: 3 Owner: Johan Author(s): Johan
482.3
Löwner's theorem
A real function $f$ on an interval $I$ is matrix monotone if and only if it is real analytic and has (complex) analytic continuations to the upper and lower half planes such that $\operatorname{Im}(f) > 0$ in the upper half plane. (Löwner 1934)
Version: 4 Owner: mathcam Author(s): Larry Hammick, yark, Johan
482.4
matrix monotone
A real function $f$ on a real interval $I$ is said to be matrix monotone of order $n$, if
$A \leq B \Rightarrow f(A) \leq f(B)$ (482.4.1)
for all Hermitian n × n matrices A, B with spectra contained in I. Version: 5 Owner: Johan Author(s): Johan
482.5
operator monotone
A function is said to be operator monotone if it is matrix monotone of arbitrary order. Version: 2 Owner: Johan Author(s): Johan
482.6
pointwise convergence
Let X be any set, and let Y be a topological space. A sequence f1 , f2 , . . . of functions mapping X to Y is said to be pointwise convergent (or simply convergent) to another function f , if the sequence fn (x) converges to f (x) for each x in X. This is usually denoted by fn → f . Version: 1 Owner: Koro Author(s): Koro
482.7
uniform convergence
Let $X$ be any set, and let $(Y, d)$ be a metric space. A sequence $f_1, f_2, \dots$ of functions mapping $X$ to $Y$ is said to be uniformly convergent to another function $f$ if, for each $\varepsilon > 0$, there exists $N$ such that, for all $x$ and all $n > N$, we have $d(f_n(x), f(x)) < \varepsilon$. This is denoted by $f_n \xrightarrow{u} f$, or "$f_n \to f$ uniformly" or, less frequently, by $f_n \rightrightarrows f$.
Version: 8 Owner: Koro Author(s): Koro
Chapter 483 40G05 – Cesàro, Euler, Nörlund and Hausdorff methods
483.1 Cesàro summability
Cesàro summability is a generalized convergence criterion for infinite series. We say that a series $\sum_{n=0}^{\infty} a_n$ is Cesàro summable if the Cesàro means of the partial sums converge to some limit $L$. To be more precise, letting
$s_N = \sum_{n=0}^{N} a_n$
denote the $N$th partial sum, we say that $\sum_{n=0}^{\infty} a_n$ Cesàro converges to a limit $L$, if
$\frac{1}{N+1}(s_0 + \dots + s_N) \to L$ as $N \to \infty$.
Cesàro summability is a generalization of the usual definition of the limit of an infinite series.
Proposition 19. Suppose that
$\sum_{n=0}^{\infty} a_n = L,$
in the usual sense that $s_N \to L$ as $N \to \infty$. Then, the series in question Cesàro converges to the same limit.
The converse, however, is false. The standard example of a divergent series that is nonetheless Cesàro summable is
$\sum_{n=0}^{\infty} (-1)^n.$
The sequence of partial sums $1, 0, 1, 0, \dots$ does not converge. The Cesàro means, namely
$\frac{1}{1}, \frac{1}{2}, \frac{2}{3}, \frac{2}{4}, \frac{3}{5}, \frac{3}{6}, \dots$
do converge, with 1/2 as the limit. Hence the series in question is Cesàro summable.
There is also a relation between Cesàro summability and Abel summability (see footnote 1).
Theorem 14 (Frobenius). A series that is Cesàro summable is also Abel summable. To be more precise, suppose that
$\frac{1}{N+1}(s_0 + \dots + s_N) \to L$ as $N \to \infty$.
Then,
$f(r) = \sum_{n=0}^{\infty} a_n r^n \to L$ as $r \to 1^-$
as well.
Version: 3 Owner: rmilson Author(s): rmilson
1. This and similar results are often called Abelian theorems.
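The Cesàro means of the series $\sum (-1)^n$ discussed above can be reproduced exactly with rational arithmetic. This is a minimal sketch; the function name is an illustrative assumption.

```python
from fractions import Fraction
from itertools import accumulate


def cesaro_means(terms):
    """Return the means (s_0 + ... + s_N)/(N + 1) of the partial sums s_N of `terms`."""
    partial_sums = accumulate(terms)        # s_0, s_1, ...
    running_totals = accumulate(partial_sums)  # s_0, s_0 + s_1, ...
    return [Fraction(total, N + 1) for N, total in enumerate(running_totals)]


# First six Cesaro means of 1 - 1 + 1 - 1 + ... (fractions are shown in lowest terms).
print(cesaro_means([(-1) ** n for n in range(6)]))
```

The printed fractions are the same values as 1/1, 1/2, 2/3, 2/4, 3/5, 3/6, reduced to lowest terms.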
Chapter 484 40G10 – Abel, Borel and power series methods
484.1 Abel summability
Abel summability is a generalized convergence criterion for power series. It extends the usual definition of the sum of a series, and gives a way of summing up certain divergent series. Let us start with a series $\sum_{n=0}^{\infty} a_n$, convergent or not, and use that series to define a power series
$f(r) = \sum_{n=0}^{\infty} a_n r^n.$
Note that for $|r| < 1$ the summability of $f(r)$ is easier to achieve than the summability of the original series. Starting with this observation we say that the series $\sum a_n$ is Abel summable if the defining series for $f(r)$ is convergent for all $|r| < 1$, and if $f(r)$ converges to some limit $L$ as $r \to 1^-$. If this is so, we shall say that $\sum a_n$ Abel converges to $L$.
Of course it is important to ask whether an ordinary convergent series is also Abel summable, and whether it converges to the same limit. This is true, and the result is known as Abel's convergence theorem, or simply as Abel's theorem.
Theorem 15 (Abel). Let $\sum_{n=0}^{\infty} a_n$ be a series; let
$s_N = a_0 + \dots + a_N, \quad N \in \mathbb{N},$
denote the corresponding partial sums; and let $f(r)$ be the corresponding power series defined as above. If $\sum a_n$ is convergent, in the usual sense that the $s_N$ converge to some limit $L$ as $N \to \infty$, then the series is also Abel summable and $f(r) \to L$ as $r \to 1^-$.
The standard example of a divergent series that is nonetheless Abel summable is the alternating series
$\sum_{n=0}^{\infty} (-1)^n.$
The corresponding power series is
$\frac{1}{1+r} = \sum_{n=0}^{\infty} (-1)^n r^n.$
Since
$\frac{1}{1+r} \to \frac{1}{2}$ as $r \to 1^-$,
this otherwise divergent series Abel converges to $\frac{1}{2}$.
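The example can be checked numerically by truncating the power series at a large number of terms and letting r approach 1 from below. This is an illustrative sketch; the function name, the truncation length, and the sample values of r are assumptions.

```python
def abel_power_series(coeff, r, n_terms=20_000):
    """Truncated value of f(r) = sum_n coeff(n) * r**n using n_terms terms."""
    total, power = 0.0, 1.0
    for n in range(n_terms):
        total += coeff(n) * power
        power *= r  # power == r**(n + 1) for the next iteration
    return total


def alternating(n):
    """Coefficients of the series 1 - 1 + 1 - 1 + ..."""
    return (-1) ** n


for r in (0.9, 0.99, 0.999):
    print(r, abel_power_series(alternating, r), 1 / (1 + r))
```

For each r the truncated sum should agree with 1/(1 + r), and these values drift toward 1/2 as r approaches 1.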
Abel's theorem is the prototype for a number of other theorems about convergence, which are collectively known in analysis as Abelian theorems. An important class of associated results are the so-called Tauberian theorems. These describe various convergence criteria, and sometimes provide partial converses for the various Abelian theorems.
The general converse to Abel's theorem is false, as the example above illustrates (see footnote 1). However, in the 1890's Tauber proved the following partial converse.
Theorem 16 (Tauber). Suppose that $\sum a_n$ is an Abel summable series and that $n a_n \to 0$ as $n \to \infty$. Then, $\sum_n a_n$ is convergent in the ordinary sense as well.
The proof of the above theorem is not hard, but the same cannot be said of the more general Tauberian theorems. The more famous of these are due to Hardy, Hardy-Littlewood, Wiener, and Ikehara. In all cases, the conclusion is that a certain series or a certain integral is convergent. However, the proofs are lengthy and require sophisticated techniques. Ikehara's theorem is especially noteworthy because it is used to prove the prime number theorem.
Version: 1 Owner: rmilson Author(s): rmilson
484.2
proof of Abel’s convergence theorem
Suppose that
$\sum_{n=0}^{\infty} a_n = L$
is a convergent series, and set
$f(r) = \sum_{n=0}^{\infty} a_n r^n.$
Convergence of the first series implies that $a_n \to 0$, and hence $f(r)$ converges for $|r| < 1$. We will show that $f(r) \to L$ as $r \to 1^-$.
1 We want the converse to be false; the whole idea is to describe a method of summing certain divergent series!
Let
$s_N = a_0 + \dots + a_N, \quad N \in \mathbb{N},$
denote the corresponding partial sums. Our proof relies on the following identity
$f(r) = \sum_n a_n r^n = (1 - r) \sum_n s_n r^n.$ (484.2.1)
The above identity obviously works at the level of formal power series. Indeed,
$a_0 + (a_1 + a_0) r + (a_2 + a_1 + a_0) r^2 + \dots$
$- (a_0 r + (a_1 + a_0) r^2 + \dots)$
$= a_0 + a_1 r + a_2 r^2 + \dots$
Since the partial sums $s_n$ converge to $L$, they are bounded, and hence $\sum_n s_n r^n$ converges for $|r| < 1$. Hence for $|r| < 1$, identity (484.2.1) is also a genuine functional equality.
Let $\varepsilon > 0$ be given. Choose an $N$ sufficiently large so that all partial sums $s_n$ with $n > N$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. It follows that for all $r$ such that $0 < r < 1$ the series
$(1 - r) \sum_{n=N+1}^{\infty} s_n r^n$
is sandwiched between $r^{N+1}(L - \varepsilon)$ and $r^{N+1}(L + \varepsilon)$. Note that
$f(r) = (1 - r) \sum_{n=0}^{N} s_n r^n + (1 - r) \sum_{n=N+1}^{\infty} s_n r^n.$
As $r \to 1^-$, the first term goes to 0. Hence, $\limsup f(r)$ and $\liminf f(r)$ as $r \to 1^-$ are sandwiched between $L - \varepsilon$ and $L + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, it follows that $f(r) \to L$ as $r \to 1^-$. QED
Version: 1 Owner: rmilson Author(s): rmilson
484.3
proof of Tauber's convergence theorem
Let
$f(z) = \sum_{n=0}^{\infty} a_n z^n,$
be a complex power series, convergent in the open disk $|z| < 1$. We suppose that
1. $n a_n \to 0$ as $n \to \infty$, and that
2. $f(r)$ converges to some finite $L$ as $r \to 1^-$;
and wish to show that $\sum_n a_n$ converges to the same $L$ as well.
Let $s_n = a_0 + \dots + a_n$, where $n = 0, 1, \dots$, denote the partial sums of the series in question. The enabling idea in Tauber's convergence result (as well as other Tauberian theorems) is the existence of a correspondence in the evolution of the $s_n$ as $n \to \infty$, and the evolution of $f(r)$ as $r \to 1^-$. Indeed we shall show that
$\left| s_n - f\left(\frac{n-1}{n}\right) \right| \to 0$ as $n \to \infty$. (484.3.1)
The desired result then follows in an obvious fashion.
For every real $0 < r < 1$ we have
$s_n = f(r) + \sum_{k=0}^{n} a_k (1 - r^k) - \sum_{k=n+1}^{\infty} a_k r^k.$
Setting
$\epsilon_n = \sup_{k > n} |k a_k|,$
and noting that
$1 - r^k = (1 - r)(1 + r + \dots + r^{k-1}) \leq k(1 - r),$
we have that
$|s_n - f(r)| \leq (1 - r) \sum_{k=0}^{n} |k a_k| + \frac{\epsilon_n}{n} \sum_{k=n+1}^{\infty} r^k.$
Setting $r = 1 - 1/n$ in the above inequality we get
$|s_n - f(1 - 1/n)| \leq \mu_n + \epsilon_n (1 - 1/n)^{n+1},$
where
$\mu_n = \frac{1}{n} \sum_{k=0}^{n} |k a_k|$
are the Cesàro means of the sequence $|k a_k|$, $k = 0, 1, \dots$ Since the latter sequence converges to zero, so do the means $\mu_n$, and the suprema $\epsilon_n$. Finally, Euler's formula for $e$ gives
$\lim_{n \to \infty} (1 - 1/n)^n = e^{-1}.$
The validity of (484.3.1) follows immediately. QED
Version: 1 Owner: rmilson Author(s): rmilson
Chapter 485 41A05 – Interpolation
485.1 Lagrange Interpolation formula
Let $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ be $n$ points in the plane ($x_i \neq x_j$ for $i \neq j$). Then there exists a unique polynomial $p(x)$ of degree at most $n - 1$ such that $y_i = p(x_i)$ for $i = 1, \dots, n$. Such a polynomial can be found using Lagrange's Interpolation formula:
$p(x) = \frac{f(x)}{(x - x_1) f'(x_1)} y_1 + \frac{f(x)}{(x - x_2) f'(x_2)} y_2 + \cdots + \frac{f(x)}{(x - x_n) f'(x_n)} y_n$
where $f(x) = (x - x_1)(x - x_2) \cdots (x - x_n)$.
To see this, notice that the above formula is the same as
$p(x) = y_1 \frac{(x - x_2)(x - x_3) \cdots (x - x_n)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)} + y_2 \frac{(x - x_1)(x - x_3) \cdots (x - x_n)}{(x_2 - x_1)(x_2 - x_3) \cdots (x_2 - x_n)} + \cdots + y_n \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_n - x_1)(x_n - x_2) \cdots (x_n - x_{n-1})}$
and that every polynomial in the numerators vanishes for all $x_i$ except one, and for that one $x_i$ the denominator makes the fraction equal to 1, so each $p(x_i)$ equals $y_i$.
Version: 4 Owner: drini Author(s): drini
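The second (product) form of the formula translates directly into code. The sketch below evaluates the interpolating polynomial at a point using exact rational arithmetic; the function name and the sample points are illustrative assumptions.

```python
from fractions import Fraction


def lagrange_eval(points, x):
    """Evaluate the Lagrange interpolating polynomial through `points` at x.

    `points` is a list of (x_i, y_i) pairs with pairwise distinct x_i.
    """
    x = Fraction(x)
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Factor (x - x_j)/(x_i - x_j); it is 0 at x = x_j and 1 at x = x_i.
                term *= (x - xj) / Fraction(xi - xj)
        total += term
    return total


# Three sample points lying on p(x) = 3x^2 - x + 1.
pts = [(0, 1), (1, 3), (2, 11)]
print([lagrange_eval(pts, xi) for xi, _ in pts])  # reproduces the y-values
print(lagrange_eval(pts, 3))                      # value of the same parabola at x = 3
```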
485.2
Simpson’s 3/8 rule
Simpson's 3/8 rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by
$\int_{x_0}^{x_3} f(x)\,dx \approx \frac{3h}{8}\left[f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)\right]$
where $h = \frac{x_3 - x_0}{3}$ is the spacing between the equally spaced points $x_0, x_1, x_2, x_3$.
Simpson's 3/8 rule is the third Newton-Cotes quadrature formula. It has degree of precision 3. This means it is exact for polynomials of degree less than or equal to three. Simpson's 3/8 rule is an improvement to the traditional Simpson's rule. The extra function evaluation gives a slightly more accurate approximation. We can see this with an example. Using the fundamental theorem of the calculus shows
$\int_0^{\pi} \sin(x)\,dx = 2.$
In this case Simpson's rule gives
$\int_0^{\pi} \sin(x)\,dx \approx \frac{\pi}{6}\left[\sin(0) + 4\sin\left(\frac{\pi}{2}\right) + \sin(\pi)\right] = 2.094$
However, Simpson's 3/8 rule does slightly better:
$\int_0^{\pi} \sin(x)\,dx \approx \frac{3}{8} \cdot \frac{\pi}{3}\left[\sin(0) + 3\sin\left(\frac{\pi}{3}\right) + 3\sin\left(\frac{2\pi}{3}\right) + \sin(\pi)\right] = 2.040$
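The two approximations in the example can be reproduced with a few lines of code. This is a minimal sketch of the single-interval (non-composite) rules; the function names are illustrative assumptions.

```python
import math


def simpson(f, a, b):
    """Simpson's rule on [a, b] with three equally spaced points."""
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))


def simpson_38(f, a, b):
    """Simpson's 3/8 rule on [a, b] with four equally spaced points, h = (b - a)/3."""
    h = (b - a) / 3
    return 3 * h / 8 * (f(a) + 3 * f(a + h) + 3 * f(a + 2 * h) + f(b))


print(round(simpson(math.sin, 0.0, math.pi), 3))     # Simpson's rule
print(round(simpson_38(math.sin, 0.0, math.pi), 3))  # Simpson's 3/8 rule
print(simpson_38(lambda x: x ** 3, 0.0, 2.0))        # a cubic is integrated exactly (up to rounding)
```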
Version: 4 Owner: tensorking Author(s): tensorking
485.3
trapezoidal rule
Definition 11. The trapezoidal rule is a method for approximating a definite integral by evaluating the integrand at finitely many points. The formal rule is given by
$\int_{x_0}^{x_1} f(x)\,dx \approx \frac{h}{2}\left[f(x_0) + f(x_1)\right]$
where $h = x_1 - x_0$.
The trapezoidal rule is the first Newton-Cotes quadrature formula. It has degree of precision 1. This means it is exact for polynomials of degree less than or equal to one. We can see this with a simple example.
Example 20. Using the fundamental theorem of the calculus shows
$\int_0^1 x\,dx = 1/2.$
In this case the trapezoidal rule gives the exact value,
$\int_0^1 x\,dx \approx \frac{1}{2}\left[f(0) + f(1)\right] = 1/2.$
It is important to note that most calculus books give the wrong definition of the trapezoidal rule. Typically they define a composite trapezoidal rule which uses the trapezoidal rule on a specified number of subintervals. Also note the trapezoidal rule can be derived by integrating a linear interpolation or using the method of undetermined coefficients. The latter is probably a bit easier.
Version: 6 Owner: tensorking Author(s): tensorking
Chapter 486 41A25 – Rate of convergence, degree of approximation
486.1 superconvergence
Let $x_i = |a_{i+1} - a_i|$, the difference between two successive entries of a sequence. The sequence $a_0, a_1, \dots$ superconverges if, when the $x_i$ are written in base 2, each number $x_i$ starts with $2^i - 1 \approx 2^i$ zeroes. The following sequence is superconverging to 0.
$x_{n+1} = x_n^2$

         (x_n)_10     (x_n)_2
x_0 =    1/2          .1
x_1 =    1/4          .01
x_2 =    1/16         .0001
x_3 =    1/256        .00000001
x_4 =    1/65536      .0000000000000001

In this case it is easy to see that the number of binary places doubles with each $x_n$.
Version: 8 Owner: slider142 Author(s): slider142
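The count of leading binary zeroes for the squaring sequence can be verified exactly. The sketch below is illustrative (the function names are assumptions); it generates $x_{n+1} = x_n^2$ from $x_0 = 1/2$ and counts the zeroes between the binary point and the first 1.

```python
from fractions import Fraction


def squaring_sequence(count, start=Fraction(1, 2)):
    """Return [x_0, ..., x_{count-1}] with x_{n+1} = x_n ** 2."""
    values, x = [], start
    for _ in range(count):
        values.append(x)
        x = x * x
    return values


def leading_binary_zeros(x):
    """Number of zeroes after the binary point before the first 1, for 0 < x < 1."""
    zeros = 0
    while x * 2 < 1:  # doubling shifts the binary expansion one place left
        x *= 2
        zeros += 1
    return zeros


for i, x in enumerate(squaring_sequence(5)):
    print(i, x, leading_binary_zeros(x), 2 ** i - 1)
```

The last two printed columns should coincide, matching the $2^i - 1$ leading zeroes required by the definition.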
Chapter 487 41A58 – Series expansions (e.g. Taylor, Lidstone series, but not Fourier series)
487.1
Taylor series
487.1.1
Taylor Series
Let $f$ be a function defined on any open interval containing 0. If $f$ possesses derivatives of all orders at 0, then
$T(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$
is called the Taylor series of $f$ about 0. We use 0 for simplicity, but any function with an infinitely differentiable point can be shifted such that this point becomes 0. $T_n(x)$, the "nth degree Taylor approximation" or a "Taylor series approximation to n terms" (see footnote 1), is defined as
$T_n(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!} x^k$
The remainder, $R_n(x)$, is defined as
$R_n(x) = f(x) - T_n(x)$
Also note that $f(x) = T(x)$ if and only if
$\lim_{n \to \infty} R_n(x) = 0$
For most functions one encounters in college calculus, $f(x) = T(x)$ near 0 (for example, polynomials and ratios of polynomials), and thus $\lim_{n \to \infty} R_n(x) = 0$. Taylor's theorem is typically invoked in order to show this (the theorem gives the specific form of the remainder). Taylor series approximations are extremely useful to linearize or otherwise reduce the analytical complexity of a function. Taylor series approximations are most useful when the magnitude of the terms falls off rapidly.
1. $T_n$ is often defined as the sum from $k = 0$ to $n$ rather than the sum from $k = 0$ to $n - 1$. This has the beneficial result of making the "nth degree Taylor approximation" a degree-n polynomial. However, the drawback is that $T_n$ is no longer an approximation "to n terms". The different definitions also give rise to slightly different statements of Taylor's Theorem. In sum, mind the context when dealing with Taylor series and Taylor's theorem.
487.1.2
Examples
Using the above definition of a Taylor series about 0, we have the following important series representations:
$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
$\sin x = \frac{x}{1!} - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$
$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$
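To see how quickly these series converge, one can sum a finite number of terms and compare with the library functions. This is a minimal sketch; the function names are illustrative, and the argument n counts nonzero terms of each series.

```python
import math


def taylor_exp(x, n):
    """Sum of the first n terms of the series for e^x (k = 0 .. n-1)."""
    return sum(x ** k / math.factorial(k) for k in range(n))


def taylor_sin(x, n):
    """Sum of the first n nonzero terms of the series for sin x."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1) for k in range(n))


def taylor_cos(x, n):
    """Sum of the first n nonzero terms of the series for cos x."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(n))


x = 0.5
for n in (1, 2, 4, 8):
    print(n, taylor_exp(x, n) - math.exp(x), taylor_sin(x, n) - math.sin(x), taylor_cos(x, n) - math.cos(x))
```

The printed differences are the remainders $R_n(x)$ for each function; they shrink rapidly because of the factorial in the denominators.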
487.1.3
Generalizations
Taylor series can also be extended to functions of more than one variable. The two-variable Taylor series of $f(x, y)$ about the origin is
$T(x, y) = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{f^{(i,j)}(0, 0)}{i!\,j!} x^i y^j$
where $f^{(i,j)}$ is the partial derivative of $f$ taken with respect to $x$ $i$ times and with respect to $y$ $j$ times. We can generalize this to $n$ variables, or functions $f(x)$, $x \in \mathbb{R}^{n \times 1}$. The Taylor series of this function of a vector is then
$T(x) = \sum_{i_1=0}^{\infty} \cdots \sum_{i_n=0}^{\infty} \frac{f^{(i_1, i_2, \dots, i_n)}(0)}{i_1!\, i_2! \cdots i_n!} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$
Version: 7 Owner: akrowne Author(s): akrowne
487.2
Taylor's Theorem
487.2.1
Taylor's Theorem
Let $f$ be a function which is defined on the interval $(a, b)$, with $a < 0 < b$, and suppose the nth derivative $f^{(n)}$ exists on $(a, b)$. Then for all nonzero $x$ in $(a, b)$,
$R_n(x) = \frac{f^{(n)}(y)}{n!} x^n$
with $y$ strictly between 0 and $x$ ($y$ depends on the choice of $x$). $R_n(x)$ is the nth remainder of the Taylor series for $f(x)$.
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 488 41A60 – Asymptotic approximations, asymptotic expansions (steepest descent, etc.)
488.1 Stirling’s approximation
Stirling's formula gives an approximation for $n!$, the factorial function. It is
$n! \approx \sqrt{2n\pi}\, n^n e^{-n}$
We can derive this from the gamma function. Note that for large $x$,
$\Gamma(x) = \sqrt{2\pi}\, x^{x - \frac{1}{2}} e^{-x + \mu(x)}$ (488.1.1)
where
$\mu(x) = \sum_{n=0}^{\infty} \left[\left(x + n + \frac{1}{2}\right) \ln\left(1 + \frac{1}{x + n}\right) - 1\right] = \frac{\theta}{12x}$
with $0 < \theta < 1$. Taking $x = n$ and multiplying by $n$, we have
$n! = \sqrt{2\pi}\, n^{n + \frac{1}{2}} e^{-n + \frac{\theta}{12n}}$ (488.1.2)
Taking the approximation for large $n$ gives us Stirling's formula.
There is also a big-O notation version of Stirling's approximation:
$n! = \sqrt{2\pi n} \left(\frac{n}{e}\right)^n \left(1 + O\left(\frac{1}{n}\right)\right)$ (488.1.3)
We can prove this equality starting from (488.1.2). It is clear that the big-O portion of (488.1.3) must come from $e^{\frac{\theta}{12n}}$, so we must consider the asymptotic behavior of $e^x$ for small $x$. First we observe that the Taylor series for $e^x$ is
$e^x = 1 + \frac{x}{1} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
But in our case we have $e$ to a vanishing exponent. Note that if we vary $x$ as $\frac{1}{n}$, we have as $n \to \infty$
$e^x = 1 + O\left(\frac{1}{n}\right)$
We can then (almost) directly plug this in to (488.1.2) to get (488.1.3) (note that the factor of 12 gets absorbed by the big-O notation).
Version: 16 Owner: drini Author(s): drini, akrowne
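The quality of the approximation, and the bound implied by (488.1.2), can be observed numerically. In this illustrative sketch (the function name is an assumption) the ratio $n!/(\sqrt{2\pi n}\,(n/e)^n)$ should lie strictly between 1 and $e^{1/(12n)}$, since $0 < \theta < 1$.

```python
import math


def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n


for n in (1, 5, 10, 50, 100):
    ratio = math.factorial(n) / stirling(n)
    # ratio - 1 behaves like 1/(12n), consistent with the 1 + O(1/n) factor.
    print(n, ratio, math.exp(1 / (12 * n)))
```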
Chapter 489 42-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
489.1 countable basis
A countable basis $\beta$ of a vector space $V$ over a field $F$ is a countable subset $\beta \subset V$ with the property that every element $v \in V$ can be written as an infinite series
$v = \sum_{x \in \beta} a_x x$
in exactly one way (where $a_x \in F$). We are implicitly assuming, without further comment, that the vector space $V$ has been given a topological structure or normed structure in which the above infinite sum is absolutely convergent (so that it converges to $v$ regardless of the order in which the terms are summed).
The archetypical example of a countable basis is the Fourier series of a function: every continuous real-valued periodic function $f$ on the unit circle $S^1 = \mathbb{R}/2\pi$ can be written as a Fourier series
$f(x) = \sum_{n=0}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx)$
in exactly one way.
Note: A countable basis is a countable set, but it is not usually a basis.
Version: 4 Owner: djao Author(s): djao
489.2
discrete cosine transform
The discrete cosine transform is closely related to the fast Fourier transform; it plays a role in coding signals and images [Jain89], e.g. in the widely used standard JPEG compression. The one-dimensional transform is defined by
$t(k) = c(k) \sum_{n=0}^{N-1} s(n) \cos\frac{\pi(2n + 1)k}{2N}$
where $s$ is the array of $N$ original values, $t$ is the array of $N$ transformed values, and the coefficients $c$ are given by
$c(0) = \sqrt{1/N}, \quad c(k) = \sqrt{2/N}$
for $1 \leq k \leq N - 1$. The discrete cosine transform in two dimensions, for a square matrix, can be written as
$t(i, j) = c(i, j) \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} s(m, n) \cos\frac{\pi(2m + 1)i}{2N} \cos\frac{\pi(2n + 1)j}{2N}$
with an analogous notation for $N$, $s$, $t$, and the $c(i, j)$ given by $c(0, j) = 1/N$, $c(i, 0) = 1/N$, and $c(i, j) = 2/N$ for both $i$ and $j \neq 0$. The DCT has an inverse, defined by
$s(n) = \sum_{k=0}^{N-1} c(k)\, t(k) \cos\frac{\pi(2n + 1)k}{2N}$
for the one-dimensional case, and
$s(m, n) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} c(i, j)\, t(i, j) \cos\frac{\pi(2m + 1)i}{2N} \cos\frac{\pi(2n + 1)j}{2N}$
for two dimensions. The DCT is included in commercial image processing packages, e.g. in Matlab.
References
• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
• [Jain89] A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, 1989.
Version: 4 Owner: akrowne Author(s): akrowne
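The one-dimensional transform and its inverse can be written directly from the sums. The sketch below is a naive O(N^2) illustration, not an optimized implementation; it assumes the normalisation c(0) = sqrt(1/N) and c(k) = sqrt(2/N) for k ≥ 1, under which the inverse formula recovers the input. The function names are illustrative.

```python
import math


def _c(k, N):
    """Normalisation coefficient (assumed): sqrt(1/N) for k = 0, sqrt(2/N) otherwise."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct_1d(s):
    """t(k) = c(k) * sum_n s(n) * cos(pi * (2n + 1) * k / (2N))."""
    N = len(s)
    return [
        _c(k, N) * sum(s[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
        for k in range(N)
    ]


def idct_1d(t):
    """s(n) = sum_k c(k) * t(k) * cos(pi * (2n + 1) * k / (2N))."""
    N = len(t)
    return [
        sum(_c(k, N) * t[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for k in range(N))
        for n in range(N)
    ]


signal = [1.0, 2.0, 3.0, 4.0, -1.5]
print(dct_1d(signal))
print(idct_1d(dct_1d(signal)))  # should reproduce `signal` up to rounding
```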
Chapter 490 42-01 – Instructional exposition (textbooks, tutorial papers, etc.)
490.1 Laplace transform
Let $f(t)$ be a function defined on the interval $[0, \infty)$. The Laplace transform of $f(t)$ is the function $F(s)$ defined by
$F(s) = \int_0^{\infty} e^{-st} f(t)\,dt,$
provided that the improper integral converges. We will usually denote the Laplace transform of $f$ by $L\{f(t)\}$. Some of the most common Laplace transforms are:
1. $L\{e^{at}\} = \frac{1}{s - a}$, $s > a$
2. $L\{\cos(bt)\} = \frac{s}{s^2 + b^2}$, $s > 0$
3. $L\{\sin(bt)\} = \frac{b}{s^2 + b^2}$, $s > 0$
4. $L\{t^n\} = \frac{n!}{s^{n+1}}$, $s > 0$.
Notice the Laplace transform is a linear transformation. Much like the Fourier transform, the Laplace transform has a convolution. The most popular usage of the Laplace transform is to solve initial value problems by taking the Laplace transform of both sides of an ordinary differential equation.
Version: 4 Owner: tensorking Author(s): tensorking
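The table entries can be sanity-checked by approximating the defining integral numerically. The sketch below truncates the integral at a finite upper limit and applies a composite trapezoidal rule; the function name, the cutoff, the step count, and the sample values of s, a, b are all illustrative assumptions, and the method is only suitable when the integrand decays quickly.

```python
import math


def laplace_numeric(f, s, upper=20.0, steps=100_000):
    """Approximate the integral of exp(-s t) f(t) over [0, upper] by the trapezoidal rule."""
    h = upper / steps
    total = 0.5 * (f(0.0) + math.exp(-s * upper) * f(upper))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-s * t) * f(t)
    return total * h


s, a, b = 2.0, 1.0, 3.0
print(laplace_numeric(lambda t: math.exp(a * t), s), 1 / (s - a))
print(laplace_numeric(lambda t: math.cos(b * t), s), s / (s ** 2 + b ** 2))
print(laplace_numeric(lambda t: math.sin(b * t), s), b / (s ** 2 + b ** 2))
print(laplace_numeric(lambda t: t ** 2, s), math.factorial(2) / s ** 3)
```

Each line prints the numerical value next to the closed form from the table; the two should agree to several decimal places.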
Chapter 491 42A05 – Trigonometric polynomials, inequalities, extremal problems
491.1 Chebyshev polynomial
We can always express $\cos(kt)$ as a polynomial of $\cos(t)$. Examples:
$\cos(1t) = \cos(t)$
$\cos(2t) = 2(\cos(t))^2 - 1$
$\cos(3t) = 4(\cos(t))^3 - 3\cos(t)$
...
This fact can be proved using the formula for cosine of angle-sum. If we write $x = \cos t$ we obtain the Chebyshev polynomials of first kind, that is
$T_n(x) = \cos(nt)$
where $x = \cos t$. So we have
$T_0(x) = 1$
$T_1(x) = x$
$T_2(x) = 2x^2 - 1$
$T_3(x) = 4x^3 - 3x$
...
These polynomials satisfy the recurrence relation:
$T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)$
for $n = 1, 2, \dots$
Version: 4 Owner: drini Author(s): drini
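The recurrence gives a simple way to generate the coefficients of $T_n$ and to check the defining property $T_n(\cos t) = \cos(nt)$. This is a minimal sketch; the function names and the coefficient-list representation (lowest degree first) are illustrative choices.

```python
import math


def chebyshev_coeffs(n):
    """Coefficients of T_n, lowest degree first, via T_{k+1} = 2x T_k - T_{k-1}."""
    prev, cur = [1], [0, 1]  # T_0 = 1, T_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = [0] + [2 * c for c in cur]  # multiply T_k by 2x
        for i, c in enumerate(prev):      # subtract T_{k-1}
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur


def poly_eval(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients (Horner's scheme)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


print(chebyshev_coeffs(2))  # coefficients of 2x^2 - 1
print(chebyshev_coeffs(3))  # coefficients of 4x^3 - 3x
t = 0.7
print(poly_eval(chebyshev_coeffs(5), math.cos(t)), math.cos(5 * t))
```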
Chapter 492 42A16 – Fourier coeﬃcients, Fourier series of functions with special properties, special Fourier series
492.1 Riemann-Lebesgue lemma
Proposition. Let $f : [a, b] \to \mathbb{C}$ be a measurable function. If $f$ is $L^1$ integrable, that is to say if the Lebesgue integral of $|f|$ is finite, then
\[ \int_a^b f(x) e^{inx}\,dx \to 0 \quad \text{as } n \to \pm\infty. \]
The above result, commonly known as the Riemann-Lebesgue lemma, is of basic importance in harmonic analysis. It is equivalent to the assertion that the Fourier coefficients $\hat{f}_n$ of a periodic, integrable function $f(x)$ tend to 0 as $n \to \pm\infty$. The proof can be organized into 3 steps.
Step 1. An elementary calculation shows that
\[ \int_I e^{inx}\,dx \to 0 \quad \text{as } n \to \pm\infty \]
for every interval $I \subset [a, b]$. The proposition is therefore true for all step functions with support in $[a, b]$.
Step 2. By the monotone convergence theorem, the proposition is true for all positive functions, integrable on $[a, b]$.
Step 3. Let $f$ be an arbitrary measurable function, integrable on $[a, b]$. The proposition is true for such a general $f$, because one can always write $f = g - h$, where $g$ and $h$ are positive functions, integrable on $[a, b]$.
Version: 2 Owner: rmilson Author(s): rmilson
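The decay asserted by the lemma can be observed numerically. This is a rough sketch under stated assumptions: the test function `f` (a step followed by a parabola on $[0, 2]$) is a hypothetical choice, and the midpoint rule with a fixed number of steps only resolves moderate values of $n$.

```python
import cmath

def oscillatory_integral(f, a, b, n, steps=20_000):
    """Midpoint-rule approximation of integral_a^b f(x) e^{i n x} dx."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += f(x) * cmath.exp(1j * n * x)
    return total * h

# Hypothetical integrable test function with a jump at x = 0.7.
f = lambda x: 1.0 if x < 0.7 else x * x

for n in (1, 10, 100):
    print(n, abs(oscillatory_integral(f, 0.0, 2.0, n)))
```

The printed magnitudes shrink as $n$ grows, roughly like $1/n$ for this piecewise smooth example.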
492.2 example of Fourier series
Here we present an example of Fourier series:
Example: Let $f : \mathbb{R} \to \mathbb{R}$ be the "identity" function, defined by $f(x) = x$ for all $x \in \mathbb{R}$. We will compute the Fourier coefficients for this function. Notice that $\cos(nx)$ is an even function, while $f$ and $\sin(nx)$ are odd functions.
\[ a_0^f = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} x\,dx = 0 \]
\[ a_n^f = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} x\cos(nx)\,dx = 0 \]
\[ b_n^f = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin(nx)\,dx = \frac{1}{\pi}\int_{-\pi}^{\pi} x\sin(nx)\,dx = \frac{2}{\pi}\int_0^{\pi} x\sin(nx)\,dx = \frac{2}{\pi}\left[\frac{\sin(nx)}{n^2} - \frac{x\cos(nx)}{n}\right]_0^{\pi} = (-1)^{n+1}\frac{2}{n} \]
Notice that $a_0^f$, $a_n^f$ are 0 because $x$ and $x\cos(nx)$ are odd functions. Hence the Fourier series for $f(x) = x$ is:
\[ f(x) = x = a_0^f + \sum_{n=1}^{\infty}\left(a_n^f\cos(nx) + b_n^f\sin(nx)\right) = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{2}{n}\sin(nx), \quad \forall x \in (-\pi, \pi) \]
For an application of this Fourier series, see value of the Riemann zeta function at s = 2. Version: 4 Owner: alozano Author(s): alozano
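The coefficients and the series computed in this example can be verified numerically. The sketch below is an illustration only; the function names `b_coeff` and `partial_sum`, and the step and term counts, are assumptions.

```python
import math

def b_coeff(n, steps=20_000):
    """Midpoint-rule approximation of b_n = (1/pi) * integral_{-pi}^{pi} x sin(nx) dx."""
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += x * math.sin(n * x)
    return total * h / math.pi

def partial_sum(x, terms):
    """Partial sum of sum_{n>=1} (-1)^{n+1} (2/n) sin(nx)."""
    return sum((-1) ** (n + 1) * 2.0 / n * math.sin(n * x) for n in range(1, terms + 1))

for n in range(1, 5):
    print(n, b_coeff(n), (-1) ** (n + 1) * 2.0 / n)

print(partial_sum(1.0, 5000))   # slowly approaches f(1.0) = 1.0
```

Convergence of the partial sums is slow (the error decays like the reciprocal of the number of terms), which is typical for a function whose periodic extension has jumps.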
Chapter 493 42A20 – Convergence and absolute convergence of Fourier and trigonometric series
493.1 Dirichlet conditions
Let $f$ be a piecewise regular real-valued function defined on some interval $[a, b]$, such that $f$ has only a finite number of discontinuities and extrema in $[a, b]$. Then the Fourier series of this function converges to $f$ at every point where $f$ is continuous, and to the arithmetic mean of the left-hand and right-hand limits of $f$ at every point where it is discontinuous.
Version: 3 Owner: mathwizard Author(s): mathwizard
Chapter 494 42A38 – Fourier and Fourier-Stieltjes transforms and other transforms of Fourier type
494.1 Fourier transform
The Fourier transform $F(s)$ of a function $f(t)$ is defined as follows:
\[ F(s) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ist} f(t)\,dt. \]
The Fourier transform exists if $f$ is Lebesgue integrable on the whole real axis. If $f$ is Lebesgue integrable and can be divided into a finite number of continuous, monotone functions and at every point both one-sided limits exist, the Fourier transform can be inverted:
\[ f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{ist} F(s)\,ds. \]
Sometimes the Fourier transform is also defined without the factor $\frac{1}{\sqrt{2\pi}}$ in one direction, in which case the transform in the other direction carries a factor $\frac{1}{2\pi}$. So when looking a transform up in a table you should find out how it is defined in that table.
The Fourier transform has some important properties when solving differential equations. We denote the Fourier transform of $f$ with respect to $t$ in terms of $s$ by $\mathcal{F}_t(f)$.
• $\mathcal{F}_t(af + bg) = a\mathcal{F}_t(f) + b\mathcal{F}_t(g)$, where $a$ and $b$ are real constants and $f$ and $g$ are real functions.
• $\mathcal{F}_t\left(\frac{\partial}{\partial t} f\right) = is\,\mathcal{F}_t(f)$.
• $\mathcal{F}_t\left(\frac{\partial}{\partial x} f\right) = \frac{\partial}{\partial x}\mathcal{F}_t(f)$.
• We define the bilateral convolution of two functions $f_1$ and $f_2$ as:
\[ (f_1 * f_2)(t) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f_1(\tau) f_2(t - \tau)\,d\tau. \]
Then the following equation holds:
\[ \mathcal{F}_t((f_1 * f_2)(t)) = \mathcal{F}_t(f_1)\cdot\mathcal{F}_t(f_2). \]
If $f(t)$ is some signal (maybe a sound wave) then the frequency domain of $f$ is given as $\mathcal{F}_t(f)$. Rayleigh's theorem states that then the energy $E$ carried by the signal $f$, given by
\[ E = \int_{-\infty}^{\infty} |f(t)|^2\,dt, \]
can also be expressed as:
\[ E = \int_{-\infty}^{\infty} |\mathcal{F}_t(f)(s)|^2\,ds. \]
In general we have:
\[ \int_{-\infty}^{\infty} |f(t)|^2\,dt = \int_{-\infty}^{\infty} |\mathcal{F}_t(f)(s)|^2\,ds. \]
Version: 9 Owner: mathwizard Author(s): mathwizard
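With the symmetric $1/\sqrt{2\pi}$ normalization used in this entry, the Gaussian $f(t) = e^{-t^2/2}$ is its own Fourier transform, $F(s) = e^{-s^2/2}$ (a standard fact, not stated in the entry itself). The sketch below checks this numerically; the function name `fourier_transform`, the truncation `cutoff` and the step count are illustrative assumptions.

```python
import cmath
import math

def fourier_transform(f, s, cutoff=12.0, steps=24_000):
    """Approximate F(s) = (1/sqrt(2 pi)) * integral e^{-i s t} f(t) dt over [-cutoff, cutoff].

    Assumes f is negligible outside [-cutoff, cutoff]; uses the midpoint rule.
    """
    h = 2 * cutoff / steps
    total = 0j
    for k in range(steps):
        t = -cutoff + (k + 0.5) * h
        total += cmath.exp(-1j * s * t) * f(t)
    return total * h / math.sqrt(2 * math.pi)

gaussian = lambda t: math.exp(-t * t / 2)

for s in (0.0, 1.0, 2.5):
    print(s, fourier_transform(gaussian, s), math.exp(-s * s / 2))
```

Under the other common conventions mentioned above, the same computation would differ by a constant factor, which is why the normalization should always be checked.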
Chapter 495 42A99 – Miscellaneous
495.1 Poisson summation formula
Let $f : \mathbb{R} \to \mathbb{R}$ be a once-differentiable, square-integrable function. Let
\[ f^\vee(y) = \int_{\mathbb{R}} f(x) e^{2\pi i x y}\,dx \]
be its Fourier transform. Then
\[ \sum_n f(n) = \sum_n f^\vee(n). \]
By convention, sums are over all integers.
Let $g(x) = \sum_n f(x + n)$. This sum converges absolutely, since $f$ is square integrable, so $g$ is differentiable, and periodic. Thus, the Fourier series $\sum_n f^\vee(n) e^{2\pi i n x}$ converges pointwise to $g$. Evaluating our two sums for $g$ at $x = 0$, we find
\[ \sum_n f(n) = g(0) = \sum_n f^\vee(n). \]
Version: 5 Owner: bwebste Author(s): bwebste
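A quick numerical illustration of the formula uses a Gaussian. For $f(x) = e^{-\pi a x^2}$ with $a > 0$, the transform under the convention above is $f^\vee(y) = a^{-1/2} e^{-\pi y^2/a}$ (a standard computation, assumed here rather than taken from the entry). The names `lattice_sum`, `f`, `f_hat` and the value of `a` are illustrative choices.

```python
import math

def lattice_sum(g, cutoff=50):
    """Sum g(n) over the integers n with |n| <= cutoff."""
    return sum(g(n) for n in range(-cutoff, cutoff + 1))

a = 0.3   # hypothetical width parameter; any a > 0 works
f = lambda x: math.exp(-math.pi * a * x * x)
f_hat = lambda y: math.exp(-math.pi * y * y / a) / math.sqrt(a)

print(lattice_sum(f), lattice_sum(f_hat))
```

Both sums are truncated at the same cutoff; the Gaussian tails are far below floating-point precision there, so the two printed values should agree to rounding error.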
Chapter 496 42B05 – Fourier series and coeﬃcients
496.1 Parseval equality
Let $f$ be a Riemann integrable function from $[-\pi, \pi]$ to $\mathbb{R}$. The equation
\[ \frac{1}{\pi}\int_{-\pi}^{\pi} f^2(x)\,dx = 2(a_0^f)^2 + \sum_{k=1}^{\infty}\left[(a_k^f)^2 + (b_k^f)^2\right], \]
where $a_0^f$, $a_k^f$, $b_k^f$ are the Fourier coefficients of the function $f$, is usually known as Parseval's equality or Parseval's theorem.
Version: 3 Owner: vladm Author(s): vladm
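As a sanity check, the equality can be tested on $f(x) = x$, whose coefficients were computed in the Fourier series example of Chapter 492: $a_0^f = a_k^f = 0$ and $b_k^f = (-1)^{k+1}\,2/k$. The left side is then $\frac{1}{\pi}\int_{-\pi}^{\pi} x^2\,dx = 2\pi^2/3$. The sketch below compares it with a partial sum of the right side; the number of terms is an arbitrary choice.

```python
import math

lhs = 2 * math.pi ** 2 / 3          # (1/pi) * integral_{-pi}^{pi} x^2 dx, evaluated in closed form
rhs = sum((2.0 / k) ** 2 for k in range(1, 100_001))   # sum of (b_k)^2 with b_k = +-2/k

print(lhs, rhs)   # the partial sum approaches lhs from below
```

The gap between the two values is the tail of the series, which is about 4 divided by the number of terms used.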
496.2 Wirtinger's inequality
Theorem: Let $f : \mathbb{R} \to \mathbb{R}$ be a periodic function of period $2\pi$, which is continuous and has a continuous derivative throughout $\mathbb{R}$, and such that
\[ \int_0^{2\pi} f(x)\,dx = 0. \tag{496.2.1} \]
Then
\[ \int_0^{2\pi} f'^2(x)\,dx \ge \int_0^{2\pi} f^2(x)\,dx \tag{496.2.2} \]
with equality iff $f(x) = a\sin x + b\cos x$ for some $a$ and $b$ (or equivalently $f(x) = c\sin(x + d)$ for some $c$ and $d$).
Proof: Since Dirichlet's conditions are met, we can write
\[ f(x) = \frac{1}{2}a_0 + \sum_{n \ge 1}(a_n\sin nx + b_n\cos nx) \]
and moreover $a_0 = 0$ by (496.2.1). By Parseval's identity,
\[ \int_0^{2\pi} f^2(x)\,dx = \pi\sum_{n=1}^{\infty}(a_n^2 + b_n^2) \]
and
\[ \int_0^{2\pi} f'^2(x)\,dx = \pi\sum_{n=1}^{\infty} n^2(a_n^2 + b_n^2), \]
and since the summands are all $\ge 0$, we get (496.2.2), with equality iff $a_n = b_n = 0$ for all $n \ge 2$.
Hurwitz used Wirtinger's inequality in his tidy 1904 proof of the isoperimetric inequality.
Version: 2 Owner: matte Author(s): Larry Hammick
Chapter 497 43A07 – Means on groups, semigroups, etc.; amenable groups
497.1 amenable group
Let $G$ be a locally compact group and $L^\infty(G)$ be the Banach space of all essentially bounded functions $G \to \mathbb{R}$ with respect to the Haar measure.
Definition 12. A linear functional on $L^\infty(G)$ is called a mean if it maps the constant function $f(g) = 1$ to 1 and non-negative functions to non-negative numbers.
Definition 13. Let $L_g$ be the left action of $g \in G$ on $f \in L^\infty(G)$, i.e. $(L_g f)(h) = f(gh)$. Then, a mean $\mu$ is said to be left invariant if $\mu(L_g f) = \mu(f)$ for all $g \in G$ and $f \in L^\infty(G)$. Similarly, $\mu$ is right invariant if $\mu(R_g f) = \mu(f)$, where $R_g$ is the right action $(R_g f)(h) = f(hg)$.
Definition 14. A locally compact group $G$ is amenable if there is a left (or right) invariant mean on $L^\infty(G)$.
Example 21 (Amenable groups). All finite groups and all abelian groups are amenable. Compact groups are amenable as the Haar measure is a (unique) invariant mean.
Example 22 (Non-amenable groups). If a group contains a free (non-abelian) subgroup on two generators then it is not amenable.
Version: 5 Owner: mhale Author(s): mhale
Chapter 498 44A35 – Convolution
498.1 convolution
Introduction
The convolution of two functions $f, g : \mathbb{R} \to \mathbb{R}$ is the function
\[ (f * g)(u) = \int_{-\infty}^{\infty} f(x) g(u - x)\,dx. \]
In a sense, $(f * g)(u)$ is the sum of all the terms $f(x)g(y)$ where $x + y = u$. Such sums occur when investigating sums of independent random variables, and discrete versions appear in the coefficients of products of polynomials and power series. Convolution is an important tool in data processing, in particular in digital signal and image processing. We will first define the concept in various general settings, discuss its properties and then list several convolutions of probability distributions.
Definitions
If $G$ is a locally compact abelian topological group with Haar measure $\mu$ and $f$ and $g$ are measurable functions on $G$, we define the convolution
\[ (f * g)(u) := \int_G f(x) g(u - x)\,d\mu(x) \]
whenever the right hand side integral exists (this is for instance the case if $f \in L^p(G, \mu)$, $g \in L^q(G, \mu)$ and $1/p + 1/q = 1$).
The case $G = \mathbb{R}^n$ is the most important one, but $G = \mathbb{Z}$ is also useful, since it recovers the convolution of sequences which occurs when computing the coefficients of a product of polynomials or power series. The case $G = \mathbb{Z}_n$ yields the so-called cyclic convolution which is often discussed in connection with the discrete Fourier transform.
The (Dirichlet) convolution of multiplicative functions considered in number theory does not quite fit the above definition, since there the functions are defined on a commutative monoid (the natural numbers under multiplication) rather than on an abelian group.
If $X$ and $Y$ are independent random variables with probability densities $f_X$ and $f_Y$ respectively, and if $X + Y$ has a probability density, then this density is given by the convolution $f_X * f_Y$. This motivates the following definition: for probability distributions $P$ and $Q$ on $\mathbb{R}^n$, the convolution $P * Q$ is the probability distribution on $\mathbb{R}^n$ given by
\[ (P * Q)(A) := (P \times Q)(\{(x, y) \mid x + y \in A\}) \]
for every Borel set $A$.
The convolution of two distributions $u$ and $v$ on $\mathbb{R}^n$ is defined by
\[ (u * v)(\phi) = u(\psi) \]
for any test function $\phi$ for $v$, assuming that $\psi(t) := v(\phi(\cdot + t))$ is a suitable test function for $u$.
Properties
The convolution operation, when defined, is commutative, associative and distributive with respect to addition. For any $f$ we have
\[ f * \delta = f \]
where $\delta$ is the Dirac delta distribution. The Fourier transform $F$ translates between convolution and pointwise multiplication:
\[ F(f * g) = F(f) \cdot F(g). \]
Because of the availability of the Fast Fourier Transform and its inverse, this latter relation is often used to quickly compute discrete convolutions, and in fact the fastest known algorithms for the multiplication of numbers and polynomials are based on this idea.
Some convolutions of probability distributions
• The convolution of two normal distributions with zero mean and variances $\sigma_1^2$ and $\sigma_2^2$ is a normal distribution with zero mean and variance $\sigma^2 = \sigma_1^2 + \sigma_2^2$.
• The convolution of two $\chi^2$ distributions with $f_1$ and $f_2$ degrees of freedom is a $\chi^2$ distribution with $f_1 + f_2$ degrees of freedom.
• The convolution of two Poisson distributions with parameters $\lambda_1$ and $\lambda_2$ is a Poisson distribution with parameter $\lambda = \lambda_1 + \lambda_2$.
• The convolution of an exponential and a normal distribution is approximated by another exponential distribution. If the original exponential distribution has density
\[ f(x) = \frac{e^{-x/\tau}}{\tau} \quad (x \ge 0) \qquad \text{or} \qquad f(x) = 0 \quad (x < 0), \]
and the normal distribution has zero mean and variance $\sigma^2$, then for $u \gg \sigma$ the probability density of the sum is
\[ f(u) \approx \frac{e^{-u/\tau + \sigma^2/(2\tau^2)}}{\tau}. \]
In a semi-logarithmic diagram where $\log(f_X(x))$ is plotted versus $x$ and $\log(f(u))$ versus $u$, the latter lies by the amount $\sigma^2/(2\tau^2)$ higher than the former but both are represented by parallel straight lines, the slope of which is determined by the parameter $\tau$.
• The convolution of a uniform and a normal distribution results in a quasi-uniform distribution smeared out at its edges. If the original distribution is uniform in the region $a \le x < b$ and vanishes elsewhere and the normal distribution has zero mean and variance $\sigma^2$, the probability density of the sum is
\[ f(u) = \frac{\psi_0((u - a)/\sigma) - \psi_0((u - b)/\sigma)}{b - a}, \]
where
\[ \psi_0(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt \]
is the distribution function of the standard normal distribution. For $\sigma \to 0$, the function $f(u)$ vanishes for $u < a$ and $u > b$ and is equal to $1/(b - a)$ in between. For finite $\sigma$ the sharp steps at $a$ and $b$ are rounded off over a width of the order $2\sigma$.
References
• Adapted with permission from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.htm Version: 12 Owner: akrowne Author(s): akrowne, AxelBoldt
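The case $G = \mathbb{Z}$ described above is easy to try out. The following sketch is illustrative only (the helper names `convolve` and `poisson_pmf` are assumptions): it convolves two coefficient sequences to multiply polynomials, and then checks the statement that the convolution of two Poisson distributions is again Poisson.

```python
import math

def convolve(a, b):
    """Discrete convolution (a * b)[u] = sum over x of a[x] * b[u - x] for finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poisson_pmf(lam, size):
    """First `size` probabilities of a Poisson distribution with parameter lam."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(size)]

# (1 + 2x)(3 + x + x^2) = 3 + 7x + 3x^2 + 2x^3
print(convolve([1, 2], [3, 1, 1]))

# Poisson(1.5) * Poisson(2.0) agrees with Poisson(3.5) on the first entries.
size = 10
summed = convolve(poisson_pmf(1.5, size), poisson_pmf(2.0, size))[:size]
print(max(abs(s - p) for s, p in zip(summed, poisson_pmf(3.5, size))))
```

The direct double loop costs a number of operations proportional to the product of the two lengths; the FFT-based approach mentioned under Properties is what one would use for long sequences.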
Chapter 499 46-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
499.1 balanced set
Definition [3, 1, 2, 1] Let $V$ be a vector space over $\mathbb{R}$ (or $\mathbb{C}$), and let $S$ be a subset of $V$. If $\lambda S \subset S$ for all scalars $\lambda$ such that $|\lambda| \le 1$, then $S$ is a balanced set in $V$. Here, $\lambda S = \{\lambda s \mid s \in S\}$, and $|\cdot|$ is the absolute value (in $\mathbb{R}$), or the modulus of a complex number (in $\mathbb{C}$).
Examples and properties
1. Let $V$ be a normed space with norm $\|\cdot\|$. Then the unit ball $\{v \in V \mid \|v\| \le 1\}$ is a balanced set.
2. Any vector subspace is a balanced set. Thus, in $\mathbb{R}^3$, lines and planes passing through the origin are balanced sets.
3. Any nonempty balanced set contains the zero vector [1].
4. The union and intersection of an arbitrary collection of balanced sets is again a balanced set [2].
5. Suppose $f$ is a linear map between two vector spaces. Then both $f$ and $f^{-1}$ (the inverse image of $f$) map balanced sets into balanced sets [3, 2].
Definition Suppose $S$ is a set in a vector space $V$. Then the balanced hull of $S$, denoted by eq($S$), is the smallest balanced set containing $S$. The balanced core of $S$ is defined as the largest balanced set contained in $S$.
Proposition Let $S$ be a set in a vector space.
1. For eq($S$) we have [1, 1] eq($S$) $= \{\lambda a \mid a \in S, |\lambda| \le 1\}$.
2. The balanced hull of $S$ is the intersection of all balanced sets containing $S$ [1, 2].
3. The balanced core of $S$ is the union of all balanced sets contained in $S$ [2].
4. The balanced core of $S$ is nonempty if and only if the zero vector belongs to $S$ [2].
5. If $S$ is a closed set in a topological vector space, then the balanced core is also a closed set [2].
Notes A balanced set is also sometimes called circled [2]. The term balanced envelope is also used for the balanced hull [1]. Bourbaki uses the term équilibré [1], cf. eq($S$) above. In [4], a balanced set is defined as above, but with the condition $|\lambda| = 1$ instead of $|\lambda| \le 1$.
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
3. J. Horváth, Topological Vector Spaces and Distributions, Addison-Wesley Publishing Company, 1966.
4. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
5. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.
Version: 7 Owner: matte Author(s): matte
499.2 bounded function
Definition Suppose $X$ is a nonempty set. Then a function $f : X \to \mathbb{C}$ is a bounded function if there exists a $C < \infty$ such that $|f(x)| < C$ for all $x \in X$. The set of all bounded functions on $X$ is usually denoted by $B(X)$ ([1], pp. 61).
Under standard point-wise addition and point-wise multiplication by a scalar, $B(X)$ is a complex vector space. If $f \in B(X)$, then the sup-norm, or uniform norm, of $f$ is defined as
\[ \|f\|_\infty = \sup_{x \in X} |f(x)|. \]
It is straightforward to check that $\|\cdot\|_\infty$ makes $B(X)$ into a normed vector space, i.e., to check that $\|\cdot\|_\infty$ satisfies the assumptions for a norm.
Example Suppose $X$ is a compact topological space. Further, let $C(X)$ be the set of continuous complex-valued functions on $X$ (with the same vector space structure as $B(X)$). Then $C(X)$ is a vector subspace of $B(X)$.
REFERENCES
1. C.D. Aliprantis, O. Burkinshaw, Principles of Real Analysis, 2nd ed., Academic Press, 1990.
Version: 3 Owner: matte Author(s): matte
499.3 bounded set (in a topological vector space)
Definition [3, 1, 1] Suppose $B$ is a subset of a topological vector space $V$. Then $B$ is a bounded set if for every neighborhood $U$ of the zero vector in $V$, there exists a scalar $\lambda$ such that $B \subset \lambda U$.
Theorem If $K$ is a compact set in a topological vector space, then $K$ is bounded. ([3], pp. 12)
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. F.A. Valentine, Convex sets, McGraw-Hill Book Company, 1964.
3. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
Version: 2 Owner: matte Author(s): matte
499.4 cone
Definition [4, 2, 1] Suppose $V$ is a real (or complex) vector space with a subset $C$.
1. If $\lambda C \subset C$ for any real $\lambda > 0$, then $C$ is a cone.
2. If the origin belongs to a cone, then the cone is pointed. Otherwise, the cone is blunt.
3. A pointed cone is salient, if it contains no 1-dimensional vector subspace of $V$.
4. If $C - x_0$ is a cone for some $x_0$ in $V$, then $C$ is a cone with vertex at $x_0$.
Examples
1. In $\mathbb{R}$, the set $x > 0$ is a salient blunt cone.
2. Suppose $x \in \mathbb{R}^n$. Then for any $\varepsilon > 0$, the set
\[ C = \bigcup\{\lambda B_x(\varepsilon) \mid \lambda > 0\} \]
is an open cone. If $|x| < \varepsilon$, then $C = \mathbb{R}^n$. Here, $B_x(\varepsilon)$ is the open ball at $x$ with radius $\varepsilon$.
Properties
1. The union and intersection of a collection of cones is a cone.
2. A set $C$ in a real (or complex) vector space is a convex cone if and only if [2, 1]
\[ \lambda C \subset C \text{ for all } \lambda > 0, \qquad C + C \subset C. \]
3. For a convex pointed cone $C$, the set $C \cap (-C)$ is the largest vector subspace contained in $C$ [2, 1].
4. A pointed convex cone $C$ is salient if and only if $C \cap (-C) = \{0\}$ [1].
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.
2. J. Horváth, Topological Vector Spaces and Distributions, Addison-Wesley Publishing Company, 1966.
3. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
Version: 4 Owner: bwebste Author(s): matte
499.5 locally convex topological vector space
Deﬁnition Let V be a topological vector space. If the topology of V has a basis where each member is a convex set, then V is a locally convex topological vector space [1].
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999.
Version: 2 Owner: matte Author(s): matte
499.6 sequential characterization of boundedness
Theorem [3, 1] A set $B$ in a real (or possibly complex) topological vector space $V$ is bounded if and only if the following condition holds: If $\{z_i\}_{i=1}^{\infty}$ is a sequence in $B$, and $\{\lambda_i\}_{i=1}^{\infty}$ is a sequence of scalars (in $\mathbb{R}$ or $\mathbb{C}$), such that $\lambda_i \to 0$, then $\lambda_i z_i \to 0$ in $V$.
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
2. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
Version: 4 Owner: bwebste Author(s): matte
499.7 symmetric set
Definition [1, 3] Suppose $A$ is a set in a vector space. Then $A$ is a symmetric set, if $A = -A$. Here, $-A = \{-a \mid a \in A\}$. In other words, $A$ is symmetric if for any $a \in A$ also $-a \in A$.
Examples
1. In $\mathbb{R}$, examples of symmetric sets are intervals of the type $(-k, k)$ with $k > 0$, and the sets $\mathbb{Z}$ and $\{-1, 1\}$.
2. Any vector subspace in a vector space is a symmetric set.
3. If $A$ is any set in a vector space, then $A \cup -A$ [1] and $A \cap -A$ are symmetric sets.
REFERENCES
1. R. Cristescu, Topological vector spaces, Noordhoff International Publishing, 1977.
2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
Version: 1 Owner: matte Author(s): matte
Chapter 500 46A30 – Open mapping and closed graph theorems; completeness (including $B$-, $B_r$-completeness)
500.1 closed graph theorem
A linear mapping between two Banach spaces X and Y is continuous if and only if its graph is a closed subset of X × Y (with the product topology). Version: 4 Owner: Koro Author(s): Koro
500.2 open mapping theorem
There are two important theorems having this name. In the context of functions of a complex variable: Theorem. Every nonconstant analytic function on a region is an open mapping. In the context of functional analysis: Theorem. Every surjective continuous linear mapping between two Banach spaces is an open mapping. Version: 8 Owner: Koro Author(s): Koro
Chapter 501 46A99 – Miscellaneous
501.1 Heine-Cantor theorem
Let X, Y be uniform spaces, and f : X → Y a continuous function. If X is compact, then f is uniformly continuous. For instance, if f : [a, b] → R is a continuous function, then it is uniformly continuous. Version: 6 Owner: n3o Author(s): n3o
501.2 proof of Heine-Cantor theorem
We prove this theorem in the case when $X$ and $Y$ are metric spaces. Suppose $f$ is not uniformly continuous. Then
\[ \exists \epsilon > 0\ \ \forall \delta > 0\ \ \exists x, y \in X: \quad d(x, y) < \delta \ \text{ but } \ d(f(x), f(y)) \ge \epsilon. \]
In particular by letting $\delta = 1/k$ we can construct two sequences $x_k$ and $y_k$ such that
\[ d(x_k, y_k) < 1/k \quad \text{and} \quad d(f(x_k), f(y_k)) \ge \epsilon. \]
Since $X$ is compact the two sequences have convergent subsequences, i.e.
\[ x_{k_j} \to \bar{x} \in X, \qquad y_{k_j} \to \bar{y} \in X. \]
Since $d(x_k, y_k) \to 0$ we have $\bar{x} = \bar{y}$. Since $f$ is continuous, we conclude that $d(f(x_{k_j}), f(y_{k_j})) \to 0$, which is a contradiction since $d(f(x_k), f(y_k)) \ge \epsilon$.
Version: 2 Owner: paolini Author(s): paolini
501.3 topological vector space
A topological vector space is a pair $(V, T)$ where $V$ is a vector space over a topological field $K$, and $T$ is a Hausdorff topology on $V$ such that under $T$, scalar multiplication $(\lambda, v) \mapsto \lambda v$ is continuous from $K \times V$ to $V$ and addition $(v, w) \mapsto v + w$ is continuous from $V \times V$ to $V$, where $K \times V$ and $V \times V$ are given the respective product topologies.
A finite dimensional vector space inherits a natural topology. For if $V$ is a finite dimensional vector space, then $V$ is isomorphic to $K^n$ for some $n$; then let $f : V \to K^n$ be such an isomorphism, and suppose $K^n$ has the product topology. Give $V$ the topology where a subset $A$ of $V$ is open in $V$ if and only if $f(A)$ is open in $K^n$. This topology is independent of the choice of isomorphism $f$.
Version: 6 Owner: Evandar Author(s): Evandar
Chapter 502 46B20 – Geometry and structure of normed linear spaces
502.1 $\lim_{p\to\infty} \|x\|_p = \|x\|_\infty$
Suppose $x = (x_1, \ldots, x_n)$ is a point in $\mathbb{R}^n$, and let $\|x\|_p$ and $\|x\|_\infty$ be the usual $p$-norm and $\infty$-norm;
\[ \|x\|_p = \left(|x_1|^p + \cdots + |x_n|^p\right)^{1/p}, \qquad \|x\|_\infty = \max\{|x_1|, \ldots, |x_n|\}. \]
Our claim is that
\[ \lim_{p\to\infty} \|x\|_p = \|x\|_\infty. \tag{502.1.1} \]
In other words, for any fixed $x \in \mathbb{R}^n$, the above limit holds. This, of course, justifies the notation for the $\infty$-norm.
Proof. Since both norms stay invariant if we exchange two components in $x$, we can arrange things such that $\|x\|_\infty = |x_1|$. Then for any real $p > 0$, we have
\[ \|x\|_\infty = |x_1| = (|x_1|^p)^{1/p} \le \|x\|_p \]
and
\[ \|x\|_p \le n^{1/p}|x_1| = n^{1/p}\|x\|_\infty. \]
Taking the limit of the above inequalities, and using $n^{1/p} \to 1$, we obtain
\[ \|x\|_\infty \le \lim_{p\to\infty}\|x\|_p, \qquad \lim_{p\to\infty}\|x\|_p \le \|x\|_\infty, \]
which combined yield the result. □
Version: 7 Owner: matte Author(s): matte
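The squeeze $\|x\|_\infty \le \|x\|_p \le n^{1/p}\|x\|_\infty$ from the proof is easy to watch numerically. In the sketch below the function name `p_norm` and the sample vector are illustrative assumptions; the norm is rescaled by the largest entry so that large exponents do not overflow.

```python
def p_norm(x, p):
    """(|x_1|^p + ... + |x_n|^p)^(1/p), scaled by the max entry to avoid overflow."""
    m = max(abs(c) for c in x)
    if m == 0:
        return 0.0
    return m * sum((abs(c) / m) ** p for c in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0, 4.0]      # sample vector with max |x_i| = 4
for p in (1, 2, 10, 100, 1000):
    print(p, p_norm(x, p))     # decreases toward max |x_i| = 4
```

Because two entries attain the maximum here, the value for large $p$ behaves like $4 \cdot 2^{1/p}$, which shows that the upper bound $n^{1/p}\|x\|_\infty$ is of the right shape.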
502.2 Hahn-Banach theorem
The Hahn-Banach theorem is a foundational result in functional analysis. Roughly speaking, it asserts the existence of a great variety of bounded (and hence continuous) linear functionals on a normed vector space, even if that space happens to be infinite-dimensional. We first consider an abstract version of this theorem, and then give the more classical result as a corollary.
Let $V$ be a real, or a complex vector space, with $K$ denoting the corresponding field of scalars, and let $p : V \to \mathbb{R}^+$ be a seminorm on $V$.
Theorem 17. Let $f : U \to K$ be a linear functional defined on a subspace $U \subset V$. If the restricted functional satisfies
\[ |f(u)| \le p(u), \quad u \in U, \]
then it can be extended to all of $V$ without violating the above property. To be more precise, there exists a linear functional $F : V \to K$ such that
\[ F(u) = f(u), \quad u \in U, \qquad |F(u)| \le p(u), \quad u \in V. \]
Definition 15. We say that a linear functional $f : V \to K$ is bounded if there exists a bound $B \in \mathbb{R}^+$ such that
\[ |f(u)| \le B\,p(u), \quad u \in V. \tag{502.2.1} \]
If $f$ is a bounded linear functional, we define $\|f\|$, the norm of $f$, according to
\[ \|f\| = \sup\{|f(u)| : p(u) = 1\}. \]
One can show that $\|f\|$ is the infimum of all the possible $B$ that satisfy (502.2.1).
Theorem 18 (Hahn-Banach). Let $f : U \to K$ be a bounded linear functional defined on a subspace $U \subset V$. Let $\|f\|_U$ denote the norm of $f$ relative to the restricted seminorm on $U$. Then there exists a bounded extension $F : V \to K$ with the same norm, i.e.
\[ \|F\|_V = \|f\|_U. \]
Version: 7 Owner: rmilson Author(s): rmilson, Evandar
502.3 proof of Hahn-Banach theorem
Consider the family of all possible extensions of $f$, i.e. the set $\mathcal{F}$ of all pairings $(F, H)$ where $H$ is a vector subspace of $V$ containing $U$ and $F$ is a linear map $F : H \to K$ such that $F(u) = f(u)$ for all $u \in U$ and $|F(u)| \le p(u)$ for all $u \in H$. $\mathcal{F}$ is naturally endowed with a partial order relation: given $(F_1, H_1), (F_2, H_2) \in \mathcal{F}$ we say that $(F_1, H_1) \le (F_2, H_2)$ iff $F_2$ is an extension of $F_1$, that is, $H_1 \subset H_2$ and $F_2(u) = F_1(u)$ for all $u \in H_1$. We want to apply Zorn's lemma to $\mathcal{F}$ so we are going to prove that every chain in $\mathcal{F}$ has an upper bound.
Let $(F_i, H_i)$ be the elements of a chain in $\mathcal{F}$. Define $H = \bigcup_i H_i$. Clearly $H$ is a vector subspace of $V$ and contains $U$. Define $F : H \to K$ by "merging" all $F_i$'s as follows. Given $u \in H$ there exists $i$ such that $u \in H_i$: define $F(u) = F_i(u)$. This is a good definition since if both $H_i$ and $H_j$ contain $u$ then $F_i(u) = F_j(u)$; in fact either $(F_i, H_i) \le (F_j, H_j)$ or $(F_j, H_j) \le (F_i, H_i)$. Clearly the so constructed pair $(F, H)$ is an upper bound for the chain $(F_i, H_i)$ since $F$ is an extension of every $F_i$.
Zorn's Lemma then assures that there exists a maximal element $(F, H) \in \mathcal{F}$. To complete the proof we will only need to prove that $H = V$.
Suppose by contradiction that there exists $v \in V \setminus H$. Then consider the vector space $H' = H + Kv = \{u + tv : u \in H, t \in K\}$ ($H'$ is the vector space generated by $H$ and $v$). Choose
\[ \lambda = \sup_{x \in H}\{F(x) - p(x - v)\}. \]
We notice that given any $x, y \in H$ it holds
\[ F(x) - F(y) = F(x - y) \le p(x - y) = p(x - v + v - y) \le p(x - v) + p(y - v), \]
i.e.
\[ F(x) - p(x - v) \le F(y) + p(y - v); \]
in particular we find that $\lambda < +\infty$ and for all $y \in H$ it holds
\[ F(y) - p(y - v) \le \lambda \le F(y) + p(y - v). \]
Define $F' : H' \to K$ as follows:
\[ F'(u + tv) = F(u) + t\lambda. \]
Clearly $F'$ is a linear functional. For $t \ne 0$ we have
\[ |F'(u + tv)| = |F(u) + t\lambda| = |t|\,|F(u/t) + \lambda| \]
and by letting $y = -u/t$ by the previous estimates on $\lambda$ we obtain
\[ F(u/t) + \lambda \le F(u/t) + F(-u/t) + p(-u/t - v) = p(u/t + v) \]
and
\[ F(u/t) + \lambda \ge F(u/t) + F(-u/t) - p(-u/t - v) = -p(u/t + v), \]
which together give
\[ |F(u/t) + \lambda| \le p(u/t + v) \]
and hence
\[ |F'(u + tv)| \le |t|\,p(u/t + v) = p(u + tv). \]
So we have proved that $(F', H') \in \mathcal{F}$ and $(F', H') > (F, H)$, which is a contradiction.
Version: 4 Owner: paolini Author(s): paolini
502.4 seminorm
Let $V$ be a real, or a complex vector space, with $K$ denoting the corresponding field of scalars. A seminorm is a function $p : V \to \mathbb{R}^+$, from $V$ to the set of non-negative real numbers, that satisfies the following two properties.
\[ p(k\,u) = |k|\,p(u), \quad k \in K,\ u \in V \qquad \text{(Homogeneity)} \]
\[ p(u + v) \le p(u) + p(v), \quad u, v \in V \qquad \text{(Sublinearity)} \]
A seminorm differs from a norm in that it is permitted that $p(u) = 0$ for some non-zero $u \in V$.
It is possible to characterize the seminorm properties geometrically. For $k > 0$, let
\[ B_k = \{u \in V : p(u) \le k\} \]
denote the ball of radius $k$. The homogeneity property is equivalent to the assertion that $B_k = kB_1$, in the sense that $u \in B_1$ if and only if $ku \in B_k$. Thus, we see that a seminorm is fully determined by its unit ball. Indeed, given $B \subset V$ we may define a function $p_B : V \to \mathbb{R}^+$ by
\[ p_B(u) = \inf\{\lambda \in \mathbb{R}^+ : \lambda^{-1}u \in B\}. \]
The geometric nature of the unit ball is described by the following.
Proposition 20. The function $p_B$ satisfies the homogeneity property if and only if for every $u \in V$, there exists a $k \in \mathbb{R}^+ \cup \{\infty\}$ such that $\lambda u \in B$ if and only if $|\lambda| \le k$.
Proposition 21. Suppose that $p$ is homogeneous. Then, it is sublinear if and only if its unit ball, $B_1$, is a convex subset of $V$.
Proof. First, let us suppose that the seminorm is both sublinear and homogeneous, and prove that $B_1$ is necessarily convex. Let $u, v \in B_1$, and let $k$ be a real number between 0 and 1. We must show that the weighted average $ku + (1 - k)v$ is in $B_1$ as well. By assumption,
\[ p(k\,u + (1 - k)v) \le k\,p(u) + (1 - k)\,p(v). \]
The right side is a weighted average of two numbers between 0 and 1, and is therefore between 0 and 1 itself. Therefore $k\,u + (1 - k)v \in B_1$, as desired.
Conversely, suppose that the seminorm function is homogeneous, and that the unit ball is convex. Let $u, v \in V$ be given, and let us show that
\[ p(u + v) \le p(u) + p(v). \]
The essential complication here is that we do not exclude the possibility that $p(u) = 0$, but that $u \ne 0$. First, let us consider the case where $p(u) = p(v) = 0$. By homogeneity, for every $k > 0$ we have $ku, kv \in B_1$, and hence
\[ \frac{k}{2}u + \frac{k}{2}v \in B_1 \]
as well. By homogeneity, again,
\[ p(u + v) \le \frac{2}{k}. \]
Since the above is true for all positive $k$, we infer that $p(u + v) = 0$, as desired.
Next suppose that $p(u) = 0$, but that $p(v) \ne 0$. We will show that in this case, necessarily, $p(u + v) = p(v)$. Owing to the homogeneity assumption, we may without loss of generality assume that $p(v) = 1$. For every $k$ such that $0 \le k < 1$ we have
\[ k\,u + k\,v = (1 - k)\frac{ku}{1 - k} + k\,v. \]
The right-side expression is an element of $B_1$ because
\[ \frac{ku}{1 - k},\ v \in B_1. \]
Hence $k\,p(u + v) \le 1$, and since this holds for $k$ arbitrarily close to 1 we conclude that
\[ p(u + v) \le p(v). \]
The same argument also shows that
\[ p(v) = p(-u + (u + v)) \le p(u + v), \]
and hence $p(u + v) = p(v)$, as desired.
Finally, suppose that neither $p(u)$ nor $p(v)$ is zero. Hence,
\[ \frac{u}{p(u)},\ \frac{v}{p(v)} \]
are both in $B_1$, and hence
\[ \frac{p(u)}{p(u) + p(v)}\,\frac{u}{p(u)} + \frac{p(v)}{p(u) + p(v)}\,\frac{v}{p(v)} = \frac{u + v}{p(u) + p(v)} \]
is in $B_1$ also. Using homogeneity, we conclude that
\[ p(u + v) \le p(u) + p(v), \]
as desired.
Version: 14 Owner: rmilson Author(s): rmilson, drummond
502.5 vector norm
A vector norm on the real vector space $V$ is a function $f : V \to \mathbb{R}$ that satisfies the following properties:
\[ f(x) = 0 \Leftrightarrow x = 0 \]
\[ f(x) \ge 0, \quad x \in V \]
\[ f(x + y) \le f(x) + f(y), \quad x, y \in V \]
\[ f(\alpha x) = |\alpha| f(x), \quad \alpha \in \mathbb{R},\ x \in V \]
Such a function is denoted as $\|x\|$. Particular norms are distinguished by subscripts, such as $\|x\|_V$, when referring to a norm in the space $V$. A unit vector with respect to the norm $\|\cdot\|$ is a vector $x$ satisfying $\|x\| = 1$. A vector norm on a complex vector space is defined similarly.
A common (and useful) example of a real norm is the Euclidean norm given by $\|x\| = (x_1^2 + x_2^2 + \cdots + x_n^2)^{1/2}$ defined on $V = \mathbb{R}^n$. Note, however, that not every metric on a vector space arises from a norm; when it does, the space is called a normed vector space. A necessary and sufficient condition for a metric $d$ on a vector space $V$ to be induced by a norm is
\[ d(x + a, y + a) = d(x, y) \quad \forall x, y, a \in V \]
\[ d(\alpha x, \alpha y) = |\alpha|\,d(x, y) \quad \forall x, y \in V,\ \alpha \in \mathbb{R} \]
But given a norm, a metric can always be defined by the equation $d(x, y) = \|x - y\|$.
Version: 14 Owner: mike Author(s): mike, Manoj, Logan
Chapter 503 46B50 – Compactness in Banach (or normed) spaces
503.1 Schauder ﬁxed point theorem
Let $X$ be a Banach space, $K \subset X$ compact, convex and non-empty, and let $f : K \to K$ be a continuous mapping. Then there exists $x \in K$ such that $f(x) = x$. Notice that the unit disc of a finite dimensional vector space is always convex and compact; hence this theorem extends the Brouwer fixed point theorem.
Version: 3 Owner: paolini Author(s): paolini
503.2 proof of Schauder fixed point theorem
The idea of the proof is to reduce ourselves to the finite dimensional case. Given $\epsilon > 0$ notice that the family of open sets $\{B_\epsilon(x) : x \in K\}$ is an open covering of $K$. Since $K$ is compact there exists a finite subcover, i.e. there exist $N$ points $p_1, \ldots, p_N$ of $K$ such that the balls $B_\epsilon(p_i)$ cover the whole set $K$. Let $K_\epsilon$ be the convex hull of $p_1, \ldots, p_N$ and let $V_\epsilon$ be the affine $(N - 1)$-dimensional space containing these points, so that $K_\epsilon \subset V_\epsilon$.
Now consider a projection $\pi_\epsilon : X \to V_\epsilon$ such that $\|\pi_\epsilon(x) - \pi_\epsilon(y)\| \le \|x - y\|$ and define
\[ f_\epsilon : K_\epsilon \to K_\epsilon, \qquad f_\epsilon(x) = \pi_\epsilon(f(x)). \]
This is a continuous function defined on a convex and compact set $K_\epsilon$ of a finite dimensional vector space $V_\epsilon$. Hence by the Brouwer fixed point theorem it admits a fixed point $x_\epsilon$:
\[ f_\epsilon(x_\epsilon) = x_\epsilon. \]
Since $K$ is sequentially compact we can find a sequence $\epsilon_k \to 0$ such that $x_k = x_{\epsilon_k}$ converges to some point $\bar{x} \in K$.
We claim that $f(\bar{x}) = \bar{x}$.
Clearly $f_{\epsilon_k}(x_k) = x_k \to \bar{x}$. To conclude the proof we only need to show that also $f_{\epsilon_k}(x_k) \to f(\bar{x})$ or, which is the same, that $\|f_{\epsilon_k}(x_k) - f(\bar{x})\| \to 0$. In fact we have
\[ \|f_{\epsilon_k}(x_k) - f(\bar{x})\| = \|\pi_{\epsilon_k}(f(x_k)) - f(\bar{x})\| \le \|\pi_{\epsilon_k}(f(x_k)) - f(x_k)\| + \|f(x_k) - f(\bar{x})\| \le \epsilon_k + \|f(x_k) - f(\bar{x})\| \to 0, \]
where we used the fact that $\|\pi_\epsilon(x) - x\| \le \epsilon$, since $x \in K$ is contained in some ball $B_\epsilon$ centered on $K_\epsilon$.
Version: 1 Owner: paolini Author(s): paolini
Chapter 504 46B99 – Miscellaneous
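504.1 $\ell^p$
Let $F$ be either $\mathbb{R}$ or $\mathbb{C}$, and let $p \in \mathbb{R}$ with $p \ge 1$. We define $\ell^p$ to be the vector space of all sequences $(a_i)_{i \ge 0}$ in $F$ such that
\[ \sum_{i=0}^{\infty} |a_i|^p \]
exists. $\ell^p$ is a normed vector space, under the norm
\[ \|(a_i)\|_p = \left(\sum_{i=0}^{\infty} |a_i|^p\right)^{1/p}. \]
$\ell^\infty$ is defined to be the vector space of all bounded sequences $(a_i)_{i \ge 0}$ with norm given by
\[ \|(a_i)\|_\infty = \sup\{|a_i| : i \ge 0\}. \]
$\ell^\infty$ and $\ell^p$ for $p \ge 1$ are complete under these norms, making them into Banach spaces. Moreover, $\ell^2$ is a Hilbert space under the inner product
\[ \langle (a_i), (b_i) \rangle = \sum_{i=0}^{\infty} a_i \overline{b_i}. \]
For $p > 1$ the (continuous) dual space of $\ell^p$ is $\ell^q$ where $\frac{1}{p} + \frac{1}{q} = 1$. The dual space of $\ell^1$ is $\ell^\infty$; the dual space of $\ell^\infty$ contains $\ell^1$ but is strictly larger.
Version: 10 Owner: Evandar Author(s): Evandar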
504.2 Banach space
A Banach space $(X, \|\cdot\|)$ is a normed vector space such that $X$ is complete under the metric induced by the norm $\|\cdot\|$. Some authors use the term Banach space only in the case where $X$ is infinite dimensional, although on Planetmath finite dimensional spaces are also considered to be Banach spaces. If $Y$ is a Banach space and $X$ is any normed vector space, then the set of continuous linear maps $f : X \to Y$ forms a Banach space, with norm given by the operator norm. In particular, since $\mathbb{R}$ and $\mathbb{C}$ are complete, the space of all continuous linear functionals on a normed vector space is a Banach space.
Version: 4 Owner: Evandar Author(s): Evandar
504.3
an inner product deﬁnes a norm
Let $F$ be $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be an inner product space over $F$ with inner product $\langle \cdot, \cdot \rangle : X \times X \to F$. Then we can define a function $\|\cdot\| : X \to \mathbb{R}$ by $x \mapsto \sqrt{\langle x, x \rangle}$, and this defines a norm on $X$. Version: 20 Owner: say 10 Author(s): say 10, apmxi
504.4
continuous linear mapping
If $(V_1, \|\cdot\|_1)$ and $(V_2, \|\cdot\|_2)$ are normed vector spaces, a linear mapping $T : V_1 \to V_2$ is continuous if it is continuous in the metric induced by the norms. If there is a nonnegative constant $c$ such that $\|T(x)\|_2 \le c\|x\|_1$ for each $x \in V_1$, we say that $T$ is bounded. This should not be confused with the usual terminology referring to a bounded function as one that has bounded range. In fact, every nonzero bounded linear mapping has unbounded range. The expression bounded linear mapping is often used in functional analysis to refer to continuous linear mappings as well. This is because the two definitions are equivalent: If $T$ is bounded, then $\|T(x) - T(y)\|_2 = \|T(x - y)\|_2 \le c\|x - y\|_1$, so $T$ is a Lipschitz function. Now suppose $T$ is continuous. Then there exists $r > 0$ such that $\|T(x)\|_2 \le 1$ when $\|x\|_1 \le r$.
For any nonzero $x \in V_1$, we then have
\[ \frac{r}{\|x\|_1} \|T(x)\|_2 = \Big\| T\Big( \frac{r x}{\|x\|_1} \Big) \Big\|_2 \le 1, \]
hence $\|T(x)\|_2 \le \frac{1}{r}\|x\|_1$; so $T$ is bounded.
It can be shown that a linear mapping between two topological vector spaces is continuous if and only if it is continuous at 0 [1].
REFERENCES
1. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
Version: 4 Owner: Koro Author(s): Koro
504.5
equivalent norms
Definition Let $\|\cdot\|$ and $\|\cdot\|'$ be two norms on a vector space $V$. These norms are equivalent norms if there exist positive real numbers $c, d$ such that
\[ c\|x\| \le \|x\|' \le d\|x\| \]
for all $x \in V$. An equivalent condition is that there exists a number $C > 0$ such that
\[ \frac{1}{C}\|x\| \le \|x\|' \le C\|x\| \]
for all $x \in V$. To see the equivalence, set $C = \max\{1/c, d\}$.
Some key results are as follows:
1. On a finite dimensional vector space all norms are equivalent. The same is not true for vector spaces of infinite dimension [1]. It follows that on a finite dimensional vector space, one can check the convergence of a sequence with respect to any norm: if a sequence converges in one norm, it converges in all norms.
2. If two norms are equivalent on a vector space $V$, they induce the same topology on $V$ [1].
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons, 1978.
Version: 3 Owner: Koro Author(s): matte
504.6
normed vector space
Let $F$ be a field which is either $\mathbb{R}$ or $\mathbb{C}$. A normed vector space over $F$ is a pair $(V, \|\cdot\|)$ where $V$ is a vector space over $F$ and $\|\cdot\| : V \to \mathbb{R}$ is a function such that
1. $\|v\| \ge 0$ for all $v \in V$ and $\|v\| = 0$ if and only if $v = 0$ in $V$ (positive definiteness)
2. $\|\lambda v\| = |\lambda| \|v\|$ for all $v \in V$ and all $\lambda \in F$
3. $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$ (the triangle inequality)
The function $\|\cdot\|$ is called a norm on $V$. Some properties of norms:
1. If $W$ is a subspace of $V$ then $W$ can be made into a normed space by simply restricting the norm on $V$ to $W$. This is called the induced norm on $W$.
2. Any normed vector space $(V, \|\cdot\|)$ is a metric space under the metric $d : V \times V \to \mathbb{R}$ given by $d(u, v) = \|u - v\|$. This is called the metric induced by the norm $\|\cdot\|$.
3. In this metric, the norm defines a continuous map from $V$ to $\mathbb{R}$; this is an easy consequence of the triangle inequality.
4. If $(V, \langle \cdot, \cdot \rangle)$ is an inner product space, then there is a natural induced norm given by $\|v\| = \sqrt{\langle v, v \rangle}$ for all $v \in V$.
Version: 5 Owner: Evandar Author(s): Evandar
Chapter 505 46Bxx – Normed linear spaces and Banach spaces; Banach lattices
505.1 vector p-norm
A class of vector norms, called p-norms and denoted $\|\cdot\|_p$, is defined as
\[ \|x\|_p = \big( |x_1|^p + \cdots + |x_n|^p \big)^{1/p}, \qquad p \ge 1,\ x \in \mathbb{R}^n. \]
The most widely used are the 1-norm, 2-norm, and ∞-norm:
\[ \|x\|_1 = |x_1| + \cdots + |x_n| \]
\[ \|x\|_2 = \sqrt{|x_1|^2 + \cdots + |x_n|^2} = \sqrt{x^T x} \]
\[ \|x\|_\infty = \max_{1 \le i \le n} |x_i| \]
The 2-norm is sometimes called the Euclidean vector norm, because $\|x - y\|_2$ yields the Euclidean distance between any two vectors $x, y \in \mathbb{R}^n$. The 1-norm is also called the taxicab metric (sometimes Manhattan metric) since the distance between two points can be viewed as the distance a taxi would travel in a city (horizontal and vertical movements). A useful fact is that for finite dimensional spaces (like $\mathbb{R}^n$) the three norms mentioned are equivalent. Version: 5 Owner: drini Author(s): drini, Logan
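The following is a minimal sketch of the p-norms of this entry in plain Python. The function name `p_norm` and the sample vector are illustrative assumptions. It also checks the chain of inequalities $\|x\|_\infty \le \|x\|_2 \le \|x\|_1 \le n\|x\|_\infty$, which is one concrete way to see that the three norms are equivalent on $\mathbb{R}^n$.

```python
def p_norm(x, p):
    """Vector p-norm for p >= 1; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3.0, -4.0, 12.0]
n1 = p_norm(x, 1)                # |3| + |-4| + |12| = 19
n2 = p_norm(x, 2)                # sqrt(9 + 16 + 144) = 13
ninf = p_norm(x, float("inf"))   # max(3, 4, 12) = 12

# Equivalence of the three norms on R^n, with explicit constants.
chain_holds = ninf <= n2 <= n1 <= len(x) * ninf
print(n1, n2, ninf, chain_holds)
```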
Chapter 506 46C05 – Hilbert and pre-Hilbert spaces: geometry and topology (including spaces with semidefinite inner product)
506.1 Bessel inequality
Let $H$ be a Hilbert space, and suppose $e_1, e_2, \ldots \in H$ is an orthonormal sequence. Then for any $x \in H$,
\[ \sum_{k=1}^{\infty} |\langle x, e_k \rangle|^2 \le \|x\|^2. \]
Bessel's inequality immediately lets us define the sum
\[ x' = \sum_{k=1}^{\infty} \langle x, e_k \rangle e_k. \]
The inequality means that the series converges. For a complete orthonormal series, we have Parseval's theorem, which replaces inequality with equality (and consequently $x'$ with $x$). Version: 2 Owner: ariels Author(s): ariels
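Bessel's inequality can be checked numerically in the finite dimensional Hilbert space $\mathbb{R}^3$. The sketch below is only an illustration: the vectors `e1`, `e2`, `e3` and `x` are arbitrary choices, not part of the entry. With the incomplete orthonormal set $\{e_1, e_2\}$ the inequality is strict, and adding $e_3$ recovers the equality of Parseval's theorem.

```python
import math

def inner(u, v):
    """Standard real inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Orthonormal vectors e1, e2 in R^3 (an incomplete orthonormal set).
e1 = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)
e2 = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
x = (1.0, 2.0, 3.0)

bessel_sum = inner(x, e1) ** 2 + inner(x, e2) ** 2   # sum of |<x, e_k>|^2
norm_sq = inner(x, x)                                 # ||x||^2

# Adding e3 completes the basis, and equality (Parseval) is recovered.
e3 = (0.0, 0.0, 1.0)
parseval_sum = bessel_sum + inner(x, e3) ** 2
print(bessel_sum, norm_sq, parseval_sum)
```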
506.2
Hilbert module
Definition 16. A (right) pre-Hilbert module over a $C^*$-algebra $A$ is a right $A$-module $E$ equipped with an $A$-valued inner product $\langle -, - \rangle : E \times E \to A$, i.e. a sesquilinear pairing satisfying
\[ \langle u, va \rangle = \langle u, v \rangle a, \tag{506.2.1} \]
\[ \langle u, v \rangle = \langle v, u \rangle^*, \tag{506.2.2} \]
\[ \langle v, v \rangle \ge 0, \text{ with } \langle v, v \rangle = 0 \text{ iff } v = 0, \tag{506.2.3} \]
for all $u, v \in E$ and $a \in A$. Note, positive definiteness is well-defined due to the notion of positivity for $C^*$-algebras. The norm of an element $v \in E$ is defined by $\|v\| = \sqrt{\|\langle v, v \rangle\|}$.
Definition 17. A (right) Hilbert module over a $C^*$-algebra $A$ is a right pre-Hilbert module over $A$ which is complete with respect to the norm.
Example 23 (Hilbert spaces). A complex Hilbert space is a Hilbert $\mathbb{C}$-module.
Example 24 ($C^*$-algebras). A $C^*$-algebra $A$ is a Hilbert $A$-module with inner product $\langle a, b \rangle = a^* b$.
Definition 18. A Hilbert $A$-$B$-bimodule is a (right) Hilbert module $E$ over a $C^*$-algebra $B$ together with a *-homomorphism $\pi$ from a $C^*$-algebra $A$ to $\mathrm{End}(E)$.
Version: 4 Owner: mhale Author(s): mhale
506.3
Hilbert space
A Hilbert space is an inner product space $(X, \langle \cdot, \cdot \rangle)$ which is complete under the induced metric. In particular, a Hilbert space is a Banach space in the norm induced by the inner product, since the norm and the inner product both induce the same metric. Some authors require $X$ to be infinite dimensional for it to be called a Hilbert space. Version: 7 Owner: Evandar Author(s): Evandar
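506.4 proof of Bessel inequality
Let
\[ r_n = x - \sum_{k=1}^{n} \langle x, e_k \rangle \cdot e_k. \]
Then for $j = 1, \ldots, n$,
\[ \langle r_n, e_j \rangle = \langle x, e_j \rangle - \sum_{k=1}^{n} \langle \langle x, e_k \rangle \cdot e_k, e_j \rangle \tag{506.4.1} \]
\[ = \langle x, e_j \rangle - \langle x, e_j \rangle \langle e_j, e_j \rangle = 0, \tag{506.4.2} \]
so $e_1, \ldots, e_n, r_n$ is an orthogonal series. Computing norms, we see that
\[ \|x\|^2 = \Big\| r_n + \sum_{k=1}^{n} \langle x, e_k \rangle \cdot e_k \Big\|^2 = \|r_n\|^2 + \sum_{k=1}^{n} |\langle x, e_k \rangle|^2 \ge \sum_{k=1}^{n} |\langle x, e_k \rangle|^2. \]
So the series
\[ \sum_{k=1}^{\infty} |\langle x, e_k \rangle|^2 \]
converges and is bounded by $\|x\|^2$, as required. Version: 1 Owner: ariels Author(s): ariels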
Chapter 507 46C15 – Characterizations of Hilbert spaces
507.1 classiﬁcation of separable Hilbert spaces
Let H1 and H2 be inﬁnite dimensional, separable Hilbert spaces. Then there is an isomorphism f : H1 → H2 which is also an isometry. In other words, H1 and H2 are identical as Hilbert spaces. Version: 2 Owner: Evandar Author(s): Evandar
Chapter 508 46E15 – Banach spaces of continuous, diﬀerentiable or analytic functions
508.1 Ascoli-Arzelà theorem
Theorem 19. Let $\Omega$ be a bounded subset of $\mathbb{R}^n$ and $(f_k)$ a sequence of functions $f_k : \Omega \to \mathbb{R}^m$. If $\{f_k\}$ is equibounded and uniformly equicontinuous then there exists a uniformly convergent subsequence $(f_{k_j})$.
A more abstract (and more general) version is the following.
Theorem 20. Let $X$ and $Y$ be totally bounded metric spaces and let $F \subset C(X, Y)$ be a uniformly equicontinuous family of continuous mappings from $X$ to $Y$. Then $F$ is totally bounded (with respect to the uniform convergence metric induced by $C(X, Y)$).
Notice that the first version is a consequence of the second. Recall, in fact, that a subset of a complete metric space is totally bounded if and only if its closure is compact (or sequentially compact). Hence $\Omega$ is totally bounded and, by equiboundedness, all the functions $f_k$ have image in a totally bounded set. Since $F = \{f_k\}$ is totally bounded, its closure is sequentially compact, and hence $(f_k)$ has a convergent subsequence. Version: 6 Owner: paolini Author(s): paolini, n3o
508.2
Stone-Weierstrass theorem
Let $X$ be a compact metric space and let $C^0(X, \mathbb{R})$ be the algebra of continuous real functions defined over $X$. Let $A$ be a subalgebra of $C^0(X, \mathbb{R})$ for which the following conditions hold:
1. for all $x, y \in X$ with $x \ne y$ there exists $f \in A$ such that $f(x) \ne f(y)$ (that is, $A$ separates points),
2. $1 \in A$.
Then $A$ is dense in $C^0(X, \mathbb{R})$. Version: 1 Owner: n3o Author(s): n3o
508.3
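proof of Ascoli-Arzelà theorem
Given $\varepsilon > 0$ we aim at finding a $4\varepsilon$-lattice in $F$ (see the definition of total boundedness). Let $\delta > 0$ be given with respect to $\varepsilon$ in the definition of equicontinuity of $F$. Let $X_\delta$ be a $\delta$-lattice in $X$ and $Y_\varepsilon$ be an $\varepsilon$-lattice in $Y$. Let now $Y_\varepsilon^{X_\delta}$ be the set of functions from $X_\delta$ to $Y_\varepsilon$ and define $G_\varepsilon \subset Y_\varepsilon^{X_\delta}$ by
\[ G_\varepsilon = \{ g \in Y_\varepsilon^{X_\delta} : \exists f \in F\ \forall x \in X_\delta\ \ d(f(x), g(x)) < \varepsilon \}. \]
Since $Y_\varepsilon^{X_\delta}$ is a finite set, $G_\varepsilon$ is finite too: say $G_\varepsilon = \{g_1, \ldots, g_N\}$. Then define $F_\varepsilon \subset F$, $F_\varepsilon = \{f_1, \ldots, f_N\}$ where $f_k : X \to Y$ is a function in $F$ such that $d(f_k(x), g_k(x)) < \varepsilon$ for all $x \in X_\delta$ (the existence of such a function is guaranteed by the definition of $G_\varepsilon$).
We now prove that $F_\varepsilon$ is a $4\varepsilon$-lattice in $F$. Given $f \in F$ choose $g \in Y_\varepsilon^{X_\delta}$ such that for all $x \in X_\delta$ it holds $d(f(x), g(x)) < \varepsilon$ (this is possible as for all $x \in X_\delta$ there exists $y \in Y_\varepsilon$ with $d(f(x), y) < \varepsilon$). We conclude that $g \in G_\varepsilon$ and hence $g = g_k$ for some $k \in \{1, \ldots, N\}$. Notice also that for all $x \in X_\delta$ we have
\[ d(f(x), f_k(x)) \le d(f(x), g_k(x)) + d(g_k(x), f_k(x)) < 2\varepsilon. \]
Given any $x \in X$ we know that there exists $x_\delta \in X_\delta$ such that $d(x, x_\delta) < \delta$. So, by equicontinuity of $F$,
\[ d(f(x), f_k(x)) \le d(f(x), f(x_\delta)) + d(f_k(x), f_k(x_\delta)) + d(f(x_\delta), f_k(x_\delta)) \le 4\varepsilon. \]
Version: 3 Owner: paolini Author(s): paolini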
508.4
Hölder inequality
The Hölder inequality concerns vector p-norms: if $p, q > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$ then
\[ |x^T y| \le \|x\|_p \|y\|_q. \]
An important instance of a Hölder inequality is the Cauchy-Schwarz inequality.
There is a version of this result for the $L^p$ spaces. If a function $f$ is in $L^p(X)$, then the $L^p$-norm of $f$ is denoted $\|f\|_p$. Let $(X, \mathcal{B}, \mu)$ be a measure space. If $f$ is in $L^p(X)$ and $g$ is in $L^q(X)$ (with $1/p + 1/q = 1$), then the Hölder inequality becomes
\[ \|fg\|_1 = \int_X |fg| \, d\mu \le \Big( \int_X |f|^p \, d\mu \Big)^{1/p} \Big( \int_X |g|^q \, d\mu \Big)^{1/q} = \|f\|_p \|g\|_q. \]
Version: 10 Owner: drini Author(s): paolini, drini, Logan
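The vector form of the Hölder inequality is easy to test numerically. The sketch below is an illustration only: the vectors and the exponent $p = 3$ are arbitrary choices, and `p_norm` is a helper defined here. It also evaluates the case $p = q = 2$, which is the Cauchy-Schwarz inequality.

```python
def p_norm(x, p):
    """Vector p-norm for finite p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [1.0, -2.0, 3.0]
y = [4.0, 0.5, 1.0]

p = 3.0
q = p / (p - 1.0)          # conjugate index, so that 1/p + 1/q = 1

lhs = abs(sum(a * b for a, b in zip(x, y)))   # |x^T y| = |4 - 1 + 3| = 6
rhs = p_norm(x, p) * p_norm(y, q)             # ||x||_p * ||y||_q

# The case p = q = 2 is the Cauchy-Schwarz inequality.
rhs_cs = p_norm(x, 2.0) * p_norm(y, 2.0)
print(lhs, rhs, rhs_cs)
```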
508.5
Young Inequality
Let $a, b > 0$ and $p, q \in \,]0, \infty[$ with $1/p + 1/q = 1$. Then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \]
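A quick numerical sanity check of the Young inequality, written as a sketch: the helper name `young_gap` and the sample values are illustrative assumptions. The gap $a^p/p + b^q/q - ab$ should be nonnegative, and it vanishes when $a^p = b^q$, for example at $a = b = 1$.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b for the conjugate index q = p/(p-1)."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

# Evaluate the gap on a small grid of positive a, b and exponents p > 1.
gaps = [young_gap(a, b, p)
        for a in (0.5, 1.0, 2.0, 7.0)
        for b in (0.1, 1.0, 3.0)
        for p in (1.5, 2.0, 4.0)]

# Equality holds when a^p = b^q, e.g. a = b = 1.
equality_gap = young_gap(1.0, 1.0, 3.0)
print(min(gaps), equality_gap)
```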
Version: 1 Owner: paolini Author(s): paolini
508.6
conjugate index
For $p, q \in \mathbb{R}$, $1 < p, q < \infty$, we say $p$ and $q$ are conjugate indices if $\frac{1}{p} + \frac{1}{q} = 1$. Formally, we will also define $q = \infty$ as conjugate to $p = 1$ and vice versa.
Conjugate indices are used in the Hölder inequality and more generally to define conjugate spaces. Version: 4 Owner: bwebste Author(s): bwebste, drummond
508.7
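proof of Hölder inequality
First we prove the more general form (in measure spaces). Let $(X, \mu)$ be a measure space and let $f \in L^p(X)$, $g \in L^q(X)$ where $p, q \in [1, +\infty]$ and $\frac{1}{p} + \frac{1}{q} = 1$.
The case $p = 1$ and $q = \infty$ is obvious since
\[ |f(x) g(x)| \le \|g\|_{L^\infty} |f(x)|. \]
Also if $f = 0$ or $g = 0$ the result is obvious. Otherwise notice that (applying Young inequality) we have
\[ \frac{\|fg\|_{L^1}}{\|f\|_{L^p} \cdot \|g\|_{L^q}} = \int_X \frac{|f|}{\|f\|_{L^p}} \cdot \frac{|g|}{\|g\|_{L^q}} \, d\mu \le \frac{1}{p} \int_X \Big( \frac{|f|}{\|f\|_{L^p}} \Big)^p d\mu + \frac{1}{q} \int_X \Big( \frac{|g|}{\|g\|_{L^q}} \Big)^q d\mu = \frac{1}{p} + \frac{1}{q} = 1, \]
hence the desired inequality holds:
\[ \int_X |fg| = \|fg\|_{L^1} \le \|f\|_{L^p} \cdot \|g\|_{L^q} = \Big( \int_X |f|^p \Big)^{1/p} \Big( \int_X |g|^q \Big)^{1/q}. \]
If $x$ and $y$ are vectors in $\mathbb{R}^n$ or vectors in $\ell^p$ and $\ell^q$ spaces the proof is the same, only replace integrals with sums. If we define
\[ \|x\|_p = \Big( \sum_k |x_k|^p \Big)^{1/p} \]
we have
\[ \frac{\big| \sum_k x_k y_k \big|}{\|x\|_p \cdot \|y\|_q} \le \frac{\sum_k |x_k| |y_k|}{\|x\|_p \cdot \|y\|_q} = \sum_k \frac{|x_k|}{\|x\|_p} \frac{|y_k|}{\|y\|_q} \le \frac{1}{p} \sum_k \frac{|x_k|^p}{\|x\|_p^p} + \frac{1}{q} \sum_k \frac{|y_k|^q}{\|y\|_q^q} = \frac{1}{p} + \frac{1}{q} = 1. \]
Version: 1 Owner: paolini Author(s): paolini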
508.8
proof of Young Inequality
By the concavity of the log function we have
\[ \log ab = \frac{1}{p} \log a^p + \frac{1}{q} \log b^q \le \log\Big( \frac{1}{p} a^p + \frac{1}{q} b^q \Big). \]
By exponentiation we obtain the desired result. Version: 1 Owner: paolini Author(s): paolini
508.9
vector ﬁeld
A (smooth, differentiable) vector field on a (smooth differentiable) manifold $M$ is a (smooth, differentiable) function $v : M \to TM$, where $TM$ is the tangent bundle of $M$, which takes each point $m$ to an element of the tangent space $T_m M$, i.e., a section of the tangent bundle. Less formally, it can be thought of as a continuous choice of a tangent vector at each point of a manifold. Alternatively, vector fields on a manifold can be identified with derivations of the algebra of (smooth, differentiable) functions. Though less intuitive, this definition can be more formally useful. Version: 8 Owner: bwebste Author(s): bwebste, slider142
Chapter 509 46F05 – Topological linear spaces of test functions, distributions and ultradistributions
509.1 $T_f$ is a distribution of zeroth order
To check that $T_f$ is a distribution of zeroth order, we shall use condition (3) of the theorem in the entry on distributions (509.5). First, it is clear that $T_f$ is a linear mapping. To see that $T_f$ is continuous, suppose $K$ is a compact set in $U$ and $u \in \mathcal{D}_K$, i.e., $u$ is a smooth function with support in $K$. We then have
\[ |T_f(u)| = \Big| \int_K f(x) u(x) \, dx \Big| \le \int_K |f(x)| \, |u(x)| \, dx \le \int_K |f(x)| \, dx \ \|u\|_\infty. \]
Since $f$ is locally integrable, it follows that $C = \int_K |f(x)| \, dx$ is finite, so $|T_f(u)| \le C \|u\|_\infty$. Thus $T_f$ is a distribution of zeroth order ([1], pp. 381). □
REFERENCES
1. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
Version: 3 Owner: matte Author(s): matte
509.2 $\mathrm{p.v.}(\frac{1}{x})$ is a distribution of first order
(Following [4, 2].) Let $u \in \mathcal{D}(U)$. Then $\operatorname{supp} u \subset [-k, k]$ for some $k > 0$. For any $\varepsilon > 0$, $u(x)/x$ is Lebesgue integrable for $|x| \in [\varepsilon, k]$. Thus, by a change of variable, we have
\[ \mathrm{p.v.}\Big(\frac{1}{x}\Big)(u) = \lim_{\varepsilon \to 0+} \int_{[\varepsilon, k]} \frac{u(x) - u(-x)}{x} \, dx. \]
Now it is clear that the integrand is continuous for all $x \in \mathbb{R} \setminus \{0\}$. What is more, the integrand approaches $2u'(0)$ for $x \to 0$, so the integrand has a removable discontinuity at $x = 0$. That is, by assigning the value $2u'(0)$ to the integrand at $x = 0$, the integrand becomes continuous in $[0, k]$. This means that the integrand is Lebesgue measurable on $[0, k]$. Then, by defining $f_n(x) = \chi_{[1/n, k]} \big( u(x) - u(-x) \big)/x$ (where $\chi$ is the characteristic function), and applying the Lebesgue dominated convergence theorem, we have
\[ \mathrm{p.v.}\Big(\frac{1}{x}\Big)(u) = \int_{[0, k]} \frac{u(x) - u(-x)}{x} \, dx. \]
It follows that $\mathrm{p.v.}(\frac{1}{x})(u)$ is finite, i.e., $\mathrm{p.v.}(\frac{1}{x})$ takes values in $\mathbb{C}$. Since $\mathcal{D}(U)$ is a vector space, it follows easily from the above expression that $\mathrm{p.v.}(\frac{1}{x})$ is linear.
To prove that $\mathrm{p.v.}(\frac{1}{x})$ is continuous, we shall use condition (3) of the theorem in the entry on distributions (509.5). For this, suppose $K$ is a compact subset of $\mathbb{R}$ and $u \in \mathcal{D}_K$. Again, we can assume that $K \subset [-k, k]$ for some $k > 0$. For $x > 0$, we have
\[ \Big| \frac{u(x) - u(-x)}{x} \Big| = \frac{1}{x} \Big| \int_{(-x, x)} u'(t) \, dt \Big| \le 2 \|u'\|_\infty, \]
where $\|\cdot\|_\infty$ is the supremum norm. In the first equality we have used the Fundamental theorem of calculus (valid since $u$ is absolutely continuous on $[-k, k]$). Thus
\[ \Big| \mathrm{p.v.}\Big(\frac{1}{x}\Big)(u) \Big| \le 2k \|u'\|_\infty \]
and $\mathrm{p.v.}(\frac{1}{x})$ is a distribution of first order as claimed. □
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980. 2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
Version: 4 Owner: matte Author(s): matte
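The two expressions for the principal value used in this entry can be compared numerically. The sketch below is only an illustration under stated assumptions: the test function `u` (a bump function times $1 + x$, supported in $[-1, 1]$), the midpoint quadrature rule `midpoint`, and the cutoff `eps` are all choices made here, not part of the entry. The truncated symmetric integral over $\varepsilon < |x| < k$ should approach the folded integral of $(u(x) - u(-x))/x$ over $[0, k]$ as $\varepsilon \to 0$.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def u(x):
    """Test function with compact support; chosen only for illustration."""
    return (1.0 + x) * bump(x)

def midpoint(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k = 1.0
eps = 1e-3

# Truncated symmetric integral over eps < |x| < k.
truncated = (midpoint(lambda x: u(x) / x, -k, -eps)
             + midpoint(lambda x: u(x) / x, eps, k))

# Folded form: integral over [0, k] of (u(x) - u(-x)) / x.
folded = midpoint(lambda x: (u(x) - u(-x)) / x, 0.0, k)
print(truncated, folded)
```

For this particular `u` the folded integrand equals `2 * bump(x)`, so the two values differ by roughly `2 * eps * bump(0)`.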
509.3 Cauchy principal part integral
Definition [4, 2, 2] Let $C_0^\infty(\mathbb{R})$ be the set of smooth functions with compact support on $\mathbb{R}$. Then the Cauchy principal part integral $\mathrm{p.v.}(\frac{1}{x})$ is the mapping $\mathrm{p.v.}(\frac{1}{x}) : C_0^\infty(\mathbb{R}) \to \mathbb{C}$ defined as
\[ \mathrm{p.v.}\Big(\frac{1}{x}\Big)(u) = \lim_{\varepsilon \to 0+} \int_{|x| > \varepsilon} \frac{u(x)}{x} \, dx \]
for $u \in C_0^\infty(\mathbb{R})$.
Theorem The mapping $\mathrm{p.v.}(\frac{1}{x})$ is a distribution of first order. That is, $\mathrm{p.v.}(\frac{1}{x}) \in \mathcal{D}'^1(\mathbb{R})$.
(For the proof, see 509.2.)
Properties
1. The distribution $\mathrm{p.v.}(\frac{1}{x})$ is obtained as the limit ([2], pp. 250)
\[ \frac{\chi_{n|x| > 1}}{x} \to \mathrm{p.v.}\Big(\frac{1}{x}\Big) \]
as $n \to \infty$. Here, $\chi$ is the characteristic function, the locally integrable functions on the left hand side should be interpreted as distributions (see 509.7), and the limit should be taken in $\mathcal{D}'(\mathbb{R})$.
2. Let $\ln |t|$ be the distribution induced by the locally integrable function $\ln |t| : \mathbb{R} \to \mathbb{R}$. Then, for the distributional derivative $D$, we have ([2], pp. 149)
\[ D(\ln |t|) = \mathrm{p.v.}\Big(\frac{1}{x}\Big). \]
REFERENCES
1. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980. 2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998. 3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
Version: 5 Owner: matte Author(s): matte
509.4
delta distribution
Let $U$ be an open subset of $\mathbb{R}^n$ such that $0 \in U$. Then the delta distribution is the mapping [2, 3, 4]
\[ \delta : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto u(0). \]
Claim The delta distribution is a distribution of zeroth order, i.e., $\delta \in \mathcal{D}'^0(U)$.
Proof. With obvious notation, we have
\[ \delta(u + v) = (u + v)(0) = u(0) + v(0) = \delta(u) + \delta(v), \qquad \delta(\alpha u) = (\alpha u)(0) = \alpha u(0) = \alpha \delta(u), \]
so $\delta$ is linear. To see that $\delta$ is continuous, we use condition (3) of the theorem in the entry on distributions (509.5). Indeed, if $K$ is a compact set in $U$, and $u \in \mathcal{D}_K$, then
\[ |\delta(u)| = |u(0)| \le \|u\|_\infty, \]
where $\|\cdot\|_\infty$ is the supremum norm. □
REFERENCES
1. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 3. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.
Version: 1 Owner: matte Author(s): matte
509.5
distribution
Definition [1] Suppose $U$ is an open set in $\mathbb{R}^n$, and suppose $\mathcal{D}(U)$ is the topological vector space of smooth functions with compact support. A distribution is a linear continuous functional on $\mathcal{D}(U)$, i.e., a linear continuous mapping $\mathcal{D}(U) \to \mathbb{C}$. The set of all distributions on $U$ is denoted by $\mathcal{D}'(U)$.
Suppose $T$ is a linear functional on $\mathcal{D}(U)$. Then $T$ is continuous if and only if $T$ is continuous at the origin (see 504.4). This condition can be rewritten in various ways, and the theorem below gives two convenient conditions that can be used to prove that a linear mapping is a distribution.
Theorem Let $U$ be an open set in $\mathbb{R}^n$, and let $T$ be a linear functional on $\mathcal{D}(U)$. Then the following are equivalent:
1. $T$ is a distribution.
2. If $K$ is a compact set in $U$, and $\{u_i\}_{i=1}^{\infty}$ is a sequence in $\mathcal{D}_K$ such that for any multi-index $\alpha$, we have $D^\alpha u_i \to 0$ in the supremum norm as $i \to \infty$, then $T(u_i) \to 0$ in $\mathbb{C}$.
3. For any compact set $K$ in $U$, there are constants $C > 0$ and $k \in \{0, 1, 2, \ldots\}$ such that for all $u \in \mathcal{D}_K$, we have
\[ |T(u)| \le C \sum_{|\alpha| \le k} \|D^\alpha u\|_\infty, \tag{509.5.1} \]
where $\alpha$ is a multi-index, and $\|\cdot\|_\infty$ is the supremum norm.
Proof The equivalence of (2) and (3) can be found in 509.6, and the equivalence of (1) and (3) is shown in [3], pp. 141.
If $T$ is a distribution on an open set $U$, and the same $k$ can be used for any $K$ in inequality (509.5.1), then $T$ is a distribution of order $k$. The set of all such distributions is denoted by $\mathcal{D}'^k(U)$. Further, the set of all distributions of finite order on $U$ is defined as [4]
\[ \mathcal{D}'_F(U) = \{ T \in \mathcal{D}'(U) \mid T \in \mathcal{D}'^k(U) \text{ for some } k < \infty \}. \]
A common notation for the action of a distribution $T$ on a test function $u \in \mathcal{D}(U)$ (i.e., $T(u)$ with the above notation) is $\langle T, u \rangle$. The motivation for this notation comes from the distributions induced by locally integrable functions (see 509.7).
Topology for $\mathcal{D}'(U)$ The standard topology for $\mathcal{D}'(U)$ is the weak∗ topology. In this topology, a sequence $\{T_i\}_{i=1}^{\infty}$ of distributions (in $\mathcal{D}'(U)$) converges to a distribution $T \in \mathcal{D}'(U)$ if and only if $T_i(u) \to T(u)$ (in $\mathbb{C}$) as $i \to \infty$ for every $u \in \mathcal{D}(U)$ [3].
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
3. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990.
Version: 6 Owner: matte Author(s): matte
509.6
equivalence of conditions
Let us first show the equivalence of (2) and (3) following [4], pp. 35. First, the proof that (3) implies (2) is a direct calculation. Next, let us show that (2) implies (3): Suppose $T(u_i) \to 0$ in $\mathbb{C}$ whenever $K$ is a compact set in $U$ and $\{u_i\}_{i=1}^{\infty}$ is a sequence in $\mathcal{D}_K$ such that for any multi-index $\alpha$, we have $D^\alpha u_i \to 0$ in the supremum norm $\|\cdot\|_\infty$ as $i \to \infty$. For a contradiction, suppose there is a compact set $K$ in $U$ such that for all constants $C > 0$ and $k \in \{0, 1, 2, \ldots\}$ there exists a function $u \in \mathcal{D}_K$ such that
\[ |T(u)| > C \sum_{|\alpha| \le k} \|D^\alpha u\|_\infty. \]
Then, for $C = k = 1, 2, \ldots$ we obtain functions $u_1, u_2, \ldots$ in $\mathcal{D}_K$ such that $|T(u_i)| > i \sum_{|\alpha| \le i} \|D^\alpha u_i\|_\infty$. Thus $|T(u_i)| > 0$ for all $i$, so for $v_i = u_i / |T(u_i)|$, we have
\[ 1 > i \sum_{|\alpha| \le i} \|D^\alpha v_i\|_\infty. \]
It follows that $\|D^\alpha v_i\|_\infty < 1/i$ for any multi-index $\alpha$ with $|\alpha| \le i$. Thus $\{v_i\}_{i=1}^{\infty}$ satisfies our assumption, whence $T(v_i)$ should tend to 0. However, for all $i$, we have $|T(v_i)| = 1$. This contradiction completes the proof. TODO: The equivalence of (1) and (3) is given in [3]. □
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
Version: 3 Owner: matte Author(s): matte
509.7
every locally integrable function is a distribution
Suppose $U$ is an open set in $\mathbb{R}^n$ and $f$ is a locally integrable function on $U$, i.e., $f \in L^1_{\mathrm{loc}}(U)$. Then the mapping
\[ T_f : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto \int_U f(x) u(x) \, dx \]
is a zeroth order distribution [4, 2]. (Here, $\mathcal{D}(U)$ is the set of smooth functions with compact support on $U$.) (For the proof, see 509.1.)
If $f$ and $g$ are both locally integrable functions on an open set $U$, and $T_f = T_g$, then it follows (see this page) that $f = g$ almost everywhere. Thus, the mapping $f \mapsto T_f$ is a linear injection when $L^1_{\mathrm{loc}}$ is equipped with the usual equivalence relation for an $L^p$ space. For this reason, one also writes $f$ for the distribution $T_f$ [2].
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. S. Lang, Analysis II, Addison-Wesley Publishing Company Inc., 1969.
Version: 2 Owner: matte Author(s): matte
509.8
localization for distributions
Definition [1, 3] Suppose $U$ is an open set in $\mathbb{R}^n$ and $T$ is a distribution $T \in \mathcal{D}'(U)$. Then we say that $T$ vanishes on an open set $V \subset U$, if the restriction of $T$ to $V$ is the zero distribution on $V$. In other words, $T$ vanishes on $V$, if $T(v) = 0$ for all $v \in C_0^\infty(V)$. (Here $C_0^\infty(V)$ is the set of smooth functions with compact support in $V$.) Similarly, we say that two distributions $S, T \in \mathcal{D}'(U)$ are equal, or coincide on $V$, if $S - T$ vanishes on $V$. We then write: $S = T$ on $V$.
Theorem [1, 4] Suppose $U$ is an open set in $\mathbb{R}^n$ and $\{U_i\}_{i \in I}$ is an open cover of $U$, i.e.,
\[ U = \bigcup_{i \in I} U_i. \]
Here, $I$ is an arbitrary index set. If $S, T$ are distributions on $U$, such that $S = T$ on each $U_i$, then $S = T$ (on $U$).
Proof. Suppose $u \in \mathcal{D}(U)$. Our aim is to show that $S(u) = T(u)$. First, we have $\operatorname{supp} u \subset K$ for some compact $K \subset U$. It follows that there exists a finite collection of sets $U_i$ from the open cover, say $U_1, \ldots, U_N$, such that $K \subset \bigcup_{i=1}^{N} U_i$. By a smooth partition of unity (see e.g. [2] pp. 137), there are smooth functions $\varphi_1, \ldots, \varphi_N : U \to \mathbb{R}$ such that
1. $\operatorname{supp} \varphi_i \subset U_i$ for all $i$,
2. $\varphi_i(x) \in [0, 1]$ for all $x \in U$ and all $i$,
3. $\sum_{i=1}^{N} \varphi_i(x) = 1$ for all $x \in K$.
From the first property, and from a property for the support of a function, it follows that $\operatorname{supp} \varphi_i u \subset \operatorname{supp} \varphi_i \cap \operatorname{supp} u \subset U_i$. Therefore, for each $i$, $S(\varphi_i u) = T(\varphi_i u)$ since $S$ and $T$ coincide on $U_i$. Then
\[ S(u) = \sum_{i=1}^{N} S(\varphi_i u) = \sum_{i=1}^{N} T(\varphi_i u) = T(u), \]
and the theorem follows. □
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 3. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 4. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998.
Version: 4 Owner: matte Author(s): matte
509.9
operations on distributions
Let us assume that $U$ is an open set in $\mathbb{R}^n$. Then we can define the operations below for distributions in $\mathcal{D}'(U)$. To prove that these operations indeed give rise to other distributions, one can use condition (2) of the theorem in the entry on distributions (509.5).
Vector space structure of $\mathcal{D}'(U)$ Suppose $S, T$ are distributions in $\mathcal{D}'(U)$ and $\alpha$ is a complex number. Then it is natural to define [4]
\[ S + T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto S(u) + T(u) \]
and
\[ \alpha T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto \alpha T(u). \]
It is readily shown that these are again distributions. Thus $\mathcal{D}'(U)$ is a complex vector space.
Restriction of distribution Suppose $T$ is a distribution in $\mathcal{D}'(U)$, and $V$ is an open subset of $U$. Then the restriction of the distribution $T$ onto $V$ is the distribution $T|_V \in \mathcal{D}'(V)$ defined as [4]
\[ T|_V : \mathcal{D}(V) \to \mathbb{C}, \qquad v \mapsto T(v). \]
Again, using condition (2) of the theorem in 509.5, one can check that $T|_V$ is indeed a distribution.
Derivative of distribution Suppose $T$ is a distribution in $\mathcal{D}'(U)$, and $\alpha$ is a multi-index. Then the $\alpha$-derivative of $T$ is the distribution $\partial^\alpha T \in \mathcal{D}'(U)$ defined as
\[ \partial^\alpha T : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto (-1)^{|\alpha|} T(\partial^\alpha u), \]
where the last $\partial^\alpha$ is the usual derivative for smooth functions. Suppose $\alpha$ is a multi-index, and $f : U \to \mathbb{C}$ is a locally integrable function, all of whose partial derivatives up to order $|\alpha|$ are continuous. Then, if $T_f$ is the distribution induced by $f$, we have ([3], pp. 143)
\[ \partial^\alpha T_f = T_{\partial^\alpha f}. \]
This means that the derivative of a distribution coincides with the usual definition of the derivative provided that the distribution is induced by a sufficiently smooth function. If $\alpha$ and $\beta$ are multi-indices, then for any $T \in \mathcal{D}'(U)$ we have
\[ \partial^\alpha \partial^\beta T = \partial^\beta \partial^\alpha T. \]
This follows since the corresponding relation holds in $\mathcal{D}(U)$ (see this page).
Multiplication of distribution and smooth function Suppose $T$ is a distribution in $\mathcal{D}'(U)$, and $f$ is a smooth function on $U$, i.e., $f \in C^\infty(U)$. Then $fT$ is the distribution $fT \in \mathcal{D}'(U)$ defined as
\[ fT : \mathcal{D}(U) \to \mathbb{C}, \qquad u \mapsto T(fu), \]
where $fu$ is the smooth mapping $fu : x \mapsto f(x) u(x)$. The proof that $fT$ is a distribution is an application of Leibniz' rule [3].
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973.
Version: 4 Owner: matte Author(s): matte
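The definition of the distributional derivative, $(\partial T)(u) = -T(u')$ in one dimension, can be illustrated with a toy model in which a distribution is a Python function acting on test functions. Everything in this sketch is an assumption made for illustration: a test function is modelled as a pair `(u, u')` with hand-coded derivative, integrals are approximated by a midpoint rule, and the example (the derivative of the Heaviside step function equals the delta distribution) is a standard fact that is not stated in the entry itself.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dbump(x):
    """Derivative of bump, computed by hand via the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# A test function is modelled as a pair (u, u') supported in [-1, 1].
test_fn = (lambda x: (2.0 + x) * bump(x),
           lambda x: bump(x) + (2.0 + x) * dbump(x))

def delta(u_pair):
    """Delta distribution: delta(u) = u(0)."""
    return u_pair[0](0.0)

def heaviside_dist(u_pair):
    """T_H(u) = integral of H(x) u(x) dx, H the Heaviside step function."""
    return integrate(u_pair[0], 0.0, 1.0)

def derivative(T):
    """Distributional derivative: (dT)(u) = -T(u')."""
    # Only first derivatives are available in this toy model, so the
    # second slot of the differentiated pair is left as None.
    return lambda u_pair: -T((u_pair[1], None))

lhs = derivative(heaviside_dist)(test_fn)   # -(integral of u' over [0, 1])
rhs = delta(test_fn)                        # u(0)
print(lhs, rhs)
```

Both values should agree up to quadrature error, since $-\int_0^1 u'(x)\,dx = u(0) - u(1) = u(0)$ for a test function vanishing at 1.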
509.10
smooth distribution
Definition 1 Suppose $U$ is an open set in $\mathbb{R}^n$, suppose $T$ is a distribution on $U$, i.e., $T \in \mathcal{D}'(U)$, and suppose $V$ is an open set $V \subset U$. Then we say that $T$ is smooth on $V$, if there exists a smooth function $f : V \to \mathbb{C}$ such that $T|_V = T_f$. In other words, $T$ is smooth on $V$, if the restriction of $T$ to $V$ coincides with the distribution induced by some smooth function $f : V \to \mathbb{C}$.
Definition 2 [1, 2] Suppose $U$ is an open set in $\mathbb{R}^n$ and $T \in \mathcal{D}'(U)$. Then the singular support of $T$ (which is denoted by $\operatorname{sing\,supp} T$) is the complement of the largest open set where $T$ is smooth.
Examples
1. [2] On $\mathbb{R}$, let $f$ be the function
\[ f(x) = \begin{cases} +1 & \text{when } x \text{ is irrational,} \\ 0 & \text{when } x \text{ is rational.} \end{cases} \]
Then the distribution induced by $f$, that is $T_f$, is smooth. Indeed, let $1$ be the smooth function $x \mapsto 1$. Since $f = 1$ almost everywhere, we have $T_f = T_1$ (see this page), so $T_f$ is smooth.
2. For the delta distribution $\delta$, we have $\operatorname{sing\,supp} \delta = \{0\}$.
3. For any distribution $T \in \mathcal{D}'(U)$, we have [1] $\operatorname{sing\,supp} T \subset \operatorname{supp} T$, where $\operatorname{supp} T$ is the support of $T$.
4. Let $f$ be a smooth function $f : U \to \mathbb{C}$. Then $\operatorname{sing\,supp} T_f$ is empty [1].
REFERENCES
1. J. Barros-Neto, An introduction to the theory of distributions, Marcel Dekker, Inc., 1973. 2. A. Grigis, J. Sjöstrand, Microlocal Analysis for Differential Operators, Cambridge University Press, 1994. 3. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991.
Version: 2 Owner: matte Author(s): matte
509.11
space of rapidly decreasing functions
Definition [4, 2] The space of rapidly decreasing functions is the function space
\[ \mathcal{S}(\mathbb{R}^n) = \{ f \in C^\infty(\mathbb{R}^n) \mid \|f\|_{\alpha, \beta} < \infty \text{ for all multi-indices } \alpha, \beta \}, \]
where $C^\infty(\mathbb{R}^n)$ is the set of smooth functions from $\mathbb{R}^n$ to $\mathbb{C}$, and
\[ \|f\|_{\alpha, \beta} = \|x^\alpha D^\beta f\|_\infty = \sup_{x \in \mathbb{R}^n} |x^\alpha D^\beta f(x)|. \]
Here, $\|\cdot\|_\infty$ is the supremum norm, and we use multi-index notation. When the dimension $n$ is clear, it is convenient to write $\mathcal{S} = \mathcal{S}(\mathbb{R}^n)$. The space $\mathcal{S}$ is also called the Schwartz space, after Laurent Schwartz (1915-2002) [3].
The set $\mathcal{S}$ is closed under pointwise addition and under multiplication by a complex scalar. Thus $\mathcal{S}$ is a complex vector space.
Examples of functions in $\mathcal{S}$
1. If $k \in \{0, 1, 2, \ldots\}$, and $a$ is a positive real number, then [2] $x^k \exp\{-a x^2\} \in \mathcal{S}$.
2. Any smooth function with compact support $f$ is in $\mathcal{S}$. This is clear since any derivative of $f$ is continuous and compactly supported, so $x^\alpha D^\beta f$ has a maximum in $\mathbb{R}^n$.
Properties
1. For any $1 \le p \le \infty$, we have [2, 4] $\mathcal{S}(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)$, where $L^p(\mathbb{R}^n)$ is the space of p-integrable functions.
2. Using Leibniz' rule, it follows that $\mathcal{S}$ is also closed under pointwise multiplication; if $f, g \in \mathcal{S}$, then $fg : x \mapsto f(x) g(x)$ is also in $\mathcal{S}$.
REFERENCES
1. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 2. S. Igari, Real analysis - With an introduction to Wavelet Theory, American Mathematical Society, 1998. 3. The MacTutor History of Mathematics archive, Laurent Schwartz 4. M. Reed, B. Simon, Methods of Modern Mathematical Physics: Functional Analysis I, Revised and enlarged edition, Academic Press, 1980.
Version: 2 Owner: matte Author(s): matte
509.12
support of distribution
Definition [1, 2, 3, 4] Let $U$ be an open set in $\mathbb{R}^n$ and let $T$ be a distribution $T \in \mathcal{D}'(U)$. Then the support of $T$ is the complement of the union of all open sets $V \subset U$ where $T$ vanishes. This set is denoted by $\operatorname{supp} T$. If we denote by $T|_V$ the restriction of $T$ to the set $V$, then we have the following formula
\[ \operatorname{supp} T = U \setminus \bigcup \{ V \subset U \mid V \text{ is open, and } T|_V = 0 \}. \]
Examples and properties [2, 1] Let $U$ be an open set in $\mathbb{R}^n$.
1. For the delta distribution, $\operatorname{supp} \delta = \{0\}$ provided that $0 \in U$.
2. For any distribution $T$, the support $\operatorname{supp} T$ is closed.
3. Suppose $T_f$ is the distribution induced by a continuous function $f : U \to \mathbb{C}$. Then the above definition for the support of $T_f$ is compatible with the usual definition for the support of the function $f$, i.e., $\operatorname{supp} T_f = \operatorname{supp} f$.
4. If $T \in \mathcal{D}'(U)$, then we have for any multi-index $\alpha$, $\operatorname{supp} D^\alpha T \subset \operatorname{supp} T$.
5. If $T \in \mathcal{D}'(U)$ and $f \in \mathcal{D}(U)$, then $\operatorname{supp}(fT) \subset \operatorname{supp} f \cap \operatorname{supp} T$.
Theorem [2, 3] Suppose $U$ is an open set in $\mathbb{R}^n$. If $T$ is a distribution with compact support in $U$, then $T$ is a distribution of finite order. What is more, if $\operatorname{supp} T$ is a point, say $\operatorname{supp} T = \{p\}$, then $T$ is of the form
\[ T = \sum_{|\alpha| \le N} C_\alpha D^\alpha \delta_p, \]
for some $N \ge 0$, and complex constants $C_\alpha$. Here, $\delta_p$ is the delta distribution at $p$; $\delta_p(u) = u(p)$ for $u \in \mathcal{D}(U)$.
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed, John Wiley & Sons, Inc., 1999. 2. J. Rauch, Partial Differential Equations, Springer-Verlag, 1991. 3. W. Rudin, Functional Analysis, McGraw-Hill Book Company, 1973. 4. L. Hörmander, The Analysis of Linear Partial Differential Operators I, (Distribution theory and Fourier Analysis), 2nd ed, Springer-Verlag, 1990. 5. R.E. Edwards, Functional Analysis: Theory and Applications, Dover Publications, 1995.
Version: 3 Owner: matte Author(s): matte
Chapter 510 46H05 – General theory of topological algebras
510.1 Banach algebra
Definition 19. A Banach algebra is a Banach space with a multiplication law compatible with the norm, i.e. $\|ab\| \le \|a\| \|b\|$ (product inequality).
Definition 20. A Banach *-algebra is a Banach algebra with an involution $*$ satisfying the following properties:
\[ a^{**} = a, \tag{510.1.1} \]
\[ (ab)^* = b^* a^*, \tag{510.1.2} \]
\[ (\lambda a + \mu b)^* = \bar{\lambda} a^* + \bar{\mu} b^* \quad \forall \lambda, \mu \in \mathbb{C}, \tag{510.1.3} \]
\[ \|a^*\| = \|a\|. \tag{510.1.4} \]
Example 25. The algebra of bounded operators on a Banach space is a Banach algebra for the operator norm. Version: 4 Owner: mhale Author(s): mhale
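The product inequality can be observed concretely for operators on the finite dimensional Banach space $(\mathbb{R}^n, \|\cdot\|_\infty)$, where the operator norm of a matrix is its maximum absolute row sum. The sketch below is an illustration only: the helper names `op_norm_inf` and `matmul` and the two sample matrices are choices made here.

```python
def op_norm_inf(a):
    """Operator norm induced by the max norm on R^n: maximum absolute row sum."""
    return max(sum(abs(t) for t in row) for row in a)

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

a = [[1.0, -2.0], [0.5, 3.0]]
b = [[0.0, 1.0], [-4.0, 2.0]]
ab = matmul(a, b)   # [[8.0, -3.0], [-12.0, 6.5]]

norm_ab = op_norm_inf(ab)                        # max(11, 18.5) = 18.5
norm_product = op_norm_inf(a) * op_norm_inf(b)   # 3.5 * 6 = 21
print(norm_ab, norm_product)
```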
Chapter 511 46L05 – General theory of $C^*$-algebras
511.1 $C^*$-algebra
Definition 43. A $C^*$-algebra $A$ is a Banach *-algebra such that $\|a^* a\| = \|a\|^2$ for all $a \in A$. Version: 2 Owner: mhale Author(s): mhale
511.2
Gelfand-Naimark representation theorem
Every $C^*$-algebra is isomorphic to a $C^*$-subalgebra (norm closed *-subalgebra) of some $B(H)$, the algebra of bounded operators on some Hilbert space $H$. In particular, every finite dimensional $C^*$-algebra is isomorphic to a direct sum of matrix algebras. Version: 2 Owner: mhale Author(s): mhale
511.3 state
Definition 44. A state $\Psi$ on a C*-algebra $A$ is a positive linear functional $\Psi : A \to \mathbb{C}$, $\Psi(a^* a) \ge 0$ for all $a \in A$, with unit norm. The norm of a positive linear functional is defined by
\[ \|\Psi\| = \sup_{a \in A} \{ |\Psi(a)| : \|a\| \le 1 \}. \tag{511.3.1} \]
For a unital C*-algebra, $\|\Psi\| = \Psi(\mathbb{1})$.
The space of states is a convex set. Let $\Psi_1$ and $\Psi_2$ be states; then the convex combination
\[ \lambda \Psi_1 + (1 - \lambda) \Psi_2, \quad \lambda \in [0, 1], \tag{511.3.2} \]
is also a state. Definition 45. A state is pure if it is not a convex combination of two other states. Pure states are the extreme points of the convex set of states. A pure state on a commutative C*-algebra is equivalent to a character. When a C*-algebra is represented on a Hilbert space $H$, every unit vector $\psi \in H$ determines a (not necessarily pure) state in the form of an expectation value,
\[ \Psi(a) = \langle \psi, a\psi \rangle. \tag{511.3.3} \]
In physics, it is common to refer to such states by their vector $\psi$ rather than the linear functional $\Psi$. The converse is not always true; not every state need be given by an expectation value. For example, delta functions (which are distributions, not functions) give pure states on $C_0(X)$, but they do not correspond to any vector in a Hilbert space (such a vector would not be square-integrable).
Version: 1 Owner: mhale Author(s): mhale
Chapter 512 46L85 – Noncommutative topology
512.1 Gelfand-Naimark theorem
Let $\mathbf{Haus}$ be the category of locally compact Hausdorff spaces with continuous proper maps as morphisms. And, let $\mathbf{C^*Alg}$ be the category of commutative C*-algebras with proper *-homomorphisms (those that send approximate units into approximate units) as morphisms. There is a contravariant functor $C : \mathbf{Haus}^{\mathrm{op}} \to \mathbf{C^*Alg}$ which sends each locally compact Hausdorff space $X$ to the commutative C*-algebra $C_0(X)$ ($C(X)$ if $X$ is compact). Conversely, there is a contravariant functor $M : \mathbf{C^*Alg}^{\mathrm{op}} \to \mathbf{Haus}$ which sends each commutative C*-algebra $A$ to the space of characters on $A$ (with the Gelfand topology). The functors $C$ and $M$ form an equivalence of categories. Version: 1 Owner: mhale Author(s): mhale
512.2
Serre-Swan theorem
Let $X$ be a compact Hausdorff space. Let $\mathbf{Vec}(X)$ be the category of complex vector bundles over $X$. And, let $\mathbf{ProjMod}(C(X))$ be the category of finitely generated projective modules over the C*-algebra $C(X)$. There is a functor $\Gamma : \mathbf{Vec}(X) \to \mathbf{ProjMod}(C(X))$ which sends each complex vector bundle $E \to X$ to the $C(X)$-module $\Gamma(X, E)$ of continuous sections. The functor $\Gamma$ is an equivalence of categories. Version: 1 Owner: mhale Author(s): mhale
Chapter 513 46T12 – Measure (Gaussian, cylindrical, etc.) and integrals (Feynman, path, Fresnel, etc.) on manifolds
513.1 path integral
The path integral is a generalization of the integral that is very useful in theoretical and applied physics. Consider a vector field $F : \mathbb{R}^n \to \mathbb{R}^n$ and a path $P \subset \mathbb{R}^n$. The path integral of $F$ along the path $P$ is defined as a definite integral. It can be construed as the limit of Riemann sums of the values of $F$ along the curve $P$, aka the area under the curve $S : P \to F$. Thusly, it is defined in terms of the parametrization of $P$, mapped into the domain $\mathbb{R}^n$ of $F$. Analytically,
\[ \int_P F \cdot dx = \int_a^b F(P(t)) \cdot dx \]
where $P(a)$, $P(b)$ are elements of $\mathbb{R}^n$, and $dx = \left( \frac{dx_1}{dt}, \cdots, \frac{dx_n}{dt} \right) dt$, where each $x_i$ is parametrized as a function of $t$.
Proof and existence of path integral: Assume we have a parametrized curve $P(t)$ with $t \in [a, b]$. We want to construct a sum of $F$ over this interval on the curve $P$. Split the interval $[a, b]$ into $n$ subintervals of size $\Delta t = (b - a)/n$. This means that the path $P$ has been divided into $n$ segments of lesser change in tangent vector. Note that the arc lengths need not be of equal length, though the intervals are of equal size. Let $t_i$ be an element of the $i$th subinterval. The quantity $|P'(t_i)|$ gives the average magnitude of the vector tangent to the curve at a point in the interval $\Delta t$. $|P'(t_i)| \Delta t$ is then the approximate arc length of the curve segment produced by the subinterval $\Delta t$. Since we want to sum $F$ over our curve $P$, we let the range of our curve equal the domain of $F$. We can then dot the vector $F(P(t_i))$ with our tangent vector $P'(t_i)$ to get the approximation to $F$ at the point $P(t_i)$. Thus, to get the sum we want, we can take the limit as $\Delta t$ approaches 0:
\[ \lim_{\Delta t \to 0} \sum_{i=1}^{n} F(P(t_i)) \cdot P'(t_i) \, \Delta t \]
This is a Riemann sum, and thus we can write it in integral form. This integral is known as a path or line integral (the older name).
\[ \int_P F \cdot dx = \int_a^b F(P(t)) \cdot P'(t) \, dt \]
Note that the path integral only exists if the definite integral exists on the interval $[a, b]$.
Properties: A path integral that begins and ends at the same point is called a closed path integral, and is denoted with the summa symbol with a centered circle: $\oint$. These types of path integrals can also be evaluated using Green's theorem. Another property of path integrals is that the directed path integral on a path $C$ in a vector field is equal to the negative of the path integral in the opposite direction along the same path. A directed path integral on a closed path is denoted by summa and a circle with an arrow denoting direction.
Visualization Aids:
[Figure: a path $P$ superimposed on a vector field $F$.]
[Figure: a visualization of what we are doing when we take the integral under the curve $S : P \to F$.]
Version: 9 Owner: slider142 Author(s): slider142
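The Riemann-sum construction above can be sketched numerically. The following is a minimal illustration, not part of the original entry: the field $F(x, y) = (-y, x)$, the unit-circle parametrization, and the helper name `line_integral` are all assumptions chosen for the example. It also checks the sign-reversal property by traversing the same circle in the opposite direction.

```python
import math

def line_integral(F, P, dP, a, b, n=10_000):
    """Midpoint Riemann sum for the line integral of F along P(t), a <= t <= b."""
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t_i = a + (i + 0.5) * dt      # a point in the i-th subinterval
        field = F(P(t_i))             # F evaluated on the curve
        tangent = dP(t_i)             # tangent vector P'(t_i)
        total += sum(f * v for f, v in zip(field, tangent)) * dt
    return total

# Hypothetical example: F(x, y) = (-y, x) along the unit circle, counterclockwise.
F = lambda p: (-p[1], p[0])
P = lambda t: (math.cos(t), math.sin(t))
dP = lambda t: (-math.sin(t), math.cos(t))
forward = line_integral(F, P, dP, 0.0, 2 * math.pi)

# The same circle traversed in the opposite direction: Q(t) = P(2*pi - t).
Q = lambda t: P(2 * math.pi - t)
dQ = lambda t: tuple(-v for v in dP(2 * math.pi - t))
backward = line_integral(F, Q, dQ, 0.0, 2 * math.pi)

print(forward, backward)  # close to 2*pi and -2*pi respectively
```

For this particular field the integrand $F(P(t)) \cdot P'(t)$ equals 1 everywhere on the circle, so the sum approximates the circumference $2\pi$.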
Chapter 514 47A05 – General (adjoints, conjugates, products, inverses, domains, ranges, etc.)
514.1 Baker-Campbell-Hausdorff formula(e)
Given a linear operator $A$, we define:
\[ \exp A := \sum_{k=0}^{\infty} \frac{1}{k!} A^k. \tag{514.1.1} \]
It follows that
\[ \frac{\partial}{\partial \tau} e^{\tau A} = A e^{\tau A} = e^{\tau A} A. \tag{514.1.2} \]
Consider another linear operator $B$. Let $B(\tau) = e^{\tau A} B e^{-\tau A}$. Then one can prove the following series representation for $B(\tau)$:
\[ B(\tau) = \sum_{m=0}^{\infty} \frac{\tau^m}{m!} B_m, \tag{514.1.3} \]
where $B_m = [A, B]_m := [A, [A, \ldots [A, B]]]$ ($m$ times) and $B_0 := B$. A very important special case of eq. (514.1.3) is known as the Baker-Campbell-Hausdorff (BCH) formula. Namely, for $\tau = 1$ we get:
\[ e^{A} B e^{-A} = \sum_{m=0}^{\infty} \frac{1}{m!} B_m. \tag{514.1.4} \]
Alternatively, this expression may be rewritten as
\[ [B, e^{-A}] = e^{-A} \left( [A, B] + \frac{1}{2} [A, [A, B]] + \ldots \right), \tag{514.1.5} \]
or
\[ [e^{A}, B] = \left( [A, B] + \frac{1}{2} [A, [A, B]] + \ldots \right) e^{A}. \tag{514.1.6} \]
There is a descendant of the BCH formula, which is often also referred to as the BCH formula. It provides us with the multiplication law for two exponentials of linear operators: Suppose $[A, [A, B]] = [B, [B, A]] = 0$. Then,
\[ e^{A} e^{B} = e^{A + B} e^{\frac{1}{2} [A, B]}. \tag{514.1.7} \]
Thus, if we want to commute two exponentials, we get an extra factor
\[ e^{A} e^{B} = e^{B} e^{A} e^{[A, B]}. \tag{514.1.8} \]
Version: 5 Owner: msihl Author(s): msihl
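As a sanity check of the product formulas (514.1.7) and (514.1.8), here is a small sketch using $3 \times 3$ strictly upper-triangular (Heisenberg-type) matrices. The matrices and helper names are assumptions made for this illustration; for them $[A,[A,B]] = [B,[B,A]] = 0$ holds, and the exponential series terminates, so the truncated sum is exact.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[x + c * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def commutator(X, Y):
    return lincomb(matmul(X, Y), matmul(Y, X), -1.0)

def expm(X, terms=12):
    """Truncated series sum_{k < terms} X^k / k! (exact here: X is nilpotent)."""
    n = len(X)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = lincomb(result, term)
    return result

def max_diff(X, Y):
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

# Assumed example matrices: A = E_12, B = E_23, so that [A, B] = E_13.
A = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
B = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
C = commutator(A, B)
half_C = [[0.5 * v for v in row] for row in C]

lhs = matmul(expm(A), expm(B))                        # e^A e^B
rhs_7 = matmul(expm(lincomb(A, B)), expm(half_C))     # e^{A+B} e^{[A,B]/2}
rhs_8 = matmul(matmul(expm(B), expm(A)), expm(C))     # e^B e^A e^{[A,B]}
print(max_diff(lhs, rhs_7), max_diff(lhs, rhs_8))
```

This only checks one nilpotent example; it does not replace the general proof.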
514.2
adjoint
Let $H$ be a Hilbert space and let $A : D(A) \subset H \to H$ be a densely defined linear operator. Suppose that for some $y \in H$, there exists $z \in H$ such that $(Ax, y) = (x, z)$ for all $x \in D(A)$. Then such $z$ is unique, for if $z'$ is another element of $H$ satisfying that condition, we have $(x, z - z') = 0$ for all $x \in D(A)$, which implies $z - z' = 0$ since $D(A)$ is dense. Hence we may define a new operator $A^* : D(A^*) \subset H \to H$ by
\[ D(A^*) = \{ y \in H : \text{there is } z \in H \text{ such that } (Ax, y) = (x, z) \text{ for all } x \in D(A) \}, \qquad A^*(y) = z. \]
It is easy to see that $A^*$ is linear, and it is called the adjoint of $A$. Remark. The requirement for $A$ to be densely defined is essential, for otherwise we cannot guarantee $A^*$ to be well defined. Version: 4 Owner: Koro Author(s): Koro
514.3
closed operator
Let $B$ be a Banach space. A linear operator $A : D(A) \subset B \to B$ is said to be closed if for every sequence $\{x_n\}_{n \in \mathbb{N}}$ in $D(A)$ converging to $x \in B$ such that $Ax_n \to y \in B$ as $n \to \infty$, it holds that $x \in D(A)$ and $Ax = y$. Equivalently, $A$ is closed if its graph is closed in $B \oplus B$.
Given an operator $A$, not necessarily closed, if the closure of its graph in $B \oplus B$ happens to be the graph of some operator, we call that operator the closure of $A$ and denote it by $\overline{A}$. It follows easily that $A$ is the restriction of $\overline{A}$ to $D(A)$. The following properties are easily checked:
1. Any bounded linear operator defined on the whole space $B$ is closed;
2. If $A$ is closed then $A - \lambda I$ is closed;
3. If $A$ is closed and it has an inverse, then $A^{-1}$ is also closed;
4. An operator $A$ admits a closure if and only if for every pair of sequences $\{x_n\}$ and $\{y_n\}$ in $D(A)$ converging to the same limit, such that both $\{Ax_n\}$ and $\{Ay_n\}$ converge, it holds that $\lim_n Ax_n = \lim_n Ay_n$.
Version: 2 Owner: Koro Author(s): Koro
514.4
properties of the adjoint operator
Let $A$ and $B$ be linear operators in a Hilbert space, and let $\lambda \in \mathbb{C}$. Assuming all the operators involved are densely defined, the following properties hold:
1. If $A^{-1}$ exists and is densely defined, then $(A^{-1})^* = (A^*)^{-1}$;
2. $(\lambda A)^* = \bar{\lambda} A^*$;
3. $A \subset B$ implies $B^* \subset A^*$;
4. $A^* + B^* \subset (A + B)^*$;
5. $B^* A^* \subset (AB)^*$;
6. $(A + \lambda I)^* = A^* + \bar{\lambda} I$;
7. $A^*$ is a closed operator.
Remark. The notation $A \subset B$ for operators means that $B$ is an extension of $A$, i.e. $A$ is the restriction of $B$ to a smaller domain. Also, we have the following
Proposition. 1. If $A$ admits a closure $\overline{A}$, then $A^*$ is densely defined and $(A^*)^* = \overline{A}$.
Version: 5 Owner: Koro Author(s): Koro
Chapter 515 47A35 – Ergodic theory
515.1 ergodic theorem
Let $(X, \mathcal{B}, \mu)$ be a space with finite measure (normalized so that $\mu(X) = 1$), $f \in L^1(X)$, and $T : X \to X$ be an ergodic transformation, not necessarily invertible. The ergodic theorem (often the pointwise or strong ergodic theorem) states that
\[ \frac{1}{k} \sum_{j=0}^{k-1} f(T^j x) \longrightarrow \int f \, d\mu \]
holds for almost all $x$ as $k \to \infty$. That is, for an ergodic transformation, the time average converges to the space average almost surely. Version: 3 Owner: bbukh Author(s): bbukh, drummond
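To see the time average approach the space average in a concrete case, here is a small numerical sketch. The choice of system is an assumption for illustration: the rotation $T(x) = x + \alpha \pmod 1$ on $[0, 1)$ with irrational $\alpha$, which is ergodic for Lebesgue measure, and the observable $f(x) = \cos^2(2\pi x)$, whose space average is $1/2$.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # irrational rotation number (assumed example)

def T_power(x, j):
    """j-th iterate of the rotation T(x) = x + ALPHA (mod 1)."""
    return (x + j * ALPHA) % 1.0

def time_average(f, x, k):
    """(1/k) * sum_{j=0}^{k-1} f(T^j x)."""
    return sum(f(T_power(x, j)) for j in range(k)) / k

f = lambda x: math.cos(2 * math.pi * x) ** 2   # integral over [0, 1) is 1/2

averages = {k: time_average(f, 0.1, k) for k in (10, 100, 1000, 10000)}
for k, value in averages.items():
    print(k, value)   # values drift toward 0.5 as k grows
```

The starting point 0.1 is arbitrary; the theorem only promises convergence for almost every starting point.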
Chapter 516 47A53 – (Semi) Fredholm operators; index theories
516.1 Fredholm index
Definition 47. Let $P$ be a Fredholm operator. The index of $P$ is defined as
\[ \operatorname{index}(P) = \dim \ker(P) - \dim \operatorname{coker}(P) = \dim \ker(P) - \dim \ker(P^*). \]
Note: this is well defined as $\ker(P)$ and $\ker(P^*)$ are finite-dimensional vector spaces, for $P$ Fredholm.
Properties
• $\operatorname{index}(P^*) = -\operatorname{index}(P)$.
• $\operatorname{index}(P + K) = \operatorname{index}(P)$ for any compact operator $K$.
• If $P_1 : H_1 \to H_2$ and $P_2 : H_2 \to H_3$ are Fredholm operators then $\operatorname{index}(P_2 P_1) = \operatorname{index}(P_1) + \operatorname{index}(P_2)$.
Version: 2 Owner: mhale Author(s): mhale
516.2
Fredholm operator
A Fredholm operator is a bounded operator that has a finite-dimensional kernel and cokernel. Equivalently, it is invertible modulo compact operators. That is, if $F : X \to Y$ is a Fredholm operator between two vector spaces $X$ and $Y$, then there exists a bounded operator $G : Y \to X$ such that
\[ GF - \mathbb{1}_X \in K(X), \qquad FG - \mathbb{1}_Y \in K(Y), \tag{516.2.1} \]
where $K(X)$ denotes the space of compact operators on $X$. If $F$ is Fredholm then so is its adjoint, $F^*$. Version: 4 Owner: mhale Author(s): mhale
Chapter 517 47A56 – Functions whose values are linear operators (operator and matrix valued functions, etc., including analytic and meromorphic ones)
517.1 Taylor’s formula for matrix functions
Let $p$ be a polynomial and suppose $A$ and $B$ commute, i.e. $AB = BA$. Then
\[ p(A + B) = \sum_{k=0}^{n} \frac{1}{k!} p^{(k)}(A) B^k, \]
where $n = \deg(p)$. Version: 4 Owner: bwebste Author(s): bwebste, Johan
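The formula can be checked on small integer matrices. The sketch below is an illustration under assumed data (the polynomial $p(x) = 1 - 2x + x^3$ and the matrices are made up for the example). It uses the fact that $p^{(k)}(x)/k!$ has coefficients $c_j \binom{j}{k}$, and it also shows that the identity can fail when $A$ and $B$ do not commute.

```python
from math import comb

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def mat_poly(coeffs, X):
    """Evaluate sum_j coeffs[j] * X**j (coefficients listed lowest degree first)."""
    n = len(X)
    result = [[0] * n for _ in range(n)]
    power = identity(n)
    for c in coeffs:
        result = [[r + c * p for r, p in zip(rr, pr)] for rr, pr in zip(result, power)]
        power = matmul(power, X)
    return result

def derivative_over_factorial(coeffs, k):
    """Coefficients of p^(k)(x)/k!: c_j x^j contributes c_j * C(j, k) * x^(j-k)."""
    return [c * comb(j, k) for j, c in enumerate(coeffs)][k:]

def taylor_sum(coeffs, A, B):
    """Right-hand side: sum_{k=0}^{deg p} (p^(k)(A)/k!) B^k."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    B_power = identity(n)
    for k in range(len(coeffs)):
        term = matmul(mat_poly(derivative_over_factorial(coeffs, k), A), B_power)
        total = mat_add(total, term)
        B_power = matmul(B_power, B)
    return total

p = [1, -2, 0, 1]                          # p(x) = 1 - 2x + x^3 (assumed example)
A = [[1, 2], [0, 3]]
B = mat_add(matmul(A, A), identity(2))     # B = A^2 + I commutes with A
lhs = mat_poly(p, mat_add(A, B))
rhs = taylor_sum(p, A, B)
print(lhs == rhs)

# Non-commuting pair with p(x) = x^2: the two sides differ.
A2, B2 = [[0, 1], [0, 0]], [[0, 0], [1, 0]]
lhs_nc = mat_poly([0, 0, 1], mat_add(A2, B2))
rhs_nc = taylor_sum([0, 0, 1], A2, B2)
```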
Chapter 518 47A60 – Functional calculus
518.1 Beltrami identity
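Let $q(t)$ be a function $\mathbb{R} \to \mathbb{R}$, $\dot{q} = \frac{D}{Dt} q$, and $L = L(q, \dot{q}, t)$. Begin with the time-relative Euler-Lagrange condition
\[ \frac{\partial}{\partial q} L - \frac{D}{Dt} \left( \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.1.1} \]
If $\frac{\partial}{\partial t} L = 0$, then the Euler-Lagrange condition reduces to
\[ L - \dot{q} \frac{\partial}{\partial \dot{q}} L = C, \tag{518.1.2} \]
which is the Beltrami identity. In the calculus of variations, the ability to use the Beltrami identity can vastly simplify problems, and as it happens, many physical problems have $\frac{\partial}{\partial t} L = 0$.
In space-relative terms, with $q' := \frac{D}{Dx} q$, we have
\[ \frac{\partial}{\partial q} L - \frac{D}{Dx} \left( \frac{\partial}{\partial q'} L \right) = 0. \tag{518.1.3} \]
If $\frac{\partial}{\partial x} L = 0$, then the Euler-Lagrange condition reduces to
\[ L - q' \frac{\partial}{\partial q'} L = C. \tag{518.1.4} \]
To derive the Beltrami identity, note that
\[ \frac{D}{Dt} \left( \dot{q} \frac{\partial}{\partial \dot{q}} L \right) = \ddot{q} \frac{\partial}{\partial \dot{q}} L + \dot{q} \frac{D}{Dt} \left( \frac{\partial}{\partial \dot{q}} L \right). \tag{518.1.5} \]
Multiplying (518.1.1) by $\dot{q}$, we have
\[ \dot{q} \frac{\partial}{\partial q} L - \dot{q} \frac{D}{Dt} \left( \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.1.6} \]
Now, rearranging (518.1.5) and substituting in for the rightmost term of (518.1.6), we obtain
\[ \dot{q} \frac{\partial}{\partial q} L + \ddot{q} \frac{\partial}{\partial \dot{q}} L - \frac{D}{Dt} \left( \dot{q} \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.1.7} \]
Now consider the total derivative
\[ \frac{D}{Dt} L(q, \dot{q}, t) = \dot{q} \frac{\partial}{\partial q} L + \ddot{q} \frac{\partial}{\partial \dot{q}} L + \frac{\partial}{\partial t} L. \tag{518.1.8} \]
If $\frac{\partial}{\partial t} L = 0$, then we can substitute in the left-hand side of (518.1.8) for the leading portion of (518.1.7) to get
\[ \frac{D}{Dt} L - \frac{D}{Dt} \left( \dot{q} \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.1.9} \]
Integrating with respect to $t$, we arrive at
\[ L - \dot{q} \frac{\partial}{\partial \dot{q}} L = C, \tag{518.1.10} \]
which is the Beltrami identity.
Version: 4 Owner: drummond Author(s): drummond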
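As a brief worked illustration of the Beltrami identity (an example added here, not taken from the entry), assume the standard mechanical Lagrangian of a particle of mass $m$ in a time-independent potential $V(q)$. Since $L$ has no explicit $t$-dependence, the conserved combination can be computed directly:

```latex
% Assumed example: L = (1/2) m \dot{q}^2 - V(q), so \partial L / \partial t = 0.
\begin{align*}
\frac{\partial}{\partial \dot{q}} L &= m \dot{q}, \\
L - \dot{q}\,\frac{\partial}{\partial \dot{q}} L
  &= \tfrac{1}{2} m \dot{q}^2 - V(q) - m \dot{q}^2
   = -\left( \tfrac{1}{2} m \dot{q}^2 + V(q) \right) = C.
\end{align*}
```

Under this assumption the constant $C$ is minus the total energy, so the identity reduces to energy conservation.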
518.2
Euler-Lagrange differential equation
Let $q(t)$ be a function $\mathbb{R} \to \mathbb{R}$, $\dot{q} = \frac{D}{Dt} q$, and $L = L(q, \dot{q}, t)$. The Euler-Lagrange differential equation (or Euler-Lagrange condition) is
\[ \frac{\partial}{\partial q} L - \frac{D}{Dt} \left( \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.2.1} \]
This is the central equation of the calculus of variations. In some cases, specifically for $\frac{\partial}{\partial t} L = 0$, it can be replaced by the Beltrami identity. Version: 1 Owner: drummond Author(s): drummond
518.3
calculus of variations
Imagine a bead of mass $m$ on a wire whose endpoints are at $a = (0, 0)$ and $b = (x_f, y_f)$, with $y_f$ lower than the starting position. If gravity acts on the bead with force $F = mg$, what path (arrangement of the wire) minimizes the bead's travel time from $a$ to $b$, assuming no friction? This is the famed "brachistochrone problem," and its solution was one of the first accomplishments of the calculus of variations. Many minimum problems can be solved using the techniques introduced here. In its general form, the calculus of variations concerns quantities
\[ S[q, \dot{q}, t] = \int_a^b L(q(t), \dot{q}(t), t) \, dt \tag{518.3.1} \]
for which we wish to find a minimum or a maximum. To make this concrete, let's consider a much simpler problem than the brachistochrone: what's the shortest distance between two points $p = (x_1, y_1)$ and $q = (x_2, y_2)$? Let the variable $s$ represent distance along the path, so that $\int_p^q ds = S$. We wish to find the path such that $S$ is a minimum. Zooming in on a small portion of the path, we can see that
\[ ds^2 = dx^2 + dy^2 \tag{518.3.2} \]
\[ ds = \sqrt{dx^2 + dy^2} \tag{518.3.3} \]
If we parameterize the path by $t$, then we have
\[ ds = \sqrt{ \left( \frac{dx}{dt} \right)^2 + \left( \frac{dy}{dt} \right)^2 } \, dt \tag{518.3.4} \]
Let's assume $y = f(x)$, so that we may simplify (518.3.4) to
\[ ds = \sqrt{ 1 + \left( \frac{dy}{dx} \right)^2 } \, dx = \sqrt{1 + f'(x)^2} \, dx. \tag{518.3.5} \]
Now we have
\[ S = \int_p^q L \, dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2} \, dx \tag{518.3.6} \]
In this case, $L$ is particularly simple. Converting to $q$'s and $t$'s to make the comparison easier, we have $L = L[f'(x)] = L[\dot{q}(t)]$, not the more general $L[q(t), \dot{q}(t), t]$ covered by the calculus of variations. We'll see later how to use our $L$'s simplicity to our advantage. For now, let's talk more generally. We wish to find the path described by $L$, passing through a point $q(a)$ at $t = a$ and through $q(b)$ at $t = b$, for which the quantity $S$ is a minimum, for which small perturbations in the path produce no first-order change in $S$, which we'll call a "stationary point." This is directly analogous to the idea that for a function $f(t)$, the minimum can be found where small perturbations $\delta t$ produce no first-order change in $f(t)$. This is where $f(t + \delta t) \approx f(t)$; taking a Taylor series expansion of $f(t)$ at $t$, we find
\[ f(t + \delta t) = f(t) + \delta t \, f'(t) + O(\delta t^2) = f(t), \tag{518.3.7} \]
with $f'(t) := \frac{D}{Dt} f(t)$. Of course, since the whole point is to consider $\delta t \neq 0$, once we neglect terms $O(\delta t^2)$ this is just the point where $f'(t) = 0$. This point, call it $t = t_0$, could be a minimum or a maximum, so in the usual calculus of a single variable we'd proceed by taking the second derivative, $f''(t_0)$, and seeing if it's positive or negative to see whether the function has a minimum or a maximum at $t_0$, respectively.
In the calculus of variations, we're not considering small perturbations in $t$; we're considering small perturbations in the integral of the relatively complicated function $L(q, \dot{q}, t)$, where $\dot{q} = \frac{D}{Dt} q(t)$. $S$ is called a functional, essentially a mapping from functions to real numbers, and we can think of the minimization problem as the discovery of a minimum in $S$-space as we jiggle the parameters $q$ and $\dot{q}$. For the shortest-distance problem, it's clear that a maximum doesn't exist, since for any finite path length $S_0$ we (intuitively) can always find a curve for which the path's length is greater than $S_0$. This is often true, and we'll assume for this discussion that finding a stationary point means we've found a minimum. Formally, we write the condition that small parameter perturbations produce no change in $S$ as $\delta S = 0$. To make this precise, we simply write:
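\[ \begin{aligned} \delta S &:= S[q + \delta q, \dot{q} + \delta \dot{q}, t] - S[q, \dot{q}, t] \\ &= \int_a^b L(q + \delta q, \dot{q} + \delta \dot{q}) \, dt - S[q, \dot{q}, t] \end{aligned} \]
How are we to simplify this mess? We are considering small perturbations to the path, which suggests a Taylor series expansion of $L(q + \delta q, \dot{q} + \delta \dot{q})$ about $(q, \dot{q})$:
\[ L(q + \delta q, \dot{q} + \delta \dot{q}) = L(q, \dot{q}) + \delta q \frac{\partial}{\partial q} L(q, \dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) + O(\delta q^2) + O(\delta \dot{q}^2) \]
and since we make little error by discarding higher-order terms in $\delta q$ and $\delta \dot{q}$, we have
\[ \int_a^b L(q + \delta q, \dot{q} + \delta \dot{q}) \, dt = S[q, \dot{q}, t] + \int_a^b \left( \delta q \frac{\partial}{\partial q} L(q, \dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) \right) dt \]
Keeping in mind that $\delta \dot{q} = \frac{D}{Dt} \delta q$ and noting that
\[ \frac{D}{Dt} \left( \delta q \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) \right) = \delta q \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}), \]
a simple application of the product rule $\frac{D}{Dt}(fg) = \dot{f} g + f \dot{g}$, which allows us to substitute
\[ \delta \dot{q} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) = \frac{D}{Dt} \left( \delta q \frac{\partial}{\partial \dot{q}} L(q, \dot{q}) \right) - \delta q \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L(q, \dot{q}), \]
we can rewrite the integral, shortening $L(q, \dot{q})$ to $L$ for convenience, as:
\[ \begin{aligned} \int_a^b \left( \delta q \frac{\partial}{\partial q} L + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L \right) dt &= \int_a^b \left( \delta q \frac{\partial}{\partial q} L - \delta q \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L + \frac{D}{Dt} \left( \delta q \frac{\partial}{\partial \dot{q}} L \right) \right) dt \\ &= \int_a^b \delta q \left( \frac{\partial}{\partial q} L - \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L \right) dt + \left[ \delta q \frac{\partial}{\partial \dot{q}} L \right]_a^b \end{aligned} \]
Substituting all of this progressively back into our original expression for $\delta S$, we obtain
\[ \begin{aligned} \delta S &= \int_a^b L(q + \delta q, \dot{q} + \delta \dot{q}) \, dt - S[q, \dot{q}, t] \\ &= S + \int_a^b \left( \delta q \frac{\partial}{\partial q} L + \delta \dot{q} \frac{\partial}{\partial \dot{q}} L \right) dt - S \\ &= \int_a^b \delta q \left( \frac{\partial}{\partial q} L - \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L \right) dt + \left[ \delta q \frac{\partial}{\partial \dot{q}} L \right]_a^b \\ &= 0. \end{aligned} \]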
Two conditions come to our aid. First, we're only interested in the neighboring paths that still begin at $a$ and end at $b$, which corresponds to the condition $\delta q = 0$ at $a$ and $b$, which lets us cancel the final term. Second, between those two points, we're interested in the paths which do vary, for which $\delta q \neq 0$. This leads us to the condition
\[ \int_a^b \delta q \left( \frac{\partial}{\partial q} L - \frac{D}{Dt} \frac{\partial}{\partial \dot{q}} L \right) dt = 0. \tag{518.3.8} \]
The fundamental theorem of the calculus of variations is that for a continuous function $f(t)$, if the integral below vanishes for every smooth function $g(t)$ that is zero at $a$ and $b$ (here, every admissible perturbation $\delta q$), then $f$ itself vanishes:
\[ \int_a^b f(t) g(t) \, dt = 0 \implies f(t) = 0 \quad \forall t \in (a, b). \tag{518.3.9} \]
Using this theorem, we obtain
\[ \frac{\partial}{\partial q} L - \frac{D}{Dt} \left( \frac{\partial}{\partial \dot{q}} L \right) = 0. \tag{518.3.10} \]
This condition, one of the fundamental equations of the calculus of variations, is called the Euler-Lagrange condition. When presented with a problem in the calculus of variations, the first thing one usually does is to ask why one simply doesn't plug the problem's $L$ into this equation and solve. Recall our shortest-path problem, where we had arrived at
\[ S = \int_a^b L \, dx = \int_{x_1}^{x_2} \sqrt{1 + f'(x)^2} \, dx. \tag{518.3.11} \]
Here, $x$ takes the place of $t$, $f$ takes the place of $q$, and (518.3.10) becomes
\[ \frac{\partial}{\partial f} L - \frac{D}{Dx} \frac{\partial}{\partial f'} L = 0 \tag{518.3.12} \]
Even with $\frac{\partial}{\partial f} L = 0$, this is still ugly. However, because $\frac{\partial}{\partial x} L = 0$, we can use the Beltrami identity,
\[ L - q' \frac{\partial}{\partial q'} L = C. \tag{518.3.13} \]
(For the derivation of this useful little trick, see the corresponding entry.) Now we must simply solve
\[ \sqrt{1 + f'(x)^2} - f'(x) \frac{\partial}{\partial f'} L = C \tag{518.3.14} \]
which looks just as daunting, but quickly reduces to
\[ \sqrt{1 + f'(x)^2} - f'(x) \, \frac{\frac{1}{2} \cdot 2 f'(x)}{\sqrt{1 + f'(x)^2}} = C \tag{518.3.15} \]
\[ \frac{1 + f'(x)^2 - f'(x)^2}{\sqrt{1 + f'(x)^2}} = C \tag{518.3.16} \]
\[ \frac{1}{\sqrt{1 + f'(x)^2}} = C \tag{518.3.17} \]
\[ f'(x) = \sqrt{\frac{1}{C^2} - 1} = m. \tag{518.3.18} \]
That is, the slope of the curve representing the shortest path between two points is a constant, which means the curve must be a straight line. Through this lengthy process, we've proved that a straight line is the shortest distance between two points. To find the actual function $f(x)$ given endpoints $(x_1, y_1)$ and $(x_2, y_2)$, simply integrate with respect to $x$:
\[ f(x) = \int f'(x) \, dx = \int m \, dx = mx + d \tag{518.3.19} \]
and then apply the boundary conditions
\[ f(x_1) = y_1 = m x_1 + d \tag{518.3.20} \]
\[ f(x_2) = y_2 = m x_2 + d \tag{518.3.21} \]
Subtracting the first condition from the second, we get $m = \frac{y_2 - y_1}{x_2 - x_1}$, the standard equation for the slope of a line. Solving for $d = y_1 - m x_1$, we get
\[ f(x) = \frac{y_2 - y_1}{x_2 - x_1} (x - x_1) + y_1 \tag{518.3.22} \]
which is the basic equation for a line passing through $(x_1, y_1)$ and $(x_2, y_2)$. The solution to the brachistochrone problem, while slightly more complicated, follows along exactly the same lines. Version: 6 Owner: drummond Author(s): drummond
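The conclusion can be illustrated numerically by evaluating the length functional on the straight line and on nearby paths with the same endpoints. This is a rough sketch under assumptions made for the example: the endpoints $(0, 1)$ and $(3, 5)$, the sine-shaped perturbation, and the helper names are all hypothetical. The length is approximated by summing $\sqrt{dx^2 + dy^2}$ over small steps, as in (518.3.2).

```python
import math

def path_length(f, x1, x2, n=2000):
    """Approximate S[f] by summing ds = sqrt(dx^2 + dy^2) over n small steps."""
    total = 0.0
    x_prev, y_prev = x1, f(x1)
    for i in range(1, n + 1):
        x = x1 + (x2 - x1) * i / n
        y = f(x)
        total += math.hypot(x - x_prev, y - y_prev)
        x_prev, y_prev = x, y
    return total

# Assumed endpoints p = (0, 1) and q = (3, 5); the straight-line distance is 5.
x1, y1, x2, y2 = 0.0, 1.0, 3.0, 5.0
m = (y2 - y1) / (x2 - x1)
line = lambda x: m * (x - x1) + y1

def perturbed(eps):
    """Straight line plus a bump that vanishes at both endpoints (delta q = 0 there)."""
    return lambda x: line(x) + eps * math.sin(math.pi * (x - x1) / (x2 - x1))

lengths = {eps: path_length(perturbed(eps), x1, x2)
           for eps in (-0.5, -0.1, 0.0, 0.1, 0.5)}
for eps, s in lengths.items():
    print(eps, s)   # the smallest length occurs at eps = 0.0
```

Only one family of perturbations is tried here, so this is a plausibility check rather than a proof of minimality.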
Chapter 519 47B15 – Hermitian and normal operators (spectral measures, functional calculus, etc.)
519.1 self-adjoint operator
A linear operator $A : D(A) \subset H \to H$ in a Hilbert space $H$ is a Hermitian operator if $(Ax, y) = (x, Ay)$ for all $x, y \in D(A)$. If $A$ is densely defined, it is said to be a symmetric operator if it is the restriction of its adjoint $A^*$ to $D(A)$, i.e. if $A \subset A^*$; and it is called a self-adjoint operator if $A = A^*$. Version: 2 Owner: Koro Author(s): Koro
Chapter 520 47G30 – Pseudodiﬀerential operators
520.1 Dini derivative
The upper Dini derivative of a continuous function $f$, denoted $D^+ f$ or $f'_+$, is defined as
\[ D^+ f(t) = f'_+(t) = \limsup_{h \to 0^+} \frac{f(t + h) - f(t)}{h}. \]
The lower Dini derivative, $D^- f$ or $f'_-$, is defined as
\[ D^- f(t) = f'_-(t) = \liminf_{h \to 0^+} \frac{f(t + h) - f(t)}{h}. \]
If $f$ is defined on a vector space, then the upper Dini derivative at $t$ in the direction $d$ is denoted
\[ f'_+(t, d) = \limsup_{h \to 0^+} \frac{f(t + hd) - f(t)}{h}. \]
If $f$ is locally Lipschitz then $D^+ f$ is finite. If $f$ is differentiable at $t$ then the Dini derivative at $t$ is the derivative at $t$. Version: 5 Owner: lha Author(s): lha
Chapter 521 47H10 – Fixedpoint theorems
521.1 Brouwer ﬁxed point in one dimension
Theorem 1 [1] Suppose $f$ is a continuous function $f : [-1, 1] \to [-1, 1]$. Then $f$ has a fixed point, i.e., there is an $x$ such that $f(x) = x$.
Proof (Following [1]) We can assume that $f(-1) > -1$ and $f(+1) < 1$, since otherwise there is nothing to prove. Then, consider the function $g : [-1, 1] \to \mathbb{R}$ defined by $g(x) = f(x) - x$. It satisfies $g(+1) < 0$, $g(-1) > 0$, so by the intermediate value theorem, there is a point $x$ such that $g(x) = 0$, i.e., $f(x) = x$. $\Box$
Assuming slightly more of the function $f$ yields the Banach fixed point theorem. In one dimension it states the following:
Theorem 2 Suppose $f : [-1, 1] \to [-1, 1]$ is a function that satisfies the following condition: for some constant $C \in [0, 1)$, we have for each $a, b \in [-1, 1]$, $|f(b) - f(a)| \le C |b - a|$. Then $f$ has a unique fixed point in $[-1, 1]$. In other words, there exists one and only one point $x \in [-1, 1]$ such that $f(x) = x$.
Remarks The fixed point in Theorem 2 can be found by iteration from any $s \in [-1, 1]$ as follows: first choose some $s \in [-1, 1]$. Then form $s_1 = f(s)$, then $s_2 = f(s_1)$, and generally $s_n = f(s_{n-1})$. As $n \to \infty$, $s_n$ approaches the fixed point for $f$. More details are given on the entry for the Banach fixed point theorem. A function that satisfies the condition in Theorem 2 is called a contraction mapping. Such mappings also satisfy the Lipschitz condition.
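Both arguments above translate directly into short procedures. The sketch below is illustrative only: it assumes the example map $f(x) = \tfrac{1}{2}\cos x$, which sends $[-1, 1]$ into itself and is a contraction there since $|f'(x)| \le \tfrac{1}{2}\sin 1 < 1$. It finds the fixed point once by the iteration $s_n = f(s_{n-1})$ and once by bisection on $g(x) = f(x) - x$, using the sign change guaranteed in the proof of Theorem 1.

```python
import math

def f(x):
    """Assumed example: a contraction of [-1, 1] into itself."""
    return 0.5 * math.cos(x)

def iterate(f, s, n=60):
    """Fixed-point iteration s_1 = f(s), s_2 = f(s_1), ..."""
    for _ in range(n):
        s = f(s)
    return s

def bisect_fixed_point(f, lo=-1.0, hi=1.0, steps=60):
    """Bisection on g(x) = f(x) - x, assuming g(lo) >= 0 >= g(hi)."""
    g = lambda x: f(x) - x
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid      # the sign change lies to the right of mid
        else:
            hi = mid      # the sign change lies at or to the left of mid
    return 0.5 * (lo + hi)

x_iter = iterate(f, -1.0)
x_bis = bisect_fixed_point(f)
print(x_iter, x_bis)   # the two estimates agree to many digits
```

Bisection only needs continuity (Theorem 1), while the iteration relies on the contraction condition of Theorem 2.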
REFERENCES
1. A. Mukherjea, K. Pothoven, Real and Functional Analysis, Plenum Press, 1978.
Version: 5 Owner: mathcam Author(s): matte
521.2
Brouwer ﬁxed point theorem
Theorem Let $B = \{x \in \mathbb{R}^n : \|x\| \le 1\}$ be the closed unit ball in $\mathbb{R}^n$. Any continuous function $f : B \to B$ has a fixed point.
Notes
Shape is not important. The theorem also applies to anything homeomorphic to a closed disk, of course. In particular, we can replace $B$ in the formulation with a square or a triangle.
Compactness counts (a). The theorem is not true if we drop a point from the interior of $B$. For example, the map $f(x) = \frac{1}{2} x$ has the single fixed point at 0; dropping it from the domain yields a map with no fixed points.
Compactness counts (b). The theorem is not true for an open disk. For instance, the map $f(x) = \frac{1}{2} x + (\frac{1}{2}, 0, \ldots, 0)$ has its single fixed point on the boundary of $B$.
Version: 3 Owner: matte Author(s): matte, ariels
521.3
any topological space with the ﬁxed point property is connected
Theorem Any topological space with the fixed point property is connected [1, 2].
Proof. Suppose $X$ is a topological space with the fixed point property. We will show that $X$ is connected by contradiction: suppose there are non-empty disjoint open sets $A, B \subset X$ such that $X = A \cup B$. Then there are elements $a \in A$ and $b \in B$, and we can define a function $f : X \to X$ by
\[ f(x) = \begin{cases} a, & \text{when } x \in B, \\ b, & \text{when } x \in A. \end{cases} \]
Since $A \cap B = \emptyset$ and $A \cup B = X$, the function $f$ is well defined. Also, since $f(x)$ and $x$ always lie in different sets of the disjoint pair $A$, $B$, $f$ can have no fixed point. To obtain a contradiction, we only need to show that $f$ is continuous. However, if $V$ is an open set in $X$, a short calculation shows that $f^{-1}(V)$ is either $\emptyset$, $A$, $B$ or $X$, which are all open sets. Thus $f$ is continuous, and $X$ must be connected. $\Box$
REFERENCES
1. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974. 2. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.
Version: 5 Owner: matte Author(s): matte
521.4
ﬁxed point property
Deﬁnition [2, 3, 2] Suppose X is a topological space. If every continuous function f : X → X has a ﬁxed point, then X has the ﬁxed point property.
Example 1. Brouwer's fixed point theorem states that in $\mathbb{R}^n$, the closed unit ball with the subspace topology has the fixed point property.
Properties
1. The fixed point property is preserved under a homeomorphism. In other words, suppose $f : X \to Y$ is a homeomorphism between topological spaces $X$ and $Y$. If $X$ has the fixed point property, then $Y$ has the fixed point property [2].
2. Any topological space with the fixed point property is connected [3, 2].
3. Suppose $X$ is a topological space with the fixed point property, and $Y$ is a retract of $X$. Then $Y$ has the fixed point property [3].
REFERENCES
1. G.L. Naber, Topological methods in Euclidean spaces, Cambridge University Press, 1980. 2. G.J. Jameson, Topology and Normed Spaces, Chapman and Hall, 1974. 3. L.E. Ward, Topology, An Outline for a First Course, Marcel Dekker, Inc., 1972.
Version: 5 Owner: matte Author(s): matte
521.5
proof of Brouwer ﬁxed point theorem
Proof of the Brouwer fixed point theorem: Assume that there does exist a map $f : B^n \to B^n$ with no fixed point. Then let $g(x)$ be the following map: Start at $f(x)$, draw the ray going through $x$ and then let $g(x)$ be the first intersection of that line with the sphere. This map is continuous and well defined only because $f$ fixes no point. Also, it is not hard to see that it must be the identity on the boundary sphere. Thus we have a map $g : B^n \to S^{n-1}$, which is the identity on $S^{n-1} = \partial B^n$, that is, a retraction. Now, if $i : S^{n-1} \to B^n$ is the inclusion map, $g \circ i = \mathrm{id}_{S^{n-1}}$. Applying the reduced homology functor, we find that $g_* \circ i_* = \mathrm{id}_{\tilde{H}_{n-1}(S^{n-1})}$, where $*$ indicates the induced map on homology. But, it is a well-known fact that $\tilde{H}_{n-1}(B^n) = 0$ (since $B^n$ is contractible), and that $\tilde{H}_{n-1}(S^{n-1}) = \mathbb{Z}$. Thus we have an isomorphism of a non-zero group onto itself factoring through a trivial group, which is clearly impossible. Thus we have a contradiction, and no such map $f$ exists. Version: 3 Owner: bwebste Author(s): bwebste
Chapter 522 47L07 – Convex sets and cones of operators
522.1 convex hull of S is open if S is open
Theorem If $S$ is an open set in a topological vector space, then the convex hull $\mathrm{co}(S)$ is open [1]. As the next example shows, the corresponding result does not hold for a closed set. Example ([1], pp. 14) If $S = \{(x, 1/|x|) \in \mathbb{R}^2 \mid x \in \mathbb{R} \setminus \{0\}\}$, then $S$ is closed, but $\mathrm{co}(S)$ is the open half-space $\{(x, y) \mid x \in \mathbb{R}, y \in (0, \infty)\}$, which is open. $\Box$
REFERENCES
1. F.A. Valentine, Convex Sets, McGraw-Hill Book Company, 1964.
Version: 3 Owner: drini Author(s): matte
Chapter 523 47L25 – Operator spaces (= matricially normed spaces)
523.1 operator norm
Let $A : V \to W$ be a linear map between normed vector spaces $V$ and $W$. We can define a function $\|\cdot\|_{op} : A \to \mathbb{R}^+$ as
\[ \|A\|_{op} := \sup_{\substack{v \in V \\ v \neq 0}} \frac{\|Av\|}{\|v\|}. \]
Equivalently, the above definition can be written as
\[ \|A\|_{op} := \sup_{\substack{v \in V \\ \|v\| = 1}} \|Av\| = \sup_{\substack{v \in V \\ 0 < \|v\| \le 1}} \|Av\|. \]
It turns out that $\|\cdot\|_{op}$ satisfies all the properties of a norm and hence is called the operator norm (or the induced norm) of $A$. If $\|A\|_{op}$ exists and is finite, we say that $A$ is a bounded linear map. The space $L(V, W)$ of bounded linear maps from $V$ to $W$ also forms a vector space with $\|\cdot\|_{op}$ as the natural norm.
523.1.1 Example
Suppose that $V = (\mathbb{R}^n, \|\cdot\|_p)$ and $W = (\mathbb{R}^n, \|\cdot\|_p)$, where $\|\cdot\|_p$ is the vector $p$-norm. Then the operator norm $\|\cdot\|_{op} = \|\cdot\|_p$ is the matrix $p$-norm. Version: 3 Owner: igor Author(s): igor
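For $p = 1$ and $p = \infty$ the matrix $p$-norm has a well-known closed form (largest absolute column sum and largest absolute row sum, respectively), which makes the supremum in the definition easy to probe numerically. The sketch below is an illustration with an assumed $2 \times 2$ matrix and hypothetical helper names; it compares the closed forms with the ratio $\|Av\| / \|v\|$ over random vectors.

```python
import random

def norm_1(v):
    return sum(abs(x) for x in v)

def norm_inf(v):
    return max(abs(x) for x in v)

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def op_norm_1(A):
    """Matrix 1-norm: largest absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))

def op_norm_inf(A):
    """Matrix infinity-norm: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def sampled_ratio(A, norm, trials=2000, seed=0):
    """Largest observed ||Av|| / ||v|| over random v (a lower bound for the sup)."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        v = [rng.uniform(-1.0, 1.0) for _ in A[0]]
        nv = norm(v)
        if nv > 0:
            best = max(best, norm(apply(A, v)) / nv)
    return best

A = [[1.0, -2.0], [3.0, 4.0]]       # assumed example matrix
print(op_norm_1(A), sampled_ratio(A, norm_1))
print(op_norm_inf(A), sampled_ratio(A, norm_inf))
```

Random sampling never exceeds the closed-form value; the supremum is attained at $v = (0, 1)$ for $p = 1$ and at $v = (1, 1)$ for $p = \infty$ in this example.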
Chapter 524 47S99 – Miscellaneous
524.1 Drazin inverse
A Drazin inverse of an operator $T$ is an operator $S$ such that
\[ TS = ST, \qquad S^2 T = S, \qquad T^{m+1} S = T^m \text{ for some integer } m \ge 0. \]
For example, a projection operator $P$ is its own Drazin inverse, as $PP = P^2 P = P^m = P$ for any integer $m \ge 1$. Version: 2 Owner: lha Author(s): lha
Chapter 525 49K10 – Free problems in two or more independent variables
525.1 Kantorovitch’s theorem
Let $a_0$ be a point in $\mathbb{R}^n$, $U$ an open neighborhood of $a_0$ in $\mathbb{R}^n$ and $f : U \to \mathbb{R}^n$ a differentiable mapping, with its derivative $[Df(a_0)]$ invertible. Define
\[ h_0 = -[Df(a_0)]^{-1} f(a_0), \qquad a_1 = a_0 + h_0, \qquad U_0 = \{ x \mid |x - a_1| \le |h_0| \}. \]
If $U_0 \subset U$ and the derivative $[Df(x)]$ satisfies the Lipschitz condition
\[ |[Df(u_1)] - [Df(u_2)]| \le M |u_1 - u_2| \]
for all points $u_1, u_2 \in U_0$, and if the inequality
\[ |f(a_0)| \, |[Df(a_0)]^{-1}|^2 \, M \le \frac{1}{2} \]
is satisfied, the equation $f(x) = 0$ has a unique solution in $U_0$, and Newton's method with initial guess $a_0$ converges to it. If we replace $\le$ with $<$, then it can be shown that Newton's method superconverges! If you want an even stronger version, one can replace $|\ldots|$ with the norm $\|\ldots\|$.
Logic behind the theorem: Let's look at the useful part of the theorem:
\[ |f(a_0)| \, |[Df(a_0)]^{-1}|^2 \, M \le \frac{1}{2} \]
It is a product of three distinct properties of your function such that the product is less than or equal to a certain number, or bound. If we call the product $R$, then it says that $a_0$ must be within a ball of radius $R$. It also says that the solution $x$ is within this same ball. How was this ball defined?
The first term, $|f(a_0)|$, is a measure of how far the function is from the domain; in the cartesian plane, it would be how far the function is from the x-axis. Of course, if we're solving for $f(x) = 0$, we want this value to be small, because it means we're closer to the axis. However a function can be annoyingly close to the axis, and yet just happily curve away from the axis. Thus we need more.
The second term, $|[Df(a_0)]^{-1}|^2$, is a little more difficult. This is obviously a measure of how fast the function is changing with respect to the domain (x-axis in the plane). The larger the derivative, the faster it's approaching wherever it's going (hopefully the axis). Thus, we take the inverse of it, since we want this product to be less than a number. Why it's squared though, is because it is the denominator where a product of two terms of like units is the numerator. Thus to conserve units with the numerator, it is multiplied by itself. Combined with the first term, this also seems to be enough, but what if the derivative changes sharply, but it changes the wrong way?
The third term is the Lipschitz ratio $M$. This measures sharp changes in the first derivative, so we can be sure that if this is small, that the function won't try to curve away from our goal on us too sharply.
By the way, the number $\frac{1}{2}$ is unitless, so all the units on the left side cancel. Checking units is essential in applications, such as physics and engineering, where Newton's method is used. Version: 18 Owner: slider142 Author(s): slider142
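To make the three factors concrete, here is a one-dimensional sketch that evaluates the quantities in the theorem. The function $f(x) = x^2 - 2$, the starting points, and the helper name `kantorovich_1d` are assumptions for illustration; in this case $f'(x) = 2x$, so the Lipschitz ratio of the derivative is $M = 2$.

```python
import math

def kantorovich_1d(f, df, a0, M):
    """Return the product |f(a0)| * |f'(a0)^-1|^2 * M and the interval U0."""
    h0 = -f(a0) / df(a0)
    a1 = a0 + h0
    product = abs(f(a0)) * (1.0 / abs(df(a0))) ** 2 * M
    U0 = (a1 - abs(h0), a1 + abs(h0))    # {x : |x - a1| <= |h0|}
    return product, U0

f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x
M = 2.0   # |f'(u1) - f'(u2)| = 2 |u1 - u2|

good_product, good_U0 = kantorovich_1d(f, df, 1.5, M)
bad_product, bad_U0 = kantorovich_1d(f, df, 0.1, M)

print(good_product, good_U0)   # product below 1/2; U0 contains sqrt(2)
print(bad_product)             # product above 1/2: the theorem gives no guarantee
```

A product above $\frac{1}{2}$ does not mean Newton's method diverges from that start; it only means this particular test is inconclusive there.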
Chapter 526 49M15 – Methods of NewtonRaphson, Galerkin and Ritz types
526.1 Newton’s method
Let $\vec{f}$ be a differentiable function from $\mathbb{R}^n$ to $\mathbb{R}^n$. Newton's method consists of starting at an initial guess $a_0$ for a solution of the equation $\vec{f}(x) = \vec{0}$. The function is then linearized at $a_0$ by replacing the increment $\vec{f}(x) - \vec{f}(a_0)$ by a linear function of the increment, $[D\vec{f}(a_0)](x - a_0)$.

Now we can solve the linear equation $\vec{f}(a_0) + [D\vec{f}(a_0)](x - a_0) = \vec{0}$. Since this is a system of $n$ linear equations in $n$ unknowns, $[D\vec{f}(a_0)](x - a_0) = -\vec{f}(a_0)$ can be likened to the general linear system $A\vec{x} = \vec{b}$.

Therefore, if $[D\vec{f}(a_0)]$ is invertible, then $x = a_0 - [D\vec{f}(a_0)]^{-1}\vec{f}(a_0)$. By renaming $x$ to $a_1$, you can repeat Newton's method to get an $a_2$. Thus, Newton's method states

$$a_{n+1} = a_n - [D\vec{f}(a_n)]^{-1}\vec{f}(a_n)$$

Thus we get a sequence of $a$'s that hopefully will converge to an $x$ such that $\vec{f}(x) = \vec{0}$. When we solve an equation of the form $\vec{f}(x) = \vec{0}$, we call the solution a root of the equation. Thus, Newton's method is used to find roots of nonlinear equations. Unfortunately, Newton's method does not always converge. There are, however, tests for neighborhoods of $a_0$ in which Newton's method will converge. One such test is Kantorovitch's theorem, which combines what is needed into a concise mathematical inequality.

Corollary 1: Newton's Method in one dimension. The above equation is simplified in one dimension to the well-used

$$a_1 = a_0 - \frac{f(a_0)}{f'(a_0)}$$
This intuitively cute equation is pretty much the equation of first year calculus. :)

Corollary 2: Finding a square root. So now that you know the equation, you need to know how to use it, as it is an algorithm. The construction of the primary equation is, of course, the important part. Let's see how you do it if you want to find a square root of a number $b$. We want to find a number $x$ ($x$ for unknown) such that $x^2 = b$. (You might think "why not find a number such that $x = \sqrt{b}$?" Well, the problem with that approach is that we don't have a value for $\sqrt{b}$, so we'd be right back where we started. However, squaring both sides of the equation to get $x^2 = b$ lets us work with the number we do know, $b$.) Back to $x^2 = b$. With some manipulation, we see this means that $x^2 - b = 0$! Thus we have our $f(x) = 0$ scenario, with $f(x) = x^2 - b$. We can see that $f'(x) = 2x$; thus $f'(a_0) = 2a_0$ and $f(a_0) = a_0^2 - b$. Now we have all we need to carry out Newton's method. By renaming $x$ to be $a_1$, we have

$$a_1 = a_0 - \frac{1}{2a_0}(a_0^2 - b) = \frac{1}{2}\left(a_0 + \frac{b}{a_0}\right).$$
The expression on the far right is also known as the divide and average method, for those who have not learned the full Newton's method and just want a fast way to find square roots. Let's see how this works out to find the square root of a number like 2. Let $x^2 = 2$, so that $x^2 - 2 = 0 = f(x)$. Thus, by Newton's method,

$$a_1 = a_0 - \frac{a_0^2 - 2}{2a_0}$$

All we did was plug in the expressions $f(a_0)$ and $f'(a_0)$ where Newton's method asks for them. Now we have to pick an $a_0$. Hmm, since

$$\sqrt{1} < \sqrt{2} < \sqrt{4}$$
$$1 < \sqrt{2} < 2$$

let's pick a reasonable number between 1 and 2 like 1.5:

$$a_1 = 1.5 - \frac{1.5^2 - 2}{2(1.5)}$$
$$a_1 = 1.41\bar{6}$$

Looks like our guess was too high. Let's see what the next iteration says:

$$a_2 = 1.41\bar{6} - \frac{(1.41\bar{6})^2 - 2}{2(1.41\bar{6})}$$
$$a_2 = 1.414215686\ldots$$

getting better =) You can use your calculator to find that

$$\sqrt{2} = 1.414213562\ldots$$
Not bad for only two iterations! Of course, the more you iterate, the more decimal places of your $a_n$ will be accurate. By the way, iterations of this kind are one common way for calculators and computers to find square roots.

Geometric interpretation: Consider an arbitrary function $f : \mathbb{R} \to \mathbb{R}$ such as $f(x) = x^2 - b$. Say you wanted to find a root of this function. You know that in the neighborhood of $x = a_0$ there is a root $x_0$ (maybe you used Kantorovitch's theorem, or tested and saw that the function changed signs in this neighborhood). We want to use our knowledge of $a_0$ to find an $a_1$ that is a better approximation to $x_0$ (in this case, closer to it on the x-axis). So we want $x_0 < a_1 < a_0$ or, in another case, $a_0 < a_1 < x_0$. What is an efficient algorithm to bridge the gap between $a_0$ and $x_0$? Let's look at the tangent line to the graph at $(a_0, f(a_0))$. Note that the line intercepts the x-axis between $a_0$ and $x_0$, which is exactly what we want. The slope of this tangent line is $f'(a_0)$ by definition of the derivative at $a_0$, and we know one point on the line is $(a_1, 0)$, since that is the x-intercept. That is all we need to find the formula of the line, and solve for $a_1$.

$$y - y_1 = m(x - x_1)$$

Substituting $(x, y) = (a_0, f(a_0))$, $(x_1, y_1) = (a_1, 0)$ and $m = f'(a_0)$:

$$f(a_0) - 0 = f'(a_0)(a_0 - a_1)$$
$$\frac{f(a_0)}{f'(a_0)} = a_0 - a_1$$
$$-a_1 = -a_0 + \frac{f(a_0)}{f'(a_0)}$$

Aesthetic change: flipping the equation around gives Newton's method!

$$a_1 = a_0 - \frac{f(a_0)}{f'(a_0)}$$
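The one-dimensional iteration derived above can be sketched in a few lines of Python. This is a minimal illustration rather than a robust root finder: the function names `newton_1d` and `sqrt_newton` and the fixed number of steps are assumptions made for this example, and no convergence test (such as Kantorovitch's) is performed.

```python
def newton_1d(f, df, a0, steps):
    """Apply a_{n+1} = a_n - f(a_n)/f'(a_n) `steps` times; return all iterates."""
    iterates = [a0]
    a = a0
    for _ in range(steps):
        slope = df(a)
        if slope == 0:
            # The tangent line is horizontal and never meets the x-axis.
            raise ZeroDivisionError("derivative vanished at a = %r" % a)
        a = a - f(a) / slope
        iterates.append(a)
    return iterates


def sqrt_newton(b, a0, steps):
    """Approximate sqrt(b) by applying Newton's method to f(x) = x^2 - b."""
    return newton_1d(lambda x: x * x - b, lambda x: 2 * x, a0, steps)


# The worked example from the text: b = 2 with starting guess a0 = 1.5.
for n, a in enumerate(sqrt_newton(2.0, 1.5, 3)):
    print(n, a)
```

With the starting guess 1.5, the first two iterates are 17/12 and 577/408, which agree with the values of $a_1$ and $a_2$ computed by hand in the text.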
Version: 17 Owner: slider142 Author(s): slider142
Chapter 527 51-00 – General reference works (handbooks, dictionaries, bibliographies, etc.)
527.1 Apollonius theorem
Let $a$, $b$, $c$ be the sides of a triangle and $m$ the length of the median to the side with length $a$. Then $b^2 + c^2 = 2m^2 + \frac{a^2}{2}$.
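As a quick numerical check, Apollonius' theorem can be solved for the median and compared with a direct coordinate computation. The sketch below is only illustrative; the helper name `median_to_a` and the sample side lengths are assumptions chosen for the example.

```python
import math


def median_to_a(a, b, c):
    """Median to the side of length a, from b^2 + c^2 = 2 m^2 + a^2 / 2."""
    return math.sqrt((2 * b * b + 2 * c * c - a * a) / 4)


# Independent check with coordinates: B = (0, 0), C = (a, 0), A = (x, y),
# where |AB| = c and |AC| = b.
a, b, c = 6.0, 5.0, 7.0
x = (a * a + c * c - b * b) / (2 * a)
y = math.sqrt(c * c - x * x)
m_coords = math.hypot(x - a / 2, y)  # distance from A to the midpoint of BC

print(median_to_a(a, b, c), m_coords)
```

For these side lengths both computations give a median of length $\sqrt{28}$.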
Version: 2 Owner: drini Author(s): drini
527.2
Apollonius’ circle
Apollonius' circle. The locus of a point moving so that the ratio of its distances from two fixed points is fixed (and different from 1) is a circle. If two circles $C_1$ and $C_2$ are fixed with radii $r_1$ and $r_2$, then the circle of Apollonius of the two centers with ratio $r_1/r_2$ is the circle whose diameter is the segment that joins the two homothety centers of the circles. Version: 1 Owner: drini Author(s): drini
527.3
Brahmagupta’s formula
If a cyclic quadrilateral has sides $p, q, r, s$ then its area is given by

$$\sqrt{(T - p)(T - q)(T - r)(T - s)}$$

where $T = \frac{p + q + r + s}{2}$.
Note that if s → 0, Heron’s formula is recovered.
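The formula and its degenerate case can be tried out with a short Python sketch. The function name `brahmagupta_area` is an assumption of this example, and the inputs are taken to be the sides of a quadrilateral that really is cyclic (the code does not check this).

```python
import math


def brahmagupta_area(p, q, r, s):
    """Area of a cyclic quadrilateral with side lengths p, q, r, s."""
    t = (p + q + r + s) / 2  # the semiperimeter, called T in the text
    return math.sqrt((t - p) * (t - q) * (t - r) * (t - s))


print(brahmagupta_area(1, 1, 1, 1))  # unit square
print(brahmagupta_area(3, 4, 3, 4))  # 3-by-4 rectangle
print(brahmagupta_area(3, 4, 5, 0))  # s = 0: reduces to the 3-4-5 triangle
```

Setting the fourth side to zero turns the product under the root into $T(T-p)(T-q)(T-r)$, which is exactly the expression in Heron's formula.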
Version: 3 Owner: drini Author(s): drini
527.4
Brianchon theorem
If a hexagon ABCDEF (not necessarily convex) is circumscribed about a conic (in particular about a circle), then the three diagonals AD, BE, CF are concurrent. This theorem is the dual of Pascal's line theorem. (C. Brianchon, 1806)
Version: 2 Owner: vladm Author(s): vladm
527.5
Brocard theorem
Theorem: Let ABC be a triangle. Let A', B', C' be three points such that A' ∈ (BC), B' ∈ (AC), C' ∈ (AB). Then the circumscribed circles of the triangles AB'C', BC'A', CA'B' have a point in common. This point is called the Brocard point. Proof: Let M be the point, other than C', in which the circles circumscribed about the triangles AB'C' and BC'A' meet. Because the quadrilateral AB'MC' is inscribable, the angles AB'M and MC'B are congruent. Analogously, because the quadrilateral BA'MC' is inscribable, the angles MC'B and MA'C are congruent. So the angles AB'M and MA'C are congruent, and MA'CB' is inscribable.
Version: 2 Owner: vladm Author(s): vladm
527.6
Carnot circles
If ABC is a triangle and H is its orthocenter, then there are three circles, each of which passes through two vertices of the triangle and the orthocenter. These three circles are called the Carnot circles.
Version: 2 Owner: vladm Author(s): vladm
527.7
Erdős-Anning Theorem
If an inﬁnite number of points in a plane are all separated by integer distances, then all the points lie on a straight line. Version: 1 Owner: giri Author(s): giri
527.8
Euler Line
In any triangle, the orthocenter H, the centroid G and the circumcenter O are collinear, and OG/GH = 1/2. The line passing through these points is known as the Euler line of the triangle.
This line also passes through the center N of the nine-point circle (or Feuerbach circle), and N is the midpoint of OH.
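The collinearity and the ratio OG/GH = 1/2 can be checked numerically for a concrete triangle. In the sketch below the three centers are computed independently from coordinates; the function names and the sample triangle are assumptions made for this illustration, and `math.dist` requires Python 3.8 or later.

```python
import math


def circumcenter(A, B, C):
    """Circumcenter of a non-degenerate triangle, from the bisector equations."""
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy)


def centroid(A, B, C):
    """Average of the three vertices."""
    return ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)


def orthocenter(A, B, C):
    """Intersection of the heights from A and B (solved by Cramer's rule)."""
    # H satisfies (H - A).(C - B) = 0 and (H - B).(A - C) = 0.
    e1 = (C[0] - B[0], C[1] - B[1])
    e2 = (A[0] - C[0], A[1] - C[1])
    r1 = A[0] * e1[0] + A[1] * e1[1]
    r2 = B[0] * e2[0] + B[1] * e2[1]
    det = e1[0] * e2[1] - e1[1] * e2[0]
    return ((r1 * e2[1] - e1[1] * r2) / det, (e1[0] * r2 - r1 * e2[0]) / det)


A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
O, G, H = circumcenter(A, B, C), centroid(A, B, C), orthocenter(A, B, C)

# Collinearity: the cross product of G - O and H - O should vanish.
cross = (G[0] - O[0]) * (H[1] - O[1]) - (G[1] - O[1]) * (H[0] - O[0])
ratio = math.dist(O, G) / math.dist(G, H)
print(O, G, H, cross, ratio)
```

The sample triangle is scalene, so O, G and H are distinct and the ratio is well defined; for an equilateral triangle the three points coincide and the ratio would be 0/0.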
Version: 9 Owner: drini Author(s): drini
527.9
Gergonne point
Let ABC be a triangle and D, E, F the points where the incircle touches the sides BC, CA, AB respectively. Then the lines AD, BE, CF are concurrent, and the common point is called the Gergonne point of the triangle. Version: 3 Owner: drini Author(s): drini
527.10
Gergonne triangle
Let ABC be a triangle and D, E, F the points where the incircle touches the sides BC, CA, AB respectively. Then triangle DEF is called the Gergonne triangle or contact triangle of ABC. Version: 2 Owner: drini Author(s): drini
527.11
Heron’s formula
The area of a triangle with side lengths $a, b, c$ is

$$\Delta = \sqrt{s(s - a)(s - b)(s - c)}$$

where $s = \frac{a + b + c}{2}$ (the semiperimeter).
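A minimal Python sketch of the formula follows. The function name `heron_area` and the explicit triangle-inequality check are assumptions added for this example; the check guards against side lengths that do not form a triangle, for which the expression under the root would be negative.

```python
import math


def heron_area(a, b, c):
    """Area of a triangle from its side lengths, via the semiperimeter s."""
    if a + b < c or b + c < a or c + a < b:
        raise ValueError("side lengths violate the triangle inequality")
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


print(heron_area(3, 4, 5))     # right triangle: half of 3 * 4
print(heron_area(13, 14, 15))  # s = 21, product under the root is 7056
```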
Version: 2 Owner: drini Author(s): drini
527.12
Lemoine circle
If parallels to the sides are drawn through the Lemoine point of a triangle, the six points where these parallels intersect the sides all lie on one circle. This circle is called the Lemoine circle of the triangle. Version: 1 Owner: drini Author(s): drini
527.13
Lemoine point
The Lemoine point of a triangle is the intersection point of its three symmedians (that is, the isogonal conjugate of the centroid). It is related to the Gergonne point by the following result: in any triangle ABC, the Lemoine point of its Gergonne triangle is the Gergonne point of ABC. In the picture, the blue lines are the medians, intersecting at the centroid G. The green lines are angle bisectors intersecting at the incentre I, and the red lines are symmedians. The symmedians intersect at the Lemoine point L. Version: 5 Owner: drini Author(s): drini
527.14
Miquel point
Let AECF be a complete quadrilateral. Then the four circles circumscribed about the four triangles AED, AFB, BEC, CDF are concurrent in a point M. This point is called the Miquel point. The Miquel point is also the focus of the parabola inscribed in AECF.
Version: 2 Owner: vladm Author(s): vladm
527.15
Mollweide’s equations
In a triangle having the sides $a$, $b$ and $c$ opposite to the angles $\alpha$, $\beta$ and $\gamma$ respectively, the following equations hold:

$$(a + b)\sin\frac{\gamma}{2} = c\cos\frac{\alpha - \beta}{2}$$

and

$$(a - b)\cos\frac{\gamma}{2} = c\sin\frac{\alpha - \beta}{2}.$$
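Both equations are easy to test numerically. The sketch below recovers the angles from the side lengths with the cosines law and returns the difference between the two sides of each equation; the function name `mollweide_residuals` is an assumption of this example, and the inputs are assumed to be valid triangle side lengths.

```python
import math


def mollweide_residuals(a, b, c):
    """Left side minus right side of each of Mollweide's two equations."""
    # Angles opposite a and b from the cosines law; gamma from the angle sum.
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((a * a + c * c - b * b) / (2 * a * c))
    gamma = math.pi - alpha - beta
    r1 = (a + b) * math.sin(gamma / 2) - c * math.cos((alpha - beta) / 2)
    r2 = (a - b) * math.cos(gamma / 2) - c * math.sin((alpha - beta) / 2)
    return r1, r2


print(mollweide_residuals(3, 4, 5))
print(mollweide_residuals(7, 8, 9))
```

Both residuals should be zero up to rounding error. Because each equation involves all six parts of the triangle, the equations are often used in this way to check a solved triangle.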
Version: 2 Owner: mathwizard Author(s): mathwizard
527.16
Morley’s theorem
Morley's theorem. The points of intersection of the adjacent angle trisectors in any triangle are the vertices of an equilateral triangle.
Version: 3 Owner: drini Author(s): drini
527.17
Newton’s line
Let ABCD be a circumscribed quadrilateral. The midpoints M, N of the two diagonals and the center I of the inscribed circle are collinear. This line is called Newton's line.
Version: 1 Owner: vladm Author(s): vladm
527.18
Newton-Gauss line
Let AECF be a complete quadrilateral, and AC, BD, EF its diagonals. Let P be the midpoint of AC, Q the midpoint of BD, and R the midpoint of EF. Then P, Q, R lie on one line, called the Newton-Gauss line.
Version: 1 Owner: vladm Author(s): vladm
527.19
Pascal’s mystic hexagram
If a hexagon ADBFCE (not necessarily convex) is inscribed in a conic (in particular in a circle), then the points of intersection of opposite sides (AD with FC, DB with CE and BF with EA) are collinear. This line is called the Pascal line of the hexagon. A very special case happens when the conic degenerates into two lines; the theorem still holds, although this particular case is usually called Pappus' theorem.
Version: 5 Owner: drini Author(s): drini
527.20
Ptolemy’s theorem
If ABCD is a cyclic quadrilateral, then the product of the two diagonals is equal to the sum of the products of opposite sides.

$$AC \cdot BD = AB \cdot CD + AD \cdot BC.$$

When the quadrilateral is not cyclic we have the following inequality

$$AB \cdot CD + BC \cdot AD > AC \cdot BD$$

Version: 5 Owner: drini Author(s): drini
527.21
Pythagorean theorem
The Pythagorean theorem states: If ABC is a right triangle, then the square of the length of the hypotenuse is equal to the sum of the squares of the two legs. In the following picture, the purple squares add up to the same area as the orange one.

$$AC^2 = AB^2 + BC^2.$$

The cosines law is a generalization of the Pythagorean theorem to any triangle. Version: 12 Owner: drini Author(s): drini
527.22
Schooten theorem
Theorem: Let ABC be an equilateral triangle. If M is a point on the arc BC of the circumscribed circle that does not contain A, then the equality AM = BM + CM holds. Proof: Let B' ∈ (MA) be such that MB' = B'B. Because ∠BMA = ∠BCA = $60^\circ$, the triangle MBB' is equilateral, so BB' = MB = MB'. Because AB = BC, BB' = BM and ∠ABB' = ∠MBC, the triangles ABB' and CBM are congruent. Since MC = AB', we have AM = AB' + B'M = MC + MB.
Version: 1 Owner: vladm Author(s): vladm
527.23
Simson’s line
Let ABC be a triangle and P a point on its circumcircle (other than A, B, C). Then the feet of the perpendiculars drawn from P to the sides AB, BC, CA (or their prolongations) are collinear.
An interesting result from the realm of analytic geometry states that the envelope formed by Simson's lines as P varies is a circular hypocycloid with three cusps. Version: 9 Owner: drini Author(s): drini
527.24
Stewart’s theorem
Let ABC be a triangle with AB = c, BC = a, CA = b, and let X be a point on BC such that BX = m and XC = n. Denote by p the length of AX. Then $a(p^2 + mn) = b^2 m + c^2 n$. Version: 3 Owner: drini Author(s): drini
527.25
Thales’ theorem
Let A and B be two points and C a point, other than A and B, on a semicircle with diameter AB. Then the angle ACB is $90^\circ$.
Version: 3 Owner: mathwizard Author(s): mathwizard
527.26
alternate proof of parallelogram law
Proof of this is simple, given the cosine law:

$$c^2 = a^2 + b^2 - 2ab\cos\phi$$

where $a$, $b$, and $c$ are the lengths of the sides of the triangle, and angle $\phi$ is the corner angle opposite the side of length $c$. Let us define the largest interior angle as angle $\theta$. Applying this to the parallelogram, we find that

$$d_1^2 = u^2 + v^2 - 2uv\cos\theta$$
$$d_2^2 = u^2 + v^2 - 2uv\cos(\pi - \theta)$$

Knowing that

$$\cos(\pi - \theta) = -\cos\theta$$

we can add the two expressions together, and find ourselves with

$$d_1^2 + d_2^2 = 2u^2 + 2v^2 - 2uv\cos\theta + 2uv\cos\theta$$
$$d_1^2 + d_2^2 = 2u^2 + 2v^2$$

which is the theorem we set out to prove.
Version: 2 Owner: drini Author(s): fiziko
527.27
alternative proof of the sines law
The goal is to prove the sine law:

$$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c} = \frac{1}{2R}$$

where the variables are defined by the triangle

[Figure: triangle with sides a, b, c and angle A]

and where $R$ is the radius of the circumcircle that encloses our triangle. Let's add a couple of lines and define more variables.

[Figure: the same triangle with auxiliary lines added]

So, we now know that

$$\sin A = \frac{y}{b}$$

and, therefore, we need to prove

$$\frac{\sin B}{b} = \frac{y}{ba}$$

or

$$\sin B = \frac{y}{a}$$

From geometry, we can see that

$$\sin(\pi - B) = \frac{y}{a}$$

So the proof is reduced to proving that

$$\sin(\pi - B) = \sin B$$

This is easily seen as true after examining the top half of the unit circle. So, putting all of our results together, we get

$$\frac{\sin A}{a} = \frac{y}{ba}$$
$$\frac{\sin A}{a} = \frac{\sin(\pi - B)}{b}$$
$$\frac{\sin A}{a} = \frac{\sin B}{b} \qquad (527.27.1)$$

The same logic may be followed to show that each of these fractions is also equal to $\frac{\sin C}{c}$.

For the final step of the proof, we must show that

$$2R = \frac{a}{\sin A}$$
We begin by defining our coordinate system. For this, it is convenient to find one side that is not shorter than the others and label it with length $b$. (The concept of a "longest" side is not well defined in equilateral and some isosceles triangles, but there is always at least one side that is not shorter than the others.) We then define our coordinate system such that the corners of the triangle that mark the ends of side $b$ are at the coordinates $(0, 0)$ and $(b, 0)$. Our third corner (with sides labelled alphabetically clockwise) is at the point $(c\cos A, c\sin A)$. Let the center of our circumcircle be at $(x_0, y_0)$. We now have
$$x_0^2 + y_0^2 = R^2 \qquad (527.27.2)$$
$$(b - x_0)^2 + y_0^2 = R^2 \qquad (527.27.3)$$
$$(c\cos A - x_0)^2 + (c\sin A - y_0)^2 = R^2 \qquad (527.27.4)$$

as each corner of our triangle is, by definition of the circumcircle, a distance $R$ from the circle's center.
Combining equations (527.27.3) and (527.27.2), we find

$$(b - x_0)^2 + y_0^2 = x_0^2 + y_0^2$$
$$b^2 - 2bx_0 = 0$$
$$\frac{b}{2} = x_0$$

Substituting this into equation (527.27.2) we find that

$$y_0^2 = R^2 - \frac{b^2}{4} \qquad (527.27.5)$$

Combining equation (527.27.4) with (527.27.2), and then using $x_0 = b/2$ and (527.27.5), leaves us with

$$(c\cos A - x_0)^2 + (c\sin A - y_0)^2 = x_0^2 + y_0^2$$
$$c^2\cos^2 A - 2x_0 c\cos A + c^2\sin^2 A - 2y_0 c\sin A = 0$$
$$c - 2x_0\cos A - 2y_0\sin A = 0$$
$$\frac{c - b\cos A}{2\sin A} = y_0$$
$$\frac{(c - b\cos A)^2}{4\sin^2 A} = R^2 - \frac{b^2}{4}$$
$$(c - b\cos A)^2 + b^2\sin^2 A = 4R^2\sin^2 A$$
$$c^2 - 2bc\cos A + b^2 = 4R^2\sin^2 A$$
$$a^2 = 4R^2\sin^2 A$$
$$\frac{a}{\sin A} = 2R$$

where we have applied the cosines law in the second to last step. Version: 3 Owner: drini Author(s): fiziko
527.28
angle bisector
For every angle, there exists a line that divides the angle into two equal parts. This line is called the angle bisector.
The interior bisector of an angle is the line or line segment that divides it into two equal angles on the same side as the angle. The exterior bisector of an angle is the line or line segment that divides it into two equal angles on the opposite side as the angle.
For a triangle, the point where the angle bisectors of the three angles meet is called the incenter. Version: 1 Owner: giri Author(s): giri
527.29
angle sum identity
It is desired to prove the identities

$$\sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi$$

and

$$\cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi$$

Consider the figure

where we have

$$\triangle Aad \cong \triangle Ccb, \qquad \triangle Bba \cong \triangle Ddc, \qquad ad = dc = 1.$$

Also, everything is Euclidean, and in particular, the interior angles of any triangle sum to $\pi$. Call $\angle Aad = \theta$ and $\angle baB = \phi$. From the triangle sum rule, we have $\angle Ada = \frac{\pi}{2} - \theta$ and $\angle Ddc = \frac{\pi}{2} - \phi$, while the degenerate angle $\angle AdD = \pi$, so that

$$\angle adc = \theta + \phi$$

We have, therefore, that the area of the pink parallelogram is $\sin(\theta + \phi)$. On the other hand, we can rearrange things thus:

In this figure we see an equal pink area, but it is composed of two pieces, of areas $\sin\phi\cos\theta$ and $\cos\phi\sin\theta$. Adding, we have

$$\sin(\theta + \phi) = \sin\phi\cos\theta + \cos\phi\sin\theta$$

which gives us the first. From definitions, it then also follows that $\sin(\theta + \pi/2) = \cos(\theta)$, and $\sin(\theta + \pi) = -\sin(\theta)$. Writing

$$\cos(\theta + \phi) = \sin(\theta + \phi + \pi/2)$$
$$= \sin(\theta)\cos(\phi + \pi/2) + \cos(\theta)\sin(\phi + \pi/2)$$
$$= \sin(\theta)\sin(\phi + \pi) + \cos(\theta)\cos(\phi)$$
$$= \cos\theta\cos\phi - \sin\theta\sin\phi$$

gives us the second.
Version: 7 Owner: quincynoodles Author(s): quincynoodles
527.30
annulus
An annulus is a twodimensional shape which can be thought of as a disc with a smaller disc removed from its center. An annulus looks like:
Note that both the inner and outer radii may take on any values, so long as the outer radius is larger than the inner. Version: 9 Owner: akrowne Author(s): akrowne
527.31
butterﬂy theorem
Let M be the midpoint of a chord PQ of a circle, through which two other chords AB and CD are drawn. If AD intersects PQ at X and CB intersects PQ at Y, then M is also the midpoint of XY.
The theorem gets its name from the shape of the ﬁgure, which resembles a butterﬂy. Version: 5 Owner: giri Author(s): giri
527.32
centroid
The centroid of a triangle (also called the center of gravity of the triangle) is the point where the three medians intersect each other.
In the figure, AA', BB' and CC' are medians and G is the centroid of ABC. The centroid G has the property that it divides the medians in the ratio 2 : 1, that is, AG = 2GA', BG = 2GB', CG = 2GC'.
Version: 5 Owner: drini Author(s): drini
527.33
chord
A chord is the line segment joining two points on a curve. Usually it is used to refer to a line segment whose end points lie on a circle. Version: 1 Owner: giri Author(s): giri
527.34
circle
Definition. A circle in the plane is determined by a center and a radius. The center is a point in the plane, and the radius is a positive real number. The circle consists of all points whose distance from the center equals the radius. (In this entry, we only work with the standard Euclidean norm in the plane.) A circle determines a closed curve in the plane, and this curve is called the perimeter or circumference of the circle. If the radius of a circle is $r$, then the length of the perimeter is $2\pi r$. Also, the area of the circle is $\pi r^2$. More precisely, the interior of the perimeter has area $\pi r^2$. The diameter of a circle is defined as $d = 2r$. The circle is a special case of an ellipse. Also, in three dimensions, the analogous geometric object to a circle is a sphere.

The circle in analytic geometry. Let us next derive an analytic equation for a circle in Cartesian coordinates $(x, y)$. If the circle has center $(a, b)$ and radius $r > 0$, we obtain the following condition for the points of the circle,

$$(x - a)^2 + (y - b)^2 = r^2. \qquad (527.34.1)$$

In other words, the circle is the set of all points $(x, y)$ that satisfy the above equation. In the special case that $a = b = 0$, the equation is simply $x^2 + y^2 = r^2$. The unit circle is the circle $x^2 + y^2 = 1$.
It is clear that equation (527.34.1) can always be reduced to the form

$$x^2 + y^2 + Dx + Ey + F = 0, \qquad (527.34.2)$$

where $D, E, F$ are real numbers. Conversely, suppose that we are given an equation of the above form where $D, E, F$ are arbitrary real numbers. Next we derive conditions for these constants, so that equation (527.34.2) determines a circle [1]. Completing the squares yields

$$x^2 + Dx + \frac{D^2}{4} + y^2 + Ey + \frac{E^2}{4} = -F + \frac{D^2}{4} + \frac{E^2}{4},$$

whence

$$\left(x + \frac{D}{2}\right)^2 + \left(y + \frac{E}{2}\right)^2 = \frac{D^2 - 4F + E^2}{4}.$$

There are three cases:
1. If $D^2 - 4F + E^2 > 0$, then equation (527.34.2) determines a circle with center $(-\frac{D}{2}, -\frac{E}{2})$ and radius $\frac{1}{2}\sqrt{D^2 - 4F + E^2}$.
2. If $D^2 - 4F + E^2 = 0$, then equation (527.34.2) determines the point $(-\frac{D}{2}, -\frac{E}{2})$.
3. If $D^2 - 4F + E^2 < 0$, then equation (527.34.2) has no (real) solution in the $(x, y)$-plane.

The circle in polar coordinates. Using polar coordinates for the plane, we can parameterize the circle. Consider the circle with center $(a, b)$ and radius $r > 0$ in the plane $\mathbb{R}^2$. It is then natural to introduce polar coordinates $(\rho, \phi)$ for $\mathbb{R}^2 \setminus \{(a, b)\}$ by

$$x(\rho, \phi) = a + \rho\cos\phi,$$
$$y(\rho, \phi) = b + \rho\sin\phi,$$

with $\rho > 0$ and $\phi \in [0, 2\pi)$. Since we wish to parameterize the circle, the point $(a, b)$ does not pose a problem; it is not part of the circle. Plugging these expressions for $x, y$ into equation (527.34.1) yields the condition $\rho = r$. The given circle is thus parameterized by $\phi \mapsto (a + r\cos\phi, b + r\sin\phi)$, $\phi \in [0, 2\pi)$. It follows that a circle is a closed curve in the plane.

Three point formula for the circle. Suppose we are given three points on a circle, say $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$. We next derive expressions for the parameters $D, E, F$ in terms of these points. We also derive equation (527.34.3), which gives an equation for a circle in terms of a determinant. First, from equation (527.34.2), we have

$$x_1^2 + y_1^2 + Dx_1 + Ey_1 + F = 0,$$
$$x_2^2 + y_2^2 + Dx_2 + Ey_2 + F = 0,$$
$$x_3^2 + y_3^2 + Dx_3 + Ey_3 + F = 0.$$

These equations form a linear set of equations for $D, E, F$, i.e.,

$$\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} \cdot \begin{pmatrix} D \\ E \\ F \end{pmatrix} = -\begin{pmatrix} x_1^2 + y_1^2 \\ x_2^2 + y_2^2 \\ x_3^2 + y_3^2 \end{pmatrix}.$$

Let us denote the matrix on the left hand side by $\Lambda$. Also, let us assume that $\det\Lambda \neq 0$. Then, using Cramer's rule, we obtain

$$D = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1^2 + y_1^2 & y_1 & 1 \\ x_2^2 + y_2^2 & y_2 & 1 \\ x_3^2 + y_3^2 & y_3 & 1 \end{pmatrix},$$
$$E = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1 & x_1^2 + y_1^2 & 1 \\ x_2 & x_2^2 + y_2^2 & 1 \\ x_3 & x_3^2 + y_3^2 & 1 \end{pmatrix},$$
$$F = -\frac{1}{\det\Lambda}\det\begin{pmatrix} x_1 & y_1 & x_1^2 + y_1^2 \\ x_2 & y_2 & x_2^2 + y_2^2 \\ x_3 & y_3 & x_3^2 + y_3^2 \end{pmatrix}.$$

These equations give the parameters $D, E, F$ as functions of the three given points. Substituting these equations into equation (527.34.2) and multiplying by $\det\Lambda$ yields

$$(x^2 + y^2)\det\begin{pmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{pmatrix} - x\det\begin{pmatrix} x_1^2 + y_1^2 & y_1 & 1 \\ x_2^2 + y_2^2 & y_2 & 1 \\ x_3^2 + y_3^2 & y_3 & 1 \end{pmatrix} - y\det\begin{pmatrix} x_1 & x_1^2 + y_1^2 & 1 \\ x_2 & x_2^2 + y_2^2 & 1 \\ x_3 & x_3^2 + y_3^2 & 1 \end{pmatrix} - \det\begin{pmatrix} x_1 & y_1 & x_1^2 + y_1^2 \\ x_2 & y_2 & x_2^2 + y_2^2 \\ x_3 & y_3 & x_3^2 + y_3^2 \end{pmatrix} = 0.$$

Using the cofactor expansion, we can now write the equation for the circle passing through $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ as [2, 3]

$$\det\begin{pmatrix} x^2 + y^2 & x & y & 1 \\ x_1^2 + y_1^2 & x_1 & y_1 & 1 \\ x_2^2 + y_2^2 & x_2 & y_2 & 1 \\ x_3^2 + y_3^2 & x_3 & y_3 & 1 \end{pmatrix} = 0. \qquad (527.34.3)$$

See also
• Wikipedia's entry on the circle.
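The three point formula translates directly into a small computation. The following Python sketch evaluates D, E, F by Cramer's rule exactly as in the derivation, then converts them to a center and radius by completing the squares; the helper names `det3` and `circle_through` are assumptions made for this illustration, and the exact comparison with zero is only a crude test for collinear input.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def circle_through(p1, p2, p3):
    """Return (center, radius) of the circle through three non-collinear points."""
    pts = [p1, p2, p3]
    s = [x * x + y * y for x, y in pts]           # the entries x_k^2 + y_k^2
    lam = det3([[x, y, 1] for x, y in pts])       # det of the matrix Lambda
    if lam == 0:
        raise ValueError("the three points are collinear")
    D = -det3([[s[k], pts[k][1], 1] for k in range(3)]) / lam
    E = -det3([[pts[k][0], s[k], 1] for k in range(3)]) / lam
    F = -det3([[pts[k][0], pts[k][1], s[k]] for k in range(3)]) / lam
    center = (-D / 2, -E / 2)
    radius = math.sqrt(D * D - 4 * F + E * E) / 2
    return center, radius


print(circle_through((1, 0), (0, 1), (-1, 0)))  # three points on the unit circle
print(circle_through((0, 0), (4, 0), (1, 3)))
```

For the second triple the computation gives D = -4, E = -2, F = 0, that is, the circle with center (2, 1) and radius $\sqrt{5}$.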
REFERENCES
1. J. H. Kindle, Schaum's Outline Series: Theory and Problems of Plane and Solid Analytic Geometry, Schaum Publishing Co., 1950.
2. E. Weisstein, Eric W. Weisstein's World of Mathematics, entry on the circle.
3. L. Råde, B. Westergren, Mathematics Handbook for Science and Engineering, Studentlitteratur, 1995.
Version: 2 Owner: bbukh Author(s): bbukh, matte
527.35
collinear
A set of points is said to be collinear if they all lie on a straight line. In the following picture, A, P, B are collinear.
Version: 6 Owner: drini Author(s): drini
527.36
complete quadrilateral
Let ABCD be a quadrilateral. Let {F} = AB ∩ CD and {E} = BC ∩ AD. Then AFCE is a complete quadrilateral.
The complete quadrilateral has four sides: ABF, ADE, BCE, DCF, and six vertices: A, B, C, D, E, F.
Version: 2 Owner: vladm Author(s): vladm
527.37
concurrent
A set of lines or curves is said to be concurrent if all of them pass through some point:
Version: 2 Owner: drini Author(s): drini
527.38
cosines law
Cosines Law. Let $a, b, c$ be the sides of a triangle, and let $A$ be the angle opposite to $a$. Then $a^2 = b^2 + c^2 - 2bc\cos A$.
Version: 9 Owner: drini Author(s): drini
527.39
cyclic quadrilateral
Cyclic quadrilateral. A quadrilateral is cyclic when its four vertices lie on a circle.
A necessary and sufficient condition for a quadrilateral to be cyclic is that the sum of a pair of opposite angles be equal to $180^\circ$. One of the main results about these quadrilaterals is Ptolemy's theorem. Also, among all the quadrilaterals with given sides p, q, r, s, the one that is cyclic has the greatest area. If the four sides of a cyclic quadrilateral are known, the area can be found using Brahmagupta's formula. Version: 4 Owner: drini Author(s): drini
527.40
derivation of cosines law
The idea is to prove the cosines law:

$$a^2 = b^2 + c^2 - 2bc\cos\theta$$

where the variables are defined by the triangle:

[Figure: triangle with sides a, b, c and angle θ]

Let's add a couple of lines and two variables, to get

[Figure: the same triangle with the auxiliary lines and the variables x and y added]

This is all we need. We can use Pythagoras' theorem to show that

$$a^2 = x^2 + y^2$$

and

$$b^2 = y^2 + (c + x)^2$$

So, combining these two we get

$$a^2 = x^2 + b^2 - (c + x)^2$$
$$a^2 = x^2 + b^2 - c^2 - 2cx - x^2$$
$$a^2 = b^2 - c^2 - 2cx$$

So, all we need now is an expression for $x$. Well, we can use the definition of the cosine function to show that

$$c + x = b\cos\theta$$
$$x = b\cos\theta - c$$

With this result in hand, we find that

$$a^2 = b^2 - c^2 - 2cx$$
$$a^2 = b^2 - c^2 - 2c(b\cos\theta - c)$$
$$a^2 = b^2 - c^2 - 2bc\cos\theta + 2c^2$$
$$a^2 = b^2 + c^2 - 2bc\cos\theta \qquad (527.40.1)$$
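The derivation can be replayed numerically: compute the auxiliary lengths x and y used in the proof, and compare the hypotenuse they give with the side length predicted by the cosines law. This is only a sketch; the function name `third_side` and the sample values are assumptions made for the example, chosen so that $b\cos\theta > c$ and the auxiliary length x is positive as in the figure.

```python
import math


def third_side(b, c, theta):
    """Length a of the side opposite the angle theta enclosed by sides b and c."""
    return math.sqrt(b * b + c * c - 2 * b * c * math.cos(theta))


b, c, theta = 7.0, 4.0, math.radians(40)
x = b * math.cos(theta) - c  # from c + x = b cos(theta)
y = b * math.sin(theta)      # the height, from b^2 = y^2 + (c + x)^2
print(third_side(b, c, theta), math.hypot(x, y))
```

The two printed numbers should agree up to rounding, since $a^2 = x^2 + y^2$. For a right angle the cosine term vanishes and the function reduces to the Pythagorean theorem.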
Version: 2 Owner: drini Author(s): ﬁziko
527.41
diameter
The diameter of a circle or a sphere is the length of the segment joining a point of it with the point symmetric to it with respect to the center. That is, the length of the longest segment joining a pair of points. Also, we call any of these segments themselves a diameter. So, in the next picture, AB is a diameter.
The diameter is equal to twice the radius. Version: 17 Owner: drini Author(s): drini
527.42
double angle identity
The double-angle identities are

$$\sin(2a) = 2\cos(a)\sin(a) \qquad (527.42.1)$$
$$\cos(2a) = 2\cos^2(a) - 1 = 1 - 2\sin^2(a) \qquad (527.42.2)$$
$$\tan(2a) = \frac{2\tan(a)}{1 - \tan^2(a)} \qquad (527.42.3)$$

These are all derived from their respective trig addition formulas. For example,

$$\sin(2a) = \sin(a + a) = \cos(a)\sin(a) + \sin(a)\cos(a) = 2\cos(a)\sin(a)$$

The formula for cosine follows similarly, and tangent is derived by taking the ratio of sine to cosine, as always. The double-angle formulae can also be derived from the de Moivre identity. Version: 5 Owner: akrowne Author(s): akrowne
527.43
equilateral triangle
A triangle with its three sides and its three angles equal.
Therefore, an equilateral triangle has 3 angles of $60^\circ$. Due to the side-side-side congruence criterion, an equilateral triangle is completely determined by specifying its side.
In an equilateral triangle, the bisector of any angle coincides with the height, the median and the perpendicular bisector of the opposite side. If $r$ is the length of the side, then the height is equal to $r\sqrt{3}/2$. Version: 3 Owner: drini Author(s): drini
527.44
fundamental theorem on isogonal lines
Let ABC be a triangle and AX, BY, CZ three lines concurrent at P. If AX', BY', CZ' are the respective isogonal conjugate lines for AX, BY, CZ, then AX', BY', CZ' are also concurrent at some point P'. An application of this theorem proves the existence of the Lemoine point (for it is the intersection point of the symmedians). This theorem is a direct consequence of Ceva's theorem (trigonometric version). Version: 1 Owner: drini Author(s): drini
527.45
height
Let ABC be a given triangle. A height of ABC is a line drawn from a vertex to the opposite side (or its prolongation) and perpendicular to it. So we have three heights in any triangle. The three heights are always concurrent and the common point is called the orthocenter. In the following figure, AD, BE and CF are heights of ABC.
Version: 2 Owner: drini Author(s): drini
527.46
hexagon
A hexagon is a 6-sided polygon.
Figure. A regular hexagon.
Version: 2 Owner: drini Author(s): drini
527.47
hypotenuse
Let ABC be a right triangle with right angle at C. Then AB is called the hypotenuse.
The midpoint P of the hypotenuse coincides with the circumcenter of the triangle, so it is equidistant from the three vertices. When the triangle is inscribed in its circumcircle, the hypotenuse becomes a diameter. So the distance from P to the vertices is precisely the circumradius. The length of the hypotenuse can be calculated by means of the Pythagorean theorem: $c = \sqrt{a^2 + b^2}$. Sometimes the longest side of a triangle is also called a hypotenuse, but this naming is seldom seen. Version: 5 Owner: drini Author(s): drini
527.48
isogonal conjugate
Let ABC be a triangle, AL the angle bisector of ∠BAC and AX any line passing through A. The isogonal conjugate line to AX is the line AY obtained by reflecting the line AX in the angle bisector AL. In the picture ∠YAL = ∠LAX. This is the reason why AX and AY are called isogonal conjugates, since they form the same angle with AL (iso = equal, gonal = angle). Let P be a point in the plane. The lines AP, BP, CP are concurrent by construction. Consider now their isogonal conjugates (reflections in the inner angle bisectors). The isogonal conjugates will also concur, by the fundamental theorem on isogonal lines, and their intersection point Q is called the isogonal conjugate of P. If Q is the isogonal conjugate of P, then P is the isogonal conjugate of Q, so both are often referred to as an isogonal conjugate pair. An example of an isogonal conjugate pair is found by looking at the centroid of the triangle and the Lemoine point. Version: 4 Owner: drini Author(s): drini
527.49
isosceles triangle
A triangle with two equal sides. This definition implies that any equilateral triangle is isosceles too, but there are isosceles triangles that are not equilateral. In any isosceles triangle, the angles opposite to the equal sides are also equal. In an isosceles triangle, the height, the median and the bisector to the third side are the same line. Version: 5 Owner: drini Author(s): drini
527.50
legs
The legs of a triangle are the two sides which are not the hypotenuse.
Above: Various triangles, with legs in red.
Note that legs are defined only for right triangles (and, in common usage, for the two equal sides of an isosceles triangle); a general triangle has no legs, just as it has no hypotenuse. Version: 3 Owner: akrowne Author(s): akrowne
527.51
medial triangle
The medial triangle of a triangle ABC is the triangle formed by joining the midpoints of the sides of the triangle ABC.
Here, A'B'C' is the medial triangle. The incircle of the medial triangle is called the Spieker circle and the incenter is called the Spieker center. The circumcircle of the medial triangle is called the medial circle. An important property of the medial triangle is that the medial triangle A'B'C' of the medial triangle DEF of ABC is similar to ABC.
Version: 2 Owner: giri Author(s): giri
527.52
median
The median of a triangle is a line joining a vertex with the midpoint of the opposite side. In the next figure, AA' is a median. That is, BA' = A'C, or equivalently, A' is the midpoint of BC.
Version: 7 Owner: drini Author(s): drini
527.53
midpoint
If AB is a segment, then its midpoint is the point P whose distances from A and B are equal. That is, AP = PB. With the notation of directed segments, it is the point on the line that contains AB such that the ratio $\frac{\overrightarrow{AP}}{\overrightarrow{PB}} = 1$.
Version: 2 Owner: drini Author(s): drini
527.54
nine-point circle
The nine-point circle, also known as Euler's circle or the Feuerbach circle, is the circle that passes through the feet of the perpendiculars from the vertices A, B and C of a triangle ABC.
Some of the properties of this circle are:
Property 1: This circle also passes through the midpoints of the sides AB, BC and CA of ABC. This was shown by Euler.
Property 2: Feuerbach showed that this circle also passes through the midpoints of the line segments AH, BH and CH which are drawn from the vertices of ABC to its orthocenter H. These three triples of points make nine in all, giving the circle its name.
Property 3: The radius of the nine-point circle is R/2, where R is the circumradius (radius of the circumcircle).
Property 4: The center of the nine-point circle is the midpoint of the line segment joining the orthocenter and the circumcenter, and hence lies on the Euler line.
Property 5: All triangles inscribed in a given circle and having the same orthocenter have the same nine-point circle.
Version: 3 Owner: mathwizard Author(s): mathwizard, giri
527.55
orthic triangle
If ABC is a triangle and AD, BE, CF are its three heights, then the triangle DEF is called the orthic triangle of ABC. A remarkable property of orthic triangles says that the orthocenter of ABC is also the incenter of the orthic triangle DEF. That is, the heights of ABC are the angle bisectors of DEF.
Version: 2 Owner: drini Author(s): drini
527.56
orthocenter
The orthocenter of a triangle is the point of intersection of its three heights.
In the figure, H is the orthocenter of ABC. The orthocenter H lies inside the triangle, on a vertex, or outside the triangle, depending on whether the triangle is acute, right or obtuse respectively. The orthocenter is one of the most important triangle centers and it is closely related to the orthic triangle (formed by the feet of the three heights). It lies on the Euler line, and the three quadrilaterals FHDB, DHEC, AFHE are cyclic. Version: 3 Owner: drini Author(s): drini
527.57
parallelogram
A parallelogram is a quadrilateral whose opposite sides are parallel. Some special parallelograms have their own names: squares, rectangles, rhombuses. A rectangle is a parallelogram whose 4 angles are equal, a rhombus is a parallelogram whose 4 sides are equal, and a square is a parallelogram that is a rectangle and a rhombus at the same time. All parallelograms have their opposite sides and opposite angles equal (moreover, if a quadrilateral has a pair of opposite sides equal and parallel, the quadrilateral must be a parallelogram). Also, adjacent angles always add up to 180°, and the diagonals bisect each other. There is also a neat relation between the lengths of the sides and the lengths of the diagonals, called the parallelogram law. Version: 2 Owner: drini Author(s): drini
527.58
parallelogram law
Let ABCD be a parallelogram with side lengths u, v and whose diagonals have lengths $d_1$ and $d_2$. Then $2u^2 + 2v^2 = d_1^2 + d_2^2$.
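As a quick numerical illustration of the law, the sketch below builds a parallelogram from two side lengths and the angle between them and compares both sides of the identity. The function name `diagonals` and the sample values are assumptions chosen for this example.

```python
import math

def diagonals(u, v, angle):
    """Diagonal lengths of a parallelogram with sides u, v and included angle (radians)."""
    # Side vectors: (u, 0) and (v cos(angle), v sin(angle))
    vx, vy = v * math.cos(angle), v * math.sin(angle)
    d1 = math.hypot(u + vx, vy)   # diagonal along the sum of the side vectors
    d2 = math.hypot(u - vx, vy)   # diagonal along their difference
    return d1, d2

u, v = 3.0, 5.0
d1, d2 = diagonals(u, v, math.radians(70))
lhs = 2 * u**2 + 2 * v**2
rhs = d1**2 + d2**2
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, whatever angle is chosen.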
Version: 3 Owner: drini Author(s): drini
527.59
pedal triangle
The pedal triangle of any triangle ABC is the triangle whose vertices are the feet of the perpendiculars from A, B, and C to the opposite sides of the triangle. In the figure, DEF is the pedal triangle.
In general, for any point P inside a triangle, the pedal triangle of P is the one whose vertices are the feet of the perpendiculars from P to the sides of the triangle.
Version: 3 Owner: giri Author(s): giri
527.60
pentagon
A pentagon is a 5-sided polygon. Regular pentagons are of particular interest to geometers. In a regular pentagon, the inner angles are equal to 108°. All the diagonals have the same length. If s is the length of a side and d is the length of a diagonal, then $\frac{d}{s} = \frac{1+\sqrt{5}}{2}$, that is, the ratio between a diagonal and a side is the golden number. Version: 1 Owner: drini Author(s): drini
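Both facts about the regular pentagon can be checked from coordinates. The sketch below is only an illustration; it assumes a pentagon inscribed in the unit circle, and the helper names are made up for this example.

```python
import math

def regular_pentagon(radius=1.0):
    """Vertices of a regular pentagon inscribed in a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / 5),
             radius * math.sin(2 * math.pi * k / 5)) for k in range(5)]

def angle_at(Q, P, R):
    """Angle at vertex Q between the rays QP and QR, in degrees."""
    v = (P[0] - Q[0], P[1] - Q[1])
    w = (R[0] - Q[0], R[1] - Q[1])
    cos_angle = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
    return math.degrees(math.acos(cos_angle))

V = regular_pentagon()
side = math.dist(V[0], V[1])        # s: two consecutive vertices
diagonal = math.dist(V[0], V[2])    # d: skip one vertex
golden = (1 + math.sqrt(5)) / 2
inner = angle_at(V[1], V[0], V[2])  # interior angle at V[1]
print(diagonal / side, golden, inner)
```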
527.61
polygon
A polygon is a plane region delimited by straight line segments. Some polygons have special names:

Number of sides    Name of the polygon
3                  triangle
4                  quadrilateral
5                  pentagon
6                  hexagon
7                  heptagon
8                  octagon

In general, a polygon with n sides is called an n-gon. In an n-gon, there are n points where two sides meet. These are called the vertices of the n-gon. At each vertex, the two sides that meet determine two angles: the interior angle and the exterior angle. The former opens towards the interior of the polygon, and the latter towards the exterior of the polygon. Below are some properties of polygons.
1. The sum of the interior angles of an n-gon is (n − 2)180°.
2. Any polygon divides the plane into two components, one bounded (the inside of the polygon) and one unbounded. This result is the Jordan curve theorem for polygons. A direct proof can be found in [1], pp. 16-18.
3. In complex analysis, the Schwarz-Christoffel transformation [2] gives a conformal map between the interior of any polygon and the upper half plane.
4. The area of a polygon whose vertices are lattice points can be calculated using Pick's theorem.
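Properties 1 and 4 lend themselves to a small computation. The following sketch is a hedged illustration, not part of the original entry: it assumes a simple polygon with integer vertex coordinates, computes the area with the shoelace formula (a technique not mentioned above), and then uses Pick's theorem in the form A = I + B/2 − 1 to recover the number I of interior lattice points from the number B of boundary lattice points. All function names are hypothetical.

```python
from math import gcd

def twice_area(vertices):
    """Twice the area of a simple polygon, by the shoelace formula."""
    n = len(vertices)
    total = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total)

def boundary_points(vertices):
    """Number of lattice points on the boundary of a lattice polygon."""
    n = len(vertices)
    count = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        count += gcd(abs(x2 - x1), abs(y2 - y1))
    return count

def interior_points(vertices):
    """Interior lattice points from Pick's theorem A = I + B/2 - 1."""
    return (twice_area(vertices) - boundary_points(vertices) + 2) // 2

def interior_angle_sum(n):
    """Sum of the interior angles of an n-gon, in degrees."""
    return (n - 2) * 180

triangle = [(0, 0), (4, 0), (0, 4)]
print(twice_area(triangle) / 2, boundary_points(triangle), interior_points(triangle))
# prints: 8.0 12 3
```

For the sample triangle the three interior lattice points are (1, 1), (1, 2) and (2, 1), which matches the count derived from Pick's theorem.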
REFERENCES
1. E.E. Moise, Geometric Topology in Dimensions 2 and 3, Springer-Verlag, 1977.
2. R.A. Silverman, Introductory Complex Analysis, Dover Publications, 1972.
Version: 5 Owner: matte Author(s): matte, drini
527.62
proof of Apollonius theorem
Let M be the midpoint of BC, and let b = CA, a = BC, c = AB, and m = AM. Let ∠CMA = θ, so that ∠BMA = π − θ. By the law of cosines,
$b^2 = m^2 + \frac{a^2}{4} - am\cos\theta$
and
$c^2 = m^2 + \frac{a^2}{4} - am\cos(\pi - \theta) = m^2 + \frac{a^2}{4} + am\cos\theta,$
and adding gives
$b^2 + c^2 = 2m^2 + \frac{a^2}{2}.$
QED
Version: 1 Owner: quincynoodles Author(s): quincynoodles
527.63
proof of Apollonius theorem
Let m be a median of the triangle, as shown in the figure. By Stewart's theorem we have
$a\left(m^2 + \left(\frac{a}{2}\right)^2\right) = b^2\,\frac{a}{2} + c^2\,\frac{a}{2}$
and thus
$m^2 + \left(\frac{a}{2}\right)^2 = \frac{b^2 + c^2}{2}.$
Multiplying both sides by 2 gives $2m^2 + \frac{a^2}{2} = b^2 + c^2$.
QED Version: 2 Owner: drini Author(s): drini
527.64
proof of Brahmagupta’s formula
We shall prove that the area of a cyclic quadrilateral with sides p, q, r, s is given by
$\sqrt{(T - p)(T - q)(T - r)(T - s)}$, where $T = \frac{p+q+r+s}{2}$.
Area of the cyclic quadrilateral = Area of △ADB + Area of △BDC
$= \frac{1}{2}pq\sin A + \frac{1}{2}rs\sin C.$
But since ABCD is a cyclic quadrilateral, ∠DAB = 180° − ∠DCB. Hence sin A = sin C. Therefore the area now is
$\text{Area} = \frac{1}{2}pq\sin A + \frac{1}{2}rs\sin A$
$(\text{Area})^2 = \frac{1}{4}\sin^2 A\,(pq + rs)^2$
$4(\text{Area})^2 = (1 - \cos^2 A)(pq + rs)^2$
$4(\text{Area})^2 = (pq + rs)^2 - \cos^2 A\,(pq + rs)^2.$
Applying the cosines law to △ADB and △BDC and equating the expressions for side DB, we have
$p^2 + q^2 - 2pq\cos A = r^2 + s^2 - 2rs\cos C.$
Substituting cos C = − cos A (since angles A and C are supplementary) and rearranging, we have
$2\cos A\,(pq + rs) = p^2 + q^2 - r^2 - s^2.$
Substituting this in the equation for the area,
$4(\text{Area})^2 = (pq + rs)^2 - \frac{1}{4}(p^2 + q^2 - r^2 - s^2)^2$
$16(\text{Area})^2 = 4(pq + rs)^2 - (p^2 + q^2 - r^2 - s^2)^2,$
which is of the form $a^2 - b^2$ and hence can be written in the form (a + b)(a − b) as
$(2(pq + rs) + p^2 + q^2 - r^2 - s^2)(2(pq + rs) - p^2 - q^2 + r^2 + s^2)$
$= ((p + q)^2 - (r - s)^2)((r + s)^2 - (p - q)^2)$
$= (p + q + r - s)(p + q + s - r)(p + r + s - q)(q + r + s - p).$
Introducing $T = \frac{p+q+r+s}{2}$,
$16(\text{Area})^2 = 16(T - p)(T - q)(T - r)(T - s).$
Taking the square root, we get
$\text{Area} = \sqrt{(T - p)(T - q)(T - r)(T - s)}.$
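The formula can be compared against a direct area computation. The sketch below is an illustration under stated assumptions: the quadrilateral is built from four points on the unit circle (so it is cyclic by construction), the area is computed independently with the shoelace formula, and the names `brahmagupta` and `shoelace_area` are hypothetical.

```python
import math

def brahmagupta(p, q, r, s):
    """Area of a cyclic quadrilateral with side lengths p, q, r, s."""
    T = (p + q + r + s) / 2
    return math.sqrt((T - p) * (T - q) * (T - r) * (T - s))

def shoelace_area(vertices):
    """Area of a simple polygon given its vertices in order."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# Four points on the unit circle, listed in increasing angle (convex, cyclic)
angles = [0.3, 1.4, 2.9, 4.6]
quad = [(math.cos(t), math.sin(t)) for t in angles]
sides = [math.dist(quad[i], quad[(i + 1) % 4]) for i in range(4)]
print(brahmagupta(*sides), shoelace_area(quad))
```

Moving one vertex off the circle makes the two numbers disagree, since the formula relies on the quadrilateral being cyclic.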
Version: 3 Owner: giri Author(s): giri
527.65
proof of Erdős–Anning Theorem
Let A, B and C be three non-collinear points. For an additional point P consider the triangle ABP. By using the triangle inequality for the sides PB and PA we find −AB ≤ PB − PA ≤ AB. Likewise, for triangle BCP we get −BC ≤ PB − PC ≤ BC. Geometrically, this means the point P lies on two hyperbolas, with A and B or B and C respectively as foci. Since all the lengths involved here are by assumption integers, there are only 2AB + 1 possibilities for PB − PA and 2BC + 1 possibilities for PB − PC. These hyperbolas are distinct since they don't have the same major axis. So for each pair of hyperbolas we can have at most 4 points of intersection, and there can be no more than 4(2AB + 1)(2BC + 1) points satisfying the conditions. Version: 1 Owner: lieven Author(s): lieven
527.66
proof of Heron’s formula
Let α be the angle between the sides b and c, and let $s = \frac{a+b+c}{2}$ be the semiperimeter. Then we get from the cosines law:
$\cos\alpha = \frac{b^2 + c^2 - a^2}{2bc}.$
Using the equation $\sin\alpha = \sqrt{1 - \cos^2\alpha}$ we get:
$\sin\alpha = \frac{\sqrt{-a^4 - b^4 - c^4 + 2b^2c^2 + 2a^2b^2 + 2a^2c^2}}{2bc}.$
Now we know:
$\Delta = \frac{1}{2}bc\sin\alpha.$
So we get:
$\Delta = \frac{1}{4}\sqrt{-a^4 - b^4 - c^4 + 2b^2c^2 + 2a^2b^2 + 2a^2c^2}$
$= \frac{1}{4}\sqrt{(a + b + c)(b + c - a)(a + c - b)(a + b - c)}$
$= \sqrt{s(s - a)(s - b)(s - c)}.$
This is Heron's formula. □
Version: 2 Owner: mathwizard Author(s): mathwizard
527.67
proof of Mollweide’s equations
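We transform the equation
$(a + b)\sin\frac{\gamma}{2} = c\cos\frac{\alpha-\beta}{2}$
to
$a\cos\left(\frac{\alpha}{2}+\frac{\beta}{2}\right) + b\cos\left(\frac{\alpha}{2}+\frac{\beta}{2}\right) = c\cos\frac{\alpha}{2}\cos\frac{\beta}{2} + c\sin\frac{\alpha}{2}\sin\frac{\beta}{2},$
using the fact that γ = π − α − β. The left hand side can be further expanded, so that we get:
$a\left(\cos\frac{\alpha}{2}\cos\frac{\beta}{2} - \sin\frac{\alpha}{2}\sin\frac{\beta}{2}\right) + b\left(\cos\frac{\alpha}{2}\cos\frac{\beta}{2} - \sin\frac{\alpha}{2}\sin\frac{\beta}{2}\right) = c\cos\frac{\alpha}{2}\cos\frac{\beta}{2} + c\sin\frac{\alpha}{2}\sin\frac{\beta}{2}.$
Collecting terms we get:
$(a + b - c)\cos\frac{\alpha}{2}\cos\frac{\beta}{2} - (a + b + c)\sin\frac{\alpha}{2}\sin\frac{\beta}{2} = 0.$
Using $s := \frac{a+b+c}{2}$ and the half-angle equations
$\sin\frac{\alpha}{2} = \sqrt{\frac{(s - b)(s - c)}{bc}}, \qquad \cos\frac{\alpha}{2} = \sqrt{\frac{s(s - a)}{bc}}$
(and the analogous equations for β) we get:
$2\,\frac{s(s - c)}{c}\sqrt{\frac{(s - a)(s - b)}{ab}} - 2\,\frac{s(s - c)}{c}\sqrt{\frac{(s - a)(s - b)}{ab}} = 0,$
which is obviously true. So we can prove the first equation by going backwards. The second equation can be proved in quite the same way. □
Version: 1 Owner: mathwizard Author(s): mathwizard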
527.68
proof of Ptolemy’s inequality
Looking at the quadrilateral ABCD we construct a point E, such that the triangles ACD and AEB are similar (∠ABE = ∠CDA and ∠BAE = ∠CAD).
This means that:
$\frac{AE}{AC} = \frac{AB}{AD} = \frac{BE}{DC},$
from which follows that
$BE = \frac{AB \cdot DC}{AD}.$
Also, because ∠EAC = ∠BAD and
$\frac{AD}{AC} = \frac{AB}{AE},$
the triangles EAC and BAD are similar. So we get:
$EC = \frac{AC \cdot DB}{AD}.$
Now if ABCD is cyclic we get ∠ABE + ∠CBA = ∠ADC + ∠CBA = 180°. This means that the points C, B and E are on one line and thus:
$EC = EB + BC.$
Now we can use the formulas we already found to get:
$\frac{AC \cdot DB}{AD} = \frac{AB \cdot DC}{AD} + BC.$
Multiplication with AD gives:
$AC \cdot DB = AB \cdot DC + BC \cdot AD.$
Now we look at the case that ABCD is not cyclic. Then ∠ABE + ∠CBA = ∠ADC + ∠CBA ≠ 180°, so the points E, B and C form a triangle and from the triangle inequality we know:
$EC < EB + BC.$
Again we use our formulas to get:
$\frac{AC \cdot DB}{AD} < \frac{AB \cdot DC}{AD} + BC.$
From this we get:
$AC \cdot DB < AB \cdot DC + BC \cdot AD.$
Putting this together we get Ptolemy's inequality:
$AC \cdot DB \le AB \cdot DC + BC \cdot AD,$
with equality iff ABCD is cyclic. Version: 1 Owner: mathwizard Author(s): mathwizard
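Both cases of the inequality can be observed numerically. In the sketch below, which is an illustration and not part of the proof, `ptolemy_gap` is a hypothetical helper returning AB · DC + BC · AD − AC · DB; the cyclic example uses four points on the unit circle, and the second example uses four points that are assumed (and easily checked) not to lie on a common circle.

```python
import math

def ptolemy_gap(A, B, C, D):
    """AB*DC + BC*AD - AC*DB for the quadrilateral ABCD.

    The value is zero when equality holds in Ptolemy's inequality
    and positive otherwise.
    """
    d = math.dist
    return d(A, B) * d(D, C) + d(B, C) * d(A, D) - d(A, C) * d(D, B)

# Vertices in order around the unit circle: a cyclic quadrilateral
cyclic = [(math.cos(t), math.sin(t)) for t in (0.2, 1.1, 2.5, 4.0)]
# A convex quadrilateral whose vertices are not concyclic
general = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (1.0, 2.0)]

print(ptolemy_gap(*cyclic))    # rounding-level value
print(ptolemy_gap(*general))   # strictly positive
```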
527.69
proof of Ptolemy’s theorem
Let ABCD be a cyclic quadrilateral. We will prove that AC · BD = AB · CD + BC · DA.
Find a point E on BD such that ∠BCA = ∠ECD. Since ∠BAC = ∠BDC, for subtending the same arc, we have the triangle similarity △ABC ∼ △DEC and so
$\frac{AB}{DE} = \frac{CA}{CD},$
which implies AC · ED = AB · CD. Also notice that △ADC ∼ △BEC since they have two pairs of equal angles. The similarity implies
$\frac{AC}{BC} = \frac{AD}{BE},$
which implies AC · BE = BC · DA. So we finally have AC · BD = AC(BE + ED) = AB · CD + BC · DA. Version: 8 Owner: drini Author(s): drini
527.70
proof of Pythagorean theorem
Let ABC be a right triangle with hypotenuse BC. Draw the height AT. Using the right angles ∠BAC and ∠ATB and the fact that the sum of angles on any triangle is 180°, it can be shown that
∠BAT = ∠ACT, ∠TAC = ∠CBA
and therefore we have the following triangle similarities:
△ABC ∼ △TBA ∼ △TAC.
From those similarities, we have $\frac{AB}{BC} = \frac{TB}{BA}$ and thus $AB^2 = BC \cdot TB$. Also $\frac{AC}{BC} = \frac{TC}{AC}$ and thus $AC^2 = BC \cdot TC$. We have then
$AB^2 + AC^2 = BC(BT + TC) = BC \cdot BC = BC^2,$
which concludes the proof.
Version: 5 Owner: drini Author(s): drini
527.71
proof of Pythagorean theorem
This is a geometrical proof of the Pythagorean theorem. We begin with our triangle:
[Figure: right triangle with sides labelled a, b and c]
Now we use the hypotenuse as one side of a square:
[Figure: square built on the hypotenuse]
and draw in four more identical triangles.
[Figure: large square of side a + b, made of the smaller square and four triangles]
Now for the proof. We have a large square, with each side of length a + b, which is subdivided into one smaller square and four triangles. The area of the large square must be equal to the combined area of the shapes it is made out of, so we have
$(a + b)^2 = c^2 + 4\cdot\frac{1}{2}ab$
$a^2 + b^2 + 2ab = c^2 + 2ab$
$a^2 + b^2 = c^2$ (527.71.1)
Version: 4 Owner: drini Author(s): fiziko
527.72
proof of Simson’s line
Given a triangle ABC with a point P on its circumcircle (other than A, B, C), we will prove that the feet of the perpendiculars drawn from P to the sides AB, BC, CA (or their prolongations) are collinear. Call these feet W, U and V respectively.
Since PW is perpendicular to BW and PU is perpendicular to BU, the point P lies on the circumcircle of BUW. By similar arguments, P also lies on the circumcircles of AWV and CUV.
This implies that PUBW, PUCV and PVWA are all cyclic quadrilaterals. Since PUBW is a cyclic quadrilateral, ∠UPW = 180° − ∠UBW, which implies ∠UPW = 180° − ∠CBA. Also CPAB is a cyclic quadrilateral, therefore ∠CPA = 180° − ∠CBA (opposite angles in a cyclic quadrilateral are supplementary). From these two, we get ∠UPW = ∠CPA. Subtracting ∠CPW, we have ∠UPC = ∠WPA. Now, since PVWA is a cyclic quadrilateral, ∠WPA = ∠WVA; also, since UPVC is a cyclic quadrilateral, ∠UPC = ∠UVC. Combining these two results with the previous one, we have ∠WVA = ∠UVC. This implies that the points U, V, W are collinear. Version: 6 Owner: giri Author(s): giri
527.73
proof of Stewart’s theorem
Here X is a point on the side BC with BX = m, XC = n and AX = p. Let θ be the angle ∠AXB. The cosines law on △AXB says $c^2 = m^2 + p^2 - 2pm\cos\theta$ and thus
$\cos\theta = \frac{m^2 + p^2 - c^2}{2pm}.$
Using the cosines law on △AXC and noting that ψ = ∠AXC = 180° − θ and thus cos θ = − cos ψ, we get
$\cos\theta = \frac{b^2 - n^2 - p^2}{2pn}.$
From the expressions above we obtain
$2pn(m^2 + p^2 - c^2) = 2pm(b^2 - n^2 - p^2).$
By cancelling 2p on both sides and collecting we are led to
$m^2n + mn^2 + p^2n + p^2m = b^2m + c^2n$
and from there $mn(m + n) + p^2(m + n) = b^2m + c^2n$. Finally, we note that a = m + n, so we conclude that
$a(mn + p^2) = b^2m + c^2n.$
QED Version: 3 Owner: drini Author(s): drini
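The identity can be tested on a concrete triangle. The sketch below is an illustration under assumed coordinates: B is placed at the origin, C at (a, 0), and the point X at (m, 0) on the side BC, so that BX = m and XC = n = a − m as in the proof. The function names are hypothetical. Choosing m = a/2 makes AX a median, which gives Apollonius theorem as a special case.

```python
import math

def stewart_sides(A, a, m):
    """Return (a, b, c, m, n, p) for B = (0, 0), C = (a, 0), X = (m, 0)."""
    B, C, X = (0.0, 0.0), (a, 0.0), (m, 0.0)
    b = math.dist(A, C)   # side opposite B
    c = math.dist(A, B)   # side opposite C
    p = math.dist(A, X)   # length of the cevian AX
    n = a - m             # remaining part XC of the side BC
    return a, b, c, m, n, p

def stewart_gap(a, b, c, m, n, p):
    """Difference between the two sides of a(mn + p^2) = b^2 m + c^2 n."""
    return a * (m * n + p * p) - (b * b * m + c * c * n)

gap = stewart_gap(*stewart_sides((2.0, 5.0), 7.0, 3.0))
print(gap)

# Median case m = n = a/2: Apollonius theorem b^2 + c^2 = 2p^2 + a^2/2
a, b, c, m, n, p = stewart_sides((2.0, 5.0), 7.0, 3.5)
apollonius_gap = b * b + c * c - (2 * p * p + a * a / 2)
print(apollonius_gap)
```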
527.74
proof of Thales’ theorem
Let M be the center of the circle through A, B and C.
Then AM = BM = CM and thus the triangles AMC and BMC are isosceles. If ∠BMC =: α then ∠MCB = 90° − α/2 and, since M lies on the diameter AB, ∠CMA = 180° − α. Therefore ∠ACM = α/2 and ∠ACB = ∠MCB + ∠ACM = 90°. QED. Version: 3 Owner: mathwizard Author(s): mathwizard
527.75
proof of butterﬂy theorem
Given that M is the midpoint of a chord PQ of a circle and AB and CD are two other chords passing through M, we will prove that M is the midpoint of XY, where X and Y are the points where AD and BC cut PQ respectively.
Let O be the center of the circle. Since OM is perpendicular to XY (the line from the center of the circle to the midpoint of a chord is perpendicular to the chord), to show that XM = MY, we have to prove that ∠XOM = ∠YOM. Drop perpendiculars OK and ON from O onto AD and BC, respectively. Obviously, K is the midpoint of AD and N is the midpoint of BC. Further, ∠DAB = ∠DCB and ∠ADC = ∠ABC as angles subtending equal arcs. Hence triangles ADM and CBM are similar and hence
$\frac{AD}{AM} = \frac{CB}{CM}$ or $\frac{AK}{AM} = \frac{CN}{CM},$
since AD = 2AK and CB = 2CN. In other words, in triangles AKM and CNM, two pairs of sides are proportional. Also the angles between the corresponding sides are equal. We infer that the triangles AKM and CNM are similar. Hence ∠AKM = ∠CNM.
Now we find that quadrilaterals OKXM and ONYM both have a pair of opposite right angles. This implies that they are both cyclic quadrilaterals. In OKXM, we have ∠AKM = ∠XOM and in ONYM, we have ∠CNM = ∠YOM. From these two, we get ∠XOM = ∠YOM. Therefore M is the midpoint of XY. Version: 2 Owner: giri Author(s): giri
527.76
proof of double angle identity
sine:
$\sin(2a) = \sin(a + a) = \sin(a)\cos(a) + \cos(a)\sin(a) = 2\sin(a)\cos(a).$
cosine:
$\cos(2a) = \cos(a + a) = \cos(a)\cos(a) - \sin(a)\sin(a) = \cos^2(a) - \sin^2(a).$
By using the identity $\sin^2(a) + \cos^2(a) = 1$ we can change the expression above into the alternate forms
$\cos(2a) = 2\cos^2(a) - 1 = 1 - 2\sin^2(a).$
tangent:
$\tan(2a) = \tan(a + a) = \frac{\tan(a) + \tan(a)}{1 - \tan(a)\tan(a)} = \frac{2\tan(a)}{1 - \tan^2(a)}.$
Version: 1 Owner: drini Author(s): drini
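A numerical spot check of the three identities, in the forms sin(2a) = 2 sin(a) cos(a), cos(2a) = cos²(a) − sin²(a) and tan(2a) = 2 tan(a)/(1 − tan²(a)), is sketched below. The function name and the sample angles are assumptions for this example; the angles are chosen away from the points where the tangent expressions are undefined.

```python
import math

def double_angle_errors(a):
    """Absolute differences between both sides of each double angle identity."""
    sin_err = abs(math.sin(2 * a) - 2 * math.sin(a) * math.cos(a))
    cos_err = abs(math.cos(2 * a) - (math.cos(a) ** 2 - math.sin(a) ** 2))
    tan_err = abs(math.tan(2 * a) - 2 * math.tan(a) / (1 - math.tan(a) ** 2))
    return sin_err, cos_err, tan_err

# Largest discrepancy over a few sample angles (in radians)
worst = max(max(double_angle_errors(a)) for a in (0.1, 0.5, 1.0, 2.0))
print(worst)
```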
527.77
proof of parallelogram law
The proof follows directly from Apollonius theorem, noticing that each diagonal is a median for the triangles into which the parallelogram is split by the other diagonal, since the diagonals bisect each other. Therefore, Apollonius theorem implies
$2\left(\frac{d_1}{2}\right)^2 + \frac{d_2^2}{2} = u^2 + v^2.$
Multiplying both sides by 2 and simplifying leads to the desired expression. Version: 1 Owner: drini Author(s): drini
527.78
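proof of tangents law
To prove that
$\frac{a-b}{a+b} = \frac{\tan\left(\frac{A-B}{2}\right)}{\tan\left(\frac{A+B}{2}\right)}$
we start with the sines law, which says that
$\frac{a}{\sin(A)} = \frac{b}{\sin(B)}.$
This implies that
$a\sin(B) = b\sin(A).$
We can write sin(A) as
$\sin(A) = \sin\left(\frac{A+B}{2}\right)\cos\left(\frac{A-B}{2}\right) + \cos\left(\frac{A+B}{2}\right)\sin\left(\frac{A-B}{2}\right)$
and sin(B) as
$\sin(B) = \sin\left(\frac{A+B}{2}\right)\cos\left(\frac{A-B}{2}\right) - \cos\left(\frac{A+B}{2}\right)\sin\left(\frac{A-B}{2}\right).$
Therefore, we have
$a\left(\sin\left(\frac{A+B}{2}\right)\cos\left(\frac{A-B}{2}\right) - \cos\left(\frac{A+B}{2}\right)\sin\left(\frac{A-B}{2}\right)\right) = b\left(\sin\left(\frac{A+B}{2}\right)\cos\left(\frac{A-B}{2}\right) + \cos\left(\frac{A+B}{2}\right)\sin\left(\frac{A-B}{2}\right)\right).$
Dividing both sides by $\cos\left(\frac{A-B}{2}\right)\cos\left(\frac{A+B}{2}\right)$, we have
$a\left(\tan\left(\frac{A+B}{2}\right) - \tan\left(\frac{A-B}{2}\right)\right) = b\left(\tan\left(\frac{A+B}{2}\right) + \tan\left(\frac{A-B}{2}\right)\right).$
This gives us
$\frac{a}{b} = \frac{\tan\left(\frac{A+B}{2}\right) + \tan\left(\frac{A-B}{2}\right)}{\tan\left(\frac{A+B}{2}\right) - \tan\left(\frac{A-B}{2}\right)}.$
Hence we find that
$\frac{a-b}{a+b} = \frac{\tan\left(\frac{A-B}{2}\right)}{\tan\left(\frac{A+B}{2}\right)}.$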
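The result can be checked on a concrete triangle. The sketch below is an illustration only: it assumes a triangle with sides 7, 5, 6, recovers the angles A and B from the law of cosines (a tool that is not used in the proof above), and compares both sides of the tangents law. The function name is hypothetical.

```python
import math

def angles_from_sides(a, b, c):
    """Angles A and B (radians) opposite the sides a and b, via the law of cosines."""
    A = math.acos((b * b + c * c - a * a) / (2 * b * c))
    B = math.acos((a * a + c * c - b * b) / (2 * a * c))
    return A, B

a, b, c = 7.0, 5.0, 6.0
A, B = angles_from_sides(a, b, c)
lhs = (a - b) / (a + b)
rhs = math.tan((A - B) / 2) / math.tan((A + B) / 2)
print(lhs, rhs)
```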
Version: 2 Owner: giri Author(s): giri
527.79
quadrilateral
A four-sided polygon. A very special kind of quadrilateral is the parallelogram (squares, rhombuses, rectangles, etc.), although cyclic quadrilaterals are also interesting in their own right. Notice, however, that there are quadrilaterals that are neither parallelograms nor cyclic quadrilaterals. Version: 2 Owner: drini Author(s): drini
527.80
radius
The radius of a circle or sphere is the distance from the center of the figure to the outer edge (or surface). This definition actually holds in n dimensions; so 4-dimensional, 5-dimensional and k-dimensional "spheres" have radii. Since a circle is really a 2-dimensional sphere, its "radius" is merely an instance of the general definition. Version: 2 Owner: akrowne Author(s): akrowne
527.81
rectangle
A parallelogram whose four angles are equal, that is, whose 4 angles are equal to 90°. Rectangles are the only parallelograms that are also cyclic. Notice that every square is also a rectangle, but there are rectangles that are not squares.
Any rectangle has its 2 diagonals equal (and rectangles are the only parallelograms with this property). A nice result following from this is that joining the midpoints of the sides of a rectangle always gives a rhombus. Version: 1 Owner: drini Author(s): drini
527.82
regular polygon
A regular polygon is a polygon with all its sides equal and all its angles equal, that is, a polygon that is both equilateral and equiangular. Some regular polygons get special names. So, a regular triangle is also known as an equilateral triangle, and a regular quadrilateral is also known as a square. The symmetry group of a regular polygon with n sides is the dihedral group $D_n$, which has 2n elements. Any regular polygon can be inscribed in a circle, and a circle can be inscribed within it. Given a regular polygon with n sides whose side has length t, the radius of the circumscribed circle is
$R = \frac{t}{2\sin(180^\circ/n)}$
and the radius of the inscribed circle is
$r = \frac{t}{2\tan(180^\circ/n)}.$
The area can also be calculated using the formula
$A = \frac{nt^2}{4\tan(180^\circ/n)}.$
Version: 3 Owner: drini Author(s): drini
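The three formulas can be compared with values measured on an explicitly constructed polygon. The sketch below is an illustration under assumptions: the polygon is inscribed in the unit circle, the inradius is measured as the distance from the center to the midpoint of a side and is compared with r = t/(2 tan(180°/n)), and the area is computed independently with the shoelace formula. All names are hypothetical.

```python
import math

def regular_polygon(n, R=1.0):
    """Vertices of a regular n-gon inscribed in a circle of radius R."""
    return [(R * math.cos(2 * math.pi * k / n),
             R * math.sin(2 * math.pi * k / n)) for k in range(n)]

def shoelace_area(vertices):
    """Area of a simple polygon given its vertices in order."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def formulas(n, t):
    """Circumradius, inradius and area of a regular n-gon with side t."""
    R = t / (2 * math.sin(math.pi / n))
    r = t / (2 * math.tan(math.pi / n))
    A = n * t * t / (4 * math.tan(math.pi / n))
    return R, r, A

def max_discrepancy(n):
    """Largest gap between the formulas and direct measurements for an n-gon."""
    V = regular_polygon(n)
    t = math.dist(V[0], V[1])
    apothem = math.hypot((V[0][0] + V[1][0]) / 2, (V[0][1] + V[1][1]) / 2)
    R, r, A = formulas(n, t)
    return max(abs(R - 1.0), abs(r - apothem), abs(A - shoelace_area(V)))

print(max(max_discrepancy(n) for n in range(3, 9)))
```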
527.83
regular polyhedron
A regular polyhedron is a polyhedron such that
• Every face is a regular polygon.
• At each vertex, the same number of edges meet.
• The dihedral angle between any two adjacent faces is always the same.
These polyhedra are also known as Platonic solids, since Plato described them in his work. There are only 5 regular polyhedra (all five appear in Plato's Timaeus), and they are:
Tetrahedron: It has 4 vertices, 6 edges and 4 faces, each one being an equilateral triangle. Its symmetry group is $S_4$.
Hexahedron: Also known as the cube. It has 8 vertices, 12 edges and 6 faces, each one being a square. Its symmetry group is $S_4 \times C_2$.
Octahedron: It has 6 vertices, 12 edges and 8 faces, each one being an equilateral triangle. Its symmetry group is $S_4 \times C_2$.
Dodecahedron: It has 20 vertices, 30 edges and 12 faces, each one being a regular pentagon. Its symmetry group is $A_5 \times C_2$.
Icosahedron: It has 12 vertices, 30 edges and 20 faces, each one being an equilateral triangle. Its symmetry group is $A_5 \times C_2$.
Here $A_n$ is the alternating group on n elements, $S_n$ is the symmetric group on n elements and $C_n$ is the cyclic group of order n. Version: 6 Owner: drini Author(s): drini
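The vertex, edge and face counts listed above can be cross-checked against two standard facts that are not stated in the entry itself and are brought in here as assumptions of the example: Euler's polyhedron formula V − E + F = 2, and the observation that every edge is shared by exactly two faces. The table and function names below are hypothetical.

```python
# (vertices, edges, faces, sides per face) for each Platonic solid
SOLIDS = {
    "tetrahedron": (4, 6, 4, 3),
    "hexahedron": (8, 12, 6, 4),
    "octahedron": (6, 12, 8, 3),
    "dodecahedron": (20, 30, 12, 5),
    "icosahedron": (12, 30, 20, 3),
}

def euler_characteristic(V, E, F):
    """V - E + F, which equals 2 for every convex polyhedron."""
    return V - E + F

def edges_from_faces(F, sides):
    """Edge count obtained by counting face sides; each edge lies on two faces."""
    return F * sides // 2

for name, (V, E, F, sides) in SOLIDS.items():
    print(name, euler_characteristic(V, E, F), edges_from_faces(F, sides) == E)
```

Each line of output shows the solid's name, the value 2, and True when the edge count is consistent with the face data.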
527.84
rhombus
A rhombus is a parallelogram with its 4 sides equal. This is not the same as being a square, since the angles need not all be equal.
In any rhombus, the diagonals are always perpendicular. A nice result following from this is that joining the midpoints of the sides always gives a rectangle. If D and d are the lengths of the diagonals, then the area of the rhombus can be computed using the formula $A = \frac{Dd}{2}$. Version: 5 Owner: drini Author(s): drini
527.85
right triangle
A triangle ABC is right when one of its angles is equal to 90◦ (and therefore has two perpendicular sides).
Version: 1 Owner: drini Author(s): drini
527.86
sector of a circle
A sector is a fraction of the interior of a circle, described by a central angle θ. If θ = 2π, the sector becomes a complete circle.
If the central angle is θ, and the radius of the circle is r, then the area of the sector is given by
$\text{Area} = \frac{1}{2}r^2\theta.$
This follows from the fact that the area of a sector is $\frac{\theta}{2\pi}$ times the area of the circle (which is $\pi r^2$). Note that, in the formula, θ is in radians. Version: 1 Owner: giri Author(s): giri
527.87
sines law
Sines Law. Let ABC be a triangle where a, b, c are the sides opposite to A, B, C respectively, and let R be the radius of the circumcircle. Then the following relation holds:
$\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} = 2R.$
Version: 10 Owner: drini Author(s): drini
527.88
sines law proof
Let ABC be a triangle. Let T be a point on the circumcircle such that BT is a diameter. So ∠A = ∠CAB is equal to ∠CTB (they subtend the same arc). Since △CBT is a right triangle, from the definition of sine we get
$\sin\angle CTB = \frac{BC}{BT} = \frac{a}{2R}.$
On the other hand, ∠CAB = ∠CTB implies their sines are the same and so
$\sin\angle CAB = \frac{a}{2R}$
and therefore
$\frac{a}{\sin A} = 2R.$
Drawing diameters passing through C and A will let us prove in a similar way the relations
$\frac{b}{\sin B} = 2R$ and $\frac{c}{\sin C} = 2R,$
and we conclude that
$\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} = 2R.$
Q.E.D. Version: 5 Owner: drini Author(s): drini
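The relation a/sin A = b/sin B = c/sin C = 2R can be verified on a triangle given by coordinates. In the sketch below, which is an illustration and not part of the proof, the circumradius is computed independently from the circumcenter, and the angles are measured with dot products. The helper names and the sample triangle are assumptions.

```python
import math

def circumradius(A, B, C):
    """Circumradius of triangle ABC, as the distance from the circumcenter to A."""
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return math.dist((ux, uy), A)

def angle_at(P, Q, R):
    """Interior angle at P of triangle PQR, in radians."""
    v = (Q[0] - P[0], Q[1] - P[1])
    w = (R[0] - P[0], R[1] - P[1])
    return math.acos((v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w)))

A, B, C = (0.0, 0.0), (6.0, 0.0), (1.0, 4.0)
a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)
ratios = [
    a / math.sin(angle_at(A, B, C)),
    b / math.sin(angle_at(B, C, A)),
    c / math.sin(angle_at(C, A, B)),
]
two_R = 2 * circumradius(A, B, C)
print(ratios, two_R)
```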
527.89
some proofs for triangle theorems
The sum of the three angles is A + B + C = 180°. The following triangle shows how the angles can be found to make a half revolution, which equals 180°. □
The area formula AREA = pr, where $p = \frac{a+b+c}{2}$ is half the perimeter and r is the radius of the inscribed circle, is proved by splitting the original triangle ABC into the triangles BAO, BCO and ACO, where O is the center of the inscribed circle. Each of these triangles has a side of ABC as its base and height r.
Then
$\text{AREA}_{ABC} = \text{AREA}_{BAO} + \text{AREA}_{BCO} + \text{AREA}_{ACO}$
$\text{AREA}_{ABC} = \frac{rc}{2} + \frac{ra}{2} + \frac{rb}{2} = \frac{r(a + b + c)}{2} = pr$ □
Version: 6 Owner: Gunnar Author(s): Gunnar
527.90
square
A square is the regular 4-gon, that is, a quadrilateral whose 4 angles and 4 sides are respectively equal. This implies a square is a parallelogram that is both a rhombus and a rectangle at the same time. Notice, however, that if a quadrilateral has its 4 sides equal, we cannot generally say it is a square, since it could be a rhombus that is not a square. If r is the length of a side, the diagonals of a square (which are equal since it's a rectangle too) have length $r\sqrt{2}$. Version: 2 Owner: drini Author(s): drini
527.91
tangents law
Let ABC be a triangle with a, b and c being the sides opposite to A, B and C respectively. Then the following relation holds:
$\frac{a-b}{a+b} = \frac{\tan\left(\frac{A-B}{2}\right)}{\tan\left(\frac{A+B}{2}\right)}.$
Version: 2 Owner: giri Author(s): giri
527.92
triangle
Triangle. A plane ﬁgure bounded by 3 straight lines.
The sum of its three (inner) angles is always 180◦ . In the ﬁgure: A + B + C = 180◦ . Triangles can be classiﬁed according to the number of their equal sides. So, a triangle with 3 equal sides is called equilateral, triangles with 2 equal sides are isosceles and ﬁnally a triangle with no equal sides is called scalene. Notice that an equilateral triangle is also isosceles, but there are isosceles triangles that are not equilateral.
Triangles can also be classiﬁed according to the size of the greatest of its three (inner) angles. If the greatest of them is less than 90◦ (and therefore all three) we say that the triangle is acute. If the triangle has a right angle, we say that it is a right triangle. If the greatest angle of the three is greater than 90◦ , we call the triangle obtuse.
There are several ways to calculate a triangle's area. Let a, b, c be the sides and A, B, C the interior angles opposite to them. Let $h_a$, $h_b$, $h_c$ be the heights drawn upon a, b, c respectively, r the inradius and R the circumradius. Finally, let $p = \frac{a+b+c}{2}$ be the semiperimeter. Then
$\text{AREA} = \frac{a h_a}{2} = \frac{b h_b}{2} = \frac{c h_c}{2}$
$= \frac{ab\sin C}{2} = \frac{bc\sin A}{2} = \frac{ca\sin B}{2}$
$= \frac{abc}{4R} = pr = \sqrt{p(p - a)(p - b)(p - c)}.$
The last formula is known as Heron's formula. Version: 18 Owner: drini Author(s): drini
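Several of these formulas can be compared on a triangle given by coordinates. The sketch below is an illustration under assumptions: the reference area comes from the coordinate (cross product) formula, which is not listed above, the angle C is recovered with the law of cosines, and the height $h_c$ is measured as the distance from C to the line AB. The function name and the sample triangle are hypothetical.

```python
import math

def triangle_areas(A, B, C):
    """Area of triangle ABC computed four different ways."""
    a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)

    # Reference value from coordinates (cross product of AB and AC)
    cross = (B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1])
    by_coordinates = abs(cross) / 2

    # Base times height: h_c is the distance from C to the line AB
    h_c = abs(cross) / c
    by_height = c * h_c / 2

    # Two sides and the sine of the included angle C
    angle_C = math.acos((a * a + b * b - c * c) / (2 * a * b))
    by_sine = a * b * math.sin(angle_C) / 2

    # Heron's formula with the semiperimeter p
    p = (a + b + c) / 2
    by_heron = math.sqrt(p * (p - a) * (p - b) * (p - c))

    return by_coordinates, by_height, by_sine, by_heron

areas = triangle_areas((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
print(areas)
```

All four values should agree with the area 6 of the sample triangle, up to rounding.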
527.93
triangle center
In every triangle there are points where special lines or circles intersect, and those points usually have very interesting geometrical properties. Such points are called triangle centers. Some examples of triangle centers are the incenter, orthocenter, centroid, circumcenter, excenters, Feuerbach point, Fermat points, etc.
For an online reference please check the Triangle Centers page. Here’s a drawing I made showing the most important lines and centers of a triangle
(XEukleides source code for the drawing)
Version: 5 Owner: drini Author(s): drini
Chapter 528 51-01 – Instructional exposition (textbooks, tutorial papers, etc.)
528.1 geometry
Geometry, or literally, the measurement of land, is among the oldest and largest areas of mathematics. For this reason, a precise deﬁnition of what geometry is is quite diﬃcult. Some approaches are listed below.
528.1.1
As invariants under certain transformations
One approach to geometry, first formulated by Felix Klein, is to describe it as the study of invariants under certain allowed transformations. This involves taking our space as a set S and considering a subgroup G of the group Bij(S) of bijections of S. Objects are subsets of S, and we consider two objects A, B ⊂ S to be equivalent if there is an f ∈ G such that f(A) = B.
528.1.2
Basic Examples
Euclidean Geometry
Euclidean geometry deals with $\mathbb{R}^n$ as a vector space along with a metric d. The allowed transformations are bijections $f : \mathbb{R}^n \to \mathbb{R}^n$ that preserve the metric, that is, d(x, y) = d(f(x), f(y)) for all $x, y \in \mathbb{R}^n$. Such maps are called isometries, and the group is often denoted by $\mathrm{Iso}(\mathbb{R}^n)$. Defining a norm by $\|x\| = d(x, 0)$, for $x \in \mathbb{R}^n$, we obtain a notion of length or distance. This gives an inner product by $\langle x, y \rangle = \frac{1}{2}\left(\|x\|^2 + \|y\|^2 - \|x - y\|^2\right)$, leading to the definition of the angle between two vectors $x, y \in \mathbb{R}^n$ to be $\angle xy = \cos^{-1}\left(\frac{\langle x, y \rangle}{\|x\| \cdot \|y\|}\right)$. It is clear that since isometries preserve the metric, they preserve distance and angle. As an example, it can be shown that the group $\mathrm{Iso}(\mathbb{R}^2)$ consists of translations, reflections, glide reflections, and rotations.
Projective Geometry
Projective geometry was motivated by how we see objects in everyday life. For example, parallel train tracks appear to meet at a point far away, even though they are always the same distance apart. In projective geometry, the primary invariant is that of incidence. The notions of parallelism and distance are not present as they are in Euclidean geometry. There are different ways of approaching projective geometry. One way is to add points at infinity to Euclidean space. For example, we may form the projective line by adding a point at infinity ∞, called the ideal point, to $\mathbb{R}$. We can then create the projective plane where for each line $l \subset \mathbb{R}^2$, we attach an ideal point, and two ordinary lines have the same ideal point if and only if they are parallel. The projective plane then consists of the regular plane $\mathbb{R}^2$ along with the ideal line, which consists of all ideal points of all ordinary lines. The idea here is to make the central projection from a point, which sends one line to another, a bijective map. Another approach is more algebraic, where we form P(V) where V is a vector space. When $V = \mathbb{R}^n$, we take the quotient of $\mathbb{R}^{n+1}$