by the PlanetMath authors Aatu, ack, akrowne, alek thiery, alinabi, almann,
alozano, antizeus, antonio, aparna, ariels, armbrusterb, AxelBoldt, basseykay,
bbukh, benjaminfjones, bhaire, brianbirgen, bs, bshanks, bwebste, cryo, danielm,
Daume, debosberg, deiudi, digitalis, djao, Dr Absentius, draisma, drini,
drummond, dublisk, Evandar, fibonaci, flynnheiss, gabor sz, GaloisRadical, gantsich,
gaurminirick, gholmes74, giri, greg, grouprly, gumau, Gunnar, Henry, iddo, igor,
imran, jamika chris, jarino, jay, jgade, jihemme, Johan, karteef, karthik,
kemyers3, Kevin OBryant, kidburla2003, KimJ, Koro, lha, lieven, livetoad, liyang,
Logan, Luci, m759, mathcam, mathwizard, matte, mclase, mhale, mike,
mikestaflogan, mps, msihl, muqabala, n3o, nerdy2, nobody, npolys, Oblomov, ottocolori,
paolini, patrickwonders, pbruin, petervr, PhysBrain, quadrate, quincynoodles,
ratboy, RevBobo, Riemann, rmilson, ruiyang, Sabean, saforres, saki, say 10,
scanez, scineram, seonyoung, slash, sleske, slider142, sprocketboy, sucrose,
superhiggs, tensorking, thedagit, Thomas Heye, thouis, Timmy, tobix, tromp, tz26,
unlord, uriw, urz, vampyr, vernondalhart, vitriol, vladm, volator, vypertd, wberry,
Wkbj79, wombat, x bas, xiaoyanggu, XJamRastafire, xriso, yark et al.
edited by Joe Corneli & Aaron Krowne
Copyright © 2004 PlanetMath.org authors. Permission is granted to copy,
distribute and/or modify this document under the terms of the GNU Free
Documentation License, Version 1.2 or any later version published by the Free Software
Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no
Back-Cover Texts. A copy of the license is included in the section entitled “GNU
Free Documentation License”.
Introduction
Welcome to the PlanetMath “One Big Book” compilation, the Free Encyclopedia of Math
ematics. This book gathers in a single document the best of the hundreds of authors and
thousands of other contributors from the PlanetMath.org web site, as of January 4, 2004.
The purpose of this compilation is to help the eﬀorts of these people reach a wider audience
and allow the beneﬁts of their work to be accessed in a greater breadth of situations.
We want to emphasize that the Free Encyclopedia of Mathematics will always be a work
in progress. Producing a book-format encyclopedia from the amorphous web of interlinked
and multidimensionally-organized entries on PlanetMath is not easy. The print medium
demands a linear presentation, and to boil the web site down into this format is a difficult,
and in some ways lossy, transformation. A major part of our editorial efforts are going into
making this transformation. We hope the organization we’ve chosen for now is useful to
readers, and in future editions you can expect continuing improvements.
The “linearization” of PlanetMath.org is not the only editorial task we must perform.
Throughout the millennia, readers have come to expect a strict standard of consistency and
correctness from print books, and we must strive to meet this standard in the PlanetMath
Book as closely as possible. This means applying more editorial control to the book form
of PlanetMath than is applied to the web site. We hope you will agree that there is
significant value to be gained from unifying style, correcting errors, and filtering out
not-yet-ready content, so we will continue to do these things.
For more details on planned improvements to this book, see the TODO file that came with
this archive. Remember that you can help us to improve this work by joining PlanetMath.org
and filing corrections, adding entries, or just participating in the community. We are also
looking for volunteers to help edit this book, or help with programming related to its
production, or to help work on Noosphere, the PlanetMath software. To send us comments
about the book, use the email address pmbook@planetmath.org. For general comments
and queries, use feedback@planetmath.org.
Happy mathing,
Joe Corneli
Aaron Krowne
Tuesday, January 27, 2004
Top-level Math Subject Classifications
00 General
01 History and biography
03 Mathematical logic and foundations
05 Combinatorics
06 Order, lattices, ordered algebraic structures
08 General algebraic systems
11 Number theory
12 Field theory and polynomials
13 Commutative rings and algebras
14 Algebraic geometry
15 Linear and multilinear algebra; matrix theory
16 Associative rings and algebras
17 Nonassociative rings and algebras
18 Category theory; homological algebra
19 $K$-theory
20 Group theory and generalizations
22 Topological groups, Lie groups
26 Real functions
28 Measure and integration
30 Functions of a complex variable
31 Potential theory
32 Several complex variables and analytic spaces
33 Special functions
34 Ordinary differential equations
35 Partial differential equations
37 Dynamical systems and ergodic theory
39 Difference and functional equations
40 Sequences, series, summability
41 Approximations and expansions
42 Fourier analysis
43 Abstract harmonic analysis
44 Integral transforms, operational calculus
45 Integral equations
46 Functional analysis
47 Operator theory
49 Calculus of variations and optimal control; optimization
51 Geometry
52 Convex and discrete geometry
53 Differential geometry
54 General topology
55 Algebraic topology
57 Manifolds and cell complexes
58 Global analysis, analysis on manifolds
60 Probability theory and stochastic processes
62 Statistics
65 Numerical analysis
68 Computer science
70 Mechanics of particles and systems
74 Mechanics of deformable solids
76 Fluid mechanics
78 Optics, electromagnetic theory
80 Classical thermodynamics, heat transfer
81 Quantum theory
82 Statistical mechanics, structure of matter
83 Relativity and gravitational theory
85 Astronomy and astrophysics
86 Geophysics
90 Operations research, mathematical programming
91 Game theory, economics, social and behavioral sciences
92 Biology and other natural sciences
93 Systems theory; control
94 Information and communication, circuits
97 Mathematics education
Table of Contents
Introduction i
Top-level Math Subject Classifications ii
Table of Contents iv
GNU Free Documentation License lii
UNCLA – Unclassiﬁed 1
Golomb ruler 1
Hesse conﬁguration 1
Jordan’s Inequality 2
Lagrange’s theorem 2
Laurent series 3
Lebesgue measure 3
Leray spectral sequence 4
Möbius transformation 4
Mordell-Weil theorem 4
Plateau’s Problem 5
Poisson random variable 5
Shannon’s theorem 6
Shapiro inequality 9
Sylow p-subgroups 9
Tchirnhaus transformations 9
Wallis formulae 10
ascending chain condition 10
bounded 10
bounded operator 11
complex projective line 12
converges uniformly 12
descending chain condition 13
diamond theorem 13
equivalently oriented bases 13
finitely generated R-module 14
fraction 14
group of covering transformations 15
idempotent 15
isolated 17
isolated singularity 17
isomorphic groups 17
joint continuous density function 18
joint cumulative distribution function 18
joint discrete density function 19
left function notation 20
lift of a submanifold 20
limit of a real function exists at a point 20
Lipschitz function 21
lognormal random variable 21
lowest upper bound 22
marginal distribution 22
measurable space 23
measure zero 23
minimum spanning tree 23
minimum weighted path length 24
mod 2 intersection number 25
moment generating function 27
monoid 27
monotonic operator 27
multidimensional Gaussian integral 28
multi-index 29
near operators 30
negative binomial random variable 36
normal random variable 37
normalizer of a subset of a group 38
nth root 38
null tree 40
open ball 40
opposite ring 40
orbit-stabilizer theorem 41
orthogonal 41
permutation group on a set 41
prime element 42
product measure 43
projective line 43
projective plane 43
proof of calculus theorem used in the Lagrange method 44
proof of orbit-stabilizer theorem 45
proof of power rule 45
proof of primitive element theorem 47
proof of product rule 47
proof of sum rule 48
proof that countable unions are countable 48
quadrature 48
quotient module 49
regular expression 49
regular language 50
right function notation 51
ring homomorphism 51
scalar 51
Schrödinger operator 51
selection sort 52
semiring 53
simple function 54
simple path 54
solutions of an equation 54
spanning tree 54
square root 55
stable sorting algorithm 56
standard deviation 56
stochastic independence 56
substring 57
successor 57
sum rule 58
superset 58
symmetric polynomial 59
the argument principle 59
torsion-free module 59
total order 60
tree traversals 60
trie 63
unit vector 64
unstable ﬁxed point 65
weak* convergence in normed linear space 65
well-ordering principle for natural numbers 65
00-01 – Instructional exposition (textbooks, tutorial papers, etc.) 66
dimension 66
toy theorem 67
00-XX – General 68
method of exhaustion 68
00A05 – General mathematics 69
Conway’s chained arrow notation 69
Knuth’s up arrow notation 70
arithmetic progression 70
arity 71
introducing 0th power 71
lemma 71
property 72
saddle point approximation 72
singleton 73
subsequence 73
surreal number 73
00A07 – Problem books 76
Nesbitt’s inequality 76
proof of Nesbitt’s inequality 76
00A20 – Dictionaries and other general
reference works 78
completing the square 78
00A99 – Miscellaneous topics 80
QED 80
TFAE 80
WLOG 81
order of operations 81
01A20 – Greek, Roman 84
Roman numerals 84
01A55 – 19th century 85
Poincaré, Jules Henri 85
01A60 – 20th century 90
Bourbaki, Nicolas 90
Erdős Number 97
03-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 98
Burali-Forti paradox 98
Cantor’s paradox 98
Russell’s paradox 99
biconditional 99
bijection 100
cartesian product 100
chain 100
characteristic function 101
concentric circles 101
conjunction 102
disjoint 102
empty set 102
even number 103
ﬁxed point 103
inﬁnite 103
injective function 104
integer 104
inverse function 105
linearly ordered 106
operator 106
ordered pair 106
ordering relation 106
partition 107
pullback 107
set closed under an operation 108
signature of a permutation 109
subset 109
surjective 110
transposition 110
truth table 111
03-XX – Mathematical logic and foundations 112
standard enumeration 112
03B05 – Classical propositional logic 113
CNF 113
Proof that contrapositive statement is true using
logical equivalence 113
contrapositive 114
disjunction 114
equivalent 114
implication 115
propositional logic 115
theory 116
transitive 116
truth function 117
03B10 – Classical first-order logic 118
∆1 bootstrapping 118
Boolean 119
Gödel numbering 120
Gödel’s incompleteness theorems 120
Lindenbaum algebra 127
Lindström’s theorem 128
Presburger arithmetic 129
R-minimal element 129
Skolemization 129
arithmetical hierarchy 129
arithmetical hierarchy is a proper hierarchy 130
atomic formula 131
creating an inﬁnite model 131
criterion for consistency of sets of formulas 132
deductions are ∆1 132
example of Gödel numbering 134
example of well-founded induction 135
ﬁrst order language 136
ﬁrst order logic 137
ﬁrst order theories 138
free and bound variables 138
generalized quantiﬁer 139
logic 140
proof of compactness theorem for ﬁrst order logic
141
proof of principle of transﬁnite induction 141
proof of the wellfounded induction principle 141
quantiﬁer 141
quantiﬁer free 144
subformula 144
syntactic compactness theorem for ﬁrst order logic
144
transﬁnite induction 144
universal relation 145
universal relations exist for each level of the arithmetical hierarchy 145
well-founded induction 146
well-founded induction on formulas 147
03B15 – Higher-order logic and type theory 143
Härtig’s quantifier 143
Russell’s theory of types 143
analytic hierarchy 145
gametheoretical quantiﬁer 146
logical language 147
second order logic 148
03B40 – Combinatory logic and lambda
calculus 150
Church integer 150
combinatory logic 150
lambda calculus 151
03B48 – Probability and inductive logic
154
conditional probability 154
03B99 – Miscellaneous 155
Beth property 155
Hofstadter’s MIU system 155
IF-logic 157
Tarski’s result on the undeﬁnability of Truth 160
axiom 161
compactness 164
consistent 164
interpolation property 164
sentence 165
03Bxx – General logic 166
Banach-Tarski paradox 166
03C05 – Equational classes, universal al
gebra 168
congruence 168
every congruence is the kernel of a homomorphism 168
homomorphic image of a Σ-structure is a Σ-structure 169
kernel 169
kernel of a homomorphism is a congruence 169
quotient structure 170
03C07 – Basic properties of first-order languages and structures 171
Models constructed from constants 171
Stone space 172
alphabet 173
axiomatizable theory 174
deﬁnable 174
deﬁnable type 175
downward Löwenheim-Skolem theorem 176
example of deﬁnable type 176
example of strongly minimal 177
ﬁrst isomorphism theorem 177
language 178
length of a string 179
proof of homomorphic image of a Σ-structure is a Σ-structure 179
satisfaction relation 180
signature 181
strongly minimal 181
structure preserving mappings 181
structures 182
substructure 183
type 183
upward Löwenheim-Skolem theorem 183
03C15 – Denumerable structures 185
random graph (inﬁnite) 185
03C35 – Categoricity and completeness of
theories 187
κ-categorical 187
Vaught’s test 187
proof of Vaught’s test 187
03C50 – Models with special properties
(saturated, rigid, etc.) 189
example of universal structure 189
homogeneous 191
universal structure 191
03C52 – Properties of classes of models
192
amalgamation property 192
03C64 – Model theory of ordered structures; o-minimality 193
inﬁnitesimal 193
o-minimality 194
real closed ﬁelds 194
03C68 – Other classical first-order model theory 196
imaginaries 196
03C90 – Nonclassical models (Boolean-valued, sheaf, etc.) 198
Boolean valued model 198
03C99 – Miscellaneous 199
axiom of foundation 199
elementarily equivalent 199
elementary embedding 200
model 200
proof equivalence of formulation of foundation
201
03D10 – Turing machines and related notions 203
Turing machine 203
03D20 – Recursive functions and relations,
subrecursive hierarchies 206
primitive recursive 206
03D25 – Recursively (computably) enumerable sets and degrees 207
recursively enumerable 207
03D75 – Abstract and axiomatic computability and recursion theory 208
Ackermann function 208
halting problem 209
03E04 – Ordered sets and their cofinalities; pcf theory 211
another deﬁnition of coﬁnality 211
coﬁnality 211
maximal element 212
partitions less than coﬁnality 213
well ordered set 213
pigeonhole principle 213
proof of pigeonhole principle 213
tree (set theoretic) 214
κ-complete 215
Cantor’s diagonal argument 215
Fodor’s lemma 216
Schroeder-Bernstein theorem 216
Veblen function 216
additively indecomposable 217
cardinal number 217
cardinal successor 217
cardinality 218
cardinality of a countable union 218
cardinality of the rationals 219
classes of ordinals and enumerating functions 219
club 219
club ﬁlter 220
countable 220
countably inﬁnite 221
ﬁnite 221
ﬁxed points of normal functions 221
height of an algebraic number 221
if A is infinite and B is a finite subset of A, then A ∖ B is infinite 222
limit cardinal 222
natural number 223
ordinal arithmetic 224
ordinal number 225
power set 225
proof of Fodor’s lemma 225
proof of Schroeder-Bernstein theorem 225
proof of ﬁxed points of normal functions 226
proof of the existence of transcendental numbers
226
proof of theorems in additively indecomposable 227
proof that the rationals are countable 228
stationary set 228
successor cardinal 229
uncountable 229
von Neumann integer 229
von Neumann ordinal 230
weakly compact cardinal 231
weakly compact cardinals and the tree property
231
Cantor’s theorem 232
proof of Cantor’s theorem 232
additive 232
antisymmetric 233
constant function 233
direct image 234
domain 234
dynkin system 234
equivalence class 235
ﬁbre 235
ﬁltration 236
ﬁnite character 236
ﬁx (transformation actions) 236
function 237
functional 237
generalized cartesian product 238
graph 238
identity map 238
inclusion mapping 239
inductive set 239
invariant 240
inverse function theorem 240
inverse image 241
mapping 242
mapping of period n is a bijection 242
partial function 242
partial mapping 243
period of mapping 243
pi-system 244
proof of inverse function theorem 244
proper subset 246
range 246
reﬂexive 246
relation 246
restriction of a mapping 247
set diﬀerence 247
symmetric 247
symmetric diﬀerence 248
the inverse image commutes with set operations
248
transformation 249
transitive 250
transitive 250
transitive closure 250
Hausdorﬀ’s maximum principle 250
Kuratowski’s lemma 251
Tukey’s lemma 251
Zermelo’s postulate 251
Zermelo’s well-ordering theorem 251
Zorn’s lemma 252
axiom of choice 252
equivalence of Hausdorff’s maximum principle, Zorn’s lemma and the well-ordering theorem 252
equivalence of Zorn’s lemma and the axiom of choice 253
maximality principle 254
principle of ﬁnite induction 254
principle of finite induction proven from well-ordering principle 255
proof of Tukey’s lemma 255
proof of Zermelo’s well-ordering theorem 255
axiom of extensionality 256
axiom of inﬁnity 256
axiom of pairing 257
axiom of power set 258
axiom of union 258
axiom schema of separation 259
de Morgan’s laws 260
de Morgan’s laws for sets (proof) 261
set theory 261
union 264
universe 264
von Neumann-Bernays-Gödel set theory 265
FS iterated forcing preserves chain condition 267
chain condition 268
composition of forcing notions 268
composition preserves chain condition 268
equivalence of forcing notions 269
forcing relation 270
forcings are equivalent if one is dense in the other
270
iterated forcing 272
iterated forcing and composition 273
name 273
partial order with chain condition does not collapse cardinals 274
proof of partial order with chain condition does
not collapse cardinals 274
proof that forcing notions are equivalent to their
composition 275
complete partial orders do not add small subsets
280
proof of complete partial orders do not add small
subsets 280
♦ is equivalent to ♣ and continuum hypothesis 281
Levy collapse 281
proof of ♦ is equivalent to ♣ and continuum hypothesis 282
Martin’s axiom 283
Martin’s axiom and the continuum hypothesis
283
Martin’s axiom is consistent 284
a shorter proof: Martin’s axiom and the continuum hypothesis 287
continuum hypothesis 288
forcing 288
generalized continuum hypothesis 289
inaccessible cardinals 290
♦ 290
♣ 290
Dedekind inﬁnite 291
ZermeloFraenkel axioms 291
class 291
complement 293
delta system 293
delta system lemma 293
diagonal intersection 293
intersection 294
multiset 294
proof of delta system lemma 294
rational number 295
saturated (set) 295
separation and doubletons axiom 295
set 296
03Exx – Set theory 299
intersection 299
03F03 – Proof theory, general 300
NJp 300
NKp 300
natural deduction 301
sequent 301
sound, complete 302
03F07 – Structure of proofs 303
induction 303
03F30 – First-order arithmetic and fragments 307
Elementary Functional Arithmetic 307
PA 308
Peano arithmetic 308
03F35 – Second- and higher-order arithmetic and fragments 310
ACA0 310
RCA0 310
Z2 310
comprehension axiom 311
induction axiom 311
03G05 – Boolean algebras 313
Boolean algebra 313
M. H. Stone’s representation theorem 313
03G10 – Lattices and related structures
314
Boolean lattice 314
complete lattice 314
lattice 315
03G99 – Miscellaneous 316
Chu space 316
Chu transform 316
biextensional collapse 317
example of Chu space 317
property of a Chu space 318
05-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 319
example of pigeonhole principle 319
multi-index derivative of a power 319
multi-index notation 320
05A10 – Factorials, binomial coeﬃcients,
combinatorial functions 322
Catalan numbers 322
Levi-Civita permutation symbol 323
Pascal’s rule (bit string proof) 325
Pascal’s rule proof 326
Pascal’s triangle 326
Upper and lower bounds to binomial coeﬃcient
328
binomial coeﬃcient 328
double factorial 329
factorial 329
falling factorial 330
inductive proof of binomial theorem 331
multinomial theorem 332
multinomial theorem (proof) 333
proof of upper and lower bounds to binomial coefficient 334
05A15 – Exact enumeration problems, generating functions 336
Stirling numbers of the ﬁrst kind 336
Stirling numbers of the second kind 338
05A19 – Combinatorial identities 342
Pascal’s rule 342
05A99 – Miscellaneous 343
principle of inclusion-exclusion 343
principle of inclusion-exclusion proof 344
05B15 – Orthogonal arrays, Latin squares,
Room squares 346
example of Latin squares 346
graeco-latin squares 346
latin square 347
magic square 347
05B35 – Matroids, geometric lattices 348
matroid 348
polymatroid 353
05C05 – Trees 354
AVL tree 354
Aronszajn tree 354
Suslin tree 354
antichain 355
balanced tree 355
binary tree 355
branch 356
child node (of a tree) 356
complete binary tree 357
digital search tree 357
digital tree 358
example of Aronszajn tree 358
example of tree (set theoretic) 359
extended binary tree 359
external path length 360
internal node (of a tree) 360
leaf node (of a tree) 361
parent node (in a tree) 361
proof that ω has the tree property 362
root (of a tree) 362
tree 363
weight-balanced binary trees are ultrametric 364
weighted path length 366
05C10 – Topological graph theory, imbedding 367
Heawood number 367
Kuratowski’s theorem 368
Szemerédi-Trotter theorem 368
crossing lemma 369
crossing number 369
graph topology 369
planar graph 370
proof of crossing lemma 370
05C12 – Distance in graphs 372
Hamming distance 372
05C15 – Coloring of graphs and hypergraphs 373
bipartite graph 373
chromatic number 374
chromatic number and girth 375
chromatic polynomial 375
colouring problem 376
complete bipartite graph 377
complete k-partite graph 378
four-color conjecture 378
k-partite graph 379
property B 380
05C20 – Directed graphs (digraphs), tournaments 381
cut 381
de Bruijn digraph 381
directed graph 382
ﬂow 383
maximum ﬂow/minimum cut theorem 384
tournament 385
05C25 – Graphs and groups 387
Cayley graph 387
05C38 – Paths and cycles 388
Euler path 388
Veblen’s theorem 388
acyclic graph 389
bridges of Königsberg 389
cycle 390
girth 391
path 391
proof of Veblen’s theorem 392
05C40 – Connectivity 393
k-connected graph 393
Thomassen’s theorem on 3-connected graphs 393
Tutte’s wheel theorem 394
connected graph 394
cut-vertex 395
05C45 – Eulerian and Hamiltonian graphs
396
Bondy and Chvátal theorem 396
Dirac theorem 396
Euler circuit 397
Fleury’s algorithm 397
Hamiltonian cycle 398
Hamiltonian graph 398
Hamiltonian path 398
Ore’s theorem 398
Petersen graph 399
hypohamiltonian 399
traceable 399
05C60 – Isomorphism problems (reconstruction conjecture, etc.) 400
graph isomorphism 400
05C65 – Hypergraphs 402
Steiner system 402
ﬁnite plane 402
hypergraph 403
linear space 404
05C69 – Dominating sets, independent sets,
cliques 405
Mantel’s theorem 405
clique 405
proof of Mantel’s theorem 405
05C70 – Factorization, matching, covering
and packing 407
Petersen theorem 407
Tutte theorem 407
bipartite matching 407
edge covering 409
matching 409
maximal bipartite matching algorithm 410
maximal matching/minimal edge covering theorem 411
05C75 – Structural characterization of types
of graphs 413
multigraph 413
pseudograph 413
05C80 – Random graphs 414
examples of probabilistic proofs 414
probabilistic method 415
05C90 – Applications 417
Hasse diagram 417
05C99 – Miscellaneous 419
Euler’s polyhedron theorem 419
Poincaré formula 419
Turán’s theorem 419
Wagner’s theorem 420
block 420
bridge 420
complete graph 420
degree (of a vertex) 421
distance (in a graph) 421
edge-contraction 421
graph 422
graph minor theorem 422
graph theory 423
homeomorphism 424
loop 424
minor (of a graph) 424
neighborhood (of a vertex) 425
null graph 425
order (of a graph) 425
proof of Euler’s polyhedron theorem 426
proof of Turán’s theorem 427
realization 427
size (of a graph) 428
subdivision 428
subgraph 429
wheel graph 429
05D05 – Extremal set theory 431
LYM inequality 431
Sperner’s theorem 432
05D10 – Ramsey theory 433
Erdős-Rado theorem 433
Ramsey’s theorem 433
Ramsey’s theorem 434
arrows 435
coloring 436
proof of Ramsey’s theorem 437
05D15 – Transversal (matching) theory 438
Hall’s marriage theorem 438
proof of Hall’s marriage theorem 438
saturate 440
system of distinct representatives 440
05E05 – Symmetric functions 441
elementary symmetric polynomial 441
reduction algorithm for symmetric polynomials
441
06-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 443
equivalence relation 443
06-XX – Order, lattices, ordered algebraic structures 445
join 445
meet 445
06A06 – Partial order, general 446
directed set 446
inﬁmum 446
sets that do not have an inﬁmum 447
supremum 447
upper bound 448
06A99 – Miscellaneous 449
dense (in a poset) 449
partial order 449
poset 450
quasiorder 450
well quasi ordering 450
06B10 – Ideals, congruence relations 452
order in an algebra 452
06C05 – Modular lattices, Desarguesian
lattices 453
modular lattice 453
06D99 – Miscellaneous 454
distributive 454
distributive lattice 454
06E99 – Miscellaneous 455
Boolean ring 455
08A40 – Operations, polynomials, primal
algebras 456
coeﬃcients of a polynomial 456
08A99 – Miscellaneous 457
binary operation 457
ﬁltered algebra 457
11-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 459
Euler phi-function 459
Euler-Fermat theorem 460
Fermat’s little theorem 460
Fermat’s theorem proof 460
Goldbach’s conjecture 460
Jordan’s totient function 461
Legendre symbol 461
Pythagorean triplet 462
Wilson’s theorem 462
arithmetic mean 462
ceiling 463
computation of powers using Fermat’s little theorem 463
congruences 464
coprime 464
cube root 464
ﬂoor 465
geometric mean 465
googol 466
googolplex 467
greatest common divisor 467
group theoretic proof of Wilson’s theorem 467
harmonic mean 467
mean 468
number ﬁeld 468
pi 468
proof of Wilson’s theorem 470
proof of fundamental theorem of arithmetic 471
root of unity 471
11-01 – Instructional exposition (textbooks, tutorial papers, etc.) 472
base 472
11-XX – Number theory 474
Lehmer’s Conjecture 474
Sierpinski conjecture 474
prime triples conjecture 475
11A05 – Multiplicative structure; Euclidean
algorithm; greatest common divisors 476
Bezout’s lemma (number theory) 476
Euclid’s algorithm 476
Euclid’s lemma 478
Euclid’s lemma proof 478
fundamental theorem of arithmetic 479
perfect number 479
smooth number 480
11A07 – Congruences; primitive roots; residue
systems 481
Anton’s congruence 481
Fermat’s Little Theorem proof (Inductive) 482
Jacobi symbol 483
Shanks-Tonelli algorithm 483
Wieferich prime 483
Wilson’s theorem for prime powers 484
factorial modulo prime powers 485
proof of Euler-Fermat theorem 485
proof of Lucas’s theorem 486
11A15 – Power residues, reciprocity 487
Euler’s criterion 487
Gauss’ lemma 487
Zolotarev’s lemma 489
cubic reciprocity law 491
proof of Euler’s criterion 493
proof of quadratic reciprocity rule 494
quadratic character of 2 495
quadratic reciprocity for polynomials 496
quadratic reciprocity rule 497
quadratic residue 497
11A25 – Arithmetic functions; related numbers; inversion formulas 498
Dirichlet character 498
Liouville function 498
Mangoldt function 499
Mertens’ ﬁrst theorem 499
Moebius function 499
Moebius inversion 500
arithmetic function 502
multiplicative function 503
nonmultiplicative function 505
totient 507
unit 507
11A41 – Primes 508
Chebyshev functions 508
Euclid’s proof of the inﬁnitude of primes 509
Mangoldt summatory function 509
Mersenne numbers 510
Thue’s lemma 510
composite number 511
prime 511
prime counting function 511
prime diﬀerence function 512
prime number theorem 512
prime number theorem result 513
proof of Thue’s Lemma 514
semiprime 515
sieve of Eratosthenes 516
test for primality of Mersenne numbers 516
11A51 – Factorization; primality 517
Fermat Numbers 517
Fermat compositeness test 517
Zsigmondy’s theorem 518
divisibility 518
division algorithm for integers 519
proof of division algorithm for integers 519
squarefree number 520
squarefull number 520
the prime power dividing a factorial 521
11A55 – Continued fractions 523
SternBrocot tree 523
continued fraction 524
11A63 – Radix representation; digital problems 527
Kummer’s theorem 527
corollary of Kummer’s theorem 528
11A67 – Other representations 529
Sierpinski Erdős egyptian fraction conjecture 529
adjacent fraction 529
any rational number is a sum of unit fractions
530
conjecture on fractions with odd denominators
532
unit fraction 532
11A99 – Miscellaneous 533
ABC conjecture 533
Suranyi theorem 533
irrational to an irrational power can be rational
534
triangular numbers 534
11B05 – Density, gaps, topology 536
Cauchy-Davenport theorem 536
Mann’s theorem 536
Schnirelmann density 537
Sidon set 537
asymptotic density 538
discrete space 538
essential component 539
normal order 539
11B13 – Additive bases 541
Erdős-Turán conjecture 541
additive basis 542
asymptotic basis 542
base conversion 542
sumset 546
11B25 – Arithmetic progressions 547
Behrend’s construction 547
Freiman’s theorem 548
Szemerédi’s theorem 548
multidimensional arithmetic progression 549
11B34 – Representation functions 550
Erdős-Fuchs theorem 550
11B37 – Recurrences 551
Collatz problem 551
recurrence relation 551
11B39 – Fibonacci and Lucas numbers and
polynomials and generalizations 553
Fibonacci sequence 553
Hogatt’s theorem 554
Lucas numbers 554
golden ratio 554
11B50 – Sequences (mod m) 556
Erdős-Ginzburg-Ziv theorem 556
11B57 – Farey sequences; the sequences ?
557
Farey sequence 557
11B65 – Binomial coefficients; factorials; q-identities 559
Lucas’s Theorem 559
binomial theorem 559
11B68 – Bernoulli and Euler numbers and
polynomials 561
Bernoulli number 561
Bernoulli periodic function 561
Bernoulli polynomial 562
generalized Bernoulli number 562
11B75 – Other combinatorial number theory 563
Erdős-Heilbronn conjecture 563
Freiman isomorphism 563
sumfree 564
11B83 – Special sequences and polynomials 565
Beatty sequence 565
Beatty’s theorem 566
Fraenkel’s partition theorem 566
Sierpinski numbers 567
palindrome 567
proof of Beatty’s theorem 568
squarefree sequence 569
superincreasing sequence 569
11B99 – Miscellaneous 570
Lychrel number 570
closed form 571
11C08 – Polynomials 573
content of a polynomial 573
cyclotomic polynomial 573
height of a polynomial 574
length of a polynomial 574
proof of Eisenstein criterion 574
proof that the cyclotomic polynomial is irreducible
575
11D09 – Quadratic and bilinear equations
577
Pell’s equation and simple continued fractions
577
11D41 – Higher degree equations; Fermat’s
equation 578
Beal conjecture 578
Euler quartic conjecture 579
Fermat’s last theorem 580
11D79 – Congruences in many variables
582
Chinese remainder theorem 582
Chinese remainder theorem proof 583
11D85 – Representation problems 586
polygonal number 586
11D99 – Miscellaneous 588
Diophantine equation 588
11E39 – Bilinear and Hermitian forms 590
Hermitian form 590
nondegenerate bilinear form 590
positive deﬁnite form 591
symmetric bilinear form 591
Cliﬀord algebra 591
11Exx – Forms and linear algebraic groups
593
quadratic function associated with a linear functional 593
11F06 – Structure of modular groups and
generalizations; arithmetic groups 594
Taniyama-Shimura theorem 594
11F30 – Fourier coefficients of automorphic forms 597
Fourier coeﬃcients 597
11F67 – Special values of automorphic L-series, periods of modular forms, cohomology, modular symbols 598
Schanuel’s conjecture 598
period 598
11G05 – Elliptic curves over global ﬁelds
600
complex multiplication 600
11H06 – Lattices and convex bodies 602
Minkowski’s theorem 602
lattice in R^n 602
11H46 – Products of linear forms 604
triple scalar product 604
11J04 – Homogeneous approximation to
one number 605
Dirichlet’s approximation theorem 605
11J68 – Approximation to algebraic numbers 606
DavenportSchmidt theorem 606
Liouville approximation theorem 606
proof of Liouville approximation theorem 607
11J72 – Irrationality; linear independence
over a ﬁeld 609
nth root of 2 is irrational for n ≥ 3 (proof using
Fermat’s last theorem) 609
e is irrational (proof) 610
irrational 610
square root of 2 is irrational 611
11J81 – Transcendence (general theory)
612
Fundamental Theorem of Transcendence 612
Gelfond’s theorem 612
four exponentials conjecture 612
six exponentials theorem 613
transcendental number 614
11K16 – Normal numbers, radix expan
sions, etc. 615
absolutely normal 615
11K45 – Pseudorandom numbers; Monte
Carlo methods 617
pseudorandom numbers 617
quasirandom numbers 618
random numbers 619
truly random numbers 619
11L03 – Trigonometric and exponential sums,
general 620
Ramanujan sum 620
11L05 – Gauss and Kloosterman sums; generalizations 622
Gauss sum 622
Kloosterman sum 623
Landsberg-Schaar relation 623
derivation of Gauss sum up to a sign 624
11L40 – Estimates on character sums 625
Pólya-Vinogradov inequality 625
11M06 – ζ(s) and L(s, χ) 627
Apéry’s constant 627
Dedekind zeta function 627
Dirichlet L-series 628
Riemann θ-function 629
Riemann Xi function 630
Riemann omega function 630
functional equation for the Riemann Xi function
630
functional equation for the Riemann theta func
tion 631
generalized Riemann hypothesis 631
proof of functional equation for the Riemann theta
function 631
11M99 – Miscellaneous 633
Riemann zeta function 633
formulae for zeta in the critical strip 636
functional equation of the Riemann zeta function
638
value of the Riemann zeta function at s = 2 638
11N05 – Distribution of primes 640
Bertrand’s conjecture 640
Brun’s constant 640
proof of Bertrand’s conjecture 640
twin prime conjecture 642
11N13 – Primes in progressions 643
primes in progressions 643
11N32 – Primes represented by polynomi
als; other multiplicative structure of poly
nomial values 644
Euler four-square identity 644
11N56 – Rate of growth of arithmetic func
tions 645
highly composite number 645
11N99 – Miscellaneous 646
Chinese remainder theorem 646
proof of Chinese remainder theorem 646
11P05 – Waring’s problem and variants
648
Lagrange’s four-square theorem 648
Waring’s problem 648
proof of Lagrange’s four-square theorem 649
11P81 – Elementary theory of partitions
651
pentagonal number theorem 651
11R04 – Algebraic numbers; rings of alge
braic integers 653
Dedekind domain 653
Dirichlet’s unit theorem 653
Eisenstein integers 654
Galois representation 654
Gaussian integer 658
algebraic conjugates 659
algebraic integer 659
algebraic number 659
algebraic number ﬁeld 659
calculating the splitting of primes 660
characterization in terms of prime ideals 661
ideal classes form an abelian group 661
integral basis 661
integrally closed 662
transcendental root theorem 662
11R06 – PV-numbers and generalizations;
other special algebraic numbers 663
Salem number 663
11R11 – Quadratic extensions 664
prime ideal decomposition in quadratic exten
sions of ℚ 664
11R18 – Cyclotomic extensions 666
Kronecker-Weber theorem 666
examples of regular primes 667
prime ideal decomposition in cyclotomic exten
sions of ℚ 668
regular prime 669
11R27 – Units and factorization 670
regulator 670
11R29 – Class numbers, class groups, dis
criminants 672
Existence of Hilbert Class Field 672
class number formula 673
discriminant 673
ideal class 674
ray class group 675
11R32 – Galois theory 676
Galois criterion for solvability of a polynomial by
radicals 676
11R34 – Galois cohomology 677
Hilbert Theorem 90 677
11R37 – Class ﬁeld theory 678
Artin map 678
Tchebotarev density theorem 679
modulus 679
multiplicative congruence 680
ray class ﬁeld 680
11R56 – Adèle rings and groups 682
adèle 682
idèle 682
restricted direct product 683
11R99 – Miscellaneous 684
Henselian ﬁeld 684
valuation 685
11S15 – Ramiﬁcation and extension the
ory 686
decomposition group 686
examples of prime ideal decomposition in num
ber ﬁelds 688
inertial degree 691
ramiﬁcation index 692
unramiﬁed action 697
11S31 – Class ﬁeld theory; p-adic formal
groups 699
Hilbert symbol 699
11S99 – Miscellaneous 700
p-adic integers 700
local ﬁeld 701
11Y05 – Factorization 703
Pollard’s rho method 703
quadratic sieve 706
11Y55 – Calculation of integer sequences
709
Kolakoski sequence 709
11Z05 – Miscellaneous applications of num
ber theory 711
τ function 711
arithmetic derivative 711
example of arithmetic derivative 712
proof that τ(n) is the number of positive divisors
of n 712
12-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 714
monomial 714
order and degree of polynomial 715
12-XX – Field theory and polynomials 716
homogeneous polynomial 716
subﬁeld 716
12D05 – Polynomials: factorization 717
factor theorem 717
proof of factor theorem 717
proof of rational root theorem 718
rational root theorem 719
sextic equation 719
12D10 – Polynomials: location of zeros
(algebraic theorems) 720
Cardano’s derivation of the cubic formula 720
Ferrari-Cardano derivation of the quartic formula
721
Galois-theoretic derivation of the cubic formula
722
Galois-theoretic derivation of the quartic formula
724
cubic formula 728
derivation of quadratic formula 728
quadratic formula 729
quartic formula 730
reciprocal polynomial 730
root 731
variant of Cardano’s derivation 732
12D99 – Miscellaneous 733
Archimedean property 733
complex 734
complex conjugate 735
complex number 737
examples of totally real ﬁelds 738
fundamental theorem of algebra 739
imaginary 739
imaginary unit 739
indeterminate form 739
inequalities for real numbers 740
interval 742
modulus of complex number 743
proof of fundamental theorem of algebra 744
proof of the fundamental theorem of algebra 744
real and complex embeddings 744
real number 746
totally real and imaginary ﬁelds 747
12E05 – Polynomials (irreducibility, etc.)
748
Gauss’s Lemma I 748
Gauss’s Lemma II 749
discriminant 749
polynomial ring 751
resolvent 751
de Moivre identity 754
monic 754
Wedderburn’s Theorem 754
proof of Wedderburn’s theorem 755
second proof of Wedderburn’s theorem 756
ﬁnite ﬁeld 757
Frobenius automorphism 760
characteristic 761
characterization of ﬁeld 761
example of an inﬁnite ﬁeld of ﬁnite characteristic
762
examples of ﬁelds 762
ﬁeld 764
ﬁeld homomorphism 764
prime subﬁeld 765
12F05 – Algebraic extensions 766
a ﬁnite extension of ﬁelds is an algebraic exten
sion 766
algebraic closure 767
algebraic extension 767
algebraically closed 767
algebraically dependent 768
existence of the minimal polynomial 768
ﬁnite extension 769
minimal polynomial 769
norm 770
primitive element theorem 770
splitting ﬁeld 770
the ﬁeld extension ℝ/ℚ is not ﬁnite 771
trace 771
12F10 – Separable extensions, Galois the
ory 772
Abelian extension 772
Fundamental Theorem of Galois Theory 772
Galois closure 773
Galois conjugate 773
Galois extension 773
Galois group 773
absolute Galois group 774
cyclic extension 774
example of nonperfect ﬁeld 774
ﬁxed ﬁeld 774
inﬁnite Galois theory 774
normal closure 776
normal extension 776
perfect ﬁeld 777
radical extension 777
separable 777
separable closure 778
12F20 – Transcendental extensions 779
transcendence degree 779
12F99 – Miscellaneous 780
composite ﬁeld 780
extension ﬁeld 780
12J15 – Ordered ﬁelds 782
ordered ﬁeld 782
13-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 783
absolute value 783
associates 784
cancellation ring 784
comaximal 784
every prime ideal is radical 784
module 785
radical of an ideal 786
ring 786
subring 787
tensor product 787
13-XX – Commutative rings and algebras
789
commutative ring 789
13A02 – Graded rings 790
graded ring 790
13A05 – Divisibility 791
Eisenstein criterion 791
13A10 – Radical theory 792
Hilbert’s Nullstellensatz 792
nilradical 792
radical of an integer 793
13A15 – Ideals; multiplicative ideal theory 794
contracted ideal 794
existence of maximal ideals 794
extended ideal 795
fractional ideal 796
homogeneous ideal 797
ideal 797
maximal ideal 797
principal ideal 798
the set of prime ideals of a commutative ring
with identity 798
13A50 – Actions of groups on commuta
tive rings; invariant theory 799
Schwarz (1975) theorem 799
invariant polynomial 800
13A99 – Miscellaneous 801
Lagrange’s identity 801
characteristic 802
cyclic ring 802
proof of Euler four-square identity 803
proof that every subring of a cyclic ring is a cyclic
ring 804
proof that every subring of a cyclic ring is an
ideal 804
zero ring 805
13B02 – Extension theory 806
algebraic 806
moduleﬁnite 806
13B05 – Galois theory 807
algebraic 807
13B21 – Integral dependence 808
integral 808
13B22 – Integral closure of rings and ide
als; integrally closed rings, related rings
(Japanese, etc.) 809
integral closure 809
13B30 – Quotients and localization 810
fraction ﬁeld 810
localization 810
multiplicative set 811
13C10 – Projective and free modules and
ideals 812
example of free module 812
13C12 – Torsion modules and ideals 813
torsion element 813
13C15 – Dimension theory, depth, related
rings (catenary, etc.) 814
Krull’s principal ideal theorem 814
13C99 – Miscellaneous 815
ArtinRees theorem 815
Nakayama’s lemma 815
prime ideal 815
proof of Nakayama’s lemma 816
proof of Nakayama’s lemma 817
support 817
13E05 – Noetherian rings and modules 818
Hilbert basis theorem 818
Noetherian module 818
proof of Hilbert basis theorem 819
ﬁnitely generated modules over a principal ideal
domain 819
13F07 – Euclidean rings and generaliza
tions 821
Euclidean domain 821
Euclidean valuation 821
proof of Bezout’s Theorem 822
proof that an Euclidean domain is a PID 822
13F10 – Principal ideal rings 823
Smith normal form 823
13F25 – Formal power series rings 825
formal power series 825
13F30 – Valuation rings 831
discrete valuation 831
discrete valuation ring 831
13G05 – Integral domains 833
Dedekind-Hasse valuation 833
PID 834
UFD 834
a ﬁnite integral domain is a ﬁeld 835
an artinian integral domain is a ﬁeld 835
example of PID 835
ﬁeld of quotients 836
integral domain 836
irreducible 837
motivation for Euclidean domains 837
zero divisor 838
13H05 – Regular local rings 839
regular local ring 839
13H99 – Miscellaneous 840
local ring 840
semilocal ring 841
13J10 – Complete rings, completion 842
completion 842
13J25 – Ordered rings 844
ordered ring 844
13J99 – Miscellaneous 845
topological ring 845
13N15 – Derivations 846
derivation 846
13P10 – Polynomial ideals, Gröbner bases
847
Gröbner basis 847
14-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 849
Picard group 849
aﬃne space 849
aﬃne variety 849
dual isogeny 850
ﬁnite morphism 850
isogeny 851
line bundle 851
nonsingular variety 852
projective space 852
projective variety 854
quasiﬁnite morphism 854
14A10 – Varieties and morphisms 855
Zariski topology 855
algebraic map 856
algebraic sets and polynomial ideals 856
noetherian topological space 857
regular map 857
structure sheaf 858
14A15 – Schemes and morphisms 859
closed immersion 859
coherent sheaf 859
ﬁbre product 860
prime spectrum 860
scheme 863
separated scheme 864
singular set 864
14A99 – Miscellaneous 865
Cartier divisor 865
General position 865
Serre’s twisting theorem 866
ample 866
height of a prime ideal 866
invertible sheaf 866
locally free 867
normal irreducible varieties are nonsingular in
codimension 1 867
sheaf of meromorphic functions 867
very ample 867
14C20 – Divisors, linear systems, invert
ible sheaves 869
divisor 869
Rational and birational maps 870
general type 870
14F05 – Vector bundles, sheaves, related
constructions 871
direct image (functor) 871
14F20 – Étale and other Grothendieck topologies and cohomologies 872
site 872
14F25 – Classical real and complex coho
mology 873
Serre duality 873
sheaf cohomology 874
14G05 – Rational points 875
Hasse principle 875
14H37 – Automorphisms 876
Frobenius morphism 876
14H45 – Special curves and curves of low
genus 878
Fermat’s spiral 878
archimedean spiral 878
folium of Descartes 879
spiral 879
14H50 – Plane and space curves 880
torsion (space curve) 880
14H52 – Elliptic curves 881
Birch and Swinnerton-Dyer conjecture 881
Hasse’s bound for elliptic curves over ﬁnite ﬁelds
882
Lseries of an elliptic curve 882
Mazur’s theorem on torsion of elliptic curves 884
Mordell curve 884
Nagell-Lutz theorem 885
Selmer group 886
bad reduction 887
conductor of an elliptic curve 890
elliptic curve 890
height function 894
j-invariant 895
rank of an elliptic curve 896
supersingular 897
the torsion subgroup of an elliptic curve injects
in the reduction of the curve 897
14H99 – Miscellaneous 900
Riemann-Roch theorem 900
genus 900
projective curve 901
proof of Riemann-Roch theorem 901
14L17 – Aﬃne algebraic groups, hyperal
gebra constructions 902
aﬃne algebraic group 902
algebraic torus 902
14M05 – Varieties deﬁned by ring con
ditions (factorial, Cohen-Macaulay, semi
normal) 903
normal 903
14M15 – Grassmannians, Schubert vari
eties, ﬂag manifolds 904
Borel-Bott-Weil theorem 904
ﬂag variety 905
14R15 – Jacobian problem 906
Jacobian conjecture 906
15-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 907
Cholesky decomposition 907
Hadamard matrix 908
Hessenberg matrix 909
If A ∈ Mₙ(k) and A is supertriangular then Aⁿ = 0 910
Jacobi determinant 910
Jacobi’s Theorem 912
Kronecker product 912
LU decomposition 913
Peetre’s inequality 914
Schur decomposition 915
antipodal 916
conjugate transpose 916
corollary of Schur decomposition 917
covector 918
diagonal matrix 918
diagonalization 920
diagonally dominant matrix 920
eigenvalue (of a matrix) 921
eigenvalue problem 922
eigenvalues of orthogonal matrices 924
eigenvector 925
exactly determined 926
free vector space over a set 926
in a vector space, λv = 0 if and only if λ = 0 or
v is the zero vector 928
invariant subspace 929
least squares 929
linear algebra 930
linear least squares 932
linear manifold 934
matrix exponential 934
matrix operations 935
nilpotent matrix 938
nilpotent transformation 938
nonzero vector 939
oﬀ-diagonal entry 940
orthogonal matrices 940
orthogonal vectors 941
overdetermined 941
partitioned matrix 941
pentadiagonal matrix 942
proof of Cayley-Hamilton theorem 942
proof of Schur decomposition 943
singular value decomposition 944
skew-symmetric matrix 945
square matrix 946
strictly upper triangular matrix 946
symmetric matrix 947
theorem for normal triangular matrices 947
triangular matrix 948
tridiagonal matrix 949
underdetermined 950
unit triangular matrix 950
unitary 951
vector space 952
vector subspace 953
zero map 954
zero vector in a vector space is unique 955
zero vector space 955
15-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 956
circulant matrix 956
matrix 957
15-XX – Linear and multilinear algebra;
matrix theory 960
linearly dependent functions 960
15A03 – Vector spaces, linear dependence,
rank 961
Sylvester’s law 961
basis 961
complementary subspace 962
dimension 963
every vector space has a basis 964
ﬂag 964
frame 965
linear combination 968
linear independence 968
list vector 968
nullity 969
orthonormal basis 970
physical vector 970
proof of rank-nullity theorem 972
rank 973
rank-nullity theorem 973
similar matrix 974
span 975
theorem for the direct sum of ﬁnite dimensional
vector spaces 976
vector 976
15A04 – Linear transformations, semilin
ear transformations 980
admissibility 980
conductor of a vector 980
cyclic decomposition theorem 981
cyclic subspace 981
dimension theorem for symplectic complement
(proof) 981
dual homomorphism 982
dual homomorphism of the derivative 983
image of a linear transformation 984
invertible linear transformation 984
kernel of a linear transformation 985
linear transformation 985
minimal polynomial (endomorphism) 986
symplectic complement 987
trace 988
15A06 – Linear equations 989
Gaussian elimination 989
ﬁnitedimensional linear problem 991
homogeneous linear problem 992
linear problem 993
reduced row echelon form 993
row echelon form 994
underdetermined polynomial interpolation 994
15A09 – Matrix inversion, generalized in
verses 996
matrix adjoint 996
matrix inverse 997
15A12 – Conditioning of matrices 1000
singular 1000
15A15 – Determinants, permanents, other
special matrix functions 1001
Cayley-Hamilton theorem 1001
Cramer’s rule 1001
cofactor expansion 1002
determinant 1003
determinant as a multilinear mapping 1005
determinants of some matrices of special form
1006
example of Cramer’s rule 1006
proof of Cramer’s rule 1008
proof of cofactor expansion 1008
resolvent matrix 1009
15A18 – Eigenvalues, singular values, and
eigenvectors 1010
Jordan canonical form theorem 1010
Lagrange multiplier method 1011
Perron-Frobenius theorem 1011
characteristic equation 1012
eigenvalue 1012
eigenvalue 1013
15A21 – Canonical forms, reductions, clas
siﬁcation 1015
companion matrix 1015
eigenvalues of an involution 1015
linear involution 1016
normal matrix 1017
projection 1018
quadratic form 1019
15A23 – Factorization of matrices 1021
QR decomposition 1021
15A30 – Algebraic systems of matrices 1023
ideals in matrix algebras 1023
15A36 – Matrices of integers 1025
permutation matrix 1025
15A39 – Linear inequalities 1026
Farkas lemma 1026
15A42 – Inequalities involving eigenvalues
and eigenvectors 1027
Gershgorin’s circle theorem 1027
Gershgorin’s circle theorem result 1027
Schur’s inequality 1028
15A48 – Positive matrices and their gen
eralizations; cones of matrices 1029
negative deﬁnite 1029
negative semideﬁnite 1029
positive deﬁnite 1030
positive semideﬁnite 1030
primitive matrix 1031
reducible matrix 1031
15A51 – Stochastic matrices 1032
Birkhoﬀ-von Neumann theorem 1032
proof of Birkhoﬀ-von Neumann theorem 1032
15A57 – Other types of matrices (Hermi
tian, skew-Hermitian, etc.) 1035
Hermitian matrix 1035
direct sum of Hermitian and skew-Hermitian ma
trices 1036
identity matrix 1037
skew-Hermitian matrix 1037
transpose 1038
15A60 – Norms of matrices, numerical range,
applications of functional analysis to ma
trix theory 1041
Frobenius matrix norm 1041
matrix pnorm 1042
self-consistent matrix norm 1043
15A63 – Quadratic and bilinear forms, in
ner products 1044
Cauchy-Schwarz inequality 1044
adjoint endomorphism 1045
antisymmetric 1046
bilinear map 1046
dot product 1049
every orthonormal set is linearly independent 1050
inner product 1051
inner product space 1051
proof of Cauchy-Schwarz inequality 1052
selfdual 1052
skew-symmetric bilinear form 1053
spectral theorem 1053
15A66 – Cliﬀord algebras, spinors 1056
geometric algebra 1056
15A69 – Multilinear algebra, tensor prod
ucts 1058
Einstein summation convention 1058
basic tensor 1059
multilinear 1061
outer multiplication 1061
tensor 1062
tensor algebra 1065
tensor array 1065
tensor product (vector spaces) 1067
tensor transformations 1069
15A72 – Vector and tensor algebra, theory
of invariants 1072
bac-cab rule 1072
cross product 1072
euclidean vector 1073
rotational invariance of cross product 1074
15A75 – Exterior algebra, Grassmann al
gebras 1076
contraction 1076
exterior algebra 1077
15A99 – Miscellaneous topics 1081
Kronecker delta 1081
dual space 1081
example of trace of a matrix 1083
generalized Kronecker delta symbol 1083
linear functional 1084
modules are a generalization of vector spaces 1084
proof of properties of trace of a matrix 1085
quasipositive matrix 1086
trace of a matrix 1086
Volume 2
16-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 1088
direct product of modules 1088
direct sum 1089
exact sequence 1089
quotient ring 1090
16D10 – General module theory 1091
annihilator 1091
annihilator is an ideal 1091
artinian 1092
composition series 1092
conjugate module 1093
modular law 1093
module 1093
proof of modular law 1094
zero module 1094
16D20 – Bimodules 1095
bimodule 1095
16D25 – Ideals 1096
associated prime 1096
nilpotent ideal 1096
primitive ideal 1096
product of ideals 1097
proper ideal 1097
semiprime ideal 1097
zero ideal 1098
16D40 – Free, projective, and ﬂat modules
and ideals 1099
ﬁnitely generated projective module 1099
ﬂat module 1099
free module 1100
free module 1100
projective cover 1100
projective module 1101
16D50 – Injective modules, selfinjective
rings 1102
injective hull 1102
injective module 1102
16D60 – Simple and semisimple modules,
primitive rings and ideals 1104
central simple algebra 1104
completely reducible 1104
simple ring 1105
16D80 – Other classes of modules and ide
als 1106
essential submodule 1106
faithful module 1106
minimal prime ideal 1107
module of ﬁnite rank 1107
simple module 1107
superﬂuous submodule 1107
uniform module 1108
16E05 – Syzygies, resolutions, complexes
1109
n-chain 1109
chain complex 1109
ﬂat resolution 1110
free resolution 1110
injective resolution 1110
projective resolution 1110
short exact sequence 1111
split short exact sequence 1111
von Neumann regular 1111
16K20 – Finite-dimensional 1112
quaternion algebra 1112
16K50 – Brauer groups 1113
Brauer group 1113
16K99 – Miscellaneous 1114
division ring 1114
16N20 – Jacobson radical, quasimultipli
cation 1115
Jacobson radical 1115
a ring modulo its Jacobson radical is semiprimi
tive 1116
examples of semiprimitive rings 1116
proof of Characterizations of the Jacobson radi
cal 1117
properties of the Jacobson radical 1118
quasiregularity 1119
semiprimitive ring 1120
16N40 – Nil and nilpotent radicals, sets,
ideals, rings 1121
Koethe conjecture 1121
nil and nilpotent ideals 1121
16N60 – Prime and semiprime rings 1123
prime ring 1123
16N80 – General radicals and rings 1124
prime radical 1124
radical theory 1124
16P40 – Noetherian rings and modules 1126
Noetherian ring 1126
noetherian 1126
16P60 – Chain conditions on annihilators
and summands: Goldie-type conditions,
Krull dimension 1128
Goldie ring 1128
uniform dimension 1128
16S10 – Rings determined by universal prop
erties (free algebras, coproducts, adjunc
tion of inverses, etc.) 1130
Ore domain 1130
16S34 – Group rings, Laurent polynomial
rings 1131
support 1131
16S36 – Ordinary and skew polynomial rings
and semigroup rings 1132
Gaussian polynomials 1132
q-skew derivation 1133
q-skew polynomial ring 1133
sigma derivation 1133
sigma, delta constant 1133
skew derivation 1133
skew polynomial ring 1134
16S99 – Miscellaneous 1135
algebra 1135
algebra (module) 1135
16U10 – Integral domains 1137
Prüfer domain 1137
valuation domain 1137
16U20 – Ore rings, multiplicative sets, Ore
localization 1139
Goldie’s Theorem 1139
Ore condition 1139
Ore’s theorem 1140
classical ring of quotients 1140
saturated 1141
16U70 – Center, normalizer (invariant el
ements) 1142
center (rings) 1142
16U99 – Miscellaneous 1143
antiidempotent 1143
16W20 – Automorphisms and endomor
phisms 1144
ring of endomorphisms 1144
16W30 – Coalgebras, bialgebras, Hopf al
gebras ; rings, modules, etc. on which
these act 1146
Hopf algebra 1146
almost cocommutative bialgebra 1147
bialgebra 1148
coalgebra 1148
coinvariant 1149
comodule 1149
comodule algebra 1149
comodule coalgebra 1150
module algebra 1150
module coalgebra 1150
16W50 – Graded rings and modules 1151
graded algebra 1151
graded module 1151
supercommutative 1151
16W55 – “Super” (or “skew”) structure
1153
super tensor product 1153
superalgebra 1153
supernumber 1154
16W99 – Miscellaneous 1155
Hamiltonian quaternions 1155
16Y30 – Nearrings 1158
nearring 1158
17A01 – General theory 1159
commutator bracket 1159
17B05 – Structure theory 1161
Killing form 1161
Levi’s theorem 1161
nilradical 1161
radical 1162
17B10 – Representations, algebraic theory
(weights) 1163
Ado’s theorem 1163
Lie algebra representation 1163
adjoint representation 1164
examples of nonmatrix Lie groups 1165
isotropy representation 1165
17B15 – Representations, analytic theory
1166
invariant form (Lie algebras) 1166
17B20 – Simple, semisimple, reductive (su
per)algebras (roots) 1167
Borel subalgebra 1167
Borel subgroup 1167
Cartan matrix 1168
Cartan subalgebra 1168
Cartan’s criterion 1168
Casimir operator 1168
Dynkin diagram 1169
Verma module 1169
Weyl chamber 1170
Weyl group 1170
Weyl’s theorem 1170
classiﬁcation of ﬁnitedimensional representations
of semisimple Lie algebras 1171
cohomology of semisimple Lie algebras 1171
nilpotent cone 1171
parabolic subgroup 1172
pictures of Dynkin diagrams 1172
positive root 1175
rank 1175
root lattice 1175
root system 1176
simple and semisimple Lie algebras 1177
simple root 1178
weight (Lie algebras) 1178
weight lattice 1178
17B30 – Solvable, nilpotent (super)algebras
1179
Engel’s theorem 1179
Lie’s theorem 1182
solvable Lie algebra 1183
17B35 – Universal enveloping (super)algebras
1184
Poincaré-Birkhoﬀ-Witt theorem 1184
universal enveloping algebra 1185
17B56 – Cohomology of Lie (super)algebras
1187
Lie algebra cohomology 1187
17B67 – Kac-Moody (super)algebras (struc
ture and representation theory) 1188
Kac-Moody algebra 1188
generalized Cartan matrix 1188
17B99 – Miscellaneous 1190
Jacobi identity interpretations 1190
Lie algebra 1190
real form 1192
18-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 1193
Grothendieck spectral sequence 1193
category of sets 1194
functor 1194
monic 1194
natural equivalence 1195
representable functor 1195
supplemental axioms for an Abelian category 1195
18A05 – Deﬁnitions, generalizations 1197
autofunctor 1197
automorphism 1197
category 1198
category example (arrow category) 1199
commutative diagram 1199
double dual embedding 1200
dual category 1201
duality principle 1201
endofunctor 1202
examples of initial objects, terminal objects and
zero objects 1202
forgetful functor 1204
isomorphism 1205
natural transformation 1205
types of homomorphisms 1205
zero object 1206
18A22 – Special properties of functors (faith
ful, full, etc.) 1208
exact functor 1208
18A25 – Functor categories, comma cate
gories 1210
Yoneda embedding 1210
18A30 – Limits and colimits (products, sums,
directed limits, pushouts, ﬁber products,
equalizers, kernels, ends and coends, etc.)
1211
categorical direct product 1211
categorical direct sum 1211
kernel 1212
18A40 – Adjoint functors (universal con
structions, reﬂective subcategories, Kan ex
tensions, etc.) 1213
adjoint functor 1213
equivalence of categories 1214
18B40 – Groupoids, semigroupoids, semi
groups, groups (viewed as categories) 1215
groupoid (category theoretic) 1215
18E10 – Exact categories, abelian cate
gories 1216
abelian category 1216
exact sequence 1217
derived category 1218
enough injectives 1218
18F20 – Presheaves and sheaves 1219
locally ringed space 1219
presheaf 1220
sheaf 1220
sheaﬁﬁcation 1225
stalk 1226
18F30 – Grothendieck groups 1228
Grothendieck group 1228
18G10 – Resolutions; derived functors 1229
derived functor 1229
18G15 – Ext and Tor, generalizations, Künneth
formula 1231
Ext 1231
18G30 – Simplicial sets, simplicial objects
(in a category) 1232
nerve 1232
simplicial category 1232
simplicial object 1233
18G35 – Chain complexes 1235
5lemma 1235
9lemma 1236
Snake lemma 1236
chain homotopy 1237
chain map 1237
homology (chain complex) 1237
18G40 – Spectral sequences, hypercoho
mology 1238
spectral sequence 1238
19-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 1239
Algebraic Ktheory 1239
Ktheory 1240
examples of algebraic Ktheory groups 1241
19K33 – EXT and Khomology 1242
Fredholm module 1242
Khomology 1243
19K99 – Miscellaneous 1244
examples of Ktheory groups 1244
20-00 – General reference works (hand
books, dictionaries, bibliographies, etc.) 1245
alternating group is a normal subgroup of the
symmetric group 1245
associative 1245
canonical projection 1246
centralizer 1246
commutative 1247
examples of groups 1247
group 1250
quotient group 1250
20-02 – Research exposition (monographs,
survey articles) 1252
length function 1252
20-XX – Group theory and generalizations
1253
free product with amalgamated subgroup 1253
nonabelian group 1254
20A05 – Axiomatics and elementary prop
erties 1255
Feit-Thompson theorem 1255
Proof: The orbit of any element of a group is a
subgroup 1255
center 1256
characteristic subgroup 1256
class function 1257
conjugacy class 1258
conjugacy class formula 1258
conjugate stabilizer subgroups 1258
coset 1259
cyclic group 1259
derived subgroup 1260
equivariant 1260
examples of ﬁnite simple groups 1261
ﬁnitely generated group 1262
ﬁrst isomorphism theorem 1262
fourth isomorphism theorem 1262
generator 1263
group actions and homomorphisms 1263
group homomorphism 1265
homogeneous space 1265
identity element 1268
inner automorphism 1268
kernel 1269
maximal 1269
normal subgroup 1269
normality of subgroups is not transitive 1269
normalizer 1270
order (of a group) 1271
presentation of a group 1271
proof of ﬁrst isomorphism theorem 1272
proof of second isomorphism theorem 1273
proof that all cyclic groups are abelian 1274
proof that all cyclic groups of the same order are
isomorphic to each other 1274
proof that all subgroups of a cyclic group are
cyclic 1274
regular group action 1275
second isomorphism theorem 1275
simple group 1276
solvable group 1276
subgroup 1276
third isomorphism theorem 1277
20A99 – Miscellaneous 1279
Cayley table 1279
proper subgroup 1280
quaternion group 1280
20B05 – General theory for ﬁnite groups
1282
cycle notation 1282
permutation group 1283
20B15 – Primitive groups 1284
primitive transitive permutation group 1284
20B20 – Multiply transitive ﬁnite groups
1286
Jordan’s theorem (multiply transitive groups) 1286
multiply transitive 1286
sharply multiply transitive 1287
20B25 – Finite automorphism groups of al
gebraic, geometric, or combinatorial struc
tures 1288
diamond theory 1288
20B30 – Symmetric groups 1289
symmetric group 1289
symmetric group 1289
20B35 – Subgroups of symmetric groups
1290
Cayley’s theorem 1290
20B99 – Miscellaneous 1291
(p, q) shuﬄe 1291
Frobenius group 1291
permutation 1292
proof of Cayley’s theorem 1292
20C05 – Group rings of ﬁnite groups and
their modules 1294
group ring 1294
20C15 – Ordinary representations and char
acters 1295
Maschke’s theorem 1295
a representation which is not completely reducible
1295
orthogonality relations 1296
20C30 – Representations of ﬁnite symmet
ric groups 1299
example of immanent 1299
immanent 1299
permanent 1299
20C99 – Miscellaneous 1301
Frobenius reciprocity 1301
Schur’s lemma 1301
character 1302
group representation 1303
induced representation 1303
regular representation 1304
restriction representation 1304
20D05 – Classiﬁcation of simple and non
solvable groups 1305
Burnside p-q theorem 1305
classiﬁcation of semisimple groups 1305
semisimple group 1305
20D08 – Simple groups: sporadic groups
1307
Janko groups 1307
20D10 – Solvable groups, theory of for
mations, Schunck classes, Fitting classes,
π-length, ranks 1308
Čuhinin’s Theorem 1308
separable 1308
supersolvable group 1309
20D15 – Nilpotent groups, p-groups 1310
Burnside basis theorem 1310
20D20 – Sylow subgroups, Sylow proper
ties, π-groups, π-structure 1311
π-groups and π′-groups 1311
p-subgroup 1311
Burnside normal complement theorem 1312
Frattini argument 1312
Sylow p-subgroup 1312
Sylow theorems 1312
Sylow’s ﬁrst theorem 1313
Sylow’s third theorem 1313
application of Sylow’s theorems to groups of or
der pq 1313
p-primary component 1314
proof of Frattini argument 1314
proof of Sylow theorems 1314
subgroups containing the normalizers of Sylow
subgroups normalize themselves 1316
20D25 – Special subgroups (Frattini, Fit
ting, etc.) 1317
Fitting’s theorem 1317
characteristically simple group 1317
the Frattini subgroup is nilpotent 1317
20D30 – Series and lattices of subgroups
1319
maximal condition 1319
minimal condition 1319
subnormal series 1320
20D35 – Subnormal subgroups 1321
subnormal subgroup 1321
20D99 – Miscellaneous 1322
Cauchy’s theorem 1322
Lagrange’s theorem 1322
exponent 1322
fully invariant subgroup 1323
proof of Cauchy’s theorem 1323
proof of Lagrange’s theorem 1324
proof of the converse of Lagrange’s theorem for
ﬁnite cyclic groups 1324
proof that exp G divides |G| 1324
proof that |g| divides exp G 1325
proof that every group of prime order is cyclic
1325
20E05 – Free nonabelian groups 1326
Nielsen-Schreier theorem 1326
Schreier index formula 1326
free group 1326
proof of Nielsen-Schreier theorem and Schreier
index formula 1327
Jordan-Hölder decomposition 1328
proﬁnite group 1328
extension 1329
holomorph 1329
proof of the Jordan-Hölder decomposition theo
rem 1329
semidirect product of groups 1330
wreath product 1333
Jordan-Hölder decomposition theorem 1334
simplicity of the alternating groups 1334
abelian groups of order 120 1337
fundamental theorem of finitely generated abelian groups 1337
conjugacy class 1338
Frattini subgroup 1338
nongenerator 1338
20Exx – Structure and classification of infinite or finite groups 1339
faithful group action 1339
20F18 – Nilpotent groups 1340
classiﬁcation of ﬁnite nilpotent groups 1340
nilpotent group 1340
20F22 – Other classes of groups defined by subgroup chains 1342
inverse limit 1342
20F28 – Automorphism groups of groups 1344
outer automorphism group 1344
20F36 – Braid groups; Artin groups 1345
braid group 1345
20F55 – Reﬂection and Coxeter groups 1347
cycle 1347
dihedral group 1348
20F65 – Geometric group theory 1349
groups that act freely on trees are free 1349
20F99 – Miscellaneous 1350
perfect group 1350
20G15 – Linear algebraic groups over arbitrary fields 1351
Nagao’s theorem 1351
computation of the order of GL(n, F_q) 1351
general linear group 1352
order of the general linear group over a finite field 1352
special linear group 1352
20G20 – Linear algebraic groups over the reals, the complexes, the quaternions 1353
orthogonal group 1353
20G25 – Linear algebraic groups over local fields and their integers 1354
Ihara’s theorem 1354
20G40 – Linear algebraic groups over finite fields 1355
SL_2(F_3) 1355
20J06 – Cohomology of groups 1356
group cohomology 1356
stronger Hilbert theorem 90 1357
20J15 – Category of groups 1359
variety of groups 1359
20K01 – Finite abelian groups 1360
Schinzel’s theorem 1360
20K10 – Torsion groups, primary groups and generalized primary groups 1361
torsion 1361
20K25 – Direct sums, direct products, etc. 1362
direct product of groups 1362
20K99 – Miscellaneous 1363
Klein 4-group 1363
divisible group 1364
example of divisible group 1364
locally cyclic group 1364
20Kxx – Abelian groups 1366
abelian group 1366
20M10 – General structure theory 1367
existence of maximal semilattice decomposition 1367
semilattice decomposition of a semigroup 1368
simple semigroup 1368
20M12 – Ideal theory 1370
Rees factor 1370
ideal 1370
20M14 – Commutative semigroups 1372
Archimedean semigroup 1372
commutative semigroup 1372
20M20 – Semigroups of transformations, etc. 1373
semigroup of transformations 1373
20M30 – Representation of semigroups; actions of semigroups on sets 1375
counting theorem 1375
example of group action 1375
group action 1376
orbit 1377
proof of counting theorem 1377
stabilizer 1378
20M99 – Miscellaneous 1379
a semilattice is a commutative band 1379
adjoining an identity to a semigroup 1379
band 1380
bicyclic semigroup 1380
congruence 1381
cyclic semigroup 1381
idempotent 1382
null semigroup 1383
semigroup 1383
semilattice 1383
subsemigroup, submonoid, and subgroup 1384
zero elements 1384
20N02 – Sets with a single binary operation (groupoids) 1386
groupoid 1386
idempotency 1386
left identity and right identity 1387
20N05 – Loops, quasigroups 1388
Moufang loop 1388
loop and quasigroup 1389
22-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1390
fixed-point subspace 1390
22-XX – Topological groups, Lie groups 1391
Cantor space 1391
22A05 – Structure of general topological groups 1392
topological group 1392
22C05 – Compact groups 1393
n-torus 1393
reductive 1393
22D05 – General properties and structure of locally compact groups 1394
Γ-simple 1394
22D15 – Group algebras of locally compact groups 1395
group C*-algebra 1395
22E10 – General properties and structure of complex Lie groups 1396
existence and uniqueness of compact real form 1396
maximal torus 1397
Lie group 1397
complexiﬁcation 1399
Hilbert-Weyl theorem 1400
the connection between Lie groups and Lie algebras 1401
26-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1402
derivative notation 1402
fundamental theorems of calculus 1403
logarithm 1404
proof of the first fundamental theorem of calculus 1405
proof of the second fundamental theorem of calculus 1405
root-mean-square 1406
square 1406
26-XX – Real functions 1408
abelian function 1408
full-width at half maximum 1408
26A03 – Foundations: limits and generalizations, elementary topology of the line 1410
Cauchy sequence 1410
Dedekind cuts 1410
binomial proof of positive integer power rule 1413
exponential 1414
interleave sequence 1415
limit inferior 1415
limit superior 1416
power rule 1417
properties of the exponential 1417
squeeze rule 1418
26A06 – One-variable calculus 1420
Darboux’s theorem (analysis) 1420
Fermat’s Theorem (stationary points) 1420
Heaviside step function 1421
Leibniz’ rule 1421
Rolle’s theorem 1422
binomial formula 1422
chain rule 1422
complex Rolle’s theorem 1423
complex mean-value theorem 1423
deﬁnite integral 1424
derivative of even/odd function (proof) 1425
direct sum of even/odd functions (example) 1425
even/odd function 1426
example of chain rule 1427
example of increasing/decreasing/monotone function 1428
extended mean-value theorem 1428
increasing/decreasing/monotone function 1428
intermediate value theorem 1429
limit 1429
mean value theorem 1430
mean-value theorem 1430
monotonicity criterion 1431
nabla 1431
one-sided limit 1432
product rule 1432
proof of Darboux’s theorem 1433
proof of Fermat's Theorem (stationary points) 1434
proof of Rolle’s theorem 1434
proof of Taylor’s Theorem 1435
proof of binomial formula 1436
proof of chain rule 1436
proof of extended mean-value theorem 1437
proof of intermediate value theorem 1437
proof of mean value theorem 1438
proof of monotonicity criterion 1439
proof of quotient rule 1439
quotient rule 1440
signum function 1440
26A09 – Elementary functions 1443
deﬁnitions in trigonometry 1443
hyperbolic functions 1444
26A12 – Rate of growth of functions, orders of infinity, slowly varying functions 1446
Landau notation 1446
26A15 – Continuity and related questions (modulus of continuity, semicontinuity, discontinuities, etc.) 1448
Dirichlet’s function 1448
semicontinuous 1448
semicontinuous 1449
uniformly continuous 1450
26A16 – Lipschitz (Hölder) classes 1451
Lipschitz condition 1451
Lipschitz condition and diﬀerentiability 1452
Lipschitz condition and diﬀerentiability result 1453
26A18 – Iteration 1454
iteration 1454
periodic point 1454
26A24 – Differentiation (functions of one variable): general theory, generalized derivatives, mean-value theorems 1455
Leibniz notation 1455
derivative 1456
l'Hôpital's rule 1460
proof of De l'Hôpital's rule 1461
related rates 1462
26A27 – Nondifferentiability (nondifferentiable functions, points of nondifferentiability), discontinuous derivatives 1464
Weierstrass function 1464
26A36 – Antidiﬀerentiation 1465
antiderivative 1465
integration by parts 1465
integration by parts for the Lebesgue integral 1466
26A42 – Integrals of Riemann, Stieltjes and Lebesgue type 1468
Riemann sum 1468
Riemann-Stieltjes integral 1469
continuous functions are Riemann integrable 1469
generalized Riemann integral 1469
proof of Continuous functions are Riemann integrable 1470
26A51 – Convexity, generalizations 1471
concave function 1471
26Axx – Functions of one variable 1472
function centroid 1472
26B05 – Continuity and differentiation questions 1473
C_0^∞(U) is not empty 1473
Rademacher’s Theorem 1474
smooth functions with compact support 1475
26B10 – Implicit function theorems, Jacobians, transformations with several variables 1477
Jacobian matrix 1477
directional derivative 1477
gradient 1478
implicit diﬀerentiation 1481
implicit function theorem 1481
proof of implicit function theorem 1482
26B12 – Calculus of vector functions 1484
Clairaut’s theorem 1484
Fubini’s Theorem 1484
Generalised N-dimensional Riemann Sum 1485
Generalized N-dimensional Riemann Integral 1485
Helmholtz equation 1486
Hessian matrix 1487
Jordan Content of an N-cell 1487
Laplace equation 1487
chain rule (several variables) 1488
divergence 1489
extremum 1490
irrotational ﬁeld 1490
partial derivative 1491
plateau 1492
proof of Green’s theorem 1492
relations between Hessian matrix and local extrema 1493
solenoidal ﬁeld 1494
26B15 – Integration: length, area, volume 1495
arc length 1495
26B20 – Integral formulas (Stokes, Gauss, Green, etc.) 1497
Green’s theorem 1497
26B25 – Convexity, generalizations 1499
convex function 1499
extremal value of convex/concave functions 1500
26B30 – Absolutely continuous functions, functions of bounded variation 1502
absolutely continuous function 1502
total variation 1503
26B99 – Miscellaneous 1505
derivation of zeroth weighted power mean 1505
weighted power mean 1506
26C15 – Rational functions 1507
rational function 1507
26C99 – Miscellaneous 1508
Laguerre Polynomial 1508
26D05 – Inequalities for trigonometric functions and polynomials 1509
Weierstrass product inequality 1509
proof of Jordan’s Inequality 1509
26D10 – Inequalities involving derivatives and differential and integral operators 1511
Gronwall’s lemma 1511
proof of Gronwall’s lemma 1511
26D15 – Inequalities for sums, series and integrals 1513
Carleman’s inequality 1513
Chebyshev’s inequality 1513
MacLaurin’s Inequality 1514
Minkowski inequality 1514
Muirhead’s theorem 1515
Schur’s inequality 1515
Young’s inequality 1515
arithmetic-geometric-harmonic means inequality 1516
general means inequality 1516
power mean 1517
proof of Chebyshev’s inequality 1517
proof of Minkowski inequality 1518
proof of arithmetic-geometric-harmonic means inequality 1519
proof of general means inequality 1521
proof of rearrangement inequality 1522
rearrangement inequality 1523
26D99 – Miscellaneous 1524
Bernoulli’s inequality 1524
proof of Bernoulli’s inequality 1524
26E35 – Nonstandard analysis 1526
hyperreal 1526
e is not a quadratic irrational 1527
zero of a function 1528
28-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1530
extended real numbers 1530
28-XX – Measure and integration 1532
Riemann integral 1532
martingale 1532
28A05 – Classes of sets (Borel fields, σ-rings, etc.), measurable sets, Suslin sets, analytic sets 1534
Borel σ-algebra 1534
28A10 – Real or complex-valued set functions 1535
σ-finite 1535
Argand diagram 1535
Hahn-Kolmogorov theorem 1536
measure 1536
outer measure 1536
properties for measure 1538
28A12 – Contents, measures, outer measures, capacities 1540
Hahn decomposition theorem 1540
Jordan decomposition 1540
Lebesgue decomposition theorem 1541
Lebesgue outer measure 1541
absolutely continuous 1542
counting measure 1543
measurable set 1543
outer regular 1543
signed measure 1543
singular measure 1544
28A15 – Abstract differentiation theory, differentiation of set functions 1545
Hardy-Littlewood maximal theorem 1545
Lebesgue differentiation theorem 1545
Radon-Nikodym theorem 1546
integral depending on a parameter 1547
28A20 – Measurable and non-measurable functions, sequences of measurable functions, modes of convergence 1549
Egorov’s theorem 1549
Fatou’s lemma 1549
Fatou-Lebesgue theorem 1550
dominated convergence theorem 1550
measurable function 1550
monotone convergence theorem 1551
proof of Egorov’s theorem 1551
proof of Fatou’s lemma 1552
proof of Fatou-Lebesgue theorem 1552
proof of dominated convergence theorem 1553
proof of monotone convergence theorem 1553
28A25 – Integration with respect to measures and other set functions 1555
L^∞(X, dµ) 1555
Hardy-Littlewood maximal operator 1555
Lebesgue integral 1556
28A60 – Measures on Boolean rings, measure algebras 1558
σ-algebra 1558
σ-algebra 1558
algebra 1559
measurable set (for outer measure) 1559
28A75 – Length, area, volume, other geometric measure theory 1561
Lebesgue density theorem 1561
28A80 – Fractals 1562
Cantor set 1562
Hausdorﬀ dimension 1565
Koch curve 1566
Sierpinski gasket 1567
fractal 1567
28Axx – Classical measure theory 1569
Vitali’s Theorem 1569
proof of Vitali’s Theorem 1569
28B15 – Set functions, measures and integrals with values in ordered spaces 1571
L^p space 1571
locally integrable function 1572
28C05 – Integration theory via linear functionals (Radon measures, Daniell integrals, etc.), representing set functions and measures 1573
Haar integral 1573
28C10 – Set functions and measures on topological groups, Haar measures, invariant measures 1575
Haar measure 1575
28C20 – Set functions and measures and integrals in infinite-dimensional spaces (Wiener measure, Gaussian measure, etc.) 1577
essential supremum 1577
28D05 – Measure-preserving transformations 1578
measure-preserving 1578
30-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1579
domain 1579
region 1579
regular region 1580
topology of the complex plane 1580
30-XX – Functions of a complex variable 1581
z_0 is a pole of f 1581
30A99 – Miscellaneous 1582
Riemann mapping theorem 1582
Runge’s theorem 1582
Weierstrass M-test 1583
annulus 1583
conformally equivalent 1583
contour integral 1584
orientation 1585
proof of Weierstrass M-test 1585
unit disk 1586
upper half plane 1586
winding number and fundamental group 1586
30B10 – Power series (including lacunary series) 1587
Euler relation 1587
analytic 1588
existence of power series 1588
infinitely-differentiable function that is not analytic 1590
power series 1591
proof of radius of convergence 1592
radius of convergence 1593
30B50 – Dirichlet series and other series expansions, exponential series 1594
Dirichlet series 1594
30C15 – Zeros of polynomials, rational functions, and other analytic functions (e.g. zeros of functions with bounded Dirichlet integral) 1596
Mason-Stothers theorem 1596
zeroes of analytic functions are isolated 1596
30C20 – Conformal mappings of special domains 1598
automorphisms of unit disk 1598
unit disk upper half plane conformal equivalence theorem 1598
30C35 – General theory of conformal mappings 1599
proof of conformal mapping theorem 1599
30C80 – Maximum principle; Schwarz's lemma, Lindelöf principle, analogues and generalizations; subordination 1601
Schwarz lemma 1601
maximum principle 1601
proof of Schwarz lemma 1602
30D20 – Entire functions, general theory 1603
Liouville’s theorem 1603
Morera’s theorem 1603
entire 1604
holomorphic 1604
proof of Liouville’s theorem 1604
30D30 – Meromorphic functions, general theory 1606
Casorati-Weierstrass theorem 1606
Mittag-Leffler's theorem 1606
Riemann’s removable singularity theorem 1607
essential singularity 1607
meromorphic 1607
pole 1607
proof of Casorati-Weierstrass theorem 1608
proof of Riemann's removable singularity theorem 1608
residue 1609
simple pole 1610
30E20 – Integration, integrals of Cauchy type, integral representations of analytic functions 1611
Cauchy integral formula 1611
Cauchy integral theorem 1612
Cauchy residue theorem 1613
Gauss’ mean value theorem 1614
Möbius circle transformation theorem 1614
Möbius transformation cross-ratio preservation theorem 1614
Rouché's theorem 1614
absolute convergence implies convergence for an infinite product 1615
absolute convergence of inﬁnite product 1615
closed curve theorem 1615
conformal Möbius circle map theorem 1615
conformal mapping 1616
conformal mapping theorem 1616
convergence/divergence for an infinite product 1616
example of conformal mapping 1616
examples of inﬁnite products 1617
link between inﬁnite products and sums 1617
proof of Cauchy integral formula 1618
proof of Cauchy residue theorem 1619
proof of Gauss’ mean value theorem 1620
proof of Goursat’s theorem 1620
proof of Möbius circle transformation theorem 1622
proof of Simultaneous converging or diverging of product and sum theorem 1623
proof of absolute convergence implies convergence for an infinite product 1624
proof of closed curve theorem 1624
proof of conformal Möbius circle map theorem 1624
simultaneous converging or diverging of product and sum theorem 1625
Cauchy-Riemann equations 1625
Cauchy-Riemann equations (polar coordinates) 1626
proof of the Cauchy-Riemann equations 1626
removable singularity 1627
30F40 – Kleinian groups 1629
Klein 4-group 1629
31A05 – Harmonic, subharmonic, superharmonic functions 1630
a harmonic function on a graph which is bounded below and nonconstant 1630
example of harmonic functions on graphs 1630
examples of harmonic functions on R^n 1631
harmonic function 1632
31B05 – Harmonic, subharmonic, superharmonic functions 1633
Laplacian 1633
32A05 – Power series, series of functions 1634
exponential function 1634
32C15 – Complex spaces 1637
Riemann sphere 1637
32F99 – Miscellaneous 1638
star-shaped region 1638
32H02 – Holomorphic mappings, (holomorphic) embeddings and related questions 1639
Bloch’s theorem 1639
Hartog’s theorem 1639
32H25 – Picard-type theorems and generalizations 1640
Picard’s theorem 1640
little Picard theorem 1640
33-XX – Special functions 1641
beta function 1641
33B10 – Exponential and trigonometric functions 1642
natural logarithm 1642
33B15 – Gamma, beta and polygamma functions 1643
Bohr-Mollerup theorem 1643
gamma function 1643
proof of Bohr-Mollerup theorem 1645
33B30 – Higher logarithm functions 1647
Lambert W function 1647
33B99 – Miscellaneous 1648
natural log base 1648
33D45 – Basic orthogonal polynomials and functions (Askey-Wilson polynomials, etc.) 1649
orthogonal polynomials 1649
33E05 – Elliptic functions and integrals 1651
Weierstrass sigma function 1651
elliptic function 1652
elliptic integrals and Jacobi elliptic functions 1652
examples of elliptic functions 1654
modular discriminant 1654
34-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1656
Liapunov function 1656
Lorenz equation 1657
Wronskian determinant 1659
dependence on initial conditions of solutions of ordinary differential equations 1660
diﬀerential equation 1661
existence and uniqueness of solution of ordinary differential equations 1662
maximal interval of existence of ordinary differential equations 1663
method of undetermined coeﬃcients 1663
natural symmetry of the Lorenz equation 1664
symmetry of a solution of an ordinary differential equation 1665
symmetry of an ordinary differential equation 1665
34-01 – Instructional exposition (textbooks, tutorial papers, etc.) 1667
second order linear differential equation with constant coefficients 1667
34A05 – Explicit solutions and reductions 1669
separation of variables 1669
variation of parameters 1670
34A12 – Initial value problems, existence, uniqueness, continuous dependence and continuation of solutions 1672
initial value problem 1672
34A30 – Linear equations and systems, general 1674
Chebyshev equation 1674
34A99 – Miscellaneous 1676
autonomous system 1676
34B24 – Sturm-Liouville theory 1677
eigenfunction 1677
34C05 – Location of integral curves, singular points, limit cycles 1678
Hopf bifurcation theorem 1678
Poincaré-Bendixson theorem 1679
omega limit set 1679
34C07 – Theory of limit cycles of polynomial and analytic vector fields (existence, uniqueness, bounds, Hilbert's 16th problem and ramifications) 1680
Hilbert's 16th problem for quadratic vector fields 1680
34C23 – Bifurcation 1682
equivariant branching lemma 1682
34C25 – Periodic solutions 1683
Bendixson’s negative criterion 1683
Dulac’s criteria 1683
proof of Bendixson’s negative criterion 1684
34C99 – Miscellaneous 1685
Hartman-Grobman theorem 1685
equilibrium point 1685
stable manifold theorem 1686
34D20 – Lyapunov stability 1687
Lyapunov stable 1687
neutrally stable ﬁxed point 1687
stable ﬁxed point 1687
34L05 – General spectral theory 1688
Gelfand spectral radius theorem 1688
34L15 – Estimation of eigenvalues, upper and lower bounds 1689
Rayleigh quotient 1689
34L40 – Particular operators (Dirac, one-dimensional Schrödinger, etc.) 1690
Dirac delta function 1690
construction of Dirac delta function 1691
35-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1692
diﬀerential operator 1692
35J05 – Laplace equation, reduced wave equation (Helmholtz), Poisson equation 1694
Poisson’s equation 1694
35L05 – Wave equation 1695
wave equation 1695
35Q53 – KdV-like equations (Korteweg-de Vries, Burgers, sine-Gordon, sinh-Gordon, etc.) 1697
Korteweg-de Vries equation 1697
35Q99 – Miscellaneous 1698
heat equation 1698
37-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1699
37A30 – Ergodic theorems, spectral theory, Markov operators 1700
ergodic 1700
fundamental theorem of demography 1700
proof of fundamental theorem of demography 1701
37B05 – Transformations and group actions with special properties (minimality, distality, proximality, etc.) 1703
discontinuous action 1703
37B20 – Notions of recurrence 1704
nonwandering set 1704
37B99 – Miscellaneous 1705
ω-limit set 1705
asymptotically stable 1706
expansive 1706
the only compact metric spaces that admit a positively expansive homeomorphism are discrete spaces 1707
topological conjugation 1708
topologically transitive 1709
uniform expansivity 1709
37C10 – Vector fields, flows, ordinary differential equations 1710
ﬂow 1710
globally attracting ﬁxed point 1711
37C20 – Generic properties, structural stability 1712
Kupka-Smale theorem 1712
Pugh’s general density theorem 1712
structural stability 1713
37C25 – Fixed points, periodic points, fixed-point index theory 1714
hyperbolic ﬁxed point 1714
37C29 – Homoclinic and heteroclinic orbits 1715
heteroclinic 1715
homoclinic 1715
37C75 – Stability theory 1716
attracting ﬁxed point 1716
stable manifold 1716
37C80 – Symmetries, equivariant dynamical systems 1718
Γ-equivariant 1718
37D05 – Hyperbolic orbits and sets 1719
hyperbolic isomorphism 1719
37D20 – Uniformly hyperbolic systems (expanding, Anosov, Axiom A, etc.) 1720
Anosov diﬀeomorphism 1720
Axiom A 1721
hyperbolic set 1721
37D99 – Miscellaneous 1722
Kupka-Smale 1722
37E05 – Maps of the interval (piecewise continuous, continuous, smooth) 1723
Sharkovskii’s theorem 1723
37G15 – Bifurcations of limit cycles and periodic orbits 1724
Feigenbaum constant 1724
Feigenbaum fractal 1725
equivariant Hopf theorem 1726
37G40 – Symmetries, equivariant bifurcation theory 1728
Poénaru (1976) theorem 1728
bifurcation problem with symmetry group 1728
trace formula 1729
37G99 – Miscellaneous 1730
chaotic dynamical system 1730
37H20 – Bifurcation theory 1732
bifurcation 1732
39B05 – General 1733
functional equation 1733
39B62 – Functional inequalities, including subadditivity, convexity, etc. 1734
Jensen’s inequality 1734
proof of Jensen’s inequality 1735
proof of arithmetic-geometric-harmonic means inequality 1735
subadditivity 1736
superadditivity 1736
40-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1738
Cauchy product 1738
Cesàro mean 1739
alternating series 1739
alternating series test 1739
monotonic 1740
monotonically decreasing 1740
monotonically increasing 1741
monotonically nondecreasing 1741
monotonically nonincreasing 1741
sequence 1742
series 1742
40A05 – Convergence and divergence of series and sequences 1743
Abel’s lemma 1743
Abel’s test for convergence 1744
Baroni’s Theorem 1744
Bolzano-Weierstrass theorem 1744
Cauchy criterion for convergence 1744
Cauchy’s root test 1745
Dirichlet’s convergence test 1745
Proof of Baroni’s Theorem 1746
Proof of Stolz-Cesàro theorem 1747
Stolz-Cesàro theorem 1748
absolute convergence theorem 1748
comparison test 1748
convergent sequence 1749
convergent series 1749
determining series convergence 1749
example of integral test 1750
geometric series 1750
harmonic number 1751
harmonic series 1752
integral test 1753
proof of Abel’s lemma (by induction) 1754
proof of Abel’s test for convergence 1754
proof of Bolzano-Weierstrass Theorem 1754
proof of Cauchy’s root test 1756
proof of Leibniz's theorem (using Dirichlet's convergence test) 1756
proof of absolute convergence theorem 1756
proof of alternating series test 1757
proof of comparison test 1757
proof of integral test 1758
proof of ratio test 1759
ratio test 1759
40A10 – Convergence and divergence of integrals 1760
improper integral 1760
40A25 – Approximation to limiting values (summation of series, etc.) 1761
Euler’s constant 1761
40A30 – Convergence and divergence of series and sequences of functions 1763
Abel's limit theorem 1763
Löwner partial ordering 1763
Löwner's theorem 1764
matrix monotone 1764
operator monotone 1764
pointwise convergence 1764
uniform convergence 1765
40G05 – Cesàro, Euler, Nörlund and Hausdorff methods 1766
Cesàro summability 1766
40G10 – Abel, Borel and power series methods 1768
Abel summability 1768
proof of Abel’s convergence theorem 1769
proof of Tauber’s convergence theorem 1770
41A05 – Interpolation 1772
Lagrange Interpolation formula 1772
Simpson’s 3/8 rule 1772
trapezoidal rule 1773
41A25 – Rate of convergence, degree of approximation 1775
superconvergence 1775
41A58 – Series expansions (e.g. Taylor, Lidstone series, but not Fourier series) 1776
Taylor series 1776
Taylor’s Theorem 1778
41A60 – Asymptotic approximations, asymptotic expansions (steepest descent, etc.) 1779
Stirling’s approximation 1779
42-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1781
countable basis 1781
discrete cosine transform 1782
42-01 – Instructional exposition (textbooks, tutorial papers, etc.) 1784
Laplace transform 1784
42A05 – Trigonometric polynomials, inequalities, extremal problems 1785
Chebyshev polynomial 1785
42A16 – Fourier coefficients, Fourier series of functions with special properties, special Fourier series 1787
Riemann-Lebesgue lemma 1787
example of Fourier series 1788
42A20 – Convergence and absolute convergence of Fourier and trigonometric series 1789
Dirichlet conditions 1789
42A38 – Fourier and Fourier-Stieltjes transforms and other transforms of Fourier type 1790
Fourier transform 1790
42A99 – Miscellaneous 1792
Poisson summation formula 1792
42B05 – Fourier series and coeﬃcients 1793
Parseval equality 1793
Wirtinger’s inequality 1793
43A07 – Means on groups, semigroups, etc.; amenable groups 1795
amenable group 1795
44A35 – Convolution 1796
convolution 1796
46-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1799
balanced set 1799
bounded function 1800
bounded set (in a topological vector space) 1801
cone 1802
locally convex topological vector space 1803
sequential characterization of boundedness 1803
symmetric set 1803
46A30 – Open mapping and closed graph theorems; completeness (including B-, B_r-completeness) 1805
closed graph theorem 1805
open mapping theorem 1805
46A99 – Miscellaneous 1806
HeineCantor theorem 1806
proof of HeineCantor theorem 1806
topological vector space 1807
46B20 – Geometry and structure of normed linear spaces 1808
lim_{p→∞} ||x||_p = ||x||_∞ 1808
Hahn-Banach theorem 1809
proof of Hahn-Banach theorem 1810
seminorm 1811
vector norm 1813
46B50 – Compactness in Banach (or normed) spaces 1815
Schauder ﬁxed point theorem 1815
proof of Schauder ﬁxed point theorem 1815
46B99 – Miscellaneous 1817
ℓ^p 1817
Banach space 1818
an inner product deﬁnes a norm 1818
continuous linear mapping 1818
equivalent norms 1819
normed vector space 1820
46Bxx – Normed linear spaces and Banach spaces; Banach lattices 1821
vector p-norm 1821
46C05 – Hilbert and pre-Hilbert spaces: geometry and topology (including spaces with semidefinite inner product) 1822
Bessel inequality 1822
Hilbert module 1822
Hilbert space 1823
proof of Bessel inequality 1823
46C15 – Characterizations of Hilbert spaces 1825
classiﬁcation of separable Hilbert spaces 1825
46E15 – Banach spaces of continuous, differentiable or analytic functions 1826
Ascoli-Arzelà theorem 1826
Stone-Weierstrass theorem 1826
proof of Ascoli-Arzelà theorem 1827
Hölder inequality 1827
Young Inequality 1828
conjugate index 1828
proof of Hölder inequality 1828
proof of Young Inequality 1829
vector ﬁeld 1829
46F05 – Topological linear spaces of test functions, distributions and ultradistributions 1830
T_f is a distribution of zeroth order 1830
p.v.(1/x) is a distribution of first order 1831
Cauchy principal part integral 1832
delta distribution 1833
distribution 1833
equivalence of conditions 1835
every locally integrable function is a distribution 1836
localization for distributions 1836
operations on distributions 1837
smooth distribution 1839
space of rapidly decreasing functions 1840
support of distribution 1841
46H05 – General theory of topological algebras 1843
Banach algebra 1843
46L05 – General theory of C*-algebras 1844
C*-algebra 1844
Gelfand-Naimark representation theorem 1844
state 1844
46L85 – Noncommutative topology 1846
Gelfand-Naimark theorem 1846
Serre-Swan theorem 1846
46T12 – Measure (Gaussian, cylindrical, etc.) and integrals (Feynman, path, Fresnel, etc.) on manifolds 1847
path integral 1847
47A05 – General (adjoints, conjugates, products, inverses, domains, ranges, etc.) 1849
Baker-Campbell-Hausdorff formula(e) 1849
adjoint 1850
closed operator 1850
properties of the adjoint operator 1851
47A35 – Ergodic theory 1852
ergodic theorem 1852
47A53 – (Semi-)Fredholm operators; index theories 1853
Fredholm index 1853
Fredholm operator 1853
47A56 – Functions whose values are linear operators (operator and matrix valued functions, etc., including analytic and meromorphic ones) 1855
Taylor’s formula for matrix functions 1855
47A60 – Functional calculus 1856
Beltrami identity 1856
Euler-Lagrange differential equation 1857
calculus of variations 1857
47B15 – Hermitian and normal operators (spectral measures, functional calculus, etc.) 1862
selfadjoint operator 1862
47G30 – Pseudodiﬀerential operators 1863
Dini derivative 1863
47H10 – Fixed-point theorems 1864
Brouwer ﬁxed point in one dimension 1864
Brouwer ﬁxed point theorem 1865
any topological space with the fixed point property is connected 1865
ﬁxed point property 1866
proof of Brouwer ﬁxed point theorem 1867
47L07 – Convex sets and cones of operators 1868
convex hull of S is open if S is open 1868
47L25 – Operator spaces (= matricially normed spaces) 1869
operator norm 1869
47S99 – Miscellaneous 1870
Drazin inverse 1870
49K10 – Free problems in two or more independent variables 1871
Kantorovitch’s theorem 1871
49M15 – Methods of Newton-Raphson, Galerkin and Ritz types 1873
Newton’s method 1873
51-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1877
Apollonius theorem 1877
Apollonius’ circle 1877
Brahmagupta’s formula 1878
Brianchon theorem 1878
Brocard theorem 1878
Carnot circles 1879
Erdös-Anning Theorem 1879
Euler Line 1879
Gergonne point 1879
Gergonne triangle 1880
Heron’s formula 1880
Lemoine circle 1880
Lemoine point 1880
Miquel point 1881
Mollweide’s equations 1881
Morley’s theorem 1881
Newton’s line 1882
Newton-Gauss line 1882
Pascal’s mystic hexagram 1882
Ptolemy’s theorem 1882
Pythagorean theorem 1883
Schooten theorem 1883
Simson’s line 1884
Stewart’s theorem 1884
Thales’ theorem 1884
alternate proof of parallelogram law 1885
alternative proof of the sines law 1885
angle bisector 1887
angle sum identity 1888
annulus 1889
butterﬂy theorem 1889
centroid 1889
chord 1890
circle 1890
collinear 1893
complete quadrilateral 1893
concurrent 1893
cosines law 1894
cyclic quadrilateral 1894
derivation of cosines law 1894
diameter 1895
double angle identity 1896
equilateral triangle 1896
fundamental theorem on isogonal lines 1897
height 1897
hexagon 1897
hypotenuse 1898
isogonal conjugate 1898
isosceles triangle 1899
legs 1899
medial triangle 1899
median 1900
midpoint 1900
ninepoint circle 1900
orthic triangle 1901
orthocenter 1901
parallelogram 1902
parallelogram law 1902
pedal triangle 1902
pentagon 1903
polygon 1903
proof of Apollonius theorem 1904
proof of Apollonius theorem 1904
proof of Brahmagupta’s formula 1905
proof of Erdös-Anning Theorem 1906
proof of Heron’s formula 1906
proof of Mollweide’s equations 1907
proof of Ptolemy’s inequality 1908
proof of Ptolemy’s theorem 1909
proof of Pythagorean theorem 1910
proof of Pythagorean theorem 1910
proof of Simson’s line 1911
proof of Stewart’s theorem 1912
proof of Thales’ theorem 1913
proof of butterﬂy theorem 1913
proof of double angle identity 1914
proof of parallelogram law 1915
proof of tangents law 1915
quadrilateral 1916
radius 1916
rectangle 1916
regular polygon 1917
regular polyhedron 1917
rhombus 1918
right triangle 1919
sector of a circle 1919
sines law 1919
sines law proof 1920
some proofs for triangle theorems 1920
square 1921
tangents law 1921
triangle 1921
triangle center 1922
51-01 – Instructional exposition (textbooks, tutorial papers, etc.) 1924
geometry 1924
51-XX – Geometry 1927
non-Euclidean geometry 1927
parallel postulate 1927
51A05 – General theory and projective geometries 1928
Ceva’s theorem 1928
Menelaus’ theorem 1928
Pappus’s theorem 1929
proof of Ceva’s theorem 1929
proof of Menelaus’ theorem 1930
proof of Pappus’s theorem 1931
proof of Pascal’s mystic hexagram 1932
51A30 – Desarguesian and Pappian geometries 1934
Desargues’ theorem 1934
proof of Desargues’ theorem 1934
51A99 – Miscellaneous 1936
Pick’s theorem 1936
proof of Pick’s theorem 1936
51F99 – Miscellaneous 1939
Weizenböck’s Inequality 1939
51M04 – Elementary problems in Euclidean
geometries 1940
Napoleon’s theorem 1940
corollary of Morley’s theorem 1941
pivot theorem 1941
proof of Morley’s theorem 1941
proof of pivot theorem 1943
51M05 – Euclidean geometries (general)
and generalizations 1944
area of the n-sphere 1944
geometry of the sphere 1945
sphere 1945
spherical coordinates 1947
volume of the n-sphere 1947
51M10 – Hyperbolic and elliptic geometries (general) and generalizations 1949
Lobachevsky’s formula 1949
51M16 – Inequalities and extremum problems 1950
Brunn-Minkowski inequality 1950
Hadwiger-Finsler inequality 1950
isoperimetric inequality 1951
proof of Hadwiger-Finsler inequality 1951
51M20 – Polyhedra and polytopes; regular figures, division of spaces 1953
polyhedron 1953
51M99 – Miscellaneous 1954
Euler line proof 1954
SSA 1954
cevian 1955
congruence 1955
incenter 1956
incircle 1956
symmedian 1957
51N05 – Descriptive geometry 1958
curve 1958
piecewise smooth 1960
rectiﬁable 1960
51N20 – Euclidean analytic geometry 1961
Steiner’s theorem 1961
Van Aubel theorem 1961
conic section 1961
proof of Steiner’s theorem 1963
proof of Van Aubel theorem 1964
proof of Van Aubel’s Theorem 1965
three theorems on parabolas 1966
52A01 – Axiomatic and generalized convexity 1969
convex combination 1969
52A07 – Convex sets in topological vector
spaces 1970
Fréchet space 1970
52A20 – Convex sets in n dimensions (including convex hypersurfaces) 1973
Carathéodory’s theorem 1973
52A35 – Helly-type theorems and geometric transversal theory 1974
Helly’s theorem 1974
52A99 – Miscellaneous 1975
convex set 1975
52C07 – Lattices and convex bodies in n
dimensions 1976
Radon’s lemma 1976
52C35 – Arrangements of points, flats, hyperplanes 1978
Sylvester’s theorem 1978
53-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 1979
Lie derivative 1979
closed diﬀerential forms on a simple connected
domain 1979
exact (diﬀerential form) 1980
manifold 1980
metric tensor 1983
proof of closed differential forms on a simple connected domain 1983
pullback of a k-form 1985
tangent space 1985
53-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 1988
curl 1988
53A04 – Curves in Euclidean space 1990
Frenet frame 1990
Serret-Frenet equations 1991
curvature (space curve) 1992
fundamental theorem of space curves 1993
helix 1993
space curve 1994
53A45 – Vector and tensor analysis 1996
closed (diﬀerential form) 1996
53B05 – Linear and aﬃne connections 1997
Levi-Civita connection 1997
connection 1997
vector ﬁeld along a curve 2001
53B21 – Methods of Riemannian geometry 2002
Hodge star operator 2002
Riemannian manifold 2002
53B99 – Miscellaneous 2004
germ of smooth functions 2004
53C17 – Sub-Riemannian geometry 2005
Sub-Riemannian manifold 2005
53D05 – Symplectic manifolds, general 2006
Darboux’s Theorem (symplectic geometry) 2006
Moser’s theorem 2006
almost complex structure 2007
coadjoint orbit 2007
examples of symplectic manifolds 2007
hamiltonian vector ﬁeld 2008
isotropic submanifold 2008
lagrangian submanifold 2009
symplectic manifold 2009
symplectic matrix 2009
symplectic vector ﬁeld 2010
symplectic vector space 2010
53D10 – Contact manifolds, general 2011
contact manifold 2011
53D20 – Momentum maps; symplectic reduction 2012
momentum map 2012
54-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2013
Krull dimension 2013
Niemytzki plane 2013
Sorgenfrey line 2014
boundary (in topology) 2014
closed set 2014
coarser 2015
compact-open topology 2015
completely normal 2015
continuous proper map 2016
derived set 2016
diameter 2016
every second countable space is separable 2016
ﬁrst axiom of countability 2017
homotopy groups 2017
indiscrete topology 2018
interior 2018
invariant forms on representations of compact
groups 2018
ladder connected 2019
local base 2020
loop 2020
loop space 2020
metrizable 2020
neighborhood system 2021
paracompact topological space 2021
pointed topological space 2021
proper map 2021
quasicompact 2022
regularly open 2022
separated 2022
support of function 2023
topological invariant 2024
topological space 2024
topology 2025
triangle inequality 2025
universal covering space 2026
54A05 – Topological spaces and generalizations (closure spaces, etc.) 2027
characterization of connected compact metric spaces 2027
closure axioms 2027
neighborhood 2028
open set 2028
54A20 – Convergence in general topology
(sequences, ﬁlters, limits, convergence spaces,
etc.) 2030
Banach ﬁxed point theorem 2030
Dini’s theorem 2031
another proof of Dini’s theorem 2031
continuous convergence 2032
contractive maps are uniformly continuous 2033
net 2033
proof of Banach ﬁxed point theorem 2034
proof of Dini’s theorem 2035
theorem about continuous convergence 2035
ultraﬁlter 2035
ultranet 2036
54A99 – Miscellaneous 2037
basis 2037
box topology 2037
closure 2038
cover 2038
dense 2039
examples of ﬁlters 2039
ﬁlter 2039
limit point 2040
nowhere dense 2040
perfect set 2041
properties of the closure operator 2041
subbasis 2041
54B05 – Subspaces 2042
irreducible 2042
irreducible component 2042
subspace topology 2042
54B10 – Product spaces 2043
product topology 2043
product topology preserves the Hausdorff property 2044
54B15 – Quotient spaces, decompositions 2045
Klein bottle 2045
Möbius strip 2046
cell attachment 2047
quotient space 2047
torus 2048
54B17 – Adjunction spaces and similar constructions 2049
adjunction space 2049
54B40 – Presheaves and sheaves 2050
direct image 2050
54B99 – Miscellaneous 2051
coﬁnite and cocountable topology 2051
cone 2051
join 2052
order topology 2052
suspension 2053
54C05 – Continuous maps 2054
Inverse Function Theorem (topological spaces) 2054
continuity of composition of functions 2054
continuous 2055
discontinuous 2055
homeomorphism 2057
proof of Inverse Function Theorem (topological
spaces) 2057
restriction of a continuous mapping is continuous 2057
54C10 – Special maps on topological spaces
(open, closed, perfect, etc.) 2059
densely deﬁned 2059
open mapping 2059
54C15 – Retraction 2060
retract 2060
54C70 – Entropy 2061
diﬀerential entropy 2061
54C99 – Miscellaneous 2062
Borsuk-Ulam theorem 2062
ham sandwich theorem 2062
proof of Borsuk-Ulam theorem 2062
54D05 – Connected and locally connected
spaces (general aspects) 2064
Jordan curve theorem 2064
clopen subset 2064
connected component 2065
connected set 2065
connected set in a topological space 2066
connected space 2066
connectedness is preserved under a continuous
map 2066
cutpoint 2067
example of a connected space that is not path
connected 2067
example of a semilocally simply connected space
which is not locally simply connected 2068
example of a space that is not semilocally simply
connected 2068
locally connected 2069
locally simply connected 2069
path component 2069
path connected 2070
products of connected spaces 2070
proof that a path connected space is connected 2070
quasicomponent 2070
semilocally simply connected 2071
54D10 – Lower separation axioms (T0–T3, etc.) 2072
T0 space 2072
T1 space 2072
T2 space 2072
T3 space 2073
a compact set in a Hausdorﬀ space is closed 2073
proof of A compact set in a Hausdorﬀ space is
closed 2074
regular 2074
regular space 2074
separation axioms 2075
topological space is T1 if and only if every singleton is closed 2076
54D15 – Higher separation axioms (completely regular, normal, perfectly or collectionwise normal, etc.) 2077
Tietze extension theorem 2077
Tychonoﬀ 2077
Urysohn’s lemma 2078
normal 2078
proof of Urysohn’s lemma 2078
54D20 – Noncompact covering properties (paracompact, Lindelöf, etc.) 2081
Lindelöf 2081
countably compact 2081
locally ﬁnite 2081
54D30 – Compactness 2082
Y is compact if and only if every open cover of
Y has a ﬁnite subcover 2082
Heine-Borel theorem 2083
Tychonoﬀ’s theorem 2083
a space is compact if and only if the space has
the ﬁnite intersection property 2083
closed set in a compact space is compact 2084
closed subsets of a compact set are compact 2084
compact 2085
compactness is preserved under a continuous map 2085
examples of compact spaces 2086
ﬁnite intersection property 2088
limit point compact 2088
point and a compact set in a Hausdorﬀ space
have disjoint open neighborhoods. 2088
proof of Heine-Borel theorem 2089
properties of compact spaces 2091
relatively compact 2092
sequentially compact 2092
two disjoint compact sets in a Hausdorﬀ space
have disjoint open neighborhoods. 2092
54D35 – Extensions of spaces (compactifications, supercompactifications, completions, etc.) 2094
Alexandrov one-point compactification 2094
compactification 2094
54D45 – Local compactness, σ-compactness 2095
σ-compact 2095
examples of locally compact and not locally compact spaces 2095
locally compact 2096
54D65 – Separability 2097
separable 2097
54D70 – Base properties 2098
second countable 2098
54D99 – Miscellaneous 2099
Lindelöf theorem 2099
first countable 2099
proof of Lindelöf theorem 2099
totally disconnected space 2100
54E15 – Uniform structures and generalizations 2101
topology induced by uniform structure 2101
uniform space 2101
uniform structure of a metric space 2102
uniform structure of a topological group 2102
ε-net 2103
Euclidean distance 2103
Hausdorﬀ metric 2104
Urysohn metrization theorem 2104
ball 2104
bounded 2105
city-block metric 2105
completely metrizable 2105
distance to a set 2106
equibounded 2106
isometry 2106
metric space 2107
non-reversible metric 2107
open ball 2108
some structures on R^n 2108
totally bounded 2110
ultrametric 2110
Lebesgue number lemma 2111
proof of Lebesgue number lemma 2111
complete 2111
completeness principle 2112
uniformly equicontinuous 2112
Baire category theorem 2112
Baire space 2113
equivalent statement of Baire category theorem 2113
generic 2114
meager 2114
proof for one equivalent statement of Baire category theorem 2114
proof of Baire category theorem 2115
residual 2115
six consequences of Baire category theorem 2116
Hahn-Mazurkiewicz theorem 2116
Vitali covering 2116
compactly generated 2116
54G05 – Extremally disconnected spaces, F-spaces, etc. 2117
extremally disconnected 2117
54G20 – Counterexamples 2118
Sierpinski space 2118
long line 2118
55-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2120
Universal Coeﬃcient Theorem 2120
invariance of dimension 2121
55M05 – Duality 2122
Poincaré duality 2122
55M20 – Fixed points and coincidences 2123
Sperner’s lemma 2123
55M25 – Degree, winding number 2125
degree (map of spheres) 2125
winding number 2126
55M99 – Miscellaneous 2127
genus of topological surface 2127
55N10 – Singular theory 2128
Betti number 2128
Mayer-Vietoris sequence 2128
cellular homology 2128
homology (topological space) 2129
homology of RP^3 2131
long exact sequence (of homology groups) 2132
relative homology groups 2133
55N99 – Miscellaneous 2134
suspension isomorphism 2134
55P05 – Homotopy extension properties,
coﬁbrations 2135
coﬁbration 2135
homotopy extension property 2135
55P10 – Homotopy equivalences 2136
Whitehead theorem 2136
weak homotopy equivalence 2136
55P15 – Classification of homotopy type 2137
simply connected 2137
55P20 – Eilenberg-Mac Lane spaces 2138
Eilenberg-Mac Lane space 2138
55P99 – Miscellaneous 2139
fundamental groupoid 2139
55Pxx – Homotopy theory 2141
nullhomotopic map 2141
55Q05 – Homotopy groups, general; sets
of homotopy classes 2142
Van Kampen’s theorem 2142
category of pointed topological spaces 2143
deformation retraction 2143
fundamental group 2144
homotopy of maps 2144
homotopy of paths 2145
long exact sequence (locally trivial bundle) 2145
55Q52 – Homotopy groups of special spaces 2146
contractible 2146
55R05 – Fiber spaces 2147
classiﬁcation of covering spaces 2147
covering space 2148
deck transformation 2148
lifting of maps 2150
lifting theorem 2151
monodromy 2151
properly discontinuous action 2153
regular covering 2153
55R10 – Fiber bundles 2155
associated bundle construction 2155
bundle map 2156
ﬁber bundle 2156
locally trivial bundle 2157
principal bundle 2157
pullback bundle 2158
reduction of structure group 2158
section of a ﬁber bundle 2160
some examples of universal bundles 2161
universal bundle 2161
55R25 – Sphere bundles and vector bundles 2163
Hopf bundle 2163
vector bundle 2163
55U10 – Simplicial sets and complexes 2164
simplicial complex 2164
57-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2167
connected sum 2167
57-XX – Manifolds and cell complexes 2168
CW complex 2168
57M25 – Knots and links in S^3 2170
connected sum 2170
knot theory 2170
unknot 2173
57M99 – Miscellaneous 2174
Dehn surgery 2174
57N16 – Geometric structures on manifolds 2175
self-intersections of a curve 2175
57N70 – Cobordism and concordance 2176
h-cobordism 2176
Smale’s h-cobordism theorem 2176
cobordism 2176
57N99 – Miscellaneous 2178
orientation 2178
57R22 – Topology of vector bundles and
ﬁber bundles 2180
hairy ball theorem 2180
57R35 – Diﬀerentiable mappings 2182
Sard’s theorem 2182
diﬀerentiable function 2182
57R42 – Immersions 2184
immersion 2184
57R60 – Homotopy spheres, Poincaré conjecture 2185
Poincaré conjecture 2185
The Poincaré dodecahedral space 2185
homology sphere 2186
57R99 – Miscellaneous 2187
transversality 2187
57S25 – Groups acting on specific manifolds 2189
Isomorphism of the group PSL2(C) with the group of Möbius transformations 2189
58A05 – Differentiable manifolds, foundations 2190
partition of unity 2190
58A10 – Diﬀerential forms 2191
diﬀerential form 2191
58A32 – Natural bundles 2194
conormal bundle 2194
cotangent bundle 2194
normal bundle 2195
tangent bundle 2195
58C35 – Integration on manifolds; measures on manifolds 2196
general Stokes theorem 2196
proof of general Stokes theorem 2196
58C40 – Spectral theory; eigenvalue problems 2199
spectral radius 2199
58E05 – Abstract critical point theory (Morse theory, Ljusternik-Schnirelman (Lyusternik-Shnirelman) theory, etc.) 2200
Morse complex 2200
Morse function 2200
Morse lemma 2201
centralizer 2201
60-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2202
Bayes’ theorem 2202
Bernoulli random variable 2202
Gamma random variable 2203
beta random variable 2204
chi-squared random variable 2205
continuous density function 2205
expected value 2206
geometric random variable 2207
proof of Bayes’ Theorem 2207
random variable 2208
uniform (continuous) random variable 2208
uniform (discrete) random variable 2209
60A05 – Axioms; other general questions 2210
example of pairwise independent events that are
not totally independent 2210
independent 2210
random event 2211
60A10 – Probabilistic measure theory 2212
Cauchy random variable 2212
almost surely 2212
60A99 – Miscellaneous 2214
BorelCantelli lemma 2214
Chebyshev’s inequality 2214
Markov’s inequality 2215
cumulative distribution function 2215
limit superior of sets 2215
proof of Chebyshev’s inequality 2216
proof of Markov’s inequality 2216
60E05 – Distributions: general theory 2217
Cramér-Wold theorem 2217
Helly-Bray theorem 2217
Scheffé’s theorem 2218
Zipf’s law 2218
binomial distribution 2219
convergence in distribution 2220
density function 2221
distribution function 2221
geometric distribution 2222
relative entropy 2223
Paul Lévy continuity theorem 2224
characteristic function 2225
Kolmogorov’s inequality 2226
discrete density function 2226
probability distribution function 2227
60F05 – Central limit and other weak theorems 2229
Lindeberg’s central limit theorem 2229
60F15 – Strong theorems 2231
Kolmogorov’s strong law of large numbers 2231
strong law of large numbers 2231
60G05 – Foundations of stochastic processes 2233
stochastic process 2233
60G99 – Miscellaneous 2234
stochastic matrix 2234
60J10 – Markov chains with discrete parameter 2235
Markov chain 2235
62-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2236
covariance 2236
moment 2237
variance 2237
62E15 – Exact distribution theory 2239
Pareto random variable 2239
exponential random variable 2240
hypergeometric random variable 2240
negative hypergeometric random variable 2241
negative hypergeometric random variable, example of 2242
proof of expected value of the hypergeometric
distribution 2243
proof of variance of the hypergeometric distribution 2243
proof that normal distribution is a distribution 2245
65-00 – General reference works (handbooks, dictionaries, bibliographies, etc.) 2246
normal equations 2246
principal components analysis 2247
pseudoinverse 2248
65-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 2250
cubic spline interpolation 2250
65B15 – Euler-Maclaurin formula 2252
Euler-Maclaurin summation formula 2252
proof of Euler-Maclaurin summation formula 2252
65C05 – Monte Carlo methods 2254
Monte Carlo methods 2254
65D32 – Quadrature and cubature formulas 2256
Simpson’s rule 2256
65F25 – Orthogonalization 2257
Givens rotation 2257
Gram-Schmidt orthogonalization 2258
Householder transformation 2259
orthonormal 2261
65F35 – Matrix norms, conditioning, scaling 2262
Hilbert matrix 2262
Pascal matrix 2262
Toeplitz matrix 2263
matrix condition number 2264
matrix norm 2264
pivoting 2265
65R10 – Integral transforms 2266
integral transform 2266
65T50 – Discrete and fast Fourier transforms 2267
Vandermonde matrix 2267
discrete Fourier transform 2268
68M20 – Performance evaluation; queueing; scheduling 2270
Amdahl’s Law 2270
eﬃciency 2270
proof of Amdahl’s Law 2271
68P05 – Data structures 2272
heap insertion algorithm 2272
heap removal algorithm 2273
68P10 – Searching and sorting 2275
binary search 2275
bubblesort 2276
heap 2277
heapsort 2278
in-place sorting algorithm 2279
insertion sort 2279
lower bound for sorting 2281
quicksort 2282
sorting problem 2283
68P20 – Information storage and retrieval 2285
Browsing service 2285
Digital Library Index 2285
Digital Library Scenario 2285
Digital Library Space 2286
Digital Library Searching Service 2286
Service, activity, task, or procedure 2286
StructuredStream 2286
collection 2286
digital library stream 2287
digital object 2287
good hash table primes 2287
hashing 2289
metadata format 2293
system state 2293
transition event 2294
68P30 – Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) 2295
Huﬀman coding 2295
Huﬀman’s algorithm 2297
arithmetic encoding 2299
binary Gray code 2300
entropy encoding 2301
68Q01 – General 2302
currying 2302
higherorder function 2303
68Q05 – Models of computation (Turing
machines, etc.) 2304
Cook reduction 2304
Levin reduction 2305
Turing computable 2305
computable number 2305
deterministic ﬁnite automaton 2306
nondeterministic Turing machine 2307
nondeterministic ﬁnite automaton 2307
nondeterministic pushdown automaton 2309
oracle 2310
self-reducible 2311
universal Turing machine 2311
68Q10 – Modes of computation (nondeterministic, parallel, interactive, probabilistic, etc.) 2312
deterministic Turing machine 2312
random Turing machine 2313
68Q15 – Complexity classes (hierarchies, relations among complexity classes, etc.) 2315
NP-complete 2315
complexity class 2315
constructible 2317
counting complexity class 2317
polynomial hierarchy 2317
polynomial hierarchy is a hierarchy 2318
time complexity 2318
68Q25 – Analysis of algorithms and problem complexity 2320
counting problem 2320
decision problem 2320
promise problem 2321
range problem 2321
search problem 2321
68Q30 – Algorithmic information theory
(Kolmogorov complexity, etc.) 2323
Kolmogorov complexity 2323
Kolmogorov complexity function 2323
Kolmogorov complexity upper bounds 2324
computationally indistinguishable 2324
distribution ensemble 2325
hard core 2325
invariance theorem 2325
natural numbers identified with binary strings 2326
one-way function 2326
pseudorandom 2327
pseudorandom generator 2327
support 2327
68Q45 – Formal languages and automata 2328
automaton 2328
context-free language 2329
68Q70 – Algebraic theory of languages and
automata 2331
Kleene algebra 2331
Kleene star 2331
monad 2332
68R05 – Combinatorics 2333
switching lemma 2333
68R10 – Graph theory 2334
Floyd’s algorithm 2334
digital library structural metadata specification 2334
digital library structure 2335
digital library substructure 2335
68T10 – Pattern recognition, speech recognition 2336
Hough transform 2336
68U10 – Image processing 2340
aliasing 2340
68W01 – General 2341
Horner’s rule 2341
68W30 – Symbolic computation and algebraic computation 2343
algebraic computation 2343
68W40 – Analysis of algorithms 2344
speedup 2344
74A05 – Kinematics of deformation 2345
body 2345
deformation 2345
76D05 – Navier-Stokes equations 2346
Navier-Stokes equations 2346
81S40 – Path integrals 2347
Feynman path integral 2347
90C05 – Linear programming 2349
linear programming 2349
simplex algorithm 2350
91A05 – 2-person games 2351
examples of normal form games 2351
normal form game 2352
91A10 – Noncooperative games 2353
dominant strategy 2353
91A18 – Games in extensive form 2354
extensive form game 2354
91A99 – Miscellaneous 2355
Nash equilibrium 2355
Pareto dominant 2355
common knowledge 2356
complete information 2356
example of Nash equilibrium 2357
game 2357
game theory 2358
strategy 2358
utility 2359
92B05 – General biology and biomathematics 2360
Lotka-Volterra system 2360
93A10 – General systems 2362
transfer function 2362
93B99 – Miscellaneous 2363
passivity 2363
93D99 – Miscellaneous 2365
Hurwitz matrix 2365
94A12 – Signal theory (characterization,
reconstruction, etc.) 2366
rms error 2366
94A17 – Measures of information, entropy 2367
conditional entropy 2367
gaussian maximizes entropy for given covariance 2368
mutual information 2368
proof of gaussian maximizes entropy for given
covariance 2369
94A20 – Sampling theory 2371
sampling theorem 2371
94A60 – Cryptography 2372
Diffie-Hellman key exchange 2372
elliptic curve discrete logarithm problem 2373
94A99 – Miscellaneous 2374
Heaps’ law 2374
History 2375
GNU Free Documentation License
Version 1.1, March 2000
Copyright © 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but
changing it is not allowed.
Preamble
The purpose of this License is to make a manual, textbook, or other written document “free”
in the sense of freedom: to assure everyone the eﬀective freedom to copy and redistribute
it, with or without modifying it, either commercially or noncommercially. Secondarily, this
License preserves for the author and publisher a way to get credit for their work, while not
being considered responsible for modiﬁcations made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must
themselves be free in the same sense. It complements the GNU General Public License, which
is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free
software needs free documentation: a free program should come with manuals providing the
same freedoms that the software does. But this License is not limited to software manuals; it
can be used for any textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose is instruction
or reference.
Applicability and Deﬁnitions
This License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”.
A “Modiﬁed Version” of the Document means any work containing the Document or a
portion of it, either copied verbatim, or with modiﬁcations and/or translated into another
language.
A “Secondary Section” is a named appendix or a front-matter section of the Document
that deals exclusively with the relationship of the publishers or authors of the Document to
the Document’s overall subject (or to related matters) and contains nothing that could fall
directly within that overall subject. (For example, if the Document is in part a textbook
of mathematics, a Secondary Section may not explain any mathematics.) The relationship
could be a matter of historical connection with the subject or with related matters, or of
legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being
those of Invariant Sections, in the notice that says that the Document is released under this
License.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License.
A “Transparent” copy of the Document means a machine-readable copy, represented in a
format whose speciﬁcation is available to the general public, whose contents can be viewed
and edited directly and straightforwardly with generic text editors or (for images composed
of pixels) generic paint programs or (for drawings) some widely available drawing editor,
and that is suitable for input to text formatters or for automatic translation to a variety of
formats suitable for input to text formatters. A copy made in an otherwise Transparent ﬁle
format whose markup has been designed to thwart or discourage subsequent modiﬁcation
by readers is not Transparent. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats
include PostScript, PDF, proprietary formats that can be read and edited only by proprietary
word processors, SGML or XML for which the DTD and/or processing tools are not generally
available, and the machine-generated HTML produced by some word processors for output
purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages
as are needed to hold, legibly, the material this License requires to appear in the title page.
For works in formats which do not have any title page as such, “Title Page” means the text
near the most prominent appearance of the work’s title, preceding the beginning of the body
of the text.
Verbatim Copying
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying
this License applies to the Document are reproduced in all copies, and that you add no
other conditions whatsoever to those of this License. You may not use technical measures
to obstruct or control the reading or further copying of the copies you make or distribute.
However, you may accept compensation in exchange for copies. If you distribute a large
enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly
display copies.
Copying in Quantity
If you publish printed copies of the Document numbering more than 100, and the Document’s
license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly
and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the
publisher of these copies. The front cover must present the full title with all words of the
title equally prominent and visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve the title of the Document
and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to ﬁt legibly, you should put the
ﬁrst ones listed (as many as ﬁt reasonably) on the actual cover, and continue the rest onto
adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you
must either include a machinereadable Transparent copy along with each Opaque copy, or
state in or with each Opaque copy a publicly-accessible computer-network location containing
a complete Transparent copy of the Document, free of added material, which the general
network-using public has access to download anonymously at no charge using public-standard
network protocols. If you use the latter option, you must take reasonably prudent steps, when
you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy
will remain thus accessible at the stated location until at least one year after the last time
you distribute an Opaque copy (directly or through your agents or retailers) of that edition
to the public.
It is requested, but not required, that you contact the authors of the Document well before
redistributing any large number of copies, to give them a chance to provide you with an
updated version of the Document.
Modiﬁcations
You may copy and distribute a Modiﬁed Version of the Document under the conditions of
sections 2 and 3 above, provided that you release the Modiﬁed Version under precisely this
License, with the Modiﬁed Version ﬁlling the role of the Document, thus licensing distribution
and modiﬁcation of the Modiﬁed Version to whoever possesses a copy of it. In addition, you
must do these things in the Modiﬁed Version:
• Use in the Title Page (and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if there were any, be
listed in the History section of the Document). You may use the same title as a previous
version if the original publisher of that version gives permission.
• List on the Title Page, as authors, one or more persons or entities responsible for
authorship of the modiﬁcations in the Modiﬁed Version, together with at least ﬁve of
the principal authors of the Document (all of its principal authors, if it has less than
ﬁve).
• State on the Title page the name of the publisher of the Modiﬁed Version, as the
publisher.
• Preserve all the copyright notices of the Document.
• Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.
• Include, immediately after the copyright notices, a license notice giving the public
permission to use the Modiﬁed Version under the terms of this License, in the form
shown in the Addendum below.
• Preserve in that license notice the full lists of Invariant Sections and required Cover
Texts given in the Document’s license notice.
• Include an unaltered copy of this License.
• Preserve the section entitled “History”, and its title, and add to it an item stating
at least the title, year, new authors, and publisher of the Modiﬁed Version as given
on the Title Page. If there is no section entitled “History” in the Document, create
one stating the title, year, authors, and publisher of the Document as given on its
Title Page, then add an item describing the Modiﬁed Version as stated in the previous
sentence.
• Preserve the network location, if any, given in the Document for public access to
a Transparent copy of the Document, and likewise the network locations given in the
Document for previous versions it was based on. These may be placed in the “History”
section. You may omit a network location for a work that was published at least four
years before the Document itself, or if the original publisher of the version it refers to
gives permission.
• In any section entitled “Acknowledgements” or “Dedications”, preserve the section’s
title, and preserve in the section all the substance and tone of each of the contributor
acknowledgements and/or dedications given therein.
• Preserve all the Invariant Sections of the Document, unaltered in their text and in
their titles. Section numbers or the equivalent are not considered part of the section
titles.
• Delete any section entitled “Endorsements”. Such a section may not be included in
the Modiﬁed Version.
• Do not retitle any existing section as “Endorsements” or to conﬂict in title with any
Invariant Section.
If the Modiﬁed Version includes new frontmatter sections or appendices that qualify as
Secondary Sections and contain no material copied from the Document, you may at your
option designate some or all of these sections as invariant. To do this, add their titles to
the list of Invariant Sections in the Modiﬁed Version’s license notice. These titles must be
distinct from any other section titles.
You may add a section entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties – for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to
use their names for publicity for or to assert or imply endorsement of any Modiﬁed Version.
Combining Documents
You may combine the Document with other documents released under this License, under
the terms deﬁned in section 4 above for modiﬁed versions, provided that you include in the
combination all of the Invariant Sections of all of the original documents, unmodiﬁed, and
list them all as Invariant Sections of your combined work in its license notice.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections entitled “History” in the various original
documents, forming one section entitled “History”; likewise combine any sections entitled
“Acknowledgements”, and any sections entitled “Dedications”. You must delete all sections
entitled “Endorsements.”
Collections of Documents
You may make a collection consisting of the Document and other documents released under
this License, and replace the individual copies of this License in the various documents with
a single copy that is included in the collection, provided that you follow the rules of this
License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually
under this License, provided you insert a copy of this License into the extracted document,
and follow this License in all other respects regarding verbatim copying of that document.
Aggregation With Independent Works
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an “aggregate”, and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then
if the Document is less than one quarter of the entire aggregate, the Document’s Cover Texts
may be placed on covers that surround only the Document within the aggregate. Otherwise
they must appear on covers around the whole aggregate.
Translation
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections.
You may include a translation of this License provided that you also include the original
English version of this License. In case of a disagreement between the translation and the
original English version of this License, the original English version will prevail.
Termination
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the
Document is void, and will automatically terminate your rights under this License. However,
parties who have received copies, or rights, from you under this License will not have their
licenses terminated so long as such parties remain in full compliance.
Future Revisions of This License
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies
to it, you have the option of following the terms and conditions either of that speciﬁed
version or of any later version that has been published (not as a draft) by the Free Software
Foundation. If the Document does not specify a version number of this License, you may
choose any version ever published (not as a draft) by the Free Software Foundation.
ADDENDUM: How to use this License for your documents
To use this License in a document you have written, include a copy of the License in the
document and put the following copyright and license notices just after the title page:
Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. A copy of the license is included in the section entitled “GNU Free Documentation License”.
If you have no Invariant Sections, write “with no Invariant Sections” instead of saying which ones are invariant. If you have no Front-Cover Texts, write “no Front-Cover Texts” instead of “Front-Cover Texts being LIST”; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing
these examples in parallel under your choice of free software license, such as the GNU General
Public License, to permit their use in free software.
Chapter 1
UNCLA – Unclassiﬁed
1.1 Golomb ruler
A Golomb ruler of length $n$ is a ruler with only a subset of the integer markings $\{0, a_2, \ldots, n\} \subset \{0, 1, 2, \ldots, n\}$ that appear on a regular ruler. The defining criterion of this subset is that there exists an $m$ such that any positive integer $k < m$ can be expressed uniquely as a difference $k = a_i - a_j$ for some $i, j$. This is referred to as an $m$-Golomb ruler.

A 4-Golomb ruler of length 7 is given by $\{0, 1, 3, 7\}$. To verify this, we need to show that every number $1, 2, 3, 4$ can be expressed as a difference of two numbers in the above set:
$$1 = 1 - 0 \qquad 2 = 3 - 1 \qquad 3 = 3 - 0 \qquad 4 = 7 - 3$$

An optimal Golomb ruler is one where for a fixed value of $n$ the value of $a_n$ is minimized.
Version: 2 Owner: mathcam Author(s): mathcam, imran
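The distinct-differences condition is easy to check by brute force. The following Python sketch (the helper `is_golomb` is our own name, not from the entry) verifies that all pairwise differences of $\{0, 1, 3, 7\}$ are distinct:

```python
from itertools import combinations

def is_golomb(marks):
    """Return True if all pairwise differences of the marks are distinct."""
    diffs = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(diffs) == len(set(diffs))

ruler = [0, 1, 3, 7]
print(is_golomb(ruler))                                   # True
print(sorted(b - a for a, b in combinations(ruler, 2)))   # [1, 2, 3, 4, 6, 7]
```

The differences 1 through 4 each occur exactly once, matching the verification above.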
1.2 Hesse conﬁguration
A Hesse configuration is a set $P$ of nine non-collinear points in the projective plane over a field $K$ such that any line through two points of $P$ contains exactly three points of $P$. Then there are 12 such lines through $P$. A Hesse configuration exists if and only if the field $K$ contains a primitive third root of unity. For such $K$ the projective automorphism group $PGL(3, K)$ acts transitively on all possible Hesse configurations.

The configuration $P$ with its intersection structure of 12 lines is isomorphic to the affine space $A = F^2$ where $F$ is a field with three elements.

The group $\Gamma \subset PGL(3, K)$ of all symmetries that map $P$ onto itself has order 216 and it is isomorphic to the group of affine transformations of $A$ that have determinant 1. The stabilizer in $\Gamma$ of any of the 12 lines through $P$ is a cyclic subgroup of order three and $\Gamma$ is generated by these subgroups.

The symmetry group $\Gamma$ is isomorphic to $G(K)/Z(K)$ where $G(K) \subset GL(3, K)$ is a group of order 648 generated by reflections of order three and $Z(K)$ is its cyclic center of order three. The reflection group $G(\mathbb{C})$ is called the Hesse group, which appears as $G_{25}$ in the classification of finite complex reflection groups by Shephard and Todd.

If $K$ is algebraically closed and the characteristic of $K$ is not 2 or 3 then the nine inflection points of an elliptic curve $E$ over $K$ form a Hesse configuration.
Version: 3 Owner: debosberg Author(s): debosberg
1.3 Jordan’s Inequality
Jordan's inequality states that
$$\frac{2}{\pi}\, x \le \sin(x) \le x \qquad \forall\, x \in \left[0, \frac{\pi}{2}\right].$$
Version: 3 Owner: unlord Author(s): unlord
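A quick numerical sanity check of both bounds on a grid over $[0, \pi/2]$ (a sketch; sampling illustrates, but of course does not prove, the inequality):

```python
import math

# check (2/pi) x <= sin x <= x at 101 evenly spaced points in [0, pi/2];
# the small tolerance guards floating-point rounding at the endpoints
for k in range(101):
    x = k * (math.pi / 2) / 100
    assert (2 / math.pi) * x <= math.sin(x) + 1e-15
    assert math.sin(x) <= x + 1e-15
print("bounds hold at all sampled points")
```

Equality holds only at the endpoints: at $x = 0$ all three quantities vanish, and at $x = \pi/2$ the lower bound equals $\sin(\pi/2) = 1$.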
1.4 Lagrange’s theorem
Lagrange's theorem
1: $G$ group
2: $H < G$
3: $[G : H]$ index of $H$ in $G$
4: $|G| = |H| \cdot [G : H]$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): akrowne, apmxi
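The counting statement in line 4 can be illustrated concretely. The sketch below (our own construction, not part of the entry) takes $G = S_3$, lets $H$ be the cyclic subgroup generated by a 3-cycle, builds the left cosets, and checks $|G| = |H| \cdot [G : H]$:

```python
from itertools import permutations

def compose(p, q):
    """(p ∘ q)(i) = p(q(i)) for permutations written as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))

G = list(permutations(range(3)))      # S3, order 6
r = (1, 2, 0)                         # a 3-cycle
H = [(0, 1, 2), r, compose(r, r)]     # cyclic subgroup of order 3

# the left cosets gH partition G; their number is the index [G : H]
cosets = {tuple(sorted(compose(g, h) for h in H)) for g in G}
index = len(cosets)
print(len(G) == len(H) * index)       # True: 6 = 3 * 2
```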
1.5 Laurent series
A Laurent series centered about $a$ is a series of the form
$$\sum_{k=-\infty}^{\infty} c_k (z - a)^k$$
where $c_k, a, z \in \mathbb{C}$.

One can prove that the above series converges everywhere inside the set
$$D := \{ z \in \mathbb{C} \mid R_1 < |z - a| < R_2 \}$$
where
$$R_1 := \limsup_{k \to \infty} |c_{-k}|^{1/k} \qquad \text{and} \qquad R_2 := \frac{1}{\limsup_{k \to \infty} |c_k|^{1/k}}.$$
(This set may be empty.)

Every Laurent series has an associated function, given by
$$f(z) := \sum_{k=-\infty}^{\infty} c_k (z - a)^k,$$
whose domain is the set of points in $\mathbb{C}$ on which the series converges. This function is analytic inside the annulus $D$, and conversely, every analytic function on an annulus is equal to some (unique) Laurent series.
Version: 3 Owner: djao Author(s): djao
1.6 Lebesgue measure
Let $S \subseteq \mathbb{R}$, and let $S'$ be the complement of $S$ with respect to $\mathbb{R}$. We define $S$ to be measurable if, for any $A \subseteq \mathbb{R}$,
$$m^*(A) = m^*(A \cap S) + m^*(A \cap S')$$
where $m^*(S)$ is the Lebesgue outer measure of $S$. If $S$ is measurable, then we define the Lebesgue measure of $S$ to be $m(S) = m^*(S)$.

Lebesgue measure on $\mathbb{R}^n$ is the $n$-fold product measure of Lebesgue measure on $\mathbb{R}$.
Version: 2 Owner: vampyr Author(s): vampyr
1.7 Leray spectral sequence
The Leray spectral sequence is a special case of the Grothendieck spectral sequence regarding composition of functors.

If $f : X \to Y$ is a continuous map of topological spaces, and if $\mathcal{F}$ is a sheaf of abelian groups on $X$, then there is a spectral sequence
$$E_2^{pq} = H^p(Y, R^q f_* \mathcal{F}) \Rightarrow H^{p+q}(X, \mathcal{F})$$
where $f_*$ is the direct image functor.
Version: 1 Owner: bwebste Author(s): nerdy2
1.8 M¨obius transformation
A Möbius transformation is a bijection on the extended complex plane $\mathbb{C} \cup \{\infty\}$ given by
$$f(z) = \begin{cases} \dfrac{az + b}{cz + d} & z \neq -\dfrac{d}{c}, \infty \\ \dfrac{a}{c} & z = \infty \\ \infty & z = -\dfrac{d}{c} \end{cases}$$
where $a, b, c, d \in \mathbb{C}$ and $ad - bc \neq 0$.

It can be shown that the inverse and the composition of two Möbius transformations are similarly defined, and so the Möbius transformations form a group under composition.

The geometric interpretation of the Möbius group is that it is the group of automorphisms of the Riemann sphere.

Any Möbius map can be composed from the elementary transformations (dilations, translations and inversions). If we define a line to be a circle passing through $\infty$, then it can be shown that a Möbius transformation maps circles to circles, by looking at each elementary transformation.
Version: 9 Owner: vitriol Author(s): vitriol
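Closure under composition can be made concrete via the standard matrix representation: $f(z) = (az+b)/(cz+d)$ corresponds to the matrix $[[a, b], [c, d]]$, and composition of maps corresponds to matrix multiplication. A minimal sketch (finite points only; the $\infty$ cases are omitted):

```python
# a Möbius map is represented by ((a, b), (c, d)); composing maps multiplies matrices

def apply(m, z):
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m1, m2):
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

f = ((1, 2), (0, 1))      # z -> z + 2  (translation)
g = ((2, 0), (0, 1))      # z -> 2z     (dilation)

z = 3 + 1j
print(apply(matmul(f, g), z) == apply(f, apply(g, z)))   # True
```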
1.9 Mordell–Weil theorem
If $E$ is an elliptic curve defined over a number field $K$, then the group of points of $E$ with coordinates in $K$ is a finitely generated abelian group.
Version: 1 Owner: nerdy2 Author(s): nerdy2
1.10 Plateau’s Problem
Plateau's problem is the problem of finding the surface with minimal area among all surfaces which have the same prescribed boundary.

This problem is named after the Belgian physicist Joseph Plateau (1801–1883), who experimented with soap films. As a matter of fact, if you take a wire (which represents a closed curve in three-dimensional space) and dip it in a solution of soapy water, you obtain a soapy surface which has the wire as boundary. It turns out that this surface has the minimal area among all surfaces with the same boundary, so the soap film is a solution to Plateau's problem.

Jesse Douglas (1897–1965) solved the problem by proving the existence of such minimal surfaces. The solution to the problem is achieved by finding a harmonic and conformal parameterization of the surface.

The extension of the problem to higher dimensions (i.e. for $k$-dimensional surfaces in $n$-dimensional space) turns out to be much more difficult to study. Moreover, while the solutions to the original problem are always regular, it turns out that the solutions to the extended problem may have singularities if $n \ge 8$. To solve the extended problem, the theory of currents (Federer and Fleming) has been developed.
Version: 4 Owner: paolini Author(s): paolini
1.11 Poisson random variable
$X$ is a Poisson random variable with parameter $\lambda$ if
$$f_X(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \qquad x \in \{0, 1, 2, \ldots\}$$

Parameters:
- $\lambda > 0$

Syntax:
$$X \sim Poisson(\lambda)$$

Notes:
1. $X$ is often used to describe the occurrence of rare events. It's a very commonly used distribution in all fields of statistics.
2. $E[X] = \lambda$
3. $Var[X] = \lambda$
4. $M_X(t) = e^{\lambda(e^t - 1)}$
Version: 2 Owner: Riemann Author(s): Riemann
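Notes 2 and 3 can be checked numerically by summing the pmf over a truncated support (a sketch; the value $\lambda = 2.5$ is an arbitrary illustrative choice):

```python
import math

lam = 2.5

def pmf(x):
    return math.exp(-lam) * lam ** x / math.factorial(x)

xs = range(101)                     # the tail beyond x = 100 is negligible here
total = sum(pmf(x) for x in xs)
mean = sum(x * pmf(x) for x in xs)
var = sum((x - mean) ** 2 * pmf(x) for x in xs)
print(round(total, 9), round(mean, 6), round(var, 6))   # 1.0 2.5 2.5
```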
1.12 Shannon’s theorem
Definition (Discrete) Let $(\Omega, \mathcal{F}, \mu)$ be a discrete probability space, and let $X$ be a discrete random variable on $\Omega$.

The entropy $H[X]$ is defined as the functional
$$H[X] = -\sum_{x \in \Omega} \mu(X = x) \log \mu(X = x). \qquad (1.12.1)$$

Definition (Continuous) Entropy in the continuous case is called differential entropy.

Discussion—Discrete Entropy Entropy was first introduced by Shannon in 1948 in his landmark paper “A Mathematical Theory of Communication.” A modified and expanded version of his argument is presented here.
Suppose we have a set of possible events whose probabilities of occurrence are $p_1, p_2, \ldots, p_n$. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much “choice” is involved in the selection of the event or of how uncertain we are of the outcome? If there is such a measure, say $H(p_1, p_2, \ldots, p_n)$, it is reasonable to require of it the following properties:

1. $H$ should be continuous in the $p_i$.
2. If all the $p_i$ are equal, $p_i = \frac{1}{n}$, then $H$ should be a monotonic increasing function of $n$. With equally likely events there is more choice, or uncertainty, when there are more possible events.
3. If a choice be broken down into two successive choices, the original $H$ should be the weighted sum of the individual values of $H$.
As an example of this last property, consider losing your luggage down a chute which feeds three carousels, A, B and C. Assume that the baggage handling system is constructed such that the probability of your luggage ending up on carousel A is $\frac{1}{2}$, on B is $\frac{1}{3}$, and on C is $\frac{1}{6}$. These probabilities specify the $p_i$. There are two ways to think about your uncertainty about where your luggage will end up.

First, you could consider your uncertainty to be $H(p_A, p_B, p_C) = H(\frac{1}{2}, \frac{1}{3}, \frac{1}{6})$. On the other hand, you reason, no matter how byzantine the baggage handling system is, half the time your luggage will end up on carousel A and half the time it will end up on carousels B or C (with uncertainty $H(p_A, p_{B \cup C}) = H(\frac{1}{2}, \frac{1}{2})$). If it doesn't go into A (and half the time it won't), then two-thirds of the time it shows up on B and one-third of the time it winds up on carousel C (and your uncertainty about this second event, in isolation, is $H(p_B, p_C) = H(\frac{2}{3}, \frac{1}{3})$). But remember this second event only happens half the time ($p_{B \cup C}$ of the time), so you must weight this second uncertainty appropriately—that is, by $\frac{1}{2}$. The uncertainties computed using each of these chains of reasoning must be equal. That is,
$$H(p_A, p_B, p_C) = H(p_A, p_{B \cup C}) + p_{B \cup C}\, H(p_B, p_C)$$
$$H\left(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{6}\right) = H\left(\tfrac{1}{2}, \tfrac{1}{2}\right) + \tfrac{1}{2}\, H\left(\tfrac{2}{3}, \tfrac{1}{3}\right)$$
If you're not as lost as your luggage, then you may be interested in the following. . .

Theorem The only $H$ satisfying the three above assumptions is of the form:
$$H = -k \sum_{i=1}^{n} p_i \log p_i$$
where $k$ is a constant, essentially a choice of unit of measure. The measure of uncertainty, $H$, is called entropy, not to be confused (though it often is) with Boltzmann's thermodynamic entropy. The logarithm may be taken to the base 2, in which case $H$ is measured in “bits,” or to the base $e$, in which case $H$ is measured in “nats.”
Discussion—Continuous Entropy Despite its seductively analogous form, continuous entropy cannot be obtained as a limiting case of discrete entropy.

We wish to obtain a generally finite measure as the “bin size” goes to zero. In the discrete case, the bin size is the (implicit) width of each of the $n$ (finite or infinite) bins/buckets/states whose probabilities are the $p_n$. As we generalize to the continuous domain, we must make this width explicit.

To do this, start with a continuous function $f$ discretized as shown in the figure:

Figure 1.1: Discretizing the function $f$ into bins of width $\Delta$

As the figure indicates, by the mean-value theorem there exists a value $x_i$ in each bin such that
$$f(x_i)\, \Delta = \int_{i\Delta}^{(i+1)\Delta} f(x)\, dx \qquad (1.12.2)$$
and thus the integral of the function $f$ can be approximated (in the Riemannian sense) by
$$\int_{-\infty}^{\infty} f(x)\, dx = \lim_{\Delta \to 0} \sum_{i=-\infty}^{\infty} f(x_i)\, \Delta \qquad (1.12.3)$$
where this limit and “bin size goes to zero” are equivalent.

We will denote
$$H^{\Delta} := -\sum_{i=-\infty}^{\infty} \Delta f(x_i) \log \Delta f(x_i) \qquad (1.12.4)$$
and expanding the log we have
$$H^{\Delta} = -\sum_{i=-\infty}^{\infty} \Delta f(x_i) \log \Delta f(x_i) \qquad (1.12.5)$$
$$= -\sum_{i=-\infty}^{\infty} \Delta f(x_i) \log f(x_i) - \sum_{i=-\infty}^{\infty} f(x_i)\, \Delta \log \Delta. \qquad (1.12.6)$$

As $\Delta \to 0$, we have
$$\sum_{i=-\infty}^{\infty} f(x_i)\, \Delta \to \int f(x)\, dx = 1 \qquad \text{and} \qquad (1.12.7)$$
$$\sum_{i=-\infty}^{\infty} \Delta f(x_i) \log f(x_i) \to \int f(x) \log f(x)\, dx \qquad (1.12.8)$$

This leads us to our definition of the differential entropy (continuous entropy):
$$h[f] = \lim_{\Delta \to 0} \left( H^{\Delta} + \log \Delta \right) = -\int_{-\infty}^{\infty} f(x) \log f(x)\, dx. \qquad (1.12.9)$$
Version: 13 Owner: gaurminirick Author(s): drummond
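The weighted-sum property in the luggage example is easy to confirm numerically (a sketch using base-2 logarithms, i.e. entropy in bits):

```python
import math

def H(*ps):
    """Shannon entropy (in bits) of a finite probability vector."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

lhs = H(1/2, 1/3, 1/6)
rhs = H(1/2, 1/2) + (1/2) * H(2/3, 1/3)
print(abs(lhs - rhs) < 1e-12)   # True: the two chains of reasoning agree
```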
1.13 Shapiro inequality
Let $n \ge 3$ and let $x_1, x_2, \ldots, x_n \in \mathbb{R}^+$ be positive reals.

The inequality
$$\frac{x_1}{x_2 + x_3} + \frac{x_2}{x_3 + x_4} + \cdots + \frac{x_n}{x_1 + x_2} \ge \frac{n}{2}$$
with $x_i + x_{i+1} > 0$ (indices taken modulo $n$) is true for any even integer $n \le 12$ and any odd integer $n \le 23$.
Version: 1 Owner: alek thiery Author(s): alek thiery
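A brute-force numerical probe of the cyclic sum for $n = 4$ (a sketch; random sampling illustrates, but does not prove, the inequality, and the sampling range is an arbitrary choice):

```python
import random

def shapiro_sum(x):
    """Cyclic sum x_i / (x_{i+1} + x_{i+2}), indices mod n."""
    n = len(x)
    return sum(x[i] / (x[(i + 1) % n] + x[(i + 2) % n]) for i in range(n))

random.seed(0)
ok = all(shapiro_sum([random.uniform(0.1, 10) for _ in range(4)]) >= 2 - 1e-12
         for _ in range(10000))
print(ok)   # no counterexample found for n = 4
```

At $x_1 = \cdots = x_4 = 1$ every term is $\frac{1}{2}$ and the sum equals $\frac{n}{2} = 2$ exactly.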
1.14 Sylow $p$-subgroups
Let $G$ be a finite group and $p$ be a prime that divides $|G|$. We can then write $|G| = p^k m$ for some positive integer $k$ so that $p$ does not divide $m$.

Any subgroup of $G$ whose order is $p^k$ is called a Sylow $p$-subgroup or simply $p$-subgroup.

The first Sylow theorem states that any group with order $p^k m$ has a Sylow $p$-subgroup.
Version: 3 Owner: drini Author(s): drini, apmxi
1.15 Tschirnhaus transformations
A polynomial transformation which transforms a polynomial to another with certain zero coefficients is called a Tschirnhaus transformation. It is thus an invertible transformation of the form $x \mapsto g(x)/h(x)$ where $g, h$ are polynomials over the base field $K$ (or some subfield of the splitting field of the polynomial being transformed). If $\gcd(h(x), f(x)) = 1$ then the Tschirnhaus transformation becomes a polynomial transformation mod $f$.

Specifically, it concerns a substitution that reduces finding the roots of the polynomial
$$p = T^n + a_1 T^{n-1} + \cdots + a_n = \prod_{i=1}^{n} (T - r_i) \in k[T]$$
to finding the roots of another $q$ (with fewer parameters) and solving an auxiliary polynomial equation $s$, with $\deg(s) < \deg(p \cdot q)$.

Historically, the transformation was applied to reduce the general quintic equation to simpler resolvents. Examples due to Hermite and Klein are, respectively, the principal resolvent
$$P(X) := X^5 + a_0 X^2 + a_1 X + a_3$$
and the Bring–Jerrard form
$$P(X) := X^5 + a_1 X + a_2$$

Tschirnhaus transformations are also used when computing Galois groups to remove repeated roots in resolvent polynomials. Almost any transformation will work but it is extremely hard to find an efficient algorithm that can be proved to work.
Version: 5 Owner: bwebste Author(s): bwebste, ottem
1.16 Wallis formulae
$$\int_0^{\pi/2} \sin^{2n} x\, dx = \frac{1 \cdot 3 \cdots (2n-1)}{2 \cdot 4 \cdots 2n} \cdot \frac{\pi}{2}$$
$$\int_0^{\pi/2} \sin^{2n+1} x\, dx = \frac{2 \cdot 4 \cdots 2n}{3 \cdot 5 \cdots (2n+1)}$$
$$\frac{\pi}{2} = \prod_{n=1}^{\infty} \frac{4n^2}{4n^2 - 1} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdots$$
Version: 2 Owner: vypertd Author(s): vypertd
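The infinite product converges quite slowly toward $\pi/2$; a partial product illustrates this (a sketch, with an arbitrary cutoff of 100000 factors):

```python
import math

prod = 1.0
for n in range(1, 100001):
    prod *= 4 * n * n / (4 * n * n - 1)

# the partial product after N factors is roughly (pi/2)(1 - 1/(4N))
print(prod, math.pi / 2)
```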
1.17 ascending chain condition
A collection $S$ of subsets of a set $X$ (that is, a subset of the power set of $X$) satisfies the ascending chain condition or ACC if there does not exist an infinite ascending chain $s_1 \subset s_2 \subset \cdots$ of subsets from $S$.
See also the descending chain condition (DCC).
Version: 2 Owner: antizeus Author(s): antizeus
1.18 bounded
Let $X$ be a subset of $\mathbb{R}$. We say that $X$ is bounded when there exists a real number $M$ such that $|x| < M$ for all $x \in X$. When $X$ is an interval, we speak of a bounded interval.

This can be generalized first to $\mathbb{R}^n$. We say that $X \subset \mathbb{R}^n$ is bounded if there is a real number $M$ such that $\|x\| < M$ for all $x \in X$, where $\|\cdot\|$ is the Euclidean norm. When we consider balls, we speak of bounded balls.

This condition is equivalent to the statement: there is a real number $D$ such that $\|x - y\| < D$ for all $x, y \in X$.

A further generalization to any metric space $V$ says that $X \subset V$ is bounded when there is a real number $M$ such that $d(x, y) < M$ for all $x, y \in X$, where $d$ represents the metric (distance function) on $V$.
Version: 2 Owner: drini Author(s): drini, apmxi
1.19 bounded operator
Definition [1]

1. Suppose $X$ and $Y$ are normed vector spaces with norms $\|\cdot\|_X$ and $\|\cdot\|_Y$. Further, suppose $L$ is a linear map $L : X \to Y$. If there is a $C \ge 0$ such that
$$\|Lx\|_Y \le C \|x\|_X$$
for all $x \in X$, then $L$ is a bounded operator.

2. Let $X$ and $Y$ be as above, and let $L : X \to Y$ be a bounded operator. Then the norm of $L$ is defined as the real number
$$\|L\| = \sup \left\{ \frac{\|Lx\|_Y}{\|x\|_X} \;\Big|\; x \in X \setminus \{0\} \right\}.$$
In the special case when $X$ is the zero vector space, any linear map $L : X \to Y$ is the zero map since $L(0) = 0 \cdot L(0) = 0$. In this case, we define $\|L\| = 0$.

TODO:
1. The defined norm for mappings is a norm
2. Examples: identity operator, zero operator: see [1].
3. Give alternative expressions for norm of $L$ (supremum taken over unit ball)
4. Discuss boundedness and continuity

Theorem [1, 2] Suppose $L : X \to Y$ is a linear map between normed vector spaces $X$ and $Y$. If $X$ is finite dimensional, then $L$ is bounded.
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons,
1978.
2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): matte, apmxi
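For intuition, the sketch below samples the ratio $\|Lx\|/\|x\|$ for the map $L(x, y) = (3x, y)$ on $\mathbb{R}^2$, whose operator norm is 3 (the map, seed, and sample count are our own illustrative choices, not from the entry):

```python
import math
import random

def L(v):
    """Linear map (x, y) -> (3x, y); operator norm 3, attained at (1, 0)."""
    return (3 * v[0], v[1])

def norm(v):
    return math.hypot(v[0], v[1])

print(norm(L((1.0, 0.0))))           # 3.0 -- the supremum is attained here

random.seed(1)
ratios = [norm(L(v)) / norm(v)
          for v in ((random.uniform(-1, 1), random.uniform(-1, 1))
                    for _ in range(10000))
          if norm(v) > 0]
print(max(ratios) <= 3.0 + 1e-12)    # True: every sampled ratio is bounded by 3
```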
1.20 complex projective line
complex projective line
1: $(z_1, z_2)$ complex numbers
2: $(z_1, z_2) \ne (0, 0)$
3: $\forall \lambda \in \mathbb{C} \setminus \{0\} : (\lambda z_1, \lambda z_2) \sim (z_1, z_2)$
4: $\{(\lambda z_1, \lambda z_2) \mid \lambda \in \mathbb{C} \setminus \{0\}\} / \sim$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.21 converges uniformly
Let $X$ be a set, $(Y, \rho)$ a metric space, $\{f_n\}$ a sequence of functions from $X$ to $Y$, and $f : X \to Y$ another function.

If for any $\varepsilon > 0$ there exists an integer $N$ such that
$$\rho(f_n(x), f(x)) < \varepsilon$$
for all $n > N$ and all $x \in X$, we say that $f_n$ converges uniformly to $f$.
Version: 2 Owner: drini Author(s): drini, apmxi
1.22 descending chain condition
A collection $S$ of subsets of a set $X$ (that is, a subset of the power set of $X$) satisfies the descending chain condition or DCC if there does not exist an infinite descending chain $s_1 \supset s_2 \supset \cdots$ of subsets from $S$.
See also the ascending chain condition (ACC).
Version: 1 Owner: antizeus Author(s): antizeus
1.23 diamond theorem
In the simplest case, the result states that every image of a two-colored “diamond” figure (like the figure in Plato's Meno dialogue) under the action of the symmetric group of degree 4 has some ordinary or color-interchange symmetry. The theorem generalizes to graphic designs on 2×2×2, 4×4, and 4×4×4 arrays. It is of interest because it relates classical (Euclidean) symmetries to underlying group actions that come from finite rather than from classical geometry. The group actions in the 4×4 case of the theorem throw some light on the R. T. Curtis “miracle octad generator” approach to the large Mathieu group.
Version: 2 Owner: m759 Author(s): m759
1.24 equivalently oriented bases
equivalently oriented bases
1: $V$ finite-dimensional vector space
2: $(v_1, \ldots, v_n)$ ordered basis for $V$
3: $(u_1, \ldots, u_n)$ ordered basis for $V$
4: $A : V \to V$
5: $\forall i \in \{1, \ldots, n\} : A v_i = u_i$
6: $\det(A) > 0$

fact: there is a unique linear isomorphism taking a given basis to another given basis
1: $V$ finite-dimensional vector space
2: $(v_1, \ldots, v_n)$ ordered basis for $V$
3: $(u_1, \ldots, u_n)$ ordered basis for $V$
4: $\exists! A : V \to V$ linear isomorphism : $\forall i \in \{1, \ldots, n\} : A v_i = u_i$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.25 finitely generated $R$-module
finitely generated $R$-module
1: $X$ module over $R$
2: $Y \subset X$
3: $X$ generated by $Y$
4: $Y$ finite
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.26 fraction
A fraction is a rational number expressed in the form $\frac{n}{d}$ or $n/d$, where $n$ is designated the numerator and $d$ the denominator. The slash between them is known as a solidus when the fraction is expressed as $n/d$.

The fraction $n/d$ has value $n \div d$. For instance, $3/2 = 3 \div 2 = 1.5$.

If $n/d < 1$, then $n/d$ is known as a proper fraction. Otherwise, it is an improper fraction. If $n$ and $d$ are relatively prime, then $n/d$ is said to be in lowest terms. To get a fraction in lowest terms, simply divide the numerator and the denominator by their greatest common divisor:
$$\frac{60}{84} = \frac{60 \div 12}{84 \div 12} = \frac{5}{7}.$$

The rules for manipulating fractions are
$$\frac{a}{b} = \frac{ka}{kb}$$
$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$$
$$\frac{a}{b} - \frac{c}{d} = \frac{ad - bc}{bd}$$
$$\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$$
$$\frac{a}{b} \div \frac{c}{d} = \frac{ad}{bc}.$$
Version: 3 Owner: bwebste Author(s): digitalis
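Python's exact rational type reproduces these rules; the sketch below also reduces 60/84 to lowest terms via the gcd:

```python
from fractions import Fraction
from math import gcd

# reduce 60/84 to lowest terms with the greatest common divisor
n, d = 60, 84
g = gcd(n, d)
print(n // g, d // g)        # 5 7

# spot-check the manipulation rules with exact rational arithmetic
a, b, c, e = 2, 3, 5, 7      # e plays the role of the second denominator d
assert Fraction(a, b) + Fraction(c, e) == Fraction(a * e + b * c, b * e)
assert Fraction(a, b) - Fraction(c, e) == Fraction(a * e - b * c, b * e)
assert Fraction(a, b) * Fraction(c, e) == Fraction(a * c, b * e)
assert Fraction(a, b) / Fraction(c, e) == Fraction(a * e, b * c)
assert Fraction(a, b) == Fraction(10 * a, 10 * b)   # a/b = ka/kb
print("all rules hold")
```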
1.27 group of covering transformations
group of covering transformations
1: $(\{h : X \to X \mid h \text{ a covering transformation}\}, \circ)$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.28 idempotent
idempotent
1: $R$ ring
2: $e \in R$
3: $e^2 = e$

The following facts hold in commutative rings.

fact: if $e$ is idempotent, then $1 - e$ is idempotent
1: $R$ ring
2: $e \in R$
3: $e$ idempotent
4: $1 - e$ idempotent

fact: if $e$ is idempotent, then $eR$ is a ring
1: $R$ ring
2: $e \in R$
3: $e$ idempotent
4: $eR$ is a ring

fact: if $e$ is idempotent, then $eR$ has identity $e$
1: $R$ ring
2: $e \in R$
3: $e$ idempotent
4: $\forall r \in eR : er = re = r$

fact: if $e$ is idempotent, then $R \cong eR \times (1 - e)R$
1: $R$ ring
2: $e \in R$
3: $e$ idempotent
4: $R \cong eR \times (1 - e)R$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 3 Owner: bwebste Author(s): apmxi
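The facts are easy to probe in a small commutative ring such as $\mathbb{Z}/6\mathbb{Z}$ (a sketch; the choice of modulus 6 is ours, and arithmetic is done with Python's `%` operator):

```python
# idempotents of the commutative ring Z/6Z
idem = [e for e in range(6) if (e * e) % 6 == e]
print(idem)                  # [0, 1, 3, 4]

e = 3
assert ((1 - e) * (1 - e)) % 6 == (1 - e) % 6    # 1 - e is again idempotent
eR = sorted({(e * r) % 6 for r in range(6)})
print(eR)                    # [0, 3]
assert all((e * s) % 6 == s for s in eR)         # e is the identity of eR
```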
1.29 isolated
Let $X$ be a topological space, let $S \subset X$, and let $x \in S$. The point $x$ is said to be an isolated point of $S$ if there exists an open set $U \subset X$ such that $U \cap S = \{x\}$.

The set $S$ is isolated if every point in $S$ is an isolated point.
Version: 1 Owner: djao Author(s): djao
1.30 isolated singularity
isolated singularity
1: $f : U \subset \mathbb{C} \to \mathbb{C} \cup \{\infty\}$
2: $z_0 \in U$
3: $f$ analytic on $U \setminus \{z_0\}$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.31 isomorphic groups
isomorphic groups
1: $(X_1, *_1), (X_2, *_2)$ groups
2: $f : X_1 \to X_2$ isomorphism
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.32 joint continuous density function
Let $X_1, X_2, \ldots, X_n$ be $n$ random variables all defined on the same probability space. The joint continuous density function of $X_1, X_2, \ldots, X_n$, denoted by $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, is the function
$$f_{X_1, X_2, \ldots, X_n} : \mathbb{R}^n \to \mathbb{R}$$
such that
$$\int_{(-\infty, -\infty, \ldots, -\infty)}^{(x_1, x_2, \ldots, x_n)} f_{X_1, X_2, \ldots, X_n}(u_1, u_2, \ldots, u_n)\, du_1\, du_2 \cdots du_n = F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$$

As in the case where $n = 1$, this function satisfies:
1. $f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) \ge 0 \quad \forall (x_1, \ldots, x_n)$
2. $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1, X_2, \ldots, X_n}(u_1, u_2, \ldots, u_n)\, du_1\, du_2 \cdots du_n = 1$

As in the single variable case, $f_{X_1, X_2, \ldots, X_n}$ does not represent the probability that each of the random variables takes on each of the values.
Version: 4 Owner: Riemann Author(s): Riemann
1.33 joint cumulative distribution function
Let $X_1, X_2, \ldots, X_n$ be $n$ random variables all defined on the same probability space. The joint cumulative distribution function of $X_1, X_2, \ldots, X_n$, denoted by $F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, is the following function:
$$F_{X_1, X_2, \ldots, X_n} : \mathbb{R}^n \to \mathbb{R}$$
$$F_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = P[X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n]$$

As in the unidimensional case, this function satisfies:
1. $\lim_{(x_1, \ldots, x_n) \to (-\infty, \ldots, -\infty)} F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = 0$ and $\lim_{(x_1, \ldots, x_n) \to (\infty, \ldots, \infty)} F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = 1$
2. $F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n)$ is a monotone, nondecreasing function.
3. $F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n)$ is continuous from the right in each variable.

The way to evaluate $F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n)$ is the following:
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} \cdots \int_{-\infty}^{x_n} f_{X_1, X_2, \ldots, X_n}(u_1, \ldots, u_n)\, du_1\, du_2 \cdots du_n$$
(if $X$ is continuous) or
$$F_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = \sum_{i_1 \le x_1, \ldots, i_n \le x_n} f_{X_1, X_2, \ldots, X_n}(i_1, \ldots, i_n)$$
(if $X$ is discrete), where $f_{X_1, X_2, \ldots, X_n}$ is the joint density function of $X_1, \ldots, X_n$.
Version: 3 Owner: Riemann Author(s): Riemann
1.34 joint discrete density function
Let $X_1, X_2, \ldots, X_n$ be $n$ random variables all defined on the same probability space. The joint discrete density function of $X_1, X_2, \ldots, X_n$, denoted by $f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, is the following function:
$$f_{X_1, X_2, \ldots, X_n} : \mathbb{R}^n \to \mathbb{R}$$
$$f_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = P[X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n]$$

As in the single variable case, sometimes it's expressed as $p_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$ to mark the difference between this function and the continuous joint density function.

Also, as in the case where $n = 1$, this function satisfies:
1. $f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) \ge 0 \quad \forall (x_1, \ldots, x_n)$
2. $\sum_{x_1, \ldots, x_n} f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = 1$

In this case, $f_{X_1, X_2, \ldots, X_n}(x_1, \ldots, x_n) = P[X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n]$.
Version: 3 Owner: Riemann Author(s): Riemann
1.35 left function notation
We are said to be using left function notation if we write functions to the left of their
arguments. That is, if $\alpha \colon X \to Y$ is a function and $x \in X$, then $\alpha x$ is the image of $x$ under
$\alpha$.

Furthermore, if we have a function $\beta \colon Y \to Z$, then we write the composition of the two
functions as $\beta\alpha \colon X \to Z$, and the image of $x$ under the composition as $\beta\alpha x = (\beta\alpha)x = \beta(\alpha x)$.
Compare this to right function notation.
Version: 1 Owner: antizeus Author(s): antizeus
1.36 lift of a submanifold
lift of a submanifold
1: $X, Y$ topological manifolds
2: $Z \subset Y$ submanifold
3: $g \colon Z \to Y$ inclusion
4: $\tilde{g}$ lift of $g$
5: $i(\tilde{g})$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.37 limit of a real function exists at a point

Let $X \subset \mathbb{R}$ be an open set of real numbers and $f \colon X \to \mathbb{R}$ a function.

If $x_0 \in X$, we say that $f$ is continuous at $x_0$ if for any $\varepsilon > 0$ there exists $\delta$ positive such that

$$|f(x) - f(x_0)| < \varepsilon$$

whenever

$$|x - x_0| < \delta.$$

Based on apmxi
Version: 2 Owner: drini Author(s): drini, apmxi
1.38 lipschitz function
lipschitz function

1: $f \colon \mathbb{R} \to \mathbb{C}$
2: $\exists M \in \mathbb{R} : \forall x, y \in \mathbb{R} : |f(x) - f(y)| < M|x - y|$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.39 lognormal random variable
$X$ is a Lognormal random variable with parameters $\mu$ and $\sigma^2$ if

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}\,x}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \quad x > 0$$

Parameters:

• $\mu \in \mathbb{R}$
• $\sigma^2 > 0$

syntax:

$$X \sim \mathrm{LogN}(\mu, \sigma^2)$$

Notes:

1. $X$ is a random variable such that $\ln(X)$ is a normal random variable with mean $\mu$ and
variance $\sigma^2$.

2. $E[X] = e^{\mu + \sigma^2/2}$

3. $Var[X] = e^{2\mu + \sigma^2}\left(e^{\sigma^2} - 1\right)$

4. $M_X(t)$ not useful
Version: 2 Owner: Riemann Author(s): Riemann
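Note 2 can be spot-checked by Monte Carlo with the standard library; the parameter values and seed below are arbitrary choices for illustration.

```python
import math
import random

# Spot-check of E[X] = exp(mu + sigma^2/2) for X ~ LogN(mu, sigma^2);
# parameters and seed are arbitrary.
random.seed(0)
mu, sigma = 0.5, 0.25

n = 200_000
sample_mean = sum(random.lognormvariate(mu, sigma) for _ in range(n)) / n
exact_mean = math.exp(mu + sigma ** 2 / 2)

print(abs(sample_mean - exact_mean) < 0.01)
```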
1.40 lowest upper bound
Let $S$ be a set with an ordering relation $\leq$, and let $A$ be a subset of $S$. A lowest upper bound
of $A$ is an upper bound $x$ of $A$ with the property that $x \leq y$ for every upper bound $y$ of $A$.
A lowest upper bound of $A$, when it exists, is unique.

Greatest lower bound is defined similarly: a greatest lower bound of $A$ is a lower bound $x$ of
$A$ with the property that $x \geq y$ for every lower bound $y$ of $A$.
Version: 3 Owner: djao Author(s): djao
1.41 marginal distribution
Given random variables $X_1, X_2, \ldots, X_n$ and a subset $I \subset \{1, 2, \ldots, n\}$, the marginal
distribution of the random variables $X_i : i \in I$ is the following:

$$f_{\{X_i : i \in I\}}(\mathbf{x}) = \sum_{\{x_i : i \notin I\}} f_{X_1,\ldots,X_n}(x_1,\ldots,x_n)$$

or

$$f_{\{X_i : i \in I\}}(\mathbf{x}) = \int_{\{x_i : i \notin I\}} f_{X_1,\ldots,X_n}(u_1,\ldots,u_n) \prod_{\{u_i : i \notin I\}} du_i,$$

summing if the variables are discrete and integrating if the variables are continuous.

That is, the marginal distribution of a set of random variables $X_1, \ldots, X_n$ can be obtained by
summing (or integrating) the joint distribution over all values of the other variables.

The most common marginal distribution is the individual marginal distribution (i.e., the
marginal distribution of ONE random variable).
Version: 4 Owner: Riemann Author(s): Riemann
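Summing out the other variables is a one-liner over a tabulated joint pmf. The two-coin joint distribution below is an assumed example, not from the entry.

```python
from fractions import Fraction

# Assumed example: joint pmf of (X1, X2) for two independent fair coins
# with outcomes 0/1.
joint = {(x1, x2): Fraction(1, 4) for x1 in (0, 1) for x2 in (0, 1)}

def marginal(joint_pmf, keep_index):
    """Sum the joint pmf over every coordinate except `keep_index`."""
    out = {}
    for outcome, p in joint_pmf.items():
        key = outcome[keep_index]
        out[key] = out.get(key, Fraction(0)) + p
    return out

print(marginal(joint, 0))  # {0: Fraction(1, 2), 1: Fraction(1, 2)}
```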
1.42 measurable space
A measurable space is a set $E$ together with a collection $\mathcal{B}(E)$ of subsets of $E$ which is a
sigma algebra.

The elements of $\mathcal{B}(E)$ are called measurable sets.
Version: 3 Owner: djao Author(s): djao
1.43 measure zero
measure zero

1: $(X, \mathcal{M}, \mu)$ measure space
2: $A \in \mathcal{M}$
3: $\mu(A) = 0$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.44 minimum spanning tree
Given a graph G with weighted edges, a minimum spanning tree is a spanning tree with
minimum weight, where the weight of a spanning tree is the sum of the weights of its edges.
There may be more than one minimum spanning tree for a graph, since it is the weight of
the spanning tree that must be minimum.
For example, here is a graph $G$ of weighted edges and a minimum spanning tree $T$ for that
graph. The edges of $T$ are drawn as solid lines, while edges in $G$ but not in $T$ are drawn as
dotted lines.

[Figure: a weighted graph $G$ with a minimum spanning tree $T$ drawn in solid lines.]
Prim’s algorithm or Kruskal’s algorithm can compute the minimum spanning tree of a graph.
Version: 3 Owner: Logan Author(s): Logan
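Kruskal's algorithm, mentioned above, can be sketched as follows with a union-find structure; the example graph is made up for illustration.

```python
# Sketch of Kruskal's algorithm: sort edges by weight and keep each edge
# that joins two different components, tracked by union-find.

def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v). Returns (total_weight, chosen_edges)."""
    parent = list(range(num_vertices))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    total, chosen = 0, []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:            # edge joins two components: keep it
            parent[ru] = rv
            total += w
            chosen.append((u, v))
    return total, chosen

# Assumed example graph on 4 vertices:
edges = [(3, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (1, 2, 3)]
weight, tree = kruskal(4, edges)
print(weight)  # 6: the edges of weight 1, 2 and 3
```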
1.45 minimum weighted path length
Given a list of weights, $W := \{w_1, w_2, \ldots, w_n\}$, the minimum weighted path length is the
minimum of the weighted path length of all extended binary trees that have $n$ external nodes
with weights taken from $W$. There may be multiple possible trees that give this minimum
path length, and quite often finding this tree is more important than determining the path
length.

Example

Let $W := \{1, 2, 3, 3, 4\}$. The minimum weighted path length is 29. A tree that gives this
weighted path length is shown below.
Applications
Constructing a tree of minimum weighted path length for a given set of weights has several
applications, particularly dealing with optimization problems. A simple and elegant
algorithm for constructing such a tree is Huffman's algorithm. Such a tree can give the most
optimal algorithm for merging $n$ sorted sequences (optimal merge). It can also provide a
means of compressing data (Huffman coding), as well as lead to optimal searches.
Version: 2 Owner: Logan Author(s): Logan
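Huffman's algorithm, mentioned above, also yields the minimum weighted path length directly: every merge of the two lightest subtrees adds their combined weight once, which accounts for each leaf under them sinking one level deeper. A minimal sketch:

```python
import heapq

def min_weighted_path_length(weights):
    """Huffman merging: repeatedly combine the two lightest subtrees."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b          # both subtrees sink one level deeper
        heapq.heappush(heap, a + b)
    return total

print(min_weighted_path_length([1, 2, 3, 3, 4]))  # 29, matching the example
```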
1.46 mod 2 intersection number
mod 2 intersection number

case: transversal map

1: $X$ smooth manifold
2: $X$ compact
3: $Y$ smooth manifold
4: $Z \subset Y$ closed submanifold
5: $f \colon X \to Y$ smooth
6: $Z$ and $X$ have complementary dimension
7: $f$ transversal to $Z$
8: $\# f^{-1}(Z) \pmod{2}$

case: nontransversal map

1: $X$ smooth manifold
2: $X$ compact
3: $Y$ smooth manifold
4: $Z \subset Y$ closed submanifold
5: $f \colon X \to Y$ smooth
6: $\dim(X) + \dim(Z) = \dim(Y)$
7: $g$ homotopic to $f$
8: $g$ transversal to $Z$
9: $\# g^{-1}(Z) \pmod{2}$

fact: a homotopic transversal map exists

1: $X$ smooth manifold
2: $X$ compact
3: $Y$ smooth manifold
4: $Z \subset Y$ closed submanifold
5: $f \colon X \to Y$ smooth
6: $\dim(X) + \dim(Z) = \dim(Y)$
7: $\exists g$ homotopic to $f$ : $g$ transversal to $Z$

fact: two homotopic transversal maps have the same mod 2 intersection
number

1: $X$ smooth manifold
2: $X$ compact
3: $Y$ smooth manifold
4: $Z \subset Y$ closed submanifold
5: $f_1, f_2 \colon X \to Y$ smooth
6: $f_1$ homotopic to $f_2$
7: $I_2(f_1, Z) = I_2(f_2, Z)$

fact: boundary theorem

1: $X$ manifold with boundary
2: $Y$ manifold
3: $Z \subset Y$ submanifold
4: $Z$ and $\partial X$ have complementary dimension
5: $g \colon \partial X \to Y$
6: $g$ can be extended to $X$
7: $I_2(g, Z) = 0$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.47 moment generating function
Given a random variable $X$, the moment generating function of $X$ is the following
function:

$$M_X(t) = E[e^{tX}] \text{ for } t \in \mathbb{R} \text{ (if the expectation converges).}$$

It can be shown that if the moment generating function of $X$ is defined on an interval around
the origin, then

$$E[X^k] = M_X^{(k)}(t)\Big|_{t=0}$$

In other words, the $k$th derivative of the moment generating function evaluated at zero is
the $k$th moment of $X$.
Version: 1 Owner: Riemann Author(s): Riemann
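The derivative property can be checked numerically. A sketch for a fair die (an assumed example): compare a finite-difference second derivative of $M_X$ at $0$ with $E[X^2]$.

```python
import math

# Assumed example: X uniform on {1,...,6}. The k-th derivative of
# M_X(t) = E[exp(tX)] at t = 0 should equal E[X^k]; here k = 2.
outcomes = [1, 2, 3, 4, 5, 6]

def mgf(t):
    return sum(math.exp(t * x) for x in outcomes) / len(outcomes)

h = 1e-5
second_derivative = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h ** 2
second_moment = sum(x ** 2 for x in outcomes) / len(outcomes)  # 91/6

print(abs(second_derivative - second_moment) < 1e-4)
```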
1.48 monoid
A monoid is a semigroup $G$ which contains an identity element; that is, there exists an
element $e \in G$ such that $e \cdot g = g \cdot e = g$ for all $g \in G$.
Version: 1 Owner: djao Author(s): djao
1.49 monotonic operator
For a poset $X$, an operator $f$ is a monotonic operator if for all $x, y \in X$, $x \leq y$ implies
$f(x) \leq f(y)$.
Version: 1 Owner: Logan Author(s): Logan
1.50 multidimensional Gaussian integral
Let $N(0, K)$ be an unnormalized multidimensional Gaussian with mean $0$ and covariance
matrix $K$, $K_{ij} = \mathrm{cov}(x_i, x_j)$. $K$ is symmetric by the identity $\mathrm{cov}(x_j, x_i) = \mathrm{cov}(x_i, x_j)$. Let
$\mathbf{x} = [x_1\ x_2\ \ldots\ x_n]^T$ and $d^n\mathbf{x} \Leftrightarrow \prod_{i=1}^n dx_i$.

It is easy to see that $N(0, K) = \exp\left(-\frac{1}{2}\mathbf{x}^T K^{-1} \mathbf{x}\right)$. How can we normalize $N(0, K)$?

We can show that

$$\int e^{-\frac{1}{2}\mathbf{x}^T K^{-1}\mathbf{x}}\, d^n\mathbf{x} = \left((2\pi)^n |K|\right)^{\frac{1}{2}} \qquad (1.50.1)$$

where $|K| = \det K$.

$K^{-1}$ is real and symmetric (since $(K^{-1})^T = (K^T)^{-1} = K^{-1}$). For convenience, let $A = K^{-1}$.
We can decompose $A$ into $A = T\Lambda T^{-1}$, where $T$ is an orthonormal ($T^T T = I$) matrix of
the eigenvectors of $A$ and $\Lambda$ is a diagonal matrix of the eigenvalues of $A$. Then

$$\int e^{-\frac{1}{2}\mathbf{x}^T A\mathbf{x}}\, d^n\mathbf{x} = \int e^{-\frac{1}{2}\mathbf{x}^T T\Lambda T^{-1}\mathbf{x}}\, d^n\mathbf{x}. \qquad (1.50.2)$$

Because $T$ is orthonormal, we have $T^{-1} = T^T$. Now define a new vector variable $\mathbf{y} \Leftrightarrow T^T\mathbf{x}$,
and substitute:

$$\int e^{-\frac{1}{2}\mathbf{x}^T T\Lambda T^{-1}\mathbf{x}}\, d^n\mathbf{x} = \int e^{-\frac{1}{2}\mathbf{x}^T T\Lambda T^T\mathbf{x}}\, d^n\mathbf{x} \qquad (1.50.3)$$
$$= \int e^{-\frac{1}{2}\mathbf{y}^T \Lambda\mathbf{y}}\, |J|\, d^n\mathbf{y} \qquad (1.50.4)$$

where $|J|$ is the determinant of the Jacobian matrix $J_{mn} = \frac{\partial x_m}{\partial y_n}$. In this case, $J = T$ and
thus $|J| = 1$.

Now we're in business, because $\Lambda$ is diagonal and thus the integral may be separated into
the product of $n$ independent Gaussians, each of which we can integrate separately using the
well-known formula

$$\int e^{-\frac{1}{2}at^2}\, dt = \left(\frac{2\pi}{a}\right)^{\frac{1}{2}}. \qquad (1.50.6)$$

Carrying out this program, we get

$$\int e^{-\frac{1}{2}\mathbf{y}^T\Lambda\mathbf{y}}\, d^n\mathbf{y} = \prod_{k=1}^n \int e^{-\frac{1}{2}\lambda_k y_k^2}\, dy_k \qquad (1.50.7)$$
$$= \prod_{k=1}^n \left(\frac{2\pi}{\lambda_k}\right)^{\frac{1}{2}} \qquad (1.50.8)$$
$$= \left(\frac{(2\pi)^n}{\prod_{k=1}^n \lambda_k}\right)^{\frac{1}{2}} \qquad (1.50.9)$$
$$= \left(\frac{(2\pi)^n}{|\Lambda|}\right)^{\frac{1}{2}} \qquad (1.50.10)$$

Now, we have $|A| = |T\Lambda T^{-1}| = |T||\Lambda||T^{-1}| = |\Lambda||T||T|^{-1} = |\Lambda|$, so this becomes

$$\int e^{-\frac{1}{2}\mathbf{x}^T A\mathbf{x}}\, d^n\mathbf{x} = \left(\frac{(2\pi)^n}{|A|}\right)^{\frac{1}{2}}. \qquad (1.50.12)$$

Substituting back in for $K^{-1}$, we get

$$\int e^{-\frac{1}{2}\mathbf{x}^T K^{-1}\mathbf{x}}\, d^n\mathbf{x} = \left(\frac{(2\pi)^n}{|K^{-1}|}\right)^{\frac{1}{2}} = \left((2\pi)^n |K|\right)^{\frac{1}{2}}, \qquad (1.50.13)$$

as promised.
Version: 4 Owner: drini Author(s): drini, drummond
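Equation (1.50.1) can be sanity-checked by brute-force quadrature in $n = 2$; the covariance matrix below is an arbitrary positive-definite choice for illustration.

```python
import math

# Check (2*pi)^n |K| = (integral)^2 in n = 2; K is an assumed example.
K = [[1.0, 0.5], [0.5, 1.0]]
det_K = K[0][0] * K[1][1] - K[0][1] * K[1][0]
# Inverse of the 2x2 matrix K:
A = [[ K[1][1] / det_K, -K[0][1] / det_K],
     [-K[1][0] / det_K,  K[0][0] / det_K]]

def integrand(x1, x2):
    quad = A[0][0]*x1*x1 + 2*A[0][1]*x1*x2 + A[1][1]*x2*x2  # x^T K^{-1} x
    return math.exp(-0.5 * quad)

# Midpoint rule on [-8, 8]^2; the Gaussian tails beyond are negligible.
steps, lo, hi = 400, -8.0, 8.0
h = (hi - lo) / steps
numeric = sum(integrand(lo + (i + 0.5) * h, lo + (j + 0.5) * h)
              for i in range(steps) for j in range(steps)) * h * h

exact = ((2 * math.pi) ** 2 * det_K) ** 0.5
print(abs(numeric - exact) < 1e-3)
```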
1.51 multiindex
multiindex
Let $n \in \mathbb{N}$. Then an element $\alpha \in \mathbb{N}^n$ is called a multiindex.
Version: 2 Owner: mike Author(s): mike, apmxi
1.52 near operators
1.52.1 Perturbations and small perturbations: definitions and some
results

We start our discussion on the Campanato theory of near operators with some preliminary
tools.

Let $X, Y$ be two sets and let a metric $d$ be defined on $Y$. If $F \colon X \to Y$ is an injective map,
we can define a metric on $X$ by putting:

$$d_F(x', x'') = d(F(x'), F(x'')).$$

Indeed, $d_F$ is zero if and only if $x' = x''$ (since $F$ is injective); $d_F$ is obviously symmetric and
the triangle inequality follows from the triangle inequality of $d$.

If moreover $F(X)$ is a complete subspace of $Y$, then $X$ is complete with respect to the metric $d_F$.

Indeed, let $(u_n)$ be a Cauchy sequence in $X$. By definition of $d_F$, then $(F(u_n))$ is a Cauchy
sequence in $Y$, and in particular in $F(X)$, which is complete. Thus, there exists $u_0 =
F(x_0) \in F(X)$ which is the limit of the sequence $(F(u_n))$. $x_0$ is the limit of $(u_n)$ in $(X, d_F)$,
which completes the proof.

A particular case of the previous statement is when $F$ is onto (and thus a bijection) and
$(Y, d)$ is complete.

Similarly, if $F(X)$ is compact in $Y$, then $X$ is compact with the metric $d_F$.
Definition 1. Let $X$ be a set and $Y$ be a metric space. Let $F, G$ be two maps from $X$ to
$Y$. We say that $G$ is a perturbation of $F$ if there exists a constant $k > 0$ such that for each
$x', x'' \in X$ one has:

$$d(G(x'), G(x'')) \leq k\, d(F(x'), F(x''))$$

Remark 1. In particular, if $F$ is injective then $G$ is a perturbation of $F$ if $G$ is uniformly
continuous with respect to the metric induced on $X$ by $F$.

Definition 2. In the same hypothesis as in the previous definition, we say that $G$ is a small
perturbation of $F$ if it is a perturbation of constant $k < 1$.
We can now prove this generalization of the Banach-Caccioppoli fixed point theorem:

Theorem 1. Let $X$ be a set and $(Y, d)$ be a complete metric space. Let $F, G$ be two mappings
from $X$ to $Y$ such that:

1. $F$ is bijective;

2. $G$ is a small perturbation of $F$.

Then, there exists a unique $u \in X$ such that $G(u) = F(u)$.

The hypothesis (1) ensures that the metric space $(X, d_F)$ is complete. If we now consider
the function $T \colon X \to X$ defined by

$$T(x) = F^{-1}(G(x))$$

we note that, by (2), we have

$$d(G(x'), G(x'')) \leq k\, d(F(x'), F(x''))$$

where $k \in (0, 1)$ is the constant of the small perturbation; note that, by the definition of $d_F$
and applying $F \circ F^{-1}$ to the first side, the last equation can be rewritten as

$$d_F(T(x'), T(x'')) \leq k\, d_F(x', x'');$$

in other words, since $k < 1$, $T$ is a contraction in the complete metric space $(X, d_F)$; therefore
(by the classical Banach-Caccioppoli fixed point theorem) $T$ has a unique fixed point: there
exists $u \in X$ such that $T(u) = u$; by definition of $T$ this is equivalent to $G(u) = F(u)$, and
the proof is hence complete.

Remark 2. The hypothesis of the theorem can be generalized as such: let $X$ be a set and
$Y$ a metric space (not necessarily complete); let $F, G$ be two mappings from $X$ to $Y$ such
that $F$ is injective, $F(X)$ is complete and $G(X) \subseteq F(X)$; then there exists $u \in X$ such that
$G(u) = F(u)$.

(Apply the theorem using $F(X)$ instead of $Y$ as target space.)

Remark 3. The Banach-Caccioppoli fixed point theorem is obtained when $X = Y$ and $F$ is
the identity.
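Theorem 1 can be illustrated numerically; the concrete $F$ and $G$ below are assumed toy choices. On $X = Y = \mathbb{R}$ take $F(x) = 2x$ and $G = \cos$; since $|\cos x' - \cos x''| \leq |x' - x''| = \frac{1}{2}|F(x') - F(x'')|$, $G$ is a small perturbation of $F$ with constant $k = 1/2$, and iterating $T = F^{-1} \circ G$ finds the unique $u$ with $G(u) = F(u)$.

```python
import math

# Assumed toy instance of Theorem 1: F(x) = 2x (bijective), G = cos
# (a small perturbation of F with k = 1/2).

def F_inv(y):
    return y / 2.0

G = math.cos

u = 0.0
for _ in range(100):
    u = F_inv(G(u))     # iterate T = F^{-1} o G, a contraction in d_F

# The limit solves G(u) = F(u), i.e. cos(u) = 2u.
print(abs(math.cos(u) - 2 * u) < 1e-12)
```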
We can use theorem 1 to prove a result that applies to perturbations which are not necessarily
small (i.e. for which the constant $k$ can be greater than one). To prove it, we must assume
some supplemental structure on the metric of $Y$: in particular, we have to assume that the
metric $d$ is invariant by dilations, that is, that $d(\alpha u', \alpha u'') = \alpha\, d(u', u'')$ for each $u', u'' \in Y$.
The most common case of such a metric is when the metric is deduced from a norm (i.e. when
$Y$ is a normed space, and in particular a Banach space). The result follows immediately:

Corollary 1. Let $X$ be a set and $(Y, d)$ be a complete metric space with a metric $d$ invariant
by dilations. Let $F, G$ be two mappings from $X$ to $Y$ such that $F$ is bijective and $G$ is a
perturbation of $F$, with constant $K > 0$.

Then, for each $M > K$ there exists a unique $u_M \in X$ such that $G(u_M) = M F(u_M)$.

The proof is an immediate consequence of theorem 1, given that the map $\tilde{G}(u) = G(u)/M$
is a small perturbation of $F$ (a property which is ensured by the dilation invariance of the
metric $d$).
We also have the following

Corollary 2. Let $X$ be a set and $(Y, d)$ be a complete, compact metric space with a metric
$d$ invariant by dilations. Let $F, G$ be two mappings from $X$ to $Y$ such that $F$ is bijective and
$G$ is a perturbation of $F$, with constant $K > 0$.

Then there exists at least one $u_\infty \in X$ such that $G(u_\infty) = K F(u_\infty)$.

Let $(e_n)$ be a decreasing sequence of real numbers greater than one, converging to one
($e_n \downarrow 1$), and let $M_n = e_n K$ for each $n \in \mathbb{N}$. We can apply corollary 1 to each $M_n$, obtaining
a sequence $u_n$ of elements of $X$ for which one has

$$G(u_n) = M_n F(u_n). \qquad (1.52.1)$$

Since $(X, d_F)$ is compact, there exists a subsequence of $u_n$ which converges to some $u_\infty$; by
continuity of $G$ and $F$ we can pass to the limit in (1.52.1), obtaining

$$G(u_\infty) = K F(u_\infty)$$

which completes the proof.

Remark 4. For corollary 2 we cannot ensure uniqueness of $u_\infty$, since in general the sequence
$u_n$ may change with the choice of $e_n$, and the limit might be different. So the corollary can
only be applied as an existence theorem.
1.52.2 Near operators

We can now introduce the concept of near operators and discuss some of their properties.

A historical remark: Campanato initially introduced the concept in Hilbert spaces; subsequently,
it was remarked that most of the theory could more generally be applied to Banach
spaces; indeed, it was also proven that the basic definition can be generalized to make part
of the theory available in the more general environment of metric vector spaces.

We will here discuss the theory in the case of Banach spaces, with only a couple of exceptions:
to see some of the extra properties that are available in Hilbert spaces and to discuss a
generalization of the Lax-Milgram theorem to metric vector spaces.
1.52.3 Basic definitions and properties

Definition 3. Let $X$ be a set and $Y$ a Banach space. Let $A, B$ be two operators from $X$
to $Y$. We say that $A$ is near $B$ if and only if there exist two constants $\alpha > 0$ and $k \in (0, 1)$
such that, for each $x', x'' \in X$ one has

$$\|B(x') - B(x'') - \alpha(A(x') - A(x''))\| \leq k\, \|B(x') - B(x'')\|$$

In other words, $A$ is near $B$ if $B - \alpha A$ is a small perturbation of $B$ for an appropriate value
of $\alpha$.

Observe that in general the property is not symmetric: if $A$ is near $B$, it is not necessarily
true that $B$ is near $A$; as we will briefly see, this can only be proven if $k < 1/2$, or in the
case that $Y$ is a Hilbert space, by using an equivalent condition that will be discussed later
on. Yet it is possible to define a topology with some interesting properties on the space of
operators, by using the concept of nearness to form a base.

The core point of the nearness between operators is that it allows us to "transfer" many
important properties from $B$ to $A$; in other words, if $B$ satisfies certain properties, and $A$ is
near $B$, then $A$ satisfies the same properties. To prove this, and to enumerate some of these
"nearness-invariant" properties, we will need a few important facts.

In what follows, unless differently specified, we will always assume that $X$ is a set, $Y$ is a
Banach space and $A, B$ are two operators from $X$ to $Y$.
Lemma 1. If $A$ is near $B$ then there exist two positive constants $M_1, M_2$ such that

$$\|B(x') - B(x'')\| \leq M_1\, \|A(x') - A(x'')\|$$
$$\|A(x') - A(x'')\| \leq M_2\, \|B(x') - B(x'')\|$$

We have:

$$\|B(x') - B(x'')\| \leq \|B(x') - B(x'') - \alpha(A(x') - A(x''))\| + \alpha\|A(x') - A(x'')\| \leq
k\, \|B(x') - B(x'')\| + \alpha\|A(x') - A(x'')\|$$

and hence

$$\|B(x') - B(x'')\| \leq \frac{\alpha}{1 - k}\, \|A(x') - A(x'')\|$$

which is the first inequality with $M_1 = \alpha/(1 - k)$ (which is positive since $k < 1$).

But also

$$\|A(x') - A(x'')\| \leq \frac{1}{\alpha}\, \|B(x') - B(x'') - \alpha(A(x') - A(x''))\| + \frac{1}{\alpha}\, \|B(x') - B(x'')\| \leq
\frac{k}{\alpha}\, \|B(x') - B(x'')\| + \frac{1}{\alpha}\, \|B(x') - B(x'')\|$$

and hence

$$\|A(x') - A(x'')\| \leq \frac{1 + k}{\alpha}\, \|B(x') - B(x'')\|$$

which is the second inequality with $M_2 = (1 + k)/\alpha$.
The most important corollary of the previous lemma is the following

Corollary 3. If $A$ is near $B$ then two points of $X$ have the same image under $A$ if and only
if they have the same image under $B$.

We can express the previous concept in the following formal way: for each $y$ in $B(X)$ there
exists $z$ in $Y$ such that $A(B^{-1}(y)) = \{z\}$, and conversely. In yet other words: each fiber of $A$
is a fiber (for a different point) of $B$, and conversely.

It is therefore possible to define a map $T_A \colon B(X) \to Y$ by putting $T_A(y) = z$; the range of
$T_A$ is $A(X)$. Conversely, it is possible to define $T_B \colon A(X) \to Y$, by putting $T_B(z) = y$; the
range of $T_B$ is $B(X)$. Both maps are injective and, if restricted to their respective ranges,
one is the inverse of the other.

Also observe that $T_B$ and $T_A$ are continuous. This follows from the fact that for each $x \in X$
one has

$$T_A(B(x)) = A(x), \qquad T_B(A(x)) = B(x)$$

and that the lemma ensures that given a sequence $(x_n)$ in $X$, the sequence $(B(x_n))$ converges
to $B(x_0)$ if and only if $(A(x_n))$ converges to $A(x_0)$.
We can now list some invariant properties of operators with respect to nearness. The properties
are given in the form "if and only if" because each operator is near itself (therefore
ensuring the "only if" part).

1. a map is injective iff it is near an injective operator;

2. a map is surjective iff it is near a surjective operator;

3. a map is open iff it is near an open map;

4. a map has dense range iff it is near a map with dense range.

To prove (2) it is necessary to use theorem 1.

Another important property that follows from the lemma is that if there exists $y \in Y$ such that
$A^{-1}(y) \cap B^{-1}(y) \neq \emptyset$, then $A^{-1}(y) = B^{-1}(y)$: intersecting fibers are equal. (Campanato
only stated this property for the case $y = 0$ and called it "the kernel property"; I prefer to
call it the "fiber persistence" property.)
A topology based on nearness

In this section we will show that the concept of nearness between operators can indeed be
connected to a topological understanding of the set of maps from $X$ to $Y$.

Let $\mathcal{M}$ be the set of maps between $X$ and $Y$. For each $F \in \mathcal{M}$ and for each $k \in (0, 1)$ we let
$U_k(F)$ be the set of all maps $G \in \mathcal{M}$ such that $F - G$ is a small perturbation of $F$ with constant
$k$. In other words, $G \in U_k(F)$ iff $G$ is near $F$ with constants $1, k$.

The set $\mathcal{U}(F) = \{U_k(F) \mid 0 < k < 1\}$ satisfies the axioms of the set of fundamental
neighbourhoods. Indeed:

1. $F$ belongs to each $U_k(F)$;

2. $U_k(F) \subset U_h(F)$ iff $k < h$, and thus the intersection property of neighbourhoods is
trivial;

3. for each $U_k(F)$ there exists $U_h(F)$ such that for each $G \in U_h(F)$ there exists $U_j(G) \subseteq U_k(F)$.
This last property (permanence of neighbourhoods) is somewhat less trivial, so we shall now
prove it.

Let $U_k(F)$ be given.

Let $U_h(F)$ be another arbitrary neighbourhood of $F$ and let $G$ be an arbitrary element in it.
We then have:

$$\|F(x') - F(x'') - (G(x') - G(x''))\| \leq h\, \|F(x') - F(x'')\|, \qquad (1.52.2)$$

but also (lemma 1)

$$\|G(x') - G(x'')\| \leq (1 + h)\, \|F(x') - F(x'')\|. \qquad (1.52.3)$$

Let also $U_j(G)$ be an arbitrary neighbourhood of $G$ and $H$ an arbitrary element in it. We
then have:

$$\|G(x') - G(x'') - (H(x') - H(x''))\| \leq j\, \|G(x') - G(x'')\|. \qquad (1.52.4)$$

The nearness between $F$ and $H$ is calculated as such:

$$\|F(x') - F(x'') - (H(x') - H(x''))\| \leq
\|F(x') - F(x'') - (G(x') - G(x''))\| + \|G(x') - G(x'') - (H(x') - H(x''))\| \leq
h\, \|F(x') - F(x'')\| + j\, \|G(x') - G(x'')\| \leq (h + j(1 + h))\, \|F(x') - F(x'')\|. \qquad (1.52.5)$$

We then want $h + j(1 + h) \leq k$, that is $j \leq (k - h)/(1 + h)$; the condition $0 < j < 1$ is always
satisfied on the right side, and the left side gives us $h < k$.

It is important to observe that the topology generated this way is not a Hausdorff topology:
indeed, it is not possible to separate $F$ and $F + y$ (where $F \in \mathcal{M}$ and $y$ is a constant element
of $Y$). On the other hand, the subset of all maps with a fixed value at a fixed point
($F(x_0) = y_0$) is a Hausdorff subspace.
Another important characteristic of the topology is that the set $\mathcal{H}$ of invertible operators
from $X$ to $Y$ is open in $\mathcal{M}$ (because a map is invertible iff it is near an invertible map). This
is not true in the topology of uniform convergence, as is easily seen by choosing $X = Y = \mathbb{R}$
and the sequence with generic element $F_n(x) = x^3 - x/n$: the sequence converges (in the
uniform convergence topology) to $F(x) = x^3$, which is invertible, but none of the $F_n$ is
invertible. Hence $F$ is an element of $\mathcal{H}$ which is not in the interior of $\mathcal{H}$, and $\mathcal{H}$ is not open.
1.52.4 Some applications

As we mentioned in the introduction, the Campanato theory of near operators allows us
to generalize some important theorems; we will now present some generalizations of the
Lax-Milgram theorem, and a generalization of the Riesz representation theorem.
[TODO]
Version: 5 Owner: Oblomov Author(s): Oblomov
1.53 negative binomial random variable
$X$ is a Negative binomial random variable with parameters $r$ and $p$ if

$$f_X(x) = \binom{r + x - 1}{x} p^r (1 - p)^x, \quad x \in \{0, 1, \ldots\}$$

Parameters:

• $r > 0$
• $p \in [0, 1]$

syntax:

$$X \sim \mathrm{NegBin}(r, p)$$

Notes:

1. If $r \in \mathbb{N}$, $X$ represents the number of failed Bernoulli trials before the $r$th success.
Note that if $r = 1$ the variable is a geometric random variable.

2. $E[X] = r\frac{1 - p}{p}$

3. $Var[X] = r\frac{1 - p}{p^2}$

4. $M_X(t) = \left(\frac{p}{1 - (1 - p)e^t}\right)^r$
Version: 2 Owner: Riemann Author(s): Riemann
1.54 normal random variable
$X$ is a Normal random variable with parameters $\mu$ and $\sigma^2$ if

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad x \in \mathbb{R}$$

Parameters:

• $\mu \in \mathbb{R}$
• $\sigma^2 > 0$

syntax:

$$X \sim N(\mu, \sigma^2)$$

Notes:

1. Probably the most frequently used distribution. $f_X(x)$ will look like a bell-shaped
function, hence justifying the synonym bell distribution.

2. When $\mu = 0$ and $\sigma^2 = 1$ the distribution is called standard normal.

3. The cumulative distribution function of $X$ is often called $\Phi(x)$.

4. $E[X] = \mu$

5. $Var[X] = \sigma^2$

6. $M_X(t) = e^{t\mu + t^2\frac{\sigma^2}{2}}$
Version: 4 Owner: Riemann Author(s): Riemann
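Note 6 can be spot-checked by Monte Carlo; the parameter values and seed below are arbitrary choices for illustration.

```python
import math
import random

# Spot-check of M_X(t) = exp(t*mu + t^2*sigma^2/2) for X ~ N(mu, sigma^2);
# parameters and seed are arbitrary.
random.seed(1)
mu, sigma, t = 1.0, 0.5, 0.3

n = 200_000
estimate = sum(math.exp(t * random.gauss(mu, sigma)) for _ in range(n)) / n
exact = math.exp(t * mu + t ** 2 * sigma ** 2 / 2)

print(abs(estimate - exact) / exact < 0.01)
```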
1.55 normalizer of a subset of a group
normalizer of a subset of a group
1: $X$ group
2: $Y \subset X$ subset
3: $\{x \in X \mid xYx^{-1} = Y\}$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: drini Author(s): apmxi
1.56 nth root
There are two often-used definitions of the nth root. The first discussed deals with real numbers
only; the second deals with complex numbers.

The nth root of a nonnegative real number $x$, written as $\sqrt[n]{x}$, can be defined as the real
number $y$ such that $y^n = x$. This notation is normally, but not always, used when $n$ is a
natural number. This definition could also be written as

$$\left(\sqrt[n]{x}\right)^n \Leftrightarrow x \quad \forall x \geq 0 \in \mathbb{R}.$$

Example: $\sqrt[4]{81} = 3$ because $3^4 = 3 \cdot 3 \cdot 3 \cdot 3 = 81$.

Example: $\sqrt[5]{x^5 + 5x^4 + 10x^3 + 10x^2 + 5x + 1} = x + 1$ because $(x + 1)^5 = (x^2 + 2x + 1)^2(x + 1) =
x^5 + 5x^4 + 10x^3 + 10x^2 + 5x + 1$. (See the binomial theorem and Pascal's Triangle.)
The nth root operation is distributive for multiplication and division, but not for addition
and subtraction. That is, $\sqrt[n]{xy} = \sqrt[n]{x}\,\sqrt[n]{y}$, and $\sqrt[n]{\frac{x}{y}} = \frac{\sqrt[n]{x}}{\sqrt[n]{y}}$. However, except in special
cases, $\sqrt[n]{x + y} \neq \sqrt[n]{x} + \sqrt[n]{y}$ and $\sqrt[n]{x - y} \neq \sqrt[n]{x} - \sqrt[n]{y}$.

Example: $\sqrt[4]{\frac{81}{625}} = \frac{3}{5}$ because $\left(\frac{3}{5}\right)^4 = \frac{3^4}{5^4} = \frac{81}{625}$.
The nth root notation is actually an alternative to exponentiation. That is, $\sqrt[n]{x} \Leftrightarrow x^{\frac{1}{n}}$. As
such, the nth root operation is associative with exponentiation. That is, $\sqrt[n]{x^3} = x^{\frac{3}{n}} = \left(\sqrt[n]{x}\right)^3$.

In this definition, $\sqrt[n]{x}$ is undefined when $x < 0$ and $n$ is even. When $n$ is odd and $x < 0$,
$\sqrt[n]{x} < 0$. Examples: $\sqrt[3]{-1} = -1$, but $\sqrt[4]{-1}$ is undefined for this definition.
A more generalized definition: The nth roots of a complex number $t = x + yi = (x, yi) = (r, \theta)$
are all the complex numbers $z_1, z_2, \ldots, z_n \in \mathbb{C}$ that satisfy the condition $z_k^n = t$. $n$ such
complex numbers always exist.

One of the more popular methods of finding these roots is through geometry and trigonometry.
The complex numbers are treated as a plane using Cartesian coordinates with an $x$ axis
and a $yi$ axis. (Remember, in the context of complex numbers, $i \Leftrightarrow \sqrt{-1}$.) These rectangular
coordinates $(x, yi)$ are then translated to polar coordinates $(r, \theta)$, where $r = \sqrt{x^2 + y^2}$
(according to the previous definition of nth root), $\theta = \frac{\pi}{2}$ if $x = 0$, and $\theta = \arctan\frac{y}{x}$ if $x \neq 0$.
(See the Pythagorean theorem.)

Then the nth roots of $t$ are the vertices of a regular polygon having $n$ sides, centered at
$(0, 0i)$, and having $(\sqrt[n]{r}, \frac{\theta}{n})$ as calculated above as one of its vertices.
Example: Consider $\sqrt[3]{8}$. 8 can also be written as $8 + 0i$ or in polar as $(8, 0)$. By our method, we
now have an equilateral triangle centered at $(0, 0)$ and having one vertex at $(2, 0)$. Knowing
that a complete circle consists of $2\pi$ radians, and knowing that all angles are equal in an
equilateral triangle, we can deduce that the other two vertices lie at polar coordinates $(2, \frac{2\pi}{3})$
and $(2, \frac{4\pi}{3})$. Translating back into rectangular coordinates, we have:

$$\sqrt[3]{8} = 2$$
$$\sqrt[3]{8} = 2\left(\cos\tfrac{2\pi}{3} + i\sin\tfrac{2\pi}{3}\right) = 2\left(-\tfrac{1}{2} + i\,\tfrac{\sqrt{3}}{2}\right) = -1 + i\sqrt{3}$$
$$\sqrt[3]{8} = 2\left(\cos\tfrac{4\pi}{3} + i\sin\tfrac{4\pi}{3}\right) = 2\left(-\tfrac{1}{2} - i\,\tfrac{\sqrt{3}}{2}\right) = -1 - i\sqrt{3}$$
Example: Consider $\sqrt[4]{-16}$. We can rewrite this as $\sqrt[4]{-1}\,\sqrt[4]{16} = 2\sqrt{i}$.

We can find $\sqrt{i}$ by using a formula for multiplying complex numbers in polar coordinates:
$(r_1, \theta_1) \cdot (r_2, \theta_2) = (r_1 r_2, \theta_1 + \theta_2)$. So $0 + i = (r^2, 2\theta)$. Therefore, $r = \sqrt[4]{0^2 + 1^2} = 1$ and
$\theta = \frac{\pi}{4}$. So $\sqrt{i} = (1, \frac{\pi}{4})$, and doubling that we get $(2, \frac{\pi}{4})$.

Now we have a square centered at polar coordinates $(0, 0)$ with one corner at $(2, \frac{\pi}{4})$. Adding
$\frac{\pi}{2}$ to the angle repeatedly gives us the remainder of the corners: $(2, \frac{3\pi}{4})$, $(2, \frac{5\pi}{4})$, $(2, \frac{7\pi}{4})$.
Translating these to rectangular coordinates works as in the previous example.
So the four solutions to $\sqrt[4]{-16}$ are $\sqrt{2} + i\sqrt{2}$, $-\sqrt{2} + i\sqrt{2}$, $-\sqrt{2} - i\sqrt{2}$, and $\sqrt{2} - i\sqrt{2}$.
Example: Consider $\sqrt[3]{1 + i}$. As in the previous examples, our first step is to convert $1 + 1i$
into polar coordinates. We get $r = \sqrt{1^2 + 1^2} = \sqrt{2}$ and $\theta = \arctan 1 = \frac{\pi}{4}$, giving a polar
coordinate of $(\sqrt{2}, \frac{\pi}{4})$. Now we take the cube root of this complex number: $(\sqrt{2}, \frac{\pi}{4}) = (r^3, 3\theta)$.
We get coordinates $(\sqrt[6]{2}, \frac{\pi}{12})$. This point is one vertex of an equilateral triangle centered at
$(0, 0)$. The other two vertices of the triangle are derived from adding $\frac{2\pi}{3}$ to $\theta$. We know this
because lines from the center of an equilateral triangle to each of the corners will form three
equal angles of width $\frac{2\pi}{3}$ about the center, and because all three vertices of an equilateral
triangle will be the same distance from the center.
So the other vertices in polar coordinates are $(\sqrt[6]{2}, \frac{3\pi}{4})$ and $(\sqrt[6]{2}, \frac{17\pi}{12})$. Most people would
just use a calculator to compute the sines and cosines of these angles, but they can be
interpolated using these handy identities:

$$\cos 2t = 1 - 2\sin^2 t \quad \text{(use this to calculate } \sin(\tfrac{\pi}{12}) \text{ from } \sin(\tfrac{\pi}{3}) = \tfrac{\sqrt{3}}{2}\text{)}$$
$$\sin(a + b) = \sin(a)\cos(b) + \cos(a)\sin(b) \quad \text{(use } a = \tfrac{3\pi}{4} \text{ and } b = \tfrac{2\pi}{3}\text{)}$$
$$\cos(a + b) = \cos(a)\cos(b) - \sin(a)\sin(b)$$

The process of calculating these values is left as an exercise to the reader in the interest of
space. The rectangular coordinates, the cube roots of $1 + i$, are:
$$\left(\sqrt[6]{2}, \tfrac{\pi}{12}\right) = \sqrt[6]{2}\,\frac{\sqrt{6} + \sqrt{2}}{4} + i\,\sqrt[6]{2}\,\frac{\sqrt{6} - \sqrt{2}}{4}$$
$$\left(\sqrt[6]{2}, \tfrac{3\pi}{4}\right) = -\frac{\sqrt[6]{16}}{2} + i\,\frac{\sqrt[6]{16}}{2}$$
$$\left(\sqrt[6]{2}, \tfrac{17\pi}{12}\right) = \sqrt[6]{2}\,\frac{\sqrt{2} - \sqrt{6}}{4} - i\,\sqrt[6]{2}\,\frac{\sqrt{2} + \sqrt{6}}{4}$$
Version: 8 Owner: mathcam Author(s): mathcam, wberry
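The polar method above is exactly what the following sketch implements with Python's standard library (the helper name is ours): take the nth root of the modulus, divide the argument by $n$, and step around the circle by $2\pi/n$.

```python
import cmath
import math

def nth_roots(t, n):
    """All n-th roots of the complex number t, by the polar method."""
    r, theta = cmath.polar(t)
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * math.pi * k) / n)
            for k in range(n)]

# The four fourth roots of -16 from the worked example:
for z in nth_roots(-16, 4):
    print(complex(round(z.real, 6), round(z.imag, 6)))
```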
1.57 null tree
A null tree is simply a tree with zero nodes.
Version: 1 Owner: Logan Author(s): Logan
1.58 open ball
Let $(X, \rho)$ be a metric space and $x_0 \in X$. Let $r$ be a positive number. The set

$$B(x_0, r) = \{x \in X : \rho(x, x_0) < r\}$$

is called the ball with center $x_0$ and radius $r$. On some spaces like $\mathbb{C}$ or $\mathbb{R}^2$ this is also known
as an open disk and when the space is $\mathbb{R}$, it is known as an open interval (all three spaces with
standard metric).
Version: 2 Owner: drini Author(s): drini, apmxi
1.59 opposite ring
If $R$ is a ring, then we may construct the opposite ring $R^{\mathrm{op}}$ which has the same underlying
abelian group structure, but with multiplication in the opposite order: the product of $r_1$ and
$r_2$ in $R^{\mathrm{op}}$ is $r_2 r_1$.

If $M$ is a left $R$-module, then it can be made into a right $R^{\mathrm{op}}$-module, where a module
element $m$, when multiplied on the right by an element $r$ of $R^{\mathrm{op}}$, yields the $rm$ that we
have with our left $R$-module action on $M$. Similarly, right $R$-modules can be made into left
$R^{\mathrm{op}}$-modules.

If $R$ is a commutative ring, then it is equal to its own opposite ring.
Version: 1 Owner: antizeus Author(s): antizeus
1.60 orbitstabilizer theorem
Given a group action $G$ on a set $X$, define $Gx$ to be the orbit of $x$ and $G_x$ to be the set of
stabilizers of $x$. For each $x \in X$ the correspondence $g(x) \mapsto gG_x$ is a bijection between $Gx$
and the set of left cosets of $G_x$.

A famous corollary is that

$$|Gx| \cdot |G_x| = |G| \quad \forall x \in X$$
Version: 8 Owner: vitriol Author(s): vitriol
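The corollary can be verified directly for a small group; here $G = S_3$ acting on $\{0, 1, 2\}$, an illustrative choice.

```python
from itertools import permutations

# Check |Gx| * |G_x| = |G| for G = S_3 acting on {0, 1, 2}; each group
# element g is stored as a tuple with g[i] = image of i.
G = list(permutations(range(3)))        # all 6 elements of S_3

for x in range(3):
    orbit = {g[x] for g in G}
    stabilizer = [g for g in G if g[x] == x]
    assert len(orbit) * len(stabilizer) == len(G)

print("orbit-stabilizer identity verified for S_3")
```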
1.61 orthogonal
The deﬁnition of orthogonal varies depending on the mathematical constructs in question.
There are particular deﬁnitions for
• orthogonal matrices
• orthogonal polynomials
• orthogonal vectors
In general, two objects are orthogonal if they do not “coincide” in some sense. Sometimes
orthogonal means roughly the same thing as “perpendicular”.
Version: 2 Owner: akrowne Author(s): akrowne
1.62 permutation group on a set
permutation group on a set

1: $A$ set
2: $(S_A, \circ)$ symmetric group
3: $X < S_A$
4: $(X, \circ)$

fact: conjugating stabilizer of an element by permutation produces
stabilizer of permuted element

1: $A$ set
2: $a \in A$
3: $X$ permutation group on $A$
4: $\sigma \in X$
5: $\sigma\,\mathrm{Stab}_X(a)\,\sigma^{-1} = \mathrm{Stab}_X(\sigma(a))$

fact: if a permutation group acts transitively, then the intersection
of conjugated stabilizers is the identity

1: $A$ set
2: $a \in A$
3: $X$ permutation group on $A$
4: $\bigcap_{\sigma \in X} \sigma\,\mathrm{Stab}_X(a)\,\sigma^{-1} = 1$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.63 prime element
An element $p$ in a ring $R$ is a prime element if it generates a prime ideal. If $R$ is commutative,
this is equivalent to saying that for all $a, b \in R$, if $p$ divides $ab$, then $p$ divides $a$ or $p$ divides $b$.

When $R = \mathbb{Z}$ the prime elements as formulated above are simply prime numbers.
Version: 3 Owner: dublisk Author(s): dublisk
1.64 product measure
Let $(E_1, \mathcal{B}_1(E_1))$ and $(E_2, \mathcal{B}_2(E_2))$ be two measurable spaces, with measures $\mu_1$ and $\mu_2$. Let
$\mathcal{B}_1 \times \mathcal{B}_2$ be the sigma algebra on $E_1 \times E_2$ generated by subsets of the form $B_1 \times B_2$, where
$B_1 \in \mathcal{B}_1(E_1)$ and $B_2 \in \mathcal{B}_2(E_2)$.

The product measure $\mu_1 \times \mu_2$ is defined to be the unique measure on the measurable space
$(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ satisfying the property

$$\mu_1 \times \mu_2(B_1 \times B_2) = \mu_1(B_1)\,\mu_2(B_2) \quad \text{for all } B_1 \in \mathcal{B}_1(E_1),\ B_2 \in \mathcal{B}_2(E_2).$$
Version: 2 Owner: djao Author(s): djao
1.65 projective line
projective line
example
1: $\ell = \{[X, Y, Z, W] \in \mathbb{RP}^3 \mid Z = W = 0\}$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.66 projective plane
projective plane
1: $\sim \colon S^2 \times S^2 \to \{0, 1\}$
2: $x \sim y \Leftrightarrow y = -x$
3: $p \colon S^2 \to S^2/\!\sim$
4: quotient space obtained from $p$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 2 Owner: bhaire Author(s): bhaire, apmxi
1.67 proof of calculus theorem used in the Lagrange
method
Let $f(\mathbf{x})$ and $g_i(\mathbf{x})$, $i = 1, \ldots, m$ be differentiable scalar functions; $\mathbf{x} \in \mathbb{R}^n$.

We will find local extremes of the function $f(\mathbf{x})$ where $\nabla f = 0$. This can be proved by
contradiction:
$$\nabla f \neq 0 \;\Leftrightarrow\; \exists \varepsilon_0 > 0,\ \forall \varepsilon,\ 0 < \varepsilon < \varepsilon_0 : f(\mathbf{x} - \varepsilon \nabla f) < f(\mathbf{x}) < f(\mathbf{x} + \varepsilon \nabla f)$$
but then $f(\mathbf{x})$ is not a local extreme.

Now we put up some conditions, such that we should find the $\mathbf{x} \in S \subset \mathbb{R}^n$ that gives a local
extreme of $f$. Let $S = \bigcap_{i=1}^{m} S_i$, and let $S_i$ be defined so that $g_i(\mathbf{x}) = 0 \ \forall \mathbf{x} \in S_i$.

Any vector $\mathbf{x} \in \mathbb{R}^n$ can have one component perpendicular to the subset $S_i$ (for visualization,
think $n = 3$ and let $S_i$ be a flat surface). $\nabla g_i$ will be perpendicular to $S_i$, because:
$$\exists \varepsilon_0 > 0,\ \forall \varepsilon,\ 0 < \varepsilon < \varepsilon_0 : g_i(\mathbf{x} - \varepsilon \nabla g_i) < g_i(\mathbf{x}) < g_i(\mathbf{x} + \varepsilon \nabla g_i)$$
But $g_i(\mathbf{x}) = 0$, so any vector $\mathbf{x} + \varepsilon \nabla g_i$ must be outside $S_i$, and also outside $S$. (todo: I have
proved that there might exist a component perpendicular to each subset $S_i$, but not that
there exists only one; this should be done)

By the argument above, $\nabla f$ must be zero, but now we can ignore all components of $\nabla f$
perpendicular to $S$. (todo: this should be expressed more formally and proved)

So we will have a local extreme within $S_i$ if there exists a $\lambda_i$ such that
$$\nabla f = \lambda_i \nabla g_i$$
We will have local extreme(s) within $S$ where there exists a set $\lambda_i$, $i = 1, \ldots, m$ such that
$$\nabla f = \sum_i \lambda_i \nabla g_i$$
Version: 2 Owner: tobix Author(s): tobix
1.68 proof of orbitstabilizer theorem
The correspondence is clearly surjective. It is injective because if $g G_x = g' G_x$ then $g = g' h$
for some $h \in G_x$. Therefore $g(x) = g'(h(x)) = g'(x)$.
Version: 1 Owner: vitriol Author(s): vitriol
1.69 proof of power rule
The power rule can be derived by repeated application of the product rule.
Proof for all positive integers $n$:

The power rule has been shown to hold for $n = 0$ and $n = 1$. If the power rule is known to
hold for some $k > 0$, then we have
$$\begin{aligned}
\frac{d}{dx} x^{k+1} &= \frac{d}{dx}\left(x \cdot x^k\right) \\
&= x \left(\frac{d}{dx} x^k\right) + x^k \\
&= x \cdot (k x^{k-1}) + x^k \\
&= k x^k + x^k \\
&= (k+1) x^k
\end{aligned}$$
Thus the power rule holds for all positive integers $n$.

Proof for all positive rationals $n$:

Let $y = x^{p/q}$. We need to show
$$\frac{d}{dx}\left(x^{p/q}\right) = \frac{p}{q}\, x^{p/q - 1}. \qquad (1.69.1)$$
The proof of this comes from implicit differentiation. By definition, we have $y^q = x^p$. We now
take the derivative with respect to $x$ on both sides of the equality:
$$\begin{aligned}
\frac{d}{dx}\, y^q &= \frac{d}{dx}\, x^p \\
\frac{d}{dy}(y^q)\, \frac{dy}{dx} &= p x^{p-1} \\
q y^{q-1}\, \frac{dy}{dx} &= p x^{p-1} \\
\frac{dy}{dx} &= \frac{p}{q}\, x^{p-1} y^{1-q} \\
&= \frac{p}{q}\, x^{p-1} x^{(p/q)(1-q)} \\
&= \frac{p}{q}\, x^{p-1} x^{p/q - p} \\
&= \frac{p}{q}\, x^{p-1+p/q-p} \\
&= \frac{p}{q}\, x^{p/q - 1}
\end{aligned}$$

Proof for all positive irrationals $n$:

For positive irrationals we appeal to continuity, using the fact that (1.69.1) holds for all positive
rationals, and that there are positive rationals approaching any positive irrational.

Proof for negative powers $n$:

We again employ implicit differentiation. Let $u = x$, and differentiate $u^{-n}$ with respect to $x$
for some nonnegative $n$. We must show
$$\frac{d u^{-n}}{dx} = -n u^{-n-1}. \qquad (1.69.2)$$
By definition we have $u^n u^{-n} = 1$. We begin by taking the derivative with respect to $x$ on
both sides of the equality. By application of the product rule we get
$$\begin{aligned}
\frac{d}{dx}\left(u^n u^{-n}\right) &= \frac{d}{dx}\, 1 \\
u^n \frac{d u^{-n}}{dx} + u^{-n} \frac{d u^n}{dx} &= 0 \\
u^n \frac{d u^{-n}}{dx} + u^{-n}\,(n u^{n-1}) &= 0 \\
u^n \frac{d u^{-n}}{dx} &= -n u^{-1} \\
\frac{d u^{-n}}{dx} &= -n u^{-n-1}
\end{aligned}$$
Version: 3 Owner: alek thiery Author(s): alek thiery, Logan
1.70 proof of primitive element theorem
Let $f_a \in F[x]$, respectively $f_b \in F[x]$, be the monic irreducible polynomial satisfied by $a$,
respectively $b$. If $K$ is an extension of $F$ that splits $f_a f_b$, then $K$ is normal over $F$, and so
there are a finite number of subfields of $K$ containing $F$, as many as there are subgroups of
$\mathrm{Gal}(K/F)$, by the Fundamental Theorem of Galois Theory. Let $c_k = a + kb$ with $k \in F$, and
consider the fields $F(c_k)$. Since $F$ has characteristic 0, there are infinitely many choices for $k$.
But $F \subset F(c_k) \subset F(a, b) \subset K$, so by the above there are only finitely many $F(c_k)$. Therefore,
for some $k_i, k_j \in F$, $F(c_{k_i}) = F(c_{k_j})$. Then $c_{k_j} \in F(c_{k_i})$, and so $c_{k_i} - c_{k_j} = (k_i - k_j) b \in F(c_{k_i})$,
and thus $b \in F(c_{k_i})$. Then also $a = c_{k_i} - k_i b \in F(c_{k_i})$, which gives $F(a, b) \subset F(c_{k_i})$. But we
also have $F(c_{k_i}) \subset F(a, b)$, and thus $F(a, b) = F(c_{k_i})$, QED.
Version: 1 Owner: sucrose Author(s): sucrose
1.71 proof of product rule
$$\begin{aligned}
\frac{d}{dx}[f(x)g(x)] &= \lim_{h \to 0} \frac{f(x+h)g(x+h) - f(x)g(x)}{h} \\
&= \lim_{h \to 0} \frac{f(x+h)g(x+h) - f(x+h)g(x) + f(x+h)g(x) - f(x)g(x)}{h} \\
&= \lim_{h \to 0} \left[ f(x+h)\,\frac{g(x+h) - g(x)}{h} + g(x)\,\frac{f(x+h) - f(x)}{h} \right] \\
&= f(x)g'(x) + f'(x)g(x)
\end{aligned}$$
Version: 1 Owner: Logan Author(s): Logan
1.72 proof of sum rule
$$\begin{aligned}
\frac{d}{dx}[f(x) + g(x)] &= \lim_{h \to 0} \frac{f(x+h) + g(x+h) - f(x) - g(x)}{h} \\
&= \lim_{h \to 0} \left[ \frac{f(x+h) - f(x)}{h} + \frac{g(x+h) - g(x)}{h} \right] \\
&= f'(x) + g'(x)
\end{aligned}$$
Version: 1 Owner: Logan Author(s): Logan
1.73 proof that countable unions are countable
Let $C$ be a countable collection of countable sets. We will show that $\bigcup C$ is countable.

Let $P$ be the set of positive primes. $P$ is countably infinite, so there is a bijection between
$P$ and $\mathbb{N}$. Since there is a bijection between $C$ and a subset of $\mathbb{N}$, there must in turn be a
one-to-one function $f : C \to P$.

Each $S \in C$ is countable, so there exists a bijection between $S$ and some subset of $\mathbb{N}$. Call
this function $g$, and define a new function $h_S : S \to \mathbb{N}$ such that for all $x \in S$,
$$h_S(x) = f(S)^{g(x)}$$
Note that $h_S$ is one-to-one. Also note that for any distinct pair $S, T \in C$, the range of $h_S$
and the range of $h_T$ are disjoint due to the fundamental theorem of arithmetic.

We may now define a one-to-one function $h : \bigcup C \to \mathbb{N}$, where, for each $x \in \bigcup C$, $h(x) =
h_S(x)$ for some $S \in C$ where $x \in S$ (the choice of $S$ is irrelevant, so long as it contains $x$).
Since the range of $h$ is a subset of $\mathbb{N}$, $h$ is a one-to-one correspondence with that subset, and hence
$\bigcup C$ is countable.
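The prime-power encoding used in the proof can be sketched concretely in Python. The particular enumerations playing the roles of $f$ and $g$ below are ad hoc choices; any one-to-one assignments would do:

```python
def primes():
    """Yield the positive primes by trial division (enough for a demo)."""
    n = 2
    while True:
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def encode_union(collection):
    """Map each element of the union injectively into N via f(S)**g(x):
    f assigns a distinct prime to each set, g enumerates each set from 1."""
    gen = primes()
    codes = {}
    for S in collection:
        p = next(gen)                                      # f(S)
        for exponent, x in enumerate(sorted(S), start=1):  # g(x) >= 1
            codes.setdefault(x, p ** exponent)  # first containing set wins
    return codes

C = [{0, 1, 2}, {10, 20}, {1, 20, 300}]
codes = encode_union(C)
# Distinct elements of the union receive distinct codes, as the
# fundamental theorem of arithmetic guarantees.
assert len(set(codes.values())) == len(codes) == 6
```

Starting the exponents at 1 matters: with exponent 0, every prime power would collapse to 1 and the ranges of the $h_S$ would no longer be disjoint.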
Version: 2 Owner: vampyr Author(s): vampyr
1.74 quadrature
Quadrature is the computation of a univariate deﬁnite integral. It can refer to either
numerical or analytic techniques; one must gather from context which is meant.
Cubature refers to higher-dimensional definite integral computation.
Some numerical quadrature methods are Simpson’s rule, the trapezoidal rule, and Riemann sums.
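As a minimal sketch of one such numerical method, the composite trapezoidal rule can be written as follows (the test integrand and step count are arbitrary choices for illustration):

```python
import math

def trapezoid(f, a, b, n=1000):
    """Composite trapezoidal rule: approximate the definite integral of f
    over [a, b] using n equal subintervals."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * s

# The definite integral of sin over [0, pi] is exactly 2.
approx = trapezoid(math.sin, 0.0, math.pi)
assert abs(approx - 2.0) < 1e-5
```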
Version: 4 Owner: akrowne Author(s): akrowne
48
1.75 quotient module
quotient module
1: $R$ is a ring
2: $Y$ a module over $R$
3: $Z$ is a submodule of $Y$
4: $Y/Z$ is the additive group of cosets of $Z$ in $Y$
5: $r(y + Z) = ry + Z$ gives the module structure
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.76 regular expression
A regular expression is a particular metasyntax for specifying regular grammars, which
has many useful applications.
While variations abound, fundamentally a regular expression consists of the following components.
Parentheses can be used for grouping and nesting, and must contain a fully-formed regular
expression. The | symbol can be used for denoting alternatives. Some specifications do not
provide nesting or alternatives. There are also a number of postfix operators. The ? operator
means that the preceding element can either be present or not present, and corresponds to a
rule of the form $A \to B \mid \lambda$. The * operator means that the preceding element can be present
zero or more times, and corresponds to a rule of the form $A \to BA \mid \lambda$. The + operator
means that the preceding element can be present one or more times, and corresponds to a
rule of the form $A \to BA \mid B$. Note that while these rules are not immediately in regular
form, they can be transformed so that they are.
Here is an example of a regular expression that speciﬁes a grammar that generates the binary
representation of all multiples of 3 (and only multiples of 3).
(0*(1(01*0)*1)*)*0*
This specifies an equivalent context-free grammar (in BNF):

S ::= A C
A ::= C E A | λ
C ::= 0 C | λ
E ::= 1 F 1
F ::= D F | λ
D ::= 0 G 0
G ::= 1 G | λ
A little further work is required to transform this grammar into an acceptable form for
regular grammars, but it can be shown that this grammar (and any grammar speciﬁed by a
regular expression) is equivalent to some regular grammar.
Regular expressions have many applications. Quite often they are used for powerful string
matching and substitution features in many text editors and programming languages.
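For instance, the multiples-of-3 expression above can be checked directly with Python's re module (a sketch; re implements a richer syntax than the minimal one described here, but the expression uses only the common core):

```python
import re

# The regular expression from the text, anchored so it must match the
# whole string: binary representations of multiples of 3.
MULT3 = re.compile(r'^(0*(1(01*0)*1)*)*0*$')

# Check it against every n below 200: it should accept exactly the
# multiples of 3, written in binary.
for n in range(200):
    assert bool(MULT3.match(format(n, 'b'))) == (n % 3 == 0)
```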
Version: 1 Owner: Logan Author(s): Logan
1.77 regular language
A regular grammar is a context-free grammar where all productions must take one of the
following forms (specified here in BNF; λ is the empty string):
<nonterminal> ::= terminal
<nonterminal> ::= terminal <nonterminal>
<nonterminal> ::= λ
A regular language is the set of strings generated by a regular grammar. Regular grammars
are also known as Type-3 grammars in the Chomsky hierarchy.
A regular grammar can be represented by a deterministic or nondeterministic ﬁnite automaton.
Such automata can serve to either generate or accept sentences in a particular regular
language. Note that since the set of regular languages is a subset of the context-free
languages, any deterministic or nondeterministic finite automaton can be simulated by a
pushdown automaton.
Version: 2 Owner: Logan Author(s): Logan
1.78 right function notation
We are said to be using right function notation if we write functions to the right of their
arguments. That is, if $\alpha : X \to Y$ is a function and $x \in X$, then $x\alpha$ is the image of $x$ under
$\alpha$.

Furthermore, if we have a function $\beta : Y \to Z$, then we write the composition of the two
functions as $\alpha\beta : X \to Z$, and the image of $x$ under the composition as $x\alpha\beta = x(\alpha\beta) = (x\alpha)\beta$.
Compare this to left function notation.
Version: 1 Owner: antizeus Author(s): antizeus
1.79 ring homomorphism
Let $R$ and $S$ be rings. A ring homomorphism is a function $f : R \to S$ such that:

• $f(a + b) = f(a) + f(b)$ for all $a, b \in R$
• $f(a \cdot b) = f(a) \cdot f(b)$ for all $a, b \in R$

When working in a context in which all rings have a multiplicative identity, one also requires
that $f(1_R) = 1_S$.
Version: 3 Owner: djao Author(s): djao
1.80 scalar
A scalar is a quantity that is invariant under coordinate transformation, also known as a
tensor of rank 0. For example, the number 1 is a scalar, as is any number or variable $s \in \mathbb{R}$.
The point $(3, 4)$ is not a scalar because it varies under rotation. A scalar can
be an element of a field over which a vector space is defined.
Version: 3 Owner: slider142 Author(s): slider142
1.81 Schrödinger operator
Schrödinger operator
1: $V : \mathbb{R} \to \mathbb{R}$
2: $y \mapsto -\dfrac{d^2 y}{dx^2} + V(x)\,y$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.82 selection sort
The Problem
See the Sorting Problem.
The Algorithm
Suppose $L = \{x_1, x_2, \ldots, x_n\}$ is the initial list of unsorted elements. The selection sort
algorithm sorts this list in $n$ steps. At each step $i$, find the largest element $L[j]$ such that
$j \leq n - i + 1$, and swap it with the element at $L[n - i + 1]$. So, for the first step, find the
largest value in the list and swap it with the last element in the list. For the second step,
find the largest value in the list up to (but not including) the last element, and swap it with
the next to last element. This is continued for $n - 1$ steps. Thus the selection sort algorithm
is a very simple, in-place sorting algorithm.
Pseudocode

Algorithm Selection_Sort(L, n)
Input: A list L of n elements
Output: The list L in sorted order

begin
    for i ← n downto 2 do
    begin
        temp ← L[i]
        max ← 1
        for j ← 2 to i do
            if L[j] > L[max] then
                max ← j
        L[i] ← L[max]
        L[max] ← temp
    end
end
Analysis
The selection sort algorithm has the same runtime for any set of $n$ elements, no matter
what the values or order of those elements are. Finding the maximum element of a list of
$i$ elements requires $i - 1$ comparisons. Thus $T(n)$, the number of comparisons required to
sort a list of $n$ elements with the selection sort, can be found:
$$T(n) = \sum_{i=2}^{n} (i - 1) = \sum_{i=1}^{n} i \;-\; n = \frac{n^2 - n}{2} = \mathcal{O}(n^2)$$
However, the number of data movements is the number of swaps required, which is $n - 1$. This
algorithm is very similar to the insertion sort algorithm. It requires fewer data movements,
but requires more comparisons.
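The pseudocode above translates to Python roughly as follows (0-indexed, so the loop bounds shift by one):

```python
def selection_sort(L):
    """In-place selection sort: at each step, move the largest remaining
    element to the end of the unsorted prefix."""
    n = len(L)
    for i in range(n - 1, 0, -1):       # i = n-1 .. 1 (0-indexed)
        max_idx = 0
        for j in range(1, i + 1):
            if L[j] > L[max_idx]:
                max_idx = j
        L[i], L[max_idx] = L[max_idx], L[i]
    return L

assert selection_sort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]
```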
Version: 1 Owner: Logan Author(s): Logan
1.83 semiring
A semiring is an algebra $(A, \cdot, +, 0, 1)$ on a set $A$, where 0 and 1 are constants, $(A, \cdot, 1)$ is
a monoid, $(A, +, 0)$ is a commutative monoid, $\cdot$ distributes over $+$ from the left and right,
and 0 is both a left and right annihilator ($0a = a0 = 0$). Often $a \cdot b$ is written simply as $ab$,
and the semiring $(A, \cdot, +, 0, 1)$ as simply $A$.

The relation $\leq$ on a semiring $A$ is defined as $a \leq b$ if and only if there exists some $c \in A$
such that $a + c = b$, and is a quasi-ordering. If $+$ is idempotent over $A$ (that is, $a + a = a$
holds for all $a \in A$), then $\leq$ is a partial ordering.

Addition and (left and right) multiplication are monotonic operators with respect to $\leq$, with
0 as the minimal element.
Version: 2 Owner: Logan Author(s): Logan
1.84 simple function
Let $(X, \mathcal{B})$ be a measurable space. Let $\chi_{A_k}$, $k = 1, 2, \ldots, n$ be the characteristic functions
of sets $A_k \in \mathcal{B}$. We call $f$ a simple function if it can be written as
$$f = \sum_{k=1}^{n} c_k \chi_{A_k}, \quad c_k \in \mathbb{R}, \qquad (1.84.1)$$
for some $n \in \mathbb{N}$.
Version: 2 Owner: drummond Author(s): drummond
1.85 simple path
A simple path in a graph is a path that contains no vertex more than once. By deﬁnition,
cycles are particular instances of simple paths.
Version: 1 Owner: Logan Author(s): Logan
1.86 solutions of an equation
solutions of an equation
1: $\{x \mid f(x) = 0\}$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.87 spanning tree
A spanning tree of a (connected) graph $G$ is a connected, acyclic subgraph of $G$ that
contains all of the vertices of $G$. Below is an example of a spanning tree $T$, where the edges
in $T$ are drawn as solid lines and the edges in $G$ but not in $T$ are drawn as dotted lines.
[figure: a ten-vertex graph; the solid edges form the spanning tree T, and the dotted edges belong to G but not to T]
For any tree there is exactly one spanning tree: the tree itself.
Version: 2 Owner: Logan Author(s): Logan
1.88 square root
The square root of a nonnegative real number $x$, written as $\sqrt{x}$, is the real number $y$ such
that $y^2 = x$. Equivalently, $(\sqrt{x})^2 = x$, or $\sqrt{x} \cdot \sqrt{x} = x$.

Example: $\sqrt{9} = 3$ because $3^2 = 3 \cdot 3 = 9$.

Example: $\sqrt{x^2 + 2x + 1} = x + 1$ because $(x+1)^2 = (x+1)(x+1) = x^2 + x + x + 1 = x^2 + 2x + 1$.

In some situations it is better to allow two values for $\sqrt{x}$. For example, $\sqrt{4} = \pm 2$ because
$2^2 = 4$ and $(-2)^2 = 4$.

The square root operation is distributive for multiplication and division, but not for addition
and subtraction. That is, $\sqrt{xy} = \sqrt{x}\sqrt{y}$ and $\sqrt{\dfrac{x}{y}} = \dfrac{\sqrt{x}}{\sqrt{y}}$.
However, in general, $\sqrt{x + y} \neq \sqrt{x} + \sqrt{y}$ and $\sqrt{x - y} \neq \sqrt{x} - \sqrt{y}$.

Example: $\sqrt{x^2 y^2} = xy$ because $(xy)^2 = xy \cdot xy = x \cdot x \cdot y \cdot y = x^2 y^2$.

Example: $\sqrt{\dfrac{9}{25}} = \dfrac{3}{5}$ because $\left(\dfrac{3}{5}\right)^2 = \dfrac{3^2}{5^2} = \dfrac{9}{25}$.

The square root notation is actually an alternative to exponentiation. That is, $\sqrt{x} = x^{1/2}$. As
such, the square root operation is compatible with exponentiation. That is, $\sqrt{x^3} = x^{3/2} = (\sqrt{x})^3$.

Negative real numbers do not have real square roots. For example, $\sqrt{-4}$ is not a real number.
Proof by contradiction: suppose $\sqrt{-4} = x \in \mathbb{R}$. If $x$ is negative, $x^2$ is positive, and if $x$ is
positive, $x^2$ is also positive. But $x$ cannot be zero either, because $0^2 = 0 \neq -4$. So $\sqrt{-4} \notin \mathbb{R}$.

For additional discussion of the square root and negative numbers, see the discussion of
complex numbers.
Version: 9 Owner: wberry Author(s): wberry
1.89 stable sorting algorithm
A stable sorting algorithm is any sorting algorithm that preserves the relative ordering of
items with equal values. For instance, consider a list of ordered pairs
$L := \{(A,3), (B,5), (C,2), (D,5), (E,4)\}$.
If a stable sorting algorithm sorts $L$ on the second value in each pair using the $<$ relation,
then the result is guaranteed to be $\{(C,2), (A,3), (E,4), (B,5), (D,5)\}$. However, if an
algorithm is not stable, then it is possible that $(D,5)$ may come before $(B,5)$ in the sorted
output.
Some examples of stable sorting algorithms are bubblesort and mergesort (although the
stability of mergesort is dependent upon how it is implemented). Some examples of unstable
sorting algorithms are heapsort and quicksort (quicksort could be made stable, but then it
wouldn’t be quick any more). Stability is a useful property when the total ordering relation
is dependent upon initial position. Using a stable sorting algorithm means that sorting by
ascending position for equal keys is built-in, and need not be implemented explicitly in the
comparison operator.
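Python's built-in sorted is a stable sort (Timsort), so the behavior described above can be checked directly:

```python
# Sort a list of pairs on the second component; stability guarantees that
# the two pairs with key 5 keep their original relative order.
L = [('A', 3), ('B', 5), ('C', 2), ('D', 5), ('E', 4)]
out = sorted(L, key=lambda pair: pair[1])
assert out == [('C', 2), ('A', 3), ('E', 4), ('B', 5), ('D', 5)]
```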
Version: 3 Owner: Logan Author(s): Logan
1.90 standard deviation
Given a random variable $X$, the standard deviation of $X$ is defined as
$$SD[X] = \sqrt{\mathrm{Var}[X]}.$$
The standard deviation is a measure of the variation of $X$ around the expected value.
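As a quick illustration using Python's statistics module (pstdev is the population standard deviation, the square root of the mean squared deviation; the data set is an arbitrary example):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
# Mean is 5; the squared deviations average to 4, so SD = 2.
sd = statistics.pstdev(data)
assert sd == 2.0
```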
Version: 1 Owner: Riemann Author(s): Riemann
1.91 stochastic independence
The random variables $X_1, X_2, \ldots, X_n$ are stochastically independent (or just independent)
if
$$F_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdots F_{X_n}(x_n) \quad \forall (x_1, \ldots, x_n) \in \mathbb{R}^n$$
That is, the random variables $X_1, \ldots, X_n$ are independent if their joint distribution function can
be expressed as the product of the marginal distributions of the variables, evaluated at the
corresponding points.

This definition implies all of the following:

1. $f_{X_1,\ldots,X_n}(x_1, \ldots, x_n) = f_{X_1}(x_1) \cdots f_{X_n}(x_n)$ for all $(x_1, \ldots, x_n) \in \mathbb{R}^n$ (joint density function)

2. $M_{X_1+\cdots+X_n}(t) = M_{X_1}(t) \cdots M_{X_n}(t)$ (moment generating function)

3. $E\left[\prod_{i=1}^{n} X_i\right] = \prod_{i=1}^{n} E[X_i]$ (expectation)

However, only the first two above imply independence. See also correlation.

There are other definitions of independence, too.
Version: 3 Owner: Riemann Author(s): Riemann
1.92 substring
Given a string $s \in \Sigma^*$, a string $t$ is a substring of $s$ if $s = utv$ for some strings $u, v \in \Sigma^*$.

For example, $p$, $a$, $ba$, $apba$, and $\lambda$ (the empty string) are all substrings of the string $apba$.
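The definition can be checked mechanically (a sketch; Python's in operator performs the same test for nonempty strings):

```python
def is_substring(t, s):
    """t is a substring of s iff s = u + t + v for some strings u, v."""
    return any(s[i:i + len(t)] == t for i in range(len(s) - len(t) + 1))

s = "apba"
assert all(is_substring(t, s) for t in ["p", "a", "ba", "apba", ""])
assert not is_substring("ab", s)   # "ab" never occurs contiguously
```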
Version: 2 Owner: Logan Author(s): Logan
1.93 successor
Given a set $S$, the successor of $S$ is the set $S \cup \{S\}$. One often denotes the successor of $S$
by $S'$.
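A small illustration using frozenset (which is hashable, so sets can contain sets); iterating the successor from the empty set produces the von Neumann naturals:

```python
def successor(s):
    """Successor of a set s: the set s ∪ {s}."""
    s = frozenset(s)
    return s | frozenset([s])

# Von Neumann naturals: 0 = {}, 1 = {0}, 2 = {0, 1}, ...
zero = frozenset()
one = successor(zero)
two = successor(one)
assert len(zero) == 0 and len(one) == 1 and len(two) == 2
assert zero in one and zero in two and one in two
```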
Version: 1 Owner: djao Author(s): djao
1.94 sum rule
The sum rule states that
$$\frac{d}{dx}[f(x) + g(x)] = f'(x) + g'(x)$$
Proof
See the proof of the sum rule.
Examples
$$\frac{d}{dx}(x + 1) = \frac{d}{dx}x + \frac{d}{dx}1 = 1$$
$$\frac{d}{dx}(x^2 - 3x + 2) = \frac{d}{dx}x^2 + \frac{d}{dx}(-3x) + \frac{d}{dx}(2) = 2x - 3$$
$$\frac{d}{dx}(\sin x + \cos x) = \frac{d}{dx}\sin x + \frac{d}{dx}\cos x = \cos x - \sin x$$
Version: 3 Owner: Logan Author(s): Logan
1.95 superset
Given two sets $A$ and $B$, $A$ is a superset of $B$ if every element in $B$ is also in $A$. We
denote this relation as $A \supseteq B$. This is equivalent to saying that $B$ is a subset of $A$, that is
$A \supseteq B \Leftrightarrow B \subseteq A$.

Similar rules to those that hold for $\subseteq$ also hold for $\supseteq$. If $X \supseteq Y$ and $Y \supseteq X$, then $X = Y$. Every set
is a superset of itself, and every set is a superset of the empty set.

$A$ is a proper superset of $B$ if $A \supseteq B$ and $A \neq B$. This relation is often denoted as $A \supset B$.
Unfortunately, $A \supset B$ is often used to mean the more general superset relation as well, and thus it
should be made explicit when "proper superset" is intended.
Version: 2 Owner: Logan Author(s): Logan
1.96 symmetric polynomial
A polynomial $f \in R[x_1, \ldots, x_n]$ in $n$ variables with coefficients in a ring $R$ is symmetric if
$\sigma(f) = f$ for every permutation $\sigma$ of the set $\{x_1, \ldots, x_n\}$.

Every symmetric polynomial can be written as a polynomial expression in the elementary symmetric polynomials.
Version: 2 Owner: djao Author(s): djao
1.97 the argument principle
the argument principle
1: $f$ meromorphic in $\Omega$
2: $\forall\, 0 < i < n : f(a_i) = 0$
3: $\forall\, 0 < i < m : f(b_i) = \infty$
4: $\gamma$ a cycle
5: $\gamma$ homologous to zero with respect to $\Omega$
6: $\forall a_i \notin \operatorname{im}(\gamma),\ \forall b_i \notin \operatorname{im}(\gamma)$:
$$\frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = \sum_{j=0}^{n} \operatorname{ind}_\gamma(a_j) - \sum_{k=0}^{m} \operatorname{ind}_\gamma(b_k)$$
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.98 torsionfree module
torsionfree module
1: $R$ an integral domain
2: $X$ a left module over $R$
3: $X_t$ the torsion submodule
4: $X_t = 0$

fact: a finitely generated torsion-free module is a free module
1: $X$ a finitely generated $R$-module
2: $X$ torsion-free
3: $X$ free
(to be fixed)
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 2 Owner: drini Author(s): drini, apmxi
1.99 total order
A total order is a special case of a partial order. If $\leq$ is a partial order on $A$, then it
satisfies the following three properties:

1. Reflexivity: $a \leq a$ for all $a \in A$
2. Antisymmetry: if $a \leq b$ and $b \leq a$ for any $a, b \in A$, then $a = b$
3. Transitivity: if $a \leq b$ and $b \leq c$ for any $a, b, c \in A$, then $a \leq c$

The relation $\leq$ is a total order if it satisfies the above three properties and the following
additional property:

4. Comparability: for any $a, b \in A$, either $a \leq b$ or $b \leq a$.
Version: 2 Owner: Logan Author(s): Logan
1.100 tree traversals
A tree traversal is an algorithm for visiting all the nodes in a rooted tree exactly once.
The constraint is on rooted trees, because the root is taken to be the starting point of the
traversal. A traversal is also deﬁned on a forest in the sense that each tree in the forest can
be iteratively traversed (provided one knows the roots of every tree beforehand). This entry
presents a few common and simple tree traversals.
In the description of a tree, the notion of rootedsubtrees was presented. Full understanding
of this notion is necessary to understand the traversals presented here, as each of these
traversals depends heavily upon this notion.
In a traversal, there is the notion of visiting a node. Visiting a node often consists of doing
some computation with that node. The traversals are deﬁned here without any notion of
what is being done to visit a node, and simply indicate where the visit occurs (and most
importantly, in what order).
Examples of each traversal will be illustrated on the following binary tree.
        •
      /   \
     •     •
    / \   / \
   •   • •   •
Vertices will be numbered in the order they are visited, and edges will be drawn with arrows
indicating the path of the traversal.
Preorder Traversal
Given a rooted tree, a preorder traversal consists of ﬁrst visiting the root, and then
executing a preorder traversal on each of the root’s children (if any).
For example:

        1
      /   \
     2     5
    / \   / \
   3   4 6   7
The term preorder refers to the fact that a node is visited before any of its descendents.
A preorder traversal is deﬁned for any rooted tree. As pseudocode, the preorder traversal is
Algorithm PreorderTraversal(r, Visit)
Input: A node r of a binary tree, with children left(r) and right(r), and some computation
Visit defined for r
Output: Visits the nodes of the subtree rooted at r in a preorder traversal

begin
    Visit(r)
    PreorderTraversal(left(r), Visit)
    PreorderTraversal(right(r), Visit)
end
Postorder Traversal
Given a rooted tree, a postorder traversal consists of ﬁrst executing a postorder traversal
on each of the root’s children (if any), and then visiting the root.
For example:

        7
      /   \
     3     6
    / \   / \
   1   2 4   5
As with the preorder traversal, the term postorder here refers to the fact that a node is
visited after all of its descendents. A postorder traversal is deﬁned for any rooted tree. As
pseudocode, the postorder traversal is
Algorithm PostorderTraversal(r, Visit)
Input: A node r of a binary tree, with children left(r) and right(r), and some computation
Visit defined for r
Output: Visits the nodes of the subtree rooted at r in a postorder traversal

begin
    PostorderTraversal(left(r), Visit)
    PostorderTraversal(right(r), Visit)
    Visit(r)
end
Inorder Traversal
Given a binary tree, an inorder traversal consists of executing an inorder traversal on
the root’s left child (if present), then visiting the root, then executing an inorder traversal
on the root’s right child (if present). Thus all of a root’s left descendents are visited before
the root, and the root is visited before any of its right descendents.
For example:

        4
      /   \
     2     6
    / \   / \
   1   3 5   7
As can be seen, the inorder traversal has the wonderful property of traversing a tree from
left to right (if the tree is visualized as it has been drawn here). The term inorder comes
from the fact that an inorder traversal of a binary search tree visits the data associated with
the nodes in sorted order. As pseudocode, the inorder traversal is
Algorithm InOrderTraversal(r, Visit)
Input: A node r of a binary tree, with children left(r) and right(r), and some computation
Visit defined for r
Output: Visits the nodes of the subtree rooted at r in an inorder traversal

begin
    InOrderTraversal(left(r), Visit)
    Visit(r)
    InOrderTraversal(right(r), Visit)
end
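The three traversals can be sketched in Python on the example tree, using (value, left, right) tuples as an ad hoc node representation (empty subtrees are None):

```python
def preorder(node, out):
    """Root, then left subtree, then right subtree."""
    if node:
        out.append(node[0]); preorder(node[1], out); preorder(node[2], out)

def inorder(node, out):
    """Left subtree, then root, then right subtree."""
    if node:
        inorder(node[1], out); out.append(node[0]); inorder(node[2], out)

def postorder(node, out):
    """Left subtree, then right subtree, then root."""
    if node:
        postorder(node[1], out); postorder(node[2], out); out.append(node[0])

def leaf(v):
    return (v, None, None)

# The complete binary tree from the examples, with values chosen so that
# an inorder walk reads the nodes left to right.
tree = (4, (2, leaf(1), leaf(3)), (6, leaf(5), leaf(7)))

for walk, expected in [(preorder, [4, 2, 1, 3, 6, 5, 7]),
                       (inorder, [1, 2, 3, 4, 5, 6, 7]),
                       (postorder, [1, 3, 2, 5, 7, 6, 4])]:
    out = []
    walk(tree, out)
    assert out == expected
```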
Version: 3 Owner: Logan Author(s): Logan
1.101 trie
A trie is a digital tree for storing a set of strings in which there is one node for every preﬁx
of every string in the set. The name comes from the word retrieval, and thus is pronounced
the same as tree (which leads to much confusion when spoken aloud). The word retrieval is
stressed because a trie has a lookup time that is proportional to the length of the string being
looked up.
If a trie is to store some set of strings $S \subseteq \Sigma^*$ (where $\Sigma$ is an alphabet), then it takes the
following form. Each edge leading to a non-leaf node in the trie is labelled by an element
of $\Sigma$. Any edge leading to a leaf node is labelled by \$ (some symbol not in $\Sigma$). For every
string $s \in S$, there is a path from the root of the trie to a leaf, the labels of which when
concatenated form $s \mathbin{+\!\!+} \$$ (where $+\!\!+$ is the string concatenation operator). For every path
from the root of the trie to a leaf, the labels of the edges concatenated form some string in
$S$.
Example
Suppose we wish to store the set of strings $S := \{apba, bata, baan, baant, baat\}$. The trie that
stores $S$ would be

(root)
 ├─ a ─ p ─ b ─ a ─ $
 └─ b ─ a ─┬─ t ─ a ─ $
           └─ a ─┬─ n ─┬─ $
                 │     └─ t ─ $
                 └─ t ─ $
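A trie can be sketched with nested dictionaries as an ad hoc representation, with '$' playing the role of the terminator symbol:

```python
END = '$'  # terminator, assumed not to occur in the alphabet

def build_trie(words):
    """Build a trie as nested dicts: one node per prefix, with a $-edge
    marking the end of each stored word."""
    root = {}
    for w in words:
        node = root
        for ch in w + END:
            node = node.setdefault(ch, {})
    return root

def contains(trie, w):
    """Membership test; the lookup time is proportional to len(w)."""
    node = trie
    for ch in w + END:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_trie(["apba", "bata", "baan", "baant", "baat"])
assert contains(trie, "baan") and contains(trie, "baant")
assert not contains(trie, "ba")   # a mere prefix is not a member
```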
Version: 4 Owner: Logan Author(s): Logan
1.102 unit vector
A unit vector is a vector with a length, or vector norm, of one. In $\mathbb{R}^n$, one can obtain such a
vector by dividing a nonzero vector by its magnitude $|v|$. For example, take the vector $\langle 1, 2, 3 \rangle$.
A unit vector pointing in this direction would be
$$\frac{1}{|\langle 1, 2, 3 \rangle|} \langle 1, 2, 3 \rangle = \frac{1}{\sqrt{14}} \langle 1, 2, 3 \rangle = \left\langle \frac{1}{\sqrt{14}}, \frac{2}{\sqrt{14}}, \frac{3}{\sqrt{14}} \right\rangle.$$
The magnitude of this vector is 1.
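The normalization above can be sketched as:

```python
import math

def normalize(v):
    """Return the unit vector in the direction of the nonzero vector v."""
    mag = math.sqrt(sum(x * x for x in v))
    return [x / mag for x in v]

u = normalize([1, 2, 3])
# The result has magnitude 1 (up to floating-point rounding).
assert abs(math.sqrt(sum(x * x for x in u)) - 1.0) < 1e-12
```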
Version: 7 Owner: slider142 Author(s): slider142
1.103 unstable ﬁxed point
A ﬁxed point is considered unstable if it is neither attracting nor Liapunov stable. A saddle
point is an example of such a ﬁxed point.
Version: 1 Owner: armbrusterb Author(s): armbrusterb
1.104 weak* convergence in normed linear space
weak* convergence in normed linear space
1: $(x'_n) \subset X'$
2: $X$ a Banach space
3: $\exists x' \in X' : \forall x \in X : \lim_{n \to \infty} x'_n(x) = x'(x)$
4: If $X$ is reflexive, then weak-* convergence is the same as weak convergence
Note: This is a “seed” entry written using a shorthand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.105 wellordering principle for natural numbers
Every nonempty set $S$ of nonnegative integers contains a least element; that is, there is some
integer $a$ in $S$ such that $a \leq b$ for all $b$ belonging to $S$.

For example, the positive integers are a well-ordered set under the standard order.
Version: 5 Owner: KimJ Author(s): KimJ
Chapter 2
00-01 – Instructional exposition
(textbooks, tutorial papers, etc.)
2.1 dimension
The word dimension in mathematics has many deﬁnitions, but all of them are trying to
quantify our intuition that, for example, a sheet of paper has somehow one less dimension
than a stack of papers.
One common way to deﬁne dimension is through some notion of a number of independent
quantities needed to describe an element of an object. For example, it is natural to say
that the sheet of paper is two-dimensional because one needs two real numbers to specify
a position on the sheet, whereas the stack of papers is three-dimensional because a position
in a stack is specified by a sheet and a position on the sheet. Following this notion, in
linear algebra the dimension of a vector space is deﬁned as the minimal number of vectors
such that every vector in the vector space is representable as a linear combination of these. Similarly,
the word rank denotes various dimension-like invariants that appear throughout algebra.
However, if we try to generalize this notion to the mathematical objects that do not possess
an algebraic structure, then we run into a diﬃculty. From the point of view of set theory
there are as many real numbers as pairs of real numbers since there is a bijection from real
numbers to pairs of real numbers. To distinguish a plane from a cube one needs to impose
restrictions on the kind of mapping. Surprisingly, it turns out that continuity is not
enough, as was pointed out by Peano. There are continuous functions that map a square
onto a cube. So, in topology one uses another intuitive notion: in a high-dimensional
space there are more directions than in a low-dimensional one. Hence, the (Lebesgue covering)
dimension of a topological space is defined as the smallest number $d$ such that every covering
of the space by open sets can be refined so that no point is contained in more than $d+1$ sets.
For example, no matter how one covers a sheet of paper by sufficiently small other sheets
of paper (where the covering sheets may overlap each other, but may not merely touch), one will
always find a point that is covered by $2 + 1 = 3$ sheets.
Another definition of dimension rests on the idea that higher-dimensional objects are in some
sense larger than the lower-dimensional ones. For example, to cover a cube with a side length
2 one needs at least $2^3 = 8$ cubes with a side length 1, but a square with a side length 2 can
be covered by only $2^2 = 4$ unit squares. Let $N(\varepsilon)$ be the minimal number of open balls in
any covering of a bounded set $S$ by balls of radius $\varepsilon$. The Besicovitch-Hausdorff dimension
of $S$ is defined as $-\lim_{\varepsilon \to 0} \log_{\varepsilon} N(\varepsilon)$. The Besicovitch-Hausdorff dimension is not always
defined, and when defined it might be non-integral.
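The covering idea can be illustrated numerically for the unit square (a rough box-counting sketch, using axis-aligned boxes instead of balls; the formula recovered is $\log N(\varepsilon) / \log(1/\varepsilon)$):

```python
import math

def boxes_needed(eps):
    """Number of axis-aligned eps-boxes needed to cover the unit square."""
    per_side = math.ceil(1.0 / eps)
    return per_side ** 2

# Estimate the dimension: N(eps) grows like eps^(-d), so
# d = log N(eps) / log(1/eps). A power of two keeps the arithmetic exact.
eps = 2.0 ** -10
d = math.log(boxes_needed(eps)) / math.log(1.0 / eps)
assert abs(d - 2.0) < 1e-9
```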
Version: 4 Owner: bbukh Author(s): bbukh
2.2 toy theorem
A toy theorem is a simplified version of a more general theorem. For instance, by introducing
some simplifying assumptions in a theorem, one obtains a toy theorem.
Usually, a toy theorem is used to illustrate the claim of a theorem. It can also be illustrative
and insightful to study proofs of a toy theorem derived from a nontrivial theorem. Toy
theorems also have great educational value. After presenting a theorem (with, say, a highly
nontrivial proof), one can sometimes give some assurance that the theorem really holds, by
proving a toy version of the theorem.
For instance, a toy theorem of the Brouwer fixed point theorem is obtained by restricting the
dimension to one. In this case, the Brouwer ﬁxed point theorem follows almost immediately
from the intermediate value theorem (see this page).
Version: 1 Owner: matte Author(s): matte
Chapter 3
00-XX – General
3.1 method of exhaustion
The method of exhaustion is the calculation of an area by approximating it with the areas of a
sequence of polygons.

For example, one can fill up the interior of a circle by inscribing polygons with more and more
sides.
Version: 1 Owner: vladm Author(s): vladm
Chapter 4
00A05 – General mathematics
4.1 Conway’s chained arrow notation
Conway's chained arrow notation is a way of writing numbers even larger than those
provided by the up arrow notation. We define
$$m \to n \to p = m^{(p+2)}\, n = m \underbrace{\uparrow \cdots \uparrow}_{p} n$$
and $m \to n = m \to n \to 1 = m^n$. Longer chains are evaluated by
$$m \to \cdots \to n \to p \to 1 = m \to \cdots \to n \to p$$
$$m \to \cdots \to n \to 1 \to q = m \to \cdots \to n$$
and
$$m \to \cdots \to n \to (p+1) \to (q+1) = m \to \cdots \to n \to (m \to \cdots \to n \to p \to (q+1)) \to q$$

For example:
$$\begin{aligned}
3 \to 3 \to 2 &= 3 \to (3 \to 2 \to 2) \to 1 \\
&= 3 \to (3 \to 2 \to 2) \\
&= 3 \to (3 \to (3 \to 1 \to 2) \to 1) \\
&= 3 \to (3 \to 3 \to 1) \\
&= 3^{3^3} = 3^{27} = 7625597484987
\end{aligned}$$

A much larger example is:
$$\begin{aligned}
3 \to 2 \to 4 \to 4 &= 3 \to 2 \to (3 \to 2 \to 3 \to 4) \to 3 \\
&= 3 \to 2 \to (3 \to 2 \to (3 \to 2 \to 2 \to 4) \to 3) \to 3 \\
&= 3 \to 2 \to (3 \to 2 \to (3 \to 2 \to (3 \to 2 \to 1 \to 4) \to 3) \to 3) \to 3 \\
&= 3 \to 2 \to (3 \to 2 \to (3 \to 2 \to (3 \to 2) \to 3) \to 3) \to 3 \\
&= 3 \to 2 \to (3 \to 2 \to (3 \to 2 \to 9 \to 3) \to 3) \to 3
\end{aligned}$$

Clearly this is going to be a very large number. Note that, as large as it is, it is proceeding
towards an eventual final evaluation, as evidenced by the fact that the final number in the
chain is getting smaller.
Version: 4 Owner: Henry Author(s): Henry
4.2 Knuth’s up arrow notation
Knuth’s up arrow notation is a way of writing numbers which would be unwieldy in
standard decimal notation. It expands on the exponential notation m ↑ n = m^n. Define
m ↑↑ 0 = 1 and m ↑↑ n = m ↑ (m ↑↑ [n − 1]).

Obviously m ↑↑ 1 = m^1 = m, so 3 ↑↑ 2 = 3^(3↑↑1) = 3^3 = 27, but 2 ↑↑ 3 = 2^(2↑↑2) = 2^(2^(2↑↑1)) = 2^(2^2) = 16.

In general, m ↑↑ n = m^(m^(⋯^m)), a tower of height n.

Clearly, this process can be extended: m ↑↑↑ 0 = 1 and m ↑↑↑ n = m ↑↑ (m ↑↑↑ [n − 1]).

An alternate notation is to write m^(i) n for m ↑⋯↑ n with i − 2 arrows. (i − 2 because
then m^(2) n = mn and m^(1) n = m + n.) Then in general we can define

m^(i) n = m^(i−1) (m^(i) (n − 1)).
To get a sense of how quickly these numbers grow, 3 ↑↑↑ 2 = 3 ↑↑ 3 is more than seven and
a half trillion, and the numbers continue to grow much more than exponentially.
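The recursion for the alternate notation can be sketched directly (the helper name h is our own, not standard); it reproduces addition, multiplication, exponentiation and the tower functions in turn.

```python
def h(m, i, n):
    """Sketch of the alternate notation m^(i) n:
    i = 1 is addition, i = 2 multiplication, i = 3 exponentiation, and so on."""
    if i == 1:
        return m + n
    if n == 0:
        return 0 if i == 2 else 1  # empty sum is 0; empty product/tower is 1
    # m^(i) n = m^(i-1) (m^(i) (n-1))
    return h(m, i - 1, h(m, i, n - 1))

print(h(3, 3, 4), h(2, 4, 3))  # 81 16
```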
Version: 3 Owner: Henry Author(s): Henry
4.3 arithmetic progression
An arithmetic progression of length n, initial term a_1, and common difference d is the sequence

a_1, a_1 + d, a_1 + 2d, …, a_1 + (n − 1)d.

The sum of the terms of an arithmetic progression can be computed using Gauss’s trick:

  S = (a_1 + 0d) + (a_1 + 1d) + ⋯ + (a_1 + (n − 1)d)
+ S = (a_1 + (n − 1)d) + (a_1 + (n − 2)d) + ⋯ + (a_1 + 0d)
 2S = (2a_1 + (n − 1)d) + (2a_1 + (n − 1)d) + ⋯ + (2a_1 + (n − 1)d)

We just add the sum to itself written backwards, and each of the columns sums to
(2a_1 + (n − 1)d). Since there are n columns, the sum is

S = (2a_1 + (n − 1)d) n / 2.
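The closed form can be checked against a direct summation (a minimal sketch; the function name is our own):

```python
def ap_sum(a1, d, n):
    """Sum of the arithmetic progression a1, a1 + d, ..., a1 + (n-1)*d,
    using the closed form (2*a1 + (n-1)*d) * n / 2."""
    return (2 * a1 + (n - 1) * d) * n // 2

# Check against a direct summation (Gauss's classic 1 + 2 + ... + 100).
print(ap_sum(1, 1, 100), sum(range(1, 101)))  # 5050 5050
```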
Version: 3 Owner: bbukh Author(s): bbukh
4.4 arity
The arity of something is the number of arguments it takes. This is usually applied to
functions: an n-ary function is one that takes n arguments. Unary is a synonym for 1-ary,
and binary for 2-ary.
Version: 1 Owner: Henry Author(s): Henry
4.5 introducing 0th power
Let a be a number. Then for all n ∈ ℕ, a^n is the product of n a’s. For the integers (and their
extensions) we have a “multiplicative identity” called “1”, i.e. a · 1 = a for all a. So we can
write

a^n = a^(n+0) = a^n · 1.

From the definition of the power of a the usual laws can be derived, so it is plausible to set
a^0 = 1, since 0 doesn’t change a sum, just as 1 doesn’t change a product.
Version: 4 Owner: Thomas Heye Author(s): Thomas Heye
4.6 lemma
There is no technical distinction between a lemma and a theorem. A lemma is a proven
statement, typically named a lemma to distinguish it as a truth used as a stepping stone to
a larger result rather than an important statement in and of itself. Of course, some of the
most powerful statements in mathematics are known as lemmas, including Zorn’s lemma,
Bézout’s lemma, Gauss’ lemma, Fatou’s lemma, etc., so one clearly can’t get too much simply
by reading into a proposition’s name.
According to [1], the plural “lemmas” is commonly used. The correct plural of lemma,
however, is lemmata.
REFERENCES
1. N. Higham, Handbook of Writing for the Mathematical Sciences, Society for Industrial and
Applied Mathematics, 1998. (p. 16)
Version: 5 Owner: mathcam Author(s): mathcam
4.7 property
Given each element of a set A, a property is either true or false. Formally, a property is a
function P : A → {true, false}. Any property gives rise in a natural way to the set
{x : x has the property P} and the corresponding characteristic function.
Version: 3 Owner: ﬁbonaci Author(s): bbukh, ﬁbonaci, apmxi
4.8 saddle point approximation
The saddle point approximation (SPA), a.k.a. stationary phase approximation, is a widely
used method in quantum field theory (QFT) and related fields. Suppose we want to evaluate
the following integral in the limit ζ → ∞:

I = lim_{ζ→∞} ∫_{−∞}^{∞} dx e^{−ζ f(x)}.    (4.8.1)

The saddle point approximation can be applied if the function f(x) satisfies certain conditions.
Assume that f(x) has a global minimum f(x_0) = f_min at x = x_0, which is sufficiently
separated from the other local minima and whose value is sufficiently smaller than theirs.
Consider the Taylor expansion of f(x) about the point x_0:

f(x) = f(x_0) + ∂_x f(x)|_{x=x_0} (x − x_0) + (1/2) ∂_x² f(x)|_{x=x_0} (x − x_0)² + O(x³).    (4.8.2)

Since f(x_0) is a (global) minimum, it is clear that f′(x_0) = 0. Therefore f(x) may be
approximated to quadratic order as

f(x) ≈ f(x_0) + (1/2) f″(x_0) (x − x_0)².    (4.8.3)

The above assumptions on the minima of f(x) ensure that the dominant contribution to
(4.8.1) in the limit ζ → ∞ will come from the region of integration around x_0:

I ≈ lim_{ζ→∞} e^{−ζ f(x_0)} ∫_{−∞}^{∞} dx e^{−(ζ/2) f″(x_0)(x−x_0)²}    (4.8.4)
  ≈ lim_{ζ→∞} e^{−ζ f(x_0)} (2π / (ζ f″(x_0)))^{1/2}.

In the last step we have performed the Gaussian integral. The next nonvanishing higher-order
correction to (4.8.4) stems from the quartic term of the expansion (4.8.2). This correction
may be incorporated into (4.8.4) to yield (after expanding part of the exponential):

I ≈ lim_{ζ→∞} e^{−ζ f(x_0)} ∫_{−∞}^{∞} dx e^{−(ζ/2) f″(x_0)(x−x_0)²} [1 − (ζ/4!) ∂_x⁴ f(x)|_{x=x_0} (x − x_0)⁴].    (4.8.5)
...to be continued with applications to physics...
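As a small numerical sketch of the leading-order formula (4.8.4): the example function f, the integration range, and all parameter values below are our own choices, not from the text. The ratio of the direct integral to the approximation should tend to 1 as ζ grows.

```python
import math

def f(x):
    # Example function (our own choice): global minimum at x0 = 0,
    # with f(0) = 0 and f''(0) = 1.
    return 0.5 * x * x + 0.25 * x ** 4

def integral(zeta, a=-10.0, b=10.0, steps=20000):
    """Direct evaluation of the integral (4.8.1) by the trapezoidal rule."""
    h = (b - a) / steps
    total = 0.5 * (math.exp(-zeta * f(a)) + math.exp(-zeta * f(b)))
    for i in range(1, steps):
        total += math.exp(-zeta * f(a + i * h))
    return total * h

def spa(zeta, x0=0.0, f2=1.0):
    """Leading-order saddle point approximation (4.8.4):
    exp(-zeta * f(x0)) * sqrt(2*pi / (zeta * f''(x0)))."""
    return math.exp(-zeta * f(x0)) * math.sqrt(2 * math.pi / (zeta * f2))

# The ratio tends to 1 as zeta grows.
for zeta in (1, 10, 100):
    print(zeta, integral(zeta) / spa(zeta))
```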
Version: 2 Owner: msihl Author(s): msihl
4.9 singleton
A set consisting of a single element is usually referred to as a singleton.
Version: 2 Owner: Koro Author(s): Koro
4.10 subsequence
If X is a set and (a_n)_{n∈ℕ} is a sequence in X, then a subsequence of (a_n) is a sequence of
the form (a_{n_r})_{r∈ℕ}, where (n_r)_{r∈ℕ} is a strictly increasing sequence of natural numbers.
Version: 2 Owner: Evandar Author(s): Evandar
4.11 surreal number
The surreal numbers are a generalization of the reals. Each surreal number consists of two
parts (called the left and right), each of which is a set of surreal numbers. For any surreal
number N, these parts can be called N_L and N_R. (This could be viewed as an ordered pair of
sets; however, the surreal numbers were intended to be a basis for mathematics, not something
to be embedded in set theory.) A surreal number is written N = ⟨N_L | N_R⟩.

Not every number of this form is a surreal number. The surreal numbers satisfy two additional
properties. First, if x ∈ N_L and y ∈ N_R then x < y. Secondly, they must be well-founded.
These properties are both satisfied by the following construction of the surreal
numbers and the ≤ relation by mutual induction:

⟨ | ⟩, which has both left and right parts empty, is 0.

Given two (possibly empty) sets of surreal numbers L and R such that x < y for any x ∈ L and
y ∈ R, ⟨L | R⟩ is a surreal number.

Define N ≤ M if there is no x ∈ N_L such that M ≤ x and no y ∈ M_R such that y ≤ N.

This process can be continued transfinitely, to define infinite and infinitesimal numbers. For
instance if Z is the set of integers then ω = ⟨Z | ⟩. Note that this does not make equality the
same as identity: ⟨−1 | 1⟩ = ⟨ | ⟩, for instance.

It can be shown that N is “sandwiched” between the elements of N_L and N_R: it is larger
than any element of N_L and smaller than any element of N_R.
Addition of surreal numbers is defined by

N + M = ⟨ {N + x : x ∈ M_L} ∪ {M + x : x ∈ N_L} | {N + x : x ∈ M_R} ∪ {M + x : x ∈ N_R} ⟩.

It follows that −N = ⟨−N_R | −N_L⟩.

The definition of multiplication can be written more easily by defining N · M_L = {N · x :
x ∈ M_L} and similarly for M_R, N_L and N_R.

Then

N · M = ⟨ N·M_L + M·N_L − N_L·M_L, N·M_R + M·N_R − N_R·M_R |
          N·M_L + M·N_R − N_L·M_R, N·M_R + M·N_L − N_R·M_L ⟩
The surreal numbers satisfy the axioms for a field under addition and multiplication (whether
they really are a field is complicated by the fact that they are too large to form a set).

The integers of surreal mathematics are called the omnific integers. In general positive
integers n can always be written ⟨n − 1 | ⟩, and so −n = ⟨ | 1 − n⟩ = ⟨ | (−n) + 1⟩. So for
instance 1 = ⟨0 | ⟩.

In general, ⟨a | b⟩ is the simplest number between a and b. This can easily be used to define
the dyadic fractions: for any integer a, a + 1/2 = ⟨a | a + 1⟩. Then 1/2 = ⟨0 | 1⟩, 1/4 = ⟨0 | 1/2⟩,
and so on. This can then be used to locate non-dyadic fractions by pinning them between
a left part which gets infinitely close from below and a right part which gets infinitely close
from above.

Ordinal arithmetic can be defined starting with ω as defined above and adding numbers
such as ⟨ω | ⟩ = ω + 1 and so on. Similarly, a starting infinitesimal can be found as
⟨0 | 1, 1/2, 1/4, …⟩ = 1/ω, and again more can be developed from there.
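The mutual induction defining surreal numbers and the ≤ relation can be sketched directly for finite examples. This toy class is our own construction (not from the text): it represents a surreal number by tuples of previously built surreal numbers, and implements only comparison and equality.

```python
class Surreal:
    """Toy surreal number: left and right parts are tuples of
    previously constructed Surreal numbers."""
    def __init__(self, left=(), right=()):
        self.left, self.right = tuple(left), tuple(right)

    def __le__(self, other):
        # N <= M iff there is no x in N_L with M <= x
        # and no y in M_R with y <= N.
        return (not any(other <= x for x in self.left) and
                not any(y <= self for y in other.right))

    def eq(self, other):
        # Equality of surreal numbers: N <= M and M <= N.
        return self <= other and other <= self

zero = Surreal()                          # <  |  >
one = Surreal(left=[zero])                # < 0 |  >
neg_one = Surreal(right=[zero])           # <  | 0 >
half = Surreal(left=[zero], right=[one])  # < 0 | 1 >

print(Surreal(left=[neg_one], right=[one]).eq(zero))  # True: <-1 | 1> equals 0
```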
Version: 5 Owner: Henry Author(s): Henry
Chapter 5
00A07 – Problem books
5.1 Nesbitt’s inequality
Nesbitt’s inequality says that for positive real a, b and c we have:

a/(b + c) + b/(a + c) + c/(a + b) ≥ 3/2.
Version: 2 Owner: mathwizard Author(s): mathwizard
5.2 proof of Nesbitt’s inequality
Starting from Nesbitt’s inequality

a/(b + c) + b/(a + c) + c/(a + b) ≥ 3/2

we transform the left hand side:

(a + b + c)/(b + c) + (a + b + c)/(a + c) + (a + b + c)/(a + b) − 3 ≥ 3/2.

Now this can be transformed into:

((a + b) + (a + c) + (b + c)) · (1/(a + b) + 1/(a + c) + 1/(b + c)) ≥ 9.

Division by 3 and by the right factor yields:

((a + b) + (a + c) + (b + c))/3 ≥ 3 / (1/(a + b) + 1/(a + c) + 1/(b + c)).

Now on the left we have the arithmetic mean and on the right the harmonic mean of a + b,
a + c and b + c, so this inequality is true.
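A quick numerical spot-check of the inequality (a sketch with our own names and sample values; equality 3/2 is attained at a = b = c):

```python
import random

def nesbitt_lhs(a, b, c):
    return a / (b + c) + b / (a + c) + c / (a + b)

# Spot-check on random positive triples; the minimum 3/2 occurs at a = b = c.
random.seed(0)
for _ in range(1000):
    a, b, c = (random.uniform(0.01, 100.0) for _ in range(3))
    assert nesbitt_lhs(a, b, c) >= 1.5 - 1e-9

print(nesbitt_lhs(1, 1, 1))  # 1.5
```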
Version: 2 Owner: mathwizard Author(s): mathwizard
Chapter 6
00A20 – Dictionaries and other
general reference works
6.1 completing the square
Let us consider the expression x² + xy, where x and y are real (or complex) numbers. Using
the formula

(x + y)² = x² + 2xy + y²

we can write

x² + xy = x² + xy + 0
        = x² + xy + y²/4 − y²/4
        = (x + y/2)² − y²/4.

This manipulation is called completing the square [3] in x² + xy.

Replacing y by −y, we also have

x² − xy = (x − y/2)² − y²/4.
Here are some applications of this method:

• Derivation of the solution formula to the quadratic equation.

• Completing the square can also be used to find the extremal value of a quadratic
polynomial [2] without calculus. Let us illustrate this for the polynomial p(x) =
4x² + 8x + 9. Completing the square yields

p(x) = (2x + 2)² − 4 + 9 = (2x + 2)² + 5 ≥ 5,

since (2x + 2)² ≥ 0. Here, equality holds if and only if x = −1. Thus p(x) ≥ 5 for all
x ∈ ℝ, and p(x) = 5 if and only if x = −1. It follows that p(x) has a global minimum
at x = −1, where p(−1) = 5.

• Completing the square can also be used as an integration technique to integrate, say,
1/(4x² + 8x + 9) [3].
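The manipulation generalizes to a·x² + b·x + c = a·(x + h)² + k. A minimal sketch with exact rational arithmetic (the function name is our own):

```python
from fractions import Fraction

def complete_square(a, b, c):
    """Write a*x^2 + b*x + c as a*(x + h)^2 + k (assumes a != 0);
    returns (h, k) as exact rationals: h = b/(2a), k = c - b^2/(4a)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    h = b / (2 * a)
    k = c - b * b / (4 * a)
    return h, k

# p(x) = 4x^2 + 8x + 9 = 4*(x + 1)^2 + 5: minimum 5, attained at x = -1.
print(complete_square(4, 8, 9))  # (Fraction(1, 1), Fraction(5, 1))
```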
REFERENCES
1. R. Adams, Calculus: A Complete Course, Addison-Wesley Publishers Ltd, 3rd ed.
2. J. Thompson, T. Martinsson, Matematik Lexikon (in Swedish), Wahlström & Widstrand,
1991.
(Does anyone have an English reference?)
Version: 7 Owner: mathcam Author(s): matte
Chapter 7
00A99 – Miscellaneous topics
7.1 QED
The term “QED” is actually an abbreviation and stands for the Latin quod erat demonstrandum,
meaning “which was to be demonstrated.”
QED typically is used to signify the end of a mathematical proof. The box symbol ∎ is often
used in place of “QED,” and is called the “Halmos symbol” after mathematician Paul Halmos
(it can vary in width, and sometimes it is fully or partially shaded). Halmos borrowed this
symbol from magazines, where it was used to denote “end of article.”
Version: 3 Owner: akrowne Author(s): akrowne
7.2 TFAE
The abbreviation “TFAE” is shorthand for “the following are equivalent”. It is used before
a set of equivalent conditions (each implies all the others).
In a deﬁnition, when one of the conditions is somehow “better” (simpler, shorter, ...), it
makes sense to phrase the deﬁnition with that condition, and mention that the others are
equivalent. “TFAE” is typically used when none of the conditions can take priority over the
others. Actually proving the claimed equivalence must, of course, be done separately.
Version: 1 Owner: ariels Author(s): ariels
7.3 WLOG
“WLOG” (or “WOLOG”) is an acronym which stands for “without loss of generality.”
WLOG is invoked in situations where some property of a model or system is invariant
under the particular choice of instance attributes, but for the sake of demonstration, these
attributes must be ﬁxed.
For example, we might be discussing properties of a segment (open or closed) of the real
number line. Due to the nature of the reals, we can select endpoints a and b without loss of
generality: nothing about our discussion of this segment depends on the choice of a or b. Of
course, any segment does actually have specific endpoints, so it may help to actually select
some (say 0 and 1) for clarity.
WLOG can also be invoked to shorten proofs where there are a number of choices of
configuration, but the proof is “the same” for each of them. We need only walk through the
proof for one of these configurations, and “WLOG” serves as a note that we haven’t lost
anything in the choosing.
Version: 2 Owner: akrowne Author(s): akrowne
7.4 order of operations
The order of operations is a convention that tells us how to evaluate mathematical
expressions (these could be purely numerical). The problem arises because expressions consist
of operators applied to variables or values (or other expressions) that each demand individual
evaluation, yet the order in which these individual evaluations are done leads to different
outcomes.
A conventional order of operations solves this. One could technically do without memorizing
this convention, but the only alternative is to use parentheses to group every single term of
an expression and evaluate the innermost operations first.
For example, in the expression a·b + c, how do we know whether to apply multiplication or
addition first? We could interpret even this simple expression in two drastically different ways:

1. Add b and c.
2. Multiply the sum from (1) by a.

or

1. Multiply a and b.
2. Add to the product in (1) the value of c.

One can see the different outcomes for the two cases by selecting some different values for a,
b, and c. The issue is resolved by the conventional order of operations: the correct evaluation
would be the second one.
The nearly universal mathematical convention dictates the following order of operations
(from highest to lowest priority):

1. Factorial.
2. Exponentiation.
3. Multiplication and division.
4. Addition and subtraction.

Any parenthesized expressions are automatically higher “priority” than anything on the
above list.
There is also the problem of what order to evaluate repeated operators of the same type, as
in:

a/b/c/d

The solution in this case is typically to assume the left-to-right interpretation. For the
above, this would lead to the following evaluation:

(((a/b)/c)/d)

In other words,

1. Evaluate a/b.
2. Evaluate (1)/c.
3. Evaluate (2)/d.
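These conventions can be sketched in a programming language; Python's arithmetic operator precedence and left-to-right grouping happen to match the conventions described here (the variable values are our own):

```python
a, b, c = 2.0, 3.0, 4.0

# Multiplication binds tighter than addition: a*b + c means (a*b) + c.
assert a * b + c == (a * b) + c == 10.0
assert a * (b + c) == 14.0

# Repeated division is taken left to right: 8/4/2 means (8/4)/2.
assert 8 / 4 / 2 == (8 / 4) / 2 == 1.0
assert 8 / (4 / 2) == 4.0

print("all evaluation-order checks pass")
```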
Note that this isn’t a problem for associative operators such as multiplication or addition in
the reals. One must still proceed with caution, however, as associativity is a notion bound
up with the concept of groups rather than just operators. Hence, context is extremely
important.
For more obscure operations than the ones listed above, parentheses should be used to remove
ambiguity. Completely new operations are typically assumed to have the highest priority,
but the deﬁnition of the operation should be accompanied by some sort of explanation of how
it is evaluated in relation to itself. For example, Conway’s chained arrow notation explicitly
defines what order repeated applications of itself should be evaluated in (it is right-to-left
rather than left-to-right)!
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 8
01A20 – Greek, Roman
8.1 Roman numerals
Roman numerals are a method of writing numbers employed primarily by the ancient
Romans. In place of digits, the Romans used letters to represent the numbers central to the
system:

I 1
V 5
X 10
L 50
C 100
D 500
M 1000

Larger numbers can be made by writing a bar over the letter, which means one thousand
times as much. For instance V̄ (V with a bar) is 5000.

Other numbers were written by putting letters together. For instance II means 2. Larger
letters go on the left, so LII is 52, but IIL is not a valid Roman numeral.

One additional rule allows a letter to the left of a larger letter to signify subtracting the
smaller from the larger. For instance IV is 4. This can only be done once; 3 is written III,
not IIV. Also, it is generally required that the smaller letter be the one immediately smaller
than the larger, so 1999 is usually written MCMXCIX, not MIM.

It is worth noting that today it is usually considered incorrect to repeat a letter four times,
so IV is preferred to IIII. However many older monuments do not use the subtraction rule
at all, so 44 was written XXXXIIII instead of the now preferable XLIV.
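The modern rules above (subtractive forms, no letter four times in a row) can be sketched as a converter for 1 to 3999; the table and function name are our own:

```python
# Subtractive pairs (CM, XC, IV, ...) are listed alongside the plain letters.
_PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
          (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
          (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n):
    """Convert an integer in 1..3999 to a modern-style Roman numeral."""
    if not 0 < n < 4000:
        raise ValueError("out of range for plain Roman numerals")
    out = []
    for value, letters in _PAIRS:
        count, n = divmod(n, value)
        out.append(letters * count)
    return "".join(out)

print(to_roman(1999))  # MCMXCIX
print(to_roman(44))    # XLIV
```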
Version: 3 Owner: Henry Author(s): Henry
Chapter 9
01A55 – 19th century
9.1 Poincaré, Jules Henri
Jules Henri Poincaré was born on April 29th, 1854 in Cité Ducale[BA], a neighborhood in
Nancy, a city in France. He was the son of Dr. Léon Poincaré (1828-1892), who was a
professor at the University of Nancy in the faculty of medicine.[14] His mother, Eugénie
Launois (1830-1897), was described as a “gifted mother”[6] who gave special instruction to
her son. She was 24 and his father 26 years of age when Henri was born.[9] Two years after
the birth of Henri came his sister Aline.[6]
In 1862 Henri entered the Lycée of Nancy, which today is called, in his honor, the Lycée
Henri Poincaré. In fact the University of Nancy is also named in his honor. He graduated
from the Lycée in 1871 with a bachelors degree in letters and sciences. Henri was at the top
of his class in almost all subjects; he did not have much success in music and was described as
“average at best” in any physical activities.[9] This could be blamed on his poor eyesight
and absentmindedness.[4] Later, in 1873, Poincaré entered l’École Polytechnique, where he
performed better in mathematics than all the other students. He published his first paper
at 20 years of age, titled Démonstration nouvelle des propriétés de l’indicatrice d’une surface
(New demonstration of the properties of the indicatrix of a surface).[3] He graduated from
the institution in 1876. The same year he decided to attend l’École des Mines and graduated
in 1879 with a degree in mining engineering.[14]
After his graduation he was appointed as an ordinary engineer in charge of the mining
services in Vesoul. At the same time he was preparing for his doctorate in sciences (not
surprisingly) in mathematics, under the supervision of Charles Hermite. Some of Charles
Hermite’s most famous contributions to mathematics are Hermite polynomials, Hermite’s
differential equation, Hermite’s formula of interpolation, and Hermitian matrices.[9] Poincaré,
as expected, graduated from the University of Paris in 1879, with a thesis relating to
differential equations. He then became a teacher at the University of Caen, where he taught
analysis. He remained there until 1881, when he was appointed “maître de conférences
d’analyse”[14] (professor in charge of analysis conferences) at the University of Paris. Also in
that same year he married Miss Poulain d’Andecy. Together they had four children: Jeanne
born in 1887, Yvonne born in 1889, Henriette born in 1891, and finally Léon born in 1893.
He had now returned to work at the Ministry of Public Services as an engineer, responsible
for the development of the northern railway. He held that position from 1881 to 1885; this
was the last job he held in administration for the government of France. In 1893 he was
awarded the title of head engineer in charge of the mines. After that his career awards and
positions continuously escalated in greatness and quantity. He died two years before the
war, on July 17th, 1912, of an embolism at the age of 58. Interestingly, at the beginning of
World War I his cousin Raymond Poincaré was the president of the French Republic.
Poincaré’s work habits have been compared to a bee flying from flower to flower. Poincaré
was interested in the way his mind worked; he studied his habits and gave a talk about his
observations in 1908 at the Institute of General Psychology in Paris. He linked his way of
thinking to how he made several discoveries. His mental organization was interesting not only
to him but also to Toulouse, a psychologist of the Psychology Laboratory of the School of
Higher Studies in Paris, who wrote a book called Henri Poincaré, published in 1910. Toulouse
discussed Poincaré’s regular schedule: he worked during the same times each day, in short
periods. He never spent a long time on a problem, since he believed that the subconscious
would continue working on it while he worked on another problem. Toulouse also noted that
Poincaré had an exceptional memory. In addition he stated that most mathematicians worked
from principles already established, while Poincaré was the type that started from basic
principles each time.[9] His method of thinking is well summarized as:
Habitué à négliger les détails et à ne regarder que les cimes, il passait de l’une à
l’autre avec une promptitude surprenante et les faits qu’il découvrait se groupant
d’eux-mêmes autour de leur centre étaient instantanément et automatiquement
classés dans sa mémoire. (He neglected details and jumped from idea to idea; the
facts gathered around each idea would then come together and solve the problem.)
[BA]
The mathematician Darboux claimed he was “un intuitif” (an intuitive)[BA], arguing that
this is demonstrated by the fact that he worked so often by visual representation. He did not
care about being rigorous and disliked logic. He believed that logic was not a way to invent
but a way to structure ideas, and that logic limits ideas.
Poincaré held philosophical views opposite to those of Bertrand Russell and Gottlob Frege,
who believed that mathematics was a branch of logic. Poincaré strongly disagreed, claiming
that intuition was the life of mathematics. Poincaré gives an interesting point of view in his
book Science and Hypothesis:
For a superficial observer, scientific truth is beyond the possibility of doubt; the
logic of science is infallible, and if the scientists are sometimes mistaken, this is
only from their mistaking its rules. [12]
Poincaré believed that arithmetic is a synthetic science. He argued that Peano’s axioms
cannot be proven non-circularly with the principle of induction,[7] therefore concluding
that arithmetic is a priori synthetic and not analytic. Poincaré then went on to say that
mathematics cannot be deduced from logic since it is not analytic. It is important to
note that even today Poincaré has not been proven wrong in this argumentation. His views
were the same as those of Kant.[8] However, Poincaré did not share Kantian views in all
branches of philosophy and mathematics. For example, in geometry Poincaré believed that
the structure of non-Euclidean space can be known analytically. He wrote three books that
made his philosophies known: Science and Hypothesis, The Value of Science, and Science
and Method.
Poincaré’s first area of interest in mathematics was the Fuchsian functions, which he named
after the mathematician Lazarus Fuchs because Fuchs was known for being a good teacher
and had done a lot of research on differential equations and the theory of functions. The
functions did not keep the name Fuchsian and are today called automorphic. Poincaré
actually developed the concept of these functions as part of his doctoral thesis.[9] An
automorphic function is a function f(z), where z ∈ ℂ, which is analytic on its domain and
invariant under a denumerably infinite group of linear fractional transformations; automorphic
functions are generalizations of trigonometric functions and elliptic functions.[15] Below
Poincaré explains how he discovered Fuchsian functions:
For ﬁfteen days I strove to prove that there could not be any functions like those
I have since called Fuchsian functions. I was then very ignorant; every day I
seated myself at my work table, stayed an hour or two, tried a great number
of combinations and reached no results. One evening, contrary to my custom, I
drank black coﬀee and could not sleep. Ideas rose in crowds; I felt them collide
until pairs interlocked, so to speak, making a stable combination. By the next
morning I had established the existence of a class of Fuchsian functions, those
which come from the hypergeometric series; I had only to write out the results,
which took but a few hours. [11]
This is a clear indication of Henri Poincaré’s brilliance. Poincaré communicated a great deal
with Klein, another mathematician working on Fuchsian functions. They were able to discuss
and further the theory of automorphic (Fuchsian) functions. Apparently Klein became jealous
of Poincaré’s high opinion of Fuchs’s work and ended their relationship on bad terms.
Poincaré contributed to the field of algebraic topology and published Analysis situs in 1895,
which was the first real systematic look at topology. He acquired most of his knowledge
from his work on differential equations. He also formulated the Poincaré conjecture, one
of the great unsolved problems in mathematics. It is currently one of the “Millennium Prize
Problems”. The problem is stated as:

Consider a compact 3-dimensional manifold V without boundary. Is it possible
that the fundamental group of V could be trivial, even though V is not homeomorphic
to the 3-dimensional sphere? [5]

The problem has been attacked by many mathematicians, such as Henry Whitehead in 1934,
but without success. Later, in the 50’s and 60’s, progress was made, and it was discovered
that for higher-dimensional manifolds the problem was easier. (Theorems have been stated
for those higher dimensions by Stephen Smale, John Stallings, Andrew Wallace, and many
more.)[5] Poincaré also studied homotopy theory, which is the study of topology reduced to
various groups that are algebraically invariant.[9] He introduced the fundamental group in a
paper in 1894, and later stated his famous conjecture. He also did work on analytic functions,
algebraic geometry, and Diophantine problems, where he made important contributions, as
in most of the areas he studied.
In 1887 Oscar II, King of Sweden and Norway, held a competition to celebrate his sixtieth
birthday and to promote higher learning.[1] The King wanted a contest that would be of
interest, so he decided to hold a mathematics competition. Poincaré entered the competition,
submitting a memoir on the three body problem, which he describes as:

Le but final de la Mécanique céleste est de résoudre cette grande question de
savoir si la loi de Newton explique à elle seule tous les phénomènes astronomiques;
le seul moyen d’y parvenir est de faire des observations aussi précises que possible
et de les comparer ensuite aux résultats du calcul. (The final goal of celestial
mechanics is to answer the great question of whether Newton’s law alone explains
all astronomical phenomena. The only way to find out is to make the most precise
observations possible and compare them to the theoretical calculations.) [13]
Poincaré did in fact win the competition. In his memoir he described new mathematical ideas
such as homoclinic points. The memoir was about to be published in Acta Mathematica
when an error was found by the editor. This error in fact led to the discovery of chaos
theory. The memoir was published later, in 1890.[9] In addition, Poincaré showed that
determinism and predictability were distinct problems. He also found that the solution of
the three body problem would change drastically with small changes in the initial conditions.
This area of research was neglected until 1963, when Edward Lorenz discovered a famous
chaotic deterministic system using a simple model of the atmosphere.[7]
He made many contributions to different fields of applied mathematics as well, such as
celestial mechanics, fluid mechanics, optics, electricity, telegraphy, capillarity, elasticity,
thermodynamics, potential theory, quantum theory, the theory of relativity, and cosmology.
In the field of differential equations Poincaré gave many results that are critical for the
qualitative theory of differential equations, for example the Poincaré sphere and the Poincaré
map.
It is that intuition that led him to discover and study so many areas of science. Poincaré
is considered to be the next universalist after Gauss. After Gauss’s death in 1855 people
generally believed that there would be no one else who could master all branches of
mathematics. However, they were wrong, because Poincaré took all areas of mathematics
as “his province”.[4]
REFERENCES
1. The 1911 Edition Encyclopedia: Oscar II of Sweden and Norway, [online]
http://63.1911encyclopedia.org/O/OS/OSCAR II OF SWEDEN AND NORWAY.htm
2. Belliver, André: Henri Poincaré ou la vocation souveraine, Gallimard, 1956.
3. Bour P.-E., Rebuschi M.: Serveur W3 des Archives H. Poincaré, [online]
http://www.univ-nancy2.fr/ACERHP/
4. Boyer, Carl B.: A History of Mathematics: Henri Poincaré, John Wiley & Sons, Inc., Toronto,
1968.
5. Clay Mathematics Institute: Millennium Prize Problems, 2000, [online]
http://www.claymath.org/prizeproblems/
6. Encyclopaedia Britannica: Biography of Jules Henri Poincaré.
7. Murzi, Mauro: Jules Henri Poincaré, Internet Encyclopedia of Philosophy, [online]
http://www.utm.edu/research/iep/p/poincare.htm, 2001.
8. Kolak, Daniel: Lovers of Wisdom (second edition), Wadsworth, Belmont, 2001.
9. O’Connor, John J. & Robertson, Edmund F.: The MacTutor History of Mathematics Archive,
[online] http://www-gap.dcs.st-and.ac.uk/~history/, 2002.
10. Oeuvres de Henri Poincaré: Tome XI, Gauthier-Villars, Paris, 1956.
11. Poincaré, Henri: Science and Method; The Foundations of Science, The Science Press,
Lancaster, 1946.
12. Poincaré, Henri: Science and Hypothesis; The Foundations of Science, The Science Press,
Lancaster, 1946.
13. Poincaré, Henri: Les méthodes nouvelles de la mécanique céleste, Dover Publications, Inc.,
New York, 1957.
14. Sageret, Jules: Henri Poincaré, Mercure de France, Paris, 1911.
15. Weisstein, Eric W.: World of Mathematics: Automorphic Function, CRC Press LLC, 2002.
Version: 6 Owner: Daume Author(s): Daume
Chapter 10
01A60 – 20th century
10.1 Bourbaki, Nicolas
by Émilie Richer
The Problem
The devastation of World War I presented a unique challenge to aspiring mathematicians of
the mid-1920s. Among the many casualties of the war were great numbers of scientists and
mathematicians who would at this time have been serving as mentors to the young students.
Whereas other countries such as Germany were sending their scholars to do scientific work,
France was sending promising young students to the front. A wartime directory of the
École Normale Supérieure in Paris confirms that about 2/3 of their student population was
killed in the war.[DJ] Young men studying after the war had no young teachers, no previous
generation to rely on for guidance. What did this mean? According to Jean Dieudonné, it
meant that students like him were missing out on important discoveries and advances being
made in mathematics at that time. He explained: “I am not saying that they (the older
professors) did not teach us excellent mathematics (...) But it is indubitable that a 50 year
old mathematician knows the mathematics he learned at 20 or 30, but has only notions, often
rather vague, of the mathematics of his epoch, i.e. the period of time when he is 50.” He
continued: “I had graduated from the École Normale and I did not know what an ideal was!
This gives you an idea of what a young French mathematician knew in 1930.”[DJ] Henri
Cartan, another student in Paris shortly after the war, affirmed: “we were the first generation
after the war. Before us there was a vide, a vacuum, and it was necessary to make everything
new.”[JA] This is exactly what a few young Parisian math students set out to do.
The Beginnings
After graduation from the École Normale Supérieure de Paris, a group of about ten young
mathematicians had maintained very close ties.[WA] They had all begun their careers and
were scattered across France teaching in universities. Among them were Henri Cartan and
André Weil, who were both in charge of teaching a course on differential and integral calculus
at the University of Strasbourg. The standard textbook for this class at the time was the
"Traité d'Analyse" by E. Goursat, which the young professors found to be inadequate in
many ways.[BA] According to Weil, his friend Cartan was constantly asking him questions
about the best way to present a given topic to his class, so much so that Weil eventually
nicknamed him "the grand inquisitor".[WA] After months of persistent questioning, in the
winter of 1934, Weil finally got the idea to gather friends (and former classmates) to settle
their problem by rewriting the treatise for their course. It is at this moment that Bourbaki
was conceived.
The suggestion of writing this treatise spread, and very soon a loose circle of friends, including
Henri Cartan, André Weil, Jean Delsarte, Jean Dieudonné and Claude Chevalley, began
meeting regularly at the Capoulade, a café in the Latin Quarter of Paris, to plan it. They
called themselves the "Committee on the Analysis Treatise".[BL] According to Chevalley
the project was extremely naive: the idea was simply to write another textbook to replace
Goursat's.[GD] After many discussions over what to include in their treatise they finally
came to the conclusion that they needed to start from scratch and present all of essential
mathematics from beginning to end, with the idea that "the work had to be primarily a
tool, not usable in some small part of mathematics but in the greatest possible number of
places".[DJ] Gradually the young men realized that their meetings were not sufficient, and
they decided they would dedicate a few weeks in the summer to their new project. The
collaborators on this project were not aware of its enormity, but were soon to find out.
In July of 1935 the young men gathered for their first congress (as they would later call them)
in Besse-en-Chandesse. The men believed that they would be able to draft the essentials of
mathematics in about three years. They did not set out wanting to write something new,
but to perfect everything already known. Little did they know that their first chapter would
not be completed until four years later. It was at one of their first meetings that the young
men chose their name: Nicolas Bourbaki. The organization and its membership would go
on to become one of the greatest enigmas of 20th century mathematics.
The first Bourbaki congress, July 1935. From left to right, back row: Henri Cartan, René
de Possel, Jean Dieudonné, André Weil, a university lab technician; seated: Mirlès, Claude
Chevalley, Szolem Mandelbrojt.
André Weil recounts many years later how they decided on this name. He and a few other
Bourbaki collaborators had been attending the École Normale in Paris when a notification
was sent out to all first year science students: a guest speaker would be giving a lecture and
attendance was highly recommended. As the story goes, the young students gathered to
hear (unbeknownst to them) an older student, Raoul Husson, who had disguised himself with
a fake beard and an unrecognizable accent. He gave what is said to be an incomprehensible,
nonsensical lecture, with the young students trying desperately to follow him. All his results
were wrong in a nontrivial way, and he ended with his most extravagant: Bourbaki's
Theorem. One student even claimed to have followed the lecture from beginning to end.
Raoul had taken the name for his theorem from a general in the Franco-Prussian war. The
committee was so amused by the story that they unanimously chose Bourbaki as their name.
Weil's wife was present at the discussion about choosing a name, and she became Bourbaki's
godmother, baptizing him Nicolas.[WA] Thus was born Nicolas Bourbaki.
André Weil, Claude Chevalley, Jean Dieudonné, Henri Cartan and Jean Delsarte were among
the few present at these first meetings; they were all active members of Bourbaki until their
retirements. Today they are considered by most to be the founding fathers of the Bourbaki
group. According to a later member they were "those who shaped Bourbaki and gave it
much of their time and thought until they retired"; he also claims that some other early
contributors were Szolem Mandelbrojt and René de Possel.[BA]
Reforming Mathematics: The Idea
Bourbaki members all believed that they had to completely rethink mathematics. They felt
that older mathematicians were holding on to old practices and ignoring the new. That is
why very early on Bourbaki established one of its first and only rules: obligatory retirement
at age 50. As explained by Dieudonné, "if the mathematics set forth by Bourbaki no longer
correspond to the trends of the period, the work is useless and has to be redone, this is why
we decided that all Bourbaki collaborators would retire at age 50."[DJ] Bourbaki wanted to
create a work that would be an essential tool for all mathematicians. Their aim was to create
something logically ordered, starting with a strong foundation and building continuously on
it. The foundation that they chose was set theory, which would be the first book in a series
of six that they named "Éléments de mathématique" (with the 's' dropped from "mathématique"
to represent their underlying belief in the unity of mathematics). Bourbaki felt that the old
mathematical divisions were no longer valid, comparing them to ancient zoological divisions.
The ancient zoologist would classify animals based on some basic superficial similarities such
as "all these animals live in the ocean". Eventually they realized that more complexity
was required to classify these animals. Past mathematicians had apparently made similar
mistakes: "the order in which we (Bourbaki) arranged our subjects was decided according to
a logical and rational scheme. If that does not agree with what was done previously, well, it
means that what was done previously has to be thrown overboard."[DJ] After many heated
discussions, Bourbaki eventually settled on the topics for "Éléments de mathématique"; they
would be, in order:
I. Set theory
II. Algebra
III. Topology
IV. Functions of one real variable
V. Topological vector spaces
VI. Integration
They now felt that they had eliminated all secondary mathematics that, according to them,
"did not lead to anything of proved importance."[DJ] The following table summarizes Bourbaki's
choices.

What remains after cutting the loose threads:
• Linear and multilinear algebra
• A little general topology (the least possible)
• Topological vector spaces
• Homological algebra
• Commutative algebra
• Non-commutative algebra
• Lie groups
• Integration
• Differentiable manifolds
• Riemannian geometry

What is excluded (the loose threads):
• Theory of ordinals and cardinals
• Lattices
• Most general topology
• Most of group theory (finite groups)
• Most of number theory
• Trigonometrical series
• Interpolation
• Series of polynomials
• Applied mathematics

Dieudonné's metaphorical ball of yarn: "here is my picture of mathematics now. It
is a ball of wool, a tangled hank where all mathematics react upon another in an almost
unpredictable way. And then in this ball of wool, there are a certain number of threads coming
out in all directions and not connecting with anything else. Well the Bourbaki method is very
simple: we cut the threads."[DJ]
Reforming Mathematics: The Process
It didn't take long for Bourbaki to become aware of the size of their project. They were
now meeting three times a year (twice for one week and once for two weeks) for Bourbaki
"congresses" to work on their books. Their main rule was unanimity on every point: any
member had the right to veto anything he felt was inadequate or imperfect. Once Bourbaki
had agreed on a topic for a chapter, the job of writing up the first draft was given to any
member who wanted it. He would write his version, and when it was complete it would be
presented at the next Bourbaki congress, where it would be read aloud line by line. According
to Dieudonné, "each proof was examined point by point and criticized pitilessly." He goes
on: "one has to see a Bourbaki congress to realize the virulence of this criticism and how it
surpasses by far any outside attack."[DJ] Weil recalls a first draft written by Cartan (who was
unable to attend the congress where it would be presented). Bourbaki sent him a telegram
summarizing the congress; it read: "union intersection partie produit tu es démembré foutu
Bourbaki" (union intersection subset product you are dismembered screwed Bourbaki).[WA]
During a congress any member was allowed to interrupt to criticize, comment or ask questions
at any time. Apparently Bourbaki believed it could get better results from confrontation
than from orderly discussion.[BA] Armand Borel summarized his first congress as "two or
three monologues shouted at top voice, seemingly independent of one another".[BA]
Bourbaki congress 1951.
After a first draft had been completely reduced to pieces, it was the job of a new collaborator
to write up a second draft. This second collaborator would use all the suggestions and
changes that the group had put forward during the congress. Any member had to be able to
take on this task, because one of Bourbaki's mottoes was "the control of the specialists by the
non-specialists"[BA], i.e. a member had to be able to write a chapter in a field that was not
his specialty. This second writer would set out on his assignment knowing that by the time
he was ready to present his draft the views of the congress would have changed, and his draft
would also be torn apart despite its adherence to the congress's earlier suggestions. The
same chapter might appear up to ten times before it would finally be unanimously approved
for publishing. There was an average of 8 to 12 years from the time a chapter was approved
to the time it appeared on a bookshelf.[DJ] Bourbaki proceeded this way for over twenty
years, (surprisingly) publishing a great number of volumes.
Recruitment and Membership
During these years, most Bourbaki members held permanent positions at universities across
France. There, they could recruit for Bourbaki students showing great promise in mathematics.
Members were never replaced formally, nor was there ever a fixed number of
members. However, when it felt the need, Bourbaki would invite a student or colleague to a
congress as a "cobaye" (guinea pig). To be accepted, not only would the guinea pig have
to understand everything, but he would have to actively participate (a challenging task
considering he would be in the presence of some of the strongest mathematical minds of the
time). He also had to show broad interests and an ability to adapt to the Bourbaki style.
If he was silent he would not be invited again. Bourbaki described the reaction of certain
guinea pigs invited to a congress: "they would come out with the impression that it was a
gathering of madmen. They could not imagine how these people, shouting sometimes three
or four at a time about mathematics, could ever come up with something intelligent."[DJ]
If a new recruit was showing promise, he would continue to be invited and would gradually
become a member of Bourbaki without any formal announcement. Although complete
anonymity was impossible, Bourbaki was never discussed with the outside world. It was many
years before Bourbaki members agreed to speak publicly about their story. The following
table gives the names of some of Bourbaki's collaborators.
1st generation (founding fathers): H. Cartan, C. Chevalley, J. Delsarte, J. Dieudonné, A. Weil

2nd generation (invited after WWII): J. Dixmier, R. Godement, S. Eilenberg, J.L. Koszul, P. Samuel, J.P. Serre, L. Schwartz

3rd generation: A. Borel, F. Bruhat, P. Cartier, A. Grothendieck, S. Lang, J. Tate

3 generations of Bourbaki (membership according to Pierre Cartier)[SM]. Note: there
have been a great number of Bourbaki contributors, some lasting longer than others; this table
gives the members listed by Pierre Cartier. Different sources list different "official members";
in fact the Bourbaki website lists J. Coulomb, C. Ehresmann, R. de Possel and S. Mandelbrojt
as 1st generation members.[BW]
Bourbaki congress 1938, from left to right: S. Weil, C. Pisot, A. Weil, J. Dieudonné, C.
Chabauty, C. Ehresmann, J. Delsarte.
The Books
The Bourbaki books were the first to have such a tight organization, and the first to use an
axiomatic presentation. They tried as often as possible to start from the general and work
towards the particular, working with the belief that mathematics is fundamentally simple
and that for each mathematical question there is an optimal way of answering it. This
required an extremely rigid structure and notation. In fact the first six books of "Éléments de
mathématique" use a completely linearly-ordered reference system: any reference
at a given spot can only be to something earlier in the text or in an earlier book. This
did not please all of its readers, as Borel elaborates: "I was rather put off by the very dry
style, without any concession to the reader, the apparent striving for the utmost generality,
the inflexible system of internal references and the total absence of outside ones". However,
Bourbaki's style was in fact so efficient that a lot of its notation and vocabulary is still
in current usage. Weil recalls that his granddaughter was impressed when she learned that
he had been personally responsible for the symbol ∅ for the empty set,[WA] and Chevalley
explains that to "bourbakise" now means to take a text that is considered screwed up and
to arrange it and improve it, concluding that "it is the notion of structure which is truly
bourbakique".[GD]
As well as ∅, Bourbaki is responsible for the introduction of ⇒ (the implication arrow);
of N, R, C, Q and Z (respectively the natural, real, complex and rational numbers and the
integers); of ∁A (the complement of a set A); as well as of the words bijective, surjective and
injective.[DR]
The Decline
Once Bourbaki had finally finished its first six books, the obvious question was "what next?".
The founding members, who (not intentionally) had often carried most of the weight, were now
approaching mandatory retirement age. The group had to start looking at more specialized
topics, having covered the basics in their first books. But was the highly structured Bourbaki
style the best way to approach these topics? The motto "everyone must be interested in
everything" was becoming much more difficult to enforce. (It was easy for the first six
books, whose contents are considered essential knowledge by most mathematicians.) Pierre
Cartier was working with Bourbaki at this point. He says "in the forties you can say that
Bourbaki knew where to go: his goal was to provide the foundation for mathematics".[SM] It
seemed now that they did not know where to go. Nevertheless, Bourbaki kept publishing.
Its second series (falling short of Dieudonné's plan of 27 books encompassing most of modern
mathematics [BA]) consisted of two very successful books:
Book VII Commutative algebra
Book VIII Lie Groups
However, Cartier claims that by the end of the seventies Bourbaki's method was understood,
and many textbooks were being written in its style: "Bourbaki was left without a task. (...)
With their rigid format they were finding it extremely difficult to incorporate new mathematical
developments."[SM] To add to its difficulties, Bourbaki was now becoming involved
in a battle with its publishing company over royalties and translation rights. The matter was
settled in 1980 after a "long and unpleasant" legal process where, as one Bourbaki member
put it, "both parties lost and the lawyer got rich".[SM] In 1983 Bourbaki published its last
volume: IX Spectral Theory.
By that time, Cartier says, Bourbaki was a dinosaur, the head too far away from the tail.
Explaining: "when Dieudonné was the "scribe of Bourbaki" every printed word came from
his pen. With his fantastic memory he knew every single word. You could say "Dieudonné,
what is the result about so and so?" and he would go to the shelf and take down the book and
open it to the right page. After Dieudonné retired no one was able to do this. So Bourbaki
lost awareness of his own body, the 40 published volumes."[SM] Now, after almost twenty
years without a significant publication, is it safe to say the dinosaur has become extinct?¹
But since Nicolas Bourbaki never in fact existed, and was nothing but a clever teaching and
research ploy, could he ever be said to be extinct?
REFERENCES
[BL] L. BEAULIEU: A Parisian Café and Ten Proto-Bourbaki Meetings (1934–1935), The Mathematical
Intelligencer Vol. 15 No. 1, 1993, pp. 27–35.
[BCCC] A. BOREL, P. CARTIER, K. CHANDRASEKHARAN, S. CHERN, S. IYANAGA: André
Weil (1906–1998), Notices of the AMS Vol. 46 No. 4, 1999, pp. 440–447.
[BA] A. BOREL: Twenty-Five Years with Nicolas Bourbaki, 1949–1973, Notices of the AMS Vol. 45
No. 3, 1998, pp. 373–380.
[BN] N. BOURBAKI: Théorie des Ensembles, de la collection Éléments de Mathématique, Hermann,
Paris 1970.
[BW] Bourbaki website: [online] at www.bourbaki.ens.fr.
[CH] H. CARTAN: André Weil: Memories of a Long Friendship, Notices of the AMS Vol. 46 No. 6,
1999, pp. 633–636.
[DR] R. DÉCAMPS: Qui est Nicolas Bourbaki?, [online] at http://faq.maths.free.fr.
[DJ] J. DIEUDONNÉ: The Work of Nicholas Bourbaki, American Math. Monthly 77, 1970, pp. 134–145.
[EY] Encyclopédie Yahoo: Nicolas Bourbaki, [online] at http://fr.encylopedia.yahoo.com.
[GD] D. GUEDJ: Nicholas Bourbaki, Collective Mathematician: An Interview with Claude Chevalley,
The Mathematical Intelligencer Vol. 7 No. 2, 1985, pp. 18–22.
[JA] A. JACKSON: Interview with Henri Cartan, Notices of the AMS Vol. 46 No. 7, 1999, pp. 782–788.
[SM] M. SENECHAL: The Continuing Silence of Bourbaki: An Interview with Pierre Cartier, The
Mathematical Intelligencer, No. 1, 1998, pp. 22–28.
[WA] A. WEIL: The Apprenticeship of a Mathematician, Birkhäuser Verlag 1992, pp. 93–122.
¹ Today what remains is "L'Association des Collaborateurs de Nicolas Bourbaki", which organizes Bourbaki
seminars three times a year. These are international conferences, hosting over 200 mathematicians who come
to listen to presentations on topics chosen by Bourbaki (or the A.C.N.B.). Their last publication was in 1998:
chapter 10 of Book VII, Commutative Algebra.
Version: 6 Owner: Daume Author(s): Daume
10.2 Erdős Number
A low Erdős number is a status symbol among 20th century mathematicians and is similar
to the six-degrees-of-separation concept.
Let c(j) be the Erdős number of person j. Your Erdős number is
• 0 if you are Paul Erdős
• min{c(r) | r ∈ A} + 1, where A is the set of all persons you have authored a paper
with.
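The recursive minimum in the definition is just shortest-path distance from Erdős in the coauthorship graph, so it can be computed by breadth-first search. A minimal sketch; the graph, the names, and the `erdos_numbers` helper are invented for illustration:

```python
from collections import deque

def erdos_numbers(coauthors, root="Erdos"):
    """Breadth-first search over a coauthorship graph.

    `coauthors` maps each person to the set of people they have written
    a paper with; a person's Erdos number is their distance from `root`.
    """
    dist = {root: 0}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other not in dist:          # first visit = shortest distance
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist  # people not in the result have no (finite) Erdos number

graph = {
    "Erdos": {"A", "B"},
    "A": {"Erdos", "C"},
    "B": {"Erdos"},
    "C": {"A"},
}
print(erdos_numbers(graph))
```

People in a different connected component of the graph never enter the queue, matching the convention that their Erdős number is undefined (or infinite).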
Version: 7 Owner: tz26 Author(s): tz26
Chapter 11
0300 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)
11.1 Burali-Forti paradox
The Burali-Forti paradox demonstrates that the class of all ordinals is not a set. If there
were a set of all ordinals, Ord, then it would follow that Ord was itself an ordinal, and
therefore that Ord ∈ Ord. Even if sets in general are allowed to contain themselves, ordinals
cannot, since they are defined so that ∈ is well founded over them.
This paradox is similar to both Russell's paradox and Cantor's paradox, although it predates
both. All of these paradoxes prove that a certain object is "too large" to be a set.
Version: 2 Owner: Henry Author(s): Henry
11.2 Cantor’s paradox
Cantor’s paradox demonstrates that there can be no largest cardinality. In particular,
there must be an unlimited number of inﬁnite cardinalities. For suppose that α were the
largest cardinal. Then we would have [P(α)[ = [α[. Suppose 1 : α → P(α) is a bijection
proving their equicardinality. Then A = ¦β ∈ α [ β ∈ 1(β)¦ is a subset of α, and so there
is some γ ∈ α such that 1(γ) = A. But γ ∈ A ↔γ ∈ A, which is a paradox.
The key part of the argument strongly resembles Russell’s paradox, which is in some sense
a generalization of this paradox.
Besides allowing an unbounded number of cardinalities as ZF set theory does, this paradox
could be avoided by a few other tricks, for instance by not allowing the construction of a
power set or by adopting paraconsistent logic.
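The diagonal set A = {β | β ∉ f(β)} from the proof can be watched in action on a tiny finite set: no map from S into its power set ever has its own diagonal set in its image, so no such map is surjective. This is only the finite shadow of the argument (|S| < |P(S)|); the helper names are ours:

```python
from itertools import combinations, product

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1)
            for c in combinations(s, r)]

def diagonal_set(f, domain):
    """The set A = {x in domain : x not in f(x)} from the proof."""
    return frozenset(x for x in domain if x not in f(x))

# Exhaustively check every map f : S -> P(S) for a 3-element S:
# the diagonal set is never a value of f, so f is never surjective.
S = [0, 1, 2]
subsets = powerset(S)
for images in product(subsets, repeat=len(S)):
    f = dict(zip(S, images))
    A = diagonal_set(lambda x: f[x], S)
    assert all(f[x] != A for x in S)
print("no map from S onto its power set exists")
```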
Version: 2 Owner: Henry Author(s): Henry
11.3 Russell’s paradox
Suppose that for any coherent proposition P(x), we can construct a set {x : P(x)}. Let
S = {x : x ∉ x}. Suppose S ∈ S; then, by definition, S ∉ S. Likewise, if S ∉ S, then by
definition S ∈ S. Therefore, we have a contradiction. Bertrand Russell gave this paradox as
an example of how a purely intuitive set theory can be inconsistent. The regularity axiom,
one of the Zermelo-Fraenkel axioms, was devised to avoid this paradox by prohibiting self-swallowing
sets.
An interpretation of Russell's paradox without the formal language of set theory can be
stated as follows: "If a barber shaves exactly those who do not shave themselves, does he
shave himself?" If he shaves himself, we have a contradiction, since he shaves only those
who do not shave themselves. If he does not shave himself, we also have a contradiction,
since he shaves everyone who does not shave himself, and in this case he is such a person.
Either way we have a contradiction.
Version: 5 Owner: Daume Author(s): Daume, vampyr
11.4 biconditional
A biconditional is a truth function that is true only in the case that both parameters are true
or both are false. For example, "a if and only if b", "a just in case b", as well as "b implies a and a
implies b" are all ways of stating a biconditional in English. Symbolically the biconditional
is written as
a ↔ b
or
a ⇔ b
a b a ↔ b
F F T
F T F
T F F
T T T
In addition, the biconditional function is sometimes written as "iff", meaning "if and only
if".
The biconditional gets its name from the fact that it is really two conditionals in conjunction,
(a → b) ∧ (b → a)
This fact is important to recognize when writing a mathematical proof, as both conditionals
must be proven independently.
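The two-conditionals reading can be checked directly against the truth table above; a small sketch (the `iff` helper is ours):

```python
def iff(a: bool, b: bool) -> bool:
    """Biconditional, built as the conjunction of the two conditionals."""
    def implies(p: bool, q: bool) -> bool:
        return (not p) or q
    return implies(a, b) and implies(b, a)

# Reproduce the truth table: true exactly when a and b agree.
for a in (False, True):
    for b in (False, True):
        print(a, b, iff(a, b))
```

For Booleans this coincides with plain equality, `a == b`, which is how it is usually coded in practice.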
Version: 8 Owner: akrowne Author(s): akrowne
11.5 bijection
Let X and Y be sets. A function f : X → Y that is one-to-one and onto is called a bijection
or bijective function from X to Y.
When X = Y, f is also called a permutation of X.
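For finite sets the two conditions can be tested directly; a sketch where a function is represented as a dict (the helper name is ours):

```python
def is_bijection(f: dict, codomain: set) -> bool:
    """Check a finite function (domain = f's keys) for one-to-one and onto."""
    values = list(f.values())
    injective = len(values) == len(set(values))   # no value is repeated
    surjective = set(values) == set(codomain)     # every target is hit
    return injective and surjective

assert is_bijection({1: "a", 2: "b"}, {"a", "b"})        # a bijection
assert not is_bijection({1: "a", 2: "a"}, {"a", "b"})    # not one-to-one
assert not is_bijection({1: "a"}, {"a", "b"})            # not onto
```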
Version: 8 Owner: mathcam Author(s): mathcam, drini
11.6 cartesian product
For any sets A and B, the cartesian product A × B is the set consisting of all ordered pairs
(a, b) where a ∈ A and b ∈ B.
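For finite sets this is exactly what `itertools.product` enumerates; a small sketch:

```python
from itertools import product

A = {1, 2}
B = {"x", "y"}

# All ordered pairs (a, b) with a in A and b in B.
cartesian = set(product(A, B))
print(cartesian)
```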
Version: 1 Owner: djao Author(s): djao
11.7 chain
Let B ⊆ A, where A is ordered by ≤. B is a chain in A if any two elements of B are
comparable.
That is, B is a linearly ordered subset of A.
Version: 1 Owner: akrowne Author(s): akrowne
11.8 characteristic function
Definition Suppose A is a subset of a set X. Then the function

χ_A(x) = 1 when x ∈ A, and 0 when x ∈ X \ A

is the characteristic function for A.

Properties

Suppose A, B are subsets of a set X.

1. For set intersections and set unions, we have
χ_{A ∩ B} = χ_A χ_B,
χ_{A ∪ B} = χ_A + χ_B − χ_{A ∩ B}.

2. For the symmetric difference,
χ_{A △ B} = χ_A + χ_B − 2 χ_{A ∩ B}.

3. For the set complement,
χ_{∁A} = 1 − χ_A.

Remarks

A synonym for characteristic function is indicator function [1].
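The three identities can be spot-checked pointwise on finite sets; a sketch (the `char` helper is ours, and Python's `&`, `|`, `^`, `-` are set intersection, union, symmetric difference, and complement within X):

```python
def char(A):
    """Characteristic function of the set A."""
    return lambda x: 1 if x in A else 0

X = set(range(10))
A, B = {1, 2, 3}, {3, 4}
for x in X:
    # Property 1: intersection and union.
    assert char(A & B)(x) == char(A)(x) * char(B)(x)
    assert char(A | B)(x) == char(A)(x) + char(B)(x) - char(A & B)(x)
    # Property 2: symmetric difference.
    assert char(A ^ B)(x) == char(A)(x) + char(B)(x) - 2 * char(A & B)(x)
    # Property 3: complement taken inside X.
    assert char(X - A)(x) == 1 - char(A)(x)
```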
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
Version: 6 Owner: bbukh Author(s): bbukh, matte, vampyr
11.9 concentric circles
A collection of circles is said to be concentric if they have the same center. The region
between two concentric circles is an annulus.
Version: 1 Owner: dublisk Author(s): dublisk
11.10 conjunction
A conjunction is true only when both parameters (called conjuncts) are true. In English,
conjunction is denoted by the word "and". Symbolically, we represent it as ∧, or as multiplication
applied to Boolean parameters. The conjunction of a and b would be written
a ∧ b
or, in an algebraic context,
a · b
or
ab
The truth table for conjunction is
a b a ∧ b
F F F
F T F
T F F
T T T
Version: 6 Owner: akrowne Author(s): akrowne
11.11 disjoint
Two sets X and Y are disjoint if their intersection X ∩ Y is the empty set.
Version: 1 Owner: djao Author(s): djao
11.12 empty set
An empty set ∅ is a set that contains no elements. The Zermelo-Fraenkel axioms of set theory
postulate that there exists an empty set.
Version: 2 Owner: djao Author(s): djao
11.13 even number
Definition Suppose k is an integer. If there exists an integer n such that k = 2n + 1, then
k is an odd number. If there exists an integer n such that k = 2n, then k is an even
number.
The concepts of even and odd numbers are most easily understood in the binary base. Then
the above definition simply states that even numbers end with a 0, and odd numbers end
with a 1.
Properties
1. Every integer is either even or odd. This can be proven using induction, or using the
fundamental theorem of arithmetic.
2. An integer k is even (odd) if and only if k² is even (odd).
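Both the definition and property 2 can be spot-checked over a range of integers; a sketch (the helper name is ours):

```python
def is_even(k: int) -> bool:
    """Even iff k = 2n for some integer n, i.e. the last binary digit is 0."""
    return k % 2 == 0

assert is_even(-4) and is_even(0) and not is_even(7)
# The binary-base remark: the lowest bit of an even number is 0.
assert all((k % 2 == 0) == (bin(k)[-1] == "0") for k in range(100))
# Property 2: k and k**2 always share their parity.
assert all(is_even(k) == is_even(k * k) for k in range(-50, 50))
```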
Version: 3 Owner: mathcam Author(s): matte
11.14 ﬁxed point
A fixed point x of a function f : X → X is a point that remains constant upon application
of that function, i.e.:
f(x) = x.
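For a function on a finite domain, the fixed points can simply be enumerated; a sketch (the helper name is ours):

```python
def fixed_points(f, domain):
    """All points of the (finite) domain left unchanged by f."""
    return [x for x in domain if f(x) == x]

# x**2 = x exactly at 0 and 1.
print(fixed_points(lambda x: x * x, range(-5, 6)))
```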
Version: 5 Owner: mathwizard Author(s): mathwizard
11.15 inﬁnite
A set S is infinite if it is not finite; that is, there is no n ∈ N for which there is a bijection
between n and S. Hence an infinite set has a cardinality greater than any natural number:
|S| ≥ ℵ₀
Infinite sets can be divided into countable and uncountable. For a countably infinite set S,
there is a bijection between S and N. This is not the case for uncountably infinite sets (like
the reals and any nontrivial real interval).
Some examples of finite sets:
• The empty set: {}.
• {0, 1}
• {1, 2, 3, 4, 5}
• {1, 1.5, e, π}
Some examples of infinite sets:
• {1, 2, 3, 4, . . .} (countable)
• The primes: {2, 3, 5, 7, 11, . . .} (countable)
• An interval of the reals: (0, 1) (uncountable)
• The rational numbers: Q (countable)
Version: 4 Owner: akrowne Author(s): akrowne, vampyr
11.16 injective function
We say that a function f : X → Y is injective or one-to-one if f(x) = f(y) implies x = y,
or equivalently, whenever x ≠ y, then f(x) ≠ f(y).
Version: 6 Owner: drini Author(s): drini
11.17 integer
The set of integers, denoted by the symbol Z, is the set {. . . , −3, −2, −1, 0, 1, 2, 3, . . .}
consisting of the natural numbers and their negatives.
Mathematically, Z is defined to be the set of equivalence classes of pairs of natural numbers
N × N under the equivalence relation (a, b) ∼ (c, d) if a + d = b + c.
Addition and multiplication of integers are defined as follows:
• (a, b) + (c, d) := (a + c, b + d)
• (a, b) · (c, d) := (ac + bd, ad + bc)
Typically, the class of (a, b) is denoted by the symbol n if b < a (resp. −n if a < b), where n is
the unique natural number such that a = b + n (resp. a + n = b). Under this notation, we
recover the familiar representation of the integers as {. . . , −3, −2, −1, 0, 1, 2, 3, . . .}. Here
are some examples:
• 0 = equivalence class of (0, 0) = equivalence class of (1, 1) = . . .
• 1 = equivalence class of (1, 0) = equivalence class of (2, 1) = . . .
• −1 = equivalence class of (0, 1) = equivalence class of (1, 2) = . . .
The set of integers Z under the addition and multiplication operations defined above forms
an integral domain. The integers admit the following ordering relation making Z into an
ordered ring: (a, b) < (c, d) in Z if a + d < b + c in N.
The ring of integers is also a Euclidean domain, with valuation given by the absolute value
function.
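The pair construction can be sketched directly, representing each class by the canonical pair in which at least one entry is 0; the `normalize`, `add` and `mul` helpers are ours:

```python
def normalize(a, b):
    """Canonical representative of the class of (a, b): one entry is 0."""
    m = min(a, b)
    return (a - m, b - m)

def add(p, q):
    """(a, b) + (c, d) = (a + c, b + d), then normalized."""
    return normalize(p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(a, b) * (c, d) = (ac + bd, ad + bc), then normalized."""
    (a, b), (c, d) = p, q
    return normalize(a * c + b * d, a * d + b * c)

# (n, 0) represents n and (0, n) represents -n.
assert add((1, 0), (0, 1)) == (0, 0)    # 1 + (-1) = 0
assert mul((2, 0), (0, 3)) == (0, 6)    # 2 * (-3) = -6
assert normalize(5, 3) == (2, 0)        # (5, 3) ~ (2, 0), i.e. the integer 2
```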
Version: 7 Owner: djao Author(s): djao
11.18 inverse function
Definition Suppose f : X → Y is a mapping between sets X and Y, and suppose
f⁻¹ : Y → X is a mapping that satisfies

f⁻¹ ∘ f = id_X,
f ∘ f⁻¹ = id_Y.

Then f⁻¹ is called the inverse of f, or the inverse function of f.

Remarks

1. The inverse function of a function f : X → Y exists if and only if f is a bijection, that
is, f is an injection and a surjection.

2. When an inverse function exists, it is unique.

3. The inverse function and the inverse image of a set coincide in the following sense.
Suppose f⁻¹(A) is the inverse image of a set A ⊂ Y under a function f : X → Y. If f
is a bijection, then f⁻¹(y) = f⁻¹({y}).
Version: 3 Owner: matte Author(s): matte
11.19 linearly ordered
An ordering ≤ (or <) of A is called linear or total if any two elements of A are comparable.
The pair (A, ≤) is then called a linearly ordered set.
Version: 1 Owner: akrowne Author(s): akrowne
11.20 operator
Synonym of mapping and function. Often used to refer to mappings where the domain and
codomain are, in some sense, spaces of functions.
Examples: differential operator, convolution operator.
Version: 2 Owner: rmilson Author(s): rmilson
11.21 ordered pair
For any sets a and b, the ordered pair (a, b) is the set {{a}, {a, b}}.
The characterizing property of an ordered pair is:
(a, b) = (c, d) ⇔ a = c and b = d,
and the above construction of the ordered pair, as weird as it seems, is actually the simplest
possible formulation which achieves this property.
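The encoding and its characterizing property can be checked with frozensets; a sketch (the helper name is ours):

```python
def kuratowski(a, b):
    """The ordered pair (a, b) encoded as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

# The characterizing property: equal iff the components agree in order.
assert kuratowski(1, 2) == kuratowski(1, 2)
assert kuratowski(1, 2) != kuratowski(2, 1)
# Degenerate case a = b: {{a}, {a, a}} collapses to {{a}}.
assert kuratowski(1, 1) == frozenset({frozenset({1})})
```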
Version: 4 Owner: djao Author(s): djao
11.22 ordering relation
Let S be a set. An ordering relation is a relation ≤ on S such that, for every a, b, c ∈ S:
• Either a ≤ b, or b ≤ a,
• If a ≤ b and b ≤ c, then a ≤ c,
• If a ≤ b and b ≤ a, then a = b.
Given an ordering relation ≤, one can define a relation < by: a < b if a ≤ b and a ≠ b. The
opposite ordering is the relation ≥ given by: a ≥ b if b ≤ a, and the relation > is defined
analogously.
Version: 3 Owner: djao Author(s): djao
11.23 partition
A partition P of a set S is a collection of mutually disjoint nonempty sets such that
∪P = S.
Any partition P of a set S introduces an equivalence relation on S, where each p ∈ P is
an equivalence class. Similarly, given an equivalence relation on S, the collection of distinct
equivalence classes is a partition of S.
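Recovering the partition induced by an equivalence relation can be sketched for a finite set; the helper and the same-remainder-mod-3 relation are illustrative:

```python
def classes_from_relation(S, related):
    """Group a finite set S into the equivalence classes of `related`."""
    classes = []
    for x in S:
        for c in classes:
            # x belongs to c iff it is related to any (hence every) member.
            if related(x, next(iter(c))):
                c.add(x)
                break
        else:
            classes.append({x})
    return classes

# Same remainder mod 3 partitions {0,...,8} into three mutually
# disjoint nonempty classes whose union is the whole set.
parts = classes_from_relation(range(9), lambda a, b: a % 3 == b % 3)
assert {frozenset(p) for p in parts} == {
    frozenset({0, 3, 6}), frozenset({1, 4, 7}), frozenset({2, 5, 8})}
```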
Version: 4 Owner: vampyr Author(s): vampyr
11.24 pullback
Definition Suppose X, Y, Z are sets, and we have maps

f : Y → Z,
Φ : X → Y.

Then the pullback of f under Φ is the mapping

Φ*f : X → Z,
x ↦ (f ∘ Φ)(x).

Let us denote by M(X, Y) the set of all mappings f : X → Y. We then see that Φ* is a
mapping M(Y, Z) → M(X, Z). In other words, Φ* pulls back the set on which f is defined
from Y to X. This is illustrated by the commutative triangle formed by Φ : X → Y,
f : Y → Z, and Φ*f : X → Z.

Properties

1. For any set X, (id_X)* = id_{M(X,X)}.

2. Suppose we have maps

Φ : X → Y,
Ψ : Y → Z

between sets X, Y, Z. Then

(Ψ ∘ Φ)* = Φ* ∘ Ψ*.

3. If Φ : X → Y is a bijection, then Φ* is a bijection and

(Φ*)⁻¹ = (Φ⁻¹)*.

4. Suppose X, Y are sets with X ⊂ Y. Then we have the inclusion map ι : X → Y, and
for any f : Y → Z, we have

ι*f = f|_X,

where f|_X is the restriction of f to X [1].
REFERENCES
1. W. Aitken, De Rham Cohomology: Summary of Lectures 1–4, online.
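In a language with first-class functions, Φ* is just pre-composition, and the definition together with property 2 can be checked directly; a sketch (all names are ours):

```python
def pullback(phi):
    """Phi* : M(Y, Z) -> M(X, Z), sending f to f o Phi."""
    return lambda f: (lambda x: f(phi(x)))

phi = lambda x: x + 1          # Phi : X -> Y
f = lambda y: y * y            # f : Y -> Z
g = pullback(phi)(f)           # Phi* f : X -> Z
assert g(3) == 16              # (f o Phi)(3) = (3 + 1)**2

# Property 2: (Psi o Phi)* = Phi* o Psi* (contravariance), checked pointwise.
psi = lambda y: 2 * y
lhs = pullback(lambda x: psi(phi(x)))(f)
rhs = pullback(phi)(pullback(psi)(f))
assert all(lhs(x) == rhs(x) for x in range(10))
```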
Version: 7 Owner: matte Author(s): matte
11.25 set closed under an operation
A set X is said to be closed under some map f if f maps elements in X to elements in X,
i.e., f : X → X. More generally, suppose Y is the n-fold cartesian product Y = X × · · · × X.
If f is a map f : Y → X, then we also say that X is closed under the map f.
The above definition has no relation with the definition of a closed set in topology. Instead,
one should think of X and f as a closed system.
Examples
1. The set of invertible matrices is closed under matrix inversion. This means that the
inverse of an invertible matrix is again an invertible matrix.
2. Let C(X) be the set of complex-valued continuous functions on some topological space
X. Suppose f, g are functions in C(X). Then we define the pointwise product of f
and g as the function fg : x ↦ f(x)g(x). Since fg is continuous, we have that C(X)
is closed under pointwise multiplication.
In the first example, the operation is of the type X → X. In the latter, pointwise multiplication
is a map C(X) × C(X) → C(X).
Version: 2 Owner: matte Author(s): matte
11.26 signature of a permutation
Let X be a finite set, and let G be the group of permutations of X (see permutation group).
There exists a unique homomorphism χ from G to the multiplicative group {−1, 1} such
that χ(t) = −1 for any transposition (loc. sit.) t ∈ G. The value χ(g), for any g ∈ G,
is called the signature or sign of the permutation g. If χ(g) = 1, g is said to be of even
parity; if χ(g) = −1, g is said to be of odd parity.
Proposition: If X is totally ordered by a relation <, then for all g ∈ G,
χ(g) = (−1)^k(g)    (11.26.1)
where k(g) is the number of pairs (x, y) ∈ X × X such that x < y and g(x) > g(y). (Such a
pair is sometimes called an inversion of the permutation g.)
Proof: This is clear if g is the identity map X → X. If g is any other permutation, then for
some consecutive a, b ∈ X we have a < b and g(a) > g(b). Let h ∈ G be the transposition
of a and b. We have
k(h ∘ g) = k(g) − 1
χ(h ∘ g) = −χ(g)
and the proposition follows by induction on k(g).
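The proposition gives a direct way to compute the sign: count inversions. A small illustration in Python (the function name is ours):

```python
def sign(perm):
    """Signature of a permutation given as a tuple: perm[i] is the image of i.

    chi(g) = (-1)^k(g), where k(g) counts inversions, i.e. pairs
    x < y with perm[x] > perm[y]."""
    n = len(perm)
    k = sum(1 for x in range(n) for y in range(x + 1, n) if perm[x] > perm[y])
    return (-1) ** k

print(sign((0, 1, 2)))  # identity: 1
print(sign((1, 0, 2)))  # one transposition: -1
print(sign((1, 2, 0)))  # 3-cycle, two inversions: 1
```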
Version: 4 Owner: drini Author(s): Larry Hammick
11.27 subset
Given two sets A and B, we say that A is a subset of B (which we denote as A ⊆ B or
simply A ⊂ B) if every element of A is also in B. That is, the following implication holds:
x ∈ A ⇒ x ∈ B.
Some examples: The set A = {d, r, i, t, o} is a subset of the set B = {p, e, d, r, i, t, o} because
every element of A is also in B. That is, A ⊆ B.
On the other hand, if C = {p, e, d, r, o}, then neither is A a subset of C (because t ∈ A but t ∉ C)
nor is C a subset of A (because p ∈ C but p ∉ A). The fact that A is not a subset of C is
written as A ⊄ C. In this example we also have C ⊄ A.
If X ⊆ Y and Y ⊆ X, it must be the case that X = Y.
Every set is a subset of itself, and the empty set is a subset of every other set. The set A is
called a proper subset of B if A ⊂ B and A ≠ B (in this case we do not use A ⊆ B).
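These relations can be checked directly with Python's built-in set operators; the particular sets below are ours, chosen for illustration:

```python
A = {'d', 'r', 'i', 't', 'o'}
B = {'p', 'e', 'd', 'r', 'i', 't', 'o'}
C = {'p', 'e', 'd', 'r', 'o'}

print(A <= B)          # subset: every element of A is in B -> True
print(A <= C, C <= A)  # neither inclusion holds -> False False
print(A < B)           # proper subset: A <= B and A != B -> True
```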
Version: 5 Owner: drini Author(s): drini
11.28 surjective
A function f : X → Y is called surjective or onto if, for every y ∈ Y, there is an x ∈ X
such that f(x) = y.
Equivalently, f : X → Y is onto when its image is all of the codomain:
Im f = Y.
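For finite sets this criterion can be tested literally, by computing the image and comparing it with the codomain. A sketch (names ours):

```python
def is_surjective(f, X, Y):
    """f : X -> Y is onto iff its image {f(x) : x in X} is all of Y."""
    return {f(x) for x in X} == set(Y)

X = {0, 1, 2, 3}
print(is_surjective(lambda x: x % 2, X, {0, 1}))     # True
print(is_surjective(lambda x: x % 2, X, {0, 1, 2}))  # False: 2 is never hit
```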
Version: 2 Owner: drini Author(s): drini
11.29 transposition
Given a set X = {a₁, a₂, …, aₙ}, a transposition is a permutation (bijective function of X
onto itself) f such that there exist indices i, j such that f(aᵢ) = aⱼ, f(aⱼ) = aᵢ, and
f(aₖ) = aₖ for all other indices k.
Example: If X = {a, b, c, d, e}, the function σ given by
σ(a) = a
σ(b) = e
σ(c) = c
σ(d) = d
σ(e) = b
is a transposition.
One of the main results on symmetric groups states that any permutation can be expressed
as a composition of transpositions, and that for any two decompositions of a given permutation,
the number of transpositions is always even or always odd.
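This result can be observed computationally: sorting a permutation by swaps yields one decomposition into transpositions, and the parity of its length is invariant however the swaps are chosen. A sketch (names ours):

```python
def transpositions(perm):
    """Decompose a permutation (given as a tuple: i -> perm[i]) into
    transpositions by repeatedly swapping each entry into place; the
    length of the returned list has the parity of the permutation."""
    p = list(perm)
    swaps = []
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]  # transpose positions i and j
            swaps.append((i, j))
    return swaps

# A 3-cycle decomposes into two transpositions: even parity.
print(len(transpositions((1, 2, 0))))  # 2
```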
Version: 2 Owner: drini Author(s): drini
11.30 truth table
A truth table is a tabular listing of all possible input value combinations for a truth function
and their corresponding output values. For n input variables, there will always be 2ⁿ rows
in the truth table. A sample truth table for (a ∧ b) → c would be
a b c  (a ∧ b) → c
F F F  T
F F T  T
F T F  T
F T T  T
T F F  T
T F T  T
T T F  F
T T T  T
(Note that ∧ represents logical and, while → represents the conditional truth function.)
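Enumerating all 2ⁿ rows is mechanical; a short Python sketch that prints such a table (function name ours):

```python
from itertools import product

def truth_table(f, names):
    """Print all 2^n rows for a truth function f of n Boolean inputs."""
    print(' '.join(names) + '  result')
    for values in product([False, True], repeat=len(names)):
        row = ' '.join('T' if v else 'F' for v in values)
        print(row, ' T' if f(*values) else ' F')

# (a ∧ b) → c as a truth function: false exactly when a ∧ b holds but c fails.
f = lambda a, b, c: (not (a and b)) or c
truth_table(f, ['a', 'b', 'c'])
```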
Version: 4 Owner: akrowne Author(s): akrowne
Chapter 12
03XX – Mathematical logic and
foundations
12.1 standard enumeration
The standard enumeration of {0, 1}* is the sequence of strings s₀ = λ, s₁ = 0, s₂ = 1,
s₃ = 00, s₄ = 01, … in lexicographic order.
The characteristic function of a language A is χ_A : ℕ → {0, 1} such that
χ_A(n) = 1 if sₙ ∈ A, and χ_A(n) = 0 if sₙ ∉ A.
The characteristic sequence of a language A (also denoted χ_A) is the concatenation of the
values of the characteristic function in the natural order.
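The enumeration is easy to generate: list strings by length, and within each length in lexicographic order. A sketch in Python (names ours):

```python
from itertools import count, islice, product

def standard_enumeration():
    """Yield the strings over {0,1} in standard order:
    '', '0', '1', '00', '01', '10', '11', '000', ..."""
    for n in count(0):
        for bits in product('01', repeat=n):
            yield ''.join(bits)

print(list(islice(standard_enumeration(), 7)))
# ['', '0', '1', '00', '01', '10', '11']
```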
Version: 12 Owner: xiaoyanggu Author(s): xiaoyanggu
Chapter 13
03B05 – Classical propositional logic
13.1 CNF
A propositional formula is a CNF formula, meaning it is in conjunctive normal form, if it is a
conjunction of disjunctions of literals (a literal is a propositional variable or its negation).
Hence, a CNF formula is a formula of the form D₁ ∧ D₂ ∧ … ∧ Dₙ, where each Dᵢ is of the form
ℓᵢ₁ ∨ ℓᵢ₂ ∨ … ∨ ℓᵢₘ for literals ℓᵢⱼ and some m.
Example: (x ∨ y ∨ ¬z) ∧ (¬y ∨ u ∨ w) ∧ (x ∨ v ∨ ¬w).
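A CNF formula is naturally represented as a list of clauses, each a list of (variable, negated) literals; it is satisfied when every clause contains at least one true literal. A sketch (the representation is ours):

```python
def eval_cnf(cnf, assignment):
    """Evaluate a CNF formula: a list of clauses, each clause a list of
    (variable, negated) pairs. True iff every clause has a true literal."""
    return all(
        any(assignment[var] != negated for (var, negated) in clause)
        for clause in cnf
    )

# (x ∨ ¬y) ∧ (y ∨ z)
cnf = [[('x', False), ('y', True)], [('y', False), ('z', False)]]
print(eval_cnf(cnf, {'x': True, 'y': False, 'z': True}))   # True
print(eval_cnf(cnf, {'x': False, 'y': True, 'z': False}))  # False
```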
Version: 2 Owner: iddo Author(s): iddo
13.2 Proof that contrapositive statement is true using
logical equivalence
You can see that the contrapositive of an implication is true by considering the following:
The statement p ⇒ q is logically equivalent to ¬p ∨ q, which can also be written as q ∨ ¬p.
By the same token, the contrapositive statement ¬q ⇒ ¬p is logically equivalent to ¬(¬q) ∨ ¬p
which, using double negation on ¬(¬q), becomes q ∨ ¬p.
This, of course, is the same logical statement.
Version: 2 Owner: sprocketboy Author(s): sprocketboy
13.3 contrapositive
Given an implication of the form
p → q
("p implies q"), the contrapositive of this implication is
¬q → ¬p
("not q implies not p").
An implication and its contrapositive are equivalent statements. When proving a theorem,
it is often more convenient or more intuitive to prove the contrapositive instead.
Version: 3 Owner: vampyr Author(s): vampyr
13.4 disjunction
A disjunction is true if either of its parameters (called disjuncts) is true. Disjunction
does not correspond exactly to "or" in English, which is often exclusive (see exclusive or).
Disjunction uses the symbol ∨, or sometimes + when taken in an algebraic context. Hence,
the disjunction of a and b would be written
a ∨ b
or
a + b
The truth table for disjunction is
a b  a ∨ b
F F  F
F T  T
T F  T
T T  T
Version: 8 Owner: akrowne Author(s): akrowne
13.5 equivalent
Two statements A and B are said to be (logically) equivalent if A is true if and only if B is
true (that is, A implies B and B implies A). This is usually written as A ⇔ B. For example,
for any integer z, the statement "z is positive" is equivalent to "z is not negative and z ≠ 0".
Version: 1 Owner: sleske Author(s): sleske
13.6 implication
An implication is a logical construction that essentially tells us that if one condition is true, then
another condition must also be true. Formally it is written
a → b
or
a ⇒ b
which would be read "a implies b", or "a therefore b", or "if a, then b" (to name a few).
Implication is often confused with "if and only if", or the biconditional truth function (⇔).
They are not, however, the same. The implication a → b is true even if only b is true. So
the statement "pigs have wings, therefore it is raining today" is true if it is indeed raining,
despite the fact that the first item is false.
In fact, any implication a → b is called vacuously true when a is false. By contrast, a ⇔ b
would be false if either a or b were by itself false (a ⇔ b is equivalent to (a ∧ b) ∨ (¬a ∧ ¬b),
or in terms of implication, to (a → b) ∧ (b → a)).
It may be useful to remember that a → b only tells you that it cannot be the case that
b is false while a is true; b must "follow" from a (and "false" does follow from "false").
Alternatively, a → b is in fact equivalent to
b ∨ ¬a
The truth table for implication is therefore
a b  a → b
F F  T
F T  T
T F  F
T T  T
Version: 3 Owner: akrowne Author(s): akrowne
13.7 propositional logic
A propositional logic is a logic in which the only objects are propositions, that is,
objects which themselves have truth values. Variables represent propositions, and there are
no relations, functions, or quantifiers except for the constants ⊤ and ⊥ (representing true
and false respectively). The connectives are typically ¬, ∧, ∨, and → (representing negation,
conjunction, disjunction, and implication), however this set is redundant, and other choices
can be used (⊤ and ⊥ can also be considered 0-ary connectives).
A model for propositional logic is just a truth function ν on a set of variables. Such a truth
function can be easily extended to a truth function ν̄ on all formulas which contain only the
variables ν is defined on by adding recursive clauses for the usual definitions of the connectives.
For instance ν̄(α ∧ β) = 1 iff ν̄(α) = ν̄(β) = 1.
Then we say ν ⊨ φ if ν̄(φ) = 1, and we say ⊨ φ if for every ν such that ν̄(φ) is defined,
ν ⊨ φ (and say that φ is a tautology).
Propositional logic is decidable: there is an easy way to determine whether a sentence is a
tautology. It can be done using truth tables, since a truth table for a particular formula can
be easily produced, and the formula is a tautology if every assignment of truth values makes
it true. It is not known whether this method can be made efficient: the equivalent problem of whether
a formula is satisfiable (that is, whether its negation is not a tautology) is a canonical example
of an NP-complete problem.
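The truth-table decision procedure is a few lines of code; it runs in time exponential in the number of variables, which is exactly why its efficiency is in question. A sketch (names ours):

```python
from itertools import product

def is_tautology(f, n):
    """Decide by exhaustive truth table whether an n-variable
    truth function f is true under every assignment."""
    return all(f(*vals) for vals in product([False, True], repeat=n))

# p → p is a tautology; p ∧ q is satisfiable but not a tautology.
print(is_tautology(lambda p: (not p) or p, 1))  # True
print(is_tautology(lambda p, q: p and q, 2))    # False
```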
Version: 3 Owner: Henry Author(s): Henry
13.8 theory
If L is a logical language for some logic 𝔏, and T is a set of formulas of L with no free variables,
then T is a theory of 𝔏.
We write T ⊨ φ for a formula φ if, for every model M of 𝔏 such that M ⊨ T, we have M ⊨ φ.
We write T ⊢ φ if there is a proof of φ from T.
Version: 1 Owner: Henry Author(s): Henry
13.9 transitive
The transitive property of logic is
(a ⇒ b) ∧ (b ⇒ c) ⇒ (a ⇒ c)
where ⇒ is the conditional truth function. From this we can derive that
(a = b) ∧ (b = c) ⇒ (a = c)
Version: 1 Owner: akrowne Author(s): akrowne
13.10 truth function
A truth function is a function that returns one of two values, one of which is interpreted
as "true", and the other of which is interpreted as "false". Typically either "T" and "F" are
used, or "1" and "0", respectively. Using the latter, we can write
f : {0, 1}ⁿ → {0, 1}
to define a truth function f. That is, f is a mapping from any number (n) of true/false (0 or
1) values to a single value, which is 0 or 1.
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 14
03B10 – Classical first-order logic
14.1 Δ₁ bootstrapping
This proves that a number of useful relations and functions are Δ₁ in first order arithmetic,
providing a bootstrapping of parts of mathematical practice into any system including the Δ₁
relations (since the Δ₁ relations are exactly the recursive ones, this includes Turing machines).
First, we want to build a tupling relation which will allow a finite set of numbers to be
encoded by a single number. To do this we first show that D(a, b) ↔ a | b is Δ₁. This is true
since a | b ↔ ∃c ≤ b (a · c = b), a formula with only bounded quantifiers.
Next note that P(x) ↔ x is prime is Δ₁, since P(x) ↔ ¬∃y < x (y ≠ 1 ∧ y | x). Also
A_P(x, y) ↔ P(x) ∧ P(y) ∧ ∀z < y (x < z → ¬P(z)).
These two can be used to define (the graph of) a primality function, p(a) = the (a+1)-th prime.
Let p(a, b) = ∃c < b^(a²) ([2 | c] ∧ [∀q < b ∀r < b (A_P(q, r) → ∀j < a (q^j | c ↔ r^(j+1) | c))] ∧ [b^a | c] ∧ [¬ b^(a+1) | c]).
This rather awkward-looking formula is worth examining, since it illustrates a principle which
will be used repeatedly. c is intended to be a number of the form 2⁰ · 3¹ · 5² and so on. If
it is divisible by b^a but not by b^(a+1), then we know that b must be the (a+1)-th prime. The definition
is so complicated because we cannot just say, as we'd like to, that p(a+1) is the smallest prime
greater than p(a) (since we don't allow recursive definitions). Instead we embed the series
of values this recursion would take into a single number (c) and guarantee that the recursive
relationship holds for at least a terms; then we just check whether the a-th value is b.
Finally, we can define our tupling relation. Technically, since a given relation must have
a fixed arity, we define for each n a function ⟨x₀, …, xₙ⟩ = ∏ᵢ p(i)^(xᵢ+1). Then define (x)ᵢ
to be the i-th element of x when x is interpreted as a tuple, so ⟨(x)₀, …, (x)ₙ⟩ = x. Note
that the tupling relation, even taken collectively, is not total. For instance 5 is not a tuple
(although it is sometimes convenient to view it as a tuple with "empty spaces": ⟨ , , 5⟩). In
situations like this, and also when attempting to extract entries beyond the length, (x)ᵢ = 0
(for instance, (5)₀ = 0). On the other hand there is a 0-ary tupling relation, ⟨⟩ = 1.
Thanks to our definition of p, we have ⟨x₀, …, xₙ⟩ = x ↔ x = p(0)^(x₀+1) ⋯ p(n)^(xₙ+1). This
is clearly Δ₁. (Note that we don't use the ∏ as above, since we don't have that, but since
we have a different tupling function for each n this isn't a problem.)
For the reverse, (x)ᵢ = y ↔ ([p(i)^(y+1) | x] ∧ [¬ p(i)^(y+2) | x]) ∨ ([y = 0] ∧ [¬ p(i) | x]).
Also, define a length function by len(x) = n ↔ [¬ p(n+1) | x] ∧ ∀z < n [p(z) | x], and a
membership relation by in(x, m) ↔ ∃i < len(x) [(x)ᵢ = m].
Armed with this, we can show that all primitive recursive functions are Δ₁. To see this, note
that x = 0, the zero function, is trivially recursive, as are x = Sy and the projections
p_{n,m}(x₁, …, xₙ) = xₘ.
The Δ₁ functions are closed under composition, since if φ(x) and ψ(x) both have no unbounded
quantifiers, φ(ψ(x)) obviously doesn't either.
Finally, suppose we have functions f(x) and g(x, n, m) in Δ₁. Then define the primitive
recursion h(x, y) by first defining:
h̄(x, y) = z ↔ len(z) = y ∧ ∀i < y [(z)ᵢ₊₁ = g(x, i, (z)ᵢ)] ∧ [len(z) = 0 ∨ (z)₀ = f(x)]
and then h(x, y) = (h̄(x, y))_y.
Δ₁ is also closed under minimization: if R(x, y) is a Δ₁ relation then μy.R(x, y) is a function
giving the least y satisfying R(x, y). To see this, note that μy.R(x, y) = z ↔ R(x, z) ∧ ∀m <
z ¬R(x, m).
Finally, using primitive recursion it is possible to concatenate sequences. First, to concatenate
a single number: if s = ⟨x₀, …, xₙ⟩ then s *₁ y = s · p(len(s) + 1)^(y+1). Then we can define
the concatenation of s with t = ⟨y₀, …, yₘ⟩ by defining f(s, t) = s and g(s, t, j, i) = j *₁ (t)ᵢ,
and by primitive recursion, there is a function h(s, t, i) whose value is the first i elements of
t appended to s. Then s * t = h(s, t, len(t)).
We can also define *ᵤ, which concatenates only elements of t not appearing in s. This just
requires defining the graph of g to be g(s, t, j, i, x) ↔ [in(s, (t)ᵢ) ∧ x = j] ∨ [¬ in(s, (t)ᵢ) ∧ x =
j *₁ (t)ᵢ].
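The prime-power tupling at the heart of this section can be mirrored directly in code. This sketch uses naive trial division and helper names of our own, encoding ⟨x₀, …, xₙ⟩ as ∏ p(i)^(xᵢ+1):

```python
def nth_prime(i):
    """p(i): the (i+1)-th prime (p(0) = 2, p(1) = 3, ...), by trial division."""
    candidate, found = 1, -1
    while found < i:
        candidate += 1
        if all(candidate % d for d in range(2, candidate)):
            found += 1
    return candidate

def encode(*xs):
    """<x0, ..., xn> = product of p(i)^(x_i + 1)."""
    result = 1
    for i, x in enumerate(xs):
        result *= nth_prime(i) ** (x + 1)
    return result

def decode(x, i):
    """(x)_i: exponent of p(i) in x, minus 1; 0 if p(i) does not divide x."""
    p, e = nth_prime(i), 0
    while x % p == 0:
        x //= p
        e += 1
    return e - 1 if e else 0

t = encode(3, 0, 2)  # 2^4 * 3^1 * 5^3 = 6000
print(decode(t, 0), decode(t, 1), decode(t, 2))  # 3 0 2
print(decode(5, 0))  # 0, matching the convention (5)_0 = 0 above
```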
Version: 6 Owner: Henry Author(s): Henry
14.2 Boolean
Boolean refers to that which can take on the values "true" or "false", or that which concerns
truth and falsity: for example, "Boolean variable", "Boolean logic", "Boolean statement",
etc.
"Boolean" is named for George Boole, the 19th-century mathematician.
Version: 5 Owner: akrowne Author(s): akrowne
14.3 Gödel numbering
A Gödel numbering is any way of assigning numbers to the formulas of a language. This is
often useful in allowing sentences of a language to be self-referential. The number associated
with a formula φ is called its Gödel number and is denoted ⌜φ⌝.
More formally, if L is a language and G is a surjective partial function from the terms of
L to the formulas over L, then G is a Gödel numbering. ⌜φ⌝ may be any term t such that
G(t) = φ. Note that G is not defined within L (there is no formula or object of L representing
G), however properties of it (such as being in the domain of G, being a subformula, and so
on) are.
Although anything meeting the properties above is a Gödel numbering, depending on the
specific language and usage, any of the following properties may also be desired (and can
often be found if more effort is put into the numbering):
• If φ is a subformula of ψ then ⌜φ⌝ < ⌜ψ⌝
• For every number n, there is some φ such that ⌜φ⌝ = n
• G is injective
Version: 4 Owner: Henry Author(s): Henry
14.4 Gödel's incompleteness theorems
Gödel's first and second incompleteness theorems are perhaps the most celebrated results
in mathematical logic. The basic idea behind Gödel's proofs is that by the device of
Gödel numbering, one can formulate properties of theories and sentences as arithmetical
properties of the corresponding Gödel numbers, thus allowing first order arithmetic to speak
of its own consistency, provability of some sentence and so forth.
The original result Gödel proved in his classic paper On Formally Undecidable Propositions
in Principia Mathematica and Related Systems can be stated as
Theorem 1. No theory T axiomatisable in the type system of PM (i.e. in Russell's theory of types)
which contains Peano arithmetic and is ω-consistent proves all true theorems of arithmetic
(and no false ones).
Stated this way, the theorem is an obvious corollary of Tarski's result on the undefinability of truth.
This can be seen as follows. Consider a Gödel numbering G, which assigns to each formula
φ its Gödel number ⌜φ⌝. The set of Gödel numbers of all true sentences of arithmetic is
{⌜φ⌝ | ℕ ⊨ φ}, and by Tarski's result it isn't definable by any arithmetic formula. But
assume there's a theory T, an axiomatisation Ax_T of which is definable in arithmetic and
which proves all true statements of arithmetic. But now ∃P(P is a proof of x from Ax_T)
defines the set of (Gödel numbers of) true sentences of arithmetic, which contradicts Tarski's
result.
The proof given above is highly nonconstructive. A much stronger version can actually be
extracted from Gödel's paper, namely
Theorem 2. There is a primitive recursive function G, s.t. if T is a theory with a p.r.
axiomatisation α, and if all primitive recursive functions are representable in T, then ℕ ⊨
G(α) but T ⊬ G(α).
This second form of the theorem is the one usually proved, although the theorem is usually
stated in a form for which the nonconstructive proof based on Tarski's result would suffice.
The proof of this stronger version is based on a similar idea as Tarski's result.
Consider the formula ∃P(P is a proof of x from α), which defines a predicate Prov_α(x)
which represents provability from α. Assume we have enumerated the open formulae with
one variable in a sequence B_i, so that every open formula occurs. Consider now the sentence
¬Prov(B_x), which defines the non-provability-from-α predicate. Now, since ¬Prov(B_x) is
an open formula with one variable, it must be B_k for some k. Thus we can consider the
closed sentence B_k(k). This sentence is equivalent to ¬Prov(subst(⌜¬Prov_α(x)⌝, k)), but
since subst(⌜¬Prov(x)⌝, k) is just B_k(k), it "asserts its own unprovability".
Since all the steps we took to get the undecided but true sentence B_k(k) were very simple
mechanical manipulations of Gödel numbers guaranteed to terminate in bounded time, we
have in fact produced the p.r. function G required by the statement of the theorem.
The first version of the proof can be used to show that many non-axiomatisable theories
are also incomplete. For example, consider PA + all true Π₁ sentences. Since Π₁ truth is
definable at the Π₂ level, this theory is definable in arithmetic by a formula α. However, it's not
complete, since otherwise ∃p(p is a proof of x from α) would define the set of true sentences
of arithmetic. This can be extended to show that no arithmetically definable theory with
sufficient expressive power is complete.
The second version of Gödel's first incompleteness theorem suggests a natural way to extend
theories to stronger theories which are exactly as sound as the original theories. This sort
of process has been studied by Turing, Feferman, Fenstad and others under the names of
ordinal logics and transfinite recursive progressions of arithmetical theories.
Gödel's second incompleteness theorem concerns what a theory can prove about its own
provability predicate, in particular whether it can prove that no contradiction is provable.
The answer under very general settings is that a theory can't prove that it is consistent
without actually being inconsistent.
The second incompleteness theorem is best presented by means of a provability logic. Consider
an arithmetic theory T which is p.r. axiomatised by α. We extend the language this
theory is expressed in with a new sentence-forming operator P, so that any sentence in
parentheses prefixed by P is a sentence. Thus for example, P(0 = 1) is a formula. Intuitively,
we want P(φ) to express the provability of φ from α. Thus the semantics of our new
language is exactly the same as that of the original language, with the additional rule that
P(φ) is true if and only if α ⊢ φ. There is a slight difficulty here; φ might itself contain
boxed expressions, and we haven't yet provided any semantics for these. The answer is simple:
whenever a boxed expression P(ψ) occurs within the scope of another box, we replace
it with the arithmetical statement Prov_α(ψ). Thus for example the truth of P(P(0 = 1))
is equivalent to α ⊢ Prov_α(⌜0 = 1⌝). Assuming that α is strong enough to prove all true
instances of Prov(⌜φ⌝), we can in fact interpret the whole of the new boxed language by
the translation. This is what we shall do, so formally α ⊢ φ (where φ might contain boxed
sentences) is taken to mean α ⊢ φ*, where φ* is obtained by replacing the boxed expressions
with arithmetical formulae as above.
There are a number of restrictions we must impose on α (and thus on P, the meaning of
which is determined by α). These are known as the Hilbert-Bernays derivability conditions
and they are as follows:
• if α ⊢ φ then α ⊢ P(φ)
• α ⊢ P(φ) → P(P(φ))
• α ⊢ P(φ → ψ) → (P(φ) → P(ψ))
A statement Cons asserts the consistency of α if it is equivalent to ¬P(0 = 1). Gödel's first
incompleteness theorem shows that there is a sentence B_k(k) for which the following is true:
¬P(0 = 1) → Q(B_k(k)) ∧ Q(¬B_k(k)), where Q is the dual of P, i.e. Q(φ) ↔ ¬P(¬φ).
A careful analysis reveals that this is provable in any α which satisfies the derivability
conditions, i.e. α ⊢ ¬P(0 = 1) → Q(B_k(k)) ∧ Q(¬B_k(k)). Assume now that α can prove
¬P(0 = 1), i.e. that α can prove its own consistency. Then α can prove Q(B_k(k)) ∧
Q(¬B_k(k)). But this means that α can prove B_k(k)! Thus α is inconsistent.
Version: 4 Owner: Aatu Author(s): Aatu
14.5 Lindenbaum algebra
Let L be a first order language. We define the equivalence relation ∼ over formulas of L by
ϕ ∼ ψ if and only if ⊢ ϕ ⇔ ψ. Let B = L/∼ be the set of equivalence classes. We define
the operations ⊕ and ⊗ and complementation, denoted ¬[ϕ], on B by:
[ϕ] ⊕ [ψ] = [ϕ ∨ ψ]
[ϕ] ⊗ [ψ] = [ϕ ∧ ψ]
¬[ϕ] = [¬ϕ]
We let 0 = [ϕ ∧ ¬ϕ] and 1 = [ϕ ∨ ¬ϕ]. Then the structure (B, ⊕, ⊗, ¬, 0, 1) is a Boolean algebra,
called the Lindenbaum algebra.
Note that it may be possible to define the Lindenbaum algebra on extensions of first order logic,
as long as there is a notion of formal proof that can allow the definition of the equivalence
relation.
Version: 12 Owner: jihemme Author(s): jihemme
14.6 Lindström's theorem
One of the very first results of the study of model theoretic logics is a characterisation
theorem due to Per Lindström. He showed that classical first order logic is the strongest
logic having the following properties:
• being closed under contradictory negation
• compactness
• the Löwenheim-Skolem theorem
Also, he showed that first order logic can be characterised as the strongest logic for which
the following hold:
• completeness (r.e. axiomatisability)
• the Löwenheim-Skolem theorem
The notion of "strength" used here is as follows. A logic L′ is as strong as or stronger than
L if the class of sets definable in L is contained in the class of sets definable in L′.
Version: 2 Owner: Aatu Author(s): Aatu
14.7 Presburger arithmetic
Presburger arithmetic is a weakened form of arithmetic which includes the structure
ℕ, the constant 0, the unary function S, the binary function +, and the binary relation <.
Essentially, it is Peano arithmetic without multiplication.
Presburger arithmetic is decidable, but it is consequently very limited in what it can express.
Version: 2 Owner: Henry Author(s): Henry
14.8 R-minimal element
Let S be a set and R be a relation on S. An element a ∈ S is said to be R-minimal if and
only if there is no x ∈ S such that xRa.
Version: 1 Owner: jihemme Author(s): jihemme
14.9 Skolemization
Skolemization is a way of removing existential quantifiers from a formula. Variables bound
by existential quantifiers which are not inside the scope of universal quantifiers can simply
be replaced by constants: ∃x[x < 3] can be changed to c < 3, with c a suitable constant.
When the existential quantifier is inside a universal quantifier, the bound variable must
be replaced by a Skolem function of the variables bound by universal quantifiers. Thus
∀x[x = 0 ∨ ∃y[x = y + 1]] becomes ∀x[x = 0 ∨ x = f(x) + 1].
This is used in second order logic to move all existential quantifiers outside the scope of first
order universal quantifiers. This can be done since second order quantifiers can quantify over
functions. For instance ∀¹x ∀¹y ∃¹z φ(x, y, z) is equivalent to ∃²F ∀¹x ∀¹y φ(x, y, F(x, y)).
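The procedure for prenex formulas can be sketched mechanically: walk the quantifier prefix, and replace each existential variable with a fresh constant or a Skolem function of the universals seen so far. A toy illustration (all names ours; substitution here is plain text replacement, so variable names must not overlap other symbols):

```python
import itertools

_fresh = itertools.count()

def skolemize(prefix, matrix):
    """Skolemize a prenex formula. `prefix` is a list of ('forall'|'exists', var)
    pairs and `matrix` is a string. Each existential variable becomes a fresh
    constant, or a Skolem function of the universals preceding it."""
    universals, subst = [], {}
    for quant, var in prefix:
        if quant == 'forall':
            universals.append(var)
        else:
            name = f'f{next(_fresh)}'
            subst[var] = f'{name}({", ".join(universals)})' if universals else name
    for var, term in subst.items():
        matrix = matrix.replace(var, term)
    return [('forall', v) for v in universals], matrix

# ∀x ∃y (x = y + 1) becomes ∀x (x = f0(x) + 1)
result = skolemize([('forall', 'x'), ('exists', 'y')], 'x = y + 1')
print(result)
```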
Version: 1 Owner: Henry Author(s): Henry
14.10 arithmetical hierarchy
The arithmetical hierarchy is a hierarchy of either (depending on the context) formulas
or relations. The relations of a particular level of the hierarchy are exactly the relations
defined by the formulas of that level, so the two uses are essentially the same.
The first level consists of formulas with only bounded quantifiers; the corresponding relations
are also called the primitive recursive relations (this definition is equivalent to the definition
from computer science). This level is called any of Δ⁰₀, Σ⁰₀ and Π⁰₀, depending on context.
A formula φ is Σ⁰ₙ if there is some Δ⁰₁ formula ψ such that φ can be written:
φ(k⃗) = ∃x₁ ∀x₂ ⋯ Q xₙ ψ(k⃗, x⃗)
where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers.
The Σ⁰₁ relations are the same as the recursively enumerable relations.
Similarly, φ is a Π⁰ₙ relation if there is some Δ⁰₁ formula ψ such that:
φ(k⃗) = ∀x₁ ∃x₂ ⋯ Q xₙ ψ(k⃗, x⃗)
where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers.
A formula is Δ⁰ₙ if it is both Σ⁰ₙ and Π⁰ₙ. Since each Σ⁰ₙ formula is just the negation of a Π⁰ₙ
formula and vice-versa, the Σ⁰ₙ relations are the complements of the Π⁰ₙ relations.
The relations in Δ⁰₁ = Σ⁰₁ ∩ Π⁰₁ are the recursive relations.
Higher levels of the hierarchy correspond to broader and broader classes of relations. A formula
or relation which is Σ⁰ₙ (or, equivalently, Π⁰ₙ) for some integer n is called arithmetical.
The superscript 0 is often omitted when it is not necessary to distinguish from the analytic hierarchy.
Functions can be described as being in one of the levels of the hierarchy if the graph of the
function is in that level.
Version: 14 Owner: iddo Author(s): yark, iddo, Henry
14.11 arithmetical hierarchy is a proper hierarchy
By definition, we have Δₙ = Πₙ ∩ Σₙ. In addition, Σₙ ∪ Πₙ ⊆ Δₙ₊₁.
This is proved by vacuous quantification. If R is equivalent to φ(n⃗) then R is equivalent to
∀x φ(n⃗) and ∃x φ(n⃗), where x is some variable that does not occur free in φ.
More significant is the proof that all containments are proper. First, let n ≥ 1 and let U be
universal for the 2-ary Σₙ relations. Then D(x) ↔ U(x, x) is obviously Σₙ. But suppose D ∈ Δₙ.
Then D ∈ Πₙ, so ¬D ∈ Σₙ. Since U is universal, there is some e such that ¬D(x) ↔ U(e, x),
and therefore ¬D(e) ↔ U(e, e) ↔ D(e). This is clearly a contradiction, so D ∈ Σₙ \ Δₙ
and ¬D ∈ Πₙ \ Δₙ.
In addition, consider the recursive join of D and ¬D, defined by
D ⊕ ¬D(x) ↔ (∃y < x[x = 2 · y] ∧ D(y)) ∨ (∃y < x[x = 2 · y + 1] ∧ ¬D(y))
Clearly both D and ¬D can be recovered from D ⊕ ¬D, so it is contained in neither Σₙ nor
Πₙ. However the definition above has no unbounded quantifiers except for those in D and
¬D, so D ⊕ ¬D ∈ Δₙ₊₁ \ (Σₙ ∪ Πₙ).
Version: 3 Owner: Henry Author(s): Henry
14.12 atomic formula
Let L be a first order language, and suppose it has signature Σ. A formula ϕ of L is said to
be atomic if and only if:
1. ϕ = "t₁ = t₂", where t₁ and t₂ are terms; or
2. ϕ = "R(t₁, …, tₙ)", where R ∈ Σ is an n-ary relation symbol.
Version: 1 Owner: jihemme Author(s): jihemme
14.13 creating an inﬁnite model
From the syntactic compactness theorem for first order logic, we get this nice (and useful)
result:
Let T be a theory of first-order logic. If T has finite models of unboundedly large sizes, then
T also has an infinite model.
Define the propositions
Φₙ ⇔ ∃x₁ … ∃xₙ . (x₁ ≠ x₂) ∧ … ∧ (x₁ ≠ xₙ) ∧ (x₂ ≠ x₃) ∧ … ∧ (xₙ₋₁ ≠ xₙ)
(Φₙ says "there exist (at least) n different elements in the world"). Note that … ⊢ Φₙ ⊢
… ⊢ Φ₂ ⊢ Φ₁. Define a new theory
T∞ = T ∪ {Φ₁, Φ₂, …}.
For any finite subset T′ ⊂ T∞, we claim that T′ is consistent: Indeed, T′ contains axioms of
T, along with finitely many of {Φₙ}ₙ≥₁. Let Φₘ correspond to the largest index appearing in
T′. If Mₘ ⊨ T is a model of T with at least m elements (and by hypothesis, such a model
exists), then Mₘ ⊨ T ∪ {Φₘ} ⊢ T′.
So every finite subset of T∞ is consistent; by the compactness theorem for first-order logic,
T∞ is consistent, and by Gödel's completeness theorem for first-order logic it has a model
M. Then M ⊨ T∞ ⊢ T, so M is a model of T with infinitely many elements (M ⊨ Φₙ for
any n, so M has at least n elements for all n).
Version: 3 Owner: ariels Author(s): ariels
14.14 criterion for consistency of sets of formulas
Let L be a first order language, and ∆ ⊆ L be a set of sentences. Then ∆ is consistent if
and only if every finite subset of ∆ is consistent.
Version: 2 Owner: jihemme Author(s): jihemme
14.15 deductions are Δ₁
Using the example of Gödel numbering, we can show that Proves(a, x) (the statement that
a is a proof of x, which will be formally defined below) is Δ₁.
First, Term(x) should be true iff x is the Gödel number of a term. Thanks to primitive
recursion, we can define it by:
Term(x) ↔ ∃i < x[x = ⟨0, i⟩] ∨
x = ⟨5⟩ ∨
∃y < x[x = ⟨6, y⟩ ∧ Term(y)] ∨
∃y, z < x[x = ⟨8, y, z⟩ ∧ Term(y) ∧ Term(z)] ∨
∃y, z < x[x = ⟨9, y, z⟩ ∧ Term(y) ∧ Term(z)]
Then AtForm(x), which is true when x is the Gödel number of an atomic formula, is defined
by:
AtForm(x) ↔ ∃y, z < x[x = ⟨1, y, z⟩ ∧ Term(y) ∧ Term(z)] ∨
∃y, z < x[x = ⟨7, y, z⟩ ∧ Term(y) ∧ Term(z)]
Next, Form(x), which is true only if x is the Gödel number of a formula, is defined recursively
by:
Form(x) ↔ AtForm(x) ∨
∃i, y < x[x = ⟨2, i, y⟩ ∧ Form(y)] ∨
∃y < x[x = ⟨3, y⟩ ∧ Form(y)] ∨
∃y, z < x[x = ⟨4, y, z⟩ ∧ Form(y) ∧ Form(z)]
The definition of QFForm(x), which is true when x is the Gödel number of a quantifier-free
formula, is the same except without the second clause.
Next we want to show that the set of logical tautologies is Δ₁. This will be done by formalizing
the concept of truth tables, which will require some development. First we show that
AtForms(a), which is a sequence containing the (unique) atomic formulas of a, is Δ₁. Define
it by:
AtForms(a, t) ↔ (¬Form(a) ∧ t = 0) ∨
Form(a) ∧ (
∃x, y < a[a = ⟨1, x, y⟩ ∧ t = ⟨a⟩] ∨
∃x, y < a[a = ⟨7, x, y⟩ ∧ t = ⟨a⟩] ∨
∃i, x < a[a = ⟨2, i, x⟩ ∧ t = AtForms(x)] ∨
∃x < a[a = ⟨3, x⟩ ∧ t = AtForms(x)] ∨
∃x, y < a[a = ⟨4, x, y⟩ ∧ t = AtForms(x) *ᵤ AtForms(y)])
We say v is a truth assignment if it is a sequence of pairs, with the first member of each
pair being an atomic formula and the second being either 1 or 0:
TA(v) ↔ ∀i < len(v) ∃x, y < (v)ᵢ [(v)ᵢ = ⟨x, y⟩ ∧ AtForm(x) ∧ (y = 1 ∨ y = 0)]
Then v is a truth assignment for a if v is a truth assignment, a is quantifier free, and every
atomic formula in a is the first member of one of the pairs in v. That is:
TAF(v, a) ↔ TA(v) ∧ QFForm(a) ∧ ∀i < len(AtForms(a)) ∃j < len(v) [((v)ⱼ)₀ = (AtForms(a))ᵢ]
Then we can define when v makes a true by:
True(v, a) ↔ TAF(v, a) ∧ (
AtForm(a) ∧ ∃i < len(v)[((v)ᵢ)₀ = a ∧ ((v)ᵢ)₁ = 1] ∨
∃y < a[a = ⟨3, y⟩ ∧ ¬True(v, y)] ∨
∃y, z < a[a = ⟨4, y, z⟩ ∧ (True(v, y) → True(v, z))])
Then a is a tautology if every truth assignment makes it true:
Taut(a) ↔ ∀v < 2^(2^AtForms(a)) [TAF(v, a) → True(v, a)]
We say that a number a is a deduction of φ if it encodes a proof of φ from a set of axioms
Ax. This means that a is a sequence where for each (a)ᵢ either:
• (a)ᵢ is the Gödel number of an axiom,
• (a)ᵢ is a logical tautology, or
• there are some j, k < i such that (a)ⱼ = ⟨4, (a)ₖ, (a)ᵢ⟩ (that is, (a)ᵢ is a conclusion
under modus ponens from (a)ⱼ and (a)ₖ),
and the last element of a is ⌜φ⌝.
If Ax is Δ₁ (almost every system of axioms, including PA, is Δ₁) then Proves(a, x), which is
true if a is a deduction whose last value is x, is also Δ₁. This is fairly simple to see from the
above results (let Ax(x) be the relation specifying that x is the Gödel number of an axiom):
Proves(a, x) ↔ ∀i < len(a)[Ax((a)ᵢ) ∨ ∃j, k < i[(a)ⱼ = ⟨4, (a)ₖ, (a)ᵢ⟩] ∨ Taut((a)ᵢ)] ∧ (a)_len(a) = x
Version: 5 Owner: Henry Author(s): Henry
14.16 example of G¨odel numbering
We can deﬁne by recursion a function c from formulas of arithmetic to numbers, and the
corresponding G¨odel numbering as the inverse.
The symbols of the language of arithmetic are =, ∀, , →, 0, o, <, +, , the variables ·
i
for any integer i, and ( and ). ( and ) are only used to deﬁne the order of operations, and
should be inferred where appropriate in the deﬁnition below.
We can define a function $e$ by recursion as follows:

• $e(v_i) = \langle 0, i \rangle$
• $e(\phi = \psi) = \langle 1, e(\phi), e(\psi) \rangle$
• $e(\forall v_i \phi) = \langle 2, e(v_i), e(\phi) \rangle$
• $e(\neg \phi) = \langle 3, e(\phi) \rangle$
• $e(\phi \rightarrow \psi) = \langle 4, e(\phi), e(\psi) \rangle$
• $e(0) = \langle 5 \rangle$
• $e(S\phi) = \langle 6, e(\phi) \rangle$
• $e(\phi < \psi) = \langle 7, e(\phi), e(\psi) \rangle$
• $e(\phi + \psi) = \langle 8, e(\phi), e(\psi) \rangle$
• $e(\phi \cdot \psi) = \langle 9, e(\phi), e(\psi) \rangle$

Clearly $e^{-1}$ is a Gödel numbering, with $\ulcorner \phi \urcorner = e(\phi)$.
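The recursion above can be sketched concretely. The entry does not fix a coding of tuples $\langle a_0, \ldots, a_{k-1} \rangle$, so the prime-power coding $\prod_i p_i^{a_i + 1}$ and the tuple-based formula representation below are assumptions of this sketch, not part of the entry:

```python
def primes():
    """Yield 2, 3, 5, ... by trial division."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def seq(*items):
    """Encode the sequence <a_0, ..., a_{k-1}> as prod p_i^(a_i + 1).
    (One standard choice; the entry leaves the tuple coding open.)"""
    code, gen = 1, primes()
    for a in items:
        code *= next(gen) ** (a + 1)
    return code

def e(f):
    """e(phi) for a formula phi given as a nested tuple, e.g.
    ('forall', ('v', 0), ('=', ('v', 0), ('v', 0))) for (forall v0) v0 = v0."""
    tags = {'v': 0, '=': 1, 'forall': 2, 'not': 3, '->': 4,
            '0': 5, 'S': 6, '<': 7, '+': 8, '*': 9}
    head = tags[f[0]]
    if f[0] == 'v':                       # e(v_i) = <0, i>
        return seq(head, f[1])
    if f[0] == '0':                       # e(0) = <5>
        return seq(head)
    return seq(head, *(e(sub) for sub in f[1:]))
```

With this coding $e(0) = \langle 5 \rangle = 2^6$ and $e(v_0) = \langle 0, 0 \rangle = 2 \cdot 3$; distinct formulas receive distinct codes because the tuple coding is injective.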
Version: 3 Owner: Henry Author(s): Henry
14.17 example of well-founded induction

As an example of the use of well-founded induction in the case where the order is not a linear one, I'll prove the fundamental theorem of arithmetic: every natural number has a prime factorization.

First note that the division relation is well-founded; this fact is proven in most algebra books. The minimal elements are the prime numbers. We detail the two steps of the proof:

1. If $n$ is prime, then $n$ is its own factorization into primes, so the assertion is true for the minimal elements.

2. If $n$ is not prime, then $n$ has a nontrivial factorization (by definition of not being prime), i.e. $n = ab$, where $a, b \neq 1$. By induction, $a$ and $b$ have prime factorizations, and we can see that this implies that $n$ has one too. This takes care of case 2.
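The inductive proof above translates directly into a recursive algorithm (a sketch added here for illustration; the entry itself only gives the proof):

```python
def prime_factors(n):
    """Prime factorization by the well-founded induction above: a prime
    is its own factorization (the minimal case); otherwise split
    n = a * b nontrivially and recurse on the strictly smaller parts."""
    if n < 2:
        raise ValueError("no prime factorization for n < 2")
    for a in range(2, int(n ** 0.5) + 1):
        if n % a == 0:                 # nontrivial factorization n = a * b
            return prime_factors(a) + prime_factors(n // a)
    return [n]                         # n is prime
```

Termination is exactly the well-foundedness of the division order: both recursive calls are on proper divisors of $n$.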
Here are other commonly used well-founded sets:

1. ideals of a Noetherian ring ordered by inverse proper inclusion;
2. ideals of an Artinian ring ordered by inclusion;
3. graphs ordered by minors (a graph $A$ is a minor of $B$ if and only if it can be obtained from $B$ by collapsing edges);
4. ordinal numbers;
5. etc.
Version: 4 Owner: jihemme Author(s): jihemme
14.18 first order language

Terms and formulas of first order logic are constructed with the classical logical symbols $\forall$, $\exists$, $\wedge$, $\vee$, $\neg$, $\Rightarrow$, $\Leftrightarrow$, and also $($ and $)$, and a set

$\Sigma = \bigcup_{n \in \omega} \operatorname{Rel}_n \cup \bigcup_{n \in \omega} \operatorname{Fun}_n \cup \operatorname{Const}$

where for each natural number $n$,

• $\operatorname{Rel}_n$ is a (usually countable) set of $n$-ary relation symbols.
• $\operatorname{Fun}_n$ is a (usually countable) set of $n$-ary function symbols.
• $\operatorname{Const}$ is a (usually countable) set of constant symbols.
We require that all these sets be disjoint. The elements of the set $\Sigma$ are the only non-logical symbols that we are allowed to use when we construct terms and formulas. They form the signature of the language. So far they are only symbols, so they don't mean anything. For most structures that we encounter, the set $\Sigma$ is finite, but we allow it to be infinite, even uncountable, as this sometimes makes things easier, and just about everything still works when the signature is uncountable. We also assume that we have an unlimited supply of variables, with the only constraint that the collection of variables form a set, which should be disjoint from the other sets of non-logical symbols.

The arity of a function or relation symbol is the number of parameters the symbol takes. It is usually assumed to be a property of the symbol, and it is bad grammar to use an $n$-ary function or relation with $m$ parameters if $m \neq n$.
Terms are built inductively according to the following rules:

1. Any variable is a term;
2. Any constant symbol is a term;
3. If $f$ is an $n$-ary function symbol, and $t_1, \ldots, t_n$ are terms, then $f(t_1, \ldots, t_n)$ is a term.
With terms in hand, we build formulas inductively by a finite application of the following rules:

1. If $t_1$ and $t_2$ are terms, then $t_1 = t_2$ is a formula;
2. If $R$ is an $n$-ary relation symbol and $t_1, \ldots, t_n$ are terms, then $R(t_1, \ldots, t_n)$ is a formula;
3. If $\varphi$ is a formula, then so is $\neg \varphi$;
4. If $\varphi$ and $\psi$ are formulas, then so is $\varphi \vee \psi$;
5. If $\varphi$ is a formula, and $x$ is a variable, then $\exists x(\varphi)$ is a formula.
The other logical symbols are obtained in the following way:

$\varphi \wedge \psi := \neg(\neg\varphi \vee \neg\psi) \qquad \varphi \Rightarrow \psi := \neg\varphi \vee \psi$

$\varphi \Leftrightarrow \psi := (\varphi \Rightarrow \psi) \wedge (\psi \Rightarrow \varphi) \qquad \forall x.\varphi := \neg(\exists x(\neg\varphi))$

All logical symbols are used when building formulas.
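The inductive rules and definitional abbreviations above can be sketched as plain functions on formulas written as nested tuples (a hypothetical representation chosen for this sketch, not part of the entry):

```python
# Primitives of the construction: atomic formulas, 'not', 'or', 'exists'.
# The remaining connectives are definitional abbreviations.
def Not(p):        return ('not', p)
def Or(p, q):      return ('or', p, q)
def Exists(x, p):  return ('exists', x, p)

def And(p, q):     return Not(Or(Not(p), Not(q)))   # phi AND psi := not(not phi or not psi)
def Implies(p, q): return Or(Not(p), q)             # phi => psi  := not phi or psi
def Iff(p, q):     return And(Implies(p, q), Implies(q, p))
def Forall(x, p):  return Not(Exists(x, Not(p)))    # forall x. phi := not exists x (not phi)
```

For example, $\forall x(x = x)$ unfolds to $\neg\exists x(\neg(x = x))$ as a tree of primitives.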
Version: 8 Owner: jihemme Author(s): jihemme
14.19 first order logic

A logic is first order if it has exactly one type. Usually the term refers specifically to the logic with connectives $\neg$, $\vee$, $\wedge$, $\rightarrow$, and $\leftrightarrow$ and the quantifiers $\forall$ and $\exists$, all given the usual semantics:

• $\neg\phi$ is true iff $\phi$ is not true
• $\phi \vee \psi$ is true iff either $\phi$ is true or $\psi$ is true
• $\forall x \phi(x)$ is true iff $\phi^x_t$ is true for every object $t$ (where $\phi^x_t$ is the result of replacing every unbound occurrence of $x$ in $\phi$ with $t$)
• $\phi \wedge \psi$ is the same as $\neg(\neg\phi \vee \neg\psi)$
• $\phi \rightarrow \psi$ is the same as $(\neg\phi) \vee \psi$
• $\phi \leftrightarrow \psi$ is the same as $(\phi \rightarrow \psi) \wedge (\psi \rightarrow \phi)$
• $\exists x \phi(x)$ is the same as $\neg\forall x \neg\phi(x)$

However, languages with slightly different quantifiers and connectives are sometimes still called first order as long as there is only one type.
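Over a finite universe the semantics above can be run directly; this is a minimal sketch (the tuple representation of formulas is an assumption of the sketch), with the quantifier clause "$\phi^x_t$ is true for every object $t$" becoming iteration over the domain:

```python
def ev(f, dom, env=None):
    """Evaluate formula f over the finite domain dom; env maps
    variables to objects."""
    env = env or {}
    op = f[0]
    if op == 'not':
        return not ev(f[1], dom, env)
    if op == 'or':
        return ev(f[1], dom, env) or ev(f[2], dom, env)
    if op == 'forall':                  # true iff body holds for every t
        x, body = f[1], f[2]
        return all(ev(body, dom, {**env, x: t}) for t in dom)
    if op == 'exists':                  # definable as not-forall-not
        x, body = f[1], f[2]
        return any(ev(body, dom, {**env, x: t}) for t in dom)
    if op == 'atom':                    # ('atom', predicate, *variables)
        pred, xs = f[1], f[2:]
        return pred(*(env[x] for x in xs))
    raise ValueError(op)
```

For example, over the domain $\{0, \ldots, 4\}$ the sentence $\forall x \exists y[x \leq y]$ evaluates to true, while $\forall x \exists y[x < y]$ does not (nothing lies above 4).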
Version: 4 Owner: Henry Author(s): Henry
14.20 first order theories

Let $L$ be a first-order language. A theory in $L$ is a set of sentences of $L$, i.e. a set of formulas of $L$ that have no free variables.

Definition. A theory $T$ is said to be consistent if and only if $T \nvdash \bot$, where $\bot$ stands for "false". In other words, $T$ is consistent if one cannot derive a contradiction from it. If $\varphi$ is a sentence of $L$, then we say $\varphi$ is consistent with $T$ if and only if the theory $T \cup \{\varphi\}$ is consistent.

Definition. A theory $T \subseteq L$ is said to be complete if and only if for every sentence $\varphi \in L$, either $T \vdash \varphi$ or $T \vdash \neg\varphi$.

Lemma. A theory $T$ in $L$ is complete if and only if it is maximal consistent. In other words, $T$ is complete if and only if for every sentence $\varphi \in L$ with $T \nvdash \varphi$, the theory $T \cup \{\varphi\}$ is inconsistent.

Theorem. (Tarski) Every consistent theory $T$ in $L$ can be extended to a complete theory.

Proof: Use Zorn's lemma on the collection of consistent theories extending $T$. ♦
Version: 3 Owner: jihemme Author(s): jihemme
14.21 free and bound variables
In the entry first order language, I have mentioned the use of variables without mentioning what variables really are. A variable is a symbol that is supposed to range over the universe of discourse. Unlike a constant, it has no fixed value.

There are two ways in which a variable can occur in a formula: free or bound. Informally, a variable is said to occur free in a formula $\varphi$ if and only if it is not within the "scope" of a quantifier. For instance, $x$ occurs free in $\varphi$ if and only if it occurs in it as a symbol, and no subformula of $\varphi$ is of the form $\exists x.\psi$. Here the $x$ after the $\exists$ is to be taken literally: it is $x$ and no other symbol.

The set $FV(\varphi)$ of free variables of $\varphi$ is defined by well-founded induction on the construction of formulas. First we define $\operatorname{Var}(t)$, where $t$ is a term, to be the set of all variables occurring in $t$, and then:

$FV(t_1 = t_2) = \operatorname{Var}(t_1) \cup \operatorname{Var}(t_2)$

$FV(R(t_1, \ldots, t_n)) = \bigcup_{k=1}^{n} \operatorname{Var}(t_k)$

$FV(\neg\varphi) = FV(\varphi)$

$FV(\varphi \vee \psi) = FV(\varphi) \cup FV(\psi)$

$FV(\exists x(\varphi)) = FV(\varphi) \setminus \{x\}$
When for some $\varphi$ the set $FV(\varphi)$ is not empty, it is customary to write $\varphi$ as $\varphi(x_1, \ldots, x_n)$, in order to stress the fact that there are some free variables left in $\varphi$, and that those free variables are among $x_1, \ldots, x_n$. When $x_1, \ldots, x_n$ appear free in $\varphi$, they are considered as placeholders, and it is understood that we will have to supply "values" for them when we want to determine the truth of $\varphi$. If $FV(\varphi) = \emptyset$, then $\varphi$ is called a sentence.

If a variable occurs in $\varphi$ but never occurs free, then we say the variable is bound. A variable $x$ is bound if and only if $\exists x(\psi)$ or $\forall x(\psi)$ is a subformula of $\varphi$ for some $\psi$. The problem with this definition is that a variable can occur both free and bound in the same formula. For example, consider the following formula of the language $\{+, \cdot, 0, 1\}$ of ring theory:

$x + 1 = 0 \wedge \exists x(x + y = 1)$

The variable $x$ occurs both free and bound here. However, the following lemma tells us that we can always avoid this situation:

Lemma 1. It is possible to rename the bound variables without affecting the truth of a formula. In other words, if $\varphi = \exists x(\psi)$ or $\forall x(\psi)$, and $z$ is a variable not occurring in $\psi$, then $\vdash \varphi \Leftrightarrow \exists z(\psi(z/x))$, where $\psi(z/x)$ is the formula obtained from $\psi$ by replacing every free occurrence of $x$ by $z$.
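The defining equations for $FV$ can be sketched directly in code (the nested-tuple representation of terms and formulas is an assumption of this sketch):

```python
def term_vars(t):
    """Var(t): the set of all variables occurring in term t. Terms are
    strings (variables) or tuples ('symbol', arg1, ..., argn)."""
    if isinstance(t, str):
        return {t}
    return set().union(set(), *(term_vars(s) for s in t[1:]))

def fv(f):
    """FV(phi), following the five defining equations above."""
    op = f[0]
    if op == '=':                        # FV(t1 = t2) = Var(t1) | Var(t2)
        return term_vars(f[1]) | term_vars(f[2])
    if op == 'not':                      # FV(not phi) = FV(phi)
        return fv(f[1])
    if op in ('or', 'and'):              # FV(phi v psi) = FV(phi) | FV(psi)
        return fv(f[1]) | fv(f[2])
    if op in ('exists', 'forall'):       # FV(exists x phi) = FV(phi) - {x}
        return fv(f[2]) - {f[1]}
    raise ValueError(op)
```

On the ring-theory example $x + 1 = 0 \wedge \exists x(x + y = 1)$ this returns $\{x, y\}$: the bound occurrence of $x$ is removed only inside the quantifier's scope, while the free occurrence in the left conjunct survives.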
Version: 5 Owner: jihemme Author(s): jihemme
14.22 generalized quantifier

Generalized quantifiers are an abstract way of defining quantifiers.

The underlying principle is that a formula quantified by a generalized quantifier is true if the set of elements satisfying the formula belongs to some relation associated with the quantifier.

Every generalized quantifier has an arity, which is the number of formulas it takes as arguments, and a type, which for an $n$-ary quantifier is a tuple of length $n$. The tuple represents the number of quantified variables for each argument.

The most common quantifiers are those of type $\langle 1 \rangle$, including $\forall$ and $\exists$. If $Q$ is a quantifier of type $\langle 1 \rangle$, $M$ is the universe of a model, and $Q_M$ is the relation associated with $Q$ in that model, then $Qx\, \phi(x) \leftrightarrow \{x \in M \mid \phi(x)\} \in Q_M$.

So $\forall_M = \{M\}$, since the quantified formula is only true when all elements satisfy it. On the other hand $\exists_M = P(M) - \{\emptyset\}$.

In general, the monadic quantifiers are those of type $\langle 1, \ldots, 1 \rangle$, and if $Q$ is an $n$-ary monadic quantifier then $Q_M \subseteq P(M)^n$. Härtig's quantifier, for instance, is of type $\langle 1, 1 \rangle$, and $I_M = \{\langle X, Y \rangle \mid X, Y \subseteq M \wedge |X| = |Y|\}$.

A quantifier $Q$ is polyadic if it is of type $\langle n_1, \ldots, n_k \rangle$ where each $n_i \in \mathbb{N}$. Then:

$Q_M \subseteq \prod_i P(M^{n_i})$

These can get quite elaborate; $Wxy\, \phi(x, y)$ is a $\langle 2 \rangle$ quantifier where $X \in W_M \leftrightarrow X$ is a well-ordering. That is, it is true if the set of pairs making $\phi$ true is a well-ordering.
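The clause "$Qx\,\phi(x)$ holds iff $\{x \in M \mid \phi(x)\} \in Q_M$" can be checked on a small finite model; this sketch builds $\forall_M$ and $\exists_M$ explicitly as sets of subsets:

```python
from itertools import combinations

def powerset(M):
    """All subsets of M, as frozensets."""
    return [frozenset(c) for r in range(len(M) + 1)
            for c in combinations(M, r)]

def holds(Q_M, phi, M):
    """Q x phi(x) is true iff {x in M | phi(x)} is in Q_M."""
    return frozenset(x for x in M if phi(x)) in Q_M

M = {0, 1, 2, 3}
forall_M = {frozenset(M)}                      # forall_M = {M}
exists_M = set(powerset(M)) - {frozenset()}    # exists_M = P(M) - {empty set}
```

Any other type-$\langle 1 \rangle$ quantifier on this model is just another choice of the set `Q_M`.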
Version: 1 Owner: Henry Author(s): Henry
14.23 logic
Generally, by logic, people mean first order logic, a formal set of rules for building mathematical statements out of symbols like $\neg$ (negation) and $\rightarrow$ (implication) along with quantifiers like $\forall$ (for every) and $\exists$ (there exists).

More generally, a logic is any set of rules for forming sentences (the logic's syntax) together with rules for assigning truth values to them (the logic's semantics). Normally it includes a (possibly empty) set of types $T$ (also called sorts), which represent the different kinds of objects that the theory discusses (typical examples might be sets, numbers, or sets of numbers). In addition it specifies particular quantifiers, connectives, and variables. Particular theories in the logic can then add relations and functions to fully specify a logical language.
Version: 5 Owner: Henry Author(s): Henry
14.24 proof of compactness theorem for first order logic

The theorem states that if a set of sentences of a first-order language $L$ is inconsistent, then some finite subset of it is inconsistent. Suppose $\Delta \subseteq L$ is inconsistent. Then by definition $\Delta \vdash \bot$, i.e. there is a formal proof of "false" using only assumptions from $\Delta$. Formal proofs are finite objects, so let $\Gamma$ collect all the formulas of $\Delta$ that are used in the proof; then $\Gamma$ is a finite inconsistent subset of $\Delta$.
Version: 1 Owner: jihemme Author(s): jihemme
14.25 proof of principle of transfinite induction

To prove the transfinite induction theorem, we note that the class of ordinals is well-ordered by $\in$. So suppose for some $\Phi$ there are ordinals $\alpha$ such that $\Phi(\alpha)$ is not true. Suppose further that $\Phi$ satisfies the hypothesis, i.e. $\forall \alpha(\forall \beta < \alpha(\Phi(\beta)) \Rightarrow \Phi(\alpha))$. We will reach a contradiction.

The class $C = \{\alpha : \neg\Phi(\alpha)\}$ is not empty. Note that it may be a proper class, but this is not important. Let $\gamma = \min(C)$ be the $\in$-minimal element of $C$. Then by assumption, for every $\lambda < \gamma$, $\Phi(\lambda)$ is true. Thus, by hypothesis, $\Phi(\gamma)$ is true, contradiction.
Version: 8 Owner: jihemme Author(s): jihemme, quadrate
14.26 proof of the well-founded induction principle

This proof is very similar to the proof of the transfinite induction theorem. Suppose $\Phi$ is defined for a well-founded set $(S, R)$, and suppose $\Phi$ is not true for every $a \in S$. Assume further that $\Phi$ satisfies requirements 1 and 2 of the statement. Since $R$ is a well-founded relation, the set $\{a \in S : \neg\Phi(a)\}$ has an $R$-minimal element $m$. This element is either an $R$-minimal element of $S$ itself, in which case condition 1 is violated, or it has $R$-predecessors. In this case, we have by minimality $\Phi(n)$ for every $n$ such that $nRm$, and by condition 2, $\Phi(m)$ is true, contradiction.
Version: 4 Owner: jihemme Author(s): jihemme
14.27 quantifier

A quantifier is a logical symbol which makes an assertion about the set of values which make one or more formulas true. This is an exceedingly general concept; the vast majority of mathematics is done with the two standard quantifiers, $\forall$ and $\exists$.

The universal quantifier $\forall$ takes a variable and a formula and asserts that the formula holds for any value of that variable. A typical example would be a sentence like:

$\forall x[0 \leq x]$

which states that no matter what value $x$ takes, $0 \leq x$.

The existential quantifier $\exists$ is the dual; that is, the formula $\neg\forall x \neg\phi(x)$ is equivalent to $\exists x \phi(x)$. It states that there is some $x$ satisfying the formula, as in

$\exists x[x > 0]$

which states that there is some value of $x$ greater than $0$.
The scope of a quantifier is the portion of a formula where it binds its variables. Note that previous bindings of a variable are overridden within the scope of a quantifier. In the examples above, the scope of the quantifiers was the entire formula, but that need not be the case. The following is a more complicated use of quantifiers:

$\forall x[x = 0 \vee \exists y[x = y + 1 \wedge (y = 0 \vee \exists x[y = x + 1])]]$

Here the scope of the universal quantifier is the entire formula, the scope of the first existential quantifier is $[x = y + 1 \wedge (y = 0 \vee \exists x[y = x + 1])]$, and the scope of the second existential quantifier is $[y = x + 1]$. Within that last scope, all references to $x$ refer to the variable bound by the existential quantifier; it is impossible to refer directly to the one bound by the universal quantifier.

As that example illustrates, it can be very confusing when one quantifier overrides another. Since it does not change the meaning of a sentence to change a bound variable and all bound occurrences of it, it is better form to replace sentences like that with an equivalent but more readable one like:

$\forall x[x = 0 \vee \exists y[x = y + 1 \wedge (y = 0 \vee \exists z[y = z + 1])]]$

These sentences both assert that every number is either equal to zero, or that there is some number one less than it, and that the number one less than it is also either zero or has a number one less than it. [Note: This is not the most useful of sentences. It would be nice to replace this with a mathematically simple sentence which uses nested quantifiers meaningfully.]

The quantifiers need not range over all objects. That is, $\forall x \phi(x)$ may not specify that $x$ can be any object, but rather any object belonging to some class of objects. Similarly $\exists x \phi(x)$ may specify that there is some $x$ within that class which satisfies $\phi$. For instance second order logic has two universal quantifiers, $\forall^1$ and $\forall^2$ (with corresponding existential quantifiers), and variables bound by them range only over the first and second order objects respectively. So $\forall^1 x[0 \leq x]$ only states that all numbers are greater than or equal to $0$, not that sets of numbers are as well (which would be meaningless).
A particular use of a quantifier is called bounded or restricted if it limits the objects to a smaller range. This is not quite the same as the situation mentioned above: there, the definition of the quantifier does not include all objects, while here the quantifier can range over everything, but in a particular formula it does not. This is expressed in first order logic with formulas like these four:

$\forall x[x < a \rightarrow \phi(x)] \quad \forall x[x \in X \rightarrow \phi(x)] \quad \exists x[x < a \wedge \phi(x)] \quad \exists x[x \in X \wedge \phi(x)]$

The restriction is often incorporated into the quantifier. For instance the first example might be written $\forall x < a[\phi(x)]$.

A quantifier is called vacuous if the variable it binds does not appear anywhere in its scope, such as $\forall x \exists y[0 \leq x]$. While vacuous quantifiers do not change the meaning of a sentence, they are occasionally useful in finding an equivalent formula of a specific form.

While these are the most common quantifiers (in particular, they are the only quantifiers appearing in classical first-order logic), some logics use others. The quantifier $\exists! x \phi(x)$, which means that there is a unique $x$ satisfying $\phi(x)$, is equivalent to $\exists x[\phi(x) \wedge \forall y[\phi(y) \rightarrow x = y]]$.
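The displayed equivalence for $\exists!$ can be checked mechanically over a finite domain; this sketch compares the equivalence against a direct count of witnesses:

```python
def exists_unique(phi, dom):
    """exists x [phi(x) and forall y [phi(y) -> x = y]],
    with the quantifiers read over the finite domain dom."""
    return any(phi(x) and all((not phi(y)) or x == y for y in dom)
               for x in dom)
```

On any finite domain this agrees with "the number of witnesses is exactly one", which is the intended meaning of $\exists!$.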
Other quantifiers go beyond the usual two. Examples include interpreting $Qx\, \phi(x)$ to mean that there are an infinite (or uncountably infinite) number of $x$ satisfying $\phi(x)$. More elaborate examples include the branching Henkin quantifier, written:

$\begin{pmatrix} \forall x & \exists y \\ \forall a & \exists b \end{pmatrix} \phi(x, y, a, b)$

This quantifier is similar to $\forall x \exists y \forall a \exists b\, \phi(x, y, a, b)$ except that the choice of $y$ cannot depend on $a$ or $b$, and the choice of $b$ cannot depend on $x$ or $y$. This concept can be further generalized to the game-semantic, or independence-friendly, quantifiers. All of these quantifiers are examples of generalized quantifiers.
Version: 7 Owner: Henry Author(s): Henry
14.28 quantifier free

Let $L$ be a first order language. A formula $\psi$ is quantifier free iff it contains no quantifiers.

Let $T$ be a complete $L$-theory. Let $S \subseteq L$. Then $S$ is an elimination set for $T$ iff for every $\psi(\bar{x}) \in L$ there is some $\phi(\bar{x}) \in S$ so that $T \vdash \forall \bar{x}(\psi(\bar{x}) \leftrightarrow \phi(\bar{x}))$.

In particular, $T$ has quantifier elimination iff the set of quantifier free formulas is an elimination set for $T$. In other words, $T$ has quantifier elimination iff for every $\psi(\bar{x}) \in L$ there is some quantifier free $\phi(\bar{x}) \in L$ so that $T \vdash \forall \bar{x}(\psi(\bar{x}) \leftrightarrow \phi(\bar{x}))$.
Version: 2 Owner: mathcam Author(s): mathcam, Timmy
14.29 subformula
Let $L$ be a first order language and suppose $\varphi, \psi \in L$ are formulas. Then we say that $\varphi$ is a subformula of $\psi$ if and only if one of the following holds:

1. $\psi = \varphi$;
2. $\psi$ is one of $\neg\alpha$, $\forall x(\alpha)$ or $\exists x(\alpha)$, and either $\varphi = \alpha$ or $\varphi$ is a subformula of $\alpha$;
3. $\psi$ is $\alpha \vee \beta$ or $\alpha \wedge \beta$ and either $\varphi = \alpha$, $\varphi = \beta$, or $\varphi$ is a subformula of $\alpha$ or $\beta$.
Version: 2 Owner: jihemme Author(s): jihemme
14.30 syntactic compactness theorem for first order logic

Let $L$ be a first-order language, and $\Delta \subseteq L$ be a set of sentences. If $\Delta$ is inconsistent, then some finite $\Gamma \subseteq \Delta$ is inconsistent.
Version: 2 Owner: jihemme Author(s): jihemme
14.31 transfinite induction

Suppose $\Phi(\alpha)$ is a property defined for every ordinal $\alpha$. The principle of transfinite induction states that if, whenever $\Phi(\beta)$ is true for every $\beta < \alpha$ it follows that $\Phi(\alpha)$ is true, then $\Phi(\alpha)$ is true for every ordinal $\alpha$. Formally:

$\forall \alpha(\forall \beta(\beta < \alpha \Rightarrow \Phi(\beta)) \Rightarrow \Phi(\alpha)) \Rightarrow \forall \alpha(\Phi(\alpha))$

The principle of transfinite induction is very similar to the principle of finite induction, except that it is stated in terms of the whole class of the ordinals.
Version: 7 Owner: jihemme Author(s): jihemme, quadrate
14.32 universal relation
If $\Phi$ is a class of $n$-ary relations with $\vec{x}$ as the only free variables, an $(n+1)$-ary formula $\psi$ is universal for $\Phi$ if for any $\phi \in \Phi$ there is some $e$ such that $\psi(e, \vec{x}) \leftrightarrow \phi(\vec{x})$. In other words, $\psi$ can simulate any element of $\Phi$.

Similarly, if $\Phi$ is a class of functions of $\vec{x}$, a formula $\psi$ is universal for $\Phi$ if for any $\phi \in \Phi$ there is some $e$ such that $\psi(e, \vec{x}) = \phi(\vec{x})$.
Version: 3 Owner: Henry Author(s): Henry
14.33 universal relations exist for each level of the arithmetical hierarchy

Let $F \in \{\Sigma_n, \Delta_n, \Pi_n\}$ and take any $k \in \mathbb{N}$. Then there is a $(k+1)$-ary relation $U \in F$ such that $U$ is universal for the $k$-ary relations in $F$.

Proof

First we prove the case where $F = \Delta_1$, the recursive relations. We use the example of a Gödel numbering.

Define $R$ to be a $(k+2)$-ary relation such that $R(e, \vec{x}, c)$ holds if:

• $e = \ulcorner \phi \urcorner$
• $c$ is a deduction of either $\phi(\vec{x})$ or $\neg\phi(\vec{x})$

Since deductions are $\Delta_1$, it follows that $R$ is $\Delta_1$. Then define $U'(e, \vec{x})$ to be the least $c$ such that $R(e, \vec{x}, c)$ holds, and let $U(e, \vec{x})$ hold if the last element of the deduction $U'(e, \vec{x})$ is $\ulcorner \phi(\vec{x}) \urcorner$, that is, if the deduction found proves $\phi(\vec{x})$ rather than $\neg\phi(\vec{x})$. This is again $\Delta_1$ since the $\Delta_1$ functions are closed under minimization.

If $\phi$ is any $k$-ary $\Delta_1$ relation then $\phi(\vec{x}) \leftrightarrow U(\ulcorner \phi \urcorner, \vec{x})$.
Now take $F$ to be the $k$-ary relations in either $\Sigma_n$ or $\Pi_n$. Call the universal relation for the $(k+n)$-ary $\Delta_1$ relations $U_\Delta$. Then any $\phi \in F$ is equivalent to a relation of the form $Q_1 y_1 Q_2 y_2 \cdots Q_n y_n\, \psi(\vec{x}, \vec{y})$ where $\psi \in \Delta_1$, and so $U(e, \vec{x}) = Q_1 y_1 Q_2 y_2 \cdots Q_n y_n\, U_\Delta(e, \vec{x}, \vec{y})$ satisfies $U(\ulcorner \psi \urcorner, \vec{x}) \leftrightarrow \phi(\vec{x})$. Then $U$ is universal for $F$.

Finally, if $F$ is the $k$-ary $\Delta_n$ relations and $\phi \in F$ then $\phi$ is equivalent to relations of the forms $\exists y_1 \forall y_2 \cdots Q y_n\, \psi(\vec{x}, \vec{y})$ and $\forall z_1 \exists z_2 \cdots Q z_n\, \eta(\vec{x}, \vec{z})$. If the $k$-ary universal relations for $\Sigma_n$ and $\Pi_n$ are $U_\Sigma$ and $U_\Pi$ respectively then $\phi(\vec{x}) \leftrightarrow U_\Sigma(\ulcorner \psi \urcorner, \vec{x}) \wedge U_\Pi(\ulcorner \eta \urcorner, \vec{x})$.
Version: 2 Owner: Henry Author(s): Henry
14.34 well-founded induction

The principle of well-founded induction is a generalization of the principle of transfinite induction.

Definition. Let $S$ be a non-empty set, and $R$ be a partial order relation on $S$. Then $R$ is said to be a well-founded relation if and only if every nonempty subset $X \subseteq S$ has an $R$-minimal element. In the special case where $R$ is a total order, we say $S$ is well-ordered by $R$. The structure $(S, R)$ is called a well-founded set.

Note that $R$ is by no means required to be a total order. A classical example of a well-founded set that is not totally ordered is the set $\mathbb{N}$ of natural numbers ordered by division, i.e. $aRb$ if and only if $a$ divides $b$ and $a \neq 1$. The $R$-minimal elements of this order are the prime numbers.
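The divisibility example can be checked on a finite slice of $\mathbb{N}$ (an illustration added here, not part of the entry; $R$ is taken strict, i.e. $a \neq b$, so that no element is its own predecessor):

```python
# aRb iff a divides b, a != 1 and a != b. On {2, ..., 30} the
# R-minimal elements (those with no R-predecessor) are the primes.
def R(a, b):
    return a != 1 and a != b and b % a == 0

S = range(2, 31)
minimal = [b for b in S if not any(R(a, b) for a in S)]
```

Every composite number in the range has a smaller nontrivial divisor, so only the primes survive the filter.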
Let $\Phi$ be a property defined on a well-founded set $S$. The principle of well-founded induction states that if the following is true:

1. $\Phi$ is true for all the $R$-minimal elements of $S$;
2. for every $a$, if for every $x$ such that $xRa$ we have $\Phi(x)$, then we have $\Phi(a)$;

then $\Phi$ is true for every $a \in S$.

As an example of application of this principle, we mention the proof of the fundamental theorem of arithmetic: every natural number has a unique factorization into prime numbers. The proof goes by well-founded induction in the set $\mathbb{N}$ ordered by division.
Version: 10 Owner: jihemme Author(s): jihemme
14.35 well-founded induction on formulas

Let $L$ be a first-order language. The formulas of $L$ are built by a finite application of the rules of construction. This means that the relation $<$ defined on formulas by $\varphi < \psi$ if and only if $\varphi$ is a subformula of $\psi$ is a well-founded relation. Therefore, we can formulate a principle of induction for formulas as follows: suppose $P$ is a property defined on formulas; then $P$ is true for every formula of $L$ if and only if

1. $P$ is true for the atomic formulas;
2. for every formula $\varphi$, if $P$ is true for every subformula of $\varphi$, then $P$ is true for $\varphi$.
Version: 3 Owner: jihemme Author(s): jihemme
Chapter 15
03B15 – Higher-order logic and type theory
15.1 Härtig's quantifier

Härtig's quantifier is a quantifier which takes two variables and two formulas, written $Ixy\, \phi(x)\psi(y)$. It asserts that $|\{x \mid \phi(x)\}| = |\{y \mid \psi(y)\}|$. That is, the cardinality of the values of $x$ which make $\phi(x)$ true is the same as the cardinality of the values of $y$ which make $\psi(y)$ true. Viewed as a generalized quantifier, $I$ is a $\langle 1, 1 \rangle$ quantifier.

Closely related is the Rescher quantifier, which also takes two variables and two formulas, is written $Jxy\, \phi(x)\psi(y)$, and asserts that $|\{x \mid \phi(x)\}| \leq |\{y \mid \psi(y)\}|$. The Rescher quantifier is sometimes defined instead to be a similar but different quantifier, $Jx\, \phi(x) \leftrightarrow |\{x \mid \phi(x)\}| > |\{x \mid \neg\phi(x)\}|$. The first definition is a $\langle 1, 1 \rangle$ quantifier while the second is a $\langle 1 \rangle$ quantifier.

Another similar quantifier is Chang's quantifier $Q_C$, a $\langle 1 \rangle$ quantifier defined by $(Q_C)_M = \{X \subseteq M \mid |X| = |M|\}$. That is, $Q_C x\, \phi(x)$ is true if the set of $x$ satisfying $\phi$ has the same cardinality as the universe; for finite models this is the same as $\forall$, but for infinite ones it is not.
Version: 3 Owner: Henry Author(s): Henry
15.2 Russell’s theory of types
After the discovery of the paradoxes of set theory (notably Russell's paradox), it became apparent that naive set theory must be replaced by something in which the paradoxes can't arise. Two solutions were proposed: type theory and axiomatic set theory based on a limitation of size principle (see the entries class and von Neumann-Bernays-Gödel set theory).

Type theory is based on the idea that impredicative definitions are the root of all evil. Bertrand Russell and various other logicians in the beginning of the 20th century proposed an analysis of the paradoxes that singled out so-called vicious circles as the culprits. A vicious circle arises when one attempts to define a class by quantifying over a totality of classes including the class being defined. For example, Russell's class $R = \{x \mid x \notin x\}$ contains a variable $x$ that ranges over all classes.

Russell's type theory, which is found in its mature form in the momentous Principia Mathematica, avoids the paradoxes by two devices. First, Frege's fifth axiom is abandoned entirely: the extensions of predicates do not appear among the objects. Secondly, the predicates themselves are ordered into a ramified hierarchy so that the predicates at the lowest level can be defined by speaking of objects only, the predicates at the next level by speaking of objects and of predicates at the previous level, and so forth.
The first of these principles has drastic implications for mathematics. For example, the predicate "has the same cardinality" seemingly can't be defined at all, for predicates apply only to objects, and not to other predicates. In Frege's system this is easy to overcome: the equicardinality predicate is defined for extensions of predicates, which are objects. In order to overcome this, Russell introduced the notion of types (which are today known as degrees). Predicates of degree 1 apply only to objects, predicates of degree 2 apply to predicates of degree 1, and so forth.

The type theoretic universe may seem quite odd to someone familiar with the cumulative hierarchy of set theory. For example, the empty set appears anew in all degrees, as do various other familiar structures, such as the natural numbers. Because of this, it is common to indicate only the relative differences in degrees when writing down a formula of type theory, instead of the absolute degrees. Thus instead of writing

$\exists P_1 \forall x_0 (x_0 \in P_1 \leftrightarrow x_0 = x_0)$

one writes

$\exists P_{i+1} \forall x_i (x_i \in P_{i+1} \leftrightarrow x_i = x_i)$

to indicate that the formula holds for any $i$. Another possibility is simply to drop the subscripts indicating degree and let the degrees be determined implicitly (this can usually be done since we know that $x \in y$ implies that if $x$ is of degree $n$, then $y$ is of degree $n + 1$). A formula for which there is an assignment of types (degrees) to the variables and constants so that it accords with the restrictions of type theory is said to be stratified.
The second device implies another dimension in which the predicates are ordered. In any given degree, there appears a hierarchy of levels. At the first level of degree $n + 1$ one has predicates that apply to elements of degree $n$ and which can be defined with reference only to predicates of degree $n$. At the second level there appear all the predicates that can be defined with reference to predicates of degree $n$ and to predicates of degree $n + 1$ of level 1, and so forth.

This second principle makes virtually all mathematics break down. For example, when speaking of the real number system and its completeness, one wishes to quantify over all predicates of real numbers (this is possible at degree $n + 1$ if the predicates of real numbers appear at degree $n$), not only those of a given level. In order to overcome this, Russell and Whitehead introduced in PM the so-called axiom of reducibility, which states that if a predicate $P_n$ occurs at some level $k$ (i.e. $P_n = P^k_n$), it occurs already on the first level.

Frank P. Ramsey was the first to notice that the axiom of reducibility in effect collapses the hierarchy of levels, so that the hierarchy is entirely superfluous in presence of the axiom. The original form of type theory is known as ramified type theory, and the simpler alternative with no second hierarchy of levels is known as unramified type theory or simply as simple type theory.
One descendant of type theory is W. V. Quine's system of set theory known as NF (New Foundations), which differs considerably from the more familiar set theories (ZFC, NBG, Morse-Kelley). In NF there is a class comprehension axiom saying that to any stratified formula there corresponds a set of elements satisfying the formula. The Russell class is not a set, since it contains the formula $x \notin x$, which can't be stratified, but the universal class is a set: $x = x$ is perfectly legal in type theory, as we can assign to $x$ any degree and get a well-formed formula of type theory. It is not known if NF axiomatises any extensor (see the entry class) based on a limitation of size principle, like the more familiar set theories do.

In the modern variants of type theory, one usually has a more general supply of types. Beginning with some set $\tau$ of types (presumably a division of the simple objects into some natural categories), one defines the set of types $T$ by setting

• for all $t \in \tau$, $t \in T$
• if $a, b \in T$, then $(a \rightarrow b) \in T$

One way to proceed to get something familiar is to have $\tau$ contain a type $t$ for truth values. Then sentences are objects of type $t$, open formulae of one variable are of type $\mathrm{Object} \rightarrow t$, and so forth. This sort of type system is often found in the study of typed lambda calculus and also in intensional logics, which are often based on the former.
Version: 4 Owner: Aatu Author(s): Aatu
15.3 analytic hierarchy
The analytic hierarchy is a hierarchy of either (depending on context) formulas or relations, similar to the arithmetical hierarchy. It is essentially the second order equivalent. Like the arithmetical hierarchy, the relations in each level are exactly the relations defined by the formulas of that level.

The first level can be called $\Delta^1_0$, $\Delta^1_1$, $\Sigma^1_0$, or $\Pi^1_0$, and consists of the arithmetical formulas or relations.

A formula $\phi$ is $\Sigma^1_n$ if there is some arithmetical formula $\psi$ such that:

$\phi(\vec{k}) = \exists X_1 \forall X_2 \cdots Q X_n\, \psi(\vec{k}, \vec{X})$

where $Q$ is either $\forall$ or $\exists$, whichever maintains the pattern of alternating quantifiers, and each $X_i$ is a set variable.

Similarly, a formula $\phi$ is $\Pi^1_n$ if there is some arithmetical formula $\psi$ such that:

$\phi(\vec{k}) = \forall X_1 \exists X_2 \cdots Q X_n\, \psi(\vec{k}, \vec{X})$

where $Q$ is either $\forall$ or $\exists$, whichever maintains the pattern of alternating quantifiers, and each $X_i$ is a set variable.
Version: 1 Owner: Henry Author(s): Henry
15.4 game-theoretical quantifier

A Henkin or branching quantifier is a multi-variable quantifier in which the selection of variables depends only on some, but not all, of the other quantified variables. For instance the simplest Henkin quantifier can be written:

$\begin{pmatrix} \forall x & \exists y \\ \forall a & \exists b \end{pmatrix} \phi(x, y, a, b)$

This quantifier, inexpressible in ordinary first order logic, can best be understood by its Skolemization. The formula above is equivalent to $\forall x \forall a\, \phi(x, f(x), a, g(a))$. Critically, the selection of $y$ depends only on $x$ while the selection of $b$ depends only on $a$.

Logics with this quantifier are stronger than first order logic, lying between first and second order logic in strength. For instance the Henkin quantifier can be used to define the Rescher quantifier, and by extension Härtig's quantifier:

$\begin{pmatrix} \forall x & \exists y \\ \forall a & \exists b \end{pmatrix} [(x = a \leftrightarrow y = b) \wedge (\phi(x) \rightarrow \psi(y))] \leftrightarrow Jxy\, \phi(x)\psi(y)$

To see that this is true, observe that this essentially requires that the Skolem functions $f(x) = y$ and $g(a) = b$ be the same, and moreover that they be injective. Then for each $x$ satisfying $\phi(x)$, there is a different $f(x)$ satisfying $\psi(f(x))$.
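On a finite domain the Skolem reading above can be checked by brute force: the Henkin quantifier holds exactly when some pair of functions $f, g$ makes $\phi(x, f(x), a, g(a))$ true for all $x, a$. A hedged sketch (exponential search, only practical for tiny domains):

```python
from itertools import product

def henkin(phi, dom):
    """True iff there exist f, g : dom -> dom with
    phi(x, f(x), a, g(a)) for all x, a in dom."""
    dom = list(dom)
    funcs = [dict(zip(dom, vals))
             for vals in product(dom, repeat=len(dom))]
    return any(all(phi(x, f[x], a, g[a]) for x in dom for a in dom)
               for f in funcs for g in funcs)
```

For instance, $(x = a \leftrightarrow y = b)$ is satisfied by taking $f = g$ to be the identity, while $y = a$ would force $f(x)$ to equal every $a$ at once and fails on any domain with more than one element.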
This concept can be generalized to the game-theoretical quantifiers. This concept comes from interpreting a formula as a game between a "Prover" and a "Refuter." A theorem is provable whenever the Prover has a winning strategy; at each $\wedge$ the Refuter chooses which side they will play (so the Prover must be prepared to win on either) while each $\vee$ is a choice for the Prover. At a $\neg$, the players switch roles. Then $\forall$ represents a choice for the Refuter and $\exists$ a choice for the Prover.

Classical first order logic, then, adds the requirement that the games have perfect information. The game-theoretical quantifiers remove this requirement, so for instance the Henkin quantifier, which would be written $\forall x \exists y \forall a (\exists b / \forall x)\, \phi(x, y, a, b)$, states that when the Prover makes a choice for $b$, it is made without knowledge of what was chosen at $x$.
Version: 2 Owner: Henry Author(s): Henry
15.5 logical language
In its most general form, a logical language is a set of rules for constructing formulas for some logic, which can then be assigned truth values based on the rules of that logic.

A logical language $\mathcal{L}$ consists of:

• A set $F$ of function symbols (common examples include $+$ and $\cdot$)
• A set $R$ of relation symbols (common examples include $=$ and $<$)
• A set $C$ of logical connectives (usually $\neg$, $\wedge$, $\vee$, $\rightarrow$ and $\leftrightarrow$)
• A set $Q$ of quantifiers (usually $\forall$ and $\exists$)
• A set $V$ of variables

Every function symbol, relation symbol, and connective is associated with an arity (the set of $n$-ary function symbols is denoted $F_n$, and similarly for relation symbols and connectives). Each quantifier is a generalized quantifier associated with a quantifier type $\langle n_1, \ldots, n_k \rangle$.

The underlying logic has a (possibly empty) set of types $T$. There is a function $\operatorname{Type} : F \cup V \rightarrow T$ which assigns a type to each function symbol and variable. For each arity $n$ there is a function $\operatorname{Inputs}_n : F_n \cup R_n \rightarrow T^n$ which gives the types of each of the arguments to a function symbol or relation. In addition, for each quantifier type $\langle n_1, \ldots, n_k \rangle$ there is a function $\operatorname{Inputs}_{\langle n_1, \ldots, n_k \rangle}$ defined on $Q_{\langle n_1, \ldots, n_k \rangle}$ (the set of quantifiers of that type) which gives a $k$-tuple of $n_i$-tuples of the types of the arguments taken by the formulas the quantifier applies to.
The terms of L of type t ∈ T are built as follows:
1. If v is a variable such that Type(v) = t then v is a term of type t
2. If f is an n-ary function symbol such that Type(f) = t and t_1, …, t_n are terms such that for each i ≤ n, Type(t_i) = (Inputs_n(f))_i, then f t_1 ⋯ t_n is a term of type t
The formulas of L are built as follows:
1. If r is an n-ary relation symbol and t_1, …, t_n are terms such that Type(t_i) = (Inputs_n(r))_i then r t_1 ⋯ t_n is a formula
2. If c is an n-ary connective and F_1, …, F_n are formulas then c F_1 ⋯ F_n is a formula
3. If q is a quantifier of type ⟨n_1, …, n_k⟩, v_{1,1}, …, v_{1,n_1}, v_{2,1}, …, v_{k,1}, …, v_{k,n_k} is a sequence of variables such that Type(v_{i,j}) = ((Inputs_⟨n_1,…,n_k⟩(q))_i)_j, and F_1, …, F_k are formulas, then q v_{1,1} ⋯ v_{k,n_k} F_1 ⋯ F_k is a formula
Generally the connectives, quantifiers, and variables are specified by the appropriate logic, while the function and relation symbols are specified for particular languages. Note that 0-ary functions are usually called constants.
If there is only one type, and it is equated directly with truth values, then this is essentially a propositional logic. If the standard quantifiers and connectives are used, there is only one type, and one of the relations is = (with its usual semantics), then this produces first order logic. If the standard quantifiers and connectives are used, there are two types, and the relations include = and ∈ with appropriate semantics, then this is second order logic (a slightly different formulation replaces ∈ with a 2-ary function which represents function application; this views second order objects as functions rather than sets).
Note that often connectives are written with inﬁx notation with parentheses used to control
order of operations.
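As an illustrative sketch of the term-building rules above, the single-typed, first-order fragment can be rendered as a small well-formedness checker. All names here (`FUNCS`, `VARS`, `is_term`) and the choice of symbols are assumptions of this illustration, not part of the entry's formalism.

```python
# Sketch: well-formedness of terms in a one-typed first-order language,
# following rules 1-2 above. Terms are either variable/constant strings
# or tuples (function_symbol, arg_1, ..., arg_n).

FUNCS = {"+": 2, "*": 2, "0": 0}   # function symbols with arities (0-ary = constants)
VARS = {"x", "y", "z"}

def is_term(t):
    """A term is a variable, a constant, or an n-ary function symbol applied to n terms."""
    if isinstance(t, str):
        return t in VARS or FUNCS.get(t) == 0
    head, *args = t
    return FUNCS.get(head) == len(args) and all(is_term(a) for a in args)

print(is_term(("+", "x", ("*", "y", "0"))))  # True: + applied to two terms
print(is_term(("+", "x")))                   # False: wrong arity
```

A full implementation would also track the Type and Inputs functions; here there is only one type, so arity checking is all that remains of them.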
Version: 7 Owner: Henry Author(s): Henry
15.6 second order logic
Second order logic refers to logics with two (or three) types where one type consists of the
objects of interest and the second is either sets of those objects or functions on those objects
(or both, in the three type case). For instance, second order arithmetic has two types: the
numbers and the sets of numbers.
Formally, second order logic usually has:
• the standard quantiﬁers (four of them, since each type needs its own universal and
existential quantiﬁers)
• the standard connectives
• the relation = with its normal semantics
• if the second type represents sets, a relation ∈ where the ﬁrst argument is of the ﬁrst
type and the second argument is the second type
• if the second type represents functions, a binary function which takes one argument of
each type and results in an object of the ﬁrst type, representing function application
Specific second order logics may deviate from this definition slightly. In particular, some mathematicians have argued that first order logics with additional quantifiers which give them most or all of the strength of second order logic should be considered second order logics.
Some people, chiefly Quine, have raised philosophical objections to second order logic, centering on the question of whether models require fixing some set of sets or functions as the "actual" sets or functions for the purposes of that model.
Version: 4 Owner: Henry Author(s): Henry
Chapter 16
03B40 – Combinatory logic and
lambda-calculus
16.1 Church integer
A Church integer is a representation of integers as functions, invented by Alonzo Church. An integer n is represented as a higher-order function, which applies a given function to a given expression n times.
For example, in Haskell, a function that returns a particular Church integer (here, 3) might be
church3 f x = f (f (f x))
The transformation from a Church integer to an integer might be
unchurch n = n (+1) 0
Thus the (+1) function would be applied to an initial value of 0 n times, yielding the ordinary integer n.
Version: 2 Owner: Logan Author(s): Logan
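The same encoding can be sketched in Python (an assumption for illustration; the entry's own example language is Haskell). A Church integer is a curried function taking f and x and applying f n times:

```python
# Church integers in Python: n is encoded as a function applying f to x n times.

def church(n):
    """Return the Church encoding of the non-negative integer n."""
    def apply_n(f):
        def go(x):
            for _ in range(n):
                x = f(x)
            return x
        return go
    return apply_n

def unchurch(c):
    """Recover an ordinary integer, mirroring `unchurch n = n (+1) 0`."""
    return c(lambda k: k + 1)(0)

print(unchurch(church(3)))  # 3
```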
16.2 combinatory logic
Combinatory logic was invented by Moses Schönfinkel in the early 1920s, and was mostly developed by Haskell Curry. The idea was to reduce the notation of logic to the simplest terms possible. As such, combinatory logic consists only of combinators and the operation of combination, with no free variables.
A combinator is simply a function with no free variables. A free variable is any variable referred to in a function that is not a parameter of that function. The operation of combination is then simply the application of a combinator to its parameters. Combination is specified by simple juxtaposition of two terms, and is left-associative. Parentheses may also be present to override associativity. For example
f g x y = (f g) x y = ((f g) x) y
All combinators in combinatory logic can be derived from two basic combinators, S and K. They are defined as
S f g x = f x (g x)
K x y = x
Reference is sometimes made to a third basic combinator, I, which can be defined in terms of S and K:
I x = S K K x = x
Combinatory logic where I is considered to be derived from S and K is sometimes known as pure combinatory logic.
Combinatory logic and lambda calculus are equivalent. However, lambda calculus is more concise than combinatory logic; an expression of size O(n) in lambda calculus is equivalent to an expression of size O(n²) in combinatory logic.
For example, S f g x = f x (g x) in combinatory logic is equivalent to S = (λf.(λg.(λx.((f x)(g x))))), and K x y = x is equivalent to K = (λx.(λy.x)).
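As a quick sanity check of these definitions, S and K can be written as curried Python lambdas and I recovered from them (a sketch; the currying style is an assumption of this illustration):

```python
# The S and K combinators as curried Python functions, with I derived as S K K.

S = lambda f: lambda g: lambda x: f(x)(g(x))  # S f g x = f x (g x)
K = lambda x: lambda y: x                     # K x y = x
I = S(K)(K)                                   # I x = S K K x = x

print(I(42))        # 42: identity recovered from S and K alone
print(K("a")("b"))  # a
```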
Version: 2 Owner: Logan Author(s): Logan
16.3 lambda calculus
Lambda calculus (often referred to as λ-calculus) was invented in the 1930s by Alonzo Church, as a form of mathematical logic dealing primarily with functions and the application of functions to their arguments. In pure lambda calculus, there are no constants. Instead, there are only lambda abstractions (which are simply specifications of functions), variables, and applications of functions to functions. For instance, Church integers are used as a substitute for actual constants representing integers.
A lambda abstraction is typically specified using a lambda expression, which might look like the following.
λx. f x
The above specifies a function of one argument that can be reduced by applying the function f to its argument (function application is left-associative by default, and parentheses can be used to specify associativity).
The λ-calculus is equivalent to combinatory logic (though much more concise). Most functional programming languages are also equivalent to λ-calculus, to a degree (any imperative features in such languages are, of course, not equivalent).
Examples
We can specify the Church integer 3 in λ-calculus as
3 = λf x. f (f (f x))
Suppose we have a function inc, which when given a string representing an integer, returns a new string representing the number following that integer. Then
3 inc "0" = "3"
Addition of Church integers in λ-calculus is
add = λx y. (λf z. x f (y f z))
add 2 3 = λf z. 2 f (3 f z)
        = λf z. 2 f (f (f (f z)))
        = λf z. f (f (f (f (f z))))
        = 5
Multiplication is
mul = λx y. (λf z. x (λu. y f u) z)
mul 2 3 = λf z. 2 (λu. 3 f u) z
        = λf z. 2 (λu. f (f (f u))) z
        = λf z. f (f (f (f (f (f z)))))
        = 6
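The addition and multiplication reductions above can be mimicked directly with Python lambdas and checked numerically (an illustrative sketch; the names `two`, `three`, and `unchurch` are assumptions of this illustration):

```python
# Church arithmetic in Python lambdas:
#   add = λx y. (λf z. x f (y f z))
#   mul = λx y. (λf z. x (λu. y f u) z)

two = lambda f: lambda z: f(f(z))
three = lambda f: lambda z: f(f(f(z)))

add = lambda x: lambda y: lambda f: lambda z: x(f)(y(f)(z))
mul = lambda x: lambda y: lambda f: lambda z: x(lambda u: y(f)(u))(z)

unchurch = lambda c: c(lambda k: k + 1)(0)  # apply (+1) to 0 n times

print(unchurch(add(two)(three)))  # 5
print(unchurch(mul(two)(three)))  # 6
```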
Russell’s Paradox in λ-calculus
The λ-calculus readily admits Russell’s paradox. Let us define a function r that takes a function x as an argument, and is reduced to the application of the logical function not to the application of x to itself.
r = λx. not (x x)
Now what happens when we apply r to itself?
r r = not (r r)
    = not (not (r r))
    ⋮
Since we have not (r r) = (r r), we have a paradox.
Version: 3 Owner: Logan Author(s): Logan
Chapter 17
03B48 – Probability and inductive
logic
17.1 conditional probability
Let (Ω, B, μ) be a probability space, and let X and Y be random variables on Ω with joint probability distribution p(X, Y) := p(X ∩ Y).
The conditional probability of X given Y is defined as

p(X|Y) := p(X ∩ Y) / p(Y).   (17.1.1)

In general,

p(X|Y) p(Y) = p(X, Y) = p(Y|X) p(X),   (17.1.2)

and so we have

p(X|Y) = p(Y|X) p(X) / p(Y).   (17.1.3)
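Equation (17.1.3) is Bayes' theorem, and it is easy to verify numerically on a small discrete joint distribution (the probability values below are made-up illustrative numbers, not from the entry):

```python
# Verify p(X|Y) = p(Y|X) p(X) / p(Y) on a small joint distribution.
# Events: X and Y each either occur ("x"/"y") or not ("~x"/"~y").

p_joint = {("x", "y"): 0.1, ("x", "~y"): 0.3, ("~x", "y"): 0.2, ("~x", "~y"): 0.4}

p_x = p_joint[("x", "y")] + p_joint[("x", "~y")]  # marginal p(X)
p_y = p_joint[("x", "y")] + p_joint[("~x", "y")]  # marginal p(Y)

p_x_given_y = p_joint[("x", "y")] / p_y           # definition (17.1.1)
p_y_given_x = p_joint[("x", "y")] / p_x
bayes = p_y_given_x * p_x / p_y                   # right-hand side of (17.1.3)

print(p_x_given_y)  # direct route
print(bayes)        # same value via Bayes' theorem
```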
Version: 1 Owner: drummond Author(s): drummond
Chapter 18
03B99 – Miscellaneous
18.1 Beth property
A logic is said to have the Beth property if whenever a predicate P is implicitly definable by φ (i.e. if all models have at most one extension satisfying φ), then P is explicitly definable relative to φ (i.e. there is a ψ not containing P such that φ ⊨ ∀x_1, …, x_n (P(x_1, …, x_n) ↔ ψ(x_1, …, x_n))).
Version: 3 Owner: Aatu Author(s): Aatu
18.2 Hofstadter’s MIU system
The alphabet of the system contains three symbols M, I, U. The set of theorems, denoted T, is the set of strings constructed from the axiom by the rules, and can be built as follows:
(axiom) MI ∈ T.
(i) If xI ∈ T then xIU ∈ T.
(ii) If Mx ∈ T then Mxx ∈ T.
(iii) In any theorem, III can be replaced by U.
(iv) In any theorem, UU can be omitted.
example:
• Show that MUII ∈ T
MI ∈ T by axiom
→ MII ∈ T by rule (ii) where x = I
→ MIIII ∈ T by rule (ii) where x = II
→ MIIIIIIII ∈ T by rule (ii) where x = IIII
→ MIIIIIIIIU ∈ T by rule (i) where x = MIIIIIII
→ MIIIIIUU ∈ T by rule (iii)
→ MIIIII ∈ T by rule (iv)
→ MUII ∈ T by rule (iii)
• Is MU a theorem?
No. Why? Because the number of I’s in a theorem is never a multiple of 3. We will show this by structural induction.
base case: The statement is true for the base case, since the axiom has one I, and 1 is not a multiple of 3.
induction hypothesis: Suppose the statement is true for the premise of each rule.
induction step: By the induction hypothesis we assume the premise of each rule to be true and show that the application of the rule keeps the statement true.
Rule 1: Applying rule 1 does not add any I’s to the formula. Therefore the statement is true for rule 1 by the induction hypothesis.
Rule 2: Applying rule 2 doubles the number of I’s in the formula, but the initial number of I’s was not a multiple of 3 by the induction hypothesis, and doubling such a number does not make it a multiple of 3 (i.e. if n ≢ 0 (mod 3) then 2n ≢ 0 (mod 3)). Therefore the statement is true for rule 2.
Rule 3: Applying rule 3 replaces III by U. Since the initial number of I’s was not a multiple of 3 by the induction hypothesis, removing III will not make the number of I’s in the formula a multiple of 3. Therefore the statement is true for rule 3.
Rule 4: Applying rule 4 removes UU and does not change the number of I’s, which by the induction hypothesis was not a multiple of 3. Therefore the statement is true for rule 4.
Therefore no theorem has a number of I’s that is a multiple of 3.
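The invariant in this argument is easy to machine-check by enumerating theorems breadth-first for a few rounds (a sketch; the depth bound and representation are arbitrary choices of this illustration):

```python
# Enumerate MIU theorems to a small depth and check the invariant:
# the number of I's is never a multiple of 3 (hence MU is not a theorem).

def successors(s):
    out = set()
    if s.endswith("I"):
        out.add(s + "U")                      # rule (i): xI -> xIU
    if s.startswith("M"):
        out.add(s + s[1:])                    # rule (ii): Mx -> Mxx
    for i in range(len(s) - 2):
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])  # rule (iii): III -> U
    for i in range(len(s) - 1):
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])        # rule (iv): UU -> (nothing)
    return out

theorems = {"MI"}
for _ in range(4):  # a few rounds of rule application
    theorems |= {t for s in theorems for t in successors(s)}

print(all(t.count("I") % 3 != 0 for t in theorems))  # True
print("MU" in theorems)                              # False
```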
[HD]
REFERENCES
[HD] Hofstadter, Douglas R.: G¨odel, Escher, Bach: an Eternal Golden Braid. Basic Books, Inc.,
New York, 1979.
Version: 5 Owner: Daume Author(s): Daume
18.3 IF-logic
Independence Friendly logic (IF-logic) is an interesting conservative extension of classical first order logic based on very natural ideas from game theoretical semantics, developed by Jaakko Hintikka and Gabriel Sandu among others. Although IF-logic is a conservative extension of first order logic, it has a number of interesting properties, such as allowing truth-definitions and admitting a translation of all Σ¹₁ sentences (second order sentences with an initial second order existential quantifier followed by a first order sentence).
IF-logic can be characterised as the natural extension of first order logic when one allows informational independence to occur in the game theoretical truth definition. To understand this idea we first need to introduce the game theoretical definition of truth for classical first order logic.
To each first order sentence φ we assign a game G(φ) with two players, played on models of the appropriate language. The two players are called the verifier and the falsifier (or nature). The idea is that the verifier attempts to show that the sentence is true in the model, while the falsifier attempts to show that it is false in the model. The game G(φ) is defined as follows. We will use the convention that if p is a symbol that names a function, a predicate or an object of the model M, then p^M is that named entity.
• if R is an n-ary predicate and the t_i are names of elements of the model, then G(R(t_1, …, t_n)) is a game in which the verifier immediately wins if (t_1^M, …, t_n^M) ∈ R^M and otherwise the falsifier immediately wins
• the game G(φ_1 ∨ φ_2) begins with the choice of φ_i from φ_1 and φ_2 (i = 1 or i = 2) by the verifier and then proceeds as the game G(φ_i)
• the game G(φ_1 ∧ φ_2) is the same as G(φ_1 ∨ φ_2), except that the choice is made by the falsifier
• the game G(∃x φ(x)) begins with the choice by the verifier of a member of M, which is given a name a, and then proceeds as G(φ(a))
• the game G(∀x φ(x)) is the same as G(∃x φ(x)), except that the choice of a is made by the falsifier
• the game G(¬φ) is the same as G(φ) with the roles of the falsifier and verifier exchanged
Truth of a sentence φ is defined as the existence of a winning strategy for the verifier in the game G(φ). Similarly, falsity of φ is defined as the existence of a winning strategy for the falsifier in the game G(φ). (A strategy is a specification which determines, for each move the opponent makes, what the player should do. A winning strategy is a strategy which guarantees victory no matter what strategy the opponent follows.)
For classical first order logic, this definition is equivalent to the usual Tarskian definition of truth (i.e. the one based on satisfaction found in most treatments of the semantics of first order logic). This also means that, since the law of excluded middle holds for first order logic, the games G(φ) have a very strong property: either the falsifier or the verifier has a winning strategy.
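Over a finite model, this game-theoretical truth definition can be evaluated directly by backward induction: the verifier has a winning strategy iff at each ∨/∃ some choice wins and at each ∧/∀ every choice wins, with negation swapping roles. A minimal sketch (the tuple-based formula representation and the particular domain are assumptions of this illustration):

```python
# Verifier-wins evaluation of first order formulas over a finite domain,
# by backward induction on the game tree. Formulas are nested tuples.

DOMAIN = range(4)

def verifier_wins(phi, env):
    op = phi[0]
    if op == "atom":    # ("atom", predicate, var)
        return phi[1](env[phi[2]])
    if op == "or":      # verifier picks a disjunct
        return any(verifier_wins(p, env) for p in phi[1:])
    if op == "and":     # falsifier picks a conjunct
        return all(verifier_wins(p, env) for p in phi[1:])
    if op == "exists":  # ("exists", var, body): verifier picks a value
        return any(verifier_wins(phi[2], {**env, phi[1]: a}) for a in DOMAIN)
    if op == "forall":  # falsifier picks a value
        return all(verifier_wins(phi[2], {**env, phi[1]: a}) for a in DOMAIN)
    if op == "not":     # roles swap; in a finite perfect-information game,
        return not verifier_wins(phi[1], env)  # the loser of G(phi) wins G(not phi)
    raise ValueError(op)

# ∀x (even(x) ∨ odd(x)) is true over {0, 1, 2, 3}:
phi = ("forall", "x", ("or", ("atom", lambda a: a % 2 == 0, "x"),
                             ("atom", lambda a: a % 2 == 1, "x")))
print(verifier_wins(phi, {}))  # True
```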
Notice that all rules except those for negation and atomic sentences concern choosing a sentence or finding an element. These can be codified into functions, which tell us which sentence to pick or which element of the model to choose, based on our previous choices and those of our opponent. For example, consider the sentence ∀x(B(x) ∨ C(x)). The corresponding game begins with the falsifier picking an element a from the model, so a strategy for the verifier must specify for each element a which of B(a) and C(a) to pick. The truth of the sentence is equivalent to the existence of a winning strategy for the verifier, i.e. just such a function. But this means that ∀x(B(x) ∨ C(x)) is equivalent to ∃f∀x((B(x) ∧ f(x) = 0) ∨ (C(x) ∧ f(x) = 1)). Let’s consider a more complicated example: ∀x∃y∀z∃w R(x, y, z, w). The truth of this is equivalent to the existence of functions f and g s.t. ∀x∀z R(x, f(x), z, g(x, z)). These sorts of functions are known as Skolem functions, and they are in essence just winning strategies for the verifier. We won’t prove it here, but all first order sentences can be expressed in the form ∃f_1 … ∃f_n ∀x_1 … ∀x_k φ, where φ is a truth-functional combination of atomic sentences in which all terms are either constants, variables x_i, or formed by application of the functions f_i to such terms. Such sentences are said to be in Σ¹₁ form.
Let’s consider the Σ¹₁ sentence ∃f∃g∀x∀z φ(x, f(x), z, g(z)). On the face of it, it seems to assert the existence of a winning strategy in a simple semantical game like those described above. However, the game can’t correspond to any (classical) first order formula! Let’s first see what the game whose winning strategy this formula asserts the existence of looks like. First, the falsifier chooses elements a and b to serve as x and z. Then the verifier chooses an element c knowing only a and an element d knowing only b. The verifier’s goal is that φ(a, c, b, d) comes out as a true atomic sentence. The game could actually be arranged so that the verifier is a team of two players (who aren’t allowed to communicate with each other), one of which picks c, the other one picking d.
From a game theoretical point of view, games in which some moves must be made without depending on some of the earlier moves are called informationally incomplete games, and they occur very commonly. Bridge is such a game, for example, and usually real examples of such games have “players” that are actually teams made up of several people.
IF-logic comes out of the game theoretical definition in a natural way if we allow informational independence in our semantical games. In IF-logic, every quantifier and connective can be augmented with an independence marker /, so that ∗′/∗ means that the game move for the occurrence of ∗′ within the scope of ∗ must be played without knowledge of the choices made for ∗. For example ∀x(∃y/∀x) φ(x, y) asserts that for any choice of value for x by the falsifier, the verifier can find a value for y which does not depend on the value of x, s.t. φ(x, y) comes out true. This is not a very characteristic example, as it can be written as an ordinary first order formula ∃y∀x φ(x, y). The curious game we described above, corresponding to the second order Skolem-function formulation, the Σ¹₁ sentence ∃f∃g∀x∀z φ(x, f(x), z, g(z)), corresponds to the IF-sentence ∀x∃y∀z(∃w/∀x∃y) φ(x, y, z, w). IF-logic allows informational independence also for the usual logical connectives; for example ∀x(φ(x) ∨/∀x ψ(x)) is true if and only if for all x, either φ(x) or ψ(x) is true, but which of these is picked by the verifier must be decided independently of the choice for x by the falsifier.
One of the striking characteristics of IF-logic is that every Σ¹₁ formula φ has an IF-translation φ^IF which is true if and only if φ is true (the equivalence does not in general hold if we replace ’true’ with ’false’). Since for example first order truth (in a model) is Σ¹₁-definable (it’s just quantification over all possible valuations, which are second order objects), there are IF-theories which correctly represent the truth predicate for their first order part. What is even more striking is that sufficiently strong IF-theories can do this for the whole of the language they are expressed in.
This seems to contradict Tarski’s famous result on the undefinability of truth, but this is illusory. Tarski’s result depends on the assumption that the logic is closed under contradictory negation. This is not the case for IF-logic. In general for a given sentence φ there is no sentence φ∗ which is true just in case φ is not true. Thus the law of excluded middle does not hold in general in IF-logic (although it does for the classical first order portion). This is quite unsurprising, since games of imperfect information are very seldom determined in the sense that either the verifier or the falsifier has a winning strategy. For example, a game in which I choose a 10-letter word and you have one go at guessing it is not determined in this sense, since there is no 10-letter word you couldn’t guess, and on the other hand you have no way of forcing me to choose any particular 10-letter word (which would guarantee your victory).
IF-logic is stronger than first order logic in the usual sense that there are classes of structures which are IF-definable but not first-order definable. Some of these are even finite. Many interesting concepts are expressible in IF-logic, such as equicardinality, infinity (which can be expressed by a logical formula, in contradistinction to ordinary first order logic in which nonlogical symbols are needed), and well-ordering.
By Lindström’s theorem we thus know that either IF-logic is not complete (i.e. its set of validities is not r.e.) or the Löwenheim-Skolem theorem does not hold. In fact, the (downward) Löwenheim-Skolem theorem does hold for IF-logic, so it’s not complete. There is a complete disproof procedure for IF-logic, but because IF-logic is not closed under contradictory negation this does not yield a complete proof procedure.
IF-logic can be extended by allowing contradictory negations of closed sentences and truth-functional combinations thereof. This extended IF-logic is extremely strong. For example, the second order induction axiom for PA is ∀X((X(0) ∧ ∀n(X(n) → X(n + 1))) → ∀n X(n)). The negation of this is a Σ¹₁ sentence asserting the existence of a set which invalidates the induction axiom. Since Σ¹₁ sentences are expressible in IF-logic, we can translate the negation of the induction axiom into an IF-sentence φ. But now ¬φ is a formula of extended IF-logic, and is clearly equivalent to the usual induction axiom! As all the rest of the PA axioms are first order, this shows that extended IF-logic PA can correctly define the natural number system.
There exists also an interesting “translation” of n-th order logic into extended IF-logic. Consider an n-sorted first order language and an n-th order theory T translated into this language. Now, extend the language to second order and add the axiom stating that the sort k + 1 actually comprises the whole of the powerset of the sort k. This is a Π¹₁ sentence (i.e. of the form “for all predicates P there is a first order element of sort k + 1 which comprises exactly the extension of P”). It is easy to see that a formula is valid in this new system if and only if it was valid in the original n-th order logic. The negation of this axiom is again Σ¹₁ and translatable into IF-logic, and thus the axiom itself is expressible in extended IF-logic. Moreover, since most interesting second order theories are finitely axiomatisable, we can consider sentences of the form T∗ → φ (where T∗ is the multi-sorted translation of T), which express logical implication of φ by T (correctly). This is equivalent to ¬(T∗) ∨ φ (where ¬ is contradictory negation), but since T∗ is a conjunction of a Π¹₁ sentence asserting comprehension translated into extended IF-logic and the first order translation of the axioms of T, ¬(T∗) is a Σ¹₁ formula translatable to non-extended IF-logic, and so is φ. Thus sentences of the form T → φ of n-th order logic are translatable into IF-sentences which are true just in case the originals were.
Version: 1 Owner: Aatu Author(s): Aatu
18.4 Tarski’s result on the undeﬁnability of Truth
Assume L is a logic which is closed under contradictory negation and has the usual truth-functional connectives. Assume also that L has a notion of an open formula with one variable and of substitution. Assume that T is a theory of L in which we can define surrogates for formulae of L, and in which all true instances of the substitution relation and the truth-functional connective relations are provable. We show that either T is inconsistent or T can’t be augmented with a truth predicate True for which the following T-schema holds

True(⌜φ⌝) ↔ φ
Assume that the open formulae with one variable of L have been indexed by some suitable set that is representable in T (otherwise the predicate True would be next to useless, since if there’s no way to speak of sentences of a logic, there’s little hope to define a truth predicate for it). Denote the i-th element in this indexing by B_i. Consider now the following open formula with one variable

Liar(x) = ¬True(B_x(x))

Now, since Liar is an open formula with one free variable, it’s indexed by some i. Now consider the sentence Liar(i). From the T-schema we know that

True(Liar(i)) ↔ Liar(i)

and by the definition of Liar and the fact that i is the index of Liar(x) we have

True(Liar(i)) ↔ ¬True(Liar(i))

which clearly is absurd. Thus there can’t be an extension of T with a predicate True for which the T-schema holds.
We have made several assumptions on the logic L which are crucial in order for this proof to go through. The most important is that L is closed under contradictory negation. There are logics which allow truth predicates, but these are not usually closed under contradictory negation (so that it’s possible that True(Liar(i)) is neither true nor false). These logics usually have stronger notions of negation, so that a sentence ¬P says more than just that P is not true, and the proposition that P is simply not true is not expressible.
An example of a logic for which Tarski’s undefinability result does not hold is the so-called Independence Friendly logic, the semantics of which is based on game theory and which allows various generalised quantifiers (the Henkin branching quantifier, &c.) to be used.
Version: 5 Owner: Aatu Author(s): Aatu
18.5 axiom
In a nutshell, the logico-deductive method is a system of inference where conclusions (new knowledge) follow from premises (old knowledge) through the application of sound arguments (syllogisms, rules of inference). Tautologies excluded, nothing can be deduced if nothing is assumed. Axioms and postulates are the basic assumptions underlying a given body
is assumed. Axioms and postulates are the basic assumptions underlying a given body
of deductive knowledge. They are accepted without demonstration. All other assertions
(theorems, if we are talking about mathematics) must be proven with the aid of the basic
assumptions.
The logico-deductive method was developed by the ancient Greeks, and has become the core
principle of modern mathematics. However, the interpretation of mathematical knowledge
has changed from ancient times to the modern, and consequently the terms axiom and
postulate hold a slightly different meaning for the present-day mathematician than they did for Aristotle and Euclid.
The ancient Greeks considered geometry as just one of several sciences, and held the theorems
of geometry on par with scientiﬁc facts. As such, they developed and used the logico
deductive method as a means of avoiding error, and for structuring and communicating
knowledge. Aristotle’s Posterior Analytics is a deﬁnitive exposition of the classical view.
“Axiom”, in classical terminology, referred to a selfevident assumption common to many
branches of science. A good example would be the assertion that
When an equal amount is taken from equals, an equal amount results.
At the foundation of the various sciences lay certain basic hypotheses that had to be accepted
without proof. Such a hypothesis was termed a postulate. The postulates of each science
were diﬀerent. Their validity had to be established by means of realworld experience.
Indeed, Aristotle warns that the content of a science cannot be successfully communicated,
if the learner is in doubt about the truth of the postulates.
The classical approach is well illustrated by Euclid’s Elements, where we see a list of axioms (very basic, self-evident assertions) and postulates (common-sensical geometric facts drawn from our experience).
A1 Things which are equal to the same thing are also equal to one another.
A2 If equals be added to equals, the wholes are equal.
A3 If equals be subtracted from equals, the remainders are equal.
A4 Things which coincide with one another are equal to one another.
A5 The whole is greater than the part.
P1 It is possible to draw a straight line from any point to any other point.
P2 It is possible to produce a ﬁnite straight line continuously in a straight line.
P3 It is possible to describe a circle with any centre and distance.
P4 It is true that all right angles are equal to one another.
P5 It is true that, if a straight line falling on two straight lines make the interior angles on
the same side less than two right angles, the two straight lines, if produced indeﬁnitely,
meet on that side on which are the angles less than the two right angles.
The classical viewpoint is explored in more detail elsewhere.
A great lesson learned by mathematics in the last 150 years is that it is useful to strip the
meaning away from the mathematical assertions (axioms, postulates, propositions, theorems)
and deﬁnitions. This abstraction, one might even say formalization, makes mathematical
knowledge more general, capable of multiple diﬀerent meanings, and therefore useful in
multiple contexts.
In structuralist mathematics we go even further, and develop theories and axioms (like
ﬁeld theory, group theory, topology, vector spaces) without any particular application in
mind. The distinction between an “axiom” and a “postulate” disappears. The postulates
of Euclid are proﬁtably motivated by saying that they lead to a great wealth of geometric
facts. The truth of these complicated facts rests on the acceptance of the basic hypotheses.
However, by throwing out postulate 5, we get theories that have meaning in wider contexts, hyperbolic geometry for example. We must simply be prepared to use labels like “line” and “parallel” with greater flexibility. The development of hyperbolic geometry taught
mathematicians that postulates should be regarded as purely formal statements, and not as
facts based on experience.
When mathematicians employ the axioms of a ﬁeld, the intentions are even more abstract.
The propositions of ﬁeld theory do not concern any one particular application; the mathe
matician now works in complete abstraction. There are many examples of ﬁelds; ﬁeld theory
gives correct knowledge in all contexts.
It is not correct to say that the axioms of ﬁeld theory are ”propositions that are regarded as
true without proof.” Rather, the Field Axioms are a set of constraints. If any given system of
addition and multiplication tolerates these constraints, then one is in a position to instantly
know a great deal of extra information about this system. There is a lot of bang for the
formalist buck.
Modern mathematics formalizes its foundations to such an extent that mathematical theories
can be regarded as mathematical objects, and logic itself can be regarded as a branch of
mathematics. Frege, Russell, Poincar´e, Hilbert, and G¨odel are some of the key ﬁgures in
this development.
In the modern understanding, a set of axioms is any collection of formally stated assertions
from which other formally stated assertions follow by the application of certain welldeﬁned
rules. In this view, logic becomes just another formal system. A set of axioms should be consistent; it should be impossible to derive a contradiction from the axioms. A set of axioms should also be nonredundant; an assertion that can be deduced from other axioms need not be regarded as an axiom.
It was the early hope of modern logicians that various branches of mathematics, perhaps
all of mathematics, could be derived from a consistent collection of basic axioms. An early
success of the formalist program was Hilbert’s formalization of Euclidean geometry, and the
related demonstration of the consistency of those axioms.
In a wider context, there was an attempt to base all of mathematics on Cantor’s set theory.
Here the emergence of Russell’s paradox, and similar antinomies of naive set theory raised
the possibility that any such system could turn out to be inconsistent.
The formalist project suffered a decisive setback when in 1931 Gödel showed that it is possible, for any sufficiently strong set of axioms (Peano’s axioms, for example), to construct a statement whose truth is independent of that set of axioms. As a corollary, Gödel proved that the consistency of a theory like Peano arithmetic is an unprovable assertion within the scope of that theory.
It is reasonable to believe in the consistency of Peano arithmetic because it is satisfied by the system of natural numbers, an infinite but intuitively accessible formal system. However, at this date we have no way of demonstrating the consistency of modern set theory (the Zermelo-Fraenkel axioms). The axiom of choice, a key hypothesis of this theory, remains a very controversial assumption. Furthermore, using techniques of forcing (Cohen) one can show that the continuum hypothesis (Cantor) is independent of the Zermelo-Fraenkel axioms.
Thus, even this very general set of axioms cannot be regarded as the deﬁnitive foundation
for mathematics.
Version: 11 Owner: rmilson Author(s): rmilson, digitalis
18.6 compactness
A logic is said to be (κ, λ)-compact if the following holds:

If Φ is a set of sentences of cardinality less than or equal to κ and all subsets of Φ of cardinality less than λ are consistent, then Φ is consistent.

For example, first order logic is (ω, ω)-compact, for if all finite subsets of some class of sentences are consistent, so is the class itself.
Version: 2 Owner: Aatu Author(s): Aatu
18.7 consistent
If T is a theory of L then it is consistent iff there is some model M of L such that M ⊨ T.
If a theory is not consistent then it is inconsistent.
A slightly different definition is sometimes used: T is consistent iff T ⊬ ⊥ (that is, as
long as it does not prove a contradiction). As long as the proof calculus used is sound and
complete, these two definitions are equivalent.
Version: 3 Owner: Henry Author(s): Henry
18.8 interpolation property
A logic is said to have the interpolation property if whenever φ(R, S) → ψ(S, T) holds, then
there is a sentence θ(S), so that φ(R, S) → θ(S) and θ(S) → ψ(S, T), where R, S and T
are the sets of symbols that occur in the formulae, S being the set of symbols common to
both φ and ψ.
The interpolation property holds for first-order logic. The interpolation property is related
to the Beth definability property and Robinson's consistency property. Also, a natural
generalisation is the concept of a Δ-closed logic.
Version: 2 Owner: Aatu Author(s): Aatu
18.9 sentence
A sentence is a formula with no free variables.
Simple examples include:
∀x∃y[x < y]
or
∃x[x + 7 − 43 = 0]
However, the following formula is not a sentence:
x + 2 = 3
Version: 2 Owner: Henry Author(s): Henry
Chapter 19
03Bxx – General logic
19.1 Banach-Tarski paradox
The 3-dimensional ball can be split into a finite number of pieces which can be pasted
together to give two balls of the same volume as the first!
Let us formulate the theorem formally. We say that a set A ⊂ ℝⁿ is decomposable
in N pieces A_1, …, A_N if there exist some isometries θ_1, …, θ_N of ℝⁿ such that
A = θ_1(A_1) ∪ ⋯ ∪ θ_N(A_N) while θ_1(A_1), …, θ_N(A_N) are all disjoint.
We then say that two sets A, B ⊂ ℝⁿ are equidecomposable if both A and B are decomposable
in the same pieces A_1, …, A_N.
Theorem 2 (Banach-Tarski). The unit ball B³ ⊂ ℝ³ is equidecomposable to the union of
two disjoint unit balls.
19.1.1 Comments
The actual number of pieces needed for this decomposition is not so large: ten
pieces are enough.
Also, it is not important that the set considered is a ball: any two sets with non-empty
interior are equidecomposable in ℝ³. The ambient space can also be chosen larger: the
theorem is true in every ℝⁿ with n ≥ 3, but it is true neither in ℝ² nor in ℝ.
Where is the paradox? We are saying that a piece of (say) gold can be cut and pasted to
obtain two pieces equal to the original one. And we may divide these two pieces in the same
way to obtain four pieces, and so on...
We believe that this is not possible, since the weight of a piece of gold does not change
when we cut it.
A consequence of this theorem is, in fact, that it is not possible to define a volume for all
subsets of 3-dimensional space. In particular, the volume cannot be computed for some
of the pieces into which the unit ball is decomposed (some of them are not measurable).
The existence of non-measurable sets is proved more simply, and in every dimension, by
Vitali's theorem. However, the Banach-Tarski paradox says something more: it is not possible
to define a measure on all the subsets of ℝ³ even if we drop countable additivity and replace
it with finite additivity:
μ(A ∪ B) = μ(A) + μ(B)   for all disjoint A, B.
Another point to be noticed is that the proof needs the axiom of choice, so some of the
pieces into which the ball is divided are not constructible.
Version: 4 Owner: paolini Author(s): paolini
Chapter 20
03C05 – Equational classes, universal
algebra
20.1 congruence
Let Σ be a fixed signature, and A a structure for Σ. A congruence ∼ on A is an
equivalence relation such that for every natural number n and n-ary function symbol F
of Σ, if a_i ∼ a′_i for each i, then F^A(a_1, …, a_n) ∼ F^A(a′_1, …, a′_n).
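As a concrete sketch (my own illustration, not part of the original entry): in a signature with one binary function symbol +, the relation "congruent modulo 3" is a congruence on (ℤ, +). A brute-force check of the defining condition on a finite sample:

```python
# Check that a ~ b iff a ≡ b (mod 3) respects the function symbol + on a sample of Z.
def cong(a, b):
    return a % 3 == b % 3

sample = range(-6, 7)
for a1 in sample:
    for a2 in sample:
        for b1 in sample:
            for b2 in sample:
                if cong(a1, a2) and cong(b1, b2):
                    # the congruence condition: + must carry related pairs to related results
                    assert cong(a1 + b1, a2 + b2)
print("~ (mod 3) is a congruence for + on the sample")
```

The same loop with, say, `a % 3 <= b % 3` in place of `cong` would fail the assertion, since that relation is not even symmetric.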
Version: 6 Owner: almann Author(s): almann
20.2 every congruence is the kernel of a homomorphism
Let Σ be a fixed signature, and A a structure for Σ. If ∼ is a congruence on A, then there
is a homomorphism f such that ∼ = ker(f).
Define a homomorphism f : A → A/∼ : a ↦ [[a]]. Observe that a ∼ b if and only if
f(a) = f(b), so ∼ = ker(f). To verify that f is a homomorphism, observe that
1. For each constant symbol c of Σ, f(c^A) = [[c^A]] = c^{A/∼}.
2. For every natural number n and n-ary relation symbol R of Σ, if R^A(a_1, …, a_n) then
R^{A/∼}([[a_1]], …, [[a_n]]), so R^{A/∼}(f(a_1), …, f(a_n)).
3. For every natural number n and n-ary function symbol F of Σ,
f(F^A(a_1, …, a_n)) = [[F^A(a_1, …, a_n)]]
= F^{A/∼}([[a_1]], …, [[a_n]])
= F^{A/∼}(f(a_1), …, f(a_n)).
Version: 3 Owner: almann Author(s): almann
20.3 homomorphic image of a Σ-structure is a Σ-structure
Let Σ be a fixed signature, and A and B two structures for Σ. If f : A → B is a
homomorphism, then im(f) is a structure for Σ.
Version: 3 Owner: almann Author(s): almann
20.4 kernel
Given a function f : A → B, the kernel of f is the equivalence relation on A defined by
(a, a′) ∈ ker(f) ⇔ f(a) = f(a′).
Version: 3 Owner: almann Author(s): almann
20.5 kernel of a homomorphism is a congruence
Let Σ be a fixed signature, and A and B two structures for Σ. If f : A → B is a
homomorphism, then ker(f) is a congruence on A.
If F is an n-ary function symbol of Σ, and f(a_i) = f(a′_i) for each i, then
f(F^A(a_1, …, a_n)) = F^B(f(a_1), …, f(a_n))
= F^B(f(a′_1), …, f(a′_n))
= f(F^A(a′_1, …, a′_n)).
Version: 4 Owner: almann Author(s): almann
20.6 quotient structure
Let Σ be a fixed signature, A a structure for Σ, and ∼ a congruence on A. The quotient
structure of A by ∼, denoted A/∼, is defined as follows:
1. The universe of A/∼ is the set {[[a]] | a ∈ A}.
2. For each constant symbol c of Σ, c^{A/∼} = [[c^A]].
3. For every natural number n and every n-ary function symbol F of Σ,
F^{A/∼}([[a_1]], …, [[a_n]]) = [[F^A(a_1, …, a_n)]].
4. For every natural number n and every n-ary relation symbol R of Σ, R^{A/∼}([[a_1]], …, [[a_n]])
if and only if for some a′_i ∼ a_i we have R^A(a′_1, …, a′_n).
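A concrete sketch (my own, not from the entry): the quotient of (ℤ₆, +) by the congruence "equal mod 3", with classes represented as frozensets. The induced operation is well-defined precisely because ∼ is a congruence, and the loop below verifies this by trying all representatives.

```python
# Quotient of the structure (Z6, +) by the congruence a ~ b iff a ≡ b (mod 3).
Z6 = range(6)

def cls(a):
    """Equivalence class [[a]] as a frozenset."""
    return frozenset(b for b in Z6 if b % 3 == a % 3)

universe = {cls(a) for a in Z6}          # three classes: [[0]], [[1]], [[2]]

def plus(c1, c2):
    """Induced operation: pick any representatives, return the class of their sum."""
    a, b = next(iter(c1)), next(iter(c2))
    return cls((a + b) % 6)

# well-definedness: the result does not depend on the chosen representatives
for c1 in universe:
    for c2 in universe:
        results = {cls((a + b) % 6) for a in c1 for b in c2}
        assert len(results) == 1
```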
Version: 7 Owner: almann Author(s): almann
Chapter 21
03C07 – Basic properties of ﬁrstorder
languages and structures
21.1 Models constructed from constants
The definitions of a structure and of the satisfaction relation are nice, but they raise the
following question: how do we get models in the first place? The most basic construction for
models of a first-order theory is the construction that uses constants. Throughout this entry,
L is a fixed first-order language.
Let C be a set of constant symbols of L, and T be a theory in L. Then we say C is a set of
witnesses for T if and only if for every formula ϕ with at most one free variable x, we have
T ⊢ ∃x(ϕ) → ϕ(c) for some c ∈ C.
Lemma. Let T be any consistent set of sentences of L, and C a set of new symbols such
that |C| = |L|. Let L′ = L ∪ C. Then there is a consistent set T′ of L′-sentences extending T
which has C as a set of witnesses.
Lemma. If T is a consistent theory in L, and C is a set of witnesses for T in L, then T has
a model whose elements are the constants in C.
Proof: Let Σ be the signature for L. If T is a consistent set of sentences of L, then there is
a maximal consistent T′ ⊇ T. Note that T′ and T have the same sets of witnesses. As every
model of T′ is also a model of T, we may assume T is maximal consistent.
We let the universe of M be the set of equivalence classes C/∼, where c ∼ d if and only if
"c = d" ∈ T. As T is maximal consistent, this is an equivalence relation. We interpret the
non-logical symbols as follows:
1. [c] =^M [d] if and only if c ∼ d;
2. Constant symbols are interpreted in the obvious way, i.e. if c ∈ Σ is a constant symbol,
then c^M = [c];
3. If R ∈ Σ is an n-ary relation symbol, then ([c_1], …, [c_n]) ∈ R^M if and only if
R(c_1, …, c_n) ∈ T;
4. If F ∈ Σ is an n-ary function symbol, then F^M([c_1], …, [c_n]) = [d] if and only if
"F(c_1, …, c_n) = d" ∈ T.
From the fact that T is maximal consistent, and ∼ is an equivalence relation, we get that
the operations are well-defined (this is not entirely simple; we omit the details). The proof
that M ⊨ T is a straightforward induction on the complexity of the formulas of T. ♦
Corollary. (The extended completeness theorem) A set T of formulas of L is consistent if
and only if it has a model (regardless of whether or not L has witnesses for T).
Proof: First add a set C of new constants to L, and expand T to T′ in such a way that C
is a set of witnesses for T′. Then expand T′ to a maximal consistent set T′′. This set has a
model M consisting of the constants in C, and M is also a model of T. ♦
Corollary. (Compactness theorem) A set T of sentences of L has a model if and only if
every finite subset of T has a model.
Proof: Replace "has a model" by "is consistent", and apply the syntactic compactness
theorem. ♦
Corollary. (Gödel's completeness theorem) Let T be a consistent set of formulas of L. Then
a sentence ϕ is a theorem of T if and only if it is true in every model of T.
Proof: If ϕ is not a theorem of T, then ¬ϕ is consistent with T, so T ∪ {¬ϕ} has a model
M, in which ϕ cannot be true. ♦
Corollary. (Downward Löwenheim-Skolem theorem) If T ⊆ L has a model, then it has a
model of power at most |L|.
Proof: If T has a model, then it is consistent. The model constructed from constants has
power at most |L| (because we must add at most |L| many new constants). ♦
Most of the treatment found in this entry can be read in more detail in Chang and Keisler's
book Model Theory.
Version: 6 Owner: jihemme Author(s): jihemme
21.2 Stone space
Suppose L is a first-order language and B is a set of parameters from an L-structure M.
Let S_n(B) be the set of (complete) n-types over B (see type). Then we put a topology on
S_n(B) in the following manner.
For every formula ψ ∈ L(B) we let S(ψ) := {p ∈ S_n(B) : ψ ∈ p}. Then the topology is the
one with a basis of open sets given by {S(ψ) : ψ ∈ L(B)}. We call S_n(B) endowed
with this topology the Stone space of complete n-types over B.
Some logical theorems and conditions are equivalent to topological conditions on this
topology.
• The compactness theorem for first-order logic is so named because it is equivalent to
this topology being compact.
• We define p to be an isolated type iff p is an isolated point in the Stone space. This is
equivalent to there being some formula ψ so that for every φ ∈ p we have T ∪ {ψ} ⊢ φ,
i.e. all the formulas in p are implied by some single formula.
• The Morley rank of a type p ∈ S_1(M) is equal to the Cantor-Bendixson rank of p in
this space.
The idea of considering the Stone space of types dates back to [1].
We can see that the set of formulas in a language is a Boolean lattice. A type is an ultrafilter
on this lattice. The definition of a Stone space can be made in an analogous way on the set
of ultrafilters on any Boolean lattice.
REFERENCES
1. M. Morley, Categoricity in power. Trans. Amer. Math. Soc. 114 (1965), 514–538.
Version: 5 Owner: ratboy Author(s): Larry Hammick, Timmy
21.3 alphabet
An alphabet Σ is a nonempty finite set of symbols. The main restriction is that we must
make sure that every string formed from Σ can be broken back down into symbols in only
one way.
For example, {b, o, oo} is not a valid alphabet, because the string boo can be broken
up in two ways: b o o and b oo. {Ca, ña, d, e} is a valid alphabet, because there is only
one way to fully break up any given string formed from it.
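The unique-breakdown restriction can be made concrete with a short sketch (my own illustration, not part of the entry) that counts the ways a string can be broken into symbols of a candidate alphabet; a valid alphabet yields exactly one parse for every string it can form:

```python
def parse_count(s, alphabet):
    """Number of distinct ways to break string s into symbols of the alphabet."""
    counts = [1] + [0] * len(s)          # counts[i] = ways to parse the prefix s[:i]
    for i in range(1, len(s) + 1):
        for sym in alphabet:
            if s[:i].endswith(sym):
                counts[i] += counts[i - len(sym)]
    return counts[len(s)]

print(parse_count("boo", {"b", "o", "oo"}))   # 2 parses, so this is not a valid alphabet
```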
If Σ is our alphabet and n ∈ ℤ⁺, we define the following as the powers of Σ:
• Σ⁰ = {λ}, where λ stands for the empty string.
• Σⁿ = {xw | x ∈ Σ, w ∈ Σⁿ⁻¹} (xw is the juxtaposition of x and w)
So, Σⁿ is the set of all strings formed from Σ of length n.
Version: 1 Owner: xriso Author(s): xriso
21.4 axiomatizable theory
Let T be a first-order theory. A subset Δ ⊆ T is a set of axioms for T if and only if T is
the set of all consequences of the formulas in Δ. In other words, ϕ ∈ T if and only if ϕ is
provable using only assumptions from Δ.
Definition. A theory T is said to be finitely axiomatizable if and only if there is a finite
set of axioms for T; it is said to be recursively axiomatizable if and only if it has a
recursive set of axioms.
For example, group theory is finitely axiomatizable (it has only three axioms), and Peano
arithmetic is recursively axiomatizable: there is clearly an algorithm that can decide whether
a formula of the language of the natural numbers is an axiom.
Theorem. Complete recursively axiomatizable theories are decidable.
As an example of the use of this theorem, consider the theory of algebraically closed fields
of characteristic p, for any p prime or 0. It is complete, and the set of axioms is
obviously recursive, so it is decidable.
Version: 2 Owner: jihemme Author(s): jihemme
21.5 deﬁnable
21.5.1 Deﬁnable sets and functions
Deﬁnability In Model Theory
Let L be a first-order language. Let M be an L-structure. Denote x_1, …, x_n by x̄ and
y_1, …, y_m by ȳ, suppose φ(x̄, ȳ) is a formula from L, and b̄ = b_1, …, b_m is some sequence
from M.
Then we write φ(Mⁿ, b̄) to denote {ā ∈ Mⁿ : M ⊨ φ(ā, b̄)}. We say that φ(Mⁿ, b̄) is
b̄-definable. More generally, if S is some set and B ⊆ M, and there is some
b̄ from B so that S is b̄-definable, then we say that S is B-definable.
In particular, we say that a set S is ∅-definable or zero-definable iff it is the solution set of
some formula without parameters.
Let f be a function; then we say f is B-definable iff the graph of f (i.e. {(x̄, ȳ) : f(x̄) = ȳ})
is a B-definable set.
If S is B-definable then any automorphism of M that fixes B pointwise fixes S setwise.
A set or function is definable iff it is B-definable for some set of parameters B.
Some authors use the term definable to mean what we have called ∅-definable here. If this
is the convention of a paper, then the term parameter definable will refer to sets that are
definable over some parameters.
Sometimes in model theory it is not actually very important which language one is using, but
merely what the definable sets are, or what the definability relation is.
Definability of functions in Proof Theory
In proof theory, given a theory T in the language L, for a function f : M → M to be
definable in the theory T, we have two conditions:
(i) There is a formula in the language L such that f is definable over the model M, as in the
above definition; i.e., its graph is definable in the language L over the model M, by some
formula φ(x, y).
(ii) The theory T proves that f is indeed a function, that is T ⊢ ∀x∃!y.φ(x, y).
For example: the graph of the exponentiation function x^y = z is definable by the language
of the theory IΔ₀ (a weak subsystem of PA); however, the function itself is not definable in
this theory.
Version: 13 Owner: iddo Author(s): iddo, yark, Timmy
21.6 deﬁnable type
Let M be a first-order structure. Let A and B be sets of parameters from M. Let p be
a complete n-type over B. Then we say that p is an A-definable type iff for every formula
ψ(x̄, ȳ) with len(x̄) = n, there is some formula dψ(ȳ, z̄) and some parameters c̄ from A so
that for any b̄ from B we have ψ(x̄, b̄) ∈ p iff M ⊨ dψ(b̄, c̄).
Note that if p is a type over the model M then this condition is equivalent to showing that
{b̄ ∈ M : ψ(x̄, b̄) ∈ p} is an A-definable set.
For p a type over B, we say p is definable if it is B-definable.
If p is definable, we call dψ the defining formula for ψ, and the function ψ ↦ dψ a defining
scheme for p.
Version: 1 Owner: Timmy Author(s): Timmy
21.7 downward Löwenheim-Skolem theorem
Let L be a first-order language, let A be an L-structure and let X ⊆ dom(A). Then there is
an L-structure B such that X ⊆ dom(B), |B| ≤ max(|X|, |L|), and B is elementarily embedded
in A.
Version: 1 Owner: Evandar Author(s): Evandar
21.8 example of deﬁnable type
Consider (ℚ, <) as a structure in a language with one binary relation, which we interpret as
the order. This is a universal, ℵ₀-categorical structure (see example of universal structure).
The theory of (ℚ, <) has quantifier elimination, and so is o-minimal. Thus a type over the
set ℚ is determined by the quantifier-free formulas over ℚ, which in turn are determined by
the atomic formulas over ℚ. An atomic formula in one variable over B is of the form x < b
or x > b or x = b for some b ∈ B. Thus each 1-type over ℚ determines a Dedekind cut over
ℚ, and conversely a Dedekind cut determines a complete type over ℚ. Let L(p) := {a ∈ ℚ :
(x > a) ∈ p}.
Thus there are two classes of types over ℚ:
1. Ones where L(p) is of the form (−∞, a) or (−∞, a] for some a ∈ ℚ. It is clear from
the above discussion that these are definable.
2. Ones where L(p) has no supremum in ℚ. These are clearly not definable, by o-minimality
of ℚ.
Version: 1 Owner: Timmy Author(s): Timmy
21.9 example of strongly minimal
Let L_R be the language of rings. In other words, L_R has two constant symbols 0, 1 and three
binary function symbols +, ·, −. Let T be the L_R-theory that includes the field axioms and,
for each n, the formula
∀x_0, x_1, …, x_n ∃y ((⋁_{1≤i≤n} x_i ≠ 0) → Σ_{0≤i≤n} x_i yⁱ = 0)
which expresses that every degree-n polynomial which is non-constant has a root. Then any
model of T is an algebraically closed field. One can show that this is a complete theory and
has quantifier elimination (Tarski). Thus every K-definable subset of any K ⊨ T is definable
by a quantifier-free formula in L_R(K) with one free variable y. A quantifier-free formula is a
Boolean combination of atomic formulas. Each of these is of the form Σ_{i≤n} b_i yⁱ = 0, which
defines a finite set. Thus every definable subset of K is a finite or cofinite set. Thus K and
T are strongly minimal.
Version: 3 Owner: Timmy Author(s): Timmy
21.10 ﬁrst isomorphism theorem
Let Σ be a fixed signature, and A and B structures for Σ. If f : A → B is a homomorphism,
then A/ker(f) is bimorphic to im(f). Furthermore, if f has the additional property that for
every natural number n and n-ary relation symbol R of Σ,
R^B(f(a_1), …, f(a_n)) ⇒ ∃a′_i [f(a_i) = f(a′_i) ∧ R^A(a′_1, …, a′_n)],
then A/ker(f) ≅ im(f).
Since the homomorphic image of a Σ-structure is also a Σ-structure, we may assume that
im(f) = B.
Let ∼ = ker(f). Define a bimorphism φ : A/∼ → B : [[a]] ↦ f(a). To verify that φ is well-
defined, let a ∼ a′. Then φ([[a]]) = f(a) = f(a′) = φ([[a′]]). To show that φ is injective,
suppose φ([[a]]) = φ([[a′]]). Then f(a) = f(a′), so a ∼ a′. Hence [[a]] = [[a′]]. To show that φ is
a homomorphism, observe that for any constant symbol c of Σ we have φ([[c^A]]) = f(c^A) = c^B.
For every natural number n and n-ary relation symbol R of Σ,
R^{A/∼}([[a_1]], …, [[a_n]]) ⇒ R^A(a_1, …, a_n)
⇒ R^B(f(a_1), …, f(a_n))
⇒ R^B(φ([[a_1]]), …, φ([[a_n]])).
For every natural number n and n-ary function symbol F of Σ,
φ(F^{A/∼}([[a_1]], …, [[a_n]])) = φ([[F^A(a_1, …, a_n)]])
= f(F^A(a_1, …, a_n))
= F^B(f(a_1), …, f(a_n))
= F^B(φ([[a_1]]), …, φ([[a_n]])).
Thus φ is a bimorphism.
Now suppose f has the additional property mentioned in the statement of the theorem.
Then
R^B(φ([[a_1]]), …, φ([[a_n]])) ⇒ R^B(f(a_1), …, f(a_n))
⇒ ∃a′_i [a_i ∼ a′_i ∧ R^A(a′_1, …, a′_n)]
⇒ R^{A/∼}([[a_1]], …, [[a_n]]).
Thus φ is an isomorphism.
Version: 4 Owner: almann Author(s): almann
21.11 language
Let Σ be an alphabet. We then define the following using the powers of an alphabet and
infinite union, where n ∈ ℤ:
Σ⁺ = ⋃_{n=1}^∞ Σⁿ
Σ* = ⋃_{n=0}^∞ Σⁿ = Σ⁺ ∪ {λ}
A string is an element of Σ*, meaning that it is a grouping of symbols from Σ one after
another. For example, abba is a string, and aabb is a different string. Σ⁺, like Σ*, contains
all finite strings except that Σ⁺ does not contain the empty string λ.
A language over Σ is a subset of Σ*, meaning that it is a set of strings made from the
symbols in the alphabet Σ.
Take for example an alphabet Σ = {♣, ℘, 63, a, A}. We can construct languages over Σ, such
as: L = {aaa, λ, A℘63, 63♣, AaAaA}, or {℘a, ℘aa, ℘aaa, ℘aaaa, …}, or even the empty set
∅. In the context of languages, ∅ is called the empty language.
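The definitions above can be sketched directly (an illustration of mine, not part of the entry), computing Σⁿ by juxtaposition and truncating the infinite unions at a finite length:

```python
from itertools import product

def power(sigma, n):
    """Sigma^n: all strings of n symbols from the alphabet, by juxtaposition."""
    return {"".join(p) for p in product(sigma, repeat=n)}

def kleene_star(sigma, up_to):
    """Finite approximation of Sigma^*: the union of Sigma^0 .. Sigma^up_to."""
    out = set()
    for n in range(up_to + 1):
        out |= power(sigma, n)
    return out

sigma = {"a", "b"}
assert power(sigma, 0) == {""}                       # Sigma^0 = {lambda}
assert power(sigma, 2) == {"aa", "ab", "ba", "bb"}
assert kleene_star(sigma, 2) == power(sigma, 0) | power(sigma, 1) | power(sigma, 2)
```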
Version: 12 Owner: bbukh Author(s): bbukh, xriso
21.12 length of a string
Suppose we have a string u on alphabet Σ. We can then represent the string as u =
x_1 x_2 x_3 ⋯ x_{n−1} x_n, where for each x_i (1 ≤ i ≤ n), x_i ∈ Σ (this means that each x_i must be
a "letter" from the alphabet). Then the length of u is n. The length of a string u is
represented as |u|.
For example, if our alphabet is Σ = {a, b, aa} then the length of the string u = baaab is
|u| = 4, since the string breaks down as follows: x_1 = b, x_2 = aa, x_3 = a, x_4 = b. So, our
x_n is x_4 and therefore n = 4. Although you may think that aa is two separate symbols, our
chosen alphabet in fact classifies it as a single symbol.
A "special case" occurs when |u| = 0, i.e. the string does not have any symbols in it. This
string is called the empty string. Instead of writing nothing, we use λ to represent the empty
string: u = λ. This is similar to the practice of using β to represent a space, even though
a space is really blank.
If your alphabet contains λ as a symbol, then you must use something else to denote the
empty string.
Suppose you also have a string v on the same alphabet as u. We write u as x_1 ⋯ x_n just
as before, and similarly v = y_1 ⋯ y_m. We say v is equal to u if and only if both n = m,
and for every i, x_i = y_i.
For example, suppose u = bba and v = bab, both strings on alphabet Σ = {a, b}. These
strings are not equal because the second symbols do not match.
Version: 3 Owner: xriso Author(s): xriso
21.13 proof of homomorphic image of a Σ-structure is a Σ-structure
We need to show that im(f) is closed under functions. For every constant symbol c of Σ,
c^B = f(c^A). Hence c^B ∈ im(f). Also, if b_1, …, b_n ∈ im(f) and F is an n-ary function symbol
of Σ, then for some a_1, …, a_n ∈ A we have
F^B(b_1, …, b_n) = F^B(f(a_1), …, f(a_n)) = f(F^A(a_1, …, a_n)).
Hence F^B(b_1, …, b_n) ∈ im(f).
Version: 1 Owner: almann Author(s): almann
21.14 satisfaction relation
Alfred Tarski was the first mathematician to give a definition of what it means for a formula
to be "true" in a structure. To do this, we need to provide a meaning to terms, and truth
values to the formulas. In doing this, free variables cause a problem: what value are they
going to have? One possible answer is to supply temporary values for the free variables,
and define our notions in terms of these temporary values.
Let A be a structure for the signature Σ. Suppose I is an interpretation, and σ is a function
that assigns elements of A to variables. We define the function Val_{I,σ} inductively on the
construction of terms:
Val_{I,σ}(c) = I(c)   (c a constant symbol)
Val_{I,σ}(x) = σ(x)   (x a variable)
Val_{I,σ}(F(t_1, …, t_n)) = I(F)(Val_{I,σ}(t_1), …, Val_{I,σ}(t_n))   (F an n-ary function symbol)
Now we are set to define satisfaction. Again we have to take care of free variables by assigning
temporary values to them via a function σ. We define the relation A, σ ⊨ ϕ by induction
on the construction of formulas:
A, σ ⊨ t_1 = t_2 if and only if Val_{I,σ}(t_1) = Val_{I,σ}(t_2)
A, σ ⊨ R(t_1, …, t_n) if and only if (Val_{I,σ}(t_1), …, Val_{I,σ}(t_n)) ∈ I(R)
A, σ ⊨ ¬ϕ if and only if A, σ ⊭ ϕ
A, σ ⊨ ϕ ∨ ψ if and only if either A, σ ⊨ ϕ or A, σ ⊨ ψ
A, σ ⊨ ∃x.ϕ(x) if and only if for some a ∈ A, A, σ[x/a] ⊨ ϕ
Here
σ[x/a](y) = a if y = x, and σ(y) otherwise.
In case, for some ϕ of L, we have A, σ ⊨ ϕ, we say that A models, or is a model of,
or satisfies ϕ in environment, or context, σ. If ϕ has the free variables x_1, …, x_n,
and a_1, …, a_n ∈ A, we also write A ⊨ ϕ(a_1, …, a_n) or A ⊨ ϕ(a_1/x_1, …, a_n/x_n) instead of
A, σ[x_1/a_1] ⋯ [x_n/a_n] ⊨ ϕ. In case ϕ is a sentence (formula with no free variables), we write
A ⊨ ϕ.
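As an illustrative sketch (my own, with an invented mini-syntax, not the entry's formalism): a recursive evaluator mirroring Val_{I,σ} and A, σ ⊨ ϕ for the structure (ℤ, +, <), with the existential quantifier ranging over a finite sample of the universe.

```python
# Terms: ints are constants, strings are variables, ("F", t1, t2) is t1 + t2.
# Formulas: ("R", t1, t2) is t1 < t2, plus ("not", f), ("or", f, g), ("exists", x, f).
SAMPLE = range(-10, 11)          # finite stand-in for the (infinite) universe

def val(term, sigma):
    if isinstance(term, int):
        return term              # constant symbol, interpreted as itself
    if isinstance(term, str):
        return sigma[term]       # variable: look up the environment sigma
    _, t1, t2 = term             # function symbol F, interpreted as +
    return val(t1, sigma) + val(t2, sigma)

def sat(phi, sigma):
    op = phi[0]
    if op == "R":                # atomic formula: t1 < t2
        return val(phi[1], sigma) < val(phi[2], sigma)
    if op == "not":
        return not sat(phi[1], sigma)
    if op == "or":
        return sat(phi[1], sigma) or sat(phi[2], sigma)
    if op == "exists":           # try each element as a temporary value sigma[x/a]
        _, x, body = phi
        return any(sat(body, {**sigma, x: a}) for a in SAMPLE)

# (Z, +, <), sigma(y) = 3  satisfies  exists x. x + x < y
assert sat(("exists", "x", ("R", ("F", "x", "x"), "y")), {"y": 3})
```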
Version: 8 Owner: jihemme Author(s): jihemme
21.15 signature
A signature consists of a set of constant symbols, together with, for every natural number n,
a set of n-ary relation symbols and a set of n-ary function symbols.
Version: 1 Owner: almann Author(s): almann
21.16 strongly minimal
Let L be a first-order language and let M be an L-structure. Let S, a subset of the domain
of M, be a definable infinite set. Then S is strongly minimal iff for every definable C ⊆ S we
have either C is finite or S ∖ C is finite. We say that M is strongly minimal iff the domain
of M is a strongly minimal set.
If M is strongly minimal and N ≡ M then N is strongly minimal. Thus if T is a complete
L-theory then we say T is strongly minimal if it has some model (equivalently, all models)
which is strongly minimal.
Note that M is strongly minimal iff every definable subset of M is quantifier-free definable
in a language with just equality. Compare this to the notion of o-minimal structures.
Version: 1 Owner: Timmy Author(s): Timmy
21.17 structure preserving mappings
Let Σ be a fixed signature, and A and B be two structures for Σ. The interesting functions
from A to B are the ones that preserve the structure.
A function f : A → B is said to be a homomorphism if and only if:
1. For every constant symbol c of Σ, f(c^A) = c^B.
2. For every natural number n and every n-ary function symbol F of Σ,
f(F^A(a_1, …, a_n)) = F^B(f(a_1), …, f(a_n)).
3. For every natural number n and every n-ary relation symbol R of Σ,
R^A(a_1, …, a_n) ⇒ R^B(f(a_1), …, f(a_n)).
Homomorphisms with various additional properties have special names:
• An injective homomorphism is called a monomorphism.
• A surjective homomorphism is called an epimorphism.
• A bijective homomorphism is called a bimorphism.
• An injective homomorphism whose inverse function is also a homomorphism is called
an embedding.
• A surjective embedding is called an isomorphism.
• A homomorphism from a structure to itself (e.g., f : A → A) is called an endomorphism.
• An isomorphism from a structure to itself is called an automorphism.
Version: 5 Owner: almann Author(s): almann, yark, jihemme
21.18 structures
Suppose Σ is a fixed signature, and L is the corresponding first-order language. A Σ-
structure A consists of a set A, called the universe of A, together with an interpretation
for the non-logical symbols contained in Σ. The interpretation of Σ in A is an operation
I on sets that has the following properties:
1. For each constant symbol c, I(c) is an element of A.
2. For each n ∈ ℕ, and each n-ary function symbol F, I(F) : Aⁿ → A is a function from
Aⁿ to A.
3. For each n ∈ ℕ, and each n-ary relation symbol R, I(R) is a subset of (an n-ary
relation on) Aⁿ.
Another commonly used notation is I(c) = c^A, I(F) = F^A, I(R) = R^A. For notational
convenience, when the context makes it clear in which structure we are working, we use the
elements of Σ to stand for both the symbols and their interpretations. When Σ is understood,
we call A a structure, instead of a Σ-structure. In some texts, model may be used for
structure. Also, we shall write a ∈ A instead of a ∈ dom(A). Of course, there are many
different possibilities for the interpretation I. If A is a structure, then the power of A, which
we denote |A|, is the cardinality of its universe A. It is easy to see that the number of
possibilities for the interpretation I is at most 2^|A| when A is infinite.
Version: 5 Owner: jihemme Author(s): jihemme
21.19 substructure
Let Σ be a fixed signature, and A and B structures for Σ. We say A is a substructure of
B, denoted A ⊆ B, if for all a ∈ A we have a ∈ B, and the inclusion map i : A → B : a ↦ a
is an embedding.
Version: 1 Owner: almann Author(s): almann
21.20 type
Let L be a first-order language. Let M be an L-structure. Let B ⊆ M, and let ā ∈ Mⁿ.
Then we define the type of ā over B to be the set of L-formulas φ(x̄, b̄) with parameters b̄
from B so that M ⊨ φ(ā, b̄). A collection of L-formulas is a complete n-type over B iff it is
of the above form for some B, M and ā ∈ Mⁿ.
We call any consistent collection of formulas p in n variables with parameters from B a
partial n-type over B. (See criterion for consistency of sets of formulas.)
Note that a complete n-type p over B is consistent, so is in particular a partial type over
B. Also p is maximal in the sense that for every formula ψ(x̄, b̄) over B we have either
ψ(x̄, b̄) ∈ p or ¬ψ(x̄, b̄) ∈ p. In fact, for every collection of formulas p in n variables
the following are equivalent:
• p is the type of some sequence of n elements ā over B in some model N ≽ M
• p is a maximal consistent set of formulas.
For n ∈ ω we define S_n(B) to be the set of complete n-types over B.
Some authors define a collection of formulas p to be an n-type iff p is a partial n-type. Others
define p to be a type iff p is a complete n-type.
A type (resp. partial type/complete type) is an n-type (resp. partial n-type/complete n-type)
for some n ∈ ω.
Version: 2 Owner: Timmy Author(s): Timmy
21.21 upward Löwenheim-Skolem theorem
Let L be a first-order language and let A be an infinite L-structure. Then if κ is a cardinal
with κ ≥ max(|A|, |L|), there is an L-structure B of cardinality κ such that A is elementarily
embedded in B.
Version: 2 Owner: Evandar Author(s): Evandar
Chapter 22
03C15 – Denumerable structures
22.1 random graph (inﬁnite)
Suppose we have some method M of generating sequences of letters from {p, q} so that at
each generation the probability of obtaining p is x, a real number strictly between 0 and 1.
Let {v_i : i < ω} be a set of vertices. For each i < ω, i ≥ 1, we construct a graph G_i on the
vertices v_1, …, v_i recursively.
• G_1 is the unique graph on one vertex.
• For i > 1 we must describe, for any j < k ≤ i, when v_j and v_k are joined.
– If k < i then join v_j and v_k in G_i iff v_j and v_k are joined in G_{i−1}.
– If k = i then generate a letter l(j, k) with M. Join v_j to v_k iff l(j, k) = p.
Now let Γ be the graph on {v_i : i < ω} so that for any n, m < ω, v_n is joined to v_m in Γ iff
it is joined in some G_i.
Then we call Γ a random graph. Consider the following property, which we shall call f-
saturation:
Given any finite disjoint U and V, subsets of {v_i : i < ω}, there is some v_n ∈ {v_i : i <
ω} ∖ (U ∪ V) so that v_n is joined to every point of U and to no point of V.
Proposition 1. A random graph has f-saturation with probability 1.
Proof: Let b_1, b_2, …, b_n, … be an enumeration of {v_i : i < ω} ∖ (U ∪ V). We say that b_i is
correctly joined to (U, V) iff it is joined to all the members of U and to none of the members
of V. Then the probability that b_i is not correctly joined is 1 − x^{|U|}(1 − x)^{|V|}, which is some
real number y strictly between 0 and 1. The probability that none of the first m are correctly
joined is y^m, and the probability that none of the b_i s are correctly joined is lim_{n→∞} y^n = 0.
Thus one of the b_i s is correctly joined.
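A simulation sketch (mine, not from the entry): build a finite prefix of Γ with edge probability x = 1/2 and look for a vertex correctly joined to a particular small pair (U, V); by the argument above, with enough vertices such a witness exists with overwhelming probability.

```python
import random

random.seed(0)
N = 300                                  # finite prefix of the vertex set
x = 0.5                                  # probability of the letter p (an edge)
edge = {(j, k): random.random() < x for k in range(N) for j in range(k)}

def joined(a, b):
    return edge[(min(a, b), max(a, b))]

U, V = {0, 1, 2}, {3, 4}                 # finite disjoint sets of vertices
witnesses = [n for n in range(N)
             if n not in U | V
             and all(joined(n, u) for u in U)
             and not any(joined(n, v) for v in V)]
# the expected number of witnesses is about (N - 5) / 32, so the list
# is non-empty except with vanishingly small probability
print(len(witnesses))
```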
Proposition 2. Any two countable graphs with f-saturation are isomorphic.
Proof: This is via a back and forth argument. The property of f-saturation is exactly what
is needed.
Thus, although the method of generating a random graph looked as though it could deliver
many potentially different graphs, this is not the case, and we may talk about the random
graph.
The random graph can also be constructed as a Fraïssé limit of all finite graphs, and in many
other ways. It is homogeneous and universal for the class of all countable graphs.
The theorem that almost any two infinite random graphs are isomorphic was first proved
in [1].
REFERENCES
1. Paul Erdős and Alfréd Rényi. Asymmetric graphs. Acta Math. Acad. Sci. Hung., 14:295–315,
1963.
Version: 2 Owner: bbukh Author(s): bbukh, Timmy
Chapter 23
03C35 – Categoricity and
completeness of theories
23.1 κ-categorical
Let L be a first-order language and let S be a set of L-sentences. If κ is a cardinal, then S
is said to be κ-categorical if S has a model of cardinality κ and any two such models are
isomorphic.
In other words, S is κ-categorical iff it has a unique model of cardinality κ, up to isomorphism.
Version: 1 Owner: Evandar Author(s): Evandar
23.2 Vaught's test
Let L be a first order language, and let S be a set of L-sentences with no finite models which is κ-categorical for some κ ≥ |L|. Then S is complete.
Version: 4 Owner: Evandar Author(s): Evandar
23.3 proof of Vaught's test
Let φ be an L-sentence, and let A be the unique model of S of cardinality κ. Suppose A ⊨ φ. Then if B is any model of S, by the upward and downward Löwenheim–Skolem theorems there is a model C of S which is elementarily equivalent to B such that |C| = κ. Then C is isomorphic to A, and so C ⊨ φ, and hence B ⊨ φ. So B ⊨ φ for all models B of S, so S ⊨ φ.
Similarly, if A ⊨ ¬φ then S ⊨ ¬φ. So S is complete. □
Version: 1 Owner: Evandar Author(s): Evandar
Chapter 24
03C50 – Models with special
properties (saturated, rigid, etc.)
24.1 example of universal structure
Let L be the first order language with the binary relation ≤. Consider the following sentences:
• ∀x, y ((x ≤ y ∨ y ≤ x) ∧ ((x ≤ y ∧ y ≤ x) ↔ x = y))
• ∀x, y, z (x ≤ y ∧ y ≤ z → x ≤ z)
Any L-structure satisfying these is called a linear order. We define the relation < so that x < y iff x ≤ y ∧ x ≠ y. Now consider these sentences:
1. ∀x, y (x < y → ∃z (x < z < y))
2. ∀x ∃y, z (y < x < z)
A linear order that satisfies 1 is called dense. We say that a linear order that satisfies 2 is without endpoints. Let T be the theory of dense linear orders without endpoints. This is a complete theory.
We can see that (Q, <) is a model of T. It is actually a rather special model.
Theorem 3. Let (S, <) be any finite linear order. Then S embeds in (Q, <).
Proof: By induction on |S|; it is trivial for |S| = 1.
Suppose that the statement holds for all linear orders with cardinality less than or equal to n. Let |S| = n + 1; pick some a ∈ S, and let S′ be the structure induced by S on S \ {a}. Then there is some embedding e of S′ into Q.
• Now suppose a is less than every member of S′. Then as Q is without endpoints, there is some element b less than every element in the image of e. Thus we can extend e by mapping a to b, which gives an embedding of S into Q.
• We work similarly if a is greater than every element of S′.
• If neither of the above holds then we can pick the maximum a_1 ∈ S′ such that a_1 < a. Similarly we can pick the minimum a_2 ∈ S′ such that a < a_2. Now there is some b ∈ Q with e(a_1) < b < e(a_2). Then extending e by mapping a to b gives the required embedding.
□
It is easy to extend the above result to countable structures: one views a countable structure as the union of an increasing chain of finite substructures, and the necessary embedding is the union of the embeddings of the substructures. Thus (Q, <) is a universal countable linear order.
Theorem 4. (Q, <) is homogeneous.
Proof: The following type of proof is known as a back and forth argument. Let S_1 and S_2 be two finite substructures of (Q, <), and let e : S_1 → S_2 be an isomorphism. It is easier to think of two disjoint copies B and C of Q, with S_1 a substructure of B and S_2 a substructure of C. Let b_1, b_2, … be an enumeration of B \ S_1, and let c_1, c_2, … be an enumeration of C \ S_2. We iterate the following two step process:
The ith forth step: If b_i is already in the domain of e then do nothing. If b_i is not in the domain of e, then as in Theorem 3, either b_i is less than every element in the domain of e, or greater than every such element, or it has an immediate predecessor and successor in the domain of e. Either way there is an element c in C \ range(e) in the same position relative to the range of e. Thus we can extend the isomorphism to include b_i.
The ith back step: If c_i is already in the range of e then do nothing. If c_i is not in the range of e, then exactly as above we can find some b ∈ B \ dom(e) and extend e so that e(b) = c_i.
After ω stages, we have an isomorphism whose domain includes every b_i and whose range includes every c_i. Thus we have an isomorphism from B to C extending e. □
A similar back and forth argument shows that any countable dense linear order without endpoints is isomorphic to (Q, <), so T is ℵ_0-categorical.
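The back and forth construction can be carried out concretely. In this sketch (my own illustration; the two finite enumerations are arbitrary choices) both copies of Q are represented by Fraction values, and density and the absence of endpoints are used, via midpoints and ±1, to place each new point:

```python
from fractions import Fraction as F

def place(below, above):
    """Pick a rational strictly above everything in `below` and below everything in `above`."""
    if below and above:
        return (max(below) + min(above)) / 2   # density
    if below:
        return max(below) + 1                  # no greatest element
    if above:
        return min(above) - 1                  # no least element
    return F(0)

A = [F(0), F(2), F(1), F(-3), F(1, 3)]      # enumeration of (part of) the copy B
Bs = [F(5), F(1, 2), F(7, 4), F(-1), F(9)]  # enumeration of (part of) the copy C
iso = {}                                    # the partial isomorphism, as a dict

for a, b in zip(A, Bs):
    if a not in iso:                        # the i-th forth step
        iso[a] = place([iso[p] for p in iso if p < a],
                       [iso[p] for p in iso if p > a])
    if b not in iso.values():               # the i-th back step
        inv = {v: k for k, v in iso.items()}
        pre = place([inv[q] for q in inv if q < b],
                    [inv[q] for q in inv if q > b])
        iso[pre] = b

keys = sorted(iso)
assert all(iso[x] < iso[y] for x, y in zip(keys, keys[1:]))  # order-preserving
```

After the loop the map covers every listed element of both sides and preserves the order, exactly as in the proof.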
Version: 5 Owner: Timmy Author(s): Timmy
24.2 homogeneous
Let L be a first order language and let M be an L-structure. Then we say M is homogeneous if the following holds:
if σ is an isomorphism between finite substructures of M, then σ extends to an automorphism of M.
Version: 1 Owner: Timmy Author(s): Timmy
24.3 universal structure
Let L be a first order language, and let K be an elementary class of L-structures. Let κ be a cardinal, and let K_κ be the set of structures from K with cardinality less than or equal to κ.
Let M ∈ K_κ. Suppose that for every N ∈ K_κ there is an embedding of N into M. Then we say M is universal.
Version: 1 Owner: Timmy Author(s): Timmy
Chapter 25
03C52 – Properties of classes of
models
25.1 amalgamation property
A class of L-structures S has the amalgamation property iff whenever A, B_1, B_2 ∈ S and f_i : A → B_i are elementary embeddings for i ∈ {1, 2}, then there is some C ∈ S and some elementary embeddings g_i : B_i → C for i ∈ {1, 2} so that g_1(f_1(x)) = g_2(f_2(x)) for all x ∈ A.
Compare this with the free product with amalgamated subgroup for groups and the definition of pushout contained there.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 26
03C64 – Model theory of ordered
structures; o-minimality
26.1 infinitesimal
Let F be a real closed field, for example the reals thought of as a structure in L, the language of ordered rings. Let B be some set of parameters from F. Consider the following set of formulas in L(B):
{x < b : b ∈ B ∧ b > 0}
Then this set of formulas is finitely satisfied, so by compactness it is consistent. In fact this set of formulas extends to a unique type p over B, as it defines a Dedekind cut. Thus there is some model M containing F and some a ∈ M so that tp(a/B) = p.
Any such element will be called B-infinitesimal. In particular, suppose B = ∅. Then the definable closure of B is the intersection of the reals with the algebraic numbers. Then a ∅-infinitesimal (or simply infinitesimal) is any element of any real closed field that is positive but smaller than every positive real algebraic number.
As noted above, such models exist by compactness. One can construct them using ultraproducts; see the entry on hyperreal. This is due to Abraham Robinson, who used such fields to formulate nonstandard analysis.
Let R be any ordered ring; then R contains N. We say R is archimedean iff for every a ∈ R there is some n ∈ N so that a < n. Otherwise R is non-archimedean.
Real closed fields with infinitesimal elements are non-archimedean: for ε an infinitesimal we have ε < 1/n and thus 1/ε > n for each n ∈ N.
Reference: A. Robinson, Selected papers of Abraham Robinson. Vol. II. Nonstandard analysis and philosophy (New Haven, Conn., 1979)
Version: 2 Owner: Timmy Author(s): Timmy
26.2 o-minimality
Let M be an ordered structure. An interval in M is any subset of M that can be expressed in one of the following forms:
• {x : a < x < b} for some a, b from M
• {x : x > a} for some a from M
• {x : x < a} for some a from M
Then we define M to be o-minimal iff every definable subset of M is a finite union of intervals and points. This is a property of the theory of M, i.e. if M ≡ N and M is o-minimal, then N is o-minimal. Note that M being o-minimal is equivalent to every definable subset of M being quantifier free definable in the language with just the ordering. Compare this with strong minimality.
The model theory of o-minimal structures is well understood; for an excellent account see Lou van den Dries, Tame topology and o-minimal structures, CUP 1998. In particular, although this condition is merely on definable subsets of M, it gives very good information about definable subsets of M^n for n ∈ ω.
Version: 4 Owner: Timmy Author(s): Timmy
26.3 real closed fields
It is clear that the axioms for a structure to be an ordered field can be written in L, the first order language of ordered rings. It is also true that the following condition can be written as a schema of first order sentences in this language: every polynomial of odd degree has a root.
Let A be all these sentences together with one that states that all positive elements have a square root. Then one can show that the consequences of A form a complete theory T. It is clear that this theory is the theory of the real numbers. We call any model of T a real closed field.
The semialgebraic sets on a real closed field are Boolean combinations of solution sets of polynomial equalities and inequalities. Tarski showed that T has quantifier elimination, which is equivalent to the class of semialgebraic sets being closed under projection.
Let F be a real closed field, and consider the definable subsets of F. By quantifier elimination, each is definable by a quantifier free formula, i.e. a Boolean combination of atomic formulas. An atomic formula in one variable has one of the following forms:
• f(x) > g(x) for some f, g ∈ F[x]
• f(x) = g(x) for some f, g ∈ F[x].
The first defines a finite union of intervals, the second a finite union of points. Every definable subset of F is a finite union of these kinds of sets, so is a finite union of intervals and points. Thus any real closed field is o-minimal.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 27
03C68 – Other classical first-order
model theory
27.1 imaginaries
Given an algebraic structure S to investigate, mathematicians consider substructures, restrictions of the structure, quotient structures and the like. A natural question for a mathematician to ask if he is to understand S is "What structures naturally live in S?" We can formalise this question in the following manner: given some logic appropriate to the structure S, we say another structure B is definable in S iff there is some definable subset B′ of S^n, a bijection σ : B′ → B, and a definable function (respectively relation) on B′ for each function (resp. relation) on B so that σ is an isomorphism (of the relevant type for B).
For an example take some infinite group (G, ·). Consider the centre of G, Z := {x ∈ G : ∀y ∈ G (xy = yx)}. Then Z is a first order definable subset of G which forms a group with the restriction of the multiplication, so (Z, ·) is a first order definable structure in (G, ·).
As another example consider the structure (R, +, ·, 0, 1) as a field. Then the structure (R, ≤) is first order definable in the structure (R, +, ·, 0, 1), as for all x, y ∈ R we have x ≤ y iff ∃z (z^2 = y − x). Thus we know that (R, +, ·, 0, 1) is unstable, as it has a definable order on an infinite subset.
Returning to the first example, Z is normal in G, so the set of (left) cosets of Z forms a factor group. The domain of the factor group is the quotient of G under the equivalence relation x ∼ y iff ∃z ∈ Z (xz = y). Therefore the factor group G/Z will not (in general) be a definable structure, but would seem to be a "natural" structure. We therefore weaken our formalisation of "natural" from definable to interpretable. Here we require that a structure is isomorphic to some definable structure on equivalence classes of definable equivalence relations. The equivalence classes of a ∅-definable equivalence relation are called imaginaries.
In [2] Poizat defined the property of elimination of imaginaries. This is equivalent to the following definition:
Definition 1. A structure M with at least two distinct ∅-definable elements admits elimination of imaginaries iff for every n ∈ N and ∅-definable equivalence relation ∼ on M^n there is a ∅-definable function f : M^n → M^p (for some p) such that for all x and y from M^n we have x ∼ y iff f(x) = f(y).
Given this property, we think of the function f as coding the equivalence classes of ∼, and we call f(x) a code for x/∼. If a structure has elimination of imaginaries then every interpretable structure is definable.
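A toy instance of Definition 1 (my own example, not from the text): on Q^2, the relation (a, b) ∼ (c, d) iff a + b = c + d is a ∅-definable equivalence relation, and it is coded by the ∅-definable map f(a, b) = a + b.

```python
from fractions import Fraction as F
from itertools import product

def sim(p, q):
    """The equivalence relation ~ on Q^2: equal coordinate sums."""
    return p[0] + p[1] == q[0] + q[1]

def code(p):
    """The coding function f : Q^2 -> Q; p ~ q iff code(p) == code(q)."""
    return p[0] + p[1]

# check the defining property on a small grid of points
pts = list(product([F(0), F(1), F(1, 2), F(-1)], repeat=2))
assert all(sim(p, q) == (code(p) == code(q)) for p in pts for q in pts)
```

Each equivalence class is thereby named by a single element of Q, which is what elimination of imaginaries demands.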
In [3] Shelah defined, for any structure M, a multi-sorted structure M^eq. This is done by adding a sort for every ∅-definable equivalence relation, so that the equivalence classes are elements (and code themselves). This is a closure operation, i.e. M^eq has elimination of imaginaries.
See [1], chapter 4, for a good presentation of imaginaries and M^eq. The idea of passing to M^eq is very useful for many purposes. Unfortunately M^eq has an unwieldy language and theory. Also, this approach does not answer the question above. We would like to show that our structure has elimination of imaginaries with just a small selection of sorts added, and perhaps in a simple language. This would allow us to describe the definable structures more easily, and as we have elimination of imaginaries this would also describe the interpretable structures.
REFERENCES
1. Wilfrid Hodges, A shorter model theory, Cambridge University Press, 1997.
2. Bruno Poizat, Une théorie de Galois imaginaire, Journal of Symbolic Logic, 48 (1983), pp. 1151–1170.
3. Saharon Shelah, Classification Theory and the Number of Non-isomorphic Models, North-Holland, Amsterdam, 1978.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 28
03C90 – Nonclassical models
(Boolean-valued, sheaf, etc.)
28.1 Boolean valued model
A traditional model of a language makes every formula of that language either true or false. A Boolean valued model is a generalization in which formulas take on any value in a Boolean algebra.
Specifically, a Boolean valued model of a signature Σ over the language L is a set A together with a Boolean algebra B. Then the objects of the model are the functions A^B, that is, the functions from B to A.
For any formula φ, we can assign a value ⟦φ⟧ from the Boolean algebra. For example, if L is the language of first order logic, a typical recursive definition of ⟦φ⟧ might look something like this:
• ⟦f = g⟧ = ⋁_{f(b)=g(b)} b
• ⟦¬φ⟧ = ⟦φ⟧′
• ⟦φ ∨ ψ⟧ = ⟦φ⟧ ∨ ⟦ψ⟧
• ⟦∃x φ(x)⟧ = ⋁_{f ∈ A^B} ⟦φ(f)⟧
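For a finite illustration (my own construction; the text fixes no particular algebra), take B to be the powerset algebra on {1, 2} and A = {0, 1}, and compute ⟦f = g⟧ as the join of those b with f(b) = g(b):

```python
from itertools import chain, combinations, product

atoms = (1, 2)
# the Boolean algebra B: all subsets of {1, 2}, with union as join
B = [frozenset(s) for s in chain.from_iterable(combinations(atoms, r)
                                               for r in range(len(atoms) + 1))]
top, bottom = frozenset(atoms), frozenset()

def join(xs):
    return frozenset().union(*xs) if xs else bottom

A = (0, 1)
# the objects of the model: all functions B -> A, as dicts
objects = [dict(zip(B, vals)) for vals in product(A, repeat=len(B))]

def value_eq(f, g):
    """[[ f = g ]] = join of all b in B with f(b) = g(b)."""
    return join([b for b in B if f[b] == g[b]])

f, g = objects[0], objects[-1]   # g differs from f at every b
assert value_eq(f, f) == top
assert value_eq(f, g) == bottom
```

Formulas that are neither everywhere true nor everywhere false land on intermediate elements of the algebra, which is exactly the point of the generalization.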
Version: 1 Owner: Henry Author(s): Henry
Chapter 29
03C99 – Miscellaneous
29.1 axiom of foundation
The axiom of foundation (also called the axiom of regularity) is an axiom of ZF set theory prohibiting circular sets and sets with infinite levels of containment. Intuitively, it states that every set can be built up from the empty set. There are several equivalent formulations, for instance:
For any nonempty set X there is some y ∈ X such that y ∩ X = ∅.
For any set X, there is no function f from ω to the transitive closure of X such that f(n + 1) ∈ f(n).
For any formula φ, if there is any set x such that φ(x) then there is some X such that φ(X) but there is no y ∈ X such that φ(y).
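The first formulation can be checked directly on hereditarily finite sets, modelled here as nested frozensets (my own encoding):

```python
def foundation_witness(X):
    """Return some y in X with y ∩ X = ∅, as guaranteed by foundation."""
    for y in X:
        if not (y & X):
            return y
    raise AssertionError("no ∈-minimal element: foundation violated")

e = frozenset()              # the empty set
one = frozenset({e})         # {∅}
two = frozenset({e, one})    # {∅, {∅}}

X = frozenset({one, two})
# `two` fails: two ∩ X = {one}; `one` works: its only element ∅ is not in X
assert foundation_witness(X) == one
```

A circular set such as x = {x} cannot even be built with frozensets, which mirrors the axiom's prohibition.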
Version: 2 Owner: Henry Author(s): Henry
29.2 elementarily equivalent
If M and N are models of L then they are elementarily equivalent, denoted M ≡ N, iff for every sentence φ:
M ⊨ φ iff N ⊨ φ
Version: 1 Owner: Henry Author(s): Henry
29.3 elementary embedding
If A and B are models of L such that for each t ∈ T, A_t ⊆ B_t, then we say B is an elementary extension of A, or, equivalently, A is an elementary substructure of B, if, whenever φ is a formula of L with free variables included in x_1, …, x_n (of types t_1, …, t_n) and a_1, …, a_n are such that a_i ∈ A_{t_i} for each i ≤ n, then:
A ⊨ φ(a_1, …, a_n) iff B ⊨ φ(a_1, …, a_n)
If A and B are models of L then a collection of one-to-one functions f_t : A_t → B_t for each t ∈ T is an elementary embedding of A into B if whenever φ is a formula of L with free variables included in x_1, …, x_n (of types t_1, …, t_n) and a_1, …, a_n are such that a_i ∈ A_{t_i} for each i ≤ n, then:
A ⊨ φ(a_1, …, a_n) iff B ⊨ φ(f_{t_1}(a_1), …, f_{t_n}(a_n))
Version: 1 Owner: Henry Author(s): Henry
29.4 model
Let L be a logical language with function symbols F, relations R, and types T. Then
M = ⟨{M_t | t ∈ T}, {f^M | f ∈ F}, {r^M | r ∈ R}⟩
is a model of L (also called an L-structure, or, if the underlying logic is clear, a Σ-structure, where Σ is a signature specifying just F and R) if:
• Whenever f is an n-ary function symbol such that Type(f) = t and Inputs(f) = ⟨t_1, …, t_n⟩ then f^M : M_{t_1} × ⋯ × M_{t_n} → M_t
• Whenever r is an n-ary relation symbol such that Inputs(r) = ⟨t_1, …, t_n⟩ then r^M is a relation on M_{t_1} × ⋯ × M_{t_n}
If s is a term of L of type t_s without free variables then it follows that s = f s_1 ⋯ s_n and s^M = f^M(s_1^M, …, s_n^M) ∈ M_{t_s}.
If φ is a sentence then we write M ⊨ φ (and say that M satisfies φ) if φ is true in M, where truth of a relation is defined by:
• r t_1 ⋯ t_n is true if r^M(t_1^M, …, t_n^M)
• truth of a non-atomic formula is defined using the semantics of the underlying logic.
If Φ is a class of sentences, we write M ⊨ Φ if for every φ ∈ Φ, M ⊨ φ.
For any term s of L whose only free variables are included in x_1, …, x_n with types t_1, …, t_n, and for any a_1, …, a_n such that a_i ∈ M_{t_i}, define s^M(a_1, …, a_n) by:
• If s = x_i then s^M(a_1, …, a_n) = a_i
• If s = f s_1 ⋯ s_m then s^M(a_1, …, a_n) = f^M(s_1^M(a_1, …, a_n), …, s_m^M(a_1, …, a_n))
If φ is a formula whose only free variables are included in x_1, …, x_n with types t_1, …, t_n, then for any a_1, …, a_n such that a_i ∈ M_{t_i} define M ⊨ φ(a_1, …, a_n) recursively by:
• If φ = r s_1 ⋯ s_m then M ⊨ φ(a_1, …, a_n) iff r^M(s_1^M(a_1, …, a_n), …, s_m^M(a_1, …, a_n))
• Otherwise the truth of φ is determined by the semantics of the underlying logic.
As above, M ⊨ Φ(a_1, …, a_n) iff for every φ ∈ Φ, M ⊨ φ(a_1, …, a_n).
Version: 10 Owner: Henry Author(s): Henry
29.5 proof of equivalence of the formulations of foundation
We show that each of the three formulations of the axiom of foundation given are equivalent.
1 ⇒ 2:
Let X be a set and consider any function f : ω → tc(X). Consider Y = {f(n) | n < ω}. By assumption, there is some f(n) ∈ Y such that f(n) ∩ Y = ∅, hence f(n + 1) ∉ f(n).
2 ⇒ 3:
Suppose φ is a formula such that φ(x) holds for some x and such that for every X with φ(X) there is some y ∈ X with φ(y). Then define f(0) = x and let f(n + 1) be some z ∈ f(n) such that φ(z). This would construct a function violating the assumption, so there is no such φ.
3 ⇒ 1:
Let X be a nonempty set and define φ(x) ⇔ x ∈ X. Then φ is true for some x, and by assumption there is some y such that φ(y) but there is no z ∈ y such that φ(z). Hence y ∈ X but y ∩ X = ∅.
Version: 1 Owner: Henry Author(s): Henry
Chapter 30
03D10 – Turing machines and related
notions
30.1 Turing machine
A Turing machine is an imaginary computing machine invented by Alan Turing to describe
what it means to compute something.
The “physical description” of a Turing machine is a box with a tape and a tape head. The
tape consists of an inﬁnite number of cells stretching in both directions, with the tape head
always located over exactly one of these cells. Each cell has one of a ﬁnite number of symbols
written on it.
The machine has a ﬁnite set of states, and with every move the machine can change states,
change the symbol written on the current cell, and move one space left or right. The machine
has a program which speciﬁes each move based on the current state and the symbol under
the current cell. The machine stops when it reaches a combination of state and symbol
for which no move is deﬁned. One state is the start state, which the machine is in at the
beginning of a computation.
A Turing machine may be viewed as computing either a partial function or a relation. When
viewed as a function, the tape begins with a set of symbols which are the input, and when
the machine halts, whatever is on the tape is the output. For instance it is not diﬃcult to
write a program which doubles a binary number, so input of 10 (with 0 on the ﬁrst cell, 1
on the second, and all the rest blank) would give output 100. If the machine does not halt
on a particular input then the function is undeﬁned on that input.
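The doubling example can be made concrete with a small simulator. This is a sketch under my own conventions (a program is a dict mapping (state, symbol) to (state, new symbol, move), and the machine halts on an undefined pair); writing the number most-significant digit first, doubling is just appending a 0:

```python
def run_tm(program, tape, state="start", blank=" "):
    """Simulate a Turing machine until no move is defined; return the tape contents."""
    cells = dict(enumerate(tape))
    head = 0
    while (state, cells.get(head, blank)) in program:
        state, sym, move = program[(state, cells.get(head, blank))]
        cells[head] = sym
        head += {"L": -1, "R": 1}[move]
    return "".join(cells[i] for i in sorted(cells)).strip()

# double a binary number: scan right over the digits, then write a 0 and halt
double = {
    ("start", "0"): ("start", "0", "R"),
    ("start", "1"): ("start", "1", "R"),
    ("start", " "): ("done", "0", "R"),
}
assert run_tm(double, "10") == "100"
```

The machine stops as soon as it enters state "done" over a blank cell, since no move is defined for that combination, exactly as described above.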
Alternatively, a Turing machine may be viewed as computing a relation. In that case the initial symbols on the tape are again an input, and some states are denoted “accepting.” If the machine halts in an accepting state, the symbol is accepted; if it halts in any other state,
the symbol is rejected. A slight variation is when all states are accepting, and a symbol
is rejected if the machine never halts (of course, if the only method of determining if the
machine will halt is watching it then you can never be sure that it won’t stop at some point
in the future).
Another way for a Turing machine to compute a relation is to list (enumerate) its members
one by one. A relation is recursively enumerable if there is some Turing machine which can
list it in this way, or equivalently if there is a machine which halts in an accepting state only
on the members of the relation. A relation is recursive if it is recursively enumerable and its
complement is also. An equivalent deﬁnition is that there is a Turing machine which halts
in an accepting state only on members of the relation and always halts.
There are many variations on the deﬁnition of a Turing machine. The tape could be inﬁnite
in only one direction, having a ﬁrst cell but no last cell. Even stricter, a tape could move in
only one direction. It could be two (or more) dimensional. There could be multiple tapes,
and some of them could be read only. The cells could have multiple tracks, so that they hold
multiple symbols simultaneously.
The programs mentioned above deﬁne only one move for each possible state and symbol
combination; these are called deterministic. Some programs deﬁne multiple moves for some
combinations.
If the machine halts whenever there is any series of legal moves which leads to a situation
without moves, the machine is called nondeterministic. The notion is that the machine
guesses which move to use whenever there are multiple choices, and always guesses right.
Yet other machines are probabilistic; when given the choice between diﬀerent moves they
select one at random.
No matter which of these variations is used, the recursive and recursively enumerable relations and functions are unchanged (with two exceptions: one of the tapes has to move in two directions, although it need not be infinite in both directions, and there can only be a finite number of symbols, states, and tapes): the simplest imaginable machine, with a single one-way infinite tape and only two symbols, is equivalent to the most elaborate imaginable array of multidimensional tapes, lucky guesses, and fancy symbols.
However not all these machines can compute at the same speed; the speedup theorem states
that the number of moves it takes a machine to halt can be divided by an arbitrary constant
(the basic method involves increasing the number of symbols so that each cell encodes several
cells from the original machine; each move of the new machine emulates several moves from
the old one).
In particular, the question P = NP, which asks whether an important class of deterministic machines (those for which a polynomial function of the input length bounds the time it takes them to halt) is the same as the corresponding class of nondeterministic machines, is one of the major unsolved problems in modern mathematics.
Version: 2 Owner: Henry Author(s): Henry
Chapter 31
03D20 – Recursive functions and
relations, subrecursive hierarchies
31.1 primitive recursive
The class of primitive recursive functions is the smallest class of functions on the naturals (from N to N) that
1. Includes
• the zero function: z(x) = 0
• the successor function: s(x) = x + 1
• the projection functions: p_{n,m}(x_1, …, x_n) = x_m, for m ≤ n
2. Is closed under
• composition: h(x_1, …, x_n) = f(g_1(x_1, …, x_n), …, g_m(x_1, …, x_n))
• primitive recursion: h(x, 0) = f(x); h(x, y + 1) = g(x, y, h(x, y))
The primitive recursive functions are Turing-computable, but not all Turing-computable functions are primitive recursive (see Ackermann's function).
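The closure schema above can be sketched directly; addition and multiplication arise from the base functions by composition and primitive recursion (a standard construction, here in my own notation):

```python
def compose(f, *gs):
    """h(x1,...,xn) = f(g1(x1,...,xn), ..., gm(x1,...,xn))"""
    return lambda *xs: f(*(g(*xs) for g in gs))

def primrec(f, g):
    """h(x, 0) = f(x); h(x, y+1) = g(x, y, h(x, y)) -- computed iteratively."""
    def h(x, y):
        acc = f(x)
        for i in range(y):
            acc = g(x, i, acc)
        return acc
    return h

z = lambda x: 0                            # zero function
s = lambda x: x + 1                        # successor
p = lambda n, m: (lambda *xs: xs[m - 1])   # projection p_{n,m} (n is for documentation)

# add(x, 0) = x;      add(x, y+1) = s(add(x, y))
add = primrec(p(1, 1), compose(s, p(3, 3)))
# mult(x, 0) = 0;     mult(x, y+1) = add(x, mult(x, y))
mult = primrec(z, compose(add, p(3, 1), p(3, 3)))

assert add(3, 4) == 7 and mult(3, 4) == 12
```

Exponentiation, predecessor, and bounded subtraction can be built the same way, each new function using only previously constructed ones.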
Further Reading
• “Dave’s Homepage: Primitive Recursive Functions”: http://www.its.caltech.edu/ boozer/symbols/pr.h
• “Primitive recursive functions”: http://public.logica.com/ stepneys/cyc/p/primrec.htm
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 32
03D25 – Recursively (computably)
enumerable sets and degrees
32.1 recursively enumerable
For a language L, TFAE:
• There exists a Turing machine T such that ∀x (x ∈ L ⇔ the computation T(x) terminates).
• There exists a total recursive function f : N → L which is onto.
• There exists a total recursive function f : N → L which is one-to-one and onto.
A language L fulfilling any (and therefore all) of the above conditions is called recursively enumerable.
Examples
1. Any recursive language.
2. The set of encodings of Turing machines which halt when given no input.
3. The set of encodings of theorems of Peano arithmetic.
4. The set of integers n for which the hailstone sequence starting at n reaches 1. (We don't know if this set is recursive, or even if it is N; but a trivial program shows it is recursively enumerable.)
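The "trivial program" in example 4 is a dovetailing search: run every start value with ever larger step budgets, and output a value as soon as its sequence is seen to reach 1. The bounds below are my own arbitrary choices for illustration.

```python
def hailstone_reaches_one(n, max_steps):
    """Run the hailstone sequence from n for at most max_steps steps."""
    steps = 0
    while n != 1 and steps < max_steps:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return n == 1

def enumerate_hailstone(limit):
    """Dovetail: every start value gets tried with every budget below `limit`."""
    found = []
    for budget in range(1, limit):
        for n in range(1, budget):
            if n not in found and hailstone_reaches_one(n, budget):
                found.append(n)
    return found

found = enumerate_hailstone(30)
assert {1, 2, 3, 4, 5} <= set(found)
```

Every member of the set is eventually listed, but a value whose sequence never reached 1 would simply never appear; that asymmetry is exactly recursive enumerability without a decision procedure.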
Version: 3 Owner: ariels Author(s): ariels
Chapter 33
03D75 – Abstract and axiomatic
computability and recursion theory
33.1 Ackermann function
Ackermann's function A(x, y) is defined by the recurrence relations
A(0, y) = y + 1
A(x + 1, 0) = A(x, 1)
A(x + 1, y + 1) = A(x, A(x + 1, y))
Ackermann's function is an example of a recursive function that is not primitive recursive, but is instead μ-recursive (that is, Turing-computable).
Ackermann's function grows extremely fast. In fact, we find that
A(0, y) = y + 1
A(1, y) = 2 + (y + 3) − 3
A(2, y) = 2 · (y + 3) − 3
A(3, y) = 2^{y+3} − 3
A(4, y) = 2^2^···^2 − 3 (a tower of y + 3 exponentiations)
... and at this point conventional notation breaks down, and we need to employ something like Conway notation or Knuth notation for large numbers.
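The closed forms in the table can be spot-checked with a memoized implementation (illustrative only; Python's recursion depth keeps this practical for x ≤ 3):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(x, y):
    """Ackermann's function, straight from the three recurrences."""
    if x == 0:
        return y + 1
    if y == 0:
        return A(x - 1, 1)
    return A(x - 1, A(x, y - 1))

assert [A(1, y) for y in range(5)] == [2 + (y + 3) - 3 for y in range(5)]
assert [A(2, y) for y in range(5)] == [2 * (y + 3) - 3 for y in range(5)]
assert [A(3, y) for y in range(5)] == [2 ** (y + 3) - 3 for y in range(5)]
```

Already A(4, 2) has 19,729 decimal digits, so the rows beyond x = 3 are out of reach for direct evaluation, matching the remark that notation itself breaks down.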
Ackermann's function wasn't actually written in this form by its namesake, Wilhelm Ackermann. Instead, Ackermann found that the z-fold iterated exponentiation of x with y was an example of a recursive function which was not primitive recursive. Later this was simplified by Rózsa Péter to a function of two variables, similar to the one given above.
Version: 5 Owner: akrowne Author(s): akrowne
33.2 halting problem
The halting problem is to determine, given a particular input to a particular computer
program, whether the program will terminate after a ﬁnite number of steps.
The consequences of a solution to the halting problem are far-reaching. Consider some predicate P(x) regarding natural numbers; suppose we conjecture that P(x) holds for all x ∈ N. (Goldbach's conjecture, for example, takes this form.) We can write a program that will count up through the natural numbers and terminate upon finding some n such that P(n) is false; if the conjecture holds in general, then our program will never terminate. Then, without running the program, we could pass it along to a halting program to prove or disprove the conjecture.
In 1936, Alan Turing proved that the halting problem is undecidable; the argument is presented here informally. Consider a hypothetical program that decides the halting problem:
Algorithm Halt(P, I)
Input: A computer program P and some input I for P
Output: True if P halts on I and false otherwise
The implementation of the algorithm, as it turns out, is irrelevant. Now consider another program:
Algorithm Break(x)
Input: An irrelevant parameter x
begin
if Halt(Break, x) then
while true do
nothing
else
return true
end
In other words, we can design a program that will break any solution to the halting problem.
If our halting solution determines that Break halts, then it will immediately enter an inﬁnite
loop; otherwise, Break will return immediately. We must conclude that the Halt program
does not decide the halting problem.
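The diagonal argument can be mimicked in miniature. This is purely illustrative: real infinite loops are replaced by a sentinel value, and the `halt` candidates are ordinary Python functions standing in for claimed deciders.

```python
def make_breaker(halt):
    """Build a program that defeats the claimed halting decider `halt`."""
    def breaker(x):
        if halt(breaker, x):
            return "LOOPS"    # stands in for: while true do nothing
        return "HALTS"
    return breaker

def defeated(halt):
    """True if `halt` mispredicts the behaviour of its own breaker."""
    b = make_breaker(halt)
    prediction = halt(b, b)   # halt's verdict: does b halt on input b?
    behaviour = b(b)          # what b actually does on input b
    return (prediction and behaviour == "LOOPS") or \
           (not prediction and behaviour == "HALTS")

# whatever a total decider answers, it is wrong about (Break, Break)
assert defeated(lambda program, inp: True)
assert defeated(lambda program, inp: False)
```

Any total decider falls to the same trap: if it predicts halting, the breaker loops, and if it predicts looping, the breaker halts.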
Version: 2 Owner: vampyr Author(s): vampyr
Chapter 34
03E04 – Ordered sets and their
coﬁnalities; pcf theory
34.1 another definition of cofinality
Let κ be a limit ordinal (e.g. a cardinal). The cofinality of κ, cf(κ), could also be defined as:
cf(κ) = inf{|U| : U ⊆ κ s.t. sup U = κ}
(sup U is calculated using the natural order of the ordinals). The cofinality of a cardinal is always a regular cardinal, and hence cf(κ) = cf(cf(κ)).
This definition is equivalent to the parent definition.
Version: 5 Owner: x bas Author(s): x bas
34.2 cofinality
If α is an ordinal and X ⊆ α then X is said to be cofinal in α if whenever y ∈ α there is x ∈ X with y ≤ x.
A map f : α → β between ordinals α and β is said to be cofinal if the image of f is cofinal in β.
If β is an ordinal, the cofinality cf(β) of β is the least ordinal α such that there is a cofinal map f : α → β. Note that cf(β) ≤ β, because the identity map on β is cofinal.
It is not hard to show that the cofinality of any ordinal is a cardinal, in fact a regular cardinal: a cardinal κ is said to be regular if cf(κ) = κ and singular if cf(κ) < κ.
For any infinite cardinal κ it can be shown that κ < κ^{cf(κ)}, and so also κ < cf(2^κ).
Examples
0 and 1 are regular cardinals. All other finite cardinals have cofinality 1 and are therefore singular.
ℵ_0 is regular.
Any infinite successor cardinal is regular.
The smallest infinite singular cardinal is ℵ_ω. In fact, the map f : ω → ℵ_ω given by f(n) = ω_n is cofinal, so cf(ℵ_ω) = ℵ_0. Note that cf(2^{ℵ_0}) > ℵ_0, and consequently 2^{ℵ_0} ≠ ℵ_ω.
Version: 14 Owner: yark Author(s): yark, Evandar
34.3 maximal element
Let ≤ be an ordering on a set S, and let A ⊆ S. Then, with respect to the ordering ≤,
• a ∈ A is the least element of A if a ≤ x for all x ∈ A.
• a ∈ A is a minimal element of A if there exists no x ∈ A such that x ≤ a and x ≠ a.
• a ∈ A is the greatest element of A if x ≤ a for all x ∈ A.
• a ∈ A is a maximal element of A if there exists no x ∈ A such that a ≤ x and x ≠ a.
Examples.
• The natural numbers N ordered by divisibility (|) have a least element, 1. The natural numbers greater than 1 (N \ {1}) have no least element, but infinitely many minimal elements (the primes). In neither case is there a greatest or maximal element.
• The negative integers ordered by the standard definition of ≤ have a maximal element which is also the greatest element, −1. They have no minimal or least element.
• The natural numbers N ordered by the standard ≤ have a least element, 1, which is also a minimal element. They have no greatest or maximal element.
• The rationals greater than zero with the standard ordering ≤ have no least element or minimal element, and no maximal or greatest element.
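The divisibility example can be checked by brute force (an illustrative sketch with my own helper names):

```python
def minimal_elements(A, leq):
    """Elements of A with no strictly smaller element of A under the ordering leq."""
    return {a for a in A if not any(leq(x, a) and x != a for x in A)}

def divides(x, y):
    return y % x == 0

A = set(range(2, 31))   # naturals greater than 1, truncated at 30
# under divisibility, the minimal elements are exactly the primes
assert minimal_elements(A, divides) == {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}
```

The same helper finds, for instance, that the finite set {−5, …, −1} under the standard ≤ has the single minimal element −5, which there coincides with the least element.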
Version: 3 Owner: akrowne Author(s): akrowne
34.4 partitions less than cofinality
If λ < cf(κ) then κ → (κ)^1_λ.
This follows easily from the definition of cofinality. For any coloring f : κ → λ, define g : λ → κ + 1 by g(α) = |f^{−1}(α)|. Then κ = Σ_{α<λ} g(α), and by the normal rules of cardinal arithmetic sup_{α<λ} g(α) = κ. Since λ < cf(κ), there must be some α < λ such that g(α) = κ.
Version: 1 Owner: Henry Author(s): Henry
34.5 well ordered set
A well-ordered set is a totally ordered set in which every nonempty subset has a least member.
An example of a well-ordered set is the set of positive integers with the standard order relation (Z^+, <), because any nonempty subset of it has a least member. However, R^+ (the positive reals) is not a well-ordered set under the usual order, because (0, 1) = {x : 0 < x < 1} is a nonempty subset that doesn't contain a least number.
A well-ordering of a set X is the result of defining a binary relation < on X in such a way that X becomes well-ordered with respect to <.
Version: 9 Owner: drini Author(s): drini, vypertd
34.6 pigeonhole principle
For any natural number n, there does not exist a bijection between n and a proper subset of n.
The name of the theorem is based upon the observation that pigeons will not occupy a pigeonhole that already contains a pigeon, so there is no way to fit n pigeons into fewer than n pigeonholes.
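For small n the principle can be verified exhaustively: every assignment of n pigeons to fewer than n pigeonholes reuses some hole (an illustrative brute-force check, assuming holes < pigeons):

```python
from itertools import product

def pigeonhole_holds(pigeons, holes):
    """True if every function from `pigeons` pigeons into `holes` holes fails to be injective."""
    return all(len(set(f)) < pigeons
               for f in product(range(holes), repeat=pigeons))

assert pigeonhole_holds(4, 3)
assert pigeonhole_holds(5, 2)
```

The check enumerates all holes^pigeons assignments, so it is only feasible for tiny cases, but it exercises exactly the statement of the theorem.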
Version: 6 Owner: djao Author(s): djao
34.7 proof of pigeonhole principle
It will ﬁrst be proven that, if a bijection exists between two ﬁnite sets, then the two sets
have the same number of elements.
Let S and T be finite sets and f : S → T be a bijection. Since f is injective, |S| = |ran f|. Since f is surjective, |T| = |ran f|. Thus, |S| = |T|.
Since the pigeonhole principle is the contrapositive of the proven statement, it follows that
the pigeonhole principle holds.
Version: 2 Owner: Wkbj79 Author(s): Wkbj79
34.8 tree (set theoretic)
In set theory, a tree is defined to be a set T and a relation <_T ⊆ T × T such that:

• <_T is a partial ordering of T
• For any t ∈ T, {s ∈ T | s <_T t} is well-ordered

The nodes immediately greater than a node are termed its children, the node immediately less is its parent (if it exists), any lesser node is an ancestor and any greater node is a descendant. A node with no ancestors is a root.

The partial ordering represents distance from the root, and the well-ordering requirement prohibits any loops or splits below a node (that is, each node has at most one parent, and therefore at most one grandparent, and so on). Since there is generally no requirement that the tree be connected, the null ordering makes any set into a tree, although the tree is a trivial one, since each element of the set forms a single node with no children.

Since the set of ancestors of any node is well-ordered, we can associate it with an ordinal. We call this the height, and write ht(t) = o.t.({s ∈ T | s <_T t}). This all accords with normal usage: a root has height 0, something immediately above the root has height 1, and so on. We can then assign a height to the tree itself, which we define to be the least ordinal greater than the height of any element of the tree. For finite trees this is just one greater than the height of its tallest element, but infinite trees may not have a tallest element, so we define ht(T) = sup{ht(t) + 1 | t ∈ T}.

For every α < ht(T) we define the α-th level to be the set T_α = {t ∈ T | ht(t) = α}. So of course T_0 is the set of all roots of the tree. If α < ht(T) then T(α) is the subtree of elements with height less than α: t ∈ T(α) ↔ t ∈ T ∧ ht(t) < α.

We call a tree a κ-tree for any cardinal κ if |T| = κ and ht(T) = κ. If κ is finite, the only way to do this is to have a single branch of length κ.
Version: 6 Owner: Henry Author(s): Henry
34.9 κ-complete

A structured set S (typically a filter or a Boolean algebra) is κ-complete if, given any B ⊆ S with |B| < κ, ⋂B ∈ S. It is complete if it is κ-complete for all κ.

Similarly, a partial order is κ-complete if any sequence of fewer than κ elements has an upper bound within the partial order.

An ℵ₁-complete structure is called countably complete.
Version: 8 Owner: Henry Author(s): Henry
34.10 Cantor’s diagonal argument
One of the starting points in Cantor’s development of set theory was his discovery that there
are diﬀerent degrees of inﬁnity. The rational numbers, for example, are countably inﬁnite; it
is possible to enumerate all the rational numbers by means of an inﬁnite list. By contrast,
the real numbers are uncountable: it is impossible to enumerate them by means of an
inﬁnite list. These discoveries underlie the idea of cardinality, which is expressed by saying
that two sets have the same cardinality if there exists a bijective correspondence between
them.
In essence, Cantor discovered two theorems: ﬁrst, that the set of real numbers has the same
cardinality as the power set of the naturals; and second, that a set and its power set have a
diﬀerent cardinality (see Cantor’s theorem). The proof of the second result is based on the
celebrated diagonalization argument.
Cantor showed that for every given infinite sequence of real numbers x_1, x_2, x_3, … it is possible to construct a real number x that is not on that list. Consequently, it is impossible to enumerate the real numbers; they are uncountable. No generality is lost if we suppose that all the numbers on the list are between 0 and 1. Certainly, if this subset of the real numbers is uncountable, then the full set is uncountable as well.
Let us write our sequence as a table of decimal expansions:

0 . d_{11} d_{12} d_{13} d_{14} …
0 . d_{21} d_{22} d_{23} d_{24} …
0 . d_{31} d_{32} d_{33} d_{34} …
0 . d_{41} d_{42} d_{43} d_{44} …
⋮

where

x_n = 0.d_{n1} d_{n2} d_{n3} d_{n4} …,

and the expansion avoids an infinite trailing string of the digit 9.
For each n = 1, 2, … we choose a digit e_n that is different from d_{nn} and not equal to 9, and consider the real number x with decimal expansion

0.e_1 e_2 e_3 …

By construction, this number x is different from every member of the given sequence. After all, for every n, the number x differs from the number x_n in the n-th decimal digit. The claim is proven.
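The construction can be mimicked for any finite list of decimal strings; the digit rule below (use 5, or 6 when the diagonal digit is 5) is one of many valid choices:

```python
def diagonal(expansions):
    """Given decimal expansions '0.d1d2...', produce a number differing
    from the k-th expansion in its k-th digit (never using the digit 9)."""
    digits = []
    for k, s in enumerate(expansions):
        d = s.split('.')[1][k]                    # k-th digit after the point
        digits.append('5' if d != '5' else '6')   # differ from d, avoid 9
    return '0.' + ''.join(digits)

xs = ['0.1415', '0.5555', '0.1234', '0.9999']
x = diagonal(xs)
# x differs from the k-th number in the k-th decimal place
assert all(x.split('.')[1][k] != s.split('.')[1][k] for k, s in enumerate(xs))
```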
Version: 6 Owner: rmilson Author(s): rmilson, slider142
34.11 Fodor’s lemma
If κ is a regular, uncountable cardinal, S is a stationary subset of κ, and f : κ → κ is regressive on S (that is, f(α) < α for any α ∈ S) then there is some γ and some stationary S_0 ⊆ S such that f(α) = γ for any α ∈ S_0.
Version: 1 Owner: Henry Author(s): Henry
34.12 Schröder-Bernstein theorem

Let S and T be sets. If there exists an injection f : S → T and an injection g : T → S, then S and T have the same cardinality.

The Schröder-Bernstein theorem is useful for proving many results about cardinality, since it replaces one hard problem (finding a bijection between S and T) with two generally easier problems (finding two injections).
Version: 2 Owner: vampyr Author(s): vampyr
34.13 Veblen function
The Veblen function is used to obtain larger ordinal numbers than those provided by exponentiation. It builds on a hierarchy of closed and unbounded classes:

• Cr(0) is the class of additively indecomposable ordinals, H
• Cr(α + 1) = Cr(α)′, the set of fixed points of the enumerating function of Cr(α)
• Cr(λ) = ⋂_{α<λ} Cr(α) for limit λ

The Veblen function φ_α(β) is defined by setting φ_α equal to the enumerating function of Cr(α).

We call an ordinal α strongly critical if α ∈ Cr(α). The class of strongly critical ordinals is written SC, and its enumerating function is written f_SC(α) = Γ_α.

Γ_0, the first strongly critical ordinal, is also called the Feferman-Schütte ordinal.
Version: 1 Owner: Henry Author(s): Henry
34.14 additively indecomposable

An ordinal α is called additively indecomposable if it is not 0 and for any β, γ < α, β + γ < α. The class of additively indecomposable ordinals is denoted H.

Obviously 1 ∈ H, since 0 + 0 < 1. Also ω ∈ H since the sum of two finite numbers is still finite, and no finite numbers other than 1 are in H.

H is closed and unbounded, so the enumerating function of H is normal. In fact, f_H(α) = ω^α.

The derivative f′_H(α) is written ε_α. The number ε_0 = ω^{ω^{ω^{⋯}}}, therefore, is the first fixed point of the series ω, ω^ω, ω^{ω^ω}, …
Version: 1 Owner: Henry Author(s): Henry
34.15 cardinal number
A cardinal number is an ordinal number A with the property that A ⊂ X for every ordinal number X which has the same cardinality as A.
Version: 3 Owner: djao Author(s): rmilson, djao
34.16 cardinal successor
The cardinal successor of a cardinal κ is the least cardinal greater than κ. It is denoted κ⁺.
Version: 1 Owner: yark Author(s): yark
34.17 cardinality
Cardinality is a notion of the size of a set which does not rely on numbers. It is a relative notion because, for instance, two sets may each have an infinite number of elements, but one may have a greater cardinality. That is, it may have a "more infinite" number of elements.

The formal definition of cardinality rests upon the notion of a one-to-one mapping between sets.

Definition.

Sets A and B have the same cardinality if there is a one-to-one and onto function f from A to B (a bijection). Symbolically, we write |A| = |B|. This is also called equipotence.

Results.

1. A is equipotent to A.
2. If A is equipotent to B, then B is equipotent to A.
3. If A is equipotent to B and B is equipotent to C, then A is equipotent to C.

Proof.

1. The identity function on A is a bijection from A to A.
2. If f is a bijection from A to B, then f^{-1} exists and is a bijection from B to A.
3. If f is a bijection from A to B and g is a bijection from B to C, then g ◦ f is a bijection from A to C.

Example.

The set of even integers E has the same cardinality as the set of integers Z. We define f : E → Z such that f(x) = x/2. Then f is a bijection, therefore |E| = |Z|.
Version: 10 Owner: akrowne Author(s): akrowne
34.18 cardinality of a countable union
Let C be a countable collection of countable sets. Then ⋃C is countable.
Version: 1 Owner: vampyr Author(s): vampyr
34.19 cardinality of the rationals
The set of rational numbers Q is countable, and therefore its cardinality is ℵ_0.
Version: 2 Owner: quadrate Author(s): quadrate
34.20 classes of ordinals and enumerating functions
A class of ordinals is just a subclass of the class of all ordinals. For every class of ordinals M there is an enumerating function f_M defined by transfinite recursion:

f_M(α) = min{x ∈ M | f_M(β) < x for all β < α}

This function simply lists the elements of M in order. Note that it is not necessarily defined for all ordinals, although it is defined for a segment of the ordinals. Let otype(M) = dom(f_M) be the order type of M, which is either On or some ordinal α. If α < β then f_M(α) < f_M(β), so f_M is an order isomorphism between otype(M) and M.

We say M is κ-closed if for any N ⊆ M such that |N| < κ, also sup N ∈ M.

We say M is κ-unbounded if for any α < κ there is some β ∈ M such that α < β.

We say a function f : M → On is κ-continuous if M is κ-closed and

f(sup N) = sup{f(α) | α ∈ N}

for any N ⊆ M with |N| < κ.

A function is κ-normal if it is order preserving (α < β implies f(α) < f(β)) and κ-continuous. In particular, the enumerating function of a κ-closed class is always κ-normal.

All these definitions can easily be extended to all ordinals: a class is closed (resp. unbounded) if it is κ-closed (resp. κ-unbounded) for all κ. A function is continuous (resp. normal) if it is κ-continuous (resp. κ-normal) for all κ.
Version: 2 Owner: Henry Author(s): Henry
34.21 club
If κ is a cardinal then a set C ⊆ κ is closed iff for any S ⊆ C and α < κ, if sup(S ∩ α) = α then α ∈ C. (That is, if the limit of some sequence in C is less than κ then the limit is also in C.)

If κ is a cardinal and C ⊆ κ then C is unbounded if, for any α < κ, there is some β ∈ C such that α < β.

If a set is both closed and unbounded then it is a club set.
Version: 1 Owner: Henry Author(s): Henry
34.22 club ﬁlter
If κ is a regular uncountable cardinal then club(κ), the filter of all sets containing a club subset of κ, is a κ-complete filter closed under diagonal intersection, called the club filter.

To see that this is a filter, note that κ ∈ club(κ) since κ is obviously both closed and unbounded. If x ∈ club(κ) then any subset of κ containing x is also in club(κ), since x, and therefore anything containing it, contains a club set.

It is a κ-complete filter because the intersection of fewer than κ club sets is a club set. To see this, suppose ⟨C_i⟩_{i<α} is a sequence of club sets where α < κ. Obviously C = ⋂ C_i is closed, since any sequence which appears in C appears in every C_i, and therefore its limit is also in every C_i. To show that it is unbounded, take some β < κ. Let ⟨β_{1,i}⟩ be an increasing sequence with β_{1,1} > β and β_{1,i} ∈ C_i for every i < α. Such a sequence can be constructed, since every C_i is unbounded. Since α < κ and κ is regular, the limit of this sequence is less than κ. We call it β_2, and define a new sequence ⟨β_{2,i}⟩ similar to the previous sequence. We can repeat this process, getting a sequence of sequences ⟨β_{j,i}⟩ where each element of a sequence is greater than every member of the previous sequences. Then for each i < α, ⟨β_{j,i}⟩_j is an increasing sequence contained in C_i, and all these sequences have the same limit (the limit of ⟨β_{j,i}⟩). This limit is then contained in every C_i, and therefore in C, and is greater than β.

To see that club(κ) is closed under diagonal intersection, let ⟨C_i⟩, i < κ, be a sequence of club sets, and let C = ∆_{i<κ} C_i. Since the diagonal intersection contains the intersection, obviously C is unbounded. Then suppose S ⊆ C and sup(S ∩ α) = α. Then S ⊆ C_β for every β < α, and since each C_β is closed, α ∈ C_β, so α ∈ C.
Version: 2 Owner: Henry Author(s): Henry
34.23 countable
A set S is countable if there exists a bijection between S and some subset of N.
All ﬁnite sets are countable.
Version: 2 Owner: vampyr Author(s): vampyr
34.24 countably inﬁnite
A set S is countably infinite if there is a bijection between S and N.
As the name implies, any countably inﬁnite set is both countable and inﬁnite.
Countably inﬁnite sets are also sometimes called denumerable.
Version: 3 Owner: vampyr Author(s): vampyr
34.25 ﬁnite
A set S is finite if there exists a natural number n and a bijection from S to n. If there exists such an n, then it is unique, and it is called the cardinality of S.
Version: 2 Owner: djao Author(s): djao
34.26 ﬁxed points of normal functions
If f : M → On is a function then Fix(f) = {x ∈ M | f(x) = x} is the set of fixed points of f. f′, the derivative of f, is the enumerating function of Fix(f).

If f is κ-normal then Fix(f) is κ-closed and κ-unbounded, and therefore f′ is also κ-normal.
Version: 1 Owner: Henry Author(s): Henry
34.27 height of an algebraic number
Suppose we have an algebraic number whose polynomial of smallest degree (with the coefficients relatively prime) is given by:

∑_{i=0}^{n} c_i x^i

Then the height h of the algebraic number is given by:

h = n + ∑_{i=0}^{n} |c_i|

This is a quantity which is used in the proof of the existence of transcendental numbers.
REFERENCES
1. Shaw, R. Mathematics Society Notes, 1st edition. King’s School Chester, 2003.
2. Stewart, I. Galois Theory, 3rd edition. Chapman and Hall, 2003.
3. Baker, A. Transcendental Number Theory, 1st edition. Cambridge University Press, 1975.
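The finiteness of the set of polynomials with a given height, used in the existence proof below, can be made concrete; `polys_of_height` is our own illustrative helper:

```python
from itertools import product

def polys_of_height(k):
    """All integer coefficient tuples (c_0, ..., c_n) with c_n != 0 whose
    height n + sum(|c_i|) equals k."""
    out = []
    for n in range(1, k):                  # the degree n satisfies n < k
        budget = k - n                     # remaining sum of moduli
        for cs in product(range(-budget, budget + 1), repeat=n + 1):
            if cs[-1] != 0 and n + sum(abs(c) for c in cs) == k:
                out.append(cs)
    return out

# Only finitely many polynomials, hence finitely many algebraic numbers,
# have any given height.
assert len(polys_of_height(2)) == 2   # height 2: the polynomials x and -x
```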
Version: 13 Owner: kidburla2003 Author(s): kidburla2003
34.28 if A is infinite and B is a finite subset of A, then A ∖ B is infinite

Theorem. If A is an infinite set and B is a finite subset of A, then A ∖ B is infinite.

Proof. The proof is by contradiction. If A ∖ B were finite, there would exist a k ∈ N and a bijection f : {1, …, k} → A ∖ B. Since B is finite, there also exists a bijection g : {1, …, m} → B. We can then define a mapping h : {1, …, k + m} → A by

h(i) = f(i) when i ∈ {1, …, k},
h(i) = g(i − k) when i ∈ {k + 1, …, k + m}.

Since f and g are bijections, h is a bijection between a finite subset of N and A. This is a contradiction since A is infinite. ∎
Version: 3 Owner: mathcam Author(s): matte
34.29 limit cardinal
A limit cardinal is a cardinal κ such that λ⁺ < κ for every cardinal λ < κ. Here λ⁺ denotes the cardinal successor of λ. If 2^λ < κ for every cardinal λ < κ, then κ is called a strong limit cardinal.

Every strong limit cardinal is a limit cardinal, because λ⁺ ≤ 2^λ holds for every cardinal λ. Under GCH, every limit cardinal is a strong limit cardinal because in this case λ⁺ = 2^λ for every infinite cardinal λ.

The three smallest limit cardinals are 0, ℵ_0 and ℵ_ω. Note that some authors do not count 0, or sometimes even ℵ_0, as a limit cardinal.
Version: 7 Owner: yark Author(s): yark
34.30 natural number
Given the Zermelo-Fraenkel axioms of set theory, one can prove that there exists an inductive set X such that ∅ ∈ X. The natural numbers N are then defined to be the intersection of all subsets of X which are inductive sets and contain the empty set as an element.

The first few natural numbers are:

• 0 := ∅
• 1 := 0′ = {0} = {∅}
• 2 := 1′ = {0, 1} = {∅, {∅}}
• 3 := 2′ = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}

Note that the set 0 has zero elements, the set 1 has one element, the set 2 has two elements, etc. Informally, the set n is the set consisting of the n elements 0, 1, …, n − 1, and n is both a subset of N and an element of N.

In some contexts (most notably, in number theory), it is more convenient to exclude 0 from the set of natural numbers, so that N = {1, 2, 3, …}. When it is not explicitly specified, one must determine from context whether 0 is being considered a natural number or not.

Addition of natural numbers is defined inductively as follows:

• a + 0 := a for all a ∈ N
• a + b′ := (a + b)′ for all a, b ∈ N

Multiplication of natural numbers is defined inductively as follows:

• a · 0 := 0 for all a ∈ N
• a · b′ := (a · b) + a for all a, b ∈ N

The natural numbers form a monoid under either addition or multiplication. There is an ordering relation on the natural numbers, defined by: a ≤ b if a ⊆ b.
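The inductive definitions of addition and multiplication translate directly into code; the sketch below models the successor operation on plain integers:

```python
def succ(n):
    """The successor operation n' = n + 1."""
    return n + 1

def add(a, b):
    """a + 0 = a;  a + b' = (a + b)'."""
    return a if b == 0 else succ(add(a, b - 1))

def mul(a, b):
    """a * 0 = 0;  a * b' = (a * b) + a."""
    return 0 if b == 0 else add(mul(a, b - 1), a)

assert add(3, 4) == 7
assert mul(3, 4) == 12
```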
Version: 11 Owner: djao Author(s): djao
34.31 ordinal arithmetic
Ordinal arithmetic is the extension of normal arithmetic to the transfinite ordinal numbers. The successor operation Sx (sometimes written x + 1, although this notation risks confusion with the general definition of addition) is part of the definition of the ordinals, and addition is naturally defined by recursion over this:

• x + 0 = x
• x + Sy = S(x + y)
• x + α = sup_{γ<α} (x + γ) for limit α

If x and y are finite then x + y under this definition is just the usual sum; however, when x and y become infinite, there are differences. In particular, ordinal addition is not commutative. For example,

ω + 1 = ω + S0 = S(ω + 0) = Sω

but

1 + ω = sup_{n<ω} (1 + n) = ω.

Multiplication in turn is defined by iterated addition:

• x · 0 = 0
• x · Sy = x · y + x
• x · α = sup_{γ<α} (x · γ) for limit α

Once again this definition is equivalent to normal multiplication when x and y are finite, but is not commutative:

ω · 2 = ω · 1 + ω = ω + ω

but

2 · ω = sup_{n<ω} (2 · n) = ω.

Both these operations are strictly increasing in the second argument and weakly increasing in the first argument. That is, if α < β then

• γ + α < γ + β
• γ · α < γ · β (for γ > 0)
• α + γ ≤ β + γ
• α · γ ≤ β · γ
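The non-commutativity of addition can be illustrated for ordinals below ω², written ω·a + b and coded as pairs (a, b); this is our own toy model, not a general ordinal implementation:

```python
def o_add(x, y):
    """Addition of ordinals w*a + b, represented as pairs (a, b)."""
    a1, b1 = x
    a2, b2 = y
    if a2 > 0:               # y is at least w: the finite tail b1 is absorbed
        return (a1 + a2, b2)
    return (a1, b1 + b2)     # y is finite: ordinary addition of tails

w = (1, 0)     # the first infinite ordinal, omega
one = (0, 1)

assert o_add(w, one) == (1, 1)   # w + 1 is strictly greater than w
assert o_add(one, w) == (1, 0)   # 1 + w = w
assert o_add(w, one) != o_add(one, w)
```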
Version: 2 Owner: Henry Author(s): Henry
34.32 ordinal number
An ordinal number is a well-ordered set S such that, for every x ∈ S,

x = {y ∈ S | y < x}

(where < is the ordering relation on S).
Version: 2 Owner: djao Author(s): djao
34.33 power set
Definition If X is a set, then the power set of X is the set whose elements are the subsets of X. It is usually denoted as P(X) or 2^X.

1. If X is a finite set, then |2^X| = 2^{|X|}. This property motivates the notation 2^X.
2. For an arbitrary set X, Cantor's theorem states two things about the power set: first, there is no bijection between X and P(X); second, the cardinality of 2^X is greater than the cardinality of X.
Version: 5 Owner: matte Author(s): matte, drini
34.34 proof of Fodor’s lemma
If we let f^{-1} : κ → P(S) be the inverse of f restricted to S, then Fodor's lemma is equivalent to the claim that for any function such that α ∈ f(κ) implies α < κ there is some α ∈ S such that f^{-1}(α) is stationary.

Then if Fodor's lemma is false, for every α ∈ S there is some club set C_α such that C_α ∩ f^{-1}(α) = ∅. Let C = ∆_{α<κ} C_α. The club sets are closed under diagonal intersection, so C is also club and therefore there is some α ∈ S ∩ C. Then α ∈ C_β for each β < α, and so there can be no β < α such that α ∈ f^{-1}(β), so f(α) ≥ α, a contradiction.
Version: 1 Owner: Henry Author(s): Henry
34.35 proof of Schröder-Bernstein theorem

We first prove as a lemma that for any B ⊆ A, if there is an injection f : A → B, then there is also a bijection h : A → B.

Define a sequence {C_k}_{k=0}^∞ of subsets of A by C_0 = A ∖ B and, for k ≥ 0, C_{k+1} = f(C_k). If the C_k are not pairwise disjoint, then there are minimal integers j and k with j < k and C_j ∩ C_k nonempty. Then k ≥ 1, and so C_k ⊆ B. Since C_0 ∩ B = ∅, we have j > 0. Thus C_j = f(C_{j−1}) and C_k = f(C_{k−1}). By assumption, f is injective, so C_{j−1} ∩ C_{k−1} is nonempty, contradicting the minimality of j. Hence the C_k are pairwise disjoint.

Now let C = ⋃_{k=0}^∞ C_k, and define h : A → B by

h(z) = f(z) if z ∈ C,
h(z) = z if z ∉ C.

If z ∈ C, then h(z) = f(z) ∈ B. But if z ∉ C, then z ∈ B, and so h(z) ∈ B. Hence h is well-defined; h is injective by construction. Let b ∈ B. If b ∉ C, then h(b) = b. Otherwise, b ∈ C_k = f(C_{k−1}) for some k ≥ 1, and so there is some a ∈ C_{k−1} such that h(a) = f(a) = b. Thus h is bijective; in particular, if B = A, then h is simply the identity map on A.

To prove the theorem, suppose f : S → T and g : T → S are injective. Then the composition gf : S → g(T) is also injective. By the lemma, there is a bijection h′ : S → g(T). The injectivity of g implies that g^{-1} : g(T) → T exists and is bijective. Define h : S → T by h(z) = g^{-1}(h′(z)); this map is a bijection, and so S and T have the same cardinality.
Version: 13 Owner: mps Author(s): mps
34.36 proof of ﬁxed points of normal functions
Suppose f is a κ-normal function, consider any α < κ, and define a sequence by α_0 = α and α_{n+1} = f(α_n). Let α_ω = sup_{n<ω} α_n. Then, since f is continuous,

f(α_ω) = sup_{n<ω} f(α_n) = sup_{n<ω} α_{n+1} = α_ω.

So Fix(f) is unbounded.

Suppose N is a set of fixed points of f with |N| < κ. Then

f(sup N) = sup_{α∈N} f(α) = sup_{α∈N} α = sup N,

so sup N is also a fixed point of f, and therefore Fix(f) is closed.
Version: 1 Owner: Henry Author(s): Henry
34.37 proof of the existence of transcendental numbers
Cantor discovered this proof.

Lemma. Consider a natural number k. Then the number of algebraic numbers of height k is finite.

Proof. To see this, note that the sum in the definition of height is positive. Therefore

n < k,

where n is the degree of the polynomial. For a polynomial of degree n, there are only n + 1 coefficients, and the sum of their moduli is k − n, so there is only a finite number of ways of choosing the coefficients. For every polynomial with degree less than n, there are fewer ways. So the sum of all of these is also finite, and this bounds the number of algebraic numbers with height k (with some repetitions). The result follows.

Proof of the main theorem. One can start writing a list of the algebraic numbers by putting down all those of height 1, then those of height 2, and so on, writing them in numerical order within each of those sets, which is possible because they are finite sets. This implies that the set of algebraic numbers is countable. However, by diagonalisation, the set of real numbers is uncountable. So there are more real numbers than algebraic numbers; the result follows.
Version: 5 Owner: kidburla2003 Author(s): kidburla2003
34.38 proof of theorems in additively indecomposable

H is closed

Let {α_i | i < κ} be some increasing sequence of elements of H and let α = sup{α_i | i < κ}. Then for any x, y < α, it must be that x < α_i and y < α_j for some i, j < κ. But then x + y < α_{max{i,j}} < α.

H is unbounded

Consider any α, and define a sequence by α_0 = Sα and α_{n+1} = α_n + α_n. Let α_ω = sup_{n<ω} α_n be the limit of this sequence. If x, y < α_ω then it must be that x < α_i and y < α_j for some i, j < ω, and therefore x + y < α_{max{i,j}+1}. Note that α_ω is, in fact, the next element of H, since every element in the sequence is clearly additively decomposable.

f_H(α) = ω^α

Since 0 is not in H, we have f_H(0) = 1.

For any α + 1, f_H(α + 1) is the least additively indecomposable number greater than f_H(α). Let α_0 = S f_H(α) and α_{n+1} = α_n + α_n = α_n · 2. Then f_H(α + 1) = sup_{n<ω} α_n = sup_{n<ω} S f_H(α) · 2^n = f_H(α) · ω, which by induction is ω^α · ω = ω^{α+1}. The limit case is trivial since H is closed and unbounded, so f_H is continuous.
Version: 1 Owner: Henry Author(s): Henry
34.39 proof that the rationals are countable
Suppose we have a rational number α = p/q in lowest terms with q > 0. Define the "height" of this number as h(α) = |p| + q. For example, h(0) = h(0/1) = 1, h(−1) = h(1) = 2, and h(−2) = h(−1/2) = h(1/2) = h(2) = 3. Note that the set of numbers with a given height is finite. The rationals can now be partitioned into classes by height, and the numbers in each class can be ordered by way of increasing numerators. Thus it is possible to assign a natural number to each of the rationals by starting with 0, −1, 1, −2, −1/2, 1/2, 2, −3, … and progressing through classes of increasing heights. This assignment constitutes a bijection between N and Q and proves that Q is countable.

A corollary is that the irrational numbers are uncountable, since the union of the irrationals and the rationals is R, which is uncountable.
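The enumeration described above can be generated directly; the sketch below uses Python's `fractions` module for lowest-terms bookkeeping:

```python
from fractions import Fraction

def rationals_up_to_height(max_h):
    """List rationals p/q in lowest terms, q > 0, ordered by height |p| + q
    and by increasing numerator within each height class."""
    out = []
    for h in range(1, max_h + 1):
        cls = []
        for q in range(1, h + 1):
            p = h - q                      # candidate |p|
            candidates = {Fraction(0, 1)} if p == 0 else {Fraction(p, q), Fraction(-p, q)}
            for r in candidates:
                # keep only fractions whose lowest-terms height is exactly h
                if abs(r.numerator) + r.denominator == h:
                    cls.append(r)
        out.extend(sorted(set(cls), key=lambda r: r.numerator))
    return out

# reproduces the start of the sequence 0, -1, 1, -2, -1/2, 1/2, 2, ...
seq = rationals_up_to_height(3)
assert seq[:4] == [Fraction(0), Fraction(-1), Fraction(1), Fraction(-2)]
```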
Version: 5 Owner: quadrate Author(s): quadrate
34.40 stationary set
If κ is a cardinal, S ⊆ κ, and S intersects every club in κ, then S is stationary. If S is not stationary then it is thin.
Version: 1 Owner: Henry Author(s): Henry
34.41 successor cardinal
A successor cardinal is a cardinal that is the cardinal successor of some cardinal.
Version: 1 Owner: yark Author(s): yark
34.42 uncountable
Definition A set is uncountable if it is not countable. In other words, a set S is uncountable if there is no subset of N with the same cardinality as S.

1. All uncountable sets are infinite. However, the converse is not true. For instance, the natural numbers and the rational numbers, although infinite, are both countable.
2. The real numbers form an uncountable set. The famous proof of this result is based on Cantor's diagonal argument.
Version: 2 Owner: matte Author(s): matte, vampyr
34.43 von Neumann integer
A von Neumann integer is not an integer, but instead a construction of a natural number
using some basic set notation. The von Neumann integers are deﬁned inductively. The
von Neumann integer zero is deﬁned to be the empty set, ∅, and there are no smaller von
Neumann integers. The von Neumann integer N is then the set of all von Neumann integers less than N. The set of von Neumann integers is the set of all finite von Neumann ordinals.
This form of construction from very basic notions of sets is applicable to various forms of
set theory (for instance, ZermeloFraenkel set theory). While this construction suﬃces to
deﬁne the set of natural numbers, a little more work must be done to deﬁne the set of all
integers.
Examples

0 = ∅
1 = {0} = {∅}
2 = {0, 1} = {∅, {∅}}
3 = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}
⋮
N = {0, 1, …, N − 1}
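The construction is short to express with Python's `frozenset` (an illustrative sketch):

```python
def von_neumann(n):
    """The von Neumann integer n: the set of all smaller von Neumann integers."""
    result = frozenset()                        # 0 is the empty set
    for _ in range(n):
        result = result | frozenset([result])   # successor: N' = N union {N}
    return result

zero, one, two, three = (von_neumann(k) for k in range(4))
assert zero == frozenset()
assert one == frozenset([zero])
assert two == frozenset([zero, one])
assert len(three) == 3        # the set n has exactly n elements
```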
Version: 3 Owner: mathcam Author(s): mathcam, Logan
34.44 von Neumann ordinal
The von Neumann ordinals are a method of defining ordinals in set theory.

The von Neumann ordinal α is defined to be the well-ordered set containing the von Neumann ordinals which precede α. The set of finite von Neumann ordinals is known as the von Neumann integers. Every well-ordered set is isomorphic to a von Neumann ordinal.

They can be constructed by transfinite recursion as follows:

• The empty set is 0.
• Given any ordinal α, the ordinal α + 1 (the successor of α) is defined to be α ∪ {α}.
• Given a set A of ordinals, ⋃_{a∈A} a is an ordinal.

If an ordinal is the successor of another ordinal, it is a successor ordinal. If an ordinal is neither 0 nor a successor ordinal then it is a limit ordinal. The first limit ordinal is named ω.

The class of ordinals is denoted On.

The von Neumann ordinals have the convenient property that if a < b then a ∈ b and a ⊂ b.
Version: 5 Owner: Henry Author(s): Henry, Logan
34.45 weakly compact cardinal
Weakly compact cardinals are (large) infinite cardinals which have a property related to the syntactic compactness theorem for first order logic. Specifically, for any infinite cardinal κ, consider the language L_{κ,κ}.

This language is identical to first order logic except that:

(a) infinite conjunctions and disjunctions of fewer than κ formulas are allowed
(b) infinite strings of fewer than κ quantifiers are allowed

The weak compactness theorem for L_{κ,κ} states that if ∆ is a set of sentences of L_{κ,κ} such that |∆| = κ and any θ ⊆ ∆ with |θ| < κ is consistent, then ∆ is consistent.

A cardinal κ is weakly compact if the weak compactness theorem holds for L_{κ,κ}.
Version: 1 Owner: Henry Author(s): Henry
34.46 weakly compact cardinals and the tree property
A cardinal κ is weakly compact if and only if it is inaccessible and has the tree property.

Weak compactness implies tree property

Let κ be a weakly compact cardinal and let (T, <_T) be a κ-tree with all levels smaller than κ. We define a theory in L_{κ,κ} with a constant c_x for each x ∈ T, and a single unary relation B. Then our theory ∆ consists of the sentences:

• ¬[B(c_x) ∧ B(c_y)] for every incompatible x, y ∈ T
• ⋁_{x∈T_α} B(c_x) for each α < κ

It should be clear that B represents membership in a cofinal branch, since the first class of sentences asserts that no two incompatible elements are both in B, while the second class states that the branch intersects every level.

Clearly |∆| = κ, since there are κ elements in T, and hence at most κ · κ = κ sentences in the first group, and of course there are κ levels and therefore κ sentences in the second group.

Now consider any Σ ⊆ ∆ with |Σ| < κ. Fewer than κ sentences of the second group are included, so the elements x whose constants c_x appear in Σ all lie in T(α) for some α < κ. But since T has branches of arbitrary height below κ, T(α) ⊨ Σ.

Since κ is weakly compact, it follows that ∆ also has a model, and that model has a set of constants c_x with B(c_x) whose corresponding elements of T intersect every level and are pairwise compatible, therefore forming a cofinal branch of T, proving that T is not Aronszajn.
Version: 4 Owner: Henry Author(s): Henry
34.47 Cantor’s theorem
Let X be any set and P(X) its power set. Cantor's theorem states that there is no bijection between X and P(X). Moreover, the cardinality of P(X) is strictly greater than that of X, that is, |X| < |P(X)|.
Version: 2 Owner: igor Author(s): igor
34.48 proof of Cantor’s theorem
The proof of this theorem is fairly simple using the following construction, which is central to Cantor's diagonal argument.

Consider a function f : X → P(X) from X to its power set. Then we define the set Z ⊆ X as follows:

Z = {x ∈ X | x ∉ f(x)}.

Suppose that f is, in fact, a bijection. Then there must exist an x ∈ X such that f(x) = Z. But, by construction, we have the following contradiction:

x ∈ Z ⇔ x ∉ f(x) ⇔ x ∉ Z.

Hence f cannot be a bijection between X and P(X).
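For finite sets the diagonal set Z can be computed explicitly, and exhaustive search confirms it is never in the image of f; `check_no_surjection` is our own illustrative helper:

```python
from itertools import combinations, product

def all_subsets(X):
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def check_no_surjection(X):
    """Every f : X -> P(X) misses its diagonal set Z = {x : x not in f(x)}."""
    P = all_subsets(X)
    for values in product(P, repeat=len(X)):    # one choice of f(x) per x
        f = dict(zip(X, values))
        Z = frozenset(x for x in X if x not in f[x])
        if Z in f.values():                     # would contradict the theorem
            return False
    return True

assert check_no_surjection([0, 1])      # 4^2 = 16 functions checked
assert check_no_surjection([0, 1, 2])   # 8^3 = 512 functions checked
```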
Version: 2 Owner: igor Author(s): igor
34.49 additive
Let φ be some real function defined on an algebra of sets 𝒜. We say that φ is additive if, whenever A and B are disjoint sets in 𝒜, we have

φ(A ∪ B) = φ(A) + φ(B).

Suppose 𝒜 is a σ-algebra. Then, given any sequence ⟨A_i⟩ of disjoint sets in 𝒜, if we have

φ(⋃ A_i) = ∑ φ(A_i)

we say that φ is countably additive or σ-additive.

Useful properties of a nonnegative additive set function φ include the following:

1. φ(∅) = 0.
2. If A ⊆ B, then φ(A) ≤ φ(B).
3. If A ⊆ B, then φ(B ∖ A) = φ(B) − φ(A).
4. Given A and B, φ(A ∪ B) + φ(A ∩ B) = φ(A) + φ(B).
Version: 3 Owner: vampyr Author(s): vampyr
34.50 antisymmetric
A relation R on A is antisymmetric iff ∀x, y ∈ A, (xRy ∧ yRx) → (x = y). The number of possible antisymmetric relations on A is 2^n · 3^{(n² − n)/2} out of the 2^{n²} total possible relations, where n = |A|.

Antisymmetric is not the same thing as "not symmetric", as it is possible for a relation to be both antisymmetric and symmetric at the same time. However, a relation R that is both antisymmetric and symmetric has the condition that xRy ⇒ x = y. There are only 2^n such possible relations on A.

An example of an antisymmetric relation on A = {◦, ∗, •} would be R = {(∗, ∗), (∗, ◦), (◦, •), (•, •)}. One relation that isn't antisymmetric is R = {(•, ◦), (∗, ◦), (◦, •)} because we have both •R◦ and ◦R•, but ◦ ≠ •.
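The counting formula can be verified by brute force for small n (an illustrative sketch):

```python
from itertools import product

def count_antisymmetric(n):
    """Count relations R on an n-set with no distinct x, y having both xRy and yRx."""
    elements = range(n)
    pairs = [(x, y) for x in elements for y in elements]
    count = 0
    for bits in product([0, 1], repeat=len(pairs)):   # every relation on the set
        R = {p for p, b in zip(pairs, bits) if b}
        if all(not ((x, y) in R and (y, x) in R)
               for x in elements for y in elements if x != y):
            count += 1
    return count

# matches 2^n * 3^((n^2 - n) / 2)
for n in range(4):
    assert count_antisymmetric(n) == 2**n * 3**((n*n - n) // 2)
```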
Version: 4 Owner: xriso Author(s): xriso
34.51 constant function
Definition Suppose X and Y are sets and f : X → Y is a function. Then f is a constant function if f(a) = f(b) for all a, b in X.
Properties
1. The composition of a constant function with any function (for which composition is
deﬁned) is a constant function.
2. A constant map between topological spaces is continuous.
Version: 2 Owner: mathcam Author(s): matte
34.52 direct image
Let f : A → B be a function, and let U ⊆ A be a subset. The direct image of U is the set f(U) ⊆ B consisting of all elements of B which equal f(u) for some u ∈ U.
Version: 4 Owner: djao Author(s): rmilson, djao
34.53 domain
Let R be a binary relation. Then the set of all x such that xRy for some y is called the domain of R. That is, the domain of R is the set of all first coordinates of the ordered pairs in R.
Version: 5 Owner: akrowne Author(s): akrowne
34.54 dynkin system
Let Ω be a set, and P(Ω) be the power set of Ω. A Dynkin system on Ω is a set D ⊆ P(Ω) such that

1. Ω ∈ D
2. A, B ∈ D and A ⊆ B ⇒ B \ A ∈ D
3. A_n ∈ D, A_n ⊆ A_(n+1), n ≥ 1 ⇒ ⋃_(k=1)^∞ A_k ∈ D.

Let A ⊆ P(Ω) be a set, and consider

Γ = {X : X is a Dynkin system and A ⊆ X}. (34.54.1)

We define the intersection of all the Dynkin systems containing A as

D(A) := ⋂_(X∈Γ) X. (34.54.2)

One can easily verify that D(A) is itself a Dynkin system and that it contains A. We call D(A) the Dynkin system generated by A. It is the "smallest" Dynkin system containing A.

A Dynkin system which is also a π-system is a σ-algebra.
Version: 4 Owner: drummond Author(s): drummond
34.55 equivalence class
Let S be a set with an equivalence relation ∼. An equivalence class of S under ∼ is a subset E ⊂ S such that

• If x ∈ E and y ∈ S, then x ∼ y if and only if y ∈ E
• If S is nonempty, then E is nonempty

For x ∈ S, the equivalence class containing x is often denoted by [x], so that

[x] := {y ∈ S | x ∼ y}.

The set of all equivalence classes of S under ∼ is defined to be the set of all subsets of S which are equivalence classes of S under ∼.

For any equivalence relation ∼, the set of all equivalence classes of S under ∼ is a partition of S, and this correspondence is a bijection between the set of equivalence relations on S and the set of partitions of S (consisting of nonempty sets).
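As a concrete illustration (a sketch with names of our own choosing), the classes of "congruent mod 3" on {0, ..., 8} can be computed in Python, and one can check directly that they partition the set:

```python
def equivalence_classes(S, related):
    """Group the elements of S into classes of the equivalence
    relation given by the predicate `related`."""
    classes = []
    for x in S:
        for cls in classes:
            # compare x against one representative of the class
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes

S = set(range(9))
classes = equivalence_classes(S, lambda x, y: x % 3 == y % 3)
# the classes are pairwise disjoint and their union is S
assert set().union(*classes) == S
assert sum(len(c) for c in classes) == len(S)
print(sorted(sorted(c) for c in classes))
# [[0, 3, 6], [1, 4, 7], [2, 5, 8]]
```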
Version: 3 Owner: djao Author(s): djao, rmilson
34.56 ﬁbre
Given a function f : X → Y, a fibre is an inverse image of an element of Y. That is, given y ∈ Y, f⁻¹({y}) = {x ∈ X | f(x) = y} is a fibre.
Example
Define f : R² → R by f(x, y) = x² + y². Then the fibres of f consist of concentric circles about the origin, the origin itself, and empty sets, depending on whether we look at the inverse image of a positive number, zero, or a negative number respectively.
Version: 3 Owner: dublisk Author(s): dublisk
34.57 ﬁltration
A filtration is a sequence of sets A_1, A_2, ..., A_n with

A_1 ⊂ A_2 ⊂ ⋯ ⊂ A_n.

If one considers the sets A_1, ..., A_n as elements of a larger set which are partially ordered by inclusion, then a filtration is simply a finite chain with respect to this partial ordering.

It should be noted that in some contexts the word "filtration" may also be employed to describe an infinite chain.
Version: 3 Owner: djao Author(s): djao
34.58 ﬁnite character
A family F of sets is of finite character if

1. For each A ∈ F, every finite subset of A belongs to F;
2. If every finite subset of a given set A belongs to F, then A belongs to F.
Version: 4 Owner: Koro Author(s): Koro
34.59 ﬁx (transformation actions)
Let A be a set, and T : A → A a transformation of that set. We say that x ∈ A is fixed by T, or that T fixes x, whenever

T(x) = x.

The subset of fixed elements is called the fixed set of T, and is frequently denoted as A^T.

We say that a subset B ⊂ A is fixed by T whenever all elements of B are fixed by T, i.e.

B ⊂ A^T.

If this is so, T restricts to the identity transformation on B.

The definition generalizes readily to a family of transformations with common domain

T_i : A → A, i ∈ I.

In this case we say that a subset B ⊂ A is fixed, if it is fixed by all the elements of the family, i.e. whenever

B ⊂ ⋂_(i∈I) A^(T_i).
Version: 7 Owner: rmilson Author(s): rmilson
34.60 function
Let A and B be sets. A function f : A → B is a relation f from A to B such that

• For every a ∈ A, there exists b ∈ B such that (a, b) ∈ f.
• If a ∈ A, b_1, b_2 ∈ B, and (a, b_1) ∈ f and (a, b_2) ∈ f, then b_1 = b_2.

For a ∈ A, one usually denotes by f(a) the unique element b ∈ B such that (a, b) ∈ f. The set A is called the domain of f, and the set B is called the codomain.
Version: 5 Owner: djao Author(s): djao
34.61 functional
Definition A functional T is a function mapping a function space (often a vector space) V into a field of scalars K, typically taken to be R or C.

Discussion Examples of functionals include the integral and entropy. A functional T is often indicated by the use of square brackets, T[x] rather than T(x).

The linear functionals are those functionals T that satisfy

• T(x + y) = T(x) + T(y)
• T(cx) = cT(x)

for any c ∈ K, x, y ∈ V.
Version: 4 Owner: mathcam Author(s): mathcam, drummond
34.62 generalized cartesian product
Given any family of sets {A_j}_(j∈J) indexed by an index set J, the generalized cartesian product

∏_(j∈J) A_j

is the set of all functions

f : J → ⋃_(j∈J) A_j

such that f(j) ∈ A_j for all j ∈ J.

For each i ∈ J, the projection map

π_i : ∏_(j∈J) A_j → A_i

is the function defined by

π_i(f) := f(i).
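For a finite index set the definition can be made concrete: each element of the product is literally a function (here, a dict) from J into the union of the families. The Python sketch below (with helper names of our own invention) builds all such choice functions and a projection:

```python
from itertools import product

def generalized_product(families):
    """All functions f : J -> union of the A_j with f(j) in A_j,
    represented as dicts, for a finite index set J."""
    J = list(families)
    for values in product(*(families[j] for j in J)):
        yield dict(zip(J, values))

def projection(i):
    """pi_i(f) := f(i)."""
    return lambda f: f[i]

families = {"a": {0, 1}, "b": {"x", "y", "z"}}
elems = list(generalized_product(families))
print(len(elems))          # |A_a| * |A_b| = 2 * 3 = 6
pi_a = projection("a")
assert all(pi_a(f) in families["a"] for f in elems)
```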
Version: 4 Owner: djao Author(s): djao
34.63 graph
The graph of a function f : X → Y is the subset of X × Y given by {(x, f(x)) : x ∈ X}.
Version: 7 Owner: Koro Author(s): Koro
34.64 identity map
Definition If X is a set, then the identity map on X is the mapping that maps each element in X to itself.

Properties

1. An identity map is always a bijection.
2. Suppose X has two topologies τ_1 and τ_2. Then the identity mapping I : (X, τ_1) → (X, τ_2) is continuous if and only if τ_1 is finer than τ_2, i.e., τ_2 ⊆ τ_1.
3. The identity map on the n-sphere is homotopic to the antipodal map A : S^n → S^n if n is odd [1].

REFERENCES

1. V. Guillemin, A. Pollack, Differential topology, Prentice-Hall Inc., 1974.
Version: 3 Owner: bwebste Author(s): matte
34.65 inclusion mapping
Definition Let X be a subset of Y. Then the inclusion map from X to Y is the mapping

ι : X → Y
x ↦ x.

In other words, the inclusion map is simply a fancy way to say that every element in X is also an element in Y.

To indicate that a mapping is an inclusion mapping, one usually writes ↪ instead of → when defining or mentioning an inclusion map. This hooked arrow symbol ↪ can be seen as a combination of the symbols ⊂ and →. In the above definition, we have not used this convention. However, examples of this convention would be:

• Let ι : X ↪ Y be the inclusion map from X to Y.
• We have the inclusion S^n ↪ R^(n+1).
Version: 4 Owner: matte Author(s): matte
34.66 inductive set
An inductive set is a set X with the property that, for every x ∈ X, the successor x′ of x is also an element of X.

One major example of an inductive set is the set of natural numbers N.
Version: 7 Owner: djao Author(s): djao
34.67 invariant
Let A be a set, and T : A → A a transformation of that set. We say that x ∈ A is an invariant of T whenever x is fixed by T:

T(x) = x.

We say that a subset B ⊂ A is invariant with respect to T whenever

T(B) ⊂ B.

If this is so, the restriction of T is a well-defined transformation of the invariant subset:

T|_B : B → B.

The definition generalizes readily to a family of transformations with common domain

T_i : A → A, i ∈ I.

In this case we say that a subset is invariant, if it is invariant with respect to all elements of the family.
Version: 5 Owner: rmilson Author(s): rmilson
34.68 inverse function theorem
Let f be a continuously differentiable, vector-valued function mapping the open set E ⊂ R^n to R^n and let S = f(E). If, for some point a ∈ E, the Jacobian, |J_f(a)|, is nonzero, then there is a uniquely defined function g and two open sets X ⊂ E and Y ⊂ S such that

1. a ∈ X, f(a) ∈ Y;
2. Y = f(X);
3. f : X → Y is one-one;
4. g is continuously differentiable on Y and g(f(x)) = x for all x ∈ X.

Simplest case

When n = 1, this theorem becomes: Let f be a continuously differentiable, real-valued function defined on the open interval I. If for some point c ∈ I, f′(c) ≠ 0, then there is a neighbourhood [α, β] of c in which f is strictly monotonic. Then y ↦ f⁻¹(y) is a continuously differentiable, strictly monotonic function from [f(α), f(β)] to [α, β]. If f is increasing (or decreasing) on [α, β], then so is f⁻¹ on [f(α), f(β)].
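The simplest case can be illustrated numerically: invert a strictly monotonic f by bisection and check that the derivative of the inverse at y = f(a) is 1/f′(a). The particular f, interval, and tolerances below are arbitrary choices for the sketch:

```python
def f(x):
    return x**3 + x            # f'(x) = 3x^2 + 1 > 0, so f is strictly increasing

def fprime(x):
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0):
    """Invert the strictly increasing f on [lo, hi] by bisection."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

a = 1.3
y = f(a)
h = 1e-6
# derivative of the inverse at y, by a symmetric difference quotient
g_prime = (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)
# matches the theorem's prediction (f^{-1})'(y) = 1 / f'(a)
assert abs(g_prime - 1 / fprime(a)) < 1e-5
```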
Note
The inverse function theorem is a special case of the implicit function theorem where the
dimension of each variable is the same.
Version: 6 Owner: vypertd Author(s): vypertd
34.69 inverse image
Let f : A → B be a function, and let U ⊂ B be a subset. The inverse image of U is the set f⁻¹(U) ⊂ A consisting of all elements a ∈ A such that f(a) ∈ U.

The inverse image commutes with all set operations: For any collection {U_i}_(i∈I) of subsets of B, we have the following identities for

1. unions:

f⁻¹(⋃_(i∈I) U_i) = ⋃_(i∈I) f⁻¹(U_i)

2. intersections:

f⁻¹(⋂_(i∈I) U_i) = ⋂_(i∈I) f⁻¹(U_i)

and for any subsets U and V of B, we have identities for

3. complements:

(f⁻¹(U))^∁ = f⁻¹(U^∁)

4. set differences:

f⁻¹(U \ V) = f⁻¹(U) \ f⁻¹(V)

5. symmetric differences:

f⁻¹(U △ V) = f⁻¹(U) △ f⁻¹(V)

In addition, for X ⊂ A and Y ⊂ B, the inverse image satisfies the miscellaneous identities

6. (f|_X)⁻¹(Y) = X ∩ f⁻¹(Y)

7. f(f⁻¹(Y)) = Y ∩ f(A)

8. X ⊂ f⁻¹(f(X)), with equality if f is injective.
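For finite sets the identities above can be checked directly. In the Python sketch below (an illustration with a deliberately non-injective f), `|`, `&`, `-`, `^` are Python's union, intersection, difference, and symmetric difference:

```python
def inverse_image(f, domain, U):
    """f^{-1}(U): all a in the domain with f(a) in U."""
    return {a for a in domain if f(a) in U}

A = set(range(-4, 5))
f = lambda x: x * x                    # not injective: f(-x) = f(x)
U, V = {0, 1, 4}, {4, 9, 16}

inv = lambda S: inverse_image(f, A, S)
assert inv(U | V) == inv(U) | inv(V)   # unions
assert inv(U & V) == inv(U) & inv(V)   # intersections
assert inv(U - V) == inv(U) - inv(V)   # set differences
assert inv(U ^ V) == inv(U) ^ inv(V)   # symmetric differences

# X is contained in f^{-1}(f(X)); equality fails without injectivity
X = {1, 2}
assert X <= inv({f(x) for x in X})
assert X != inv({f(x) for x in X})     # since f(-1) = f(1), f(-2) = f(2)
```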
Version: 5 Owner: djao Author(s): djao, rmilson
34.70 mapping
Synonym of function, although typical usage suggests that mapping is the more generic term.
In a geometric context, the term function is often employed to connote a mapping whose
purpose is to assign values to the elements of its domain, i.e. a function deﬁnes a ﬁeld of
values, whereas mapping seems to have a more geometric connotation, as in a mapping of
one space to another.
Version: 8 Owner: rmilson Author(s): rmilson
34.71 mapping of period n is a bijection
Theorem Suppose X is a set. Then a mapping f : X → X of period n is a bijection.

Proof. If n = 1, the claim is trivial; f is the identity mapping. Suppose n = 2, 3, .... Then for any x ∈ X, we have x = f(f^(n−1)(x)), so f is a surjection. To see that f is an injection, suppose f(x) = f(y) for some x, y in X. Applying f^(n−1) to both sides and using that f^n is the identity, it follows that x = y. ∎
Version: 3 Owner: Koro Author(s): matte
34.72 partial function
A function f : A → B is sometimes called a total function, to signify that f(a) is defined for every a ∈ A. If C is any set such that C ⊇ A, then f is also a partial function from C to B.

Clearly if f is a function from A to B then it is a partial function from A to B, but a partial function need not be defined for every element of its domain.
Version: 6 Owner: Henry Author(s): Henry
34.73 partial mapping
Let X_1, ..., X_n and Y be sets, and let f be a function of n variables: f : X_1 × X_2 × ⋯ × X_n → Y. Fix x_i ∈ X_i for 2 ≤ i ≤ n. The induced mapping a ↦ f(a, x_2, ..., x_n) is called the partial mapping determined by f corresponding to the first variable.

In the case where n = 2, the map defined by a ↦ f(a, x) is often denoted f(·, x). Further, any function f : X_1 × X_2 → Y determines a mapping F from X_1 into the set of mappings of X_2 into Y, namely F : x ↦ (y ↦ f(x, y)). The converse holds too, and it is customary to identify f with F. Many of the "canonical isomorphisms" that we come across (e.g. in multilinear algebra) are illustrations of this kind of identification.
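The identification of f with F is exactly what functional programmers call currying. A minimal Python sketch of both directions:

```python
def curry(f):
    """Identify f : X1 x X2 -> Y with F : X1 -> (X2 -> Y)."""
    return lambda x: lambda y: f(x, y)

def uncurry(F):
    """The converse identification."""
    return lambda x, y: F(x)(y)

f = lambda x, y: x * 10 + y
F = curry(f)
partial = F(3)                 # the partial mapping f(3, .)
assert partial(7) == f(3, 7) == 37
g = uncurry(F)                 # recovers a two-argument function
assert g(3, 7) == 37
```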
Version: 2 Owner: mathcam Author(s): mathcam, Larry Hammick
34.74 period of mapping
Definition Suppose X is a set and f is a mapping f : X → X. If f^n is the identity mapping on X for some n = 1, 2, ..., then f is said to be a mapping of period n. Here, the notation f^n means the n-fold composition f ∘ ⋯ ∘ f.

Examples

1. A mapping f is of period 1 if and only if f is the identity mapping.
2. Suppose V is a vector space. Then a linear involution L : V → V is a mapping of period 2. For example, the reflection mapping x ↦ −x is a mapping of period 2.
3. In the complex plane, the mapping z ↦ e^(−2πi/n) z is a mapping of period n for n = 1, 2, ....
4. Let us consider the function space spanned by the trigonometric functions sin and cos. On this space, the derivative is a mapping of period 4.
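Example 4 can be checked with a small computation. Writing a·sin + b·cos as the coefficient pair (a, b), the derivative sends (a, b) to (−b, a), since d/dx (a sin x + b cos x) = −b sin x + a cos x; the sketch below verifies that four applications return every pair to itself:

```python
def derivative(coeffs):
    """d/dx on span{sin, cos}: a*sin + b*cos  |->  -b*sin + a*cos,
    with an element encoded as the coefficient pair (a, b)."""
    a, b = coeffs
    return (-b, a)

v = (2.0, -5.0)                # represents 2*sin(x) - 5*cos(x)
w = v
for _ in range(4):
    w = derivative(w)
assert w == v                  # the derivative has period 4 here
assert derivative(derivative(v)) == (-2.0, 5.0)   # two applications give -v
```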
Properties

1. Suppose X is a set. Then a mapping f : X → X of period n is a bijection. (proof.)
2. Suppose X is a topological space. Then a continuous mapping f : X → X of period n is a homeomorphism.
Version: 8 Owner: bwebste Author(s): matte
34.75 pi-system
Let Ω be a set, and P(Ω) be the power set of Ω. A π-system (or pi-system) on Ω is a set F ⊆ P(Ω) such that

A, B ∈ F ⇒ A ∩ B ∈ F. (34.75.1)

A π-system is closed under finite intersection.
Version: 1 Owner: drummond Author(s): drummond
34.76 proof of inverse function theorem
Since det Df(a) ≠ 0, the Jacobian matrix Df(a) is invertible: let A = (Df(a))⁻¹ be its inverse. Choose r > 0 and ρ > 0 such that

B = B_ρ(a) ⊂ E,
‖Df(x) − Df(a)‖ ≤ 1/(2n‖A‖) for all x ∈ B,
r ≤ ρ/(2‖A‖).

Let y ∈ B_r(f(a)) and consider the mapping

F_y : B → R^n, F_y(x) = x + A·(y − f(x)).

If x ∈ B we have

‖DF_y(x)‖ = ‖I − A·Df(x)‖ ≤ ‖A‖·‖Df(a) − Df(x)‖ ≤ 1/(2n).

Let us verify that F_y is a contraction mapping. Given x_1, x_2 ∈ B, by the mean-value theorem on R^n we have

|F_y(x_1) − F_y(x_2)| ≤ sup_(x∈[x_1,x_2]) n‖DF_y(x)‖ · |x_1 − x_2| ≤ (1/2)|x_1 − x_2|.

Also notice that F_y(B) ⊂ B. In fact, given x ∈ B,

|F_y(x) − a| ≤ |F_y(x) − F_y(a)| + |F_y(a) − a| ≤ (1/2)|x − a| + |A·(y − f(a))| ≤ ρ/2 + ‖A‖r ≤ ρ.

So F_y : B → B is a contraction mapping, and hence by the contraction principle there exists one and only one solution to the equation

F_y(x) = x,

i.e. x is the only point in B such that f(x) = y.

Hence given any y ∈ B_r(f(a)) we can find x ∈ B which solves f(x) = y. Let us call g : B_r(f(a)) → B the mapping which gives this solution, i.e.

f(g(y)) = y.

Let V = B_r(f(a)) and U = g(V). Clearly f : U → V is one to one, and the inverse of f is g. We have to prove that U is a neighbourhood of a. However, since f is continuous at a, we know that there exists a ball B_δ(a) such that f(B_δ(a)) ⊂ B_r(f(a)), and hence we have B_δ(a) ⊂ U.

We now want to study the differentiability of g. Let y ∈ V be any point, take u ∈ R^n and ε > 0 so small that y + εu ∈ V. Let x = g(y) and define v(ε) = g(y + εu) − g(y).

First of all, notice that since

|F_y(x + v(ε)) − F_y(x)| ≤ (1/2)|v(ε)|,

we have

(1/2)|v(ε)| ≥ |v(ε) − εA·u| ≥ |v(ε)| − ε‖A‖·|u|,

and hence

|v(ε)| ≤ 2ε‖A‖·|u|.

On the other hand, we know that f is differentiable in x, that is, for all v it holds that

f(x + v) − f(x) = Df(x)·v + h(v)

with lim_(v→0) h(v)/|v| = 0. So we get

|h(v(ε))|/ε ≤ 2‖A‖·|u| · |h(v(ε))|/|v(ε)| → 0 when ε → 0.

So

lim_(ε→0) (g(y + εu) − g(y))/ε = lim_(ε→0) v(ε)/ε = lim_(ε→0) Df(x)⁻¹·(εu − h(v(ε)))/ε = Df(x)⁻¹·u,

that is,

Dg(y) = Df(x)⁻¹.
Version: 2 Owner: paolini Author(s): paolini
34.77 proper subset
Let S be a set and let X ⊂ S be a subset. We say X is a proper subset of S if X ≠ S.
Version: 2 Owner: djao Author(s): djao
34.78 range
Let R be a binary relation. Then the set of all y such that xRy for some x is called the range of R. That is, the range of R is the set of all second coordinates in the ordered pairs of R.
In terms of functions, this means that the range of a function is the full set of values it can
take on (the outputs), given the full set of parameters (the inputs). Note that the range is
a subset of the codomain.
Version: 2 Owner: akrowne Author(s): akrowne
34.79 reﬂexive
A relation R on A is reflexive if and only if ∀a ∈ A, aRa. The number of possible reflexive relations on A is 2^(n² − n), out of the 2^(n²) total possible relations, where n = |A|.

For example, let A = {1, 2, 3} and let R be a relation on A. Then R = {(1, 1), (2, 2), (3, 3), (1, 3), (3, 2)} would be a reflexive relation, because it contains all the pairs (a, a), a ∈ A. However, R = {(1, 1), (2, 2), (2, 3), (3, 1)} is not reflexive, because it would also have to contain (3, 3).
Version: 6 Owner: xriso Author(s): xriso
34.80 relation
A relation is any subset of a cartesian product of two sets A and B. That is, any R ⊂ A × B is a binary relation. One may write aRb to denote a ∈ A, b ∈ B and (a, b) ∈ R. A subset of A × A is simply called a relation on A.

An example of a relation is the less-than relation on integers, i.e. < ⊂ Z × Z. (1, 2) ∈ <, but (2, 1) ∉ <.
Version: 3 Owner: Logan Author(s): Logan
34.81 restriction of a mapping
Definition Let f : X → Y be a mapping from a set X to a set Y. If A is a subset of X, then the restriction of f to A is the mapping

f|_A : A → Y
a ↦ f(a).
Version: 2 Owner: matte Author(s): matte
34.82 set diﬀerence
Let A and B be sets in some ambient set X. The set difference, or simply difference, between A and B (in that order) is the set of all elements that are contained in A, but not in B. This set is denoted by A \ B, and we have

A \ B = {x ∈ X | x ∈ A, x ∉ B} = A ∩ B^∁,

where B^∁ is the complement of B in X.

Remark

Sometimes the set difference is also written as A − B. However, if A and B are sets in a vector space, then A − B is commonly used to denote the set

A − B = {a − b | a ∈ A, b ∈ B},

which, in general, is not the same as the set difference of A and B. Therefore, to avoid confusion, one should try to avoid the notation A − B for the set difference.
Version: 5 Owner: matte Author(s): matte, quadrate
34.83 symmetric
A relation R on A is symmetric iff ∀x, y ∈ A, xRy → yRx. The number of possible symmetric relations on A is 2^((n² + n)/2) out of the 2^(n²) total possible relations, where n = |A|.

An example of a symmetric relation on A = {a, b, c} would be R = {(a, a), (a, b), (b, a), (a, c), (c, a)}. One relation that is not symmetric is R = {(b, b), (a, b), (b, a), (c, b)}, because since we have (c, b) we must also have (b, c) in order for R to be symmetric.
Version: 6 Owner: xriso Author(s): xriso
34.84 symmetric diﬀerence
The symmetric difference between two sets A and B, written A △ B, is the set of all x such that either x ∈ A or x ∈ B but not both. It is equal to (A − B) ∪ (B − A) and to (A ∪ B) − (A ∩ B).

The symmetric difference operator is commutative, since A △ B = (A − B) ∪ (B − A) = (B − A) ∪ (A − B) = B △ A.

The operation is also associative. To see this, consider three sets A, B, and C. Any given element x is in zero, one, two, or all three of these sets. If x is not in any of A, B, or C, then it is not in the symmetric difference of the three sets no matter how it is computed. If x is in exactly one of the sets, let that set be A; then x ∈ A △ B and x ∈ (A △ B) △ C; also, x ∉ B △ C and therefore x ∈ A △ (B △ C). If x is in exactly two of the sets, let them be A and B; then x ∉ A △ B and x ∉ (A △ B) △ C; also, x ∈ B △ C, but because x is in A, x ∉ A △ (B △ C). If x is in all three, then x ∉ A △ B but x ∈ (A △ B) △ C; similarly, x ∉ B △ C but x ∈ A △ (B △ C). Thus, A △ (B △ C) = (A △ B) △ C.

In general, an element will be in the symmetric difference of several sets iff it is in an odd number of the sets.
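Both the associativity argument and the "odd number of sets" characterisation can be confirmed exhaustively on a small universe; the following Python sketch checks every triple of subsets of a four-element set:

```python
from itertools import combinations

def symm_diff(A, B):
    """Elements in exactly one of A and B."""
    return (A - B) | (B - A)

universe = {1, 2, 3, 4}
subsets = [set(c) for k in range(5) for c in combinations(universe, k)]

# commutativity and associativity, checked over all 16^2 and 16^3 cases
assert all(symm_diff(A, B) == symm_diff(B, A)
           for A in subsets for B in subsets)
assert all(symm_diff(symm_diff(A, B), C) == symm_diff(A, symm_diff(B, C))
           for A in subsets for B in subsets for C in subsets)

# membership in A1 ^ ... ^ Ak  <=>  membership in an odd number of the Ai
sets = [{1, 2}, {2, 3}, {1, 2, 4}]
total = set()
for s in sets:
    total = symm_diff(total, s)
assert total == {x for x in universe
                 if sum(x in s for s in sets) % 2 == 1}
```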
Version: 5 Owner: mathcam Author(s): mathcam, quadrate
34.85 the inverse image commutes with set operations
Theorem. Let f be a mapping from X to Y. If {B_i}_(i∈I) is a (possibly uncountable) collection of subsets in Y, then the following relations hold for the inverse image:

(1) f⁻¹(⋃_(i∈I) B_i) = ⋃_(i∈I) f⁻¹(B_i)

(2) f⁻¹(⋂_(i∈I) B_i) = ⋂_(i∈I) f⁻¹(B_i)

If A and B are subsets in Y, then we also have:

(3) For the set complement,

(f⁻¹(A))^∁ = f⁻¹(A^∁).

(4) For the set difference,

f⁻¹(A \ B) = f⁻¹(A) \ f⁻¹(B).

(5) For the symmetric difference,

f⁻¹(A △ B) = f⁻¹(A) △ f⁻¹(B).

Proof. For part (1), we have

f⁻¹(⋃_(i∈I) B_i) = {x ∈ X | f(x) ∈ ⋃_(i∈I) B_i}
= {x ∈ X | f(x) ∈ B_i for some i ∈ I}
= ⋃_(i∈I) {x ∈ X | f(x) ∈ B_i}
= ⋃_(i∈I) f⁻¹(B_i).

Similarly, for part (2), we have

f⁻¹(⋂_(i∈I) B_i) = {x ∈ X | f(x) ∈ ⋂_(i∈I) B_i}
= {x ∈ X | f(x) ∈ B_i for all i ∈ I}
= ⋂_(i∈I) {x ∈ X | f(x) ∈ B_i}
= ⋂_(i∈I) f⁻¹(B_i).

For the set complement, suppose x ∈ (f⁻¹(A))^∁. This is equivalent to f(x) ∉ A, or f(x) ∈ A^∁, which is equivalent to x ∈ f⁻¹(A^∁). Since the set difference A \ B can be written as A ∩ B^∁, part (4) follows from parts (2) and (3). Similarly, since A △ B = (A \ B) ∪ (B \ A), part (5) follows from parts (1) and (4). ∎
Version: 8 Owner: matte Author(s): matte
34.86 transformation
Synonym of mapping and function. Often used to refer to mappings where the domain and codomain are the same set, i.e. one can compose a transformation with itself. For example, when one speaks of a transformation of a space, one refers to some deformation of that space.
Version: 3 Owner: rmilson Author(s): rmilson
34.87 transitive
Let A be a set. A is said to be transitive if whenever x ∈ A then x ⊆ A.

Equivalently, A is transitive if whenever x ∈ A and y ∈ x, then y ∈ A.
Version: 1 Owner: Evandar Author(s): Evandar
34.88 transitive
A relation R on A is transitive if and only if ∀x, y, z ∈ A, (xRy ∧ yRz) → (xRz).

For example, the "is a subset of" relation ⊆ between sets is transitive. The "is not equal to" relation ≠ between integers is not transitive. If we assign to our definition x = 5, y = 42, and z = 5, we know that both 5 ≠ 42 (xRy) and 42 ≠ 5 (yRz). However, 5 = 5, so xRz fails, and hence ≠ is not transitive.
Version: 5 Owner: xriso Author(s): xriso
34.89 transitive closure
The transitive closure of a set X is the smallest transitive set tc(X) such that X ⊆ tc(X).

The transitive closure of a set can be constructed as follows: define a function f on ω by f(0) = X and f(n + 1) = ⋃ f(n). Then

tc(X) = ⋃_(n<ω) f(n).
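For hereditarily finite sets this construction terminates, and can be sketched in Python with frozensets standing in for sets-of-sets (the encoding below, with 0 = {} and 1 = {0}, is our own illustration):

```python
def transitive_closure(X):
    """tc(X): iterate f(0) = X, f(n+1) = union of f(n), and collect
    the union of all stages (finite sets reach a fixed point)."""
    result = set()
    stage = set(X)
    while stage:
        result |= stage
        # the next stage: elements of elements not yet collected
        stage = set().union(*(x for x in stage
                              if isinstance(x, frozenset))) - result
    return result

a = frozenset()          # 0 = {}
b = frozenset({a})       # 1 = {0}
c = frozenset({b})       # {1}; not transitive, since 0 = {} is missing
tc = transitive_closure({c})
assert tc == {c, b, a}
# tc is transitive: every element of an element of tc is in tc
assert all(y in tc for x in tc if isinstance(x, frozenset) for y in x)
```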
Version: 1 Owner: Henry Author(s): Henry
34.90 Hausdorﬀ’s maximum principle
Theorem Let X be a partially ordered set. Then there exists a maximal totally ordered subset of X.

Hausdorff's maximum principle is one of the many theorems equivalent to the axiom of choice. The proof below uses Zorn's lemma, which is also equivalent to the axiom of choice.

Proof. Let S be the set of all totally ordered subsets of X. S is not empty, since the empty set is an element of S. Given a chain τ of elements of S (with respect to inclusion), the union of all the elements of τ is again an element of S, as is easily verified. This shows that S, ordered by inclusion, is inductive. The result now follows from Zorn's lemma. ∎
Version: 3 Owner: matte Author(s): matte, cryo
34.91 Kuratowski’s lemma
Any chain in an ordered set is contained in a maximal chain.
This proposition is equivalent to the axiom of choice.
Version: 2 Owner: Koro Author(s): Koro
34.92 Tukey’s lemma
Each nonempty family of ﬁnite character has a maximal element.
Here, by a maximal element we mean a maximal element with respect to the inclusion ordering: A ≤ B iff A ⊆ B. This lemma is equivalent to the axiom of choice.
Version: 3 Owner: Koro Author(s): Koro
34.93 Zermelo’s postulate
If F is a disjoint family of nonempty sets, then there is a set C which has exactly one element of each A ∈ F (i.e. such that A ∩ C is a singleton for each A ∈ F).
This is one of the many propositions which are equivalent to the axiom of choice.
Version: 2 Owner: Koro Author(s): Koro
34.94 Zermelo’s wellordering theorem
If X is any set whatsoever, then there exists a well-ordering of X. The well-ordering theorem is equivalent to the axiom of choice.
Version: 2 Owner: vypertd Author(s): vypertd
34.95 Zorn’s lemma
Let X be a partially ordered set, and suppose that every chain in X has an upper bound. Then X has a maximal element x, in the sense that there is no y ∈ X with x < y.
Zorn’s lemma is equivalent to the axiom of choice.
Version: 3 Owner: Evandar Author(s): Evandar
34.96 axiom of choice
Let C be a collection of nonempty sets. Then there exists a function f with domain C such that f(x) ∈ x for all x ∈ C. f is sometimes called a choice function on C.

The axiom of choice is commonly (although not universally) accepted with the axioms of Zermelo-Fraenkel set theory. The axiom of choice is equivalent to the well-ordering principle and to Zorn's lemma.

The axiom of choice is sometimes called the multiplicative axiom, as it is equivalent to the proposition that a product of cardinals is zero if and only if one of the factors is zero.
Version: 5 Owner: vampyr Author(s): vampyr
34.97 equivalence of Hausdorﬀ’s maximum principle,
Zorn’s lemma and the wellordering theorem
Hausdorﬀ’s maximum principle implies Zorn’s lemma. Consider a partially ordered set
A, where every chain has an upper bound. According to the maximum principle there exists
a maximal totally ordered subset ) ⊆ A. This then has an upper bound, r. If r is not
the largest element in ) then ¦r¦
¸
) would be a totally ordered set in which ) would be
properly contained, contradicting the deﬁnition. Thus r is a maximal element in A.
Zorn’s lemma implies the wellordering theorem. Let A be any nonempty set, and
let A be the collection of pairs (¹. <), where ¹ ⊆ A and < is a wellordering on ¹. Deﬁne
a relation _, on A so that for all r. n ∈ A : r _ n iﬀ r equals an initial of n. It is easy
to see that this deﬁnes a partial order relation on A (it inherits reﬂexibility, anti symmetry
and transitivity from one set being an initial and thus a subset of the other).
For each chain ( ⊆ A, deﬁne (
t
= (1. <
t
) where R is the union of all the sets ¹ for all
(¹. <) ∈ (, and <
t
is the union of all the relations < for all (¹. <) ∈ (. It follows that (
t
252
is an upper bound for ( in A.
According to Zorn’s lemma, A now has a maximal element, (`. <
M
). We postulate that `
contains all members of A, for if this were not true we could for any c ∈ A −` construct
(`
∗
. <
∗
) where `
∗
= `
¸
¦c¦ and <
∗
is extended so o
a
(`
∗
) = `. Clearly <
∗
then deﬁnes
a wellorder on `
∗
, and (`
∗
. <
∗
) would be larger than (`. <
M
) contrary to the deﬁnition.
Since ` contains all the members of A and <
M
is a wellordering of `, it is also a well
ordering on A as required.
The wellordering theorem implies Hausdorﬀ’s maximum principle. Let (A. _)
be a partially ordered set, and let < be a wellordering on A. We deﬁne the function φ by
transﬁnite recursion over (A. <) so that
φ(c) =
¦c¦ i1¦c¦
¸¸
b<a
φ(/) is totally ordered under _ .
∅ otherwise.
.
It follows that
¸
x∈X
φ(r) is a maximal totally ordered subset of A as required.
Version: 4 Owner: mathcam Author(s): mathcam, cryo
34.98 equivalence of Zorn’s lemma and the axiom of
choice
Let X be a set partially ordered by < such that each chain has an upper bound. Equate each x ∈ X with p(x) = {y ∈ X | x < y} ⊆ P(X). Let p(X) = {p(x) | x ∈ X}. If p(x) = ∅ for some x, then it follows that x is maximal.

Suppose no p(x) = ∅. Then by the axiom of choice there is a choice function f on p(X), and since for each p(x) we have f(p(x)) ∈ p(x), it follows that x < f(p(x)). Define f_α(p(x)) for all ordinals α by transfinite induction:

f_0(p(x)) = x
f_(α+1)(p(x)) = f(p(f_α(p(x))))

And for a limit ordinal α, let f_α(p(x)) be an upper bound of the f_i(p(x)) for i < α.

This construction can go on forever, for any ordinal, and each step produces a strictly larger element. But then α ↦ f_α(p(x)) would give an injection of the class of all ordinals into X, which requires that X be a proper class, in contradiction to the fact that it is a set. So there can be no such choice function, and there must be a maximal element of X.

For the reverse, assume Zorn's lemma and let C be any set of nonempty sets. Consider the set of functions F = {f | ∀a ∈ dom(f) (a ∈ C ∧ f(a) ∈ a)}, partially ordered by inclusion. Then the union of any chain in F is also a member of F (since the union of a chain of functions is always a function). By Zorn's lemma, F has a maximal element f, and since any function with domain smaller than C can easily be extended, dom(f) = C, and so f is a choice function on C.
Version: 2 Owner: Henry Author(s): Henry
34.99 maximality principle
Let S be a collection of sets. If, for each chain C ⊆ S, there exists an X ∈ S such that every element of C is a subset of X, then S contains a maximal element. This is known as the maximality principle.
The maximality principle is equivalent to the axiom of choice.
Version: 4 Owner: akrowne Author(s): akrowne
34.100 principle of ﬁnite induction
Let S be a set of positive integers with the properties that

1. 1 belongs to S, and
2. whenever the integer k is in S, then the next integer k + 1 must also be in S.

Then S is the set of all positive integers.

The Second Principle of Finite Induction would replace (2) above with

2′. If k is a positive integer such that 1, 2, ..., k belong to S, then k + 1 must also be in S.

The Principle of Finite Induction is a consequence of the well-ordering principle.
Version: 3 Owner: KimJ Author(s): KimJ
34.101 principle of finite induction proven from well-ordering principle
Let B be the set of all positive integers not in S. Assume B is nonempty. The well-ordering principle says B contains a least element; call it a. Since 1 ∈ S, we have a > 1, hence 0 < a − 1 < a. The choice of a as the smallest element of B means a − 1 is not in B, and hence is in S. But then (a − 1) + 1 is in S, which forces a ∈ S, contradicting a ∈ B. Hence B is empty, and S is all positive integers.
Version: 4 Owner: KimJ Author(s): KimJ
34.102 proof of Tukey’s lemma
Let S be a set and F a set of subsets of S such that F is of finite character. By Zorn's lemma, it is enough to show that F is inductive. For that, it will be enough to show that if (F_i)_(i∈I) is a family of elements of F which is totally ordered by inclusion, then the union U of the F_i is an element of F as well (since U is an upper bound on the family (F_i)). So, let K be a finite subset of U. Each element of K is in F_i for some i ∈ I. Since K is finite and the F_i are totally ordered by inclusion, there is some j ∈ I such that all elements of K are in F_j. That is, K ⊂ F_j, and hence K ∈ F. As every finite subset of U thus belongs to F, and F is of finite character, we get U ∈ F, QED.
Version: 1 Owner: Koro Author(s): Larry Hammick
34.103 proof of Zermelo’s wellordering theorem
Let X be any set and let f be a choice function on P(X) \ {∅}. Then define a function i by transfinite recursion on the class of ordinals as follows:

i(β) = f(X − ⋃_(γ<β) {i(γ)}) unless X − ⋃_(γ<β) {i(γ)} = ∅ or i(γ) is undefined for some γ < β

(the function is undefined if either of the unless clauses holds).

Thus i(0) is just f(X) (the "least" element of X), and i(1) = f(X − {i(0)}) (the least element of X other than i(0)).

Define by the axiom of replacement β = i⁻¹[X] = {γ | i(γ) = x for some x ∈ X}. Since β is a set of ordinals, it cannot contain all the ordinals (by the Burali-Forti paradox).

Since the ordinals are well ordered, there is a least ordinal α not in β, and therefore i(α) is undefined. It cannot be that the second unless clause holds (since α is the least such ordinal), so it must be that X − ⋃_(γ<α) {i(γ)} = ∅, and therefore for every x ∈ X there is some γ < α such that i(γ) = x. Since i is injective by construction (each value is chosen from the elements not already used), it is a bijection between α and X, and therefore establishes a well-ordering of X by x <_X y ↔ i⁻¹(x) < i⁻¹(y).

The reverse is simple. If C is a set of nonempty sets, select any well-ordering of ⋃C. Then a choice function is just f(a) = the least member of a under that well-ordering.
Version: 5 Owner: Henry Author(s): Henry
34.104 axiom of extensionality
If X and Y have the same elements, then X = Y.

The Axiom of Extensionality is one of the axioms of Zermelo-Fraenkel set theory. In symbols, it reads:

∀u (u ∈ X ↔ u ∈ Y) → X = Y.

Note that the converse,

X = Y → ∀u (u ∈ X ↔ u ∈ Y),

is an axiom of the predicate calculus. Hence we have

X = Y ↔ ∀u (u ∈ X ↔ u ∈ Y).

Therefore the Axiom of Extensionality expresses the most fundamental notion of a set: a set is determined by its elements.
Version: 2 Owner: Sabean Author(s): Sabean
34.105 axiom of inﬁnity
There exists an infinite set.

The Axiom of Infinity is an axiom of Zermelo-Fraenkel set theory. At first glance, this axiom seems to be ill-defined. How are we to know what constitutes an infinite set when we have not yet defined the notion of a finite set? However, once we have a theory of ordinal numbers in hand, the axiom makes sense.

Meanwhile, we can give a definition of finiteness that does not rely upon the concept of number. We do this by introducing the notion of an inductive set. A set S is said to be inductive if ∅ ∈ S and for every x ∈ S, x ∪ {x} ∈ S. We may then state the Axiom of Infinity as follows:

There exists an inductive set.

In symbols:

∃S [∅ ∈ S ∧ (∀x ∈ S) [x ∪ {x} ∈ S]]

We shall then be able to prove that the following conditions are equivalent:

1. There exists an inductive set.
2. There exists an infinite set.
3. The least nonzero limit ordinal, ω, is a set.
Version: 3 Owner: Sabean Author(s): Sabean
34.106 axiom of pairing
For any a and b there exists a set {a, b} that contains exactly a and b.
The Axiom of Pairing is one of the axioms of Zermelo–Fraenkel set theory. In symbols, it
reads:
∀a∀b∃c∀x(x ∈ c ↔ x = a ∨ x = b).
Using the axiom of extensionality, we see that the set c is unique, so it makes sense to define
the pair
{a, b} = the unique c such that ∀x(x ∈ c ↔ x = a ∨ x = b).
Using the Axiom of Pairing, we may define, for any set a, the singleton
{a} = {a, a}.
We may also define, for any sets a and b, the ordered pair
(a, b) = {{a}, {a, b}}.
Note that this definition satisfies the condition
(a, b) = (c, d) iff a = c and b = d.
We may define the ordered n-tuple recursively:
(a_1, . . . , a_n) = ((a_1, . . . , a_{n−1}), a_n).
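The defining property of the pair (a, b) = {{a}, {a, b}} can be checked exhaustively on a small universe (a hedged Python sketch; `pair` is our name for the construction):

```python
def pair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}}, built from unordered pairing."""
    return frozenset([frozenset([a]), frozenset([a, b])])

# (a, b) = (c, d) iff a = c and b = d, verified by brute force on a small universe.
universe = [0, 1, 2]
for a in universe:
    for b in universe:
        for c in universe:
            for d in universe:
                assert (pair(a, b) == pair(c, d)) == (a == c and b == d)
```

Note the degenerate case: pair(a, a) collapses to the singleton {{a}}, yet the characteristic property still holds.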
Version: 4 Owner: Sabean Author(s): Sabean
34.107 axiom of power set
For any X, there exists a set Y = P(X).
The Axiom of Power Set is an axiom of Zermelo–Fraenkel set theory. In symbols, it reads:
∀X∃Y∀u(u ∈ Y ↔ u ⊆ X).
In the above, u ⊆ X is defined as ∀z(z ∈ u → z ∈ X). Hence Y is the set of all subsets of
X. Y is called the power set of X and is denoted P(X). By extensionality, the set Y is
unique.
The Power Set Axiom allows us to define the Cartesian product of two sets X and Y:
X × Y = {(x, y) : x ∈ X ∧ y ∈ Y}.
The Cartesian product is a set since
X × Y ⊆ P(P(X ∪ Y)).
We may define the Cartesian product of any finite collection of sets recursively:
X_1 × · · · × X_n = (X_1 × · · · × X_{n−1}) × X_n.
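The containment X × Y ⊆ P(P(X ∪ Y)) can be verified directly for small finite sets (an illustrative Python sketch; `pair` and `powerset` are helper names introduced here):

```python
from itertools import chain, combinations

def pair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset([frozenset([a]), frozenset([a, b])])

def powerset(s):
    """P(s) as a frozenset of frozensets."""
    s = list(s)
    return frozenset(frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1)))

X, Y = frozenset([1, 2]), frozenset([2, 3])
product = frozenset(pair(x, y) for x in X for y in Y)
# Each (x, y) = {{x}, {x, y}} is a set of subsets of X ∪ Y, hence an element
# of P(P(X ∪ Y)), so the whole product is contained in it.
assert product <= powerset(powerset(X | Y))
```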
Version: 5 Owner: Sabean Author(s): Sabean
34.108 axiom of union
For any X there exists a set Y = ∪X.
The Axiom of Union is an axiom of Zermelo–Fraenkel set theory. In symbols, it reads:
∀X∃Y∀u(u ∈ Y ↔ ∃z(z ∈ X ∧ u ∈ z)).
Notice that this means that Y is the set of elements of all elements of X. More succinctly,
the union of any set of sets is a set. By extensionality, the set Y is unique. Y is called the
union of X.
In particular, the Axiom of Union, along with the axiom of pairing, allows us to define
X ∪ Y = ∪{X, Y},
as well as the triple
{a, b, c} = {a, b} ∪ {c}
and therefore the n-tuple
{a_1, . . . , a_n} = {a_1} ∪ · · · ∪ {a_n}.
Version: 5 Owner: Sabean Author(s): Sabean
34.109 axiom schema of separation
Let φ(u, p) be a formula. For any X and p, there exists a set Y = {u ∈ X : φ(u, p)}.
The Axiom Schema of Separation is an axiom schema of Zermelo–Fraenkel set theory. Note
that it represents infinitely many individual axioms, one for each formula φ. In symbols, it
reads:
∀X∀p∃Y∀u(u ∈ Y ↔ u ∈ X ∧ φ(u, p)).
By extensionality, the set Y is unique.
The Axiom Schema of Separation may be generalized so that φ depends on more than one
parameter p. We may show by induction that if φ(u, p_1, . . . , p_n) is a formula, then
∀X∀p_1 · · · ∀p_n∃Y∀u(u ∈ Y ↔ u ∈ X ∧ φ(u, p_1, . . . , p_n))
holds, using the Axiom Schema of Separation and the axiom of pairing.
Another consequence of the Axiom Schema of Separation is that a subclass of any set is a
set. To see this, let C be the class C = {u : φ(u, p_1, . . . , p_n)}. Then
∀X∃Y(C ∩ X = Y)
holds, which means that the intersection of C with any set is a set. Therefore, in particular,
the intersection of two sets X ∩ Y = {x ∈ X : x ∈ Y} is a set. Furthermore the difference
of two sets X − Y = {x ∈ X : x ∉ Y} is a set and, provided there exists at least one set,
which is guaranteed by the axiom of infinity, the empty set is a set. For if X is a set, then
∅ = {x ∈ X : x ≠ x} is a set.
Moreover, if C is a nonempty class, then ∩C is a set, by Separation; ∩C is a subset of
every X ∈ C.
Lastly, we may use Separation to show that the class of all sets, V, is not a set, i.e., V is a
proper class. For example, suppose V is a set. Then by Separation
V′ = {x ∈ V : x ∉ x}
is a set and we have reached a Russell paradox.
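Separation corresponds closely to a restricted comprehension, which Python's set comprehensions mirror: one may only carve a subset out of an already-given set X (a toy analogy added here, not a formalization; `separation` is our name):

```python
# Separation: from a given set X and a formula φ, form {u ∈ X : φ(u)}.
X = set(range(20))
phi = lambda u: u % 3 == 0            # a "formula" with no extra parameters
Y = {u for u in X if phi(u)}
assert Y == {0, 3, 6, 9, 12, 15, 18}

def separation(X, phi, p):
    """The set {u ∈ X : φ(u, p)} guaranteed by the schema, with one parameter p."""
    return {u for u in X if phi(u, p)}

assert separation(X, lambda u, p: u < p, 5) == {0, 1, 2, 3, 4}
```

The restriction to a pre-existing X is exactly what blocks the Russell construction: {u : u ∉ u} is not expressible, only {u ∈ X : u ∉ u}.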
Version: 15 Owner: Sabean Author(s): Sabean
34.110 de Morgan’s laws
In set theory, de Morgan’s laws relate the three basic set operations to each other: the
union, the intersection, and the complement. De Morgan’s laws are named after the Indian-
born British mathematician and logician Augustus De Morgan (1806–1871) [1].
If A and B are subsets of a set X, de Morgan’s laws state that
(A ∪ B)′ = A′ ∩ B′,
(A ∩ B)′ = A′ ∪ B′.
Here, ∪ denotes the union, ∩ denotes the intersection, and A′ denotes the set complement
of A in X, i.e., A′ = X \ A.
Above, de Morgan’s laws are written for two sets. In this form, they are intuitively quite
clear. For instance, the first claim states that an element that is not in A ∪ B is not in A
and not in B. It also states that an element not in A and not in B is not in A ∪ B.
For an arbitrary collection of subsets, de Morgan’s laws are as follows:
Theorem. Let X be a set with subsets A_i ⊆ X for i ∈ I, where I is an arbitrary index set.
In other words, I can be finite, countable, or uncountable. Then
(∪_{i∈I} A_i)′ = ∩_{i∈I} A_i′,
(∩_{i∈I} A_i)′ = ∪_{i∈I} A_i′.
(proof)
de Morgan’s laws in a Boolean algebra
For Boolean variables x and y in a Boolean algebra, de Morgan’s laws state that
(x ∧ y)′ = x′ ∨ y′,
(x ∨ y)′ = x′ ∧ y′.
Not surprisingly, de Morgan’s laws form an indispensable tool when simplifying digital
circuits involving and, or, and not gates [2].
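Both forms of de Morgan's laws can be checked exhaustively on small instances (an illustrative Python sketch added here; all names are ours):

```python
from itertools import chain, combinations

X = frozenset(range(4))
subsets = [frozenset(c) for c in chain.from_iterable(
    combinations(X, r) for r in range(len(X) + 1))]
comp = lambda A: X - A  # complement of A in X

# Set form: check every pair of subsets of X.
for A in subsets:
    for B in subsets:
        assert comp(A | B) == comp(A) & comp(B)   # (A ∪ B)′ = A′ ∩ B′
        assert comp(A & B) == comp(A) | comp(B)   # (A ∩ B)′ = A′ ∪ B′

# Boolean form: the two-element Boolean algebra, ′ = not, ∧ = and, ∨ = or.
for x in (False, True):
    for y in (False, True):
        assert (not (x and y)) == ((not x) or (not y))
        assert (not (x or y)) == ((not x) and (not y))
```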
REFERENCES
1. Wikipedia’s entry on de Morgan, 4/2003.
2. M.M. Mano, Computer Engineering: Hardware Design, Prentice Hall, 1988.
Version: 11 Owner: matte Author(s): matte, drini, greg
34.111 de Morgan’s laws for sets (proof)
Let X be a set with subsets A_i ⊆ X for i ∈ I, where I is an arbitrary index set. In other
words, I can be finite, countable, or uncountable. We first show that
(∪_{i∈I} A_i)′ = ∩_{i∈I} A_i′,
where A′ denotes the complement of A.
Let us define S = (∪_{i∈I} A_i)′ and T = ∩_{i∈I} A_i′. To establish the equality S = T, we shall
use a standard argument for proving equalities in set theory. Namely, we show that S ⊆ T
and T ⊆ S. For the first claim, suppose x is an element in S. Then x ∉ ∪_{i∈I} A_i, so x ∉ A_i
for any i ∈ I. Hence x ∈ A_i′ for all i ∈ I, and x ∈ ∩_{i∈I} A_i′ = T. Conversely, suppose x
is an element in T = ∩_{i∈I} A_i′. Then x ∈ A_i′ for all i ∈ I. Hence x ∉ A_i for any i ∈ I, so
x ∉ ∪_{i∈I} A_i, and x ∈ S.
The second claim,
(∩_{i∈I} A_i)′ = ∪_{i∈I} A_i′,
follows by applying the first claim to the sets A_i′.
Version: 3 Owner: mathcam Author(s): matte
34.112 set theory
Set theory is special among mathematical theories, in two ways: It plays a central role in
putting mathematics on a reliable axiomatic foundation, and it provides the basic language
and apparatus in which most of mathematics is expressed.
34.112.1 Axiomatic set theory
I will informally list the undefined notions, the axioms, and two of the “schemes” of set
theory, along the lines of Bourbaki’s account. The axioms are closer to the von Neumann–
Bernays–Gödel model than to the equivalent ZFC model. (But some of the axioms are
identical to some in ZFC; see the entry ZermeloFraenkelAxioms.) The intention here is just
to give an idea of the level and scope of these fundamental things.
There are three undefined notions:
1. the relation of equality of two sets
2. the relation of membership of one set in another (x ∈ y)
3. the notion of an ordered pair, which is a set comprised from two other sets, in a
specific order.
Most of the eight schemes belong more properly to logic than to set theory, but they, or
something on the same level, are needed in the work of formalizing any theory that uses the
notion of equality, or uses quantifiers such as ∃. Because of their formal nature, let me just
(informally) state two of the schemes:
S6. If A and B are sets, and A = B, then anything true of A is true of B, and conversely.
S7. If two properties F(x) and G(x) of a set x are equivalent, then the “generic” set having
the property F is the same as the generic set having the property G.
(The notion of a generic set having a given property is formalized with the help of the
Hilbert τ symbol; this is one way, but not the only way, to incorporate what is called the
axiom of choice.)
Finally come the five axioms in this axiomatization of set theory. (Some are identical to
axioms in ZFC, q.v.)
A1. Two sets A and B are equal iff they have the same elements, i.e. iff the relation x ∈ A
implies x ∈ B and vice versa.
A2. For any two sets A and B, there is a set C such that x ∈ C is equivalent to x = A
or x = B.
A3. Two ordered pairs (A, B) and (C, D) are equal iff A = C and B = D.
A4. For any set A, there exists a set B such that x ∈ B is equivalent to x ⊆ A; in other
words, there is a set of all subsets of A, for any given set A.
A5. There exists an infinite set.
The word “infinite” is defined in terms of Axioms A1–A4. But to formulate the definition,
one must first build up some definitions and results about functions and ordered sets, which
we haven’t done here.
34.112.2 Product sets, relations, functions, etc.
Moving away from foundations and toward applications, all the more complex structures
and relations of set theory are built up out of the three undefined notions. (See the entry
“Set”.) For instance, the relation A ⊆ B between two sets means simply “if x ∈ A then
x ∈ B”.
Using the notion of ordered pair, we soon get the very important structure called the product
A × B of two sets A and B. Next, we can get such things as equivalence relations and order
relations on a set A, for they are subsets of A × A. And we get the critical notion of a
function A → B, as a subset of A × B. Using functions, we get such things as the product
∏_{i∈I} A_i of a family of sets. (“Family” is a variation of the notion of function.)
To be strictly formal, we should distinguish between a function and the graph of that
function, and between a relation and its graph, but the distinction is rarely necessary in
practice.
34.112.3 Some structures defined in terms of sets
The natural numbers provide the first example. Peano, Zermelo and Fraenkel, and others
have given axiom-lists for the set N, with its addition, multiplication, and order relation;
but nowadays the custom is to define even the natural numbers in terms of sets. In more
detail, a natural number is the order-type of a finite well-ordered set. The relation m ≤ n
between m, n ∈ N is defined with the aid of a certain theorem which says, roughly, that for
any two well-ordered sets, one is a segment of the other. The sum or product of two natural
numbers is defined as the cardinal of the sum or product, respectively, of two sets. (For an
extension of this idea, see surreal numbers.)
(The term “cardinal” takes some work to define. The “type” of an ordered set, or any other
kind of structure, is the “generic” structure of that kind, which is defined using τ.)
Groups provide another simple example of a structure defined in terms of sets and ordered
pairs. A group is a pair (G, F) in which G is just a set, and F is a mapping G × G → G
satisfying certain axioms; the axioms (associativity etc.) can all be spelled out in terms of
sets and ordered pairs, although in practice one uses algebraic notation to do it. When we
speak of (e.g.) “the” group S_3 of permutations of a 3-element set, we mean the “type” of
such a group.
Topological spaces provide another example of how mathematical structures can be defined
in terms of, ultimately, the sets and ordered pairs in set theory. A topological space is a pair
(S, U), where the set S is arbitrary, but U has these properties:
– any element of U is a subset of S
– the union of any family (or set) of elements of U is also an element of U
– the intersection of any finite family of elements of U is an element of U.
Many special kinds of topological spaces are defined by enlarging this list of restrictions on
U.
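For a finite candidate pair (S, U), the three conditions above can be checked by brute force (a toy Python sketch added here; `is_topology` is our name):

```python
from itertools import combinations

def is_topology(S, U):
    """Check the three conditions above for a finite candidate topology U on S."""
    U = set(map(frozenset, U))
    if any(not u <= S for u in U):
        return False                       # every element of U must be a subset of S
    for r in range(len(U) + 1):            # S finite: checking all subfamilies suffices
        for fam in combinations(U, r):
            union = frozenset().union(*fam) if fam else frozenset()
            inter = frozenset.intersection(*fam) if fam else frozenset(S)
            if union not in U or inter not in U:
                return False
    return True

S = frozenset({1, 2, 3})
assert is_topology(S, [set(), {1}, {1, 2}, {1, 2, 3}])
assert not is_topology(S, [set(), {1}, {2}, {1, 2, 3}])   # {1} ∪ {2} = {1, 2} missing
```

The empty family is included deliberately: its union is ∅ and its intersection is S, so the check also enforces that ∅ and S belong to U.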
Finally, many kinds of structure are based on more than one set. E.g. a left module is a
commutative group M together with a ring R, plus a mapping R × M → M which satisfies
a specific set of restrictions.
34.112.4 Categories, homological algebra
Although set theory provides some of the language and apparatus used in mathematics
generally, that language and apparatus have expanded over time, and now include what are
called “categories” and “functors”. A category is not a set, and a functor is not a mapping,
despite similarities in both cases. A category comprises all the structured sets of the same
kind, e.g. the groups, and contains also a definition of the notion of a morphism from one
such structured set to another of the same kind. A functor is similar to a morphism but
compares one category to another, not one structured set to another. The classic examples
are certain functors from the category of topological spaces to the category of groups.
“Homological algebra” is concerned with sequences of morphisms within a category, plus
functors from one category to another. One of its aims is to get structure theories for specific
categories; the homology of groups and the cohomology of Lie algebras are examples. For
more details on the categories and functors of homological algebra, I recommend a search
for “Eilenberg–Steenrod axioms”.
Version: 8 Owner: mathwizard Author(s): Larry Hammick
34.113 union
The union of two sets A and B is the set which contains all x ∈ A and all x ∈ B. The
union of A and B is written as A ∪ B.
For any sets A and B,
x ∈ A ∪ B ⇔ (x ∈ A) ∨ (x ∈ B).
Version: 1 Owner: imran Author(s): imran
34.114 universe
A universe U is a nonempty set satisfying the following axioms:
1. If x ∈ U and y ∈ x, then y ∈ U.
2. If x, y ∈ U, then {x, y} ∈ U.
3. If x ∈ U, then the power set P(x) ∈ U.
4. If {x_i | i ∈ I ∈ U} is a family of elements of U, then ∪_{i∈I} x_i ∈ U.
From these axioms, one can deduce the following properties:
1. If x ∈ U, then {x} ∈ U.
2. If x is a subset of y ∈ U, then x ∈ U.
3. If x, y ∈ U, then the ordered pair (x, y) = {{x, y}, x} is in U.
4. If x, y ∈ U, then x ∪ y and x × y are in U.
5. If {x_i | i ∈ I ∈ U} is a family of elements of U, then the product ∏_{i∈I} x_i is in U.
6. If x ∈ U, then the cardinality of x is strictly less than the cardinality of U. In
particular, U ∉ U.
The standard reference for universes is [SGA4].
REFERENCES
[SGA4] Grothendieck et al. SGA4.
Version: 2 Owner: nerdy2 Author(s): nerdy2
34.115 von Neumann–Bernays–Gödel set theory
von Neumann–Bernays–Gödel set theory (commonly referred to as NBG or vNBG) is an
axiomatisation of set theory closely related to the more familiar Zermelo–Fraenkel with
choice (ZFC) axiomatisation. The primary difference between ZFC and NBG is that NBG
has proper classes among its objects. NBG and ZFC are very closely related and are in fact
equiconsistent, NBG being a conservative extension of ZFC.
In NBG, the proper classes are differentiated from sets by the fact that they do not belong
to other classes. Thus in NBG we have
Set(x) ↔ ∃y(x ∈ y).
Another interesting fact about proper classes within NBG is the following limitation of size
principle of von Neumann:
¬Set(x) ↔ |x| = |V|,
where V is the set-theoretic universe. This principle can in fact replace in NBG essentially
all set existence axioms with the exception of the power set axiom (and obviously the
axiom of infinity). Thus the classes that are proper in NBG are in a very clear sense big,
while the sets are small.
The NBG set theory can be axiomatised in two different ways:
• Using the Gödel class construction functions, resulting in a finite axiomatisation
• Using a class comprehension axiom scheme, resulting in an infinite axiomatisation
In the latter alternative we take ZFC and relativise all of its axioms to sets, i.e. we replace
every expression of the form ∀xφ with ∀x(Set(x) → φ) and ∃xφ with ∃x(Set(x) ∧ φ), and
add the class comprehension scheme:
If φ is a formula with a free variable x with all its quantifiers restricted to
sets, then the following is an axiom: ∃A∀x(x ∈ A ↔ φ)
Notice the important restriction to formulae with quantifiers restricted to sets in the scheme.
This requirement makes the NBG proper classes predicative; you can’t prove the existence
of a class the definition of which quantifies over all classes. This restriction is essential: if we
loosen it we get a theory that is not conservative over ZFC. If we allow arbitrary formulae in
the class comprehension axiom scheme we get what is called Morse–Kelley set theory. This
theory is essentially stronger than ZFC or NBG. In addition to these axioms, NBG also
contains the global axiom of choice
∃C∀x∃z(C ∩ x = {z}).
Another way to axiomatise NBG is to use the eight Gödel class construction functions. These
functions correspond to the various ways in which one can build up formulae (restricted
to sets!) with set parameters. However, the functions are finite in number and so are the
resulting axioms governing their behaviour. In particular, since there is a class corresponding
to any restricted formula, the intersection of any set and this class exists too (and is a set).
Thus the comprehension scheme of ZFC can be replaced with a finite number of axioms,
provided we allow for proper classes.
It is easy to show that everything provable in ZF is also provable in NBG. It is also not too
difficult to show that NBG without global choice is a conservative extension of ZFC. However,
showing that NBG (including global choice) is a conservative extension of ZFC is considerably
more difficult. This is equivalent to showing that NBG with global choice is conservative over
NBG with only local choice (choice restricted to sets). In order to do this one needs to use
(class) forcing. This result is usually credited to Easton and Solovay.
Version: 8 Owner: Aatu Author(s): Aatu
34.116 FS iterated forcing preserves chain condition
Let κ be a regular cardinal and let ⟨Q̂_β⟩_{β<α} be a finite support iterated forcing where for
every β < α, ⊩_{P_β} “Q̂_β has the κ chain condition”.
By induction:
P_0 is the empty set.
If P_α satisfies the κ chain condition then so does P_{α+1}, since P_{α+1} is equivalent to
P_α ∗ Q̂_α and composition preserves the κ chain condition for regular κ.
Suppose α is a limit ordinal and P_β satisfies the κ chain condition for all β < α. Let
S = ⟨p_i⟩_{i<κ} be a subset of P_α of size κ. The domains of the p_i form κ finite
subsets of α, so if cf(α) > κ then these are bounded, and by the inductive hypothesis, two
of them are compatible.
Otherwise, if cf(α) < κ, let ⟨α_j⟩_{j<cf(α)} be an increasing sequence of ordinals cofinal in α.
Then for any i < κ there is some n(i) < cf(α) such that dom(p_i) ⊆ α_{n(i)}. Since κ is regular
and this is a partition of κ into fewer than κ pieces, one piece must have size κ; that is, there
is some j such that j = n(i) for κ values of i, and so {p_i | n(i) = j} is a set of conditions
of size κ contained in P_{α_j}, and therefore contains compatible members by the induction
hypothesis.
Finally, if cf(α) = κ, let C = ⟨α_j⟩_{j<κ} be a strictly increasing, continuous sequence cofinal in
α. Then for every i < κ there is some n(i) < κ such that dom(p_i) ⊆ α_{n(i)}. When n(i) is a
limit ordinal, since C is continuous, there is also (since dom(p_i) is finite) some f(i) < i such
that dom(p_i) ∩ [α_{f(i)}, α_i) = ∅. Consider the set E of elements i such that i is a limit ordinal
and for any j < i, n(j) < i. This is a club, so by Fodor’s lemma there is some j such that
{i | f(i) = j} is stationary.
For each p_i such that f(i) = j, consider p_i′ = p_i ↾ α_j. There are κ of these, all members of
P_{α_j}, so two of them must be compatible, and hence those two are also compatible in P_α.
Version: 1 Owner: Henry Author(s): Henry
34.117 chain condition
A partial order P satisfies the κ-chain condition if for any S ⊆ P with |S| = κ there
exist distinct x, y ∈ S such that x and y are compatible (that is, some z ∈ P satisfies
both z ≤ x and z ≤ y).
If κ = ℵ_1 then P is said to satisfy the countable chain condition (c.c.c.).
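Compatibility, the existence of a common stronger condition, is the notion the chain condition turns on, and for a finite poset it can be computed directly (a toy Python sketch with names of our choosing; actual forcing posets are of course infinite):

```python
# A small poset given by its order relation: z ≤ x, y and everything ≤ 1.
le = {
    ("z", "x"), ("z", "y"),
    ("x", "1"), ("y", "1"), ("z", "1"), ("w", "1"),
}
elements = {"z", "x", "y", "w", "1"}
below = lambda a, b: a == b or (a, b) in le

def compatible(x, y):
    """x and y are compatible iff some z lies below both of them."""
    return any(below(z, x) and below(z, y) for z in elements)

assert compatible("x", "y")        # common lower bound z
assert not compatible("x", "w")    # nothing lies below both x and w
```

The set {x, w} here is an antichain of size 2; a κ-chain condition asserts that no antichain of size κ exists.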
Version: 2 Owner: Henry Author(s): Henry
34.118 composition of forcing notions
Suppose P is a forcing notion in M and Q̂ is some P-name such that ⊩_P “Q̂ is a forcing
notion”.
Then take a set Q of P-names such that, given a P-name Q̃ of Q, ⊩_P Q̃ = Q̂ (that is, no
matter which generic subset G of P we force with, the names in Q correspond precisely to
the elements of Q̂[G]). We can define
P ∗ Q = {⟨p, q̂⟩ | p ∈ P, q̂ ∈ Q}.
We can define a partial order on P ∗ Q such that ⟨p_1, q̂_1⟩ ≤ ⟨p_2, q̂_2⟩ iff p_1 ≤_P p_2 and
p_1 ⊩ q̂_1 ≤_{Q̂} q̂_2. (A note on interpretation: q̂_1 and q̂_2 are P-names; this requires only
that q̂_1 ≤ q̂_2 in generic subsets containing p_1, so in other generic subsets that fact could
fail.)
Then P ∗ Q̂ is itself a forcing notion, and it can be shown that forcing by P ∗ Q̂ is equivalent
to forcing first by P and then by Q̂[G].
Version: 1 Owner: Henry Author(s): Henry
34.119 composition preserves chain condition
Let κ be a regular cardinal. Let P be a forcing notion satisfying the κ chain condition.
Let Q̂ be a P-name such that ⊩_P “Q̂ is a forcing notion satisfying the κ chain
condition”. Then P ∗ Q satisfies the κ chain condition.
Proof:
Outline
We prove that there is some p such that any generic subset of P including p also includes κ
of the p_i. Then, since Q̂[G] satisfies the κ chain condition, two of the corresponding q̂_i must
be compatible. Then, since G is directed, there is some p stronger than any of these which
forces this to be true, and therefore makes two elements of S compatible.
Let S = ⟨p_i, q̂_i⟩_{i<κ} ⊆ P ∗ Q.
Claim: There is some p ∈ P such that p ⊩ |{i | p_i ∈ Ĝ}| = κ.
(Note: Ĝ = {(p, p̂) | p ∈ P}, hence Ĝ[G] = G.)
If no p forces this then every p forces that it is not true, and therefore ⊩_P |{i | p_i ∈ Ĝ}| < κ.
Since κ is regular, this means that for any generic G ⊆ P, {i | p_i ∈ G} is bounded. For
each G, let f(G) be the least α such that p_β ∈ G implies β < α. Define B = {α | α = f(G)
for some G}.
Claim: |B| < κ.
If α ∈ B then there is some p_α ∈ P such that p_α ⊩ f(Ĝ) = α, and if α, β ∈ B then p_α must
be incompatible with p_β. Since P satisfies the κ chain condition, it follows that |B| < κ.
Since κ is regular, α = sup(B) < κ. But obviously p_{α+1} ⊩ p_{α+1} ∈ Ĝ. This is a contradiction,
so we conclude that there must be some p such that p ⊩ |{i | p_i ∈ Ĝ}| = κ.
If G ⊆ P is any generic subset containing p then A = {q̂_i[G] | p_i ∈ G} must have cardinality
κ. Since Q̂[G] satisfies the κ chain condition, there exist i, j < κ such that p_i, p_j ∈ G and
there is some q̂[G] ∈ Q̂[G] such that q̂[G] ≤ q̂_i[G], q̂_j[G]. Then since G is directed, there is
some p′ ∈ G such that p′ ≤ p_i, p_j, p and p′ ⊩ q̂ ≤ q̂_i, q̂_j. So ⟨p′, q̂⟩ ≤ ⟨p_i, q̂_i⟩, ⟨p_j, q̂_j⟩.
Version: 1 Owner: Henry Author(s): Henry
34.120 equivalence of forcing notions
Let P and Q be two forcing notions such that given any generic subset G of P there is a
generic subset H of Q with M[G] = M[H], and vice versa. Then P and Q are equivalent.
Since if G ∈ M[H], τ[G] ∈ M[H] for any P-name τ, it follows that if G ∈ M[H] and H ∈ M[G]
then M[G] = M[H].
Version: 2 Owner: Henry Author(s): Henry
34.121 forcing relation
If M is a transitive model of set theory and P is a partial order then we can define a forcing
relation:
p ⊩_P φ(τ_1, . . . , τ_n)
(p forces φ(τ_1, . . . , τ_n))
for any p ∈ P, where τ_1, . . . , τ_n are P-names.
Specifically, the relation holds if for every generic filter G over P which contains p,
M[G] ⊨ φ(τ_1[G], . . . , τ_n[G]).
That is, p forces φ if every extension of M by a generic filter over P containing p makes φ
true.
If p ⊩_P φ holds for every p ∈ P then we can write ⊩_P φ to mean that for any generic G ⊆ P,
M[G] ⊨ φ.
Version: 2 Owner: Henry Author(s): Henry
34.122 forcings are equivalent if one is dense in the other
Suppose P and Q are forcing notions and that f : P → Q is a function such that:
• p_1 ≤_P p_2 implies f(p_1) ≤_Q f(p_2)
• If p_1, p_2 ∈ P are incompatible then f(p_1), f(p_2) are incompatible
• f[P] is dense in Q
then P and Q are equivalent.
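The map G ↦ H = {q ∈ Q : f(p) ≤ q for some p ∈ G} that drives the proof can be illustrated on a deliberately tiny example (a hedged Python sketch; the posets and all names are ours, and finite posets trivialize genericity):

```python
# A deliberately tiny illustration of H = {q ∈ Q : f(p) ≤ q for some p ∈ G}.
P = {"p"}
Q = {"1", "q"}                        # in Q: q ≤ 1
leQ = lambda a, b: a == b or (a, b) == ("q", "1")
f = {"p": "q"}                        # order preserving; f[P] = {"q"}

# f[P] is dense in Q: every element of Q has an element of f[P] below it.
assert all(any(leQ(f[p], q) for p in P) for q in Q)

G = {"p"}                             # the only filter on the one-point poset P
H = {q for q in Q if any(leQ(f[p], q) for p in G)}
assert H == {"q", "1"}                # the upward closure of f[G] is a filter on Q
```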
Proof
We seek to provide two operations (computable in the appropriate universes) which convert
between generic subsets of P and Q, and to prove that they are inverses.
f(G) = H where H is generic
Given a generic G ⊆ P, consider H = {q | f(p) ≤ q for some p ∈ G}.
If q_1 ∈ H and q_1 ≤ q_2 then q_2 ∈ H by the definition of H. If q_1, q_2 ∈ H then let p_1, p_2 ∈ G
be such that f(p_1) ≤ q_1 and f(p_2) ≤ q_2. Then there is some p_3 ≤ p_1, p_2 such that p_3 ∈ G,
and since f is order preserving, f(p_3) ≤ f(p_1) ≤ q_1 and f(p_3) ≤ f(p_2) ≤ q_2.
Suppose D is a dense subset of Q. Let D_P = {p ∈ P | f(p) ≤ d for some d ∈ D}. This is
dense in P, since for any p ∈ P there is some d ∈ D such that d ≤ f(p), and (since f[P] is
dense in Q) some p_1 ∈ P such that f(p_1) ≤ d. Then f(p_1) and f(p) are compatible, so p_1
and p are compatible, and any p_2 ≤ p_1, p satisfies f(p_2) ≤ f(p_1) ≤ d, so p_2 ∈ D_P and
p_2 ≤ p. Since D_P is dense in P, there is some element p ∈ D_P ∩ G, and some d ∈ D such
that f(p) ≤ d. But since p ∈ G, d ∈ H, so H intersects D.
G can be recovered from f(G)
Given H constructed as above, we can recover G as the set of p ∈ P such that f(p) ∈ H.
Obviously every element of G is included in the new set, so consider some p such that
f(p) ∈ H. By definition, there is some p_1 ∈ G such that f(p_1) ≤ f(p). The set of p_2 ∈ P
such that either p_2 ≤ p or p_2 is incompatible with p is dense (given any p_3, either p_3 is
incompatible with p, or some p_4 ≤ p_3, p exists and p_4 ≤ p), so G contains some such p_2.
Since G is directed, p_2 and p_1 have a common extension p_3 ∈ G, and f(p_3) ≤ f(p_1) ≤ f(p).
If p_2 were incompatible with p then so would p_3 be, and then f(p_3) would be incompatible
with f(p), contradicting f(p_3) ≤ f(p). So p_2 ≤ p, and p ∈ G since G is upward closed.
f^{−1}(H) = G where G is generic
Given any generic H in Q, we define a corresponding G as above: G = {p ∈ P | f(p) ∈ H}.
If p_1 ∈ G and p_1 ≤ p_2 then f(p_1) ∈ H and f(p_1) ≤ f(p_2), so p_2 ∈ G since H is upward
closed. Now suppose p_1, p_2 ∈ G, so f(p_1), f(p_2) ∈ H.
Consider D, the set of elements of Q which are f(p) for some p ∈ P such that either
p ≤ p_1, p_2 or f(p) is incompatible with f(p_1) or with f(p_2). This is dense: given any
q_1 ∈ Q, there is some p with f(p) ≤ q_1 since f[P] is dense. If f(p) is incompatible with
f(p_1) or with f(p_2) we are done. Otherwise p is compatible with p_1 (were they incompatible,
their images would be), so there is some p′ ≤ p, p_1; likewise, if f(p′) is not incompatible
with f(p_2), there is some p″ ≤ p′, p_2. In every case we obtain an element of D below q_1.
There is some f(p) ∈ D ∩ H, and so p ∈ G. Since H is directed and f(p_1), f(p_2) ∈ H,
f(p) cannot be incompatible with either of them, so p ≤ p_1, p_2. Hence G is directed.
Finally, let D be a dense subset of P. Then f[D] is dense in Q, since given any q ∈ Q there
is some p ∈ P such that f(p) ≤ q, and some d ∈ D such that d ≤ p, so f(d) ≤ f(p) ≤ q. So
there is some f(d) ∈ f[D] ∩ H, and so d ∈ D ∩ G.
H can be recovered from f^{−1}(H)
Finally, given G constructed by this method, H = {q | f(p) ≤ q for some p ∈ G}. To see
this, if there is some f(p) with p ∈ G such that f(p) ≤ q then f(p) ∈ H, so q ∈ H since H
is upward closed. On the other hand, if q ∈ H then the set of f(p) such that either f(p) ≤ q
or there is no r ∈ Q such that r ≤ q, f(p) is dense (as shown above), and so intersects H.
But since H is directed, it must be that there is some f(p) ∈ H such that f(p) ≤ q, and
therefore p ∈ G.
Version: 3 Owner: Henry Author(s): Henry
34.123 iterated forcing
We can define an iterated forcing of length α by induction as follows:
Let P_0 = ∅.
Let Q̂_0 be a forcing notion.
For β < α, P_β is the set of all functions f such that dom(f) ⊆ β and for any i ∈ dom(f),
f(i) is a P_i-name for a member of Q̂_i. Order P_β by the rule f ≤ g iff dom(g) ⊆ dom(f)
and for any i ∈ dom(g), g ↾ i ⊩ f(i) ≤_{Q̂_i} g(i). (Translated, this means that any generic
subset including g restricted to i forces that f(i), an element of Q̂_i, be less than g(i).)
For β < α, Q̂_β is a name for a forcing notion in P_β (so ⊩_{P_β} “Q̂_β is a forcing notion”).
Then the sequence ⟨Q̂_β⟩_{β<α} is an iterated forcing.
If each P_β is restricted to finite functions then it is called a finite support iterated forcing
(FS), if each P_β is restricted to countable functions, it is called a countable support iterated
forcing (CS), and in general if each function in each P_β has size less than κ then it is a
<κ-support iterated forcing.
Typically we construct the sequence of Q̂_β’s by induction, using a function F such that
F(⟨Q̂_β⟩_{β<γ}) = Q̂_γ.
Version: 2 Owner: Henry Author(s): Henry
34.124 iterated forcing and composition
There is a function f : P_α ∗ Q_α → P_{α+1} satisfying the hypotheses of the theorem that
forcings are equivalent if one is dense in the other.
Proof
Let f(⟨g, q̂⟩) = g ∪ {⟨α, q̂⟩}. This is obviously a member of P_{α+1}, since it is a partial
function from α + 1 (and if the domain of g satisfies the support restriction then so does the
domain of f(⟨g, q̂⟩)), if i < α then obviously f(⟨g, q̂⟩) applied to i satisfies the definition of
iterated forcing (since g does), and if i = α then the definition is satisfied since q̂ is a
P_α-name for a member of Q̂_α.
f is order preserving, since if ⟨g_1, q̂_1⟩ ≤ ⟨g_2, q̂_2⟩, all the appropriate characteristics of a
function carry over to the image, and g_1 ↾ α ⊩_{P_α} q̂_1 ≤ q̂_2 (by the definition of ≤ in ∗).
If ⟨g_1, q̂_1⟩ and ⟨g_2, q̂_2⟩ are incompatible then either g_1 and g_2 are incompatible, in which
case whatever prevents them from having a common extension applies to their images as
well, or q̂_1 and q̂_2 are not appropriately related, in which case again this prevents the
images from having a common extension.
Finally, let g be any element of P_{α+1}. Then g ↾ α ∈ P_α. If α ∉ dom(g) then g ↾ α is just g,
and f(⟨g, q̂⟩) ≤ g for any q̂. If α ∈ dom(g) then f(⟨g ↾ α, g(α)⟩) = g. Hence f[P_α ∗ Q_α] is
dense in P_{α+1}, and so these are equivalent.
Version: 3 Owner: Henry Author(s): Henry
34.125 name
We need a way to refer to objects of M[G] within M. This is done by assigning a name to
each element of M[G].
Given a partial order P, we construct the P-names by induction. Each name is just a
relation between P and the set of names already constructed; that is, a name is a set of
ordered pairs of the form (p, τ) where p ∈ P and τ is a name constructed at an earlier level
of the induction.
Given a generic subset G ⊆ P, we can then define the interpretation τ[G] of a P-name τ in
M[G] by:
τ[G] = {τ′[G] | (p, τ′) ∈ τ for some p ∈ G}.
Of course, two different names can have the same interpretation.
The generic subset can be thought of as a “key” which reveals which potential elements of
τ are actually elements.
Any element x ∈ M can be given a canonical name
x̂ = {(p, ŷ) | y ∈ x, p ∈ P}.
This guarantees that the elements of x̂[G] will be exactly the same as the elements of x,
regardless of which members of P are contained in G.
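The interpretation map τ ↦ τ[G] is a simple recursion, which can be sketched for finite names (a toy Python illustration added here; conditions are strings and `interpret` is our name):

```python
# A name is a frozenset of (condition, name) pairs;
# tau[G] = { tau'[G] : (p, tau') in tau for some p in G }.
def interpret(tau, G):
    """Interpret the name tau under the filter G (genericity is irrelevant here)."""
    return frozenset(interpret(sub, G) for (p, sub) in tau if p in G)

empty = frozenset()                          # the name (and the value) ∅
one = frozenset({("p", empty)})              # names {∅} if "p" ∈ G, else ∅
tau = frozenset({("p", empty), ("q", one)})

assert interpret(tau, {"p", "q"}) == frozenset({empty, frozenset({empty})})
assert interpret(tau, {"q"}) == frozenset({frozenset()})
```

The two assertions show how the same name yields different sets under different "keys" G, exactly as described above.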
Version: 3 Owner: Henry Author(s): Henry
34.126 partial order with chain condition does not collapse cardinals
If P is a partial order which satisfies the κ chain condition and G is a generic subset of P then
for any κ < λ ∈ M, λ is also a cardinal in M[G], and if cf(α) = λ in M then also cf(α) = λ
in M[G].
This theorem is the simplest way to control a notion of forcing, since it means that a notion
of forcing does not have an effect above a certain point. Given that any P satisfies the |P|⁺
chain condition, this means that most forcings leave all of M above a certain point alone.
(Although it is possible to get around this limit by forcing with a proper class.)
Version: 2 Owner: Henry Author(s): Henry
34.127 proof of partial order with chain condition does not collapse cardinals
Outline:
Given any function f purporting to violate the theorem by being surjective (or cofinal) on
λ, we show that there are fewer than κ possible values of f(α), and therefore only max(α, κ)
possible elements in the entire range of f, so f is not surjective (or cofinal).
Details:
Suppose λ > κ is a cardinal of M that is not a cardinal in M[G].
There is some function f ∈ M[G] and some cardinal α < λ such that f : α → λ is surjective.
This has a name, f̂. For each β < α, consider
F_β = {γ < λ | p ⊩ f̂(β) = γ for some p ∈ P}.
|F_β| < κ, since any two p ∈ P which force different values for f̂(β) are incompatible and P
has no sets of incompatible elements of size κ.
Notice that F_β is definable in M. Then the range of f must be contained in F = ∪_{i<α} F_i.
But |F| ≤ α · κ = max(α, κ) < λ. So f cannot possibly be surjective, and therefore λ is not
collapsed.
Now suppose that cf(α) = λ > κ in M and for some η < λ there is a cofinal
function f : η → α.
We can construct F_β as above, and again the range of f is contained in F = ∪_{i<η} F_i. But
then |range(f)| ≤ |F| ≤ η · κ < λ. So there is some γ < α such that f(β) < γ for any β < η,
and therefore f is not cofinal in α.
Version: 1 Owner: Henry Author(s): Henry
34.128 proof that forcing notions are equivalent to their composition
This is a long and complicated proof, the more so because the meaning of Q shifts depending on what generic subset of P is being used. It is therefore broken into a number of steps. The core of the proof is to show that, given any generic subset G of P and a generic subset H of Q̂[G], there is a corresponding generic subset G ∗ H of P ∗ Q̂ such that M[G][H] = M[G ∗ H], and conversely, given any generic subset G of P ∗ Q̂ we can find some generic G_P of P and a generic G_Q of Q̂[G_P] such that M[G_P][G_Q] = M[G].
We do this by constructing functions using operations which can be performed within the forced universes so that, for example, since M[G][H] has both G and H, G ∗ H can be calculated, proving that it contains M[G ∗ H]. To ensure equality, we will also have to ensure that our operations are inverses; that is, given G, G_P ∗ G_Q = G and, given G and H, (G ∗ H)_P = G and (G ∗ H)_Q = H.
The remainder of the proof merely deﬁnes the precise operations, proves that they give
generic sets, and proves that they are inverses.
Before beginning, we prove a lemma which comes up several times:
Lemma: If G is generic in P and D is dense above some p ∈ G then G ∩ D ≠ ∅.
Let D′ = {p′ ∈ P | p′ ∈ D ∨ p′ is incompatible with p}. This is dense, since if p₀ ∈ P then either p₀ is incompatible with p, in which case p₀ ∈ D′, or there is some p₁ such that p₁ < p, p₀, and therefore there is some p₂ < p₁ such that p₂ ∈ D, and therefore p₂ < p₀. So G intersects D′. But since a generic set is directed, no two elements are incompatible, so G must contain an element of D′ which is not incompatible with p, so it must contain an element of D.
G ∗ H is a generic filter
First, given generic subsets G and H of P and Q̂[G], we can define:
G ∗ H = {⟨p, q̂⟩ | p ∈ G ∧ q̂[G] ∈ H}
G ∗ H is closed
Let ⟨p₁, q̂₁⟩ ∈ G ∗ H and let ⟨p₁, q̂₁⟩ < ⟨p₂, q̂₂⟩. Then we can conclude p₁ ∈ G, p₁ < p₂, q̂₁[G] ∈ H, and p₁ ⊩ q̂₁ < q̂₂, so p₂ ∈ G (since G is closed) and q̂₂[G] ∈ H, since p₁ ∈ G and p₁ forces both q̂₁ < q̂₂ and that H is downward closed. So ⟨p₂, q̂₂⟩ ∈ G ∗ H.
G ∗ H is directed
Suppose ⟨p₁, q̂₁⟩, ⟨p₂, q̂₂⟩ ∈ G ∗ H. So p₁, p₂ ∈ G, and since G is directed, there is some p₃ < p₁, p₂. Since q̂₁[G], q̂₂[G] ∈ H and H is directed, there is some q̂₃[G] < q̂₁[G], q̂₂[G]. Therefore there is some p₄ < p₃, p₄ ∈ G, such that p₄ ⊩ q̂₃ < q̂₁, q̂₂, so ⟨p₄, q̂₃⟩ < ⟨p₁, q̂₁⟩, ⟨p₂, q̂₂⟩ and ⟨p₄, q̂₃⟩ ∈ G ∗ H.
G ∗ H is generic
Suppose D is a dense subset of P ∗ Q̂. We can project it into a dense subset of Q̂[G] using G:
D_Q = {q̂[G] | ⟨p, q̂⟩ ∈ D for some p ∈ G}
Lemma: D_Q is dense in Q̂[G]
Given any q̂₀ ∈ Q̂, take any p₀ ∈ G. Then we can define yet another dense subset, this one in P:
D_{q̂₀} = {p | p < p₀ ∧ p ⊩ q̂ < q̂₀ ∧ ⟨p, q̂⟩ ∈ D for some q̂ ∈ Q̂}
Lemma: D_{q̂₀} is dense above p₀ in P
Take any p ∈ P such that p < p₀. Then, since D is dense in P ∗ Q̂, we have some ⟨p₁, q̂₁⟩ < ⟨p, q̂₀⟩ such that ⟨p₁, q̂₁⟩ ∈ D. Then by definition p₁ < p and p₁ ∈ D_{q̂₀}.
From this lemma, we can conclude that there is some p₁ < p₀ such that p₁ ∈ G ∩ D_{q̂₀}, and therefore some q̂₁ such that p₁ ⊩ q̂₁ < q̂₀ where ⟨p₁, q̂₁⟩ ∈ D. So D_Q is indeed dense in Q̂[G].
Since D_Q is dense in Q̂[G], there is some q̂ such that q̂[G] ∈ D_Q ∩ H, and so some p ∈ G such that ⟨p, q̂⟩ ∈ D. But since p ∈ G and q̂[G] ∈ H, ⟨p, q̂⟩ ∈ G ∗ H, so G ∗ H is indeed generic.
G_P is a generic filter
Given some generic subset G of P ∗ Q̂, let:
G_P = {p ∈ P | p′ < p ∧ ⟨p′, q̂⟩ ∈ G for some p′ ∈ P and some q̂ ∈ Q̂}
G_P is closed
Take any p₁ ∈ G_P and any p₂ such that p₁ < p₂. Then there is some p′ < p₁ satisfying the definition of G_P, and also p′ < p₂, so p₂ ∈ G_P.
G_P is directed
Consider p₁, p₂ ∈ G_P. Then there is some p′₁ and some q̂₁ such that ⟨p′₁, q̂₁⟩ ∈ G, and some p′₂ and some q̂₂ such that ⟨p′₂, q̂₂⟩ ∈ G. Since G is directed, there is some ⟨p₃, q̂₃⟩ ∈ G such that ⟨p₃, q̂₃⟩ < ⟨p′₁, q̂₁⟩, ⟨p′₂, q̂₂⟩, and therefore p₃ ∈ G_P with p₃ < p₁, p₂.
G_P is generic
Let D be a dense subset of P. Then D′ = {⟨p, q̂⟩ | p ∈ D}. Clearly this is dense, since if ⟨p, q̂⟩ ∈ P ∗ Q̂ then there is some p′ < p such that p′ ∈ D, so ⟨p′, q̂⟩ ∈ D′ and ⟨p′, q̂⟩ < ⟨p, q̂⟩. So there is some ⟨p, q̂⟩ ∈ D′ ∩ G, and therefore p ∈ D ∩ G_P. So G_P is generic.
G_Q is a generic filter
Given a generic subset G ⊆ P ∗ Q̂, define:
G_Q = {q̂[G_P] | ⟨p, q̂⟩ ∈ G for some p ∈ P}
(Notice that G_Q is dependent on G_P, and is a subset of Q̂[G_P], that is, of the forcing notion inside M[G_P], as opposed to the set of names Q̂ which we have been primarily working with.)
G_Q is closed
Suppose q̂₁[G_P] ∈ G_Q and q̂₁[G_P] < q̂₂[G_P]. Then there is some p₁ ∈ G_P such that p₁ ⊩ q̂₁ < q̂₂. Since p₁ ∈ G_P, there is some p₂ < p₁ such that for some q̂₃, ⟨p₂, q̂₃⟩ ∈ G. By the definition of G_Q, there is some p₃ such that ⟨p₃, q̂₁⟩ ∈ G, and since G is directed, there is some ⟨p₄, q̂₄⟩ ∈ G with ⟨p₄, q̂₄⟩ < ⟨p₃, q̂₁⟩, ⟨p₂, q̂₃⟩. Since G is closed and ⟨p₄, q̂₄⟩ < ⟨p₄, q̂₂⟩, we have q̂₂[G_P] ∈ G_Q.
G_Q is directed
Suppose q̂₁[G_P], q̂₂[G_P] ∈ G_Q. Then for some p₁, p₂, ⟨p₁, q̂₁⟩, ⟨p₂, q̂₂⟩ ∈ G, and since G is directed, there is some ⟨p₃, q̂₃⟩ ∈ G such that ⟨p₃, q̂₃⟩ < ⟨p₁, q̂₁⟩, ⟨p₂, q̂₂⟩. Then q̂₃[G_P] ∈ G_Q, and since p₃ ∈ G_P and p₃ ⊩ q̂₃ < q̂₁, q̂₂, we have q̂₃[G_P] < q̂₁[G_P], q̂₂[G_P].
G_Q is generic
Let D be a dense subset of Q̂[G_P] (in M[G_P]). Let D̂ be a P-name for D, and let p₁ ∈ G_P be such that p₁ ⊩ "D̂ is dense". By the definition of G_P, there is some p₂ < p₁ such that ⟨p₂, q̂₂⟩ ∈ G for some q̂₂. Then let D′ = {⟨p, q̂⟩ | p ⊩ q̂ ∈ D̂ ∧ p < p₂}.
Lemma: D′ is dense (in P ∗ Q̂) above ⟨p₂, q̂₂⟩
Take any ⟨p, q̂⟩ ∈ P ∗ Q̂ such that ⟨p, q̂⟩ < ⟨p₂, q̂₂⟩. Then p ⊩ "D̂ is dense", and therefore there is some q̂₃ such that p ⊩ q̂₃ ∈ D̂ and p ⊩ q̂₃ < q̂. So ⟨p, q̂₃⟩ < ⟨p, q̂⟩ and ⟨p, q̂₃⟩ ∈ D′.
Take any ⟨p₃, q̂₃⟩ ∈ D′ ∩ G. Then p₃ ∈ G_P, so q̂₃[G_P] ∈ D, and by the definition of G_Q, q̂₃[G_P] ∈ G_Q.
G_P ∗ G_Q = G
If G is a generic subset of P ∗ Q̂, observe that:
G_P ∗ G_Q = {⟨p, q̂⟩ | p′ < p ∧ ⟨p′, q̂′⟩ ∈ G ∧ ⟨p₀, q̂⟩ ∈ G for some p′, q̂′, p₀}
If ⟨p, q̂⟩ ∈ G then obviously this holds, so G ⊆ G_P ∗ G_Q. Conversely, if ⟨p, q̂⟩ ∈ G_P ∗ G_Q then there exist p′, q̂′ and p₀ such that ⟨p′, q̂′⟩, ⟨p₀, q̂⟩ ∈ G, and since G is directed, some ⟨p₁, q̂₁⟩ ∈ G such that ⟨p₁, q̂₁⟩ < ⟨p′, q̂′⟩, ⟨p₀, q̂⟩. But then p₁ < p and p₁ ⊩ q̂₁ < q̂, and since G is closed, ⟨p, q̂⟩ ∈ G.
(G ∗ H)_P = G
Assume that G is generic in P and H is generic in Q̂[G].
Suppose p ∈ (G ∗ H)_P. Then there is some p′ ∈ P and some q̂ ∈ Q̂ such that p′ < p and ⟨p′, q̂⟩ ∈ G ∗ H. By the definition of G ∗ H, p′ ∈ G, and then since G is closed, p ∈ G.
Conversely, suppose p ∈ G. Then (since H is nontrivial), ⟨p, q̂⟩ ∈ G ∗ H for some q̂, and therefore p ∈ (G ∗ H)_P.
(G ∗ H)_Q = H
Assume that G is generic in P and H is generic in Q̂[G].
Given any q ∈ H, there is some q̂ ∈ Q̂ such that q̂[G] = q, and so there is some p ∈ G such that ⟨p, q̂⟩ ∈ G ∗ H, and therefore q = q̂[G] ∈ (G ∗ H)_Q.
On the other hand, if q ∈ (G ∗ H)_Q then there is some ⟨p, q̂⟩ ∈ G ∗ H with q̂[G] = q, and therefore q = q̂[G] ∈ H.
Version: 1 Owner: Henry Author(s): Henry
34.129 complete partial orders do not add small subsets
Suppose P is a κ-complete partial order in M. Then for any generic subset G, M[G] contains no bounded subsets of κ which are not in M.
Version: 1 Owner: Henry Author(s): Henry
34.130 proof of complete partial orders do not add small subsets
Take any x ∈ M[G], x ⊆ κ. Let x̂ be a name for x. There is some p ∈ G such that
p ⊩ "x̂ is a subset of κ bounded by λ < κ"
Outline:
For any q < p, we construct by induction a series of elements q_α stronger than q. Each q_α will determine whether or not α ∈ x̂. Since we know the subset is bounded below κ, we can use the fact that P is κ-complete to find a single element stronger than q which fixes the exact value of x̂. Since the series is definable in M, so is x̂, so we can conclude that above any element q < p is an element which forces x̂ ∈ M. Then p also forces x̂ ∈ M, completing the proof.
Details:
Since forcing can be described within M, S = {q ∈ P | q ⊩ x̂ ∈ M} is a set in M. Then, given any q < p, we can define q₀ = q and, for any q_α (α < λ), q_{α+1} is an element of P stronger than q_α such that either q_{α+1} ⊩ α + 1 ∈ x̂ or q_{α+1} ⊩ α + 1 ∉ x̂. For limit α, let q′_α be any upper bound of the q_β for β < α (this exists since P is κ-complete and α < κ), and let q_α be stronger than q′_α and satisfy either q_α ⊩ α ∈ x̂ or q_α ⊩ α ∉ x̂. Finally let q* be an upper bound of the q_α for α < λ; q* ∈ P since P is κ-complete.
Note that these elements all exist since for any p ∈ P and any (first-order) sentence φ there is some q < p such that q forces either φ or ¬φ.
q* not only forces that x̂ is a bounded subset of κ, but for every ordinal it forces whether or not that ordinal is contained in x̂. But the set {α < λ | q* ⊩ α ∈ x̂} is definable in M, and is of course equal to x̂[G*] in any generic G* containing q*. So q* ⊩ x̂ ∈ M.
Since this holds for any element stronger than p, it follows that p ⊩ x̂ ∈ M, and therefore x̂[G] ∈ M.
Version: 1 Owner: Henry Author(s): Henry
34.131 ◊ is equivalent to ♣ and the continuum hypothesis
If S is a stationary subset of κ, and λ < κ implies 2^λ ≤ κ, then
◊_S ↔ ♣_S
Moreover, this is best possible: ♣_S is consistent with the failure of ◊_S.
Version: 3 Owner: Henry Author(s): Henry
34.132 Levy collapse
Given any cardinals κ and λ in M, we can use the Levy collapse to give a new model M[G] where λ = κ. Let P = Levy(κ, λ) be the set of partial functions f : κ → λ with |dom(f)| < κ. These functions each give partial information about a function F which collapses λ onto κ.
Given any generic subset G of P, M[G] has the set G, so let F = ∪G. Each element of G is a partial function, and they are all compatible, so F is a function. dom(F) = κ, since for each α < κ the set of f ∈ P such that α ∈ dom(f) is dense (given any function without α in its domain, it is trivial to add (α, 0), giving a stronger function which includes α). Also range(F) = λ, since for each α < λ the set of f ∈ P such that α is in the range of f is again dense (the domain of each f is bounded, so if β is larger than any element of dom(f), then f ∪ {(β, α)} is stronger than f and includes α in its range).
So F is a surjective function from κ to λ, and λ is collapsed in M[G]. In addition, |Levy(κ, λ)| = λ, so it satisfies the λ⁺ chain condition, and therefore λ⁺ is not collapsed, and becomes κ⁺ (since for any ordinal between λ and λ⁺ there is already a surjective function to it from λ).
We can generalize this by forcing with P = Levy(κ, <λ) with λ regular: the set of partial functions f : λ × κ → λ such that f(0, α) = 0, |dom(f)| < κ, and if α > 0 then f(α, i) < α. In essence, this is the union of Levy(κ, η) for each κ < η < λ.
In M[G], define F = ∪G and F_α(β) = F(α, β). Each F_α is a function from κ to α, and by the same argument as above F_α is both total and surjective. Moreover, it can be shown that P satisfies the λ chain condition, so λ does not collapse and λ = κ⁺.
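Although κ and λ are infinite, the two density arguments above can be imitated with finite stand-ins. In the Python sketch below (a toy: small finite sets play the roles of κ and λ, and conditions are dicts), the two functions exhibit the stronger conditions whose existence makes the relevant sets dense:

```python
def extend_domain(f, alpha):
    """Given a condition f (a dict, i.e. a partial function K -> L) lacking
    alpha, return a stronger condition whose domain includes alpha."""
    if alpha in f:
        return f
    g = dict(f)
    g[alpha] = 0          # the "trivial to add (alpha, 0)" step from the text
    return g

def extend_range(f, beta, K):
    """Return a stronger condition with beta in its range: the domain of f is
    'bounded' (proper), so pick an unused point of K and map it to beta."""
    if beta in f.values():
        return f
    fresh = next(a for a in K if a not in f)
    g = dict(f)
    g[fresh] = beta
    return g

K, L = range(5), range(3)   # toy stand-ins for kappa and lambda
f = {0: 2, 1: 1}            # a condition: a finite partial function
assert 3 in extend_domain(f, 3)
assert 0 in extend_range(f, 0, K).values()
```

Iterating these extensions along the dense sets is exactly what makes ∪G a total surjection in the genuine (infinite) construction.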
Version: 2 Owner: Henry Author(s): Henry
34.133 proof of ◊ is equivalent to ♣ and the continuum hypothesis
The proofs that ◊_S implies both ♣_S and that for every λ < κ, 2^λ ≤ κ, are given in the entries for ◊_S and ♣_S.
Let A = ⟨A_α⟩_{α∈S} be a sequence which satisfies ♣_S.
Since there are only κ bounded subsets of κ, there is a surjective function f : κ → Bounded(κ) × κ, where Bounded(κ) is the set of bounded subsets of κ. Define a sequence B = ⟨B_α⟩_{α<κ} by B_α = B where f(α) = (B, γ) if sup(B) < α, and B_α = ∅ otherwise. Since for any bounded subset B the set of pairs (B, γ) ∈ Bounded(κ) × κ is unbounded, it follows that every bounded subset of κ occurs κ times in the sequence B.
We can define a new sequence D = ⟨D_α⟩_{α∈S} such that x ∈ D_α ↔ x ∈ B_β for some β ∈ A_α. We can show that D satisfies ◊_S.
First, for any α, x ∈ D_α means that x ∈ B_β for some β ∈ A_α, and since B_β ⊆ β ∈ A_α ⊆ α, we have D_α ⊆ α.
Next take any X ⊆ κ. We consider two cases:
X is bounded
The set of α such that X = B_α forms an unbounded set E, so there is a stationary S′ ⊆ S such that α ∈ S′ → A_α ⊆ E. For each such α, x ∈ D_α ↔ x ∈ B_β for some β ∈ A_α ⊆ E. But each such B_β is equal to X, so D_α = X.
X is unbounded
We define a function g : κ → κ as follows:
• g(0) = 0
• To find g(α), take X ∩ {g(β) | β < α}. This is a bounded subset of κ, so it is equal to B_γ for an unbounded set of ordinals γ. Take g(α) = γ, where γ is the least such ordinal greater than any element of {α} ∪ {g(β) | β < α} with B_γ = X ∩ {g(β) | β < α}.
Let E = range(g). This is obviously unbounded, and so there is a stationary S′ ⊆ S such that α ∈ S′ → A_α ⊆ E.
Next, consider C, the set of ordinals less than κ closed under g. Clearly it is unbounded, since if λ < κ then induction gives an ordinal greater than λ closed under g (essentially the result of applying g an infinite number of times). Also, C is closed: take any c ⊆ C and suppose sup(c ∩ α) = α. Then for any β < α, there is some γ ∈ c such that β < γ < α and therefore g(β) < γ. So α is closed under g, and therefore contained in C.
Since C is a club, C′ = C ∩ S′ is stationary. Suppose α ∈ C′. Then x ∈ D_α ↔ x ∈ B_β where β ∈ A_α. Since α ∈ S′, β ∈ range(g), and therefore B_β ⊆ X. Next take any x ∈ X ∩ α. Since α ∈ C, it is closed under g, hence there is some γ < α such that g(x) < γ. Since sup(A_α) = α, there is some η ∈ A_α such that γ < η, so g(x) < η. Since η ∈ A_α, B_η ⊆ D_α, and since η ∈ range(g), g(δ) ∈ B_η for any δ < g⁻¹(η), and in particular x ∈ B_η. Since we showed above that D_α ⊆ α, we have D_α = X ∩ α for any α ∈ C′.
Version: 3 Owner: Henry Author(s): Henry
34.134 Martin's axiom
For any cardinal κ, Martin's Axiom for κ (MA_κ) states that if P is a partial order satisfying ccc, then given any set of κ dense subsets of P, there is a directed subset intersecting each such subset. Martin's Axiom states that MA_κ holds for every κ < 2^ℵ₀.
Version: 3 Owner: Henry Author(s): Henry
34.135 Martin's axiom and the continuum hypothesis
MA_ℵ₀ always holds
Given a countable collection of dense subsets of a partial order, we can select a sequence ⟨p_n⟩_{n<ω} such that p_n is in the nth dense subset and p_{n+1} < p_n for each n; the filter generated by the p_n meets every one of the dense subsets. Therefore CH implies MA.
If MA_κ then 2^ℵ₀ > κ, and in fact 2^κ = 2^ℵ₀. Since κ ≥ ℵ₀, we have 2^κ ≥ 2^ℵ₀, hence it will suffice to find a surjective function from P(ℵ₀) onto P(κ).
Let A = ⟨A_α⟩_{α<κ} be a sequence of infinite subsets of ω such that for any α ≠ β, A_α ∩ A_β is finite.
Given any subset S ⊆ κ we will construct a function f : ω → {0, 1} such that a unique S can be recovered from each f. f will have the property that if i ∈ S then f(e) = 0 for finitely many elements e ∈ A_i, and if i ∉ S then f(e) = 0 for infinitely many elements of A_i.
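Such an almost disjoint family can be constructed explicitly, for example by coding the finite prefixes of 0–1 sequences as natural numbers; two distinct sequences then share only the codes of their common prefixes. A small Python sketch of this standard construction (the particular coding chosen here is just one convenient option):

```python
def prefix_code(bits):
    """Encode a finite 0/1 string as a natural number (its binary expansion
    with a leading 1), so that distinct strings get distinct codes."""
    n = 1
    for b in bits:
        n = 2 * n + b
    return n

def branch_set(r, length):
    """A_r = {codes of the finite prefixes of the 0/1 sequence r}. Sequences
    that differ somewhere share only the codes of their common prefixes,
    so the family {A_r} is almost disjoint."""
    return {prefix_code(r[:k]) for k in range(1, length + 1)}

r1 = [0, 1] * 10
r2 = [0, 1, 1, 0] * 5       # agrees with r1 only on the first 2 bits
A1, A2 = branch_set(r1, 20), branch_set(r2, 20)
assert len(A1 & A2) == 2    # the intersection is finite: the shared prefixes
```

Since there are 2^ℵ₀ infinite 0–1 sequences, this yields an almost disjoint family of the size needed when κ ≤ 2^ℵ₀.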
Let P be the partial order (under inclusion) such that each element p ∈ P satisfies:
• p is a partial function from ω to {0, 1}
• There exist i₁, …, i_n ∈ S such that for each j ≤ n, A_{i_j} ⊆ dom(p)
• There is a finite subset W_p of ω such that W_p = dom(p) − ∪_{j≤n} A_{i_j}
• For each j ≤ n, p(e) = 0 for finitely many elements of A_{i_j}
This satisfies ccc. To see this, consider any uncountable sequence ⟨p_α⟩_{α<ω₁} of elements of P. There are only countably many finite subsets of ω, so there is some W ⊆ ω such that W = W_p for uncountably many of the p_α, with p↾W the same for each such element. Since each of these functions' domains covers only a finite number of the A_α, and each is 1 on all but a finite number of elements in each, there are only a countable number of different combinations available, and therefore two of them are compatible.
Consider the following groups of dense subsets:
• D_n = {p ∈ P | n ∈ dom(p)} for n < ω. This is obviously dense since any p not already in D_n can be extended to one which is by adding ⟨n, 1⟩.
• D_α = {p ∈ P | dom(p) ⊇ A_α} for α ∈ S. This is dense since if p ∉ D_α then p ∪ {⟨e, 1⟩ | e ∈ A_α ∖ dom(p)} is.
• For each α ∉ S and n < ω, D_{n,α} = {p ∈ P | m > n ∧ m ∈ A_α ∧ p(m) = 0 for some m < ω}. This is dense since if p ∈ P then dom(p) ∩ A_α = A_α ∩ (W_p ∪ ∪_{j≤n} A_{i_j}). But W_p is finite, and the intersection of A_α with any other A_i is finite, so this intersection is finite, and hence bounded by some m. A_α is infinite, so there is some x ∈ A_α with x > m, n. So p ∪ {⟨x, 0⟩} ∈ D_{n,α}.
By MA_κ, given any set of κ dense subsets of P, there is a generic G which intersects all of them. There are a total of ℵ₀ + |S| + (κ − |S|) · ℵ₀ = κ dense subsets in these three groups, and hence some generic G intersecting all of them. Since G is directed, F = ∪G is a partial function from ω to {0, 1}. Since for each n < ω, G ∩ D_n is nonempty, n ∈ dom(F), so F is a total function. Since G ∩ D_α for α ∈ S is nonempty, there is some element of G whose domain contains all of A_α and is 0 on only a finite number of them, hence F(e) = 0 for only a finite number of e ∈ A_α. Finally, since G intersects D_{n,α} for each n < ω and α ∉ S, the set of m ∈ A_α such that F(m) = 0 is unbounded, and hence infinite. So F is as promised, and 2^κ = 2^ℵ₀.
Version: 1 Owner: Henry Author(s): Henry
34.136 Martin's axiom is consistent
If κ is an uncountable strong limit cardinal such that for any λ < κ, κ^λ = κ, then it is consistent that 2^ℵ₀ = κ and MA. This is shown by using finite support iterated forcing to construct a model of ZFC in which this is true. Historically, this proof was the motivation for developing iterated forcing.
Outline
The proof uses the convenient fact that MA_κ holds as long as it holds for all partial orders smaller than κ. Given the conditions on κ, there are at most κ names for these partial orders. At each step in the forcing, we force with one of these names. The result is that the actual generic subset we add intersects every dense subset of every partial order.
Construction of P_κ
Q̂_α will be constructed by induction with three conditions: |P_α| < κ for all α < κ, ⊩_{P_α} Q̂_α ⊆ M, and P_α satisfies the ccc. Note that a partial ordering on a cardinal λ < κ is a function from λ × λ to {0, 1}, so there are at most 2^λ < κ of them. Since a canonical name for a partial ordering of a cardinal is just a function from P_α to that cardinal, there are at most κ^{2^λ} = κ of them.
At each of the κ steps, we want to deal with one of these possible partial orderings, so we need to partition the κ steps into κ pieces for each of the κ cardinals less than κ. In addition, we need to include every P_α name for any level. Therefore, we partition κ into ⟨S_{γ,δ}⟩_{γ,δ<κ} for each cardinal δ, with each S_{γ,δ} having cardinality κ and the added condition that η ∈ S_{γ,δ} implies η ≥ γ. Then each P_γ name for a partial ordering of δ is assigned some index η ∈ S_{γ,δ}, and that partial order will be dealt with at stage Q_η.
Formally, given Q̂_β for β < α, P_α can be constructed, and the P_α names for partial orderings of each cardinal δ are enumerated by the elements of S_{α,δ}. We have α ∈ S_{γ_α,δ_α} for some γ_α and δ_α, and α ≥ γ_α, so some canonical P_{γ_α} name for a partial order <̂_α of δ_α has already been assigned to α.
Since <̂_α is a P_{γ_α} name, it is also a P_α name, so Q̂_α can be defined as ⟨δ_α, <̂_α⟩ if ⊩_{P_α} "⟨δ_α, <̂_α⟩ satisfies the ccc", and as the trivial partial order ⟨1, {⟨1, 1⟩}⟩ otherwise. Obviously this satisfies the ccc, and so P_{α+1} does as well. Since Q̂_α is either trivial or a cardinal together with a canonical name, ⊩_{P_α} Q̂_α ⊆ M. Finally, |P_{α+1}| ≤ Σ_n |α|^n · (sup_i |Q̂_i|)^n < κ.
Proof that MA_λ holds for λ < κ
Lemma: It suffices to show that MA_λ holds for partial orders of size at most λ
Suppose P is a partial order with |P| > λ and let ⟨D_α⟩_{α<λ} be dense subsets of P. Define functions f_α : P → D_α for α < λ with f_α(p) stronger than p (obviously such elements exist since D_α is dense). Let g : P × P → P be a function such that g(p, q) is stronger than both p and q whenever p and q are compatible. Then pick some element q ∈ P and let Q be the closure of {q} under the f_α and g, with the same ordering as P (restricted to Q).
Since there are only λ functions being used, |Q| ≤ λ. If p ∈ Q then f_α(p) is stronger than p and clearly f_α(p) ∈ Q ∩ D_α, so each D_α ∩ Q is dense in Q. In addition, Q is ccc: if A is an antichain in Q and p₁, p₂ ∈ A then p₁, p₂ are incompatible in Q. But if they were compatible in P then g(p₁, p₂) would be an element of Q stronger than both, so they must be incompatible in P. Therefore A is an antichain in P, and therefore must have countable cardinality, since P satisfies the ccc.
By assumption, there is a directed G ⊆ Q such that G ∩ (D_α ∩ Q) ≠ ∅ for each α < λ, and therefore MA_λ holds in full.
Now we must prove that, if G is a generic subset of P_κ, R is some partial order with |R| ≤ λ, and ⟨D_α⟩_{α<λ} are dense subsets of R, then there is some directed subset of R intersecting each D_α.
If |R| < λ then λ additional elements can be added greater than any other element of R to make |R| = λ, and then since there is an order isomorphism onto some partial ordering of λ, assume R is a partial ordering of λ. Then let D = {⟨α, β⟩ | α ∈ D_β}.
Take canonical names so that R = R̂[G], D = D̂[G] and D_i = D̂_i[G] for each i < λ, and:
⊩_{P_κ} "R̂ is a partial ordering satisfying ccc, D̂ ⊆ λ × λ, and each D̂_α is dense in R̂"
For any α, β there is a maximal antichain A_{α,β} ⊆ P_κ such that if p ∈ A_{α,β} then either p ⊩_{P_κ} α <_{R̂} β or p ⊩_{P_κ} ¬(α <_{R̂} β), and another maximal antichain B_{α,β} ⊆ P_κ such that if p ∈ B_{α,β} then either p ⊩_{P_κ} ⟨α, β⟩ ∈ D̂ or p ⊩_{P_κ} ⟨α, β⟩ ∉ D̂. These antichains determine the value of those two formulas.
Then, since κ^{cf κ} > κ and κ^μ = κ for μ < κ, it must be that cf κ = κ, so κ is regular. Then γ = sup{α + 1 | α ∈ dom(p), p ∈ ∪_{α,β<λ} A_{α,β} ∪ B_{α,β}} < κ, so all A_{α,β}, B_{α,β} ⊆ P_γ, and therefore the P_κ names R̂ and D̂ are also P_γ names.
Lemma: For any γ, G_γ = {p↾γ | p ∈ G} is a generic subset of P_γ
First, it is directed, since if p₁↾γ, p₂↾γ ∈ G_γ then there is some p ∈ G such that p < p₁, p₂, and therefore p↾γ ∈ G_γ with p↾γ < p₁↾γ, p₂↾γ.
Also, it is generic. If D is a dense subset of P_γ then D* = {p ∈ P_κ | p↾γ ∈ D} is dense in P_κ, since if p ∈ P_κ then there is some d < p↾γ with d ∈ D, but then d is compatible with p, so a common extension lies in D*. Therefore there is some p ∈ D* ∩ G, and so p↾γ ∈ D ∩ G_γ.
Since R̂ and D̂ are P_γ names, R̂[G] = R̂[G_γ] = R and D̂[G] = D̂[G_γ] = D, so in V[G_γ],
"R̂ is a partial ordering of λ satisfying the ccc and each D̂_α is dense in R̂"
Then there must be some p ∈ G_γ such that
p ⊩_{P_γ} "R̂ is a partial ordering of λ satisfying the ccc"
Let A_p be a maximal antichain of P_γ such that p ∈ A_p, and define <̂* as a P_γ name with ⟨p, r⟩ ∈ <̂* for each r ∈ R̂, and ⟨c, r⟩ ∈ <̂* if r = ⟨α, β⟩ where α < β < λ and p ≠ c ∈ A_p. That is, <̂*[G] = R when p ∈ G, and <̂*[G] is the ordering ∈ restricted to λ otherwise. Then this is a name for a partial ordering of λ, and therefore there is some η ∈ S_{γ,λ} such that <̂* = <̂_η, with η ≥ γ. Since p ∈ G_γ ⊆ G_η, Q̂_η[G_η] = <̂_η[G_η] = R.
Since P_{η+1} = P_η ∗ Q̂_η, we know that G_{Q_η} ⊆ Q_η is generic, since forcing with the composition is equivalent to successive forcing. Since each D_i ∈ V[G_γ] ⊆ V[G_η] and is dense, it follows that D_i ∩ G_{Q_η} ≠ ∅, and since G_{Q_η} is a directed subset of R, MA_λ holds.
Proof that 2^ℵ₀ = κ
The relationship between Martin's axiom and the continuum hypothesis tells us that 2^ℵ₀ ≥ κ. Since 2^ℵ₀ was less than κ in V, and since |P_κ| = κ adds at most κ elements, it must be that 2^ℵ₀ = κ.
Version: 3 Owner: Henry Author(s): Henry
34.137 a shorter proof: Martin's axiom and the continuum hypothesis
This is another, shorter proof of the fact that MA_ℵ₀ always holds.
Let (P, <) be a partially ordered set and D a collection of subsets of (P, <). We recall that a filter G on (P, <) is D-generic if G ∩ D ≠ ∅ for all D ∈ D which are dense in (P, <). ("Dense" in this context means: if D is dense in (P, <), then for every p ∈ P there is a d ∈ D such that d < p.)
Let (P, <) be a partially ordered set and D a countable collection of dense subsets of P. Then there exists a D-generic filter G on P. Moreover, it can be shown that for every p ∈ P there is such a D-generic filter G with p ∈ G.
Let D₁, …, D_n, … be the dense subsets in D. Furthermore let p₀ = p. Now we can choose for every 1 ≤ n < ω an element p_n ∈ P such that p_n < p_{n−1} and p_n ∈ D_n. If we now consider the set G := {q ∈ P : ∃ n < ω such that p_n < q}, then it is easy to check that G is a D-generic filter on P, and p ∈ G obviously. This completes the proof.
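The chain construction in this proof is directly implementable when a finite poset stands in for (P, <). In the Python sketch below (the divisibility order and its two dense sets are illustrative choices, not part of the entry), `generic_filter` picks p_n ∈ D_n descending below p and then closes upward:

```python
def generic_filter(P, leq, dense_sets, p):
    """Descend through the dense sets D_1, D_2, ..., picking p_n in D_n
    with p_n stronger than p_{n-1}, then close upward to obtain a
    D-generic filter containing p."""
    chain = [p]
    for D in dense_sets:
        # density: some element of D is stronger than the current condition
        chain.append(next(d for d in D if leq(d, chain[-1])))
    # the filter: everything weaker than some element of the chain
    return {q for q in P if any(leq(c, q) for c in chain)}

# Toy poset: divisors of 24, where a is "stronger" than b iff b divides a.
P = {1, 2, 3, 4, 6, 8, 12, 24}
leq = lambda a, b: a % b == 0
evens  = {q for q in P if q % 2 == 0}   # dense: every q divides an even element
threes = {q for q in P if q % 3 == 0}   # dense: every q divides a multiple of 3

G = generic_filter(P, leq, [evens, threes], 1)
assert 1 in G and G & evens and G & threes
```

With countably many dense sets the same loop runs through ω stages, which is exactly the proof above; for uncountably many dense sets the descent can fail, which is why MA beyond ℵ₀ is a genuine axiom.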
Version: 4 Owner: x bas Author(s): x bas
34.138 continuum hypothesis
The Continuum Hypothesis states that there is no cardinal number κ such that ℵ₀ < κ < 2^ℵ₀.
An equivalent statement is that ℵ₁ = 2^ℵ₀.
It is known to be independent of the axioms of ZFC.
It is known to be independent of the axioms of ZFC.
The continuum hypothesis can also be stated as: there is no subset of the real numbers
which has cardinality strictly between that of the reals and that of the integers. It is from
this that the name comes, since the set of real numbers is also known as the continuum.
Version: 8 Owner: Evandar Author(s): Evandar
34.139 forcing
Forcing is the method used by Paul Cohen to prove the independence of the continuum hypothesis (CH). In fact, the method was used by Cohen to prove that CH could be violated. The treatment I give here is VERY informal. I will develop it later. First let me give an example from algebra.
Suppose we have a field k, and we want to add to this field an element α such that α² = −1. We see that we cannot simply drop a new α in k, since then we are not guaranteed that we still have a field. Neither can we simply assume that k already has such an element. The standard way of doing this is to start by adjoining a generic indeterminate X, and impose a constraint on X, saying that X² + 1 = 0. What we do is take the quotient k[X]/(X² + 1), and make a field out of it by taking the quotient field. We then obtain k(α), where α is the equivalence class of X in the quotient. The general case of this is the theorem of algebra saying that every polynomial p over a field k has a root in some extension field.
We can rephrase this and say that "it is consistent with standard field theory that −1 has a square root".
When the theory we consider is ZFC, we run into exactly the same problem: we can't just add a "new" set and pretend it has the required properties, because then we may violate something else, like foundation. Let M be a transitive model of set theory, which we call the ground model. We want to "add a new set" S to M in such a way that the extension M′ has M as a subclass, the properties of M are preserved, and S ∈ M′.
The first step is to "approximate" the new set using elements of M. This is the analogue of finding the irreducible polynomial in the algebraic example. The set P of such "approximations" can be ordered by how much information the approximations give: let p, q ∈ P; then p < q if and only if p "is stronger than" q. We call this set a set of forcing conditions. Furthermore, it is required that the set P itself and the order relation be elements of M.
Since P is a partial order, some of its subsets have interesting properties. Consider P as a topological space with the order topology. A subset D ⊆ P is dense in P if and only if for every p ∈ P there is d ∈ D such that d < p. A filter in P is said to be M-generic if and only if it intersects every one of the dense subsets of P which are in M. An M-generic filter in P is also referred to as a generic set of conditions in the literature. In general, even though P is a set in M, generic filters are not elements of M.
If P is a set of forcing conditions, and G is a generic set of conditions in P, all in the ground model M, then we define M[G] to be the least model of ZFC that contains G. In forthcoming entries I will detail the construction of M[G]. The big theorem is this:
Theorem 5. M[G] is a model of ZFC, has the same ordinals as M, and M ⊆ M[G].
The way to prove that we can violate CH using a generic extension is to add many new "subsets of ω" in the following way: let M be a transitive model of ZFC, and let (P, <) be the set (in M) of all functions p whose domain is a finite subset of ℵ₂ × ℵ₀, and whose range is the set {0, 1}. The ordering here is p < q if and only if p ⊃ q. Let G be a generic set of conditions in P. Then F = ∪G is a total function whose domain is ℵ₂ × ℵ₀ and whose range is {0, 1}. We can see this F as coding ℵ₂ new functions f_α : ℵ₀ → {0, 1}, α < ℵ₂, which are subsets of ω. These functions are all distinct, and so CH is violated in M[G].
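The density argument behind ∪G being total can be imitated with finite conditions. In this Python sketch (a toy: small integers stand in for ordinals below ℵ₂ and ℵ₀), a condition is a finite dict keyed by pairs (α, n), and any condition extends to one deciding a given bit, which is exactly why each set of deciding conditions is dense:

```python
def decided(p, alpha, n):
    """Does condition p (a dict (alpha, n) -> bit) decide bit n of f_alpha?"""
    return (alpha, n) in p

def extend(p, alpha, n, bit=0):
    """Every condition extends to one deciding (alpha, n); hence the set of
    such conditions is dense, and the union of a generic filter is a total
    function on aleph_2 x aleph_0."""
    if decided(p, alpha, n):
        return p
    q = dict(p)
    q[(alpha, n)] = bit
    return q

p = {(0, 0): 1, (1, 0): 0}        # a finite condition
q = extend(p, 2, 5)
assert decided(q, 2, 5) and q.items() >= p.items()
```

Similar dense sets force f_α ≠ f_β for α ≠ β (extend any condition to disagree at some bit), which is how the ℵ₂ new reals come out pairwise distinct.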
All this relies on a proper deﬁnition of the satisfaction relation in M[G], and the forcing relation,
which will come in a forthcoming entry. Details can be found in Thomas Jech’s book Set
Theory.
Version: 6 Owner: jihemme Author(s): jihemme
34.140 generalized continuum hypothesis
The generalized continuum hypothesis states that for any infinite cardinal λ there is no cardinal κ such that λ < κ < 2^λ.
Equivalently, for every ordinal α, ℵ_{α+1} = 2^{ℵ_α}.
Like the continuum hypothesis, the generalized continuum hypothesis is known to be independent
Like the continuum hypothesis, the generalized continuum hypothesis is known to be independent
of the axioms of ZFC.
Version: 7 Owner: Evandar Author(s): Evandar
34.141 inaccessible cardinals
A limit cardinal κ is a strong limit cardinal if for any λ < κ, 2^λ < κ.
A regular limit cardinal κ is called weakly inaccessible, and a regular strong limit cardinal
is called inaccessible.
Version: 2 Owner: Henry Author(s): Henry
34.142 ◊
◊_S is a combinatoric principle regarding a stationary set S ⊆ κ. It holds when there is a sequence ⟨A_α⟩_{α∈S} such that each A_α ⊆ α and, for any A ⊆ κ, {α ∈ S | A ∩ α = A_α} is stationary.
To get some sense of what this means, observe that for any λ < κ, {λ} ⊆ κ, so the set of α with A_α = {λ} is stationary (in κ). More strongly, suppose κ > λ. Then any subset B ⊂ λ is bounded in κ, so A_α = B on a stationary set. Since |S| = κ, it follows that 2^λ ≤ κ. Hence ◊_{ℵ₁}, the most common form (often written as just ◊), implies CH.
Version: 3 Owner: Henry Author(s): Henry
34.143 ♣
♣_S is a combinatoric principle weaker than ◊_S. It states that, for S stationary in κ, there is a sequence ⟨A_α⟩_{α∈S} such that A_α ⊆ α with sup(A_α) = α, and with the property that for each unbounded subset X ⊆ κ there is some A_α ⊆ X.
Any sequence satisfying ◊_S can be adjusted so that sup(A_α) = α, so this is indeed a weakened form of ◊_S.
Any such sequence actually contains a stationary set of α such that A_α ⊆ X for each X: given any club C and any unbounded X, construct κ-sequences C* and X* from the elements of each, such that the αth member of C* is greater than the αth member of X*, which is in turn greater than any earlier member of C*. Since both sets are unbounded, this construction is possible, and X* is a subset of X still unbounded in κ. So there is some α such that A_α ⊆ X*, and since sup(A_α) = α, α is also the limit of a subsequence of C* and therefore an element of C.
Version: 1 Owner: Henry Author(s): Henry
34.144 Dedekind infinite
A set A is said to be Dedekind infinite if there is an injective function f : ω → A, where ω denotes the set of natural numbers.
A Dedekind inﬁnite set is certainly inﬁnite, and if the axiom of choice is assumed, then an
inﬁnite set is Dedekind inﬁnite. However, it is consistent with the failure of the axiom of
choice that there is a set which is inﬁnite but not Dedekind inﬁnite.
Version: 4 Owner: Evandar Author(s): Evandar
34.145 Zermelo-Fraenkel axioms
Equality of sets: If X and Y are sets, and x ∈ X iff x ∈ Y, then X = Y.
Pair set: If X and Y are sets, then there is a set Z containing only X and Y.
Union over a set: If X is a set, then there exists a set that contains every element of each x ∈ X.
Axiom of power set: If X is a set, then there exists a set P(X) with the property that Y ∈ P(X) iff every element y ∈ Y is also in X.
Replacement axiom: Let F(x, y) be some formula. If, for all x, there is exactly one y such that F(x, y) is true, then for any set A there exists a set B with the property that b ∈ B iff there exists some a ∈ A such that F(a, b) is true.
Regularity axiom: Let F(x) be some formula. If there is some x that makes F(x) true, then there is a set Y such that F(Y) is true, but for no y ∈ Y is F(y) true.
Existence of an infinite set: There exists a nonempty set X with the property that, for any x ∈ X, there is some y ∈ X such that x ⊆ y but x ≠ y.
Ernst Zermelo and Abraham Fraenkel proposed these axioms as a foundation for what is now called Zermelo-Fraenkel set theory, or ZF. If the axiom of choice is accepted as well, the resulting theory is often denoted ZFC.
Version: 10 Owner: mathcam Author(s): mathcam, vampyr
34.146 class
By a class in modern set theory we mean an arbitrary collection of elements of the universe. All sets are classes (as they are collections of elements of the universe, which are usually sets, but could also be urelements), but not all classes are sets. Classes which are not sets are called proper classes.
The need for this distinction arises from the paradoxes of so-called naive set theory. In naive set theory one assumes that to each possible division of the universe into two disjoint and mutually comprehensive parts there corresponds an entity of the universe, a set. This is the content of Frege's famous fifth axiom, which states that to each second order predicate P there corresponds a first order object p called the extension of P, such that ∀x(P(x) ↔ x ∈ p). (Every predicate P divides the universe into two mutually comprehensive and disjoint parts, namely the part which consists of objects for which P holds and the part consisting of objects for which P does not hold.)
Speaking in modern terms we may view the situation as follows. Consider a model of set theory M. The interpretation the model gives to ∈ implicitly defines a function E : P(M) → M. Seen this way, the fact that not all classes can be sets simply means that we can't injectively map the powerset of any set into the set itself, which is a famous result by Cantor. Functions like E here are known as extensors, and they have been used in the study of the semantics of set theory.
Russell's paradox, which can be seen as a proof of Cantor's theorem about cardinalities of powersets, shows that Frege's fifth axiom is contradictory: not all classes can be sets. From here there are two traditional ways to proceed: either through the theory of types or through some form of limitation of size principle.
The limitation of size principle in its vague form says that all small classes (in the sense of cardinality) are sets, while all proper classes are very big, "too big" to be sets. The limitation of size principle can be found in Cantor's work, where it is the basis for Cantor's doctrine that only transfinite collections can be thought of as specific objects (sets), while some collections are "absolutely infinite" and can't be comprehended into an object. This can be given a precise formulation: all classes which are of the same cardinality as the universal class are too big, and all other classes are small. In fact, this formulation can be used in von Neumann-Bernays-Gödel set theory to replace the replacement axiom and almost all other set existence axioms (with the exception of the powerset axiom).
The limitation of size principle can be seen to give rise to extensors of type P_{<|A|}(A) → A (here P_{<|A|}(A) is the set of all subsets of A which are of cardinality less than that of A). This is not the only possible way to avoid Russell's paradox. We could use an extensor according to which all classes which are of cardinality less than that of the universe, or for which the cardinality of their complement is less than that of the universe, are sets (i.e. map into elements of the model).
In many set theories there are formally no proper classes; ZFC is an example of just such a set theory. In these theories one usually means by a proper class an open formula Φ, possibly with set parameters a_1, ..., a_n. Notice, however, that these do not exhaust all possible proper classes that should "really" exist for the universe, as this only allows us to deal with proper classes that can be defined by means of an open formula with parameters. The theory NBG formalises this usage: it is conservative over ZFC (as speaking about open formulae with parameters clearly must be!).
There is a set theory known as Morse-Kelley set theory which allows us to speak about and to quantify over an extended class of impredicatively defined proper classes that can't be reduced to simply speaking about open formulae.
Version: 5 Owner: Aatu Author(s): Aatu
34.147 complement
Let A be a subset of B. The complement of A in B (denoted A^c when the larger set B is clear from context) is the set difference B ∖ A.
Version: 1 Owner: djao Author(s): djao
34.148 delta system
If S is a set of finite sets, then it is a ∆-system if there is some (possibly empty) X such that for any a, b ∈ S, if a ≠ b then a ∩ b = X.
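For small, concrete families the ∆-system condition can be tested directly; here is a minimal Python sketch (the function name is ours, not from the entry):

```python
from itertools import combinations

def delta_root(family):
    """If the finite sets in `family` form a delta-system, return the
    common root X; otherwise return None. Checks that all pairwise
    intersections of distinct members are equal."""
    sets = [frozenset(s) for s in family]
    roots = {a & b for a, b in combinations(sets, 2) if a != b}
    if len(roots) > 1:
        return None
    return roots.pop() if roots else frozenset()

print(delta_root([{1, 2}, {1, 3}, {1, 4}]))  # frozenset({1})
print(delta_root([{1, 2}, {2, 3}, {3, 4}]))  # None: intersections differ
```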
Version: 2 Owner: Henry Author(s): Henry
34.149 delta system lemma
If S is a set of finite sets such that |S| = ℵ₁, then there is an S′ ⊆ S such that |S′| = ℵ₁ and S′ is a ∆-system.
Version: 3 Owner: Henry Author(s): Henry
34.150 diagonal intersection
If ⟨S_i⟩_{i<α} is a sequence then the diagonal intersection, ∆_{i<α} S_i, is defined to be

∆_{i<α} S_i = {β < α | β ∈ ∩_{γ<β} S_γ}

That is, β is in ∆_{i<α} S_i if it is contained in the first β members of the sequence.
Version: 2 Owner: Henry Author(s): Henry
34.151 intersection
The intersection of two sets A and B is the set that contains all the elements x such that x ∈ A and x ∈ B. The intersection of A and B is written A ∩ B.
Example. If A = {1, 2, 3, 4, 5} and B = {1, 3, 5, 7, 9} then A ∩ B = {1, 3, 5}.
We can also define the intersection of an arbitrary number of sets. If {A_j}_{j∈J} is a family of sets, we define the intersection of all of them, denoted ∩_{j∈J} A_j, as the set consisting of those elements belonging to every set A_j:

∩_{j∈J} A_j = {x : x ∈ A_j for all j ∈ J}
Version: 7 Owner: drini Author(s): drini, xriso
34.152 multiset
A multiset is a generalization of a set in which duplicate elements are allowed.
For example, {1, 1, 3} is a multiset, but not a set.
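In Python, `collections.Counter` serves as a convenient multiset, recording each element together with its multiplicity (a sketch, not part of the original entry):

```python
from collections import Counter

m = Counter([1, 1, 3])        # the multiset {1, 1, 3}
print(m[1], m[3])             # multiplicities: 2 1
print(sorted(m.elements()))   # [1, 1, 3]

# Converting to a plain set collapses the duplicates:
print(set([1, 1, 3]))         # {1, 3}
```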
Version: 2 Owner: akrowne Author(s): akrowne
34.153 proof of delta system lemma
Since there are only ℵ₀ possible cardinalities for the elements of S, there must be some n such that an uncountable number of elements of S have cardinality n. Let S* = {a ∈ S | |a| = n} for this n. By induction on n, the lemma holds:
If n = 1 then the elements of S* are distinct singletons with pairwise empty intersections, so X = ∅ and S′ = S*.
Suppose n > 1. If there is some x which is in an uncountable number of elements of S*, then take S** = {a ∖ {x} | x ∈ a ∈ S*}. Obviously this is uncountable and every element has n − 1 elements, so by the induction hypothesis there is some S′ ⊆ S** of uncountable cardinality such that the intersection of any two elements is X. Then {a ∪ {x} | a ∈ S′} satisfies the lemma, since the intersection of any two of its elements is X ∪ {x}.
On the other hand, if there is no such x, then we can construct by induction a sequence ⟨a_i⟩_{i<ω₁} such that each a_i ∈ S* and for any i ≠ j, a_i ∩ a_j = ∅. Take any element for a_0, and given ⟨a_i⟩_{i<α}, since α is countable, A = ∪_{i<α} a_i is countable. Each element of A is in only a countable number of elements of S*, so there are an uncountable number of elements of S* which are candidates for a_α. This sequence satisfies the lemma, since the intersection of any two of its elements is ∅.
Version: 2 Owner: Henry Author(s): Henry
34.154 rational number
The rational numbers ℚ are the fraction field of the ring ℤ of integers. In more elementary terms, a rational number is a quotient a/b of two integers a and b, with b ≠ 0. Two fractions a/b and c/d are equivalent if the product of the cross terms is equal:

a/b = c/d ⇔ ad = bc

Addition and multiplication of fractions are given by the formulae

a/b + c/d = (ad + bc)/(bd)

(a/b) · (c/d) = (ac)/(bd)

The field of rational numbers is an ordered field, under the ordering relation: a/b < c/d (for b, d positive) if the inequality ad < bc holds in the integers.
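Python's `fractions.Fraction` implements exactly this arithmetic, with automatic reduction; a quick sketch of the equivalence and the addition formula:

```python
from fractions import Fraction

# a/b and c/d are equivalent exactly when the cross products agree:
print(Fraction(2, 4) == Fraction(1, 2))  # True, since 2*2 == 4*1

# The addition formula a/b + c/d = (ad + bc)/(bd):
a, b, c, d = 1, 3, 1, 6
lhs = Fraction(a, b) + Fraction(c, d)
rhs = Fraction(a * d + b * c, b * d)
print(lhs, lhs == rhs)  # 1/2 True
```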
Version: 7 Owner: djao Author(s): djao
34.155 saturated (set)
If p : X → Y is a surjective map, we say that a subset C ⊆ X is saturated (with respect to p) if C contains every set p⁻¹({y}) that it intersects. Equivalently, C is saturated if it is a union of fibres.
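For finite sets the condition is easy to check mechanically; a Python sketch (the helper name is ours):

```python
def is_saturated(C, p, X):
    """True if C contains every fibre p^{-1}({y}) that it intersects."""
    return all({z for z in X if p(z) == p(x)} <= C for x in C)

X = {0, 1, 2, 3, 4, 5}
def p(x):
    return x % 3  # a surjection of X onto {0, 1, 2}

print(is_saturated({0, 3}, p, X))     # True: {0, 3} is exactly the fibre over 0
print(is_saturated({0, 1, 3}, p, X))  # False: meets the fibre {1, 4} without containing it
```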
Version: 2 Owner: dublisk Author(s): dublisk
34.156 separation and doubletons axiom
• Separation axiom: If X is a set and F is a condition on sets, there exists a set Y whose members are precisely the members of X satisfying F. Common notation: Y = {A ∈ X : F(A)}.
• Doubletons axiom (or Pairs): If X and Y are sets, there is a set Z whose only members are X and Y. Common notation: Z = {X, Y}.
REFERENCES
1. G.M. Bergman, An Invitation to General Algebra and Universal Constructions.
Version: 3 Owner: vladm Author(s): vladm
34.157 set
34.157.1 Introduction
A set is a collection, group, or conglomerate.¹
Sets can be of "real" objects or mathematical objects, but the sets themselves are purely conceptual. This is an important point to note: the set of all cows (for example) does not physically exist, even though the cows do. The set is a "gathering" of the cows into one conceptual unit that is not part of physical reality. This makes it easy to see why we can have sets with an infinite number of elements: even though we may not be able to point out infinitely many objects in the real world, we can construct conceptual sets with an infinite number of elements (see the examples below).
Mathematics is thus built upon sets of purely conceptual, or mathematical, objects. Sets are usually denoted by uppercase roman letters (like S). Sets can be defined by listing the members, as in

S = {a, b, c, d}

Or, a set can be defined from a formula. This type of statement defining a set is of the form

S = {x : P(x)}

where S is the symbol denoting the set, x is the variable we are introducing to represent a generic element of the set, and P(x) is some property that is true for values x within S (that is, x ∈ S iff P(x) holds). (We denote "and" by comma-separated clauses in P(x). Also note that the "x :" portion of the set definition may contain a qualification which narrows the values of x to some other set which is already known.)
Sets are, in fact, completely defined by their elements. If two sets have the same elements, they are equal. This is called the axiom of extensionality, and it is one of the most important characteristics of sets that distinguishes them from predicates or properties.
¹ However, not every collection has to be a set (in fact, not all collections can be sets). See proper class for more details.
The symbol ∈ denotes membership in a set. For example,

m ∈ S

would be read "m is an element of S", or "S contains m".
Some examples of sets, with formal definitions, are:
• The set of all even integers: {x ∈ ℤ : 2 | x}
• The set of all prime numbers: {p ∈ ℕ : ∀x ∈ ℕ, x | p ⇒ x ∈ {1, p}}, where ⇒ denotes implication and | denotes divisibility.
• The set of all real functions of one real parameter: {f(x) ∈ ℝ : x ∈ ℝ}
• The set of all isosceles triangles: {△ABC : AB = BC or BC = AC or AB = AC}, where AB denotes the length of the segment AB.
ℤ, ℕ, and ℝ are all standard sets: the integers, the natural numbers, and the real numbers, respectively. These are all infinite sets.
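The set-builder form carries over directly to Python set comprehensions, where the property P(x) becomes the `if` clause; a finite sketch of the first two examples (restricted to a finite range, since Python sets must be finite):

```python
# Finite analogues of {x in Z : 2 | x} and the set of primes, below 30.
evens = {x for x in range(30) if x % 2 == 0}
primes = {p for p in range(2, 30) if all(p % x != 0 for x in range(2, p))}
print(sorted(primes))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```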
The most basic set is the empty set (denoted ∅ or ¦¦).
The astute reader may have noticed that all of our examples of sets utilize sets, which does not suffice for a rigorous definition. We can be more rigorous if we postulate only the empty set, and define a set in general as anything which can be constructed from the empty set and the ZFC axioms.
All objects in modern mathematics are constructed via sets.
34.157.2 Set Notions
An important set notion is cardinality. Cardinality is roughly the same as the intuitive notion of "size". For sets with finitely many elements, cardinality can simply be thought of as size. However, intuition breaks down for sets with an infinite number of elements. For more detail, see the cardinality entry.
Another important set concept is that of subsets. A subset B of a set A is any set which contains only elements that appear in A. Subsets are denoted with the ⊆ symbol, i.e. B ⊆ A. Also useful is the notion of a proper subset, denoted B ⊂ A, which adds the restriction that B must not be equal to A.
34.157.3 Set Operations
There are a number of standard (common) operations which are used to manipulate sets,
producing new sets from combinations of existing sets (sometimes with entirely diﬀerent
types of elements). These standard operations are:
• union
• intersection
• set diﬀerence
• symmetric set diﬀerence
• complement
• cartesian product
Version: 5 Owner: akrowne Author(s): akrowne
Chapter 35
03Exx – Set theory
35.1 intersection of sets
Let X, Y be sets. The intersection of X and Y, denoted X ∩ Y, is the set

X ∩ Y = {z : z ∈ X, z ∈ Y}
Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 36
03F03 – Proof theory, general
36.1 NJp
NJp is a natural deduction proof system for intuitionistic propositional logic. Its only axiom is α ⇒ α for any atomic α. Its rules are:

          Γ ⇒ α
  ───────────────────────  (∨I)
  Γ ⇒ α ∨ β    Γ ⇒ β ∨ α

  Γ ⇒ α ∨ β    Σ, α⁰ ⇒ φ    Π, β⁰ ⇒ φ
  ────────────────────────────────────  (∨E)
             [Γ, Σ, Π] ⇒ φ

The syntax α⁰ indicates that the rule also holds if that formula is omitted.

  Γ ⇒ α    Σ ⇒ β
  ───────────────  (∧I)
  [Γ, Σ] ⇒ α ∧ β

     Γ ⇒ α ∧ β
  ───────────────  (∧E)
  Γ ⇒ α    Γ ⇒ β

   Γ, α ⇒ β
  ───────────  (→I)
  Γ ⇒ α → β

  Γ ⇒ α → β    Σ ⇒ α
  ───────────────────  (→E)
      [Γ, Σ] ⇒ β

  Γ ⇒ ⊥
  ──────  (⊥ᵢ)  where α is atomic
  Γ ⇒ α
Version: 3 Owner: Henry Author(s): Henry
36.2 NKp
NKp is a natural deduction proof system for classical propositional logic. It is identical to NJp except that it replaces the rule ⊥ᵢ with the rule:

  Γ, ¬α ⇒ ⊥
  ──────────  (⊥_c)  where α is atomic
    Γ ⇒ α
Version: 1 Owner: Henry Author(s): Henry
36.3 natural deduction
Natural deduction refers to related proof systems for several diﬀerent kinds of logic, intended
to be similar to the way people actually reason. Unlike many other proof systems, it has
many rules and few axioms. Sequents in natural deduction have only one formula on the
right side.
Typically the rules consist of one pair for each connective, one of which allows the introduction of that symbol and the other its elimination.
To give one example, the proof rules →I and →E are:

   Γ, α ⇒ β
  ───────────  (→I)
  Γ ⇒ α → β

and

  Γ ⇒ α → β    Σ ⇒ α
  ───────────────────  (→E)
      [Γ, Σ] ⇒ β
Version: 1 Owner: Henry Author(s): Henry
36.4 sequent
A sequent represents a formal step in a proof. Typically it consists of two lists of formulas,
one representing the premises and one the conclusions. A typical sequent might be:
φ, ψ ⇒ α, β
This claims that, from premises φ and ψ, either α or β must be true. Note that ⇒ is not
a symbol in the language, rather it is a symbol in the metalanguage used to discuss proofs.
Also, notice the asymmetry: everything on the left must be true to conclude only one thing
on the right. This does create a diﬀerent kind of symmetry, since adding formulas to either
side results in a weaker sequent, while removing them from either side gives a stronger one.
Some systems allow only one formula on the right.
Most proof systems provide ways to deduce one sequent from another. These rules are
written with a list of sequents above and below a line. This rule indicates that if everything
above the line is true, so is everything under the line. A typical rule is:
          Γ ⇒ Σ
  ─────────────────────
  Γ, α ⇒ Σ    α, Γ ⇒ Σ
This indicates that if we can deduce Σ from Γ, we can also deduce it from Γ together with
α.
Note that the capital Greek letters are usually used to denote a (possibly empty) list of formulas. [Γ, Σ] is used to denote the contraction of Γ and Σ, that is, the list of those formulas appearing in either Γ or Σ but with no repeats.
Version: 5 Owner: Henry Author(s): Henry
36.5 sound, complete
If T and P are two sets of facts (in particular, a theory of some language and the set of things provable by some method), we say P is sound for T if P ⊆ T. Typically we have a theory and a set of rules for constructing proofs, and we say the set of rules is sound (which theory is intended is usually clear from context) since everything it proves is true (in T).
If T ⊆ P we say P is complete for T. Again, we usually have a theory and a set of rules for constructing proofs, and say that the set of rules is complete since everything true (in T) can be proven.
Version: 4 Owner: Henry Author(s): Henry
Chapter 37
03F07 – Structure of proofs
37.1 induction
Induction is the name given to a certain kind of proof, and also to a (related) way of defining a function. For a proof, the statement to be proved has a suitably ordered set of cases. Some cases (usually one, but possibly zero or more than one) are proved separately, and the other cases are deduced from those. The deduction goes by contradiction, as we shall see. For a function, its domain is suitably ordered. The function is first defined on some (usually nonempty) subset of its domain, and is then defined at other points x in terms of its values at points y such that y < x.
37.1.1 Elementary proof by induction
Proof by induction is a variety of proof by contradiction, relying, in the elementary cases, on the fact that every nonempty set of natural numbers has a least element. Suppose we want to prove a statement P(n) which involves a natural number n. It is enough to prove:
1) If n ∈ ℕ, and P(m) is true for all m ∈ ℕ such that m < n, then P(n) is true.
or, what is the same thing,
2) If P(n) is false, then P(m) is false for some m < n.
To see why, assume that P(n) is false for some n. Then there is a smallest k ∈ ℕ such that P(k) is false. Then, by hypothesis, P(m) is true for all m < k. By (1), P(k) is true, which is a contradiction.
(If we don't regard induction as a kind of proof by contradiction, then we have to think of it as supplying some kind of sequence of proofs, of unlimited length. That's not very satisfactory, particularly for transfinite inductions, which we will get to below.)
Usually the initial case n = 0, and sometimes a few more cases, need to be proved separately, as in the following example. Write B_n = Σ_{k=0}^{n} k². We claim

B_n = n³/3 + n²/2 + n/6 for all n ∈ ℕ

Let us try to apply (1). We have the inductive hypothesis (as it is called)

B_m = m³/3 + m²/2 + m/6 for all m < n

which tells us something if n > 0. In particular, setting m = n − 1,

B_{n−1} = (n−1)³/3 + (n−1)²/2 + (n−1)/6

Now we just add n² to each side, and verify that the right side becomes n³/3 + n²/2 + n/6. This proves (1) for nonzero n. But if n = 0, the inductive hypothesis is vacuously true, but of no use. So we need to prove P(0) separately, which in this case is trivial.
Textbooks sometimes distinguish between weak and strong (or complete) inductive proofs. A proof that relies on the inductive hypothesis (1) is said to go by strong induction. But in the sum-of-squares formula above, we needed only the hypothesis P(n−1), not P(m) for all m < n. For another example, a proof about the Fibonacci sequence might use just P(n−2) and P(n−1). An argument using only P(n−1) is referred to as weak induction.
37.1.2 Deﬁnition of a function by induction
Let's begin with an example: the function ℕ → ℕ, n ↦ aⁿ, where a is some integer > 0. The inductive definition reads

a⁰ = 1
aⁿ = a · a^{n−1} for all n > 0

Formally, such a definition requires some justification, which runs roughly as follows. Let D be the set of m ∈ ℕ for which the following definition "has no problem":

a⁰ = 1
aⁿ = a · a^{n−1} for 0 < n ≤ m

We now have a finite sequence f_m on the interval [0, m], for each m ∈ D. We verify that any f_l and f_m have the same values throughout the intersection of their two domains. Thus we can define a single function on the union of the various domains. Now suppose D ≠ ℕ, and let k be the least element of ℕ − D. That means that the definition has a problem when m = k but not when m < k. We soon get a contradiction, so we deduce D = ℕ. That means that the union of those domains is all of ℕ, i.e. the function aⁿ is defined, unambiguously, throughout ℕ.
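The inductive definition above transcribes directly into a recursive function; a minimal Python sketch:

```python
def power(a, n):
    """a**n for n >= 0, computed exactly as the inductive definition reads."""
    if n == 0:
        return 1                    # base case: a^0 = 1
    return a * power(a, n - 1)      # inductive step: a^n = a * a^(n-1)

print(power(2, 10))  # 1024
```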
Another inductively deﬁned function is the Fibonacci sequence, q.v.
We have been speaking of the inductive deﬁnition of a function, rather than just a sequence
(a function on N), because the notions extend with little change to transﬁnite inductions.
An illustration par excellence of inductive proofs and deﬁnitions is Conway’s theory of
surreal numbers. The numbers and their algebraic laws of composition are deﬁned entirely
by inductions which have no special starting cases.
37.1.3 Minor variations of the method
The reader can figure out what is meant by "induction starting at k", where k is not necessarily zero. Likewise, the term "downward induction" is self-explanatory.
A common variation of the method is proof by induction on a function of the index n. Rather than spell it out formally, let me just give an example. Let n be a positive integer having no prime factors of the form 4k + 3. Then n = a² + b² for some integers a and b. The usual textbook proof uses induction on a function of n, namely the number of prime factors of n. The induction starts at 1 (i.e. either n = 2 or n is a prime of the form 4k + 1), which in this instance is the only part of the proof that is not quite easy.
37.1.4 Wellordered sets
An ordered set (S, ≤) is said to be well-ordered if any nonempty subset of S has a least element. The criterion (1), and its proof, hold without change for any well-ordered set S in place of ℕ (which is a well-ordered set). But notice that it won't be enough to prove that P(n) implies P(n + 1) (where n + 1 denotes the least element > n, if it exists). The reason is, given an element n, there may exist elements < n but no element k such that n = k + 1. Then the induction from k to k + 1 will fail to "reach" n. For more on this topic, look for "limit ordinals".
Informally, any variety of induction which works for ordered sets S in which a segment S_x = {y ∈ S | y < x} may be infinite is called "transfinite induction".
37.1.5 Noetherian induction
An ordered set S, or its order, is called Noetherian if any nonempty subset of S has a maximal element. Several equivalent definitions are possible, such as the "ascending chain condition": any strictly increasing sequence of elements of S is finite. The following result is easily proved by contradiction.
Principle of Noetherian induction: Let (S, ≤) be a set with a Noetherian order, and let F be a subset of S having this property: if x ∈ S is such that the condition y > x implies y ∈ F, then x ∈ F. Then F = S.
So, to prove something "P(x)" about every element x of a Noetherian set, it is enough to prove that "P(z) for all z > y" implies "P(y)". This time the induction is going downward, but of course that is only a matter of notation. The opposite of a Noetherian order, i.e. an order in which any strictly decreasing sequence is finite, is also in use; it is called a partial well-order, or an ordered set having no infinite antichain.
The standard example of a Noetherian ordered set is the set of ideals in a Noetherian ring. But the notion has various other uses, in topology as well as algebra. For a nontrivial example of a proof by Noetherian induction, look up the Hilbert basis theorem.
37.1.6 Inductive ordered sets
An ordered set (S, ≤) is said to be inductive if any totally ordered subset of S has an upper bound in S. Since the empty set is totally ordered, any inductive ordered set is nonempty. We have this important result:
Zorn's lemma: Any inductive ordered set has a maximal element.
Zorn's lemma is widely used in existence proofs, rather than in proofs of a property P(x) of an arbitrary element x of an ordered set. Let me sketch one typical application. We claim that every vector space has a basis. First, we prove that if a free subset F of a vector space V is a maximal free subset (with respect to the order relation ⊂), then it is a basis. Next, to see that the set of free subsets is inductive, it is enough to verify that the union of any totally ordered set of free subsets is free, because that union is an upper bound on the totally ordered set. Last, we apply Zorn's lemma to conclude that V has a maximal free subset.
Version: 10 Owner: Daume Author(s): Larry Hammick, slider142
Chapter 38
03F30 – First-order arithmetic and fragments
38.1 Elementary Functional Arithmetic
Elementary Functional Arithmetic, or EFA, is a weak theory of arithmetic created by removing induction from Peano arithmetic. Because it lacks induction, axioms defining exponentiation must be added.
• ∀x (x′ ≠ 0) (0 is the first number)
• ∀x, y (x′ = y′ → x = y) (the successor function is one-to-one)
• ∀x (x + 0 = x) (0 is the additive identity)
• ∀x, y (x + y′ = (x + y)′) (addition is the repeated application of the successor function)
• ∀x (x · 0 = 0)
• ∀x, y (x · y′ = x · y + x) (multiplication is repeated addition)
• ∀x ¬(x < 0) (0 is the smallest number)
• ∀x, y (x < y′ ↔ x < y ∨ x = y)
• ∀x (x⁰ = 1)
• ∀x, y (x^{y′} = x^y · x)
Version: 2 Owner: Henry Author(s): Henry
38.2 PA
Peano Arithmetic (PA) is the restriction of Peano's axioms to a first order theory of arithmetic. The only change is that the induction axiom is replaced by induction restricted to arithmetic formulas:

φ(0) ∧ ∀x(φ(x) → φ(x′)) → ∀x φ(x), where φ is arithmetical

Note that this replaces the single, second-order axiom of induction with a countably infinite schema of axioms.
Appropriate axioms defining +, ·, and < are included. A full list of the axioms of PA looks like this (although the exact list of axioms varies somewhat from source to source):
• ∀x (x′ ≠ 0) (0 is the first number)
• ∀x, y (x′ = y′ → x = y) (the successor function is one-to-one)
• ∀x (x + 0 = x) (0 is the additive identity)
• ∀x, y (x + y′ = (x + y)′) (addition is the repeated application of the successor function)
• ∀x (x · 0 = 0)
• ∀x, y (x · y′ = x · y + x) (multiplication is repeated addition)
• ∀x ¬(x < 0) (0 is the smallest number)
• ∀x, y (x < y′ ↔ x < y ∨ x = y)
• φ(0) ∧ ∀x(φ(x) → φ(x′)) → ∀x φ(x), where φ is arithmetical
Version: 7 Owner: Henry Author(s): Henry
38.3 Peano arithmetic
Peano’s axioms are a deﬁnition of the set of natural numbers, denoted N. From these
axioms Peano arithmetic on natural numbers can be derived.
1. 0 ∈ N (0 is a natural number)
2. For each r ∈ N, there exists exactly one r
t
∈ N, called the successor of r
3. r
t
= 0 (0 is not the successor of any natural number)
308
4. r = n if and only if r
t
= n
t
.
5. (axiom of induction) If ` ⊆ N and 0 ∈ ` and r ∈ ` implies r
t
∈ `, then ` = N.
The successor of r is sometimes denoted or instead of r
t
. We then have 1 = o0, 2 = o1 =
oo0, and so on.
Peano arithmetic consists of statements derived via these axioms. For instance, from these
axioms we can deﬁne addition and multiplication on natural numbers. Addition is deﬁned
as
r + 1 = r
t
for all r ∈ N
r + n
t
= (r + n)
t
for all r. n ∈ N
Addition deﬁned in this manner can then be proven to be both associative and commutative.
Multiplication is
r 1 = r for all r ∈ N
r n
t
= r n + r for all r. n ∈ N
This deﬁnition of multiplication can also be proven to be both associative and commutative,
and it can also be shown to be distributive over addition.
Version: 4 Owner: Henry Author(s): Henry, Logan
Chapter 39
03F35 – Second and higher-order arithmetic and fragments
39.1 ACA₀
ACA₀ is a weakened form of second order arithmetic. Its axioms include the axioms of PA together with arithmetic comprehension.
Version: 1 Owner: Henry Author(s): Henry
39.2 RCA₀
RCA₀ is a weakened form of second order arithmetic. It consists of the axioms of PA other than induction, together with Σ⁰₁-IND and ∆⁰₁-CA.
Version: 1 Owner: Henry Author(s): Henry
39.3 Z₂
Z₂ is the full system of second order arithmetic, that is, the full theory of numbers and sets of numbers. It is sufficient for a great deal of mathematics, including much of number theory and analysis.
The axioms defining successor, addition, multiplication, and comparison are the same as those of PA. Z₂ adds the full induction axiom and the full comprehension axiom.
Version: 1 Owner: Henry Author(s): Henry
39.4 comprehension axiom
The axiom of comprehension (CA) states that every formula defines a set. That is,

∃X ∀x (x ∈ X ↔ φ(x)) for any formula φ where X does not occur free in φ

The names specification and separation are sometimes used in place of comprehension, particularly for weakened forms of the axiom (see below).
In theories which make no distinction between objects and sets (such as ZF), this formulation leads to Russell's paradox; however, in stratified theories this is not a problem (for example, second order arithmetic includes the axiom of comprehension).
This axiom can be restricted in various ways. One possibility is to restrict it to forming subsets of sets:

∀Y ∃X ∀x (x ∈ X ↔ x ∈ Y ∧ φ(x)) for any formula φ where X does not occur free in φ

This formulation (used in ZF set theory) is sometimes called the Aussonderungsaxiom.
Another way is to restrict φ to some family F, giving the axiom F-CA. For instance the axiom Σ⁰₁-CA is:

∃X ∀x (x ∈ X ↔ φ(x)) where φ is Σ⁰₁ and X does not occur free in φ

A third form (usually called separation) uses two formulas, and guarantees only that those satisfying one are included while those satisfying the other are excluded. The unrestricted form is the same as unrestricted comprehension, but, for instance, Σ⁰₁ separation:

∀x ¬(φ(x) ∧ ψ(x)) → ∃X ∀x ((φ(x) → x ∈ X) ∧ (ψ(x) → x ∉ X))
  where φ and ψ are Σ⁰₁ and X does not occur free in φ or ψ

is weaker than Σ⁰₁-CA.
Version: 4 Owner: Henry Author(s): Henry
39.5 induction axiom
An induction axiom speciﬁes that a theory includes induction, possibly restricted to speciﬁc
formulas. IND is the general axiom of induction:
φ(0) ∧ ∀r(φ(r) →φ(r + 1)) →∀rφ(r) for any formula φ
311
If φ is restricted to some family of formulas 1 then the axiom is called FIND, or F induction.
For example the axiom Σ
0
1
IND is:
φ(0) ∧ ∀r(φ(r) →φ(r + 1)) →∀rφ(r) where φ is Σ
0
1
Version: 4 Owner: Henry Author(s): Henry
Chapter 40
03G05 – Boolean algebras
40.1 Boolean algebra
A Boolean algebra is a set B with two binary operators, ∧ ("meet") and ∨ ("join"), and one unary operator ′ ("complement"), which together form a Boolean lattice. If X and Y are Boolean algebras, a mapping f : X → Y is a morphism of Boolean algebras when it is a morphism of ∧, ∨, and ′.
Version: 6 Owner: greg Author(s): greg
40.2 M. H. Stone’s representation theorem
Theorem 3. Given a Boolean algebra B there exists a totally disconnected Hausdorff space X such that B is isomorphic to the Boolean algebra of clopen subsets of X.
[Very rough sketch of proof] Let

X = {f : B → {0, 1} | f is a homomorphism}

endowed with the subspace topology induced by the product topology on {0, 1}^B. Then X is a totally disconnected Hausdorff space. Let Cl(X) denote the Boolean algebra of clopen subsets of X; then the map

φ : B → Cl(X), φ(x) = {f ∈ X | f(x) = 1}

is well defined (i.e. φ(x) is indeed a clopen set), and an isomorphism.
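For a tiny finite example one can enumerate the homomorphisms into {0, 1} by brute force; a Python sketch on the four-element Boolean algebra of subsets of {1, 2} (the points found correspond to the two points of its Stone space):

```python
from itertools import combinations, product

U = frozenset({1, 2})
B = [frozenset(c) for r in range(3) for c in combinations(U, r)]  # P({1, 2})

def is_hom(f):
    # f preserves meet, join, and complement (hence also 0 and 1)
    return all(f[x & y] == f[x] & f[y] and
               f[x | y] == f[x] | f[y] and
               f[U - x] == 1 - f[x]
               for x in B for y in B)

homs = [f for vals in product([0, 1], repeat=len(B))
        for f in [dict(zip(B, vals))] if is_hom(f)]
print(len(homs))  # 2: one homomorphism per point of the Stone space

# phi(x) = {f | f(x) = 1} is injective here, so B embeds into Cl(X).
phi = {x: frozenset(i for i, f in enumerate(homs) if f[x] == 1) for x in B}
print(len(set(phi.values())) == len(B))  # True
```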
Version: 4 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 41
03G10 – Lattices and related
structures
41.1 Boolean lattice
A Boolean lattice B is a distributive lattice in which for each element x ∈ B there exists a complement x′ ∈ B such that

x ∧ x′ = 0
x ∨ x′ = 1
(x′)′ = x
(x ∧ y)′ = x′ ∨ y′
(x ∨ y)′ = x′ ∧ y′

Given a set, any collection of subsets that is closed under unions, intersections, and complements is a Boolean algebra.
Boolean rings (with identity, but allowing 0 = 1) are equivalent to Boolean lattices. To view a Boolean ring as a Boolean lattice, define x ∧ y = xy and x ∨ y = x + y + xy. To view a Boolean lattice as a Boolean ring, define xy = x ∧ y and x + y = (x′ ∧ y) ∨ (x ∧ y′).
Version: 3 Owner: mathcam Author(s): mathcam, greg
41.2 complete lattice
A complete lattice is a nonempty poset in which every nonempty subset has a supremum
and an inﬁmum.
In particular, a complete lattice is a lattice.
Version: 1 Owner: Evandar Author(s): Evandar
41.3 lattice
A lattice is any nonempty poset $L$ in which any two elements $x$ and $y$ have a least upper bound,
$x \vee y$, and a greatest lower bound, $x \wedge y$.

In other words, if $z = x \wedge y$ then $z \in L$, $z \leq x$ and $z \leq y$. Further, for all $w \in L$, if $w \leq x$ and
$w \leq y$, then $w \leq z$.

Likewise, if $z = x \vee y$ then $z \in L$, $x \leq z$ and $y \leq z$, and for all $w \in L$, if $x \leq w$ and $y \leq w$,
then $z \leq w$.

Since $L$ is a poset, the operations $\wedge$ and $\vee$ have the following properties:

$x \wedge x = x, \quad x \vee x = x$ (idempotency)
$x \wedge y = y \wedge x, \quad x \vee y = y \vee x$ (commutativity)
$x \wedge (y \wedge z) = (x \wedge y) \wedge z, \quad x \vee (y \vee z) = (x \vee y) \vee z$ (associativity)
$x \wedge (x \vee y) = x \vee (x \wedge y) = x$ (absorption)

Further, $x \leq y$ is equivalent to:

$x \wedge y = x$ and $x \vee y = y$ (consistency)
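As a concrete illustration (our own example, not part of the entry), the positive divisors of 60 ordered by divisibility form a lattice with meet = gcd and join = lcm, and the absorption and consistency laws can be checked directly:

```python
from math import gcd

# Illustrative sketch: divisors of 60 under divisibility form a lattice.
def lcm(a, b):
    return a * b // gcd(a, b)

divisors = [d for d in range(1, 61) if 60 % d == 0]

for x in divisors:
    for y in divisors:
        # absorption: x meet (x join y) = x join (x meet y) = x
        assert gcd(x, lcm(x, y)) == lcm(x, gcd(x, y)) == x
        # consistency: x <= y (here: x divides y) iff x meet y = x iff x join y = y
        assert (y % x == 0) == (gcd(x, y) == x) == (lcm(x, y) == y)
```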
Version: 5 Owner: mps Author(s): mps, greg
Chapter 42
03G99 – Miscellaneous
42.1 Chu space
A Chu space over a set $\Sigma$ is a triple $(A, r, X)$ with $r : A \times X \rightarrow \Sigma$. $A$ is called the carrier
and $X$ the cocarrier.

Although the definition is symmetrical, in practice asymmetric uses are common. In particular,
often $X$ is just taken to be a set of functions from $A$ to $\Sigma$, with $r(a, x) = x(a)$ (such a
Chu space is called normal and is abbreviated $(A, X)$).

We define the perp of a Chu space $C = (A, r, X)$ to be $C^{\perp} = (X, r^{\perp}, A)$ where $r^{\perp}(x, a) = r(a, x)$.

Define $\hat{r}$ and $\check{r}$ to be functions defining the rows and columns of $C$ respectively, so that
$\hat{r}(a) : X \rightarrow \Sigma$ and $\check{r}(x) : A \rightarrow \Sigma$ are given by $\hat{r}(a)(x) = \check{r}(x)(a) = r(a, x)$. Clearly the rows
of $C$ are the columns of $C^{\perp}$.

Using these definitions, a Chu space can be represented as a matrix.

If $\hat{r}$ is injective then we call $C$ separable, and if $\check{r}$ is injective we call $C$ extensional. A Chu
space which is both separable and extensional is biextensional.
Version: 3 Owner: Henry Author(s): Henry
42.2 Chu transform
If $C = (A, r, X)$ and $D = (B, s, Y)$ are Chu spaces then we say a pair of functions $f : A \rightarrow B$
and $g : Y \rightarrow X$ form a Chu transform from $C$ to $D$ if for any $(a, y) \in A \times Y$ we have
$r(a, g(y)) = s(f(a), y)$.
Version: 1 Owner: Henry Author(s): Henry
42.3 biextensional collapse
If $C = (A, r, X)$ is a Chu space, we can define the biextensional collapse of $C$ to be
$(\hat{r}[A], r', \check{r}[X])$ where $r'(\hat{r}(a), \check{r}(x)) = r(a, x)$.
That is, to name the rows of the biextensional collapse, we just use functions representing
the actual rows of the original Chu space (and similarly for the columns). The eﬀect is to
merge indistinguishable rows and columns.
We say that two Chu spaces are equivalent if their biextensional collapses are isomorphic.
Version: 3 Owner: Henry Author(s): Henry
42.4 example of Chu space
Any set $A$ can be represented as a Chu space over $\{0, 1\}$ by $(A, r, \mathcal{P}(A))$ with $r(a, X) = 1$
iff $a \in X$. This Chu space satisfies only the trivial property $2^A$, signifying the fact that sets
have no internal structure. If $A = \{a, b, c\}$ then the matrix representation is:

      {}  {a} {b} {c} {a,b} {a,c} {b,c} {a,b,c}
  a    0   1   0   0    1     1     0      1
  b    0   0   1   0    1     0     1      1
  c    0   0   0   1    0     1     1      1
Increasing the structure of a Chu space, that is, adding properties, is equivalent to deleting
columns. For instance we can delete the columns named $\{c\}$ and $\{b, c\}$ to turn this into
the partial order satisfying $c < a$. By deleting more columns, we can further increase the
structure. For example, if we require that the set of rows be closed under the bitwise or
operation (and delete those columns which would prevent this) then it will define a
semilattice, and if it is closed under both bitwise or and bitwise and then it will define a
lattice. If the rows are also closed under complementation then we have a Boolean algebra.
Note that these are not arbitrary connections: the Chu transforms on each of these classes
of Chu spaces correspond to the appropriate notion of homomorphism for those classes.

For instance, to see that Chu transforms are order preserving on Chu spaces viewed as partial
orders, let $C = (A, r, X)$ be a Chu space satisfying $b < a$. That is, for any $x \in X$ we have
$r(b, x) = 1 \rightarrow r(a, x) = 1$. Then let $(f, g)$ be a Chu transform to $D = (B, s, Y)$, and suppose
$s(f(b), y) = 1$. Then $r(b, g(y)) = 1$ by the definition of a Chu transform, and then we have
$r(a, g(y)) = 1$ and so $s(f(a), y) = 1$, demonstrating that $f(b) < f(a)$.
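The matrix of this Chu space, and the effect of deleting the columns $\{c\}$ and $\{b, c\}$, can be reproduced mechanically. The sketch below uses our own (hypothetical) names throughout:

```python
from itertools import combinations

# Build the Chu space (A, r, P(A)) for A = {a, b, c}: columns are all subsets,
# and r(row, column) = 1 iff the row element lies in the column subset.
A = ['a', 'b', 'c']
columns = [set(c) for n in range(len(A) + 1) for c in combinations(A, n)]

matrix = {row: [1 if row in col else 0 for col in columns] for row in A}

assert len(columns) == 8
assert sum(matrix['a']) == 4   # 'a' lies in exactly half of the 8 subsets

# Deleting the columns that contain c but not a enforces the order c < a:
poset_cols = [c for c in columns if not ('c' in c and 'a' not in c)]
assert all('a' in c for c in poset_cols if 'c' in c)
```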
Version: 2 Owner: Henry Author(s): Henry
42.5 property of a Chu space
A property of a Chu space over $\Sigma$ with carrier $A$ is some $P \subseteq \Sigma^A$. We say that a Chu
space $C = (A, r, X)$ satisfies $P$ if $X \subseteq P$ (viewing the columns of $C$ as functions from $A$ to $\Sigma$).

For example, every Chu space satisfies the property $\Sigma^A$.
Version: 2 Owner: Henry Author(s): Henry
Chapter 43
05-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)
43.1 example of pigeonhole principle
A simple example.

For any group of 8 integers, there exist at least two of them whose difference is divisible by
7.

Consider the residue classes modulo 7. These are $0, 1, 2, 3, 4, 5, 6$. We have seven classes
and eight integers, so it must be the case that two integers fall in the same residue class, and
therefore their difference will be divisible by 7.
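The argument can be sanity-checked by brute force; the sketch below (our own, not part of the entry) tests the claim on random samples of 8 integers:

```python
from itertools import combinations
from random import randint, seed

# Check the pigeonhole claim: among any 8 integers, some pair has a
# difference divisible by 7 (two of the 8 share one of the 7 residue classes).
seed(0)
for _ in range(100):
    nums = [randint(-1000, 1000) for _ in range(8)]
    assert any((a - b) % 7 == 0 for a, b in combinations(nums, 2))
```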
Version: 1 Owner: drini Author(s): drini
43.2 multiindex derivative of a power
Theorem If $i, k$ are multi-indices in $\mathbb{N}^n$, and $x = (x_1, \ldots, x_n)$, then

$\partial^i x^k = \begin{cases} \frac{k!}{(k-i)!}\, x^{k-i} & \text{if } i \leq k, \\ 0 & \text{otherwise.} \end{cases}$

Proof. The proof follows from the corresponding rule for the ordinary derivative; if $i, k$ are
in $\{0, 1, 2, \ldots\}$, then

$\frac{d^i}{dx^i} x^k = \begin{cases} \frac{k!}{(k-i)!}\, x^{k-i} & \text{if } i \leq k, \\ 0 & \text{otherwise.} \end{cases} \qquad (43.2.1)$

Suppose $i = (i_1, \ldots, i_n)$, $k = (k_1, \ldots, k_n)$, and $x = (x_1, \ldots, x_n)$. Then we have that

$\partial^i x^k = \frac{\partial^{|i|}}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}\, x_1^{k_1} \cdots x_n^{k_n} = \left( \frac{\partial^{i_1}}{\partial x_1^{i_1}} x_1^{k_1} \right) \cdots \left( \frac{\partial^{i_n}}{\partial x_n^{i_n}} x_n^{k_n} \right).$

For each $r = 1, \ldots, n$, the function $x_r^{k_r}$ only depends on $x_r$. In the above, each partial
differentiation $\partial/\partial x_r$ therefore reduces to the corresponding ordinary differentiation $d/dx_r$.
Hence, from equation 43.2.1, it follows that $\partial^i x^k$ vanishes if $i_r > k_r$ for any $r = 1, \ldots, n$. If
this is not the case, i.e., if $i \leq k$ as multi-indices, then for each $r$,

$\frac{d^{i_r}}{dx_r^{i_r}} x_r^{k_r} = \frac{k_r!}{(k_r - i_r)!}\, x_r^{k_r - i_r},$

and the theorem follows. $\Box$
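The coefficient $k!/(k-i)!$ in the theorem can be computed componentwise, exactly as the proof suggests. A small sketch (all names are ours):

```python
from math import factorial, prod

# Sketch of the theorem: for multi-indices i <= k,
#   d^i x^k = k!/(k-i)! * x^(k-i),  and 0 otherwise.
# Here we compute only the coefficient k!/(k-i)!, componentwise.
def multi_factorial(k):
    return prod(factorial(kr) for kr in k)

def deriv_coefficient(i, k):
    """Coefficient of x^(k-i) in d^i x^k; 0 when i <= k fails."""
    if any(ir > kr for ir, kr in zip(i, k)):
        return 0
    diff = tuple(kr - ir for ir, kr in zip(i, k))
    return multi_factorial(k) // multi_factorial(diff)

# d^(1,2) applied to x1^2 * x2^3: coefficient is (2!/1!) * (3!/1!) = 12
assert deriv_coefficient((1, 2), (2, 3)) == 12
assert deriv_coefficient((3, 0), (2, 3)) == 0
```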
Version: 4 Owner: matte Author(s): matte
43.3 multiindex notation
Definition [1, 2, 3] A multi-index is an $n$-tuple $(i_1, \ldots, i_n)$ of nonnegative integers $i_1, \ldots, i_n$.
In other words, $i \in \mathbb{N}^n$. Usually, $n$ is the dimension of the underlying space; when
dealing with multi-indices, it is therefore assumed clear from the context.

Operations on multi-indices

For a multi-index $i$, we define the length (or order) as

$|i| = i_1 + \cdots + i_n,$

and the factorial as

$i! = \prod_{k=1}^{n} i_k!.$

If $i = (i_1, \ldots, i_n)$ and $j = (j_1, \ldots, j_n)$ are two multi-indices, their sum and difference are
defined componentwise as

$i + j = (i_1 + j_1, \ldots, i_n + j_n),$
$i - j = (i_1 - j_1, \ldots, i_n - j_n).$

Thus $|i \pm j| = |i| \pm |j|$. Also, if $j_k \leq i_k$ for all $k = 1, \ldots, n$, then we write $j \leq i$. For
multi-indices $i, j$ with $j \leq i$, we define

$\binom{i}{j} = \frac{i!}{(i-j)!\, j!}.$

For a point $x = (x_1, \ldots, x_n)$ in $\mathbb{R}^n$ (with standard coordinates) we define

$x^i = \prod_{k=1}^{n} x_k^{i_k}.$

Also, if $f : \mathbb{R}^n \rightarrow \mathbb{R}$ is a smooth function, and $i = (i_1, \ldots, i_n)$ is a multi-index, we define

$\partial^i f = \partial^{i_1}_{e_1} \cdots \partial^{i_n}_{e_n} f,$

where $e_1, \ldots, e_n$ are the standard unit vectors of $\mathbb{R}^n$. Since $f$ is sufficiently smooth, the order
in which the derivations are performed is irrelevant. For multi-indices $i$ and $j$, we thus have

$\partial^i \partial^j = \partial^{i+j} = \partial^{j+i} = \partial^j \partial^i.$

Much of the motivation for the above notation is that standard results such as Leibniz' rule,
Taylor's formula, etc. can be written more or less as-is in many dimensions by replacing
indices in $\mathbb{N}$ with multi-indices. Below are some examples of this.

Examples

1. If $n$ is a positive integer, and $x_1, \ldots, x_k$ are complex numbers, the multinomial expansion
states that

$(x_1 + \cdots + x_k)^n = n! \sum_{|i| = n} \frac{x^i}{i!},$

where $x = (x_1, \ldots, x_k)$ and $i$ is a multi-index. (proof)

2. Leibniz' rule [1]: If $f, g : \mathbb{R}^n \rightarrow \mathbb{R}$ are smooth functions, and $j$ is a multi-index, then

$\partial^j (fg) = \sum_{i \leq j} \binom{j}{i}\, \partial^i(f)\, \partial^{j-i}(g),$

where $i$ is a multi-index.
REFERENCES
1. http://www.math.umn.edu/~jodeit/course/TmprDist1.pdf
2. M. Reed, B. Simon, Methods of Mathematical Physics, I: Functional Analysis, Academic
Press, 1980.
3. E. Weisstein, Eric W. Weisstein's World of Mathematics, entry on Multi-Index Notation.
Version: 8 Owner: matte Author(s): matte
Chapter 44
05A10 – Factorials, binomial
coeﬃcients, combinatorial functions
44.1 Catalan numbers
The Catalan numbers, or Catalan sequence, have many interesting applications in combinatorics.
The $n$th Catalan number is given by:

$C_n = \frac{1}{n+1} \binom{2n}{n},$

where $\binom{n}{r}$ represents the binomial coefficient. The first several Catalan numbers are 1, 1, 2,
5, 14, 42, 132, 429, 1430, 4862, . . . (see EIS sequence A000108 for more terms). The Catalan
numbers are also generated by the recurrence relation

$C_0 = 1, \quad C_n = \sum_{i=0}^{n-1} C_i C_{n-1-i}.$

For example, $C_3 = 1 \cdot 2 + 1 \cdot 1 + 2 \cdot 1 = 5$, $C_4 = 1 \cdot 5 + 1 \cdot 2 + 2 \cdot 1 + 5 \cdot 1 = 14$, etc.

The ordinary generating function for the Catalan numbers is

$\sum_{n=0}^{\infty} C_n z^n = \frac{1 - \sqrt{1 - 4z}}{2z}.$
Interpretations of the $n$th Catalan number include:

1. The number of ways to arrange $n$ pairs of matching parentheses, e.g.:

()
(()) ()()
((())) (()()) ()(()) (())() ()()()

2. The number of ways a polygon of $n + 2$ sides can be split into $n$ triangles.

3. The number of rooted binary trees with exactly $n + 1$ leaves.
The Catalan sequence is named for Eugène Charles Catalan, but it was discovered in 1751
by Euler when he was trying to solve the problem of subdividing polygons into triangles.
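The closed form, the recurrence, and the listed initial values should all agree; the following sketch (our own code, not part of the entry) cross-checks them:

```python
from math import comb

# Compute Catalan numbers by the recurrence C_0 = 1,
# C_n = sum_{i=0}^{n-1} C_i * C_{n-1-i}, then compare with the closed form.
def catalan_recurrence(n_max):
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c

cs = catalan_recurrence(9)
assert cs == [comb(2 * n, n) // (n + 1) for n in range(10)]
assert cs == [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```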
REFERENCES
1. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics. Addison
Wesley, 1998. Zbl 0836.00001.
Version: 3 Owner: bbukh Author(s): bbukh, vampyr
44.2 LeviCivita permutation symbol
Definition. Let $k_i \in \{1, \ldots, n\}$ for all $i = 1, \ldots, n$. The Levi-Civita permutation
symbols $\varepsilon_{k_1 \cdots k_n}$ and $\varepsilon^{k_1 \cdots k_n}$ are defined as

$\varepsilon_{k_1 \cdots k_n} = \varepsilon^{k_1 \cdots k_n} = \begin{cases} +1 & \text{when } \{l \mapsto k_l\} \text{ is an even permutation (of } \{1, \ldots, n\}), \\ -1 & \text{when } \{l \mapsto k_l\} \text{ is an odd permutation,} \\ 0 & \text{otherwise, i.e., when } k_i = k_j \text{ for some } i \neq j. \end{cases}$

The Levi-Civita permutation symbol is a special case of the generalized Kronecker delta symbol.
Using this fact one can write the Levi-Civita permutation symbol as the determinant of an
$n \times n$ matrix consisting of traditional delta symbols. See the entry on the generalized Kronecker
symbol for details.

When using the Levi-Civita permutation symbol and the generalized Kronecker delta symbol,
the Einstein summation convention is usually employed. In the below, we shall also use this
convention.

Properties
• When $n = 2$, we have for all $i, j, m, n$ in $\{1, 2\}$:

$\varepsilon_{ij}\varepsilon^{mn} = \delta_i^m \delta_j^n - \delta_i^n \delta_j^m, \qquad (44.2.1)$
$\varepsilon_{ij}\varepsilon^{in} = \delta_j^n, \qquad (44.2.2)$
$\varepsilon_{ij}\varepsilon^{ij} = 2. \qquad (44.2.3)$

• When $n = 3$, we have for all $i, j, k, m, n$ in $\{1, 2, 3\}$:

$\varepsilon_{jmn}\varepsilon^{imn} = 2\delta_j^i, \qquad (44.2.4)$
$\varepsilon_{ijk}\varepsilon^{ijk} = 6. \qquad (44.2.5)$

Let us prove these properties. The proofs are instructional since they demonstrate typical
argumentation methods for manipulating the permutation symbols.

Proof. For equation 44.2.1, let us first note that both sides are antisymmetric with respect
to $ij$ and $mn$. We therefore only need to consider the case $i \neq j$ and $m \neq n$. By substitution,
we see that the equation holds for $\varepsilon_{12}\varepsilon^{12}$, i.e., for $i = m = 1$ and $j = n = 2$. (Both sides are
then one.) Since the equation is antisymmetric in $ij$ and $mn$, any set of values for these
can be reduced to the above case (which holds). The equation thus holds for all values of
$ij$ and $mn$. Using equation 44.2.1, we have for equation 44.2.2

$\varepsilon_{ij}\varepsilon^{in} = \delta_i^i \delta_j^n - \delta_i^n \delta_j^i = 2\delta_j^n - \delta_j^n = \delta_j^n.$

Here we used the Einstein summation convention with $i$ going from 1 to 2. Equation 44.2.3
follows similarly from equation 44.2.2. To establish equation 44.2.4, let us first observe that
both sides vanish when $i \neq j$. Indeed, if $i \neq j$, then one can not choose $m$ and $n$ such
that both permutation symbols on the left are nonzero. Then, with $i = j$ fixed, there are
only two ways to choose $m$ and $n$ from the remaining two indices. For any such indices,
we have $\varepsilon_{jmn}\varepsilon^{imn} = (\varepsilon^{imn})^2 = 1$ (no summation), and the result follows. The last property
follows since $3! = 6$ and for any distinct indices $i, j, k$ in $\{1, 2, 3\}$, we have $\varepsilon_{ijk}\varepsilon^{ijk} = 1$ (no
summation). $\Box$
Examples.

• The determinant of an $n \times n$ matrix $A = (a_{ij})$ can be written as

$\det A = \varepsilon^{i_1 \cdots i_n} a_{1 i_1} \cdots a_{n i_n},$

where each $i_l$ should be summed over $1, \ldots, n$.

• If $A = (A_1, A_2, A_3)$ and $B = (B_1, B_2, B_3)$ are vectors in $\mathbb{R}^3$ (represented in some right-hand
oriented orthonormal basis), then the $i$th component of their cross product equals

$(A \times B)_i = \varepsilon_{ijk} A_j B_k.$

For instance, the first component of $A \times B$ is $A_2 B_3 - A_3 B_2$. From the above expression
for the cross product, it is clear that $A \times B = -B \times A$. Further, if $C = (C_1, C_2, C_3)$
is a vector like $A$ and $B$, then the triple scalar product equals

$A \cdot (B \times C) = \varepsilon_{ijk} A_i B_j C_k.$

From this expression, it can be seen that the triple scalar product is antisymmetric
when exchanging any adjacent arguments. For example, $A \cdot (B \times C) = -B \cdot (A \times C)$.

• Suppose $F = (F_1, F_2, F_3)$ is a vector field defined on some domain of $\mathbb{R}^3$ with Cartesian
coordinates $x = (x_1, x_2, x_3)$. Then the $i$th component of the curl of $F$ equals

$(\nabla \times F)_i(x) = \varepsilon_{ijk} \frac{\partial}{\partial x_j} F_k(x).$
Version: 7 Owner: matte Author(s): matte
44.3 Pascal’s rule (bit string proof )
This proof is based on an alternate, but equivalent, definition of the binomial coefficient:
$\binom{n}{r}$ is the number of bit strings (finite sequences of 0s and 1s) of length $n$ with exactly $r$ ones.

We want to show that

$\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}.$

To do so, we will show that both sides of the equation are counting the same set of bit
strings.

The left-hand side counts the set of strings of $n$ bits with $r$ 1s. Suppose we take one of these
strings and remove the first bit $b$. There are two cases: either $b = 1$, or $b = 0$.

If $b = 1$, then the new string is $n - 1$ bits with $r - 1$ ones; there are $\binom{n-1}{r-1}$ bit strings of this
nature.

If $b = 0$, then the new string is $n - 1$ bits with $r$ ones, and there are $\binom{n-1}{r}$ strings of this
nature.

Therefore every string counted on the left is covered by one, but not both, of these two cases.
If we add the two cases, we find that

$\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}.$
Version: 2 Owner: vampyr Author(s): vampyr
44.4 Pascal’s rule proof
We need to show

$\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}.$

Let us begin by writing the left-hand side as

$\frac{n!}{k!(n-k)!} + \frac{n!}{(k-1)!(n-(k-1))!}.$

Getting a common denominator and simplifying, we have

$\frac{n!}{k!(n-k)!} + \frac{n!}{(k-1)!(n-k+1)!} = \frac{(n-k+1)\,n!}{(n-k+1)\,k!(n-k)!} + \frac{k\,n!}{k\,(k-1)!(n-k+1)!}$
$= \frac{(n-k+1)\,n! + k\,n!}{k!(n-k+1)!} = \frac{(n+1)\,n!}{k!((n+1)-k)!} = \frac{(n+1)!}{k!((n+1)-k)!} = \binom{n+1}{k}.$
Version: 5 Owner: akrowne Author(s): akrowne
44.5 Pascal’s triangle
Pascal’s triangle is the following conﬁguration of numbers:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
...
326
This triangle goes on to infinity; we have only printed the first 8 rows. In general,
this triangle is constructed such that the entries on the left side and right side are 1, and every
entry inside the triangle is obtained by summing the two entries immediately above it. For
instance, on the fourth row, $4 = 1 + 3$.

Historically, the application of this triangle has been to give the coefficients when expanding
binomial expressions. For instance, to expand $(a + b)^4$, one simply looks up the coefficients
on the fourth row, and writes

$(a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4.$
Pascal's triangle is named after the French mathematician Blaise Pascal (1623-1662) [3].
However, this triangle was known at least around 1100 AD in China, five centuries before
Pascal [1]. In modern language, the expansion of the binomial is given by the binomial theorem
discovered by Isaac Newton in 1665 [2]: For any $n = 1, 2, \ldots$ and real numbers $a, b$, we have

$(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + b^n.$

Thus, in Pascal's triangle, the entries on the $n$th row are given by the binomial coefficients

$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$

for $k = 0, 1, \ldots, n$.
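The summing rule described above translates directly into code. A short sketch (our own, not from the entry):

```python
# Generate the first n rows of Pascal's triangle: each row starts and ends
# with 1, and each inner entry is the sum of the two entries above it.
def pascal_rows(n):
    rows = [[1]]
    for _ in range(n - 1):
        prev = rows[-1]
        rows.append([1] + [prev[i] + prev[i + 1] for i in range(len(prev) - 1)] + [1])
    return rows

rows = pascal_rows(8)
assert rows[4] == [1, 4, 6, 4, 1]               # coefficients of (a + b)^4
assert rows[7] == [1, 7, 21, 35, 35, 21, 7, 1]  # the 8th printed line above
```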
REFERENCES
1. Wikipedia’s entry on the binomial coeﬃcients
2. Wikipedia’s entry on Isaac Newton
3. Wikipedia’s entry on Blaise Pascal
Version: 1 Owner: Koro Author(s): matte
44.6 Upper and lower bounds to binomial coeﬃcient
$\binom{n}{k} \leq \frac{n^k}{k!}$

$\binom{n}{k} \leq \left( \frac{n e}{k} \right)^k$

$\binom{n}{k} \geq \left( \frac{n}{k} \right)^k$

Also, for large $n$:

$\binom{n}{k} \approx \frac{n^k}{k!}.$
Version: 1 Owner: gantsich Author(s): gantsich
44.7 binomial coeﬃcient
The number of ways to choose $r$ objects from a set with $n$ elements ($n \geq r$) is given by

$\frac{n!}{(n-r)!\,r!}.$

It is usually denoted in several ways, like

$\binom{n}{r}, \quad C(n, r), \quad C^n_r.$

These numbers are called binomial coefficients, because they show up when expanding $(x + y)^n$.
Some interesting properties:

• $\binom{n}{r}$ is the coefficient of $x^r y^{n-r}$ in $(x + y)^n$ (binomial theorem).

• $\binom{n}{r} = \binom{n}{n-r}$.

• $\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r}$ (Pascal's rule).

• $\binom{n}{0} = 1 = \binom{n}{n}$ for all $n$.

• $\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n$.

• $\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n} = 0$.

• $\sum_{t=1}^{n} \binom{t}{k} = \binom{n+1}{k+1}$.
In the context of computer science, it also helps to see $\binom{n}{r}$ as the number of strings
consisting of ones and zeros with $r$ ones and $n - r$ zeros. This equivalence comes from the
fact that if $S$ is a finite set with $n$ elements, $\binom{n}{r}$ is the number of distinct subsets of $S$ with
$r$ elements. For each subset $T$ of $S$, consider the function

$\chi_T : S \rightarrow \{0, 1\}$

where $\chi_T(x) = 1$ whenever $x \in T$ and 0 otherwise (so $\chi_T$ is the characteristic function for
$T$). For each $T \in \mathcal{P}(S)$, $\chi_T$ can be used to produce a unique bit string of length $n$ with
exactly $r$ ones.
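The factorial formula and the bit-string interpretation can be cross-checked directly (our own sketch, not part of the entry):

```python
from itertools import product
from math import comb, factorial

# Count length-n bit strings with exactly r ones, and compare against
# the formula n!/((n-r)! r!) for a small sample case.
n, r = 6, 2
formula = factorial(n) // (factorial(n - r) * factorial(r))
bitstrings = sum(1 for bits in product([0, 1], repeat=n) if sum(bits) == r)
assert formula == bitstrings == comb(n, r) == 15
```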
Version: 14 Owner: drini Author(s): drini
44.8 double factorial
The double factorial of a positive integer $n$ is

$n!! = n(n-2) \cdots k_n,$

where $k_n$ denotes 1 if $n$ is odd and 2 if $n$ is even.

For example,

$7!! = 7 \cdot 5 \cdot 3 \cdot 1 = 105$
$10!! = 10 \cdot 8 \cdot 6 \cdot 4 \cdot 2 = 3840$
Note that $n!!$ is not the same as $(n!)!$.
Version: 3 Owner: drini Author(s): Larry Hammick, Riemann
44.9 factorial
For any nonnegative integer $n$, the factorial of $n$, denoted $n!$, can be defined by

$n! = \prod_{r=1}^{n} r,$

where for $n = 0$ the empty product is taken to be 1.

Alternatively, the factorial can be defined recursively by $0! = 1$ and $n! = n(n-1)!$ for $n > 0$.

$n!$ is equal to the number of permutations of $n$ distinct objects. For example, there are $5!$
ways to arrange the five letters A, B, C, D and E into a word.

Euler's gamma function $\Gamma(x)$ generalizes the notion of factorial to almost all complex values,
as

$\Gamma(n + 1) = n!$

for every nonnegative integer $n$.
Version: 13 Owner: yark Author(s): yark, Riemann
44.10 falling factorial
For $n \in \mathbb{N}$, the rising and falling factorials are $n$th degree polynomials described, respectively,
by

$x^{\overline{n}} = x(x+1) \ldots (x+n-1)$
$x^{\underline{n}} = x(x-1) \ldots (x-n+1)$

The two types of polynomials are related by:

$x^{\overline{n}} = (-1)^n (-x)^{\underline{n}}.$

The rising factorial is often written as $(x)_n$, and referred to as the Pochhammer symbol (see
hypergeometric series). Unfortunately, the falling factorial is also often denoted by $(x)_n$, so
great care must be taken when encountering this notation.
Notes.

Unfortunately, the notational conventions for the rising and falling factorials lack a common
standard, and are plagued with a fundamental inconsistency. An examination of reference
works and textbooks reveals two fundamental sources of notation: works in combinatorics
and works dealing with hypergeometric functions.

Works of combinatorics [1,2,3] give greater focus to the falling factorial because of its role
in defining the Stirling numbers. The symbol $(x)_n$ almost always denotes the falling factorial.
The notation for the rising factorial varies widely; we find $\langle x \rangle_n$ in [1] and $(x)^{(n)}$ in [3].

Works focusing on special functions [4,5] universally use $(x)_n$ to denote the rising factorial and
use this symbol in the description of the various flavours of hypergeometric series. Watson [5]
credits this notation to Pochhammer [6], and indeed the special functions literature eschews
"falling factorial" in favour of "Pochhammer symbol". Curiously, according to Knuth [7],
Pochhammer himself used $(x)_n$ to denote the binomial coefficient. (Note: I haven't verified
this.)

The notation featured in this entry is due to D. Knuth [7,8]. Given the fundamental
inconsistency in the existing notations, it seems sensible to break with both traditions, and
to adopt new and graphically suggestive notation for these two concepts. The traditional
notation, especially in the hypergeometric camp, is so deeply entrenched that, realistically,
one needs to be familiar with the traditional modes and to take care when encountering the
symbol $(x)_n$.
References

1. Comtet, Advanced combinatorics.
2. Jordan, Calculus of finite differences.
3. Riordan, Introduction to combinatorial analysis.
4. Erdélyi et al., Bateman manuscript project.
5. Watson, A treatise on the theory of Bessel functions.
6. Pochhammer, "Ueber hypergeometrische Functionen $n$ter Ordnung," Journal für die
reine und angewandte Mathematik 71 (1870), 316-352.
7. Knuth, "Two notes on notation" download
8. Greene, Knuth, Mathematics for the analysis of algorithms.
Version: 7 Owner: rmilson Author(s): rmilson
44.11 inductive proof of binomial theorem
When $n = 1$,

$(a + b)^1 = \sum_{k=0}^{1} \binom{1}{k} a^{1-k} b^k = \binom{1}{0} a^1 b^0 + \binom{1}{1} a^0 b^1 = a + b.$

For the inductive step, assume it holds for $m$. Then for $n = m + 1$,

$(a + b)^{m+1} = a(a + b)^m + b(a + b)^m$
$= a \sum_{k=0}^{m} \binom{m}{k} a^{m-k} b^k + b \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^j$  (by the inductive hypothesis)
$= \sum_{k=0}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^{j+1}$  (by multiplying through by $a$ and $b$)
$= a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^{j+1}$  (by pulling out the $k = 0$ term)
$= a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{k=1}^{m+1} \binom{m}{k-1} a^{m-k+1} b^k$  (by letting $j = k - 1$)
$= a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{k=1}^{m} \binom{m}{k-1} a^{m+1-k} b^k + b^{m+1}$  (by pulling out the $k = m + 1$ term)
$= a^{m+1} + b^{m+1} + \sum_{k=1}^{m} \left[ \binom{m}{k} + \binom{m}{k-1} \right] a^{m+1-k} b^k$  (by combining the sums)
$= a^{m+1} + b^{m+1} + \sum_{k=1}^{m} \binom{m+1}{k} a^{m+1-k} b^k$  (from Pascal's rule)
$= \sum_{k=0}^{m+1} \binom{m+1}{k} a^{m+1-k} b^k$  (by adding in the $a^{m+1}$ and $b^{m+1}$ terms),

as desired.
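Although the proof above is purely algebraic, a numeric spot check is reassuring (our own sketch, not part of the proof):

```python
from math import comb

# Verify (a + b)^n == sum_k C(n, k) a^(n-k) b^k for a few sample values.
for a, b, n in [(2, 3, 5), (1, -1, 7), (10, 1, 4)]:
    assert (a + b) ** n == sum(comb(n, k) * a ** (n - k) * b ** k
                               for k in range(n + 1))
```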
Version: 5 Owner: KimJ Author(s): KimJ
44.12 multinomial theorem
A multinomial is a mathematical expression consisting of two or more terms, e.g.

$a_1 x_1 + a_2 x_2 + \ldots + a_k x_k.$

The multinomial theorem provides the general form of the expansion of the powers of this
expression, in the process specifying the multinomial coefficients which are found in that
expansion. The expansion is:

$(x_1 + x_2 + \ldots + x_k)^n = \sum \frac{n!}{n_1!\, n_2! \cdots n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} \qquad (44.12.1)$

where the sum is taken over all multi-indices $(n_1, \ldots, n_k) \in \mathbb{N}^k$ that sum to $n$.

The expression $\frac{n!}{n_1!\, n_2! \cdots n_k!}$ occurring in the expansion is called a multinomial coefficient and
is denoted by

$\binom{n}{n_1, n_2, \ldots, n_k}.$
Version: 7 Owner: bshanks Author(s): yark, bbukh, rmilson, bshanks
44.13 multinomial theorem (proof )
Proof. The below proof of the multinomial theorem uses the binomial theorem and induction
on $k$. In addition, we shall use multi-index notation.

First, for $k = 1$, both sides equal $x_1^n$. For the induction step, suppose the multinomial
theorem holds for $k$. Then the binomial theorem and the induction assumption yield

$(x_1 + \cdots + x_k + x_{k+1})^n = \sum_{l=0}^{n} \binom{n}{l} (x_1 + \cdots + x_k)^l\, x_{k+1}^{n-l} = \sum_{l=0}^{n} \binom{n}{l}\, l! \sum_{|i|=l} \frac{x^i}{i!}\, x_{k+1}^{n-l} = n! \sum_{l=0}^{n} \sum_{|i|=l} \frac{x^i\, x_{k+1}^{n-l}}{i!\,(n-l)!},$

where $x = (x_1, \ldots, x_k)$ and $i$ is a multi-index in $\mathbb{N}^k$. To complete the proof, we need to show
that the sets

$A = \{(i_1, \ldots, i_k, n-l) \in \mathbb{N}^{k+1} \mid l = 0, \ldots, n,\ |(i_1, \ldots, i_k)| = l\},$
$B = \{j \in \mathbb{N}^{k+1} \mid |j| = n\}$

are equal. The inclusion $A \subseteq B$ is clear since

$|(i_1, \ldots, i_k, n-l)| = l + n - l = n.$

For $B \subseteq A$, suppose $j = (j_1, \ldots, j_{k+1}) \in \mathbb{N}^{k+1}$, and $|j| = n$. Let $l = |(j_1, \ldots, j_k)|$. Then
$l = n - j_{k+1}$, so $j_{k+1} = n - l$ for some $l = 0, \ldots, n$. It follows that $A = B$.

Let us define $y = (x_1, \ldots, x_{k+1})$ and let $j = (j_1, \ldots, j_{k+1})$ be a multi-index in $\mathbb{N}^{k+1}$. Then

$(x_1 + \cdots + x_{k+1})^n = n! \sum_{|j|=n} \frac{x^{(j_1, \ldots, j_k)}\, x_{k+1}^{j_{k+1}}}{(j_1, \ldots, j_k)!\, j_{k+1}!} = n! \sum_{|j|=n} \frac{y^j}{j!}.$

This completes the proof. $\Box$
Version: 1 Owner: matte Author(s): matte
44.14 proof of upper and lower bounds to binomial coefficient

Let $2 \leq k \leq n$ be natural numbers. We'll first prove the inequality

$\binom{n}{k} \leq \left( \frac{ne}{k} \right)^k.$

We rewrite $\binom{n}{k}$ as

$\binom{n}{k} = \frac{n}{k!} \left(1 - \frac{1}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) n^{k-1}$

to get

$\frac{(n-1) \cdots (n-k+1)}{e \cdot n^{k-1}} < 1.$

Multiplying the inequality above with $\frac{k^k}{k!} < e^{k-1}$ yields

$\frac{n(n-1) \cdots (n-k+1)}{k!} \left( \frac{k}{n} \right)^k \frac{1}{e} = \binom{n}{k} \left( \frac{k}{n} \right)^k \frac{1}{e} < e^{k-1} \iff \binom{n}{k} < \left( \frac{ne}{k} \right)^k.$

To conclude the proof we show that

$\prod_{i=1}^{n-1} \left(1 + \frac{1}{i}\right)^i = \frac{n^n}{n!} \qquad \forall\, n \geq 2 \in \mathbb{N}. \qquad (44.14.1)$

Indeed,

$\prod_{i=1}^{n-1} \left(1 + \frac{1}{i}\right)^i = \prod_{i=1}^{n-1} \frac{(i+1)^i}{i^i} = \frac{\prod_{i=2}^{n} i^{i-1}}{\prod_{i=1}^{n-1} i^{i-1} \cdot (n-1)!} = \frac{n^{n-1}}{(n-1)!} = \frac{n^n}{n!}.$

Since each left-hand factor in (44.14.1) is $< e$, we have $\frac{n^n}{n!} < e^{n-1}$ (in particular $\frac{k^k}{k!} < e^{k-1}$, as used above). Since $n - i < n$ for all $1 \leq i \leq k - 1$, we immediately get

$\binom{n}{k} = \frac{n(n-1) \cdots (n-k+1)}{k!} < \frac{n^k}{k!}.$

And from

$k \leq n \iff (n-i)\,k \geq (k-i)\,n \qquad \forall\, 1 \leq i \leq k-1$

we obtain

$\binom{n}{k} = \frac{n}{k} \prod_{i=1}^{k-1} \frac{n-i}{k-i} \geq \left( \frac{n}{k} \right)^k.$
Version: 4 Owner: Thomas Heye Author(s): Thomas Heye
Chapter 45
05A15 – Exact enumeration problems,
generating functions
45.1 Stirling numbers of the ﬁrst kind
Introduction. The Stirling numbers of the first kind, frequently denoted as

$s(n, k), \quad k, n \in \mathbb{N}, \quad 1 \leq k \leq n,$

are the integer coefficients of the falling factorial polynomials. To be more precise, the
defining relation for the Stirling numbers of the first kind is:

$x^{\underline{n}} = x(x-1)(x-2) \ldots (x-n+1) = \sum_{k=1}^{n} s(n, k)\, x^k.$

Here is the table of some initial values (note the alternating signs):

n\k    1    2    3    4    5
1      1
2     -1    1
3      2   -3    1
4     -6   11   -6    1
5     24  -50   35  -10    1
Recurrence Relation. The evident observation that

$x^{\underline{n+1}} = x \cdot x^{\underline{n}} - n\, x^{\underline{n}}$

leads to the following equivalent characterization of the $s(n, k)$, in terms of a 2-place recurrence
formula:

$s(n+1, k) = s(n, k-1) - n\, s(n, k), \quad 1 \leq k \leq n,$

subject to the following initial conditions:

$s(n, 0) = 0, \quad s(1, 1) = 1.$
Generating Function. There is also a strong connection with the generalized binomial formula,
which furnishes us with the following generating function:

$(1 + t)^x = \sum_{n=0}^{\infty} \sum_{k=1}^{n} s(n, k)\, x^k\, \frac{t^n}{n!}.$

This generating function implies a number of identities. Taking the derivative of both sides
with respect to $t$ and equating powers leads to the recurrence relation described above.
Taking the derivative of both sides with respect to $x$ gives

$(k+1)\, s(n+1, k+1) = \sum_{j=k}^{n} (-1)^{n-j} (n-j)! \binom{n+1}{j} s(j, k).$

This is because the derivative of the left side of the generating function equation with respect
to $x$ is

$(1 + t)^x \ln(1 + t) = (1 + t)^x \sum_{k=1}^{\infty} (-1)^{k-1} \frac{t^k}{k}.$
The relation

$(1 + t)^{x_1} (1 + t)^{x_2} = (1 + t)^{x_1 + x_2}$

yields the following family of summation identities. For any given $k_1, k_2, d \geq 1$ we have

$\binom{k_1 + k_2}{k_1}\, s(d + k_1 + k_2,\, k_1 + k_2) = \sum_{d_1 + d_2 = d} \binom{d + k_1 + k_2}{k_1 + d_1}\, s(d_1 + k_1, k_1)\, s(d_2 + k_2, k_2).$
Enumerative interpretation. The absolute value of the Stirling number of the first kind,
$s(n, k)$, counts the number of permutations of $n$ objects with exactly $k$ orbits (equivalently,
with exactly $k$ cycles). For example, $|s(4, 2)| = 11$ corresponds to the fact that
the symmetric group on 4 objects has 3 permutations of the form

$(**)(**)$ — 2 orbits of size 2 each,

and 8 permutations of the form

$(*\,*\,*)$ — 1 orbit of size 3, and 1 orbit of size 1

(see the entry on cycle notation for the meaning of the above expressions).

Let us prove this. First, we can remark that the unsigned Stirling numbers of the first kind
are characterized by the following recurrence relation:

$|s(n+1, k)| = |s(n, k-1)| + n\,|s(n, k)|, \quad 1 \leq k \leq n.$

To see why the above recurrence relation matches the count of permutations with $k$ cycles,
consider forming a permutation of $n + 1$ objects from a permutation of $n$ objects by adding
a distinguished object. There are exactly two ways in which this can be accomplished. We
could do this by forming a singleton cycle, i.e. leaving the extra object alone. This accounts
for the $s(n, k-1)$ term in the recurrence formula. We could also insert the new object into
one of the existing cycles. Consider an arbitrary permutation of $n$ objects with $k$ cycles, and
label the objects $a_1, \ldots, a_n$, so that the permutation is represented by

$\underbrace{(a_1 \ldots a_{j_1})(a_{j_1+1} \ldots a_{j_2}) \ldots (a_{j_{k-1}+1} \ldots a_n)}_{k \text{ cycles}}.$

To form a new permutation of $n + 1$ objects and $k$ cycles one must insert the new object into
this array. There are, evidently, $n$ ways to perform this insertion. This explains the $n\, s(n, k)$
term of the recurrence relation. Q.E.D.
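Both characterizations (the recurrence and the defining relation via falling factorials) can be cross-checked numerically. A sketch (all names are ours):

```python
# Signed Stirling numbers of the first kind from the recurrence
# s(n+1, k) = s(n, k-1) - n*s(n, k), with s(1, 1) = 1 and s(n, 0) = 0,
# then a check of x(x-1)...(x-n+1) = sum_k s(n,k) x^k at a sample point.
def stirling1(n_max):
    s = {(1, 1): 1}
    for n in range(1, n_max):
        for k in range(1, n + 2):
            s[(n + 1, k)] = s.get((n, k - 1), 0) - n * s.get((n, k), 0)
    return s

s = stirling1(5)
assert [s[(4, k)] for k in range(1, 5)] == [-6, 11, -6, 1]

x = 7
falling = 1
for j in range(5):
    falling *= x - j            # 7*6*5*4*3
assert falling == sum(s[(5, k)] * x ** k for k in range(1, 6))
```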
Version: 1 Owner: rmilson Author(s): rmilson
45.2 Stirling numbers of the second kind
Summary. The Stirling numbers of the second kind,

$S(n, k), \quad k, n \in \mathbb{N}, \quad 1 \leq k \leq n,$

are a doubly indexed sequence of natural numbers, enjoying a wealth of interesting
combinatorial properties. There exist several logically equivalent characterizations, but the starting
point of the present entry will be the following definition:

The Stirling number $S(n, k)$ is the number of ways to partition a set of $n$ objects
into $k$ groups.

For example, $S(4, 2) = 7$ because there are seven ways to partition 4 objects — call them a,
b, c, d — into two groups, namely:

$(a)(bcd), \; (b)(acd), \; (c)(abd), \; (d)(abc), \; (ab)(cd), \; (ac)(bd), \; (ad)(bc)$
Four additional characterizations will be discussed in this entry:
• a recurrence relation
• a generating function related to the falling factorial
• diﬀerential operators
• a double-index generating function
Each of these will be discussed below, and shown to be equivalent.
A recurrence relation. The Stirling numbers of the second kind can be characterized in
terms of the following recurrence relation:

$S(n, k) = k\, S(n-1, k) + S(n-1, k-1), \quad 1 \leq k \leq n,$

subject to the following initial conditions:

$S(n, n) = S(n, 1) = 1.$

Let us now show that the recurrence formula follows from the enumerative definition. Evidently,
there is only one way to partition $n$ objects into 1 group (everything is in that group), and
only one way to partition $n$ objects into $n$ groups (every object is a group all by itself).
Proceeding recursively, a division of $n$ objects $a_1, \ldots, a_{n-1}, a_n$ into $k$ groups can be achieved
by only one of two basic maneuvers:

• We could partition the first $n - 1$ objects into $k$ groups, and then add object $a_n$ into
one of those groups. There are $k\, S(n-1, k)$ ways to do this.

• We could partition the first $n - 1$ objects into $k - 1$ groups and then add object $a_n$ as
a new, 1-element group. This gives an additional $S(n-1, k-1)$ ways to create the
desired partition.

The recursive point of view therefore explains the connection between the recurrence formula
and the original definition.

Using the recurrence formula we can easily obtain a table of the initial Stirling numbers:

n\k    1    2    3    4    5
1      1
2      1    1
3      1    3    1
4      1    7    6    1
5      1   15   25   10    1
Falling Factorials. Consider the vector space of polynomials in indeterminate $x$. The
most obvious basis of this infinite-dimensional vector space is the sequence of monomial
powers: $x^n$, $n \in \mathbb{N}$. However, the sequence of falling factorials:

$x^{\underline{n}} = x(x-1)(x-2) \ldots (x-n+1), \quad n \in \mathbb{N}$

is also a basis, and hence can be used to generate the monomial basis. Indeed, the Stirling
numbers of the second kind can be characterized as the coefficients involved in the
corresponding change of basis matrix, i.e.

$x^n = \sum_{k=1}^{n} S(n, k)\, x^{\underline{k}}.$

So, for example,

$x^4 = x + 7x(x-1) + 6x(x-1)(x-2) + x(x-1)(x-2)(x-3).$

Arguing inductively, let us prove that this characterization follows from the recurrence relation.
Evidently the formula is true for $n = 1$. Suppose then that the formula is true for a
given $n$. We have

$x \cdot x^{\underline{k}} = x^{\underline{k+1}} + k\, x^{\underline{k}},$

and hence using the recurrence relation we deduce that

$x^{n+1} = \sum_{k=1}^{n} S(n, k)\, x \cdot x^{\underline{k}} = \sum_{k=1}^{n} S(n, k) \left( k\, x^{\underline{k}} + x^{\underline{k+1}} \right) = \sum_{k=1}^{n+1} S(n+1, k)\, x^{\underline{k}}.$
Differential operators. Let $D_x$ denote the ordinary derivative, applied to polynomials
in indeterminate $x$, and let $T_x$ denote the differential operator $x D_x$. We have the following
characterization of the Stirling numbers of the second kind in terms of these two operators:
$$ (T_x)^n = \sum_{k=1}^{n} S(n,k)\, x^k (D_x)^k, $$
where an exponentiated differential operator denotes the operator composed with itself the
indicated number of times. Let us show that this follows from the recurrence relation. The
proof is, once again, inductive. Suppose that the characterization is true for a given $n$. We
have
$$ T_x\!\left( x^k (D_x)^k \right) = k\, x^k (D_x)^k + x^{k+1} (D_x)^{k+1}, $$
and hence using the recurrence relation we deduce that
$$ (T_x)^{n+1} = x D_x \sum_{k=1}^{n} S(n,k)\, x^k (D_x)^k
 = \sum_{k=1}^{n} S(n,k) \left( k\, x^k (D_x)^k + x^{k+1} (D_x)^{k+1} \right)
 = \sum_{k=1}^{n+1} S(n+1,k)\, x^k (D_x)^k. $$
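The operator identity can be checked mechanically on concrete polynomials. In this sketch (all names our own) a polynomial is a coefficient list, `D` is the derivative, and `T` is $xD_x$:

```python
# Polynomials as coefficient lists: p[i] is the coefficient of x^i.
def D(p):                         # the ordinary derivative D_x
    return [i * p[i] for i in range(1, len(p))]

def mulx(p):                      # multiplication by x
    return [0] + p

def T(p):                         # the operator T_x = x D_x
    return mulx(D(p))

def apply_n(op, p, n):            # op composed with itself n times
    for _ in range(n):
        p = op(p)
    return p

def norm(p):                      # strip trailing zero coefficients
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def stirling2(n, k):
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

p = [3, 0, -1, 5, 2]              # an arbitrary test polynomial
for n in range(1, 5):
    lhs = apply_n(T, p, n)        # (T_x)^n p
    rhs = []
    for k in range(1, n + 1):
        term = apply_n(D, p, k)   # (D_x)^k p
        for _ in range(k):
            term = mulx(term)     # x^k (D_x)^k p
        rhs = add(rhs, [stirling2(n, k) * c for c in term])
    assert norm(lhs) == norm(rhs)
```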
Double index generating function. One can also characterize the Stirling numbers of
the second kind in terms of the following generating function:
$$ e^{x(e^t - 1)} = \sum_{n=0}^{\infty} \sum_{k=0}^{n} S(n,k)\, x^k\, \frac{t^n}{n!}, $$
with the convention that $S(0,0) = 1$ and $S(n,0) = 0$ for $n > 0$.
Let us now prove this. Note that the differential equation
$$ \frac{d\xi}{dt} = \xi $$
admits the general solution
$$ \xi = e^t x. $$
It follows that for any polynomial $p(\xi)$ we have
$$ \exp(t T_\xi)[p(\xi)]_{\xi=x} = \sum_{n=0}^{\infty} \frac{t^n}{n!} (T_\xi)^n [p(\xi)]_{\xi=x} = p(e^t x). $$
The proof is simple: just take $D_t$ of both sides. To be more explicit,
$$ D_t\!\left[ p(e^t x) \right] = p'(e^t x)\, e^t x = T_\xi [p(\xi)]_{\xi = x e^t}, $$
and that is exactly equal to $D_t$ of the left-hand side. Since this relation holds for all polynomials, it also holds for all formal power series. In particular, if we apply the above relation
to $e^\xi$, use the result of the preceding section, and note that
$$ D_\xi[e^\xi] = e^\xi, $$
we obtain
$$ e^{x e^t} = \sum_{n=0}^{\infty} \frac{t^n}{n!} (T_\xi)^n [e^\xi]_{\xi=x}
 = \sum_{n=0}^{\infty} \sum_{k=0}^{n} S(n,k)\, \frac{t^n}{n!}\, \xi^k (D_\xi)^k [e^\xi]_{\xi=x}
 = e^x \sum_{n=0}^{\infty} \sum_{k=0}^{n} S(n,k)\, x^k\, \frac{t^n}{n!}. $$
Dividing both sides by $e^x$ we obtain the desired generating function. Q.E.D.
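At $x = 1$ the generating function specializes to $e^{e^t - 1}$, whose coefficients of $t^n/n!$ are the Bell numbers $B_n = \sum_k S(n,k)$. The following sketch (our own; exact arithmetic via `fractions`) computes that exponential of a power series through the relation $f' = g' f$ for $f = \exp(g)$, and compares against the recurrence:

```python
from fractions import Fraction

N = 10
fact = [1] * (N + 1)
for m in range(1, N + 1):
    fact[m] = fact[m - 1] * m

# Taylor coefficients of g(t) = exp(t) - 1: g_m = 1/m! for m >= 1.
g = [Fraction(0)] + [Fraction(1, fact[m]) for m in range(1, N + 1)]

# f = exp(g) coefficient-by-coefficient, using f' = g' f:
# (n+1) f_{n+1} = sum_{m=0}^{n} (m+1) g_{m+1} f_{n-m}.
f = [Fraction(1)] + [Fraction(0)] * N
for n in range(N):
    f[n + 1] = sum((m + 1) * g[m + 1] * f[n - m] for m in range(n + 1)) / (n + 1)

def stirling2(n, k):
    if k < 1 or k > n:
        return 0
    if k == 1 or k == n:
        return 1
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

bell = [f[n] * fact[n] for n in range(N + 1)]   # n! * [t^n] exp(exp(t)-1)
assert bell[:6] == [1, 1, 2, 5, 15, 52]
assert all(bell[n] == sum(stirling2(n, k) for k in range(1, n + 1))
           for n in range(1, N + 1))
```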
Version: 2 Owner: rmilson Author(s): rmilson
Chapter 46
05A19 – Combinatorial identities
46.1 Pascal’s rule
Pascal's rule is the binomial identity
$$ \binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k} $$
where $1 \leq k \leq n$ and $\binom{n}{k}$ is the binomial coefficient.
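The identity is easy to confirm exhaustively for small parameters, e.g. with the standard-library binomial coefficient:

```python
from math import comb

# Check Pascal's rule C(n,k) + C(n,k-1) = C(n+1,k) for 1 <= k <= n.
for n in range(1, 12):
    for k in range(1, n + 1):
        assert comb(n, k) + comb(n, k - 1) == comb(n + 1, k)
print("Pascal's rule verified for n = 1..11")
```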
Version: 5 Owner: KimJ Author(s): KimJ
Chapter 47
05A99 – Miscellaneous
47.1 principle of inclusionexclusion
The principle of inclusion-exclusion provides a way of methodically counting the union
of possibly non-disjoint sets.

Let $C = \{A_1, A_2, \ldots, A_N\}$ be a finite collection of finite sets. Let $I_k$ represent the set of $k$-fold
intersections of members of $C$ (e.g., $I_2$ contains all possible intersections of two sets chosen
from $C$).

Then
$$ \left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) $$
For example:
$$ |A \cup B| = (|A| + |B|) - (|A \cap B|) $$
$$ |A \cup B \cup C| = (|A| + |B| + |C|) - (|A \cap B| + |A \cap C| + |B \cap C|) + (|A \cap B \cap C|) $$
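The alternating sum over $k$-fold intersections translates directly into code. A sketch (our own) that counts a union by inclusion-exclusion and compares against the direct union:

```python
from itertools import combinations

def union_size(sets):
    """|A_1 ∪ ... ∪ A_N| computed by inclusion-exclusion."""
    total = 0
    for j in range(1, len(sets) + 1):
        sign = (-1) ** (j + 1)
        for group in combinations(sets, j):   # the j-fold intersections I_j
            total += sign * len(set.intersection(*group))
    return total

A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}
assert union_size([A, B, C]) == len(A | B | C)
```

Note the cost is exponential in the number of sets; the point is the formula, not efficiency.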
The principle of inclusion-exclusion, combined with de Morgan's theorem, can be used to
count the intersection of sets as well. Let $A$ be some universal set such that $A_k \subseteq A$ for each
$k$, and let $\overline{A_k}$ represent the complement of $A_k$ with respect to $A$. Then we have
$$ \overline{\bigcap_{i=1}^{N} A_i} = \bigcup_{i=1}^{N} \overline{A_i}, $$
thereby turning the problem of finding an intersection into the problem of finding a union.
Version: 2 Owner: vampyr Author(s): vampyr
47.2 principle of inclusionexclusion proof
The proof is by induction. Consider a single set $A_1$. Then the principle of inclusion-exclusion
states that $|A_1| = |A_1|$, which is trivially true.

Now consider a collection of exactly two sets $A_1$ and $A_2$. We know that
$$ A \cup B = (A \setminus B) \cup (B \setminus A) \cup (A \cap B). $$
Furthermore, the three sets on the right-hand side of that equation must be disjoint. Therefore, by the addition principle, we have
$$ |A \cup B| = |A \setminus B| + |B \setminus A| + |A \cap B|
 = |A \setminus B| + |A \cap B| + |B \setminus A| + |A \cap B| - |A \cap B|
 = |A| + |B| - |A \cap B|. $$
So the principle of inclusion-exclusion holds for any two sets.
Now consider a collection of $N > 2$ finite sets $A_1, A_2, \ldots, A_N$. We assume that the principle
of inclusion-exclusion holds for any collection of $M$ sets where $1 \leq M < N$. Because the
union of sets is associative, we may break up the union of all sets in the collection into a
union of two sets:
$$ \bigcup_{i=1}^{N} A_i = \left( \bigcup_{i=1}^{N-1} A_i \right) \cup A_N $$
By the principle of inclusion-exclusion for two sets, we have
$$ \left| \bigcup_{i=1}^{N} A_i \right| = \left| \bigcup_{i=1}^{N-1} A_i \right| + |A_N| - \left| \left( \bigcup_{i=1}^{N-1} A_i \right) \cap A_N \right| $$
Now, let $I_k$ be the collection of all $k$-fold intersections of $A_1, A_2, \ldots, A_{N-1}$, and let $I'_k$ be
the collection of all $k$-fold intersections of $A_1, A_2, \ldots, A_N$ that include $A_N$. Note that $A_N$ is
included in every member of $I'_k$ and in no member of $I_k$, so the two collections do not duplicate
one another.

We then have
$$ \left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \left| \left( \bigcup_{i=1}^{N-1} A_i \right) \cap A_N \right| $$
by the principle of inclusion-exclusion for a collection of $N-1$ sets. Then, we may distribute
set intersection over set union to find that
$$ \left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \left| \bigcup_{i=1}^{N-1} (A_i \cap A_N) \right| $$
Note, however, that
$$ (A_x \cap A_N) \cap (A_y \cap A_N) = A_x \cap A_y \cap A_N $$
Hence we may again apply the principle of inclusion-exclusion for $N-1$ sets, revealing that
$$ \left| \bigcup_{i=1}^{N} A_i \right|
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S \cap A_N| \right) $$
$$ = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I'_{j+1}} |S| \right) $$
$$ = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=2}^{N} \left( (-1)^{j} \sum_{S \in I'_j} |S| \right) $$
$$ = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| + \sum_{j=2}^{N} \left( (-1)^{j+1} \sum_{S \in I'_j} |S| \right) $$
The second sum does not include $I'_1$. Note, however, that $I'_1 = \{A_N\}$, so we have
$$ \left| \bigcup_{i=1}^{N} A_i \right|
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + \sum_{j=1}^{N} \left( (-1)^{j+1} \sum_{S \in I'_j} |S| \right) $$
$$ = \sum_{j=1}^{N-1} (-1)^{j+1} \left( \sum_{S \in I_j} |S| + \sum_{S \in I'_j} |S| \right) + (-1)^{N+1} \sum_{S \in I'_N} |S| $$
Since every $j$-fold intersection of $A_1, \ldots, A_N$ either includes $A_N$ (and so lies in $I'_j$) or does
not (and so lies in $I_j$), combining the two sums yields the principle of inclusion-exclusion for $N$ sets.
Version: 1 Owner: vampyr Author(s): vampyr
Chapter 48
05B15 – Orthogonal arrays, Latin
squares, Room squares
48.1 example of Latin squares
It is easily shown that the multiplication table (Cayley table) of a group has exactly these
properties and is thus a latin square. The converse, however, is (unfortunately) not true, i.e.
not all Latin squares are multiplication tables for a group (the smallest counterexample is
a Latin square of order 5).
Version: 2 Owner: jgade Author(s): jgade
48.2 graecolatin squares
Let $A = (a_{ij})$ and $B = (b_{ij})$ be two $n \times n$ matrices. We define their join as the matrix whose
$(i,j)$-th entry is the pair $(a_{ij}, b_{ij})$.

A graeco-latin square is then the join of two orthogonal latin squares, that is, two latin
squares whose join contains every ordered pair of symbols exactly once.

The name comes from Euler's use of greek and latin letters to differentiate the entries on
each array.

An example of a graeco-latin square:

aα bβ cγ dδ
dγ cδ bα aβ
bδ aγ dβ cα
cβ dα aδ bγ
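The join and the "every ordered pair exactly once" property are easy to check in code. A sketch (our own transcription of a 4×4 example):

```python
L = ["abcd", "dcba", "badc", "cdab"]   # a latin square
G = ["αβγδ", "γδαβ", "δγβα", "βαδγ"]   # a greek-lettered latin square

# The join pairs up corresponding entries of the two squares.
join = [[(L[i][j], G[i][j]) for j in range(4)] for i in range(4)]

# Graeco-latin property: all 16 ordered pairs are distinct.
pairs = [p for row in join for p in row]
assert len(set(pairs)) == 16
```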
Version: 1 Owner: drini Author(s): drini
48.3 latin square
A latin square of order $n$ is an $n \times n$ array such that each column and each row are made
with the same $n$ symbols, using every one exactly once.

Examples.

a b c d        1 2 3 4
c d a b        4 3 2 1
d c b a        2 1 4 3
b a d c        3 4 1 2
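The defining property, each row and each column a permutation of the same $n$ symbols, is a one-liner to verify. A sketch (our own):

```python
def is_latin(square):
    """True iff every row and column uses the same n symbols exactly once."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(square[i][j] for i in range(n)) == symbols
                  for j in range(n))
    return rows_ok and cols_ok

assert is_latin(["abcd", "cdab", "dcba", "badc"])
assert is_latin([[1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3], [3, 4, 1, 2]])
assert not is_latin(["ab", "ab"])
```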
Version: 1 Owner: drini Author(s): drini
48.4 magic square
A magic square of order $n$ is an $n \times n$ array using each one of the numbers $1, 2, 3, \ldots, n^2$ once
and such that the sum of the numbers in each row, column, or main diagonal is the same.

Example:

8 1 6
3 5 7
4 9 2

It's easy to prove that the sum is always $\frac{1}{2} n (n^2 + 1)$. So in the example with $n = 3$ the sum
is always $\frac{1}{2}(3 \cdot 10) = 15$.
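A checker for the definition, using the magic constant $\frac{1}{2}n(n^2+1)$ derived above (our own sketch):

```python
def is_magic(sq):
    """True iff sq is a magic square on the numbers 1..n^2."""
    n = len(sq)
    target = n * (n * n + 1) // 2          # the magic constant
    nums = [x for row in sq for x in row]
    if sorted(nums) != list(range(1, n * n + 1)):
        return False
    lines = [list(row) for row in sq]                       # rows
    lines += [[sq[i][j] for i in range(n)] for j in range(n)]  # columns
    lines += [[sq[i][i] for i in range(n)],                 # main diagonal
              [sq[i][n - 1 - i] for i in range(n)]]         # anti-diagonal
    return all(sum(line) == target for line in lines)

assert is_magic([[8, 1, 6], [3, 5, 7], [4, 9, 2]])
```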
Version: 1 Owner: drini Author(s): drini
Chapter 49
05B35 – Matroids, geometric lattices
49.1 matroid
A matroid, or an independence structure, is a kind of finite mathematical structure whose
properties imitate the properties of a finite subset of a vector space. Notions such as rank
and independence (of a subset) have a meaning for any matroid, as does the notion of duality.
A matroid permits several equivalent formal definitions: two definitions in terms of a rank
function, one in terms of independent subsets, and several more.

For a finite set $X$, $\beta(X)$ will denote the set of all subsets of $X$, and $|X|$ will denote the
number of elements of $X$. $E$ is a fixed finite set throughout.

Definition 1: A matroid is a pair $(E, r)$ where $r$ is a mapping $\beta(E) \to \mathbb{N}$ satisfying these
axioms:

r1) $r(S) \leq |S|$ for all $S \subseteq E$.

r2) If $S \subseteq T \subseteq E$ then $r(S) \leq r(T)$.

r3) For any subsets $S$ and $T$ of $E$,
$$ r(S \cup T) + r(S \cap T) \leq r(S) + r(T). $$

The matroid $(E, r)$ is called normal if also

r*) $r(\{e\}) = 1$ for any $e \in E$.

$r$ is called the rank function of the matroid. (r3) is called the submodular inequality.

The notion of isomorphism between one matroid $(E, r)$ and another $(F, s)$ has the expected
meaning: there exists a bijection $f : E \to F$ which preserves rank, i.e. satisfies $s(f(A)) = r(A)$ for all $A \subseteq E$.
Definition 2: A matroid is a pair $(E, r)$ where $r$ is a mapping $\beta(E) \to \mathbb{N}$ satisfying these
axioms:

q1) $r(\emptyset) = 0$.

q2) If $x \in E$ and $S \subseteq E$ then $r(S \cup \{x\}) - r(S) \in \{0, 1\}$.

q3) If $x, y \in E$ and $S \subseteq E$ and $r(S \cup \{x\}) = r(S \cup \{y\}) = r(S)$ then $r(S \cup \{x, y\}) = r(S)$.
Definition 3: A matroid is a pair $(E, I)$ where $I$ is a subset of $\beta(E)$ satisfying these axioms:

i1) $\emptyset \in I$.

i2) If $S \subseteq T \subseteq E$ and $T \in I$ then $S \in I$.

i3) If $S, T \in I$ and $S, T \subseteq U \subseteq E$ and $S$ and $T$ are both maximal subsets of $U$ with the
property that they are in $I$, then $|S| = |T|$.

An element of $I$ is called an independent set. $(E, I)$ is called normal if any singleton subset
of $E$ is independent, i.e.

i*) $\{x\} \in I$ for all $x \in E$.
Definition 4: A matroid is a pair $(E, B)$ where $B$ is a subset of $\beta(E)$ satisfying these
axioms:

b1) $B \neq \emptyset$.

b2) If $S, T \in B$ and $S \subseteq T$ then $S = T$.

b3) If $S, T \in B$ and $x \in T - S$ then there exists $y \in S - T$ such that $(S \cup \{x\}) - \{y\} \in B$.

An element of $B$ is called a basis (of $E$). $(E, B)$ is called normal if also

b*) $\bigcup_{b \in B} b = E$,

i.e. if any singleton subset of $E$ can be extended to a basis.
Definition 5: A matroid is a pair $(E, \phi)$ where $\phi$ is a mapping $\beta(E) \to \beta(E)$ satisfying
these axioms:

φ1) $S \subseteq \phi(S)$ for all $S \subseteq E$.

φ2) If $S \subseteq \phi(T)$ then $\phi(S) \subseteq \phi(T)$.

φ3) If $x \in \phi(S \cup \{y\}) - \phi(S)$ then $y \in \phi(S \cup \{x\})$.

$\phi$ is called the span mapping of the matroid, and $\phi(A)$ is called the span of the subset $A$.
$(E, \phi)$ is called normal if also

φ*) $\phi(\emptyset) = \emptyset$.
Definition 6: A matroid is a pair $(E, C)$ where $C$ is a subset of $\beta(E)$ satisfying these
axioms:

c1) $\emptyset \notin C$.

c2) If $S, T \in C$ and $S \subseteq T$ then $S = T$.

c3) If $S, T \in C$ and $S \neq T$ and $x \in S \cap T$ then there exists $U \in C$ such that $x \notin U$ and
$U \subseteq S \cup T$.

An element of $C$ is called a circuit. $(E, C)$ is called normal if also

c*) No singleton subset of $E$ is a circuit.
49.1.1 Equivalence of the deﬁnitions
It would take several pages to spell out what a circuit is in terms of rank, and likewise for
each other possible pair of the alternative defining notions, and then to prove that the various
sets of axioms unambiguously define the same structure. So let me sketch just one example:
the equivalence of Definitions 1 (on rank) and 6 (on circuits). Assume first the conditions in
Definition 1. Define a circuit as a minimal subset $A$ of $E$ having the property $r(A) < |A|$.
With a little effort, we verify the axioms (c1)-(c3). Now assume (c1)-(c3), and let $r(A)$ be
the largest integer $m$ such that $A$ has a subset $B$ for which

– no element of $C$ is contained in $B$

– $m = |B|$.

One now proves (r1)-(r3). Next, one shows that if we define $C$ in terms of $r$, and then
another rank function $r'$ in terms of $C$, we end up with $r' = r$. The equivalence of (r*) and
(c*) is easy enough as well.
49.1.2 Examples of matroids
Let $V$ be a vector space over a field $k$, and let $E$ be a finite subset of $V$. For $S \subseteq E$, let $r(S)$
be the dimension of the subspace of $V$ generated by $S$. Then $(E, r)$ is a matroid. Such a
matroid, or one isomorphic to it, is said to be representable over $k$. The matroid is normal
iff $0 \notin E$. There exist matroids which are not representable over any field.
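This vector-space example can be tested concretely. The sketch below (all names ours) takes a small set of vectors over the two-element field, encoded as bitmasks, computes the rank of every subset by Gaussian elimination, and checks the rank axioms (r1)-(r3):

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of a list of vectors encoded as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        lsb = pivot & -pivot                      # lowest set bit of the pivot
        rows = [v ^ pivot if v & lsb else v for v in rows]
    return rank

E = [0b001, 0b010, 0b011, 0b110]   # four vectors in GF(2)^3

def r(S):                           # rank of a subset of indices into E
    return gf2_rank([E[i] for i in S])

subsets = [frozenset(c) for k in range(len(E) + 1)
           for c in combinations(range(len(E)), k)]

assert all(r(S) <= len(S) for S in subsets)                      # (r1)
assert all(r(S) <= r(T) for S in subsets for T in subsets
           if S <= T)                                            # (r2)
assert all(r(S | T) + r(S & T) <= r(S) + r(T)
           for S in subsets for T in subsets)                    # (r3)
```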
The second example of a matroid comes from graph theory. The following deﬁnition will be
rather informal, partly because the terminology of graph theory is not very well standardised.
For our present purpose, a graph consists of a finite set $V$, whose elements are called vertices,
plus a set $E$ of two-element subsets of $V$, called edges. A circuit in the graph is a finite set
of at least three edges which can be arranged in a cycle:
$$ \{a, b\}, \{b, c\}, \ldots, \{y, z\}, \{z, a\} $$
such that the vertices $a, b, \ldots$ are distinct. With circuits thus defined, $E$ satisfies the axioms
in Definition 6, and is thus a matroid, and in fact a normal matroid. (The definition is easily
adjusted to permit graphs with loops, which define non-normal matroids.) Such a matroid,
or one isomorphic to it, is called "graphic".
Let $E = A \cup B$ be a finite set, where $A$ and $B$ are nonempty and disjoint. Let $G$ be a subset of
$A \times B$. We get a "matching" matroid on $E$ as follows. Each element of $E$ defines a "line",
which is a subset (a row or column) of the set $A \times B$. Let us call the elements of $G$ "points".
For any $S \subseteq E$ let $r(S)$ be the largest number $m$ such that for some set of points $P$:

– $|P| = m$

– No two points of $P$ are on the same line

– Any point of $P$ is on a line defined by an element of $S$.

One can prove (it is not trivial) that $r$ is the rank function of a matroid on $E$. That
matroid is normal iff every line contains at least one point. Matching matroids participate in
combinatorics, in connection with results on "transversals", such as Hall's marriage theorem.
49.1.3 The dual of a matroid
Proposition: Let $(E, r)$ be a matroid. Define a mapping $r^* : \beta(E) \to \mathbb{N}$
by
$$ r^*(A) = |A| - r(E) + r(E - A). $$
Then the pair $(E, r^*)$ is a matroid (called the dual of $(E, r)$).

We leave the proof as an exercise. Also, it is easy to verify that the dual of the dual is the
original matroid. A circuit in $(E, r^*)$ is also referred to as a cocircuit in $(E, r)$. There is a
notion of cobasis also, and cospan.

If the dual of a matroid is graphic, the matroid is called cographic. This notion of duality agrees with the
notion of the same name in the theory of planar graphs (and likewise in linear algebra): given
a plane graph, the dual of its matroid is the matroid of the dual graph. A matroid that is
both graphic and cographic is called planar, and various criteria for planarity of a graph can
be extended to matroids. The notion of orientability can also be extended from graphs to
matroids.
49.1.4 Binary matroids
A matroid is said to be binary if it is representable over the field of two elements. There are
several other (equivalent) characterisations of a binary matroid $(E, r)$, such as:

– The symmetric difference of any family of circuits is the union of a family of pairwise
disjoint circuits.

– For any circuit $C$ and cocircuit $D$, we have $|C \cap D| \equiv 0 \pmod{2}$.

Any graphic matroid is binary. The dual of a binary matroid is binary.
49.1.5 Miscellaneous
The definition of the chromatic polynomial of a graph,
$$ \chi(x) = \sum_{F \subseteq E} (-1)^{|F|}\, x^{r(E) - r(F)}, $$
extends without change to any matroid. This polynomial has something to say about the
decomposability of matroids into simpler ones.

Also on the topic of decomposability, matroids have a sort of structure theory, in terms of
what are called minors and separators. That theory, due to Tutte, goes by induction; roughly
speaking, it is an adaptation of the old algorithms for putting a matrix into a canonical form.
Along the same lines are several theorems on "basis exchange", such as the following. Let
$M$ be a matroid and let
$$ A = \{a_1, \ldots, a_n\} \qquad B = \{b_1, \ldots, b_n\} $$
be two (equipotent) bases of $M$. There exists a permutation $\psi$ of the set $\{1, \ldots, n\}$ such
that, for every $m$ from 0 to $n$,
$$ \{a_1, \ldots, a_m, b_{\psi(m+1)}, \ldots, b_{\psi(n)}\} $$
is a basis of $M$.
49.1.6 Further reading
A good textbook is:
James G. Oxley, Matroid Theory, Oxford University Press, New York etc., 1992
plus the updates-and-errata file at Dr. Oxley's website.
The chromatic polynomial is not discussed in Oxley, but see e.g. Zaslavski.
Version: 3 Owner: drini Author(s): Larry Hammick, NeuRet
49.2 polymatroid
The polymatroid defined by a given matroid $(E, r)$ is the set of all functions $w : E \to \mathbb{R}$
such that
$$ w(e) \geq 0 \text{ for all } e \in E $$
$$ \sum_{e \in S} w(e) \leq r(S) \text{ for all } S \subseteq E. $$
Polymatroids are related to the convex polytopes seen in linear programming, and have
similar uses.
Version: 1 Owner: nobody Author(s): Larry Hammick
Chapter 50
05C05 – Trees
50.1 AVL tree
An AVL tree is a balanced binary search tree where the heights of the two subtrees (children)
of a node differ by at most one. Lookup, insertion, and deletion are $O(\log n)$, where $n$ is
the number of nodes in the tree.

The structure is named for its inventors, Adelson-Velskii and Landis (1962).
Version: 5 Owner: Thomas Heye Author(s): Thomas Heye
50.2 Aronszajn tree
A $\kappa$-tree $T$ for which $|T_\alpha| < \kappa$ for all $\alpha < \kappa$ and which has no cofinal branches is called a
$\kappa$-Aronszajn tree. If $\kappa = \omega_1$ then it is referred to simply as an Aronszajn tree.

If there are no $\kappa$-Aronszajn trees for some $\kappa$ then we say $\kappa$ has the tree property. $\omega$ has
the tree property, but no singular cardinal has the tree property.
Version: 6 Owner: Henry Author(s): Henry
50.3 Suslin tree
An Aronszajn tree is a Suslin tree iﬀ it has no uncountable antichains.
Version: 1 Owner: Henry Author(s): Henry
50.4 antichain
A subset $A$ of a poset $(P, <_P)$ is an antichain if no two elements are comparable. That is,
if $a, b \in A$ then $a \not<_P b$ and $b \not<_P a$.

A maximal antichain of $P$ is an antichain which is not properly contained in any other
antichain of $P$.

In particular, if $(P, <_P)$ is a tree then the maximal antichains are exactly those antichains
which intersect every branch, and if the tree is splitting then every level is a maximal
antichain.
Version: 3 Owner: Henry Author(s): Henry
50.5 balanced tree
A balanced tree is a rooted tree where each subtree of the root has an equal number of
nodes (or as near as possible). For an example, see binary tree.
Version: 2 Owner: Logan Author(s): Logan
50.6 binary tree
A binary tree is a rooted tree where every node has two or fewer children. A balanced
binary tree is a binary tree that is also a balanced tree. For example,

        A
      /   \
     B     E
    / \   / \
   C   D F   G

is a balanced binary tree.

The two (potential) children of a node in a binary tree are often called the left and right
children of that node. The left child of some node $X$ and all that child's descendents are the
left descendents of $X$. A similar definition applies to $X$'s right descendents. The left
subtree of $X$ is $X$'s left descendents, and the right subtree of $X$ is its right descendents.
Since we know the maximum number of children a binary tree node can have, we can make
some statements regarding minimum and maximum depth of a binary tree as it relates to
the total number of nodes. The maximum depth of a binary tree of $n$ nodes is $n - 1$ (every
non-leaf node has exactly one child). The minimum depth of a binary tree of $n$ nodes ($n > 0$)
is $\lfloor \log_2 n \rfloor$ (every non-leaf node has exactly two children, that is, the tree is balanced).
A binary tree can be implicitly stored as an array, if we designate a constant, maximum
depth for the tree. We begin by storing the root node at index 0 in the array. We then store
its left child at index 1 and its right child at index 2. The children of the node at index 1
are stored at indices 3 and 4, and the children of the node at index 2 are stored at indices 5
and 6. This can be generalized as: if a node is stored at index $k$, then its left child is located
at index $2k + 1$ and its right child at $2k + 2$. This form of implicit storage thus eliminates
all overhead of the tree structure, but is only really advantageous for trees that tend to be
balanced. For example, here is the implicit array representation of the tree shown above.

A B E C D F G
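The index arithmetic is all there is to the implicit representation. A quick sketch using the array above:

```python
tree = ["A", "B", "E", "C", "D", "F", "G"]   # implicit array representation

def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2           # inverse of both maps

assert tree[left(0)] == "B" and tree[right(0)] == "E"
assert tree[left(1)] == "C" and tree[right(1)] == "D"
assert tree[left(2)] == "F" and tree[right(2)] == "G"
assert parent(5) == 2 and parent(6) == 2
```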
Many data structures are binary trees. For instance, heaps and binary search trees are binary
trees with particular properties.
Version: 3 Owner: Daume Author(s): Daume, Logan
50.7 branch
A subset $B$ of a tree $(T, <_T)$ is a branch if $B$ is a maximal linearly ordered subset of $T$.
That is:

• $<_T$ is a linear ordering of $B$

• If $t \in T \setminus B$ then $B \cup \{t\}$ is not linearly ordered by $<_T$.
This is the same as the intuitive conception of a branch: it is a set of nodes starting at the
root and going all the way to the tip (in inﬁnite sets the conception is more complicated,
since there may not be a tip, but the idea is the same). Since branches are maximal there is
no way to add an element to a branch and have it remain a branch.
A coﬁnal branch is a branch which intersects every level of the tree.
Version: 1 Owner: Henry Author(s): Henry
50.8 child node (of a tree)
A child node $C$ of a node $N$ in a tree is any node connected to $N$ whose path distance
from the root node $R$ is one greater than the path distance between $R$ and $N$.

Drawn in the canonical root-at-top manner, a child node of a node $N$ in a tree is simply any
node immediately below $N$ which is connected to it.
•
• •
• •
•
Figure: A node (blue) and its children (red.)
Version: 1 Owner: akrowne Author(s): akrowne
50.9 complete binary tree
A complete binary tree is a binary tree with the additional property that every node
must have exactly two “children” if an internal node, and zero children if a leaf node.
More precisely: for our base case, the complete binary tree of exactly one node is simply
the tree consisting of that node by itself. The property of being “complete” is preserved
if, at each step, we expand the tree by connecting exactly zero or two individual nodes (or
complete binary trees) to any node in the tree (but both must be connected to the same
node.)
Version: 4 Owner: akrowne Author(s): akrowne
50.10 digital search tree
A digital search tree is a tree which stores strings internally so that there is no need for
extra leaf nodes to store the strings.
Version: 5 Owner: Logan Author(s): Logan
50.11 digital tree
A digital tree is a tree for storing a set of strings where nodes are organized by substrings
common to two or more strings. Examples of digital trees are digital search trees and tries.
Version: 3 Owner: Logan Author(s): Logan
50.12 example of Aronszajn tree
Construction 1: If $\kappa$ is a singular cardinal then there is a simple construction of a $\kappa$-Aronszajn tree. Let $\langle k_\beta \rangle_{\beta < \iota}$ with $\iota < \kappa$ be a sequence cofinal in $\kappa$. Then consider the tree
where $T = \{(\alpha, k_\beta) \mid \alpha < k_\beta \wedge \beta < \iota\}$ with $(\alpha_1, k_{\beta_1}) <_T (\alpha_2, k_{\beta_2})$ iff $\alpha_1 < \alpha_2$ and $k_{\beta_1} = k_{\beta_2}$.
Note that this is similar to (indeed, a subtree of) the construction given for a tree with no
cofinal branches. It consists of $\iota$ disjoint branches, with the $\beta$th branch of height $k_\beta$. Since
$\iota < \kappa$, every level has fewer than $\kappa$ elements, and since the sequence is cofinal in $\kappa$, $T$ must
have height and cardinality $\kappa$.
Construction 2: We can construct an Aronszajn tree out of the compact subsets of $\mathbb{Q}^+$.
$<_T$ will be defined by $x <_T y$ iff $y$ is an end-extension of $x$. That is, $x \subseteq y$, and if $n \in y \setminus x$
and $m \in x$ then $m < n$.

Let $T_0 = \{\{0\}\}$. Given a level $T_\alpha$, let $T_{\alpha+1} = \{x \cup \{q\} \mid x \in T_\alpha \wedge q > \max x\}$. That is, for
every element $x$ in $T_\alpha$ and every rational number $q$ larger than any element of $x$, $x \cup \{q\}$ is
an element of $T_{\alpha+1}$. If $\alpha < \omega_1$ is a limit ordinal then each element of $T_\alpha$ is the union of some
branch in $T(\alpha)$.
We can show by induction that $|T_\alpha| < \omega_1$ for each $\alpha < \omega_1$. For the base case, $T_0$ has only one
element. If $|T_\alpha| < \omega_1$ then $|T_{\alpha+1}| = |T_\alpha| \cdot |\mathbb{Q}| = |T_\alpha| \cdot \omega = \omega < \omega_1$. If $\alpha < \omega_1$ is a limit ordinal
then $T(\alpha)$ is a countable union of countable sets, and therefore itself countable. Therefore
there are a countable number of branches, so $T_\alpha$ is also countable. So $T$ has countable levels.
Suppose $T$ has an uncountable branch, $B = \langle b_0, b_1, \ldots \rangle$. Then for any $i < j < \omega_1$, $b_i \subset b_j$.
Then for each $i$, there is some $x_i \in b_{i+1} \setminus b_i$ such that $x_i$ is greater than any element of
$b_i$. Then $\langle x_0, x_1, \ldots \rangle$ is an uncountable increasing sequence of rational numbers. Since the
rational numbers are countable, there is no such sequence, so $T$ has no uncountable branch,
and is therefore Aronszajn.

Version: 1 Owner: Henry Author(s): Henry
50.13 example of tree (set theoretic)
The set $\mathbb{Z}^+$ is a tree with $<_T = <$. This isn't a very interesting tree, since it simply consists
of a line of nodes. However note that the height is $\omega$ even though no particular node has
that height.

A more interesting tree using $\mathbb{Z}^+$ defines $m <_T n$ if $i^a = m$ and $i^b = n$ for some
$i, a, b \in \mathbb{Z}^+$ with $a < b$. Then 1 is the root, and all numbers which are not powers of another number are
in $T_1$. Then all squares (which are not also fourth powers) form $T_2$, and so on.
To illustrate the concept of a cofinal branch, observe that for any limit ordinal $\kappa$ we can
construct a $\kappa$-tree which has no cofinal branches. We let $T = \{(\alpha, \beta) \mid \alpha < \beta < \kappa\}$ and
$(\alpha_1, \beta_1) <_T (\alpha_2, \beta_2) \leftrightarrow \alpha_1 < \alpha_2 \wedge \beta_1 = \beta_2$. The tree then has $\kappa$ disjoint branches, each
consisting of the set $\{(\alpha, \beta) \mid \alpha < \beta\}$ for some $\beta < \kappa$. No branch is cofinal, since each branch
is capped at $\beta$ elements, but for any $\gamma < \kappa$, there is a branch of height $\gamma + 1$. Hence the
supremum of the heights is $\kappa$.
50.14 extended binary tree
An extended binary tree is a transformation of any binary tree into a complete binary tree.
This transformation consists of replacing every null subtree of the original tree with “special
nodes.” The nodes from the original tree are then internal nodes, while the “special nodes”
are external nodes.
For instance, consider the following binary tree.
The following tree is its extended binary tree. Empty circles represent internal nodes, and
ﬁlled circles represent external nodes.
Every internal node in the extended tree has exactly two children, and every external node
is a leaf. The result is a complete binary tree.
Version: 4 Owner: Logan Author(s): Logan
50.15 external path length
Given a binary tree $T$, construct its extended binary tree $T'$. The external path length
of $T$ is then defined to be the sum of the lengths of the paths to each of the external nodes.
For example, let $T$ be the following tree.

The extended binary tree of $T$ is

The external path length of $T$ (denoted $E$) is
$$ E = 2 + 3 + 3 + 3 + 3 + 3 + 3 = 20 $$
The internal path length of $T$ is defined to be the sum of the lengths of the paths to each
of the internal nodes. The internal path length of our example tree (denoted $I$) is
$$ I = 1 + 2 + 0 + 2 + 1 + 2 = 8 $$
Note that in this case $E = I + 2n$, where $n$ is the number of internal nodes. This happens
to hold for all binary trees.
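The relation $E = I + 2n$ can be tested on many random shapes. In this sketch (our own representation, since the example figures are not reproduced here) a binary tree is a nested tuple, with `None` standing for a null subtree, i.e. an external node of the extended tree:

```python
import random

def path_lengths(t, depth=0):
    """Return (internal path length, external path length) of tree t."""
    if t is None:                 # an external node of the extended tree
        return (0, depth)
    li, le = path_lengths(t[0], depth + 1)
    ri, re = path_lengths(t[1], depth + 1)
    return (depth + li + ri, le + re)

def count_internal(t):
    return 0 if t is None else 1 + count_internal(t[0]) + count_internal(t[1])

def random_tree(n):
    """A random binary tree with n internal nodes."""
    if n == 0:
        return None
    k = random.randrange(n)       # k internal nodes go to the left subtree
    return (random_tree(k), random_tree(n - 1 - k))

# Check E = I + 2n over a range of random trees.
for n in range(1, 30):
    t = random_tree(n)
    i, e = path_lengths(t)
    assert e == i + 2 * count_internal(t)
```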
Version: 1 Owner: Logan Author(s): Logan
50.16 internal node (of a tree)
An internal node of a tree is any node which has degree greater than one. Or, phrased in
rooted tree terminology, the internal nodes of a tree are the nodes which have at least one
child node.
•
• •
• •
•
Figure: A tree with internal nodes highlighted in red.
Version: 3 Owner: akrowne Author(s): akrowne
50.17 leaf node (of a tree)
A leaf of a tree is any node which has degree of exactly 1. Put another way, a leaf node of
a rooted tree is any node which has no child nodes.
•
• •
• •
•
Figure: A tree with leaf nodes highlighted in red.
Version: 2 Owner: akrowne Author(s): akrowne
50.18 parent node (in a tree)
A parent node $P$ of a node $C$ in a tree is the first node which lies along the path from $C$
to the root of the tree, $R$.

Drawn in the canonical root-at-top manner, the parent node of a node $C$ in a tree is simply
the node immediately above $C$ which is connected to it.
•
• •
• •
•
Figure: A node (blue) and its parent (red.)
Version: 2 Owner: akrowne Author(s): akrowne
50.19 proof that ω has the tree property
Let $T$ be a tree with finite levels and an infinite number of elements. Then consider the
elements of $T_0$. $T$ can be partitioned into the sets of descendants of each of these elements,
and since any finite partition of an infinite set has at least one infinite part, some element
$x_0$ in $T_0$ has an infinite number of descendants. The same procedure can be applied to the
children of $x_0$ to give an element $x_1 \in T_1$ which has an infinite number of descendants, and
then to the children of $x_1$, and so on. This gives a sequence $X = \langle x_0, x_1, \ldots \rangle$. The sequence
is infinite since each element has an infinite number of descendants, and since $x_{i+1}$ is always
a child of $x_i$, $X$ is a branch, and therefore an infinite branch of $T$.
Version: 2 Owner: Henry Author(s): Henry
50.20 root (of a tree)
The root of a tree is a placeholder node. It is typically drawn at the top of the page, with
the other nodes below (with all nodes having the same path distance from the root at the
same height.)
•
• •
• •
•
Figure: A tree with root highlighted in red.
Any tree can be redrawn this way, selecting any node as the root. This is important to
note: taken as a graph in general, the notion of “root” is meaningless. We introduce a root
explicitly when we begin speaking of a graph as a tree– there is nothing in general that
selects a root for us.
However, there are some special cases of trees where the root can be distinguished from the
other nodes implicitly due to the properties of the tree. For instance, a root is uniquely
identiﬁable in a complete binary tree, where it is the only node with degree two.
Version: 4 Owner: akrowne Author(s): akrowne
50.21 tree
Formally, a forest is an undirected, acyclic graph. A forest consists of trees, which are
themselves acyclic, connected graphs. For example, the following diagram represents a forest,
each connected component of which is a tree.
• • • • • •
• • • • •
All trees are forests, but not all forests are trees. As in a graph, a forest is made up of vertices
(which are often called nodes interchangeably) and edges. Like any graph, the vertices and
edges may each be labelled — that is, associated with some atom of data. Therefore a forest
or a tree is often used as a data structure.
Often a particular node of a tree is speciﬁed as the root. Such trees are typically drawn with
the root at the top of the diagram, with all other nodes depending down from it (however
this is not always the case). A tree where a root has been speciﬁed is called a rooted tree. A
tree where no root has been speciﬁed is called a free tree. When speaking of tree traversals,
and most especially of trees as datastructures, rooted trees are often implied.
The edges of a rooted tree are often treated as directed. In a rooted tree, every non-root
node has exactly one edge that leads toward the root. This edge can be thought of as connecting
each node to its parent. Often rooted trees are considered directed in the sense that all edges
connect parents to their children, but not vice-versa. Given this parent-child relationship, a
descendant of a node in a directed tree is defined as any other node reachable from that
node (that is, a node's children and all their descendants).
Given this directed notion of a rooted tree, a rooted subtree can be deﬁned as any node
of a tree and all of its descendants. This notion of a rooted subtree is very useful in dealing
with trees inductively and deﬁning certain algorithms inductively.
Because of their simple structure and unique properties, trees and forests have many uses.
Because of the simple deﬁnition of various tree traversals, they are often used to store and
lookup data. Many algorithms are based upon trees, or depend upon a tree in some manner,
such as the heapsort algorithm or Huﬀman encoding. There are also a great many speciﬁc
forms and families of trees, each with its own constraints, strengths, and weaknesses.
Version: 6 Owner: Logan Author(s): Logan
50.22 weightbalanced binary trees are ultrametric
Let $X$ be the set of leaf nodes in a weight-balanced binary tree. Let the distance between
leaf nodes be identified with the weighted path length between them. We will show that this
distance metric on $X$ is ultrametric.

Before we begin, let the join of any two nodes $x, y$, denoted $x \vee y$, be defined as the node
$z$ which is the most immediate common ancestor of $x$ and $y$ (that is, the common ancestor
which is farthest from the root). Also, we are using weight-balanced in the sense that

• the weighted path length from the root to each leaf node is equal, and

• each subtree is weight-balanced, too.
Lemma: two properties of weightbalanced trees
Because the tree is weight-balanced, the distances between any node and each of the leaf
node descendents of that node are equal. So, for any leaf nodes $x, y$,
$$ d(x, x \vee y) = d(y, x \vee y) \qquad (50.22.1) $$
Hence,
$$ d(x, y) = d(x, x \vee y) + d(y, x \vee y) = 2 \cdot d(x, x \vee y) \qquad (50.22.2) $$
Back to the main proof
We will now show that the ultrametric three point condition holds for any three leaf nodes
in a weight-balanced binary tree.
Consider any three points a, b, c in a weight-balanced binary tree. If d(a, b) = d(b, c) = d(a, c),
then the three point condition holds. Now assume this is not the case. Without loss of
generality, assume that d(a, b) < d(a, c).
Applying Eqn. 50.22.2,

2 · d(a, a ∨ b) < 2 · d(a, a ∨ c)
d(a, a ∨ b) < d(a, a ∨ c)

Note that both a ∨ b and a ∨ c are ancestors of a. Hence, a ∨ c is a more distant ancestor of
a, and so a ∨ c must be an ancestor of a ∨ b.
Now, consider the path between b and c: to get from b to c is to go from b up to a ∨ b, then
up to a ∨ c, and then down to c. Since this is a tree, this is the only path. The highest node
in this path (the ancestor of both b and c) is a ∨ c, so the distance d(b, c) = 2 · d(b, a ∨ c).
But by Eqn. 50.22.1 and Eqn. 50.22.2 (noting that b is a descendant of a ∨ c), we have

d(b, c) = 2 · d(b, a ∨ c) = 2 · d(a, a ∨ c) = d(a, c)

To summarize, we have d(a, b) < d(b, c) = d(a, c), which is the desired ultrametric three
point condition. So we are done.
Note that this means that, if a, b are leaf nodes, and you are at a node outside the subtree
under a ∨ b, then d(you, a) = d(you, b). In other words (from the point of view of distance
between you and them), the structure of any subtree that is not your own doesn't matter to
you. This is expressed in the three point condition as "if two points are closer to each other
than they are to you, then their distances to you are equal".
(Above, we have only proved this if you are at a leaf node, but it works for any node which is
outside the subtree under a ∨ b, because the paths to a and b must both pass through a ∨ b.)
Version: 2 Owner: bshanks Author(s): bshanks
50.23 weighted path length
Given an extended binary tree T (that is, simply any complete binary tree, where leaves are
denoted as external nodes), associate weights with each external node. The weighted
path length of T is the sum of the product of the weight and path length of each external
node, over all external nodes.
Another formulation is that the weighted path length is

∑_j w_j l_j

over all external nodes j, where w_j is the weight of an external node j, and l_j is the
distance from the root of the tree to j.
If w_j = 1 for all j, then the weighted path length is exactly the same as the external path
length.
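The definition can be exercised directly. A minimal Python sketch, where an extended binary tree is encoded (our own illustrative convention, not part of the entry) as nested pairs whose leaves are the external-node weights:

```python
# Weighted path length of an extended binary tree, encoded with nested
# tuples: a leaf (external node) is its weight, an internal node is a pair.

def weighted_path_length(tree, depth=0):
    if isinstance(tree, tuple):          # internal node: recurse into children
        left, right = tree
        return (weighted_path_length(left, depth + 1)
                + weighted_path_length(right, depth + 1))
    return tree * depth                  # external node: weight times path length

# Example: four leaves with weights 1, 2, 3, 4, all at depth 2.
t = ((1, 2), (3, 4))
print(weighted_path_length(t))           # (1 + 2 + 3 + 4) * 2 = 20
```

With all weights set to 1 the same function returns the external path length, matching the remark above.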
Example
Let T be the following extended binary tree. Square nodes are external nodes, and circular
nodes are internal nodes. Values in external nodes indicate weights, which are given in this
problem, while values in internal nodes represent the weighted path length of subtrees rooted
at those nodes, and are calculated from the given weights and the given tree. The weight of
the tree as a whole is given at the root of the tree.
This tree happens to give the minimum weighted path length for this particular set of
weights.
Version: 1 Owner: Logan Author(s): Logan
Chapter 51
05C10 – Topological graph theory,
imbedding
51.1 Heawood number
The Heawood number of a surface is the maximal number of colors needed to color any graph
embedded in the surface. For example, the four-color conjecture states that the Heawood
number of the sphere is four.
In 1890 Heawood proved for all surfaces except the sphere that the Heawood number satisfies

H(S) ≤ ⌊(7 + √(49 − 24 e(S))) / 2⌋,

where e(S) is the Euler characteristic of the surface.
Later it was proved in the works of Franklin, Ringel and Youngs that

H(S) ≥ ⌊(7 + √(49 − 24 e(S))) / 2⌋.
For example, the complete graph on 7 vertices can be embedded in torus as follows:
(figure: an embedding of K_7 in the torus, drawn as a rectangle with opposite sides
identified; the vertices are labelled 1 through 7.)
REFERENCES
1. Béla Bollobás. Graph Theory: An Introductory Course, volume 63 of GTM. Springer-Verlag,
1979. Zbl 0411.05032.
2. Thomas L. Saaty and Paul C. Kainen. The Four-Color Problem: Assaults and Conquest. Dover,
1986. Zbl 0463.05041.
Version: 6 Owner: bbukh Author(s): bbukh
51.2 Kuratowski’s theorem
A finite graph is planar if and only if it contains no subgraph that is isomorphic to, or is
a subdivision of, K_5 or K_{3,3}, where K_5 is the complete graph of order 5 and K_{3,3} is the
complete bipartite graph of order 6. Wagner's theorem is an equivalent later result.
REFERENCES
1. Kazimierz Kuratowski. Sur le problème des courbes gauches en topologie. Fund. Math., 15:271–
283, 1930.
Version: 7 Owner: bbukh Author(s): bbukh, digitalis
51.3 Szemerédi–Trotter theorem
The number of incidences of a set of n points and a set of m lines in the real plane ℝ² is

I = O(n + m + (nm)^(2/3)).

Proof. Let's consider the points as vertices of a graph, and connect two vertices by an edge
if they are adjacent on some line. Then the number of edges is e = I − m. If e < 4n then
we are done. If e ≥ 4n then by the crossing lemma

m² ≥ cr(G) ≥ (1/64) · (I − m)³ / n²,

and the theorem follows.
Recently, Tóth [1] extended the theorem to the complex plane ℂ². The proof is difficult.
REFERENCES
1. Csaba D. Tóth. The Szemerédi–Trotter theorem in the complex plane. arXiv:CO/0305283, May
2003.
Version: 3 Owner: bbukh Author(s): bbukh
51.4 crossing lemma
The crossing number of a graph G with n vertices and m ≥ 4n edges satisfies

cr(G) ≥ (1/64) · m³ / n².
Version: 1 Owner: bbukh Author(s): bbukh
51.5 crossing number
The crossing number cr(G) of a graph G is the minimal number of crossings among all
embeddings of G in the plane.
Version: 1 Owner: bbukh Author(s): bbukh
51.6 graph topology
A graph (V, E) is identified by its vertices V = {v_1, v_2, . . .} and its edges
E = {{v_i, v_j}, {v_k, v_l}, . . .}.
A graph also admits a natural topology, called the graph topology, by identifying every
edge {v_i, v_j} with the unit interval I = [0, 1] and gluing them together at coincident vertices.
This construction can be easily realized in the framework of simplicial complexes. We can
form a simplicial complex G = {{v} | v ∈ V} ∪ E. And the desired topological realization
of the graph is just the geometric realization |G| of G.
Viewing a graph as a topological space has several advantages:
• The notion of graph isomorphism simply becomes that of homeomorphism.
• The notion of a connected graph coincides with topological connectedness.
• A connected graph is a tree iff its fundamental group is trivial.
Version: 3 Owner: igor Author(s): igor
51.7 planar graph
A planar graph is a graph which can be drawn on a plane (a flat 2-d surface) with no edge
crossings.
No complete graphs above K_4 are planar. K_4, drawn without crossings, looks like:

(figure: a planar drawing of K_4.)

Hence it is planar (try this for K_5).
Version: 3 Owner: akrowne Author(s): akrowne
51.8 proof of crossing lemma
Euler's formula implies the linear lower bound cr(G) ≥ m − 3n + 6, and so it cannot be used
directly. What we need is to consider the subgraphs of our graph, apply Euler's formula to
them, and then combine the estimates. The probabilistic method provides a natural way to
do that.
Consider a minimal embedding of G. Choose independently every vertex of G with probability
p. Let G_p be the graph induced by those vertices. By Euler's formula, cr(G_p) − m_p + 3n_p ≥ 0.
The expectation is clearly

E(cr(G_p) − m_p + 3n_p) ≥ 0.

Since E(n_p) = pn, E(m_p) = p²m and E(cr(G_p)) = p⁴ cr(G), we get an inequality that bounds
the crossing number of G from below,

cr(G) ≥ p⁻² m − 3 p⁻³ n.

Now set p = 4n/m (which is at most 1 since m ≥ 4n), and the inequality becomes

cr(G) ≥ (1/64) · m³ / n².

Similarly, if m ≥ (9/2) n, then we can set p = 9n/(2m) to get

cr(G) ≥ (4/243) · m³ / n².
REFERENCES
1. Martin Aigner and Günter M. Ziegler. Proofs from THE BOOK. Springer, 1999.
Version: 2 Owner: bbukh Author(s): bbukh
Chapter 52
05C12 – Distance in graphs
52.1 Hamming distance
In comparing two bit patterns, the Hamming distance is the count of bits that differ between
the two patterns. More generally, if two ordered lists of items are compared, the Hamming
distance is the number of items that do not identically agree. This distance is applicable to
encoded information, and is a particularly simple metric of comparison, often more useful
than the city-block distance or Euclidean distance.
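The count of disagreeing positions can be computed in one pass. A minimal Python sketch (the function name is our own, not a library API):

```python
# Hamming distance between two equal-length sequences: the number of
# positions at which the items differ.

def hamming(a, b):
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

print(hamming("10110", "11100"))  # bits differ at positions 1 and 3 -> 2
```

The same function works on bit strings, lists, or any paired sequences, matching the "ordered lists of items" generalization above.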
References
• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 5 Owner: akrowne Author(s): akrowne
Chapter 53
05C15 – Coloring of graphs and
hypergraphs
53.1 bipartite graph
A bipartite graph is a graph with a chromatic number of 2.
The following graph, for example, is bipartite:

(figure: a bipartite graph on eight vertices.)
One way to think of a bipartite graph is by partitioning the vertices into two disjoint sets
where vertices in one set are adjacent only to vertices in the other set. In the above graph,
this may be more obvious with a diﬀerent representation:
(figure: the same graph redrawn with its two parts as two columns of vertices.)
The two subsets are the two columns of vertices, all of which have the same colour.
A graph is bipartite if and only if all its cycles have even length. This is easy to see intuitively:
any path of odd length on a bipartite graph must end on a vertex of the opposite colour from
the beginning vertex and hence cannot be a cycle.
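The even-cycle characterization suggests a practical test: try to 2-colour the graph by breadth-first search and watch for a conflict. A sketch in Python (the adjacency-dict encoding is our own illustrative choice):

```python
# BFS 2-colouring: a graph is bipartite iff this colouring never forces two
# adjacent vertices to share a colour (equivalently, iff all cycles are even).
from collections import deque

def is_bipartite(adj):
    colour = {}
    for start in adj:                       # handle disconnected graphs too
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]   # opposite colour
                    queue.append(v)
                elif colour[v] == colour[u]:    # odd cycle found
                    return False
    return True

square = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
print(is_bipartite(square), is_bipartite(triangle))  # True False
```

The 4-cycle (even) passes; the triangle (odd cycle) fails, as the characterization predicts.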
Version: 5 Owner: vampyr Author(s): vampyr
53.2 chromatic number
The chromatic number of a graph is the minimum number of colours required to colour
it.
Consider the following graph:

(figure: a graph on six vertices, coloured with 3 colours.)

This graph has been coloured using 3 colours. Furthermore, it's clear that it cannot be
coloured with fewer than 3 colours: it contains a subgraph that is isomorphic
to the complete graph on 3 vertices. As a result, the chromatic number of this graph is indeed
3.
This example was easy to solve by inspection. In general, however, finding the chromatic
number of a large graph (and, similarly, an optimal colouring) is a very difficult (NP-hard)
problem.
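For small graphs, the chromatic number can still be found by exhaustive search. A brute-force Python sketch (exponential time, as expected for an NP-hard problem; the encoding is our own):

```python
# Brute-force chromatic number: try colourings with 1, 2, 3, ... colours
# until a proper colouring is found.  Only feasible for small graphs.
from itertools import product

def chromatic_number(vertices, edges):
    for k in range(1, len(vertices) + 1):
        for colouring in product(range(k), repeat=len(vertices)):
            col = dict(zip(vertices, colouring))
            if all(col[u] != col[v] for u, v in edges):
                return k     # first k admitting a proper colouring
    return 0                 # empty vertex set

# A triangle needs 3 colours; a 4-cycle needs only 2.
print(chromatic_number("ABC", [("A", "B"), ("B", "C"), ("A", "C")]))                 # 3
print(chromatic_number("ABCD", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]))    # 2
```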
Version: 2 Owner: vampyr Author(s): vampyr
53.3 chromatic number and girth
A famous theorem of P. Erdős.¹

Theorem 6. For any natural numbers k and g, there exists a graph G with chromatic number
χ(G) ≥ k and girth girth(G) ≥ g.

Obviously, we can easily have graphs with high chromatic numbers. For instance, the
complete graph K_n trivially has χ(K_n) = n; however girth(K_n) = 3 (for n ≥ 3). And
the cycle graph C_n has girth(C_n) = n, but

χ(C_n) = 1 if n = 1; 2 if n is even; 3 otherwise.

It seems intuitively plausible that a high chromatic number occurs because of short, "local"
cycles in the graph; it is hard to envisage how a graph with no short cycles can still have
high chromatic number.
Instead of envisaging, Erdős' proof shows that, in some appropriately chosen probability space
on graphs with n vertices, the probability of choosing a graph which does not have χ(G) ≥ k
and girth(G) ≥ g tends to zero as n grows. In particular, the desired graphs exist.
This seminal paper is probably the most famous application of the probabilistic method, and
is regarded by some as the foundation of the method.² Today the probabilistic method is
a standard tool for combinatorics. More constructive methods are often preferred, but are
almost always much harder.

¹ See the very readable P. Erdős, Graph theory and probability, Canad. J. Math. 11 (1959), 34–38.
² However, as always, with the benefit of hindsight we can see that the probabilistic method had been used
before, e.g. in various applications of Sard's theorem. This does nothing to diminish the importance
of the clear statement of the tool.

Version: 3 Owner: ariels Author(s): ariels

53.4 chromatic polynomial
Let G be a graph (in the sense of graph theory) whose set V of vertices is finite and nonempty,
and which has no loops or multiple edges. For any natural number x, let χ(G, x), or just χ(x),
denote the number of x-colorations of G, i.e. the number of mappings f : V → {1, 2, . . . , x}
such that f(a) ≠ f(b) for any pair (a, b) of adjacent vertices. Let us prove that χ (which
is called the chromatic polynomial of the graph G) is a polynomial function in x with
coefficients in ℤ. Write E for the set of edges in G. If |E| = 0, then trivially χ(x) = x^|V|
(where | | denotes the number of elements of a finite set). If not, then we choose an edge e
and construct two graphs having fewer edges than G: H is obtained from G by contracting
the edge e, and K is obtained from G by omitting the edge e. We have

χ(G, x) = χ(K, x) − χ(H, x) (53.4.1)

for all x ∈ ℕ, because the polynomial χ(K, x) is the number of colorations of the vertices of
G which might or might not be valid for the edge e, while χ(H, x) is the number which are
not valid. By induction on |E|, (53.4.1) shows that χ(G, x) is a polynomial over ℤ.
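The deletion–contraction recurrence can be evaluated directly. A Python sketch (exponential time; the list-based graph encoding is our own illustration, not part of the entry):

```python
# Deletion-contraction evaluation of the chromatic polynomial chi(G, x):
# chi(G, x) = chi(G - e, x) - chi(G / e, x), with chi(x) = x^|V| when there
# are no edges.

def chi(vertices, edges, x):
    if not edges:
        return x ** len(vertices)
    (a, b), rest = edges[0], edges[1:]
    deleted = chi(vertices, rest, x)           # omit the edge (a, b)
    merged = [v for v in vertices if v != b]   # contract b into a
    contracted = []
    for (u, v) in rest:
        u, v = (a if u == b else u), (a if v == b else v)
        # drop loops and keep the contracted graph simple
        if u != v and (u, v) not in contracted and (v, u) not in contracted:
            contracted.append((u, v))
    return deleted - chi(merged, contracted, x)

# Triangle: chi(x) = x(x-1)(x-2), so chi(3) = 6 and chi(4) = 24.
print(chi(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")], 3))  # 6
```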
By refining the argument a little, one can show

χ(x) = x^|V| − |E| x^(|V|−1) + · · · ± m x^k,

for some nonzero integer m, where k is the number of connected components of G, and the
coefficients alternate in sign.
With the help of the Möbius–Rota inversion formula (see Moebius inversion), or directly by
induction, one can prove

χ(x) = ∑_{F ⊆ E} (−1)^|F| x^(|V| − r(F))

where the sum is over all subsets F of E, and r(F) denotes the rank of F in G, i.e. the
number of elements of any maximal cycle-free subset of F. (Alternatively, the sum may be
taken only over subsets F such that F is equal to the span of F; all other summands cancel
out in pairs.)
The chromatic number of G is the smallest x > 0 such that χ(G, x) > 0 or, equivalently,
such that χ(G, x) ≠ 0.
The Tutte polynomial of a graph, or more generally of a matroid (E, r), is this function
of two variables:

t(x, y) = ∑_{F ⊆ E} (x − 1)^(r(E) − r(F)) (y − 1)^(|F| − r(F)).

Compared to the chromatic polynomial, the Tutte polynomial contains more information about the
matroid. Still, two or more nonisomorphic matroids may have the same Tutte polynomial.
Version: 5 Owner: bbukh Author(s): bbukh, Larry Hammick
53.5 colouring problem
The colouring problem is to assign a colour to every vertex of a graph such that no two
adjacent vertices have the same colour. These colours, of course, are not necessarily colours
in the optic sense.
Consider the following graph:

(figure: a graph on six vertices.)

One potential colouring of this graph is:

(figure: the same graph, properly coloured with three colours.)

A and C have the same colour; two other vertices share a second colour; and the remaining
two share a third.
Graph colouring problems have many applications in such situations as scheduling and
matching problems.
Version: 3 Owner: vampyr Author(s): vampyr
53.6 complete bipartite graph
The complete bipartite graph K_{n,m} is a graph with two sets of vertices, one with n
members and one with m, such that each vertex in one set is adjacent to every vertex in the
other set and to no vertex in its own set. As the name implies, K_{n,m} is bipartite.
Examples of complete bipartite graphs:

K_{2,5}:

(figure: the complete bipartite graph K_{2,5}.)

K_{3,3}:

(figure: the complete bipartite graph K_{3,3}.)
Version: 3 Owner: vampyr Author(s): vampyr
53.7 complete kpartite graph
The complete k-partite graph K_{a_1, a_2, ..., a_k} is a k-partite graph with a_1, a_2, . . . , a_k
vertices of each colour wherein every vertex is adjacent to every other vertex with a different
colour and to no vertices with the same colour.
For example, the 3-partite complete graph K_{2,3,4}:

(figure: the complete 3-partite graph K_{2,3,4}.)
Version: 3 Owner: vampyr Author(s): vampyr
53.8 four-color conjecture
The four-color conjecture was a long-standing problem posed by Guthrie while coloring a
map of England. The conjecture states that every map on a plane or a sphere can be colored
using only four colors such that no two adjacent countries are assigned the same color. This
is equivalent to the statement that the chromatic number of every planar graph is no more
than four. After many unsuccessful attempts, the conjecture was proven by Appel and Haken
in 1976 with the aid of a computer.
Interestingly, the seemingly harder problem of determining the maximal number of colors
needed for all surfaces other than the sphere was solved long before the four-color conjecture
was settled. This number is now called the Heawood number of the surface.
REFERENCES
1. Thomas L. Saaty and Paul C. Kainen. The Four-Color Problem: Assaults and Conquest. Dover,
1986.
Version: 5 Owner: bbukh Author(s): bbukh
53.9 kpartite graph
A k-partite graph is a graph with a chromatic number of k.
An alternate definition of a k-partite graph is a graph where the vertices are partitioned into
k subsets with the following conditions:
1. No two vertices in the same subset are adjacent.
2. There is no partition of the vertices with fewer than k subsets where condition 1 holds.
These two definitions are equivalent. Informally, we see that a colour can be assigned to all
the vertices in each subset, since they are not adjacent to one another. Furthermore, this is
also an optimal colouring, since the second condition holds.
An example of a 4-partite graph:

(figure: a 4-partite graph.)

A 2-partite graph is also called a bipartite graph.
Version: 5 Owner: vampyr Author(s): vampyr
53.10 property B
A hypergraph G is said to possess property B if it is 2-colorable, i.e., its vertices can be colored
in two colors so that no edge of G is monochromatic.
The property was named after Felix Bernstein by E. W. Miller.
Version: 1 Owner: bbukh Author(s): bbukh
Chapter 54
05C20 – Directed graphs (digraphs),
tournaments
54.1 cut
On a digraph, define a sink to be a vertex with out-degree zero and a source to be a vertex
with in-degree zero. Let G be a digraph with nonnegative weights and with exactly one
sink and exactly one source. A cut C on G is a subset of the edges such that every path
from the source to the sink passes through an edge in C. In other words, if we remove every
edge in C from the graph, there is no longer a path from the source to the sink.
Define the weight of C as

W_C = ∑_{e ∈ C} W(e)

where W(e) is the weight of the edge e.
Observe that we may achieve a trivial cut by removing all the edges of G. Typically, we are
more interested in minimal cuts, where the weight of the cut is minimized for a particular
graph.
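Both checking the cut property and computing a cut's weight are mechanical. A Python sketch, under an illustrative edge-list encoding of a weighted digraph (the names and layout are our own assumptions):

```python
# A cut is a set of edges whose removal leaves no source-to-sink path;
# its weight is the sum of the weights of its edges.

def is_cut(edges, cut, source, sink):
    remaining = [(u, v) for (u, v, w) in edges if (u, v) not in cut]
    reachable, stack = {source}, [source]
    while stack:                        # DFS along the remaining directed edges
        u = stack.pop()
        for (a, b) in remaining:
            if a == u and b not in reachable:
                reachable.add(b)
                stack.append(b)
    return sink not in reachable        # no path from source to sink is left

def cut_weight(edges, cut):
    return sum(w for (u, v, w) in edges if (u, v) in cut)

g = [("s", "a", 3), ("s", "b", 2), ("a", "t", 1), ("b", "t", 4)]
c = {("a", "t"), ("b", "t")}
print(is_cut(g, c, "s", "t"), cut_weight(g, c))  # True 5
```

Removing only one of the two sink edges leaves a path to the sink, so that smaller set is not a cut.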
Version: 2 Owner: vampyr Author(s): vampyr
54.2 de Bruijn digraph
The vertices of the de Bruijn digraph B(n, m) are all possible words of length m − 1 chosen
from an alphabet of size n.
B(n, m) has n^m edges, consisting of each possible word of length m from an alphabet of size
n. The edge a_1 a_2 . . . a_m connects the vertex a_1 a_2 . . . a_{m−1} to the vertex a_2 a_3 . . . a_m.
For example, B(2, 4) could be drawn as:

(figure: the de Bruijn digraph B(2, 4); its vertices are the eight 3-bit words and its edges
the sixteen 4-bit words.)
Notice that an Euler cycle on B(n, m) represents a shortest sequence of characters from an
alphabet of size n that includes every possible subsequence of m characters. For example,
the sequence 000011110010101000 includes all 4-bit subsequences. Any de Bruijn digraph
must have an Euler cycle, since each vertex has in-degree and out-degree of n.
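The construction and the degree condition can be checked directly. A Python sketch (our own encoding: words as tuples over the alphabet {0, . . . , n−1}):

```python
# Building the de Bruijn digraph B(n, m) and verifying the property that
# guarantees an Euler cycle: every vertex has in-degree and out-degree n.
from itertools import product

def de_bruijn_edges(n, m):
    """Edges of B(n, m): each word of length m joins its prefix to its suffix."""
    alphabet = range(n)
    return [(word[:-1], word[1:]) for word in product(alphabet, repeat=m)]

edges = de_bruijn_edges(2, 4)
print(len(edges))                      # n^m = 16 edges

out_deg, in_deg = {}, {}
for u, v in edges:
    out_deg[u] = out_deg.get(u, 0) + 1
    in_deg[v] = in_deg.get(v, 0) + 1
print(all(out_deg[v] == 2 and in_deg[v] == 2 for v in out_deg))  # True
```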
Version: 3 Owner: vampyr Author(s): vampyr
54.3 directed graph
A directed graph or digraph is a pair (V, E) where V is a set of vertices and E is a
subset of V × V called edges or arcs.
If E is symmetric (i.e., (u, v) ∈ E if and only if (v, u) ∈ E), then the digraph is isomorphic
to an ordinary (that is, undirected) graph.
Digraphs are generally drawn in a similar manner to graphs, with arrows on the edges to
indicate a sense of direction. For example, the digraph

({a, b, c, d}, {(a, b), (b, d), (b, c), (c, b), (c, a), (c, d)})

may be drawn as

(figure: a drawing of this digraph on the vertices a, b, c, d.)
Version: 2 Owner: vampyr Author(s): vampyr
54.4 ﬂow
On a digraph, define a sink to be a vertex with out-degree zero and a source to be a vertex
with in-degree zero. Let G be a digraph with nonnegative weights and with exactly one
sink and exactly one source. A flow on G is an assignment f : E(G) → ℝ of values to each
edge of G satisfying certain rules:
1. For any edge e, we must have 0 ≤ f(e) ≤ W(e) (where W(e) is the weight of e).
2. For any vertex v, excluding the source and the sink, let E_in be the set of edges incident
to v and let E_out be the set of edges incident from v. Then we must have

∑_{e ∈ E_in} f(e) = ∑_{e ∈ E_out} f(e).

Let E_source be the set of edges incident from the source, and let E_sink be the set of edges
incident to the sink. If f is a flow, then

∑_{e ∈ E_sink} f(e) = ∑_{e ∈ E_source} f(e).
We will refer to this quantity as the amount of ﬂow.
Note that the flow given by f(e) = 0 for every edge trivially satisfies these conditions. We are
typically more interested in maximum flows, where the amount of flow is maximized for a
particular graph.
We may interpret a ﬂow as a means of transmitting something through a network. Suppose
we think of the edges in a graph as pipes, with the weights corresponding with the capacities
of the pipes; we are pouring water into the system through the source and draining it through
the sink. Then the ﬁrst rule requires that we do not pump more water through a pipe than
is possible, and the second rule requires that any water entering a junction of pipes must
leave. Under this interpretation, the maximum amount of ﬂow corresponds to the maximum
amount of water we could pump through this network.
Instead of water in pipes, one may think of electric charge in a network of conductors. Rule
(2) above is one of Kirchhoff's two laws for such networks; the other says that the sum of the
voltage drops around any circuit is zero.
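The two flow rules can be verified mechanically. A Python sketch, under an illustrative dict-based encoding of the weights and the flow (the names here are our own assumptions):

```python
# Checking the two flow conditions: capacity on every edge, and
# conservation at every vertex other than the source and the sink.

def is_flow(weights, flow, source, sink):
    # Rule 1: 0 <= f(e) <= W(e) on every edge.
    if any(not (0 <= flow[e] <= weights[e]) for e in weights):
        return False
    # Rule 2: inflow equals outflow at every intermediate vertex.
    vertices = {u for u, v in weights} | {v for u, v in weights}
    for x in vertices - {source, sink}:
        inflow = sum(flow[(u, v)] for (u, v) in weights if v == x)
        outflow = sum(flow[(u, v)] for (u, v) in weights if u == x)
        if inflow != outflow:
            return False
    return True

W = {("s", "a"): 3, ("a", "t"): 2, ("s", "b"): 1, ("b", "t"): 2}
f = {("s", "a"): 2, ("a", "t"): 2, ("s", "b"): 1, ("b", "t"): 1}
print(is_flow(W, f, "s", "t"))  # True
```

Breaking conservation at a single vertex (say, lowering the flow on one outgoing edge) makes the check fail.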
Version: 3 Owner: nobody Author(s): Larry Hammick, vampyr
54.5 maximum ﬂow/minimum cut theorem
Let G be a finite digraph with nonnegative weights and with exactly one sink and exactly
one source. Then
I) For any flow f on G and any cut C of G, the amount of flow for f is less than or equal to
the weight of C.
II) There exists a flow f_0 on G and a cut C_0 of G such that the flow of f_0 equals the weight
of C_0.
Proof: (I) is easy, so we prove only (II). Write ℝ for the set of nonnegative real numbers.
Let V be the set of vertices of G. Define a matrix

κ : V × V → ℝ

where κ(x, y) is the sum of the weights (or capacities) of all the directed edges from x to y.
By hypothesis there is a unique v ∈ V (the source) such that

κ(x, v) = 0 ∀x ∈ V

and a unique w ∈ V (the sink) such that

κ(w, x) = 0 ∀x ∈ V.

We may also assume κ(x, x) = 0 for all x ∈ V. Any flow f will correspond uniquely (see
Remark below) to a matrix

ϕ : V × V → ℝ

such that

ϕ(x, y) ≤ κ(x, y) ∀x, y ∈ V
∑_z ϕ(x, z) = ∑_z ϕ(z, x) ∀x ≠ v, w.

Let λ be the matrix of some maximal flow, and let A be the set of x ∈ V such that there
exists a finite sequence x_0 = v, x_1, . . . , x_n = x such that for all m from 0 to n − 1, we have
either

λ(x_m, x_{m+1}) < κ(x_m, x_{m+1}) (54.5.1)
or

λ(x_{m+1}, x_m) > 0. (54.5.2)

Write B = V − A.
Trivially, v ∈ A. Let us show that w ∈ B. Arguing by contradiction, suppose w ∈ A, and let
(x_m) be a sequence from v to w with the properties we just mentioned. Take a real number
ε > 0 such that

ε + λ(x_m, x_{m+1}) ≤ κ(x_m, x_{m+1})

for all the (finitely many) m for which (54.5.1) holds, and such that

λ(x_{m+1}, x_m) ≥ ε

for all the m for which (54.5.2) holds. But now we can define a matrix μ with a larger flow
than λ (larger by ε) by:

μ(x_m, x_{m+1}) = ε + λ(x_m, x_{m+1}) if (54.5.1) holds
μ(x_{m+1}, x_m) = λ(x_{m+1}, x_m) − ε if (54.5.2) holds
μ(a, b) = λ(a, b) for all other pairs (a, b).

This contradiction shows that w ∈ B.
Now consider the set C of pairs (x, y) of vertices such that x ∈ A and y ∈ B. Since B is
nonempty, C is a cut. But also, for any (x, y) ∈ C we have

λ(x, y) = κ(x, y) (54.5.3)

for otherwise we would have y ∈ A. Summing (54.5.3) over C, we see that the amount of
the flow λ is the capacity of C, QED.
Remark: We expressed the proof rather informally, because the terminology of graph theory
is not very well standardized and cannot all be found yet here at PlanetMath. Please feel
free to suggest any revision you think worthwhile.
Version: 5 Owner: bbukh Author(s): Larry Hammick, vampyr
54.6 tournament
A tournament is a directed graph obtained by choosing a direction for each edge in an
undirected complete graph. For example, here is a tournament on 4 vertices:
(figure: a tournament on the vertices 1, 2, 3, 4.)
Any tournament on a finite number n of vertices contains a Hamiltonian path, i.e., a directed
path on all n vertices. This is easily shown by induction on n: suppose that the statement
holds for n, and consider any tournament T on n + 1 vertices. Choose a vertex v_0 of T and
consider a directed path v_1, v_2, . . . , v_n in T \ {v_0}. Now let i ∈ {0, . . . , n} be maximal such
that v_j → v_0 for all j with 1 ≤ j ≤ i. Then

v_1, . . . , v_i, v_0, v_{i+1}, . . . , v_n

is a directed path as desired.
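The induction above is, in effect, an insertion algorithm: add vertices one at a time, splicing each new vertex into the first position it can occupy. A Python sketch (the `beats` encoding of a tournament is our own illustration):

```python
# The insertion argument run directly: given a tournament as a "beats"
# relation, build a Hamiltonian directed path one vertex at a time.

def hamiltonian_path(vertices, beats):
    """beats(u, v) is True iff the edge points u -> v."""
    path = []
    for v0 in vertices:
        # find the largest i such that every path vertex before i beats v0
        i = 0
        while i < len(path) and beats(path[i], v0):
            i += 1
        path.insert(i, v0)   # v0 beats path[i] (if any), path[i-1] beats v0
    return path

# Example tournament on {1, 2, 3, 4}: u -> v iff u < v, except 4 -> 1.
def beats(u, v):
    return (u, v) == (4, 1) or ((u, v) != (1, 4) and u < v)

p = hamiltonian_path([1, 2, 3, 4], beats)
print(p)                                             # [4, 1, 2, 3]
print(all(beats(p[k], p[k + 1]) for k in range(3)))  # True: a directed path
```

Note that the resulting path need not start at an overall "winner"; in this example no player beats everyone.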
The name "tournament" originates from such a graph's interpretation as the outcome of
some sports competition in which every player encounters every other player exactly once,
and in which no draws occur; let us say that an arrow points from the winner to the loser.
A player who wins all games would naturally be the tournament's winner. However, as the
above example shows, there might not be such a player; a tournament for which there isn't
one is called a 1-paradoxical tournament. More generally, a tournament T = (V, E) is called
k-paradoxical if for every k-subset V′ of V there is a v_0 ∈ V \ V′ such that v_0 → v for all
v ∈ V′. By means of the probabilistic method, Erdős showed that if |V| is sufficiently large,
then almost every tournament on V is k-paradoxical.
Version: 3 Owner: bbukh Author(s): bbukh, draisma
Chapter 55
05C25 – Graphs and groups
55.1 Cayley graph
Let G = ⟨X | R⟩ be a presentation of the finitely generated group G with generators X and
relations R. We define the Cayley graph Γ = Γ(G, X) of G with generators X as

Γ = (G, E),

where

E = {{g, xg} | g ∈ G, x ∈ X}.

That is, the vertices of the Cayley graph are precisely the elements of G, and two elements
of G are connected by an edge iff some generator in X transfers the one to the other.
Examples
1. G = ℤ^d, with generators X = {e_1, . . . , e_d}, the standard basis vectors. Then Γ(G, X)
is the d-dimensional grid; confusingly, it too is often termed "ℤ^d".
2. G = F_d, the free group with the d generators X = {g_1, . . . , g_d}. Then Γ(G, X) is the
2d-regular tree.
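The definition can be exercised on a small finite group. A Python sketch for G = ℤ_n written additively (this cyclic-group case is our own illustration, not one of the entry's examples, but it follows the same rule: join g to x + g for each generator x):

```python
# Cayley graph of Z_n with a given generating set: vertices are 0..n-1,
# and each g is joined to (g + x) mod n for every generator x.

def cayley_graph_zn(n, generators):
    edges = set()
    for g in range(n):
        for x in generators:
            edges.add(frozenset({g, (g + x) % n}))   # undirected edge {g, x+g}
    return edges

# Z_6 with generator {1} gives the 6-cycle C_6.
print(len(cayley_graph_zn(6, [1])))   # 6
```

Changing the generating set changes the graph: ℤ_6 with generators {2, 3}, for instance, yields 9 edges rather than a cycle.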
Version: 2 Owner: ariels Author(s): ariels
Chapter 56
05C38 – Paths and cycles
56.1 Euler path
An Euler path along a connected graph with n vertices is a path connecting all n vertices,
and traversing every edge of the graph exactly once. Note that a vertex with an odd degree
allows one to traverse through it and return by another path at least once, while a vertex
with an even degree only allows a number of traversals through, but one cannot end an Euler
path at a vertex with even degree. Thus, a connected graph has an Euler path which is a
circuit (an Euler circuit) if all of its vertices have even degree. A connected graph has an
Euler path which is non-circuitous if it has exactly two vertices with odd degree.
This graph has an Euler path which is a circuit: all of its vertices are of even degree.
This graph has an Euler path which is not a circuit: it has exactly two vertices of odd degree.
Note that a graph must be connected to have an Euler path or circuit. A graph is connected
if every pair of vertices u and v has a path u, . . . , v between them.
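The degree criteria translate directly into a test. A Python sketch over an edge list (the graph is assumed connected, as the entry requires; connectivity is not re-checked here):

```python
# Classify a connected graph by the parity of its vertex degrees:
# 0 odd vertices -> Euler circuit, 2 odd vertices -> non-circuit Euler path.

def euler_classification(edges):
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    if odd == 0:
        return "Euler circuit"
    if odd == 2:
        return "Euler path (not a circuit)"
    return "no Euler path"

square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(euler_classification(square))                   # Euler circuit
print(euler_classification(square + [("A", "C")]))    # Euler path (not a circuit)
```

Adding one more chord so that four vertices have odd degree leaves no Euler path at all, which is exactly why the seven Königsberg bridges (next entry) admit no such walk.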
Version: 12 Owner: slider142 Author(s): slider142
56.2 Veblen’s theorem
The edge set of a graph can be partitioned into cycles if and only if every vertex has even
degree.
Version: 2 Owner: digitalis Author(s): digitalis
56.3 acyclic graph
Any graph that contains no cycles is an acyclic graph. A directed acyclic graph is often
called a DAG for short.
For example, the following graph and digraph are acyclic:

(figure: an acyclic graph and an acyclic digraph, each on three vertices.)

In contrast, the following graph and digraph are not acyclic, because each contains a cycle:

(figure: a graph and a digraph, each containing a cycle on three vertices.)
Version: 5 Owner: Logan Author(s): Logan
56.4 bridges of Königsberg
The bridges of Königsberg is a famous problem inspired by an actual place and situation.
The solution of the problem, put forth by Leonhard Euler in 1736, is the first work of
graph theory and is responsible for the foundation of the discipline.
The following figure shows a portion of the Prussian city of Königsberg. A river passes
through the city, and there are two islands in the river. Seven bridges cross between the
islands and the mainland:

Figure 1: Map of the Königsberg bridges.

The mathematical problem arose when citizens of Königsberg noticed that one could not
take a stroll across all seven bridges, returning to the starting point, without crossing at
least one bridge twice.
Answering the question of why this is the case required a mathematical theory that didn't
exist yet: graph theory. This was provided by Euler, in a paper which is still available today.
To solve the problem, we must translate it into a graph-theoretic representation. We model
the land masses, A, B, C and D, as vertices in a graph. The bridges between the land
masses become edges. This generates from the above picture the following graph:

Figure 2: Graph-theoretic representation of the Königsberg bridges.

At this point, we can apply what we know about Euler paths and Euler circuits. Since an
Euler circuit for a graph exists only if every vertex has an even degree, the Königsberg graph
must have no Euler circuit. Hence, we have explained why one cannot take a walk around
Königsberg and return to the starting point without crossing at least one bridge more than
once.
Version: 5 Owner: akrowne Author(s): akrowne
56.5 cycle
A cycle in a graph, digraph, or multigraph is a simple path from a vertex to itself (i.e., a
path where the first vertex is the same as the last vertex and no edge is repeated).
For example, consider this graph:

(figure: a graph on the vertices A, B, C, D.)

ABCDA and DBAD are two of the cycles in this graph. ABA is not a cycle, however, since
it uses the edge connecting A and B twice. ABCD is not a cycle because it begins on A but
ends on D.
A cycle of length n is sometimes denoted C_n and may be referred to as a polygon of n sides:
that is, C_3 is a triangle, C_4 is a quadrilateral, C_5 is a pentagon, etc.
An even cycle is one of even length; similarly, an odd cycle is one of odd length.
An even cycle is one of even length; similarly, an odd cycle is one of odd length.
Version: 4 Owner: vampyr Author(s): vampyr
56.6 girth
The girth of a graph G is the length of the shortest cycle in G.¹
For instance, the girth of any grid ℤ^d (where d ≥ 2) is 4, and the girth of the vertex graph
of the dodecahedron is 5.
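The girth of a small graph can be computed by the standard breadth-first-search method (our own illustration; the entry itself gives no algorithm): from each vertex, the first non-tree edge encountered bounds the shortest cycle through that root, and the minimum over all roots is exact for unweighted graphs.

```python
# Girth by BFS from every vertex: a non-tree edge (u, v) met during the
# search from a root closes a cycle of length at most dist[u] + dist[v] + 1.
from collections import deque

def girth(adj):
    best = float("inf")
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v], parent[v] = dist[u] + 1, u
                    queue.append(v)
                elif parent[u] != v:          # non-tree edge closes a cycle
                    best = min(best, dist[u] + dist[v] + 1)
    return best                               # inf for a forest (no cycles)

square = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
print(girth(square))  # 4
```

A forest returns infinity here, which sidesteps the (unimportant) question of its girth noted in the footnote.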
Version: 1 Owner: ariels Author(s): ariels
56.7 path
A path in a graph is a finite sequence of alternating vertices and edges, beginning and ending
with a vertex, v_1 e_1 v_2 e_2 v_3 . . . e_{n−1} v_n, such that every consecutive pair of vertices v_x
and v_{x+1} are adjacent and e_x is incident with v_x and with v_{x+1}. Typically, the edges may be
omitted when writing a path (e.g., v_1 v_2 v_3 . . . v_n), since only one edge of a graph may connect
two adjacent vertices. In a multigraph, however, the choice of edge may be significant.
The length of a path is the number of edges in it.
Consider the following graph:

(figure: a 4-cycle on the vertices A, B, C, D.)

Paths include (but are certainly not limited to) ABCD (length 3), ABCDA (length 4), and
ABABABABABCDA (length 12). ABD is not a path since B is not adjacent to D.
In a digraph, each consecutive pair of vertices must be connected by an edge with the proper
orientation; if e = (u, v) is an edge, but (v, u) is not, then uev is a valid path but veu is not.
Consider this digraph:

(figure: a digraph on the vertices G, H, J, K.)

GHKJ, GJ, and GHGHGH are all valid paths. GHJ is not a valid path because H and
J are not connected. GJK is not a valid path because the edge connecting K to J has the
opposite orientation.

¹ There is no widespread agreement on the girth of a forest, which has no cycles. It is also extremely
unimportant.
Version: 3 Owner: vampyr Author(s): vampyr
56.8 proof of Veblen’s theorem
The proof is very easy by induction on the number of elements of the set E of edges. If E is
empty, then all the vertices have degree zero, which is even. Suppose E is nonempty. If the
graph contains no cycle, then some vertex has degree 1, which is odd. Finally, if the graph
does contain a cycle C, then every vertex has the same degree mod 2 with respect to E − C
as it has with respect to E, and we can conclude by induction.
Version: 1 Owner: mathcam Author(s): Larry Hammick
Chapter 57
05C40 – Connectivity
57.1 k-connected graph

For k ∈ N, a graph G is k-connected iff G has more than k vertices and the graph left by
removing any fewer than k vertices is connected. The largest integer k such that G is k-connected
is called the connectivity of G and is denoted by κ(G).
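For small graphs, κ(G) can be computed straight from this definition by trying every small vertex set; a brute-force Python sketch (names illustrative, exponential in k):

```python
from itertools import combinations

def is_connected(vertices, adj):
    """Is the subgraph induced by `vertices` connected?"""
    if not vertices:
        return True
    seen, stack = set(), [next(iter(vertices))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(w for w in adj[v] if w in vertices)
    return seen == set(vertices)

def k_connected(vertices, adj, k):
    """More than k vertices, and removing any fewer than k leaves it connected."""
    return len(vertices) > k and all(
        is_connected(set(vertices) - set(rem), adj)
        for r in range(k)
        for rem in combinations(vertices, r))

def connectivity(vertices, adj):
    """kappa(G): the largest k for which G is k-connected."""
    k = 0
    while k_connected(vertices, adj, k + 1):
        k += 1
    return k

# kappa of the 4-cycle is 2; kappa of the complete graph K4 is 3.
c4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
k4 = {v: {1, 2, 3, 4} - {v} for v in (1, 2, 3, 4)}
```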
Version: 1 Owner: lieven Author(s): lieven
57.2 Thomassen’s theorem on 3connected graphs
Every 3connected graph G with more than 4 vertices has an edge c such that Gc is also
3connected.
Suppose such an edge doesn’t exist. Then, for every edge c = rn, the graph Gc isn’t
3connected and can be made disconnected by removing 2 vertices. Since κ(G) ` 3, our
contracted vertex ·
xy
has to be one of these two. So for every edge c, G has a vertex . = r. n
such that ¦·
xy
. .¦ separates Gc. Any 2 vertices separated by ¦·
xy
. .¦ in Gc are separated
in G by o := ¦r. n. .¦. Since the minimal size of a separating set is 3, every vertex in o has
an adjacent vertex in every component of G−o.
Now we choose the edge c, the vertex . and the component ( such that [([ is minimal. We
also choose a vertex · adjacent to . in (.
By construction G.· is not 3connected since removing rn disconnects ( − · from G.·.
So there is a vertex u such that ¦.. ·. u¦ separates G and as above every vertex in ¦.. ·. u¦
has an adjacent vertex in every component of G−¦.. ·. u¦. We now consider a component
1 of G−¦.. ·. u¦ that doesn’t contain r or n. Such a component exists since r and n belong
393
to the same component and G − ¦.. ·. u¦ isn’t connected. Any vertex adjacent to · in 1
is also an element of ( since · is an element of (. This means 1 is a proper subset of (
which contradicts our assumption that [([ was minimal.
Version: 2 Owner: lieven Author(s): lieven
57.3 Tutte’s wheel theorem
Every 3-connected simple graph can be constructed starting from a wheel graph by repeatedly
either adding an edge between two nonadjacent vertices or splitting a vertex.
Version: 1 Owner: lieven Author(s): lieven
57.4 connected graph
A connected graph is a graph such that there exists a path between all pairs of vertices.
If the graph is a directed graph, and there exists a path from each vertex to every other
vertex, then it is a strongly connected graph.
A connected component is a subset of vertices of any graph and any edges between them
that forms a connected graph. Similarly, a strongly connected component is a subset
of vertices of any digraph and any edges between them that forms a strongly connected
graph. Any graph or digraph is a union of connected or strongly connected components,
plus some edges to join the components together. Thus any graph can be decomposed
into its connected or strongly connected components. For instance, Tarjan’s algorithm can
decompose any digraph into its strongly connected components.
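Tarjan's algorithm runs in linear time; for illustration, the same decomposition can be obtained with Kosaraju's simpler two-pass method, sketched here in Python (representation and names illustrative):

```python
def strongly_connected_components(adj):
    """Kosaraju: order vertices by DFS finish time, then collect
    components with searches on the reverse graph."""
    order, seen = [], set()

    def dfs(v):
        seen.add(v)
        for w in adj[v]:
            if w not in seen:
                dfs(w)
        order.append(v)

    for v in adj:
        if v not in seen:
            dfs(v)
    # build the reverse graph
    radj = {v: set() for v in adj}
    for v in adj:
        for w in adj[v]:
            radj[w].add(v)
    comps, assigned = [], set()
    for v in reversed(order):
        if v in assigned:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in assigned:
                assigned.add(u)
                comp.add(u)
                stack.extend(radj[u] - assigned)
        comps.append(comp)
    return comps

# A digraph whose strongly connected components are {A, B} and {C}.
digraph = {"A": {"B"}, "B": {"A", "C"}, "C": set()}
comps = strongly_connected_components(digraph)
```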
For example, the following graph and digraph are connected and strongly connected, respectively.

[Figures omitted: a connected graph and a strongly connected digraph, each on six vertices A, B, C, D, E, F.]
On the other hand, the following graph is not connected, and consists of the union of two
connected components.

[Figure omitted: a graph on vertices A–F with two connected components.]
The following digraph is not strongly connected, because there is no way to reach F from the
other vertices, and there is no vertex reachable from C.

[Figure omitted: a digraph on vertices A–F whose three strongly connected components are
{A, B, D, E}, {C}, and {F}.]
Version: 3 Owner: Logan Author(s): Logan
57.5 cutvertex
A cutvertex of a graph G is a vertex whose deletion increases the number of components
of G. The edge analogue of a cutvertex is a bridge.
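This definition can be applied directly: delete each vertex in turn and recount components. A Python sketch (names illustrative, quadratic-time):

```python
def count_components(vertices, adj):
    """Number of connected components of the subgraph induced by `vertices`."""
    seen, count = set(), 0
    for s in vertices:
        if s not in seen:
            count += 1
            stack = [s]
            while stack:
                v = stack.pop()
                if v not in seen:
                    seen.add(v)
                    stack.extend(w for w in adj[v] if w in vertices)
    return count

def cut_vertices(vertices, adj):
    """Vertices whose deletion increases the number of components."""
    base = count_components(vertices, adj)
    return {v for v in vertices
            if count_components(set(vertices) - {v}, adj) > base}

# In the path 1 - 2 - 3, vertex 2 is the only cut-vertex.
path3 = {1: {2}, 2: {1, 3}, 3: {2}}
```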
Version: 2 Owner: digitalis Author(s): digitalis
Chapter 58
05C45 – Eulerian and Hamiltonian
graphs
58.1 Bondy and Chvátal theorem

Bondy and Chvátal's theorem.
Let G be a graph of order n ≥ 3 and suppose that u and v are distinct nonadjacent vertices
such that deg(u) + deg(v) ≥ n.
Then G is Hamiltonian if and only if G + uv is Hamiltonian.
Version: 1 Owner: drini Author(s): drini
58.2 Dirac theorem
Theorem: Every graph with n ≥ 3 vertices and minimum degree at least n/2 has a Hamiltonian cycle.

Proof: Let G = (V, E) be a graph with |G| = n ≥ 3 and δ(G) ≥ n/2. Then G is connected:
otherwise, the degree of any vertex in the smallest component C of G would be less than
|C| ≤ n/2. Let P = x_0 ... x_k be a longest path in G. By the maximality of P, all the neighbours
of x_0 and all the neighbours of x_k lie on P. Hence at least n/2 of the vertices x_0, ..., x_{k-1} are
adjacent to x_k, and at least n/2 of these same k < n vertices x_i are such that x_0 x_{i+1} ∈ E. By the
pigeonhole principle, there is a vertex x_i that has both properties, so we have x_0 x_{i+1} ∈ E
and x_i x_k ∈ E for some i < k. We claim that the cycle C := x_0 x_{i+1} P x_k x_i P x_0 is a Hamiltonian
cycle of G. Indeed, since G is connected, C would otherwise have a neighbour in G − C, which
could be combined with a spanning path of C into a path longer than P. ∎
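Dirac's theorem only asserts existence; on small graphs a Hamiltonian cycle can be found by backtracking. A Python sketch (the generic search below is illustrative, not the proof's rotation construction):

```python
def hamiltonian_cycle(vertices, adj):
    """Backtracking search; returns a Hamiltonian cycle as a vertex
    list with the start repeated at the end, or None."""
    verts = list(vertices)
    start = verts[0]

    def extend(path, used):
        if len(path) == len(verts):
            return path + [start] if start in adj[path[-1]] else None
        for w in adj[path[-1]]:
            if w not in used:
                found = extend(path + [w], used | {w})
                if found:
                    return found
        return None

    return extend([start], {start})

# A graph satisfying Dirac's condition: n = 4 and every degree is 2 >= n/2.
adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
cycle = hamiltonian_cycle(list(adj), adj)
```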
Version: 5 Owner: vladm Author(s): vladm
58.3 Euler circuit
An Euler circuit is a connected graph such that, starting at some vertex, one can traverse
every edge of the graph exactly once and return to the starting vertex. In other words, an
Euler circuit is an Euler path that is a circuit. Thus, using the properties of odd and even
degree vertices given in the definition of an Euler path, an Euler circuit exists iff every vertex
of the graph has an even degree.
This graph is an Euler circuit, as all vertices have degree 2. [Figure omitted.]

This graph is not an Euler circuit. [Figure omitted.]
Version: 6 Owner: slider142 Author(s): slider142
58.4 Fleury’s algorithm
Fleury’s algorithm constructs an Euler circuit in a graph (if it’s possible).
1. Pick any vertex to start
2. From that vertex pick an edge to traverse, considering following rule: never cross a
bridge of the reduced graph unless there is no other choice
3. Darken that edge, as a reminder that you can’t traverse it again
4. Travel that edge, coming to the next vertex
5. Repeat 24 until all edges have been traversed, and you are back at the starting vertex
By ”reduced graph” we mean the original graph minus the darkened (already used) edges.
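The steps above can be sketched in Python for simple graphs (names illustrative; the bridge test simply recounts reachable vertices, which is correct but slow):

```python
def euler_circuit(adj):
    """Fleury's algorithm on a connected graph in which every vertex has
    even degree.  adj maps each vertex to a set of neighbours; a working
    copy is 'darkened' (edges removed) as they are travelled."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}

    def reachable(a):
        seen, stack = set(), [a]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(adj[v])
        return seen

    def is_bridge(u, v):
        """Does deleting edge uv shrink the set reachable from u?"""
        before = len(reachable(u))
        adj[u].discard(v); adj[v].discard(u)
        after = len(reachable(u))
        adj[u].add(v); adj[v].add(u)
        return after < before

    v = next(iter(adj))
    circuit = [v]
    while adj[v]:
        # never cross a bridge of the reduced graph unless forced to
        choices = [w for w in adj[v] if not is_bridge(v, w)] or list(adj[v])
        w = choices[0]
        adj[v].discard(w); adj[w].discard(v)  # darken the edge
        circuit.append(w)
        v = w
    return circuit

# Example: two triangles sharing vertex 3 (every degree is even).
bowtie = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
circuit = euler_circuit(bowtie)
```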
Version: 3 Owner: Johan Author(s): Johan
58.5 Hamiltonian cycle
Let G be a graph. If there is a cycle visiting every vertex exactly once, we say that the cycle
is a Hamiltonian cycle.
Version: 2 Owner: drini Author(s): drini
58.6 Hamiltonian graph
Let G be a graph or digraph.

If G has a Hamiltonian cycle, we call G a Hamiltonian graph.

There is no useful necessary and sufficient condition for a graph to be Hamiltonian. However,
we can get some necessary conditions from the definition: for example, a Hamiltonian graph is
always connected and has order at least 3. These and other observations lead to the condition:

Let G = (V, E) be a graph of order at least 3. If G is Hamiltonian, then for every nonempty
proper subset U of V, the subgraph induced by V − U has at most |U| components.

For sufficient conditions, we get results like Ore's theorem or the Bondy and Chvátal theorem.
Version: 5 Owner: drini Author(s): drini
58.7 Hamiltonian path
Let G be a graph. A path in G that includes every vertex exactly once is called a Hamiltonian
path.
Version: 4 Owner: drini Author(s): drini
58.8 Ore’s theorem
Let G be a graph of order n ≥ 3 such that, for every pair of distinct nonadjacent vertices u
and v, deg(u) + deg(v) ≥ n. Then G is a Hamiltonian graph.
Version: 3 Owner: drini Author(s): drini
58.9 Petersen graph
Petersen’s graph. An example of graph that is traceable but not Hamiltonian. That is, it
has a Hamiltonian path but doesn’t have a Hamiltonian cycle.
This is also the canonical example of a hypohamiltonian graph.
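Both claims are small enough to confirm by exhaustive search; a Python sketch using the standard outer-cycle plus inner-pentagram construction of the Petersen graph (function names illustrative):

```python
def petersen():
    """Outer 5-cycle 0..4, inner pentagram 5..9, spokes i -- i+5."""
    adj = {v: set() for v in range(10)}
    def add(u, v):
        adj[u].add(v); adj[v].add(u)
    for i in range(5):
        add(i, (i + 1) % 5)           # outer cycle
        add(i, i + 5)                 # spoke
        add(i + 5, (i + 2) % 5 + 5)   # inner pentagram
    return adj

def has_hamiltonian(adj, cycle):
    """Backtracking search for a Hamiltonian cycle (cycle=True) or a
    Hamiltonian path (cycle=False)."""
    n = len(adj)
    def extend(path, used):
        if len(path) == n:
            return not cycle or path[0] in adj[path[-1]]
        return any(extend(path + [w], used | {w})
                   for w in adj[path[-1]] if w not in used)
    starts = [0] if cycle else list(adj)  # for a cycle, the start is arbitrary
    return any(extend([s], {s}) for s in starts)
```

Running `has_hamiltonian(petersen(), cycle=True)` fails while the path search succeeds, confirming that the graph is traceable but not Hamiltonian.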
Version: 5 Owner: drini Author(s): drini
58.10 hypohamiltonian
A graph G is hypohamiltonian if G is not Hamiltonian, but G−· is Hamiltonian for each
· ∈ \ (\ the vertex set of G). The smallest hypohamiltonian graph is the Petersen graph,
which has ten vertices.
Version: 1 Owner: digitalis Author(s): digitalis
58.11 traceable
Let G be a graph. If G has a Hamiltonian path, we say that G is traceable.
Not every traceable graph is Hamiltonian; as an example, consider Petersen's graph.
Version: 2 Owner: drini Author(s): drini
Chapter 59
05C60 – Isomorphism problems
(reconstruction conjecture, etc.)
59.1 graph isomorphism
A graph isomorphism is a bijection between the vertices of two graphs G and H,

f : V(G) → V(H),

with the property that any two vertices u and v from G are adjacent if and only if f(u) and
f(v) are adjacent in H.
If an isomorphism can be constructed between two graphs, then we say those graphs are
isomorphic.
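For small graphs, isomorphism can be decided by trying every bijection; a brute-force Python sketch (exponential, illustrative only; the example graphs are not the ones in the figures below):

```python
from itertools import permutations

def are_isomorphic(adj_g, adj_h):
    """Try every bijection V(G) -> V(H); feasible for small graphs only."""
    vg, vh = sorted(adj_g), sorted(adj_h)
    if len(vg) != len(vh):
        return False
    for perm in permutations(vh):
        f = dict(zip(vg, perm))
        # adjacency must be preserved in both directions
        if all((f[u] in adj_h[f[v]]) == (u in adj_g[v])
               for u in vg for v in vg):
            return True
    return False

# A 4-cycle is isomorphic to a relabelled 4-cycle, but not to a 4-vertex path.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
sq = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```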
For example, consider these two graphs:

[Figure omitted: a graph on vertices a, b, c, d, g, h, i, j.]

[Figure omitted: a graph on vertices 1, 2, 3, 4, 5, 6, 7, 8.]
Although these graphs look very different at first, they are in fact isomorphic; one isomorphism
between them is

f(a) = 1    f(g) = 5
f(b) = 6    f(h) = 2
f(c) = 8    f(i) = 4
f(d) = 3    f(j) = 7
Version: 2 Owner: vampyr Author(s): vampyr
Chapter 60
05C65 – Hypergraphs
60.1 Steiner system
A Steiner system S(t, k, n) is a k-uniform hypergraph on n vertices such that every set
of t vertices is contained in exactly one edge. Notice that the systems S(2, k, n) are merely
the k-uniform linear spaces. The families of hypergraphs S(2, 3, n) are known as Steiner triple systems.
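The smallest nontrivial Steiner triple system is S(2, 3, 7), the Fano plane; a Python sketch that verifies the defining property (the block list is a standard one; names illustrative):

```python
from itertools import combinations

def is_steiner(t, k, n, blocks):
    """Check: k-uniform blocks on n points, every t-set in exactly one block."""
    points = set(range(n))
    if any(len(b) != k or not b <= points for b in blocks):
        return False
    return all(sum(set(s) <= b for b in blocks) == 1
               for s in combinations(points, t))

# A standard block list for the Fano plane, S(2, 3, 7).
fano = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
        {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]
```

The Fano plane is also a finite plane of order 2 in the sense of the next entry: it is 3-regular, 3-uniform, and |E| = |V| = 2² + 2 + 1 = 7.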
Version: 2 Owner: drini Author(s): drini, NeuRet
60.2 ﬁnite plane
Let H = (V, E) be a linear space. A finite plane is an intersecting linear space, that is to
say, a linear space in which any two edges in E have a nonempty intersection.

Finite planes are rather restrictive hypergraphs, and the following holds.

Theorem 4. Let H = (V, E) be a finite plane. Then for some positive integer k, H is
(k + 1)-regular, (k + 1)-uniform, and |E| = |V| = k^2 + k + 1.

The above k is the order of the finite