Free Encyclopedia of Mathematics 0.0.1
by the PlanetMath authors Aatu, ack, akrowne, alek thiery, alinabi, almann, alozano, antizeus, antonio, aparna, ariels, armbrusterb, AxelBoldt, basseykay, bbukh, benjaminfjones, bhaire, brianbirgen, bs, bshanks, bwebste, cryo, danielm, Daume, debosberg, deiudi, digitalis, djao, Dr Absentius, draisma, drini, drummond, dublisk, Evandar, fibonaci, flynnheiss, gabor sz, GaloisRadical, gantsich, gaurminirick, gholmes74, giri, greg, grouprly, gumau, Gunnar, Henry, iddo, igor, imran, jamika chris, jarino, jay, jgade, jihemme, Johan, karteef, karthik, kemyers3, Kevin OBryant, kidburla2003, KimJ, Koro, lha, lieven, livetoad, liyang, Logan, Luci, m759, mathcam, mathwizard, matte, mclase, mhale, mike, mikestaflogan, mps, msihl, muqabala, n3o, nerdy2, nobody, npolys, Oblomov, ottocolori, paolini, patrickwonders, pbruin, petervr, PhysBrain, quadrate, quincynoodles, ratboy, RevBobo, Riemann, rmilson, ruiyang, Sabean, saforres, saki, say 10, scanez, scineram, seonyoung, slash, sleske, slider142, sprocketboy, sucrose, superhiggs, tensorking, thedagit, Thomas Heye, thouis, Timmy, tobix, tromp, tz26, unlord, uriw, urz, vampyr, vernondalhart, vitriol, vladm, volator, vypertd, wberry, Wkbj79, wombat, x bas, xiaoyanggu, XJamRastafire, xriso, yark et al.
edited by Joe Corneli & Aaron Krowne
Copyright © 2004 PlanetMath.org authors. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.
Introduction
Welcome to the PlanetMath “One Big Book” compilation, the Free Encyclopedia of Mathematics. This book gathers in a single document the best of the hundreds of authors and
thousands of other contributors from the PlanetMath.org web site, as of January 4, 2004.
The purpose of this compilation is to help the efforts of these people reach a wider audience
and allow the benefits of their work to be accessed in a greater breadth of situations.
We want to emphasize that the Free Encyclopedia of Mathematics will always be a work in progress. Producing a book-format encyclopedia from the amorphous web of interlinked and multidimensionally-organized entries on PlanetMath is not easy. The print medium demands a linear presentation, and to boil the web site down into this format is a difficult, and in some ways lossy, transformation. A major part of our editorial efforts is going into making this transformation. We hope the organization we’ve chosen for now is useful to readers, and in future editions you can expect continuing improvements.
The “linearization” of PlanetMath.org is not the only editorial task we must perform. Throughout the millennia, readers have come to expect a strict standard of consistency and correctness from print books, and we must strive to meet this standard in the PlanetMath Book as closely as possible. This means applying more editorial control to the book form of PlanetMath than is applied to the web site. We hope you will agree that there is significant value to be gained from unifying style, correcting errors, and filtering out not-yet-ready content, so we will continue to do these things.
For more details on planned improvements to this book, see the TODO file that came with this archive. Remember that you can help us to improve this work by joining PlanetMath.org and filing corrections, adding entries, or just participating in the community. We are also looking for volunteers to help edit this book, to help with programming related to its production, or to help work on Noosphere, the PlanetMath software. To send us comments about the book, use the e-mail address pmbook@planetmath.org. For general comments and queries, use feedback@planetmath.org.
Happy mathing,
Joe Corneli
Aaron Krowne
Tuesday, January 27, 2004
Top-level Math Subject Classifications
00 General
01 History and biography
03 Mathematical logic and foundations
05 Combinatorics
06 Order, lattices, ordered algebraic structures
08 General algebraic systems
11 Number theory
12 Field theory and polynomials
13 Commutative rings and algebras
14 Algebraic geometry
15 Linear and multilinear algebra; matrix theory
16 Associative rings and algebras
17 Nonassociative rings and algebras
18 Category theory; homological algebra
19 K-theory
20 Group theory and generalizations
22 Topological groups, Lie groups
26 Real functions
28 Measure and integration
30 Functions of a complex variable
31 Potential theory
32 Several complex variables and analytic spaces
33 Special functions
34 Ordinary differential equations
35 Partial differential equations
37 Dynamical systems and ergodic theory
39 Difference and functional equations
40 Sequences, series, summability
41 Approximations and expansions
42 Fourier analysis
43 Abstract harmonic analysis
44 Integral transforms, operational calculus
45 Integral equations
46 Functional analysis
47 Operator theory
49 Calculus of variations and optimal control; optimization
51 Geometry
52 Convex and discrete geometry
53 Differential geometry
54 General topology
55 Algebraic topology
57 Manifolds and cell complexes
58 Global analysis, analysis on manifolds
60 Probability theory and stochastic processes
62 Statistics
65 Numerical analysis
68 Computer science
70 Mechanics of particles and systems
74 Mechanics of deformable solids
76 Fluid mechanics
78 Optics, electromagnetic theory
80 Classical thermodynamics, heat transfer
81 Quantum theory
82 Statistical mechanics, structure of matter
83 Relativity and gravitational theory
85 Astronomy and astrophysics
86 Geophysics
90 Operations research, mathematical programming
91 Game theory, economics, social and behavioral sciences
92 Biology and other natural sciences
93 Systems theory; control
94 Information and communication, circuits
97 Mathematics education
Table of Contents
Introduction i
Top-level Math Subject Classifications ii
Table of Contents iv
GNU Free Documentation License lii
UNCLA – Unclassified 1
Golomb ruler 1
Hesse configuration 1
Jordan’s Inequality 2
Lagrange’s theorem 2
Laurent series 3
Lebesgue measure 3
Leray spectral sequence 4
Möbius transformation 4
Mordell-Weil theorem 4
Plateau’s Problem 5
Poisson random variable 5
Shannon’s theorem 6
Shapiro inequality 9
Sylow p-subgroups 9
Tchirnhaus transformations 9
Wallis formulae 10
ascending chain condition 10
bounded 10
bounded operator 11
complex projective line 12
converges uniformly 12
descending chain condition 13
diamond theorem 13
equivalently oriented bases 13
finitely generated R-module 14
fraction 14
group of covering transformations 15
idempotent 15
isolated 17
isolated singularity 17
isomorphic groups 17
joint continuous density function 18
joint cumulative distribution function 18
joint discrete density function 19
left function notation 20
lift of a submanifold 20
limit of a real function exists at a point 20
lipschitz function 21
lognormal random variable 21
lowest upper bound 22
marginal distribution 22
measurable space 23
measure zero 23
minimum spanning tree 23
minimum weighted path length 24
mod 2 intersection number 25
moment generating function 27
monoid 27
monotonic operator 27
multidimensional Gaussian integral 28
multiindex 29
near operators 30
negative binomial random variable 36
normal random variable 37
normalizer of a subset of a group 38
nth root 38
null tree 40
open ball 40
opposite ring 40
orbit-stabilizer theorem 41
orthogonal 41
permutation group on a set 41
prime element 42
product measure 43
projective line 43
projective plane 43
proof of calculus theorem used in the Lagrange
method 44
proof of orbit-stabilizer theorem 45
proof of power rule 45
proof of primitive element theorem 47
proof of product rule 47
proof of sum rule 48
proof that countable unions are countable 48
quadrature 48
quotient module 49
regular expression 49
regular language 50
right function notation 51
ring homomorphism 51
scalar 51
schrodinger operator 51
selection sort 52
semiring 53
simple function 54
simple path 54
solutions of an equation 54
spanning tree 54
square root 55
stable sorting algorithm 56
standard deviation 56
stochastic independence 56
substring 57
successor 57
sum rule 58
superset 58
symmetric polynomial 59
the argument principle 59
torsion-free module 59
total order 60
tree traversals 60
trie 63
unit vector 64
unstable fixed point 65
weak* convergence in normed linear space 65
well-ordering principle for natural numbers 65
00-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 66
dimension 66
toy theorem 67
00-XX – General 68
method of exhaustion 68
00A05 – General mathematics 69
Conway’s chained arrow notation 69
Knuth’s up arrow notation 70
arithmetic progression 70
arity 71
introducing 0th power 71
lemma 71
property 72
saddle point approximation 72
singleton 73
subsequence 73
surreal number 73
00A07 – Problem books 76
Nesbitt’s inequality 76
proof of Nesbitt’s inequality 76
00A20 – Dictionaries and other general
reference works 78
completing the square 78
00A99 – Miscellaneous topics 80
QED 80
TFAE 80
WLOG 81
order of operations 81
01A20 – Greek, Roman 84
Roman numerals 84
01A55 – 19th century 85
Poincaré, Jules Henri 85
01A60 – 20th century 90
Bourbaki, Nicolas 90
Erdős Number 97
03-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 98
Burali-Forti paradox 98
Cantor’s paradox 98
Russell’s paradox 99
biconditional 99
bijection 100
cartesian product 100
chain 100
characteristic function 101
concentric circles 101
conjunction 102
disjoint 102
empty set 102
even number 103
fixed point 103
infinite 103
injective function 104
integer 104
inverse function 105
linearly ordered 106
operator 106
ordered pair 106
ordering relation 106
partition 107
pullback 107
set closed under an operation 108
signature of a permutation 109
subset 109
surjective 110
transposition 110
truth table 111
03-XX – Mathematical logic and founda-
tions 112
standard enumeration 112
03B05 – Classical propositional logic 113
CNF 113
Proof that contrapositive statement is true using
logical equivalence 113
contrapositive 114
disjunction 114
equivalent 114
implication 115
propositional logic 115
theory 116
transitive 116
truth function 117
03B10 – Classical first-order logic 118
bootstrapping 118
Boolean 119
Gödel numbering 120
Gödel’s incompleteness theorems 120
Lindenbaum algebra 127
Lindström’s theorem 128
Presburger arithmetic 129
R-minimal element 129
Skolemization 129
arithmetical hierarchy 129
arithmetical hierarchy is a proper hierarchy 130
atomic formula 131
creating an infinite model 131
criterion for consistency of sets of formulas 132
deductions are ∆₁ 132
example of Gödel numbering 134
example of well-founded induction 135
first order language 136
first order logic 137
first order theories 138
free and bound variables 138
generalized quantifier 139
logic 140
proof of compactness theorem for first order logic
141
proof of principle of transfinite induction 141
proof of the well-founded induction principle 141
quantifier 141
quantifier free 144
subformula 144
syntactic compactness theorem for first order logic
144
transfinite induction 144
universal relation 145
universal relations exist for each level of the arith-
metical hierarchy 145
well-founded induction 146
well-founded induction on formulas 147
03B15 – Higher-order logic and type the-
ory 143
Härtig’s quantifier 143
Russell’s theory of types 143
analytic hierarchy 145
game-theoretical quantifier 146
logical language 147
second order logic 148
03B40 – Combinatory logic and lambda-
calculus 150
Church integer 150
combinatory logic 150
lambda calculus 151
03B48 – Probability and inductive logic
154
conditional probability 154
03B99 – Miscellaneous 155
Beth property 155
Hofstadter’s MIU system 155
IF-logic 157
Tarski’s result on the undefinability of Truth 160
axiom 161
compactness 164
consistent 164
interpolation property 164
sentence 165
03Bxx – General logic 166
Banach-Tarski paradox 166
03C05 – Equational classes, universal al-
gebra 168
congruence 168
every congruence is the kernel of a homomor-
phism 168
homomorphic image of a Σ-structure is a Σ-structure
169
kernel 169
kernel of a homomorphism is a congruence 169
quotient structure 170
03C07 – Basic properties of first-order lan-
guages and structures 171
Models constructed from constants 171
Stone space 172
alphabet 173
axiomatizable theory 174
definable 174
definable type 175
downward Lowenheim-Skolem theorem 176
example of definable type 176
example of strongly minimal 177
first isomorphism theorem 177
language 178
length of a string 179
proof of homomorphic image of a Σ-structure is
a Σ-structure 179
satisfaction relation 180
signature 181
strongly minimal 181
structure preserving mappings 181
structures 182
substructure 183
type 183
upward Lowenheim-Skolem theorem 183
03C15 – Denumerable structures 185
random graph (infinite) 185
03C35 – Categoricity and completeness of
theories 187
κ-categorical 187
Vaught’s test 187
proof of Vaught’s test 187
03C50 – Models with special properties
(saturated, rigid, etc.) 189
example of universal structure 189
homogeneous 191
universal structure 191
03C52 – Properties of classes of models
192
amalgamation property 192
03C64 – Model theory of ordered struc-
tures; o-minimality 193
infinitesimal 193
o-minimality 194
real closed fields 194
03C68 – Other classical first-order model
theory 196
imaginaries 196
03C90 – Nonclassical models (Boolean-valued,
sheaf, etc.) 198
Boolean valued model 198
03C99 – Miscellaneous 199
axiom of foundation 199
elementarily equivalent 199
elementary embedding 200
model 200
proof equivalence of formulation of foundation
201
03D10 – Turing machines and related no-
tions 203
Turing machine 203
03D20 – Recursive functions and relations,
subrecursive hierarchies 206
primitive recursive 206
03D25 – Recursively (computably) enu-
merable sets and degrees 207
recursively enumerable 207
03D75 – Abstract and axiomatic computabil-
ity and recursion theory 208
Ackermann function 208
halting problem 209
03E04 – Ordered sets and their cofinali-
ties; pcf theory 211
another definition of cofinality 211
cofinality 211
maximal element 212
partitions less than cofinality 213
well ordered set 213
pigeonhole principle 213
proof of pigeonhole principle 213
tree (set theoretic) 214
κ-complete 215
Cantor’s diagonal argument 215
Fodor’s lemma 216
Schroeder-Bernstein theorem 216
Veblen function 216
additively indecomposable 217
cardinal number 217
cardinal successor 217
cardinality 218
cardinality of a countable union 218
cardinality of the rationals 219
classes of ordinals and enumerating functions 219
club 219
club filter 220
countable 220
countably infinite 221
finite 221
fixed points of normal functions 221
height of an algebraic number 221
if A is infinite and B is a finite subset of A, then
A ∖ B is infinite 222
limit cardinal 222
natural number 223
ordinal arithmetic 224
ordinal number 225
power set 225
proof of Fodor’s lemma 225
proof of Schroeder-Bernstein theorem 225
proof of fixed points of normal functions 226
proof of the existence of transcendental numbers
226
proof of theorems in additively indecomposable
227
proof that the rationals are countable 228
stationary set 228
successor cardinal 229
uncountable 229
von Neumann integer 229
von Neumann ordinal 230
weakly compact cardinal 231
weakly compact cardinals and the tree property
231
Cantor’s theorem 232
proof of Cantor’s theorem 232
additive 232
antisymmetric 233
constant function 233
direct image 234
domain 234
dynkin system 234
equivalence class 235
fibre 235
filtration 236
finite character 236
fix (transformation actions) 236
function 237
functional 237
generalized cartesian product 238
graph 238
identity map 238
inclusion mapping 239
inductive set 239
invariant 240
inverse function theorem 240
inverse image 241
mapping 242
mapping of period n is a bijection 242
partial function 242
partial mapping 243
period of mapping 243
pi-system 244
proof of inverse function theorem 244
proper subset 246
range 246
reflexive 246
relation 246
restriction of a mapping 247
set difference 247
symmetric 247
symmetric difference 248
the inverse image commutes with set operations
248
transformation 249
transitive 250
transitive 250
transitive closure 250
Hausdorff’s maximum principle 250
Kuratowski’s lemma 251
Tukey’s lemma 251
Zermelo’s postulate 251
Zermelo’s well-ordering theorem 251
Zorn’s lemma 252
axiom of choice 252
equivalence of Hausdorff’s maximum principle,
Zorn’s lemma and the well-ordering theorem 252
equivalence of Zorn’s lemma and the axiom of
choice 253
maximality principle 254
principle of finite induction 254
principle of finite induction proven from well-
ordering principle 255
proof of Tukey’s lemma 255
proof of Zermelo’s well-ordering theorem 255
axiom of extensionality 256
axiom of infinity 256
axiom of pairing 257
axiom of power set 258
axiom of union 258
axiom schema of separation 259
de Morgan’s laws 260
de Morgan’s laws for sets (proof) 261
set theory 261
union 264
universe 264
von Neumann-Bernays-Gödel set theory 265
FS iterated forcing preserves chain condition 267
chain condition 268
composition of forcing notions 268
composition preserves chain condition 268
equivalence of forcing notions 269
forcing relation 270
forcings are equivalent if one is dense in the other
270
iterated forcing 272
iterated forcing and composition 273
name 273
partial order with chain condition does not col-
lapse cardinals 274
proof of partial order with chain condition does
not collapse cardinals 274
proof that forcing notions are equivalent to their
composition 275
complete partial orders do not add small subsets
280
proof of complete partial orders do not add small
subsets 280
♦ is equivalent to ♣ and continuum hypothesis
281
Levy collapse 281
proof of ♦ is equivalent to ♣ and continuum hy-
pothesis 282
Martin’s axiom 283
Martin’s axiom and the continuum hypothesis
283
Martin’s axiom is consistent 284
a shorter proof: Martin’s axiom and the contin-
uum hypothesis 287
continuum hypothesis 288
forcing 288
generalized continuum hypothesis 289
inaccessible cardinals 290
♦ 290
♣ 290
Dedekind infinite 291
Zermelo-Fraenkel axioms 291
class 291
complement 293
delta system 293
delta system lemma 293
diagonal intersection 293
intersection 294
multiset 294
proof of delta system lemma 294
rational number 295
saturated (set) 295
separation and doubletons axiom 295
set 296
03Exx – Set theory 299
intersection 299
03F03 – Proof theory, general 300
NJp 300
NKp 300
natural deduction 301
sequent 301
sound, complete 302
03F07 – Structure of proofs 303
induction 303
03F30 – First-order arithmetic and frag-
ments 307
Elementary Functional Arithmetic 307
PA 308
Peano arithmetic 308
03F35 – Second- and higher-order arith-
metic and fragments 310
ACA₀ 310
RCA₀ 310
Z₂ 310
comprehension axiom 311
induction axiom 311
03G05 – Boolean algebras 313
Boolean algebra 313
M. H. Stone’s representation theorem 313
03G10 – Lattices and related structures
314
Boolean lattice 314
complete lattice 314
lattice 315
03G99 – Miscellaneous 316
Chu space 316
Chu transform 316
biextensional collapse 317
example of Chu space 317
property of a Chu space 318
05-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 319
example of pigeonhole principle 319
multi-index derivative of a power 319
multi-index notation 320
05A10 – Factorials, binomial coefficients,
combinatorial functions 322
Catalan numbers 322
Levi-Civita permutation symbol 323
Pascal’s rule (bit string proof) 325
Pascal’s rule proof 326
Pascal’s triangle 326
Upper and lower bounds to binomial coefficient
328
binomial coefficient 328
double factorial 329
factorial 329
falling factorial 330
inductive proof of binomial theorem 331
multinomial theorem 332
multinomial theorem (proof) 333
proof of upper and lower bounds to binomial co-
efficient 334
05A15 – Exact enumeration problems, gen-
erating functions 336
Stirling numbers of the first kind 336
Stirling numbers of the second kind 338
05A19 – Combinatorial identities 342
Pascal’s rule 342
05A99 – Miscellaneous 343
principle of inclusion-exclusion 343
principle of inclusion-exclusion proof 344
05B15 – Orthogonal arrays, Latin squares,
Room squares 346
example of Latin squares 346
graeco-latin squares 346
latin square 347
magic square 347
05B35 – Matroids, geometric lattices 348
matroid 348
polymatroid 353
05C05 – Trees 354
AVL tree 354
Aronszajn tree 354
Suslin tree 354
antichain 355
balanced tree 355
binary tree 355
branch 356
child node (of a tree) 356
complete binary tree 357
digital search tree 357
digital tree 358
example of Aronszajn tree 358
example of tree (set theoretic) 359
extended binary tree 359
external path length 360
internal node (of a tree) 360
leaf node (of a tree) 361
parent node (in a tree) 361
proof that ω has the tree property 362
root (of a tree) 362
tree 363
weight-balanced binary trees are ultrametric 364
weighted path length 366
05C10 – Topological graph theory, imbed-
ding 367
Heawood number 367
Kuratowski’s theorem 368
Szemerédi-Trotter theorem 368
crossing lemma 369
crossing number 369
graph topology 369
planar graph 370
proof of crossing lemma 370
05C12 – Distance in graphs 372
Hamming distance 372
05C15 – Coloring of graphs and hyper-
graphs 373
bipartite graph 373
chromatic number 374
chromatic number and girth 375
chromatic polynomial 375
colouring problem 376
complete bipartite graph 377
complete k-partite graph 378
four-color conjecture 378
k-partite graph 379
property B 380
05C20 – Directed graphs (digraphs), tour-
naments 381
cut 381
de Bruijn digraph 381
directed graph 382
flow 383
maximum flow/minimum cut theorem 384
tournament 385
05C25 – Graphs and groups 387
Cayley graph 387
05C38 – Paths and cycles 388
Euler path 388
Veblen’s theorem 388
acyclic graph 389
bridges of Königsberg 389
cycle 390
girth 391
path 391
proof of Veblen’s theorem 392
05C40 – Connectivity 393
k-connected graph 393
Thomassen’s theorem on 3-connected graphs 393
Tutte’s wheel theorem 394
connected graph 394
cutvertex 395
05C45 – Eulerian and Hamiltonian graphs
396
Bondy and Chvátal theorem 396
Dirac theorem 396
Euler circuit 397
Fleury’s algorithm 397
Hamiltonian cycle 398
Hamiltonian graph 398
Hamiltonian path 398
Ore’s theorem 398
Petersen graph 399
hypohamiltonian 399
traceable 399
05C60 – Isomorphism problems (reconstruc-
tion conjecture, etc.) 400
graph isomorphism 400
05C65 – Hypergraphs 402
Steiner system 402
finite plane 402
hypergraph 403
linear space 404
05C69 – Dominating sets, independent sets,
cliques 405
Mantel’s theorem 405
clique 405
proof of Mantel’s theorem 405
05C70 – Factorization, matching, covering
and packing 407
Petersen theorem 407
Tutte theorem 407
bipartite matching 407
edge covering 409
matching 409
maximal bipartite matching algorithm 410
maximal matching/minimal edge covering theo-
rem 411
05C75 – Structural characterization of types
of graphs 413
multigraph 413
pseudograph 413
05C80 – Random graphs 414
examples of probabilistic proofs 414
probabilistic method 415
05C90 – Applications 417
Hasse diagram 417
05C99 – Miscellaneous 419
Euler’s polyhedron theorem 419
Poincaré formula 419
Turan’s theorem 419
Wagner’s theorem 420
block 420
bridge 420
complete graph 420
degree (of a vertex) 421
distance (in a graph) 421
edge-contraction 421
graph 422
graph minor theorem 422
graph theory 423
homeomorphism 424
loop 424
minor (of a graph) 424
neighborhood (of a vertex) 425
null graph 425
order (of a graph) 425
proof of Euler’s polyhedron theorem 426
proof of Turan’s theorem 427
realization 427
size (of a graph) 428
subdivision 428
subgraph 429
wheel graph 429
05D05 – Extremal set theory 431
LYM inequality 431
Sperner’s theorem 432
05D10 – Ramsey theory 433
Erdős-Rado theorem 433
Ramsey’s theorem 433
Ramsey’s theorem 434
arrows 435
coloring 436
proof of Ramsey’s theorem 437
05D15 – Transversal (matching) theory 438
Hall’s marriage theorem 438
proof of Hall’s marriage theorem 438
saturate 440
system of distinct representatives 440
05E05 – Symmetric functions 441
elementary symmetric polynomial 441
reduction algorithm for symmetric polynomials
441
06-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 443
equivalence relation 443
06-XX – Order, lattices, ordered algebraic
structures 445
join 445
meet 445
06A06 – Partial order, general 446
directed set 446
infimum 446
sets that do not have an infimum 447
supremum 447
upper bound 448
06A99 – Miscellaneous 449
dense (in a poset) 449
partial order 449
poset 450
quasi-order 450
well quasi ordering 450
06B10 – Ideals, congruence relations 452
order in an algebra 452
06C05 – Modular lattices, Desarguesian
lattices 453
modular lattice 453
06D99 – Miscellaneous 454
distributive 454
distributive lattice 454
06E99 – Miscellaneous 455
Boolean ring 455
08A40 – Operations, polynomials, primal
algebras 456
coefficients of a polynomial 456
08A99 – Miscellaneous 457
binary operation 457
filtered algebra 457
11-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 459
Euler phi-function 459
Euler-Fermat theorem 460
Fermat’s little theorem 460
Fermat’s theorem proof 460
Goldbach’s conjecture 460
Jordan’s totient function 461
Legendre symbol 461
Pythagorean triplet 462
Wilson’s theorem 462
arithmetic mean 462
ceiling 463
computation of powers using Fermat’s little the-
orem 463
congruences 464
coprime 464
cube root 464
floor 465
geometric mean 465
googol 466
googolplex 467
greatest common divisor 467
group theoretic proof of Wilson’s theorem 467
harmonic mean 467
mean 468
number field 468
pi 468
proof of Wilson’s theorem 470
proof of fundamental theorem of arithmetic 471
root of unity 471
11-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 472
base 472
11-XX – Number theory 474
Lehmer’s Conjecture 474
Sierpinski conjecture 474
prime triples conjecture 475
11A05 – Multiplicative structure; Euclidean
algorithm; greatest common divisors 476
Bezout’s lemma (number theory) 476
Euclid’s algorithm 476
Euclid’s lemma 478
Euclid’s lemma proof 478
fundamental theorem of arithmetic 479
perfect number 479
smooth number 480
11A07 – Congruences; primitive roots; residue
systems 481
Anton’s congruence 481
Fermat’s Little Theorem proof (Inductive) 482
Jacobi symbol 483
Shanks-Tonelli algorithm 483
Wieferich prime 483
Wilson’s theorem for prime powers 484
factorial modulo prime powers 485
proof of Euler-Fermat theorem 485
proof of Lucas’s theorem 486
11A15 – Power residues, reciprocity 487
Euler’s criterion 487
Gauss’ lemma 487
Zolotarev’s lemma 489
cubic reciprocity law 491
proof of Euler’s criterion 493
proof of quadratic reciprocity rule 494
quadratic character of 2 495
quadratic reciprocity for polynomials 496
quadratic reciprocity rule 497
quadratic residue 497
11A25 – Arithmetic functions; related num-
bers; inversion formulas 498
Dirichlet character 498
Liouville function 498
Mangoldt function 499
Mertens’ first theorem 499
Moebius function 499
Moebius inversion 500
arithmetic function 502
multiplicative function 503
non-multiplicative function 505
totient 507
unit 507
11A41 – Primes 508
Chebyshev functions 508
Euclid’s proof of the infinitude of primes 509
Mangoldt summatory function 509
Mersenne numbers 510
Thue’s lemma 510
composite number 511
prime 511
prime counting function 511
prime difference function 512
prime number theorem 512
prime number theorem result 513
proof of Thue’s Lemma 514
semiprime 515
sieve of Eratosthenes 516
test for primality of Mersenne numbers 516
11A51 – Factorization; primality 517
Fermat Numbers 517
Fermat compositeness test 517
Zsigmondy’s theorem 518
divisibility 518
division algorithm for integers 519
proof of division algorithm for integers 519
square-free number 520
squarefull number 520
the prime power dividing a factorial 521
11A55 – Continued fractions 523
Stern-Brocot tree 523
continued fraction 524
11A63 – Radix representation; digital prob-
lems 527
Kummer’s theorem 527
corollary of Kummer’s theorem 528
11A67 – Other representations 529
Sierpinski Erdős egyptian fraction conjecture 529
adjacent fraction 529
any rational number is a sum of unit fractions
530
conjecture on fractions with odd denominators
532
unit fraction 532
11A99 – Miscellaneous 533
ABC conjecture 533
Suranyi theorem 533
irrational to an irrational power can be rational
534
triangular numbers 534
11B05 – Density, gaps, topology 536
Cauchy-Davenport theorem 536
Mann’s theorem 536
Schnirelmann density 537
Sidon set 537
asymptotic density 538
discrete space 538
essential component 539
normal order 539
11B13 – Additive bases 541
Erdős-Turan conjecture 541
additive basis 542
asymptotic basis 542
base conversion 542
sumset 546
11B25 – Arithmetic progressions 547
Behrend’s construction 547
Freiman’s theorem 548
Szemerédi’s theorem 548
multidimensional arithmetic progression 549
11B34 – Representation functions 550
Erdős-Fuchs theorem 550
11B37 – Recurrences 551
Collatz problem 551
recurrence relation 551
11B39 – Fibonacci and Lucas numbers and
polynomials and generalizations 553
Fibonacci sequence 553
Hogatt’s theorem 554
Lucas numbers 554
golden ratio 554
11B50 – Sequences (mod m) 556
Erdős-Ginzburg-Ziv theorem 556
11B57 – Farey sequences; the sequences 1^k, 2^k, ... 557
Farey sequence 557
11B65 – Binomial coefficients; factorials;
q-identities 559
Lucas’s Theorem 559
binomial theorem 559
11B68 – Bernoulli and Euler numbers and
polynomials 561
Bernoulli number 561
Bernoulli periodic function 561
Bernoulli polynomial 562
generalized Bernoulli number 562
11B75 – Other combinatorial number the-
ory 563
Erdős-Heilbronn conjecture 563
Freiman isomorphism 563
sum-free 564
11B83 – Special sequences and polynomi-
als 565
Beatty sequence 565
Beatty’s theorem 566
Fraenkel’s partition theorem 566
Sierpinski numbers 567
palindrome 567
proof of Beatty’s theorem 568
square-free sequence 569
superincreasing sequence 569
11B99 – Miscellaneous 570
Lychrel number 570
closed form 571
11C08 – Polynomials 573
content of a polynomial 573
cyclotomic polynomial 573
height of a polynomial 574
length of a polynomial 574
proof of Eisenstein criterion 574
proof that the cyclotomic polynomial is irreducible
575
11D09 – Quadratic and bilinear equations
577
Pell’s equation and simple continued fractions
577
11D41 – Higher degree equations; Fermat’s
equation 578
Beal conjecture 578
Euler quartic conjecture 579
Fermat’s last theorem 580
11D79 – Congruences in many variables
582
Chinese remainder theorem 582
Chinese remainder theorem proof 583
11D85 – Representation problems 586
polygonal number 586
11D99 – Miscellaneous 588
Diophantine equation 588
11E39 – Bilinear and Hermitian forms 590
Hermitian form 590
non-degenerate bilinear form 590
positive definite form 591
symmetric bilinear form 591
Clifford algebra 591
11Exx – Forms and linear algebraic groups
593
quadratic function associated with a linear func-
tional 593
11F06 – Structure of modular groups and
generalizations; arithmetic groups 594
Taniyama-Shimura theorem 594
11F30 – Fourier coefficients of automor-
phic forms 597
Fourier coefficients 597
11F67 – Special values of automorphic L-
series, periods of modular forms, cohomol-
ogy, modular symbols 598
Schanuel’s conjecture 598
period 598
11G05 – Elliptic curves over global fields
600
complex multiplication 600
11H06 – Lattices and convex bodies 602
Minkowski’s theorem 602
lattice in Rⁿ 602
11H46 – Products of linear forms 604
triple scalar product 604
11J04 – Homogeneous approximation to
one number 605
Dirichlet’s approximation theorem 605
11J68 – Approximation to algebraic num-
bers 606
Davenport-Schmidt theorem 606
Liouville approximation theorem 606
proof of Liouville approximation theorem 607
11J72 – Irrationality; linear independence
over a field 609
nth root of 2 is irrational for n ≥ 3 (proof using
Fermat’s last theorem) 609
e is irrational (proof) 610
irrational 610
square root of 2 is irrational 611
11J81 – Transcendence (general theory)
612
Fundamental Theorem of Transcendence 612
Gelfond’s theorem 612
four exponentials conjecture 612
six exponentials theorem 613
transcendental number 614
11K16 – Normal numbers, radix expan-
sions, etc. 615
absolutely normal 615
11K45 – Pseudo-random numbers; Monte
Carlo methods 617
pseudorandom numbers 617
quasirandom numbers 618
random numbers 619
truly random numbers 619
11L03 – Trigonometric and exponential sums,
general 620
Ramanujan sum 620
11L05 – Gauss and Kloosterman sums; gen-
eralizations 622
Gauss sum 622
Kloosterman sum 623
Landsberg-Schaar relation 623
derivation of Gauss sum up to a sign 624
11L40 – Estimates on character sums 625
Pólya-Vinogradov inequality 625
11M06 – ζ(s) and L(s, χ) 627
Apéry’s constant 627
Dedekind zeta function 627
Dirichlet L-series 628
Riemann θ-function 629
Riemann Xi function 630
Riemann omega function 630
functional equation for the Riemann Xi function
630
functional equation for the Riemann theta func-
tion 631
generalized Riemann hypothesis 631
proof of functional equation for the Riemann theta
function 631
11M99 – Miscellaneous 633
Riemann zeta function 633
formulae for zeta in the critical strip 636
functional equation of the Riemann zeta function
638
value of the Riemann zeta function at s = 2 638
11N05 – Distribution of primes 640
Bertrand’s conjecture 640
Brun’s constant 640
proof of Bertrand’s conjecture 640
twin prime conjecture 642
11N13 – Primes in progressions 643
primes in progressions 648
11N32 – Primes represented by polynomi-
als; other multiplicative structure of poly-
nomial values 644
Euler four-square identity 644
11N56 – Rate of growth of arithmetic func-
tions 645
highly composite number 645
11N99 – Miscellaneous 646
Chinese remainder theorem 646
proof of chinese remainder theorem 646
11P05 – Waring’s problem and variants
648
Lagrange’s four-square theorem 648
Waring’s problem 648
proof of Lagrange’s four-square theorem 649
11P81 – Elementary theory of partitions
651
pentagonal number theorem 651
11R04 – Algebraic numbers; rings of alge-
braic integers 653
Dedekind domain 653
Dirichlet’s unit theorem 653
Eisenstein integers 654
Galois representation 654
Gaussian integer 658
algebraic conjugates 659
algebraic integer 659
algebraic number 659
algebraic number field 659
calculating the splitting of primes 660
characterization in terms of prime ideals 661
ideal classes form an abelian group 661
integral basis 661
integrally closed 662
transcendental root theorem 662
11R06 – PV-numbers and generalizations;
other special algebraic numbers 663
Salem number 663
11R11 – Quadratic extensions 664
prime ideal decomposition in quadratic exten-
sions of ℚ 664
11R18 – Cyclotomic extensions 666
Kronecker-Weber theorem 666
examples of regular primes 667
prime ideal decomposition in cyclotomic exten-
sions of ℚ 668
regular prime 669
11R27 – Units and factorization 670
regulator 670
11R29 – Class numbers, class groups, dis-
criminants 672
Existence of Hilbert Class Field 672
class number formula 673
discriminant 673
ideal class 674
ray class group 675
11R32 – Galois theory 676
Galois criterion for solvability of a polynomial by
radicals 676
11R34 – Galois cohomology 677
Hilbert Theorem 90 677
11R37 – Class field theory 678
Artin map 678
Tchebotarev density theorem 679
modulus 679
multiplicative congruence 680
ray class field 680
11R56 – Adèle rings and groups 682
adèle 682
idèle 682
restricted direct product 683
11R99 – Miscellaneous 684
Henselian field 684
valuation 685
11S15 – Ramification and extension the-
ory 686
decomposition group 686
examples of prime ideal decomposition in num-
ber fields 688
inertial degree 691
ramification index 692
unramified action 697
11S31 – Class field theory; p-adic formal
groups 699
Hilbert symbol 699
11S99 – Miscellaneous 700
p-adic integers 700
local field 701
11Y05 – Factorization 703
Pollard’s rho method 703
quadratic sieve 706
11Y55 – Calculation of integer sequences
709
Kolakoski sequence 709
11Z05 – Miscellaneous applications of num-
ber theory 711
τ function 711
arithmetic derivative 711
example of arithmetic derivative 712
proof that τ(n) is the number of positive divisors
of n 712
12-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 714
monomial 714
order and degree of polynomial 715
12-XX – Field theory and polynomials 716
homogeneous polynomial 716
subfield 716
12D05 – Polynomials: factorization 717
factor theorem 717
proof of factor theorem 717
proof of rational root theorem 718
rational root theorem 719
sextic equation 719
12D10 – Polynomials: location of zeros
(algebraic theorems) 720
Cardano’s derivation of the cubic formula 720
Ferrari-Cardano derivation of the quartic formula
721
Galois-theoretic derivation of the cubic formula
722
Galois-theoretic derivation of the quartic formula
724
cubic formula 728
derivation of quadratic formula 728
quadratic formula 729
quartic formula 730
reciprocal polynomial 730
root 731
variant of Cardano’s derivation 732
12D99 – Miscellaneous 733
Archimedean property 733
complex 734
complex conjugate 735
complex number 737
examples of totally real fields 738
fundamental theorem of algebra 739
imaginary 739
imaginary unit 739
indeterminate form 739
inequalities for real numbers 740
interval 742
modulus of complex number 743
proof of fundamental theorem of algebra 744
proof of the fundamental theorem of algebra 744
real and complex embeddings 744
real number 746
totally real and imaginary fields 747
12E05 – Polynomials (irreducibility, etc.)
748
Gauss’s Lemma I 748
Gauss’s Lemma II 749
discriminant 749
polynomial ring 751
resolvent 751
de Moivre identity 754
monic 754
Wedderburn’s Theorem 754
proof of Wedderburn’s theorem 755
second proof of Wedderburn’s theorem 756
finite field 757
Frobenius automorphism 760
characteristic 761
characterization of field 761
example of an infinite field of finite characteristic
762
examples of fields 762
field 764
field homomorphism 764
prime subfield 765
12F05 – Algebraic extensions 766
a finite extension of fields is an algebraic exten-
sion 766
algebraic closure 767
algebraic extension 767
algebraically closed 767
algebraically dependent 768
existence of the minimal polynomial 768
finite extension 769
minimal polynomial 769
norm 770
primitive element theorem 770
splitting field 770
the field extension ℝ/ℚ is not finite 771
trace 771
12F10 – Separable extensions, Galois the-
ory 772
Abelian extension 772
Fundamental Theorem of Galois Theory 772
Galois closure 773
Galois conjugate 773
Galois extension 773
Galois group 773
absolute Galois group 774
cyclic extension 774
example of nonperfect field 774
fixed field 774
infinite Galois theory 774
normal closure 776
normal extension 776
perfect field 777
radical extension 777
separable 777
separable closure 778
12F20 – Transcendental extensions 779
transcendence degree 779
12F99 – Miscellaneous 780
composite field 780
extension field 780
12J15 – Ordered fields 782
ordered field 782
13-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 783
absolute value 783
associates 784
cancellation ring 784
comaximal 784
every prime ideal is radical 784
module 785
radical of an ideal 786
ring 786
subring 787
tensor product 787
13-XX – Commutative rings and algebras
789
commutative ring 789
13A02 – Graded rings 790
graded ring 790
13A05 – Divisibility 791
Eisenstein criterion 791
13A10 – Radical theory 792
Hilbert’s Nullstellensatz 792
nilradical 792
radical of an integer 793
13A15 – Ideals; multiplicative ideal theory
794
contracted ideal 794
existence of maximal ideals 794
extended ideal 795
fractional ideal 796
homogeneous ideal 797
ideal 797
maximal ideal 797
principal ideal 798
the set of prime ideals of a commutative ring
with identity 798
13A50 – Actions of groups on commuta-
tive rings; invariant theory 799
Schwarz (1975) theorem 799
invariant polynomial 800
13A99 – Miscellaneous 801
Lagrange’s identity 801
characteristic 802
cyclic ring 802
proof of Euler four-square identity 803
proof that every subring of a cyclic ring is a cyclic
ring 804
proof that every subring of a cyclic ring is an
ideal 804
zero ring 805
13B02 – Extension theory 806
algebraic 806
module-finite 806
13B05 – Galois theory 807
algebraic 807
13B21 – Integral dependence 808
integral 808
13B22 – Integral closure of rings and ide-
als ; integrally closed rings, related rings
(Japanese, etc.) 809
integral closure 809
13B30 – Quotients and localization 810
fraction field 810
localization 810
multiplicative set 811
13C10 – Projective and free modules and
ideals 812
example of free module 812
13C12 – Torsion modules and ideals 813
torsion element 813
13C15 – Dimension theory, depth, related
rings (catenary, etc.) 814
Krull’s principal ideal theorem 814
13C99 – Miscellaneous 815
Artin-Rees theorem 815
Nakayama’s lemma 815
prime ideal 815
proof of Nakayama’s lemma 816
proof of Nakayama’s lemma 817
support 817
13E05 – Noetherian rings and modules 818
Hilbert basis theorem 818
Noetherian module 818
proof of Hilbert basis theorem 819
finitely generated modules over a principal ideal
domain 819
13F07 – Euclidean rings and generaliza-
tions 821
Euclidean domain 821
Euclidean valuation 821
proof of Bezout’s Theorem 822
proof that an Euclidean domain is a PID 822
13F10 – Principal ideal rings 823
Smith normal form 823
13F25 – Formal power series rings 825
formal power series 825
13F30 – Valuation rings 831
discrete valuation 831
discrete valuation ring 831
13G05 – Integral domains 833
Dedekind-Hasse valuation 833
PID 834
UFD 834
a finite integral domain is a field 835
an artinian integral domain is a field 835
example of PID 835
field of quotients 836
integral domain 836
irreducible 837
motivation for Euclidean domains 837
zero divisor 838
13H05 – Regular local rings 839
regular local ring 839
13H99 – Miscellaneous 840
local ring 840
semi-local ring 841
13J10 – Complete rings, completion 842
completion 842
13J25 – Ordered rings 844
ordered ring 844
13J99 – Miscellaneous 845
topological ring 845
13N15 – Derivations 846
derivation 846
13P10 – Polynomial ideals, Gröbner bases
847
Gröbner basis 847
14-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 849
Picard group 849
affine space 849
affine variety 849
dual isogeny 850
finite morphism 850
isogeny 851
line bundle 851
nonsingular variety 852
projective space 852
projective variety 854
quasi-finite morphism 854
14A10 – Varieties and morphisms 855
Zariski topology 855
algebraic map 856
algebraic sets and polynomial ideals 856
noetherian topological space 857
regular map 857
structure sheaf 858
14A15 – Schemes and morphisms 859
closed immersion 859
coherent sheaf 859
fibre product 860
prime spectrum 860
scheme 863
separated scheme 864
singular set 864
14A99 – Miscellaneous 865
Cartier divisor 865
General position 865
Serre’s twisting theorem 866
ample 866
height of a prime ideal 866
invertible sheaf 866
locally free 867
normal irreducible varieties are nonsingular in
codimension 1 867
sheaf of meromorphic functions 867
very ample 867
14C20 – Divisors, linear systems, invert-
ible sheaves 869
divisor 869
Rational and birational maps 870
general type 870
14F05 – Vector bundles, sheaves, related
constructions 871
direct image (functor) 871
14F20 – Étale and other Grothendieck topologies and cohomologies 872
site 872
14F25 – Classical real and complex coho-
mology 873
Serre duality 873
sheaf cohomology 874
14G05 – Rational points 875
Hasse principle 875
14H37 – Automorphisms 876
Frobenius morphism 876
14H45 – Special curves and curves of low
genus 878
Fermat’s spiral 878
archimedean spiral 878
folium of Descartes 879
spiral 879
14H50 – Plane and space curves 880
torsion (space curve) 880
14H52 – Elliptic curves 881
Birch and Swinnerton-Dyer conjecture 881
Hasse’s bound for elliptic curves over finite fields
882
L-series of an elliptic curve 882
Mazur’s theorem on torsion of elliptic curves 884
Mordell curve 884
Nagell-Lutz theorem 885
Selmer group 886
bad reduction 887
conductor of an elliptic curve 890
elliptic curve 890
height function 894
j-invariant 895
rank of an elliptic curve 896
supersingular 897
the torsion subgroup of an elliptic curve injects
in the reduction of the curve 897
14H99 – Miscellaneous 900
Riemann-Roch theorem 900
genus 900
projective curve 901
proof of Riemann-Roch theorem 901
14L17 – Affine algebraic groups, hyperal-
gebra constructions 902
affine algebraic group 902
algebraic torus 902
14M05 – Varieties defined by ring con-
ditions (factorial, Cohen-Macaulay, semi-
normal) 903
normal 903
14M15 – Grassmannians, Schubert vari-
eties, flag manifolds 904
Borel-Bott-Weil theorem 904
flag variety 905
14R15 – Jacobian problem 906
Jacobian conjecture 906
15-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 907
Cholesky decomposition 907
Hadamard matrix 908
Hessenberg matrix 909
If A ∈ Mₙ(k) and A is supertriangular then Aⁿ = 0 910
Jacobi determinant 910
Jacobi’s Theorem 912
Kronecker product 912
LU decomposition 913
Peetre’s inequality 914
Schur decomposition 915
antipodal 916
conjugate transpose 916
corollary of Schur decomposition 917
covector 918
diagonal matrix 918
diagonalization 920
diagonally dominant matrix 920
eigenvalue (of a matrix) 921
eigenvalue problem 922
eigenvalues of orthogonal matrices 924
eigenvector 925
exactly determined 926
free vector space over a set 926
in a vector space, λv = 0 if and only if λ = 0 or
v is the zero vector 928
invariant subspace 929
least squares 929
linear algebra 930
linear least squares 932
linear manifold 934
matrix exponential 934
matrix operations 935
nilpotent matrix 938
nilpotent transformation 938
non-zero vector 939
off-diagonal entry 940
orthogonal matrices 940
orthogonal vectors 941
overdetermined 941
partitioned matrix 941
pentadiagonal matrix 942
proof of Cayley-Hamilton theorem 942
proof of Schur decomposition 943
singular value decomposition 944
skew-symmetric matrix 945
square matrix 946
strictly upper triangular matrix 946
symmetric matrix 947
theorem for normal triangular matrices 947
triangular matrix 948
tridiagonal matrix 949
under determined 950
unit triangular matrix 950
unitary 951
vector space 952
vector subspace 953
zero map 954
zero vector in a vector space is unique 955
zero vector space 955
15-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 956
circulant matrix 956
matrix 957
15-XX – Linear and multilinear algebra;
matrix theory 960
linearly dependent functions 960
15A03 – Vector spaces, linear dependence,
rank 961
Sylvester’s law 961
basis 961
complementary subspace 962
dimension 963
every vector space has a basis 964
flag 964
frame 965
linear combination 968
linear independence 968
list vector 968
nullity 969
orthonormal basis 970
physical vector 970
proof of rank-nullity theorem 972
rank 973
rank-nullity theorem 973
similar matrix 974
span 975
theorem for the direct sum of finite dimensional
vector spaces 976
vector 976
15A04 – Linear transformations, semilin-
ear transformations 980
admissibility 980
conductor of a vector 980
cyclic decomposition theorem 981
cyclic subspace 981
dimension theorem for symplectic complement
(proof) 981
dual homomorphism 982
dual homomorphism of the derivative 983
image of a linear transformation 984
invertible linear transformation 984
kernel of a linear transformation 985
linear transformation 985
minimal polynomial (endomorphism) 986
symplectic complement 987
trace 988
15A06 – Linear equations 989
Gaussian elimination 989
finite-dimensional linear problem 991
homogeneous linear problem 992
linear problem 993
reduced row echelon form 993
row echelon form 994
under-determined polynomial interpolation 994
15A09 – Matrix inversion, generalized in-
verses 996
matrix adjoint 996
matrix inverse 997
15A12 – Conditioning of matrices 1000
singular 1000
15A15 – Determinants, permanents, other
special matrix functions 1001
Cayley-Hamilton theorem 1001
Cramer’s rule 1001
cofactor expansion 1002
determinant 1003
determinant as a multilinear mapping 1005
determinants of some matrices of special form
1006
example of Cramer’s rule 1006
proof of Cramer’s rule 1008
proof of cofactor expansion 1008
resolvent matrix 1009
15A18 – Eigenvalues, singular values, and
eigenvectors 1010
Jordan canonical form theorem 1010
Lagrange multiplier method 1011
Perron-Frobenius theorem 1011
characteristic equation 1012
eigenvalue 1012
eigenvalue 1013
15A21 – Canonical forms, reductions, clas-
sification 1015
companion matrix 1015
eigenvalues of an involution 1015
linear involution 1016
normal matrix 1017
projection 1018
quadratic form 1019
15A23 – Factorization of matrices 1021
QR decomposition 1021
15A30 – Algebraic systems of matrices 1023
ideals in matrix algebras 1023
15A36 – Matrices of integers 1025
permutation matrix 1025
15A39 – Linear inequalities 1026
Farkas lemma 1026
15A42 – Inequalities involving eigenvalues
and eigenvectors 1027
Gershgorin’s circle theorem 1027
Gershgorin’s circle theorem result 1027
Schur’s inequality 1028
15A48 – Positive matrices and their gen-
eralizations; cones of matrices 1029
negative definite 1029
negative semidefinite 1029
positive definite 1030
positive semidefinite 1030
primitive matrix 1031
reducible matrix 1031
15A51 – Stochastic matrices 1032
Birkhoff-von Neumann theorem 1032
proof of Birkhoff-von Neumann theorem 1032
15A57 – Other types of matrices (Hermi-
tian, skew-Hermitian, etc.) 1035
Hermitian matrix 1035
direct sum of Hermitian and skew-Hermitian ma-
trices 1036
identity matrix 1037
skew-Hermitian matrix 1037
transpose 1038
15A60 – Norms of matrices, numerical range,
applications of functional analysis to ma-
trix theory 1041
Frobenius matrix norm 1041
matrix p-norm 1042
self consistent matrix norm 1043
15A63 – Quadratic and bilinear forms, in-
ner products 1044
Cauchy-Schwarz inequality 1044
adjoint endomorphism 1045
anti-symmetric 1046
bilinear map 1046
dot product 1049
every orthonormal set is linearly independent 1050
inner product 1051
inner product space 1051
proof of Cauchy-Schwarz inequality 1052
self-dual 1052
skew-symmetric bilinear form 1053
spectral theorem 1053
15A66 – Clifford algebras, spinors 1056
geometric algebra 1056
15A69 – Multilinear algebra, tensor prod-
ucts 1058
Einstein summation convention 1058
basic tensor 1059
multi-linear 1061
outer multiplication 1061
tensor 1062
tensor algebra 1065
tensor array 1065
tensor product (vector spaces) 1067
tensor transformations 1069
15A72 – Vector and tensor algebra, theory
of invariants 1072
bac-cab rule 1072
cross product 1072
euclidean vector 1073
rotational invariance of cross product 1074
15A75 – Exterior algebra, Grassmann al-
gebras 1076
contraction 1076
exterior algebra 1077
15A99 – Miscellaneous topics 1081
Kronecker delta 1081
dual space 1081
example of trace of a matrix 1083
generalized Kronecker delta symbol 1083
linear functional 1084
modules are a generalization of vector spaces 1084
proof of properties of trace of a matrix 1085
quasipositive matrix 1086
trace of a matrix 1086
Volume 2
16-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1088
direct product of modules 1088
direct sum 1089
exact sequence 1089
quotient ring 1090
16D10 – General module theory 1091
annihilator 1091
annihilator is an ideal 1091
artinian 1092
composition series 1092
conjugate module 1093
modular law 1093
module 1093
proof of modular law 1094
zero module 1094
16D20 – Bimodules 1095
bimodule 1095
16D25 – Ideals 1096
associated prime 1096
nilpotent ideal 1096
primitive ideal 1096
product of ideals 1097
proper ideal 1097
semiprime ideal 1097
zero ideal 1098
16D40 – Free, projective, and flat modules
and ideals 1099
finitely generated projective module 1099
flat module 1099
free module 1100
free module 1100
projective cover 1100
projective module 1101
16D50 – Injective modules, self-injective
rings 1102
injective hull 1102
injective module 1102
16D60 – Simple and semisimple modules,
primitive rings and ideals 1104
central simple algebra 1104
completely reducible 1104
simple ring 1105
16D80 – Other classes of modules and ide-
als 1106
essential submodule 1106
faithful module 1106
minimal prime ideal 1107
module of finite rank 1107
simple module 1107
superfluous submodule 1107
uniform module 1108
16E05 – Syzygies, resolutions, complexes
1109
n-chain 1109
chain complex 1109
flat resolution 1110
free resolution 1110
injective resolution 1110
projective resolution 1110
short exact sequence 1111
split short exact sequence 1111
von Neumann regular 1111
16K20 – Finite-dimensional 1112
quaternion algebra 1112
16K50 – Brauer groups 1113
Brauer group 1113
16K99 – Miscellaneous 1114
division ring 1114
16N20 – Jacobson radical, quasimultipli-
cation 1115
Jacobson radical 1115
a ring modulo its Jacobson radical is semiprimi-
tive 1116
examples of semiprimitive rings 1116
proof of Characterizations of the Jacobson radi-
cal 1117
properties of the Jacobson radical 1118
quasi-regularity 1119
semiprimitive ring 1120
16N40 – Nil and nilpotent radicals, sets,
ideals, rings 1121
Koethe conjecture 1121
nil and nilpotent ideals 1121
16N60 – Prime and semiprime rings 1123
prime ring 1123
16N80 – General radicals and rings 1124
prime radical 1124
radical theory 1124
16P40 – Noetherian rings and modules 1126
Noetherian ring 1126
noetherian 1126
16P60 – Chain conditions on annihilators
and summands: Goldie-type conditions,
Krull dimension 1128
Goldie ring 1128
uniform dimension 1128
16S10 – Rings determined by universal prop-
erties (free algebras, coproducts, adjunc-
tion of inverses, etc.) 1130
Ore domain 1130
16S34 – Group rings, Laurent polynomial
rings 1131
support 1131
16S36 – Ordinary and skew polynomial rings
and semigroup rings 1132
Gaussian polynomials 1132
q skew derivation 1133
q skew polynomial ring 1133
sigma derivation 1133
sigma, delta constant 1133
skew derivation 1133
skew polynomial ring 1134
16S99 – Miscellaneous 1135
algebra 1135
algebra (module) 1135
16U10 – Integral domains 1137
Prüfer domain 1137
valuation domain 1137
16U20 – Ore rings, multiplicative sets, Ore
localization 1139
Goldie’s Theorem 1139
Ore condition 1139
Ore’s theorem 1140
classical ring of quotients 1140
saturated 1141
16U70 – Center, normalizer (invariant el-
ements) 1142
center (rings) 1142
16U99 – Miscellaneous 1143
anti-idempotent 1143
16W20 – Automorphisms and endomor-
phisms 1144
ring of endomorphisms 1144
16W30 – Coalgebras, bialgebras, Hopf al-
gebras; rings, modules, etc. on which
these act 1146
Hopf algebra 1146
almost cocommutative bialgebra 1147
bialgebra 1148
coalgebra 1148
coinvariant 1149
comodule 1149
comodule algebra 1149
comodule coalgebra 1150
module algebra 1150
module coalgebra 1150
16W50 – Graded rings and modules 1151
graded algebra 1151
graded module 1151
supercommutative 1151
16W55 – “Super” (or “skew”) structure
1153
super tensor product 1153
superalgebra 1153
supernumber 1154
16W99 – Miscellaneous 1155
Hamiltonian quaternions 1155
16Y30 – Near-rings 1158
near-ring 1158
17A01 – General theory 1159
commutator bracket 1159
17B05 – Structure theory 1161
Killing form 1161
Levi’s theorem 1161
nilradical 1161
radical 1162
17B10 – Representations, algebraic theory
(weights) 1163
Ado’s theorem 1163
Lie algebra representation 1163
adjoint representation 1164
examples of non-matrix Lie groups 1165
isotropy representation 1165
17B15 – Representations, analytic theory
1166
invariant form (Lie algebras) 1166
17B20 – Simple, semisimple, reductive (su-
per)algebras (roots) 1167
Borel subalgebra 1167
Borel subgroup 1167
Cartan matrix 1168
Cartan subalgebra 1168
Cartan’s criterion 1168
Casimir operator 1168
Dynkin diagram 1169
Verma module 1169
Weyl chamber 1170
Weyl group 1170
Weyl’s theorem 1170
classification of finite-dimensional representations
of semi-simple Lie algebras 1171
cohomology of semi-simple Lie algebras 1171
nilpotent cone 1171
parabolic subgroup 1172
pictures of Dynkin diagrams 1172
positive root 1175
rank 1175
root lattice 1175
root system 1176
simple and semi-simple Lie algebras 1177
simple root 1178
weight (Lie algebras) 1178
weight lattice 1178
17B30 – Solvable, nilpotent (super)algebras
1179
Engel’s theorem 1179
Lie’s theorem 1182
solvable Lie algebra 1183
17B35 – Universal enveloping (super)algebras
1184
Poincaré-Birkhoff-Witt theorem 1184
universal enveloping algebra 1185
17B56 – Cohomology of Lie (super)algebras
1187
Lie algebra cohomology 1187
17B67 – Kac-Moody (super)algebras (struc-
ture and representation theory) 1188
Kac-Moody algebra 1188
generalized Cartan matrix 1188
17B99 – Miscellaneous 1190
Jacobi identity interpretations 1190
Lie algebra 1190
real form 1192
18-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1193
Grothendieck spectral sequence 1193
category of sets 1194
functor 1194
monic 1194
natural equivalence 1195
representable functor 1195
supplemental axioms for an Abelian category 1195
18A05 – Definitions, generalizations 1197
autofunctor 1197
automorphism 1197
category 1198
category example (arrow category) 1199
commutative diagram 1199
double dual embedding 1200
dual category 1201
duality principle 1201
endofunctor 1202
examples of initial objects, terminal objects and
zero objects 1202
forgetful functor 1204
isomorphism 1205
natural transformation 1205
types of homomorphisms 1205
zero object 1206
18A22 – Special properties of functors (faith-
ful, full, etc.) 1208
exact functor 1208
18A25 – Functor categories, comma cate-
gories 1210
Yoneda embedding 1210
18A30 – Limits and colimits (products, sums,
directed limits, pushouts, fiber products,
equalizers, kernels, ends and coends, etc.)
1211
categorical direct product 1211
categorical direct sum 1211
kernel 1212
18A40 – Adjoint functors (universal con-
structions, reflective subcategories, Kan ex-
tensions, etc.) 1213
adjoint functor 1213
equivalence of categories 1214
18B40 – Groupoids, semigroupoids, semi-
groups, groups (viewed as categories) 1215
groupoid (category theoretic) 1215
18E10 – Exact categories, abelian cate-
gories 1216
abelian category 1216
exact sequence 1217
derived category 1218
enough injectives 1218
18F20 – Presheaves and sheaves 1219
locally ringed space 1219
presheaf 1220
sheaf 1220
sheafification 1225
stalk 1226
18F30 – Grothendieck groups 1228
Grothendieck group 1228
18G10 – Resolutions; derived functors 1229
derived functor 1229
18G15 – Ext and Tor, generalizations, Künneth
formula 1231
Ext 1231
18G30 – Simplicial sets, simplicial objects
(in a category) 1232
nerve 1232
simplicial category 1232
simplicial object 1233
18G35 – Chain complexes 1235
5-lemma 1235
9-lemma 1236
Snake lemma 1236
chain homotopy 1237
chain map 1237
homology (chain complex) 1237
18G40 – Spectral sequences, hypercoho-
mology 1238
spectral sequence 1238
19-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1239
Algebraic K-theory 1239
K-theory 1240
examples of algebraic K-theory groups 1241
19K33 – EXT and K-homology 1242
Fredholm module 1242
K-homology 1243
19K99 – Miscellaneous 1244
examples of K-theory groups 1244
20-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1245
alternating group is a normal subgroup of the
symmetric group 1245
associative 1245
canonical projection 1246
centralizer 1246
commutative 1247
examples of groups 1247
group 1250
quotient group 1250
20-02 – Research exposition (monographs,
survey articles) 1252
length function 1252
20-XX – Group theory and generalizations
1253
free product with amalgamated subgroup 1253
nonabelian group 1254
20A05 – Axiomatics and elementary prop-
erties 1255
Feit-Thompson theorem 1255
Proof: The orbit of any element of a group is a
subgroup 1255
center 1256
characteristic subgroup 1256
class function 1257
conjugacy class 1258
conjugacy class formula 1258
conjugate stabilizer subgroups 1258
coset 1259
cyclic group 1259
derived subgroup 1260
equivariant 1260
examples of finite simple groups 1261
finitely generated group 1262
first isomorphism theorem 1262
fourth isomorphism theorem 1262
generator 1263
group actions and homomorphisms 1263
group homomorphism 1265
homogeneous space 1265
identity element 1268
inner automorphism 1268
kernel 1269
maximal 1269
normal subgroup 1269
normality of subgroups is not transitive 1269
normalizer 1270
order (of a group) 1271
presentation of a group 1271
proof of first isomorphism theorem 1272
proof of second isomorphism theorem 1273
proof that all cyclic groups are abelian 1274
proof that all cyclic groups of the same order are
isomorphic to each other 1274
proof that all subgroups of a cyclic group are
cyclic 1274
regular group action 1275
second isomorphism theorem 1275
simple group 1276
solvable group 1276
subgroup 1276
third isomorphism theorem 1277
20A99 – Miscellaneous 1279
Cayley table 1279
proper subgroup 1280
quaternion group 1280
20B05 – General theory for finite groups
1282
cycle notation 1282
permutation group 1283
20B15 – Primitive groups 1284
primitive transitive permutation group 1284
20B20 – Multiply transitive finite groups
1286
Jordan’s theorem (multiply transitive groups) 1286
multiply transitive 1286
sharply multiply transitive 1287
20B25 – Finite automorphism groups of al-
gebraic, geometric, or combinatorial struc-
tures 1288
diamond theory 1288
20B30 – Symmetric groups 1289
symmetric group 1289
symmetric group 1289
20B35 – Subgroups of symmetric groups
1290
Cayley’s theorem 1290
20B99 – Miscellaneous 1291
(p, q) shuffle 1291
Frobenius group 1291
permutation 1292
proof of Cayley’s theorem 1292
20C05 – Group rings of finite groups and
their modules 1294
group ring 1294
20C15 – Ordinary representations and char-
acters 1295
Maschke’s theorem 1295
a representation which is not completely reducible
1295
orthogonality relations 1296
20C30 – Representations of finite symmet-
ric groups 1299
example of immanent 1299
immanent 1299
permanent 1299
20C99 – Miscellaneous 1301
Frobenius reciprocity 1301
Schur’s lemma 1301
character 1302
group representation 1303
induced representation 1303
regular representation 1304
restriction representation 1304
20D05 – Classification of simple and non-
solvable groups 1305
Burnside p-q theorem 1305
classification of semisimple groups 1305
semisimple group 1305
20D08 – Simple groups: sporadic groups
1307
Janko groups 1307
20D10 – Solvable groups, theory of for-
mations, Schunck classes, Fitting classes,
π-length, ranks 1308
Čuhinin’s Theorem 1308
separable 1308
supersolvable group 1309
20D15 – Nilpotent groups, p-groups 1310
Burnside basis theorem 1310
20D20 – Sylow subgroups, Sylow proper-
ties, π-groups, π-structure 1311
π-groups and π′-groups 1311
p-subgroup 1311
Burnside normal complement theorem 1312
Frattini argument 1312
Sylow p-subgroup 1312
Sylow theorems 1312
Sylow’s first theorem 1313
Sylow’s third theorem 1313
application of Sylow’s theorems to groups of or-
der pq 1313
p-primary component 1314
proof of Frattini argument 1314
proof of Sylow theorems 1314
subgroups containing the normalizers of Sylow
subgroups normalize themselves 1316
20D25 – Special subgroups (Frattini, Fit-
ting, etc.) 1317
Fitting’s theorem 1317
characteristically simple group 1317
the Frattini subgroup is nilpotent 1317
20D30 – Series and lattices of subgroups
1319
maximal condition 1319
minimal condition 1319
subnormal series 1320
20D35 – Subnormal subgroups 1321
subnormal subgroup 1321
20D99 – Miscellaneous 1322
Cauchy’s theorem 1322
Lagrange’s theorem 1322
exponent 1322
fully invariant subgroup 1323
proof of Cauchy’s theorem 1323
proof of Lagrange’s theorem 1324
proof of the converse of Lagrange’s theorem for
finite cyclic groups 1324
proof that exp G divides |G| 1324
proof that |g| divides exp G 1325
proof that every group of prime order is cyclic
1325
20E05 – Free nonabelian groups 1326
Nielsen-Schreier theorem 1326
Schreier index formula 1326
free group 1326
proof of Nielsen-Schreier theorem and Schreier
index formula 1327
Jordan-Holder decomposition 1328
profinite group 1328
extension 1329
holomorph 1329
proof of the Jordan Holder decomposition theo-
rem 1329
semidirect product of groups 1330
wreath product 1333
Jordan-Hölder decomposition theorem 1334
simplicity of the alternating groups 1334
abelian groups of order 120 1337
fundamental theorem of finitely generated abelian
groups 1337
conjugacy class 1338
Frattini subgroup 1338
non-generator 1338
20Exx – Structure and classification of in-
finite or finite groups 1339
faithful group action 1339
20F18 – Nilpotent groups 1340
classification of finite nilpotent groups 1340
nilpotent group 1340
20F22 – Other classes of groups defined by
subgroup chains 1342
inverse limit 1342
20F28 – Automorphism groups of groups
1344
outer automorphism group 1344
20F36 – Braid groups; Artin groups 1345
braid group 1345
20F55 – Reflection and Coxeter groups 1347
cycle 1347
dihedral group 1348
20F65 – Geometric group theory 1349
groups that act freely on trees are free 1349
20F99 – Miscellaneous 1350
perfect group 1350
20G15 – Linear algebraic groups over ar-
bitrary fields 1351
Nagao’s theorem 1351
computation of the order of GL(n, F_q) 1351
general linear group 1352
order of the general linear group over a finite field
1352
special linear group 1352
20G20 – Linear algebraic groups over the
reals, the complexes, the quaternions 1353
orthogonal group 1353
20G25 – Linear algebraic groups over local
fields and their integers 1354
Ihara’s theorem 1354
20G40 – Linear algebraic groups over fi-
nite fields 1355
SL_2(F_3) 1355
20J06 – Cohomology of groups 1356
group cohomology 1356
stronger Hilbert theorem 90 1357
20J15 – Category of groups 1359
variety of groups 1359
20K01 – Finite abelian groups 1360
Schinzel’s theorem 1360
20K10 – Torsion groups, primary groups
and generalized primary groups 1361
torsion 1361
20K25 – Direct sums, direct products, etc.
1362
direct product of groups 1362
20K99 – Miscellaneous 1363
Klein 4-group 1363
divisible group 1364
example of divisible group 1364
locally cyclic group 1364
20Kxx – Abelian groups 1366
abelian group 1366
20M10 – General structure theory 1367
existence of maximal semilattice decomposition
1367
semilattice decomposition of a semigroup 1368
simple semigroup 1368
20M12 – Ideal theory 1370
Rees factor 1370
ideal 1370
20M14 – Commutative semigroups 1372
Archimedean semigroup 1372
commutative semigroup 1372
20M20 – Semigroups of transformations,
etc. 1373
semigroup of transformations 1373
20M30 – Representation of semigroups; ac-
tions of semigroups on sets 1375
counting theorem 1375
example of group action 1375
group action 1376
orbit 1377
proof of counting theorem 1377
stabilizer 1378
20M99 – Miscellaneous 1379
a semilattice is a commutative band 1379
adjoining an identity to a semigroup 1379
band 1380
bicyclic semigroup 1380
congruence 1381
cyclic semigroup 1381
idempotent 1382
null semigroup 1383
semigroup 1383
semilattice 1383
subsemigroup, submonoid, and subgroup 1384
zero elements 1384
20N02 – Sets with a single binary opera-
tion (groupoids) 1386
groupoid 1386
idempotency 1386
left identity and right identity 1387
20N05 – Loops, quasigroups 1388
Moufang loop 1388
loop and quasigroup 1389
22-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1390
fixed-point subspace 1390
22-XX – Topological groups, Lie groups
1391
Cantor space 1391
22A05 – Structure of general topological
groups 1392
topological group 1392
22C05 – Compact groups 1393
n-torus 1393
reductive 1393
22D05 – General properties and structure
of locally compact groups 1394
Γ-simple 1394
22D15 – Group algebras of locally com-
pact groups 1395
group C*-algebra 1395
22E10 – General properties and structure
of complex Lie groups 1396
existence and uniqueness of compact real form
1396
maximal torus 1397
Lie group 1397
complexification 1399
Hilbert-Weyl theorem 1400
the connection between Lie groups and Lie alge-
bras 1401
26-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1402
derivative notation 1402
fundamental theorems of calculus 1403
logarithm 1404
proof of the first fundamental theorem of calcu-
lus 1405
proof of the second fundamental theorem of cal-
culus 1405
root-mean-square 1406
square 1406
26-XX – Real functions 1408
abelian function 1408
full-width at half maximum 1408
26A03 – Foundations: limits and general-
izations, elementary topology of the line
1410
Cauchy sequence 1410
Dedekind cuts 1410
binomial proof of positive integer power rule 1413
exponential 1414
interleave sequence 1415
limit inferior 1415
limit superior 1416
power rule 1417
properties of the exponential 1417
squeeze rule 1418
26A06 – One-variable calculus 1420
Darboux’s theorem (analysis) 1420
Fermat’s Theorem (stationary points) 1420
Heaviside step function 1421
Leibniz’ rule 1421
Rolle’s theorem 1422
binomial formula 1422
chain rule 1422
complex Rolle’s theorem 1423
complex mean-value theorem 1423
definite integral 1424
derivative of even/odd function (proof) 1425
direct sum of even/odd functions (example) 1425
even/odd function 1426
example of chain rule 1427
example of increasing/decreasing/monotone func-
tion 1428
extended mean-value theorem 1428
increasing/decreasing/monotone function 1428
intermediate value theorem 1429
limit 1429
mean value theorem 1430
mean-value theorem 1430
monotonicity criterion 1431
nabla 1431
one-sided limit 1432
product rule 1432
proof of Darboux’s theorem 1433
proof of Fermat’s Theorem (stationary points)
1434
proof of Rolle’s theorem 1434
proof of Taylor’s Theorem 1435
proof of binomial formula 1436
proof of chain rule 1436
proof of extended mean-value theorem 1437
proof of intermediate value theorem 1437
proof of mean value theorem 1438
proof of monotonicity criterion 1439
proof of quotient rule 1439
quotient rule 1440
signum function 1440
26A09 – Elementary functions 1443
definitions in trigonometry 1443
hyperbolic functions 1444
26A12 – Rate of growth of functions, or-
ders of infinity, slowly varying functions
1446
Landau notation 1446
26A15 – Continuity and related questions
(modulus of continuity, semicontinuity, dis-
continuities, etc.) 1448
Dirichlet’s function 1448
semi-continuous 1448
semicontinuous 1449
uniformly continuous 1450
26A16 – Lipschitz (Hölder) classes 1451
Lipschitz condition 1451
Lipschitz condition and differentiability 1452
Lipschitz condition and differentiability result 1453
26A18 – Iteration 1454
iteration 1454
periodic point 1454
26A24 – Differentiation (functions of one
variable): general theory, generalized deriva-
tives, mean-value theorems 1455
Leibniz notation 1455
derivative 1456
l’Hôpital’s rule 1460
proof of De l’Hôpital’s rule 1461
related rates 1462
26A27 – Nondifferentiability (nondifferen-
tiable functions, points of nondifferentia-
bility), discontinuous derivatives 1464
Weierstrass function 1464
26A36 – Antidifferentiation 1465
antiderivative 1465
integration by parts 1465
integrations by parts for the Lebesgue integral
1466
26A42 – Integrals of Riemann, Stieltjes
and Lebesgue type 1468
Riemann sum 1468
Riemann-Stieltjes integral 1469
continuous functions are Riemann integrable 1469
generalized Riemann integral 1469
proof of Continuous functions are Riemann inte-
grable 1470
26A51 – Convexity, generalizations 1471
concave function 1471
26Axx – Functions of one variable 1472
function centroid 1472
26B05 – Continuity and differentiation ques-
tions 1473
C_0^∞(U) is not empty 1473
Rademacher’s Theorem 1474
smooth functions with compact support 1475
26B10 – Implicit function theorems, Jaco-
bians, transformations with several vari-
ables 1477
Jacobian matrix 1477
directional derivative 1477
gradient 1478
implicit differentiation 1481
implicit function theorem 1481
proof of implicit function theorem 1482
26B12 – Calculus of vector functions 1484
Clairaut’s theorem 1484
Fubini’s Theorem 1484
Generalised N-dimensional Riemann Sum 1485
Generalized N-dimensional Riemann Integral 1485
Helmholtz equation 1486
Hessian matrix 1487
Jordan Content of an N-cell 1487
Laplace equation 1487
chain rule (several variables) 1488
divergence 1489
extremum 1490
irrotational field 1490
partial derivative 1491
plateau 1492
proof of Green’s theorem 1492
relations between Hessian matrix and local ex-
trema 1493
solenoidal field 1494
26B15 – Integration: length, area, volume
1495
arc length 1495
26B20 – Integral formulas (Stokes, Gauss,
Green, etc.) 1497
Green’s theorem 1497
26B25 – Convexity, generalizations 1499
convex function 1499
extremal value of convex/concave functions 1500
26B30 – Absolutely continuous functions,
functions of bounded variation 1502
absolutely continuous function 1502
total variation 1503
26B99 – Miscellaneous 1505
derivation of zeroth weighted power mean 1505
weighted power mean 1506
26C15 – Rational functions 1507
rational function 1507
26C99 – Miscellaneous 1508
Laguerre Polynomial 1508
26D05 – Inequalities for trigonometric func-
tions and polynomials 1509
Weierstrass product inequality 1509
proof of Jordan’s Inequality 1509
26D10 – Inequalities involving derivatives
and differential and integral operators 1511
Gronwall’s lemma 1511
proof of Gronwall’s lemma 1511
26D15 – Inequalities for sums, series and
integrals 1513
Carleman’s inequality 1513
Chebyshev’s inequality 1513
MacLaurin’s Inequality 1514
Minkowski inequality 1514
Muirhead’s theorem 1515
Schur’s inequality 1515
Young’s inequality 1515
arithmetic-geometric-harmonic means inequality
1516
general means inequality 1516
power mean 1517
proof of Chebyshev’s inequality 1517
proof of Minkowski inequality 1518
proof of arithmetic-geometric-harmonic means in-
equality 1519
proof of general means inequality 1521
proof of rearrangement inequality 1522
rearrangement inequality 1523
26D99 – Miscellaneous 1524
Bernoulli’s inequality 1524
proof of Bernoulli’s inequality 1524
26E35 – Nonstandard analysis 1526
hyperreal 1526
e is not a quadratic irrational 1527
zero of a function 1528
28-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1530
extended real numbers 1530
28-XX – Measure and integration 1532
Riemann integral 1532
martingale 1532
28A05 – Classes of sets (Borel fields, σ-
rings, etc.), measurable sets, Suslin sets,
analytic sets 1534
Borel σ-algebra 1534
28A10 – Real- or complex-valued set func-
tions 1535
σ-finite 1535
Argand diagram 1535
Hahn-Kolmogorov theorem 1536
measure 1536
outer measure 1536
properties for measure 1538
28A12 – Contents, measures, outer mea-
sures, capacities 1540
Hahn decomposition theorem 1540
Jordan decomposition 1540
Lebesgue decomposition theorem 1541
Lebesgue outer measure 1541
absolutely continuous 1542
counting measure 1543
measurable set 1543
outer regular 1543
signed measure 1543
singular measure 1544
28A15 – Abstract differentiation theory,
differentiation of set functions 1545
Hardy-Littlewood maximal theorem 1545
Lebesgue differentiation theorem 1545
Radon-Nikodym theorem 1546
integral depending on a parameter 1547
28A20 – Measurable and nonmeasurable
functions, sequences of measurable func-
tions, modes of convergence 1549
Egorov’s theorem 1549
Fatou’s lemma 1549
Fatou-Lebesgue theorem 1550
dominated convergence theorem 1550
measurable function 1550
monotone convergence theorem 1551
proof of Egorov’s theorem 1551
proof of Fatou’s lemma 1552
proof of Fatou-Lebesgue theorem 1552
proof of dominated convergence theorem 1553
proof of monotone convergence theorem 1553
28A25 – Integration with respect to mea-
sures and other set functions 1555
L^∞(X, dµ) 1555
Hardy-Littlewood maximal operator 1555
Lebesgue integral 1556
28A60 – Measures on Boolean rings, mea-
sure algebras 1558
σ-algebra 1558
σ-algebra 1558
algebra 1559
measurable set (for outer measure) 1559
28A75 – Length, area, volume, other geo-
metric measure theory 1561
Lebesgue density theorem 1561
28A80 – Fractals 1562
Cantor set 1562
Hausdorff dimension 1565
Koch curve 1566
Sierpinski gasket 1567
fractal 1567
28Axx – Classical measure theory 1569
Vitali’s Theorem 1569
proof of Vitali’s Theorem 1569
28B15 – Set functions, measures and inte-
grals with values in ordered spaces 1571
L^p-space 1571
locally integrable function 1572
28C05 – Integration theory via linear func-
tionals (Radon measures, Daniell integrals,
etc.), representing set functions and mea-
sures 1573
Haar integral 1573
28C10 – Set functions and measures on
topological groups, Haar measures, invari-
ant measures 1575
Haar measure 1575
28C20 – Set functions and measures and
integrals in infinite-dimensional spaces (Wiener
measure, Gaussian measure, etc.) 1577
essential supremum 1577
28D05 – Measure-preserving transforma-
tions 1578
measure-preserving 1578
30-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1579
domain 1579
region 1579
regular region 1580
topology of the complex plane 1580
30-XX – Functions of a complex variable
1581
z_0 is a pole of f 1581
30A99 – Miscellaneous 1582
Riemann mapping theorem 1582
Runge’s theorem 1582
Weierstrass M-test 1583
annulus 1583
conformally equivalent 1583
contour integral 1584
orientation 1585
proof of Weierstrass M-test 1585
unit disk 1586
upper half plane 1586
winding number and fundamental group 1586
30B10 – Power series (including lacunary
series) 1587
Euler relation 1587
analytic 1588
existence of power series 1588
infinitely-differentiable function that is not ana-
lytic 1590
power series 1591
proof of radius of convergence 1592
radius of convergence 1593
30B50 – Dirichlet series and other series
expansions, exponential series 1594
Dirichlet series 1594
30C15 – Zeros of polynomials, rational func-
tions, and other analytic functions (e.g.
zeros of functions with bounded Dirichlet
integral) 1596
Mason-Stothers theorem 1596
zeroes of analytic functions are isolated 1596
30C20 – Conformal mappings of special
domains 1598
automorphisms of unit disk 1598
unit disk upper half plane conformal equivalence
theorem 1598
30C35 – General theory of conformal map-
pings 1599
proof of conformal mapping theorem 1599
30C80 – Maximum principle; Schwarz’s lemma,
Lindelöf principle, analogues and general-
izations; subordination 1601
Schwarz lemma 1601
maximum principle 1601
proof of Schwarz lemma 1602
30D20 – Entire functions, general theory
1603
Liouville’s theorem 1603
Morera’s theorem 1603
entire 1604
holomorphic 1604
proof of Liouville’s theorem 1604
30D30 – Meromorphic functions, general
theory 1606
Casorati-Weierstrass theorem 1606
Mittag-Leffler’s theorem 1606
Riemann’s removable singularity theorem 1607
essential singularity 1607
meromorphic 1607
pole 1607
proof of Casorati-Weierstrass theorem 1608
proof of Riemann’s removable singularity theo-
rem 1608
residue 1609
simple pole 1610
30E20 – Integration, integrals of Cauchy
type, integral representations of analytic
functions 1611
Cauchy integral formula 1611
Cauchy integral theorem 1612
Cauchy residue theorem 1613
Gauss’ mean value theorem 1614
Möbius circle transformation theorem 1614
Möbius transformation cross-ratio preservation
theorem 1614
Rouché’s theorem 1614
absolute convergence implies convergence for an
infinite product 1615
absolute convergence of infinite product 1615
closed curve theorem 1615
conformal Möbius circle map theorem 1615
conformal mapping 1616
conformal mapping theorem 1616
convergence/divergence for an infinite product
1616
example of conformal mapping 1616
examples of infinite products 1617
link between infinite products and sums 1617
proof of Cauchy integral formula 1618
proof of Cauchy residue theorem 1619
proof of Gauss’ mean value theorem 1620
proof of Goursat’s theorem 1620
proof of Möbius circle transformation theorem
1622
proof of Simultaneous converging or diverging of
product and sum theorem 1623
proof of absolute convergence implies convergence
for an infinite product 1624
proof of closed curve theorem 1624
proof of conformal Möbius circle map theorem
1624
simultaneous converging or diverging of product
and sum theorem 1625
Cauchy-Riemann equations 1625
Cauchy-Riemann equations (polar coordinates)
1626
proof of the Cauchy-Riemann equations 1626
removable singularity 1627
30F40 – Kleinian groups 1629
Klein 4-group 1629
31A05 – Harmonic, subharmonic, super-
harmonic functions 1630
a harmonic function on a graph which is bounded
below and nonconstant 1630
example of harmonic functions on graphs 1630
examples of harmonic functions on R^n 1631
harmonic function 1632
31B05 – Harmonic, subharmonic, super-
harmonic functions 1633
Laplacian 1633
32A05 – Power series, series of functions
1634
exponential function 1634
32C15 – Complex spaces 1637
Riemann sphere 1637
32F99 – Miscellaneous 1638
star-shaped region 1638
32H02 – Holomorphic mappings, (holomor-
phic) embeddings and related questions 1639
Bloch’s theorem 1639
Hartog’s theorem 1639
32H25 – Picard-type theorems and gener-
alizations 1640
Picard’s theorem 1640
little Picard theorem 1640
33-XX – Special functions 1641
beta function 1641
33B10 – Exponential and trigonometric func-
tions 1642
natural logarithm 1642
33B15 – Gamma, beta and polygamma func-
tions 1643
Bohr-Mollerup theorem 1643
gamma function 1643
proof of Bohr-Mollerup theorem 1645
33B30 – Higher logarithm functions 1647
Lambert W function 1647
33B99 – Miscellaneous 1648
natural log base 1648
33D45 – Basic orthogonal polynomials and
functions (Askey-Wilson polynomials, etc.)
1649
orthogonal polynomials 1649
33E05 – Elliptic functions and integrals
1651
Weierstrass sigma function 1651
elliptic function 1652
elliptic integrals and Jacobi elliptic functions 1652
examples of elliptic functions 1654
modular discriminant 1654
34-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1656
Liapunov function 1656
Lorenz equation 1657
Wronskian determinant 1659
dependence on initial conditions of solutions of
ordinary differential equations 1660
differential equation 1661
existence and uniqueness of solution of ordinary
differential equations 1662
maximal interval of existence of ordinary differ-
ential equations 1663
method of undetermined coefficients 1663
natural symmetry of the Lorenz equation 1664
symmetry of a solution of an ordinary differen-
tial equation 1665
symmetry of an ordinary differential equation
1665
34-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 1667
second order linear differential equation with con-
stant coefficients 1667
34A05 – Explicit solutions and reductions
1669
separation of variables 1669
variation of parameters 1670
34A12 – Initial value problems, existence,
uniqueness, continuous dependence and con-
tinuation of solutions 1672
initial value problem 1672
34A30 – Linear equations and systems, gen-
eral 1674
Chebyshev equation 1674
34A99 – Miscellaneous 1676
autonomous system 1676
34B24 – Sturm-Liouville theory 1677
eigenfunction 1677
34C05 – Location of integral curves, sin-
gular points, limit cycles 1678
Hopf bifurcation theorem 1678
Poincaré-Bendixson theorem 1679
omega limit set 1679
34C07 – Theory of limit cycles of polyno-
mial and analytic vector fields (existence,
uniqueness, bounds, Hilbert’s 16th prob-
lem and ramifications) 1680
Hilbert’s 16th problem for quadratic vector fields
1680
34C23 – Bifurcation 1682
equivariant branching lemma 1682
34C25 – Periodic solutions 1683
Bendixson’s negative criterion 1683
Dulac’s criteria 1683
proof of Bendixson’s negative criterion 1684
34C99 – Miscellaneous 1685
Hartman-Grobman theorem 1685
equilibrium point 1685
stable manifold theorem 1686
34D20 – Lyapunov stability 1687
Lyapunov stable 1687
neutrally stable fixed point 1687
stable fixed point 1687
34L05 – General spectral theory 1688
Gelfand spectral radius theorem 1688
34L15 – Estimation of eigenvalues, upper
and lower bounds 1689
Rayleigh quotient 1689
34L40 – Particular operators (Dirac, one-
dimensional Schrödinger, etc.) 1690
Dirac delta function 1690
construction of Dirac delta function 1691
35-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1692
differential operator 1692
35J05 – Laplace equation, reduced wave
equation (Helmholtz), Poisson equation 1694
Poisson’s equation 1694
35L05 – Wave equation 1695
wave equation 1695
35Q53 – KdV-like equations (Korteweg-de
Vries, Burgers, sine-Gordon, sinh-Gordon,
etc.) 1697
Korteweg-de Vries equation 1697
35Q99 – Miscellaneous 1698
heat equation 1698
37-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1699
37A30 – Ergodic theorems, spectral the-
ory, Markov operators 1700
ergodic 1700
fundamental theorem of demography 1700
proof of fundamental theorem of demography 1701
37B05 – Transformations and group ac-
tions with special properties (minimality,
distality, proximality, etc.) 1703
discontinuous action 1703
37B20 – Notions of recurrence 1704
nonwandering set 1704
37B99 – Miscellaneous 1705
ω-limit set 1705
asymptotically stable 1706
expansive 1706
the only compact metric spaces that admit a pos-
itively expansive homeomorphism are discrete spaces
1707
topological conjugation 1708
topologically transitive 1709
uniform expansivity 1709
37C10 – Vector fields, flows, ordinary dif-
ferential equations 1710
flow 1710
globally attracting fixed point 1711
37C20 – Generic properties, structural sta-
bility 1712
Kupka-Smale theorem 1712
Pugh’s general density theorem 1712
structural stability 1713
37C25 – Fixed points, periodic points, fixed-
point index theory 1714
hyperbolic fixed point 1714
37C29 – Homoclinic and heteroclinic or-
bits 1715
heteroclinic 1715
homoclinic 1715
37C75 – Stability theory 1716
attracting fixed point 1716
stable manifold 1716
37C80 – Symmetries, equivariant dynam-
ical systems 1718
Γ-equivariant 1718
37D05 – Hyperbolic orbits and sets 1719
hyperbolic isomorphism 1719
37D20 – Uniformly hyperbolic systems (ex-
panding, Anosov, Axiom A, etc.) 1720
Anosov diffeomorphism 1720
Axiom A 1721
hyperbolic set 1721
37D99 – Miscellaneous 1722
Kupka-Smale 1722
37E05 – Maps of the interval (piecewise
continuous, continuous, smooth) 1723
Sharkovskii’s theorem 1723
37G15 – Bifurcations of limit cycles and
periodic orbits 1724
Feigenbaum constant 1724
Feigenbaum fractal 1725
equivariant Hopf theorem 1726
37G40 – Symmetries, equivariant bifurca-
tion theory 1728
Poénaru (1976) theorem 1728
bifurcation problem with symmetry group 1728
trace formula 1729
37G99 – Miscellaneous 1730
chaotic dynamical system 1730
37H20 – Bifurcation theory 1732
bifurcation 1732
39B05 – General 1733
functional equation 1733
39B62 – Functional inequalities, including
subadditivity, convexity, etc. 1734
Jensen’s inequality 1734
proof of Jensen’s inequality 1735
proof of arithmetic-geometric-harmonic means in-
equality 1735
subadditivity 1736
superadditivity 1736
40-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1738
Cauchy product 1738
Cesàro mean 1739
alternating series 1739
alternating series test 1739
monotonic 1740
monotonically decreasing 1740
monotonically increasing 1741
monotonically nondecreasing 1741
monotonically nonincreasing 1741
sequence 1742
series 1742
40A05 – Convergence and divergence of
series and sequences 1743
Abel’s lemma 1743
Abel’s test for convergence 1744
Baroni’s Theorem 1744
Bolzano-Weierstrass theorem 1744
Cauchy criterion for convergence 1744
Cauchy’s root test 1745
Dirichlet’s convergence test 1745
Proof of Baroni’s Theorem 1746
Proof of Stolz-Cesaro theorem 1747
Stolz-Cesaro theorem 1748
absolute convergence theorem 1748
comparison test 1748
convergent sequence 1749
convergent series 1749
determining series convergence 1749
example of integral test 1750
geometric series 1750
harmonic number 1751
harmonic series 1752
integral test 1753
proof of Abel’s lemma (by induction) 1754
proof of Abel’s test for convergence 1754
proof of Bolzano-Weierstrass Theorem 1754
proof of Cauchy’s root test 1756
proof of Leibniz’s theorem (using Dirichlet’s con-
vergence test) 1756
proof of absolute convergence theorem 1756
proof of alternating series test 1757
proof of comparison test 1757
proof of integral test 1758
proof of ratio test 1759
ratio test 1759
40A10 – Convergence and divergence of
integrals 1760
improper integral 1760
40A25 – Approximation to limiting values
(summation of series, etc.) 1761
Euler’s constant 1761
40A30 – Convergence and divergence of
series and sequences of functions 1763
Abel’s limit theorem 1763
Löwner partial ordering 1763
Löwner’s theorem 1764
matrix monotone 1764
operator monotone 1764
pointwise convergence 1764
uniform convergence 1765
40G05 – Cesàro, Euler, Nörlund and Haus-
dorff methods 1766
Cesàro summability 1766
40G10 – Abel, Borel and power series meth-
ods 1768
Abel summability 1768
proof of Abel’s convergence theorem 1769
proof of Tauber’s convergence theorem 1770
41A05 – Interpolation 1772
Lagrange Interpolation formula 1772
Simpson’s 3/8 rule 1772
trapezoidal rule 1773
41A25 – Rate of convergence, degree of
approximation 1775
superconvergence 1775
41A58 – Series expansions (e.g. Taylor,
Lidstone series, but not Fourier series) 1776
Taylor series 1776
Taylor’s Theorem 1778
41A60 – Asymptotic approximations, asymp-
totic expansions (steepest descent, etc.)
1779
Stirling’s approximation 1779
42-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1781
countable basis 1781
discrete cosine transform 1782
42-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 1784
Laplace transform 1784
42A05 – Trigonometric polynomials, in-
equalities, extremal problems 1785
Chebyshev polynomial 1785
42A16 – Fourier coefficients, Fourier se-
ries of functions with special properties,
special Fourier series 1787
Riemann-Lebesgue lemma 1787
example of Fourier series 1788
42A20 – Convergence and absolute con-
vergence of Fourier and trigonometric se-
ries 1789
Dirichlet conditions 1789
42A38 – Fourier and Fourier-Stieltjes trans-
forms and other transforms of Fourier type
1790
Fourier transform 1790
42A99 – Miscellaneous 1792
Poisson summation formula 1792
42B05 – Fourier series and coefficients 1793
Parseval equality 1793
Wirtinger’s inequality 1793
43A07 – Means on groups, semigroups,
etc.; amenable groups 1795
amenable group 1795
44A35 – Convolution 1796
convolution 1796
46-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1799
balanced set 1799
bounded function 1800
bounded set (in a topological vector space) 1801
cone 1802
locally convex topological vector space 1803
sequential characterization of boundedness 1803
symmetric set 1803
46A30 – Open mapping and closed graph
theorems; completeness (including B-, B_r-
completeness) 1805
closed graph theorem 1805
open mapping theorem 1805
46A99 – Miscellaneous 1806
Heine-Cantor theorem 1806
proof of Heine-Cantor theorem 1806
topological vector space 1807
46B20 – Geometry and structure of normed
linear spaces 1808
lim_{p→∞} ||x||_p = ||x||_∞ 1808
Hahn-Banach theorem 1809
proof of Hahn-Banach theorem 1810
seminorm 1811
vector norm 1813
46B50 – Compactness in Banach (or normed)
spaces 1815
Schauder fixed point theorem 1815
proof of Schauder fixed point theorem 1815
46B99 – Miscellaneous 1817
ℓ^p 1817
Banach space 1818
an inner product defines a norm 1818
continuous linear mapping 1818
equivalent norms 1819
normed vector space 1820
46Bxx – Normed linear spaces and Banach
spaces; Banach lattices 1821
vector p-norm 1821
46C05 – Hilbert and pre-Hilbert spaces:
geometry and topology (including spaces
with semidefinite inner product) 1822
Bessel inequality 1822
Hilbert module 1822
Hilbert space 1823
proof of Bessel inequality 1823
46C15 – Characterizations of Hilbert spaces
1825
classification of separable Hilbert spaces 1825
46E15 – Banach spaces of continuous, dif-
ferentiable or analytic functions 1826
Ascoli-Arzela theorem 1826
Stone-Weierstrass theorem 1826
proof of Ascoli-Arzelà theorem 1827
Holder inequality 1827
Young Inequality 1828
conjugate index 1828
proof of Holder inequality 1828
proof of Young Inequality 1829
vector field 1829
46F05 – Topological linear spaces of test
functions, distributions and ultradistribu-
tions 1830
T_f is a distribution of zeroth order 1830
p.v.(1/x) is a distribution of first order 1831
Cauchy principal part integral 1832
delta distribution 1833
distribution 1833
equivalence of conditions 1835
every locally integrable function is a distribution
1836
localization for distributions 1836
operations on distributions 1837
smooth distribution 1839
space of rapidly decreasing functions 1840
support of distribution 1841
46H05 – General theory of topological al-
gebras 1843
Banach algebra 1843
46L05 – General theory of C*-algebras 1844
C*-algebra 1844
Gelfand-Naimark representation theorem 1844
state 1844
46L85 – Noncommutative topology 1846
Gelfand-Naimark theorem 1846
Serre-Swan theorem 1846
46T12 – Measure (Gaussian, cylindrical,
etc.) and integrals (Feynman, path, Fres-
nel, etc.) on manifolds 1847
path integral 1847
47A05 – General (adjoints, conjugates, prod-
ucts, inverses, domains, ranges, etc.) 1849
Baker-Campbell-Hausdorff formula(e) 1849
adjoint 1850
closed operator 1850
properties of the adjoint operator 1851
47A35 – Ergodic theory 1852
ergodic theorem 1852
47A53 – (Semi-) Fredholm operators; in-
dex theories 1853
Fredholm index 1853
Fredholm operator 1853
47A56 – Functions whose values are lin-
ear operators (operator and matrix val-
ued functions, etc., including analytic and
meromorphic ones 1855
Taylor’s formula for matrix functions 1855
47A60 – Functional calculus 1856
Beltrami identity 1856
Euler-Lagrange differential equation 1857
calculus of variations 1857
47B15 – Hermitian and normal operators
(spectral measures, functional calculus, etc.)
1862
self-adjoint operator 1862
47G30 – Pseudodifferential operators 1863
Dini derivative 1863
47H10 – Fixed-point theorems 1864
Brouwer fixed point in one dimension 1864
Brouwer fixed point theorem 1865
any topological space with the fixed point prop-
erty is connected 1865
fixed point property 1866
proof of Brouwer fixed point theorem 1867
47L07 – Convex sets and cones of opera-
tors 1868
convex hull of S is open if S is open 1868
47L25 – Operator spaces (= matricially
normed spaces) 1869
operator norm 1869
47S99 – Miscellaneous 1870
Drazin inverse 1870
49K10 – Free problems in two or more in-
dependent variables 1871
Kantorovitch’s theorem 1871
49M15 – Methods of Newton-Raphson, Galerkin
and Ritz types 1873
Newton’s method 1873
51-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1877
Apollonius theorem 1877
Apollonius’ circle 1877
Brahmagupta’s formula 1878
Brianchon theorem 1878
Brocard theorem 1878
Carnot circles 1879
Erdös-Anning Theorem 1879
Euler Line 1879
Gergonne point 1879
Gergonne triangle 1880
Heron’s formula 1880
Lemoine circle 1880
Lemoine point 1880
Miquel point 1881
Mollweide’s equations 1881
Morley’s theorem 1881
Newton’s line 1882
Newton-Gauss line 1882
Pascal’s mystic hexagram 1882
Ptolemy’s theorem 1882
Pythagorean theorem 1883
Schooten theorem 1883
Simson’s line 1884
Stewart’s theorem 1884
Thales’ theorem 1884
alternate proof of parallelogram law 1885
alternative proof of the sines law 1885
angle bisector 1887
angle sum identity 1888
annulus 1889
butterfly theorem 1889
centroid 1889
chord 1890
circle 1890
collinear 1893
complete quadrilateral 1893
concurrent 1893
cosines law 1894
cyclic quadrilateral 1894
derivation of cosines law 1894
diameter 1895
double angle identity 1896
equilateral triangle 1896
fundamental theorem on isogonal lines 1897
height 1897
hexagon 1897
hypotenuse 1898
isogonal conjugate 1898
isosceles triangle 1899
legs 1899
medial triangle 1899
median 1900
midpoint 1900
nine-point circle 1900
orthic triangle 1901
orthocenter 1901
parallelogram 1902
parallelogram law 1902
pedal triangle 1902
pentagon 1903
polygon 1903
proof of Apollonius theorem 1904
proof of Apollonius theorem 1904
proof of Brahmagupta’s formula 1905
proof of Erdös-Anning Theorem 1906
proof of Heron’s formula 1906
proof of Mollweide’s equations 1907
proof of Ptolemy’s inequality 1908
proof of Ptolemy’s theorem 1909
proof of Pythagorean theorem 1910
proof of Pythagorean theorem 1910
proof of Simson’s line 1911
proof of Stewart’s theorem 1912
proof of Thales’ theorem 1913
proof of butterfly theorem 1913
proof of double angle identity 1914
proof of parallelogram law 1915
proof of tangents law 1915
quadrilateral 1916
radius 1916
rectangle 1916
regular polygon 1917
regular polyhedron 1917
rhombus 1918
right triangle 1919
sector of a circle 1919
sines law 1919
sines law proof 1920
some proofs for triangle theorems 1920
square 1921
tangents law 1921
triangle 1921
triangle center 1922
51-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 1924
geometry 1924
51-XX – Geometry 1927
non-Euclidean geometry 1927
parallel postulate 1927
51A05 – General theory and projective ge-
ometries 1928
Ceva’s theorem 1928
Menelaus’ theorem 1928
Pappus’s theorem 1929
proof of Ceva’s theorem 1929
proof of Menelaus’ theorem 1930
proof of Pappus’s theorem 1931
proof of Pascal’s mystic hexagram 1932
51A30 – Desarguesian and Pappian geome-
tries 1934
Desargues’ theorem 1934
proof of Desargues’ theorem 1934
51A99 – Miscellaneous 1936
Pick’s theorem 1936
proof of Pick’s theorem 1936
51F99 – Miscellaneous 1939
Weizenbock’s Inequality 1939
51M04 – Elementary problems in Euclidean
geometries 1940
Napoleon’s theorem 1940
corollary of Morley’s theorem 1941
pivot theorem 1941
proof of Morley’s theorem 1941
proof of pivot theorem 1943
51M05 – Euclidean geometries (general)
and generalizations 1944
area of the n-sphere 1944
geometry of the sphere 1945
sphere 1945
spherical coordinates 1947
volume of the n-sphere 1947
51M10 – Hyperbolic and elliptic geome-
tries (general) and generalizations 1949
Lobachevsky’s formula 1949
51M16 – Inequalities and extremum prob-
lems 1950
Brunn-Minkowski inequality 1950
Hadwiger-Finsler inequality 1950
isoperimetric inequality 1951
proof of Hadwiger-Finsler inequality 1951
51M20 – Polyhedra and polytopes; regu-
lar figures, division of spaces 1953
polyhedron 1953
51M99 – Miscellaneous 1954
Euler line proof 1954
SSA 1954
cevian 1955
congruence 1955
incenter 1956
incircle 1956
symmedian 1957
51N05 – Descriptive geometry 1958
curve 1958
piecewise smooth 1960
rectifiable 1960
51N20 – Euclidean analytic geometry 1961
Steiner’s theorem 1961
Van Aubel theorem 1961
conic section 1961
proof of Steiner’s theorem 1963
proof of Van Aubel theorem 1964
proof of Van Aubel’s Theorem 1965
three theorems on parabolas 1966
52A01 – Axiomatic and generalized con-
vexity 1969
convex combination 1969
52A07 – Convex sets in topological vector
spaces 1970
Fréchet space 1970
52A20 – Convex sets in n dimensions (in-
cluding convex hypersurfaces) 1973
Carathéodory’s theorem 1973
52A35 – Helly-type theorems and geomet-
ric transversal theory 1974
Helly’s theorem 1974
52A99 – Miscellaneous 1975
convex set 1975
52C07 – Lattices and convex bodies in n
dimensions 1976
Radon’s lemma 1976
52C35 – Arrangements of points, flats, hy-
perplanes 1978
Sylvester’s theorem 1978
53-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 1979
Lie derivative 1979
closed differential forms on a simple connected
domain 1979
exact (differential form) 1980
manifold 1980
metric tensor 1983
proof of closed differential forms on a simple con-
nected domain 1983
pullback of a k-form 1985
tangent space 1985
53-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 1988
curl 1988
53A04 – Curves in Euclidean space 1990
Frenet frame 1990
Serret-Frenet equations 1991
curvature (space curve) 1992
fundamental theorem of space curves 1993
helix 1993
space curve 1994
53A45 – Vector and tensor analysis 1996
closed (differential form) 1996
53B05 – Linear and affine connections 1997
Levi-Civita connection 1997
connection 1997
vector field along a curve 2001
53B21 – Methods of Riemannian geome-
try 2002
Hodge star operator 2002
Riemannian manifold 2002
53B99 – Miscellaneous 2004
germ of smooth functions 2004
53C17 – Sub-Riemannian geometry 2005
Sub-Riemannian manifold 2005
53D05 – Symplectic manifolds, general 2006
Darboux’s Theorem (symplectic geometry) 2006
Moser’s theorem 2006
almost complex structure 2007
coadjoint orbit 2007
examples of symplectic manifolds 2007
hamiltonian vector field 2008
isotropic submanifold 2008
lagrangian submanifold 2009
symplectic manifold 2009
symplectic matrix 2009
symplectic vector field 2010
symplectic vector space 2010
53D10 – Contact manifolds, general 2011
contact manifold 2011
53D20 – Momentum maps; symplectic re-
duction 2012
momentum map 2012
54-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2013
Krull dimension 2013
Niemytzki plane 2013
Sorgenfrey line 2014
boundary (in topology) 2014
closed set 2014
coarser 2015
compact-open topology 2015
completely normal 2015
continuous proper map 2016
derived set 2016
diameter 2016
every second countable space is separable 2016
first axiom of countability 2017
homotopy groups 2017
indiscrete topology 2018
interior 2018
invariant forms on representations of compact
groups 2018
ladder connected 2019
local base 2020
loop 2020
loop space 2020
metrizable 2020
neighborhood system 2021
paracompact topological space 2021
pointed topological space 2021
proper map 2021
quasi-compact 2022
regularly open 2022
separated 2022
support of function 2023
topological invariant 2024
topological space 2024
topology 2025
triangle inequality 2025
universal covering space 2026
54A05 – Topological spaces and general-
izations (closure spaces, etc.) 2027
characterization of connected compact metric spaces.
2027
closure axioms 2027
neighborhood 2028
open set 2028
54A20 – Convergence in general topology
(sequences, filters, limits, convergence spaces,
etc.) 2030
Banach fixed point theorem 2030
Dini’s theorem 2031
another proof of Dini’s theorem 2031
continuous convergence 2032
contractive maps are uniformly continuous 2033
net 2033
proof of Banach fixed point theorem 2034
proof of Dini’s theorem 2035
theorem about continuous convergence 2035
ultrafilter 2035
ultranet 2036
54A99 – Miscellaneous 2037
basis 2037
box topology 2037
closure 2038
cover 2038
dense 2039
examples of filters 2039
filter 2039
limit point 2040
nowhere dense 2040
perfect set 2041
properties of the closure operator 2041
subbasis 2041
54B05 – Subspaces 2042
irreducible 2042
irreducible component 2042
subspace topology 2042
54B10 – Product spaces 2043
product topology 2043
product topology preserves the Hausdorff prop-
erty 2044
54B15 – Quotient spaces, decompositions
2045
Klein bottle 2045
Möbius strip 2046
cell attachment 2047
quotient space 2047
torus 2048
54B17 – Adjunction spaces and similar con-
structions 2049
adjunction space 2049
54B40 – Presheaves and sheaves 2050
direct image 2050
54B99 – Miscellaneous 2051
cofinite and cocountable topology 2051
cone 2051
join 2052
order topology 2052
suspension 2053
54C05 – Continuous maps 2054
Inverse Function Theorem (topological spaces)
2054
continuity of composition of functions 2054
continuous 2055
discontinuous 2055
homeomorphism 2057
proof of Inverse Function Theorem (topological
spaces) 2057
restriction of a continuous mapping is continu-
ous 2057
54C10 – Special maps on topological spaces
(open, closed, perfect, etc.) 2059
densely defined 2059
open mapping 2059
54C15 – Retraction 2060
retract 2060
54C70 – Entropy 2061
differential entropy 2061
54C99 – Miscellaneous 2062
Borsuk-Ulam theorem 2062
ham sandwich theorem 2062
proof of Borsuk-Ulam theorem 2062
54D05 – Connected and locally connected
spaces (general aspects) 2064
Jordan curve theorem 2064
clopen subset 2064
connected component 2065
connected set 2065
connected set in a topological space 2066
connected space 2066
connectedness is preserved under a continuous
map 2066
cut-point 2067
example of a connected space that is not path-
connected 2067
example of a semilocally simply connected space
which is not locally simply connected 2068
example of a space that is not semilocally simply
connected 2068
locally connected 2069
locally simply connected 2069
path component 2069
path connected 2070
products of connected spaces 2070
proof that a path connected space is connected
2070
quasicomponent 2070
semilocally simply connected 2071
54D10 – Lower separation axioms (T_0–T_3, etc.) 2072
T0 space 2072
T1 space 2072
T2 space 2072
T3 space 2073
a compact set in a Hausdorff space is closed 2073
proof of A compact set in a Hausdorff space is
closed 2074
regular 2074
regular space 2074
separation axioms 2075
topological space is T_1 if and only if every sin-
gleton is closed. 2076
54D15 – Higher separation axioms (com-
pletely regular, normal, perfectly or col-
lectionwise normal, etc.) 2077
Tietze extension theorem 2077
Tychonoff 2077
Urysohn’s lemma 2078
normal 2078
proof of Urysohn’s lemma 2078
54D20 – Noncompact covering properties
(paracompact, Lindelöf, etc.) 2081
Lindelöf 2081
countably compact 2081
locally finite 2081
54D30 – Compactness 2082
Y is compact if and only if every open cover of
Y has a finite subcover 2082
Heine-Borel theorem 2083
Tychonoff’s theorem 2083
a space is compact if and only if the space has
the finite intersection property 2083
closed set in a compact space is compact 2084
closed subsets of a compact set are compact 2084
compact 2085
compactness is preserved under a continuous map
2085
examples of compact spaces 2086
finite intersection property 2088
limit point compact 2088
point and a compact set in a Hausdorff space
have disjoint open neighborhoods. 2088
proof of Heine-Borel theorem 2089
properties of compact spaces 2091
relatively compact 2092
sequentially compact 2092
two disjoint compact sets in a Hausdorff space
have disjoint open neighborhoods. 2092
54D35 – Extensions of spaces (compacti-
fications, supercompactifications, comple-
tions, etc.) 2094
Alexandrov one-point compactification 2094
compactification 2094
54D45 – Local compactness, σ-compactness
2095
σ-compact 2095
examples of locally compact and not locally com-
pact spaces 2095
locally compact 2096
54D65 – Separability 2097
separable 2097
54D70 – Base properties 2098
second countable 2098
54D99 – Miscellaneous 2099
Lindelöf theorem 2099
first countable 2099
proof of Lindelöf theorem 2099
totally disconnected space 2100
54E15 – Uniform structures and general-
izations 2101
topology induced by uniform structure 2101
uniform space 2101
uniform structure of a metric space 2102
uniform structure of a topological group 2102
ε-net 2103
Euclidean distance 2103
Hausdorff metric 2104
Urysohn metrization theorem 2104
ball 2104
bounded 2105
city-block metric 2105
completely metrizable 2105
distance to a set 2106
equibounded 2106
isometry 2106
metric space 2107
non-reversible metric 2107
open ball 2108
some structures on R^n 2108
totally bounded 2110
ultrametric 2110
Lebesgue number lemma 2111
proof of Lebesgue number lemma 2111
complete 2111
completeness principle 2112
uniformly equicontinuous 2112
Baire category theorem 2112
Baire space 2113
equivalent statement of Baire category theorem
2113
generic 2114
meager 2114
proof for one equivalent statement of Baire cat-
egory theorem 2114
proof of Baire category theorem 2115
residual 2115
six consequences of Baire category theorem 2116
Hahn-Mazurkiewicz theorem 2116
Vitali covering 2116
compactly generated 2116
54G05 – Extremally disconnected spaces,
F-spaces, etc. 2117
extremally disconnected 2117
54G20 – Counterexamples 2118
Sierpinski space 2118
long line 2118
55-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2120
Universal Coefficient Theorem 2120
invariance of dimension 2121
55M05 – Duality 2122
Poincaré duality 2122
55M20 – Fixed points and coincidences 2123
Sperner’s lemma 2123
55M25 – Degree, winding number 2125
degree (map of spheres) 2125
winding number 2126
55M99 – Miscellaneous 2127
genus of topological surface 2127
55N10 – Singular theory 2128
Betti number 2128
Mayer-Vietoris sequence 2128
cellular homology 2128
homology (topological space) 2129
homology of RP^3. 2131
long exact sequence (of homology groups) 2132
relative homology groups 2133
55N99 – Miscellaneous 2134
suspension isomorphism 2134
55P05 – Homotopy extension properties,
cofibrations 2135
cofibration 2135
homotopy extension property 2135
55P10 – Homotopy equivalences 2136
Whitehead theorem 2136
weak homotopy equivalence 2136
55P15 – Classification of homotopy type
2137
simply connected 2137
55P20 – Eilenberg-Mac Lane spaces 2138
Eilenberg-Mac Lane space 2138
55P99 – Miscellaneous 2139
fundamental groupoid 2139
55Pxx – Homotopy theory 2141
nulhomotopic map 2141
55Q05 – Homotopy groups, general; sets
of homotopy classes 2142
Van Kampen’s theorem 2142
category of pointed topological spaces 2143
deformation retraction 2143
fundamental group 2144
homotopy of maps 2144
homotopy of paths 2145
long exact sequence (locally trivial bundle) 2145
55Q52 – Homotopy groups of special spaces
2146
contractible 2146
55R05 – Fiber spaces 2147
classification of covering spaces 2147
covering space 2148
deck transformation 2148
lifting of maps 2150
lifting theorem 2151
monodromy 2151
properly discontinuous action 2153
regular covering 2153
55R10 – Fiber bundles 2155
associated bundle construction 2155
bundle map 2156
fiber bundle 2156
locally trivial bundle 2157
principal bundle 2157
pullback bundle 2158
reduction of structure group 2158
section of a fiber bundle 2160
some examples of universal bundles 2161
universal bundle 2161
55R25 – Sphere bundles and vector bun-
dles 2163
Hopf bundle 2163
vector bundle 2163
55U10 – Simplicial sets and complexes 2164
simplicial complex 2164
57-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2167
connected sum 2167
57-XX – Manifolds and cell complexes 2168
CW complex 2168
57M25 – Knots and links in S^3 2170
connected sum 2170
knot theory 2170
unknot 2173
57M99 – Miscellaneous 2174
Dehn surgery 2174
57N16 – Geometric structures on mani-
folds 2175
self-intersections of a curve 2175
57N70 – Cobordism and concordance 2176
h-cobordism 2176
Smale’s h-cobordism theorem 2176
cobordism 2176
57N99 – Miscellaneous 2178
orientation 2178
57R22 – Topology of vector bundles and
fiber bundles 2180
hairy ball theorem 2180
57R35 – Differentiable mappings 2182
Sard’s theorem 2182
differentiable function 2182
57R42 – Immersions 2184
immersion 2184
57R60 – Homotopy spheres, Poincaré con-
jecture 2185
Poincaré conjecture 2185
The Poincaré dodecahedral space 2185
homology sphere 2186
57R99 – Miscellaneous 2187
transversality 2187
57S25 – Groups acting on specific mani-
folds 2189
Isomorphism of the group PSL_2(C) with the
group of Möbius transformations 2189
58A05 – Differentiable manifolds, founda-
tions 2190
partition of unity 2190
58A10 – Differential forms 2191
differential form 2191
58A32 – Natural bundles 2194
conormal bundle 2194
cotangent bundle 2194
normal bundle 2195
tangent bundle 2195
58C35 – Integration on manifolds; mea-
sures on manifolds 2196
general Stokes theorem 2196
proof of general Stokes theorem 2196
58C40 – Spectral theory; eigenvalue prob-
lems 2199
spectral radius 2199
58E05 – Abstract critical point theory (Morse
theory, Ljusternik-Schnirelman (Lyusternik-
Shnirelman) theory, etc.) 2200
Morse complex 2200
Morse function 2200
Morse lemma 2201
centralizer 2201
60-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2202
Bayes’ theorem 2202
Bernoulli random variable 2202
Gamma random variable 2203
beta random variable 2204
chi-squared random variable 2205
continuous density function 2205
expected value 2206
geometric random variable 2207
proof of Bayes’ Theorem 2207
random variable 2208
uniform (continuous) random variable 2208
uniform (discrete) random variable 2209
60A05 – Axioms; other general questions
2210
example of pairwise independent events that are
not totally independent 2210
independent 2210
random event 2211
60A10 – Probabilistic measure theory 2212
Cauchy random variable 2212
almost surely 2212
60A99 – Miscellaneous 2214
Borel-Cantelli lemma 2214
Chebyshev’s inequality 2214
Markov’s inequality 2215
cumulative distribution function 2215
limit superior of sets 2215
proof of Chebyshev’s inequality 2216
proof of Markov’s inequality 2216
60E05 – Distributions: general theory 2217
Cramér-Wold theorem 2217
Helly-Bray theorem 2217
Scheffé’s theorem 2218
Zipf’s law 2218
binomial distribution 2219
convergence in distribution 2220
density function 2221
distribution function 2221
geometric distribution 2222
relative entropy 2223
Paul Lévy continuity theorem 2224
characteristic function 2225
Kolmogorov’s inequality 2226
discrete density function 2226
probability distribution function 2227
60F05 – Central limit and other weak the-
orems 2229
Lindeberg’s central limit theorem 2229
60F15 – Strong theorems 2231
Kolmogorov’s strong law of large numbers 2231
strong law of large numbers 2231
60G05 – Foundations of stochastic processes
2233
stochastic process 2233
60G99 – Miscellaneous 2234
stochastic matrix 2234
60J10 – Markov chains with discrete pa-
rameter 2235
Markov chain 2235
62-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2236
covariance 2236
moment 2237
variance 2237
62E15 – Exact distribution theory 2239
Pareto random variable 2239
exponential random variable 2240
hypergeometric random variable 2240
negative hypergeometric random variable 2241
negative hypergeometric random variable, exam-
ple of 2242
proof of expected value of the hypergeometric
distribution 2243
proof of variance of the hypergeometric distribu-
tion 2243
proof that normal distribution is a distribution
2245
65-00 – General reference works (hand-
books, dictionaries, bibliographies, etc.) 2246
normal equations 2246
principal components analysis 2247
pseudoinverse 2248
65-01 – Instructional exposition (textbooks,
tutorial papers, etc.) 2250
cubic spline interpolation 2250
65B15 – Euler-Maclaurin formula 2252
Euler-Maclaurin summation formula 2252
proof of Euler-Maclaurin summation formula 2252
65C05 – Monte Carlo methods 2254
Monte Carlo methods 2254
65D32 – Quadrature and cubature formu-
las 2256
Simpson’s rule 2256
65F25 – Orthogonalization 2257
Givens rotation 2257
Gram-Schmidt orthogonalization 2258
Householder transformation 2259
orthonormal 2261
65F35 – Matrix norms, conditioning, scal-
ing 2262
Hilbert matrix 2262
Pascal matrix 2262
Toeplitz matrix 2263
matrix condition number 2264
matrix norm 2264
pivoting 2265
65R10 – Integral transforms 2266
integral transform 2266
65T50 – Discrete and fast Fourier trans-
forms 2267
Vandermonde matrix 2267
discrete Fourier transform 2268
68M20 – Performance evaluation; queue-
ing; scheduling 2270
Amdahl’s Law 2270
efficiency 2270
proof of Amdahl’s Law 2271
68P05 – Data structures 2272
heap insertion algorithm 2272
heap removal algorithm 2273
68P10 – Searching and sorting 2275
binary search 2275
bubblesort 2276
heap 2277
heapsort 2278
in-place sorting algorithm 2279
insertion sort 2279
lower bound for sorting 2281
quicksort 2282
sorting problem 2283
68P20 – Information storage and retrieval
2285
Browsing service 2285
Digital Library Index 2285
Digital Library Scenario 2285
Digital Library Space 2286
Digital Library Searching Service 2286
Service, activity, task, or procedure 2286
StructuredStream 2286
collection 2286
digital library stream 2287
digital object 2287
good hash table primes 2287
hashing 2289
metadata format 2293
system state 2293
transition event 2294
68P30 – Coding and information theory
(compaction, compression, models of com-
munication, encoding schemes, etc.) 2295
Huffman coding 2295
Huffman’s algorithm 2297
arithmetic encoding 2299
binary Gray code 2300
entropy encoding 2301
68Q01 – General 2302
currying 2302
higher-order function 2303
68Q05 – Models of computation (Turing
machines, etc.) 2304
Cook reduction 2304
Levin reduction 2305
Turing computable 2305
computable number 2305
deterministic finite automaton 2306
non-deterministic Turing machine 2307
non-deterministic finite automaton 2307
non-deterministic pushdown automaton 2309
oracle 2310
self-reducible 2311
universal Turing machine 2311
68Q10 – Modes of computation (nondeter-
ministic, parallel, interactive, probabilis-
tic, etc.) 2312
deterministic Turing machine 2312
random Turing machine 2313
68Q15 – Complexity classes (hierarchies,
relations among complexity classes, etc.)
2315
NP-complete 2315
complexity class 2315
constructible 2317
counting complexity class 2317
polynomial hierarchy 2317
polynomial hierarchy is a hierarchy 2318
time complexity 2318
68Q25 – Analysis of algorithms and prob-
lem complexity 2320
counting problem 2320
decision problem 2320
promise problem 2321
range problem 2321
search problem 2321
68Q30 – Algorithmic information theory
(Kolmogorov complexity, etc.) 2323
Kolmogorov complexity 2323
Kolmogorov complexity function 2323
Kolmogorov complexity upper bounds 2324
computationally indistinguishable 2324
distribution ensemble 2325
hard core 2325
invariance theorem 2325
natural numbers identified with binary strings
2326
one-way function 2326
pseudorandom 2327
pseudorandom generator 2327
support 2327
68Q45 – Formal languages and automata
2328
automaton 2328
context-free language 2329
68Q70 – Algebraic theory of languages and
automata 2331
Kleene algebra 2331
Kleene star 2331
monad 2332
68R05 – Combinatorics 2333
switching lemma 2333
68R10 – Graph theory 2334
Floyd’s algorithm 2334
digital library structural metadata specification
2334
digital library structure 2335
digital library substructure 2335
68T10 – Pattern recognition, speech recog-
nition 2336
Hough transform 2336
68U10 – Image processing 2340
aliasing 2340
68W01 – General 2341
Horner’s rule 2341
68W30 – Symbolic computation and alge-
braic computation 2343
algebraic computation 2343
68W40 – Analysis of algorithms 2344
speedup 2344
74A05 – Kinematics of deformation 2345
body 2345
deformation 2345
76D05 – Navier-Stokes equations 2346
Navier-Stokes equations 2346
81S40 – Path integrals 2347
Feynman path integral 2347
90C05 – Linear programming 2349
linear programming 2349
simplex algorithm 2350
91A05 – 2-person games 2351
examples of normal form games 2351
normal form game 2352
91A10 – Noncooperative games 2353
dominant strategy 2353
91A18 – Games in extensive form 2354
extensive form game 2354
91A99 – Miscellaneous 2355
Nash equilibrium 2355
Pareto dominant 2355
common knowledge 2356
complete information 2356
example of Nash equilibrium 2357
game 2357
game theory 2358
strategy 2358
utility 2359
92B05 – General biology and biomathe-
matics 2360
Lotka-Volterra system 2360
93A10 – General systems 2362
transfer function 2362
93B99 – Miscellaneous 2363
passivity 2363
93D99 – Miscellaneous 2365
Hurwitz matrix 2365
94A12 – Signal theory (characterization,
reconstruction, etc.) 2366
rms error 2366
94A17 – Measures of information, entropy
2367
conditional entropy 2367
gaussian maximizes entropy for given covariance
2368
mutual information 2368
proof of gaussian maximizes entropy for given
covariance 2369
94A20 – Sampling theory 2371
sampling theorem 2371
94A60 – Cryptography 2372
Diffie-Hellman key exchange 2372
elliptic curve discrete logarithm problem 2373
94A99 – Miscellaneous 2374
Heaps’ law 2374
History 2375
GNU Free Documentation License
Version 1.1, March 2000
Copyright © 2000 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but
changing it is not allowed.
Preamble
The purpose of this License is to make a manual, textbook, or other written document “free”
in the sense of freedom: to assure everyone the effective freedom to copy and redistribute
it, with or without modifying it, either commercially or noncommercially. Secondarily, this
License preserves for the author and publisher a way to get credit for their work, while not
being considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must
themselves be free in the same sense. It complements the GNU General Public License, which
is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free
software needs free documentation: a free program should come with manuals providing the
same freedoms that the software does. But this License is not limited to software manuals; it
can be used for any textual work, regardless of subject matter or whether it is published as a
printed book. We recommend this License principally for works whose purpose is instruction
or reference.
Applicability and Definitions
This License applies to any manual or other work that contains a notice placed by the copy-
right holder saying it can be distributed under the terms of this License. The “Document”,
below, refers to any such manual or work. Any member of the public is a licensee, and is
addressed as “you”.
A “Modified Version” of the Document means any work containing the Document or a
portion of it, either copied verbatim, or with modifications and/or translated into another
language.
A “Secondary Section” is a named appendix or a front-matter section of the Document
that deals exclusively with the relationship of the publishers or authors of the Document to
the Document’s overall subject (or to related matters) and contains nothing that could fall
directly within that overall subject. (For example, if the Document is in part a textbook
of mathematics, a Secondary Section may not explain any mathematics.) The relationship
could be a matter of historical connection with the subject or with related matters, or of
legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being
those of Invariant Sections, in the notice that says that the Document is released under this
License.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or
Back-Cover Texts, in the notice that says that the Document is released under this License.
A “Transparent” copy of the Document means a machine-readable copy, represented in a
format whose specification is available to the general public, whose contents can be viewed
and edited directly and straightforwardly with generic text editors or (for images composed
of pixels) generic paint programs or (for drawings) some widely available drawing editor,
and that is suitable for input to text formatters or for automatic translation to a variety of
formats suitable for input to text formatters. A copy made in an otherwise Transparent file
format whose markup has been designed to thwart or discourage subsequent modification
by readers is not Transparent. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup,
Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD,
and standard-conforming simple HTML designed for human modification. Opaque formats
include PostScript, PDF, proprietary formats that can be read and edited only by proprietary
word processors, SGML or XML for which the DTD and/or processing tools are not generally
available, and the machine-generated HTML produced by some word processors for output
purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages
as are needed to hold, legibly, the material this License requires to appear in the title page.
For works in formats which do not have any title page as such, “Title Page” means the text
near the most prominent appearance of the work’s title, preceding the beginning of the body
of the text.
Verbatim Copying
You may copy and distribute the Document in any medium, either commercially or non-
commercially, provided that this License, the copyright notices, and the license notice saying
this License applies to the Document are reproduced in all copies, and that you add no
other conditions whatsoever to those of this License. You may not use technical measures
to obstruct or control the reading or further copying of the copies you make or distribute.
However, you may accept compensation in exchange for copies. If you distribute a large
enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly
display copies.
Copying in Quantity
If you publish printed copies of the Document numbering more than 100, and the Document’s
license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly
and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover
Texts on the back cover. Both covers must also clearly and legibly identify you as the
publisher of these copies. The front cover must present the full title with all words of the
title equally prominent and visible. You may add other material on the covers in addition.
Copying with changes limited to the covers, as long as they preserve the title of the Document
and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the
first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto
adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you
must either include a machine-readable Transparent copy along with each Opaque copy, or
state in or with each Opaque copy a publicly-accessible computer-network location containing
a complete Transparent copy of the Document, free of added material, which the general
network-using public has access to download anonymously at no charge using public-standard
network protocols. If you use the latter option, you must take reasonably prudent steps, when
you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy
will remain thus accessible at the stated location until at least one year after the last time
you distribute an Opaque copy (directly or through your agents or retailers) of that edition
to the public.
It is requested, but not required, that you contact the authors of the Document well before
redistributing any large number of copies, to give them a chance to provide you with an
updated version of the Document.
Modifications
You may copy and distribute a Modified Version of the Document under the conditions of
sections 2 and 3 above, provided that you release the Modified Version under precisely this
License, with the Modified Version filling the role of the Document, thus licensing distribution
and modification of the Modified Version to whoever possesses a copy of it. In addition, you
must do these things in the Modified Version:
• Use in the Title Page (and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if there were any, be
listed in the History section of the Document). You may use the same title as a previous
version if the original publisher of that version gives permission.
• List on the Title Page, as authors, one or more persons or entities responsible for
authorship of the modifications in the Modified Version, together with at least five of
the principal authors of the Document (all of its principal authors, if it has less than
five).
• State on the Title page the name of the publisher of the Modified Version, as the
publisher.
• Preserve all the copyright notices of the Document.
• Add an appropriate copyright notice for your modifications adjacent to the other copy-
right notices.
• Include, immediately after the copyright notices, a license notice giving the public
permission to use the Modified Version under the terms of this License, in the form
shown in the Addendum below.
• Preserve in that license notice the full lists of Invariant Sections and required Cover
Texts given in the Document’s license notice.
• Include an unaltered copy of this License.
• Preserve the section entitled “History”, and its title, and add to it an item stating
at least the title, year, new authors, and publisher of the Modified Version as given
on the Title Page. If there is no section entitled “History” in the Document, create
one stating the title, year, authors, and publisher of the Document as given on its
Title Page, then add an item describing the Modified Version as stated in the previous
sentence.
• Preserve the network location, if any, given in the Document for public access to
a Transparent copy of the Document, and likewise the network locations given in the
Document for previous versions it was based on. These may be placed in the “History”
section. You may omit a network location for a work that was published at least four
years before the Document itself, or if the original publisher of the version it refers to
gives permission.
• In any section entitled “Acknowledgements” or “Dedications”, preserve the section’s
title, and preserve in the section all the substance and tone of each of the contributor
acknowledgements and/or dedications given therein.
• Preserve all the Invariant Sections of the Document, unaltered in their text and in
their titles. Section numbers or the equivalent are not considered part of the section
titles.
• Delete any section entitled “Endorsements”. Such a section may not be included in
the Modified Version.
• Do not retitle any existing section as “Endorsements” or to conflict in title with any
Invariant Section.
If the Modified Version includes new front-matter sections or appendices that qualify as
Secondary Sections and contain no material copied from the Document, you may at your
option designate some or all of these sections as invariant. To do this, add their titles to
the list of Invariant Sections in the Modified Version’s license notice. These titles must be
distinct from any other section titles.
You may add a section entitled “Endorsements”, provided it contains nothing but endorse-
ments of your Modified Version by various parties – for example, statements of peer review
or that the text has been approved by an organization as the authoritative definition of a
standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25
words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version.
Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or
through arrangements made by) any one entity. If the Document already includes a cover
text for the same cover, previously added by you or by arrangement made by the same entity
you are acting on behalf of, you may not add another; but you may replace the old one, on
explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to
use their names for publicity for or to assert or imply endorsement of any Modified Version.
Combining Documents
You may combine the Document with other documents released under this License, under
the terms defined in section 4 above for modified versions, provided that you include in the
combination all of the Invariant Sections of all of the original documents, unmodified, and
list them all as Invariant Sections of your combined work in its license notice.
The combined work need only contain one copy of this License, and multiple identical In-
variant Sections may be replaced with a single copy. If there are multiple Invariant Sections
with the same name but different contents, make the title of each such section unique by
adding at the end of it, in parentheses, the name of the original author or publisher of that
section if known, or else a unique number. Make the same adjustment to the section titles
in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections entitled “History” in the various original
documents, forming one section entitled “History”; likewise combine any sections entitled
“Acknowledgements”, and any sections entitled “Dedications”. You must delete all sections
entitled “Endorsements.”
Collections of Documents
You may make a collection consisting of the Document and other documents released under
this License, and replace the individual copies of this License in the various documents with
a single copy that is included in the collection, provided that you follow the rules of this
License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually
under this License, provided you insert a copy of this License into the extracted document,
and follow this License in all other respects regarding verbatim copying of that document.
Aggregation With Independent Works
A compilation of the Document or its derivatives with other separate and independent doc-
uments or works, in or on a volume of a storage or distribution medium, does not as a whole
count as a Modified Version of the Document, provided no compilation copyright is claimed
for the compilation. Such a compilation is called an “aggregate”, and this License does not
apply to the other self-contained works thus compiled with the Document, on account of
their being thus compiled, if they are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then
if the Document is less than one quarter of the entire aggregate, the Document’s Cover Texts
may be placed on covers that surround only the Document within the aggregate. Otherwise
they must appear on covers around the whole aggregate.
Translation
Translation is considered a kind of modification, so you may distribute translations of the
Document under the terms of section 4. Replacing Invariant Sections with translations
requires special permission from their copyright holders, but you may include translations of
some or all Invariant Sections in addition to the original versions of these Invariant Sections.
You may include a translation of this License provided that you also include the original
English version of this License. In case of a disagreement between the translation and the
original English version of this License, the original English version will prevail.
Termination
You may not copy, modify, sublicense, or distribute the Document except as expressly pro-
vided for under this License. Any other attempt to copy, modify, sublicense or distribute the
Document is void, and will automatically terminate your rights under this License. However,
parties who have received copies, or rights, from you under this License will not have their
licenses terminated so long as such parties remain in full compliance.
Future Revisions of This License
The Free Software Foundation may publish new, revised versions of the GNU Free Doc-
umentation License from time to time. Such new versions will be similar in spirit to
the present version, but may differ in detail to address new problems or concerns. See
http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document
specifies that a particular numbered version of this License ”or any later version” applies
to it, you have the option of following the terms and conditions either of that specified
version or of any later version that has been published (not as a draft) by the Free Software
Foundation. If the Document does not specify a version number of this License, you may
choose any version ever published (not as a draft) by the Free Software Foundation.
ADDENDUM: How to use this License for your docu-
ments
To use this License in a document you have written, include a copy of the License in the
document and put the following copyright and license notices just after the title page:
Copyright © YEAR YOUR NAME. Permission is granted to copy, distribute
and/or modify this document under the terms of the GNU Free Documenta-
tion License, Version 1.1 or any later version published by the Free Software
Foundation; with the Invariant Sections being LIST THEIR TITLES, with the
Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. A
copy of the license is included in the section entitled “GNU Free Documentation
License”.
If you have no Invariant Sections, write “with no Invariant Sections” instead of saying which
ones are invariant. If you have no Front-Cover Texts, write “no Front-Cover Texts” instead
of “Front-Cover Texts being LIST”; likewise for Back-Cover Texts.
If your document contains nontrivial examples of program code, we recommend releasing
these examples in parallel under your choice of free software license, such as the GNU General
Public License, to permit their use in free software.
Chapter 1
UNCLA – Unclassified
1.1 Golomb ruler
A Golomb ruler of length n is a ruler with only a subset of the integer markings {0, a_2, ..., n} ⊂ {0, 1, 2, ..., n} that appear on a regular ruler. The defining criterion of this subset is that there exists an m such that any positive integer k < m can be expressed uniquely as a difference k = a_i − a_j for some i, j. This is referred to as an m-Golomb ruler.

A 4-Golomb ruler of length 7 is given by {0, 1, 3, 7}. To verify this, we need to show that every number 1, 2, 3, 4 can be expressed uniquely as a difference of two numbers in the above set:

1 = 1 − 0
2 = 3 − 1
3 = 3 − 0
4 = 7 − 3

An optimal Golomb ruler is one where, for a fixed value of n, the value of a_n is minimized.
Version: 2 Owner: mathcam Author(s): mathcam, imran
1.2 Hesse configuration
A Hesse configuration is a set P of nine non-collinear points in the projective plane over a field K such that any line through two points of P contains exactly three points of P. Then there are 12 such lines through P. A Hesse configuration exists if and only if the field K contains a primitive third root of unity. For such K the projective automorphism group PGL(3, K) acts transitively on all possible Hesse configurations.

The configuration P with its intersection structure of 12 lines is isomorphic to the affine space A = F², where F is a field with three elements.

The group Γ ⊂ PGL(3, K) of all symmetries that map P onto itself has order 216, and it is isomorphic to the group of affine transformations of A that have determinant 1. The stabilizer in Γ of any of the 12 lines through P is a cyclic subgroup of order three, and Γ is generated by these subgroups.

The symmetry group Γ is isomorphic to G(K)/Z(K), where G(K) ⊂ GL(3, K) is a group of order 648 generated by reflections of order three and Z(K) is its cyclic center of order three. The reflection group G(C) is called the Hesse group, which appears as G_25 in the classification of finite complex reflection groups by Shephard and Todd.

If K is algebraically closed and the characteristic of K is not 2 or 3, then the nine inflection points of an elliptic curve E over K form a Hesse configuration.
Version: 3 Owner: debosberg Author(s): debosberg
1.3 Jordan’s Inequality
Jordan’s Inequality states
2
π
r < sin(r) < r, ∀ r ∈ [0.
π
2
]
Version: 3 Owner: unlord Author(s): unlord
1.4 Lagrange’s theorem
Lagrange’s theorem
1: G a group
2: H < G a subgroup
3: [G : H] the index of H in G
4: |G| = |H| · [G : H]
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): akrowne, apmxi
1.5 Laurent series
A Laurent series centered about a is a series of the form

∑_{k=−∞}^{∞} c_k (z − a)^k

where c_k, a, z ∈ C.

One can prove that the above series converges everywhere inside the set

D := { z ∈ C | R_1 < |z − a| < R_2 }

where

R_1 := limsup_{k→∞} |c_{−k}|^{1/k}

and

R_2 := 1 / ( limsup_{k→∞} |c_k|^{1/k} ).

(This set may be empty.)

Every Laurent series has an associated function, given by

f(z) := ∑_{k=−∞}^{∞} c_k (z − a)^k,

whose domain is the set of points in C on which the series converges. This function is analytic inside the annulus D, and conversely, every analytic function on an annulus is equal to some (unique) Laurent series.
Version: 3 Owner: djao Author(s): djao
1.6 Lebesgue measure
Let S ⊆ R, and let S′ be the complement of S with respect to R. We define S to be measurable if, for any A ⊆ R,

m*(A) = m*(A ∩ S) + m*(A ∩ S′)

where m*(S) is the Lebesgue outer measure of S. If S is measurable, then we define the Lebesgue measure of S to be m(S) = m*(S).

Lebesgue measure on R^n is the n-fold product measure of Lebesgue measure on R.
Version: 2 Owner: vampyr Author(s): vampyr
1.7 Leray spectral sequence
The Leray spectral sequence is a special case of the Grothendieck spectral sequence regarding
composition of functors.
If f : X → Y is a continuous map of topological spaces, and if F is a sheaf of abelian groups on X, then there is a spectral sequence

E_2^{pq} = H^p(Y, R^q f_* F) ⇒ H^{p+q}(X, F)

where f_* is the direct image functor.
Version: 1 Owner: bwebste Author(s): nerdy2
1.8 Möbius transformation

A Möbius transformation is a bijection on the extended complex plane C ∪ {∞} given by

f(z) = (az + b)/(cz + d)   if z ≠ −d/c, ∞
f(z) = a/c                 if z = ∞
f(z) = ∞                   if z = −d/c

where a, b, c, d ∈ C and ad − bc ≠ 0.

It can be shown that the inverse and the composition of two Möbius transformations are of the same form, and so the Möbius transformations form a group under composition.

The geometric interpretation of the Möbius group is that it is the group of automorphisms of the Riemann sphere.

Any Möbius map can be composed from the elementary transformations – dilations, translations and inversions. If we define a line to be a circle passing through ∞, then it can be shown that a Möbius transformation maps circles to circles, by looking at each elementary transformation.
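Since composition of Möbius transformations corresponds to multiplication of their 2×2 coefficient matrices, a short Python sketch can illustrate the group operation (the coefficients below are arbitrary illustrative choices, not values from the entry):

# Sketch: compose two Möbius transformations via their 2x2 coefficient matrices.
def mobius(a, b, c, d):
    """Return the map z -> (a*z + b)/(c*z + d); requires ad - bc != 0."""
    assert a * d - b * c != 0
    return lambda z: (a * z + b) / (c * z + d)

def compose_coeffs(m1, m2):
    """Coefficients of m1 o m2: the product of the two coefficient matrices."""
    (a1, b1, c1, d1), (a2, b2, c2, d2) = m1, m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

f_coeffs = (1, 2, 0, 1)     # z -> z + 2  (translation)
g_coeffs = (0, 1, 1, 0)     # z -> 1/z    (inversion)
h = mobius(*compose_coeffs(f_coeffs, g_coeffs))   # f o g : z -> 1/z + 2
z = 4 + 0j
print(h(z), mobius(*f_coeffs)(mobius(*g_coeffs)(z)))   # both print (2.25+0j)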
Version: 9 Owner: vitriol Author(s): vitriol
1.9 Mordell-Weil theorem
If E is an elliptic curve defined over a number field K, then the group of points with coordinates in K is a finitely generated abelian group.
Version: 1 Owner: nerdy2 Author(s): nerdy2
1.10 Plateau’s Problem
Plateau's Problem is the problem of finding the surface with minimal area among all surfaces which have the same prescribed boundary.

This problem is named after the Belgian physicist Joseph Plateau (1801-1883), who experimented with soap films. As a matter of fact, if you take a wire (which represents a closed curve in three-dimensional space) and dip it in a solution of soapy water, you obtain a soapy surface which has the wire as boundary. It turns out that this surface has the minimal area among all surfaces with the same boundary, so the soap film is a solution to Plateau's Problem.

Jesse Douglas (1897-1965) solved the problem by proving the existence of such minimal surfaces. The solution to the problem is achieved by finding a harmonic and conformal parameterization of the surface.

The extension of the problem to higher dimensions (i.e. for k-dimensional surfaces in n-dimensional space) turns out to be much more difficult to study. Moreover, while the solutions to the original problem are always regular, it turns out that the solutions to the extended problem may have singularities if n ≥ 8. To solve the extended problem, the theory of currents (Federer and Fleming) has been developed.
Version: 4 Owner: paolini Author(s): paolini
1.11 Poisson random variable
X is a Poisson random variable with parameter λ if

f_X(x) = e^{−λ} λ^x / x!,   x ∈ {0, 1, 2, ...}

Parameters:
- λ > 0

syntax:

X ∼ Poisson(λ)

Notes:
1. X is often used to describe the occurrence of rare events. It's a very commonly used distribution in all fields of statistics.
2. E[X] = λ
3. Var[X] = λ
4. M_X(t) = e^{λ(e^t − 1)}
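To illustrate notes 2 and 3, a small Python sketch can sum the probability mass function numerically and recover the mean and variance (λ = 2.5 below is just an arbitrary test value):

# Sketch: numerically check E[X] = Var[X] = lambda for the Poisson pmf.
from math import exp, factorial

lam = 2.5
pmf = lambda x: exp(-lam) * lam**x / factorial(x)

xs = range(200)                          # 200 terms is plenty for lam = 2.5
mean = sum(x * pmf(x) for x in xs)
second_moment = sum(x * x * pmf(x) for x in xs)
print(round(mean, 10), round(second_moment - mean**2, 10))   # 2.5 2.5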
Version: 2 Owner: Riemann Author(s): Riemann
1.12 Shannon’s theorem
Definition (Discrete) Let (Ω, F, μ) be a discrete probability space, and let X be a discrete random variable on Ω.

The entropy H[X] is defined as the functional

H[X] = −∑_{x∈Ω} μ(X = x) log μ(X = x).   (1.12.1)
Definition (Continuous) Entropy in the continuous case is called differential entropy.
Discussion—Discrete Entropy Entropy was first introduced by Shannon in 1948 in his
landmark paper “A Mathematical Theory of Communication.” A modified and expanded version of his argument is presented here.
Suppose we have a set of possible events whose probabilities of occurrence are p_1, p_2, ..., p_n. These probabilities are known but that is all we know concerning which event will occur. Can we find a measure of how much “choice” is involved in the selection of the event or of how uncertain we are of the outcome? If there is such a measure, say H(p_1, p_2, ..., p_n), it is reasonable to require of it the following properties:
1. H should be continuous in the p_i.

2. If all the p_i are equal, p_i = 1/n, then H should be a monotonic increasing function of n. With equally likely events there is more choice, or uncertainty, when there are more possible events.

3. If a choice be broken down into two successive choices, the original H should be the weighted sum of the individual values of H.
As an example of this last property, consider losing your luggage down a chute which feeds three carousels, A, B and C. Assume that the baggage handling system is constructed such that the probability of your luggage ending up on carousel A is 1/2, on B is 1/3, and on C is 1/6. These probabilities specify the p_i. There are two ways to think about your uncertainty about where your luggage will end up.

First, you could consider your uncertainty to be H(P_A, P_B, P_C) = H(1/2, 1/3, 1/6). On the other hand, you reason, no matter how byzantine the baggage handling system is, half the time your luggage will end up on carousel A and half the time it will end up on carousels B or C (with uncertainty H(P_A, P_{B∪C}) = H(1/2, 1/2)). If it doesn't go into A (and half the time it won't), then two-thirds of the time it shows up on B and one-third of the time it winds up on carousel C (and your uncertainty about this second event, in isolation, is H(P_B, P_C) = H(2/3, 1/3)). But remember this second event only happens half the time (P_{B∪C} of the time), so you must weight this second uncertainty appropriately—that is, by 1/2. The uncertainties computed using each of these chains of reasoning must be equal. That is,
H(P_A, P_B, P_C) = H(P_A, P_{B∪C}) + P_{B∪C} H(P_B, P_C)

H(1/2, 1/3, 1/6) = H(1/2, 1/2) + (1/2) H(2/3, 1/3)
If you’re not as lost as your luggage, then you may be interested in the following. . .
Theorem The only H satisfying the three above assumptions is of the form:

H = −k ∑_{i=1}^{n} p_i log p_i

where k is a constant, essentially a choice of unit of measure. The measure of uncertainty, H, is called entropy, not to be confused (though it often is) with Boltzmann's thermodynamic entropy. The logarithm may be taken to the base 2, in which case H is measured in “bits,” or to the base e, in which case H is measured in “nats.”
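As a quick numerical illustration of the decomposition property used in the luggage example, a short Python sketch (using base-2 logarithms, so H is in bits):

# Sketch: check H(1/2, 1/3, 1/6) = H(1/2, 1/2) + (1/2) * H(2/3, 1/3).
from math import log2

def H(*p):
    """Shannon entropy in bits of a finite probability vector."""
    return -sum(q * log2(q) for q in p if q > 0)

lhs = H(1/2, 1/3, 1/6)
rhs = H(1/2, 1/2) + 0.5 * H(2/3, 1/3)
print(lhs, rhs, abs(lhs - rhs) < 1e-12)    # ~1.459 bits on both sides, True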
Discussion—Continuous Entropy Despite its seductively analogous form, continuous entropy cannot be obtained as a limiting case of discrete entropy.

We wish to obtain a generally finite measure as the “bin size” goes to zero. In the discrete case, the bin size is the (implicit) width of each of the n (finite or infinite) bins/buckets/states whose probabilities are the p_n. As we generalize to the continuous domain, we must make this width explicit.
To do this, start with a continuous function f discretized into bins of width ∆, as shown in the figure.

Figure 1.1: Discretizing the function f into bins of width ∆

As the figure indicates, by the mean-value theorem there exists a value x_i in each bin such that

f(x_i) ∆ = ∫_{i∆}^{(i+1)∆} f(x) dx   (1.12.2)

and thus the integral of the function f can be approximated (in the Riemannian sense) by

∫_{−∞}^{∞} f(x) dx = lim_{∆→0} ∑_{i=−∞}^{∞} f(x_i) ∆   (1.12.3)

where this limit and “bin size goes to zero” are equivalent.

We will denote

H^∆ := −∑_{i=−∞}^{∞} ∆ f(x_i) log(∆ f(x_i))   (1.12.4)

and expanding the log we have

H^∆ = −∑_{i=−∞}^{∞} ∆ f(x_i) log(∆ f(x_i))   (1.12.5)
    = −∑_{i=−∞}^{∞} ∆ f(x_i) log f(x_i) − ∑_{i=−∞}^{∞} f(x_i) ∆ log ∆.   (1.12.6)

As ∆ → 0, we have

∑_{i=−∞}^{∞} f(x_i) ∆ → ∫ f(x) dx = 1   and   (1.12.7)

∑_{i=−∞}^{∞} ∆ f(x_i) log f(x_i) → ∫ f(x) log f(x) dx.   (1.12.8)

This leads us to our definition of the differential entropy (continuous entropy):

h[f] = lim_{∆→0} ( H^∆ + log ∆ ) = −∫_{−∞}^{∞} f(x) log f(x) dx.   (1.12.9)
Version: 13 Owner: gaurminirick Author(s): drummond
1.13 Shapiro inequality
Let n ≥ 3 and let x_1, x_2, ..., x_n ∈ R_+ be nonnegative reals with x_i + x_{i+1} > 0, indices taken cyclically (so x_{n+1} = x_1 and x_{n+2} = x_2).

The inequality

x_1/(x_2 + x_3) + x_2/(x_3 + x_4) + ⋯ + x_n/(x_1 + x_2) ≥ n/2

is true for any even integer n ≤ 12 and any odd integer n ≤ 23.
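A brute-force numerical sketch in Python can probe the cyclic sum for random positive inputs (here only n = 10, one of the cases covered by the statement):

# Sketch: sample random positive vectors and check the Shapiro sum >= n/2 for n = 10.
import random

def shapiro_sum(x):
    n = len(x)
    return sum(x[i] / (x[(i + 1) % n] + x[(i + 2) % n]) for i in range(n))

n = 10
worst = min(shapiro_sum([random.uniform(0.01, 1.0) for _ in range(n)])
            for _ in range(10000))
print(worst, worst >= n / 2)   # the worst observed sum stays at or above 5.0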
Version: 1 Owner: alek thiery Author(s): alek thiery
1.14 Sylow j-subgroups
Let G be a finite group and p be a prime that divides |G|. We can then write |G| = p^k · m for some positive integer k, where p does not divide m.

Any subgroup of G whose order is p^k is called a Sylow p-subgroup, or simply a p-Sylow subgroup.

The first Sylow theorem states that any group of order p^k · m has a Sylow p-subgroup.
Version: 3 Owner: drini Author(s): drini, apmxi
1.15 Tschirnhaus transformations
A polynomial transformation which transforms a polynomial to another with certain zero coefficients is called a Tschirnhaus transformation. It is thus an invertible transformation of the form x ↦ g(x)/h(x), where g, h are polynomials over the base field K (or some subfield of the splitting field of the polynomial being transformed). If gcd(h(x), f(x)) = 1, then the Tschirnhaus transformation becomes a polynomial transformation mod f.

Specifically, it concerns a substitution that reduces finding the roots of the polynomial

p = t^n + a_1 t^{n−1} + ... + a_n = ∏_{i=1}^{n} (t − r_i) ∈ k[t]

to finding the roots of another polynomial q – with fewer parameters – and solving an auxiliary polynomial equation s, with deg(s) < deg(p · q).

Historically, the transformation was applied to reduce the general quintic equation to simpler resolvents. Examples due to Hermite and Klein are, respectively, the principal resolvent

K(X) := X^5 + a_0 X^2 + a_1 X + a_3

and the Bring-Jerrard form

K(X) := X^5 + a_1 X + a_2.

Tschirnhaus transformations are also used when computing Galois groups, to remove repeated roots in resolvent polynomials. Almost any transformation will work, but it is extremely hard to find an efficient algorithm that can be proved to work.
Version: 5 Owner: bwebste Author(s): bwebste, ottem
1.16 Wallis formulae
∫_0^{π/2} sin^{2n} x dx = (1 · 3 · ⋯ · (2n − 1)) / (2 · 4 · ⋯ · 2n) · π/2

∫_0^{π/2} sin^{2n+1} x dx = (2 · 4 · ⋯ · 2n) / (3 · 5 · ⋯ · (2n + 1))

π/2 = ∏_{n=1}^{∞} 4n² / (4n² − 1) = (2/1) · (2/3) · (4/3) · (4/5) · ⋯
Version: 2 Owner: vypertd Author(s): vypertd
1.17 ascending chain condition
A collection S of subsets of a set X (that is, a subset of the power set of X) satisfies the ascending chain condition or ACC if there does not exist an infinite ascending chain S_1 ⊂ S_2 ⊂ ⋯ of subsets from S.
See also the descending chain condition (DCC).
Version: 2 Owner: antizeus Author(s): antizeus
1.18 bounded
Let X be a subset of R. We say that X is bounded when there exists a real number M such that |x| < M for all x ∈ X. When X is an interval, we speak of a bounded interval.

This can be generalized first to R^n. We say that X ⊂ R^n is bounded if there is a real number M such that ‖x‖ < M for all x ∈ X, where ‖·‖ is the Euclidean norm. When we consider balls, we speak of bounded balls.

This condition is equivalent to the statement: there is a real number D such that ‖x − y‖ < D for all x, y ∈ X.

A further generalization to any metric space V says that X ⊂ V is bounded when there is a real number M such that d(x, y) < M for all x, y ∈ X, where d represents the metric (distance function) on V.
Version: 2 Owner: drini Author(s): drini, apmxi
1.19 bounded operator
Definition [1]

1. Suppose X and Y are normed vector spaces with norms ‖·‖_X and ‖·‖_Y. Further, suppose T is a linear map T : X → Y. If there is a C ≥ 0 such that

‖Tx‖_Y ≤ C ‖x‖_X

for all x ∈ X, then T is a bounded operator.

2. Let X and Y be as above, and let T : X → Y be a bounded operator. Then the norm of T is defined as the real number

‖T‖ = sup{ ‖Tx‖_Y / ‖x‖_X | x ∈ X ∖ {0} }.

In the special case when X is the zero vector space, any linear map T : X → Y is the zero map since T(0) = 0 · T(0) = 0. In this case, we define ‖T‖ = 0.

TODO:
1. The defined norm for mappings is a norm
2. Examples: identity operator, zero operator: see [1].
3. Give alternative expressions for the norm of T (supremum taken over the unit ball)
4. Discuss boundedness and continuity

Theorem [1, 2] Suppose T : X → Y is a linear map between normed vector spaces X and Y. If X is finite dimensional, then T is bounded.
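For a concrete finite-dimensional case, the operator norm induced by the Euclidean norms equals the largest singular value; a small numpy sketch (the matrix below is an arbitrary example) compares that value with a supremum estimated over random vectors:

# Sketch: estimate ||T|| = sup ||Tx|| / ||x|| for a matrix T and compare with the
# largest singular value, which equals the induced (Euclidean) operator norm.
import numpy as np

rng = np.random.default_rng(0)
T = np.array([[2.0, 1.0], [0.0, 3.0]])           # arbitrary example matrix

samples = rng.normal(size=(100000, 2))
ratios = np.linalg.norm(samples @ T.T, axis=1) / np.linalg.norm(samples, axis=1)
print(ratios.max())                               # approaches the value below from beneath
print(np.linalg.norm(T, 2))                       # exact induced 2-norm (largest singular value)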
REFERENCES
1. E. Kreyszig, Introductory Functional Analysis With Applications, John Wiley & Sons,
1978.
2. G. Bachman, L. Narici, Functional analysis, Academic Press, 1966.
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 2 Owner: bwebste Author(s): matte, apmxi
1.20 complex projective line
complex projective line
1: (z_1, z_2) complex numbers
2: (z_1, z_2) ≠ (0, 0)
3: ∀λ ∈ C ∖ {0} : (λz_1, λz_2) ∼ (z_1, z_2)
4: { (λz_1, λz_2) | λ ∈ C ∖ {0} }, the equivalence class of (z_1, z_2) under ∼
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.21 converges uniformly
Let X be a set, (Y, ρ) a metric space, {f_n} a sequence of functions from X to Y, and f : X → Y another function.

If for any ε > 0 there exists an integer N such that

ρ(f_n(x), f(x)) < ε

for all n > N and all x ∈ X, we say that f_n converges uniformly to f.
Version: 2 Owner: drini Author(s): drini, apmxi
1.22 descending chain condition
A collection S of subsets of a set X (that is, a subset of the power set of X) satisfies the descending chain condition or DCC if there does not exist an infinite descending chain S_1 ⊃ S_2 ⊃ ⋯ of subsets from S.
See also the ascending chain condition (ACC).
Version: 1 Owner: antizeus Author(s): antizeus
1.23 diamond theorem
In the simplest case, the result states that every image of a two-colored “Diamond” figure (like the figure in Plato's Meno dialogue) under the action of the symmetric group of degree 4 has some ordinary or color-interchange symmetry. The theorem generalizes to graphic designs on 2x2x2, 4x4, and 4x4x4 arrays. It is of interest because it relates classical (Euclidean) symmetries to underlying group actions that come from finite rather than from classical geometry. The group actions in the 4x4 case of the theorem throw some light on the R. T. Curtis “miracle octad generator” approach to the large Mathieu group.
Version: 2 Owner: m759 Author(s): m759
1.24 equivalently oriented bases
equivalently oriented bases
1: V finite-dimensional vector space
2: (v_1, ..., v_n) ordered basis for V
3: (w_1, ..., w_n) ordered basis for V
4: A : V → V
5: ∀i ∈ {1, ..., n} : A v_i = w_i
6: det(A) > 0

fact: there is a unique linear isomorphism taking a given basis to another given basis

1: V finite-dimensional vector space
2: (v_1, ..., v_n) ordered basis for V
3: (w_1, ..., w_n) ordered basis for V
4: ∃! A : V → V linear isomorphism : ∀i ∈ {1, ..., n} : A v_i = w_i
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.25 finitely generated R-module

finitely generated R-module
1: X a module over R
2: Y ⊂ X
3: X generated by Y
4: Y finite
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.26 fraction
A fraction is a rational number expressed in the form n/d, where n is designated the numerator and d the denominator. The slash between them is known as a solidus when the fraction is written on a single line as n/d.

The fraction n/d has the value n ÷ d. For instance, 3/2 = 3 ÷ 2 = 1.5.

If n/d < 1, then n/d is known as a proper fraction. Otherwise, it is an improper fraction. If n and d are relatively prime, then n/d is said to be in lowest terms. To get a fraction in lowest terms, simply divide the numerator and the denominator by their greatest common divisor:

60/84 = (60 ÷ 12)/(84 ÷ 12) = 5/7.

The rules for manipulating fractions are

a/b = (ka)/(kb)
a/b + c/d = (ad + bc)/(bd)
a/b − c/d = (ad − bc)/(bd)
(a/b) · (c/d) = (ac)/(bd)
(a/b) ÷ (c/d) = (ad)/(bc).
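These rules are exactly what Python's fractions module implements, so a short sketch can double as a check of the 60/84 example and of the addition rule:

# Sketch: reduce 60/84 to lowest terms and check the addition rule (ad + bc)/(bd).
from fractions import Fraction

print(Fraction(60, 84))                    # 5/7, reduced automatically via gcd(60, 84) = 12

a, b, c, d = 2, 3, 1, 6
lhs = Fraction(a, b) + Fraction(c, d)
rhs = Fraction(a * d + b * c, b * d)
print(lhs, rhs, lhs == rhs)                # 5/6 5/6 True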
Version: 3 Owner: bwebste Author(s): digitalis
1.27 group of covering transformations
group of covering transformations
1: ( { h : X → X | h a covering transformation }, ∘ )
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.28 idempotent
idempotent
1: R ring
2: e ∈ R
3: e² = e

The following facts hold in commutative rings.

fact: if e is idempotent, then 1 − e is idempotent
1: R ring
2: e ∈ R
3: e idempotent
4: 1 − e idempotent

fact: if e is idempotent, then eR is a ring
1: R ring
2: e ∈ R
3: e idempotent
4: eR is a ring

fact: if e is idempotent, then eR has identity e
1: R ring
2: e ∈ R
3: e idempotent
4: ∀r ∈ eR : er = re = r

fact: if e is idempotent, then R ≅ eR ⊕ (1 − e)R
1: R ring
2: e ∈ R
3: e idempotent
4: R ≅ eR ⊕ (1 − e)R
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 3 Owner: bwebste Author(s): apmxi
1.29 isolated
Let X be a topological space, let S ⊂ X, and let x ∈ S. The point x is said to be an isolated point of S if there exists an open set U ⊂ X such that U ∩ S = {x}.

The set S is isolated if every point in S is an isolated point.
Version: 1 Owner: djao Author(s): djao
1.30 isolated singularity
isolated singularity
1: f : U ⊂ C → C ∪ {∞}
2: z_0 ∈ U
3: f analytic on U ∖ {z_0}
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.31 isomorphic groups
isomorphic groups
1: (X_1, ∗_1), (X_2, ∗_2) groups
2: f : X_1 → X_2 an isomorphism
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.32 joint continuous density function
Let X_1, X_2, ..., X_n be n random variables, all defined on the same probability space. The joint continuous density function of X_1, X_2, ..., X_n, denoted by f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n), is the function

f_{X_1,X_2,...,X_n} : R^n → R

such that

∫_{(−∞,−∞,...,−∞)}^{(x_1,x_2,...,x_n)} f_{X_1,X_2,...,X_n}(u_1, u_2, ..., u_n) du_1 du_2 ... du_n = F_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n).

As in the case where n = 1, this function satisfies:

1. f_{X_1,X_2,...,X_n}(x_1, ..., x_n) ≥ 0  ∀(x_1, ..., x_n)
2. ∫_{R^n} f_{X_1,X_2,...,X_n}(u_1, u_2, ..., u_n) du_1 du_2 ... du_n = 1

As in the single variable case, f_{X_1,X_2,...,X_n} does not represent the probability that each of the random variables takes on each of the values.
Version: 4 Owner: Riemann Author(s): Riemann
1.33 joint cumulative distribution function
Let X_1, X_2, ..., X_n be n random variables, all defined on the same probability space. The joint cumulative distribution function of X_1, X_2, ..., X_n, denoted by F_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n), is the following function:

F_{X_1,X_2,...,X_n} : R^n → R

F_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n) = P[X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n]

As in the unidimensional case, this function satisfies:

1. lim_{(x_1,...,x_n)→(−∞,...,−∞)} F_{X_1,X_2,...,X_n}(x_1, ..., x_n) = 0 and lim_{(x_1,...,x_n)→(∞,...,∞)} F_{X_1,X_2,...,X_n}(x_1, ..., x_n) = 1
2. F_{X_1,X_2,...,X_n}(x_1, ..., x_n) is a monotone, nondecreasing function.
3. F_{X_1,X_2,...,X_n}(x_1, ..., x_n) is continuous from the right in each variable.

The way to evaluate F_{X_1,X_2,...,X_n}(x_1, ..., x_n) is the following:

F_{X_1,X_2,...,X_n}(x_1, ..., x_n) = ∫_{−∞}^{x_1} ∫_{−∞}^{x_2} ⋯ ∫_{−∞}^{x_n} f_{X_1,X_2,...,X_n}(u_1, ..., u_n) du_1 du_2 ⋯ du_n

(if X is continuous) or

F_{X_1,X_2,...,X_n}(x_1, ..., x_n) = ∑_{i_1 ≤ x_1, ..., i_n ≤ x_n} f_{X_1,X_2,...,X_n}(i_1, ..., i_n)

(if X is discrete),

where f_{X_1,X_2,...,X_n} is the joint density function of X_1, ..., X_n.
Version: 3 Owner: Riemann Author(s): Riemann
1.34 joint discrete density function
Let X_1, X_2, ..., X_n be n random variables, all defined on the same probability space. The joint discrete density function of X_1, X_2, ..., X_n, denoted by f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n), is the following function:

f_{X_1,X_2,...,X_n} : R^n → R

f_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n) = P[X_1 = x_1, X_2 = x_2, ..., X_n = x_n]

As in the single variable case, it is sometimes expressed as p_{X_1,X_2,...,X_n}(x_1, x_2, ..., x_n) to mark the difference between this function and the continuous joint density function.

Also, as in the case where n = 1, this function satisfies:

1. f_{X_1,X_2,...,X_n}(x_1, ..., x_n) ≥ 0  ∀(x_1, ..., x_n)
2. ∑_{x_1,...,x_n} f_{X_1,X_2,...,X_n}(x_1, ..., x_n) = 1

In this case, f_{X_1,X_2,...,X_n}(x_1, ..., x_n) = P[X_1 = x_1, X_2 = x_2, ..., X_n = x_n].
Version: 3 Owner: Riemann Author(s): Riemann
1.35 left function notation
We are said to be using left function notation if we write functions to the left of their arguments. That is, if α : X → Y is a function and x ∈ X, then αx is the image of x under α.

Furthermore, if we have a function β : Y → Z, then we write the composition of the two functions as βα : X → Z, and the image of x under the composition as βαx = (βα)x = β(αx).
Compare this to right function notation.
Version: 1 Owner: antizeus Author(s): antizeus
1.36 lift of a submanifold
lift of a submanifold
1: X, Y topological manifolds
2: Z ⊂ Y submanifold
3: g : Z → Y the inclusion
4: g̃ a lift of g
5: the image of g̃
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.37 limit of a real function exists at a point

Let X ⊂ R be an open set of real numbers and f : X → R a function.

If x_0 ∈ X, we say that f is continuous at x_0 if for any ε > 0 there exists δ > 0 such that

|f(x) − f(x_0)| < ε

whenever

|x − x_0| < δ.

Based on apmxi
Version: 2 Owner: drini Author(s): drini, apmxi
1.38 lipschitz function
lipschitz function
1: f : R → C
2: ∃ M ∈ R : ∀x, y ∈ R : |f(x) − f(y)| < M |x − y|
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.39 lognormal random variable
X is a Lognormal random variable with parameters μ and σ² if

f_X(x) = 1/(x √(2πσ²)) · e^{−(ln x − μ)² / (2σ²)},   x > 0

Parameters:
- μ ∈ R
- σ² > 0

syntax:

X ∼ LogN(μ, σ²)

Notes:
1. X is a random variable such that ln(X) is a normal random variable with mean μ and variance σ².
2. E[X] = e^{μ + σ²/2}
3. Var[X] = e^{2μ + σ²} (e^{σ²} − 1)
4. M_X(t) is not useful
Version: 2 Owner: Riemann Author(s): Riemann
1.40 lowest upper bound
Let S be a set with an ordering relation ≤, and let T be a subset of S. A lowest upper bound of T is an upper bound x of T with the property that x ≤ y for every upper bound y of T. A lowest upper bound of T, when it exists, is unique.

Greatest lower bound is defined similarly: a greatest lower bound of T is a lower bound x of T with the property that x ≥ y for every lower bound y of T.
Version: 3 Owner: djao Author(s): djao
1.41 marginal distribution
Given random variables X_1, X_2, ..., X_n and a subset I ⊂ {1, 2, ..., n}, the marginal distribution of the random variables X_i, i ∈ I, is the following:

f_{{X_i : i ∈ I}}(x) = ∑_{{x_i : i ∉ I}} f_{X_1,...,X_n}(x_1, ..., x_n)   or

f_{{X_i : i ∈ I}}(x) = ∫_{{x_i : i ∉ I}} f_{X_1,...,X_n}(u_1, ..., u_n) ∏_{{u_i : i ∉ I}} du_i,

summing if the variables are discrete and integrating if the variables are continuous.

That is, the marginal distribution of a set of random variables X_1, ..., X_n can be obtained by summing (or integrating) the joint distribution over all values of the other variables.

The most common marginal distribution is the individual marginal distribution (i.e., the marginal distribution of ONE random variable).
Version: 4 Owner: Riemann Author(s): Riemann
1.42 measurable space
A measurable space is a set E together with a collection B(E) of subsets of E which is a sigma algebra.

The elements of B(E) are called measurable sets.
Version: 3 Owner: djao Author(s): djao
1.43 measure zero
measure zero
1: (X, M, μ) measure space
2: A ∈ M
3: μ(A) = 0
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.44 minimum spanning tree
Given a graph G with weighted edges, a minimum spanning tree is a spanning tree with
minimum weight, where the weight of a spanning tree is the sum of the weights of its edges.
There may be more than one minimum spanning tree for a graph, since it is the weight of
the spanning tree that must be minimum.
For example, here is a graph G of weighted edges and a minimum spanning tree 1 for that
graph. The edges of 1 are drawn as solid lines, while edges in G but not in 1 are drawn as
dotted lines.
Figure: a weighted graph G and a minimum spanning tree T; the edges of T are drawn solid, the remaining edges of G dotted.
Prim’s algorithm or Kruskal’s algorithm can compute the minimum spanning tree of a graph.
Version: 3 Owner: Logan Author(s): Logan
1.45 minimum weighted path length
Given a list of weights, W := {w_1, w_2, ..., w_n}, the minimum weighted path length is the minimum of the weighted path lengths of all extended binary trees that have n external nodes with weights taken from W. There may be multiple possible trees that give this minimum path length, and quite often finding such a tree is more important than determining the path length.

Example

Let W := {1, 2, 3, 3, 4}. The minimum weighted path length is 29. A tree that gives this weighted path length is shown below.

Applications

Constructing a tree of minimum weighted path length for a given set of weights has several applications, particularly dealing with optimization problems. A simple and elegant algorithm for constructing such a tree is Huffman's algorithm. Such a tree can give an optimal algorithm for merging n sorted sequences (optimal merge). It can also provide a means of compressing data (Huffman coding), as well as lead to optimal searches.
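A minimal Huffman-style sketch in Python computes the minimum weighted path length directly (the weight list is the one from the example above):

# Sketch: minimum weighted path length via repeated merging of the two smallest weights.
import heapq

def min_weighted_path_length(weights):
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b              # cost contributed by the internal node joining a and b
        heapq.heappush(heap, a + b)
    return total

print(min_weighted_path_length([1, 2, 3, 3, 4]))   # 29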
Version: 2 Owner: Logan Author(s): Logan
1.46 mod 2 intersection number
mod 2 intersection number
case: transversal map
1: X smooth manifold
2: X compact
3: Y smooth manifold
4: Z ⊂ Y closed submanifold
5: f : X → Y smooth
6: Z and X have complementary dimension
7: f transversal to Z
8: |f^{−1}(Z)| (mod 2)

case: nontransversal map
1: X smooth manifold
2: X compact
3: Y smooth manifold
4: Z ⊂ Y closed submanifold
5: f : X → Y smooth
6: dim(X) + dim(Z) = dim(Y)
7: g homotopic to f
8: g transversal to Z
9: |g^{−1}(Z)| (mod 2)

fact: a homotopic transversal map exists
1: X smooth manifold
2: X compact
3: Y smooth manifold
4: Z ⊂ Y closed submanifold
5: f : X → Y smooth
6: dim(X) + dim(Z) = dim(Y)
7: ∃ g homotopic to f : g transversal to Z

fact: two homotopic transversal maps have the same mod 2 intersection number
1: X smooth manifold
2: X compact
3: Y smooth manifold
4: Z ⊂ Y closed submanifold
5: f_1, f_2 : X → Y smooth
6: f_1 homotopic to f_2
7: I_2(f_1, Z) = I_2(f_2, Z)

fact: boundary theorem
1: X manifold with boundary
2: Y manifold
3: Z ⊂ Y submanifold
4: Z and ∂X have complementary dimension
5: g : ∂X → Y
6: g can be extended to X
7: I_2(g, Z) = 0
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.47 moment generating function
Given a random variable X, the moment generating function of X is the following function:

M_X(t) = E[e^{tX}] for t ∈ R (if the expectation converges).

It can be shown that if the moment generating function of X is defined on an interval around the origin, then

E[X^k] = M_X^{(k)}(t)|_{t=0}.

In other words, the kth derivative of the moment generating function evaluated at zero is the kth moment of X.
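For example, a short symbolic sketch in Python (using sympy; the moment generating function below is the Poisson one listed in entry 1.11) recovers the first two moments by differentiating at zero:

# Sketch: differentiate the Poisson moment generating function M(t) = exp(lam*(exp(t)-1)).
import sympy as sp

t, lam = sp.symbols('t lam', positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))

m1 = sp.diff(M, t).subs(t, 0)             # E[X]
m2 = sp.diff(M, t, 2).subs(t, 0)          # E[X^2]
print(sp.simplify(m1))                     # lam
print(sp.simplify(m2 - m1**2))             # variance lam, matching E[X] = Var[X] = lam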
Version: 1 Owner: Riemann Author(s): Riemann
1.48 monoid
A monoid is a semigroup G which contains an identity element; that is, there exists an element e ∈ G such that e · a = a · e = a for all a ∈ G.
Version: 1 Owner: djao Author(s): djao
1.49 monotonic operator
For a poset X, an operator f is a monotonic operator if for all x, y ∈ X, x ≤ y implies f(x) ≤ f(y).
Version: 1 Owner: Logan Author(s): Logan
1.50 multidimensional Gaussian integral
Let N(0, K) be an unnormalized multidimensional Gaussian with mean 0 and covariance matrix K, K_ij = cov(x_i, x_j). K is symmetric by the identity cov(x_j, x_i) = cov(x_i, x_j). Let x = [x_1 x_2 ... x_n]^T and d^n x ≡ ∏_{i=1}^{n} dx_i.

It is easy to see that N(0, K) = exp(−(1/2) x^T K^{−1} x). How can we normalize N(0, K)?

We can show that

∫ e^{−(1/2) x^T K^{−1} x} d^n x = ((2π)^n |K|)^{1/2}   (1.50.1)

where |K| = det K.

K^{−1} is real and symmetric (since (K^{−1})^T = (K^T)^{−1} = K^{−1}). For convenience, let A = K^{−1}. We can decompose A into A = TΛT^{−1}, where T is an orthonormal (T^T T = I) matrix of the eigenvectors of A and Λ is a diagonal matrix of the eigenvalues of A. Then

∫ e^{−(1/2) x^T A x} d^n x = ∫ e^{−(1/2) x^T TΛT^{−1} x} d^n x.   (1.50.2)

Because T is orthonormal, we have T^{−1} = T^T. Now define a new vector variable y ≡ T^T x, and substitute:

∫ e^{−(1/2) x^T TΛT^{−1} x} d^n x = ∫ e^{−(1/2) x^T TΛT^T x} d^n x   (1.50.3)
 = ∫ e^{−(1/2) y^T Λ y} |J| d^n y   (1.50.4)

where |J| is the determinant of the Jacobian matrix J_mn = ∂x_m/∂y_n. In this case, J = T and thus |J| = 1.

Now we're in business, because Λ is diagonal and thus the integral may be separated into the product of n independent Gaussians, each of which we can integrate separately using the well-known formula

∫ e^{−(1/2) a t²} dt = (2π/a)^{1/2}.   (1.50.6)

Carrying out this program, we get

∫ e^{−(1/2) y^T Λ y} d^n y = ∏_{k=1}^{n} ∫ e^{−(1/2) λ_k y_k²} dy_k   (1.50.7)
 = ∏_{k=1}^{n} (2π/λ_k)^{1/2}   (1.50.8)
 = ((2π)^n / ∏_{k=1}^{n} λ_k)^{1/2}   (1.50.9)
 = ((2π)^n / |Λ|)^{1/2}   (1.50.10)

Now, we have |A| = |TΛT^{−1}| = |T| |Λ| |T^{−1}| = |Λ| |T| |T|^{−1} = |Λ|, so this becomes

∫ e^{−(1/2) x^T A x} d^n x = ((2π)^n / |A|)^{1/2}.   (1.50.12)

Substituting back in for K^{−1}, we get

∫ e^{−(1/2) x^T K^{−1} x} d^n x = ((2π)^n / |K^{−1}|)^{1/2} = ((2π)^n |K|)^{1/2}   (1.50.13)

as promised.
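As an illustrative numerical check of (1.50.1) for n = 2 (assuming NumPy and SciPy; the covariance matrix below is arbitrary):

import numpy as np
from scipy import integrate

K = np.array([[2.0, 0.3],
              [0.3, 1.0]])               # an arbitrary symmetric positive definite covariance
A = np.linalg.inv(K)

def integrand(y, x):                      # the unnormalized Gaussian exp(-x^T K^{-1} x / 2)
    v = np.array([x, y])
    return np.exp(-0.5 * v @ A @ v)

numeric, _ = integrate.dblquad(integrand, -10, 10, lambda x: -10, lambda x: 10)
closed_form = np.sqrt((2 * np.pi) ** 2 * np.linalg.det(K))
print(numeric, closed_form)               # the two values agree to several digits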
Version: 4 Owner: drini Author(s): drini, drummond
1.51 multiindex
multiindex
Let n ∈ ℕ. Then an element α ∈ ℕ^n is called a multiindex.
Version: 2 Owner: mike Author(s): mike, apmxi
1.52 near operators
1.52.1 Perturbations and small perturbations: definitions and some
results
We start our discussion on the Campanato theory of near operators with some preliminary
tools.
Let X, Y be two sets and let a metric d be defined on Y. If F : X → Y is an injective map, we can define a metric on X by putting:

d_F(x', x'') = d(F(x'), F(x'')).

Indeed, d_F is zero if and only if x' = x'' (since F is injective); d_F is obviously symmetric and the triangle inequality follows from the triangle inequality of d.

If moreover F(X) is a complete subspace of Y, then X is complete with respect to the metric d_F. Indeed, let (x_n) be a Cauchy sequence in X. By definition of d_F, (F(x_n)) is then a Cauchy sequence in Y, and in particular in F(X), which is complete. Thus, there exists y_0 = F(x_0) ∈ F(X) which is the limit of the sequence (F(x_n)); x_0 is then the limit of (x_n) in (X, d_F), which completes the proof.

A particular case of the previous statement is when F is onto (and thus a bijection) and (Y, d) is complete.

Similarly, if F(X) is compact in Y, then X is compact with the metric d_F.
Definition 1. Let X be a set and Y be a metric space. Let F, G be two maps from X to Y. We say that G is a perturbation of F if there exists a constant k > 0 such that for each x', x'' ∈ X one has:

d(G(x'), G(x'')) ≤ k d(F(x'), F(x''))

remark 1. In particular, if F is injective then G is a perturbation of F if G is uniformly continuous with respect to the metric induced on X by F.

Definition 2. In the same hypotheses as in the previous definition, we say that G is a small perturbation of F if it is a perturbation with constant k < 1.

We can now prove this generalization of the Banach-Caccioppoli fixed point theorem:

Theorem 1. Let X be a set and (Y, d) be a complete metric space. Let F, G be two mappings from X to Y such that:

1. F is bijective;

2. G is a small perturbation of F.

Then, there exists a unique u ∈ X such that G(u) = F(u).
Proof. The hypothesis (1) ensures that the metric space (X, d_F) is complete. If we now consider the function T : X → X defined by

T(x) = F^{-1}(G(x))

we note that, by (2), we have

d(G(x'), G(x'')) ≤ k d(F(x'), F(x''))

where k ∈ (0, 1) is the constant of the small perturbation; note that, by the definition of d_F and applying F ∘ F^{-1} to the left-hand side, the last inequality can be rewritten as

d_F(T(x'), T(x'')) ≤ k d_F(x', x'');

in other words, since k < 1, T is a contraction in the complete metric space (X, d_F); therefore (by the classical Banach-Caccioppoli fixed point theorem) T has a unique fixed point: there exists u ∈ X such that T(u) = u; by definition of T this is equivalent to G(u) = F(u), and the proof is hence complete.

remark 2. The hypotheses of the theorem can be generalized as follows: let X be a set and Y a metric space (not necessarily complete); let F, G be two mappings from X to Y such that F is injective, F(X) is complete and G(X) ⊆ F(X); then there exists u ∈ X such that G(u) = F(u).

(Apply the theorem using F(X) instead of Y as target space.)

remark 3. The Banach-Caccioppoli fixed point theorem is obtained when X = Y and F is the identity.
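A minimal numerical sketch of the iteration used in the proof (the choices X = Y = ℝ, F(x) = 2x and G(x) = cos x are only illustrative, and Python is assumed): since |G(x') − G(x'')| ≤ |x' − x''| = (1/2)|F(x') − F(x'')|, G is a small perturbation of F with constant k = 1/2.

import math

F_inverse = lambda y: y / 2.0          # F(x) = 2x is bijective on R
G = math.cos

u = 0.0
for _ in range(60):                    # iterate T(x) = F^{-1}(G(x)), a contraction in d_F
    u = F_inverse(G(u))
print(u, G(u), 2 * u)                  # at the fixed point, G(u) = F(u) = 2u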
We can use theorem 1 to prove a result that applies to perturbations which are not necessarily small (i.e. for which the constant k can be greater than one). To prove it, we must assume some supplemental structure on the metric of Y: in particular, we have to assume that the metric d is invariant under dilations, that is, that d(αy', αy'') = α d(y', y'') for each y', y'' ∈ Y and α > 0. The most common case of such a metric is when the metric is deduced from a norm (i.e. when Y is a normed space, and in particular a Banach space). The result follows immediately:

Corollary 1. Let X be a set and (Y, d) be a complete metric space with a metric d invariant under dilations. Let F, G be two mappings from X to Y such that F is bijective and G is a perturbation of F with constant K > 0.

Then, for each M > K there exists a unique u_M ∈ X such that G(u_M) = M F(u_M).

Proof. The proof is an immediate consequence of theorem 1, given that the map G̃(u) = G(u)/M is a small perturbation of F (a property which is ensured by the dilation invariance of the metric d).
31
We also have the following

Corollary 2. Let X be a set and (Y, d) be a complete, compact metric space with a metric d invariant under dilations. Let F, G be two mappings from X to Y such that F is bijective and G is a perturbation of F with constant K > 0.

Then there exists at least one u_K ∈ X such that G(u_K) = K F(u_K).

Proof. Let (ε_n) be a decreasing sequence of real numbers greater than one, converging to one (ε_n ↓ 1), and let M_n = ε_n K for each n ∈ ℕ. We can apply corollary 1 to each M_n, obtaining a sequence (u_n) of elements of X for which one has

G(u_n) = M_n F(u_n).   (1.52.1)

Since (X, d_F) is compact, there exists a subsequence of (u_n) which converges to some u_K; by continuity of G and F we can pass to the limit in (1.52.1), obtaining

G(u_K) = K F(u_K)

which completes the proof.

remark 4. For corollary 2 we cannot ensure uniqueness of u_K, since in general the sequence (u_n) may change with the choice of (ε_n), and the limit might be different. So the corollary can only be applied as an existence theorem.
1.52.2 Near operators
We can now introduce the concept of near operators and discuss some of their properties.
A historical remark: Campanato initially introduced the concept in Hilbert spaces; subse-
quently, it was remarked that most of the theory could more generally be applied to Banach
spaces; indeed, it was also proven that the basic definition can be generalized to make part
of the theory available in the more general environment of metric vector spaces.
We will here discuss the theory in the case of Banach spaces, with only a couple of exceptions:
to see some of the extra properties that are available in Hilbert spaces and to discuss a
generalization of the Lax-Milgram theorem to metric vector spaces.
1.52.3 Basic definitions and properties
Definition 3. Let X be a set and Y a Banach space. Let A, B be two operators from X to Y. We say that A is near B if and only if there exist two constants α > 0 and k ∈ (0, 1) such that, for each x', x'' ∈ X, one has

‖B(x') − B(x'') − α(A(x') − A(x''))‖ ≤ k ‖B(x') − B(x'')‖

In other words, A is near B if B − αA is a small perturbation of B for an appropriate value of α.

Observe that in general the property is not symmetric: if A is near B, it is not necessarily true that B is near A; as we will briefly see, this can only be proven if k < 1/2, or in the case that Y is a Hilbert space, by using an equivalent condition that will be discussed later on. Yet it is possible to define a topology with some interesting properties on the space of operators, by using the concept of nearness to form a base.

The core point of the nearness between operators is that it allows us to "transfer" many important properties from B to A; in other words, if B satisfies certain properties, and A is near B, then A satisfies the same properties. To prove this, and to enumerate some of these "nearness-invariant" properties, we will need a few important facts.

In what follows, unless differently specified, we will always assume that X is a set, Y is a Banach space and A, B are two operators from X to Y.
Lemma 1. If A is near B then there exist two positive constants M_1, M_2 such that

‖B(x') − B(x'')‖ ≤ M_1 ‖A(x') − A(x'')‖
‖A(x') − A(x'')‖ ≤ M_2 ‖B(x') − B(x'')‖

Proof. We have:

‖B(x') − B(x'')‖ ≤ ‖B(x') − B(x'') − α(A(x') − A(x''))‖ + α ‖A(x') − A(x'')‖ ≤ k ‖B(x') − B(x'')‖ + α ‖A(x') − A(x'')‖

and hence

‖B(x') − B(x'')‖ ≤ (α / (1 − k)) ‖A(x') − A(x'')‖

which is the first inequality with M_1 = α/(1 − k) (which is positive since k < 1).

But also

‖A(x') − A(x'')‖ ≤ (1/α) ‖B(x') − B(x'') − α(A(x') − A(x''))‖ + (1/α) ‖B(x') − B(x'')‖ ≤ (k/α) ‖B(x') − B(x'')‖ + (1/α) ‖B(x') − B(x'')‖

and hence

‖A(x') − A(x'')‖ ≤ ((1 + k)/α) ‖B(x') − B(x'')‖

which is the second inequality with M_2 = (1 + k)/α.
The most important corollary of the previous lemma is the following

Corollary 3. If A is near B then two points of X have the same image under A if and only if they have the same image under B.

We can express the previous concept in the following formal way: for each y in B(X) there exists z in Y such that A(B^{-1}(y)) = {z}, and conversely. In yet other words: each fiber of A is a fiber (for a different point) of B, and conversely.

It is therefore possible to define a map F_A : B(X) → Y by putting F_A(y) = z; the range of F_A is A(X). Conversely, it is possible to define F_B : A(X) → Y by putting F_B(z) = y; the range of F_B is B(X). Both maps are injective and, if restricted to their respective ranges, one is the inverse of the other.

Also observe that F_B and F_A are continuous. This follows from the fact that for each x ∈ X one has

F_A(B(x)) = A(x),  F_B(A(x)) = B(x)

and that the lemma ensures that given a sequence (x_n) in X, the sequence (B(x_n)) converges to B(x_0) if and only if (A(x_n)) converges to A(x_0).
We can now list some invariant properties of operators with respect to nearness. The properties are given in the form "if and only if" because each operator is near itself (therefore ensuring the "only if" part).

1. a map is injective iff it is near an injective operator;

2. a map is surjective iff it is near a surjective operator;

3. a map is open iff it is near an open map;

4. a map has dense range iff it is near a map with dense range.

To prove (2) it is necessary to use theorem 1.

Another important property that follows from the lemma is that if there exists y ∈ Y such that A^{-1}(y) ∩ B^{-1}(y) ≠ ∅, then A^{-1}(y) = B^{-1}(y): intersecting fibers are equal. (Campanato only stated this property for the case y = 0 and called it "the kernel property"; I prefer to call it the "fiber persistence" property.)
A topology based on nearness

In this section we will show that the concept of nearness between operators can indeed be connected to a topological understanding of the set of maps from X to Y.

Let M be the set of maps between X and Y. For each F ∈ M and for each k ∈ (0, 1) we let U_k(F) be the set of all maps G ∈ M such that F − G is a small perturbation of F with constant k. In other words, G ∈ U_k(F) iff G is near F with constants 1, k.

The set U(F) = {U_k(F) | 0 < k < 1} satisfies the axioms of a set of fundamental neighbourhoods. Indeed:

1. F belongs to each U_k(F);

2. U_k(F) ⊂ U_h(F) iff k < h, and thus the intersection property of neighbourhoods is trivial;

3. for each U_k(F) there exists U_h(F) such that for each G ∈ U_h(F) there exists U_j(G) ⊆ U_k(F).

This last property (permanence of neighbourhoods) is somewhat less trivial, so we shall now prove it.
Proof. Let U_k(F) be given.

Let U_h(F) be another arbitrary neighbourhood of F and let G be an arbitrary element in it. We then have:

‖F(x') − F(x'') − (G(x') − G(x''))‖ ≤ h ‖F(x') − F(x'')‖,   (1.52.2)

but also (by lemma 1)

‖G(x') − G(x'')‖ ≤ (1 + h) ‖F(x') − F(x'')‖.   (1.52.3)

Let also U_j(G) be an arbitrary neighbourhood of G and H an arbitrary element in it. We then have:

‖G(x') − G(x'') − (H(x') − H(x''))‖ ≤ j ‖G(x') − G(x'')‖.   (1.52.4)

The nearness between F and H is estimated as follows:

‖F(x') − F(x'') − (H(x') − H(x''))‖ ≤ ‖F(x') − F(x'') − (G(x') − G(x''))‖ + ‖G(x') − G(x'') − (H(x') − H(x''))‖ ≤ h ‖F(x') − F(x'')‖ + j ‖G(x') − G(x'')‖ ≤ (h + j(1 + h)) ‖F(x') − F(x'')‖.   (1.52.5)

We then want h + j(1 + h) ≤ k, that is j ≤ (k − h)/(1 + h); the condition 0 < j < 1 is always satisfied on the right side, and the left side gives us h < k.
It is important to observe that the topology generated this way is not a Hausdorff topology: indeed, it is not possible to separate F and F + y (where F ∈ M and y is a constant element of Y). On the other hand, the subset of all maps with a fixed value at a fixed point (F(x_0) = y_0) is a Hausdorff subspace.

Another important characteristic of the topology is that the set H of invertible operators from X to Y is open in M (because a map is invertible iff it is near an invertible map). This is not true in the topology of uniform convergence, as is easily seen by choosing X = Y = ℝ and the sequence with generic element F_n(x) = x³ − x/n: the sequence converges (in the uniform convergence topology) to F(x) = x³, which is invertible, but none of the F_n is invertible. Hence F is an element of H which is not in the interior of H, and H is not open.
1.52.4 Some applications
As we mentioned in the introduction, the Campanato theory of near operators allows us
to generalize some important theorems; we will now present some generalizations of the
Lax-Milgram theorem, and a generalization of the Riesz representation theorem.
[TODO]
Version: 5 Owner: Oblomov Author(s): Oblomov
1.53 negative binomial random variable
X is a negative binomial random variable with parameters r and p if

f_X(x) = (r + x − 1 choose x) p^r (1 − p)^x,  x ∈ {0, 1, ...}

Parameters:
- r > 0
- p ∈ [0, 1]

syntax:

X ∼ NegBin(r, p)

Notes:

1. If r ∈ ℕ, X represents the number of failed Bernoulli trials before the rth success. Note that if r = 1 the variable is a geometric random variable.

2. E[X] = r(1 − p)/p

3. Var[X] = r(1 − p)/p²

4. M_X(t) = (p / (1 − (1 − p)e^t))^r
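A quick numerical check of notes 2 and 3 (assuming SciPy is available; scipy.stats.nbinom uses the same failures-before-the-rth-success convention, and the parameter values below are arbitrary):

from scipy.stats import nbinom

r, p = 5, 0.3
mean, var = nbinom.stats(r, p, moments='mv')
print(mean, r * (1 - p) / p)         # both equal E[X]
print(var, r * (1 - p) / p ** 2)     # both equal Var[X]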
Version: 2 Owner: Riemann Author(s): Riemann
1.54 normal random variable
X is a normal random variable with parameters μ and σ² if

f_X(x) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)},  x ∈ ℝ

Parameters:
- μ ∈ ℝ
- σ² > 0

syntax:

X ∼ N(μ, σ²)

Notes:

1. Probably the most frequently used distribution. f_X(x) will look like a bell-shaped function, hence justifying the synonym bell distribution.

2. When μ = 0 and σ² = 1 the distribution is called the standard normal.

3. The cumulative distribution function of X is often called Φ(x).

4. E[X] = μ

5. Var[X] = σ²

6. M_X(t) = e^{tμ + t²σ²/2}
Version: 4 Owner: Riemann Author(s): Riemann
1.55 normalizer of a subset of a group
normalizer of a subset of a group
1: X group
2: Y ⊂ X subset
3: {x ∈ X | xYx^{-1} = Y}
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: drini Author(s): apmxi
1.56 nth root
There are two often-used definitions of the nth root. The first deals with real numbers only; the second deals with complex numbers.

The nth root of a non-negative real number x, written as ⁿ√x, can be defined as the real number y such that yⁿ = x. This notation is normally, but not always, used when n is a natural number. This definition could also be written as (ⁿ√x)ⁿ = x for all x ≥ 0 in ℝ.

Example: ⁴√81 = 3 because 3⁴ = 3 · 3 · 3 · 3 = 81.

Example: ⁵√(x⁵ + 5x⁴ + 10x³ + 10x² + 5x + 1) = x + 1 because (x + 1)⁵ = (x² + 2x + 1)²(x + 1) = x⁵ + 5x⁴ + 10x³ + 10x² + 5x + 1. (See the binomial theorem and Pascal's triangle.)

The nth root operation is distributive over multiplication and division, but not over addition and subtraction. That is, ⁿ√(xy) = ⁿ√x · ⁿ√y, and ⁿ√(x/y) = ⁿ√x / ⁿ√y. However, except in special cases, ⁿ√(x + y) ≠ ⁿ√x + ⁿ√y and ⁿ√(x − y) ≠ ⁿ√x − ⁿ√y.

Example: ⁴√(81/625) = 3/5 because (3/5)⁴ = 3⁴/5⁴ = 81/625.

The nth root notation is actually an alternative to exponentiation. That is, ⁿ√x ≡ x^{1/n}. As such, the nth root operation is associative with exponentiation. That is, ⁿ√(x³) = x^{3/n} = (ⁿ√x)³.

In this definition, ⁿ√x is undefined when x < 0 and n is even. When n is odd and x < 0, ⁿ√x < 0. Examples: ³√−1 = −1, but ⁴√−1 is undefined under this definition.
A more generalized definition: The nth roots of a complex number t = x + yi = (x, yi) = (r, θ) are all the complex numbers z_1, z_2, ..., z_n ∈ ℂ that satisfy the condition z_k^n = t; n such complex numbers always exist.

One of the more popular methods of finding these roots is through geometry and trigonometry. The complex numbers are treated as a plane using Cartesian coordinates with an x axis and a yi axis. (Remember, in the context of complex numbers, i ≡ √−1.) These rectangular coordinates (x, yi) are then translated to polar coordinates (r, θ), where r = ²ⁿ√(x² + y²) (according to the previous definition of the nth root), θ = π/2 if x = 0, and θ = arctan(y/x) if x ≠ 0. (See the Pythagorean theorem.)

Then the nth roots of t are the vertices of a regular polygon having n sides, centered at (0, 0i), and having (r, θ/n), as calculated above, as one of its vertices.
Example: Consider ³√8. 8 can also be written as 8 + 0i, or in polar form as (8, 0). By our method, we now have an equilateral triangle centered at (0, 0) and having one vertex at (2, 0). Knowing that a complete circle consists of 2π radians, and knowing that all angles are equal in an equilateral triangle, we can deduce that the other two vertices lie at polar coordinates (2, 2π/3) and (2, 4π/3). Translating back into rectangular coordinates, we have:

³√8 = 2
³√8 = 2(cos 2π/3 + i sin 2π/3) = 2(−1/2 + i √3/2) = −1 + i√3
³√8 = 2(cos 4π/3 + i sin 4π/3) = 2(−1/2 − i √3/2) = −1 − i√3

Example: Consider ⁴√−16. We can rewrite this as ⁴√−1 · ⁴√16 = 2√i.

We can find 2√i by using a formula for multiplying complex numbers in polar coordinates: (r_1, θ_1) · (r_2, θ_2) = (r_1 r_2, θ_1 + θ_2). So 0 + i = (r², 2θ). Therefore, r = ⁴√(0² + 1²) = 1 and θ = π/4. So √i = (1, π/4), and doubling that we get (2, π/4).

Now we have a square centered at polar coordinates (0, 0) with one corner at (2, π/4). Adding π/2 to the angle repeatedly gives us the remainder of the corners: (2, 3π/4), (2, 5π/4), and (2, 7π/4). Translating these to rectangular coordinates works as in the previous example.

So the four solutions to ⁴√−16 are √2 + i√2, −√2 + i√2, −√2 − i√2, and √2 − i√2.
Example: Consider ³√(1 + i). As in the previous examples, our first step is to convert 1 + 1i into polar coordinates. We get r = √(1² + 1²) = √2 and θ = arctan 1 = π/4, giving a polar coordinate of (√2, π/4). Now we take the cube root of this complex number: (√2, π/4) = (s³, 3θ). We get coordinates (⁶√2, π/12). This point is one vertex of an equilateral triangle centered at (0, 0). The other two vertices of the triangle are derived by adding 2π/3 to the angle. We know this because lines from the center of an equilateral triangle to each of the corners form three equal angles of width 2π/3 about the center, and because all three vertices of an equilateral triangle are the same distance from the center.

So the other vertices in polar coordinates are (⁶√2, 3π/4) and (⁶√2, 17π/12). Most people would just use a calculator to compute the sines and cosines of these angles, but they can be interpolated using these handy identities:

cos 2t = 1 − 2 sin² t   (use this to calculate sin(π/12) from sin(π/3) = √3/2)
sin(a + b) = sin(a) cos(b) + cos(a) sin(b)   (use a = 3π/4 and b = 2π/3)
cos(a + b) = cos(a) cos(b) − sin(a) sin(b)

The process of calculating these values is left as an exercise to the reader in the interest of space. The rectangular coordinates, the cube roots of 1 + i, are:

(⁶√2, π/12) = ⁶√2 (√6 + √2)/4 + i ⁶√2 (√6 − √2)/4
(⁶√2, 3π/4) = −⁶√16/2 + i ⁶√16/2
(⁶√2, 17π/12) = ⁶√2 (√2 − √6)/4 − i ⁶√2 (√2 + √6)/4
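A short computational sketch of the polar-coordinate method (assuming only Python's standard cmath module; the input 1 + i is the example above):

import cmath

def nth_roots(t, n):
    r, theta = cmath.polar(t)                       # t = r e^{i theta}
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * k) / n) for k in range(n)]

for z in nth_roots(1 + 1j, 3):                      # the three cube roots of 1 + i
    print(z, z ** 3)                                # each z satisfies z**3 ≈ 1 + i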
Version: 8 Owner: mathcam Author(s): mathcam, wberry
1.57 null tree
A null tree is simply a tree with zero nodes.
Version: 1 Owner: Logan Author(s): Logan
1.58 open ball
Let (X, ρ) be a metric space and x_0 ∈ X. Let r be a positive number. The set

B(x_0, r) = {x ∈ X : ρ(x, x_0) < r}

is called the ball with center x_0 and radius r. On some spaces, like ℂ or ℝ², this is also known as an open disk, and when the space is ℝ it is known as an open interval (all three spaces with the standard metric).
Version: 2 Owner: drini Author(s): drini, apmxi
1.59 opposite ring
If R is a ring, then we may construct the opposite ring R^op which has the same underlying abelian group structure, but with multiplication in the opposite order: the product of r_1 and r_2 in R^op is r_2 r_1.

If M is a left R-module, then it can be made into a right R^op-module, where a module element m, when multiplied on the right by an element r of R^op, yields the rm that we have with our left R-module action on M. Similarly, right R-modules can be made into left R^op-modules.

If R is a commutative ring, then it is equal to its own opposite ring.
Version: 1 Owner: antizeus Author(s): antizeus
1.60 orbit-stabilizer theorem
Given a group G acting on a set X, define Gx to be the orbit of x and G_x to be the stabilizer of x. For each x ∈ X the correspondence g(x) ↦ gG_x is a bijection between Gx and the set of left cosets of G_x.

A famous corollary is that

|Gx| · |G_x| = |G|  for all x ∈ X
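An illustrative check of the corollary (assuming SymPy's combinatorics module; the group below, a dihedral group of order 8 acting on {0, 1, 2, 3}, and the point x = 0 are chosen arbitrarily):

from sympy.combinatorics import Permutation, PermutationGroup

G = PermutationGroup([Permutation([1, 2, 3, 0]), Permutation([1, 0, 3, 2])])
x = 0
orbit = G.orbit(x)
stabilizer = G.stabilizer(x)
print(len(orbit) * stabilizer.order(), G.order())   # the two numbers coincide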
Version: 8 Owner: vitriol Author(s): vitriol
1.61 orthogonal
The definition of orthogonal varies depending on the mathematical constructs in question.
There are particular definitions for
• orthogonal matrices
• orthogonal polynomials
• orthogonal vectors
In general, two objects are orthogonal if they do not “coincide” in some sense. Sometimes
orthogonal means roughly the same thing as “perpendicular”.
Version: 2 Owner: akrowne Author(s): akrowne
1.62 permutation group on a set
permutation group on a set
1: A set
2: (S_A, ∘) symmetric group
3: X < S_A
4: (X, ∘)

fact: conjugating the stabilizer of an element by a permutation produces the stabilizer of the permuted element

1: A set
2: a ∈ A
3: X permutation group on A
4: σ ∈ X
5: σ Stab_X(a) σ^{-1} = Stab_X(σ(a))

fact: if a permutation group acts transitively, then the intersection of the conjugated stabilizers is the identity

1: A set
2: a ∈ A
3: X permutation group on A
4: ⋂_{σ ∈ X} σ Stab_X(a) σ^{-1} = 1
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.63 prime element
An element p in a ring R is a prime element if it generates a prime ideal. If R is commutative, this is equivalent to saying that for all a, b ∈ R, if p divides ab, then p divides a or p divides b.

When R = ℤ the prime elements as formulated above are simply the prime numbers.
Version: 3 Owner: dublisk Author(s): dublisk
1.64 product measure
Let (E_1, ℬ_1(E_1)) and (E_2, ℬ_2(E_2)) be two measurable spaces, with measures μ_1 and μ_2. Let ℬ_1 × ℬ_2 be the sigma algebra on E_1 × E_2 generated by subsets of the form B_1 × B_2, where B_1 ∈ ℬ_1(E_1) and B_2 ∈ ℬ_2(E_2).

The product measure μ_1 × μ_2 is defined to be the unique measure on the measurable space (E_1 × E_2, ℬ_1 × ℬ_2) satisfying the property

μ_1 × μ_2 (B_1 × B_2) = μ_1(B_1) μ_2(B_2)  for all B_1 ∈ ℬ_1(E_1), B_2 ∈ ℬ_2(E_2).
Version: 2 Owner: djao Author(s): djao
1.65 projective line
projective line
example

1: ℓ = {[X, Y, Z, W] ∈ ℝP³ | Z = W = 0}
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.66 projective plane
projective plane
1: ∼ : S² × S² → {0, 1}
2: x ∼ y ⇔ y = −x
3: p : S² → S²/∼
4: quotient space obtained from p
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 2 Owner: bhaire Author(s): bhaire, apmxi
1.67 proof of calculus theorem used in the Lagrange
method
Let f(x) and g_i(x), i = 1, ..., m be differentiable scalar functions; x ∈ ℝⁿ.

We will find local extremes of the function f(x) where ∇f = 0. This can be proved by contradiction:

∇f ≠ 0 ⇒ ∃ε_0 > 0, ∀ε; 0 < ε < ε_0 : f(x − ε∇f) < f(x) < f(x + ε∇f)

but then f(x) is not a local extreme.

Now we put up some conditions, such that we should find the x ∈ S ⊂ ℝⁿ that gives a local extreme of f. Let S = ⋂_{i=1}^{m} S_i, and let S_i be defined so that g_i(x) = 0 for all x ∈ S_i.

Any vector x ∈ ℝⁿ can have one component perpendicular to the subset S_i (for visualization, think n = 3 and let S_i be a flat surface). ∇g_i will be perpendicular to S_i, because:

∃ε_0 > 0, ∀ε; 0 < ε < ε_0 : g_i(x − ε∇g_i) < g_i(x) < g_i(x + ε∇g_i)

But g_i(x) = 0, so any vector x + ε∇g_i must be outside S_i, and also outside S. (todo: I have proved that there might exist a component perpendicular to each subset S_i, but not that there exists only one; this should be done)

By the argument above, ∇f must be zero - but now we can ignore all components of ∇f perpendicular to S. (todo: this should be expressed more formally and proved)

So we will have a local extreme within S_i if there exists a λ_i such that

∇f = λ_i ∇g_i

We will have local extreme(s) within S where there exists a set λ_i, i = 1, ..., m such that

∇f = Σ_i λ_i ∇g_i
Version: 2 Owner: tobix Author(s): tobix
1.68 proof of orbit-stabilizer theorem
The correspondence is clearly surjective. It is injective because if gG_x = g'G_x then g = g'h for some h ∈ G_x. Therefore g(x) = g'(h(x)) = g'(x).
Version: 1 Owner: vitriol Author(s): vitriol
1.69 proof of power rule
The power rule can be derived by repeated application of the product rule.

Proof for all positive integers n

The power rule has been shown to hold for n = 0 and n = 1. If the power rule is known to hold for some k > 0, then we have

d/dx x^{k+1} = d/dx (x · x^k)
 = x (d/dx x^k) + x^k
 = x (k x^{k−1}) + x^k
 = k x^k + x^k
 = (k + 1) x^k

Thus the power rule holds for all positive integers n.

Proof for all positive rationals n

Let y = x^{p/q}. We need to show

d/dx (x^{p/q}) = (p/q) x^{p/q − 1}   (1.69.1)

The proof of this comes from implicit differentiation.

By definition, we have y^q = x^p. We now take the derivative with respect to x on both sides of the equality.

d/dx y^q = d/dx x^p
(d/dy y^q) dy/dx = p x^{p−1}
q y^{q−1} dy/dx = p x^{p−1}
dy/dx = (p/q) x^{p−1} y^{1−q}
 = (p/q) x^{p−1} x^{(p/q)(1−q)}
 = (p/q) x^{p−1} x^{p/q − p}
 = (p/q) x^{p−1+p/q−p}
 = (p/q) x^{p/q − 1}

Proof for all positive irrationals n

For positive irrationals we claim continuity due to the fact that (1.69.1) holds for all positive rationals, and there are positive rationals that approach any positive irrational.

Proof for negative powers n

We again employ implicit differentiation. Let u = x, and differentiate u^{−n} with respect to x for some non-negative n. We must show

d u^{−n}/dx = −n u^{−n−1}   (1.69.2)

By definition we have u^n u^{−n} = 1. We begin by taking the derivative with respect to x on both sides of the equality. By application of the product rule we get

d/dx (u^n u^{−n}) = d/dx 1
u^n (d u^{−n}/dx) + u^{−n} (d u^n/dx) = 0
u^n (d u^{−n}/dx) + u^{−n} (n u^{n−1}) = 0
u^n (d u^{−n}/dx) = −n u^{−1}
d u^{−n}/dx = −n u^{−n−1}
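A symbolic spot-check of the power rule (assuming SymPy is available; the exponents below are chosen only to cover a positive integer, a positive rational and a negative power):

import sympy as sp

x = sp.symbols('x', positive=True)
for n in (5, sp.Rational(3, 7), -2):
    lhs = sp.diff(x**n, x)
    rhs = n * x**(n - 1)
    print(n, sp.simplify(lhs - rhs))    # prints 0 for each exponent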
Version: 3 Owner: alek thiery Author(s): alek thiery, Logan
1.70 proof of primitive element theorem
Let f_a ∈ K[x], respectively f_b ∈ K[x], be the monic irreducible polynomial satisfied by a, respectively b. If L is an extension of K that splits f_a f_b, then L is normal over K, and so there are a finite number of subfields of L containing K, as many as there are subgroups of Gal(L/K), by the fundamental theorem of Galois theory. Let c_k = a + kb with k ∈ K, and consider the fields K(c_k). Since K is of characteristic 0, there are infinitely many choices for k. But K ⊂ K(c_k) ⊂ K(a, b) ⊂ L, so by the above there are only finitely many distinct fields K(c_k). Therefore, for some distinct k_i, k_j ∈ K, K(c_{k_i}) = K(c_{k_j}). Then c_{k_j} ∈ K(c_{k_i}), and so c_{k_i} − c_{k_j} = (k_i − k_j)b ∈ K(c_{k_i}), and thus b ∈ K(c_{k_i}). Then also a = c_{k_i} − k_i b ∈ K(c_{k_i}), which gives K(a, b) ⊂ K(c_{k_i}). But we also have K(c_{k_i}) ⊂ K(a, b), and thus K(a, b) = K(c_{k_i}), QED.
Version: 1 Owner: sucrose Author(s): sucrose
1.71 proof of product rule
d/dx [f(x)g(x)] = lim_{h→0} [f(x + h)g(x + h) − f(x)g(x)] / h
 = lim_{h→0} [f(x + h)g(x + h) + f(x + h)g(x) − f(x + h)g(x) − f(x)g(x)] / h
 = lim_{h→0} [ f(x + h) (g(x + h) − g(x))/h + g(x) (f(x + h) − f(x))/h ]
 = f(x)g'(x) + f'(x)g(x)
Version: 1 Owner: Logan Author(s): Logan
1.72 proof of sum rule
d/dx [f(x) + g(x)] = lim_{h→0} [f(x + h) + g(x + h) − f(x) − g(x)] / h
 = lim_{h→0} [ (f(x + h) − f(x))/h + (g(x + h) − g(x))/h ]
 = f'(x) + g'(x)
Version: 1 Owner: Logan Author(s): Logan
1.73 proof that countable unions are countable
Let C be a countable collection of countable sets. We will show that ⋃C is countable.

Let P be the set of positive primes. P is countably infinite, so there is a bijection between P and ℕ. Since there is a bijection between C and a subset of ℕ, there must in turn be a one-to-one function f : C → P.

Each S ∈ C is countable, so there exists a bijection g between S and some subset of ℕ. Define a new function h_S : S → ℕ such that for all x ∈ S,

h_S(x) = f(S)^{g(x)}

Note that h_S is one-to-one. Also note that for any distinct pair S, T ∈ C, the range of h_S and the range of h_T are disjoint due to the fundamental theorem of arithmetic.

We may now define a one-to-one function h : ⋃C → ℕ, where, for each x ∈ ⋃C, h(x) = h_S(x) for some S ∈ C where x ∈ S (the choice of S is irrelevant, so long as it contains x).

Since the range of h is a subset of ℕ, h is a bijection onto that set and hence ⋃C is countable.
Version: 2 Owner: vampyr Author(s): vampyr
1.74 quadrature
Quadrature is the computation of a univariate definite integral. It can refer to either
numerical or analytic techniques; one must gather from context which is meant.
Cubature refers to higher-dimensional definite integral computation.
Some numerical quadrature methods are Simpson’s rule, the trapezoidal rule, and Riemann sums.
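A minimal sketch of one such method, composite Simpson's rule (plain Python; the integrand and interval below are only illustrative):

import math

def simpson(f, a, b, n=100):        # n must be even
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3

print(simpson(math.sin, 0, math.pi), 2.0)   # the exact value of the integral is 2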
Version: 4 Owner: akrowne Author(s): akrowne
1.75 quotient module
quotient module
1: X is a ring
2: Y a module over X
3: Z is a submodule of Y
4: Y/Z is the additive group of cosets of Z in Y
5: x(y + Z) = xy + Z module structure
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.76 regular expression
A regular expression is a particular metasyntax for specifying regular grammars, which
has many useful applications.
While variations abound, fundamentally a regular expression consists of the following components. Parentheses can be used for grouping and nesting, and must contain a fully-formed regular expression. The | symbol can be used for denoting alternatives. Some specifications do not provide nesting or alternatives. There are also a number of postfix operators. The ? operator means that the preceding element can either be present or not present, and corresponds to a rule of the form A → B | λ. The * operator means that the preceding element can be present zero or more times, and corresponds to a rule of the form A → BA | λ. The + operator means that the preceding element can be present one or more times, and corresponds to a rule of the form A → BA | B. Note that while these rules are not immediately in regular form, they can be transformed so that they are.
Here is an example of a regular expression that specifies a grammar that generates the binary
representation of all multiples of 3 (and only multiples of 3).
(0*(1(01*0)*1)*)*0*
This specifies the context-free grammar (in BNF):
S ::= A B
A ::= C D
B ::= 0 B | λ
C ::= 0 C | λ
D ::= 1 E 1
E ::= F E | λ
F ::= 0 G 0
G ::= 1 G | λ
A little further work is required to transform this grammar into an acceptable form for
regular grammars, but it can be shown that this grammar (and any grammar specified by a
regular expression) is equivalent to some regular grammar.
Regular expressions have many applications. Quite often they are used for powerful string
matching and substitution features in many text editors and programming languages.
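As an illustrative check (assuming Python's re module, with the pattern given above applied to binary representations):

import re

pattern = re.compile(r'^(0*(1(01*0)*1)*)*0*$')
for k in range(25):
    assert bool(pattern.match(format(k, 'b'))) == (k % 3 == 0)
print("the pattern accepts exactly the binary representations of multiples of 3 below 25")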
Version: 1 Owner: Logan Author(s): Logan
1.77 regular language
A regular grammar is a context-free grammar where all productions must take one of the
following forms (specified here in BNF, λ is the empty string):
<non-terminal> ::= terminal
<non-terminal> ::= terminal <non-terminal>
<non-terminal> ::= λ
A regular language is the set of strings generated by a regular grammar. Regular grammars
are also known as Type-3 grammars in the Chomsky hierarchy.
A regular grammar can be represented by a deterministic or non-deterministic finite automaton.
Such automata can serve to either generate or accept sentences in a particular regular
language. Note that since the set of regular languages is a subset of context-free lan-
guages, any deterministic or non-deterministic finite automaton can be simulated by a
pushdown automaton.
Version: 2 Owner: Logan Author(s): Logan
1.78 right function notation
We are said to be using right function notation if we write functions to the right of their arguments. That is, if α : X → Y is a function and x ∈ X, then xα is the image of x under α.

Furthermore, if we have a function β : Y → Z, then we write the composition of the two functions as αβ : X → Z, and the image of x under the composition as xαβ = x(αβ) = (xα)β.
Compare this to left function notation.
Version: 1 Owner: antizeus Author(s): antizeus
1.79 ring homomorphism
Let R and S be rings. A ring homomorphism is a function f : R → S such that:

• f(a + b) = f(a) + f(b) for all a, b ∈ R

• f(a · b) = f(a) · f(b) for all a, b ∈ R

When working in a context in which all rings have a multiplicative identity, one also requires that f(1_R) = 1_S.
Version: 3 Owner: djao Author(s): djao
1.80 scalar
A scalar is a quantity that is invariant under coordinate transformation, also known as a
tensor of rank 0. For example, the number 1 is a scalar, and so is any number or variable s ∈ ℝ.
The point (3, 4) is not a scalar, because it varies under rotation. As such, a scalar can
be an element of a field over which a vector space is defined.
Version: 3 Owner: slider142 Author(s): slider142
1.81 schrodinger operator
schrodinger operator
1: V : ℝ → ℝ
2: y ↦ −d²y/dx² + V(x)y
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.82 selection sort
The Problem
See the Sorting Problem.
The Algorithm
Suppose L = {x_1, x_2, ..., x_n} is the initial list of unsorted elements. The selection sort algorithm sorts this list in n steps. At each step i, find the largest element L[j] such that j ≤ n − i + 1, and swap it with the element at L[n − i + 1]. So, for the first step, find the largest value in the list and swap it with the last element in the list. For the second step, find the largest value in the list up to (but not including) the last element, and swap it with the next to last element. This is continued for n − 1 steps. Thus the selection sort algorithm is a very simple, in-place sorting algorithm.
Pseudocode
Algorithm Selection Sort(L, n)
Input: A list L of n elements
Output: The list L in sorted order
begin
  for i ← n downto 2 do
  begin
    temp ← L[i]
    max ← 1
    for j ← 2 to i do
      if L[j] > L[max] then
        max ← j
    L[i] ← L[max]
    L[max] ← temp
  end
end
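A direct Python transcription of the pseudocode, offered as an illustrative sketch (0-based indexing replaces the 1-based indexing above):

def selection_sort(L):
    n = len(L)
    for i in range(n - 1, 0, -1):          # i = n-1, ..., 1 (0-based version of "n downto 2")
        max_index = 0
        for j in range(1, i + 1):          # scan L[0..i] for the largest element
            if L[j] > L[max_index]:
                max_index = j
        L[i], L[max_index] = L[max_index], L[i]   # swap it into position i
    return L

print(selection_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]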
Analysis

The selection sort algorithm has the same runtime for any set of n elements, no matter what the values or order of those elements are. Finding the maximum element of a list of i elements requires i − 1 comparisons. Thus T(n), the number of comparisons required to sort a list of n elements with the selection sort, can be found:

T(n) = Σ_{i=2}^{n} (i − 1) = (Σ_{i=1}^{n} i) − n = (n² − n)/2 = O(n²)

However, the number of data movements is the number of swaps required, which is n − 1. This algorithm is very similar to the insertion sort algorithm. It requires fewer data movements, but requires more comparisons.
Version: 1 Owner: Logan Author(s): Logan
1.83 semiring
A semiring is an algebra (A, ·, +, 0, 1) on a set A, where 0 and 1 are constants, (A, ·, 1) is a monoid, (A, +, 0) is a commutative monoid, · distributes over + from the left and the right, and 0 is both a left and right annihilator (0a = a0 = 0). Often a · b is written simply as ab, and the semiring (A, ·, +, 0, 1) simply as A.

The relation ≤ on a semiring A is defined as a ≤ b if and only if there exists some c ∈ A such that a + c = b, and is a quasiordering. If + is idempotent over A (that is, a + a = a holds for all a ∈ A), then ≤ is a partial ordering.

Addition and (left and right) multiplication are monotonic operators with respect to ≤, with 0 as the minimal element.
Version: 2 Owner: Logan Author(s): Logan
1.84 simple function
Let (X, ℬ) be a measurable space. Let χ_{A_k}, k = 1, 2, ..., n be the characteristic functions of sets A_k ∈ ℬ. We call h a simple function if it can be written as

h = Σ_{k=1}^{n} c_k χ_{A_k},  c_k ∈ ℝ,   (1.84.1)

for some n ∈ ℕ.
Version: 2 Owner: drummond Author(s): drummond
1.85 simple path
A simple path in a graph is a path that contains no vertex more than once. By definition,
cycles are particular instances of simple paths.
Version: 1 Owner: Logan Author(s): Logan
1.86 solutions of an equation
solutions of an equation
1: {x | f(x) = 0}
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: Thomas Heye Author(s): apmxi
1.87 spanning tree
A spanning tree of a (connected) graph G is a connected, acyclic subgraph of G that
contains all of the vertices of G. Below is an example of a spanning tree 1, where the edges
in 1 are drawn as solid lines and the edges in G but not in 1 are drawn as dotted lines.
[Figure: a graph G whose spanning tree T is drawn with solid edges; the edges of G not in T are drawn as dotted lines.]
For any tree there is exactly one spanning tree: the tree itself.
Version: 2 Owner: Logan Author(s): Logan
1.88 square root
The square root of a non-negative real number x, written as √x, is the real number y such that y² = x. Equivalently, (√x)² = x. Or, √x · √x = x.

Example: √9 = 3 because 3² = 3 · 3 = 9.

Example: √(x² + 2x + 1) = x + 1 because (x + 1)² = (x + 1)(x + 1) = x² + x + x + 1 = x² + 2x + 1.

In some situations it is better to allow two values for √x. For example, √4 = ±2, because 2² = 4 and (−2)² = 4.

The square root operation is distributive over multiplication and division, but not over addition and subtraction.

That is, √(xy) = √x · √y, and √(x/y) = √x / √y.

However, in general, √(x + y) ≠ √x + √y and √(x − y) ≠ √x − √y.
Example: √(x²y²) = xy because (xy)² = xy · xy = x · x · y · y = x²y².

Example: √(9/25) = 3/5 because (3/5)² = 3²/5² = 9/25.

The square root notation is actually an alternative to exponentiation. That is, √x ≡ x^{1/2}. As such, the square root operation is associative with exponentiation. That is, √(x³) = x^{3/2} = (√x)³.

Negative real numbers do not have real square roots. For example, √−4 is not a real number. Proof by contradiction: Suppose √−4 = x ∈ ℝ. If x is negative, x² is positive. But if x is positive, x² is also positive. But x cannot be zero either, because 0² = 0. So √−4 ∉ ℝ.

For additional discussion of the square root and negative numbers, see the discussion of complex numbers.
Version: 9 Owner: wberry Author(s): wberry
1.89 stable sorting algorithm
A stable sorting algorithm is any sorting algorithm that preserves the relative ordering of
items with equal values. For instance, consider a list of ordered pairs L := {(A, 3), (B, 5), (C, 2), (D, 5), (E, 4)}. If a stable sorting algorithm sorts L on the second value in each pair using the < relation, then the result is guaranteed to be {(C, 2), (A, 3), (E, 4), (B, 5), (D, 5)}. However, if an algorithm is not stable, then it is possible that (D, 5) may come before (B, 5) in the sorted output.
Some examples of stable sorting algorithms are bubblesort and mergesort (although the
stability of mergesort is dependent upon how it is implemented). Some examples of unstable
sorting algorithms are heapsort and quicksort (quicksort could be made stable, but then it
wouldn’t be quick any more). Stability is a useful property when the total ordering relation
is dependent upon initial position. Using a stable sorting algorithm means that sorting by
ascending position for equal keys is built-in, and need not be implemented explicitly in the
comparison operator.
Version: 3 Owner: Logan Author(s): Logan
1.90 standard deviation
Given a random variable X, the standard deviation of X is defined as

SD[X] = √(Var[X]).

The standard deviation is a measure of the variation of X around the expected value.
Version: 1 Owner: Riemann Author(s): Riemann
1.91 stochastic independence
The random variables X_1, X_2, ..., X_n are stochastically independent (or just independent) if

f_{X_1,...,X_n}(x_1, ..., x_n) = f_{X_1}(x_1) ··· f_{X_n}(x_n)  for all (x_1, ..., x_n) ∈ ℝⁿ

That is, the random variables X_1, ..., X_n are independent if their joint distribution function can be expressed as the product of the marginal distributions of the variables, evaluated at the corresponding points.

This definition implies all of the following:

1. F_{X_1,...,X_n}(x_1, ..., x_n) = F_{X_1}(x_1) ··· F_{X_n}(x_n) for all (x_1, ..., x_n) ∈ ℝⁿ (joint cumulative distribution)

2. M_{X_1+···+X_n}(t) = M_{X_1}(t) ··· M_{X_n}(t) for all t (moment generating function)

3. E[∏_{i=1}^{n} X_i] = ∏_{i=1}^{n} E[X_i] (expectation)

However, only the first two above imply independence. See also correlation.

There are other definitions of independence, too.
Version: 3 Owner: Riemann Author(s): Riemann
1.92 substring
Given a string s ∈ Σ*, a string t is a substring of s if s = utv for some strings u, v ∈ Σ*.

For example, lp, al, ha, alpha, and λ (the empty string) are all substrings of the string alpha.
Version: 2 Owner: Logan Author(s): Logan
1.93 successor
Given a set S, the successor of S is the set S ∪ {S}. One often denotes the successor of S by S'.
Version: 1 Owner: djao Author(s): djao
1.94 sum rule
The sum rule states that

d/dx [f(x) + g(x)] = f'(x) + g'(x)

Proof

See the proof of the sum rule.

Examples

d/dx (x + 1) = d/dx x + d/dx 1 = 1

d/dx (x² − 3x + 2) = d/dx x² + d/dx (−3x) + d/dx (2) = 2x − 3

d/dx (sin x + cos x) = d/dx sin x + d/dx cos x = cos x − sin x
Version: 3 Owner: Logan Author(s): Logan
1.95 superset
Given two sets A and B, A is a superset of B if every element in B is also in A. We denote this relation as A ⊇ B. This is equivalent to saying that B is a subset of A, that is, A ⊇ B ⇔ B ⊆ A.

Rules similar to those that hold for ⊆ also hold for ⊇. If X ⊇ Y and Y ⊇ X, then X = Y. Every set is a superset of itself, and every set is a superset of the empty set.

A is a proper superset of B if A ⊇ B and A ≠ B. This relation is often denoted A ⊃ B. Unfortunately, A ⊃ B is often used to mean the more general superset relation, and thus it should be made explicit when a proper superset is intended.
Version: 2 Owner: Logan Author(s): Logan
1.96 symmetric polynomial
A polynomial f ∈ R[x_1, ..., x_n] in n variables with coefficients in a ring R is symmetric if σ(f) = f for every permutation σ of the set {x_1, ..., x_n}.

Every symmetric polynomial can be written as a polynomial expression in the elementary symmetric polynomials.
Version: 2 Owner: djao Author(s): djao
1.97 the argument principle
the argument principle
1: f meromorphic in Ω
2: ∀ 0 < i ≤ n : f(a_i) = 0
3: ∀ 0 < i ≤ m : f(b_i) = ∞
4: γ cycle
5: γ homologous to zero with respect to Ω
6: ∀ a_i ∉ im(γ), ∀ b_i ∉ im(γ):
(1/(2πi)) ∫_γ f'(z)/f(z) dz = Σ_{j=0}^{n} ind_γ(a_j) − Σ_{k=0}^{m} ind_γ(b_k)
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.98 torsion-free module
torsion-free module
1: R integral domain
2: X left module over R
3: X_t torsion submodule
4: X_t = 0

fact: a finitely generated torsion-free module is a free module

1: X finitely generated R-module
2: X torsion-free
3: X free

(to be fixed)
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 2 Owner: drini Author(s): drini, apmxi
1.99 total order
A total order is a special case of a partial order. If ≤ is a partial order on A, then it satisfies the following three properties:

1. reflexivity: a ≤ a for all a ∈ A

2. antisymmetry: If a ≤ b and b ≤ a for any a, b ∈ A, then a = b

3. transitivity: If a ≤ b and b ≤ c for any a, b, c ∈ A, then a ≤ c

The relation ≤ is a total order if it satisfies the above three properties and the following additional property:

4. comparability: For any a, b ∈ A, either a ≤ b or b ≤ a.
Version: 2 Owner: Logan Author(s): Logan
1.100 tree traversals
A tree traversal is an algorithm for visiting all the nodes in a rooted tree exactly once.
The constraint is on rooted trees, because the root is taken to be the starting point of the
traversal. A traversal is also defined on a forest in the sense that each tree in the forest can
be iteratively traversed (provided one knows the roots of every tree beforehand). This entry
presents a few common and simple tree traversals.
In the description of a tree, the notion of rooted-subtrees was presented. Full understanding
of this notion is necessary to understand the traversals presented here, as each of these
traversals depends heavily upon this notion.
In a traversal, there is the notion of visiting a node. Visiting a node often consists of doing
some computation with that node. The traversals are defined here without any notion of
what is being done to visit a node, and simply indicate where the visit occurs (and most
importantly, in what order).
Examples of each traversal will be illustrated on the following binary tree.
[Figure: the example binary tree, with seven vertices arranged on three levels.]
Vertices will be numbered in the order they are visited, and edges will be drawn with arrows
indicating the path of the traversal.
Preorder Traversal
Given a rooted tree, a preorder traversal consists of first visiting the root, and then
executing a preorder traversal on each of the root’s children (if any).
For example
[Figure: the example tree with its vertices numbered 1-7 in the order visited by the preorder traversal, and arrows indicating the path of the traversal.]
The term preorder refers to the fact that a node is visited before any of its descendents.
A preorder traversal is defined for any rooted tree. As pseudocode, the preorder traversal is
Algorithm PreorderTraversal(x, Visit)
Input: A node x of a binary tree, with children left(x) and right(x), and some computation Visit defined for x
Output: Visits the nodes of the subtree rooted at x in a preorder traversal
begin
  Visit(x)
  PreorderTraversal(left(x), Visit)
  PreorderTraversal(right(x), Visit)
end
Postorder Traversal
Given a rooted tree, a postorder traversal consists of first executing a postorder traversal
on each of the root’s children (if any), and then visiting the root.
For example
[Figure: the example tree with its vertices numbered 1-7 in the order visited by the postorder traversal.]
As with the preorder traversal, the term postorder here refers to the fact that a node is
visited after all of its descendents. A postorder traversal is defined for any rooted tree. As
pseudocode, the postorder traversal is
Algorithm PostorderTraversal(x, Visit)
Input: A node x of a binary tree, with children left(x) and right(x), and some computation Visit defined for x
Output: Visits the nodes of the subtree rooted at x in a postorder traversal
begin
  PostorderTraversal(left(x), Visit)
  PostorderTraversal(right(x), Visit)
  Visit(x)
end
In-order Traversal
Given a binary tree, an in-order traversal consists of executing an in-order traversal on
the root’s left child (if present), then visiting the root, then executing an in-order traversal
on the root’s right child (if present). Thus all of a root’s left descendents are visited before
the root, and the root is visited before any of its right descendents.
For example
[Figure: the example tree with its vertices numbered 1-7 in the order visited by the in-order traversal.]
As can be seen, the in-order traversal has the wonderful property of traversing a tree from
left to right (if the tree is visualized as it has been drawn here). The term in-order comes
from the fact that an in-order traversal of a binary search tree visits the data associated with
the nodes in sorted order. As pseudocode, the in-order traversal is
Algorithm InOrderTraversal(x, Visit)
Input: A node x of a binary tree, with children left(x) and right(x), and some computation Visit defined for x
Output: Visits the nodes of the subtree rooted at x in an in-order traversal
begin
  InOrderTraversal(left(x), Visit)
  Visit(x)
  InOrderTraversal(right(x), Visit)
end
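A compact Python sketch of the three traversals (the Node class and the labels below are only illustrative; printing a node's label plays the role of Visit):

class Node:
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right

def preorder(x):
    if x is None:
        return
    print(x.label, end=' ')      # visit the node before its descendants
    preorder(x.left)
    preorder(x.right)

def postorder(x):
    if x is None:
        return
    postorder(x.left)
    postorder(x.right)
    print(x.label, end=' ')      # visit the node after all of its descendants

def inorder(x):
    if x is None:
        return
    inorder(x.left)
    print(x.label, end=' ')      # visit the node between its left and right subtrees
    inorder(x.right)

root = Node('a', Node('b', Node('d'), Node('e')), Node('c', Node('f'), Node('g')))
for traversal in (preorder, postorder, inorder):
    traversal(root)
    print()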
Version: 3 Owner: Logan Author(s): Logan
1.101 trie
A trie is a digital tree for storing a set of strings in which there is one node for every prefix
of every string in the set. The name comes from the word retrieval, and thus is pronounced
the same as tree (which leads to much confusion when spoken aloud). The word retrieval is
stressed, because a trie has a lookup time that is equivalent to the length of the string being
looked up.
If a trie is to store some set of strings S ⊆ Σ* (where Σ is an alphabet), then it takes the following form. Each edge leading to a non-leaf node in the trie is labelled by an element of Σ. Any edge leading to a leaf node is labelled by $ (some symbol not in Σ). For every string s ∈ S, there is a path from the root of the trie to a leaf, the labels of which, when concatenated, form s ++ $ (where ++ is the string concatenation operator). For every path from the root of the trie to a leaf, the labels of the edges concatenated form some string in S.
Example
Suppose we wish to store the set of strings S := {alpha, beta, bear, beast, beat}. The trie that stores S would be
[Figure: the trie for S. From the root, an edge labelled a starts the path a-l-p-h-a-$ (alpha); an edge labelled b leads to e, which branches into t-a-$ (beta) and a, which in turn branches into r-$ (bear), s-t-$ (beast), and t-$ (beat).]
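A minimal dictionary-based trie in Python (an illustrative sketch using only the standard library; $ marks the end of a stored string, as in the description above):

class Trie:
    def __init__(self):
        self.root = {}

    def insert(self, word):
        node = self.root
        for ch in word + '$':                 # follow/create one node per prefix symbol
            node = node.setdefault(ch, {})

    def contains(self, word):
        node = self.root
        for ch in word + '$':
            if ch not in node:
                return False
            node = node[ch]
        return True

trie = Trie()
for s in ['alpha', 'beta', 'bear', 'beast', 'beat']:
    trie.insert(s)
print(trie.contains('bear'), trie.contains('bea'))   # True False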
Version: 4 Owner: Logan Author(s): Logan
1.102 unit vector
A unit vector is a vector with a length, or vector norm, of one. In ℝⁿ, one can obtain such a vector by dividing a vector by its magnitude |v|. For example, take the vector ⟨1, 2, 3⟩. A unit vector pointing in the same direction is

(1/|⟨1, 2, 3⟩|) ⟨1, 2, 3⟩ = (1/√14) ⟨1, 2, 3⟩ = ⟨1/√14, 2/√14, 3/√14⟩.

The magnitude of this vector is 1.
Version: 7 Owner: slider142 Author(s): slider142
1.103 unstable fixed point
A fixed point is considered unstable if it is neither attracting nor Liapunov stable. A saddle
point is an example of such a fixed point.
Version: 1 Owner: armbrusterb Author(s): armbrusterb
1.104 weak* convergence in normed linear space
weak* convergence in normed linear space
1: (x'_n) ⊂ X'
2: X a Banach space
3: ∃ x' ∈ X' : ∀ x ∈ X : lim_{n→∞} x'_n(x) = x'(x).
4: If X is reflexive, then weak* convergence is the same as weak convergence
Note: This is a “seed” entry written using a short-hand format described in this FAQ.
Version: 1 Owner: bwebste Author(s): apmxi
1.105 well-ordering principle for natural numbers
Every nonempty set S of nonnegative integers contains a least element; that is, there is some integer a in S such that a ≤ b for all b belonging to S.
For example, the positive integers are a well-ordered set under the standard order.
Version: 5 Owner: KimJ Author(s): KimJ
Chapter 2
00-01 – Instructional exposition
(textbooks, tutorial papers, etc.)
2.1 dimension
The word dimension in mathematics has many definitions, but all of them are trying to
quantify our intuition that, for example, a sheet of paper has somehow one less dimension
than a stack of papers.
One common way to define dimension is through some notion of a number of independent
quantities needed to describe an element of an object. For example, it is natural to say
that the sheet of paper is two-dimensional because one needs two real numbers to specify
a position on the sheet, whereas the stack of papers is three-dimensional because a position
in a stack is specified by a sheet and a position on the sheet. Following this notion, in
linear algebra the dimension of a vector space is defined as the minimal number of vectors
such that every other vector in the vector space is representable as a sum of these. Similarly,
the word rank denotes various dimension-like invariants that appear throughout the algebra.
However, if we try to generalize this notion to the mathematical objects that do not possess
an algebraic structure, then we run into a difficulty. From the point of view of set theory
there are as many real numbers as pairs of real numbers since there is a bijection from real
numbers to pairs of real numbers. To distinguish a plane from a cube one needs to impose
restrictions on the kind of mapping. Surprisingly, it turns out that continuity is not enough, as was pointed out by Peano: there are continuous functions that map a square onto a cube. So, in topology one uses another intuitive notion, that in a high-dimensional space there are more directions than in a low-dimensional one. Hence, the (Lebesgue covering) dimension of a topological space is defined as the smallest number d such that every covering of the space by open sets can be refined so that no point is contained in more than d + 1 sets. For example, no matter how one covers a sheet of paper by sufficiently small other sheets of paper (the sheets may overlap one another, but must not merely touch along their edges), one will always find a point that is covered by 2 + 1 = 3 sheets.
Another definition of dimension rests on the idea that higher-dimensional objects are in some
sense larger than the lower-dimensional ones. For example, to cover a cube with a side length 2 one needs at least 2³ = 8 cubes with a side length 1, but a square with a side length 2 can be covered by only 2² = 4 unit squares. Let N(ε) be the minimal number of open balls in any covering of a bounded set S by balls of radius ε. The Besicovitch-Hausdorff dimension of S is defined as −lim_{ε→0} log_ε N(ε). The Besicovitch-Hausdorff dimension is not always
defined, and when defined it might be non-integral.
Version: 4 Owner: bbukh Author(s): bbukh
2.2 toy theorem
A toy theorem is a simplified version of a more general theorem. For instance, by intro-
ducing some simplifying assumptions in a theorem, one obtains a toy theorem.
Usually, a toy theorem is used to illustrate the claim of a theorem. It can also be illustrative
and insightful to study proofs of a toy theorem derived from a non-trivial theorem. Toy
theorems also have great educational value. After presenting a theorem (with, say, a highly
non-trivial proof), one can sometimes give some assurance that the theorem really holds, by
proving a toy version of the theorem.
For instance, a toy theorem of Brouwer fixed point theorem is obtained by restricting the
dimension to one. In this case, the Brouwer fixed point theorem follows almost immediately
from the intermediate value theorem (see this page).
Version: 1 Owner: matte Author(s): matte
Chapter 3
00-XX – General
3.1 method of exhaustion
The method of exhaustion is calculating an area by approximating it by the areas of a
sequence of polygons.
For example, filling up the interior of a circle by inscribing polygons with more and more
sides.
Version: 1 Owner: vladm Author(s): vladm
Chapter 4
00A05 – General mathematics
4.1 Conway’s chained arrow notation
Conway’s chained arrow notation is a way of writing numbers even larger than those
provided by the up arrow notation. We define : → : → j = :
(p+2)
: = :↑ ↑
. .. .
p
: and
: →: = : →: →1 = :
n
. Longer chains are evaluated by
: → →: →j →1 = : → →: →j
: → →: →1 →¡ = : → →:
and
: → →: →j + 1 →¡ + 1 = : → →: →(: → →: →j →¡ + 1) →¡
For example:
3 →3 →2 =
3 →(3 →2 →2) →1 =
3 →(3 →2 →2) =
3 →(3 →(3 →1 →2) →1) =
3 →(3 →3 →1) =
3
3
3
=
3
27
= 7625597484987
69
A much larger example is:
3 →2 →4 →4 =
3 →2 →(3 →2 →3 →4) →3 =
3 →2 →(3 →2 →(3 →2 →2 →4) →3) →3 =
3 →2 →(3 →2 →(3 →2 →(3 →2 →1 →4) →3) →3) →3 =
3 →2 →(3 →2 →(3 →2 →(3 →2) →3) →3) →3 =
3 →2 →(3 →2 →(3 →2 →9 →3) →3) →3
Clearly this is going to be a very large number. Note that, as large as it is, it is proceeding
towards an eventual final evaluation, as evidenced by the fact that the final number in the
chain is getting smaller.
Version: 4 Owner: Henry Author(s): Henry
4.2 Knuth’s up arrow notation
Knuth’s up arrow noation is a way of writing numbers which would be unwieldy in
standard decimal notation. It expands on the exponential notation : ↑ : = :
n
. Define
: ↑↑ 0 = 1 and : ↑↑ : = : ↑ (: ↑↑ [: −1]).
Obviously : ↑↑ 1 = :
1
= :, so 3 ↑↑ 2 = 3
3↑↑1
= 3
3
= 27, but 2 ↑↑ 3 = 2
2↑↑2
= 2
2
2↑↑1
=
2
(2
2
)
= 16.
In general, : ↑↑ : = :
m
···
m
, a tower of height :.
Clearly, this process can be extended: : ↑↑↑ 0 = 1 and : ↑↑↑ : = : ↑↑ (: ↑↑↑ [: −1]).
An alternate notation is to write :
(i)
: for : ↑ ↑
. .. .
i−2 times
:. (i−2 times because then :
(2)
: = ::
and :
(1)
: = :+ :.) Then in general we can define :
(i)
: = :
(i−1)
(:
(i)
(: −1)).
To get a sense of how quickly these numbers grow, 3 ↑↑↑ 2 = 3 ↑↑ 3 is more than seven and
a half trillion, and the numbers continue to grow much more than exponentially.
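A direct recursive transcription of these definitions (plain Python; only very small inputs are practical, since the values explode):

def up_arrow(m, n, arrows=1):
    if arrows == 1:
        return m ** n                  # one arrow is ordinary exponentiation
    if n == 0:
        return 1
    return up_arrow(m, up_arrow(m, n - 1, arrows), arrows - 1)

print(up_arrow(3, 2, 2), up_arrow(2, 3, 2), up_arrow(2, 2, 3))   # 27, 16, 4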
Version: 3 Owner: Henry Author(s): Henry
4.3 arithmetic progression
An arithmetic progression of length n, initial term a_1 and common difference d is the sequence

a_1, a_1 + d, a_1 + 2d, ..., a_1 + (n − 1)d.

The sum of the terms of an arithmetic progression can be computed using Gauss's trick:

 S = (a_1) + (a_1 + d) + ⋯ + (a_1 + (n − 1)d)
+S = (a_1 + (n − 1)d) + ⋯ + (a_1 + d) + (a_1)
2S = (2a_1 + (n − 1)d) + ⋯ + (2a_1 + (n − 1)d)

We just add the sum to itself written backwards, and the sum of each of the columns equals (2a_1 + (n − 1)d). The sum is then

S = (2a_1 + (n − 1)d) n / 2.
Version: 3 Owner: bbukh Author(s): bbukh
4.4 arity
The arity of something is the number of arguments it takes. This is usually applied to
functions: an :-ary function is one that takes : arguments. Unary is a synonym for 1-ary,
and binary for 2-ary.
Version: 1 Owner: Henry Author(s): Henry
4.5 introducing 0th power
Let a be a number. Then for all n ∈ ℕ, aⁿ is the product of n a's. For integers (and their extensions) we have a "multiplicative identity" called "1", i.e. a · 1 = a for all a. So we can write

aⁿ = a^{n+0} = aⁿ · 1.

From the definition of the power of a the usual laws can be derived; so it is plausible to set a⁰ = 1, since 0 doesn't change a sum, like 1 doesn't change a product.
Version: 4 Owner: Thomas Heye Author(s): Thomas Heye
4.6 lemma
There is no technical distinction between a lemma and a theorem. A lemma is a proven
statement, typically named a lemma to distinguish it as a truth used as a stepping stone to
a larger result rather than an important statement in and of itself. Of course, some of the
most powerful statements in mathematics are known as lemmas, including Zorn’s lemma,
Bezout’s lemma, Gauss’ lemma, Fatou’s lemma, etc., so one clearly can’t get too much sim-
ply by reading into a proposition’s name.
According to [1], the plural ’Lemmas’ is commonly used. The correct plural of lemma,
however, is lemmata.
REFERENCES
1. N. Higham, Handbook of Writing for the Mathematical Sciences, Society for Industrial and
Applied Mathematics, 1998 (p. 16).
Version: 5 Owner: mathcam Author(s): mathcam
4.7 property
Given each element of a set $A$, a property is either true or false. Formally, a property is a map $P : A \to \{\text{true}, \text{false}\}$.
Any property gives rise in a natural way to the set $\{x : x \text{ has the property } P\}$
and the corresponding characteristic function.
Version: 3 Owner: fibonaci Author(s): bbukh, fibonaci, apmxi
4.8 saddle point approximation
The saddle point approximation (SPA), a.k.a. stationary phase approximation, is a widely
used method in quantum field theory (QFT) and related fields. Suppose we want to evaluate
the following integral in the limit $\zeta \to \infty$:
$$I = \lim_{\zeta\to\infty} \int_{-\infty}^{\infty} dx\, e^{-\zeta f(x)}. \qquad (4.8.1)$$
The saddle point approximation can be applied if the function $f(x)$ satisfies certain conditions.
Assume that $f(x)$ has a global minimum $f(x_0) = f_{\min}$ at $x = x_0$, which is sufficiently
separated from the other local minima and whose value is sufficiently smaller than the value of
those. Consider the Taylor expansion of $f(x)$ about the point $x_0$:
$$f(x) = f(x_0) + \partial_x f(x)\big|_{x=x_0} (x - x_0) + \frac{1}{2}\, \partial_x^2 f(x)\big|_{x=x_0} (x - x_0)^2 + \mathcal{O}\big((x - x_0)^3\big). \qquad (4.8.2)$$
Since $f(x_0)$ is a (global) minimum, it is clear that $f'(x_0) = 0$. Therefore $f(x)$ may be
approximated to quadratic order as
$$f(x) \approx f(x_0) + \frac{1}{2} f''(x_0)(x - x_0)^2. \qquad (4.8.3)$$
The above assumptions on the minima of $f(x)$ ensure that the dominant contribution to
(4.8.1) in the limit $\zeta \to \infty$ will come from the region of integration around $x_0$:
$$I \approx \lim_{\zeta\to\infty} e^{-\zeta f(x_0)} \int_{-\infty}^{\infty} dx\, e^{-\frac{\zeta}{2} f''(x_0)(x - x_0)^2} \qquad (4.8.4)$$
$$\phantom{I} \approx \lim_{\zeta\to\infty} e^{-\zeta f(x_0)} \left(\frac{2\pi}{\zeta f''(x_0)}\right)^{1/2}.$$
In the last step we have performed the Gaussian integral. The next nonvanishing higher order
correction to (4.8.4) stems from the quartic term of the expansion (4.8.2). This correction
may be incorporated into (4.8.4) to yield (after expanding part of the exponential):
$$I \approx \lim_{\zeta\to\infty} e^{-\zeta f(x_0)} \int_{-\infty}^{\infty} dx\, e^{-\frac{\zeta}{2} f''(x_0)(x - x_0)^2} \left[ 1 - \frac{\zeta}{4!}\, \big(\partial_x^4 f(x)\big)\big|_{x=x_0} (x - x_0)^4 \right]. \qquad (4.8.5)$$
...to be continued with applications to physics...
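As a numerical illustration (a sketch added here, not part of the original entry), the leading-order approximation (4.8.4) can be compared with the integral for a test function chosen for illustration, $f(x) = (x-1)^2 + 0.1(x-1)^4$, which has its global minimum at $x_0 = 1$ with $f(x_0) = 0$ and $f''(x_0) = 2$:

import math

def f(x):
    # test function with global minimum at x0 = 1
    return (x - 1) ** 2 + 0.1 * (x - 1) ** 4

x0, f_x0, fpp_x0 = 1.0, 0.0, 2.0

def integral(zeta, a=-10.0, b=12.0, steps=200000):
    # crude midpoint Riemann sum of exp(-zeta*f(x)) over [a, b]; the tails are negligible
    h = (b - a) / steps
    return sum(math.exp(-zeta * f(a + (k + 0.5) * h)) for k in range(steps)) * h

for zeta in (1.0, 10.0, 100.0):
    spa = math.exp(-zeta * f_x0) * math.sqrt(2 * math.pi / (zeta * fpp_x0))
    print(zeta, integral(zeta), spa)   # agreement improves as zeta grows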
Version: 2 Owner: msihl Author(s): msihl
4.9 singleton
A set consisting of a single element is usually referred to as a singleton.
Version: 2 Owner: Koro Author(s): Koro
4.10 subsequence
If $A$ is a set and $(a_n)_{n\in\mathbb{N}}$ is a sequence in $A$, then a subsequence of $(a_n)$ is a sequence of
the form $(a_{n_r})_{r\in\mathbb{N}}$, where $(n_r)_{r\in\mathbb{N}}$ is a strictly increasing sequence of natural numbers.
Version: 2 Owner: Evandar Author(s): Evandar
4.11 surreal number
The surreal numbers are a generalization of the reals. Each surreal number consists of two
parts (called the left and right), each of which is a set of surreal numbers. For any surreal
number $N$, these parts can be called $N_L$ and $N_R$. (This could be viewed as an ordered pair of
sets; however, the surreal numbers were intended to be a basis for mathematics, not something
to be embedded in set theory.) A surreal number is written $N = \langle N_L \mid N_R \rangle$.
Not every number of this form is a surreal number. The surreal numbers satisfy two additional
properties. First, if $x \in N_L$ and $y \in N_R$ then $x < y$. Secondly, they must be well
founded. These properties are both satisfied by the following construction of the surreal
numbers and the $\leq$ relation by mutual induction:
$\langle \;\mid\; \rangle$, which has both left and right parts empty, is 0.
Given two (possibly empty) sets of surreal numbers $L$ and $R$ such that $x < y$ for any $x \in L$ and
$y \in R$, $\langle L \mid R \rangle$ is a surreal number.
Define $N \leq M$ if there is no $x \in N_L$ such that $M \leq x$ and no $y \in M_R$ such that $y \leq N$.
This process can be continued transfinitely, to define infinite and infinitesimal numbers. For
instance if $\mathbb{Z}$ is the set of integers then $\omega = \langle \mathbb{Z} \mid \rangle$. Note that this does not make equality the
same as identity: $\langle -1 \mid 1 \rangle = \langle \;\mid\; \rangle = 0$, for instance.
It can be shown that $N$ is "sandwiched" between the elements of $N_L$ and $N_R$: it is larger
than any element of $N_L$ and smaller than any element of $N_R$.
Addition of surreal numbers is defined by
$$N + M = \big\langle\; \{ N + x \mid x \in M_L \} \cup \{ M + y \mid y \in N_L \} \;\big|\; \{ N + x \mid x \in M_R \} \cup \{ M + y \mid y \in N_R \} \;\big\rangle.$$
It follows that $-N = \langle -N_R \mid -N_L \rangle$.
The definition of multiplication can be written more easily by defining $N \cdot M_L = \{ N \cdot x \mid x \in M_L \}$, and similarly for $M_R$, $N_L$ and $N_R$ (and for products of two parts, such as $N_L M_L = \{ x \cdot y \mid x \in N_L,\ y \in M_L \}$).
Then
$$N M = \big\langle\; M N_L + N M_L - N_L M_L,\;\; M N_R + N M_R - N_R M_R \;\big|\; M N_L + N M_R - N_L M_R,\;\; M N_R + N M_L - N_R M_L \;\big\rangle$$
The surreal numbers satisfy the axioms for a field under addition and multiplication (whether
they really are a field is complicated by the fact that they are too large to be a set).
The integers of surreal mathematics are called the omnific integers. In general a positive
integer $n$ can always be written $\langle n-1 \mid \rangle$, and so $-n = \langle \mid 1-n \rangle = \langle \mid (-n)+1 \rangle$. So for
instance $1 = \langle 0 \mid \rangle$.
In general, $\langle a \mid b \rangle$ is the simplest number between $a$ and $b$. This can easily be used to define
the dyadic fractions: for any integer $a$, $a + \frac{1}{2} = \langle a \mid a+1 \rangle$. Then $\frac{1}{2} = \langle 0 \mid 1 \rangle$, $\frac{1}{4} = \langle 0 \mid \frac{1}{2} \rangle$,
and so on. This can then be used to locate non-dyadic fractions by pinning them between
a left part which gets infinitely close from below and a right part which gets infinitely close
from above.
Ordinal arithmetic can be defined starting with $\omega$ as defined above and adding numbers
such as $\langle \omega \mid \rangle = \omega + 1$ and so on. Similarly, a starting infinitesimal can be found as
$\langle 0 \mid 1, \frac{1}{2}, \frac{1}{4}, \ldots \rangle = \frac{1}{\omega}$, and again more can be developed from there.
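To make the mutual induction concrete, here is a small Python sketch (an illustration added to this entry) of the $\leq$ relation and of equality-as-value for finitely generated surreal numbers:

class Surreal:
    # A surreal number given by finite left and right sets of Surreals.
    def __init__(self, left=(), right=()):
        self.L, self.R = tuple(left), tuple(right)

    def __le__(self, other):
        # N <= M iff no x in N_L with M <= x, and no y in M_R with y <= N.
        return (not any(other <= x for x in self.L) and
                not any(y <= self for y in other.R))

    def eq(self, other):
        # equal as values (not necessarily identical as objects)
        return self <= other and other <= self

zero = Surreal()                            # < | >
one = Surreal(left=[zero])                  # <0 | >
minus_one = Surreal(right=[zero])           # < | 0>
half = Surreal(left=[zero], right=[one])    # <0 | 1>

print(zero.eq(Surreal(left=[minus_one], right=[one])))   # True: <-1 | 1> = 0
print(half <= one, one <= half)                          # True False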
Version: 5 Owner: Henry Author(s): Henry
Chapter 5
00A07 – Problem books
5.1 Nesbitt’s inequality
Nesbitt's inequality says that for positive real $a$, $b$ and $c$ we have
$$\frac{a}{b+c} + \frac{b}{a+c} + \frac{c}{a+b} \geq \frac{3}{2}.$$
Version: 2 Owner: mathwizard Author(s): mathwizard
5.2 proof of Nesbitt’s inequality
Starting from Nesbitt's inequality
$$\frac{a}{b+c} + \frac{b}{a+c} + \frac{c}{a+b} \geq \frac{3}{2}$$
we transform the left hand side:
$$\frac{a+b+c}{b+c} + \frac{a+b+c}{a+c} + \frac{a+b+c}{a+b} - 3 \geq \frac{3}{2}.$$
Now this can be transformed into:
$$\big((a+b) + (a+c) + (b+c)\big)\left( \frac{1}{a+b} + \frac{1}{a+c} + \frac{1}{b+c} \right) \geq 9.$$
Division by 3 and by the right factor yields:
$$\frac{(a+b) + (a+c) + (b+c)}{3} \geq \frac{3}{\dfrac{1}{a+b} + \dfrac{1}{a+c} + \dfrac{1}{b+c}}.$$
Now on the left we have the arithmetic mean and on the right the harmonic mean, so this
inequality is true.
Version: 2 Owner: mathwizard Author(s): mathwizard
Chapter 6
00A20 – Dictionaries and other
general reference works
6.1 completing the square
Let us consider the expression $x^2 + xy$, where $x$ and $y$ are real (or complex) numbers. Using
the formula
$$(x+y)^2 = x^2 + 2xy + y^2$$
we can write
$$\begin{aligned}
x^2 + xy &= x^2 + xy + 0\\
&= x^2 + xy + \frac{y^2}{4} - \frac{y^2}{4}\\
&= \Big(x + \frac{y}{2}\Big)^2 - \frac{y^2}{4}.
\end{aligned}$$
This manipulation is called completing the square [3] in $x^2 + xy$, or completing the square $x^2$.
Replacing $y$ by $-y$, we also have
$$x^2 - xy = \Big(x - \frac{y}{2}\Big)^2 - \frac{y^2}{4}.$$
Here are some applications of this method:
• Derivation of the solution formula to the quadratic equation.
• Completing the square can also be used to find the extremal value of a quadratic
polynomial [2] without calculus. Let us illustrate this for the polynomial $p(x) = 4x^2 + 8x + 9$. Completing the square yields
$$p(x) = (2x+2)^2 - 4 + 9 = (2x+2)^2 + 5 \geq 5,$$
since $(2x+2)^2 \geq 0$. Here, equality holds if and only if $x = -1$. Thus $p(x) \geq 5$ for all
$x \in \mathbb{R}$, and $p(x) = 5$ if and only if $x = -1$. It follows that $p(x)$ has a global minimum
at $x = -1$, where $p(-1) = 5$.
• Completing the square can also be used as an integration technique, to integrate, say,
$\frac{1}{4x^2+8x+9}$ [3]; a worked sketch is given below.
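For instance (a worked example added here for concreteness), completing the square in the denominator gives
$$\int \frac{dx}{4x^2+8x+9} = \int \frac{dx}{(2x+2)^2 + 5} = \frac{1}{2\sqrt{5}} \arctan\!\left(\frac{2x+2}{\sqrt{5}}\right) + C,$$
using the substitution $u = 2x+2$, $du = 2\,dx$.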
REFERENCES
1. R. Adams, Calculus, a complete course, Addison-Wesley Publishers Ltd, 3rd ed.
2. J. Thompson, T. Martinsson, Matematik Lexikon (in Swedish), Wahlström & Widstrand,
1991.
(Does anyone have an English reference?)
Version: 7 Owner: mathcam Author(s): matte
Chapter 7
00A99 – Miscellaneous topics
7.1 QED
The term “QED” is actually an abbreviation and stands for the Latin quod erat demon-
strandum, meaning “which was to be demonstrated.”
QED typically is used to signify the end of a mathematical proof. The symbol
$\Box$
is often used in place of “QED,” and is called the “Halmos symbol” after mathematician
Paul Halmos (it can vary in width, however, and sometimes it is fully or partially shaded).
Halmos borrowed this symbol from magazines, where it was used to denote “end of article.”
Version: 3 Owner: akrowne Author(s): akrowne
7.2 TFAE
The abbreviation “TFAE” is shorthand for “the following are equivalent”. It is used before
a set of equivalent conditions (each implies all the others).
In a definition, when one of the conditions is somehow “better” (simpler, shorter, ...), it
makes sense to phrase the definition with that condition, and mention that the others are
equivalent. “TFAE” is typically used when none of the conditions can take priority over the
others. Actually proving the claimed equivalence must, of course, be done separately.
Version: 1 Owner: ariels Author(s): ariels
7.3 WLOG
“WLOG” (or “WOLOG”) is an acronym which stands for “without loss of generality.”
WLOG is invoked in situations where some property of a model or system is invariant
under the particular choice of instance attributes, but for the sake of demonstration, these
attributes must be fixed.
For example, we might be discussing properties of a segment (open or closed) of the real number
line. Due to the nature of the reals, we can select endpoints $a$ and $b$ without loss of generality.
Nothing about our discussion of this segment depends on the choice of $a$ or $b$. Of
course, any segment does actually have specific endpoints, so it may help to actually select
some (say 0 and 1) for clarity.
WLOG can also be invoked to shorten proofs where there are a number of choices of config-
uration, but the proof is “the same” for each of them. We need only walk through the proof
for one of these configurations, and “WLOG” serves as a note that we haven’t lost anything
in the choosing.
Version: 2 Owner: akrowne Author(s): akrowne
7.4 order of operations
The order of operations is a convention that tells us how to evaluate mathematical expres-
sions (these could be purely numerical). The problem arises because expressions consist of
operators applied to variables or values (or other expressions) that each demand individual
evaluation, yet the order in which these individual evaluations are done leads to different
outcomes.
A conventional order of operations solves this. One could technically do without memorizing
this convention, but the only alternative is to use parentheses to group every single term of
an expression and evaluate the innermost operations first.
For example, in the expression $a \cdot b + c$, how do we know whether to apply multiplication or
addition first? We could interpret even this simple expression in two drastically different ways:
1. Add $b$ and $c$,
2. Multiply the sum from (1) with $a$.
or
1. Multiply $a$ and $b$,
2. Add to the product in (1) the value of $c$.
One can see the different outcomes for the two cases by selecting some different values for $a$,
$b$, and $c$. The issue is resolved by the convention on the order of operations: the correct evaluation
would be the second one, as the short sketch below illustrates.
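A quick Python sketch of the two readings (added here, with hypothetical values $a=2$, $b=3$, $c=4$):

a, b, c = 2, 3, 4

interpretation_1 = a * (b + c)   # add first, then multiply: 14
interpretation_2 = (a * b) + c   # multiply first, then add: 10
unparenthesized  = a * b + c     # the usual convention gives 10

print(interpretation_1, interpretation_2, unparenthesized)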
The nearly universal mathematical convention dictates the following order of operations (in
order of which operators should be evaluated first):
1. Factorial.
2. Exponentiation.
3. Multiplication.
4. Division.
5. Addition.
Any parenthesized expressions are automatically higher “priority” than anything on the
above list.
There is also the problem of what order to evaluate repeated operators of the same type, as
in:
$$a \cdot b \cdot c \cdot d$$
The solution in this problem is typically to assume the left-to-right interpretation. For the
above, this would lead to the following evaluation:
$$\big(\big((a \cdot b) \cdot c\big) \cdot d\big)$$
In other words,
1. Evaluate $a \cdot b$.
2. Evaluate $(1) \cdot c$.
3. Evaluate $(2) \cdot d$.
Note that this isn’t a problem for associative operators such as multiplication or addition in
the reals. One must still proceed with caution, however, as associativity is a notion bound
up with the concept of groups rather than just operators. Hence, context is extremely
important.
For more obscure operations than the ones listed above, parentheses should be used to remove
ambiguity. Completely new operations are typically assumed to have the highest priority,
but the definition of the operation should be accompanied by some sort of explanation of how
it is evaluated in relation to itself. For example, Conway’s chained arrow notation explicitly
defines what order repeated applications of itself should be evaluated in (it is right-to-left
rather than left-to-right)!
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 8
01A20 – Greek, Roman
8.1 Roman numerals
Roman numerals are a method of writing numbers employed primarily by the ancient
Romans. In place of digits, the Romans used letters to represent the numbers central to the
system:
I 1
V 5
X 10
L 50
C 100
D 500
M 1000
Larger numbers can be made by writing a bar over the letter, which means one thousand
times as much. For instance $\bar{V}$ is 5000.
Other numbers were written by putting letters together. For instance II means 2. Larger
letters go on the left, so LII is 52, but IIL is not a valid Roman numeral.
One additional rule allows a letter to the left of a larger letter to signify subtracting the
smaller from the larger. For instance IV is 4. This can only be done once; 3 is written III,
not IIV. Also, it is generally required that the smaller letter be the one immediately smaller
than the larger, so 1999 is usually written MCMXCIX, not MIM.
It is worth noting that today it is usually considered incorrect to repeat a letter four times,
so IV is preferred to IIII. However, many older monuments do not use the subtraction rule
at all, so 44 was written XXXXIIII instead of the now preferable XLIV.
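As an illustration of the subtractive convention (a small sketch added to this entry), a Python function converting a positive integer below 4000 into a Roman numeral:

def to_roman(n):
    # Convert 0 < n < 4000 to a Roman numeral using the subtraction rule.
    values = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
              (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
              (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, letters in values:
        while n >= value:
            out.append(letters)
            n -= value
    return "".join(out)

print(to_roman(44))    # XLIV
print(to_roman(1999))  # MCMXCIX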
Version: 3 Owner: Henry Author(s): Henry
Chapter 9
01A55 – 19th century
9.1 Poincaré, Jules Henri
Jules Henri Poincaré was born on April 29th, 1854 in Cité Ducale[BA], a neighborhood in
Nancy, a city in France. He was the son of Dr Léon Poincaré (1828-1892), who was a
professor at the University of Nancy in the faculty of medicine.[14] His mother, Eugénie
Launois (1830-1897), was described as a "gifted mother"[6] who gave special instruction to
her son. She was 24 and his father 26 years of age when Henri was born[9]. Two years after
the birth of Henri, his sister Aline was born.[6]
In 1862 Henri entered the Lyc´ee of Nancy which is today, called in his honor, the Lyc´ee
Henri Poincar´e. In fact the University of Nancy is also named in his honor. He graduated
from the Lyc´ee in 1871 with a bachelors degree in letters and sciences. Henri was the top
of class in almost all subjects, he did not have much success in music and was described as
“average at best” in any physical activities.[9] This could be blamed on his poor eyesight
and absentmindedness.[4] Later in 1873, Poincar´e entered l’Ecole Polytechnique where he
performed better in mathematics than all the other students. He published his first pa-
per at 20 years of age, titled D´emonstration nouvelle des propri´et´es de l’indicatrice
d’une surface.[3] He graduated from the institution in 1876. The same year he decided
to attend l’Ecole des Mines and graduated in 1879 with a degree in mining engineering.[14]
After his graduation he was appointed as an ordinary engineer in charge of the mining
services in Vesoul. At the same time he was preparing for his doctorate in sciences (not
surprisingly), in mathematics under the supervision of Charles Hermite. Some of Charles
Hermite’s most famous contributions to mathematics are: Hermite’s polynomials, Hermite’s
differential equation, Hermite’s formula of interpolation and Hermitian matrices.[9] Poincar´e,
as expected graduated from the University of Paris in 1879, with a thesis relating to
differential equations. He then became a teacher at the University of Caen, where he taught
analysis. He remained there until 1881. He then was appointed as the “maˆıtre de conf´erences
d’analyse”[14] (professor in charge of analysis conferences) at the University of Paris. Also in
that same year he married Miss Poulain d’Andecy. Together they had four children: Jeanne
born in 1887, Yvonne born in 1889, Henriette born in 1891, and finally L´eon born in 1893.
He had now returned to work at the Ministry of Public Services as an engineer. He was
responsible for the development of the northern railway. He held that position from 1881 to
1885. This was the last job he held in administration for the government of France. In 1893
he was awarded the title of head engineer in charge of the mines. After that his career awards
and position continuously escalated in greatness and quantity. He died two years before the
war on July 17th, 1912, of an embolism at the age of 58. Interestingly, at the beginning of
World War I, his cousin Raymond Poincar´e was the president of the French Republic.
Poincar´e’s work habits have been compared to a bee flying from flower to flower. Poincar´e
was interested in the way his mind worked, he studied his habits. He gave a talk about his
observations in 1908 at the Institute of General Psychology in Paris. He linked his way of
thinking to how he made several discoveries. His mental organization was not only interesting
to him but also to Toulouse, a psychologist of the Psychology Laboratory of the School of
Higher Studies in Paris. Toulouse wrote a book called Henri Poincar´e which was published
in 1910. He discussed Poincar´e’s regular schedule: he worked during the same times each
day in short periods of time. He never spent a long time on a problem since he believed
that the subconscious would continue working on the problem while he worked on another
problem. Toulouse also noted that Poincar´e also had an exceptional memory. In addition he
stated that most mathematicians worked from principle already established while Poincar´e
was the type that started from basic principle each time.[9] His method of thinking is well
summarized as:
Habitu´e `a n´egliger les d´etails et `a ne regarder que les cimes, il passait de l’une `a
l’autre avec une promptitude surprenante et les faits qu’il d´ecouvrait se groupant
d’eux-mˆemes autour de leur centre ´etaient instantan´emant et automatiquement
class´e dans sa m´emoire. (He neglected details and jumped from idea to idea, the
facts gathered from each idea would then come together and solve the problem)
[BA]
The mathematician Darboux claimed he was “un intuitif”(intuitive)[BA], arguing that this
is demonstrated by the fact that he worked so often by visual representation. He did not
care about being rigorous and disliked logic. He believed that logic was not a way to invent
but a way to structure ideas but that logic limits ideas.
Poincaré held philosophical views opposite to those of Bertrand Russell and Gottlob Frege, who
believed that mathematics was a branch of logic. Poincaré strongly disagreed, claiming that
intuition was the life of mathematics. Poincar´e gives an interesting point of view in his book
Science and Hypothesis:
For a superficial observer, scientific truth is beyond the possibility of doubt; the
logic of science is infallible, and if the scientists are sometimes mistaken, this is
only from their mistaking its rule. [12]
Poincar´e believed that arithmetic is a synthetic science. He argued that Peano’s axioms
cannot be proven non-circularly with the principle of induction.[7] He therefore concluded
that arithmetic is a priori synthetic and not analytic. Poincaré then went on to say that
mathematics cannot be deduced from logic since it is not analytic. It is important to
note that even today Poincar´e has not been proven wrong in his argumentation. His views
were the same as those of Kant[8]. However Poincar´e did not share Kantian views in all
branches of philosophy and mathematics. For example in geometry Poincar´e believed that
the structure of non-Euclidean space can be known analytically. He wrote 3 books that made
his philosophies known: Science and Hypothesis, The Value of Science and Science
and Method.
Poincaré's first area of interest in mathematics was the fuchsian functions, which he named after
the mathematician Lazarus Fuchs because Fuchs was known for being a good teacher and had done
a lot of research in differential equations and in the theory of functions. The functions did
not keep the name fuchsian and are today called automorphic. Poincaré actually developed
the concept of those functions as part of his doctoral thesis.[9] An automorphic function is a
function $f(z)$, where $z \in \mathbb{C}$, which is analytic on its domain and which is invariant under
a denumerably infinite group of linear fractional transformations; they are generalizations
of trigonometric functions and elliptic functions.[15] Below Poincaré explains how he
discovered Fuchsian functions:
For fifteen days I strove to prove that there could not be any functions like those
I have since called Fuchsian functions. I was then very ignorant; every day I
seated myself at my work table, stayed an hour or two, tried a great number
of combinations and reached no results. One evening, contrary to my custom, I
drank black coffee and could not sleep. Ideas rose in crowds; I felt them collide
until pairs interlocked, so to speak, making a stable combination. By the next
morning I had established the existence of a class of Fuchsian functions, those
which come from the hypergeometric series; I had only to write out the results,
which took but a few hours. [11]
This is a clear indication of Henri Poincaré's brilliance. Poincaré communicated a great deal with Klein,
another mathematician working on fuchsian functions. They were able to discuss and further
the theory of automorphic (fuchsian) functions. Apparently Klein became jealous of Poincaré's
high opinion of Fuchs's work and ended their relationship on bad terms.
Poincar´e contributed to the field of algebraic topology and published Analysis situs in 1895
which was the first real systematic look at topology. He acquired most of his knowledge
from his work on differential equations. He also formulated the Poincar´e conjecture, one
of the great unsolved mathematics problems. It is currently one of the “Millennium Prize
Problems”. The problem is stated as:
Consider a compact 3-dimensional manifold V without boundary. Is it possible
that the fundamental group of V could be trivial, even though V is not homeomorphic
to the 3-dimensional sphere? [5]
The problem has been attacked by many mathematicians such as Henry Whitehead in 1934,
but without success. Later in the 50’s and 60’s progress was made and it was discovered
that for higher dimension manifolds the problem was easier. (Theorems have been stated for
those higher dimensions by Stephen Smale, John Stallings, Andrew Wallace, and many more.)
[5] Poincar´e also studied homotopy theory, which is the study of topology reduced to various
groups that are algebraically invariant.[9] He introduced the fundamental group in a paper
in 1894, and later stated his famous conjecture. He also did work in analytic functions,
algebraic geometry, and Diophantine problems, where he made important contributions,
as he did in most of the areas he studied.
In 1887, Oscar II, King of Sweden and Norway held a competition to celebrate his sixtieth
birthday and to promote higher learning.[1] The King wanted a contest that would be of
interest so he decided to hold a mathematics competition. Poincar´e entered the competition
submitting a memoir on the three body problem which he describes as:
Le but final de la M´ecanique c´eleste est de r´esoudre cette grande question de
savoir si la loi de Newton explique `a elle seule tous les ph´enom`enes astronomiques;
le seul moyen d’y parvenir est de faire des observation aussi pr´ecises que possible
et de les comparer ensuite aux r´esultats du calcul. (The goal of celestial mechanics
is to answer the great question of whether Newtonian mechanics explains all
astronomical phenomena. The only way this can be proven is by taking the
most precise observation and comparing it to the theoretical calculations.) [13]
Poincar´e did in fact win the competition. In his memoir he described new mathematical ideas
such as homoclinic points. The memoir was about to be published in Acta Mathematica
when an error was found by the editor. This error in fact led to the discovery of chaos
theory. The memoir was published later in 1890.[9] In addition, Poincaré proved that
determinism and predictability were distinct problems. He also found that the solution of
the three body problem would change drastically with small changes in the initial conditions.
This area of research was neglected until 1963, when Edward Lorenz discovered a famous
chaotic deterministic system using a simple model of the atmosphere.[7]
He made many contributions to different fields of applied mathematics as well such as: celes-
tial mechanics, fluid mechanics, optics, electricity, telegraphy, capillarity, elasticity, thermo-
dynamics, potential theory, quantum theory, theory of relativity and cosmology. In the field
of differential equations Poincar´e has given many results that are critical for the qualitative
theory of differential equations, for example the Poincar´e sphere and the Poincar´e map.
It is that intuition that led him to discover and study so many areas of science. Poincar´e
is considered to be the next universalist after Gauss. After Gauss’s death in 1855 people
generally believed that there would be no one else that could master all branches of math-
ematics. However they were wrong because Poincar´e took all areas of mathematics as “his
province”[4].
REFERENCES
1. The 1911 Edition Encyclopedia: Oscar II of Sweden and Norway, [online],
http://63.1911encyclopedia.org/O/OS/OSCAR II OF SWEDEN AND NORWAY.htm
2. Belliver, Andr´e: Henri Poincar´e ou la vocation souveraine, Gallimard, 1956.
3. Bour P-E., Rebuschi M.: Serveur W3 des Archives H. Poincar´e [online] http://www.univ-
nancy2.fr/ACERHP/
4. Boyer B. Carl: A History of Mathematics: Henri Poincar´e, John Wiley & Sons, inc., Toronto,
1968.
5. Clay Mathematics Institute: Millennium Prize Problems, 2000, [online]
http://www.claymath.org/prizeproblems/ .
6. Encyclopaedia Britannica: Biography of Jules Henri Poincar´e.
7. Murz, Mauro: Jules Henri Poincar´e [Internet Encyclopedia of Philosophy], [online]
http://www.utm.edu/research/iep/p/poincare.htm, 2001.
8. Kolak, Daniel: Lovers of Wisdom (second edition), Wadsworth, Belmont, 2001.
9. O’Connor, J. John & Robertson, F. Edmund: The MacTutor History of Mathematics Archive,
[online] http://www-gap.dcs.st-and.ac.uk/ history/, 2002.
10. Oeuvres de Henri Poincar´e: Tome XI, Gauthier-Villard, Paris, 1956.
11. Poincar´e, Henri: Science and Method; The Foundations of Science, The Science Press, Lan-
caster, 1946.
12. Poincar´e, Henri: Science and Hypothesis; The Foundations of Science, The Science Press,
Lancaster, 1946.
13. Poincar´e, Henri: Les m´ethodes nouvelles de la m´ecanique celeste, Dover Publications, Inc.
New York, 1957.
14. Sageret, Jules: Henri Poincar´e, Mercvre de France, Paris, 1911.
15. Weisstein, W. Eric: World of Mathematics: Automorphic Function, CRC Press LLC, 2002.
Version: 6 Owner: Daume Author(s): Daume
Chapter 10
01A60 – 20th century
10.1 Bourbaki, Nicolas
by Émilie Richer
The Problem
The devastation of World War I presented a unique challenge to aspiring mathematicians of
the mid 1920’s. Among the many casualties of the war were great numbers of scientists and
mathematicians who would at this time have been serving as mentors to the young students.
Whereas other countries such as Germany were sending their scholars to do scientific work,
France was sending promising young students to the front. A war-time directory of the
École Normale Supérieure in Paris confirms that about 2/3 of its student population was
killed in the war.[DJ] Young men studying after the war had no young teachers; they had
no previous generation to rely on for guidance. What did this mean? According to Jean
Dieudonn´e, it meant that students like him were missing out on important discoveries and
advances being made in mathematics at that time. He explained : “I am not saying that
they (the older professors) did not teach us excellent mathematics (...) But it is indubitable
that a 50 year old mathematician knows the mathematics he learned at 20 or 30, but has
only notions, often rather vague, of the mathematics of his epoch, i.e. the period of time
when he is 50." He continued: "I had graduated from the École Normale and I did not know
what an ideal was! This gives you an idea of what a young French mathematician knew in
1930."[DJ] Henri Cartan, another student in Paris shortly after the war, affirmed: "we were
the first generation after the war. Before us there was a vide, a vacuum, and it was necessary
to make everything new."[JA] This is exactly what a few young Parisian math students set
out to do.
The Beginnings
After graduation from the ´ecole Normale Sup´erieure de Paris a group of about ten young
mathematicians had maintained very close ties.[WA] They had all begun their careers and
were scattered across France teaching in universities. Among them were Henri Cartan and
Andr´e Weil who were both in charge of teaching a course on differential and integral calcu-
lus at the University of Strasbourg. The standard textbook for this class at the time was
“Trait´e d’Analyse” by E. Goursat which the young professors found to be inadequate in
many ways.[BA] According to Weil, his friend Cartan was constantly asking him questions
about the best way to present a given topic to his class, so much so that Weil eventually
nicknamed him “the grand inquisitor”.[WA] After months of persistent questioning, in the
winter of 1934, Weil finally got the idea to gather friends (and former classmates) to settle
their problem by rewriting the treatise for their course. It is at this moment that Bourbaki
was conceived.
The suggestion of writing this treatise spread and very soon a loose circle of friends, includ-
ing Henri Cartan, Andr´e Weil, Jean Delsarte, Jean Dieudonn´e and Claude Chevalley began
meeting regularly at the Capoulade, a caf´e in the Latin quarter of Paris to plan it . They
called themselves the “Committee on the Analysis Treatise”[BL]. According to Chevalley
the project was extremely naive. The idea was to simply write another textbook to replace
Goursat’s.[GD] After many discussions over what to include in their treatise they finally
came to the conclusion that they needed to start from scratch and present all of essential
mathematics from beginning to end. With the idea that “the work had to be primarily a
tool, not usable in some small part of mathematics but in the greatest possible number of
places”.[DJ] Gradually the young men realized that their meetings were not sufficient, and
they decided they would dedicate a few weeks in the summer to their new project. The
collaborators on this project were not aware of it’s enormity, but were soon to find out.
In July of 1935 the young men gathered for their first congress (as they would later call them)
in Besse-en-Chandesse. The men believed that they would be able to draft the essentials of
mathematics in about three years. They did not set out wanting to write something new,
but to perfect everything already known. Little did they know that their first chapter would
not be completed until 4 years later. It was at one of their first meetings that the young
men chose their name: Nicolas Bourbaki. The organization and its membership would go
on to become one of the greatest enigmas of 20th century mathematics.
The first Bourbaki congress, July 1935. From left to right, back row: Henri Cartan, Ren´e
de Possel, Jean Dieudonn´e, Andr´e Weil, university lab technician, seated: Mirl`es, Claude
Chevalley, Szolem Mandelbrojt.
Andr´e Weil recounts many years later how they decided on this name. He and a few other
Bourbaki collaborators had been attending the ´ecole Normale in Paris, when a notification
was sent out to all first year science students : a guest speaker would be giving a lecture and
attendance was highly recommended. As the story goes, the young students gathered to
hear, (unbeknownst to them) an older student, Raoul Husson who had disguised himself with
a fake beard and an unrecognizable accent. He gave what is said to be an incomprehensible,
nonsensical lecture, with the young students trying desperately to follow him. All his results
were wrong in a non-trivial way and he ended with his most extravagant : Bourbaki’s
Theorem. One student even claimed to have followed the lecture from beginning to end.
Raoul had taken the name for his theorem from a general in the Franco-Prussian war. The
committee was so amused by the story that they unanimously chose Bourbaki as their name.
Weil’s wife was present at the discussion about choosing a name and she became Bourbaki’s
godmother baptizing him Nicolas.[WA] Thus was born Nicolas Bourbaki.
Andr´e Weil, Claude Chevalley, Jean Dieudonn´e, Henri Cartan and Jean Delsarte were among
the few present at these first meetings, they were all active members of Bourbaki until their
retirements. Today they are considered by most to be the founding fathers of the Bourbaki
group. According to a later member they were “those who shaped Bourbaki and gave it
much of their time and thought until they retired” he also claims that some other early
contributors were Szolem Mandelbrojt and Ren´e de Possel.[BA]
Reforming Mathematics : The Idea
Bourbaki members all believed that they had to completely rethink mathematics. They felt
that older mathematicians were holding on to old practices and ignoring the new. That is
why very early on Bourbaki established one of its first and only rules: obligatory retirement
at age 50. As explained by Dieudonn´e “if the mathematics set forth by Bourbaki no longer
correspond to the trends of the period, the work is useless and has to be redone, this is why
we decided that all Bourbaki collaborators would retire at age 50.”[DJ] Bourbaki wanted to
create a work that would be an essential tool for all mathematicians. Their aim was to create
something logically ordered, starting with a strong foundation and building continuously on
it. The foundation that they chose was set theory, which would be the first book in a series
of 6 that they named "Éléments de mathématique" (with the 's' dropped from mathématique
to represent their underlying belief in the unity of mathematics). Bourbaki felt that the old
mathematical divisions were no longer valid comparing them to ancient zoological divisions.
The ancient zoologist would classify animals based on some basic superficial similarities such
as “all these animals live in the ocean”. Eventually they realized that more complexity
was required to classify these animals. Past mathematicians had apparently made similar
mistakes : “the order in which we (Bourbaki) arranged our subjects was decided according to
a logical and rational scheme. If that does not agree with what was done previously, well, it
means that what was done previously has to be thrown overboard.”[DJ] After many heated
discussions, Bourbaki eventually settled on the topics for “´el´ements de math´ematique” they
would be, in order:
I Set theory
II Algebra
III Topology
IV Functions of one real variable
V Topological vector spaces
VI Integration
They now felt that they had eliminated all secondary mathematics, that according to them
“did not lead to anything of proved importance.”[DJ] The following table summarizes Bour-
baki’s choices.
What remains after cutting the loose threads          What is excluded (the loose threads)
Linear and multilinear algebra                        Theory of ordinals and cardinals
A little general topology (the least possible)        Lattices
Topological vector spaces                             Most general topology
Homological algebra                                   Most of group theory (finite groups)
Commutative algebra                                   Most of number theory
Non-commutative algebra                               Trigonometrical series
Lie groups                                            Interpolation
Integration                                           Series of polynomials
Differentiable manifolds                              Applied mathematics
Riemannian geometry
Dieudonn´e’s metaphorical ball of yarn: “here is my picture of mathematics now. It
is a ball of wool, a tangled hank where all mathematics react upon another in an almost
unpredictable way. And then in this ball of wool, there are a certain number of threads coming
out in all directions and not connecting with anything else. Well the Bourbaki method is very
simple-we cut the threads.”[DJ]
Reforming Mathematics : The Process
It didn’t take long for Bourbaki to become aware of the size of their project. They were
now meeting three times a year (twice for one week and once for two weeks) for Bourbaki
“congresses” to work on their books. Their main rule was unanimity on every point. Any
member had the right to veto anything he felt was inadequate or imperfect. Once Bourbaki
had agreed on a topic for a chapter the job of writing up the first draft was given to any
member who wanted it. He would write his version and when it was complete it would be
presented at the next Bourbaki congress. It would be read aloud line by line. According
to Dieudonné, "each proof was examined point by point and criticized pitilessly." He goes
on: "one has to see a Bourbaki congress to realize the virulence of this criticism and how it
surpasses by far any outside attack."[DJ] Weil recalls a first draft written by Cartan (who was
unable to attend the congress where it would be presented). Bourbaki sent him a telegram
summarizing the congress, it read : “union intersection partie produit tu es d´emembr´e foutu
Bourbaki” (union intersection subset product you are dismembered screwed Bourbaki).[WA]
During a congress any member was allowed to interrupt to criticize, comment or ask questions
at any time. Apparently Bourbaki believed it could get better results from confrontation
than from orderly discussion.[BA] Armand Borel summarized his first congress as "two or
three monologues shouted at top voice, seemingly independent of one another".[BA]
Bourbaki congress 1951.
After a first draft had been completely reduced to pieces it was the job of a new collaborator
to write up a second draft. This second collaborator would use all the suggestions and
changes that the group had put forward during the congress. Any member had to be able to
take on this task because one of Bourbaki’s mottoes was “the control of the specialists by the
non-specialists”[BA] i.e. a member had to be able to write a chapter in a field that was not
his specialty. This second writer would set out on his assignment knowing that by the time
he was ready to present his draft the views of the congress would have changed and his draft
would also be torn apart despite its adherence to the congress's earlier suggestions. The
same chapter might appear up to ten times before it would finally be unanimously approved
for publishing. There was an average of 8 to 12 years from the time a chapter was approved
to the time it appeared on a bookshelf.[DJ] Bourbaki proceeded this way for over twenty
years, (surprisingly) publishing a great number of volumes.
Bourbaki congress 1951.
Recruitment and Membership
During these years, most Bourbaki members held permanent positions at universities across
France. There, they could recruit for Bourbaki, students showing great promise in math-
ematics. Members would never be replaced formally nor was there ever a fixed number of
members. However when it felt the need, Bourbaki would invite a student or colleague to a
congress as a “cobaille” (guinea pig). To be accepted, not only would the guinea pig have
to understand everything, but he would have to actively participate. He also had to show
broad interests and an ability to adapt to the Bourbaki style. If he was silent he would
not be invited again.(A challenging task considering he would be in the presence of some of
the strongest mathematical minds of the time) Bourbaki described the reaction of certain
guinea pigs invited to a congress : “they would come out with the impression that it was a
gathering of madmen. They could not imagine how these people, shouting -sometimes three
or four at a time- about mathematics, could ever come up with something intelligent.”[DJ]
If a new recruit was showing promise, he would continue to be invited and would gradually
become a member of Bourbaki without any formal announcement. Although impossible to
have complete anonymity, Bourbaki was never discussed with the outside world. It was many
years before Bourbaki members agreed to speak publicly about their story. The following
table gives the names of some of Bourbaki’s collaborators.
1st generation (founding fathers)    2nd generation (invited after WWII)    3rd generation
H. Cartan                            J. Dixmier                             A. Borel
C. Chevalley                         R. Godement                            F. Bruhat
J. Delsarte                          S. Eilenberg                           P. Cartier
J. Dieudonné                         J.L. Koszul                            A. Grothendieck
A. Weil                              P. Samuel                              S. Lang
                                     J.P. Serre                             J. Tate
                                     L. Schwartz
Three generations of Bourbaki (membership according to Pierre Cartier)[SM]. Note: there
have been a great number of Bourbaki contributors, some lasting longer than others; this table
gives the members listed by Pierre Cartier. Different sources list different "official members";
in fact the Bourbaki website lists J. Coulomb, C. Ehresmann, R. de Possel and S. Mandelbrojt
as 1st generation members.[BW]
Bourbaki congress 1938, from left to right: S. Weil, C. Pisot, A. Weil, J. Dieudonn´e, C.
Chabauty, C. Ehresmann, J. Delsarte.
The Books
The Bourbaki books were the first to have such a tight organization, the first to use an
axiomatic presentation. They tried as often as possible to start from the general and work
towards the particular. Working with the belief that mathematics are fundamentally sim-
ple and for each mathematical question there is an optimal way of answering it. This
required extremely rigid structure and notation. In fact the first six books of “´el´ements de
math´ematique” use a completely linearly-ordered reference system. That is, any reference
at a given spot can only be to something earlier in the text or in an earlier book. This
did not please all of its readers as Borel elaborates : “I was rather put off by the very dry
style, without any concession to the reader, the apparent striving for the utmost generality,
the inflexible system of internal references and the total absence of outside ones”. However,
Bourbaki’s style was in fact so efficient that a lot of its notation and vocabulary is still
in current usage. Weil recalls that his granddaughter was impressed when she learned that
he had been personally responsible for the symbol ∅ for the empty set,[WA] and Chevalley
explains that to “bourbakise” now means to take a text that is considered screwed up and
to arrange it and improve it. Concluding that “it is the notion of structure which is truly
bourbakique”.[GD]
As well as ∅, Bourbaki is responsible for the introduction of the ⇒ (the implication arrow),
N, R, C, ´ and Z (respectively the natural, real, complex, rational numbers and the integers)
(
A
(complement of a set A), as well as the words bijective, surjective and injective.[DR]
The Decline
Once Bourbaki had finally finished its first six books, the obvious question was “what next?”.
The founding members who (not intentionally) had often carried most of the weight were now
approaching mandatory retirement age. The group had to start looking at more specialized
topics, having covered the basics in their first books. But was the highly structured Bourbaki
style the best way to approach these topics? The motto “everyone must be interested in
everything” was becoming much more difficult to enforce. (It was easy for the first six
books whose contents are considered essential knowledge of most mathematicians) Pierre
Cartier was working with Bourbaki at this point. He says “in the forties you can say that
Bourbaki knew where to go: his goal was to provide the foundation for mathematics".[12] It
seemed now that they did not know where to go. Nevertheless, Bourbaki kept publishing.
Its second series (falling short of Dieudonn´e’s plan of 27 books encompassing most of modern
mathematics [BA]) consisted of two very successful books :
Book VII Commutative algebra
Book VIII Lie Groups
However Cartier claims that by the end of the seventies, Bourbaki’s method was understood,
and many textbooks were being written in its style : “Bourbaki was left without a task. (...)
With their rigid format they were finding it extremely difficult to incorporate new mathe-
matical developments”[SM] To add to its difficulties, Bourbaki was now becoming involved
in a battle with its publishing company over royalties and translation rights. The matter was
settled in 1980 after a “long and unpleasant” legal process, where, as one Bourbaki member
put it “both parties lost and the lawyer got rich”[SM]. In 1983 Bourbaki published its last
volume : IX Spectral Theory.
By that time Cartier says Bourbaki was a dinosaur, the head too far away from the tail.
Explaining : “when Dieudonn´e was the “scribe of Bourbaki” every printed word came from
his pen. With his fantastic memory he knew every single word. You could say “Dieudonn´e
what is the result about so and so?” and he would go to the shelf and take down the book and
open it to the right page. After Dieudonn´e retired no one was able to do this. So Bourbaki
lost awareness of his own body, the 40 published volumes.”[SM] Now after almost twenty
years without a significant publication is it safe to say the dinosaur has become extinct?
1
But since Nicolas Bourbaki never in fact existed, and was nothing but a clever teaching and
research ploy, could he ever be said to be extinct?
REFERENCES
[BL] L. BEAULIEU: A Parisian Caf´e and Ten Proto-Bourbaki Meetings (1934-1935), The Mathe-
matical Intelligencer Vol.15 No.1 1993, pp 27-35.
[BCCC] A. BOREL, P.CARTIER, K. CHANDRASKHARAN, S. CHERN, S. IYANAGA: Andr´e
Weil (1906-1998), Notices of the AMS Vol.46 No.4 1999, pp 440-447.
[BA] A. BOREL: Twenty-Five Years with Nicolas Bourbaki, 1949-1973, Notices of the AMS Vol.45
No.3 1998, pp 373-380.
[BN] N. BOURBAKI: Th´eorie des Ensembles, de la collection ´el´ements de Math´ematique, Her-
mann, Paris 1970.
[BW] Bourbaki website: [online] at www.bourbaki.ens.fr.
[CH] H. CARTAN: Andr´e Weil:Memories of a Long Friendship, Notices of the AMS Vol.46 No.6
1999, pp 633-636.
[DR] R. D´eCAMPS: Qui est Nicolas Bourbaki?, [online] at http://faq.maths.free.fr.
[DJ] J. DIEUDONN´e: The Work of Nicholas Bourbaki, American Math. Monthly 77,1970, pp134-
145.
[EY] Encylop´edie Yahoo: Nicolas Bourbaki, [online] at http://fr.encylopedia.yahoo.com.
[GD] D. GUEDJ: Nicholas Bourbaki, Collective Mathematician: An Interview with Claude Cheval-
ley, The Mathematical Intelligencer Vol.7 No.2 1985, pp18-22.
[JA] A. JACKSON: Interview with Henri Cartan, Notices of the AMS Vol.46 No.7 1999, pp782-788.
[SM] M. SENECHAL: The Continuing Silence of Bourbaki- An Interview with Pierre Cartier, The
Mathematical Intelligencer, No.1 1998, pp 22-28.
[WA] A. WEIL: The Apprenticeship of a Mathematician, Birkh¨ auser Verlag 1992, pp 93-122.
1
Today what remains is "L'Association des Collaborateurs de Nicolas Bourbaki", which organizes Bourbaki
seminars three times a year. These are international conferences, hosting over 200 mathematicians who come
to listen to presentations on topics chosen by Bourbaki (or the A.C.N.B). Their last publication was in 1998,
chapter 10 of book VI commutative algebra.
Version: 6 Owner: Daume Author(s): Daume
10.2 Erdős Number
A low Erdős number is a status symbol among 20th-century mathematicians and is similar
to the 6-degrees-of-separation concept.
Let $e(p)$ be the Erdős number of person $p$. Your Erdős number is
• 0 if you are Paul Erdős,
• $\min\{e(x) \mid x \in A\} + 1$, where $A$ is the set of all persons you have authored a paper
with.
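As an illustration (a sketch added here, using a tiny made-up coauthorship graph), Erdős numbers are just distances in the coauthorship graph and can be computed by breadth-first search:

from collections import deque

def erdos_numbers(coauthors, source="Erdos"):
    # Breadth-first search from `source`; unreachable authors get no number.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in coauthors.get(person, ()):
            if other not in dist:
                dist[other] = dist[person] + 1
                queue.append(other)
    return dist

# Hypothetical, symmetric adjacency lists.
graph = {
    "Erdos": ["A", "B"],
    "A": ["Erdos", "C"],
    "B": ["Erdos"],
    "C": ["A", "D"],
    "D": ["C"],
}
print(erdos_numbers(graph))   # {'Erdos': 0, 'A': 1, 'B': 1, 'C': 2, 'D': 3}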
Version: 7 Owner: tz26 Author(s): tz26
Chapter 11
03-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)
11.1 Burali-Forti paradox
The Burali-Forti paradox demonstrates that the class of all ordinals is not a set. If there
were a set of all ordinals, $\mathrm{Ord}$, then it would follow that $\mathrm{Ord}$ was itself an ordinal, and
therefore that $\mathrm{Ord} \in \mathrm{Ord}$. Even if sets in general are allowed to contain themselves, ordinals
cannot, since they are defined so that $\in$ is well founded over them.
This paradox is similar to both Russell's paradox and Cantor's paradox, although it predates
both. All of these paradoxes prove that a certain object is "too large" to be a set.
Version: 2 Owner: Henry Author(s): Henry
11.2 Cantor’s paradox
Cantor's paradox demonstrates that there can be no largest cardinality. In particular,
there must be an unlimited number of infinite cardinalities. For suppose that $\alpha$ were the
largest cardinal. Then we would have $|\mathcal{P}(\alpha)| = |\alpha|$. Suppose $f : \alpha \to \mathcal{P}(\alpha)$ is a bijection
proving their equicardinality. Then $X = \{\beta \in \alpha \mid \beta \notin f(\beta)\}$ is a subset of $\alpha$, and so there
is some $\gamma \in \alpha$ such that $f(\gamma) = X$. But $\gamma \in X \leftrightarrow \gamma \notin X$, which is a paradox.
The key part of the argument strongly resembles Russell’s paradox, which is in some sense
a generalization of this paradox.
Besides allowing an unbounded number of cardinalities as ZF set theory does, this paradox
could be avoided by a few other tricks, for instance by not allowing the construction of a
power set or by adopting paraconsistent logic.
Version: 2 Owner: Henry Author(s): Henry
11.3 Russell’s paradox
Suppose that for any coherent proposition $P(x)$ we can construct a set $\{x : P(x)\}$. Let
$S = \{x : x \notin x\}$. Suppose $S \in S$; then, by definition, $S \notin S$. Likewise, if $S \notin S$, then by
definition $S \in S$. Therefore, we have a contradiction. Bertrand Russell gave this paradox as
an example of how a purely intuitive set theory can be inconsistent. The regularity axiom,
one of the Zermelo-Fraenkel axioms, was devised to avoid this paradox by prohibiting self-swallowing sets.
An interpretation of Russell's paradox that avoids the formal language of set theory could be
stated as: "If a barber shaves exactly those who do not shave themselves, does he shave himself?"
If you answer that he shaves himself, that is false, since he shaves only those who do not shave
themselves. If you answer that someone else shaves him, that is also false, because he shaves all
those who do not shave themselves, and in this case he is part of that set, since he does not shave himself.
Either way we have a contradiction.
Version: 5 Owner: Daume Author(s): Daume, vampyr
11.4 biconditional
A biconditional is a truth function that is true only in the case that both parameters are true
or both are false. For example, "a if and only if b", "a just in case b", as well as "b implies a and a
implies b" are all ways of stating a biconditional in English. Symbolically the biconditional
is written as
$$a \leftrightarrow b$$
or
$$a \Leftrightarrow b$$
Its truth table is
a  b  a ↔ b
F  F  T
F  T  F
T  F  F
T  T  T
In addition, the biconditional function is sometimes written as ”iff”, meaning ”if and only
if”.
The biconditional gets its name from the fact that it is really two conditionals in conjunction,
$$(a \to b) \wedge (b \to a)$$
This fact is important to recognize when writing a mathematical proof, as both conditionals
must be proven independently.
Version: 8 Owner: akrowne Author(s): akrowne
11.5 bijection
Let $X$ and $Y$ be sets. A function $f : X \to Y$ that is one-to-one and onto is called a bijection
or bijective function from $X$ to $Y$.
When $X = Y$, $f$ is also called a permutation of $X$.
Version: 8 Owner: mathcam Author(s): mathcam, drini
11.6 cartesian product
For any sets $A$ and $B$, the cartesian product $A \times B$ is the set consisting of all ordered pairs
$(a, b)$ where $a \in A$ and $b \in B$.
Version: 1 Owner: djao Author(s): djao
11.7 chain
Let $B \subseteq A$, where $A$ is ordered by $\leq$. $B$ is a chain in $A$ if any two elements of $B$ are
comparable.
That is, $B$ is a linearly ordered subset of $A$.
Version: 1 Owner: akrowne Author(s): akrowne
11.8 characteristic function
Definition Suppose $A$ is a subset of a set $X$. Then the function
$$\chi_A(x) = \begin{cases} 1, & \text{when } x \in A,\\ 0, & \text{when } x \in X \setminus A \end{cases}$$
is the characteristic function for $A$.
Properties
Suppose $A, B$ are subsets of a set $X$.
1. For set intersections and set unions, we have
$$\chi_{A \cap B} = \chi_A \chi_B, \qquad \chi_{A \cup B} = \chi_A + \chi_B - \chi_{A \cap B}.$$
2. For the symmetric difference,
$$\chi_{A \triangle B} = \chi_A + \chi_B - 2\chi_{A \cap B}.$$
3. For the set complement,
$$\chi_{X \setminus A} = 1 - \chi_A.$$
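These identities can be checked pointwise on a small example; the following Python sketch (added here for illustration) does so for arbitrary finite sets:

def chi(A):
    # characteristic function of the set A
    return lambda x: 1 if x in A else 0

X = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

for x in X:
    assert chi(A & B)(x) == chi(A)(x) * chi(B)(x)
    assert chi(A | B)(x) == chi(A)(x) + chi(B)(x) - chi(A & B)(x)
    assert chi(A ^ B)(x) == chi(A)(x) + chi(B)(x) - 2 * chi(A & B)(x)
    assert chi(X - A)(x) == 1 - chi(A)(x)
print("all identities hold on this example")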
Remarks
A synonym for characteristic function is indicator function [1].
REFERENCES
1. G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed,
John Wiley & Sons, Inc., 1999.
Version: 6 Owner: bbukh Author(s): bbukh, matte, vampyr
11.9 concentric circles
A collection of circles is said to be concentric if they have the same center. The region formed
between two concentric circles is therefore an annulus.
Version: 1 Owner: dublisk Author(s): dublisk
11.10 conjunction
A conjunction is true only when both parameters (called conjuncts) are true. In English, conjunction
is denoted by the word "and". Symbolically, we represent it as $\wedge$ or multiplication
applied to Boolean parameters. Conjunction of $a$ and $b$ would be written
$$a \wedge b$$
or, in algebraic context,
$$a \cdot b$$
or
$$ab$$
The truth table for conjunction is
a  b  a ∧ b
F  F  F
F  T  F
T  F  F
T  T  T
Version: 6 Owner: akrowne Author(s): akrowne
11.11 disjoint
Two sets $X$ and $Y$ are disjoint if their intersection $X \cap Y$ is the empty set.
Version: 1 Owner: djao Author(s): djao
11.12 empty set
An empty set ∅ is a set that contains no elements. The Zermelo-Fraenkel axioms of set theory
postulate that there exists an empty set.
Version: 2 Owner: djao Author(s): djao
11.13 even number
Definition Suppose $n$ is an integer. If there exists an integer $k$ such that $n = 2k + 1$, then
$n$ is an odd number. If there exists an integer $k$ such that $n = 2k$, then $n$ is an even
number.
The concepts of even and odd numbers are most easily understood in the binary base. Then
the above definition simply states that even numbers end with a 0, and odd numbers end
with a 1.
Properties
1. Every integer is either even or odd. This can be proven using induction, or using the
fundamental theorem of arithmetic.
2. An integer $n$ is even (odd) if and only if $n^2$ is even (odd).
Version: 3 Owner: mathcam Author(s): matte
11.14 fixed point
A fixed point $x$ of a function $f : X \to X$ is a point that remains constant upon application
of that function, i.e.,
$$f(x) = x.$$
Version: 5 Owner: mathwizard Author(s): mathwizard
11.15 infinite
A set $S$ is infinite if it is not finite; that is, there is no $n \in \mathbb{N}$ for which there is a bijection
between $n$ and $S$. Hence an infinite set has a cardinality greater than any natural number:
$$|S| \geq \aleph_0$$
Infinite sets can be divided into countable and uncountable. For countably infinite sets $S$,
there is a bijection between $S$ and $\mathbb{N}$. This is not the case for uncountably infinite sets (like
the reals and any non-trivial real interval).
Some examples of finite sets:
• The empty set: $\{\}$.
• $\{0, 1\}$
• $\{1, 2, 3, 4, 5\}$
• $\{1, 1.5, e, \pi\}$
Some examples of infinite sets:
• $\{1, 2, 3, 4, \ldots\}$ (countable)
• The primes: $\{2, 3, 5, 7, 11, \ldots\}$ (countable)
• An interval of the reals: $(0, 1)$ (uncountable)
• The rational numbers: $\mathbb{Q}$ (countable)
Version: 4 Owner: akrowne Author(s): akrowne, vampyr
11.16 injective function
We say that a function $f : X \to Y$ is injective or one-to-one if $f(x) = f(y)$ implies $x = y$,
or equivalently, whenever $x \neq y$, then $f(x) \neq f(y)$.
Version: 6 Owner: drini Author(s): drini
11.17 integer
The set of integers, denoted by the symbol Z, is the set { . . . , −3, −2, −1, 0, 1, 2, 3, . . . } consisting of the natural numbers and their negatives.

Mathematically, Z is defined to be the set of equivalence classes of pairs of natural numbers N × N under the equivalence relation (a, b) ∼ (c, d) if a + d = b + c.

Addition and multiplication of integers are defined as follows:

• (a, b) + (c, d) := (a + c, b + d)
• (a, b) · (c, d) := (ac + bd, ad + bc)

Typically, the class of (a, b) is denoted by the symbol n if b ≤ a (resp. −n if a ≤ b), where n is the unique natural number such that a = b + n (resp. a + n = b). Under this notation, we recover the familiar representation of the integers as { . . . , −3, −2, −1, 0, 1, 2, 3, . . . }. Here are some examples:

• 0 = equivalence class of (0, 0) = equivalence class of (1, 1) = . . .
• 1 = equivalence class of (1, 0) = equivalence class of (2, 1) = . . .
• −1 = equivalence class of (0, 1) = equivalence class of (1, 2) = . . .

The set of integers Z under the addition and multiplication operations defined above forms an integral domain. The integers admit the following ordering relation, making Z into an ordered ring: (a, b) < (c, d) in Z if a + d < b + c in N.

The ring of integers is also a Euclidean domain, with valuation given by the absolute value function.
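The pair construction can be sketched in a few lines of Python (our own illustration, not from the original entry); a pair (a, b) of naturals stands for the integer a − b, and "normalize" picks the canonical representative of each equivalence class:

    def normalize(a, b):
        # Canonical representative of the class of (a, b): one coordinate is 0.
        m = min(a, b)
        return (a - m, b - m)

    def add(p, q):
        (a, b), (c, d) = p, q
        return normalize(a + c, b + d)

    def mul(p, q):
        (a, b), (c, d) = p, q
        return normalize(a * c + b * d, a * d + b * c)

    # (2, 0) stands for 2 and (0, 3) stands for -3.
    assert add((2, 0), (0, 3)) == (0, 1)   # 2 + (-3) = -1
    assert mul((2, 0), (0, 3)) == (0, 6)   # 2 * (-3) = -6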
Version: 7 Owner: djao Author(s): djao
11.18 inverse function
Definition. Suppose f : X → Y is a mapping between sets X and Y, and suppose f⁻¹ : Y → X is a mapping that satisfies

f⁻¹ ∘ f = id_X,
f ∘ f⁻¹ = id_Y.

Then f⁻¹ is called the inverse of f, or the inverse function of f.

Remarks

1. The inverse function of a function f : X → Y exists if and only if f is a bijection, that is, f is an injection and a surjection.
2. When an inverse function exists, it is unique.
3. The inverse function and the inverse image of a set coincide in the following sense. Suppose f⁻¹(A) is the inverse image of a set A ⊂ Y under a function f : X → Y. If f is a bijection, then f⁻¹({y}) = {f⁻¹(y)}.
Version: 3 Owner: matte Author(s): matte
11.19 linearly ordered
An ordering ≤ (or <) of A is called linear or total if any two elements of A are comparable. The pair (A, ≤) is then called a linearly ordered set.
Version: 1 Owner: akrowne Author(s): akrowne
11.20 operator
Synonym of mapping and function. Often used to refer to mappings whose domain and codomain are, in some sense, spaces of functions.
Examples: differential operator, convolution operator.
Version: 2 Owner: rmilson Author(s): rmilson
11.21 ordered pair
For any sets a and b, the ordered pair (a, b) is the set {{a}, {a, b}}.

The characterizing property of an ordered pair is:

(a, b) = (c, d) ⇔ a = c and b = d,

and the above construction of the ordered pair, as weird as it seems, is actually the simplest possible formulation which achieves this property.
Version: 4 Owner: djao Author(s): djao
11.22 ordering relation
Let S be a set. An ordering relation is a relation ≤ on S such that, for every a, b, c ∈ S:

• Either a ≤ b, or b ≤ a;
• If a ≤ b and b ≤ c, then a ≤ c;
• If a ≤ b and b ≤ a, then a = b.

Given an ordering relation ≤, one can define a relation < by: a < b if a ≤ b and a ≠ b. The opposite ordering is the relation ≥ given by: a ≥ b if b ≤ a, and the relation > is defined analogously.
Version: 3 Owner: djao Author(s): djao
11.23 partition
A partition P of a set S is a collection of mutually disjoint non-empty sets such that ⋃ P = S.

Any partition P of a set S introduces an equivalence relation on S, where each p ∈ P is an equivalence class. Similarly, given an equivalence relation on S, the collection of distinct equivalence classes is a partition of S.
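A brief sketch (ours, not part of the original entry) of this correspondence, with a concrete set S and partition P chosen purely for illustration:

    S = {1, 2, 3, 4, 5, 6}
    P = [{1, 4}, {2, 3, 6}, {5}]            # a partition of S

    def equivalent(x, y):
        # x ~ y iff some block of P contains both.
        return any(x in block and y in block for block in P)

    # Recover the partition as the set of equivalence classes.
    classes = {frozenset(y for y in S if equivalent(x, y)) for x in S}
    assert classes == {frozenset(block) for block in P}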
Version: 4 Owner: vampyr Author(s): vampyr
11.24 pullback
Definition. Suppose X, Y, Z are sets, and we have maps

f : Y → Z,
Φ : X → Y.

Then the pullback of f under Φ is the mapping

Φ*f : X → Z,
x ↦ (f ∘ Φ)(x).

Let us denote by M(X, Y) the set of all mappings f : X → Y. We then see that Φ* is a mapping M(Y, Z) → M(X, Z). In other words, Φ* pulls back the set on which f is defined from Y to X.

(Diagram: Φ : X → Y, f : Y → Z, and the composite Φ*f : X → Z.)

Properties

1. For any set X, (id_X)* = id_{M(X,X)}.

2. Suppose we have maps

Φ : X → Y,
Ψ : Y → Z

between sets X, Y, Z. Then

(Ψ ∘ Φ)* = Φ* ∘ Ψ*.

3. If Φ : X → Y is a bijection, then Φ* is a bijection and

(Φ*)⁻¹ = (Φ⁻¹)*.

4. Suppose X, Y are sets with X ⊂ Y. Then we have the inclusion map ι : X → Y, and for any f : Y → Z, we have

ι*f = f|_X,

where f|_X is the restriction of f to X [1].
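Since the pullback is simply precomposition, it can be sketched directly (our own illustration, with arbitrarily chosen maps phi and f):

    def pullback(phi):
        # Return the map f -> f o phi, i.e. Phi^*.
        return lambda f: (lambda x: f(phi(x)))

    phi = lambda x: x + 1          # Phi : X -> Y
    f = lambda y: y * y            # f   : Y -> Z
    pulled = pullback(phi)(f)      # Phi^* f : X -> Z
    assert pulled(3) == f(phi(3)) == 16

    # Property 2: (Psi o Phi)^* = Phi^* o Psi^*, checked at a point.
    psi = lambda y: 2 * y
    lhs = pullback(lambda x: psi(phi(x)))(f)
    rhs = pullback(phi)(pullback(psi)(f))
    assert lhs(5) == rhs(5)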
REFERENCES
1. W. Aitken, De Rham Cohomology: Summary of Lectures 1-4, online.
Version: 7 Owner: matte Author(s): matte
11.25 set closed under an operation
A set X is said to be closed under a map f if f maps elements of X to elements of X, i.e., f : X → X. More generally, suppose Y is the n-fold Cartesian product Y = X × · · · × X. If f is a map f : Y → X, then we also say that X is closed under the map f.

The above definition has no relation to the definition of a closed set in topology. Instead, one should think of X and f as a closed system.

Examples

1. The set of invertible matrices is closed under matrix inversion. This means that the inverse of an invertible matrix is again an invertible matrix.

2. Let C(X) be the set of complex-valued continuous functions on some topological space X. Suppose f, g are functions in C(X). Then we define the pointwise product of f and g as the function fg : x ↦ f(x)g(x). Since fg is continuous, we have that C(X) is closed under pointwise multiplication.

In the first example, the operation is of the type X → X. In the second, pointwise multiplication is a map C(X) × C(X) → C(X).
Version: 2 Owner: matte Author(s): matte
11.26 signature of a permutation
Let X be a finite set, and let G be the group of permutations of X (see permutation group). There exists a unique homomorphism χ from G to the multiplicative group {−1, 1} such that χ(t) = −1 for any transposition (loc. cit.) t ∈ G. The value χ(g), for any g ∈ G, is called the signature or sign of the permutation g. If χ(g) = 1, g is said to be of even parity; if χ(g) = −1, g is said to be of odd parity.

Proposition: If X is totally ordered by a relation <, then for all g ∈ G,

χ(g) = (−1)^k(g)     (11.26.1)

where k(g) is the number of pairs (x, y) ∈ X × X such that x < y and g(x) > g(y). (Such a pair is sometimes called an inversion of the permutation g.)

Proof: This is clear if g is the identity map X → X. If g is any other permutation, then for some consecutive a, b ∈ X we have a < b and g(a) > g(b). Let h ∈ G be the transposition of a and b. We have

k(h ∘ g) = k(g) − 1
χ(h ∘ g) = −χ(g)

and the proposition follows by induction on k(g).
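Formula (11.26.1) translates directly into code (a sketch of ours, not part of the original entry), with a permutation of {0, . . . , n−1} given as a list:

    def sign(perm):
        # Count inversions: pairs i < j with perm[i] > perm[j].
        inversions = sum(
            1
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]
        )
        return (-1) ** inversions

    assert sign([0, 1, 2]) == 1     # identity: even parity
    assert sign([1, 0, 2]) == -1    # a single transposition: odd parity
    assert sign([2, 0, 1]) == 1     # a 3-cycle is a product of two transpositions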
Version: 4 Owner: drini Author(s): Larry Hammick
11.27 subset
Given two sets A and B, we say that A is a subset of B (which we denote as A ⊆ B or simply A ⊂ B) if every element of A is also in B. That is, the following implication holds:

x ∈ A ⇒ x ∈ B.

Some examples: The set A = {d, r, i, t, o} is a subset of the set B = {p, e, d, r, i, t, o} because every element of A is also in B. That is, A ⊆ B.

On the other hand, if C = {p, e, d, r, o}, then neither is A a subset of C (because t ∈ A but t ∉ C) nor is C a subset of A (because p ∈ C but p ∉ A). The fact that A is not a subset of C is written as A ⊈ C. And then, in this example, we also have C ⊈ A.

If X ⊆ Y and Y ⊆ X, it must be the case that X = Y.

Every set is a subset of itself, and the empty set is a subset of every other set. The set A is called a proper subset of B if A ⊂ B and A ≠ B (in this case we do not use A ⊆ B).
Version: 5 Owner: drini Author(s): drini
11.28 surjective
A function f : X → Y is called surjective or onto if, for every y ∈ Y, there is an x ∈ X such that f(x) = y.

Equivalently, f : X → Y is onto when its image is all of the codomain:

Im f = Y.
Version: 2 Owner: drini Author(s): drini
11.29 transposition
Given a set X = {a₁, a₂, . . . , aₙ}, a transposition is a permutation (bijective function of X onto itself) f such that there exist indices i ≠ j with f(aᵢ) = aⱼ, f(aⱼ) = aᵢ, and f(aₖ) = aₖ for all other indices k.

Example: If X = {a, b, c, d, e}, the function σ given by

σ(a) = a
σ(b) = e
σ(c) = c
σ(d) = d
σ(e) = b

is a transposition.

One of the main results on symmetric groups states that any permutation can be expressed as a composition of transpositions, and that for any two decompositions of a given permutation, the number of transpositions is always even or always odd.
Version: 2 Owner: drini Author(s): drini
11.30 truth table
A truth table is a tabular listing of all possible input value combinations for a truth function and their corresponding output values. For n input variables, there will always be 2ⁿ rows in the truth table. A sample truth table for (a ∧ b) → c would be

a b c (a ∧ b) → c
F F F T
F F T T
F T F T
F T T T
T F F T
T F T T
T T F F
T T T T

(Note that ∧ represents logical "and", while → represents the conditional truth function.)
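The table can be regenerated mechanically (a sketch of ours, not part of the original entry), enumerating all 2ⁿ = 8 input rows:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    for a, b, c in product([False, True], repeat=3):
        row = ["T" if v else "F" for v in (a, b, c, implies(a and b, c))]
        print(" ".join(row))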
Version: 4 Owner: akrowne Author(s): akrowne
Chapter 12
03-XX – Mathematical logic and
foundations
12.1 standard enumeration
The standard enumeration of {0, 1}* is the sequence of strings s₀ = λ, s₁ = 0, s₂ = 1, s₃ = 00, s₄ = 01, . . . , ordered first by length and then lexicographically.

The characteristic function of a language A is χ_A : N → {0, 1} such that

χ_A(n) = 1 if sₙ ∈ A, and χ_A(n) = 0 if sₙ ∉ A.

The characteristic sequence of a language A (also denoted χ_A) is the concatenation of the values of the characteristic function in the natural order.
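A short sketch (ours, not from the original entry); the sample language "strings containing no 11" is chosen purely for illustration:

    from itertools import count, product

    def standard_enumeration():
        yield ""                                   # s_0 = the empty string (lambda)
        for length in count(1):
            for bits in product("01", repeat=length):
                yield "".join(bits)

    def in_language(s):
        return "11" not in s

    gen = standard_enumeration()
    first = [next(gen) for _ in range(8)]          # ['', '0', '1', '00', '01', '10', '11', '000']
    char_sequence = "".join("1" if in_language(s) else "0" for s in first)
    print(first, char_sequence)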
Version: 12 Owner: xiaoyanggu Author(s): xiaoyanggu
Chapter 13
03B05 – Classical propositional logic
13.1 CNF
A propositional formula is a CNF formula, meaning it is in conjunctive normal form, if it is a conjunction of disjunctions of literals (a literal is a propositional variable or its negation). Hence, a CNF formula has the form F₁ ∧ F₂ ∧ . . . ∧ Fₙ, where each Fᵢ is of the form lᵢ₁ ∨ lᵢ₂ ∨ . . . ∨ lᵢₘ for literals lᵢⱼ and some m.

Example: (x ∨ y ∨ ¬z) ∧ (¬y ∨ u ∨ w) ∧ (¬x ∨ v ∨ w).
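A minimal sketch (ours) of how a CNF formula can be represented and evaluated; the formula encoded below is the example above, with each literal written as a (variable, polarity) pair:

    cnf = [[("x", True), ("y", True), ("z", False)],     # x or y or not z
           [("y", False), ("u", True), ("w", True)],     # not y or u or w
           [("x", False), ("v", True), ("w", True)]]     # not x or v or w

    def evaluate(cnf, assignment):
        # Every clause must contain at least one satisfied literal.
        return all(
            any(assignment[var] == polarity for var, polarity in clause)
            for clause in cnf
        )

    assert evaluate(cnf, {"x": False, "y": True, "z": True,
                          "u": True, "v": False, "w": False})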
Version: 2 Owner: iddo Author(s): iddo
13.2 Proof that contrapositive statement is true using
logical equivalence
You can see that the contrapositive of an implication is true by considering the following: The statement p ⇒ q is logically equivalent to ¬p ∨ q, which can also be written as q ∨ ¬p.

By the same token, the contrapositive statement ¬q ⇒ ¬p is logically equivalent to ¬¬q ∨ ¬p which, using double negation on ¬¬q, becomes q ∨ ¬p.

This, of course, is the same logical statement.
Version: 2 Owner: sprocketboy Author(s): sprocketboy
13.3 contrapositive
Given an implication of the form

p → q

("p implies q"), the contrapositive of this implication is

¬q → ¬p

("not q implies not p").

An implication and its contrapositive are equivalent statements. When proving a theorem, it is often more convenient or more intuitive to prove the contrapositive instead.
Version: 3 Owner: vampyr Author(s): vampyr
13.4 disjunction
A disjunction is true if either of its parameters (called disjuncts) is true. Disjunction does not quite correspond to "or" in English, which is often exclusive (see exclusive or). Disjunction uses the symbol ∨, or sometimes + when taken in an algebraic context. Hence, the disjunction of a and b would be written

a ∨ b

or

a + b

The truth table for disjunction is

a b a ∨ b
F F F
F T T
T F T
T T T
Version: 8 Owner: akrowne Author(s): akrowne
13.5 equivalent
Two statements A and B are said to be (logically) equivalent if A is true if and only if B is true (that is, A implies B and B implies A). This is usually written as A ⇔ B. For example, for any integer z, the statement "z is positive" is equivalent to "z is not negative and z ≠ 0".
Version: 1 Owner: sleske Author(s): sleske
13.6 implication
An implication is a logical construction that essentially tells us that if one condition is true, then another condition must also be true. Formally it is written

a → b

or

a ⇒ b

which would be read "a implies b", or "a therefore b", or "if a, then b" (to name a few).

Implication is often confused with "if and only if", the biconditional truth function (⇔). They are not, however, the same. The implication a → b is true even if only b is true. So the statement "pigs have wings, therefore it is raining today" is true if it is indeed raining, despite the fact that the first part is false.

In fact, any implication a → b is called vacuously true when a is false. By contrast, a ⇔ b would be false if either a or b were by itself false (a ⇔ b is equivalent to (a ∧ b) ∨ (¬a ∧ ¬b), or in terms of implication to (a → b) ∧ (b → a)).

It may be useful to remember that a → b only tells you that it cannot be the case that b is false while a is true; b must "follow" from a (and "false" does follow from "false"). Alternatively, a → b is in fact equivalent to

b ∨ ¬a

The truth table for implication is therefore

a b a → b
F F T
F T T
T F F
T T T
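The two readings given above can be compared row by row (a sketch of ours, not from the original entry):

    from itertools import product

    for a, b in product([False, True], repeat=2):
        reading_1 = not (a and not b)    # "it cannot be that b is false while a is true"
        reading_2 = (not a) or b         # "b or not a"
        print(a, b, reading_1)
        assert reading_1 == reading_2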
Version: 3 Owner: akrowne Author(s): akrowne
13.7 propositional logic
A propositional logic is a logic in which the only objects are propositions, that is, objects which themselves have truth values. Variables represent propositions, and there are no relations, functions, or quantifiers except for the constants ⊤ and ⊥ (representing true and false respectively). The connectives are typically ¬, ∧, ∨, and → (representing negation, conjunction, disjunction, and implication); however this set is redundant, and other choices can be used (⊤ and ⊥ can also be considered 0-ary connectives).

A model for propositional logic is just a truth function ν on a set of variables. Such a truth function can easily be extended to a truth function ν̄ on all formulas which contain only the variables ν is defined on, by adding recursive clauses for the usual definitions of the connectives. For instance ν̄(α ∧ β) = 1 iff ν̄(α) = ν̄(β) = 1.

Then we say ν ⊨ φ if ν̄(φ) = 1, and we say ⊨ φ if ν ⊨ φ for every ν such that ν̄(φ) is defined (and say that φ is a tautology).

Propositional logic is decidable: there is an easy way to determine whether a sentence is a tautology. It can be done using truth tables, since a truth table for a particular formula can be easily produced, and the formula is a tautology if every assignment of truth values makes it true. It is not known whether this method is efficient: the equivalent problem of whether a formula is satisfiable (that is, whether its negation is not a tautology) is a canonical example of an NP-complete problem.
Version: 3 Owner: Henry Author(s): Henry
13.8 theory
If L is a logical language for some logic 𝐋, and T is a set of formulas of L with no free variables, then T is a theory of 𝐋.

We write T ⊨ φ for a formula φ if every model M of 𝐋 with M ⊨ T satisfies M ⊨ φ.

We write T ⊢ φ if there is a proof of φ from T.
Version: 1 Owner: Henry Author(s): Henry
13.9 transitive
The transitive property of logic is

(a ⇒ b) ∧ (b ⇒ c) ⇒ (a ⇒ c)

where ⇒ is the conditional truth function. From this we can derive that

(a = b) ∧ (b = c) ⇒ (a = c)
Version: 1 Owner: akrowne Author(s): akrowne
13.10 truth function
A truth function is a function that returns one of two values, one of which is interpreted as "true" and the other as "false". Typically either "T" and "F" are used, or "1" and "0", respectively. Using the latter, we can write

f : {0, 1}ⁿ → {0, 1}

to define a truth function f. That is, f is a mapping from any number (n) of true/false (0 or 1) values to a single value, which is 0 or 1.
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 14
03B10 – Classical first-order logic
14.1 ∆₁ bootstrapping

This entry proves that a number of useful relations and functions are ∆₁ in first order arithmetic, providing a bootstrapping of parts of mathematical practice into any system including the ∆₁ relations (since the ∆₁ relations are exactly the recursive ones, this includes Turing machines).

First, we want to build a tupling relation which will allow a finite sequence of numbers to be encoded by a single number. To do this we first show that D(a, b) ↔ a | b is ∆₁. This is true since a | b ↔ ∃c ≤ b (a · c = b), a formula with only bounded quantifiers.

Next note that P(x) ↔ "x is prime" is ∆₁, since P(x) ↔ x ≠ 1 ∧ ¬∃y < x (y ≠ 1 ∧ y | x). Also

A_P(x, y) ↔ P(x) ∧ P(y) ∧ ∀z < y (x < z → ¬P(z)),

which says that y is the next prime after x.

These two can be used to define (the graph of) a primality function, p(a) = the (a+1)-th prime. Let

p(a, b) ↔ ∃c < b^(a²) ([2 | c] ∧ [∀q < b ∀r < b (A_P(q, r) → ∀j < a [q^j | c ↔ r^(j+1) | c])] ∧ [b^a | c] ∧ ¬[b^(a+1) | c]).

This rather awkward looking formula is worth examining, since it illustrates a principle which will be used repeatedly. c is intended to be a number of the form 2⁰ · 3¹ · 5² and so on. If c is divisible by b^a but not by b^(a+1) then we know that b must be the (a+1)-th prime. The definition is so complicated because we cannot just say, as we'd like to, that p(a + 1) is the smallest prime greater than p(a) (since we don't allow recursive definitions). Instead we embed the sequence of values this recursion would take into a single number (c) and guarantee that the recursive relationship holds for at least a terms; then we just check whether the a-th value is b.

Finally, we can define our tupling relation. Technically, since a given relation must have a fixed arity, we define for each n a function ⟨x₀, . . . , xₙ⟩ = ∏ᵢ p(i)^(xᵢ+1). Then define (x)ᵢ to be the i-th element of x when x is interpreted as a tuple, so ⟨(x)₀, . . . , (x)ₙ⟩ = x. Note that the tupling relation, even taken collectively, is not total. For instance 5 is not a tuple (although it is sometimes convenient to view it as a tuple with "empty spaces": ⟨ , , 5⟩). In situations like this, and also when attempting to extract entries beyond the length, (x)ᵢ = 0 (for instance, (5)₀ = 0). On the other hand there is a 0-ary tupling relation, ⟨⟩ = 1.

Thanks to our definition of p, we have ⟨x₀, . . . , xₙ⟩ = x ↔ x = p(0)^(x₀+1) · · · p(n)^(xₙ+1). This is clearly ∆₁. (Note that we don't use the ∏ as above, since we don't have it in the language, but since we have a different tupling function for each n this isn't a problem.)

For the reverse, (x)ᵢ = y ↔ ([p(i)^(y+1) | x] ∧ ¬[p(i)^(y+2) | x]) ∨ ([y = 0] ∧ ¬[p(i) | x]).

Also, define a length function by len(x) = n ↔ ¬[p(n + 1) | x] ∧ ∀z < n [p(z) | x], and a membership relation by in(x, m) ↔ ∃i < len(x) [(x)ᵢ = m].
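The prime-power coding just described can be sketched concretely (our own illustration, not part of the original entry; the helper names are ours):

    def primes():
        n = 2
        while True:
            if all(n % d for d in range(2, int(n ** 0.5) + 1)):
                yield n
            n += 1

    def p(i):
        # The (i+1)-th prime: p(0) = 2, p(1) = 3, ...
        gen = primes()
        for _ in range(i):
            next(gen)
        return next(gen)

    def encode(xs):
        # <x_0, ..., x_n> = p(0)^(x_0+1) * ... * p(n)^(x_n+1)
        code = 1
        for i, x in enumerate(xs):
            code *= p(i) ** (x + 1)
        return code

    def component(code, i):
        # (code)_i : one less than the exponent of p(i), or 0 if p(i) does not divide code.
        q, e = p(i), 0
        while code % q == 0:
            code //= q
            e += 1
        return max(e - 1, 0)

    xs = [3, 0, 5]
    c = encode(xs)                      # 2^4 * 3^1 * 5^6
    assert [component(c, i) for i in range(3)] == xs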
Armed with this, we can show that all primitive recursive functions are ∆₁. To see this, note that x = 0, the zero function, is trivially recursive, as are the successor x = Sy and the projections p_{n,m}(x₁, . . . , xₙ) = xₘ.

The ∆₁ functions are closed under composition, since if φ(x) and ψ(x) both have no unbounded quantifiers, φ(ψ(x)) obviously doesn't either.

Finally, suppose we have functions f(x) and g(x, n, m) in ∆₁. Then define the primitive recursion h(x, y) by first defining

h̄(x, y) = z ↔ len(z) = y ∧ ∀i < y [(z)_(i+1) = g(x, i, (z)ᵢ)] ∧ [len(z) = 0 ∨ (z)₀ = f(x)]

and then h(x, y) = (h̄(x, y))_y.

∆₁ is also closed under minimization: if R(x, y) is a ∆₁ relation then μy.R(x, y) is a function giving the least y satisfying R(x, y). To see this, note that μy.R(x, y) = z ↔ R(x, z) ∧ ∀m < z ¬R(x, m).

Finally, using primitive recursion it is possible to concatenate sequences. First, to concatenate a single number: if s = ⟨x₀, . . . , xₙ⟩ then s ∗₁ y = s · p(len(s) + 1)^(y+1). Then we can define the concatenation of s with t = ⟨y₀, . . . , yₘ⟩ by defining f(s, t) = s and g(s, t, j, i) = j ∗₁ (t)ᵢ; by primitive recursion, there is then a function h(s, t, i) whose value is the first i elements of t appended to s, and s ∗ t = h(s, t, len(t)).

We can also define ∗ᵤ, which concatenates only those elements of t not appearing in s. This just requires defining the graph of g to be g(s, t, j, i, x) ↔ [in(s, (t)ᵢ) ∧ x = j] ∨ [¬in(s, (t)ᵢ) ∧ x = j ∗₁ (t)ᵢ].
Version: 6 Owner: Henry Author(s): Henry
14.2 Boolean
Boolean refers to that which can take on the values "true" or "false", or that which concerns truth and falsity. For example: "Boolean variable", "Boolean logic", "Boolean statement", etc.

"Boolean" is named for George Boole, the 19th-century mathematician.
Version: 5 Owner: akrowne Author(s): akrowne
14.3 Gödel numbering

A Gödel numbering is any way of assigning numbers to the formulas of a language. This is often useful in allowing sentences of a language to be self-referential. The number associated with a formula φ is called its Gödel number and is denoted ⌜φ⌝.

More formally, if L is a language and G is a surjective partial function from the terms of L to the formulas of L, then G is a Gödel numbering. ⌜φ⌝ may be any term t such that G(t) = φ. Note that G is not defined within L (there is no formula or object of L representing G); however, properties of it (such as being in the domain of G, being a subformula, and so on) are.

Although anything meeting the properties above is a Gödel numbering, depending on the specific language and usage, any of the following properties may also be desired (and can often be obtained if more effort is put into the numbering):

• If φ is a subformula of ψ then ⌜φ⌝ < ⌜ψ⌝
• For every number n, there is some φ such that ⌜φ⌝ = n
• G is injective
Version: 4 Owner: Henry Author(s): Henry
14.4 Gödel's incompleteness theorems

Gödel's first and second incompleteness theorems are perhaps the most celebrated results in mathematical logic. The basic idea behind Gödel's proofs is that by the device of Gödel numbering, one can formulate properties of theories and sentences as arithmetical properties of the corresponding Gödel numbers, thus allowing first order arithmetic to speak of its own consistency, provability of some sentence, and so forth.

The original result Gödel proved in his classic paper On Formally Undecidable Propositions in Principia Mathematica and Related Systems can be stated as

Theorem 1. No theory T axiomatisable in the type system of PM (i.e. in Russell's theory of types) which contains Peano arithmetic and is ω-consistent proves all true theorems of arithmetic (and no false ones).

Stated this way, the theorem is an obvious corollary of Tarski's result on the undefinability of truth. This can be seen as follows. Consider a Gödel numbering G, which assigns to each formula φ its Gödel number ⌜φ⌝. The set of Gödel numbers of all true sentences of arithmetic is {⌜φ⌝ | N ⊨ φ}, and by Tarski's result it isn't definable by any arithmetic formula. But assume there's a theory T, an axiomatisation Ax_T of which is definable in arithmetic, and which proves all true statements of arithmetic. But now ∃P(P is a proof of x from Ax_T) defines the set of (Gödel numbers of) true sentences of arithmetic, which contradicts Tarski's result.

The proof given above is highly non-constructive. A much stronger version can actually be extracted from Gödel's paper, namely

Theorem 2. There is a primitive recursive function G such that if T is a theory with a p.r. axiomatisation α, and if all primitive recursive functions are representable in T, then N ⊨ G(α) but T ⊬ G(α).

This second form of the theorem is the one usually proved, although the theorem is usually stated in a form for which the non-constructive proof based on Tarski's result would suffice. The proof of this stronger version is based on a similar idea as Tarski's result.

Consider the formula ∃P(P is a proof of x from α), which defines a predicate Prov_α(x) which represents provability from α. Assume we have enumerated the open formulae with one variable in a sequence B_i, so that every open formula occurs. Consider now the sentence ¬Prov(B_x), which defines the non-provability-from-α predicate. Now, since ¬Prov(B_x) is an open formula with one variable, it must be B_k for some k. Thus we can consider the closed sentence B_k(k). This sentence is equivalent to ¬Prov(subst(⌜¬Prov_α(x)⌝, k)), but since subst(⌜¬Prov(x)⌝, k) is just B_k(k), it "asserts its own unprovability".

Since all the steps we took to get the undecided but true sentence B_k(k) were very simple mechanical manipulations of Gödel numbers guaranteed to terminate in bounded time, we have in fact produced the p.r. function G required by the statement of the theorem.

The first version of the proof can be used to show that many non-axiomatisable theories are also incomplete. For example, consider PA + all true Π₁ sentences. Since Π₁ truth is definable at the Π₂ level, this theory is definable in arithmetic by a formula α. However, it's not complete, since otherwise ∃p(p is a proof of x from α) would define the set of true sentences of arithmetic. This can be extended to show that no arithmetically definable theory with sufficient expressive power is complete.

The second version of Gödel's first incompleteness theorem suggests a natural way to extend theories to stronger theories which are exactly as sound as the original theories. This sort of process has been studied by Turing, Feferman, Fenstad and others under the names of ordinal logics and transfinite recursive progressions of arithmetical theories.

Gödel's second incompleteness theorem concerns what a theory can prove about its own provability predicate, in particular whether it can prove that no contradiction is provable. The answer, under very general settings, is that a theory can't prove that it is consistent without actually being inconsistent.

The second incompleteness theorem is best presented by means of a provability logic. Consider an arithmetic theory T which is p.r. axiomatised by α. We extend the language this theory is expressed in with a new sentence-forming operator P, so that any sentence in parentheses prefixed by P is a sentence. Thus for example P(0 = 1) is a formula. Intuitively, we want P(φ) to express the provability of φ from α. Thus the semantics of our new language is exactly the same as that of the original language, with the additional rule that P(φ) is true if and only if α ⊢ φ. There is a slight difficulty here; φ might itself contain boxed expressions, and we haven't yet provided any semantics for these. The answer is simple: whenever a boxed expression P(ψ) occurs within the scope of another box, we replace it with the arithmetical statement Prov_α(ψ). Thus for example the truth of P(P(0 = 1)) is equivalent to α ⊢ Prov_α(⌜0 = 1⌝). Assuming that α is strong enough to prove all true instances of Prov(⌜φ⌝), we can in fact interpret the whole of the new boxed language by this translation. This is what we shall do, so formally α ⊢ φ (where φ might contain boxed sentences) is taken to mean α ⊢ φ*, where φ* is obtained by replacing the boxed expressions with arithmetical formulae as above.

There are a number of restrictions we must impose on α (and thus on P, the meaning of which is determined by α). These are known as the Hilbert-Bernays derivability conditions, and they are as follows:

• if α ⊢ φ then α ⊢ P(φ)
• α ⊢ P(φ) → P(P(φ))
• α ⊢ P(φ → ψ) → (P(φ) → P(ψ))

A statement Cons asserts the consistency of α if it is equivalent to ¬P(0 = 1). Gödel's first incompleteness theorem shows that there is a sentence B_k(k) for which the following is true: ¬P(0 = 1) → Q(B_k(k)) ∧ Q(¬B_k(k)), where Q is the dual of P, i.e. Q(φ) ↔ ¬P(¬φ). A careful analysis reveals that this is provable in any α which satisfies the derivability conditions, i.e. α ⊢ ¬P(0 = 1) → Q(B_k(k)) ∧ Q(¬B_k(k)). Assume now that α can prove ¬P(0 = 1), i.e. that α can prove its own consistency. Then α can prove Q(B_k(k)) ∧ Q(¬B_k(k)). But this means that α can prove B_k(k)! Thus α is inconsistent.
Version: 4 Owner: Aatu Author(s): Aatu
14.5 Lindenbaum algebra
Let L be a first order language. We define the equivalence relation ∼ over formulas of L by ϕ ∼ ψ if and only if ⊢ ϕ ⇔ ψ. Let B = L/∼ be the set of equivalence classes. We define the operations ⊕ and ⊗ and complementation, denoted ¬[ϕ], on B by:

[ϕ] ⊕ [ψ] = [ϕ ∨ ψ]
[ϕ] ⊗ [ψ] = [ϕ ∧ ψ]
¬[ϕ] = [¬ϕ]

We let 0 = [ϕ ∧ ¬ϕ] and 1 = [ϕ ∨ ¬ϕ]. Then the structure (B, ⊕, ⊗, ¬, 0, 1) is a Boolean algebra, called the Lindenbaum algebra.

Note that it may be possible to define the Lindenbaum algebra on extensions of first order logic, as long as there is a notion of formal proof that allows the definition of the equivalence relation.
Version: 12 Owner: jihemme Author(s): jihemme
14.6 Lindström's theorem

One of the very first results of the study of model-theoretic logics is a characterisation theorem due to Per Lindström. He showed that classical first order logic is the strongest logic having the following properties:

• being closed under contradictory negation
• compactness
• the Löwenheim-Skolem theorem

He also showed that first order logic can be characterised as the strongest logic for which the following hold:

• completeness (r.e. axiomatisability)
• the Löwenheim-Skolem theorem

The notion of "strength" used here is as follows. A logic L′ is stronger than L, or as strong, if the class of sets definable in L is contained in the class of sets definable in L′.
Version: 2 Owner: Aatu Author(s): Aatu
14.7 Presburger arithmetic

Presburger arithmetic is a weakened form of arithmetic which includes the structure N, the constant 0, the unary function S, the binary function +, and the binary relation <. Essentially, it is Peano arithmetic without multiplication.

Presburger arithmetic is decidable, but is consequently very limited in what it can express.
Version: 2 Owner: Henry Author(s): Henry
14.8 R-minimal element
Let S be a set and R be a relation on S. An element a ∈ S is said to be R-minimal if and only if there is no x ∈ S such that x R a.
Version: 1 Owner: jihemme Author(s): jihemme
14.9 Skolemization
Skolemization is a way of removing existential quantifiers from a formula. Variables bound by existential quantifiers which are not inside the scope of universal quantifiers can simply be replaced by constants: ∃x[x < 3] can be changed to c < 3, with c a suitable constant.

When the existential quantifier is inside a universal quantifier, the bound variable must be replaced by a Skolem function of the variables bound by universal quantifiers. Thus ∀x[x = 0 ∨ ∃y[x = y + 1]] becomes ∀x[x = 0 ∨ x = f(x) + 1].

This is used in second order logic to move all existential quantifiers outside the scope of first order universal quantifiers. This can be done since second order quantifiers can quantify over functions. For instance ∀¹x ∀¹y ∃¹z φ(x, y, z) is equivalent to ∃²F ∀¹x ∀¹y φ(x, y, F(x, y)).
Version: 1 Owner: Henry Author(s): Henry
14.10 arithmetical hierarchy
The arithmetical hierarchy is a hierarchy of either (depending on the context) formulas or relations. The relations of a particular level of the hierarchy are exactly the relations defined by the formulas of that level, so the two uses are essentially the same.

The first level consists of formulas with only bounded quantifiers; the corresponding relations are also called the primitive recursive relations (this definition is equivalent to the definition from computer science). This level is called any of ∆⁰₀, Σ⁰₀ and Π⁰₀, depending on context.

A formula φ is Σ⁰ₙ if there is some ∆⁰₁ formula ψ such that φ can be written

φ(k̄) = ∃x₁ ∀x₂ · · · Q xₙ ψ(k̄, x̄)

where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers.

The Σ⁰₁ relations are the same as the recursively enumerable relations.

Similarly, φ is a Π⁰ₙ relation if there is some ∆⁰₁ formula ψ such that

φ(k̄) = ∀x₁ ∃x₂ · · · Q xₙ ψ(k̄, x̄)

where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers.

A formula is ∆⁰ₙ if it is both Σ⁰ₙ and Π⁰ₙ. Since each Σ⁰ₙ formula is just the negation of a Π⁰ₙ formula and vice versa, the Σ⁰ₙ relations are the complements of the Π⁰ₙ relations.

The relations in ∆⁰₁ = Σ⁰₁ ∩ Π⁰₁ are the recursive relations.

Higher levels of the hierarchy correspond to broader and broader classes of relations. A formula or relation which is Σ⁰ₙ (or, equivalently, Π⁰ₙ) for some integer n is called arithmetical.

The superscript 0 is often omitted when it is not necessary to distinguish from the analytic hierarchy.

Functions can be described as being in one of the levels of the hierarchy if the graph of the function is in that level.
Version: 14 Owner: iddo Author(s): yark, iddo, Henry
14.11 arithmetical hierarchy is a proper hierarchy
By definition, we have ∆ₙ = Πₙ ∩ Σₙ. In addition, Σₙ ∪ Πₙ ⊆ ∆ₙ₊₁.

This is proved by vacuous quantification. If R is equivalent to φ(n̄) then R is equivalent to ∀x φ(n̄) and ∃x φ(n̄), where x is some variable that does not occur free in φ.

More significant is the proof that all containments are proper. First, let n ≥ 1 and U be universal for 2-ary Σₙ relations. Then D(x) ↔ U(x, x) is obviously Σₙ. But suppose D ∈ ∆ₙ. Then D ∈ Πₙ, so ¬D ∈ Σₙ. Since U is universal, there is some e such that ¬D(x) ↔ U(e, x), and therefore ¬D(e) ↔ U(e, e) ↔ D(e). This is clearly a contradiction, so D ∈ Σₙ \ ∆ₙ and ¬D ∈ Πₙ \ ∆ₙ.

In addition, consider the recursive join of D and ¬D, defined by

D ⊕ ¬D(x) ↔ (∃y ≤ x[x = 2·y] ∧ D(y)) ∨ (∃y ≤ x[x = 2·y + 1] ∧ ¬D(y))

Clearly both D and ¬D can be recovered from D ⊕ ¬D, so it is contained in neither Σₙ nor Πₙ. However the definition above has only bounded quantifiers except for those in D and ¬D, so D ⊕ ¬D(x) ∈ ∆ₙ₊₁ \ (Σₙ ∪ Πₙ).
Version: 3 Owner: Henry Author(s): Henry
14.12 atomic formula
Let L be a first order language, and suppose it has signature Σ. A formula ϕ of L is said to be atomic if and only if:

1. ϕ = "t₁ = t₂", where t₁ and t₂ are terms; or
2. ϕ = "R(t₁, . . . , tₙ)", where R ∈ Σ is an n-ary relation symbol.
Version: 1 Owner: jihemme Author(s): jihemme
14.13 creating an infinite model
From the syntactic compactness theorem for first order logic, we get this nice (and useful) result:

Let T be a theory of first-order logic. If T has finite models of unboundedly large sizes, then T also has an infinite model.

Proof. Define the propositions

Φₙ ⇔ ∃x₁ . . . ∃xₙ. (x₁ ≠ x₂) ∧ . . . ∧ (x₁ ≠ xₙ) ∧ (x₂ ≠ x₃) ∧ . . . ∧ (xₙ₋₁ ≠ xₙ)

(Φₙ says "there exist (at least) n different elements in the world"). Note that · · · ⊢ Φₙ ⊢ · · · ⊢ Φ₂ ⊢ Φ₁. Define a new theory

T∞ = T ∪ {Φ₁, Φ₂, . . .}.

For any finite subset T′ ⊂ T∞, we claim that T′ is consistent: Indeed, T′ contains axioms of T, along with finitely many of {Φₙ}ₙ≥₁. Let Φₘ correspond to the largest index appearing in T′. If Mₘ ⊨ T is a model of T with at least m elements (and by hypothesis such a model exists), then Mₘ ⊨ T ∪ {Φₘ} ⊢ T′.

So every finite subset of T∞ is consistent; by the compactness theorem for first-order logic, T∞ is consistent, and by Gödel's completeness theorem for first-order logic it has a model M. Then M ⊨ T∞ ⊢ T, so M is a model of T with infinitely many elements (M ⊨ Φₙ for any n, so M has at least n elements for all n).
Version: 3 Owner: ariels Author(s): ariels
14.14 criterion for consistency of sets of formulas
Let L be a first order language, and ∆ ⊆ L be a set of sentences. Then ∆ is consistent if and only if every finite subset of ∆ is consistent.
Version: 2 Owner: jihemme Author(s): jihemme
14.15 deductions are ∆₁

Using the example of Gödel numbering, we can show that Proves(d, x) (the statement that d is a proof of x, which will be formally defined below) is ∆₁.

First, Term(x) should be true iff x is the Gödel number of a term. Thanks to primitive recursion, we can define it by:

Term(x) ↔ ∃i < x [x = ⟨0, i⟩] ∨
 x = ⟨5⟩ ∨
 ∃y < x [x = ⟨6, y⟩ ∧ Term(y)] ∨
 ∃y, z < x [x = ⟨8, y, z⟩ ∧ Term(y) ∧ Term(z)] ∨
 ∃y, z < x [x = ⟨9, y, z⟩ ∧ Term(y) ∧ Term(z)]

Then AtForm(x), which is true when x is the Gödel number of an atomic formula, is defined by:

AtForm(x) ↔ ∃y, z < x [x = ⟨1, y, z⟩ ∧ Term(y) ∧ Term(z)] ∨
 ∃y, z < x [x = ⟨7, y, z⟩ ∧ Term(y) ∧ Term(z)]

Next, Form(x), which is true only if x is the Gödel number of a formula, is defined recursively by:

Form(x) ↔ AtForm(x) ∨
 ∃i, y < x [x = ⟨2, i, y⟩ ∧ Form(y)] ∨
 ∃y < x [x = ⟨3, y⟩ ∧ Form(y)] ∨
 ∃y, z < x [x = ⟨4, y, z⟩ ∧ Form(y) ∧ Form(z)]

The definition of QFForm(x), which is true when x is the Gödel number of a quantifier-free formula, is the same except without the second clause.

Next we want to show that the set of logical tautologies is ∆₁. This will be done by formalizing the concept of truth tables, which will require some development. First we show that AtForms(a), a sequence containing the (distinct) atomic formulas of a, is ∆₁. Define it by:

AtForms(a, t) ↔ (¬Form(a) ∧ t = 0) ∨
 Form(a) ∧ (
  ∃x, y < a [a = ⟨1, x, y⟩ ∧ t = ⟨a⟩] ∨
  ∃x, y < a [a = ⟨7, x, y⟩ ∧ t = ⟨a⟩] ∨
  ∃i, x < a [a = ⟨2, i, x⟩ ∧ t = AtForms(x)] ∨
  ∃x < a [a = ⟨3, x⟩ ∧ t = AtForms(x)] ∨
  ∃x, y < a [a = ⟨4, x, y⟩ ∧ t = AtForms(x) ∗ᵤ AtForms(y)])

We say v is a truth assignment if it is a sequence of pairs, with the first member of each pair being an atomic formula and the second being either 1 or 0:

TA(v) ↔ ∀i < len(v) ∃x, y < (v)ᵢ [(v)ᵢ = ⟨x, y⟩ ∧ AtForm(x) ∧ (y = 1 ∨ y = 0)]

Then v is a truth assignment for a if v is a truth assignment, a is quantifier free, and every atomic formula in a is the first member of one of the pairs in v. That is:

TAF(v, a) ↔ TA(v) ∧ QFForm(a) ∧ ∀i < len(AtForms(a)) ∃j < len(v) [((v)ⱼ)₀ = (AtForms(a))ᵢ]

Then we can define when v makes a true by:

True(v, a) ↔ TAF(v, a) ∧ (
 [AtForm(a) ∧ ∃i < len(v) [((v)ᵢ)₀ = a ∧ ((v)ᵢ)₁ = 1]] ∨
 ∃y < a [a = ⟨3, y⟩ ∧ ¬True(v, y)] ∨
 ∃y, z < a [a = ⟨4, y, z⟩ ∧ (True(v, y) → True(v, z))])

Then a is a tautology if every truth assignment for a makes it true:

Taut(a) ↔ ∀v < 2^(2^AtForms(a)) [TAF(v, a) → True(v, a)]

We say that a number d is a deduction of φ if it encodes a proof of φ from a set of axioms Ax. This means that d is a sequence where for each (d)ᵢ either:

• (d)ᵢ is the Gödel number of an axiom,
• (d)ᵢ is the Gödel number of a logical tautology, or
• there are some j, k < i such that (d)ⱼ = ⟨4, (d)ₖ, (d)ᵢ⟩ (that is, (d)ᵢ is a conclusion under modus ponens from (d)ⱼ and (d)ₖ),

and the last element of d is ⌜φ⌝.

If Ax is ∆₁ (almost every system of axioms, including PA, is ∆₁) then Proves(d, x), which is true if d is a deduction whose last value is x, is also ∆₁. This is fairly simple to see from the above results (let Ax(y) be the relation specifying that y is the Gödel number of an axiom):

Proves(d, x) ↔ ∀i < len(d) [Ax((d)ᵢ) ∨ ∃j, k < i [(d)ⱼ = ⟨4, (d)ₖ, (d)ᵢ⟩] ∨ Taut((d)ᵢ)] ∧ (d)_(len(d)) = x
Version: 5 Owner: Henry Author(s): Henry
14.16 example of Gödel numbering

We can define by recursion a function e from formulas of arithmetic to numbers, and the corresponding Gödel numbering as the inverse.

The symbols of the language of arithmetic are =, ∀, ¬, →, 0, S, <, +, ·, the variables vᵢ for any integer i, and ( and ). ( and ) are only used to define the order of operations, and should be inferred where appropriate in the definition below.

We can define the function e by recursion as follows:

• e(vᵢ) = ⟨0, i⟩
• e(φ = ψ) = ⟨1, e(φ), e(ψ)⟩
• e(∀vᵢ φ) = ⟨2, e(vᵢ), e(φ)⟩
• e(¬φ) = ⟨3, e(φ)⟩
• e(φ → ψ) = ⟨4, e(φ), e(ψ)⟩
• e(0) = ⟨5⟩
• e(Sφ) = ⟨6, e(φ)⟩
• e(φ < ψ) = ⟨7, e(φ), e(ψ)⟩
• e(φ + ψ) = ⟨8, e(φ), e(ψ)⟩
• e(φ · ψ) = ⟨9, e(φ), e(ψ)⟩

Clearly e⁻¹ is a Gödel numbering, with ⌜φ⌝ = e(φ).
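A brief sketch of part of this recursion (ours, not from the original entry); Python tuples stand in for the tupling function ⟨. . .⟩, which a full Gödel numbering would then collapse to single numbers, e.g. with the prime-power coding of section 14.1:

    def e(formula):
        kind = formula[0]
        if kind == "var":                              # v_i
            return (0, formula[1])
        if kind == "=":
            return (1, e(formula[1]), e(formula[2]))
        if kind == "forall":
            return (2, e(formula[1]), e(formula[2]))
        if kind == "not":
            return (3, e(formula[1]))
        if kind == "->":
            return (4, e(formula[1]), e(formula[2]))
        if kind == "0":
            return (5,)
        if kind == "S":
            return (6, e(formula[1]))
        raise ValueError("unknown symbol")

    # Code of the formula  forall v_0 (not (v_0 = 0)):
    phi = ("forall", ("var", 0), ("not", ("=", ("var", 0), ("0",))))
    print(e(phi))   # (2, (0, 0), (3, (1, (0, 0), (5,))))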
Version: 3 Owner: Henry Author(s): Henry
14.17 example of well-founded induction
As an example of the use of well-founded induction in the case where the order is not a linear one, I'll prove the fundamental theorem of arithmetic: every natural number has a prime factorization.

First note that the division relation is well-founded. This fact is proven in every algebra book. The |-minimal elements are the prime numbers. We detail the two steps of the proof:

1. If n is prime, then n is its own factorization into primes, so the assertion is true for the |-minimal elements.

2. If n is not prime, then n has a non-trivial factorization (by definition of not being prime), i.e. n = mk, where m, k ≠ 1. By induction, m and k have prime factorizations, and we can see that this implies that n has one too. This takes care of case 2.

Here are other commonly used well-founded sets:

1. ideals of a Noetherian ring ordered by inverse proper inclusion;
2. ideals of an Artinian ring ordered by inclusion;
3. graphs ordered by minors (a graph A is a minor of B if and only if it can be obtained from B by collapsing edges);
4. ordinal numbers;
5. etc.
Version: 4 Owner: jihemme Author(s): jihemme
14.18 first order language
Terms and formulas of first order logic are constructed with the classical logical symbols ∀, ∃, ∧, ∨, ¬, ⇒, ⇔, and also ( and ), and a set

Σ = (⋃_{n∈ω} Relₙ) ∪ (⋃_{n∈ω} Funₙ) ∪ Const

where for each natural number n,

• Relₙ is a (usually countable) set of n-ary relation symbols;
• Funₙ is a (usually countable) set of n-ary function symbols;
• Const is a (usually countable) set of constant symbols.

We require that all these sets be disjoint. The elements of the set Σ are the only non-logical symbols that we are allowed to use when we construct terms and formulas. They form the signature of the language. So far they are only symbols, so they don't mean anything. For most structures that we encounter, the set Σ is finite, but we allow it to be infinite, even uncountable, as this sometimes makes things easier, and just about everything still works when the signature is uncountable. We also assume that we have an unlimited supply of variables, with the only constraint that the collection of variables form a set, which should be disjoint from the other sets of non-logical symbols.

The arity of a function or relation symbol is the number of parameters the symbol takes. It is usually assumed to be a property of the symbol, and it is bad grammar to use an n-ary function or relation symbol with m parameters if m ≠ n.

Terms are built inductively according to the following rules:

1. Any variable is a term;
2. Any constant symbol is a term;
3. If f is an n-ary function symbol, and t₁, . . . , tₙ are terms, then f(t₁, . . . , tₙ) is a term.

With terms in hand, we build formulas inductively by a finite application of the following rules:

1. If t₁ and t₂ are terms, then t₁ = t₂ is a formula;
2. If R is an n-ary relation symbol and t₁, . . . , tₙ are terms, then R(t₁, . . . , tₙ) is a formula;
3. If ϕ is a formula, then so is ¬ϕ;
4. If ϕ and ψ are formulas, then so is ϕ ∨ ψ;
5. If ϕ is a formula, and x is a variable, then ∃x(ϕ) is a formula.

The other logical symbols are obtained in the following way:

ϕ ∧ ψ := ¬(¬ϕ ∨ ¬ψ)        ϕ ⇒ ψ := ¬ϕ ∨ ψ
ϕ ⇔ ψ := (ϕ ⇒ ψ) ∧ (ψ ⇒ ϕ)        ∀x.ϕ := ¬(∃x(¬ϕ))

In this way all of the logical symbols can be used when building formulas.
Version: 8 Owner: jihemme Author(s): jihemme
14.19 first order logic
A logic is first order if it has exactly one type. Usually the term refers specifically to the logic with connectives ¬, ∨, ∧, →, and ↔ and the quantifiers ∀ and ∃, all given the usual semantics:

• ¬φ is true iff φ is not true
• φ ∨ ψ is true if either φ is true or ψ is true
• ∀x φ(x) is true iff φ^t_x is true for every object t (where φ^t_x is the result of replacing every unbound occurrence of x in φ with t)
• φ ∧ ψ is the same as ¬(¬φ ∨ ¬ψ)
• φ → ψ is the same as (¬φ) ∨ ψ
• φ ↔ ψ is the same as (φ → ψ) ∧ (ψ → φ)
• ∃x φ(x) is the same as ¬∀x ¬φ(x)

However, languages with slightly different quantifiers and connectives are sometimes still called first order as long as there is only one type.
Version: 4 Owner: Henry Author(s): Henry
14.20 first order theories
Let L be a first-order language. A theory in L is a set of sentences of L, i.e. a set of formulas of L that have no free variables.

Definition. A theory T is said to be consistent if and only if T ⊬ ⊥, where ⊥ stands for "false". In other words, T is consistent if one cannot derive a contradiction from it. If ϕ is a sentence of L, then we say ϕ is consistent with T if and only if the theory T ∪ {ϕ} is consistent.

Definition. A theory T ⊆ L is said to be complete if and only if for every sentence ϕ ∈ L, either T ⊢ ϕ or T ⊢ ¬ϕ.

Lemma. A theory T in L is complete if and only if it is maximal consistent. In other words, T is complete if and only if for every sentence ϕ ∉ T, the theory T ∪ {ϕ} is inconsistent.

Theorem. (Tarski) Every consistent theory T in L can be extended to a complete consistent theory.

Proof: Use Zorn's lemma on the collection of consistent theories extending T. ♦
Version: 3 Owner: jihemme Author(s): jihemme
14.21 free and bound variables
In the entry on first-order languages, I have mentioned the use of variables without mentioning what variables really are. A variable is a symbol that is supposed to range over the universe of discourse. Unlike a constant, it has no fixed value.

There are two ways in which a variable can occur in a formula: free or bound. Informally, a variable is said to occur free in a formula ϕ if and only if it is not within the "scope" of a quantifier. For instance, x occurs free in ϕ if and only if it occurs in it as a symbol, and no subformula of ϕ is of the form ∃x.ψ. Here the x after the ∃ is to be taken literally: it is x and no other symbol.

The set FV(ϕ) of free variables of ϕ is defined by well-founded induction on the construction of formulas. First we define Var(t), where t is a term, to be the set of all variables occurring in t, and then:

FV(t₁ = t₂) = Var(t₁) ∪ Var(t₂)
FV(R(t₁, . . . , tₙ)) = ⋃ₖ₌₁ⁿ Var(tₖ)
FV(¬ϕ) = FV(ϕ)
FV(ϕ ∨ ψ) = FV(ϕ) ∪ FV(ψ)
FV(∃x(ϕ)) = FV(ϕ) \ {x}

When for some ϕ the set FV(ϕ) is not empty, it is customary to write ϕ as ϕ(x₁, . . . , xₙ), in order to stress the fact that there are some free variables left in ϕ, and that those free variables are among x₁, . . . , xₙ. When x₁, . . . , xₙ appear free in ϕ, they are considered as place-holders, and it is understood that we will have to supply "values" for them when we want to determine the truth of ϕ. If FV(ϕ) = ∅, then ϕ is called a sentence.

If a variable never occurs free in ϕ (and occurs as a symbol), then we say the variable is bound. A variable x is bound if and only if ∃x(ψ) or ∀x(ψ) is a subformula of ϕ for some ψ.

The problem with this definition is that a variable can occur both free and bound in the same formula. For example, consider the following formula of the language {+, ·, 0, 1} of ring theory:

x + 1 = 0 ∧ ∃x(x + y = 1)

The variable x occurs both free and bound here. However, the following lemma tells us that we can always avoid this situation:

Lemma 1. It is possible to rename the bound variables without affecting the truth of a formula. In other words, if ϕ = ∃x(ψ), or ∀x(ψ), and z is a variable not occurring in ψ, then ⊢ ϕ ⇔ ∃z(ψ(z/x)), where ψ(z/x) is the formula obtained from ψ by replacing every free occurrence of x by z.
Version: 5 Owner: jihemme Author(s): jihemme
14.22 generalized quantifier
Generalized quantifiers are an abstract way of defining quantifiers.

The underlying principle is that formulas quantified by a generalized quantifier are true if the set of elements satisfying those formulas belongs to some relation associated with the quantifier.

Every generalized quantifier has an arity, which is the number of formulas it takes as arguments, and a type, which for an n-ary quantifier is a tuple of length n. The tuple represents the number of quantified variables for each argument.

The most common quantifiers are those of type ⟨1⟩, including ∀ and ∃. If Q is a quantifier of type ⟨1⟩, M is the universe of a model, and Q_M is the relation associated with Q in that model, then Qx φ(x) ↔ {x ∈ M | φ(x)} ∈ Q_M.

So ∀_M = {M}, since the quantified formula is only true when all elements satisfy it. On the other hand ∃_M = P(M) − {∅}.

In general, the monadic quantifiers are those of type ⟨1, . . . , 1⟩, and if Q is an n-ary monadic quantifier then Q_M ⊆ P(M)ⁿ. Härtig's quantifier, for instance, is of type ⟨1, 1⟩, and I_M = {⟨X, Y⟩ | X, Y ⊆ M ∧ |X| = |Y|}.

A quantifier Q is polyadic if it is of type ⟨n₁, . . . , nₖ⟩ where each nᵢ ∈ N. Then:

Q_M ⊆ ∏ᵢ P(M^(nᵢ))

These can get quite elaborate; Wxy φ(x, y) is a ⟨2⟩ quantifier where X ∈ W_M ↔ X is a well-ordering. That is, it is true if the set of pairs making φ true is a well-ordering.
Version: 1 Owner: Henry Author(s): Henry
14.23 logic
Generally, by logic, people mean first order logic, a formal set of rules for building mathematical statements out of symbols like ¬ (negation) and → (implication), along with quantifiers like ∀ (for every) and ∃ (there exists).

More generally, a logic is any set of rules for forming sentences (the logic's syntax) together with rules for assigning truth values to them (the logic's semantics). Normally it includes a (possibly empty) set of types T (also called sorts), which represent the different kinds of objects that the theory discusses (typical examples might be sets, numbers, or sets of numbers). In addition it specifies particular quantifiers, connectives, and variables. Particular theories in the logic can then add relations and functions to fully specify a logical language.
Version: 5 Owner: Henry Author(s): Henry
14.24 proof of compactness theorem for first order logic
The theorem states that if a set of sentences of a first-order language L is inconsistent, then some finite subset of it is inconsistent. Suppose ∆ ⊆ L is inconsistent. Then by definition ∆ ⊢ ⊥, i.e. there is a formal proof of "false" using only assumptions from ∆. Formal proofs are finite objects, so let Γ collect all the formulas of ∆ that are used in the proof. Then Γ is a finite subset of ∆ with Γ ⊢ ⊥, i.e. Γ is inconsistent.
Version: 1 Owner: jihemme Author(s): jihemme
14.25 proof of principle of transfinite induction
To prove the transfinite induction theorem, we note that the class of ordinals is well-ordered by ∈. So suppose for some Φ there are ordinals α such that Φ(α) is not true. Suppose further that Φ satisfies the hypothesis, i.e. ∀α(∀β < α(Φ(β)) ⇒ Φ(α)). We will reach a contradiction.

The class C = {α : ¬Φ(α)} is not empty. Note that it may be a proper class, but this is not important. Let γ = min(C) be the ∈-minimal element of C. Then by assumption, for every λ < γ, Φ(λ) is true. Thus, by the hypothesis, Φ(γ) is true, a contradiction.
Version: 8 Owner: jihemme Author(s): jihemme, quadrate
14.26 proof of the well-founded induction principle
This proof is very similar to the proof of the transfinite induction theorem. Suppose Φ is defined for a well-founded set (S, R), and suppose Φ is not true for every a ∈ S. Assume further that Φ satisfies requirements 1 and 2 of the statement. Since R is a well-founded relation, the set {a ∈ S : ¬Φ(a)} has an R-minimal element m. This element is either an R-minimal element of S itself, in which case condition 1 is violated, or it has R-predecessors. In this case, we have by minimality Φ(r) for every r such that r R m, and by condition 2, Φ(m) is true, a contradiction.
Version: 4 Owner: jihemme Author(s): jihemme
14.27 quantifier
A quantifier is a logical symbol which makes an assertion about the set of values which make one or more formulas true. This is an exceedingly general concept; the vast majority of mathematics is done with the two standard quantifiers, ∀ and ∃.

The universal quantifier ∀ takes a variable and a formula and asserts that the formula holds for any value of the variable. A typical example would be a sentence like

∀x[0 ≤ x]

which states that no matter what value x takes, 0 ≤ x.

The existential quantifier ∃ is the dual; that is, the formula ∀x φ(x) is equivalent to ¬∃x ¬φ(x). It states that there is some x satisfying the formula, as in

∃x[x > 0]

which states that there is some value of x greater than 0.

The scope of a quantifier is the portion of a formula where it binds its variables. Note that previous bindings of a variable are overridden within the scope of a quantifier. In the examples above, the scope of the quantifiers was the entire formula, but that need not be the case. The following is a more complicated use of quantifiers:

∀x[x = 0 ∨ ∃y[x = y + 1 ∧ (y = 0 ∨ ∃x[y = x + 1])]]

Here the scope of the universal quantifier is the whole formula, and the scope of the first existential quantifier is everything from ∃y to the final closing bracket. Within the scope of the second existential quantifier, all references to x refer to the variable bound by that existential quantifier; it is impossible there to refer directly to the variable bound by the universal quantifier.

As that example illustrates, it can be very confusing when one quantifier overrides another. Since it does not change the meaning of a sentence to change a bound variable and all bound occurrences of it, it is better form to replace sentences like that with an equivalent but more readable one like:

∀x[x = 0 ∨ ∃y[x = y + 1 ∧ (y = 0 ∨ ∃z[y = z + 1])]]

These sentences both assert that every number is either equal to zero, or that there is some number one less than it, and that the number one less than it is also either zero or has a number one less than it. [Note: This is not the most useful of sentences. It would be nice to replace this with a mathematically simple sentence which uses nested quantifiers meaningfully.]

The quantifiers may not range over all objects. That is, ∀x φ(x) may not specify that x can be any object, but rather any object belonging to some class of objects. Similarly ∃x φ(x) may specify that there is some x within that class which satisfies φ. For instance second order logic has two universal quantifiers, ∀¹ and ∀² (with corresponding existential quantifiers), and variables bound by them range only over the first and second order objects respectively. So ∀¹x[0 ≤ x] only states that all numbers are greater than or equal to 0, not that sets of numbers are as well (which would be meaningless).

A particular use of a quantifier is called bounded or restricted if it limits the objects to a smaller range. This is not quite the same as the situation mentioned above; in that situation, the definition of the quantifier does not include all objects, whereas here the quantifiers can range over everything but in a particular formula do not. This is expressed in first order logic with formulas like these four:

∀x[x < a → φ(x)]    ∀x[x ∈ X → φ(x)]    ∃x[x < a ∧ φ(x)]    ∃x[x ∈ X ∧ φ(x)]

The restriction is often incorporated into the quantifier. For instance the first example might be written ∀x < a[φ(x)].

A quantifier is called vacuous if the variable it binds does not appear anywhere in its scope, such as ∀x ∃y[0 ≤ x]. While vacuous quantifiers do not change the meaning of a sentence, they are occasionally useful in finding an equivalent formula of a specific form.

While these are the most common quantifiers (in particular, they are the only quantifiers appearing in classical first-order logic), some logics use others. The quantifier ∃!x φ(x), which means that there is a unique x satisfying φ(x), is equivalent to ∃x[φ(x) ∧ ∀y[φ(y) → x = y]].

Other quantifiers go beyond the usual two. Examples include interpreting Qx φ(x) to mean that there are an infinite (or uncountably infinite) number of x satisfying φ(x). More elaborate examples include the branching Henkin quantifier, written as a 2 × 2 array of quantifiers,

∀x ∃y
∀a ∃b   φ(x, y, a, b)

which is similar to ∀x∃y∀a∃b φ(x, y, a, b) except that the choice of y cannot depend on the values of a and b. This concept can be further generalized to the game-semantic, or independence-friendly, quantifiers. All of these quantifiers are examples of generalized quantifiers.
Version: 7 Owner: Henry Author(s): Henry
14.28 quantifier free
Let L be a first order language. A formula ψ is quantifier free iff it contains no quantifiers.

Let T be a complete L-theory. Let S ⊆ L. Then S is an elimination set for T iff for every ψ(x̄) ∈ L there is some φ(x̄) ∈ S so that T ⊢ ∀x̄(ψ(x̄) ↔ φ(x̄)).

In particular, T has quantifier elimination iff the set of quantifier free formulas is an elimination set for T. In other words, T has quantifier elimination iff for every ψ(x̄) ∈ L there is some quantifier free φ(x̄) ∈ L so that T ⊢ ∀x̄(ψ(x̄) ↔ φ(x̄)).
Version: 2 Owner: mathcam Author(s): mathcam, Timmy
14.29 subformula
Let L be a first order language and suppose ϕ, ψ ∈ L are formulas. Then we say that ϕ is a subformula of ψ if and only if:

1. ψ = ϕ;
2. ψ is one of ¬α, ∀x(α) or ∃x(α), and either ϕ = α, or ϕ is a subformula of α; or
3. ψ is α ∨ β or α ∧ β, and either ϕ = α, ϕ = β, or ϕ is a subformula of α or β.
Version: 2 Owner: jihemme Author(s): jihemme
14.30 syntactic compactness theorem for first order
logic
Let L be a first-order language, and ∆ ⊆ L be a set of sentences. If ∆ is inconsistent, then some finite Γ ⊆ ∆ is inconsistent.
Version: 2 Owner: jihemme Author(s): jihemme
14.31 transfinite induction
Suppose Φ(α) is a property defined for every ordinal α. The principle of transfinite induction states that if, for every α, the fact that Φ(β) is true for every β < α implies that Φ(α) is true, then Φ(α) is true for every ordinal α. Formally:

∀α(∀β(β < α ⇒ Φ(β)) ⇒ Φ(α)) ⇒ ∀α(Φ(α))

The principle of transfinite induction is very similar to the principle of finite induction, except that it is stated in terms of the whole class of the ordinals.
Version: 7 Owner: jihemme Author(s): jihemme, quadrate
14.32 universal relation
If Φ is a class of n-ary relations with x̄ as the only free variables, an (n + 1)-ary formula ψ is universal for Φ if for any φ ∈ Φ there is some e such that ψ(e, x̄) ↔ φ(x̄). In other words, ψ can simulate any element of Φ.

Similarly, if Φ is a class of functions of x̄, a formula ψ is universal for Φ if for any φ ∈ Φ there is some e such that ψ(e, x̄) = φ(x̄).
Version: 3 Owner: Henry Author(s): Henry
14.33 universal relations exist for each level of the arith-
metical hierarchy
Let F ∈ {Σₙ, ∆ₙ, Πₙ} and take any k ∈ N. Then there is a (k + 1)-ary relation U ∈ F such that U is universal for the k-ary relations in F.

Proof

First we prove the case where F = ∆₁, the recursive relations. We use the example of a Gödel numbering.

Define R to be a (k + 2)-ary relation such that R(e, x̄, c) holds if:

• e = ⌜φ⌝
• c is a deduction of either φ(x̄) or ¬φ(x̄)

Since deductions are ∆₁, it follows that R is ∆₁. Then define U′(e, x̄) to be the least c such that R(e, x̄, c), and U(e, x̄) ↔ (U′(e, x̄))_(len(U′(e, x̄))) = e (that is, the deduction found ends with φ rather than ¬φ). This is again ∆₁ since the ∆₁ functions are closed under minimization.

If f is any k-ary ∆₁ relation then f(x̄) ↔ U(⌜f⌝, x̄).

Now take F to be the k-ary relations in either Σₙ or Πₙ. Call the universal relation for (k + n)-ary ∆₁ relations U*. Then any φ ∈ F is equivalent to a relation of the form Q₁y₁ Q₂y₂ · · · Qₙyₙ ψ(x̄, ȳ) where ψ ∈ ∆₁, and so U(x̄) = Q₁y₁ Q₂y₂ · · · Qₙyₙ U*(⌜ψ⌝, x̄, ȳ). Then U is universal for F.

Finally, if F is the k-ary ∆ₙ relations and φ ∈ F, then φ is equivalent both to a relation of the form ∃y₁ ∀y₂ · · · Qyₙ ψ(x̄, ȳ) and to one of the form ∀z₁ ∃z₂ · · · Qzₙ η(x̄, z̄). If the k-ary universal relations for Σₙ and Πₙ are U_Σ and U_Π respectively, then φ(x̄) ↔ U_Σ(⌜ψ⌝, x̄) ∧ U_Π(⌜η⌝, x̄).
Version: 2 Owner: Henry Author(s): Henry
14.34 well-founded induction
The principle of well-founded induction is a generalization of the principle of transfinite induction.
Definition. Let S be a non-empty set, and R be a partial order relation on S. Then R is
said to be a well-founded relation if and only if every non-empty subset A ⊆ S has an R-minimal
element. In the special case where R is a total order, we say S is well-ordered by R. The
structure (S, R) is called a well-founded set.
Note that R is by no means required to be a total order. A classical example of a well-
founded set that is not totally ordered is the set N of natural numbers ordered by division,
i.e. aRb if and only if a divides b and a ≠ 1. The R-minimal elements of this order are the
prime numbers.
Let Φ be a property defined on a well-founded set S. The principle of well-founded induction
states that if the following is true:
1. Φ is true for all the R-minimal elements of S;
2. for every a, if for every x such that xRa we have Φ(x), then we have Φ(a);
then Φ is true for every a ∈ S.
As an example of application of this principle, we mention the proof of the fundamental theorem of arithmetic:
every natural number has a unique factorization into prime numbers. The proof goes by
well-founded induction in the set N ordered by division.
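As a concrete rendering of that last example, here is a small Haskell sketch (our own illustration, not part of the proof): factoring n > 1 recurses on n divided by its least divisor p > 1, i.e. on an element strictly below n in the divisibility order, so the recursion terminates by well-foundedness, and p is always prime.

factor :: Integer -> [Integer]
factor 1 = []
factor n = p : factor (n `div` p)
  where p = head [d | d <- [2..n], n `mod` d == 0]   -- the least divisor > 1 is prime

-- e.g. factor 360 evaluates to [2,2,2,3,3,5]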
Version: 10 Owner: jihemme Author(s): jihemme
14.35 well-founded induction on formulas
Let L be a first order language. The formulas of L are built by a finite application of the
rules of construction. This says that the relation < defined on formulas by ϕ < ψ if and only
if ϕ is a subformula of ψ is a well-founded relation. Therefore, we can formulate a principle
of induction for formulas as follows: suppose P is a property defined on formulas; then P
is true for every formula of L if and only if
1. P is true for the atomic formulas;
2. for every formula ϕ, if P is true for every subformula of ϕ, then P is true for ϕ.
Version: 3 Owner: jihemme Author(s): jihemme
Chapter 15
03B15 – Higher-order logic and type
theory
15.1 Härtig's quantifier
Härtig's quantifier is a quantifier which takes two variables and two formulas, written
Ixy φ(x)ψ(y). It asserts that |{x | φ(x)}| = |{y | ψ(y)}|. That is, the cardinality of the values
of x which make φ true is the same as the cardinality of the values which make ψ true. Viewed
as a generalized quantifier, I is a ⟨2⟩ quantifier.
Closely related is the Rescher quantifier, which also takes two variables and two formulas,
is written Jxy φ(x)ψ(y), and asserts that |{x | φ(x)}| < |{y | ψ(y)}|. The Rescher quantifier is
sometimes defined instead to be a similar but different quantifier, Jx φ(x) ↔ |{x | φ(x)}| >
|{x | ¬φ(x)}|. The first definition is a ⟨2⟩ quantifier while the second is a ⟨1⟩ quantifier.
Another similar quantifier is Chang's quantifier Q_C, a ⟨1⟩ quantifier defined by Q_C^M = {A ⊆
M | |A| = |M|}. That is, Q_C x φ(x) is true if the number of x satisfying φ has the same
cardinality as the universe; for finite models this is the same as ∀, but for infinite ones it is
not.
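On a finite model the Härtig quantifier reduces to comparing two counts, which the following Haskell sketch (our own illustration) makes explicit: haertig takes the finite universe and the two predicates as arguments.

haertig :: [a] -> (a -> Bool) -> (a -> Bool) -> Bool
haertig universe phi psi = length (filter phi universe) == length (filter psi universe)

-- e.g. haertig [1..10] even (> 5) evaluates to True: five even numbers, five numbers above 5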
Version: 3 Owner: Henry Author(s): Henry
15.2 Russell’s theory of types
After the discovery of the paradoxes of set theory (notably Russell's paradox), it became
apparent that naive set theory must be replaced by something in which the paradoxes can’t
arise. Two solutions were proposed: type theory and axiomatic set theory based on a
limitation of size principle (see the entries class and von Neumann-Bernays-Gödel set theory).
Type theory is based on the idea that impredicative definitions are the root of all evil.
Bertrand Russell and various other logicians in the beginning of the 20th century proposed
an analysis of the paradoxes that singled out so called vicious circles as the culprits. A
vicious circle arises when one attempts to define a class by quantifying over a totality of
classes including the class being defined. For example, Russell's class R = {x | x ∉ x}
contains a variable x that ranges over all classes.
Russell’s type theory, which is found in its mature form in the momentous Principia Mathe-
matica avoids the paradoxes by two devices. First, Frege’s fifth axiom is abandoned entirely:
the extensions of predicates do not appear among the objects. Secondly, the predicates
themselves are ordered into a ramified hierarchy so that the predicates at the lowest level
can be defined by speaking of objects only, the predicates at the next level by speaking of
objects and of predicates at the previous level and so forth.
The first of these principles has drastic implications to mathematics. For example, the
predicate “has the same cardinality” seemingly can’t be defined at all. For predicates apply
only to objects, and not to other predicates. In Frege’s system this is easy to overcome: the
equicardinality predicate is defined for extensions of predicates, which are objects. In order
to overcome this, Russell introduced the notion of types (which are today known as degrees).
Predicates of degree 1 apply only to objects, predicates of degree 2 apply to predicates of
degree 1, and so forth.
Type theoretic universe may seem quite odd to someone familiar with the cumulative hier-
archy of set theory. For example, the empty set appears anew in all degrees, as do various
other familiar structures, such as the natural numbers. Because of this, it is common to
indicate only the relative differences in degrees when writing down a formula of type theory,
instead of the absolute degrees. Thus instead of writing
∃P_1 ∀x_0 (x_0 ∈ P_1 ↔ x_0 = x_0)
one writes
∃P_{i+1} ∀x_i (x_i ∈ P_{i+1} ↔ x_i = x_i)
to indicate that the formula holds for any i. Another possibility is simply to drop the
subscripts indicating degree and let the degrees be determined implicitly (this can usually
be done since we know that x ∈ y implies that if x is of degree n, then y is of degree n + 1).
A formula for which there is an assignment of types (degrees) to the variables and constants
so that it accords with the restrictions of type theory is said to be stratified.
The second device implies another dimension in which the predicates are ordered. In any
given degree, there appears a hierarchy of levels. At the first level of degree n + 1 one has
predicates that apply to elements of degree n and which can be defined with reference only
to predicates of degree n. At the second level there appear all the predicates that can be defined
with reference to predicates of degree n and to predicates of degree n + 1 of level 1, and so
forth.
This second principle makes virtually all mathematics break down. For example, when
speaking of the real number system and its completeness, one wishes to quantify over all pred-
icates of real numbers (this is possible at degree n + 1 if the predicates of real numbers
appear at degree n), not only over those of a given level. In order to overcome this, Russell
and Whitehead introduced in PM the so-called axiom of reducibility, which states that if a
predicate P_n occurs at some level k (i.e. P_n = P_n^k), it occurs already on the first level.
Frank P. Ramsey was the first to notice that the axiom of reducibility in effect collapses the
hierarchy of levels, so that the hierarchy is entirely superfluous in the presence of the axiom. The
original form of type theory is known as ramified type theory, and the simpler alternative
with no second hierarchy of levels is known as unramified type theory or simply as simple
type theory.
One descendant of type theory is W. v. Quine’s system of set theory known as NF (New
Foundations), which differs considerably from the more familiar set theories (ZFC, NBG,
Morse-Kelley). In NF there is a class comprehension axiom saying that to any stratified
formula there corresponds a set of elements satisfying the formula. The Russell class is not
a set, since its defining formula x ∉ x can't be stratified, but the universal class
is a set: x = x is perfectly legal in type theory, as we can assign to x any degree and get a
well-formed formula of type theory. It is not known if NF axiomatises any extensor (see the
entry class) based on a limitation of size principle, like the more familiar set theories do.
In the modern variants of type theory, one usually has a more general supply of types.
Beginning with some set τ of types (presumably a division of the simple objects into some
natural categories), one defines the set of types T by setting
• if a, b ∈ T, then (a → b) ∈ T
• for all t ∈ τ, t ∈ T
One way to proceed to get something familiar is to have τ contain a type t for truth values.
Then sentences are objects of type t, open formulae of one variable are of type Object →t
and so forth. This sort of type system is often found in the study of typed lambda calculus
and also in intensional logics, which are often based on the former.
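The inductive clauses for the set of types are easy to transcribe; the Haskell sketch below (our own illustration, taking τ to contain a type of individuals and a type of truth values) builds exactly the arrow types described above.

data Ty = Object | TruthValue | Arrow Ty Ty   -- the base types from τ together with (a → b)

-- the type of open formulae of one variable, Object → t
openFormulaTy :: Ty
openFormulaTy = Arrow Object TruthValue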
Version: 4 Owner: Aatu Author(s): Aatu
15.3 analytic hierarchy
The analytic hierarchy is a hierarchy of either (depending on context) formulas or relations
similar to the arithmetical hierarchy. It is essentially the second order equivalent. Like the
arithmetical hierarchy, the relations in each level are exactly the relations defined by the
formulas of that level.
The first level can be called Δ^1_0, Δ^1_1, Σ^1_0, or Π^1_0, and consists of the arithmetical formulas or
relations.
A formula φ is Σ^1_n if there is some arithmetical formula ψ such that:
φ(k̄) = ∃A_1 ∀A_2 · · · Q A_n ψ(k̄, A_1, . . . , A_n)
where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers, and each
A_i is a set.
Similarly, a formula φ is Π^1_n if there is some arithmetical formula ψ such that:
φ(k̄) = ∀A_1 ∃A_2 · · · Q A_n ψ(k̄, A_1, . . . , A_n)
where Q is either ∀ or ∃, whichever maintains the pattern of alternating quantifiers, and each
A_i is a set.
Version: 1 Owner: Henry Author(s): Henry
15.4 game-theoretical quantifier
A Henkin or branching quantifier is a multi-variable quantifier in which the selection of
variables depends only on some, but not all, of the other quantified variables. For instance
the simplest Henkin quantifier can be written:
∀r∃n
∀c∃/
φ(r. n. c. /)
This quantifier, inexpressible in ordinary first order logic, can best be understood by its
Skolemization. The formula above is equivalent to ∀r∀cφ(r. 1(n). c. o(c)). Critically, the
selection of n depends only on r while the selection of / depends only on c.
Logics with this quantifier are stronger than first order logic, lying between first and second order logic
in strength. For instance the Henkin quantifier can be used to define the Rescher quantifier,
and by extension H¨artig’s quantifer:
∀r∃n
∀c∃/
[(r = c ↔n = /) ∧ φ(r) →ψ(n)] ↔1rnφ(r)ψ(n)
To see that this is true, observe that this essentially requires that the Skolem functions
1(r) = n and o(c) = / the same, and moreover that they are injective. Then for each r
satisfying φ(r), there is a different 1(r) satisfying ψ((1(r)).
146
This concept can be generalized to the game-theoretical quantifiers. This concept comes
from interpreting a formula as a game between a ”Prover” and ”Refuter.” A theorem is
provable whenever the Prover has a winning strategy; at each ∧ the Refuter chooses which
side they will play (so the Prover must be prepared to win on either) while each ∨ is a choice
for the Prover. At a , the players switch roles. Then ∀ represents a choice for the Refuter
and ∃ for the Prover.
Classical first order logic, then, adds the requirement that the games have perfect information.
The game-theoretical quantifers remove this requirement, so for instance the Henkin quan-
tifier, which would be written ∀r∃n∀c∃
/∀x
/φ(r. n. c. /) states that when the Prover makes a
choice for /, it is made without knowledge of what was chosen at r.
Version: 2 Owner: Henry Author(s): Henry
15.5 logical language
In its most general form, a logical language is a set of rules for constructing formulas for
some logic, which can then be assigned truth values based on the rules of that logic.
A logical language L consists of:
• A set F of function symbols (common examples include + and ·)
• A set R of relation symbols (common examples include = and <)
• A set C of logical connectives (usually ¬, ∧, ∨, → and ↔)
• A set Q of quantifiers (usually ∀ and ∃)
• A set V of variables
Every function symbol, relation symbol, and connective is associated with an arity (the set
of n-ary function symbols is denoted F_n, and similarly for relation symbols and connectives).
Each quantifier is a generalized quantifier associated with a quantifier type ⟨n_1, . . . , n_k⟩.
The underlying logic has a (possibly empty) set of types T. There is a function Type :
F ∪ V → T which assigns a type to each function and variable. For each arity n there is a
function Inputs_n : F_n ∪ R_n → T^n which gives the types of each of the arguments to a
function symbol or relation. In addition, for each quantifier type ⟨n_1, . . . , n_k⟩ there is a
function Inputs_⟨n_1,...,n_k⟩ defined on Q_⟨n_1,...,n_k⟩ (the set of quantifiers of that type) which gives
a k-tuple of n_i-tuples of types of the arguments taken by the formulas the quantifier applies to.
The terms of L of type t ∈ T are built as follows:
1. If v is a variable such that Type(v) = t then v is a term of type t
2. If f is an n-ary function symbol such that Type(f) = t and t_1, . . . , t_n are terms such
that for each i ≤ n, Type(t_i) = (Inputs_n(f))_i, then f t_1 . . . t_n is a term of type t
The formulas of L are built as follows:
1. If r is an n-ary relation symbol and t_1, . . . , t_n are terms such that Type(t_i) = (Inputs_n(r))_i
then r t_1 . . . t_n is a formula
2. If c is an n-ary connective and F_1, . . . , F_n are formulas then c F_1 . . . F_n is a formula
3. If q is a quantifier of type ⟨n_1, . . . , n_k⟩, and v_{1,1}, . . . , v_{1,n_1}, v_{2,1}, . . . , v_{k,1}, . . . , v_{k,n_k} is a sequence
of variables such that Type(v_{i,j}) = ((Inputs_⟨n_1,...,n_k⟩(q))_i)_j and F_1, . . . , F_k are formulas
then q v_{1,1} . . . v_{1,n_1} v_{2,1} . . . v_{k,1} . . . v_{k,n_k} F_1 . . . F_k is a formula
Generally the connectives, quantifiers, and variables are specified by the appropriate logic,
while the function and relation symbols are specified for particular languages. Note that
0-ary functions are usually called constants.
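The two-stage construction of terms and then formulas can be sketched as a pair of Haskell data types; this is a deliberately simplified illustration of our own, with a single implicit type of individuals rather than the full typed machinery above.

data Term = Var String | Fun String [Term]      -- a variable, or a function symbol applied to terms
data Formula = Rel String [Term]                -- a relation symbol applied to terms
             | Conn String [Formula]            -- an n-ary connective applied to formulas
             | Quant String [String] Formula    -- a quantifier binding a list of variables

-- for example, ∀x (x = x) could be represented as
example :: Formula
example = Quant "forall" ["x"] (Rel "=" [Var "x", Var "x"])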
If there is only one type, which is equated directly with truth values, then this is essentially
a propositional logic. If the standard quantifiers and connectives are used, there is only one
type, and one of the relations is = (with its usual semantics), then this produces first order logic.
If the standard quantifiers and connectives are used, there are two types, and the relations
include = and ∈ with appropriate semantics, then this is second order logic (a slightly different
formulation replaces ∈ with a 2-ary function which represents function application; this views
second order objects as functions rather than sets).
Note that often connectives are written with infix notation with parentheses used to control
order of operations.
Version: 7 Owner: Henry Author(s): Henry
15.6 second order logic
Second order logic refers to logics with two (or three) types where one type consists of the
objects of interest and the second is either sets of those objects or functions on those objects
(or both, in the three type case). For instance, second order arithmetic has two types: the
numbers and the sets of numbers.
Formally, second order logic usually has:
• the standard quantifiers (four of them, since each type needs its own universal and
existential quantifiers)
• the standard connectives
• the relation = with its normal semantics
• if the second type represents sets, a relation ∈ where the first argument is of the first
type and the second argument is the second type
• if the second type represents functions, a binary function which takes one argument of
each type and results in an object of the first type, representing function application
Specific second order logics may deviate from this definition slightly. In particular, some
mathematicians have argued that first order logics with additional quantifiers which give
them most or all of the strength of second order logic should be considered second order logics.
Some people, chiefly Quine, have raised philosophical objections to second order logic, cen-
tering on the question of whether models require fixing some set of sets or functions as the
“actual” sets or functions for the purposes of that model.
Version: 4 Owner: Henry Author(s): Henry
Chapter 16
03B40 – Combinatory logic and
lambda-calculus
16.1 Church integer
A Church integer is a representation of integers as functions, invented by Alonzo Church.
An integer N is represented as a higher-order function, which applies a given function to a
given expression N times.
For example, in Haskell, a function that returns a particular Church integer might be
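church n = \f -> \x -> iterate f x !! n   -- one possible definition: applies f to x n times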
The transformation from a Church integer to an integer might be
unchurch n = n (+1) 0
Thus the (+1) function would be applied to an initial value of 0 n times, yielding the ordinary
integer n.
Version: 2 Owner: Logan Author(s): Logan
16.2 combinatory logic
Combinatory logic was invented by Moses Schönfinkel in the early 1920s, and was mostly
developed by Haskell Curry. The idea was to reduce the notation of logic to the simplest
terms possible. As such, combinatory logic consists only of combinators, combination
operations, and no free variables.
A combinator is simply a function with no free variables. A free variable is any variable
referred to in a function that is not a parameter of that function. The operation of com-
bination is then simply the application of a combinator to its parameters. Combination is
specified by simple juxtaposition of two terms, and is left-associative. Parentheses may also
be present to override associativity. For example
fgxy = (fg)xy = ((fg)x)y
All combinators in combinatory logic can be derived from two basic combinators, S and K.
They are defined as
Sfgx = fx(gx)
Kxy = x
Reference is sometimes made to a third basic combinator, I, which can be defined in terms
of S and K.
Ix = SKKx = x
Combinatory logic where I is considered to be derived from S and K is sometimes known
as pure combinatory logic.
Combinatory logic and lambda calculus are equivalent. However, lambda calculus is more
concise than combinatory logic; an expression of size O(n) in lambda calculus is equivalent
to an expression of size O(n^2) in combinatory logic.
For example, Sfgx = fx(gx) in combinatory logic is equivalent to S = (λf(λg(λx((fx)(gx))))),
and Kxy = x is equivalent to K = (λx(λyx)).
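The same definitions can be written down in Haskell (an illustration of our own; lowercase names are used since S and K are not legal Haskell function names):

s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)

k :: a -> b -> a
k x y = x

-- i = SKK, so i x = k x (k x) = x
i :: a -> a
i x = s k k x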
Version: 2 Owner: Logan Author(s): Logan
16.3 lambda calculus
Lambda calculus (often referred to as λ-calculus) was invented in the 1930s by Alonzo
Church, as a form of mathematical logic dealing primarily with functions and the application
of functions to their arguments. In pure lambda calculus, there are no constants. In-
stead, there are only lambda abstractions (which are simply specifications of functions),
variables, and applications of functions to functions. For instance, Church integers are used
as a substitute for actual constants representing integers.
A lambda abstraction is typically specified using a lambda expression, which might look
like the following.
λ x . f x
The above specifies a function of one argument, that can be reduced by applying the
function f to its argument (function application is left-associative by default, and parentheses
can be used to specify associativity).
The λ-calculus is equivalent to combinatory logic (though much more concise). Most functional
programming languages are also equivalent to λ-calculus, to a degree (any imperative fea-
tures in such languages are, of course, not equivalent).
Examples
We can specify the Church integer 3 in λ-calculus as
3 = λ f x . f (f (f x))
Suppose we have a function inc, which when given a string representing an integer, returns
a new string representing the number following that integer. Then
3 inc "0" = "3"
Addition of Church integers in λ-calculus is
add = λ x y . (λ f z . x f (y f z))
add 2 3 = λ f z . 2 f (3 f z)
= λ f z . 2 f (f (f (f z)))
= λ f z . f (f (f (f (f z))))
= 5
Multiplication is
mul = λ x y . (λ f z . x (λ u . y f u) z)
mul 2 3 = λ f z . 2 (λ u . 3 f u) z
= λ f z . 2 (λ u . f (f (f u))) z
= λ f z . f (f (f (f (f (f z)))))
= 6
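These calculations transcribe almost directly into Haskell. The sketch below (our own illustration, reusing the unchurch convention of the Church integer entry) encodes numerals as functions and defines addition and multiplication on them.

type Church = (Integer -> Integer) -> Integer -> Integer

three :: Church
three f x = f (f (f x))

add, mul :: Church -> Church -> Church
add m n f x = m f (n f x)    -- apply f m times, then n more times
mul m n f x = m (n f) x      -- apply "n applications of f" m times

unchurch :: Church -> Integer
unchurch n = n (+1) 0

-- unchurch (mul three (add three three)) evaluates to 18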
Russell’s Paradox in λ-calculus
The λ-calculus readily admits Russell's paradox. Let us define a function r that takes a
function x as an argument, and is reduced to the application of the logical function not to
the application of x to itself.
r = λ x . not (x x)
Now what happens when we apply r to itself?
r r = not (r r)
= not (not (r r))
...
Since we have not (r r) = (r r), we have a paradox.
Version: 3 Owner: Logan Author(s): Logan
Chapter 17
03B48 – Probability and inductive
logic
17.1 conditional probability
Let (Ω, B, p) be a probability space, and let X and Y be random variables on Ω with joint
probability distribution p(X, Y) := p(X ∩ Y).
The conditional probability of X given Y is defined as
p(X|Y) := p(X ∩ Y)/p(Y).    (17.1.1)
In general,
p(X|Y)p(Y) = p(X, Y) = p(Y|X)p(X),    (17.1.2)
and so we have
p(X|Y) = p(Y|X)p(X)/p(Y).    (17.1.3)
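As a quick numeric illustration of equation (17.1.3), the following Haskell sketch (our own example figures, not from the entry) computes p(X|Y) from p(Y|X), p(X) and p(Y) by Bayes' rule.

bayes :: Double -> Double -> Double -> Double
bayes pYgivenX pX pY = pYgivenX * pX / pY

-- e.g. bayes 0.9 0.01 0.05 evaluates to 0.18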
Version: 1 Owner: drummond Author(s): drummond
Chapter 18
03B99 – Miscellaneous
18.1 Beth property
A logic is said to have the Beth property if whenever a predicate P is implicitly definable by φ
(i.e. if all models have at most one extension satisfying φ), then P is explicitly defin-
able relative to φ (i.e. there is a ψ not containing P, such that φ ⊨ ∀x_1, . . . , x_n (P(x_1, . . . , x_n) ↔
ψ(x_1, . . . , x_n))).
Version: 3 Owner: Aatu Author(s): Aatu
18.2 Hofstadter’s MIU system
The alphabet of the system contains three symbols M, I, U. The set of theorems, denoted by
T, is the set of strings constructed from the axiom by the rules; it can be built as follows:
(axiom) MI ∈ T.
(i) If xI ∈ T then xIU ∈ T.
(ii) If Mx ∈ T then Mxx ∈ T.
(iii) In any theorem, III can be replaced by U.
(iv) In any theorem, UU can be omitted.
example:
• Show that MUII ∈ T
MI ∈ T by axiom
→ MII ∈ T by rule (ii) where x = I
→ MIIII ∈ T by rule (ii) where x = II
→ MIIIIIIII ∈ T by rule (ii) where x = IIII
→ MIIIIIIIIU ∈ T by rule (i) where x = MIIIIIII
→ MIIIIIUU ∈ T by rule (iii)
→ MIIIII ∈ T by rule (iv)
→ MUII ∈ T by rule (iii)
• Is MU a theorem?
No. Why? Because the number of I's in a theorem is never a multiple of 3. We will
show this by structural induction.
Base case: The statement is true for the base case, since the axiom has one I, which is
not a multiple of 3.
Induction hypothesis: Suppose the statement is true for the premise of each rule.
Induction step: By the induction hypothesis we assume the premise of each rule to be true
and show that the application of the rule keeps the statement true.
Rule 1: Applying rule 1 does not add any I's to the formula. Therefore the statement
is true for rule 1 by the induction hypothesis.
Rule 2: Applying rule 2 doubles the number of I's in the formula, but since the initial
number of I's was not a multiple of 3 by the induction hypothesis, doubling that amount
does not make it a multiple of 3 (i.e. if n ≢ 0 mod 3 then 2n ≢ 0 mod 3). Therefore
the statement is true for rule 2.
Rule 3: Applying rule 3 replaces III by U. Since the initial number of I's was not a
multiple of 3 by the induction hypothesis, removing III will not make the number of I's
in the formula a multiple of 3. Therefore the statement is true for rule 3.
Rule 4: Applying rule 4 removes UU and does not change the number of I's, which by
the induction hypothesis was not a multiple of 3. Therefore the statement is true for rule 4.
Therefore no theorem has a number of I's that is a multiple of 3; since MU has zero I's,
and zero is a multiple of 3, MU is not a theorem.
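The rules are easy to prototype. The Haskell sketch below (our own illustration) returns every string obtainable from a given theorem by one rule application; iterating it from "MI" enumerates theorems, and the invariant above guarantees that "MU" never appears among them.

import Data.List (inits, tails, isPrefixOf, isSuffixOf)

step :: String -> [String]
step s = rule1 ++ rule2 ++ rule3 ++ rule4
  where
    rule1 = [s ++ "U" | "I" `isSuffixOf` s]                          -- xI  gives xIU
    rule2 = ["M" ++ x ++ x | ("M", x) <- [splitAt 1 s]]              -- Mx  gives Mxx
    rule3 = [a ++ "U" ++ drop 3 b | (a, b) <- splits s, "III" `isPrefixOf` b]
    rule4 = [a ++ drop 2 b | (a, b) <- splits s, "UU" `isPrefixOf` b]
    splits t = zip (inits t) (tails t)

-- e.g. step "MI" evaluates to ["MIU","MII"]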
[GVL]
REFERENCES
[HD] Hofstadter, Douglas R.: Gödel, Escher, Bach: an Eternal Golden Braid. Basic Books, Inc.,
New York, 1979.
Version: 5 Owner: Daume Author(s): Daume
18.3 IF-logic
Independence Friendly logic (IF-logic) is an interesting conservative extension of classical first order logic
based on very natural ideas from game theoretical semantics developed by Jaakko Hintikka
and Gabriel Sandu among others. Although IF-logic is a conservative extension of first order
logic, it has a number of interesting properties, such as allowing truth-definitions and ad-
mitting a translation of all Σ^1_1 sentences (second order sentences with an initial second order
existential quantifier followed by a first order sentence).
IF-logic can be characterised as the natural extension of first order logic when one allows
informational independence to occur in the game theoretical truth definition. To understand
this idea we need first to introduce the game theoretical definition of truth for classical first
order logic.
To each first order sentence φ we assign a game G(φ) with two players, played on models of
the appropriate language. The two players are called the verifier and the falsifier (or nature). The
idea is that the verifier attempts to show that the sentence is true in the model, while the
falsifier attempts to show that it is false in the model. The game G(φ) is defined as follows.
We will use the convention that if p is a symbol that names a function, a predicate or an
object of the model M, then p^M is that named entity.
• if R is an n-ary predicate and the t_i are names of elements of the model, then G(R(t_1, . . . , t_n))
is a game in which the verifier immediately wins if (t_1^M, . . . , t_n^M) ∈ R^M and otherwise
the falsifier immediately wins.
• the game G(φ_1 ∨ φ_2) begins with the choice of φ_i from φ_1 and φ_2 (i = 1 or i = 2) by the
verifier and then proceeds as the game G(φ_i)
• the game G(φ_1 ∧ φ_2) is the same as G(φ_1 ∨ φ_2), except that the choice is made by the
falsifier
• the game G(∃xφ(x)) begins with the choice by the verifier of a member of M which is
given a name a, and then proceeds as G(φ(a))
• the game G(∀xφ(x)) is the same as G(∃xφ(x)), except that the choice of a is made by
the falsifier
• the game G(¬φ) is the same as G(φ) with the roles of the falsifier and verifier exchanged
Truth of a sentence φ is defined as the existence of a winning strategy for verifier for the
game G(φ). Similarly, falsity of φ is defined as the existence of a winning strategy for the
falsifier for the game G(φ). (A strategy is a specification which determines, for each move the
opponent makes, what the player should do. A winning strategy is a strategy which guarantees
victory no matter what strategy the opponent follows).
For classical first order logic, this definition is equivalent to the usual Tarskian definition of
truth (i.e. the one based on satisfaction found in most treatments of semantics of first order
logic). This also means, since the law of excluded middle holds for first order logic, that
the games G(φ) have a very strong property: either the falsifier or the verifier has a winning
strategy.
Notice that all rules except those for negation and atomic sentences concern choosing a
sentence or finding an element. These can be codified into functions, which tell us which
sentence to pick or which element of the model to choose, based on our previous choices
and those of our opponent. For example, consider the sentence ∀x(P(x) ∨ Q(x)). The
corresponding game begins with the falsifier picking an element a from the model, so a
strategy for the verifier must specify, for each element a, which of P(a) and Q(a) to pick. The
truth of the sentence is equivalent to the existence of a winning strategy for the verifier, i.e.
just such a function. But this means that ∀x(P(x) ∨ Q(x)) is equivalent to ∃f∀x((P(x) ∧ f(x) =
0) ∨ (Q(x) ∧ f(x) = 1)). Let's consider a more complicated example: ∀x∃y∀z∃w R(x, y, z, w). The
truth of this is equivalent to the existence of functions f and g, s.t. ∀x∀z R(x, f(x), z, g(z)).
These sorts of functions are known as Skolem functions, and they are in essence just winning
strategies for the verifier. We won't prove it here, but all first order sentences can be
expressed in the form ∃f_1 . . . ∃f_n ∀x_1 . . . ∀x_k φ, where φ is a truth functional combination of atomic
sentences in which all terms are either constants or variables x_i or formed by application of
the functions f_i to such terms. Such sentences are said to be in Σ^1_1 form.
Let's consider a Σ^1_1 sentence ∃f∃g∀x∀z φ(x, f(x), z, g(z)). Up front, it seems to assert the
existence of a winning strategy in a simple semantical game like those described above.
However, the game can't correspond to any (classical) first order formula! Let's first see
what the game whose winning strategy this formula asserts the existence of looks like.
First, the falsifier chooses elements a and b to serve as x and z. Then the verifier chooses
an element c knowing only a and an element d knowing only b. The verifier's goal is that
φ(a, c, b, d) comes out as a true atomic sentence. The game could actually be arranged so
that the verifier is a team of two players (who aren't allowed to communicate with each
other), one of which picks c, the other one picking d.
From a game theoretical point of view, games in which some moves must be made without
depending on some of the earlier moves are called informationally incomplete games, and
they occur very commonly. Bridge is such a game, for example, and usually real examples
of such games have "players" which are actually teams made up of several people.
IF-logic comes out of the game theoretical definition in a natural way if we allow informa-
tional independence in our semantical games. In IF-logic, every quantifier and connective can be augmented
with an independence marker /, so that ∗′/∗ means that the game for the occurrence of
∗′ within the scope of ∗ must be played without knowledge of the choices made for ∗. For
example ∀x(∃y/∀x)φ(x, y) asserts that for any choice of value for x by the falsifier, the
verifier can find a value for y which does not depend on the value of x, s.t. φ(x, y) comes
out true. This is not a very characteristic example, as it can be written as an ordinary first
order formula ∃y∀xφ(x, y). The curious game we described above corresponding to the sec-
ond order Skolem-function formulation Σ^1_1 sentence ∃f∃g∀x∀z φ(x, f(x), z, g(z)) corresponds
to the IF-sentence ∀x∀z(∃y/∀z)(∃w/∀x)φ(x, y, z, w). IF-logic allows informational in-
dependence also for the usual logical connectives; for example ∀x(φ(x) (∨/∀x) ψ(x)) is true
if and only if for all x, either φ(x) or ψ(x) is true, but which of these is picked by the verifier
must be decided independently of the choice for x by the falsifier.
One of the striking characteristics of IF-logic is that every Σ^1_1 formula φ has an IF-translation
φ_IF which is true if and only if φ is true (the equivalence does not in general hold if we
replace 'true' with 'false'). Since for example first order truth (in a model) is Σ^1_1 definable
(it’s just quantification over all possible valuations, which are second order objects), there
are IF-theories which correctly represent the truth predicate for their first order part. What
is even more striking is that sufficiently strong IF-theories can do this for the whole of the
language they are expressed in.
This seems to contradict Tarski’s famous result on the undefinability of truth, but this is
illusory. Tarski’s result depends on the assumption that the logic is closed under contradic-
tory negation. This is not the case for IF-logic. In general for a given sentence φ there is no
sentence φ∗ which is true just in case φ is not true. Thus the law of excluded middle does
not hold in general in IF-logic (although it does for the classical first order portion). This is
quite unsurprising since games of imperfect information are very seldom determined in the
sense that either the verifier or the falsifier has a winning strategy. For example, a game in
which I choose a 10-letter word and you have one go at guessing it is not determined in this
sense, since there is no 10-letter word you couldn’t guess and on the other hand you have
no way of forcing me to choose any particular 10-letter word (which would guarantee your
victory).
IF-logic is stronger than first order logic in the usual sense that there are classes of structures
which are IF-definable but not first-order definable. Some of these are even finite. Many
interesting concepts are expressible in IF-logic, such as equicardinality, infinity (which can
be expressed by a logical formula, in contradistinction to ordinary first order logic in which
non-logical symbols are needed), and well-order.
By Lindström's theorem we thus know that either IF-logic is not complete (i.e. its set of
validities is not r.e.) or the Löwenheim-Skolem theorem does not hold. In fact, the (downward)
Löwenheim-Skolem theorem does hold for IF-logic, so it is not complete. There is a com-
plete disproof procedure for IF-logic, but because IF-logic is not closed under contradictory
negation this does not yield a complete proof procedure.
IF-logic can be extended by allowing contradictory negations of closed sentences and truth
functional combinations thereof. This extended IF-logic is extremely strong. For example,
the second order induction axiom for PA is ∀X((X(0) ∧ ∀n(X(n) → X(n + 1))) → ∀nX(n)).
The negation of this is a Σ^1_1 sentence asserting the existence of a set which invalidates the
induction axiom. Since Σ^1_1 sentences are expressible in IF-logic, we can translate the negation
of the induction axiom into an IF-sentence φ. But now ¬φ is a formula of extended IF-logic,
and is clearly equivalent to the usual induction axiom! As all the rest of the PA axioms are first
order, this shows that extended IF-logic PA can correctly define the natural number system.
There exists also an interesting "translation" of nth order logic into extended IF-logic. Con-
sider an n-sorted first order language and an nth order theory T translated into this language.
Now, extend the language to second order and add the axiom stating that the sort k + 1
actually comprises the whole of the powerset of the sort k. This is a Π^1_1 sentence (i.e. of
the form "for all predicates P there is a first order element of sort k + 1 which comprises
exactly the extension of P"). It is easy to see that a formula is valid in this new system if
and only if it was valid in the original nth order logic. The negation of this axiom is again
Σ^1_1 and translatable into IF-logic, and thus the axiom itself is expressible in extended IF-
logic. Moreover, since most interesting second order theories are finitely axiomatisable, we
can consider sentences of the form T* → φ (where T* is the multisorted translation of T), which
express logical implication of φ by T (correctly). This is equivalent to ¬(T*) ∨ φ (where ¬
is contradictory negation), but since T* is a conjunction of a Π^1_1 sentence asserting comprehension
translated into extended IF-logic and the first order translation of the axioms of T, ¬(T*) is a Σ^1_1
formula translatable to non-extended IF-logic, and so is φ. Thus sentences of the form T → φ
of nth order logic are translatable into IF-sentences which are true just in case the originals
were.
Version: 1 Owner: Aatu Author(s): Aatu
18.4 Tarski’s result on the undefinability of Truth
Assume L is a logic which is closed under contradictory negation and has the usual truth-
functional connectives. Assume also that L has a notion of open formula with one variable
and of substitution. Assume that T is a theory of L in which we can define surrogates
for formulae of L, and in which all true instances of the substitution relation and the truth-
functional connective relations are provable. We show that either T is inconsistent or T can't
be augmented with a truth predicate True for which the following T-schema holds
True(⌜φ⌝) ↔ φ
Assume that the open formulae with one variable of L have been indexed by some suitable
set that is representable in T (otherwise the predicate True would be next to useless, since if
there's no way to speak of sentences of a logic, there's little hope to define a truth-predicate
for it). Denote the i:th element in this indexing by B_i. Consider now the following open
formula with one variable
Liar(x) = ¬True(B_x(x))
Now, since Liar is an open formula with one free variable it's indexed by some i. Now
consider the sentence Liar(i). From the T-schema we know that
True(Liar(i)) ↔ Liar(i)
and by the definition of Liar and the fact that i is the index of Liar(x) we have
True(Liar(i)) ↔ ¬True(Liar(i))
which clearly is absurd. Thus there can't be an extension of T with a predicate True for
which the T-schema holds.
We have made several assumptions on the logic L which are crucial in order for this proof
to go through. The most important is that L is closed under contradictory negation. There
are logics which allow truth-predicates, but these are not usually closed under contradictory
negation (so that it's possible that True(Liar(i)) is neither true nor false). These logics
usually have stronger notions of negation, so that a sentence ¬P says more than just that
P is not true, and the proposition that P is simply not true is not expressible.
An example of a logic for which Tarski’s undefinability result does not hold is the so-called
Independence Friendly logic, the semantics of which is based on game theory and which
allows various generalised quantifiers (the Henkin branching quantifier, &c.) to be used.
Version: 5 Owner: Aatu Author(s): Aatu
18.5 axiom
In a nutshell, the logico-deductive method is a system of inference where conclusions (new
knowledge) follow from premises (old knowledge) through the application of sound arguments
(syllogisms, rules of inference). Tautologies excluded, nothing can be deduced if nothing
is assumed. Axioms and postulates are the basic assumptions underlying a given body
of deductive knowledge. They are accepted without demonstration. All other assertions
(theorems, if we are talking about mathematics) must be proven with the aid of the basic
assumptions.
The logico-deductive method was developed by the ancient Greeks, and has become the core
principle of modern mathematics. However, the interpretation of mathematical knowledge
has changed from ancient times to the modern, and consequently the terms axiom and
postulate hold a slightly different meaning for the present day mathematician than they
did for Aristotle and Euclid.
The ancient Greeks considered geometry as just one of several sciences, and held the theorems
of geometry on par with scientific facts. As such, they developed and used the logico-
deductive method as a means of avoiding error, and for structuring and communicating
knowledge. Aristotle’s Posterior Analytics is a definitive exposition of the classical view.
“Axiom”, in classical terminology, referred to a self-evident assumption common to many
branches of science. A good example would be the assertion that
When an equal amount is taken from equals, an equal amount results.
At the foundation of the various sciences lay certain basic hypotheses that had to be accepted
without proof. Such a hypothesis was termed a postulate. The postulates of each science
were different. Their validity had to be established by means of real-world experience.
Indeed, Aristotle warns that the content of a science cannot be successfully communicated,
if the learner is in doubt about the truth of the postulates.
The classical approach is well illustrated by Euclid’s elements, where we see a list of axioms
(very basic, self-evident assertions) and postulates (common-sensical geometric facts drawn
from our experience).
A1 Things which are equal to the same thing are also equal to one another.
A2 If equals be added to equals, the wholes are equal.
A3 If equals be subtracted from equals, the remainders are equal.
A4 Things which coincide with one another are equal to one another.
A5 The whole is greater than the part.
P1 It is possible to draw a straight line from any point to any other point.
P2 It is possible to produce a finite straight line continuously in a straight line.
P3 It is possible to describe a circle with any centre and distance.
P4 It is true that all right angles are equal to one another.
P5 It is true that, if a straight line falling on two straight lines make the interior angles on
the same side less than two right angles, the two straight lines, if produced indefinitely,
meet on that side on which are the angles less than the two right angles.
The classical view point is explored in more detail here.
A great lesson learned by mathematics in the last 150 years is that it is useful to strip the
meaning away from the mathematical assertions (axioms, postulates, propositions, theorems)
and definitions. This abstraction, one might even say formalization, makes mathematical
knowledge more general, capable of multiple different meanings, and therefore useful in
multiple contexts.
In structuralist mathematics we go even further, and develop theories and axioms (like
field theory, group theory, topology, vector spaces) without any particular application in
mind. The distinction between an “axiom” and a “postulate” disappears. The postulates
of Euclid are profitably motivated by saying that they lead to a great wealth of geometric
facts. The truth of these complicated facts rests on the acceptance of the basic hypotheses.
However by throwing out postulate 5, we get theories that have meaning in wider contexts,
hyperbolic geometry for example. We must simply be prepared to use labels like ”line”
and ”parallel” with greater flexibility. The development of hyperbolic geometry taught
mathematicians that postulates should be regarded as purely formal statements, and not as
facts based on experience.
When mathematicians employ the axioms of a field, the intentions are even more abstract.
The propositions of field theory do not concern any one particular application; the mathe-
matician now works in complete abstraction. There are many examples of fields; field theory
gives correct knowledge in all contexts.
It is not correct to say that the axioms of field theory are ”propositions that are regarded as
true without proof.” Rather, the Field Axioms are a set of constraints. If any given system of
addition and multiplication tolerates these constraints, then one is in a position to instantly
know a great deal of extra information about this system. There is a lot of bang for the
formalist buck.
Modern mathematics formalizes its foundations to such an extent that mathematical theories
can be regarded as mathematical objects, and logic itself can be regarded as a branch of
mathematics. Frege, Russell, Poincaré, Hilbert, and Gödel are some of the key figures in
this development.
In the modern understanding, a set of axioms is any collection of formally stated assertions
from which other formally stated assertions follow by the application of certain well-defined
rules. In this view, logic becomes just another formal system. A set of axioms should be
consistent; it should be impossible to derive a contradiction from the axioms. A set of axioms
should also be non-redundant; an assertion that can be deduced from other axioms need not
be regarded as an axiom.
It was the early hope of modern logicians that various branches of mathematics, perhaps
all of mathematics, could be derived from a consistent collection of basic axioms. An early
success of the formalist program was Hilbert’s formalization of Euclidean geometry, and the
related demonstration of the consistency of those axioms.
In a wider context, there was an attempt to base all of mathematics on Cantor’s set theory.
Here the emergence of Russell’s paradox, and similar antinomies of naive set theory raised
the possibility that any such system could turn out to be inconsistent.
The formalist project suffered a decisive setback when, in 1931, Gödel showed that it is
possible, for any sufficiently large set of axioms (Peano's axioms, for example), to construct
a statement whose truth is independent of that set of axioms. As a corollary, Gödel proved
that the consistency of a theory like Peano arithmetic is an unprovable assertion within the
scope of that theory.
It is reasonable to believe in the consistency of Peano arithmetic because it is satisfied by
the system of natural numbers, an infinite but intuitively accessible formal system. How-
ever, at this date we have no way of demonstrating the consistency of modern set theory
(Zermelo-Fraenkel axioms). The axiom of choice, a key hypothesis of this theory, remains a
very controversial assumption. Furthermore, using techniques of forcing (Cohen) one can
show that the continuum hypothesis (Cantor) is independent of the Zermelo-Fraenkel axioms.
Thus, even this very general set of axioms cannot be regarded as the definitive foundation
for mathematics.
Version: 11 Owner: rmilson Author(s): rmilson, digitalis
18.6 compactness
A logic is said to be (κ, λ)-compact, if the following holds:
If Φ is a set of sentences of cardinality less than or equal to κ and all subsets of
Φ of cardinality less than λ are consistent, then Φ is consistent.
For example, first order logic is (ω, ω)-compact, for if all finite subsets of some class of
sentences are consistent, so is the class itself.
Version: 2 Owner: Aatu Author(s): Aatu
18.7 consistent
If T is a theory of L then it is consistent iff there is some model M of L such that M ⊨ T.
If a theory is not consistent then it is inconsistent.
A slightly different definition is sometimes used: T is consistent iff T ⊬ ⊥ (that is, as
long as it does not prove a contradiction). As long as the proof calculus used is sound and
complete, these two definitions are equivalent.
Version: 3 Owner: Henry Author(s): Henry
18.8 interpolation property
A logic is said to have the interpolation property if whenever φ(P, S) → ψ(S, R) holds, then
there is a sentence θ(S), so that φ(P, S) → θ(S) and θ(S) → ψ(S, R), where P, S and R
are some sets of symbols that occur in the formulae, S being the set of symbols common to
both φ and ψ.
The interpolation property holds for first order logic. The interpolation property is related
to the Beth definability property and Robinson's consistency property. Also, a natural general-
isation is the concept of a ∆-closed logic.
Version: 2 Owner: Aatu Author(s): Aatu
18.9 sentence
A sentence is a formula with no free variables.
Simple examples include:
∀x∃y[x < y]
or
∃z[z + 7 − 43 = 0]
However the following formula is not a sentence:
x + 2 = 3
Version: 2 Owner: Henry Author(s): Henry
Chapter 19
03Bxx – General logic
19.1 Banach-Tarski paradox
The 3-dimensional ball can be split into a finite number of pieces which can be pasted together
to give two balls of the same volume as the first!
Let us formulate the theorem formally. We say that a set A ⊂ R^n is decomposable
in N pieces A_1, . . . , A_N if there exist some isometries θ_1, . . . , θ_N of R^n such that A =
θ_1(A_1) ∪ . . . ∪ θ_N(A_N) while θ_1(A_1), . . . , θ_N(A_N) are all disjoint.
We then say that two sets A, B ⊂ R^n are equi-decomposable if both A and B are decom-
posable in the same pieces A_1, . . . , A_N.
Theorem 2 (Banach-Tarski). The unit ball B^3 ⊂ R^3 is equi-decomposable to the union of
two disjoint unit balls.
19.1.1 Comments
The actual number of pieces needed for this decomposition is not so large; in fact, ten
pieces are enough.
Also it is not important that the set considered is a ball. Every two sets with non-empty
interior are equi-decomposable in R^3. Also the ambient space can be chosen larger. The
theorem is true in all R^n with n ≥ 3 but it is not true in R^2 nor in R.
Where is the paradox? We are saying that a piece of (say) gold can be cut and pasted to
obtain two pieces equal to the previous one. And we may divide these two pieces in the same
way to obtain four pieces and so on...
We believe that this is not possible since the weight of the piece of gold does not change
when I cut it.
A consequence of this theorem is, in fact, that it is not possible to define the volume for all
subsets of the 3-dimensional space. In particular the volume cannot be computed for some
of the pieces into which the unit ball is decomposed (some of them are not measurable).
The existence of non-measurable sets is proved more simply, and in all dimensions, by the Vitali theorem.
However the Banach-Tarski paradox says something more. It says that it is not possible to define
a measure on all the subsets of R^3 even if we drop countable additivity and replace it
with finite additivity:
μ(A ∪ B) = μ(A) + μ(B)   for all disjoint A, B.
Another point to be noticed is that the proof needs the axiom of choice. So some of the
pieces in which the ball is divided are not constructible.
Version: 4 Owner: paolini Author(s): paolini
Chapter 20
03C05 – Equational classes, universal
algebra
20.1 congruence
Let Σ be a fixed signature, and A a structure for Σ. A congruence ∼ on A is an
equivalence relation such that for every natural number n and n-ary function symbol F
of Σ, if a_i ∼ a′_i then F^A(a_1, . . . , a_n) ∼ F^A(a′_1, . . . , a′_n).
Version: 6 Owner: almann Author(s): almann
20.2 every congruence is the kernel of a homomor-
phism
Let Σ be a fixed signature, and A a structure for Σ. If ∼ is a congruence on A, then there
is a homomorphism f such that ∼ = ker(f).
Define a homomorphism f : A → A/∼ : a ↦ [[a]]. Observe that a ∼ b if and only if
f(a) = f(b), so ∼ = ker(f). To verify that f is a homomorphism, observe that
1. For each constant symbol c of Σ, f(c^A) = [[c^A]] = c^{A/∼}.
2. For every natural number n and n-ary relation symbol R of Σ, if R^A(a_1, . . . , a_n) then
R^{A/∼}([[a_1]], . . . , [[a_n]]), so R^{A/∼}(f(a_1), . . . , f(a_n)).
3. For every natural number n and n-ary function symbol F of Σ,
f(F^A(a_1, . . . , a_n)) = [[F^A(a_1, . . . , a_n)]]
= F^{A/∼}([[a_1]], . . . , [[a_n]])
= F^{A/∼}(f(a_1), . . . , f(a_n)).
Version: 3 Owner: almann Author(s): almann
20.3 homomorphic image of a Σ-structure is a Σ-structure
Let Σ be a fixed signature, and A and B two structures for Σ. If f : A → B is a
homomorphism, then im(f) is a structure for Σ.
Version: 3 Owner: almann Author(s): almann
20.4 kernel
Given a function f : A → B, the kernel of f is the equivalence relation on A defined by
(a, a′) ∈ ker(f) ⇔ f(a) = f(a′).
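For a function on a finite set the kernel can be computed directly; the Haskell sketch below (our own illustration) tests whether two elements are related by ker(f) and lists the equivalence classes of the kernel over a finite domain.

import Data.List (groupBy, sortOn)

-- (a, a') ∈ ker f  exactly when  f a == f a'
inKernel :: Eq b => (a -> b) -> a -> a -> Bool
inKernel f a a' = f a == f a'

kernelClasses :: Ord b => (a -> b) -> [a] -> [[a]]
kernelClasses f = groupBy (inKernel f) . sortOn f

-- e.g. kernelClasses (`mod` 3) [0..8] evaluates to [[0,3,6],[1,4,7],[2,5,8]]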
Version: 3 Owner: almann Author(s): almann
20.5 kernel of a homomorphism is a congruence
Let Σ be a fixed signature, and A and B two structures for Σ. If f : A → B is a homomorphism,
then ker(f) is a congruence on A.
If F is an n-ary function symbol of Σ, and f(a_i) = f(a′_i), then
f(F^A(a_1, . . . , a_n)) = F^B(f(a_1), . . . , f(a_n))
= F^B(f(a′_1), . . . , f(a′_n))
= f(F^A(a′_1, . . . , a′_n)).
Version: 4 Owner: almann Author(s): almann
20.6 quotient structure
Let Σ be a fixed signature, A a structure for Σ, and ∼ a congruence on A. The quotient
structure of A by ∼, denoted A/∼, is defined as follows:
1. The universe of A/∼ is the set {[[a]] | a ∈ A}.
2. For each constant symbol c of Σ, c^{A/∼} = [[c^A]].
3. For every natural number n and every n-ary function symbol F of Σ,
F^{A/∼}([[a_1]], . . . , [[a_n]]) = [[F^A(a_1, . . . , a_n)]].
4. For every natural number n and every n-ary relation symbol R of Σ, R^{A/∼}([[a_1]], . . . , [[a_n]])
if and only if for some a′_i ∼ a_i we have R^A(a′_1, . . . , a′_n).
Version: 7 Owner: almann Author(s): almann
Chapter 21
03C07 – Basic properties of first-order
languages and structures
21.1 Models constructed from constants
The definition of a structure and of the satisfaction relation is nice, but it raises the following
question: how do we get models in the first place? The most basic construction for models
of a first-order theory is the construction that uses constants. Throughout this entry, L is a
fixed first-order language.
Let C be a set of constant symbols of L, and T be a theory in L. Then we say C is a set of
witnesses for T if and only if for every formula ϕ with at most one free variable x, we have
T ⊢ ∃xϕ → ϕ(c) for some c ∈ C.
Lemma. Let T be any consistent set of sentences of L, and C a set of new symbols such
that |C| = |L|. Let L′ = L ∪ C. Then there is a consistent set T′ ⊆ L′ extending T and
which has C as a set of witnesses.
Lemma. If T is a consistent theory in L, and C is a set of witnesses for T in L, then T has
a model whose elements are the constants in C.
Proof: Let Σ be the signature for L. If T is a consistent set of sentences of L, then there is
a maximal consistent T′ ⊇ T. Note that T′ and T have the same sets of witnesses. As every
model of T′ is also a model of T, we may assume T is maximal consistent.
We let the universe of M be the set of equivalence classes C/∼, where a ∼ b if and only if
“a = b” ∈ T. As T is maximal consistent, this is an equivalence relation. We interpret the
non-logical symbols as follows:
1. [a] =^M [b] if and only if a ∼ b;
2. Constant symbols are interpreted in the obvious way, i.e. if c ∈ Σ is a constant symbol,
then c^M = [c];
3. If R ∈ Σ is an n-ary relation symbol, then ([a_1], . . . , [a_n]) ∈ R^M if and only if R(a_1, . . . , a_n) ∈ T;
4. If F ∈ Σ is an n-ary function symbol, then F^M([a_1], . . . , [a_n]) = [b] if and only if
“F(a_1, . . . , a_n) = b” ∈ T.
From the fact that T is maximal consistent, and ∼ is an equivalence relation, we get that
the operations are well-defined (it is not so simple; I'll write it out later). The proof that
M ⊨ T is a straightforward induction on the complexity of the formulas of T. ♦
Corollary. (The extended completeness theorem) A set T of formulas of L is consistent if
and only if it has a model (regardless of whether or not L has witnesses for T).
Proof: First add a set C of new constants to L, and expand T to T′ in such a way that C
is a set of witnesses for T′. Then expand T′ to a maximal consistent set T″. This set has a
model M consisting of the constants in C, and M is also a model of T. ♦
Corollary. (Compactness theorem) A set T of sentences of L has a model if and only if
every finite subset of T has a model.
Proof: Replace “has a model” by “is consistent”, and apply the syntactic compactness
theorem. ♦
Corollary. (Gödel's completeness theorem) Let T be a consistent set of formulas of L. Then
a sentence ϕ is a theorem of T if and only if it is true in every model of T.
Proof: If ϕ is not a theorem of T, then ¬ϕ is consistent with T, so T ∪ {¬ϕ} has a model
M, in which ϕ cannot be true. ♦
Corollary. (Downward Löwenheim-Skolem theorem) If T ⊆ L has a model, then it has a
model of power at most |L|.
Proof: If T has a model, then it is consistent. The model constructed from constants has power
at most |L| (because we must add at most |L| many new constants). ♦
Most of the treatment found in this entry can be read in more detail in Chang and Keisler's
book Model Theory.
Version: 6 Owner: jihemme Author(s): jihemme
21.2 Stone space
Suppose L is a first order language and B is a set of parameters from an L-structure M.
Let S_n(B) be the set of (complete) n-types over B (see type). Then we put a topology on
S_n(B) in the following manner.
For every formula ψ ∈ L(B) we let S(ψ) := {p ∈ S_n(B) : ψ ∈ p}. Then the topology is the
one with a basis of open sets given by {S(ψ) : ψ ∈ L(B)}. Then we call S_n(B) endowed
with this topology the Stone space of complete n-types over B.
Some logical theorems and conditions are equivalent to topological conditions on this topol-
ogy.
• The compactness theorem for first order logic is so named because it is equivalent to
this topology being compact.
• We define p to be an isolated type iff p is an isolated point in the Stone space. This is
equivalent to there being some formula ψ so that for every φ ∈ p we have T ∪ {ψ} ⊢ φ,
i.e. all the formulas in p are implied by some formula.
• The Morley rank of a type p ∈ S_1(M) is equal to the Cantor-Bendixson rank of p in
this space.
The idea of considering the Stone space of types dates back to [1].
We can see that the set of formulas in a language is a Boolean lattice. A type is an ultrafilter
on this lattice. The definition of a Stone space can be made in an analogous way on the set
of ultrafilters on any Boolean lattice.
REFERENCES
1. M. Morley, Categoricity in power. Trans. Amer. Math. Soc. 114 (1965), 514-538.
Version: 5 Owner: ratboy Author(s): Larry Hammick, Timmy
21.3 alphabet
An alphabet Σ is a nonempty finite set of symbols. The main restriction is that we must
make sure that every string formed from Σ can be broken back down into symbols in only
one way.
For example, {b, lo, o, bl, oo} is not a valid alphabet because the string bloo can be broken
up in two ways: b lo o and bl oo. {Ca, ña, d, a} is a valid alphabet, because there is only
one way to fully break up any given string formed from it.
If Σ is our alphabet and n ∈ Z^+, we define the following as the powers of Σ:
• Σ^0 = {λ}, where λ stands for the empty string.
• Σ^n = {xw | x ∈ Σ, w ∈ Σ^(n−1)} (xw is the juxtaposition of x and w)
So, Σ^n is the set of all strings formed from Σ of length n.
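The recursive definition of the powers translates directly into a small program. The following Python sketch (illustrative only, not part of the original entry; the helper name powers is ours) generates Σ^n for an alphabet whose symbols may be multi-character strings:

# Sketch: generating the powers of an alphabet Sigma whose "symbols" may
# themselves be multi-character strings, e.g. {"Ca", "ña", "d", "a"}.
# Mirrors Sigma^0 = {lambda}, Sigma^n = {xw : x in Sigma, w in Sigma^(n-1)}.

def powers(sigma, n):
    """Return the set Sigma^n as a set of Python strings."""
    if n == 0:
        return {""}                      # the empty string plays the role of lambda
    shorter = powers(sigma, n - 1)
    return {x + w for x in sigma for w in shorter}

if __name__ == "__main__":
    sigma = {"Ca", "ña", "d", "a"}
    print(sorted(powers(sigma, 2)))      # all strings of length 2 over this alphabet

Here "length 2" means two alphabet symbols, which may be more than two characters.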
Version: 1 Owner: xriso Author(s): xriso
21.4 axiomatizable theory
Let T be a first order theory. A subset ∆ ⊆ T is a set of axioms for T if and only if T is
the set of all consequences of the formulas in ∆. In other words, ϕ ∈ T if and only if ϕ is
provable using only assumptions from ∆.
Definition. A theory T is said to be finitely axiomatizable if and only if there is a finite
set of axioms for T; it is said to be recursively axiomatizable if and only if it has a
recursive set of axioms.
For example, group theory is finitely axiomatizable (it has only three axioms), and Peano arithmetic
is recursively axiomatizable: there is clearly an algorithm that can decide whether a formula of
the language of the natural numbers is an axiom.
Theorem. Complete recursively axiomatizable theories are decidable.
As an example of the use of this theorem, consider the theory of algebraically closed fields
of characteristic p, for any p prime or 0. It is complete, and the set of axioms is
obviously recursive, and so it is decidable.
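The idea behind the theorem is itself an algorithm: enumerate the consequences of the recursive axiom set and wait until either ϕ or ¬ϕ appears; completeness guarantees that one of them eventually does. A minimal Python sketch of this procedure, assuming a hypothetical generator enumerate_theorems(axioms) that lists the consequences one by one (for instance by enumerating all formal proofs):

# Sketch of the decision procedure for a complete, recursively axiomatizable theory T.
# "enumerate_theorems" is a hypothetical generator yielding every consequence of the axioms.

def decides(phi, negate, enumerate_theorems, axioms):
    """Return True if T proves phi, False if T proves its negation.

    Terminates because T is complete: one of phi, not-phi is always a theorem.
    """
    neg_phi = negate(phi)
    for theorem in enumerate_theorems(axioms):
        if theorem == phi:
            return True
        if theorem == neg_phi:
            return False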
Version: 2 Owner: jihemme Author(s): jihemme
21.5 definable
21.5.1 Definable sets and functions
Definability In Model Theory
Let L be a first order language. Let M be an L-structure. Write x for the tuple x_1, ..., x_n and
y for y_1, ..., y_m, and suppose φ(x, y) is a formula from L, and b = b_1, ..., b_m is some sequence
from M.
Then we write φ(M^n, b) to denote {a ∈ M^n : M ⊨ φ(a, b)}. We say that φ(M^n, b) is b-
definable. More generally, if S is some set and B ⊆ M, and there is some b from B so that
S is b-definable, then we say that S is B-definable.
In particular we say that a set S is ∅-definable or zero definable iff it is the solution set of
some formula without parameters.
Let f be a function; then we say f is B-definable iff the graph of f (i.e. {(x, y) : f(x) = y})
is a B-definable set.
If S is B-definable then any automorphism of M that fixes B pointwise fixes S setwise.
A set or function is definable iff it is B-definable for some parameters B.
Some authors use the term definable to mean what we have called ∅-definable here. If this
is the convention of a paper, then the term parameter definable will refer to sets that are
definable over some parameters.
Sometimes in model theory it is not actually very important what language one is using, but
merely what the definable sets are, or what the definability relation is.
Definability of functions in Proof Theory
In proof theory, given a theory T in the language L, for a function f : M → M to be
definable in the theory T, we have two conditions:
(i) There is a formula in the language L such that f is definable over the model M, as in the above
definition; i.e., its graph is definable in the language L over the model M, by some formula
φ(x, y).
(ii) The theory T proves that f is indeed a function, that is, T ⊢ ∀x∃!y φ(x, y).
For example: the graph of the exponentiation function x^y = z is definable by a formula of
the language of the theory I∆_0 (a weak subsystem of PA), but the function itself is not definable in
this theory.
Version: 13 Owner: iddo Author(s): iddo, yark, Timmy
21.6 definable type
Let M be a first order structure. Let A and B be sets of parameters from M. Let p be
a complete n-type over B. Then we say that p is an A-definable type iff for every formula
ψ(x, y) with x a tuple of length n, there is some formula dψ(y, z) and some parameters c from A so
that for any b from B we have ψ(x, b) ∈ p iff M ⊨ dψ(b, c).
Note that if p is a type over the model M then this condition is equivalent to showing that
{b ∈ M : ψ(x, b) ∈ p} is an A-definable set.
For p a type over B, we say p is definable if it is B-definable.
If p is definable, we call dψ the defining formula for ψ, and the function ψ ↦ dψ a defining
scheme for p.
Version: 1 Owner: Timmy Author(s): Timmy
21.7 downward Lowenheim-Skolem theorem
Let L be a first order language, let A be an L-structure and let S ⊆ dom(A). Then there is
an L-structure B such that S ⊆ B and |B| ≤ max(|S|, |L|), and B is elementarily embedded
in A.
Version: 1 Owner: Evandar Author(s): Evandar
21.8 example of definable type
Consider (Q, <) as a structure in a language with one binary relation, which we interpret as
the order. This is a universal, ℵ_0-categorical structure (see example of universal structure).
The theory of (Q, <) has quantifier elimination, and so is o-minimal. Thus a type over the
set Q is determined by the quantifier free formulas over Q, which in turn are determined by
the atomic formulas over Q. An atomic formula in one variable over B is of the form x < b
or x > b or x = b for some b ∈ B. Thus each 1-type over Q determines a Dedekind cut over
Q, and conversely a Dedekind cut determines a complete type over Q. Let C(p) := {a ∈ Q :
"x > a" ∈ p}.
Thus there are two classes of types over Q:
1. Those where C(p) is of the form (−∞, a) or (−∞, a] for some a ∈ Q. It is clear that
these are definable from the above discussion.
2. Those where C(p) has no supremum in Q. These are clearly not definable, by o-minimality
of Q.
Version: 1 Owner: Timmy Author(s): Timmy
21.9 example of strongly minimal
Let L_R be the language of rings. In other words, L_R has two constant symbols 0, 1 and three
binary function symbols +, ·, −. Let T be the L_R-theory that includes the field axioms and,
for each n, the formula

∀x_0, x_1, ..., x_n ∃y ((x_1 ≠ 0 ∨ ⋯ ∨ x_n ≠ 0) → x_0 + x_1 y + ⋯ + x_n y^n = 0)

which expresses that every non-constant polynomial of degree at most n has a root. Then any
model of T is an algebraically closed field. One can show that this is a complete theory and
has quantifier elimination (Tarski). Thus every L_R-definable subset of any K ⊨ T is definable
by a quantifier free formula in L_R(K) with one free variable y. A quantifier free formula is a
Boolean combination of atomic formulas. Each of these is of the form b_0 + b_1 y + ⋯ + b_n y^n = 0, which
defines a finite set. Thus every definable subset of K is a finite or cofinite set. Thus K and
T are strongly minimal.
Version: 3 Owner: Timmy Author(s): Timmy
21.10 first isomorphism theorem
Let Σ be a fixed signature, and A and B structures for Σ. If f : A → B is a homomorphism,
then A/ker(f) is bimorphic to im(f). Furthermore, if f has the additional property that for
every natural number n and n-ary relation symbol R of Σ,

R^B(f(a_1), ..., f(a_n)) ⇒ ∃a'_1, ..., a'_n [f(a_i) = f(a'_i) for each i, and R^A(a'_1, ..., a'_n)],

then A/ker(f) ≅ im(f).
Since the homomorphic image of a Σ-structure is also a Σ-structure, we may assume that
im(f) = B.
Let ∼ = ker(f). Define a bimorphism φ : A/∼ → B : [[a]] ↦ f(a). To verify that φ is well
defined, let a ∼ a'. Then φ([[a]]) = f(a) = f(a') = φ([[a']]). To show that φ is injective,
suppose φ([[a]]) = φ([[a']]). Then f(a) = f(a'), so a ∼ a'. Hence [[a]] = [[a']]. To show that φ is
a homomorphism, observe that for any constant symbol c of Σ we have φ([[c^A]]) = f(c^A) = c^B.
For every natural number n and n-ary relation symbol R of Σ,

R^{A/∼}([[a_1]], ..., [[a_n]]) ⇒ R^A(a_1, ..., a_n)
⇒ R^B(f(a_1), ..., f(a_n))
⇒ R^B(φ([[a_1]]), ..., φ([[a_n]])).

For every natural number n and n-ary function symbol F of Σ,

φ(F^{A/∼}([[a_1]], ..., [[a_n]])) = φ([[F^A(a_1, ..., a_n)]])
= f(F^A(a_1, ..., a_n))
= F^B(f(a_1), ..., f(a_n))
= F^B(φ([[a_1]]), ..., φ([[a_n]])).

Thus φ is a bimorphism.
Now suppose f has the additional property mentioned in the statement of the theorem.
Then

R^B(φ([[a_1]]), ..., φ([[a_n]])) ⇒ R^B(f(a_1), ..., f(a_n))
⇒ ∃a'_1, ..., a'_n [a_i ∼ a'_i for each i, and R^A(a'_1, ..., a'_n)]
⇒ R^{A/∼}([[a_1]], ..., [[a_n]]).

Thus φ is an isomorphism.
Version: 4 Owner: almann Author(s): almann
21.11 language
Let Σ be an alphabet. We then define the following using the powers of an alphabet and
infinite union, where n ranges over the non-negative integers:

Σ^+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ⋯

Σ^* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ⋯ = Σ^+ ∪ {λ}

A string is an element of Σ^*, meaning that it is a grouping of symbols from Σ one after
another. For example, abba is a string, and abab is a different string. Σ^+, like Σ^*, contains
all finite strings except that Σ^+ does not contain the empty string λ.
A language over Σ is a subset of Σ^*, meaning that it is a set of strings made from the
symbols in the alphabet Σ.
Take for example an alphabet Σ = {♣, ℘, 63, a, A}. We can construct languages over Σ, such
as L = {aaa, λ, A℘63, 63♣, AaAaA}, or {℘a, ℘aa, ℘aaa, ℘aaaa, ...}, or even the empty set
∅. In the context of languages, ∅ is called the empty language.
Version: 12 Owner: bbukh Author(s): bbukh, xriso
21.12 length of a string
Suppose we have a string u on alphabet Σ. We can then represent the string as u =
x_1 x_2 x_3 ⋯ x_{n−1} x_n, where for all x_i (1 ≤ i ≤ n), x_i ∈ Σ (this means that each x_i must be
a "letter" from the alphabet). Then, the length of u is n. The length of a string u is
represented as |u|.
For example, if our alphabet is Σ = {a, b, cc} then the length of the string u = bccab is
|u| = 4, since the string breaks down as follows: x_1 = b, x_2 = cc, x_3 = a, x_4 = b. So, our
x_n is x_4 and therefore n = 4. Although you may think that cc is two separate symbols, our
chosen alphabet in fact classifies it as a single symbol.
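This breakdown can be computed mechanically. The following Python sketch (an illustration, not part of the original entry) finds every way of splitting a string into symbols of the alphabet; for a valid alphabet there is exactly one, and its length is |u|:

# Decompose a string over an alphabet whose symbols may span several characters,
# e.g. Sigma = {"a", "b", "cc"}, and report |u|, the number of symbols used.

def decompositions(u, sigma):
    """Yield every way of breaking u into symbols from sigma."""
    if u == "":
        yield []
        return
    for symbol in sigma:
        if u.startswith(symbol):
            for rest in decompositions(u[len(symbol):], sigma):
                yield [symbol] + rest

if __name__ == "__main__":
    sigma = {"a", "b", "cc"}
    u = "bccab"
    splits = list(decompositions(u, sigma))
    assert len(splits) == 1                       # sigma is a valid alphabet: unique breakdown
    print(splits[0], "length:", len(splits[0]))   # ['b', 'cc', 'a', 'b'] length: 4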
A "special case" occurs when |u| = 0, i.e. it does not have any symbols in it. This string
is called the empty string. Instead of writing u with no symbols at all, we use λ to represent
the empty string: u = λ. This is similar to the practice of using β to represent a space, even
though a space is really blank.
If your alphabet contains λ as a symbol, then you must use something else to denote the
empty string.
Suppose you also have a string v on the same alphabet as u. We write u as x_1 ⋯ x_n just
as before, and similarly v = y_1 ⋯ y_m. We say v is equal to u if and only if both m = n,
and for every i, x_i = y_i.
For example, suppose u = bba and v = bab, both strings on alphabet Σ = {a, b}. These
strings are not equal because the second symbols do not match.
Version: 3 Owner: xriso Author(s): xriso
21.13 proof of homomorphic image of a Σ-structure is
a Σ-structure
We need to show that im(f) is closed under functions. For every constant symbol c of Σ,
c^B = f(c^A). Hence c^B ∈ im(f). Also, if b_1, ..., b_n ∈ im(f) and F is an n-ary function symbol of
Σ, then for some a_1, ..., a_n ∈ A we have

F^B(b_1, ..., b_n) = F^B(f(a_1), ..., f(a_n)) = f(F^A(a_1, ..., a_n)).

Hence F^B(b_1, ..., b_n) ∈ im(f).
Version: 1 Owner: almann Author(s): almann
21.14 satisfaction relation
Alfred Tarski was the first mathematician to give a definition of what it means for a formula
to be "true" in a structure. To do this, we need to provide a meaning to terms, and truth-
values to the formulas. In doing this, free variables cause a problem: what value are they
going to have? One possible answer is to supply temporary values for the free variables,
and define our notions in terms of these temporary values.
Let A be a structure for the signature Σ. Suppose I is an interpretation, and σ is a function
that assigns elements of A to variables. We define the function Val_{I,σ} inductively on the
construction of terms:

Val_{I,σ}(c) = I(c)   (c a constant symbol)
Val_{I,σ}(x) = σ(x)   (x a variable)
Val_{I,σ}(F(t_1, ..., t_n)) = I(F)(Val_{I,σ}(t_1), ..., Val_{I,σ}(t_n))   (F an n-ary function symbol)

Now we are set to define satisfaction. Again we have to take care of free variables by assigning
temporary values to them via a function σ. We define the relation A, σ ⊨ ϕ by induction
on the construction of formulas:

A, σ ⊨ t_1 = t_2 if and only if Val_{I,σ}(t_1) = Val_{I,σ}(t_2)
A, σ ⊨ R(t_1, ..., t_n) if and only if (Val_{I,σ}(t_1), ..., Val_{I,σ}(t_n)) ∈ I(R)
A, σ ⊨ ¬ϕ if and only if it is not the case that A, σ ⊨ ϕ
A, σ ⊨ ϕ ∨ ψ if and only if either A, σ ⊨ ϕ or A, σ ⊨ ψ
A, σ ⊨ ∃x.ϕ(x) if and only if for some a ∈ A, A, σ[x ↦ a] ⊨ ϕ

Here

σ[x ↦ a](y) = a if y = x, and σ(y) otherwise.

In case for some ϕ of L we have A, σ ⊨ ϕ, we say that A models, or is a model of,
or satisfies ϕ in environment, or context, σ. If ϕ has the free variables x_1, ..., x_n,
and a_1, ..., a_n ∈ A, we also write A ⊨ ϕ(a_1, ..., a_n) or A ⊨ ϕ(a_1/x_1, ..., a_n/x_n) instead of
A, σ[x_1 ↦ a_1] ⋯ [x_n ↦ a_n] ⊨ ϕ. In case ϕ is a sentence (a formula with no free variables), we write
A ⊨ ϕ.
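The inductive clauses above translate directly into a recursive evaluator. The following Python sketch (a toy illustration; the tuple-based formula encoding is ours, not part of the entry) checks satisfaction in a small finite structure:

# Toy evaluator for the Tarskian satisfaction relation over a finite structure.
# Terms:    "x" (variable), ("const", c), ("func", F, t1, ..., tn)
# Formulas: ("eq", t1, t2), ("rel", R, t1, ..., tn), ("not", phi),
#           ("or", phi, psi), ("exists", x, phi)

def val(term, structure, sigma):
    """Val_{I,sigma}(term): the value of a term under the assignment sigma."""
    if isinstance(term, str):                       # a variable
        return sigma[term]
    if term[0] == "const":
        return structure["constants"][term[1]]
    if term[0] == "func":
        args = [val(t, structure, sigma) for t in term[2:]]
        return structure["functions"][term[1]](*args)
    raise ValueError(term)

def sat(phi, structure, sigma):
    """A, sigma |= phi, following the inductive clauses above."""
    tag = phi[0]
    if tag == "eq":
        return val(phi[1], structure, sigma) == val(phi[2], structure, sigma)
    if tag == "rel":
        args = tuple(val(t, structure, sigma) for t in phi[2:])
        return args in structure["relations"][phi[1]]
    if tag == "not":
        return not sat(phi[1], structure, sigma)
    if tag == "or":
        return sat(phi[1], structure, sigma) or sat(phi[2], structure, sigma)
    if tag == "exists":
        x, body = phi[1], phi[2]
        return any(sat(body, structure, dict(sigma, **{x: a}))
                   for a in structure["universe"])
    raise ValueError(phi)

if __name__ == "__main__":
    less = {(a, b) for a in range(4) for b in range(4) if a < b}
    A = {"universe": range(4), "constants": {}, "functions": {}, "relations": {"<": less}}
    phi = ("exists", "y", ("rel", "<", "x", "y"))   # "there is something above x"
    print(sat(phi, A, {"x": 2}))                    # True
    print(sat(phi, A, {"x": 3}))                    # False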
Version: 8 Owner: jihemme Author(s): jihemme
21.15 signature
A signature is the collection of a set of constant symbols, and, for every natural number n,
a set of n-ary relation symbols and a set of n-ary function symbols.
Version: 1 Owner: almann Author(s): almann
21.16 strongly minimal
Let L be a first order language and let M be an L-structure. Let S, a subset of the domain
of M, be a definable infinite set. Then S is strongly minimal iff for every definable C ⊆ S we
have that either C is finite or S ∖ C is finite. We say that M is strongly minimal iff the domain
of M is a strongly minimal set.
If M is strongly minimal and N ≡ M then N is strongly minimal. Thus if T is a complete
L-theory then we say T is strongly minimal if it has some model (equivalently, all models)
which is strongly minimal.
Note that M is strongly minimal iff every definable subset of M is quantifier free definable
in the language with just equality. Compare this to the notion of o-minimal structures.
Version: 1 Owner: Timmy Author(s): Timmy
21.17 structure preserving mappings
Let Σ be a fixed signature, and A and B be two structures for Σ. The interesting functions
from A to B are the ones that preserve the structure.
A function f : A → B is said to be a homomorphism if and only if:
1. For every constant symbol c of Σ, f(c^A) = c^B.
2. For every natural number n and every n-ary function symbol F of Σ,
f(F^A(a_1, ..., a_n)) = F^B(f(a_1), ..., f(a_n)).
3. For every natural number n and every n-ary relation symbol R of Σ,
R^A(a_1, ..., a_n) ⇒ R^B(f(a_1), ..., f(a_n)).
Homomorphisms with various additional properties have special names:
• An injective homomorphism is called a monomorphism.
• A surjective homomorphism is called an epimorphism.
• A bijective homomorphism is called a bimorphism.
• An injective homomorphism whose inverse function is also a homomorphism is called
an embedding.
• A surjective embedding is called an isomorphism.
• A homomorphism from a structure to itself (e.g., 1 : A → A) is called an endomor-
phism.
• An isomorphism from a structure to itself is called an automorphism.
Version: 5 Owner: almann Author(s): almann, yark, jihemme
21.18 structures
Suppose Σ is a fixed signature, and L is the corresponding first-order language. A Σ-
structure A consists of a set A, called the universe of the structure, together with an interpretation
for the non-logical symbols contained in Σ. The interpretation of Σ in A is an operation
J on sets that has the following properties:
1. For each constant symbol c, J(c) is an element of A.
2. For each n ∈ N, and each n-ary function symbol F, J(F) : A^n → A is a function from
A^n to A.
3. For each n ∈ N, and each n-ary relation symbol R, J(R) is a subset of (an n-ary
relation on) A^n.
Another commonly used notation is J(c) = c^A, J(F) = F^A, J(R) = R^A. For notational
convenience, when the context makes it clear in which structure we are working, we use the
elements of Σ to stand for both the symbols and their interpretations. When Σ is understood,
we call A a structure, instead of a Σ-structure. In some texts, model may be used for
structure. Also, by abuse of notation, we write a ∈ A for an element a of the universe. Of course,
there are many different possibilities for the interpretation J. If A is a structure, then the power
of A, which we denote |A|, is the cardinality of its universe. It is easy to see that the number of
possibilities for the interpretation J is at most 2^|A| when the universe is infinite.
Version: 5 Owner: jihemme Author(s): jihemme
21.19 substructure
Let Σ be a fixed signature, and A and B structures for Σ. We say A is a substructure of
B, denoted A ⊆ B, if for all x ∈ A we have x ∈ B, and the inclusion map i : A → B : x ↦ x
is an embedding.
Version: 1 Owner: almann Author(s): almann
21.20 type
Let L be a first order language. Let M be an L-structure. Let B ⊆ M, and let a ∈ M^n.
Then we define the type of a over B to be the set of L-formulas φ(x, b) with parameters b
from B so that M ⊨ φ(a, b). A collection of L-formulas is a complete n-type over B iff it is
of the above form for some B, M and a ∈ M^n.
We call any consistent collection of formulas p in n variables with parameters from B a
partial n-type over B. (See the criterion for consistency of sets of formulas.)
Note that a complete n-type p over B is consistent, so it is in particular a partial type over
B. Also p is maximal in the sense that for every formula ψ(x, b) over B we have either
ψ(x, b) ∈ p or ¬ψ(x, b) ∈ p. In fact, for every collection of formulas p in n variables
the following are equivalent:
• p is the type of some sequence of n elements a over B in some elementary extension N of M
• p is a maximal consistent set of formulas.
For n ∈ ω we define S_n(B) to be the set of complete n-types over B.
Some authors define a collection of formulas p to be an n-type iff p is a partial n-type. Others
define p to be a type iff p is a complete n-type.
A type (resp. partial type, complete type) is any n-type (resp. partial n-type, complete n-type)
for some n ∈ ω.
Version: 2 Owner: Timmy Author(s): Timmy
21.21 upward Lowenheim-Skolem theorem
Let L be a first-order language and let A be an infinite L-structure. Then if κ is a cardinal
with κ ≥ max(|A|, |L|), there is an L-structure B of cardinality κ such that A is elementarily embedded
in B.
Version: 2 Owner: Evandar Author(s): Evandar
Chapter 22
03C15 – Denumerable structures
22.1 random graph (infinite)
Suppose we have some method M of generating sequences of letters from {p, q} so that at
each generation the probability of obtaining p is x, a real number strictly between 0 and 1.
Let {a_i : i < ω} be a set of vertices. For each i < ω, i ≥ 1, we construct a graph G_i on the
vertices a_1, ..., a_i recursively.
• G_1 is the unique graph on one vertex.
• For i > 1 we must describe, for any j < k ≤ i, when a_j and a_k are joined.
– If k < i then join a_j and a_k in G_i iff a_j and a_k are joined in G_{i−1}.
– If k = i then generate a letter l(j, k) with M. Join a_j to a_k iff l(j, k) = p.
Now let Γ be the graph on {a_i : i < ω} so that for any n, m < ω, a_n is joined to a_m in Γ iff
it is joined in some G_i.
Then we call Γ a random graph. Consider the following property which we shall call f-
saturation:
Given any finite disjoint U and V, subsets of {a_i : i < ω}, there is some a_n ∈ {a_i : i <
ω} ∖ (U ∪ V) so that a_n is joined to every point of U and to no point of V.
Proposition 1. A random graph has f-saturation with probability 1.
Proof: Let b_1, b_2, ..., b_n, ... be an enumeration of {a_i : i < ω} ∖ (U ∪ V). We say that b_i is
correctly joined to (U, V) iff it is joined to all the members of U and none of the members of
V. Then the probability that b_i is not correctly joined is 1 − x^|U| (1 − x)^|V|, which is some
real number y strictly between 0 and 1. The probability that none of the first m are correctly
joined is y^m, and the probability that none of the b_i are correctly joined is lim_{n→∞} y^n = 0.
Thus one of the b_i is correctly joined. ∎
Proposition 2. Any two countable graphs with f-saturation are isomorphic.
Proof: This is via a back and forth argument. The property of f-saturation is exactly what
is needed. ∎
Thus although the system of generation of a random graph looked as though it could deliver
many potentially different graphs, this is not the case. Thus we talk about the random
graph.
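The construction is easy to imitate computationally. The sketch below (illustrative only; a finite sample can of course only suggest the limit behaviour) builds a finite approximation to the random graph and looks for a witness of f-saturation for a given pair (U, V):

# Build a finite approximation to the random graph: join each pair independently
# with probability x, then look for a witness of the f-saturation property.
import itertools
import random

def random_graph(n, x=0.5, seed=0):
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for j, k in itertools.combinations(range(n), 2):
        if rng.random() < x:             # the letter "p" was generated
            adj[j].add(k)
            adj[k].add(j)
    return adj

def witness(adj, U, V):
    """Return a vertex joined to everything in U and to nothing in V, if one exists."""
    for a in adj:
        if a in U or a in V:
            continue
        if U <= adj[a] and not (V & adj[a]):
            return a
    return None

if __name__ == "__main__":
    G = random_graph(200)
    print(witness(G, {0, 1, 2}, {3, 4}))   # with high probability a witness is found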
The random graph can also be constructed as a Fraïssé limit of all finite graphs, and in many
other ways. It is homogeneous and universal for the class of all countable graphs.
The theorem that almost any two infinite random graphs are isomorphic was first proved
in [1].
REFERENCES
1. Paul Erdős and Alfréd Rényi. Asymmetric graphs. Acta Math. Acad. Sci. Hung., 14:295–315,
1963.
Version: 2 Owner: bbukh Author(s): bbukh, Timmy
Chapter 23
03C35 – Categoricity and
completeness of theories
23.1 κ-categorical
Let L be a first order language and let S be a set of L-sentences. If κ is a cardinal, then S
is said to be κ-categorical if S has a model of cardinality κ and any two such models are
isomorphic.
In other words, S is κ-categorical iff it has a unique model of cardinality κ, to within isomorphism.
Version: 1 Owner: Evandar Author(s): Evandar
23.2 Vaught’s test
Let L be a first order language, and let S be a set of L-sentences with no finite models which
is κ-categorical for some κ ≥ |L|. Then S is complete.
Version: 4 Owner: Evandar Author(s): Evandar
23.3 proof of Vaught’s test
Let ϕ be an L-sentence, and let A be the unique model of S of cardinality κ. Suppose A ⊨ ϕ.
Then if B is any model of S, by the upward and downward Lowenheim-Skolem theorems
there is a model C of S which is elementarily equivalent to B such that |C| = κ. Then C is
isomorphic to A, and so C ⊨ ϕ, and hence B ⊨ ϕ. So B ⊨ ϕ for all models B of S, so S ⊨ ϕ.
Similarly, if A ⊨ ¬ϕ then S ⊨ ¬ϕ. So S is complete. ∎
Version: 1 Owner: Evandar Author(s): Evandar
Chapter 24
03C50 – Models with special
properties (saturated, rigid, etc.)
24.1 example of universal structure
Let L be the first order language with the binary relation ≤. Consider the following sentences:
• ∀x, y ((x ≤ y ∨ y ≤ x) ∧ ((x ≤ y ∧ y ≤ x) ↔ x = y))
• ∀x, y, z (x ≤ y ∧ y ≤ z → x ≤ z)
Any L-structure satisfying these is called a linear order. We define the relation < so that
x < y iff x ≤ y ∧ x ≠ y. Now consider these sentences:
1. ∀x, y (x < y → ∃z (x < z < y))
2. ∀x ∃y, z (y < x < z)
A linear order that satisfies 1 is called dense. We say that a linear order that satisfies 2 is
without endpoints. Let T be the theory of dense linear orders without endpoints. This is a
complete theory.
We can see that (Q, <) is a model of T. It is actually a rather special model.
Theorem 3. Let (S, <) be any finite linear order. Then S embeds in (Q, <).
Proof: By induction on |S|; it is trivial for |S| = 1.
Suppose that the statement holds for all linear orders with cardinality less than or equal to
n. Let |S| = n + 1; then pick some a ∈ S, and let S' be the structure induced by S on S ∖ {a}.
Then there is some embedding e of S' into Q.
• Now suppose a is less than every member of S'. Then as Q is without endpoints, there
is some element b less than every element in the image of e. Thus we can extend e to
map a to b, which gives an embedding of S into Q.
• We work similarly if a is greater than every element in S'.
• If neither of the above holds then we can pick some maximum a_1 ∈ S' so that a_1 < a.
Similarly we can pick some minimum a_2 ∈ S' so that a < a_2. Now there is some b ∈ Q
with e(a_1) < b < e(a_2). Then extending e by mapping a to b is the required embedding.
∎
It is easy to extend the above result to countable structures. One views a countable structure
as the union of an increasing chain of finite substructures. The necessary embedding is
the union of the embeddings of the substructures. Thus (Q, <) is a universal countable linear
order.
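The extension step in the proof of Theorem 3 is explicit enough to program. A small Python sketch (our illustration, with a hypothetical helper extend) chooses an image for a new element given an embedding of the rest of a finite order into Q:

# Extension step from Theorem 3: given an embedding e of S \ {a} into Q
# (as a dict element -> Fraction), choose an image for a.
from fractions import Fraction

def extend(e, a, order):
    """order(x, y) is the linear order on S; return a new dict extending e to a."""
    below = [e[s] for s in e if order(s, a)]        # images of elements below a
    above = [e[s] for s in e if order(a, s)]        # images of elements above a
    if not below and not above:
        image = Fraction(0)
    elif not below:
        image = min(above) - 1                      # Q has no least element
    elif not above:
        image = max(below) + 1                      # Q has no greatest element
    else:
        image = (max(below) + min(above)) / 2       # density of Q gives a point in between
    return {**e, a: image}

if __name__ == "__main__":
    # Embed the 4-element order c < a < d < b, adding one element at a time.
    order = lambda x, y: "cadb".index(x) < "cadb".index(y)
    e = {}
    for elt in ["a", "b", "c", "d"]:
        e = extend(e, elt, order)
    print(e)   # the chosen rationals respect c < a < d < b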
Theorem 4. (Q, <) is homogeneous.
Proof: The following type of proof is known as a back and forth argument. Let S_1 and S_2 be
two finite substructures of (Q, <). Let e : S_1 → S_2 be an isomorphism. It is easier to think
of two disjoint copies B and C of Q, with S_1 a substructure of B and S_2 a substructure of C.
Let b_1, b_2, ... be an enumeration of B ∖ S_1. Let c_1, c_2, ... be an enumeration of C ∖ S_2.
We iterate the following two step process:
The i-th forth step. If b_i is already in the domain of e then do nothing. If b_i is not in the
domain of e, then as in Theorem 3, either b_i is less than every element in the domain of
e, or greater than every such element, or it has an immediate successor and predecessor in
the domain of e. Either way there is an element c in C ∖ range(e) in the same position relative
to the range of e. Thus we can extend the isomorphism to include b_i.
The i-th back step. If c_i is already in the range of e then do nothing. If c_i is not in the
range of e, then exactly as above we can find some b ∈ B ∖ dom(e) and extend e so that
e(b) = c_i.
After ω stages, we have an isomorphism whose domain includes every b_i and whose range
includes every c_i. Thus we have an isomorphism from B to C extending e. ∎
A similar back and forth argument shows that any countable dense linear order without
endpoints is isomorphic to (Q, <), so T is ℵ_0-categorical.
Version: 5 Owner: Timmy Author(s): Timmy
24.2 homogeneous
Let L be a first order language. Let M be an L-structure. Then we say M is homogeneous
if the following holds:
if σ is an isomorphism between finite substructures of M, then σ extends to an automorphism
of M.
Version: 1 Owner: Timmy Author(s): Timmy
24.3 universal structure
Let L be a first order language, and let K be an elementary class of L-structures. Let κ be
a cardinal, and let K_κ be the set of structures from K with cardinality less than or equal to κ.
Let M ∈ K_κ. Suppose that for every N ∈ K_κ there is an embedding of N into M. Then we
say M is universal.
Version: 1 Owner: Timmy Author(s): Timmy
Chapter 25
03C52 – Properties of classes of
models
25.1 amalgamation property
A class of L-structures S has the amalgamation property iff whenever A, B_1, B_2 ∈ S and
f_i : A → B_i are elementary embeddings for i ∈ {1, 2}, then there is some C ∈ S and some
elementary embeddings g_i : B_i → C for i ∈ {1, 2} so that g_1(f_1(x)) = g_2(f_2(x)) for all x ∈ A.
Compare this with the free product with amalgamated subgroup for groups and the definition
of pushout contained there.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 26
03C64 – Model theory of ordered
structures; o-minimality
26.1 infinitesimal
Let F be a real closed field, for example the reals thought of as a structure in L, the language
of ordered rings. Let B be some set of parameters from F. Consider the following set of
formulas in L(B):

{x < b : b ∈ B ∧ b > 0}

Then this set of formulas is finitely satisfied, so by compactness it is consistent. In fact this
set of formulas extends to a unique type p over B, as it defines a Dedekind cut. Thus there
is some model M containing B and some a ∈ M so that tp(a/B) = p.
Any such element will be called B-infinitesimal. In particular, suppose B = ∅. Then the
definable closure of B is the intersection of the reals with the algebraic numbers. Then a
∅-infinitesimal (or simply infinitesimal) is any element of any real closed field that is positive
but smaller than every real algebraic (positive) number.
As noted above such models exist, by compactness. One can construct them using ultraproducts;
see the entry on hyperreals. This is due to Abraham Robinson, who used such
fields to formulate nonstandard analysis.
Let F be any ordered ring; then F contains N. We say F is archimedean iff for every a ∈ F
there is some n ∈ N so that a < n. Otherwise F is non-archimedean.
Real closed fields with infinitesimal elements are non-archimedean: for a an infinitesimal we
have a < 1/n and thus 1/a > n for each n ∈ N.
Reference: A. Robinson, Selected papers of Abraham Robinson. Vol. II. Nonstandard analysis
and philosophy (New Haven, Conn., 1979)
Version: 2 Owner: Timmy Author(s): Timmy
26.2 o-minimality
Let M be an ordered structure. An interval in M is any subset of M that can be expressed
in one of the following forms:
• {x : a < x < b} for some a, b from M
• {x : x > a} for some a from M
• {x : x < a} for some a from M
Then we define M to be o-minimal iff every definable subset of M is a finite union of intervals
and points. This is a property of the theory of M, i.e. if M ≡ N and M is o-minimal, then
N is o-minimal. Note that M being o-minimal is equivalent to every definable subset of M
being quantifier free definable in the language with just the ordering. Compare this with
strong minimality.
The model theory of o-minimal structures is well understood; for an excellent account see
Lou van den Dries, Tame topology and o-minimal structures, CUP 1998. In particular,
although this condition is merely on definable subsets of M, it gives very good information
about definable subsets of M^n for n ∈ ω.
Version: 4 Owner: Timmy Author(s): Timmy
26.3 real closed fields
It is clear that the axioms for a structure to be an ordered field can be written in L, the
first order language of ordered rings. It is also true that the following condition can be
written as a schema of first order sentences in this language: every odd degree polynomial
p ∈ F[x] has a root.
Let A be all these sentences together with one that states that all positive elements have a
square root. Then one can show that the set of consequences of A is a complete theory T. It is
clear that this theory is the theory of the real numbers. We call any model of T a real closed
field.
The semialgebraic sets over a real closed field are Boolean combinations of solution sets of
polynomial equalities and inequalities. Tarski showed that T has quantifier elimination,
which is equivalent to the class of semialgebraic sets being closed under projection.
Let F be a real closed field. Consider the definable subsets of F. By quantifier elimination,
each is definable by a quantifier free formula, i.e. a Boolean combination of atomic formulas.
An atomic formula in one variable has one of the following forms:
• f(x) > g(x) for some f, g ∈ F[x]
• f(x) = g(x) for some f, g ∈ F[x].
The first defines a finite union of intervals, the second a finite union of points. Every
definable subset of F is a finite union of these kinds of sets, so is a finite union of intervals
and points. Thus any real closed field is o-minimal.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 27
03C68 – Other classical first-order
model theory
27.1 imaginaries
Given an algebraic structure S to investigate, mathematicians consider substructures, restrictions
of the structure, quotient structures and the like. A natural question for a mathematician
to ask if he is to understand S is "What structures naturally live in S?" We can
formalise this question in the following manner: given some logic appropriate to the structure
S, we say another structure B is definable in S iff there is some definable subset B' of
S^n, a bijection σ : B' → B, and a definable function (respectively relation) on B' for each
function (resp. relation) on B so that σ is an isomorphism (of the relevant type for B).
For an example take some infinite group (G, ·). Consider the centre of G, Z := {x ∈ G :
∀y ∈ G (xy = yx)}. Then Z is a first order definable subset of G which forms a group with
the restriction of the multiplication, so (Z, ·) is a first order definable structure in (G, ·).
As another example consider the structure (R, +, ·, 0, 1) as a field. Then the structure (R, <)
is first order definable in the structure (R, +, ·, 0, 1), as for all x, y ∈ R we have x < y iff
∃z (z ≠ 0 ∧ z^2 = y − x). Thus we know that (R, +, ·, 0, 1) is unstable, as it has a definable order on
an infinite subset.
Returning to the first example, Z is normal in G, so the set of (left) cosets of Z forms a
factor group. The domain of the factor group is the quotient of G under the equivalence relation
x ∼ y iff ∃z ∈ Z (xz = y). Therefore the factor group G/Z will not (in general) be a
definable structure, but would seem to be a "natural" structure. We therefore weaken our
formalisation of "natural" from definable to interpretable. Here we require that a structure
is isomorphic to some definable structure on equivalence classes of definable equivalence
relations. The equivalence classes of a ∅-definable equivalence relation are called imaginaries.
In [2] Poizat defined the property of elimination of imaginaries. This is equivalent to the
following definition:
Definition 1. A structure A with at least two distinct ∅-definable elements admits elimination
of imaginaries iff for every n ∈ N and ∅-definable equivalence relation ∼ on A^n there is
a ∅-definable function f : A^n → A^p (for some p) such that for all x and y from A^n we have
x ∼ y iff f(x) = f(y).
Given this property, we think of the function f as coding the equivalence classes of ∼,
and we call f(x) a code for x/∼. If a structure has elimination of imaginaries then every
interpretable structure is definable.
In [3] Shelah defined, for any structure A, a multi-sorted structure A^eq. This is done by adding
a sort for every ∅-definable equivalence relation, so that the equivalence classes are elements
(and code themselves). This is a closure operation, i.e. A^eq has elimination of imaginaries.
See [1], chapter 4, for a good presentation of imaginaries and A^eq. The idea of passing to
A^eq is very useful for many purposes. Unfortunately A^eq has an unwieldy language and
theory. Also this approach does not answer the question above. We would like to show that
our structure has elimination of imaginaries with just a small selection of sorts added, and
perhaps in a simple language. This would allow us to describe the definable structures more
easily, and as we have elimination of imaginaries this would also describe the interpretable
structures.
REFERENCES
1. Wilfrid Hodges, A Shorter Model Theory, Cambridge University Press, 1997.
2. Bruno Poizat, Une théorie de Galois imaginaire, Journal of Symbolic Logic, 48 (1983), pp.
1151–1170.
3. Saharon Shelah, Classification Theory and the Number of Non-isomorphic Models, North-Holland,
Amsterdam, 1978.
Version: 2 Owner: Timmy Author(s): Timmy
Chapter 28
03C90 – Nonclassical models
(Boolean-valued, sheaf, etc.)
28.1 Boolean valued model
A traditional model of a language makes every formula of that language either true or
false. A Boolean valued model is a generalization in which formulas take on any value in a
Boolean algebra.
Specifically, a Boolean valued model of a signature Σ over the language L is a set A together
with a Boolean algebra B. The objects of the model are the functions in A^B, that is, the
functions from B to A.
For any formula φ, we can assign a value ‖φ‖ from the Boolean algebra. For example, if L is
the language of first order logic, a typical recursive definition of ‖φ‖ might look something
like this:
• ‖f = g‖ = ⋁{b ∈ B : f(b) = g(b)}
• ‖¬φ‖ = ‖φ‖′ (the complement of ‖φ‖ in B)
• ‖φ ∨ ψ‖ = ‖φ‖ ∨ ‖ψ‖
• ‖∃x φ(x)‖ = ⋁_{f ∈ A^B} ‖φ(f)‖
Version: 1 Owner: Henry Author(s): Henry
Chapter 29
03C99 – Miscellaneous
29.1 axiom of foundation
The axiom of foundation (also called the axiom of regularity) is an axiom of ZF
set theory prohibiting circular sets and sets with infinite levels of containment. Intuitively,
it states that every set can be built up from the empty set. There are several equivalent
formulations, for instance:
For any nonempty set X there is some y ∈ X such that y ∩ X = ∅.
For any set X, there is no function f from ω to the transitive closure of X such that
f(n + 1) ∈ f(n) for every n.
For any formula φ, if there is any set x such that φ(x) then there is some X such that φ(X)
but there is no y ∈ X such that φ(y).
Version: 2 Owner: Henry Author(s): Henry
29.2 elementarily equivalent
If M and N are models of L then they are elementarily equivalent, denoted M ≡ N, iff
for every sentence φ:

M ⊨ φ iff N ⊨ φ.
Version: 1 Owner: Henry Author(s): Henry
29.3 elementary embedding
If A and B are models of L such that for each type t ∈ T, A_t ⊆ B_t, then we say B is an
elementary extension of A, or, equivalently, A is an elementary substructure of B, if,
whenever φ is a formula of L with free variables included in x_1, ..., x_n (of types t_1, ..., t_n)
and a_1, ..., a_n are such that a_i ∈ A_{t_i} for each i ≤ n, then:

A ⊨ φ(a_1, ..., a_n) iff B ⊨ φ(a_1, ..., a_n).

If A and B are models of L then a collection of one-to-one functions f_t : A_t → B_t for each
t ∈ T is an elementary embedding of A if whenever φ is a formula of L with free
variables included in x_1, ..., x_n (of types t_1, ..., t_n) and a_1, ..., a_n are such that a_i ∈ A_{t_i} for
each i ≤ n, then:

A ⊨ φ(a_1, ..., a_n) iff B ⊨ φ(f_{t_1}(a_1), ..., f_{t_n}(a_n)).
Version: 1 Owner: Henry Author(s): Henry
29.4 model
Let L be a logical language with function symbols F, relation symbols R, and types T. Then

M = ⟨{M_t | t ∈ T}, {f^M | f ∈ F}, {r^M | r ∈ R}⟩

is a model of L (also called an L-structure, or, if the underlying logic is clear, a Σ-structure,
where Σ is a signature specifying just F and R) if:
• Whenever f is an n-ary function symbol such that Type(f) = t and Inputs_n(f) =
⟨t_1, ..., t_n⟩, then f^M : M_{t_1} × ⋯ × M_{t_n} → M_t.
• Whenever r is an n-ary relation symbol such that Inputs_n(r) = ⟨t_1, ..., t_n⟩, then r^M is
a relation on M_{t_1} × ⋯ × M_{t_n}.
If s is a term of L of type t_s without free variables then it follows that s = f s_1 ⋯ s_n and
s^M = f^M(s_1^M, ..., s_n^M) ∈ M_{t_s}.
If φ is a sentence then we write M ⊨ φ (and say that M satisfies φ) if φ is true in M, where
truth of a relation is defined by:
• r t_1 ⋯ t_n is true if r^M(t_1^M, ..., t_n^M)
• truth of a non-atomic formula is defined using the semantics of the underlying logic.
If Φ is a class of sentences, we write M ⊨ Φ if M ⊨ φ for every φ ∈ Φ.
For any term s of L whose only free variables are included in x_1, ..., x_n, with types t_1, ..., t_n,
and for any a_1, ..., a_n such that a_i ∈ M_{t_i}, define s^M(a_1, ..., a_n) by:
• If s = x_i then s^M(a_1, ..., a_n) = a_i
• If s = f s_1 ⋯ s_m then s^M(a_1, ..., a_n) = f^M(s_1^M(a_1, ..., a_n), ..., s_m^M(a_1, ..., a_n))
If φ is a formula whose only free variables are included in x_1, ..., x_n, with types t_1, ..., t_n,
then for any a_1, ..., a_n such that a_i ∈ M_{t_i} define M ⊨ φ(a_1, ..., a_n) recursively by:
• If φ = r s_1 ⋯ s_m then M ⊨ φ(a_1, ..., a_n) iff r^M(s_1^M(a_1, ..., a_n), ..., s_m^M(a_1, ..., a_n))
• Otherwise the truth of φ is determined by the semantics of the underlying logic.
As above, M ⊨ Φ(a_1, ..., a_n) iff M ⊨ φ(a_1, ..., a_n) for every φ ∈ Φ.
Version: 10 Owner: Henry Author(s): Henry
29.5 proof equivalence of formulation of foundation
We show that each of the three formulations of the axiom of foundation given are equivalent.
1 ⇒ 2
Let X be a set and consider any function f : ω → tc(X). Consider Y = {f(n) | n < ω}. By
assumption, there is some f(n) ∈ Y such that f(n) ∩ Y = ∅, hence f(n + 1) ∉ f(n).
2 ⇒ 3
Let φ be some formula such that φ(x) is true for some x and such that for every X with φ(X),
there is some y ∈ X such that φ(y). Then define f(0) = x and f(n + 1) to be some y ∈ f(n)
such that φ(y). This would construct a function violating the assumption, so there is no such φ.
3 ⇒ 1
Let X be a nonempty set and define φ(x) ≡ x ∈ X. Then φ is true for some x, and by
assumption there is some y such that φ(y) but there is no z ∈ y such that φ(z). Hence
y ∈ X but y ∩ X = ∅.
Version: 1 Owner: Henry Author(s): Henry
Chapter 30
03D10 – Turing machines and related
notions
30.1 Turing machine
A Turing machine is an imaginary computing machine invented by Alan Turing to describe
what it means to compute something.
The "physical description" of a Turing machine is a box with a tape and a tape head. The
tape consists of an infinite number of cells stretching in both directions, with the tape head
always located over exactly one of these cells. Each cell has one of a finite number of symbols
written on it.
The machine has a finite set of states, and with every move the machine can change states,
change the symbol written on the current cell, and move one space left or right. The machine
has a program which specifies each move based on the current state and the symbol under
the current cell. The machine stops when it reaches a combination of state and symbol
for which no move is defined. One state is the start state, which the machine is in at the
beginning of a computation.
A Turing machine may be viewed as computing either a partial function or a relation. When
viewed as a function, the tape begins with a set of symbols which are the input, and when
the machine halts, whatever is on the tape is the output. For instance it is not difficult to
write a program which doubles a binary number, so an input of 10 (with 1 on the first cell, 0
on the second, and all the rest blank) would give output 100. If the machine does not halt
on a particular input then the function is undefined on that input.
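A minimal simulator makes this concrete. The Python sketch below (our illustration; the doubling program is simply a convenient example, not the one alluded to above) runs a deterministic machine whose program maps (state, symbol) pairs to moves and halts on undefined pairs:

# Minimal deterministic Turing machine simulator.  A program maps
# (state, symbol) -> (new_state, new_symbol, move), with move in {-1, +1}.
# Undefined (state, symbol) pairs halt the machine.

def run(program, tape, start="start", blank="_"):
    cells = dict(enumerate(tape))           # sparse tape, infinite in both directions
    head, state = 0, start
    while (state, cells.get(head, blank)) in program:
        state, symbol, move = program[(state, cells.get(head, blank))]
        cells[head] = symbol
        head += move
    low, high = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(low, high + 1)).strip(blank)

# Double a binary number by scanning right past the input and appending a 0.
DOUBLE = {
    ("start", "0"): ("start", "0", +1),
    ("start", "1"): ("start", "1", +1),
    ("start", "_"): ("done",  "0", +1),
}

if __name__ == "__main__":
    print(run(DOUBLE, "10"))   # prints 100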
Alternatively, a Turing machine may be viewed as computing a relation. In that case the
initial symbols on the tape are again the input, and some states are designated "accepting."
If the machine halts in an accepting state, the symbol is accepted; if it halts in any other state,
the symbol is rejected. A slight variation is when all states are accepting, and a symbol
is rejected if the machine never halts (of course, if the only method of determining if the
machine will halt is watching it, then you can never be sure that it won't stop at some point
in the future).
Another way for a Turing machine to compute a relation is to list (enumerate) its members
one by one. A relation is recursively enumerable if there is some Turing machine which can
list it in this way, or equivalently if there is a machine which halts in an accepting state only
on the members of the relation. A relation is recursive if it is recursively enumerable and its
complement is also. An equivalent definition is that there is a Turing machine which halts
in an accepting state only on members of the relation and always halts.
There are many variations on the definition of a Turing machine. The tape could be infinite
in only one direction, having a first cell but no last cell. Even stricter, a tape could move in
only one direction. It could be two (or more) dimensional. There could be multiple tapes,
and some of them could be read only. The cells could have multiple tracks, so that they hold
multiple symbols simultaneously.
The programs mentioned above define only one move for each possible state and symbol
combination; these are called deterministic. Some programs define multiple moves for some
combinations.
If the machine halts whenever there is any series of legal moves which leads to a situation
without moves, the machine is called non-deterministic. The notion is that the machine
guesses which move to use whenever there are multiple choices, and always guesses right.
Yet other machines are probabilistic; when given the choice between different moves they
select one at random.
No matter which of these variations is used, the recursive and recursively enumerable relations
and functions are unchanged (with two exceptions: one of the tapes has to move in two
directions, although it need not be infinite in both directions, and there can only be a finite
number of symbols, states, and tapes): the simplest imaginable machine, with a single
one-way infinite tape and only two symbols, is equivalent to the most elaborate imaginable
array of multidimensional tapes, lucky guesses, and fancy symbols.
However not all these machines can compute at the same speed; the speed-up theorem states
that the number of moves it takes a machine to halt can be divided by an arbitrary constant
(the basic method involves increasing the number of symbols so that each cell encodes several
cells from the original machine; each move of the new machine emulates several moves from
the old one).
In particular, the question "P = NP?", which asks whether an important class of deterministic
machines (those which have a polynomial function of the input length bounding the time it
takes them to halt) is the same as the corresponding class of non-deterministic machines, is
one of the major unsolved problems in modern mathematics.
Version: 2 Owner: Henry Author(s): Henry
Chapter 31
03D20 – Recursive functions and
relations, subrecursive hierarchies
31.1 primitive recursive
The class of primitive recursive functions is the smallest class of functions on the naturals
(from N to N) that
1. Includes
• the zero function: z(x) = 0
• the successor function: s(x) = x + 1
• the projection functions: p_{n,m}(x_1, ..., x_n) = x_m, for m ≤ n
2. Is closed under
• composition: h(x_1, ..., x_n) = f(g_1(x_1, ..., x_n), ..., g_m(x_1, ..., x_n))
• primitive recursion: h(x, 0) = f(x); h(x, y + 1) = g(x, y, h(x, y))
The primitive recursive functions are Turing-computable, but not all Turing-computable
functions are primitive recursive (see Ackermann's function).
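To make the schema concrete, here is a small Python sketch (our illustration, not part of the entry) in which addition and multiplication are built from the base functions by composition and primitive recursion alone:

# Building functions by the primitive recursion schema:
#   h(x, 0)     = f(x)
#   h(x, y + 1) = g(x, y, h(x, y))

def zero(*args):      return 0
def succ(x):          return x + 1
def proj(m):          return lambda *xs: xs[m - 1]       # p_{n,m}

def prim_rec(f, g):
    """Return h defined from f and g by primitive recursion on the last argument."""
    def h(x, y):
        acc = f(x)
        for i in range(y):
            acc = g(x, i, acc)
        return acc
    return h

# addition:       add(x, 0) = x,   add(x, y+1) = succ(add(x, y))
add = prim_rec(proj(1), lambda x, i, acc: succ(acc))
# multiplication: mul(x, 0) = 0,   mul(x, y+1) = add(x, mul(x, y))
mul = prim_rec(zero, lambda x, i, acc: add(x, acc))

if __name__ == "__main__":
    print(add(3, 4), mul(3, 4))   # 7 12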
Further Reading
• “Dave’s Homepage: Primitive Recursive Functions”: http://www.its.caltech.edu/ boozer/symbols/pr.h
• “Primitive recursive functions”: http://public.logica.com/ stepneys/cyc/p/primrec.htm
Version: 2 Owner: akrowne Author(s): akrowne
Chapter 32
03D25 – Recursively (computably)
enumerable sets and degrees
32.1 recursively enumerable
For a language L, the following are equivalent (TFAE):
• There exists a Turing machine T such that for all x, x ∈ L iff the computation T(x) terminates.
• There exists a total recursive function f : N → L which is onto.
• There exists a total recursive function f : N → L which is one-to-one and onto.
A language L fulfilling any (and therefore all) of the above conditions is called recursively
enumerable.
Examples
1. Any recursive language.
2. The set of encodings of Turing machines which halt when given no input.
3. The set of encodings of theorems of Peano arithmetic.
4. The set of integers n for which the hailstone sequence starting at n reaches 1. (We
don't know if this set is recursive, or even if it is all of N; but a trivial program shows it is
recursively enumerable.)
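Example 4 can be made concrete. The Python sketch below (illustrative only; the round limit is just to keep the run finite) enumerates members of the hailstone set by dovetailing, advancing every candidate's computation one step per round so that no single computation can block the others:

# Dovetailed enumeration of {n : the hailstone sequence from n reaches 1}.

def hailstone_steps(n):
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

def enumerate_hailstone(limit_rounds=1000):
    active = {}                                   # n -> its partially run computation
    next_n = 1
    for _ in range(limit_rounds):
        active[next_n] = hailstone_steps(next_n)  # admit a new candidate each round
        next_n += 1
        for n, comp in list(active.items()):
            if next(comp, 1) == 1:                # one more step; 1 means it terminated
                print(n, "is in the set")
                del active[n]

if __name__ == "__main__":
    enumerate_hailstone(50)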
Version: 3 Owner: ariels Author(s): ariels
Chapter 33
03D75 – Abstract and axiomatic
computability and recursion theory
33.1 Ackermann function
Ackermann’s function ¹(r. n) is defined by the recurrence relations
¹(0. n) = n + 1
¹(r + 1. 0) = ¹(r. 1)
¹(r + 1. n + 1) = ¹(r. ¹(r + 1. n))
Ackermann’s function is an example of a recursive function that is not primitive recursive,
but is instead j-recursive (that is, Turing-computable).
Ackermann’s function grows extremely fast. In fact, we find that
¹(0. n) = n + 1
¹(1. n) = 2 + (n + 3) −3
¹(2. n) = 2 (n + 3) −3
¹(3. n) = 2
y+3
−3
¹(4. n) = 2
2

2
−3 (n + 3exponentiations)

... and at this point conventional notation breaks down, and we need to employ something
like Conway notation or Knuth notation for large numbers.
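For small arguments the recurrences can be evaluated directly; the Python sketch below does so with memoization (and even then anything much beyond A(3, ·) is infeasible, which is rather the point):

import sys
from functools import lru_cache

sys.setrecursionlimit(100000)   # the recursion is deep even for small inputs

@lru_cache(maxsize=None)
def A(x, y):
    """Ackermann's function, straight from the recurrence relations."""
    if x == 0:
        return y + 1
    if y == 0:
        return A(x - 1, 1)
    return A(x - 1, A(x, y - 1))

if __name__ == "__main__":
    print([A(2, y) for y in range(5)])   # 3, 5, 7, 9, 11   (= 2(y+3) - 3)
    print(A(3, 3))                       # 61               (= 2^6 - 3)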
Ackermann’s function wasn’t actually written in this form by its namesake, Wilhelm Acker-
mann. Instead, Ackermann found that the .-fold exponentiation of r with n was an example
of a recursive function which was not primitive recursive. Later this was simplified by Rosza
Peter to a function of two variables, similar to the one given above.
Version: 5 Owner: akrowne Author(s): akrowne
33.2 halting problem
The halting problem is to determine, given a particular input to a particular computer
program, whether the program will terminate after a finite number of steps.
The consequences of a solution to the halting problem are far-reaching. Consider some
predicate P(x) regarding natural numbers; suppose we conjecture that P(x) holds for all
x ∈ N. (Goldbach's conjecture, for example, takes this form.) We can write a program
that will count up through the natural numbers and terminate upon finding some n such
that P(n) is false; if the conjecture holds in general, then our program will never terminate.
Then, without running the program, we could pass it along to a halting program to
prove or disprove the conjecture.
In 1936, Alan Turing proved that the halting problem is undecidable; the argument is
presented here informally. Consider a hypothetical program that decides the halting
problem:
Algorithm Halt(P, I)
Input: A computer program P and some input I for P
Output: True if P halts on I and false otherwise
The implementation of the algorithm, as it turns out, is irrelevant. Now consider another
program:
Algorithm Break(x)
Input: An irrelevant parameter x
Output:
begin
  if Halt(Break, x) then
    while true do
      nothing
  else
    Break ← true
end
In other words, we can design a program that will break any solution to the halting problem.
If our halting solution determines that Break halts, then it will immediately enter an infinite
loop; otherwise, Break will return immediately. We must conclude that the Halt program
does not decide the halting problem.
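The same construction can be phrased in Python. The sketch below assumes a hypothetical function halts(program, argument); feeding break_ to the supposed decider yields exactly the contradiction described above:

# Hypothetical: halts(program, argument) is assumed to decide the halting problem.
# No such function can exist; the program below is the reason why.

def halts(program, argument):
    raise NotImplementedError("assumed halting decider; cannot actually be written")

def break_(x):
    # Ask the supposed decider about ourselves, then do the opposite.
    if halts(break_, x):
        while True:       # the decider said we halt, so loop forever
            pass
    else:
        return True       # the decider said we loop, so halt immediately

# Whatever halts(break_, x) answers, break_(x) does the opposite, a contradiction.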
Version: 2 Owner: vampyr Author(s): vampyr
Chapter 34
03E04 – Ordered sets and their
cofinalities; pcf theory
34.1 another definition of cofinality
Let κ be a limit ordinal (e.g. a cardinal). The cofinality of κ, cf(κ), could also be defined as:

cf(κ) = inf{|U| : U ⊆ κ such that sup U = κ}

(sup U is calculated using the natural order of the ordinals). The cofinality of a cardinal is
always a regular cardinal and hence cf(κ) = cf(cf(κ)).
This definition is equivalent to the parent definition.
Version: 5 Owner: x bas Author(s): x bas
34.2 cofinality
If α is an ordinal and A ⊆ α then A is said to be cofinal in α if whenever y ∈ α there is
x ∈ A with y ≤ x.
A map f : α → β between ordinals α and β is said to be cofinal if the image of f is cofinal
in β.
If β is an ordinal, the cofinality cf(β) of β is the least ordinal α such that there is a cofinal
map f : α → β. Note that cf(β) ≤ β, because the identity map on β is cofinal.
It is not hard to show that the cofinality of any ordinal is a cardinal, in fact a regular cardinal:
a cardinal κ is said to be regular if cf(κ) = κ and singular if cf(κ) < κ.
For any infinite cardinal κ it can be shown that κ < κ^cf(κ), and so also κ < cf(2^κ).
Examples
0 and 1 are regular cardinals. All other finite cardinals have cofinality 1 and are therefore
singular.
ℵ_0 is regular.
Any infinite successor cardinal is regular.
The smallest infinite singular cardinal is ℵ_ω. In fact, the map f : ω → ℵ_ω given by f(n) = ω_n
is cofinal, so cf(ℵ_ω) = ℵ_0. Note that cf(2^{ℵ_0}) > ℵ_0, and consequently 2^{ℵ_0} ≠ ℵ_ω.
Version: 14 Owner: yark Author(s): yark, Evandar
34.3 maximal element
Let ≤ be an ordering on a set S, and let A ⊆ S. Then, with respect to the ordering ≤,
• a ∈ A is the least element of A if a ≤ x for all x ∈ A.
• a ∈ A is a minimal element of A if there exists no x ∈ A such that x ≤ a and x ≠ a.
• a ∈ A is the greatest element of A if x ≤ a for all x ∈ A.
• a ∈ A is a maximal element of A if there exists no x ∈ A such that a ≤ x and x ≠ a.
Examples.
• The natural numbers N ordered by divisibility (|) have a least element, 1. The natural
numbers greater than 1 (N ∖ {1}) have no least element, but infinitely many minimal
elements (the primes). In neither case is there a greatest or maximal element.
• The negative integers ordered by the standard definition of ≤ have a maximal element
which is also the greatest element, −1. They have no minimal or least element.
• The natural numbers N ordered by the standard ≤ have a least element, 1, which is
also a minimal element. They have no greatest or maximal element.
• The rationals greater than zero with the standard ordering ≤ have no least element or
minimal element, and no maximal or greatest element.
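For finite ordered sets these notions are easy to compute directly from the definitions; a small Python sketch (our illustration), using divisibility on a finite set of naturals as the order:

# Minimal and maximal elements of a finite set A under a partial order "leq".

def minimal_elements(A, leq):
    return {a for a in A if not any(leq(x, a) and x != a for x in A)}

def maximal_elements(A, leq):
    return {a for a in A if not any(leq(a, x) and x != a for x in A)}

if __name__ == "__main__":
    divides = lambda x, y: y % x == 0
    A = set(range(2, 31))                         # naturals greater than 1, truncated
    print(sorted(minimal_elements(A, divides)))   # the primes up to 30
    print(sorted(maximal_elements(A, divides)))   # elements with no proper multiple in A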
Version: 3 Owner: akrowne Author(s): akrowne
34.4 partitions less than cofinality
If λ < cf(κ) then κ → (κ)^1_λ.
This follows easily from the definition of cofinality. For any coloring f : κ → λ, define
g : λ → κ + 1 by g(α) = |f^{−1}(α)|. Then κ = Σ_{α<λ} g(α), and by the normal rules of cardinal
arithmetic sup_{α<λ} g(α) = κ. Since λ < cf(κ), there must be some α < λ such that g(α) = κ.
Version: 1 Owner: Henry Author(s): Henry
34.5 well ordered set
A well-ordered set is a totally ordered set in which every nonempty subset has a least
member.
An example of a well-ordered set is the set of positive integers with the standard order relation
(Z^+, <), because any nonempty subset of it has a least member. However, R^+ (the positive
reals) is not a well-ordered set with the usual order, because (0, 1) = {x : 0 < x < 1} is a
nonempty subset but it does not contain a least number.
A well-ordering of a set X is the result of defining a binary relation < on X in
such a way that X becomes well-ordered with respect to <.
Version: 9 Owner: drini Author(s): drini, vypertd
34.6 pigeonhole principle
For any natural number n, there does not exist a bijection between n and a proper subset
of n.
The name of the theorem is based upon the observation that pigeons will not occupy a
pigeonhole that already contains a pigeon, so there is no way to fit n pigeons in fewer than
n pigeonholes.
Version: 6 Owner: djao Author(s): djao
34.7 proof of pigeonhole principle
It will first be proven that, if a bijection exists between two finite sets, then the two sets
have the same number of elements.
Let S and T be finite sets and f : S → T be a bijection. Since f is injective, |S| = |ran f|.
Since f is surjective, |T| = |ran f|. Thus, |S| = |T|.
Since the pigeonhole principle is the contrapositive of the proven statement, it follows that
the pigeonhole principle holds.
Version: 2 Owner: Wkbj79 Author(s): Wkbj79
34.8 tree (set theoretic)
In set theory, a tree is defined to be a set T and a relation <_T ⊆ T × T such that:
• <_T is a partial ordering of T
• For any t ∈ T, {s ∈ T | s <_T t} is well-ordered
The nodes immediately greater than a node are termed its children, the node immediately
less is its parent (if it exists), any node less is an ancestor and any node greater is a
descendant. A node with no ancestors is a root.
The partial ordering represents distance from the root, and the well-ordering requirement
prohibits any loops or splits below a node (that is, each node has at most one parent, and
therefore at most one grandparent, and so on). Since there is generally no requirement that
the tree be connected, the null ordering makes any set into a tree, although the tree is a
trivial one, since each element of the set forms a single node with no children.
Since the set of ancestors of any node is well-ordered, we can associate it with an ordinal.
We call this the height, and write: ht(t) = o.t.({s ∈ T | s <_T t}). This all accords with
normal usage: a root has height 0, something immediately above a root has height 1, and
so on. We can then assign a height to the tree itself, which we define to be the least ordinal
greater than the height of any element of the tree. For finite trees this is just one greater
than the height of its tallest element, but infinite trees may not have a tallest element, so
we define ht(T) = sup{ht(t) + 1 | t ∈ T}.
For every α < ht(T) we define the α-th level to be the set T_α = {t ∈ T | ht(t) = α}. So
of course T_0 is the set of all roots of the tree. If α < ht(T) then T(α) is the subtree of elements
with height less than α: t ∈ T(α) ↔ t ∈ T ∧ ht(t) < α.
We call a tree a κ-tree for any cardinal κ if |T| = κ and ht(T) = κ. If κ is finite, the only way
to do this is to have a single branch of length κ.
Version: 6 Owner: Henry Author(s): Henry
34.9 κ-complete
A structured set S (typically a filter or a Boolean algebra) is κ-complete if, given any E ⊆ S
with |E| < κ, ⋂ E ∈ S. It is complete if it is κ-complete for all κ.
Similarly, a partial order is κ-complete if any sequence of fewer than κ elements has an
upper bound within the partial order.
An ℵ_1-complete structure is called countably complete.
Version: 8 Owner: Henry Author(s): Henry
34.10 Cantor’s diagonal argument
One of the starting points in Cantor’s development of set theory was his discovery that there
are different degrees of infinity. The rational numbers, for example, are countably infinite; it
is possible to enumerate all the rational numbers by means of an infinite list. By contrast,
the real numbers are uncountable. it is impossible to enumerate them by means of an
infinite list. These discoveries underlie the idea of cardinality, which is expressed by saying
that two sets have the same cardinality if there exists a bijective correspondence between
them.
In essence, Cantor discovered two theorems: first, that the set of real numbers has the same
cardinality as the power set of the naturals; and second, that a set and its power set have a
different cardinality (see Cantor’s theorem). The proof of the second result is based on the
celebrated diagonalization argument.
Cantor showed that for every given infinite sequence of real numbers x_1, x_2, x_3, ... it is possible to construct a real number x that is not on that list. Consequently, it is impossible to enumerate the real numbers; they are uncountable. No generality is lost if we suppose that all the numbers on the list are between 0 and 1. Certainly, if this subset of the real numbers is uncountable, then the full set is uncountable as well.
Let us write our sequence as a table of decimal expansions:

0 . d_11 d_12 d_13 d_14 ...
0 . d_21 d_22 d_23 d_24 ...
0 . d_31 d_32 d_33 d_34 ...
0 . d_41 d_42 d_43 d_44 ...
⋮

where
x_n = 0.d_n1 d_n2 d_n3 d_n4 ...,
and the expansion avoids an infinite trailing string of the digit 9.
For each n = 1, 2, ... we choose a digit e_n that is different from d_nn and not equal to 9, and consider the real number x with decimal expansion
0.e_1 e_2 e_3 ...
By construction, this number x is different from every member of the given sequence. After all, for every n, the number x differs from the number x_n in the n-th decimal digit. The claim is proven.
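The diagonal choice is easy to carry out mechanically on any finite prefix of such a list. The following Python sketch is an illustration added here (it is not part of the original entry, and the helper name is ad hoc); it picks, for each n, a digit different from d_nn and from 9:

def diagonal_number(expansions, digits=20):
    # expansions: decimal digit strings of numbers in [0, 1)
    out = []
    for n, exp in enumerate(expansions[:digits]):
        d_nn = int(exp[n]) if n < len(exp) else 0
        e_n = (d_nn + 1) % 9          # lands in 0..8, so e_n != d_nn and e_n != 9
        out.append(str(e_n))
    return "0." + "".join(out)

listed = ["1415926535", "7182818284", "4142135623", "3025850929"]
print(diagonal_number(listed))        # differs from every listed expansion on the diagonal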
Version: 6 Owner: rmilson Author(s): rmilson, slider142
34.11 Fodor’s lemma
If κ is a regular, uncountable cardinal, S is a stationary subset of κ, and f : κ → κ is regressive on S (that is, f(α) < α for any α ∈ S), then there is some γ and some stationary S_0 ⊆ S such that f(α) = γ for any α ∈ S_0.
Version: 1 Owner: Henry Author(s): Henry
34.12 Schroeder-Bernstein theorem
Let S and T be sets. If there exists an injection f : S → T and an injection g : T → S, then S and T have the same cardinality.
The Schröder–Bernstein theorem is useful for proving many results about cardinality, since it replaces one hard problem (finding a bijection between S and T) with two generally easier problems (finding two injections).
Version: 2 Owner: vampyr Author(s): vampyr
34.13 Veblen function
The Veblen function is used to obtain larger ordinal numbers than those provided by exponentiation. It builds on a hierarchy of closed and unbounded classes:
• Cr(0) is the additively indecomposable numbers, H
• Cr(α + 1) = Cr(α)′, the set of fixed points of the enumerating function of Cr(α)
• Cr(λ) = ⋂_{α<λ} Cr(α) for limit λ
The Veblen functions φ_α are defined by setting φ_α equal to the enumerating function of Cr(α).
We call a number α strongly critical if α ∈ Cr(α). The class of strongly critical ordinals is written SC, and the enumerating function is written f_SC(α) = Γ_α.
Γ_0, the first strongly critical ordinal, is also called the Feferman–Schütte ordinal.
Version: 1 Owner: Henry Author(s): Henry
34.14 additively indecomposable
An ordinal α is called additively indecomposable if it is not 0 and for any β, γ < α, β + γ < α. The set of additively indecomposable ordinals is denoted H.
Obviously 1 ∈ H, since 0 + 0 < 1. Also ω ∈ H since the sum of two finite numbers is still finite, and no finite numbers other than 1 are in H.
H is closed and unbounded, so the enumerating function of H is normal. In fact, f_H(α) = ω^α.
The derivative f_H′(α) is written ε_α. The number ε_0 = ω^(ω^(ω^⋯)), therefore, is the first fixed point of the series ω, ω^ω, ω^(ω^ω), ....
Version: 1 Owner: Henry Author(s): Henry
34.15 cardinal number
A cardinal number is an ordinal number α with the property that α ⊂ β for every ordinal number β which has the same cardinality as α.
Version: 3 Owner: djao Author(s): rmilson, djao
34.16 cardinal successor
The cardinal successor of a cardinal κ is the least cardinal greater than κ. It is denoted κ⁺.
Version: 1 Owner: yark Author(s): yark
34.17 cardinality
Cardinality is a notion of the size of a set which does not rely on numbers. It is a relative
notion because, for instance, two sets may each have an infinite number of elements, but one
may have a greater cardinality. That is, it may have a ”more infinite” number of elements.
The formal definition of cardinality rests upon the notion of a one-to-one mapping between
sets.
Definition.
Sets A and B have the same cardinality if there is a one-to-one and onto function f from A to B (a bijection). Symbolically, we write |A| = |B|. This is also called equipotence.
Results.
1. A is equipotent to A.
2. If A is equipotent to B, then B is equipotent to A.
3. If A is equipotent to B and B is equipotent to C, then A is equipotent to C.
Proof.
1. The identity function on A is a bijection from A to A.
2. If f is a bijection from A to B, then f⁻¹ exists and is a bijection from B to A.
3. If f is a bijection from A to B and g is a bijection from B to C, then g ∘ f is a bijection from A to C.
Example.
The set of even integers E has the same cardinality as the set of integers Z. We define f : E → Z by f(x) = x/2. Then f is a bijection, therefore |E| = |Z|.
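As a small illustration (a sketch added here, not part of the original entry), the bijection f(x) = x/2 between the even integers and the integers can be checked on any finite window:

# f maps even integers to integers; g is its inverse
f = lambda x: x // 2          # defined on even integers
g = lambda y: 2 * y

evens = [x for x in range(-10, 11) if x % 2 == 0]
assert all(g(f(x)) == x for x in evens)           # f is injective on the evens
assert all(f(g(y)) == y for y in range(-5, 6))    # every integer in the window is hit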
Version: 10 Owner: akrowne Author(s): akrowne
34.18 cardinality of a countable union
Let C be a countable collection of countable sets. Then ⋃C is countable.
Version: 1 Owner: vampyr Author(s): vampyr
34.19 cardinality of the rationals
The set of rational numbers Q is countable, and therefore its cardinality is ℵ_0.
Version: 2 Owner: quadrate Author(s): quadrate
34.20 classes of ordinals and enumerating functions
A class of ordinals is just a subset of the ordinals. For every class of ordinals M there is an enumerating function f_M defined by transfinite recursion:
f_M(α) = min{x ∈ M | f_M(β) < x for all β < α}
This function simply lists the elements of M in order. Note that it is not necessarily defined for all ordinals, although it is defined for a segment of the ordinals. Let otype(M) = dom(f_M) be the order type of M, which is either On or some ordinal α. If α < β then f_M(α) < f_M(β), so f_M is an order isomorphism between otype(M) and M.
We say M is κ-closed if for any N ⊆ M such that |N| < κ, also sup N ∈ M.
We say M is κ-unbounded if for any α < κ there is some β ∈ M such that α < β.
We say a function f : M → On is κ-continuous if M is κ-closed and, for any N ⊆ M with |N| < κ,
f(sup N) = sup{f(α) | α ∈ N}
A function is κ-normal if it is order preserving (α < β implies f(α) < f(β)) and continuous. In particular, the enumerating function of a κ-closed class is always κ-normal.
All these definitions can be easily extended to all ordinals: a class is closed (resp. unbounded) if it is κ-closed (resp. κ-unbounded) for all κ. A function is continuous (resp. normal) if it is κ-continuous (resp. κ-normal) for all κ.
Version: 2 Owner: Henry Author(s): Henry
34.21 club
If κ is a cardinal then a set C ⊆ κ is closed iff for any S ⊆ C and α < κ, if sup(S ∩ α) = α then α ∈ C. (That is, if the limit of some sequence in C is less than κ then the limit is also in C.)
If κ is a cardinal and C ⊆ κ then C is unbounded if, for any α < κ, there is some β ∈ C such that α < β.
If a set is both closed and unbounded then it is a club set.
Version: 1 Owner: Henry Author(s): Henry
34.22 club filter
If κ is a regular uncountable cardinal then club(κ), the filter of all sets containing a club subset of κ, is a κ-complete filter closed under diagonal intersection called the club filter.
To see that this is a filter, note that κ ∈ club(κ) since it is obviously both closed and unbounded. If x ∈ club(κ) then any subset of κ containing x is also in club(κ), since x, and therefore anything containing it, contains a club set.
It is a κ-complete filter because the intersection of fewer than κ club sets is a club set. To see this, suppose ⟨C_i⟩_{i<α} is a sequence of club sets where α < κ. Obviously C = ⋂ C_i is closed, since any sequence which appears in C appears in every C_i, and therefore its limit is also in every C_i. To show that it is unbounded, take some β < κ. Let ⟨β_{1,i}⟩ be an increasing sequence with β_{1,1} > β and β_{1,i} ∈ C_i for every i < α. Such a sequence can be constructed, since every C_i is unbounded. Since α < κ and κ is regular, the limit of this sequence is less than κ. We call it β_2, and define a new sequence ⟨β_{2,i}⟩ similar to the previous sequence. We can repeat this process, getting a sequence of sequences ⟨β_{j,i}⟩ where each element of a sequence is greater than every member of the previous sequences. Then for each i < α, ⟨β_{j,i}⟩_j is an increasing sequence contained in C_i, and all these sequences have the same limit (the limit of ⟨β_{j,i}⟩). This limit is then contained in every C_i, and therefore in C, and is greater than β.
To see that club(κ) is closed under diagonal intersection, let ⟨C_i⟩, i < κ, be a sequence of club sets, and let C = Δ_{i<κ} C_i. Since the diagonal intersection contains the intersection, obviously C is unbounded. Then suppose S ⊆ C and sup(S ∩ α) = α. Then S ∩ α ⊆ C_β for every β < α, and since each C_β is closed, α ∈ C_β, so α ∈ C.
Version: 2 Owner: Henry Author(s): Henry
34.23 countable
A set S is countable if there exists a bijection between S and some subset of N.
All finite sets are countable.
Version: 2 Owner: vampyr Author(s): vampyr
34.24 countably infinite
A set S is countably infinite if there is a bijection between S and N.
As the name implies, any countably infinite set is both countable and infinite.
Countably infinite sets are also sometimes called denumerable.
Version: 3 Owner: vampyr Author(s): vampyr
34.25 finite
A set S is finite if there exists a natural number n and a bijection from S to n. If there exists such an n, then it is unique, and it is called the cardinality of S.
Version: 2 Owner: djao Author(s): djao
34.26 fixed points of normal functions
If f : M → On is a function then Fix(f) = {x ∈ M | f(x) = x} is the set of fixed points of f. The derivative f′ of f is the enumerating function of Fix(f).
If f is κ-normal then Fix(f) is κ-closed and κ-unbounded, and therefore f′ is also κ-normal.
Version: 1 Owner: Henry Author(s): Henry
34.27 height of an algebraic number
Suppose we have an algebraic number such that the polynomial of smallest degree it is a root of (with the coefficients relatively prime) is given by:
∑_{i=0}^{n} a_i x^i
Then the height h of the algebraic number is given by:
h = n + ∑_{i=0}^{n} |a_i|
This is a quantity which is used in the proof of the existence of transcendental numbers.
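A hypothetical helper (an illustration added here, not from the entry) that computes this height from a coefficient list:

from math import gcd
from functools import reduce

def height(coeffs):
    # coeffs = [a_0, a_1, ..., a_n] of the minimal polynomial,
    # assumed relatively prime with a_n != 0; returns h = n + sum |a_i|
    assert reduce(gcd, (abs(c) for c in coeffs)) == 1, "coefficients must be relatively prime"
    n = len(coeffs) - 1
    return n + sum(abs(c) for c in coeffs)

# x^2 - 2 (a root is sqrt(2)):  h = 2 + (2 + 0 + 1) = 5
print(height([-2, 0, 1]))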
REFERENCES
1. Shaw, R. Mathematics Society Notes, 1st edition. King’s School Chester, 2003.
2. Stewart, I. Galois Theory, 3rd edition. Chapman and Hall, 2003.
3. Baker, A. Transcendental Number Theory, 1st edition. Cambridge University Press, 1975.
Version: 13 Owner: kidburla2003 Author(s): kidburla2003
34.28 if A is infinite and B is a finite subset of A, then A ∖ B is infinite
Theorem. If A is an infinite set and B is a finite subset of A, then A ∖ B is infinite.
Proof. The proof is by contradiction. If A ∖ B were finite, there would exist a k ∈ N and a bijection f : {1, ..., k} → A ∖ B. Since B is finite, there also exists a bijection g : {1, ..., l} → B. We can then define a mapping h : {1, ..., k + l} → A by
h(i) = f(i) when i ∈ {1, ..., k}, and h(i) = g(i − k) when i ∈ {k + 1, ..., k + l}.
Since f and g are bijections, h is a bijection between a finite subset of N and A. This is a contradiction since A is infinite. ∎
Version: 3 Owner: mathcam Author(s): matte
34.29 limit cardinal
A limit cardinal is a cardinal κ such that λ⁺ < κ for every cardinal λ < κ. Here λ⁺ denotes the cardinal successor of λ. If 2^λ < κ for every cardinal λ < κ, then κ is called a strong limit cardinal.
Every strong limit cardinal is a limit cardinal, because λ⁺ ≤ 2^λ holds for every cardinal λ. Under GCH, every limit cardinal is a strong limit cardinal because in this case λ⁺ = 2^λ for every infinite cardinal λ.
The three smallest limit cardinals are 0, ℵ_0 and ℵ_ω. Note that some authors do not count 0, or sometimes even ℵ_0, as a limit cardinal.
Version: 7 Owner: yark Author(s): yark
34.30 natural number
Given the Zermelo-Fraenkel axioms of set theory, one can prove that there exists an inductive set X such that ∅ ∈ X. The natural numbers N are then defined to be the intersection of all subsets of X which are inductive sets and contain the empty set as an element.
The first few natural numbers are:
• 0 := ∅
• 1 := 0′ = {0} = {∅}
• 2 := 1′ = {0, 1} = {∅, {∅}}
• 3 := 2′ = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}
Note that the set 0 has zero elements, the set 1 has one element, the set 2 has two elements, etc. Informally, the set n is the set consisting of the n elements 0, 1, ..., n − 1, and n is both a subset of N and an element of N.
In some contexts (most notably, in number theory), it is more convenient to exclude 0 from the set of natural numbers, so that N = {1, 2, 3, ...}. When it is not explicitly specified, one must determine from context whether 0 is being considered a natural number or not.
Addition of natural numbers is defined inductively as follows:
• a + 0 := a for all a ∈ N
• a + b′ := (a + b)′ for all a, b ∈ N
Multiplication of natural numbers is defined inductively as follows:
• a · 0 := 0 for all a ∈ N
• a · b′ := (a · b) + a for all a, b ∈ N
The natural numbers form a monoid under either addition or multiplication. There is an ordering relation on the natural numbers, defined by: a < b if a ⊂ b.
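These inductive definitions can be mirrored directly with hereditarily finite sets. The sketch below is an illustration added here (not part of the entry); it models the successor a′ = a ∪ {a} with Python frozensets and implements addition by recursion on the second argument, exactly as in the definition above:

def successor(a):
    # a' = a ∪ {a}
    return frozenset(a | {a})

zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)

def add(a, b):
    # a + 0 = a;  a + b' = (a + b)'
    if b == zero:
        return a
    pred = max(b, key=len)        # the predecessor of b is its largest element
    return successor(add(a, pred))

assert three == frozenset({zero, one, two})   # 3 = {0, 1, 2}
assert len(add(two, three)) == 5              # a von Neumann natural has exactly n elements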
Version: 11 Owner: djao Author(s): djao
34.31 ordinal arithmetic
Ordinal arithmetic is the extension of normal arithmetic to the transfinite ordinal numbers. The successor operation sx (sometimes written x + 1, although this notation risks confusion with the general definition of addition) is part of the definition of the ordinals, and addition is naturally defined by recursion over this:
• x + 0 = x
• x + sy = s(x + y)
• x + α = sup_{γ<α} (x + γ) for limit α
If x and y are finite then x + y under this definition is just the usual sum; however when x and y become infinite, there are differences. In particular, ordinal addition is not commutative. For example,
ω + 1 = ω + s0 = s(ω + 0) = sω
but
1 + ω = sup_{n<ω} (1 + n) = ω
Multiplication in turn is defined by iterated addition:
• x · 0 = 0
• x · sy = x · y + x
• x · α = sup_{γ<α} (x · γ) for limit α
Once again this definition is equivalent to normal multiplication when x and y are finite, but is not commutative:
ω · 2 = ω · 1 + ω = ω + ω
but
2 · ω = sup_{n<ω} (2 · n) = ω
Both these operations are strictly increasing in the second argument and weakly increasing in the first argument. That is, if α < β then
• γ + α < γ + β
• γ · α < γ · β
• α + γ ≤ β + γ
• α · γ ≤ β · γ
Version: 2 Owner: Henry Author(s): Henry
34.32 ordinal number
An ordinal number is a well ordered set S such that, for every x ∈ S,
x = {y ∈ S | y < x}
(where < is the ordering relation on S).
Version: 2 Owner: djao Author(s): djao
34.33 power set
Definition If X is a set, then the power set of X is the set whose elements are the subsets of X. It is usually denoted as P(X) or 2^X.
1. If X is a finite set, then |2^X| = 2^|X|. This property motivates the notation 2^X.
2. For an arbitrary set X, Cantor's theorem states two things about the power set: First, there is no bijection between X and P(X). Second, the cardinality of 2^X is greater than the cardinality of X.
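For a finite set, the power set and the identity |2^X| = 2^|X| can be checked directly. The following small sketch is an illustration added here (not part of the original entry):

from itertools import combinations

def power_set(X):
    # the set of all subsets of X, as frozensets
    X = list(X)
    return {frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)}

X = {"a", "b", "c"}
P = power_set(X)
assert len(P) == 2 ** len(X)                      # |2^X| = 2^|X|
assert frozenset() in P and frozenset(X) in P     # both extremes are subsets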
Version: 5 Owner: matte Author(s): matte, drini
34.34 proof of Fodor’s lemma
If we let f⁻¹ : κ → P(S) be the inverse of f restricted to S, then Fodor's lemma is equivalent to the claim that for any such function there is some α ∈ κ such that f⁻¹(α) is stationary.
Then if Fodor’s lemma is false, for every α ∈ o there is some club set (
α
such that
(
α
¸
1
−1
(α) = ∅. Let ( = ∆
α<κ
(
α
. The club sets are closed under diagonal intersection,
so ( is also club and therefore there is some α ∈ o
¸
(. Then α ∈ (
β
for each β < α, and
so there can be no β < α such that α ∈ 1
−1
(β), so 1(α) ` α, a contradiction.
Version: 1 Owner: Henry Author(s): Henry
34.35 proof of Schroeder-Bernstein theorem
We first prove as a lemma that for any B ⊂ A, if there is an injection f : A → B, then there is also a bijection h : A → B.
Define a sequence {C_k}_{k=0}^∞ of subsets of A by C_0 = A − B and, for k ≥ 0, C_{k+1} = f(C_k). If the C_k are not pairwise disjoint, then there are minimal integers j and k with j < k and C_j ∩ C_k nonempty. Then k > 0, and so C_k ⊂ B. Since C_0 ∩ B = ∅, we have j > 0. Thus C_j = f(C_{j−1}) and C_k = f(C_{k−1}). By assumption, f is injective, so C_{j−1} ∩ C_{k−1} is nonempty, contradicting the minimality of j. Hence the C_k are pairwise disjoint.
Now let C = ⋃_{k=0}^∞ C_k, and define h : A → B by
h(z) = f(z) if z ∈ C, and h(z) = z if z ∉ C.
If z ∈ C, then h(z) = f(z) ∈ B. But if z ∉ C, then z ∈ B, and so h(z) ∈ B. Hence h is well-defined; h is injective by construction. Let b ∈ B. If b ∉ C, then h(b) = b. Otherwise, b ∈ C_k = f(C_{k−1}) for some k ≥ 1, and so there is some a ∈ C_{k−1} such that h(a) = f(a) = b. Thus h is bijective; in particular, if B = A, then h is simply the identity map on A.
To prove the theorem, suppose f : S → T and g : T → S are injective. Then the composition gf : S → g(T) is also injective. By the lemma, there is a bijection h′ : S → g(T). The injectivity of g implies that g⁻¹ : g(T) → T exists and is bijective. Define h : S → T by h(z) = g⁻¹(h′(z)); this map is a bijection, and so S and T have the same cardinality.
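The bijection produced by this argument can be exhibited computationally. The sketch below is an illustration added here (not part of the original proof); it uses the standard chain-tracing form of the same idea rather than the lemma verbatim, with the sample injections f(n) = 2n and g(n) = 2n + 1 on the natural numbers: an element s goes via f unless tracing preimages backwards ends outside the image of f, in which case it goes via g⁻¹.

def h(s, f, g, f_inv, g_inv):
    x = s
    in_S = True
    while True:
        if in_S:
            t = g_inv(x)
            if t is None:            # chain stops in S \ g(T): use f
                return f(s)
            x, in_S = t, False
        else:
            y = f_inv(x)
            if y is None:            # chain stops in T \ f(S): use g^{-1}
                return g_inv(s)
            x, in_S = y, True

f     = lambda n: 2 * n
g     = lambda n: 2 * n + 1
f_inv = lambda n: n // 2 if n % 2 == 0 else None
g_inv = lambda n: (n - 1) // 2 if n % 2 == 1 else None

values = [h(s, f, g, f_inv, g_inv) for s in range(16)]
assert len(set(values)) == len(values)   # injective on this window of the bijection N -> N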
Version: 13 Owner: mps Author(s): mps
34.36 proof of fixed points of normal functions
Suppose f is a κ-normal function. Consider any α < κ and define a sequence by α_0 = α and α_{n+1} = f(α_n). Let α_ω = sup_{n<ω} α_n. Then, since f is continuous,
f(α_ω) = sup_{n<ω} f(α_n) = sup_{n<ω} α_{n+1} = α_ω
So Fix(f) is unbounded.
Suppose N is a set of fixed points of f with |N| < κ. Then
f(sup N) = sup_{α∈N} f(α) = sup_{α∈N} α = sup N
so sup N is also a fixed point of f, and therefore Fix(f) is closed.
Version: 1 Owner: Henry Author(s): Henry
34.37 proof of the existence of transcendental numbers
Cantor discovered this proof.
Lemma:
Consider a natural number h. Then the number of algebraic numbers of height h is finite.
Proof:
To see this, note that the sum in the definition of height is positive. Therefore n < h, where n is the degree of the polynomial. For a polynomial of degree n, there are only finitely many coefficients, and the sum of their moduli is h − n, so there is only a finite number of ways of choosing them, and hence only finitely many such polynomials, each with finitely many roots. For every polynomial with degree less than n, there are fewer ways. So the sum of all of these is also finite, and this is the number of algebraic numbers with height h (with some repetitions). The result follows.
Proof of the main theorem:
One can start writing a list of the algebraic numbers by putting first all the ones with height 1, then those with height 2, etc., writing them in numerical order within those sets because they are finite sets. This implies that the set of algebraic numbers is countable. However, by diagonalisation, the set of real numbers is uncountable. So there are more real numbers than algebraic numbers; the result follows.
Version: 5 Owner: kidburla2003 Author(s): kidburla2003
34.38 proof of theorems in additively indecomposable
H is closed
Let {α_i | i < κ} be some increasing sequence of elements of H and let α = sup{α_i | i < κ}. Then for any x, y < α, it must be that x < α_i and y < α_j for some i, j < κ. But then x + y < α_{max{i,j}} < α.
H is unbounded
Consider any α, and define a sequence by α_0 = sα and α_{n+1} = α_n + α_n. Let α_ω = sup_{n<ω} α_n be the limit of this sequence. If x, y < α_ω then it must be that x < α_i and y < α_j for some i, j < ω, and therefore x + y < α_{max{i,j}+1}. Note that α_ω is, in fact, the next element of H, since every element in the sequence is clearly additively decomposable.
f_H(α) = ω^α
Since 0 is not in H, we have f_H(0) = 1.
For any α + 1, f_H(α + 1) is the least additively indecomposable number greater than f_H(α). Let α_0 = s f_H(α) and α_{n+1} = α_n + α_n = α_n · 2. Then f_H(α + 1) = sup_{n<ω} α_n = sup_{n<ω} s f_H(α) · 2^n = f_H(α) · ω. The limit case is trivial since H is closed unbounded, so f_H is continuous.
Version: 1 Owner: Henry Author(s): Henry
34.39 proof that the rationals are countable
Suppose we have a rational number α = p/q in lowest terms with q > 0. Define the “height” of this number as h(α) = |p| + q. For example, h(0) = h(0/1) = 1, h(−1) = h(1) = 2, and h(−2) = h(−1/2) = h(1/2) = h(2) = 3. Note that the set of numbers with a given height is finite. The rationals can now be partitioned into classes by height, and the numbers in each class can be ordered by way of increasing numerators. Thus it is possible to assign a natural number to each of the rationals by starting with 0, −1, 1, −2, −1/2, 1/2, 2, −3, ... and progressing through classes of increasing heights. This assignment constitutes a bijection between N and Q and proves that Q is countable.
A corollary is that the irrational numbers are uncountable, since the union of the irrationals and the rationals is R, which is uncountable.
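The enumeration described here is easy to generate mechanically. The following sketch is an illustration added here (not part of the entry); it lists the rationals of each height in turn, ordered within a height by increasing numerator:

from math import gcd

def rationals_by_height(max_height):
    # yield, for each height h = |p| + q, the lowest-terms pairs (p, q) with q > 0
    for h in range(1, max_height + 1):
        group = []
        for q in range(1, h + 1):
            p_abs = h - q
            for p in sorted({p_abs, -p_abs}):
                if gcd(abs(p), q) == 1:
                    group.append((p, q))
        yield sorted(group)

for group in rationals_by_height(3):
    print(group)
# [(0, 1)]
# [(-1, 1), (1, 1)]
# [(-2, 1), (-1, 2), (1, 2), (2, 1)]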
Version: 5 Owner: quadrate Author(s): quadrate
34.40 stationary set
If κ is a cardinal, S ⊆ κ, and S intersects every club in κ then S is stationary. If S is not stationary then it is thin.
Version: 1 Owner: Henry Author(s): Henry
34.41 successor cardinal
A successor cardinal is a cardinal that is the cardinal successor of some cardinal.
Version: 1 Owner: yark Author(s): yark
34.42 uncountable
Definition A set is uncountable if it is not countable. In other words, a set S is uncountable if there is no subset of N with the same cardinality as S.
1. All uncountable sets are infinite. However, the converse is not true. For instance, the
natural numbers and the rational numbers - although infinite - are both countable.
2. The real numbers form an uncountable set. The famous proof of this result is based
on Cantor’s diagonal argument.
Version: 2 Owner: matte Author(s): matte, vampyr
34.43 von Neumann integer
A von Neumann integer is not an integer, but instead a construction of a natural number
using some basic set notation. The von Neumann integers are defined inductively. The
von Neumann integer zero is defined to be the empty set, ∅, and there are no smaller von
Neumann integers. The von Neumann integer ` is then the set of all von Neumann integers
less than `. The set of von Neumann integers is the set of all finite von Neumann ordinals.
This form of construction from very basic notions of sets is applicable to various forms of
set theory (for instance, Zermelo-Fraenkel set theory). While this construction suffices to
define the set of natural numbers, a little more work must be done to define the set of all
integers.
Examples
0 = ∅
1 = {0} = {∅}
2 = {0, 1} = {∅, {∅}}
3 = {0, 1, 2} = {∅, {∅}, {∅, {∅}}}
⋮
N = {0, 1, ..., N − 1}
Version: 3 Owner: mathcam Author(s): mathcam, Logan
34.44 von Neumann ordinal
The von Neumann ordinal is a method of defining ordinals in set theory.
The von Neumann ordinal α is defined to be the well-ordered set containing the von Neumann ordinals which precede α. The set of finite von Neumann ordinals is known as the von Neumann integers. Every well-ordered set is isomorphic to a von Neumann ordinal.
They can be constructed by transfinite recursion as follows:
• The empty set is 0.
• Given any ordinal α, the ordinal α + 1 (the successor of α) is defined to be α ∪ {α}.
• Given a set A of ordinals, ⋃_{a∈A} a is an ordinal.
If an ordinal is the successor of another ordinal, it is a successor ordinal. If an ordinal is neither 0 nor a successor ordinal then it is a limit ordinal. The first limit ordinal is named ω.
The class of ordinals is denoted On.
The von Neumann ordinals have the convenient property that if a < b then a ∈ b and a ⊂ b.
Version: 5 Owner: Henry Author(s): Henry, Logan
34.45 weakly compact cardinal
Weakly compact cardinals are (large) infinite cardinals which have a property related to the syntactic compactness theorem for first order logic. Specifically, for any infinite cardinal κ, consider the language L_{κ,κ}.
This language is identical to first order logic except that:
(a) infinite conjunctions and disjunctions of fewer than κ formulas are allowed
(b) infinite strings of fewer than κ quantifiers are allowed
The weak compactness theorem for L_{κ,κ} states that if Δ is a set of sentences of L_{κ,κ} such that |Δ| = κ and any θ ⊂ Δ with |θ| < κ is consistent, then Δ is consistent.
A cardinal is weakly compact if the weak compactness theorem holds for L_{κ,κ}.
Version: 1 Owner: Henry Author(s): Henry
34.46 weakly compact cardinals and the tree property
A cardinal is weakly compact if and only if it is inaccessible and has the tree property.
Weak compactness implies tree property
Let κ be a weakly compact cardinal and let (T, <_T) be a κ-tree with all levels smaller than κ. We define a theory in L_{κ,κ} with, for each x ∈ T, a constant c_x, and a single unary relation B. Then our theory Δ consists of the sentences:
• ¬[B(c_x) ∧ B(c_y)] for every incompatible x, y ∈ T
• ⋁_{x∈T(α)} B(c_x) for each α < κ
It should be clear that B represents membership in a cofinal branch, since the first class of sentences asserts that no incompatible elements are both in B while the second class states that the branch intersects every level.
Clearly |Δ| = κ, since there are κ elements in T, and hence fewer than κ · κ = κ sentences in the first group, and of course there are κ levels and therefore κ sentences in the second group.
Now consider any Σ ⊆ Δ with |Σ| < κ. Fewer than κ sentences of the second group are included, so the elements x for which the corresponding c_x appears in Σ must all lie in T(α) for some α < κ. But since T has branches of arbitrary height, T(α) ⊨ Σ.
Since κ is weakly compact, it follows that Δ also has a model, and that model obviously has a set of c_x such that B(c_x) whose corresponding elements of T intersect every level and are compatible, therefore forming a cofinal branch of T, proving that T is not Aronszajn.
Version: 4 Owner: Henry Author(s): Henry
34.47 Cantor’s theorem
Let X be any set and P(X) its power set. Cantor's theorem states that there is no bijection between X and P(X). Moreover the cardinality of P(X) is strictly greater than that of X, that is, |X| < |P(X)|.
Version: 2 Owner: igor Author(s): igor
34.48 proof of Cantor’s theorem
The proof of this theorem is fairly simple using the following construction, which is central to Cantor's diagonal argument.
Consider a function f : X → P(X) from X to its power set. Then we define the set Z ⊆ X as follows:
Z = {x ∈ X | x ∉ f(x)}.
Suppose that f is, in fact, a bijection. Then there must exist an x ∈ X such that f(x) = Z. But, by construction, we have the following contradiction:
x ∈ Z ⇔ x ∉ f(x) ⇔ x ∉ Z.
Hence f cannot be a bijection between X and P(X).
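For a finite set the construction can even be checked exhaustively: no matter which function f : X → P(X) is chosen, the set Z is missed. A small sketch, added here for illustration only:

from itertools import combinations, product

def power_set(X):
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

X = frozenset({0, 1, 2})
P = power_set(X)

# every function f : X -> P(X), encoded as a tuple of images
for images in product(P, repeat=len(X)):
    f = dict(zip(sorted(X), images))
    Z = frozenset(x for x in X if x not in f[x])
    assert Z not in f.values()       # Z is never in the range of f
print("no f : X -> P(X) is surjective")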
Version: 2 Owner: igor Author(s): igor
34.49 additive
Let φ be some real function defined on an algebra of sets 𝒜. We say that φ is additive if, whenever A and B are disjoint sets in 𝒜, we have
φ(A ∪ B) = φ(A) + φ(B).
Suppose 𝒜 is a σ-algebra. Then, given any sequence ⟨A_i⟩ of disjoint sets in 𝒜, if we have
φ(⋃ A_i) = ∑ φ(A_i)
we say that φ is countably additive or σ-additive.
Useful properties of an additive set function φ include the following:
1. φ(∅) = 0.
2. If A ⊆ B, then φ(A) ≤ φ(B).
3. If A ⊆ B, then φ(B ∖ A) = φ(B) − φ(A).
4. Given A and B, φ(A ∪ B) + φ(A ∩ B) = φ(A) + φ(B).
Version: 3 Owner: vampyr Author(s): vampyr
34.50 antisymmetric
A relation R on A is antisymmetric iff ∀x, y ∈ A, (xRy ∧ yRx) → (x = y). The number of possible antisymmetric relations on A is 2^n · 3^((n²−n)/2) out of the 2^(n²) total possible relations, where n = |A|.
Antisymmetric is not the same thing as “not symmetric”, as it is possible to have both at the same time. However, a relation R that is both antisymmetric and symmetric has the condition that xRy ⇒ x = y. There are only 2^n such possible relations on A.
An example of an antisymmetric relation on A = {a, b, c} would be R = {(c, c), (b, a), (a, c), (c, b)}. One relation that isn't antisymmetric is R = {(b, a), (c, a), (a, c)} because we have both cRa and aRc, but a ≠ c.
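Antisymmetry of a relation on a finite set can be tested directly, and the counting formula can be verified by brute force. This sketch is an illustration added here (not part of the entry), using the two example relations above:

from itertools import combinations, chain

def is_antisymmetric(R, A):
    return all(not ((x, y) in R and (y, x) in R) or x == y for x in A for y in A)

A = {"a", "b", "c"}
R_good = {("c", "c"), ("b", "a"), ("a", "c"), ("c", "b")}
R_bad  = {("b", "a"), ("c", "a"), ("a", "c")}
assert is_antisymmetric(R_good, A) and not is_antisymmetric(R_bad, A)

# brute-force count over all 2^(n^2) relations on a 3-element set
pairs = [(x, y) for x in A for y in A]
subsets = chain.from_iterable(combinations(pairs, r) for r in range(len(pairs) + 1))
count = sum(1 for R in subsets if is_antisymmetric(set(R), A))
n = len(A)
assert count == 2 ** n * 3 ** ((n * n - n) // 2)    # 8 * 27 = 216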
Version: 4 Owner: xriso Author(s): xriso
34.51 constant function
Definition Suppose X and Y are sets and f : X → Y is a function. Then f is a constant function if f(a) = f(b) for all a, b in X.
Properties
1. The composition of a constant function with any function (for which composition is
defined) is a constant function.
2. A constant map between topological spaces is continuous.
Version: 2 Owner: mathcam Author(s): matte
34.52 direct image
Let f : A → B be a function, and let U ⊂ A be a subset. The direct image of U is the set f(U) ⊂ B consisting of all elements of B which equal f(u) for some u ∈ U.
Version: 4 Owner: djao Author(s): rmilson, djao
34.53 domain
Let R be a binary relation. Then the set of all x such that xRy for some y is called the domain of R. That is, the domain of R is the set of all first coordinates of the ordered pairs in R.
Version: 5 Owner: akrowne Author(s): akrowne
34.54 dynkin system
Let Ω be a set, and P(Ω) be the power set of Ω. A dynkin system on Ω is a set D ⊆ P(Ω) such that
1. Ω ∈ D
2. A, B ∈ D and A ⊆ B ⇒ B ∖ A ∈ D
3. A_n ∈ D, A_n ⊆ A_{n+1}, n ≥ 1 ⇒ ⋃_{k=1}^∞ A_k ∈ D.
Let A ⊆ P(Ω) be a set, and consider
Γ = {X : X is a dynkin system and A ⊆ X}.   (34.54.1)
We define the intersection of all the dynkin systems containing A as
D(A) := ⋂_{X∈Γ} X   (34.54.2)
One can easily verify that D(A) is itself a dynkin system and that it contains A. We call D(A) the dynkin system generated by A. It is the “smallest” dynkin system containing A.
A dynkin system which is also a π-system is a σ-algebra.
Version: 4 Owner: drummond Author(s): drummond
34.55 equivalence class
Let S be a set with an equivalence relation ∼. An equivalence class of S under ∼ is a subset C ⊂ S such that
• If x ∈ C and y ∈ S, then x ∼ y if and only if y ∈ C
• If S is nonempty, then C is nonempty
For x ∈ S, the equivalence class containing x is often denoted by [x], so that
[x] := {y ∈ S | x ∼ y}.
The set of all equivalence classes of S under ∼ is defined to be the set of all subsets of S which are equivalence classes of S under ∼.
For any equivalence relation ∼, the set of all equivalence classes of S under ∼ is a partition of S, and this correspondence is a bijection between the set of equivalence relations on S and the set of partitions of S (consisting of nonempty sets).
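The correspondence between equivalence relations and partitions can be seen concretely by grouping a finite set into classes. The sketch below is an illustration added here (not from the entry), using congruence mod 3:

def equivalence_classes(S, related):
    # partition the finite set S by the relation `related`
    # (assumed reflexive, symmetric and transitive)
    classes = []
    for x in S:
        for cls in classes:
            if related(x, next(iter(cls))):
                cls.add(x)
                break
        else:
            classes.append({x})
    return classes

classes = equivalence_classes(range(10), lambda x, y: x % 3 == y % 3)
print(classes)   # the classes {0,3,6,9}, {1,4,7}, {2,5,8}
assert sum(len(c) for c in classes) == len(set().union(*classes)) == 10   # a partition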
Version: 3 Owner: djao Author(s): djao, rmilson
34.56 fibre
Given a function f : X → Y, a fibre is an inverse image of an element of Y. That is, given y ∈ Y, f⁻¹({y}) = {x ∈ X | f(x) = y} is a fibre.
Example
Define f : R² → R by f(x, y) = x² + y². Then the fibres of f consist of concentric circles about the origin, the origin itself, and empty sets, depending on whether we look at the inverse image of a positive number, zero, or a negative number respectively.
Version: 3 Owner: dublisk Author(s): dublisk
34.57 filtration
A filtration is a sequence of sets A_1, A_2, ..., A_n with
A_1 ⊂ A_2 ⊂ ⋯ ⊂ A_n.
If one considers the sets A_1, ..., A_n as elements of a larger set which are partially ordered
as elements of a larger set which are partially ordered
by inclusion, then a filtration is simply a finite chain with respect to this partial ordering.
It should be noted that in some contexts the word ”filtration” may also be employed to
describe an infinite chain.
Version: 3 Owner: djao Author(s): djao
34.58 finite character
A family F of sets is of finite character if
1. For each A ∈ F, every finite subset of A belongs to F;
2. If every finite subset of a given set A belongs to F, then A belongs to F.
Version: 4 Owner: Koro Author(s): Koro
34.59 fix (transformation actions)
Let A be a set, and T : A → A a transformation of that set. We say that x ∈ A is fixed by T, or that T fixes x, whenever
T(x) = x.
The subset of fixed elements is called the fixed set of T, and is frequently denoted as A^T.
We say that a subset B ⊂ A is fixed by T whenever all elements of B are fixed by T, i.e.
B ⊂ A^T.
If this is so, T restricts to the identity transformation on B.
The definition generalizes readily to a family of transformations with common domain
T_i : A → A, i ∈ I.
In this case we say that a subset B ⊂ A is fixed, if it is fixed by all the elements of the family, i.e. whenever
B ⊂ ⋂_{i∈I} A^{T_i}.
Version: 7 Owner: rmilson Author(s): rmilson
34.60 function
Let A and B be sets. A function f : A → B is a relation f from A to B such that
• For every a ∈ A, there exists b ∈ B such that (a, b) ∈ f.
• If a ∈ A, b_1, b_2 ∈ B, and (a, b_1) ∈ f and (a, b_2) ∈ f, then b_1 = b_2.
For a ∈ A, one usually denotes by f(a) the unique element b ∈ B such that (a, b) ∈ f. The set A is called the domain of f, and the set B is called the codomain.
Version: 5 Owner: djao Author(s): djao
34.61 functional
Definition A functional T is a function mapping a function space (often a vector space) V into a field of scalars F, typically taken to be R or C.
Discussion Examples of functionals include the integral and entropy. A functional T is often indicated by the use of square brackets, T[x] rather than T(x).
The linear functionals are those functionals T that satisfy
• T(x + y) = T(x) + T(y)
• T(cx) = cT(x)
for any c ∈ F, x, y ∈ V.
Version: 4 Owner: mathcam Author(s): mathcam, drummond
34.62 generalized cartesian product
Given any family of sets {A_j}_{j∈J} indexed by an index set J, the generalized cartesian product
∏_{j∈J} A_j
is the set of all functions
f : J → ⋃_{j∈J} A_j
such that f(j) ∈ A_j for all j ∈ J.
For each i ∈ J, the projection map
π_i : ∏_{j∈J} A_j → A_i
is the function defined by
π_i(f) := f(i).
Version: 4 Owner: djao Author(s): djao
34.63 graph
The graph of a function f : X → Y is the subset of X × Y given by {(x, f(x)) : x ∈ X}.
Version: 7 Owner: Koro Author(s): Koro
34.64 identity map
Definition If X is a set, then the identity map on X is the mapping that maps each element in X to itself.
Properties
1. An identity map is always a bijection.
2. Suppose X has two topologies τ_1 and τ_2. Then the identity mapping I : (X, τ_1) → (X, τ_2) is continuous if and only if τ_1 is finer than τ_2, i.e., τ_2 ⊂ τ_1.
3. The identity map on the n-sphere is homotopic to the antipodal map A : S^n → S^n if n is odd [1].
REFERENCES
1. V. Guillemin, A. Pollack, Differential topology, Prentice-Hall Inc., 1974.
Version: 3 Owner: bwebste Author(s): matte
34.65 inclusion mapping
Definition Let X be a subset of Y. Then the inclusion map from X to Y is the mapping
ι : X → Y
    x ↦ x.
In other words, the inclusion map is simply a fancy way to say that every element in X is also an element in Y.
To indicate that a mapping is an inclusion mapping, one usually writes ↪ instead of → when defining or mentioning an inclusion map. This hooked arrow symbol ↪ can be seen as a combination of the symbols ⊂ and →. In the above definition, we have not used this convention. However, examples of this convention would be:
• Let ι : X ↪ Y be the inclusion map from X to Y.
• We have the inclusion S^n ↪ R^{n+1}.
Version: 4 Owner: matte Author(s): matte
34.66 inductive set
An inductive set is a set X with the property that, for every x ∈ X, the successor x′ of x is also an element of X.
One major example of an inductive set is the set of natural numbers N.
Version: 7 Owner: djao Author(s): djao
34.67 invariant
Let A be a set, and T : A → A a transformation of that set. We say that x ∈ A is an invariant of T whenever x is fixed by T:
T(x) = x.
We say that a subset B ⊂ A is invariant with respect to T whenever
T(B) ⊂ B.
If this is so, the restriction of T is a well-defined transformation of the invariant subset:
T|_B : B → B.
The definition generalizes readily to a family of transformations with common domain
T_i : A → A, i ∈ I.
In this case we say that a subset is invariant, if it is invariant with respect to all elements of the family.
Version: 5 Owner: rmilson Author(s): rmilson
34.68 inverse function theorem
Let f be a continuously differentiable, vector-valued function mapping the open set E ⊂ R^n to R^n and let S = f(E). If, for some point a ∈ E, the Jacobian, |J_f(a)|, is non-zero, then there is a uniquely defined function g and two open sets X ⊂ E and Y ⊂ S such that
1. a ∈ X, f(a) ∈ Y;
2. Y = f(X);
3. f : X → Y is one-one;
4. g is continuously differentiable on Y and g(f(x)) = x for all x ∈ X.
Simplest case
When n = 1, this theorem becomes: Let f be a continuously differentiable, real-valued function defined on the open interval I. If for some point a ∈ I, f′(a) ≠ 0, then there is a neighbourhood [α, β] of a in which f is strictly monotonic. Then y ↦ f⁻¹(y) is a continuously differentiable, strictly monotonic function from [f(α), f(β)] to [α, β]. If f is increasing (or decreasing) on [α, β], then so is f⁻¹ on [f(α), f(β)].
Note
The inverse function theorem is a special case of the implicit function theorem where the dimension of each variable is the same.
Version: 6 Owner: vypertd Author(s): vypertd
34.69 inverse image
Let f : A → B be a function, and let U ⊂ B be a subset. The inverse image of U is the set f⁻¹(U) ⊂ A consisting of all elements a ∈ A such that f(a) ∈ U.
The inverse image commutes with all set operations: For any collection {U_i}_{i∈I} of subsets of B, we have the following identities for
1. unions:
f⁻¹(⋃_{i∈I} U_i) = ⋃_{i∈I} f⁻¹(U_i)
2. intersections:
f⁻¹(⋂_{i∈I} U_i) = ⋂_{i∈I} f⁻¹(U_i)
and for any subsets U and V of B, we have identities for
3. complements:
(f⁻¹(U))^∁ = f⁻¹(U^∁)
4. set differences:
f⁻¹(U ∖ V) = f⁻¹(U) ∖ f⁻¹(V)
5. symmetric differences:
f⁻¹(U △ V) = f⁻¹(U) △ f⁻¹(V)
In addition, for X ⊂ A and Y ⊂ B, the inverse image satisfies the miscellaneous identities
6. (f|_X)⁻¹(Y) = X ∩ f⁻¹(Y)
7. f(f⁻¹(Y)) = Y ∩ f(A)
8. X ⊂ f⁻¹(f(X)), with equality if f is injective.
Version: 5 Owner: djao Author(s): djao, rmilson
34.70 mapping
Synonym of function, although typical usage suggests that mapping is the more generic term.
In a geometric context, the term function is often employed to connote a mapping whose
purpose is to assign values to the elements of its domain, i.e. a function defines a field of
values, whereas mapping seems to have a more geometric connotation, as in a mapping of
one space to another.
Version: 8 Owner: rmilson Author(s): rmilson
34.71 mapping of period n is a bijection
Theorem Suppose X is a set. Then a mapping f : X → X of period n is a bijection.
Proof. If n = 1, the claim is trivial; f is the identity mapping. Suppose n = 2, 3, .... Then for any x ∈ X, we have x = f(f^{n−1}(x)), so f is a surjection. To see that f is an injection, suppose f(x) = f(y) for some x, y in X. Since f^n is the identity, it follows that x = y. ∎
Version: 3 Owner: Koro Author(s): matte
34.72 partial function
A function f : A → B is sometimes called a total function, to signify that f(a) is defined for every a ∈ A. If C is any set such that C ⊇ A then f is also a partial function from C to B.
Clearly if f is a function from A to B then it is a partial function from A to B, but a partial function need not be defined for every element of its domain.
Version: 6 Owner: Henry Author(s): Henry
34.73 partial mapping
Let X_1, ..., X_n and Y be sets, and let f be a function of n variables: f : X_1 × X_2 × ⋯ × X_n → Y. Fix x_i ∈ X_i for 2 ≤ i ≤ n. The induced mapping a ↦ f(a, x_2, ..., x_n) is called the partial mapping determined by f corresponding to the first variable.
In the case where n = 2, the map defined by a ↦ f(a, x) is often denoted f(·, x). Further, any function f : X_1 × X_2 → Y determines a mapping from X_1 into the set of mappings of X_2 into Y, namely f̃ : x ↦ (y ↦ f(x, y)). The converse holds too, and it is customary to identify f with f̃. Many of the “canonical isomorphisms” that we come across (e.g. in multilinear algebra) are illustrations of this kind of identification.
Version: 2 Owner: mathcam Author(s): mathcam, Larry Hammick
34.74 period of mapping
Definition Suppose X is a set and f is a mapping f : X → X. If f^n is the identity mapping on X for some n = 1, 2, ..., then f is said to be a mapping of period n. Here, the notation f^n means the n-fold composition f ∘ ⋯ ∘ f.
Examples
1. A mapping f is of period 1 if and only if f is the identity mapping.
2. Suppose V is a vector space. Then a linear involution L : V → V is a mapping of period 2. For example, the reflection mapping x ↦ −x is a mapping of period 2.
3. In the complex plane, the mapping z ↦ e^{−2πi/n} z is a mapping of period n for n = 1, 2, ....
4. Let us consider the function space spanned by the trigonometric functions sin and cos. On this space, the derivative is a mapping of period 4.
Properties
1. Suppose X is a set. Then a mapping f : X → X of period n is a bijection. (proof.)
2. Suppose X is a topological space. Then a continuous mapping f : X → X of period n is a homeomorphism.
Version: 8 Owner: bwebste Author(s): matte
34.75 pi-system
Let Ω be a set, and P(Ω) be the power set of Ω. A π-system (or pi-system) on Ω is a set F ⊆ P(Ω) such that
A, B ∈ F ⇒ A ∩ B ∈ F.   (34.75.1)
A π-system is closed under finite intersection.
Version: 1 Owner: drummond Author(s): drummond
34.76 proof of inverse function theorem
Since det Df(a) ≠ 0 the Jacobian matrix Df(a) is invertible: let A = (Df(a))⁻¹ be its inverse. Choose r > 0 and ρ > 0 such that
B = B_ρ(a) ⊂ E,
‖Df(x) − Df(a)‖ ≤ 1/(2n‖A‖) for all x ∈ B,
r ≤ ρ/(2‖A‖).
Let y ∈ B_r(f(a)) and consider the mapping
T_y : B → R^n,   T_y(x) = x + A·(y − f(x)).
If x ∈ B we have
‖DT_y(x)‖ = ‖I − A·Df(x)‖ ≤ ‖A‖·‖Df(a) − Df(x)‖ ≤ 1/(2n).
Let us verify that T_y is a contraction mapping. Given x_1, x_2 ∈ B, by the mean-value theorem on R^n we have
|T_y(x_1) − T_y(x_2)| ≤ sup_{x∈[x_1,x_2]} n‖DT_y(x)‖ · |x_1 − x_2| ≤ (1/2)|x_1 − x_2|.
Also notice that T_y(B) ⊂ B. In fact, given x ∈ B,
|T_y(x) − a| ≤ |T_y(x) − T_y(a)| + |T_y(a) − a| ≤ (1/2)|x − a| + |A·(y − f(a))| ≤ ρ/2 + ‖A‖·r ≤ ρ.
So T_y : B → B is a contraction mapping and hence by the contraction principle there exists one and only one solution to the equation
T_y(x) = x,
i.e. x is the only point in B such that f(x) = y.
Hence given any y ∈ B_r(f(a)) we can find x ∈ B which solves f(x) = y. Let us call g : B_r(f(a)) → B the mapping which gives this solution, i.e.
f(g(y)) = y.
Let V = B_r(f(a)) and U = g(V). Clearly f : U → V is one to one and the inverse of f is g. We have to prove that U is a neighbourhood of a. However since f is continuous at a we know that there exists a ball B_δ(a) such that f(B_δ(a)) ⊂ B_r(f(a)) and hence we have B_δ(a) ⊂ U.
We now want to study the differentiability of g. Let y ∈ V be any point, take u ∈ R^n and ε > 0 so small that y + εu ∈ V. Let x = g(y) and define v(ε) = g(y + εu) − g(y).
First of all notice that, since
|T_y(x + v(ε)) − T_y(x)| ≤ (1/2)|v(ε)|,
we have
(1/2)|v(ε)| ≥ |v(ε) − εA·u| ≥ |v(ε)| − ε‖A‖·|u|
and hence
|v(ε)| ≤ 2ε‖A‖·|u|.
On the other hand we know that f is differentiable at x, that is, for all v it holds
f(x + v) − f(x) = Df(x)·v + h(v)
with lim_{v→0} h(v)/|v| = 0. So we get
|h(v(ε))|/ε ≤ 2‖A‖·|u| · |h(v(ε))|/|v(ε)| → 0 as ε → 0.
So
lim_{ε→0} (g(y + εu) − g(y))/ε = lim_{ε→0} v(ε)/ε = lim_{ε→0} Df(x)⁻¹·(εu − h(v(ε)))/ε = Df(x)⁻¹·u
that is
Dg(y) = Df(x)⁻¹.
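The contraction mapping T_y at the heart of this proof can also be used numerically: iterating T_y converges to the solution of f(x) = y near a. The sketch below is an illustration added here under assumptions chosen for the example (it is not part of the original proof), for f(x, y) = (x + y², y + x²) near a = (0, 0):

import numpy as np

def f(v):
    x, y = v
    return np.array([x + y**2, y + x**2])

def Df(v):
    x, y = v
    return np.array([[1.0, 2*y], [2*x, 1.0]])

a = np.array([0.0, 0.0])
A = np.linalg.inv(Df(a))            # A = Df(a)^(-1)

def invert(y_target, steps=40):
    # iterate the contraction T_y(x) = x + A (y - f(x))
    x = a.copy()
    for _ in range(steps):
        x = x + A @ (y_target - f(x))
    return x

y = np.array([0.05, 0.02])
x = invert(y)
assert np.allclose(f(x), y, atol=1e-10)   # f(x) is (approximately) y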
Version: 2 Owner: paolini Author(s): paolini
34.77 proper subset
Let S be a set and let X ⊂ S be a subset. We say X is a proper subset of S if X ≠ S.
Version: 2 Owner: djao Author(s): djao
34.78 range
Let R be a binary relation. Then the set of all y such that xRy for some x is called the range of R. That is, the range of R is the set of all second coordinates in the ordered pairs of R.
In terms of functions, this means that the range of a function is the full set of values it can
take on (the outputs), given the full set of parameters (the inputs). Note that the range is
a subset of the codomain.
Version: 2 Owner: akrowne Author(s): akrowne
34.79 reflexive
A relation R on A is reflexive if and only if ∀a ∈ A, aRa. The number of possible reflexive relations on A is 2^(n²−n), out of the 2^(n²) total possible relations, where n = |A|.
For example, let A = {1, 2, 3} and let R be a relation on A. Then R = {(1, 1), (2, 2), (3, 3), (1, 3), (3, 2)} would be a reflexive relation, because it contains all the pairs (a, a), a ∈ A. However, R = {(1, 1), (2, 2), (2, 3), (3, 1)} is not reflexive because it would also have to contain (3, 3).
Version: 6 Owner: xriso Author(s): xriso
34.80 relation
A relation is any subset of a cartesian product of two sets A and B. That is, any R ⊂ A × B is a binary relation. One may write aRb to denote a ∈ A, b ∈ B and (a, b) ∈ R. A subset of A × A is simply called a relation on A.
An example of a relation is the less-than relation on integers, i.e. < ⊂ Z × Z. (1, 2) ∈ <, but (2, 1) ∉ <.
Version: 3 Owner: Logan Author(s): Logan
34.81 restriction of a mapping
Definition Let f : X → Y be a mapping from a set X to a set Y. If A is a subset of X, then the restriction of f to A is the mapping
f|_A : A → Y
       a ↦ f(a).
Version: 2 Owner: matte Author(s): matte
34.82 set difference
Let A and B be sets in some ambient set X. The set difference, or simply difference, between A and B (in that order) is the set of all elements that are contained in A, but not in B. This set is denoted by A ∖ B, and we have
A ∖ B = {x ∈ X | x ∈ A, x ∉ B} = A ∩ B^∁,
where B^∁ is the complement of B in X.
Remark
Sometimes the set difference is also written as A − B. However, if A and B are sets in a vector space, then A − B is commonly used to denote the set
A − B = {a − b | a ∈ A, b ∈ B},
which, in general, is not the same as the set difference of A and B. Therefore, to avoid confusion, one should try to avoid the notation A − B for the set difference.
Version: 5 Owner: matte Author(s): matte, quadrate
34.83 symmetric
A relation R on A is symmetric iff ∀x, y ∈ A, xRy → yRx. The number of possible symmetric relations on A is 2^((n²+n)/2) out of the 2^(n²) total possible relations, where n = |A|.
An example of a symmetric relation on A = {a, b, c} would be R = {(a, a), (a, b), (b, a), (a, c), (c, a)}. One relation that is not symmetric is R = {(b, b), (a, b), (b, a), (c, b)}, because since we have (c, b) we must also have (b, c) in order to be symmetric.
Version: 6 Owner: xriso Author(s): xriso
34.84 symmetric difference
The symmetric difference between two sets A and B, written A △ B, is the set of all x such that either x ∈ A or x ∈ B but not both. It is equal to (A − B) ∪ (B − A) and to (A ∪ B) − (A ∩ B).
The symmetric difference operator is commutative since A △ B = (A − B) ∪ (B − A) = (B − A) ∪ (A − B) = B △ A.
The operation is also associative. To see this, consider three sets A, B, and C. Any given element x is in zero, one, two, or all three of these sets. If x is not in any of A, B, or C, then it is not in the symmetric difference of the three sets no matter how it is computed. If x is in one of the sets, let that set be A; then x ∈ A △ B and x ∈ (A △ B) △ C; also, x ∉ (B △ C) and therefore x ∈ A △ (B △ C). If x is in two of the sets, let them be A and B; then x ∉ A △ B and x ∉ (A △ B) △ C; also, x ∈ B △ C, but because x is in A, x ∉ A △ (B △ C). If x is in all three, then x ∉ A △ B but x ∈ (A △ B) △ C; similarly, x ∉ B △ C but x ∈ A △ (B △ C). Thus, A △ (B △ C) = (A △ B) △ C.
In general, an element will be in the symmetric difference of several sets iff it is in an odd number of the sets.
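Python's set type implements the symmetric difference as the ^ operator, so both the associativity and the parity rule can be checked directly. This sketch is an illustration added here, not part of the entry:

from functools import reduce

A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {1, 4, 5, 6}

assert (A ^ B) ^ C == A ^ (B ^ C)                       # associativity
assert A ^ B == (A - B) | (B - A) == (A | B) - (A & B)  # the two descriptions agree

# an element lies in the symmetric difference of several sets
# iff it lies in an odd number of them
sets = [A, B, C]
sym = reduce(set.__xor__, sets)
for x in A | B | C:
    assert (x in sym) == (sum(x in s for s in sets) % 2 == 1)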
Version: 5 Owner: mathcam Author(s): mathcam, quadrate
34.85 the inverse image commutes with set operations
Theorem. Let f be a mapping from X to Y. If {B_i}_{i∈I} is a (possibly uncountable) collection of subsets in Y, then the following relations hold for the inverse image:
(1) f⁻¹(⋃_{i∈I} B_i) = ⋃_{i∈I} f⁻¹(B_i)
(2) f⁻¹(⋂_{i∈I} B_i) = ⋂_{i∈I} f⁻¹(B_i)
If A and B are subsets in Y, then we also have:
(3) For the set complement,
(f⁻¹(A))^∁ = f⁻¹(A^∁).
(4) For the set difference,
f⁻¹(A ∖ B) = f⁻¹(A) ∖ f⁻¹(B).
(5) For the symmetric difference,
f⁻¹(A △ B) = f⁻¹(A) △ f⁻¹(B).
Proof. For part (1), we have
f⁻¹(⋃_{i∈I} B_i) = {x ∈ X | f(x) ∈ ⋃_{i∈I} B_i}
= {x ∈ X | f(x) ∈ B_i for some i ∈ I}
= ⋃_{i∈I} {x ∈ X | f(x) ∈ B_i}
= ⋃_{i∈I} f⁻¹(B_i).
Similarly, for part (2), we have
f⁻¹(⋂_{i∈I} B_i) = {x ∈ X | f(x) ∈ ⋂_{i∈I} B_i}
= {x ∈ X | f(x) ∈ B_i for all i ∈ I}
= ⋂_{i∈I} {x ∈ X | f(x) ∈ B_i}
= ⋂_{i∈I} f⁻¹(B_i).
For the set complement, suppose x ∉ f⁻¹(A). This is equivalent to f(x) ∉ A, or f(x) ∈ A^∁, which is equivalent to x ∈ f⁻¹(A^∁). Since the set difference A ∖ B can be written as A ∩ B^∁, part (4) follows from parts (2) and (3). Similarly, since A △ B = (A ∖ B) ∪ (B ∖ A), part (5) follows from parts (1) and (4). ∎
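These identities are easy to check on a concrete finite example. The following sketch is an illustration added here (not part of the original proof); it verifies (1)–(5) for one particular map:

def preimage(f, dom, U):
    return {x for x in dom if f(x) in U}

X = range(12)
Y = range(6)
f = lambda x: x % 6                       # a sample map X -> Y

A, B = {0, 1, 2, 3}, {2, 3, 4}
assert preimage(f, X, A | B) == preimage(f, X, A) | preimage(f, X, B)   # (1)
assert preimage(f, X, A & B) == preimage(f, X, A) & preimage(f, X, B)   # (2)
assert set(X) - preimage(f, X, A) == preimage(f, X, set(Y) - A)         # (3)
assert preimage(f, X, A - B) == preimage(f, X, A) - preimage(f, X, B)   # (4)
assert preimage(f, X, A ^ B) == preimage(f, X, A) ^ preimage(f, X, B)   # (5)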
Version: 8 Owner: matte Author(s): matte
34.86 transformation
Synonym of mapping and function. Often used to refer to mappings where the domain and codomain are the same set, i.e. one can compose a transformation with itself. For example, when one speaks of a transformation of a space, one refers to some deformation of that space.
Version: 3 Owner: rmilson Author(s): rmilson
34.87 transitive
Let A be a set. A is said to be transitive if whenever x ∈ A then x ⊆ A.
Equivalently, A is transitive if whenever x ∈ A and y ∈ x then y ∈ A.
Version: 1 Owner: Evandar Author(s): Evandar
34.88 transitive
A relation R on A is transitive if and only if ∀x, y, z ∈ A, (xRy ∧ yRz) → (xRz).
For example, the “is a subset of” relation ⊆ between sets is transitive. The “is not equal to” relation ≠ between integers is not transitive. If we assign to our definition x = 5, y = 42, and z = 5, we know that both 5 ≠ 42 (x ≠ y) and 42 ≠ 5 (y ≠ z). However, 5 = 5 (x = z), so ≠ is not transitive.
Version: 5 Owner: xriso Author(s): xriso
34.89 transitive closure
The transitive closure of a set X is the smallest transitive set tc(X) such that X ⊆ tc(X).
The transitive closure of a set can be constructed as follows:
Define a function f on ω by f(0) = X and f(n + 1) = ⋃ f(n). Then
tc(X) = ⋃_{n<ω} f(n).
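With hereditarily finite sets modelled as nested frozensets, this construction terminates after finitely many steps. The sketch below is an illustration added here, not from the entry:

def transitive_closure(X):
    # tc(X) = X ∪ ⋃X ∪ ⋃⋃X ∪ ...  (stops once nothing new is added)
    tc = set(X)
    frontier = set(X)
    while frontier:
        frontier = {y for x in frontier if isinstance(x, frozenset) for y in x} - tc
        tc |= frontier
    return tc

empty = frozenset()
X = frozenset({frozenset({empty}), frozenset({frozenset({empty}), empty})})
# every element of an element of X lies in tc(X)
assert all(y in transitive_closure(X) for x in X for y in x)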
Version: 1 Owner: Henry Author(s): Henry
34.90 Hausdorff’s maximum principle
Theorem Let X be a partially ordered set. Then there exists a maximal totally ordered subset of X.
Hausdorff's maximum principle is one of the many theorems equivalent to the axiom of choice. The proof below uses Zorn's lemma, which is also equivalent to the axiom of choice.
Proof. Let S be the set of all totally ordered subsets of X. S is not empty, since the empty set is an element of S. Given a chain τ in S (ordered by inclusion), the union of all the elements of τ is again an element of S, as is easily verified. This shows that S, ordered by inclusion, is inductive. The result now follows from Zorn's lemma. ∎
Version: 3 Owner: matte Author(s): matte, cryo
34.91 Kuratowski’s lemma
Any chain in an ordered set is contained in a maximal chain.
This proposition is equivalent to the axiom of choice.
Version: 2 Owner: Koro Author(s): Koro
34.92 Tukey’s lemma
Each nonempty family of finite character has a maximal element.
Here, by a maximal element we mean a maximal element with respect to the inclusion ordering: A ≤ B iff A ⊆ B. This lemma is equivalent to the axiom of choice.
Version: 3 Owner: Koro Author(s): Koro
34.93 Zermelo’s postulate
If F is a disjoint family of nonempty sets, then there is a set C which has exactly one element of each A ∈ F (i.e. such that A ∩ C is a singleton for each A ∈ F).
This is one of the many propositions which are equivalent to the axiom of choice.
Version: 2 Owner: Koro Author(s): Koro
34.94 Zermelo’s well-ordering theorem
If A is any set whatsoever, then there exists a well-ordering of A. The well-ordering theorem
is equivalent to the axiom of choice.
Version: 2 Owner: vypertd Author(s): vypertd
34.95 Zorn’s lemma
Let X be a partially ordered set, and suppose that every chain in X has an upper bound. Then X has a maximal element x, in the sense that there is no y ∈ X with x < y.
Zorn's lemma is equivalent to the axiom of choice.
Version: 3 Owner: Evandar Author(s): Evandar
34.96 axiom of choice
Let C be a collection of nonempty sets. Then there exists a function f with domain C such that f(x) ∈ x for all x ∈ C. f is sometimes called a choice function on C.
The axiom of choice is commonly (although not universally) accepted with the axioms of Zermelo-Fraenkel set theory. The axiom of choice is equivalent to the well-ordering principle and to Zorn's lemma.
The axiom of choice is sometimes called the multiplicative axiom, as it is equivalent to the proposition that a product of cardinals is zero if and only if one of the factors is zero.
Version: 5 Owner: vampyr Author(s): vampyr
34.97 equivalence of Hausdorff’s maximum principle,
Zorn’s lemma and the well-ordering theorem
Hausdorff’s maximum principle implies Zorn’s lemma. Consider a partially ordered set
A, where every chain has an upper bound. According to the maximum principle there exists
a maximal totally ordered subset ) ⊆ A. This then has an upper bound, r. If r is not
the largest element in ) then ¦r¦
¸
) would be a totally ordered set in which ) would be
properly contained, contradicting the definition. Thus r is a maximal element in A.
Zorn’s lemma implies the well-ordering theorem. Let A be any non-empty set, and
let A be the collection of pairs (¹. <), where ¹ ⊆ A and < is a well-ordering on ¹. Define
a relation _, on A so that for all r. n ∈ A : r _ n iff r equals an initial of n. It is easy
to see that this defines a partial order relation on A (it inherits reflexibility, anti symmetry
and transitivity from one set being an initial and thus a subset of the other).
For each chain ( ⊆ A, define (
t
= (1. <
t
) where R is the union of all the sets ¹ for all
(¹. <) ∈ (, and <
t
is the union of all the relations < for all (¹. <) ∈ (. It follows that (
t
252
is an upper bound for ( in A.
According to Zorn’s lemma, A now has a maximal element, (`. <
M
). We postulate that `
contains all members of A, for if this were not true we could for any c ∈ A −` construct
(`

. <

) where `

= `
¸
¦c¦ and <

is extended so o
a
(`

) = `. Clearly <

then defines
a well-order on `

, and (`

. <

) would be larger than (`. <
M
) contrary to the definition.
Since ` contains all the members of A and <
M
is a well-ordering of `, it is also a well-
ordering on A as required.
The well-ordering theorem implies Hausdorff’s maximum principle. Let (A. _)
be a partially ordered set, and let < be a well-ordering on A. We define the function φ by
transfinite recursion over (A. <) so that
φ(c) =

¦c¦ i1¦c¦
¸¸
b<a
φ(/) is totally ordered under _ .
∅ otherwise.
.
It follows that
¸
x∈X
φ(r) is a maximal totally ordered subset of A as required.
Version: 4 Owner: mathcam Author(s): mathcam, cryo
34.98 equivalence of Zorn’s lemma and the axiom of
choice
Let X be a set partially ordered by < such that each chain has an upper bound. Equate each x ∈ X with p(x) = {y ∈ X | x < y} ⊆ P(X). Let p(X) = {p(x) | x ∈ X}. If p(x) = ∅ then it follows that x is maximal.
Suppose no p(x) = ∅. Then by the axiom of choice there is a choice function f on p(X), and since for each p(x) we have f(p(x)) ∈ p(x), it follows that x < f(p(x)). Define f_α(p(x)) for all ordinals α by transfinite induction:
f_0(p(x)) = p(x)
f_{α+1}(p(x)) = f(p(x))
And for a limit ordinal α, let f_α(p(x)) be the upper bound of the f_i(p(x)) for i < α.
This construction can go on forever, for any ordinal. Then we can easily construct a surjective function from X to Ord by g(α) = f_α(x). But that requires that X be a proper class, in contradiction to the fact that it is a set. So there can be no such choice function, and there must be a maximal element of X.
For the reverse, assume Zorn's lemma and let C be any set of non-empty sets. Consider the set of functions F = {f | ∀a ∈ dom(f)(a ∈ C ∧ f(a) ∈ a)} partially ordered by inclusion. Then the union of any chain in F is also a member of F (since the union of a chain of functions is always a function). By Zorn's lemma, F has a maximal element f, and since any function with domain smaller than C can easily be expanded, dom(f) = C, and so f is a choice function for C.
Version: 2 Owner: Henry Author(s): Henry
34.99 maximality principle
Let S be a collection of sets. If, for each chain C ⊆ S, there exists an X ∈ S such that every element of C is a subset of X, then S contains a maximal element. This is known as the maximality principle.
The maximality principle is equivalent to the axiom of choice.
Version: 4 Owner: akrowne Author(s): akrowne
34.100 principle of finite induction
Let S be a set of positive integers with the properties
1. 1 belongs to S, and
2. whenever the integer k is in S, then the next integer k + 1 must also be in S.
Then S is the set of all positive integers.
The Second Principle of Finite Induction would replace (2) above with
2'. If k is a positive integer such that 1, 2, ..., k belong to S, then k + 1 must also be in S.
The Principle of Finite Induction is a consequence of the well-ordering principle.
Version: 3 Owner: KimJ Author(s): KimJ
34.101 principle of finite induction proven from well-
ordering principle
Let T be the set of all positive integers not in S. Assume T is nonempty. The well-ordering principle says T contains a least element; call it a. Since 1 ∈ S, we have a > 1, hence 0 < a − 1 < a. The choice of a as the smallest element of T means a − 1 is not in T, and hence is in S. But then (a − 1) + 1 is in S, which forces a ∈ S, contradicting a ∈ T. Hence T is empty, and S is all positive integers.
Version: 4 Owner: KimJ Author(s): KimJ
34.102 proof of Tukey’s lemma
Let S be a set and F a set of subsets of S such that F is of finite character. By Zorn's lemma, it is enough to show that F is inductive. For that, it will be enough to show that if (F_i)_{i∈I} is a family of elements of F which is totally ordered by inclusion, then the union U of the F_i is an element of F as well (since U is an upper bound on the family (F_i)). So, let E be a finite subset of U. Each element of E is in F_i for some i ∈ I. Since E is finite and the F_i are totally ordered by inclusion, there is some j ∈ I such that all elements of E are in F_j. That is, E ⊂ F_j. Since F is of finite character, we get U ∈ F, QED.
Version: 1 Owner: Koro Author(s): Larry Hammick
34.103 proof of Zermelo’s well-ordering theorem
Let X be any set and let f be a choice function on P(X) \ {∅}. Define a function i by
transfinite recursion on the class of ordinals as follows:
i(β) = f(X − ⋃_{γ<β} {i(γ)}), unless X − ⋃_{γ<β} {i(γ)} = ∅ or i(γ) is undefined for some γ < β
(the function is undefined if either of the unless clauses holds).
Thus i(0) is just f(X) (the least element of X), and i(1) = f(X − {i(0)}) (the least element
of X other than i(0)).
Define by the axiom of replacement B = i^{-1}[X] = {γ | i(γ) = x for some x ∈ X}. Since B is
a set of ordinals, it cannot contain all the ordinals (by the Burali-Forti paradox).
Since the ordinals are well ordered, there is a least ordinal α not in B, and therefore i(α) is
undefined. It cannot be that the second unless clause holds (since α is the least such ordinal),
so it must be that X − ⋃_{γ<α} {i(γ)} = ∅, and therefore for every x ∈ X there is some γ < α
such that i(γ) = x. Since we already know that i is injective, it is a bijection between α and
X, and therefore establishes a well-ordering of X by x <_X y ↔ i^{-1}(x) < i^{-1}(y).
The reverse is simple. If C is a set of nonempty sets, select any well-ordering of ⋃C. Then
a choice function is just f(a) = the least member of a under that well-ordering.
Version: 5 Owner: Henry Author(s): Henry
34.104 axiom of extensionality
If X and Y have the same elements, then X = Y.
The Axiom of Extensionality is one of the axioms of Zermelo-Fraenkel set theory. In symbols,
it reads:
∀u (u ∈ X ↔ u ∈ Y) → X = Y.
Note that the converse,
X = Y → ∀u (u ∈ X ↔ u ∈ Y),
is an axiom of the predicate calculus. Hence we have
X = Y ↔ ∀u (u ∈ X ↔ u ∈ Y).
Therefore the Axiom of Extensionality expresses the most fundamental notion of a set: a set
is determined by its elements.
Version: 2 Owner: Sabean Author(s): Sabean
34.105 axiom of infinity
There exists an infinite set.
The Axiom of Infinity is an axiom of Zermelo-Fraenkel set theory. At first glance, this axiom
seems to be ill-defined. How are we to know what constitutes an infinite set when we have
not yet defined the notion of a finite set? However, once we have a theory of ordinal numbers
in hand, the axiom makes sense.
Meanwhile, we can give a definition of finiteness that does not rely upon the concept of
number. We do this by introducing the notion of an inductive set. A set S is said to be
inductive if ∅ ∈ S and for every x ∈ S, x ∪ {x} ∈ S. We may then state the Axiom of
Infinity as follows:
There exists an inductive set.
In symbols:
∃S [∅ ∈ S ∧ (∀x ∈ S)[x ∪ {x} ∈ S]]
We shall then be able to prove that the following conditions are equivalent:
1. There exists an inductive set.
2. There exists an infinite set.
3. The least nonzero limit ordinal, ω, is a set.
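The successor operation x ↦ x ∪ {x} used above is easy to mimic concretely. The following sketch is illustrative only (it uses Python frozensets so that sets may contain sets) and builds the first few von Neumann naturals:

def successor(x):
    return frozenset(x | {x})      # x ∪ {x}

zero = frozenset()                 # ∅
one = successor(zero)              # {∅}
two = successor(one)               # {∅, {∅}}
three = successor(two)             # {∅, {∅}, {∅, {∅}}}
assert len(three) == 3 and all(n in three for n in (zero, one, two))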
Version: 3 Owner: Sabean Author(s): Sabean
34.106 axiom of pairing
For any a and b there exists a set {a, b} that contains exactly a and b.
The Axiom of Pairing is one of the axioms of Zermelo-Fraenkel set theory. In symbols, it
reads:
∀a ∀b ∃c ∀x (x ∈ c ↔ x = a ∨ x = b).
Using the axiom of extensionality, we see that the set c is unique, so it makes sense to define
the pair
{a, b} = the unique c such that ∀x (x ∈ c ↔ x = a ∨ x = b).
Using the Axiom of Pairing, we may define, for any set a, the singleton
{a} = {a, a}.
We may also define, for any sets a and b, the ordered pair
(a, b) = {{a}, {a, b}}.
Note that this definition satisfies the condition
(a, b) = (c, d) iff a = c and b = d.
We may define the ordered n-tuple recursively:
(a_1, . . . , a_n) = ((a_1, . . . , a_{n−1}), a_n).
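The defining property of the ordered pair can be checked by brute force on small examples; the following sketch (illustrative only, not part of the entry) does so for the Kuratowski pair:

def kpair(a, b):
    return frozenset({frozenset({a}), frozenset({a, b})})   # (a, b) = {{a}, {a, b}}

values = [1, 2, 3]
for a in values:
    for b in values:
        for c in values:
            for d in values:
                assert (kpair(a, b) == kpair(c, d)) == (a == c and b == d)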
Version: 4 Owner: Sabean Author(s): Sabean
34.107 axiom of power set
For any X, there exists a set Y = P(X).
The Axiom of Power Set is an axiom of Zermelo-Fraenkel set theory. In symbols, it reads:
∀X ∃Y ∀u (u ∈ Y ↔ u ⊆ X).
In the above, u ⊆ X is defined as ∀z (z ∈ u → z ∈ X). Hence Y is the set of all subsets of
X. Y is called the power set of X and is denoted P(X). By extensionality, the set Y is
unique.
The Power Set Axiom allows us to define the Cartesian product of two sets X and Y:
X × Y = {(x, y) : x ∈ X ∧ y ∈ Y}.
The Cartesian product is a set since
X × Y ⊆ P(P(X ∪ Y)).
We may define the Cartesian product of any finite collection of sets recursively:
X_1 × · · · × X_n = (X_1 × · · · × X_{n−1}) × X_n.
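The containment X × Y ⊆ P(P(X ∪ Y)) can likewise be checked on a small example. The sketch below is illustrative only and represents pairs in the Kuratowski way:

from itertools import chain, combinations

def powerset(s):
    s = list(s)
    return {frozenset(c) for c in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))}

X, Y = {1, 2}, {2, 3}
pairs = {frozenset({frozenset({x}), frozenset({x, y})}) for x in X for y in Y}   # Kuratowski pairs
assert pairs <= powerset(powerset(X | Y))                                        # so X × Y ⊆ P(P(X ∪ Y))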
Version: 5 Owner: Sabean Author(s): Sabean
34.108 axiom of union
For any X there exists a set Y = ⋃X.
The Axiom of Union is an axiom of Zermelo-Fraenkel set theory. In symbols, it reads:
∀X ∃Y ∀u (u ∈ Y ↔ ∃z (z ∈ X ∧ u ∈ z)).
Notice that this means that Y is the set of elements of all elements of X. More succinctly,
the union of any set of sets is a set. By extensionality, the set Y is unique. Y is called the
union of X.
In particular, the Axiom of Union, along with the axiom of pairing, allows us to define
X ∪ Y = ⋃{X, Y},
as well as the triple
{a, b, c} = {a, b} ∪ {c}
and therefore the n-tuple
{a_1, . . . , a_n} = {a_1} ∪ · · · ∪ {a_n}.
Version: 5 Owner: Sabean Author(s): Sabean
34.109 axiom schema of separation
Let φ(u, p) be a formula. For any X and p, there exists a set Y = {u ∈ X : φ(u, p)}.
The Axiom Schema of Separation is an axiom schema of Zermelo-Fraenkel set theory. Note
that it represents infinitely many individual axioms, one for each formula φ. In symbols, it
reads:
∀X ∀p ∃Y ∀u (u ∈ Y ↔ u ∈ X ∧ φ(u, p)).
By extensionality, the set Y is unique.
The Axiom Schema of Separation extends to formulas φ depending on more than one
parameter: we may show by induction that if φ(u, p_1, . . . , p_n) is a formula, then
∀X ∀p_1 · · · ∀p_n ∃Y ∀u (u ∈ Y ↔ u ∈ X ∧ φ(u, p_1, . . . , p_n))
holds, using the Axiom Schema of Separation and the axiom of pairing.
Another consequence of the Axiom Schema of Separation is that a subclass of any set is a
set. To see this, let C be the class C = {u : φ(u, p_1, . . . , p_n)}. Then
∀X ∃Y (C ∩ X = Y)
holds, which means that the intersection of C with any set is a set. Therefore, in particular,
the intersection of two sets X ∩ Y = {x ∈ X : x ∈ Y} is a set. Furthermore the difference
of two sets X − Y = {x ∈ X : x ∉ Y} is a set and, provided there exists at least one set,
which is guaranteed by the axiom of infinity, the empty set is a set: for if X is a set, then
∅ = {x ∈ X : x ≠ x} is a set.
Moreover, if C is a nonempty class, then ⋂C is a set, by Separation, and ⋂C is a subset of
every X ∈ C.
Lastly, we may use Separation to show that the class of all sets, V, is not a set, i.e., V is a
proper class. For suppose V were a set. Then by Separation
V′ = {x ∈ V : x ∉ x}
is a set and we have reached a Russell paradox.
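Computationally, Separation corresponds to comprehension that is guarded by an already existing set, which is exactly what keeps it clear of Russell's paradox; a minimal sketch (illustrative only):

X = set(range(20))
phi = lambda u, p: u % p == 0          # a formula φ(u, p) with one parameter
Y = {u for u in X if phi(u, 3)}        # Y = {u ∈ X : φ(u, p)} with p = 3
assert Y == {0, 3, 6, 9, 12, 15, 18}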
Version: 15 Owner: Sabean Author(s): Sabean
34.110 de Morgan’s laws
In set theory, de Morgan's laws relate the three basic set operations to each other: the
union, the intersection, and the complement. De Morgan's laws are named after the
Indian-born British mathematician and logician Augustus De Morgan (1806-1871) [1].
If A and B are subsets of a set X, de Morgan's laws state that
(A ∪ B)^c = A^c ∩ B^c,
(A ∩ B)^c = A^c ∪ B^c.
Here, ∪ denotes the union, ∩ denotes the intersection, and A^c denotes the set complement
of A in X, i.e., A^c = X \ A.
Above, de Morgan's laws are written for two sets. In this form, they are intuitively quite
clear. For instance, the first claim states that an element that is not in A ∪ B is not in A
and not in B. It also states that an element not in A and not in B is not in A ∪ B.
For an arbitrary collection of subsets, de Morgan's laws are as follows:
Theorem. Let X be a set with subsets A_i ⊂ X for i ∈ I, where I is an arbitrary index set.
In other words, I can be finite, countable, or uncountable. Then
(⋃_{i∈I} A_i)^c = ⋂_{i∈I} A_i^c,
(⋂_{i∈I} A_i)^c = ⋃_{i∈I} A_i^c.
(proof)
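A quick finite check of both laws (illustrative only, with an arbitrarily chosen ambient set and subsets):

X = set(range(10))
A, B = {1, 2, 3, 4}, {3, 4, 5, 6}
complement = lambda S: X - S
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)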
de Morgan's laws in a Boolean algebra
For Boolean variables x and y in a Boolean algebra, de Morgan's laws state that
(x ∧ y)′ = x′ ∨ y′,
(x ∨ y)′ = x′ ∧ y′.
Not surprisingly, de Morgan's laws form an indispensable tool when simplifying digital
circuits involving and, or, and not gates [2].
REFERENCES
1. Wikipedia’s entry on de Morgan, 4/2003.
2. M.M. Mano, Computer Engineering: Hardware Design, Prentice Hall, 1988.
Version: 11 Owner: matte Author(s): matte, drini, greg
34.111 de Morgan’s laws for sets (proof )
Let X be a set with subsets A_i ⊂ X for i ∈ I, where I is an arbitrary index set. In other
words, I can be finite, countable, or uncountable. We first show that
(⋃_{i∈I} A_i)^c = ⋂_{i∈I} A_i^c,
where A^c denotes the complement of A.
Let us define S = (⋃_{i∈I} A_i)^c and T = ⋂_{i∈I} A_i^c. To establish the equality S = T, we shall
use a standard argument for proving equalities in set theory: namely, we show that S ⊂ T
and T ⊂ S. For the first claim, suppose x is an element of S. Then x ∉ ⋃_{i∈I} A_i, so x ∉ A_i
for any i ∈ I. Hence x ∈ A_i^c for all i ∈ I, and x ∈ ⋂_{i∈I} A_i^c = T. Conversely, suppose x
is an element of T = ⋂_{i∈I} A_i^c. Then x ∈ A_i^c for all i ∈ I. Hence x ∉ A_i for any i ∈ I, so
x ∉ ⋃_{i∈I} A_i, and x ∈ S.
The second claim,
(⋂_{i∈I} A_i)^c = ⋃_{i∈I} A_i^c,
follows by applying the first claim to the sets A_i^c.
Version: 3 Owner: mathcam Author(s): matte
34.112 set theory
Set theory is special among mathematical theories, in two ways: It plays a central role in
putting mathematics on a reliable axiomatic foundation, and it provides the basic language
and apparatus in which most of mathematics is expressed.
34.112.1 Axiomatic set theory
I will informally list the undefined notions, the axioms, and two of the “schemes” of set
theory, along the lines of Bourbaki’s account. The axioms are closer to the von Neumann-
Bernays-G¨odel model than to the equivalent ZFC model. (But some of the axioms are
identical to some in ZFC; see the entry ZermeloFraenkelAxioms.) The intention here is just
to give an idea of the level and scope of these fundamental things.
There are three undefined notions:
1. the relation of equality of two sets
2. the relation of membership of one set in another (x ∈ y)
3. the notion of an ordered pair, which is a set comprised of two other sets, in a
specific order.
Most of the eight schemes belong more properly to logic than to set theory, but they, or
something on the same level, are needed in the work of formalizing any theory that uses the
notion of equality, or uses quantifiers such as ∃. Because of their formal nature, let me just
(informally) state two of the schemes:
S6. If A and B are sets, and A = B, then anything true of A is true of B, and conversely.
S7. If two properties F(x) and G(x) of a set x are equivalent, then the "generic" set having
the property F is the same as the generic set having the property G.
(The notion of a generic set having a given property, is formalized with the help of the
Hilbert τ symbol; this is one way, but not the only way, to incorporate what is called the
axiom of choice.)
Finally come the five axioms in this axiomatization of set theory. (Some are identical to
axioms in ZFC, q.v.)
A1. Two sets A and B are equal iff they have the same elements, i.e. iff the relation x ∈ A
implies x ∈ B and vice versa.
A2. For any two sets A and B, there is a set C such that x ∈ C is equivalent to x = A or x = B.
A3. Two ordered pairs (A, B) and (C, D) are equal iff A = C and B = D.
A4. For any set A, there exists a set B such that x ∈ B is equivalent to x ⊂ A; in other
words, there is a set of all subsets of A, for any given set A.
A5. There exists an infinite set.
The word “infinite” is defined in terms of Axioms A1-A4. But to formulate the definition,
one must first build up some definitions and results about functions and ordered sets, which
we haven’t done here.
34.112.2 Product sets, relations, functions, etc.
Moving away from foundations and toward applications, all the more complex structures
and relations of set theory are built up out of the three undefined notions. (See the entry
"Set".) For instance, the relation A ⊂ B between two sets means simply "if x ∈ A then x ∈ B".
Using the notion of ordered pair, we soon get the very important structure called the product
A × B of two sets A and B. Next, we can get such things as equivalence relations and order
relations on a set A, for they are subsets of A × A. And we get the critical notion of a
function A → B, as a subset of A × B. Using functions, we get such things as the product
∏_{i∈I} A_i of a family of sets. ("Family" is a variation of the notion of function.)
To be strictly formal, we should distinguish between a function and the graph of that func-
tion, and between a relation and its graph, but the distinction is rarely necessary in practice.
34.112.3 Some structures defined in terms of sets
The natural numbers provide the first example. Peano, Zermelo and Fraenkel, and others
have given axiom-lists for the set N, with its addition, multiplication, and order relation;
but nowadays the custom is to define even the natural numbers in terms of sets. In more
detail, a natural number is the order-type of a finite well-ordered set. The relation m ≤ n
between m, n ∈ N is defined with the aid of a certain theorem which says, roughly, that for
any two well-ordered sets, one is a segment of the other. The sum or product of two natural
numbers is defined as the cardinal of the sum or product, respectively, of two sets. (For an
extension of this idea, see surreal numbers.)
(The term “cardinal” takes some work to define. The “type” of an ordered set, or any other
kind of structure, is the “generic” structure of that kind, which is defined using τ.)
Groups provide another simple example of a structure defined in terms of sets and ordered
pairs. A group is a pair (G, F) in which G is just a set, and F is a mapping G × G → G
satisfying certain axioms; the axioms (associativity etc.) can all be spelled out in terms of
sets and ordered pairs, although in practice one uses algebraic notation to do it. When we
speak of (e.g.) "the" group S_3 of permutations of a 3-element set, we mean the "type" of
such a group.
Topological spaces provide another example of how mathematical structures can be defined
in terms of, ultimately, the sets and ordered pairs in set theory. A topological space is a pair
(S, U), where the set S is arbitrary, but U has these properties:
– any element of U is a subset of S
– the union of any family (or set) of elements of U is also an element of U
– the intersection of any finite family of elements of U is an element of U.
Many special kinds of topological spaces are defined by enlarging this list of restrictions on U.
Finally, many kinds of structure are based on more than one set. E.g. a left module is a
commutative group M together with a ring R, plus a mapping R × M → M which satisfies
a specific set of restrictions.
34.112.4 Categories, homological algebra
Although set theory provides some of the language and apparatus used in mathematics
generally, that language and apparatus have expanded over time, and now include what are
called “categories” and “functors”. A category is not a set, and a functor is not a mapping,
despite similarities in both cases. A category comprises all the structured sets of the same
kind, e.g. the groups, and contains also a definition of the notion of a morphism from one
such structured set to another of the same kind. A functor is similar to a morphism but
compares one category to another, not one structured set to another. The classic examples
are certain functors from the category of topological spaces to the category of groups.
“Homological algebra” is concerned with sequences of morphisms within a category, plus
functors from one category to another. One of its aims is to get structure theories for specific
categories; the homology of groups and the cohomology of Lie algebras are examples. For
more details on the categories and functors of homological algebra, I recommend a search
for “Eilenberg-Steenrod axioms”.
Version: 8 Owner: mathwizard Author(s): Larry Hammick
34.113 union
The union of two sets A and B is the set which contains all x ∈ A and all x ∈ B. The
union of A and B is written A ∪ B.
For any sets A and B,
x ∈ A ∪ B ⇔ (x ∈ A) ∨ (x ∈ B).
Version: 1 Owner: imran Author(s): imran
34.114 universe
A universe U is a nonempty set satisfying the following axioms:
1. If x ∈ U and y ∈ x, then y ∈ U.
2. If x, y ∈ U, then {x, y} ∈ U.
3. If x ∈ U, then the power set P(x) ∈ U.
4. If {x_i}_{i∈I} is a family of elements of U indexed by a set I ∈ U, then ⋃_{i∈I} x_i ∈ U.
From these axioms, one can deduce the following properties:
1. If x ∈ U, then {x} ∈ U.
2. If x is a subset of y ∈ U, then x ∈ U.
3. If x, y ∈ U, then the ordered pair (x, y) = {{x, y}, x} is in U.
4. If x, y ∈ U, then x ∪ y and x × y are in U.
5. If {x_i}_{i∈I} is a family of elements of U indexed by a set I ∈ U, then the product ∏_{i∈I} x_i is in U.
6. If x ∈ U, then the cardinality of x is strictly less than the cardinality of U. In
particular, U ∉ U.
The standard reference for universes is [SGA4].
REFERENCES
[SGA4] Grothendieck et al. SGA4.
Version: 2 Owner: nerdy2 Author(s): nerdy2
34.115 von Neumann-Bernays-Gödel set theory
von Neumann-Bernays-Gödel set theory (commonly referred to as NBG or vNBG) is an
axiomatisation of set theory closely related to the more familiar Zermelo-Fraenkel with
choice (ZFC) axiomatisation. The primary difference between ZFC and NBG is that NBG
has proper classes among its objects. NBG and ZFC are very closely related and are in fact
equiconsistent, NBG being a conservative extension of ZFC.
In NBG, the proper classes are differentiated from sets by the fact that they do not belong
to other classes. Thus in NBG we have
Set(x) ↔ ∃y (x ∈ y).
Another interesting fact about proper classes within NBG is the following limitation of size
principle of von Neumann:
Set(x) ↔ |x| ≠ |V|,
where V is the set-theoretic universe. This principle can in fact replace in NBG essentially
all set existence axioms with the exception of the power set axiom (and obviously the
axiom of infinity). Thus the classes that are proper in NBG are in a very clear sense big,
while the sets are small.
The NBG set theory can be axiomatised in two different ways:
• Using the Gödel class construction functions, resulting in a finite axiomatisation
• Using a class comprehension axiom scheme, resulting in an infinite axiomatisation
In the latter alternative we take ZFC and relativise all of its axioms to sets, i.e. we replace
every expression of the form ∀x φ with ∀x (Set(x) → φ) and ∃x φ with ∃x (Set(x) ∧ φ), and add
the class comprehension scheme:
If φ is a formula with a free variable x with all its quantifiers restricted to
sets, then the following is an axiom: ∃A ∀x (x ∈ A ↔ φ)
Notice the important restriction to formulae with quantifiers restricted to sets in the scheme.
This requirement makes the NBG proper classes predicative; you can't prove the existence
of a class whose definition quantifies over all classes. This restriction is essential; if we
loosen it we get a theory that is not conservative over ZFC. If we allow arbitrary formulae in
the class comprehension axiom scheme we get what is called Morse-Kelley set theory. This
theory is essentially stronger than ZFC or NBG. In addition to these axioms, NBG also
contains the global axiom of choice
∃C ∀x ∃z (C ∩ x = {z}).
Another way to axiomatise NBG is to use the eight Gödel class construction functions. These
functions correspond to the various ways in which one can build up formulae (restricted
to sets!) with set parameters. However, the functions are finite in number and so are the
resulting axioms governing their behaviour. In particular, since there is a class corresponding
to any restricted formula, the intersection of any set and this class exists too (and is a set).
Thus the comprehension scheme of ZFC can be replaced with a finite number of axioms,
provided we allow for proper classes.
It is easy to show that everything provable in ZF is also provable in NBG. It is also not too
difficult to show that NBG minus global choice is a conservative extension of ZFC. However,
showing that NBG (including global choice) is a conservative extension of ZFC is considerably
more difficult. This is equivalent to showing that NBG with global choice is conservative over
NBG with only local choice (choice restricted to sets). In order to do this one needs to use
(class) forcing. This result is usually credited to Easton and Solovay.
Version: 8 Owner: Aatu Author(s): Aatu
34.116 FS iterated forcing preserves chain condition
Let κ be a regular cardinal and let ⟨Q̂_β⟩_{β<α} be a finite support iterated forcing such that for
every β < α, ⊩_{P_β} "Q̂_β has the κ chain condition". Then each P_β satisfies the κ chain condition.
By induction:
P_0 is the empty set.
If P_α satisfies the κ chain condition then so does P_{α+1}, since P_{α+1} is equivalent to P_α ∗ Q̂_α
and composition preserves the κ chain condition for regular κ.
Suppose α is a limit ordinal and P_β satisfies the κ chain condition for all β < α. Let
S = ⟨p_i⟩_{i<κ} be a subset of P_α of size κ. The domains of the elements p_i form κ finite
subsets of α, so if cf(α) > κ then these are bounded, and by the inductive hypothesis, two
of them are compatible.
Otherwise, if cf(α) < κ, let ⟨α_j⟩_{j<cf(α)} be an increasing sequence of ordinals cofinal in α.
Then for any i < κ there is some n(i) < cf(α) such that dom(p_i) ⊆ α_{n(i)}. Since κ is regular
and this is a partition of κ into fewer than κ pieces, one piece must have size κ; that is, there
is some j such that j = n(i) for κ values of i, and so {p_i | n(i) = j} is a set of conditions
of size κ contained in P_{α_j}, and it therefore contains compatible members by the induction
hypothesis.
Finally, if cf(α) = κ, let C = ⟨α_j⟩_{j<κ} be a strictly increasing, continuous sequence cofinal in
α. Then for every i < κ there is some n(i) < κ such that dom(p_i) ⊆ α_{n(i)}. When n(i) is a
limit ordinal, since C is continuous, there is also (since dom(p_i) is finite) some f(i) < i such
that dom(p_i) ∩ [α_{f(i)}, α_i) = ∅. Consider the set E of ordinals i such that i is a limit ordinal
and for any j < i, n(j) < i. This is a club, so by Fodor's lemma there is some j such that
{i | f(i) = j} is stationary.
For each p_i such that f(i) = j, consider p′_i = p_i ↾ α_j. There are κ of these, all members of
P_{α_j}, so two of them must be compatible, and hence those two p_i are also compatible in P_α.
Version: 1 Owner: Henry Author(s): Henry
34.117 chain condition
A partial order P satisfies the κ-chain condition if for any S ⊆ P with |S| = κ there
exist distinct x, y ∈ S such that x and y are compatible (equivalently, P has no antichain
of size κ).
If κ = ℵ_1 then P is said to satisfy the countable chain condition (c.c.c.).
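Incompatibility, the notion underlying chain conditions, is easy to exhibit concretely; in the poset of partial functions from {0, 1} to {0, 1} ordered by extension (a toy example, not part of the entry), the four total functions form an antichain:

from itertools import product

def compatible(p, q):                  # p and q have a common extension
    return all(p[k] == q[k] for k in p.keys() & q.keys())

antichain = [dict(zip((0, 1), bits)) for bits in product((0, 1), repeat=2)]
assert all(not compatible(p, q) for p in antichain for q in antichain if p is not q)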
Version: 2 Owner: Henry Author(s): Henry
34.118 composition of forcing notions
Suppose P is a forcing notion in M and Q̂ is some P-name such that ⊩_P "Q̂ is a forcing
notion".
Then take a set Q of P-names such that, for the induced P-name Q̃, ⊩_P Q̃ = Q̂ (that is, no
matter which generic subset G of P we force with, the names in Q correspond precisely to
the elements of Q̂[G]). We can define
P ∗ Q̂ = {⟨p, q̂⟩ | p ∈ P, q̂ ∈ Q}.
We can define a partial order on P ∗ Q̂ such that ⟨p_1, q̂_1⟩ ≤ ⟨p_2, q̂_2⟩ iff p_1 ≤_P p_2 and
p_1 ⊩ q̂_1 ≤_Q̂ q̂_2. (A note on interpretation: q̂_1 and q̂_2 are P-names; this requires only that
q̂_1 ≤ q̂_2 in generic subsets containing p_1, so in other generic subsets that fact could fail.)
Then P ∗ Q̂ is itself a forcing notion, and it can be shown that forcing by P ∗ Q̂ is equivalent
to forcing first by P and then by Q̂[G].
Version: 1 Owner: Henry Author(s): Henry
34.119 composition preserves chain condition
Let κ be a regular cardinal. Let P be a forcing notion satisfying the κ chain condition.
Let Q̂ be a P-name such that ⊩_P "Q̂ is a forcing notion satisfying the κ chain condition".
Then P ∗ Q̂ satisfies the κ chain condition.
Proof:
Outline:
Given a set S = ⟨p_i, q̂_i⟩_{i<κ} of κ conditions, we prove that there is some p such that any
generic subset of P containing p also contains κ of the p_i. Then, since Q̂[G] satisfies the κ
chain condition, two of the corresponding q̂_i must be compatible. Then, since G is directed,
there is some condition stronger than all of these which forces this to be true, and therefore
makes two elements of S compatible.
Let S = ⟨p_i, q̂_i⟩_{i<κ} ⊆ P ∗ Q̂.
Claim: There is some p ∈ P such that p ⊩ |{i | p_i ∈ Ĝ}| = κ.
(Note: Ĝ = {⟨p, p̌⟩ | p ∈ P}, hence Ĝ[G] = G.)
If no p forces this then every p forces that it is not true, and therefore ⊩_P |{i | p_i ∈ Ĝ}| < κ.
Since κ is regular, this means that for any generic G ⊆ P, the set {i | p_i ∈ G} is bounded.
For each G, let f(G) be the least ordinal α such that β ≥ α implies p_β ∉ G. Define
B = {α | α = f(G) for some G}.
Claim: |B| < κ.
If α ∈ B then there is some r_α ∈ P such that r_α ⊩ f(Ĝ) = α, and if α ≠ β are both in B then
r_α must be incompatible with r_β. Since P satisfies the κ chain condition, it follows that
|B| < κ.
Since κ is regular, α = sup(B) < κ. But obviously p_{α+1} ⊩ p_{α+1} ∈ Ĝ, so any generic subset
containing p_{α+1} would have f(G) > α = sup(B). This is a contradiction, so we conclude that
there must be some p such that p ⊩ |{i | p_i ∈ Ĝ}| = κ.
If G ⊆ P is any generic subset containing p then A = {q̂_i[G] | p_i ∈ G} must have cardinality
κ. Since Q̂[G] satisfies the κ chain condition, there exist i, j < κ such that p_i, p_j ∈ G and
there is some q̂[G] ∈ Q̂[G] such that q̂[G] ≤ q̂_i[G], q̂_j[G]. Then since G is directed, there is
some p′ ∈ G such that p′ ≤ p_i, p_j, p and p′ ⊩ q̂ ≤ q̂_i, q̂_j. So ⟨p′, q̂⟩ ≤ ⟨p_i, q̂_i⟩, ⟨p_j, q̂_j⟩.
Version: 1 Owner: Henry Author(s): Henry
34.120 equivalence of forcing notions
Let P and Q be two forcing notions such that given any generic subset G of P there is a
generic subset H of Q with M[G] = M[H], and vice-versa. Then P and Q are equivalent.
Observe that if G ∈ M[H] then τ[G] ∈ M[H] for any P-name τ; it follows that if G ∈ M[H]
and H ∈ M[G] then M[G] = M[H].
Version: 2 Owner: Henry Author(s): Henry
34.121 forcing relation
If M is a transitive model of set theory and P is a partial order then we can define a forcing
relation:
p ⊩_P φ(τ_1, . . . , τ_n)
(p forces φ(τ_1, . . . , τ_n))
for any p ∈ P, where τ_1, . . . , τ_n are P-names.
Specifically, the relation holds if for every generic filter G over P which contains p,
M[G] ⊨ φ(τ_1[G], . . . , τ_n[G]).
That is, p forces φ if every extension of M by a generic filter over P containing p makes φ
true.
If p ⊩_P φ holds for every p ∈ P then we can write ⊩_P φ to mean that for any generic G ⊆ P,
M[G] ⊨ φ.
Version: 2 Owner: Henry Author(s): Henry
34.122 forcings are equivalent if one is dense in the
other
Suppose P and Q are forcing notions and that F : P → Q is a function such that:
• p_1 ≤_P p_2 implies F(p_1) ≤_Q F(p_2)
• If p_1, p_2 ∈ P are incompatible then F(p_1), F(p_2) are incompatible
• F[P] is dense in Q
then P and Q are equivalent.
Proof
We seek to provide two operations (computable in the appropriate universes) which convert
between generic subsets of 1 and C, and to prove that they are inverses.
1(G) = H where H is generic
Given a generic G ⊆ 1, consider H = ¦¡ [ 1(j) < ¡¦ for some j ∈ G.
If ¡
1
∈ H and ¡
1
< ¡
2
then ¡
2
∈ H by the definition of H. If ¡
1
. ¡
2
∈ H then let j
1
. j
2
∈ 1
be such that 1(j
1
) < ¡
1
and 1(j
2
) < ¡
2
. Then there is some j
3
< j
1
. j
2
such that j
3
∈ G,
and since 1 is order preseving 1(j
3
) < 1(j
1
) < ¡
1
and 1(j
3
) < 1(j
2
) < ¡
2
.
Suppose 1 is a dense subset of C. Since 1[1] is dense in C, for any d ∈ 1 there is some
j ∈ 1 such that 1(j) < d. For each d ∈ 1, assign (using the axiom of choice) some d
p
∈ 1
such that 1(d
p
) < d, and call the set of these 1
P
. This is dense in 1, since for any j ∈ 1
there is some d ∈ 1 such that d < 1(j), and so some d
p
∈ 1
P
such that 1(d
p
) < d. If d
p
< j
then 1
P
is dense, so suppose d
p
< j. If d
p
< j then this provides a member of 1
P
less than
j; alternatively, since 1(d
p
) and 1(j) are compatible, d
p
and j are compatible, so j < d
p
,
and therefore 1(j) = 1(d
p
) = d, so j ∈ 1
P
. Since 1
P
is dense in 1, there is some element
j ∈ 1
P
¸
G. Since j ∈ 1
P
, there is some d ∈ 1 such that 1(j) < d. But since j ∈ G,
d ∈ H, so H intersects 1.
G can be recovered from 1(G)
Given H constructed as above, we can recover G as the set of j ∈ 1 such that 1(j) ∈ H.
Obviously every element from G is included in the new set, so consider some j such that
1(j) ∈ H. By definition, there is some j
1
∈ G such that 1(j
1
) < 1(j). Take some dense
1 ∈ C such that there is no d ∈ 1 such that 1(j) < d (this can be done easily be taking
any dense subset and removing all such elements; the resulting set is still dense since there
is some d
1
such that d
1
< 1(j) < d). This set intersects 1[G] in some ¡, so there is some
j
2
∈ G such that 1(j
2
) < ¡, and since G is directed, some j
3
∈ G such that j
3
< j
2
. j
1
. So
1(j
3
) < 1(j
1
) < 1(j). If j
3
< j then we would have j < j
3
and then 1(j) < 1(j
3
) < ¡,
contradicting the definition of 1, so j
3
< j and j ∈ G since G is directed.
1
−1
(H) = G where G is generic
Given any generic H in C, we define a corresponding G as above: G = ¦j ∈ 1 [ 1(j) ∈ H¦.
If j
1
∈ G and j
1
< j
2
then 1(j
1
) ∈ H and 1(j
1
) < 1(j
2
), so j
2
∈ G since H is directed. If
j
1
. j
2
∈ G then 1(j
1
). 1(j
2
) ∈ H and there is some ¡ ∈ H such that ¡ < 1(j
1
). 1(j
2
).
Consider 1, the set of elements of C which are 1(j) for some j ∈ 1 and either 1(j) < ¡ or
there is no element greater than both 1(j) and ¡. This is dense, since given any ¡
1
∈ C, if
¡
1
< ¡ then (since 1[1] is dense) there is some j such that 1(j) < ¡
1
< ¡. If ¡ < ¡
1
then
there is some j such that 1(j) < ¡ < ¡
1
. If neither of these and ¡ there is some : < ¡
1
. ¡ then
any j such that 1(j) < : suffices, and if there is no such : then any j such that 1(j) < ¡
suffices.
There is some 1(j) ∈ 1
¸
H, and so j ∈ G. Since H is directed, there is some : < 1(j). ¡,
so 1(j) < ¡ < 1(j
1
). 1(j
2
). If it is not the case that 1(j) < 1(j
1
) then 1(j) = 1(j
1
) = 1(j
2
).
In either case, we confirm that H is directed.
Finally, let 1 be a dense subset of 1. 1[1] is dense in C, since given any ¡ ∈ C, there
is some j ∈ 1 such that j < ¡, and some d ∈ 1 such that d < j < ¡. So there is some
1(j) ∈ 1[1]
¸
H, and so j ∈ 1
¸
G.
H can be recovered from 1
−1
(H)
Finally, given G constructed by this method, H = ¦¡ [ 1(j) < ¡¦ for some j ∈ G. To see
this, if there is some 1(j) for j ∈ G such that 1(j) < ¡ then 1(j) ∈ H so ¡ ∈ H. On the
other hand, if ¡ ∈ H then the set of 1(j) such that either 1(j) < ¡ or there is no : ∈ C such
that : < ¡. 1(j) is dense (as shown above), and so intersects H. But since H is directed, it
must be that there is some 1(j) ∈ H such that 1(j) < ¡, and therefore j ∈ G.
Version: 3 Owner: Henry Author(s): Henry
34.123 iterated forcing
We can define an iterated forcing of length α by induction as follows:
Let P_0 = ∅.
Let Q̂_0 be a forcing notion.
For β ≤ α, P_β is the set of all functions f such that dom(f) ⊆ β and for any i ∈ dom(f),
f(i) is a P_i-name for a member of Q̂_i. Order P_β by the rule f ≤ g iff dom(g) ⊆ dom(f) and
for any i ∈ dom(g), g ↾ i ⊩ f(i) ≤_{Q̂_i} g(i). (Translated, this means that any generic subset
including g restricted to i forces that f(i), an element of Q̂_i, be less than g(i).)
For β < α, Q̂_β is a forcing notion in P_β (so ⊩_{P_β} "Q̂_β is a forcing notion").
Then the sequence ⟨Q̂_β⟩_{β<α} is an iterated forcing.
If each P_β is restricted to finite functions then it is called a finite support iterated forcing
(FS), if each P_β is restricted to countable functions it is called a countable support iterated
forcing (CS), and in general if each function in each P_β has size less than κ then it is a
<κ-support iterated forcing.
Typically we construct the sequence of Q̂_β's by induction, using a function F such that
F(⟨Q̂_β⟩_{β<γ}) = Q̂_γ.
Version: 2 Owner: Henry Author(s): Henry
34.124 iterated forcing and composition
There is a function F : P_α ∗ Q̂_α → P_{α+1} satisfying the hypotheses of the theorem that
forcings are equivalent if one is dense in the other.
Proof
Let F(⟨g, q̂⟩) = g ∪ {⟨α, q̂⟩}. This is obviously a member of P_{α+1}, since it is a partial function
on α + 1 (and if the domain of g is less than α then so is the domain of F(⟨g, q̂⟩)); if i < α
then obviously F(⟨g, q̂⟩) applied to i satisfies the definition of iterated forcing (since g does),
and if i = α then the definition is satisfied since q̂ is a P_α-name for a member of Q̂_α.
F is order preserving, since if ⟨g_1, q̂_1⟩ ≤ ⟨g_2, q̂_2⟩ then all the appropriate characteristics of a
function carry over to the image, and g_1 ↾ α ⊩_{P_α} q̂_1 ≤ q̂_2 (by the definition of ≤ in the
composition P_α ∗ Q̂_α).
If ⟨g_1, q̂_1⟩ and ⟨g_2, q̂_2⟩ are incompatible then either g_1 and g_2 are incompatible, in which
case whatever prevents them from being compatible applies to their images as well, or q̂_1 and
q̂_2 are not compared appropriately, in which case again this prevents the images from being
compatible.
Finally, let g be any element of P_{α+1}. Then g ↾ α ∈ P_α. If α ∉ dom(g) then g ↾ α is just g,
and F(⟨g, q̂⟩) ≤ g for any q̂. If α ∈ dom(g) then F(⟨g ↾ α, g(α)⟩) = g. Hence F[P_α ∗ Q̂_α] is
dense in P_{α+1}, and so these are equivalent.
Version: 3 Owner: Henry Author(s): Henry
34.125 name
We need a way to refer to objects of M[G] within M. This is done by assigning a name to
each element of M[G].
Given a partial order P, we construct the P-names by induction. Each name is just a
relation between P and the set of names already constructed; that is, a name is a set of
ordered pairs of the form (p, τ) where p ∈ P and τ is a name constructed at an earlier level
of the induction.
Given a generic subset G ⊆ P, we can then define the interpretation τ[G] of a P-name τ in
M[G] by:
τ[G] = {τ′[G] | (p, τ′) ∈ τ for some p ∈ G}.
Of course, two different names can have the same interpretation.
The generic subset can be thought of as a "key" which reveals which potential elements of
τ are actually elements.
Any element x ∈ M can be given a canonical name
x̂ = {(p, ŷ) | y ∈ x, p ∈ P}.
This guarantees that the elements of x̂[G] will be exactly the same as the elements of x,
regardless of which members of P are contained in G.
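The recursive interpretation of names can be mimicked directly; the sketch below is illustrative only (a toy set G of string-labelled conditions stands in for a generic filter) and follows the definition τ[G] = {τ′[G] | (p, τ′) ∈ τ for some p ∈ G}:

def interpret(tau, G):
    return frozenset(interpret(sigma, G) for (p, sigma) in tau if p in G)

empty_name = frozenset()                          # a name for ∅
maybe_singleton = frozenset({("p", empty_name)})  # denotes {∅} exactly when the condition "p" lands in G
assert interpret(maybe_singleton, {"p", "q"}) == frozenset({frozenset()})
assert interpret(maybe_singleton, {"q"}) == frozenset()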
Version: 3 Owner: Henry Author(s): Henry
34.126 partial order with chain condition does not col-
lapse cardinals
If P is a partial order which satisfies the κ chain condition and G is a generic subset of P
then for any cardinal λ > κ in M, λ is also a cardinal in M[G], and if cf(α) = λ in M then also
cf(α) = λ in M[G].
This theorem is the simplest way to control a notion of forcing, since it means that a notion
of forcing does not have an effect above a certain point. Given that any P satisfies the |P|^+
chain condition, this means that most forcings leave all of M above a certain point alone.
(Although it is possible to get around this limit by forcing with a proper class.)
Version: 2 Owner: Henry Author(s): Henry
34.127 proof of partial order with chain condition does
not collapse cardinals
Outline:
Given any function f purporting to violate the theorem by being surjective (or cofinal) onto
λ, we show that there are fewer than κ possible values of f(α) for each α, and therefore only
max(α, κ) possible elements in the entire range of f, so f is not surjective (or cofinal).
Details:
Suppose λ > κ is a cardinal of M that is not a cardinal in M[G].
Then there is some function f ∈ M[G] and some cardinal α < λ such that f : α → λ is
surjective. This f has a name, f̂. For each β < α, consider
F_β = {γ < λ | p ⊩ f̂(β) = γ for some p ∈ P}.
|F_β| < κ, since any two p ∈ P which force different values for f̂(β) are incompatible and P
has no sets of incompatible elements of size κ.
Notice that F_β is definable in M. Then the range of f must be contained in F = ⋃_{β<α} F_β.
But |F| ≤ α · κ = max(α, κ) < λ. So f cannot possibly be surjective, and therefore λ is not
collapsed.
Now suppose that for some α ≥ λ > κ, cf(α) = λ in M and for some η < λ there is a cofinal
function f : η → α in M[G].
We can construct F_β as above, and again the range of f is contained in F = ⋃_{β<η} F_β. But
then |range(f)| ≤ |F| ≤ η · κ < λ. So there is some γ < α such that f(β) < γ for every β < η,
and therefore f is not cofinal in α.
Version: 1 Owner: Henry Author(s): Henry
34.128 proof that forcing notions are equivalent to their
composition
This is a long and complicated proof, the more so because the meaning of Q̂ shifts depending
on which generic subset of P is being used. It is therefore broken into a number of steps. The
core of the proof is to show that, given any generic subset G of P and a generic subset H of
Q̂[G], there is a corresponding generic subset G ∗ H of P ∗ Q̂ such that M[G][H] = M[G ∗ H],
and conversely, given any generic subset G of P ∗ Q̂, we can find some generic G_P of P and
a generic G_Q of Q̂[G_P] such that M[G_P][G_Q] = M[G].
We do this by constructing functions using operations which can be performed within the
forced universes so that, for example, since M[G][H] has both G and H, G ∗ H can be
calculated, proving that it contains M[G ∗ H]. To ensure equality, we will also have to
ensure that our operations are inverses; that is, given G, G_P ∗ G_Q = G, and given G and H,
(G ∗ H)_P = G and (G ∗ H)_Q = H.
The remainder of the proof merely defines the precise operations, proves that they give
generic sets, and proves that they are inverses.
Before beginning, we prove a lemma which comes up several times:
Lemma: If G is generic in 1 and 1 is dense above some j ∈ G then
G
¸
1 = ∅
Let 1
t
= ¦j
t
∈ 1 [ j
t
∈ 1 ∨ j
t
is incompatible with j¦. This is dense, since if j
0
∈ 1
then either j
0
is incompatible with j, in which case j
0
∈ 1
t
, or there is some j
1
such that
j
1
< j. j
0
, and therefore there is some j
2
< j
1
such that j
2
∈ 1, and therefore j
2
< j
0
. So
G intersects 1
t
. But since a generic set is directed, no two elements are incompatible, so
G must contain an element of 1
t
which is not incompatible with j, so it must contain an
element of 1.
G∗ H is a generic filter
First, given generic subsets G and H of 1 and
ˆ
C[G], we can define:
G ∗ H = ¦'j. ˆ ¡` [ j ∈ G∧ ˆ ¡[G] ∈ H¦
G∗ H is closed
Let 'j
1
. ˆ ¡
1
` ∈ G ∗ H and let 'j
1
. ˆ ¡
1
` < 'j
2
. ˆ ¡
2
`. Then we can conclude j
1
∈ G, j
1
< j
2
,
ˆ ¡
1
[G] ∈ H, and j
1
' ˆ ¡
1
< ˆ ¡
2
, so j
2
∈ G (since G is closed) and ˆ ¡
2
[G] ∈ H since j
1
∈ G and
j
1
forces both ˆ ¡
1
< ˆ ¡
2
and that H is downward closed. So 'j
2
. ˆ ¡
2
` ∈ G∗ H.
G∗ H is directed
Suppose 'j
1
. ˆ ¡
1
`. 'j
1
. ˆ ¡
1
` ∈ G∗ H. So j
1
. j
2
∈ G, and since G is directed, there is some j
3
<
j
1
. j
2
. Since ˆ ¡
1
[G]. ˆ ¡
2
[G] ∈ H and H is directed, there is some ˆ ¡
3
[G] < ˆ ¡
1
[G]. ˆ ¡
2
[G]. Therefore
there is some j
4
< j
3
, j
4
∈ G, such that j
4
' ˆ ¡
3
< ˆ ¡
1
. ˆ ¡
2
, so 'j
4
. ˆ ¡
3
` < 'j
1
. ˆ ¡
1
`. 'j
1
. ˆ ¡
1
` and
'j
4
. ˆ ¡
3
` ∈ G∗ H.
G∗ H is generic
Suppose 1 is a dense subset of 1 ∗
ˆ
C. We can project it into a dense subset of C using G:
1
Q
= ¦ˆ ¡[G] [ 'j. ˆ ¡` ∈ 1¦ for some j ∈ G
Lemma: 1
Q
is dense in
ˆ
C[G]
Given any ˆ ¡
0

ˆ
C, take any j
0
∈ G. Then we can define yet another dense subset, this one
in G:
1
ˆ q
0
= ¦j [ j < j
0
∧ j ' ˆ ¡ < ˆ ¡
0
∧ 'j. ˆ ¡` ∈ 1¦ for some ˆ ¡ ∈
ˆ
C
Lemma: 1
ˆ q
0
is dense above j
0
in 1
Take any j ∈ 1 such that j < j
0
. Then, since 1 is dense in 1 ∗
ˆ
C, we have some 'j
1
. ˆ ¡
1
` <
'j. ˆ ¡
0
` such that 'j
1
. ˆ ¡
1
` ∈ 1. Then by definition j
1
< j and j
1
∈ 1
ˆ q
0
.
From this lemma, we can conclude that there is some j
1
< j
0
such that j
1
∈ G
¸
1
ˆ q
0
, and
therefore some ˆ ¡
1
such that j
1
' ˆ ¡
1
< ˆ ¡
0
where 'j
1
. ˆ ¡
1
` ∈ 1. So 1
Q
is indeed dense in
ˆ
C[G].
Since 1
Q
is dense in
ˆ
C[G], there is some ˆ ¡ such that ˆ ¡[G] ∈ 1
Q
¸
H, and so some j ∈ G
such that 'j. ˆ ¡` ∈ 1. But since j ∈ G and ˆ ¡ ∈ H, 'j. ˆ ¡` ∈ G∗ H, so G∗ H is indeed generic.
G
P
is a generic filter
Given some generic subset G of 1 ∗
ˆ
C, let:
G
P
= ¦j ∈ 1 [ j
t
< j ∧ 'j
t
. ˆ ¡` ∈ G¦ for some j
t
∈ 1 and some ˆ ¡ ∈ C
G
P
is closed
Take any j
1
∈ G
P
and any j
2
such that j
1
< j
2
. Then there is some j
t
< j
1
satisfying the
definition of G
P
, and also j
t
< j
2
, so j
2
∈ G
P
.
G
P
is directed
Consider j
1
. j
2
∈ G
P
. Then there is some j
t
1
and some ˆ ¡
1
such that 'j
t
1
. ˆ ¡
1
` ∈ G and some
j
t
2
and some ˆ ¡
2
such that 'j
t
2
. ˆ ¡
2
` ∈ G. Since G is directed, there is some 'j
3
. ˆ ¡
3
` ∈ G such
that 'j
3
. ˆ ¡
3
` < 'j
t
1
. ˆ ¡
1
`. 'j
t
2
. ˆ ¡
2
`, and therefore j
3
∈ G
P
, j
3
< j
1
. j
2
.
G
P
is generic
Let 1 be a dense subset of 1. Then 1
t
= ¦'j. ˆ ¡` [ j ∈ 1¦. Clearly this is dense, since if
'j. ˆ ¡` ∈ 1 ∗
ˆ
C then there is some j
t
< j such that j
t
∈ 1, so 'j
t
. ˆ ¡` ∈ 1
t
and 'j
t
. ˆ ¡` < 'j. ˆ ¡`.
So there is some 'j. ˆ ¡` ∈ 1
t
¸
G, and therefore j ∈ 1
¸
G
P
. So G
P
is generic.
G
Q
is a generic filter
Given a generic subset G ⊆ 1 ∗
ˆ
C, define:
G
Q
= ¦ˆ ¡[G
P
] [ 'j. ˆ ¡` ∈ G¦ for some j ∈ 1
(Notice that G
Q
is dependant on G
P
, and is a subset of
ˆ
C[G
P
], that is, the forcing notion
inside M[G
P
], as opposed to the set of names C which we’ve been primarily working with.)
G
Q
is closed
Suppose ˆ ¡
1
[G
P
] ∈ G
Q
and ˆ ¡
1
[G
P
] < ˆ ¡
2
[G
P
]. Then there is some j
1
∈ G
P
such that j
1
'
ˆ ¡
1
< ˆ ¡
2
. Since j
1
∈ G
P
, there is some j
2
< j
1
such that for some ˆ ¡
3
, 'j
2
. ˆ ¡
3
` ∈ G. By the
definition of G
Q
, there is some j
3
such that 'j
3
. ˆ ¡
1
` ∈ G, and since G is directed, there is
some 'j
4
. ˆ ¡
4
` ∈ G and 'j
4
. ˆ ¡
4
` < 'j
3
. ˆ ¡
1
`. 'j
2
. ˆ ¡
3
`. Since G is closed and 'j
4
. ˆ ¡
4
` < 'j
4
. ˆ ¡
2
`, we
have ˆ ¡
2
[G
P
] ∈ G
Q
.
G
Q
is directed
Suppose ˆ ¡
1
[G
P
]. ˆ ¡
2
[G
P
] ∈ G
Q
. Then for some j
1
. j
2
, 'j
1
. ˆ ¡
1
`. 'j
2
. ˆ ¡
2
` ∈ G, and since G is
directed, there is some 'j
3
. ˆ ¡
3
` ∈ G such that 'j
3
. ˆ ¡
3
` < 'j
1
. ˆ ¡
1
`. 'j
2
. ˆ ¡
2
`. Then ˆ ¡
3
[G
P
] ∈ G
Q
and since j
3
∈ G and j
3
' ˆ ¡
3
< ˆ ¡
1
. ˆ ¡
2
, we have ˆ ¡
3
[G
P
] < ˆ ¡
1
[G
P
]. ˆ ¡
2
[G
P
].
G
Q
is generic
Let 1 be a dense subset of C[G
P
] (in M[G
P
]). Let
ˆ
1 be a 1-name for 1, and let j
1
∈ G
P
be a such that j
1
'
ˆ
1 is dense. By the definition of G
P
, there is some j
2
< j
1
such that
'j
2
. ˆ ¡
2
` ∈ G for some ¡
2
. Then 1
t
= ¦'j. ˆ ¡` [ j ' ˆ ¡ ∈ 1 ∧ j < j
2
¦.
Lemma: 1
t
is dense (in G) above 'j
2
. ˆ ¡
2
`
Take any 'j. ˆ ¡` ∈ 1 ∗ C such that 'j. ˆ ¡` < 'j
2
. ˆ ¡
2
`. Then j '
ˆ
1 is dense, and therefore
there is some ˆ ¡
3
such that j ' ˆ ¡
3

ˆ
1 and j ' ˆ ¡
3
< ˆ ¡. So 'j. ˆ ¡
3
` < 'j. ˆ ¡` and 'j. ˆ ¡
3
` ∈ 1
t
.
Take any 'j
3
. ˆ ¡
3
` ∈ 1
t
¸
G. Then j
3
∈ G
P
, so ˆ ¡
3
∈ 1, and by the definition of G
Q
, ˆ ¡
3
∈ G
Q
.
G
P
∗ G
Q
= G
If G is a generic subset of 1 ∗ C, observe that:
G
P
∗ G
Q
= ¦'j. ˆ ¡` [ j
t
< j ∧ 'j
t
. ˆ ¡
t
` ∈ G∧ 'j
0
. ˆ ¡` ∈ G¦ for some j
t
. ˆ ¡
t
. j
0
If 'j. ˆ ¡` ∈ G then obviously this holds, so G ⊆ G
P
∗ G
Q
. Conversly, if 'j. ˆ ¡` ∈ G
P
∗ G
Q
then there exist j
t
. ˆ ¡
t
and j
0
such that 'j
t
. ˆ ¡
t
`. 'j
0
. ˆ ¡` ∈ G, and since G is directed, some
'j
1
. ˆ ¡
1
` ∈ G such that 'j
1
. ˆ ¡
1
` < 'j
t
. ˆ ¡
t
`. 'j
0
. ˆ ¡`. But then j
1
< j and j
1
' ˆ ¡
1
< ˆ ¡, and since
G is closed, 'j. ˆ ¡` ∈ G.
(G∗ H)
P
= G
Assume that G is generic in 1 and H is generic in C[G].
Suppose j ∈ (G ∗ H)
P
. Then there is some j
t
∈ 1 and some ˆ ¡ ∈ C such that j
t
< j and
'j
t
. ˆ ¡` ∈ G∗ H. By the definition of G∗ H, j
t
∈ G, and then since G is closed j ∈ G.
Conversely, suppose j ∈ G. Then (since H is non-trivial), 'j. ˆ ¡` ∈ G ∗ H for some ˆ ¡, and
therefore j ∈ (G∗ H)
P
.
(G∗ H)
Q
= H
Assume that G is generic in 1 and H is generic in C[G].
Given any ¡ ∈ H, there is some ˆ ¡ ∈ C such that ˆ ¡[G] = ¡, and so there is some j such that
'j. ˆ ¡` ∈ G∗ H, and therefore ˆ ¡[G] ∈ H.
On the other hand, if ¡ ∈ (G ∗ H)
Q
then there is some 'j. ˆ ¡` ∈ G ∗ H, and therefore some
ˆ ¡[G] ∈ H.
Version: 1 Owner: Henry Author(s): Henry
34.129 complete partial orders do not add small sub-
sets
Suppose P is a κ-complete partial order in M. Then for any generic subset G, M[G] contains
no bounded subsets of κ which are not already in M.
Version: 1 Owner: Henry Author(s): Henry
34.130 proof of complete partial orders do not add
small subsets
Take any x ∈ M[G] with x ⊆ κ. Let x̂ be a name for x. There is some p ∈ G such that
p ⊩ "x̂ is a subset of κ bounded by λ" for some λ < κ.
Outline:
For any q ≤ p, we construct by induction a sequence of elements q_α stronger than q. Each q_α
will determine whether or not α ∈ x̂. Since we know the subset is bounded below κ, we can
use the fact that P is κ-complete to find a single element stronger than q which fixes the
exact value of x̂. Since the sequence is definable in M, so is that value, so we can conclude
that above any element q ≤ p there is an element which forces x̂ ∈ M. Then p also forces
x̂ ∈ M, completing the proof.
Details:
Since forcing can be described within M, S = {q ∈ P | q ⊩ x̂ ∈ V} is a set in M. Then,
given any q ≤ p, we can define q_0 = q and, for any q_α (α < λ), q_{α+1} is an element of P
stronger than q_α such that either q_{α+1} ⊩ α + 1 ∈ x̂ or q_{α+1} ⊩ α + 1 ∉ x̂. For limit α, let
q′_α be any upper bound of the q_β for β < α (this exists since P is κ-complete and α < κ),
and let q_α be stronger than q′_α and satisfy either q_α ⊩ α ∈ x̂ or q_α ⊩ α ∉ x̂. Finally let q_∗
be an upper bound of the q_α for α < λ; q_∗ ∈ P since P is κ-complete.
Note that these elements all exist since for any p ∈ P and any (first-order) sentence φ there
is some q ≤ p such that q forces either φ or ¬φ.
q_∗ not only forces that x̂ is a bounded subset of κ, but for every ordinal it forces whether or
not that ordinal is contained in x̂. But the set {α < λ | q_∗ ⊩ α ∈ x̂} is definable in M, and
is of course equal to x̂[G_∗] in any generic G_∗ containing q_∗. So q_∗ ⊩ x̂ ∈ M.
Since below every element stronger than p there is such an element, it follows that p ⊩ x̂ ∈ M,
and therefore x̂[G] ∈ M.
Version: 1 Owner: Henry Author(s): Henry
34.131 ♦ is equivalent to ♣ and continuum hypothesis
If S is a stationary subset of κ and λ < κ implies 2^λ ≤ κ, then
♦_S ↔ ♣_S.
Moreover, this is the best possible result: ¬♦_S is consistent with ♣_S.
Version: 3 Owner: Henry Author(s): Henry
34.132 Levy collapse
Given any cardinals κ and λ in M, we can use the Levy collapse to give a new model
M[G] in which λ has cardinality κ. Let P = Levy(κ, λ) be the set of partial functions
f : κ → λ with |dom(f)| < κ. These functions each give partial information about a function
F which collapses λ onto κ.
Given any generic subset G of P, M[G] contains the set G, so let F = ⋃G. Each element of
G is a partial function, and they are all compatible, so F is a function. dom(F) = κ since for
each α < κ the set of f ∈ P such that α ∈ dom(f) is dense (given any function without α, it
is trivial to add (α, 0), giving a stronger function which includes α). Also range(F) = λ since
the set of f ∈ P such that α < λ is in the range of f is again dense (the domain of each f is
bounded, so if β is larger than any element of dom(f), then f ∪ {(β, α)} is stronger than f
and includes α in its range).
So F is a surjective function from κ to λ, and λ is collapsed in M[G]. In addition,
|Levy(κ, λ)| = λ, so it satisfies the λ^+ chain condition, and therefore λ^+ is not collapsed and
becomes κ^+ (since for any ordinal between λ and λ^+ there is already a surjective function
onto it from λ).
We can generalize this by forcing with P = Levy(κ, <λ) for λ regular, the set of partial
functions f : λ × κ → λ such that f(0, α) = 0, |dom(f)| < κ and, if α > 0, f(α, i) < α. In
essence, this is the union of Levy(κ, η) for each κ < η < λ.
In M[G], define F = ⋃G and F_α(β) = F(α, β). Each F_α is a function from κ to α, and by
the same argument as above F_α is both total and surjective. Moreover, it can be shown that
P satisfies the λ chain condition, so λ does not collapse and λ = κ^+ in M[G].
Version: 2 Owner: Henry Author(s): Henry
34.133 proof of ♦ is equivalent to ♣ and continuum hypothesis
The proofs that ♦_S implies ♣_S and that ♦_S implies 2^λ ≤ κ for every λ < κ are given in the
entries for ♦_S and ♣_S.
Let A = ⟨A_α⟩_{α∈S} be a sequence which satisfies ♣_S.
Since there are only κ bounded subsets of κ, there is a surjective function 1 : κ →
Bounded(κ) κ where Bounded(κ) is the bounded subsets of κ. Define a sequence 1 =
'1
α
`
α<κ
by 1
α
= 1(α) if :nj(1
α
) < α and ∅ otherwise. Since the set of (1
α
. λ) ∈
Bounded(κ) κ such that 1
α
= 1 is unbounded for any bounded subset 1, it follow that
every bounded subset of κ occurs κ times in 1.
We can define a new sequence, 1 = '1
α
`
α∈S
such that r ∈ 1
α
↔r ∈ 1
β
for some β ∈ ¹
α
.
We can show that 1 satisfies Q
S
.
First, for any α, r ∈ 1
α
means that r ∈ 1
β
for some β ∈ ¹
α
, and since 1
β
⊆ β ∈ ¹
α
⊆ α,
we have 1
α
⊆ α.
Next take any 1 ⊆ κ. We consider two cases:
1 is bounded
The set of α such that 1 = 1
α
forms an unbounded sequence 1
t
, so there is a stationary
o
t
⊆ o such that α ∈ o
t
↔ ¹
α
⊂ 1
t
. For each such α, r ∈ 1
α
↔ r ∈ 1
i
for some
i ∈ ¹
α
⊂ 1
t
. But each such 1
i
is equal to 1, so 1
α
= 1.
1 is unbounded
We define a function , : κ →κ as follows:
• ,(0) = 0
• To find ,(α), take A
¸
¦,(β) [ β < α¦. This is a bounded subset of κ, so is equal to
an unbounded series of elements of 1. Take ,(α) = γ, where γ is the least number
greater than any element of ¦α¦
¸
¦,(β) [ β < α¦ such that 1
γ
= A
¸
¦,(β) [ β < α¦.
Let 1
t
= range(,). This is obviously unbounded, and so there is a stationary o
t
⊆ o such
that α ∈ o
t
↔¹
α
⊆ 1
t
.
Next, consider (, the set of ordinals less than κ closed under ,. Clearly it is unbounded,
since if λ < κ then ,(λ) includes ,(α) for α < λ, and so induction gives an ordinal greater
than λ closed under , (essentially the result of applying , an infinite number of times). Also,
( is closed: take any c ⊆ ( and suppose sup(c
¸
α) = α. Then for any β < α, there is some
γ ∈ c such that β < γ < α and therefore ,(β) < γ. So α is closed under ,, and therefore
contained in (.
Since ( is a club, (
t
= (
¸
o
t
is stationary. Suppose α ∈ (
t
. Then r ∈ 1
α
↔ r ∈ 1
β
where β ∈ ¹
α
. Since α ∈ o
t
, β ∈ range(,), and therefore 1
β
⊆ 1. Next take any r ∈ 1
¸
α.
Since α ∈ (, it is closed under ,, hence there is some γ ∈ α such that ,(r) ∈ γ. Since
sup(¹
α
) = α, there is some η ∈ ¹
α
such that γ < η, so ,(r) ∈ η. Since η ∈ ¹
α
, 1
η
⊆ 1
α
,
and since η ∈ range(,), ,(δ) ∈ 1
η
for any δ < ,
−1
(η), and in particular r ∈ 1
η
. Since we
showed above that 1
α
⊆ α, we have 1
α
= 1
¸
α for any α ∈ (
t
.
Version: 3 Owner: Henry Author(s): Henry
34.134 Martin's axiom
For any cardinal κ, Martin's Axiom for κ (MA_κ) states that if P is a partial order
satisfying the ccc then, given any set of κ dense subsets of P, there is a directed subset of P
intersecting each such dense subset. Martin's Axiom (MA) states that MA_κ holds for every
κ < 2^{ℵ_0}.
Version: 3 Owner: Henry Author(s): Henry
34.135 Martin's axiom and the continuum hypothesis
MA_{ℵ_0} always holds
Given a countable collection of dense subsets of a partial order, we can select a sequence
⟨p_n⟩_{n<ω} such that p_n is in the n-th dense subset and p_{n+1} ≤ p_n for each n; the p_n then
generate a directed set meeting every dense subset in the collection. In particular, CH implies
MA.
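The construction is the same one used to meet dense sets in concrete posets; a minimal sketch (illustrative only, using the poset of finite partial functions from the natural numbers to {0, 1}, ordered by extension, and the dense sets D_n = {p : n ∈ dom(p)}):

def meet_D_n(p, n):                    # an extension of p lying in D_n
    return p if n in p else {**p, n: 0}

p = {}
for n in range(10):                    # walk down through finitely many of the dense sets
    p = meet_D_n(p, n)
assert all(n in p for n in range(10))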
If MA_κ holds then 2^{ℵ_0} > κ, and in fact 2^κ = 2^{ℵ_0}. Since κ ≥ ℵ_0 we have 2^κ ≥ 2^{ℵ_0}, hence
it will suffice to find a surjective function from P(ℵ_0) onto P(κ).
Let A = ⟨A_α⟩_{α<κ} be a sequence of infinite subsets of ω such that for any α ≠ β, A_α ∩ A_β is
finite.
Given any subset S ⊆ κ we will construct a function f : ω → {0, 1} such that S can be recovered
from f; f will have the property that if i ∈ S then f(a) = 0 for only finitely many elements
a ∈ A_i, and if i ∉ S then f(a) = 0 for infinitely many elements of A_i.
Let 1 be the partial order (under inclusion) such that each element j ∈ 1 satisfies:
• j is a partial function from ω to ¦0. 1¦
• There exist i
1
. . . . . i
n
∈ o such that for each , < :, ¹
i
j
⊆ dom(j)
• There is a finite subset of ω, u
p
, such that u
p
= dom(j) −
¸
j<n
¹
i
j
• For each , < :, j(c) = 0 for finitely many elements of ¹
i
j
This satisfies ccc. To see this, consider any uncountable sequence o = 'j
α
`
α<ω
1
of elements
of 1. There are only countably many finite subsets of ω, so there is some u ⊆ ω such that
u = u
p
for uncountably many j ∈ o and j ` u is the same for each such element. Since each
of these function’s domain covers only a finite number of the ¹
α
, and is 1 on all but a finite
number of elements in each, there are only a countable number of different combinations
available, and therefore two of them are compatible.
Consider the following groups of dense subsets:
• 1
n
= ¦j ∈ 1 [ : ∈ dom(j)¦ for : < ω. This is obviously dense since any j not already
in 1
n
can be extended to one which is by adding ':. 1`
• 1
α
= ¦j ∈ 1 [ dom(j) ⊇ ¹
α
¦ for α ∈ o. This is dense since if j ∈ 1
α
then
j
¸
¦'c. 1` [ c ∈ ¹
α
` dom(j)¦ is.
• For each α ∈ o, : < ω, 1
n,α
= ¦j ∈ 1 [ : ` : ∧ j(:) = 0¦ for some : < ω. This
is dense since if j ∈ 1
n,α
then dom(j)
¸
¹
α
= ¹
α
¸

u
p
¸¸
j
¹
i
j

. But u
p
is finite,
and the intersection of ¹
α
with any other ¹
i
is finite, so this intersection is finite,
and hence bounded by some :. ¹
α
is infinite, so there is some : < r ∈ ¹
α
. So
j
¸
¦'r. 0`¦ ∈ 1
n,α
.
By `¹
κ
, given any set of κ dense subsets of 1, there is a generic G which intersects all of
them. There are a total of ℵ
0
+[o[ + (κ −[o[) ℵ
0
= κ dense subsets in these three groups,
and hence some generic G intersecting all of them. Since G is directed, o =
¸
G is a partial
function from ω to ¦0. 1¦. Since for each : < ω, G
¸
1
n
is non-empty, : ∈ dom(o), so o is
a total function. Since G
¸
1
α
for α ∈ o is non-empty, there is some element of G whose
domain contains all of ¹
α
and is 0 on a finite number of them, hence o(c) = 0 for a finite
number of c ∈ ¹
α
. Finally, since G
¸
1
n,α
for each : < ω, α ∈ o, the set of : ∈ ¹
α
such
that o(:) = 0 is unbounded, and hence infinite. So o is as promised, and 2
κ
= 2

0
.
Version: 1 Owner: Henry Author(s): Henry
34.136 Martin’s axiom is consistent
If κ is an uncountable strong limit cardinal such that κ^λ = κ for any λ < κ, then it is
consistent that 2^{ℵ_0} = κ and MA holds. This is shown by using finite support iterated
forcing to construct a model of ZFC in which this is true. Historically, this proof was the
motivation for developing iterated forcing.
Outline
The proof uses the convenient fact that MA_κ holds as long as it holds for all partial orders
smaller than κ. Given the conditions on κ, there are at most κ names for these partial orders.
At each step in the forcing, we force with one of these names. The result is that the actual
generic subset we add intersects every dense subset of every partial order.
Construction of 1
κ
ˆ
C
α
will be constructed by induction with three conditions: [1
α
[ < κ for all α < κ, '

ˆ
C
α

M, and 1
α
satisfies the ccc. Note that a partial ordering on a cardinal λ < κ is a function
from λ λ to ¦0. 1¦, so there are at most 2
λ
< κ of them. Since a canonical name for a
partial ordering of a cardinal is just a function from 1
α
to that cardinal, there are at most
κ
2
λ
< κ of them.
At each of the κ steps, we want to deal with one of these possible partial orderings, so we
need to partition the κ steps in to κ steps for each of the κ cardinals less than κ. In addition,
we need to include every 1
α
name for any level. Therefore, we partion κ into 'o
γ,δ
`
γ,δ<κ
for
each cardinal δ, with each o
γ,δ
having cardinality κ and the added condition that η ∈ o
γ,δ
implies η ` γ. Then each 1
γ
name for a partial ordering of δ is assigned some index η ∈ o
γ,δ
,
and that partial order will be dealt with at stage C
η
.
Formally, given
ˆ
C
β
for β < α, 1
α
can be constructed and the 1
α
names for partial orderings
of each cardinal δ enumerated by the elements of o
α,δ
. α ∈ o
γ,δ
for some γ
α
and δ
α
, and
α ` γ
α
so some canonical 1
γα
name for a partial order
ˆ
<
α
of δ
α
has already been assigned
to α.
Since
ˆ
<
α
is a 1
γα
name, it is also a 1
α
name, so
ˆ
C
α
can be defined as 'δ
α
.
ˆ
<
α
` if '


α
.
ˆ
<
α
`
satisfies the ccc and by the trivial partial order '1. ¦'1. 1`¦` otherwise. Obviously this
satisfies the ccc, and so 1
α+1
does as well. Since
ˆ
C
α
is either trivial or a cardinal together
with a canonical name, '

ˆ
C
α
⊆ M. Finally, [1
α+q
[ <
¸
n
[α[
n
(sup
i
[
ˆ
C
i
[)
n
< κ.
Proof that `¹
λ
holds for λ < κ
Lemma: It suffices to show that `¹
λ
holds for partial order with size < λ
S uppose 1 is a partial order with [1[ κ and let '1
α
`
α<λ
be dense subsets of 1. Define
functions 1
i
: 1 → 1
α
for ακ with 1
α
(j) ` j (obviously such elements exist since 1
α
is
dense). Let o : 1 1 → 1 be a function such that o(j. ¡) ` j. ¡ whenever j and ¡ are
compatible. Then pick some element ¡ ∈ 1 and let C be the closure of ¦¡¦ under 1
α
and o
with the same ordering as 1 (restricted to C).
Since there are only κ functions being used, it must be that [C[ < κ. If j ∈ C then 1
α
(j) ` j
and clearly 1
α
(j) ∈ C
¸
1
α
, so each 1
α
¸
C is dense in C. In addition, C is ccc: if ¹ is an
antichain in C and j
1
. j
2
∈ ¹ then j
1
. j
2
are incompatible in C. But if they were compatible
in 1 then o(j
1
. j
2
) ` j
1
. j
2
would be an element of C, so they must be incompatible in 1.
Therefore ¹ is an antichain in 1, and therefore must have countable cardinality, since 1
satisfies the ccc.
By assumption, there is a directed G ⊆ C such that G
¸
(1
α
¸
C) = ∅ for each α < κ, and
therefore `¹
λ
holds in full.
Now we must prove that, if G is a generic subset of P_κ, Q is some partial order with |Q| < λ,
and ⟨D_α⟩_{α<λ} are dense subsets of Q, then there is some directed subset of Q intersecting each
D_α.

If |Q| < λ then λ additional elements can be added greater than any other element of Q to
make |Q| = λ, and then since there is an order isomorphism into some partial order of λ,
assume Q is a partial ordering of λ. Then let R = {⟨α, β⟩ | α ∈ D_β}.

Take canonical names so that Q = Q̂[G], R = R̂[G] and D_i = D̂_i[G] for each i < λ and:

  ⊩ "Q̂ is a partial ordering satisfying the ccc, R̂ ⊆ λ × λ, and D̂_α is dense in Q̂"

For any α, β there is a maximal antichain A_{α,β} ⊆ P_κ such that if p ∈ A_{α,β} then either
p ⊩ α <_Q̂ β or p ⊩ α ≮_Q̂ β, and another maximal antichain B_{α,β} ⊆ P_κ such that if
p ∈ B_{α,β} then either p ⊩ ⟨α, β⟩ ∈ R̂ or p ⊩ ⟨α, β⟩ ∉ R̂. These antichains determine the
value of those two formulas.

Then, since κ^{cf κ} > κ and κ^μ = κ for μ < κ, it must be that cf κ = κ, so κ is regular. Then
γ = sup{α + 1 | α ∈ dom(p), p ∈ ⋃_{α,β<λ} A_{α,β} ∪ B_{α,β}} < κ, so A_{α,β}, B_{α,β} ⊆ P_γ, and therefore
the P_κ names Q̂ and R̂ are also P_γ names.
Lemma: For any γ, G_γ = {p ↾ γ | p ∈ G} is a generic subset of P_γ.

First, it is directed, since if p_1 ↾ γ, p_2 ↾ γ ∈ G_γ then there is some p ∈ G such that
p ≤ p_1, p_2, and therefore p ↾ γ ∈ G_γ and p ↾ γ ≤ p_1 ↾ γ, p_2 ↾ γ.

Also, it is generic. If D is a dense subset of P_γ then D_κ = {p ∈ P_κ | p ≤ q for some q ∈ D} is dense in
P_κ, since if p ∈ P_κ then there is some d ≤ p ↾ γ with d ∈ D, but then d is compatible with p, so d ∪ p ∈ D_κ.
Therefore there is some p ∈ D_κ ∩ G, and so p ↾ γ ∈ D ∩ G_γ.
Since Q̂ and R̂ are P_γ names, Q̂[G] = Q̂[G_γ] = Q and R̂[G] = R̂[G_γ] = R, so

  V[G_γ] ⊨ "Q̂ is a partial ordering of λ satisfying the ccc and D̂_α is dense in Q̂"

Then there must be some p ∈ G_γ such that

  p ⊩ "Q̂ is a partial ordering of λ satisfying the ccc"

Let A_p be a maximal antichain of P_γ such that p ∈ A_p, and define <̂ as a P_γ name with
⟨p, n⟩ ∈ <̂ for each n ∈ Q̂ and ⟨a, n⟩ ∈ <̂ if n = ⟨α, β⟩ where α < β < λ and a ∈ A_p, a ≠ p.
That is, <̂[G] = Q when p ∈ G and <̂[G] = ∈ ↾ λ (the usual ordering of λ) otherwise. Then this is the name for a
partial ordering of λ, and therefore there is some η ∈ S_{γ,λ} such that <̂ = <̂_η, and η ≥ γ.

Since p ∈ G_γ ⊆ G_η, Q̂_η[G_η] = <̂_η[G_η] = Q.

Since P_{η+1} = P_η ∗ Q̂_η, we know that G′ ⊆ Q_η is generic, since forcing with the composition is equivalent to successively forcing with each part.

Since D_i ∈ V[G_γ] ⊆ V[G_η] and is dense, it follows that D_i ∩ G′ ≠ ∅, and since G′ is a
subset of Q in V[G_κ], MA_λ holds.
Proof that 2^{ℵ_0} = κ

The relationship between Martin's axiom and the continuum hypothesis tells us that 2^{ℵ_0} ≥
κ. Since 2^{ℵ_0} was less than κ in V, and since |P_κ| = κ adds at most κ elements, it must be
that 2^{ℵ_0} = κ.
Version: 3 Owner: Henry Author(s): Henry
34.137 a shorter proof: Martin’s axiom and the con-
tinuum hypothesis
This is another, shorter proof of the fact that MA_{ℵ_0} always holds.

Let (P, ≤) be a partially ordered set and D be a collection of subsets of (P, ≤). We recall
that a filter G on (P, ≤) is D-generic if G ∩ D ≠ ∅ for all D ∈ D which are dense in (P, ≤).
("Dense" in this context means: if D is dense in (P, ≤), then for every p ∈ P there is a d ∈ D
such that d ≤ p.)

Let (P, ≤) be a partially ordered set and D a countable collection of dense subsets of P; then
there exists a D-generic filter G on P. Moreover, it can be shown that for every p ∈ P
there is such a D-generic filter G with p ∈ G.

Let D_1, …, D_n, … be the dense subsets in D. Furthermore let p_0 = p. Now we can choose
for every 1 ≤ n < ω an element p_n ∈ P such that p_n ≤ p_{n−1} and p_n ∈ D_n. If we now consider
the set G := {q ∈ P : ∃ n < ω s.t. p_n ≤ q}, then it is easy to check that G is a D-generic
filter on P and p ∈ G obviously. This completes the proof.
Version: 4 Owner: x bas Author(s): x bas
34.138 continuum hypothesis
The Continuum Hypothesis states that there is no cardinal number κ such that ℵ_0 < κ < 2^{ℵ_0}.

An equivalent statement is that ℵ_1 = 2^{ℵ_0}.
It is known to be independent of the axioms of ZFC.
The continuum hypothesis can also be stated as: there is no subset of the real numbers
which has cardinality strictly between that of the reals and that of the integers. It is from
this that the name comes, since the set of real numbers is also known as the continuum.
Version: 8 Owner: Evandar Author(s): Evandar
34.139 forcing
Forcing is the method used by Paul Cohen to prove the independence of the continuum hypothesis
(CH). In fact, the method was used by Cohen to prove that CH could be violated. The treat-
ment I give here is VERY informal. I will develop it later. First let me give an example from
algebra.
Suppose we have a field k, and we want to add to this field an element α such that α² = −1.
We see that we cannot simply drop a new α into k, since then we are not guaranteed that we
still have a field. Neither can we simply assume that k already has such an element. The
standard way of doing this is to start by adjoining a generic indeterminate X, and impose a
constraint on X, saying that X² + 1 = 0. What we do is take the quotient k[X]/(X² + 1),
and make a field out of it by taking the quotient field. We then obtain k(α), where α is the
equivalence class of X in the quotient. The general case of this is the theorem of algebra
saying that every polynomial p over a field k has a root in some extension field.
We can rephrase this and say that “it is consistent with standard field theory that −1 have
a square root”.
When the theory we consider is ZFC, we run into exactly the same problem: we can't just
add a "new" set and pretend it has the required properties, because then we may violate
something else, like foundation. Let M be a transitive model of set theory, which we call
the ground model. We want to "add a new set" S to M in such a way that the extension
M′ has M as a subclass, the properties of M are preserved, and S ∈ M′.

The first step is to "approximate" the new set using elements of M. This is the analogue of
finding the irreducible polynomial in the algebraic example. The set P of such "approximations"
can be ordered by how much information the approximations give: let p, q ∈ P; then
p ≤ q if and only if p "is stronger than" q. We call this set a set of forcing conditions.
Furthermore, it is required that the set P itself and the order relation be elements of M.
Since P is a partial order, some of its subsets have interesting properties. Consider P as
a topological space with the order topology. A subset D ⊆ P is dense in P if and only if
for every p ∈ P, there is d ∈ D such that d ≤ p. A filter in P is said to be M-generic if
and only if it intersects every one of the dense subsets of P which are in M. An M-generic
filter in P is also referred to as a generic set of conditions in the literature. In general,
even though P is a set in M, generic filters are not elements of M.
If P is a set of forcing conditions, and G is a generic set of conditions in P, all in the ground
model M, then we define M[G] to be the least model of ZFC that contains G. In forthcoming
entries I will detail the construction of M[G]. The big theorem is this:

Theorem 5. M[G] is a model of ZFC, has the same ordinals as M, and M ⊆ M[G].
The way to prove that we can violate CH using a generic extension is to add many new
"subsets of ω" in the following way: let M be a transitive model of ZFC, and let (P, ≤) be
the set (in M) of all functions p whose domain is a finite subset of ℵ_2 × ℵ_0, and whose range
is the set {0, 1}. The ordering here is p ≤ q if and only if p ⊇ q. Let G be a generic set of
conditions in P. Then ⋃G is a total function whose domain is ℵ_2 × ℵ_0, and whose range is {0, 1}.
We can see this function as coding ℵ_2 new functions f_α : ℵ_0 → {0, 1}, α < ℵ_2, which are subsets of
ω. These functions are all distinct, and so CH is violated in M[G].
All this relies on a proper definition of the satisfaction relation in M[G], and the forcing relation,
which will come in a forthcoming entry. Details can be found in Thomas Jech’s book Set
Theory.
Version: 6 Owner: jihemme Author(s): jihemme
34.140 generalized continuum hypothesis
The generalized continuum hypothesis states that for any infinite cardinal λ there is no
cardinal κ such that λ < κ < 2^λ.

Equivalently, for every ordinal α, ℵ_{α+1} = 2^{ℵ_α}.
Like the continuum hypothesis, the generalized continuum hypothesis is known to be independent
of the axioms of ZFC.
Version: 7 Owner: Evandar Author(s): Evandar
34.141 inaccessible cardinals
A limit cardinal κ is a strong limit cardinal if for any λ < κ, 2^λ < κ.
A regular limit cardinal κ is called weakly inaccessible, and a regular strong limit cardinal
is called inaccessible.
Version: 2 Owner: Henry Author(s): Henry
34.142 ♦_S

♦_S is a combinatoric principle regarding a stationary set S ⊆ κ. It holds when there is a
sequence ⟨A_α⟩_{α∈S} such that each A_α ⊆ α and for any A ⊆ κ, {α ∈ S | A ∩ α = A_α} is
stationary.

To get some sense of what this means, observe that for any λ < κ, {λ} ⊆ κ, so the set of α with
A_α = {λ} is stationary (in κ). More strongly, suppose κ > λ. Then any subset B ⊂ λ is
bounded in κ, so A_α = B on a stationary set. Since |S| = κ, it follows that 2^λ ≤ κ. Hence
♦_{ℵ_1}, the most common form (often written as just ♦), implies CH.
Version: 3 Owner: Henry Author(s): Henry
34.143 ♣_S

♣_S is a combinatoric principle weaker than ♦_S. It states that, for S stationary in κ, there
is a sequence ⟨A_α⟩_{α∈S} such that A_α ⊆ α and sup(A_α) = α, and with the property that for
each unbounded subset B ⊆ κ there is some A_α ⊆ B.

Any sequence satisfying ♦_S can be adjusted so that sup(A_α) = α, so this is indeed a weakened
form of ♦_S.

Any such sequence actually contains a stationary set of α such that A_α ⊆ B for each B:
given any club C and any unbounded B, construct a κ-sequence, C* and B*, from the
elements of each, such that the α-th member of C* is greater than the α-th member of B*,
which is in turn greater than any earlier member of C*. Since both sets are unbounded, this
construction is possible, and B* is a subset of B still unbounded in κ. So there is some α
such that A_α ⊆ B*, and since sup(A_α) = α, α is also the limit of a subsequence of C* and
therefore an element of C.
Version: 1 Owner: Henry Author(s): Henry
34.144 Dedekind infinite
A set A is said to be Dedekind infinite if there is an injective function f : ω → A, where
ω denotes the set of natural numbers.
A Dedekind infinite set is certainly infinite, and if the axiom of choice is assumed, then an
infinite set is Dedekind infinite. However, it is consistent with the failure of the axiom of
choice that there is a set which is infinite but not Dedekind infinite.
Version: 4 Owner: Evandar Author(s): Evandar
34.145 Zermelo-Fraenkel axioms
Equality of sets: If X and Y are sets, and x ∈ X iff x ∈ Y, then X = Y.
Pair set: If X and Y are sets, then there is a set Z containing only X and Y.
Union over a set: If X is a set, then there exists a set that contains every element of each x ∈ X.
Axiom of power set: If X is a set, then there exists a set P(X) with the property that Y ∈ P(X) iff any element y ∈ Y is also in X.
Replacement axiom: Let F(x, y) be some formula. If, for all x, there is exactly one y such that F(x, y) is true, then for any set A there exists a set B with the property that b ∈ B iff there exists some a ∈ A such that F(a, b) is true.
Regularity axiom: Let F(x) be some formula. If there is some x that makes F(x) true, then there is a set Y such that F(Y) is true, but for no y ∈ Y is F(y) true.
Existence of an infinite set: There exists a non-empty set X with the property that, for any x ∈ X, there is some y ∈ X such that x ⊆ y but x ≠ y.

Ernst Zermelo and Abraham Fraenkel proposed these axioms as a foundation for what is now called Zermelo-Fraenkel set theory, or ZF. If these axioms are accepted along with the axiom of choice, the resulting theory is often denoted ZFC.
Version: 10 Owner: mathcam Author(s): mathcam, vampyr
34.146 class
By a class in modern set theory we mean an arbitrary collection of elements of the universe.
All sets are classes (as they are collections of elements of the universe - which are usually
sets, but could also be urelements), but not all classes are sets. Classes which are not sets
are called proper classes.
The need for this distinction arises from the paradoxes of the so called naive set theory. In
naive set theory one assumes that to each possible division of the universe into two disjoint
and mutually comprehensive parts there corresponds an entity of the universe, a set. This is
the contents of Frege's famous fifth axiom, which states that to each second order predicate
P there corresponds a first order object p called the extension of P, s.t. ∀x(P(x) ↔ x ∈ p).
(Every predicate P divides the universe into two mutually comprehensive and disjoint parts;
namely the part which consists of objects for which P holds and the part consisting of objects
for which P does not hold).
Speaking in modern terms we may view the situation as follows. Consider a model of set
theory M. The interpretation the model gives to ∈ implicitly defines a function E : P(M) → M.
Seen this way, the fact that not all classes can be sets simply means that we can't
injectively map the powerset of any set into the set itself, which is a famous result by
Cantor. Functions like E here are known as extensors and they have been used in the study
of semantics of set theory.
Russell’s paradox - which could be seen as a proof of Cantor’s theorem about cardinalities
of powersets - shows that Frege’s fifth axiom is contradictory; not all classes can be sets.
From here there are two traditional ways to proceed: either through the theory of types or
through some form of limitation of size principle.
The limitation of size principle in its vague form says that all small classes (in the sense of
cardinality) are sets, while all proper classes are very big; “too big” to be sets. The limitation
of size principle can be found in Cantor’s work where it is the basis for Cantor’s doctrine that
only transfinite collections can be thought as specific objects (sets), but some collections are
“absolutely infinite”, and can’t be thought to be comprehended into an object. This can be
given a precise formulation: all classes which are of the same cardinality as the universal
class are too big, and all other classes are small. In fact, this formulation can be used
in von Neumann-Bernays-G¨odel set theory to replace the replacement axiom and almost all
other set existence axioms (with the exception of the powerset axiom).
The limitation of size principle can be seen to give rise to extensors of type P_{<|A|}(A) → A.
(Here P_{<|A|}(A) is the set of all subsets of A which are of cardinality less than that of A.) This is
not the only possible way to avoid Russell's paradox. We could use an extensor according
to which all classes which are of cardinality less than that of the universe, or for which the
cardinality of their complement is less than that of the universe, are sets (i.e. map into
elements of the model).
In many set theories there are formally no proper classes; ZFC is an example of just such a
set theory. In these theories one usually means by a proper class an open formula Φ, possibly
with set parameters a_1, …, a_n. Notice, however, that these do not exhaust all possible proper
classes that should "really" exist for the universe, as this only allows us to deal with proper
classes that can be defined by means of an open formula with parameters. The theory NBG
formalises this usage: it's conservative over ZFC (as clearly speaking about open formulae
with parameters must be!).

There is a set theory known as Morse-Kelley set theory which allows us to speak about and
to quantify over an extended class of impredicatively defined proper classes that can't be
reduced to simply speaking about open formulae.
Version: 5 Owner: Aatu Author(s): Aatu
34.147 complement
Let A be a subset of B. The complement of A in B (denoted A^c when the larger set B is
clear from context) is the set difference B ∖ A.
Version: 1 Owner: djao Author(s): djao
34.148 delta system
If S is a set of finite sets, then S is a ∆-system if there is some (possibly empty) X such
that for any a, b ∈ S, if a ≠ b then a ∩ b = X.
Version: 2 Owner: Henry Author(s): Henry
34.149 delta system lemma
If S is a set of finite sets such that |S| = ℵ_1 then there is an S′ ⊆ S such that |S′| = ℵ_1 and
S′ is a ∆-system.
Version: 3 Owner: Henry Author(s): Henry
34.150 diagonal intersection
If ⟨S_i⟩_{i<α} is a sequence of sets then the diagonal intersection, ∆_{i<α} S_i, is defined to be
{β < α | β ∈ ⋂_{γ<β} S_γ}.

That is, β is in ∆_{i<α} S_i if it is contained in the first β members of the sequence.
Version: 2 Owner: Henry Author(s): Henry
34.151 intersection
The intersection of two sets A and B is the set that contains all the elements x such that
x ∈ A and x ∈ B. The intersection of A and B is written as A ∩ B.

Example. If A = {1, 2, 3, 4, 5} and B = {1, 3, 5, 7, 9} then A ∩ B = {1, 3, 5}.

We can also define the intersection of an arbitrary number of sets. If {A_j}_{j∈J} is a family of
sets, we define the intersection of all of them, denoted ⋂_{j∈J} A_j, as the set consisting of those
elements belonging to every set A_j:

  ⋂_{j∈J} A_j = {x : x ∈ A_j for all j ∈ J}.
Version: 7 Owner: drini Author(s): drini, xriso
34.152 multiset
A multiset is a set for which duplicate elements are allowed.
For example, {1, 1, 3} is a multiset, but not a set.
Version: 2 Owner: akrowne Author(s): akrowne
34.153 proof of delta system lemma
Since there are only ℵ_0 possible cardinalities for any element of S, there must be some
n such that there are an uncountable number of elements of S with cardinality n. Let
S* = {a ∈ S | |a| = n} for this n. By induction on n, the lemma holds:

If n = 1 then each element of S* is distinct, and has no intersection with the others,
so X = ∅ and S′ = S*.

Suppose n > 1. If there is some x which is in an uncountable number of elements of S* then
take S** = {a ∖ {x} | x ∈ a ∈ S*}. Obviously this is uncountable and every element has n − 1
elements, so by the induction hypothesis there is some S′ ⊆ S** of uncountable cardinality
such that the intersection of any two elements is X. Obviously {a ∪ {x} | a ∈ S′} satisfies
the lemma, since the intersection of any two elements is X ∪ {x}.

On the other hand, if there is no such x then we can construct by induction a sequence ⟨a_i⟩_{i<ω_1} such that
each a_i ∈ S* and for any i ≠ j, a_i ∩ a_j = ∅. Take any element for a_0, and
given ⟨a_i⟩_{i<α}, since α is countable, A = ⋃_{i<α} a_i is countable. Obviously each element of
A is in only a countable number of elements of S*, so there are an uncountable number of
elements of S* which are candidates for a_α. Then this sequence satisfies the lemma, since
the intersection of any two elements is ∅.
Version: 2 Owner: Henry Author(s): Henry
34.154 rational number
The rational numbers ℚ are the fraction field of the ring ℤ of integers. In more elementary
terms, a rational number is a quotient a/b of two integers a and b. Two fractions a/b and
c/d are equivalent if the product of the cross terms is equal:

  a/b = c/d  ⟺  ad = bc

Addition and multiplication of fractions are given by the formulae

  a/b + c/d = (ad + bc)/(bd)
  (a/b)·(c/d) = (ac)/(bd)

The field of rational numbers is an ordered field, under the ordering relation: a/b < c/d if
the inequality a·d < b·c holds in the integers.
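The arithmetic above can be checked mechanically. The following is a small illustrative sketch (added here, not part of the original entry) using Python's standard fractions module, which implements exactly this equivalence-class construction.

```python
from fractions import Fraction

# Two representations of the same rational: 2/4 and 1/2 (cross terms: 2*2 = 4*1).
assert Fraction(2, 4) == Fraction(1, 2)

# Addition and multiplication follow a/b + c/d = (ad + bc)/(bd) and
# (a/b)*(c/d) = (ac)/(bd), with the result reduced to lowest terms.
a, b, c, d = 1, 3, 2, 5
assert Fraction(a, b) + Fraction(c, d) == Fraction(a*d + b*c, b*d)
assert Fraction(a, b) * Fraction(c, d) == Fraction(a*c, b*d)

# The ordering a/b < c/d corresponds to a*d < b*c (for positive denominators).
assert (Fraction(a, b) < Fraction(c, d)) == (a*d < b*c)
```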
Version: 7 Owner: djao Author(s): djao
34.155 saturated (set)
If p : X → Y is a surjective map, we say that a subset C ⊆ X is saturated (with respect to
p) if C contains every set p⁻¹({y}) it intersects. Equivalently, C is saturated if it is a union
of fibres.
Version: 2 Owner: dublisk Author(s): dublisk
34.156 separation and doubletons axiom
• Separation axiom: If X is a set and P is a condition on sets, there exists a set
Y whose members are precisely the members of X satisfying P. Common notation:
Y = {A ∈ X | P(A)}.

• Doubletons axiom (or Pairs): If X and Y are sets there is a set Z whose only
members are X and Y. Common notation: Z = {X, Y}.
REFERENCES
1. G.M. Bergman, An Invitation to General Algebra and Universal Constructions.
Version: 3 Owner: vladm Author(s): vladm
34.157 set
34.157.1 Introduction
A set is a collection, group, or conglomerate.¹
Sets can be of “real” objects or mathematical objects; but the sets themselves are purely
conceptual. This is an important point to note: the set of all cows (for example) does not
physically exist, even though the cows do. The set is a “gathering” of the cows into one
conceptual unit that is not part of physical reality. This makes it easy to see why we can
have sets with an infinite number of elements; even though we may not be able to point out
infinitely many objects in the real world, we can construct conceptual sets with an infinite number
of elements (see the examples below).
Mathematics is thus built upon sets of purely conceptual, or mathematical, objects. Sets
are usually denoted by upper-case roman letters (like S). Sets can be defined by listing the
members, as in

  S = {a, b, c, d}

Or, a set can be defined from a formula. This type of statement defining a set is of the form

  S = {x : P(x)}

where S is the symbol denoting the set, x is the variable we are introducing to represent a
generic element of the set, and P(x) is some property that is true for values x within S (that
is, x ∈ S iff P(x) holds). (We denote "and" by comma-separated clauses in P(x). Also note
that the x : portion of the set definition may contain a qualification which narrows the values of
x to some other set which is already known.)
Sets are, in fact, completely defined by their elements. If two sets have the same elements,
they are equivalent. This is called the axiom of extensionality, and it is one of the most
important characteristics of sets that distinguishes them from predicates or properties.
1
However, not every collection has to be a set (in fact, all collections can’t be sets). See proper class for
more details.
The symbol ∈ denotes membership in a set. For example,

  m ∈ S

would be read "m is an element of S", or "S contains m".
Some examples of sets, with formal definitions, are:

• The set of all even integers: {x ∈ Z : 2 | x}
• The set of all prime numbers: {p ∈ N : ∀x ∈ N, x | p ⇒ x ∈ {1, p}}, where ⇒ denotes
implies and | denotes divides.
• The set of all real functions of one real parameter: {f(x) ∈ R : x ∈ R}
• The set of all isosceles triangles: {△ABC : (AB = BC) ∨ (BC = AC)}, where the overline
denotes segment length.
Z, N, and R are all standard sets: the integers, the natural numbers, and the real numbers,
respectively. These are all infinite sets.
The most basic set is the empty set (denoted ∅ or ¦¦).
The astute reader may have noticed that all of our examples of sets utilize sets, which does
not suffice for rigorous definition. We can be more rigorous if we postulate only the empty
set, and define a set in general as anything which one can construct from the empty set and
the ZFC axioms.
All objects in modern mathematics are constructed via sets.
34.157.2 Set Notions
An important set notion is cardinality. Cardinality is roughly the same as the intuitive
notion of "size". For sets with a finite number of elements, cardinality can simply be
thought of as size. However, intuition breaks down for sets with an infinite
number of elements. For more detail, see the cardinality entry.
Another important set concept is that of subsets. A subset B of a set A is any set which
contains only elements that appear in A. Subsets are denoted with the ⊆ symbol, i.e. B ⊆ A.
Also useful is the notion of a proper subset, denoted B ⊂ A, which adds the restriction
that B must not be all of A (that is, B ≠ A).
34.157.3 Set Operations
There are a number of standard (common) operations which are used to manipulate sets,
producing new sets from combinations of existing sets (sometimes with entirely different
types of elements). These standard operations are:
• union
• intersection
• set difference
• symmetric set difference
• complement
• cartesian product
Version: 5 Owner: akrowne Author(s): akrowne
Chapter 35
03Exx – Set theory
35.1 intersection of sets
Let X, Y be sets. The intersection of X and Y, denoted X ∩ Y, is the set

  X ∩ Y = {z : z ∈ X and z ∈ Y}
Version: 3 Owner: drini Author(s): drini, apmxi
Chapter 36
03F03 – Proof theory, general
36.1 NJp

NJp is a natural deduction proof system for intuitionistic propositional logic. Its only
axiom is α ⇒ α for any atomic α. Its rules are:

  Γ ⇒ α
  ─────────────────────────  (∨I)
  Γ ⇒ α ∨ β    Γ ⇒ β ∨ α

  Γ ⇒ α ∨ β    Σ, α⁰ ⇒ φ    Π, β⁰ ⇒ φ
  ───────────────────────────────────  (∨E)
  [Γ, Σ, Π] ⇒ φ

The syntax α⁰ indicates that the rule also holds if that formula is omitted.

  Γ ⇒ α    Σ ⇒ β
  ───────────────  (∧I)
  [Γ, Σ] ⇒ α ∧ β

  Γ ⇒ α ∧ β
  ─────────────────  (∧E)
  Γ ⇒ α    Γ ⇒ β

  Γ, α ⇒ β
  ──────────  (→I)
  Γ ⇒ α → β

  Γ ⇒ α → β    Σ ⇒ α
  ───────────────────  (→E)
  [Γ, Σ] ⇒ β

  Γ ⇒ ⊥
  ──────  (⊥_i)   where α is atomic
  Γ ⇒ α
Version: 3 Owner: Henry Author(s): Henry
36.2 NKp

NKp is a natural deduction proof system for classical propositional logic. It is identical to
NJp except that it replaces the rule ⊥_i with the rule:

  Γ, ¬α ⇒ ⊥
  ──────────  (⊥_c)   where α is atomic
  Γ ⇒ α
Version: 1 Owner: Henry Author(s): Henry
36.3 natural deduction
Natural deduction refers to related proof systems for several different kinds of logic, intended
to be similar to the way people actually reason. Unlike many other proof systems, it has
many rules and few axioms. Sequents in natural deduction have only one formula on the
right side.
Typically the rules consist of one pair for each connective, one of which allows the introduc-
tion of that symbol and the other its elimination.
To give one example, the proof rules →I and →E are:

  Γ, α ⇒ β
  ──────────  (→I)
  Γ ⇒ α → β

and

  Γ ⇒ α → β    Σ ⇒ α
  ───────────────────  (→E)
  [Γ, Σ] ⇒ β
Version: 1 Owner: Henry Author(s): Henry
36.4 sequent
A sequent represents a formal step in a proof. Typically it consists of two lists of formulas,
one representing the premises and one the conclusions. A typical sequent might be:
  φ, ψ ⇒ α, β
This claims that, from premises φ and ψ either α or β must be true. Note that ⇒ is not
a symbol in the language, rather it is a symbol in the metalanguage used to discuss proofs.
Also, notice the asymmetry: everything on the left must be true to conclude only one thing
on the right. This does create a different kind of symmetry, since adding formulas to either
side results in a weaker sequent, while removing them from either side gives a stronger one.
Some systems allow only one formula on the right.
Most proof systems provide ways to deduce one sequent from another. These rules are
written with a list of sequents above and below a line. Such a rule indicates that if everything
above the line is true, so is everything under the line. A typical rule is:

  Γ ⇒ Σ
  ─────────────────────
  Γ, α ⇒ Σ    α, Γ ⇒ Σ

This indicates that if we can deduce Σ from Γ, we can also deduce it from Γ together with
α.
Note that the capital greek letters are usually used to denote a (possibly empty) list of
formulas. [Γ. Σ] is used to denote the contraction of Γ and Σ, that is, the list of those
formulas appearing in either Γ or Σ but with no repeats.
Version: 5 Owner: Henry Author(s): Henry
36.5 sound, complete

If Th and Pr are two sets of facts (in particular, a theory of some language and the set of
things provable by some method) we say Pr is sound for Th if Pr ⊆ Th. Typically we
have a theory and a set of rules for constructing proofs, and we say the set of rules is sound
(which theory is intended is usually clear from context) since everything it proves is true
(in Th).

If Th ⊆ Pr we say Pr is complete for Th. Again, we usually have a theory and a set of
rules for constructing proofs, and say that the set of rules is complete since everything true
(in Th) can be proven.
Version: 4 Owner: Henry Author(s): Henry
Chapter 37
03F07 – Structure of proofs
37.1 induction
Induction is the name given to a certain kind of proof, and also to a (related) way of defining
a function. For a proof, the statement to be proved has a suitably ordered set of cases.
Some cases (usually one, but possibly zero or more than one), are proved separately, and
the other cases are deduced from those. The deduction goes by contradiction, as we shall
see. For a function, its domain is suitably ordered. The function is first defined on some
(usually nonempty) subset of its domain, and is then defined at other points r in terms of
its values at points n such that n < r.
37.1.1 Elementary proof by induction
Proof by induction is a variety of proof by contradiction, relying, in the elementary cases,
on the fact that every non-empty set of natural numbers has a least element. Suppose we
want to prove a statement P(n) which involves a natural number n. It is enough to prove:

1) If n ∈ ℕ, and P(m) is true for all m ∈ ℕ such that m < n, then P(n) is true.

or, what is the same thing,

2) If P(n) is false, then P(m) is false for some m < n.

To see why, assume that P(n) is false for some n. Then there is a smallest k ∈ ℕ such that
P(k) is false. Then, by hypothesis, P(m) is true for all m < k. By (1), P(k) is true, which is
a contradiction.
(If we don’t regard induction as a kind of proof by contradiction, then we have to think
of it as supplying some kind of sequence of proofs, of unlimited length. That’s not very
satisfactory, particularly for transfinite inductions, which we will get to below.)
Usually the initial case of n = 0, and sometimes a few cases, need to be proved separately,
as in the following example. Write P_n = ∑_{k=0}^{n} k². We claim

  P_n = n³/3 + n²/2 + n/6   for all n ∈ ℕ

Let us try to apply (1). We have the inductive hypothesis (as it is called)

  P_m = m³/3 + m²/2 + m/6   for all m < n

which tells us something if n > 0. In particular, setting m = n − 1,

  P_{n−1} = (n−1)³/3 + (n−1)²/2 + (n−1)/6

Now we just add n² to each side, and verify that the right side becomes n³/3 + n²/2 + n/6. This
proves (1) for nonzero n. But if n = 0, the inductive hypothesis is vacuously true, but of no
use. So we need to prove P(0) separately, which in this case is trivial.
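As a quick sanity check of the claimed closed form, one can compare the sum against n³/3 + n²/2 + n/6 for small n. The snippet below is an added illustration, not part of the original text; exact rational arithmetic is used so there is no rounding.

```python
from fractions import Fraction

def P(n):
    """Sum of k^2 for k = 0..n."""
    return sum(k * k for k in range(n + 1))

for n in range(20):
    closed_form = Fraction(n**3, 3) + Fraction(n**2, 2) + Fraction(n, 6)
    assert P(n) == closed_form
```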
Textbooks sometimes distinguish between weak and strong (or complete) inductive proofs.
A proof that relies on the inductive hypothesis (1) is said to go by strong induction. But in
the sum-of-squares formula above, we needed only the hypothesis P(n−1), not P(m) for all
m < n. For another example, a proof about the Fibonacci sequence might use just P(n−2)
and P(n−1). An argument using only P(n−1) is referred to as weak induction.
37.1.2 Definition of a function by induction
Let's begin with an example, the function ℕ → ℕ, n ↦ aⁿ, where a is some integer > 0.
The inductive definition reads

  a⁰ = 1
  aⁿ = a·(aⁿ⁻¹)   for all n > 0

Formally, such a definition requires some justification, which runs roughly as follows. Let D
be the set of m ∈ ℕ for which the following definition "has no problem":

  a⁰ = 1
  aⁿ = a·(aⁿ⁻¹)   for 0 < n ≤ m

We now have a finite sequence f_m on the interval [0, m], for each m ∈ D. We verify that any
f_l and f_m have the same values throughout the intersection of their two domains. Thus we
can define a single function on the union of the various domains. Now suppose D ≠ ℕ, and
let k be the least element of ℕ − D. That means that the definition has a problem when
m = k but not when m < k. We soon get a contradiction, so we deduce D = ℕ. That means
that the union of those domains is all of ℕ, i.e. the function aⁿ is defined, unambiguously,
throughout ℕ.
Another inductively defined function is the Fibonacci sequence, q.v.
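The inductive definition above translates directly into a recursive program. The sketch below is an added illustration (not from the original text); it mirrors the two clauses a⁰ = 1 and aⁿ = a·aⁿ⁻¹, and does the same for the Fibonacci sequence just mentioned.

```python
def power(a, n):
    """a**n defined by induction: a^0 = 1, a^n = a * a^(n-1) for n > 0."""
    if n == 0:
        return 1
    return a * power(a, n - 1)

def fib(n):
    """Fibonacci numbers, another inductively defined function."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

assert power(2, 10) == 1024
assert [fib(n) for n in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```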
We have been speaking of the inductive definition of a function, rather than just a sequence
(a function on N), because the notions extend with little change to transfinite inductions.
An illustration par excellence of inductive proofs and definitions is Conway’s theory of
surreal numbers. The numbers and their algebraic laws of composition are defined entirely
by inductions which have no special starting cases.
37.1.3 Minor variations of the method
The reader can figure out what is meant by "induction starting at k", where k is not necessarily
zero. Likewise, the term "downward induction" is self-explanatory.

A common variation of the method is proof by induction on a function of the index n.
Rather than spell it out formally, let me just give an example. Let n be a positive integer
having no prime factors of the form 4k + 3. Then n = a² + b² for some integers a and b.
The usual textbook proof uses induction on a function of n, namely the number of prime
factors of n. The induction starts at 1 (i.e. either n = 2 or prime n = 4k + 1), which in this
instance is the only part of the proof that is not quite easy.
37.1.4 Well-ordered sets
An ordered set (S, ≤) is said to be well-ordered if any nonempty subset of S has a least
element. The criterion (1), and its proof, hold without change for any well-ordered set S in
place of ℕ (which is a well-ordered set). But notice that it won't be enough to prove that
P(n) implies P(n + 1) (where n + 1 denotes the least element > n, if it exists). The reason
is, given an element m, there may exist elements < m but no element k such that m = k + 1.
Then the induction from n to n + 1 will fail to "reach" m. For more on this topic, look for
"limit ordinals".

Informally, any variety of induction which works for ordered sets S in which a segment
S_x = {y ∈ S | y < x} may be infinite is called "transfinite induction".
37.1.5 Noetherian induction
An ordered set S, or its order, is called Noetherian if any non-empty subset of S has a
maximal element. Several equivalent definitions are possible, such as the "ascending chain condition":
any strictly increasing sequence of elements of S is finite. The following result is easily proved
by contradiction.

Principle of Noetherian induction: Let (S, ≤) be a set with a Noetherian order, and let
F be a subset of S having this property: if x ∈ S is such that the condition y > x implies
y ∈ F, then x ∈ F. Then F = S.

So, to prove something "P(x)" about every element x of a Noetherian set, it is enough to
prove that "P(z) for all z > y" implies "P(y)". This time the induction is going downward,
but of course that is only a matter of notation. The opposite of a Noetherian order, i.e. an
order in which any strictly decreasing sequence is finite, is also in use; it is called a partial
well-order, or an ordered set having no infinite antichain.
The standard example of a Noetherian ordered set is the set of ideals in a Noetherian ring.
But the notion has various other uses, in topology as well as algebra. For a nontrivial
example of a proof by Noetherian induction, look up the Hilbert basis theorem.
37.1.6 Inductive ordered sets
An ordered set (S, ≤) is said to be inductive if any totally ordered subset of S has an
upper bound in S. Since the empty set is totally ordered, any inductive ordered set is non-empty.
We have this important result:

Zorn's lemma: Any inductive ordered set has a maximal element.

Zorn's lemma is widely used in existence proofs, rather than in proofs of a property P(x) of
an arbitrary element x of an ordered set. Let me sketch one typical application. We claim
that every vector space has a basis. First, we prove that if a free subset F, of a vector space
V, is a maximal free subset (with respect to the order relation ⊂), then it is a basis. Next,
to see that the set of free subsets is inductive, it is enough to verify that the union of any
totally ordered set of free subsets is free, because that union is an upper bound on the totally
ordered set. Last, we apply Zorn's lemma to conclude that V has a maximal free subset.
Version: 10 Owner: Daume Author(s): Larry Hammick, slider142
Chapter 38
03F30 – First-order arithmetic and
fragments
38.1 Elementary Functional Arithmetic
Elementary Functional Arithmetic, or EFA, is a weak theory of arithmetic created
by removing induction from Peano arithmetic. Because it lacks induction, axioms defining
exponentiation must be added.

• ∀x (x′ ≠ 0)  (0 is the first number)
• ∀x, y (x′ = y′ → x = y)  (the successor function is one-to-one)
• ∀x (x + 0 = x)  (0 is the additive identity)
• ∀x, y (x + y′ = (x + y)′)  (addition is the repeated application of the successor function)
• ∀x (x · 0 = 0)
• ∀x, y (x · y′ = x · y + x)  (multiplication is repeated addition)
• ∀x (¬(x < 0))  (0 is the smallest number)
• ∀x, y (x < y′ ↔ x < y ∨ x = y)
• ∀x (x⁰ = 1)
• ∀x, y (x^{y′} = x^y · x)
Version: 2 Owner: Henry Author(s): Henry
38.2 PA
Peano Arithmetic (PA) is the restriction of Peano's axioms to a first order theory of arithmetic.
The only change is that the induction axiom is replaced by induction restricted to
arithmetic formulas:

  φ(0) ∧ ∀x(φ(x) → φ(x′)) → ∀x φ(x)   where φ is arithmetical

Note that this replaces the single, second-order, axiom of induction with a countably infinite
schema of axioms.

Appropriate axioms defining +, ·, and < are included. A full list of the axioms of PA looks
like this (although the exact list of axioms varies somewhat from source to source):

• ∀x (x′ ≠ 0)  (0 is the first number)
• ∀x, y (x′ = y′ → x = y)  (the successor function is one-to-one)
• ∀x (x + 0 = x)  (0 is the additive identity)
• ∀x, y (x + y′ = (x + y)′)  (addition is the repeated application of the successor function)
• ∀x (x · 0 = 0)
• ∀x, y (x · y′ = x · y + x)  (multiplication is repeated addition)
• ∀x (¬(x < 0))  (0 is the smallest number)
• ∀x, y (x < y′ ↔ x < y ∨ x = y)
• φ(0) ∧ ∀x(φ(x) → φ(x′)) → ∀x φ(x)   where φ is arithmetical
Version: 7 Owner: Henry Author(s): Henry
38.3 Peano arithmetic
Peano's axioms are a definition of the set of natural numbers, denoted ℕ. From these
axioms Peano arithmetic on natural numbers can be derived.

1. 0 ∈ ℕ (0 is a natural number)
2. For each x ∈ ℕ, there exists exactly one x′ ∈ ℕ, called the successor of x
3. x′ ≠ 0 (0 is not the successor of any natural number)
4. x = y if and only if x′ = y′.
5. (axiom of induction) If M ⊆ ℕ and 0 ∈ M and x ∈ M implies x′ ∈ M, then M = ℕ.

The successor of x is sometimes denoted Sx instead of x′. We then have 1 = S0, 2 = S1 =
SS0, and so on.

Peano arithmetic consists of statements derived via these axioms. For instance, from these
axioms we can define addition and multiplication on natural numbers. Addition is defined
as

  x + 1 = x′         for all x ∈ ℕ
  x + y′ = (x + y)′  for all x, y ∈ ℕ

Addition defined in this manner can then be proven to be both associative and commutative.
Multiplication is

  x · 1 = x          for all x ∈ ℕ
  x · y′ = x · y + x  for all x, y ∈ ℕ

This definition of multiplication can also be proven to be both associative and commutative,
and it can also be shown to be distributive over addition.
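To make the recursion explicit, here is a small sketch (added for illustration, not part of the original entry) that models the two defining clauses of addition and multiplication directly, using the ordinary integer successor as a stand-in for x′.

```python
def succ(x):
    return x + 1

def add(x, y):
    """x + y defined by the clauses x + 1 = x' and x + y' = (x + y)'."""
    if y == 1:
        return succ(x)
    return succ(add(x, y - 1))

def mul(x, y):
    """x * y defined by the clauses x * 1 = x and x * y' = x * y + x."""
    if y == 1:
        return x
    return add(mul(x, y - 1), x)

assert add(3, 4) == 7
assert mul(3, 4) == 12
```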
Version: 4 Owner: Henry Author(s): Henry, Logan
Chapter 39
03F35 – Second- and higher-order
arithmetic and fragments
39.1 ACA₀

ACA₀ is a weakened form of second order arithmetic. Its axioms include the axioms of PA
together with arithmetic comprehension.
Version: 1 Owner: Henry Author(s): Henry
39.2 RCA₀

RCA₀ is a weakened form of second order arithmetic. It consists of the axioms of PA other
than induction, together with Σ⁰₁-IND and ∆⁰₁-CA.
Version: 1 Owner: Henry Author(s): Henry
39.3 Z₂

Z₂ is the full system of second order arithmetic, that is, the full theory of numbers and sets
of numbers. It is sufficient for a great deal of mathematics, including much of number theory
and analysis.

The axioms defining successor, addition, multiplication, and comparison are the same as
those of PA. Z₂ adds the full induction axiom and the full comprehension axiom.
Version: 1 Owner: Henry Author(s): Henry
39.4 comprehension axiom
The axiom of comprehension (CA) states that every formula defines a set. That is,

  ∃X ∀x (x ∈ X ↔ φ(x))   for any formula φ where X does not occur free in φ

The names specification and separation are sometimes used in place of comprehension, particularly
for weakened forms of the axiom (see below).

In theories which make no distinction between objects and sets (such as ZF), this formulation
leads to Russell's paradox; however in stratified theories this is not a problem (for example
second order arithmetic includes the axiom of comprehension).

This axiom can be restricted in various ways. One possibility is to restrict it to forming
subsets of sets:

  ∀Y ∃X ∀x (x ∈ X ↔ x ∈ Y ∧ φ(x))   for any formula φ where X does not occur free in φ

This formulation (used in ZF set theory) is sometimes called the Aussonderungsaxiom.

Another way is to restrict φ to some family F, giving the axiom F-CA. For instance the
axiom Σ⁰₁-CA is:

  ∃X ∀x (x ∈ X ↔ φ(x))   where φ is Σ⁰₁ and X does not occur free in φ

A third form (usually called separation) uses two formulas, and guarantees only that those
satisfying one are included while those satisfying the other are excluded. The unrestricted
form is the same as unrestricted collection, but, for instance, Σ⁰₁ separation:

  ∀x ¬(φ(x) ∧ ψ(x)) → ∃X ∀x ((φ(x) → x ∈ X) ∧ (ψ(x) → x ∉ X))
  where φ and ψ are Σ⁰₁ and X does not occur free in φ or ψ

is weaker than Σ⁰₁-CA.
Version: 4 Owner: Henry Author(s): Henry
39.5 induction axiom
An induction axiom specifies that a theory includes induction, possibly restricted to specific
formulas. IND is the general axiom of induction:

  φ(0) ∧ ∀x(φ(x) → φ(x + 1)) → ∀x φ(x)   for any formula φ

If φ is restricted to some family of formulas F then the axiom is called F-IND, or F induction.
For example the axiom Σ⁰₁-IND is:

  φ(0) ∧ ∀x(φ(x) → φ(x + 1)) → ∀x φ(x)   where φ is Σ⁰₁
Version: 4 Owner: Henry Author(s): Henry
Chapter 40
03G05 – Boolean algebras
40.1 Boolean algebra
A Boolean algebra is a set B with two binary operators, ∧ ("meet") and ∨ ("join"), and
one unary operator ′ ("complement"), which together form a Boolean lattice. If X and Y are
Boolean algebras, a mapping f : X → Y is a morphism of Boolean algebras when it is a
morphism of ∧, ∨, and ′.
Version: 6 Owner: greg Author(s): greg
40.2 M. H. Stone’s representation theorem
Theorem 3. Given a Boolean algebra B there exists a totally disconnected Hausdorff space
X such that B is isomorphic to the Boolean algebra of clopen subsets of X.

[Very rough sketch of proof] Let

  X = {f : B → {0, 1} | f is a homomorphism}

endowed with the subspace topology induced by the product topology on {0, 1}^B. Then X
is a totally disconnected Hausdorff space. Let Cl(X) denote the Boolean algebra of clopen
subsets of X; then the following map

  F : B → Cl(X),   F(x) = {f ∈ X | f(x) = 1}

is well defined (i.e. F(x) is indeed a clopen set), and an isomorphism.
Version: 4 Owner: Dr Absentius Author(s): Dr Absentius
Chapter 41
03G10 – Lattices and related
structures
41.1 Boolean lattice
A Boolean lattice 1 is a distributive lattice in which for each element r ∈ 1 there exists
a complement r
t
∈ 1 such that
r ∧ r
t
= 0
r ∨ r
t
= 1
(r
t
)
t
= r
(r ∧ n)
t
= r
t
∨ n
t
(r ∨ n)
t
= r
t
∧ n
t
Given a set, any collection of subsets that is closed under unions, intersections, and comple-
ments is a Boolean algebra.
Boolean rings (with identity, but allowing 0=1) are equivalent to Boolean lattices. To view
a Boolean ring as a Boolean lattice, define r ∧ n = rn and r ∨ n = r + n + rn. To view a
Boolean lattice as a Boolean ring, define rn = r ∧ n and r + n = (r
t
∧ n) ∨ (r ∧ n
t
).
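A minimal sketch (not from the original entry) checking the stated translation over the two-element Boolean ring ℤ/2ℤ; taking x′ = 1 + x as the complement is my choice here, not part of the entry's statement.

```python
# Work in the Boolean ring Z/2Z: addition and multiplication mod 2.
def meet(x, y):        # x AND y = xy
    return (x * y) % 2

def join(x, y):        # x OR y = x + y + xy
    return (x + y + x * y) % 2

def complement(x):     # x' = 1 + x, so that meet(x, x') = 0 and join(x, x') = 1
    return (1 + x) % 2

for x in (0, 1):
    for y in (0, 1):
        # De Morgan laws from the Boolean lattice definition
        assert complement(meet(x, y)) == join(complement(x), complement(y))
        assert complement(join(x, y)) == meet(complement(x), complement(y))
    assert meet(x, complement(x)) == 0
    assert join(x, complement(x)) == 1
```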
Version: 3 Owner: mathcam Author(s): mathcam, greg
41.2 complete lattice
A complete lattice is a nonempty poset in which every nonempty subset has a supremum
and an infimum.
In particular, a complete lattice is a lattice.
Version: 1 Owner: Evandar Author(s): Evandar
41.3 lattice
A lattice is any non-empty poset L in which any two elements x and y have a least upper bound,
x ∨ y, and a greatest lower bound, x ∧ y.

In other words, if q = x ∧ y then q ∈ L, q ≤ x and q ≤ y. Further, for all p ∈ L, if p ≤ x and
p ≤ y, then p ≤ q.

Likewise, if q = x ∨ y then q ∈ L, x ≤ q and y ≤ q, and for all p ∈ L, if x ≤ p and y ≤ p,
then q ≤ p.

Since L is a poset, the operations ∧ and ∨ have the following properties:

  x ∧ x = x,  x ∨ x = x  (idempotency)
  x ∧ y = y ∧ x,  x ∨ y = y ∨ x  (commutativity)
  x ∧ (y ∧ z) = (x ∧ y) ∧ z,  x ∨ (y ∨ z) = (x ∨ y) ∨ z  (associativity)
  x ∧ (x ∨ y) = x ∨ (x ∧ y) = x  (absorption)

Further, x ≤ y is equivalent to:

  x ∧ y = x and x ∨ y = y  (consistency)
Version: 5 Owner: mps Author(s): mps, greg
Chapter 42
03G99 – Miscellaneous
42.1 Chu space
A Chu space over a set Σ is a triple (A, r, X) with r : A × X → Σ. A is called the carrier
and X the cocarrier.

Although the definition is symmetrical, in practice asymmetric uses are common. In particular,
often X is just taken to be a set of functions from A to Σ, with r(a, x) = x(a) (such a
Chu space is called normal and is abbreviated (A, X)).

We define the perp of a Chu space C = (A, r, X) to be C^⊥ = (X, r^⊥, A) where r^⊥(x, a) = r(a, x).

Define r̂ and ř to be functions giving the rows and columns of C respectively, so that
r̂(a) : X → Σ and ř(x) : A → Σ are given by r̂(a)(x) = ř(x)(a) = r(a, x). Clearly the rows
of C are the columns of C^⊥.

Using these definitions, a Chu space can be represented using a matrix.

If r̂ is injective then we call C separable, and if ř is injective we call C extensional. A Chu
space which is both separable and extensional is biextensional.
42.2 Chu transform
If C = (A, r, X) and D = (B, s, Y) are Chu spaces then we say a pair of functions f : A → B
and g : Y → X form a Chu transform from C to D if for any (a, y) ∈ A × Y we have
r(a, g(y)) = s(f(a), y).
Version: 1 Owner: Henry Author(s): Henry
42.3 biextensional collapse
If C = (A, r, X) is a Chu space, we can define the biextensional collapse of C to be
(r̂[A], r′, ř[X]) where r′(r̂(a), ř(x)) = r(a, x).

That is, to name the rows of the biextensional collapse, we just use the functions representing
the actual rows of the original Chu space (and similarly for the columns). The effect is to
merge indistinguishable rows and columns.

We say that two Chu spaces are equivalent if their biextensional collapses are isomorphic.
Version: 3 Owner: Henry Author(s): Henry
42.4 example of Chu space
Any set A can be represented as a Chu space over {0, 1} by (A, r, P(A)) with r(a, X) = 1
iff a ∈ X. This Chu space satisfies only the trivial property 2^A, signifying the fact that sets
have no internal structure. If A = {a, b, c} then the matrix representation is:

     {}  {a}  {b}  {c}  {a,b}  {a,c}  {b,c}  {a,b,c}
  a   0   1    0    0     1      1      0       1
  b   0   0    1    0     1      0      1       1
  c   0   0    0    1     0      1      1       1

Increasing the structure of a Chu space, that is, adding properties, is equivalent to deleting
columns. For instance we can delete the columns named {c} and {b, c} to turn this into
the partial order satisfying c < a. By deleting more columns, we can further increase the
structure. For example, if we require that the set of rows be closed under the bitwise or
operation (and delete those columns which would prevent this) then it will define a
semilattice, and if it is closed under both bitwise or and bitwise and then it will define a
lattice. If the rows are also closed under complementation then we have a Boolean algebra.

Note that these are not arbitrary connections: the Chu transforms on each of these classes
of Chu spaces correspond to the appropriate notion of homomorphism for those classes.

For instance, to see that Chu transforms are order preserving on Chu spaces viewed as partial
orders, let C = (A, r, X) be a Chu space satisfying b < a. That is, for any x ∈ X we have
r(b, x) = 1 → r(a, x) = 1. Then let (f, g) be a Chu transform to D = (B, s, Y), and suppose
s(f(b), y) = 1. Then r(b, g(y)) = 1 by the definition of a Chu transform, and then we have
r(a, g(y)) = 1 and so s(f(a), y) = 1, demonstrating that f(b) < f(a).
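The matrix view above is easy to experiment with. The following sketch is an added illustration (the variable names are my own, not from the entry): it encodes the rows of the Chu space for A = {a, b, c} as bit vectors indexed by the columns and checks that deleting the columns {c} and {b, c} leaves row c pointwise below row a, i.e. imposes the order c < a.

```python
from itertools import combinations

A = ['a', 'b', 'c']
# Columns: all subsets of A, in the order {}, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c}
columns = [set(s) for n in range(len(A) + 1) for s in combinations(A, n)]

# r(x, X) = 1 iff x is a member of the column X; each row is a bit vector.
row = {x: [1 if x in X else 0 for X in columns] for x in A}
print(row['a'])   # [0, 1, 0, 0, 1, 1, 0, 1], as in the matrix above

# Deleting the columns {c} and {b,c} imposes the order relation c < a:
keep = [i for i, X in enumerate(columns) if X not in ({'c'}, {'b', 'c'})]
ra = [row['a'][i] for i in keep]
rc = [row['c'][i] for i in keep]
assert all(rc[i] <= ra[i] for i in range(len(keep)))   # row c is pointwise below row a
```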
Version: 2 Owner: Henry Author(s): Henry
42.5 property of a Chu space
A property of a Chu space over Σ with carrier A is some Y ⊆ Σ^A. We say that a Chu
space C = (A, r, X) satisfies Y if X ⊆ Y.

For example, every Chu space satisfies the property Σ^A.
Version: 2 Owner: Henry Author(s): Henry
Chapter 43
05-00 – General reference works
(handbooks, dictionaries,
bibliographies, etc.)
43.1 example of pigeonhole principle
A simple example.
For any group of 8 integers, there exist at least two of them whose difference is divisible by
7.
Consider the residue classes modulo 7. These are 0, 1, 2, 3, 4, 5, 6. We have seven classes
and eight integers. So it must be the case that two integers fall in the same residue class, and
therefore their difference will be divisible by 7.
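A tiny illustrative script (added here, not from the original) that finds such a pair for any list of eight integers by sorting them into the seven residue classes:

```python
def pair_with_difference_divisible_by_7(nums):
    """Given at least 8 integers, return two whose difference is divisible by 7."""
    seen = {}                      # residue class mod 7 -> a number in that class
    for x in nums:
        r = x % 7
        if r in seen:              # pigeonhole: two numbers share a residue class
            return seen[r], x
        seen[r] = x
    return None                    # cannot happen for 8 or more integers

a, b = pair_with_difference_divisible_by_7([3, 10, 25, 41, 58, 63, 77, 90])
assert (a - b) % 7 == 0
```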
Version: 1 Owner: drini Author(s): drini
43.2 multi-index derivative of a power
Theorem. If i, k are multi-indices in ℕⁿ, and x = (x_1, …, x_n), then

  ∂^i x^k = k!/(k−i)! · x^{k−i}   if i ≤ k,
  ∂^i x^k = 0                     otherwise.

Proof. The proof follows from the corresponding rule for the ordinary derivative: if i, k are
in {0, 1, 2, …}, then

  d^i/dx^i x^k = k!/(k−i)! · x^{k−i}   if i ≤ k,
  d^i/dx^i x^k = 0                     otherwise.        (43.2.1)

Suppose i = (i_1, …, i_n), k = (k_1, …, k_n), and x = (x_1, …, x_n). Then we have that

  ∂^i x^k = ∂^{|i|}/(∂x_1^{i_1} ⋯ ∂x_n^{i_n}) (x_1^{k_1} ⋯ x_n^{k_n})
          = (∂^{i_1}/∂x_1^{i_1} x_1^{k_1}) ⋯ (∂^{i_n}/∂x_n^{i_n} x_n^{k_n}).

For each r = 1, …, n, the function x_r^{k_r} only depends on x_r. In the above, each partial
differentiation ∂/∂x_r therefore reduces to the corresponding ordinary differentiation d/dx_r.
Hence, from equation (43.2.1), it follows that ∂^i x^k vanishes if i_r > k_r for any r = 1, …, n. If
this is not the case, i.e., if i ≤ k as multi-indices, then for each r,

  d^{i_r}/dx_r^{i_r} x_r^{k_r} = k_r!/(k_r − i_r)! · x_r^{k_r − i_r},

and the theorem follows. □
Version: 4 Owner: matte Author(s): matte
43.3 multi-index notation
Definition [1, 2, 3] A multi-index is an n-tuple (i_1, …, i_n) of non-negative integers i_1, …, i_n.
In other words, i ∈ ℕⁿ. Usually, n is the dimension of the underlying space. Therefore, when
dealing with multi-indices, it is assumed clear from the context.

Operations on multi-indices

For a multi-index i, we define the length (or order) as

  |i| = i_1 + ⋯ + i_n,

and the factorial as

  i! = ∏_{k=1}^{n} i_k!.

If i = (i_1, …, i_n) and j = (j_1, …, j_n) are two multi-indices, their sum and difference is
defined component-wise as

  i + j = (i_1 + j_1, …, i_n + j_n),
  i − j = (i_1 − j_1, …, i_n − j_n).

Thus |i ± j| = |i| ± |j|. Also, if j_k ≤ i_k for all k = 1, …, n, then we write j ≤ i. For
multi-indices i, j, with j ≤ i, we define

  \binom{i}{j} = i! / ((i − j)! j!).

For a point x = (x_1, …, x_n) in ℝⁿ (with standard coordinates) we define

  x^i = ∏_{k=1}^{n} x_k^{i_k}.

Also, if f : ℝⁿ → ℝ is a smooth function, and i = (i_1, …, i_n) is a multi-index, we define

  ∂^i f = ∂^{|i|} f / (∂e_1^{i_1} ⋯ ∂e_n^{i_n}),

where e_1, …, e_n are the standard unit vectors of ℝⁿ. Since f is sufficiently smooth, the order
in which the derivations are performed is irrelevant. For multi-indices i and j, we thus have

  ∂^i ∂^j = ∂^{i+j} = ∂^{j+i} = ∂^j ∂^i.

Much of the motivation for the above notation is that standard results such as Leibniz' rule,
Taylor's formula, etc. can be written more or less as-is in many dimensions by replacing
indices in ℕ with multi-indices. Below are some examples of this.

Examples

1. If n is a positive integer, and x_1, …, x_k are complex numbers, the multinomial expansion
states that

   (x_1 + ⋯ + x_k)^n = n! ∑_{|i|=n} x^i / i!,

   where x = (x_1, …, x_k) and i is a multi-index. (proof)

2. Leibniz' rule [1]: If f, g : ℝⁿ → ℝ are smooth functions, and j is a multi-index, then

   ∂^j (f g) = ∑_{i ≤ j} \binom{j}{i} ∂^i(f) ∂^{j−i}(g),

   where i is a multi-index.
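The operations defined above are straightforward to compute. The following sketch is an added illustration (not part of the entry); it implements |i|, i!, x^i and checks the multinomial identity of example 1 for one small case.

```python
from math import factorial
from itertools import product

def length(i):                 # |i| = i_1 + ... + i_n
    return sum(i)

def mfact(i):                  # i! = i_1! * ... * i_n!
    out = 1
    for ik in i:
        out *= factorial(ik)
    return out

def power(x, i):               # x^i = x_1^{i_1} * ... * x_n^{i_n}
    out = 1
    for xk, ik in zip(x, i):
        out *= xk ** ik
    return out

# Multinomial expansion: (x_1 + ... + x_k)^n = n! * sum over |i| = n of x^i / i!
x, n = (2, 3, 4), 5
lhs = sum(x) ** n
rhs = sum(factorial(n) * power(x, i) // mfact(i)
          for i in product(range(n + 1), repeat=len(x)) if length(i) == n)
assert lhs == rhs
```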
REFERENCES
1. http://www.math.umn.edu/ jodeit/course/TmprDist1.pdf
2. M. Reed, B. Simon, Methods of Mathematical Physics, I - Functional Analysis, Aca-
demic Press, 1980.
3. E. Weisstein, Eric W. Weisstein’s world of mathematics, entry on Multi-Index Notation
Version: 8 Owner: matte Author(s): matte
Chapter 44
05A10 – Factorials, binomial
coefficients, combinatorial functions
44.1 Catalan numbers
The Catalan numbers, or Catalan sequence, have many interesting applications in com-
binatorics.
The :th Catalan number is given by:
(
n
=

2n
n

: + 1
.
where

n
r

represents the binomial coefficient. The first several Catalan numbers are 1, 1, 2,
5, 14, 42, 132, 429, 1430, 4862 ,. . . (see EIS sequence A000108 for more terms). The Catalan
numbers are also generated by the recurrence relation
(
0
= 1. (
n
=
n−1
¸
i=0
(
i
(
n−1−i
.
For example, (
3
= 1 2 + 1 1 + 2 1 = 5, (
4
= 1 5 + 1 2 + 2 1 + 5 1 = 14, etc.
The ordinary generating function for the Catalan numbers is

¸
n=0
(
n
.
n
=
1 −

1 −4.
2.
.
Interpretations of the :th Catalan number include:
322
1. The number of ways to arrange : pairs of matching parentheses, e.g.:
()
(()) ()()
((())) (()()) ()(()) (())() ()()()
2. The number of ways an polygon of : + 2 sides can be split into : triangles.
3. The number of rooted binary trees with exactly : + 1 leaves.
The Catalan sequence is named for Eug`ene Charles Catalan, but it was discovered in 1751
by Euler when he was trying to solve the problem of subdividing polygons into triangles.
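Both the closed form and the recurrence are easy to compute; the following sketch (added for illustration, not part of the entry) checks that they agree on the first few terms listed above.

```python
from math import comb

def catalan_formula(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def catalan_recurrence(N):
    """First N Catalan numbers via C_0 = 1, C_n = sum_i C_i * C_{n-1-i}."""
    C = [1]
    for n in range(1, N):
        C.append(sum(C[i] * C[n - 1 - i] for i in range(n)))
    return C

assert catalan_recurrence(10) == [catalan_formula(n) for n in range(10)]
assert catalan_recurrence(10) == [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862]
```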
REFERENCES
1. Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics. Addison-
Wesley, 1998. Zbl 0836.00001.
Version: 3 Owner: bbukh Author(s): bbukh, vampyr
44.2 Levi-Civita permutation symbol
Definition. Let /
i
∈ ¦1. . :¦ for all i = 1. . :. The Levi-Civita permutation
symbols ε
k
1
kn
and ε
k
1
kn
are defined as
ε
k
1
km
= ε
k
1
km
=

+1 when ¦| →/
l
¦ is an even permutation (of ¦1. . :¦),
−1 when ¦| →/
l
¦ is an odd permutation,
0 otherwise, i.e., when /
i
= /
j
. for some i = ,.
The Levi-Civita permutation symbol is a special case of the generalized Kronecker delta symbol.
Using this fact one can write the Levi-Civita permutation symbol as the determinant of an
: : matrix consisting of traditional delta symbols. See the entry on the generalized Kro-
necker symbol for details.
When using the Levi-Civita permutation symbol and the generalized Kronecker delta symbol,
the Einstein summation convention is usually employed. In the below, we shall also use this
convention.
properties
323
• When : = 2, we have for all i. ,. :. : in ¦1. 2¦,
ε
ij
ε
mn
= δ
m
i
δ
n
j
−δ
n
i
δ
m
j
. (44.2.1)
ε
ij
ε
in
= δ
n
j
. (44.2.2)
ε
ij
ε
ij
= 2. (44.2.3)
• When : = 3, we have for all i. ,. /. :. : in ¦1. 2. 3¦,
ε
jmn
ε
imn
= 2δ
i
j
. (44.2.4)
ε
ijk
ε
ijk
= 6. (44.2.5)
Let us prove these properties. The proofs are instructional since they demonstrate typical
argumentation methods for manipulating the permutation symbols.
Proof. For equation 220.5.1, let us first note that both sides are antisymmetric with respect
of i, and ::. We therefore only need to consider the case i = , and : = :. By substitution,
we see that the equation holds for ε
12
ε
12
, i.e., for i = : = 1 and , = : = 2. (Both sides are
then one). Since the equation is anti-symmetric in i, and ::, any set of values for these
can be reduced the the above case (which holds). The equation thus holds for all values of
i, and ::. Using equation 220.5.1, we have for equation 44.2.2
ε
ij
ε
in
= δ
i
i
δ
n
j
−δ
n
i
δ
i
j
= 2δ
n
j
−δ
n
j
= δ
n
j
.
Here we used the Einstein summation convention with i going from 1 to 2. Equation 44.2.3
follows similarly from equation 44.2.2. To establish equation 44.2.4, let us first observe that
both sides vanish when i = ,. Indeed, if i = ,, then one can not choose : and : such
that both permutation symbols on the left are nonzero. Then, with i = , fixed, there are
only two ways to choose : and : from the remaining two indices. For any such indices,
we have ε
jmn
ε
imn
= (ε
imn
)
2
= 1 (no summation), and the result follows. The last property
follows since 3! = 6 and for any distinct indices i. ,. / in ¦1. 2. 3¦, we have ε
ijk
ε
ijk
= 1 (no
summation). P
Examples.
• The determinant of an $n \times n$ matrix $A = (a_{ij})$ can be written as

\det A = \varepsilon_{i_1 \cdots i_n} a_{1 i_1} \cdots a_{n i_n},

where each $i_l$ should be summed over $1, \dots, n$.
• If $A = (A_1, A_2, A_3)$ and $B = (B_1, B_2, B_3)$ are vectors in $\mathbb{R}^3$ (represented in some right
hand oriented orthonormal basis), then the $i$th component of their cross product equals

(A \times B)_i = \varepsilon_{ijk} A_j B_k.

For instance, the first component of $A \times B$ is $A_2 B_3 - A_3 B_2$. From the above expression
for the cross product, it is clear that $A \times B = -B \times A$. Further, if $C = (C_1, C_2, C_3)$
is a vector like $A$ and $B$, then the triple scalar product equals

A \cdot (B \times C) = \varepsilon_{ijk} A_i B_j C_k.

From this expression, it can be seen that the triple scalar product is antisymmetric
when exchanging any adjacent arguments. For example, $A \cdot (B \times C) = -B \cdot (A \times C)$.
• Suppose $F = (F_1, F_2, F_3)$ is a vector field defined on some domain of $\mathbb{R}^3$ with Cartesian
coordinates $x = (x_1, x_2, x_3)$. Then the $i$th component of the curl of $F$ equals

(\nabla \times F)_i(x) = \varepsilon_{ijk} \frac{\partial}{\partial x_j} F_k(x).
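As an illustration (ours, not from the entry), the symbol can be evaluated directly from the definition by counting transpositions, and the cross-product formula above can then be checked numerically; the helper names epsilon and cross are ours.

def epsilon(*idx):
    # Levi-Civita symbol for indices 1..n: +1/-1 for even/odd permutations, 0 otherwise.
    n = len(idx)
    if sorted(idx) != list(range(1, n + 1)):
        return 0              # repeated or out-of-range index
    sign, idx = 1, list(idx)
    for i in range(n):        # count transpositions by selection sort
        if idx[i] != i + 1:
            j = idx.index(i + 1)
            idx[i], idx[j] = idx[j], idx[i]
            sign = -sign
    return sign

def cross(A, B):
    # (A x B)_i = sum_{j,k} eps_{ijk} A_j B_k, indices running over 1..3
    return [sum(epsilon(i, j, k) * A[j - 1] * B[k - 1]
                for j in range(1, 4) for k in range(1, 4))
            for i in range(1, 4)]

print(cross([1, 0, 0], [0, 1, 0]))   # expected [0, 0, 1]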
Version: 7 Owner: matte Author(s): matte
44.3 Pascal’s rule (bit string proof)
This proof is based on an alternate, but equivalent, definition of the binomial coefficient:
$\binom{n}{r}$ is the number of bit strings (finite sequences of 0s and 1s) of length $n$ with exactly $r$ ones.
We want to show that

\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}.

To do so, we will show that both sides of the equation are counting the same set of bit
strings.
The left-hand side counts the set of strings of $n$ bits with $r$ 1s. Suppose we take one of these
strings and remove the first bit $b$. There are two cases: either $b = 1$, or $b = 0$.
If $b = 1$, then the new string is $n-1$ bits with $r-1$ ones; there are $\binom{n-1}{r-1}$ bit strings of this
nature.
If $b = 0$, then the new string is $n-1$ bits with $r$ ones, and there are $\binom{n-1}{r}$ strings of this
nature.
Therefore every string counted on the left is covered by one, but not both, of these two cases.
If we add the two cases, we find that

\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}.
Version: 2 Owner: vampyr Author(s): vampyr
44.4 Pascal’s rule proof
We need to show

\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}.

Let us begin by writing the left-hand side as

\frac{n!}{k!(n-k)!} + \frac{n!}{(k-1)!(n-(k-1))!}.

Getting a common denominator and simplifying, we have

\frac{n!}{k!(n-k)!} + \frac{n!}{(k-1)!(n-k+1)!}
 = \frac{(n-k+1)\,n!}{(n-k+1)\,k!(n-k)!} + \frac{k\,n!}{k\,(k-1)!(n-k+1)!}
 = \frac{(n-k+1)\,n! + k\,n!}{k!(n-k+1)!}
 = \frac{(n+1)\,n!}{k!((n+1)-k)!}
 = \frac{(n+1)!}{k!((n+1)-k)!}
 = \binom{n+1}{k}.
Version: 5 Owner: akrowne Author(s): akrowne
44.5 Pascal’s triangle
Pascal’s triangle is the following configuration of numbers:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
...
This triangle goes on to infinity; only the first 8 rows are printed here. In general,
the triangle is constructed so that the entries on the left side and right side are 1, and every
entry inside the triangle is obtained by summing the two entries immediately above it. For
instance, on the fourth row, 4 = 1 + 3.
Historically, the application of this triangle has been to give the coefficients when expanding
binomial expressions. For instance, to expand $(a + b)^4$, one simply looks up the coefficients
on the fourth row and writes

(a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4.
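A short Python sketch (ours, not part of the entry) of the construction just described: each new row is obtained by summing adjacent entries of the previous row, and row 4 reproduces the coefficients of $(a+b)^4$ used above.

def pascal_rows(n):
    # Return the first n+1 rows of Pascal's triangle (row 0 is [1]).
    rows = [[1]]
    for _ in range(n):
        prev = rows[-1]
        # interior entries are sums of the two entries immediately above
        rows.append([1] + [prev[i] + prev[i + 1] for i in range(len(prev) - 1)] + [1])
    return rows

for row in pascal_rows(7):
    print(row)
print(pascal_rows(4)[4])   # [1, 4, 6, 4, 1], the coefficients of (a + b)^4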
Pascal’s triangle is named after the French mathematician Blaise Pascal (1623-1662) [3].
However, this triangle was known at least around 1100 AD in China, five centuries before
Pascal [1]. In modern language, the expansion of the binomial is given by the binomial theorem
discovered by Isaac Newton in 1665 [2]: For any $n = 1, 2, \dots$ and real numbers $a, b$, we have

(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + b^n.

Thus, in Pascal’s triangle, the entries on the $n$th row are given by the binomial coefficients

\binom{n}{k} = \frac{n!}{(n-k)!\,k!}

for $k = 0, 1, \dots, n$.
REFERENCES
1. Wikipedia’s entry on the binomial coefficients
2. Wikipedia’s entry on Isaac Newton
3. Wikipedia’s entry on Blaise Pascal
Version: 1 Owner: Koro Author(s): matte
44.6 Upper and lower bounds to binomial coefficient

For natural numbers $2 \le k \le n$, the binomial coefficient satisfies

\binom{n}{k} \le \frac{n^k}{k!}, \qquad \binom{n}{k} \le \left(\frac{ne}{k}\right)^{k}, \qquad \binom{n}{k} \ge \left(\frac{n}{k}\right)^{k}.

Also, for large $n$:

\binom{n}{k} \approx \frac{n^k}{k!}.
Version: 1 Owner: gantsich Author(s): gantsich
44.7 binomial coefficient
The number of ways to choose $r$ objects from a set with $n$ elements ($n \ge r$) is given by

\frac{n!}{(n-r)!\,r!}.

It is usually denoted in several ways, such as $\binom{n}{r}$ or $C(n, r)$.
These numbers are called binomial coefficients, because they show up when expanding $(x + y)^n$.
Some interesting properties:

$\binom{n}{r}$ is the coefficient of $x^r y^{n-r}$ in $(x+y)^n$ (binomial theorem).
$\binom{n}{r} = \binom{n}{n-r}$.
$\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r}$ (Pascal’s rule).
$\binom{n}{0} = 1 = \binom{n}{n}$ for all $n$.
$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = 2^n$.
$\binom{n}{0} - \binom{n}{1} + \binom{n}{2} - \cdots + (-1)^n \binom{n}{n} = 0$.
$\sum_{t=1}^{n} \binom{t}{k} = \binom{n+1}{k+1}$.

In the context of Computer Science, it also helps to see $\binom{n}{r}$ as the number of strings
consisting of ones and zeros with $r$ ones and $n - r$ zeros. This equivalence comes from the
fact that if $S$ is a finite set with $n$ elements, $\binom{n}{r}$ is the number of distinct subsets of $S$ with
$r$ elements. For each subset $T$ of $S$, consider the function

\chi_T : S \to \{0, 1\}

where $\chi_T(x) = 1$ whenever $x \in T$ and 0 otherwise (so $\chi_T$ is the characteristic function for
$T$). For each $r$-element $T \in \mathcal{P}(S)$, $\chi_T$ produces a unique bit string of length $n$ with
exactly $r$ ones.
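The listed identities are easy to spot-check numerically; the following sketch (ours, not part of the entry) does so with Python's math.comb.

from math import comb

n, k = 8, 3
assert all(comb(n, r) == comb(n, n - r) for r in range(n + 1))               # symmetry
assert all(comb(n, r - 1) + comb(n, r) == comb(n + 1, r) for r in range(1, n + 1))  # Pascal's rule
assert sum(comb(n, r) for r in range(n + 1)) == 2 ** n                       # row sum
assert sum((-1) ** r * comb(n, r) for r in range(n + 1)) == 0                # alternating row sum
assert sum(comb(t, k) for t in range(1, n + 1)) == comb(n + 1, k + 1)        # hockey-stick identity
print("all identities hold for n =", n)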
Version: 14 Owner: drini Author(s): drini
44.8 double factorial
The double factorial of a positive integer $n$ is

n!! = n(n-2)(n-4)\cdots k_n,

where $k_n$ denotes 1 if $n$ is odd and 2 if $n$ is even.
For example,

7!! = 7 \cdot 5 \cdot 3 \cdot 1 = 105
10!! = 10 \cdot 8 \cdot 6 \cdot 4 \cdot 2 = 3840

Note that $n!!$ is not the same as $(n!)!$.
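A minimal Python sketch (ours, not part of the entry) of the definition, reproducing the two examples above:

def double_factorial(n):
    # multiply n, n-2, n-4, ... down to 1 (n odd) or 2 (n even)
    result = 1
    while n > 0:
        result *= n
        n -= 2
    return result

print(double_factorial(7), double_factorial(10))   # 105 3840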
Version: 3 Owner: drini Author(s): Larry Hammick, Riemann
44.9 factorial
For any non-negative integer $n$, the factorial of $n$, denoted $n!$, can be defined by

n! = \prod_{r=1}^{n} r,

where for $n = 0$ the empty product is taken to be 1.
Alternatively, the factorial can be defined recursively by $0! = 1$ and $n! = n(n-1)!$ for $n > 0$.
$n!$ is equal to the number of permutations of $n$ distinct objects. For example, there are $5!$
ways to arrange the five letters A, B, C, D and E into a word.
Euler’s gamma function $\Gamma(x)$ generalizes the notion of factorial to almost all complex values,
as

\Gamma(n + 1) = n!

for every non-negative integer $n$.
Version: 13 Owner: yark Author(s): yark, Riemann
44.10 falling factorial
For $n \in \mathbb{N}$, the rising and falling factorials are $n$th degree polynomials described, respectively,
by

x^{\overline{n}} = x(x+1)\cdots(x+n-1)
x^{\underline{n}} = x(x-1)\cdots(x-n+1)

The two types of polynomials are related by:

x^{\overline{n}} = (-1)^n (-x)^{\underline{n}}.

The rising factorial is often written as $(x)_n$, and referred to as the Pochhammer symbol (see
hypergeometric series). Unfortunately, the falling factorial is also often denoted by $(x)_n$, so
great care must be taken when encountering this notation.
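For a numeric argument, both polynomials are straightforward to evaluate; the following Python sketch (ours, not part of the entry) uses the illustrative names rising and falling and checks the stated relation between them.

def rising(x, n):
    # x^(rising n) = x (x+1) ... (x+n-1); empty product for n = 0
    result = 1
    for i in range(n):
        result *= x + i
    return result

def falling(x, n):
    # x^(falling n) = x (x-1) ... (x-n+1)
    result = 1
    for i in range(n):
        result *= x - i
    return result

x, n = 5, 3
print(rising(x, n), falling(x, n))                 # 210 60
assert rising(x, n) == (-1) ** n * falling(-x, n)  # the stated relation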
Notes.
Unfortunately, the notational conventions for the rising and falling factorials lack a common
standard, and are plagued with a fundamental inconsistency. An examination of reference
works and textbooks reveals two fundamental sources of notation: works in combinatorics
and works dealing with hypergeometric functions.
Works of combinatorics [1,2,3] give greater focus to the falling factorial because of its role
in defining the Stirling numbers. The symbol $(x)_n$ almost always denotes the falling factorial.
The notation for the rising factorial varies widely; we find $\langle x \rangle_n$ in [1] and $(x)^{(n)}$ in [3].
Works focusing on special functions [4,5] universally use $(x)_n$ to denote the rising factorial and
use this symbol in the description of the various flavours of hypergeometric series. Watson [5]
credits this notation to Pochhammer [6], and indeed the special functions literature eschews
“falling factorial” in favour of “Pochhammer symbol”. Curiously, according to Knuth [7],
Pochhammer himself used $(x)_n$ to denote the binomial coefficient. (Note: I haven’t verified
this.)
The notation featured in this entry is due to D. Knuth [7,8]. Given the fundamental in-
consistency in the existing notations, it seems sensible to break with both traditions, and
to adopt new and graphically suggestive notation for these two concepts. The traditional
notation, especially in the hypergeometric camp, is so deeply entrenched that, realistically,
one needs to be familiar with the traditional modes and to take care when encountering the
symbol $(x)_n$.
References
1. Comtet, Advanced combinatorics.
2. Jordan, Calculus of finite differences.
3. Riordan, Introduction to combinatorial analysis.
4. Erdélyi et al., Bateman manuscript project.
5. Watson, A treatise on the theory of Bessel functions.
6. Pochhammer, “Ueber hypergeometrische Functionen $n$ter Ordnung,” Journal für die
reine und angewandte Mathematik 71 (1870), 316–352.
7. Knuth, “Two notes on notation”.
8. Greene, Knuth, Mathematics for the analysis of algorithms.
Version: 7 Owner: rmilson Author(s): rmilson
44.11 inductive proof of binomial theorem
When $m = 1$,

(a + b)^1 = \sum_{k=0}^{1} \binom{1}{k} a^{1-k} b^k = \binom{1}{0} a^1 b^0 + \binom{1}{1} a^0 b^1 = a + b.

For the inductive step, assume it holds for $m$. Then for $n = m + 1$,

(a+b)^{m+1}
 = a(a+b)^m + b(a+b)^m
 = a \sum_{k=0}^{m} \binom{m}{k} a^{m-k} b^k + b \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^j
   \quad \text{by the inductive hypothesis}
 = \sum_{k=0}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^{j+1}
   \quad \text{by multiplying through by } a \text{ and } b
 = a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^{j+1}
   \quad \text{by pulling out the } k = 0 \text{ term}
 = a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{k=1}^{m+1} \binom{m}{k-1} a^{m-k+1} b^k
   \quad \text{by letting } j = k - 1
 = a^{m+1} + \sum_{k=1}^{m} \binom{m}{k} a^{m-k+1} b^k + \sum_{k=1}^{m} \binom{m}{k-1} a^{m+1-k} b^k + b^{m+1}
   \quad \text{by pulling out the } k = m+1 \text{ term}
 = a^{m+1} + b^{m+1} + \sum_{k=1}^{m} \left[ \binom{m}{k} + \binom{m}{k-1} \right] a^{m+1-k} b^k
   \quad \text{by combining the sums}
 = a^{m+1} + b^{m+1} + \sum_{k=1}^{m} \binom{m+1}{k} a^{m+1-k} b^k
   \quad \text{from Pascal’s rule}
 = \sum_{k=0}^{m+1} \binom{m+1}{k} a^{m+1-k} b^k
   \quad \text{by adding in the } k = 0 \text{ and } k = m+1 \text{ terms,}

as desired.
Version: 5 Owner: KimJ Author(s): KimJ
44.12 multinomial theorem
A multinomial is a mathematical expression consisting of two or more terms, e.g.

a_1 x_1 + a_2 x_2 + \dots + a_k x_k.

The multinomial theorem provides the general form of the expansion of the powers of this
expression, in the process specifying the multinomial coefficients which are found in that
expansion. The expansion is:

(x_1 + x_2 + \dots + x_k)^n = \sum \frac{n!}{n_1!\, n_2! \cdots n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k} \qquad (44.12.1)

where the sum is taken over all multi-indices $(n_1, \dots, n_k) \in \mathbb{N}^k$ that sum to $n$.
The expression $\frac{n!}{n_1!\, n_2! \cdots n_k!}$ occurring in the expansion is called the multinomial coefficient and
is denoted by

\binom{n}{n_1, n_2, \dots, n_k}.
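A small Python sketch (ours, not part of the entry, with the illustrative name multinomial) computes the multinomial coefficient and checks the expansion (44.12.1) numerically for sample values.

from math import factorial
from itertools import product

def multinomial(ns):
    # n! / (n_1! n_2! ... n_k!)
    result = factorial(sum(ns))
    for m in ns:
        result //= factorial(m)
    return result

# check (x1 + x2 + x3)^n against the expansion, for sample values
x, n = (2, 3, 5), 4
lhs = sum(x) ** n
rhs = sum(multinomial(ns) * x[0] ** ns[0] * x[1] ** ns[1] * x[2] ** ns[2]
          for ns in product(range(n + 1), repeat=3) if sum(ns) == n)
assert lhs == rhs
print(multinomial((2, 1, 1)), lhs)   # 12 10000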
Version: 7 Owner: bshanks Author(s): yark, bbukh, rmilson, bshanks
44.13 multinomial theorem (proof)

Proof. The below proof of the multinomial theorem uses the binomial theorem and induction
on $k$. In addition, we shall use multi-index notation.
First, for $k = 1$, both sides equal $x_1^n$. For the induction step, suppose the multinomial
theorem holds for $k$. Then the binomial theorem and the induction assumption yield

(x_1 + \dots + x_k + x_{k+1})^n
 = \sum_{l=0}^{n} \binom{n}{l} (x_1 + \dots + x_k)^l x_{k+1}^{n-l}
 = \sum_{l=0}^{n} \binom{n}{l}\, l! \sum_{|i| = l} \frac{x^i}{i!}\, x_{k+1}^{n-l}
 = n! \sum_{l=0}^{n} \sum_{|i| = l} \frac{x^i\, x_{k+1}^{n-l}}{i!\,(n-l)!}

where $x = (x_1, \dots, x_k)$ and $i$ is a multi-index in $\mathbb{Z}_+^k$. To complete the proof, we need to show
that the sets

A = \{ (i_1, \dots, i_k, n-l) \in \mathbb{Z}_+^{k+1} \mid l = 0, \dots, n,\ |(i_1, \dots, i_k)| = l \},
B = \{ j \in \mathbb{Z}_+^{k+1} \mid |j| = n \}

are equal. The inclusion $A \subset B$ is clear since

|(i_1, \dots, i_k, n-l)| = l + n - l = n.

For $B \subset A$, suppose $j = (j_1, \dots, j_{k+1}) \in \mathbb{Z}_+^{k+1}$, and $|j| = n$. Let $l = |(j_1, \dots, j_k)|$. Then
$l = n - j_{k+1}$, so $j_{k+1} = n - l$ for some $l = 0, \dots, n$. It follows that $A = B$.
Let us define $y = (x_1, \dots, x_{k+1})$ and let $j = (j_1, \dots, j_{k+1})$ be a multi-index in $\mathbb{Z}_+^{k+1}$. Then

(x_1 + \dots + x_{k+1})^n = n! \sum_{|j| = n} \frac{x^{(j_1, \dots, j_k)}\, x_{k+1}^{j_{k+1}}}{(j_1, \dots, j_k)!\, j_{k+1}!} = n! \sum_{|j| = n} \frac{y^j}{j!}.

This completes the proof. □
Version: 1 Owner: matte Author(s): matte
44.14 proof of upper and lower bounds to binomial coefficient
Let $2 \le k \le n$ be natural numbers. We’ll first prove the inequality

\binom{n}{k} < \left(\frac{ne}{k}\right)^{k}.

We rewrite the product $(n-1)\cdots(n-k+1)$ as

(n-1)\cdots(n-k+1) = \left(1 - \frac{1}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right) n^{k-1} \le n^{k-1}

to get

\frac{(n-1)\cdots(n-k+1)}{e\, n^{k-1}} < 1.

Multiplying the inequality above with $\frac{k^{k}}{k!} < e^{k-1}$ yields

\frac{n(n-1)\cdots(n-k+1)}{k!}\cdot\frac{k^{k}}{n^{k}}\cdot\frac{1}{e} = \binom{n}{k}\left(\frac{k}{n}\right)^{k}\frac{1}{e} < e^{k-1},

so that

\binom{n}{k} < \left(\frac{ne}{k}\right)^{k}.

To justify the bound $\frac{k^{k}}{k!} < e^{k-1}$ used above, we show that

\prod_{i=1}^{n-1}\left(1 + \frac{1}{i}\right)^{i} = \frac{n^{n}}{n!} \qquad \forall\, n \ge 2,\ n \in \mathbb{N}. \qquad (44.14.1)

Indeed,

\prod_{i=1}^{n-1}\left(1 + \frac{1}{i}\right)^{i} = \prod_{i=1}^{n-1}\frac{(i+1)^{i}}{i^{i}} = \frac{\prod_{i=2}^{n} i^{\,i-1}}{\left(\prod_{i=1}^{n-1} i^{\,i-1}\right)(n-1)!} = \frac{n^{n-1}}{(n-1)!} = \frac{n^{n}}{n!}.

Since each left-hand factor in (44.14.1) is $< e$, we have $\frac{n^{n}}{n!} < e^{n-1}$, and in particular $\frac{k^{k}}{k!} < e^{k-1}$.
Since $n - i < n$ for all $1 \le i \le k-1$, we immediately get

\binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k!} < \frac{n^{k}}{k!}.

And from

k \le n \iff (n-i)\,k \ge (k-i)\,n \qquad \forall\, 1 \le i \le k-1

we obtain

\binom{n}{k} = \frac{n}{k}\prod_{i=1}^{k-1}\frac{n-i}{k-i} \ge \left(\frac{n}{k}\right)^{k}.
Version: 4 Owner: Thomas Heye Author(s): Thomas Heye
Chapter 45
05A15 – Exact enumeration problems,
generating functions
45.1 Stirling numbers of the first kind
Introduction. The Stirling numbers of the first kind, frequently denoted as

s(n, k), \qquad k, n \in \mathbb{N}, \quad 1 \le k \le n,

are the integer coefficients of the falling factorial polynomials. To be more precise, the
defining relation for the Stirling numbers of the first kind is:

x^{\underline{n}} = x(x-1)(x-2)\cdots(x-n+1) = \sum_{k=1}^{n} s(n, k)\, x^k.

Here is the table of some initial values.

n\k |  1    2    3    4   5
 1  |  1
 2  | -1    1
 3  |  2   -3    1
 4  | -6   11   -6    1
 5  | 24  -50   35  -10   1

Recurrence Relation. The evident observation that

x^{\underline{n+1}} = x \cdot x^{\underline{n}} - n\, x^{\underline{n}}

leads to the following equivalent characterization of the $s(n, k)$, in terms of a 2-place recur-
rence formula:

s(n+1, k) = s(n, k-1) - n\, s(n, k), \qquad 1 \le k \le n,

subject to the following initial conditions:

s(n, 0) = 0, \qquad s(1, 1) = 1.
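A short Python sketch (ours, not part of the entry) generates the table above directly from the recurrence and initial conditions just stated.

def stirling1(N):
    # Table s[(n, k)] of (signed) Stirling numbers of the first kind, 1 <= k <= n <= N.
    s = {(1, 1): 1}
    for n in range(1, N):
        for k in range(1, n + 2):
            s[(n + 1, k)] = s.get((n, k - 1), 0) - n * s.get((n, k), 0)
    return s

s = stirling1(5)
for n in range(1, 6):
    print([s[(n, k)] for k in range(1, n + 1)])
# rows: [1], [-1, 1], [2, -3, 1], [-6, 11, -6, 1], [24, -50, 35, -10, 1]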
Generating Function. There is also a strong connection with the generalized binomial formula,
which furnishes us with the following generating function:

(1 + t)^x = \sum_{n=0}^{\infty} \sum_{k=1}^{n} s(n, k)\, x^k\, \frac{t^n}{n!}.

This generating function implies a number of identities. Taking the derivative of both sides
with respect to $t$ and equating powers leads to the recurrence relation described above.
Taking the derivative of both sides with respect to $x$ gives

(k+1)\, s(n+1, k+1) = \sum_{j=k}^{n} (-1)^{n-j} (n-j)! \binom{n+1}{j} s(j, k).

This is because the derivative of the left side of the generating function equation with respect
to $x$ is

(1+t)^x \ln(1+t) = (1+t)^x \sum_{k=1}^{\infty} \frac{(-1)^{k-1} t^k}{k}.

The relation

(1+t)^{x_1} (1+t)^{x_2} = (1+t)^{x_1 + x_2}

yields the following family of summation identities. For any given $k_1, k_2, d \ge 1$ we have

\binom{k_1 + k_2}{k_1} s(d + k_1 + k_2,\, k_1 + k_2) = \sum_{d_1 + d_2 = d} \binom{d + k_1 + k_2}{k_1 + d_1} s(d_1 + k_1, k_1)\, s(d_2 + k_2, k_2).

Enumerative interpretation. The absolute value of the Stirling number of the first kind,
$s(n, k)$, counts the number of permutations of $n$ objects with exactly $k$ orbits (equiva-
lently, with exactly $k$ cycles). For example, $s(4, 2) = 11$ corresponds to the fact that
the symmetric group on 4 objects has 3 permutations of the form

(**)(**) — 2 orbits of size 2 each,

and 8 permutations of the form

(***) — 1 orbit of size 3, and 1 orbit of size 1.

(See the entry on cycle notation for the meaning of the above expressions.)
Let us prove this. First, we can remark that the unsigned Stirling numbers of the first kind are
characterized by the following recurrence relation:

|s(n+1, k)| = |s(n, k-1)| + n\,|s(n, k)|, \qquad 1 \le k \le n.

To see why the above recurrence relation matches the count of permutations with $k$ cycles,
consider forming a permutation of $n+1$ objects from a permutation of $n$ objects by adding
a distinguished object. There are exactly two ways in which this can be accomplished. We
could do this by forming a singleton cycle, i.e. leaving the extra object alone. This accounts
for the $s(n, k-1)$ term in the recurrence formula. We could also insert the new object into
one of the existing cycles. Consider an arbitrary permutation of $n$ objects with $k$ cycles, and
label the objects $a_1, \dots, a_n$, so that the permutation is represented by

(a_1 \dots a_{j_1})(a_{j_1 + 1} \dots a_{j_2}) \dots (a_{j_{k-1} + 1} \dots a_n) \qquad (k \text{ cycles}).

To form a new permutation of $n+1$ objects and $k$ cycles one must insert the new object into
this array. There are, evidently, $n$ ways to perform this insertion. This explains the $n\, s(n, k)$
term of the recurrence relation. Q.E.D.
Version: 1 Owner: rmilson Author(s): rmilson
45.2 Stirling numbers of the second kind
Summary. The Stirling numbers of the second kind,

S(n, k), \qquad k, n \in \mathbb{N}, \quad 1 \le k \le n,

are a doubly indexed sequence of natural numbers, enjoying a wealth of interesting combina-
torial properties. There exist several logically equivalent characterizations, but the starting
point of the present entry will be the following definition:

The Stirling number $S(n, k)$ is the number of ways to partition a set of $n$ objects
into $k$ groups.

For example, $S(4, 2) = 7$ because there are seven ways to partition 4 objects — call them a,
b, c, d — into two groups, namely:

(a)(bcd), \quad (b)(acd), \quad (c)(abd), \quad (d)(abc), \quad (ab)(cd), \quad (ac)(bd), \quad (ad)(bc)

Four additional characterizations will be discussed in this entry:
• a recurrence relation
• a generating function related to the falling factorial
• differential operators
• a double-index generating function
Each of these will be discussed below, and shown to be equivalent.

A recurrence relation. The Stirling numbers of the second kind can be characterized in
terms of the following recurrence relation:

S(n, k) = k\, S(n-1, k) + S(n-1, k-1), \qquad 1 < k < n,

subject to the following initial conditions:

S(n, n) = S(n, 1) = 1.

Let us now show that the recurrence formula follows from the enumerative definition. Evidently,
there is only one way to partition $n$ objects into 1 group (everything is in that group), and
only one way to partition $n$ objects into $n$ groups (every object is a group all by itself).
Proceeding recursively, a division of $n$ objects $a_1, \dots, a_{n-1}, a_n$ into $k$ groups can be achieved
by only one of two basic maneuvers:
• We could partition the first $n-1$ objects into $k$ groups, and then add object $a_n$ into
one of those groups. There are $k\, S(n-1, k)$ ways to do this.
• We could partition the first $n-1$ objects into $k-1$ groups and then add object $a_n$ as
a new, 1 element group. This gives an additional $S(n-1, k-1)$ ways to create the
desired partition.
The recursive point of view, therefore, explains the connection between the recurrence for-
mula and the original definition.
Using the recurrence formula we can easily obtain a table of the initial Stirling numbers:

n\k | 1   2   3   4  5
 1  | 1
 2  | 1   1
 3  | 1   3   1
 4  | 1   7   6   1
 5  | 1  15  25  10  1
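The analogous Python sketch (ours, not part of the entry) builds the table above from the recurrence $S(n,k) = k\,S(n-1,k) + S(n-1,k-1)$ and the initial conditions.

def stirling2(N):
    # Table S[(n, k)] of Stirling numbers of the second kind, 1 <= k <= n <= N.
    S = {(1, 1): 1}
    for n in range(2, N + 1):
        for k in range(1, n + 1):
            S[(n, k)] = k * S.get((n - 1, k), 0) + S.get((n - 1, k - 1), 0)
    return S

S = stirling2(5)
for n in range(1, 6):
    print([S[(n, k)] for k in range(1, n + 1)])
# rows: [1], [1, 1], [1, 3, 1], [1, 7, 6, 1], [1, 15, 25, 10, 1]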
Falling Factorials. Consider the vector space of polynomials in indeterminate $x$. The
most obvious basis of this infinite-dimensional vector space is the sequence of monomial
powers: $x^n$, $n \in \mathbb{N}$. However, the sequence of falling factorials:

x^{\underline{n}} = x(x-1)(x-2)\cdots(x-n+1), \qquad n \in \mathbb{N}

is also a basis, and hence can be used to generate the monomial basis. Indeed, the Stirling
numbers of the second kind can be characterized as the coefficients involved in the
corresponding change of basis matrix, i.e.

x^n = \sum_{k=1}^{n} S(n, k)\, x^{\underline{k}}.

So, for example,

x^4 = x + 7x(x-1) + 6x(x-1)(x-2) + x(x-1)(x-2)(x-3).

Arguing inductively, let us prove that this characterization follows from the recurrence rela-
tion. Evidently the formula is true for $n = 1$. Suppose then that the formula is true for a
given $n$. We have

x\, x^{\underline{k}} = x^{\underline{k+1}} + k\, x^{\underline{k}},

and hence using the recurrence relation we deduce that

x^{n+1} = \sum_{k=1}^{n} S(n, k)\, x\, x^{\underline{k}} = \sum_{k=1}^{n} \left( k\, S(n, k)\, x^{\underline{k}} + S(n, k)\, x^{\underline{k+1}} \right) = \sum_{k=1}^{n+1} S(n+1, k)\, x^{\underline{k}}.

Differential operators. Let $D_x$ denote the ordinary derivative, applied to polynomials
in indeterminate $x$, and let $T_x$ denote the differential operator $x D_x$. We have the following
characterization of the Stirling numbers of the second kind in terms of these two operators:

(T_x)^n = \sum_{k=1}^{n} S(n, k)\, x^k (D_x)^k,

where an exponentiated differential operator denotes the operator composed with itself the
indicated number of times. Let us show that this follows from the recurrence relation. The
proof is, once again, inductive. Suppose that the characterization is true for a given $n$. We
have

T_x \left( x^k (D_x)^k \right) = k\, x^k (D_x)^k + x^{k+1} (D_x)^{k+1},

and hence using the recurrence relation we deduce that

(T_x)^{n+1} = x D_x \left( \sum_{k=1}^{n} S(n, k)\, x^k (D_x)^k \right) = \sum_{k=1}^{n} S(n, k) \left( k\, x^k (D_x)^k + x^{k+1} (D_x)^{k+1} \right) = \sum_{k=1}^{n+1} S(n+1, k)\, x^k (D_x)^k.

Double index generating function. One can also characterize the Stirling numbers of
the second kind in terms of the following generating function:

e^{x(e^t - 1)} = 1 + \sum_{n=1}^{\infty} \sum_{k=1}^{n} S(n, k)\, x^k\, \frac{t^n}{n!}.

Let us now prove this. Note that the differential equation

\frac{d\xi}{dt} = \xi

admits the general solution

\xi = e^t x.

It follows that for any polynomial $p(\xi)$ we have

\exp(t T_\xi)[p(\xi)] \Big|_{\xi = x} = \sum_{n=0}^{\infty} \frac{t^n}{n!} (T_\xi)^n [p(\xi)] \Big|_{\xi = x} = p(e^t x).

The proof is simple: just take $D_t$ of both sides. To be more explicit,

D_t \left[ p(e^t x) \right] = p'(e^t x)\, e^t x = T_\xi [p(\xi)] \Big|_{\xi = x e^t},

and that is exactly equal to $D_t$ of the left-hand side. Since this relation holds for all polyno-
mials, it also holds for all formal power series. In particular if we apply the above relation
to $e^\xi$, use the result of the preceding section, and note that

D_\xi [e^\xi] = e^\xi,

we obtain

e^{x e^t} = \sum_{n=0}^{\infty} \frac{t^n}{n!} (T_\xi)^n [e^\xi] \Big|_{\xi = x} = e^x + \sum_{n=1}^{\infty} \sum_{k=1}^{n} S(n, k)\, \frac{t^n}{n!}\, \xi^k (D_\xi)^k [e^\xi] \Big|_{\xi = x} = e^x \left( 1 + \sum_{n=1}^{\infty} \sum_{k=1}^{n} S(n, k)\, x^k\, \frac{t^n}{n!} \right).

Dividing both sides by $e^x$ we obtain the desired generating function. Q.E.D.
Version: 2 Owner: rmilson Author(s): rmilson
Chapter 46
05A19 – Combinatorial identities
46.1 Pascal’s rule
Pascal’s rule is the binomial identity

\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}

where $1 \le k \le n$ and $\binom{n}{k}$ is the binomial coefficient.
Version: 5 Owner: KimJ Author(s): KimJ
Chapter 47
05A99 – Miscellaneous
47.1 principle of inclusion-exclusion
The principle of inclusion-exclusion provides a way of methodically counting the union
of possibly non-disjoint sets.
Let $C = \{A_1, A_2, \dots, A_N\}$ be a finite collection of finite sets. Let $I_k$ represent the set of $k$-fold
intersections of members of $C$ (e.g., $I_2$ contains all possible intersections of two sets chosen
from $C$).
Then

\left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right)

For example:

|A \cup B| = (|A| + |B|) - (|A \cap B|)
|A \cup B \cup C| = (|A| + |B| + |C|) - (|A \cap B| + |A \cap C| + |B \cap C|) + (|A \cap B \cap C|)

The principle of inclusion-exclusion, combined with de Morgan’s theorem, can be used to
count the intersection of sets as well. Let $A$ be some universal set such that $A_k \subseteq A$ for each
$k$, and let $\overline{A_k}$ represent the complement of $A_k$ with respect to $A$. Then we have

\left| \bigcap_{i=1}^{N} A_i \right| = \left| \overline{\bigcup_{i=1}^{N} \overline{A_i}} \right|

thereby turning the problem of finding an intersection into the problem of finding a union.
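A small Python sketch (ours, not part of the entry) checks the union formula against a direct computation for sample sets; the helper name is illustrative.

from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    # |A_1 u ... u A_N| from the signed sums over the k-fold intersections
    total = 0
    for j in range(1, len(sets) + 1):
        sign = (-1) ** (j + 1)
        for group in combinations(sets, j):
            total += sign * len(set.intersection(*group))
    return total

A = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
assert union_size_by_inclusion_exclusion(A) == len(set().union(*A))
print(union_size_by_inclusion_exclusion(A))   # 7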
Version: 2 Owner: vampyr Author(s): vampyr
47.2 principle of inclusion-exclusion proof
The proof is by induction. Consider a single set $A_1$. Then the principle of inclusion-exclusion
states that $|A_1| = |A_1|$, which is trivially true.
Now consider a collection of exactly two sets $A_1$ and $A_2$. We know that

A \cup B = (A \setminus B) \cup (B \setminus A) \cup (A \cap B)

Furthermore, the three sets on the right-hand side of that equation must be disjoint. There-
fore, by the addition principle, we have

|A \cup B| = |A \setminus B| + |B \setminus A| + |A \cap B|
 = |A \setminus B| + |A \cap B| + |B \setminus A| + |A \cap B| - |A \cap B|
 = |A| + |B| - |A \cap B|

So the principle of inclusion-exclusion holds for any two sets.
Now consider a collection of $N > 2$ finite sets $A_1, A_2, \dots, A_N$. We assume that the principle
of inclusion-exclusion holds for any collection of $M$ sets where $1 \le M < N$. Because the
union of sets is associative, we may break up the union of all sets in the collection into a
union of two sets:

\bigcup_{i=1}^{N} A_i = \left( \bigcup_{i=1}^{N-1} A_i \right) \cup A_N

By the principle of inclusion-exclusion for two sets, we have

\left| \bigcup_{i=1}^{N} A_i \right| = \left| \bigcup_{i=1}^{N-1} A_i \right| + |A_N| - \left| \left( \bigcup_{i=1}^{N-1} A_i \right) \cap A_N \right|

Now, let $I_k$ be the collection of all $k$-fold intersections of $A_1, A_2, \dots, A_{N-1}$, and let $I'_k$ be
the collection of all $k$-fold intersections of $A_1, A_2, \dots, A_N$ that include $A_N$. Note that $A_N$ is
included in every member of $I'_k$ and in no member of $I_k$, so the two collections do not duplicate
one another.
We then have

\left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \left| \left( \bigcup_{i=1}^{N-1} A_i \right) \cap A_N \right|

by the principle of inclusion-exclusion for a collection of $N-1$ sets. Then, we may distribute
set intersection over set union to find that

\left| \bigcup_{i=1}^{N} A_i \right| = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \left| \bigcup_{i=1}^{N-1} (A_i \cap A_N) \right|

Note, however, that

(A_x \cap A_N) \cap (A_y \cap A_N) = (A_x \cap A_y \cap A_N)

Hence we may again apply the principle of inclusion-exclusion for $N-1$ sets, revealing that

\left| \bigcup_{i=1}^{N} A_i \right|
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S \cap A_N| \right)
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I'_{j+1}} |S| \right)
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| - \sum_{j=2}^{N} \left( (-1)^{j} \sum_{S \in I'_j} |S| \right)
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + |A_N| + \sum_{j=2}^{N} \left( (-1)^{j+1} \sum_{S \in I'_j} |S| \right)

The second sum does not include $I'_1$. Note, however, that $I'_1 = \{A_N\}$, so we have

\left| \bigcup_{i=1}^{N} A_i \right|
 = \sum_{j=1}^{N-1} \left( (-1)^{j+1} \sum_{S \in I_j} |S| \right) + \sum_{j=1}^{N} \left( (-1)^{j+1} \sum_{S \in I'_j} |S| \right)
 = \sum_{j=1}^{N-1} \left[ (-1)^{j+1} \left( \sum_{S \in I_j} |S| + \sum_{S \in I'_j} |S| \right) \right] + (-1)^{N+1} \sum_{S \in I'_N} |S|

Since $I_j \cup I'_j$ is exactly the collection of all $j$-fold intersections of $A_1, A_2, \dots, A_N$, combining
the two sums yields the principle of inclusion-exclusion for $N$ sets.
Version: 1 Owner: vampyr Author(s): vampyr
Chapter 48
05B15 – Orthogonal arrays, Latin
squares, Room squares
48.1 example of Latin squares
It is easily shown that the multiplication table (Cayley table) of a group has exactly these
properties, and thus is a latin square. The converse, however, is (unfortunately) not true, i.e.
not all Latin squares are multiplication tables for a group (the smallest counterexample is
a Latin square of order 5).
Version: 2 Owner: jgade Author(s): jgade
48.2 graeco-latin squares
Let $A = (a_{ij})$ and $B = (b_{ij})$ be two $n \times n$ matrices. We define their join as the matrix whose
$(i, j)$th entry is the pair $(a_{ij}, b_{ij})$.
A graeco-latin square is then the join of two latin squares.
The name comes from Euler’s use of Greek and Latin letters to differentiate the entries on
each array.
An example of a graeco-latin square:

aα  bβ  cγ  dδ
dγ  cδ  bα  aβ
bδ  aγ  dβ  cα
cβ  dα  aδ  bγ
Version: 1 Owner: drini Author(s): drini
48.3 latin square
A latin square of order $n$ is an $n \times n$ array such that each column and each row are made
with the same $n$ symbols, using every one exactly once.
Examples.

a b c d        1 2 3 4
c d a b        4 3 2 1
d c b a        2 1 4 3
b a d c        3 4 1 2
Version: 1 Owner: drini Author(s): drini
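As a small aside (ours, not part of the entry above), the defining property is easy to test by machine: every row and every column must be a permutation of the same $n$ symbols. The Python sketch below checks the numeric example.

def is_latin_square(square):
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(square[i][j] for i in range(n)) == symbols for j in range(n))
    return rows_ok and cols_ok

example = [[1, 2, 3, 4],
           [4, 3, 2, 1],
           [2, 1, 4, 3],
           [3, 4, 1, 2]]
print(is_latin_square(example))   # True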
48.4 magic square
A magic square of order $n$ is an $n \times n$ array using each one of the numbers $1, 2, 3, \dots, n^2$ once
and such that the sum of the numbers in each row, column or main diagonal is the same.
Example:

8 1 6
3 5 7
4 9 2

It’s easy to prove that the sum is always $\frac{1}{2} n(n^2 + 1)$. So in the example with $n = 3$ the sum
is always $\frac{1}{2}(3 \cdot 10) = 15$.
Version: 1 Owner: drini Author(s): drini
Chapter 49
05B35 – Matroids, geometric lattices
49.1 matroid
A matroid, or an independence structure, is a kind of finite mathematical structure whose
properties imitate the properties of a finite subset of a vector space. Notions such as rank
and independence (of a subset) have a meaning for any matroid, as does the notion of duality.
A matroid permits several equivalent formal definitions: two definitions in terms of a rank
function, one in terms of independent subsets, and several more.
For a finite set $X$, $\beta(X)$ will denote the set of all subsets of $X$, and $|X|$ will denote the
number of elements of $X$. $E$ is a fixed finite set throughout.
Definition 1: A matroid is a pair $(E, r)$ where $r$ is a mapping $\beta(E) \to \mathbb{N}$ satisfying these
axioms:
r1) $r(S) \le |S|$ for all $S \subset E$.
r2) If $S \subset T \subset E$ then $r(S) \le r(T)$.
r3) For any subsets $S$ and $T$ of $E$,

r(S \cup T) + r(S \cap T) \le r(S) + r(T).

The matroid $(E, r)$ is called normal if also
r*) $r(\{e\}) = 1$ for any $e \in E$.
$r$ is called the rank function of the matroid. (r3) is called the submodular inequality.
The notion of isomorphism between one matroid $(E, r)$ and another $(F, s)$ has the expected
meaning: there exists a bijection $f : E \to F$ which preserves rank, i.e. satisfies $s(f(A)) =
r(A)$ for all $A \subset E$.
Definition 2: A matroid is a pair $(E, r)$ where $r$ is a mapping $\beta(E) \to \mathbb{N}$ satisfying these
axioms:
q1) $r(\emptyset) = 0$.
q2) If $x \in E$ and $S \subset E$ then $r(S \cup \{x\}) - r(S) \in \{0, 1\}$.
q3) If $x, y \in E$ and $S \subset E$ and $r(S \cup \{x\}) = r(S \cup \{y\}) = r(S)$ then $r(S \cup \{x, y\}) = r(S)$.
Definition 3: A matroid is a pair $(E, I)$ where $I$ is a subset of $\beta(E)$ satisfying these axioms:
i1) $\emptyset \in I$.
i2) If $S \subset T \subset E$ and $T \in I$ then $S \in I$.
i3) If $S, T \in I$ and $S, T \subset U \subset E$ and $S$ and $T$ are both maximal subsets of $U$ with the
property that they are in $I$, then $|S| = |T|$.
An element of $I$ is called an independent set. $(E, I)$ is called normal if any singleton subset
of $E$ is independent, i.e.
i*) $\{x\} \in I$ for all $x \in E$.
Definition 4: A matroid is a pair $(E, B)$ where $B$ is a subset of $\beta(E)$ satisfying these
axioms:
b1) $B \ne \emptyset$.
b2) If $S, T \in B$ and $S \subset T$ then $S = T$.
b3) If $S, T \in B$ and $x \in T - S$ then there exists $y \in S - T$ such that $(S \cup \{x\}) - \{y\} \in B$.
An element of $B$ is called a basis (of $E$). $(E, B)$ is called normal if also
b*) $\bigcup_{b \in B} b = E$,
i.e. if any singleton subset of $E$ can be extended to a basis.
Definition 5: A matroid is a pair $(E, \varphi)$ where $\varphi$ is a mapping $\beta(E) \to \beta(E)$ satisfying
these axioms:
φ1) $S \subset \varphi(S)$ for all $S \subset E$.
φ2) If $S \subset \varphi(T)$ then $\varphi(S) \subset \varphi(T)$.
φ3) If $x \in \varphi(S \cup \{y\}) - \varphi(S)$ then $y \in \varphi(S \cup \{x\})$.
$\varphi$ is called the span mapping of the matroid, and $\varphi(A)$ is called the span of the subset $A$.
$(E, \varphi)$ is called normal if also
φ*) $\varphi(\emptyset) = \emptyset$.
Definition 6: A matroid is a pair $(E, C)$ where $C$ is a subset of $\beta(E)$ satisfying these
axioms:
c1) $\emptyset \notin C$.
c2) If $S, T \in C$ and $S \subset T$ then $S = T$.
c3) If $S, T \in C$ and $S \ne T$ and $x \in S \cap T$ then there exists $U \in C$ such that $x \notin U$ and
$U \subset S \cup T$.
An element of $C$ is called a circuit. $(E, C)$ is called normal if also
c*) No singleton subset of $E$ is a circuit.
49.1.1 Equivalence of the definitions
It would take several pages to spell out what is a circuit in terms of rank, and likewise for
each other possible pair of the alternative defining notions, and then to prove that the various
sets of axioms unambiguously define the same structure. So let me sketch just one example:
the equivalence of Definitions 1 (on rank) and 6 (on circuits). Assume first the conditions in
Definition 1. Define a circuit as a minimal subset $A$ of $E$ having the property $r(A) < |A|$.
With a little effort, we verify the axioms (c1)-(c3). Now assume (c1)-(c3), and let $r(A)$ be
the largest integer $m$ such that $A$ has a subset $B$ for which
– $B$ contains no element of $C$
– $m = |B|$.
One now proves (r1)-(r3). Next, one shows that if we define $C$ in terms of $r$, and then
another rank function $r'$ in terms of $C$, we end up with $r' = r$. The equivalence of (r*) and
(c*) is easy enough as well.
49.1.2 Examples of matroids
Let $V$ be a vector space over a field $k$, and let $E$ be a finite subset of $V$. For $S \subset E$, let $r(S)$
be the dimension of the subspace of $V$ generated by $S$. Then $(E, r)$ is a matroid. Such a
matroid, or one isomorphic to it, is said to be representable over $k$. The matroid is normal
iff $0 \notin E$. There exist matroids which are not representable over any field.
The second example of a matroid comes from graph theory. The following definition will be
rather informal, partly because the terminology of graph theory is not very well standardised.
For our present purpose, a graph consists of a finite set $V$, whose elements are called vertices,
plus a set $E$ of two-element subsets of $V$, called edges. A circuit in the graph is a finite set
of at least three edges which can be arranged in a cycle:

\{a, b\}, \{b, c\}, \dots, \{y, z\}, \{z, a\}

such that the vertices $a, b, \dots$ are distinct. With circuits thus defined, $E$ satisfies the axioms
in Definition 6, and is thus a matroid, and in fact a normal matroid. (The definition is easily
adjusted to permit graphs with loops, which define non-normal matroids.) Such a matroid,
or one isomorphic to it, is called “graphic”.
Let $E = A \cup B$ be a finite set, where $A$ and $B$ are nonempty and disjoint. Let $G$ be a subset of
$A \times B$. We get a “matching” matroid on $E$ as follows. Each element of $E$ defines a “line”,
which is a subset (a row or column) of the set $A \times B$. Let us call the elements of $G$ “points”.
For any $S \subset E$ let $r(S)$ be the largest number $m$ such that for some set of points $P$:
– $|P| = m$
– No two points of $P$ are on the same line
– Any point of $P$ is on a line defined by an element of $S$.
One can prove (it is not trivial) that $r$ is the rank function of a matroid on $E$. That
matroid is normal iff every line contains at least one point. Matching matroids participate in
combinatorics, in connection with results on “transversals”, such as Hall’s marriage theorem.
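As an illustration of the first example above (ours, not from the entry), the rank function of a small matroid representable over the two-element field can be computed by Gaussian elimination, and the submodular inequality (r3) verified by brute force; the vectors chosen below are arbitrary sample data.

from itertools import combinations, chain

def rank_gf2(vectors):
    # Rank over GF(2) of a list of 0/1 tuples, computed by Gaussian elimination.
    rows = [list(v) for v in vectors]
    if not rows:
        return 0
    rank, ncols = 0, len(rows[0])
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# sample ground set E of GF(2) vectors; r(S) = rank of the vectors indexed by S
E = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1)]
def r(S):
    return rank_gf2([E[i] for i in S])

subsets = list(chain.from_iterable(combinations(range(len(E)), k)
                                   for k in range(len(E) + 1)))
# submodular inequality: r(S u T) + r(S n T) <= r(S) + r(T)
assert all(r(sorted(set(S) | set(T))) + r(sorted(set(S) & set(T))) <= r(S) + r(T)
           for S in subsets for T in subsets)
print("rank of the whole ground set:", r(range(len(E))))   # 3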
49.1.3 The dual of a matroid
Proposition: Let $E$ be a matroid and $r$ its rank function. Define a mapping $r^* : \beta(E) \to \mathbb{N}$
by

r^*(A) = |A| - r(E) + r(E - A).

Then the pair $(E, r^*)$ is a matroid (called the dual of $(E, r)$).
We leave the proof as an exercise. Also, it is easy to verify that the dual of the dual is the
original matroid. A circuit in $(E, r^*)$ is also referred to as a cocircuit of $(E, r)$. There is a
notion of cobasis also, and cospan.
If the dual of $E$ is graphic, $E$ is called cographic. This notion of duality agrees with the
notion of the same name in the theory of planar graphs (and likewise in linear algebra): given
a plane graph, the dual of its matroid is the matroid of the dual graph. A matroid that is
both graphic and cographic is called planar, and various criteria for planarity of a graph can
be extended to matroids. The notion of orientability can also be extended from graphs to
matroids.
49.1.4 Binary matroids
A matroid is said to be binary if it is representable over the field of two elements. There are
several other (equivalent) characterisations of a binary matroid $(E, r)$, such as:
– The symmetric difference of any family of circuits is the union of a family of pairwise
disjoint circuits.
– For any circuit $C$ and cocircuit $D$, we have $|C \cap D| \equiv 0 \pmod{2}$.
Any graphic matroid is binary. The dual of a binary matroid is binary.
49.1.5 Miscellaneous
The definition of the chromatic polynomial of a graph,

\chi(x) = \sum_{F \subset E} (-1)^{|F|} x^{r(E) - r(F)},

extends without change to any matroid. This polynomial has something to say about the
decomposability of matroids into simpler ones.
Also on the topic of decomposability, matroids have a sort of structure theory, in terms of
what are called minors and separators. That theory, due to Tutte, goes by induction; roughly
speaking, it is an adaptation of the old algorithms for putting a matrix into a canonical form.
Along the same lines are several theorems on “basis exchange”, such as the following. Let
$E$ be a matroid and let

A = \{a_1, \dots, a_n\}
B = \{b_1, \dots, b_n\}

be two (equipotent) bases of $E$. There exists a permutation $\psi$ of the set $\{1, \dots, n\}$ such
that, for every $m$ from 0 to $n$,

\{a_1, \dots, a_m, b_{\psi(m+1)}, \dots, b_{\psi(n)}\}

is a basis of $E$.
49.1.6 Further reading
A good textbook is:
James G. Oxley, Matroid Theory, Oxford University Press, New York etc., 1992
plus the updates-and-errata file at Dr. Oxley’s website.
The chromatic polynomial is not discussed in Oxley, but see e.g. Zaslavsky.
Version: 3 Owner: drini Author(s): Larry Hammick, NeuRet
49.2 polymatroid
The polymatroid defined by a given matroid $(E, r)$ is the set of all functions $w : E \to \mathbb{R}$
such that

w(e) \ge 0 \text{ for all } e \in E
\sum_{e \in S} w(e) \le r(S) \text{ for all } S \subset E.

Polymatroids are related to the convex polytopes seen in linear programming, and have
similar uses.
Version: 1 Owner: nobody Author(s): Larry Hammick
Chapter 50
05C05 – Trees
50.1 AVL tree
An AVL tree is a balanced binary search tree where the heights of the two subtrees (children)
of a node differ by at most one. Look-up, insertion, and deletion are $O(\log n)$, where $n$ is
the number of nodes in the tree.
The structure is named for its inventors, Adelson-Velskii and Landis (1962).
Version: 5 Owner: Thomas Heye Author(s): Thomas Heye
50.2 Aronszajn tree
A $\kappa$-tree $T$ for which $|T_\alpha| < \kappa$ for all $\alpha < \kappa$ and which has no cofinal branches is called a
$\kappa$-Aronszajn tree. If $\kappa = \omega_1$ then it is referred to simply as an Aronszajn tree.
If there are no $\kappa$-Aronszajn trees for some $\kappa$ then we say $\kappa$ has the tree property. $\omega$ has
the tree property, but no singular cardinal has the tree property.
Version: 6 Owner: Henry Author(s): Henry
50.3 Suslin tree
An Aronszajn tree is a Suslin tree iff it has no uncountable antichains.
Version: 1 Owner: Henry Author(s): Henry
50.4 antichain
A subset $A$ of a poset $(P, <_P)$ is an antichain if no two elements are comparable. That is,
if $a, b \in A$ then $a \not<_P b$ and $b \not<_P a$.
A maximal antichain of $P$ is one which is maximal (with respect to inclusion).
In particular, if $(P, <_P)$ is a tree then the maximal antichains are exactly those antichains
which intersect every branch, and if the tree is splitting then every level is a maximal
antichain.
Version: 3 Owner: Henry Author(s): Henry
50.5 balanced tree
A balanced tree is a rooted tree where each subtree of the root has an equal number of
nodes (or as near as possible). For an example, see binary tree.
Version: 2 Owner: Logan Author(s): Logan
50.6 binary tree
A binary tree is a rooted tree where every node has two or fewer children. A balanced
binary tree is a binary tree that is also a balanced tree. For example,

        A
      /   \
     B     E
    / \   / \
   C   D F   G

is a balanced binary tree.
The two (potential) children of a node in a binary tree are often called the left and right
children of that node. The left child of some node X and all that child’s descendents are the
left descendents of X. A similar definition applies to X’s right descendents. The left
subtree of X is X’s left descendents, and the right subtree of X is its right descendents.
Since we know the maximum number of children a binary tree node can have, we can make
some statements regarding the minimum and maximum depth of a binary tree as it relates to
the total number of nodes. The maximum depth of a binary tree of $n$ nodes is $n - 1$ (every
non-leaf node has exactly one child). The minimum depth of a binary tree of $n$ nodes ($n > 0$)
is $\lfloor \log_2 n \rfloor$ (every non-leaf node has exactly two children, that is, the tree is balanced).
A binary tree can be implicitly stored as an array, if we designate a constant, maximum
depth for the tree. We begin by storing the root node at index 0 in the array. We then store
its left child at index 1 and its right child at index 2. The children of the node at index 1
are stored at indices 3 and 4, and the children of the node at index 2 are stored at indices 5
and 6. This can be generalized as: if a node is stored at index $k$, then its left child is located
at index $2k + 1$ and its right child at $2k + 2$. This form of implicit storage thus eliminates
all overhead of the tree structure, but is only really advantageous for trees that tend to be
balanced. For example, here is the implicit array representation of the tree shown above.
A B E C D F G
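A minimal Python sketch (ours, not part of the entry) of the index arithmetic just described, applied to the array representation above.

tree = ["A", "B", "E", "C", "D", "F", "G"]   # implicit array storage, root at index 0

def left(k):   return 2 * k + 1
def right(k):  return 2 * k + 2
def parent(k): return (k - 1) // 2

k = tree.index("B")
print(tree[left(k)], tree[right(k)], tree[parent(k)])   # C D A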
Many data structures are binary trees. For instance, heaps and binary search trees are binary
trees with particular properties.
Version: 3 Owner: Daume Author(s): Daume, Logan
50.7 branch
A subset $B$ of a tree $(T, <_T)$ is a branch if $B$ is a maximal linearly ordered subset of $T$.
That is:
• $<_T$ is a linear ordering of $B$
• If $t \in T \setminus B$ then $B \cup \{t\}$ is not linearly ordered by $<_T$.
This is the same as the intuitive conception of a branch: it is a set of nodes starting at the
root and going all the way to the tip (in infinite sets the conception is more complicated,
since there may not be a tip, but the idea is the same). Since branches are maximal there is
no way to add an element to a branch and have it remain a branch.
A cofinal branch is a branch which intersects every level of the tree.
Version: 1 Owner: Henry Author(s): Henry
50.8 child node (of a tree)
A child node $C$ of a node $N$ in a tree is any node connected to $N$ whose path distance
from the root node $R$ is one greater than the path distance between $R$ and $N$.
Drawn in the canonical root-at-top manner, a child node of a node $N$ in a tree is simply any
node immediately below $N$ which is connected to it.
Figure: A node (blue) and its children (red).
Version: 1 Owner: akrowne Author(s): akrowne
50.9 complete binary tree
A complete binary tree is a binary tree with the additional property that every node
must have exactly two “children” if an internal node, and zero children if a leaf node.
More precisely: for our base case, the complete binary tree of exactly one node is simply
the tree consisting of that node by itself. The property of being “complete” is preserved
if, at each step, we expand the tree by connecting exactly zero or two individual nodes (or
complete binary trees) to any node in the tree (but both must be connected to the same
node.)
Version: 4 Owner: akrowne Author(s): akrowne
50.10 digital search tree
A digital search tree is a tree which stores strings internally so that there is no need for
extra leaf nodes to store the strings.
Version: 5 Owner: Logan Author(s): Logan
50.11 digital tree
A digital tree is a tree for storing a set of strings where nodes are organized by substrings
common to two or more strings. Examples of digital trees are digital search trees and tries.
Version: 3 Owner: Logan Author(s): Logan
50.12 example of Aronszajn tree
Construction 1: If $\kappa$ is a singular cardinal then there is a simple construction of a $\kappa$-
Aronszajn tree. Let $\langle k_\beta \rangle_{\beta < \iota}$ with $\iota < \kappa$ be a sequence cofinal in $\kappa$. Then consider the tree
where $T = \{(\alpha, k_\beta) \mid \alpha < k_\beta \wedge \beta < \iota\}$ with $(\alpha_1, k_{\beta_1}) <_T (\alpha_2, k_{\beta_2})$ iff $\alpha_1 < \alpha_2$ and $k_{\beta_1} = k_{\beta_2}$.
Note that this is similar to (indeed, a subtree of) the construction given for a tree with no
cofinal branches. It consists of $\iota$ disjoint branches, with the $\beta$-th branch of height $k_\beta$. Since
$\iota < \kappa$, every level has fewer than $\kappa$ elements, and since the sequence is cofinal in $\kappa$, $T$ must
have height and cardinality $\kappa$.
Construction 2: We can construct an Aronszajn tree out of the compact subsets of $\mathbb{Q}^+$.
$<_T$ will be defined by $x <_T y$ iff $y$ is an end-extension of $x$. That is, $x \subseteq y$ and if $n \in y \setminus x$
and $m \in x$ then $m < n$.
Let $T_0 = \{\{0\}\}$. Given a level $T_\alpha$, let $T_{\alpha+1} = \{x \cup \{q\} \mid x \in T_\alpha \wedge q > \max x\}$. That is, for
every element $x$ in $T_\alpha$ and every rational number $q$ larger than any element of $x$, $x \cup \{q\}$ is
an element of $T_{\alpha+1}$. If $\alpha < \omega_1$ is a limit ordinal then each element of $T_\alpha$ is the union of some
branch in $T(\alpha)$.
We can show by induction that $|T_\alpha| < \omega_1$ for each $\alpha < \omega_1$. For the base case, $T_0$ has only one
element. If $|T_\alpha| < \omega_1$ then $|T_{\alpha+1}| = |T_\alpha| \cdot |\mathbb{Q}| = |T_\alpha| \cdot \omega = \omega < \omega_1$. If $\alpha < \omega_1$ is a limit ordinal
then $T(\alpha)$ is a countable union of countable sets, and therefore itself countable. Therefore
there are a countable number of branches, so $T_\alpha$ is also countable. So $T$ has countable levels.
Suppose $T$ has an uncountable branch, $B = \langle b_0, b_1, \dots \rangle$. Then for any $i < j < \omega_1$, $b_i \subset b_j$.
Then for each $i$, there is some $x_i \in b_{i+1} \setminus b_i$ such that $x_i$ is greater than any element of
$b_i$. Then $\langle x_0, x_1, \dots \rangle$ is an uncountable increasing sequence of rational numbers. Since the
rational numbers are countable, there is no such sequence, so $T$ has no uncountable branch,
and is therefore Aronszajn.
Version: 1 Owner: Henry Author(s): Henry
50.13 example of tree (set theoretic)
The set $\mathbb{Z}^+$ is a tree with $<_T$ equal to $<$. This isn’t a very interesting tree, since it simply consists
of a line of nodes. However, note that the height is $\omega$ even though no particular node has
that height.
A more interesting tree using $\mathbb{Z}^+$ defines $m <_T n$ if $i^a = m$ and $i^b = n$ for some $i, a, b \in
\mathbb{Z}^+ \cup \{0\}$ with $a < b$. Then 1 is the root, and all numbers which are not powers of another number are
in $T_1$. Then all squares (which are not also fourth powers) form $T_2$, and so on.
To illustrate the concept of a cofinal branch, observe that for any limit ordinal $\kappa$ we can
construct a $\kappa$-tree which has no cofinal branches. We let $T = \{(\alpha, \beta) \mid \alpha < \beta < \kappa\}$ and
$(\alpha_1, \beta_1) <_T (\alpha_2, \beta_2) \leftrightarrow \alpha_1 < \alpha_2 \wedge \beta_1 = \beta_2$. The tree then has $\kappa$ disjoint branches, each
consisting of the set $\{(\alpha, \beta) \mid \alpha < \beta\}$ for some $\beta < \kappa$. No branch is cofinal, since each branch
is capped at $\beta$ elements, but for any $\gamma < \kappa$, there is a branch of height $\gamma + 1$. Hence the
supremum of the heights is $\kappa$.
Version: 1 Owner: Henry Author(s): Henry
50.14 extended binary tree
An extended binary tree is a transformation of any binary tree into a complete binary tree.
This transformation consists of replacing every null subtree of the original tree with “special
nodes.” The nodes from the original tree are then internal nodes, while the “special nodes”
are external nodes.
For instance, consider the following binary tree.
The following tree is its extended binary tree. Empty circles represent internal nodes, and
filled circles represent external nodes.
Every internal node in the extended tree has exactly two children, and every external node
is a leaf. The result is a complete binary tree.
Version: 4 Owner: Logan Author(s): Logan
50.15 external path length
Given a binary tree $T$, construct its extended binary tree $T'$. The external path length
of $T$ is then defined to be the sum of the lengths of the paths to each of the external nodes.
For example, let $T$ be the following tree.
The extended binary tree of $T$ is
The external path length of $T$ (denoted $E$) is

E = 2 + 3 + 3 + 3 + 3 + 3 + 3 = 20

The internal path length of $T$ is defined to be the sum of the lengths of the paths to each
of the internal nodes. The internal path length of our example tree (denoted $I$) is

I = 1 + 2 + 0 + 2 + 1 + 2 = 8

Note that in this case $E = I + 2n$, where $n$ is the number of internal nodes. This happens
to hold for all binary trees.
Version: 1 Owner: Logan Author(s): Logan
50.16 internal node (of a tree)
An internal node of a tree is any node which has degree greater than one. Or, phrased in
rooted tree terminology, the internal nodes of a tree are the nodes which have at least one
child node.
Figure: A tree with internal nodes highlighted in red.
Version: 3 Owner: akrowne Author(s): akrowne
50.17 leaf node (of a tree)
A leaf of a tree is any node which has degree exactly 1. Put another way, a leaf node of
a rooted tree is any node which has no child nodes.
Figure: A tree with leaf nodes highlighted in red.
Version: 2 Owner: akrowne Author(s): akrowne
50.18 parent node (in a tree)
A parent node $P$ of a node $C$ in a tree is the first node which lies along the path from $C$
to the root of the tree, $R$.
Drawn in the canonical root-at-top manner, the parent node of a node $C$ in a tree is simply
the node immediately above $C$ which is connected to it.
Figure: A node (blue) and its parent (red).
Version: 2 Owner: akrowne Author(s): akrowne
50.19 proof that ω has the tree property
Let $T$ be a tree with finite levels and an infinite number of elements. Then consider the
elements of $T_0$. $T$ can be partitioned into the sets of descendants of each of these elements,
and since any finite partition of an infinite set has at least one infinite class, some element
$x_0$ in $T_0$ has an infinite number of descendants. The same procedure can be applied to the
children of $x_0$ to give an element $x_1 \in T_1$ which has an infinite number of descendants, and
then to the children of $x_1$, and so on. This gives a sequence $X = \langle x_0, x_1, \dots \rangle$. The sequence
is infinite since each element has an infinite number of descendants, and since $x_{i+1}$ is always
a child of $x_i$, $X$ is a branch, and therefore an infinite branch of $T$.
Version: 2 Owner: Henry Author(s): Henry
50.20 root (of a tree)
The root of a tree is a place-holder node. It is typically drawn at the top of the page, with
the other nodes below (with all nodes having the same path distance from the root at the
same height.)
Figure: A tree with root highlighted in red.
Any tree can be redrawn this way, selecting any node as the root. This is important to
note: taken as a graph in general, the notion of “root” is meaningless. We introduce a root
explicitly when we begin speaking of a graph as a tree– there is nothing in general that
selects a root for us.
However, there are some special cases of trees where the root can be distinguished from the
other nodes implicitly due to the properties of the tree. For instance, a root is uniquely
identifiable in a complete binary tree, where it is the only node with degree two.
Version: 4 Owner: akrowne Author(s): akrowne
50.21 tree
Formally, a forest is an undirected, acyclic graph. A forest consists of trees, which are
themselves acyclic, connected graphs. For example, the following diagram represents a forest,
each connected component of which is a tree.
• • • • • •
• • • • •
All trees are forests, but not all forests are trees. As in a graph, a forest is made up of vertices
(which are often called nodes interchangeably) and edges. Like any graph, the vertices and
edges may each be labelled — that is, associated with some atom of data. Therefore a forest
or a tree is often used as a data structure.
Often a particular node of a tree is specified as the root. Such trees are typically drawn with
the root at the top of the diagram, with all other nodes depending down from it (however
this is not always the case). A tree where a root has been specified is called a rooted tree. A
tree where no root has been specified is called a free tree. When speaking of tree traversals,
and most especially of trees as datastructures, rooted trees are often implied.
The edges of a rooted tree are often treated as directed. In a rooted tree, every non-root
node has exactly one edge that leads to the root. This edge can be thought of as connecting
each node to its parent. Often rooted trees are considered directed in the sense that all edges
connect parents to their children, but not vice-versa. Given this parent-child relationship, a
descendant of a node in a directed tree is defined as any other node reachable from that
node (that is, a node’s children and all their descendants).
Given this directed notion of a rooted tree, a rooted subtree can be defined as any node
of a tree and all of its descendants. This notion of a rooted subtree is very useful in dealing
with trees inductively and defining certain algorithms inductively.
Because of their simple structure and unique properties, trees and forests have many uses.
Because of the simple definition of various tree traversals, they are often used to store and
lookup data. Many algorithms are based upon trees, or depend upon a tree in some manner,
such as the heapsort algorithm or Huffman encoding. There are also a great many specific
forms and families of trees, each with its own constraints, strengths, and weaknesses.
Version: 6 Owner: Logan Author(s): Logan
50.22 weight-balanced binary trees are ultrametric
Let $X$ be the set of leaf nodes in a weight-balanced binary tree. Let the distance between
leaf nodes be identified with the weighted path length between them. We will show that this
distance metric on $X$ is ultrametric.
Before we begin, let the join of any two nodes $x, y$, denoted $x \vee y$, be defined as the node
$z$ which is the most immediate common ancestor of $x$ and $y$ (that is, the common ancestor
which is farthest from the root). Also, we are using weight-balanced in the sense that
• the weighted path length from the root to each leaf node is equal, and
• each subtree is weight-balanced, too.
Lemma: two properties of weight-balanced trees
Because the tree is weight-balanced, the distances between any node and each of the leaf
node descendents of that node are equal. So, for any leaf nodes $x, y$,

d(x, x \vee y) = d(y, x \vee y) \qquad (50.22.1)

Hence,

d(x, y) = d(x, x \vee y) + d(y, x \vee y) = 2\, d(x, x \vee y) \qquad (50.22.2)

Back to the main proof
We will now show that the ultrametric three point condition holds for any three leaf nodes
in a weight-balanced binary tree.
Consider any three points $a, b, c$ in a weight-balanced binary tree. If $d(a,b) = d(b,c) = d(a,c)$,
then the three point condition holds. Now assume this is not the case. Without loss of
generality, assume that $d(a,b) < d(a,c)$.
Applying Eqn. 50.22.2,

2\, d(a, a \vee b) < 2\, d(a, a \vee c)
d(a, a \vee b) < d(a, a \vee c)

Note that both $a \vee b$ and $a \vee c$ are ancestors of $a$. Hence, $a \vee c$ is a more distant ancestor of
$a$, and so $a \vee c$ must be an ancestor of $a \vee b$.
Now, consider the path between $b$ and $c$. To get from $b$ to $c$ one goes from $b$ up to $a \vee b$, then
up to $a \vee c$, and then down to $c$. Since this is a tree, this is the only path. The highest node
in this path (the ancestor of both $b$ and $c$) was $a \vee c$, so the distance $d(b,c) = 2\, d(b, a \vee c)$.
But by Eqn. 50.22.1 and Eqn. 50.22.2 (noting that $b$ is a descendent of $a \vee c$), we have

d(b, c) = 2\, d(b, a \vee c) = 2\, d(a, a \vee c) = d(a, c)

To summarize, we have $d(a,b) < d(b,c) = d(a,c)$, which is the desired ultrametric three
point condition. So we are done.
Note that this means that, if $a, b$ are leaf nodes, and you are at a node outside the subtree
under $a \vee b$, then $d(\text{you}, a) = d(\text{you}, b)$. In other words, (from the point of view of distance
between you and them,) the structure of any subtree that is not your own doesn’t matter to
you. This is expressed in the three point condition as ”if two points are closer to each other
than they are to you, then their distance to you is equal”.
(Above, we have only proved this if you are at a leaf node, but it works for any node which is
outside the subtree under $a \vee b$, because the paths to $a$ and $b$ must both pass through $a \vee b$.)
Version: 2 Owner: bshanks Author(s): bshanks
50.23 weighted path length
Given an extended binary tree $T$ (that is, simply any complete binary tree, where leaves are
denoted as external nodes), associate weights with each external node. The weighted
path length of $T$ is the sum of the product of the weight and path length of each external
node, over all external nodes.
Another formulation is that the weighted path length is $\sum_j w_j l_j$ over all external nodes $j$, where
$w_j$ is the weight of an external node $j$, and $l_j$ is the distance from the root of the tree to $j$.
If $w_j = 1$ for all $j$, then the weighted path length is exactly the same as the external path length.
Example
Let $T$ be the following extended binary tree. Square nodes are external nodes, and circular
nodes are internal nodes. Values in external nodes indicate weights, which are given in this
problem, while values in internal nodes represent the weighted path length of subtrees rooted
at those nodes, and are calculated from the given weights and the given tree. The weight of
the tree as a whole is given at the root of the tree.
This tree happens to give the minimum weighted path length for this particular set of
weights.
Version: 1 Owner: Logan Author(s): Logan
Chapter 51
05C10 – Topological graph theory,
imbedding
51.1 Heawood number
The Heawood number of a surface is the maximal number of colors needed to color any graph
embedded in the surface. For example, the four-color conjecture states that the Heawood number
of the sphere is four.
In 1890 Heawood proved for all surfaces except the sphere that the Heawood number is

H(S) \le \left\lfloor \frac{7 + \sqrt{49 - 24\, e(S)}}{2} \right\rfloor,

where $e(S)$ is the Euler characteristic of the surface.
Later it was proved in the works of Franklin, Ringel and Youngs that

H(S) \ge \left\lfloor \frac{7 + \sqrt{49 - 24\, e(S)}}{2} \right\rfloor.

For example, the complete graph on 7 vertices can be embedded in the torus as follows:
[Figure: an embedding of the complete graph $K_7$ in the torus, drawn on a square with opposite sides identified; vertices are labelled 1–7, with the labels 1, 2, 3 repeated along the identified edges.]
REFERENCES
1. Béla Bollobás. Graph Theory: An Introductory Course, volume 63 of GTM. Springer-Verlag,
1979. Zbl 0411.05032.
2. Thomas L. Saaty and Paul C. Kainen. The Four-Color Problem: Assaults and Conquest. Dover,
1986. Zbl 0463.05041.
Version: 6 Owner: bbukh Author(s): bbukh
51.2 Kuratowski’s theorem
A finite graph is planar if and only if it contains no subgraph that is isomorphic to or is
a subdivision of $K_5$ or $K_{3,3}$, where $K_5$ is the complete graph of order 5 and $K_{3,3}$ is the
complete bipartite graph of order 6. Wagner’s theorem is an equivalent later result.
REFERENCES
1. Kazimierz Kuratowski. Sur le problème des courbes gauches en topologie. Fund. Math., 15:271–
283, 1930.
Version: 7 Owner: bbukh Author(s): bbukh, digitalis
51.3 Szemerédi-Trotter theorem
The number of incidences of a set of $n$ points and a set of $m$ lines in the real plane $\mathbb{R}^2$ is

I = O\!\left(n + m + (nm)^{2/3}\right).

Proof. Let’s consider the points as vertices of a graph, and connect two vertices by an edge
if they are adjacent on some line. Then the number of edges is $e = I - m$. If $e < 4n$ then
we are done. If $e \ge 4n$ then by the crossing lemma

m^2 \ge \mathrm{cr}(G) \ge \frac{1}{64} \frac{(I - m)^3}{n^2},

and the theorem follows.
Recently, Tóth [1] extended the theorem to the complex plane $\mathbb{C}^2$. The proof is difficult.
REFERENCES
1. Csaba D. Tóth. The Szemerédi-Trotter theorem in the complex plane. arXiv:CO/0305283, May
2003.
Version: 3 Owner: bbukh Author(s): bbukh
51.4 crossing lemma
The crossing number of a graph $G$ with $n$ vertices and $m \ge 4n$ edges is

\mathrm{cr}(G) \ge \frac{1}{64} \frac{m^3}{n^2}.
Version: 1 Owner: bbukh Author(s): bbukh
51.5 crossing number
The crossing number cr(G) of a graph G is the minimal number of crossings among all
embeddings of G in the plane.
Version: 1 Owner: bbukh Author(s): bbukh
51.6 graph topology
A graph $(V, E)$ is identified by its vertices $V = \{v_1, v_2, \ldots\}$ and its edges $E = \{\{v_i, v_j\}, \{v_k, v_l\}, \ldots\}$.
A graph also admits a natural topology, called the graph topology, by identifying every
edge $\{v_i, v_j\}$ with the unit interval $I = [0, 1]$ and gluing them together at coincident vertices.
This construction can be easily realized in the framework of simplicial complexes. We can
form a simplicial complex $G = \{\{v\} \mid v \in V\} \cup E$. And the desired topological realization
of the graph is just the geometric realization $|G|$ of $G$.
Viewing a graph as a topological space has several advantages:
• The notion of graph isomorphism simply becomes that of homeomorphism.
• The notion of a connected graph coincides with topological connectedness.
• A connected graph is a tree iff its fundamental group is trivial.
Version: 3 Owner: igor Author(s): igor
51.7 planar graph
A planar graph is a graph which can be drawn on a plane (flat 2-d surface) with no edge
crossings.
No complete graphs above $K_4$ are planar. $K_4$, drawn without crossings, looks like:
[Figure: a planar drawing of $K_4$ on the vertices A, B, C, D.]
Hence it is planar (try this for $K_5$).
Version: 3 Owner: akrowne Author(s): akrowne
51.8 proof of crossing lemma
Euler’s formula implies the linear lower bound cr(G) ` :−3:+6, and so it cannot be used
directly. What we need is to consider the subgraphs of our graph, apply Euler’s formula on
them, and then combine the estimates. The probabilistic method provides a natural way to
do that.
Consider a minimal embedding of G. Choose independently every vertex of G with probabil-
ity j. Let G
p
be a graph induced by those vertices. By Euler’s formula, cr(G
p
)−:
p
+3:
p
` 0.
The expectation is clearly
1(cr(G
p
) −:
p
+ 3:
p
) ` 0.
Since 1(:
p
) = j:, 1(:
p
) = j
2
: and 1(A
p
) = j
4
cr(G), we get an inequality that bounds
the crossing number of G from below,
cr(G) ` j
−2
:−3j
−3
:.
Now set j =
4n
m
(which is at most 1 since : ` 4:), and the inequaliy becomes
cr(G) `
1
64
:
3
:
2
.
Similarly, if : `
9
2
:, then we can set j =
9n
2m
to get
cr(G) `
4
243
:
3
:
2
.
REFERENCES
1. Martin Aigner and Günter M. Ziegler. Proofs from THE BOOK. Springer, 1999.
Version: 2 Owner: bbukh Author(s): bbukh
Chapter 52
05C12 – Distance in graphs
52.1 Hamming distance
In comparing two bit patterns, the Hamming distance is the count of bits different in the two
patterns. More generally, if two ordered lists of items are compared, the Hamming distance
is the number of items that do not identically agree. This distance is applicable to encoded
information, and is a particularly simple metric of comparison, often more useful than the
city-block distance or Euclidean distance.
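A short Python sketch (our own illustration, not part of the original entry) of the definition
for equal-length sequences:

    def hamming_distance(a, b):
        """Count the positions at which two equal-length sequences differ."""
        if len(a) != len(b):
            raise ValueError("sequences must have equal length")
        return sum(1 for x, y in zip(a, b) if x != y)

    print(hamming_distance("10110", "11100"))             # 2 differing bits
    print(hamming_distance([1, 2, 3, 4], [1, 0, 3, 9]))   # 2 differing items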
References
• Originally from The Data Analysis Briefbook (http://rkb.home.cern.ch/rkb/titleA.html)
Version: 5 Owner: akrowne Author(s): akrowne
Chapter 53
05C15 – Coloring of graphs and
hypergraphs
53.1 bipartite graph
A bipartite graph is a graph with a chromatic number of 2.
The following graph, for example, is bipartite:
[Figure: an example of a bipartite graph on eight labelled vertices.]
One way to think of a bipartite graph is by partitioning the vertices into two disjoint sets
where vertices in one set are adjacent only to vertices in the other set. In the above graph,
this may be more obvious with a different representation:
[Figure: the same graph redrawn with the two vertex classes in separate columns.]
The two subsets are the two columns of vertices; within each column, all vertices have the same colour.
A graph is bipartite if and only if all its cycles have even length. This is easy to see intuitively:
any path of odd length in a bipartite graph must end on a vertex of the opposite colour from the
beginning vertex and hence cannot be a cycle.
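The even-cycle characterization suggests a practical test by breadth-first 2-colouring. The
following Python sketch (our own illustration; the adjacency-list input format is an assumption)
reports whether a graph is bipartite:

    from collections import deque

    def is_bipartite(adj):
        """adj: dict mapping each vertex to an iterable of its neighbours."""
        colour = {}
        for start in adj:
            if start in colour:
                continue
            colour[start] = 0
            queue = deque([start])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in colour:
                        colour[v] = 1 - colour[u]   # give the neighbour the opposite colour
                        queue.append(v)
                    elif colour[v] == colour[u]:    # an odd cycle has been found
                        return False
        return True

    square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}   # 4-cycle: bipartite
    triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}            # 3-cycle: not bipartite
    print(is_bipartite(square), is_bipartite(triangle))     # True False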
Version: 5 Owner: vampyr Author(s): vampyr
53.2 chromatic number
The chromatic number of a graph is the minimum number of colours required to colour
it.
Consider the following graph:
[Figure: a graph on six vertices, properly coloured with three colours.]
This graph has been coloured using 3 colours. Furthermore, it is clear that it cannot be
coloured with fewer than 3 colours: it contains a subgraph (a triangle) that is isomorphic
to the complete graph on 3 vertices. As a result, the chromatic number of this graph is indeed
3.
This example was easy to solve by inspection. In general, however, finding the chromatic
number of a large graph (and, similarly, an optimal colouring) is a very difficult (NP-hard)
problem.
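Because the exact problem is NP-hard, heuristics are common in practice. Below is a Python
sketch (our own illustration) of the simple greedy heuristic, which yields an upper bound on
the chromatic number rather than the optimum:

    def greedy_colouring(adj, order=None):
        """Return a dict vertex -> colour index; uses at most max_degree + 1 colours."""
        colour = {}
        for u in (order or list(adj)):
            used = {colour[v] for v in adj[u] if v in colour}
            c = 0
            while c in used:            # smallest colour not used by a coloured neighbour
                c += 1
            colour[u] = c
        return colour

    # A triangle with a pendant vertex attached needs 3 colours; greedy finds 3 here.
    adj = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D'], 'D': ['C']}
    col = greedy_colouring(adj)
    print(col, 1 + max(col.values()))   # e.g. {'A': 0, 'B': 1, 'C': 2, 'D': 0} 3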
Version: 2 Owner: vampyr Author(s): vampyr
53.3 chromatic number and girth
A famous theorem of P. Erdős (see the very readable P. Erdős, Graph theory and probability,
Canad. J. Math. 11 (1959), 34–38).
Theorem 6. For any natural numbers $k$ and $g$, there exists a graph $G$ with chromatic number
$\chi(G) \ge k$ and girth $\operatorname{girth}(G) \ge g$.
Obviously, we can easily have graphs with high chromatic numbers. For instance, the
complete graph $K_n$ trivially has $\chi(K_n) = n$; however $\operatorname{girth}(K_n) = 3$ (for $n \ge 3$). And
the cycle graph $C_n$ has $\operatorname{girth}(C_n) = n$, but
$$\chi(C_n) = \begin{cases} 1 & n = 1 \\ 2 & n \text{ even} \\ 3 & \text{otherwise.} \end{cases}$$
It seems intuitively plausible that a high chromatic number occurs because of short, “local”
cycles in the graph; it is hard to envisage how a graph with no short cycles can still have
high chromatic number.
Instead of envisaging, Erdős' proof shows that, in some appropriately chosen probability space
on graphs with $n$ vertices, the probability of choosing a graph which does not have $\chi(G) \ge k$
and $\operatorname{girth}(G) \ge g$ tends to zero as $n$ grows. In particular, the desired graphs exist.
This seminal paper is probably the most famous application of the probabilistic method, and
is regarded by some as the foundation of the method. (However, as always, with the benefit of
hindsight we can see that the probabilistic method had been used before, e.g. in various
applications of Sard's theorem. This does nothing to diminish the importance of the clear
statement of the tool.) Today the probabilistic method is a standard tool for combinatorics.
More constructive methods are often preferred, but are almost always much harder.
Version: 3 Owner: ariels Author(s): ariels
53.4 chromatic polynomial
Let $G$ be a graph (in the sense of graph theory) whose set $V$ of vertices is finite and nonempty,
and which has no loops or multiple edges. For any natural number $x$, let $\chi(G, x)$, or just $\chi(x)$,
denote the number of $x$-colorations of $G$, i.e. the number of mappings $f : V \to \{1, 2, \ldots, x\}$
such that $f(a) \ne f(b)$ for any pair $(a, b)$ of adjacent vertices. Let us prove that $\chi$ (which
is called the chromatic polynomial of the graph $G$) is a polynomial function in $x$ with
coefficients in $\mathbb{Z}$. Write $E$ for the set of edges in $G$. If $|E| = 0$, then trivially $\chi(x) = x^{|V|}$
(where $|\ |$ denotes the number of elements of a finite set). If not, then we choose an edge $e$
and construct two graphs having fewer edges than $G$: $H$ is obtained from $G$ by contracting
the edge $e$, and $K$ is obtained from $G$ by omitting the edge $e$. We have
$$\chi(G, x) = \chi(K, x) - \chi(H, x) \qquad (53.4.1)$$
for all $x \in \mathbb{N}$, because the polynomial $\chi(K, x)$ is the number of colorations of the vertices of
$G$ which might or might not be valid for the edge $e$, while $\chi(H, x)$ is the number which are
not valid. By induction on $|E|$, (53.4.1) shows that $\chi(G, x)$ is a polynomial over $\mathbb{Z}$.
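The recursion (53.4.1) can be turned directly into code. The following Python sketch (our own
illustration, exponential in the number of edges and meant only for very small graphs)
evaluates $\chi(G, x)$ at a given $x$:

    def chromatic_polynomial(vertices, edges, x):
        """Evaluate chi(G, x) by the deletion-contraction recursion
        chi(G, x) = chi(G - e, x) - chi(G / e, x).
        vertices: a set; edges: a set of frozensets {u, v} (no loops, no multi-edges)."""
        vertices, edges = set(vertices), set(edges)
        if not edges:
            return x ** len(vertices)
        e = next(iter(edges))
        u, v = tuple(e)
        deleted = edges - {e}
        # Contract e: replace v by u everywhere, discarding the loop that e becomes.
        contracted_vertices = vertices - {v}
        contracted_edges = set()
        for f in deleted:
            g = frozenset(u if w == v else w for w in f)
            if len(g) == 2:                 # drop loops created by the contraction
                contracted_edges.add(g)
        return (chromatic_polynomial(vertices, deleted, x)
                - chromatic_polynomial(contracted_vertices, contracted_edges, x))

    # The triangle K3 has chi(x) = x(x-1)(x-2); at x = 3 this is 6.
    tri_edges = {frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})}
    print(chromatic_polynomial({1, 2, 3}, tri_edges, 3))   # 6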
By refining the argument a little, one can show
$$\chi(x) = x^{|V|} - |E|\,x^{|V|-1} + \ldots \pm m\,x^{k},$$
for some nonzero integer $m$, where $k$ is the number of connected components of $G$, and the
coefficients alternate in sign.
With the help of the Möbius-Rota inversion formula (see Moebius inversion), or directly by
induction, one can prove
$$\chi(x) = \sum_{F \subseteq E} (-1)^{|F|}\, x^{|V| - r(F)}$$
where the sum is over all subsets $F$ of $E$, and $r(F)$ denotes the rank of $F$ in $G$, i.e. the
number of elements of any maximal cycle-free subset of $F$. (Alternatively, the sum may be
taken only over subsets $F$ such that $F$ is equal to the span of $F$; all other summands cancel
out in pairs.)
The chromatic number of $G$ is the smallest $x > 0$ such that $\chi(G, x) > 0$ or, equivalently,
such that $\chi(G, x) \ne 0$.
The Tutte polynomial of a graph, or more generally of a matroid $(E, r)$, is this function
of two variables:
$$t(x, y) = \sum_{F \subseteq E} (x - 1)^{r(E) - r(F)}\,(y - 1)^{|F| - r(F)}.$$
Compared to the chromatic polynomial, the Tutte polynomial contains more information about the
matroid. Still, two or more nonisomorphic matroids may have the same Tutte polynomial.
Version: 5 Owner: bbukh Author(s): bbukh, Larry Hammick
53.5 colouring problem
The colouring problem is to assign a colour to every vertex of a graph such that no two
adjacent vertices have the same colour. These colours, of course, are not necessarily colours
in the optic sense.
Consider the following graph:
[Figure: a graph on six vertices.]
One potential colouring of this graph is:
[Figure: the same graph with a proper colouring in three colours.]
A and C have the same colour; the remaining four vertices split into two further pairs, each
pair receiving a colour of its own.
Graph colouring problems have many applications in such situations as scheduling and
matching problems.
Version: 3 Owner: vampyr Author(s): vampyr
53.6 complete bipartite graph
The complete bipartite graph $K_{n,m}$ is a graph with two sets of vertices, one with $n$
members and one with $m$, such that each vertex in one set is adjacent to every vertex in the
other set and to no vertex in its own set. As the name implies, $K_{n,m}$ is bipartite.
Examples of complete bipartite graphs:
$K_{2,5}$:
[Figure: the complete bipartite graph $K_{2,5}$.]
$K_{3,3}$:
[Figure: the complete bipartite graph $K_{3,3}$.]
Version: 3 Owner: vampyr Author(s): vampyr
53.7 complete k-partite graph
The complete $k$-partite graph $K_{a_1, a_2, \ldots, a_k}$ is a $k$-partite graph with $a_1, a_2, \ldots, a_k$ vertices of each
colour wherein every vertex is adjacent to every other vertex with a different colour and to
no vertices with the same colour.
For example, the 3-partite complete graph $K_{2,3,4}$:
[Figure: the complete 3-partite graph $K_{2,3,4}$.]
Version: 3 Owner: vampyr Author(s): vampyr
53.8 four-color conjecture
The four-color conjecture was a long-standing problem posed by Guthrie while coloring a
map of England. The conjecture states that every map on a plane or a sphere can be colored
using only four colors such that no two adjacent countries are assigned the same color. This
is equivalent to the statement that the chromatic number of every planar graph is no more than
four. After many unsuccessful attempts the conjecture was proven by Appel and Haken in
1976 with the aid of a computer.
Interestingly, the seemingly harder problem of determining the maximal number of colors
needed for all surfaces other than the sphere was solved long before the four-color conjecture
was settled. This number is now called the Heawood number of the surface.
REFERENCES
1. Thomas L. Saaty and Paul C. Kainen. The Four-Color Problem: Assaults and Conquest. Dover,
1986.
Version: 5 Owner: bbukh Author(s): bbukh
53.9 k-partite graph
A $k$-partite graph is a graph with a chromatic number of $k$.
An alternate definition of a $k$-partite graph is a graph where the vertices are partitioned into
$k$ subsets with the following conditions:
1. No two vertices in the same subset are adjacent.
2. There is no partition of the vertices with fewer than $k$ subsets where condition 1 holds.
These two definitions are equivalent. Informally, we see that a colour can be assigned to all
the vertices in each subset, since they are not adjacent to one another. Furthermore, this is
also an optimal colouring, since the second condition holds.
An example of a 4-partite graph:
[Figure: a 4-partite graph.]
A 2-partite graph is also called a bipartite graph.
Version: 5 Owner: vampyr Author(s): vampyr
53.10 property B
A hypergraph $G$ is said to possess property B if it is 2-colorable, i.e., its vertices can be colored
in two colors, so that no edge of $G$ is monochromatic.
The property was named after Felix Bernstein by E. W. Miller.
Version: 1 Owner: bbukh Author(s): bbukh
Chapter 54
05C20 – Directed graphs (digraphs),
tournaments
54.1 cut
On a digraph, define a sink to be a vertex with out-degree zero and a source to be a vertex
with in-degree zero. Let G be a digraph with non-negative weights and with exactly one
sink and exactly one source. A cut $C$ on $G$ is a subset of the edges such that every path
from the source to the sink passes through an edge in $C$. In other words, if we remove every
edge in $C$ from the graph, there is no longer a path from the source to the sink.
Define the weight of $C$ as
$$W_C = \sum_{e \in C} W(e)$$
where $W(e)$ is the weight of the edge $e$.
Observe that we may achieve a trivial cut by removing all the edges of G. Typically, we are
more interested in minimal cuts, where the weight of the cut is minimized for a particular
graph.
Version: 2 Owner: vampyr Author(s): vampyr
54.2 de Bruijn digraph
The vertices of the de Bruijn digraph $B(m, n)$ are all possible words of length $n - 1$ chosen
from an alphabet of size $m$.
$B(m, n)$ has $m^n$ edges consisting of each possible word of length $n$ from an alphabet of size
$m$. The edge $a_1 a_2 \ldots a_n$ connects the vertex $a_1 a_2 \ldots a_{n-1}$ to the vertex $a_2 a_3 \ldots a_n$.
For example, $B(2, 4)$ could be drawn as:
[Figure: the de Bruijn digraph $B(2, 4)$, whose vertices are the eight 3-bit words and whose
sixteen edges are the 4-bit words.]
Notice that an Euler cycle on $B(m, n)$ represents a shortest sequence of characters from an
alphabet of size $m$ that includes every possible subsequence of $n$ characters. For example,
the sequence 000011110010101000 includes all 4-bit subsequences. Any de Bruijn digraph
must have an Euler cycle, since each vertex has in-degree and out-degree $m$.
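As a sketch of how such a sequence can be produced (our own illustration, not part of the
original entry), the following Python code builds the out-edges of $B(m, n)$ and walks an Euler
cycle with Hierholzer's algorithm; the cyclic sequence is read off from the edge labels:

    from itertools import product

    def de_bruijn_sequence(m, n, alphabet="0123456789"):
        """Return a cyclic de Bruijn sequence: every word of length n (n >= 2) over
        the first m symbols of `alphabet` occurs exactly once as a cyclic window."""
        symbols = alphabet[:m]
        # Out-edges of each vertex (a word of length n-1): one edge per symbol.
        out = {"".join(w): list(symbols) for w in product(symbols, repeat=n - 1)}
        start = symbols[0] * (n - 1)
        stack, cycle = [start], []
        while stack:                      # Hierholzer's algorithm, iteratively
            v = stack[-1]
            if out[v]:
                s = out[v].pop()
                stack.append(v[1:] + s)   # traverse the edge labelled by the word v + s
            else:
                cycle.append(stack.pop())
        cycle.reverse()
        # Read the last symbol of each vertex visited after the starting vertex.
        return "".join(v[-1] for v in cycle[1:])

    seq = de_bruijn_sequence(2, 4)
    print(seq, len(seq))   # 16 characters; all sixteen 4-bit words appear cyclically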
Version: 3 Owner: vampyr Author(s): vampyr
54.3 directed graph
A directed graph or digraph is a pair $(V, E)$ where $V$ is a set of vertices and $E$ is a
subset of $V \times V$ called edges or arcs.
If $E$ is symmetric (i.e., $(u, v) \in E$ if and only if $(v, u) \in E$), then the digraph is isomorphic
to an ordinary (that is, undirected) graph.
Digraphs are generally drawn in a similar manner to graphs with arrows on the edges to
indicate a sense of direction. For example, the digraph
$$(\{a, b, c, d\},\ \{(a, b), (b, d), (b, c), (c, b), (c, a), (c, d)\})$$
may be drawn as
[Figure: the digraph drawn with arrows indicating the direction of each edge.]
Version: 2 Owner: vampyr Author(s): vampyr
54.4 flow
On a digraph, define a sink to be a vertex with out-degree zero and a source to be a vertex
with in-degree zero. Let G be a digraph with non-negative weights and with exactly one
sink and exactly one source. A flow on $G$ is an assignment $f : E(G) \to \mathbb{R}$ of values to each
edge of $G$ satisfying certain rules:
1. For any edge $e$, we must have $0 \le f(e) \le W(e)$ (where $W(e)$ is the weight of $e$).
2. For any vertex $v$, excluding the source and the sink, let $E_{\mathrm{in}}$ be the set of edges incident
to $v$ and let $E_{\mathrm{out}}$ be the set of edges incident from $v$. Then we must have
$$\sum_{e \in E_{\mathrm{in}}} f(e) = \sum_{e \in E_{\mathrm{out}}} f(e).$$
Let $E_{\mathrm{source}}$ be the edges incident from the source, and let $E_{\mathrm{sink}}$ be the set of edges incident
to the sink. If $f$ is a flow, then
$$\sum_{e \in E_{\mathrm{sink}}} f(e) = \sum_{e \in E_{\mathrm{source}}} f(e).$$
We will refer to this quantity as the amount of flow.
Note that a flow given by $f(e) = 0$ trivially satisfies these conditions. We are typically more
interested in maximum flows, where the amount of flow is maximized for a particular
graph.
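To make the two rules concrete, here is a small Python sketch (our own illustration; the
example network and its weights are hypothetical) that checks whether an assignment is a
valid flow and computes its amount:

    def is_valid_flow(edges, weight, flow, source, sink):
        """edges: list of (u, v) pairs; weight, flow: dicts keyed by those pairs."""
        # Rule 1: 0 <= f(e) <= W(e) on every edge.
        if any(not (0 <= flow[e] <= weight[e]) for e in edges):
            return False
        # Rule 2: conservation at every vertex other than the source and the sink.
        vertices = {u for u, v in edges} | {v for u, v in edges}
        for x in vertices - {source, sink}:
            inflow = sum(flow[(u, v)] for u, v in edges if v == x)
            outflow = sum(flow[(u, v)] for u, v in edges if u == x)
            if inflow != outflow:
                return False
        return True

    # A hypothetical network with source 's' and sink 't'.
    edges = [('s', 'a'), ('s', 'b'), ('a', 't'), ('b', 'a'), ('b', 't')]
    weight = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 4, ('b', 'a'): 1, ('b', 't'): 2}
    flow = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 4, ('b', 'a'): 1, ('b', 't'): 1}
    amount = sum(flow[(u, v)] for u, v in edges if u == 's')
    print(is_valid_flow(edges, weight, flow, 's', 't'), amount)   # True 5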
We may interpret a flow as a means of transmitting something through a network. Suppose
we think of the edges in a graph as pipes, with the weights corresponding with the capacities
of the pipes; we are pouring water into the system through the source and draining it through
the sink. Then the first rule requires that we do not pump more water through a pipe than
is possible, and the second rule requires that any water entering a junction of pipes must
leave. Under this interpretation, the maximum amount of flow corresponds to the maximum
amount of water we could pump through this network.
Instead of water in pipes, one may think of electric charge in a network of conductors. Rule
(2) above is one of Kirchoff’s two laws for such networks; the other says that the sum of the
voltage drops around any circuit is zero.
Version: 3 Owner: nobody Author(s): Larry Hammick, vampyr
54.5 maximum flow/minimum cut theorem
Let G be a finite digraph with nonnegative weights and with exactly one sink and exactly
one source. Then
I) For any flow $f$ on $G$ and any cut $C$ of $G$, the amount of flow for $f$ is less than or equal to
the weight of $C$.
II) There exists a flow $f_0$ on $G$ and a cut $C_0$ of $G$ such that the flow of $f_0$ equals the weight
of $C_0$.
Proof: (I) is easy, so we prove only (II). Write $\mathbb{R}$ for the set of nonnegative real numbers.
Let $V$ be the set of vertices of $G$. Define a matrix
$$\kappa : V \times V \to \mathbb{R}$$
where $\kappa(x, y)$ is the sum of the weights (or capacities) of all the directed edges from $x$ to $y$.
By hypothesis there is a unique $v \in V$ (the source) such that
$$\kappa(x, v) = 0 \quad \forall x \in V$$
and a unique $w \in V$ (the sink) such that
$$\kappa(w, x) = 0 \quad \forall x \in V.$$
We may also assume $\kappa(x, x) = 0$ for all $x \in V$. Any flow $f$ will correspond uniquely (see
Remark below) to a matrix
$$\varphi : V \times V \to \mathbb{R}$$
such that
$$\varphi(x, y) \le \kappa(x, y) \quad \forall x, y \in V$$
$$\sum_{z} \varphi(x, z) = \sum_{z} \varphi(z, x) \quad \forall x \ne v, w.$$
Let $\lambda$ be the matrix of any maximal flow, and let $A$ be the set of $x \in V$ such that there
exists a finite sequence $x_0 = v, x_1, \ldots, x_n = x$ such that for all $m$ from 0 to $n - 1$, we have
either
$$\lambda(x_m, x_{m+1}) < \kappa(x_m, x_{m+1}) \qquad (54.5.1)$$
or
$$\lambda(x_{m+1}, x_m) > 0. \qquad (54.5.2)$$
Write $B = V - A$.
Trivially, $v \in A$. Let us show that $w \in B$. Arguing by contradiction, suppose $w \in A$, and let
$(x_m)$ be a sequence from $v$ to $w$ with the properties we just mentioned. Take a real number
$\varepsilon > 0$ such that
$$\varepsilon + \lambda(x_m, x_{m+1}) \le \kappa(x_m, x_{m+1})$$
for all the (finitely many) $m$ for which (54.5.1) holds, and such that
$$\lambda(x_{m+1}, x_m) \ge \varepsilon$$
for all the $m$ for which (54.5.2) holds. But now we can define a matrix $\mu$ with a larger flow
than $\lambda$ (larger by $\varepsilon$) by:
$$\mu(x_m, x_{m+1}) = \varepsilon + \lambda(x_m, x_{m+1}) \quad \text{if (54.5.1) holds}$$
$$\mu(x_{m+1}, x_m) = \lambda(x_{m+1}, x_m) - \varepsilon \quad \text{if (54.5.2) holds}$$
$$\mu(a, b) = \lambda(a, b) \quad \text{for all other pairs } (a, b).$$
This contradiction shows that $w \in B$.
Now consider the set $C$ of pairs $(x, y)$ of vertices such that $x \in A$ and $y \in B$. Since $B$ is
nonempty, $C$ is a cut. But also, for any $(x, y) \in C$ we have
$$\lambda(x, y) = \kappa(x, y) \qquad (54.5.3)$$
for otherwise we would have $y \in A$. Summing (54.5.3) over $C$, we see that the amount of
the flow $\lambda$ is the capacity of $C$, QED.
Remark: We expressed the proof rather informally, because the terminology of graph theory
is not very well standardized and cannot all be found yet here at PlanetMath. Please feel
free to suggest any revision you think worthwhile.
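The augmenting-sequence argument in part (II) is essentially the idea behind the Ford-Fulkerson
method: keep increasing the flow along sequences of the kind used above until none exists. A
compact Python sketch of that method (our own illustration, using breadth-first search to find
augmenting paths, and not the entry's notation) follows:

    from collections import deque

    def max_flow(capacity, source, sink):
        """capacity: dict (u, v) -> nonnegative number.  Returns the value of a
        maximum flow, found by augmenting along shortest residual paths."""
        residual = dict(capacity)
        for (u, v) in capacity:                      # add reverse edges of capacity 0
            residual.setdefault((v, u), 0)
        vertices = {u for u, v in residual} | {v for u, v in residual}
        neighbours = {x: [v for (u, v) in residual if u == x] for x in vertices}

        def augmenting_path():
            parent = {source: None}
            queue = deque([source])
            while queue:
                u = queue.popleft()
                for v in neighbours[u]:
                    if v not in parent and residual[(u, v)] > 0:
                        parent[v] = u
                        if v == sink:
                            return parent
                        queue.append(v)
            return None

        value = 0
        while True:
            parent = augmenting_path()
            if parent is None:
                return value
            path, v = [], sink                       # walk back from the sink
            while parent[v] is not None:
                path.append((parent[v], v))
                v = parent[v]
            bottleneck = min(residual[e] for e in path)
            for (u, v) in path:                      # push the bottleneck amount
                residual[(u, v)] -= bottleneck
                residual[(v, u)] += bottleneck
            value += bottleneck

    capacity = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 't'): 2, ('a', 'b'): 2, ('b', 't'): 3}
    print(max_flow(capacity, 's', 't'))   # 5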
Version: 5 Owner: bbukh Author(s): Larry Hammick, vampyr
54.6 tournament
A tournament is a directed graph obtained by choosing a direction for each edge in an
undirected complete graph. For example, here is a tournament on 4 vertices:
[Figure: a tournament on the vertices 1, 2, 3, 4.]
Any tournament on a finite number $n$ of vertices contains a Hamiltonian path, i.e., a directed
path on all $n$ vertices. This is easily shown by induction on $n$: suppose that the statement
holds for $n$, and consider any tournament $T$ on $n + 1$ vertices. Choose a vertex $v_0$ of $T$ and
consider a directed path $v_1, v_2, \ldots, v_n$ in $T \setminus \{v_0\}$. Now let $i \in \{0, \ldots, n\}$ be maximal such
that $v_j \to v_0$ for all $j$ with $1 \le j \le i$. Then
$$v_1, \ldots, v_i, v_0, v_{i+1}, \ldots, v_n$$
is a directed path as desired.
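The induction is effectively an insertion procedure, which the following Python sketch (our own
illustration; the beats function below is a hypothetical outcome table) carries out:

    def hamiltonian_path(vertices, beats):
        """Return a list of the vertices forming a directed path, assuming that for
        every pair u != v exactly one of beats(u, v), beats(v, u) is True."""
        path = []
        for v0 in vertices:
            # Find the first position i such that v0 beats path[i]; insert v0 there.
            i = 0
            while i < len(path) and beats(path[i], v0):
                i += 1
            path.insert(i, v0)
        return path

    # A hypothetical 4-player tournament: the lower number wins, except that 4 beats 1.
    def beats(u, v):
        return (u, v) == (4, 1) or ((v, u) != (4, 1) and u < v)

    path = hamiltonian_path([1, 2, 3, 4], beats)
    print(path)                                                  # [4, 1, 2, 3]
    print(all(beats(path[i], path[i + 1]) for i in range(3)))    # True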
The name “tournament” originates from such a graph’s interpretation as the outcome of
some sports competition in which every player encounters every other player exactly once,
and in which no draws occur; let us say that an arrow points from the winner to the loser.
A player who wins all games would naturally be the tournament’s winner. However, as the
above example shows, there might not be such a player; a tournament for which there isn’t
is called a 1-paradoxical tournament. More generally, a tournament $T = (V, E)$ is called
$k$-paradoxical if for every $k$-subset $V'$ of $V$ there is a $v_0 \in V \setminus V'$ such that $v_0 \to v$ for all
$v \in V'$. By means of the probabilistic method Erdős showed that if $|V|$ is sufficiently large,
then almost every tournament on $V$ is $k$-paradoxical.
Version: 3 Owner: bbukh Author(s): bbukh, draisma
Chapter 55
05C25 – Graphs and groups
55.1 Cayley graph
Let $G = \langle X \mid R \rangle$ be a presentation of the finitely generated group $G$ with generators $X$ and
relations $R$. We define the Cayley graph $\Gamma = \Gamma(G, X)$ of $G$ with generators $X$ as
$$\Gamma = (G, E),$$
where
$$E = \{\{u, xu\} \mid u \in G,\ x \in X\}.$$
That is, the vertices of the Cayley graph are precisely the elements of $G$, and two elements
of $G$ are connected by an edge iff some generator in $X$ transfers the one to the other.
Examples
1. $G = \mathbb{Z}^d$, with generators $X = \{e_1, \ldots, e_d\}$, the standard basis vectors. Then $\Gamma(G, X)$
is the $d$-dimensional grid; confusingly, it too is often termed “$\mathbb{Z}^d$”.
2. $G = F_d$, the free group with the $d$ generators $X = \{g_1, \ldots, g_d\}$. Then $\Gamma(G, X)$ is the
$2d$-regular tree.
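A concrete way to see the definition is to generate the edge set from a group operation. The
Python sketch below (our own illustration) does this for the cyclic group $\mathbb{Z}_6$ with generating
set $\{1, 5\}$, producing the 6-cycle:

    def cayley_graph(elements, generators, multiply):
        """Vertices are the group elements; {u, x*u} is an edge for every generator x."""
        vertices = set(elements)
        edges = {frozenset({u, multiply(x, u)}) for u in vertices for x in generators}
        return vertices, edges

    n = 6
    add_mod_n = lambda x, u: (x + u) % n
    V, E = cayley_graph(range(n), {1, n - 1}, add_mod_n)
    print(sorted(tuple(sorted(e)) for e in E))
    # [(0, 1), (0, 5), (1, 2), (2, 3), (3, 4), (4, 5)]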
Version: 2 Owner: ariels Author(s): ariels
Chapter 56
05C38 – Paths and cycles
56.1 Euler path
An Euler path along a connected graph with $n$ vertices is a path connecting all $n$ vertices
and traversing every edge of the graph exactly once. Note that a vertex with an odd degree
allows one to traverse through it and return by another path at least once, while a vertex
with an even degree only allows a number of traversals through, but one cannot end an Euler
path at a vertex with even degree. Thus, a connected graph has an Euler path which is a
circuit (an Euler circuit) if all of its vertices have even degree. A connected graph has an
Euler path which is non-circuitous if it has exactly two vertices with odd degree.
This graph has an Euler path which is a circuit. All of its vertices are of even degree.
This graph has an Euler path which is not a circuit. It has exactly two vertices of odd degree.
Note that a graph must be connected to have an Euler path or circuit. A graph is connected
if every pair of vertices $u$ and $z$ has a path $u, \ldots, z$ between them.
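The degree conditions translate into a one-line test. A Python sketch (our own illustration)
that classifies a connected graph by counting its odd-degree vertices:

    def euler_classification(adj):
        """adj: dict vertex -> list of neighbours of a connected graph.
        Returns 'circuit', 'path', or 'none' according to the number of odd degrees."""
        odd = sum(1 for v in adj if len(adj[v]) % 2 == 1)
        if odd == 0:
            return 'circuit'      # an Euler path exists and is a circuit
        if odd == 2:
            return 'path'         # an Euler path exists but is not a circuit
        return 'none'

    square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
    square_with_diagonal = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2, 4], 4: [1, 3]}
    print(euler_classification(square))                 # circuit
    print(euler_classification(square_with_diagonal))   # path (odd vertices: 1 and 3)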
Version: 12 Owner: slider142 Author(s): slider142
56.2 Veblen’s theorem
The edge set of a graph can be partitioned into cycles if and only if every vertex has even
degree.
Version: 2 Owner: digitalis Author(s): digitalis
56.3 acyclic graph
Any graph that contains no cycles is an acyclic graph. A directed acyclic graph is often
called a DAG for short.
For example, the following graph and digraph are acyclic.
[Figure: an acyclic graph and an acyclic digraph, each on the vertices A, B, C.]
In contrast, the following graph and digraph are not acyclic, because each contains a cycle.
[Figure: a graph and a digraph each containing a cycle on the vertices A, B, C.]
Version: 5 Owner: Logan Author(s): Logan
56.4 bridges of Königsberg
The bridges of Königsberg is a famous problem inspired by an actual place and situation.
The solution of the problem, put forth by Leonhard Euler in 1736, is the first work of
graph theory and is responsible for the foundation of the discipline.
The following figure shows a portion of the Prussian city of Königsberg. A river passes
through the city, and there are two islands in the river. Seven bridges cross between the
islands and the mainland:
Figure 1: Map of the Königsberg bridges.
The mathematical problem arose when citizens of Königsberg noticed that one could not
take a stroll across all seven bridges, returning to the starting point, without crossing at
least one bridge twice.
Answering the question of why this is the case required a mathematical theory that didn’t
exist yet: graph theory. This was provided by Euler, in a paper which is still available today.
To solve the problem, we must translate it into a graph-theoretic representation. We model
the land masses, A, B, C and D, as vertices in a graph. The bridges between the land
masses become edges. This generates from the above picture the following graph:
Figure 2: Graph-theoretic representation of the Königsberg bridges.
At this point, we can apply what we know about Euler paths and Euler circuits. Since an
Euler circuit for a graph exists only if every vertex has an even degree, the Königsberg graph
must have no Euler circuit. Hence, we have explained why one cannot take a walk around
Königsberg and return to the starting point without crossing at least one bridge more than
once.
Version: 5 Owner: akrowne Author(s): akrowne
56.5 cycle
A cycle in a graph, digraph, or multigraph, is a simple path from a vertex to itself (i.e., a
path where the first vertex is the same as the last vertex and no edge is repeated).
For example, consider this graph:
[Figure: a graph on the four vertices A, B, C, D.]
ABCDA and BCDAB are two of the cycles in this graph. ABA is not a cycle, however, since
it uses the edge connecting A and B twice. ABCD is not a cycle because it begins on A but
ends on D.
A cycle of length $n$ is sometimes denoted $C_n$ and may be referred to as a polygon of $n$ sides:
that is, $C_3$ is a triangle, $C_4$ is a quadrilateral, $C_5$ is a pentagon, etc.
An even cycle is one of even length; similarly, an odd cycle is one of odd length.
Version: 4 Owner: vampyr Author(s): vampyr
56.6 girth
The girth of a graph $G$ is the length of the shortest cycle in $G$. (There is no widespread
agreement on the girth of a forest, which has no cycles. It is also extremely unimportant.)
For instance, the girth of any grid $\mathbb{Z}^d$ (where $d \ge 2$) is 4, and the girth of the vertex graph
of the dodecahedron is 5.
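The girth can be computed by running a breadth-first search from every vertex and recording
the shortest cycle closed by a non-tree edge; a Python sketch (our own illustration), checked
here on the Petersen graph:

    from collections import deque

    def girth(adj):
        """Girth of a simple undirected graph given as adjacency lists.
        Returns float('inf') for a forest, which has no cycles."""
        best = float('inf')
        for s in adj:
            dist, parent = {s: 0}, {s: None}
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        parent[v] = u
                        queue.append(v)
                    elif v != parent[u]:
                        # A non-tree edge closes a cycle of length at most dist[u]+dist[v]+1.
                        best = min(best, dist[u] + dist[v] + 1)
            # Running the search from every vertex makes the bound exact.
        return best

    petersen_outer = {i: [(i + 1) % 5, (i - 1) % 5, i + 5] for i in range(5)}
    petersen_inner = {i + 5: [(i + 2) % 5 + 5, (i - 2) % 5 + 5, i] for i in range(5)}
    print(girth({**petersen_outer, **petersen_inner}))   # 5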
Version: 1 Owner: ariels Author(s): ariels
56.7 path
A path in a graph is a finite sequence of alternating vertices and edges, beginning and ending
with a vertex, $v_1 e_1 v_2 e_2 v_3 \ldots e_{n-1} v_n$, such that every consecutive pair of vertices $v_x$
and $v_{x+1}$ are adjacent and $e_x$ is incident with $v_x$ and with $v_{x+1}$. Typically, the edges may be omitted
when writing a path (e.g., $v_1 v_2 v_3 \ldots v_n$) since only one edge of a graph may connect two
adjacent vertices. In a multigraph, however, the choice of edge may be significant.
The length of a path is the number of edges in it.
Consider the following graph:
[Figure: a graph on the four vertices A, B, C, D.]
Paths include (but are certainly not limited to) ABCD (length 3), ABCDA (length 4), and
ABABABABABCDA (length 12). ABD is not a path since B is not adjacent to D.
In a digraph, each consecutive pair of vertices must be connected by an edge with the proper
orientation; if $e = (u, v)$ is an edge, but $(v, u)$ is not, then $uev$ is a valid path but $veu$ is not.
Consider this digraph:
[Figure: a digraph on the four vertices G, H, I, J.]
GHIJ, GJ, and GHGHGH are all valid paths. GHJ is not a valid path because H and J
are not connected. GJI is not a valid path because the edge connecting I to J has the
opposite orientation.
Version: 3 Owner: vampyr Author(s): vampyr
56.8 proof of Veblen’s theorem
The proof is very easy by induction on the number of elements of the set $E$ of edges. If $E$ is
empty, then all the vertices have degree zero, which is even. Suppose $E$ is nonempty. If the
graph contains no cycle, then some vertex has degree 1, which is odd. Finally, if the graph
does contain a cycle $C$, then every vertex has the same degree mod 2 with respect to $E - C$,
as it has with respect to $E$, and we can conclude by induction.
Version: 1 Owner: mathcam Author(s): Larry Hammick
Chapter 57
05C40 – Connectivity
57.1 k-connected graph
For $k \in \mathbb{N}$, a graph $G$ is $k$-connected iff $G$ has more than $k$ vertices and the graph left by
removing any $k$ or fewer vertices is connected. The largest integer $k$ such that $G$ is $k$-connected
is called the connectivity of $G$ and is denoted by $\kappa(G)$.
Version: 1 Owner: lieven Author(s): lieven
57.2 Thomassen’s theorem on 3-connected graphs
Every 3-connected graph $G$ with more than 4 vertices has an edge $e$ such that $G/e$ is also
3-connected.
Suppose such an edge doesn't exist. Then, for every edge $e = xy$, the graph $G/e$ isn't
3-connected and can be made disconnected by removing 2 vertices. Since $\kappa(G) \ge 3$, our
contracted vertex $v_{xy}$ has to be one of these two. So for every edge $e$, $G$ has a vertex $z \ne x, y$
such that $\{v_{xy}, z\}$ separates $G/e$. Any 2 vertices separated by $\{v_{xy}, z\}$ in $G/e$ are separated
in $G$ by $S := \{x, y, z\}$. Since the minimal size of a separating set is 3, every vertex in $S$ has
an adjacent vertex in every component of $G - S$.
Now we choose the edge $e$, the vertex $z$ and the component $C$ such that $|C|$ is minimal. We
also choose a vertex $v$ adjacent to $z$ in $C$.
By construction $G/zv$ is not 3-connected since removing $xy$ disconnects $C - v$ from $G/zv$.
So there is a vertex $w$ such that $\{z, v, w\}$ separates $G$, and as above every vertex in $\{z, v, w\}$
has an adjacent vertex in every component of $G - \{z, v, w\}$. We now consider a component
$D$ of $G - \{z, v, w\}$ that doesn't contain $x$ or $y$. Such a component exists since $x$ and $y$ belong
to the same component and $G - \{z, v, w\}$ isn't connected. Any vertex adjacent to $v$ in $D$
is also an element of $C$ since $v$ is an element of $C$. This means $D$ is a proper subset of $C$,
which contradicts our assumption that $|C|$ was minimal.
Version: 2 Owner: lieven Author(s): lieven
57.3 Tutte’s wheel theorem
Every 3-connected simple graph can be constructed starting from a wheel graph by repeat-
edly either adding an edge between two non-adjacent vertices or splitting a vertex.
Version: 1 Owner: lieven Author(s): lieven
57.4 connected graph
A connected graph is a graph such that there exists a path between all pairs of vertices.
If the graph is a directed graph, and there exists a path from each vertex to every other
vertex, then it is a strongly connected graph.
A connected component is a subset of vertices of any graph and any edges between them
that forms a connected graph. Similarly, a strongly connected component is a subset
of vertices of any digraph and any edges between them that forms a strongly connected
graph. Any graph or digraph is a union of connected or strongly connected components,
plus some edges to join the components together. Thus any graph can be decomposed
into its connected or strongly connected components. For instance, Tarjan’s algorithm can
decompose any digraph into its strongly connected components.
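For undirected graphs the decomposition is a simple search; a Python sketch (our own
illustration) follows, while strongly connected components of a digraph require Tarjan's or
Kosaraju's algorithm instead:

    def connected_components(adj):
        """adj: dict vertex -> list of neighbours.  Returns a list of vertex sets."""
        seen, components = set(), []
        for start in adj:
            if start in seen:
                continue
            component, stack = set(), [start]
            while stack:                      # depth-first search from `start`
                u = stack.pop()
                if u in component:
                    continue
                component.add(u)
                seen.add(u)
                stack.extend(v for v in adj[u] if v not in component)
            components.append(component)
        return components

    adj = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B'],
           'D': ['E'], 'E': ['D', 'F'], 'F': ['E']}
    print(connected_components(adj))   # [{'A', 'B', 'C'}, {'D', 'E', 'F'}]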
For example, the following graph and digraph are connected and strongly connected, respec-
tively.
[Figure: a connected graph and a strongly connected digraph, each on the six vertices A–F.]
On the other hand, the following graph is not connected, and consists of the union of two
connected components.
[Figure: a disconnected graph on the vertices A–F, consisting of two connected components.]
The following digraph is not strongly connected, because there is no way to reach F from
other vertices, and there is no vertex reachable from C.
[Figure: a digraph on the vertices A–F that is not strongly connected.]
The three strongly connected components of this graph are
[Figure: the three strongly connected components: one containing A, B, D and E, and the
single vertices C and F.]
Version: 3 Owner: Logan Author(s): Logan
57.5 cutvertex
A cutvertex of a graph G is a vertex whose deletion increases the number of components
of G. The edge analogue of a cutvertex is a bridge.
Version: 2 Owner: digitalis Author(s): digitalis
Chapter 58
05C45 – Eulerian and Hamiltonian
graphs
58.1 Bondy and Chvátal theorem
Bondy and Chvátal's theorem.
Let $G$ be a graph of order $n \ge 3$ and suppose that $u$ and $v$ are distinct non-adjacent vertices
such that $\deg(u) + \deg(v) \ge n$.
Then $G$ is Hamiltonian if and only if $G + uv$ is Hamiltonian.
Version: 1 Owner: drini Author(s): drini
58.2 Dirac theorem
Theorem: Every graph with $n \ge 3$ vertices and minimum degree at least $\frac{n}{2}$ has a
Hamiltonian cycle.
Proof: Let $G = (V, E)$ be a graph with $|G| = n \ge 3$ and $\delta(G) \ge \frac{n}{2}$. Then $G$ is connected:
otherwise, the degree of any vertex in the smallest component $C$ of $G$ would be less than
$|C| \le \frac{n}{2}$. Let $P = x_0 \ldots x_k$ be a longest path in $G$. By the maximality of $P$, all the neighbours
of $x_0$ and all the neighbours of $x_k$ lie on $P$. Hence at least $\frac{n}{2}$ of the vertices $x_0, \ldots, x_{k-1}$ are
adjacent to $x_k$, and at least $\frac{n}{2}$ of these same $k < n$ vertices $x_i$ are such that $x_0 x_{i+1} \in E$. By the
pigeonhole principle, there is a vertex $x_i$ that has both properties, so we have $x_0 x_{i+1} \in E$
and $x_i x_k \in E$ for some $i < k$. We claim that the cycle $C := x_0 x_{i+1} P x_k x_i P x_0$ is a Hamiltonian
cycle of $G$. Indeed, since $G$ is connected, $C$ would otherwise have a neighbour in $G - C$, which
could be combined with a spanning path of $C$ into a path longer than $P$. □
Version: 5 Owner: vladm Author(s): vladm
58.3 Euler circuit
An Euler circuit is a connected graph such that starting at a vertex $a$, one can traverse along
every edge of the graph once to each of the other vertices and return to vertex $a$. In other
words, an Euler circuit is an Euler path that is a circuit. Thus, using the properties of odd
and even degree vertices given in the definition of an Euler path, an Euler circuit exists iff
every vertex of the graph has an even degree.
This graph is an Euler circuit as all vertices have degree 2.
This graph is not an Euler circuit.
Version: 6 Owner: slider142 Author(s): slider142
58.4 Fleury’s algorithm
Fleury’s algorithm constructs an Euler circuit in a graph (if it’s possible).
1. Pick any vertex to start
2. From that vertex pick an edge to traverse, observing the following rule: never cross a
bridge of the reduced graph unless there is no other choice
3. Darken that edge, as a reminder that you can’t traverse it again
4. Travel that edge, coming to the next vertex
5. Repeat 2-4 until all edges have been traversed, and you are back at the starting vertex
By ”reduced graph” we mean the original graph minus the darkened (already used) edges.
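A Python sketch of the procedure (our own illustration; it tests the bridge rule with a
reachability check at every step, which is simple but slower than Hierholzer's algorithm):

    from copy import deepcopy

    def fleury(adj, start):
        """Return the vertex sequence of an Euler circuit, following Fleury's rule.
        adj: dict vertex -> set of neighbours; assumes an Euler circuit exists."""
        graph = deepcopy(adj)

        def reachable(u, v):
            # Is v reachable from u in the current reduced graph?
            seen, stack = {u}, [u]
            while stack:
                x = stack.pop()
                if x == v:
                    return True
                for y in graph[x]:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
            return False

        def is_bridge(u, v):
            graph[u].discard(v); graph[v].discard(u)
            bridge = not reachable(u, v)
            graph[u].add(v); graph[v].add(u)
            return bridge

        circuit, u = [start], start
        while graph[u]:
            candidates = list(graph[u])
            # Never cross a bridge of the reduced graph unless there is no other choice.
            non_bridges = [v for v in candidates if not is_bridge(u, v)]
            v = (non_bridges or candidates)[0]
            graph[u].discard(v); graph[v].discard(u)   # darken the traversed edge
            circuit.append(v)
            u = v
        return circuit

    adj = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}   # the 4-cycle
    print(fleury(adj, 1))   # e.g. [1, 2, 3, 4, 1]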
Version: 3 Owner: Johan Author(s): Johan
58.5 Hamiltonian cycle
let G be a graph. If there’s a cycle visiting all vertices exactly once, we say that the cycle is
a hamiltonian cycle.
Version: 2 Owner: drini Author(s): drini
58.6 Hamiltonian graph
Let G be a graph or digraph.
If G has a Hamiltonian cycle, we call G a hamiltonian graph.
There is no useful necessary and sufficient condition for a graph being Hamiltonian. However,
we can get some necessary conditions from the definition, like: a Hamiltonian graph is always
connected and has order at least 3. This and other observations lead to the condition:
Let $G = (V, E)$ be a graph of order at least 3. If $G$ is Hamiltonian, then for every proper subset
$U$ of $V$, the subgraph induced by $V - U$ has at most $|U|$ components.
For the sufficiency conditions, we get results like Ore's theorem or the Bondy and Chvátal theorem.
Version: 5 Owner: drini Author(s): drini
58.7 Hamiltonian path
Let G be a graph. A path on G that includes every vertex exactly once is called a hamil-
tonian path.
Version: 4 Owner: drini Author(s): drini
58.8 Ore’s theorem
Let $G$ be a graph of order $n \ge 3$ such that, for every pair of distinct non-adjacent vertices $u$
and $v$, $\deg(u) + \deg(v) \ge n$. Then $G$ is a Hamiltonian graph.
Version: 3 Owner: drini Author(s): drini
58.9 Petersen graph
Petersen’s graph. An example of graph that is traceable but not Hamiltonian. That is, it
has a Hamiltonian path but doesn’t have a Hamiltonian cycle.
This is also the canonical example of a hypohamiltonian graph.
Version: 5 Owner: drini Author(s): drini
58.10 hypohamiltonian
A graph $G$ is hypohamiltonian if $G$ is not Hamiltonian, but $G - v$ is Hamiltonian for each
$v \in V$ ($V$ the vertex set of $G$). The smallest hypohamiltonian graph is the Petersen graph,
which has ten vertices.
Version: 1 Owner: digitalis Author(s): digitalis
58.11 traceable
Let $G$ be a graph. If $G$ has a Hamiltonian path, we say that $G$ is traceable.
Not every traceable graph is Hamiltonian. As an example consider Petersen’s graph.
Version: 2 Owner: drini Author(s): drini
Chapter 59
05C60 – Isomorphism problems
(reconstruction conjecture, etc.)
59.1 graph isomorphism
A graph isomorphism is a bijection between the vertices of two graphs $G$ and $H$:
$$f : V(G) \to V(H)$$
with the property that any two vertices $u$ and $v$ from $G$ are adjacent if and only if $f(u)$ and
$f(v)$ are adjacent in $H$.
If an isomorphism can be constructed between two graphs, then we say those graphs are
isomorphic.
For example, consider these two graphs:
[Figure: a graph on eight vertices labelled a, b, c, d, g, h, i, j.]
[Figure: a graph on eight vertices labelled 1 through 8, drawn quite differently.]
Although these graphs look very different at first, they are in fact isomorphic; one isomor-
phism between them is
f(a) = 1, f(b) = 6, f(c) = 8, f(d) = 3,
f(g) = 5, f(h) = 2, f(i) = 4, f(j) = 7.
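Deciding isomorphism efficiently is a famously hard problem in general; for small graphs one
can simply try every bijection. A Python sketch (our own illustration, with two labellings of
the 4-cycle as a toy example):

    from itertools import permutations

    def are_isomorphic(adj_g, adj_h):
        """Brute-force isomorphism test for small simple graphs given as
        dicts vertex -> set of neighbours; tries every bijection V(G) -> V(H)."""
        vg, vh = list(adj_g), list(adj_h)
        if len(vg) != len(vh):
            return None
        edges_g = {frozenset({u, v}) for u in adj_g for v in adj_g[u]}
        edges_h = {frozenset({u, v}) for u in adj_h for v in adj_h[u]}
        for perm in permutations(vh):
            f = dict(zip(vg, perm))
            if {frozenset({f[u], f[v]}) for u, v in map(tuple, edges_g)} == edges_h:
                return f                 # an isomorphism was found
        return None

    # Two differently labelled drawings of the 4-cycle.
    g = {'a': {'b', 'd'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'a', 'c'}}
    h = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
    print(are_isomorphic(g, h))   # e.g. {'a': 1, 'b': 2, 'c': 4, 'd': 3}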
Version: 2 Owner: vampyr Author(s): vampyr
Chapter 60
05C65 – Hypergraphs
60.1 Steiner system
A Steiner system $S(t, k, n)$ is a $k$-uniform hypergraph on $n$ vertices such that every set
of $t$ vertices is contained in exactly one edge. Notice that the systems $S(2, k, n)$ are merely
$k$-uniform linear spaces. The families of hypergraphs $S(2, 3, n)$ are known as Steiner triple systems.
Version: 2 Owner: drini Author(s): drini, NeuRet
60.2 finite plane
Let $H = (V, \mathcal{E})$ be a linear space. A finite plane is an intersecting linear space. That is to
say, a linear space in which any two edges in $\mathcal{E}$ have a nonempty intersection.
Finite planes are rather restrictive hypergraphs, and the following holds.
Theorem 4. Let $H = (V, \mathcal{E})$ be a finite plane. Then for some positive integer $k$, $H$ is
$(k + 1)$-regular, $(k + 1)$-uniform, and $|\mathcal{E}| = |V| = k^2 + k + 1$.
The above $k$ is the order of the finite