Frank Stephan

Semester I, Academic Year 2009-2010

Set Theory deals with the fundamental concepts of sets and functions used everywhere in mathematics. Cantor initiated the study of set theory with his investigations on the cardinality of sets of real numbers. In particular, he proved that there are different infinite cardinalities: the quantity of natural numbers is strictly smaller than the quantity of real numbers. Cantor formalized and studied the notions of ordinal and cardinal numbers. Set theory considers a universe of sets which is ordered by the membership or element relation ∈. All other mathematical objects are coded into this universe and studied within this framework. In this way, set theory is one of the foundations of mathematics.

This text contains all information relevant for the exams. Furthermore, the exercises in this text are those which will be demonstrated in the tutorials. Each sheet of exercises contains some important ones marked with a star and some other ones. You have to hand in an exercise marked with a star in Weeks 3 to 6, Weeks 7 to 9 and Weeks 10 to 12; each of them gives one mark. Furthermore, you can hand in any further exercises, but they are only checked for correctness. There will be two midterm exams and a final exam; the midterm exams count 15 marks each and the final exam counts 67 marks.

Frank Stephan: Room S14#04-13

Departments of Mathematics and Computer Science, National University of Singapore

2 Science Drive 2, Singapore 117543, Republic of Singapore

Telephone 65162759

Email: fstephan@comp.nus.edu.sg

Homepage http://www.comp.nus.edu.sg/~fstephan/index.html

Thanks. These course notes were prepared by Frank Stephan. F. Stephan would like to thank Feng Qi and Klaus Gloede for their material used in these notes. Further thanks go to Eric Martin, Yan Kuo, Yang Yue and several of his students for discussions, suggestions and improvements.


Contents

1 Foundations
2 Basic Operations with Sets
3 Functions
4 Natural Numbers
5 Recursive Definition
6 Cardinality of Sets
7 Finite and Hereditarily Finite Sets
8 Countable Sets
9 Graphs and Orderings
10 Linear Ordering
11 Well-Orderings
12 Ordinals
13 Transfinite Induction and Recursion
14 The Rank of Sets
15 Arithmetic on Ordinals
16 Cardinals
17 The Axiom of Choice
18 The Set of Real Numbers
19 The Continuum Hypothesis
20 The Axioms of Zermelo and Fraenkel
21 References


1 Foundations

The two primitive notions of set theory are “set” and “membership”. These two

notions will not be deﬁned. All other concepts are deﬁned in terms of these two

primitive notions.

In this course, all basic properties (Axioms) of set theory are introduced and the

various sets are deﬁned one after the other. Nevertheless, for some examples and

exercises, reference is made to well-known but later introduced objects in order to

illustrate certain notions or results.

A set is intuitively a collection of objects. The objects in the collection are members (or elements) of the set. One important point is that the objects in a set can again be sets. Here is an example.

Example 1.1. {1, {0, 2}, {0, 3}} is the set having three elements, namely 1, {0, 2}

and {0, 3}. The second and third elements are again sets containing 0, 2 and 0, 3,

respectively.

Graph Representation 1.2. One can consider every set to be a vertex of a large

graph where there is a connection from x to y iﬀ x ∈ y. The graph of the above

example has the following 7 vertices and 7 directed edges:

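[Figure: the membership graph of the set {1, {0, 2}, {0, 3}}. Its vertices are 0, 1, 2, 3, {0, 2}, {0, 3} and {1, {0, 2}, {0, 3}}; its edges are 0 → {0, 2}, 2 → {0, 2}, 0 → {0, 3}, 3 → {0, 3}, 1 → {1, {0, 2}, {0, 3}}, {0, 2} → {1, {0, 2}, {0, 3}} and {0, 3} → {1, {0, 2}, {0, 3}}.]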
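For finite sets, the graph representation can be computed directly. The following is a minimal sketch in Python, assuming that sets are modelled as frozensets and that the numbers 0, 1, 2, 3 are treated as atoms without elements (in these notes they are later coded as sets). The name membership_edges is chosen here only for illustration.

```python
def membership_edges(s):
    """Return (vertices, edges) of the membership graph below the set s.

    Sets are modelled as frozensets; anything else (here the integers
    0, 1, 2, 3) is treated as an atom without elements.
    """
    vertices, edges = set(), set()
    stack = [s]
    while stack:
        y = stack.pop()
        if y in vertices:
            continue
        vertices.add(y)
        if isinstance(y, frozenset):
            for x in y:
                edges.add((x, y))  # there is an edge from x to y iff x ∈ y
                stack.append(x)
    return vertices, edges


example = frozenset({1, frozenset({0, 2}), frozenset({0, 3})})
vertices, edges = membership_edges(example)
print(len(vertices), len(edges))  # 7 7
```

The traversal visits every set reachable by going down along ∈, so it terminates exactly because each step leads to a strictly simpler object.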

The domain, that is the collection of all sets, is denoted by V and called the Von

Neumann Universe.

Writing down a set is longer than writing down any element of it. Thus one can

only ﬁnitely often go down from a set to a member since each time the description

becomes shorter. Later, when inﬁnite sets are introduced, it will no longer be possible

to write them down in the above way and therefore the nonexistence of inﬁnitely

descending chains is no longer implicitly guaranteed. Therefore, this property is kept

by making it explicitly an axiom.

The intuition is that there is no sequence x_0, x_1, . . . of vertices in V such that x_{n+1} ∈ x_n for n = 0, 1, . . ., that is, there is no infinite descending chain. But the term “sequence” is not a primitive operation. Thus the following version of the axiom is given; it is based on the observation that whenever x_{n+1} ∈ x_n then every element x_n of X = {x_0, x_1, . . .} has a common element with X, namely x_{n+1} is in x_n and in X. This is just ruled out by the Axiom of Foundation.

This is just ruled out by the Axiom of Foundation.

Axiom 1.3 (Foundation). Let X ∈ V be a set which contains at least one element.

Then there is an element y ∈ X such that every z ∈ X satisfies z ∉ y.

Example 1.4. Such a y ∈ X is called a minimal element of X (with respect to ∈).

The set A = {0, {0}, {1, 2}, {0, {1, 2}}, {{0}}} has two minimal elements, namely 0

and {1, 2}. 0 has no elements and the elements 1, 2 of {1, 2} are not in A. Thus

a minimal element does not need to be unique. If a minimal element is contained in every other element of the set, it is called a least element. The set {0} is the least

element of the set {{0}, {{0}}, {{0}, {0, {0}}}}.
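The two examples above can be checked mechanically. The following is a small sketch, again assuming frozensets for sets and integers as atoms without elements; the helper names elements, minimal_elements and least_elements are hypothetical and not part of the notes.

```python
def elements(x):
    """Elements of x; atoms (non-frozensets) are treated as having none."""
    return x if isinstance(x, frozenset) else frozenset()


def minimal_elements(X):
    """All y in X such that no z in X is an element of y."""
    return {y for y in X if not any(z in elements(y) for z in X)}


def least_elements(X):
    """Minimal elements which are contained in every other element of X."""
    return {y for y in minimal_elements(X)
            if all(y in elements(z) for z in X if z != y)}


fs = frozenset
A = fs({0, fs({0}), fs({1, 2}), fs({0, fs({1, 2})}), fs({fs({0})})})
print(minimal_elements(A))   # the two minimal elements 0 and {1, 2}

B = fs({fs({0}), fs({fs({0})}), fs({fs({0}), fs({0, fs({0})})})})
print(least_elements(B))     # the single least element {0}
```

For a finite nonempty X, the Axiom of Foundation corresponds to minimal_elements(X) being nonempty.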

Remark 1.5. The main idea of the Axiom of Foundation is to enforce that the universe V of sets is built from the bottom up and that no set is “cyclic” or “hanging down without a bottom from something”. So the following things are permitted and forbidden:

• {{{x}}}: making sets of sets is legal;

• {x, {x}, {x, {x}}}: sets with comparable elements (x ∈ {x}) are legal;

• {x, y} with x ∉ y and y ∉ x: sets with incomparable elements are legal;

• {x, y} with x ∈ y and y ∈ x: cycles are illegal;

• {x_0, x_1, . . .} with x_1 ∈ x_0, x_2 ∈ x_1, . . .: sets forming a descending sequence are illegal (see above).

Note that the Axiom of Foundation implies that a descending sequence is illegal

whenever its elements form a set; but it cannot make descending sequences explicitly

illegal without that condition (as this cannot be expressed in V ).

Exercise 1.6. The property of being well-founded is an abstract property which applies also to some but not all directed graphs which are different from the universe of all sets. Here are some examples of graphs. Which of the graphs below are well-founded? The answers should be proven by testing which of the examples below avoid the two negative criteria (cycles and descending chains) from Remark 1.5.


1. the set {0, 1, {0}, {1}, {0, 1, {0}}, {{1}}, {{{1}}}, 512} with (a, b) being an edge

iﬀ a ∈ b;

2. the set {0, 1, 2, 3} with the edges (0, 1), (1, 0), (2, 3);

3. the set N of the natural numbers with every edge being of the form (n, n + 1);

4. the set Z of the integers with the edges being the pairs (n, n + 1) for all n ∈ Z;

5. the set Q of rational numbers with the edges being the pairs (q, 2q) for all q ∈ Q;

6. the set Q of rational numbers with the edges being the pairs (q, q + 1) for all q ≥ 0 and (q, q − 1) for all q ≤ 0.

The Axiom of Foundation has an immediate application.

Theorem 1.7. The collection V of all sets is not a set.

Proof. Consider any set x. Then {x} is also a set; actually this needs the Axiom of Pair introduced below. By the Axiom of Foundation, the only element x of {x} satisfies y ∉ x for all y ∈ {x}. As y takes the value x, x ∉ x. Thus x ≠ V since V contains the set x as a member. It follows that V cannot be a set.

Property 1.8. No set contains itself as an element.

Convention 1.9. In set theory, there are two types of collections of objects, which are called “sets” and “classes”. The members of V are called sets and the subsets of V (including V itself) are called classes. A class is something which behaves like a set but is not one. So it has subclasses and members in the same way as a set has subsets and members. The main objective of V is to tell which thing is a set and which is not; due to this role, V cannot be a set itself.

Sets are the object of investigation. They are represented as members of the class

of vertices of V and they are put into relation to each other by the element-relation

∈. Intuitively, any set could be considered as a collection of other smaller sets. The

Axiom of Foundation guarantees that each set is determined uniquely by its members

and helps to avoid contradictions. All elements of sets will be coded again as sets, so

the usage of 0, 1, 2, 3 in the example above will later be replaced by the usage of sets

which code the numbers 0, 1, 2, 3.

Classes are an auxiliary structure which enables one to make statements about the collection of all sets or the collection of sets with a certain property. The elements of a class are always sets; no class contains another one as a member.

All the variables will stand for sets, unless otherwise stated. The notation “∈” denotes the membership of sets. For example, when writing “x ∈ y”, this means that “set x is a member of set y”. Similarly “x ∉ y” means that “set x is not a member of y”. The words “member” and “element” are synonyms; the symbol “∈” refers to the first letter of “element”.

It is supposed that {0, 1, 2, 3, 4, 5, 6, 7} and {0, 1, . . . , 7} denote the same set since

both descriptions give the collections of the same elements. Further descriptions for

this set can even be more indirect, for example as the set of all numbers which can

be represented by up to three binary digits or the set of all integers on which the

polynomial 4x^2 − 28x − 1 takes negative values. These descriptions might come from different intensions, but they give the same extension, that is, the same list of elements. Therefore, the axiom of extensionality states that two sets are equal iff they have the same elements; whether or not they are generated from different descriptions does not matter at all.

Axiom 1.10 (Extensionality). A set A and a set B are the same set, that is, the

two sets A and B are equal, denoted by A = B, if both sets have the same elements,

that is, for every x ∈ V , x ∈ A if and only if x ∈ B.
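The descriptions of {0, 1, . . . , 7} mentioned before the axiom can be compared directly, since Python sets are also compared by their elements only. This is a minimal sketch; the finite search windows range(100) and range(-100, 101) are assumptions made here in order to keep the computation finite.

```python
listed = {0, 1, 2, 3, 4, 5, 6, 7}
by_range = set(range(8))
# numbers which can be represented by up to three binary digits
by_bits = {n for n in range(100) if n.bit_length() <= 3}
# integers on which 4x^2 - 28x - 1 takes negative values (finite window)
by_polynomial = {x for x in range(-100, 101) if 4 * x * x - 28 * x - 1 < 0}

print(listed == by_range == by_bits == by_polynomial)  # True
```

The four variables are built in different ways, but the comparison only looks at which elements they have.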

The importance of this identification of sets is that one only pays attention to the extension of sets; the different intensions are ignored when describing sets. This is very useful when the uniqueness of sets satisfying a certain property is determined.

Example 1.11. Equal sets have exactly the same members. Britain is a country

consisting of four members: Britain = {England, Northern Ireland, Scotland, Wales}.

These four members go directly into the European association in the case of soccer:

UEFA = {Austria, Belgium, England, Northern Ireland, Scotland, Wales, Denmark,

Estonia, . . .}. But politically, Britain is the member of the EU and they are only

members of this member, that is, belong indirectly to the union: EU = {Austria,

Belgium, {England, Northern Ireland, Scotland, Wales}, Denmark, Estonia, . . .}.

Direct and indirect membership is not the same, thus UEFA ≠ EU. It can be noted that there are also other reasons for EU ≠ UEFA like Switzerland ∈ UEFA − EU, but this is not the point of this example.

Example 1.12. There is at most one set which does not have any element.

Exercise 1.13. Which of the following sets of natural numbers are equal? Well-

known mathematical theorems can be applied without proving them.

1. A = {1, 2};

2. B = {1, 2, 3};


3. C is the set of all prime numbers;

4. D = {d | ∃a, b, c > 0 (a^d + b^d = c^d)};

5. E = {e | e > 0 ∧ ∀c ∈ C (e ≤ c)};

6. F = {f | ∀c ∈ C (f ≥ c)};

7. G = {g | g ≥ 2 ∧ ∀a, b > 1 (4g ≠ (a + b)^2 − (a − b)^2)};

8. H = {h | h > 0 ∧ h^2 = h^h};

9. I = {i | i + i = i · i};

10. J = {j | (j + 1)^2 = j^2 + 2j + 1};

11. K = {k | 4k > k^2};

12. L = {l | ∃c ∈ C (l < c)};

13. M = {m | ∃c ∈ C (m = c^2)};

14. N = {n | ∃c, d ∈ C (n = cd)};

15. O = {o | o has exactly three prime factors};

16. P = {p | p, p + 2 ∈ C}.

The question whether P is inﬁnite is a famous open problem. Therefore it is still

unknown whether

N = {n | ∃p ∈ P (n ≤ p)}.

So it is sometimes very diﬃcult to decide whether two descriptions give the same set

or not.

2 Basic Operations with Sets

In the previous section, many examples dealt with sets known from the everyday life of a mathematician, but neither a proof nor an axiom was given which guarantees that these sets really exist. Since “existence of a set” means that it is a member of V, one either has to postulate an axiom stating that the set occurs in V or one has to prove from other axioms that it is in V.

Axiom 2.1 (Empty Set). There exists a set in V which has no members.


Notice that by the equality of sets, as in the ﬁrst example, a set with no elements is

unique. This set not having any element is denoted by ∅.

Subsets. One set could be a part of another in the sense that every element of the

set is an element of the other. This is made more precise by introducing the concept

of subsets.

Deﬁnition 2.2. A set A is a subset of B, denoted by A ⊆ B, iﬀ for every set x,

x ∈ A implies that x ∈ B. And A is a proper subset of B, denoted by A ⊂ B, iﬀ A is

a subset of B and there exists one set y which is an element of B but not an element

of A.

Thus, A = B if and only if A ⊆ B and B ⊆ A. Notice that this gives a standard way to show that two given sets are equal. Namely, one checks that each of the two sets is a subset of the other. Notice also that x ⊆ x for any set x.

Property 2.3. ∅ is a subset of every set.

Power Sets. Given a set X, one could collect all the subsets of X to form a new set.

This procedure is called the power set operation.

Axiom 2.4 (Power Set). For every set x, there exists a (unique) set, called the

power set of x, whose elements are exactly the subsets of x. This set is denoted by P(x).

By Property 2.3, ∅ ∈ P(x) for every set x ∈ V . Also x ∈ P(x). If x = {a, b, c, d}

then P(x) = {∅, {a}, {b}, {c}, {d}, {a, b}, {a, c}, {a, d}, {b, c}, {b, d}, {c, d}, {a, b, c},

{a, b, d}, {a, c, d}, {b, c, d}, {a, b, c, d}}.
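For finite sets, the power set operation can be carried out by listing the subsets of each size. The following sketch assumes frozensets as the representation of sets; the function name power_set is chosen here for illustration.

```python
from itertools import chain, combinations


def power_set(x):
    """Return the set of all subsets of the finite set x, as frozensets."""
    items = list(x)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))
    return {frozenset(s) for s in subsets}


x = frozenset("abcd")
P = power_set(x)
print(len(P))                    # 16
print(frozenset() in P, x in P)  # True True
```

The last line illustrates the two observations made above: ∅ ∈ P(x) and x ∈ P(x).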

Exercise 2.5. Determine the power set of {∅, {∅}, {{∅}}}. Is there any set X such

that P(X) has exactly 9 elements?

In Exercise 1.13, many sets are deﬁned by taking all those numbers which satisfy

a certain property. This rule of forming subsets is made an axiom and is formally

deﬁned. It uses properties which are formally introduced in Deﬁnition 3.7 below.

Axiom 2.6 (Comprehension, deﬁning subsets). Given a property p(y) of sets,

for any set A, there exists a (unique) set B such that x ∈ B if and only if x ∈ A and

p(x) holds.

Convention 2.7. The notation {x ∈ A | p(x)} stands for the set of all x ∈ A which

satisfy p(x).

8

Example 2.8. Let C be the set of all countries. Now deﬁne the set

L = {c ∈ C | c has at least 25 subunits}

of all large countries. Since 01.05.2004, the European Union has the necessary number of members. The United States has 50 states, Switzerland has 26 cantons and India

has 25 states and 7 territories. But Australia has only 6 states and 2 main territories.

Thus,

European Union, India, Switzerland, United States ∈ L,

The Commonwealth of Australia ∉ L.

Similarly, the set

E = {e ∈ European Union | e has the Euro}

consists of all members of the European Union, which use the Euro as a currency. So

France and Spain are in E, Britain is not in E. Montenegro uses the Euro but is

not in the European Union, therefore, Montenegro is not a member of E.

Exercise 2.9. Given the set N of natural numbers, establish properties to deﬁne the

following subsets of N using the Axiom of Comprehension:

1. the set of all numbers with exactly three divisors,

2. the set {0, 2, 4, 6, . . .} of all even numbers,

3. the set of all square numbers,

4. the set of all numbers whose binary representation contains exactly four times

a 1.

For example, the set of prime numbers can be deﬁned as the set {x ∈ N | ∃ unique

y, z ∈ N(y · z = x ∧ z < y ≤ x)}, that is the set of all natural numbers with exactly

two divisors.
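The notation {x ∈ A | p(x)} corresponds closely to a set comprehension in a programming language. The following is a minimal sketch which uses the characterisation “exactly two divisors” as the property p; the finite set N_30 is only a stand-in for N, and the names comprehension and number_of_divisors are assumptions of this sketch.

```python
def comprehension(A, p):
    """Return {x ∈ A | p(x)} for a finite set A and a property p."""
    return {x for x in A if p(x)}


def number_of_divisors(x):
    """Count the d in {1, ..., x} which divide x."""
    return sum(1 for d in range(1, x + 1) if x % d == 0)


N_30 = set(range(30))  # finite stand-in for N
primes = comprehension(N_30, lambda x: number_of_divisors(x) == 2)
print(sorted(primes))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Note that the result is always a subset of the given set A; the property alone does not produce a set.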

Exercise 2.10. Show that every property p satisﬁes the following statements.

1. There are sets x, y such that x ∈ y and either p(x) ∧ p(y) or ¬p(x) ∧ ¬p(y).

2. There is a set x with x = {y ∈ x | p(y)}.

3. There is a one-to-one function f such that p(x) iﬀ p(y) for all y ∈ f(x).


Before introducing the distinction between sets and classes, mathematicians believed that they could define things like “the set of all sets”. Using comprehension, Russell split the set of all sets into the following two subsets: X is the set of all sets which are elements of themselves, Y is the set of all sets which are not elements of themselves.

Paradox 2.11 (Russell’s Antinomy of Naive Set Theory). If the above defined

X, Y are sets, then there is a contradiction.

Proof. If Y ∈ Y, then Y ∈ X since X contains all sets which are elements of themselves. If Y ∉ Y then Y ∉ X, again by the definition of X. Thus either Y is a member of both sets X and Y or Y is not a member of any of these two sets. This contradicts the fact that X, Y obviously form a partition of the set of all sets.

Comment 2.12. Russell’s paradox forced the mathematicians to become a bit more cautious when dealing with sets. Mainly, the following two consequences were drawn:

1. The distinction between classes and sets is introduced.

2. It is postulated that V is well-founded.

So X and Y turn out to be classes and not sets. Furthermore, the postulate that V

is well-founded makes it impossible for a set to be a member of itself. Therefore, Y is

equal to the class V of all sets and X is the empty class.

Although a set cannot contain itself, there is for every set x also the set {x} which is just the set containing the single element x as its member. Note that x ≠ {x} for the reasons given above; the existence of {x} is provided by the next axiom.

In general, the intended property is that one can construct ﬁnite sets from a given

ﬁnite list of elements. It is suﬃcient to postulate this for two given elements, note

that taking the element x twice gives then the existence of {x} by the Axiom of

Extensionality.

Axiom 2.13 (Pair). Given any x, y ∈ V then there exists also a set in V which

contains exactly the elements x, y. This set is written as {x, y}; in the case that x = y

one can also just write {x}.

Union, Intersection and Diﬀerence. In the following, the basic operations with

sets are deﬁned.

Deﬁnition 2.14. Let A and B be sets.

1. The union of A and B is the set, denoted by A∪B, whose elements are exactly

those sets belonging to A or belonging to B.


2. The intersection of A and B is the set, denoted by A ∩ B, whose elements are

exactly those sets belonging to both A and B.

3. The (relative) diﬀerence of A with B is the set, denoted by A − B, whose

elements are exactly those elements of A which do not belong to B.

For example, {0, 1, 2} ∪ {4, 5} = {0, 1, 2, 4, 5}, {0, 1, 2} ∩ {1, 2, 3, 4, 5, 6, 7, 8} = {1, 2}

and {0, 1, 2} − {1, 2, 3} = {0}.

Remark 2.15. These operations can be speciﬁed as follows:

A ∪ B = { x | x ∈ A or x ∈ B };
A ∩ B = { x | x ∈ A and x ∈ B };
A − B = { x | x ∈ A and x ∉ B };
A△B = (A − B) ∪ (B − A);
⋃A = { x | ∃ y ∈ A (x ∈ y) };
⋂A = { x | ∀ y ∈ A (x ∈ y) }.

Note that ⋂A is only defined for nonempty sets A. If B ⊆ A, then A − B is called the complement of B in A. The set A△B is called the symmetric difference of A and B.

As an illustration, let A = {a, b, c}, B = {b, c, d}, C = {c, d, e} and D = {{a, b},

{b, c, d}, {b, d, e}}. Then

A ∪ B = {a, b, c, d};

A ∪ C = {a, b, c, d, e};

B ∪ C = {b, c, d, e};

A ∩ B = {b, c};

A ∩ C = {c};

B ∩ C = {c, d};

A − B = {a};
A△B = {a, d};
A△B△C = {a, c, e};
⋃{A, B, C} = {a, b, c, d, e};
⋂{A, B, C} = {c};
⋃D = {a, b, c, d, e};
⋂D = {b}.
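The values of this illustration can be recomputed with a few lines of Python. This is only a sketch: the binary operations are Python's built-in operators on frozensets, while big_union and big_intersection are hypothetical helper names for the union and the intersection of all members of a set.

```python
from functools import reduce


def big_union(A):
    """Union of A: everything that is in some member of A."""
    return frozenset(x for y in A for x in y)


def big_intersection(A):
    """Intersection of A: everything in every member of A (A nonempty)."""
    if not A:
        raise ValueError("the intersection of the empty set is not defined")
    return reduce(lambda u, v: u & v, A)


A, B, C = frozenset("abc"), frozenset("bcd"), frozenset("cde")
D = frozenset({frozenset("ab"), frozenset("bcd"), frozenset("bde")})

print(sorted(A | B), sorted(A & B), sorted(A - B), sorted(A ^ B))
print(sorted(A ^ B ^ C))
print(sorted(big_union({A, B, C})), sorted(big_intersection({A, B, C})))
print(sorted(big_union(D)), sorted(big_intersection(D)))
```

Here | is the union, & the intersection, - the difference and ^ the symmetric difference.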


The basis for these operations is the following axiom.

Axiom 2.16 (Union). For every A ∈ V the union ⋃A of its elements is also a set and in V, where ⋃A = { x | ∃ y ∈ A (x ∈ y) }.

Proposition 2.17. Let A, B ∈ V . The sets constructed in Remark 2.15 exist, that

is, are in V .

Proof. This follows for ⋃A directly from the Axiom of Union, for the rest this is now shown. If A is not empty, the intersection is a member of V by the formula

⋂A = {x ∈ ⋃A | ∀y ∈ A (x ∈ y)}

and using the Axiom of Comprehension. By the Axiom of Pair, {A, B} ∈ V. Thus A ∪ B = ⋃{A, B} and A ∩ B = ⋂{A, B} are in V. By the Axiom of Comprehension, the sets

A − B = {x ∈ A ∪ B | x ∉ B} and
A△B = {x ∈ A ∪ B | x ∉ A ∩ B}

exist and are in V as well.

Exercise 2.18. Prove that the symmetric diﬀerence is associative, that is, for all

sets A, B, C, (A△B)△C = A△(B△C). For this reason, one can just write A△B△C.

Furthermore, prove that A − B = A ∩ (A△B).

Exercise 2.19. Consider the sets Apple, Pear, Strawberry, Cranberry, Blackberry,

Banana, Blueberry which consist of all fruits in the world usually designated by that

name. Let Fruits be the union of these sets and Red, Blue, Black and Yellow be

those elements of Fruits which have the corresponding colour. Which of the following

expressions is the empty set?

1. Apple − Red,

2. (Black△Blueberry) ∩ Blue,

3. Fruits − Red − Blue − Black − Yellow,

4. Red − Strawberry − Cranberry − Apple − Pear,

5. (Blueberry − Blue) ∪ (Yellow − Apple − Pear − Banana),

6. Banana − Yellow,

7. Banana△Blueberry△Strawberry△Red,

8. (Strawberry ∪ Blueberry ∪ Cranberry) − Red,

9. (Apple ∪ Pear) ∩ (Strawberry ∪ Blueberry),

10. Fruits − ⋃{Apple, Pear, Strawberry, Cranberry, Blackberry, Banana}.

Give a set of three fruits which intersects those of the above sets which are not empty.

Property 2.20. A ⊆ B iﬀ A ∪ B = B iﬀ A ∩ B = A.

The next property states that the algebra of sets satisﬁes the following rules which it

shares with any Boolean algebra.

Property 2.21. All A, B, C ∈ V satisfy the following laws.

Commutativity: A ∪ B = B ∪ A,

A ∩ B = B ∩ A;

Associativity: (A ∪ B) ∪ C = A ∪ (B ∪ C),

(A ∩ B) ∩ C = A ∩ (B ∩ C);

Distributivity: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);

De Morgan Laws: A − (B ∩ C) = (A − B) ∪ (A − C),

A − (B ∪ C) = (A − B) ∩ (A − C).
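Such laws can be tested, though of course not proven, by checking them on all subsets of a small set. The sketch below does this for the eight subsets of {0, 1, 2}; the names subsets and laws_hold are chosen here for illustration and the choice of the universe {0, 1, 2} is arbitrary.

```python
from itertools import chain, combinations, product


def subsets(universe):
    """List all subsets of a finite set as frozensets."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def laws_hold(A, B, C):
    """Check the laws of Property 2.21 for one triple of sets."""
    return all([
        A | B == B | A, A & B == B & A,                    # commutativity
        (A | B) | C == A | (B | C),                        # associativity
        (A & B) & C == A & (B & C),
        A | (B & C) == (A | B) & (A | C),                  # distributivity
        A & (B | C) == (A & B) | (A & C),
        A - (B & C) == (A - B) | (A - C),                  # De Morgan laws
        A - (B | C) == (A - B) & (A - C),
    ])


S = subsets({0, 1, 2})
print(all(laws_hold(A, B, C) for A, B, C in product(S, repeat=3)))  # True
```

A passing check on a finite universe is only evidence; the laws themselves follow from the definitions of ∪, ∩ and −.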

Exercise 2.22. Many Boolean Algebras have a complementation operation, but here

only the set diﬀerence is used in the De Morgan Laws. Why?

3 Functions

Graphs and functions are based on the notions of ordered pairs. For example, a graph

consists of a basis set W of vertices and a set E of edges which is a set of ordered pairs

of elements of W. These ordered pairs are constructed using the ordinary unordered

pairs as follows.

Deﬁnition 3.1. (x, y) = {x, {x, y}}.

If x ≠ x′ or y ≠ y′ then (x, y) ≠ (x′, y′). This makes this definition suitable to introduce a representation for the Cartesian product of two sets.


Deﬁnition 3.2. For any sets A and B, the Cartesian product of A and B is the set

A × B = { (a, b) | a ∈ A and b ∈ B }.

Example 3.3. The Cartesian product of {0, 1, 2} and {3, 4} is {{0, {0, 3}}, {0, {0, 4}},

{1, {1, 3}}, {1, {1, 4}}, {2, {2, 3}}, {2, {2, 4}}}. The product of {3, 4} and {0, 1, 2} is

{{3, {0, 3}}, {3, {1, 3}}, {3, {2, 3}}, {4, {0, 4}}, {4, {1, 4}}, {4, {2, 4}}}. Thus the Cartesian product is not commutative.
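The coding of ordered pairs and the example above can be reproduced as follows. This is a minimal sketch assuming frozensets for sets and integers as atoms; pair and cartesian are hypothetical helper names which follow Definitions 3.1 and 3.2.

```python
def pair(x, y):
    """Ordered pair (x, y) coded as the set {x, {x, y}} (Definition 3.1)."""
    return frozenset({x, frozenset({x, y})})


def cartesian(A, B):
    """A × B as a set of coded ordered pairs (Definition 3.2)."""
    return frozenset(pair(a, b) for a in A for b in B)


A, B = {0, 1, 2}, {3, 4}
print(len(cartesian(A, B)), len(cartesian(B, A)))  # 6 6
print(cartesian(A, B) == cartesian(B, A))          # False
print(pair(0, 3) == pair(3, 0))                    # False
```

Both products have six elements, but they are different sets because (0, 3) and (3, 0) are coded differently.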

Remark 3.4. Given A, B ∈ V, the existence of a Cartesian Product A × B is proven as follows. The sets A ∪ B, P(A ∪ B) and C = P(A ∪ P(A ∪ B)) are also in V. Now the following definition of the Cartesian Product is equivalent to the above one:

A × B = {c ∈ C | ∃a ∈ A ∃b ∈ B (c = {a, {a, b}})}.

Thus A × B ∈ V by the Axiom of Comprehension.

A principal philosophy of set theory is that everything is coded or represented as a

set, similarly to computer scientists who represent everything as ﬁnite sequences from

{0, 1}. Thus the fundamental notions of relations and functions are deﬁned in terms

of sets. Thus, V does not only determine which sets exist, but also which functions,

graphs and relations can be considered.

Deﬁnition 3.5. A relation is a subset of a Cartesian product of ﬁnitely many sets. A

graph is given by a set G of vertices and a relation E ⊆ G×G. A function F : X → Y

is a graph, that is, a subset of X × Y , such that for every x ∈ X there is a unique

y ∈ Y with (x, y) ∈ F. This unique y will be denoted as F(x), that is, F(x) = y and

(x, y) ∈ F are equivalent notations for the same fact.

The set X is called the domain of F. It can be defined from F as {x | ∃(x′, y′) ∈ F (x = x′)}. Y is a superset of the range {y | ∃(x′, y′) ∈ F (y = y′)}. A function F is one-to-one (or injective) iff (x, y) ∈ F and (x′, y) ∈ F implies that x = x′. A function F : X → Y is onto (or surjective) iff Y is equal to the range of F. F is a bijective function iff it is both injective and surjective.

Example 3.6. Let A, B be {0, 1, 2, 3, 4} and f be given by f(x) = x + 1 for x =

0, 1, 2, 3 and f(4) = 0. Then the set representing f is {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)};

that is, f is identiﬁed with this set as in Deﬁnition 3.5.
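The notions of Definition 3.5 can be tested on finite examples when a function is stored as a set of pairs. In the sketch below, Python tuples stand in for the coded ordered pairs; this, and all function names, are assumptions made here for readability.

```python
def is_function(F, X, Y):
    """F ⊆ X × Y and every x in X has exactly one y with (x, y) in F."""
    return (all(x in X and y in Y for (x, y) in F)
            and all(sum(1 for (u, _) in F if u == x) == 1 for x in X))


def domain(F):
    return {x for (x, _) in F}


def range_of(F):
    return {y for (_, y) in F}


def is_injective(F):
    """(x, y) in F and (u, y) in F implies x = u."""
    return all(x == u for (x, y) in F for (u, v) in F if y == v)


def is_surjective(F, Y):
    return range_of(F) == set(Y)


A = B = {0, 1, 2, 3, 4}
f = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)}
print(is_function(f, A, B), is_injective(f), is_surjective(f, B))
# True True True
```

So the function f of Example 3.6 is bijective.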

Graphs and functions can also be deﬁned on classes. So, in general, a function F from

V to V would be a subclass of V whose members are of the form (x, y) with x, y ∈ V ;

that is, F(x) would be the unique y with (x, y) ∈ F.

For this reason, it might be important to ask what classes exist and what classes

do not exist. The answer will be the following: Subclasses of V exist iﬀ they can be


deﬁned from other – already existing – subclasses of V using certain methods to make

new classes. These methods will be introduced one after another, starting with basic

expressions below.

Definition 3.7 (Expressions and Properties). A basic expression is either a

variable-symbol or a constant which is a ﬁxed member of V . For example, ∅ is a

constant.

An expression is obtained from basic expressions by finitely often building new expressions from old ones. Given expressions E_1, E_2, . . . , E_n, also the following constructions are expressions: E_1 ∪ E_2, E_1 ∩ E_2, E_1△E_2, E_1 − E_2, ⋃E_1, ⋂E_1, {E_1, E_2, . . . , E_n} (with n ∈ N and this expression being ∅ for n = 0), P(E_1).

A basic property is the comparison of two expressions with ∈: E_1 ∈ E_2 is true iff E_1 is a member of E_2 and false otherwise; note that the truth-value can depend on variables built into E_1 and E_2.

A property is obtained from basic properties p_1, p_2 and expressions E by forming Boolean expressions and quantifying over variables: ¬p_1, p_1 ∧ p_2, p_1 ∨ p_2, ∃x ∈ E p_1(x) and ∀x ∈ E p_1(x). Furthermore, one uses the symbols ⊆ and ⊂ as a shorthand:

E_1 ⊆ E_2 ⇔ ∀x ∈ E_1 (x ∈ E_2);
E_1 ⊂ E_2 ⇔ (E_1 ⊆ E_2) ∧ ∃x ∈ E_2 (¬x ∈ E_1).

For ¬x ∈ E_1 one can also write x ∉ E_1.

One can also iterate the process by deﬁning subexpressions using comprehension:

namely if E is an expression and p a property then {x ∈ E | p(x)} is also an expression.

All expressions and properties must be deﬁned by ﬁnitely many iterations of the

process outlined above.

Now every property deﬁnes a class. For example, if P is a property then {x ∈ V :

P(x)} deﬁnes a subclass of V . Furthermore, if E is an expression, then {(x, E(x)) :

x ∈ V } deﬁnes a subclass of V which is a function.

Example 3.8. The following are expressions where a, b, c, . . . stand for constants and

u, v, w, x, y, z for variables; in some statements, the informal deﬁnition is given ﬁrst

and the correct formal one is the last in the chain of equations.

1. P(x), P(P(∅)) and P(⋃(x ∪ a));

2. x × y = {z ∈ P(x ∪ y ∪ P(x ∪ y)) | ∃v ∈ x ∃w ∈ y (z = {v, {v, w}})};

3. {y | y is a graph on x} = {(x, u) | u ⊆ x × x} = {x} × P(x × x);

The following properties are either always true or always false.


1. x = x;

2. x ∈ x;

3. x = ⋃{x};

4. ∃z ∈ P(x) (z ∈ P(y));

5. x = P(x);

6. ∀y ∈ P(x) ∀z ∈ y (z ∈ x);

7. ∃y ∈ P(P(x)) (y ⊆ ∅).

The property ∃y ∈ P(x) (y ⊆ ∅) is true iﬀ x = ∅. So it is not always true and therefore

the last property needed to be more complicated to go into the list of all always true

properties. Properties which are always true are called tautologies.

Given the expression P(x), one can define the function {(x, P(x)) : x ∈ V} mapping x to P(x) as a subclass of V. Similarly, one can code the function x, y ↦ P(x) ∪ ((x ∩ y) × (x ∪ y)) as a class.

Set Theory following the axioms of Zermelo and Fraenkel does not have a formalized concept of classes. Rather, everything is a class which has a definition that can be written down using standard set-theoretic terminology, parameters from V (that is, using sets) and the various recursion theorems explained in chapters to come. Working this out formally goes beyond what is mandatory for a student to learn; therefore, it will only be explained how to make new functions from old ones, but the whole mechanism is left out. So a student should know these things for examinations about functions from V to V:

• A function f from V to V is a subclass of V which consists of ordered pairs

(x, y) such that for every x ∈ V there is a unique y ∈ V with (x, y) ∈ f; one

writes f(x) = y for this pair (x, y).

• Functions from V to V can be concatenated and modiﬁed in the standard way.

• A function f from a set X to a set Y can be extended to a function from V to V by considering the subclass {(x, y) | (x ∈ X ∧ y = f(x)) ∨ (x ∉ X ∧ y = ∅)} of V.

• Functions can be deﬁned from other functions using the various types of recur-

sion deﬁned in Sections to come.

• Functions from V to V satisfy the below Axiom of Replacement.


This axiom is used in order to make sure that the range of a set under a function

(from V to V ) is again a set.

Axiom 3.9 (Replacement). Let f be a subclass of V which is a function from V to

V , that is, f is a class of pairs (x, y) such that for each x ∈ V there is a unique y ∈ V

with (x, y) ∈ f. Then, for every set X ∈ V , both f(X) and f[X] = {f(y) : y ∈ X}

are sets again.

Exercise 3.10. Define (informally) functions f_n from N to N with the following properties:

1. f_1 is bijective and satisfies f_1(x) ≠ x but f_1(f_1(x)) = x for all x ∈ N;

2. f_2 is two-to-one: for every y there are exactly two elements x, x′ ∈ N with f_2(x) = f_2(x′) = y;

3. f_3 is dominating all polynomials, that is, for every polynomial p there is an x such that for all y > x, f_3(y) > p(y);

4. f_4 satisfies f_4(x + 1) = f_4(x) + 2x + 1 for all x ∈ N;

5. f_5(x) = 0 if x = 0;
   f_5(x) = f_5(x − 1) if x > 0 and x is not a square number;
   f_5(x) = f_5(x − 1) + 1 if x > 0 and x is a square number.

Determine the range of the function f_4.

Definition 3.11. Let F : A → B and G : B → C. The composition of F and G is the function G ∘ F : A → C defined by G ∘ F(a) = G(F(a)) for every a ∈ A. That is,

G ∘ F = { (a, c) | ∃b ∈ B ((a, b) ∈ F ∧ (b, c) ∈ G) }.
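The set-theoretic form of the composition translates directly into a comprehension over pairs. The following is a small sketch with functions stored as sets of Python tuples; the example functions F and G are invented here for illustration.

```python
def compose(G, F):
    """G ∘ F = {(a, c) | there is b with (a, b) in F and (b, c) in G}."""
    return {(a, c) for (a, b) in F for (b2, c) in G if b == b2}


F = {(0, 1), (1, 2), (2, 0)}        # F : {0, 1, 2} → {0, 1, 2}
G = {(0, "x"), (1, "y"), (2, "x")}  # G : {0, 1, 2} → {"x", "y"}
print(sorted(compose(G, F)))  # [(0, 'y'), (1, 'x'), (2, 'x')]
```

Note the order of the arguments: in G ∘ F the function F is applied first.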

Exercise 3.12. Let A = {0, 1, 2} and F = {f : A → A | f = f ◦ f}. Show that F

has exactly 10 members and determine these.

Deﬁnition 3.13. Let f : A → B. Let C ⊆ A. Let D ⊆ B.

1. The restriction of f to C, denoted by f↾C, is the function f ∩ (C × B).

2. f[C] is the subset of B determined by

f[C] = { b ∈ B | ∃ a ∈ C ((a, b) ∈ f) }.

f[C] is the image of C under f.

3. f^{-1}[D] is the subset of A determined by

f^{-1}[D] = { a ∈ A | ∃ b ∈ D ((a, b) ∈ f) }.

f^{-1}[D] is the preimage of D under f.

Example 3.14. Let f : n ↦ n + 5 be a function on the natural numbers. Then the set f[{3, 4, 5, 6}] = {8, 9, 10, 11} is the image of {3, 4, 5, 6} and f^{-1}[{5, 6, 7, 8}] = {0, 1, 2, 3} is the preimage of {5, 6, 7, 8} as f[{0, 1, 2, 3}] = {5, 6, 7, 8}. The preimage of {0, 1, 2, 3, 4} is the empty set.
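Image and preimage can be tried out on a finite window of N. This is a sketch under the assumption that the domain is cut off at 100; the names `image`, `preimage` and `domain` are illustrative only.

```python
def image(f, C):
    """f[C]: the set of all values f(a) with a in C."""
    return {f(a) for a in C}

def preimage(f, D, domain):
    """f^{-1}[D]: all a in the (finite) domain with f(a) in D."""
    return {a for a in domain if f(a) in D}

f = lambda n: n + 5
domain = range(100)  # finite stand-in for N, an assumption of this sketch

print(image(f, {3, 4, 5, 6}))                 # {8, 9, 10, 11}
print(preimage(f, {5, 6, 7, 8}, domain))      # {0, 1, 2, 3}
print(preimage(f, {0, 1, 2, 3, 4}, domain))   # set()
```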

Deﬁnition 3.15. Let A and B be sets. The set of all functions from A to B is

denoted by B^A or {f : A → B}.

For example, {0, 1}^N is the set of all functions from N to {0, 1}, that is, of all binary sequences.

Example 3.16. For given sets A, B, C, let

D = {f ∈ C^A | ∃g ∈ B^A ∃h ∈ C^B (f = h ◦ g)}.

Then, depending on the choice of A, B, C, either D ⊂ C^A or D = C^A.

Proof. Let A, C be any sets with at least two elements. Obviously every function going from A to B and then from B to C is a function from A to C. Thus D ⊆ C^A.

If B has exactly one element, that is, B = {b} for some b, then every function in D is constant: For all a, a′ ∈ A, f(a) = h(g(a)) = h(b) = h(g(a′)) = f(a′). But there is also a nonconstant function in C^A: By choice there are two distinct elements c, c′ in C. Fix furthermore an element a ∈ A. Now define f(a) = c and f(a′) = c′ for all a′ ∈ A − {a}. This function is not constant since A has at least two elements. Thus f ∉ D and D ⊂ C^A.

If B = A and f ∈ C^A, then one can take g to be the identity and h = f. Now h ◦ g = f and thus D = C^A.

4 Natural Numbers

In computing, everything is coded as a binary sequence. For example, the American Standard Code for Information Interchange (ASCII) represents the digit “0” as 00110000, the digit “9” as 00111001 and the letter “A” as 01000001. A consequence is that unsplittable objects like letters and digits can now be split into subparts. So the digits all have the prefix 0011 followed by 0000, 0001, . . ., 1001. Both parts can be split again into 4 binary digits 0 and 1 which are the only primitive objects.

In Set Theory, every object is represented as a set. Thus, primitive objects like the natural numbers have, due to this representation, elements of their own. Against the intuition, they are no longer unsplittable elements which cannot be decomposed further. Since one cannot avoid that a number has elements, one tries to represent them as naturally as possible. That is, the number n has as elements those sets which represent the numbers m below n. Furthermore, the numbers are identified with their representation

0 = ∅, 1 = {∅}, 2 = {∅, {∅}}

and this representation will be kept ﬁxed. Note that 1 = {0} = 0 ∪ {0}, 2 = {0, 1} =

1 ∪ {1}, 3 = {0, 1, 2} = 2 ∪ {2}, 4 = {0, 1, 2, 3} = 3 ∪ {3} and so on. This can be

formalized using the notion of a successor S: S(x) = x ∪ {x}.
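The coding of numbers as sets can be reproduced with nested `frozenset` values. This is a sketch only; the function names `successor` and `numeral` are chosen here for illustration.

```python
def successor(x):
    """S(x) = x ∪ {x}."""
    return x | frozenset([x])

def numeral(n):
    """Build the set representing n by applying S to the empty set n times."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

three = numeral(3)
print(three == frozenset([numeral(0), numeral(1), numeral(2)]))  # True
print(len(numeral(4)))           # 4
print(numeral(2) in numeral(3))  # True
```

In this representation the number n has exactly n elements, and m < n corresponds to `numeral(m) in numeral(n)`.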

Given a finite set X of natural numbers, one can easily see from the way they are coded that ⋃X is the maximum of X. So S(⋃X) gives a new element of the natural numbers which is outside the set X. This means that the set of natural numbers is provably infinite. Therefore, ensuring that the natural numbers are in V means ensuring that a provably infinite set is in V. If a set X shares the basic properties of the natural numbers that it contains 0 and for every n also the successor S(n), then it is called inductive. Inductive sets are always infinite.

Deﬁnition 4.1. A set X is an inductive set iﬀ ∅ ∈ X and X is closed under S:

∀y ∈ X (S(y) ∈ X).

Axiom 4.2 (Inﬁnity). There exists an inductive set.

Deﬁnition 4.3. The set ∅ is also called 0 and inductively the set S(n) is called n+1

where S is the function given by S(x) = x ∪ {x}. Let X be any inductive set. Then

call N = ⋂{Y ⊆ X : Y is inductive}.

One can show that the deﬁnition of N is unique, so it does not matter which inductive

set one chooses to start. One can now ask, is N the set of all natural numbers? That

is, is N = {0, 1, 2, 3, . . .} true? From the axioms, one cannot prove this because one

cannot prove that {0, 1, 2, 3, . . .} is a set. One can only prove the following things:

• N contains every natural number as every inductive set contains every natural

number;

• N is inductive;


• every inductive subset X of N satisﬁes N = X;

• if the collection {0, 1, 2, . . .} is actually a set then N = {0, 1, 2, . . .};

• all the laws which one learns in lectures about {0, 1, 2, . . .} and which can be

written down formally are true for N.

For this reason, most books of set theory call “N” just “the set of natural numbers”.

But this is not completely precise, as will be shown in Section 20. Here are some explanations.

The problem behind this is a problem of axioms: as long as formulas are ﬁnite and

one writes down the axioms using the language of set theory by accessing V and ∈ and

using variables for members of V , quantiﬁers and Boolean connectives, one cannot

come up with any set of finite formulas which defines N better than was done above. Such a language of set theory would include a sentence like “there is a set X such that for all elements y ∈ X, y contains an element z and z is not empty”, but it does not contain an infinite formula like ∀x ∈ N (x = 0 ∨ x = 1 ∨ x = 2 ∨ x = 3 ∨ . . .) which would be needed to make sure that N = {0, 1, 2, . . .} and does not contain anything else.

This situation is a bit unpleasant as everyone assumes that the existence of the set of natural numbers is self-evident, but unfortunately it is not. Nevertheless, N behaves like the set of natural numbers should behave and for that reason, in many set-theory books N is just called the set of natural numbers and it is assumed that N = {0, 1, 2, . . .} is guaranteed.

In the years 1945 – 1991, the Soviet Union was a member of the United Nations (UNO).

The Soviet Union itself had 15 republics as its members: Soviet Union = {Belarus,

Estonia, Latvia, Lithuania, Ukraine, . . .}. Two of them, Belarus and Ukraine, were

not only members of the Soviet Union but also of the UNO and represented in the

general assembly. So one had the situation that Ukraine ∈ Soviet Union ∈ UNO and

Ukraine ∈ UNO at the same time. If this situation occurred not only at some but at all places, that is, if every member of a member of the UNO were already a direct member of the UNO, then one would call the UNO “transitive”.

Deﬁnition 4.4. A set A is called transitive iﬀ for all a ∈ A and b ∈ a it holds that

b ∈ A as well.

Example 4.5. The set {∅, {∅}, {{∅}}, {{{∅}}}, {{{{∅}}}}} is transitive, but not all of its elements are: for instance, {{∅}} contains {∅} but not ∅. Furthermore, N is transitive.

Proof of Second Statement. Given N, let A = {X ∈ N | X ⊆ N}. Clearly ∅ ∈ A.

Now consider any X ∈ A. By deﬁnition of A, X ∈ N and X ⊆ N. Thus S(X) =


X ∪{X} ⊆ N. Furthermore, S(X) ∈ N as N is inductive. So X ∈ A ⇒ S(X) ∈ A for

all X ∈ A. It follows that A is an inductive subset of N; as N is the smallest inductive

set, A = N and every X ∈ N satisﬁes X ⊆ N. Hence, N is transitive.
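Definition 4.4 translates directly into a check on finite sets. The sketch below assumes sets are modelled as nested `frozenset` values; `is_transitive` and `chain` are names introduced here. It rebuilds the set from Example 4.5 and tests it and each of its elements.

```python
def is_transitive(A):
    """A is transitive iff every element b of an element a of A is itself in A."""
    return all(b in A for a in A for b in a)

# Build ∅, {∅}, {{∅}}, {{{∅}}}, {{{{∅}}}} as in Example 4.5.
chain = [frozenset()]
for _ in range(4):
    chain.append(frozenset([chain[-1]]))
A = frozenset(chain)

print(is_transitive(A))                   # True
print([is_transitive(a) for a in chain])  # [True, True, False, False, False]
```

The elements ∅ and {∅} are transitive, while {{∅}} and the deeper singletons are not.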

Exercise 4.6. Which of the following sets is transitive and which is inductive?

1. A = {∅, {∅}},

2. B = {∅, {{{∅}}}},

3. C = {x | ∀y ∈ x∀z ∈ y (z = ∅)},

4. D is the closure of {∅, N ×N} under the successor operation x → S(x),

5. E is the set of even numbers,

6. F is the set of all natural numbers which can be written down with at most 256

decimal digits,

7. G is the set of all ﬁnite subsets of N,

8. H = P(G).

Exercise 4.7. Show that the following statements are equivalent for any inductive

set X.

1. X = N;

2. X has no proper inductive subset;

3. X is a subset of every inductive set;

4. ∀x ∈ X (x = 0 ∨ ∃y ∈ X (x = S(y)));

5. X = N(Y ) for every inductive set Y where N(Y ) is the subset of those y ∈ Y

which are in every inductive subset of Y ;

6. X = N(Y ) for some inductive set Y where N(Y ) is deﬁned as in the previous

item.

Theorem 4.8 (Mathematical Induction Principle). Let φ(x) be a property.

Assume that

1. φ(0) holds and


2. for all n ∈ N, φ(n) implies φ(n + 1).

Then φ(n) holds for every natural number n ∈ N.

Proof. Let A = {n ∈ N | φ(n) holds}. Then A is an inductive set. Hence N ⊆ A.

In particular, φ(n) is true for all n ∈ N.

Exercise 4.9. Assume that a property p satisﬁes

p(1) and ∀x(p(x) ⇒ p(S(S(x)))).

Show that p(x) is true for all odd numbers. Assume now that one chooses the property

p to identify the odd numbers, that is,

p(x) ⇔ ∃y ∈ N(x = y + y + 1).

Show that p then satisﬁes the above condition.

The following properties follow from the fact that the natural numbers are deﬁned

such that n = {m ∈ N | m < n}. This means in particular n < m iﬀ n ∈ m. So, for

example, the third item is then just the transitivity of < on N.

Property 4.10. Assume that m, n, k ∈ N.

1. Either 0 = n or 0 ∈ n.

2. k ∈ n + 1 if and only if k ∈ n or k = n.

3. If k ∈ m and m ∈ n, then k ∈ n.

4. If n ∈ m, then m = n + 1 or n + 1 ∈ m.

5. Either m ∈ n or m = n or n ∈ m.

6. n ≠ n + 1.

7. If n ≠ m then n + 1 ≠ m + 1.

8. n ⊂ N.

Proposition 4.11. Let A be a transitive nonempty set. Then A ⊆ N iﬀ 0 is the

unique element which is not the successor of any other element of A.

Proof. First consider the case A ⊆ N. By the Axiom of Foundation there is a y ∈ A such that every z ∈ A satisfies z ∉ y. On the other hand, every z ∈ y satisfies z ∈ A, as A is transitive. Thus y has no elements and y = 0. Hence 0 ∈ A.

Now consider any transitive A and assume now that x ∈ A − {0} and x ≠ S(y) for all y ∈ A. If x is an inductive set then N ⊆ x and x ∉ N by the Axiom of Foundation. If x is not an inductive set then there is a y ∈ x with S(y) ∉ x. Note that y ∈ A since A is transitive. If y ∉ N then x ∉ N as N is transitive. If y ∈ N then x ≠ S(y) by choice of x. But S(y) is the only natural number z such that y ∈ z ∧ S(y) ∉ z. It follows that x ∉ N again. So whenever A ⊆ N and A is not empty then 0 is the unique element of A which is not the successor of any other element of A.

Last consider the case that A ⊈ N and let B = A − N. By the Axiom of Foundation there is an element x ∈ B with y ∉ x for all y ∈ B. Let z be any element of A. If z ∈ N then S(z) ∈ N and S(z) ≠ x. If z ∈ B then z ∈ S(z) and thus again S(z) ≠ x. So the element x of A is different from 0 and not the successor of any other element of A.

Corollary 4.12. A set A is equal to N iﬀ A is transitive, inductive and 0 is the

unique element of A which is not the successor of any other element of A.

Remark 4.13. Let X be a nonempty set and n ∈ N.

1. If X ⊆ N, then there is a unique m ∈ X such that m∩ X = ∅.

2. If X ⊆ n then there is a unique m ∈ X such that m∩ X = ∅.

The proof of this is immediate by Axiom 1.3 (Foundation); indeed, the statement

is literally almost the same. But the interpretation of this statement is that every

nonempty subset of the set of natural numbers has a smallest element, which is m above, as

m∩ X = ∅ implies that there is no n < m in X. For this reason, the two statements

are explicitly listed here.

5 Recursive Deﬁnition

Functions with domain the set of natural numbers can be deﬁned recursively.

Example 5.1. One can define + : N × N → N as follows: For each m ∈ N, define f_m : {m} × N → N by

f_m(m, 0) = m, f_m(m, S(n)) = S(f_m(m, n)).

Then ⋃{f_m | m ∈ N} = {(m, n, f_m(m, n)) | m, n ∈ N} represents the addition, that is, m + n = f_m(m, n) for all m, n ∈ N.

Similarly, one can also define · : N × N → N as follows: For each m ∈ N, define g_m : {m} × N → N by

g_m(m, 0) = 0, g_m(m, S(n)) = g_m(m, n) + m.

Then ⋃{g_m | m ∈ N} = {(m, n, g_m(m, n)) | m, n ∈ N} represents the multiplication, that is, m · n = g_m(m, n) for all m, n ∈ N.

Furthermore the factorial m! can be defined by 0! = 1 and S(n)! = n! · S(n). Similarly, 2^n can be defined by 2^0 = 1 and 2^S(n) = 2 · 2^n.
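The recursive equations of Example 5.1 can be transcribed almost literally. This is a sketch assuming Python integers stand in for the natural numbers and that only the successor `S` is taken as primitive; all function names are illustrative.

```python
def S(n):
    """Successor on the natural numbers."""
    return n + 1

def add(m, n):
    # m + 0 = m,  m + S(n) = S(m + n)
    return m if n == 0 else S(add(m, n - 1))

def mul(m, n):
    # m · 0 = 0,  m · S(n) = m · n + m
    return 0 if n == 0 else add(mul(m, n - 1), m)

def factorial(n):
    # 0! = 1,  S(n)! = n! · S(n)
    return 1 if n == 0 else mul(factorial(n - 1), n)

def power_of_two(n):
    # 2^0 = 1,  2^S(n) = 2 · 2^n
    return 1 if n == 0 else mul(2, power_of_two(n - 1))

print(add(3, 4), mul(3, 4), factorial(5), power_of_two(5))  # 7 12 120 32
```

The recursion depth grows with the second argument, so this transcription is only suitable for small inputs.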

Theorem 5.2 (The Recursion Theorem). Assume that g : N × V → V is a

function and a a value. Then there exists a unique function f : N → V such that

1. f(0) = a and

2. f(S(n)) = g(n, f(n)) for all n ∈ N.

Proof. Let C be the class of all functions h with domain contained in N such that h(0) = a and for all b ∈ N with h(S(b)) being defined, h(b) is defined as well and h(S(b)) = g(b, h(b)). Define the class

f = {(b, c) | b ∈ N ∧ ∃h ∈ C ((b, c) ∈ h)}

It is now shown that f is actually a function from N to V. This is done by considering the following two subclasses of N:

D = {d ∈ N | ∃c ∈ V ((d, c) ∈ f)};

E = {e ∈ D | ∀c, c̃ ∈ V ((e, c), (e, c̃) ∈ f ⇒ c = c̃)}.

By the Axiom of Comprehension, they are subsets of N. Both sets contain 0 as there is a function with domain {0} and value a and as every member of C contains the pair (0, a) but no other pair with first component 0.

Assume that D ≠ N; then there is a minimum d ∈ N − D. As 0 ∈ D, d = S(b) for some b. Let h be a member of C with b in its domain; such a member exists by the choice of f, C and D. As h(d) is undefined, one can look at the function h̃ = h ∪ {(d, g(b, h(b)))}. This function is also in the class C, hence (d, h̃(d)) ∈ f in contradiction to the assumption on d. This contradiction gives D = N.

Assume that E ≠ N. Then there is a minimum e ∈ N − E. By choice there are two functions h, h̃ ∈ C such that h(e) ≠ h̃(e). As e > 0, there is a number b such that e = S(b). Furthermore, h(b) = h̃(b). By definition of C, h(e) = g(b, h(b)) = g(b, h̃(b)) = h̃(e), in contradiction to the assumption on e. Hence, e cannot exist and E = N.

In summary, f is a function from N to V. By the Axiom of Replacement, f[N] is a set and f actually a function from a set to a set. Furthermore, f(0) = a and f(S(n)) = g(n, f(n)) for all n ∈ N. If any function f̃ : N → V satisfies conditions 1. and 2., then the restriction of f̃ to each domain n ∈ N is a member of C and that member is extended by f; hence f̃ = f. So f is the only function satisfying the recursive definition.
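The scheme of the Recursion Theorem, f(0) = a and f(S(n)) = g(n, f(n)), can be imitated by a small higher-order function. This is a sketch under the assumption that g is an ordinary Python function; the name `define_by_recursion` is hypothetical and not taken from the notes.

```python
def define_by_recursion(a, g):
    """Return the function f with f(0) = a and f(n + 1) = g(n, f(n))."""
    def f(n):
        value = a
        for k in range(n):
            value = g(k, value)  # step from f(k) to f(k + 1)
        return value
    return f

# Factorial: a = 1 and g(n, y) = y · (n + 1).
factorial = define_by_recursion(1, lambda n, y: y * (n + 1))
# Powers of two: a = 1 and g(n, y) = 2 · y.
power_of_two = define_by_recursion(1, lambda n, y: 2 * y)

print([factorial(n) for n in range(6)])     # [1, 1, 2, 6, 24, 120]
print([power_of_two(n) for n in range(6)])  # [1, 2, 4, 8, 16, 32]
```

The loop builds the values f(0), f(1), . . ., f(n) one after the other, which mirrors how the members of C in the proof extend each other step by step.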

This theorem has some applications. The first one is that for each set X there is a transitive closure TC(X) of X. Although it is not stated in the next proposition, one can even show that the operation X ↦ TC(X) is a function from V to V which is represented as a subclass of V. Note that the concept of a transitive closure is quite natural: it contains the elements plus elements of elements plus elements of elements of elements plus ... of a set. One could also define it as the intersection of all transitive supersets:

TC(X) = ⋂{Y : {X} ⊆ Y ⊆ Z and Y is transitive},

where Z is any transitive set containing X to start with, in the same way as the natural numbers were defined as the intersection of all inductive subsets of some given inductive set. The definition is not sensitive to which set Z is chosen.

Proposition 5.3. For every set X there is a set TC(X) which is the smallest transitive set containing X as an element.

Proof. For any n ∈ N and y ∈ V, let

g(n, y) = y ∪ ⋃y = y ∪ {x | ∃z (x ∈ z ∧ z ∈ y)}

consist of all elements of y plus all elements of elements of y. The function g clearly corresponds to a class as it is written down as an expression. Thus there is a function f : N → V with f(0) = {X} and f(S(n)) = g(n, f(n)) for all n ∈ N. Now f[N] is a set and ⋃f[N] is the union of all f(n).

Now define TC(X) = ⋃f[N]. As f(0) = {X}, X ∈ TC(X).

The set TC(X) is transitive: if z ∈ ⋃f[N] and x ∈ z then there is an n ∈ N with z ∈ f(n). It follows from the definition that x ∈ f(S(n)) and x ∈ ⋃f[N].

If Y ⊇ {X} and Y is transitive then Y ⊇ TC(X); that is, TC(X) is the smallest transitive superset of {X}: To see this, let Y be any transitive superset of {X}. Clearly f(0) ⊆ Y. Furthermore, if f(n) ⊆ Y then f(S(n)) contains only elements which are either in f(n) or elements of elements of f(n). As f(n) ⊆ Y, the elements and also the elements of elements of f(n) are also elements and elements of elements of Y, respectively. As Y is transitive, f(S(n)) ⊆ Y. It follows from the induction principle that every set f(n) with n ∈ N is a subset of Y. Then TC(X) = ⋃f[N] is also a subset of Y.
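For sets that can be written down explicitly, the construction in the proof can be run directly. The sketch below assumes hereditarily finite inputs modelled as nested `frozenset` values, so that the sequence f(0) ⊆ f(1) ⊆ . . . becomes stationary after finitely many steps; the name `transitive_closure` is illustrative.

```python
def transitive_closure(X):
    """Iterate f(0) = {X}, f(n + 1) = f(n) ∪ ⋃f(n) until nothing new appears."""
    level = frozenset([X])
    while True:
        next_level = level | frozenset(x for y in level for x in y)
        if next_level == level:
            return level  # equals the union of all f(n)
        level = next_level

empty = frozenset()
X = frozenset([frozenset([empty])])  # X = {{∅}}
TC = transitive_closure(X)

print(len(TC))                                  # 3
print(X in TC)                                  # True
print(all(b in TC for a in TC for b in a))      # True
```

Here TC(X) consists of X, {∅} and ∅. For infinite sets the loop would not terminate, which is where the Axiom of Replacement is needed in the proof.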


Proposition 5.4. Every set X is a subset of an inductive set.

Proof. Using Recursion, there is a function f : N → V with f(0) = X ∪ {∅} and f(S(n)) = {S(y) | y ∈ f(n)} for all n ∈ N. Now Y = ⋃f[N] is an inductive set: it contains ∅ as ∅ ∈ f(0); for every y ∈ Y there is an n ∈ N with y ∈ f(n) and S(y) ∈ f(n + 1), hence S(y) ∈ Y as well. Furthermore, X ⊆ Y.

Example 5.5. For every function h : N → N there exists a function f such that f(n) = h(0) + h(1) + . . . + h(n). This is a consequence of the Recursion Theorem which takes f to be the function satisfying f(0) = h(0) and f(n + 1) = g(n, f(n)) where g(n, y) = y + h(n + 1). Here are some examples for such recursively defined functions f and the functions h on which they are based.

Function h    Sum f, f(n) = h(0) + h(1) + . . . + h(n)
1             n + 1
n             n^2/2 + n/2
n^2           n^3/3 + n^2/2 + n/6
n^3           n^4/4 + n^3/2 + n^2/4
3^n           (3^(n+1) − 1)/2
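The closed forms for the sums h(0) + h(1) + . . . + h(n) can be checked numerically for small n. This is a sketch using exact fractions; `partial_sum` and `table` are names chosen here. Note that for the constant function h = 1 the sum has n + 1 terms and therefore equals n + 1.

```python
from fractions import Fraction

def partial_sum(h, n):
    """f(n) = h(0) + ... + h(n), computed via f(0) = h(0), f(k + 1) = f(k) + h(k + 1)."""
    total = h(0)
    for k in range(n):
        total = total + h(k + 1)
    return total

# Pairs (h, closed form of the sum f).
table = [
    (lambda n: 1,      lambda n: n + 1),
    (lambda n: n,      lambda n: Fraction(n**2, 2) + Fraction(n, 2)),
    (lambda n: n**2,   lambda n: Fraction(n**3, 3) + Fraction(n**2, 2) + Fraction(n, 6)),
    (lambda n: n**3,   lambda n: Fraction(n**4, 4) + Fraction(n**3, 2) + Fraction(n**2, 4)),
    (lambda n: 3**n,   lambda n: Fraction(3**(n + 1) - 1, 2)),
]

all_match = all(partial_sum(h, n) == closed(n)
                for h, closed in table for n in range(50))
print(all_match)  # True
```

Such a check covers only finitely many n; the general statement follows by induction.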

Exercise 5.6. Determine the functions f_n given by the following recursive equations:

1. f_1(0) = 0, f_1(S(n)) = f_1(n) + 2^n,

2. f_2(0) = 1, f_2(1) = 0, f_2(S(S(n))) = f_2(n) · (4 · S(n)) / S(S(n)),

3. f_3(n) = 1 for n = 0, 1, . . . , 9, f_3(10n + m) = f_3(n) + 1 for n = 1, 2, . . . and m = 0, 1, . . . , 9,

4. f_4(0) = 0, f_4(1) = 0, f_4(2) = 0, f_4(3) = 1, f_4(S(n)) = f_4(n) + (1/2)(n^2 − n) for n > 2,

5. f_5(0) = 1, f_5(S(n)) = 256 · f_5(n).

Give informal explanations what these functions compute, for example, consider f_6 given by f_6(0) = 0, f_6(1) = 0 and f_6(S(n)) = f_6(n) + 2n for n ≥ 1. Then f_6(n) = n(n − 1). One explanation would be to assume that there is a soccer league with n teams. Then there are f_6(n) games per season, each pair {A, B} of two different teams plays once at A’s place and once at B’s place.

Recall the deﬁnition of domination from Exercise 3.10: A function h : N → N domi-

nates a function f : N →N if there is an n such that ∀m ≥ n(f(m) < h(m)). A func-

tion f : N →N is unbounded if it dominates every constant function c : N →N satis-

fying ∀n, m(c(m) = c(n)). A function f : N →N is increasing if ∀n(f(n) ≤ f(S(n))).


Theorem 5.7. Given a function H : N × N → N such that for every m ∈ N the function h_m given as h_m(n) = H(m, n) is increasing and unbounded, there is an unbounded and increasing function f such that every h_m dominates f.

Proof. This is proven using the Recursion Theorem with a function g defined as

g(n, k) = S(k) if S(k) < h_m(S(n)) for all m ∈ k;
g(n, k) = k otherwise.

This definition can be realized as an expression since one can access H(m, S(n)) in order to get h_m(S(n)). So the function g can be defined using only one parameter from V, namely H. This is necessary because the set {h_0, h_1, . . .} might be outside V if H would not be there.

There is a function f such that f(0) = 0 and ∀n ∈ N (f(S(n)) = g(n, f(n))). Now it is verified that f satisfies all necessary requirements.

1. f is increasing. This follows directly from the definition of g: for all n, f(S(n)) ∈ {f(n), S(f(n))}.

2. f is unbounded. Let k be the value of f at some place i. For every m ∈ k there is a value n_m such that h_m(n_m) > S(k) since h_m is unbounded. Let j = 2 + max{i, n_0, n_1, . . . , n_{k−1}}. Then either f(j) > k or f(j) = k ∧ g(j, f(j)) = S(k) ∧ f(S(j)) = S(k). Thus f is going up at infinitely many places and unbounded.

3. Every h_m dominates f. Assume by contradiction that h_m does not dominate f. Since f is unbounded and increasing, there is a first value n such that f(S(n)) > S(m) and f(S(n)) ≥ h_m(S(n)). Since f(0) = 0, one can conclude that either f(n) ≤ S(m) or f(n) < h_m(n). Since h_m is increasing, one has in both cases that f(S(n)) = S(f(n)) and it follows from the definition of g that S(f(n)) < h_{m′}(S(n)) for all m′ ∈ f(n). Since f(S(n)) > S(m), one has f(n) ≥ S(m) and m ∈ f(n). Thus f(S(n)) < h_m(S(n)) in contradiction to the choice of n. So the n chosen cannot exist and h_m dominates f.

So f is unbounded, increasing and dominated by every h_m with m ∈ N.

Exercise 5.8. Let H : N × N → N be a function and h_m : N → N be given by h_m(n) = H(m, n) for all n. Show that there is a function f dominating every h_m.

6 Cardinality of Sets

The cardinality of a set is the number of its elements. For example, the set {Adelaide,

Brisbane, Canberra, Melbourne, Perth, Sydney} of Australia’s largest towns has six


members, that is, its cardinality is 6. This is established by counting, Adelaide is the

ﬁrst, Brisbane the second, Canberra the third, Melbourne the fourth, Perth is the

ﬁfth and Sydney the sixth town. Mathematically, this can be viewed as a bijective

mapping from the set of the largest Australian towns to the set {ﬁrst, second, third,

fourth, ﬁfth, sixth} representing 6. When working out the foundations of set theory,

Cantor deﬁned that two sets have the same size iﬀ there is a bijective function between

them.

Deﬁnition 6.1. Let A and B be two sets. The sets A and B have the same cardinality,

denoted by |A| = |B|, iﬀ there exists a bijective function from A to B.

Property 6.2. Having the same cardinality is an equivalence relation, that is, it is reflexive, symmetric and transitive. In detail, the following holds for all sets A, B, C:

1. |A| = |A|;

2. if |A| = |B|, then |B| = |A|;

3. if |A| = |B| and |B| = |C|, then |A| = |C|.

Example 6.3. Let E ⊂ N be the set of all even natural numbers. Then |E| = |N|

which is witnessed by the bijection x → 2 · x from N to E.

Proposition 6.4. |P(X)| = |{0, 1}^X|.

Proof. For each A ⊆ X, let f(A) be the characteristic function of A, defined by

f(A)(x) = 1 if x ∈ A,
f(A)(x) = 0 if x ∉ A.

Then f is one-to-one and onto.
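For a small finite X, the correspondence between subsets and characteristic functions can be enumerated completely. This is a sketch assuming X = {a, b, c} and representing a function X → {0, 1} as a tuple of its values; all names are illustrative.

```python
from itertools import combinations, product

X = ('a', 'b', 'c')

def characteristic(A):
    """The characteristic function of A ⊆ X, stored as the tuple of its values on X."""
    return tuple(1 if x in A else 0 for x in X)

# All subsets of X and all functions X -> {0, 1}.
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
all_binary_functions = set(product((0, 1), repeat=len(X)))

images = {characteristic(A) for A in subsets}
print(len(subsets))                    # 8
print(len(images) == len(subsets))     # True: one-to-one
print(images == all_binary_functions)  # True: onto
```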

Proposition 6.5. If C ⊆ B ⊆ A and |C| = |A| then |A| = |B|.

Proof. Let f : A → C be a bijective function. One recursively defines two sequences A_0, A_1, . . . , A_n, . . . and B_0, B_1, . . . , B_n, . . . of sets as follows. Let A_0 = A and B_0 = B and for each n ∈ N, let

A_{n+1} = f[A_n], B_{n+1} = f[B_n].

By induction, A_{n+1} ⊆ B_n ⊆ A_n for all n ∈ N. For each n ∈ N, let E_n = A_n − B_n. Note that f[E_n] = f[A_n] − f[B_n] = A_{n+1} − B_{n+1} = E_{n+1} which can be proven by induction. Thus there is a set E such that

E = ⋃{E_n | n ∈ N}.

The idea is now to use that all the E_n are disjoint and that E_0 = A − B. The new function g will be a one-to-one mapping from each E_n to E_{n+1} where it follows f and will be the identity otherwise:

g(x) = f(x) if x ∈ E;
g(x) = x if x ∈ A − E.

Then g is one-to-one and g[A] = ⋃{E_n | n ≥ 1} ∪ (A − E) = A − E_0 = B.

Deﬁnition 6.6. Let A and B be two sets. The cardinality of A is less than or equal

to the cardinality of B, denoted by |A| ≤ |B|, iﬀ there exists a one-to-one function

f : A → B. The cardinality of A is less than the cardinality of B, denoted by

|A| < |B|, if |A| ≤ |B| and there is no one-to-one function from B into A.

Exercise 6.7. Prove by giving a one-to-one function that the set {Auckland, Christ-

church, Dunedin, Wellington} of New Zealand’s largest towns has a cardinality which

is less than the one of the set of Australian towns given above. Furthermore, prove

that it is not less or equal than the cardinality of the set {Singapore}.

Property 6.8. Each A, B, C ∈ V satisfy the following:

1. |A| ≤ |A|;

2. if |A| = |B| then |A| ≤ |B|;

3. if |A| ≤ |B| and |B| ≤ |C| then |A| ≤ |C|.

Theorem 6.9 (Cantor-Bernstein Theorem). If |X| ≤ |Y | and |Y | ≤ |X|, then

|X| = |Y |.

Proof. Let f : X → Y and g : Y → X be one-to-one functions. Then g[f[X]] ⊆

g[Y ] ⊆ X. Since |X| = |g[f[X]]|, by Proposition 6.5, |X| = |g[Y ]|. Since |Y | = |g[Y ]|,

|X| = |Y |.

Example 6.10. The sets {0, 1}^N and {0, 1, 2}^N have the same cardinality.

Proof. Every {0, 1}-valued function is also a {0, 1, 2}-valued function, thus {0, 1}^N ⊆ {0, 1, 2}^N and |{0, 1}^N| ≤ |{0, 1, 2}^N|. It remains to show the other direction. Consider the function F : {0, 1, 2}^N → {0, 1}^N given as

F(f)(2n) = min{f(n), 1},
F(f)(2n + 1) = min{2 − f(n), 1}.

Assume that f(n) = 0, g(n) = 1, h(n) = 2. Then F(f)(2n) = 0 and F(f)(2n + 1) = 1, F(g)(2n) = 1 and F(g)(2n + 1) = 1, F(h)(2n) = 1 and F(h)(2n + 1) = 0. Thus if two functions are different at n, their images are different either at 2n or at 2n + 1. It follows that F is one-to-one and |{0, 1, 2}^N| ≤ |{0, 1}^N|. By Theorem 6.9, both sets have the same cardinality.
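The coding F can be tested on finite initial segments of sequences. This is a sketch assuming a {0, 1, 2}-valued function is given by the tuple of its first few values; `encode` is a name introduced here.

```python
from itertools import product

def encode(values):
    """Apply F to a finite prefix: position n yields the two bits at 2n and 2n + 1."""
    bits = []
    for v in values:
        bits.append(min(v, 1))      # F(f)(2n)
        bits.append(min(2 - v, 1))  # F(f)(2n + 1)
    return tuple(bits)

print(encode((0, 1, 2)))  # (0, 1, 1, 1, 1, 0)

# All 3^4 prefixes of length 4 receive pairwise different codes.
prefixes = list(product((0, 1, 2), repeat=4))
codes = {encode(p) for p in prefixes}
print(len(prefixes), len(codes))  # 81 81
```

The value 0 is coded as 01, the value 1 as 11 and the value 2 as 10, so the original sequence can be read off from the code.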

Exercise 6.11. Show that if |X| = |X × N| then |{0, 1}^X| = |N^X|.

Theorem 6.12 (Cantor). |X| < |P(X)|.

Proof. The function x ↦ {x} is a one-to-one function from X to P(X). Hence |X| ≤ |P(X)|.

To show that |X| ≠ |P(X)|, consider any function f from X to P(X). One has to show that f is not onto. For this, one defines the subset A ⊆ X by

A = {x ∈ X | x ∉ f(x)}

and shows that A is not in the range of f. This is done by verifying that f(a) ≠ A for any a ∈ X.

Actually this comes directly from the definition of A: for given a, the definition states that a ∈ A iff a ∉ f(a). Thus A and f(a) differ with respect to the membership of a and f(a) ≠ A. This shows that f is not onto.
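The diagonal argument can be checked exhaustively for a small finite X. The sketch below assumes X = {0, 1, 2} and runs through all 8^3 = 512 functions from X to P(X); `diagonal_set` is a name chosen for this illustration.

```python
from itertools import combinations, product

X = (0, 1, 2)
power_set = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def diagonal_set(f):
    """A = {x in X | x not in f(x)} for a function f given as a dict."""
    return frozenset(x for x in X if x not in f[x])

# For every function f : X -> P(X), the diagonal set is missing from the range.
never_hit = all(
    diagonal_set(dict(zip(X, images))) not in images
    for images in product(power_set, repeat=len(X))
)
print(never_hit)  # True
```

The proof itself does not depend on X being finite; the enumeration merely illustrates it.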

7 Finite and Hereditarily Finite Sets

Example 6.3 considers the set of even numbers which is a proper subset of N but still has the same cardinality as N. This is no longer possible for finite sets. A proper subset of a finite set is strictly smaller. For example, for the set {0, 1, 2, 3, 4} identified with the natural number 5, one has |{0, 2, 3, 4}| < |{0, 1, 2, 3, 4}|.

Theorem 7.1. Let n ∈ N. If f : n → n is a one-to-one function, then f is onto.

Furthermore, if n ∈ N and u ⊆ n then |u| = |m| for some m ∈ {0, 1, 2, . . . , n}.

Proof. This is proven by induction. When n = 0, the statement holds trivially. Let f : S(n) → S(n) be a one-to-one function.

Either there is no m ∈ n with f(m) = n. Then f[n] ⊆ n and f[n] = n by induction hypothesis, that is, every m < n is in f[n]. Thus f(n) ≤ n but not f(n) < n. Therefore f(n) = n and f is onto.

Or there is some m ∈ n with f(m) = n. Now let f̃(m) = f(n), f̃(n) = n and f̃(k) = f(k) for all k ∈ S(n) − {m, n}. f̃ and f have the same range since f̃ was obtained from f by interchanging the values at m and n. f̃ satisfies now the case “Either” above and is onto. Then also f is onto.

For the second statement, assume that u, n are given with n ∈ N and u ⊆ n. Now define by recursion a function g with g(0) = 0 and

g(S(k)) = g(k) if k ∉ u;
g(S(k)) = S(g(k)) if k ∈ u.

Assume now that x, y ∈ u and x, y are different elements. Then either x < y or y < x, say the first. Thus S(x) ≤ y and thus g(y) ≥ g(S(x)) > g(x). So g is one-to-one on u (although not outside u). Now let m = g(n). If ℓ ∈ m then there is a minimal k with g(S(k)) > ℓ. It follows that g(S(k)) > g(k), g(S(k)) = S(g(k)), k ∈ u and g(k) = ℓ. Thus m = g[u] and g is one-to-one on u, so |u| = |m|.
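The counting function g from the second part of the proof can be computed for a concrete subset. This is a sketch assuming n = 6 and u = {1, 3, 4}; the function name `rank_function` is hypothetical.

```python
def rank_function(u, n):
    """Return [g(0), ..., g(n)] with g(0) = 0 and g(k + 1) = g(k) + 1 iff k in u."""
    g = [0]
    for k in range(n):
        g.append(g[k] + 1 if k in u else g[k])
    return g

u, n = {1, 3, 4}, 6
g = rank_function(u, n)
m = g[n]

print(g)                         # [0, 0, 1, 1, 2, 3, 3]
print(m)                         # 3
print(sorted(g[x] for x in u))   # [0, 1, 2]
```

On u the function g takes the values 0, 1, 2 exactly once each, so it maps u bijectively onto m = 3.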

Deﬁnition 7.2. A set X is ﬁnite iﬀ there is some n ∈ N such that |X| = |n|. A set

X is inﬁnite iﬀ X is not ﬁnite.

So the ﬁnite subsets of N are intuitively those which can be completely listed, for ex-

ample {2, 3, 5}, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and {1, 5, 25, 125, 625, 3125}. Inﬁnite sets can

never be written down completely as the examples {0, 1, 2, 3, . . .} of the natural num-

bers and {0, 2, 4, 6, 8, . . .} of the even numbers show. From a practical point of view,

this is also true for some ﬁnite sets like {0, 1, 2, 3, . . . , 1723907238947, 1723907238948,

1723907238949}.

Theorem 7.3. Let X, Y, Z be ﬁnite sets and f be a function.

1. Every subset of X is ﬁnite.

2. If X ⊂ Y then |X| < |Y |.

3. f[X] is ﬁnite.

4. X ∪ Y is ﬁnite.

5. If every element of Z is finite then ⋃Z is finite.

6. P(X) is ﬁnite.

Proof. 1. Let A ⊆ X. By assumption, there is a one-to-one and onto mapping f from X to some n ∈ N. Let u = f[A]. By Theorem 7.1 there is a one-to-one mapping g from u onto some m ∈ N. Now one defines for all b ∈ A the mapping b ↦ g(f(b)) which then is a one-to-one mapping from A onto m. Thus |A| = |m| and A is finite.

2. Since Y is finite, there is a natural number n and a bijection g : n → Y. Assume furthermore that X ⊆ Y and that |X| = |Y|. The second property implies that there is a bijection h : Y → X. Now let h′ map every m ∈ n to g^{-1}(h(g(m))). h′ is one-to-one and maps n to a subset of n. By Theorem 7.1, h′ is onto and h′[n] = n. Thus X = h[g[n]] = g[g^{-1}[h[g[n]]]] = g[h′[n]] = g[n] = Y. So |X| = |Y| does not hold for a proper subset of Y.

3. There is a bijection g : n → X for some natural number n. Now let h(k) = f(g(k)) for all k ∈ n and define u = {k ∈ n : h(k) ∉ h[k]} as the set of all numbers k for which h(k) takes a value not taken by any h(ℓ) with ℓ < k. Then h[u] = h[n] = f[X] and h is one-to-one on u. By Theorem 7.1, |u| = |m| for some m ∈ N and it follows that |f[X]| = |h[u]| = |u| = |m|. That is, f[X] is a finite set.

4. As X, Y are finite, there are natural numbers n, m and mappings h_X, h_Y such that X = h_X[n] and Y = h_Y[m]. Now define g(k) = h_X(k) for k = 0, 1, . . . , n − 1 and g(k) = h_Y(k − n) for k = n, n + 1, . . . , n + m − 1. Then the finite set {0, 1, . . . , n + m − 1} is the domain of g and X ∪ Y the range of g. It follows from item 3 that X ∪ Y is finite.

5. As Z is finite, Z = {z_0, z_1, . . . , z_{n−1}} for some finite index set n. By assumption, z_0, z_1, . . . , z_{n−1} are all finite sets. Now define u_0 = ∅ and for every k ∈ n inductively u_{k+1} = u_k ∪ z_k. Clearly u_0 is finite and by induction over k it follows that u_{k+1} is finite as it is the union of two finite sets. Thus u_n = ⋃Z is finite.

6. It is verified by induction that P(X) has 2^n elements and is finite whenever X has n elements, that is, can be bijectively mapped to n. This is obviously true for X = ∅ where P(X) = {∅} contains 1 = 2^0 elements. Assume that the inductive hypothesis for a set X having n elements is proven. Furthermore, 2^n ∈ N and every set of 2^n elements is finite. Now let Y = X ∪ {x} have S(n) elements, that is, x ∉ X. For every subset Z of X there are two subsets Z, Z ∪ {x} of Y, thus the quantity of subsets of Y is two times as large as the quantity of subsets of X. It follows that |P(Y)| = 2 · |P(X)|. Thus P(Y) has 2^S(n) = 2^|Y| elements and is finite again.

Exercise 7.4. Let X be ﬁnite. Prove that the set of all functions from X to X is

ﬁnite.

The set {{N}} is finite since it contains only the set {N}. That set is finite again, but it contains the infinite set N. That is, going along the iterated membership relation of {{N}}, one ends up in an infinite set.

A set is called hereditarily ﬁnite if this does not happen. That is, if A is hereditarily

ﬁnite, then not only A itself but also all of its elements, all of the elements of its

elements and so on are ﬁnite. Most easily, this is deﬁned by looking at the transitive

closure.


Definition 7.5. A set A is called hereditarily finite iff every element of TC(A) is finite. Furthermore, let V_ω = {x ∈ V | x is hereditarily finite}.

Theorem 7.6. A is hereditarily finite iff TC(A) is finite.

Proof. If TC(A) is finite then A is also hereditarily finite. Assume now that TC(A) is infinite. Let B = {x ∈ TC(A) | TC(x) is finite}. Clearly B ⊂ TC(A) as A ∉ B. By the Axiom of Foundation there is X ∈ TC(A) − B such that no element of X is in TC(A) − B, thus X ⊆ B. Now TC(Y) is finite for all Y ∈ X and thus {TC(Y) | Y ∈ X} is a set of finite sets. Thus TC(X) = {X} ∪ ⋃{TC(Y) | Y ∈ X} is infinite only if X is infinite, hence X is infinite and A not hereditarily finite.

The natural numbers had been introduced by iterations of the operation x ↦ x ∪ {x} starting with the empty set. This permits one to write down every natural number explicitly, for example, 3 is {∅, {∅}, {∅, {∅}}}. The intuition behind this is to call a set hereditarily finite iff it can be written down explicitly on paper. In practice, one encounters of course the problem that already sets corresponding to moderately sized numbers like 275 require more symbols to be written down explicitly than there are atoms in the universe; it is estimated that there are between 10^78 and 10^82 atoms in the universe and between 10^48 and 10^52 atoms on the Earth.

Example 7.7. The set N is inﬁnite but all its elements are hereditarily ﬁnite. The

set {1, {0, 2}, {0, 3}} is hereditarily ﬁnite. The set {{N}} is ﬁnite but not hereditarily

ﬁnite.

Property 7.8. A ﬁnite set x is hereditarily ﬁnite iﬀ every y ∈ x is hereditarily ﬁnite.

Theorem 7.9. The class V_ω is actually a set and coincides with the smallest set W satisfying

1. ∅ ∈ W;

2. if v ∈ W then {v} ∈ W;

3. if v, w ∈ W then v ∪ w ∈ W.

Proof. Clearly ∅ is hereditarily ﬁnite. If v is hereditarily ﬁnite, so is {v}. If v, w

are hereditarily ﬁnite then they are ﬁnite sets of hereditarily ﬁnite sets. It follows

that v ∪ w is a ﬁnite set and that all its members are hereditarily ﬁnite since they

are members of either v or w (or both). Thus V_ω is closed under the three operations given.

The next step is to construct a function f from N onto V_ω in order to show that it is a set. This function is first constructed as a function from N into V and the properties are shown later. The inductive construction goes as follows:

• f(0) = ∅;

• f(2^n + m) = f(m) ∪ {f(n)} for n ∈ N and m ∈ {0, 1, . . . , 2^n − 1}.
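The inductive construction of f can be tried out on small arguments. The following is only a sketch: it models hereditarily finite sets by Python's frozenset, and the helper von_neumann is a hypothetical name introduced here for the coding of natural numbers by x → x ∪ {x}.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(k: int) -> frozenset:
    """Return the hereditarily finite set assigned to the natural number k."""
    if k == 0:
        return frozenset()              # f(0) = ∅
    n = k.bit_length() - 1              # the unique n with 2**n <= k < 2**(n+1)
    m = k - 2 ** n                      # then k = 2**n + m with 0 <= m < 2**n
    return f(m) | frozenset({f(n)})     # f(2**n + m) = f(m) ∪ {f(n)}


def von_neumann(n: int) -> frozenset:
    """The natural number n coded as a set, built by iterating x -> x ∪ {x}."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset({s})
    return s
```

For instance, f(1) is the set 1 = {∅}, f(3) is the set 2 and f(11) is the set 3, since 11 = 2^3 + 3. On an initial segment such as range(256) the function takes no value twice, which is consistent with f counting V_ω without repetition.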

One can show that f[N] ⊆ V_ω: Assume that k would be the least number with f(k) ∉ V_ω. Then k > 0 as f(0) = ∅ and ∅ ∈ V_ω. So there are n, m ∈ N with m < 2^n ≤ 2^n + m = k. As n < k and m < k, f(n) and f(m) are both in V_ω. Thus {f(n)} and {f(n)} ∪ f(m) are also both in V_ω. It follows that f(k) ∈ V_ω in contradiction to the assumption. So one actually has f[N] ⊆ V_ω. Note that this argumentation also works with every set W satisfying 1., 2. and 3.; thus one has that f[N] ⊆ W for such a set.

Now consider any set z ∉ f[N]. It follows from the Axiom of Foundation that there is a set x ∈ TC(z) − f[N] with x ∩ (TC(z) − f[N]) = ∅. As x ⊂ TC(z) one can conclude that x ⊆ f[N]. Now define inductively a function g such that f(g(n)) = f[n] ∩ x as follows: g(0) = 0 and

g(S(n)) = g(n) + 2^n if f(n) ∈ x;
g(S(n)) = g(n) if f(n) ∉ x.

One can show inductively that 0 ≤ g(n) < 2^n for all n ∈ N. Furthermore, f(g(0)) = f[0] ∩ x as g(0) = 0 and f[0] = f[∅] = ∅. Inductively, if f[n] ∩ x = f(g(n)) and f(n) ∉ x then g(n + 1) = g(n) and f(g(n + 1)) = f(g(n)) = f[n] ∩ x = f[n + 1] ∩ x; if f[n] ∩ x = f(g(n)) and f(n) ∈ x then g(n + 1) = g(n) + 2^n and f(g(n + 1)) = {f(n)} ∪ f(g(n)) = {f(n)} ∪ (f[n] ∩ x) = f[n + 1] ∩ x. It follows that x = ⋃ f[g[N]].

Since x ∉ f[N], f(g(m)) ≠ x for all m ∈ N and for every m ∈ N there is an n ∈ N with n > m and g(n) > g(m). In particular, f(g(n)) is a proper superset of f(g(m)). It follows that x is infinite and x ∉ V_ω. As x ∈ TC(z), z ∉ V_ω as well. As a consequence, V_ω = f[N] and V_ω is a set. Furthermore, V_ω ⊆ W for all sets W satisfying the conditions 1., 2., 3. as this had been proven above for f[N] in place of V_ω.

Exercise 7.10. Prove that V_ω satisfies the following property: if x ∈ V_ω and y ⊆ x or y ∈ x, then y ∈ V_ω. Show that N does not satisfy this property, but that some proper infinite subclass of V_ω does.

Exercise 7.11. Determine all x_0 ∈ V which satisfy that there are no x_1, x_2, x_3, x_4 ∈ V with x_1 ∈ x_0, x_2 ∈ x_1, x_3 ∈ x_2, x_4 ∈ x_3. The set {{∅}} is such an x_0: although x_1 = {∅} and x_2 = ∅ exist, x_3 and x_4 do not exist. The set {{∅, {{∅}}}} does not qualify.


8 Countable Sets

Since N and P(N) do not have the same cardinality, there are several diﬀerent cardi-

nalities of inﬁnite sets. The set N has the least inﬁnite cardinality and this cardinality

is called “countable”. Countable sets are the only infinite sets where one can, at least theoretically, give every element a name; for example, every natural number can

be written down as a ﬁnite sequence of digits. The same applies to all other types

of objects which can be coded with a ﬁnite alphabet. For example, the set of all

possible novels is countable since one can write down each novel using an alphabet

plus punctuation symbols and special characters. The same is true for the set of all

computer programs. The word “countable” itself comes from the fact that one can

count one by one all the things which can be written down explicitly. In the case of

words (or strings) over the English alphabet, “a” is the first word, “b” the second, “z” the twenty-sixth, “aa” the twenty-seventh, “az” the fifty-second, “ba” the fifty-third, “zx” the seven-hundredth, “zz” the seven-hundred-second and “aaa” the seven-hundred-third word. So there is a surjective mapping from the natural numbers to all finite nonempty strings over the English alphabet which counts these strings. This fact motivates the following definition; to avoid saying “infinite and countable” all the time, countable sets are defined to be infinite.
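The counting of words described above can be reproduced mechanically. The sketch below assumes the alphabet is the 26 lowercase letters and that counting starts at 1, as in the text; the names word_index and word_at are introduced here for illustration only.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters "a" .. "z"


def word_index(word: str) -> int:
    """Position (starting at 1) of a nonempty word: shorter words first, then alphabetical."""
    index = 0
    for ch in word:
        index = index * len(ALPHABET) + ALPHABET.index(ch) + 1
    return index


def word_at(index: int) -> str:
    """Inverse of word_index: the word found at a given position (index >= 1)."""
    letters = []
    while index > 0:
        index, r = divmod(index - 1, len(ALPHABET))
        letters.append(ALPHABET[r])
    return "".join(reversed(letters))
```

With these definitions, word_index("zx") evaluates to 700 and word_at(703) to "aaa", matching the positions quoted in the text.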

Definition 8.1. A set X is at most countable iff there is a surjective function f : N → X. X is countable iff X is at most countable and infinite. X is uncountable iff |X| > |N|.

Remark 8.2. The sets of natural numbers and integers are examples of countable

sets. Any ﬁnite set is at most countable but not countable.

If g is a function and X at most countable, then g[X] is at most countable: By

deﬁnition, there is a surjective function f : N → X which witnesses that X is at

most countable. The composition n ↦ g(f(n)) then witnesses that g[X] is at most

countable, too.

Deﬁnition 8.1 reﬂects the property that one can enumerate the elements of a countable

set, that is, that there is a surjective function from N to X. But the condition

“|X| ≤ |N|” is deﬁned the other way round: there has to be a one-to-one function

from X to N. The next result shows that both ways to deﬁne “at most countable”

and “countable” are equivalent.

Proposition 8.3. A set X is at most countable iﬀ |X| ≤ |N|. An inﬁnite set X is

at most countable iﬀ |X| ≤ |N| iﬀ |X| = |N|.

Proof. Let X be at most countable. Then there is a function f : N → X which is surjective. For every x ∈ X let g(x) be the minimum of f^{−1}(x) = {y ∈ N | f(y) = x}. The value g(x) exists because f is surjective. Furthermore, if g(x) = g(y) then x = f(g(x)), y = f(g(y)) and x = y, thus g is injective. So |X| ≤ |N|.

Assume now that X is inﬁnite and that f is as above. For every number n there

is a ﬁrst natural number m such that |S(n)| ≤ |f[S(m)]|; let h(n) denote this number

m for given n. Due to the deﬁnition of h one has that |f[S(h(n))]| > |f[h(n)]|. In

particular, f(h(n)) ∈ f[h(S(n))] −f[h(n)]. It follows that the mapping n → f(h(n))

is one-to-one and witnesses that |N| ≤ |X|. Thus |X| = |N| by Theorem 6.9.

Since the identity restricted to a subset Y of a set X is a one-to-one mapping from Y

to X, one has the following corollary.

Corollary 8.4. If X is countable and Y ⊆ X then Y is at most countable; if Y is

inﬁnite then |Y | = |X|.

Example 8.5. The set Q of all rationals is countable.

Proof. For every rational q there are unique numbers n(q), m(q), k(q) such that the

following conditions hold:

1. if q = 0 then n(q) = 0, m(q) = 0 and k(q) = 5;

2. if q > 0 then q = n(q)/m(q), k(q) = 7 and n(q), m(q) do not have a common prime factor;

3. if q < 0 then q = −n(q)/m(q), k(q) = 11 and n(q), m(q) do not have a common prime factor.

Let f(q) = 2^{n(q)} · 3^{m(q)} · k(q). It is easy to see that f is a one-to-one mapping from Q

to N. Thus |Q| ≤ |N|. Since N ⊆ Q, the cardinality of both sets is the same and Q is

countable.
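The coding of rationals used in this proof can be checked on a finite sample. The sketch below relies on the fact that Python's Fraction stores every rational in lowest terms with a positive denominator, so that the numerator and denominator play the roles of n(q) and m(q); the function name code is a choice made here.

```python
from fractions import Fraction


def code(q: Fraction) -> int:
    """Code the rational q as 2**n * 3**m * k following the three cases of the proof."""
    if q == 0:
        n, m, k = 0, 0, 5
    elif q > 0:
        n, m, k = q.numerator, q.denominator, 7
    else:
        n, m, k = -q.numerator, q.denominator, 11
    return 2 ** n * 3 ** m * k
```

For example, code(Fraction(1, 2)) is 2 · 3^2 · 7 = 126. A test on a finite sample of rationals cannot replace the proof of injectivity, but it illustrates why different rationals receive different codes: the exponents of 2 and 3 and the remaining factor 5, 7 or 11 can be read off from the prime factorisation.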

Proposition 8.6. If X and Y are at most countable so is X × Y.

Proof. Since X, Y are at most countable, there are one-to-one mappings f : X → N and g : Y → N. Now let h(x, y) = 2^{f(x)} · 3^{g(y)}. The function h is a one-to-one mapping from X × Y to N. Thus |X × Y| ≤ |N|.

Remark 8.7. If X and Y are infinite, then one can take h even such that h is bijective. Cantor constructed an explicit bijection between N × N and N. He used the function p mapping m, n to 1/2 · (m + n) · (m + n + 1) + m; so p(0, 0) = 0, p(0, 1) = 1, p(1, 0) = 2, p(0, 2) = 3, p(1, 1) = 4, p(2, 0) = 5, p(0, 3) = 6 and so on. The verification that p really is a bijection is left to the reader.
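A short sketch of the pairing function p may help with the verification. The inverse function unpair is not given in the text; it is one possible way to recover (m, n) from p(m, n) and is stated here as an assumption of this example.

```python
from math import isqrt


def pair(m: int, n: int) -> int:
    """The function p of the remark: p(m, n) = (m + n)(m + n + 1)/2 + m."""
    return (m + n) * (m + n + 1) // 2 + m


def unpair(z: int) -> tuple:
    """Recover (m, n) from z = pair(m, n)."""
    s = (isqrt(8 * z + 1) - 1) // 2     # s = m + n, the largest s with s(s+1)/2 <= z
    m = z - s * (s + 1) // 2
    return m, s - m
```

The pairs with m + n = s occupy exactly the values from s(s+1)/2 to s(s+1)/2 + s, so consecutive diagonals fill N without gaps or overlaps; this is the core of the argument that p is a bijection.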

Proposition 8.6 can be adapted to show that X × Y and X ∪ Y are countable

whenever X and Y are countable sets.

For the next result, consider just the function f built in Theorem 7.9.

Property 8.8. The set V_ω of all hereditarily finite sets is countable.

Exercise 8.9. Let D = {f : N → N | ∀n(f(S(n)) ≤ f(n))} be the set of all

decreasing functions. Show that D is countable.

Theorem 8.10. Let A be nonempty and at most countable and A* denote the set of finite sequences of members in A. Then A* is countable. Furthermore, the set of all finite subsets of A is at most countable.

Proof. Let g : N → A be a surjective function and consider the following set B: x ∈ B ⇔ x is of the form {(0, b_0), (1, b_1), (2, b_2), . . . , (n − 1, b_{n−1})} for some natural numbers n and b_0, b_1, . . . , b_{n−1} where the latter are used to code elements of A. As ordered pairs are just coded sets, one can easily see that B ⊆ V_ω. Now mapping the empty set to the empty sequence and {(0, b_0), (1, b_1), (2, b_2), . . . , (n − 1, b_{n−1})} to g(b_0) g(b_1) g(b_2) . . . g(b_{n−1}) shows that A* is at most countable. As A* contains for each n ∈ N a sequence of length n, A* is infinite and countable.

Let E be the set of all finite subsets of N. The set E is at most countable as V_ω is

countable. Furthermore, mapping each x ∈ E to g[x] produces a surjective mapping

from E onto all ﬁnite subsets of A and thus the set of all ﬁnite subsets of A is at most

countable.

Exercise 8.11. Let A contain four elements, the symbol ∅, the bracket {, the bracket } and the comma in this ordering; that is, the symbol ∅ comes first and the comma comes last. Let <_ll be the length-lexicographic ordering of the set A* of all strings over A: if v is shorter than w then v <_ll w; if v, w have the same length then v <_ll w ⇔ v <_lex w. Now let f : V_ω → A* map every set x in V_ω to the first expression describing x; for example,

• f({1, 2}) =“{{∅}, {∅, {∅}}}”;

• f({0, 3}) =“{∅, {∅, {∅}, {∅, {∅}}}}”.

As ∅ <_ll {}, the symbol “{}” is never used to describe the empty set; this convention is also applied in this text. Check which of the following facts are true:

1. the length of f(x) is odd for every x ∈ V_ω;

2. the length of f(P(x)) is the product of the length of f(x) and the cardinality of P(x) plus 1;

3. if f(x) = {y} then f(S(x)) = {y, {y}};

4. f(2) = {∅, {∅}}.

Furthermore, find a formula giving the length of f(n) for every n and determine which of the following numbers is the length of f(10): 42, 100, 1000, 1001, 1022, 1023, 1024, 2047, 4096, 256^2, 10^10 − 1, 2^256, 1023^1023.

If the length of f(x) is n and f(y) is m, what is the length of f((x, y)) for the

ordered pair (x, y)?

Exercise 8.12. Let A be the set of algebraic real numbers, that is, the set of all r ∈ R for which there are n ∈ N and z_0, z_1, . . . , z_n ∈ Z such that z_n ≠ 0 and z_0 + z_1 r + z_2 r^2 + . . . + z_n r^n = 0. Note that such a polynomial of degree n can have up to n places r which are mapped to 0. Show that A is countable by giving a one-to-one mapping from A into N.

Example 8.13. The set F of all continuous functions f : Q → Q is uncountable.

Proof. Given A ⊆ Z, one can map A to the continuous function f ∈ F given as

f(q) = 0 if z ∉ A, z + 1 ∉ A;
f(q) = q − z if z ∉ A, z + 1 ∈ A;
f(q) = z + 1 − q if z ∈ A, z + 1 ∉ A;
f(q) = 1 if z ∈ A, z + 1 ∈ A;

where z is the unique integer with z ≤ q < z + 1. Since this mapping is one-to-one, |P(Z)| ≤ |F|. As |P(Z)| = |P(N)| > |N|, the set F is uncountable.
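The function assigned to a set A ⊆ Z takes the value 1 at the integers in A, the value 0 at the other integers, and is linear in between. The following sketch evaluates it for a finite A; the name tent and the use of Fraction for rational arguments are choices made for this illustration.

```python
from fractions import Fraction
from math import floor


def tent(A: set, q: Fraction) -> Fraction:
    """Value at q of the function assigned to A in the proof of Example 8.13."""
    z = floor(q)                        # the unique integer with z <= q < z + 1
    left, right = z in A, (z + 1) in A
    if left and right:
        return Fraction(1)
    if right:
        return q - z                    # rises from 0 at z to 1 at z + 1
    if left:
        return z + 1 - q                # falls from 1 at z to 0 at z + 1
    return Fraction(0)
```

Since the set A can be read off from the values at the integers, as {z | f(z) = 1}, different sets give different functions; this is why the mapping is one-to-one.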

9 Graphs and Orderings

Consider the following list which relates words with the same meaning to each other,

say English words to their Spanish counterparts: (dog, el perro), (cat, el gato), (cat, la gata), (boy, el niño), (girl, la niña), (cow, la vaca), (English, el Inglés), (Spain, España), (blue, azul). Since the relation is not unique, “cat” can be “el gato” (if

male) or “la gata” (if female), one cannot represent the structure as a function. The

most common more general notion studied by mathematicians is that of directed

graphs. It has a set of vertices, in this example the set of all words in either English

or Spanish language. Furthermore, it has a set of pairs of vertices, the edges, in


this example the edges consist of one English and one Spanish word whose meaning

is related to each other in the way that there is an object or concept which can be

denoted by the ﬁrst word in English and the second word in Spanish: there exists a

male cat and thus (cat, el gato) is in the set of edges, furthermore there exists a female

cat and thus (cat, la gata) is in the set of edges. These pairs are ordered, the English

word is always the ﬁrst one. For the reader’s convenience, the formal introduction of

graphs is repeated from Deﬁnition 3.5.

Definition 9.1. A (directed) graph is a pair (G, E) such that G is a set and E ⊆ G × G.

The members of G are called vertices and the members of E are called edges.

Example 9.2. Every function f : X → Y can be represented as the graph

(X ∪ Y, {(x, y) ∈ (X ∪ Y) × (X ∪ Y) | x ∈ X ∧ y ∈ Y ∧ y = f(x)}).

One can also consider graphs with a class as their domain: (V, ∈) is a graph and (V_ω, ∈) is a graph where in the latter case “∈” is restricted to the domain V_ω. Also (V, {(x, S(x)) | x ∈ V}) is a graph.

Exercise 9.3. A graph (G, E) is called bipartite if there are two subsets X, Y of

G such that X ∩ Y = ∅ and every pair (x, y) ∈ E is actually in X × Y ∪ Y × X.

An English-Russian dictionary is a bipartite graph because one can take X to be the

words written in the Latin alphabet and Y to be the words written in the Cyrillic

alphabet. An English-Spanish dictionary is not bipartite, for example place names

like “Los Angeles” appear in the same spelling in both languages. By the way, the

mentioned name is of Spanish origin and has the English translation “the angels”.

Which of the following graphs are bipartite? The set of vertices is N and the set E_n of edges is specified below; note that E_n ⊆ N × N.

1. (x, y) ∈ E_1 ⇔ x = y,

2. (x, y) ∈ E_2 ⇔ x < y,

3. (x, y) ∈ E_3 ⇔ 12 < x + y < 18,

4. (x, y) ∈ E_4 ⇔ x > 4 ∧ (y = x^2 ∨ y = x^4),

5. (x, y) ∈ E_5 ⇔ ∃z > 0 (x = 2^z ∧ y = 3^z),

6. (x, y) ∈ E_6 ⇔ ∃z > 0 (x ∈ {2^z, 3^z} ∧ y ∈ {5^z, 7^z}),

7. (x, y) ∈ E_7 ⇔ y = S(x) ∧ x is even.


If one considers the graph (V, ⊂) instead of (V, ∈), one has additional properties not

present at (V, ∈). The main additional property is transitivity. On the other hand,

there are still incomparable elements of V ; for example {{{0}}} and {0, 1}. This is

captured by the deﬁnition of a partially ordered set.

Deﬁnition 9.4. A set G is partially ordered by a relation < iﬀ this relation is

antisymmetric and transitive. These two properties are deﬁned as follows:

1. the relation < is antisymmetric iﬀ there are no x, y ∈ G with x < y and y < x;

2. the relation < is transitive iﬀ x < z for all x, y, z ∈ G with x < y and y < z.

Note that a transitive relation < is antisymmetric iﬀ it is antireﬂexive, that is, iﬀ

there is no x ∈ G with x < x.

Convention 9.5. If G is partially ordered by < then < is called a partial ordering on

G and (G, <) is called a partially ordered set. One usually writes a < b if the ordering

is denoted by a symbol of the type < and (a, b) ∈ R if the ordering is denoted by a

letter like R.

Furthermore, the notation a ≤ b stands for a < b ∨ a = b. Similarly A ⊆ B stands

for A ⊂ B ∨ A = B.

If a ≤ b, then one says that a is less than or equal to b. If a < b, then one says

that a is less than b or a is smaller than b.

Example 9.6. Assume that G = {a, b, c, d} and a < b, b < c, a < c and d < c. Then

G is a partially ordered set where the elements a, d are incomparable. Furthermore,

c > b and b < c mean the same. The relation a < c is needed since the ordering <

would otherwise not be transitive. Graphically, the ordering looks like this:

    c
   / \
  b   d
 /
a
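The two defining properties can be checked mechanically for this small example. The sketch below represents the relation < as a set of pairs; the names LESS, is_antisymmetric, is_transitive and incomparable are introduced here for illustration.

```python
# The relation of Example 9.6 on G = {a, b, c, d}, written as a set of pairs (x, y) with x < y.
LESS = {("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")}


def is_antisymmetric(rel) -> bool:
    """There are no x, y with x < y and y < x (this also excludes x < x)."""
    return not any((y, x) in rel for (x, y) in rel)


def is_transitive(rel) -> bool:
    """Whenever x < y and y < z, also x < z."""
    return all((x, z) in rel for (x, y) in rel for (y2, z) in rel if y == y2)


def incomparable(x, y, rel) -> bool:
    """x and y are different and neither x < y nor y < x."""
    return x != y and (x, y) not in rel and (y, x) not in rel
```

Removing the pair ("a", "c") from LESS makes is_transitive return False, which mirrors the remark that a < c is needed for transitivity.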

Exercise 9.7. Let A = N − {0, 1} = {2, 3, 4, . . .} and let <_div be given by x <_div y ⇔ ∃z ∈ A (x · z = y). That is, x <_div y iff x is a proper divisor of y, so 2 <_div 8 but 2 ≮_div 2 and 2 ≮_div 5. Prove that (A, <_div) is a partially ordered set.

Example 9.8. The subset-relation ⊂ is a partial ordering of V. Furthermore, the relation R on V defined as (x, y) ∈ R ⇔ |x| < |y| is a partial ordering of V. The same holds for R′ given by (x, y) ∈ R′ ⇔ |P(x)| ≤ |y|. These two partial orderings are different since ({a, b}, {a, b, c}) is in R but not in R′.

Due to coding, the element-relation ∈ is an ordering of N where it coincides with

the natural less-than relation.

If |G| ≥ 2, then (G, ≠) is not a partially ordered set. Let a, b be distinct elements of G. Then a ≠ b and b ≠ a, but a ≠ a is false, thus the inequality is not transitive.

Example 9.9. The following relation ⊏ is a partial ordering of the set N × N × N:

(x, y, z) ⊏ (x′, y′, z′) ⇔ x ≤ x′ ∧ y ≤ y′ ∧ z ≤ z′ ∧ x + y + z < x′ + y′ + z′.

Proof. If (x, y, z) ⊏ (x′, y′, z′) then x + y + z < x′ + y′ + z′ and it cannot be that x′ + y′ + z′ < x + y + z. Thus ⊏ is antisymmetric.

To see that ⊏ is transitive, consider any three triples (x, y, z), (x′, y′, z′), (x″, y″, z″) such that (x, y, z) ⊏ (x′, y′, z′) and (x′, y′, z′) ⊏ (x″, y″, z″). By the transitivity of ≤ and < one has that x ≤ x″, y ≤ y″, z ≤ z″ and x + y + z < x″ + y″ + z″. Thus (x, y, z) ⊏ (x″, y″, z″) and ⊏ is transitive.

Note that the triples (0, 0, 1) and (0, 1, 0) are incomparable with respect to ⊏.

By using the term “partial ordering” this is explicitly permitted although it is not

mandatory.

Exercise 9.10. Prove that the following relations are partial orderings on N^N:

• f ⊏_1 g ⇔ ∃n ∀m > n (f(m) < g(m));

• f ⊏_2 g ⇔ ∀n (f(n) ≤ g(n)) ∧ ∃m (f(m) < g(m));

• f ⊏_3 g ⇔ ∀n (f(n) ≤ g(n)) ∧ ∃n (f(n) < g(n)) ∧ ∃n ∀m > n (f(m) = g(m));

• f ⊏_4 g ⇔ f(0) < g(0).

Determine for every ordering a pair of incomparable elements f, g such that neither f ⊏_m g nor g ⊏_m f nor f = g. For which of these orderings is it possible to choose the f of this pair (f, g) of examples such that f(n) = 0 for all n?

Remark 9.11 (Preordering). A relation ⊑ on a set G is called preordering iﬀ it is

transitive; it can also be reﬂexive, but this is not required here although some authors

require it in other books. The ordering < on G deﬁned as

x < y ⇔ x ⊑ y and not y ⊑ x

is a partial ordering generated from ⊑. Note that the symbol ≤ derived from < can

be more restrictive than ⊑: For example, the preordering ⊑ on V deﬁned by

x ⊑ y ⇔ |x| ≤ |y|


deﬁnes a partial ordering < on V such that

x < y ⇔ |x| < |y|

but the derived relation ≤ is then

x ≤ y ⇔ (x = y ∨ |x| < |y|)

and it would be that N−{0} ⊑ N is true but N−{0} ≤ N is false with respect to the

just deﬁned relations ⊑, <, ≤.

Proposition 9.12. Given a graph (G, E), one can define a preordering ≤_E by x ≤_E y iff there is a natural number n and a function f : S(n) → G with f(0) = x, f(n) = y and (f(m), f(S(m))) ∈ E for all m ∈ n.

Furthermore, if (G, E) is cycle-free, that is, if there is no n ∈ N − {0} and no function f : S(n) → G with f(0) = f(n) and ∀m ∈ n ((f(m), f(S(m))) ∈ E), then the partial ordering <_E generated from ≤_E satisfies x ≤_E y ⇔ x <_E y ∨ x = y as desired.

Proof. The preordering ≤_E can formally be defined as follows: For all x, y ∈ G,

x ≤_E y iff ∃n ∈ N ∃f ∈ G^{S(n)} (f(0) = x ∧ f(n) = y ∧ ∀m ∈ n ((f(m), f(S(m))) ∈ E)).

The transitivity is easy to see. Assume that f with domain S(n) witnesses x ≤_E y and g with domain S(m) witnesses y ≤_E z. Then define h : S(n + m) → G with h(k) = f(k) for k ∈ S(n) and h(n + k) = g(k) for k ∈ S(m). As f(n) = g(0) = y, this definition is not contradictory. Furthermore, for k ∈ {0, 1, . . . , n − 1} it holds that (h(k), h(S(k))) = (f(k), f(S(k))) ∈ E and for k ∈ {n, n + 1, . . . , n + m − 1} it holds that (h(k), h(S(k))) = (g(k − n), g(S(k − n))) ∈ E. As h(0) = x and h(n + m) = z, one has x ≤_E z.

Assume that x ≠ y and x ≤_E y. If (G, E) is cycle-free then it cannot be that y ≤_E x by the transitivity shown before, thus x <_E y in the way as <_E is derived from the preordering ≤_E. Furthermore, x ≤_E x as one could take n = 0 and consider the function f with domain S(0) = {0} and f(0) = x. Thus one has that x ≤_E y ⇔ x <_E y ∨ x = y.
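For a finite graph, the preordering ≤_E amounts to reachability along edges, where paths of length 0 are allowed. The following sketch computes it by a breadth-first search; the function names reachable, leq and less are chosen here and the edge set is assumed to be a finite set of pairs.

```python
from collections import deque


def reachable(edges, x):
    """All y with x <=_E y, that is, all vertices reachable from x by a path of length n >= 0."""
    seen = {x}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen


def leq(edges, x, y) -> bool:
    """The preordering: x <=_E y."""
    return y in reachable(edges, x)


def less(edges, x, y) -> bool:
    """The partial ordering generated from the preordering: x <=_E y and not y <=_E x."""
    return leq(edges, x, y) and not leq(edges, y, x)
```

On the cycle-free graph with edges (1, 2), (2, 3), (3, 4) one gets less(edges, 1, 4); on the cycle (1, 2), (2, 3), (3, 1) all vertices are related by ≤_E in both directions and therefore none by <_E.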

Remark 9.13. Note that in a cycle-free graph also (x, x) ∉ E for all x. This property gives then the additional property that x <_E y iff there is an n ∈ N − {0} and a function f : S(n) → G with f(0) = x, f(n) = y and ∀m ∈ n ((f(m), f(S(m))) ∈ E) so that <_E can be directly defined from (G, E). Therefore the case of cycle-free graphs is the most desirable one.

If a graph is not cycle-free but has some cycle of n different nodes x_0, x_1, . . . , x_{n−1} ∈ G with (x_m, x_{m+1}) ∈ E for all m < n − 1 and (x_{n−1}, x_0) ∈ E as well, then one has x_i ≤_E x_j but not x_i <_E x_j for all i, j ∈ n.

Well-founded graphs are cycle-free but not vice versa as the graph (Z, E) with E = {(z, z + 1) | z ∈ Z} is cycle-free but not well-founded. The ordering <_E is then the natural ordering on all integers.

One can define <_∈ on the whole universe V and obtains that TC(x) = {y ∈ V | y = x ∨ y <_∈ x} and y <_∈ x ⇔ y ∈ TC(x) − {x}. So these two notions stand in a very close correspondence.

Exercise 9.14. Let A = N − {0, 1} and <_div given by x <_div y ⇔ ∃z ∈ A (x · z = y) as in Exercise 9.7. Define a relation E ⊆ A × A by putting (x, y) into E iff there is a prime number z with x · z = y. So (2, 4) ∈ E, (2, 6) ∈ E, (2, 10) ∈ E but (2, 7) ∉ E, (2, 8) ∉ E and (2, 20) ∉ E. Show that (A, <_E) and (A, <_div) are identical partially ordered sets.

10 Linear Ordering

A linear ordering is a partial ordering with the additional property that any two

diﬀerent elements are comparable.

Definition 10.1. Let A be a set and R ⊆ A × A. One says that R is a linear ordering of A iff R satisfies the following properties for all a, b, c ∈ A:

antisymmetric: if (a, b) ∈ R then (b, a) ∉ R;

transitive: if (a, b) ∈ R and (b, c) ∈ R then (a, c) ∈ R;

comparable: either a = b or (a, b) ∈ R or (b, a) ∈ R.

The pair (A, R) is called a linearly ordered set iﬀ R is a linear ordering of A.

Example 10.2. Recall that the natural ordering < on N coincides with the ∈-relation

due to the way the natural numbers are coded into V ; that is, for all m, n ∈ N,

n = {0, 1, . . . , n −1} and m < n ⇔ m ∈ n.

It is easy to see that (N, <) is a linearly ordered set, that is, that < is antisymmetric,

transitive and comparable on N.

Example 10.3. The two orderings

(0, 0) <_1 (0, 1) <_1 (0, 2) <_1 (0, 3) <_1 . . . <_1 (1, 3) <_1 (1, 2) <_1 (1, 1) <_1 (1, 0) and

(0, 0) <_2 (0, 1) <_2 (0, 2) <_2 (0, 3) <_2 . . . <_2 (1, 0) <_2 (1, 1) <_2 (1, 2) <_2 (1, 3) <_2 . . .

on the set {0, 1} × N are linear.


Example 10.4. Let A = N × N. Define <_lex on A by

(m, n) <_lex (i, j) ⇐⇒ (m < i) ∨ (m = i ∧ n < j)

for all m, n, i, j ∈ N. Then (N × N, <_lex) is a linearly ordered set. This ordering is called the lexicographic ordering of N × N. Note that <_2 from the previous example is the restriction of <_lex to the domain {0, 1} × N.

Example 10.5. Let (A, <) be a linearly ordered set. Recall that A* contains every A-valued function whose domain can be represented by a natural number. For every f, g ∈ A* with domain m, n, respectively, define

f <_KB g ⇔ ∃k ∈ m ∩ S(n) ((f↾k = g↾k) ∧ (k = n ∨ (k < n ∧ f(k) < g(k)))).

This is an ordering which is called the Kleene-Brouwer ordering.

Proof. It is shown that (A*, <_KB) satisfies the properties necessary to be a linearly ordered set.

Antireflexiveness. Assume that f = g and k ∈ m. Then k < n and f(k) = g(k), thus it cannot be that f <_KB g.

Transitivity. Assume that f <_KB g and g <_KB h. Let m, n, o be the corresponding domains and k be the minimum of the parameters of the same name in order to establish that f <_KB g and g <_KB h. Note that f↾k = g↾k = h↾k. The fact that f <_KB g gives that k < m. Similarly g <_KB h gives that k < n. If k = o then f <_KB h and transitivity holds. If k < o then one has f(k) ≤ g(k) ≤ h(k) and one of these must be proper since k is the parameter for either f <_KB g or g <_KB h. Thus f(k) < h(k) and f <_KB h. Again transitivity holds.

Comparability. Assume that f ≠ g. Then there is a minimal number k such that either k ∉ n or k ∉ m or f(k) ≠ g(k). Since f↾k = g↾k but f ≠ g it cannot happen that k ∉ n ∪ m, that is, k = n = m. So one has exactly one of the following four cases.

1. k ∈ n ∩ m and f(k) < g(k). Then f <_KB g.

2. k ∈ n ∩ m and g(k) < f(k). Then g <_KB f.

3. k = n and k < m. Then f <_KB g.

4. k = m and k < n. Then g <_KB f.


So it follows from this case distinction that <_KB is indeed a linear ordering.
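The definition of the Kleene-Brouwer ordering translates directly into a comparison of finite sequences. The sketch below represents members of A* as Python tuples; the function name kb_less and the optional parameter less, which stands for the ordering < on A, are assumptions of this illustration.

```python
def kb_less(f, g, less=lambda a, b: a < b) -> bool:
    """Decide f <_KB g for finite sequences f, g (tuples) over a linearly ordered set."""
    m, n = len(f), len(g)
    for k in range(min(m, n + 1)):      # k ranges over m ∩ S(n), that is, k < m and k <= n
        if f[:k] != g[:k]:              # the restrictions of f and g to k must agree
            break
        if k == n or less(f[k], g[k]):
            return True
    return False
```

For example, kb_less((1, 2), (1,)) holds while kb_less((1,), (1, 2)) does not: a proper extension of a sequence is smaller than the sequence itself, so the empty sequence is the greatest element of A*.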

Exercise 10.6. Let (A, <) be a linearly ordered set and B = A^N. Define

f <_lex g ⇔ ∃k ∈ N (f↾k = g↾k ∧ f(k) < g(k)).

Furthermore, let C = A*. The lexicographic ordering on A* differs from the Kleene-Brouwer ordering in the sense that it is reversed iff f, g coincide on the intersection of their domains m, n. That is,

f <_lex g ⇔ ∃k ∈ S(m) ∩ n ((f↾k = g↾k) ∧ (k = m ∨ (k < m ∧ f(k) < g(k)))).

Show that (B, <_lex) and (C, <_lex) are linearly ordered sets. Assuming that A = {0, 1, 2, . . . , 9} with the usual ordering, put the following elements of C into lexicographic order: 120, 88, 512, 500, 5, 121, 900, 0, 76543210, 15, 7, 007, 00.

Example 10.7. Let R be the set of reals and < be the natural ordering of the reals.

Then (R, <) is a linearly ordered set. This ordering can be inherited to the subsets

Z of integers and Q of rationals, thus (Z, <) and (Q, <) are linearly ordered sets as

well.

Deﬁnition 10.8. Let (A, <) be a linearly ordered set. Recall that a ≤ b stands for

a = b ∨ a < b. Let B be a nonempty subset of A, a ∈ A and b ∈ B.

1. a is a lower bound of B iﬀ a ≤ c for all c ∈ B.

2. a is an upper bound of B iﬀ c ≤ a for all c ∈ B.

3. b is the least element of B (with respect to <) iﬀ b is a lower bound for B.

4. b is the greatest element of B (with respect to <) iﬀ b is an upper bound of B.

5. a is the inﬁmum of B iﬀ a is the greatest lower bound of B.

6. a is the supremum of B iﬀ a is the least upper bound of B.

7. B is bounded from above in (A, <) iﬀ B has an upper bound in A.

8. B is bounded from below in (A, <) iﬀ B has a lower bound in A.

9. B is bounded iﬀ B is bounded both from above and from below.


Example 10.9. Consider the following subsets of (R, <):

X = {x ∈ R | −2 ≤ x < 5},

Y = {y ∈ R | y ≥ 5}.

The set X is bounded in (R, <), for example −1024 is a lower and 1024 is an upper

bound. Furthermore, 0 ∈ X and thus X is not empty. Therefore X has an inﬁmum

and a supremum. The inﬁmum of X is −2 and the supremum of X is 5. Since

−2 ∈ X, X has a least element, namely −2. But 5 ∉ X. Thus X does not have a

greatest element.

The set Y has the inﬁmum 5 which is also a lower bound. But Y has no upper

bound in (R, <). Thus Y is unbounded and has no supremum.

Exercise 10.10. Determine which of the following subsets of the real numbers R

have a lower and upper bound. If so, determine the inﬁmum and supremum and

check whether these are even the least and greatest element of these sets.

1. A = {a ∈ R | ∃b ∈ R (a^2 + b^2 = 1)};

2. B = {b ∈ R | b^3 − 4 · b < 0};

3. C = {c ∈ R | sin(c) > 0};

4. D = {d ∈ R | d^2 < π^3};

5. E = {e ∈ R | sin(π/2 · e) = e/101}.

Definition 10.11. Two partially ordered sets (A, <_1) and (B, <_2) are isomorphic, denoted by (A, <_1) ≅ (B, <_2), iff there is a bijection f : A → B such that for all a, b ∈ A, a <_1 b if and only if f(a) <_2 f(b). Such functions are called isomorphisms.

For partially ordered sets (A, <_1) and (B, <_2), a function f : A → B is order preserving iff the implication a <_1 b ⇒ f(a) <_2 f(b) is true for all a, b ∈ A. Assume that (A, <) is a linearly ordered set and f : A → B is order-preserving. Then f is an isomorphism iff f is surjective. There is an order-preserving mapping from Z into Q but no isomorphism. The next result is of similar nature.

Example 10.12. There is no order-preserving function from Z into N.

Proof. Assume by way of contradiction that there is an order-preserving function

f : Z → N. Then f(0) = n for some n. It follows that f(−1) < n and thus

f(−1) ≤ n − 1, f(−2) < n − 1 and thus f(−2) ≤ n − 2. By induction one can show that f(−m) ≤ n − m and f(−n) ≤ 0. But then f(−n − 1) < 0, which is impossible. So f cannot exist.

Exercise 10.13. Consider the ordering ⊏ given by

(m, n) ⊏ (i, j) ⇔ (m < i)

∨ (m = i ∧ m is even ∧ n < j)

∨ (m = i ∧ m is odd ∧ n > j)

on A = {0, 1, 2, 3, 4, 5} ×N. Construct an order-preserving mapping from (Z, <) into

(A, ⊏) where < is the natural ordering of Z.

For the set (Z, <) there are nontrivial isomorphisms onto itself, that is, isomorphisms different from the identity. For example, z ↦ z + 8. Does (A, ⊏) also have nontrivial

isomorphisms onto itself? If so, is there any element which is always mapped to itself?

Proposition 10.14. If (A, <) is a finite linearly ordered set and A ≠ ∅ then A has a greatest and a least element with respect to <.

Proof. This is proven by induction. The proposition holds for orderings having one

element since this unique element is the least and greatest element with respect to

the given ordering at the same time.

Assume now that n ≥ 1 and the proposition holds for all nonempty ﬁnite linearly

ordered sets of cardinality up to n. Let (A, <) be a linearly ordered set of cardinality

S(n). Let a ∈ A and B = A−{a}. Then (B, <) is a linearly ordered set of cardinality

n. By induction hypothesis, B has a least element b_1 and a greatest element b_2. There are three cases:

1. b_2 < a. Then b_1 is the least and a the greatest element of A.

2. a < b_1. Then a is the least and b_2 the greatest element of A.

3. b_1 < a < b_2. Then b_1 is the least and b_2 the greatest element of A.

These three cases cover every possibility since b_1 ≤ b_2. Thus it follows from case-distinction that A has a least and a greatest element. This completes the inductive step.

Theorem 10.15. If (A, <) is a finite linearly ordered set and n = |A|, then (A, <) ≅ (n, ∈).

Proof. The theorem is proven by induction on n. If A = ∅ then (A, <) ≅ (0, ∈) since both are empty sets and the ordering of an empty set is irrelevant.

Assume that the theorem holds for all linearly ordered sets of size n. Let (A, <) be a linearly ordered set of size S(n). Let a ∈ A be the greatest element of A, given by Proposition 10.14. Let B = A − {a}. Then (B, <) ≅ (n, ∈) by induction hypothesis.

Let g : B → n be the isomorphism. Deﬁne f : A → S(n) by f = g ∪{(a, n)}. Then f

is an isomorphism since a is the greatest element of A and n is the greatest element

of S(n) with respect to ∈ and g is an isomorphism. This ﬁnishes the proof of the

inductive step and the whole theorem.
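The inductive proof can be read as a procedure: remove the greatest element, send it to the largest remaining number, and continue with the rest. The sketch below follows this idea with a loop instead of a recursion; the name isomorphism_to_n and the representation of the ordering as a two-argument predicate less are assumptions made here.

```python
def isomorphism_to_n(elements, less):
    """Map a finite linearly ordered set onto {0, 1, ..., n-1}, preserving the order."""
    remaining = set(elements)
    iso = {}
    while remaining:
        # The greatest element of the remaining set; it exists by Proposition 10.14.
        a = next(x for x in remaining if not any(less(x, y) for y in remaining))
        iso[a] = len(remaining) - 1     # a plays the role of the greatest element of S(n)
        remaining.remove(a)
    return iso
```

As a usage example, ordering the words "fig", "pear" and "apple" by their length gives the mapping fig → 0, pear → 1, apple → 2.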

The theorem says that, up to isomorphism, the ﬁnite linearly ordered sets are sets

of the form n = {m ∈ N | m < n} with the natural ordering. The next result is

that every countable linear ordering is isomorphic to a subset of Q with the standard

ordering. So, the linear ordering of the set of all rational numbers is really a universal

linear ordering of countable sets.

Deﬁnition 10.16. A linearly ordered set (A, <) is dense iﬀ A has at least two

elements and for any pair a, b ∈ A with a < b there is c ∈ A such that a < c < b. A

subset B ⊆ A is dense in a linearly ordered set (A, <) iﬀ for every a, b ∈ A with a < b

there is a c ∈ B such that a < c < b. A linearly ordered set (A, <) has no end points

iﬀ for all a ∈ A there are b, c ∈ A such that b < a < c.

Example 10.17. (Q, <) and (R, <) are dense linearly ordered sets without end points.

Q is also dense in (R, <). (Z, <) is not dense. ({0, 1, 2, 3}, <) has end points 0 and 3.

The set D = {m·2^{-n} | n ∈ N ∧ m ∈ {0, 1, 2, ..., 2^n}} of all dyadic numbers between 0 and 1 is dense and has end points 0 and 1. D is a subset of Q.

Theorem 10.18. Every countable dense linear order (A, ⊏) with end points a_0, a_1 is isomorphic to D.

Proof. As A is countable, A = {a_0, a_1, a_2, ...} for some enumeration a_0, a_1, a_2, ... which, formally, is a one-to-one function n ↦ a_n from N onto A.

Now one defines a function f : D → A by recursion as follows: f(0) = a_0 and f(1) = a_1. After the values f(m·2^{-n}) have been defined for all m ∈ {0, 1, ..., 2^n}, one defines f((2m+1)·2^{-S(n)}) to be a_ℓ for the first ℓ with f(m·2^{-n}) ⊏ a_ℓ ⊏ f((m+1)·2^{-n}). The search terminates as (A, ⊏) is dense.
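The recursive construction can be tried out on a concrete instance. The following is only a sketch under assumptions that are not part of the text: A is taken to be the set of rationals in [0, 1] with its usual order, the enumeration lists a_0 = 0 and a_1 = 1 first and then the remaining rationals by increasing denominator, and the names enumerate_unit_rationals and build_f are made up for the illustration.

```python
from fractions import Fraction
from itertools import count

def enumerate_unit_rationals():
    """Yield a_0 = 0, a_1 = 1, then every rational in (0, 1) exactly once."""
    yield Fraction(0)
    yield Fraction(1)
    for den in count(2):
        for num in range(1, den):
            q = Fraction(num, den)
            if q.denominator == den:  # skip fractions that are not in lowest terms
                yield q

def build_f(levels):
    """Return f restricted to the dyadic numbers m * 2^-n with n <= levels."""
    gen = enumerate_unit_rationals()
    seen = []  # the prefix a_0, a_1, ... generated so far

    def a(index):
        while len(seen) <= index:
            seen.append(next(gen))
        return seen[index]

    f = {Fraction(0): a(0), Fraction(1): a(1)}
    for n in range(levels):
        step = Fraction(1, 2 ** n)
        for m in range(2 ** n):
            lo, hi = f[m * step], f[(m + 1) * step]
            ell = 0
            # first index ell with f(m * 2^-n) < a_ell < f((m + 1) * 2^-n)
            while not (lo < a(ell) < hi):
                ell += 1
            f[(2 * m + 1) * step / 2] = a(ell)
    return f

f = build_f(3)
for p in sorted(f):
    print(p, "->", f[p])
```

In this instance the values at the listed dyadic numbers come out strictly increasing, as the order-preservation argument predicts.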

One can easily show by induction that f is order-preserving. This is true on the domain {0, 1} by a_0 ⊏ a_1. If f is order-preserving on the domain {m·2^{-n} | m ∈ {0, 1, ..., 2^n}} then it is also order-preserving on the extended domain {m·2^{-S(n)} | m ∈ {0, 1, ..., 2^{S(n)}}} as the new values are inserted such that the order is preserved. Thus for any p, q ∈ D, p < q ⇒ f(p) ⊏ f(q).

As f is order-preserving, f is also one-to-one. Now one shows that f is onto. This is done by showing that

∀n ∈ N (a_n ∈ f[{m·2^{-n} | m ∈ {0, 1, ..., 2^n}}]).

This is true for a_0 as a_0 = f(0). Assume now that it is true for a_0, a_1, ..., a_n. Let m be the smallest element of {0, 1, ..., 2^n} with a_{S(n)} ⊑ f(m·2^{-n}). Note that m > 0 as f(0) = a_0 ⊏ a_{S(n)}. If f(m·2^{-n}) = a_{S(n)} then there is nothing to prove. Otherwise the first index ℓ with f((m−1)·2^{-n}) ⊏ a_ℓ ⊏ f(m·2^{-n}) is equal to S(n) as a_{S(n)} is between these two values but a_0, a_1, ..., a_n are by induction hypothesis all of the form f(q) for some q with either q ≤ (m−1)·2^{-n} or q ≥ m·2^{-n}. Thus a_{S(n)} is in the set f[{m·2^{-S(n)} | m ∈ {0, 1, ..., 2^{S(n)}}}]. It follows that A ⊆ f[D] and f is onto.

Corollary 10.19. Two countable and dense linearly ordered sets are order-isomorphic iff either both sets have no end points or both sets have a minimum but no maximum or both sets have a maximum but no minimum or both sets have both end points. Furthermore, if (A, ⊏) is an at most countable linearly ordered set then there is an order-preserving mapping from A into (D, <).

Proof. Let (A, ⊏) be a countable dense linearly ordered set. It is shown that (A, ⊏) is

isomorphic to exactly one of the following four sets: (D, <), (D−{0}, <), (D−{1}, <),

(D −{0, 1}, <).

If A has end points a_0, a_1, then this follows from Theorem 10.18. If A has no end points then one can extend A by considering new elements −∞, +∞ ∉ A with −∞ ⊏ a ⊏ +∞ for all a ∈ A. Then (A ∪ {−∞, +∞}, ⊏) is isomorphic to (D, <) and thus (A, ⊏) is isomorphic to (D − {0, 1}, <). Similarly one handles the case if A has only one of the end points, that is, either a minimum or a maximum but not both.

The four sets (D, <), (D−{0}, <), (D−{1}, <), (D−{0, 1}, <) are not isomorphic

to each other. For example, if B is either D or D − {1} and C is either D − {0} or

D − {0, 1} and f : B → C is order-preserving then there is an element y ∈ C with

y < f(0) as C has no minimum. Thus f[B] ⊂ C and f is not an order-isomorphism.

Furthermore, if B is either D or D − {0} and C is either D − {1} or D − {0, 1} and

f : B → C is order-preserving then there is an element y ∈ C with y > f(1) as C has

no maximum. Again f[B] ⊂ C and f is not an order-isomorphism. This shows that

none of these four sets are order-isomorphic to each other.

If (A, ⊏) and (A′, ⊏′) are both isomorphic to the same set (B, <), then (A, ⊏) is isomorphic to (A′, ⊏′) as isomorphisms can be inverted and concatenated.

For the last statement, assume that A is a countable linearly ordered set. Then (A × Q, <_lex) is a dense linearly ordered set which is order-isomorphic via some function g to (D − {0, 1}, <). Now define f(a) = g((a, 0)) for all a ∈ A and one obtains an order-preserving function f from A into D.

The last result of this section is the characterization of the real line as a linearly

ordered set.


Deﬁnition 10.20. A linearly ordered set (A, <) is complete iﬀ every nonempty subset

of A bounded from above has a supremum in A.

Theorem 10.21. The real line (R, <) is the unique (up to isomorphism) complete

linearly ordered set without end points that has a countable subset dense in it.

Proof. Assume that the ordered set (A, ⊏) is a complete linearly ordered set without end points and has a countable subset B which is dense in it. The set B is a countable dense linearly ordered set without end points and thus, by Corollary 10.19, there is an order-preserving bijection f : Q → B. This function f is extended to R by defining

f(r) = sup_⊏ {f(q) | q ∈ Q ∧ q < r} for all r ∈ R − Q.

If r, r′ ∈ R are distinct then one is strictly smaller than the other, say r < r′. Since Q is a dense subset there are two rationals q, q′ in between: r < q < q′ < r′. It follows f(r) = sup_⊏ {f(q″) | q″ ∈ Q ∧ q″ < r} ⊑ f(q) ⊏ f(q′) ⊑ sup_⊏ {f(q″) | q″ ∈ Q ∧ q″ < r′} = f(r′) and thus f(r) ⊏ f(r′). So f is order-preserving and one-to-one.

Assume by way of contradiction that a ∈ A − f[R]. Since B has no end points, there are members q, q′ ∈ Q with f(q) ⊏ a ⊏ f(q′). Now let b = sup_⊏ {f(q″) | q″ ∈ Q ∧ f(q″) ⊏ a}. By choice of a, b ⊏ a. Since B is dense in A there is a c ∈ B in between, that is, b ⊏ c ⊏ a and c = f(q‴) for some q‴ ∈ Q. But this contradicts the definition of b which imposes either f(q‴) ⊑ b or a ⊏ f(q‴). Thus a cannot exist and A = f[R].

Exercise 10.22. Assume that (A, <) is linearly ordered, has no end-points, is dense

and satisﬁes that every nonempty subset B ⊂ A which is bounded from below has an

inﬁmum. Show that (A, <) is a complete ordered set.

11 Well-Orderings

Linear orderings have a higher quality than partial orderings since every two diﬀerent

elements are comparable. Well-orderings are a further improvement since they gener-

alize the property that every ﬁnite linearly ordered set has a least element to inﬁnite

subsets of the well-ordered set.

Deﬁnition 11.1. A linear ordering < of a set A is a well-ordering of A iﬀ every

nonempty subset B ⊆ A has a least element with respect to <. In case that < is

a well-ordering of A, (A, <) is called a well-ordered set. A set A is well-orderable iﬀ

there exists a well-ordering of A.

Example 11.2. Every ﬁnite linearly ordered set is a well-ordered set. The standard

linear ordering of N is a well-ordering of N.


Proof. Since all ﬁnite sets are order-isomorphic to subsets of N, it is suﬃcient to

prove the second statement. Given a nonempty B ⊆ N, there is by the Axiom of

Foundation an element m ∈ B such that m ∩ B = ∅. If n ∈ B − {m} then n ∉ m

and thus m ∈ n by the properties of natural numbers. Thus m < n and therefore m

is the least element of B with respect to <.

Example 11.3. Recall the two linear orderings

(0, 0) <_1 (0, 1) <_1 (0, 2) <_1 (0, 3) <_1 ... <_1 (1, 3) <_1 (1, 2) <_1 (1, 1) <_1 (1, 0) and
(0, 0) <_2 (0, 1) <_2 (0, 2) <_2 (0, 3) <_2 ... <_2 (1, 0) <_2 (1, 1) <_2 (1, 2) <_2 (1, 3) <_2 ...

on the set {0, 1} × N from Example 10.3. The first ordering is not a well-ordering since the subset {1} × N has no least element with respect to <_1. The second ordering is a well-ordering.

Notice that the above considered well-ordered sets ({0}, <), ({0, 1}, <), ({0, 1, 2}, <), ..., (N, <) and ({0, 1} × N, <_2) are mutually non-isomorphic.

Example 11.4. The lexicographic ordering of N ×N is a well-ordering of N ×N.

Proof. Recall that (m, n) < (i, j) iﬀ (m < i) or (m = i and n < j).

The lexicographic ordering is a linear ordering. So it is suﬃcient to show that it

is actually a well-ordering of N × N, that is, every nonempty subset of N × N has a

minimal element.

Let A be a nonempty subset of N × N. For every m, let A_m = A ∩ {(m, n) : n ∈ N}. There is a least m such that A_m is not empty. Let n be the least number in N with (m, n) ∈ A_m. Consider any (i, j) ∈ A − {(m, n)}. If i = m then j > n by the choice of n and (m, n) < (i, j). If i ≠ m then i > m by the choice of A_m and again (m, n) < (i, j). Thus (m, n) is the minimum of A with respect to the lexicographic ordering.
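The two-stage choice in this proof (first the least m with A_m nonempty, then the least n in that row) can be mirrored directly for a finite subset. This is a small sketch only; the function name lex_minimum is hypothetical and the restriction to finite sets is an assumption made so that the search is computable.

```python
def lex_minimum(A):
    """Least element of a nonempty finite A ⊆ N × N, found as in the proof."""
    m = min(i for (i, _) in A)            # least m such that A_m is not empty
    n = min(j for (i, j) in A if i == m)  # least n with (m, n) in A_m
    return (m, n)

A = {(3, 0), (1, 7), (1, 2), (2, 0)}
print(lex_minimum(A))
```

Python compares tuples lexicographically, so for finite sets the result agrees with the built-in min(A).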

Example 11.5. A further well-ordering of N × N is defined as follows:

(m, n) <_cw (i, j) ⇔ max{m, n} < max{i, j}
    ∨ (max{m, n} = max{i, j} ∧ m < i)
    ∨ (max{m, n} = max{i, j} ∧ m = i ∧ n < j).

Proof. It is established that <_cw is a well-ordering by showing that the following four conditions hold.

Antireflexiveness. (m, n) ≮_cw (m, n) since (m, n) <_cw (i, j) requires that either max{m, n} ≠ max{i, j} or m ≠ i or n ≠ j and none of these conditions holds if (m, n) = (i, j).

Transitivity. Assume the following two conditions (∗):

(m, n) <_cw (i, j) and (i, j) <_cw (h, k).

It is shown that (∗) implies (m, n) <_cw (h, k).

If max{m, n} < max{h, k} then (m, n) <_cw (h, k). Otherwise max{m, n} = max{i, j} = max{h, k} and both relations in (∗) hold by the second or third case of the disjunction in the definition. If m < h then again (m, n) <_cw (h, k). Otherwise m = i = h and both relations in (∗) hold by the third case of the definition. Thus max{m, n} = max{h, k}, m = h and n < k by n < j < k. So again <_cw holds and <_cw is transitive.

Comparability. Assume that neither (m, n) <_cw (i, j) nor (i, j) <_cw (m, n). Then max{m, n} = max{i, j}, m = i and n = j, that is, (m, n) = (i, j). Thus any two different members of N × N are comparable.

Well-orderedness. Let A ⊆ N × N be nonempty. Let

A_k = {(m, n) ∈ A | max{m, n} = k}.

Fix k as the least number such that A_k is nonempty. This set A_k is a finite set and has a least element (m, n) with respect to <_cw since <_cw is a linear order on N × N and also on its subset A_k. Now let (i, j) ∈ A − {(m, n)}. If (i, j) ∉ A_k then max{i, j} > k = max{m, n} and (m, n) <_cw (i, j). If (i, j) ∈ A_k then (m, n) <_cw (i, j) by the choice of (m, n) from A_k. Thus A has a minimum and (N × N, <_cw) is a well-ordered set.

Remark 11.6. Notice that the ordered set (N × N, <_cw) is isomorphic to (N, <). Indeed the function f given as

f(m, n) = (2 max{m, n} + 2)^3 + (max{m, n} + m + 1)^2 + n

is an order-preserving one-to-one mapping into N. Thus, (N × N, <_cw) is isomorphic to an infinite subset of (N, <) which is then isomorphic to (N, <).
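As a sanity check, the ordering of Example 11.5 and the map f of Remark 11.6 can be compared on a finite square of pairs. The sketch below is illustrative only: the names cw_less, cw_key and the bound 6 on the coordinates are assumptions chosen for the example, and a finite check of course does not replace the proof.

```python
from itertools import product

def cw_less(p, q):
    """(m, n) <_cw (i, j), following the three cases of the definition."""
    (m, n), (i, j) = p, q
    if max(m, n) != max(i, j):
        return max(m, n) < max(i, j)
    if m != i:
        return m < i
    return n < j

def f(m, n):
    """The map f(m, n) = (2 max{m, n} + 2)^3 + (max{m, n} + m + 1)^2 + n."""
    k = max(m, n)
    return (2 * k + 2) ** 3 + (k + m + 1) ** 2 + n

def cw_key(p):
    """Sort key that induces <_cw: compare the maximum, then m, then n."""
    return (max(p), p[0], p[1])

pairs = list(product(range(6), repeat=2))
# f is order-preserving on the sampled pairs
assert all(cw_less(p, q) == (f(*p) < f(*q)) for p in pairs for q in pairs)
first_pairs = sorted(pairs, key=cw_key)[:5]
print(first_pairs)
```

Sorting by cw_key lists the pairs square by square, which is the intuition behind the order type being that of (N, <).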

Example 11.7. The following subsets of Q are well-ordered with respect to the natural ordering of Q:

{−1/(n+1) | n ∈ N},
{−1/(m+1) − 1/(n+1) | m, n ∈ N},
{−1/(k+1) − 1/(m+1) − 1/(n+1) | k, m, n ∈ N}.


The orderings are isomorphic to that of the lexicographic ordering on N, N × N,

N×N×N, respectively; the lexicographic ordering of N is of course identical with the

natural one.

Exercise 11.8. The set

{−1/(m_1+1) − 1/(m_2+1) − ... − 1/(m_n+1) | n, m_1, m_2, ..., m_n ∈ N}

is not a well-ordered subset with respect to the natural ordering of Q: show that the set is dense and is not bounded from below.

Example 11.9. Both Z and Q are well-orderable, but the ordering diﬀers from the

standard one.

Proof. In fact every countable set X is well-orderable. Since |X| ≤ |N|, there is a

one-to-one function f : X →N. Now one deﬁnes on X a well-ordering ⊏ by

x ⊏ y ⇔ f(x) < f(y)

where x, y ∈ X. This order diﬀers from the natural order on Z and Q: these sets

contain the chain −1, −2, −3, . . . which is descending with respect to their natural

order and which cannot be so with respect to any well-ordering of them.
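For Z, one possible choice of the one-to-one function f can be written down explicitly. The particular f below (sending 0, −1, 1, −2, 2, ... to 0, 1, 2, 3, 4, ...) is only one of many possibilities and is an assumption of this sketch, as is the helper name precedes.

```python
def f(z):
    """A one-to-one map Z -> N: 0, -1, 1, -2, 2, ... go to 0, 1, 2, 3, 4, ..."""
    return 2 * z if z >= 0 else -2 * z - 1

def precedes(x, y):
    """x ⊏ y iff f(x) < f(y)."""
    return f(x) < f(y)

# The chain -1, -2, -3 descends in the natural order but ascends with respect to ⊏.
print(precedes(-1, -2), precedes(-2, -3))
print(sorted(range(-3, 4), key=f))
```

Every nonempty subset of Z then has a ⊏-least element, namely the one with the smallest value under f.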

Deﬁnition 11.10. For a linearly ordered set (L, <), an initial segment I of L is a

proper subset of L such that x ∈ I whenever x ∈ L and there is an y ∈ I with x < y.

That is, I is an initial segment iﬀ I is a downward closed proper subset of L:

I ⊂ L ∧ ∀x, y ∈ L(x < y ∧ y ∈ I ⇒ x ∈ I).

For a ∈ L, L[a] = {x ∈ L | x < a}. Call L[a] the initial segment of L given by a.

Proposition 11.11. If (W, <) is well-ordered, I is an initial segment of W, then

there is an a ∈ W such that I = W[a].

Proof. Let A = W − I. Then A is not empty and every element of A is an upper

bound of I since I is an initial segment. Let a be the least element of A with respect

to <. Then for x ∈ W, x < a if and only if x ∈ I.

Notice that {r ∈ Q | r < √2} is an initial segment of Q but it is not Q[a] for any a ∈ Q. The set of real numbers is isomorphic to every nonempty initial segment of itself and also has a large quantity of isomorphisms onto itself. Well-ordered sets are rigid, that is, they satisfy exactly the opposite of these properties.


Theorem 11.12 (Rigidity). Let (A, <) be a well-ordered set and f be an order-

preserving function from (A, <) to itself. Then a ≤ f(a) for all a ∈ A. In particular,

the range cannot be an initial segment of A and (A, <) is not isomorphic to any initial

segment. Furthermore, if f is an isomorphism from A to itself, then f is the identity.

Proof. If f(a) < a then f(f(a)) < f(a) since f is order-preserving. Thus there is no

least element a with f(a) < a. Since (A, <) is well-ordered, there is even no element

a ∈ A with f(a) < a.

Given an initial segment of A, it is of the form A[a] for some a ∈ A. Since f(a) ≥ a, f[A] ⊈ A[a] and the initial segment is not the range of f. Since the choice of f was arbitrary, there is no isomorphism from A to any initial segment.

Assume now that f is not the identity. Then there is a least element a ∈ A with f(a) ≠ a. As seen above, f(a) > a. Thus, for all b, f(b) ≠ a: If b < a then f(b) = b ≠ a by the choice of a; if b ≥ a then f(b) ≥ f(a) > a by the fact that f is order-preserving. So a is not in the range of f and f is not an isomorphism from A to itself. Thus the identity is the only isomorphism of (A, <).

Theorem 11.13 (Comparability Theorem). Given two well-ordered sets, they are

either isomorphic or exactly one of them is isomorphic to an initial segment of the

other.

Proof. Given (A_1, <_1) and (A_2, <_2), let B = P(A_1 × A_2). Call F ∈ B consistent iff the following conditions hold:

1. if (a_1, a_2), (b_1, b_2) ∈ F then either a_1 <_1 b_1 ∧ a_2 <_2 b_2 or a_1 = b_1 ∧ a_2 = b_2 or b_1 <_1 a_1 ∧ b_2 <_2 a_2.

2. if (a_1, a_2) ∈ F and b_1 <_1 a_1 then there is a b_2 <_2 a_2 such that (b_1, b_2) ∈ F.

3. if (a_1, a_2) ∈ F and b_2 <_2 a_2 then there is a b_1 <_1 a_1 such that (b_1, b_2) ∈ F.

Now the following facts hold for all consistent F, G:

1. If F ⊈ G then G ⊂ F. The elements of F are well-ordered by the ordering inherited from (A_1, <_1) or (A_2, <_2); both inherit the same ordering. There is a least pair (a_1, a_2) ∈ F − G. Let H be the set {(c_1, c_2) ∈ F | c_1 <_1 a_1} of the pairs in F below (a_1, a_2). Clearly H ⊆ F ∩ G. Assume now that H ⊂ G. Then there is a least element (b_1, b_2) ∈ G − H. Since F, G are consistent, there is for every c_1 <_1 a_1 a c_2 with (c_1, c_2) ∈ H and similarly for every c_2 <_2 a_2 a c_1 with (c_1, c_2) ∈ H. By consistency b_1 is the least element in A_1 different from these c_1 and b_2 the least element of A_2 different from these c_2, that is, b_1 = a_1 and b_2 = a_2 in contradiction to the choice of (a_1, a_2). Thus (b_1, b_2) does not exist and G = H ⊂ F.


2. There is a maximal consistent set. Every consistent set is an element of the power set P(A_1 × A_2) and the property consistent is first order definable from P(A_1 × A_2), A_1 and A_2 as shown above. So there is a set C of consistent sets. The consistent sets are linearly ordered by inclusion and their union is again consistent. Thus, F = ⋃C is a consistent set which is maximal.

3. The maximal consistent set F is a partial one-to-one function with either domain A_1 or range A_2. The property of being a bijection from the domain to the range comes from the definition, similarly the domain is a subset of A_1 and the range a subset of A_2. Assume now by way of contradiction that both subsets would be proper. Then there is a least a_1 ∈ A_1 which is outside the domain of F and a least a_2 ∈ A_2 which is outside the range of F. This would give that F ∪ {(a_1, a_2)} is also consistent in contradiction to the maximality of F. Thus only one inclusion can be proper.

If the domain of F is A_1 and the range of F is A_2 then (A_1, <_1) and (A_2, <_2) are isomorphic. If the range of F is a proper subset of A_2 then F is an isomorphism from (A_1, <_1) to an initial segment of (A_2, <_2). If the domain of F is a proper subset of A_1 then F^{-1} is an isomorphism from (A_2, <_2) to an initial segment of (A_1, <_1).
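For finite well-ordered sets the maximal consistent set can be written down directly: it pairs the k-th least element of A_1 with the k-th least element of A_2. The sketch below checks the three consistency conditions by brute force. It assumes that each well-ordered set is given as a Python list of distinct items in increasing order; the names compare_well_orders and is_consistent are hypothetical.

```python
def compare_well_orders(A1, A2):
    """A1, A2: lists of distinct items, each listed in increasing order."""
    F = list(zip(A1, A2))  # pair the k-th least elements of both sets
    if len(A1) == len(A2):
        verdict = "isomorphic"
    elif len(A1) < len(A2):
        verdict = "A1 is isomorphic to an initial segment of A2"
    else:
        verdict = "A2 is isomorphic to an initial segment of A1"
    return F, verdict

def is_consistent(F, A1, A2):
    """Brute-force check of conditions 1 to 3 for a set F of pairs."""
    pos1 = {a: k for k, a in enumerate(A1)}
    pos2 = {a: k for k, a in enumerate(A2)}
    pairs = set(F)
    for (a1, a2) in pairs:
        for (b1, b2) in pairs:  # condition 1
            ok = ((pos1[a1] < pos1[b1] and pos2[a2] < pos2[b2])
                  or (a1 == b1 and a2 == b2)
                  or (pos1[b1] < pos1[a1] and pos2[b2] < pos2[a2]))
            if not ok:
                return False
        for b1 in A1[:pos1[a1]]:  # condition 2
            if not any(c1 == b1 and pos2[c2] < pos2[a2] for (c1, c2) in pairs):
                return False
        for b2 in A2[:pos2[a2]]:  # condition 3
            if not any(c2 == b2 and pos1[c1] < pos1[a1] for (c1, c2) in pairs):
                return False
    return True

F, verdict = compare_well_orders(["a", "b"], [10, 20, 30])
print(F, verdict, is_consistent(F, ["a", "b"], [10, 20, 30]))
```

The set {("a", 20)} fails condition 3 here, since 10 lies below 20 but is not matched with anything.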

Exercise 11.14. Define a function f : {0, 1, ..., 9}* → N which is order-preserving with respect to the length-lexicographic ordering <_ll: v <_ll w ⇔ f(v) < f(w). Recall 0 <_ll 1 <_ll ... <_ll 9 <_ll 00 <_ll 01 <_ll ... <_ll 99 <_ll 000 <_ll ... and v <_ll w if either v is shorter than w or v, w have the same length and v <_lex w.

12 Ordinals

Ordinals are a generalization of the natural numbers. While a natural number (viewed as the set which represents it) is order-isomorphic to finite well-ordered sets, ordinals generalize this so that every well-ordered set is order-isomorphic to some ordinal. Recall from Definition 4.4 that a set A is transitive iff ∀a ∈ A ∀b ∈ a (b ∈ A). Ordinals are well-ordered sets which are represented by transitive sets.

Deﬁnition 12.1. A set is an ordinal (or ordinal number) if it is transitive and well-

ordered by the ordering ∈ (restricted to its members).

Example 12.2. Every natural number is an ordinal. The sets N and N ∪ {N} are

ordinals. The set {2, 3, 4, 5, 6, 7, 8} is well-ordered by ∈ but not transitive. The set

{∅, {∅}, {∅, {∅}}, {{∅}}} is transitive but not linearly ordered by ∈.

Convention 12.3. Ordinals are normally written by lower case Greek letters.


Definition 12.4. ω is the first ordinal after 0 which is not the successor of any other ordinal. That is, ω is the ordinal represented by N. The ordinals strictly below ω are called finite and those beyond ω are called transfinite.

Remark 12.5. There are in principle two options to represent the natural numbers. Both represent 0 by ∅. Having the codes for 0, 1, ..., n, the first one would represent S(n) as {Code(n)} while the second one would represent S(n) as {Code(0), ..., Code(n)}. The advantage of the second approach is that it also permits representing transfinite ordinals (as already indicated above), which is impossible in the first approach. Thus the second approach was taken in Definition 4.3. Examples for the two representations are:

Number   First representation   Second representation
0        ∅                      ∅
1        {∅}                    {∅}
2        {{∅}}                  {∅, {∅}}
3        {{{∅}}}                {∅, {∅}, {∅, {∅}}}
4        {{{{∅}}}}              {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}
...      ...                    ...
ω        -                      {Code(α) | α ∈ N}
ω + 1    -                      {Code(α) | α ∈ N ∨ α = ω}

So the second representation was taken in set theory since it codes natural and ordinal

numbers in a uniform way and satisﬁes that

α < β ⇔ α ∈ β ⇔ α ⊂ β.

Furthermore, the second notation also reﬂects the intuition from counting that n is a

set of n objects, namely the representatives of the n smaller numbers. The price paid

is that it takes much space to write down even small numbers, see Exercise 8.11.
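Both representations of the finite numbers can be built with nested frozensets, which makes the stated properties easy to test for small n. This is a sketch only; the function names first_code, second_code and is_transitive are made up for the illustration, and only finite numbers can be handled this way.

```python
def first_code(n):
    """First representation: S(n) is coded as {Code(n)}."""
    code = frozenset()
    for _ in range(n):
        code = frozenset({code})
    return code

def second_code(n):
    """Second representation: S(n) is coded as Code(n) ∪ {Code(n)}."""
    code = frozenset()
    for _ in range(n):
        code = code | {code}
    return code

def is_transitive(a):
    """A set is transitive iff every member is also a subset."""
    return all(b <= a for b in a)

two, three = second_code(2), second_code(3)
# 2 < 3 shows up both as membership and as proper inclusion
print(two in three, two < three, len(second_code(4)), len(first_code(4)))
```

In the second representation the code of n has exactly n members, while in the first representation every code of a positive number has exactly one member.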

Theorem 12.6. If A is transitive and (A, ∈) linearly ordered then A is an ordinal.

Proof. Let B ⊆ A and B be nonempty. By the Axiom of Foundation there is x ∈ B such that y ∉ x for all y ∈ B. Since (A, ∈) is linearly ordered, the same holds for (B, ∈) and x is the least element of B with respect to the ordering given by ∈. Thus A is an ordinal.

Exercise 12.7. Verify the following properties of ordinals.

1. If α is an ordinal, then S(α), which is deﬁned as α ∪ {α}, is also an ordinal.

2. Every element of an ordinal is an ordinal.


3. An ordinal α is transﬁnite iﬀ |α| = |S(α)|.

4. An ordinal α is ﬁnite iﬀ S(α) = {0} ∪ {S(β) | β ∈ α}.

Theorem 12.8. The following basic facts hold for ordinals:

1. If α, β are ordinals then either α ∈ β or α = β or β ∈ α.

2. If A is a set of ordinals then ⋃A is an ordinal.

3. If A is a nonempty set of ordinals then there exists an ordinal α ∈ A such that

α ∩ A = ∅. Consequently every set of ordinals is well-ordered by ∈.

Proof. 1. Let α, β be ordinals such that the cases α = β and α ∈ β do not hold, that is, α ⊈ β. Then there is a least element of α − β, say γ. Since γ = {δ | δ < γ} and every such δ is in α by transitivity and thus in β by the minimality of γ, γ ⊆ β. If γ ⊂ β then γ ∈ β in contradiction to the choice of γ. Thus γ = β and β ∈ α.

2. ⋃A is the union of transitive sets which are linearly ordered by ∈. The union is again transitive. Furthermore, if α, β ∈ ⋃A then α and β are comparable by the previous paragraph. If α ∈ β and β ∈ γ then α ∈ γ since γ itself is an ordinal and transitive. The Axiom of Foundation gives α ∉ α for all α. So ⋃A is a transitive set which is linearly ordered by ∈. By Theorem 12.6, ⋃A is an ordinal.

3. The set A is a subset of the ordinal S(⋃A) and inherits from it that ∈ is a well-ordering. Since A is nonempty, it has a least element α with respect to the ordering given by ∈. Then A ∩ α contains only ordinals below α and is thus empty.

Exercise 12.9. Use the above results to show that there is no set containing all

ordinals in V .

Definition 12.10. An ordinal α is called successor-ordinal if α = S(β) = β ∪ {β} for some other ordinal β and is called limit ordinal otherwise. The supremum of a set A of ordinals is denoted by sup A, note that sup A = ⋃A.

Example 12.11. 0 is a limit ordinal and all the positive natural numbers are successor ordinals. ω = sup N is the next limit ordinal. One can combine the usage of supremums and successors to obtain every ordinal from those below. So, for given α, one has

α = ⋃{S(β) | β ∈ α} = sup{S(β) | β < α}

and this rule holds also for α = 0 by using the definition sup ∅ = 0. One should also note that above every ordinal α is a successor ordinal, namely S(α), and also a limit ordinal obtained as sup f_α[N] where f_α(0) = α and f_α(S(n)) = S(f_α(n)).


13 Transﬁnite Induction and Recursion

Induction and Recursion can be generalized to ordinals and the universe of sets. This

says that one can prove theorems and build functions along the membership relation

∈ from the bottom to the top.

Theorem 13.1 (Transﬁnite Induction on Ordinals). Let p(x) be a property.

Assume that for every ordinal α,

1. p(0) holds;

2. if there is an ordinal β with S(β) = α and if p(β) holds then also p(α) holds;

3. if α is a limit ordinal and p(β) holds for all β < α then p(α) holds.

Then it can be concluded that p(α) holds for all ordinals α.

Remark 13.2. There are several equivalent statements of Transﬁnite Induction.

1. If for every ordinal α the implication (∀β < α(p(β))) ⇒ p(α) is true, then p(α)

is true for all ordinals α.

2. If there is no minimal α satisfying ¬p(α) then p(α) is true for all α.

3. If for every α where p(α) is false there is another β < α where p(β) is false, then

p(α) is true for all ordinals α.

Note that due to the Axiom of Foundation one can get a counterpart to transﬁnite

induction on (V, ∈).

Theorem 13.3 (Transﬁnite Induction in V). Assume that for a property p and

all x ∈ V the implication

(∀z ∈ x(p(z))) ⇒ p(x)

holds. Then p(x) is satisﬁed for all x ∈ V .

Proof. Assume that there is an x ∈ V where p(x) is false. Let x′ = {z ∈ TC(x) | p(z)}. The set TC(x) − x′ is nonempty and there is by the Axiom of Foundation a y ∈ TC(x) − x′ such that every z ∈ y is not in TC(x) − x′. Recall that by definition, TC(x) is transitive and thus all members of y are in TC(x). Thus they are also in x′ and one has that ∀z ∈ y p(z) is true. So p(y) holds and y ∈ x′ contrary to its choice. Thus TC(x) − x′ must be empty and p(x) is true as well. So p(x) holds for all x ∈ V.


Example 13.4. Let F be a class deﬁning a function in one variable. If F(x) = F[x]

for all x ∈ V then F is the identity.

Proof. Assume that F(y) = y for all y ∈ x. Then F(x) = F[x] = {F(y) | y ∈ x}

= {y | y ∈ x} = x. Thus the equality holds also for x. It follows from transﬁnite

induction that F is the identity.

Well-founded relations are a generalization of both, the element relation and a well-

ordering.

Definition 13.5. A relation R on a domain W (which is either a class or a set) is

well-founded iﬀ

• for every x ∈ W, {y ∈ W | y Rx} is a set;

• for every nonempty set x ⊆ W there is an y ∈ x such that no z ∈ x satisﬁes

z Ry.

Note that the ﬁrst condition is only important for the case that W is a proper class,

that is, not a set. Furthermore, choosing for any y ∈ W the subset x = {y} of W,

proves that (W, R) is irreﬂexive. Now some examples of well-founded relations are

given:

• Assume that xRy iff x ∈ y. Then R is a well-founded relation on V by the Axiom

of Foundation.

• Assume that (W, <) is a well-ordered set and xRy iﬀ x < y. Then R is a

well-founded relation on W.

One can use well-founded relations to generalize recursion from the natural numbers

to many other structures like well-ordered sets, the class of ordinals and even the

whole universe V along ∈. Note that not only recursion but also transﬁnite induction

can be carried out along any well-founded relation.

Exercise 13.6. Let A be some set and let a_0 a_1 ... a_{n−1} R b_0 b_1 ... b_{m−1} ⇔ n < m and there is a function f : n → m such that b_{f(i)} = a_i and (i < j ⇒ f(i) < f(j)) for all i, j ∈ n where a_0 a_1 ... a_{n−1}, b_0 b_1 ... b_{m−1} ∈ A*. Show that R is well-founded.

Let R be such that xRy iﬀ there is a z with x ∈ z ∧ z ∈ y. Show that R is

well-founded.

Let (x, y) R(v, w) iﬀ either x = v ∧ y ∈ w or y = w ∧ v ∈ x. Is R well-founded?

Is the relation R given as xRy ⇔ x ∩ y = x ∪ y well-founded?


Theorem 13.7 (Transfinite Recursion). Let R be a well-founded relation with domain W and let G be a class which is a function in n + 1 variables. Then there is a class F which is a function in n variables and satisfies

∀x_1, ..., x_n ∈ W (F(x_1, ..., x_n) = G(x_1, ..., x_n, {(y_1, F(y_1, x_2, ..., x_n)) | y_1 R x_1})).

Proof. The proof is similar to that of Theorem 5.2. Let W be the domain of R and R* be the transitive closure of R, which exists by Theorem 5.2.

More formally, R^1 is the same relation as R and one defines inductively v R^{n+1} w if v R^n w or v R^n u and u R^n w for some u ∈ W. Furthermore, v R* w iff v R^n w for some n ∈ {1, 2, 3, ...}. One can easily verify the following four facts on R* to obtain that this relation is also well-founded: First one can show by induction that for each n and w ∈ W, the collection {v ∈ W : v R^n w} is a set. The same applies then for the inductively defined relation R*. Second, R* is transitive. Third, every nonempty set A ⊆ W has a minimal element with respect to R* as the set B = {w ∈ W : ∃u, v ∈ A : u R* w R* v} has a minimal element with respect to R. Fourth, R* is antisymmetric.

This transitive closure R* of R will now be used to define a class C which will be used to define the function F. So let C be the class of all functions f such that

• The domain of f is a set of the form {(y_1, x_2, ..., x_n) | y_1 = x_1 ∨ y_1 R* x_1} for some x_1, x_2, ..., x_n ∈ W.

• If (z_1, x_2, ..., x_n) is in the domain of the function f then f(z_1, x_2, ..., x_n) = G(z_1, x_2, ..., x_n, {(y_1, f(y_1, x_2, ..., x_n)) | y_1 R z_1}).

Now the function F is defined as the union over all functions f ∈ C, that is,

F = {(x_1, x_2, ..., x_n, f(x_1, ..., x_n)) | f ∈ C ∧ f(x_1, x_2, ..., x_n) is defined}.

It is now shown that F is actually a function. This is done by considering the following subclass D of the class of all n-tuples of elements in W.

D is the class of all n-tuples (x_1, x_2, ..., x_n) of elements in W such that there is a function f ∈ C for which f(x_1, x_2, ..., x_n) is defined and such that for all functions f, f̃ ∈ C where f(x_1, ..., x_n), f̃(x_1, ..., x_n) are defined, these values coincide.

Now one shows by transfinite induction that D = W^n. Let (x_1, x_2, ..., x_n) be any tuple of n elements in W and assume that (y_1, x_2, ..., x_n) ∈ D for all y_1 R* x_1. Now define f(y_1, x_2, ..., x_n) = ⋃{f̃(y_1, x_2, ..., x_n) : f̃ ∈ C ∧ f̃(y_1, x_2, ..., x_n) is defined} for all y_1 R* x_1. It follows from the membership of (y_1, x_2, ..., x_n) in D that f(y_1, x_2, ..., x_n) = f̃(y_1, x_2, ..., x_n) whenever f̃ ∈ C and f̃(y_1, x_2, ..., x_n) is defined. Consider f ∪ {(x_1, x_2, ..., x_n, G(x_1, x_2, ..., x_n, {(y_1, f(y_1, x_2, ..., x_n)) | y_1 R x_1}))}; this function is in C. If there is a further function f̃ ∈ C for which f̃(x_1, x_2, ..., x_n) is defined, then f̃ coincides with f on the domain of f and hence f̃(x_1, x_2, ..., x_n) coincides with G(x_1, x_2, ..., x_n, {(y_1, f(y_1, x_2, ..., x_n)) | y_1 R x_1}). So it follows that (x_1, ..., x_n) ∈ D.

Hence the class F = ⋃C is actually a function mapping n-tuples in W to V and so F exists. In the case that W is a set, F[W × W × ... × W] is a set as well by the Axiom of Replacement.

Informally, this means that whenever R is a well-founded relation on W and some class G says how to obtain F(x_1, x_2, ..., x_n) from the arguments x_1, x_2, ..., x_n and all pairs (y_1, F(y_1, x_2, ..., x_n)) with y_1 R x_1, then F itself exists (that is, F is a class).

Example 13.8. The function TC can be defined with transfinite recursion along ∈ via the formula

TC(x) = {x} ∪ ⋃{TC(y) | y ∈ x};

and TC coincides with the successor S on ordinals which is an expression. They are different on sets which are not ordinals as S({7, 8}) = {7, 8, {7, 8}} and TC({7, 8}) = {0, 1, 2, 3, 4, 5, 6, 7, 8, {7, 8}}. Note that TC(∅) = {∅} as ⋃{TC(y) | y ∈ ∅} is just ∅.
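On hereditarily finite sets, recursion along ∈ can be imitated by ordinary memoised recursion. The sketch below uses the case n = 1 of the recursion scheme F(x) = G(x, {(y, F(y)) | y R x}) and instantiates it with the formula for TC. It rests on several assumptions made for the illustration: sets are modelled as nested frozensets, the relation R is passed as a function returning the predecessors of x, and the names wf_recursion, G_tc and ordinal are hypothetical.

```python
from functools import lru_cache

def wf_recursion(G, predecessors):
    """Return F with F(x) = G(x, {(y, F(y)) | y R x}) for a well-founded R."""
    @lru_cache(maxsize=None)
    def F(x):
        return G(x, frozenset((y, F(y)) for y in predecessors(x)))
    return F

def G_tc(x, history):
    """G for the transitive closure: {x} together with all earlier values."""
    result = {x}
    for _, tc_y in history:
        result |= tc_y
    return frozenset(result)

# For R = ∈ the predecessors of a set are exactly its members.
TC = wf_recursion(G_tc, lambda x: x)

def ordinal(n):
    """Code of the natural number n in the second representation."""
    code = frozenset()
    for _ in range(n):
        code = code | {code}
    return code

x = frozenset({ordinal(7), ordinal(8)})
print(len(TC(x)))  # the ten members 0, 1, ..., 8 and {7, 8}
```

On the codes of natural numbers the computed TC agrees with the successor, as stated in the example.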

An important application of transﬁnite recursion is the following result.

Theorem 13.9 (Representation Theorem). Let (W, ⊏) be a well-ordered set.

Then there is an ordinal isomorphic to this set.

Proof. Using transfinite recursion one can define F : W → V by the equation

F(a) = ⋃{S(F(b)) | b ⊏ a}.

Note that F(a) = 0 if a is the least element of W with respect to ⊏ since the union over the members of the empty set gives the empty set: ⋃∅ = ∅. Furthermore, 0 is the ordinal represented by ∅. It is easy to see that for all a, b ∈ W the implication b ⊏ a ⇒ S(F(b)) ⊆ F(a) ⇒ F(b) ∈ F(a) holds. So F is order-preserving. Furthermore, if β ∈ F(a) then there is a least b ∈ W with β ∈ F(b). All c ⊏ b satisfy β ∉ F(c). So β ∈ ⋃{S(F(c)) | c ⊏ b} but β ∉ ⋃{F(c) | c ⊏ b}. Thus there is a c ⊏ b with β ∈ S(F(c)) − F(c). It follows that b is the successor of c with respect to ⊏ and β = F(c). So F[W] is transitive. Furthermore, by the Axiom of Replacement, F[W] is a set itself. So F[W] is an ordinal.

Theorem 13.10. For every set X there is an ordinal α such that |α| ≰ |X|.

Proof. Given X, let Y = {(Z, R) | Z ⊆ X ∧ R is a well-ordering on Z}. For all (Z, R), (Z′, R′) ∈ Y, let (Z, R) ⊏ (Z′, R′) ⇔ (Z, R) is isomorphic to an initial segment of (Z′, R′). The relation ⊏ is well-founded: If U ⊆ Y then either U = ∅ or U contains some (Z, R). In the latter case, either (Z, R) is a minimal element or there is a least z ∈ Z such that some (Z′, R′) ∈ U is isomorphic to (Z[z], R) and then that (Z′, R′) is a minimal element of U.

So one can define by transfinite recursion a function F : Y → V such that

F((Z, R)) = ⋃{S(F((Z′, R′))) | (Z′, R′) ∈ Y ∧ (Z′, R′) ⊏ (Z, R)}

where the well-founded relation on the domain of F is ⊏.

Assume now by way of contradiction that the range of F contains a set which is not an ordinal: then one can define the nonempty set Y′ = {(Z, R) ∈ Y : F((Z, R)) is not an ordinal} and Y′ has a minimal element (Z, R) with respect to ⊏. For this minimal element, one has that F((Z, R)) is the union of all sets S(F((Z′, R′))) with (Z′, R′) ∈ Y ∧ (Z′, R′) ⊏ (Z, R), hence F((Z, R)) is the union of ordinals and hence F((Z, R)) is an ordinal itself in contradiction to the assumption. Hence, the range of F is a set of ordinals and the union of these ordinals is an ordinal β. Every member of Y is isomorphic to an initial segment of α = S(β).

If there were a one-to-one function g : α → X then g[α] ∈ V and g[α] ⊆ X. Furthermore, g induces a well-ordering R on g[α] and (g[α], R) ∈ Y in contradiction to the fact that no member of Y is isomorphic to α. Thus there is no such g and |α| ≰ |X|.

Exercise 13.11. Construct by transﬁnite recursion a function on ordinals which tells

whether an ordinal is even or odd. More formally, construct a function F such that

F(α) = 0 if α is even, F(α) = 1 if α is odd. Limit ordinals should always be even;

the successor of an even ordinal is odd and the successor of an odd ordinal is even.

Exercise 13.12. Is it possible to define a function F on all sets such that F(X) = n iff n is the maximal number such that there are Y_0, Y_1, ..., Y_n with Y_{m+1} = S(Y_m) for all m ∈ n and X = Y_n? If so, construct the corresponding function F by transfinite recursion.

14 The Rank of Sets

The rank is an alternative method to measure the size of a set. The cardinality asks

how many elements are in the set, the rank asks how many levels are necessary to

build a set. The rank is deﬁned by transﬁnite recursion.

Definition 14.1. The rank ρ is defined as ρ(x) = ⋃{S(ρ(y)) | y ∈ x} with ρ(∅) = ⋃∅ = 0.

Example 14.2. ρ(1) = ρ({∅}) = 1, ρ(2) = ρ({∅, {∅}}) = 2, ρ({{∅}, {{∅}}}) = 3,

ρ({A}) = S(ρ(A)) and ρ(A ∪ B) = ρ(A) ∪ ρ(B) for all sets A, B.
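The values in Example 14.2 can be checked mechanically for hereditarily finite sets. The sketch below is an illustration under the assumption that sets are modelled as nested Python frozensets; for finite ranks the union ⋃{S(ρ(y)) | y ∈ x} is just the maximum of the numbers ρ(y) + 1, which is what the hypothetical helper rank computes.

```python
def rank(x):
    """ρ(x) = ⋃{S(ρ(y)) | y ∈ x}; for finite ranks this is max(ρ(y) + 1), and ρ(∅) = 0."""
    return max((rank(y) + 1 for y in x), default=0)


empty = frozenset()
one = frozenset({empty})            # 1 = {∅}
two = frozenset({empty, one})       # 2 = {∅, {∅}}
nested = frozenset({one, frozenset({one})})  # {{∅}, {{∅}}}
```

Here rank(one), rank(two) and rank(nested) evaluate to 1, 2 and 3, in line with the example.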

Proposition 14.3. The rank ρ is an ordinal-valued function with ρ(α) = α for all

ordinals α.

Proof. By Theorem 12.6, ρ(x) is an ordinal iff ρ(x) is transitive and linearly ordered by ∈. Being an ordinal is a property. So, for given x ∈ V, one can define the set

x′ = {y ∈ TC(x) | ρ(y) is an ordinal}

by comprehension. Assume by way of contradiction that x ∉ x′. Then TC(x) − x′ is not empty and has an element y such that no z ∈ y is in TC(x) − x′, that is, y ⊆ x′. Then ρ(z) is an ordinal for every z ∈ y and ρ(y) = ⋃{S(ρ(z)) | z ∈ y} is an ordinal by Theorem 12.8. Then y ∈ x′ contradicting the choice of y; this contradiction gives x ∈ x′. In particular, ρ(x) is an ordinal.

Recall that α = ⋃{S(β) | β ∈ α} for all ordinals α. Assuming that ρ(β) = β for all β ∈ α, one has that ρ(α) = ⋃{S(ρ(β)) | β ∈ α} = ⋃{S(β) | β ∈ α} = α. The equality ρ(α) = α holds for all ordinals α by transfinite induction.

Exercise 14.4. For any ordinal α, consider the successor function S restricted to α, that is, consider the set

S↾α = {{β, {β, S(β)}} | β ∈ α}.

Determine ρ(S↾α) for α = 42, 1905, 2004, ω, ω + 1, ω + 131501, ω^2 + ω·2 + 1, ω^17 + ω^4.

Theorem 14.5. For every ordinal α let V_α = {x ∈ V | ρ(x) < α}. Then V_α is a set and ρ(V_α) = α.

Proof. Define a function G by

G(α, x) = ⋃{P(z) | ∃y ((y, z) ∈ x)}.

Let F be the function obtained from G by transfinite recursion on ordinals. That is, F satisfies

F(α) = ⋃{P(F(β)) | β < α}

for all ordinals α. Now one can show by transfinite induction that F maps ordinals to sets. F(∅) = ∅ is a set. If α is a successor ordinal and α = S(β) then F(α) = P(F(β)) and F(α) is a set. If α is a limit ordinal then F(α) = ⋃F[α] and F(α) ∈ V by the Axiom of Replacement.

The equality V_α = F(α) is shown by transfinite induction. That is, assuming that equality holds for all β ∈ α, one has to show that the equality holds for α as well.

If x ∈ F(α) then x ∈ P(F(β)) and x ⊆ F(β) for some β ∈ α. By induction hypothesis, ρ(y) < β for all y ∈ x. Thus ρ(x) < S(β) ≤ α and x ∈ V_α.

If x ∈ V_α then ρ(x) = β < α for some β. Every y ∈ x satisfies ρ(y) < β and y ∈ V_β. By induction hypothesis, V_β = F(β). Since F(β) is a set by the Axiom of Replacement, P(F(β)) exists and x ∈ P(F(β)). It follows that x ∈ F(α).

So F(α) and V_α have the same elements. By the Axiom of Extensionality they are equal. Thus the mapping α ↦ V_α is a function and V_α is a set for every ordinal α.

On one hand, V_α consists only of elements x with ρ(x) < α. Thus ρ(V_α) ≤ α. On the other hand, every β < α satisfies ρ(β) = β and β ∈ V_α. So α ⊆ V_α and ρ(V_α) ≥ α. Thus ρ(V_α) = α.
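For finite n the recursion F(n) = ⋃{P(F(m)) | m < n} from the proof can be evaluated explicitly. The following Python sketch is an illustration only; it assumes sets are modelled as nested frozensets, and the helper names powerset, rank and V are chosen for this example. It is only practical for n ≤ 4, since V_5 already has 65536 elements.

```python
from itertools import combinations


def powerset(s):
    """P(s) as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def rank(x):
    """Finite rank: ρ(x) = max(ρ(y) + 1 for y ∈ x), with ρ(∅) = 0."""
    return max((rank(y) + 1 for y in x), default=0)


def V(n):
    """F(n) = ⋃{P(F(m)) | m < n} for a natural number n."""
    result = frozenset()
    for m in range(n):
        result |= powerset(V(m))
    return result


sizes = [len(V(n)) for n in range(5)]
```

One can then confirm on this small scale that every member of V(4) has rank below 4 and that rank(V(4)) is 4, as Theorem 14.5 states for V_4.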

Exercise 14.6. V_ω has been defined twice. Let A be the version of V_ω as defined in Definition 7.5, that is let A consist of all hereditarily finite sets. Let B = ⋃{V_n | n < ω} = {x ∈ V | ρ(x) < ω} be the version defined here. Show that both definitions coincide, that is, show A ⊆ B ∧ B ⊆ A.

Show that B contains ∅, is closed under unions of two sets and is closed under the operation forming {v} from v. Thus, by Theorem 7.9, A ⊆ B.

Show by induction that all members of V_n with n < ω are hereditarily finite. Thus B ⊆ A.

Proposition 14.7. The deﬁnition of the function F from Theorem 14.5 can be

extended to all x ∈ V by the condition

F(x) = ⋃{P(F(y)) | y ∈ x}.

For all x ∈ V , F(x) = F(ρ(x)).

Proof. This is proven by transﬁnite induction. So for any given x ∈ V , one has to

show that F(x) = F(ρ(x)) provided that F(y) = F(ρ(y)) for all y ∈ x.

If x = ∅ this directly follows from ρ(0) = 0. So consider the case that x is nonempty. From the definition and the inductive hypothesis one has that F(x) = ⋃{P(F(y)) | y ∈ x} = ⋃{P(F(ρ(y))) | y ∈ x}. Note that F(α) ⊆ F(β) and P(F(α)) ⊆ P(F(β)) whenever α, β are ordinals with α ≤ β. Furthermore, α < ρ(x) iff there is y ∈ x with α ≤ ρ(y). So one can add P(F(α)) to the union for all α < ρ(x) without changing the outcome: F(x) = ⋃{P(F(α)) | α < ρ(x)}. It follows from Theorem 14.5 that F(x) = F(ρ(x)).

15 Arithmetic on Ordinals

Addition and multiplication are defined inductively. The first parameter is fixed and the induction goes over the second one. The basic idea of addition of ordinals is that it has an easy geometric interpretation and that it can be reversed: for all ordinals α, β with β > α there is a unique ordinal γ with α + γ = β.

Deﬁnition 15.1. For ordinals α and β, one can deﬁne the addition by transﬁnite

induction: α + 0 = α and, for β > 0,

α + β = sup{S(α + γ) | γ ∈ β}.

Alternatively, one can also say that α + β is the unique ordinal which is order-

isomorphic to the set {0} × α ∪ {1} × β = {(0, γ) | γ ∈ α} ∪ {(1, δ) | δ ∈ β}

equipped with lexicographic ordering.

Remark 15.2. Notice that the addition of ordinals is not commutative. For example, ω + 1 ≠ 1 + ω = ω. Furthermore, if α > β, one can define α − β to be the unique ordinal γ with β + γ = α. This ordinal is the one which is isomorphic to the well-ordered set ({δ ∈ α | δ ∉ β}, <). That is, arithmetic and set-theoretic difference coincide up to isomorphism for ordinals.

Deﬁnition 15.3. Multiplication can also be deﬁned by transﬁnite recursion: α· 0 = 0

and, for β > 0, α · β = sup{(α · γ) +α | γ ∈ β}.

Alternatively, one can deﬁne α · β to be the unique ordinal isomorphic to the set

β ×α equipped with the lexicographic ordering.

Again, the multiplication of ordinals is not commutative. For example,

ω · 2 = ω + ω ≠ ω = 2 · ω.

Definition 15.4. α^0 = 1, α^1 = α, α^2 = α · α, α^3 = α^2 · α and α^{S(n)} = α^n · α.

Definition 15.5. Let C_fin be the class of all functions F which map ordinals to natural numbers with the additional constraint that F(α) = 0 for all but finitely many ordinals α. For F, G define that F < G iff F ≠ G and F(α) < G(α) for the largest ordinal α with F(α) ≠ G(α). Map an ordinal α to that function F ∈ C_fin for which {G ∈ C_fin | G < F} is order isomorphic to α; this function is denoted by F_α from now on. Define an addition ⊕ on the ordinals by letting α ⊕ β be that ordinal γ for which the equation

∀δ (F_α(δ) + F_β(δ) = F_γ(δ))

holds. So the idea is to make an isomorphism between the ordinals and the ordered free commutative semigroup over them.
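The pointwise addition behind ⊕ can be illustrated on a small fragment. The sketch below is not from the notes and makes a simplifying assumption: only functions F ∈ C_fin that vanish outside the natural numbers are represented, as Python dictionaries mapping an argument δ to the nonzero value F(δ). The name natural_sum is chosen for this example.

```python
def natural_sum(F, G):
    """Pointwise sum H(δ) = F(δ) + G(δ); zero values are not stored."""
    H = {d: F.get(d, 0) + G.get(d, 0) for d in set(F) | set(G)}
    return {d: v for d, v in H.items() if v != 0}


# Hypothetical sample functions: G_1 maps 1 to 1, G_0 maps 0 to 1.
G_1 = {1: 1}
G_0 = {0: 1}
left = natural_sum(G_1, G_0)
right = natural_sum(G_0, G_1)
```

Since addition of natural numbers is commutative, left and right are the same dictionary; the order of the arguments plays no role in this pointwise operation.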

Exercise 15.6. Which of the following statements are true and which are false?

1. The addition ⊕ is commutative.

2. There are ordinals α, β such that α + β and β + α both differ from α ⊕ β.

3. There are ordinals α, β such that α + β > α ⊕ β.

4. There are ordinals α, β such that α < β and α ⊕ γ ≠ β for all ordinals γ.

5. There are ordinals α, β such that α < β and α + γ ≠ β for all ordinals γ.

Remark 15.7. Given F ∈ C_fin, let n be the number of ordinals which F does not map to 0 and let α_{n−1}, α_{n−2}, ..., α_1, α_0 be these ordinals in descending order. Furthermore, let G_α be the function mapping α to 1 and all other ordinals to 0. Then

F = G_{α_{n−1}} · F(α_{n−1}) + G_{α_{n−2}} · F(α_{n−2}) + ... + G_{α_1} · F(α_1) + G_{α_0} · F(α_0).

Definition 15.8. Let ω^α denote the ordinal represented by G_α. Given any ordinal β > 0, consider the function F_β ∈ C_fin which is isomorphic to β. Let n be the number of places where F_β is not 0 and let the ordinals α_{n−1}, α_{n−2}, ..., α_1, α_0 be these n places. Let m_k = F_β(α_k) for all k ∈ n. Then

β = ω^{α_{n−1}} · m_{n−1} + ω^{α_{n−2}} · m_{n−2} + ... + ω^{α_1} · m_1 + ω^{α_0} · m_0.

This unique representation is called the Cantor Normal Form of β. The Cantor Normal Form of 0 is the void sum.

Proposition 15.9. ω^0 = 1 and ω^α = sup{ω^β · m | β < α ∧ m ∈ N} for ordinals α > 0. In particular, Definitions 15.4 and 15.8 coincide for ω^n with n ∈ N.

Proof. This proposition uses Definition 15.8 and the equivalence to Definition 15.4 is established in the last paragraph for the case n ∈ N. Since 0 is represented by the void sum, ω^0 is greater than 0 and takes the next value 1.

Clearly ω^α > ω^β · m for all β ∈ α and m ∈ N. On the other hand, let γ be an ordinal with 1 ≤ γ < ω^α. γ can be represented as

γ = ω^{α_{n−1}} · m_{n−1} + ω^{α_{n−2}} · m_{n−2} + ... + ω^{α_1} · m_1 + ω^{α_0} · m_0

where n ∈ N − {0} and m_k > 0 for all k ∈ n. Recall that F_γ is the function in C_fin representing γ and G_α represents ω^α. Let δ be the largest ordinal with F_γ(δ) ≠ G_α(δ). Since F_γ < G_α, F_γ(δ) < G_α(δ) = 1 and δ = α. Thus F_γ(δ) = 0 and δ > α_{n−1} > α_{n−2} > ... > α_1 > α_0. Now let β = α_{n−1} and m = Σ_{k∈n} m_k. Then F_γ ≤ G_β · m < G_α and γ ≤ ω^β · m < ω^α. Thus ω^α is indeed the supremum of all ω^β · m with β ∈ α and m ∈ N.

Note that the equality ω^0 = 1 coincides with Definition 15.4. Assume now that the equivalence is established for some n ∈ N. Then, using the definition of ω^{S(n)} and of the multiplication, one has that ω^{S(n)} = sup_{m∈N} ω^n · m. Since the sequence ω^0, ω^1, ..., ω^n is increasing, ω^{S(n)} is also by Definition 15.4 the supremum of all ω^k · m with k ∈ S(n) and m ∈ N, thus the equivalence of both definitions transfers to S(n) and the last statement of the proposition follows by induction.

Example 15.10. One can view the Cantor Normal Form as a finite sum of powers of ω in descending order as in this example:

ω^5 + ω^2 + ω^2 + ω^2 + ω^1 + ω^0 + ω^0 = ω^5 + ω^2 · 3 + ω + 2.

Instead of repeating the same ordinals, one can also multiply them with the corresponding natural number; instead of ω^1, one can write just ω, instead of ω^0 just 1. The void sum is represented by the symbol 0. This all is done on the right hand side of the equation above. Also transfinite ordinals can be in the power:

ω^{ω^2} + ω^{ω·5+8} · 7 + ω^{ω·5+7} · 12345 + ω^{22222} · 33333 + ω^4 + ω^3 + ω^2 + ω + 1.

If one adds ordinals ω^α · a + ω^β · b with α < β, then ω^α · a can be omitted; if α = β the coefficients can be added giving ω^α · (a + b); if α > β, no simplification is possible:

ω^3 + ω^5 = ω^5;
ω^3 · 5 + ω^4 · 8 + ω^5 · 0 = ω^4 · 8;
ω^3 · 234 + ω^3 · 111 = ω^3 · 345;
ω^5 + ω^3 + ω^2 + ω^3 + ω^1 = ω^5 + ω^3 · 2 + ω.

The last line has the application of two rules: first ω^2 is omitted as it is in front of a higher ω-power; second the two ω^3-terms are unified to one; no further simplification is possible.
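The three simplification rules can be turned into a small program. The following Python sketch is an illustration under a restricting assumption: only ordinals below ω^ω are handled, so every exponent is a natural number. An ordinal is represented as a list of (exponent, coefficient) pairs with strictly descending exponents and positive coefficients; the names term, add and total are chosen for this example.

```python
from functools import reduce


def term(exp, coef):
    """The ordinal ω^exp · coef; a zero coefficient gives 0, the void sum."""
    return [(exp, coef)] if coef > 0 else []


def add(a, b):
    """Ordinal sum a + b of two Cantor Normal Forms."""
    if not b:
        return list(a)
    lead_exp, lead_coef = b[0]
    # Terms of a below the leading power of b are omitted.
    kept = [(e, c) for e, c in a if e > lead_exp]
    # A term of a with the same power is merged into the leading term of b.
    same = sum(c for e, c in a if e == lead_exp)
    return kept + [(lead_exp, lead_coef + same)] + list(b[1:])


def total(*summands):
    """Sum of several ordinals, evaluated from left to right."""
    return reduce(add, summands, [])


last_line = total(term(5, 1), term(3, 1), term(2, 1), term(3, 1), term(1, 1))
```

The value last_line is [(5, 1), (3, 2), (1, 1)], that is, ω^5 + ω^3 · 2 + ω as in the last line above. The function add is not commutative: add(term(1, 1), term(0, 1)) keeps both terms, while add(term(0, 1), term(1, 1)) returns just ω.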

The Cantor Normal Form can also be used in order to express the rank of sets. Recall the following rules:

• The rank of an ordinal α is α. So ρ(0) = 0, ρ(1) = 1, ρ(ω) = ω.

• The rank of sets is determined by the rank of their elements. For example, ρ({x, y, z}) = max{ρ(x) + 1, ρ(y) + 1, ρ(z) + 1} and ρ({ω, ω · 2, ω + 5}) = ω · 2 + 1.

• In general, ρ(X) = sup{ρ(Y) + 1 | Y ∈ X}.

Then, one can get the following ranks expressed in Cantor Normal Form:

ρ({{∅}}) = 2;
ρ({∅, {∅}}) = 2;
ρ({{{{ω}}}}) = ω + 4;
ρ({ω^4 + 2, ω^3 · 8}) = ω^4 + 3;
ρ({ω^α | α < ω^2}) = ω^{ω^2};
ρ({ω^α + ω^β | α, β < ω + 8}) = ω^{ω+7} · 2 + 1.

The Cantor Normal Form is in particular useful to denote ordinals formed by ﬁnite

sums over small powers of ω.

Exercise 15.11. Determine the Cantor Normal Form of the following ordinals.

1. ω + ω^2 + ω^3 + ω^4 + 2,

2. (ω + 3)^5 + (ω^2 + 17) · (ω + 8) + ω^12,

3. ω^2 + ω + 1 + ω^2 + ω + 1 + ω^2 + ω + 1,

4. 1 ⊕ ω ⊕ ω^2 ⊕ ω^3,

5. ω^{ω+5} + ω^{ω+2} · ω + ω^2,

6. 256^256 + ω · 42.

Exercise 15.12. Assume that α = ω^{γ_1} + ω^{γ_2} and β = ω^{δ_1} + ω^{δ_2} with γ_1 > γ_2 and δ_1 > δ_2. What condition on γ_1, γ_2, δ_1, δ_2 is equivalent to the equation α + β = α ⊕ β?

Example 15.13. An ε-number is an ordinal ε satisfying ε = ω^ε. In particular, for any ordinal α, ε_α is the first ordinal such that the set

{ε : ε ≤ ε_α ∧ ω^ε = ε}

is isomorphic to {β : β ≤ α}. The ordinal ε_0 is the limit of the sequence

ω, ω^ω, ω^(ω^ω), ω^(ω^(ω^ω)), ...

of iterated powers of ω; the brackets ( and ) are normally omitted. In particular, α < ε_0 iff α can be expressed by a formula consisting of the constants 0, 1, the power ω^β for a subformula β and addition. For example,

ω^{ω^ω} + ω^2 + ω^2 + ω^2 + 1 + 1

is such an expression. It is of course convenient to write ω^{ω^ω} + ω^2 · 3 + 2 instead.

Example 15.14. An ordinal α is called constructive or recursive iﬀ α < ω or there is

a relation ⊏ on N × N which can be computed by a computer programme such that

(N, ⊏) is isomorphic to α.

The ordinal ω^ω is constructive. There is a one-to-one enumeration of all polynomials p_0, p_1, ... in ω where the coefficients are natural numbers. Now let n ⊏ m iff the polynomials p_n, p_m are different and satisfy a_n < a_m for the coefficients a_n in p_n and a_m in p_m for the largest power ω^k where these coefficients are different. So

ω^5 + ω^3 · 2 ⊐ ω^5 + ω^3 ⊐ ω^5 + ω^2 ⊐ ω^4 · 17 + ω^3 ⊐ ω^4 · 17 ⊐ ω^2 + 1 ⊐ ω ⊐ 1243134123412342.

An equivalent definition is that p_n ⊏ p_m iff p_n(x) < p_m(x) for almost all natural numbers x when x replaces ω viewed upon as a formal variable in the polynomials p_n, p_m.
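That this ordering can be computed by a programme is easy to see on the level of coefficient lists. The Python sketch below is an illustration and not part of the notes: a polynomial is assumed to be given as a tuple (a_0, a_1, ..., a_k) of natural-number coefficients, where a_k belongs to ω^k, and the names less and evaluate are chosen here. The enumeration p_0, p_1, ... itself is not implemented.

```python
def less(p, q):
    """p ⊏ q: compare the coefficients at the largest power where p and q differ."""
    n = max(len(p), len(q))
    p = tuple(p) + (0,) * (n - len(p))
    q = tuple(q) + (0,) * (n - len(q))
    for k in reversed(range(n)):
        if p[k] != q[k]:
            return p[k] < q[k]
    return False  # equal polynomials are not related


def evaluate(p, x):
    """Value of the polynomial when the natural number x replaces ω."""
    return sum(c * x**k for k, c in enumerate(p))


# The descending chain from the text, largest element first.
chain = [
    (0, 0, 0, 2, 0, 1),       # ω^5 + ω^3·2
    (0, 0, 0, 1, 0, 1),       # ω^5 + ω^3
    (0, 0, 1, 0, 0, 1),       # ω^5 + ω^2
    (0, 0, 0, 1, 17),         # ω^4·17 + ω^3
    (0, 0, 0, 0, 17),         # ω^4·17
    (1, 0, 1),                # ω^2 + 1
    (0, 1),                   # ω
    (1243134123412342,),      # a natural number
]
```

For any x larger than all coefficients involved, the coefficient tuples are base-x digit expansions, so comparing evaluate(p, x) with evaluate(q, x) gives the same answer as less(p, q); this is one way to see the equivalence of the two definitions.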

Other examples of constructive ordinals are ε_0, ε_1, ..., ε_ω, ε_{ω+1}. The first non-constructive ordinal is ω^CK, named after the mathematicians Church and Kleene who studied this ordinal.

Example 15.15. There is a first ordinal ω_1 such that the set {α : α < ω_1} representing ω_1 is not countable. It is larger than all previously considered ordinals. For example, 121234312 < ω < ω^ω + 234123443124123 < ε_0 < ε_1 < ε_2 < ε_5 + ω^7 < ω^CK < ω^CK + ω · 17 + 4 < ω_1.

16 Cardinals

There are two different usages of the natural numbers: First to denote the quantity of something, say “three” represents the set {apple, banana, pear} of three fruits. Second to induce an order, so the third word “pear” comes after the second word “banana”. The English language reflects these two ways to use numbers by having the different words “three” and “third”. When dealing with infinite objects, it is even more necessary to distinguish cardinals (representing the quantity) and ordinals (representing an order): object number ω + 8 should come after object ω + 7 and not before it. But the cardinality of the sets represented by these two ordinals is the same, since f given as

f(α) = S(α) if α ∈ ω;
f(α) = 0 if α = ω + 7;
f(α) = α if ω ⊆ α ∧ α ∈ ω + 7;

is a bijective function from ω + 8 to ω + 7. So one would want to assign to ω + 8 and ω + 7 the same cardinal. This is done by defining that the cardinal α of a set A is the least ordinal such that there is a bijective mapping from A to α; note that the Axiom of Choice defined below is required to guarantee that every set has a cardinal.

Deﬁnition 16.1. An ordinal α is a cardinal (or cardinal number) if |β| < |α| for all

β ∈ α. A cardinal α is called the cardinal number (or sometimes the cardinality) of A,

denoted by α = |A|, if and only if |α| = |A|.

Example 16.2. Let α be an ordinal.

1. If α is a cardinal then α = |α|.

2. If α ≤ ω then α is a cardinal. ω is the least inﬁnite cardinal.

3. ω + 1, ω + 17, ω^2, ω^51 are not cardinals. In particular, if α is countable and α > ω then α is not a cardinal.

Theorem 16.3. For every ordinal β there is a unique cardinal α such that α = |β|

and α ≤ β. Furthermore, if A is well-orderable then there is a unique cardinal α such

that α = |A|.

Proof. The set {γ ∈ S(β) | |γ| = |β|} has a minimum α. Then |α| = |β| but |γ| < |β|

for all γ < α. It follows that α is a cardinal, that is, α = |β|.

Given a well-ordered set (A, ⊏), there is by Theorem 13.9 a unique ordinal β

representing (A, ⊏) in the sense that (A, ⊏) is isomorphic to (β, <). So there is a

cardinal α such that α = |β| and therefore also α = |A|.

Given an ordinal β, there is by Theorem 13.10 an ordinal γ with |γ| ≰ |β|. Now let α be the minimum of all δ ∈ S(γ) such that |δ| ≰ |β|. Then α is the first cardinal larger than β.

Property 16.4. For every ordinal β there is a first cardinal α such that |β| < α. α is denoted as β^+.

Definition 16.5. For an ordinal β, the least cardinal α satisfying that β ∈ α is denoted by β^+. A cardinal α is called a successor cardinal if α = β^+ for some ordinal β. A cardinal is called a limit cardinal if it is not a successor cardinal. Furthermore, cardinals are denoted by alephs: ℵ_0 = ω and

ℵ_α = sup{ℵ_β^+ | β ∈ α}

for ordinals α > 0. Although cardinals are identified with the ordinals representing them, there is still the traditional name ω_α for the least ordinal β satisfying |β| = ℵ_α.

Example 16.6. So ω_1 = ω_0^+, ω_2 = ω_1^+, ω_3 = ω_2^+ and ω_ω ≠ α^+ for all ordinals α.

Due to the identification of sets and cardinals with ordinals, the natural numbers and their cardinality can all be denoted by the following symbols: N, ω, ω_0, ℵ_0. Similarly, ω_1, ω^+, ω_0^+, ℵ_1, ℵ_0^+ are all names for the first uncountable ordinal which is identified with the set representing it and its cardinal.

All ω_α are limit ordinals, but ℵ_α is a limit cardinal only if α is a limit ordinal; ℵ_α is a successor cardinal otherwise.

The addition and multiplication of cardinals is different from that of ordinals since one enforces that the result is a cardinal. So ℵ_α + 1 will be different from both ω_α + 1 (obtained by looking on ℵ_α as an ordinal) and ℵ_{α+1} = ℵ_α^+ (obtained by adding the indices).

Definition 16.7 (Arithmetic for Cardinals). For cardinals κ and λ, define κ + λ = |κ + λ|, κ · λ = |κ × λ| and 2^κ = |P(κ)|.

In the following it will be proven that the addition and the multiplication of inﬁnite

cardinals are really trivial and coincide with forming the maximum.

Proposition 16.8 (Hessenberg). If κ is an inﬁnite cardinal and λ a cardinal with

λ ≤ κ then κ +λ = λ +κ = κ.

Proof. For every infinite ordinal α, one has |α| = |S(α)| witnessed by the bijection f : S(α) → α defined as

f(β) = 0 if β = α;
f(β) = β if ω ≤ β < α;
f(β) = β + 1 if β < ω.

So S(α) cannot be a cardinal and κ is a limit ordinal. Note that κ ≤ |κ + λ| ≤ |κ × {0, 1}|. So it is sufficient to show that |κ| = |κ × {0, 1}|, using the fact that κ is an infinite limit ordinal. This is witnessed by the following function g:

g(ω · γ + n, a) = ω · γ + 2n + a

for all ordinals γ and n ∈ N such that ω · γ + n ∈ κ. Since every (β, a) ∈ κ × {0, 1} can be uniquely represented as (ω · γ + n, a) with γ being an ordinal, n ∈ N and a ∈ {0, 1}, the function g is well-defined. Furthermore, it is easy to see that g is a bijection.
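The interleaving done by g can be tried out on a small fragment. The sketch below is an illustration only and rests on an assumption: ordinals of the form ω · γ + n with natural numbers γ and n (that is, ordinals below ω · ω) are encoded as Python pairs (γ, n). The function name g follows the proof; the encoding is chosen for this example.

```python
def g(gamma, n, a):
    """g(ω·γ + n, a) = ω·γ + 2n + a, with ω·γ + n encoded as the pair (γ, n)."""
    return (gamma, 2 * n + a)


# Images of the block {ω·γ + n | γ < 3, n < 50} × {0, 1}.
images = [g(c, n, a) for c in range(3) for n in range(50) for a in (0, 1)]
```

Inside each block ω · γ + N the pairs (n, 0) and (n, 1) are sent to the even and odd positions 2n and 2n + 1, so no value is hit twice and no position is left out.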

Exercise 16.9. Construct a one-to-one function h which maps α × ω to α for any infinite limit ordinal α. One can assume without loss of generality that the input is of the form (ω · γ + n, m) where m, n ∈ N and γ is an ordinal with ω · S(γ) ≤ α; the image should be of the form ω · γ + h(n, m).

In order to see that κ × κ is the same as κ, recall the canonical well-ordering from

Example 11.5.

Definition 16.10. For an infinite ordinal κ, the canonical well-ordering of κ × κ, denoted by <_cw, is defined as follows for (α, β), (γ, δ) ∈ κ × κ:

(α, β) <_cw (γ, δ) ⇔ max{α, β} < max{γ, δ}
∨ (max{α, β} = max{γ, δ} ∧ α < γ)
∨ (max{α, β} = max{γ, δ} ∧ α = γ ∧ β < δ).
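On pairs of natural numbers the canonical well-ordering amounts to sorting by the triple (max{α, β}, α, β). The following Python sketch is an illustration, not part of the notes; the name cw_key is chosen here, and only the finite square 5 × 5 is listed.

```python
from itertools import product


def cw_key(pair):
    """Sort key for <_cw: first the maximum, then the first and then the second component."""
    a, b = pair
    return (max(a, b), a, b)


ordered = sorted(product(range(5), repeat=2), key=cw_key)
first_nine = ordered[:9]
```

The list starts with (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), ..., and the predecessors of (0, n) are exactly the pairs in n × n. So (0, n) sits at position n · n, which is the finite shadow of the fact that each initial segment of (ω × ω, <_cw) is finite.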

Theorem 16.11. For all infinite cardinals κ, (κ × κ, <_cw) ≅ (κ, ∈).

Proof. The mapping (α, β) ↦ ω^{max{α,β}·2+2} + ω^{max{α,β}+α+1} + ω^β is an isomorphism from (κ × κ, <_cw) to some subset of (ω^{κ·2+3}, ∈). Thus (κ × κ, <_cw) is a well-ordered set.

So it remains to show that the two well-ordered sets are isomorphic. Remark 11.6 states that it is true for κ = ω. Assume that there is a counterexample, say κ. Let A = {λ ∈ S(κ) | λ ≥ ω is a cardinal and (λ × λ, <_cw) ≇ (λ, ∈)}. Then κ ∈ A. Let µ ∈ A be the least element of A. Note that µ > ω. Furthermore, for all λ, if ω ≤ λ < µ and λ is a cardinal, then (λ × λ, <_cw) ≅ (λ, ∈).

By the Comparability Theorem, (µ, ∈) is isomorphic to an initial segment of (µ × µ, <_cw) since |µ × µ| ≥ µ and µ is a cardinal. Let (α, β) ∈ µ × µ be such that (µ, ∈) is isomorphic to the initial segment of (µ × µ, <_cw) given by (α, β). Let h be the isomorphism. Let η = max{α, β}. Then (α, β) ≤_cw (η, η). Hence, h : µ → η × η is injective. Let λ = |η|. Then |λ × λ| = |η × η|. But |λ × λ| = λ < µ. This is a contradiction. Hence, there is no counterexample to the theorem.

Theorem 16.12 (Hessenberg). If κ, λ are cardinals with ℵ_0 ≤ κ and 1 ≤ λ ≤ κ then κ · λ = λ · κ = κ.

Remark 16.13. Let κ be a cardinal. Recall that 2^κ = |P(κ)| by Definition 16.7 and 2^κ > κ by Theorem 6.12. Thus 2^κ ≥ κ^+. Note that 2^0 = 1 = 0^+, 2^1 = 2 = 1^+ and 2^4 = 16 > 5 = 4^+.

17 The Axiom of Choice

If a set is not empty, one can find an element in it. However, it is not guaranteed that one can find the element in a systematic way, that is, by a function. This is formalized by the Axiom of Choice.

Deﬁnition 17.1. Let X be a set of sets. A function C deﬁned on all nonempty

members of X is called a choice function of X if C(x) ∈ x for every nonempty x ∈ X.

This permits to state the Axiom of Choice and its countable counterpart as follows.

Deﬁnition 17.2 (Axiom of Choice). Let X be a set of sets. Then X has a choice

function.

Example 17.3. A choice function C on N can be deﬁned as C(S(n)) = n for all

n ∈ N.

Example 17.4. Let (W, ⊏) be a well-ordered set and let X = P(W). Then the function which assigns to every nonempty subset of W its minimum with respect to ⊏ is a choice function.

Theorem 17.5. Assuming all axioms except the Axiom of Choice, the following

conditions are equivalent:

1. The Axiom of Choice.

2. Every set can be one-to-one mapped into a set of ordinals.

3. Every set is well-orderable.

4. For all sets X, Y , either |X| < |Y | or |X| = |Y | or |Y | < |X|.

Proof. First Statement ⇒ Second Statement. Let X be any given set and u ∉ X a target which will be used to guarantee that the below mapping is invertible on X. By Theorem 13.10 there is an ordinal α such that |α| ≰ |X|. Now one constructs by transfinite induction the following f : α → X ∪ {u} where C is a choice function which is defined at least on all nonempty subsets of X. For every γ ∈ α one defines

f(γ) = C(X − f[γ]) if f[γ] ⊂ X;
f(γ) = u otherwise.

Note that whenever f(γ) ∈ X then f(γ) ∉ f[γ] and thus f does not take any element of X twice. Since |α| ≰ |X| there must be some ordinal in α which is mapped to u. Let β be the least such ordinal. Then f restricted to β is a bijection from β to X and has an inverse one-to-one function g which maps X into a set of ordinals.
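For a finite set X the same construction runs through finitely many steps and needs no axiom, but it shows how the choice function drives the enumeration. The Python sketch below is a finite analogue only, not the transfinite construction itself; the name enumerate_with_choice is chosen here, and min and max serve as two sample choice functions on sets of integers.

```python
def enumerate_with_choice(X, choice):
    """Build f(0), f(1), ... with f(γ) = C(X − f[γ]) until X is exhausted."""
    X = frozenset(X)
    f = []
    while frozenset(f) != X:
        f.append(choice(X - frozenset(f)))  # pick a not yet used element
    return f


X = {3, 1, 4, 5, 9, 2, 6}
f_min = enumerate_with_choice(X, min)
f_max = enumerate_with_choice(X, max)
# The inverse g maps each element of X to its index, a natural number.
g = {x: index for index, x in enumerate(f_min)}
```

Different choice functions give different enumerations of the same set, but each one lists every element of X exactly once, so its inverse is one-to-one.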

Second Statement ⇒ Third Statement. If g : X → Y is a one-to-one function

and Y is a set of ordinals, then g induces a well-ordering of X: for all x, y ∈ X,

x ⊏ y ⇔ g(x) ∈ g(y).

Third Statement ⇒ First Statement. Let a set X of sets be given. There is a well-ordering ⊏ on ⋃X. Now one can define a choice function C which maps every nonempty subset Y of ⋃X to its minimum with respect to ⊏. Hence C also maps every nonempty Y ∈ X to its minimum with respect to ⊏. Hence X has a choice function.

Third Statement ⇒ Fourth Statement. Let X, Y be sets and assume that |X| ≰ |Y|. There are well-orderings on X, Y and by Theorem 11.13 these sets are either order-isomorphic or one is order-isomorphic to some initial segment of the other one. Since |X| ≰ |Y|, Y is order-isomorphic to an initial segment of X and the corresponding mapping is one-to-one. Thus |Y| < |X|.

Fourth Statement ⇒ Second Statement. Given a set X, there is by Theorem 13.10 an ordinal α such that |α| ≰ |X|. Since α, X are comparable, |X| < |α|. Thus there is a one-to-one mapping from X into α.

The next two results are applications of the Axiom of Choice. They are based on the fact that for every set X there is an ordinal α and a bijection f : α → X. Then there is a cardinal κ ≤ α with κ = |α|. Furthermore, for every cardinal λ ≤ κ, f[λ] is a subset of X of cardinality λ.

Theorem 17.6. For every set X, there is a unique cardinal κ such that κ = |X|.

Furthermore, if X is inﬁnite, then X has a countable subset.

Exercise 17.7. Let A, B, C be any sets and, as in Example 3.16,

D = {f ∈ C^A | ∃g ∈ B^A ∃h ∈ C^B (f = h ◦ g)}.

Show that D = C^A iff |B| ≥ min{|A|, |C|}.

Theorem 17.8. If f is a function deﬁned on A, then |f[A]| ≤ |A|.

Proof. Let C be a choice function on all nonempty subsets of A. Now deﬁne for all

b ∈ f[A] the mapping g by g(b) = C({a ∈ A | f(a) = b}). The function g is one-to-one

and witnesses |f[A]| ≤ |A|.

Using the Axiom of Choice, one can prove the following result.

Theorem 17.9. The union of a countable set of countable sets is countable.

Proof. Let A be a countable set of countable sets, that is, every B ∈ A is countable. There is a surjective function F : N → A; F(n) is the (n+1)-st set contained in A. For each n, let E(n) = {f : N → F(n) | f is surjective}. Note that E is a function from N to the subsets of (⋃A)^N; each set E(n) has cardinality 2^{ℵ_0}. By the Axiom of Choice, there is a function g which selects from every E(n) an element g_n of this set. Now let G(n, m) = g_n(m). G is a surjective mapping from N × N to ⋃A, thus ⋃A is at most countable. Since A is not empty, there is a countable and thus infinite B ∈ A and by B ⊆ ⋃A, the set ⋃A is infinite. So ⋃A is countable.
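Once the surjections g_n are fixed, the enumeration of ⋃A via G is entirely explicit. The Python sketch below is an illustration under stated assumptions: the Cantor pairing is used to run through N × N (the proof only needs that N × N is countable), and the sample family F(n) = multiples of n + 1 with the surjections from multiples is hypothetical. No choice is needed in this toy case since the g_n are given by a formula.

```python
from math import isqrt


def unpair(k):
    """Inverse of the Cantor pairing (n, m) -> (n + m)(n + m + 1)/2 + m."""
    w = (isqrt(8 * k + 1) - 1) // 2   # w = n + m, the index of the diagonal
    m = k - w * (w + 1) // 2
    return w - m, m


def multiples(n):
    """Hypothetical example: a surjection g_n from N onto F(n) = multiples of n + 1."""
    return lambda m: (n + 1) * m


def G(k, g):
    """The k-th element of the union: G(n, m) = g_n(m) with (n, m) = unpair(k)."""
    n, m = unpair(k)
    return g(n)(m)


prefix = [G(k, multiples) for k in range(55)]
```

Every pair (n, m) is reached at some finite index k, so every element g_n(m) of the union appears somewhere in the resulting sequence.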

Corollary 17.10. The first uncountable ordinal ω_1 is not the union of a countable set of countable ordinals.

The Axiom of Choice can be used to construct an example of a set of cardinality ℵ_1.

Example 17.11. Define for A, B ⊆ N the following relations:

A ≤_lin B ⇔ ∃m, n ∈ N ∀a ∈ N (a ∈ A ⇔ a · m + n ∈ B);
A <_lin B ⇔ A ≤_lin B ∧ B ≰_lin A.

Let L ⊆ P(N) be such that L is not empty and (L, <_lin) is a linearly ordered set. If L is bounded by some A ⊆ N in the sense that ∀B ∈ L (B ≤_lin A) then |L| ≤ ℵ_0 else |L| = ℵ_1. Furthermore L can indeed be chosen such that (L, <_lin) is linearly ordered and |L| = ℵ_1; so the else-case is not only a theoretical case.

Proof. Note that ≤_lin is transitive: If A ≤_lin B and B ≤_lin C then there are m, n, i, j ∈ N such that for all a, b ∈ N, a ∈ A ⇔ a · m + n ∈ B and b ∈ B ⇔ b · i + j ∈ C. Thus for all a ∈ N, a ∈ A ⇔ a · (m · i) + (n · i + j) ∈ C. Note that ≤_lin is not antisymmetric: if A is the set of even and B of odd numbers then a ∈ A ⇔ a + 1 ∈ B and b ∈ B ⇔ b + 1 ∈ A. But by Remark 9.11, <_lin is defined from the transitive relation ≤_lin such that it is automatically transitive and antireflexive, so (P(N), <_lin) is a partially ordered set.

If L is bounded by A then one has for every B ∈ L a pair (m_B, n_B) such that ∀b ∈ N (b ∈ B ⇔ b · m_B + n_B ∈ A). If C ≠ B and C ≤_lin A then (m_C, n_C) ≠ (m_B, n_B). Thus L is at most countable.

If L is at most countable then there is a surjective function F from N to L, that is, L = {F(0), F(1), ...}. Now let

A = {(2b + 1) · 2^a | b ∈ F(a)}.

It is easy to see that F(a) ≤_lin A by ∀b ∈ N (b ∈ F(a) ⇔ b · 2^{a+1} + 2^a ∈ A).
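The coding of countably many sets into one set A can be tested on a finite sample. The Python sketch below is an illustration and not part of the notes: the family F is a hypothetical list of four finite sets, and the name join is chosen here for the construction of A.

```python
def join(F):
    """A = {(2b + 1) · 2^a | b ∈ F(a)} for a finite list F of sets of natural numbers."""
    return {(2 * b + 1) * 2**a for a, members in enumerate(F) for b in members}


# Hypothetical sample family F(0), F(1), F(2), F(3).
F = [{0, 2, 4}, {1, 3}, set(), {5}]
A = join(F)


def reduces(a, b):
    """Right-hand side of the equivalence: is b · 2^(a+1) + 2^a in A?"""
    return b * 2 ** (a + 1) + 2**a in A
```

Every positive natural number can be written in exactly one way as (2b + 1) · 2^a, so the set F(a) can be read off from A with the parameters m = 2^{a+1} and n = 2^a, which is what the reduction F(a) ≤_lin A requires.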

So let L be unbounded. L is uncountable by the previous paragraphs. Now one constructs functions f : ω_1 → P(N) and g : ω_1 × ω × ω → P(N) with f[ω_1] ⊆ L ⊆ g[ω_1, ω, ω] in order to witness that |L| = ℵ_1. The construction uses transfinite recursion and a choice function C defined on all nonempty subsets of P(N).

So, for given α, assume that f(β) and g(β, i, j) are deﬁned for all β ∈ α and

i, j ∈ N. Now let

f(α) = C(L −g[α, ω, ω]);

g(α, i, j) = {a | a · i + j ∈ f(α)}.

The resulting functions f, g are then defined on the whole sets ω_1 and ω_1 × ω × ω, respectively. Now the following properties hold.

• For all α ∈ ω_1, the set g[α, ω, ω] = {g(β, i, j) | β ∈ α, i, j ∈ ω} is at most countable. Hence L − g[α, ω, ω] is not empty and f(α) = C(L − g[α, ω, ω]) is an element of L outside the set g[α, ω, ω].

• For all α ∈ ω_1 and all A ⊆ N, A ≤_lin f(α) ⇔ ∃i, j ∈ N (A = g(α, i, j)). In particular, f(α) = g(α, 1, 0).

• If α, β ∈ ω_1 with β < α then f(β) <_lin f(α). The reason is that all sets A ≤_lin f(β) are in g[α, ω, ω] and thus f(α) = C(L − g[α, ω, ω]) ≰_lin f(β). As (L, <_lin) is linearly ordered, f(β) <_lin f(α).

It follows that (f[ω_1], <_lin) is a linearly ordered set isomorphic to (ω_1, ∈) and f[ω_1] ⊆ L. The set f[ω_1] has cardinality ℵ_1. Let A ∈ L. Then there is some B ∈ f[ω_1] with B ≰_lin A. Hence A <_lin B. Furthermore, there is an α ∈ ω_1 with B = f(α). It follows that A = g(α, i, j) for some i, j ∈ N and A ∈ g[ω_1, ω, ω]. So f[ω_1] ⊆ L ⊆ g[ω_1, ω, ω]. It follows that |L| = ℵ_1.

At the end it is shown that there is indeed such an uncountable and unbounded linearly ordered subset of (P(N), <_lin). This is done by constructing an order-preserving mapping h from (ω_1, ∈) into (P(N), <_lin) via transfinite recursion and using the choice function C on P(P(N)):

h(α) = C({A ⊆ N | ∀β ∈ α (h(β) <_lin A)})

for all α ∈ ω_1. It is clear from the construction that the mapping is order preserving; one only has to verify that the set

{A ⊆ N | ∀β ∈ α (h(β) <_lin A)}

is not empty for any α ∈ ω_1. To see this, note that h[α] is at most countable and that there is therefore a set B ⊆ N such that B ≰_lin h(β) for every β ∈ α. It follows that {B} ∪ h[α] is at most countable. As argued above, there is an A ⊆ N bounding every set in {B} ∪ h[α]. Then h(β) <_lin A as h(β) ≤_lin A and B ≰_lin h(β) but B ≤_lin A. So there is a proper upper bound of h[α] and the value h(α) is such an upper bound selected by the choice function. The linearly ordered set (h[ω_1], <_lin) is order-isomorphic to (ω_1, ∈), has cardinality ℵ_1 and is a subset of the partially ordered set (P(N), <_lin). This completes the proof.

Exercise 17.12. Consider the following partial ordering given on the set $\mathbb{N}^{\mathbb{N}}$ of all functions from N to N:

$$f \sqsubset g \Leftrightarrow \exists n\, \forall m > n\,(f(m) < g(m)).$$

This partial ordering shares some but not all of the properties of the ordering $<_{\mathrm{lin}}$ considered above. In order to see this, show the following two properties:

• For countably many functions $f_0, f_1, \ldots$ there is a function g such that $\forall n \in \mathbb{N}\,(f_n \sqsubset g)$;

• There are uncountably many f below the exponential function $n \mapsto 2^n$. Namely, for every $A \subseteq \mathbb{N}$ the function $c_A : n \mapsto \sum_{m \in n} 2^{n-m-1} \cdot A(m)$ is below the exponential function.

Note that $c_A \sqsubset c_B \Leftrightarrow A <_{\mathrm{lex}} B$. Thus there is an uncountable linearly ordered set of functions below the exponential function.
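To get a feeling for the functions $c_A$, one can evaluate them for small arguments. The sketch below is only an illustration on two hypothetical finite sets `A` and `B`, checked up to an arbitrary cut-off of 20; it does not replace the proof asked for in the exercise.

```python
def c(A, n):
    """c_A(n): the binary number whose digits are A(0), A(1), ..., A(n-1)."""
    return sum(2 ** (n - m - 1) for m in range(n) if m in A)


A = {0, 2}  # characteristic sequence 1, 0, 1, 0, 0, ...
B = {0, 1}  # characteristic sequence 1, 1, 0, 0, 0, ...

# c_A stays strictly below the exponential function on the tested range.
below_exponential = all(c(A, n) < 2 ** n for n in range(20))

# A and B first differ at position 1, where A has 0 and B has 1,
# so c_A(m) < c_B(m) from m = 2 onwards on the tested range.
eventually_smaller = all(c(A, m) < c(B, m) for m in range(2, 20))
```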

Exercise 17.13. Use the Axiom of Choice to prove the following: If $|A| = \aleph_1$ and every $B \in A$ satisfies $|B| \leq \aleph_1$ then $|\bigcup A| \leq \aleph_1$.

18 The Set of Real Numbers

The real numbers are one of the most important topics of mathematics. This section deals with some basic properties of this set. In particular, several ways to represent the set of real numbers are proposed. Unlike in the case of the natural numbers, there is no standard convention for how to do it. The given representations are built in the standard way using already defined objects like sequences of digits or subsets of the rational numbers. It is convenient to introduce representations of the integers first.

Example 18.1. The set of integers can be represented as the set of ordered pairs of natural numbers in which at least one component is 0:

$$\mathbb{Z} = \{(m, n) \in \mathbb{N} \times \mathbb{N} \mid m = 0 \lor n = 0\}.$$

The pair (m, n) represents the integer normally denoted by m − n, so (10, 0) is 10 and (0, 4) is −4. The addition of two integers (i, j) and (k, l) can be defined as follows.

$$(i, j) + (k, l) = (m, n) \Leftrightarrow (m, n) \in \mathbb{Z} \land \exists h \in \mathbb{N}\,(i + k = m + h \land j + l = n + h)$$

Furthermore, (i, j) < (k, l) if and only if i + l < j + k as natural numbers.

Note that this representation has the disadvantage that it recodes the natural numbers in a nonstandard way, replacing n by {n, {n, 0}}. An alternative approach would be to leave the natural numbers unchanged, to represent −1 by {{∅}} and, for all n > 0, −n − 1 by S(−n) = −n ∪ {−n}. So −2 would be {{∅}, {{∅}}} and −3 would be {{∅}, {{∅}}, {{∅}, {{∅}}}}. The disadvantage of this representation is that the addition and other operations are a bit more difficult to define.
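The pair representation of the integers can be modelled directly. This is a small sketch, assuming Python tuples of non-negative `int` values stand in for the pairs; the helper names `normalize`, `add` and `less` are illustrative and not part of the text. Subtracting the common part `h` corresponds to the witness h in the defining formula for addition.

```python
def normalize(m, n):
    """Return the unique pair with one component 0 that represents m - n."""
    h = min(m, n)
    return (m - h, n - h)


def add(x, y):
    """(i, j) + (k, l): add componentwise, then remove the common part h."""
    (i, j), (k, l) = x, y
    return normalize(i + k, j + l)


def less(x, y):
    """(i, j) < (k, l) iff i + l < j + k as natural numbers."""
    (i, j), (k, l) = x, y
    return i + l < j + k


ten = (10, 0)
minus_four = (0, 4)
total = add(ten, minus_four)
```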

Theorem 18.2. $|\mathbb{R}| = |\mathbb{N}^{\mathbb{N}}|$.

Proof. To see that $|\mathbb{N}^{\mathbb{N}}| \leq |\mathbb{R}|$, define the function $F : \mathbb{N}^{\mathbb{N}} \to \mathbb{R}$ as follows:

$$F(f) = \sum_{n \in \mathbb{N}} 10^{-\sum_{m=0,\ldots,n} S(f(m))}$$

That is, the decimal representation of the number F(f) is $0.0^{f(0)}1\,0^{f(1)}1\,0^{f(2)}1\,0^{f(3)}1\ldots$, where $0^k$ stands for a block of k zeros, and the injectivity follows from the fact that one can reconstruct f from the representation of its image. For example, F(f) = 0.100000001000101 . . . iff f(0) = 0, f(1) = 7, f(2) = 3 and f(3) = 1. Since one deals with decimal and not with binary representation, the numbers $0.100000\ldots = \frac{1}{10}$ and $0.0111111\ldots = \frac{1}{90}$ are different, so no confusion is caused by the images of functions which are almost everywhere 0.

For the converse direction, one takes a one-to-one enumeration $q_0, q_1, \ldots$ of Q and constructs the following one-to-one mapping from R to $\mathbb{N}^{\mathbb{N}}$:

$$G(r)(n) = \begin{cases} 0 & \text{if } r < q_n; \\ 1 & \text{if } r = q_n; \\ 2 & \text{if } r > q_n. \end{cases}$$

The function G is one-to-one. Let $r_1, r_2$ be two different real numbers, say $r_1 < r_2$. There is a number n such that $r_1 < q_n < r_2$ since Q is a dense subset of R. It follows that $G(r_1)(n) = 0$ and $G(r_2)(n) = 2$. Thus $G(r_1)$ and $G(r_2)$ are different members of $\mathbb{N}^{\mathbb{N}}$.

By the Cantor-Bernstein Theorem, it follows from $|\mathbb{R}| \leq |\mathbb{N}^{\mathbb{N}}|$ and $|\mathbb{N}^{\mathbb{N}}| \leq |\mathbb{R}|$ that these two sets have the same cardinality.
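The injection F from the first half of the proof can be illustrated on finite prefixes of a function f. The sketch below assumes a Python list holds the values f(0), f(1), ... of such a prefix; the names `digits` and `decode` are illustrative. Each value v contributes a block of v zeros followed by a single 1, so the values can be read back from the digit string.

```python
def digits(prefix):
    """Decimal digits after the point contributed by the values f(0), f(1), ...: f(n) zeros, then a 1."""
    return "".join("0" * value + "1" for value in prefix)


def decode(digit_string):
    """Read the values back by counting the zeros in front of each 1."""
    values, run = [], 0
    for ch in digit_string:
        if ch == "0":
            run += 1
        else:
            values.append(run)
            run = 0
    return values


expansion = digits([0, 7, 3, 1])
roundtrip = decode(expansion)
```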

In the following, explicit representations are given and the addition and ordering are defined on them. Set theorists do not care much how real numbers are represented. If there are two representations, one can go from one to the other with a bijective function f and then carry over the operations: if addition is defined on the image of f, then the domain of f inherits the definition by

$$x + y = f^{-1}(f(x) + f(y))$$

and so on. Here are some examples based on the idea of representing real numbers by digits.

Exercise 18.3. Show that the standard representation can be defined in set theory: First define a representation for the set A = Z ∪ {sign}. Then look at the class of all functions r : A → {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, −}, call it B; the decimal point can be placed between r(0) and r(−1) and need not be represented explicitly.

Define which elements of B represent real numbers and obtain R by comprehension; state the property explicitly. For this and further definitions, integer constants, integer addition and < on the integers can be used in order to deal with positions of digits. The selection should be made such that r represents $\sum_{z \in \mathbb{Z}} r(z) \cdot 10^{z}$ in the case that r(sign) is + and $-\sum_{z \in \mathbb{Z}} r(z) \cdot 10^{z}$ in the case that r(sign) is −. Make sure that every real occurs in the representation exactly once. For example, fix the sign of 0 to either + or −.

This representation has the disadvantage that $\mathbb{N} \not\subseteq \mathbb{R}$. So one distinguishes, as in many programming languages like FORTRAN, between the natural number 2 and the real number 2.0. Nevertheless, there is a one-to-one mapping f : N → R which maps every natural number to its representative in R. f can be defined inductively using a g : R → R such that f(S(n)) = g(f(n)) for all natural numbers n. Give two properties nat, succ such that nat(r) is true iff r is in the range of f and succ(r, q) is true iff nat(r) ∧ nat(q) ∧ q = g(r).

In the early days of computing, integers were represented by bytes, more precisely,

they were limited to the numbers −128 up to 127. The negative numbers started

with a 1 and the positive (including 0) with a 0. So one had that −128 = 10000000,

−127 = 10000001, . . ., −2 = 11111110, −1 = 11111111, 0 = 00000000, 1 = 00000001,

2 = 00000010, . . ., 127 = 01111111. The next exercise shows how to transfer this idea

to the representation of the reals.
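The byte representation described above is what is usually called two's complement. As a small illustration, the following sketch converts between integers in the range −128 to 127 and their 8-bit strings; the function names `to_byte` and `from_byte` are hypothetical helpers chosen for this example.

```python
def to_byte(value):
    """Return the 8-bit string of an integer between -128 and 127."""
    if not -128 <= value <= 127:
        raise ValueError("value does not fit into one signed byte")
    # Negative numbers are shifted by 256, so they start with the bit 1.
    return format(value % 256, "08b")


def from_byte(bits):
    """Return the integer represented by an 8-bit string."""
    raw = int(bits, 2)
    return raw - 256 if bits[0] == "1" else raw


table = {value: to_byte(value) for value in (-128, -127, -2, -1, 0, 1, 2, 127)}
```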

Exercise 18.4. Consider the following set WS representing the reals Without Sign:

$$WS = \{r : \mathbb{Z} \to \{0, 1\} \mid \forall n \in \mathbb{Z}\, \exists m < n\,(r(m) = 0) \land \exists n \in \mathbb{Z}\, \forall m > n\,(r(m) = r(m-1))\}$$

One can define on WS an addition +. Let (r + q)(n) = 1 if one of the following three conditions holds:

1. r(n) ≠ q(n) and r(k) ≠ q(k) for all k < n;

2. r(n) = q(n) and there is an m < n such that r(m) = 1 and q(m) = 1 and r(k) ≠ q(k) for all k with m < k < n;

3. r(n) ≠ q(n) and there is an m < n such that r(m) = 0 and q(m) = 0 and r(k) ≠ q(k) for all k with m < k < n.

Let (r + q)(n) = 0 otherwise. From + one can define an ordering < on WS by

$$r < q \Leftrightarrow \exists s \in WS\,(q = r + s \land \exists n \in \mathbb{Z}\, \forall m > n\,(s(m) = 0)).$$

Verify that (WS, +) is a commutative group: show that the function null mapping every element of Z to 0 is the neutral element, that q + r = r + q for all r, q ∈ WS, that r + (q + s) = (r + q) + s for all r, q, s ∈ WS and that for every r ∈ WS there is a q ∈ WS with r + q = null. Show that < is an ordering of WS.

Definition 18.5. A ⊆ R is open iff for every a ∈ A there is an ε > 0 such that {b ∈ R | a − ε < b < a + ε} is a subset of A. A set A ⊆ R is closed iff R − A is open. A point a ∈ A is isolated iff there is an open set B with A ∩ B = {a}. A set is perfect iff it is closed and does not have isolated points. Say that a ∈ A is approximable from above in A iff a = inf{b ∈ A | b > a}. A set A is compact iff it is closed and bounded, in the sense that there are b, c ∈ R with b < a < c for all a ∈ A.

Example 18.6. Let a, b ∈ R with a < b. The closed interval {r ∈ R | a ≤ r ≤ b} is

a closed set, perfect and compact. The open interval {r ∈ R | a < r < b} is an open

set. Every open set is the union of open intervals.

Remark 18.7. A pair (X, Y) is called a Hausdorff space iff the following four axioms hold.

1. Y ⊆ P(X) and ∅, X ∈ Y.

2. If A, B ∈ Y then A ∩ B ∈ Y.

3. If W ⊆ Y then $\bigcup W \in Y$.

4. If a, b ∈ X and a ≠ b then there are A, B ∈ Y such that a ∈ A, b ∈ B and A ∩ B = ∅.

Hausdorff observed that these axioms are true for X = R and Y being the open subsets of X. He discovered that the structure Y of open sets on a basis set X has many characteristic properties of a space; for example the dimension n of $\mathbb{R}^n$ can be reconstructed by analyzing the structure of the open sets only. In general, one calls any structure (X, Y) satisfying the first three axioms a topological space and Y is called the topology on X.

By the way, Hausdorff introduced his axioms in his book “Mengenlehre”, which is the German word for “set theory”.

Example 18.8. Call a set A upward-open iff for every a ∈ A there is r ∈ R with r > a and {s ∈ R | a ≤ s < r} ⊆ A. Then the class of all upward-open sets on R satisfies Hausdorff's axioms and differs from the class of the open sets in R defined in Definition 18.5.

Proof. It is easy to see that the set {a ∈ R | a ≥ 0} is upward-open. But this set is not open in the usual sense since it contains 0 without containing any number less than 0. Now Hausdorff's axioms are verified.

1. The empty set is upward-open. Also R itself is upward-open since for every a ∈ R the set {s ∈ R | a ≤ s < a + 1} is a subset of R.

2. Let A, B be upward-open and a ∈ A ∩ B. There is r > a such that {s ∈ R | a ≤ s < r} ⊆ A. Since a ∈ B one can also find a q such that a < q < r and {s ∈ R | a ≤ s < q} ⊆ B. The latter set is then also contained in A ∩ B. Since such a set exists for all elements of A ∩ B, A ∩ B is upward-open.

3. Let W consist of upward-open subsets of R and let $a \in \bigcup W$. There is an A ∈ W with a ∈ A. Since A is upward-open there is an r > a with {s ∈ R | a ≤ s < r} ⊆ A. This set is also contained in $\bigcup W$, so $\bigcup W$ is upward-open.

4. Assume that a, b ∈ R and a ≠ b. One of them is smaller, say a < b. Then A = {s ∈ R | a ≤ s < b} and B = {s ∈ R | b ≤ s} are two disjoint upward-open sets with a ∈ A and b ∈ B.

So all four axioms of Hausdorff are satisfied.

Exercise 18.9. Verify that Hausdorﬀ’s axioms are true for the set R. That is, verify

that (R, {A ⊆ R | A is open}) is a Hausdorﬀ space.

Exercise 18.10. Let α be any ordinal different from 0 and 1. Define a topology on α by saying that a set β ⊆ α is open iff β is an ordinal. Verify that the first three axioms of Hausdorff are satisfied, but not the fourth one.

Exercise 18.11. Find a topology on the set of ordinals up to a given ordinal α which

satisﬁes the Axioms of Hausdorﬀ and in which an ordinal β ∈ α is isolated iﬀ it is

either a successor ordinal or 0.


19 The Continuum Hypothesis

Cantor showed that the cardinality of R is the same as the cardinality of P(N). Furthermore he showed that the cardinality of N is smaller than that of R. But he did not find any set of intermediate cardinality.

Recall that the class of infinite cardinals can be identified with the following class of ordinals:

$$\{\alpha \mid \alpha \geq \omega \land \forall \beta < \alpha\,(|\beta| < |\alpha|)\}$$

This subclass is well-ordered and there is an order-preserving isomorphism from the class of all ordinals onto it, mapping every ordinal α to the infinite cardinal $\omega_\alpha$. So $\omega_0$ is just ω and $\omega_1$ is the first uncountable ordinal. Recall that the cardinalities of these ordinals are just called “ℵ” with the same index, $\aleph_\alpha = |\omega_\alpha|$, and that $2^\kappa$ denotes the cardinality of the power set of any set of cardinality κ.

In the following, it is shown for many natural types of subsets of R that they either have the cardinality $2^{\aleph_0}$ or are finite or have the cardinality $\aleph_0$. But an intermediate cardinality does not show up. Therefore, Cantor conjectured that there is no intermediate cardinality. That is, he stated the following continuum hypothesis (CH) where “continuum” refers to the real numbers.

Conjecture 19.1 (Continuum Hypothesis). $2^{\aleph_0} = \aleph_1$.

This conjecture cannot be proven from the axioms of ZFC. But this section deals with partial results obtained by attempts to prove the Continuum Hypothesis. These results show that certain types of sets of real numbers satisfy this hypothesis in the sense that there is no set of intermediate cardinality of this type. That is, sets of this type are either at most countable or have the cardinality of the continuum.

But before dealing with these results, a general theorem is given. This theorem shows how to prove one of the directions in Theorem 20.10 below, since its proof easily generalizes to a proof of the fact that $2^{\aleph_0}$ is not the limit of an ascending sequence of countably many other cardinals.

Theorem 19.2 (König). $2^{\aleph_0} \neq \aleph_\omega$.

Proof. The following implication is proven: $2^{\aleph_0} \geq \aleph_\omega \Rightarrow 2^{\aleph_0} > \aleph_\omega$. Thus $2^{\aleph_0} \neq \aleph_\omega$.

So assume that $2^{\aleph_0} \geq \aleph_\omega$. Let α be the least ordinal having the cardinality $2^{\aleph_0}$, that is, let α be the ordinal representing the cardinal $2^{\aleph_0}$. Then $\alpha \geq \omega_\omega$ where $\omega_\omega$ is the ordinal representing $\aleph_\omega$ and $\omega_\omega = \bigcup\{\omega_n \mid n \in \mathbb{N}\}$. Since $|P(\mathbb{N})| = 2^{\aleph_0}$, there is a bijection $F : \alpha \to P(\mathbb{N})$.

Recall the definition of $\leq_{\mathrm{lin}}$ from Example 17.11, which is a relation on subsets of N. This relation is transitive and has two important properties.

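• For every A there are at most countably many B ⊆ N with $B \leq_{\mathrm{lin}} A$.

• For every countable set $\{B_1, B_2, \ldots\}$ of subsets of N there is an A with $B_k \leq_{\mathrm{lin}} A$ for all k.

Now extend F to $G : \alpha \times \omega \times \omega \to P(\mathbb{N})$ with $G(\beta, i, j) = \{a \mid a \cdot i + j \in F(\beta)\}$ for all β ∈ α. $G[\alpha, \omega, \omega]$ is the downward closure of $F[\alpha]$ under $\leq_{\mathrm{lin}}$. Each set $F[\omega_n]$ has cardinality $\aleph_n$. Furthermore, using Hessenberg's Theorem, $|G[\omega_n, \omega, \omega]| \leq \aleph_n \cdot \aleph_0 \cdot \aleph_0 = \aleph_n$. Since $|\alpha| > \aleph_n$ there is for each n a set $B_n$ with $B_n \in P(\mathbb{N}) - G[\omega_n, \omega, \omega]$. Thus there is a set A ⊆ N with $B_n \leq_{\mathrm{lin}} A$ for all n. Since all sets $G[\omega_n, \omega, \omega]$ are closed downwards under $\leq_{\mathrm{lin}}$ and each of them does not contain $B_n$, none of them contains A. Since $\omega_\omega = \bigcup\{\omega_n \mid n \in \mathbb{N}\}$, $A \in P(\mathbb{N}) - G[\omega_\omega, \omega, \omega]$ and $A \in P(\mathbb{N}) - F[\omega_\omega]$. By assumption $A = F(\beta)$ for some β ∈ α. Since $\beta \geq \omega_n$ for all n ∈ N, $\beta \geq \omega_\omega$ and $\alpha > \omega_\omega$. Since α is a cardinal, it is larger than $|\omega_\omega|$ and $2^{\aleph_0} > \aleph_\omega$. This completes the proof.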

The next results are the first step on the way to prove Theorem 19.8 which says that every closed set is either at most countable or has the cardinality of the continuum. Cantor's Discontinuum in Exercise 19.7 is one of the first examples of a set of cardinality $2^{\aleph_0}$ which has Lebesgue measure 0.

Theorem 19.3. Every nonempty open set has cardinality $2^{\aleph_0}$.

Proof. Let A be an open nonempty subset of R. Open sets are unions of basic open sets of the type $R_{r,\varepsilon} = \{q \in \mathbb{R} \mid r - \varepsilon < q < r + \varepsilon\}$ where ε is a positive real number. So let r, ε ∈ R be such that ε > 0 and $R_{r,\varepsilon} \subseteq A$. The following f is a one-to-one mapping from R into A, in fact $f[\mathbb{R}] = R_{r,\varepsilon}$:

$$f(x) = \begin{cases} r + \varepsilon \cdot \frac{x}{x+1} & \text{if } x > 0; \\ r & \text{if } x = 0; \\ r - \varepsilon \cdot \frac{x}{x-1} & \text{if } x < 0. \end{cases}$$

Then |R| = |A| by Proposition 6.5.
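The behaviour of the mapping f can be checked numerically on rational sample points. The sketch below uses exact arithmetic from Python's `fractions` module; the name `squeeze` and the sample values r = 5 and ε = 1/10 are assumptions made only for this illustration.

```python
from fractions import Fraction


def squeeze(x, r, eps):
    """The map f from the proof: sends x into the open interval (r - eps, r + eps)."""
    if x > 0:
        return r + eps * x / (x + 1)
    if x == 0:
        return r
    return r - eps * x / (x - 1)


r, eps = Fraction(5), Fraction(1, 10)
samples = [Fraction(k, 3) for k in range(-30, 31)]
images = [squeeze(x, r, eps) for x in samples]

inside = all(r - eps < y < r + eps for y in images)
# The samples are listed in increasing order; f preserves that order on them.
increasing = all(a < b for a, b in zip(images, images[1:]))
```

On these samples f is strictly increasing, which is one way to see that it is one-to-one.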

Exercise 19.4. If A ⊆ R is at most countable, then R − A has cardinality $2^{\aleph_0}$.

Proposition 19.5. Let A ⊆ R be perfect. Then there is a countable subset B ⊆ A such that (B, <) is a dense linearly ordered set without end points, where < is the natural ordering inherited from R.

Proof. Let a, b ∈ A be such that there is a third element c ∈ A with a < c < b. Since A is perfect, there is for every ε > 0 a further element c′ ∈ A − {c} such that the distance between c and c′ is less than ε. Starting with $\varepsilon = \frac{1}{2} \cdot \min\{c - a, b - c\}$, one obtains that a < c′ < b. By iterating this argument with an ε that is also smaller than the distance between c and c′, one can establish that there are three numbers $c_0, c_1, c_2 \in A$ with $a < c_0 < c_1 < c_2 < b$. Thus, if there is c ∈ A between a and b then one can find a new element $c_1 \in A$ such that A has elements between a and $c_1$ and between $c_1$ and b. Note that it might be impossible to take $c_1 = c$.

Let $X_0 = \{l, h\}$ with l, h ∈ A such that there is a further c ∈ A with l < c < h. Using the Axiom of Choice, one can define a function f giving $X_{S(n)}$ from $X_n$ such that

1. $X_{S(n)}$ is finite and $X_{S(n)} \subseteq A$;

2. for all $a, b \in X_n$ with a < b there is a $c \in X_{S(n)}$ with a < c < b;

3. for all $a, b \in X_{S(n)}$ with a < b there is a c ∈ A with a < c < b.

That is, given $X_n$, f does the following: for all pairs $a, b \in X_n$ satisfying $a < b \land \forall c \in X_n\,(c \leq a \lor b \leq c)$, f searches $c_0, c_1, c_2 \in A$ such that $a < c_0 < c_1 < c_2 < b$ and puts $a, b, c_1$ into $X_{S(n)}$.

The set $C = \bigcup\{X_0, X_1, \ldots\}$ is in V and so is the set B = {c ∈ C | l < c < h}. Clearly B ⊆ A. If a, b ∈ B with a < b, then $a, b \in X_n$ for some n and there is a $c \in X_{S(n)}$ such that a < c < b. Thus B is dense. Furthermore, B has no end points, since l is the infimum and h the supremum of B with respect to C but l, h ∉ B; note that they might not be the infimum and supremum with respect to R. Furthermore, B is the union of countably many finite sets and thus countable.

Theorem 19.6. Every perfect set A ⊆ R has cardinality $2^{\aleph_0}$.

Proof. Let B ⊆ A be the countable set from Proposition 19.5. Since (B, <) is a countable dense linearly ordered set without end points, there is an order isomorphism f from Q to B. Now define g : R → R by

$$g(r) = \sup\{f(q) \mid q \in \mathbb{Q} \land q < r\}.$$

If r, r′ ∈ R with r < r′ then there are q, q′ ∈ Q such that r < q < q′ < r′. It follows that g(r) ≤ f(q) < f(q′) ≤ g(r′). Thus g is one-to-one.

Since A is closed, the complement of A is an open set. If there were an r ∈ R with g(r) ∉ A, then there would also be an ε > 0 such that {s ∈ R | g(r) − ε < s < g(r) + ε} is disjoint from A. But then $g(r) - \frac{\varepsilon}{2}$ would be an upper bound for the subset {f(q) | q ∈ Q ∧ q < r} of A, in contradiction to g(r) being the supremum of this subset. Thus g(r) ∈ A and |R| ≤ |A|. Since A ⊆ R, $|A| = 2^{\aleph_0}$.

Exercise 19.7. Cantor's Discontinuum is given as $F(\{0, 2\}^{\mathbb{N}})$ where F maps every $f \in \{0, 2\}^{\mathbb{N}}$ to the real number having the digits f(0)f(1)f(2) . . . in the ternary representation after the point:

$$F(f) = \sum_{n \in \mathbb{N}} f(n) \cdot 3^{-1-n}.$$

For example, if f(0) = 0 and f(n) = 2 for all n ≥ 1 then F(f) represents the ternary number 0.02222 . . ., that is, $F(f) = \frac{1}{3}$. If E is the set of even numbers and f(n) = 2 for n ∈ E and f(n) = 0 for n ∉ E then F(f) is the ternary number 0.20202020 . . ., that is, $F(f) = \sum_{n \in E} 2 \cdot 3^{-1-n} = \frac{3}{4}$. Show that

1. F restricted to $\{0, 2\}^{\mathbb{N}}$ is one-to-one;

2. $F(\{0, 2\}^{\mathbb{N}})$ does not have any nonempty open subset;

3. $F(\{0, 2\}^{\mathbb{N}})$ is perfect.

Furthermore, show that $F(\{0, 2\}^{\mathbb{N}})$ is given as

$$F(\{0, 2\}^{\mathbb{N}}) = \{r \in \mathbb{R} \mid 0 \leq r \leq 1\} - T \text{ where}$$
$$T = \{r \in \mathbb{R} \mid \exists m, n \in \mathbb{N}\,(m \cdot 3^{-n} + 3^{-1-n} < r < m \cdot 3^{-n} + 2 \cdot 3^{-1-n})\},$$

that is, T is the set of all positive real numbers for which the digit 1 appears in every ternary representation after the point; $\frac{1}{3} \notin T$ since it has besides 0.1000 . . . also the representation 0.02222 . . . where no 1 occurs after the point.
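The values of F in the examples of the exercise can be reproduced with exact rational arithmetic. This is a sketch under the assumption that f is given by a finite list `prefix` followed by one digit repeated forever; the function name `cantor_value` is hypothetical. The infinite tail is summed as a geometric series.

```python
from fractions import Fraction


def cantor_value(prefix, tail_digit):
    """F(f) for the sequence f that starts with `prefix` and continues with `tail_digit` forever."""
    head = sum(Fraction(d, 3 ** (n + 1)) for n, d in enumerate(prefix))
    # Geometric tail: the sum over n >= k of d * 3^(-1-n) equals d / (2 * 3^k).
    tail = Fraction(tail_digit, 2 * 3 ** len(prefix))
    return head + tail


one_third = cantor_value([0], 2)    # ternary 0.02222...
two_thirds = cantor_value([2], 0)   # ternary 0.2000...

# Ternary 0.202020... lies between these two truncations of it.
lower = cantor_value([2, 0] * 20, 0)
upper = cantor_value([2, 0] * 20, 2)
```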

Theorem 19.8. Every uncountable closed subset of reals has a perfect subset. Hence every closed subset of R is either at most countable or of the cardinality $2^{\aleph_0}$.

Proof. Let A be a closed subset of R and let

$$B = \{r \in A \mid \forall \varepsilon > 0\,(|A \cap R_{r,\varepsilon}| \geq \aleph_1)\}.$$

Then B satisfies the following properties.

1. A − B is countable. If r ∈ A − B there is an ε > 0 such that $R_{r,\varepsilon} \cap A$ is countable. Thus there are q, δ ∈ Q such that δ > 0 and $\{r\} \subseteq R_{q,\delta} \subseteq R_{r,\varepsilon}$ since Q is dense in R. It follows that

$$A - B = \bigcup\{A \cap R_{q,\delta} \mid q, \delta \in \mathbb{Q} \land \delta > 0 \land |A \cap R_{q,\delta}| \leq \aleph_0\}$$

which is countable since it is the union of a countable set of countable sets.

2. B is closed. The sets R − A and

$$C = \bigcup\{R_{q,\delta} \mid q, \delta \in \mathbb{Q} \land \delta > 0 \land |A \cap R_{q,\delta}| \leq \aleph_0\}$$

are open, thus B = A − C = R − ((R − A) ∪ C) is a closed set.

3. B has no isolated points. Assume that r ∈ B. Then, for every ε > 0, $A \cap R_{r,\varepsilon}$ is uncountable and thus $A \cap R_{r,\varepsilon} - (A - B) - \{r\}$ is also uncountable. This set is contained in $B \cap R_{r,\varepsilon}$, so every neighbourhood of r contains further points of B. Thus r is not an isolated point of B.

So either A is at most countable or B is not empty. In the latter case, B is nonempty and satisfies the last two properties, that is, B is perfect. By Theorem 19.6 the cardinality of B is $2^{\aleph_0}$ and since B ⊆ A ⊆ R, A has the same cardinality.

20 The Axioms of Zermelo and Fraenkel

First-order logic permits stating axioms which quantify over elements of V but not over subclasses of V. Furthermore, one can use expressions to define subclasses of V. Consider now a subclass G which is a function, that is, there is a domain W (either class or set) such that there is for all $x_1, x_2, \ldots, x_n \in W$ exactly one y with $(x_1, x_2, \ldots, x_n, y) \in G$ and all elements of G are of this type. Then one can use recursion (if W = N) or transfinite recursion (if W ≠ N, using some suitable well-founded relation R on W) to construct a new class as done in Theorems 5.2 and 13.7. Furthermore, one can apply the usual operations like concatenation to classes. For example, given three classes which are functions $G_1, G_2, G_3 : V^2 \to V$, the function $x, y, z \mapsto G_1(G_2(x, y), G_3(x, z))$ is also a class and can be used in the Axioms of Replacement and Comprehension below. Nevertheless, it is understood that all the classes considered can be built in finitely many steps from sets and the expressions and properties in Definition 3.7 with these methods.

Axioms 20.1 (Zermelo and Fraenkel). Let V be the class of all sets. There are classes coding functions $x, y \mapsto \{x, y\}$, $x \mapsto \bigcup x$, $x \mapsto P(x)$ and special sets ∅, N such that the following holds:

Foundation: $\forall x \in V\,(x \neq \emptyset \Rightarrow \exists y \in x\, \forall z \in x\,(z \notin y))$;

Extensionality: $\forall x, y \in V\,(x = y \Leftrightarrow \forall z \in V\,(z \in x \Leftrightarrow z \in y))$;

Existence (of empty set): $\forall x\,(x \notin \emptyset)$;

Pairing: $\forall x, y \in V\, \forall z \in V\,(z \in \{x, y\} \Leftrightarrow (z = x \lor z = y))$;

Schema of Comprehension: For all classes which are unary functions F and for all sets x ∈ V, {y ∈ x | F(y) = ∅} ∈ V;

Union: $\forall x, y \in V\,(y \in \bigcup x \Leftrightarrow \exists z \in x\,(y \in z))$;

Power Set: $\forall x, y \in V\,(y \in P(x) \Leftrightarrow \forall z \in y\,(z \in x))$;

Infinity: $\emptyset \in \mathbb{N}$ and $\forall y \in V\,(y \in \mathbb{N} \Rightarrow y \cup \{y\} \in \mathbb{N})$, $\forall x \in V\,(\emptyset \in x \land \forall y \in V\,(y \in x \Rightarrow y \cup \{y\} \in x) \Rightarrow \forall z \in \mathbb{N}\,(z \in x))$;

Schema of Replacement: For every n, every class coding an n-ary function F and all sets $x_1, \ldots, x_n$, the set $F[x_1, \ldots, x_n]$ is in V.

Choice: For all sets x ∈ V there is a function $C_x$ such that for all nonempty y ∈ x, $C_x(y) \in y$.

These axioms are called the Zermelo-Fraenkel Axioms with Choice or just ZFC. The axiom system ZF is obtained by taking all above axioms except the Axiom of Choice.

Definition 20.2. A model of ZF consists of the class V and the relation ∈ such that the above axioms are satisfied. Models of ZFC are defined similarly.

Remark 20.3. There are several models of set theory, that is, the models are not uniquely defined. A hypothesis H is called independent under ZF if there are two models of ZF such that one satisfies H and the other satisfies ¬H.

One method to build models is to start in a large model (V, ∈) and then to build inside (V, ∈) a smaller model (W, R) with W, R ∈ V such that (W, R) satisfies a certain desired combination of axioms. If R is the restriction of ∈ to W, then one writes (W, ∈) instead of (W, R).

Definition 20.4. A structure (W, R) is called an inner model of (V, ∈) iff W, R ∈ V and (W, R) satisfies all set-theoretic axioms with R being a relation standing for the element relation ∈.

Exercise 20.5. Given any model (V, ∈), show that $(V_{\omega_1}, \in)$ is not an inner model of ZFC. Take a well-ordering of P(N) and show that $\omega_1$ is contained in its range. Therefore, $\omega_1$ is the candidate for the inner model and this can be used to show that $(V_{\omega_1}, \in)$ cannot be an inner model.

Inaccessible cardinals are one example of large cardinals. Intuitively a cardinal is called large if it exists in some but not all models; that is, its existence cannot be proven from the existence of lower cardinals. Note that the cardinal $\aleph_0$ only exists because of the Axiom of Infinity or an equivalent one. Similarly, a large cardinal would only be guaranteed to exist if an additional axiom is added, and there are models of ZFC where no large cardinals exist. While the notion of a large cardinal is not precisely defined and is just used to denote anything which is not guaranteed to exist due to being large, the notion of an inaccessible cardinal is much more precise and defined as follows.

Definition 20.6. A cardinal $\kappa > \aleph_0$ is inaccessible iff

• for all cardinals λ < κ, $2^\lambda < \kappa$ and

• for all sets L ⊆ κ of cardinals, either sup(L) < κ or |L| = κ.

Inaccessible cardinals are interesting since every inaccessible cardinal permits building a submodel for ZFC from a given model for ZFC. There the condition $\kappa > \aleph_0$ is important because otherwise the Axiom of Infinity would be lost. Recall that a cardinal is identified with the least ordinal of the same cardinality, thus $V_\kappa$ is defined for every cardinal. The following proposition is given without a proof.

Proposition 20.7. Given a model (V, ∈) of ZFC, the following conditions are equivalent for every cardinal $\kappa > \aleph_0$:

1. κ is inaccessible;

2. for every $x \in V_\kappa$, $\sup\{2^{|y|} \mid y \in x\} < \kappa$;

3. $|V_\kappa| = \kappa$;

4. for every class F being a function in one argument which maps $V_\kappa$ to $V_\kappa$ and every α ∈ κ there is β ∈ κ with $F[V_\alpha] \in V_\beta$.

Theorem 20.8. Let κ be an inaccessible cardinal. Then $(V_\kappa, \in)$ is an inner model of ZFC with respect to the functions.

The proof of this result is omitted. The central idea of the proof would be to show that every set $X \subseteq V_\kappa$ satisfies ρ(X) = κ ⇔ |X| = κ. Then if f is a class which is a function, either $f(X) \not\subseteq V_\kappa$ and f does not need to be considered, or $f(X) \in V_\kappa$ and there is no problem. Using this key idea, one can verify the other axioms.

Theorem 20.9. The existence of inaccessible cardinals cannot be proven in ZFC.

Proof. It has to be shown that there is a model of ZFC not containing an inaccessible cardinal. Then it follows that one cannot prove the existence of these cardinals.

Given a model (V, ∈), assume that it contains inaccessible cardinals; otherwise there is nothing to prove. So let λ be a cardinal which bounds some inaccessible cardinal. Then $\{\kappa \in \lambda \mid \aleph_0 < |V_\kappa| = |\kappa|\}$ is a set of ordinals in V and thus well-ordered. This set contains all inaccessible cardinals below λ. Thus it has a least element κ. Now $(V_\kappa, \in)$ is a model of ZFC.

It remains to show that this model does not contain “new inaccessible ordinals”: So assume that α is an ordinal in $(V_\kappa, \in)$ with $|V_\alpha| = \alpha$. Ordinals are transitive sets such that any two members are comparable with respect to ∈, so α is also an ordinal in (V, ∈). Furthermore, there is a bijection $f : V_\alpha \to \alpha$ and this f is in $V_\kappa$. Since $V_\kappa \subseteq V$, f ∈ V and $|V_\alpha| = \alpha$ also with respect to the model (V, ∈). It follows that α ≤ ω in (V, ∈) since κ is the least inaccessible cardinal in (V, ∈). Since ω is the same in (V, ∈) and $(V_\kappa, \in)$, α ≤ ω also in $(V_\kappa, \in)$ and $(V_\kappa, \in)$ has no inaccessible cardinals. Thus the existence of such cardinals is unprovable from ZFC.

The next theorems are given without proof. The ﬁrst one shows that one cannot

decide the Continuum Hypothesis from ZFC because there are models of ZFC where

this hypothesis is true and others where it is false. Every statement φ which is

decidable from ZFC is either true in all models of ZFC or false in all models of ZFC.

Theorem 20.10 (Cohen, Easton, Gödel, König). Let α be an at most countable ordinal. Then there is a model with $2^{\aleph_0} = \aleph_\alpha$ iff α is a successor ordinal.

Cantor proved in 1878 that $2^\kappa > \kappa$ for all cardinals κ, so α cannot be the limit ordinal 0. König proved in 1905 that $2^{\aleph_0} \neq \aleph_\alpha$ for all countable limit ordinals α, see Theorem 19.2.

Gödel proved in 1938 the above theorem with the parameter α = 1. So he obtained that the Continuum Hypothesis is consistent with ZFC and constructed a model with $2^{\aleph_0} = \aleph_1$. Starting with a model (V, ∈) of ZFC, Gödel defined a new model (L, ∈) by transfinite recursion. He first defined a suitable class AF of absolute functions and then defined the following classes $L_\alpha$ inductively for all ordinals α:

$$L_\alpha = \{x \in V_\alpha \mid \exists \beta \in \alpha\, \exists y \in L_\beta \cup \{L_\beta\}\, \exists F \text{ in } AF\,(x = F[y] \land x \subseteq L_\beta)\}$$

Note that $L_\beta = V_\beta$ for β ≤ ω but it might be that $L_{\omega+1} \subset V_{\omega+1}$. For every x ∈ L, if y = P(x) in L and z = P(x) in V then y = z ∩ L but it can happen that y ⊂ z. Indeed there is such an x ∈ L whenever L ≠ V. Furthermore, functions which exist in V and witness that |x| ≤ |y| might fail to exist in L and thus it might happen that |x| < |y| in L. Gödel's model has the following properties.

• (L, ∈) satisfies ZFC.

• For all ordinals α, $2^{\aleph_\alpha} = \aleph_{\alpha+1}$.

The second property is called the Generalized Continuum Hypothesis (GCH).

Cohen constructed in 1963, for every countable ordinal α which is not a limit ordinal, a model such that $2^{\aleph_0} = \aleph_\alpha$; actually the method also works for some larger ordinals. Easton investigated in 1970 the question of what possible outcomes exist for the function f satisfying $2^{\aleph_\alpha} = \aleph_{f(\alpha)}$. In this terminology, Cantor showed ∀α (f(α) > α); Gödel showed that it is consistent with ZFC to assume ∀α (f(α) = α + 1) (which is GCH); König's result is that f(0) ≠ α whenever α is a limit ordinal which is the union of countably many smaller ordinals; Cohen showed that f(0) can be any countable successor ordinal. Easton showed that many functions are possible, for example if ∀n ∈ N (max{f(n), n + 1} ≤ f(n + 1) < ω) then there is a model of ZFC with $2^{\aleph_n} = \aleph_{f(n)}$ for all n ∈ N. Easton's result was indeed a bit more general and showed that one can prescribe the cardinality of the power set for all successor cardinals and some limit ones. But the problem was not yet completely solved; for example Silver showed in 1974 that if $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ for all $\alpha < \omega_1$ then $2^{\aleph_{\omega_1}} = \aleph_{\omega_1+1}$ and cannot take any other value. So in some cases $2^{\aleph_\alpha}$ might be determined by the values of $2^{\aleph_\beta}$ with β < α.

The next theorem shows that there are nonstandard models of ZFC. A nonstandard model is a
model in which the natural numbers do not exist as a set. Instead of the set {0, 1, 2, . . .},
there is a set containing some additional elements, here denoted as N or ω. This set also
contains nonstandard numbers beyond the usual natural numbers; they behave like natural
numbers but are not such numbers. Then the collection {β ∈ ω | β is a nonstandard number}
does not have a least element and therefore is neither a set nor a class. Such a nonstandard
model is not what is intended, but one can show that every consistent first-order
axiomatization of set theory has a nonstandard model. Only infinite axioms like

∀x (x ∈ N ⇔ x = 0 ∨ x = 1 ∨ x = 2 ∨ x = 3 ∨ . . .)

can rule out nonstandard models, but such axioms are normally not considered in set theory,
as infinite formulas are much more difficult to handle than finite formulas. Nevertheless,
one often considers axioms which are infinite sets of finite formulas.

Theorem 20.11. There is a model (V, ∈) of ZFC where N contains an element α such that
0 ≠ α, 1 ≠ α, 2 ≠ α, . . .; more informally, {0, 1, 2, . . .} ⊂ N and the collection
{0, 1, 2, . . .} of all natural numbers is neither a set nor a class. Such an α is called a
“non-standard number”.
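The usual route to such a model is the compactness theorem of first-order logic. The sketch below is the standard textbook argument and is not spelled out at this point in the notes; it assumes that ZFC has a model at all. The constant symbol c and the notation \underline{n} for the term defining the natural number n are introduced only for this illustration.

```latex
% Sketch of the compactness argument (notation c and \underline{n} is assumed here).
\[
  T \;=\; \mathrm{ZFC} \;\cup\; \{\, c \in N \,\} \;\cup\;
  \{\, c \neq \underline{n} \;:\; n = 0, 1, 2, \ldots \,\}
\]
% Any finite subset T_0 of T contains only finitely many of the axioms
% c \neq \underline{n}, say only those with n < m.
% In any model of ZFC, interpreting c as the number m satisfies T_0.
% By the compactness theorem, T itself has a model; the value of c in that
% model is an element \alpha of N with
\[
  \alpha \neq 0, \quad \alpha \neq 1, \quad \alpha \neq 2, \quad \ldots
\]
```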

A further pathology is that one can have a countable model of ZFC. That is, one constructs
the model of ZFC within a model (V, ∈) as an inner model (W, R) and has that |W| = ℵ_0 with
respect to the model (V, ∈). Some members of W of course have cardinalities higher than ℵ_0
with respect to (W, R); so the notion of cardinality depends on the viewpoint which one has.
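The two viewpoints can be put side by side for one concrete member of W. In the sketch below, p stands for the element of W which (W, R) regards as the power set of its natural numbers. The name p is chosen only for this illustration; the argument is the standard one and is not taken from the notes.

```latex
% Inside (W, R): Cantor's theorem holds, so p is uncountable there.
\[
  (W,R) \models \text{``there is no surjection from } N \text{ onto } p\text{''}
\]
% Seen from (V, \in): the members of p in the sense of R form a subset of W.
\[
  (V,\in) \models \bigl|\{\, x \in W : x \mathrel{R} p \,\}\bigr| \;\le\; |W| \;=\; \aleph_0
\]
% There is no contradiction: a surjection from the natural numbers onto
% \{ x \in W : x R p \} exists in V, but no element of W represents it.
```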


Contents

1 Foundations 2 Basic Operations with Sets 3 Functions 4 Natural Numbers 5 Recursive Deﬁnition 6 Cardinality of Sets 7 Finite and Hereditarily Finite Sets 8 Countable Sets 9 Graphs and Orderings 10 Linear Ordering 11 Well-Orderings 12 Ordinals 13 Transﬁnite Induction and Recursion 14 The Rank of Sets 15 Arithmetic on Ordinals 16 Cardinals 17 The Axiom of Choice 18 The Set of Real Numbers 19 The Continuum Hypothesis 20 The Axioms of Zermelo and Fraenkel 21 References 2 3 7 13 18 23 27 30 35 38 43 50 55 58 62 65 69 73 77 82 86 91

1

Foundations

The two primitive notions of set theory are “set” and “membership”. These two notions will not be deﬁned. All other concepts are deﬁned in terms of these two primitive notions. In this course, all basic properties (Axioms) of set-theory are introduced and the various sets are deﬁned one after the other. Nevertheless, for some examples and exercises, reference is made to well-known but later introduced objects in order to illustrate certain notions or results. A set is intuitively a collection of objects. The objects in the collection are members (or elements) of the set. One important point is that the objects in a set can again be sets. Here an example. Example 1.1. {1, {0, 2}, {0, 3}} is the set having three elements, namely 1, {0, 2} and {0, 3}. The second and third elements are again sets containing 0, 2 and 0, 3, respectively. Graph Representation 1.2. One can consider every set to be a vertex of a large graph where there is a connection from x to y iﬀ x ∈ y. The graph of the above example has the following 7 vertices and 7 directed edges:

{1, {0, 2}, {0, 3}}

T T T

¨

I B ¨ ¨ ¨ T ¨

{0, 2} {0, 3}

T

0

1

2

3

The domain, that is the collection of all sets, is denoted by V and called the Von Neumann Universe. Writing down a set is longer than writing down any element of it. Thus one can only ﬁnitely often go down from a set to a member since each time the description becomes shorter. Later, when inﬁnite sets are introduced, it will no longer be possible to write them down in the above way and therefore the nonexistence of inﬁnitely descending chains is no longer implicitly guaranteed. Therefore, this property is kept by making it explicitly an axiom. The intuition is that there is no sequence x0 , x1 , . . . of vertices in V such that 3

. Such a y ∈ X is called a minimal element of X (with respect to ∈). The set {0} is the least element of the set {{0}. . x1 . {{0}}} has two minimal elements. . it is based on the observation that whenever xn+1 ∈ xn then every element xn of X = {x0 . Thus a minimal element does not need to be unique.: sets forming a descending sequence are illegal (see above). {1. 2}. x1 . . • {x.3 (Foundation). 0 has no elements and the elements 1.} has a common element with X. 2}.} with x1 ∈ x0 . Thus the following version of the axiom is given. Exercise 1. x2 ∈ x1 . The set A = {0. Which of the below graphs are wellfounded? The answers should be proven by testing which of the below examples avoid the two negative criteria (cycles and descending chains) from Remark 1. {0}. . . . {x}}}: sets with comparable elements (x ∈ {x}) are legal. 1. {x. / / • {x. 2} are not in A. {0. is called a least element. If a minimal element is contained in every other element of the set. {0. that is.6. {1. {x}. y} with x ∈ y and y ∈ x: cycles are illegal. y} with x ∈ y and y ∈ x: sets with incomparable sets are legal. .xn+1 ∈ xn for n = 0. namely 0 and {1. 2 of {1. {0}}}}. namely xn+1 is in xn and in X. Then there is an element y ∈ X such that every z ∈ X satisﬁes z ∈ y. Remark 1. But the term “sequence” is not a primitive operation. Let X ∈ V be a set which contains at least one element. {{0}}. 4 . there is no inﬁnite descending chain. Here some examples of graphs.4. .5. Axiom 1. The property of being well-founded is an abstract property which applies also to some but not all directed graphs which are diﬀerent from the universe of all sets. • {x0 . . Note that the Axiom of Foundation implies that a descending sequence is illegal whenever its elements form a set. {{0}. 2}}. • {x. This is just ruled out by the Axiom of Foundation. / Example 1. 
The main idea of the Axiom of Foundation is to enforce that the universe V of sets is build from bottom up and that no set is “cyclic” or “hanging down without a bottom from something”. So the following things are permitted and forbidden: • {{{x}}}: making sets of sets is legal.. .5. . but it cannot make descending sequences explicitly illegal without that condition (as this cannot be expressed in V ).

So it has subclasses and members in the same way as set has subsets and members. the set {0. so the usage of 0. 6. unless otherwise stated.7. which are called “sets” and “classes”.1. 1. No set contains itself as an element. 2. {0}. {0. x ∈ x. the set Z of the integers with the edges being the pairs (n. the set Q of rational numbers with the edges being the pairs (q. Convention 1. 2. 2q) for all q ∈ Q. 1. A class is something what behaves like a set but it is none. Consider any set x. Then {x} is also a set. {{1}}. By the Axiom of Foundation. Property 1. The members of V are called sets and the subsets of V (including V itself) are called classes. 512} with (a. 4. (2. V cannot be a set itself. It follows that V cannot be a set. 1.8. All the variables will stand for sets. Proof. no class contains another one as a member. {{{1}}}. 2. b) being an edge iﬀ a ∈ b. 0). {1}. The Axiom of Foundation has an immediate application. q + 1) for all q ≥ 0 and (q. 5. 3 in the example above will later be replaced by the usage of sets which code the numbers 0. Classes are an auxiliary structure which enable to make statement about the collection of all sets or the collection of sets with certain property. The collection V of all sets is not a set. {0}}. The Axiom of Foundation guarantees. 3. The elements of a class are always sets. Intuitively. 2. 3. any set could be considered as a collection of other smaller sets. 1. the set Q of rational numbers with the edges being the pairs (q. Theorem 1. 3). the set N of the natural numbers with every edge being of the form (n. actually this needs the Axiom of Pairs introduced below. 3} with the edges (0. Sets are the object of investigation. 1. Thus x = V since / / V contains the set x as a member. The notation “∈” 5 . n + 1). due to this role. All elements of sets will be coded again as sets. the only element x of {x} satisﬁes y ∈ x for all y ∈ {x}. n + 1) for all n ∈ Z. 1). there are two types of collections of objects. In set theory. 
the set {0. As y takes the value than x. that each set is determined uniquely by its members and helps to avoid contradictions. q − 1) for all q ≤ 0. They are represented as members of the class of vertices of V and they are put into relation to each other by the element-relation ∈. The main objective of V is to tell which thing is a set and which not. (1.9.

It is supposed that {0. Direct and indirect membership is not the same. that is. Further descriptions for this set can even be more indirect. belong indirectly to the union: EU = {Austria. But politically. For example. Exercise 1. A = {1. .}. . whether or not they are generated from diﬀerent descriptions does not matter at all.10 (Extensionality). the axiom of extensionality states that two sets are equal iﬀ they have the same elements. Britain is the member of the EU and they are only members of this member. Northern Ireland.13. that is. Scotland. Northern Ireland. 6. The importance of this identiﬁcation of sets is that one only pays attention to the extensionality of sets. . The words “member” and “element” are synonyms. 3. Example 1. Belgium. 4. 1. Wales}. thus UEFA = EU. 7} and {0. the same list of elements. Estonia. Britain is a country consisting of four members: Britain = {England. These four members go directly into the European association in the case of soccer: UEFA = {Austria.denotes the membership of sets. . Wales}. These descriptions might come from diﬀerent intentions. A set A and a set B are the same set.11. but this is not the point of this example. There is at most one set which does not have any element. 5. 2. Equal sets have exactly the same members. 2. 7} denote the same set since both descriptions give the collections of the same elements. x ∈ A if and only if x ∈ B. Wales. Similarly “x ∈ y” means that “set x is not a member of / y”. . . that is. 6 . . denoted by A = B. the diﬀerent intentions are ignored when describing sets. It can be noted that there is also other reasons for EU = UEFA like Switzerland ∈ UEFA − EU. 2. for every x ∈ V . 1. Denmark. if both sets have the same elements. This is very useful when the uniqueness of sets satisfying a certain property is determined. this means that “set x is a member of set y”. {England. England. Denmark. Belgium. Scotland. Example 1. that is. Therefore. . Northern Ireland.}. when writing “x ∈ y”. . 
the two sets A and B are equal. . the symbol “∈” refers to the ﬁrst letter of “element”. but they give the same extension. Scotland. 2}. Estonina. Axiom 1. 1. 3}. for example as the set of all numbers which can be represented by up to three binary digits or the set of all integers on which the polynomial 4x2 − 28x − 1 takes negative values. B = {1.12. Which of the following sets of natural numbers are equal? Wellknown mathematical theorems can be applied without proving them.

b.1 (Empty Set). Since “existence of a set” means that it is a member of V . 2 Basic Operations with Sets In the previous section. 8. The question whether P is inﬁnite is a famous open problem. 7. Axiom 2. F = {f | ∀c ∈ C (f ≥ c)}. 4. 12. D = {d | ∃a. G = {g | g ≥ 2 ∧ ∀a. 6. Therefore it is still unknown whether N = {n | ∃p ∈ P (n ≤ p)}. one has to place an axiom such that it really occurs in V or one has to prove using other axioms that it is in V by these axioms. 16. 14. 7 . E = {e | e > 0 ∧ ∀c ∈ C (e ≤ c)}. H = {h | h > 0 ∧ h2 = hh }. P = {p | p.3. but there was neither a proof nor an axiom given such that these sets really exist. 15. 5. p + 2 ∈ C}. d ∈ C (n = cd)}. There exists a set in V which has no members. 11. M = {m | ∃c ∈ C (m = c2 )}. 9. c > 0 (ad + bd = cd )}. b > 1 (4g = (a + b)2 − (a − b)2 )}. So it is sometimes very diﬃcult to decide whether two descriptions give the same set or not. J = {j | (j + 1)2 = j 2 + 2j + 1}. C is the set of all prime numbers. many examples dealt with sets known from the every-day-life of a mathematician. O = {o | o has exactly three prime factors}. K = {k | 4k > k 2 }. N = {n | ∃c. 10. L = {l | ∃c ∈ C (l < c)}. 13. I = {i | i + i = i · i}.

Axiom 2. respectively. This set is denoted by P(x). {a}. whose elements are exactly subsets of x. d}. Determine the power set of {∅. {b}.Notice that by the equality of sets. d}}. c}. c.7 below. one checks that one set is a subset of the other. Namely. b. {b. Thus. d} then P(x) = {∅. Notice also that x ⊆ x for any set x. {d}. x ∈ A implies that x ∈ B. d}. d}. called the power set of x. there exists a (unique) set. many sets are deﬁned by taking all those numbers which satisfy a certain property. {a. {{∅}}}.7. By Property 2. as in the ﬁrst example. One set could be a part of another in the sense that every element of the set is an element of the other.6 (Comprehension. {a. {a. Deﬁnition 2. Axiom 2. 8 . If x = {a. This rule of forming subsets is made an axiom and is formally deﬁned. c. Property 2. c. for any set A. d}. And A is a proper subset of B.3. a set with no elements is unique. denoted by A ⊆ B. denoted by A ⊂ B. deﬁning subsets). d}. Is there any set X such that P(X) has exactly 9 elements? In Exercise 1. {b. iﬀ A is a subset of B and there exists one set y which is an element of B but not an element of A.4 (Power Set). {∅}. Given a property p(y) of sets. Subsets. c. {a. b}. Exercise 2. Power Sets. b. Given a set X. A = B if and only if A ⊆ B and B ⊆ A. Notice that this gives a standard way to show that two given sets are equal. This is made more precise by introducing the concept of subsets. {c. {a.3. iﬀ for every set x. Also x ∈ P(x). This procedure is called the power set operation. c}.2. there exists a (unique) set B such that x ∈ B if and only if x ∈ A and p(x) holds. It uses properties which are formally introduced in Deﬁnition 3. b. {c}. A set A is a subset of B.5. The notation {x ∈ A | p(x)} stands for the set of all x ∈ A which satisfy p(x). ∅ is a subset of every set. d}. ∅ ∈ P(x) for every set x ∈ V . one could collect all the subsets of X to form a new set. For every set x. {b. Convention 2. b. c}. {a. 
This set not having any element is denoted by ∅. {a.13.

establish properties to deﬁne the following subsets of N using the Axiom of Comprehension: 1. the set {0. There is a one-to-one function f such that p(x) iﬀ p(y) for all y ∈ f (x). 9 .} of all even numbers. Montenegro uses the Euro but is not in the European Union. 3. Let C be the set of all countries. the set of prime numbers can be deﬁned as the set {x ∈ N | ∃ unique y. 3. Now deﬁne the set L = {c ∈ C | c has at least 25 subunits} of all large countries. There is a set x with x = {y ∈ x | p(y)}. . therefore. The United States has 50 states. 4. The European Union has since 01. 2. the set of all numbers whose binary representation contains exactly four times a 1.05. y such that x ∈ y and either p(x) ∧ p(y) or ¬p(x) ∧ ¬p(y). / Similarly. United States ∈ L. European Union. the set E = {e ∈ European Union | e has the Euro} consists of all members of the European Union.9.Example 2.2004 the necessary quantity of members. the set of all square numbers. Switzerland has 26 cantons and India has 25 states and 7 territories. that is the set of all natural numbers with exactly two divisors. Exercise 2. the set of all numbers with exactly three divisors. 2. 4. There are sets x.8.10. Exercise 2. 2. 6. For example. 1. . Thus. So France and Spain are in E. The Commonwealth of Australia ∈ L. Switzerland. Given the set N of natural numbers. z ∈ N (y · z = x ∧ z < y ≤ x)}. Show that every property p satisﬁes the following statements. But Australia has only 6 states and 2 main territories. . which use the Euro as a currency. India. Britain is not be in E. Montenegro is not a member of E.

Russell split the set of all sets into the following two subsets: X is the set of all sets which are an element of itself. Intersection and Diﬀerence. Let A and B be sets. If Y ∈ Y then Y ∈ X again by the deﬁnition of X. the intended property is that one can construct ﬁnite sets from a given ﬁnite list of elements. Proof. The union of A and B is the set. denoted by A ∪ B. then there is a contradiction. Therefore. If the above deﬁned X. then Y ∈ X since X contains all sets which are an element of itself. whose elements are exactly those sets belonging to A or belonging to B. the basic operations with sets are deﬁned. Y is equal to the class V of all sets and X is the empty class. 1. the following two consequences where taken: 1. Y is the set of all sets which are not an element of itself. the postulate that V is well-founded makes it impossible for a set to be a member of itself.14. Thus either Y is member of both / / sets X and Y or Y is not member of any of these two sets. mathematicians believed that they can deﬁne things like “the set of all sets”.12. Y are sets.13 (Pair). Given any x. This contradicts the fact that X.11 (Russell’s Anatonomy of Naive Set Theory). Paradox 2.Before introducing the distinction between sets and classes. 2. Russell’s paradox forced the mathematicians to become a bit more cautious when dealing with sets. y}. there is for every set x also the set {x} which is just the set containing the single element x as its member. In the following. the existence of {x} is provided by the next axiom. y ∈ V then there exists also a set in V which contains exactly the elements x. Furthermore. If Y ∈ Y . In general. Axiom 2. note that taking the element x twice gives then the existence of {x} by the Axiom of Extensionality. So X and Y turn out to be classes and not sets. in the case that x = y one can also just write {x}. Using comprehension. Mainly. Y are obviously a partition of the sets of all sets. This set is written as {x. 
Although a set cannot contain itself. Comment 2. 10 . Union. The distinction between classes and sets are introduced. It is suﬃcient to postulate this for two given elements. Deﬁnition 2. Note that x = {x} for the reasons given above. y. The postulate that V is well-founded.

15. d. 3. These operations can be speciﬁed as follows: A∪B A∩B A−B A△B = = = = { x | x ∈ A or x ∈ B }. The (relative) diﬀerence of A with B is the set. c}. As an illustration. {a. D = {b}. C} = {a. denoted by A − B. 1. e} and D = {{a. {c}. c. whose elements are exactly those sets belonging to both A and B. 7. C} = {c}. {A. d. 3} = {0}. 1. For example. c. {b. A = { x | ∃ y ∈ A (x ∈ y) }. {a. c. 6. d}. {b. B = {b. d}. 1. whose elements are exactly those elements of A which do not belong to B. e}}. d. B. d}. c}. 2} − {1. Note that A is only deﬁned for nonempty sets A. 5}. let A = {a. 4. d. d}. Then A∪B A∪C B∪C A∩B A∩C B∩C A−B A△B A△B△C = = = = = = = = = {a. Remark 2. e}. 2. 4. b. {0. b. d. b. 2. 5} = {0. {b. {A. denoted by A ∩ B. {c. {a. { x | x ∈ A and x ∈ B }. 2} and {0. The intersection of A and B is the set. C = {c. c. {b. 8} = {1. e}. b. c. e}. d. B. 2} ∩ {1. 5. b}. e}. 2. {a}. The set A△B is called the symmetric diﬀerence of A and B. If B ⊆ A. D = {a. 2} ∪ {4. {0. b. then A − B is called the complement of B in A. c. / (A − B) ∪ (B − A). 3. d}. 11 . e}. { x | x ∈ A and x ∈ B }.2. 1. c. c. A = { x | ∀ y ∈ A (x ∈ y) }.

Furthermore. Which of the following expressions is the empty set? 1. Proof. that is. F ruit − Red − Blue − Black − Y ellow. Red − Strawberry − Cranberry − Apple − P ear. (Black△Blueberry) ∩ Blue. {A. Blackberry. Apple − Red. Banana − Y ellow. Blue. P ear. Prove that the symmetric diﬀerence is associative. 4. Axiom 2. the sets A − B = {x ∈ A ∪ B | x ∈ B} and / A△B = {x ∈ A ∪ B | x ∈ A ∩ B} / exist and are in V as well. Exercise 2. B. 6. Thus A ∪ B = {A.17.19. B} are in V . prove that A − B = A ∩ (A△B).16 (Union). By the Axiom of Pair. Exercise 2. For this reason. Strawberry. 2. Cranberry. This follows for A directly from the Axiom of Union. Consider the sets Apple. (A△B)△C = A△(B△C).18. For every A ∈ V the union and in V . C. Black and Y ellow be those elements of F ruits which have the corresponding colour. Banana. the intersection is a member of V by the formula A = {x ∈ A | ∀y ∈ A (x ∈ y)} and using the Axiom of Comprehension. are in V . B} ∈ V . that is. one can just write A△B△C. (Blueberry − Blue) ∪ (Y ellow − Apple − P ear − Banana). The sets constructed in Remark 2. If A is not empty. for the rest this is now shown. B} and A ∩ B = {A. for all sets A. 12 . 3.The basis for these operations is the following axiom. B ∈ V . By the Axiom of Comprehension. 5.15 exist. Let A. A of its elements is also a set Proposition 2. Let F ruits be the union of these sets and Red. where A = { x | ∃ y ∈ A (x ∈ y) }. Blueberry which consist of all fruits in the world usually designated by that name.

All A. 8. Cranberry. Property 2. y}}. If x = x′ or y = y ′ then (x. Many Boolean Algebras have a complementation operation. F ruit − {Apple. Blackberry. Banana△Blueberry△Strawberry△Red. (Strawberry ∪ Blueberry ∪ Cranberry) − Red.22. A∪B A∩B Associativity: (A ∪ B) ∪ C (A ∩ B) ∩ C Distributivity: A ∪ (B ∩ C) A ∩ (B ∪ C) De Morgan Laws: A − (B ∩ C) A − (B ∪ C) Commutativity: = = = = = = = = B ∪ A. P ear. y ′ ).7. A ∪ (B ∪ C). Why? 3 Functions Graphs and functions are based on the notions of ordered pairs. A ⊆ B iﬀ A ∪ B = B iﬀ A ∩ B = A. (A ∩ B) ∪ (A ∩ C). (A − B) ∩ (A − C). C ∈ V satisfy the following laws. For example. (x. (A − B) ∪ (A − C). Strawberry. (A ∪ B) ∩ (A ∪ C).20. B ∩ A.1. Exercise 2. These ordered pairs are constructed using the ordinary unordered pairs as follows. Give a set of three fruits which intersects those of the above sets which are not empty. Banana}. Property 2. This makes this deﬁnition suitable to introduce a representation for the Cartesian product of two sets. {x. 13 . Deﬁnition 3. (Apple ∪ P ear) ∩ (Strawberry ∪ Blueberry). y) = (x′ .21. 9. y) = {x. but here only the set diﬀerence is used in the De Morgan Laws. B. A ∩ (B ∩ C). a graph consists of a basis set W of vertices and a set E of edges which is a set of ordered pairs of elements of W . The next property states that the algebra of sets satisﬁes the following rules which it shares with any Boolean algebra. 10.

y) ∈ F . Example 3. 4}}}. y) ∈ F . 3}}.5. 4}}.Deﬁnition 3. {2. 3}}. So. P(A ∪ B) and C = P(A ∪ P(A ∪ B)) are also in V . Then the set representing f is {(0. Thus. {3. Example 3. Thus the Cartesian product is not commutative. y) with x. {0. {4. that is. similarly to computer scientists who represent everything as ﬁnite sequences from {0. {4.4. {1. The set X is called the domain of F . {1. 0)}. The product of {3. F (x) = y and (x. 2} and {3. y) ∈ F are equivalent notations for the same fact. {1. For this reason. {4. 4}}. Thus the fundamental notions of relations and functions are deﬁned in terms of sets. that is.6. 1}. graphs and relations can be considered. {2. 2. in general. Deﬁnition 3. 4} and f be given by f (x) = x + 1 for x = 0. 4}}. y) ∈ F implies that x = x′ . B be {0.2. a function F from V to V would be a subclass of V whose members are of the form (x. 1). the Cartesian product of A and B is the set A × B = { (a. B ∈ V . such that for every x ∈ X there is a unique y ∈ Y with (x. F is a bijective function iﬀ it is both injective and surjective. 3}}. 1. 3 and f (4) = 0. 2).3. The Cartesian product of {0. b) | a ∈ A and b ∈ B }. {2. Remark 3. Y is a superset of the range {y | ∃(x′ . 3). that is. the existence of a Cartesian Product A × B is proven as follows. Given A. y ′ ) ∈ F (x = x′ )}. (4. 4} is {{0. 3}}. V does not only determine which sets exist. A principal philosophy of set theory is that everything is coded or represented as a set. {0. For any sets A and B. (3. 4). A graph is given by a set G of vertices and a relation E ⊆ G×G. It can be deﬁned from F as {x | ∃(x′ . {a. A function F : X → Y is a graph. The sets A ∪ B. This unique y will be denoted as F (x). {1. A function F : X → Y is onto (or surjective) iﬀ Y is equal to the range of F . A function F is one-to-one (or injective) iﬀ (x. b}})}. {2. that is. 4}}}. {0. A relation is a subset of a Cartesian product of ﬁnitely many sets. 4}}. {2. {0. 
The answer will be the following: Subclasses of V exist iﬀ they can be 14 . y) ∈ F and (x′ . Graphs and functions can also be deﬁned on classes. 4} and {0. 3}}. {1. Thus A × B ∈ V by the Axiom of Comprehension. 3. 1. Now the following deﬁnition of the Cartesian Product is equivalent to the above one: A × B = {c ∈ C | ∃a ∈ A∃b ∈ B (c = {a. y ′ ) ∈ F (y = y ′ )}. a subset of X × Y . {0. but also which functions. {3. 1. F (x) would be the unique y with (x. f is identiﬁed with this set as in Deﬁnition 3. Let A. {2. y ∈ V . 3}}.5. 2} is {{3. 1. it might be important to ask what classes exist and what classes do not exist. {1. 2. (1. (2.

. P(x). E1 . . E1 △E2 . Now every property deﬁnes a class. p1 ∧ p2 . {E1 . . P(E1 ). y. c. E1 ∩ E2 . in some statements. E1 . E2 . E2 . {y | y is a graph on x} = {(x. the informal deﬁnition is given ﬁrst and the correct formal one is the last in the chain of equations. 15 . w}})}. starting with basic expressions below. .7 (Expressions and Properties). x. x × y = {z ∈ P(x ∪ y ∪ P(x ∪ y)) | ∃v ∈ x∃w ∈ y (z = {v. A property is obtained from basic properties p1 . . P(P(∅)) and P( (x ∪ a)). 2. . A basic property is the comparison of two expressions with ∈: E1 ∈ E2 is true iﬀ E1 is a member of E2 and false otherwise. An expression is obtained from expressions by ﬁnitely often building new expressions from old ones. These methods will be introduced one after another.deﬁned from other – already existing – subclasses of V using certain methods to make new classes. p2 and expressions E by forming Boolean expressions and quantifying over variables: ¬p1 . . Example 3. E(x)) : x ∈ V } deﬁnes a subclass of V which is a function. Furthermore. . v. b. then {(x. also the following constructions are expressions: E1 ∪ E2 . For example. 3. The following are expressions where a. stand for constants and u. 1. w. u) | u ⊆ x × x} = {x} × P(x × x). En } (with n ∈ N and this expression being ∅ for n = 0). . / One can also iterate the process by deﬁning subexpressions using comprehension: namely if E is an expression and p a property then {x ∈ E | p(x)} is also an expression. The following properties are either always true or always false. Furthermore. For ¬x ∈ E1 one can also write x ∈ E1 . Given expressions E1 . note that the truth-value can depend on variables built into E1 and E2 . ∃x ∈ E p1 (x) and ∀x ∈ E p1 (x). if E is an expression. An basic expression is either a variable-symbol or a constant which is a ﬁxed member of V . E1 ⊂ E2 ⇔ (E1 ⊆ E2 ) ∧ ∃x ∈ E2 (¬x ∈ E1 ). Deﬁnition 3. ∅ is a constant. {v. 
All expressions and properties must be deﬁned by ﬁnitely many iterations of the process outlined above. z for variables. p1 ∨ p2 . if P is a property then {x ∈ V : P (x)} deﬁnes a subclass of V . For example.8. one uses the symbols ⊆ and ⊂ as a shorthand: E1 ⊆ E2 ⇔ ∀x ∈ E1 (x ∈ E2 ). E1 − E2 . . . En .

y) such that for every x ∈ V there is a unique y ∈ V with (x. 4. therefore. The property ∃y ∈ P(x) (y ⊆ ∅) is true iﬀ x = ∅. 7. but the whole mechanism is spared out. 5. So a student should know these things for examinations about functions from V to V : • A function f from V to V is a subclass of V which consists of ordered pairs (x. / • Functions can be deﬁned from other functions using the various types of recursion deﬁned in Sections to come. ∃z ∈ P(x) (z ∈ P(y)). goes beyond what is mandatory for a student to learn. y) ∈ f . Set Theory following the axioms of Zermelo and Fraenkel does not have a formalized concept of classes. parameters from V (that is using sets) and also using the various recursion theorems explained in chapters to come. y) | x ∈ X ∧ y = f (x) ∨ x ∈ X ∧ y = ∅} of V . x ∈ x. y). to work this out formally. x = x. one can deﬁne the function {(x. x = {x}. 6. x = P(x). So it is not always true and therefore the last property needed to be more complicated to go into the list of all always true properties. • Functions from V to V satisfy the below Axiom of Replacement. Given the expression P(x). P(x)) : x ∈ V } mapping x to P(x) as a subclass of V . 2.1. Somehow. ∀y ∈ P(x) ∀z ∈ y (z ∈ x). It is more the way that everything is a class which has a deﬁnition which can be written down using standard set-theoretic terminology. 16 . one can code the function x. Properties which are always true are called tautologies. y → P(x) ∪ ((x ∩ y) × (x ∪ y)) as a class. Similarly. • A function f from a set X to a set Y can be extended to a function from V to V by considering the subclass {(x. it will only be explained how to make new functions from old ones. ∃y ∈ P(P(x)) (y ⊆ ∅). one writes f (x) = y for this pair (x. 3. • Functions from V to V can be concatenated and modiﬁed in the standard way.

Axiom 3.9 (Replacement). Let f be a subclass of V which is a function from V to V, that is, f is a class of pairs (x, y) such that for each x ∈ V there is a unique y ∈ V with (x, y) ∈ f. Then, for every set X ∈ V, both f(X) and f[X] = {f(y) : y ∈ X} are sets again.

This axiom is used in order to make sure that the range of a set under a function (from V to V) is again a set.

Exercise 3.10. Define (informally) functions fₙ from N to N with the following properties:
1. f₁ is bijective and satisfies f₁(x) ≠ x but f₁(f₁(x)) = x for all x ∈ N.
2. f₂ is two-to-one: for every y there are exactly two elements x, x′ ∈ N with f₂(x) = f₂(x′) = y.
3. f₃ is dominating all polynomials, that is, for every polynomial p there is an x such that for all y > x, f₃(y) > p(y).
4. f₄ satisfies f₄(x + 1) = f₄(x) + 2x + 1 for all x ∈ N.
5. f₅(x) = 0 if x = 0; f₅(x) = f₅(x − 1) + 1 if x > 0 and x is a square number; f₅(x) = f₅(x − 1) if x > 0 and x is not a square number.
Determine the range of the function f₄.

Definition 3.11. Let F : A → B and G : B → C. The composition of F and G is the function G ◦ F : A → C defined by G ◦ F(a) = G(F(a)) for every a ∈ A, that is, G ◦ F = { (a, c) | ∃b ∈ B ((a, b) ∈ F ∧ (b, c) ∈ G) }.

Exercise 3.12. Let A = {0, 1, 2} and F = {f : A → A | f = f ◦ f}. Show that F has exactly 10 members and determine these.

Definition 3.13. Let f : A → B.
1. Let C ⊆ A. The restriction of f to C, denoted by f | C, is the function f ∩ (C × B).
2. f[C] is the image of C under f, that is, f[C] is the subset of B determined by f[C] = { b ∈ B | ∃a ∈ C ((a, b) ∈ f) }.
3. Let D ⊆ B.

Let A. B = {b} for some b. Fix furthermore an element a ∈ A. B. 4. The set of all functions from A to B is denoted by B A and {f : A → B}. everything is coded as a binary sequence. 9. Deﬁnition 3.3. 6. of all binary sequences. 3}] = {5. Let A and B be sets. The preimage of {0. Let f : n → n + 5 be a function on the natural numbers. 1. 4} is the empty set. c′ in C. 7. Obviously every function going from A to B and then from B to C is a function from A to C. a′ ∈ A. For example. f (a) = h(g(a)) = h(b) = h(g(a′ )) = f (a′ ). 3. 2. So 18 . Example 3. then every function in D is constant: For all a. 8}. Thus f ∈ D and D ⊂ C A . 1}. {0. 1. B. If B has exactly one element. C be any sets with at least two elements. A consequence is that unsplittable objects like a letters and digits can now be split into subparts. Proof. C. 5. 6} and f −1 [{5. b) ∈ f ) }. that is. C. 6}] = {8.15. 1}N is the set of all functions from N to {0. / If B = A and f ∈ C A . f −1 [D] is the subset of A determined by f −1 [D] = { a ∈ A | ∃ b ∈ D ((a. Now deﬁne f (a) = c and f (a′ ) = c′ for all a′ ∈ A − {a}. Then. 8}] = {0. either D ⊂ C A or D = C A . Example 3. the digit “9” as 00110101 and the letter “A” as 01000001. Then the set f [{3. 7. For given sets A. This function is not constant since A has at least two elements. that is. the American Standard Code for Information Interchange (ASCII) represents the digit “0” as 00110000. 4. 8} as f [{0.16. 10. 5. Now h ◦ g = f and thus D = C A . Thus D ⊆ C A . But there is also a nonconstant function in C A : By choice there are two distinct elements c. 6. 6. 2. 4 Natural Numbers In computing. let D = {f ∈ C A | ∃g ∈ B A ∃h ∈ C B (f = h ◦ g)}. For example. 1. 11} is the image of {3. 2.14. then one can take g to be the identity and h = f . f −1 [D] is the preimage of D under f . 7. 3} is the preimage of {5. depending on the choice of A.

. 3. one can easily see from the way they are coded that X is the maximum of X. is N = {0. due to this representation. . That is. 3. Note that 1 = {0} = 0 ∪ {0}. Since one cannot avoid that a number has elements. • N is inductive.2 (Inﬁnity). Given a ﬁnite set X of natural numbers. . One can only prove the following things: • N contains every natural number as every inductive set contains every natural number. every object is represented as a set. 2 = {∅. A set X is an inductive set iﬀ ∅ ∈ X and X is closed under S: ∀y ∈ X (S(y) ∈ X). 3} = 3 ∪ {3} and so on. then it is called inductive. This means. 2. 19 . . . one cannot prove this because one cannot prove that {0.} true? From the axioms.3. 2 = {0.. Then call N = {Y ⊆ X : Y is inductive}. Deﬁnition 4. So S( X) gives a new element of the natural numbers which is outside the set X. primitive objects like the natural numbers have. . Let X be any inductive set. Therefore. {∅}} and this representation will be kept ﬁxed. . Axiom 4. In Set Theory. 0101. 1. . 3 = {0. This can be formalized using the notion of a successor S: S(x) = x ∪ {x}. 1. elements of their own. ensuring that the natural number are in V means to ensure that a provably inﬁnite set is in V . . If a set X shares the basic properties of the natural numbers that it contains 0 and for every n also the successor S(n). the numbers are identiﬁed with their representation 0 = ∅. one tries to represent them as natural as possible. Deﬁnition 4.} is a set. 1. Thus. 2. Inductive sets are always inﬁnite.1. 2} = 2 ∪ {2}. 4 = {0. Furthermore. 2. One can show that the deﬁnition of N is unique. Both parts can be split again into 4 binary digits 0 and 1 which are the only primitive objects. they are no longer unsplittable elements which cannot decomposed furthermore. is N the set of all natural numbers? That is. so it does not matter which inductive set one chooses to start. 1} = 1 ∪ {1}. 0001. Against the intuition. 1. 
The set ∅ is also called 0 and inductively the set S(n) is called n + 1 where S is the function given by S(x) = x ∪ {x}. 1 = {∅}. One can now ask. the number n has as elements those sets which represent the numbers m below n.the digits have all the preﬁx 0011 followed by 0000. There exists an inductive set. that the set of natural numbers is provably inﬁnite.

• every inductive subset X of N satisfies N = X.
• if the collection {0, 1, 2, ...} is actually a set then N = {0, 1, 2, ...}.
• all the laws which one learns in lectures about {0, 1, 2, ...} and which can be written down formally are true for N.

For this reason, in many set-theory books N is just called the set of natural numbers and it is assumed that N = {0, 1, 2, ...}. But this is not completely precise as will be shown in Section 20.

This situation is a bit unpleasant as everyone assumes that the existence of the set of natural numbers is self-evident, but unfortunately it is not. The problem behind this is a problem of axioms: as long as formulas are finite and one writes down the axioms using the language of set theory by accessing V and ∈ and using variables for members of V, quantifiers and Boolean connectives, one cannot come up with any set of finite formulas which defines N better than done above, that is, such that N = {0, 1, 2, ...} would be guaranteed. Such a language of set theory would include a sentence like "there is a set X such that for all elements y ∈ X, y contains an element z and z is not empty", but it does not contain an infinite formula like ∀x ∈ N (x = 0 ∨ x = 1 ∨ x = 2 ∨ x = 3 ∨ ...) which would be needed to make sure that N = {0, 1, 2, ...} and does not contain anything else. Nevertheless, N behaves like the set of natural numbers should behave and for that reason, most books of set theory call "N" just "the set of natural numbers".

Definition 4.4. A set A is called transitive iff for all a ∈ A and b ∈ a it holds that b ∈ A as well.

Example 4.5.
1. The set {∅, {∅}, {{∅}}, {{{∅}}}, {{{{∅}}}}} is transitive, but not all of its elements are: for example, {{∅}} contains {∅} but not ∅.
2. N is transitive.

Here are some explanations. In the years 1945 – 1991, the Soviet Union was a member of the United Nations (UNO). The Soviet Union itself had 15 republics as its members: Soviet Union = {Belarus, Estonia, Latvia, Lithuania, Ukraine, ...}. Two of them, Belarus and Ukraine, were not only members of the Soviet Union but also of the UNO and represented in the general assembly. So one had the situation that Ukraine ∈ Soviet Union ∈ UNO and Ukraine ∈ UNO at the same time. If this situation would not only occur at some but at all places, that is, if every member of a member of the UNO would already be a direct member of the UNO, then one would call the UNO "transitive".

Proof of Second Statement. Given N, let A = {X ∈ N | X ⊆ N}. Clearly ∅ ∈ A. Now consider any X ∈ A. By definition of A, X ∈ N and X ⊆ N. Thus S(X) =

5. G is the set of all ﬁnite subsets of N. 3. {∅}}. 3. 1. 7.X ∪ {X} ⊆ N. D is the closure of {∅. 2. 5.6. {{{∅}}}}. It follows that A is an inductive subset of N. 2. φ(0) holds and 21 . X has no proper inductive subset. A = {∅.8 (Mathematical Induction Principle). S(X) ∈ N as N is inductive. N is transitive.7. A = N and every X ∈ N satisﬁes X ⊆ N. F is the set of all natural numbers which can be written down with at most 256 decimal digits. Theorem 4. Show that the following statements are equivalent for any inductive set X. Hence. Exercise 4. Assume that 1. Furthermore. X is a subset of every inductive set. X = N (Y ) for some inductive set Y where N (Y ) is deﬁned as in the previous item. X = N. H = P(G). X = N (Y ) for every inductive set Y where N (Y ) is the subset of those y ∈ Y which are in every inductive subset of Y . Which of the following sets is transitive and which is inductive? 1. 4. E is the set of even numbers. as N is the smallest inductive set. N × N} under the successor operation x → S(x). So X ∈ A ⇒ S(X) ∈ A for all X ∈ A. 6. 8. ∀x ∈ X (x = 0 ∨ ∃y ∈ X (x = S(y))). Exercise 4. 6. 4. C = {x | ∀y ∈ x ∀z ∈ y (z = ∅)}. B = {∅. Let φ(x) be a property.

n ⊂ N. then k ∈ n. for example. This means in particular n < m iﬀ n ∈ m. First consider the case A ⊆ N. Then φ(n) holds for every natural number n ∈ N. 4. Hence N ⊆ A. Property 4. Assume that m. Proposition 4. Let A be a transitive nonempty set. 6. Assume now that one chooses the property p to identify the odd numbers.9. 5. Then A ⊆ N iﬀ 0 is the unique element which is not the successor of any other element of A.10. In particular. Let A = {n ∈ N | φ(n) holds}. p(x) ⇔ ∃y ∈ N (x = y + y + 1). Exercise 4.11. On the other hand. k ∈ n + 1 if and only if k ∈ n or k = n. 1. n. 8. By the Axiom of Foundation there is an y ∈ A such that every z ∈ A satisﬁes z ∈ y. φ(n) implies φ(n + 1). every z ∈ y satisﬁes z ∈ A. 2. then m = n + 1 or n + 1 ∈ m. k ∈ N. Then A is an inductive set. The following properties follow from the fact that the natural numbers are deﬁned such that n = {m ∈ N | m < n}. / 22 . Proof. Show that p then satisﬁes the above condition. 7.2. So. Either 0 = n or 0 ∈ n. If k ∈ m and m ∈ n. n = n + 1. 3. for all n ∈ N. Assume that a property p satisﬁes p(1) and ∀x (p(x) ⇒ p(S(S(x)))). that is. If n = m then n + 1 = m + 1. the third item is then just the transitivity of < on N. φ(n) is true for all n ∈ N. If n ∈ m. Either m ∈ n or m = n or n ∈ m. Proof. Show that p(x) is true for all odd numbers.

Now consider any transitive A and assume now that x ∈ A − {0} and x = S(y) for all y ∈ A. that is. fm (m. 5 Recursive Deﬁnition Functions with domain the set of natural numbers can be deﬁned recursively. n)). Corollary 4. inductive and 0 is the unique element of A which is not the successor of any other element of A. 0) = m. If y ∈ N then x ∈ N as N is transitive. Note that y ∈ A / since A is transitive. But the interpretation of this statement is that every nonempty of the set of natural numbers has a smallest element. Then {fm | m ∈ N} = {(m. 1.Thus y has no elements and y = 0. If / z ∈ N then S(z) ∈ N and S(z) = x. the two statements are explicitly listed here. fm (n)) | m. n ∈ N. as m ∩ X = ∅ implies that there is no n < m in X. So whenever A ⊆ N and A not empty then 0 is the unique / element of A which is not the successor of any other element of A. If X ⊆ n then there is a unique m ∈ X such that m ∩ X = ∅. n. the statement is literally almost the same. A set A is equal to N iﬀ A is transitive. Remark 4. Hence 0 ∈ A. Example 5. indeed. Let X be a nonempty set and n ∈ N.13. For this reason.12. If z ∈ B then z ∈ S(z) and thus again S(z) = x. If y ∈ N then x = S(y) / / / by choice of x.1. n ∈ N} represents the addition. Last consider the case that A ⊆ N and let B = A−N. then there is a unique m ∈ X such that m ∩ X = ∅. deﬁne 23 .3 (Foundation). It follows that x ∈ N again. If x is an inductive set then N ⊆ x and x ∈ N by the Axiom of Foundation. So the element x of A is diﬀerent from 0 and not the successor of any other element of A. deﬁne fm : {m} × N → N by fm (m. one can also deﬁne · : N × N → N as follows: For each m ∈ N. One can deﬁne + : N × N → N as follows: For each m ∈ N. If X ⊆ N. But S(y) is the only natural number z such that y ∈ z ∧ S(y) ∈ z. m + n = fm (n) for all m. Similarly. 2. Let z be any element of A. / If x is not an inductive set then there is a y ∈ x with S(y) ∈ x. which is m above. 
By the Axiom of Foundation there is an element x ∈ B with y ∈ x for all y ∈ B. The proof of this is immediate by Axiom 1. S(n)) = S(fm (m.

c) | b ∈ N ∧ ∃h ∈ C ((b. This contradiction gives D = N. Theorem 5. By the Axiom of Replacement. h(b) is deﬁned as well and h(S(b)) = g(b. Similarly. In summary. h(b)) = ˜ ˜ g(b. a) but not any other pair. Deﬁne the class f = {(b. in contradiction to the assumption on e. n. Proof. Both sets contain 0 as there is a function with domain {0} and value a and as every member of C contains only the pair (0.gm : {m} × N → N by gm (m. (e. such a member exists by the choice of f . h(e) = g(b. Assume that E = N. c ∈ V ((e. f is a function from N to V . gm (m. one can look at the function ˜ ˜ h = h ∪ {(d. hence (d. c). Let C be the class of all functions with domain N such that h(0) = a and for all b ∈ N with h(S(b)) being deﬁned. f [N] 24 . Then there is a minimum e ∈ N − E. m · n = gm (n) for all m. ˜ ˜ ˜ By the Axiom of Comprehension. they are subsets of N. Assume that g : N × V → V is a function and a a value. f (S(n)) = g(n. h(b)). As 0 ∈ D. Then {gm | m ∈ N} = {(m. S(n)) = gm (m. n ∈ N. that is. f (0) = a and 2. Assume that D = N then there is a minimum d ∈ N − D. h(b)) = h(e). 2n can be deﬁned by 20 = 1 and 2S(n) = 2 · 2n . Hence. h(b)))}. there is a number b such ˜ that e = S(b). gm (n)) | m. Furthermore. g(b. f (n)) for all n ∈ N. c) ∈ f )}. This is done by considering the following two subclasses of N: D = {d ∈ N | ∃c ∈ V ((d. C and D. e cannot exist and E = N. n ∈ N} represents the multiplication. 0) = 0. Then there exists a unique function f : N → V such that 1.2 (The Recursion Theorem). Let h be a member of C with b in its domain. As h(d) is undeﬁned. c) ∈ f ⇒ c = c)}. c) ∈ h)} It is now shown that f is actually a function from N to V . As e > 0. This function is also in the class C. h(b)) ∈ f in contradiction to the assumption on C. E = {e ∈ D | ∀c. h(b) = h(b). Furthermore the factorial m! can be deﬁned by 0! = 1 and S(n)! = n! · S(n). By deﬁnition of C. h ∈ C such that h(e) = h(e). By choice there are ˜ ˜ two functions h. n) + m. d = S(b) for some b.

f (0) = a and ˜ f (S(n)) = g(n. T C(X) is the smallest transitive superset of {X}: To see this. The function g corresponds clearly to a class as it is written down as an expression. For every set X there is a set T C(X) which is the smallest transitive set containing X as an element. This theorem has some applications. if f (n) ⊆ Y then f (S(n)) contains only elements which are either in f (n) or elements of elements of f (n). Furthermore. y) = y ∪ y = y ∪ {x | ∃z (x ∈ z ∧ z ∈ y)} consist of all elements of y plus all elements of elements of y. respectively. 25 . f (n)) for all n ∈ N. hence f = f . where Z is any transitive set containing X to start with. f (S(n)) ⊆ Y .is a set and f actually a function from a set to a set. the elements and also the elements of elements of f (n) are also elements and elements of elements of Y . then the restriction of f ˜ that member is extended by f . As f (n) ⊆ Y . Note that the concept of a transitive closure is quite natural: it contains the elements plus element of elements plus elements of elements of elements plus . If any function f : N → A satisﬁes conditions ˜ to each domain n ∈ N is a member in C and 1. It follows from the induction principle that every set f (n) with n ∈ N is a subset of Y . The ﬁrst one is that for each set X there is a transitive closure T C(X) of X.. Now deﬁne T C(X) = f [N]. f (n)) for all n ∈ N. in the same way as the natural numbers were deﬁned as the intersection of all inductive subsets of some given inductive set. Proposition 5. For and any n ∈ N and y ∈ V . Proof. one can even show that the operation X → T C(X) is a function from V to V which is represented as a subclass of V . let g(n. Now f [N] is a set which coincides with the union of all f (n).. The deﬁnition is not sensitive to which set Z is chosen. So f is the only function satisfying the recursive deﬁnition. Clearly f (0) ⊆ Y . It follows from the deﬁnition that x ∈ f (S(n)) and x ∈ f [N]. 
If Y ⊇ X and Y is transitive then Y ⊇ T C(X).. of a set. Then T C(X) = f [N] is also a subset of Y . Thus there is a function f : N → V with f (0) = {X} and f (S(n)) = g(n. Furthermore. and 2. One could also deﬁne it as the intersection of all transitive supersets: T C(X) = {Y : {X} ⊆ Y ⊆ Z and Y is transitive}. As Y is transitive. let Y be any transitive superset of {X}. The set T C(X) is transitive: if z ∈ f [N] and x ∈ z then there is an n ∈ N with z ∈ f (n). Although it is not stated in the next proposition. As f (0) = {X}. that is. X ⊆ T C(X).3.
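The construction in the proof of Proposition 5.3 can be tried out on hereditarily finite sets. The following is only a sketch under stated assumptions: sets are modelled as Python frozensets, the names g, transitive_closure and is_transitive are chosen here for illustration, and the infinite union of all f(n) is replaced by iterating until f(S(n)) = f(n), which is justified only because the sequence f(0) ⊆ f(1) ⊆ ... becomes stationary for such finite inputs.

```python
def g(y):
    """One step of the recursion: y together with all elements of elements of y."""
    elements_of_elements = frozenset(x for z in y for x in z)
    return y | elements_of_elements


def transitive_closure(X):
    """Iterate f(0) = {X}, f(S(n)) = g(f(n)) until nothing new is added."""
    level = frozenset({X})
    while True:
        next_level = g(level)
        if next_level == level:  # stationary, so this equals the union of all f(n)
            return level
        level = next_level


def is_transitive(A):
    """A is transitive iff every element of an element of A is an element of A."""
    return all(b in A for a in A for b in a)


# Von Neumann numerals 0, 1, 2, 3 as frozensets.
zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
three = frozenset({zero, one, two})

# The set {{0}} is not transitive, but its transitive closure is.
X = frozenset({one})
tc = transitive_closure(X)
print(len(tc), is_transitive(tc), X in tc)
```

For X = {{∅}} the closure consists of X, {∅} and ∅; for a natural number n it consists of n together with all its elements.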

Recall the deﬁnition of domination from Exercise 3. f4 (3) = 1. Then f6 (n) = n(n − 1). m (c(m) = c(n)). 9.5. f4 (2) = 0. . each pair {A. .10: A function h : N → N dominates a function f : N → N if there is an n such that ∀m ≥ n (f (m) < h(m)). f4 (0) = 0. Example 5. f (n)) where g(n. f3 (n) = 1 for n = 0. f (n) = h(0) + h(1) + . One explanation would be to assume that there is a soccer league with n teams. f2 (S(S(n))) = f2 (n) · 4·S(n) . . 1. .4. . f (n)) = f (n) + h(n). . . A function f : N → N is increasing if ∀n (f (n) ≤ f (S(n))). . f5 (0) = 1. f1 (S(n)) = f1 (n) + 2n . .Proposition 5. Now Y = f [N] is an inductive set: it contains ∅ as ∅ ∈ f (0). Then there are f6 (n) games per season. . X ⊆ Y . consider f6 given by f6 (0) = 0. B} of two diﬀerent teams plays once at A’s place and once at B’s place. Furthermore. S(S(n)) 3. hence S(y) ∈ Y as well. Function h 1 n n2 n3 3n Sum f . A function f : N → N is unbounded if it dominates every constant function c : N → N satisfying ∀n. f6 (1) = 0 and f6 (S(n)) = f6 (n) + 2n for n ≥ 1. 26 5. f5 (S(n)) = 256 · f5 (n). 2. + h(n). f3 (10n + m) = f3 (n) + 1 for n = 1. f2 (0) = 1. . 1 4. . For every function h : N → N there exists a function f such that f (n) = h(0) + h(1) + . Here some examples for such recursively deﬁned functions f and the functions h on which they are based. for every y ∈ Y there is an n ∈ N with y ∈ f (n) and S(y) ∈ f (n + 1). f1 (0) = 0. . there is a function f : N → V with f (0) = X ∪ {∅} and f (S(n)) = {S(y) | y ∈ f (n)} for all n ∈ N.6. . f4 (1) = 0. f4 (S(n)) = f4 (n) + 2 (n2 − n) for n > 2. for example. f2 (1) = 0. 1. Give informal explanations what these functions compute. . Proof. This is a consequence of the Recursion Theorem which takes f to be the function satisfying f (n + 1) = g(n. 9. . Using Recursion. Every set X is a subset of an inductive set. 2. and m = 0. Determine the functions fn given by the following recursive equations: 1. 
+ h(n) n n2 /2 + n/2 n3 /3 + n2 /2 + n/6 n4 /4 + n3 /2 + n2 /4 (3n+1 − 1)/2 Exercise 5.
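The closed forms in the table of sums above can be checked numerically against the recursive definition. The sketch below assumes the recursion f(0) = h(0) and f(S(n)) = f(n) + h(S(n)), which is one way to read the definition f(n) = h(0) + h(1) + ... + h(n); the name recursive_sum and the list TABLE are introduced here only for illustration. Note that under this reading the constant function h(n) = 1 has n + 1 summands, so its sum is n + 1.

```python
def recursive_sum(h, n):
    """Compute f(n) = h(0) + ... + h(n) by f(0) = h(0), f(S(k)) = f(k) + h(S(k))."""
    total = h(0)
    for k in range(n):
        total = total + h(k + 1)
    return total


# Pairs (h, closed form of the sum), written with exact integer arithmetic.
TABLE = [
    (lambda n: 1, lambda n: n + 1),
    (lambda n: n, lambda n: (n * n + n) // 2),
    (lambda n: n ** 2, lambda n: (2 * n ** 3 + 3 * n ** 2 + n) // 6),
    (lambda n: n ** 3, lambda n: (n ** 4 + 2 * n ** 3 + n ** 2) // 4),
    (lambda n: 3 ** n, lambda n: (3 ** (n + 1) - 1) // 2),
]

all_match = all(
    recursive_sum(h, n) == closed(n)
    for h, closed in TABLE
    for n in range(30)
)
print(all_match)
```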

} might be outside V if H would not be there. one has in both cases that f (S(n)) = S(f (n)) and it follows from the deﬁnition of g that S(f (n)) < hm′ (S(n)) for all m′ ∈ f (n). k) = S(k) if S(k) < hm (S(n)) for all m ∈ k. 3. Sydney} of Australia’s largest towns has six 27 . Thus f (S(n)) < hm (S(n)) in contradiction to the choice of n. Assume by contradiction that hm does not dominate f . Since hm is increasing. Now it is veriﬁed that f satisﬁes all necessary requirements. . . h1 . So the function g can be deﬁned using only one parameter from V . namely H. Brisbane. Then either f (j) > k or f (j) = k ∧ g(j. increasing and dominated by every hm with m ∈ N. This is proven using the Recursion Theorem with a function g deﬁned as g(n. f is increasing. There is a function f such that f (0) = 0 and ∀n ∈ N (f (S(n)) = g(n. n0 . Every hm dominates f . Since f is unbounded and increasing. . Given a function H : N × N → N such that for every m ∈ N the function hm given as hm (n) = H(m. Canberra. nk−1 }. Let j = 2 + max{i. So f is unbounded. S(n)) in order to get hm (S(n)). the set {Adelaide. S(f (n))}. This follows directly from the deﬁnition of g: for all n. f (j)) = S(k) ∧ f (S(j)) = S(k). there is an unbounded and increasing function f such that every hm dominates f .8. Let k be the value of f at some place i. So the n chosen cannot exist and hm dominates f . Proof. n) is increasing and unbounded. This deﬁnition can be realized as an expression since one can access H(m. . Show that there is a function f dominating every hm . Melbourne. . Perth. . one can conclude that either f (n) ≤ S(m) or f (n) < hm (n). n) for all n. 2. Let H : N × N → N be a function and hm : N → N be given by hm (n) = H(m. This is necessary because the set {h0 . k otherwise. Exercise 5. For every m ∈ k there is a value nm such that hm (nm ) > S(k) since hm is unbounded. f (n))). Since f (0) = 0. 1. Thus f is going up at inﬁnitely many places and unbounded. 
one has f (n) ≥ S(m) and m ∈ f (n).7.Theorem 5. f is unbounded. Since f (S(n)) > S(m). f (S(n)) ∈ {f (n). 6 Cardinality of Sets The cardinality of a set is the number of its elements. there is a ﬁrst value n such that f (S(n)) > S(m) and f (S(n)) ≥ hm (S(n)). For example. . n1 .

members, that is, its cardinality is 6. This is established by counting. Adelaide is the first, Brisbane the second, Canberra the third, Melbourne the fourth, Perth is the fifth and Sydney the sixth town. Mathematically, this can be viewed as a bijective mapping from the set of the largest Australian towns to the set {first, second, third, fourth, fifth, sixth} representing 6. When working out the foundations of set theory, Cantor defined that two sets have the same size iff there is a bijective function between them.

Definition 6.1. Let A and B be two sets. The sets A and B have the same cardinality, denoted by |A| = |B|, iff there exists a bijective function from A to B.

Property 6.2. Having the same cardinality is a transitive and reflexive equivalence relation. That is, the following holds for all sets A, B, C:
1. |A| = |A|;
2. if |A| = |B|, then |B| = |A|;
3. if |A| = |B| and |B| = |C|, then |A| = |C|.

Example 6.3. Let E ⊂ N be the set of all even natural numbers. Then |E| = |N| which is witnessed by the bijection x ↦ 2 · x from N to E.

Proposition 6.4. |P(X)| = |{0, 1}^X|.

Proof. For each A ⊆ X, let f(A) be the characteristic function of A, defined by
f(A)(x) = 1 if x ∈ A,
f(A)(x) = 0 if x ∉ A.
Then f is one-to-one and onto.

Proposition 6.5. If C ⊆ B ⊆ A and |C| = |A| then |A| = |B|.

Proof. Let f : A → C be a bijective function. One recursively defines two sequences A₀, A₁, ..., Aₙ, ... and B₀, B₁, ..., Bₙ, ... of sets as follows. Let A₀ = A and B₀ = B and for each n ∈ N, let Aₙ₊₁ = f[Aₙ], Bₙ₊₁ = f[Bₙ]. By induction, Aₙ₊₁ ⊆ Bₙ ⊆ Aₙ for all n ∈ N. For each n ∈ N, let Eₙ = Aₙ − Bₙ. Thus there is a set E such that E = ⋃{Eₙ | n ∈ N}. Note that f[Eₙ] = f[Aₙ] − f[Bₙ] = Aₙ₊₁ − Bₙ₊₁ = Eₙ₊₁ which can be proven by induction.

The idea is now to use that all the Eₙ are disjoint and that E₀ = A − B. The new function g will be a one-to-one mapping from each Eₙ to Eₙ₊₁ where it follows f and will be the identity otherwise:
g(x) = f(x) if x ∈ E,
g(x) = x if x ∈ A − E.
Then g is one-to-one and g[A] = ⋃{Eₙ | n ≥ 1} ∪ (A − E) = A − E₀ = B.

Definition 6.6. Let A and B be two sets. The cardinality of A is less than or equal to the cardinality of B, denoted by |A| ≤ |B|, iff there exists a one-to-one function f : A → B. The cardinality of A is less than the cardinality of B, denoted by |A| < |B|, if |A| ≤ |B| and there is no one-to-one function from B into A.

Property 6.7. Each A, B, C ∈ V satisfy the following:
1. |A| ≤ |A|;
2. if |A| = |B| then |A| ≤ |B|;
3. if |A| ≤ |B| and |B| ≤ |C| then |A| ≤ |C|.

Exercise 6.8. Prove by giving a one-to-one function that the set {Auckland, Christchurch, Dunedin, Wellington} of New Zealand's largest towns has a cardinality which is less than the one of the set of Australian towns given above. Furthermore, prove that it is not less or equal than the cardinality of the set {Singapore}.

Theorem 6.9 (Cantor-Bernstein Theorem). If |X| ≤ |Y| and |Y| ≤ |X|, then |X| = |Y|.

Proof. Let f : X → Y and g : Y → X be one-to-one functions. Then g[f[X]] ⊆ g[Y] ⊆ X. Since |X| = |g[f[X]]|, by Proposition 6.5, |X| = |g[Y]|. Since |Y| = |g[Y]|, |X| = |Y|.

Example 6.10. The sets {0, 1}^N and {0, 1, 2}^N have the same cardinality.

Proof. Every {0, 1}-valued function is also a {0, 1, 2}-valued function, thus {0, 1}^N ⊆ {0, 1, 2}^N and |{0, 1}^N| ≤ |{0, 1, 2}^N|. It remains to show the other direction. Consider the function F : {0, 1, 2}^N → {0, 1}^N given as
F(f)(2n) = min{f(n), 1},
F(f)(2n + 1) = min{2 − f(n), 1}.

11. F (h)(2n) = 1 and F (h)(2n + 1) = 0. . the statement holds trivially. then f is onto. If f : n → n is a one-to-one function. f (n) = n and ˜ ˜ ˜ f (k) = f (k) for all k ∈ S(n) − {m. This shows that f is not onto.12 (Cantor). Exercise 6. the deﬁnition states that a ∈ A iﬀ a ∈ f (a). Then also f is onto. 3. n}. Hence |X| ≤ |P(X)|. 30 . Then f [n] ⊆ n and f [n] = n by induction hypothesis. Proof. 2. Theorem 6. 1. Let f : S(n) → S(n) be a one-to-one function. . |X| < |P(X)|. . It follows that F is one-to-one and |{0. Furthermore. consider any function f from X to P(X). one has |{0. . This is proven by induction. 1. f satisﬁes now the case “Either” above and is onto.Assume that f (n) = 0. both sets have the same cardinality. h(n) = 2. f and f have the same range since f was ˜ obtained from f by interchanging the values at m and n. Theorem 7. Let n ∈ N. 2. 1. 3. if n ∈ N and u ⊆ n then |u| = |m| for some m ∈ {0. one deﬁnes the subset A ⊆ X by A = {x ∈ X | x ∈ f (x)} / and shows that A is not in the range of f . One has to show that X is not onto. 4}|. 2}N | ≤ |{0. A proper subset of a ﬁnite set is strictly smaller. Proof. Thus if two functions are diﬀerent at n. 7 Finite and Hereditarily Finite Sets Example 6. that is. Therefore f (n) = n and f is onto. For this. Actually this comes directly from the deﬁnition of A: for given a. Thus A and f (a) diﬀer with respect to the membership / of a and f (a) = A. 2. When n = 0. Now let f (m) = f (n). their image is diﬀerent either at 2n or at 2n + 1. 4} identiﬁed with the natural number 5. Then F (f )(2n) = 0 and F (f )(2n+1) = 1. every m < n is in the range of f [n]. The function f (x) = {x} is a one-to-one function. Show that if |X| = |X × N| then |{0.1. 4}| < |{0. By Theorem 6. n}. Thus f (n) ≤ n but not f (n) < n. ˜ ˜ Or there is some m ∈ n with f (m) = n. 1}X | = |NX |. F (g)(2n) = 1 and F (g)(2n + 1) = 1. Either there is no m ∈ n with f (m) = n. 1. 1}N |. 3. This is no longer possible for ﬁnite set. 
To show that |X| = |P(X)|.3 considers the set of even numbers which is a proper subset of N but still has the same cardinality of N. g(n) = 1. This is done by verifying that f (a) = A for any a ∈ X. for the set {0. 2.9. For example.

6. Since Y is ﬁnite. Y. 7. 3. Now deﬁne by recursion a function g with g(0) = 0 and g(S(k)) = g(k) if k ∈ u. It follows that g(S(k)) > g(k). 6. k ∈ u and g(k) = ℓ.} of the natural numbers and {0. 1723907238947. 8. 1723907238949}. Assume furthermore that X ⊆ Y and that |X| = |Y |. for example {2. 1723907238948. say the ﬁrst. 3. . .2. 3. 4. So the ﬁnite subsets of N are intuitively those which can be completely listed. . 5}. there is a natural number n and a bijection g : n → Y . this is also true for some ﬁnite sets like {0. 2. A set X is inﬁnite iﬀ X is not ﬁnite. . / S(g(k)) if k ∈ u. If ℓ ∈ m then there is a minimal k with g(S(k)) > ℓ. Inﬁnite sets can never be written down completely as the examples {0. Deﬁnition 7. 2. Assume now that x. The second property implies that there 31 Z is ﬁnite. 2. . assume that u. 1. . y ∈ u and x.1 there is a one-to-one mapping g from u onto some m ∈ N.3. 4.} of the even numbers show. Thus S(x) ≤ y and thus g(y) ≥ g(S(x)) > g(x). X ∪ Y is ﬁnite. 2.For the second statement. From a practical point of view. P(X) is ﬁnite. Proof. By Theorem 7. 3. Now let m = g(n). y are diﬀerent elements. If X ⊂ Y then |X| < |Y |. 3125}. . 625. . . so |u| = |m|. 9} and {1. 1. Thus |A| = |m| and A is ﬁnite. Theorem 7. 5. . Now one deﬁnes for all b ∈ A the mapping b → g(f (b)) which then is a one-to-one mapping from A onto m. By assumption. Let A ⊆ X. 125. So g is one-to-one on u (although not outside u). 4. A set X is ﬁnite iﬀ there is some n ∈ N such that |X| = |n|. {0. Z be ﬁnite sets and f be a function. If every element of Z is ﬁnite then 6. . 8. 2. 5. 25. g(S(k)) = S(g(k)). f [X] is ﬁnite. 1. Thus m = g[u] and g is one-to-one on u. Every subset of X is ﬁnite. 1. 3. 2. Let X. 5. there is a one-to-one and onto mapping f from X to some n ∈ N. Let u = f [A]. Then either x < y or y < x. n are given with n ∈ N and u ⊆ n. 1.

. Thus un = Z is ﬁnite. Y are ﬁnite. 2n ∈ N and every set of 2n elements is ﬁnite. It follows that X ∪ Y is ﬁnite. there are natural numbers n. . . . 1.1. Now let Y = X ∪ {x} have S(n) elements. then not only A itself but also all of its elements. .1. f [X] is a ﬁnite set. |u| = |m| for some m ∈ N and it follows that |f [X]| = |h[u]| = |u| = |m|. The set {{N}} is ﬁnite since it contains only the set {N}. h′ is a bijection and maps n to a subset of n. . Let X be ﬁnite. hY such that X = hX [n] and Y = hY [m]. zn−1 are all ﬁnite sets. Most easily. this is deﬁned by looking at the transitive closure. That is. . if A is hereditarily ﬁnite. Now let h(k) = f (g(k)) for all k ∈ n and deﬁne u = {k ∈ n : h(k) ∈ h[k]} as the set of all numbers k for / which h(k) takes a value not taken by any h(ℓ) with ℓ < k. . 4. z0 . Clearly u0 is ﬁnite and by induction over k it follows that uk+1 is ﬁnite as it is the union of two ﬁnite sets. . Now deﬁne u0 = ∅ and for every k ∈ n inductively uk+1 = uk ∪ zk . . n − 1 and g(k) = hY (k−n) for k = n. that is. By Theorem 7. x ∈ X. m and mapping hX . By assumption. going along the iterated membership relation of {{N}} end up in an inﬁnite set. That is. 1. z1 . . n+m−1} is the domain of g and X ∪ Y the range of g. Z ∪ {x} of Y . There is a bijection g : n → X for some natural number n. . can bijectively mapped to n. Then h[u] = h[n] = f [X] and h is one-to-one on u. . . 6. Prove that the set of all functions from X to X is ﬁnite. Assume that the inductive hypothesis for a set X having n elements is proven. That is. That set is ﬁnite again. It follows that |P(Y )| = 2 · |P(X)|. . but it contains the inﬁnite set N. Z = {z0 . . .is a bijection h : Y → X. For every subset / Z of X there are two subsets Z. .4. Now let h′ map every m ∈ n to g −1 (h(g(m))). h′ is onto and h′ [n] = n. Then the ﬁnite set {0. A set is called hereditarily ﬁnite if this does not happen. As X. Thus X = h[g[n]] = g[g −1 [h[g[n]]]] = g[h′ [n]] = g[n] = Y . 
all of the elements of its elements and so on are ﬁnite. thus the quantity of subsets of Y is two times as large as the quantity of subsets of X. Now deﬁne g(k) = hX (k) for k = 0. Thus |P(Y )| has 2S(n) = 2|Y | elements and is ﬁnite again. n+m−1. Exercise 7. This is obviously true for X = ∅ where P(X) = {∅} contains 1 = 20 elements. By Theorem 7. that is. So |X| = |Y | does not hold for a proper subset of Y . . n+1. . Furthermore. 32 . zn−1 } for some ﬁnite index set n. z1 . 3. It is veriﬁed by induction that P(X) has 2n elements and is ﬁnite whenever X has n elements. As Z is ﬁnite. 5.
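The doubling step used above to show that P(X) is finite, namely that every subset Z of X gives rise to exactly two subsets Z and Z ∪ {x} of Y = X ∪ {x}, can be turned directly into a small program. This is a sketch only: the name power_set is introduced here for illustration, sets are modelled as frozensets, and the input is assumed to list pairwise distinct elements.

```python
def power_set(elements):
    """Build P(X) by the induction P(empty) = {empty}, P(X + {x}) = P(X) + {Z + {x} : Z in P(X)}."""
    subsets = [frozenset()]  # P of the empty set has 1 = 2**0 elements
    for x in elements:
        # Each old subset Z is kept and additionally extended to Z ∪ {x},
        # so the number of subsets doubles in every step.
        subsets = subsets + [z | {x} for z in subsets]
    return subsets


sizes = [len(power_set(range(n))) for n in range(6)]
print(sizes)
```

The list of sizes follows the pattern 2⁰, 2¹, 2², ... from the induction.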

Definition 7.5. A set A is called hereditarily finite iff every element of TC(A) is finite. Furthermore, let Vω = {x ∈ V | x is hereditarily finite}.

The intuition behind this is to call a set hereditarily finite iff it can be written down explicitly on paper. The natural numbers had been introduced by iterations of the operation x → x ∪ {x} starting with the empty set. This permits to write down every natural number explicitly, for example, 3 is {∅, {∅}, {∅, {∅}}}. In practice, one encounters of course the problem that already sets corresponding to moderately sized numbers like 275 require more symbols to be written down explicitly than there are atoms in the universe; it is estimated that there are between 10^78 and 10^82 atoms in the universe and between 10^48 and 10^52 atoms on the Earth.

Example 7.6. The set {1, {0, 2}, {0, 3}} is hereditarily finite. The set N is infinite but all its elements are hereditarily finite. The set {{N}} is finite but not hereditarily finite.

Property 7.7. A finite set x is hereditarily finite iff every y ∈ x is hereditarily finite.

Theorem 7.8. A is hereditarily finite iff TC(A) is finite.

Proof. If TC(A) is finite then A is also hereditarily finite. Assume now that TC(A) is infinite. Let B = {x ∈ TC(A) | TC(x) is finite}. Clearly B ⊂ TC(A) as A ∉ B. By the Axiom of Foundation there is X ∈ TC(A) − B such that no element of X is in TC(A) − B, thus X ⊆ B. Now TC(Y) is finite for all Y ∈ X and thus {TC(Y) | Y ∈ X} is a set of finite sets. Thus TC(X) = {X} ∪ ⋃{TC(Y) | Y ∈ X} is infinite only if X is infinite, hence X is infinite and A not hereditarily finite.

Theorem 7.9. The class Vω is actually a set and coincides with the smallest set W satisfying
1. ∅ ∈ W;
2. if v ∈ W then {v} ∈ W;
3. if v, w ∈ W then v ∪ w ∈ W.

Proof. Clearly ∅ is hereditarily finite. If v is hereditarily finite, so is {v}. Furthermore, if v, w are hereditarily finite then they are finite sets of hereditarily finite sets. It follows that v ∪ w is a finite set and that all its members are hereditarily finite since they are members of either v or w (or both). Thus Vω is closed under the three operations given.

The next step is to construct a function f from N onto Vω in order to show that it is a set. This function is first constructed as a function from N into V and the properties are shown later. The inductive construction goes as follows:
• f(0) = ∅;
• f(2^n + m) = f(m) ∪ {f(n)} for n ∈ N and m ∈ {0, 1, . . . , 2^n − 1}.

One can show that f[N] ⊆ Vω: Assume that k would be the least number with f(k) ∉ Vω. Then k > 0 as f(0) = ∅ and ∅ ∈ Vω. So there are n, m ∈ N with m < 2^n ≤ 2^n + m = k. Furthermore, f(n) and f(m) are both in Vω. Thus {f(n)} and {f(n)} ∪ f(m) are also both in Vω. It follows that f(k) ∈ Vω in contradiction to the assumption. So one actually has f[N] ⊆ Vω. Note that this argumentation also works with every set W satisfying 1., 2., 3., thus one has that f[N] ⊆ W for such a set.

Now consider any set z ∉ f[N]. It follows from the Axiom of Foundation that there is a set x ∈ TC(z) − f[N] with x ∩ (TC(z) − f[N]) = ∅. As x ⊂ TC(z) one can conclude that x ⊆ f[N]. Now define inductively a function g such that f(g(n)) = f[n] ∩ x as follows: g(0) = 0 and
g(S(n)) = g(n) + 2^n if f(n) ∈ x;
g(S(n)) = g(n) if f(n) ∉ x.
One can show inductively that 0 ≤ g(n) < 2^n for all n ∈ N. Furthermore, f(g(0)) = f[0] ∩ x as g(0) = 0 and f[0] = f[∅] = ∅. Inductively, if f[n] ∩ x = f(g(n)) and f(n) ∉ x then g(n + 1) = g(n) and f(g(n + 1)) = f(g(n)) = f[n] ∩ x = f[n + 1] ∩ x; if f[n] ∩ x = f(g(n)) and f(n) ∈ x then g(n + 1) = g(n) + 2^n and f(g(n + 1)) = {f(n)} ∪ f(g(n)) = {f(n)} ∪ (f[n] ∩ x) = f[n + 1] ∩ x. Since x ∉ f[N], f(g(m)) ≠ x for all m ∈ N and for every m ∈ N there is an n ∈ N with n > m and g(n) > g(m). In particular, f(g(n)) is a proper superset of f(g(m)). It follows that x = ⋃ f[g[N]]. It follows that x is infinite and x ∉ Vω. As x ∈ TC(z), z ∉ Vω as well. As a consequence, Vω = f[N] and Vω is a set. Furthermore, Vω ⊆ W for all sets W satisfying the conditions 1., 2. and 3., as this had been proven above for f[N] in place of Vω.

Exercise 7.10. Prove that Vω satisfies the following property: if x ∈ Vω and y ⊆ x or y ∈ x, then y ∈ Vω. Show that N does not satisfy this property, but that some proper infinite subclass of Vω does.

Exercise 7.11. Determine all x0 ∈ V which satisfy that there are no x1, x2, x3, x4 ∈ V with x1 ∈ x0, x2 ∈ x1, x3 ∈ x2, x4 ∈ x3. The set {{∅}} is such an x0: although x1 = {∅} and x2 = ∅ exist, x3 and x4 do not exist. The set {{∅, {{∅}}}} does not qualify.
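The inductive construction f(0) = ∅, f(2^n + m) = f(m) ∪ {f(n)} for m < 2^n can be tried out directly on small numbers. The sketch below represents hereditarily finite sets as nested Python `frozenset` values; the inverse function `code` is a hypothetical helper introduced only for this illustration and does not appear in the notes.

```python
def f(k: int) -> frozenset:
    """f(0) = empty set; f(2^n + m) = f(m) ∪ {f(n)} for m < 2^n."""
    if k == 0:
        return frozenset()
    n = k.bit_length() - 1        # 2^n is the largest power of two that is <= k
    m = k - (1 << n)              # hence 0 <= m < 2^n
    return f(m) | {f(n)}


def code(s: frozenset) -> int:
    """Hypothetical inverse of f: the sum of 2^code(e) over all members e of s."""
    return sum(1 << code(e) for e in s)


empty = frozenset()
one = frozenset({empty})          # the natural number 1 = {∅}
two = frozenset({empty, one})     # the natural number 2 = {∅, {∅}}
```

Under this reading, f(k) collects exactly the sets f(n) for which the n-th binary digit of k is 1, which is why distinct numbers give distinct sets.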

8 Countable Sets

Since N and P(N) do not have the same cardinality, there are several different cardinalities of infinite sets. The set N has the least infinite cardinality and this cardinality is called "countable". The word "countable" itself comes from the fact that one can count one by one all the things which can be written down explicitly. For example, every natural number can be written down as a finite sequence of digits. The same applies to all other types of objects which can be coded with a finite alphabet. In the case of words (or strings) over the English alphabet, "a" is the first word, "b" the second, "z" the twenty-sixth, "aa" the twenty-seventh, "az" the fifty-second, "ba" the fifty-third, "zx" the seven-hundredth, "zz" the seven-hundred-second and "aaa" the seven-hundred-third word. So there is a surjective mapping from the natural numbers to all finite nonempty strings over the English alphabet which counts these strings. Countable sets are the only infinite sets where one can, theoretically, name every element by a name. For example, the set of all possible novels is countable since one can write down each novel using an alphabet plus punctuation symbols and special characters. The same is true for the set of all computer programs. This fact motivates the following definition.

Definition 8.1. A set X is at most countable iff there is a surjective function f : N → X. X is countable iff X is at most countable and infinite. X is uncountable iff |X| > |N|.

Remark 8.2. Any finite set is at most countable but not countable, that is, countable sets are defined to be infinite, in order to avoid saying "infinite and countable" all the time. The sets of natural numbers and integers are examples of countable sets. If g is a function and X at most countable, then g[X] is at most countable: By definition, there is a surjective function f : N → X which witnesses that X is at most countable. The concatenation n → g(f(n)) then witnesses that g[X] is at most countable.

Definition 8.1 reflects the property that one can enumerate the elements of a countable set, that is, that there is a surjective function from N to X. But the condition "|X| ≤ |N|" is defined the other way round: there has to be a one-to-one function from X to N. The next result shows that both ways to define "at most countable" and "countable" are equivalent.

Proposition 8.3. A set X is at most countable iff |X| ≤ |N|. An infinite set X is at most countable iff |X| ≤ |N| iff |X| = |N|.

Proof. Let X be at most countable. Then there is a function f : N → X which is surjective. For every x ∈ X let g(x) be the minimum of f^(−1)(x) = {y ∈ N | f(y) = x}.

The value g(x) exists because f is surjective. Furthermore, if g(x) = g(y) then x = f(g(x)), y = f(g(y)) and x = y, thus g is injective. So |X| ≤ |N|.

Assume now that X is infinite and that f is as above. For every number n there is a first natural number m such that |S(n)| ≤ |f[S(m)]|; let h(n) denote this number m for given n. Due to the definition of h one has that |f[S(h(n))]| > |f[h(n)]|. In particular, f(h(n)) ∈ f[h(S(n))] − f[h(n)]. It follows that the mapping n → f(h(n)) is one-to-one and witnesses that |N| ≤ |X|. Thus |X| = |N| by Theorem 6.9.

Since the identity restricted to a subset Y of a set X is a one-to-one mapping from Y to X, one has the following corollary.

Corollary 8.4. If X is countable and Y ⊆ X then Y is at most countable. Furthermore, if Y is infinite then |Y| = |X|.

Example 8.5. The set Q of all rationals is countable.

Proof. For every rational q there are unique numbers n(q), m(q), k(q) such that the following conditions hold:
1. if q = 0 then n(q) = 0, m(q) = 0 and k(q) = 5;
2. if q > 0 then q = n(q)/m(q), k(q) = 7 and n(q), m(q) do not have a common prime factor;
3. if q < 0 then q = −n(q)/m(q), k(q) = 11 and n(q), m(q) do not have a common prime factor.
Let f(q) = 2^(n(q)) · 3^(m(q)) · k(q). It is easy to see that f is a one-to-one mapping from Q to N. Thus |Q| ≤ |N|. Since N ⊆ Q, the cardinality of both sets is the same and Q is countable.

Proposition 8.6. If X and Y are at most countable so is X × Y.

Proof. Since X, Y are at most countable, there are one-to-one mappings f : X → N and g : Y → N. Now let h(x, y) = 2^(f(x)) · 3^(g(y)). The function h is a one-to-one mapping from X × Y to N. Thus |X × Y| ≤ |N|. If X and Y are infinite, then one can take h even such that h is bijective.

Remark 8.7. Cantor constructed an explicit bijection between N × N and N. He used the function p mapping m, n to (1/2) · (m + n) · (m + n + 1) + m, so p(0, 0) = 0, p(0, 1) = 1, p(1, 0) = 2, p(0, 2) = 3, p(1, 1) = 4, p(2, 0) = 5, p(0, 3) = 6 and so on. The verification

that p really is a bijection is left to the reader. Furthermore, Proposition 8.6 can be adapted to show that X × Y and X ∪ Y are countable whenever X and Y are countable sets.

Exercise 8.8. Let D = {f : N → N | ∀n (f(S(n)) ≤ f(n))} be the set of all decreasing functions. Show that D is countable.

For the next result, consider just the function f built in Theorem 7.9.

Property 8.9. The set Vω of all hereditarily finite sets is countable.

Theorem 8.10. Let A be nonempty and at most countable and A∗ denote the set of finite sequences of members in A. Then A∗ is countable. Furthermore, the set of all finite subsets of A is at most countable.

Proof. Let g : N → A be a surjective function and consider the following set B: x ∈ B ⇔ x is of the form {(0, b0), (1, b1), (2, b2), . . . , (n − 1, b_(n−1))} for some natural numbers n and b0, b1, . . . , b_(n−1), where the latter are used to code elements of A. As ordered pairs are just coded sets, one can easily see that B ⊆ Vω. Now mapping the empty set to the empty sequence and {(0, b0), (1, b1), (2, b2), . . . , (n − 1, b_(n−1))} to g(b0) g(b1) g(b2) . . . g(b_(n−1)) shows that A∗ is at most countable. As A∗ contains for each n ∈ N a sequence of length n, A∗ is infinite and countable.

Let E be the set of all finite subsets of N. The set E is at most countable as Vω is countable. Furthermore, mapping each x ∈ E to g[x] produces a surjective mapping from E onto all finite subsets of A and thus the set of all finite subsets of A is at most countable.

Exercise 8.11. Let A contain four elements, the symbol ∅, the bracket {, the bracket } and the comma in this ordering, that is, the symbol ∅ comes first and the comma comes last. Let <ll be the length-lexicographic ordering of the set A∗ of all strings over A: if v is shorter than w then v <ll w; if v, w have the same length then v <ll w ⇔ v <lex w. Now let f : Vω → A∗ map every set x in Vω to the first expression describing x, for example,
• f({1, 2}) = "{{∅},{∅,{∅}}}";
• f({0, 3}) = "{∅,{∅,{∅},{∅,{∅}}}}".
As ∅ <ll {}, the symbol "{}" is never used to describe the empty set; this convention is also applied in this text. Check which of the following facts are true:
1. the length of f(x) is odd for every x ∈ Vω;

2. f(2) = "{∅,{∅}}";
3. if f(x) = "{y}" then f(S(x)) = "{y,{y}}";
4. the length of f(P(x)) is the product of the length of f(x) and the cardinality of P(x) plus 1.
If the length of f(x) is n and f(y) is m, what is the length of f((x, y)) for the ordered pair (x, y)? Furthermore, find a formula giving the length of f(n) for every n and determine which of the following numbers is the length of f(10): 42, 100, 1000, 1001, 1022, 1023, 1024, 2047, 4096, 10^10 − 1, 2^256, 256^2, 1023^1023.

Exercise 8.12. Let A be the set of algebraic real numbers, that is, the set of all r ∈ R for which there are n ∈ N and z0, z1, . . . , zn ∈ Z such that zn ≠ 0 and z0 + z1·r + z2·r^2 + . . . + zn·r^n = 0. Show that A is countable by giving a one-to-one mapping from A into N. Note that such a polynomial of degree n can have up to n places r which are mapped to 0.

Example 8.13. The set F of all continuous functions f : Q → Q is uncountable.

Proof. Given A ⊆ Z, one can map A to the continuous function f ∈ F given as
f(q) = 0 if z ∉ A and z + 1 ∉ A;
f(q) = q − z if z ∉ A and z + 1 ∈ A;
f(q) = z + 1 − q if z ∈ A and z + 1 ∉ A;
f(q) = 1 if z ∈ A and z + 1 ∈ A,
where z is the unique integer with z ≤ q < z + 1. Since this mapping is one-to-one, |P(Z)| ≤ |F|.

9 Graphs and Orderings

Consider the following list which relates words with the same meaning to each other, say English words to their Spanish counterparts: (dog, el perro), (cat, el gato), (cat, la gata), (cow, la vaca), (boy, el niño), (girl, la niña), (blue, azul), (English, el Inglés), (Spain, España). Since the relation is not unique, one cannot represent the structure as a function: "cat" can be "el gato" (if male) or "la gata" (if female). The most common more general notion studied by mathematicians is that of directed graphs. It has a set of vertices, in this example the set of all words in either English or Spanish language. Furthermore, it has a set of pairs of vertices, the edges, in

this example the edges consist of one English and one Spanish word whose meaning is related to each other in the way that there is an object or concept which can be denoted by the first word in English and the second word in Spanish: there exists a male cat and thus (cat, el gato) is in the set of edges, furthermore there exists a female cat and thus (cat, la gata) is in the set of edges. These pairs are ordered, the English word is always the first one. For the reader's convenience, the formal introduction of graphs is repeated from Definition 3.5.

Definition 9.1. A (directed) graph is a pair (G, E) such that G is a set and E ⊆ G × G. The members of G are called vertices and the members of E are called edges.

Example 9.2. Every function f : X → Y can be represented as the graph (X ∪ Y, {(x, y) ∈ (X ∪ Y) × (X ∪ Y) | x ∈ X ∧ y ∈ Y ∧ y = f(x)}). One can also consider graphs with a class as their domain: (V, {(x, S(x)) | x ∈ V}) is a graph. Also (V, ∈) is a graph and (Vω, ∈) is a graph where in the latter case "∈" is restricted to the domain Vω.

Exercise 9.3. A graph (G, E) is called bipartite if there are two subsets X, Y of G such that X ∩ Y = ∅ and every pair (x, y) ∈ E is actually in X × Y ∪ Y × X. An English-Russian dictionary is a bipartite graph because one can take X to be the words written in the Latin alphabet and Y to be the words written in the Cyrillic alphabet. An English-Spanish dictionary is not bipartite, for example place names like "Los Angeles" appear in the same spelling in both languages. By the way, the mentioned name is of Spanish origin and has the English translation "the angels".

Which of the following graphs are bipartite? The set of vertices is N and the set En of edges is specified below; note that En ⊆ N × N.
1. (x, y) ∈ E1 ⇔ x = y.
2. (x, y) ∈ E2 ⇔ x < y.
3. (x, y) ∈ E3 ⇔ 12 < x + y < 18.
4. (x, y) ∈ E4 ⇔ x > 4 ∧ (y = x^2 ∨ y = x^4).
5. (x, y) ∈ E5 ⇔ ∃z > 0 (x = 2^z ∧ y = 3^z).
6. (x, y) ∈ E6 ⇔ ∃z > 0 (x ∈ {2^z, 3^z} ∧ y ∈ {5^z, 7^z}).
7. (x, y) ∈ E7 ⇔ y = S(x) ∧ x is even.
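For a finite graph, the definition of bipartite can be tested mechanically once candidate sets X and Y are proposed. The sketch below checks exactly the two conditions of the definition for a given pair X, Y; it does not search for a suitable pair. The function name `is_bipartition` and the small word lists (including the Russian word) are assumptions made only for this illustration.

```python
def is_bipartition(edges, X, Y):
    """True iff X and Y are disjoint and every edge lies in X×Y ∪ Y×X."""
    X, Y = set(X), set(Y)
    if X & Y:                       # X ∩ Y must be empty
        return False
    return all((x in X and y in Y) or (x in Y and y in X) for x, y in edges)


english = {"dog", "cat", "Los Angeles"}
spanish = {"el perro", "el gato", "la gata", "Los Angeles"}
edges_es = {("dog", "el perro"), ("cat", "el gato"), ("cat", "la gata"),
            ("Los Angeles", "Los Angeles")}

russian = {"собака"}                # hypothetical one-word Russian vocabulary
edges_ru = {("dog", "собака")}

spanish_split_ok = is_bipartition(edges_es, english, spanish)
russian_split_ok = is_bipartition(edges_ru, {"dog"}, russian)
```

The English/Spanish split fails because "Los Angeles" lies in both sets, which is the obstruction named in the text; the alphabet-based split for the Russian example passes.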

If one considers the graph (V, ⊂) instead of (V, ∈), one has additional properties not present at (V, ∈). The main additional property is transitivity. On the other hand, there are still incomparable elements of V, for example {{{0}}} and {0, 1}. This is captured by the definition of a partially ordered set.

Definition 9.4. A set G is partially ordered by a relation < iff this relation is antisymmetric and transitive. These two properties are defined as follows:
1. the relation < is antisymmetric iff there are no x, y ∈ G with x < y and y < x;
2. the relation < is transitive iff x < z for all x, y, z ∈ G with x < y and y < z.
If G is partially ordered by < then < is called a partial ordering on G and (G, <) is called a partially ordered set. Note that a transitive relation < is antisymmetric iff it is antireflexive, that is, iff there is no x ∈ G with x < x.

Convention 9.5. One usually writes a < b if the ordering is denoted by a symbol of the type < and (a, b) ∈ R if the ordering is denoted by a letter like R. If a < b, then one says that a is less than b or a is smaller than b. Furthermore, the notation a ≤ b stands for a < b ∨ a = b. If a ≤ b, then one says that a is less than or equal to b. Similarly A ⊆ B stands for A ⊂ B ∨ A = B. Furthermore, c > b and b < c mean the same.

Example 9.6. Assume that G = {a, b, c, d} and a < b, b < c, a < c and d < c. Then G is a partially ordered set where the elements a, d are incomparable. The relation a < c is needed since the ordering < would otherwise not be transitive. Graphically, the ordering looks like this:

      c
     / \
    b   d
   /
  a

Exercise 9.7. Let A = N − {0, 1} = {2, 3, 4, . . .} and let <div be given by x <div y ⇔ ∃z ∈ A (x · z = y). That is, x <div y iff x is a proper divisor of y, so 2 <div 8 but neither 2 <div 2 nor 2 <div 5. Prove that (A, <div) is a partially ordered set.

Example 9.8. The subset-relation ⊂ is a partial ordering of V. Furthermore, the relation R on V defined as (x, y) ∈ R ⇔ |x| < |y| is a partial ordering of V. The

same holds for R′ given by (x, y) ∈ R′ ⇔ |P(x)| ≤ |y|. These two partial orderings are different since ({a, b}, {a, b, c}) is in R but not in R′. Due to coding, the element-relation ∈ is an ordering of N where it coincides with the natural less-than relation. If |G| ≥ 2, then (G, ≠) is not a partially ordered set. Let a, b be distinct elements of G. Then a ≠ b and b ≠ a, but a = a, thus the inequality is not transitive.

Example 9.9. The following relation ⊏ is a partial ordering of the set N × N × N:
(x, y, z) ⊏ (x′, y′, z′) ⇔ x ≤ x′ ∧ y ≤ y′ ∧ z ≤ z′ ∧ x + y + z < x′ + y′ + z′.
Note that the triples (0, 0, 1) and (0, 1, 0) are incomparable with respect to ⊏. By using the term "partial ordering" this is explicitly permitted although it is not mandatory.

Proof. If (x, y, z) ⊏ (x′, y′, z′) then x + y + z < x′ + y′ + z′ and it cannot be that x′ + y′ + z′ < x + y + z. Thus ⊏ is antisymmetric. To see that ⊏ is transitive, consider any three triples (x, y, z), (x′, y′, z′), (x′′, y′′, z′′) such that (x, y, z) ⊏ (x′, y′, z′) and (x′, y′, z′) ⊏ (x′′, y′′, z′′). By the transitivity of ≤ and < one has that x ≤ x′′, y ≤ y′′, z ≤ z′′ and x + y + z < x′′ + y′′ + z′′. Thus (x, y, z) ⊏ (x′′, y′′, z′′) and ⊏ is transitive.

Exercise 9.10. Prove that the following relations are partial orderings on N^N:
• f ⊏1 g ⇔ ∃n ∀m > n (f(m) < g(m));
• f ⊏2 g ⇔ ∀n (f(n) ≤ g(n)) ∧ ∃m (f(m) < g(m));
• f ⊏3 g ⇔ ∀n (f(n) ≤ g(n)) ∧ ∃n (f(n) < g(n)) ∧ ∃n ∀m > n (f(m) = g(m));
• f ⊏4 g ⇔ f(0) < g(0).
Determine for every ordering a pair of incomparable elements f, g such that neither f ⊏m g nor g ⊏m f nor f = g. For which of these orderings is it possible to choose the f of this pair (f, g) of examples such that f(n) = 0 for all n?

Remark 9.11 (Preordering). A relation ⊑ on a set G is called preordering iff it is transitive; it can also be reflexive, but this is not required here although some authors require it in other books. The ordering < on G defined as x < y ⇔ x ⊑ y and not y ⊑ x is a partial ordering generated from ⊑. Note that the symbol ≤ derived from < can be more restrictive than ⊑: For example, the preordering ⊑ on V defined by x ⊑ y ⇔ |x| ≤ |y|

defines a partial ordering < on V such that x < y ⇔ |x| < |y| but the derived relation ≤ is then x ≤ y ⇔ (x = y ∨ |x| < |y|) and it would be that N − {0} ⊑ N is true but N − {0} ≤ N is false with respect to the just defined relations ⊑, <, ≤.

Proposition 9.12. Given a graph (G, E), one can define a preordering ≤E by x ≤E y iff there is a natural number n and a function f : S(n) → G with f(0) = x, f(n) = y and (f(m), f(S(m))) ∈ E for all m ∈ n. Furthermore, if (G, E) is cycle-free, that is, if there is no n ∈ N − {0} and no function f : S(n) → G with f(0) = f(n) and ∀m ∈ n ((f(m), f(S(m))) ∈ E), then the partial ordering <E generated from ≤E satisfies x ≤E y ⇔ x <E y ∨ x = y as desired.

Proof. The preordering ≤E can formally be defined as follows: For all x, y ∈ G, x ≤E y iff ∃n ∈ N ∃f ∈ G^(S(n)) (f(0) = x ∧ f(n) = y ∧ ∀m ∈ n ((f(m), f(S(m))) ∈ E)). Furthermore, x ≤E x as one could take n = 0 and consider the function f with domain S(0) and f(0) = x.

The transitivity is easy to see. Assume that f with domain S(n) witnesses x ≤E y and g with domain S(m) witnesses y ≤E z. Then define h : S(n + m) → G with h(k) = f(k) for k ∈ S(n) and h(n + k) = g(k) for k ∈ S(m). As f(n) = g(0) = y, this definition is not contradictory. Furthermore, for k ∈ {0, 1, . . . , n − 1} it holds that (h(k), h(S(k))) = (f(k), f(S(k))) ∈ E and for k ∈ {n, n + 1, . . . , n + m − 1} it holds that (h(k), h(S(k))) = (g(k − n), g(S(k − n))) ∈ E. As h(0) = x and h(n + m) = z, one has x ≤E z.

Assume that x ≠ y and x ≤E y. If (G, E) is cycle-free then it cannot be that y ≤E x by the transitivity shown before, thus x <E y in the way as <E is derived from the preordering ≤E. Thus one has that x ≤E y ⇔ x <E y ∨ x = y.

Remark 9.13. Note that in a cycle-free graph also (x, x) ∉ E for all x. This property gives then the additional property that x <E y iff there is an n ∈ N − {0} and a function f : S(n) → G with f(0) = x, f(n) = y and ∀m ∈ n ((f(m), f(S(m))) ∈ E), so that <E can be directly defined from (G, E). If a graph is not cycle-free but has some cycle of n different nodes x0, x1, . . . , x_(n−1) ∈ G with (x_m, x_(m+1)) ∈ E for all m < n − 1 and (x_(n−1), x0) ∈ E as well, then one has x_i ≤E x_j but not x_i <E x_j for all i, j ∈ n. Therefore the case of cycle-free graphs is the most desirable one. Well-founded graphs are cycle-free but not vice versa as the graph (Z, E) with

E = {(z, z + 1) | z ∈ Z} is cycle-free but not well-founded. The ordering <E is then the natural ordering on all integers. One can define <∈ on the whole universe V and obtains that TC(x) = {y ∈ V | y = x ∨ y <∈ x} and y <∈ x ⇔ y ∈ TC(x) − {x}. So these two notions stand in a very close correspondence.

Exercise 9.14. Let A = N − {0, 1} and <div given by x <div y ⇔ ∃z ∈ A (x · z = y) as in Exercise 9.7. Define a relation E on A × A by putting (x, y) into E iff there is a prime number z with x · z = y. So (2, 4) ∈ E, (2, 6) ∈ E, (2, 10) ∈ E but (2, 7) ∉ E, (2, 8) ∉ E, (2, 20) ∉ E. Show that (A, <E) and (A, <div) are identical partially ordered sets.

10 Linear Ordering

A linear ordering is a partial ordering with the additional property that any two different elements are comparable.

Definition 10.1. Let A be a set and R ⊆ A × A. One says that R is a linear ordering of A iff R satisfies the following properties for all a, b, c ∈ A:
antisymmetric: if (a, b) ∈ R then (b, a) ∉ R;
transitive: if (a, b) ∈ R and (b, c) ∈ R then (a, c) ∈ R;
comparable: either a = b or (a, b) ∈ R or (b, a) ∈ R.
The pair (A, R) is called a linearly ordered set iff R is a linear ordering of A.

Example 10.2. Recall that the natural ordering < on N coincides with the ∈-relation due to the way the natural numbers are coded into V, that is, n = {0, 1, . . . , n − 1} and m < n ⇔ m ∈ n for all m, n ∈ N. It is easy to see that (N, <) is a linearly ordered set, that is, that < is antisymmetric, transitive and comparable on N.

Example 10.3. The two orderings
(0, 0) <1 (0, 1) <1 (0, 2) <1 (0, 3) <1 . . . <1 (1, 3) <1 (1, 2) <1 (1, 1) <1 (1, 0) and
(0, 0) <2 (0, 1) <2 (0, 2) <2 (0, 3) <2 . . . <2 (1, 0) <2 (1, 1) <2 (1, 2) <2 (1, 3) <2 . . .
on the set {0, 1} × N are linear.
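On a finite set, the three conditions of Definition 10.1 can be checked by brute force. The sketch below does this for finite fragments of the two orderings of Example 10.3 (read here as: <1 lists the pairs (1, n) in descending order of n, <2 is lexicographic) and for the proper-divisor relation of Exercise 9.7, which is partial but not linear. The helper names `is_linear_ordering` and `relation` are assumptions for this illustration.

```python
from itertools import product


def is_linear_ordering(A, R):
    """Check antisymmetry, transitivity and comparability of R on the finite set A."""
    A = list(A)
    antisymmetric = all(not ((a, b) in R and (b, a) in R) for a, b in product(A, A))
    transitive = all((a, c) in R
                     for a, b, c in product(A, A, A)
                     if (a, b) in R and (b, c) in R)
    comparable = all(a == b or (a, b) in R or (b, a) in R for a, b in product(A, A))
    return antisymmetric and transitive and comparable


def relation(A, less):
    """The set of all pairs (a, b) from A × A with less(a, b)."""
    return {(a, b) for a in A for b in A if less(a, b)}


pairs = [(i, n) for i in (0, 1) for n in range(4)]   # finite part of {0,1} × N


def lt1(p, q):
    # Copy 0 ascending, then copy 1 descending.
    return (p[0] < q[0]
            or (p[0] == q[0] == 0 and p[1] < q[1])
            or (p[0] == q[0] == 1 and p[1] > q[1]))


def lt2(p, q):
    return p < q                     # Python compares tuples lexicographically


numbers = range(2, 13)


def ltdiv(x, y):
    return x != y and y % x == 0     # x is a proper divisor of y


results = (is_linear_ordering(pairs, relation(pairs, lt1)),
           is_linear_ordering(pairs, relation(pairs, lt2)),
           is_linear_ordering(numbers, relation(numbers, ltdiv)))
```

The third check fails only on comparability, for example 2 and 3 are incomparable under <div.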

1} × N. g ∈ A∗ with domain m. If k = o then f <KB h and transitivity holds. n. k = m and k < n. <KB ) satisﬁes the properties necessary to be a linearly ordered set. i. k ∈ n ∩ m and g(k) < f (k). Then f <KB g. The fact that ` ` ` f <KB g gives that k < m. 3. Then there is a minimal number k such that either k ∈ n or k ∈ m or f (k) = g(k). <) be a linearly ordered set. 2. thus it cannot be that f <KB g. k ∈ n ∩ m and f (k) < g(k). It is shown that (A. Then g <KB f . Then f <KB g. n. Then g <KB f . Example 10.Example 10.5. Since f | k = g| k but f = g it cannot happen / / ` ` that k ∈ n ∪ m. Note that f | k = g| k = h| k. So one has exactly one of the following four / cases. Assume that f = g and k ∈ m. k = n = m. If k < o then one has f (k) ≤ g(k) ≤ h(k) and one of these must be proper since k is the parameter for either f <KB g or g <KB h. j ∈ N. Antireﬂexiveness. n) <lex (i. Deﬁne <lex on A by (m. For every f. respectively. Note that <2 from the previous example is the restriction of <lex to the domain {0. 4. n. Comparability. ` ` This is an ordering which is called the Kleene-Brouwer-ordering. Recall that A∗ contains every A-valued function whose domain can be represented by a natural number. 1. Then (N × N. Let (A. Again transitivity holds. deﬁne f <KB g ⇔ ∃k ∈ m ∩ S(n) ((f | k = g| k) ∧ (k = n ∨ (k < n ∧ f (k) < g(k)))). Assume that f = g. k = n and k < m. j) ⇐⇒ (m < i) ∨ (m = i ∧ n < j) for all m. 44 . that is. Assume that f <KB g and g <KB h. This ordering is called the lexicographic ordering of N × N. Then k < n and f (k) = g(k). Thus f (k) < h(k) and f <KB h. Proof. Similarly g <KB h gives that k < n. <lex ) is a linearly ordered set.4. o be the corresponding domains and k be the minimum of the parameters of the same name in order to establish that f <KB g and g <KB h. Transitivity. Let m. Let A = N × N.

So it follows from this case distinction that <KB is indeed a linear ordering.

Exercise 10.6. Let (A, <) be a linearly ordered set and B = A^N. Define
f <lex g ⇔ ∃k ∈ N (f↾k = g↾k ∧ f(k) < g(k)).
Furthermore, let C = A∗. The lexicographic ordering on A∗ differs from the Kleene-Brouwer ordering in the sense that it is reverted iff f, g coincide on the intersection of their domains m, n. That is,
f <lex g ⇔ ∃k ∈ S(m) ∩ n ((f↾k = g↾k) ∧ (k = m ∨ (k < m ∧ f(k) < g(k)))).
Show that (B, <lex) and (C, <lex) are linearly ordered sets. Assuming that A = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} with the usual ordering, put the following elements of C into lexicographic order: 120, 76543210, 00, 121, 88, 500, 900, 007, 512, 15.

Example 10.7. Let R be the set of reals and < be the natural ordering of the reals. Then (R, <) is a linearly ordered set. This ordering can be inherited to the subsets Z of integers and Q of rationals, thus (Z, <) and (Q, <) are linearly ordered sets as well.

Definition 10.8. Let (A, <) be a linearly ordered set. Let B be a nonempty subset of A, a ∈ A and b ∈ B. Recall that a ≤ b stands for a = b ∨ a < b.
1. a is an upper bound of B iff c ≤ a for all c ∈ B.
2. a is a lower bound of B iff a ≤ c for all c ∈ B.
3. a is the supremum of B iff a is the least upper bound of B.
4. a is the infimum of B iff a is the greatest lower bound of B.
5. b is the greatest element of B (with respect to <) iff b is an upper bound of B.
6. b is the least element of B (with respect to <) iff b is a lower bound for B.
7. B is bounded from above in (A, <) iff B has an upper bound in A.
8. B is bounded from below in (A, <) iff B has a lower bound in A.
9. B is bounded iff B is bounded both from above and from below.
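The two orderings on A∗, the Kleene-Brouwer ordering of Example 10.5 and the lexicographic ordering of Exercise 10.6, differ only in how a sequence compares with its proper extensions. The sketch below spells both out for finite sequences represented as Python tuples over the integers; the function names `lt_kb` and `lt_lex` are assumptions for this illustration. Python's built-in tuple comparison is used only as a cross-check for `lt_lex`.

```python
from itertools import product


def lt_kb(f, g):
    """f <KB g: some k in m ∩ S(n) with f|k = g|k and (k = n or f(k) < g(k))."""
    m, n = len(f), len(g)
    return any(f[:k] == g[:k] and (k == n or (k < n and f[k] < g[k]))
               for k in range(min(m, n + 1)))     # k < m and k <= n


def lt_lex(f, g):
    """f <lex g: some k in S(m) ∩ n with f|k = g|k and (k = m or f(k) < g(k))."""
    m, n = len(f), len(g)
    return any(f[:k] == g[:k] and (k == m or (k < m and f[k] < g[k]))
               for k in range(min(m + 1, n)))     # k <= m and k < n


# All sequences over {0, 1, 2} of length at most 3, used as a small test universe.
universe = [t for length in range(4) for t in product(range(3), repeat=length)]
```

For example, (1, 2) <lex (1, 2, 0) because a proper prefix comes first, while (1, 2, 0) <KB (1, 2) because in the Kleene-Brouwer ordering a proper extension comes first; where the sequences first differ at a common position, both orderings agree.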

Example 10.9. Consider the following subsets of (R, <): X = {x ∈ R | −2 ≤ x < 5}, Y = {y ∈ R | y ≥ 5}. 0 ∈ X and thus X is not empty. The set X is bounded in (R, <), for example −1024 is a lower and 1024 is an upper bound. Therefore X has an infimum and a supremum. The infimum of X is −2 and the supremum of X is 5. Since −2 ∈ X, X has a least element, namely −2. But 5 ∉ X. Thus X does not have a greatest element. The set Y has the infimum 5 which is also a lower bound. But Y has no upper bound in (R, <). Thus Y is unbounded and has no supremum.

Exercise 10.10. Determine which of the following subsets of the real numbers R have a lower and upper bound. If so, determine the infimum and supremum and check whether these are even the least and greatest element of these sets.
1. A = {a ∈ R | ∃b ∈ R (a^2 + b^2 = 1)}.
2. B = {b ∈ R | b^3 − 4 · b < 0}.
3. C = {c ∈ R | sin(c) > 0}.
4. D = {d ∈ R | d^2 < π^3}.
5. E = {e ∈ R | sin((π/2) · e) = e/101}.

Definition 10.11. For partially ordered sets (A, <1) and (B, <2), a function f : A → B is order preserving iff the implication a <1 b ⇒ f(a) <2 f(b) is true for all a, b ∈ A. Two partially ordered sets (A, <1) and (B, <2) are isomorphic, denoted by (A, <1) ≅ (B, <2), iff there is a bijection f : A → B such that for all a, b ∈ A, a <1 b if and only if f(a) <2 f(b). Such functions are called isomorphisms.

Assume that (A, <) is a linearly ordered set and f : A → B is order-preserving. Then f is an isomorphism iff f is surjective. There is an order-preserving mapping from Z into Q but no isomorphism. The next result is of similar nature.

Example 10.12. There is no order-preserving function from Z into N.

Proof. Assume by way of contradiction that there is an order-preserving function f : Z → N. Then f(0) = n for some n. It follows that f(−1) < n and thus f(−1) ≤ n − 1. Furthermore, f(−2) < n − 1 and thus f(−2) ≤ n − 2. By induction one can show

that f(−m) ≤ n − m and f(−n) ≤ 0. But then f(−n − 1) < 0, which is impossible. So f cannot exist.

Exercise 10.13. Consider the ordering ⊏ given by
(m, n) ⊏ (i, j) ⇔ (m < i) ∨ (m = i ∧ m is even ∧ n < j) ∨ (m = i ∧ m is odd ∧ n > j)
on A = {0, 1, 2, 3, 4, 5} × N. Construct an order-preserving mapping from (Z, <) into (A, ⊏) where < is the natural ordering of Z. For the set (Z, <) there are nontrivial isomorphisms onto itself, that is, isomorphisms different from the identity. For example, z → z + 8. Does (A, ⊏) also have nontrivial isomorphisms onto itself? If so, is there any element which is always mapped to itself?

Proposition 10.14. If (A, <) is a finite linearly ordered set and A ≠ ∅ then A has a greatest and a least element with respect to <.

Proof. This is proven by induction. The proposition holds for orderings having one element since this unique element is the least and greatest element with respect to the given ordering at the same time. Assume now that n ≥ 1 and the proposition holds for all nonempty finite linearly ordered sets of cardinality up to n. Let (A, <) be a linearly ordered set of cardinality S(n). Let a ∈ A and B = A − {a}. Then (B, <) is a linearly ordered set of cardinality n. By induction hypothesis, B has a least element b1 and a greatest element b2. There are three cases:
1. a < b1. Then a is the least and b2 the greatest element of A.
2. b1 < a < b2. Then b1 is the least and b2 the greatest element of A.
3. b2 < a. Then b1 is the least and a the greatest element of A.
These three cases cover every possibility since b1 ≤ b2. Thus it follows from case distinction that A has a least and a greatest element. This completes the inductive step.

Theorem 10.15. If (A, <) is a finite linearly ordered set and n = |A|, then (A, <) ≅ (n, ∈).

Proof. The theorem is proven by induction on n. If A = ∅ then (A, <) ≅ (0, ∈) since both are empty sets and the ordering of an empty set is irrelevant. Assume that the theorem holds for all linearly ordered sets of size n. Let (A, <) be

Example 10. .18. The search terminates as (A. The set D = {m · 2−n | n ∈ N ∧ m ∈ {0. a1 . So. <). After the values f (m·2−n ) have been deﬁned for all m ∈ {0. 2n }} then it is also order-preserving on the extended domain {m·2−S(n) | m ∈ {0. ⊏) is dense. is an one-to-one function n → an from N onto A. 2n }}]). . 3}. c ∈ A such that b < a < c. Deﬁnition 10. <) iﬀ for every a.a linearly ordered set of size S(n). <) has no end points iﬀ for all a ∈ A there are b. . A linearly ordered set (A. <) is not dense. q ∈ D. . <) and (R. . <) is dense iﬀ A has at least two elements and for any pair a. 2. (Z. 2. As A is countable. Thus for any p. a2 . . <) ∼ (n. The theorem says that. ∈) by induction hypothesis. . (Q. . a1 is isomorphic to D. 1. This is done by showing that ∀n ∈ N (an ∈ f [{m · 2−n | m ∈ {0. given by Proposition 10. b ∈ A with a < b there is a c ∈ B such that a < c < b. . Now one deﬁnes a function f : D → A by recursion as follows: f (0) = a0 and f (1) = a1 . one deﬁnes f ((2m+1)·2−S(n) ) to be aℓ for the ﬁrst ℓ with f (m·2−n ) ⊏ aℓ ⊏ f ((m+1)·2−n ).16. <) has end points 0 and 3. the ﬁnite linearly ordered sets are sets of the form n = {m ∈ N | m < n} with the natural ordering. This is true on the domain {0. Deﬁne f : A → S(n) by f = g ∪ {(a. Proof. 1. = Let g : B → n be the isomorphism. . the linear ordering of the set of all rational numbers is really a universal linear ordering of countable sets. . Theorem 10. b ∈ A with a < b there is c ∈ A such that a < c < b. . . 1. Then f is an isomorphism since a is the greatest element of A and n is the greatest element of S(n) with respect to ∈ and g is an isomorphism. 1} by a0 ⊏ a1 . 1. Every countable dense linear order (A. 2n }} of all dyadic numbers between 0 and 1 is dense and has end points 0 and 1.14. A = {a0 . . As f is order-preserving.} for some enumeration a0 . One can easily show by induction that f is order-preserving. 48 . . <) are dense linearly ordered sets without end points. 1. 
Let a ∈ A be the greatest element of A. If f is order-preserving on the domain {m · 2−n | m ∈ {0. ⊏) with end points a0 . p < q ⇒ f (p) ⊏ f (q). Now one shows that f is onto. . . . . . A linearly ordered set (A. formally. 2n }. 1. Let B = A − {a}. This ﬁnishes the proof of the inductive step and the whole theorem. . D is a subset of Q. which. . 2S(n) }} as the new values are inserted such that the order is preserved.17. f is also one-to-one. a1 . Then (B. n)}. a2 . . . Q is also dense in (R. . A subset B ⊆ A is dense in a linearly ordered set (A. ({0. The next result is that every countable linear ordering is isomorphic to a subset of Q with the standard ordering. . up to isomorphism.

This is true for a_0 as a_0 = f(0). Assume now that it is true for a_0, a_1, ..., a_n. Let m be the smallest element of {0, 1, ..., 2^n} with a_S(n) ⊑ f(m · 2^(−n)). Note that m > 0 as f(0) = a_0 ⊏ a_S(n). If f(m · 2^(−n)) = a_S(n) then there is nothing to prove. Otherwise the first index ℓ with f((m − 1) · 2^(−n)) ⊏ a_ℓ ⊏ f(m · 2^(−n)) is equal to S(n) as a_S(n) is between these two values but a_0, a_1, ..., a_n are by induction hypothesis all of the form f(q) for some q with either q ≤ (m − 1) · 2^(−n) or q ≥ m · 2^(−n). Thus a_S(n) is in the set f[{m · 2^(−S(n)) | m ∈ {0, 1, ..., 2^S(n)}}]. It follows that A ⊆ f[D] and f is onto.

Corollary 10.19. Let (A, ⊏) be a countable dense linearly ordered set. Then (A, ⊏) is isomorphic to exactly one of the following four sets: (D, <), (D − {0}, <), (D − {1}, <), (D − {0, 1}, <). Two countable and dense linearly ordered sets are order-isomorphic iff either both sets have no end points or both sets have a minimum but no maximum or both sets have a maximum but no minimum or both sets have both end points. Furthermore, if (A, ⊏) is an at most countable linearly ordered set then there is an order-preserving mapping from A into (D, <).

Proof. It is shown that (A, ⊏) is isomorphic to one of the four sets. If A has end points a_0, a_1, then this follows from Theorem 10.18. If A has no end points then one can modify A by considering new elements −∞, +∞ ∉ A with −∞ ⊏ a ⊏ +∞ for all a ∈ A. Then (A ∪ {−∞, +∞}, ⊏) is isomorphic to (D, <) and thus (A, ⊏) is isomorphic to (D − {0, 1}, <). Similarly one handles the case that A has only one of the end points, that is, either a minimum or a maximum but not both.

The four sets (D, <), (D − {0}, <), (D − {1}, <), (D − {0, 1}, <) are not isomorphic to each other. For example, if B is either D or D − {1} and C is either D − {0} or D − {0, 1} and f : B → C is order-preserving then there is an element y ∈ C with y < f(0) as C has no minimum. Thus f[B] ⊂ C and f is not an order-isomorphism. Furthermore, if B is either D or D − {0} and C is either D − {1} or D − {0, 1} and f : B → C is order-preserving then there is an element y ∈ C with y > f(1) as C has no maximum. Again f[B] ⊂ C and f is not an order-isomorphism. This shows that none of these four sets are order-isomorphic to each other.

If (A, ⊏) and (A′, ⊏′) are both isomorphic to the same set (B, <), then (A, ⊏) is isomorphic to (A′, ⊏′) as isomorphisms can be inverted and concatenated.

For the last statement, assume that A is a countable linearly ordered set. Then (A × Q, <lex) is a dense linearly ordered set which is order-isomorphic via some function g to (D − {0, 1}, <). Now define f(a) = g((a, 0)) for all a ∈ A and one obtains an order-preserving function f from A into D.

The last result of this section is the characterization of the real line as a linearly ordered set.
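The recursive definition of f in the proof of the theorem on dyadic numbers can be tried out on a concrete order. The following Python sketch is only an illustration under stated assumptions: it takes A to be the rationals in [0, 1] with end points a_0 = 0 and a_1 = 1, and the enumeration by increasing denominator as well as the names enumerate_unit_rationals and build_f are choices made here, not part of the notes.

```python
from fractions import Fraction
from itertools import count
from math import gcd


def enumerate_unit_rationals():
    """Yield every rational in [0, 1] exactly once, starting with a_0 = 0, a_1 = 1."""
    yield Fraction(0)
    yield Fraction(1)
    for q in count(2):
        for p in range(1, q):
            if gcd(p, q) == 1:
                yield Fraction(p, q)


def build_f(depth):
    """Return f restricted to the dyadic numbers m * 2**(-depth), as a dict."""
    source = enumerate_unit_rationals()
    seen = []  # the prefix a_0, a_1, ... of the enumeration produced so far

    def a(index):
        while len(seen) <= index:
            seen.append(next(source))
        return seen[index]

    f = {Fraction(0): a(0), Fraction(1): a(1)}
    for n in range(depth):
        step = Fraction(1, 2 ** n)
        for m in range(2 ** n):
            low, high = f[m * step], f[(m + 1) * step]
            # Search for the first index l with low < a_l < high;
            # the search terminates because the order is dense.
            index = 0
            while not (low < a(index) < high):
                index += 1
            f[(2 * m + 1) * step / 2] = a(index)
    return f


f = build_f(3)
for p in sorted(f):
    print(p, "->", f[p])
```

Sorting the keys and reading off the values shows that the new values are always inserted between their two neighbours, which is the order-preservation step of the proof.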

Definition 10.20. A linearly ordered set (A, <) is complete iff every nonempty subset of A bounded from above has a supremum in A.

Theorem 10.21. The real line (R, <) is the unique (up to isomorphism) complete linearly ordered set without end points that has a countable subset dense in it.

Proof. Assume that the ordered set (A, ⊏) is a complete linearly ordered set without end points and has a countable subset B which is dense in it. The set B has no end points and thus there is an order-preserving bijection f : Q → B. This function f is extended to R by defining f(r) = sup⊏{f(q) | q ∈ Q ∧ q < r} for all r ∈ R − Q. If r, r′ ∈ R are distinct then one is strictly smaller than the other, say r < r′. Since Q is a dense subset there are two rationals q, q′ in between: r < q < q′ < r′. It follows f(r) = sup⊏{f(q′′) | q′′ ∈ Q ∧ q′′ < r} ⊑ f(q) ⊏ f(q′) ⊑ sup⊏{f(q′′) | q′′ ∈ Q ∧ q′′ < r′} = f(r′) and thus f(r) ⊏ f(r′). So f is order-preserving and one-to-one.

Assume by way of contradiction that a ∈ A − f[R]. Since B has no end points, there are members q, q′ ∈ Q with f(q) ⊏ a ⊏ f(q′). Now let b = sup⊏{f(q′′) | q′′ ∈ Q ∧ f(q′′) ⊏ a}. By choice of a, b ⊏ a. Since B is dense in A there is a c ∈ B in between, that is, b ⊏ c ⊏ a and c = f(q′′′) for some q′′′ ∈ Q. But this contradicts the definition of b which imposes either f(q′′′) ⊑ b or a ⊏ f(q′′′). Thus a cannot exist and A = f[R].

Exercise 10.22. Assume that (A, <) is linearly ordered, has no end points, is dense and satisfies that every nonempty subset B ⊂ A which is bounded from below has an infimum. Show that (A, <) is a complete ordered set.

11 Well-Orderings

Linear orderings have a higher quality than partial orderings since every two different elements are comparable. Well-orderings are a further improvement since they generalize the property that every finite linearly ordered set has a least element to infinite subsets of the well-ordered set.

Definition 11.1. A linear ordering < of a set A is a well-ordering of A iff every nonempty subset B ⊆ A has a least element with respect to <. In case that < is a well-ordering of A, (A, <) is called a well-ordered set. A set A is well-orderable iff there exists a well-ordering of A.

Example 11.2. Every finite linearly ordered set is a well-ordered set. The standard linear ordering of N is a well-ordering of N.

Proof. Since all finite linearly ordered sets are order-isomorphic to subsets of (N, <), it is sufficient to prove the second statement. Given a nonempty B ⊆ N, there is by the Axiom of Foundation an element m ∈ B such that m ∩ B = ∅. If n ∈ B − {m} then n ∉ m and thus m ∈ n by the properties of natural numbers. Thus m < n and therefore m is the least element of B with respect to <.

Example 11.3. Recall the two linear orderings <1 and <2 on the set {0, 1} × N from Section 10:

(0, 0) <1 (0, 1) <1 (0, 2) <1 (0, 3) <1 ... <1 (1, 3) <1 (1, 2) <1 (1, 1) <1 (1, 0) and
(0, 0) <2 (0, 1) <2 (0, 2) <2 (0, 3) <2 ... <2 (1, 0) <2 (1, 1) <2 (1, 2) <2 (1, 3) <2 ...

The first ordering is not a well-ordering since the subset {1} × N has no least element with respect to <1. The second ordering is a well-ordering. Notice that the above considered well-ordered sets ({0}, <), ({0, 1}, <), ({0, 1, 2}, <), (N, <) and ({0, 1} × N, <2) are mutually non-isomorphic.

Example 11.4. The lexicographic ordering of N × N is a well-ordering of N × N.

Proof. Recall that (m, n) < (i, j) iff (m < i) or (m = i and n < j). The lexicographic ordering is a linear ordering. So it is sufficient to show that it is actually a well-ordering of N × N, that is, every nonempty subset of N × N has a minimal element. Let A be a nonempty subset of N × N. For every m, let A_m = A ∩ {(m, n) : n ∈ N}. There is a least m such that A_m is not empty. Let n be the least number in N with (m, n) ∈ A_m. Consider any (i, j) ∈ A − {(m, n)}. If i = m then j > n by the choice of n and (m, n) < (i, j). If i ≠ m then i > m by the choice of A_m and again (m, n) < (i, j). Thus (m, n) is the minimum of A with respect to the lexicographic ordering.

Example 11.5. A further well-ordering of N × N is defined as follows: (m, n) <cw (i, j) ⇔ max{m, n} < max{i, j} ∨ (max{m, n} = max{i, j} ∧ m < i) ∨ (max{m, n} = max{i, j} ∧ m = i ∧ n < j). It is established that <cw is a well-ordering by showing that the following four conditions hold.

Antireflexiveness. (m, n) <cw (i, j) requires that either max{m, n} ≠ max{i, j} or m ≠ i or n ≠ j and none of these conditions holds if (m, n) = (i, j).

Transitivity. Assume the following two conditions (∗): (m, n) <cw (i, j) and (i, j) <cw (h, k). It is shown that (∗) implies (m, n) <cw (h, k). If max{m, n} < max{h, k} then (m, n) <cw (h, k). Otherwise max{m, n} = max{i, j} = max{h, k} and each relation in (∗) holds by the second or third case of the disjunction in the definition of <cw. If m < h then again (m, n) <cw (h, k). Otherwise m = i = h and both relations in (∗) hold by the third case of the definition. Thus, m = h and n < k by n < j < k. So again <cw holds and <cw is transitive.

Comparability. Assume that neither (m, n) <cw (i, j) nor (i, j) <cw (m, n). Then max{m, n} = max{i, j}, m = i and n = j, that is, (m, n) = (i, j). Thus any two different members of N × N are comparable.

Well-orderedness. Let A ⊆ N × N be nonempty. Let A_k = {(m, n) ∈ A | max{m, n} = k}. Fix k as the least number such that A_k is nonempty. This set A_k is a finite set and has a least element (m, n) with respect to <cw since <cw is a linear order on N × N and also on its subset A_k. Now let (i, j) ∈ A − {(m, n)}. If (i, j) ∈ A_k then (m, n) <cw (i, j) by the choice of (m, n) from A_k. If (i, j) ∉ A_k then max{i, j} > k = max{m, n} and (m, n) <cw (i, j). Thus A has a minimum and (N × N, <cw) is a well-ordered set.

Remark 11.6. Notice that the ordered set (N × N, <cw) is isomorphic to (N, <). Indeed the function f given as f(m, n) = (2 · max{m, n} + 2)^3 + (max{m, n} + m + 1)^2 + n is an order-preserving one-to-one mapping into N. Thus, (N × N, <cw) is isomorphic to an infinite subset of (N, <) which is then isomorphic to (N, <).

Example 11.7. The following subsets of Q are well-ordered with respect to the natural ordering of Q:

{−1/(n+1) | n ∈ N},
{−1/(m+1) − 1/(n+1) | m, n ∈ N},
{−1/(k+1) − 1/(m+1) − 1/(n+1) | k, m, n ∈ N}.

The orderings are isomorphic to that of the lexicographic ordering on N, N × N, N × N × N, respectively; the lexicographic ordering of N is of course identical with the natural one.

Exercise 11.8. The set {−1/(m_1+1) − 1/(m_2+1) − ... − 1/(m_n+1) | n, m_1, m_2, ..., m_n ∈ N} is not a well-ordered subset with respect to the natural ordering of Q: show that the set is dense and is not bounded from below.

Example 11.9. Both Z and Q are well-orderable, but the ordering differs from the standard one. In fact every countable set X is well-orderable.

Proof. Since |X| ≤ |N|, there is a one-to-one function f : X → N. Now one defines on X a well-ordering ⊏ by x ⊏ y ⇔ f(x) < f(y) where x, y ∈ X. This order differs from the natural order on Z and Q: these sets contain the chain −1, −2, −3, ... which is descending with respect to their natural order and which cannot be so with respect to any well-ordering of them.

Definition 11.10. For a linearly ordered set (L, <), an initial segment I of L is a proper subset of L such that x ∈ I whenever x ∈ L and there is a y ∈ I with x < y. That is, I is an initial segment iff I is a downward closed proper subset of L: I ⊂ L ∧ ∀x, y ∈ L (x < y ∧ y ∈ I ⇒ x ∈ I). For a ∈ L, L[a] = {x ∈ L | x < a}. Call L[a] the initial segment of L given by a.

Notice that {r ∈ Q | r < √2} is an initial segment of Q but it is not Q[a] for any a ∈ Q.

Proposition 11.11. If (W, <) is well-ordered and I is an initial segment of W, then there is an a ∈ W such that I = W[a].

Proof. Let A = W − I. Then A is not empty and every element of A is an upper bound of I since I is an initial segment. Let a be the least element of A with respect to <. Then for x ∈ W, x < a if and only if x ∈ I. That is, I = W[a].

The set of real numbers is isomorphic to every initial segment and also has a large quantity of isomorphisms onto itself. Well-ordered sets are rigid, that is, they satisfy exactly the opposite of these properties.
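The well-ordering <cw of N × N from Example 11.5 and the mapping f from Remark 11.6 can be checked on a finite part of N × N. The Python sketch below is a hedged illustration: the function names are chosen here, the exponents 3 and 2 in f are read off the formula as reconstructed from the notes, and a finite grid can only spot-check (not prove) that f is order-preserving.

```python
from itertools import product


def cw_key(pair):
    """Sort key for <cw: compare max{m, n} first, then m, then n."""
    m, n = pair
    return (max(m, n), m, n)


def cw_less(p, q):
    """Return True iff p <cw q."""
    return cw_key(p) < cw_key(q)


def f(m, n):
    """Candidate order-preserving map from (N x N, <cw) into (N, <)."""
    k = max(m, n)
    return (2 * k + 2) ** 3 + (k + m + 1) ** 2 + n


def cw_minimum(pairs):
    """Least element of a nonempty finite set of pairs with respect to <cw."""
    return min(pairs, key=cw_key)


# All pairs with max{m, n} < 6, listed in <cw-order.
grid = sorted(product(range(6), repeat=2), key=cw_key)
values = [f(m, n) for m, n in grid]
print(grid[:9])
print(all(a < b for a, b in zip(values, values[1:])))
```

The tuple comparison in cw_key mirrors the three cases of the disjunction defining <cw, so cw_minimum picks the same element as the well-orderedness argument: first the least value of max{m, n}, then the least element inside that finite layer.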

Theorem 11.12 (Rigidity). Let (A, <) be a well-ordered set and f be an order-preserving function from (A, <) to itself. Then a ≤ f(a) for all a ∈ A. In particular, the range of f cannot be an initial segment of A and (A, <) is not isomorphic to any initial segment. Furthermore, if f is an isomorphism from A to itself, then f is the identity.

Proof. If f(a) < a then f(f(a)) < f(a) since f is order-preserving. Thus there is no least element a with f(a) < a. Since (A, <) is well-ordered, there is even no element a ∈ A with f(a) < a. Given an initial segment of A, it is of the form A[a] for some a ∈ A. Since f(a) ≥ a, f[A] ⊈ A[a] and the initial segment is not the range of f. Since the choice of f was arbitrary, there is no isomorphism from A to any initial segment.

Assume now that f is not the identity. Then there is a least element a ∈ A with f(a) ≠ a. As seen above, f(a) > a. Thus, for all b, f(b) ≠ a: If b < a then f(b) = b ≠ a by the choice of a; if b ≥ a then f(b) ≥ f(a) > a by the fact that f is order-preserving. So f is not onto. Thus the identity is the only isomorphism of (A, <) to itself.

Theorem 11.13 (Comparability Theorem). Given two well-ordered sets, they are either isomorphic or exactly one of them is isomorphic to an initial segment of the other.

Proof. Given (A1, <1) and (A2, <2), let B = P(A1 × A2). Call F ∈ B consistent iff the following conditions hold:
1. if (a1, a2) ∈ F and b1 <1 a1 then there is a b2 <2 a2 such that (b1, b2) ∈ F;
2. if (a1, a2) ∈ F and b2 <2 a2 then there is a b1 <1 a1 such that (b1, b2) ∈ F;
3. if (a1, a2), (b1, b2) ∈ F then either a1 <1 b1 ∧ a2 <2 b2 or a1 = b1 ∧ a2 = b2 or b1 <1 a1 ∧ b2 <2 a2.

The elements of F are well-ordered by the ordering inherited from (A1, <1) or (A2, <2); both inherit the same ordering. Now the following facts hold for all consistent F, G:

1. If F ⊈ G then G ⊂ F. There is a least pair (a1, a2) ∈ F − G. Let H be the set {(c1, c2) ∈ F | c1 <1 a1} of the pairs in F below (a1, a2). Clearly H ⊆ F ∩ G. Assume now that H ⊂ G. Then there is a least element (b1, b2) ∈ G − H. Since F, G are consistent, there is for every c1 <1 a1 a c2 with (c1, c2) ∈ H and similarly for every c2 <2 a2 a c1 with (c1, c2) ∈ H. By consistency b1 is the least element in A1 different from these c1 and b2 the least element of A2 different from these c2. Thus, b1 = a1 and b2 = a2 in contradiction to the choice of (a1, a2). Thus (b1, b2) does not exist and G = H ⊂ F.

2. The consistent sets are linearly ordered by inclusion and their union is again consistent.

3. There is a maximal consistent set. Every consistent set is an element of the power set P(A1 × A2) and the property consistent is first order definable from P(A1 × A2), A1 and A2 as shown above. So there is a set C of consistent sets. Thus, F = ⋃C is a consistent set which is maximal.

The maximal consistent set F is a partial one-to-one function with either domain A1 or range A2. The property of being a bijection from the domain to the range comes from the definition; similarly the domain is a subset of A1 and the range a subset of A2. Assume now by way of contradiction that both subsets would be proper. Then there is a least a1 ∈ A1 which is outside the domain of F and a least a2 ∈ A2 which is outside the range of F. This would give that F ∪ {(a1, a2)} is also consistent in contradiction to the maximality of F. Thus only one inclusion can be proper.

If the domain of F is A1 and the range of F is A2 then (A1, <1) and (A2, <2) are isomorphic. If the domain of F is a proper subset of A1 then F^(−1) is an isomorphism from (A2, <2) onto an initial segment of (A1, <1). If the range of F is a proper subset of A2 then F is an isomorphism from (A1, <1) onto an initial segment of (A2, <2).

Exercise 11.14. Define a function f : {0, 1, ..., 9}* → N which is order-preserving with respect to the length-lexicographic ordering <ll: v <ll w ⇔ f(v) < f(w). Recall 0 <ll 1 <ll ... <ll 9 <ll 00 <ll 01 <ll ... <ll 99 <ll 000 <ll ... and v <ll w if either v is shorter than w or v, w have the same length and v <lex w.

12 Ordinals

Ordinals are a generalization of the natural numbers. While a natural number (viewed as the set which represents it) is order-isomorphic to finite well-ordered sets, ordinals are a generalization which is just taken to be order-isomorphic to any well-ordered sets. Ordinals are now well-ordered and represented by transitive sets. Recall from Definition 4.4 that a set A is transitive iff ∀a ∈ A ∀b ∈ a (b ∈ A).

Definition 12.1. A set is an ordinal (or ordinal number) if it is transitive and well-ordered by the ordering ∈ (restricted to its members).

Convention 12.2. Ordinals are normally written by lower case Greek letters.

Example 12.3. Every natural number is an ordinal. The sets N and N ∪ {N} are ordinals. The set {2, 3, 4, 5, 6, 7, 8} is well-ordered by ∈ but not transitive. The set {∅, {∅}, {∅, {∅}}, {{∅}}} is transitive but not linearly ordered by ∈.

Definition 12.4. ω is the ordinal represented by N. That is, ω is the first ordinal after 0 which is not the successor of any other ordinal. The ordinals strictly below ω are called finite and those beyond ω are called transfinite.

Remark 12.5. There are in principle two options to represent the natural numbers. Both represent 0 by ∅. Having the codes for 0, 1, ..., n, the first one would represent S(n) as {Code(n)} while the second one would represent S(n) as {Code(0), Code(1), ..., Code(n)}. Examples for the two representations are:

Number   First Representation   Second Representation
0        ∅                      ∅
1        {∅}                    {∅}
2        {{∅}}                  {∅, {∅}}
3        {{{∅}}}                {∅, {∅}, {∅, {∅}}}
4        {{{{∅}}}}              {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}
...      ...                    ...
ω        (none)                 {Code(α) | α ∈ N}
ω+1      (none)                 {Code(α) | α ∈ N ∨ α = ω}

So the second representation was taken in set theory since it codes natural and ordinal numbers in a uniform way and satisfies that α < β ⇔ α ∈ β ⇔ α ⊂ β. Furthermore, the second notation also reflects the intuition from counting that n is a set of n objects, namely the representatives of the n smaller numbers. The price paid is that it takes much space to write down even small numbers, see Exercise 8.11. The advantage of the second approach is that it also permits to represent transfinite ordinals (as already indicated above), which is impossible in the first approach. Thus the second approach was taken in Definition 4.3.

Theorem 12.6. If A is transitive and (A, ∈) is linearly ordered then A is an ordinal.

Proof. Let B ⊆ A and B be nonempty. By the Axiom of Foundation there is x ∈ B such that y ∉ x for all y ∈ B. Since (A, ∈) is linearly ordered, the same holds for (B, ∈) and x is the least element of B with respect to the ordering given by ∈. Thus A is an ordinal.

Exercise 12.7. Verify the following properties of ordinals.
1. Every element of an ordinal is an ordinal.
2. If α is an ordinal, then S(α), which is defined as α ∪ {α}, is also an ordinal.

Theorem 12.8. The following basic facts hold for ordinals:
1. If α ∈ β and β ∈ γ then α ∈ γ.
2. If α, β are ordinals then either α ∈ β or α = β or β ∈ α.
3. If A is a set of ordinals then ⋃A is an ordinal.
4. If A is a nonempty set of ordinals then there exists an ordinal α ∈ A such that α ∩ A = ∅. Consequently every set of ordinals is well-ordered by ∈.

Proof. 1. The Axiom of Foundation gives α ∉ α for all α. If α ∈ β and β ∈ γ then α ∈ γ since γ itself is an ordinal and transitive.

2. Let α, β be ordinals such that the cases α = β and α ∈ β do not hold. Then α ⊈ β, that is, there is a least element of α − β, say γ. Since γ = {δ | δ < γ} and every δ is in α by transitivity, γ ⊆ β. If γ ⊂ β then γ ∈ β in contradiction to the choice of γ. Thus γ = β and β ∈ α.

3. ⋃A is the union of transitive sets which are linearly ordered by ∈. The union is again transitive. Furthermore, if α, β ∈ ⋃A then α and β are comparable by the previous paragraph. So ⋃A is a transitive set which is linearly ordered by ∈. By Theorem 12.6, ⋃A is an ordinal.

4. The set A inherits from the superset and ordinal ⋃A that ∈ is a well-ordering. Since A is nonempty, it has a least element α with respect to the ordering given by ∈. Then A ∩ α contains only ordinals below α and is thus empty.

Definition 12.9. An ordinal α is called successor ordinal if α = S(β) = β ∪ {β} for some other ordinal β and is called limit ordinal otherwise. The supremum of a set A of ordinals is denoted by sup A; note that sup A = ⋃A.

Example 12.10. 0 is a limit ordinal and all the positive natural numbers are successor ordinals. ω = sup N is the next limit ordinal. One should also note that above every ordinal α there is a successor ordinal, namely S(α), and also a limit ordinal obtained as sup f_α[N] where f_α(0) = α and f_α(S(n)) = S(f_α(n)). An ordinal α is finite iff S(α) = {0} ∪ {S(β) | β ∈ α}. An ordinal α is transfinite iff |α| = |S(α)|.

One can combine the usage of supremums and successors to obtain every ordinal from those below. So, for given α, one has α = ⋃{S(β) | β ∈ α} = sup{S(β) | β < α} and this rule holds also for α = 0 by using the definition sup ∅ = 0.

Exercise 12.11. Use the above results to show that there is no set containing all ordinals in V.
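For hereditarily finite sets, the definition of an ordinal as a transitive set that is linearly ordered by ∈ can be checked mechanically. The Python sketch below is an illustration only: it assumes sets are modelled as nested frozensets, the helper names are chosen here, and it tests just transitivity and trichotomy of ∈ (for finite sets this suffices, since ∈ has no cycles among frozensets). It cannot represent ω or any transfinite ordinal.

```python
def successor(x):
    """S(x) = x ∪ {x}."""
    return frozenset(x | {x})


def natural(n):
    """The von Neumann code of the natural number n."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


def is_transitive(a):
    """Check that every member of a member of a is a member of a."""
    return all(b in a for member in a for b in member)


def is_linearly_ordered_by_membership(a):
    """Check trichotomy: any two members x, y satisfy x ∈ y, x = y or y ∈ x."""
    members = list(a)
    return all(x == y or x in y or y in x for x in members for y in members)


def is_ordinal(a):
    """Finite version of the criterion: transitive and linearly ordered by ∈."""
    return is_transitive(a) and is_linearly_ordered_by_membership(a)


# The two non-examples from the text, built from the codes of natural numbers.
block = frozenset(natural(k) for k in range(2, 9))   # {2, 3, ..., 8}
branching = frozenset({natural(0), natural(1), natural(2),
                       frozenset({natural(1)})})     # {∅, {∅}, {∅,{∅}}, {{∅}}}

print(is_ordinal(natural(5)))
print(is_linearly_ordered_by_membership(block), is_transitive(block))
print(is_transitive(branching), is_linearly_ordered_by_membership(branching))
```

In this representation α < β, α ∈ β and α ⊂ β can all be tested directly, for example with `natural(2) in natural(5)` and `natural(2) < natural(5)` (Python's proper-subset comparison on frozensets).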

13 Transfinite Induction and Recursion

Induction and Recursion can be generalized to ordinals and the universe of sets. This says that one can prove theorems and build functions along the membership relation ∈ from the bottom to the top.

Theorem 13.1 (Transfinite Induction on Ordinals). Let p(x) be a property. Assume that for every ordinal α,
1. p(0) holds;
2. if there is an ordinal β with S(β) = α and if p(β) holds then also p(α) holds;
3. if α is a limit ordinal and p(β) holds for all β < α then p(α) holds.
Then it can be concluded that p(α) holds for all ordinals α.

Remark 13.2. There are several equivalent statements of Transfinite Induction.
1. If for every ordinal α the implication (∀β < α (p(β))) ⇒ p(α) is true, then p(α) is true for all ordinals α.
2. If for every α where p(α) is false there is another β < α where p(β) is false, then p(α) is true for all ordinals α.
3. If there is no minimal α satisfying ¬p(α) then p(α) is true for all α.

Note that due to the Axiom of Foundation one can get a counterpart to transfinite induction on (V, ∈).

Theorem 13.3 (Transfinite Induction in V). Assume that for a property p and all x ∈ V the implication (∀z ∈ x (p(z))) ⇒ p(x) holds. Then p(x) is satisfied for all x ∈ V.

Proof. Assume that there is an x ∈ V where p(x) is false. Let x′ = {z ∈ TC(x) | p(z)}. The set TC(x) − x′ is nonempty and there is by the Axiom of Foundation a y ∈ TC(x) − x′ such that every z ∈ y is not in TC(x) − x′. Recall that by definition, TC(x) is transitive and thus all members of y are in TC(x). Thus they are also in x′ and one has that ∀z ∈ y p(z) is true. So p(y) holds and y ∈ x′ contrary to its choice. Thus TC(x) − x′ must be empty and p(x) is true as well. So p(x) holds for all x ∈ V.

Example 13.4. Let F be a class defining a function in one variable. If F(x) = F[x] for all x ∈ V then F is the identity.

Proof. Assume that F(y) = y for all y ∈ x. Then F(x) = F[x] = {F(y) | y ∈ x} = {y | y ∈ x} = x. Thus the equality holds also for x. It follows from transfinite induction that F is the identity.

One can use well-founded relations to generalize recursion from the natural numbers to many other structures like well-ordered sets, the class of ordinals and even the whole universe V along ∈. Well-founded relations are a generalization of both the element relation and a well-ordering.

Definition 13.5. A relation R on a domain W (which is either a class or a set) is well-founded iff
• for every x ∈ W, {y ∈ W | y R x} is a set;
• for every nonempty set x ⊆ W there is a y ∈ x such that no z ∈ x satisfies z R y.

Note that the first condition is only important for the case that W is a proper class, that is, not a set. Furthermore, choosing for any y ∈ W the subset x = {y} of W proves that (W, R) is irreflexive. Note that not only recursion but also transfinite induction can be carried out along any well-founded relation. Now some examples of well-founded relations are given:
• Assume that x R y iff x ∈ y. Then R is a well-founded relation on V by the Axiom of Foundation.
• Assume that (W, <) is a well-ordered set and x R y iff x < y. Then R is a well-founded relation on W.

Exercise 13.6. Let R be such that x R y iff there is a z with x ∈ z ∧ z ∈ y. Show that R is well-founded. Let (x, y) R (v, w) iff either x = v ∧ y ∈ w or y = w ∧ v ∈ x. Show that R is well-founded. Let A be some set and let a_0 a_1 ... a_(n−1) R b_0 b_1 ... b_(m−1) ⇔ n < m and there is a function f : n → m such that b_f(i) = a_i and (i < j ⇒ f(i) < f(j)) for all i, j ∈ n, where a_0 a_1 ... a_(n−1), b_0 b_1 ... b_(m−1) ∈ A*. Is R well-founded? Is the relation R given as x R y ⇔ x ∩ y = x ∪ y well-founded?

Theorem 13.7 (Transfinite Recursion). Let R be a well-founded relation with domain W and let G be a class which is a function in n + 1 variables. Then there is a class F which is also a function in n variables and satisfies

∀x1, ..., xn ∈ W (F(x1, x2, ..., xn) = G(x1, x2, ..., xn, {((y1, x2, ..., xn), F(y1, x2, ..., xn)) | y1 R x1})).

Proof. The proof is similar to that of Theorem 5.2. Let W be the domain of R and R∗ be the transitive closure of R, which exists by Theorem 5.2. More formally, R^1 is the same relation as R and one defines inductively v R^(n+1) w if v R^n w or v R^n u and u R^n w for some u ∈ W. Furthermore, v R∗ w iff v R^n w for some n ∈ {1, 2, 3, ...}. One can easily verify the following four facts on R∗ to obtain that this relation is also well-founded: First one can show by induction that for each n and w ∈ W, the class {v ∈ W : v R^n w} is a set. The same applies then for the inductively defined relation R∗. Second, R∗ is transitive. Third, R∗ is antisymmetric. Fourth, every non-empty set A ⊆ W has a minimal element with respect to R∗ as the set B = {w ∈ W : ∃u, v ∈ A : u R∗ w R∗ v} has a minimal element with respect to R.

This transitive closure R∗ of R will now be used to define a class C which will be used to define the function F. So let C be the class of all functions f such that
• The domain of f is a set of the form {(y1, x2, ..., xn) | y1 = x1 ∨ y1 R∗ x1} for some x1, ..., xn ∈ W.
• If (z1, x2, ..., xn) is in the domain of the function f then f(z1, x2, ..., xn) = G(z1, x2, ..., xn, {((y1, x2, ..., xn), f(y1, x2, ..., xn)) | y1 R z1}).

Now the function F is defined as the union over all functions f ∈ C, that is, F = {((x1, x2, ..., xn), f(x1, x2, ..., xn)) | f ∈ C ∧ f(x1, x2, ..., xn) is defined}. It is now shown that F is actually a function. This is done by considering the following subclass D of the class of all n-tuples of elements in W. D is the class of all n-tuples (x1, x2, ..., xn) of elements in W such that there is a function f ∈ C for which f(x1, x2, ..., xn) is defined and such that for all functions f, f̃ ∈ C where f(x1, x2, ..., xn), f̃(x1, x2, ..., xn) are defined, these values coincide.

Now one shows by transfinite induction that D = W^n. Let (x1, x2, ..., xn) be any tuple of n elements in W and assume that (y1, x2, ..., xn) ∈ D for all y1 R∗ x1. Now define f̃(y1, x2, ..., xn) = f(y1, x2, ..., xn) whenever f ∈ C and f(y1, x2, ..., xn) is defined. It follows from the membership of (y1, x2, ..., xn) in D that f̃(y1, x2, ..., xn) = ⋃{f(y1, x2, ..., xn) : f ∈ C ∧ f(y1, x2, ..., xn) is defined} for all y1 R∗ x1. Consider f̃ ∪ {((x1, x2, ..., xn), G(x1, x2, ..., xn, {((y1, x2, ..., xn), f̃(y1, x2, ..., xn)) | y1 R x1}))}; this function is in C. If there is a further function f ∈ C for which f(x1, x2, ..., xn)

is defined, then f coincides with f̃ on the domain of f̃ and hence f(x1, x2, ..., xn) coincides with G(x1, x2, ..., xn, {((y1, x2, ..., xn), f̃(y1, x2, ..., xn)) | y1 R x1}). So it follows that (x1, x2, ..., xn) ∈ D. Hence the class F = ⋃C is actually a function mapping n-tuples in W to V and so F exists. In the case that W is a set, F[W × W × ... × W] is a set as well by the Axiom of Replacement.

Informally, this means that whenever R is a well-founded relation on W and some class G says how to obtain F(x1, x2, ..., xn) from the arguments x1, x2, ..., xn and all pairs ((y1, x2, ..., xn), F(y1, x2, ..., xn)) with y1 R x1, then F itself exists (that is, F is a class).

Example 13.8. The function TC can be defined with transfinite recursion along ∈ via the formula TC(x) = {x} ∪ ⋃{TC(y) | y ∈ x}. Note that TC(∅) = {∅} as ⋃{TC(y) | y ∈ ∅} is just ∅. Furthermore, TC coincides with the successor S on ordinals. They are different on sets which are not ordinals as S({7, 8}) = {7, 8, {7, 8}} and TC({7, 8}) = {0, 1, 2, 3, 4, 5, 6, 7, 8, {7, 8}}.

An important application of transfinite recursion is the following result.

Theorem 13.9 (Representation Theorem). Let (W, ⊏) be a well-ordered set. Then there is an ordinal isomorphic to this set.

Proof. Using transfinite recursion one can define F : W → V by the equation F(a) = ⋃{S(F(b)) | b ⊏ a}. Note that F(a) = 0 if a is the least element of W with respect to ⊏ since the union over the members of the empty set gives the empty set: ⋃∅ = ∅. 0 is the ordinal represented by ∅. F[W] is a set itself, by the Axiom of Replacement. It is easy to see that for all a, b ∈ W the implication b ⊏ a ⇒ S(F(b)) ⊆ F(a) ⇒ F(b) ∈ F(a) holds. So F is order-preserving. Furthermore, if β ∈ F(a) then there is a least b ∈ W with β ∈ F(b). All c ⊏ b satisfy β ∉ F(c). So β ∈ ⋃{S(F(c)) | c ⊏ b} but β ∉ ⋃{F(c) | c ⊏ b}. Thus there is a c ⊏ b with β ∈ S(F(c)) − F(c). It follows that b is the successor of c with respect to ⊏ and β = F(c). So F[W] is transitive. So F[W] is an ordinal.

Theorem 13.10. For every set X there is an ordinal α such that |α| ≰ |X|.

Proof. Given X, let Y = {(Z, R) | Z ⊆ X ∧ R is a well-ordering on Z}. For all (Z, R), (Z′, R′) ∈ Y, let (Z, R) ⊏ (Z′, R′) ⇔ (Z, R) is isomorphic to an initial segment

of (Z′, R′). The relation ⊏ is well-founded: If U ⊆ Y then either U = ∅ or U contains some (Z, R). In the latter case, either (Z, R) is a minimal element or there is a least z ∈ Z such that some (Z′, R′) ∈ U is isomorphic to (Z[z], R) and then that (Z′, R′) is a minimal element of U. So one can define by transfinite recursion a function F : Y → V such that F((Z, R)) = ⋃{S(F((Z′, R′))) | (Z′, R′) ∈ Y ∧ (Z′, R′) ⊏ (Z, R)} where the well-founded relation on the domain of F is ⊏.

Assume now by way of contradiction that the range of F is not a set of ordinals: then one can define the nonempty set Y′ = {(Z, R) ∈ Y : F((Z, R)) is not an ordinal} and Y′ has a minimal element (Z, R) with respect to ⊏. For this minimal element, one has that F((Z, R)) is the union of all sets S(F((Z′, R′))) with (Z′, R′) ∈ Y ∧ (Z′, R′) ⊏ (Z, R), hence F((Z, R)) is the union of ordinals and hence F((Z, R)) is an ordinal itself in contradiction to the assumption.

Hence, the range of F is a set of ordinals and the union of these ordinals is an ordinal β. Every member of Y is isomorphic to an initial segment of α = S(β). If there would be a one-to-one function g : α → X then g[α] ∈ V and g[α] ⊆ X. Furthermore, g induces a well-ordering R on g[α] and (g[α], R) ∈ Y in contradiction to the fact that no member of Y is isomorphic to α. Thus there is no such g and |α| ≰ |X|.

Exercise 13.11. Construct by transfinite recursion a function on ordinals which tells whether an ordinal is even or odd. Limit ordinals should always be even, the successor of an even ordinal is odd and the successor of an odd ordinal is even. More formally, construct a function F such that F(α) = 0 if α is even, F(α) = 1 if α is odd.

Exercise 13.12. Is it possible to define a function F on all sets such that F(X) = n iff n is the maximal number such that there are Y_0, Y_1, ..., Y_n with Y_(m+1) = S(Y_m) for all m ∈ n and X = Y_n? If so, construct the corresponding function F by transfinite recursion.

14 The Rank of Sets

The rank is an alternative method to measure the size of a set. The cardinality asks how many elements are in the set, the rank asks how many levels are necessary to build a set. The rank is defined by transfinite recursion.

Definition 14.1. The rank ρ is the function which is defined as ρ(x) = ⋃{S(ρ(y)) | y ∈ x} with ρ(∅) = ⋃∅ = 0.
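On hereditarily finite sets, both the rank ρ and the transitive closure TC can be computed by ordinary recursion along ∈, which gives a small finite shadow of the transfinite recursion used here. The Python sketch below is an illustration under stated assumptions: sets are modelled as nested frozensets, finite ordinals are returned as Python integers (so the union of the successors S(ρ(y)) becomes a maximum of the values ρ(y) + 1), and the function names are chosen here.

```python
def natural(n):
    """The von Neumann code of the natural number n."""
    x = frozenset()
    for _ in range(n):
        x = frozenset(x | {x})
    return x


def rank(x):
    """rho(x) = union of S(rho(y)) over y in x; for finite ranks this is a maximum."""
    return max((rank(y) + 1 for y in x), default=0)


def transitive_closure(x):
    """TC(x) = {x} ∪ ⋃{TC(y) | y ∈ x}, following the convention that x ∈ TC(x)."""
    result = {x}
    for y in x:
        result |= transitive_closure(y)
    return frozenset(result)


one, two = natural(1), natural(2)
print(rank(one), rank(two))
print(rank(frozenset({one, frozenset({one})})))
print(len(transitive_closure(frozenset({natural(7), natural(8)}))))
```

With this convention TC(α) = S(α) for every finite ordinal α, while TC({7, 8}) has the ten elements 0, 1, ..., 8 and {7, 8}.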

Example 14.2. ρ(1) = ρ({∅}) = 1, ρ(2) = ρ({∅, {∅}}) = 2, ρ({{∅}, {{∅}}}) = 3, ρ({A}) = S(ρ(A)) and ρ(A ∪ B) = ρ(A) ∪ ρ(B) for all sets A, B.

Proposition 14.3. The rank ρ is an ordinal-valued function with ρ(α) = α for all ordinals α.

Proof. By Theorem 12.6, ρ(x) is an ordinal iff ρ(x) is transitive and linearly ordered by ∈. Being an ordinal is a property. So, for given x ∈ V, one can define the set x′ = {y ∈ TC(x) | ρ(y) is an ordinal} by comprehension. Assume by way of contradiction that x ∉ x′. Then TC(x) − x′ is not empty and has an element y such that no z ∈ y is in TC(x) − x′, that is, y ⊆ x′. Then ρ(z) is an ordinal for every z ∈ y and ρ(y) = ⋃{S(ρ(z)) | z ∈ y} is an ordinal by Theorem 12.8. Then y ∈ x′ contradicting the choice of y; this contradiction gives x ∈ x′. In particular, ρ(x) is an ordinal.

Recall that α = ⋃{S(β) | β ∈ α} for all ordinals α. Assuming that ρ(β) = β for all β ∈ α, one has that ρ(α) = ⋃{S(ρ(β)) | β ∈ α} = ⋃{S(β) | β ∈ α} = α. The equality ρ(α) = α holds for all ordinals α by transfinite induction.

Exercise 14.4. For any ordinal α, consider the successor function S restricted to α, that is, consider the set S|α = {{β, {β, S(β)}} | β ∈ α}. Determine ρ(S|α) for α = 42, 1905, 2004, ω, ω + 1, ω + 131501, ω^2 + ω · 2 + 1, ω^17 + ω^4.

Theorem 14.5. For every ordinal α let V_α = {x ∈ V | ρ(x) < α}. Then V_α is a set and ρ(V_α) = α.

Proof. Define a function G by G(α, x) = ⋃{P(z) | ∃y ((y, z) ∈ x)}.

Let F be the function obtained from G by transfinite recursion on ordinals. That is, F satisfies F(α) = ∪{P(F(β)) | β < α} for all ordinals α. Now one can show by transfinite induction that F maps ordinals to sets. F(∅) = ∅ is a set. If α is a successor ordinal and α = S(β) then F(α) = P(F(β)) and F(α) is a set. If α is a limit ordinal then F(α) = ∪F[α] and F(α) ∈ V by the Axiom of Replacement. The equality V_α = F(α) is shown by transfinite induction. That is, assuming that

equality holds for all β ∈ α, one has to show that the equality holds for α as well. If x ∈ F(α) then x ∈ P(F(β)) and x ⊆ F(β) for some β ∈ α. By induction hypothesis, ρ(y) < β for all y ∈ x. Thus ρ(x) < S(β) ≤ α and x ∈ V_α. If x ∈ V_α then ρ(x) = β < α for some β. Every y ∈ x satisfies ρ(y) < β and y ∈ V_β. By induction hypothesis, V_β = F(β). Since F(β) is a set by the Axiom of Replacement, P(F(β)) exists and x ∈ P(F(β)). It follows that x ∈ F(α). So F(α) and V_α have the same elements. By the Axiom of Extensionality they are equal. Thus the mapping α ↦ V_α is a function and V_α is a set for every ordinal α. On one hand, V_α consists only of elements x with ρ(x) < α. Thus ρ(V_α) ≤ α. On the other hand, every β < α satisfies ρ(β) = β and β ∈ V_α. So α ⊆ V_α and ρ(V_α) ≥ α. Thus ρ(V_α) = α.

Exercise 14.6. V_ω has been defined twice. Let A be the version of V_ω as defined in Definition 7.5, that is, let A consist of all hereditarily finite sets. Let B = ∪{V_n | n < ω} = {x ∈ V | ρ(x) < ω} be the version defined here. Show that both definitions coincide, that is, show A ⊆ B ∧ B ⊆ A. Show that B contains ∅, is closed under unions of two sets and is closed under the operation forming {v} from v. Thus, by Theorem 7.9, A ⊆ B. Show by induction that all members of V_n with n < ω are hereditarily finite. Thus B ⊆ A.

Proposition 14.7. The definition of the function F from Theorem 14.5 can be extended to all x ∈ V by the condition F(x) = ∪{P(F(y)) | y ∈ x}. For all x ∈ V, F(x) = F(ρ(x)).

Proof. This is proven by transfinite induction. So for any given x ∈ V, one has to show that F(x) = F(ρ(x)) provided that F(y) = F(ρ(y)) for all y ∈ x. If x = ∅ this directly follows from ρ(0) = 0. So consider the case that x is nonempty. From the definition and the inductive hypothesis one has that F(x) = ∪{P(F(y)) | y ∈ x} = ∪{P(F(ρ(y))) | y ∈ x}. Note that F(α) ⊆ F(β) and P(F(α)) ⊆ P(F(β)) whenever α, β are ordinals with α ≤ β. Furthermore, α < ρ(x) iff there is y ∈ x with α ≤ ρ(y). So one can add P(F(α)) to the union for all α < ρ(x) without changing the outcome: F(x) = ∪{P(F(α)) | α < ρ(x)}. It follows from Theorem 14.5 that F(x) = F(ρ(x)).
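For hereditarily finite sets the rank and the levels V_n can be computed directly. The following is a minimal sketch under the assumption that sets are modelled as nested Python `frozenset` values; the names `rank`, `powerset` and `V` are illustrative helpers and not notation from these notes. On finite ordinals the union of the successors S(ρ(y)) is just the maximum of ρ(y) + 1, which is what the code uses.

```python
from itertools import chain, combinations


def rank(x: frozenset) -> int:
    """Rank of a hereditarily finite set: the union of S(rank(y)) for y in x.

    For finite ordinals this union is the maximum of rank(y) + 1,
    and the empty set has rank 0.
    """
    return max((rank(y) + 1 for y in x), default=0)


def powerset(s: frozenset) -> frozenset:
    """All subsets of s, each represented as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


def V(n: int) -> frozenset:
    """V_n for a natural number n: F(0) is empty and F(S(n)) = P(F(n))."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level


EMPTY = frozenset()
ONE = frozenset({EMPTY})        # the ordinal 1 = {0}
TWO = frozenset({EMPTY, ONE})   # the ordinal 2 = {0, 1}

# The set {{0}, {{0}}} from Example 14.2 and the sizes of V_0, ..., V_4.
print(rank(frozenset({ONE, frozenset({ONE})})))
print([len(V(n)) for n in range(5)])
```

The recursion only terminates because every input is built from finitely many nested finite sets; it is not a model of transfinite recursion in general.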


15 Arithmetic on Ordinals

Addition and multiplication are defined inductively. The first parameter is fixed and the induction goes over the second one. The basic idea of addition of ordinals is that it has an easy geometric interpretation and that it can be reversed: for all ordinals α, β with β > α there is a unique ordinal γ with α + γ = β.

Definition 15.1. For ordinals α and β, one can define the addition by transfinite induction: α + 0 = α and, for β > 0, α + β = sup{S(α + γ) | γ ∈ β}. Alternatively, one can also say that α + β is the unique ordinal which is order-isomorphic to the set {0} × α ∪ {1} × β = {(0, γ) | γ ∈ α} ∪ {(1, δ) | δ ∈ β} equipped with lexicographic ordering.

Remark 15.2. Notice that the addition of ordinals is not commutative. For example, ω + 1 ≠ 1 + ω = ω. Furthermore, if α > β, one can define α − β to be the unique ordinal γ with β + γ = α. This ordinal is the one which is isomorphic to the well-ordered set ({δ ∈ α | δ ∉ β}, <). That is, arithmetic and set-theoretic difference coincide up to isomorphism for ordinals.

Definition 15.3. Multiplication can also be defined by transfinite recursion: α · 0 = 0 and, for β > 0, α · β = sup{(α · γ) + α | γ ∈ β}. Alternatively, one can define α · β to be the unique ordinal isomorphic to the set β × α equipped with the lexicographic ordering. Again, the multiplication of ordinals is not commutative. For example, ω · 2 = ω + ω ≠ ω = 2 · ω.

Definition 15.4. α^0 = 1, α^1 = α, α^2 = α · α, α^3 = α^2 · α and α^S(n) = α^n · α.

Definition 15.5. Let C_fin be the class of all functions F which map ordinals to natural numbers with the additional constraint that F(α) = 0 for all but finitely many ordinals α. For F, G ∈ C_fin define that F < G iff F ≠ G and F(α) < G(α) for the largest ordinal α with F(α) ≠ G(α). Map an ordinal α to that function F ∈ C_fin for which {G ∈ C_fin | G < F} is order-isomorphic to α; this function is denoted by F_α from now on.
Define an addition ⊕ on the ordinals by letting α ⊕ β be that ordinal γ for which the equation ∀δ (F_α(δ) + F_β(δ) = F_γ(δ))

Remark 15. There are ordinals α. Exercise 15. 66 . . . . Since 0 is represented by the void sum. . γ can be represented as γ = ω αn−1 · mn−1 + ω αn−2 · mn−2 + . α1 . So the idea is to make an isomorphism between the ordinals and the ordered free commutative semigroup over them. αn−2 . + ω α1 · m1 + ω α0 · m0 .9. 2.4 is established in the last paragraph for the case n ∈ N.8 coincide for ω n with n ∈ N. 3. .4 and 15. The addition ⊕ is commutative. let n be the number of ordinals which F does not map to 0 and let αn−1 .7. This proposition uses Deﬁnition 15. ω 0 = 1 and ω α = sup{ω β · m | β < α ∧ m ∈ N} for ordinals α > 0. let Gα be the function mapping α to 1 and all other ordinals to 0. Let mk = Fβ (αk ) for all k ∈ n. There are ordinals α. There are ordinals α. Proposition 15. Which of the following statements are true and which are false? 1. On the other hand. β such that α + β and β + α both diﬀer from α ⊕ β. 5. Furthermore. . Deﬁnition 15. Let n be the number of places where Fβ is not 0 and let the ordinals αn−1 . Given F ∈ Cf in .6.8. β such that α < β and α ⊕ γ = β for all ordinals γ. Then β = ω αn−1 · mn−1 + ω αn−2 · mn−2 + . In particular. . α0 be these ordinals in descending order. This unique representation is called the Cantor Normal Form of β. . + ω α1 · m1 + ω α0 · m0 . There are ordinals α. . α1 . αn−2 . 4. + Gα1 · F (α1 ) + Gα0 · F (α0 ). . α0 be these n places. Given any ordinal β > 0. Deﬁnitions 15. The Cantor Normal Form of 0 is the void sum. .8 and the equivalence to Deﬁnition 15. Proof. β such that α + β > α ⊕ β. consider the function Fβ ∈ Cf in which is isomorphic to β. Clearly ω α > ω β · m for all β ∈ α and m ∈ N. ω 0 is greater than 0 and takes the next value 1.holds. let γ be an ordinal with 1 ≤ γ < ω α . . Let ω α denote the ordinal represented by Gα . Then F = Gαn−1 · F (αn−1 ) + Gαn−2 · F (αn−2 ) + . . . β such that α < β and α + γ = β for all ordinals γ.

where n ∈ N − {0} and m_k > 0 for all k ∈ n. Recall that F_γ is the function in C_fin representing γ and G_α represents ω^α. Let δ be the largest ordinal with F_γ(δ) ≠ G_α(δ). Since F_γ < G_α, F_γ(δ) < G_α(δ) = 1 and δ = α. Thus F_γ(δ) = 0 and δ > α_{n−1} > α_{n−2} > ... > α_1 > α_0. Now let β = α_{n−1} and m = Σ_{k∈n} m_k. Then F_γ ≤ G_β · m < G_α and γ ≤ ω^β · m < ω^α. Thus ω^α is indeed the supremum of all ω^β · m with β ∈ α and m ∈ N.

Note that the equality ω^0 = 1 coincides with Definition 15.4. Assume now that the equivalence is established for some n ∈ N. Then, using the definition of ω^S(n) and of the multiplication, one has that ω^S(n) = sup_{m∈N} ω^n · m. Since the sequence ω^0, ω^1, ..., ω^n is increasing, ω^S(n) is also by Definition 15.4 the supremum of all ω^k · m with k ∈ S(n) and m ∈ N; thus the equivalence of both definitions transfers to S(n) and the last statement of the proposition follows by induction.

Example 15.10. One can view the Cantor Normal Form as a finite sum of powers of ω in descending order as in this example: ω^5 + ω^2 + ω^2 + ω^2 + ω^1 + ω^0 + ω^0 = ω^5 + ω^2 · 3 + ω + 2. Instead of repeating same ordinals, one can also multiply them with the corresponding natural number. Furthermore, instead of ω^1, one can write just ω, instead of ω^0 just 1. This all is done on the right hand side of the equation above. The void sum is represented by the symbol 0. Also transfinite ordinals can be in the power: ω^(ω^2) + ω^(ω·5+8) · 7 + ω^(ω·5+7) · 12345 + ω^22222 · 33333 + ω^4 + ω^3 + ω^2 + ω + 1.

If one adds ordinals ω^α · a + ω^β · b with α < β, then ω^α · a can be omitted; if α = β the coefficients can be added giving ω^α · (a + b); if α > β, no simplification is possible:

ω^3 + ω^5 = ω^5,
ω^3 · 5 + ω^4 · 8 + ω^5 · 0 = ω^4 · 8,
ω^3 · 234 + ω^3 · 111 = ω^3 · 345,
ω^5 + ω^3 + ω^2 + ω^3 + ω^1 = ω^5 + ω^3 · 2 + ω.

The last line has the application of two rules: first ω^2 is omitted as it is in front of a higher ω-power, second the two ω^3-terms are unified to one; no further simplification is possible.

The Cantor Normal Form can also be used in order to express the rank of sets. Recall the following rules:

• The rank of an ordinal α is α. So ρ(0) = 0, ρ(1) = 1, ρ(ω) = ω.

11. ω · 2... Exercise 15.13. What condition on γ1 . one can get the following ranks expressed in Cantor Normal Form: ρ({{∅}}) ρ({∅. 1. An ǫ-number is an ordinal ǫ satisfying ǫ = ω ǫ . ρ(y) + 1. ω 2 + ω + 1 + ω 2 + ω + 1 + ω 2 + ω + 1. {∅}}) ρ({{{{ω}}}}) ρ({ω 4 + 2. 1 ⊕ ω ⊕ ω 2 ⊕ ω 3 . 2 ρ({ω α | α < ω 2 }) = ω ω . Assume that α = ω γ1 + ω γ2 and β = ω δ1 + ω δ2 with γ1 > γ2 and δ1 > δ2 . 3. .. The Cantor Normal Form is in particular useful to denote ordinals formed by ﬁnite sums over small powers of ω. ω ω+5 + ω ω+2 · ω + ω 2 . ω (ω 68 ω (ω ω ) ) . For example. (ω + 3)5 + (ω 2 + 17) · (ω + 8) + ω 12 . 6.• The rank of sets is determined by the rank of their elements. Then. 256256 + ω · 42. ω 3 · 8}) = = = = 2. ω + ω 2 + ω 3 + ω 4 + 2. ρ(z) + 1} and ρ({ω. ρ(X) = sup{ρ(Y ) + 1 | Y ∈ X}. The ordinal ǫ0 is the limit of the sequence ω. • In general. 2. ω + 4. y. ǫα is the ﬁrst ordinal such that the set {ǫ : ǫ ≤ ǫα ∧ ω ǫ = ǫ} is isomorphic to {β : β ≤ α}. 4. δ2 is equivalent to the equation α + β = α ⊕ β. β < ω + 8}) = ω ω+7 · 2 + 1. z}) = max{ρ(x) + 1. ω (ω ) . ρ({x.12. ρ({ω α + ω β | α. Exercise 15. ω ω . 5. 2. Determine the Cantor Normal Form of the following ordinals. γ2 . Example 15. for any ordinal α. ω 4 + 3. δ1 . In particular. ω + 5}) = ω · 2 + 1.
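The simplification rules for adding ordinals in Cantor Normal Form from Example 15.10 can be tried out mechanically. The sketch below is only an illustration under a strong assumption: it handles ordinals below ω^ω, stored as lists of (exponent, coefficient) pairs with natural-number exponents in strictly decreasing order. The names `CNF`, `add`, `natural_sum` and `omega_power` are hypothetical helpers, not notation from these notes; `natural_sum` adds coefficients position by position in the way the addition ⊕ is defined.

```python
from functools import reduce
from typing import List, Tuple

# An ordinal below omega^omega in Cantor Normal Form: (exponent, coefficient)
# pairs with strictly decreasing exponents and positive coefficients.
# The empty list stands for 0 (the void sum).
CNF = List[Tuple[int, int]]


def omega_power(exponent: int, coefficient: int = 1) -> CNF:
    """The ordinal omega^exponent * coefficient; a zero coefficient gives 0."""
    return [(exponent, coefficient)] if coefficient > 0 else []


def add(a: CNF, b: CNF) -> CNF:
    """Ordinal sum a + b: terms of a below the leading term of b are omitted,
    and a term of a with the same exponent has its coefficient added."""
    if not b:
        return list(a)
    lead_exp, lead_coeff = b[0]
    kept = [(e, c) for (e, c) in a if e > lead_exp]
    same = sum(c for (e, c) in a if e == lead_exp)
    return kept + [(lead_exp, same + lead_coeff)] + list(b[1:])


def natural_sum(a: CNF, b: CNF) -> CNF:
    """Commutative sum: coefficients of equal exponents are added."""
    coeffs = {}
    for e, c in list(a) + list(b):
        coeffs[e] = coeffs.get(e, 0) + c
    return sorted(coeffs.items(), reverse=True)


# The last line of Example 15.10, added from left to right.
terms = [omega_power(5), omega_power(3), omega_power(2),
         omega_power(3), omega_power(1)]
print(reduce(add, terms))
# Non-commutativity of + versus commutativity of the natural sum.
print(add(omega_power(0), omega_power(1)), add(omega_power(1), omega_power(0)))
```

Exponents that are themselves infinite ordinals would need a recursive representation and a comparison of exponents; that is deliberately left out here.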

The ordinal ε_0 is the limit of the sequence ω, ω^ω, ω^(ω^ω), ... of iterated powers of ω. In particular, α < ε_0 iff α can be expressed by a formula consisting of the constants 0, 1, the power ω^β for a subformula β and addition. For example, ω^ω + ω^2 + ω^2 + ω^2 + 1 + 1 is such an expression; the brackets ( and ) are normally omitted. It is of course convenient to write ω^ω + ω^2 · 3 + 2 instead.

Example 15.14. An ordinal α is called constructive or recursive iff α < ω or there is a relation ⊏ ⊆ N × N which can be computed by a computer programme such that (N, ⊏) is isomorphic to α. The ordinal ω^ω is constructive. There is a one-to-one enumeration of all polynomials p_0, p_1, ... in ω where the coefficients are natural numbers. Now let n ⊏ m iff the polynomials p_n, p_m are different and satisfy a_n < a_m for the coefficients a_n in p_n and a_m in p_m for the largest power ω^k where these coefficients are different. So ω^5 + ω^3 · 2 ⊐ ω^5 + ω^3 ⊐ ω^5 + ω^2 ⊐ ω^4 · 17 + ω^3 ⊐ ω^4 · 17 ⊐ ω^2 + 1 ⊐ ω ⊐ 1243134123412342. An equivalent definition is that p_n ⊏ p_m iff p_n(x) < p_m(x) for almost all natural numbers x when x replaces ω viewed upon as a formal variable in the polynomials p_n, p_m. Other examples of constructive ordinals are ε_0, ε_1, ..., ε_ω, ε_{ω+1}. The first non-constructive ordinal is ω_CK named after the mathematicians Church and Kleene who studied this ordinal.

Example 15.15. There is a first ordinal ω_1 such that the set {α : α < ω_1} representing ω_1 is not countable. It is larger than all previously considered ordinals. 121234312 < ω < ω^ω + 234123443124123 < ε_0 < ε_1 < ε_2 < ε_5 + ω^7 < ω_CK < ω_CK + ω · 17 + 4 < ω_1.

16 Cardinals

There are two different usages of the natural numbers: First to denote the quantity of something, say "three" represents the set {apple, banana, pear} of three fruits. Second to induce an order, so the third word "pear" comes after the second word "banana". The English language reflects these two ways to use numbers by having the different words "three" and "third". When dealing with infinite objects, it is even more necessary to distinguish cardinals (representing the quantity) and ordinals (representing an order): object number ω + 8 should come after object ω + 7 and not before it. But the cardinality of the sets represented by these two ordinals is the same, since f given as

f(α) = S(α) if α ∈ ω;
f(α) = 0 if α = ω + 7;
f(α) = α if ω ⊆ α ∧ α ∈ ω + 7

is a bijective function from ω + 8 to ω + 7. So one would want to assign to ω + 8 and ω + 7 the same cardinal. This is done by defining that the cardinal α of a set A is the least ordinal such that there is a bijective mapping from A into α.

Definition 16.1. An ordinal α is a cardinal (or cardinal number) if |β| < |α| for all β ∈ α. A cardinal α is called the cardinal number (or sometimes the cardinality) of A, denoted by α = |A|, if and only if |α| = |A|.

Example 16.2. If α ≤ ω then α is a cardinal. In particular, ω is the least infinite cardinal. Furthermore, if α is countable and α > ω then α is not a cardinal. ω + 1, ω + 17, ω^2, ω^51 are not cardinals.

Property 16.3. Let α be an ordinal. If α is a cardinal then α = |α|.

Theorem 16.4.
1. For every ordinal β there is a unique cardinal α such that α = |β| and α ≤ β.
2. For every ordinal β there is a first cardinal α such that |β| < α. α is denoted as β^+.
3. If A is well-orderable then there is a unique cardinal α such that α = |A|.
Note that the Axiom of Choice defined below is required to guarantee that every set has a cardinal.

Proof. The set {γ ∈ S(β) | |γ| = |β|} has a minimum α. Then |α| = |β| but |γ| < |β| for all γ < α. It follows that α is a cardinal, that is, α = |β|. Given an ordinal β, there is by Theorem 13.10 an ordinal γ with |γ| ≰ |β|. Now let α be the minimum of all δ ∈ S(γ) such that |δ| ≰ |β|. Then α is the first cardinal larger than β. Given a well-ordered set (A, ⊏), there is by Theorem 13.9 a unique ordinal β representing (A, ⊏) in the sense that (A, ⊏) is isomorphic to (β, <). So there is a cardinal α such that α = |β| and therefore also α = |A|.

Definition 16.5. For an ordinal β, the least cardinal α satisfying that β ∈ α is denoted by β^+. A cardinal α is called a successor cardinal if α = β^+ for some ordinal β. A cardinal is called a limit cardinal if it is not a successor cardinal. Furthermore, cardinals are denoted by alephs: ℵ_0 = ω and ℵ_α = sup{ℵ_β^+ | β ∈ α} for ordinals α > 0.
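The bijection between ω + 8 and ω + 7 that motivates this section can be checked on samples. The sketch below assumes an ad hoc encoding of the ordinals below ω · 2 as pairs, where (0, n) stands for the natural number n and (1, k) stands for ω + k; this encoding and the names `f` and `f_inverse` are illustrative and not part of the notes.

```python
from typing import Tuple

# (0, n) models the natural number n, (1, k) models omega + k.
Ordinal = Tuple[int, int]


def f(alpha: Ordinal) -> Ordinal:
    """The map from omega + 8 to omega + 7 described in the text."""
    block, n = alpha
    if block == 0:        # alpha in omega: take the successor
        return (0, n + 1)
    if n == 7:            # alpha = omega + 7: send it to 0
        return (0, 0)
    if 0 <= n < 7:        # omega <= alpha < omega + 7: keep it fixed
        return (1, n)
    raise ValueError("alpha is not an element of omega + 8")


def f_inverse(beta: Ordinal) -> Ordinal:
    """The inverse map from omega + 7 back to omega + 8."""
    block, n = beta
    if block == 0:
        return (1, 7) if n == 0 else (0, n - 1)
    if 0 <= n < 7:
        return (1, n)
    raise ValueError("beta is not an element of omega + 7")


sample = [(0, n) for n in range(50)] + [(1, k) for k in range(8)]
print(all(f_inverse(f(alpha)) == alpha for alpha in sample))
```

A finite sample cannot prove bijectivity, but the explicit inverse makes the argument visible: the single extra element ω + 7 is absorbed by shifting the natural numbers up by one.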

Although cardinals are identified with the ordinals representing them, there is still the traditional name ω_α for the least ordinal β satisfying |β| = ℵ_α. So ω_1 = ω_0^+, ω_2 = ω_1^+, ω_3 = ω_2^+ and ω_{α+1} = ω_α^+ for all ordinals α. All ω_α are limit ordinals, but ℵ_α is a limit cardinal only if α is a limit ordinal, ℵ_α is a successor cardinal otherwise.

Example 16.6. Due to the identification of sets and cardinals with ordinals, the natural numbers and their cardinality can all be denoted by the following symbols: N, ω, ω_0, ℵ_0. Similarly, ω_1, ω_0^+, ℵ_1, ℵ_0^+ are all names for the first uncountable ordinal which is identified with the set representing it and its cardinal.

Definition 16.7 (Arithmetic for Cardinals). The addition and multiplication of cardinals is different from that of ordinals since one enforces that the result is a cardinal. For cardinals κ and λ, define κ + λ = |κ + λ|, κ · λ = |κ × λ| and 2^κ = |P(κ)|. So ℵ_α + 1 will be different from both ω_α + 1 (obtained by looking on ℵ_α as an ordinal) and ℵ_{α+1} = ℵ_α^+ (obtained by adding the indices). In the following it will be proven that the addition and the multiplication of infinite cardinals are really trivial and coincide with forming the maximum.

Proposition 16.8 (Hessenberg). If κ is an infinite cardinal and λ a cardinal with λ ≤ κ then κ + λ = λ + κ = κ.

Proof. For every infinite ordinal α, one has |α| = |S(α)| witnessed by the bijection f defined as f(β) = 0 if β = α; f(β) = β + 1 if β < ω; f(β) = β if ω ≤ β < α. So S(α) cannot be a cardinal and κ is a limit ordinal. Note that κ ≤ |κ + λ| ≤ |κ × {0, 1}|. So it is sufficient to show that |κ| = |κ × {0, 1}|. This is witnessed by the following function g: g(ω · γ + n, a) = ω · γ + 2n + a for all ordinals γ and n ∈ N such that ω · γ + n ∈ κ. Since every (β, a) ∈ κ × {0, 1} can be uniquely represented as (ω · γ + n, a) with γ being an ordinal, n ∈ N and a ∈ {0, 1}, the function g is well-defined. Furthermore, using the fact that κ is an infinite limit ordinal, it is easy to see that g is a bijection.

Exercise 16.9. Construct a one-to-one function h which maps α × ω to α for any infinite limit ordinal α. This function can without loss of generality assume that the input is of the form (ω · γ + n, m) where m, n ∈ N and γ is an ordinal with ω · S(γ) ≤ α; the image should be of the form ω · γ + h(n, m).
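The function g from the proof of Proposition 16.8 interleaves two copies of each block of length ω. The following sketch illustrates this under the assumption that an ordinal ω · γ + n with γ a natural number is stored as the pair (γ, n), so only ordinals below ω^2 are covered; the names `g` and `g_inverse` and the encoding are chosen for illustration only.

```python
from typing import Tuple

# (gamma, n) models the ordinal omega * gamma + n for natural numbers gamma, n.
Ordinal = Tuple[int, int]


def g(beta: Ordinal, a: int) -> Ordinal:
    """g(omega * gamma + n, a) = omega * gamma + 2n + a for a in {0, 1}."""
    gamma, n = beta
    return (gamma, 2 * n + a)


def g_inverse(beta: Ordinal) -> Tuple[Ordinal, int]:
    """Recover (omega * gamma + n, a) from omega * gamma + k with k = 2n + a."""
    gamma, k = beta
    return ((gamma, k // 2), k % 2)


pairs = [((gamma, n), a) for gamma in range(4) for n in range(20) for a in (0, 1)]
print(all(g_inverse(g(beta, a)) == (beta, a) for beta, a in pairs))
```

Within each block the copy with a = 0 lands on the even positions and the copy with a = 1 on the odd ones, which is why no limit stage is ever needed as a value.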

n ∈ N and γ is an ordinal with ω ·S(γ) ≤ α. then (λ × λ. β) ∈ µ × µ be such that (µ. λ are cardinals with ℵ0 ≤ κ and 1 ≤ λ ≤ κ then κ · λ = λ · κ = κ. = By the Comparability Theorem. This is a contradiction. Let η = max{α. β} = max{γ. Let A = {λ ∈ S(κ) | λ ≥ ω is a cardinal and (λ × λ. m) where m.13. Let (α. recall the canonical well-ordering from Example 11. <cw ) given by (α. <cw ) to some subset of (ω κ·2+3 . Recall that 2κ = |P(κ)| by Deﬁnition 16.11. β} = max{γ. m). Then κ ∈ A. <cw ) since |µ × µ| ≥ µ and µ is a cardinal. β) ≤cw (η. β). Then |λ × λ| = |η × η|. if ω ≤ λ < µ and λ is a cardinal. δ} ∨ (max{α.input is of the form (ω ·γ +n.β}·2+2 + ω max {α. β) <cw (γ. denoted by <cw . Then (α. <cw ) ∼ (λ. the image should be of the form ω · γ + h(n.12. the canonical well-ordering of κ × κ. = Proof. 72 . Note that 20 = 1 = 0+ . = Let µ ∈ A be the least element of A. for all λ. h : µ → η ×η is injective. say κ.7 and 2κ > κ by Theorem 6. (γ. ∈) is isomorphic to an initial segment of (µ × µ. Note that µ > ω. The mapping (α. In order to see that κ × κ is the same as κ. δ) ⇔ max{α.6 states that it is true for κ = ω. Remark 11. But |λ × λ| = λ < µ. is deﬁned as follows for (α. ∈). Assume that there is a counterexample. Hence. β} < max{γ. Hence. Thus 2κ ≥ κ+ . <cw ) ∼ (κ. Let κ be a cardinal. Thus (κ × κ. δ} ∧ α < γ) ∨ (max{α. β). Furthermore. For all inﬁnite cardinals κ.β}+α+1 + ω β is an isomorphism from (κ × κ. Remark 16. β}. Let h be the isomorphism. β) → ω max{α. Let λ = |η|. ∈). For an inﬁnite ordinal κ. δ) ∈ κ × κ: (α. Theorem 16. ∈). So it remains to show that the two well-ordered sets are isomorphic. <cw ) ∼ (λ. ∈) is isomorphic to the initial segment of (µ × µ. η). Theorem 16. (κ × κ. If κ. <cw ) is a well-ordered set. ∈)}. Deﬁnition 16.12 (Hessenberg). there is no counterexample to the theorem.5. (µ.10. 21 = 2 = 1+ and 24 = 16 > 5 = 4+ . δ} ∧ α = γ ∧ β < δ).
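For κ = ω the canonical well-ordering <cw of κ × κ can be listed explicitly, which makes the isomorphism between (ω × ω, <cw) and (ω, ∈) concrete. The sketch below follows the three clauses of the definition (compare the maxima, then the first components, then the second components); the helper names `cw_key` and `cw_position` are hypothetical.

```python
from typing import Tuple


def cw_key(pair: Tuple[int, int]) -> Tuple[int, int, int]:
    """Sort key realising <cw: first the maximum, then alpha, then beta."""
    alpha, beta = pair
    return (max(alpha, beta), alpha, beta)


def cw_position(pair: Tuple[int, int]) -> int:
    """Number of pairs of natural numbers strictly below `pair` in <cw."""
    alpha, beta = pair
    m = max(alpha, beta)
    # m * m pairs have a smaller maximum.  Among the pairs with maximum m,
    # (0, m), ..., (m - 1, m) come first, followed by (m, 0), ..., (m, m).
    return m * m + alpha if alpha < m else m * m + m + beta


N = 6
pairs = sorted(((a, b) for a in range(N) for b in range(N)), key=cw_key)
print(pairs[:9])
print([cw_position(p) for p in pairs[:9]])
```

Every pair has only finitely many predecessors, namely at most (max{α, β} + 1)^2 many, and this finiteness of initial segments is the special case κ = ω of the counting argument in the proof.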

A choice function C on N can be deﬁned as C(S(n)) = n for all n ∈ N. Deﬁnition 17. Let β be the least such ordinal. By Theorem 13. If (W. Assuming all axioms except the Axiom of Choice. First Statement ⇒ Second Statement.4. Let X be a set of sets. 4. one can ﬁnd an element in it. Deﬁnition 17. Every set can be one-to-one mapped into a set of ordinals. This is formalized by the Axiom of Choice. Y . ⊏) is a well-orderable set and let X = P(W ). Let X be a set of sets. that is. This permits to state the Axiom of Choice and its countable counterpart as follows. Then the function which assigns to every nonempty subset of W its minimum with respect to ⊏ is a choice function. The Axiom of Choice. Somehow.2 (Axiom of Choice). Theorem 17. it is not guaranteed that one can ﬁnd the element in a systematic way. 3. Then f : β → X is a bijection and has an inverse 73 .10 there is an ordinal α such that |α| ≤ |X|. Let X be any given set and u ∈ X / a target which will be used to guarantee that the below mapping is invertible on X. u otherwise. Example 17. Note that whenever f (γ) ∈ X then f (γ) ∈ f [γ] and thus f does not take any elements / of X twice.17 The Axiom of Choice If a set is not empty. Then X has a choice function. by a function. either |X| < |Y | or |X| = |Y | or |Y | < |X|.1. A function C deﬁned on all nonempty members of X is called a choice function of X if C(x) ∈ x for every nonempty x ∈ X. Now one constructs by transﬁnite induction the following f : α → X ∪ {u} where C is a choice function which is deﬁned at least on all subsets of X.3. Since |α| ≤ |X| there must be some ordinal in α which is mapped to u. Proof.5. For every γ ∈ α one deﬁnes f (γ) = C(X − f [γ]) if f [γ] ⊂ X. For all sets X. the following conditions are equivalent: 1. Every set is well-orderable. 2. Example 17.

Show that D = C A iﬀ |B| ≥ min{|A|. If f is a function deﬁned on A. For every set X. Then there is a cardinal κ ≤ α with κ = |α|. Theorem 17. Hence X has a choice function. There is a well-ordering ⊏ on X. Given a set X. Since α. then |f [A]| ≤ |A|. Let X. C be any sets and. Fourth Statement ⇒ Second Statement.6. Furthermore. Since |X| ≤ |Y |. there is a unique cardinal κ such that κ = |X|. Let a set X of sets be given.13 these sets are either order-isomorphic or one is order-isomorphic to some initial segment of the other one. Hence C also maps every nonempty Y ∈ X to its minimum with respect to ⊏. one can prove the following result. Y is order-isomorphic to an initial segment of X and the corresponding mapping is one-to-one.9. The function g is one-to-one and witnesses |f [A]| ≤ |A|. They are based on the fact that every set X there is an ordinal α and a bijection f : α → X. as in Example 3. Theorem 17. Y be sets and assume that |X| ≤ |Y |. Y and by Theorem 11.16. X are comparable.7. Proof. D = {f ∈ C A | ∃g ∈ B A ∃h ∈ C B (f = h ◦ g)}. Second Statement ⇒ Third Statement. Now deﬁne for all b ∈ f [A] the mapping g by g(b) = C({a ∈ A | f (a) = b}). If g : X → Y is a one-to-one function and Y is a set of ordinals.one-to-one function g which maps X into a set of ordinals. for every cardinal λ ≤ κ. x ⊏ y ⇔ g(x) ∈ g(y). The union of a countable set of countable sets is countable. Furthermore. The next two results are applications of the Axiom of Choice. then X has a countable subset. Using the Axiom of Choice. Third Statement ⇒ First Statement. Thus |Y | < |X|. then g induces a well-ordering of X: for all x. Thus there is a one-to-one mapping from X into α. Theorem 17. y ∈ X. 74 . Exercise 17. Now one can deﬁne a choice function C which maps every nonempty subset Y of X to its minimum with respect to ⊏. Let A. |X| < |α|.10 there is an ordinal α such that |α| ≤ |X|. f [λ] is a subset of X of cardinality λ. There are well-orderings on X. 
Third Statement ⇒ Fourth Statement. B. |C|}. if X is inﬁnite. there is by Theorem 13. Let C be a choice function on all nonempty subsets of A.8.

each set E(n) has cardinality 2ℵ0 . n. Note that ≤lin is transitive: If A ≤lin B and B ≤lin C then there are m. . So A is countable. n ∈ N ∀a ∈ N (a ∈ A ⇔ a · m + n ∈ B). A <lin B ⇔ A ≤lin B ∧ B ≤lin A.Proof. a ∈ A ⇔ a · (m · i) + (n · i + j) ∈ C. L is uncountable by the previous paragraphs. Since A is not empty. Example 17. Now 75 . Let L ⊆ P(N) be such that L is not empty and (L.11. so (P(N). So let L be unbounded. . Proof. <lin ) is linearly ordered and |L| = ℵ1 . nB ) such that ∀b ∈ B ⇔ b · mB + nB ∈ A. i. Thus for all a ∈ N. Corollary 17. there is a countable and thus inﬁnite B ∈ A and by B ⊆ A. The Axiom of Choice can be used to construct an example of a set of cardinality ℵ1 . a ∈ A ⇔ a·m+n ∈ B and b ∈ B ⇔ b·i+j ∈ C. m) = gn (m). . that is. thus A is at most countable.10. If L is bounded by some A ⊆ N in the sense that ∀B ∈ L (B ≤lin A) then |L| ≤ ℵ0 else |L| = ℵ1 .}. there is a function g which selects from every E(n) an element gn of this set. <lin ) is a partially ordered set. If C = B and C ≤lin A then (mC . Deﬁne for A. F (1). nB ). j ∈ N such that for all a. Thus L is at most countable. b ∈ N. the set A is inﬁnite. Note that ≤lin is not antisymmetric: if A is the set of even and B of odd numbers then a ∈ A ⇔ a + 1 ∈ B and b ∈ B ⇔ b + 1 ∈ A. every B ∈ A is countable. For each n. F (n) is the n + 1-st set contained in A. It is easy to see that F (a) ≤lin A by ∀b ∈ N (b ∈ F (a) ⇔ b · 2a+1 + 2a ∈ A). Let A be a countable set of countable sets.11. so the else-case is not only a theoretical case. Now let G(n. Furthermore L can indeed be chosen such that (L. L = {F (0). If L is bounded by A then one has for every B ∈ L a pair (mB . By the Axiom of Choice. But by Remark 9. There is surjective function F : N → A. let E(n) = {f : N → F (n) | f is surjective}. <lin ) is a linearly ordered set. Note that E is a function from N to ( A)N . If L is at most countable then there is a surjective function F from N to L. 
The ﬁrst uncountable ordinal ω1 is not the union of a countable set of countable ordinals. nC ) = (mB . Now let A = {(2b + 1)2a | b ∈ F (a)}. <lin is deﬁned from the transitive relation ≤lin such that it is automatically transitive and antireﬂexive. B ⊆ N the following relations: A ≤lin B ⇔ ∃m. G is a surjective mapping from N × N to A. that is.

<lin ) is linearly ordered. for given α. ω. It is clear from the construction that the mapping is order preserving. i. note that h[α] is at most countable and that there is therefore a set B ⊆ N such that B ≤lin h(β) for every β ∈ α. It follows that A = f (α. g are then deﬁned on the whole sets ω1 and ω1 × ω × ω. there is an α ∈ ω1 with B = f (α). <lin ) and f [ω1 ] ⊆ L. one only has to verify that the set {A ⊆ N ∀β ∈ α (h(β) <lin A)} is not empty for any α ∈ ω1 . ω. A <lin f (α) ⇔ ∃i. ω]) ≤lin f (β). The set f [ω1 ] has cardinality ℵ1 . j ∈ N and B ∈ g[ω1 . j)). <lin ). ω. j) are deﬁned for all β ∈ α and i. β ∈ ω1 with β < α then f (β) <lin f (α). f (β) <lin f (α). Now the following properties hold. j) for some i. 1. ω. It follows that |L| = ℵ1 . j ∈ N (A = g(α. So f [ω1 ] ⊆ L ⊆ g[ω1 . i. So. j) | β ∈ α. ω] and thus f (α) = C(L − g[α. ω] = {g(β. Hence A <lin B. The resulting functions f. ω. Now let f (α) = C(L − g[α. assume that f (β) and g(β. As (L. f (α) = g(α. Furthermore. j ∈ N. It follows that (f [ω1 ]. To see this. ∈) into (P(N).functions f : ω1 → P(N) and g : ω1 × ω × ω → P(N) with f [ω1 ] ⊆ L ⊆ g[ω1 . <lin ) via transﬁnite recursion and using the choice function C on P(P(N)): h(α) = C({A ⊆ N ∀β ∈ α (h(β) <lin A)}) for all α ∈ ω1 . g(α. i. i. ω. j) = {a | a · i + j ∈ f (α)}. In particular. ω] is not empty and f (α) = C(L − g[α. i. At the end it is shown that there is indeed such an uncountable and unbounded linearly ordered subset of (P(N). j ∈ ω} at most countable. ω. ω]) is an element of L outside the set g[α. ω. ω]). The reason is that all sets A ≤lin f (β) are in g[α. ω]. The construction uses transﬁnite recursion and a choice function C deﬁned on all nonempty subsets of P(N). Hence L − g[α. • For all α ∈ ω1 is the set g[α. i. ω]. respectively. ω. ω. ω]. Then there is some B ∈ f [ω1 ] with B ≤lin A. Let A ∈ L. It follows 76 . <lin ) is a linearly ordered set isomorphic to (ω1 . • For all α ∈ ω1 and all A ⊆ N. • If α. 0). 
This is done by constructing an order-preserving mapping h from (ω1 . ω] in order to witness that |L| = ℵ1 .

. several ways to represent the set of real numbers are proposed. Thus there is an uncountable linearly ordered set of functions below the exponential function. Use the Axiom of Choice to prove the following: If |A| = ℵ1 and every B ∈ A satisﬁes |B| ≤ ℵ1 then | A| ≤ ℵ1 . <lin ).1. ∈). Then h(β) <lin A as h(β) ≤lin A and B ≤lin h(β) but B ≤lin A. 77 . there is no standard convention how to do it. 18 The Set of Real Numbers The real numbers are one of the most important topics of mathematics. show the following two properties: • For countably many functions f0 . Example 18. The set of integers can be represented as that of ordered pairs of natural numbers where one of the parts of the pair is 0: Z = {(m. . Exercise 17. Exercise 17. The given representations are build in the standard way using already deﬁned objects like sequences of digits or subsets of the rational numbers. there is an A ⊆ N bounding every set in {B} ∪ h[α]. This completes the proof. In particular.13. In order to see this.that {B} ∪ h[α] is at most countable. Namely n−m−1 for every A ⊆ N the function cA : n → · A(m) is below the m∈n 2 exponential function. . Other than in the case of the natural numbers. As argued above. So there is a proper upper bound of h[α] and the value h(α) is such an upper bound selected by the choice function. This section deals with some basic properties of this set.12. It is convenient to introduce representations of the integers numbers ﬁrst. <lin ) is order-isomorphic to (ω1 . n) ∈ N × N | m = 0 ∨ n = 0}. This partial ordering only shares some but not all of the properties of the ordering <lin considered above. • There are uncountably many f below the exponential function n → 2n . has cardinality ℵ1 and is a subset of the partially ordered set (P(N). there is a function g such that ∀n ∈ N (fn ⊏ g). Consider the following partial ordering given on the set NN of all functions from N to N: f ⊏ g ⇔ ∃n ∀m > n (f (m) < g(m)). Note that cA ⊏ cB ⇔ A <lex B. 
The linearly ordered set (h[ω1 ]. f1 .

.0f (0) 10f (1) 10f (2) 10f (3) 1 . it follows from |R| ≤ |NN | and |NN | ≤ |R| that these two sets have the same cardinality.The pair (m. . 0) is 10 and (0.. 0}}. . |R| = |NN |. = 90 are diﬀerent. Note that this representation has the disadvantage that it recodes the natural numbers in a nonstandard way. It follows that G(r1 )(n) = 0 and G(r2 )(n) = 2. so (10. . The function G is one-to-one.0111111 . n) ∈ Z ∧ ∃h ∈ N (i + k = m + h ∧ j + l = n + h) Furthermore. and the injectiveness follows from the fact that one can reconstruct f from the representation of its image. q1 . Thus G(r1 ) and G(r2 ) are diﬀerent members of NN . Set theorists do not much care how to represent real numbers. 1 if r = qn . j) + (k. −n − 1 by S(−n) = −n ∪ {−n}. The addition of two integers (i. l) = (m. F (f ) = 0. . . . (i. deﬁne the function F : NN → R as follows: F (f ) = n∈N P −S(f (n)) 10 m=0. = 10 and 0. {{∅}. . (i. the decimal representation of the number F (f ) is 0.. n) represents the integer normally denoted by m − n. Let r1 . 4) is −4. . j) < (k. j) and (k. An alternative approach would be to let the natural numbers unchanged. of Q and constructs the following one-to-one mapping from R to NN : G(r)(n) = 0 if r < qn . {{∅}}} and −3 would be {{∅}. Since one deals with decimal and not with binary 1 1 representation. so there is no messing up caused by the images of functions which are almost everywhere 0. . {{∅}}}}. Theorem 18.1100000001000101 . l) can be deﬁned as follows. iﬀ f (0) = 0. For the converse direction. the numbers 0. So −2 would be {{∅}.. n) ⇔ (m. {{∅}}. f (2) = 3 and f (3) = 1. In the following. For example.. explicit representations are given and the addition and ordering deﬁned on them. for all n > 0. to represent −1 by {{∅}} and. {n. There is a number n such that r1 < qn < r2 since Q is a dense subset of R. one takes a one-to-one enumeration q0 . To see that |NN | ≤ |R|. . 78 . f (1) = 7. By the Cantor-Bernstein Theorem. 
l) if and only if i + l < j + k as natural numbers. 2 if r > qn . say r1 < r2 . The disadvantage of this representation is that the addition and other operations are a bit more diﬃcult to deﬁne.100000 . replacing n by {n.2. Proof. r2 be two diﬀerent real numbers.n That is.
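The representation of integers as pairs (m, n) with m = 0 or n = 0, read as m − n, can be sketched in a few lines. The code below assumes exactly the addition and ordering rules stated above; the type alias `Int` and the function names `normalise`, `add` and `less` are illustrative choices.

```python
from typing import Tuple

# (m, n) with m == 0 or n == 0 represents the integer m - n.
Int = Tuple[int, int]


def normalise(i: int, j: int) -> Int:
    """Cancel the common part h so that at least one component becomes 0."""
    h = min(i, j)
    return (i - h, j - h)


def add(x: Int, y: Int) -> Int:
    """(i, j) + (k, l) is the pair (m, n) in Z with i + k = m + h and j + l = n + h."""
    return normalise(x[0] + y[0], x[1] + y[1])


def less(x: Int, y: Int) -> bool:
    """(i, j) < (k, l) iff i + l < j + k as natural numbers."""
    return x[0] + y[1] < x[1] + y[0]


ten, minus_four = (10, 0), (0, 4)
print(add(ten, minus_four), less(minus_four, ten))
```

The comparison never needs subtraction: moving the negative parts to the other side of the inequality keeps everything inside the natural numbers.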

If there are two representations, one can go from one to the other with a bijective function f and then carry over the operations: if addition is defined on the image of f, then one can inherit the definition to the domain of f by x + y = f^{−1}(f(x) + f(y)) and so on.

This representation has the disadvantage that N ⊈ R. So one distinguishes, as in many programming languages like FORTRAN, between the natural number 2 and the real number 2.0. Nevertheless, there is a one-to-one mapping f : N → R which maps every natural number to its representative in R; more precisely, f can be defined inductively using a g : R → R such that f(S(n)) = g(f(n)) for all natural numbers n. Give two properties nat, succ such that nat(r) is true iff r is in the range of f and succ(r, q) is true if nat(r) ∧ nat(q) ∧ q = g(r).

Here are some examples based on the idea to represent real numbers by digits.

Exercise 18.3. Show that the standard representation can be defined in set theory: First define a representation for the set A = Z ∪ {sign}. Then look at the class of all functions r : A → {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, −}, call it B. Define which elements of B represent real numbers and get R by comprehension; state the property explicitly. The selection should be made such that r represents Σ_{z∈Z} r(z) · 10^z in the case that r(sign) is + and −Σ_{z∈Z} r(z) · 10^z in the case that r(sign) is −; that is, the decimal point could be placed between r(0) and r(−1) and need not be represented explicitly. Make sure that every real occurs in the representation exactly once; for example, fix the sign of 0 to either + or −. For this and further definitions, integer constants, integer addition and < on the integers can be used in order to deal with positions of digits.

In the early days of computing, integers were represented by bytes. That is, they were limited to the numbers −128 up to 127. The negative numbers started with a 1 and the positive ones (including 0) with a 0. So one had that −128 = 10000000, −127 = 10000001, ..., −2 = 11111110, −1 = 11111111, 0 = 00000000, 1 = 00000001, 2 = 00000010, ..., 127 = 01111111. The next exercise shows how to transfer this idea to the representation of the reals.

Exercise 18.4. Consider the following set WS representing the reals Without Sign:

WS = {r : Z → {0, 1} | ∀n ∈ Z ∃m < n (r(m) = 0) ∧ ∃n ∈ Z ∀m > n (r(m) = r(m − 1))}

One can define on WS an addition +. Let (r + q)(n) = 1 if one of the following three conditions holds:
• r(n) ≠ q(n) and there is an m < n such that r(m) = 0 and q(m) = 0 and r(k) ≠ q(k) for all k with m < k < n;
• r(n) = q(n) and there is an m < n such that r(m) = 1 and q(m) = 1 and r(k) ≠ q(k) for all k with m < k < n;
• r(n) = q(n) and r(k) ≠ q(k) for all k < n.
Let (r + q)(n) = 0 otherwise. Verify that (WS, +) is a commutative group: show that the function null mapping Z to 0 is the neutral element, that q + r = r + q for all r, q ∈ WS, that r + (q + s) = (r + q) + s for all r, q, s ∈ WS and that for every r ∈ WS there is a q ∈ WS with r + q = null. From + one can define an ordering < on WS by r < q ⇔ ∃s ∈ WS (q = r + s ∧ ∃n ∈ Z ∀m > n (s(m) = 0)). Show that < is an ordering of WS.

Definition 18.5. A set A ⊆ R is open iff for every a ∈ A there is an ε > 0 such that {b ∈ R | a − ε < b < a + ε} is a subset of A. A set A ⊆ R is closed iff R − A is open. A point a ∈ A is isolated iff there is an open set B with A ∩ B = {a}. Say that a ∈ A is approximable from above in A iff a = inf{b ∈ A | b > a}. A set is perfect iff it is closed but does not have isolated points. A set A is compact iff it is closed and bounded in the sense that there are b, c ∈ R with b < a < c for all a ∈ A.

Example 18.6. Let a, b ∈ R with a < b. The open interval {r ∈ R | a < r < b} is an open set. The closed interval {r ∈ R | a ≤ r ≤ b} is a closed set, perfect and compact. Every open set is the union of open intervals.

Remark 18.7. A pair (X, Y) is called a Hausdorff space iff the following four axioms hold.
1. Y ⊆ P(X) and ∅, X ∈ Y.
2. If A, B ∈ Y then A ∩ B ∈ Y.
3. If W ⊆ Y then ∪W ∈ Y.
4. If a, b ∈ X and a ≠ b then there are A, B ∈ Y such that a ∈ A, b ∈ B and A ∩ B = ∅.
Hausdorff observed that these axioms are true for X = R and Y being the open subsets of X. He discovered that the structure Y of open sets on a basis set X has many characteristic properties of a space; for example, the dimension n of R^n can be reconstructed by analyzing the structure of the open sets only. By the way, Hausdorff introduced his axioms in his book "Mengenlehre", which is the German word for "set theory". In general, one calls any structure (X, Y) satisfying the first three axioms a topological space and Y is called the topology on X.

Exercise 18.8. Verify that Hausdorff's axioms are true for the set R. That is, verify that (R, {A ⊆ R | A is open}) is a Hausdorff space.

Example 18.9. Call a set A upward-open iff for every a ∈ A there is r ∈ R with r > a and {s ∈ R | a ≤ s < r} ⊆ A. Then the class of all upward-open sets on R satisfies Hausdorff's axioms and differs from the class of the open sets in R defined in Definition 18.5.

Proof. It is easy to see that the set {a ∈ R | a ≥ 0} is upward-open. But this set is not open in the usual sense since it contains 0 without containing any number less than 0. Now Hausdorff's axioms are verified.
1. The empty set is upward-open. Also R itself is upward-open since for every a ∈ R the set {s ∈ R | a ≤ s < a + 1} is a subset of R.
2. Let A, B be upward-open and a ∈ A ∩ B. Since A is upward-open there is an r > a with {s ∈ R | a ≤ s < r} ⊆ A. Since a ∈ B one can also find a q such that a < q < r and {s ∈ R | a ≤ s < q} ⊆ B. The latter set is then also contained in A ∩ B. Since such a set exists for all elements of A ∩ B, A ∩ B is upward-open.
3. Let W consist of upward-open subsets of R and let a ∈ ∪W. There is an A ∈ W with a ∈ A. There is r > a such that {s ∈ R | a ≤ s < r} ⊆ A. This set is also contained in ∪W and so ∪W is upward-open.
4. Assume that a, b ∈ R and a ≠ b. One of them is smaller, say a < b. Then A = {s ∈ R | a ≤ s < b} and B = {s ∈ R | b ≤ s} are two disjoint upward-open sets with a ∈ A and b ∈ B.
So all four axioms of Hausdorff are satisfied.

Exercise 18.10. Let α be any ordinal different from 0 and 1. Define a topology on α by saying that a set β ⊆ α is open iff β is an ordinal. Verify that the first three axioms of Hausdorff are satisfied, but not the fourth one.

Exercise 18.11. Find a topology on the set of ordinals up to a given ordinal α which satisfies the axioms of Hausdorff and in which an ordinal β ∈ α is isolated iff it is either a successor ordinal or 0.
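For finite sets, the four axioms of Hausdorff can be checked mechanically. The following Python sketch is an illustration only and not part of the course material; the function names is_topology, is_hausdorff, ordinal_topology and discrete_topology are hypothetical and chosen here. It assumes that X and Y are finite, so closure under arbitrary unions reduces to closure under binary unions (the empty union is ∅, which the first axiom already requires).

```python
from itertools import combinations


def is_topology(X, Y):
    """Check the first three axioms for a finite family Y of subsets of X."""
    X = frozenset(X)
    Y = {frozenset(A) for A in Y}
    # Axiom 1: Y is a family of subsets of X containing the empty set and X.
    if frozenset() not in Y or X not in Y:
        return False
    if any(not A <= X for A in Y):
        return False
    # Axioms 2 and 3: closed under intersections and (finite) unions.
    for A in Y:
        for B in Y:
            if A & B not in Y or A | B not in Y:
                return False
    return True


def is_hausdorff(X, Y):
    """Check all four axioms: a topology that separates distinct points."""
    Y = {frozenset(A) for A in Y}
    if not is_topology(X, Y):
        return False
    # Axiom 4: any two distinct points lie in disjoint open sets.
    for a, b in combinations(sorted(X), 2):
        if not any(a in A and b in B and not (A & B) for A in Y for B in Y):
            return False
    return True


def ordinal_topology(n):
    """Open sets on the finite ordinal n = {0, ..., n-1}: the ordinals k <= n."""
    return [set(range(k)) for k in range(n + 1)]


def discrete_topology(n):
    """All subsets of {0, ..., n-1}."""
    points = list(range(n))
    return [set(c) for k in range(n + 1) for c in combinations(points, k)]


print(is_topology(range(3), ordinal_topology(3)))    # True
print(is_hausdorff(range(3), ordinal_topology(3)))   # False
print(is_hausdorff(range(3), discrete_topology(3)))  # True
```

For the finite ordinal 3 the topology of Exercise 18.10 passes the first three axioms but fails the fourth, since every open set containing 1 also contains 0. This covers only the finite case and does not replace the proof asked for in the exercise.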

19 The Continuum Hypothesis

Recall that the class of infinite cardinals can be identified with the following class of ordinals:

{α | α ≥ ω ∧ ∀β < α (|β| < |α|)}.

This subclass is well-ordered and there is an order-preserving isomorphism mapping every ordinal α to the infinite cardinal ω_α. So ω_0 is just ω, and ω_1 is the first uncountable ordinal. Recall that the cardinalities of these ordinals are just called "ℵ" with the same index, ℵ_α = |ω_α|, and that 2^κ denotes the cardinality of the power set of any set of cardinality κ.

Cantor showed that the cardinality of R is the same as the cardinality of P(N). Furthermore, he showed that the cardinality of N is smaller than that of R. But he did not find any set of intermediate cardinality. Therefore, Cantor conjectured that there is no intermediate cardinality. That is, he stated the following continuum hypothesis (CH), where "continuum" refers to the real numbers.

Conjecture 19.1 (Continuum Hypothesis). 2^{ℵ_0} = ℵ_1.

This conjecture cannot be proven. But this section deals with partial results obtained by attempts to prove the Continuum Hypothesis. In the following, it is shown for many natural types of subsets of R that they are finite, have the cardinality ℵ_0 or have the cardinality 2^{ℵ_0}; an intermediate cardinality does not show up. These results show that certain types of sets of real numbers satisfy this hypothesis in the sense that there is no set of intermediate cardinality of this type. That is, sets of this type are either at most countable or have the cardinality of the continuum. But before dealing with these results, a general theorem is given. This theorem shows how to prove one of the directions in Theorem 20.10 below, since its proof easily generalizes to one for the fact that 2^{ℵ_0} is not the limit of an ascending sequence of countably many other cardinals.

Theorem 19.2 (König). 2^{ℵ_0} ≠ ℵ_ω.

Proof. The following implication is proven: 2^{ℵ_0} ≥ ℵ_ω ⇒ 2^{ℵ_0} > ℵ_ω. So assume that 2^{ℵ_0} ≥ ℵ_ω. Let α be the least ordinal having the cardinality 2^{ℵ_0}, that is, let α be the ordinal representing the cardinal 2^{ℵ_0}. Then α ≥ ω_ω, where ω_ω is the ordinal representing ℵ_ω and ω_ω = ∪{ω_n | n ∈ N}. Since |P(N)| = 2^{ℵ_0}, there is a bijection F : α → P(N). Recall the definition of ≤lin from Example 17.11, which is a relation on subsets of N. This relation is transitive and has two important properties.
• For every A there are at most countably many B ⊆ N with B ≤lin A.
• For every countable set {B_1, B_2, ...} of subsets of N there is an A with B_k ≤lin A for all k.
Now extend F to G : α × ω × ω → P(N) with G(β, i, j) = {a | a · i + j ∈ F(β)}. Then, for every β ≤ α, G[β, ω, ω] is the closure downwards under ≤lin of F[β]. Each set F[ω_n] has cardinality ℵ_n. Furthermore, using Hessenberg's Theorem, |G[ω_n, ω, ω]| ≤ ℵ_n · ℵ_0 · ℵ_0 = ℵ_n. Since |α| > ℵ_n there is for each n a set B_n with B_n ∈ P(N) − G[ω_n, ω, ω]. By the second property, there is a set A ⊆ N with B_n ≤lin A for all n. Since all sets G[ω_n, ω, ω] are closed downwards under ≤lin and the n-th of them does not contain B_n, none of them contains A. Since ω_ω = ∪{ω_n | n ∈ N}, A ∈ P(N) − G[ω_ω, ω, ω] and A ∈ P(N) − F[ω_ω]. By assumption A = F(β) for some β ∈ α. Since β ≥ ω_n for all n ∈ N, β ≥ ω_ω and α > ω_ω. Since α is a cardinal, it is larger than |ω_ω| and 2^{ℵ_0} > ℵ_ω. Thus 2^{ℵ_0} ≠ ℵ_ω. This completes the proof.

The next results are the first step on the way to prove Theorem 19.8, which says that every closed set is either at most countable or has the cardinality of the continuum. Furthermore, Cantor's Discontinuum in Exercise 19.7 is one of the first examples of a set of cardinality 2^{ℵ_0} which has Lebesgue measure 0.

Proposition 19.3. Every nonempty open set has cardinality 2^{ℵ_0}.

Proof. Let A be an open nonempty subset of R. Open sets are unions of basic open sets of the type R_{r,ε} = {q ∈ R | r − ε < q < r + ε} where ε is a positive real number. So let r, ε ∈ R be such that ε > 0 and R_{r,ε} ⊆ A. The following f is a one-to-one mapping from R into A, in fact f[R] = R_{r,ε}:

f(x) = r + ε · x/(x+1) if x > 0,
f(x) = r if x = 0,
f(x) = r − ε · x/(x−1) if x < 0.

Then |R| = |A| by Proposition 6.5.

Exercise 19.4. If A ⊆ R is at most countable, then R − A has cardinality 2^{ℵ_0}.

Theorem 19.5. Let A ⊆ R be perfect. Then there is a countable subset B ⊆ A such that (B, <) is a dense linearly ordered set without end points, where < is the natural ordering inherited from R.

Proof. Let a, b ∈ A be such that there is a third element c ∈ A with a < c < b. Since A is perfect, there is for every ε > 0 a further element c′ ∈ A − {c} such that the distance between c and c′ is less than ε. Starting with ε = 1/2 · min{c − a, b − c}, one obtains that a < c′ < b. By iterating this argument with an ε also smaller than the distance between c and c′, one can establish that there are three numbers c_0, c_1, c_2 ∈ A with a < c_0 < c_1 < c_2 < b. That is, if there is c ∈ A between a, b then one can find a new element c_1 ∈ A such that A has elements between a, c_1 and between c_1, b. Note that it might be impossible to take c_1 = c. Thus, given X_n, one can define a function f giving X_{S(n)} from X_n such that
1. X_{S(n)} is finite and X_{S(n)} ⊆ A;
2. for all a, b ∈ X_n with a < b there is a c ∈ X_{S(n)} with a < c < b;
3. for all a, b ∈ X_{S(n)} with a < b there is a c ∈ A with a < c < b.
Using the Axiom of Choice, f does the following: for all pairs (a, b) ∈ X_n satisfying a < b ∧ ∀c ∈ X_n (c ≤ a ∨ b ≤ c), f searches c_0, c_1, c_2 ∈ A such that a < c_0 < c_1 < c_2 < b and puts a, b, c_1 into X_{S(n)}. Let X_0 = {l, h} with l, h ∈ A such that there is a further c ∈ A with l < c < h. The set C = ∪{X_0, X_1, ...} is in V and also the set B = {c ∈ C | l < c < h}. B is the union of countably many finite sets and thus countable. Clearly B ⊆ A. If a, b ∈ B with a < b, then a, b ∈ X_n for some n and there is a c ∈ X_{S(n)} such that a < c < b. Thus B is dense. Furthermore, B has no end points, since l is the infimum and h the supremum of B with respect to C but l, h ∉ B; note that they might not be the infimum and supremum with respect to R.

Theorem 19.6. Every perfect set A ⊆ R has cardinality 2^{ℵ_0}.

Proof. By Theorem 19.5, A has a countable subset B such that (B, <) is a dense linearly ordered set without end points. Thus there is an isomorphism f from Q to B. Now define g : R → R by g(r) = sup{f(q) | q ∈ Q ∧ q < r}. If r, r′ ∈ R with r < r′ then there are q, q′ ∈ Q such that r < q < q′ < r′. It follows that g(r) ≤ f(q) < f(q′) ≤ g(r′). Thus g is one-to-one. Since A is closed, the complement of A is an open set. If there were an r ∈ R with g(r) ∉ A, then there would also be an ε > 0 such that {s ∈ R | g(r) − ε < s < g(r) + ε} is disjoint from A. But then g(r) − ε/2 is an upper bound for the subset {f(q) | q ∈ Q ∧ q < r} of A, in contradiction to g(r) being the supremum of this subset. Thus g(r) ∈ A and |R| ≤ |A|. Since A ⊆ R, |A| = 2^{ℵ_0}.

Exercise 19.7. Cantor's Discontinuum is given as F({0, 2}^N) where F maps every f ∈ {0, 2}^N to the real number having the digits f(0)f(1)f(2)... in the ternary digital representation after the point:

F(f) = Σ_{n∈N} f(n) · 3^{−1−n}.

For example, if f(0) = 0 and f(n) = 2 for all n ≥ 1 then F(f) represents the ternary number 0.02222..., that is, F(f) = 1/3. If E is the set of even numbers and f(n) = 0 for n ∉ E and f(n) = 2 for n ∈ E then F(f) is the ternary number 0.20202020..., that is, F(f) = Σ_{n∈E} 2 · 3^{−1−n} = 3/4. Furthermore, F({0, 2}^N) is given as

F({0, 2}^N) = {r ∈ R | 0 ≤ r ≤ 1} − T where T = {r ∈ R | ∃m, n ∈ N (m · 3^{−n} + 3^{−1−n} < r < m · 3^{−n} + 2 · 3^{−1−n})}.

That is, T is the set of all positive real numbers for which the digit 1 appears in every ternary representation after the point. For example, 1/3 ∉ T since it has besides 0.1000... also the representation 0.02222... where no 1 occurs after the point. Show that
1. F restricted to {0, 2}^N is one-to-one;
2. F({0, 2}^N) is perfect;
3. F({0, 2}^N) does not have any nonempty open subset.

Theorem 19.8. Every uncountable closed subset of the reals has a perfect subset. Hence every closed subset of R is either at most countable or of the cardinality 2^{ℵ_0}.

Proof. Let A be a closed subset of R and let

B = {r ∈ A | ∀ε > 0 (|A ∩ R_{r,ε}| ≥ ℵ_1)}.

Then B satisfies the following properties.

1. A − B is countable. If r ∈ A − B there is an ε > 0 such that R_{r,ε} ∩ A is countable. Furthermore, there are q, δ ∈ Q such that δ > 0 and {r} ⊆ R_{q,δ} ⊆ R_{r,ε}, since Q is dense in R. It follows that

A − B = ∪{A ∩ R_{q,δ} | q, δ ∈ Q ∧ δ > 0 ∧ |A ∩ R_{q,δ}| ≤ ℵ_0},

which is countable since it is the union of a countable set of countable sets.

2. B is closed. The sets R − A and

C = ∪{R_{q,δ} | q, δ ∈ Q ∧ δ > 0 ∧ |A ∩ R_{q,δ}| ≤ ℵ_0}

are open, thus B = A − C = R − ((R − A) ∪ C) is a closed set.

3. B has no isolated points. Assume that r ∈ B. Then, for every ε > 0, A ∩ R_{r,ε} is uncountable and thus A ∩ R_{r,ε} − (A − B) − {r} is also uncountable. Thus r is not an isolated point of B.

So either A is at most countable or B is not empty. In the latter case, B is nonempty and satisfies the last two properties, so B is perfect. That is, the cardinality of B is 2^{ℵ_0} and since B ⊆ A ⊆ R, A has the same cardinality.
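The mapping F from Exercise 19.7 can be evaluated exactly for digit sequences that are eventually periodic. The following Python sketch is only an illustration under that assumption; the helper names F_partial, F_eventually_periodic and in_T_up_to are hypothetical and do not occur in the notes. It uses exact rational arithmetic and the geometric series to sum the periodic tail, and it tests membership in T only for the exponents n below a chosen bound, so a negative answer from in_T_up_to is evidence and not a proof.

```python
from fractions import Fraction
from math import floor


def F_partial(digits):
    """Partial sum of F(f) = sum over n of f(n) * 3^(-1-n) for finitely many digits."""
    return sum(Fraction(d, 3 ** (n + 1)) for n, d in enumerate(digits))


def F_eventually_periodic(prefix, period):
    """Exact value of F(f) when f is `prefix` followed by `period` repeated forever."""
    p, k = len(prefix), len(period)
    head = F_partial(prefix)
    # Contribution of the first copy of the period, at positions p, ..., p+k-1.
    block = sum(Fraction(d, 3 ** (p + n + 1)) for n, d in enumerate(period))
    # Later copies are scaled by 3^(-k) each: a geometric series.
    return head + block / (1 - Fraction(1, 3 ** k))


def in_T_up_to(r, depth):
    """Test whether m*3^-n + 3^(-1-n) < r < m*3^-n + 2*3^(-1-n) for some n < depth."""
    for n in range(depth):
        scaled = r * 3 ** n
        frac = scaled - floor(scaled)  # position of r inside [m*3^-n, (m+1)*3^-n)
        if Fraction(1, 3) < frac < Fraction(2, 3):
            return True
    return False


print(F_eventually_periodic([0], [2]))  # 1/3, the ternary number 0.02222...
print(F_eventually_periodic([], [2, 0]))  # 3/4, the ternary number 0.202020...
print(in_T_up_to(Fraction(1, 3), 20))  # False
print(in_T_up_to(Fraction(1, 2), 20))  # True
```

The two printed values agree with the examples in Exercise 19.7, and 1/2, whose only ternary representation is 0.1111..., lies in the removed set T already for n = 0.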

20 The Axioms of Zermelo and Fraenkel

First-order logic permits stating axioms which quantify over elements of V but not over subclasses of V. Furthermore, one can use expressions to define subclasses of V. Consider now a subclass G which is a function, that is, there is a domain W (either class or set) such that there is for all (x_1, x_2, ..., x_n) ∈ W exactly one y with (x_1, x_2, ..., x_n, y) ∈ G and all elements of G are of this type. Then one can use recursion (if W = N) or transfinite recursion (if W ≠ N, using some suitable well-founded relation R on W) to construct a new class as done in Theorems 5.2 and 13.7. Furthermore, one can do with classes the usual operations like concatenation. For example, given three classes which are functions G_1, G_2, G_3 : V^2 → V, the function x, y, z ↦ G_1(G_2(x, y), G_3(x, z)) is also a class and can be used in the Axioms of Replacement and Comprehension below. Nevertheless, it is understood that all the classes considered can be built in finitely many steps from sets and the expressions and properties in Definition 3.7 with these methods.

Axioms 20.1 (Zermelo and Fraenkel). Let V be the class of all sets. There are classes coding functions x, y ↦ {x, y}, x ↦ ∪x, x ↦ P(x) and special sets ∅, N such that the following holds:

Foundation: ∀x ∈ V (x ≠ ∅ ⇒ ∃y ∈ x ∀z ∈ x (z ∉ y));

Extensionality: ∀x, y ∈ V (x = y ⇔ ∀z ∈ V (z ∈ x ⇔ z ∈ y));

Existence (of empty set): ∀x (x ∉ ∅);

Pairing: ∀x, y ∈ V ∀z ∈ V (z ∈ {x, y} ⇔ (z = x ∨ z = y));

Schema of Comprehension: For all classes which are unary functions F and for all sets x ∈ V, {y ∈ x | F(y) = ∅} ∈ V;

Union: ∀x, y ∈ V (y ∈ ∪x ⇔ ∃z ∈ x (y ∈ z));

Power Set: ∀x, y ∈ V (y ∈ P(x) ⇔ ∀z ∈ y (z ∈ x));

Infinity: ∅ ∈ N and ∀y ∈ V (y ∈ N ⇒ y ∪ {y} ∈ N), ∀x ∈ V (∅ ∈ x ∧ ∀y ∈ V (y ∈ x ⇒ y ∪ {y} ∈ x) ⇒ ∀z ∈ N (z ∈ x));

Schema of Replacement: For every n, every class coding an n-ary function F and all sets x_1, ..., x_n, the set F[x_1, ..., x_n] is in V;

Choice: For all sets x ∈ V there is a function C_x such that for all nonempty y ∈ x, C_x(y) ∈ y.

These axioms are called the Zermelo-Fraenkel Axioms with Choice or just ZFC. The axiom system ZF is obtained by taking all the above axioms except the Axiom of Choice.

Definition 20.2. A model of ZF consists of the class V and the relation ∈ such that the above axioms are satisfied. Similarly for models of ZFC.

Remark 20.3. There are several models of set theory, that is, the models are not uniquely defined. A hypothesis H is called independent under ZF if there are two models of ZF such that one satisfies H and the other satisfies ¬H. One method to build models is to start in a large model (V, ∈) and then to build inside (V, ∈) a smaller model (W, R) with W, R ∈ V such that (W, R) satisfies a certain desired combination of axioms. If R is the restriction of ∈ to W, then one writes (W, ∈) instead of (W, R).

Definition 20.4. A structure (W, R) is called an inner model of (V, ∈) iff W, R ∈ V and (W, R) satisfies all set-theoretic axioms with R being a relation standing for the element relation ∈.

Exercise 20.5. Given any model (V, ∈), show that (V_{ω_1}, ∈) is not an inner model of ZFC. Take a well-ordering of P(N) and show that ω_1 is contained in its range. Therefore, ω_1 is the candidate for the inner model and this can be used to show that (V_{ω_1}, ∈) cannot be an inner model.

Inaccessible cardinals are one example of large cardinals. Intuitively, a cardinal is called large if it exists in some but not all models; that is, its existence cannot be proven from the existence of lower cardinals. Note that the cardinal ℵ_0 only exists because of the Axiom of Infinity or an equivalent one. Similarly, a large cardinal would only be guaranteed to exist if an additional axiom is added, and there are models of ZFC where no large cardinals exist. While the notion of a large cardinal is not precisely defined and is just used to denote anything which is not guaranteed to exist due to being large, the notion of an inaccessible cardinal is much more precise and defined as follows.

Definition 20.6. A cardinal κ > ℵ_0 is inaccessible iff
• for all cardinals λ < κ, 2^λ < κ and
• for all sets L ⊆ κ of cardinals, either sup(L) < κ or |L| = κ.

Inaccessible cardinals are interesting since every inaccessible cardinal permits building a submodel for ZFC from a given model for ZFC. Here the condition κ > ℵ_0 is important because otherwise the Axiom of Infinity would be lost. Recall that a cardinal is identified with the least ordinal of the same cardinality, thus V_κ is defined for every cardinal. The following proposition is given without a proof.

Proposition 20.7. Given a model (V, ∈) of ZFC, the following conditions are equivalent for every cardinal κ > ℵ_0:
1. κ is inaccessible;
2. for every x ∈ V_κ, sup{2^{|y|} | y ∈ x} < κ;
3. |V_κ| = κ;
4. for every class F being a function in one argument which maps V_κ to V_κ and every α ∈ κ there is β ∈ κ with F[V_α] ∈ V_β.

Theorem 20.8. Let κ be an inaccessible cardinal. Then (V_κ, ∈) is an inner model of ZFC with respect to the functions.

The proof of this result is omitted. The central idea of the proof would be to show that every set X ⊆ V_κ satisfies ρ(X) = κ ⇔ |X| = κ. Then if f is a class which is a function, either f(X) ⊈ V_κ and f does not need to be considered, or f(X) ∈ V_κ and there is no problem. Using this key idea, one can verify the other axioms.

Theorem 20.9. The existence of inaccessible cardinals cannot be proven in ZFC.

Proof. Every statement φ which is decidable from ZFC is either true in all models of ZFC or false in all models of ZFC. So it has to be shown that there is a model of ZFC not containing an inaccessible cardinal. Then it follows that one cannot prove the existence of these cardinals.

Given a model (V, ∈) of ZFC, assume that it contains inaccessible cardinals; otherwise there is nothing to prove. So let λ be a cardinal which bounds some inaccessible cardinal. Then {κ ∈ λ | ℵ_0 < |V_κ| = |κ|} is a set of ordinals in V and thus well-ordered. This set contains all inaccessible cardinals below λ. Thus it has a least element κ. Now (V_κ, ∈) satisfies ZFC. It remains to show that this model does not contain "new inaccessible ordinals": So assume that α is an ordinal in (V_κ, ∈) with |V_α| = α. Ordinals are transitive sets such that any two members are comparable with respect to ∈, so α is also an ordinal in (V, ∈). Furthermore, there is a bijection f : V_α → α and this f is in V_κ. Since V_κ ⊆ V, f ∈ V and |V_α| = α also with respect to the model (V, ∈). It follows that α ≤ ω in (V, ∈) since κ is the least inaccessible cardinal in (V, ∈). Since ω is the same in (V, ∈) and (V_κ, ∈), α ≤ ω also in (V_κ, ∈). Thus (V_κ, ∈) has no inaccessible cardinals, and the existence of such cardinals is unprovable from ZFC.

The next theorems are given without proof. The first one shows that one cannot decide the Continuum Hypothesis from ZFC because there are models of ZFC where this hypothesis is true and others where it is false.

Theorem 20.10 (Cohen, Easton, Gödel, König). Let α be an at most countable ordinal. Then there is a model with 2^{ℵ_0} = ℵ_α iff α is a successor ordinal.

Cantor proved in 1878 that 2^κ > κ for all cardinals κ, so α cannot be the limit ordinal 0. König proved in 1905 that 2^{ℵ_0} ≠ ℵ_α for all countable limit ordinals α, see Theorem 19.2. Gödel proved in 1938 the above theorem with the parameter α = 1. So he obtained that the Continuum Hypothesis is consistent with ZFC and constructed a model with 2^{ℵ_0} = ℵ_1. Starting with a model (V, ∈) of ZFC, Gödel defined a new model (L, ∈) by transfinite recursion. He first defined a suitable class AF of absolute functions and then defined the following classes L_α inductively for all ordinals α:

L_α = {x ∈ V_α | ∃β ∈ α ∃y ∈ L_β ∪ {L_β} ∃F in AF (x = F[y] ∧ x ⊆ L_β)}

Note that L_β = V_β for β ≤ ω but it might be that L_{ω+1} ⊂ V_{ω+1}. Gödel's model has the following properties.
• (L, ∈) is a model of ZFC. For every x ∈ L, if y = P(x) in L and z = P(x) in V then y = z ∩ L, but it can happen that y ⊂ z. Indeed there is such an x ∈ L whenever L ≠ V. Furthermore, functions which exist in V and witness that |x| ≤ |y| might fail to exist in L and thus it might happen that |x| < |y| in L.
• For all ordinals α, 2^{ℵ_α} = ℵ_{α+1}.
The second property is called the Generalized Continuum Hypothesis (GCH).

Cohen constructed in 1963 for every countable ordinal α which is not a limit ordinal a model such that 2^{ℵ_0} = ℵ_α; actually the method works also for some larger ordinals. But the problem was not yet completely solved. Easton investigated in 1970 the question of which outcomes are possible for the function f satisfying 2^{ℵ_α} = ℵ_{f(α)}. In this terminology, Cantor showed ∀α (f(α) > α). König's result is that f(0) ≠ α whenever α is a limit ordinal which is the union of countably many smaller ordinals. Gödel showed that it is consistent with ZFC to assume ∀α (f(α) = α + 1) (which is GCH). Cohen showed that f(0) can be any countable successor ordinal. Easton showed that many functions are possible; for example, if ∀n ∈ N (max{f(n), n + 1} ≤ f(n + 1) < ω) then there is a model of ZFC with 2^{ℵ_n} = ℵ_{f(n)} for all n ∈ N. Easton's result was indeed a bit more general and showed that one can prescribe the cardinality of the power set for all successor cardinals and some limit ones. So in some cases 2^{ℵ_α} might be determined by the values of 2^{ℵ_β} with β < α; for example, Silver showed in 1974 that if 2^{ℵ_α} = ℵ_{α+1} for all α < ω_1 then 2^{ℵ_{ω_1}} = ℵ_{ω_1+1} and cannot take any other value.

The next theorem shows that there are nonstandard models of ZFC. A nonstandard model is a model in which the natural numbers do not exist as a set. More informally, instead of the set {0, 1, 2, ...}, here denoted as N or ω, there is a set containing some additional elements. This set contains also nonstandard numbers beyond the usual natural numbers which behave like natural numbers but are not such numbers.

Theorem 20.11. There is a model (V, ∈) of ZFC where N contains an element α such that 0 ≠ α, 1 ≠ α, 2 ≠ α, ....

Such an α is called a "non-standard number". That is, {0, 1, 2, ...} ⊂ N and the collection {0, 1, 2, ...} of all natural numbers is neither a set nor a class. Then the collection {β ∈ ω | β is a nonstandard number} does not have a least element and therefore is neither a set nor a class. This nonstandard model is not what is intended, but one can show that every first-order axiomatization of set theory has a nonstandard model. Only infinite axioms like ∀x (x ∈ N ⇔ x = 0 ∨ x = 1 ∨ x = 2 ∨ x = 3 ∨ ...) can rule out nonstandard models, but such axioms are normally not considered in set theory as infinite formulas are much more difficult to handle than finite formulas. Nevertheless, one often considers axioms which are infinite sets of finite formulas.

A further pathology is that one can have a countable model of ZFC. That is, one constructs the model of ZFC within a model (V, ∈) as an inner model (W, R) and has that |W| = ℵ_0 with respect to the model (V, ∈). The members of W have of course cardinalities higher than ℵ_0 with respect to (W, R), so the notion of cardinality depends on the viewpoint which one has.
